QMB 3200
Terms in this set (75)
Describes a model with a small number of terms
Parsimonious model
True slope of line in simple linear regression
Beta1
A large sample
N greater than 30
# of dummy variables required to represent a QL variable at 3 levels
2
Computer software used to generate your project printout
Statistix 9.0
ANOVA
Analysis of Variance
When 2 or more independent variables are highly correlated
Multicollinearity
The coefficient of determination
R^2
The tester's choice, usually .05
Alpha
A measure of central tendency in the population
Mu (μ)
The value of R^2 in simple linear regression when x and y are uncorrelated
0
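As an illustration of the two R^2 cards above, here is a minimal least-squares sketch in plain Python. The function name and the data are made up for this example; the data were chosen so that x and y are exactly uncorrelated.

```python
def simple_linear_fit(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = ss_xy / ss_xx                      # estimated slope
    b0 = y_bar - b1 * x_bar                 # estimated intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / ss_yy             # coefficient of determination
    return b0, b1, r_squared

# Hypothetical data with zero sample correlation between x and y.
x = [1, 2, 3, 4]
y = [1, -1, -1, 1]
b0, b1, r2 = simple_linear_fit(x, y)        # slope and R^2 both come out as 0
```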
Squared or not, it measures dispersion in the population
Sigma (SD)
Decision made about Ho when alpha exceeds the p-value
Reject Ho
A type II error
Accepting Ho when Ho is false
Rejecting Ho when Ho is true
Type I error
In theory, the proportion of 90% CI's for mean that capture the true mean
90%
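The "in theory" part of this card can be checked with a small simulation. This is only a sketch: it assumes normal data with a known sigma (so a z interval with z = 1.645 applies), and the mean, sigma, sample size, and seed are arbitrary choices, not values from the course.

```python
import random
import statistics

def ci_coverage(true_mean=50.0, sigma=10.0, n=30, trials=2000, z=1.645, seed=1):
    """Fraction of 90% z-intervals (sigma treated as known) that capture true_mean."""
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials

coverage = ci_coverage()   # should land close to 0.90
```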
Unobservable regression error
Epsilon (ε), the error term
# of treatments in ANOVA with 2 factors SEX (M, F) & RACE (W, NW)
4
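The count of 4 comes from pairing every level of one factor with every level of the other. A short sketch (the dictionary layout is just one way to write it down):

```python
from itertools import product

# Each treatment is one combination of factor levels.
factors = {"sex": ["M", "F"], "race": ["W", "NW"]}
treatments = list(product(*factors.values()))
number_of_treatments = len(treatments)   # 2 levels x 2 levels
```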
Describes a model with a squared term (x^2)
Second order model or quadratic model
Sum of squared errors
SSE
E(y)
the mean (or, expected value of y)
The null hypothesis for the global F test in regression
Ho: beta1 = beta2 = ... = betak = 0
A p-value which supports Ha when alpha=.02
.002
when the relationship between y and x1 depends on x2
interaction
the object upon which measurements are taken
experimental unit
Data sleuth who can't make bricks without clay
Sherlock Holmes
Chinese sage who says a picture is worth 1,000 words
Confucius
Dr. Sincich says this about a residual plot
No pattern, no problem
From Russia, with theorem
Chebychev
Known as the father of regression for his work with Galton on sons' heights
Pearson
Galton's law, it says "tall fathers have shorter sons on average"
Law of universal regression
Star Trek's Spock knows this increases as independent variables are added to the model
R^2
Term for level of a QL variable that's assigned all 0s for dummies
Baseline / base level
Assign "1" to female and "0" to males
Dummy variable for QL variable
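The dummy-variable cards (one fewer dummy than levels, with the base level coded all 0s) can be sketched as follows. The helper name and the levels "A", "B", "C" are hypothetical.

```python
def dummy_code(values, base_level):
    """Return (column_levels, rows): one 0/1 column per non-base level."""
    levels = sorted(set(values) - {base_level})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

# A QL variable at 3 levels needs 2 dummies; base level "A" gets all 0s.
levels, rows = dummy_code(["A", "B", "C", "A"], base_level="A")
```

Here `levels` lists the two dummy columns ("B" and "C"), and every observation at the base level "A" is coded `[0, 0]`.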
Check to see if a standardized residual exceeds 2 in abs value
Check for outlier
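A rough sketch of the outlier check. It assumes a standardized residual is simply the residual divided by s (the model standard deviation); software may apply a leverage adjustment, so treat this as an approximation. The scores and s = 8 are made-up inputs.

```python
def flag_outliers(observed, predicted, s, cutoff=2.0):
    """Indices whose residual / s exceeds cutoff in absolute value."""
    flagged = []
    for i, (y, y_hat) in enumerate(zip(observed, predicted)):
        standardized = (y - y_hat) / s      # simplified standardized residual
        if abs(standardized) > cutoff:
            flagged.append(i)
    return flagged

# Hypothetical observed and predicted values.
suspects = flag_outliers([70, 95, 60], [72, 75, 61], s=8)
```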
As a young boy, he yearned to become a starship captain; instead, he now commands 800 USF students
Dr. Sincich
Term for a model that performs well when LS assumptions are violated
Robust model
1 (of 2) problems with using stepwise regression to arrive at best model
Increased likelihood of making a Type I or Type II error
number of project model that has no interaction and no curvature
model 4
Sci-fi writer who felt "statistical thinking is necessary for efficient citizenship"
H. G. Wells
The 1st to apply ANOVA methodology (at a tea party)
Fisher
Y*= square root of y
Variance-stabilizing transformation on y
An indication of MC on regression printout
Independent variables (IVs) are highly correlated
Type of data which generally leads to correlated regression errors
Time series data
Extrapolation in regression predictions
Predicting outside the data
Start with QN and QN^2, add QL, and finally add the QN*QL interactions
Writing model 1
Equation of a model with 1 QN and 1 QL that graphs as two nonparallel straight lines
E(y) = beta0 + beta1x1 + beta2x2 + beta3x1x2
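To see why this model graphs as two nonparallel lines, substitute the two values of the dummy variable. This assumes x1 is the QN variable and x2 is a 0/1 dummy for the QL variable; the card does not state the coding, so that part is an assumption.

```latex
\begin{aligned}
x_2 = 0:&\quad E(y) = \beta_0 + \beta_1 x_1 \\
x_2 = 1:&\quad E(y) = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,x_1
\end{aligned}
```

The two slopes differ by beta3, so the lines are parallel only when beta3 = 0.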
Divide the printout value in half if one-tailed; leave it alone if two-tailed
p-value of a t-test
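The halving rule and the reject rule (reject Ho when alpha exceeds the p-value) fit in one small helper. This is a sketch with a hypothetical function name; it assumes the printout reports a two-tailed p-value and that the sign of the test statistic agrees with the direction of Ha.

```python
def decide(printout_p, alpha=0.05, one_tailed=False):
    """Adjust a two-tailed printout p-value and compare it with alpha."""
    p_value = printout_p / 2 if one_tailed else printout_p
    decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
    return p_value, decision

two_tailed = decide(0.06)                    # left alone
one_tailed = decide(0.06, one_tailed=True)   # halved before comparing
```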
name of multiple comparisons method in ANOVA that controls EER
Tukey's method
Dr. Watson had the PhD, but he was the Holmesian sleuth behind the DW test for autocorrelation
Durbin
The interpretation of s = 8 in a model for y = total points on final exam
About 95% of the actual exam scores will fall within 16 points of their predicted values
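The 16 points in this card is just 2s, worked out as:

```latex
\hat{y} \pm 2s = \hat{y} \pm 2(8) = \hat{y} \pm 16
```

For a hypothetical predicted score of 75 (a made-up value), that gives a range of roughly 59 to 91.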
Extreme MC exists if the correlation between...
a QN x and a QL x exceeds .7 in absolute value
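A minimal sketch of the .7 check using the Pearson correlation between two x variables. The function names and the data are illustrative only.

```python
def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    ss_xx = sum((a - x_bar) ** 2 for a in x)
    ss_yy = sum((b - y_bar) ** 2 for b in y)
    return ss_xy / (ss_xx * ss_yy) ** 0.5

def extreme_mc(x_a, x_b, cutoff=0.7):
    """True when |r| between two independent variables exceeds the cutoff."""
    return abs(pearson_r(x_a, x_b)) > cutoff

# Hypothetical x variables: the first pair moves together, the second does not.
flag_high = extreme_mc([1, 2, 3, 4], [2, 4, 6, 9])
flag_low = extreme_mc([1, 2, 3, 4], [1, -1, -1, 1])
```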
Dummy variables are for?
QL variables
The # required is one less than the number of levels
To recommend a model you need...
No extreme MC, no outliers, low 2s, high R^2, and a passed global F-test
Durbin-Watson statistic
Ranges between 0 (positive autocorrelation) and 4 (negative autocorrelation)
Who has his PhD?
Geoffrey Watson
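The 0-to-4 range follows from how the statistic is built: the sum of squared successive differences of the residuals divided by the sum of squared residuals. A small sketch with made-up residual sequences:

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

d_positive = durbin_watson([1, 1, 1, 1])      # residuals stay the same: d = 0
d_negative = durbin_watson([1, -1, 1, -1])    # residuals alternate sign: d = 3
```

Values near 2 suggest uncorrelated errors; the alternating example only reaches 3 because the series is so short, and it approaches 4 as the series gets longer.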
Mean of errors = 0
Residuals vs. QN x
- should not see curvature; if there is curvature, add a curvature term
Constant error variance
Residuals vs. predicted y
- should not see a cone/funnel shape; if so, use a variance-stabilizing transformation on y, e.g. y* = sq. root(y)
Errors normally distributed
Stem-and-leaf plot of residuals
- should not see extreme skew; because of robustness the assumption is usually satisfied; remove outliers
Errors independent
Check whether the experimental unit is a unit of time, i.e. time series data
model #1
E(y)= β0+ β1x1+β2x1^2+β3x2+β4x1*x2+β5x1^2*x2
Interaction and curvature
model #2
E(y)= β0+ β1x1+β3x2+β4x1*x2
Interaction
model #3
E(y)= β0+ β1x1+β2x1^2+β3x2
Curvature
model #4
E(y)= β0+ β1x1+β3x2
No interaction or curvature
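The four project models differ only in which terms they include. The sketch below lists the terms each model multiplies by its betas, assuming x1 is the QN variable and x2 is a single 0/1 dummy for the QL variable (an assumption; the cards do not spell out the coding). The function names and beta values are hypothetical.

```python
def model_terms(x1, x2, model):
    """Terms that multiply the betas, in order, for project models 1-4."""
    if model == 1:   # interaction and curvature
        return [1, x1, x1 ** 2, x2, x1 * x2, x1 ** 2 * x2]
    if model == 2:   # interaction only
        return [1, x1, x2, x1 * x2]
    if model == 3:   # curvature only
        return [1, x1, x1 ** 2, x2]
    if model == 4:   # no interaction or curvature
        return [1, x1, x2]
    raise ValueError("model must be 1, 2, 3, or 4")

def expected_y(betas, x1, x2, model):
    """E(y) as the sum of beta * term for the chosen model."""
    return sum(b * t for b, t in zip(betas, model_terms(x1, x2, model)))

# Made-up betas for model 2 evaluated at x1 = 2, x2 = 1.
e_y = expected_y([10, 2, 1, 3], x1=2, x2=1, model=2)
```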
A 95% CI for mu will be wider than a 95% PI for y
True or false?
false
A 95% CI for mu will be wider than a 90% PI for y
True or false?
false (in general; the PI's extra error term usually keeps even a 90% PI wider than a 95% CI)
Models that have slopes
#2 and #4
How many levels of the QN variable should you have for each level of the QL variable?
3
Suppose an independent researcher is looking to see if there is a relationship between gender (M/F) and favorite color (Red/Yellow/Blue/Green). Maybe males prefer the color red or maybe females prefer the color yellow. In order to test whether an individual's favorite color depends on their gender, what is the appropriate data analysis technique AND the appropriate null hypothesis?
Chi-square test of independence
Ho: Color and Gender are independent
In a general sense, what is a Z-score?
It is a measure of how far from the mean a value lies, and it is also a measure of how statistically "unusual" an observation is for a population or sample
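In symbols, a z-score is the distance from the mean measured in standard deviations. A one-line sketch with made-up numbers:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical exam score of 91 in a population with mean 75 and SD 8.
z = z_score(91, mean=75, sd=8)   # 2 standard deviations above the mean
```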
When referencing a Chi-Square analysis, the assumption of a large sample is sufficiently satisfied when:
Expected value in each cell is greater than or equal to 5
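Expected cell counts come from (row total x column total) / n. The sketch below computes them, checks the at-least-5 rule, and forms the chi-square statistic; the 2x2 table is made-up data and the function names are hypothetical.

```python
def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def large_sample_ok(table, minimum=5):
    """True when every expected cell count is at least the minimum."""
    return all(e >= minimum for row in expected_counts(table) for e in row)

def chi_square(table):
    """Sum over cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

observed = [[10, 20], [30, 40]]      # hypothetical two-way table
expected = expected_counts(observed)
statistic = chi_square(observed)
```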
The null and alternative hypotheses for a two-way analysis testing whether the sentence (death or not) of a convicted murderer depends on the victim's race (White, Black) are as follows
Ho: Victim's race and sentence are independent
Ha: Victim's race and sentence are dependent
Which interval will be the widest?
a. A 95% CI for E(y)
b. A 90% CI for E(y)
c. A 95% PI for y
d. A 90% PI for y
95% PI for y
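The ordering follows from the usual interval formulas. For simple linear regression they can be written as below, where x_p (the x value of interest) and SS_xx are notation introduced here and not taken from the cards:

```latex
\begin{aligned}
\text{CI for } E(y):&\quad \hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}} \\
\text{PI for } y:&\quad \hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{SS_{xx}}}
\end{aligned}
```

The extra 1 under the square root makes the PI wider than the CI at the same confidence level, and a higher confidence level increases the t multiplier, so the 95% PI is the widest of the four.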
A magazine suggests that ¾ of Honda customers are satisfied (Yes/No) with their Honda vehicles. Out of a sample of 2,000 Honda vehicle owners, 1,358 said they were satisfied. If you wanted to support or refute the magazine's claim, how would you proceed?
Conduct a large-sample z-test for a population proportion to determine if p < 0.75, and construct a confidence interval for the true proportion of satisfied owners
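The Honda numbers can be worked through with a short sketch. It uses the large-sample z statistic, which is the usual choice for a proportion, with a lower-tailed alternative (p < 0.75) and a 95% interval; the function name and the 1.96 critical value are choices made for this example.

```python
import math

def proportion_z_test(successes, n, p0, z_crit=1.96):
    """Lower-tailed large-sample z-test for a proportion plus a CI for p."""
    p_hat = successes / n
    se_null = math.sqrt(p0 * (1 - p0) / n)           # standard error under Ho
    z = (p_hat - p0) / se_null
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))     # P(Z <= z)
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)      # standard error for the CI
    ci = (p_hat - z_crit * se_hat, p_hat + z_crit * se_hat)
    return p_hat, z, p_value, ci

p_hat, z, p_value, ci = proportion_z_test(1358, 2000, p0=0.75)
```

With these data the sample proportion is 0.679, z is about -7.3, and the whole interval sits below 0.75, so the sample does not support the magazine's claim.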
Multicollinearity:
- can occur in multiple regression, but not simple linear regression
- is a correlation between different x variables in a model