Terms in this set (78)
OLS, or Ordinary Least Squares, is....
the method through which we minimize the sum of squared residuals in order to estimate our line of best fit
=> find our estimates for our parameters (Beta_1 hat and Beta_0 hat)
our estimate error term "u hat" can be re-written as....
u_i hat = y_i - y_i hat = y_i - (Beta_0 hat + Beta_1 hat*x_i)
How do we get our estimates for the parameters using OLS?
We are minimizing the Sum of Squared Residuals so we...
=> take partial derivatives in order to solve for Beta1 hat and Beta0 hat
formula for Beta1 hat
= sum of (x_i - x bar)(y_i - y bar) / sum of (x_i - x bar)^2
=> i.e., the covariance between x and y divided by the variance of x
to be statistically significant at the 95% confidence level, we need our t-statistic to be greater than ....
2 (more precisely, 1.96)
the formula for our t statistic is...
t-stat = | Beta_1 hat / SE(Beta_1 hat) | > 2
formula for Beta 0 hat
= mean of y- (Beta_1 hat*mean of x)
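The two closed-form estimates above can be sketched in a few lines of numpy (the course uses Stata; this is just a Python illustration with made-up schooling/wage data):

```python
import numpy as np

# Hypothetical data: x = years of schooling, y = hourly wage
x = np.array([8, 10, 12, 12, 14, 16, 16, 18], dtype=float)
y = np.array([9.0, 10.5, 13.0, 12.0, 15.5, 17.0, 18.5, 20.0])

# Beta_1 hat = sum of (x_i - xbar)(y_i - ybar) / sum of (x_i - xbar)^2
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# Beta_0 hat = ybar - Beta_1 hat * xbar
beta0_hat = y.mean() - beta1_hat * x.mean()

# Cross-check against numpy's own least-squares fit
b1_np, b0_np = np.polyfit(x, y, 1)
print(round(beta1_hat, 4), round(beta0_hat, 4))
```

The hand-computed estimates match `np.polyfit`, since both minimize the same sum of squared residuals.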
OLS is BLUE
Best Linear Unbiased Estimator
what are the five Gauss-Markov assumptions? (and mention the unofficial GM 6 assumption)
1) Linear in Parameters
2) random sampling
=> E[Beta_1 hat] = Beta_1
3) sample variation in the explanatory variable
=> the variation in x [sum of (x_i - x bar)^2] must be greater than zero so that our Beta_1 hat can be defined!!!
4) zero conditional mean
=> E[u_i | x_i] = 0
5) Homoskedasticity (constant variance in the error term for any given x_i)
6) normally distributed residuals/large sample size
Our ideal estimator is
1) unbiased [or accurate]
2) efficient [or precise]
which GM assumptions make what aspects of OLS true?
GM1 (linear in parameters) and GM3 (variation in the explanatory variable) => Linear
GM2 (random sampling) and GM4 (zero conditional mean) => Unbiased
GM5 (homoskedasticity) => Best
RMSE is the average....
size of the residuals
Report RMSE if you want to talk about the....
overall goodness of fit of your model
report adjusted R^2 to talk about ...
how much of the overall variance you are accounting for
=> (so a higher R^2 means you're doing a better job of explaining the variation in y associated with movement in x)
What is the formulaic expression for random sampling/GM2?
E[Beta_1 hat] = Beta_1
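A small Monte Carlo sketch (in Python, with made-up parameter values) shows what E[Beta_1 hat] = Beta_1 means in practice: averaged over many random samples where E[u|x] = 0 holds, the OLS slope centers on the truth.

```python
import numpy as np

# Monte Carlo: with random sampling and zero conditional mean, the average
# of Beta_1 hat across many samples should equal the true Beta_1.
rng = np.random.default_rng(0)
true_beta0, true_beta1 = 2.0, 0.5   # assumed "true" parameters
estimates = []
for _ in range(2000):
    x = rng.uniform(0, 10, size=50)
    u = rng.normal(0, 1, size=50)   # E[u|x] = 0 holds by construction
    y = true_beta0 + true_beta1 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b1)

print(round(np.mean(estimates), 3))   # close to the true 0.5
```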
Non random sampling is a violation of which GM assumption?
Violates GM 2: Random Sampling
=> makes our results biased
Why does non-random sampling stop OLS from being BLUE?
*Because it makes our estimates biased
*by not randomly selecting, our data cannot be representative of the entire population
*the derived relationship only applies to our sample
Omitted Variable Bias is a violation of which GM assumption?
Violates GM4: violation of zero conditional mean
How does omitted variable bias stop OLS from being BLUE?
Makes our estimates biased!
*If our error term contains some piece of the relationship between x and y, then we should have included that piece in our regression!
=> if we don't, we will either over- or underestimate the true effect of Beta_1
what is the formulaic way to represent homoskedasticity?
Var[u_i | x_i] = sigma^2 (a constant)
=> i.e. the variance of the residuals is the same value (a constant) for any given x
Heteroskedasticity is a violation of which GM assumption?
GM5: homoskedasticity (constant variance of the residuals)
How does heteroskedasticity stop OLS from being BLUE?
Our model is not the Best!!
*for increasing values of x our variance of our residuals is NOT CONSTANT
=> basically, our model is a good fit (more precise) for some points of data, but bad for others!
need to know __ in order to see if our Beta_1 hat is statistically different from zero
Why do we need to know the variance of our estimator in order to see if our Beta_1 hat is statistically different from zero ?
Because we need to know if there is truly a relationship between x and y!
What is the formula for the variance of Beta_1 hat?
= [sum of (residuals^2) / (n-2)] / sum of (x_i - x bar)^2
=> i.e. Var(Beta_1 hat) = sigma hat^2 / sum of (x_i - x bar)^2, where sigma hat^2 = sum of squared residuals / (n-2)
Looking at the formula for variance of Beta_1 hat, what control do we have over the variance of our estimator?
1. larger sample size
2. larger variance in x
3. smaller residuals
**these will all make the variance of Beta_1 hat smaller => therefore giving us a more precise estimate :)
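The variance formula and the t-statistic can be sketched together in Python (hypothetical near-linear data, so the t-stat comes out large):

```python
import numpy as np

# Var(Beta_1 hat) = [sum(u_hat^2)/(n-2)] / sum((x_i - xbar)^2)
x = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)                       # residuals

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)       # estimated error variance
var_b1 = sigma2_hat / np.sum((x - x.mean()) ** 2)
se_b1 = np.sqrt(var_b1)
t_stat = abs(b1 / se_b1)

print(t_stat > 2)   # reject H0: Beta_1 = 0 if |t| > ~2
```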
Our null hypothesis we're testing is
H_o : Beta_1 = 0
What do we need to look at to reject the null that H_o : Beta_1 = 0
calculate our t statistic for Beta_1 hat and see whether it is greater than 1.96
=> if so, then we can reject the null and say that our x has a statistically significant impact on our y
What is the unofficial GM assumption?
GM6: our errors are normally distributed
what are some causes of violations of GM6 (list 3) ?
1) big outliers pose a threat => OLS gives them extra weight because residuals are squared
2) variables are not normally distributed
3) violations of linearity
Our GM assumptions are additive, in other words they ...
build up on top of each other!!
explain how our GM assumptions are additive
GM1 (linear in parameters) and GM3 (variation in our explanatory variable) necessary to estimate Beta_1 hat
GM2 (random sampling) and GM4 (zero conditional mean) necessary to say Beta_1 hat is unbiased
GM5 (homoskedasticity) necessary to derive our variance of Beta_1 hat
GM6 (normal distribution of our residuals/large sample size) is necessary for hypothesis testing
Why do we need to have the GM6 assumption? I.e., why do our errors have to be normally distributed?
In order to do hypothesis testing, we MUST ensure that our estimates for Beta_1 hat and variances of Beta_1 hat follow a normal distribution
=> we do this by making sure that the errors are also normally distributed and that we have a large sample size
In order to conduct our hypothesis testing, what must be true? What GM assumption will make this possible?
Our estimated Beta_1 hat and our estimated variance of Beta_1 hat MUST be normally distributed
GM6 => normally distributed residuals/large sample size will allow us to conduct hypothesis testing
what are some reasons to transform a variable?
1) must have a linear relationship between x and y
=> if our parameters are not linear, we're violating GM1 and can't estimate Beta_1 hat
2) issues with functional form
3) take care of other problems like non-normal errors in our IVs
3 types of transformations:
1. standardized variables
2. logarithmic variables
3. quadratic variables
If our p value is small, then the probability that we reject the null (Ho: Beta_1 = 0), given that the null is true, is very small
our p value will tell us the probability of incorrectly rejecting the null
how do you transform an IV into a standardized IV?
subtract the mean and divide by the standard deviation
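That recipe is one line of numpy (Python sketch with hypothetical SAT scores; `ddof=1` uses the sample standard deviation):

```python
import numpy as np

# Standardize an IV: subtract the mean, divide by the standard deviation
sat = np.array([1050., 1200., 1340., 980., 1490., 1130.])   # hypothetical scores
sat_std = (sat - sat.mean()) / sat.std(ddof=1)               # sample SD

# A standardized variable has mean 0 and standard deviation 1
print(round(sat_std.mean(), 6), round(sat_std.std(ddof=1), 6))
```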
why transform a variable into a standardized variable?
*nice if you want to get different-sized variables on the same scale in order to compare impacts intuitively
*also helpful if you're interested in understanding how things relate to an average/"normal" value
=> looking at impact of high temps, high test scores, etc.
transformation- standardized variables: How do we interpret a 1 unit change in x, if x is standardized ?
If x is standardized, a 1 unit change in x (on Stata) can now be interpreted as a 1 standard deviation change in x
ex: If our SAT scores goes up by 1 standard deviation, it leads to a .27 increase in college GPA
what type of transformation lets us say that a 1 standard deviation change in x leads to a " " unit change in y?
standardizing your variables
=> in this case we standardized x
what are the three reasons to transform using logarithms?
1) to linearize data with diminishing returns or an exponential relationship
2) to interpret a relationship thru percent changes rather than a level-level interpretation
3) when we have highly skewed data like income and population => transforming makes the variable closer to normal distribution
why would we do a log transformation of variables like income that cannot take on negative values?
Because they can't take negative values => they are typically highly skewed and have long right tails!
*by doing a log transformation, you're making the distribution of the variable closer to normal
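A quick simulation (Python, made-up lognormal "income" data) illustrates the point: before logging, the mean sits well above the median (long right tail); after logging, they nearly coincide.

```python
import numpy as np

# Sketch: a non-negative, right-skewed variable (like income) becomes much
# more symmetric after a log transformation. Data here are simulated.
rng = np.random.default_rng(1)
income = rng.lognormal(mean=10, sigma=1, size=10_000)   # long right tail

log_income = np.log(income)

print(income.mean() > np.median(income))                        # right skew
print(abs(log_income.mean() - np.median(log_income)) < 0.1)     # ~symmetric
```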
level-level interpretation of coefficients
a 1-unit change in x leads to a Beta_1 hat unit change in y
level-log interpretation of coefficients
a 1% change in x results in a (Beta_1 hat/100) unit change in y
log-level interpretation of coefficients
a 1 unit change in x leads to a (Beta_1 hat*100) percent change in y
log-log interpretation of coefficients
a 1% change in x results in a Beta_1 hat percent change in y
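These percent-change readings are approximations that work best for small coefficients. A quick Python check of the log-level rule (with an assumed Beta_1 hat of 0.05): the exact percent change in y for a 1-unit change in x is (e^b1 - 1)*100, which is close to b1*100.

```python
import numpy as np

# Log-level rule of thumb: in log(y) = b0 + b1*x, a 1-unit change in x
# multiplies y by exp(b1), i.e. roughly a (b1*100) percent change.
b1 = 0.05                                # hypothetical coefficient
exact_pct = (np.exp(b1) - 1) * 100       # exact percent change in y
approx_pct = b1 * 100                    # the flashcard's rule of thumb

print(round(exact_pct, 2), approx_pct)   # the two are close for small b1
```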
when not to log a variable:
* don't log variables like years and time
*don't log a variable that is already a percent
*don't log a variable that has zero or negative values
interpreting coefficients with quadratic transformation:
a 1 unit change in x produces a (Beta_1 hat + 2*Beta_2 hat*x) change in y
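The key point is that the marginal effect depends on where x starts. A minimal Python sketch with hypothetical coefficients (b1 = 4, b2 = -0.25, a diminishing-returns shape):

```python
# In y = b0 + b1*x + b2*x^2, the marginal effect is dy/dx = b1 + 2*b2*x
b1, b2 = 4.0, -0.25          # hypothetical coefficients

def marginal_effect(x):
    return b1 + 2 * b2 * x

# The same 1-unit change in x has a big effect at x=0 and none at x=8
print(marginal_effect(0), marginal_effect(8))   # 4.0 and 0.0
```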
In multivariate regression, interpret Beta_k hat
Beta_k hat is simply the change in y due to a one-unit change in x_k, holding all other IVs constant
With omitted variable bias, we incorrectly estimate the impact of our variable of interest (Beta_1 hat). Are we still able to isolate the effect of Beta_1 hat?
NO! We are unable to isolate the effect of Beta_1 hat b/c all parts of our estimated equation are biased
=> we're using a model that says changes in y are happening ONLY because of x_1 but in reality, these changes are also driven by other IVs!!!!
With the signing bias two by two, what does Beta_2 hat represent?
it's the sign of the relationship btwn the omitted variable, x_2, and y
=> the impact of x_2 on y
With the signing bias two by two, what does lowercase delta represent?
the sign of the relationship btwn x_1 and x_2
=> the slope from regressing the omitted x_2 on x_1
what are the assumptions to estimate Beta hat (in a multivariate regression)?
GM1: Linear in parameters
GM3: variation in independent variable of interest
NO PERFECT COLLINEARITY
to estimate beta hat in a multivariate sense, we add on to GM assumptions 1 and 3 and say that we cannot have...
perfect collinearity among our IVs
what are the assumptions for beta hat to be unbiased?
GM2 : random sampling of observations
GM4: zero conditional mean for ALL x_k
what are the assumptions to calculate the variance of beta hat?
GM5: homoskedasticity
=> need to have the same variance for any level of x_k
what are the assumptions to do hypothesis testing ?
GM6: need to have normally distributed errors for ALL x_k
Doing OLS in a multivariate world.... what happens when we have perfect collinearity?
then we can't even estimate our Beta hats => why STATA drops one of the IVs when doing interactions, for example
what does R^2 measure?
measures the amount of variation in the DV explained by the IVs
*R^2 flaw: the score goes up when you simply add more IVs
*adj R^2 adjusts for degrees of freedom and penalizes you for adding IVs that don't really help
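Both measures can be computed by hand; a Python sketch with hypothetical data (k = number of IVs) shows that adjusted R^2 is always a bit below R^2:

```python
import numpy as np

# R^2 = 1 - SSR/SST; adjusted R^2 = 1 - (1 - R^2)*(n - 1)/(n - k - 1)
x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.8])
n, k = len(x), 1                                  # one IV

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

r2 = 1 - np.sum(u_hat ** 2) / np.sum((y - y.mean()) ** 2)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(r2, 4), round(adj_r2, 4))   # adj R^2 is slightly smaller
```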
outliers are points with large residuals that can pull (distort) our regression line
What's one way we include qualitative info in our models?
Binary/dummy/indicator variables :)
When we create factor variables that have a numeric code overlaying a qualitative value (think district 1 = 1, district 2 = 2, etc.) what do we have to do?? (STATA can do this automatically)
MUST DROP one of the categories => this makes it the base category
*thus avoiding perfect multicollinearity
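A minimal Python sketch of the same idea (hypothetical district codes): the full set of indicator columns sums to 1 in every row, which is exactly collinear with the intercept, so one column must be dropped.

```python
import numpy as np

# One-hot encode a factor variable, then drop one category as the base
district = np.array([1, 2, 3, 2, 1, 3])          # hypothetical factor codes
categories = np.unique(district)                  # [1, 2, 3]

dummies = (district[:, None] == categories[None, :]).astype(float)
dummies_no_base = dummies[:, 1:]                  # district 1 is the base

# Every row of the full set sums to 1 -- perfectly collinear with a constant
print(dummies.sum(axis=1))
```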
what is an interaction term conceptually? what relationship is it exploring?
It is simply a generated Independent Variable that tests the impact of two variables acting together
=> exploring the possibility of an amplification effect
interaction term interpretation: binary#binary
simple in that we can see how wages differ for people who are both female and minority
*we're able to find predicted values for particular categories (basically adding B_0 + B_female + B_minority + B_female#minority)
interaction term interpretation: binary#continuous
lets the slope on the continuous variable differ by group
=> ex: the effect of education on wages can differ for women vs. men
interaction term interpretation: continuous#continuous
ex: measuring how an additional year of education impacts wages at different levels of another continuous variable (e.g., experience)
ex: reg case_age_days c.avg_agi##c.district
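Under the hood, an interaction term is just a new generated column: the product of the two variables. A Python sketch with hypothetical education/experience data (what Stata's `c.x1##c.x2` builds automatically):

```python
import numpy as np

# An interaction term is a generated IV: the product of two variables
educ = np.array([12., 16., 12., 18., 14.])    # hypothetical years of education
exper = np.array([10., 5., 20., 3., 8.])      # hypothetical years of experience

interaction = educ * exper
# Design matrix: constant, both main effects, and the interaction
X = np.column_stack([np.ones_like(educ), educ, exper, interaction])

print(X.shape)   # 5 observations, 4 columns
```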
what is a proxy variable
a proxy variable is one that stands in for the concept you are really trying to capture
what is the 1st assumption for proxy variables?
#1: the proxy variable itself does NOT impact the dependent variable
what is the 2nd assumption for proxy variables?
the proxy variable itself captures most of the variation due to the omitted variable it is standing in for
what are the two types of measurement error?
systematic error (error correlated with IV)
stochastic error (error NOT correlated with IV)
what's worse: stochastic error in the IV or in the DV?
stochastic error in the DV is OK because the model already allows for a certain level of noise
whereas stochastic error in the IV leads to attenuation bias => violating GM4 by underestimating the impact of Beta_1 hat
attenuation bias arises with what type of measurement error?
stochastic error in the IV
what type of error leads simply to a shift in the line?
systematic error with a constant bias => shifts the intercept of the line
stochastic measurement error in the IV does or does not keep OLS BLUE
=> DOES NOT keep OLS BLUE
**stochastic measurement error makes our answers have an attenuation bias
=> violates GM 4
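Attenuation bias is easy to see in a simulation (Python sketch, simulated data): adding classical measurement error to x with the same variance as x itself should shrink the estimated slope toward roughly half its true value, since the attenuation factor is Var(x)/(Var(x)+Var(error)).

```python
import numpy as np

# Simulation: stochastic (classical) measurement error in the IV
# biases Beta_1 hat toward zero (attenuation bias).
rng = np.random.default_rng(2)
true_beta1 = 1.0
x_true = rng.normal(0, 1, size=5000)
y = 3.0 + true_beta1 * x_true + rng.normal(0, 0.5, size=5000)

# Mismeasured IV: noise variance equals Var(x), so attenuation factor ~ 0.5
x_noisy = x_true + rng.normal(0, 1, size=5000)

def ols_slope(x, y):
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

b_clean = ols_slope(x_true, y)    # near the true 1.0
b_noisy = ols_slope(x_noisy, y)   # attenuated toward ~0.5

print(round(b_clean, 2), round(b_noisy, 2))
```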
with multicollinearity, is OLS still BLUE?
Yes, but now it's less efficient (less precise) due to the large standard errors (variances)
the two main problems with multicollinearity:
1) hard to separate out the individual effect of each collinear IV
2) inflated standard errors