Upgrade to remove ads
STATISTICS: Chapter 8: Linear Regression
Terms in this set (48)
What is a model?
-an equation or formula that simplifies reality and helps us to understand underlying patterns and relationships
What is a linear model?
-an equation of a straight line through the data
-has the form of y = mx + b
-can summarize general patterns
What does linear regression allow us to understand?
-the relationship between two quantitative variables
What does it allow us to do?
-predict the value of a dependent variable based on the value of an independent variable
-explain the impact of changes in an independent variable on the dependent variable
What does "y-hat" symbolize in the world of statistics?
-the predicted value
-an estimate made from a linear model
What is a residual?
-the difference between the observed value and its associated predicted value
What does the residual value tell us?
0how far off the model's prediction is at that point
How do we calculate a residual?
residual = observed value - predicted value
What does a negative residual tell us?
-a negative residual tells us that the observed value is less than the predicted value
What does a positive residual tell us?
-a positive residual tells us that the observed value is greater than the predicted value
Can we assess how well a line fits by adding up all of the residuals?
-because the positive and negative values would cancel each other out
How do we deal with the issue?
-by squaring the residuals
What is the line of best fit?
-the line for which the sum of the squared residuals is the smallest
What is the line of best fit also known as?
-the least squares line
What notation is used to represent a straight line in the world of statistics?
y = b0 + b1*x
Why do we use y-hat?
-to emphasize that the points that satisfy this equation are just our predicted values, not the actual data values (which scatter around the line)
What does b1 represent?
-the slope of the line
What does the slope tell us?
-how rapidly y-hat changes with respect to x
What does b0 represent?
What is the model for the least squares line built from?
1. the correlation
2. the standard deviations
3. the means
What equation describes the slope of the least squares line?
b1 = r(sx/sy)
The slope inherits the sign of what values?
Do correlations have units?
Do slopes have units?
The units of a slope are always what?
-the units of y per unit of x
What conditions should be met to use a regression model?
1. the quantitative variables condition
2. the straight enough condition
3. the outlier condition
Data = _______ + _______
Data = Model + Residual
Residual = _______ - _______
Residual = Data - Model
How do you write this last equation using symbols?
e = y - y hat
When we want to know how well a model fits, what can we ask instead?
-what the model missed
To see that, what do we look at?
What value is used to measure how much residual points spread around a regression line?
-the standard deviation of the residuals
Write the equation for the standard deviation of the residuals.
-see book or flashcards
Why don't we need to subtract the mean within the above equation?
-because the mean of the residuals is equal to zero
What is key to assessing how well a model fits?
-the variation in the residuals
If the correlation between two variables is equal to one, what would the value of the residuals be equal to?
What does the squared correlation (R^2) give?
-the fraction of the data's variation accounted for by the model
What does 1-R^2 give?
-the fraction of the original variation left in the residuals
What does an R^2 of zero mean?
-that none of the variance in the data is in the model; all of it is still in the residuals
What is R^2 always between?
0% and 100%
An R^2 of what value is a perfect fit?
What is the most widely used model in all of statistics?
-the linear regression model
What type of relationships between variables are unsuitable for regression analysis?
1. Nonlinear relationships
2. Relationships with outliers (either extreme y-values or extreme x-values)
3. Relationships that feature variables that are not quantitative
What is a dependent variable?
-a variable that we wish to predict or explain
What is an independent variable?
-a variable that is used to predict or explain the dependent variable
What is the ideal value for a residual?
What do residuals help us to see?
-whether or not the model makes sense
YOU MIGHT ALSO LIKE...
Unit 2 vocab
AP Stats Vocab (7,8,9,10)
OTHER SETS BY THIS CREATOR