AP Stats TPS4e Ch 3.2 Least-Squares Regression
Terms in this set (24)
Response variable
A variable that measures an outcome or response of a study.
Explanatory variable
A variable that may help explain or influence changes in a response variable.
Scatterplot
Shows the relationship between two quantitative variables measured on the same individuals. The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis. Each individual in the data appears as a point in the graph.
How to Make a Scatterplot
1. Decide which variable should go on each axis.
2. Label and scale your axes.
3. Plot individual data values.
How to Examine a Scatterplot
Look for the overall pattern and for striking departures from that pattern. You can describe the overall pattern of a scatterplot by the direction, form, and strength of the relationship. An important kind of departure is an outlier, an individual value that falls outside the overall pattern of the relationship.
Positive association
When above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.
Negative association
When above-average values of one variable tend to accompany below-average values of the other, and vice versa.
Correlation
Measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.
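One common way to compute r is to standardize each variable and average the products of the z-scores (dividing by n - 1). The sketch below assumes that formula and uses a small made-up data set; the function names are illustrative only.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Sample standard deviation (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    # r = (1 / (n - 1)) * sum of (z-score of x) * (z-score of y)
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    products = (((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

A positive r here matches the positive association visible in the data: larger x values tend to go with larger y values.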
Facts about Correlation
1. Correlation makes no distinction between explanatory and response variables.
2. Because r uses the standardized values of the observations, r does not change when we change the units of measurement of x, y, or both.
3. Correlation r has no unit of measurement.
4. Correlation does not imply causation.
5. Correlation requires that both variables be quantitative.
6. Correlation measures the strength of only the linear relationship between two variables; it does not describe curved relationships.
7. An r value close to 1 or -1 does not guarantee a linear relationship.
8. Correlation is not resistant to outliers.
9. Correlation is not a complete summary of two-variable data.
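A few of these facts can be checked numerically. The sketch below uses made-up data and an illustrative `correlation` helper to show that swapping x and y leaves r unchanged (fact 1), that changing units leaves r unchanged (fact 2), and that a single outlier can change r a lot (fact 8).

```python
from math import sqrt

def correlation(xs, ys):
    # Pearson correlation written with sums of squared deviations.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_xy = correlation(xs, ys)
r_yx = correlation(ys, xs)                 # fact 1: roles swapped

xs_rescaled = [2.54 * x for x in xs]       # fact 2: e.g. inches to centimeters
ys_rescaled = [10 * y + 32 for y in ys]    # any positive rescaling and shift
r_rescaled = correlation(xs_rescaled, ys_rescaled)

# Fact 8: one unusual point (x = 10, y = 0) added to the same data.
r_with_outlier = correlation(xs + [10], ys + [0])

print(round(r_xy, 3), round(r_yx, 3), round(r_rescaled, 3), round(r_with_outlier, 3))
```

With this data the single added point is enough to turn a positive correlation into a negative one.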
Regression line
A line that describes how a response variable y changes as an explanatory variable x changes. We often use this to predict the value of y for a given value of x.
y hat (predicted value)
The predicted value of the response variable y for a given value of the explanatory variable x.
Slope
Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form y hat = a + bx. In this equation, b is the slope, the amount by which y is predicted to change when x increases by one unit.
y intercept
Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form y hat = a + bx. In this equation, the number a is the y intercept, the predicted value of y when x = 0.
Regression line equation
y hat = a + bx, where a is the y intercept and b is the slope.
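To make the roles of a and b concrete, here is a minimal sketch that evaluates y hat = a + bx for a hypothetical line. The values a = 2.2 and b = 0.6 are made up for illustration, and `predict` is an illustrative name.

```python
def predict(x, a, b):
    # Predicted response: y hat = a + b * x
    return a + b * x

# Hypothetical intercept and slope, for illustration only.
a, b = 2.2, 0.6

at_zero = predict(0, a, b)                            # equals the y intercept a
change_per_unit = predict(4, a, b) - predict(3, a, b)  # equals the slope b

print(at_zero, round(change_per_unit, 6))
```

The prediction at x = 0 is the intercept, and moving x up by one unit changes the prediction by the slope.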
Extrapolation
The use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate.
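One simple safeguard is to flag any prediction whose x value lies outside the range of x values used to fit the line. The sketch below assumes a hypothetical line and made-up data; note that it flags every x outside the observed range, which is stricter than "far outside".

```python
def predict_with_flag(x, a, b, observed_xs):
    # Returns the prediction and whether x lies outside the observed x range.
    y_hat = a + b * x
    outside = x < min(observed_xs) or x > max(observed_xs)
    return y_hat, outside

# Hypothetical line and made-up x values, for illustration only.
a, b = 2.2, 0.6
observed_xs = [1, 2, 3, 4, 5]

inside_prediction = predict_with_flag(3, a, b, observed_xs)
outside_prediction = predict_with_flag(50, a, b, observed_xs)

print(inside_prediction[1], outside_prediction[1])  # False True
```

The line still produces a number at x = 50, but nothing in the data supports trusting it.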
Residual
The difference between an observed value of the response variable and the value predicted by the regression line. That is,
___________ = observed y - predicted y = y - y hat.
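As a quick illustration, the sketch below computes observed y minus predicted y for each point of a made-up data set, using a hypothetical line y hat = 2.2 + 0.6x.

```python
# Made-up data and a hypothetical line, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

# Each entry is observed y minus predicted y for one point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print([round(e, 2) for e in residuals])  # [-0.8, 0.6, 1.0, -0.6, -0.2]
```

A positive value means the point lies above the line; a negative value means it lies below.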
Least-squares regression line
The _____________________ of y on x is the line that makes the sum of the squared residuals as small as possible.
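The line with the smallest sum of squared residuals has a known closed form: slope b = (sum of (x - x bar)(y - y bar)) / (sum of (x - x bar)^2) and intercept a = y bar - b * x bar. These formulas are standard but are not stated in this set, so treat the sketch below (made-up data, illustrative function names) as an assumption-labeled example.

```python
def least_squares_line(xs, ys):
    # Closed-form slope and intercept that minimize the sum of squared residuals.
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Nudging the intercept or slope away from the fitted values makes the sum larger.
nudged_intercept = sum_squared_residuals(xs, ys, a + 0.1, b)
nudged_slope = sum_squared_residuals(xs, ys, a, b + 0.1)

print(round(a, 4), round(b, 4), round(best, 4))  # 2.2 0.6 2.4
```

Comparing against the two nudged lines is not a proof of minimality, only a sanity check that the fitted line does better than nearby alternatives.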
Residual plot
A scatterplot of the regression residuals against the explanatory variable (or equivalently, against the predicted y-values). These plots help us assess how well a regression line fits the data.
Standard deviation of the residuals (s)
If we use a least-squares line to predict the values of a response variable y from an explanatory variable x, ______________ is the square root of the sum of the squared residuals divided by n - 2, where n is the sample size: s = sqrt( sum of (y - y hat)^2 / (n - 2) ). This value gives the approximate size of a "typical" or "average" prediction error (residual).
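The formula can be checked on a small example. The sketch below uses made-up data and the line y hat = 2.2 + 0.6x, which is assumed here to be the least-squares line for that data.

```python
from math import sqrt

# Made-up data; a and b are assumed to be the least-squares fit for it.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
n = len(xs)

# s = square root of (sum of squared residuals) / (n - 2)
s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(round(s, 4))  # 0.8944
```

Here a typical prediction misses the observed y by a little under 0.9 units.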
The coefficient of determination, r squared
The fraction of the variation in the values of y that is accounted for by the least-squares regression line of y on x.
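One way to compute this fraction is 1 - SSE / SST, where SSE is the sum of squared residuals from the least-squares line and SST is the sum of squared deviations of y from its mean. That formula is standard but not stated in this set. The sketch below uses made-up data and also checks that the result matches the square of the correlation r.

```python
from math import sqrt

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
sst = sum((y - y_bar) ** 2 for y in ys)   # total variation in y

# Least-squares slope and intercept.
b = sxy / sxx
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained variation
r_squared = 1 - sse / sst

r = sxy / sqrt(sxx * sst)  # correlation, for comparison

print(round(r_squared, 4), round(r ** 2, 4))  # 0.6 0.6
```

For this data, the line accounts for about 60% of the variation in y.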
Correlation and Regression Facts
1. The distinction between explanatory and response variables is important in regression.
2. Correlation and regression lines describe only linear relationships.
3. Correlation and least-squares regression lines are not resistant.
Outlier
An observation that lies outside the overall pattern of the other observations. Points that are ________ in the y direction but not the x direction of a scatterplot have large residuals. Other _________ may not have large residuals.
Influential observation
An observation is _________ for a statistical calculation if removing it would markedly change the result of the calculation. Points that are outliers in the x direction of a scatterplot are often ______ for the least-squares regression line.
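The effect of a point that is extreme in the x direction can be seen by fitting the least-squares line with and without it. The sketch below uses made-up data with one added point at x = 10; the function name is illustrative.

```python
def least_squares_line(xs, ys):
    # Closed-form least-squares intercept a and slope b.
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Made-up data; the last point is far from the others in the x direction.
xs = [1, 2, 3, 4, 5, 10]
ys = [2, 4, 5, 4, 5, 0]

a_all, b_all = least_squares_line(xs, ys)
a_without, b_without = least_squares_line(xs[:-1], ys[:-1])

# Residuals from the line fitted to all six points.
residuals = [y - (a_all + b_all * x) for x, y in zip(xs, ys)]
unusual_residual = abs(residuals[-1])
largest_other_residual = max(abs(e) for e in residuals[:-1])

print(round(b_all, 3), round(b_without, 3))
print(round(unusual_residual, 3), round(largest_other_residual, 3))
```

Removing the point flips the slope from negative to positive, so it is influential. Its own residual is not the largest, because the line has been pulled toward it.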
Association does not imply causation
An association between an explanatory variable x and a response variable y, even if very strong, is not by itself good evidence that changes in x actually cause changes in y.