17 terms

AP Stats Chapter 3

Response variable
A response variable measures an outcome of a study
Explanatory variable
An explanatory variable may help explain or influence changes in a response variable
Scatterplot
A scatterplot shows the relationship between two quantitative variables measured on the same individuals

The values of one variable appear on the horizontal axis, and the values of the other variable appear on the vertical axis

Each individual in the data appears as a point in the graph
Positive association
Two variables have a positive association when above-average values of one tend to accompany above-average values of the other, and when below-average values also tend to occur together
Negative association
Two variables have a negative association when above-average values of one tend to accompany below-average values of the other, and vice versa
Correlation r
The correlation r measures the direction and strength of the linear relationship between two quantitative variables
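
As a minimal sketch, r can be computed as the sum of the products of the standardized x and y values, divided by n - 1. The card does not give this formula, and the function name and the small data set below are made up for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)  # positive: y tends to rise as x rises
```

The sign of r gives the direction of the linear relationship, and values closer to -1 or 1 indicate a stronger one.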
Regression line
A regression line is a line that describes how a response variable y changes as an explanatory variable x changes

We often use a regression line to predict the values of y for a given value of x
Regression line, predicted value, slope, y intercept
Suppose that y is a response variable and x is an explanatory variable

A regression line relating y to x has an equation of the form

ŷ = a + bx

In this equation,
ŷ is the predicted value of the response variable y for a given value of the explanatory variable x
b is the slope, the amount by which y is predicted to change when x increases by one unit
a is the y intercept, the predicted value of y when x = 0
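
A small sketch of using the equation ŷ = a + bx for prediction. The function name and the values of a and b are hypothetical, chosen only to show the roles of the slope and the y intercept.

```python
def predict(a, b, x):
    """Predicted value y-hat = a + b*x for a line with y intercept a and slope b."""
    return a + b * x

# Hypothetical line: y intercept a = 2.2, slope b = 0.6
a, b = 2.2, 0.6
y_hat_at_0 = predict(a, b, 0)  # equals a, the predicted y when x = 0
y_hat_at_3 = predict(a, b, 3)
y_hat_at_4 = predict(a, b, 4)  # larger than y_hat_at_3 by the slope b
```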
Extrapolation
Extrapolation is the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line

Such predictions are often not accurate
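
One way to guard against extrapolation is to compare a new x with the range of x values used to obtain the line. The sketch below flags any x outside that range; how far outside counts as "far" is a judgment call the code does not make. The function name and data are hypothetical.

```python
def is_outside_observed_range(x_new, xs):
    """True when x_new lies outside the interval of x values used to fit the line."""
    return x_new < min(xs) or x_new > max(xs)

# Hypothetical x values used to obtain a regression line
xs = [1, 2, 3, 4, 5]
inside = is_outside_observed_range(3, xs)    # within the data: not extrapolation
outside = is_outside_observed_range(20, xs)  # well beyond the data: extrapolation
```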
Residual
A residual is the difference between an observed value of the response variable and the value predicted by the regression line

That is,

residual = observed y - predicted y

residual = y - ŷ
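
A short sketch of residual = y - ŷ applied to every point. The data and the line ŷ = 2.2 + 0.6x are hypothetical values used only for illustration.

```python
def residuals(xs, ys, a, b):
    """Residual for each point: observed y minus predicted y (a + b*x)."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data and line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6
res = residuals(xs, ys, a, b)
# Positive residual: the point lies above the line; negative: below it
```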
Least-squares regression line
The least-squares regression line of y on x is the line that makes the sum of the squared residuals as small as possible
Equation of the least-squares regression line
We have data on an explanatory variable x and a response variable y for n individuals

From the data, calculate the means x̄ and ȳ and the standard deviations Sx and Sy of the two variables and their correlation r

The least-squares regression line is the line ŷ = a + bx with slope

b = r(Sy / Sx)

and y-intercept

a = ȳ - bx̄
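
The two formulas can be sketched in code as follows. The function name and data are hypothetical; r is computed here from the deviations of x and y from their means, divided by (n - 1) times the two standard deviations.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r * (Sy / Sx) and a = y-bar - b * x-bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x      # slope
    a = y_bar - b * x_bar  # y intercept
    return a, b

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
```

Because a = ȳ - bx̄, the least-squares line always passes through the point (x̄, ȳ).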
Residual plot
A residual plot is a scatterplot of the residuals against the explanatory variable

Residual plots help us assess how well a regression line fits the data
Standard deviation of the residuals (s)
If we use the least-squares line to predict the values of a response variable y from an explanatory variable x, the standard deviation of the residuals (s) is given by

s = √(Σ residual^2 / (n - 2)) = √(Σ (y_i - ŷ_i)^2 / (n - 2))

This value gives the approximate size of a "typical" or "average" prediction error (residual)
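
A minimal sketch of the formula for s. The data and the line ŷ = 2.2 + 0.6x are hypothetical; the line happens to be the least-squares line for these data.

```python
from math import sqrt

def residual_std(xs, ys, a, b):
    """Standard deviation of the residuals: sqrt(sum of squared residuals / (n - 2))."""
    n = len(xs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (n - 2))

# Hypothetical data and line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s = residual_std(xs, ys, 2.2, 0.6)
```

Here s is roughly 0.89, so predictions from this line are typically off by a little under one unit of y.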
The coefficient of determination: r^2 in regression
The coefficient of determination r^2 is the fraction of the variation in the values of y that is accounted for by the least-squares regression line of y on x

We can calculate r^2 using the following formula:

r^2 = 1 - SSE / SST

where SSE = Σ residual^2 and SST = Σ (y_i - ȳ)^2
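
The formula r^2 = 1 - SSE / SST can be sketched as follows, again with hypothetical data and the hypothetical line ŷ = 2.2 + 0.6x (the least-squares line for these data).

```python
from statistics import mean

def r_squared(xs, ys, a, b):
    """Coefficient of determination: 1 - SSE / SST."""
    y_bar = mean(ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # squared residuals
    sst = sum((y - y_bar) ** 2 for y in ys)                    # squared deviations from y-bar
    return 1 - sse / sst

# Hypothetical data and line
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys, 2.2, 0.6)
```

For these data r^2 is 0.6, so the line accounts for about 60% of the variation in y.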
Outliers in regression
An outlier is an observation that lies outside the overall pattern of the other observations

Points that are outliers in the y direction but not in the x direction of a scatterplot have large residuals

Other outliers may not have large residuals
Influential observations in regression
An observation is influential for a statistical calculation if removing it would markedly change the result of the calculation

Points that are outliers in the x direction of a scatterplot are often influential for the least-squares regression line
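
To see influence in action, fit the line with and without a suspect point and compare. The sketch below uses hypothetical data whose last point is an outlier in the x direction. The slope is computed as Σ(x - x̄)(y - ȳ) / Σ(x - x̄)^2, which is algebraically the same as b = r(Sy / Sx).

```python
from statistics import mean

def slope(xs, ys):
    """Least-squares slope, equivalent to r * (Sy / Sx)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: the last point (20, 2) lies far from the others in x
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 5, 4, 5, 2]

slope_all = slope(xs, ys)                # fitted with the outlier
slope_without = slope(xs[:-1], ys[:-1])  # fitted after removing it
```

Removing the single point changes the slope from negative to positive, so that point is influential for the least-squares line.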