41 terms

Response variable

Measures the outcome of a study, dependent variable, y

Explanatory variable

Attempts to explain observed outcomes, independent variable, x

Scatterplots

Show the relationship between two quantitative variables (bivariate data). Each individual in the data set appears as a single point. All data points are plotted but not connected

Positive association

As x increases, y increases

Negative association

As x increases, y decreases

Correlation (r)

Measures the strength and direction ( + or - ) of the linear relationship between two quantitative variables
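
As a worked illustration, here is a minimal sketch of how r can be computed from its definition. The data (`hours`, `scores`) and the helper name `correlation` are made up for this example.

```python
import math

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied (x) and quiz score (y)
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = correlation(hours, scores)
print(round(r, 2))  # about 0.77: a fairly strong, positive association
```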

Words to describe strength

Strong (when r is close to 1 or -1), moderately strong/weak, weak (when r is close to 0)

Words to describe direction

Positive, negative

Correlation coefficient

r

r is resistant or non-resistant to outliers?

Non-resistant
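
To see why r is called non-resistant, the sketch below uses made-up points (the data and the helper name `correlation` are assumptions for illustration): five points on a perfect line, then the same points plus one outlier.

```python
import math

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Five points that fall exactly on the line y = x
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]
r_clean = correlation(xs, ys)

# Same points plus a single outlier at (6, 0)
r_with_outlier = correlation(xs + [6], ys + [0])

print(round(r_clean, 2))         # 1.0
print(round(r_with_outlier, 2))  # about 0.14
```

One unusual point is enough to pull r from a perfect 1 down to a weak value.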

If r=1

Perfect positive linear relationship (every point lies exactly on a line with positive slope)

If r=-1

Perfect negative linear relationship (every point lies exactly on a line with negative slope)

What are the units of r?

r has no units
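
Because r has no units, changing the units of x or y should leave it unchanged. The sketch below checks this with made-up heights and weights; the data, variable names, and helper `correlation` are assumptions for illustration.

```python
import math

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data in inches and pounds
heights_in = [60, 62, 65, 68, 70]
weights_lb = [115, 120, 140, 155, 160]

# The same data converted to centimeters and kilograms
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_imperial = correlation(heights_in, weights_lb)
r_metric = correlation(heights_cm, weights_kg)

# The two values agree up to floating-point rounding
print(math.isclose(r_imperial, r_metric))
```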

What is the range of r?

-1 ≤ r ≤ 1

Residual

Observed value (y) - predicted value (y "hat")

Least squares regression line (LSRL)

The line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible. The LSRL minimizes the total area of all of the squares.

AKA "prediction line"
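
The "smallest sum of squares" idea can be checked numerically. This is a minimal sketch with made-up data; the helper names `fit_lsrl` and `sum_squared_residuals` and the comparison line are assumptions for illustration. It uses the standard formulas b1 = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b0 = ȳ - b1·x̄.

```python
def fit_lsrl(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_lsrl(xs, ys)  # roughly b0 = 2.2, b1 = 0.6
sse_lsrl = sum_squared_residuals(xs, ys, b0, b1)

# Any other line, for example y_hat = 2.0 + 0.7x, gives a larger sum
sse_other = sum_squared_residuals(xs, ys, 2.0, 0.7)

print(sse_lsrl < sse_other)  # True
```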

What point is always on the LSRL?

(x̄, ȳ)
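
A quick numerical check that the point of means lies on the LSRL, using made-up data (the values and variable names are assumptions for illustration). It follows directly from the intercept formula b0 = ȳ - b1·x̄.

```python
xs = [2, 4, 6, 8]
ys = [3, 7, 8, 12]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

# Plug x-bar into the line: the prediction equals y-bar
prediction_at_mean = b0 + b1 * x_bar
print(abs(prediction_at_mean - y_bar) < 1e-9)  # True
```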

Equation of the LSRL

ŷ = b0 + b1x

Defining x and y

Where x denotes _______ and y denotes predicted _______

If residual is positive...

residual = (y) - (y "hat")

The predicted y was less than the observed y

Prediction was an underestimate

If residual is negative...

residual = (y) - (y "hat")

The predicted y (y "hat") was greater than the observed y

Prediction was an overestimate

If residual = 0

y - ŷ = 0

y = ŷ

Prediction was accurate
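
The three residual cases can be sketched in a few lines. The line ŷ = 2.0 + 0.5x and the helper names `predict` and `describe_residual` are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def predict(x, b0=2.0, b1=0.5):
    """Predicted y (y-hat) from an assumed line y_hat = b0 + b1 * x."""
    return b0 + b1 * x

def describe_residual(x, y_observed):
    """Return the residual (observed - predicted) and what its sign means."""
    residual = y_observed - predict(x)
    if residual > 0:
        label = "underestimate"   # prediction was below the observed y
    elif residual < 0:
        label = "overestimate"    # prediction was above the observed y
    else:
        label = "exact"           # prediction matched the observed y
    return residual, label

print(describe_residual(2, 4))  # (1.0, 'underestimate')
print(describe_residual(1, 2))  # (-0.5, 'overestimate')
print(describe_residual(4, 4))  # (0.0, 'exact')
```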

b1

Slope of the LSRL

b0

y-intercept of the LSRL, the predicted y when x=0

Interpretation of the slope of the LSRL

For every one-unit increase in ___(x)___, the predicted ___(y)___ increases/decreases on average by ___(b1)___ units

Coefficient of determination

The proportion of the variation in the values of y that is explained by the LSRL

r²

Coefficient of determination

r² measures...

"how good the LSRL is at predicting y"

Interpretation of coefficient of determination

(r² × 100)% of the variation in ___(y)___ is accounted for by the LSRL
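
One way to see r² as "proportion of variation explained" is to compare the squared residuals from the LSRL with the total squared variation of y about its mean. This is a minimal sketch with made-up data; the variable names are assumptions for illustration.

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y

# Least-squares line and correlation
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
r = sxy / math.sqrt(sxx * syy)

# Variation left over after using the line (sum of squared residuals)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Proportion of the variation in y explained by the LSRL
r_squared = 1 - sse / syy

print(round(r_squared, 2))  # 0.6, so about 60% of the variation in y
print(round(r ** 2, 2))     # 0.6, the same value as squaring r
```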

r² ≥ ?

0 (therefore, never negative; r² = 0 only when r = 0)

r is negative or positive?

It can be either

The sign of r matches the sign of...

b1 (slope of the LSRL)
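
The sign match follows from the standard relationship b1 = r · (s_y / s_x), since the standard deviations are never negative. A minimal sketch with made-up decreasing data (values and names are assumptions for illustration):

```python
import math

xs = [1, 2, 3, 4]
ys = [8, 6, 5, 1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

r = sxy / math.sqrt(sxx * syy)  # correlation
b1 = sxy / sxx                  # slope of the LSRL

# s_y / s_x simplifies to sqrt(syy / sxx) because the (n - 1) terms cancel
slope_from_r = r * math.sqrt(syy / sxx)

print(r < 0 and b1 < 0)                # True: both negative
print(math.isclose(b1, slope_from_r))  # True: b1 = r * s_y / s_x
```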

Residual Plot

When asked if a linear model is an appropriate model for the data, you MUST examine the ...

Residual Plot characteristic: Curved patterns

The LSRL will not be the best fit

-Not linear, so a line won't be the best choice
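
A curved residual pattern can be produced on purpose by fitting a line to data that is not linear. In this sketch the made-up data follows y = x² exactly (data and names are assumptions for illustration).

```python
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # 1, 4, 9, 16, 25: clearly curved

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

# Residual = observed y - predicted y
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals run positive, negative, negative, negative, positive: a U shape rather than random scatter, which signals that a line is not an appropriate model.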

Residual Plot characteristic: Idealized patterns

Random, uniform scatter of points above and below the residual = 0 line, with no pattern (a linear model is appropriate)

Residual Plot characteristic: Varying spread

The spread of the residuals changes as x increases, so predictions will be more accurate for x-values where the residuals are tightly clustered and less accurate where they are spread out

Outlier

An observation that lies outside the overall pattern of the other observations.

Influential

An observation that, if removed, would drastically change the result of some calculation (such as r, r², or the slope and y-intercept of the LSRL)
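
A minimal sketch of an influential point, using made-up data (the points and the helper name `slope` are assumptions for illustration): the LSRL slope is computed with and without one observation that is far from the others in the x direction.

```python
def slope(xs, ys):
    """Slope b1 of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Five points on the line y = x, plus one extreme point at (10, 0)
xs = [1, 2, 3, 4, 5, 10]
ys = [1, 2, 3, 4, 5, 0]

slope_with = slope(xs, ys)               # extreme point included
slope_without = slope(xs[:-1], ys[:-1])  # extreme point removed

print(slope_with < 0)            # True: the line tilts downward
print(round(slope_without, 2))   # 1.0
```

Removing the single point flips the slope from negative to positive, which is what makes it influential.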

R-squared adjusted

NEVER USE

Axes

When making a graph (including a scatterplot), NEVER forget to LABEL your...

Form, direction, and strength (IN CONTEXT)

"Describing the scatterplot" means to discuss the...