# Stat 212 - Ch. 3, Scatterplots and Correlation

### 21 terms by RheaWong


- Explanatory and response variables.
- Displaying relationships: scatterplots.
- Interpreting scatterplots.
- Adding categorical variables to scatterplots.
- Measuring linear association: correlation.
- Facts about correlation.

### Scatterplot

A graph used to display the relationship between two quantitative variables measured on the same individuals.

### Response Variable (Dependent Variable)

Measures or records an outcome of a study.

### Explanatory Variable (Independent Variable)

May explain or influence changes in a response variable.

### Form

A way of describing a scatterplot relationship.
Examples: linear, curved, clustered, or no pattern.

### Direction

A way of describing a scatterplot relationship.
Examples: positive, negative, or no direction.

### When examining a graph, examine...

Look for the overall pattern and for striking deviations.

### Strength

A way of describing a scatterplot relationship.
How closely the points follow the form; ranges from weak to strong.

### Positive Association

High values of one variable tend to occur together with high values of the other variable.

### Negative Association

High values of one variable tend to occur together with low values of the other variable.

### Relationships between categorical data

Cannot be shown in a scatterplot or measured by correlation; both require two quantitative variables.

### Scatter

The variation of the data points around the overall pattern of the graph. Used to judge strength: less scatter means a stronger relationship.

### Adding categorical data to scatterplots

Use points with different shapes/colors.

### Outliers

A point that falls outside the general pattern (the line "drawn" through the scatterplot). A point that lies on the line is not an outlier.

### Correlation Coefficient (definition)

A measure of the direction and strength of a linear relationship between two quantitative variables; calculated using the means and standard deviations of both the x and y variables. Written as "r".

### Correlation Coefficient (equation)

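$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$

- $n$ = number of individuals
- $\bar{x}$ = mean of the x variable (conventionally the explanatory, or independent, variable)
- $\bar{y}$ = mean of the y variable (conventionally the response, or dependent, variable)
- $s_x$ = standard deviation of the x variable
- $s_y$ = standard deviation of the y variable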
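
The formula can be checked by hand on a small data set. Below is a minimal Python sketch that follows it term by term; the function name `pearson_r` and the data values are made up for illustration and are not from the course material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation r computed directly from the standardized-product formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Sum the products of the standardized values, then divide by n - 1.
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

Points lying exactly on a rising line give r = 1, and points exactly on a falling line give r = -1.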

### Correlation makes no distinction between explanatory and response variables.

It doesn't matter what's called what in calculating the correlation.

### r does not change when we change the units of measurement of x, y, or both

In calculating correlation, each observation is standardized; standardized values have no units.

### The correlation r is always a number between -1 and 1.

A negative r indicates a negative association and a positive r a positive association; r = 0 indicates no linear relationship. Values near -1 or 1 indicate a strong linear relationship.

### Correlation only works for linear relationships

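Correlation measures only the strength of a linear relationship. It does not describe curved relationships, no matter how strong they are.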

### Correlation and resistance

Correlation is not resistant: it can be strongly affected by outliers.

### Correlation from averaged data

Correlation calculated from averaged data is typically much stronger than correlation calculated from raw data points because averaging reduces some scatter.

Example: