STATS CH3
Terms in this set (19)
Association
Term used to describe the relationship between two variables when they are categorical
Correlation (r)
Term used to describe the linear relationship between two variables when they are numerical
Contingency Table
- displays two categorical variables
- rows = categories of one variable
- columns = categories of the other
- entries in table = frequencies
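A contingency table can be built by counting how often each pair of categories occurs. The sketch below uses only the Python standard library; the observations and the helper name `contingency_table` are made up for illustration.

```python
from collections import Counter

# Hypothetical observations: (food type, pesticide status) pairs, invented for this sketch.
observations = [
    ("organic", "present"),
    ("organic", "absent"),
    ("organic", "absent"),
    ("conventional", "present"),
    ("conventional", "present"),
    ("conventional", "absent"),
]

def contingency_table(pairs):
    """Return nested dict: rows = categories of the first variable,
    columns = categories of the second, entries = frequencies."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = contingency_table(observations)
for row, cells in table.items():
    print(row, cells)
```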
Response variable
Dependent variable
Explanatory variable
Independent variable
- when categorical, it defines the groups to be compared w/ respect to values on the response variable
- when numerical, it defines the different numerical values across which the values of the response variable are compared
Scatterplot
- horizontal axis: explanatory, x
- vertical axis: response, y
How to examine a scatterplot
1. Trend: linear, curved, clusters, no pattern
2. Direction: positive, negative, no direction
3. Strength: how closely the points fit the trend
r - value (correlation)
- always between -1 and 1; a value close to 1 or -1 = strong linear association
- value close to 0 = weak linear association
- Pearson's formula measures strength and direction
- not resistant to outliers
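As a rough illustration of Pearson's formula and of why r is not resistant to outliers, the sketch below computes r for a small made-up data set, then again after adding one outlying point. The data and the function name `pearson_r` are assumptions for this example only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data lying exactly on a line with positive slope.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_clean = pearson_r(xs, ys)

# The same data plus a single outlier at (6, 0).
r_outlier = pearson_r(xs + [6], ys + [0])

print(r_clean, r_outlier)
```

With these numbers, one extra point is enough to pull r from a perfect positive correlation down to a weak one.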
Regression line
- predicts the value of y as a straight-line function of x
- formula: predicted value of y = y-intercept + slope × x
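The card does not say how the intercept and slope are chosen. A minimal sketch, assuming the usual least-squares fit, is shown below; the data and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit (an assumption; the card does not name the method).
    Returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(x, intercept, slope):
    """Predicted value of y = y-intercept + slope * x."""
    return intercept + slope * x

# Made-up data that lie exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
intercept, slope = fit_line(xs, ys)
print(intercept, slope, predict(5, intercept, slope))
```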
y-intercept
- predicted value of y when x = 0
- may not have any interpretive value if no observations had x values near 0
Slope
- measures the change in the predicted value of y for a 1-unit increase in x
- doesn't tell whether the association is strong or weak
- the two variables must be identified as response and explanatory variables
Residuals
- measure the size of the prediction errors: the vertical distance between each point and the regression line
- each observation has a residual
- calculation: residual = y - predicted y
- the smaller the absolute value of a residual, the closer the predicted value is to the actual value, so the better the prediction
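The residual calculation can be sketched as follows. The line (intercept 1, slope 2) and the observations are made up for illustration.

```python
# Hypothetical regression line: predicted y = 1 + 2x.
intercept, slope = 1.0, 2.0

# Made-up observations (x, y).
points = [(1, 3.5), (2, 4.0), (3, 7.0)]

# Residual = observed y minus predicted y, one per observation.
residuals = [y - (intercept + slope * x) for x, y in points]

for (x, y), res in zip(points, residuals):
    print(f"x={x}, y={y}, residual={res}")
```

A positive residual means the point lies above the line, a negative residual means it lies below, and a residual of 0 means the prediction was exact.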
Correlation does not imply
causation
Extrapolation
Using a regression line to predict y-values for x-values outside the observed range of the data
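One way to guard against accidental extrapolation is to flag any prediction whose x-value falls outside the observed range. This is only a sketch; the function name `predict_with_range_check` and all numbers are hypothetical.

```python
def predict_with_range_check(x, intercept, slope, observed_xs):
    """Return (predicted y, is_extrapolation).
    is_extrapolation is True when x lies outside the observed x range."""
    lowest, highest = min(observed_xs), max(observed_xs)
    is_extrapolation = x < lowest or x > highest
    return intercept + slope * x, is_extrapolation

# Made-up line and observed x-values.
observed_xs = [1, 2, 3, 4]
inside = predict_with_range_check(2.5, 1.0, 2.0, observed_xs)
outside = predict_with_range_check(10, 1.0, 2.0, observed_xs)
print(inside, outside)
```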
Regression outlier
Observation that lies far away from the trend that the rest of the data follows
Proportional reduction in error (r^2)
- proportion of the variation in the y-values that is accounted for by the linear relationship of y with x
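The "reduction in error" can be made concrete by comparing two sums of squared errors: one from predicting every y with the mean of y, and one from predicting with the regression line. The sketch below assumes a least-squares line and uses made-up data; the function name `r_squared` is hypothetical.

```python
def r_squared(xs, ys):
    """Proportional reduction in error: (SST - SSE) / SST,
    assuming a least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Error when every y is predicted by the mean of y.
    sst = sum((y - mean_y) ** 2 for y in ys)
    # Error when y is predicted by the regression line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return (sst - sse) / sst

# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(round(r2, 3))
```

For these numbers the line accounts for about 60% of the variation in y, which matches the square of the correlation r for the same data.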
Lurking variable
Usually unobserved variable that influences the association between the variables of primary interest
Confounding
When two explanatory variables are both associated with a response variable but are also associated with each other
Simpson's Paradox
When the direction of an association between two variables changes after we include a third variable and analyze the data at separate levels of that third variable
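A small numeric sketch of Simpson's Paradox follows. The counts are invented purely for illustration: treatment A has the higher success rate within each severity group, yet treatment B has the higher rate once the groups are combined, because A was mostly given to severe cases.

```python
# Made-up (successes, total) counts per severity group and treatment.
data = {
    "mild":   {"A": (9, 10),   "B": (80, 100)},
    "severe": {"A": (30, 100), "B": (2, 10)},
}

# Success rate within each level of the third variable (severity).
by_group = {
    group: {t: s / n for t, (s, n) in treatments.items()}
    for group, treatments in data.items()
}

# Success rate after ignoring severity and pooling the counts.
overall = {}
for t in ("A", "B"):
    successes = sum(data[g][t][0] for g in data)
    total = sum(data[g][t][1] for g in data)
    overall[t] = successes / total

print(by_group)
print(overall)
```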