AP Statistics Chapter 2 - Two Variable Data
When variables have no relationship or association, we say they are _____________
independent
Choose marginal or conditional:
78% of people like video games
marginal
Choose marginal or conditional:
84% of boys like video games
conditional
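A minimal sketch of the marginal vs. conditional distinction, assuming a made-up two-way table of counts chosen so the percentages match the two cards above. The table and the function names are hypothetical, not from the course materials.

```python
# Hypothetical two-way table of counts (invented numbers for illustration).
table = {
    "boys":  {"likes": 42, "dislikes": 8},
    "girls": {"likes": 36, "dislikes": 14},
}

def marginal_percent(table, outcome):
    """Percent of ALL individuals (every group combined) with the outcome."""
    total = sum(sum(row.values()) for row in table.values())
    count = sum(row[outcome] for row in table.values())
    return 100 * count / total

def conditional_percent(table, group, outcome):
    """Percent with the outcome among the members of ONE group only."""
    row = table[group]
    return 100 * row[outcome] / sum(row.values())

overall_likes = marginal_percent(table, "likes")          # uses the grand total
boys_likes = conditional_percent(table, "boys", "likes")  # uses the boys row total
print(overall_likes, boys_likes)
```

The marginal percent divides by the grand total, while the conditional percent divides by the total for one row (the condition).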
Best way to represent the relationship of categorical data
Two Way Table
If there is an association between variables, we say they are ______________________________________
Not independent
Graphical way to display information in a two way table
Segmented bar graph
Variable that may explain or influence changes in the other variable; independent variable
Explanatory Variable
Variable of interest that is measured or observed; dependent variable
Response Variable
Best way to display a relationship between two quantitative variables
Scatterplot
Four items to describe a scatterplot
Direction
Form
Strength
Relationship in context
When the explanatory variable increases, the response variable increases
Positive direction
When the explanatory variable increases, the response variable decreases
Negative direction
When the explanatory variable increases, the response variable does not change
Constant (horizontal) direction
Form of a scatterplot that shows the explanatory variable and the response variable changing at a constant rate
Linear
Form of a scatterplot that shows the response variable increasing or decreasing at a faster and faster rate as the explanatory variable increases
Curve
Form of scatterplot that shows a rise and then a fall
Parabolic
Form of scatterplot that looks like a swarm of bees
no form or pattern
When points on a scatterplot appear to follow a very clear pattern, we say the relationship is ___________
strong
When points on a scatterplot do not adhere to a specific pattern, we say the relationship is __________
weak
correlation coefficient
A measure of the strength of a linear relationship between two quantitative variables
variable used for the correlation coefficient
r
Does it make a difference which variable is defined as the response or explanatory variable in a correlation?
no
Variables must meet this requirement to be considered to have a correlation
Must be quantitative
What two values is r always between?
-1 and 1
Strong negative relationship r value
close to -1
Strong positive relationship r value
close to +1
Weak relationship r value
close to 0
Is correlation affected by outliers?
Yes, correlation is strongly affected by outliers.
Can correlation describe various forms of data?
No, only linear relationships
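The correlation cards above can be checked with a short sketch. The data sets are invented for illustration and the function name is hypothetical; the formula is the standard one for r written with sums of deviations.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r for two quantitative variables."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented, nearly linear data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r_xy = correlation(xs, ys)   # strong positive: close to +1
r_yx = correlation(ys, xs)   # swapping the two variables gives the same r

# One unusual point added at (6, 0) drags r toward 0.
r_with_outlier = correlation(xs + [6], ys + [0.0])

# A perfect parabola is a strong relationship, but it is not linear, so r is 0.
r_parabola = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
```

The parabola case is why r should only be used to describe linear relationships: a value near 0 does not mean there is no relationship at all.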
Formula for LSRL
y hat = a + bx
Name for the line of best fit
Least Squares Regression Line
the variable for the predicted value
y hat
variable for the y intercept in the LSRL
a (sometimes expressed as b sub zero)
variable for the slope in the LSRL
b (sometimes expressed as b sub 1)
Two main things that cannot be done with the LSRL equation
1) You can't work backwards and predict an x value from a given y value
2) You should not extrapolate the line beyond the range of x values in the data set
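As a sketch of how y hat = a + bx is used for prediction, and of the warning against extrapolation, the hypothetical helper below refuses to predict outside the observed x range. The coefficients and the range are assumed values for illustration only.

```python
def predict(a, b, x, x_min, x_max):
    """Return y hat = a + b*x, but only for x inside the observed range."""
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"x = {x} is outside the observed range [{x_min}, {x_max}]; "
            "predicting here would be extrapolation"
        )
    return a + b * x

# Assumed LSRL coefficients and observed x range (illustrative values).
a, b = 2.2, 0.6
x_min, x_max = 1, 5

y_hat = predict(a, b, 3, x_min, x_max)  # fine: 3 is inside [1, 5]
```

Raising an error is just one design choice; a warning would also work. The point is that the line is only trusted over the x values that were actually observed.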
the difference between an observed value and the predicted value
residual
used to determine how far away from a predicted value an observed value is
residual
what does the LSRL minimize?
the sum of the squared residuals
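A minimal sketch of residuals and of what "least squares" means, using invented data. The function names are hypothetical. The fitted line is compared against a few nudged lines to show that its sum of squared residuals is the smallest.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least squares regression line y hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of (observed - predicted)^2 for the line y hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Invented data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_lsrl(xs, ys)

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

best = sum_squared_residuals(a, b, xs, ys)

# Any other line (here: intercept and/or slope nudged) does worse.
nudged = [
    sum_squared_residuals(a + da, b + db, xs, ys)
    for da in (-0.5, 0, 0.5)
    for db in (-0.2, 0, 0.2)
    if (da, db) != (0, 0)
]
```

A positive residual means the observed value is above the line; a negative residual means it is below.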
list the four ways to get the LSRL
1) Given
2) Calculator
3) Using the data and formulas
4) Computer output
Formula for the slope of the LSRL
b = correlation * standard deviation of y / standard deviation of x
Formula for the y intercept of the LSRL
a = y bar - slope * x bar
Based on the slope of the LSRL, an increase of 1 standard deviation in x corresponds to a predicted change of ________________________ in y
r standard deviations
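The slope and intercept formulas above can be applied directly to summary statistics. This is a sketch with invented data and a hypothetical function name; it also checks that a 1 standard deviation increase in x changes the predicted y by r standard deviations of y.

```python
import statistics

def lsrl_from_summary(r, mean_x, mean_y, s_x, s_y):
    """Return (a, b) using b = r * s_y / s_x and a = y bar - b * x bar."""
    b = r * s_y / s_x
    a = mean_y - b * mean_x
    return a, b

# Invented data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)  # sample standard deviations

# Correlation from the same summary quantities.
r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

a, b = lsrl_from_summary(r, mean_x, mean_y, s_x, s_y)

# Moving x up by one standard deviation changes y hat by b * s_x, which equals r * s_y.
change_in_y_hat = b * s_x
```

Because a = y bar - b * x bar, the line always passes through the point (x bar, y bar).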
variable for the coefficient of determination
r squared
percentage of the variation in y that is explained by the linear relationship with x
r squared
percentage of the variation in y that is accounted for by the LSRL
r squared
the sum of the residuals for an LSRL
zero
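A short sketch tying together the r squared cards and the sum of residuals card, using invented data. It computes r squared as 1 - (sum of squared residuals) / (total variation in y), which is one common way to express "variation accounted for by the LSRL", and checks that it matches the square of r.

```python
import math

# Invented data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y

# Least squares line y hat = a + b*x.
b = sxy / sxx
a = mean_y - b * mean_x

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sse = sum(e ** 2 for e in residuals)       # variation left over after the line

r_squared = 1 - sse / syy                  # proportion of variation accounted for
r = sxy / math.sqrt(sxx * syy)

residual_sum = sum(residuals)              # zero, up to floating-point rounding
```

In floating-point arithmetic the residual sum may come out as a tiny number such as 1e-16 rather than exactly zero.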
what type of residual plot will show that a linear model is a good fit?
one with no pattern
two ways to know that a linear model is appropriate
1) the scatter plot shows a roughly linear pattern
2) the residual plot shows no pattern
two ways to tell if a linear model is a reliable source for predictions
1) r squared is a high percentage
2) the standard deviation of the residuals is small relative to the data
value that tells us how far off we typically are when making predictions
s, the standard deviation of the residuals
what is the difference between a residual and the standard deviation of the residuals?
a residual deals with one piece of data whereas the standard deviation considers them all and assesses the typical residual
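To contrast a single residual with s, here is a minimal sketch using invented data. It assumes the usual textbook formula s = sqrt(sum of squared residuals / (n - 2)); the function name is hypothetical.

```python
import math

def residuals_and_s(xs, ys):
    """Return (list of residuals, standard deviation of the residuals s)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Divide by n - 2 because two quantities (slope, intercept) were estimated.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return residuals, s

# Invented data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

residuals, s = residuals_and_s(xs, ys)
one_residual = residuals[2]   # how far off the line is for ONE point (x = 3)
```

Each entry of `residuals` describes one data point, while `s` summarizes the typical size of all of them in the units of y.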
outliers in the x direction on a scatter plot
influential points
what does an influential point do to an LSRL?
pulls the line toward it
does an outlier in the y direction affect the slope of the LSRL?
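usually not much; if its x value is near the mean of x, it mainly shifts the line up or down (changing the y intercept) rather than changing the slope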
does an outlier in the x direction affect the slope of the LSRL?
yes, it will pull the line toward it
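A sketch comparing the two kinds of outliers, using invented data and a hypothetical helper. The y-direction outlier is deliberately placed at the mean of x, which is the case where the slope is left unchanged; a y outlier far from the mean of x could still move the slope somewhat.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least squares regression line y hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Invented data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a0, b0 = fit_lsrl(xs, ys)

# Outlier in the y direction, located at the mean of x (x = 3).
a_y, b_y = fit_lsrl(xs + [3], ys + [15])

# Outlier in the x direction (x = 15 is far from the other x values).
a_x, b_x = fit_lsrl(xs + [15], ys + [2])

print("original slope:", b0)
print("slope with y-direction outlier:", b_y)   # same slope, higher intercept
print("slope with x-direction outlier:", b_x)   # slope pulled toward the new point
```

With these numbers the x-direction outlier is influential enough to flip the slope from positive to negative, while the y-direction outlier only raises the intercept.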
what four values are obtained from a computer output table?
1) slope
2) y intercept
3) standard deviation of the residuals
4) coefficient of determination
how do you obtain the correlation coefficient from computer output?
take the square root of the coefficient of determination, then make it positive or negative to match the sign of the slope
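A minimal sketch of recovering r from computer output. The function name is hypothetical, and it assumes r squared is given as a proportion between 0 and 1 (divide by 100 first if the output reports a percentage).

```python
import math

def correlation_from_output(r_squared, slope):
    """Square root of r squared, with the same sign as the LSRL slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Illustrative values, not taken from any real output table.
r_negative = correlation_from_output(0.64, -1.5)  # negative slope gives r near -0.8
r_positive = correlation_from_output(0.64, 2.0)   # positive slope gives r near +0.8
```

The sign has to come from the slope (or the scatterplot) because squaring r discards it.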
