statistics exam 2
response variable
measures an outcome of a study
explanatory variable
explains or influences changes in a response variable
scatterplot
shows the relationship between two quantitative variables measured on the same individuals
- values of one appear on the horizontal axis
- values of the other appear on the vertical axis
- each individual appears as the point in the plot fixed by the values of both variables for that individual
positive association
Above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.
negative association
Above-average values of one variable tend to accompany below-average values of the other, and vice versa.
when examining the relationship between two variables, the variables must be measured from ...
the same cases
correlation
measures the direction and strength of the linear relationship between two quantitative variables
- usually written as r
strength
the magnitude of r
direction
the sign of r (positive or negative)
the correlation r ranges from
-1 to 1
when correlation is 0 ...
the slope of the least-squares regression line will be 0
correlation uses what unit
none
- it is unitless
transformation
- log transformation can change a curved relationship into a more linear relationship
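A minimal sketch of the idea, using made-up data in which y grows exponentially with x (the values and names below are hypothetical, chosen only for illustration). On a straight line, y changes by the same amount for every unit step in x, so comparing the steps before and after taking logs shows the curve straightening out:

```python
import math

# Hypothetical data: y = 2 * 3**x, so a scatterplot of y against x is curved.
x = [0, 1, 2, 3, 4]
y = [2 * 3 ** xi for xi in x]

# Log-transform the response variable.
log_y = [math.log(yi) for yi in y]

# Change in the response for each unit step in x.
steps_raw = [y[i + 1] - y[i] for i in range(len(y) - 1)]
steps_log = [log_y[i + 1] - log_y[i] for i in range(len(log_y) - 1)]

print(steps_raw)  # steps keep growing: curved relationship
print(steps_log)  # steps are (nearly) constant: linear relationship
```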
straight lines have what equation if
- y is a response variable ( on the y axis )
- x is an explanatory variable ( on the x axis )
y= b0 + b1x
- b1 = slope
- b0 = intercept
-- the value of y when x = 0
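least-squares regression line
the straight line fitted to data on an explanatory variable (x) and a response variable (y) for the same individuals that makes the sum of the squared vertical distances from the data points to the line as small as possible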
equation of least-squares regression line
y hat = b0 + b1x
x = the value of explanatory variable
y hat = predicted value of response variable
b0 = intercept
b1 = slope
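The card gives the form of the line but not how b0 and b1 are obtained. The sketch below uses the standard least-squares formulas (slope = sum of cross-deviations divided by sum of squared x-deviations, intercept = y bar - b1 * x bar); the function name `least_squares` and the data are hypothetical, for illustration only.

```python
import statistics

def least_squares(x, y):
    """Return (b0, b1) for the line y_hat = b0 + b1 * x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    # Sum of products of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Sum of squared deviations of x from its mean.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
y_hat_at_4 = b0 + b1 * 4  # predicted response at x = 4
print(b0, b1, y_hat_at_4)
```

For these points the slope comes out near 0.6 and the intercept near 2.2.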
r^2
the fraction of the variation in the values of y that is explained by the least squares regression of y on x
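One way to see this definition at work is to compute the total variation in y and the part the line leaves unexplained. This is a sketch on hypothetical data; the values b0 = 2.2 and b1 = 0.6 are the least-squares coefficients for these particular points.

```python
# Hypothetical data and its least-squares line.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

y_bar = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]

# Total variation in y around its mean.
total_ss = sum((yi - y_bar) ** 2 for yi in y)
# Variation left over after using the line (sum of squared residuals).
residual_ss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# Fraction of the variation in y explained by the regression of y on x.
r_squared = 1 - residual_ss / total_ss
print(round(r_squared, 3))
```

Here about 60% of the variation in y is explained by the line, which matches the square of the correlation for the same data.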
extrapolation
Use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line.
residuals
the difference between an observed value of the response variable and the value predicted by the regression line
equation for residuals
y - y hat
if the regression line is a good fit for the data then ...
no obvious pattern should be shown in the residual plot
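As a small illustration of the residual equation, the sketch below computes y - y hat for each point of a hypothetical data set (b0 = 2.2 and b1 = 0.6 are the least-squares coefficients for these points). For a least-squares line the residuals always add up to zero, apart from rounding, so the thing to look for in a residual plot is a pattern, not a nonzero average.

```python
# Hypothetical data and its least-squares line.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

# Residual = observed y minus predicted y.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

for xi, res in zip(x, residuals):
    print(xi, round(res, 2))

# Residuals from a least-squares line sum to zero (up to rounding error).
print(abs(sum(residuals)) < 1e-9)
```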
outlier
An observation that lies outside the overall pattern of the other observations.
- outliers in the y direction of a scatterplot have large regression residuals
- outliers in the x direction are often influential: removing them can noticeably change the regression line
Simpson's paradox
an association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group
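A numerical sketch can make the reversal concrete. The counts below are illustrative only (two treatments, A and B, compared within two groups of patients); the names are assumptions made for this example.

```python
# Illustrative counts: (successes, patients) for each treatment within each group.
groups = {
    "group 1": {"A": (81, 87), "B": (234, 270)},
    "group 2": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    return successes / total

# Within each group, treatment A has the higher success rate.
for name, g in groups.items():
    print(name, round(rate(*g["A"]), 3), round(rate(*g["B"]), 3))

# Combine the groups into a single table.
combined = {
    t: (sum(g[t][0] for g in groups.values()), sum(g[t][1] for g in groups.values()))
    for t in ("A", "B")
}

# After combining, treatment B has the higher success rate.
print("combined", round(rate(*combined["A"]), 3), round(rate(*combined["B"]), 3))
```

The reversal happens because treatment A was used mostly in the group with the lower success rates, so group membership acts as a lurking variable.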
confounding
two variables are confounded when their effects on a response variable cannot be distinguished from each other
- the confounded variables may be either explanatory variables or lurking variables
lurking variable
a variable that is not among the explanatory or response variables in a study but that may influence the response variable
common response
an observed association between the variables x and y is explained by a lurking variable z that influences both
two-way table
when both variables are categorical, the raw data are summarized in a two-way table
raw data
The original data as it was collected.
marginal distribution
when we examine the distribution of a single variable in a two-way table
-- we are looking at a marginal distribution
conditional distribution
When we condition on the value of one variable and calculate the distribution of the other variable
-- we obtain a conditional distribution
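The two kinds of distribution can be sketched from a small two-way table. The table below is hypothetical (row variable: group A or B; column variable: pass or fail), and the variable names are assumptions made for this example.

```python
# Hypothetical two-way table of counts: rows are groups, columns are outcomes.
table = {
    "A": {"pass": 30, "fail": 10},
    "B": {"pass": 20, "fail": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of the row variable: row totals over the grand total.
marginal_group = {g: sum(row.values()) / grand_total for g, row in table.items()}

# Marginal distribution of the column variable: column totals over the grand total.
outcomes = ["pass", "fail"]
marginal_outcome = {
    o: sum(row[o] for row in table.values()) / grand_total for o in outcomes
}

# Conditional distribution of outcome given group: each row divided by its own total.
conditional = {
    g: {o: row[o] / sum(row.values()) for o in outcomes} for g, row in table.items()
}

print(marginal_group)
print(marginal_outcome)
print(conditional)
```

The conditional distributions differ between the two rows here, which is what an association between the two variables looks like in a two-way table.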
if two variables are associated, knowing the value of one should...
let you predict the value of the other
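strong positive scatterplot
[image: points fall close to a straight line that rises from left to right]
strong negative scatterplot
[image: points fall close to a straight line that falls from left to right]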
equation for r
\[ r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right) \]
- (x_i - \bar{x}) / s_x: standardize x
- (y_i - \bar{y}) / s_y: standardize y
- ( ) * ( ): multiply those standardized values
- \sum: add up the products for all the individuals
- (n - 1): divide the sum by n - 1
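The calculation of r can be sketched step by step in code: standardize, multiply, sum, divide. The data are hypothetical, and `statistics.stdev` is used because it is the sample standard deviation (it divides by n - 1).

```python
import statistics

# Hypothetical data measured on the same five individuals.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)  # sample standard deviations

# Step 1: standardize x and y.
z_x = [(xi - x_bar) / s_x for xi in x]
z_y = [(yi - y_bar) / s_y for yi in y]

# Step 2: multiply the standardized values for each individual.
products = [zx * zy for zx, zy in zip(z_x, z_y)]

# Steps 3 and 4: sum over all individuals, then divide by n - 1.
r = sum(products) / (n - 1)
print(round(r, 4))
```

Because both variables are standardized first, r has no units and does not change if x or y is measured on a different scale.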
intercept represents
the predicted value of y when x = 0 (the starting point)
