Ch. 5 AP Statistics (Summarizing Bivariate Data)
About this set
Created by:
JuliusTembe on September 30, 2011
23 terms
Terms | Definitions |
|---|---|
Pearson Correlation Coefficient | The most widely used measure of correlation. Indicates the magnitude and direction of the linear relationship between two variables on a scale from -1 to 1. It uses each variable's mean and standard deviation to convert the original scores into z-scores (the number of standard deviations from the mean) and is computed as $r = \frac{\sum z_x z_y}{n-1}$. |
Negative Relationship | A relationship in which the values of one variable increase as the values of the other variable decrease (an inverse relationship). |
Correlation coefficient | A numerical assessment of the strength of the relationship between the x and y values in a set of (x, y) pairs; a statistical index of the relationship between two things (from -1 to +1). |
Positive relationship | A relationship in which increases in the values of the first variable are accompanied by increases in the values of the second variable. |
Population Correlation Coefficient | An analogous measure of how strongly x and y are related in the entire population of pairs from which the sample was obtained. Represented by ρ. Parallels the sample correlation r. |
Regression analysis | A mathematical approach for fitting an equation to a set of data to make quantitative predictions of one variable from the values of another. Think Sir Francis Galton. |
Least Squares Line | Also known as the sample regression line or line of best fit. It is the line that minimizes the sum of the squared vertical distances from the actual points to the line. Written ŷ = a + bx, where ŷ is the predicted value of y. |
Sum of squared deviations | The most widely used criterion for measuring the goodness of fit of a line y = a + bx to bivariate data $(x_1, y_1), \ldots, (x_n, y_n)$: the sum of squared deviations about the line, $\sum_{i=1}^{n} [y_i - (a + b x_i)]^2$. |
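Danger of extrapolation | Extrapolation is calculating the value of a function outside the range of known values, especially with a regression line. The danger is that the fitted relationship may not hold outside the range of x values in the data. |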
Sample Regression Line | The estimate of the true regression line; it gives the "best fit" to the sample data and is obtained by the method of least squares. This name for the least-squares line is frequently used because of the relationship between the least-squares line and Pearson's correlation coefficient. |
Sir Francis Galton | English scientist (cousin of Charles Darwin) who explored many fields: heredity, meteorology, statistics, psychology, and anthropology. Believed in the inheritance of mental ability and coined the terms "eugenics" and "nature versus nurture". Brought about the use of questionnaire data analysis, co-relational data, and psychometrics. |
Residuals | The differences between the observed values of the response variable and the values predicted by the regression line (observed minus predicted, y − ŷ). Can be positive or negative. |
Residual Plot | Scatterplot of the (x, residual) pairs. Isolated points or a pattern of points in the residual plot indicate potential problems. Helps assess the appropriateness of the regression line; ideally the plot shows NO curvature. |
Influential Observation | An observation that substantially alters the values of the slope and y-intercept in the regression equation when it is included in the computations. Does NOT have to be the observation with the largest residual. |
Coefficient of Determination | Denoted by $r^2$. Gives the proportion of variation in y that can be attributed to an approximate linear relationship between x and y; $100r^2$ is the corresponding percentage. Larger when the residuals are small. $r^2 = 1 - \frac{\text{SSResid}}{\text{SSTo}}$ |
Total sum of squares | Denoted by SSTo. Defined as the sum of the squared deviations of each y observation from the mean of y: $\text{SSTo} = \sum (y - \bar{y})^2$. For the least-squares line it is never smaller than SSResid. |
Residual sum of squares | Also known as the error sum of squares; denoted by SSResid. It is the sum of the squared residuals, $\text{SSResid} = \sum (y - \hat{y})^2$, and measures the variation in y that cannot be attributed to an approximate linear relationship (unexplained variation). For the least-squares line it is never larger than SSTo. |
Standard Deviation about the least-squares line | The size of a "typical" deviation from the least-squares line. $s_e = \sqrt{\frac{\text{SSResid}}{n-2}}$ |
Polynomial Regression | A variation of multiple regression that describes curvilinear relationships. Uses $R^2 = 1 - \frac{\text{SSResid}}{\text{SSTo}}$, where SSResid is the sum of the squared differences between the observed y values and the values predicted by the fitted polynomial. |
Transformation | Also called a re-expression. A method that uses a function of a variable in place of the variable itself. May involve taking square roots, logarithms, or reciprocals of x and relating them to the y-value. |
Power Transformation | A transformation in which a power (exponent) is chosen, and then each original value is raised to that power to obtain the corresponding transformed value. Do NOT pick 0 as the exponent, as that would make every value 1; an exponent of 1 leaves every value unchanged, so it is NOT a transformation either. |
Logistic Regression | Special form of regression in which the dependent variable is a nonmetric, dichotomous (binary) variable. Although some differences exist, the general manner of interpretation is quite similar to linear regression. |
Logistic Regression equation | Describes the relationship between the probability of success and a numerical predictor variable: $p = \frac{e^{a+bx}}{1+e^{a+bx}}$, where a and b are constants. The graph of this equation is an S-shaped curve. The further b is from 0, the steeper the curve. Taking $\ln\left(\frac{p}{1-p}\right)$ gives $a + bx$, which is linear in x. |
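
The formulas in the terms above (r, the least-squares line, residuals, SSResid, SSTo, the coefficient of determination, and the standard deviation about the line) can be tied together in one small calculation. The following is a minimal Python sketch, not part of the original set. The function name `summarize_bivariate` and the five data points are made up for illustration, and the slope is computed with the standard identity b = r·s_y/s_x, which the set alludes to but does not state.

```python
import math

def summarize_bivariate(xs, ys):
    """Compute the summary quantities defined in the table for (x, y) pairs.

    Hypothetical helper for illustration; assumes at least 3 pairs and
    that neither variable is constant.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))

    # Pearson's r: sum of products of z-scores, divided by n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)

    # Least-squares line y-hat = a + b*x, using b = r * s_y / s_x
    # (assumed identity, not stated in the set) and a = y_bar - b * x_bar.
    b = r * s_y / s_x
    a = y_bar - b * x_bar

    # Residuals: observed y minus predicted y.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

    ss_resid = sum(e ** 2 for e in residuals)      # unexplained variation
    ss_to = sum((y - y_bar) ** 2 for y in ys)      # total variation
    r_squared = 1 - ss_resid / ss_to               # coefficient of determination
    s_e = math.sqrt(ss_resid / (n - 2))            # std. deviation about the line

    return {
        "r": r, "a": a, "b": b, "residuals": residuals,
        "ss_resid": ss_resid, "ss_to": ss_to,
        "r_squared": r_squared, "s_e": s_e,
    }

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
summary = summarize_bivariate(xs, ys)
print(summary)
```

For simple linear regression, the value returned under `r_squared` should match the square of `r`, and the residuals should sum to approximately zero, which gives two quick checks on any hand calculation.
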








