| Term | Definition |
| scatter plot | a graph of data points |
| line of best fit | approximates the trend in data |
| model | sometimes a line or an equation used to represent data |
| Stroop Test | correlates a person's perception of words and colors for a list |
| matching list | the color of ink matches the color of the word |
| non-matching list | the color of ink does not match the color of the word |
| median | the middle value of an ordered set of data |
| median-median line | a method for calculating the line of best fit using the medians of three groups of data points |
| least squares method | a method of calculating the line of best fit by minimizing the sum of the squared vertical distances between each point and the line |
| Pearson product-moment correlation coefficient | a measure of how well the regression equation fits the data |
| r | the correlation coefficient, which ranges from -1 to +1 |
| regression equation | the equation found to represent a set of data |
| causation | when one event causes a second event |
| necessary condition | a correlation needed for causation |
| sufficient condition | correlation alone is not a sufficient condition for causation: a correlation does not show causation |
| quadratic regression | used to model quadratic data |
| If we use knowledge of a person's SAT scores to predict his or her GPA, what is the predictor and what is the criterion? | SAT is the predictor and GPA is the criterion |
| How do we translate S2y'? | The sample variance of the Y scores around Y'. |
| When r=0.0, the Y-intercept is equal to? | the mean of all the Y scores in the sample |
| If we can claim to account for .65 of the variance in Y scores by knowing a relationship, it means that? | We are, on average, 65% more accurate at predicting Y scores than we would be if we did not know the relationship. |
| In general, the greater the proportion of variance accounted for... | the more accurately we can predict the behaviour |
| If heteroscedasticity is present, Sy' will be? | greater than the actual average error in predictions of Y for some X scores and less than the actual average error for other X scores |
| The regression line can be thought of as a series of points representing? | all the possible Y' values associated with all possible X scores |
| Standard error of the estimate is defined as? | Average spread of actual Y scores around the predicted Y scores |
| Linear regression is defined as the procedure for determining? | the best-fitting straight line in a linear relationship |
| When we square the correlation coefficient to produce r2, the result is equal to the? | proportion of variance accounted for |
| The Y-intercept of a line is the? | value of Y at the point where the regression line crosses the Y axis |
| Suppose you have several different predictor variables and one criterion variable. All your variables are measured using interval or ratio scales. What is the appropriate statistical test to use? | Multiple regression |
| The absence of random assignment in any study allows for what? | potential confounding |
| The absolute value of a correlation coefficient indicates the? | strength of the relationship |
| We should always draw a scatterplot of the data when we compute a correlation because the scatterplot allows us to? | see the nature of the relationship between the two variables |
| The best-fitting straight line through a scatterplot is known as the? | regression line |
| When your scale correlates with other procedures or scales that are valid, it has __________ validity? | Convergent |
| When your scale does not correlate with other unrelated procedures or scales, it has __________ validity? | Discriminant |
| When the relationship between two variables is high (for example, r=.98), the variability in the Ys at each X is ____________ relative to the overall variability of Y scores in the sample. | smaller |
| In general, a positive linear relationship means that? | as the values of one variable increase, there is a tendency for the values of the other variable to also increase. |
| Suppose you find a restriction of range in your study of IQ scores and school achievement. Restricting the range is likely to _____ the correlation coefficient. | decrease the size of |
| The consistency of participants' responses to the same test at two different times determines? | test-retest reliability |
| The consistency of participants' responses on different versions of the same test determines? | split-half reliability |
| If we plot a scatterplot and the data points appear to be random dots that are as far as possible from forming a slanted straight line, the correlation for the data is? | 0.0: there is no relationship |
| The defining formula for the Pearson correlation coefficient shows that it is the? | average correspondence of paired X and Y z-scores |
| Predictive validity | Extent to which a procedure is correlated with future behaviour |
| Concurrent validity | Extent to which a procedure is correlated with present behaviour |
| What procedure would be used to find out whether there is a relationship between SAT scores and GPA? | The Pearson correlation coefficient |
| The best-fitting line through a scatterplot is known as the? | regression line. |
| In general, a positive relationship means that? | As one variable increases, the other variable also increases |
| We should always draw a scatterplot of the data when we compute a correlation because it allows us to see? | the nature of the relationship between the two variables |
| r2 | coefficient of determination |
| Linear regression is defined as? | the procedure for determining the best-fitting straight line in a linear relationship |
| In the formula Y' = (b)(X) + a, what does Y' stand for? | predicted Y score |
| In the formula Y' = (b)(X) + a, what does the "a" stand for? | the Y-intercept: the value of Y' at the point where the line crosses the Y axis (where X = 0) |
| Define the Standard error of the estimate | the average spread of Y scores around predicted Y scores |
| What value of "r" would yield the smallest Sy' (standard error of the estimate)? | the "r" with the largest absolute value (closest to +1 or -1) |
| As the variability--differences--in Y scores at each X become larger, the relationship does what? | becomes weaker and results in a smaller correlation coefficient |
| Zero association means that? | No linear relationship is present |
| The larger the correlation coefficient (whether positive or negative), the stronger the relationship. Why? | The less the Ys are spread out at each X, the closer the data come to forming a straight line |
| What is another word for the degree of efficiency in a relationship? | Coefficient, although it DOES NOT directly measure units of consistency |
| Define the purpose of computing a correlation coefficient. | Statistical technique for demonstrating the reliability and the validity of a measurement procedure in any experiment or correlational design. |
| What are the types of reliability that a correlation coefficient is used to show? | test-retest, inter-rater, split-half |
| inter-rater reliability | the consistency of ratings by any two raters |
| test-retest reliability | Extent to which participants receive the same score when tested at different times |
| How high does a coefficient have to be in order to be considered reliable? | +.80 or higher |
| Face validity | Procedure is valid because it looks valid/Extent to which a measurement procedure appears to measure what it was intended to measure |
| Convergent Validity | Extent to which scores obtained from one procedure are positively correlated with scores obtained from another procedure that is already accepted |
| Discriminant validity | Extent to which scores obtained from one procedure are not correlated with scores from another procedure that measures OTHER variables or constructs. |
| Criterion validity | Extent to which a procedure correlates with a behavior. |
| Concurrent validity | Extent to which a procedure correlates with an individual's current behavior |
| Predictive validity | Extent to which a procedure correlates with an individual's future behavior |
| What is the range of a coefficient? | -1.0 to +1.0 (an absolute value from 0 to 1.0) |
| What is the most commonly used correlation coefficient? | Pearson correlation coefficient |
| Define the Pearson correlation coefficient | Correlation coefficient that describes the strength and type of a linear relationship between interval and ratio variables, symbolized by r. |
| Define the Spearman rank-order coefficient | The correlation coefficient that describes the linear relationship between pairs of ranked scores (ex: any two ordinal variables OR tied rank variables), symbolized by rs |
| Tied rank variables | Occur when two participants receive the same rank in Spearman's rank coefficient; resolved by averaging the tied ranks and assigning that average to both participants before correlating their scores. |
| Point biserial correlation coefficient | Describes the linear relationship between the scores from one continuous variable and one dichotomous variable (ex: correlating male/female with interval scores from a personality test). Can be used for one continuous interval or ratio variable and one dichotomous variable; symbol is rpb. |
| How does a restricted range affect a correlation coefficient? | Reduces the accuracy, producing a smaller coefficient than if the range were not restricted, and leads to an underestimate of the degree of association between the two variables. Avoiding this increases power. |
| Why is the correlation coefficient important? | It is one number that allows us to envision and summarize the important information in a scatterplot, in terms of its strength and direction. |
| What does a horizontal scatterplot with a horizontal regression line indicate? | no relationship |
| The smaller the absolute value of the coefficient, the greater the? | variability of the Ys at each X and the vertical width of the scatterplot, and the less accurately Y scores can be predicted from X |
| How can the power of a correlational design be increased? | Minimizing error variance and avoiding a restricted range, so that the largest possible coefficient is obtained. |
| If it passes through the proper inferential procedure, a sample correlation coefficient is used to estimate what? | the corresponding population correlation coefficient: r estimates ρ, rs estimates ρs, and rpb estimates ρpb. |
| Define linear regression | The statistical procedure for using a relationship to predict scores. It produces the line that summarizes the linear relationship. |
| How is Y' pronounced? | Y prime |
| What does the symbol Y' stand for? | A predicted Y score: our best prediction of the Y score at a corresponding X |
| Define regression line | Straight line that summarizes the linear relationship in a scatterplot by, on average, passing through the center of the Y scores at each X; it consists of the predicted Y score (the Y') for every possible X |
| Why is "r" computed first? | To determine if a relationship exists. If r=0, there is no linear relationship |
| What is the importance of linear regression? | It is used to predict an individual's unknown Y score based on his/her X score from a correlated variable. Usually provides more external validity and a more accurate description of the relationship. |
| Linear regression equation [(b)(x) + a] | Equation that creates the straight line by producing a value of Y' at each X; it defines the line that summarizes the relationship and describes its slope and Y intercept. |
| Linear regression equation to calculate regression line points for scatterplot | Y'=[(b)(x) + a] |
| Y intercept equation | a = (mean of Y) - (b)(mean of X) |
| Slope equation | b, the slope of the regression line; one equivalent form is b = r(Sy/Sx) |
| coefficient of determination | r2 |
| SEE (Sy') stands for? | Standard error of the estimate, which summarizes the typical difference between predicted Y' scores and actual Y scores |
| How do you calculate proportion of variance accounted for? | r2 which is also known as "coefficient of determination" |
| When r=0, the standard error of the estimate is at its maximum, and that is equal to? | the standard deviation of all Y scores in the sample (Sy) |
| Stronger correlations produce what size SEE? | smaller SEE |
| What does r2, aka the coefficient of determination, aka the proportion of variance accounted for, indicate? | How important the relationship is, by comparing the amount of error obtained using the regression equation for XY to the error obtained without the regression equation for XY |
| What does S2y' refer to? | The error variance when regression is used to predict Y scores; it measures the error in prediction. |
| Sy' | Standard error of the estimate |
| Sy' definitional formula/average error | Subtract Y' from Y and square each deviation, sum the squared deviations, divide by N, then find the square root of that to get the standard error of the estimate |
| proportion of variance accounted for | The amount by which we reduce errors in predicting Y scores when we use the relationship, compared to if we did not. Equals r2 |
| a= | y-intercept |
| Y intercept | value of Y at the point where the regression line crosses the Y axis |
| Y' is the predicted Y score for what? | the corresponding X |
| The differences (and error) between Y and Y' are also summarized by what? | the variance of the Y scores around Y' (S2y') |
| If r is large (in absolute value), is the relationship weak or strong? | Strong: the larger the r, the stronger the relationship and the smaller the values of Sy' and S2y', because the Y scores are closer to Y' and the differences between Y and Y' are smaller |
| When r=0, what do Sy' and S2y' equal? | Sy' equals Sy and S2y' equals S2y |
| When r = +/- 1, how much error is there in predictions? | Zero error, and Sy' equals zero. |
| another term for r | the correlation coefficient |
| Proportion of variance accounted for indicates what? | The importance of a relationship |
| heteroscedasticity | An unequal spread of Y scores around the regression line (that is around the values of Y') |
| Homoscedasticity | An equal spread of Y scores around the regression line (that is, around the values of Y') |
| Symbol for Pearson correlation coefficient | r |
| Coefficient of alienation | 1- r2 |
| Sy' | standard error of the estimate symbol |
| Sx | sample standard deviation symbol |
| S2x | sample variance symbol |
| σx | population standard deviation symbol |
| rs | Spearman correlation coefficient symbol |
| rpb | point-biserial correlation coefficient sign |
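
Several of the cards above describe formulas in words: r as the average correspondence of paired z-scores, the Y intercept as a = (mean of Y) - (b)(mean of X), the standard error of the estimate as the square root of the average squared difference between Y and Y', and r2 as the proportion of variance accounted for. The Python sketch below ties these together on a small made-up data set. The function names and the data are hypothetical, chosen only for illustration, and the slope form b = r(Sy/Sx) is an assumption, since the cards do not spell out the slope equation. All standard deviations divide by N, following the definitional formula on the cards.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std_dev(values):
    """Standard deviation computed by dividing by N (descriptive form)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pearson_r(xs, ys):
    """r as the average product of paired X and Y z-scores."""
    mx, my = mean(xs), mean(ys)
    sx, sy = std_dev(xs), std_dev(ys)
    return mean([((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)])


def regression_line(xs, ys):
    """Return (b, a) for Y' = (b)(X) + a."""
    # Assumed slope form (not stated on the cards): b = r * (Sy / Sx)
    b = pearson_r(xs, ys) * std_dev(ys) / std_dev(xs)
    # Intercept as on the cards: a = mean of Y - (b)(mean of X)
    a = mean(ys) - b * mean(xs)
    return b, a


def standard_error_of_estimate(xs, ys):
    """Sy': square root of the average squared difference between Y and Y'."""
    b, a = regression_line(xs, ys)
    squared_errors = [(y - (b * x + a)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared_errors) / len(ys))


# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
b, a = regression_line(xs, ys)
see = standard_error_of_estimate(xs, ys)
sy = std_dev(ys)

print(f"r = {r:.4f}, r2 = {r ** 2:.4f}")
print(f"Y' = ({b:.4f})(X) + {a:.4f}")
print(f"Sy' = {see:.4f}, Sy = {sy:.4f}")
# Proportion of variance accounted for: 1 - (S2y' / S2y), which equals r2
print(f"1 - S2y'/S2y = {1 - see ** 2 / sy ** 2:.4f}")
```

With these definitions, Sy' shrinks toward zero as r approaches +/- 1 and rises to Sy when r = 0, which matches the cards on the largest and smallest possible standard error of the estimate.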