least squares method
a method of calculating the line of best fit by minimizing the sum of the squared vertical distances between each point and the line
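As a minimal sketch of the least squares idea, the Python below fits a line to a small made-up data set using the standard closed-form slope and intercept, then checks that nudging the slope increases the sum of squared vertical distances. The function names and the data are illustrative assumptions, not part of the original cards.

```python
def least_squares_line(xs, ys):
    """Return (slope b, intercept a) of the least squares line Y' = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return b, a


def sum_squared_errors(xs, ys, b, a):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))


# Made-up example data (an assumption for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = least_squares_line(xs, ys)
best_sse = sum_squared_errors(xs, ys, b, a)
# Any other line, such as one with a slightly steeper slope, has a larger error.
worse_sse = sum_squared_errors(xs, ys, b + 0.1, a)
```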
Pearson product-moment correlation coefficient
a measure of the strength and direction of the linear relationship between two variables, and so of how well a straight regression line fits the data
If we use knowledge of a person's SAT score to predict his or her GPA, what is the predictor and what is the criterion?
SAT is the predictor and GPA is the criterion
If we can claim to account for .65 of the variance in Y scores by knowing a relationship, it means that?
We are, on average, 65% more accurate at predicting Y scores than we would be if we did not know the relationship.
In general, the greater the proportion of variance accounted for...
the more accurately we can predict the behaviour
If heteroscedasticity is present, Sy' will be?
greater than the actual average error in predictions of Y for some X scores and less than the actual average error for other X scores
The regression line can be thought of as a series of points representing?
all the possible Y' values associated with all possible X scores
Standard error of the estimate is defined as?
Average spread of actual Y scores around the predicted Y scores
Linear regression is defined as the procedure for determining?
the best-fitting straight line in a linear relationship
When we square the correlation coefficient to produce r², the result is equal to the?
proportion of variance accounted for
The Y-intercept of a line is the?
value of Y at the point where the regression line crosses the Y axis
Suppose you have several different predictor variables and one criterion variable. All your variables are measured using interval or ratio scales. What is the appropriate statistical test to use?
multiple regression
We should always draw a scatterplot of the data when we compute a correlation because the scatterplot allows us to?
see the nature of the relationship between the two variables
When your scale correlates with other procedures or scales that are valid, it has ________ validity?
convergent
When your scale does not correlate with other unrelated procedures or scales, it has ________ validity?
discriminant
When the relationship between two variables is high (for example, r = .98), the variability in the Ys at each X is ________ relative to the overall variability of Y scores in the sample.
small
In general, a positive linear relationship means that?
as the values of one variable increase, there is a tendency for the values of the other variable to also increase.
Suppose you find a restriction of range in your study of IQ scores and school achievement. Restricting the range is likely to _____ the correlation coefficient.
decrease the size of
The consistency of participants' responses to the same test at two different times determines?
test-retest reliability
The consistency of participants' responses on different versions of the same test determines?
alternate-forms reliability
If we plot a scatterplot and the data points form a random scatter of dots that is as far as possible from forming a slanted straight line, the correlation for the data is?
0.0: there is no relationship
The defining formula for the Pearson correlation coefficient shows that it is the?
average correspondence of paired X and Y z-scores
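The "average correspondence of paired z-scores" can be sketched directly: convert each X and Y to a z-score, multiply the pairs, and average the products. This sketch assumes the standard deviation is computed by dividing by N, and the function names and data are made up for illustration.

```python
import math


def z_scores(values):
    """Convert raw scores to z-scores using the standard deviation that divides by N."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_r(xs, ys):
    """Pearson r as the mean of the products of paired z-scores."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# Made-up example data (an assumption for illustration only).
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# A perfectly linear positive relationship gives r = 1.
r_perfect = pearson_r([1, 2, 3], [2, 4, 6])
```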
What procedure would be used to find out whether there is a relationship between SAT scores and GPA?
The Pearson correlation coefficient
In general, a positive relationship means that?
As one variable increases the other variable also increases
We should always draw a scatterplot of the data when we compute a correlation because it allows us to see?
the nature of the relationship between the two variables
As the variability (the differences) in Y scores at each X becomes larger, the relationship does what?
becomes weaker and results in a smaller correlation coefficient
The larger the correlation coefficient (whether positive or negative), the stronger the relationship. Why?
Because the less the Ys are spread out at each X, the closer the data come to forming a straight line
What is another word for the degree of efficiency in a relationship?
coefficient, although it does not directly measure units of consistency
Define the purpose of computing a correlation coefficient.
Statistical technique for demonstrating the reliability and the validity of a measurement procedure in any experiment or correlational design.
What are the types of reliability that a correlation coefficient is used to show?
test-retest, inter-rater, split-half
Test-retest reliability
A test has this when participants receive the same score when tested at different times
Face validity
Extent to which a measurement procedure appears to measure what it was intended to measure; the procedure is considered valid because it looks valid
Convergent validity
Extent to which scores obtained from one procedure are positively correlated with scores obtained from another procedure that is already accepted
Discriminant validity
Extent to which scores obtained from one procedure are not correlated with scores from another procedure that measures other variables or constructs
Define the Pearson correlation coefficient
Correlation coefficient that describes the strength and type of a linear relationship between interval or ratio variables, symbolized by r.
Define the Spearman Rank order coefficient
The correlation coefficient that describes the linear relationship between pairs of ranked scores (ex: any two ordinal variables, including variables with tied ranks), symbolized by Rs
Tied rank variables
occurs when two participants receive the same rank on a variable used in Spearman's rank-order coefficient; resolved by assigning both participants the mean of the ranks they would otherwise occupy before correlating the scores.
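One common way to resolve tied ranks can be sketched as follows: sort the scores, find each run of equal scores, and give every member of the run the mean of the rank positions the run occupies. The helper name `average_ranks` is hypothetical, and the sketch assumes rank 1 goes to the lowest score.

```python
def average_ranks(scores):
    """Rank scores from 1 (lowest) upward, giving tied scores their mean rank."""
    # Indices of the scores in ascending order of score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of scores tied with position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        # Positions i..j (0-based) correspond to ranks i+1..j+1; take their mean.
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Two participants tie for ranks 2 and 3, so both receive 2.5.
tied_ranks = average_ranks([10, 20, 20, 30])
```

Spearman's coefficient can then be computed by applying the Pearson formula to the two sets of ranks.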
Point biserial correlation coefficient
Describes the linear relationship between the scores from one continuous (interval or ratio) variable and one dichotomous variable (ex: correlating male/female with interval scores from a personality test). Symbol is Rpb.
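A point-biserial coefficient is numerically the same as a Pearson r computed with the dichotomous variable coded 0 and 1. The sketch below checks this on made-up data, comparing the Pearson calculation with the usual mean-difference formula. It assumes the standard deviation divides by N, and all names and values are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(groups, scores):
    """Mean-difference form: (M1 - M0) / S_Y * sqrt(p * q), groups coded 0/1."""
    n = len(scores)
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / n          # proportion of cases in group 1
    q = 1 - p                  # proportion of cases in group 0
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * math.sqrt(p * q)


# Made-up data: group membership coded 0/1 and a continuous test score.
groups = [0, 0, 0, 1, 1, 1]
scores = [2, 3, 4, 6, 7, 8]

r_from_pearson = pearson_r(groups, scores)
r_from_formula = point_biserial(groups, scores)
```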
How does a restricted range affect a correlation coefficient?
It reduces the accuracy of the coefficient, producing a smaller coefficient than if the range were not restricted, and so leads to an underestimate of the degree of association between the two variables. Avoiding a restricted range increases power.
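The effect of a restricted range can be illustrated with a small made-up data set: the correlation is computed once over the full range of X and once over only the middle X scores. The data and the cut-off values are assumptions chosen for illustration; real data will not always shrink by the same amount.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: Y follows X with a fixed up/down wobble of one point.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r_full = pearson_r(xs, ys)

# Restrict the range: keep only participants with X between 4 and 7.
kept = [(x, y) for x, y in zip(xs, ys) if 4 <= x <= 7]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])
```

In this example the restricted coefficient is smaller than the full-range coefficient because the same wobble in Y is large relative to the narrower spread of X.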
Why is the correlation coefficient important?
It is one number that allows us to envision and summarize the important information in a scatterplot, in terms of its strength and direction.
The smaller the absolute value of the coefficient, the greater the?
variability of the Ys at each X, the vertical width of the scatterplot, and the less accurately Y scores can be predicted from X
How can the power of a correlational design be increased?
Minimizing error variance and avoiding a restricted range, so that the largest possible coefficient is obtained.
If it passes through the proper inferential procedure, a sample correlation coefficient is used to estimate what?
the corresponding population correlation coefficient: r estimates ρ, Rs estimates ρs, and Rpb estimates ρpb.
Define linear regression
The statistical procedure for using a relationship to predict scores. It produces the straight line that summarizes the linear relationship.
What does the symbol Y' stand for?
a predicted Y score. Our best prediction of the Y score at a corresponding X
Define regression line
straight line that summarizes the linear relationship in a scatterplot by, on average, passing through the center of the Y scores at each X; it consists of the predicted Y score (the Y') for every possible X
What is the importance of linear regression?
It is used to predict an individual's unknown Y score based on his or her X score from a correlated variable. It usually gives more external validity and a more accurate description of the relationship.
Linear regression equation: Y' = bX + a
equation that creates the straight line by producing a value of Y' at each X; it defines the line that summarizes the relationship by describing its slope (b) and Y-intercept (a).
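A minimal sketch of using the regression equation Y' = bX + a to predict a score. The slope and intercept values here are hypothetical numbers chosen for illustration, not estimates from real SAT or GPA data.

```python
def predict(x, b, a):
    """Return the predicted score Y' for a given X using Y' = bX + a."""
    return b * x + a


# Hypothetical slope and intercept (assumptions for illustration only).
b = 0.002   # predicted change in GPA for each additional SAT point
a = 0.5     # Y-intercept: the predicted GPA when X is 0

predicted_gpa = predict(1200, b, a)

# At X = 0 the prediction is exactly the Y-intercept.
intercept_check = predict(0, b, a)
```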
SEE (Sy') is an acronym for
Standard error of the estimate, which is a kind of average difference between the predicted Y' scores and the actual Y scores
How do you calculate the proportion of variance accounted for?
Square the correlation coefficient to get r², which is also known as the "coefficient of determination"
When r = 0, the standard error of the estimate is at its maximum, and that is equal to?
the standard deviation of all Y scores in the sample (Sy)
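The r = 0 case can be checked numerically. In the made-up data below the cross-products sum to zero, so the least squares slope is 0, every prediction equals the mean of Y, and the standard error of the estimate equals the standard deviation of Y. The data and variable names are assumptions for illustration, and both statistics divide by N.

```python
import math

# Made-up data in which X and Y have a correlation of exactly 0.
xs = [1, 2, 3, 4, 5]
ys = [1, 3, 2, 3, 1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
b = sxy / sxx
a = mean_y - b * mean_x

# Standard error of the estimate: spread of Y around the predicted Y'.
predicted = [b * x + a for x in xs]
se_estimate = math.sqrt(sum((y - yp) ** 2 for y, yp in zip(ys, predicted)) / n)

# Standard deviation of Y: spread of Y around its own mean.
sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
```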
What does r² (aka the coefficient of determination, aka the proportion of variance accounted for) indicate?
How important the relationship is, by comparing the amount of error obtained when using the regression equation to predict Y from X with the error obtained without the regression equation
What does S²y' refer to?
The error variance when using regression to predict Y scores; it measures error in prediction.
Sy' definitional formula (average error)
Subtract Y' from Y, square each deviation, sum the squared deviations, divide by N, then find the square root of that to get the standard error of the estimate
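The definitional formula can be sketched step by step. The data are made up, and the predictions come from the line Y' = 0.6X + 2.2, which is the least squares line for this particular data set. The function name is an illustrative assumption.

```python
import math


def standard_error_of_estimate(actual, predicted):
    """Square root of the mean squared difference between Y and Y' (dividing by N)."""
    squared_deviations = [(y - yp) ** 2 for y, yp in zip(actual, predicted)]
    return math.sqrt(sum(squared_deviations) / len(actual))


# Made-up example data (an assumption for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Predicted Y' at each X from the line Y' = 0.6X + 2.2.
predicted = [0.6 * x + 2.2 for x in xs]

se_estimate = standard_error_of_estimate(ys, predicted)
```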
proportion of variance accounted for
the proportion by which we reduce errors in predicting Y scores when we use the relationship, compared to if we did not. Equals r²
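The link between error reduction and r² can be checked on a small made-up data set: the variance of Y around its mean is the error without the relationship, the variance of Y around Y' is the error with it, and the proportional reduction matches the squared correlation. Variable names and data are assumptions for illustration, and all variances divide by N.

```python
import math

# Made-up example data (an assumption for illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Least squares line and the Pearson correlation.
b = sxy / sxx
a = mean_y - b * mean_x
r = sxy / math.sqrt(sxx * syy)

# Error when ignoring X (predict the mean of Y for everyone).
variance_y = syy / n
# Error when using the regression line to predict Y from X.
error_variance = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys)) / n

proportion_accounted_for = (variance_y - error_variance) / variance_y
```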
The differences (the errors) between Y and Y' are also summarized by what?
the variance of the Y scores around Y' (S²y')
If there is a large r, is the relationship weak or strong?
Strong, with small values of Sy' and S²y', because the Y scores are closer to Y' and so the differences between Y and Y' are smaller
Heteroscedasticity
An unequal spread of Y scores around the regression line (that is, around the values of Y')