Research Methods & Stats III Midterm
Terms in this set (40)
regression equation
y = b0 + b1 * X1 + Ey
prediction equation
ŷ = b0 + b1 * X1
deviations (deviances)
individual score - mean
sum of squares
SS, sum of each score's squared deviation from the mean
sum of products
SP
the sum of the products of 2 deviations
covariance
How/the extent to which two variables vary together
Unstandardized, raw, measures of a linear relationship - thus range is unrestricted
There is no upper or lower limit - it's infinite - because our covariance value is scale dependent (still expressed in the original units that your x and y variable are measured in - bound to the scale that makes it up)
Whereas standardization restricts the range - transforms it to be interpretable/comparable
Covariance thus maintains the information regarding the scale of the variables
But it's hard to interpret
Scale dependent
Cov = SP/df
correlation
-When we fully standardize covariance, we get ______ - this extracts information about the covariance without the original scale
-This is achieved by dividing the covariance by the product of the SDs
-Now bound by 1 and -1
-Range restricted, easier to interpret, can interpret effect size
-We thus discard all info regarding the original scale of the variables, but we have a good index of linear association
-Scale invariant
Corr = r = Cov/(sx * sy)
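The SP, SS, covariance, and correlation definitions above can be chained together numerically; a minimal sketch with made-up data:

```python
# Minimal sketch (made-up data): SP, SS, covariance, and correlation
# computed exactly as defined in the cards above.
def cov_corr(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # deviations: individual score - mean
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # sum of products
    ss_x = sum((xi - mx) ** 2 for xi in x)                   # sum of squares
    ss_y = sum((yi - my) ** 2 for yi in y)
    df = n - 1
    cov = sp / df                        # Cov = SP / df
    sx, sy = (ss_x / df) ** 0.5, (ss_y / df) ** 0.5
    r = cov / (sx * sy)                  # Corr = Cov / (sx * sy)
    return cov, r

cov, r = cov_corr([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

Note that r is scale invariant: multiplying every x by 10 would multiply Cov and sx by 10, leaving r unchanged.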
t-test
Perform this to evaluate H0 for r
If the tobs > tcrit (critical value), then we can reject the null (i.e., there is a significant relationship between the variables)
If the tobs < tcrit (critical value), then we fail to reject the null (i.e., there is not sufficient evidence that the two variables are related)
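The t statistic for testing H0: rho = 0 has a standard closed form, t = r * sqrt(n - 2) / sqrt(1 - r^2) with df = n - 2; a minimal sketch with a made-up r and n:

```python
import math

# Minimal sketch: t statistic for testing H0: rho = 0.
# t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2.
def t_for_r(r, n):
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Made-up values for illustration: r = 0.7746 from a sample of n = 5
t_obs = t_for_r(0.7746, 5)
# Compare |t_obs| to t_crit at df = n - 2 = 3 to decide about H0
```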
variance
s^2: technically the covariance of a variable with itself
a measure of the spread of scores
unstandardized regression coefficient
b's - "partially standardized"
Obtained by dividing the covariance by the variance (s2) of x
Regression coefficient predicting Y from X (Y is regressed on X)
Shifting from looking at correlation to just looking at prediction - "anchoring into x" as it only takes into account the variance of the predictor
b1 represents the number of units change in Y for 1 unit change in X
It also describes the slope of the line of best fit through the data points in an XY scatterplot
Can only interpret directionality (positive and negative) and slope of the line
"1 unit increase in X corresponds to b-units change in Y"
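The b1 = Cov/Var_x idea can be computed directly; a minimal sketch with made-up data (note the df terms cancel, so b1 = SP / SS_x):

```python
# Minimal sketch (made-up data): unstandardized regression coefficients.
# b1 = Cov / Var_x; since both are divided by df = n - 1, b1 = SP / SS_x.
def fit_line(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_x = sum((xi - mx) ** 2 for xi in x)
    b1 = sp / ss_x         # units change in Y per 1 unit change in X
    b0 = my - b1 * mx      # intercept: the line passes through (Mx, My)
    return b0, b1

b0, b1 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```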
standardized regression coefficient
Betas - fully standardized and thus we can interpret effect size
Takes into account BOTH the SD of x and SD of y
Expressed in standardized units: "a one SD increase in X results in an expected change of Beta SD units in Y"
For a single predictor case, Beta is equivalent to the correlation coefficient
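The conversion from b to Beta, and its equivalence to r in the single-predictor case, is just Beta = b1 * (sx / sy); a minimal sketch with made-up values:

```python
# Minimal sketch (made-up values): Beta = b1 * (sx / sy).
# With one predictor, this equals the correlation coefficient r.
b1 = 0.6                            # unstandardized slope
sx, sy = 2.5 ** 0.5, 1.5 ** 0.5     # SDs of x and y
beta = b1 * (sx / sy)               # standardized slope, in SD units
```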
total variability in Y (Y - My)
This is partialed into/made up of regression (ŷ - My) and residual (Y - ŷ)
Regression + residual = total
There are always errors in prediction (or, how our model fails to predict the observed values of Y)
Squaring each of the deviations and summing across observations yields SS for each source of variability
SSReg(ression)= sum(ŷ-MY)^2
SSResidual= sum(Y-ŷ)^2
SSTotal= sum(Y-MY)^2
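The partitioning above (SSRegression + SSResidual = SSTotal) can be verified numerically; a minimal sketch with made-up data:

```python
# Minimal sketch (made-up data): partitioning total variability in Y
# into regression (yhat - My) and residual (y - yhat) components.
def partition_ss(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_x = sum((xi - mx) ** 2 for xi in x)
    b1 = sp / ss_x
    b0 = my - b1 * mx
    yhat = [b0 + b1 * xi for xi in x]                       # predictions
    ss_reg = sum((yh - my) ** 2 for yh in yhat)             # explained
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat)) # leftover
    ss_tot = sum((yi - my) ** 2 for yi in y)                # total
    return ss_reg, ss_res, ss_tot

ss_reg, ss_res, ss_tot = partition_ss([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
# ss_reg + ss_res == ss_tot, mirroring the ANOVA-style decomposition
```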
regression
variability in Y that CAN be explained by the predictor - represents the component of Y that is "shared" with X1
residual
variability in Y that CANNOT be explained by a predictor - simply what is "leftover" after accounting for X
ANOVA
Squaring each of the deviations and summing across observations yields SS for each source of variability
SSReg(ression)= sum(ŷ-MY)^2
SSResidual= sum(Y-ŷ)^2
SSTotal= sum(Y-MY)^2
Similar to ___: if you add SS of regression and SS of residuals, you will get SS of total
______ and Regression are both applications of the General Linear Model (GLM), which is where this 'variance partitioning' idea comes from - describes sources of variability in our regression model
r null
Null says that any observed relationship between the two variables is due to chance (i.e., no real relationship in the population)
If you reject the null: the two variables are significantly related/correlated or that they significantly covary
The population correlation coefficient from which our sample was drawn is equal to 0 - null hypothesis tested by the Student's t test statistic
b1 null
This involves prediction, so the null hypothesis says that your x value does not significantly predict your y value (or that that b1 is 0 at the population level)
If you reject the null: for every 1 unit increase in x, there is a b1 unit increase/decrease in y
overall model null
if the overall model is significant, then you reject the null hypothesis
H0: the regression model explains zero variance in y at the population level
sources of variance in Y (for a two-predictor MR model)
1. variance unique to x1
2. variance unique to x2
3. covariance of x1 and x2
4. variance of error/residuals
standardized MR coefficient (interpretation)
Interpreting Betas:
For every 1 SD increase in X1, Y increases/decreases by B1 SDs, when accounting for X2
For every 1 SD increase in X2, Y increases/decreases by B2 SDs, when accounting for X1
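The "accounting for" interpretation of Betas can be checked with a small simulation; a minimal sketch with made-up simulated data, standardizing all variables so the least-squares slopes come out in SD units:

```python
import numpy as np

# Minimal sketch (made-up simulated data): a two-predictor model where
# x1 and x2 covary. Standardizing x1, x2, and y turns the least-squares
# slopes into Betas (SD change in y per SD change in a predictor,
# accounting for the other predictor).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)            # predictors overlap
y = 2.0 + 1.0 * x1 - 0.5 * x2 + rng.normal(size=n)

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

# All variables centered, so no intercept column is needed
Z = np.column_stack([zscore(x1), zscore(x2)])
betas, *_ = np.linalg.lstsq(Z, zscore(y), rcond=None)
# betas[0]: SD change in y per 1 SD increase in x1, accounting for x2
```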
collinearity
How much predictors overlap - although predictors tend to have some overlap (covary), you want to have minimal collinearity so you can then say that the predictors uniquely predict the outcome; if they have too much overlap, then it's harder to say this is true.
See intuitive collinearity/shrinkage, classical suppression, and surprising suppression
intuitive collinearity or shrinkage
When β1 < rX1,Y
This basically means that when accounting for the covariance between X1 and X2 as well as the covariance between X2 and Y, the effect of X1 shrinks (more collinearity between X1 and X2 --> less unique prediction of Y by X1)
classical suppression
When β1 ≠ 0 and rX1,Y = 0
surprising suppression
When β1 > rX1,Y
suppression
the effect of X1 increases, even when accounting for the covariance between X1 and X2 as well as the covariance between X2 and Y (less collinearity between X1 and X2 --> more unique prediction of Y on X1)
partial correlation
correlation between the error/residual variances from two models with the same predictor, but two different outcomes
They reflect the correlation between residuals of two dependent variables
semi-partial correlation
correlation between the residual variance of one DV and the variance of a second DV itself - still two models with the same predictor and two different outcomes
They reflect the correlation between the residual of one dependent variable and the raw scores of another
dummy coding
coded 0 (reference group) and 1 (coded group)
b0: b0 is the mean y/outcome score for the reference group
b1: The coded group is b1 units higher/lower than the reference group
useful for comparing groups - making a discrete dichotomous variable represented numerically
effect coding
coded -1 (reference group) and 1 (coded group)
b0: b0 is the grand mean outcome score across both the reference and coded groups.
b1: The coded group is b1 units higher/lower than the grand mean
useful for comparing groups - making a discrete dichotomous variable represented numerically
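The dummy- and effect-coding interpretations above reduce to group means when there is one dichotomous predictor; a quick numeric check with made-up group scores (assuming equal group sizes, so the grand mean equals the mean of the two group means):

```python
# Minimal sketch (made-up scores, equal group sizes): what b0 and b1
# equal under dummy coding (0/1) vs. effect coding (-1/+1).
ref = [10, 12, 14]      # reference-group outcome scores
coded = [20, 22, 24]    # coded-group outcome scores

m_ref = sum(ref) / len(ref)
m_coded = sum(coded) / len(coded)

# Dummy coding: b0 = reference-group mean, b1 = group difference
b0_dum, b1_dum = m_ref, m_coded - m_ref

# Effect coding: b0 = grand mean, b1 = coded group's distance from it
b0_eff = (m_ref + m_coded) / 2
b1_eff = m_coded - b0_eff
```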
ordinal predictor
Predictors that reflect integer 'counts'
The numbers 1, 2, 3, ... are ordered and meaningful in themselves, rather than representing a scale or score
Ex: # of concussions, prior arrests, beverages consumed, cigarettes smoked, cats owned
Continuous vs. nominal:
When range is sufficient (i.e., approximately > 6ish), treat as continuous, but consider checking for quadratic effects (next lecture)
When range is restricted (i.e., < 6ish), you can treat as either continuous or 'nominal'
Ordinal specification assumes linear relationship (unless you fit quadratic) - only requires 1 df for linear, 2 df for quadratic
Nominal specification assumes no trend - requires k - 1 df
interaction term
b3 describes the magnitude and direction of the X1×X2 interaction, which is the product of X1 and X2
"Effect of X1 depends on the level of X2" - same as ANOVA
Including X1×X2 changes interpretation of b1 and b2
-b1 is the regression coefficient specific to the value (person) X2 = 0
-For every one unit increase in X1, there will be a b1 unit change in Y, when X2 = 0 (average when centered, reference group when dummy coded)
-b2 is the regression coefficient specific to the value (person) X1 = 0
-For every one unit increase in X2, there will be a b2 unit change in Y, when X1 = 0 (average when centered, reference group when dummy coded)
These now represent simple effects
centering
This helps for improving interpretation
You are transforming the variable so that its mean = 0 (by subtracting the mean from every score), without changing its units
Sometimes, a score of zero can be nonsensical (i.e., no one really has zero stress), so it makes more sense to center it to get sensible simple effects and so a value of 0 is meaningful
Typically centered around the sample/grand mean (but you can center around the median or quartiles)
You're essentially standardizing your variable, but without rescaling by the SD
It also reduces collinearity
b3
This describes how b1 and b2 change as a function of X2 and X1
The value of b1 (numeric value) changes by b3 units for every one unit increase in X2 (if dummy coded, then say for a person in the coded group)
The value of b2 (numeric value) changes by b3 units for every one unit increase in X1 (if dummy coded, then say for a person in the coded group)
If it is positive, it makes your focal predictor more positive; and if it is negative, then it makes your focal predictor more negative. It also explains the strength of the relationship when there is a significant interaction.
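The "b1 changes by b3 units" idea is the simple-slope formula: in ŷ = b0 + b1*X1 + b2*X2 + b3*X1*X2, the slope of X1 at a given X2 is b1 + b3*X2. A minimal sketch with illustrative made-up coefficients:

```python
# Minimal sketch (made-up coefficients): the simple slope of X1
# at a chosen value of X2 is b1 + b3 * X2.
b0, b1, b2, b3 = 12.0, 8.0, -4.0, 5.0

def simple_slope_x1(x2):
    # b1 holds when X2 = 0; each unit of X2 shifts the slope by b3
    return b1 + b3 * x2

slope_at_0 = simple_slope_x1(0)   # 8.0: b1 itself
slope_at_1 = simple_slope_x1(1)   # 13.0: b1 grown by b3
```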
simple slopes analysis
After centering your moderator variable, it can be useful to understand the focal predictor's effect at high and low levels of the moderator - ONLY do if b3 is significant
Using the centered version of Stress (Stress_g) provides us with an interpretable estimate of the treatment effect (i.e., b1)
Coeff b1 is now specific to a person with 'average' Stress
However, recall that the significant b3 coefficient suggests that this treatment effect (i.e., b1) depends on a person's level of Stress!
We need a way to evaluate and illustrate how the tx effect (slope of b1) changes as a function of Stress
i.e., How does the strength or direction of the treatment effect depend on having a lower or higher level of stress?
-When stress is low, there's a positive effect of b1
-When stress is high, there's a negative effect of b1
Steps:
Manipulate the 0-point of the moderator
'Benchmark' values are ±1 SD
Consider using Q1 and Q3 if moderator is highly skewed and/or kurtotic
1. Create 'low' and 'high' versions of Stress to 'shift' 0-point:
LowX2 = X2_g + 1 SD
0 in LowX2 now represents someone who is 1 SD below the mean in X2
HighX2 = X2_g - 1 SD
0 in HighX2 now represents someone who is 1 SD above the mean in X2
2. Results in 2 simple regression equations for 'low' and 'high' levels of mod
Low (1 SD Below Mean) Mod Equation:
BAI=b0+b1*BPTdum
BAI=52.51+0.36*BPTdum
High (1 SD Above Mean) Mod Equation:
BAI=b0+b1*BPTdum
BAI=60.73+(-7.21)*BPTdum
In these above equations, the "low" slope is positive, and the "high" slope is negative (thus, b3 is likely to be negative)
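The two simple equations above can be recovered algebraically from one full model: substituting X2 = X2_shifted ∓ SD into ŷ = b0 + b1*X1 + b2*X2 + b3*X1*X2 gives intercept b0 ± b2*SD and slope b1 ± b3*SD. A minimal sketch, assuming SD = 1 so the full-model coefficients here are illustrative back-calculations from the example equations, not actual estimates:

```python
# Minimal sketch (illustrative back-calculated coefficients, SD = 1):
# re-centering the moderator and deriving the 'low'/'high' simple equations.
b0, b1, b2, b3 = 56.62, -3.425, 4.11, -3.785
sd = 1.0

def simple_eq(direction):
    """direction = -1 for 'low' (-1 SD on the moderator), +1 for 'high'."""
    intercept = b0 + b2 * sd * direction   # predicted Y at X1 = 0
    slope = b1 + b3 * sd * direction       # treatment slope at that level
    return round(intercept, 2), round(slope, 2)

low = simple_eq(-1)    # positive treatment slope when the moderator is low
high = simple_eq(+1)   # negative treatment slope when the moderator is high
```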
bivariate linear regression
How do plants predict happiness?
happiness = b0 + b1*plants + ehappiness
happiness = 5 + 3*plants + ehappiness
b0: 5 is the level of happiness someone would have if they have 0 plants.
b1: For every one unit increase in the number of plants, a person's happiness increases by 3.
Multivariate regression
How does the number of plants and number of pets predict happiness?
happiness = b0 + b1*plants + b2*pets + ehappiness
happiness = 10 + (-7)*plants + 6*pets + ehappiness
b0: 10 is the level of happiness when someone has 0 pets and 0 plants
b1: For every one unit increase in the number of plants, happiness decreases by 7 units, when accounting for the number of pets.
b2: For every one unit increase in the number of pets, happiness increases by 6 units, when accounting for the number of plants.
Multivariate with an interaction: dummy coded and continuous
How does owning plants or not and number of pets predict happiness?
happiness = b0 + b1*plants_DUM + b2*pets + b3*plants_DUMxpets + ehappiness
happiness = 12 + 8*plants_DUM + (-4)*pets + 5*plants_DUMxpets + ehappiness
b0: 12 is the level of happiness when someone does not own plants and has 0 pets
b1: A person with plants has an 8 unit higher happiness score compared to those without plants, when number of pets owned is 0
b2: For every one unit increase in the number of pets owned, happiness decreases by 4 units, for someone with no plants (plants_DUM = 0).
b3: - The value of b1 (8) changes by 5 units for every one unit increase in the number of pets owned.
- The value of b2 (-4) changes by 5 units, for a person who owns plants.
Multivariate with an interaction: dummy coding and centered
How does owning plants or not and number of pets predict happiness?
happiness = b0 + b1*plants_DUM + b2*pets_g + b3*plants_DUMxpets_g + ehappiness
happiness = 12 + 8*plants_DUM + (-4)*pets_g + 5*plants_DUMxpets_g + ehappiness
b0: 12 is the level of happiness when someone does not own plants and has an average number of pets
b1: A person with plants has an 8 unit higher happiness score compared to someone without plants, when pet ownership is average.
b2: For every one unit increase in the number of pets owned, happiness decreases by 4 units, for someone with no plants (plants_DUM = 0).
b3: - The value of b1 (8) changes by 5 units for every one unit increase in the number of pets owned.
- The value of b2 (-4) changes by 5 units, for a person who owns plants.
Bivariate with effect coding
How does owning plants or not predict happiness?
happiness = b0 + b1*plants_eff + ehappiness
happiness = 12 + 8*plants_eff + ehappiness
b0: 12 is the grand mean of happiness across both groups (with or without plants)
b1: People who own plants have a happiness score that is 8 units higher than the grand mean across all groups.
Bivariate with dummy coding
How does owning plants or not predict happiness?
happiness = b0 + b1*plants_DUM + ehappiness
happiness = 12 + 8*plants_DUM + ehappiness
b0: 12 is the mean happiness score for people without plants.
b1: People with plants have an 8 unit higher happiness score compared to those without plants.