Week Five: Quantitative Methods

What does an association claim state?

What is an association claim supported by?

What would happen if one of these variables is manipulated?
Terms in this set (89)
An association claim states that two variables are linked; it does not claim that one variable causes the other.

Association claims are supported by correlational studies, in which both variables are measured in a set of participants.

If either of the variables is manipulated, the study is an experiment, which could potentially support a causal claim.
For a scatterplot, the correlation coefficient r can be used to describe the relationship.

For a bar graph, the difference between the two groups is used to describe the relationship.

Regardless of whether the association is analysed with scatterplots or bar graphs, if both variables are measured, the study is correlational.
While it is not necessary to interrogate internal validity for an association claim, because it does not make a causal statement, it can be tempting to assume causality from a correlational study.

No, they may show covariance but do not usually satisfy temporal precedence, and cannot establish internal validity.
A bivariate correlation, or bivariate association, is an association that involves exactly two variables.

1. Positive
2. Negative
3. Zero

To investigate associations, researchers need to measure the first variable and the second variable in the same group of people.

They use graphs and simple statistics to describe the type of relationship the variables have with each other.
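The "simple statistics" here usually means the correlation coefficient r. A minimal pure-Python sketch, using invented scores (hours studied and exam marks are made up for illustration, not data from the text):

```python
from math import sqrt

def pearson_r(xs, ys):
    # Correlation coefficient r: direction (sign) and strength (magnitude)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Positive association: high scores on one variable go with high on the other
hours = [1, 2, 3, 4, 5, 6]
marks = [52, 55, 61, 64, 70, 71]
print(pearson_r(hours, marks))        # close to +1.0
# Reversing one variable flips the direction but not the strength
print(pearson_r(hours, marks[::-1]))  # close to -1.0
```

Both variables are measured, not manipulated, so this is exactly the correlational situation the cards describe.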
After recording the data, what is the next step in testing an association claim?

After recording the data, describe the relationship between the two measured variables using the correlation coefficient r.

What does a positive r mean? What are the two qualities of r? Describe what these refer to.

A positive r means that the relationship is positive: high scores on one variable go with high scores on the other (low also goes with low).
1. Direction - whether the association is positive, negative, or zero.
2. Strength - "how much"/how closely related the two variables are.

The more closely related two variables are, the closer the r will be to...? Across many areas of psychology, correlations are typically around...? Why might the association between marital satisfaction and online dating be more difficult to describe using scatterplots and the correlation coefficient r?

The more closely related two variables are, the closer the r will be to 1.0 or -1.0. Correlations are typically around r = .20, but some areas of psychology might average as high as r = .40. The dating variable is categorical; its values fall in either one category or another (online/offline).

Is a scatterplot the best representation of an association in which one of the variables is measured categorically? How is this different from a scatterplot? If one of the variables is categorical, what statistical measures are used to analyse the data?

- A bar graph may be more suitable.
- Each person is not represented by one data point.
- The graph shows the mean for each category.
- You can examine the differences between group averages.
Although researchers may occasionally use r, it is more common to estimate the magnitude of the difference between the means/group averages.

Association claims can be supported by either scatterplots or bar graphs, and by using a variety of statistics (r/difference in means). Which of these combinations makes a correlational study?
In contrast, if one of the variables is manipulated, what is the method of study?

Regardless of the graph or statistical analysis, when a method of study involves measuring both variables, the study is correlational and supports an association claim. If one of the variables is manipulated, it is an experiment, which is more appropriate for a causal claim.

With an association claim, what are the two most important validities to interrogate? Why is construct validity relevant?

- Construct validity.
- Statistical validity.
An association claim describes the relationship between two measured variables, so you must ask how well each of the two variables was measured. Once you know what kind of measure was used for each variable, you can ask questions to assess each one's construct validity:
- Does the measure have good reliability?
- Is it measuring what it is intended to measure?
- What is the evidence for its face validity, its concurrent validity, its discriminant and convergent validity?

When you ask about the statistical validity of an association claim, what are you investigating? What must you also consider?

Factors that might have affected the scatterplot, correlation coefficient r, bar graph, or difference score that led to your association claim. Must consider:
- The strength and precision of your estimate.
- Whether the study has been replicated.
- Any outliers.
- Restriction of range.
- Whether a seemingly zero association is actually curvilinear.

What is an effect size? What can effect size indicate? Is a tiny effect size ever important? Provide an example:

Effect size - the magnitude, or strength, of a relationship between two or more variables. Effect size can indicate the importance of a result. When all else is equal, a larger effect size is often considered more important than a small one. Depending on the context, even a tiny effect size can be important - when a tiny effect size is combined over many people/situations, it can have an important impact.
E.g. students randomly assigned to a growth-mindset group: effect size r = .05. The researchers judged this enough to prevent 79,000 U.S. teens from getting worse grades.

What is the third way to consider effect size?

A third way to think about effect size is to compare it to well-understood benchmarks. The average effect size in psychology studies is around r = .20, and may only rarely be as high as r = .40.

Interpret the magnitude of these effect sizes (r): .05 (or -.05), .10 (or -.10), .20 (or -.20), .30 (or -.30), .40 (or -.40).

Very small or very weak. Small or weak. Moderate. Fairly powerful effect. Unusually large in psychology - either very powerful or possibly too good to be true (based on a small sample).

What is a study's correlation coefficient? What does the correlation coefficient range between? To communicate the precision of their estimate of r, what do researchers report?

A study's correlation coefficient is a point estimate of the correlation in the population. A correlation coefficient can range between -1.0 (perfect negative) and +1.0 (perfect positive). Researchers report a 95% confidence interval (95% CI). The CI calculations ensure that 95% of CIs will contain the true population correlation/capture the true relationship in 95% of studies.

What is the 95% confidence interval? Why might the CI vary?

The 95% confidence interval is a range of values that you can be 95% confident contains the true mean of the population. Due to natural sampling variability, the sample mean (the centre of the CI) will vary from sample to sample.

Why do small samples have wider (less precise) confidence intervals?

Estimates based on small samples are unstable: if researchers added a few more adults to the existing sample, this would alter the results. The CI has to be wide to capture the uncertainty that surrounds a small sample.

What are the CIs like for large samples?
When the 95% confidence interval does not include zero, what can we infer?

Large samples have much narrower, more precise confidence intervals. In a study of 20,000, adding a few more participants is unlikely to alter the estimates. When the 95% CI does not include zero, it is common to say that the association is statistically significant. A statistically significant correlation is unlikely to have come from a population where the association is zero.

When a 95% CI includes zero, what can't we rule out? What is it common to say in this case?

We cannot rule out that the association is zero. When the 95% CI contains zero, it is common to say that the association is not statistically significant.

Name three things that provide important information about how strong the relationship might be in an association claim. What is an outlier? What can an outlier impact?

- Effect size.
- 95% CIs.
- Replication - the process of conducting a study again to test whether the result is consistent.
Outlier - a score that stands out as either much higher or much lower than most of the other scores in a sample; it has a disproportionate influence. An outlier can impact the correlation coefficient r - it can make a medium-sized correlation appear stronger (or weaker) than it really is.

Regarding a bivariate correlation, when are outliers mainly problematic? Give an example. What is the best way to detect outliers?

Mainly problematic when they involve extreme scores on both variables. E.g. in a study on the positive correlation between height and weight, an individual who is both extremely tall and extremely heavy would make r appear stronger. Examine the scatterplots and see if one or a few data points stand out.

What is restriction of range? Give an example.

Restriction of range - in a bivariate correlation, the absence of a full range of possible scores on one of the variables, so the relationship in the sample underestimates the true correlation.
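Both distortions of r described above can be demonstrated with invented numbers: a single outlier that is extreme on both variables inflates r, and cutting off part of the range deflates it. All data here are made up for illustration; `pearson_r` is a plain implementation of the correlation coefficient:

```python
from math import sqrt

def pearson_r(xs, ys):
    # Plain Pearson correlation coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sqrt(sum((x - mx) ** 2 for x in xs))
                  * sqrt(sum((y - my) ** 2 for y in ys)))

# Outlier: a weak height/weight correlation...
heights = [60, 61, 62, 63, 64]
weights = [120, 118, 123, 119, 121]
print(pearson_r(heights, weights))                 # weak
# ...looks strong once one extremely tall, extremely heavy person is added
print(pearson_r(heights + [80], weights + [250]))  # strong

# Restriction of range: SAT-like scores vs. GPA over the full range...
sat = [800, 900, 1000, 1100, 1200, 1300, 1400, 1500]
gpa = [2.0, 2.6, 2.4, 3.0, 2.8, 3.4, 3.2, 3.8]
print(pearson_r(sat, gpa))          # strong over the full range
# ...weaker when only scores of 1,200+ are available
print(pearson_r(sat[4:], gpa[4:]))  # restricted sample underestimates r
```

Plotting either data set as a scatterplot makes the culprit visible, which is why the cards recommend examining scatterplots first.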
E.g. schools only submitting students who score 1,200 or higher on their SATs for a claim - the true range is restricted.

What do researchers do when they suspect a restriction of range? Provide 3 options.

- In the SAT example, submit all students to see what grades they obtained and compute the correlation.
- Ideally, recruit more people at the ends of the spectrum.
- Use a statistical technique, correction for restriction of range, to estimate the full set of scores based on what we know about the existing, restricted set, and then recompute the correlation.

When else may there be a restriction of range? How does restriction of range make correlations appear?

Restriction of range can also apply when one of the variables has very little variance. E.g. a correlation between parental income and child school achievement, where the sample of parents is entirely middle class. Because restriction of range usually makes correlations appear smaller, we would ask about it primarily when the correlation appears weaker than expected.

What is a curvilinear association? Which straight line best fits the pattern of a curvilinear association? When a researcher suspects a curvilinear association, what might they do?

Curvilinear association - an association between two variables that is not a straight line; instead, as one variable increases, the level of the other variable increases and then decreases, or vice versa. r does not describe the pattern very well: the straight line that fits best through this set of points is flat and horizontal, with a slope of zero. They might compute the correlation between one variable and the square of the other.

What is the temptation when reading a correlational result? Why is a simple association insufficient to establish causality? Briefly describe the criteria.

When we read a correlational result, the temptation to make a causal claim can be almost irresistible. A simple association does not satisfy the three criteria for causal claims. 1.
Covariance - association between the cause variable and the effect variable. 2. Temporal precedence - the cause variable came before the effect variable. 3. Internal validity - no alternative explanations for the relationship between the two variables.

What is the maximum number of the causal criteria that association claims can satisfy?

Two. 1. Covariance - the two variables are associated. 2. Temporal precedence - specific associations may have one variable come before the other.

In association claims, a third variable may be a problem. Give an example. Does this present an internal validity problem?

Taller people have shorter hair - the potential third variable is gender (men are taller and often tend to have shorter hair). When we propose a third variable that could explain a bivariate correlation, it is not necessarily going to present an internal validity problem. We can ask the researchers whether their bivariate correlation is still present within potential subgroups.

What do you question when interrogating the external validity of an association claim?

- Whether the association can generalise to other people, places, and times.
- Who the participants are and how they were selected.
- What method was used to select the sample from the population of interest - ideally random methods.

How important is external validity in association claims? Do many association claims generalise?

A correlational study may not have used a random sample, but you should not automatically reject the association for that reason - accept the study's results and leave the question of generalisability for the next study. Many associations do generalise, even to samples that are very different from the original one. E.g. men being taller than women would most likely generalise to other countries.

What is a moderator? Give an example of a moderator.

Moderator - a variable that, depending on its level, changes the relationship between two other variables.
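The height-and-hair-length example above can be simulated with invented numbers: pooling two subgroups produces a strong negative correlation that disappears within each subgroup, the signature of an association driven entirely by a third variable (gender):

```python
from math import sqrt

def pearson_r(xs, ys):
    # Plain Pearson correlation coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sqrt(sum((x - mx) ** 2 for x in xs))
                  * sqrt(sum((y - my) ** 2 for y in ys)))

# Invented data: men taller with shorter hair, women shorter with longer hair
men_height, men_hair = [178, 180, 182, 184], [7, 5, 8, 6]
women_height, women_hair = [160, 162, 164, 166], [31, 33, 30, 32]

# Pooled across the subgroups: strong negative height/hair correlation
print(pearson_r(men_height + women_height, men_hair + women_hair))
# Within each subgroup the association vanishes (r = 0 with these numbers)
print(pearson_r(men_height, men_hair))
print(pearson_r(women_height, women_hair))
```

This is exactly the subgroup check the cards describe: if the bivariate correlation is absent within the subgroups, the original association is attributable to the third variable.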
E.g. correlation: the more a team wins, the more people attend the games. The location of the game would massively impact this correlation - in cities with high residential mobility, people move out of the city frequently and don't develop ties to their community; where people do have ties to the community, they will still attend a game even if the team loses.

Define statistically significant. What is a spurious association?

Statistically significant - in NHST, the conclusion assigned when p < .05; that is, when it is unlikely the result came from the null-hypothesis population.
Spurious association - a bivariate association that is attributable only to systematic mean differences on subgroups within the sample; the original association is not present within the subgroups.
- In such situations, the original relationship between height and hair length is referred to as a spurious association. The bivariate correlation is there, but only because of some third variable (gender).
- In statistics, a spurious correlation (or spuriousness) refers to a connection between two variables that appears to be causal but is not.

Why might scientists want to look beyond a correlational study? Researchers have developed some techniques that enable them to test for cause - what is the best of these?

Correlation is not causation. Psychological scientists want to know about causes and effects, not just correlations - to suggest treatments/interventions. Experimentation: instead of measuring both variables, researchers manipulate one variable and measure the other.

If a researcher cannot set up an experiment, what can they do instead? Describe the three techniques and how they help get closer to a causal claim.

They can use three advanced correlational techniques to get closer to making a causal claim. 1. Longitudinal designs - allow researchers to establish temporal precedence in their data. 2. Multiple-regression analysis - helps researchers rule out certain third-variable explanations. 3.
The 'pattern and parsimony' approach - the results of a variety of correlational studies all support a single, causal theory.

In the advanced correlational techniques - longitudinal designs, multiple-regression analysis, pattern and parsimony - are the variables measured or manipulated?

As in all correlational studies, the variables are measured; none are manipulated.

How are multivariate designs different from bivariate correlations? Give examples of multivariate designs. Are these multivariate designs perfect solutions to the causality conundrum?

Multivariate designs involve more than two measured variables. Examples: longitudinal designs, multiple-regression designs, the pattern and parsimony approach. They are not perfect solutions, but they are widely used tools - especially when experiments are impossible to run.

Example: there is an association between parental overpraise and narcissism. Explain how this association claim would look against the causal claim criteria.

1. Covariance - at least one study found a correlation between the variables. 2. Temporal precedence - it is unclear which came first; perhaps the children were already narcissistic. 3. Internal validity - the association may be explained by a third variable; maybe the child mimics the parent's narcissistic behaviour.

What is a multivariate design? What is a longitudinal design? Which of the three causal claim criteria can longitudinal designs provide evidence for?

Multivariate design - a study designed to test an association involving more than two measured variables.
Longitudinal design - a study in which the same variables are measured in the same people at different points in time.
Temporal precedence - measuring the same variables across different time points can determine which came first.

Using the example of narcissism and parental praise, outline a longitudinal study. Why is this a longitudinal study?
Why is this also a multivariate correlational study?

- Sample of 565 children.
- Children and parents contacted 4 times, every 6 months.
- Each time, children completed a narcissism questionnaire and parents completed a praise questionnaire.
It is longitudinal because the researchers measured the same variables in the same group of people across time. It is multivariate because 8 variables were considered: narcissism and praise at 4 time points.

All of the following are found in longitudinal designs: What is a cross-sectional correlation? What is autocorrelation? What is a cross-lag correlation?

Cross-sectional correlation - a correlation between two variables that are measured at the same time.
Autocorrelation - the correlation of one variable with itself, measured at two different times.
Cross-lag correlation - a correlation between an earlier measure of one variable and a later measure of another variable.

Why can cross-sectional correlations alone not establish temporal precedence? Which of the three correlations are researchers most interested in? Why?

Because both variables in a cross-sectional correlation are measured at the same time, either variable may lead to a change in the other. Cross-lag correlations - they show whether the earlier measure of one variable is associated with the later measure of the other variable. The cross-lag correlation addresses the directionality problem and helps establish temporal precedence.

What can we conclude when a 95% CI for a correlation does not include zero? If the cross-lag correlation supports the hypothesis, what does this mean for our narcissism and praise example?

No zero = the correlation is statistically significant.
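The cross-lag logic can be sketched with toy numbers (all scores invented, five children at two time points): the praise-at-time-1 → narcissism-at-time-2 correlation comes out strong while narcissism-at-time-1 → praise-at-time-2 is near zero, the asymmetric pattern that suggests praise comes first. The 95% CI check uses the Fisher z-transform, a standard way to build a confidence interval around r:

```python
from math import sqrt, atanh, tanh

def pearson_r(xs, ys):
    # Plain Pearson correlation coefficient
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sqrt(sum((x - mx) ** 2 for x in xs))
                  * sqrt(sum((y - my) ** 2 for y in ys)))

def r_95ci(r, n):
    # Fisher z-transform: z = atanh(r) is roughly normal
    # with standard error 1/sqrt(n - 3)
    z, se = atanh(r), 1 / sqrt(n - 3)
    return tanh(z - 1.96 * se), tanh(z + 1.96 * se)

praise_t1 = [1, 2, 3, 4, 5]
narcissism_t1 = [2, 1, 4, 3, 5]
praise_t2 = [1, 4, 5, 3, 2]
narcissism_t2 = [2, 3, 5, 6, 9]

# Cross-lag correlations: earlier measure of one variable with the
# later measure of the other
print(pearson_r(praise_t1, narcissism_t2))  # strong: praise -> later narcissism
print(pearson_r(narcissism_t1, praise_t2))  # ~0: narcissism -/-> later praise
print(r_95ci(pearson_r(praise_t1, narcissism_t2), 5))  # CI excludes zero
```

With real longitudinal data the arrays would hold hundreds of participants per time point; the comparison of the two cross-lag correlations is the same.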
Because the "praise-to-narcissism" correlations are significant and the "narcissism-to-praise" correlations are not, parental praise came before child narcissism.

Describe what it means if a cross-lag correlation is mutually reinforcing.

If both correlations were different from zero, each variable caused the other at different time points = mutually reinforcing.

Do longitudinal designs address the three criteria for causation? How may longitudinal studies attempt to address internal validity?

1. Covariance - when two variables are correlated and their 95% CIs do not contain zero, there is covariance. 2. Temporal precedence - the cross-lag correlations can determine which variable comes first. 3. Internal validity - longitudinal studies may NOT rule out a third variable. Researchers sometimes design their studies to address a third variable, e.g. creating gender subcategories to account for it.

As it is the only certain way to confirm causal claims, why not just do an experiment? Regarding our example of narcissism and praise, should researchers rely solely on correlational data?

Practical: in some cases people cannot be randomly assigned to a causal variable of interest, e.g. personality traits such as narcissism. Ethical: it would be unethical to assign people, especially children, to a particular group - especially over a long period. No; if possible, they should combine correlational data with ethical experiments over a short period of time.

What is multiple regression/multivariate regression? Put simply, what is the intention of multiple regression? Which of the causal criteria does it address?

Multiple regression - a statistical technique that computes the relationship between a predictor variable and a criterion variable, controlling for other predictor variables. Put simply, multiple regression helps rule out third variables, thereby addressing some internal validity concerns.

Example: consuming sexual TV content = higher risk of pregnancy. Third possible variable - age.
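With exactly two predictors, the standardised betas can be written directly in terms of the three bivariate correlations, which makes "controlling for" concrete. The correlations below are invented for illustration (they are not the study's actual numbers):

```python
# Invented correlations among criterion (pregnancy) and predictors (TV, age)
r_tv_preg = 0.40   # sexy TV content with pregnancy risk
r_age_preg = 0.50  # age with pregnancy risk
r_tv_age = 0.30    # the two predictors with each other

def two_predictor_betas(r_y1, r_y2, r_12):
    # Standardised regression coefficients for criterion y on predictors
    # 1 and 2: each beta is that predictor's association with y,
    # holding the other predictor constant statistically
    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)
    return beta1, beta2

beta_tv, beta_age = two_predictor_betas(r_tv_preg, r_age_preg, r_tv_age)
print(beta_tv)   # still positive: TV predicts pregnancy, controlling for age
print(beta_age)  # age predicts pregnancy too
```

If `beta_tv` had dropped to zero, the TV/pregnancy association would have been entirely attributable to age; because it stays positive, the relationship holds when age is controlled for.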
What does using a multivariate design allow researchers to do? What does "control for" mean?

By using a multivariate design, researchers can evaluate whether a relationship between two key variables still holds when they control for another variable.
Control for - holding a potential third variable at a constant level (statistically or experimentally) while investigating the association between two other variables.

If researchers "control for age" in the pregnancy and sexy TV example, what are they interrogating? How does "controlling for" link to the identification of subgroups?

If they take the relationship between age and pregnancy into account, is there still a portion of variability in pregnancy that is explained by watching TV? Does the relationship between TV and pregnancy remain positive, even when controlling for age? Controlling for - recognising that testing a third variable with multiple regression means identifying subgroups.

When researchers use regression, what are they testing for? When researchers use multiple regression, how many variables are they studying? What is the first step in multiple regression?

They are testing whether some key relationship holds true even when a suspected third variable is statistically controlled for. Three or more variables. The first step is to identify the variable they are most interested in predicting, known as the criterion variable or dependent variable - e.g. pregnancy.

What is the criterion variable/dependent variable? What is the predictor variable/independent variable?

Criterion variable - the variable in a multiple-regression analysis that the researchers are most interested in understanding or predicting.
Predictor variable - a variable in a multiple-regression analysis that is used to explain variance in the criterion variable.

In a regression table, what value is frequently next to each predictor variable? What does a positive beta indicate? What does a negative beta indicate?
What does a beta of zero indicate?

Beta. A positive beta (like a positive r) indicates a positive relationship between the criterion and predictor variables, when the other predictor variables are statistically controlled for. A negative beta (like a negative r) indicates a negative relationship between the two variables, when the other predictor variables are statistically controlled for. A beta of zero represents no relationship, when the other variables are statistically controlled for.

What do higher and lower betas mean? What does the coefficient b represent?

The higher beta is, the stronger the relationship between that predictor variable and the criterion variable; the smaller beta is, the weaker the relationship. The coefficient b represents an unstandardised coefficient.

How is b similar to beta?

Sometimes a regression table will include the symbol b instead of beta. Similarity: the sign of b (positive or negative) denotes a positive or negative association, when the other predictors are controlled for. Unlike two betas, we cannot compare two b values in the same table to each other: b values are computed from the predictor variables' original units of measurement (such as dollars, centimetres, or inches), whereas betas are computed from predictor variables that have been converted to standardised units. A predictor variable with a large b may not actually have a stronger relationship to the criterion variable than a predictor variable with a smaller b.

The association between sexy TV and pregnancy has a beta of 0.25. The predictor age has a higher beta of 0.36. Does this mean the relationship between sexy TV and pregnancy is insignificant?

No; even when we hold age constant statistically, there is still a positive relationship between sexy TV and pregnancy. Age predicts pregnancy too.

How might the p value complement the 95% CI?
What if p is greater than .05?

When the p value is less than .05, you can infer that the 95% CI for beta does not contain zero, and the beta is therefore considered statistically significant. When p is greater than .05, the beta is considered not significant, and you can infer that the 95% CI does contain zero.

Describe a couple of ways you could explain the beta for the sexy TV and pregnancy variables.

The relationship between exposure to sex on TV and pregnancy is positive (higher levels of sex on TV are associated with higher pregnancy risk), even when age is controlled for. The 95% CI for the relationship between exposure to sex on TV and pregnancy does not contain zero, suggesting that this relationship is positive, controlling for age.

Describe how the bivariate relationship between academic success and family meals may encounter issues with temporal precedence. Describe how this association claim may also encounter issues with internal validity. How would we tackle the issue of internal validity?

Temporal precedence: did family meals come first and reinforce academic skills, or did high academic success come first, making it more pleasant for parents to have meals with their kids? Internal validity: a third variable - more involved parents would arrange family meals and also be involved in their kids' academic journey. A multiple-regression analysis could hold parental involvement constant and see whether family meals are still associated with academic success.

If there are many predictor variables (possible third variables) in a table, all with positive betas, what does this mean for the original criterion and predictor variable? Adding several predictors to a regression analysis can answer two kinds of question.

Even after all the other variables (age, parents' education, ethnicity) are controlled for, exposure to sex on TV still predicted pregnancy. 1. Control for several third variables at once = closer to a causal claim.
Because the relationship between the suspected cause and the suspected effect does not appear to be attributable to any of the other measured variables. 2. By looking at the betas for all the other predictor variables, we gain a sense of which factors predict the suspected effect/can evaluate which other variables are important.

When making association claims in the popular media, betas, 95% CIs, and predictor variables are rarely discussed. Which phrases in the popular media indicate that regression analysis has occurred? What about multiple-regression analysis? Provide two signs here. What if you cannot tell from the language whether possible third variables were controlled for?

"Controlled for" = sign of regression analysis. E.g. "Researcher X controlled for factors such as...".
"Adjusting for" = sign of multiple-regression analysis. E.g. "They adjusted for age, demographics, etc.".
"Considering" = sign of multiple regression. E.g. "even when other factors were considered".
If you cannot tell, it is reasonable to suspect that certain third variables cannot be ruled out.

Multivariate designs analysed with regression statistics can control for third variables. However, what may prove an issue when wanting to make a causal claim? What else might prove an issue with internal validity?

They cannot always establish temporal precedence. Even when a study takes place longitudinally, the researchers cannot control for variables they do not measure - variables they did not consider may account for the association.

What does the "lurking-variable" or "third-variable" problem push the researcher towards? How would an experiment account for third variables that have not been measured? Why is this better than multiple regression?

The "lurking" variable makes a well-run experiment ultimately more convincing than a correlational study in establishing causation. Random assignment would make the two groups likely to be equal on any third variables the researcher did not measure. Randomised experiment = determines causation.
Multiple regression = allows researchers to control for potential third variables, but only for those they choose to measure.

Which correlational design can satisfy the temporal precedence criterion? Which correlational design can potentially satisfy the internal validity criterion? What does the pattern and parsimony approach investigate? Why is it called the pattern and parsimony approach?

Longitudinal correlational design = temporal precedence. Multiple-regression analysis = internal validity. Pattern and parsimony - researchers investigate causality by using a variety of correlational studies that all point in a single causal direction. It is called the pattern and parsimony approach because there is a pattern of results best explained by a single, parsimonious causal theory.

Describe the classic example of pattern and parsimony. Why could they not use multiple-regression analysis or an experiment?

- Smokers have higher rates of lung cancer, r = .49. Critics argued against this - higher stress levels/coffee.
- Multiple-regression analysis could not control for every possible third variable.
- An experiment would be unethical, and researchers could not assign people to be lifelong smokers.

Using the smoking and cancer example, how was pattern and parsimony used?

They specified the mechanisms for the causal path, e.g. toxic chemicals in cigarette smoke. This led to a set of predictions, all of which could be explained by a single, parsimonious theory that chemicals in cigarettes cause cancer. 1. Longer smoking = greater chance of cancer. 2. People who stop smoking = lower cancer rates. 3. Smokers' cancer = lung cancer specifically. 4. Filtered cigarettes = less cancer; unfiltered = more. 5. Second-hand smoke = cancer.

How does the process of creating five predictions for smoking and cancer utilise the theory-data cycle? How is parsimony created in this example?

Theory-data cycle: the theory (cigarette toxicity) generated a particular set of questions, which led researchers to frame hypotheses about what the data should show.
5 predictions = many studies, using a variety of methods, all tied back to one central principle (chemicals in cigarettes) = a strong case for parsimony.

Why does parsimony create a strong case in correlational studies? Which of the causal criteria does parsimony tackle?

While each individual study has methodological weaknesses, taken together they all support the same parsimonious conclusion. The diversity of the five empirical findings makes it harder to raise third-variable explanations (internal validity). E.g. coffee intake cannot explain the effect of filtered cigarettes or second-hand smoke.

How might journalists misrepresent pattern and parsimony in science? When journalists report only one study at a time, what are they doing? What is the drawback of this?

- They may only report the results of the latest study.
- They may report a single study without mentioning the other studies and results.
- They might not describe the context: what previous studies have revealed/the theory being tested.
Reporting one study at a time = selectively presenting only part of the scientific process. Drawback - it makes it seem as though scientists conduct unconnected studies on a whim, and as though only one study would be needed to reverse decades of research.

What is the definition of parsimony?

Parsimony - the degree to which a theory provides the simplest explanation of some phenomenon. In the context of investigating a claim, the simplest explanation of a pattern of data; the best explanation is the one that requires the fewest exceptions or qualifications.

What may a mediation hypothesis investigate? How might researchers investigate mediation hypotheses?

In a mediation hypothesis, researchers specify a variable that comes between the two variables of interest as a possible reason the two variables are associated. After collecting data on all three variables (the original two, plus the mediator), they use statistical techniques to evaluate how well the data support the mediation hypothesis.

What do mediators explore further?
Provide an example?Once a relationship between two variables has been established, we often want to move further by thinking about why. Mediators mediate the relationship between two variables. Conscientious people are healthier than less conscientious people. WHY? Mediator - conscientious people are more likely to follow medical advice.What are the similarities between mediators and third variables? What are the differences?Similarities: -Both involve multivariate research designs. -Both can be detected using multiple regression. Differences: -Third variables are external to the bivariate correlation (problematic). -A third variable is a lurking variable that distracts from the variables of interest. -A nuisance to the researcher. -Mediators are internal to the causal relationship (not problematic). -A mediator tells a theoretically meaningful step-by-step story: "A leads to M leads to B". -Of interest to the researcher.How may researchers test for mediation? Regarding the causal criteria, when is mediation established?Statistical techniques: compute the relationships among all three variables and use multiple regression to test for mediation. Mediation hypotheses are causal claims - only established in conjunction with temporal precedence: when the proposed causal variable is measured or manipulated first in a study, followed by a later mediating variable.When testing for mediation, what do researchers ask? When testing for moderation, what do researchers ask?Mediating variables: Why are these two variables linked? Moderating variables: Are these two variables linked in the same way for everyone, in every situation? For whom is the association the strongest?Describe the different proposals from mediation and moderation, using the example of better health and conscientiousness.Mediation hypothesis: Medical compliance is the reason conscientiousness is related to better health.
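The regression-based mediation test described in these cards (compute the relationships among all three variables) can be sketched with simulated data; the causal chain and all coefficients below are hypothetical illustrations of the conscientiousness example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical chain: conscientiousness -> medical compliance (mediator) -> health.
consc = rng.normal(size=n)
compliance = 0.7 * consc + rng.normal(size=n)
health = 0.6 * compliance + rng.normal(size=n)  # no direct consc -> health path

def ols(predictors, y):
    """Ordinary-least-squares coefficients; intercept first."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

c_total = ols([consc], health)[1]               # total effect of X on Y
c_direct = ols([consc, compliance], health)[1]  # effect with the mediator controlled

print(f"total effect = {c_total:.2f}, direct effect = {c_direct:.2f}")
```

The drop from the total to the direct effect once the mediator is controlled is the statistical signature of mediation; as the cards stress, it is only causally interpretable when temporal precedence holds.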
Moderation hypothesis: the link between conscientiousness and good health is strongest amongst older people and weaker amongst younger people.Briefly define a mediator/ mediating variable? What should we ask when investigating the construct validity of any multivariate and bivariate design? What about the external validity?Mediator - a variable that helps explain the relationship between two other variables. Construct validity: How well was each variable measured? External validity: How were individuals in the research sampled/ what kind of population do they represent?When interrogating a multivariate correlational research study's statistical validity, what should we ask?-Ask about the point estimates and confidence intervals. -Ask whether the study has been replicated. -Ask whether precautions have been taken with outliers and curvilinear associations (more difficult to detect with more than 2 variables).Which correlational method does research often begin with? Which methods must researchers use to get closer to making causal claims?Research often begins with a simple bivariate correlation, which cannot establish causation. Researchers must use multivariate techniques to get closer to making a causal claim.What is the most common statistical method we will encounter? When comparing two groups/ variables, what assumption do we start with? (regarding H0 and H1)The p value method of statistical inference. Everything starts with the assumption that the null hypothesis is the true statement: there is no difference between the two groups/ variables.Give an example of a null hypothesis (H0)? How is this different from our hypothesis?Introverts and extroverts have the same types of friendship networks. There is no difference between their networks - we assume this. This is different from our hypothesis (H1): introverts have smaller friendship networks than extroverts.Is the p value considered in relation to H1 or H0? What is the p value?
Can the p value be used for the hypothesis?The p value is always calculated under the null distribution. The p value is the probability of observing an effect at least as extreme as the one you did, given that the null is true. Everything hinges on the null being true; that is how you interpret the p value. Importantly, the p value is not the strength of evidence you have for the hypothesis. This is another reason why scientists do not say we "proved" our hypothesis: it is not about H1, it is about H0.Where do p values come from? Why do we want smaller p values with NHST?They come from the distribution of possible values one can obtain assuming the null is true. The smaller the p value, the less likely it is that we would have observed the effect we did if it came from the null distribution, and consequently the more evidence we have to reject the null hypothesis.Briefly describe the p value?The p value is the probability of observing the effect you did, given that H0 is true. P-hacking works because we are talking about the null: how many participants you recruited, how you assigned participants to experimental conditions, and when you decide something is an outlier and whether it should be included - these are all part of the null hypothesis. If you entertain a whole family of null hypotheses, eventually you'll find one to reject, but that doesn't mean you can reject all of them = p-hacking.What is the effect size? What is the correlation coefficient?Effect size - the difference in magnitude between the two groups (small, medium, large); a large effect size = 0.5 and above. Correlation coefficient - the relationship between the two variables (positive, negative).
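The last few cards (null distribution, p value, effect size) can be tied together with a small simulation. The friendship-network numbers below are invented for illustration; a permutation test makes the "p values come from the distribution of values you could get if the null were true" idea concrete:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented data: friendship-network sizes for two groups.
introverts = rng.normal(10, 3, size=40)
extroverts = rng.normal(14, 3, size=40)
observed = extroverts.mean() - introverts.mean()

# Build the null distribution by shuffling group labels,
# i.e. simulating a world where H0 ("no difference") is true.
pooled = np.concatenate([introverts, extroverts])
null_diffs = np.empty(5000)
for i in range(5000):
    rng.shuffle(pooled)
    null_diffs[i] = pooled[40:].mean() - pooled[:40].mean()

# p value: how often the null distribution produces an effect
# at least as extreme as the observed one.
p = np.mean(np.abs(null_diffs) >= abs(observed))

# Effect size (Cohen's d): the group difference in pooled-SD units.
d = observed / np.sqrt((introverts.var(ddof=1) + extroverts.var(ddof=1)) / 2)

print(f"p = {p:.4f}, d = {d:.2f}")
```

Note that the p value is computed entirely from the shuffled (null) differences, never from H1 - which is exactly why, as the card says, it is not the strength of evidence for the hypothesis itself.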