1. History: History refers to events that occur during the course of a study and are not part of the study but affect its results.
-The best way to control history when it's due to events that occur outside the context of the study is to include more than one group and randomly assign participants to the different groups.
--When this is done, participants in all groups should be affected to the same extent by history.
-History can also be a threat when participants are exposed to the independent variable in groups and one group experiences an unintended event (e.g., a power outage or other disturbance) that's not experienced by other groups and that affects the results of the study.
--This type of history is more difficult to control and must be considered when interpreting the results of a study.

2. Maturation: Maturation refers to physical, cognitive, and emotional changes that occur within subjects during the course of the study that are due to the passage of time and affect the study's results. The longer the duration of the study, the more likely its results will be threatened by maturation.
-The best way to control maturation is to include more than one group in the study and randomly assign participants to the different groups.
--When this is done, participants in all groups should experience similar maturational effects and any differences between the groups at the end of the study will not be due to maturation.

3. Differential Selection: Differential selection is a misnomer because it actually refers to differential assignment of subjects to treatment groups. It occurs when groups differ at the beginning of the study due to the way they were assigned to groups and this difference affects the study's results.
-The best way to control differential selection is to randomly assign participants to groups so the groups are similar at the start of the study.

4. Statistical Regression: Statistical regression is also known as regression to the mean and threatens a study's internal validity when participants are selected for inclusion in the study because of their extreme scores on a pretest.
-It occurs because many characteristics are not entirely stable over time and many measuring instruments are not perfectly reliable.
-Statistical regression is controlled by not limiting the sample to extreme scorers or by including more than one group and ensuring that the groups are equivalent in terms of extreme scorers at the beginning of the study.
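Regression to the mean is easy to see in a short simulation. In this sketch (Python; all score values and reliabilities are invented for illustration), each person has a stable true score plus independent measurement error on each test; the top 5% of pretest scorers drift back toward the population mean on the posttest even though no treatment occurred:

```python
import random
import statistics

random.seed(0)

# Each person has a stable "true score"; each test adds independent
# measurement error, so observed scores are not perfectly reliable.
true_scores = [random.gauss(100, 10) for _ in range(10_000)]
test1 = [t + random.gauss(0, 10) for t in true_scores]
test2 = [t + random.gauss(0, 10) for t in true_scores]

# Select the extreme scorers: the top 5% on the pretest.
cutoff = sorted(test1)[int(0.95 * len(test1))]
extreme = [(s1, s2) for s1, s2 in zip(test1, test2) if s1 >= cutoff]

mean1 = statistics.mean(s1 for s1, _ in extreme)
mean2 = statistics.mean(s2 for _, s2 in extreme)

# With no treatment at all, the extreme group's posttest mean moves
# back toward the population mean of 100 - regression to the mean.
print(round(mean1, 1), round(mean2, 1))
```

The drop from pretest to posttest here is pure statistical artifact; in a one-group study it could easily be mistaken for a treatment effect.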

5. Testing: Testing threatens a study's internal validity when taking a pretest affects how participants respond to the posttest.
-This threat is controlled by not administering a pretest or by using the Solomon four-group design, which is described below.

6. Instrumentation: Instrumentation is a threat to internal validity when the instrument used to measure the dependent variable changes over time.
-For example, raters may become more accurate at rating participants over the course of the study.
-The only way to control instrumentation is to ensure that instruments don't change over time. If that's not possible, instrumentation's potential effects must be considered when interpreting the study's results.

7. Differential Attrition: Differential attrition threatens internal validity when participants drop out of one group for different reasons than participants in other groups do and, as a result, the composition of the group is altered in a way that affects the results of the study.
-Attrition is difficult to control because researchers often don't have the information needed to determine how participants who drop out from a study differ from those who remain.
1. Reactivity: Reactivity threatens a study's external validity whenever participants respond differently to the independent variable during a study than they would normally respond.
Factors that contribute to reactivity include demand characteristics and experimenter expectancy.
--Demand characteristics are cues that inform participants of what behavior is expected of them.
--Experimenter expectancy occurs when the experimenter acts in ways that bias the results of the study and can involve (a) actions that take the form of demand characteristics and directly affect participants (e.g., saying "good" whenever a participant gives the expected or desired response) or (b) actions that don't directly affect participants (e.g., recording the responses of participants inaccurately in a way that supports the purpose of the study).
-The best ways to control reactivity are to use unobtrusive measures, deception, or the single- or double-blind technique.
-When using the single-blind technique, participants do not know which group they're participating in (e.g., if they're in the treatment or control group); when using the double-blind technique, participants and researchers do not know what group participants are in.

2. Multiple Treatment Interference: Multiple treatment interference is also referred to as carryover effects and order effects. It may occur whenever a within-subjects research design is used - i.e., when each participant receives more than one level of the independent variable.
-For example, if a low dose, moderate dose, and high dose of a drug are sequentially administered to a group of participants and the high dose is most effective, its superior effect may be due to the fact that it was administered after the low and moderate doses.
-Multiple treatment interference is controlled by using counterbalancing, which involves having different groups of participants receive the different levels of the independent variable in a different order.
-The Latin square design is a type of counterbalanced design in which each level of the independent variable occurs equally often in each ordinal position.
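A simple way to construct a Latin square is to shift the condition order by one position for each participant group. This sketch (Python; the condition labels are hypothetical drug doses) builds such a square for three conditions and checks its defining property:

```python
# Hypothetical condition labels for a within-subjects drug study.
conditions = ["low", "moderate", "high"]
k = len(conditions)

# Cyclic Latin square: row i presents the conditions shifted by i,
# so every condition appears exactly once in each ordinal position.
square = [[conditions[(i + j) % k] for j in range(k)] for i in range(k)]

for row in square:
    print(row)

# Check the defining property: each ordinal position (column) contains
# every condition exactly once across the k participant groups.
for pos in range(k):
    assert sorted(row[pos] for row in square) == sorted(conditions)
```

Note that this cyclic construction only balances ordinal position; a balanced Latin square additionally controls which condition immediately precedes which, which matters when carryover effects depend on sequence.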

3. Selection-Treatment Interaction: A selection-treatment interaction is a threat to external validity when research participants differ from individuals in the population, and the difference affects how participants respond to the independent variable.
-For example, people who volunteer for research studies may be more motivated and, therefore, more responsive to the independent variable than non-volunteers would be. The best way to control this threat is to randomly select subjects from the population.

4. Pretest-Treatment Interaction: A pretest-treatment interaction is also known as pretest sensitization and threatens a study's external validity when taking a pretest affects how participants respond to the independent variable.
-For example, answering questions about a controversial issue in a pretest may make subjects pay more attention to information about that issue when it's addressed in a lecture or discussion during the study.
-The Solomon four-group design is used to identify the effects of pretesting on a study's internal and external validity.
--When using this design, the study includes four groups that allow the researcher to evaluate (a) the effects of pretesting on the independent variable by comparing two groups that are both exposed to the independent variable, with only one group taking the pretest and (b) the effects of pretesting on the dependent variable by comparing two groups that are not exposed to the independent variable, with one group taking the pre- and posttests and the other taking the posttest only.
The various single-subject designs share the following characteristics: (a) They include at least two phases: a baseline (no treatment) phase, which is designated with the letter "A," and a treatment phase, which is designated with the letter "B." (b) The treatment phase does not usually begin until a stable pattern of performance on the dependent variable is established during the baseline phase. (c) The dependent variable is measured multiple times during each phase, which helps a researcher determine if a change in the dependent variable is due to the independent variable or to maturation, history, or other factors. Although single-subject designs are ordinarily used with a single subject, the multiple-baseline-across-subjects design includes two or more subjects, and the other single-subject designs can be used with multiple subjects when the subjects are treated as a single group.

1. AB Design: The AB design consists of a single baseline (A) phase and a single treatment (B) phase. Like all single-subject designs, it helps a researcher determine if an observed change in the dependent variable is due to the independent variable or to maturation since maturational effects (e.g., fatigue, boredom) usually occur gradually over time. Consequently, changes in performance on the dependent variable due to maturation would be apparent in the pattern of the individual's performance. The AB design does not control history, however, because any change in the dependent variable that occurs when the independent variable is applied could be due to the independent variable or to an unintended event that occurred at the same time the independent variable was applied.

2. Reversal Designs: A single-subject design is referred to as a reversal or withdrawal design when at least one additional baseline phase is added. The ABA and ABAB designs are reversal designs. The ABAB design begins with a baseline phase which is followed by a treatment phase, withdrawal of the treatment during a second baseline phase, and then application of the same treatment during the second treatment phase. The advantage of adding phases is that doing so helps a researcher determine if a change in the dependent variable is due to history rather than the independent variable: When the dependent variable returns to its initial baseline level during the second baseline phase and to its initial treatment level during the second treatment phase, it's unlikely that changes in the dependent variable were due to unintended events.

3. Multiple Baseline Design: When using the multiple baseline design, the independent variable is sequentially applied across different "baselines," which can be different behaviors, tasks, settings, or subjects. For example, a psychologist might use a multiple-baseline across behaviors design to evaluate the effectiveness of response cost for reducing a child's undesirable interactions with other children during recess - i.e., name calling, hitting, and making obscene gestures. To do so, the psychologist would obtain baseline data on the number of times the child engages in each behavior during morning recess for five school days. He would then apply response cost to name calling during recess for the next five school days while continuing to obtain baseline data for hitting and making obscene gestures. Next, the psychologist would apply response cost to name calling and hitting for the next five school days while continuing to obtain baseline data for making obscene gestures. And, finally, he would apply response cost to name calling, hitting, and making obscene gestures during recess for the next five school days. If the results of the study indicate that each undesirable behavior remained stable during its baseline phase and decreased only when response cost was applied to it, this would demonstrate the effectiveness of response cost for all three behaviors. An advantage of the multiple baseline design over the reversal designs is that, once the independent variable is applied to a behavior, task, setting, or participant, it does not have to be withdrawn during the course of the study.
Quantitative research is used to identify and study differences in the amount of behavior and produces data that's "expressed numerically and can be analyzed in a variety of ways". The types of quantitative research can be categorized as descriptive, correlational, or experimental.

(a) Descriptive research is conducted to measure and "describe a variable or set of variables as they exist naturally".

(b) Correlational research involves correlating the scores or status of a sample of individuals on two or more variables to determine the magnitude and direction of the relationship between the variables.
-Variables are usually measured as they exist, without any attempt to modify or control them or determine if there's a causal relationship between them.
-The data collected in a correlational research study are often used to conduct a regression analysis or multiple regression analysis to derive a regression or multiple regression equation.
-The equation is then used to predict a person's score on a criterion from his/her obtained score(s) on the predictor(s). (In the context of correlational research, an independent variable is often referred to as the predictor or X variable and the dependent variable is referred to as the criterion or Y variable.)
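As a sketch of how a regression equation is derived and then used for prediction (Python; the predictor and criterion scores are invented for illustration), the least-squares slope and intercept are computed from the sample data and applied to a new predictor score:

```python
import statistics

# Hypothetical data: hours of review (predictor X) and exam score
# (criterion Y) for six examinees.
x = [5, 10, 15, 20, 25, 30]
y = [60, 65, 72, 75, 84, 88]

# Least-squares regression coefficients:
# b = cov(X, Y) / var(X), a = mean(Y) - b * mean(X).
mx, my = statistics.mean(x), statistics.mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

# The regression equation Y' = a + bX predicts a criterion score
# from an obtained predictor score (here, 22 hours of review).
predicted = a + b * 22
print(round(b, 3), round(a, 2), round(predicted, 1))
```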

(c) Experimental research is conducted to determine if there's a causal relationship between independent and dependent variables.
-A distinction is made between true experimental and quasi-experimental research:
--True experimental: A researcher conducting a true experimental research study has more control over the conditions of the study and, consequently, can be more confident that an observed relationship between independent and dependent variables is causal.
---The most important aspect of control for true experimental research is the ability to randomly assign subjects to different levels of the independent variable, which helps ensure that groups are equivalent at the beginning of the study.
1. Between-Subjects Designs: A study using a between-subjects design includes two or more groups of subjects, with each group being exposed to a different level of the independent variable.
-For example, in a study comparing the effectiveness of low, moderate, and high doses of an antidepressant for reducing depressive symptoms, one group of subjects would receive the low dose, a second group would receive the moderate dose, and a third group would receive the high dose.

2. Within-Subjects Designs: When using a within-subjects design, each participant is exposed to some or all levels of the independent variable, with each level being administered at a different time.
-When using a single-group within-subjects design to evaluate the effects of an antidepressant drug dose on depression, the low, moderate, and high doses would be compared by sequentially administering the three doses to all subjects and evaluating their depressive symptoms after they've taken each dose for a prespecified period of time.
-The time-series design is a type of within-subjects design that's essentially a group version of the single-subject AB design and involves measuring the dependent variable at regular intervals multiple times before and after the independent variable is administered so that all participants act as both the control (no treatment) and treatment groups.

3. Mixed Designs: When using a mixed design, a study includes at least two independent variables, with at least one variable being a between-subjects variable and another being a within-subjects variable.
-A mixed design is being used when the effects of drug dose on depressive symptoms are measured weekly for six weeks after participants begin taking either the low, moderate, or high dose of the antidepressant. In this situation, drug dose is a between-subjects variable because each subject will receive only one dosage level and time of measurement is a within-subjects variable because each subject's depressive symptoms will be measured at regular intervals over time.
A research design is referred to as a factorial design whenever it includes two or more independent variables.

An advantage of a factorial design is that it allows a researcher to obtain information on the main effects of each independent variable as well as the interaction between the variables.
-A main effect is the effect of one independent variable on the dependent variable
-An interaction effect is the combined effect of two or more independent variables on the dependent variable.

Note that any combination of significant main and interaction effects is possible:
-There can be main effects of one or more independent variables and interaction effects, main effects of one or more independent variables and no interaction effects, or no main effects but significant interaction effects.
-In addition, when there's a significant interaction, any main effects must be interpreted with caution because the interaction may modify the meaning of the main effects.

As an example, a research study evaluating the effects of type of therapy (cognitive-behavioral therapy, interpersonal therapy, and supportive therapy) and antidepressant drug dose (high, moderate, and low) on depressive symptoms has two independent variables (type of therapy and drug dose) and subjects will be assigned to one of nine groups that each represent a different combination of the levels of the two variables - cognitive therapy and low dose, interpersonal therapy and low dose, supportive therapy and low dose, cognitive therapy and moderate dose, etc.

The results of the statistical analysis of the data obtained in this study will indicate if there are significant main and/or interaction effects.

For example, the results might indicate that, overall, cognitive therapy is significantly more effective than interpersonal therapy and supportive therapy and the high dose is significantly more effective than the moderate or low dose of the antidepressant drug. In other words, there are main effects for both type of therapy and drug dose.
The results of the study might also indicate that the moderate dose is most effective for people who received cognitive therapy but that the high dose is most effective for people who received interpersonal therapy or supportive therapy.
-In other words, there's an interaction between type of therapy and drug dose: The effects of drug dose differ for different types of therapy.
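The pattern described above can be sketched with hypothetical cell means (Python; all values are invented). Marginal means reveal the main effects, while the best dose differing across therapies reveals the interaction:

```python
# Hypothetical cell means (lower = fewer depressive symptoms) for a
# 3 (therapy) x 3 (drug dose) factorial design: the moderate dose works
# best with cognitive therapy, while the high dose works best with the
# other two therapies.
means = {
    ("cognitive", "low"): 18, ("cognitive", "moderate"): 10, ("cognitive", "high"): 14,
    ("interpersonal", "low"): 22, ("interpersonal", "moderate"): 19, ("interpersonal", "high"): 15,
    ("supportive", "low"): 24, ("supportive", "moderate"): 21, ("supportive", "high"): 17,
}
therapies = ["cognitive", "interpersonal", "supportive"]
doses = ["low", "moderate", "high"]

# Main effects are read from the marginal means (averaging over the
# other factor); an interaction shows up when the pattern of dose
# effects differs across therapies.
therapy_marginals = {t: sum(means[(t, d)] for d in doses) / 3 for t in therapies}
best_dose = {t: min(doses, key=lambda d: means[(t, d)]) for t in therapies}
print(therapy_marginals)
print(best_dose)
```

Because the most effective dose depends on the therapy, any overall main effect of dose would have to be interpreted with the interaction in mind.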
It is ordinarily not possible to collect data from all members of the target population when conducting a research study and, consequently, a sample of individuals is selected from the population. Methods for selecting a sample are categorized as probability and non-probability sampling methods.

1. Probability Sampling: Probability sampling requires the random selection of the sample from the population, which helps ensure that members of the sample are representative of the population. However, even when a sample is randomly selected, the sample may be affected by sampling error, which means that the sample is not completely representative of the population from which it was selected due to the effects of chance (random) factors. Sampling error is most likely to be a problem when the sample size is small.

Methods of probability sampling include the following:

(a) When using simple random sampling, all members of the population have an equal chance of being selected.
-One method of simple random sampling is using a computer to randomly choose the sample from a list of all individuals in the population.

(b) Systematic random sampling can be used when a list of all individuals in the population is available and the individuals are listed in random order.
-It involves selecting every nth (e.g., 10th or 25th) individual from the list until the desired number of individuals has been selected.

(c) Stratified random sampling is useful when the population is heterogeneous with regard to one or more characteristics that are relevant to the study (e.g., gender, age range, DSM diagnosis), and the researcher wants to make sure that each characteristic is adequately represented in the sample.
-This involves dividing the population into subgroups (strata) based on the relevant characteristics and selecting a random sample from each subgroup.

(d) Cluster random sampling is used when it is impossible to randomly select individuals from a population because the population is very large and there are natural clusters in the population (e.g., mid-sized cities, school districts, mental health clinics).
-It involves randomly selecting a sample of clusters and then either including in the study all individuals in each selected cluster or a random sample of individuals in each selected cluster.
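The first three probability sampling methods can be sketched in a few lines (Python; the population and its strata are hypothetical):

```python
import random

random.seed(1)
# Hypothetical population: 100 people, 70 labeled "A" and 30 "B"
# on a characteristic relevant to the study.
population = [("A", i) for i in range(70)] + [("B", i) for i in range(30)]

# (a) Simple random sampling: every member has an equal chance.
simple = random.sample(population, 10)

# (b) Systematic random sampling: every nth member (here every 10th)
# of a randomly ordered list.
shuffled = random.sample(population, len(population))
systematic = shuffled[::10]

# (c) Stratified random sampling: random samples drawn separately
# from each stratum, proportional to its size in the population.
stratum_a = [p for p in population if p[0] == "A"]
stratum_b = [p for p in population if p[0] == "B"]
stratified = random.sample(stratum_a, 7) + random.sample(stratum_b, 3)

print(len(simple), len(systematic), len(stratified))
```

Note how the stratified sample guarantees the 70/30 split of the population characteristic, while the simple random sample only approximates it on average.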

2. Non-Probability Sampling: When using non-probability sampling, individuals are selected on the basis of non-random criteria, and not all members of the population have an equal chance of being selected.

Non-probability sampling is vulnerable to sampling error and sampling bias, which is also known as selection bias and systematic error.

Sampling bias occurs when participants in the study over- or underrepresent one or more relevant population characteristics because of the way that the sample was obtained.

Consequently, non-probability sampling is most useful for qualitative and exploratory studies designed to acquire a better understanding of an under-researched issue or population rather than studies designed to test hypotheses.

Methods of non-probability sampling include the following:

(a) Convenience sampling involves including in a sample individuals who are easily accessible to the researcher (e.g., the students in a psychologist's clinical psychology classes).

(b) When using voluntary response sampling, the sample consists of individuals who volunteered to participate in the study.

(c) Purposive sampling is also known as judgmental sampling. When using this method, researchers use their judgment to select individuals who are appropriate for the purposes of their studies.

(d) Snowball sampling is used when direct access to members of the target population is difficult. It involves asking initial individuals who participate in the study if they can recommend others who qualify for inclusion in the study.
CBPR "is a collaborative approach to research that equitably involves all partners in the research process and recognizes the unique strengths that each brings.

CBPR begins with a research topic of importance to the community and has the aim of combining knowledge with action and achieving social change to improve health outcomes and eliminate health disparities".

The research topic can be identified by the community itself or by the community in collaboration with the research team, which may include educators; policy decision-makers; and psychologists, physicians, and other healthcare professionals.

Nine core principles of CBPR have been identified by Israel and her colleagues (1998):
CBPR (1) recognizes the community as a unit of identity,
(2) builds on the community's strengths and resources,
(3) emphasizes an equitable and collaborative partnership during all phases of the research study,
(4) fosters co-learning and capacity building among all partners,
(5) integrates knowledge generation and intervention for the benefit of all partners,
(6) recognizes that research should be driven by the community and locally relevant problems,
(7) involves a cyclical and iterative process,
(8) disseminates research findings to all partners and involves community participants in the dissemination process, and
(9) understands that CBPR is a long-term process that requires a commitment to sustainability. These principles are not considered to be absolute or exhaustive and are modified to fit the particular circumstances of a research project.
Multivariate correlational techniques are extensions of bivariate correlation and regression analysis. They make it possible to use two or more predictors to estimate status on one or more criteria.

(a) Multiple regression is the appropriate technique when two or more predictors will be used to estimate status on a single criterion that's measured on a continuous scale.
There are two forms of multiple regression:
1. Simultaneous (standard) multiple regression involves entering data on all predictors into the equation simultaneously.
2. Stepwise multiple regression involves adding or subtracting one predictor at a time to the equation in order to identify the smallest number of predictors needed to make accurate predictions.

When using multiple regression, the optimal circumstance is for each predictor to have a high correlation with the criterion but low correlations with other predictors since this means that each predictor is providing unique information. When predictors are highly correlated with one another, this is referred to as multicollinearity.
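Multicollinearity can be screened for by correlating the predictors with one another. In this sketch (Python; the scores are invented), x1 and x2 are nearly redundant while x3 provides unique information:

```python
import statistics

# Hypothetical predictor scores for eight examinees: x1 and x2 are
# nearly redundant (highly correlated), while x3 is not.
x1 = [10, 12, 14, 16, 18, 20, 22, 24]
x2 = [11, 12, 15, 15, 19, 19, 23, 23]   # tracks x1 closely
x3 = [5, 9, 2, 8, 3, 7, 1, 6]           # unrelated to x1

def r(a, b):
    """Pearson correlation coefficient."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    num = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    den = (sum((ai - ma) ** 2 for ai in a) * sum((bi - mb) ** 2 for bi in b)) ** 0.5
    return num / den

# A high predictor-predictor correlation signals multicollinearity:
# x2 would add little unique information to an equation containing x1,
# while x3 would add a lot.
print(round(r(x1, x2), 3), round(r(x1, x3), 3))
```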

(b) Canonical correlation is the appropriate technique when two or more continuous predictors will be used to estimate status on two or more continuous criteria.

(c) Discriminant function analysis is the appropriate technique when two or more predictors will be used to estimate status on a single criterion that's measured on a nominal scale.
-Logistic regression is the alternative to discriminant function analysis when the assumptions for discriminant function analysis are not met (e.g., when scores on the predictors are not normally distributed).
Inferential statistics are used to determine if the results of a research study are due to the effects of an independent variable on a dependent variable or to sampling error. This involves using an inferential statistical test to compare the obtained sample value to the values in an appropriate sampling distribution.

When the sample value of interest is a mean, the appropriate sampling distribution is a sampling distribution of means.
-It's the distribution of mean scores that would be obtained if a very large number of same-sized samples were randomly drawn from the population, and the mean score on the variable of interest was calculated for each sample.

While some of the sample means would be equal to the population mean, others would be higher or lower because of the effects of sampling error, which is a type of random error.
-In other words, the sample means would vary, not because individuals in the samples were exposed to the independent variable, but because of the effects of sampling error.

In inferential statistics, a sampling distribution of means is not actually constructed by obtaining a large number of random samples from the population and calculating each sample's mean.
-Instead, probability theory - and, more specifically, the central limit theorem - is used to estimate the characteristics of the sampling distribution.
The central limit theorem makes three predictions about the sampling distribution of means:
(a) The sampling distribution will increasingly approach a normal shape as the sample size increases, regardless of the shape of the population distribution of scores.
(b) The mean of the sampling distribution of means will be equal to the population mean.
(c) The standard deviation of the sampling distribution - which is referred to as the standard error of the mean - will be equal to the population standard deviation divided by the square root of the sample size.
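Predictions (b) and (c) can be checked with a simulation (Python sketch; a hypothetical exponential population is used precisely because it is not normal):

```python
import math
import random
import statistics

random.seed(2)
# A clearly non-normal population: exponential scores with mean 10.
population = [random.expovariate(1 / 10) for _ in range(100_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

# Draw many same-sized random samples and record each sample's mean.
n = 25
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(2_000)
]

# (b) The mean of the sampling distribution approximates the
# population mean; (c) its standard deviation (the standard error of
# the mean) approximates sigma / sqrt(n).
print(round(pop_mean, 2), round(statistics.mean(sample_means), 2))
print(round(pop_sd / math.sqrt(n), 2), round(statistics.stdev(sample_means), 2))
```

With n = 25, the histogram of the 2,000 sample means would also look roughly normal despite the strongly skewed population, illustrating prediction (a).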
Because inferential statistics is based on probability theory, when a researcher makes a decision to retain or reject the null hypothesis based on the results of an inferential statistical test, it's not possible to be certain whether the decision is correct or incorrect.

With regard to correct decisions, a researcher can either retain a true null hypothesis or reject a false null hypothesis.
-When a researcher retains a true null hypothesis, he or she has correctly concluded that the independent variable has not had a significant effect on the dependent variable and that any observed effect is due to sampling error or other factors.
-And when a researcher rejects a false null hypothesis, the researcher has correctly concluded that the independent variable has had a significant effect on the dependent variable.

With regard to incorrect decisions, a researcher can either reject a true null hypothesis or retain a false null hypothesis.
-When a researcher rejects a true null hypothesis, the researcher has concluded that the independent variable has had a significant effect on the dependent variable, but the observed effect is actually due to sampling error or other factors.
--This type of incorrect decision is known as a Type I error.
---For the EPPP, you want to know that the probability of making a Type I error is equal to alpha, which is also known as the level of significance and is set by a researcher before analyzing the data he/she has collected. Alpha is usually set at .05 or .01: When it's .05, this means there's a 5% chance of making a Type I error; when it's .01, this means there's a 1% chance of making a Type I error.
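The claim that alpha equals the probability of a Type I error can be checked by simulation (Python sketch; a z-test with known sigma is used for simplicity). When the null hypothesis is true, about 5% of tests conducted at alpha = .05 will reject it:

```python
import math
import random

random.seed(3)
# The null hypothesis is true here: every "treatment" sample is drawn
# from the same N(0, 1) population, so any rejection is a Type I error.
n, trials, rejections = 25, 4_000, 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) / (1 / math.sqrt(n))  # z-test with known sigma
    if abs(z) > 1.96:  # two-tailed test at alpha = .05
        rejections += 1

# The long-run rejection rate approximates alpha (.05).
print(round(rejections / trials, 3))
```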

The second type of incorrect decision occurs when a researcher retains a false null hypothesis.
-In other words, the researcher has concluded that the independent variable has not had a significant effect on the dependent variable when it actually has, but the researcher was not able to detect the effect because of sampling error or other factors.
--This type of incorrect decision is referred to as a Type II error.
---The probability of making a Type II error is equal to beta, which is not set by the researcher but can be reduced by increasing statistical power.
The chi-square test is used when the data to be analyzed are nominal data.

There are two chi-square tests - the single-sample chi-square test, which is also known as the chi-square goodness-of-fit test, and the multiple-sample chi-square test, which is also known as the chi-square test for contingency tables.

It will be easier to determine which chi-square test to use for an exam question if you substitute the word "variable" for "sample":

The single-sample (single-variable) chi-square test is used to analyze data from a descriptive study that includes only one variable.

The multiple-sample (multiple-variable) chi-square test is used to analyze data from (a) a descriptive study that has two or more variables that can't be identified as independent or dependent variables or (b) an experimental study that has independent and dependent variables.

Remember that, when determining the number of variables for the chi-square test, you count all of the variables.

As an example, you would use the single-sample chi-square test to analyze data collected in a study to determine whether college undergraduates prefer to use a hard-copy textbook or an online textbook for their introductory statistics class. This is a descriptive study with a single nominal variable that has two categories, and the single-sample chi-square test would be used to compare the number of students in the two categories.
-If this study is expanded to include type of course (face-to-face course or online course), the study is still a descriptive study but it includes two variables, and a statistical test will be used to compare the number of students in the four categories (prefer hard-copy text/face-to-face course, prefer hard-copy text/online course, prefer online text/face-to-face course, and prefer online text/online course).
--Because the study includes two variables and the data to be analyzed are nominal (the number of subjects in each nominal category), the multiple-sample chi-square test is the appropriate statistical test.
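The multiple-sample chi-square statistic for the expanded textbook study can be computed directly from a contingency table. In this sketch (Python; the counts are invented), the expected frequencies come from the null hypothesis of independence:

```python
# Hypothetical counts for the expanded textbook-preference study:
# rows = course type, columns = textbook preference.
observed = [
    [40, 20],  # face-to-face course: hard-copy text, online text
    [15, 25],  # online course:       hard-copy text, online text
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Expected count for each cell under the null hypothesis of
# independence: (row total * column total) / grand total.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand
        chi_sq += (obs - expected) ** 2 / expected

# df = (rows - 1)(columns - 1) = 1; the .05 critical value is 3.84.
print(round(chi_sq, 2))
```

With these invented counts, the obtained statistic exceeds the critical value, so textbook preference and course type would not be independent.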
The Student's t-test is used when a study includes one independent variable that has two levels and one dependent variable that's measured on an interval or ratio scale. In this situation, the t-test will be used to compare two means. For example, the t-test would be used to compare the mean mock EPPP exam scores obtained by psychologists who participated in either a live exam review workshop or an online exam review workshop.

There are three t-tests and the appropriate one depends on how the two means were obtained:
-The t-test for a single sample is used to compare an obtained sample mean to a known population mean. (In this situation, the population is acting as the no-treatment control group.)
-The t-test for unrelated samples is also known as the t-test for uncorrelated samples and is used to compare the means obtained by two groups when subjects in the groups are unrelated - e.g., when subjects were randomly assigned to one of the two groups.
-Finally, the t-test for related samples is also known as the t-test for correlated samples and is used to compare two means when there's a relationship between subjects in the two groups.
--This occurs when (a) participants are "natural" pairs (e.g., twins), and members of each pair are assigned to different groups; (b) participants are matched in pairs on the basis of their pretest scores or status on an extraneous variable, and members of each pair are assigned to different groups; or (c) a single-group pretest-posttest design is used and subjects are "paired" with themselves.
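A minimal sketch of the t-test for unrelated samples, using hypothetical mock-exam scores for the two workshop groups (the data are invented; the formula is the standard pooled-variance independent-samples t):

```python
from statistics import mean, variance
from math import sqrt

# Hypothetical mock EPPP exam scores for two independent workshop groups.
live = [10, 12, 14]
online = [8, 9, 10]

n1, n2 = len(live), len(online)
# Pooled variance combines the two sample variances, weighted by their df.
pooled_var = ((n1 - 1) * variance(live) + (n2 - 1) * variance(online)) / (n1 + n2 - 2)
t = (mean(live) - mean(online)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
print(round(t, 3))  # 2.324
```

The obtained t would then be compared to the critical value for n1 + n2 − 2 degrees of freedom; the t-tests for a single sample and for related samples use the same logic but different standard-error terms.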
The one-way analysis of variance (ANOVA) is the appropriate statistical test when a study includes one independent variable that has more than two levels and one dependent variable that's measured on an interval or ratio scale and the groups are unrelated.
-It would be the appropriate statistical test to compare the effects of cognitive-behavior therapy, interpersonal therapy, and acceptance and commitment therapy on severity of depressive symptoms when clinic clients are randomly assigned to one of the therapies and symptoms are measured on an interval or ratio scale.

Although the one-way ANOVA can be used when a study has one independent variable with only two levels, the t-test has traditionally been used in this situation. Also, separate t-tests can be used to compare three or more levels of a single independent variable, but this would require conducting a separate t-test for each pair of means.

A disadvantage of this approach is that it increases the probability of making a Type I error - i.e., it increases the experimentwise error rate. As an example, when the independent variable has three levels and separate t-tests are used to compare means, three t-tests would have to be conducted (Group #1 vs. Group #2, Group #1 vs. Group #3, and Group #2 vs. Group #3).
-If alpha is set at .05 for each t-test, the experimentwise error rate is 1 - (1 - .05)^3, or about .14.

When using the one-way ANOVA, all possible comparisons between means are made in a way that maintains the experimentwise error rate at the alpha level set by the researcher.

The one-way ANOVA produces an F-ratio. For the EPPP, you want to know that the numerator of the F-ratio is referred to as the "mean square between" (MSB), which is a measure of variability in dependent variable scores that's due to treatment effects plus error, and that the denominator is referred to as the "mean square within" (MSW), which is a measure of variability that's due to error only. Whenever the F-ratio is larger than 1.0, this suggests that the independent variable has had an effect on the dependent variable. Whether or not this effect is statistically significant depends on several factors, including the size of alpha.
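A minimal sketch of the F-ratio computation for the three-therapy example, using hypothetical symptom scores (the data are invented; the MSB/MSW partition is the standard one-way ANOVA):

```python
from statistics import mean

# Hypothetical depressive-symptom scores for three randomly assigned therapy groups.
groups = {"CBT": [1, 2, 3], "IPT": [2, 3, 4], "ACT": [4, 5, 6]}

all_scores = [x for g in groups.values() for x in g]
grand_mean = mean(all_scores)
k = len(groups)
n_total = len(all_scores)

# MSB: between-group variability (treatment effects plus error).
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ms_between = ss_between / (k - 1)

# MSW: within-group variability (error only).
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
ms_within = ss_within / (n_total - k)

f_ratio = ms_between / ms_within
print(round(f_ratio, 3))  # 7.0
```

Because this F exceeds the critical value for 2 and 6 degrees of freedom at alpha = .05 (about 5.14), at least two of the group means differ significantly.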

(Note that the terms experimentwise error rate and familywise error rate are often used interchangeably, but that some authors distinguish between the two, with experimentwise error rate referring to the Type I error rate for all statistical analyses made in a research study and familywise error rate referring to the Type I error rate for a subgroup of statistical analyses.

For example, in a study with two or more independent variables, analyses of the main effects of each independent variable would be one family and analyses of the interaction effects of the independent variables would be another family.)
The factorial ANOVA, mixed ANOVA, randomized block ANOVA, ANCOVA, MANOVA, and trend analysis are other forms of the analysis of variance that you want to be familiar with for the exam.

(a) The factorial ANOVA is an extension of the one-way ANOVA that's used when a study includes more than one independent variable. It's also referred to as a two-way ANOVA when the study includes two independent variables, a three-way ANOVA when a study includes three independent variables, etc. A factorial ANOVA produces separate F-ratios for the main effects of each independent variable and their interaction effects.

(b) The mixed ANOVA is also known as the split-plot ANOVA and is used when the data were obtained from a study that used a mixed design - i.e., when the study included at least one between-subjects independent variable and at least one within-subjects independent variable.

(c) The randomized block ANOVA is used to control the effects of an extraneous variable on a dependent variable by including it as an independent variable and determining its main and interaction effects on the dependent variable. When using the randomized block ANOVA, the extraneous variable is referred to as the "blocking variable."

(d) The analysis of covariance (ANCOVA) is also used to control the effects of an extraneous variable on a dependent variable but does so by statistically removing its effects from the dependent variable. When using the ANCOVA, the extraneous variable is the "covariate."

(e) The multivariate analysis of variance (MANOVA) is the appropriate statistical test when a study includes one or more independent variables and two or more dependent variables that are each measured on an interval or ratio scale.

(f) Trend analysis is used when a study includes one or more quantitative independent variables and the researcher wants to determine if there's a significant linear or nonlinear (quadratic, cubic, or quartic) relationship between the independent and dependent variables.
When an analysis of variance produces a statistically significant F-ratio, this indicates that at least one group is significantly different from another group but does not indicate which groups differ significantly from each other.

Conducting planned comparisons and post hoc tests are two ways to obtain this information.

Planned comparisons are also known as planned contrasts and a priori tests. These comparisons are designated before the data are collected and are based on theory, previous research, or the researcher's hypotheses.
-For example, assume that a psychology professor at a large university conducts a study to test the hypothesis that adding instructor-led study sessions to her introductory psychology lectures will improve the final exam scores of undergraduate students. To test this hypothesis, she designs a study to evaluate four teaching methods: two that are currently available to students and two new methods that include instructor-led study sessions.
--The four teaching methods are lectures only (L), lectures with peer-led study sessions (LP), lectures with instructor-led in-person study sessions (LIP), and lectures with instructor-led Zoom study sessions (LIZ). Because the professor is interested only in comparing lectures to lectures with instructor-led study sessions, she will not conduct a one-way analysis of variance but, instead, will use two t-tests to compare the mean final exam scores obtained by students in the L and LIP groups and the mean final exam scores obtained by students in the L and LIZ groups.

Post hoc tests are also known as a posteriori tests and are conducted when an ANOVA produces a significant F-ratio. For the teaching method study, if the psychology professor decides she is interested in comparing the effects of all of the teaching methods, she will first conduct a one-way ANOVA. If the ANOVA yields a significant F-ratio, this indicates that at least one teaching method differs significantly from another teaching method but does not indicate which teaching methods differ significantly from each other. Therefore, the professor will use t-tests to compare all possible pairs of group means: L versus LP, L versus LIP, L versus LIZ, LP versus LIP, LP versus LIZ, and LIP versus LIZ.
As noted in the description of the one-way ANOVA, the greater the number of statistical tests used to analyze the data collected in a research study, the greater the experimentwise error rate.

Consequently, when conducting planned comparisons or post hoc tests, it is desirable to control the experimentwise error rate. One way to do so for both planned comparisons and post hoc tests is to use the Bonferroni procedure, which simply involves dividing alpha by the total number of statistical tests to obtain an alpha level for each test.
-For example, there are two planned comparisons for the teaching method study and, if the professor sets alpha at .05, alpha would be .025 (.05/2) for each comparison. An alternative for post hoc tests is to use one of the modifications of the t-test that are each appropriate for a different situation and differ in terms of the ways they control the experimentwise error rate.

Frequently used post hoc tests include Tukey's honestly significant difference (HSD) test, the Scheffé test, and the Newman-Keuls test.
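The Bonferroni procedure from the teaching-method example can be sketched as a one-line calculation (the number of comparisons comes from the example above):

```python
alpha = 0.05
num_tests = 2  # two planned comparisons: L vs. LIP and L vs. LIZ
per_test_alpha = alpha / num_tests
print(per_test_alpha)  # 0.025
```

Each comparison is then evaluated against the adjusted alpha of .025 rather than .05, which keeps the experimentwise error rate at or below .05.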
The results of a statistical test indicate whether or not the results of a study are statistically significant; however, researchers often want to know if the results have practical significance, which refers to the magnitude of the effects of an intervention - i.e., the intervention's "effect size."

Cohen's d is one of the methods used to measure effect size and indicates the difference between two groups (a treatment group and a control group or two different treatment groups) in terms of standard deviation units.

It's calculated by dividing the mean difference between the groups on the dependent variable by the pooled standard deviation for the two groups.
-As an example, if d is .50 for treatment and control groups, this means that the treatment group's mean on the dependent variable was one-half standard deviation above the control group's mean. Cohen (1969) provided guidelines for interpreting d: A d of about .2 indicates a small effect of the independent variable, a d of about .5 indicates a medium effect, and a d of about .8 or larger indicates a large effect. (Cohen's f is the alternative to Cohen's d when the comparison involves more than two groups.)
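A minimal sketch of Cohen's d with hypothetical group scores (the data are invented): the mean difference is divided by the pooled standard deviation for the two groups.

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical outcome scores for treatment and control groups.
treatment = [14, 16, 18, 20]
control = [10, 12, 14, 16]

n1, n2 = len(treatment), len(control)
# Pooled SD weights each group's variance by its degrees of freedom.
pooled_sd = sqrt(
    ((n1 - 1) * stdev(treatment) ** 2 + (n2 - 1) * stdev(control) ** 2)
    / (n1 + n2 - 2)
)
d = (mean(treatment) - mean(control)) / pooled_sd
print(round(d, 2))  # 1.55
```

A d of 1.55 means the treatment mean is about one and a half standard deviations above the control mean - a large effect by Cohen's guidelines.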
Statistical significance and practical significance do not indicate if the effects of an intervention have clinical significance, which refers to the importance or meaningfulness of the effects.

For example, even when an intervention has statistical and practical significance, this does not indicate if the intervention is likely to move an individual from a dysfunctional to a normal level of functioning.

The Jacobson-Truax method is one method for evaluating the clinical significance of an intervention for each participant in a clinical trial or other research study. It involves two steps:
The first step is to calculate a reliable change index (RCI) to determine if the difference in an individual's pretreatment and posttreatment test scores is statistically reliable - i.e., if the difference is due to actual change rather than measurement error. It's calculated by subtracting the individual's pretest score from his or her posttest score and dividing the result by the standard error of the difference.
-The RCI can be positive or negative, depending on whether a high or low test score is indicative of improvement.
-When the change in scores is in the desired direction and the RCI is greater than +1.96 or less than -1.96, the change is considered reliable (not due to measurement error).

The second step is to identify the test cutoff score that distinguishes between dysfunctional and functional behavior or performance to determine if an individual's posttest score is within the functional range.
-One way to determine the cutoff score is to calculate the score that is midway between the mean score for the dysfunctional (patient) population and the mean score for the functional (non-patient) population.

Finally, using the information derived from these two steps, the individual is classified as recovered (passed both the RCI and cutoff criteria), improved (passed the RCI but not the cutoff criterion), unchanged/indeterminate (passed neither criterion), or deteriorated (passed the RCI in the unintended direction).
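A minimal sketch of the two Jacobson-Truax steps for one client, assuming lower scores indicate improvement; the test scores, standard error of the difference, and population means are all hypothetical values, not from the source:

```python
# Hypothetical values for one client on a measure where lower = better.
pretest, posttest = 30.0, 18.0
standard_error_of_difference = 4.0  # assumed, derived from the measure's reliability
dysfunctional_mean, functional_mean = 28.0, 12.0

# Step 1: reliable change index (posttest minus pretest, over the SE of the difference).
rci = (posttest - pretest) / standard_error_of_difference
reliable_improvement = rci < -1.96  # improvement is a decrease on this measure

# Step 2: cutoff midway between the dysfunctional and functional population means.
cutoff = (dysfunctional_mean + functional_mean) / 2
in_functional_range = posttest < cutoff

if reliable_improvement and in_functional_range:
    status = "recovered"
elif reliable_improvement:
    status = "improved"
elif rci > 1.96:
    status = "deteriorated"
else:
    status = "unchanged/indeterminate"
print(status)  # recovered
```

Here the RCI is −3.0 (a reliable decrease) and the posttest score of 18 falls below the cutoff of 20, so the client is classified as recovered.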