the process of gathering data in order to make evaluative comparisons regarding different situations.
You need at least 30 individuals to conduct a 'true' experiment.
Correlational research requires 30 subjects per variable.
Uses PRE-EXISTING groups, so the independent variable (IV) cannot be altered (e.g., gender or ethnicity), and the researcher can't state with any statistical confidence that the IV caused the dependent variable (DV).
(also known as Occam's Razor)
interpreting the results in the simplest ways
(Literally a tendency to be miserly and not overspend.)
Ex post facto study
A type of quasi-experiment (literally means 'after the fact') connoting a correlational study in which preexisting groups are utilized
the variable the researcher manipulates, controls, alters, or wishes to experiment with
(memory: 'I' manipulate the IV)
expresses the outcome or the data regarding factors one wants to measure
(memory: 'D' in dependent and data)
Refers to whether the DVs were truly influenced by the experimental IVs or whether other factors had an impact.
Threats to internal validity
maturation of subjects (psychological and physical changes including fatigue due to time involved), mortality (subjects withdrawing), instruments used to measure the behavior or trait, or statistical regression (notion that extremely high or low scores would move toward the mean if utilized again)
Refers to whether the experimental research results can be generalized to larger populations (other people, settings, conditions).
[If the results of the study only apply to the population in the study then external validity is LOW.]
Causal Comparative design
a true experiment WITHOUT random assignment
(Data from the causal comparative ex post facto [after the fact] design can be analyzed with a test of significance [t test or ANOVA] just like any true experiment.)
Statistical procedure to summarize MANY variables. (i.e., A test measuring a counselor's ability may try to describe 3 important variables that make up an effective helper although hundreds exist.)
Nonparametric statistical measure that tests whether a distribution differs significantly from an expected theoretical distribution of scores.
(Memory: 'chi' like a 'Chia Pet' that I expected more from)
(also known as Lloyd Morgan's 1894 Canon)
suggests experimenters interpret the results in the simplest manner.
William of Occam
14th century philosopher and theologian.
(Occam's Razor, aka 'parsimony' named for)
Bubbles in research
Considered flaws in research (i.e., rubbing a sticker on car and getting no bubbles - impossible)
Confounded or flawed variable
Undesirable variables that invalidate experiments.
(The only experimental variable should be the independent variable.)
occurs when an undesirable variable (also known as contaminating variable) which is not controlled by the researcher is introduced in the experiment.
(aka 'action research' or experience-near research)
is conducted to advance our knowledge of how theories, skills, and techniques can be used in terms of practical application.
In experimental terminology, IV stands for _____ ______, and DV stands for ______ _______.
independent variable, dependent variable
a behavior or circumstance that can exist on at least two levels or conditions.
(a factor that 'varies' or is capable of change)
The variable you manipulate/control in an experiment is the
IV or independent variable
("I am the researcher so I manipulate or experiment with the IV.")
subjects are informed of risk, negative after effects are removed, subjects are allowed to withdraw at any time, confidentiality of subjects is protected, results are presented in an accurate format that is not misleading, and the researcher uses only techniques he or she is trained in.
The control group
does not receive the IV
(same characteristics of the experimental group - the averages between the two groups should not differ significantly)
The experimental group
received the IV
(has the same characteristics of the control group the averages between the two groups should not differ significantly)
a hunch or educated guess which can be tested utilizing the experimental model.
A statement which can be tested regarding the relationship between the independent (IV) and the dependent variables (DV).
suggests there WILL NOT be a significant difference between the experimental group which received the IV and the control group which did not.
(asserts the samples will not change - will stay the same - even after the experimental variable is applied.)
The IV DOES NOT affect the DV.
"There will be differences between the control group and the experimental group."
(aka 'affirmative hypothesis')
asserts the independent variable (IV) has indeed caused a change.
is a descriptive statistic telling the counselor what percentage of the cases fell below a certain level.
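A minimal sketch of the percentile idea in Python, assuming the "percentage of cases below a level" definition given on this card. The function name `percentile_rank` and the scores are hypothetical, chosen only for illustration.

```python
def percentile_rank(scores, value):
    """Return the percentage of scores that fall strictly below `value`."""
    below = sum(1 for s in scores if s < value)
    return 100.0 * below / len(scores)

# Hypothetical test scores for ten clients (illustrative data only).
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Six of the ten scores are below 85, so 85 sits at the 60th percentile.
rank_of_85 = percentile_rank(scores, 85)
print(rank_of_85)
```

Other conventions exist (for example, counting half of the tied scores), so this is one reasonable reading rather than the only one.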
Use of tests of significance
to determine whether a difference in the groups' scores is significant or just due to chance factors.
Used to determine whether two sample groups are significantly different, simple form of the ANOVA, for comparing 2 sample groups
(for "two-groups" or "two-randomized groups" research design)
Independent group comparison design
In a study of two groups, change in one group DID NOT influence the other group.
Repeated-measures comparison design
Measuring the SAME group of subjects without the IV and then with the IV.
A research study uses different subjects for each condition.
(Each subject receives only one value of the IV)
A value obtained from a population.
(Summarizes a characteristic of a population, i.e., average male height)
Traditionally, PROBABILITY in social science research is set at _____ or lower (i.e., .01, .001).
(.05 indicates differences would occur via chance only 5 times in 100.)
In social sciences, the accepted probability level is usually .05 or less.
Type I (alpha error) occurs when a researcher rejects the null hypothesis although it is true.
there is only a 5% chance that the difference between the control group and the experimental group is due to chance.
(differences truly exist; the experimenter will obtain the same results 95 out of 100 times.)
A study that would best rule out chance factors would have a significance level of P=___.
The smaller the value for P, the more stringent the level of significance.
Type II error (beta error) occurs when a researcher accepts null even though it is false.
(memory: RA as in 'residence advisor'...
R - Type I signifies reject null when it is true
A - Type II signifies accept null when it is false)
The probability of committing a Type I error equals the level of significance.
The level of significance is also called the 'alpha level'.
Power of a statistical test
(Power connotes a statistical test's ability to correctly reject a false null hypothesis.)
Parametric tests have more power than nonparametric statistical tests.
Parametric tests are used ONLY with interval and ratio data.
A Type II error is also called a ____ error and means you _____ null when it is _____.
beta, accept, false
Lowering the significance level LOWERS Type I errors, but it RAISES the risk of Type II errors.
Type I/Type II relationship is a seesaw.
The safest way to avoid Type I/Type II errors is to set alpha (significance level) at a very stringent level and use a large sample size for the study.
Differences revealed via large samples are more likely to be genuine than differences revealed using small sample size.
A counselor decides to increase the sample size in her experiment. This will ____
reduce Type I and Type II errors.
Raising the size of a sample helps lower the risk of chance/error factors.
If a researcher changes the significance level from .05 to .001, then
alpha errors decrease, but beta errors increase.
If t value is less than the t value in a statistical table
ACCEPT the null hypothesis
(computations must exceed the number cited in the table in order to reject null)
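The decision rule above can be sketched in Python for the two-group case. This is an illustration under stated assumptions, not a prescribed procedure: it uses the pooled-variance independent t formula, the scores are made up, the function name `independent_t` is hypothetical, and the critical value 2.306 is the standard two-tailed table entry for alpha = .05 with 8 degrees of freedom.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

# Hypothetical posttest scores (illustrative data only).
experimental = [5, 6, 7, 8, 9]   # received the IV
control = [1, 2, 3, 4, 5]        # did not receive the IV

t_value = independent_t(experimental, control)

# Table value assumed for a two-tailed test, alpha = .05, df = 5 + 5 - 2 = 8.
critical_t = 2.306

# Reject null only when the computed t exceeds the table value.
reject_null = abs(t_value) > critical_t
print(round(t_value, 3), reject_null)
```

With these made-up scores the computed t is well above the table value, so null would be rejected; a computed t below 2.306 would mean null is accepted.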
Analysis of covariance (ANCOVA)
tests 2 or more groups while controlling for extraneous variables that are called covariates
signed rank test used in place of the t test when data are nonparametric and you wish to test whether 2 correlated means differ significantly
(memory: 'co' to remind you of correlated)
determines whether 2 uncorrelated means differ significantly when data are nonparametric
(memory: the 'u' reminds you of 'uncorrelated')
Spearman correlation or Kendall's tau
used in place of the Pearson r when parametric assumptions cannot be utilized
Chi-square nonparametric test
examines whether obtained frequencies differ significantly from expected frequencies
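As a rough sketch, the chi-square statistic is the sum of (obtained - expected)^2 / expected across categories. The frequencies below are invented, the function name `chi_square` is hypothetical, and 3.841 is the standard table value assumed for alpha = .05 with 1 degree of freedom.

```python
def chi_square(obtained, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(obtained, expected))

# Hypothetical example: 50 clients choose between two counseling formats.
obtained = [30, 20]   # frequencies actually observed
expected = [25, 25]   # frequencies expected if there were no preference

chi_sq = chi_square(obtained, expected)

# Table value assumed for alpha = .05, df = (2 categories - 1) = 1.
critical_value = 3.841
significant = chi_sq > critical_value
print(chi_sq, significant)
```

Here the statistic (2.0) does not exceed the table value, so the obtained frequencies do not differ significantly from the expected ones.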
A 1-way analysis of variance (ANOVA) is used for testing ONE IV.
A two-way analysis of variance (ANOVA) is used to test TWO IVs.
(Two IVs requires a two-way ANOVA, 3 IVs requires a 3-way ANOVA, etc.)
Some researchers refer to the level of significance as where one _____ the ____, or as the ______ point.
draws, line, cutoff
(If a researcher sets the level of significance at .50, then the odds would be 50/50 that the results were due to pure chance.)
A statistic that indicates the degree or magnitude of relationship between two variables, often abbreviated using the lower-case 'r'.
(Makes a statement regarding the association of two variables and how a change in one is related to the change in the other.)
Correlations range from 0.00 (no relationship) to 1.0 or -1.0 (perfect relationship).
A positive relationship is not stronger than a negative relationship of the same numerical value.
(i.e., .70 and -.70 indicate the same strength of relationship)
A positive correlation
Evident when both variables change in the same direction (imagine a graphical representation of scores)
A negative correlation
Evident when the variables are inversely associated (one goes up and the other goes down).
Another name for N=1
intensive experimental design (pioneered by Freud), also known as a case study
(N= the number of people being studied)
single case investigations
(Case studies are often misleading because the results are not necessarily generalizable.)
The subject does not know whether they are in the control group, but the researcher does.
(helps eliminate 'demand characteristics')
Neither the subject nor the researcher knows if the person is in the control group.
(Researcher is sometimes unaware of the null hypothesis too.)
Things that can flaw an experiment because the researcher unconsciously communicates intent or expectations to the subjects.
ABA model of research (also known as 'withdrawal design')
A - baseline secured
B - intervention implemented
A - outcome is examined via a new baseline
Experimental is to cause and effect as correlational is to _____ of _______.
degree of relationship.
Correlation coefficient is a
descriptive statistic which indicates the degree of 'linear relationship' between two variables.
Pearson Product-Moment correlation r
used for interval or ratio data.
(memory - Pearson r uses I and R for Info and Referral)
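A small sketch of how the Pearson r can be computed from its definition (covariation divided by the product of the two spreads). The function name `pearson_r` and the data are hypothetical; the two score lists are built to show a perfect positive and a perfect negative linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Hypothetical interval-level data (illustrative only).
sessions = [1, 2, 3, 4, 5]
scores_up = [2, 4, 6, 8, 10]     # rises as sessions rise
scores_down = [10, 8, 6, 4, 2]   # falls as sessions rise

r_positive = pearson_r(sessions, scores_up)
r_negative = pearson_r(sessions, scores_down)
print(r_positive, r_negative)
```

The two coefficients have opposite signs but the same absolute size, which is the point made earlier about .70 and -.70.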
68-95-99.7 rule (empirical rule)
Within a normal distribution, 68% of scores will fall within +/- 1 standard deviation (SD) of the mean; 95% within 2 SDs of the mean; and 99.7% within 3 SDs of the mean.
(Almost all scores will fall within 3 SDs of the mean.)
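The percentages in the empirical rule can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the identity P(|Z| < k) = erf(k / sqrt(2)); the function name `proportion_within` is hypothetical.

```python
import math

def proportion_within(k):
    """Proportion of a normal distribution within +/- k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

within_1_sd = proportion_within(1)
within_2_sd = proportion_within(2)
within_3_sd = proportion_within(3)

for k, p in [(1, within_1_sd), (2, within_2_sd), (3, within_3_sd)]:
    print(f"within {k} SD: {p * 100:.2f}%")
```

The rounded figures 68, 95, and 99.7 are the ones usually memorized; the computed values carry a few more decimals.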
middle scores in a distribution of scores
(The middle scores when data are arranged from highest to lowest.)
most frequently occurring scores and the least important measure of central tendency.
(The highest or maximum point of concentration on a curve.)
The larger the range, the greater the dispersion or spread of scores from the mean.
The most useful measure of central tendency is the MEAN (i.e., average).
In skewed distributions, the median is the best choice.
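To see why the median is preferred for skewed data, here is a brief sketch using Python's standard `statistics` module. The fee data are invented, with one extreme score creating a positive skew.

```python
import statistics

# Hypothetical session fees; the single extreme value skews the distribution.
fees = [20, 22, 22, 25, 26, 200]

mean_fee = statistics.mean(fees)      # pulled toward the extreme score
median_fee = statistics.median(fees)  # middle of the ordered scores
mode_fee = statistics.mode(fees)      # most frequently occurring score

print(mean_fee, median_fee, mode_fee)
```

The mean lands far above every score but one, while the median stays near the bulk of the data.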
Several experimental variables are investigated and interactions can be noted.
Factorial designs include 2 or more IVs.
Solomon 4 group design
Researcher uses 2 control groups - only one experimental group and one control group are PRE-tested. The other control group and experimental group are merely post-tested. (Lets the researcher know if results are influenced by testing.)
Regardless of the shape, the ____ will always be the high point when a distribution is displayed graphically.
In a graph, the tail indicates whether a distribution of scores is positively or negatively skewed.
(Tail to left - negatively skewed. Tail to right - positively skewed.)
The benefit of standard scores such as percentiles, t-scores, z-scores, stanines, or standard deviations over raw scores, is that a standard score allows you to analyze the data in relation to the properties of the normal bell shaped curve.
X axis (also called 'abscissa')
The horizontal line drawn under a frequency distribution graph.
(horizontal axis plots the independent variable [IV])
Y axis (also known as 'ordinate')
Used to plot frequency of the DVs, plotting the experimental data.
(memory: Letter 'Y' is vertical like the line it represents in a graph)
If a distribution is bimodal, there is a good chance that the researcher is working with ____ distinct ______.
The RANGE is the simplest way to measure the spread of scores.
The RANGE is usually calculated by subtracting the lowest score from the highest score (i.e., 93 - 33 = 60).
- If 'inclusive range' is specified on exam, then use the formula but add '1' to the final value after subtraction of the range.
-generally increases with sample size
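The two range formulas can be sketched as follows, reusing the 93 and 33 from the example above. The remaining scores and the function names are hypothetical.

```python
def score_range(scores):
    """Exclusive range: highest score minus lowest score."""
    return max(scores) - min(scores)

def inclusive_range(scores):
    """Inclusive range: highest minus lowest, plus 1."""
    return max(scores) - min(scores) + 1

# Hypothetical scores whose highest is 93 and lowest is 33.
scores = [33, 47, 58, 71, 93]

exclusive = score_range(scores)
inclusive = inclusive_range(scores)
print(exclusive, inclusive)
```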
Scattergram (also known as scatterplot)
Pictorial diagram or graph of two variables being correlated.
John Henry Effect (also known as 'compensatory rivalry of a comparison group')
threat to internal validity when subjects strive to prove an experimental treatment that might threaten their livelihood isn't really effective. (i.e., sabotage)
Resentful Demoralization of the Comparison Group (also called compensatory equalization)
Threat to validity in which the comparison group lowers its performance or behaves ineptly in an attempt to make the experimental group look better than it should.
(Noted if the comparison group deteriorates throughout the experiment while the experimental group does not.)
A measure of dispersion of scores around some measure of central tendency; it is also the standard deviation squared.
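A quick check of the "standard deviation squared" relationship, using Python's `statistics` module. The scores are invented, and the population formulas (`pvariance`, `pstdev`) are an assumption made for a tidy example; the sample versions behave the same way.

```python
import statistics

# Hypothetical scores with a mean of 5 (illustrative data only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(scores)  # average squared deviation from the mean
sd = statistics.pstdev(scores)           # square root of the variance

print(variance, sd, sd ** 2)
```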
Statistically speaking, 68.26% of scores fall within + or - one SD of the mean.
Statistically speaking, 95.44% of scores fall within + or - 2 SD of the mean.
Statistically speaking, 99.74% of scores fall within + or - 3 SD of the mean.
The greater the standard deviation of scores, the greater the spread of a plotted graph.
Z-score (often called standard score)
same as a standard deviation - the most elementary of standard scores.
(memory: Z score is simply SD)
A Z-score of +1 or 1 SD would include about 34% of the cases in a normal population.
T-score (often called transformed score)
Mean of 50 with each SD of 10 [different from a Z-score]
(i.e., a Z score of -1.0 would be a T score of 40. A Z-score of -1.5 would be a T-score of 35, etc.)
- Not mathematically threatening because never expressed as a negative number.
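The conversions in these cards follow from two formulas: z = (raw - mean) / SD and T = 50 + 10z. A minimal sketch, with hypothetical function names and a made-up raw score:

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score sits from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Transform a z-score to the T scale (mean 50, SD 10)."""
    return 50 + 10 * z

t_for_minus_1 = t_score(-1.0)
t_for_minus_1_5 = t_score(-1.5)

# Hypothetical raw score of 85 on a test with mean 100 and SD 15.
z_example = z_score(85, 100, 15)
t_example = t_score(z_example)

print(t_for_minus_1, t_for_minus_1_5, z_example, t_example)
```

Because the mean is shifted to 50, any z-score between -5 and +5 produces a positive T-score.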
Z-scores (aka standard scores) are the same as standard deviations, thus a Z-score of -2.5 means
2.5 SD below the mean
College Entrance Examination Board (also known as Educational Testing Services [ETS]) scores range from 200 to 800 with a mean of 500.
Flatter and more spread out than a normal curve.
(Memory: 'Plat' sounds like 'flat')
Distribution curve is very tall, thin and peaked.
(Memory: Leptokurtic leaps tall buildings in a single bound.)
Stanine scores (contraction of 'standard' and 'nine')
Divides the distribution into 9 equal intervals with stanine 1 as the lowest 9th and 9 as the highest 9th - in this, 5 is the mean.
Four basic measurement scales (by S. S. Stevens)
Nominal - simplest type, strictly qualitative NOT quantitative
most basic, does not provide measurable info, merely classifies names, labels, or identifies by group, has NO TRUE ZERO point and DOES NOT INDICATE ORDER.
(i.e., street address, telephone #, gender, brand of therapy; adding/subtracting nominal categories is meaningless)
Ordinal scale (2nd level of measurement)
Rank-orders variables, though distance between the variables may not be equal - Provides relative placement or standing but does not delineate absolute differences
(adding/subtracting is no-no)
(Memory: 'ordinal' sounds like 'order')
Has numbers scaled at equal distances but NO ABSOLUTE ZERO point.
-Since intervals are same, amount of differences can be stipulated (i.e., 3 IQ points), can add/subtract but not multiply/divide
(IQ TESTS provide interval measurement!)
(highest level of measurement)
Interval scale with a TRUE ZERO POINT.
Add/subtract/multiply/divide all possible.
(Most psychological attributes can't be measured by ratio scale.)
When the researcher does not intervene but merely observes a subject, preferably in its natural setting.
(oldest method of research)
i.e., 2x3 factorial notation =
The first variable has 2 levels (i.e., male or female) and the second IV has 3 levels (age, height, weight)
The simplest form of descriptive research is the _______, which requires a questionnaire return rate of ______ to be accurate.
(Ideal sample size for a survey is 100, compared to an experimental study which gets by with 15)
Survey problems include - poor construction of instrument, low return rate, subjects are often not randomized
happens when an item is thought to have an effect and produces results, even though there is no effect from the item (all in their head)
Hawthorne Effect (also known as reacting to the presence of the investigator, or observer effect)
When subjects know they are in an experiment, their performance may improve because of the extra attention and the knowledge that they are being observed.
Rosenthal Effect (experimenter expectancy)
The experimenter's beliefs about the individual may cause the experimenter to treat them in a special way so that they begin to fulfill the experimenter's expectations.
When a trait that is not being evaluated (i.e., attractiveness) influences a researcher's rating on another trait (i.e., counseling skill).
Statistical procedure performed at different times to see if a trend is evident, using ANOVA sometimes.
tests a null hypothesis regarding the means of two or more groups AFTER the random samples are adjusted to eliminate average differences.
examines people who were born at the same time (or shared an event, like fought in Vietnam) in regard to a given characteristic.
predicts very high and very low scores will move toward the mean if a test is administered again. (It is a threat to internal validity.)
refers to the points that divide a distribution into fourths.
(indicates 25th percentile is the 1st quartile, 2nd quartile is the median, 3rd quartile is at the 75th percentile.)
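A short sketch of the quartile cut points using `statistics.quantiles` from the Python standard library (Python 3.8 or later). The eleven scores are invented; textbooks use slightly different interpolation rules, so other tools may give slightly different values on other data.

```python
import statistics

# Hypothetical ordered scores (illustrative data only).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three points that split the distribution into fourths.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(q1, q2, q3)
print(q2 == statistics.median(scores))  # 2nd quartile is the median
```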
Standardized tests always have formal procedures for test administration and scoring.
Standardization implies the testing format, test materials, and scoring process are consistent.
(also known as 'synchronic method')
Clients are assessed at one point in time.
(Indicative of measurements or observations at a single point, and thus preferable in terms of time consumption.)
(also known as 'diachronic method')
The same clients are studied over a period of time.
Confederate (also known as 'stooge')
An accomplice who poses as a client being studied.
(Frequently used in social psychology studies.)
(i.e., NOT normal distribution)
Most popular is the chi-square, used to determine whether an obtained distribution differs significantly from an expected distribution.
any characteristic (aka bit of knowledge, correct or incorrect), that the subject in an experiment is aware of that can influence his or her behavior.
(Demand characteristics can confound an experiment!)
Nondirectional experimental hypothesis
A two-tailed test
(i.e., 'The average patient who has completed psychoanalysis will have a statistically different IQ from the average patient who has not received analysis.')
Directional experimental hypothesis
(i.e., hypothesis specifies one average is larger than another)
switching the order in which stimuli are presented to a subject in a study.
(Used to control for the fact that the order of an experiment could impact its outcome.)
Pygmalion Effect (aka Rosenthal or Experimenter Effect)
Experimenter falls in love with his own hypothesis and the experiment becomes a self-fulfilling prophecy.
Any psychotherapeutic model that focuses on the here-and-now rather than the past.
Multiple treatment interference
If a subject receives more than one treatment, it is often tough to discern which modality caused the improvements.
Like sticking your hand in a fishbowl to pick up a winning lottery ticket - each individual in the population has an equal chance of being selected.
Used when it is nearly impossible to find a list of the entire population.
(Will not be as accurate as random sample but is used to save time and practical considerations.)
Sampling every nth person in a population (i.e., every 5th person, 10th person, etc.) - some believe this gives same results as random sampling although it is controversial.
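The difference between drawing a random sample and sampling every nth person can be sketched as below. The population of 100 numbered clients is hypothetical, and the fixed seed is only there to make the illustration repeatable.

```python
import random

# Hypothetical population: clients numbered 1 through 100.
population = list(range(1, 101))

# Random sampling: every individual has an equal chance of being selected.
rng = random.Random(0)  # seeded so the example is repeatable
random_sample = rng.sample(population, 10)

# Sampling every nth person: here, every 10th client on the list.
n = 10
every_nth_sample = population[n - 1::n]

print(random_sample)
print(every_nth_sample)
```

Sampling every nth person depends on the order of the list, which is one reason its equivalence to random sampling is debated.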
outlines a procedure
(important so other researchers can attempt to replicate the study's findings)
implies another researcher can repeat the experiment exactly as it was performed before.
Mann-Whitney U-test, Wilcoxon signed-rank test (for matched pairs), Soloman and Kruskal-Wallis H-test
Subjects are literally 'matched' in regard to any variable that could be correlated with the DV, which is really the postexperimental performance.
research that reduces the general to the specific.
(contrasts inductive research)
Standard error of measurement (SEM)
indicates what the individual would score if he takes the same test again.
Uses choices like: strongly agree, agree, disagree, or strongly disagree.
Created by Rensis Likert in 1932, helped improve the overall degree of measurement. (memory: How much do you Likert something?)