How well were the variables measured and/or manipulated? • How well were the constructs operationalized?
• How were the participants chosen and to what population do the results generalize? • How representative are the manipulations and measures?
Is/are the effect/s statistically significant? • How strong is the effect / the difference between groups
Can temporal precedence be established? • What possible confounding variables were controlled for and what did the researchers do to control for those variables
The result of applying some calculation to data.
They perform some operation on the data.
Useful to represent some aspect of data
are measurements or observations that we make
describe, summarize, organize, simplify the data
allow us to interpret the data & make inferences about the population from data collected in samples
Frequencies / Histograms • Measures of central tendency (mode, median, mean) • Measures of variability (range, IQR, variance, standard deviation)
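A minimal sketch of the descriptive statistics listed above, using Python's standard `statistics` module. The scores are made-up illustration data. `statistics.quantiles` is called with its default "exclusive" method, so the quartiles (and therefore the IQR) may differ slightly from other textbook conventions.

```python
import statistics

# Made-up illustration data (not from the notes)
scores = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability
value_range = max(scores) - min(scores)
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
iqr = q3 - q1
sample_var = statistics.variance(scores)  # sample variance (divides by n - 1)
sample_sd = statistics.stdev(scores)      # square root of the sample variance

print(mean, median, mode, value_range, iqr, sample_var, round(sample_sd, 3))
```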
All potential observations have an equal chance of being in the sample
Surveys: Random Sampling
All potential observations do not have an equal chance of being in the sample
Experiments: Random Assignment
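Random assignment can be sketched as shuffling the participants and dealing them into groups. The function name `randomly_assign`, the participant IDs, and the seed are illustrative assumptions, not part of the notes.

```python
import random

def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle the participants, then deal them into n_groups groups."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Every n_groups-th participant goes to the same group
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical participant IDs 1..20 split into two groups
groups = randomly_assign(range(1, 21), n_groups=2, seed=1)
for number, group in enumerate(groups, start=1):
    print("Group", number, sorted(group))
```

Each participant ends up in exactly one group, and chance alone decides which one.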
• Actual research is conducted using a
Use sample data to answer questions about the population
Relationships between samples and populations are defined in terms of
Random Sampling is required
Each individual in the population has an equal chance of being selected ◦ Probability of being selected stays constant from one selection to the next when more than one individual is selected (sampling with replacement)
Events are ( ) when the occurrence of one event affects the probability of the other event.
usually involves a population of scores that can be displayed in a frequency distribution graph
percentage of individuals in a distribution who have scores that are less than or equal to the specific score. • Probability questions can
The difference, or error, between sample statistics and population parameters is called
A distribution of statistics is referred to as a
Central Limit Theorem
For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ/√n.
Standard Error of M
The SEM is the standard deviation of the distribution of sample means • The SEM provides a measure of how much distance is expected on average between the sample mean and the population mean, μ
Standard Error of M
Variability of a distribution of scores is measured by the standard deviation • Variability of a distribution of sample means is measured by the standard deviation of the sample means,
Determines the magnitude of the standard error
1) The size of the sample 2) The standard deviation of the population from which the sample is selected
1) The larger the ( ), the more probable it is that the sample mean will be close to the population mean
2) The smaller the ( ) in the population, the more probable it is that the sample mean will be close to the population mean
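A small sketch of how the two factors above change the standard error of M, computed as σ/√n. The values of σ and n are made up for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Larger samples give a smaller standard error (sigma fixed at 10)
for n in (1, 4, 25, 100):
    print("n =", n, "SEM =", standard_error(10, n))

# A smaller population standard deviation also gives a smaller standard error
print(standard_error(20, 25), standard_error(5, 25))
```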
hypothesis testing logic
A statistical method that uses sample data to evaluate the validity of a hypothesis about a population parameter
Hypothesis testing logic
State hypothesis about a population • Predict the expected characteristics of the sample based on the hypothesis • Obtain a random sample from the population • Compare the obtained sample data with the prediction made from the hypothesis ◦ If consistent, hypothesis is reasonable ◦ If discrepant, hypothesis is rejected
evaluates how far the sample mean deviates, in standard error units, from the hypothesized population mean.
assumptions of Z test
1) The population is normally distributed or the sample is large enough (about > 30). 2) The population standard deviation (σ) is known
Steps in hypothesis testing
1. State the research problem 2. State the statistical hypotheses 3. Set the criteria for a decision (decision rule) 4. Collect data and compute calculations 5. Make a decision 6. Interpretation
State the Statistical Hypotheses
The research question must be translated into a statistical hypothesis regarding some population characteristic
(H0) states that, in the underlying population, there is no change, no difference, or no relationship.
In the context of an experiment, the H0 predicts that the independent variable has no effect on the dependent variable for the population.
states that there is a
change, a difference, or there is a relationship in
H1: μ ≠ 80
In the context of an experiment, the H1 predicts that
the independent variable does have an effect on the dependent variable for the population
Alpha level, or significance level
probability value used
to define "very unlikely" outcomes if H0 is true
consist of the extreme sample
outcomes that are "very unlikely", that is, sample values that
provide convincing evidence that the treatment really has an effect
Data ( ) collected after hypotheses stated.
• Data ( ) collected after establishing decision criteria
() is the difference between the observed
sample mean and the hypothesized population mean
divided by the standard error of the mean.
Make a decision
• If sample statistic (z) is located in the critical
region, the null hypothesis is rejected.
• If the sample statistic (z) is not located in the
critical region, the researcher fails to reject the null
The decision to retain H0 implies not that H0 is
probably true, but only that H0 could be true
However, the decision to reject H0 implies that H0 is
probably false (and that H1 is probably true)
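A hedged sketch of the z-test decision described above: compute z, find the two-tailed critical value for the chosen alpha, and check whether z falls in the critical region. The hypothesized mean of 80 echoes the earlier card; the sample mean, σ, and n are made-up values, and `z_test` is an illustrative name.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test: returns (z, critical value, reject H0?)."""
    sem = sigma / math.sqrt(n)                    # standard error of M
    z = (sample_mean - mu0) / sem                 # distance from mu0 in SEM units
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # boundary of the critical region
    reject = abs(z) > z_crit                      # is z in the critical region?
    return z, z_crit, reject

# Hypothetical data: M = 89, H0: mu = 80, sigma = 20, n = 25
z, z_crit, reject = z_test(89, mu0=80, sigma=20, n=25)
print(round(z, 2), round(z_crit, 2), "reject H0" if reject else "fail to reject H0")
```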
Type 1 error
Samples are not expected to be identical to their
populations. If an extreme sample has been selected by chance, the
data may look like there is a strong effect, when actually there is none.
• In this case, the researcher rejects a null hypothesis that
is actually true.
• Researcher concludes that a treatment has an effect when
it has none.
• The hypothesis test is structured to minimize this risk.
Type 2 error
• Researcher fails to reject a null hypothesis that is really false.
• Researcher has failed to detect a real treatment effect.
• A Type II error means that a treatment effect really exists,
but the hypothesis test failed to detect it.
• Often this happens when the effect of the IV is relatively small
• It is not possible to determine a single, exact probability for
a Type II error; it is normally represented by the symbol β
Type 1 errors
Which is more serious: a Type 1 error or a Type 2 error?
Read that slide
• Size of difference between sample mean and
original population mean
◦ Larger discrepancies = larger z-scores
• Variability of the scores
◦ More variability = larger standard error
• Number of scores in the sample
◦ Larger n = smaller standard error
region involves both tails to determine if the treatment
increases or decreases the target behavior
researcher specifies either an increase or a decrease in the
population mean as a consequence of the treatment
• The power of a test is the probability that the test
will correctly reject a false null hypothesis
An effect size increases
Larger sample sizes
Using a one-tailed test
reducing the alpha level
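A rough sketch of how the factors above move the power of a z-test. It assumes σ is known, the sampling distribution of M is normal, and the true mean differs from the hypothesized mean by `effect` in the predicted (positive) direction. The function name `z_test_power` and all numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05, two_tailed=True):
    """Probability of rejecting H0 when the true mean is shifted by `effect`."""
    std_normal = NormalDist()
    shift = effect / (sigma / math.sqrt(n))  # true shift in standard-error units
    if two_tailed:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        upper = 1 - std_normal.cdf(z_crit - shift)  # z lands in the upper tail
        lower = std_normal.cdf(-z_crit - shift)     # z lands in the lower tail
        return upper + lower
    # One-tailed test in the predicted (positive) direction
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - shift)

baseline = z_test_power(effect=5, sigma=10, n=16)
print("baseline:", round(baseline, 3))
print("larger effect:", round(z_test_power(8, 10, 16), 3))
print("larger n:", round(z_test_power(5, 10, 64), 3))
print("one-tailed:", round(z_test_power(5, 10, 16, two_tailed=False), 3))
print("alpha = .01:", round(z_test_power(5, 10, 16, alpha=0.01), 3))
```

Under these assumptions, a larger effect, a larger sample, and a one-tailed test raise power, while reducing alpha lowers it.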
when you want to assess the evidence
provided by the data in favor of some claim about the
when you want to estimate a population parameter
level of confidence
indicates the percent of time
that a series of confidence intervals includes the unknown population
characteristic (e.g., the mean).
• The smaller the SEM, the narrower the ( ) will be and,
therefore, the more precise.
read the notes
• If the p value is less than your significance (alpha) level, the hypothesis test
is statistically significant.
• If the confidence interval does not contain the null hypothesis value, the
results are statistically significant.
• If the p value is less than alpha, the confidence interval will not contain the
null hypothesis value.
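The agreement between the p value and the confidence interval can be checked with a short sketch for a two-tailed z-test with known σ. The data values and the helper name `z_p_value_and_ci` are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def z_p_value_and_ci(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed p value and the (1 - alpha) confidence interval for the mean."""
    std_normal = NormalDist()
    sem = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / sem
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    margin = std_normal.inv_cdf(1 - alpha / 2) * sem
    return p_value, (sample_mean - margin, sample_mean + margin)

# Hypothetical sample means tested against H0: mu = 80 (sigma = 20, n = 25)
for m in (89, 85):
    p, (low, high) = z_p_value_and_ci(m, mu0=80, sigma=20, n=25)
    significant = p < 0.05
    excludes_null = not (low <= 80 <= high)
    print(m, round(p, 4), (round(low, 2), round(high, 2)), significant, excludes_null)
```

In both cases the two checks give the same answer: the result is significant exactly when the interval leaves out the null hypothesis value.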
• An independent variable is manipulated
◦ Value is determined by the researcher
◦ Manipulating an independent variable means exposing
subjects to at least two values (levels) of the variable
▪ The specific conditions associated with each level are the
treatments or groups of the experiment
• A dependent variable is measured
◦ Variable whose value is observed and measured
◦ Value is determined by the participant's behavior
any aspects related to
the experiment that are not of current interest.
extraneous variables that
change in parallel with the independent variable & show systematic variability with the independent variable
the variability among scores not caused by
the independent variable
refers to the ability to avoid the
influence of "other" extraneous variables in the
Between-Subjects designs (Independent groups)
• Different groups of participants are randomly assigned to
the different levels of your independent variable
• So, the independent variable is manipulated by assigning
each of the different levels to a different group of
• Each participant is exposed to only one level of the
• Example: Exercise group vs. No-exercise group
Between-Subjects design. Posttest-only design
Participants are randomly assigned
to one of two groups and tested on the dependent
Pretest-posttest design
Participants are randomly assigned to one of two groups
and tested on the dependent variable twice: before and
after exposure to the independent variable.
A single group of subjects is exposed to all levels of the
• So, the different levels of the independent variable are
administered at different times to the same participants.
• Each subject is exposed to all (or more than one) levels of
the independent variable.
Concurrent-measure design
◦ Participants are exposed to all levels of the IV at the
same time (e.g., in the same computer task)
Repeated-measure design
Participants are exposed to the different levels of the
IV at different points in time (e.g., caffeine one day and
control beverage another day)
advantages of within-subject designs
Reduces error variance due to individual differences
among subjects across groups
◦ Reduced error variance results in a more powerful design
◦ Effects of independent variable are more likely to be detected
2. Fewer subjects are needed
Carryover effects (order effect within subject)
Some undetermined form of contamination carries over
from one condition to the next
practice effects (order effect within subject)
the more experience with the task, the better the participant becomes
fatigue effects (order effect within subject)
the more time with the task, the more tired or bored the participants get
Aim of ( ) is to control for order effects.
• The various treatments are presented in a different order
for different subjects.
Full counterbalancing
All possible condition orders are presented
◦ It is feasible when the IV has 2 or 3 levels
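A short sketch of why full counterbalancing is only practical for a few levels: the number of possible orders is the factorial of the number of conditions. The condition labels and the name `all_orders` are placeholders.

```python
from itertools import permutations

def all_orders(conditions):
    """Every possible presentation order of the conditions."""
    return [list(order) for order in permutations(conditions)]

# Three hypothetical conditions give 3! = 6 orders
orders = all_orders(["A", "B", "C"])
for order in orders:
    print(order)

# The count grows quickly as levels are added
for k in (2, 3, 4, 5):
    print(k, "levels:", len(all_orders(range(k))), "orders")
```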
Partial counterbalancing
When the IV has 4 or more levels
◦ Not all, but only some of the possible condition orders are
▪ Randomized order
▪ Latin square design: it ensures that each condition appears in
each position at least once.
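One simple way to build a Latin square is to rotate the list of conditions by one position for each row. This is only a sketch: the cyclic construction is an assumption rather than the method named in the notes, and it does not balance which condition immediately follows which.

```python
def latin_square(conditions):
    """Cyclic Latin square: row r is the condition list rotated left by r."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)] for row in range(k)]

# Four hypothetical conditions: 4 orders instead of the 24 needed for
# full counterbalancing
square = latin_square(["A", "B", "C", "D"])
for order in square:
    print(order)
```

Each row is the order given to one participant (or subgroup), and every condition appears once in every position.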
Threats to internal validity in one-group pretest/posttest designs
3. Regression to the mean
Sometimes, changes in behavior emerge spontaneously
◦ To prevent this threat, a comparison group without
treatment is required.
Sometimes, changes may occur not just because time has
passed, but because something has happened (an
external event) between the pretest and the posttest.
◦ To be a threat, that event must affect (almost) everyone.
◦ To prevent this threat, a comparison group is required.
Regression to the mean
When scores are relatively extreme at one point in time,
they tend to be less extreme at a subsequent point in time.
◦ Unusually good or bad performances are likely due to chance
factors that will not be present the next time.
◦ A comparison group, along with careful inspection of the
data, will help prevent this threat.
A reduction in the participants from the first to the second
◦ These participants should be eliminated from the pretest
Testing: Practice and fatigue
◦ A change in the participants as a result of taking a test
more than once.
◦ To avoid this, the pretest could be eliminated, if not
◦ Also, two equivalent forms of the test may be used.
The measuring instrument changes over time.
◦ If two forms of the test are used, they may not be sufficiently
◦ Observers (they are measuring instruments) can change their
criteria over time, becoming more lenient or more strict.
◦ To avoid the instrumentation threats:
▪ If two forms of the test are used, make sure that they really
▪ Even better, counterbalance the version of the test that is
given at the pretest and at the posttest.
▪ If observers are used, retrain and have clear coding manuals.
When researchers' expectations influence their
interpretation of the results
Participants' expectations and ideas about the study
influence their behavior
People improve (or get worse: the nocebo effect) for reasons
different from the treatment given.
◦ To deal with this situation, use a double-blind placebo-controlled
study with one group receiving the real treatment and
another group receiving the placebo.
Not enough between-groups difference
Weak manipulation / insensitive measure
◦ For example: exercise 1 hour per week compared to no exercise
2. Ceiling and floor effects
◦ For the IV, similar to a weak (or too strong) manipulation.
◦ For the DV, the values are all too high or too low:
▪ For example, a memory task that is too easy, like remembering 4
words. Everybody will be at 100%, regardless of the values of
help detect weak manipulations,
ceiling and floor effects.
• A ( ) is a separate measure included
to make sure that the manipulation worked
too much within-groups variability
◦ Use a more precise, reliable measure
◦ Measure more instances
2. Individual Differences
◦ Measure more people
◦ Use a within-subjects design
◦ Use a matched design
3. Situation Variability / Noise
◦ Use a controlled environment
estimated standard error
is used as an estimate of
the real standard error, σM, when the value of σ is unknown
• It provides an estimate of the standard distance between a sample mean & the population mean
is used to test hypotheses about an
unknown population mean μ when the value of σ is also unknown
degrees of freedom
represent how much independent
information (scores) we have in our sample data.
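A minimal sketch of a one-sample t statistic: the estimated standard error replaces σ with the sample standard deviation s, and the degrees of freedom are n - 1. The scores and the hypothesized mean are made up, and `one_sample_t` is an illustrative name. The critical value would still come from a t table for the given df.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Returns (t, df, estimated standard error) for H0: mu = mu0."""
    n = len(sample)
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 in the denominator)
    est_se = s / math.sqrt(n)      # estimated standard error of M
    t = (statistics.mean(sample) - mu0) / est_se
    df = n - 1                     # degrees of freedom
    return t, df, est_se

# Hypothetical scores tested against H0: mu = 12
t, df, est_se = one_sample_t([10, 12, 14, 16, 18], mu0=12)
print(round(t, 3), df, round(est_se, 3))
```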
t-test for dependent measures
• Used when we measure people twice, that is, a repeated-measures
design with two measurements.
• So, we will compare each individual to themselves; each
person will be their own control.
• The null hypothesis is H0: μD = 0
• Also used with matched-subjects designs, in which each
individual is matched with an individual in the other sample with
respect to a specific variable that we want to control.
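A sketch of the dependent-measures t statistic: it is a one-sample t-test on the difference scores D, testing H0: μD = 0. The two sets of scores and the name `paired_t` are made-up illustrations.

```python
import math
import statistics

def paired_t(first, second):
    """t statistic and df for paired scores, testing H0: mu_D = 0."""
    diffs = [b - a for a, b in zip(first, second)]  # each person vs. themselves
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(n)   # estimated standard error of mean D
    return mean_d / se_d, n - 1

# Hypothetical first and second measurements for five participants
first_scores = [10, 12, 9, 14, 11]
second_scores = [12, 15, 10, 18, 12]
t_paired, df_paired = paired_t(first_scores, second_scores)
print(round(t_paired, 3), df_paired)
```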
t-test for independent measures
We are measuring separate groups of individuals and
comparing the means of those separate groups.
• The observations are completely independent.
• The goal is to evaluate a mean difference between two
• The null hypothesis is H0: μ1 - μ2 = 0
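A sketch of the independent-measures t statistic using the pooled variance, which assumes the two populations have equal variances. The group scores and the name `independent_t` are illustrative assumptions.

```python
import math
import statistics

def independent_t(sample1, sample2):
    """Pooled-variance t statistic and df, testing H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Sum of squared deviations (SS) for each group
    ss1 = statistics.variance(sample1) * (n1 - 1)
    ss2 = statistics.variance(sample2) * (n2 - 1)
    pooled_var = (ss1 + ss2) / df
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se_diff
    return t, df

# Hypothetical scores for two separate groups
group1 = [10, 12, 14, 16, 18]
group2 = [8, 9, 10, 11, 12]
t_ind, df_ind = independent_t(group1, group2)
print(round(t_ind, 3), df_ind)
```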
Hypothesis testing so far
2) one sample t-test
3) dependent samples t-test
4) independent samples t-test