
# Intro to hypothesis testing

Where we have been
Making predictions about likelihood of true population
mean falling within specific range around sample mean
Want to know how closely sample statistic approximates
population parameter
In other words, we are interested in drawing conclusions about
characteristic of SINGLE VARIABLE (e.g., test scores or time
spent studying) from sample data
Where we are going
Test whether mean value of SINGLE variable is greater
or less than predetermined value
AND
Test whether mean values of SINGLE variable from
TWO DIFFERENT samples are equivalent
TEST HYPOTHESES about
Single variable in relation to specific value OR
Single variable from one sample in relation to same
variable from another sample
Two Types of Hypothesis Tests
single means/two means test
single means test
Test whether population
mean is greater than or less than predetermined
value
two means test
Test whether population
means from two distinct groups actually differ
single means test example
New 5th grade reading
program increases standardized test scores by
more than 1.25 grade levels
two means test example
Reading Program X increases
standardized test scores more than Reading
Program Y
null hypothesis (H0)
States that in population there is
no association, no change, or no difference between two
variables or conditions. Indicates statistical
INDEPENDENCE
alternative hypothesis (H1)
States that in population
there is an association, change, or difference between two
variables or conditions. Indicates statistical
DEPENDENCE.
null hypothesis example
New 5th grade
reading program does NOT increase
standardized test scores by more than 1.25
grade levels per year.
alternative hypothesis example
New 5th
grade reading program increases
standardized test scores by more than 1.25
grade levels per year.
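In symbols, the reading-program example can be written as a one-tailed pair of hypotheses. This is only a notational sketch: the symbol μ (population mean gain in grade levels per year under the new program) is an assumption introduced here, not something the cards define.

$$
H_0:\ \mu \le 1.25 \qquad\qquad H_1:\ \mu > 1.25
$$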
What's up with this null hypothesis nonsense?
Because of basic assumptions about epistemology & philosophy
of science, can only directly test null hypothesis
Can only REJECT or FAIL TO REJECT null hypothesis. Cannot
prove that alternative hypothesis is true.
Central principle of inductive reasoning (null)
A single study can
never PROVE something to be true. We can only FAIL TO
PROVE that it is false (thanks, Karl Popper).
This is what is meant by "falsifiability" in science. In order for
hypothesis to be testable, it has to be possible to prove it to be
false.
It is basically science's way of being VERY CONSERVATIVE
about conclusions we draw
Type I Error
We reject null hypothesis when, in fact, null hypothesis is
really true
Conclude that treatment had effect when it actually was
not effective (less conservative conclusion)
Occurs when information from sample is misleading.
Cannot make "correct" estimates about population
parameters from sample statistics
Probability of making Type I error is alpha (α)
Type II Error
We fail to reject null hypothesis when, in fact, null
hypothesis is really false
Conclude that treatment had no effect when it actually
was effective (more conservative conclusion)
Occurs when hypothesis test fails to detect statistical
dependence
Probability of making Type II error is beta (β)
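The two error rates can be estimated by simulation. The sketch below is illustrative only: it assumes a one-tailed (upper) z test with known population SD, and the function name `rejection_rate`, the effect size of 0.3, the sample size of 30, and the seed are all hypothetical choices, not values from these cards.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=30, alpha=0.05,
                   trials=10_000, seed=0):
    """Fraction of simulated samples in which H0: mu <= mu0 is rejected."""
    rng = random.Random(seed)                   # fixed seed so runs are repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha)    # upper-tail critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z > z_crit:
            rejections += 1
    return rejections / trials

# H0 really true (true mean equals mu0): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mu=0.0)

# H0 really false (true mean is 0.3 above mu0): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mu=0.3)

print(f"Estimated Type I rate (should be near alpha): {type_i_rate:.3f}")
print(f"Estimated Type II rate (beta) for this effect size: {type_ii_rate:.3f}")
```

Under these assumptions the Type I rate should land close to the chosen alpha, while beta depends on the effect size and sample size.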
Which Error to Minimize?
Need to carefully examine specific research question
What if you want to determine if sexual contact is related
to particular viral infection
Want to use this information to decide whether or not to
inform public about potential risk.
H0: Sexual contact is not related to viral infection (do not
inform patients)
H1: Sexual contact is related to viral infection (inform
patients)
Which Error to Minimize?
Fail to reject null hypothesis & say that sexual contact is NOT
related to virus when, in fact, it is (Type II error). Therefore, you
do not inform public of risks.
Implications: jeopardize health of sexually active
individuals
OR
Reject null hypothesis & conclude that sexual contact IS related
to virus when, in fact, it is not (Type I error). Therefore, you tell
public there are risks which really do not exist.
Implications: people have safer sex when they don't really
need to, at least as far as this virus is concerned
What is the point?
Must carefully consider your research question(s) when
deciding what type of error to minimize
However, we will largely focus on decreasing Type I error
because it is more common in social sciences AND
Type I error typically can result in potentially serious
consequences
How to Minimize Errors
Increase sample size/replicate study by selecting new sample
Increase sample size
Reduces error because samples are never identical
to population from which drawn
Replicate study by selecting new sample
Reduces error because samples are never identical
to one another
Steps to follow
1. State null & alternative hypotheses
2. Set significance level
3. Determine critical region
4. Collect data & compute test statistic(s)
5. Make decision to either reject null hypothesis or fail
to reject null hypothesis
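The five steps can be sketched in code. This is a minimal illustration, assuming a one-tailed z test with a known population SD. The helper name `one_sample_z_test` and all of the numbers (sample mean 1.40, SD 0.60, n = 64) are hypothetical and chosen only to show the mechanics.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="greater"):
    """Hypothetical helper: one-tailed z test for a single mean, known SD."""
    # Step 3: critical value marking the edge of the critical region
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Step 4: test statistic = (sample mean - hypothesized mean) / standard error
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 5: reject H0 only if z falls inside the critical region
    if tail == "greater":
        reject = z > z_crit
    else:  # tail == "less"
        reject = z < -z_crit
    return z, z_crit, reject

# Step 1: H0: mu <= 1.25, H1: mu > 1.25 (reading-program example)
# Step 2: significance level alpha = 0.05
# Illustrative data only, not from the source
z, z_crit, reject = one_sample_z_test(sample_mean=1.40, mu0=1.25,
                                      sigma=0.60, n=64, alpha=0.05)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```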
Take Home Message (of example)
By doubling sample size, we can now reject null hypothesis
and conclude, at the 0.01 significance level (99% confidence),
that the Cooper gets less than 35 miles to the gallon...
Even though sample mean & population SD did not change
The bigger the sample size, the more information we have about
given population so...
Our sample statistics will more accurately approximate
population parameters
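The effect of doubling the sample size can be checked numerically. The cards do not give the data behind the Cooper example, so every number below (sample mean 34, population SD 3, n = 25 versus n = 50) is a hypothetical stand-in. Only the 35 mpg benchmark and the 0.01 level come from the card. The sketch assumes a lower-tailed z test.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, mu0, sigma, n, alpha):
    """Hypothetical helper: test H1: mu < mu0 with known population SD."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))   # standard error shrinks as n grows
    z_crit = NormalDist().inv_cdf(alpha)          # negative critical value (lower tail)
    return z, z_crit, z < z_crit

MU0 = 35.0          # benchmark from the card
ALPHA = 0.01        # 99% confidence, from the card
SAMPLE_MEAN = 34.0  # assumed for illustration
SIGMA = 3.0         # assumed for illustration

z_small, crit, reject_small = lower_tail_z_test(SAMPLE_MEAN, MU0, SIGMA, 25, ALPHA)
z_large, _, reject_large = lower_tail_z_test(SAMPLE_MEAN, MU0, SIGMA, 50, ALPHA)

print(f"n=25: z = {z_small:.3f}, reject H0: {reject_small}")
print(f"n=50: z = {z_large:.3f}, reject H0: {reject_large}")
```

With the same mean and SD, doubling n multiplies the z statistic by the square root of 2, which in this illustration is enough to move it past the critical value.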
Two-Tailed Single Means Test
Tests whether population mean is equal to or not
equal to predetermined value
Previously (with one-tailed test) we were testing
whether population mean was greater than or less
than predetermined value
Error in Two-Tailed Test
Alpha level that we choose (probability of Type I error) is
now distributed in BOTH tails of distribution (rather than
in just one tail)
You can make a Type I error in two ways:
(a) by rejecting H0 because you think μ is greater than
value of interest when it is not OR
(b) by rejecting H0 because you think μ is less than value
of interest when it is not
Relationship between One- & Two-Tailed Critical Value
For a given α, one-tailed CV will be smaller than
two-tailed CV (i.e., closer to zero)
For α = 0.05, one-tailed CV is 1.65 while two-tailed
CV is 1.96
This is because we are dividing alpha by two
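The two critical values can be reproduced from the standard normal distribution. This is a small sketch using Python's standard library. The observed z of 1.80 is a hypothetical value, picked to show a result that is significant one-tailed but not two-tailed.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, SD 1

# One-tailed: all of alpha sits in a single tail
one_tailed_cv = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split, so each tail holds alpha / 2
two_tailed_cv = std_normal.inv_cdf(1 - alpha / 2)

z_observed = 1.80  # hypothetical test statistic
reject_one_tailed = z_observed > one_tailed_cv
reject_two_tailed = abs(z_observed) > two_tailed_cv

print(f"one-tailed CV = {one_tailed_cv:.3f}, two-tailed CV = {two_tailed_cv:.3f}")
print(f"z = {z_observed}: reject one-tailed? {reject_one_tailed}, "
      f"two-tailed? {reject_two_tailed}")
```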
Interpret Results
Based on our findings, we reject null hypothesis &
conclude that educational attainment was significantly
different in 2007 than in 2000
A Reminder
We expect sample mean to approximate population mean
Standard error provides simple measure of degree to which
sample mean differs from population mean
Based on mean, SD & sample size we can calculate Z-score
This Z-score indicates whether observed difference is
significantly greater than would be expected by
chance alone
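Written out, the reminder corresponds to the usual formulas for the standard error and the z statistic. The symbols are assumptions introduced here for illustration: x̄ is the sample mean, μ₀ the hypothesized population mean, σ the population SD, and n the sample size.

$$
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}, \qquad z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}
$$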