ANOVA

Created by exodus1513 


Comparison of Means from
More Than 2 Groups

WHERE WE HAVE BEEN...

Compare mean obtained from one group to
predetermined number
1. One-mean hypothesis test
2. Can either be one-tailed (specify direction of
association) or two-tailed (no direction
specified)

WHERE WE HAVE BEEN...

Compare means from two groups
1. Two-means hypothesis test
2. Can either be one-tailed (specify direction of
association) or two-tailed (no direction
specified)

WHAT WE WERE REALLY DOING WHEN
COMPARING TWO MEANS

Testing hypothesis about association between
two variables
/ Associations are between CATEGORICAL IV
(nominal or ordinal) with two categories and
CONTINUOUS DV

WHERE WE ARE GOING...

What if we have more than 2 groups to compare?
/For example, what if we want to know if
happiness scores among people who are married,
divorced, widowed, OR never-married differ from
one another?
/ Cannot use z-tests or t-tests with more than 2
groups. So what do we use instead?

ANOVA

Analysis of Variance (ANOVA)
/Allows us to test whether there is association
between CATEGORICAL IV (nominal or ordinal
level) that has more than 2 categories and
CONTINUOUS DV

stating hypotheses

null: no association / research: association

WHAT WOULD PERFECT ASSOCIATION
BETWEEN CITY TYPE AND MURDER RATES
LOOK LIKE?

WITHIN every category of city type, all values
(ie. murder rates) would be same
/ BETWEEN categories of city type, mean murder
rates would be different
/ In other words, mean murder rates would be
different for each type of city BUT
/ All cities within each type (manufacturing, trade,
government) would have same murder rate

WHAT WOULD PERFECT ASSOCIATION
LOOK LIKE?

within every category (column), every value would be the same number

WHAT WOULD ABSOLUTELY NO
ASSOCIATION LOOK LIKE?

mean values would be the same BETWEEN categories; knowing the category would tell us nothing about the DV

WHAT REALITY GENERALLY LOOKS
LIKE

Of course, we never have perfect association (or
absolutely no association) between two variables
in social science
/ BUT when we have STRONG association, most
of variation occurs BETWEEN categories
/ Means that independent variable (city type)
explains most of variation in dependent variable
(murder rate)

PARTITIONING VARIANCE

How much does a single observation deviate from the
grand mean?
/ Mathematically, we can divide the total
deviation for a given observation (x_ik) into
1. Extent to which x_ik differs from its group mean (x̄_k)
(ie. difference WITHIN category)
2. Extent to which group mean (x̄_k) differs from
grand mean (x̄) (ie. difference BETWEEN
categories)

PARTITIONING VARIANCE FOR SINGLE
OBSERVATION

We are interested in doing this for every observation in data
set
/ Dividing TOTAL variation across all observations into
variation BETWEEN categories & variation WITHIN
categories
/ We will use same methods to partition variance but this time
do it for all observations in data set
/ So we use sum of squares again
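The partitioning described above can be sketched in Python. The murder-rate values and group sizes below are invented for illustration; only the sum-of-squares logic reflects the slides.

```python
# Hypothetical murder-rate data for three city types (values invented).
groups = {
    "manufacturing": [12.0, 15.0, 11.0, 14.0],
    "trade":         [8.0, 9.0, 7.0, 10.0],
    "government":    [5.0, 6.0, 4.0, 5.0],
}

all_values = [x for xs in groups.values() for x in xs]
grand_mean = sum(all_values) / len(all_values)

# WITHIN-group SS: squared deviation of each raw score from its group mean
ss_within = sum(
    (x - sum(xs) / len(xs)) ** 2
    for xs in groups.values()
    for x in xs
)

# BETWEEN-group SS: squared deviation of each group mean from the grand
# mean, counted once per observation in the group
ss_between = sum(
    len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2
    for xs in groups.values()
)

# TOTAL SS: squared deviation of every raw score from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# The partition identity: total variation splits exactly into the two parts
assert abs(ss_total - (ss_between + ss_within)) < 1e-9
```

The final assertion checks the identity SStotal = SSbetween + SSwithin, which is what "partitioning variance" means.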

IN ENGLISH, PLEASE

WITHIN GROUP sum of squares is sum of
squared deviation of every raw score from its
group mean
1. You are figuring out extent to which each raw
score (x_ik) deviates from its group mean (x̄_k)
2. Squaring these deviation scores to get rid of
negative signs
3. Then adding up squared deviations from each
observation within given group

MORE ENGLISH, PLEASE

In contrast, BETWEEN GROUP sum of squares
is sum of squared deviation of every group mean
from grand mean
1. You are figuring out extent to which each group
mean (x̄_k) deviates from grand mean (x̄)
2. Squaring these deviation scores to get rid of
negative signs
3. Then adding up squared deviations for all
groups

TESTING FOR ASSOCIATION WITH
ANOVA

How do we use this information to determine if
there is association between IV (type of city) &
DV (murder rates)?
/ If most of TOTAL VARIATION (SStotal) can be
attributed to variation WITHIN categories of IV
/ Then there is NO ASSOCIATION between IV
and DV

TESTING FOR ASSOCIATION WITH
ANOVA

How do we use this information to determine if
there is association between IV (type of city) &
DV (murder rates)?
/ If most of TOTAL VARIATION (SStotal) can be
attributed to variation BETWEEN categories of
IV
/ Then there is SIGNIFICANT ASSOCIATION
between IV and DV

step 1

CALCULATE MEAN FOR EACH
GROUP

step 2

CALCULATE WITHIN GROUP
SUM OF SQUARES

step 3

calculate between group sum of squares

STEP #4

CALCULATE DEGREES OF
FREEDOM (BETWEEN & WITHIN)

step 4

df between = k - 1, where k is number of categories in IV
/ df within = n - k, where n is number of cases & k is
number of categories in IV

STEP #5

CALCULATE MEAN SQUARES
(BETWEEN & WITHIN)

step 5

Transform sums of squares (which are
measures of variation) into measures of
VARIANCE
/Measures of VARIANCE (mean squares)
differ from gross measures of VARIATION
(sums of squares) because...
/VARIANCE (mean squares) takes into
account degrees of freedom (ie. sample
size & number of groups in IV)

mean squares between

MS between = SSbetween / df between

mean squares within

MS within = SSwithin / df within
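Steps 4-5 can be sketched as a few lines of Python. The sums of squares and sample sizes below are hypothetical numbers, used only to show the arithmetic.

```python
# Hypothetical inputs: sums of squares, number of cases, number of groups
ss_between = 128.67   # between-group sum of squares (invented)
ss_within = 17.0      # within-group sum of squares (invented)
n = 12                # number of cases
k = 3                 # number of categories in the IV

# Step 4: degrees of freedom
df_between = k - 1    # 3 - 1 = 2
df_within = n - k     # 12 - 3 = 9

# Step 5: mean squares = sums of squares adjusted for degrees of freedom
ms_between = ss_between / df_between
ms_within = ss_within / df_within
```

Dividing by the degrees of freedom is what turns a gross measure of variation (a sum of squares) into a measure of variance (a mean square).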

STEP #6

CONDUCT HYPOTHESIS TEST
/ Follow same 5 steps we have been using for
hypothesis testing
1. State null & alternative hypotheses
2. Determine alpha level
3. Determine critical value of F
4. Compute test statistic (in this case, use F test)
5. Compare observed F to critical F & state
conclusion

STEP #1

state hypothesis

step 2

determine alpha level

step 3

find critical f
/ If df fall between two listed values, use SMALLER df
/ If df is greater than largest listed value (> 20 in
numerator or > 1000 in denominator), use infinity for
that component

step 4

calculate observed f
/ The HIGHER the ratio, the more variance can be attributed to
differences BETWEEN categories
/ The LOWER the ratio, the more variance can be attributed to
differences WITHIN categories

f observed =

MS between / MS within
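The comparison of observed F to critical F can be sketched as follows. The mean squares are hypothetical, and the critical value is the tabled F for df = (2, 9) at alpha = .05.

```python
# Hypothetical mean squares (invented for illustration)
ms_between = 64.34
ms_within = 1.89

# Observed F is the ratio of between-group to within-group variance
f_observed = ms_between / ms_within

# Critical value from an F table: F(2, 9) at alpha = .05
f_critical = 4.26

# Reject H0 (no association) only when observed F exceeds critical F
reject_h0 = f_observed > f_critical
```

With these invented numbers the ratio is far above the critical value, so H0 would be rejected.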

step 5

compare critical f to observed f

step 5

Remember, with ANOVA we are testing whether
between group variance is greater than within
group variance
/ We want to know whether observed value of F is
relatively large
/If observed F is greater than critical F, we will
reject H0 and conclude there is an association
between independent & dependent variables

STRENGTH OF ASSOCIATION

Once we know that there is SIGNIFICANT
association between IV & DV, we need to
estimate STRENGTH of association
/ This is important because it is possible for
associations that exist in population to differ in
how strong (or important) they are
/ In fact, relatively weak association can be
significant if sample size is large enough

STRENGTH OF ASSOCIATION

Measure strength of association in ANOVA using
eta squared (η2)
/ Indicates proportion of total variation that is due
to (explained by) independent variable
/ η² = SS between / SS total

STRENGTH OF ASSOCIATION

Interpretation: 6.66 % of total variation in
dependent variable (reading comprehension) is
explained by independent variable (type of
school)
/Thus, association between reading
comprehension & type of school is significant but
WEAK.

WHAT LEVEL OF ETA IS CONSIDERED
STRONG OR WEAK?

<10% = weak
10%-25% = moderate
>25% = strong
/ Remember, this is also dependent upon your
research question, hypotheses, units used to
measure variables & expected effect size

WHAT'S UP WITH ANOVA...IS IT ONE-TAILED OR TWO-TAILED?

ANOVA is an OMNIBUS test, meaning that it
just tests OVERALL differences
/There really isn't one-tailed vs. two-tailed option
with ANOVA (or F distribution)
/ F test is one-tailed. We reject H0 if observed F is
greater than critical F
/ However, ANOVA really tests two-tailed
hypothesis because testing whether there is
significant difference between groups (do not
state specific directional difference)

In Other Words...

Significant F test only tells us that at least two of
groups are significantly different on DV
/ But we cannot tell which two are different
/ Could conduct t-test of difference between two
means to determine which two groups are
significantly different from each other
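The follow-up pairwise comparisons can be sketched with the standard library. The happiness-style data are invented, and the function computes the pooled-variance two-sample t statistic (each t would then be checked against a critical t with nx + ny - 2 degrees of freedom).

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical happiness scores for three of the marital-status groups
groups = {
    "married":  [7.0, 8.0, 6.0, 7.0],
    "divorced": [4.0, 5.0, 3.0, 4.0],
    "widowed":  [5.0, 6.0, 5.0, 4.0],
}

def t_statistic(xs, ys):
    """Two-sample t statistic using the pooled (equal-variance) estimate."""
    nx, ny = len(xs), len(ys)
    pooled_var = ((nx - 1) * variance(xs) + (ny - 1) * variance(ys)) / (nx + ny - 2)
    return (mean(xs) - mean(ys)) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5

# Compare every pair of groups to find which ones differ
for (a, xs), (b, ys) in combinations(groups.items(), 2):
    print(f"{a} vs {b}: t = {t_statistic(xs, ys):.2f}  (df = {len(xs) + len(ys) - 2})")
```

A significant omnibus F says only that some pair differs; the pairwise t values above are what locate the difference.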

NORMAL DISTRIBUTION &
EQUALITY OF VARIANCES

When using ANOVA, we assume that the
dependent variable is normally distributed
/ However, if sample size is large enough, we can
relax this assumption because of CLT
/ Equal variances? ANOVA assumes that in
population, variance of DV is equivalent across
groups. Sample variances may not be exactly
equal. If they are close enough, F test will be
valid
/ Nonequivalence of variances only makes a
difference when working with small sample sizes
(not common in Sociology)
