# ANOVA

### 41 terms by exodus1513


### ANOVA

Comparison of Means from
More Than 2 Groups

### WHERE WE HAVE BEEN...

Compare a mean obtained from one group to a predetermined number

1. One-mean hypothesis test
2. Can be either one-tailed (direction of association specified) or two-tailed (no direction specified)

### WHERE WE HAVE BEEN...

Compare means from two groups

1. Two-means hypothesis test
2. Can be either one-tailed (direction of association specified) or two-tailed (no direction specified)

### WHAT WE WERE REALLY DOING WHEN COMPARING TWO MEANS

Associations between two variables: a CATEGORICAL IV (nominal or ordinal) with two categories and a CONTINUOUS DV

### WHERE WE ARE GOING...

What if we have more than 2 groups to compare?

- For example, what if we want to know if happiness scores among people who are married, divorced, widowed, OR never-married differ from one another?
- We cannot use z-tests or t-tests with more than 2 groups. So what do we use instead?

### ANOVA

Analysis of Variance (ANOVA)

- Allows us to test whether there is an association between a CATEGORICAL IV (nominal or ordinal level) that has more than 2 categories and a CONTINUOUS DV
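As a minimal sketch, a one-way ANOVA can be run directly with SciPy's `f_oneway`; the happiness scores for three marital-status groups below are made-up illustrative data, not from the slides:

```python
# One-way ANOVA with SciPy: categorical IV (marital status, 3 categories),
# continuous DV (happiness score). Data are hypothetical.
from scipy.stats import f_oneway

married = [7, 8, 6, 7, 9]
divorced = [5, 4, 6, 5, 5]
widowed = [6, 5, 7, 6, 6]

# f_oneway returns the observed F statistic and its p-value
f_stat, p_value = f_oneway(married, divorced, widowed)
print(f_stat, p_value)
```

A small p-value here would lead us to reject the null hypothesis of no association.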

### stating hypothesis

Null hypothesis: no association. Research hypothesis: association.

### WHAT WOULD PERFECT ASSOCIATION BETWEEN CITY TYPE AND MURDER RATES LOOK LIKE?

WITHIN every category of city type, all values (i.e. murder rates) would be the same

- BETWEEN categories of city type, mean murder rates would be different
- In other words, mean murder rates would be different for each type of city, BUT
- All cities within each type (manufacturing, trade, government) would have the same murder rate

### WHAT WOULD PERFECT ASSOCIATION LOOK LIKE?

Every row within a given column would have the same number

### WHAT WOULD ABSOLUTELY NO ASSOCIATION LOOK LIKE?

no correlation between numbers in columns

### WHAT REALITY GENERALLY LOOKS LIKE

Of course, we never have perfect association (or absolutely no association) between two variables in social science

- BUT when we have STRONG association, most of the variation occurs BETWEEN categories
- This means the independent variable (city type) explains most of the variation in the dependent variable (murder rate)

### PARTITIONING VARIANCE

How much a single observation deviates from the grand mean

Mathematically, we can divide the total deviation for a given observation (x_ik) into:

1. The extent to which x_ik differs from its group mean x̄_k (i.e. difference WITHIN a category)
2. The extent to which the group mean x̄_k differs from the grand mean (i.e. difference BETWEEN categories)

### PARTITIONING VARIANCE FOR SINGLE OBSERVATION

We are interested in doing this for every observation in the data set

- Dividing TOTAL variation across all observations into variation BETWEEN categories & variation WITHIN categories
- We will use the same methods to partition variance, but this time do it for all observations in the data set
- So we use sums of squares again

The WITHIN GROUP sum of squares is the sum of squared deviations of every raw score from its group mean

1. You are figuring out the extent to which each raw score (x_ik) deviates from its group mean (x̄_k)
2. Squaring these deviation scores to get rid of negative signs
3. Then adding up the squared deviations from each observation within a given group

In contrast, the BETWEEN GROUP sum of squares is the sum of squared deviations of every group mean from the grand mean

1. You are figuring out the extent to which each group mean (x̄_k) deviates from the grand mean (x̄_total)
2. Squaring these deviation scores to get rid of negative signs
3. Then adding up the squared deviations for all groups
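The two sums of squares can be sketched in a few lines of NumPy; the three small groups below are hypothetical city-type data, and the final line checks the identity SS_total = SS_within + SS_between:

```python
# Partitioning total variation into WITHIN- and BETWEEN-group
# sums of squares (hypothetical murder-rate data for 3 city types).
import numpy as np

groups = [np.array([7.0, 8, 6, 7, 9]),   # e.g. "manufacturing" cities
          np.array([5.0, 4, 6, 5, 5]),   # e.g. "trade" cities
          np.array([6.0, 5, 7, 6, 6])]   # e.g. "government" cities

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# WITHIN: squared deviation of every raw score from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# BETWEEN: squared deviation of each group mean from the grand mean,
# weighted by the number of cases in that group
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

ss_total = ((all_scores - grand_mean) ** 2).sum()
print(ss_within, ss_between, ss_total)
```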

### TESTING FOR ASSOCIATION WITH ANOVA

How do we use this information to determine if there is an association between the IV (type of city) & the DV (murder rates)?

- If most of the TOTAL VARIATION (SStotal) can be attributed to variation WITHIN categories of the IV
- Then there is NO ASSOCIATION between the IV and DV

### TESTING FOR ASSOCIATION WITH ANOVA

How do we use this information to determine if there is an association between the IV (type of city) & the DV (murder rates)?

- If most of the TOTAL VARIATION (SStotal) can be attributed to variation BETWEEN categories of the IV
- Then there is a SIGNIFICANT ASSOCIATION between the IV and DV

### step 1

CALCULATE MEAN FOR EACH
GROUP

### step 2

CALCULATE WITHIN GROUP
SUM OF SQUARES

### step 3

calculate between group sum of squares

### STEP #4

CALCULATE DEGREES OF
FREEDOM (BETWEEN & WITHIN)

### step 4

df between = k − 1, where k is the number of categories in the IV

df within = n − k, where n is the number of cases & k is the number of categories in the IV
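A tiny sketch of the two formulas, using assumed example numbers (15 cases across 3 categories of the IV):

```python
# Degrees of freedom for a one-way ANOVA (hypothetical example)
n = 15  # number of cases
k = 3   # number of categories in the IV

df_between = k - 1
df_within = n - k
print(df_between, df_within)
```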

### STEP #5

CALCULATE MEAN SQUARES
(BETWEEN & WITHIN)

### step 5

Transform sums of squares (which are measures of variation) into measures of VARIANCE

- Measures of VARIANCE (mean squares) differ from gross measures of VARIATION (sums of squares) because...
- VARIANCE (mean squares) takes into account degrees of freedom (i.e. sample size & number of groups in the IV)

### mean squares between

MS between = SSbetween / df between

### mean squares within

MS within = SSwithin / df within
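The two mean-square formulas can be sketched together, carrying forward hypothetical sums of squares and degrees of freedom from a 3-group, 15-case example:

```python
# Mean squares: sums of squares scaled by their degrees of freedom
# (all values are assumed example numbers)
ss_between, ss_within = 14.533, 9.2
df_between, df_within = 2, 12

ms_between = ss_between / df_between
ms_within = ss_within / df_within
print(ms_between, ms_within)
```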

### STEP #6

CONDUCT HYPOTHESIS TEST

Follow the same 5 steps we have been using for hypothesis testing:

1. State null & alternative hypotheses
2. Determine alpha level
3. Determine critical value of F
4. Compute test statistic (in this case, use F test)
5. Compare observed F to critical F & state conclusion

### step 1

state null & alternative hypotheses

### step 2

determine alpha level

### step 3

find critical F

- If the df fall between two listed values, use the SMALLER df
- If the df is greater than the largest listed value (> 20 in the numerator or > 1000 in the denominator), use infinity for that component
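Instead of a printed F table, the critical value can be looked up from SciPy's F distribution; the alpha level and df here are assumed example values:

```python
# Critical F at alpha = .05 with df_between = 2, df_within = 12
# (hypothetical example values)
from scipy.stats import f

alpha = 0.05
df_between, df_within = 2, 12

# ppf is the inverse CDF: the F value that cuts off the upper alpha tail
f_critical = f.ppf(1 - alpha, df_between, df_within)
print(round(f_critical, 2))
```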

### step 4

calculate observed F

- The HIGHER the ratio is, the more variance can be attributed to differences BETWEEN categories
- The LOWER the ratio is, the more variance can be attributed to differences WITHIN categories

### f observed=

MS between / MS within
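In code, the observed F is just this ratio; the mean squares below are the assumed example numbers carried through the earlier steps:

```python
# Observed F = ratio of between-group to within-group variance
# (hypothetical mean squares)
ms_between, ms_within = 7.267, 0.767

f_observed = ms_between / ms_within
print(round(f_observed, 2))
```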

### step 5

compare critical f to observed f

### step 5

Remember, with ANOVA we are testing whether between-group variance is greater than within-group variance

- We want to know whether the observed value of F is relatively large
- If the observed F is greater than the critical F, we will reject H0 and conclude there is an association between the independent & dependent variables

### STRENGTH OF ASSOCIATION

Once we know that there is a SIGNIFICANT association between the IV & DV, we need to estimate the STRENGTH of the association

- This is important because it is possible for associations that exist in the population to differ in how strong (or important) they are
- In fact, a relatively weak association can be significant if the sample size is large enough

### STRENGTH OF ASSOCIATION

Measure the strength of association in ANOVA using eta squared (η²)

- Indicates the proportion of total variation that is due to (explained by) the independent variable
- η² = SS between / SS total
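A one-line sketch of the η² formula, reusing the hypothetical sums of squares from the earlier example:

```python
# Eta squared: proportion of total variation explained by the IV
# (assumed example sums of squares)
ss_between, ss_total = 14.533, 23.733

eta_squared = ss_between / ss_total
print(round(eta_squared, 3))
```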

### STRENGTH OF ASSOCIATION

Interpretation: 6.66% of the total variation in comprehension is explained by the independent variable (type of school)

The association between comprehension & type of school is significant but WEAK.

### WHAT LEVEL OF ETA IS CONSIDERED STRONG OR WEAK?

- <10% = weak
- 10%–25% = moderate
- >25% = strong

Remember, this is also dependent upon your research question, hypotheses, units used to measure variables & expected effect size

### WHAT'S UP WITH ANOVA...IS IT ONE-TAILED OR TWO-TAILED?

ANOVA is an OMNIBUS test, meaning that it just tests OVERALL differences

- There really isn't a one-tailed vs. two-tailed option with ANOVA (or the F distribution)
- The F test is one-tailed: we reject H0 if the observed F is greater than the critical F
- However, ANOVA really tests a two-tailed hypothesis, because we are testing whether there is a significant difference between groups (we do not state a specific directional difference)

### In Other Words...

A significant F test only tells us that at least two of the groups are significantly different on the DV

- But we cannot tell which two are different
- We could conduct a t-test of the difference between two means to determine which two groups are significantly different from each other
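A sketch of that follow-up, running a t-test on each pair of groups with SciPy (the data are the same hypothetical happiness scores; note that running many pairwise tests inflates Type I error, so corrections such as Bonferroni are commonly applied, though the slides mention only the plain t-test):

```python
# Pairwise t-tests after a significant omnibus F (hypothetical data)
from itertools import combinations
from scipy.stats import ttest_ind

samples = {"married": [7, 8, 6, 7, 9],
           "divorced": [5, 4, 6, 5, 5],
           "widowed": [6, 5, 7, 6, 6]}

# Test every pair of groups and report t and its two-tailed p-value
for (name_a, a), (name_b, b) in combinations(samples.items(), 2):
    t, p = ttest_ind(a, b)
    print(name_a, "vs", name_b, round(t, 2), round(p, 4))
```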

### NORMAL DISTRIBUTION & EQUALITY OF VARIANCES

When using ANOVA, we assume that the dependent variable is normally distributed

- However, if the sample size is large enough, we can relax this assumption because of the CLT
- Equal variances? ANOVA assumes that in the population, the variance of the DV is equivalent across groups. Sample variances may not be exactly equal; if they are close enough, the F test will be valid
- Nonequivalence of variances only makes a difference when working with small sample sizes (not common in Sociology)

Example: