### WHERE WE HAVE BEEN...

Compare mean obtained from one group to predetermined number

1. One-mean hypothesis test
2. Can either be one-tailed (specify direction of association) or two-tailed (no direction specified)

### WHERE WE HAVE BEEN...

Compare means from two groups

1. Two-means hypothesis test
2. Can either be one-tailed (specify direction of association) or two-tailed (no direction specified)

### WHAT WE WERE REALLY DOING WHEN COMPARING TWO MEANS

Testing hypothesis about association between two variables

- Associations are between CATEGORICAL IV (nominal or ordinal) with two categories and CONTINUOUS DV

### WHERE WE ARE GOING...

What if we have more than 2 groups to compare?

- For example, what if we want to know if happiness scores among people who are married, divorced, widowed, OR never-married differ from one another?
- Cannot use z-tests or t-tests with more than 2 groups. So what do we use instead?

### ANOVA

Analysis of Variance (ANOVA)

- Allows us to test whether there is association between CATEGORICAL IV (nominal or ordinal level) that has more than 2 categories and CONTINUOUS DV
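As a preview of the computation the next slides walk through step by step, here is a minimal one-way ANOVA sketch using only the Python standard library. The city types come from the running example, but the murder-rate numbers are invented purely for illustration:

```python
from statistics import mean

# Hypothetical murder rates for three city types; the numbers are
# invented purely for illustration
groups = {
    "manufacturing": [10, 12, 11],
    "trade":         [20, 21, 19],
    "government":    [14, 15, 16],
}

scores = [x for g in groups.values() for x in g]
grand_mean = mean(scores)

# Partition total variation into between- and within-group sums of squares
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

# Mean squares = sums of squares adjusted for degrees of freedom
k, n = len(groups), len(scores)
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n - k)

# Observed F: between-group variance relative to within-group variance
f_observed = ms_between / ms_within
print(round(f_observed, 2))  # -> 61.0
```

A large F like this means most of the variation in the DV lies between the city types rather than within them.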

### WHAT WOULD PERFECT ASSOCIATION BETWEEN CITY TYPE AND MURDER RATES LOOK LIKE?

WITHIN every category of city type, all values (i.e. murder rates) would be same

- BETWEEN categories of city type, mean murder rates would be different
- In other words, mean murder rates would be different for each type of city BUT
- All cities within each type (manufacturing, trade, government) would have same murder rate

### WHAT REALITY GENERALLY LOOKS LIKE

Of course, we never have perfect association (or absolutely no association) between two variables in social science

- BUT when we have STRONG association, most of variation occurs BETWEEN categories
- Means that independent variable (city type) explains most of variation in dependent variable (murder rate)

### PARTITIONING VARIANCE

How much single observation deviates from grand mean

- Mathematically we can divide total deviation for given observation (x_ik) into two pieces: (x_ik − x̄) = (x_ik − x̄_k) + (x̄_k − x̄)

1. Extent to which x_ik differs from group mean x̄_k (i.e. difference WITHIN category)
2. Extent to which group mean x̄_k differs from grand mean x̄ (i.e. difference BETWEEN categories)

### PARTITIONING VARIANCE FOR SINGLE OBSERVATION

We are interested in doing this for every observation in data set

- Dividing TOTAL variation across all observations into variation BETWEEN categories & variation WITHIN categories (SS_total = SS_between + SS_within)
- We will use same methods to partition variance but this time do it for all observations in data set
- So we use sum of squares again

### IN ENGLISH, PLEASE

WITHIN GROUP sum of squares is sum of squared deviations of every raw score from its group mean

1. You are figuring out extent to which each raw score (x_ik) deviates from its group mean (x̄_k)
2. Squaring these deviation scores to get rid of negative signs
3. Then adding up squared deviations from each observation within given group

### MORE ENGLISH, PLEASE

In contrast, BETWEEN GROUP sum of squares is sum of squared deviations of every group mean from grand mean

1. You are figuring out extent to which each group mean (x̄_k) deviates from grand mean (x̄_total)
2. Squaring these deviation scores to get rid of negative signs
3. Then adding up squared deviations for all groups
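The two recipes above can be checked directly in code. A sketch with hypothetical murder-rate data (the numbers are made up), confirming that the two pieces add up to the total sum of squares:

```python
from statistics import mean

# Hypothetical murder rates for three city types (made-up numbers)
groups = {
    "manufacturing": [10, 12, 11],
    "trade":         [20, 21, 19],
    "government":    [14, 15, 16],
}
scores = [x for g in groups.values() for x in g]
grand_mean = mean(scores)

# WITHIN-group SS: each raw score's squared deviation from its group mean
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

# BETWEEN-group SS: each group mean's squared deviation from the grand
# mean, counted once per observation in the group
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

# Total SS partitions exactly into the two pieces
ss_total = sum((x - grand_mean) ** 2 for x in scores)
print(ss_within, round(ss_between, 1), round(ss_total, 1))  # -> 6 122.0 128.0
```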

### TESTING FOR ASSOCIATION WITH ANOVA

How do we use this information to determine if there is association between IV (type of city) & DV (murder rates)?

- If most of TOTAL VARIATION (SS_total) can be attributed to variation WITHIN categories of IV
- Then there is NO ASSOCIATION between IV and DV

### TESTING FOR ASSOCIATION WITH ANOVA

- If, instead, most of TOTAL VARIATION (SS_total) can be attributed to variation BETWEEN categories of IV
- Then there is SIGNIFICANT ASSOCIATION between IV and DV

### STEP 4

- df between = k - 1, where k is number of categories in IV
- df within = n - k, where n is number of cases & k is number of categories in IV
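The two formulas above as a small helper; the 9 cases and 3 categories plugged in below are just example numbers:

```python
def anova_df(n_cases, k_categories):
    """Degrees of freedom for one-way ANOVA: (df_between, df_within)."""
    return k_categories - 1, n_cases - k_categories

# e.g. 9 cases spread over 3 categories of the IV
print(anova_df(9, 3))  # -> (2, 6)
```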

### STEP 5

Transform sums of squares (which are measures of variation) into measures of VARIANCE

- Measures of VARIANCE (mean squares) differ from gross measures of VARIATION (sums of squares) because...
- VARIANCE (mean squares) takes into account degrees of freedom (i.e. sample size & number of groups in IV)
- MS between = SS between / df between; MS within = SS within / df within

### STEP 6

CONDUCT HYPOTHESIS TEST

- Follow same 5 steps we have been using for hypothesis testing

1. State null & alternative hypotheses
2. Determine alpha level
3. Determine critical value of F
4. Compute test statistic (in this case, use F test)
5. Compare observed F to critical F & state conclusion

### STEP 3

Find critical F

- If df fall between two listed values, use SMALLER df
- If df is greater than largest listed value (> 20 in numerator or > 1000 in denominator), use infinity for that component

### STEP 4

Calculate observed F = MS between / MS within

- The HIGHER the ratio is, the more variance can be attributed to differences BETWEEN categories
- The LOWER the ratio is, the more variance can be attributed to differences WITHIN categories
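Putting the mean squares and the F ratio together numerically, assuming hypothetical sums of squares (SS_between = 122, SS_within = 6) from a made-up murder-rate example with 3 groups and 9 cases:

```python
# Hypothetical sums of squares from a made-up murder-rate example
ss_between, ss_within = 122.0, 6.0
k, n = 3, 9  # three city types, nine cities in total

ms_between = ss_between / (k - 1)  # mean square between = 61.0
ms_within = ss_within / (n - k)    # mean square within  = 1.0

f_observed = ms_between / ms_within
print(f_observed)  # -> 61.0
```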

### STEP 5

Remember, with ANOVA we are testing whether between-group variance is greater than within-group variance

- We want to know whether observed value of F is relatively large
- If observed F is greater than critical F, we will reject H0 and conclude there is an association between independent & dependent variables

### STRENGTH OF ASSOCIATION

Once we know that there is SIGNIFICANT association between IV & DV, we need to estimate STRENGTH of association

- This is important because it is possible for associations that exist in population to differ in how strong (or important) they are
- In fact, relatively weak association can be significant if sample size is large enough

### STRENGTH OF ASSOCIATION

Measure strength of association in ANOVA using eta squared (η²)

- Indicates proportion of total variation that is due to (explained by) independent variable
- η² = SS_between / SS_total
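A quick sketch of the η² computation, using hypothetical sums of squares:

```python
# Hypothetical sums of squares; eta squared is the share of total
# variation in the DV explained by the IV
ss_between, ss_total = 122.0, 128.0
eta_squared = ss_between / ss_total
print(round(eta_squared, 3))  # -> 0.953, a very strong association
```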

### STRENGTH OF ASSOCIATION

Interpretation: 6.66% of total variation in dependent variable (reading comprehension) is explained by independent variable (type of school)

- Thus, association between reading comprehension & type of school is significant but WEAK.

### WHAT LEVEL OF ETA IS CONSIDERED STRONG OR WEAK?

- < 10% = weak
- 10%-25% = moderate
- > 25% = strong
- Remember, this is also dependent upon your research question, hypotheses, units used to measure variables & expected effect size

### WHAT'S UP WITH ANOVA... IS IT ONE-TAILED OR TWO-TAILED?

ANOVA is an OMNIBUS test, meaning that it just tests OVERALL differences

- There really isn't a one-tailed vs. two-tailed option with ANOVA (or F distribution)
- F test is one-tailed: we reject H0 if observed F is greater than critical F
- However, ANOVA really tests two-tailed hypothesis because we are testing whether there is significant difference between groups (we do not state specific directional difference)

### In Other Words...

Significant F test only tells us that at least two of groups are significantly different on DV

- But we cannot tell which two are different
- Could conduct t-test of difference between two means to determine which two groups are significantly different from each other
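A sketch of such follow-up pairwise comparisons; the group names and happiness scores below are invented, and the pooled-variance t statistic used here assumes equal group variances:

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical happiness scores by marital status (all numbers invented)
groups = {
    "married":  [7, 8, 8, 9],
    "divorced": [4, 5, 5, 6],
    "widowed":  [6, 6, 7, 7],
}

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (equal-variance form)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# Compare every pair of groups after a significant omnibus F
for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
    print(f"{name_a} vs {name_b}: t = {pooled_t(a, b):.2f}")
```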

### NORMAL DISTRIBUTION & EQUALITY OF VARIANCES

When using ANOVA, we assume that the dependent variable is normally distributed

- However, if sample size is large enough, we can relax this assumption because of the CLT
- Equal variances? ANOVA assumes that in population, variance of DV is equivalent across groups. Sample variances may not be exactly equal; if they are close enough, F test will be valid
- Nonequivalence of variances only makes a difference when working with small sample sizes (not common in Sociology)