24 terms

shape, center and spread of a SAMPLING DISTRIBUTION of a sample proportion

SHAPE- approx. normal if np≥10 and n(1-p)≥10

CENTER- µ(p-hat)=p

SPREAD- σ(p-hat)=sqrt(p(1-p)/n) if the sample is no more than 10% of the population
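
As a rough illustration of the card above, here is a minimal Python sketch that computes the center and spread of the sampling distribution of p-hat and checks the two conditions. The function name `phat_sampling_distribution` and its parameters are hypothetical, chosen only for this example.

```python
import math

def phat_sampling_distribution(p, n, population_size=None):
    """Summarize the sampling distribution of p-hat (illustrative helper)."""
    mean = p                                  # CENTER: mu(p-hat) = p
    sd = math.sqrt(p * (1 - p) / n)           # SPREAD: sqrt(p(1-p)/n)
    # SHAPE: approximately normal if np >= 10 and n(1-p) >= 10
    approx_normal = n * p >= 10 and n * (1 - p) >= 10
    # 10% condition: sample is no more than 10% of the population (if size known)
    ten_percent_ok = None
    if population_size is not None:
        ten_percent_ok = n <= 0.10 * population_size
    return mean, sd, approx_normal, ten_percent_ok

# Made-up example: p = 0.5, n = 100, population of 5000
mean, sd, approx_normal, ten_percent_ok = phat_sampling_distribution(0.5, 100, 5000)
print(mean, sd, approx_normal, ten_percent_ok)
```

The example numbers are invented for illustration and do not come from the card set.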

what do subscripts denote when comparing 2 proportions/means

which group the statistic/parameter is from

shape, center and spread of an SRS of size n1 from population 1 with proportion of successes p1 and an independent SRS of size n2 from population 2 with proportion of successes p2

SHAPE- when n1p1, n1(1-p1), n2p2, and n2(1-p2) are all at least 10, the sampling distribution of p-hat1 - p-hat2 is approximately normal

CENTER- the mean of the sampling distribution is p1-p2. that is, the difference in sample proportions is an unbiased estimator of the difference in population proportions

SPREAD- the standard deviation of the sampling distribution of p-hat1 - p-hat2 is sqrt(p1(1-p1)/n1 + p2(1-p2)/n2) [on happy fun] as long as each sample is no more than 10% of its population
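
The shape check and the spread formula for the difference of two sample proportions can be sketched in Python as follows. The helper names `diff_props_sd` and `diff_props_normal_ok` are assumptions made for this example, and the numbers are invented.

```python
import math

def diff_props_sd(p1, n1, p2, n2):
    """Standard deviation of p-hat1 - p-hat2: sqrt(p1(1-p1)/n1 + p2(1-p2)/n2)."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def diff_props_normal_ok(p1, n1, p2, n2):
    """True when n1p1, n1(1-p1), n2p2 and n2(1-p2) are all at least 10."""
    return min(n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)) >= 10

# Made-up example: two populations with p1 = p2 = 0.5 and n1 = n2 = 100
center = 0.5 - 0.5                       # mean of the sampling distribution is p1 - p2
spread = diff_props_sd(0.5, 100, 0.5, 100)
print(center, spread, diff_props_normal_ok(0.5, 100, 0.5, 100))
```

Note that the variances add even though the proportions are subtracted, which is why the two terms under the square root are joined by a plus sign.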

CONFIDENCE INTERVALS FOR P1-P2:

-what independent condition gives us

-what normal condition gives us

-formula

-when independent condition is met: standard deviation becomes standard error (because we don't know parameters p1 or p2), using the same formula as the previous definition (on happy fun) with p-hat1 and p-hat2 in place of p1 and p2

-when normal condition is met: find critical value z* for the given confidence level

-formula: (p-hat1 - p-hat2) ± z*(standard error)
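
A minimal Python sketch of the interval described above, assuming success counts `x1` and `x2` out of `n1` and `n2`. The function name `two_prop_z_interval` is hypothetical. The critical value z* comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2: (p-hat1 - p-hat2) ± z* (standard error)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Standard error: the SD formula with sample proportions in place of p1, p2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value z* leaves (1 - confidence)/2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Made-up example: 60 successes of 100 versus 50 successes of 100
low, high = two_prop_z_interval(60, 100, 50, 100, confidence=0.95)
print(round(low, 4), round(high, 4))
```

The sketch does not check the random, normal, or independent conditions; those still have to be verified before the interval is trusted.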

Conditions for a 2-sample proportion z-interval

-RANDOM: the data are produced by a random sample of size n1 from population 1 and n2 from population 2 or by 2 groups of size n1 and n2 in a randomized experiment

-NORMAL: the counts of successes and failures in each sample or group are all at least 10

-INDEPENDENT: both the samples or groups themselves and the individual observations in each sample or group are independent. When sampling without replacement, check that the 2 populations are at least 10x bigger than the corresponding samples
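
The numeric parts of these conditions can be checked mechanically. The sketch below is one possible way to do it. The function name `check_two_prop_conditions` and the optional population-size parameters are assumptions for this example. The RANDOM condition depends on how the data were collected and cannot be checked from the counts.

```python
def check_two_prop_conditions(x1, n1, x2, n2, pop1_size=None, pop2_size=None):
    """Check the NORMAL and 10% parts of the 2-sample z-interval conditions."""
    # NORMAL: counts of successes and failures in each sample are all at least 10
    counts = [x1, n1 - x1, x2, n2 - x2]
    normal_ok = all(c >= 10 for c in counts)
    # INDEPENDENT (10% check): each population at least 10x its sample
    ten_percent_ok = None  # unknown unless both population sizes are given
    if pop1_size is not None and pop2_size is not None:
        ten_percent_ok = pop1_size >= 10 * n1 and pop2_size >= 10 * n2
    return {"normal": normal_ok, "ten_percent": ten_percent_ok}

# Made-up example: the second population (900) is too small for a sample of 100
result = check_two_prop_conditions(60, 100, 50, 100, pop1_size=5000, pop2_size=900)
print(result)
```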

what must you state when doing a 4-part problem?

-hypotheses: null & alternate

-conditions: data must meet all conditions (random, normal, independent)

-state which type of test/interval you are performing (2 sample/1 sample/paired data, z/t, test/interval)

-show adequate work/ equations

-conclusion in context

what do hypotheses look like for 2 sample tests?

Ho: P1-P2=hypothesized value

Ha: P1-P2>/</≠hypothesized value

general formula for a 2 sample test

(statistic - parameter)/(standard deviation of statistic)

((p-hat1 - p-hat2) - 0)/(standard deviation of statistic)

Pooled (combined) sample proportion equation

p-hatc = (successes in both samples)/(total individuals in both samples) = (X1+X2)/(n1+n2)

z = ((p-hat1 - p-hat2) - 0)/sqrt(p-hatc(1-p-hatc)/n1 + p-hatc(1-p-hatc)/n2)
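
The pooled test statistic can be sketched in Python as below, for the null hypothesis p1 - p2 = 0. The function name `two_prop_z_test` and the `alternative` labels ("greater", "less", "two-sided") are assumptions for this example. The P-value is computed from the standard normal distribution in `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled 2-sample z test for H0: p1 - p2 = 0. Returns (z, p_value)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled (combined) proportion: all successes over all individuals
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) / n1 + p_pooled * (1 - p_pooled) / n2)
    z = ((p1_hat - p2_hat) - 0) / se
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # Ha: p1 - p2 > 0
        p_value = 1 - cdf
    elif alternative == "less":       # Ha: p1 - p2 < 0
        p_value = cdf
    else:                             # Ha: p1 - p2 != 0 (two tails)
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

# Made-up example: 60 successes of 100 versus 50 successes of 100
z, p_value = two_prop_z_test(60, 100, 50, 100)
print(round(z, 3), round(p_value, 3))
```

Pooling is reasonable here because the null hypothesis says the two population proportions are equal, so both samples estimate the same value.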

when do you use a pooled test?

Only when conducting a 2-sample significance test for PROPORTIONS

How do you know if a test is one tailed or 2 tailed?

2-tailed when Ha has ≠. Otherwise (Ha has > or <): one-tailed

when deciding which test/ interval to use, ask yourself...

is this test...

one/two tailed?

proportions/means?

1-sample/2-sample/paired data?

use a z/t score?

shape, center and spread of difference of two means

SHAPE- the sampling distribution of xbar1-xbar2 is approximately normal if both population distributions are normal OR both samples are large (n1>30, n2>30)

CENTER- the mean of the sampling distribution is µ1-µ2, so xbar1-xbar2 is an unbiased estimator of µ1-µ2

SPREAD- σ(xbar1-xbar2)=sqrt(σ1^2/n1 + σ2^2/n2) as long as each sample is no more than 10% of its population

what can we do when the independent condition is met for a 2-sample t statistic?

we can find the standard deviation of xbar1-xbar2 (in this case the standard error, since we don't usually know the parameters σ1 and σ2)

SE(xbar1-xbar2)=sqrt(S1^2/n1 + S2^2/n2)

what can we do if the normal condition is met for 2 sample t statistic?

find the t-score:

t=((xbar1-xbar2)-(µ1-µ2))/sqrt(S1^2/n1 + S2^2/n2)
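
A minimal Python sketch of the 2-sample t statistic from summary statistics. The function name `two_sample_t` and the parameter `hypothesized_diff` (the value of µ1-µ2 under the null hypothesis) are assumptions for this example, and the numbers are invented.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, hypothesized_diff=0.0):
    """t = ((xbar1 - xbar2) - (mu1 - mu2)) / sqrt(s1^2/n1 + s2^2/n2)."""
    # Standard error of xbar1 - xbar2, using sample standard deviations
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / se

# Made-up example: xbar1 = 10, s1 = 3, n1 = 9 and xbar2 = 8, s2 = 4, n2 = 16
t = two_sample_t(10, 3, 9, 8, 4, 16)
print(round(t, 4))
```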

what is the conservative approach for determining degrees of freedom?

subtract 1 from each n and use the smaller result: df = the smaller of n1-1 and n2-1. A calculator uses a different (larger) df, which gives narrower confidence intervals
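
To see how the two choices of df compare, the sketch below computes the conservative df next to the Welch-Satterthwaite approximation. That calculators use the Welch-Satterthwaite formula is an assumption here, since the card does not say which formula the calculator applies. The function names are hypothetical.

```python
def conservative_df(n1, n2):
    """Conservative df: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite df (assumed here to be what technology reports)."""
    v1 = s1 ** 2 / n1   # estimated variance of xbar1
    v2 = s2 ** 2 / n2   # estimated variance of xbar2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Made-up example: s1 = 3, n1 = 9 and s2 = 4, n2 = 16
df_conservative = conservative_df(9, 16)
df_welch = welch_df(3, 9, 4, 16)
print(df_conservative, round(df_welch, 2))
```

A smaller df means a larger critical value t*, so the conservative approach produces a wider interval than the one based on the larger df.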

Confidence interval for 2 sample means

statistic ± critical value(standard error)

(xbar1-xbar2) ± t*(sqrt(S1^2/n1 + S2^2/n2)) - where t* is the critical value for confidence level C
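
The interval can be sketched in Python as below. Python's standard library has no t distribution, so this example assumes the critical value t* is looked up in a t table and passed in as `t_star`. The function name `two_sample_t_interval` is hypothetical and the data are invented.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Interval for mu1 - mu2: (xbar1 - xbar2) ± t* sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)   # standard error
    diff = xbar1 - xbar2                          # the statistic
    margin = t_star * se                          # critical value times SE
    return diff - margin, diff + margin

# Made-up example: n1 = 9, n2 = 16, so the conservative df is min(8, 15) = 8.
# From a t table, t* = 2.306 for 95% confidence with df = 8.
low, high = two_sample_t_interval(10, 3, 9, 8, 4, 16, t_star=2.306)
print(round(low, 3), round(high, 3))
```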

what are the conditions for 2 sample means procedures?

RANDOM- the data are produced by a random sample of size n1 from population 1 and n2 from population 2 or by 2 groups of size n1 and n2 in a randomized experiment

NORMAL- both population distributions are normal OR both sample groups sizes are large (n1>30, n2>30)

INDEPENDENT- both samples or groups themselves and the individual observations in each sample or group are independent. When sampling without replacement, check that the 2 populations are at least 10 times as large as the corresponding samples

what do the hypotheses look like for significance tests for difference of 2 means?

Ho: µ1-µ2=hypothesized value

Ha: µ1-µ2>/</≠hypothesized value

for the normal condition, what do you do if ...

-sample size<15

-sample size at least 15

-sample size>30

-sample size<15: use 2-sample t procedures if the data in both samples appear close to normal (no outliers/strong skewness)

-sample size at least 15: 2-sample t procedures can be used except in the presence of strong skewness or outliers

-sample size>30: 2-sample t procedures can be used even for clearly skewed data when both samples are large

should you use 2 sample t procedures on paired data?

NO

is it better to have equal sample sizes of 2 groups, or differing sample sizes?

equal

should you use pooled procedures for 2 sample t procedures?

no - only on proportions

can results be generalized to the larger population if randomization is not present?

no