AP Stat Ch. 8
Terms in this set (76)
sample mean x̄ is an unbiased estimator of
the population mean μ
point estimator
a statistic that provides an estimate of a population parameter
point estimate
the value of that statistic
must still check
the 10% condition when using the standard deviation σ/√n
BIG IDEA:
the sampling distribution of x̄ tells us how close to μ the sample mean x̄ is likely to be; statistical estimation turns this around and says how close to x̄ the unknown population mean μ is likely to be.
confidence interval
1) an interval: estimate ± margin of error
2) a confidence level C
margin of error
tells us how close the estimate tends to be to the unknown parameter in repeated random sampling
confidence level C
the overall success rate of the method for calculating the confidence interval -> in C% of all possible samples, the method would yield an interval that captures the true parameter
most common confidence level is
95%; 90% is the lowest we'd like to go
when using the ± 10 formula,
95% confidence
Confidence Level: 95% confident
95% of all possible samples of a given size from the population will result in an interval that captures the unknown parameter
Confidence Interval: C%
we are C% confident that the interval from ___ to ___ captures the actual value of [the population parameter in context]
the confidence level tells us how likely it is that the method we are using will produce an interval that captures the population parameter IF
we use it MANY TIMES
confidence level does NOT tell us
the PROBABILITY that a particular confidence interval captures the population parameter; the interval itself gives us a set of plausible values for the parameter
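A minimal simulation sketch of what "95% confident" means: it draws many samples from a made-up Normal population and counts how often the interval x̄ ± z*·σ/√n captures μ. The values of mu, sigma, n and the number of repetitions are assumptions chosen only for illustration.

```python
import random
from statistics import NormalDist, mean

def coverage_rate(mu=50.0, sigma=10.0, n=25, level=0.95, reps=2000, seed=1):
    """Fraction of simulated z intervals that capture the true mean mu."""
    rng = random.Random(seed)                        # fixed seed for repeatability
    z_star = NormalDist().inv_cdf((1 + level) / 2)   # critical value for this level
    margin = z_star * sigma / n ** 0.5               # same margin for every sample
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:   # did this interval capture mu?
            hits += 1
    return hits / reps

rate = coverage_rate()
print(f"{rate:.3f} of the intervals captured mu")
```

The printed rate should land close to 0.95: the confidence level describes the method over many samples, not any single interval.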
confidence intervals are statements about
PARAMETERS -> we believe that the population mean is somewhere between here and here
Calculating a Confidence Interval:
to estimate a population parameter use:
statistic ± (critical value) × (S.D. of the statistic)
**Statistic is the POINT ESTIMATOR for the parameter
user chooses the confidence level,
margin of error follows from this choice
we want:
high confidence and small margin of error
high confidence says that
our method almost always gives the correct answers
small margin of error says that the method is
pretty precise
Calculating Margin of Error:
(critical value) x (S.D. of the statistic)
***critical value tied directly to the confidence level:
greater confidence requires a larger critical value
margin of error shrinks when:
confidence level decreases AND/OR sample size "n" increases
TRADE-OFF -> for a fixed sample size, smaller margin of error =
decrease in confidence
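A short sketch of the trade-off, assuming a known standard deviation of 20 and a sample of size 16 (both hypothetical numbers): the margin of error grows with the confidence level and shrinks as n grows.

```python
from statistics import NormalDist

def margin_of_error(level, sd, n):
    """(critical value) x (S.D. of the statistic) for a sample mean."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return z_star * sd / n ** 0.5

sd = 20.0  # hypothetical population S.D.
me_90 = margin_of_error(0.90, sd, 16)
me_95 = margin_of_error(0.95, sd, 16)
me_99 = margin_of_error(0.99, sd, 16)
me_95_big_n = margin_of_error(0.95, sd, 64)   # four times the sample size

for label, me in [("90%, n=16", me_90), ("95%, n=16", me_95),
                  ("99%, n=16", me_99), ("95%, n=64", me_95_big_n)]:
    print(f"{label}: ME = {me:.2f}")
```

Quadrupling n only halves the margin of error, because n sits under a square root.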
Conditions for Constructing a Confidence Interval:
- Random
- Normal (sampling distribution of the STATISTIC is approximately Normal)
- Independent (sample size no more than 10% of the population)
SAMPLING:
must be an SRS!!!!!!!
-> using stratified/cluster sampling requires different methods
!!!!
the margin of error in a confidence interval covers only chance variation due to the random sampling/assignment -> it DOESN'T cover variation from things like undercoverage or nonresponse.
IMP. Properties of a Sampling Distribution of a Statistic:
- Shape: approximately NORMAL if np >= 10 and n(1 - p) >= 10
- Center: the mean is p -> the sample proportion p̂ is an unbiased estimator of the population proportion p
- Spread: S.D. = $\sqrt{p(1-p)/n}$, PROVIDED THAT the 10% condition is met
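A minimal simulation sketch of these properties, assuming p = 0.3 and n = 100 (illustrative values only). Each observation is generated independently, which mimics a population large enough that the 10% condition holds.

```python
import random
from statistics import mean, pstdev

def simulate_p_hats(p=0.3, n=100, reps=5000, seed=2):
    """Sample proportions from many simulated samples of size n."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

p, n = 0.3, 100
p_hats = simulate_p_hats(p, n)
center = mean(p_hats)                    # should be close to p
spread = pstdev(p_hats)                  # should be close to sqrt(p(1-p)/n)
theory_sd = (p * (1 - p) / n) ** 0.5

print(f"mean of p-hats = {center:.4f} (p = {p})")
print(f"S.D. of p-hats = {spread:.4f} (formula gives {theory_sd:.4f})")
```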
WE DON'T KNOW
p -> otherwise we wouldn't need to construct a confidence interval for it -> SO WE USE p̂
use np̂ and n(1 - p̂) now instead of np and n(1 - p)
same for the standard deviation formula
SD found using p̂ is a quantity called the
standard error of the sample proportion p̂ -> describes how close the sample proportion p̂ will be, on average, to the population proportion p in repeated SRSs of size n
Standard Error
when the standard deviation is estimated from the data, the result is called the standard error of the statistic
ON QUIZ MUST CHECK:
RANDOM (random sample?)
NORMAL (np rules)
INDEPENDENT (10%)
GETTING A Z* SCORE 
80% confidence?
1 - 0.8 = 0.2 -> 0.2/2 = 0.1
invNorm(0.1) = -1.28 -> z* = 1.28
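The same calculation can be sketched outside the calculator. This assumes Python's standard-library `NormalDist().inv_cdf` as a stand-in for invNorm; the function name `z_star` is made up for illustration.

```python
from statistics import NormalDist

def z_star(level):
    """Critical value z* for a confidence level given as a decimal."""
    tail = (1 - level) / 2                 # area in one tail, e.g. 0.1 for 80%
    return -NormalDist().inv_cdf(tail)     # invNorm gives a negative z; flip the sign

z80 = z_star(0.80)
z90 = z_star(0.90)
z95 = z_star(0.95)
z99 = z_star(0.99)
print(round(z80, 3), round(z90, 3), round(z95, 3), round(z99, 3))
```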
Margin of Error/Standard Error with proportions: the interval is equal to
$\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
CONFIDENCE INTERVAL:
p̂ ± the margin of error
Margin of Error ME in the confidence interval for p is
$ME = z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
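A minimal sketch of the one-sample z interval for a proportion, including the Normal check. The counts (120 successes in a sample of 400) are hypothetical, and the Random and 10% conditions still have to be checked from how the data were collected.

```python
from statistics import NormalDist

def proportion_interval(successes, n, level=0.95):
    """z interval for p: p-hat +/- z* x standard error."""
    p_hat = successes / n
    # Normal condition, checked with p-hat because p is unknown
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("Normal condition not met")
    z = NormalDist().inv_cdf((1 + level) / 2)
    se = (p_hat * (1 - p_hat) / n) ** 0.5   # standard error of p-hat
    me = z * se                             # margin of error
    return p_hat - me, p_hat + me

low, high = proportion_interval(120, 400)   # hypothetical data
print(f"95% interval: {low:.4f} to {high:.4f}")
```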
In the ME formula, z* is the standard Normal CRITICAL VALUE for the level of confidence we want. Because the margin of error INVOLVES THE SAMPLE PROPORTION OF SUCCESSES p̂, we HAVE to guess the latter value
when choosing n
There are 2 ways to guess that latter value when choosing n ->
1) Use a guess for p̂ based on a pilot study or on past experience (Italians example, 0.75)
OR
2) Use p̂ = 0.5 as the guess -> the margin of error ME is largest when p̂ = 0.5, so this guess is conservative in the sense that if we get any other p̂ when we do our study, we will get a margin of error smaller than planned -> plan for big; if we get smaller, great!
Once you have guessed what p̂ is, use the formula
$z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \le ME$
and solve for n
->> ** This ensures that the actual margin of error will always be less than or equal to the planned ME if you use the guess p̂ = 0.5
smaller margins of error call for
LARGER SAMPLES
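A sketch of solving the inequality for n, which gives n >= (z*/ME)^2 × p̂(1 - p̂), rounded up. The target margin of error of 0.03 is a made-up planning value; 0.5 is the conservative guess and 0.75 stands in for a pilot-study guess.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(me, level=0.95, p_guess=0.5):
    """Smallest n with z* x sqrt(p(1-p)/n) <= me, for a guessed p-hat."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((z_star / me) ** 2 * p_guess * (1 - p_guess))

n_conservative = sample_size_for_proportion(0.03)            # guess p-hat = 0.5
n_pilot = sample_size_for_proportion(0.03, p_guess=0.75)     # guess from past experience
print(n_conservative, n_pilot)
```

The conservative guess asks for the larger sample, so any other p̂ in the real study yields a margin of error no bigger than planned.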
As long as the 3 conditions are met, a level C confidence interval for μ is
$\bar{x} \pm z^* \dfrac{\sigma}{\sqrt{n}}$
in most situations, if we don't know the mean, we likely
don't know the S.D. either, but we can use the one-sample z interval for a population mean to estimate the sample size needed to achieve a specific margin of error.
you can arrange to have high confidence and small margin of error by
taking enough observations
ME of the confidence interval for the population mean μ is
$z^* \dfrac{\sigma}{\sqrt{n}}$
Find the sample size n for a desired ME by:
1) substitute the value of z* for the desired confidence level
2) use an estimate for the population S.D. σ
3) set the expression for ME less than or equal to the specified margin of error
4) solve for n
-> *same process as in 8.2*
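The four steps can be sketched as follows. Solving z*·σ/√n <= ME for n gives n >= (z*·σ/ME)^2, rounded up. The guess of 5 for σ and the target margins are hypothetical planning values.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(me, sigma_guess, level=0.95):
    """Smallest n with z* x sigma / sqrt(n) <= me."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)      # step 1: z* for the level
    return math.ceil((z_star * sigma_guess / me) ** 2)  # steps 2-4: plug in and solve

n_needed = sample_size_for_mean(me=1.0, sigma_guess=5.0)
n_half = sample_size_for_mean(me=0.5, sigma_guess=5.0)   # half the margin of error
print(n_needed, n_half)
```

Always round up: rounding down would leave the margin of error slightly larger than the target.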
it is the size of the SAMPLE that influences margin of error, NOT the
size of the population
when we know the S.D. σ of the population (and the sampling distribution of x̄ is close to Normal), we can find probabilities by standardizing
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
t distribution
symmetric with a single peak at 0, but with more area in the tails than the standard Normal curve (much more when df is small)
-> the t statistic says how far x̄ is from its mean μ in standard deviation units
t statistic
$t = \dfrac{\bar{x} - \mu}{s_x/\sqrt{n}}$
specify a particular t distribution by giving its
degrees of freedom (df)
to find df ->
subtract 1 from the sample size n
df = n - 1
-> the t statistic will have approximately a t(n - 1) distribution as long as the sampling distribution of x̄ is close to Normal.
when we don't know what σ (the population S.D.) is, we estimate it by the
sample standard deviation s_x
we then estimate the standard deviation of the sampling distribution by
$s_x/\sqrt{n}$
Standard Error of the Sample Mean
$s_x/\sqrt{n}$
Standard Error of the Sample Mean describes
how far x̄ will be from μ on average in REPEATED SRSs of size n
t interval for a Population Mean
$\bar{x} \pm t^* \dfrac{s_x}{\sqrt{n}}$
-> ***population distribution must be Normal
or n >= 30
10% condition applies
random
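A minimal sketch of the t interval calculation. The ten data values are made up, and the critical value t* = 2.262 (95% confidence, df = 9) is assumed to come from a t table or a calculator rather than being computed here.

```python
from statistics import mean, stdev

def t_interval(data, t_star):
    """x-bar +/- t* x (s_x / sqrt(n)); t_star must match df = n - 1."""
    n = len(data)
    x_bar = mean(data)
    s_x = stdev(data)            # sample S.D. (divides by n - 1)
    se = s_x / n ** 0.5          # standard error of the sample mean
    me = t_star * se             # margin of error
    return x_bar - me, x_bar + me

data = [12, 15, 14, 10, 13, 16, 11, 14, 13, 12]   # hypothetical sample, n = 10
low, high = t_interval(data, t_star=2.262)         # table value for 95%, df = 9
print(f"95% t interval: {low:.3f} to {high:.3f}")
```

With a sample this small, graph the data first to check for outliers or strong skewness before trusting the interval.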
inference for PROPORTIONS uses
z*
inference for MEANS (when we don't know the S.D.) uses
t*
t* in calculator:
95% confidence?
100% - 95% = 5% = 0.05 -> 0.05/2 = 0.025
1 - 0.025 = 0.975
invT(1 - tail area, df)
InvT (0.975, df)
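Python's standard library has no invT, so the sketch below approximates it numerically: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names and the numerical approach are choices made for this illustration, not what the calculator does internally.

```python
import math
from statistics import NormalDist

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0: 0.5 plus the Simpson's-rule area on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_star(level, df):
    """Critical value with area (1 + level) / 2 to its left, like invT."""
    target = (1 + level) / 2          # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:     # widen the bracket until it contains t*
        hi *= 2
    for _ in range(50):               # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

dfs = (4, 9, 29, 99)
values = [t_star(0.95, df) for df in dfs]
z_95 = NormalDist().inv_cdf(0.975)
for df, value in zip(dfs, values):
    print(f"df = {df:3d}: t* = {value:.3f}")
print(f"z* = {z_95:.3f}")
```

The t* values fall toward z* as df grows, which is why a z interval behaves like a t interval with "infinite" degrees of freedom.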
the stated confidence level of a one-sample t interval for μ is
exactly correct when the population distribution is exactly Normal, but no population distribution is exactly Normal
procedures that are not strongly affected when a condition for using them is violated are called
robust
robust procedures
an inference procedure is called robust if the probability calculations involved in that procedure remain fairly accurate when a condition for using the procedure is violated
for confidence intervals, "robust" means that
the stated confidence level is still pretty accurate
if it's not robust, and says 95%,
then the actual capture rate might be very different from 95%
if there are OUTLIERS in the sample,
the population may not be Normal
!!!!!!!!t procedures are NOT robust against outliers, because!!!!!!
!!!!!!!!x̄ and s_x are not resistant to outliers!!!!!!!!!
t procedures are QUITE ROBUST against non-Normality of the population so long as
there are NO OUTLIERS and the distribution is NOT STRONGLY SKEWED
larger samples improve the accuracy of critical values from t distributions
when the population is NOT normal
this is true because:
1) the sampling distribution of the sample mean x̄ from a large sample abides by the CLT (it is close to Normal)
2) as the sample size n grows, the sample SD s_x will be an accurate estimate of σ whether or not the population has a Normal distribution
One-Sample t Procedures: THE NORMAL CONDITION
- sample size is less than 15: CHECK HISTOGRAM/RUN VAR STATS; if the data are roughly symmetric, single-peaked, with no outliers -> use t
- sample size is at least 15: t procedures can be used except in the presence of outliers or strong skewness
- large samples (n >= 30): t procedures can be used even for clearly skewed distributions, because the CLT applies
***If your sample would give biased estimates,
don't compute the t interval
***if the data you have IS the entire population,
you can find the true parameter, so there is no need to calculate a confidence interval
In general:
the condition that a sample is RANDOM is more important than it being NORMAL, then independent.
the t* critical value for any specific confidence level DECREASES
as the degrees of freedom INCREASE!
a z interval is equivalent to a t interval with
"infinite" degrees of freedom