Sampling distribution of x̄ or diff. of means: Follows z if σ is known (rare), otherwise t.
Sampling distribution of p̂: Really binomial, but almost normal if pop. is large,
np ≥ 10, and nq ≥ 10.
Sampling distribution of difference of proportions: Almost normal if pops. are large,
n₁p₁ ≥ 5, n₁q₁ ≥ 5, n₂p₂ ≥ 5, n₂q₂ ≥ 5.
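The rules of thumb for proportions can be checked mechanically. Here is a minimal Python sketch. The function names and the sample sizes in the usage lines are made up for illustration, and the thresholds (10 for one proportion, 5 for each count in the two-sample case) are simply the ones quoted in these notes. The "pop. is large" condition is not checked.

```python
def normal_ok_one_prop(n, p, threshold=10):
    """True if both np and nq reach the threshold, where q = 1 - p."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold


def normal_ok_two_props(n1, p1, n2, p2, threshold=5):
    """True if all four expected counts n1p1, n1q1, n2p2, n2q2 reach the threshold."""
    counts = (n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2))
    return all(count >= threshold for count in counts)


print(normal_ok_one_prop(100, 0.5))           # True: np = nq = 50
print(normal_ok_one_prop(50, 0.1))            # False: np is only about 5
print(normal_ok_two_props(40, 0.5, 30, 0.1))  # False: n2p2 is only about 3
```

In practice p is unknown, so the check is usually run with p̂ (or a hypothesized p) plugged in.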
Sampling distrib. of the χ² statistic: Follows the χ² distribution, with df given either by
(# of bins - 1) for g.o.f., or by (rows - 1)(cols. - 1) for 2-way tables.
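The two df rules are simple enough to write as one-liners. This is only an illustrative sketch; the function names are invented here, and the bin and table sizes in the usage lines are arbitrary examples.

```python
def gof_df(num_bins):
    """Degrees of freedom for a goodness-of-fit test: (# of bins - 1)."""
    return num_bins - 1


def two_way_df(rows, cols):
    """Degrees of freedom for a 2-way table: (rows - 1)(cols - 1)."""
    return (rows - 1) * (cols - 1)


print(gof_df(6))         # 5, e.g. a six-sided die tested for fairness
print(two_way_df(3, 4))  # 6, for a table with 3 rows and 4 columns
```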
Law of large numbers.
CORRECT: As n → ∞, p̂ approaches p. (Sometimes stated as "p̂ → p as n → ∞.")
WRONG: If p̂ < p, then the proportion of successes will start to increase until we "catch up." (Or, if p̂ > p, the proportion of successes will start to decrease until we are "back down to the correct value.") These are both wrong, because what really happens is that the effect of any finite collection of observations becomes diluted as n → ∞. A coin has no memory, no desire to set things right, and no ability to iron out past discrepancies. Nevertheless, the proportion of heads, even if the coin is biased, will over time approach whatever the true probability is.
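The dilution idea can be seen numerically. The sketch below assumes a hypothetical unlucky start (0 heads in the first 10 flips of a fair coin) and then asks what the overall proportion looks like after many more independent flips. The function names, the starting streak, and the seed are illustrative choices, not part of the original notes.

```python
import random


def expected_proportion(initial_heads, initial_flips, extra_flips, p=0.5):
    """Expected overall proportion of heads after extra_flips more independent flips.

    The coin does not compensate: each new flip is still heads with probability p.
    """
    return (initial_heads + p * extra_flips) / (initial_flips + extra_flips)


def simulate_proportion(initial_heads, initial_flips, extra_flips, p=0.5, seed=0):
    """Simulate the extra flips and return the overall proportion of heads."""
    rng = random.Random(seed)
    new_heads = sum(rng.random() < p for _ in range(extra_flips))
    return (initial_heads + new_heads) / (initial_flips + extra_flips)


# After a 0-for-10 start, the expected shortfall of 5 heads never goes away,
# but its effect on the proportion shrinks as the number of flips grows.
for extra in (90, 990, 99_990):
    print(extra + 10, expected_proportion(0, 10, extra))

print(simulate_proportion(0, 10, 200_000))
```

The expected count of heads stays 5 below "even" forever; it is the denominator growing, not any catching up, that pulls the proportion toward p.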
Two-tailed, since if the experiment goes the wrong way (as sometimes occurs in science), there will still be the possibility of making an inference. All decisions regarding methodology are supposed to be made before any data-gathering occurs. (Otherwise, people could say that the methodology was tailored toward achieving a low P-value. In theory, the experiment should be repeatable, so that anyone following the same methodology would likely reach a similar conclusion.)
The one-tailed/two-tailed decision should be based on the research question posed. If the researcher is wondering whether there is "a difference," direction unspecified, then plan for a two-tailed test. If the researcher is wondering whether treatment X increases hair strength, decreases yellowness of teeth, or whatever, then plan for a one-tailed test.
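To see what is at stake in the one-tailed versus two-tailed choice, here is a small sketch that computes both kinds of P-value from a test statistic. It assumes, purely for simplicity, that the statistic follows the standard normal (z) distribution; the function name and the example values ±1.96 are illustrative.

```python
from statistics import NormalDist


def p_values(z):
    """Return upper-tailed, lower-tailed, and two-tailed P-values for a z statistic."""
    std_normal = NormalDist()          # mean 0, standard deviation 1
    lower = std_normal.cdf(z)          # P(Z <= z)
    upper = 1 - lower                  # P(Z >= z)
    two_tailed = 2 * min(lower, upper)
    return {"upper": upper, "lower": lower, "two_tailed": two_tailed}


# Result in the predicted direction: the upper-tailed P-value is about half
# the two-tailed one.
print(p_values(1.96))

# Result in the "wrong" direction: the upper-tailed test gives a P-value near 1,
# so no inference is possible, while the two-tailed test can still detect it.
print(p_values(-1.96))
```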
It is possible to write a true sentence using the words probability and confidence interval. However, it is also very easy to make an error along the way. That is why it is much better to say, "We are 95% confident that the true proportion of voters favoring Smedley is between 48% and 54%," and to avoid the word probability altogether. Probability is a technical term meaning long-run relative frequency, and it should not be used loosely, the way laypeople often use it.
It would be correct to say, "If we repeatedly generated confidence intervals with samples of this size and with m.o.e. of 3%, then the probability that a future confidence interval will bracket the true proportion of voters favoring candidate Smedley is 95%; that is, 95% of the confidence intervals generated by this process will bracket the true value." However, you cannot make a probability statement about a confidence interval once it has been generated, because then you are not making a statement about the process (which is legitimate), but rather about this one-shot confidence interval. There is no "long run" in a one-shot confidence interval!
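The "long run" in that statement can be simulated. The sketch below assumes a hypothetical true proportion of 0.51 and a sample size of 1067 (which gives an m.o.e. of roughly 3% at 95% confidence), builds many intervals of the form p̂ ± 1.96·sqrt(p̂q̂/n), and reports the fraction that bracket the true value. The function name, the true proportion, the number of repetitions, and the seed are all illustrative assumptions.

```python
import math
import random


def coverage(true_p=0.51, n=1067, reps=1000, z=1.96, seed=1):
    """Fraction of simulated 95% confidence intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        moe = z * math.sqrt(p_hat * (1 - p_hat) / n)   # margin of error
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / reps


print(coverage())  # should land close to 0.95
```

Each individual interval either contains 0.51 or it does not; the 95% describes the process that generated them.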
Bias = any situation in which the expected value of a statistic does not equal the parameter being estimated. Selection bias refers to a methodology that produces samples that are systematically different from the population in a way that causes a parameter to be systematically underestimated or overestimated. An SRS is not biased; although an SRS often fails to match the population, the differences are random differences, not systematic differences.
"Systematic" means that there are methodological flaws that may become evident over a period of time, because the flaws are built into the design of the process. For example, if we try to poll the STA parent body on the question, "How many days per year does your son spend traveling?" we will get a statistic that is biased on the high side if we use an SRS of all parents. (That is because students with stepparents, who may well travel more than the average, will be more likely to have a parent chosen as part of the SRS.) If the SRS were based on students instead of parents, the question should be able to avoid selection bias.
Common types of bias include selection bias (undercoverage or overcoverage), response bias (a.k.a. lying), nonresponse bias, voluntary response bias, hidden bias, experimenter bias, and wording of the question.
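The parent-poll example of selection bias can be worked through with made-up numbers. The sketch below assumes a hypothetical student body in which 80 students have 2 parents and travel 10 days per year, while 20 students have 4 parents and stepparents and travel 30 days per year. These figures are invented for illustration only. In an SRS of parents, a student with k parents is k times as likely to be represented, so the expected value of the sample mean is a parent-weighted average rather than the true per-student mean.

```python
# Hypothetical data: (number of parents/stepparents, travel days per year).
students = [(2, 10)] * 80 + [(4, 30)] * 20


def true_mean(students):
    """Mean travel days per student: the parameter we want to estimate."""
    return sum(days for _, days in students) / len(students)


def expected_parent_srs_mean(students):
    """Expected sample mean when the SRS is drawn from parents, not students.

    Each parent is equally likely to be chosen, so each student's answer
    is weighted by that student's number of parents.
    """
    total_parents = sum(k for k, _ in students)
    return sum(k * days for k, days in students) / total_parents


print(true_mean(students))                 # 14.0
print(expected_parent_srs_mean(students))  # about 16.67, biased on the high side
```

Since the expected value of the statistic (about 16.67) does not equal the parameter (14), the parent-based design is biased by the definition given above; an SRS of students would weight every student equally.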