# AP Statistics Exam Review - Concepts & Vocabulary

Interpret Standard Deviation
Standard deviation measures spread by giving the "typical" or "average" distance of the observations (CONTEXT) from their (CONTEXT) mean.
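As a rough illustration of this interpretation, the sketch below computes a sample mean and standard deviation and drops them into the sentence template. The quiz scores are made-up data, and the context wording ("quiz scores", "points") is only an assumed example.

```python
from statistics import mean, stdev

# Hypothetical quiz scores (made-up data for illustration).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

center = mean(scores)   # sample mean
spread = stdev(scores)  # sample standard deviation (divides by n - 1)

# Fill the interpretation template with the context and the numbers.
print(
    f"The quiz scores typically vary by about {spread:.2f} points "
    f"from their mean of {center} points."
)
```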
Outlier Rule
Linear Transformations
Describe the Distributions OR Compare the Distributions
SOCS!
Only discuss outliers if there are obvious outliers present. Be sure to address SCS in context!

If it says "Compare"
YOU MUST USE comparison phrases like "is greater than" or "is less than" for Center & Spread
SOCS
Shape:
Skewed Left (Mean < Median)
Skewed Right (Mean > Median)
Roughly Symmetric (Mean ≈ Median)

Outliers:
Discuss them if there are obvious ones

Center:
Mean or Median

Spread:
Range, IQR, or Standard Deviation
(If using Mean for center, then use st. dev. for spread)
(If using Median for center, then use IQR for spread)

NOTE:
Also be on the lookout for gaps, clusters, or other unusual features of the data set. Make observations of these!
Using "Normalcdf" and "InvNorm" (Calculator Tips)
Interpret a z-score
What is an Outlier?
When given 1-Variable Data:
An outlier is any value that falls more than 1.5(IQR) above Q3 or below Q1

When looking at 2-Variable Data (Regression Line):
Any value that falls outside the pattern of the rest of the data.
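For 1-variable data, the 1.5(IQR) rule can be sketched as follows. This assumes the common classroom convention that Q1 and Q3 are the medians of the lower and upper halves of the sorted data (leaving out the overall median when n is odd); other software may compute quartiles slightly differently. The function names and the data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data.

    Assumption: the overall median is left out of both halves when n is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(upper)

def find_outliers(values):
    """Return every value more than 1.5(IQR) below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Made-up data set with one suspiciously large value.
scores = [2, 4, 5, 5, 6, 7, 8, 9, 30]
print(quartiles(scores))
print(find_outliers(scores))
```

Here Q1 = 4.5 and Q3 = 8.5, so IQR = 4 and the fences sit at -1.5 and 14.5; only 30 falls outside them.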
Interpret LSRL Slope "b"
Interpret LSRL y-intercept "a"
Interpret r^2
Interpret r
Interpret LSRL "SEb"
Interpret LSRL "s"
Interpret LSRL "ŷ"
Extrapolation
Using a LSRL to predict values far outside the range of explanatory-variable (x) values in the data.

Can lead to ridiculous conclusions if the current linear trend does not continue.
Interpreting a Residual Plot
What is a Residual?
Residual = y - ŷ

A residual measures the difference between the actual (observed) y-value in a scatterplot and the y-value that is predicted by the LSRL using its corresponding x-value.

In the calculator: L3 = L2 - Y1(L1)
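The calculator step can be mirrored in a few lines of code. This is only a sketch: the line ŷ = 10 + 2x and the data lists are hypothetical (the line is not fitted to these points), and `l1`, `l2`, `l3` simply imitate the calculator's list names.

```python
def predict(a, b, x):
    """Predicted value y-hat = a + b*x from an LSRL with intercept a and slope b."""
    return a + b * x

def residual(a, b, x, y):
    """Residual = observed y minus predicted y-hat."""
    return y - predict(a, b, x)

# Hypothetical LSRL and made-up data, for illustration only.
a, b = 10, 2
l1 = [1, 2, 3]     # explanatory values (x)
l2 = [13, 13, 18]  # observed responses (y)

# Same idea as L3 = L2 - Y1(L1) on the calculator.
l3 = [residual(a, b, x, y) for x, y in zip(l1, l2)]
print(l3)
```

A positive residual means the point lies above the line (the line underpredicts); a negative residual means the point lies below it.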
Sampling Techniques
Experimental Designs
1. CRD (Completely Randomized Design) -
All experimental units are allocated at random among all treatments.

2. RBD (Randomized Block Design) -
Experimental units are put into homogeneous blocks. The random assignment of the units to the treatments is carried out separately within each block.

3. Matched Pairs -
A form of blocking in which each subject receives both treatments in a random order or the subjects are matched in pairs as closely as possible and one subject in each pair receives each treatment, determined at random.
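The difference between a CRD and an RBD is where the randomization happens. The sketch below is one possible way to carry out each assignment; the plots, block labels, treatment names, and function names are all hypothetical, and the seed is fixed only so the run is repeatable.

```python
import random

def completely_randomized(units, treatments, rng):
    """CRD: shuffle all units, then deal them out to the treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

def randomized_block(blocks, treatments, rng):
    """RBD: carry out a separate random assignment inside each block."""
    assignment = {}
    for units_in_block in blocks.values():
        assignment.update(completely_randomized(units_in_block, treatments, rng))
    return assignment

rng = random.Random(7)  # fixed seed for a repeatable illustration
treatments = ["new fertilizer", "old fertilizer"]
plots = [f"plot_{i}" for i in range(1, 9)]

# CRD: all 8 plots are randomized together.
crd = completely_randomized(plots, treatments, rng)

# RBD: plots are first grouped into homogeneous blocks, then randomized per block.
blocks = {"sunny": plots[:4], "shady": plots[4:]}
rbd = randomized_block(blocks, treatments, rng)
```

In the blocked version every block ends up with the same number of plots on each treatment, which a CRD does not guarantee.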
Goal of Blocking
Benefit of Blocking
The goal of blocking is to create groups of homogeneous experimental units.

The benefit of blocking is the reduction of the effect of variation within the experimental units (CONTEXT).
Advantage of using a Stratified Random Sample Over an SRS
Stratified random sampling guarantees that each of the strata will be represented. When strata are chosen properly, a stratified random sample will produce better (less variable/more precise) information than an SRS of the same size.
Experiment or Observational Study?
A study is an experiment only if researchers impose a treatment upon the experimental units.

In an observational study, researchers observe individuals and record variables of interest but make no attempt to influence the responses.
Does _____ Cause _____ ?
Association is NOT Causation!

An observed association, no matter how strong, is not evidence of causation. Only a well-designed, controlled experiment can lead to conclusions of cause and effect.
SRS
An SRS (simple random sample) is a sample taken in such a way that every set of n individuals has an equal chance to be the sample actually selected.
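A minimal sketch of drawing an SRS in code, assuming the population is available as a list. `random.sample` draws without replacement, so every set of n individuals is equally likely to be chosen. The class roster and the helper name are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals so that every set of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: a class roster of 30 students.
students = [f"student_{i}" for i in range(1, 31)]
sample = simple_random_sample(students, 5, seed=42)
print(sample)
```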
Why use a control group?
A control group gives the researchers a comparison group to be used to evaluate the effectiveness of the treatment(s). (CONTEXT)

(gauge the effect of the treatment compared to no treatment at all)
Complementary Events
P(at least one)
P(at least one) = 1 - P(None)

Example:
P(at least one 6 in three rolls)
P(at least one 6) = 1 - P(No Sixes)
= 1 - (5/6)^3
= 0.4213
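The same complement calculation can be checked in code. This sketch assumes the trials are independent with the same success probability on each one (true for repeated rolls of a fair die); the function name is hypothetical.

```python
from fractions import Fraction

def p_at_least_one(p_success, n_trials):
    """P(at least one success) = 1 - P(no successes), for independent trials."""
    return 1 - (1 - p_success) ** n_trials

# At least one 6 in three rolls of a fair die.
p = p_at_least_one(Fraction(1, 6), 3)
print(p, round(float(p), 4))
```

Using `Fraction` keeps the answer exact (91/216) before rounding to 0.4213.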
Two Events are Independent If...
Interpreting Probability
The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.

Probability is a long-term relative frequency.
Interpreting Expected Value/Mean
The mean/expected value of a random variable is the long-run average outcome of a random phenomenon carried out a very large number of times.
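A short simulation can make the "long-run average" idea concrete. The sketch below uses a fair six-sided die as an assumed example: it computes the expected value exactly, then averages a large number of simulated rolls. The seed and the number of rolls are arbitrary choices.

```python
import random
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]

# Expected value: each outcome weighted by its probability (1/6 for a fair die).
expected_value = sum(x * Fraction(1, 6) for x in outcomes)

# Long-run average: the mean of many simulated rolls.
rng = random.Random(2024)  # fixed seed so the run is repeatable
rolls = [rng.choice(outcomes) for _ in range(100_000)]
long_run_average = sum(rolls) / len(rolls)

print(float(expected_value), long_run_average)
```

The simulated average should land close to the expected value of 3.5, and it tends to get closer as the number of rolls grows.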
Mean and Standard Deviation of a Discrete Random Variable
Mean and Standard Deviation of a Difference of Two Random Variables
Mean and Standard Deviation of a Sum of Two Random Variables
Binomial Distribution (Conditions)
Geometric Distribution (Conditions)
Binomial Distribution (Calculator Usage)
Mean and Standard Deviation of a Binomial Random Variable
Why Large Samples Give More Trustworthy Results...
(When collected appropriately)
The Sampling Distribution of the Sample Mean
(Central Limit Theorem)
Unbiased Estimator
Bias
The systematic favoring of certain outcomes due to a flawed sample selection, poor question wording, undercoverage, nonresponse, etc.

Bias deals with the center of a sampling distribution being "off"!
Explain P-value
Can we generalize the results to the population of interest?
Finding the Sample Size (for a given margin of error)
Carrying out a Two-Sided Test from a Confidence Interval
4-Step Process: Confidence Intervals
4-Step Process: Significance Tests
Interpreting a Confidence Interval
(Not a Confidence Level)
Interpreting a Confidence Level
(The Meaning of 95% Confident)
Paired t-test:
Phrasing Hints, Ho and Ha, Conclusion
Two-Sample t-test:
Phrasing Hints, Ho and Ha, Conclusion
Type I Error,
Type II Error, &
Power
Factors that Affect Power
Inference for Mean: Conditions
Inference for Proportions: Conditions
Chi-Square Tests
Goodness of Fit:
Use to test the distribution of one group or sample as compared to a hypothesized distribution.

Homogeneity:
Use when you have a sample from 2 or more independent populations or 2 or more groups in an experiment. Each individual must be classified based upon a single categorical variable.

Association/Independence:
Use when you have a single sample from a single population. Individuals in the sample are classified by two categorical variables.
Chi-Square Tests: df and Expected Counts
Inference for Counts (Chi-Squared Tests): Conditions
Inference for Regression: Conditions