Interpret Standard Deviation

Standard deviation measures spread by giving the "typical" or "average" distance that the observations (CONTEXT) are away from their (CONTEXT) mean.
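
A minimal sketch of the calculation behind this interpretation, using Python's standard `statistics` module. The quiz scores are made-up illustration data, not from any real data set.

```python
import statistics

# Hypothetical data: quiz scores for five students (made up for illustration).
scores = [70, 80, 85, 90, 100]

mean = statistics.mean(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

# Interpretation template: "The scores are typically about s points away from the mean."
print(f"mean = {mean}, s = {sample_sd:.2f}")
```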

Outlier Rule

Linear Transformations

Describe the Distributions OR Compare the Distributions

SOCS!

Shape, Outlier, Center, Spread

Only discuss outliers if obvious outliers are present. Be sure to address Shape, Center, and Spread (SCS) in context!

If it says "Compare"

YOU MUST USE comparison phrases like "is greater than" or "is less than" for Center & Spread

SOCS

Shape:

Skewed Left (Mean < Median)

Skewed Right (Mean > Median)

Roughly Symmetric (Mean ≈ Median)

Outliers:

Discuss them if there are obvious ones

Center:

Mean or Median

Spread:

Range, IQR, or Standard Deviation

(If using Mean for center, then use st. dev. for spread)

(If using Median for center, then use IQR for spread)

NOTE:

Also be on the lookout for gaps, clusters, or other unusual features of the data set. Make observations of these!
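
As a rough illustration of the mean-versus-median shape hints above, the sketch below compares the two on a made-up data set. This comparison is only a hint; a graph of the data should still be checked.

```python
import statistics

# Hypothetical right-skewed data (values are made up for illustration).
data = [1, 2, 2, 3, 3, 3, 4, 5, 9, 14]

mean = statistics.mean(data)
median = statistics.median(data)

# Compare mean and median to get a hint about the direction of skew.
if mean > median:
    shape_hint = "mean > median: consistent with right skew"
elif mean < median:
    shape_hint = "mean < median: consistent with left skew"
else:
    shape_hint = "mean = median: consistent with rough symmetry"

print(mean, median, shape_hint)
```

Because this data set looks skewed, the median and IQR would be the natural pair for center and spread.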

Using "Normalcdf" and "InvNorm" (Calculator Tips)

Interpret a z-score

What is an Outlier?

When given 1-Variable Data:

An outlier is any value that falls more than 1.5 × IQR above Q3 or below Q1.

When looking at 2-Variable Data (Regression Line):

Any value that falls outside the pattern of the rest of the data.
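
The 1.5 × IQR rule for 1-variable data can be sketched as below. The helper names `quartiles` and `find_outliers` are hypothetical, and the quartile convention (medians of the lower and upper halves) is an assumption; other software may compute quartiles slightly differently.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when n is odd, the overall median is left out of both halves.
    Requires at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return statistics.median(lower), statistics.median(upper)

def find_outliers(values):
    """Return the values more than 1.5 * IQR below Q1 or above Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data (made up for illustration).
data = [1, 2, 2, 3, 3, 3, 4, 5, 9, 14]
print(quartiles(data))      # Q1 and Q3
print(find_outliers(data))  # values beyond the fences
```

Here Q1 = 2 and Q3 = 5, so the fences are at -2.5 and 9.5, and only 14 is flagged.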

Interpret LSRL Slope "b"

Interpret LSRL y-intercept "a"

Interpret r^2

Interpret r

Interpret LSRL "SEb"

Interpret LSRL "s"

Interpret LSRL "ŷ"

Extrapolation

Using an LSRL to predict values far outside the domain of the explanatory variable.

Can lead to ridiculous conclusions if the current linear trend does not continue.

Interpreting a Residual Plot

What is a Residual?

Residual = y - ŷ

A residual measures the difference between the actual (observed) y-value in a scatterplot and the y-value that is predicted by the LSRL using its corresponding x-value.

In the calculator: L3 = L2 - Y1(L1)
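
The calculator step above (L3 = L2 - Y1(L1)) can be mirrored in Python. This is a sketch under stated assumptions: the (x, y) values are made up, and the line is written as ŷ = a + bx with the usual least-squares formulas for b and a.

```python
# Hypothetical (x, y) data, made up for illustration.
x = [1, 2, 3, 4, 5]  # plays the role of L1
y = [2, 4, 5, 4, 5]  # plays the role of L2

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope b and intercept a for y-hat = a + b*x.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual = observed y minus predicted y (plays the role of L3).
predicted = [a + b * xi for xi in x]
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]

print(a, b)
print(residuals)
```

A positive residual means the point lies above the line; the residuals of a least-squares line sum to zero, up to rounding.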

Sampling Techniques

Experimental Designs

1. CRD (Completely Randomized Design) -

All experimental units are allocated at random among all treatments.

2. RBD (Randomized Block Design) -

Experimental units are put into homogeneous blocks. The random assignment of the units to the treatments is carried out separately within each block.

3. Matched Pairs -

A form of blocking in which each subject receives both treatments in a random order or the subjects are matched in pairs as closely as possible and one subject in each pair receives each treatment, determined at random.
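
One way to picture the difference between a completely randomized design and a randomized block design is to sketch the random assignment itself. The function names, unit labels, and block names below are hypothetical, chosen only for illustration.

```python
import random

def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal them out to the treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

def randomized_block(blocks, treatments, rng):
    """Carry out a separate random assignment within each block."""
    return {
        name: completely_randomized(units, treatments, rng)
        for name, units in blocks.items()
    }

rng = random.Random(0)  # fixed seed so the sketch is reproducible
units = [f"unit{i}" for i in range(1, 9)]

# CRD: all eight units are randomized together.
crd = completely_randomized(units, ["A", "B"], rng)

# RBD: units are first split into two (assumed homogeneous) blocks.
blocks = {"block1": units[:4], "block2": units[4:]}
rbd = randomized_block(blocks, ["A", "B"], rng)

print(crd)
print(rbd)
```

In the block version, each block contributes the same number of units to every treatment, which a single overall shuffle does not guarantee.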

Goal of Blocking

Benefit of Blocking

The goal of blocking is to create groups of homogeneous experimental units.

The benefit of blocking is the reduction of the effect of variation within the experimental units (CONTEXT).

Advantage of using a Stratified Random Sample Over an SRS

Stratified random sampling guarantees that each of the strata will be represented. When strata are chosen properly, a stratified random sample will produce better (less variable/more precise) information than an SRS of the same size.
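
A small sketch of why every stratum is guaranteed to be represented: a separate SRS is drawn inside each stratum. The strata names, labels, and the helper `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Take a separate SRS of size per_stratum from every stratum."""
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical strata: 40 freshmen and 20 seniors.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 41)],
    "seniors": [f"S{i}" for i in range(1, 21)],
}

sample = stratified_sample(strata, 5, rng)
print(sample)
```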

Experiment or Observational Study?

A study is an experiment only if researchers impose a treatment upon the experimental units.

In an observational study researchers make no attempt to influence the results.

Does _____ Cause _____ ?

Association is NOT Causation!

An observed association, no matter how strong, is not evidence of causation. Only a well-designed, controlled experiment can lead to conclusions of cause and effect.

SRS

An SRS (simple random sample) is a sample taken in such a way that every set of n individuals has an equal chance to be the sample actually selected.
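
In Python, `random.sample` draws without replacement so that every set of n individuals is equally likely, which matches the definition above. The population labels 1 to 100 are an assumption for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

population = list(range(1, 101))  # hypothetical population labeled 1..100
n = 10

# Draw n distinct individuals; every set of n labels is equally likely.
srs = rng.sample(population, n)
print(sorted(srs))
```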

Why use a control group?

A control group gives the researchers a comparison group to be used to evaluate the effectiveness of the treatment(s). (CONTEXT)

(gauge the effect of the treatment compared to no treatment at all)

Complementary Events

P(at least one)

P(at least one) = 1 - P(None)

Example:

P(at least one 6 in three rolls)

P(at least one 6) = 1 - P(No Sixes)

= 1 - (5/6)^3

= 0.4213
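
The arithmetic in this example can be checked exactly with Python's `fractions` module. The variable names are illustrative only.

```python
from fractions import Fraction

p_no_six_one_roll = Fraction(5, 6)  # P(not a 6) on a single fair roll
rolls = 3

# Complement rule: P(at least one 6) = 1 - P(no sixes in three rolls).
p_at_least_one = 1 - p_no_six_one_roll ** rolls

print(p_at_least_one, round(float(p_at_least_one), 4))  # prints: 91/216 0.4213
```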

Two Events are Independent If...

Interpreting Probability

The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.

Probability is a long-term relative frequency.
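
A quick simulation can make the "very long series of repetitions" idea concrete. This sketch assumes a fair six-sided die and tracks how often a 6 appears; the seed and number of trials are arbitrary choices.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible
trials = 100_000

# Count how many of the simulated rolls come up 6.
sixes = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
relative_frequency = sixes / trials

# With many repetitions, this should settle near 1/6 (about 0.1667).
print(relative_frequency)
```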

Interpreting Expected Value/Mean

The mean/expected value of a random variable is the long-run average outcome of a random phenomenon carried out a very large number of times.
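
For a discrete random variable, the long-run average can be computed as the probability-weighted sum of the outcomes. The sketch below assumes a fair six-sided die as the random phenomenon.

```python
from fractions import Fraction

# Outcomes of one roll of a fair die, each with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6

# Expected value: sum of (outcome * probability).
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))

print(expected_value, float(expected_value))
```

The result, 3.5, is not a possible single roll; it is the average outcome over a very large number of rolls.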

Mean and Standard Deviation of a Discrete Random Variable

Mean and Standard Deviation of a Difference of Two Random Variables

Mean and Standard Deviation of a Sum of Two Random Variables

Binomial Distribution (Conditions)

Geometric Distribution (Conditions)

Binomial Distribution (Calculator Usage)

Mean and Standard Deviation of a Binomial Random Variable

Why Large Samples Give More Trustworthy Results...

(When collected appropriately)

The Sampling Distribution of the Sample Mean

(Central Limit Theorem)

Unbiased Estimator

Bias

The systematic favoring of certain outcomes due to a flawed sample selection, poor question wording, undercoverage, nonresponse, etc.

Bias deals with the center of a sampling distribution being "off"!

Explain P-value

Can we generalize the results to the population of interest?

Finding the Sample Size (for a given margin of error)

Carrying out a Two-Sided Test from a Confidence Interval

4-Step Process: Confidence Intervals

4-Step Process: Significance Tests

Interpreting a Confidence Interval

(Not a Confidence Level)

Interpreting a Confidence Level

(The Meaning of 95% Confident)

Paired t-test:

Phrasing Hints, Ho and Ha, Conclusion

Two-Sample t-test:

Phrasing Hints, Ho and Ha, Conclusion

Type I Error,

Type II Error, &

Power

Factors that Affect Power

Inference for Mean: Conditions

Inference for Proportions: Conditions

Chi-Square Tests

Goodness of Fit:

Use to test the distribution of one group or sample as compared to a hypothesized distribution.

Homogeneity:

Use when you have a sample from 2 or more independent populations or 2 or more groups in an experiment. Each individual must be classified based upon a single categorical variable.

Association/Independence:

Use when you have a single sample from a single population. Individuals in the sample are classified by two categorical variables.

Chi-Square Tests: df and Expected Counts

Inference for Counts (Chi-Squared Tests): Conditions

Inference for Regression: Conditions