AP Statistics Exam
Terms in this set (127)
state, plan, do, conclude
steps to organize a stat problem
descriptive stat
summarizes data (to get an average)
inferential stat
generalizes from a sample to the population (to draw a conclusion)
univariate (bivariate)
data collected on one (or two) variables
categorical
qualitative
numerical
quantitative
continuous
type of numerical data that measures time, height, number line values, etc.
discrete
type of numerical data that measures countable variables
bar chart
type of relative frequency distribution chart used with categorical data
dot plot
rudimentary histogram used with numerical data
bar chart, pie chart, relative proportion
types of charts used with *categorical data* and what you must discuss with them
stem and leaf plot, dot plot, histogram, distribution
types of charts used with *numerical data* and what you must discuss with them
frequency / # of observations in data
relative frequency =
marginal distribution
row totals, column totals, and grand total in a two-way table
make stem (with splits), add leaves, order leaves, add key
steps to making a stem and leaf plot
shape, spread, center, outlier
patterns to look for in a dot plot
center or typical value, spread or variability, general shape, location and # of peaks, gaps and outliers
what to look for in a histogram
right skewed has peak on left and vice versa
how skewed graphs look
median is less than the mean
mean and median on a skewed right graph
mean is less than the median
mean and median on a skewed left graph
unimodal, bimodal, multimodal
number of peaks a distribution can have
cumulative frequency / sample size
cumulative relative frequency =
mean, outliers
arithmetic average that is sensitive to...
µ, x̄
population and sample mean
median
exact middle of data
mode
most frequently occurring value in a set of data
order from smallest to largest and chop off equal number from both sides
trimmed mean
# deleted from side / n
trimmed percentage =
p hat = # of successes / n
sample proportion =
sample variance, squared standard deviation
average squared deviation from the mean, same thing as...
standard deviation
typical deviation from mean
s = √[ ∑(x - x̄)² / (n - 1) ]
standard deviation =
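A minimal Python sketch of the sample variance and standard deviation cards, assuming a small made-up data set. It divides by n - 1 and compares the result with the standard library's `statistics.stdev`.

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 7]          # made-up sample, for illustration only
n = len(data)
x_bar = sum(data) / n               # sample mean

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(sample_variance)
print(sample_sd)
print(statistics.stdev(data))       # library value for comparison
```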
Q3 - Q1
IQR =
below Q1 - (1.5 x IQR) or above Q3 + (1.5 x IQR)
how to find outliers in box plot
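The 1.5 x IQR rule can be sketched in Python as follows. The data are made up, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is an assumption: software packages use several slightly different quartile conventions.

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (middle value excluded when n is odd)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return statistics.median(lower), statistics.median(upper)

def outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

data = [1, 3, 4, 5, 5, 6, 7, 8, 9, 30]   # made-up data with one extreme value
print(quartiles(data))
print(outliers(data))
```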
68%, 95%, 99.7%
within 1, 2, and 3 standard deviations of the mean
34%, 13.5%, 2.35%
other divisions for 68-95-99.7 Rule
(value - mean) / standard deviation
z-score =
rth percentile
value such that *r* percent of observations in the data set fall below that value
response
y variable in scatter plot
y
response variable
explanatory
x variable in scatter plot
x
explanatory variable
direction (- or +), form (clusters, curved, or linear), strength (tightly or loosely packed), outliers
used to interpret scatterplots
correlation (r), -1 to +1
measures direction and strength of the linear relationship between 2 quantitative variables, on a scale of.....
±1 to ±0.8, ±0.8 to ±0.5, ±0.5 to 0
strong correlation, medium correlation, weak correlation
straight line
the stronger a correlation is, the more it looks like a....
y hat = a + b x
regression line equation
b = r(Sy/Sx)
*b* of regression line equation
a = y bar - b (x bar)
*a* of regression line equation
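A short Python sketch of the slope and intercept cards, using made-up (x, y) pairs. It computes r from z-scores, then b = r(Sy/Sx), then the intercept from the fact that the least-squares line passes through (x bar, y bar).

```python
import math

x = [1, 2, 3, 4, 5]   # made-up explanatory values
y = [2, 4, 5, 4, 5]   # made-up response values
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation: sum of products of z-scores, divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)) / (n - 1)

b = r * s_y / s_x        # slope
a = y_bar - b * x_bar    # intercept

print(f"y hat = {a:.2f} + {b:.2f} x")
```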
residuals
occur when the regression line does not perfectly fit the data
observed y - predicted y
residuals =
1. Stat, edit, L3, 2nd, list, resid; 2. Turn off plot 1; 3. Turn on plot 2, plot (xList: L1 and yList: L3)
how to make residual plot in calculator
s = √[ ∑residuals² / (n - 2) ]
standard deviation of the residuals =
when we use x to estimate y, we will be off by an average of s
after finding *s*, you interpret it by saying...
coefficient of determination
percent of the variation in the values of *y* that is accounted for by the least squares regression line of *y* on *x*
r²% of the variation in y is accounted for in the linear model relating y to x
after finding *r²*, you interpret it by saying...
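As an illustration of the residual, s, and r² cards, here is a Python sketch on made-up data. The least-squares line is fitted directly from sums of squares; the variable names (`sse`, `sst`) are labels chosen for this example.

```python
import math

x = [1, 2, 3, 4, 5]   # made-up explanatory values
y = [2, 4, 5, 4, 5]   # made-up response values
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Least-squares slope and intercept.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

# Residual = observed y - predicted y.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

sse = sum(e ** 2 for e in residuals)        # sum of squared residuals
s = math.sqrt(sse / (n - 2))                # standard deviation of the residuals
sst = sum((yi - y_bar) ** 2 for yi in y)    # total variation in y
r_squared = 1 - sse / sst                   # coefficient of determination

print(residuals)
print(s)
print(r_squared)
```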
y hat = predicted response (dependent variable), a = coefficient in the constant row (intercept), b = coefficient of the explanatory variable (slope)
how to interpret computer output
does not matter
in correlation, explanatory and response...
does matter
in regression, explanatory and response...
quantitative data
correlation and regression can only be used with...
large residuals, don't change correlation much
outliers in y direction
small residuals, change correlation a lot
outliers in x direction
convenience sample
choosing people for a sample because they are easiest to reach
voluntary response sample
using individuals who volunteer to participate
simple random sample
each individual, and each possible group of n individuals, has an equal chance of being selected
math, prb, 5: RandInt (min, max, #)
how to do simple random sampling in calculator
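A comparable sketch in Python, assuming a hypothetical population labelled 1 to 100. Note that a calculator's RandInt can return the same label twice, so repeats have to be skipped by hand, while `random.sample` draws without replacement.

```python
import random

population = list(range(1, 101))   # hypothetical roster: individuals labelled 1..100
rng = random.Random(2024)          # arbitrary seed so the draw can be repeated

# Simple random sample of 10 labels, drawn without replacement.
srs = rng.sample(population, 10)
print(sorted(srs))
```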
stratified random sample
divide people into separate groups according to characteristics, then select random sample from each group
cluster sample
divide people into groups (clusters), often based on location, then randomly select clusters and sample everyone in the chosen clusters
sampling error (convenience, voluntary response), non-sampling error (nonresponse, response bias, bad question wording), under-coverage (didn't cover entire population)
possible sampling errors
observational study
observes individuals and measures variables of interest without trying to influence the responses
experiment
imposes a treatment on individuals and then measures responses
lurking variable
not mentioned in study, but can affect the response variable and the apparent relationship with the explanatory variable
confounding
two variables' effect on a response variable cannot be distinguished from one another
treatment
specific condition being applied to individuals
experimental units
smallest collection of individuals getting a treatment
factors
explanatory variables
control (no lurking variables), random assignment, replication (repeat trials to reduce variability)
three principles of experimental design
double-blind experiment
neither subjects nor researchers know which treatment is being received
blocking
group of experimental units that are known to be similar in some way and could affect the response of the treatments (cannot be controlled)
random sampling, population
used in *observational studies* and makes an inference about the...
random assignment, cause and effect
used in *experiments* and makes an inference about the...
P(A∪B) = P(A) + P(B)
probability of A or B when mutually exclusive
P(A∪B) = P(A) + P(B) - P(A∩B)
probability of A or B when NOT mutually exclusive
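A worked example of both addition rules in Python, assuming one card drawn from a standard 52-card deck (an illustration, not part of the original set). Exact fractions avoid rounding.

```python
from fractions import Fraction

# Not mutually exclusive: hearts and face cards overlap (J, Q, K of hearts).
p_heart = Fraction(13, 52)
p_face = Fraction(12, 52)
p_heart_and_face = Fraction(3, 52)
p_heart_or_face = p_heart + p_face - p_heart_and_face

# Mutually exclusive: a card cannot be both a heart and a spade.
p_spade = Fraction(13, 52)
p_heart_or_spade = p_heart + p_spade

print(p_heart_or_face)    # 11/26
print(p_heart_or_spade)   # 1/2
```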
two-way table
should be used when dealing with two variables that are not mutually exclusive
P(A∩B) = P(A) x P(B)
probability of A and B when independent
P(A∩B) = P(A) x P(B | A)
probability of A and B when NOT independent
tree diagram
should be used when dealing with two variables that are not independent
P(B|A) = P(B)
probability of B given A when independent
B given A
conditional probability example
P(B|A) = [P(A∩B)] / [P(A)]
probability of B given A when NOT independent
probability of at least one, probability of none
complement probability example
P(not) = 1 - P(does)
probability of none occurring
P(at least 1) = 1 - P(none)
probability of at least one occurring
look at # of B and A and divide by total # of A, see if it equals total # of B divided by total #
how to tell if A and B are independent using a two way table
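The two-way table check can be sketched in Python like this. The counts are made up, and the table is stored as a dictionary keyed by (row, column) labels, which is just one convenient layout.

```python
import math

# Made-up two-way table of counts.
counts = {
    ("A", "B"): 30,
    ("A", "not B"): 20,
    ("not A", "B"): 30,
    ("not A", "not B"): 20,
}

total = sum(counts.values())
count_a = counts[("A", "B")] + counts[("A", "not B")]    # row total for A
count_b = counts[("A", "B")] + counts[("not A", "B")]    # column total for B

p_b_given_a = counts[("A", "B")] / count_a   # (# of A and B) / (total # of A)
p_b = count_b / total                        # (total # of B) / (grand total)

# A and B are independent when P(B|A) equals P(B).
independent = math.isclose(p_b_given_a, p_b)
print(p_b_given_a, p_b, independent)
```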
standard normal distribution
normal distribution with a mean of 0 and standard deviation of 1
area to left
area that z-table finds
z = (X - µ) / σ
standardized value (z-score) =
histogram (bell shaped, symmetric, single peaked), 68-95-99.7 Rule, normal probability plot
how to tell if data is normal
2nd Vars, 2:normalcdf(min, max, µ, σ)
calculator to find the area (probability) between two values under a normal curve
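For comparison, the same area can be computed in Python with `statistics.NormalDist`. The helper name `normalcdf` below only mirrors the calculator command and is not a library function; the mean of 70 and standard deviation of 10 are made-up values.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Area within 1 standard deviation of the mean (about 68%).
within_one_sd = normalcdf(-1, 1)

# Area between 60 and 85 for a normal distribution with mean 70, sd 10.
area = normalcdf(60, 85, mu=70, sigma=10)

print(round(within_one_sd, 4))
print(round(area, 4))
```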
random variable
takes numerical value that describes the outcome of a chance process
probability distribution
gives values and their respective probabilities in a chart
expected value (mean)
long run average value of the variables after many repetitions of a chance process
µx = ∑xi(pi)
expected value =
on average, the outcome of x will differ from the mean by about σx
after finding *σ*, you can interpret it by saying....
σx = √[ ∑(xi - µx)² pi ]
standard deviation of random variable =
stat, edit, L1=random variable values, L2=probability values, stat, calc, 1-var stat, List 1: L1 and FreqList: L2
standard deviation of random variable on a calculator
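The expected value and standard deviation of a discrete random variable can also be checked in Python. The values and probabilities below are made up for illustration.

```python
import math

values = [0, 1, 2, 3]            # made-up outcomes of the random variable
probs = [0.1, 0.3, 0.4, 0.2]     # made-up probabilities; they must sum to 1

# Expected value: each value weighted by its probability.
mu_x = sum(x * p for x, p in zip(values, probs))

# Variance: squared distance from the mean, weighted by probability.
var_x = sum((x - mu_x) ** 2 * p for x, p in zip(values, probs))
sigma_x = math.sqrt(var_x)

print(mu_x)
print(sigma_x)
```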
continuous random variable
x takes on all values in an interval of numbers and has infinitely many possible values
uniform
when the density curve is a rectangle (constant height), so every interval of equal length is equally likely
calculate height (area must equal one), multiply the amount between the two values given (length) by the height
how to solve a uniform distribution problem
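A small Python sketch of the height-times-length calculation, assuming a made-up uniform distribution on the interval from 0 to 20.

```python
def uniform_probability(low, high, a, b):
    """P(a <= X <= b) for X uniform on [low, high], assuming low <= a <= b <= high."""
    height = 1 / (high - low)    # total area must equal one
    length = b - a               # width of the interval of interest
    return length * height

# Made-up example: X uniform on [0, 20]; find P(5 <= X <= 12).
p = uniform_probability(0, 20, 5, 12)
print(p)
```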
state random variable (x=?), state probability (P[x≤?]), answer
when solving a random variable problem...
multiplies mean, median, quartiles, and percentiles by x; multiplies range, IQR, and sd by |x|; stretches or shrinks the distribution but does not change shape
multiplying or dividing a random variable by *x*...
adds x to mean, median, quartiles, and percentiles; doesn't change range, IQR, or sd; doesn't change shape
adding or subtracting *x* from a random variable...
µt = µx + µy ; σt = √(σx² + σy²) (X and Y must be independent for the σ formula)
random variables *X* + *Y* = ?
µt = µx - µy ; σt = √(σx² + σy²) (X and Y must be independent for the σ formula)
random variables *X* - *Y* = ?
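A quick numeric illustration in Python, assuming X and Y are independent and using made-up means and standard deviations. Variances add for both the sum and the difference; standard deviations do not.

```python
import math

mu_x, sigma_x = 10, 3    # made-up mean and sd of X
mu_y, sigma_y = 4, 4     # made-up mean and sd of Y

mu_sum = mu_x + mu_y
mu_diff = mu_x - mu_y

# Same standard deviation for X + Y and X - Y (independence assumed).
sigma_total = math.sqrt(sigma_x ** 2 + sigma_y ** 2)

print(mu_sum, mu_diff, sigma_total)
```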
binomial
gives set number of trials
binary, independent, set number of trials, probability of success is the same
binomial conditions
p(x=k) = (nCk) (p) ^(k) (1 - p) ^(n-k); n=# of trials; k=# of successes; p=prob. of success
binomial (explain variables) =
µx = n x p ; σx = √(n x p [1 - p] )
mean and standard deviation of a binomial
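The binomial formula, mean, and standard deviation can be sketched in Python with `math.comb`. The values n = 10, p = 0.5, and k = 3 are made up for illustration.

```python
import math

def binomial_pmf(n, p, k):
    """P(X = k): exactly k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p, k = 10, 0.5, 3                    # made-up example
prob = binomial_pmf(n, p, k)

mean = n * p
sd = math.sqrt(n * p * (1 - p))

print(prob)
print(mean, sd)
```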
geometric
trials until success
binary, independent, trials until success, probability of success is the same
geometric conditions
p(x=k) = (1-p) ^(k-1) (p)¹; k=# of trials until success; p=prob. of success
geometric (explain variables) =
µx = 1/p ; σx = [√(1 - p)] / p
mean and standard deviation of a geometric
binompdf(n,p,k); binomcdf(n,p,k); 1 - binomcdf(n,p,k-1)
Binomial for P ( x = k ) , P ( x ≤ k ) , and P ( x ≥ k )
geometpdf(p,k); geometcdf(p,k); 1 - geometcdf(p,k-1)
Geometric for P ( x = k ) , P ( x ≤ k ) , and P ( x ≥ k )
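The calculator commands can be mimicked in Python to check the "at least k" case. The function names below copy the calculator's and are defined here for illustration; n, p, and k are made up. P(x ≥ k) is the complement of P(x ≤ k - 1), so the cdf is evaluated at k - 1.

```python
from math import comb

def binompdf(n, p, k):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomcdf(n, p, k):
    """P(X <= k) for a binomial."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

def geometpdf(p, k):
    """P(first success occurs on trial k)."""
    return (1 - p) ** (k - 1) * p

def geometcdf(p, k):
    """P(first success occurs on or before trial k)."""
    return sum(geometpdf(p, i) for i in range(1, k + 1))

n, p, k = 10, 0.3, 4   # made-up example

# Binomial P(X >= k), two ways: complement rule and direct sum.
binom_at_least = 1 - binomcdf(n, p, k - 1)
binom_direct = sum(binompdf(n, p, i) for i in range(k, n + 1))

# Geometric P(X >= k): the first k - 1 trials must all be failures.
geom_at_least = 1 - geometcdf(p, k - 1)

print(binom_at_least, binom_direct)
print(geom_at_least, (1 - p) ** (k - 1))
```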
invNorm(area to left, µ, σ)
use the calculator to find a *z* if you already have the area
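The Python counterpart is `NormalDist.inv_cdf`, which takes the area to the left and returns the matching value. The mean of 70 and standard deviation of 10 in the second call are made-up values.

```python
from statistics import NormalDist

# z with 97.5% of the area to its left on the standard normal curve.
z = NormalDist().inv_cdf(0.975)

# Value at the 90th percentile of a normal distribution with mean 70, sd 10.
x = NormalDist(mu=70, sigma=10).inv_cdf(0.90)

print(round(z, 2))
print(round(x, 2))
```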
matched pairs design
randomized blocked experiment in which each block consists of a matching pair of similar experimental units
take pulse sitting, take pulse standing, find difference between the two and plot points
example of a matched pair design