AP Statistics Vocabulary List

Various terms of importance for success on the AP test.
Statistical Significance
A result is statistically significant when its p-value is less than the significance level alpha (0.05 if none is given). This means that chance alone would rarely produce a result at least this extreme if the null hypothesis were true.
Non-Response Bias
Bias that arises when people selected for a survey do not respond, especially when they differ in meaningful ways from those who do.
p-Value
The probability, computed assuming the null hypothesis is true, of getting a result at least as extreme as the one observed. The lower the value, the stronger the evidence against the null hypothesis.
Empirical rule
States that, in a normal distribution, about 68% of the terms are within one standard deviation of the mean, about 95% are within two standard deviations, and about 99.7% are within three standard deviations (normal curve).
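The three percentages in the Empirical Rule can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `within_k_sd` is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def within_k_sd(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {within_k_sd(k):.4f}")
# prints 0.6827, 0.9545, and 0.9973
```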
Lurking Variable
A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.
Null Hypothesis
The hypothesis of no effect or no difference (for example, between two or more sets of data) that a significance test assumes to be true and looks for evidence against.
Quota Sample
A sample deliberately constructed to reflect several of the major characteristics of a given population.
Probability
The likelihood that a particular event will occur.
Descriptive Statistics
Statistical procedures used to describe characteristics and responses of groups of subjects.
Median
The middle score in a distribution; half the scores are above it and half are below it.
Stemplot
A graphical representation of a quantitative data set. The leading digits of each data point are presented as stems and the final digits are given as leaves.
Data
Information gathered from observations.
Margin of Error
The number of percentage points by which a sample result may differ from the true population value; adding it to and subtracting it from the sample result gives the range within which researchers are confident the population's true value falls.
Normal
A symmetric, bell-shaped distribution defined by its mean and standard deviation; it follows the Empirical Rule.
Simple Random Sample
Every possible sample of a given size has an equal chance of being selected, so every member of the population has a known and equal chance of selection.
Sampling Distribution
The distribution of values taken by the statistic in all possible samples of the same size from the same population.
Interpolation
Using the Least Squares Regression Line to predict a y-value for an x-value within the range of the observed x-values.
Qualitative
Data that place individuals into categories rather than measuring them with numbers; also called categorical data.
Theoretical Probability
The ratio of the number of favorable outcomes to the number of possible outcomes if all outcomes have the same chance of happening.
Block Design
Subjects are first grouped into blocks of similar individuals, and the random assignment of subjects to treatments is carried out separately within each block.
Least Squares Regression Line
The line that minimizes the sum of squared residuals.
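As an illustration, the slope and intercept of the Least Squares Regression Line can be computed directly from the data. This is a minimal sketch; the function name `least_squares_line` and the data values are made up for the example.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return slope, intercept

xs = [1, 2, 3, 4, 5]  # made-up explanatory values
ys = [2, 4, 5, 4, 5]  # made-up response values
slope, intercept = least_squares_line(xs, ys)

# Residual = observed y minus predicted y
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(slope, intercept)   # approximately 0.6 and 2.2
print(sum(residuals))     # approximately 0: least-squares residuals sum to zero
```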
Type II Error
An error that occurs when a researcher concludes that the independent variable had no effect on the dependent variable, when in truth it did; a false negative.
Histogram
A bar graph that shows the frequency of data within equal intervals.
Undercoverage
Occurs when some groups in the population are left out of the process of choosing the sample.
Joint Frequency
The number of responses that share a given combination of characteristics; the count in a single cell of a two-way table.
Matched Pairs
Either two measurements are taken on each individual such as pre and post OR two individuals are matched by a third variable (different from the explanatory variable and the response variable) such as identical twins.
Conditional Probability
The probability that a particular event will occur, given that another event has already occurred.
Mode
The value that occurs most often in a set of data.
Systematic Sample
A sample drawn by selecting individuals from a sampling frame at a fixed interval (every kth individual), typically after a random starting point.
Sample Space
All possible outcomes of an experiment.
Confounded Variable
An unintended difference between the conditions of an experiment that could have affected the dependent variable.
Experimental Probability
Probability based on what happens when an experiment is actually done.
Placebo Effect
Experimental results caused by expectations alone; any effect on behavior caused by the administration of an inert substance or condition, which is assumed to be an active agent.
Marginal Frequency
Row and column totals in a contingency table (cross-tabulation) that represent the univariate frequency distributions for the row and column variables.
Parameter
A numerical measurement describing some characteristic of a population.
Voluntary Response Bias
Bias introduced to a sample when individuals can choose on their own whether to participate in the sample.
Mean
The arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores.
Alternative Hypothesis
The hypothesis that the researcher is looking for evidence to support in a significance test; it states that the Null Hypothesis is incorrect.
Correlation
A measure of the strength and direction of the linear relationship between two quantitative variables (r).
Response Bias
Anything in the survey design that influences the responses from the sample.
Coefficient of Determination
Measures the percentage of variation in a dependent variable explained by one or more independent variables (r^2).
Random Sample
A sample in which every element in the population has an equal chance of being selected.
Binomial
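An experiment with a fixed number of independent trials, each with the same two possible outcomes (success or failure) and the same probability of success.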
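For a binomial setting with n independent trials and a constant success probability p, the probability of exactly k successes is C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch follows; the function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: probability of exactly 3 heads in 10 fair coin flips
print(binomial_pmf(3, 10, 0.5))  # 0.1171875 (120 / 1024)
```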
Experiment
A study in which the researcher deliberately imposes treatments on subjects under controlled conditions and measures their responses.
Law of Large Numbers
Law stating that as the number of observations drawn at random from a population increases, the sample mean (or sample proportion) tends to get closer to the corresponding population value.
Outlier
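An observation that falls far outside the overall pattern of the data; a common rule flags values more than 1.5 x IQR below Q1 or above Q3.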
Extrapolation
Estimating a value outside the range of measured data.
Snowball Sample
Samples in which informants provide contact information about other people who share some of the characteristics necessary for a study.
Independent
Two events or variables are independent when the outcome of one has no effect on the probability of the outcome of the other.
IQR
Range of the middle 50% of the values; Q3-Q1 = 75th percentile - 25th percentile.
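A short sketch of computing the IQR and using it to flag outliers with the common 1.5 x IQR rule. The data are made up, and the quartile method is an assumption: textbooks and software use slightly different quartile conventions, so the values from `statistics.quantiles` may not match a hand calculation.

```python
from statistics import quantiles

data = [1, 3, 5, 7, 9, 11, 13, 15, 40]  # made-up data

# quantiles(..., n=4) returns the three cut points Q1, median, Q3
q1, _median, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# 1.5 x IQR rule: values beyond these fences are flagged as outliers
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("outliers:", outliers)
```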
Ogive
A line graph that depicts cumulative frequencies.
Confidence Interval
The range of values, computed from sample data, within which a population parameter is estimated to lie with a stated level of confidence.
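One common case is the one-sample z interval for a proportion: sample proportion plus or minus a critical value times the standard error. The sketch below assumes the usual conditions (random sample, large enough counts) are met; the function name `proportion_interval` and the survey numbers are made up.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes: int, n: int, confidence: float = 0.95):
    """Return (low, high) for a one-sample z interval for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat - margin, p_hat + margin

# Example: 520 of 1000 respondents answer "yes"
low, high = proportion_interval(520, 1000)
print(round(low, 3), round(high, 3))  # 0.489 0.551
```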
Standard Error
The standard deviation of a sampling distribution.
Observational Study
A study which observes individuals and measures variables of interest but does not attempt to influence the responses.
Residual
The difference between an observed value of the response variable and the value predicted by the regression line.
Convenience Sample
A sample that includes members of the population that are easily accessed.
Simulation
Imitating a chance process with a model, such as random digits or software, and repeating it many times to estimate probabilities.
Degrees of Freedom
A parameter of the t distribution. When the t distribution is used in the computation of an interval estimate of a population mean, the appropriate t distribution has n-1 degrees of freedom, where n is the size of the simple random sample.
Dotplot
Graphs a dot for each case against a single axis.
Two-way Table
A table containing counts for two categorical variables. It has r rows and c columns.
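A small sketch of how joint frequencies, marginal frequencies, and a conditional probability come out of a two-way table. The category names and counts are invented for illustration.

```python
# Joint frequencies: one count per (row category, column category) cell
table = {
    ("freshman", "yes"): 30,
    ("freshman", "no"): 20,
    ("senior", "yes"): 10,
    ("senior", "no"): 40,
}

total = sum(table.values())

# Marginal frequencies: row totals and column totals
row_totals, col_totals = {}, {}
for (row, col), count in table.items():
    row_totals[row] = row_totals.get(row, 0) + count
    col_totals[col] = col_totals.get(col, 0) + count

# Conditional probability: P(yes | freshman) = joint count / row total
p_yes_given_freshman = table[("freshman", "yes")] / row_totals["freshman"]

print(row_totals)             # {'freshman': 50, 'senior': 50}
print(col_totals)             # {'yes': 40, 'no': 60}
print(p_yes_given_freshman)   # 0.6
```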
Geometric
An experiment with no set number of trials; independent trials are repeated until the first success occurs.
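In a geometric setting with independent trials and a constant success probability p, the probability that the first success occurs on trial k is (1 - p)^(k - 1) * p. A minimal sketch follows; the function name `geometric_pmf` is hypothetical.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(first success occurs on trial k), with k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

# Example: probability that the first six appears on the third roll of a fair die
print(round(geometric_pmf(3, 1 / 6), 4))  # 0.1157 (25 / 216)
```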
Inferential Statistics
Statistics that are used to interpret data and draw conclusions.
Spread
A descriptive feature that describes how much the data vary, commonly summarized by the range, IQR, or standard deviation.
Discrete Random Variable
Variable where the number of outcomes can be counted and each outcome has a measurable and positive probability.
Population
The entire aggregation of items from which samples can be drawn.
Sample
A subset of a population from which data are collected and used to draw conclusions about the population.
Central Limit Theorem
The sampling distribution of the sample mean approaches a normal distribution as the sample size n increases, whatever the shape of the population (n of at least 30 is a common rule of thumb).
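The Central Limit Theorem can be seen in a quick simulation: sample means from a non-normal population cluster around the population mean with a spread close to sigma / sqrt(n). This is only a sketch; the uniform population, the seed, and the helper name `sample_means` are choices made for the example.

```python
import random
from math import sqrt
from statistics import fmean, stdev

rng = random.Random(1)  # fixed seed so the run is repeatable

def sample_means(n: int, reps: int) -> list:
    """Means of `reps` random samples of size n from Uniform(0, 1)."""
    return [fmean(rng.random() for _ in range(n)) for _ in range(reps)]

means = sample_means(n=30, reps=2000)
population_sd = sqrt(1 / 12)  # standard deviation of Uniform(0, 1)

print("mean of sample means:", round(fmean(means), 3))  # close to 0.5
print("SD of sample means:  ", round(stdev(means), 3))
print("sigma / sqrt(n):     ", round(population_sd / sqrt(30), 3))
```

A histogram of `means` would look roughly bell-shaped even though the population itself is flat.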
Standard Deviation
A measure of variability that describes the typical distance of the scores from the mean (σ for a population, s for a sample).
Cluster Sample
A sampling design in which entire groups (clusters) are chosen at random and every individual in the chosen groups is included.
Type I Error
An error that occurs when a researcher concludes that the independent variable had an effect on the dependent variable, when no such relation exists; a false positive.
Standardized Value
A value found by subtracting the mean and dividing by the standard deviation.
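As a worked example, the sketch below standardizes a small made-up data set, treating it as a whole population (so `statistics.pstdev` is used rather than the sample version `stdev`).

```python
from statistics import fmean, pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

mu = fmean(scores)      # mean: 5.0
sigma = pstdev(scores)  # population standard deviation: 2.0

# Standardized value (z-score): subtract the mean, divide by the standard deviation
z_scores = [(x - mu) / sigma for x in scores]
print(z_scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

A z-score of 2.0 means the value lies two standard deviations above the mean.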
Boxplot
Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values.
Mutually Exclusive
Two events are mutually exclusive (disjoint) if they cannot occur at the same time, so the probability that both occur is 0. This is not the same as independence.
Scatterplot
A graphed cluster of dots, each of which represents the values of two variables.
Stratified Sample
The population is divided into strata and a random sample is taken from each stratum.
Quantitative
Data that take numerical values, such as counts or measurements.
Wording Bias
A type of response bias where the question is posed to achieve a desired result.
Causation
A cause and effect relationship in which one variable controls the changes in another variable.
Statistic
A numerical measurement describing some characteristic of a sample.
Center
A descriptive feature that identifies the typical or middle value of a distribution, usually reported as the mean or median.
z-Test
A parametric inferential statistical test of the null hypothesis for a single sample where the population standard deviation is known.
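A sketch of a two-sided one-sample z-test, assuming the population standard deviation is known and the sampling distribution of the mean is approximately normal. The function name `one_sample_z_test` and the numbers are invented for the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: population mean equals mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Example: n = 100, sample mean 103, H0 mean 100, known sigma 15
z, p_value = one_sample_z_test(103, 100, 15, 100)
print(z, round(p_value, 4))  # 2.0 0.0455
```

Here the p-value is below alpha = 0.05, so the result would be called statistically significant at that level.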
t-Test
A parametric inferential statistical test of the null hypothesis for a single sample where the population standard deviation is unknown.
Chi-Squared Goodness of Fit
Uses sample data to test hypotheses about the shape or proportions of a population distribution. The test determines how well the obtained sample proportions fit the population proportions specified by the null hypothesis.
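The goodness-of-fit test statistic is the sum of (observed - expected)^2 / expected over all categories, with degrees of freedom equal to the number of categories minus 1. The sketch below computes only the statistic; the function name `chi_square_statistic` and the die-roll counts are made up, and the p-value would come from a chi-square table or statistical software.

```python
def chi_square_statistic(observed, expected_props):
    """Sum of (observed - expected)^2 / expected across categories."""
    n = sum(observed)
    expected = [n * p for p in expected_props]  # expected counts under H0
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Example: 60 rolls of a die, H0: each face has probability 1/6
observed = [8, 12, 10, 9, 11, 10]
stat = chi_square_statistic(observed, [1 / 6] * 6)
df = len(observed) - 1

print(round(stat, 3), df)  # 1.0 5
```

With 5 degrees of freedom the critical value at alpha = 0.05 is about 11.07, so a statistic of 1.0 gives no evidence against a fair die.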