49 terms

AP Statistics TPS Chapter 1

Aligned to The Practice of Statistics (Starnes, Tabor, Yates, Moore) 5th edition
STUDY
PLAY

Terms in this set (...)

stem & leaf plot
a graph of a distribution of quantitative data in which all but the final digits of data values (written in numerical order) form a column called a stem and the final digit of each data value is written in increasing order outward from the column to form leaves
back-to-back stem & leaf plot
a stem-and-leaf plot or stemplot that is used to compare distributions of quantitative variables for two data sets
dotplot
a graph of a distribution of quantitative data in which each data value is shown as a dot above its location on a number line
histogram
a graph of a distribution of quantitative data in which nearby values are grouped together in what are often called "bins" or "classes"
sample mean
the average of a subset of the population, denoted by x-bar: x̄
population mean
the average of a population, denoted by the Greek letter mu: μ
resistant
a statistic that is not influenced by extreme observations
unimodal
describes the shape of a distribution whose graph has one major peak
bimodal
describes the shape of a distribution whose graph has two major peaks
symmetric
describes the shape of a distribution whose graph has roughly mirror images on the left and right sides
skewed
describes the shape of a distribution whose graph has one side that is much longer than the other ; distribution is named based on the direction of the tail
quartiles
One quarter of the data values are smaller than the first quartile (Q1) and three quarters of the data values are smaller than the third quartile (Q3)
interquartile range (IQR)
The range for the middle 50% of the data values-- the range between the quartiles (Q3-Q1)
1.5 IQR Rule
used for identifying outliers: any values that are more than 1.5 times the IQR lower than the first quartile or higher than the third quartile are called outliers
standard deviation
measures the average distance of the observations from their mean. It is calculated by finding an average of the squared distances and then taking the square root.;

*usually denoted s for a sample or lower case Greek sigma, σ, for a population
variance
the square of the standard deviation (σ^2)
distribution
tells all the possible values of a variable and how often they each occur
two-way table
array that displays counts of two categorical variables with columns indicating the distribution for one variable and rows indicating the distribution for the other variable
marginal distribution
in a two-way table, this counts the distribution of values of one of the categorical variables among all individuals described by the table
conditional distribution
in a two-way table, this describes the values of a variable among individuals who have a specific value of another variable (there's a separate conditional distribution for each value of the other variable)
association
two variables have this quality if knowing the value of one variable helps predict the value of the other
center
a typical value-- could be the median or the mean or the mode
spread
how much a data set varies-- could be the range, the IQR, or the standard deviation
shape
a description of the symmetry or asymmetry and the number of modes/peaks
outlier
any data value that is unusually low or unusually high
bar graph
a graph of a distribution of categorical data in which bars extend to display the frequency of various categories that are placed on an axis
median
the midpoint of a distribution, such that about half the observations are smaller and half the observations are larger
range
the distance (a single number) between the minimum and maximum values in a data set (max - min)
boxplot (or box and whisker plot)
a graph of a distribution of quantitative data in which a central box extends between the quartiles with a central line marking the the median, lines (called whiskers) extend to the largest and smallest values that are not outliers, while outliers are marked with individual points/dots
categorical variable
individual into one of several groups that describes something
quantitative variable
numerical values that can be averaged in a way that makes sense in context
What are three ways you can display categorical data?
pie charts, bar graphs, 2-way table
segmented bar graph
a bar graph that is used to compare distributions of a categorical variable for two or more data sets; *segments of bars that add up to 100
What does SOCS stand for?
Shape, Outliers, Center, Spread
unimodal has ___ center(s), bimodal has ___ center(s)
1;2
A symmetric distribution has the same ____, ______, and ____
mean, median, mode
When should you use mean as the center?
when you have a symmetric distribution
When should you use the median as the center?
when you have a skewed/non-symmetric distribution
What are three ways you can use to measure spread?
range, standard deviation, and interquartile range
What's the difference between histograms and bar graphs?
histograms are used with quantitative data, bar graphs are used with categorical data
five-number summary
The minumum value, lower quartile, median, upper quartile, and maximum value for a data set. These five values give a summary of the shape of the distribution and are used to make box plots.


The five numbers that help describe the center, spread and shape of data
population
the entire group of individuals that we want information about
sample
representative of an entire population
variable
A variable is any characteristic whose value may change from one individual to another
univariate data set
observations on a single variable made on individuals in a sample or population.
bivariate data set
observations on two variables made on individuals in a sample or population
multivariate data set
observations on two or more variables made on individuals in a sample or population
frequency
the number of times the category appears in the data set.
relative frequency
the fraction or proportion of the time that the category appears in the data set. It equals frequency/number of observations in the data set.

Flickr Creative Commons Images

Some images used in this set are licensed under the Creative Commons through Flickr.com.
Click to see the original works with their full license.