the objects described by a set of data. Individuals may be people, animals or things.
any characteristic of an individual. A variable can take different values for different individuals.
places an individual into one of several groups or categories.
take numerical values for which it makes sense to find an average.
The distribution of a variable tells us what values the variable takes and how often it takes these values.
A graph that uses horizontal or vertical bars to display data
A form of graph which represents numeric values as segments of a circle.
Displays the relationship between two categorical variables visually.
The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values of that variable among all individuals described by the table.
A Conditional Distribution of a variable describes the values of that variable among individuals who have a specific value of another variable. There is a separate conditional distribution for each value of the other variable.
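As a rough illustration of marginal and conditional distributions, the sketch below computes both from a small list of paired categorical observations. The data and the function names (`marginal_distribution`, `conditional_distribution`) are invented for this example and are not part of the original cards.

```python
from collections import Counter

# Hypothetical data: (gender, opinion) pairs, invented for illustration only.
records = [
    ("female", "yes"), ("female", "yes"), ("female", "no"), ("female", "yes"),
    ("male", "yes"), ("male", "no"), ("male", "no"), ("male", "no"),
]

def marginal_distribution(records, index):
    """Proportion of ALL individuals taking each value of one variable."""
    counts = Counter(r[index] for r in records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}

def conditional_distribution(records, given_index, given_value, target_index):
    """Distribution of the target variable among individuals
    whose other variable equals given_value."""
    subset = [r for r in records if r[given_index] == given_value]
    counts = Counter(r[target_index] for r in subset)
    return {value: count / len(subset) for value, count in counts.items()}

# Marginal distribution of opinion: 4 of 8 said yes, 4 of 8 said no.
print(marginal_distribution(records, 1))

# One conditional distribution of opinion for each value of gender.
print(conditional_distribution(records, 0, "female", 1))
print(conditional_distribution(records, 0, "male", 1))
```

In this made-up data the two conditional distributions differ (0.75 "yes" among females versus 0.25 among males), so knowing gender helps predict opinion, which is what an association means.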
We say that there is an association between two variables if knowing the value of one variable helps predict the value of the other. If knowing the value of one variable does not help you predict the value of the other, then there is no association between the variables.
How often an event happens during a period of time
Side-by-side Bar Graph
A bar graph that compares different groups, in one category, to one another.
Each data value is shown as a dot above its location on a number line.
A distribution is roughly symmetric if the right and left sides of the graph are approximately mirror images of each other.
Skewed to the Right
A distribution is skewed to the right if the right side of the graph (containing the half of the observations with larger values) is much longer than the left side.
Skewed to the Left
A distribution is skewed to the left if the left side of the graph is much longer than the right side.
The shape of a distribution describes its overall pattern: how many peaks it has and whether it is roughly symmetric or skewed.
Measures of the Center
Mean and median
Measure of Spread
Used to describe the variability in a sample or population.
Numbers that are much greater or much less than the other numbers in the set
A distribution with a single, clearly defined, peak.
A distribution with two clear peaks.
A distribution with more than two clear peaks.
A graphical tool that displays actual numerical values in an ordered fashion using stems and leaves to group the data.
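A stemplot can be sketched in a few lines. The version below is a minimal example that assumes non-negative whole numbers, with the tens as the stem and the ones digit as the leaf; the function name `stemplot` is made up for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a stemplot for non-negative integers.
    Stem = all digits except the last; leaf = the final digit."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    lines = []
    # Include stems with no leaves so gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

for line in stemplot([12, 15, 21, 21, 40]):
    print(line)
# 1 | 25
# 2 | 11
# 3 |
# 4 | 0
```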
A method for spreading out a stemplot that has too few stems.
Use this when you want to compare two related distributions.
A graph of vertical bars representing the frequency distribution of a set of data.
To find the mean of a set of observations, add their values and divide by the number of observations.
The midpoint of a distribution, the number such that about half the observations are smaller and about half are larger when arranged in order.
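The two definitions above translate directly into code. This is a minimal sketch with illustrative function names; for an even number of observations it takes the median to be the average of the two middle values, which is the usual convention.

```python
def mean(values):
    """Add the values and divide by the number of observations."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the ordered list; with an even count,
    the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(mean([1, 2, 3, 4]))    # 2.5
print(median([1, 2, 3, 4]))  # 2.5
print(median([5, 1, 3]))     # 3
```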
Distance between highest and lowest scores in a set of data.
Lies one-quarter of the way up the list.
Lies three-quarters of the way up the list.
The 1.5 x IQR rule for Outliers
Call an observation an outlier if it falls more than 1.5 x IQR above the third quartile or below the first quartile.
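The rule can be written as a pair of cutoffs (sometimes called fences). The sketch below assumes the quartiles have already been found; the function names are illustrative only.

```python
def outlier_fences(q1, q3):
    """Return the (low, high) cutoffs from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x lies more than 1.5 x IQR below Q1 or above Q3."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high

# With Q1 = 2.5 and Q3 = 6.5, the IQR is 4 and the fences are -3.5 and 12.5.
print(outlier_fences(2.5, 6.5))  # (-3.5, 12.5)
print(is_outlier(30, 2.5, 6.5))  # True
```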
The Five Number Summary
Minimum, Q1, M, Q3, Maximum
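A possible way to compute the five-number summary is sketched below. It assumes one common textbook convention: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, leaving out the overall median when the number of observations is odd. Other software may use slightly different quartile rules, and the function name is made up for this example.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, M, Q3, maximum).
    Assumes at least two observations."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]       # observations below the median's position
    upper = ordered[n - half:]   # observations above the median's position
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 30]))
# (1, 2.5, 4.5, 6.5, 30)
```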
A graph that displays the highest and lowest quarters of data as whiskers, the middle two quarters of the data as a box, and the median as a line inside the box.
Measures the typical distance of the values in a distribution from the mean.
Standard Deviation (Calculated)
It is calculated by finding an average of the squared deviations from the mean and then taking the square root.
A measure of spread within a distribution (the square of the standard deviation).
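The calculation can be sketched step by step. This example assumes the sample version, in which the sum of squared deviations is divided by n - 1; dividing by n instead would give the population version. The function names are illustrative.

```python
from math import sqrt

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1
    (sample version; this divisor is an assumption of the sketch)."""
    m = sum(values) / len(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return sum(squared_deviations) / (len(values) - 1)

def sample_std_dev(values):
    """Square root of the sample variance."""
    return sqrt(sample_variance(values))

data = [1, 2, 3, 4, 5]        # mean is 3
print(sample_variance(data))  # 2.5  (squared deviations 4, 1, 0, 1, 4 sum to 10)
print(sample_std_dev(data))   # square root of 2.5, about 1.58
```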