Statistics Vocabulary: Chapters 1-3
Individuals
The objects described by a set of data
Median
The midpoint of a distribution of quantitative data
Conditional
A _________ distribution describes the distribution of values of a categorical variable among individuals who have a specific value of another variable
Categorical
A variable that places an individual into one of several groups or categories
Variable
A characteristic of an individual that can take different values for different individuals
Two Way Table
When comparing two categorical variables, we can organize the data in a Two-way Table
Boxplot
A graphical display of the five number summary
Histogram
A graphical display of quantitative data that shows the frequency of values in intervals using bars
Quantitative
A variable that takes numerical values for which it makes sense to find an average
Symmetric
The shape of a distribution whose right and left sides are approximate mirror images of each other
Quartiles
These values lie one-quarter, one-half, and three-quarters of the way up the list of quantitative data
Outlier
A value that is more than 1.5 IQRs above the third quartile or below the first quartile
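The 1.5 IQR rule can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function names and the commute data are made up, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one common classroom convention (software packages may use other quartile methods and give slightly different fences).

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values. When the count is odd, the middle
    value is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # the smallest `half` values
    upper = data[-half:]   # the largest `half` values
    return median(lower), median(upper)

def find_outliers(values):
    """Return the values lying more than 1.5 IQRs outside the quartiles."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical commute times in minutes.
commute_minutes = [5, 10, 10, 15, 20, 20, 25, 30, 85]
print(quartiles(commute_minutes))      # Q1 = 10, Q3 = 27.5
print(find_outliers(commute_minutes))  # [85]
```

Here the IQR is 17.5, so the fences sit at -16.25 and 53.75, and only the 85-minute commute is flagged.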
SOCS
When exploring data, don't forget your _________
Standard Deviation
The average distance of observations from their mean (two words)
Variance
The average squared distance of the observations from their mean
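The link between variance and standard deviation can be shown with a short Python sketch. The function names and the scores are illustrative assumptions. Note that "average" is taken loosely here: many introductory courses divide by n - 1 for sample data rather than by n, so the sketch offers both.

```python
from math import sqrt

def variance(values, sample=True):
    """Average squared distance from the mean.

    Divides by n - 1 when sample=True (the usual sample variance)
    and by n when sample=False (the population variance).
    """
    n = len(values)
    mean = sum(values) / n
    squared_distances = [(v - mean) ** 2 for v in values]
    divisor = n - 1 if sample else n
    return sum(squared_distances) / divisor

def standard_deviation(values, sample=True):
    """Square root of the variance, in the same units as the data."""
    return sqrt(variance(values, sample))

# Hypothetical quiz scores with mean 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores, sample=False))            # 4.0
print(standard_deviation(scores, sample=False))  # 2.0
print(variance(scores))                          # 32/7, about 4.571
```

The squared distances sum to 32, so dividing by 8 gives 4 and dividing by 7 gives about 4.571.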
Bar Graph
Displays the counts or percentages of categories in a categorical variable through different heights
Distribution
Tells you what values a variable takes and how often it takes these values
Pie Chart
Displays a categorical variable using slices sized by the counts or percentages for the category
Association
When specific values of one variable tend to occur in common with specific values of another
Mean
Measure of center, also called the average
Dot Plot
One of the simplest graphs to construct when dealing with a small set of quantitative data
Stem Plot
A graphical display of quantitative data that involves splitting the individual values into two components
Skewed
The shape of a distribution if one side of the graph is much longer than the other
Resistant
What we call a measure that is relatively unaffected by extreme observations
Mean
The balance point of a density curve, if it were made of solid material
z score
The standardized value of an observation
Normal
These common density curves are symmetric and bell-shaped
Probability
A normal ____________ plot provides a good assessment of whether a data set is approximately normally distributed
Ogive
Another name for a cumulative relative frequency graph
Left
The standard normal table tells us the area under the standard normal curve to the _______ of z
Density
A ________ curve is a smooth curve that can be used to model a distribution
Standard
This normal distribution has mean 0 and standard deviation 1
Empirical
The _________ rule is also known as the 68-95-99.7 rule for the normal distributions
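The 68-95-99.7 figures can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module, whose `cdf` method gives the area under the curve to the left of a value. The helper name `area_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 3))  # about 0.683, 0.954, 0.997
```

Each area is the area to the left of k minus the area to the left of -k, which is the same subtraction done by hand with a standard normal table.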
Mean
To standardize a value, subtract the _______ and divide by the standard deviation
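Standardizing is a one-line calculation. The sketch below is illustrative only: the function name and the height figures (mean 64 inches, standard deviation 3 inches) are made-up assumptions.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / standard_deviation

# Hypothetical heights: mean 64 inches, standard deviation 3 inches.
print(z_score(70, 64, 3))  # 2.0  (two standard deviations above the mean)
print(z_score(61, 64, 3))  # -1.0 (one standard deviation below the mean)
```

A positive z score means the value is above the mean, and a negative z score means it is below.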
Percentile
The value with p percent of the observations less than it
One
The area under any density curve is always equal to ________
Transform
We _________ data when we change each value by adding a constant and/or multiplying by a constant
Linear
If a normal probability plot shows a _________ pattern, the data are approximately normal
Median
The point that divides the area under a density curve in half
Gauss
This mathematician first applied normal curves to data on errors made by astronomers and surveyors
Residual
The difference between an observed value of the response and the value predicted by a regression line
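A residual is observed minus predicted, which a short Python sketch makes concrete. The regression line used here (intercept 3, slope 2) and the data point are hypothetical values chosen for illustration.

```python
def predict(x, intercept, slope):
    """Predicted response (y-hat) from a line y-hat = intercept + slope * x."""
    return intercept + slope * x

def residual(observed_y, predicted_y):
    """Observed response minus predicted response."""
    return observed_y - predicted_y

# Hypothetical regression line: y-hat = 3 + 2x.
intercept, slope = 3.0, 2.0

# Hypothetical observation at x = 4 with observed y = 12.5.
x, observed_y = 4, 12.5
predicted_y = predict(x, intercept, slope)
point_residual = residual(observed_y, predicted_y)

print(predicted_y)     # 11.0
print(point_residual)  # 1.5
```

The positive residual says the point lies above the line. A negative residual would mean the line overpredicts.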
Causation
Important note: Association does not imply ______________
Scatter Plot
Graphical display of the relationship between two quantitative variables
Regression
Line that describes the relationship between two quantitative variables
Determination
The coefficient of _________ describes the fraction of variability in y values that is explained by least squares regression
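One way to compute this fraction is 1 minus the ratio of the squared residuals to the squared deviations of y from its mean. The sketch below assumes the predictions come from a least-squares line. The function name and the data are illustrative, and the predicted values were worked out by hand from the line y-hat = 2.2 + 0.6x for x = 1 through 5.

```python
def coefficient_of_determination(observed, predicted):
    """Fraction of the variability in y explained by the regression line.

    Computed as 1 - SSE/SST. This matches r squared only when the
    predictions come from the least-squares line.
    """
    mean_y = sum(observed) / len(observed)
    # Sum of squared residuals (variation left unexplained by the line).
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # Total sum of squares (variation of y around its own mean).
    sst = sum((y - mean_y) ** 2 for y in observed)
    return 1 - sse / sst

# Hypothetical responses for x = 1, 2, 3, 4, 5.
observed = [2, 4, 5, 4, 5]
# Predictions from the least-squares line y-hat = 2.2 + 0.6x.
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

r_squared = coefficient_of_determination(observed, predicted)
print(round(r_squared, 3))  # 0.6
```

Here SSE is 2.4 and SST is 6, so the line accounts for 60% of the variability in y.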
Negative
A _________ association is defined when above average values of one variable are accompanied by below average values of the other
Influential
Individual points that substantially change the correlation or slope of the regression line
Extrapolation
The use of a regression line to make a prediction far outside the observed x values
Slope
The amount by which y is predicted to change when x increases by one unit
Strength
The _________ of a relationship in a scatterplot is determined by how closely the points follow a clear form
Least squares
The _______-________ regression line is also known as the line of best fit (2 words)
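The line of best fit can be computed directly from the data. The sketch below uses the standard formulas for the least-squares slope and intercept. The function name and the small data set are assumptions made for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line.

    The slope is the sum of products of deviations divided by the sum of
    squared x deviations. The line always passes through (mean x, mean y).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(xs, ys)
print(round(intercept, 3), round(slope, 3))  # 2.2 0.6
```

For these data the fitted line is y-hat = 2.2 + 0.6x, so y is predicted to rise by 0.6 for each one-unit increase in x.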
Outlier
An individual value that falls outside the overall pattern of the relationship
Correlation
Value that measures the direction and strength of the linear relationship between two quantitative variables
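Correlation can be computed as the sum of the products of the standardized x and y values, divided by n - 1. The sketch below follows that formula, using sample standard deviations. The function name and the data are illustrative assumptions.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r: sum of the products of z scores, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(round(r, 3))  # 0.775
```

The value is always between -1 and 1. Points falling exactly on an upward-sloping line give r = 1, and points exactly on a downward-sloping line give r = -1.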
Positive
A __________ association is defined when above average values of the explanatory are accompanied by above average values of the response
Predicted
y-hat is the ________ value of the y-variable for a given x
Response
Variable that measures the outcome of the study
Explanatory
Variable that may help explain or influence changes in another variable
Direction
The _________ of a scatterplot indicates a positive or negative association between the variables
Form
The _________ of a scatterplot is usually linear or nonlinear
