# Raft 4

Dot Plot
Is a data representation that uses a number line and X's, dots, or other symbols to show frequency.
First Quartile (Q1)
Is the median of the lower half of the set.
Histogram
Is a bar graph that is used to display the frequency of data divided into equal intervals.
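As a rough sketch of how the bars of a histogram get their heights, the snippet below counts how many values land in each equal-width interval. The scores, the starting value, and the interval width are made-up numbers, and `histogram_counts` is a hypothetical helper name.

```python
def histogram_counts(values, start, width, bins):
    """Count how many values fall in each equal-width interval."""
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)  # which interval the value belongs to
        if 0 <= index < bins:              # ignore values outside the covered range
            counts[index] += 1
    return counts

scores = [52, 55, 61, 67, 68, 70, 74, 75, 79, 83, 88, 91]  # hypothetical data
counts = histogram_counts(scores, start=50, width=10, bins=5)

# Print a simple text version of the histogram, one row per interval.
for i, count in enumerate(counts):
    low = 50 + 10 * i
    print(f"{low}-{low + 9}: " + "#" * count)
```

Each row of output corresponds to one bar; the number of `#` symbols is the frequency for that interval.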
Interquartile Range
Is the difference between Q3 and Q1.
Mean
Is the sum of the values in the set divided by the number of values in the set.
Median
Is the middle value in a set when the values are arranged in numerical order.
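A small illustration of the mean and the median, using Python's standard `statistics` module on a made-up data set. Note that when a set has an even number of values there is no single middle value, so the median is taken as the average of the two middle values.

```python
from statistics import mean, median

values = [4, 8, 15, 16, 23, 42]  # hypothetical sample data

data_mean = mean(values)      # sum of the values divided by the number of values
data_median = median(values)  # even count: average of the two middle values (15 and 16)

print("mean:", data_mean)
print("median:", data_median)
```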
Normal Curve
Has the following properties:
- about 68% of the data fall within 1 standard deviation of the mean.
- about 95% of the data fall within 2 standard deviations of the mean.
- about 99.7% of the data fall within 3 standard deviations of the mean.
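The three percentages above can be checked numerically. This sketch uses `NormalDist` from Python's standard `statistics` module; the function name `share_within` is chosen here only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)  # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```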
Normal Distribution
A bell-shaped, symmetric distribution with a tail on each end.
Outlier
Is a value in a data set that is much greater or much less than most of the other values in the data set.
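The definition above does not say how much greater or less a value must be. One common convention (an assumption here, not part of the definition) flags values more than 1.5 times the interquartile range below Q1 or above Q3. A minimal sketch with made-up data and a hypothetical helper `iqr_outliers`:

```python
from statistics import median

def iqr_outliers(values, k=1.5):
    """Return values beyond k * IQR from the quartiles (the 1.5 * IQR convention)."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]        # lower half of the set
    upper = ordered[(n + 1) // 2 :]  # upper half (middle value skipped when n is odd)
    q1, q3 = median(lower), median(upper)
    spread = q3 - q1                 # interquartile range
    low_fence = q1 - k * spread
    high_fence = q3 + k * spread
    return [v for v in ordered if v < low_fence or v > high_fence]

data = [10, 12, 12, 13, 14, 15, 16, 48]  # hypothetical data
outliers = iqr_outliers(data)
print(outliers)
```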
Quartiles
Are values that divide a set into four equal parts.
Range
Is the difference between the greatest and the least data values.
Second Quartile (Q2)
Is the median of the whole set; also known as the median.
Standard Deviation
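Is a measure of spread that represents the typical distance between individual data values and the mean; it is calculated as the square root of the average squared distance from the mean.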
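To make the calculation concrete, here is a sketch that computes the (population) standard deviation by hand and compares it with `pstdev` from Python's `statistics` module. The data set is made up. The plain average of the distances from the mean is also shown, since it is a related but different quantity (the mean absolute deviation).

```python
from math import sqrt
from statistics import mean, pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample data
m = mean(values)

# Standard deviation: square root of the average squared distance from the mean.
squared_distances = [(v - m) ** 2 for v in values]
sd_manual = sqrt(sum(squared_distances) / len(values))
sd_library = pstdev(values)  # population standard deviation from the library

# Mean absolute deviation: the plain average of the distances from the mean.
mean_abs_dev = sum(abs(v - m) for v in values) / len(values)

print(sd_manual, sd_library, mean_abs_dev)
```

For this data the two quantities differ, which is why the squared-distance formula matters when a standard deviation is asked for.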
Statistics
Numbers that characterize a data set, such as measures of center and spread.
Third Quartile (Q3)
Is the median of the upper half of the set.
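The three quartiles and the interquartile range can be sketched with the "median of each half" approach used in these definitions. One assumption is labeled in the code: when the set has an odd number of values, the middle value is left out of both halves (some textbooks include it instead). The data and the helper name `quartiles` are illustrative only.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]        # lower half of the set
    upper = ordered[(n + 1) // 2 :]  # upper half; assumption: middle value excluded when n is odd
    return median(lower), median(ordered), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # interquartile range: difference between Q3 and Q1

print(q1, q2, q3, iqr)
```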
A Line of Best Fit
Is the line that comes closest to all of the points in the data set, using a given process.
Correlation
Is a measure of the strength and direction of the relationship between two variables.
Correlation Coefficient
One way to quantify the correlation of a data set; it is denoted by r and varies from -1 to 1.
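As an illustration, the sketch below computes r with the Pearson formula, which is assumed here to be the coefficient the definition refers to. The data are made up, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # points on a rising line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a falling line
r_mixed = pearson_r(xs, [2, 4, 5, 4, 5])  # a weaker, positive relationship

print(r_up, r_down, r_mixed)
```

Values near 1 or -1 indicate a strong linear relationship; the sign gives its direction.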
Extrapolation
A method for predicting data values for one variable from another based on a line of fit; when the prediction is made for a value outside the extremes (minimum and maximum) of the original data set.
Interpolation
A method for predicting data values for one variable from another based on a line of fit; when the prediction is made for a value within the extremes (minimum and maximum) of the original data set.
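The difference between interpolation and extrapolation comes down to whether the new x-value lies inside the range of the original data. A minimal sketch, with a made-up line of fit and made-up x-values; treating the minimum and maximum themselves as "within the extremes" is an assumption.

```python
def prediction_type(x_new, observed_xs):
    """Classify a prediction by comparing x_new with the observed minimum and maximum."""
    if min(observed_xs) <= x_new <= max(observed_xs):
        return "interpolation"
    return "extrapolation"

observed_xs = [1, 2, 3, 4, 5]  # hypothetical x-values from the original data set
slope, intercept = 0.6, 2.2    # hypothetical line of fit: y = 0.6x + 2.2

for x_new in (3.5, 8):
    predicted = slope * x_new + intercept
    print(x_new, round(predicted, 2), prediction_type(x_new, observed_xs))
```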
Line of Fit
Is a line through a set of two-variable data that illustrates the correlation.
Linear Regression
Is a method for finding the least-squares line.
Residual
Is a signed vertical distance between a data point and a line of fit.
Residual Plot
Is a graph of points whose x-coordinates are the values of the independent variable and whose y-coordinates are the corresponding residuals.
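The points of a residual plot can be sketched as follows. The data and the line of fit y = 0.6x + 2.2 are made up for illustration, and the sign convention (observed value minus predicted value) is the usual one, assumed here.

```python
def residuals(xs, ys, slope, intercept):
    """Signed vertical distances: observed y minus the y predicted by the line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4, 5]  # hypothetical independent-variable values
ys = [2, 4, 5, 4, 5]  # hypothetical dependent-variable values
res = residuals(xs, ys, slope=0.6, intercept=2.2)

# Each point of the residual plot pairs an x-value with its residual.
residual_plot_points = list(zip(xs, res))
for x, r in residual_plot_points:
    print(x, round(r, 2))
```

A positive residual means the data point lies above the line; a negative residual means it lies below.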
Scatter Plot
A graph that plots each pair of values as a point; one method of visualizing two-variable data.
The Least-Squares Line
For a data set is the line of fit for which the sum of the squared residuals is as small as possible.
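A sketch of how the least-squares line can be found for a small, made-up data set, using the standard closed-form slope and intercept formulas. The helper names are hypothetical. The last lines compare the sum of squared residuals against a slightly different line to illustrate the "as small as possible" property.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of the squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]  # hypothetical data
ys = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(xs, ys)

best = sum_squared_residuals(xs, ys, slope, intercept)
other = sum_squared_residuals(xs, ys, 0.5, 2.5)  # an arbitrary competing line
print(round(slope, 2), round(intercept, 2), round(best, 2), round(other, 2))
```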
Two-Variable Data
Is a collection of paired variable values, such as a series of measurements of air temperature at different times of day.
Categorical Data
Data that cannot be expressed with numerical measurements.
Conditional Relative Frequency
Describes what portion of a group with a given characteristic also has another characteristic.
Frequency Table
Shows how often each item occurs in a set of categorical data.
Joint Relative Frequency
Is found by dividing a frequency that is not in the Total row or the Total column by the grand total.
Marginal Relative Frequency
Is found by dividing a row total or a column total by the grand total.
Quantitative Data
Data that can be expressed with numerical measurements.
Relative Frequency
The frequency of the category divided by the total of all frequencies.
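A short example of a frequency table and the matching relative frequencies for one categorical variable. The responses are made up; `Counter` comes from Python's standard `collections` module.

```python
from collections import Counter

responses = ["red", "blue", "red", "green", "red", "blue", "red", "green"]  # hypothetical data

counts = Counter(responses)   # frequency table: how often each category occurs
total = sum(counts.values())  # total of all frequencies

# Relative frequency: category frequency divided by the total of all frequencies.
relative = {category: count / total for category, count in counts.items()}

for category, share in relative.items():
    print(category, counts[category], share)
```

The relative frequencies always add up to 1.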
Two-Way Frequency Table
Lists frequencies for paired values when a data set has two categorical variables.
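The joint, marginal, and conditional relative frequencies defined above can all be read off one two-way frequency table. The sketch below uses made-up survey counts (grade level by preferred pet); all names in it are illustrative.

```python
# Hypothetical two-way frequency table: rows are grade levels, columns are preferred pets.
table = {
    "9th": {"dog": 30, "cat": 20},
    "10th": {"dog": 10, "cat": 40},
}

row_totals = {row: sum(cols.values()) for row, cols in table.items()}
col_totals = {}
for cols in table.values():
    for col, count in cols.items():
        col_totals[col] = col_totals.get(col, 0) + count
grand_total = sum(row_totals.values())

# Joint relative frequency: an inner cell divided by the grand total.
joint = {(row, col): count / grand_total
         for row, cols in table.items() for col, count in cols.items()}

# Marginal relative frequency: a row total or a column total divided by the grand total.
marginal_rows = {row: total / grand_total for row, total in row_totals.items()}
marginal_cols = {col: total / grand_total for col, total in col_totals.items()}

# Conditional relative frequency: within one row (a given grade), the share preferring each pet.
conditional_given_row = {(row, col): count / row_totals[row]
                         for row, cols in table.items() for col, count in cols.items()}

print(joint[("9th", "dog")], marginal_rows["9th"], marginal_cols["dog"],
      conditional_given_row[("9th", "dog")])
```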