35 terms

Dot Plot

Is a data representation that uses a number line and X's, dots, or other symbols to show frequency.

First Quartile (Q1)

Is the median of the lower half of the set.

Histogram

Is a bar graph that displays the frequency of data grouped into equal-width intervals.
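
As a rough illustration of the histogram entry, the sketch below counts how many values fall into each equal-width interval and prints one text bar per interval. The function name `histogram_counts`, the scores, and the interval width of 10 are all made up for the example.

```python
def histogram_counts(values, start, width, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)  # which interval the value belongs to
        if 0 <= index < bins:              # ignore values outside the covered span
            counts[index] += 1
    return counts

scores = [52, 58, 61, 67, 70, 73, 75, 78, 84, 91]  # made-up data
counts = histogram_counts(scores, start=50, width=10, bins=5)

# One row per interval; the bar length is the frequency.
for i, count in enumerate(counts):
    low = 50 + 10 * i
    print(f"{low}-{low + 9}: {'#' * count}")
```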

Interquartile Range

Is the difference between Q3 and Q1.

Mean

Is the sum of the values in the set divided by the number of values in the set.

Median

Is the middle value in a set when the values are arranged in numerical order.
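
A small sketch of the mean and median definitions, using Python's standard `statistics` module on made-up values. With an even number of values there is no single middle value, so `median` averages the two middle ones.

```python
from statistics import mean, median

values = [4, 8, 15, 16, 23, 42]  # made-up sample data

# Mean: the sum of the values divided by the number of values.
manual_mean = sum(values) / len(values)
print(manual_mean)      # 18.0
print(mean(values))     # same result from the library

# Median: the middle value of the sorted data. Here the count is even,
# so the two middle values (15 and 16) are averaged.
print(median(values))   # 15.5
```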

Normal Curve

Has the following properties:

- about 68% of the data fall within 1 standard deviation of the mean.

- about 95% of the data fall within 2 standard deviations of the mean.

- about 99.7% of the data fall within 3 standard deviations of the mean.

Normal Distribution

A bell-shaped, symmetric distribution with a tail on each end.
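
The 68%, 95%, and 99.7% figures listed for the normal curve can be checked numerically. The sketch below uses `statistics.NormalDist` from the standard library; the helper name `share_within` is invented for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The same fractions hold for any mean and standard deviation, because the rule is stated in units of standard deviations.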

Outlier

Is a value in a data set that is much greater or much less than most of the other values in the data set.

Quartiles

Are values that divide a set into four equal parts.

Range

Is the difference between the greatest and the least data values.

Second Quartile (Q2)

Is the median of the whole set; also known as the median.

Standard Deviation

Measures how far individual data values typically lie from the mean; calculated as the square root of the average squared distance from the mean.
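
Standard deviation is usually computed as the square root of the mean squared distance from the mean, which is not quite the same as the plain average distance. The sketch below shows both on made-up values and assumes the population form (dividing by the number of values); a sample standard deviation would divide by one less.

```python
from math import sqrt
from statistics import pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up data
mean_value = sum(values) / len(values)   # 5.0

# Square each distance from the mean, average the squares, take the root.
squared_distances = [(v - mean_value) ** 2 for v in values]
std_dev = sqrt(sum(squared_distances) / len(values))
print(std_dev)          # 2.0
print(pstdev(values))   # library result for the population standard deviation

# For comparison: the plain average of the distances is a different number.
mean_abs_distance = sum(abs(v - mean_value) for v in values) / len(values)
print(mean_abs_distance)  # 1.5
```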

Statistics

Numbers that characterize a data set, such as measures of center and spread.

Third Quartile (Q3)

Is the median of the upper half of the set.
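
The quartile, interquartile range, and range definitions can be sketched together. This example assumes one common convention: when the set has an odd number of values, the middle value is left out of both halves. Other conventions exist and can give slightly different quartiles. The function name `quartiles` and the data are made up.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # made-up data
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)             # 3.5 7 11.0
print(q3 - q1)                # interquartile range: 7.5
print(max(data) - min(data))  # range: 14
```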

A Line of Best Fit

Is the line that comes closest to all of the points in the data set, using a given process.

Correlation

Is a measure of the strength and direction of the relationship between two variables.

Correlation Coefficient

One way to quantify the correlation of a data set; denoted by r; varies from -1 to 1.
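
The entry does not say which formula produces r. The sketch below assumes the Pearson correlation coefficient, which is the usual choice in this context, and writes it out by hand; the function name and data are made up.

```python
from math import sqrt

def correlation_coefficient(xs, ys):
    """Pearson's r for paired data (assumed formula, see the note above)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Points that rise along a perfect line give r = 1.
print(correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8]))   # 1.0
# Points that fall along a perfect line give r = -1.
print(correlation_coefficient([1, 2, 3, 4], [8, 6, 4, 2]))   # -1.0
```

The sign of r gives the direction of the relationship, and how close it is to -1 or 1 gives the strength.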

Extrapolation

A method for predicting data values for one variable from another based on a line of fit; when the prediction is made for a value outside the extremes (minimum and maximum) of the original data set.

Interpolation

A method for predicting data values for one variable from another based on a line of fit; when the prediction is made for a value within the extremes (minimum and maximum) of the original data set.
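
The difference between interpolation and extrapolation comes down to where the new x-value sits relative to the original data. In the sketch below, the line of fit y = 2x + 1, the data, and both function names are hypothetical, and a value exactly at the minimum or maximum is treated as within the extremes.

```python
def predict(x_new, slope, intercept):
    """Predicted y-value from a line of fit y = slope * x + intercept."""
    return slope * x_new + intercept

def prediction_type(x_new, x_data):
    """Classify a prediction by where x_new falls relative to the observed x-values."""
    if min(x_data) <= x_new <= max(x_data):
        return "interpolation"
    return "extrapolation"

x_data = [1, 2, 3, 4, 5]   # made-up observed x-values
slope, intercept = 2, 1    # hypothetical line of fit

print(predict(3.5, slope, intercept), prediction_type(3.5, x_data))  # 8.0 interpolation
print(predict(8, slope, intercept), prediction_type(8, x_data))      # 17 extrapolation
```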

Line of Fit

Is a line through a set of two-variable data that illustrates the correlation.

Linear Regression

Is a method for finding the least-squares line.

Residual

Is a signed vertical distance between a data point and a line of fit.

Residual Plot

Is a graph of points whose x-coordinates are the values of the independent variable and whose y-coordinates are the corresponding residuals.
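
Residuals and the points of a residual plot can be computed directly from a line of fit. The sketch assumes the usual sign convention (observed value minus predicted value); the data and the line y = 2x + 1 are made up.

```python
def residuals(xs, ys, slope, intercept):
    """Signed vertical distances: observed y minus the y predicted by the line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4]          # independent variable (made-up data)
ys = [3, 5, 6, 9]          # dependent variable
slope, intercept = 2, 1    # hypothetical line of fit

res = residuals(xs, ys, slope, intercept)
print(res)  # [0, 0, -1, 0]

# Points of the residual plot: (x, residual) pairs.
residual_plot_points = list(zip(xs, res))
print(residual_plot_points)  # [(1, 0), (2, 0), (3, -1), (4, 0)]
```

A negative residual means the data point lies below the line; a positive one means it lies above.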

Scatter Plot

One method of visualizing two-variable data.

The Least-Squares Line

For a data set is the line of fit for which the sum of the squared residuals is as small as possible.
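
One standard way to find the least-squares line for a single independent variable is the closed-form slope and intercept shown below. This is a sketch with made-up data and function names; it also compares the sum of squared residuals against a slightly different line to illustrate the "as small as possible" part of the definition.

```python
def least_squares_line(xs, ys):
    """Slope and intercept that minimize the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

def sum_squared_residuals(xs, ys, slope, intercept):
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
print(round(slope, 2), round(intercept, 2))  # 0.6 2.2

best = sum_squared_residuals(xs, ys, slope, intercept)
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
print(best < nudged)  # True: changing the slope makes the fit worse
```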

Two-Variable Data

Is a collection of paired variable values, such as a series of measurements of air temperature at different times of day.

Categorical Data

Data that cannot be expressed with numerical measurements.

Conditional Relative Frequency

Describes what portion of a group with a given characteristic also has another characteristic.

Frequency Table

Shows how often each item occurs in a set of categorical data.

Joint Relative Frequency

Is found by dividing a frequency that is not in the Total row or the Total column by the grand total.

Marginal Relative Frequency

Is found by dividing a row total or a column total by the grand total.

Quantitative Data

Data that can be expressed with numerical measurements.

Relative Frequency

The frequency of the category divided by the total of all frequencies.
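
A frequency table and the matching relative frequencies can be built from raw categorical data with `collections.Counter`. The survey responses below are made up.

```python
from collections import Counter

responses = ["cat", "dog", "dog", "fish", "dog", "cat", "dog", "fish"]  # made-up data

# Frequency table: how often each category occurs.
frequency = Counter(responses)
print(frequency["dog"], frequency["cat"], frequency["fish"])  # 4 2 2

# Relative frequency: each frequency divided by the total of all frequencies.
total = sum(frequency.values())
relative_frequency = {category: count / total for category, count in frequency.items()}
print(relative_frequency["dog"])  # 0.5
```

The relative frequencies of all categories add up to 1.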

Two-Way Frequency Table

Is a table that lists the frequency of each pairing of values when a data set has two categorical variables.
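
The joint, marginal, and conditional relative frequencies defined earlier can all be read off one two-way frequency table. The table below (grade level by how students get to school) is invented for illustration.

```python
# Rows are one categorical variable, columns the other (made-up counts).
table = {
    "9th grade":  {"bus": 30, "car": 20},
    "10th grade": {"bus": 10, "car": 40},
}

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
column_totals = {
    column: sum(table[row][column] for row in table)
    for column in ("bus", "car")
}
grand_total = sum(row_totals.values())
print(grand_total)  # 100

# Joint relative frequency: one cell divided by the grand total.
joint = table["9th grade"]["bus"] / grand_total
print(joint)  # 0.3

# Marginal relative frequency: a row or column total divided by the grand total.
marginal_bus = column_totals["bus"] / grand_total
print(marginal_bus)  # 0.4

# Conditional relative frequency: one cell divided by its row (or column) total.
bus_given_9th = table["9th grade"]["bus"] / row_totals["9th grade"]
print(bus_given_9th)  # 0.6
```

Dividing the same cell by the column total instead (30 / 40) answers a different question: what portion of bus riders are in 9th grade.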