37 terms

Data

A collection of values (numbers, words, or measurements) recorded in a context.

Population

A set of individuals that we wish to describe and/or make predictions about.

Individual

A single member of a population.

Variable

A characteristic recorded about each individual in a data set.

Categorical variable

A variable that records qualities or characteristics of an individual such as gender or eye color.

Quantitative variable

A variable that records a numerical measurement of an individual, such as height, weight, or age.

Center

The most typical value in a data set.

Mode

The value that occurs most often.

Spread

How much the values typically vary from the center.

Range

The difference between the lowest and the highest values.

Outlier

A data value that does not fit the overall pattern.

Shape

The overall form of a distribution: mound-shaped and symmetrical, uniform, skewed left, or skewed right.

Sample

A set of data collected and/or selected from a statistical population by a defined procedure.

Frequency

The number of times the event occurred in an experiment or study.

Frequency table

A table that shows the number of times each value occurs as a data point.
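A frequency table can be sketched in a few lines of Python. The eye-color data below are made up for illustration, and `Counter` from the standard library is just one convenient way to do the counting.

```python
from collections import Counter

# Hypothetical data: eye color recorded for eight individuals.
eye_colors = ["brown", "blue", "brown", "green",
              "blue", "brown", "hazel", "brown"]

# Counter maps each distinct value to its frequency.
frequency_table = Counter(eye_colors)

# Print one row per value, most frequent first.
for value, count in frequency_table.most_common():
    print(f"{value:<6} {count}")
```

The frequencies always add up to the number of data points, which is a quick way to check a table built by hand.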

Interval

A numeric scale in which we know not only the order of the values but also the exact differences between them.

Mean

The sum of the values divided by the number of values.

Median

The middle value, or the average of the two middle values, when the data are arranged in numerical order.
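As a minimal sketch, the three measures of center can be compared on one small data set. The scores are invented for illustration; `statistics.median` and `statistics.mode` come from Python's standard library.

```python
import statistics

# Hypothetical test scores.
scores = [70, 85, 85, 90, 100]

mean = sum(scores) / len(scores)     # add up the values, divide by how many
median = statistics.median(scores)   # middle value of the sorted data
mode = statistics.mode(scores)       # value that occurs most often

# With an even number of values, the median is the average
# of the two middle values (85 and 90 here).
even_median = statistics.median([70, 85, 90, 100])

print(mean, median, mode, even_median)
```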

Five-number summary

Minimum, Q1, median, Q3, and maximum.

IQR

Q3 - Q1; the spread of the middle 50% of the data (not sensitive to outliers).

Deviation

The amount that a single data value differs from the mean.

Standard deviation

A measure of how spread out the values are around the mean, calculated as the square root of the variance.

Mean absolute deviation

The average distance between each data value and the mean.
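The three spread measures above can be sketched together. The data are made up, and the sketch assumes the population form of the standard deviation (dividing by the number of values); `statistics.stdev` would instead divide by one less than that, which gives a slightly larger result.

```python
import statistics

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(data) / len(data)

# Deviation: how far each value is from the mean (these sum to zero).
deviations = [x - mean for x in data]

# Mean absolute deviation: average distance from the mean.
mad = sum(abs(d) for d in deviations) / len(data)

# Population standard deviation: square root of the average squared deviation.
std_dev = statistics.pstdev(data)

print(deviations, mad, std_dev)
```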

First quartile

The median of the lower half of the data. About 25% of the data are below Q1.

Third quartile

The median of the upper half of the data. About 25% of the data are above Q3.
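A rough sketch of the five-number summary and the IQR (Q3 - Q1), using the "median of each half" approach. The function name `five_number_summary` is hypothetical, and the sketch assumes the convention that the median is left out of both halves when the number of values is odd; other textbooks and software use slightly different quartile rules.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]        # lower half (median excluded when n is odd)
    upper = data[n - half:]    # upper half (median excluded when n is odd)
    return (data[0],
            statistics.median(lower),
            statistics.median(data),
            statistics.median(upper),
            data[-1])

# Hypothetical data.
minimum, q1, median, q3, maximum = five_number_summary([1, 3, 5, 7, 9, 11, 13, 15])
iqr = q3 - q1   # spread of the middle 50% of the data

print(minimum, q1, median, q3, maximum, iqr)
```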

Two-way table

A table that shows the relationship between two categorical variables.

Joint frequencies

The counts in the body of a two-way table; each one is the number of individuals in a particular combination of categories.

Marginal frequencies

The totals in the total row and total column of a two-way table.
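To make joint and marginal frequencies concrete, here is a small sketch that builds a two-way table from raw responses. The survey data and the variable names are invented for illustration.

```python
from collections import Counter

# Hypothetical survey: (grade level, preferred pet) for each individual.
responses = [("9th", "dog"), ("9th", "cat"), ("9th", "dog"),
             ("10th", "cat"), ("10th", "cat"), ("10th", "dog")]

joint = Counter(responses)                        # body of the table
row_totals = Counter(g for g, _ in responses)     # marginal: total column
col_totals = Counter(p for _, p in responses)     # marginal: total row
sample_size = len(responses)                      # grand total

rows = sorted(row_totals)
cols = sorted(col_totals)

# Print the table with the marginal totals on the right and bottom.
print(f"{'':<6}" + "".join(f"{c:>6}" for c in cols) + f"{'total':>7}")
for r in rows:
    cells = "".join(f"{joint[(r, c)]:>6}" for c in cols)
    print(f"{r:<6}{cells}{row_totals[r]:>7}")
print(f"{'total':<6}" + "".join(f"{col_totals[c]:>6}" for c in cols)
      + f"{sample_size:>7}")
```

Each marginal frequency is the sum of the joint frequencies in its row or column, and the marginal frequencies in either direction add up to the total number of individuals.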

Sample size

The total number of individuals surveyed.

Residual

The difference between the observed value of the dependent variable and the predicted value.

Residual plot

A graph that shows the difference between the actual data (what is provided through a table or graph) and the predicted data (what the model says should happen).

Correlation

A measure of the strength and direction of the relationship between two variables.
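One common way to put a number on correlation is the Pearson correlation coefficient r, which ranges from -1 to 1. The card above does not name a specific measure, so treating "correlation" as Pearson's r is an assumption here; the function name and the data are also made up for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r (an assumed choice of measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how each varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = correlation(hours, scores)
print(round(r, 3))
```

A value of r near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests a weak one. A strong correlation on its own does not show causation.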

Causation

One event is the result of the occurrence of another event.

Skew

When a distribution stretches out farther in one direction than the other.

Line of best fit

A straight line drawn through the center of a group of data points plotted on a scatterplot.
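A line of best fit is often found with the least-squares method, which is assumed in this sketch since the card does not name a method. The function name `best_fit_line` and the data are hypothetical. The sketch also computes each residual as the observed value minus the predicted value.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x   # line passes through (mean_x, mean_y)
    return slope, intercept

# Hypothetical paired data.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

slope, intercept = best_fit_line(hours, scores)
predicted = [slope * x + intercept for x in hours]

# Residual = observed value - predicted value.
residuals = [y - p for y, p in zip(scores, predicted)]
print(slope, intercept, residuals)
```

Plotting `residuals` against `hours` would give a residual plot. For a least-squares line the residuals sum to zero, up to rounding.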

Bivariate statistics

Analyzing two variables to find the relationship between them.

Univariate statistics

Analyzing data that involve only one variable.