44 terms

raw data

numbers and categories that have been collected but have not yet been processed in any way

variable

a characteristic that can differ from one individual to the next

observational unit

a single individual who participates in a study

sample data

measurements that are taken from a subset of a population

sample size

the total number of observational units

dataset

the complete set of raw data, for all observational units and variables in a survey or experiment

population data

measurements that are taken from all individuals of a population

statistic

a summary measure computed from sample data

parameter

a summary measure for an entire population

descriptive statistics

the summary numbers for either a population or a sample

categorical variable

a variable whose values are group or category names that don't necessarily have any logical ordering; each individual falls into exactly one category

ordinal variable

a categorical variable whose categories have a logical order

quantitative variable

a variable whose raw data are recorded as numerical values (either measurements or counts)

continuous variable

a type of quantitative variable that is used when every value within some interval is a possible result

explanatory variable

a variable whose value for an individual is thought to partially explain the value of the response variable for that same individual

response variable

a variable whose value for an individual is thought to be partially explained by the explanatory variable for that same individual

frequency

count of how many observations fall into a category

relative frequency

the proportion or percentage in a category relative to the total count over all categories

frequency distribution

the listing of all categories along with their frequencies

relative frequency distribution

a listing of all categories along with their relative frequencies (given as proportions or percentages)
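
As a rough illustration of the four frequency cards above, here is a minimal Python sketch. The eye-color data and the two helper function names are made up for this example; they are not from the flashcard set.

```python
from collections import Counter

# Hypothetical raw data: one categorical variable (eye color)
# recorded for 10 observational units.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

def frequency_distribution(values):
    """Return {category: count} for a list of category labels."""
    return dict(Counter(values))

def relative_frequency_distribution(values):
    """Return {category: proportion}; the proportions sum to 1."""
    total = len(values)
    return {category: count / total
            for category, count in Counter(values).items()}

freqs = frequency_distribution(eye_colors)
rel_freqs = relative_frequency_distribution(eye_colors)

# Print each category with its frequency and relative frequency (as a percentage).
for category in freqs:
    print(f"{category:6s} {freqs[category]:2d} {rel_freqs[category]:.0%}")
```

Each count is a frequency; dividing it by the total count gives the relative frequency for that category.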

pie charts

visual representations that are useful for summarizing a single categorical variable if there aren't too many categories

bar graphs

visual representations that are useful for summarizing one or two categorical variables; especially useful for comparing two categorical variables

distribution

the overall pattern of how often the possible values occur

location

on a distribution, this is represented by the center or the average

median

the middle value of the ordered data (the average of the two middle values when the number of values is even)

mean

the arithmetic average of data
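
A small sketch of the median and mean cards, using Python's standard `statistics` module. The quiz scores are invented for illustration.

```python
import statistics

# Hypothetical quantitative data: quiz scores for 6 students.
scores = [7, 9, 4, 10, 8, 4]

# Mean: the sum of the values divided by the number of values.
mean_score = statistics.mean(scores)

# Median: the middle of the sorted values. With an even count,
# statistics.median averages the two middle values.
median_score = statistics.median(scores)

print("sorted:", sorted(scores))
print("mean:", mean_score)
print("median:", median_score)
```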

variability

on a distribution, the spread among individual measurements

shape

on a distribution, the overall form of the graph, such as clumped in the middle or skewed to one side

outliers

data points that are not consistent with the bulk of the data

histogram

similar to a bar graph, but for a quantitative variable: each bar shows how many values fall in an interval; not very informative when the sample size is small

stem-and-leaf plots

present all individual values; can be overwhelming for large datasets

boxplot

displays the five-number summary (lowest value, lower quartile, median, upper quartile, highest value); useful for comparing multiple groups and identifying outliers

right

a graph is skewed to the _______ if higher values are more spread out than lower values

left

a graph is skewed to the _______ if lower values are more spread out than higher values

mode

the most frequent value

unimodal

describes a distribution with a single prominent peak in a histogram, stemplot, or dotplot

range

the highest value minus the lowest value

interquartile range

upper quartile - lower quartile
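
The range and interquartile range can be sketched as follows. The data are invented, and the quartiles come from `statistics.quantiles` with `method="inclusive"`, which is only one of several quartile conventions; a textbook or another software package may give slightly different quartiles for the same data.

```python
import statistics

# Hypothetical quantitative data (9 values).
data = [2, 4, 4, 5, 7, 8, 9, 11, 12]

# Range: the highest value minus the lowest value.
data_range = max(data) - min(data)

# Quartiles under the "inclusive" convention (an assumption; other
# conventions exist and can give different cut points).
lower_q, median, upper_q = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: upper quartile minus lower quartile.
iqr = upper_q - lower_q

# The five-number summary that a boxplot displays.
five_number_summary = (min(data), lower_q, median, upper_q, max(data))
print("range:", data_range)
print("IQR:", iqr)
print("five-number summary:", five_number_summary)
```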

resistant statistic

a numerical summary of the data that is "resistant" to the influence of outliers, meaning outliers won't have a major influence on a statistic's numerical value
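
To see what "resistant" means in practice, the sketch below adds one outlier to a small invented dataset and compares how far the mean and the median move. The salary figures are hypothetical.

```python
import statistics

# Hypothetical data: five salaries, in thousands of dollars.
salaries = [40, 42, 45, 47, 51]

# The same data with one extreme value (an outlier) added.
with_outlier = salaries + [400]

# How much each summary changes when the outlier is included.
mean_shift = statistics.mean(with_outlier) - statistics.mean(salaries)
median_shift = statistics.median(with_outlier) - statistics.median(salaries)

print(f"mean moves by {mean_shift:.1f}")
print(f"median moves by {median_shift:.1f}")
```

The median barely changes, so it behaves as a resistant statistic here; the mean is pulled strongly toward the outlier, so it does not.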

first summary number

the mean of a bell-shaped distribution

second summary number

the standard deviation of a bell-shaped distribution

standard deviation

a measure of the spread of values, represented by s for a sample; roughly the average distance that values fall from the mean

variance

the square of the standard deviation
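
A minimal sketch of the standard deviation and variance cards, assuming the usual sample formulas (dividing by n - 1, which is why s is only roughly the average distance from the mean). The data values are invented.

```python
import math
import statistics

# Hypothetical sample data.
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
mean = sum(values) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)

# Sample standard deviation s: the square root of the variance.
s = math.sqrt(variance)

print(f"variance = {variance:.3f}")
print(f"s = {s:.3f}")

# The standard library computes the same sample quantities.
print(f"statistics.stdev = {statistics.stdev(values):.3f}")
```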

empirical rule

for a bell-shaped distribution, approximately 68% of values fall within 1 standard deviation of the mean in either direction; approximately 95% of values fall within 2 standard deviations of the mean in either direction; approximately 99.7% of values fall within 3 standard deviations of the mean in either direction
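
The percentages in the empirical rule can be checked against an exactly normal curve, which is one idealized model of a bell-shaped distribution (an assumption of this sketch; real data follow the rule only approximately). The helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

# Print the share of values within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
```

The result does not depend on the particular mean or standard deviation chosen, only on the number of standard deviations k.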