statistics
the science of collecting, analyzing, and drawing conclusions from data.
population of interest
the entire collection of individuals or objects about which information is desired.
sample
a subset of the population, selected for study in some prescribed manner.
descriptive statistics
methods for organizing and summarizing data.
inferential statistics
methods for generalizing from a sample to the population from which it was selected.
univariate data set
a data set consisting of observations on a single attribute.
categorical data (qualitative)
a univariate data set is categorical (qualitative) if the individual observations are categorical responses.
numerical data (quantitative)
a univariate data set is numerical (quantitative) if each observation is a number.
bivariate data set
a data set consisting of observations on two different attributes.
multivariate data set
a data set consisting of observations for each of two or more attributes.
numerical data: discrete
numerical data are discrete if the possible values are isolated points on the number line.
numerical data: continuous
numerical data are continuous if the set of possible values forms an entire interval on the number line.
frequency distribution for categorical data
a table that displays the possible categories along with the associated frequencies or relative frequencies.
frequency
for a particular category, the number of times the category appears in the data set.
relative frequency
for a particular category, the fraction or proportion of the time that the category appears in the data set. It is calculated as: relative frequency = frequency / number of observations in the data set
relative frequency distribution
a frequency distribution whose table includes relative frequencies.
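As a rough illustration of the frequency and relative frequency definitions above, the following Python sketch builds a small frequency distribution for categorical data. The function name `frequency_distribution` and the sample responses are hypothetical, chosen only for this example.

```python
from collections import Counter

def frequency_distribution(observations):
    """Map each category to (frequency, relative frequency)."""
    n = len(observations)           # number of observations in the data set
    counts = Counter(observations)  # frequency of each category
    return {category: (count, count / n) for category, count in counts.items()}

# Hypothetical data set of 8 categorical responses (an assumption for illustration).
responses = ["red", "blue", "red", "green", "blue", "red", "red", "green"]
table = frequency_distribution(responses)

# Print one row per category: category, frequency, relative frequency.
for category, (freq, rel_freq) in table.items():
    print(f"{category:<6} {freq:>3} {rel_freq:.3f}")
```

The relative frequencies in such a table should sum to 1, which makes a convenient sanity check.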
bar chart
a graph of the frequency distribution of categorical data.
dotplot
a simple way to display numerical data when the data set is reasonably small.
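Assuming the display described above is a dotplot (one dot per observation, placed along a numerical scale), a minimal text-only sketch might look like the following. It assumes a small data set of integers and draws the scale vertically, one row per value; the function name `text_dotplot` and the sample data are hypothetical.

```python
from collections import Counter

def text_dotplot(values):
    """Return a text dotplot: one row per integer on the scale, one 'o' per observation."""
    counts = Counter(values)
    rows = []
    # Include every integer from min to max so gaps in the data stay visible.
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>3} | {'o' * counts[value]}".rstrip())
    return "\n".join(rows)

# Hypothetical small numerical data set (an assumption for illustration).
data = [1, 1, 3, 4, 4, 4]
print(text_dotplot(data))
```

A plotting library would normally be used for a real dotplot; this sketch only shows how each observation becomes one dot at its value.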