statistics applied to the business world to help improve people's decision making in fields such as marketing, operations, finance, and human resources.
a sample that does not represent the intended population and that can lead to distorted findings.
data values that correspond to a specific time period. (pg. 10)
the values assigned to specific observations or measurements.
the branch of statistics that focuses on summarizing, or displaying, data.
a method of gathering primary data by observing subjects of interest in their natural environment.
a method of gathering primary data by exposing subjects to certain treatments and recording the data of interest.
a direct observational technique whereby individuals are often paid to discuss their attitudes toward products or services in a group setting controlled by a moderator.
the field of statistics that allows us to make claims or conclusions about a population based on a sample of data from that population.
data that are transformed into useful facts that can be used for a specific purpose, such as making a decision.
Interval Level of Measurement
quantitative data that can measure the difference between categories with actual numbers, giving measurable meaning to the differences.
Nominal Level of Measurement
qualitative data observations that are assigned to predetermined categories.
Ordinal Level of Measurement
data that have all the properties of nominal data but that also permit the rank-ordering of values from the highest to lowest.
data that describe characteristics of a population.
represents all possible outcomes or measurements of interest.
data collected by the person or organization that eventually uses the data.
data that use descriptive terms to measure or classify something of interest.
data that use numerical values to describe something of interest.
Ratio Level of Measurement
data that have all the features of interval data, with the added benefit of a true zero point.
True Zero Point
a zero data value indicates the absence of the object being measured.
a subset of a population.
data that somebody else has collected and made available for others to use.
data that describe characteristics about a sample.
the mathematical science that deals with the collection, analysis, and presentation of data, which can then be used as a basis for inference and induction.
a method of collecting primary data by asking subjects a series of questions.
Time Series Data
data values that correspond to a specific measurement over a range of time periods. (pg. 10)
a chart that displays qualitative data that have been organized in categories and can be arranged vertically or horizontally.
a category in a frequency distribution.
Clustered Bar Chart
a bar chart that groups several values side by side within the same category in a vertical direction.
a table that provides a format to display the frequencies of two qualitative variables.
values that can take on any value within a given range; are often the result of measuring observations rather than counting them.
Cumulative Percentage Polygon (ogive)
a line graph that plots the cumulative relative frequency distribution.
Cumulative Relative Frequency Distribution
a frequency distribution that totals the proportion of observations that are less than or equal to the class at which you are looking.
the variable in the scatter plot that is influenced by changes in the other variable; is placed on the vertical axis.
values based on observations that can be counted; they are restricted to integer values (whole numbers).
a table that shows the number of data observations that fall into specific intervals. (pg. 27, 33)
a bar graph showing the number of observations in each class as the height of each bar. (pg. 30)
Horizontal Bar Chart
a bar chart that displays the bars in a horizontal direction.
the variable in the scatter plot that is not affected by changes in the other variable; is placed on the horizontal axis.
a special type of scatter plot in which the data points in the scatter plot are connected with a line.
a bar chart that shows in a decreasing order the frequency of the categories that cause quality control problems.
graphs the midpoint of each class as a line rather than a column. The height of each midpoint represents the relative frequency of the corresponding class.
a chart that displays categories in the shape of a pie; each segment of the pie represents the relative frequency of that category.
Relative Frequency Distribution
a frequency distribution that displays the proportion of observations of each class relative to the total number of observations.
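As a minimal sketch of the relative and cumulative relative frequency distributions defined in this list, the snippet below converts class counts into proportions. The class labels and counts are made-up illustration data, not values from the source.

```python
from itertools import accumulate

# Hypothetical frequency distribution: class label -> number of observations
frequencies = {"A": 2, "B": 5, "C": 3}

total = sum(frequencies.values())

# Relative frequency: each class count divided by the total number of observations
relative = [count / total for count in frequencies.values()]

# Cumulative relative frequency: running total of counts divided by the total,
# i.e. the proportion of observations in this class or any earlier class
cumulative = [running / total for running in accumulate(frequencies.values())]

for label, rel, cum in zip(frequencies, relative, cumulative):
    print(f"{label}: relative={rel:.2f}, cumulative={cum:.2f}")
```

The last cumulative value is always 1.0, which is a quick check that every observation was counted.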
a chart that provides a picture of the relationship between two quantitative variables that are paired together on a graph with a horizontal and vertical axis. (pg. 62)
Stacked Bar Chart
a bar chart that groups several values in a single column within the same category in a vertical direction.
Stem and Leaf Display
a chart that splits data values into stems (the larger place values) and leaves (the smaller place values).
the shape of a distribution when its two halves mirror one another.
a data set where each data point is associated with a specific point in time.
a graphical display showing the relative position of a distribution's three quartiles as a box on a number line, along with the minimum and maximum values.
data that are described by a category or label.
a theorem stating that, regardless of whether a distribution is bell-shaped, at least a certain percentage of data values falls within a given number of standard deviations of the mean.
Theorem: at least 93.75% (about 94%) of data values
will fall within +/- four standard deviations of the mean.
Theorem: at least 88.9% (about 89%) of data values
will fall within +/- three standard deviations of the mean.
Theorem: at least 75% of data values
will fall within +/- two standard deviations of the mean.
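The percentages on these three cards all come from the standard bound 1 - 1/k^2, where k is the number of standard deviations (k > 1). A short sketch, with a hypothetical function name chosen for illustration:

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("The bound is only informative for k > 1")
    return 1 - 1 / k**2

# Reproduce the percentages for two, three, and four standard deviations
for k in (2, 3, 4):
    print(f"k={k}: at least {chebyshev_lower_bound(k):.2%}")
```

Because the bound holds for any distribution shape, it is looser than the percentages a bell-shaped distribution would give.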
Coefficient of Variation
a measure of the standard deviation in terms of its percentage of the mean.
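A minimal sketch of the coefficient of variation, assuming the sample standard deviation is the one intended; the data values are invented for illustration.

```python
import statistics

# Hypothetical sample data
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
sample_sd = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)

# Coefficient of variation: standard deviation as a percentage of the mean
cv_percent = sample_sd / mean * 100
print(f"CV = {cv_percent:.1f}%")
```

Since the result is a percentage, it can be used to compare the spread of data sets measured in different units or with very different means.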
a rule which states that if a distribution follows a bell-shaped, symmetrical curve centered around the mean, approximately 68%, 95%, and 99.7% of the values fall within one, two, and three standard deviations around the mean respectively.
a list that consists of a distribution's minimum value; first, second, and third quartiles; and maximum value.
Index Point (i)
marks the middle of the data values and is used to determine the position of the median in the data set.
Interquartile Range (IQR)
the third quartile minus the first quartile. It spans the middle 50% of the data values.
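The quartiles, five-number summary, and interquartile range can be sketched with Python's standard library. Note that several quartile conventions exist; this example uses the default "exclusive" method of statistics.quantiles, which may differ slightly from a textbook's index-point procedure. The data are illustrative.

```python
import statistics

# Hypothetical data set
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(n=4) returns the three cut points Q1, Q2 (median), and Q3
q1, q2, q3 = statistics.quantiles(data, n=4)

# Five-number summary: minimum, Q1, Q2, Q3, maximum
five_number_summary = (min(data), q1, q2, q3, max(data))

# Interquartile range: spread of the middle 50% of the data
iqr = q3 - q1

print(five_number_summary, iqr)
```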
the shape of the distribution when the median is higher than the mean.
a measure of central tendency that is calculated by adding up all of the values in a data set and then dividing the result by the number of observations.
Measures of Central Tendency
measures that use a single value to describe the center point of a data set.
Measures of Relative Position
measures that compare the position of one value in relation to other values in a data set.
Measure of Variability
measures that determine how much of a spread there is within a data set.
the value in a data set for which half the observations are higher and half the observations are lower.
the halfway point of a class in a frequency distribution. It can be found by taking the average of the two endpoints of the class.
the value that appears most often in a data set.
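The three measures of central tendency defined in this list (mean, median, and mode) can be compared on one small data set. This is a sketch using Python's statistics module; the values are made up.

```python
import statistics

# Hypothetical data set
data = [2, 3, 3, 5, 7]

mean_value = statistics.mean(data)      # sum of values divided by the count
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

Here the mean is higher than the median, which is the pattern described on the card for a distribution whose mean exceeds its median.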
values that are much higher or lower than most of the data.
measure the approximate percentage of values in the data set that are below the value of interest.
identifies the percentile of a particular value within a set of data.
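One common way to compute the percentile of a particular value is to count the values below it, add half of the values equal to it, and divide by the number of observations. That convention is an assumption here, since the source does not give a formula; the function name and data are hypothetical.

```python
def percentile_rank(data, value):
    """Approximate percentage of observations below `value`.

    Assumed convention: tied values count as half below.
    """
    below = sum(1 for x in data if x < value)
    equal = sum(1 for x in data if x == value)
    return 100 * (below + 0.5 * equal) / len(data)

# Hypothetical data set
scores = [10, 20, 30, 40, 50]
print(percentile_rank(scores, 30))
```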
the approximate percentage of values in the data set that are below the value of interest (where p is any number between 1 and 100).
the first, second and third quartiles are the 25th, 50th (median), and 75th percentiles, respectively.
a measure of variability that is found by subtracting the lowest value from the highest value in a data set.
the shape of a distribution when its mean is higher than its median.
Sample Correlation Coefficient (rxy)
measures both the strength and direction of the linear relationship between two variables.
Sample Covariance (sxy)
measures the direction of the linear relationship between two variables.
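A minimal sketch of the sample covariance and sample correlation coefficient, written out by hand so each step is visible. The paired data are invented, and the helper names are hypothetical.

```python
import math

def sample_covariance(x, y):
    """Sum of cross-deviations from the means, divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def sample_correlation(x, y):
    """Covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(x, x))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (s_x * s_y)

# Hypothetical paired observations with a perfect positive linear relationship
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

s_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
print(s_xy, round(r_xy, 6))
```

The sign of the covariance gives only the direction of the relationship, while the correlation coefficient rescales it to the range -1 to +1 so that strength can be read as well.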
the square root of a distribution's variance.
a measure of variability that describes how far the data points in a set are spread around the mean of the data set.
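The relationship between these two cards (the standard deviation is the square root of the variance) can be checked with a short sketch. The data are illustrative; both the population and the sample versions are shown, because the source does not say which one is meant.

```python
import math
import statistics

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population versions divide the squared deviations by n
pop_variance = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Sample versions divide by n - 1
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(pop_variance, pop_sd)
print(sample_variance, sample_sd)
print(math.isclose(sample_sd, math.sqrt(sample_variance)))
```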
allows you to assign more weight to certain values and less weight to others when calculating the mean.
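A weighted mean multiplies each value by its weight, sums the products, and divides by the sum of the weights. A minimal sketch with invented scores and weights:

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical example: three scores, with the first counting the most
scores = [80, 90, 70]
weights = [5, 3, 2]

print(weighted_mean(scores, weights))
```

With equal weights the result reduces to the ordinary mean.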
a measure that identifies the number of standard deviations a particular value is from the mean of its distribution.
has no units; is zero for values equal to the mean (positive for values above the mean, negative for values below the mean); a score above +3 or below -3 flags the data value as an outlier.
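The score described on this card is computed as the value minus the mean, divided by the standard deviation. A short sketch, with hypothetical function names and an assumed mean of 5 and standard deviation of 2:

```python
def standard_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (value - mean) / std_dev

def is_outlier(value, mean, std_dev, threshold=3):
    """Flag values whose score is above +threshold or below -threshold."""
    return abs(standard_score(value, mean, std_dev)) > threshold

# Assumed distribution parameters for illustration
mean, std_dev = 5, 2

print(standard_score(9, mean, std_dev))   # above the mean, so positive
print(standard_score(5, mean, std_dev))   # equal to the mean, so zero
print(is_outlier(12, mean, std_dev))      # score of 3.5 exceeds the threshold
```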