Business Statistics

statistics applied to the business world to help improve people's decision making in fields such as marketing, operations, finance, and human resources.

Biased Sample

a sample that does not represent the intended population and that can lead to distorted findings.

Cross-sectional data

data values that correspond to a specific time period. (pg. 10)

Data

the values assigned to specific observations or measurements.

Descriptive Statistics

the branch of statistics that focuses on summarizing, or displaying, data.

Direct Observation

a method of gathering primary data by observing subjects of interest in their natural environment.

Experiment

a method of gathering primary data by exposing subjects to certain treatments and recording the data of interest.

Focus Group

a direct observational technique whereby individuals are often paid to discuss their attitudes toward products or services in a group setting controlled by a moderator.

Inferential Statistics

the field of statistics that allows us to make claims or conclusions about a population based on a sample of data from that population.

Information

data that are transformed into useful facts that can be used for a specific purpose, such as making a decision.

Interval Level of Measurement

quantitative data for which the differences between values are measured with actual numbers, so that those differences are meaningful; interval data have no true zero point.

Nominal Level of Measurement

qualitative data observations that are assigned to predetermined categories.

Ordinal Level of Measurement

data that have all the properties of nominal data but that also permit the rank-ordering of values from the highest to lowest.

Parameter

a numerical value that describes a characteristic of a population.

Population

represents all possible outcomes or measurements of interest.

Primary Data

data collected by the person or organization that eventually uses the data.

Qualitative Data

data that use descriptive terms to measure or classify something of interest.

Quantitative Data

data that use numerical values to describe something of interest.

Ratio Level of Measurement

data that have all the features of interval data, with the added benefit of a true zero point.

True Zero Point

a zero data value indicates the absence of the object being measured.

Sample

a subset of a population.

Secondary Data

data that somebody else has collected and made available for others to use.

Statistic

a numerical value that describes a characteristic of a sample.

Statistics

the mathematical science that deals with the collection, analysis, and presentation of data, which can then be used as a basis for inference and induction.

Survey

a method of collecting primary data by asking subjects a series of questions.

Time Series Data

data values that correspond to a specific measurement over a range of time periods. (pg. 10)

Bar Chart

a chart that displays qualitative data that have been organized in categories and can be arranged vertically or horizontally.

Class

a category in a frequency distribution.

Clustered Bar Chart

a bar chart that groups several values side by side within the same category in a vertical direction.

Contingency Table

a table that provides a format to display the frequencies of two qualitative variables.

Continuous Data

data that can take on any value within a given range; they are often the result of measuring observations rather than counting them.

Cumulative Percentage Polygon (ogive)

a line graph that plots the cumulative relative frequency distribution.

Cumulative Relative Frequency Distribution

a frequency distribution that totals the proportion of observations that are less than or equal to the class at which you are looking.
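
As a rough illustration of how a cumulative relative frequency distribution is built, the sketch below counts each class, converts the counts to proportions, and keeps a running total. The `purchases` data and the function name are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of items purchased by 10 customers.
purchases = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]


def cumulative_relative_frequency(values):
    """Return (class, relative frequency, cumulative relative frequency) rows."""
    counts = Counter(values)
    total = len(values)
    classes = sorted(counts)
    # Proportion of observations in each class.
    relative = [counts[c] / total for c in classes]
    # Running total: proportion of observations <= each class.
    cumulative = list(accumulate(relative))
    return list(zip(classes, relative, cumulative))


rows = cumulative_relative_frequency(purchases)
for cls, rel, cum in rows:
    print(f"{cls}: relative={rel:.2f}, cumulative={cum:.2f}")
```

The last cumulative value is always 1 (up to floating-point rounding), because every observation is less than or equal to the highest class.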

Dependent Variable

the variable in the scatter plot that is influenced by changes in the other variable; it is placed on the vertical axis.

Discrete Data

values based on observations that can be counted; they are restricted to integer values (whole numbers).

Frequency Distribution

a table that shows the number of data observations that fall into specific intervals. (pg. 27, 33)

Histogram

a bar graph showing the number of observations in each class as the height of each bar. (pg. 30)

Horizontal Bar Chart

a bar chart that displays the bars in a horizontal direction.

Independent Variable

the variable in the scatter plot that is not affected by changes in the other variable; it is placed on the horizontal axis.

Line Chart

a special type of scatter plot in which the data points in the scatter plot are connected with a line.

Pareto Chart

a bar chart that shows in a decreasing order the frequency of the categories that cause quality control problems.

Percentage Polygon

a graph that plots the midpoint of each class and connects the points with a line rather than drawing columns. The height at each midpoint represents the relative frequency of the corresponding class.

Pie Chart

a chart that displays categories in the shape of a pie; each segment of the pie represents the relative frequency of that category.

Relative Frequency Distribution

a frequency distribution that displays the proportion of observations of each class relative to the total number of observations.

Scatter Plot

a chart that provides a picture of the relationship between two quantitative variables that are paired together on a graph with a horizontal and vertical axis. (pg. 62)

Stacked Bar Chart

a bar chart that groups several values in a single column within the same category in a vertical direction.

Stem and Leaf Display

a chart that splits data values into stems (the larger place values) and leaves (the smaller place values).

Symmetrical Distribution

the shape of a distribution when its two halves mirror one another.

Time Series

a data set where each data point is associated with a specific point in time.

Box-and-Whisker Plot

a graphical display showing the relative position of a distribution's three quartiles as a box on a number line, along with the minimum and maximum values.

Categorical Data

data that are described by a category or label.

Chebyshev's Theorem

a theorem that states that regardless of whether a distribution is bell-shaped, at least certain percentages fall within a certain number of standard deviations of the mean.

Theorem: at least 94% of data values

will fall within +/- four standard deviations of the mean.

Theorem: at least 89% of data values

will fall within +/- three standard deviations of the mean.

Theorem: at least 75% of data values

will fall within +/- two standard deviations of the mean.
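
The three percentages on these cards come from the standard Chebyshev bound, 1 - 1/k^2, where k is the number of standard deviations (k > 1). The sketch below evaluates that bound; the function name is an assumption made for illustration. The cards round the results to whole percentages.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound is zero or negative, so it says nothing useful.
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 4):
    print(f"k={k}: at least {chebyshev_lower_bound(k):.1%} of values")
```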

Coefficient of Variation

a measure of the standard deviation in terms of its percentage of the mean.
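
A minimal sketch of the calculation, assuming the sample standard deviation is the one intended: divide the standard deviation by the mean and multiply by 100. The `prices` values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    # statistics.stdev uses the sample (n - 1) formula; this is an assumption here.
    return statistics.stdev(values) / mean * 100


prices = [10, 12, 14]  # hypothetical data: mean 12, sample std dev 2
cv = coefficient_of_variation(prices)
print(f"CV = {cv:.1f}%")
```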

Empirical Rule

a rule which states that if a distribution follows a bell-shaped, symmetrical curve centered around the mean, approximately 68%, 95%, and 99.7% of the values fall within one, two, and three standard deviations around the mean respectively.
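
To make the rule concrete, the sketch below turns a mean and standard deviation into the three intervals the rule describes. The function name and the example values (mean 100, standard deviation 15) are hypothetical, and the percentages apply only if the data are roughly bell-shaped.

```python
def empirical_rule_intervals(mean, std_dev):
    """Map k (1, 2, 3) to (lower bound, upper bound, approximate share of values)."""
    coverage = {1: 0.68, 2: 0.95, 3: 0.997}
    return {
        k: (mean - k * std_dev, mean + k * std_dev, share)
        for k, share in coverage.items()
    }


intervals = empirical_rule_intervals(100, 15)
for k, (low, high, share) in intervals.items():
    print(f"within {k} std dev: {low} to {high} (about {share:.1%})")
```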

Five-Number Summary

a list that consists of a distribution's minimum value; first, second, and third quartiles; and maximum value.
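
One possible way to compute the summary is sketched below with Python's `statistics.quantiles`. Quartile conventions differ between textbooks and software, so the `"inclusive"` method used here is an assumption and may not match the values obtained with a textbook's own procedure. The data are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) for a list of numbers."""
    # method="inclusive" treats the data as containing the min and max;
    # other conventions can give slightly different quartiles.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```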

Index Point (i)

marks the middle of the data values and is used to determine the position of the median in the data set.

Interquartile Range (IQR)

the difference between the third and first quartiles (the third quartile minus the first quartile). It describes the spread of the middle 50% of the data.

Left-skewed Distribution

the shape of the distribution when the median is higher than the mean.

Mean

a measure of central tendency that is calculated by adding up all of the values in a data set and then dividing the result by the number of observations.

Measures of Central Tendency

measures that use a single value to describe the center point of a data set.

Measures of Relative Position

measures that compare the position of one value in relation to other values in a data set.

Measures of Variability

measures that determine how much of a spread there is within a data set.

Median

the value in a data set for which half the observations are higher and half the observations are lower.

Midpoint

the halfway point of a class in a frequency distribution. It can be found by taking the average of the two endpoints of the class.

Mode

the value that appears most often in a data set.
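
The mean, median, and mode can be compared side by side with Python's standard `statistics` module, as in the short sketch below. The `scores` data are hypothetical.

```python
import statistics

scores = [70, 80, 80, 90, 100]  # hypothetical data

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

print(mean_score, median_score, mode_score)
```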

Outliers

values that are much higher or lower than most of the data.

Percentiles

measure the approximate percentage of values in the data set that are below the value of interest.

Percentile Rank

identifies the percentile of a particular value within a set of data.

Pth Percentile

the approximate percentage of values in the data set that are below the value of interest (where p is any number between 1 and 100).

Quartiles

the first, second and third quartiles are the 25th, 50th (median), and 75th percentiles, respectively.

Range

a measure of variability that is found by subtracting the lowest value from the highest value in a data set.

Right-Skewed Distribution

the shape of a distribution when its mean is higher than its median.

Sample Correlation Coefficient (rxy)

measures both the strength and direction of the linear relationship between two variables.

Sample Covariance (sxy)

measures the direction of the linear relationship between two variables.
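
A minimal sketch of both measures, assuming the usual sample formulas: the covariance divides the summed cross-products of deviations by n - 1, and the correlation coefficient divides the covariance by the product of the two sample standard deviations. The function names and data are hypothetical.

```python
import math


def sample_covariance(xs, ys):
    """Sum of cross-products of deviations from the means, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs, ys):
    """Covariance scaled by both standard deviations; lies between -1 and +1."""
    std_x = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself = variance
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]  # rises exactly with x
s_xy = sample_covariance(x, y)
r_xy = sample_correlation(x, y)
print(s_xy, r_xy)
```

The sign of the covariance gives only the direction of the relationship, while the correlation coefficient is unit-free, so its magnitude also indicates strength.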

Standard Deviation

the square root of a distribution's variance.

Variance

a measure of variability based on the squared distances between the data points in a set and the mean of the data set.
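
The card does not say whether the sample or the population formula is meant, so the sketch below shows both using Python's `statistics` module: the sample version divides the summed squared deviations by n - 1, the population version by n. The data are hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data; mean is 5

# Squared deviations from the mean sum to 32.
sample_variance = statistics.variance(data)       # 32 / (n - 1) = 32 / 7
population_variance = statistics.pvariance(data)  # 32 / n = 32 / 8

# The standard deviation is the square root of the matching variance.
sample_std_dev = statistics.stdev(data)
population_std_dev = statistics.pstdev(data)

print(sample_variance, population_variance)
print(sample_std_dev, population_std_dev)
```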

Weighted Mean

a mean that allows you to assign more weight to certain values and less weight to others.
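
As a small illustration, the sketch below multiplies each value by its weight, sums the products, and divides by the total weight. The function name and the grade example are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical example: a homework score of 80 (weight 1) and an exam score of 90 (weight 3).
course_grade = weighted_mean([80, 90], [1, 3])
print(course_grade)
```

Because the exam carries three times the weight, the result sits closer to 90 than the simple average of 85 would.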

Z-Score

a measure that identifies the number of standard deviations a particular value is from the mean of its distribution.

Z-Score Rules

z-scores have no units; a z-score is zero for a value equal to the mean, positive for values above the mean, and negative for values below the mean; data values with z-scores above +3 or below -3 are considered outliers.
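
The z-score and the outlier rule can be sketched as follows: subtract the mean from the value, divide by the standard deviation, and flag the value when the result is above +3 or below -3. The function names and example numbers are hypothetical.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


def is_outlier(value, mean, std_dev, threshold=3):
    """Flag values whose z-score is above +threshold or below -threshold."""
    return abs(z_score(value, mean, std_dev)) > threshold


# Hypothetical distribution with mean 70 and standard deviation 5.
print(z_score(70, 70, 5))     # value equal to the mean
print(z_score(60, 70, 5))     # value below the mean gives a negative z-score
print(is_outlier(90, 70, 5))  # four standard deviations above the mean
```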