Business Statistics key terms

A quick review of definitions key to understanding statistics (chapters 1-4). Book used: Essentials of Statistics for Business and Economics, 6th edition


data
facts and figures collected, analyzed, and summarized for presentation and interpretation
data set
all the data collected in a particular study
elements
entities on which data are collected
variable
characteristic of interest for the elements
observation
the set of measurements obtained for a particular element
nominal scale
data for a variable consists of labels or names used to identify an attribute of the element.
ordinal scale
data have the properties of nominal data, and the order or rank of the data is meaningful.
interval scale
data have all the properties of ordinal data, and the interval between values is expressed in terms of a fixed unit of measure.
always numerical
ratio scale
data have all the properties of interval data, and the ratio of two values is meaningful.
scale must contain a zero value, indicating that nothing exists for the variable at the zero point.
categorical data
data that can be grouped by specific categories.
uses either nominal or ordinal scale of measure.
quantitative data
data that use numerical values to indicate how much or how many.
measured on either the interval or ratio scale.
cross-sectional data
data collected at the same time or approximately the same time.
time series data
data collected over several time periods (longitudinal).
descriptive statistics
summaries of data, which may be tabular, graphical or numerical.
common ways to present these summaries are bar charts and histograms.
population
the set of all elements of interest in a study
sample
a subset of the population
census
process of conducting a survey to collect data for the entire population
sample survey
process of conducting a survey to collect data for a sample
statistical inference
using data from a sample to make estimates and test hypotheses about the characteristics of a population
data mining
deals with methods for developing useful decision-making information from large databases.
frequency distribution
a tabular summary of data showing the number (frequency) of items in each of several non-overlapping classes
bar chart
graphical device for depicting categorical data summarized in a frequency, relative frequency, or percent frequency distribution.
each class is separate.
pie chart
graphical device for presenting relative frequency and percent frequency distribution for categorical data.
histogram
a common graphical presentation of quantitative data.
variable of interest on x-axis and frequency or relative frequency on the y-axis.
ogive
a graph of a cumulative distribution that shows data values on the x-axis and either cumulative frequency, cumulative relative frequency, or cumulative percent frequency on the y-axis.
stem-and-leaf display
can be used to show both the rank and shape of data simultaneously.
crosstabulation
a tabular summary of data for two variables
scatter diagram
a graphical presentation of the relationship between two quantitative variables
sample statistics
measure computed for data from a sample
population parameters
measures computed for data from population
point estimator
sample statistics, such as mean, variance, and standard deviation, when used to estimate the corresponding population parameter.
mean
average value
median
value in the middle of the data arranged in ascending order
mode
value that occurs with the greatest frequency in a set of data
percentile
provides information about how the data are spread over the interval from the smallest to the largest value
interquartile range
measure of variability that overcomes the dependency on extreme values
variance
the measure of variability that utilizes all the data
s^2=SIGMA((xi-xbar)^2)/(n-1) (SIGMA=summation) (xi=data value) (xbar=sample mean) (n=sample size)
standard deviation
positive square root of the variance
s=sqrt(SIGMA((xi-xbar)^2)/(n-1)) (sqrt=square root) (SIGMA=summation) (xi=data value) (xbar=sample mean) (n=sample size)
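
As a quick check on the sample variance and standard deviation, here is a minimal Python sketch. The five data values are illustrative only. Note that the deviations from the mean are squared before they are summed, and the divisor is n-1.

```python
import math
import statistics

# Illustrative sample (hypothetical values)
data = [46, 54, 42, 46, 32]

n = len(data)
xbar = sum(data) / n  # sample mean

# Sample variance: sum of squared deviations divided by (n - 1)
sample_variance = sum((x - xbar) ** 2 for x in data) / (n - 1)

# Sample standard deviation: positive square root of the variance
sample_sd = math.sqrt(sample_variance)

print(sample_variance, sample_sd)  # 64.0 8.0

# The standard library agrees with the hand computation
assert math.isclose(statistics.variance(data), sample_variance)
```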
coefficient of variation
descriptive statistic that indicates how large the standard deviation is relative to the mean
((sd/xbar)*100)% (sd=standard deviation) (xbar=mean)
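
A short sketch of the coefficient of variation in Python, using made-up data. It expresses the standard deviation as a percentage of the mean.

```python
import statistics

# Illustrative sample (hypothetical values)
data = [46, 54, 42, 46, 32]

xbar = statistics.mean(data)   # sample mean
sd = statistics.stdev(data)    # sample standard deviation (n - 1 divisor)

# Coefficient of variation, in percent
cv_percent = (sd / xbar) * 100
print(round(cv_percent, 1))  # 18.2
```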
skewness
an important numerical measure of the shape of a distribution
(n/((n-1)(n-2)))*SIGMA(((xi-xbar)/sd)^3) (n=sample size) (SIGMA=summation) (xi=data value) (xbar=mean) (sd=standard deviation)
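
The shape formula above can be sketched in Python as follows. The data are hypothetical. Each standardized deviation is cubed before summing, and the factor n/((n-1)(n-2)) multiplies the whole sum.

```python
import statistics

# Illustrative sample (hypothetical values)
data = [46, 54, 42, 46, 32]

n = len(data)
xbar = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation

# Sum of cubed standardized deviations, scaled by n / ((n - 1)(n - 2))
skewness = n / ((n - 1) * (n - 2)) * sum(((x - xbar) / sd) ** 3 for x in data)

# A negative result indicates a longer tail to the left
print(skewness)
```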
z-score
used to measure the relative location of a particular value, expressed as the number of standard deviations it lies from the mean
(xi-xbar)/sd (xi=data value) (xbar=mean) (sd=standard deviation)
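
A minimal Python sketch of this standardization, assuming a small made-up sample. Each result says how many standard deviations a value lies above (positive) or below (negative) the mean.

```python
import statistics

# Illustrative sample (hypothetical values)
data = [46, 54, 42, 46, 32]

xbar = statistics.mean(data)
sd = statistics.stdev(data)

# Standardized value for every observation
z_scores = [(x - xbar) / sd for x in data]
print(z_scores)  # [0.25, 1.25, -0.25, 0.25, -1.5]
```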
Chebyshev's Theorem
theorem enables us to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean.
at least (1-(1/z^2)) of the data values lie within z standard deviations of the mean, for any z greater than 1 (z=number of standard deviations)
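
To see what the theorem guarantees, here is a small Python sketch. The function name `chebyshev_lower_bound` is made up for illustration. The bound is only informative for z greater than 1, so the function rejects other values.

```python
def chebyshev_lower_bound(z: float) -> float:
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2


for z in (2, 3, 4):
    # e.g. at least 75% of values lie within 2 standard deviations
    print(z, chebyshev_lower_bound(z))
```

The bound holds for any data set, regardless of the shape of its distribution.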
empirical rule
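used to determine the approximate percentage of data values that lie within a specified number of standard deviations of the mean: about 68% within 1, about 95% within 2, and almost all (about 99.7%) within 3.
applies only when the data approximate a bell-shaped distribution.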
box plot
graphical summary of data that is based on a five-number summary.
1. smallest value
2. first quartile (Q1)
3. median (Q2)
4. third quartile (Q3)
5. largest value
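
The five-number summary can be sketched in Python as below. The data are hypothetical. Quartile conventions differ between textbooks and software; this sketch relies on the default ("exclusive") method of `statistics.quantiles`, which may not match the textbook's rule on every data set.

```python
import statistics

# Illustrative sample (hypothetical values)
data = sorted([12, 15, 17, 20, 24, 30, 41])

# Cut points that split the data into four equal parts (Python 3.8+)
q1, q2, q3 = statistics.quantiles(data, n=4)

five_number_summary = (min(data), q1, q2, q3, max(data))
iqr = q3 - q1  # interquartile range

print(five_number_summary)  # (12, 15.0, 20.0, 30.0, 41)
print(iqr)                  # 15.0
```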
covariance
a measure of linear association between two variables.
positive values indicate a positive relationship.
negative values indicate a negative relationship.
Sxy=SIGMA((xi-xbar)(yi-ybar))/(n-1) (SIGMA=summation) (xi=data value of x) (xbar=mean of x) (yi=data value of y) (ybar=mean of y) (n=sample size)
correlation coefficient
measurement of the relationship between two variables that is not affected by the units of measurement for x & y.
Rxy=Sxy/(Sx*Sy) (Sxy=covariance) (Sx=standard deviation of x) (Sy=standard deviation of y)
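
A minimal Python sketch of the sample covariance and the correlation coefficient, using made-up paired data. Dividing the covariance by both standard deviations removes the units, so the result always falls between -1 and +1.

```python
import statistics

# Illustrative paired observations (hypothetical values)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
xbar = statistics.mean(x)
ybar = statistics.mean(y)

# Sample covariance: sum of cross-products of deviations over (n - 1)
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)

# Correlation coefficient: covariance scaled by both standard deviations
s_x = statistics.stdev(x)
s_y = statistics.stdev(y)
r_xy = s_xy / (s_x * s_y)

print(s_xy)            # 1.5
print(round(r_xy, 4))  # 0.7746
```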
weighted mean
mean computed by giving each observation a weight that reflects its importance.
xbar=SIGMA(wi*xi)/SIGMA(wi) (SIGMA=summation) (wi=weight for observation i) (xi=value of observation i)
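
As an illustration of the weighted mean, the sketch below computes a grade point average in which each grade is weighted by its credit hours. The grades and credits are hypothetical.

```python
# Illustrative data (hypothetical values)
grade_points = [4, 3, 2]   # xi: value of each observation
credit_hours = [3, 3, 4]   # wi: weight of each observation

# Weighted mean: sum of weight * value, divided by the sum of the weights
weighted_mean = (
    sum(w * x for w, x in zip(credit_hours, grade_points)) / sum(credit_hours)
)
print(weighted_mean)  # 2.9
```

With equal weights, the formula reduces to the ordinary mean.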
grouped data
data available in class intervals as summarized by a frequency distribution.
probability
a numerical measure of the likelihood that an event will occur
must be between 0 and 1, inclusive.
experiment
any process that generates well-defined outcomes
sample space
set of all experimental outcomes
multiple-step experiment
if an experiment can be described as a sequence of k steps with n1 possible outcomes on the first step, n2 possible outcomes on the second step, and so on, then the total number of experimental outcomes is given by (n1)(n2)(n3)...(nk)
tree diagram
a graphical representation that helps in visualizing a multiple-step experiment
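
The counting rule for multiple-step experiments can be checked by listing every outcome, much as a tree diagram does. The sketch below assumes a made-up two-step experiment: toss a coin, then roll a die.

```python
from itertools import product

# Hypothetical two-step experiment
coin = ["H", "T"]          # n1 = 2 outcomes on the first step
die = [1, 2, 3, 4, 5, 6]   # n2 = 6 outcomes on the second step

# Every path through the tree diagram is one experimental outcome
sample_space = list(product(coin, die))

# The counting rule predicts (n1)(n2) outcomes
assert len(sample_space) == len(coin) * len(die)
print(len(sample_space))  # 12
```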
combinations
allows one to count the number of experimental outcomes when the experiment involves selecting n objects from a set of N objects and the order of selection does not matter.
N!/(n!(N-n)!) = C (N=total number of objects) (n=number of objects selected) (!=factorial)
permutations
allows one to compute the number of experimental outcomes when n objects are to be selected from a set of N objects where the order of selection is important.
N!/(N-n)! (N=total number of objects) (n=number of objects selected) (!=factorial)
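
Both counting rules can be sketched in Python, either directly from the factorial formulas or with `math.comb` and `math.perm` (available from Python 3.8). The values N = 5 and n = 2 are chosen only for illustration.

```python
import math

N = 5  # total number of objects (illustrative)
n = 2  # number of objects selected (illustrative)

# Order does not matter: N! / (n! (N - n)!)
combinations = math.factorial(N) // (math.factorial(n) * math.factorial(N - n))

# Order matters: N! / (N - n)!
permutations = math.factorial(N) // math.factorial(N - n)

# The built-in helpers give the same counts
assert combinations == math.comb(N, n)
assert permutations == math.perm(N, n)

print(combinations, permutations)  # 10 20
```

Each combination of n objects can be ordered in n! ways, which is why the permutation count is n! times the combination count.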
classical method
method used when all experimental outcomes are equally likely.
if n outcomes are possible, 1/n probability is assigned to each experimental outcome.
relative frequency method
method of assigning probabilities when data are available to estimate the proportion of the time the experimental outcome will occur if the experiment is repeated a large number of times.

think of frequency
subjective method
method of assigning probabilities most appropriate when one cannot realistically assume that the experimental outcomes are equally likely and when little relevant data are available.

think of different people assigning different probabilities to the same experimental outcomes.
event
a collection of sample points
the complement of A
defined to be the event consisting of all the sample points that are NOT in A.
denoted A^c
union of A & B
the event containing all sample points belonging to A or B or both.
denoted A U B.
P(A U B)=P(A)+P(B)-P(AintersectB)
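
The addition law can be verified on a small sample space. The sketch below assumes one roll of a fair die, with A = "even number" and B = "greater than 4"; both events are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # even number
B = {5, 6}                         # greater than 4


def prob(event):
    """Classical method: equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


p_union = prob(A | B)                           # computed directly from the union
p_addition = prob(A) + prob(B) - prob(A & B)    # computed with the addition law

assert p_union == p_addition
print(p_union)  # 2/3
```

Subtracting the intersection avoids counting the shared outcome (6) twice.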
intersection of A & B
event containing only the sample points that A & B share.
denoted A intersect B.
mutually exclusive
events that have no sample points in common, so when one event occurs, the other cannot.
P(A intersect B)=0
conditional probability
the probability of an event given that another event already occurred.
P(A|B)=P(A intersect B)/P(B)
joint probabilities
the probability of the intersection between two events.
marginal probabilities
the probabilities located in the margins of the joint probability table (JPT); each gives the probability of a single event on its own.
independent events
two events that have no influence on each other.
P(A|B)=P(A) or P(B|A)=P(B)
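
The sketch below ties together joint, marginal, and conditional probabilities and then tests for independence. The counts (promotion status by gender for 1200 people) are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

# Hypothetical counts for a two-way table
counts = {
    ("men", "promoted"): 288,
    ("men", "not promoted"): 672,
    ("women", "promoted"): 36,
    ("women", "not promoted"): 204,
}
total = sum(counts.values())

# Joint probabilities: one for each cell of the table
joint = {cell: Fraction(c, total) for cell, c in counts.items()}

# Marginal probabilities: row and column sums of the joint table
p_men = sum(p for (gender, _), p in joint.items() if gender == "men")
p_promoted = sum(p for (_, status), p in joint.items() if status == "promoted")

# Conditional probability: P(promoted | men) = P(men and promoted) / P(men)
p_promoted_given_men = joint[("men", "promoted")] / p_men

# Independence would require P(promoted | men) == P(promoted)
independent = p_promoted_given_men == p_promoted

print(p_promoted_given_men, p_promoted, independent)  # 3/10 27/100 False
```

Here the conditional probability differs from the marginal one, so in this made-up table the two events are not independent.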