quick review of some key definitions for understanding statistics (chapters 1-4)
book used: Essentials of Statistics for Business and Economics, 6th edition
facts and figures collected, analyzed, and summarized for presentation and interpretation
all the data collected in a particular study
entities on which data are collected
characteristic of interest for the elements
the set of measurements obtained for a particular element
data for a variable consist of labels or names used to identify an attribute of the element.
data exhibit the properties of nominal data, and the order or rank of the data is meaningful.
data have all the properties of ordinal data, and the interval between values is expressed in terms of a fixed unit of measure. always numerical.
data have all the properties of interval data, and the ratio of two values is meaningful. the scale must contain a zero value that indicates none of the quantity is present.
data that can be grouped by specific categories. use either the nominal or ordinal scale of measurement.
data that use numerical values to indicate how much or how many. use either the interval or ratio scale of measurement.
data collected at the same time or approximately the same time.
time series data
data collected over several time periods (longitudinal).
summaries of data, which may be tabular, graphical, or numerical. common ways to present them are bar charts and histograms.
the set of all elements of interest in a study
a subset of the population
process of conducting a survey to collect data for the entire population
process of conducting a survey to collect data for a sample
using data from a sample to make estimates and test hypotheses about the characteristics of a population
deals with methods for developing useful decision-making information from large databases.
a tabular summary of data showing the number (frequency) of items in each of several non-overlapping classes
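A minimal sketch of a frequency distribution, with relative and percent frequencies alongside the counts. The purchase data and the function name `frequency_table` are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data: soft drink chosen by 10 customers.
purchases = ["cola", "diet", "cola", "lemon", "cola",
             "diet", "cola", "lemon", "diet", "cola"]

def frequency_table(items):
    """Return {class: (frequency, relative frequency, percent frequency)}."""
    counts = Counter(items)  # number of items in each non-overlapping class
    n = len(items)
    return {label: (count, count / n, 100 * count / n)
            for label, count in counts.items()}

table = frequency_table(purchases)
for label, (freq, rel, pct) in sorted(table.items()):
    print(f"{label:<6} {freq:>3} {rel:>6.2f} {pct:>6.1f}%")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.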
graphical device for depicting categorical data summarized in a frequency, relative frequency, or percent frequency distribution. the bars are separated to show that each class is separate.
graphical device for presenting relative frequency and percent frequency distribution for categorical data.
a common graphical presentation of quantitative data. variable of interest on x-axis and frequency or relative frequency on the y-axis.
a graph of a cumulative distribution. shows data values on the x-axis and either cumulative frequency, cumulative relative frequency, or cumulative percent frequency on the y-axis.
can be used to show both the rank order and the shape of the data simultaneously.
a tabular summary of data for two variables
a graphical presentation of the relationship between two quantitative variables
measure computed for data from a sample
measures computed for data from a population
sample statistics, such as mean, variance, and standard deviation, when used to estimate the corresponding population parameter.
value in the middle of the data arranged in ascending order
value that occurs with the greatest frequency in a set of data
provides information about how the data are spread over the interval from the smallest to the largest value
measure of variability that overcomes the dependency on extreme values. IQR=Q3-Q1 (Q3=third quartile) (Q1=first quartile)
the measure of variability that utilizes all the data. s^2=SIGMA((xi-xbar)^2)/(n-1) (SIGMA=summation) (xi=data value) (xbar=sample mean) (n=sample size)
positive square root of the variance. s=sqrt(SIGMA((xi-xbar)^2)/(n-1)) (sqrt=square root) (SIGMA=summation) (xi=data value) (xbar=sample mean) (n=sample size)
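A short sketch of the sample variance and sample standard deviation. The deviations from the mean are squared before summing, as in the standard definition, and the sum is divided by n-1. The data values and function names are made up for illustration.

```python
import math

def sample_variance(values):
    """s^2 = SIGMA((xi - xbar)^2) / (n - 1)"""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

def sample_std_dev(values):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample; its mean is 5
print(sample_variance(data), sample_std_dev(data))
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which use the same n-1 divisor.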
coefficient of variation
descriptive statistic that indicates how large the standard deviation is relative to the mean ((sd/xbar)*100)% (sd=standard deviation) (xbar=mean)
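As a rough illustration, the coefficient of variation can be computed from the sample standard deviation and the sample mean. The data and the function name are assumptions for the example.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    xbar = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return sd / xbar * 100

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
print(f"{coefficient_of_variation(data):.1f}%")
```

Because it is a ratio, the coefficient of variation lets you compare variability between data sets with different means or units.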
an important numerical measure of the shape of a distribution. skewness=(n/((n-1)(n-2)))*SIGMA(((xi-xbar)/sd)^3) (n=sample size) (SIGMA=summation) (xi=data value) (xbar=mean) (sd=standard deviation)
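A minimal sketch of the skewness formula: each standardized deviation is cubed, the cubes are summed, and the sum is scaled by n/((n-1)(n-2)). The function name and test data are made up, and the formula needs at least three data values.

```python
import statistics

def sample_skewness(values):
    """n / ((n-1)(n-2)) * SIGMA(((xi - xbar) / sd)^3); needs n >= 3."""
    n = len(values)
    xbar = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((x - xbar) / sd) ** 3 for x in values)

symmetric = [1, 2, 3, 4, 5]      # symmetric data: skewness of zero
right_tail = [1, 2, 3, 4, 10]    # long right tail: positive skewness
left_tail = [1, 7, 8, 9, 10]     # long left tail: negative skewness
print(sample_skewness(symmetric), sample_skewness(right_tail), sample_skewness(left_tail))
```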
used to measure the relative location of a particular value: the number of standard deviations it lies from the mean. z=(xi-xbar)/sd (xi=data value) (xbar=mean) (sd=standard deviation)
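A small sketch of the z-score calculation, using the sample mean and sample standard deviation. The data and the function name `z_score` are illustrative assumptions.

```python
import statistics

def z_score(x, values):
    """Number of standard deviations that x lies from the mean of values."""
    xbar = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return (x - xbar) / sd

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample; its mean is 5
z_scores = [z_score(x, data) for x in data]
print([round(z, 2) for z in z_scores])
```

A value equal to the mean has a z-score of 0, values above the mean are positive, and values below it are negative.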
theorem that enables us to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean. at least (1-(1/z^2)) of the data values, for any z greater than 1 (z=number of standard deviations)
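The bound from Chebyshev's theorem is easy to tabulate. This sketch assumes z is greater than 1, since the formula gives no useful information otherwise. The function name is hypothetical.

```python
def chebyshev_min_proportion(z):
    """Minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

for z in (2, 3, 4):
    print(f"z={z}: at least {chebyshev_min_proportion(z):.1%} of the data")
```

The bound holds for any data set, whatever the shape of its distribution, which is why it is weaker than the figures for bell-shaped data.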
used to determine the percentage of data values that must be within a specified number of standard deviations of the mean. applies only when the data approximate a bell-shaped distribution: about 68% within 1 standard deviation, about 95% within 2, and almost all within 3.
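The empirical rule percentages come from the normal (bell-shaped) distribution. As a hedged check, the sketch below uses `statistics.NormalDist` from Python's standard library to compute the proportion of a normal distribution within z standard deviations of its mean. The function name is made up.

```python
from statistics import NormalDist

def normal_proportion_within(z):
    """Proportion of a normal distribution within z standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z) - std_normal.cdf(-z)

for z in (1, 2, 3):
    print(f"within {z} sd: {normal_proportion_within(z):.1%}")
```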
graphical summary of data that is based on a five-number summary. 1. smallest value 2. first quartile (Q1) 3. median (Q2) 4. third quartile (Q3) 5. largest value
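A sketch of the five-number summary behind a box plot. Quartile conventions differ between textbooks and software, so the quartiles here (from `statistics.quantiles` with its default method) may not match a hand calculation that uses the book's percentile procedure. The data and function name are made up.

```python
import statistics

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    # Default method is "exclusive"; other conventions give slightly different quartiles.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
smallest, q1, median, q3, largest = five_number_summary(data)
iqr = q3 - q1  # interquartile range: width of the box in the box plot
print(smallest, q1, median, q3, largest, iqr)
```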
a measure of linear association between two variables. positive values indicate a positive relationship. negative values indicate a negative relationship. Sxy=SIGMA((xi-xbar)(yi-ybar))/(n-1) (SIGMA=summation) (xi=data value of x) (xbar=mean of x) (yi=data value of y) (ybar=mean of y) (n=sample size)
measurement of the relationship between two variables that is not affected by the units of measurement for x & y. Rxy=Sxy/(Sx*Sy) (Sxy=covariance) (Sx=standard deviation of x) (Sy=standard deviation of y)
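A minimal sketch of the sample covariance and the correlation coefficient, written directly from the formulas. The paired data and the function names are illustrative assumptions.

```python
import math

def sample_covariance(xs, ys):
    """Sxy = SIGMA((xi - xbar)(yi - ybar)) / (n - 1)"""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """Rxy = Sxy / (Sx * Sy)"""
    sx = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself is its variance
    sy = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (sx * sy)

x = [1, 2, 3, 4, 5]  # made-up paired observations
y = [2, 4, 5, 4, 5]
print(sample_covariance(x, y), sample_correlation(x, y))
```

Rescaling x or y (for example, changing units) changes the covariance but leaves the correlation coefficient unchanged, and the coefficient always lies between -1 and +1.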
mean computed by giving each observation a weight that reflects its importance. xbar=SIGMA(wi*xi)/SIGMA(wi) (SIGMA=summation) (wi=weight for observation i) (xi=value of observation i)
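A short sketch of the weighted mean. The purchase data (cost per unit weighted by the number of units bought) and the function name are hypothetical.

```python
def weighted_mean(values, weights):
    """xbar = SIGMA(wi * xi) / SIGMA(wi)"""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical purchases: cost per unit, weighted by the number of units bought.
cost_per_unit = [3.00, 3.40, 2.80]
units = [1200, 500, 2750]
print(round(weighted_mean(cost_per_unit, units), 4))
```

With equal weights the weighted mean reduces to the ordinary mean.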
data available in class intervals as summarized by a frequency distribution.
a numerical measure of the likelihood that an event will occur. must be between 0 and 1.
any process that generates well-defined outcomes
set of all experimental outcomes
if an experiment can be described as a sequence of k steps with n1 possible outcomes on the first step, n2 possible outcomes on the second step, and so on, then the total number of experimental outcomes is given by (n1)(n2)(n3)...(nk)
a graphical representation that helps in visualizing a multiple-step experiment
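A sketch of the counting rule for multiple-step experiments, using a made-up two-step experiment (toss a coin, then roll a die). The enumerated outcomes correspond to the end points of the branches in a tree diagram.

```python
import math
from itertools import product

coin = ["H", "T"]          # step 1: n1 = 2 possible outcomes
die = [1, 2, 3, 4, 5, 6]   # step 2: n2 = 6 possible outcomes
steps = [coin, die]

# Counting rule: total outcomes = (n1)(n2)...(nk)
count = math.prod(len(step) for step in steps)

# Listing every experimental outcome, one per branch end in a tree diagram
outcomes = list(product(*steps))
print(count, outcomes[:3])
```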
allows one to count the number of experimental outcomes when the experiment involves selecting n objects from a set of N objects and the order of selection does not matter. C=N!/(n!(N-n)!) (N=total number of objects) (n=number of objects selected) (!=factorial)
allows one to compute the number of experimental outcomes when n objects are to be selected from a set of N objects and the order of selection is important. P=N!/(N-n)! (N=total number of objects) (n=number of objects selected) (!=factorial)
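A small sketch of the two counting rules written from the factorial formulas, checked against `math.comb` and `math.perm` from Python's standard library. The function names are made up.

```python
import math

def combinations_count(N, n):
    """N! / (n! (N-n)!): order of selection does not matter."""
    return math.factorial(N) // (math.factorial(n) * math.factorial(N - n))

def permutations_count(N, n):
    """N! / (N-n)!: order of selection matters."""
    return math.factorial(N) // math.factorial(N - n)

# Selecting 2 objects from 5
print(combinations_count(5, 2), math.comb(5, 2))
print(permutations_count(5, 2), math.perm(5, 2))
```

Every combination of n objects can be ordered in n! ways, so the permutation count is always n! times the combination count.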
method used when all experimental outcomes are equally likely. if n outcomes are possible, 1/n probability is assigned to each experimental outcome.
relative frequency method
method of assigning probabilities when data are available to estimate the proportion of the time the experimental outcome will occur if the experiment is repeated a large number of times.
think of frequency
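A sketch of the relative frequency method: each probability is the proportion of past observations in which the outcome occurred. The observed counts and the function name are made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical record: number of customers waiting, observed on 20 days.
observed = [0] * 2 + [1] * 5 + [2] * 6 + [3] * 4 + [4] * 3

def relative_frequency_probabilities(observations):
    """Assign each outcome the proportion of times it was observed."""
    counts = Counter(observations)
    n = len(observations)
    return {outcome: Fraction(count, n) for outcome, count in counts.items()}

probs = relative_frequency_probabilities(observed)
for outcome in sorted(probs):
    print(outcome, probs[outcome])
```

The assigned probabilities are each between 0 and 1 and sum to 1, as any valid assignment must.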
method of assigning probabilities most appropriate when one cannot realistically assume that the experimental outcomes are equally likely and when little relevant data are available.
think of different people assigning different probabilities to the same experimental outcomes.
a collection of sample points
the complement of A
defined to be the event consisting of all the sample points that are NOT in A. denoted A^c
union of A & B
the event containing all sample points belonging to A or B or both. denoted A U B. addition law: P(A U B)=P(A)+P(B)-P(A intersect B)
intersection of A & B
event containing only the sample points that A & B share. denoted A intersect B.
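Complement, union, and intersection map directly onto Python set operations. This sketch assumes one roll of a fair die with equally likely outcomes. The events A and B and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical method: every sample point is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {2, 4, 6}  # roll is even
B = {4, 5, 6}  # roll is greater than 3

union = A | B                    # sample points in A or B or both
intersection = A & B             # sample points in both A and B
complement_A = sample_space - A  # sample points NOT in A

# Addition law: P(A U B) = P(A) + P(B) - P(A intersect B)
addition_law = prob(A) + prob(B) - prob(intersection)
print(prob(union), addition_law, prob(complement_A))
```

Subtracting the intersection avoids counting the shared sample points (4 and 6 here) twice.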
events that have no sample points in common, so when one event occurs, the other cannot. P(A intersect B)=0
the probability of an event given that another event has already occurred. P(A|B)=P(A intersect B)/P(B), which requires P(B)>0
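A minimal sketch of conditional probability, again assuming one roll of a fair die. The events and function names are hypothetical.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical method: every sample point is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional_prob(a, b):
    """P(A|B) = P(A intersect B) / P(B); undefined when P(B) = 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be greater than 0")
    return prob(a & b) / prob(b)

A = {2, 4, 6}  # roll is even
B = {4, 5, 6}  # roll is greater than 3
print(prob(A), conditional_prob(A, B))
```

Knowing that the roll is greater than 3 changes the probability of an even roll, because two of the three remaining sample points are even.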
the probability of the intersection between two events.
values so named because they are located in the margins of the joint probability table (JPT).
two events that have no influence on each other. P(A|B)=P(A) or P(B|A)=P(B)
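One way to check independence is the multiplication form P(A intersect B)=P(A)P(B), which is equivalent to P(A|B)=P(A) whenever P(B)>0. The sketch assumes one roll of a fair die. The events and function names are made up.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def prob(event):
    """Classical method: every sample point is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

def are_independent(a, b):
    """True when P(A intersect B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

A = {2, 4, 6}      # roll is even
B = {4, 5, 6}      # roll is greater than 3
C = {1, 2, 3, 4}   # roll is at most 4

print(are_independent(A, B), are_independent(A, C))
```

Here A and C are independent because half of the sample points in C are even, the same proportion as in the whole sample space, while A and B are not.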