94 terms

sample statistics

A numerical value used as a summary measure for a sample (e.g., the sample mean, x̄, the sample variance, s^2, and the sample standard deviation, s).

population parameters

A numerical value used as a summary measure for a population (e.g., the population mean, μ, the population variance, σ^2, and the population standard deviation, σ). PLURAL

point estimator

A sample statistic, such as x̄, s^2, or s, when used to estimate the corresponding population parameter.

mean

A measure of central location computed by summing the data values and dividing by the number of observations. The average.

x

In statistical formulas, it is customary to denote the value of variable BLANK for the first observation by x1, the value of variable x for the second observation by x2, and so on.

x̄

Symbol for sample mean.

Σ

Symbol used to denote "summation of." Not to be confused with Δ (delta), the symbol that means "the change in" in Economics.

n

Symbol used to denote the number of observations in a sample.

x̄

(Σxi)/n =

μ

Symbol for population mean.

N

Symbol used to denote the number of observations in a population.

μ

(Σxi)/N =
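
The two mean formulas above use the same arithmetic; they differ only in whether the data are a sample (divide by n) or the whole population (divide by N). A minimal sketch, assuming the observations are held in a plain Python list (the function name and the data are made up for illustration):

```python
def mean(values):
    """Return (sum of x_i) / count for a non-empty sequence of numbers."""
    if len(values) == 0:
        raise ValueError("mean requires at least one observation")
    return sum(values) / len(values)


# Made-up data, used only for illustration.
observations = [46, 54, 42, 46, 32]
print(mean(observations))  # 44.0
```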

median

A measure of central location provided by the value in the middle when the data are arranged in ascending order.

mode

The value that occurs with the greatest frequency.

percentile

A value such that at least p percent of the observations are less than or equal to this value and at least (100 − p) percent of the observations are greater than or equal to this value.

median

The fiftieth percentile is the...

index

The position in the ordered data that corresponds to a specific percentile.

i

Symbol for index

p

Symbol for percentile

i

(p/100)n =
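
The index formula above only gives a position; a rule is still needed to turn it into a value. The sketch below assumes one common textbook convention, which these cards do not state: if i is not an integer, round it up and take the value at that position; if i is an integer, average the values at positions i and i + 1. The function name and the data are hypothetical.

```python
import math


def percentile(values, p):
    """Return the p-th percentile (0 < p < 100) using the index i = (p/100)n."""
    if not values or not 0 < p < 100:
        raise ValueError("need data and a percentile strictly between 0 and 100")
    data = sorted(values)          # percentiles are defined on ordered data
    n = len(data)
    i = p * n / 100                # the index from the card above
    if i != int(i):
        # Assumed convention: non-integer index is rounded up (1-based position).
        return data[math.ceil(i) - 1]
    # Assumed convention: integer index averages positions i and i + 1.
    position = int(i)
    return (data[position - 1] + data[position]) / 2


# Made-up data, used only for illustration.
data = [15, 20, 35, 40, 50]
print(percentile(data, 50))  # 35   (i = 2.5, rounded up to position 3)
print(percentile(data, 40))  # 27.5 (i = 2, average of positions 2 and 3)
```

Other software packages use different interpolation rules, so their percentiles may differ slightly from this convention.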

1st Quartile

The 25th percentile: at least 25% of the observations are less than or equal to this value. USE ENTIRE PHRASE, BUT WITH NUMBER KEYS

3rd Quartile

The 75th percentile: at least 75% of the observations are less than or equal to this value. USE ENTIRE PHRASE, BUT WITH NUMBER KEYS

Q2

Symbol for the median

Q1

Symbol for the 1st quartile

Q3

Symbol for the 3rd quartile

median

It is better to use the (median/mean) as a measure of central location when a data set contains extreme values.

range

A measure of variability, defined to be the largest value minus the smallest value.

interquartile range

A measure of variability, defined to be the difference between the third and first quartiles. WRITE ENTIRE PHRASE

IQR

Symbol for the interquartile range

IQR

Q3 - Q1 =

variance

A measure of variability based on the squared deviations of the data values about the mean.

population variance

The variance for an entire population. (This one's easy) WRITE ENTIRE PHRASE

σ^2

Symbol for population variance.

σ^2

[ Σ (xi - μ)^2 ] / N

sample variance

The variance for a specific sample (This one's easy). WRITE ENTIRE PHRASE

s^2

Symbol for sample variance

s^2

[ Σ (xi - x̄)^2 ] / (n - 1)
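
The population and sample variance formulas differ only in the divisor (N versus n - 1). A minimal sketch of both, with hypothetical function names and made-up data:

```python
def population_variance(values):
    """[ sum of (x_i - mu)^2 ] / N, treating the data as the whole population."""
    N = len(values)
    if N == 0:
        raise ValueError("need at least one observation")
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N


def sample_variance(values):
    """[ sum of (x_i - x_bar)^2 ] / (n - 1), treating the data as a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Made-up data: the mean is 44 and the squared deviations sum to 256.
data = [46, 54, 42, 46, 32]
print(population_variance(data))  # 51.2
print(sample_variance(data))      # 64.0
```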

standard deviation

A measure of variability computed by taking the positive square root of the variance.

s

Symbol for sample standard deviation

σ

Symbol for population standard deviation

s

|√(s^2)| = (SINCE YOU ARE USING SYMBOLS, THIS SHOULD BE STRAIGHTFORWARD)

σ

|√(σ^2)| = (SINCE YOU ARE USING SYMBOLS, THIS SHOULD BE STRAIGHTFORWARD)

coefficient of variation

A measure of relative variability computed by dividing the standard deviation by the mean and multiplying by 100.

coefficient of variation

(Standard Deviation / Mean) * 100, expressed as a percentage, is the...
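
As a rough illustration of the standard deviation and coefficient of variation cards above, the sketch below uses Python's standard `statistics` module, whose `stdev` function computes the sample standard deviation s. The function name `coefficient_of_variation` and the data are assumptions made for this example.

```python
import statistics


def coefficient_of_variation(values):
    """Return (s / x_bar) * 100, the sample standard deviation as a percent of the mean."""
    # statistics.stdev uses the n - 1 divisor, so it matches the sample formula.
    return statistics.stdev(values) / statistics.mean(values) * 100


# Made-up data: s = 8 and x_bar = 44.
data = [46, 54, 42, 46, 32]
print(round(coefficient_of_variation(data), 2))  # 18.18
```

For population data, `statistics.pstdev` could be swapped in to use the divisor N instead.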

skewness

A measure of the shape of a data distribution.

negative

Data skewed to the left results in (negative/positive/zero) skewness

positive

Data skewed to the right results in (negative/positive/zero) skewness

zero

A symmetric data distribution results in (negative/positive/zero) skewness

z-score

A value computed by dividing the deviation about the mean (xi − x̄) by the standard deviation s. A standardized value, it denotes the number of standard deviations xi is from the mean.

zi

Symbol for z-score of xi

zi

(xi-x̄)/s =
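
A short sketch of the z-score formula applied to every observation in a sample. The function name and data are made up; the sample standard deviation comes from the standard `statistics` module.

```python
import statistics


def z_scores(values):
    """Return (x_i - x_bar) / s for each observation x_i."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)   # sample standard deviation (n - 1 divisor)
    # If every value is identical, s is 0 and the z-score is undefined.
    return [(x - x_bar) / s for x in values]


# Made-up data: x_bar = 44 and s = 8.
print(z_scores([46, 54, 42, 46, 32]))  # [0.25, 1.25, -0.25, 0.25, -1.5]
```

A z-score of 1.25 means the observation lies 1.25 standard deviations above the mean; a negative z-score means it lies below the mean.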

standardized value

The z-score is often called the...

z-score

The BLANK is often called the standardized value

Chebyshev's Theorem

A theorem that can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean.

Chebyshev's Theorem

At least (1 − 1/z^2) of the data values must be within z standard deviations of the mean, where z is any value greater than 1.

75%

According to Chebyshev's Theorem, at least BLANK of the data values must be within two standard deviations of the mean.

89%

According to Chebyshev's Theorem, at least BLANK of the data values must be within three standard deviations of the mean.

94%

According to Chebyshev's Theorem, at least BLANK of the data values must be within four standard deviations of the mean.

true

True or false: Chebyshev's theorem requires z > 1; but z need not be an integer.
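
The percentages on the Chebyshev cards above follow directly from 1 − 1/z^2. A small sketch (the function name is hypothetical) that also shows a non-integer z:

```python
def chebyshev_lower_bound(z):
    """Return the minimum proportion of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2


for z in (2, 3, 4, 1.5):
    print(z, round(chebyshev_lower_bound(z) * 100, 2))
# 2 -> 75.0, 3 -> 88.89, 4 -> 93.75, 1.5 -> 55.56 (percent)
```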

Empirical Rule

A rule that can be used to compute the approximate percentage of data values that are within one, two, and three standard deviations of the mean for data that exhibit a bell-shaped distribution.

normal probability distribution

The empirical rule is based on (and only works for)...

68%

According to the Empirical Rule, approximately BLANK of the data values are within one standard deviation of the mean.

95%

According to the Empirical Rule, approximately BLANK of the data values are within two standard deviations of the mean.

almost all (about 99.7%)

According to the Empirical Rule, BLANK of the data values are within three standard deviations of the mean.
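
The Empirical Rule percentages are rounded values taken from the normal distribution. As a quick check, the sketch below computes them with `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is made up.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist(mu=0, sigma=1)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# 1 -> 68.3, 2 -> 95.4, 3 -> 99.7 (percent)
```

These are approximations for bell-shaped data, not guaranteed minimums, which is the difference from Chebyshev's Theorem.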

outlier

An unusually small or unusually large data value.

Chebyshev's Theorem

(Chebyshev's Theorem/Empirical Rule) is applicable for any data set and can be used to state the minimum proportion of data values that will be within a certain number of standard deviations of the mean.

Five-Number Summary

An exploratory data analysis technique that uses five numbers to summarize the data: smallest value, first quartile, median, third quartile, and largest value.

true

True or false: The five numbers in the five-number summary are the smallest value, Q1, Q2, Q3, and the largest value.

false

True or false: The five numbers in the five-number summary are the mean, median, mode, range, and standard deviation.
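
A sketch of the five-number summary and the IQR computed from it. The quartiles here assume the same textbook convention for the index i = (p/100)n (round a non-integer i up; average positions i and i + 1 when i is an integer), which these cards do not spell out. All names and data are hypothetical.

```python
import math


def _percentile(sorted_data, p):
    """Percentile from already-sorted data, using the assumed textbook convention."""
    i = p * len(sorted_data) / 100
    if i != int(i):
        return sorted_data[math.ceil(i) - 1]
    position = int(i)
    return (sorted_data[position - 1] + sorted_data[position]) / 2


def five_number_summary(values):
    """Return (smallest value, Q1, median, Q3, largest value)."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    data = sorted(values)
    return (data[0], _percentile(data, 25), _percentile(data, 50),
            _percentile(data, 75), data[-1])


# Made-up data, used only for illustration.
smallest, q1, q2, q3, largest = five_number_summary([9, 4, 12, 2, 7, 5, 8, 4])
print(smallest, q1, q2, q3, largest)  # 2 4.0 6.0 8.5 12
print(q3 - q1)                        # IQR = 4.5
```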

box plot

A graphical summary of data based on a five-number summary.

covariance

A measure of linear association between two variables.

true

True or false: In covariance, positive values indicate a positive relationship and negative values indicate a negative relationship.

sxy

Symbol for sample covariance

sxy

[Σ(xi-x̄)(yi-ȳ)]/(n-1) =

σxy

Symbol for population covariance

σxy

[Σ(xi-μx)(yi-μy)]/N =
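
Sample and population covariance share the same numerator; the sample version divides by n - 1 and the population version divides by N. A minimal sketch with hypothetical function names and made-up paired data:

```python
def _sum_of_cross_deviations(xs, ys):
    """Sum of (x_i - mean of x)(y_i - mean of y) over paired observations."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and the same length")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))


def sample_covariance(xs, ys):
    """s_xy: divide by n - 1 (needs at least two pairs)."""
    if len(xs) < 2:
        raise ValueError("sample covariance needs at least two pairs")
    return _sum_of_cross_deviations(xs, ys) / (len(xs) - 1)


def population_covariance(xs, ys):
    """sigma_xy: divide by N, the population size."""
    return _sum_of_cross_deviations(xs, ys) / len(xs)


# Made-up data: the cross deviations sum to 6.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(sample_covariance(xs, ys))      # 1.5
print(population_covariance(xs, ys))  # 1.2
```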

correlation coefficient

A measure of linear association between two variables that takes on values between −1 and +1.

positive

Correlation coefficient values near +1 indicate a strong (negative/positive) linear relationship

negative

Correlation coefficient values near -1 indicate a strong (negative/positive) linear relationship

zero

Correlation coefficient values near BLANK indicate little or no linear relationship

rxy

Symbol for sample correlation coefficient

rxy

sxy/(sx·sy) =

ρxy

Symbol for population correlation coefficient

ρxy

σxy/(σx·σy) =
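
The correlation coefficient rescales covariance by the two standard deviations, which is why it always falls between −1 and +1. A sketch of the sample version, rxy = sxy/(sx·sy), with a hypothetical function name and made-up data:

```python
import math


def sample_correlation(xs, ys):
    """Return r_xy = s_xy / (s_x * s_y) for paired sample data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two pairs of equal-length data")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # If either variable is constant, its standard deviation is 0 and r is undefined.
    return s_xy / (s_x * s_y)


# Made-up data: s_xy = 1.5, s_x^2 = 2.5, s_y^2 = 1.5.
r = sample_correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))  # 0.7746
```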

weighted mean

The mean obtained by assigning each observation a weight that reflects its importance.

weighted x̄

(Σwixi)/Σwi =

w

Symbol for weight
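
A brief sketch of the weighted mean formula (Σwixi)/Σwi. The function name, values, and weights are made up for illustration.

```python
def weighted_mean(values, weights):
    """Return (sum of w_i * x_i) / (sum of w_i)."""
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Made-up data: the middle value counts twice as much as the others.
print(weighted_mean([80, 90, 70], [1, 2, 1]))  # 82.5
```

When every weight is equal, the result reduces to the ordinary mean.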

grouped data

Data available in class intervals as summarized by a frequency distribution. Individual values of the original data are not available. Used when measuring frequency.

f

Symbol for frequency

grouped x̄

(ΣfiMi)/n =, where Mi is the midpoint of class i and fi is the frequency of class i
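
A sketch of the grouped-data mean, assuming (as is usual for this formula) that Mi is the midpoint of class i, fi is its frequency, and n is the sum of the frequencies. The function name and the class data are hypothetical.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate the sample mean from class midpoints M_i and frequencies f_i."""
    if len(midpoints) != len(frequencies) or len(midpoints) == 0:
        raise ValueError("midpoints and frequencies must be non-empty and the same length")
    n = sum(frequencies)           # total number of observations
    if n == 0:
        raise ValueError("frequencies must not sum to zero")
    return sum(f * m for f, m in zip(frequencies, midpoints)) / n


# Made-up classes 0-10, 10-20, 20-30 with midpoints 5, 15, 25.
print(grouped_mean([5, 15, 25], [2, 5, 3]))  # 16.0
```

Because each observation is replaced by its class midpoint, the result is an approximation of the mean of the original data.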

xi

The value of variable x for the ith observation

zi

The number of standard deviations xi is from the mean x̄. (JUST USE ABBREVIATION)

coefficient of variation

100(s/x̄)

coefficient of variation

The standard deviation divided by the mean times 100 is the...