Statistics for People Who (Think They) Hate Statistics Neil J. Salkind 3rd Edition Statistics For Managers Final UAB
In Excel, what is a collection of cells called?
range
If a student were to count the number and type of cars passing through an intersection in a two hour period, she would be collecting which type of random variable?
discrete
What is a mathematical operator used to perform a mathematical task called?
formula
When surveying patient satisfaction and you are using the choices "poor" to "excellent" you are using a
qualitative variable
Which of the following is an example of an Excel formula for taking the number 6 to the power of 2?
=6^2
The ____________ branch of statistics consists of methods that enable the administrator to develop generalizations, estimations, or predictions concerning the phenomenon of managerial interest.
inferential
Which of the following is a discrete quantitative variable?
the number of employees of an insurance company
Which of the following does NOT describe statistics?
A factor that exhibits variability, volatility or assumes different values
The political affiliation (republican, democrat, independent) of an individual is an example of which type of variable?
categorical
The universe or "totality of items or things" under consideration is called a
population
In the formula for computing the mean, what does the letter "X" represent?
individual scores
Which measure of central tendency is most appropriate for samples where you DO NOT have extreme values and your data is NOT categorical?
mean
Which of the following is most sensitive to extreme values?
mean
What is the mode of the following set of scores? 10, 15, 12, 18, 19, 16, 12
12
What is the mean of the following set of scores (round up)? 10, 15, 12, 18, 19, 16, 12
15
Which of the following is another way of representing the 75th percentile?
Q3
Which of the following is the correct function for calculating the midpoint of a set of scores?
MEDIAN(A1:A10)
Which measure of central tendency can be used for both numerical and categorical variables?
mode
Which of the following requires you to multiply a set of scores by the frequency of their occurrence, adding the total of these products and then dividing by the total number of scores?
Weighted mean
When using the AVERAGE function in Excel, which of the following are you calculating?
Mean
Which of the following measures of central tendency is known as the midpoint for a set of scores?
Median
What is the median of the following set of scores? 10, 15, 12, 18, 19, 16, 12
15
Variability is a measure of how much individual scores differ from the __________.
mean
What is the standard deviation of the following set of scores? 10, 15, 12, 18, 19, 16, 12
3.36
When calculating the standard deviation, what must be done in order to obtain an unbiased estimate of the population?
subtract 1 from n
In the formula for computing the variance, what does the letter "n" represent?
sample size
What is the range of the following set of scores? 10, 15, 12, 18, 19, 16, 12
9
What is obtained by squaring the standard deviation?
variance
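The answers above for the score set 10, 15, 12, 18, 19, 16, 12 can be checked with a short Python sketch. It assumes Python's standard `statistics` module, which is not part of the original cards (they use Excel). `stdev` and `variance` divide by n - 1, matching the unbiased sample estimate.

```python
import statistics

scores = [10, 15, 12, 18, 19, 16, 12]

mean = statistics.mean(scores)            # arithmetic mean (Excel: AVERAGE)
median = statistics.median(scores)        # midpoint of the sorted scores (Excel: MEDIAN)
mode = statistics.mode(scores)            # most frequent score
score_range = max(scores) - min(scores)   # highest score minus lowest score
sd = statistics.stdev(scores)             # sample standard deviation, divides by n - 1
variance = statistics.variance(scores)    # sample variance, the square of sd

print(round(mean, 2), median, mode, score_range, round(sd, 2), round(variance, 2))
# prints: 14.57 15 12 9 3.36 11.29
```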
Which of the following Excel functions will allow you to examine the symmetry of a set of scores?
SKEW(A1:A20)
Excel uses "bins" to create a frequency distribution. The bin value is
the maximum value (upper limit) for a given category
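As a rough illustration of bins as upper limits, the sketch below counts scores into categories. The bin values and the helper name `frequency` are hypothetical, and the extra last slot for values above the largest bin is an assumption about how overflow should be handled.

```python
from bisect import bisect_left

def frequency(values, bins):
    """Count values per bin; each bin value is the upper limit of its category.

    `bins` must be sorted in ascending order.
    """
    counts = [0] * (len(bins) + 1)   # last slot holds values above the largest bin
    for v in values:
        # index of the first bin whose upper limit is >= v
        counts[bisect_left(bins, v)] += 1
    return counts

scores = [10, 15, 12, 18, 19, 16, 12]
bins = [12, 15, 18, 21]              # hypothetical upper limits
counts = frequency(scores, bins)
print(counts)                        # [3, 1, 2, 1, 0]
```

Here 12 falls in the first category because a score equal to a bin value belongs to that bin, not the next one.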
What type of graph displays class intervals along an x-axis?
Histogram
When observations can only be included in one category (and not more than one), we refer to the categories as
mutually exclusive
The point halfway between the boundaries of each class interval in a grouped frequency distribution is called the __________.
class mark
What do you call a continuous line that represents the frequency of scores within a class interval?
Polygon
What is the specific term associated with how flat a distribution appears?
platykurtic
According to the PowerPoint slides, you should group your data into no fewer than __ groups and no more than __ groups.
3, 10
The width of each bar in a histogram corresponds to the
differences between the boundaries of the class.
If the correlation between variables is .70, what percent of the variance is shared variance?
49%
What test would you want to use to test a nondirectional research hypothesis?
Two-tailed test
Which of the following refers to the group to which you wish to generalize your results?
Population
Which of the following symbols or statements would be used in a nondirectional hypothesis?
not equal
When data points group together in a cluster from the lower left-hand side of the xy axis to the upper right-hand side, what is this?
Positive slope
Which of the following represent the Excel function to be used when computing correlation coefficients?
CORREL(A1:A10, B1:B10)
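For readers without Excel, the Pearson correlation can be sketched by hand in Python. The data and the function name `pearson_r` are made up for illustration. Squaring r gives the proportion of shared variance, which is how a correlation of .70 yields 49%.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # sum of cross-products and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # hypothetical data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
shared_variance = r ** 2     # proportion of variance the two variables share

print(round(r, 4), round(shared_variance, 2))   # 0.7746 0.6
```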
In its standardized form, the normal distribution
has a mean of 0 and a standard deviation of 1
Which of the following characteristics is associated with the "tails" of the normal curve
Asymptotic
Approximately what percent of scores fall between -1 and -2 standard deviations under the normal curve?
14%
In a distribution with a mean of 100 and a standard deviation of 15, what is the probability that a score will be 115 or higher?
16%
What is the z score for a raw score of 85 where the group mean is 75 and the standard deviation is 5?
2
For some positive value of Z, the probability that a standard normal variable is between 0 and Z is 0.3770. The value of Z is?
1.16
A large urban hospital delivers 600 babies per month on average with a standard deviation of 40. What is the percent probability that the hospital delivers less than 640 babies in a given month?
84%
Given that X is normally distributed variable with a mean of 50 and a standard deviation of 2, find the probability that x is between 47 and 54.
0.9104
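The normal-curve answers above can be reproduced with a small sketch. It assumes Python 3.8 or later, where `statistics.NormalDist` is available; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # standardized form: mean 0, SD 1

# z score for a raw score of 85 with mean 75 and SD 5
z = (85 - 75) / 5

# P(score >= 115) when mean = 100 and SD = 15
p_115_or_higher = 1 - NormalDist(100, 15).cdf(115)

# P(fewer than 640 births) when mean = 600 and SD = 40
p_under_640 = NormalDist(600, 40).cdf(640)

# P(47 < X < 54) when mean = 50 and SD = 2
x_dist = NormalDist(50, 2)
p_47_to_54 = x_dist.cdf(54) - x_dist.cdf(47)

# Z such that the area between 0 and Z is 0.3770
z_for_area = std_normal.inv_cdf(0.5 + 0.3770)

# Area between -2 and -1 standard deviations
p_minus2_to_minus1 = std_normal.cdf(-1) - std_normal.cdf(-2)

print(z, round(p_115_or_higher, 2), round(p_under_640, 2),
      round(p_47_to_54, 4), round(z_for_area, 2), round(p_minus2_to_minus1, 2))
# prints: 2.0 0.16 0.84 0.9104 1.16 0.14
```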
The central limit theorem states that if the _______ and _______ are large, the theoretical sampling distribution may be approximated by the standard normal curve.
sample size and degrees of freedom
We use the following descriptive statistics to determine the curve of a normal distribution
mean and standard deviation
Which major assumption of the t-test deals with the amount of variability in each group?
Variances are equal
In order to determine whether or not you will reject the null hypothesis, the test statistic must be compared against the ___________.
Critical value
In the formula that computes a t value, what does Xbar1 represent?
Mean for Group 1
If the obtained value is greater than the critical value, what should you do?
Reject the null hypothesis
What does the number 2.001 represent in the following: t = 2.001, p < .05?
the obtained t value
If your obtained t value is 6.10 and the critical value is 7.14, what decision should you make?
Fail to reject (accept) the null hypothesis
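The decision rule in the cards above can be written as a tiny sketch. The function name `decide` is hypothetical, and comparing absolute values is an assumption that suits a two-tailed test, where a large negative obtained value is as extreme as a large positive one.

```python
def decide(obtained, critical):
    """Compare the magnitude of the obtained value with the critical value."""
    if abs(obtained) > abs(critical):
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

decision = decide(6.10, 7.14)
print(decision)   # fail to reject the null hypothesis
```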
In order to be 95% confident you have not committed a Type I error, at what level should you set your p value?
.05
When examining group difference where no direction of the difference is specified, which of the following is used?
Two-tailed test
The hypothesis that directly evaluates the statistical test is called the
research hypothesis
When interpreting F(2, 27) = 8.80, p < .05, what is the between-groups df?
2
When computing the degrees of freedom for ANOVA, how is the between-group estimate calculated?
k - 1
When testing difference in the means of three groups simultaneously, you would use which of the following statistical tests?
ANOVA
Computing the between-group variance first calls for summing the squared differences between the grand mean and the group means. This is known as the _______.
SS between
Step 1 of the hypothesis testing steps includes the hypotheses as well as the
assumptions
The F value tests the __________ between groups.
variability
If you set your significance level at 0.05 and you obtain a probability (p-value) of 0.03, you would
reject the null and find support for the research hypothesis
Which of the following is the formula for computing the F statistic?
F = MS between/MS within
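To make F = MS between/MS within concrete, here is a minimal one-way ANOVA sketch on made-up data. The function name and the three groups are hypothetical. It uses k - 1 for the between-groups df as in the cards, and assumes the usual N - k for the within-groups df.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # SS between: squared distance of each group mean from the grand mean,
    # weighted by the group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # SS within: squared distance of each score from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical scores
f_value, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")   # F(2, 6) = 13.00
```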
If your test statistic (obtained value) is LESS than the critical value, you would
fail to reject (accept) the null and reject the research hypothesis
A Type I error is committed when
we reject a null hypothesis that is true
descriptive statistics
statistics used to organize and describe the characteristics of a collection of data
inferential statistics
Involves using a sample to draw conclusions about a population.
sample
subset of the population or universe of interest and conveys information that is of administrative usefulness
population
all observations or all theoretically conceivable observations concerning a phenomenon of interest
nominal measurement
characterized by data that consist of names, labels, or categories only
ordinal measurement
A level of measurement in which different numbers indicate the rank order of cases on some variable.
interval measurement
a level of measurement that has the qualities of the ordinal level plus the requirement that intervals between assigned numbers represent equal distances in the variable being measured
ratio measurement
the highest level of measurement; has a rational, meaningful zero and therefore provides information about the absolute magnitude of the attribute
variable
Anything that, when measured, can produce two or more different values
categorical (qualitative) variable
variable that consists of outcomes that, without modification, are not expressed numerically
numerical (quantitative) variable
variable that measures outcomes that are expressed numerically
discrete variable
variables that arise from a counting process
continuous variable
variables that arise from a measuring process
frequency distribution
method for illustrating the distribution of scores within class intervals
cumulative frequency distribution
frequency distribution that shows frequencies for class intervals along with the cumulative frequency for each
central tendency
statistical measure to determine a single score that defines the center of a distribution.
mean
typical average score
median
middle score
mode
most common score
weighted mean
when the mean is computed by giving each data value a weight that reflects its importance
variance
square of the standard deviation and another measure of a distribution's spread or dispersion
range
Distance between highest and lowest scores in a set of data.
fractiles
a value below which a given proportion or percentage of the data lies
quartile
values that divide the distribution into four equal parts
standard deviation
average deviation from the mean of a sample
class interval (width)
the range of values contained in a given category of a frequency distribution
class frequency
the number of observations or items assigned to a given category
upper class limit
the maximum value that appears in a given category
lower class limit
the minimum value appearing in a given group
mid-point
the point halfway between the upper and lower class limits of a given category
graphical excellence
well designed presentation of data that provides substance, statistics, and design
skewness
the quality of a distribution that defines the disproportionate frequency of certain scores.
positive skew
a longer right tail than left means a smaller number of occurrences at the high end of the distribution
negative skew
shorter right tail than left meaning a larger number of occurrences at the high end of the distribution
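The direction of skew can be checked numerically. The sketch below uses the adjusted sample skewness formula, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations. That this matches Excel's SKEW is an assumption, and the function name and data sets are made up for illustration.

```python
import statistics

def sample_skewness(values):
    """Adjusted sample skewness; positive means a longer right tail."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, divides by n - 1
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

symmetric = [1, 2, 3, 4, 5]        # no skew
right_tailed = [1, 1, 1, 2, 10]    # few high scores: positive skew
left_tailed = [1, 9, 10, 10, 10]   # few low scores: negative skew

for data in (symmetric, right_tailed, left_tailed):
    print(data, round(sample_skewness(data), 2))
```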
collectively exhaustive
tells us that the distribution will accommodate all observations, ranging from the smallest to the largest value in the set of data
collectively exhaustive "example"
Example: Individuals grouped by race
Two Categories Chosen: African-American and White
Problem: 10 Hispanic individuals are in our sample
Your categories were not collectively exhaustive
mutually exclusive
tells us that an observation is assigned to one and only one category
mutually exclusive "example"
Example: Hospitals grouped by type
Types: Rural, urban, academic, for-profit, not-for-profit, and government
What about Cooper Green (government and urban) and Shelby Baptist (rural and not-for-profit)? Where do they fit?
Problem: Your categories were not mutually exclusive.
correlation coefficient
Examines the relationship between variables.
How the value of one variable changes in relation to changes in another variable
positive correlation
Direct correlation
When variables change in the same direction
negative correlation
Indirect correlation
When variables change in opposite directions
Strength of correlation coefficient
The strength of the relationship between two variables is measured by the coefficient of correlation ρ ("rho").
Significance
Any difference between groups that is due to a systematic influence rather than chance
statistical significance
The degree of risk you are willing to take that you will reject a null hypothesis when it is actually true
significance vs. meaningfulness
A study can be statistically significant but not very meaningful
Statistical significance can be interpreted only in terms of the context in which it occurred
Statistical significance should not be the only goal of scientific research
Significance is influenced by sample size
If the computed statistic (obtained value) is more extreme than the critical value...
the null hypothesis cannot be accepted.
Only when the obtained value is more extreme than the critical value can you say that any difference you find is not due to chance; the null is then rejected and you find support for your research hypothesis
If the obtained value does not exceed the critical value...
the null hypothesis is the most attractive explanation
Any difference you have found must be due to chance or something you don't have control over
Interpret t= -10.44, P <.001
What does "t = -10.44" mean?
t represents the test statistic used: t test
-10.44 is the obtained value (from the formula)
Interpret t= -10.44, P <.001
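What does "P < .001" mean?
P < .001 indicates that the probability of obtaining a result this extreme when the null hypothesis is true (the risk of a Type I error if we reject it) is less than .001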
The p-value is compared with the desired significance level of our test and, if it is smaller, the result is significant. That is, if the null hypothesis were to be rejected at the 5% significance level, this would be reported as "p < 0.05".
Interpret F(2,27) = 8.80, p < .05
F = test statistic
2,27 = df between groups and df within groups
8.80 = obtained value
p < .05 = probability less than 5% of obtaining an F value this large if the null hypothesis is true
weighted mean definition
When the mean is computed by giving each data value a weight that reflects its importance, it is referred to as a weighted mean
In the computation of a grade point average (GPA), the weights are the number of credit hours earned for each grade
weighted mean formula
$$\bar{x} = \frac{\sum_i w_i x_i}{\sum_i w_i}$$
$x_i$ = value of observation i
$w_i$ = weight for observation i
Check Unit 2. Slide 21
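The GPA case can be sketched in a few lines of Python. The grade points and credit hours below are invented for illustration, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

grade_points = [4.0, 3.0, 2.0]   # hypothetical grades: A, B, C
credit_hours = [3, 4, 3]         # weights: credit hours earned for each grade
gpa = weighted_mean(grade_points, credit_hours)
print(gpa)                       # 3.0
```

With equal weights the result reduces to the ordinary mean.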
Probability
is the Numerical Measure of the Likelihood that an Event Will Occur
Peri-natal Mortality Rate definition
Death rate including both fetal and neonatal deaths
Type I error
A jury sometimes makes an error and an innocent person goes to jail. Statisticians, being highly imaginative, call this a Type I error.
Risk of rejecting the null and accepting the research hypothesis when there really was no difference between the tested value and the actual value
Note that the probability of a Type I error is often called "alpha"; it is the significance level, set in advance, against which the p-value is compared.
Type II error
Sometimes, guilty people are set free. Statisticians have given this error the highly imaginative name, Type II error.
The probability of failing to reject a null hypothesis when it is false
The Type II error is often called "beta". The power of the test = ( 100% - beta).
null hypothesis
a statement of no relationship between variables (or the opposite relationship from the research hypothesis)
research hypothesis
a research hypothesis is a definite statement that there is a relationship between variables
Non-directional Research Hypothesis
Nondirectional -- H1 : X1 =/= X2
Reflects a difference; direction is not specified
Directional Research Hypothesis
Directional -- H1 : X1 > X2
Reflects a difference; direction is specified
Non-directional -- H1 : X1 =/= X2
Define each variable.
H1: represents the symbol for the first (of probably several) research hypotheses
xbar1: represents the average LOS for Alabama nursing home residents
xbar2: represents the average LOS for Mississippi nursing home residents
What is another name for directional hypothesis?
One-tailed test
What is another name for non-directional hypothesis?
Two-tailed test
Differences between null and research hypotheses
Null: No relationship between variables. Refers to the population. Indirectly tested and often implied. Written using Greek symbols (µ). Implied hypothesis.
Research: A relationship between variables exists. Refers to the sample. Directly tested and explicitly stated. Written using Roman symbols (Xbar). Explicit hypothesis.
What makes a good hypothesis?
Stated in declarative form (not a question)
Posits an expected relationship
Reflects the theory or literature on which they are based
Brief and to the point
Testable
