Intro to Statistics Reading Quiz, Chapters 1-3
Terms in this set (85)
A frequency distribution lists the _________ of occurrences of each category of data, while a relative frequency distribution lists the _________ of occurrences of each category of data.
number; proportion
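As a minimal sketch of the distinction, the snippet below tallies a frequency distribution and converts it to a relative frequency distribution. The `responses` list is made-up data for illustration only.

```python
from collections import Counter

# Hypothetical survey responses (invented for illustration)
responses = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]

# Frequency: number of occurrences of each category
frequency = Counter(responses)

# Relative frequency: proportion of occurrences of each category
n = len(responses)
relative_frequency = {category: count / n for category, count in frequency.items()}

print(dict(frequency))
print(relative_frequency)
```

The relative frequencies should add up to 1 (or 100%), which is a quick check that a table really is a relative frequency distribution.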
A quantitative variable that has an infinite number of possible values that are not countable is called _______.
continuous
When an odd number of data values are arranged in order, the ____ is the middle value
median
Suppose a newspaper surveys 250 adults in a nearby town and inquires about their cell phone carrier. The accompanying table summarizes the results. Does this table describe a relative frequency distribution? Why or why not?
No. The sum of the relative frequencies is 95%, not 100%.
A variable is at the ____ level of measurement if it allows for the values of the variable to be arranged in a specific order and both differences and ratios of values have meaning.
ratio
In a ___ study, data are collected in the future from cohorts which share common factors.
prospective
Which of the statements below is true concerning bar graphs?
The height of each bar represents the category's frequency or relative frequency.
The coefficient of variation describes the standard deviation relative to the _____ and is expressed as a _____.
mean, percent
What is the symbol used to represent the sample variance?
s² (s squared)
An ogive will always end at a point whose y-coordinate is the ____
Sample size
True or False? A histogram and a relative frequency histogram constructed from the same data always have the same basic shape.
True. A relative frequency histogram will have a different scale on the y-axis but the same shape as a regular histogram.
A ____ sample is obtained by selecting individuals who are easily contacted or those who voluntarily respond.
convenience
The _____ is found by adding all the data values and dividing by the total number of values.
mean
A variable is at the ____ level of measurement if it allows for the values of the variable to be arranged in a specific order and the difference between any two values is meaningful, but the ratio is not.
interval
The _______ is the difference between two consecutive lower class limits or two consecutive upper class limits.
class width
If all the data values in a set are identical what can you conclude about the standard deviation?
The standard deviation is zero.
What is a placebo and what purpose does it serve in an experiment?
A placebo is a fake treatment that looks like the treatment being tested in the experiment. Placebos blind subjects so they do not know whether or not they are receiving the treatment.
A ____ is found by adding the lower and upper class limit and then dividing the sum by 2.
class midpoint
A company was conducting a survey to investigate people's spending habits and how they may have changed in recent years. One question on the survey was, "Did you spend more/less/the same amount of money this year as you did in 2007, the year the recession began in earnest in this country?" Is this question biased? If so, what answer does it favor?
This question is biased toward "spend less," since it mentions the recent recession. Many people would feel that they should answer that they spent less, since the country is in a recession.
A sample of thirty users of a popular social networking site yielded the histogram on the right for the number of friends. What is the relationship between the mean and the median for this data?
The mean and median will be roughly equal since the distribution is fairly symmetric.
Do people walk faster in an airport when they are departing (getting on a plane) or after they have arrived (getting off a plane)? An interested passenger watched a random sample of people departing and a random sample of people arriving and measured the walking speed (in feet per minute) of each. What type of study design is being performed?
observational study
What is wrong with the following class limits for organizing weight data for a sample of 200 adult men in the United States?
140-150 pounds
150-160 pounds
160-170 pounds
170-180 pounds
180-190 pounds
190-200 pounds
200-210 pounds
210-220 pounds
220-230 pounds
Classes are overlapping.
The numbers used to separate the classes of a frequency distribution, but without the gaps created by class limits, are called
class boundaries.
Suppose every student in a class is surveyed and it is reported that 75% of the class plans to take another math class. Is this an example of descriptive or inferential statistics? Explain.
Descriptive statistics; the results of the class sample are described without making any generalizations about the population of all students at the school.
Which two graphs allow the reader to retrieve the original list of data?
Stem-and-leaf plots and dotplots
What is the symbol used to represent the sample standard deviation?
s
Suppose a student earns a 75 on his statistics exam, and his grade has a z-score of 1.5. Since the class did not perform well on the exam, the professor announces that she will adjust the grades by adding 10 points to each score. How will this adjustment change the student's z-score?
The z-score will not change, since the adjustment shifts the entire distribution of scores but does not change the relative position of the student's score in the class.
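A quick numerical check of this claim, as a sketch: the `scores` list is hypothetical (it does not reproduce the z-score of 1.5 from the question), and the population standard deviation is assumed for simplicity.

```python
from statistics import mean, pstdev

def z_score(value, data):
    """Number of standard deviations that value lies from the mean of data."""
    return (value - mean(data)) / pstdev(data)

scores = [55, 60, 65, 70, 75]        # hypothetical exam scores
adjusted = [s + 10 for s in scores]  # every score shifted up by 10 points

z_before = z_score(75, scores)
z_after = z_score(85, adjusted)      # the same student after the adjustment

print(round(z_before, 2), round(z_after, 2))
```

Both values agree because the shift raises the mean by 10 and leaves the standard deviation unchanged.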
What purpose does replication serve in an experiment?
Replication ensures that the effect of a treatment is not due to some characteristic of a single experimental unit.
A z-score represents how many ______________ a data value is above or below the ______________.
standard deviations; mean
A(n) ____________________ is a bar graph in which the height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same, and the rectangles touch each other.
histogram
Which statement is NOT true regarding the mean?

A.
The calculation of the mean uses all the values in the data set.
B.
The mean should be used when the distribution is roughly symmetric.
C.
The mean is the center of gravity or balancing point for the data set.
D.
The mean is always the best measure of center.
D.
The mean is always the best measure of center.
If the maximum and minimum values in a data set are averaged, the result is the __________.
midrange
Suppose a pharmaceutical company is designing a double-blind experiment to test their new allergy medication. They divide the 200 subjects by gender and then randomly assign the men and women to either receive the medication or a placebo. The researcher finds that even the group receiving the placebo (a sugar pill) showed a decrease in allergy symptoms. Which of the following accounts for this result?

Choose the correct answer below.
Blocking
Confounding
Placebo effect
Response bias
Placebo effect
The ____________________ for a class is the sum of the frequencies for that class and all previous classes.
cumulative class frequency
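A small sketch of the running total, using hypothetical class frequencies listed in class order:

```python
from itertools import accumulate

# Hypothetical class frequencies, in class order
frequencies = [3, 5, 2]

# Each entry is the frequency of that class plus all previous classes
cumulative = list(accumulate(frequencies))

print(cumulative)
```

The last cumulative frequency equals the total number of observations, which is why a frequency ogive ends at the sample size.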
The _________________ is/are the entire group of individuals or items being studied.
population
What is the symbol used to represent the population mean?

Choose the correct answer below.
A. s
B. μ (mu)
C. x̄ (x-bar)
D. σ (sigma)
B. μ (mu)
What is the symbol used to represent the population variance?

A. σ (sigma)
B. s² (s squared)
C. s
D. σ² (sigma squared)
D. σ² (sigma squared)
Explain why the mean should not be found for a sample of zip codes. Which measure of center should be used instead?
Choose the correct answer below.
A.
Zip codes are quantitative data, but are at the ordinal level of measurement. Since they cannot be added in a meaningful way, the mean cannot be found. The median should be used instead.
B.
Zip codes are quantitative data, but are at the ordinal level of measurement. Since they cannot be added in a meaningful way, the mean cannot be found. The mode should be used instead.
C.
Even though they are numeric data, zip codes are qualitative since they do not measure or count anything. The mean cannot be found since adding zip codes would be meaningless. For qualitative data, the mode is the only measure of center that can be found.
D.
Even though they are numeric data, zip codes are qualitative since they do not measure or count anything. The mean cannot be found since adding zip codes would be meaningless. For qualitative data, the median is the only measure of center that can be found.
C.
Even though they are numeric data, zip codes are qualitative since they do not measure or count anything. The mean cannot be found since adding zip codes would be meaningless. For qualitative data, the mode is the only measure of center that can be found.
Which distribution shape (skewed left, skewed right, or symmetric) is most likely to result in the mean being substantially smaller than the median?
Choose the correct answer below.
A.
A distribution that is skewed right will likely have a mean that is smaller than the median since the extreme values in the tail tend to pull the mean to the right.
B.
A distribution that is skewed left will likely have a mean that is smaller than the median since the extreme values in the tail tend to pull the mean to the left.
C.
A symmetric distribution will likely have a mean that is smaller than the median since the median is more resistant to extreme values than the mean.
D.
There is no way to predict the relationship between the mean and median based on the shape of the distribution.
B.
A distribution that is skewed left will likely have a mean that is smaller than the median since the extreme values in the tail tend to pull the mean to the left.
Fill in the blank to complete the statement below.
A _________________ is a numerical measurement describing some characteristic of a population.
parameter
A company advertises a mean lifespan of 1000 hours for a particular type of light bulb. If you were in charge of quality control at the factory, would you prefer that the standard deviation of the lifespans for the light bulbs be 5 hours or 50 hours? Why?

Choose the correct answer below.
A.
5 hours would be preferable since a smaller standard deviation indicates a longer average lifespan for the light bulbs.
B.
5 hours would be preferable since a smaller standard deviation indicates more consistency.
C.
50 hours would be preferable since a larger standard deviation indicates a longer average lifespan for the light bulbs.
D.
50 hours would be preferable since a larger standard deviation indicates more consistency.
B.
5 hours would be preferable since a smaller standard deviation indicates more consistency.
Suppose you want to calculate the z-score for your height. How will the z-scores compare if you use your height in inches versus centimeters?
Choose the correct answer below.
A.
If you use your height in inches, the z-score will be larger since an inch is larger than a centimeter.
B.
The z-scores will be the same regardless of the unit used for your height because z-scores are unitless.
C.
The z-scores will be different based on the unit used for height, but it is not possible to predict which will yield a larger z-score.
D.
If you use your height in inches, the z-score will be smaller since you are measuring it in a larger unit.
B.
The z-scores will be the same regardless of the unit used for your height because z-scores are unitless.
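One way to see why the unit cancels, sketched under the assumption that a unit conversion is a pure rescaling x' = c·x with c > 0 (c = 2.54 for inches to centimeters), so that the mean and standard deviation are rescaled by the same factor:

```latex
z' = \frac{x' - \mu'}{\sigma'}
   = \frac{c\,x - c\,\mu}{c\,\sigma}
   = \frac{x - \mu}{\sigma}
   = z
```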
Fill in the blank to complete the statement below.
A __________ sample is obtained by randomly selecting an individual and then selecting every kth individual from the population after the first one.
systematic
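A minimal sketch of systematic sampling. The function name `systematic_sample`, the `seed` parameter, and the numbered population are all hypothetical, and choosing the random start among the first k individuals is one common convention rather than the only one.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k items, then take every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random index in 0..k-1
    return population[start::k]

people = list(range(1, 101))  # hypothetical population labeled 1..100
sample = systematic_sample(people, k=10, seed=1)

print(sample)
```

With 100 individuals and k = 10, the sample always contains 10 individuals spaced exactly 10 apart; only the starting point is random.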
In a _______ study, data are observed, measured, and collected at one point in time.
cross-sectional
The data value that occurs with the greatest frequency is called the ___________.
mode
Explain why Social Security Number is considered a qualitative variable even though it contains numbers.
Choose the correct answer below.
A.
Addition and subtraction of Social Security Numbers does not provide meaningful results. This makes it qualitative even though it is numeric.
B.
Since Social Security Number is a variable at the interval level of measurement, it must be qualitative rather than quantitative.
C.
Social Security Number is a qualitative variable since there are an infinite number of possible values that are not countable.
D.
Social Security Number is a qualitative variable since there are a finite or countable number of values.
A.
Addition and subtraction of Social Security Numbers does not provide meaningful results. This makes it qualitative even though it is numeric.
Which measure of center (mean or median) is resistant? Explain what it means for that measure to be resistant.
Choose the correct answer below.
A.
The median is resistant because it is not sensitive to extreme values in the data set. If the largest observation was doubled, for example, the median would not change since that largest value does not factor into its computation.
B.
The mean is resistant because it is sensitive to extreme values in the data set. If the largest observation was doubled, for example, the mean would change since that largest value factors into its computation.
C.
The mean is resistant because it is not sensitive to extreme values in the data set. If the largest observation was doubled, for example, the mean would not change since that largest value does not factor into its computation.
D.
The median is resistant because it is sensitive to extreme values in the data set. If the largest observation was doubled, for example, the median would change since that largest value factors into its computation.
A.
The median is resistant because it is not sensitive to extreme values in the data set. If the largest observation was doubled, for example, the median would not change since that largest value does not factor into its computation.
How can you tell from a boxplot if the distribution is skewed right?
Choose the correct answer below.
A.
The median is to the right of the center of the box, and the left whisker is substantially longer than the right whisker.
B.
The median is to the right of the center of the box, and the right whisker is substantially longer than the left whisker.
C.
The median is to the left of the center of the box, and the left whisker is substantially longer than the right whisker.
D.
The median is to the left of the center of the box, and the right whisker is substantially longer than the left whisker.
D.
The median is to the left of the center of the box, and the right whisker is substantially longer than the left whisker.
When plotting an ogive, the x-coordinates of the points are equal to the __________________ of each class.
upper limits
Suppose that a researcher is interested in the average standardized test score for fifth graders in a local school district. The fifth graders at a specific school would comprise a ___________ and their average test score would be a ___________.
sample; statistic
A study that requires individuals to look back in time or requires the researcher to look at existing records is considered _______.
retrospective
What is the symbol used to represent the population standard deviation?

A.
σ (sigma)
B.
s
C.
μ (mu)
D.
x̄ (x-bar)
A.
σ (sigma)
The _________________ is/are a subset of the population that is being studied.
sample
When comparing two populations with the same variable of interest in the same unit of measure, the larger the standard deviation, the _____________ dispersion there is in the distribution.
more
To find the mean from a frequency distribution, calculate the weighted mean by using the class ______________ as weights.
frequencies
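As a sketch of this calculation, the snippet below weights each class midpoint by its class frequency. The grouped data are hypothetical, and the result is an approximation of the mean because the individual values inside each class are not known.

```python
# Hypothetical grouped data: (lower limit, upper limit) -> class frequency
classes = {(10, 19): 3, (20, 29): 5, (30, 39): 2}

# Class midpoint: (lower limit + upper limit) / 2
midpoints = {limits: (limits[0] + limits[1]) / 2 for limits in classes}

# Weighted mean of the midpoints, using the class frequencies as weights
total = sum(classes.values())
grouped_mean = sum(midpoints[c] * f for c, f in classes.items()) / total

print(grouped_mean)  # 23.5
```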
What is the symbol used to represent the sample mean?

Choose the correct answer below.
A.
σ (sigma)
B.
x̄ (x-bar)
C.
μ (mu)
D.
s
B.
x̄ (x-bar)
After constructing any relative frequency distribution, what should be the sum of the relative frequencies?

Choose the correct answer below.
a. the sample size
b. the total of all class frequencies
c. 0
d. 1 or 100%
d. 1 or 100%
A variable is at the _______ level of measurement if it allows for the values of the variable to be arranged in a specific order but the differences between values either cannot be determined or are meaningless.
ordinal
Suppose every student in a class is surveyed and it is found that 75% of the class plans to take another math class. It is reported that 75% of all students at the school plan to take another math class. Is this an example of descriptive or inferential statistics? Explain.
Choose the correct answer below.
A.
Descriptive statistics; the results of the class sample are described without making any generalizations about the population of all students at the school.
B.
Descriptive statistics; the results of the class sample are extended to make a generalization about the population of all students at the school.
C.
Inferential statistics; the results of the class sample are described without making any generalizations about the population of all students.
D.
Inferential statistics; the results of the class sample are extended to make a generalization about the population of all students at the school.
D.
Inferential statistics; the results of the class sample are extended to make a generalization about the population of all students at the school.
A __________ sample is obtained by dividing the population into groups and selecting all the individuals in randomly selected groups.
cluster
A variable is at the _______ level of measurement if the values of the variable name, label, or categorize.
nominal
A __________ sample is obtained by dividing the population into homogeneous groups and randomly selecting individuals from each group.
stratified
A sample of thirty users of a popular social networking site yielded the histogram on the right for the number of friends. What is the relationship between the mean and the median for this data?
Choose the correct answer below.
A.
The mean will be substantially smaller than the median since the distribution is skewed left.
B.
The mean and median will be roughly equal since the distribution is fairly symmetric.
C.
The mean will be substantially larger than the median since the distribution is skewed right.
D.
The mean will be substantially smaller than the median since the distribution is skewed right.
C.
The mean will be substantially larger than the median since the distribution is skewed right.
Suppose a pediatrician is wondering whether there is more variability in the heights or weights of the 2-year-old boys that he sees and collects the data below for a sample of 100 2-year-old boys in his practice. He concludes that the boys' weights vary more than their heights since the standard deviation is greater for weight than for height. What is wrong with this conclusion?
Heights: mean = 30.2 in., standard deviation = 1.9 in.
Weights: mean = 29.4 lb, standard deviation = 2.1 lb
Choose the correct answer below.
A.
Since the standard deviation is larger for the weights, the weights actually vary less than the heights.
B.
Since the standard deviations have different units, he cannot compare them directly. The coefficient of variation should be used instead.
C.
In order to compare the variability, he would need to calculate the variance instead of the standard deviation.
D.
Nothing is wrong with his conclusion. Since the heights and weights have equal means, it is perfectly valid to compare the standard deviations to decide which data has more variability.
B.
Since the standard deviations have different units, he cannot compare them directly. The coefficient of variation should be used instead.
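The comparison can be sketched with the figures given in the question. The helper name `coefficient_of_variation` is an illustrative choice; the formula is the standard deviation divided by the mean, expressed as a percent.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation relative to the mean, as a percent."""
    return std_dev / mean * 100

cv_height = coefficient_of_variation(1.9, 30.2)  # heights in inches
cv_weight = coefficient_of_variation(2.1, 29.4)  # weights in pounds

print(round(cv_height, 2))  # 6.29
print(round(cv_weight, 2))  # 7.14
```

Because the coefficient of variation is unitless, the two percentages can be compared directly. Here the weights do show slightly more relative variability, but that conclusion rests on the coefficients of variation, not on comparing inches with pounds.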
If someone's gross annual income has a z-score of positive 2, what can be concluded?
Choose the correct answer below.
A.
Their income is 2 standard deviations above the median income.
B.
Their income is 2 standard deviations above the mean income.
C.
Their income is twice the mean income.
D.
Their income is 2 standard deviations below the median income.
B.
Their income is 2 standard deviations above the mean income.
How can you tell from a boxplot if the distribution is symmetric?
Choose the correct answer below.
A.
It is not possible to determine the shape of the distribution from a boxplot.
B.
The median is in the center of the box, and the left and right whiskers are approximately the same length.
C.
The median is in the center of the box.
D.
The left and right whiskers are approximately the same length.
B.
The median is in the center of the box, and the left and right whiskers are approximately the same length.
If all the data values in a population are converted to z-scores, the distribution of z-scores will have what mean?

Choose the correct answer below.
A.
The mean of the z-scores cannot be determined.
B.
The mean of the z-scores will be zero.
C.
The mean of the z-scores will be the same as the mean of the original distribution of data.
D.
The mean of the z-scores will be one.
B.
The mean of the z-scores will be zero.
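A short numerical illustration, using a small hypothetical population and the population standard deviation:

```python
from statistics import mean, pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical population values

mu = mean(data)       # population mean
sigma = pstdev(data)  # population standard deviation

# Convert every value to a z-score
z_scores = [(x - mu) / sigma for x in data]

print(z_scores)
print(mean(z_scores), pstdev(z_scores))
```

The z-scores average to zero, and their standard deviation is one, since standardizing subtracts the mean and divides by the standard deviation.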
If the effect of two factors on the response variable cannot be distinguished, what can be concluded about the experiment?

Choose the correct answer below.
A.
There has been a placebo effect in the experiment.
B.
Blinding was not used in the experiment.
C.
There is a lurking variable in the experiment.
D.
Confounding has occurred in the experiment.
D.
Confounding has occurred in the experiment.
The interquartile range (IQR) is a measure of the ____________ of the middle ____________ percent of the data.
spread; 50
A _________________ is a numerical measurement describing some characteristic of a sample.
statistic
What is an advantage to using a stem-and-leaf plot instead of a histogram to display data?
Choose the correct answer below.
A.
A stem-and-leaf plot shows the shape of the distribution while the histogram does not.
B.
A stem-and-leaf plot allows for retrieval of the original data from the plot while the histogram does not.
C.
A stem-and-leaf plot is often much faster to create than a histogram, especially for large data sets.
D.
These are all advantages of a stem-and-leaf plot over a histogram.
B.
A stem-and-leaf plot allows for retrieval of the original data from the plot while the histogram does not.
What must be true for a sample to be considered a simple random sample?
Choose the correct answer below.
A.
Every member (or sample) of the population must have a known probability of being selected.
B.
Every individual member of the population must be selected.
C.
Every member (or sample) of the population must have a different probability of being selected.
D.
Every member (or sample) must have the same chance of being selected as every other member (or sample of the same size).
D.
Every member (or sample) must have the same chance of being selected as every other member (or sample of the same size).
If a professor adds 10 points to each student's final exam score, how will it affect the class mean on the final exam?

Choose the correct answer below.
A.
The mean will not change.
B.
The mean will increase by less than 10 points.
C.
The mean will increase by more than 10 points.
D.
The mean will increase by 10 points.
D.
The mean will increase by 10 points.
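The result follows directly from the definition of the mean. As a short derivation, with n students and original scores x_1, ..., x_n:

```latex
\bar{x}_{\text{new}}
  = \frac{1}{n}\sum_{i=1}^{n} (x_i + 10)
  = \frac{1}{n}\sum_{i=1}^{n} x_i + \frac{10n}{n}
  = \bar{x} + 10
```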
When reporting the mean or standard deviation, the result is usually rounded to how many decimal places?
Choose the correct answer below.
A.
The same number of decimal places as is present in the original data
B.
Two more decimal places than is present in the original data
C.
One more decimal place than is present in the original data
D.
One fewer decimal place than is present in the original data
C.
One more decimal place than is present in the original data
In a typical boxplot, the length of the box indicates which measure of spread?

Choose the correct answer below.
a. Range
b. Standard deviation
c. Variance
d. Interquartile range (IQR)
d. Interquartile range (IQR)
Explain the difference between a bar graph and a Pareto chart.
Choose the correct answer below.
A.
A Pareto chart is a particular type of bar graph in which the bars are drawn in decreasing order of height.
B.
A bar graph is used for qualitative data, and a Pareto chart is used for quantitative data.
C.
A Pareto chart is always drawn with vertical bars, while a bar graph can be horizontal or vertical.
D.
A Pareto chart is a particular type of bar graph in which the bars are touching and all the same width.
A.
A Pareto chart is a particular type of bar graph in which the bars are drawn in decreasing order of height.
Suppose you construct a graph to compare the student populations of the five largest high schools in your city and choose to depict the populations with school buildings of various sizes. If the school buildings are drawn so that the length and the width are each in proportion to the population of the corresponding schools, is the resulting graph misleading? Why or why not?

Choose the correct answer below.
A.
The graph will be misleading since the student populations are one-dimensional data, but the graph uses a two-dimensional school building to represent it.
B.
The graph will be misleading since pictures cannot be used to represent data.
C.
The graph will not be misleading since the areas of the school buildings are in proportion to the population of the corresponding schools.
D.
The graph will not be misleading since the school building picture being used is related to the school population data.
A.
The graph will be misleading since the student populations are one-dimensional data, but the graph uses a two-dimensional school building to represent it.
A ________________ variable counts or measures something and has numeric values.
quantitative
Suppose, on the warmest day of the month, the daily high temperature in a city is accidentally recorded as 700 instead of 70 degrees Fahrenheit. Compare the effect this mistake will have on the mean monthly high temperature to the effect on the median monthly high temperature.

Choose the correct answer below.
A.
The mean will increase significantly, but the median will not change as a result of the mistake.
B.
The median will increase significantly, but the mean will not change as a result of the mistake.
C.
The mean and median will both increase significantly as a result of the mistake.
D.
Neither the mean nor the median will increase as a result of the mistake.
A.
The mean will increase significantly, but the median will not change as a result of the mistake.
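The effect can be sketched with a short list of daily highs. Only the 70 versus 700 entry comes from the question; the other temperatures are hypothetical.

```python
from statistics import mean, median

# Hypothetical daily highs in degrees Fahrenheit; the warmest day is 70
correct = [58, 61, 63, 64, 66, 68, 70]
# Same data with the warmest day mistakenly recorded as 700
mistyped = [58, 61, 63, 64, 66, 68, 700]

print(round(mean(correct), 2), median(correct))
print(round(mean(mistyped), 2), median(mistyped))
```

The median stays at the same middle value because the mistake only affects the largest observation, while the mean is pulled far upward by the single extreme value.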
A(n) ____________________ is a graph that plots the class midpoints on the horizontal axis against the class frequencies on the vertical axis, and then connects the points with line segments.
frequency polygon
Suppose your statistics professor teaches two sections of your course this semester. She gives the same exam to each class. Their performance is summarized below. Can she conclude that the overall mean on the exam was 80 (the average of the two individual class means)? Why or why not?
First Class: n = 32, mean = 75, standard deviation = 5.6
Second Class: n = 38, mean = 85, standard deviation = 7.2
Choose the correct answer below.
A.
Yes. The two means can be averaged as long as the classes took the same exam.
B.
Yes. The two means can be averaged as long as each class is considered a simple random sample.
C.
No. Since the classes have different standard deviations on the exam, she cannot average the means.
D.
No. Since the class sizes are different, she would need to find the weighted mean.
D.
No. Since the class sizes are different, she would need to find the weighted mean.
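The weighted mean for these two classes can be worked out from the figures in the question, using the class sizes as weights:

```python
sizes = [32, 38]  # number of students in each class
means = [75, 85]  # mean exam score in each class

# Weighted mean: each class mean counts in proportion to its class size
weighted_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

# Simple average of the two class means, for comparison
simple_average = sum(means) / len(means)

print(round(weighted_mean, 2))  # 80.43
print(simple_average)           # 80.0
```

The overall mean is slightly above 80 because the larger class has the higher mean.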
A quantitative variable that has a finite or countable number of values is called _______.
discrete
A collection of data on class sizes at a community college produces the five-number summary below. Comment on the shape of the distribution of class sizes.
Min = 12
Q1 = 22
Q2 (median) = 35
Q3 = 38
Max = 40
Choose the correct answer below.
A.
The distribution appears to be skewed right since the median is further from the first quartile than the third quartile. Also, the left whisker would be longer than the right whisker in a boxplot for the data.
B.
The distribution appears to be fairly symmetric since the median is about the same distance from the first and third quartiles.
C.
The distribution appears to be skewed left since the median is closer to the first quartile than the third quartile. Also, the right whisker would be longer than the left whisker in a boxplot for the data.
D.
The distribution appears to be skewed left since the median is further from the first quartile than the third quartile. Also, the left whisker would be longer than the right whisker in a boxplot for the data.
D.
The distribution appears to be skewed left since the median is further from the first quartile than the third quartile. Also, the left whisker would be longer than the right whisker in a boxplot for the data.
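The reasoning can be checked by comparing distances within the five-number summary from the question. The decision rule below is a rough heuristic assumed for illustration, not a formal test of skewness.

```python
# Five-number summary from the question
minimum, q1, median, q3, maximum = 12, 22, 35, 38, 40

left_box = median - q1        # distance from first quartile to median: 13
right_box = q3 - median       # distance from median to third quartile: 3
left_whisker = q1 - minimum   # length of the left whisker: 10
right_whisker = maximum - q3  # length of the right whisker: 2

# Heuristic: a longer left side suggests a left skew, and vice versa
if left_box > right_box and left_whisker > right_whisker:
    shape = "skewed left"
elif left_box < right_box and left_whisker < right_whisker:
    shape = "skewed right"
else:
    shape = "roughly symmetric or unclear"

print(shape)  # skewed left
```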
A ________________ variable classifies individuals based on some attribute or characteristic.
qualitative
The interquartile range (IQR) is the difference between the _______ quartile and the _______ quartile.
third; first
