Statistics Test #1 - VOCAB



Statistics
the science of collecting, describing, and interpreting data
Data
observations (measurements or survey responses) that have been collected
Population
the complete collection of the people, objects, events, etc. that are to be analyzed
Sample
a subcollection of members selected from the population
Statistical Thinking includes:
1. context of the data
2. source of the data
3. sampling method
4. practical implications
5. conclusion
A parameter
a number describing a population
A statistic
number describing some characteristic of a sample
Discrete Data
numerical data that takes separate values with gaps in between
(age, # siblings, shoe sizes)
when data values are quantitative and the #s of values are FINITE or COUNTABLE
Continuous Data
any number
(height, weight, volume)
infinitely many possible quantitative values (NOT COUNTABLE)
Reported Data
Qualitative Data
categorical (not a #)
(letter grade, ethnicity, eye color, hair color)
Observational Study
we observe and measure specific characteristics but don't attempt to MODIFY the subjects being studied
Experiment
apply some treatment, then proceed to observe its effects on the subjects
Random Sample
members from the population are selected so that each individual has an equal chance of selection
Simple Random Sample (of size n)
members of the population are selected so that each possible sample of size n has an equal chance of selection
*If it's not random, it's not simple random
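A minimal Python sketch of drawing a simple random sample. The population (ID numbers 1 to 100) and the sample size n = 5 are made up for illustration. `random.sample` draws without replacement, so every possible group of n members is equally likely.

```python
import random

# Hypothetical population: ID numbers 1..100 (illustrative only)
population = list(range(1, 101))
n = 5

rng = random.Random(42)           # seeded so the example is repeatable
srs = rng.sample(population, n)   # every size-n subset is equally likely
print(srs)
```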
Cluster Sample
Divide the population into sections (clusters), randomly select one or more clusters, then select ALL members from those clusters
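A short sketch of cluster sampling, assuming a made-up population of four classrooms. Two whole clusters are picked at random, then every member of those clusters is kept.

```python
import random

# Hypothetical population: 4 classrooms (clusters) of student names
clusters = {
    "Room A": ["Ann", "Bob", "Cy"],
    "Room B": ["Dee", "Eli"],
    "Room C": ["Fay", "Gus", "Hal"],
    "Room D": ["Ivy", "Jon"],
}

rng = random.Random(1)
chosen = rng.sample(sorted(clusters), 2)   # randomly pick 2 whole clusters
# keep ALL members of each chosen cluster
cluster_sample = [s for room in chosen for s in clusters[room]]
print(chosen, cluster_sample)
```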
Stratified Sample
Divide the population into subgroups so that the subjects within the subgroups share some characteristic, then draw a sample from each subgroup
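A short sketch of stratified sampling, assuming two made-up subgroups (class years) and an assumed sample size of 2 per subgroup. The key difference from cluster sampling is that every subgroup contributes members.

```python
import random

# Hypothetical strata: students grouped by class year
strata = {
    "freshman": ["f1", "f2", "f3", "f4"],
    "senior": ["s1", "s2", "s3", "s4"],
}
per_stratum = 2  # assumed number drawn from each subgroup

rng = random.Random(7)
# draw a random sample from EVERY subgroup
stratified = {name: rng.sample(members, per_stratum)
              for name, members in strata.items()}
print(stratified)
```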
Systematic Sampling
select some starting point, then select every kth element of the population
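A short sketch of systematic sampling, assuming a made-up list of 30 numbered items and k = 5. The starting point is chosen at random here, which is one common choice rather than a requirement stated in the definition.

```python
import random

population = list(range(1, 31))  # hypothetical: 30 numbered items
k = 5                            # assumed step: take every 5th element

rng = random.Random(3)
start = rng.randrange(k)         # random starting index in 0..k-1
systematic = population[start::k]
print(systematic)
```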
Frequency Distribution
a list that pairs each data value (either individually or by group intervals) with its frequency
Normal Frequency
if the frequencies start out low, then increase to one or two high frequencies, then decrease to a lower frequency
symmetric around the middle (bell shaped)
Relative Frequency Distribution
list the relative frequencies of each class instead of the frequency
given in decimal form or percentage
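A small sketch of building a frequency distribution and a relative frequency distribution from made-up letter grades. Each relative frequency is the class frequency divided by the total count.

```python
from collections import Counter

grades = ["A", "B", "B", "C", "A", "B", "D", "B", "C", "A"]  # made-up data

freq = Counter(grades)                               # frequency distribution
total = sum(freq.values())
rel_freq = {g: c / total for g, c in freq.items()}   # relative frequencies (decimals)

for g in sorted(freq):
    print(g, freq[g], f"{rel_freq[g]:.0%}")          # also shown as a percentage
```

The relative frequencies always add up to 1 (or 100%).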
Pie Chart
the amount of data that belongs in each category is shown as the corresponding proportion of a circle
Pareto Chart
a bar graph for qualitative (non-numerical) data with bars arranged high to low
Dot Plot
each piece of data is represented as a dot along a scale
Stem-and-Leaf Display
each piece of data is divided into 2 parts:
Leading digits = stem
Trailing digits = leaf
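A small sketch of a stem-and-leaf display for made-up two-digit scores, assuming the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

scores = [62, 75, 71, 88, 93, 75, 84, 67, 90]  # made-up two-digit data

stems = defaultdict(list)
for x in sorted(scores):
    stems[x // 10].append(x % 10)   # tens digit = stem, ones digit = leaf

for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```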
Bar Graph
uses bars of equal width to show frequencies of categories with qualitative data
Multiple Bar Graph
has two or more sets of bars and is used to compare two or more sets of data
Histogram
a bar chart where the bars touch one another that displays frequency distributions
Mean
the average of a group of #s. mean = x bar
Median
number that lies in the middle when the data is sorted by size. median = x tilde
Mode
number that occurs most often. mode = M
Midrange
average of the max and the min
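A quick sketch computing the four measures of center on a made-up sample, using Python's standard `statistics` module. The midrange has no built-in function, so it is computed directly.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up sample

mean = statistics.mean(data)            # sum of values / number of values
median = statistics.median(data)        # middle of the sorted data
mode = statistics.mode(data)            # most frequent value
midrange = (max(data) + min(data)) / 2  # average of the max and the min
print(mean, median, mode, midrange)
```

With an even number of values, the median is the average of the two middle values (here 3 and 5).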
Weighted Mean
used to determine an average value if there are different weights (think GPA)
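A quick sketch of a weighted mean using a made-up GPA example, where grade points are weighted by credit hours: multiply each value by its weight, add the products, and divide by the total weight.

```python
# Hypothetical GPA example: (grade points, credit hours)
courses = [(4.0, 3), (3.0, 4), (2.0, 1)]

weighted_mean = sum(g * w for g, w in courses) / sum(w for _, w in courses)
print(round(weighted_mean, 3))
```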
Range
difference between the high and low data values
Standard Deviation
the square root of the variance
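A quick sketch of the range, variance, and standard deviation on a made-up sample. It assumes the sample versions of the formulas (dividing by n - 1), which is what `statistics.variance` computes.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

data_range = max(data) - min(data)     # high value minus low value
variance = statistics.variance(data)   # sample variance (divides by n - 1)
s = math.sqrt(variance)                # standard deviation = sqrt(variance)
print(data_range, variance, s)
```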
Range Rule of Thumb
values that lie more than 2 standard deviations from the mean are unusual values. values lying within 2 standard deviations are usual
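A tiny sketch of the rule as a check. The function name `is_unusual` and the summary values (mean 100, standard deviation 15) are assumptions for illustration.

```python
def is_unusual(x, mean, s):
    """Range rule of thumb: unusual if more than 2 standard deviations from the mean."""
    return x < mean - 2 * s or x > mean + 2 * s

# Assumed summary values: mean = 100, s = 15, so usual values run from 70 to 130
print(is_unusual(135, 100, 15))
print(is_unusual(120, 100, 15))
```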
The Empirical Rule
applies to data with normal distribution
1. about 68% of data falls within 1 standard deviation (S) of the mean
2. about 95% of data falls within 2 S of the mean
3. about 99.7% of data falls within 3 S of the mean
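A quick check of the three percentages using the standard normal distribution from Python's `statistics.NormalDist`. The share of data within k standard deviations is the area between -k and +k.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

for k in (1, 2, 3):
    share = z.cdf(k) - z.cdf(-k)   # proportion within k standard deviations
    print(k, f"{share:.1%}")
```

The rule's 68%, 95%, and 99.7% are rounded versions of these areas.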
z Score
number of standard deviations that a given value of x lies above or below the mean
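A tiny sketch of this calculation: subtract the mean from x, then divide by the standard deviation. The function name and the summary values (mean 100, standard deviation 15) are assumptions for illustration.

```python
def z_score(x, mean, s):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / s

print(z_score(130, 100, 15))   # above the mean, so positive
print(z_score(85, 100, 15))    # below the mean, so negative
```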
Percentile
the kth percentile separates the lower k% of the data from the upper (100 - k)%
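A tiny sketch of finding the percentile of a value, assuming one common textbook convention: count the values strictly below x and divide by the total. Other conventions exist and can give slightly different answers.

```python
def percentile_of(x, data):
    """Percent of data values that are strictly less than x."""
    below = sum(1 for v in data if v < x)
    return 100 * below / len(data)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # made-up data
print(percentile_of(85, scores))
```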
used to determine normality but offers less specific information. the higher S is, the more varied and less predictable the data is
Experiment
simple process that can be repeated and may result in different outcomes
Sample Space
set of ALL possible outcomes of an experiment
Event
a set made up of some of the possible outcomes of an experiment
Law of Large Numbers
as a procedure is repeated again and again, the relative frequency approximation of the probability of an event tends to approach the actual probability
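A small simulation sketch of this idea, assuming a fair coin (actual probability of heads = 0.5). The helper name `heads_share` is made up. As the number of flips grows, the relative frequency of heads should settle closer to 0.5.

```python
import random

rng = random.Random(0)  # seeded so the example is repeatable

def heads_share(n):
    """Relative frequency of heads in n simulated fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, heads_share(n))
```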