Statistics Test 1
Data
a collection of observations or measurements
sample
not all
population
all
statistic
a number about a sample
varies from sample to sample, but is usually "close enough" to the parameter
parameter
a number about a population
better because there's only one true value, but difficult to get
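A minimal sketch of the statistic vs. parameter idea, assuming a made-up population of 10,000 simulated values (the numbers, seed, and sample size are illustrative only). The parameter is computed once from everyone; each sample gives its own slightly different statistic.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 simulated measurements.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: one fixed number about the whole population.
parameter = statistics.mean(population)

# Statistics: one number per sample, so they vary from sample to sample.
sample_means = [
    statistics.mean(random.sample(population, 30)) for _ in range(3)
]

print("parameter:", round(parameter, 2))
print("statistics:", [round(m, 2) for m in sample_means])
```

Each sample mean lands near the parameter but none of them matches it exactly, which is the "variable but close enough" idea.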
statistical significance
something unexpected (rare) happened, too rare to blame on chance alone
practical significance
something meaningful has happened (the difference is big enough to matter in real life)
biased samples
a sample that does not match the population
1. correlation v. causality
correlation is a relationship, whereas causality is cause and effect. just because two things are related does not mean one caused the other
lurking variable
something we can't see or didn't measure (like money) that affects the outcome
confounding
lots of things affect the outcome at once, so we can't tell which one caused it
2. reported results
ask but do not measure (people might lie/not know)
3. small sample
fewer than 30 (rule of thumb)
4. loaded question
worded so that only one answer seems acceptable. no point in asking the question
5. order of question
the order the choices are given in pushes people to pick a certain answer. our brain stops at the first good thing, or if both are bad it goes to the second bad thing.
6. voluntary response
individuals choose to give information. typically gets very strong opinions and loses the middle
7. non response
ask many but only a few respond
8. missing data
group never has a chance to be asked
9. precise numbers
precise = exact or specific, values close to each other. accurate = correct. precise is not always accurate
10. percent error
can't have a percent that means half of a human, etc. percents don't mean the same thing going up and down.
11. bad source
if the source of the data has something to gain, then the data should not be trusted
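A quick worked example for card 10 (percents don't mean the same thing going up and down). This is only a sketch with a made-up starting value of 100; `percent_change` is a hypothetical helper, not something from the notes.

```python
def percent_change(old, new):
    """Percent change from old to new, measured against the old value."""
    return (new - old) / old * 100

start = 100.0
after_increase = start * (1 + 0.50)            # up 50%: 100 -> 150.0
after_decrease = after_increase * (1 - 0.50)   # down 50% of 150: -> 75.0

print(after_increase, after_decrease)          # 150.0 75.0

# The same 50-unit move is a different percent depending on direction.
print(percent_change(100, 150))   # 50.0
print(percent_change(150, 100))   # about -33.3
```

Going up 50% and then down 50% does not return to the start, because the second percent is taken of a bigger number.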
quantitative
how many or how much
qualitative
which are you? (a category, not an amount)
continuous
no gaps (there's always a number between two numbers)
discrete
breaks (it's either here or here)
nominal
no true order
ordinal
order but no spacing
interval
order and spacing, but no true zero. "twice as much" on the scale is not really twice as much
ratio
order, spacing, and true zero
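A small illustration of why "twice as much" breaks on an interval scale, using temperature as an assumed example (Celsius has no true zero, Kelvin does). The helper name `celsius_to_kelvin` is just for this sketch.

```python
def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15

# On the Celsius scale, 20 looks like "twice" 10...
ratio_celsius = 20 / 10

# ...but measured from a true zero, it is only about 3.5% more.
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)

print(ratio_celsius)            # 2.0
print(round(ratio_kelvin, 3))   # 1.035
```

Differences still make sense on both scales (20 is 10 degrees warmer than 10 either way); only the ratios change.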
observational study
just looking, no treatment. confounding is always possible. you do this if an experiment is impossible or unethical
experiment
apply a treatment. may have a control group. can show cause and effect
placebo
fake treatment, used to show the outcome is due to the real treatment
blinding
do not tell the subject which treatment they got. controls for the placebo effect
double blinding
neither the subject nor the data recorder knows which treatment was given
totally random experiment
treatments assigned purely at random. no bias, but no control
matched pairs
pair like individuals and split the treatments within each pair. control and no bias
block design
split the sample into groups of similar individuals and do a smaller, random experiment within each group
retrospective
into the past
prospective
into the future
cross-sectional
snapshot, one moment in time
random sample
every individual had the same chance. fair
simple random sample
every possible group of the same size has the same chance to be chosen. like pulling names out of a hat. the selection itself adds no bias
convenience sample
easy to get. the people tend to be alike, so a lot of answers overlap
systematic sample
every nth in line.
stratified sample
break into groups and choose some from all groups
cluster sample
break into groups and choose all from some groups
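The sampling cards above can be sketched side by side. This assumes a made-up population of 20 numbered individuals split into four groups of five; the group sizes, seed, and sample sizes are illustrative choices only.

```python
import random

rng = random.Random(1)  # seeded so the results repeat

population = list(range(1, 21))  # 20 hypothetical individuals, numbered 1..20
groups = [population[i:i + 5] for i in range(0, 20, 5)]  # four groups of five

# Simple random sample: names out of a hat.
simple_random = rng.sample(population, 4)

# Systematic sample: random starting point, then every 5th in line.
systematic = population[rng.randrange(5)::5]

# Stratified sample: choose some (here one) from ALL groups.
stratified = [x for g in groups for x in rng.sample(g, 1)]

# Cluster sample: choose ALL from some (here one) of the groups.
cluster = [x for g in rng.sample(groups, 1) for x in g]

print(simple_random, systematic, stratified, cluster, sep="\n")
```

The last two lines are the easiest to mix up: stratified touches every group but takes only part of each, while cluster skips most groups and takes everyone in the ones it picks.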
sampling error
due to bad luck (the sample differs from the population by chance)
non-sampling error
due to bad setup (poor design, collection, or recording)
mean
average of all the data. x̄ = sample mean. μ = population mean
median
50% of the data is below, 50% of the data is above
mode
most common value
midrange
midpoint of the highest and lowest. (add and divide by 2)
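The four measures of center can be checked on a tiny made-up data set using Python's standard `statistics` module (the data values are illustrative only).

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up data

mean = statistics.mean(data)              # 30 / 6 = 5
median = statistics.median(data)          # middle two are 3 and 5 -> 4.0
mode = statistics.mode(data)              # 3 appears most often
midrange = (min(data) + max(data)) / 2    # (2 + 10) / 2 = 6.0

print(mean, median, mode, midrange)
```

All four are "centers", but they give four different answers here (5, 4.0, 3, 6.0), so it matters which one is being reported.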
symmetric
mean and median are both in the center (about equal)
left skewed data
has low outliers. mostly high data
right skewed data
high outliers. mostly low data
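A small made-up illustration of skew. One standard consequence, not stated on the cards, is that outliers pull the mean toward them while the median barely moves, so comparing the two hints at the direction of the skew.

```python
import statistics

# Right skewed: mostly low data with one high outlier.
right_skewed = [1, 2, 2, 3, 3, 4, 40]
# Left skewed: mostly high data with one low outlier.
left_skewed = [1, 37, 38, 38, 39, 39, 40]

right_mean = statistics.mean(right_skewed)      # 55 / 7, about 7.86
right_median = statistics.median(right_skewed)  # 3
left_mean = statistics.mean(left_skewed)        # 232 / 7, about 33.14
left_median = statistics.median(left_skewed)    # 38

print(right_mean > right_median)  # True: mean pulled up by the high outlier
print(left_mean < left_median)    # True: mean pulled down by the low outlier
```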
range
maximum - minimum
standard deviation
roughly the average distance from the mean. s = sample std. σ = population std
variance
measure of spread equal to the std squared. s^2 for a sample, σ^2 for a population
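A worked example of range, variance, and standard deviation on made-up data, using the standard `statistics` module. The sample versions (s, s^2) divide by n - 1, the population versions divide by n; that detail is standard but is not spelled out on the cards.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean is 5

data_range = max(data) - min(data)          # 9 - 2 = 7

# Squared distances from the mean add up to 32.
pop_variance = statistics.pvariance(data)   # 32 / 8 = 4
pop_std = statistics.pstdev(data)           # sqrt(4) = 2.0

sample_variance = statistics.variance(data) # 32 / 7, about 4.57
sample_std = statistics.stdev(data)         # about 2.14

print(data_range, pop_variance, pop_std)
print(round(sample_variance, 2), round(sample_std, 2))
```

In both cases the standard deviation is the square root of the variance, which puts it back in the same units as the data.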
empirical rule
if data is normal, then 68% of all data is within 1 std of the mean, 95% within 2 stds, and 99.7% within 3 stds.
unusual
more than 2 stds from the mean
z score
number of stds from the mean. (data - mean)/std
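A short sketch tying together the z score, the "unusual" rule, and the empirical rule. The mean of 100 and std of 15 are made-up numbers; `z_score`, `is_unusual`, and `share_within` are hypothetical helper names. `NormalDist` comes from Python's standard `statistics` module (3.8+).

```python
from statistics import NormalDist

def z_score(x, mean, std):
    """Number of stds that x sits from the mean (note the parentheses)."""
    return (x - mean) / std

def is_unusual(x, mean, std):
    """Unusual means more than 2 stds from the mean, in either direction."""
    return abs(z_score(x, mean, std)) > 2

print(z_score(85, 100, 15))      # -1.0
print(is_unusual(85, 100, 15))   # False
print(is_unusual(135, 100, 15))  # True, z is about 2.33

def share_within(k):
    """Share of a normal distribution within k stds of the mean."""
    standard_normal = NormalDist()  # mean 0, std 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Empirical rule check: roughly 0.68, 0.95, 0.997.
print([round(share_within(k), 3) for k in (1, 2, 3)])
```

The "outside 2 std" cutoff for unusual lines up with the empirical rule: only about 5% of normal data falls that far out.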
percentile
percent of data below a point. P = B/T (times 100 to get a percent)
B = number of data values below, T = total number of data values
when solving for B you can't round down: if B is not a whole number, round up. when B is a whole number, average that data point and the next highest
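The percentile card works in both directions, and a sketch may help. This follows the rule as written on the card; other textbooks and software use slightly different conventions, so treat it as one version rather than the only one. The function names and data are made up for illustration.

```python
import math

def percentile_of(value, data):
    """P = B / T as a percent, where B counts the data below value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

def value_at_percentile(p, data):
    """Solve P = B / T for B, then read off the data point (0 < p < 100)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)
    b = p * len(ordered) / 100
    if b == int(b):
        # B is a whole number: average that data point and the next highest.
        i = int(b)
        return (ordered[i - 1] + ordered[i]) / 2
    # B is not whole: round up (never down) and take that data point.
    return ordered[math.ceil(b) - 1]

data = [10, 20, 30, 40, 50, 60, 70, 80]  # made-up data, T = 8

print(percentile_of(50, data))        # 4 of 8 below -> 50.0
print(value_at_percentile(25, data))  # B = 2 -> average of 20 and 30 = 25.0
print(value_at_percentile(40, data))  # B = 3.2 -> round up to 4th value = 40
```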
5 number summary
min, Q1, med, Q3, max
outlier rule
high = Q3 + 1.5(Q3 - Q1)
low = Q1 - 1.5(Q3 - Q1)
