Statistics Midterm 1 Vocabulary
Terms in this set (55)
dataset
the complete set of raw data
observational unit
a single individual entity
observation
the data recorded for an individual entity (sometimes used as a synonym for observational unit)
sample size (n)
total number of observational units
census
when data is collected from all members of a population
sample data
when measurements are taken from a subset of a population
population data
collected when all individuals in a population are measured
statistic
a summary measure computed from sample data
parameter
a summary measure computed from population data
descriptive statistics
summary numbers for either a sample or a population
variable
a characteristic that can differ from one individual to the next
categorical variable
the raw data consist of group or category names
ordinal variable
a categorical variable whose categories have a natural order
quantitative variable
a variable whose raw data are numerical measurements or counts collected from an individual
explanatory/response variable
the value of the explanatory variable might partially explain the value of the response variable for an individual (does not imply cause)
frequency distribution
for a categorical variable it is a listing of all categories along with their frequencies (counts)
relative frequency distribution
listing of all categories along with their relative frequencies (given as proportions or percentages)
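As a rough illustration of the two definitions above, the sketch below tallies counts and proportions for a categorical variable. The eye-color data and the function names are made up for the example.

```python
from collections import Counter

def frequency_distribution(values):
    """Return {category: count} for a categorical variable."""
    return dict(Counter(values))

def relative_frequency_distribution(values):
    """Return {category: proportion}; the proportions sum to 1."""
    counts = Counter(values)
    n = len(values)
    return {category: count / n for category, count in counts.items()}

# Hypothetical raw data: eye color recorded for 8 individuals.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]

freq = frequency_distribution(eye_colors)               # counts per category
rel_freq = relative_frequency_distribution(eye_colors)  # proportions per category
```

Multiplying each proportion by 100 gives the same distribution as percentages.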
pie charts
summarizes a single categorical variable if there are not too many categories
bar graphs
summarize one or two categorical variables; useful for making comparisons when there are two categorical variables
histograms, stem-and-leaf plots, and dot plots
used to display the distribution of a quantitative variable
unimodal
when there is a single prominent peak in a histogram, stem-and-leaf plot, or dot plot
bimodal
when there are two prominent peaks in a distribution
lower quartile
median of lower half of the ordered data values
upper quartile
median of upper half of the ordered data values
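The quartile definitions above can be sketched as follows. One assumption to note: when the number of values is odd, this version leaves the overall median out of both halves. Textbooks and software differ on that convention, so results may not match every source. The scores are made up.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Assumes at least two values. For an odd count, the middle value
    is excluded from both halves (one common convention among several).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    upper_half = data[half:] if n % 2 == 0 else data[half + 1:]
    return median(lower_half), median(data), median(upper_half)

# Hypothetical exam scores for 9 students.
scores = [55, 60, 62, 70, 71, 75, 80, 88, 93]
q1, med, q3 = quartiles(scores)
```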
z-score
the distance between the observed value and the mean (in terms of standard deviations)
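A minimal sketch of the z-score calculation, z = (observed value - mean) / standard deviation. It assumes the sample standard deviation (dividing by n - 1); the heights are made-up data.

```python
from statistics import mean, stdev

def z_score(value, values):
    """Number of standard deviations that `value` lies from the mean of `values`.

    Uses the sample standard deviation (n - 1 in the denominator).
    """
    return (value - mean(values)) / stdev(values)

# Hypothetical heights in inches.
heights = [60, 62, 64, 66, 68]

z_tall = z_score(68, heights)  # positive: above the mean
z_mean = z_score(64, heights)  # zero: exactly at the mean
```

A negative z-score means the value is below the mean.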
empirical rule
for bell-shaped data, about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3
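The percentages in the empirical rule can be checked against the standard normal curve. This is only a sketch for an idealized bell-shaped distribution; real data will match the percentages only approximately.

```python
from statistics import NormalDist

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0, sigma=1)
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Proportions within 1, 2, and 3 standard deviations of the mean.
within = {k: proportion_within(k) for k in (1, 2, 3)}
```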
scatterplot
two-dimensional graph of data values, with one quantitative variable on each axis
correlation
statistic that measures the strength and direction of a linear relationship between two quantitative variables
regression equation
an equation that describes the average relationship between a response variable and an explanatory variable
least squares regression line
the line that minimizes the sum of the squared vertical distances between the data points and the line
residuals (e)
difference between the observed y and the predicted y (e = y - ŷ); for a least squares line, the residuals sum to zero
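The least squares line and its residuals can be sketched with the usual formulas: slope = Sxy / Sxx and intercept = mean of y minus slope times mean of x. The data and function names are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def residuals(xs, ys, intercept, slope):
    """Return observed y minus predicted y for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data: x is the explanatory variable, y is the response.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = least_squares_line(xs, ys)
res = residuals(xs, ys, intercept, slope)
residual_total = sum(res)  # zero, up to floating-point rounding
```

Plotting each x against its residual gives a residual plot.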
residual plot
a scatterplot of (x, e) rather than (x, y)
coefficient of determination (R^2)
the proportion of variation in the response variable explained by the explanatory variable; measures predictive power
correlation coefficient (r)
magnitude measures the strength of a relationship and sign shows direction
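A short sketch of how r and R^2 relate. It assumes simple linear regression with one explanatory variable, where R^2 equals r squared. The data are made up.

```python
from math import sqrt

def correlation(xs, ys):
    """Return the correlation coefficient r for two quantitative variables."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: x is the explanatory variable, y is the response.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)  # sign gives direction, magnitude gives strength
r_squared = r ** 2       # proportion of variation in y explained by x
```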
odds
compares the chance that the event will happen to the chance that it will not
relative risk
(risk in category 1)/(risk in category 2)
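Odds and relative risk can be sketched with simple arithmetic. The risks below are made-up numbers, and the function names are only for illustration.

```python
def odds(p):
    """Chance the event happens divided by the chance it does not."""
    return p / (1 - p)

def relative_risk(risk_1, risk_2):
    """Risk in category 1 divided by risk in category 2."""
    return risk_1 / risk_2

# Hypothetical risks: 20% in category 1, 10% in category 2.
risk_exposed = 0.20
risk_unexposed = 0.10

odds_exposed = odds(risk_exposed)               # 0.25, i.e. 1 to 4
rr = relative_risk(risk_exposed, risk_unexposed)  # 2: risk is twice as high
```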
simple random sample (SRS)
every possible sample of a given size has the same chance of being selected, so each member of the population is equally likely to be in the sample
voluntary sampling
people volunteer to be part of the survey
convenience sampling
selects people that are easiest to reach
stratified random sampling
divides the population into strata, takes a simple random sample from each stratum, then combines them into one sample
cluster sampling
population is divided into clusters, then whole clusters are selected randomly to be part of the sample
systematic sampling
divide the list into consecutive segments, choose a random starting point in the first segment, then select the individual at that same position in each segment
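The four random sampling methods above can be compared side by side. This is a sketch under made-up assumptions: a population of 20 ID numbers, two strata, and four clusters. The function names are hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Choose n members so that every sample of size n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Take a simple random sample from each stratum, then combine them."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def systematic_sample(ordered_list, n, rng):
    """Split the list into n segments and take the same position in each."""
    segment = len(ordered_list) // n
    start = rng.randrange(segment)  # random start within the first segment
    return [ordered_list[start + i * segment] for i in range(n)]

rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical population: 20 students identified by number.
population = list(range(1, 21))
strata = {"first_year": list(range(1, 11)), "second_year": list(range(11, 21))}
clusters = {
    "dorm_a": list(range(1, 6)),
    "dorm_b": list(range(6, 11)),
    "dorm_c": list(range(11, 16)),
    "dorm_d": list(range(16, 21)),
}

srs = simple_random_sample(population, 5, rng)
strat = stratified_sample(strata, 2, rng)
clus = cluster_sample(clusters, 2, rng)
syst = systematic_sample(population, 4, rng)
```

Stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.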
undercoverage bias
the design of the sample fails to represent the target population
non-response bias
a selected individual cannot be contacted or refuses to respond
response bias
interviewee's response is influenced by the interviewer
retrospective study
looks backward in time (type of observational study)
prospective study
looks forward in time, follows a cohort over time (type of observational study)
experimental studies
researcher observes and measures variables of interest and DOES impose treatments
observational studies
researcher observes and measures variables of interest but DOES NOT impose a treatment
confounding variable
a variable that both affects the response variable and is related to the explanatory variable
lurking variable
potential confounding variable that is not measured and is not considered in the interpretation of a study
blinding
when either the participant or the researcher does not know which treatment was given; double-blind if both are unaware
statistically significant
when the difference is so large that it is unlikely to be due to chance alone
block design
when similar individuals are grouped into blocks and treatments are randomly assigned within each block, reducing variability so the results are seen more clearly
matched-pair
individuals are matched into pairs and each member of a pair receives a different treatment; the best match is yourself (the same individual receives both treatments)