biostats midterm 2
Terms in this set (50)
proportion
fraction of individuals having a particular attribute
corollary of probability
the probability that a randomly selected individual will have that attribute is the same as the fraction of the population having said attribute
binomial distribution
describes the probability of a given number of "successes" from a fixed number of independent trials with constant probability of success in each trial
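This formula can be sketched directly with Python's standard library; the coin-flip numbers below are purely illustrative:

```python
from math import comb

def binom_pmf(k, n, p):
    # probability of exactly k successes in n independent trials,
    # each with the same success probability p
    return comb(n, k) * p**k * (1 - p)**(n - k)

# e.g. probability of exactly 3 heads in 10 fair coin flips
print(binom_pmf(3, 10, 0.5))  # 0.1171875
```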
law of large numbers
the improvement in precision as sample size increases; larger samples yield more precise estimates
binomial test
uses data to test whether a population proportion p matches a null expectation for the proportion; it is the most powerful test of a single population proportion
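One way to sketch an exact two-sided binomial test in plain Python (the counts here are hypothetical, not from the source): sum the probabilities of every outcome at least as improbable under the null as the observed count.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_test_two_sided(k_obs, n, p0):
    # exact two-sided p-value: total probability of all outcomes whose
    # probability under H0 is no greater than that of the observed count
    p_obs = binom_pmf(k_obs, n, p0)
    return sum(binom_pmf(k, n, p0) for k in range(n + 1)
               if binom_pmf(k, n, p0) <= p_obs + 1e-12)

# hypothetical: 14 successes in 18 trials under H0: p = 0.5
print(round(binom_test_two_sided(14, 18, 0.5), 4))  # 0.0309
```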
null hypothesis
a specific statement about a population parameter made for the purpose of argument; a good null hypothesis is a statement that would be interesting to reject
test statistic
number calculated from the data that is used to evaluate how compatible the data are with the result expected under the null hypothesis
alpha (α)
the significance level of a test: the probability of rejecting the null hypothesis when it is true; conventionally set at 0.05
null distribution
the sampling distribution of outcomes for a test statistic under the assumption that the null hypothesis is true
p-value
the probability of obtaining the data (or data showing as great or greater difference from the null hypothesis) if the null hypothesis were true
discrete distribution
a probability distribution describing a discrete numerical random variable
χ² goodness-of-fit test
compares observed frequency data to the frequencies expected under a probability model stated by the null hypothesis; the test statistic is χ² (chi-squared)
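The statistic itself is a one-line sum; a minimal sketch with hypothetical coin-flip counts:

```python
def chi2_stat(observed, expected):
    # chi-squared = sum of (O - E)^2 / E over all categories
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# hypothetical: 60 heads and 40 tails vs. an expected 50/50 split
print(chi2_stat([60, 40], [50, 50]))  # 4.0
```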
degrees of freedom
the number of degrees of freedom of a test specifies which of a family of distributions to use
assumptions of the χ² test
-no more than 20% of categories have expected frequency < 5
-no category with expected frequency < 1
-each datum is random and independent
critical value
the value of the test statistic at which P = α
contingency analysis
tests the independence of two or more categorical variables, or equivalently, tests for an association between two or more categorical variables
assumptions for contingency analysis
-no expected frequency less than 1
-no more than 20% of expected frequencies less than 5
Fisher's exact test
-for 2 x 2 contingency analysis
-does not make assumptions about the size of expected frequencies
-most powerful test for a 2 x 2 contingency table
-provides an exact P-value
normal distribution
-continuous probability distribution describing a bell-shaped curve
-good approximation to the frequency distributions of many biological variables
-symmetric around its mean
-mean, median, and mode are all the same
standard normal distribution
a normal distribution with a mean (μ) of zero and a standard deviation (σ) of one
standard normal table
gives the probability of getting a random draw from a standard normal distribution greater than a given value
Z
tells us how many standard deviations Y is from the mean
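Both the z-score and the lookup a standard normal table performs can be sketched with the standard library, using the error function for the upper-tail probability; the example numbers are invented:

```python
from math import erf, sqrt

def z_score(y, mu, sigma):
    # how many standard deviations y lies from the mean mu
    return (y - mu) / sigma

def upper_tail(z):
    # P(Z > z) for a standard normal variable, via the error function
    return 0.5 * (1 - erf(z / sqrt(2)))

# hypothetical: Y = 180 from a population with mu = 170, sigma = 5
z = z_score(180, 170, 5)
print(z, round(upper_tail(z), 4))  # 2.0 0.0228
```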
standard error of an estimate of a mean
the standard deviation of the distribution of sample means
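In practice this is estimated as s/√n; a minimal sketch with a made-up sample:

```python
from math import sqrt
from statistics import stdev

# hypothetical sample of six measurements
sample = [4.1, 5.2, 6.0, 5.5, 4.8, 5.9]
se = stdev(sample) / sqrt(len(sample))  # SE of the mean = s / sqrt(n)
print(round(se, 4))
```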
central limit theorem
the sum or mean of a large number of measurements randomly sampled from a non-normal population is approximately normally distributed
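The theorem is easy to see by simulation: sample means of a decidedly non-normal uniform distribution still pile up around the population mean. A sketch (seed and sizes are arbitrary choices):

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed for reproducibility
# 10,000 sample means, each from n = 30 draws of a (non-normal) uniform(0, 1)
means = [mean(random.random() for _ in range(30)) for _ in range(10_000)]
# CLT: the means cluster near 0.5 with SD about sqrt(1/12)/sqrt(30), i.e. ~0.0527
print(round(mean(means), 3), round(stdev(means), 3))
```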
confounding variable
an unmeasured variable that changes in tandem with one or more of the measured variables; this gives the false appearance of a causal relationship between the measured variables
observational versus experimental studies
cause and effect can ONLY be established using well-designed experiments; randomly assigning subjects to treatments ensures that any association that exists between a confounding variable and an explanatory variable is BROKEN
one-sample t-test assumptions
-the variable is normally distributed in the population (the sample need not be strictly normal)
-the sample is a random sample
paired designs
-data from the two groups are paired
-each member of the pair shares much in common with the other, except for the tested categorical variable
-there is a one-to-one correspondence between the individuals in the two groups
paired design assumptions
-pairs are chosen at random
-the differences have a normal distribution
-it does not assume that the individual values are normally distributed, only the differences
two-sample t-test
compares the means of a numerical variable between two populations
assumptions of two-sample t-test
-both samples are random samples
-both populations have normal distributions
-the variance of both populations is equal
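Under those assumptions the statistic uses a pooled variance estimate; a sketch with hypothetical group data:

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    # two-sample t statistic using a pooled estimate of the common variance
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))

# hypothetical measurements from two groups
a = [5.1, 4.9, 5.6, 5.3, 5.0]
b = [4.2, 4.5, 4.1, 4.8, 4.4]
print(round(pooled_t(a, b), 3))  # 4.474
```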
Welch's t-test
compares the means of two groups and can be used even when the variances of the two groups are not equal
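The difference from the pooled version is visible in the denominator: each group keeps its own variance term. A sketch with hypothetical unequal-size, unequal-variance groups:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    # Welch's t statistic: each group contributes its own variance term,
    # so equal variances are not assumed
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x) + variance(y) / len(y))

# hypothetical groups with unequal sizes and variances
x = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0]
y = [8.0, 9.5, 7.5]
print(round(welch_t(x, y), 3))  # 4.292
```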
Levene's test
-compares the variances of two groups
-preferred over the F-test when the variable is not normally distributed in the populations
-assumes only that the frequency distribution of the variable is roughly symmetrical in each population, so it performs better than the F-test when distributions depart from normality
Shapiro-Wilk test
evaluates the goodness of fit of a normal distribution to a set of data randomly sampled from a population
what to do when assumptions are not met
-if sample size is LARGE (e.g. N > 50), parametric tests are sometimes still OK
-transform data
-non-parametric tests
-randomization/resampling methods
data transformation
changes each data point by some simple mathematical formula
log transformation is often useful when
-the variable is likely to be the result of multiplication of various components
-the frequency distribution of the data is skewed to the right
-the variance seems to increase as the mean gets larger (in comparisons across groups)
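The transformation itself is trivial to apply; a sketch with invented right-skewed values:

```python
from math import log

# hypothetical right-skewed measurements (e.g. body masses in grams)
raw = [1.2, 2.5, 3.1, 8.4, 22.0, 95.0]
logged = [log(x) for x in raw]  # natural-log transform pulls in the long right tail
print([round(v, 2) for v in logged])
```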
non-parametric methods
-assume less about the underlying distributions
-also called "distribution-free"
- "parametric" methods assume a distribution or a parameter
permutation test (non-parametric)
generates a null distribution for the association between two variables by repeatedly rearranging the values of one of the two variables in the data
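The reshuffling idea can be sketched in a few lines; the two groups below are hypothetical, and the number of permutations and seed are arbitrary:

```python
import random
from statistics import mean

def perm_test(x, y, reps=10_000, seed=0):
    # two-sided permutation test for a difference in group means:
    # repeatedly reshuffle the group labels and count how often the
    # reshuffled difference is at least as large as the observed one
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(x)]) - mean(pooled[len(x):])) >= observed:
            hits += 1
    return hits / reps  # approximate p-value from the null distribution

# hypothetical treatment and control measurements
treat = [12.1, 13.4, 11.9, 14.2, 13.0]
ctrl = [10.2, 11.1, 10.8, 9.9, 10.5]
print(perm_test(treat, ctrl))
```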
experiment
-manipulates the factor levels to create treatments
-randomly assigns subjects to these treatment levels
-compares the responses of the subject groups across treatment levels
goal of experimental design
-eliminate/reduce bias
-reduce sampling error
-ultimately, identify causal relationships between treatments and responses
factor
an explanatory variable under control of the experimenter
levels of the factor
the specific values (e.g. magnitudes) that the experimenter chooses for a factor
response
what the experimenter will measure on each subject or experimental unit
treatment
the combination of specific levels from all the factors that an experimental unit receives
experimental artifact
a bias in a measurement produced by unintended consequences of experimental procedures
placebo effect
improvement in a subject's condition caused by the expectation of receiving an effective treatment rather than by the treatment itself
replication
the application of every treatment to multiple, independent experimental units
blocking
grouping of experimental units that have similar properties; within each block, treatments are randomly assigned to experimental units
area under the curve
the total area under any probability density curve is 1