Biostatistics
Terms in this set (65)
descriptive statistics
statistical procedures used to describe characteristics and responses of groups of subjects
inferential statistics
procedures used to draw conclusions about larger populations from small samples of data
parametric statistics
A branch of statistics that assumes the sample data come from a population following a probability distribution based on a fixed set of parameters. More precise than non-parametric methods when those assumptions hold.
Non-parametric statistics
normal distribution not assumed
Central Limit Theorem
The theorem that, as sample size increases, the distribution of the means of randomly selected samples of size n approaches a normal distribution, regardless of the shape of the underlying population.
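A quick way to see the theorem in action, sketched with only Python's standard library (the helper name sample_mean is illustrative): means of samples drawn from a clearly non-normal uniform population cluster ever more tightly around the population mean as n grows.

```python
import random
import statistics

random.seed(42)  # make the simulation reproducible

# Population: uniform on [0, 10], so the population mean is 5
def sample_mean(n):
    """Mean of one random sample of size n."""
    return statistics.mean(random.uniform(0, 10) for _ in range(n))

# Spread of 1000 sample means shrinks roughly as sigma / sqrt(n)
for n in (2, 10, 50):
    means = [sample_mean(n) for _ in range(1000)]
    print(n, round(statistics.mean(means), 2), round(statistics.stdev(means), 2))
```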
normal distribution curve
the bell-shaped curve that results from plotting continuous variation data on a graph.
Determinants of shape on normal distribution
The mean determines the curve's location (center); the standard deviation determines its width (spread) and therefore its height.
Small SD
observations are clustered tightly around a central value, less variation
Large SD
observations are scattered widely from the mean, more variation
area under the normal curve
1; the curve is a probability distribution, so the total probability (area) sums to 1
68-95-99.7 rule
in a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean
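The rule can be checked directly with the standard library's statistics.NormalDist:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-k < Z < k): probability of landing within k standard deviations
for k in (1, 2, 3):
    print(k, round(std_normal.cdf(k) - std_normal.cdf(-k), 4))
# prints 0.6827, 0.9545, 0.9973 for k = 1, 2, 3
```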
How do you know if data is normally distributed?
Rule of thumb (quick but not very reliable): check whether roughly 95% of values fall within the mean plus or minus twice the standard deviation
Formal tests: D'Agostino & Pearson omnibus normality test, Shapiro-Wilk normality test
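The mean-plus-or-minus-2-SD rule of thumb is easy to sketch in standard-library Python (the formal tests above live in scipy.stats as shapiro and normaltest, not in the standard library; frac_within_2sd is an illustrative helper name): roughly 95% of normally distributed values should fall inside that band.

```python
import random
import statistics

random.seed(1)  # reproducible simulated data

def frac_within_2sd(data):
    """Fraction of observations within mean +/- 2 standard deviations."""
    m = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(m - 2 * s <= x <= m + 2 * s for x in data) / len(data)

normal_data = [random.gauss(100, 15) for _ in range(2000)]
print(round(frac_within_2sd(normal_data), 2))  # close to 0.95 for normal data
```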
Skew
measure of the asymmetry of a probability distribution
Kurtosis
Measure of the heaviness of the tails of a probability distribution relative to that of a normal distribution. Indicates the likelihood of extreme outcomes (tailedness).
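Both quantities are standardized moments and can be computed by hand; a standard-library sketch (the function names are illustrative):

```python
import statistics

def skewness(data):
    """Third standardized moment: 0 for symmetric data, > 0 for a right tail."""
    m, s, n = statistics.mean(data), statistics.pstdev(data), len(data)
    return sum((x - m) ** 3 for x in data) / (n * s ** 3)

def excess_kurtosis(data):
    """Fourth standardized moment minus 3, so a normal distribution scores 0."""
    m, s, n = statistics.mean(data), statistics.pstdev(data), len(data)
    return sum((x - m) ** 4 for x in data) / (n * s ** 4) - 3

print(skewness([1, 2, 3, 4, 5]))      # 0.0: symmetric data
print(skewness([1, 1, 2, 2, 3, 10]))  # positive: long right tail
```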
bimodal distribution
a distribution with two modes
Histogram
A graph of vertical bars representing the frequency distribution of a set of data.
box plot
A graph that displays the highest and lowest quarters of data as whiskers, the middle two quarters of the data as a box, and the median
P-P plot
Compares the cumulative probabilities of our empirical data with those of an ideal "test" distribution
Q-Q plot
Compares the quantiles of our empirical data with those of the ideal (theoretical) distribution
measures of location (central tendency)
mean, median, mode
Measure of dispersion
range, variance, standard deviation, standard error, confidence interval
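Every measure in these two lists except the standard error and confidence interval is a single call (or one-liner) in Python's standard statistics module:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(data))      # 5.0
print(statistics.median(data))    # 4.5
print(statistics.mode(data))      # 4
print(max(data) - min(data))      # range: 7
print(statistics.variance(data))  # sample variance (divides by n - 1)
print(statistics.stdev(data))     # sample standard deviation
print(statistics.pstdev(data))    # population standard deviation: 2.0
```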
Mean
the arithmetic average of a distribution, obtained by adding the scores and dividing by the number of scores; most informative when the data are approximately normally distributed
Median
the middle score in a distribution; half the scores are above it and half are below it
Mode
The value that occurs most frequently in a given data set.
range
the difference between the highest and lowest scores in a distribution
Variance
The average of the squared differences from the mean.
standard deviation
a computed measure of how much scores vary around the mean score
standard error
An estimate of how far the sample mean is likely to be from the population mean; SE = SD/sqrt(n)
When should you use Standard deviation vs. standard error
Use SD to describe the variability of the data themselves; use SE to describe the precision of the estimated mean (SE shrinks as n grows, SD does not)
confidence interval
An estimated range that is likely to contain the true population mean; the interval is constructed so that the probability it captures the population mean is high (e.g. 95%)
Confidence interval with known SD
mean +/- z(SD/sqrt(n)), e.g. z = 1.96 for 95%
confidence interval with unknown SD
mean +/- t(s/sqrt(n)); t value instead of z value, with n - 1 degrees of freedom
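A 95% CI with a known (or large-sample) SD uses the z value 1.96, which the standard library can produce; the t value for the unknown-SD case needs a t table or scipy.stats. A sketch on made-up measurements:

```python
import math
import statistics
from statistics import NormalDist

data = [98, 102, 101, 97, 100, 103, 99, 100]  # illustrative measurements
mean = statistics.mean(data)                  # 100.0
sem = statistics.stdev(data) / math.sqrt(len(data))  # standard error of the mean

z = NormalDist().inv_cdf(0.975)               # 1.96 for a two-sided 95% CI
ci = (mean - z * sem, mean + z * sem)
print(tuple(round(v, 2) for v in ci))
```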
SD/SEM/95% CI error bars
Precision
a measure of how close a series of measurements are to one another
Accuracy
A description of how close a measurement is to the true value of the quantity measured.
null hypothesis
the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
alternative hypothesis
The hypothesis that there is a difference between two or more sets of data; the effect the experiment is designed to observe
Type I error (alpha)
Incorrectly rejecting a true null hypothesis (false positive): concluding there is a difference in the population when there actually is not. Considered the more serious error.
Reducing the significance level reduces the chance of making this error, but as the chance of a Type I error decreases, the chance of a Type II error increases.
Type II error (beta)
Incorrectly retaining a false null hypothesis (false negative)
type I vs. type II error
Type I: false positive, rejecting a true null hypothesis (probability alpha); Type II: false negative, failing to reject a false null hypothesis (probability beta)
random error
Caused by inherently unpredictable fluctuations in the readings of a measurement apparatus or in the experimenter's interpretation of the instrumental reading, can be in either direction
Systematic error
Is predictable, and typically constant or proportional to the true value. Systematic errors are caused by imperfect calibration of measurement instruments or imperfect methods of observation. Typically occurs in one direction
Power calculation
The required sample size n increases with greater variability in the data, with a smaller difference to be detected, and with higher desired power (1 - beta)
Student t test
Comparison of two means for small samples (n < 30); assumes independent data points (unless using the paired t-test), approximately normally distributed data (versions exist for equal and unequal variances), and random sampling; works best with similar sample sizes; degrees of freedom depend on the sample sizes
Student t-test equation
t = (mean_A - mean_B) / sqrt(s_p^2 (1/n_A + 1/n_B)) for the unpaired equal-variance case, where s_p^2 is the pooled variance
One tailed vs two tailed t-test
One-tailed: tests for a difference in one specified direction; two-tailed: tests for a difference in either direction (more conservative, and the usual default)
Paired vs. unpaired t-test
Paired: the same subjects measured twice (e.g. before and after), analyzed through the per-subject differences; unpaired: two independent groups
Degrees of freedom for t-test
Unpaired: df = (N_A - 1) + (N_B - 1) = N_A + N_B - 2 (= 2N - 2 when both groups have size N); paired: df = N - 1, where N is the number of pairs
p-value
The probability, assuming the null hypothesis is true, of obtaining results at least as extreme as those observed; when p < 0.05 (the conventional threshold) the result is deemed statistically significant and the null hypothesis is rejected
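The unpaired equal-variance t statistic and its degrees of freedom can be computed by hand; converting t to a p-value needs a t table or scipy.stats.ttest_ind, so this standard-library sketch (pooled_t is an illustrative name) stops at t and df:

```python
import math
import statistics

def pooled_t(a, b):
    """Unpaired equal-variance t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((na - 1) * statistics.variance(a) +
           (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1/na + 1/nb))
    return t, na + nb - 2

t, df = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, df)  # -2.0 with 8 degrees of freedom
```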
ANOVA
Compares mean values of a continuous variable across multiple categories/groups (yields an F value)
degree of freedom (df) for ANOVA
Between groups: k - 1 (k = number of groups); within groups: N - k (N = total number of observations); total: N - 1
post hoc tests
additional hypothesis tests that are done after an ANOVA to determine exactly which mean differences are significant and which are not
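The one-way ANOVA F value is the between-group mean square divided by the within-group mean square; a hand-rolled standard-library sketch (scipy.stats.f_oneway computes the same statistic plus the p-value; one_way_f is an illustrative name):

```python
import statistics

def one_way_f(*groups):
    """One-way ANOVA F statistic with its two degrees of freedom."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total observations
    grand = statistics.mean(x for g in groups for x in g)
    ssb = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k)), k - 1, n - k

f, df_between, df_within = one_way_f([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(f, df_between, df_within)  # 3.0 with df 2 and 6
```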
Non-parametric tests
Pros: make fewer assumptions about the distribution of the data, Cons: less powerful, difficult to detect small differences
Parametric vs. Non-parametric tests
Parametric tests assume a specific distribution (usually normal): t-test, ANOVA, Pearson correlation. Their rank-based non-parametric counterparts: Mann-Whitney U, Kruskal-Wallis, Spearman correlation
Mann-Whitney U test
Determines whether two uncorrelated means differ significantly when data are nonparametric, uses ranks of the measurements
Mann-Whitney U test equation
U_A = n_A n_B + n_A(n_A + 1)/2 - R_A (and likewise U_B, where R is the rank sum); the two U values sum to n_A n_B
Once you calculate both U values, use the lower of the two to look up the table. NOTE: significance requires U_stat to be less than U_crit (unlike most tests, smaller is more extreme)
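U can be computed from ranks by hand; a standard-library sketch with tie-averaged ranks (scipy.stats.mannwhitneyu adds the p-value; mann_whitney_u is an illustrative name):

```python
def mann_whitney_u(a, b):
    """Smaller of the two Mann-Whitney U statistics, using tie-averaged ranks."""
    pooled = sorted(a + b)
    rank = {}
    i = 0
    while i < len(pooled):                        # assign average rank to ties
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        for idx in range(i, j):
            rank[pooled[idx]] = (i + 1 + j) / 2   # average of ranks i+1 .. j
        i = j
    ra = sum(rank[x] for x in a)                  # rank sum for sample a
    ua = ra - len(a) * (len(a) + 1) / 2
    return min(ua, len(a) * len(b) - ua)          # report the lower U

print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # 0.0: complete separation
```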
Kruskal-Wallis test
The non-parametric equivalent of the one-way ANOVA; its H statistic is compared against the chi-square distribution (with k - 1 degrees of freedom)
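The H statistic follows the same rank-sum logic as Mann-Whitney; a no-ties standard-library sketch (scipy.stats.kruskal handles ties and the chi-square p-value; kruskal_h is an illustrative name):

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (assumes no tied values, for simplicity)."""
    pooled = sorted(x for g in groups for x in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}   # ranks 1..N
    n = len(pooled)
    # Sum over groups of (rank sum)^2 / group size
    s = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * s - 3 * (n + 1)

h = kruskal_h([1, 2], [3, 4], [5, 6])
print(round(h, 4))  # compare against chi-square with k - 1 = 2 df
```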
Pearson correlation coefficient
The most common statistical measure of the strength of a linear relationship between variables: +1 is total positive correlation, 0 is no correlation, -1 is total negative correlation
linear regression
a statistical method used to fit a linear model to a given data set
r^2
The proportion of the variance in y explained by x: 0 means that knowing x does not help in predicting y; 1 means y can be perfectly predicted from x
r^2 equation
r^2 = 1 - SS_residual/SS_total (explained variation divided by total variation)
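Pearson's r (and hence r^2) reduces to sums of products of deviations; a standard-library sketch on made-up data (pearson_r is an illustrative name):

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples x and y."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
print(round(r, 3), round(r ** 2, 3))  # r, and r^2 as variance explained
```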
Grubbs' test for outliers
Detects a single outlier in approximately normally distributed data: G = max|x_i - mean| / s, compared against a critical value