BCPS Biostats
Terms in this set (48)
Stats
Method for:
Collecting, Classifying, Summarizing & Analyzing data
Random Variables
Outcome can't be anticipated before the experiment is conducted
Types of Random Variables
1. Discrete
2. Continuous
Discrete variables
-limited number of values within a given range
-ex: dichotomous, categorical
-summarized with frequencies and proportions rather than means and SDs
Discrete variable (Nominal)
-no indication of severity
-data expressed as a frequency or proportion
ex: sex (M or F), mortality (dead or alive)
Discrete variable (Ordinal)
-No mean or SD
-ranked in a specific order with NO consistent magnitude between ranks
ex: NYHA class, Likert-type scales
Continuous (or counting) variable
-any value in a given range
Types of statistics
Descriptive
Inferential
Descriptive stats
-used to summarize & describe data visually and numerically
-visual methods: frequency distribution, histogram, scatterplot
-numerical methods: measures of central tendency (mean, median & mode)
Descriptive stats (numerical methods) Mean = µ
-average (sum of values / number of values); know the equation
-used for continuous & normally distributed data
-sensitive to outliers
-most commonly used measure of central tendency
Descriptive stats (numerical methods) Median
-Midpoint (1/2 above and 1/2 below) *calculate
-50th percentile
-used for continuous & ordinal data (good for skewed distributions)
-INsensitive to outliers
Descriptive stats (numerical methods) Mode
-most common value that occurs in the distribution *observed
-used for nominal, ordinal or continuous
-can be multiple (ex: bi/tri-modal)
-useless for large range of values
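The three measures of central tendency above can be checked with Python's standard `statistics` module; the sample data here are made up purely for illustration:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical sample

mean = statistics.mean(data)      # sum / number of values = 30/6
median = statistics.median(data)  # midpoint of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)  # 5.0 4.0 3
```

Note how a single outlier (say, replacing 10 with 100) would pull the mean up sharply while leaving the median and mode unchanged.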
Descriptive stats (numerical methods) SD = σ or s
-measure of variability around the Mean
-most common measure to describe the spread of data
-square root of variance, data in original units
-continuous data that are at or near normally distributed
-68% of values within ±1 SD, 95% within ±2 SD, 99.7% within ±3 SD
-Coefficient of variation (CV) relates the SD to the mean:
CV = (SD/mean) × 100%
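A quick sketch of the SD and CV calculations in Python (hypothetical data; `statistics.stdev` gives the sample SD, the square root of the variance):

```python
import statistics

data = [4.0, 6.0, 8.0, 10.0, 12.0]  # hypothetical continuous data

mean = statistics.mean(data)   # 8.0
sd = statistics.stdev(data)    # sample SD = sqrt(variance), in original units
cv = sd / mean * 100           # coefficient of variation, in percent

print(f"mean={mean}, SD={sd:.3f}, CV={cv:.1f}%")
```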
Descriptive stats (numerical methods) Range
-difference between the largest and smallest values (largest − smallest)
-size of range is Sensitive to outliers
-reported as actual value
Descriptive stats (numerical methods) Percentiles
-the point in a distribution at or below which a given percentage of the values fall
-ex: 75th percentile = 75% of the other values are smaller
-Does not assume normal distribution
-Interquartile range (IQR): describes the middle 50% of values, i.e., the 25th-75th percentiles
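Percentiles and the IQR can be sketched with the standard-library `statistics.quantiles` function (data hypothetical; note that different interpolation methods give slightly different percentile values):

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]  # hypothetical values

# n=4 splits the data at the 25th, 50th, and 75th percentiles
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1   # interquartile range: spread of the middle 50% of values

print(q1, q2, q3, iqr)
```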
Inferential stats
-conclusions/generalizations made about a population from a study sample
-statistical inference: made by estimation or hypothesis testing
Population Distributions
-binomial (approximates normal when n is large)
-Poisson (discrete probability distribution): may or may not look normal
Population Distributions (Normal/Gaussian distribution)
-most common model
-bell shaped
-continuous/normally distributed data
-mean and SD completely define a normal distribution
-median = mean in normally distributed data
Population Distributions (Normal/Gaussian distribution) Probability
likelihood that any one event will occur given all the possible outcomes
Population Distributions (Normal/Gaussian distribution)
Estimation and sampling variability
-a single sample can be used to estimate (infer) a population parameter
-SD of the sample mean is estimated by the standard error of the mean (SEM)
-SEM = SD (s or σ) / √n (square root of n)
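The SEM formula above is a one-liner in Python (sample data hypothetical):

```python
import math
import statistics

data = [5.1, 4.9, 5.4, 5.0, 4.6, 5.3, 5.2, 4.8, 5.0]  # hypothetical, n = 9

sd = statistics.stdev(data)        # sample SD (s)
sem = sd / math.sqrt(len(data))    # SEM = s / sqrt(n)

print(f"SD={sd:.3f}, SEM={sem:.3f}")
```

Because of the √n in the denominator, quadrupling the sample size halves the SEM.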
Confidence Interval (CI)
-commonly reported as a way to estimate a population parameter, 95% CI's are commonly reported
-conveys both the magnitude of difference between groups & statistical significance
-if a CI for a difference includes 0 = no statistical difference
-if a CI for an odds ratio or RR includes 1 = no statistical difference
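A minimal sketch of a 95% CI for a mean, using the normal z value 1.96 (hypothetical data; for small samples a t critical value would be slightly wider and more exact):

```python
import math
import statistics

data = [12.0, 14.5, 11.8, 13.2, 15.1, 12.7, 14.0, 13.5]  # hypothetical
n = len(data)
mean = statistics.mean(data)
sem = statistics.stdev(data) / math.sqrt(n)

# approximate 95% CI: mean ± 1.96 × SEM
lower, upper = mean - 1.96 * sem, mean + 1.96 * sem
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```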
Hypothesis Testing
-Null hypothesis
-Alternative hypothesis
Null Hypothesis
-Ho
-no difference between treatment groups (A = B)
-Ho rejected = significant difference between groups (unlikely due to chance)
-Ho not rejected = no difference detected (chance could explain the results)
Alternative hypothesis
-Ha
-difference between treatment groups (A ≠ B)
Types of hypothesis testing
-Non-directional
-Directional
Non-directional hypothesis testing (Difference)
Are the means different
-Ho: Mean1 − Mean2 = 0
-Ha: Mean1 − Mean2 ≠ 0
Test used
*Traditional 2-sided t test & CI
Non-directional hypothesis testing (Equivalence)
Are the means practically equivalent
-Ho: Mean1 and Mean2 differ by more than a prespecified equivalence margin
-Ha: the means fall within the margin (practically equivalent)
Test used
*Two 1 sided t-test & CI
Directional hypothesis testing (Superiority)
Is Mean 1>2
Ho: Mean 1 ≤ Mean 2 (difference ≤ 0)
Ha: Mean 1 > Mean 2
Test used
*1 sided t-test
Directional hypothesis testing (Non-inferiority)
Is Mean 1 no more than a certain amount lower than Mean 2
Ho: Mean 1 ≤ Mean 2 − Δ (inferior by more than the margin Δ)
Ha: Mean 1 > Mean 2 − Δ (no more than Δ worse than Mean 2)
Test used
*CI
P value (a priori α)
-p < 0.05 = statistically significant (when α is set at 0.05)
Parametric Tests
-used when data are normally distributed
-data are continuous, measured on an interval or ratio scale
-variance is homogeneous between groups
Parametric Tests examples
Student t-test:
1 sample - compares mean of study to population mean
2 sample -compares means of two independent samples
paired test - compares mean difference of matched samples
ANOVA - compares means of ≥ 3 groups
ANCOVA - control for the effect of confounding variable
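As a rough illustration of what a 2-sample t-test computes, here is the Welch (unequal-variance) t statistic by hand in Python; the group data are hypothetical, and a real analysis would use a statistics package to get degrees of freedom and a p value:

```python
import math
import statistics

# hypothetical outcomes in two independent treatment groups
group_a = [5.2, 5.8, 6.1, 5.5, 6.0, 5.7]
group_b = [4.8, 5.0, 4.6, 5.1, 4.9, 4.7]

m1, m2 = statistics.mean(group_a), statistics.mean(group_b)
v1, v2 = statistics.variance(group_a), statistics.variance(group_b)
n1, n2 = len(group_a), len(group_b)

# Welch two-sample t statistic: difference in means over its standard error
t = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
print(f"t = {t:.2f}")
```

A large |t| (relative to the t distribution for these sample sizes) would lead to rejecting Ho.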
Non-parametric Tests
-used when data are not normally distributed
-continuous data that do NOT meet the assumptions of the t-test or ANOVA
Non-parametric Tests examples
Independent samples
-Wilcoxon Rank Sum & Mann-Whitney U test: compares two independent samples (related to t-test)
-Kruskal-Wallis: compares ≥ 3 independent groups (like ANOVA)
-post hoc testing
Non-parametric Tests examples
Related or paired samples
-Sign test & Wilcoxon Signed rank test: 2 matched or paired samples
-Friedman ANOVA: compares ≥ 3 matched or related groups
Nominal data
Chi-square (χ2): compares expected & observed proportions in ≥ 2 groups
-tests independence and goodness of fit
Fisher exact: like chi-square, but used when expected cell counts are small (e.g., < 5)
McNemar: paired samples
Mantel-Haenszel: chi-square test that controls for confounders
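The chi-square statistic compares observed with expected counts; a hand-rolled sketch for a hypothetical 2x2 table (a real analysis would use a statistics package, which also supplies the p value):

```python
# hypothetical 2x2 table: rows = treatment groups, columns = outcome yes/no
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand  # expected count under Ho
        chi2 += (o - e) ** 2 / e                   # sum of (O-E)^2 / E

print(chi2)  # 4.0 for this table
```

Here every expected count is 25, so chi-square = 4.0; compared against the df = 1 critical value of 3.84 (α = 0.05), Ho would be rejected.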
Summary of Decision Errors
Type I(1) error
- α (alpha) error; significance threshold typically set at 0.05
- occurs when Ho is incorrectly rejected
-concluding that there is a difference when actually there is NO difference
Type II(2) error
- β = beta
- typically set between 0.1 and 0.2
-Concluding that there is NO difference when actually there IS a difference
Power
- (1-β)
- the probability of making a correct decision when H0 is false
-the ability to detect differences between groups
Power
-dependent upon:
-α
-sample size & the size of the difference you want to detect
-decreased by poor study design
Statistical vs Clinical significance
- smaller p value = chance is less likely to explain observed differences
- statistical difference ≠ clinical significance
-lack of statistical difference doesn't mean results aren't important
Correlation
-strength of association between 2 variables
-doesn't mean one variable is dependent on another
-Pearson correlation = r, Linear
- (-1) negative relationship
- (0) NO relationship
- (+1) positive relationship
-the closer |r| is to 1, the more highly correlated the two variables
-influenced by sample size
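The Pearson r above can be computed directly from its definition (covariance over the product of the SDs); the x/y data are hypothetical and deliberately perfectly linear:

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical paired observations
y = [2, 4, 6, 8, 10]  # exactly 2 * x, a perfect linear relationship

n = len(x)
mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
sx = math.sqrt(sum((a - mx) ** 2 for a in x))
sy = math.sqrt(sum((b - my) ** 2 for b in y))

r = cov / (sx * sy)   # Pearson correlation coefficient
print(r)  # ~1.0: perfect positive linear relationship
```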
Spearman Rank Correlation
-nonparametric test
-used for continuous data that are NOT normally distributed
-correlates association between 2 variables
Regression
-ability of one or more variables to predict another
- Y = mX + b (dependent variable = slope (m) × independent variable (X) + intercept (b))
- r² = coefficient of determination (proportion of variance in Y explained by X)
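The slope and intercept of Y = mX + b come from a least-squares fit; a sketch with hypothetical data that lie exactly on y = 2x + 1:

```python
# hypothetical data constructed to fall exactly on y = 2x + 1
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
# least-squares slope: covariance of x,y over variance of x
m = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b = my - m * mx   # intercept: line passes through (mean x, mean y)

print(m, b)  # slope 2.0, intercept 1.0
```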
Survival analysis
-studies the time between entry in a study and some event
-censoring: subject leaves the study for a reason other than the event of interest
Survival analysis
-Kaplan-Meier: curve of estimated survival vs. length of time
-Log-rank test: compares survival distributions between ≥ 2 groups
-H0 = no difference in survival between the groups
-Cox proportional hazards: compares survival of 2 groups after adjusting for other variables
-calculation of hazard ratio & CI
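The Kaplan-Meier product-limit idea above can be sketched in a few lines of Python: at each event time, survival is multiplied by the fraction of at-risk subjects who survive, and censored subjects simply drop out of the risk set. The subject data are hypothetical:

```python
# hypothetical subjects: (time, event) where event=1 is death, event=0 is censored
subjects = [(2, 1), (3, 0), (5, 1), (5, 1), (7, 0), (9, 1)]

at_risk = len(subjects)
survival = 1.0
curve = []  # (time, estimated survival) at each event time
for t in sorted({time for time, event in subjects}):
    deaths = sum(1 for time, event in subjects if time == t and event == 1)
    if deaths:
        survival *= (at_risk - deaths) / at_risk  # product-limit step
        curve.append((t, survival))
    # both deaths and censored subjects leave the risk set after time t
    at_risk -= sum(1 for time, event in subjects if time == t)

print(curve)
```

Note how the censored subjects at times 3 and 7 shrink the risk set without stepping the curve down, which is exactly why censoring must be handled separately from events.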
Selected Representative Statistical Tests