Introduction to Biostatistics
Terms in this set (45)
Statistics
the only science that enables different experts using the same figures to draw different conclusions
the science of learning from data and of measuring, controlling and communicating uncertainty by collection, presentation, analysis, interpretation or explanation of data
Statistician
A person who believes figures don't lie, but admits that under analysis some of them won't stand up either
Descriptive Statistics
1. summarizes data from a population or a sample using numerical data, such as a mean or standard deviation
2. presentation, organization and summarization of data
Inferential Statistics
1. draws conclusions from data (e.g., significance or non-significance) that are subject to random variation (e.g., observational errors, sampling variation)
2. allows us to make inferences from sample data and generalize to a larger group
Evidence based Dentistry
an approach to oral healthcare that requires the judicious integration of systematic assessments of clinically relevant scientific evidence, relating to the patient's oral and medical condition and history, with the dentist's clinical expertise and the patient's treatment needs and preferences
Elements of EBD
1. systematic assessment of scientific literature
2. dentist's clinical expertise
3. patient's treatment needs and preferences
How to practice EBD
1. Define the question
2. Search for the information
3. Interpret the evidence
4. Act on the evidence
Variables
things measured, controlled or manipulated in research
observational research
we do not influence, only measure variables and look for relationships
experimental research
we manipulate variables and measure the effects of this manipulation on other variables
independent variables
those manipulated
dependent variables
outcomes measured or registered
Precision
consistency of repeated measurements
Accuracy
how well a measured value reflects the "true" value
Error
the difference between a calculated or observed value and the "true" value
Systematic error
Errors that occur reproducibly from faulty calibration of equipment or observer bias
--> if you have systematic error you might as well throw out the study
Random error
errors that result from the fluctuations in observations; random variability
--> THIS IS WHERE STATS MATTERS
Qualitative variables
a variable that cannot naturally be expressed as a number (gender, race/ethnicity, disease severity)
Quantitative variable
a variable that can naturally be expressed as a number (bone loss, pocket depth)
Discrete variables
a variable that can take on a FINITE number of values (DMF, number of teeth)
is an exact number
continuous variable
a variable that can take on an INFINITE number of values within a range (treatment time, blood pressure)
can only be approximated
Nominal variables
qualitative classification only. Information is measured only in terms of whether individual items belong to distinctly different categories (e.g., gender, race, graduating class)
Ordinal variables
ordered categories, allows ranking, difference between ranks is not quantifiable. (e.g., disease severity, socioeconomic status, pain level)
Interval variable
can rank order AND quantify differences between units (e.g., temperature, as measured in degrees Fahrenheit or Celsius).
(40 F > 20 F: 40 F is warmer than 20 F, but 40 F is not twice as warm as 20 F)
Ratio variables
same as interval variables AND has a meaningful 0 point, so that ratios between numbers are meaningful (x is two times y) (e.g., weight, blood pressure). Typical examples of ratio scales are measures of time or space. (A weight of 160 lbs. is double a weight of 80 lbs.)
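A small sketch of the interval vs. ratio distinction, using the 40 F / 20 F and 160 lbs. / 80 lbs. examples from the cards above. The helper name `fahrenheit_to_celsius` and the constant `LB_TO_KG` are illustrative choices, not part of the original set.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit temperature to Celsius."""
    return (deg_f - 32) * 5 / 9


LB_TO_KG = 0.45359237  # kilograms per pound

# Interval scale: the zero point is arbitrary, so the ratio of two
# temperatures changes when the unit changes.
ratio_fahrenheit = 40 / 20
ratio_celsius = fahrenheit_to_celsius(40) / fahrenheit_to_celsius(20)

# Ratio scale: zero means "none", so the ratio of two weights is the
# same in pounds and in kilograms.
ratio_pounds = 160 / 80
ratio_kilograms = (160 * LB_TO_KG) / (80 * LB_TO_KG)

print("temperature ratios:", ratio_fahrenheit, round(ratio_celsius, 3))
print("weight ratios:     ", ratio_pounds, round(ratio_kilograms, 3))
```

The temperature ratio is 2 in Fahrenheit but negative in Celsius, while the weight ratio stays at 2 in both units.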
Measures of central tendency
mode, median, mean
Mode
value that occurs most frequently in distribution
Median
value midway in the frequency distribution
More robust than the mean in skewed distributions, where extreme values pull the mean toward the long tail
Mean
arithmetic average (sum of observations divided by number of observations)
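As a quick illustration of mode, median and mean, here is a sketch using Python's standard `statistics` module. The pocket-depth readings are made-up numbers chosen only for the example.

```python
import statistics

# Hypothetical pocket-depth readings in mm (invented for illustration).
depths = [2, 3, 3, 3, 4, 4, 5, 9]

mode_value = statistics.mode(depths)      # value that occurs most often
median_value = statistics.median(depths)  # midpoint of the sorted values
mean_value = statistics.mean(depths)      # sum of observations / count

print("mode:", mode_value)
print("median:", median_value)
print("mean:", mean_value)
```

The single large reading (9) pulls the mean above the median, which is what a longer right tail does to the mean.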
Variance
average of the squared deviations of the values around the mean; squaring gets around negative values. Tells how far scores typically fall from the average score
Standard deviation (SD)
square root of the variance; measures "average" variation (differences from the mean). Used because squared units are awkward to deal with. The most meaningful and widely used measure of variability
Coefficient of variation (CV)
SD divided by the mean; independent of any scale. Best used with ratio-level data
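A minimal sketch of variance, SD and CV on made-up data. It assumes the "average of squared deviations" definition, i.e. dividing by n (population form); a sample variance would divide by n - 1 instead (`statistics.variance`).

```python
import statistics

# Invented scores, chosen so the arithmetic is easy to follow.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)

# Population variance: mean of the squared deviations from the mean.
variance = statistics.pvariance(values)

# Standard deviation: square root of the variance, back in original units.
sd = statistics.pstdev(values)

# Coefficient of variation: SD relative to the mean (unitless).
cv = sd / mean

print("mean:", mean, "variance:", variance, "SD:", sd, "CV:", cv)
```

For these numbers the squared deviations sum to 32, so the variance is 32 / 8 = 4, the SD is 2 and the CV is 2 / 5 = 0.4.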
Measures of dispersion
range, interquartile range, average deviations
Range
Measure of dispersion (max minus min)
when the max and min are unusual values, the range may be misleading
used with original data; rarely used in scientific work because it is fairly insensitive (two very different sets of data can have the same range)
Interquartile range
comprises 50% of the data; mid-spread
less affected by a few extreme scores
Average deviations
sum of the differences between each datum and the mean, divided by the number of data; positive and negative deviations cancel exactly, so the average is always 0
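A short sketch of why the plain average deviation is not useful as a measure of spread, on made-up data. The mean absolute deviation shown at the end is one common workaround and is an addition here, not a term from this set; squaring the deviations (variance) is the other.

```python
# Invented scores for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)
deviations = [v - mean for v in values]

# Signed deviations cancel out, so their average is zero.
average_deviation = sum(deviations) / len(values)

# Taking absolute values removes the signs before averaging.
mean_absolute_deviation = sum(abs(d) for d in deviations) / len(values)

print("average deviation:", average_deviation)
print("mean absolute deviation:", mean_absolute_deviation)
```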
Skewness
1. Used with interval and ratio data to "check" normality
2. Measure of symmetry in the distribution of scores (length of tails); direction refers to the direction of the longer tail
a. right or positive skew: right tail is longer
b. left or negative skew: left tail is longer
c. < 0 negative skew; > 0 positive skew
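One way to get a signed skewness number is the moment coefficient, the third central moment divided by the 1.5 power of the second. This is a sketch under that assumption; the set does not say which skewness formula is intended, and statistical packages often apply small-sample corrections.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Invented data: one long right tail, its mirror image, and a symmetric set.
right_skewed = [2, 3, 3, 3, 4, 4, 5, 9]
left_skewed = [-v for v in right_skewed]
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3))
```

The sign follows the longer tail: positive for the right-skewed data, negative for its mirror image, and zero for the symmetric set.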
Quartiles
the first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50% are larger)
Only 25% of the observations are greater than the third quartile
Interquartile Range
3rd quartile - 1st quartile = Q3 - Q1
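A sketch of quartiles and the interquartile range using `statistics.quantiles`. The data are invented, and note that quartile conventions differ slightly between textbooks and software; this uses the module's default ("exclusive") method.

```python
import statistics

# Invented scores for illustration: the integers 1 through 11.
scores = list(range(1, 12))

# n=4 splits the data into four equal groups and returns the 3 cut points.
q1, q2, q3 = statistics.quantiles(scores, n=4)

iqr = q3 - q1  # spread of the middle 50% of the data

print("Q1:", q1, "Q2:", q2, "Q3:", q3, "IQR:", iqr)
```

Here Q2 matches `statistics.median(scores)`, as the Quartiles card states.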
Probability
Chance, or probability, is assigned a number between 0% chance and 100% chance, or 0 to 1.
The probability is the ratio of the number of ways the event of interest can occur to the number of all possible, equally likely outcomes
Odds
the ratio of the probability that an event of interest occurs to the probability that it doesn't occur
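Probability and odds carry the same information on different scales. A small sketch of converting between them; the function names are illustrative.

```python
def odds_from_probability(p: float) -> float:
    """Odds = P(event) / P(no event). Requires 0 <= p < 1."""
    return p / (1 - p)


def probability_from_odds(odds: float) -> float:
    """Inverse conversion: p = odds / (1 + odds)."""
    return odds / (1 + odds)


# A probability of 0.2 (20% chance) corresponds to odds of 1 to 4.
p = 0.2
odds = odds_from_probability(p)

print("odds:", round(odds, 4))
print("back to probability:", round(probability_from_odds(odds), 4))
```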
Binomial distribution
Gives the probability that a specified outcome occurs in a given # of independent trials
Based on counts of discrete events
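The binomial probability of exactly k occurrences in n independent trials, each with probability p, is C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch using `math.comb` (Python 3.8+); the function name and the numbers are illustrative.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k occurrences in n independent trials, each with prob. p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Example: probability of exactly 5 occurrences in 10 trials with p = 0.5.
n, p = 10, 0.5
prob_five = binomial_pmf(5, n, p)

# The probabilities over all possible counts 0..n add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print("P(X = 5):", prob_five)
print("sum over all k:", total)
```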
Poisson distribution
Useful to model events that take place over and over again in a completely haphazard way.
Based on counts of discrete events occurring over a continuous interval (e.g., time or space)
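The Poisson probability of observing k events, when the average number of events per interval is lambda, is lambda^k * e^(-lambda) / k!. A minimal sketch; the function name and the rate of 2 events per interval are assumptions made for the example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events) when events occur at an average rate lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


lam = 2.0  # assumed average number of events per interval

prob_none = poisson_pmf(0, lam)  # reduces to e**(-lam)

# Summing far enough into the tail should recover (almost) all probability.
total = sum(poisson_pmf(k, lam) for k in range(51))

print("P(X = 0):", round(prob_none, 4))
print("sum over k = 0..50:", round(total, 6))
```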
Normal Curve
With a large number of trials, the binomial distribution approaches the normal distribution (bell curve, Gaussian)
Most important probability distribution in the statistical analysis of experimental data
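A rough numerical check of how close the normal curve gets to the binomial for a large number of trials. The normal curve uses mean n*p and SD sqrt(n*p*(1-p)); the half-unit shift (continuity correction) is a standard adjustment added here and is not mentioned in the set. The values of n, p and k are arbitrary.

```python
import math
from statistics import NormalDist

n, p, k = 100, 0.5, 55  # arbitrary illustrative values

# Exact binomial probability of at most k occurrences in n trials.
exact = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal curve with the same mean and standard deviation as the binomial.
normal = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

# Continuity correction: evaluate at k + 0.5 because counts are whole numbers.
approx = normal.cdf(k + 0.5)

print("exact binomial:", round(exact, 4))
print("normal approx.:", round(approx, 4))
```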
Goal of graphing
1. Presentation of descriptive statistics
2. Presentation of evidence
3. some people understand subject matter better with visual aids
4. provide a sense of the underlying data generating process (scatter-plots)