Research Statistics
Terms in this set (119)
Evidence Based Practice
Choosing the most appropriate form of treatment for any diagnosis/symptom on the basis of research evidence
The integration of best research evidence with clinical expertise and patient values
PICO - P
Patient/Problem (question)
What is the problem/question about? What is the target population?
PICO - I
Intervention/Diagnosis/Prognosis
What main intervention/treatment/diagnostic test are you considering?
PICO - C
Comparison
What is the main alternative intervention/treatment/diagnostic test you are considering? (or no treatment, placebo, sham?)
PICO - O
Outcome
What are you trying to accomplish, measure, improve, or affect?
PICO - Type of question
Is the question about:
Diagnosis
therapy
prognosis
etiology - identifying causes for the disease
PICO - Type of Study
What type of study would provide the best answer?
RCT, cohort study, etc.
PICO - Research question
Can be vague
relationship between two or more variables is phrased in a question
PICO - Research hypothesis
Relationship phrased as a declarative statement, needs statistical testing
Independent variables
Any variable that is being manipulated and freely chosen, independent of other variables (e.g. tx group/group * tx intensity)
The number of levels of an independent variable is the number of experimental conditions (2 levels: experimental-control group; 5 levels of treatment intensity)
Dependent variables (or criterion variables)
Any variable that is being measured
Depends on the independent variable
Control (or extraneous) variables
Any variable that could change an experiment, but does not and is not the focus of the experiment
This variable is kept constant or is monitored to minimize its effect on the study (sex, fixed number of weeks of therapy, age range, race)
Nominal level of measurement
numerals are category variables
unordered categories, object or person is assigned to only 1 category
cannot demonstrate change
examples: blood type, sex, dx
Ordinal level of measurement
numbers indicate rank order
differences between adjacent (greater than/less than) scores are not equal
has no arithmetic properties
distances between intervals are not known and are not equal
sum and derived scores may not be interpretable
examples: MMT, pain questionnaire
Interval level of measurement
numbers have equal intervals, but no true zero
we can determine the distance of the change, but not the true amount of change
Examples: calendar years, BMI, temperature
Ratio level of measurement
numbers represent units with equal intervals, measure from true or fixed zero
measures are known quantities
we can measure the amount of change
Examples:
Distance, age, time, decibels
Reliability
consistency/reproducibility/repeatability/equivalence/agreement of measurements
Reliability coefficient
descriptive summary of data consistency assumes a value between 0 and 1
Interpretation of reliability (in general)
<.5 = poor reliability
.5-.75 = moderate reliability
>.75 = good reliability (usually needs to be above .9)
Internal consistency
extent to which the items of a scale correspond to the same dimension
Cronbach's alpha
internal consistency
Intra-rater Reliability
consistency of repeated measurements made by 1 person over time
weighted kappa, percentage agreement, intraclass correlation coefficient (ICC) (model 3)
intra-rater reliability
Inter-rater reliability
consistency of repeated measurements made by more than one person (different examiners)
weighted kappa, percentage agreement, ICC (model 2)
Inter-rater reliability
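A minimal sketch of percentage agreement and unweighted Cohen's kappa for two raters, assuming each rater assigns one nominal category per patient. The function names and the ratings are hypothetical, and the weighted variant used for ordinal data is not shown.

```python
from collections import Counter

def percentage_agreement(rater_a, rater_b):
    """Share of cases on which both raters chose the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohen_kappa(rater_a, rater_b):
    """Unweighted kappa: agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percentage_agreement(rater_a, rater_b)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) if chance agreement is exactly 1
    return (observed - expected) / (1 - expected)

# Hypothetical ratings: 1 = test positive, 0 = test negative
examiner_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
examiner_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
agreement = percentage_agreement(examiner_1, examiner_2)
kappa = cohen_kappa(examiner_1, examiner_2)
```

Here the raters agree on 8 of 10 cases, but half of that agreement is expected by chance, so kappa (about 0.6) is lower than the raw agreement.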
Test-retest reliability
consistency of repeated measurements made on the same patient, but on different occasions
weighted kappa, percentage agreement, ICC
Test-retest reliability
Response stability
SEM estimates the standard error in a set of repeated scores
SEM = SD of the first test × √(1 − ICC)
When to use SEM
MCD or response stability
Cronbach's Alpha Guidelines
>.9 = Excellent (high stakes testing)
.7 < alpha <.9 = good (low stakes testing)
.6 < alpha < .7 = acceptable
.5 < alpha < .6 = Poor
alpha <.5 = unacceptable
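A minimal sketch of how Cronbach's alpha can be computed, assuming the standard formula alpha = k/(k−1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, each row holding k item scores."""
    k = len(scores[0])
    # Sample variance of each item (column) across respondents
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical questionnaire: 5 respondents, 3 items
responses = [
    [3, 4, 3],
    [5, 5, 4],
    [2, 2, 3],
    [4, 4, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
```

For these made-up scores alpha comes out at roughly 0.94, which the guidelines above would label excellent.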
Kappa and weighted kappa (Landis and Koch): guidelines
> .8 = excellent agreement
.6-.8 = substantial
.4-.6 = moderate
<.4 = poor agreement
Intraclass correlation coefficient (ICC)
>.9 = excellent reliability
.75-.89 = good reliability
<.75 = poor - moderate reliability
Statistics by level of measurement
nominal or categorical: kappa, percentage agreement
Dichotomous (2 responses) or ordinal (more than 2): Cronbach's alpha (internal consistency)
ordinal or rank: weighted kappa, ICC
Interval and ratio (continuous data): ICC
Validity
extent to which an instrument measures what it is intended to measure
we want to use test measures for discriminating among individuals, evaluating change, or making accurate predictions, based on outcomes of the test
Content validity
the degree to which a measurement appears to test what it is supposed to (often in a form of discussion by a panel of experts)
Construct validity
the degree to which a theoretical/abstract construct/trait (ADL, pain) is measured by a test or measurement
Convergent validity
two measures of a construct that theoretically should be related, are related
Discriminant validity
two measures of a construct that theoretically should be unrelated, are unrelated
Interpretation of construct validity
correlations, denoted in papers with an "r"; Pearson for parametric data (interval/ratio) or Spearman for non-parametric data (ordinal)
Correlation coefficients
high correlations = convergent
low correlations = discriminant
Correlation coefficient guidelines:
above .75 = good-excellent relationship
.5-.75 = moderate-good relationship
.25-.5 = fair relationship
.01-.25 = little relationship
0.00 = no relationship
Negative value for correlation coefficient
inverse relationship (the strength of an inverse relationship is interpreted in the same way as above)
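A minimal sketch of computing Pearson's r and labelling its strength with the guideline bands above. The function names and data are hypothetical. Because the bands share their boundary values (.25, .5, .75), this sketch assigns a boundary value to the lower band, which is an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def describe_strength(r):
    """Label |r| using the guideline bands; the sign only gives direction."""
    size = abs(r)
    if size == 0:
        return "no relationship"
    if size <= 0.25:
        return "little relationship"
    if size <= 0.5:
        return "fair relationship"
    if size <= 0.75:
        return "moderate-good relationship"
    return "good-excellent relationship"

# Hypothetical scores on two measures of the same construct
measure_a = [1, 2, 3, 4]
measure_b = [2, 4, 6, 8]
r = pearson_r(measure_a, measure_b)
label = describe_strength(r)
```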
Criterion related validity
the validity of the measurement is established by comparing it to a gold standard, a reference standard, or data obtained by different forms of testing
concurrent validity
comparing measurement to gold standard measurement by testing both of them at the same time
subtype of criterion related validity
predictive validity
a measurement is predictive of a future criterion score or outcome
Floor effect
a limitation of a measure in which the instrument does not register a further decrease in score for the lowest scoring individuals
the scale is not capturing the true baseline score for a floor effect: the true score could be lower than the lowest scale score
Ceiling effect:
a limitation of a measure in which the instrument does not register a further increase for the highest scoring individuals
the scale is not capturing a true baseline (or follow up) scoring for a ceiling effect: the true score could be higher than the highest scale score
Responsiveness
important if the test is used to test effectiveness of a treatment -- meaningful change MCID
MCID
minimal clinically important difference
ability to detect minimal clinically important change over time
the smallest tx effect that would result in a change in pt management, given its side effects, costs, and inconveniences
MCID is critical for judging the benefit of the intervention
MCID is the smallest difference in a measured variable that signifies an important rather than a trivial difference in a patient's condition (pt perception of being beneficial)
MDD or MDC
reliability of the measurement
amount of change in a variable that must be achieved to reflect a true difference
smallest amount of change that an instrument can accurately measure
MDC = SEM × 1.96 × √2
SEM
is the absolute amount of measurement error, in the units of the measurement. Because SEM estimates the standard error in a set of repeated scores, it is a reliability measure that assesses response stability
SEM = (SD baseline) × √(1 − ICC)
Expresses variability in same unit as original measure
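A minimal sketch of the SEM calculation, followed by the minimal detectable change at the 95% confidence level, assuming the common form MDC95 = SEM × 1.96 × √2. The function names and the baseline SD and ICC values are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd_baseline, icc):
    """SEM in the same units as the original measure."""
    return sd_baseline * sqrt(1 - icc)

def minimal_detectable_change_95(sem):
    """Smallest change that exceeds measurement error with 95% confidence."""
    return sem * 1.96 * sqrt(2)

# Hypothetical values: baseline SD of 10 points, ICC of 0.91
sem = standard_error_of_measurement(10, 0.91)
mdc95 = minimal_detectable_change_95(sem)
```

With these numbers the SEM is about 3 points and the MDC95 about 8.3 points, so a change smaller than that could be measurement error alone.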
ICC
relative measure of reliability
unit-less
varies between 0 (no reliability) and 1 (perfect reliability)
magnitude of ICC depends on
- between subject variability
- sample size
- patient population
Diagnostic Accuracy
Validity of a diagnostic test is evaluated in terms of its ability to accurately assess the presence and absence of a target condition
sensitivity
percentage of people who test positive for a specific disease among a group of people who have the disease
test with high sensitivity will properly identify most of those who have the disorder
Specificity
Percentage of people who test negative for a specific disease among a group of people who do not have the disease
test with high specificity will properly identify most of those without the disorder
positive predictive value
ability of a diagnostic test to correctly determine the proportion of patients with the disease from all the patients with positive test results
proportion of persons who test positive who actually have the disease (true positives)
Negative predictive value
ability of a diagnostic test to correctly determine the proportion of patients without the disease from all the patients with negative test results
proportion of those who tested negative who were true negatives
A high negative predictive value will provide a strong estimate of the actual number of people who do NOT have the target condition
Positive likelihood ratio
LR+ tells us how much more likely a positive test is in a person with the disorder than in a person without the disorder (ruling IN the disease)
Negative likelihood ratio
LR- tells us how much less likely a negative test is in a person with the disorder than in a person without the disorder (ruling OUT the disease)
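A minimal sketch of the diagnostic accuracy measures above, computed from a 2x2 table of test result against true disease status. The function name and the counts are hypothetical. It assumes the usual definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """tp/fp/fn/tn: true positives, false positives, false negatives, true negatives."""
    sensitivity = tp / (tp + fn)   # positives among those with the disease
    specificity = tn / (tn + fp)   # negatives among those without the disease
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # diseased among all positive tests
        "npv": tn / (tn + fn),     # disease-free among all negative tests
        # LR+ is undefined when specificity is exactly 1 (division by zero)
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }

# Hypothetical study: 100 people with the disease, 100 without
stats = diagnostic_stats(tp=90, fp=20, fn=10, tn=80)
```

For these counts sensitivity is 0.90, specificity 0.80, LR+ about 4.5 and LR− about 0.125.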
-LR guidelines
<.1 large/conclusive change pre to posttest probability
.1-.2 moderate change
.2-.5 small but sometimes important change
.5 to just under 1 negligible change
1 no impact on likelihood of disease
+LR
1-2 negligible change
2-5 small but sometimes important change
5-10 moderate change
>10 large/conclusive change pre to post test probability
Pre and post test probability procedure
1. convert pretest probability (prevalence) to pretest odds
2. multiply the pretest odds by the likelihood ratio to get the post test odds
3. convert the posttest odds to the posttest probability
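The three steps above can be sketched as follows. The function name and the input values are hypothetical; odds are taken as p / (1 − p) and converted back with odds / (1 + odds).

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    # Step 1: convert pretest probability (prevalence) to pretest odds
    pretest_odds = pretest_probability / (1 - pretest_probability)
    # Step 2: multiply the pretest odds by the likelihood ratio
    posttest_odds = pretest_odds * likelihood_ratio
    # Step 3: convert the posttest odds back to a probability
    return posttest_odds / (1 + posttest_odds)

# Hypothetical case: 20% pretest probability, positive test with LR+ = 4.5
after_positive = post_test_probability(0.20, 4.5)
# Same patient, negative test with LR- = 0.125
after_negative = post_test_probability(0.20, 0.125)
```

In this example a positive result raises the probability from 20% to about 53%, and a negative result lowers it to about 3%.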
Prevalence:
existing frequency of a disease in a particular population at a particular time
prevalence is influenced by the duration of the disorder
Prevalence: how to calculate
(number of existing cases of a disease at a given point in time ÷ total population at risk) × 100
Incidence
The number or rate of new cases of a disorder or disease in the population during a specific time, and therefore, represents an estimate of the risk of developing the disease during that time
not influenced by the length of the disorder
incidence formula
(# of new cases of a disease during a given time period ÷ total population initially at risk) × 100
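A minimal sketch of the prevalence and incidence formulas, both expressed as percentages. The function names and the counts are hypothetical.

```python
def prevalence_percent(existing_cases, population_at_risk):
    """Existing cases at one point in time, per 100 people at risk."""
    return existing_cases / population_at_risk * 100

def incidence_percent(new_cases, population_initially_at_risk):
    """New cases during the period, per 100 people initially at risk."""
    return new_cases / population_initially_at_risk * 100

# Hypothetical town: 1000 people, 50 already have the disease today
prevalence = prevalence_percent(50, 1000)
# The 950 disease-free people are followed for a year; 19 develop the disease
incidence = incidence_percent(19, 950)
```

Note that the denominator for incidence excludes people who already have the disease, since they are no longer at risk of becoming a new case.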
Interpretation of odds ratio
>1 increased frequency of exposure
1 = no change in frequency of exposure
<1 decreased frequency of exposure
relative risk
>1 increased risk of outcome (disease) in exposed persons
1=no difference in risk of outcome (disease)
<1 reduced risk of outcome (disease)
Clinical relevance
clinical decisions regarding effectiveness of interventions, we want to know:
if the tx will improve pts condition
if tx will prevent or decrease the risk of an adverse event
RCT
success of an intervention is reflected by difference between beneficial/adverse outcomes between experimental group and control group
NNT
1/ARR
number needed to treat : benefits of therapeutic intervention
the number of pts that need the experimental treatment to reduce risk in one
Relative risk reduction
the relative value of the rate of decrease in adverse outcomes for the intervention group relative to the control group (expressed as a percentage)
Interpretation of RRR
>0 relative risk reduction when you get treatment compared to control condition
0= no change in relative risk whether you get tx or not
<0 relative risk increase when you get tx compared to control condition
absolute risk reduction
The absolute value of the difference in rates of adverse outcomes between the intervention group and the control group (expressed as a percentage)
ARR = CER − EER (control event rate minus experimental event rate)
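A minimal sketch relating ARR, RRR, and NNT, assuming CER and EER are the proportions of adverse outcomes in the control and experimental groups, RRR = ARR / CER, and NNT = 1 / ARR. The function name and the event rates are hypothetical.

```python
def risk_reduction(cer, eer):
    """cer/eer: control and experimental event rates as proportions (0 to 1)."""
    arr = cer - eer                               # absolute risk reduction
    rrr = arr / cer                               # relative risk reduction
    nnt = 1 / arr if arr != 0 else float("inf")   # no effect: nobody is helped
    return {"arr": arr, "rrr": rrr, "nnt": nnt}

# Hypothetical trial: 20% adverse outcomes with control, 15% with treatment
result = risk_reduction(cer=0.20, eer=0.15)
```

Here the ARR is 5 percentage points, the RRR is 25%, and about 20 patients need treatment to prevent one additional bad outcome. In practice NNT is usually reported rounded up to a whole number.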
interpretation of ARR
>0 absolute risk reduction (amount which your therapy reduces the risk of bad outcome)
=0 no change in absolute risk whether you get therapy or not
<0 absolute risk increase (amount by which your therapy increases the risk of a bad outcome)
Interpretation of NNT:
>1 average number of pts who need to be tx to prevent one additional bad outcome
= 1 ideal, everyone who received treatment experienced benefit (no bad outcomes)
<0 (negative) more bad outcomes with treatment than with the control condition (treatment is harmful)
sampling bias
when individuals selected for a sample over-represent or under-represent certain population attributes
conscious (inclusion criteria)
unconscious (convenience/opportunity sampling - persons who just happen to be there)
Goal of a sample
estimate characteristics or drawing conclusions about the population
ideally, sample reflects the relevant characteristics and variations of the population, in the same proportions as they exist in the population
Statistical power
ability to find significant differences when they exist. Sample size affects the statistical power of a study
simple random sampling
everyone has an equal chance of being selected
ex) choosing US citizens at random to determine prevalence of obesity
convenience sampling
selecting participants who just happen to be available (consecutive sampling: any person meeting the inclusion or exclusion criteria is recruited)
ex) subjects who attend the state fair
systematic sampling
select random starting point and select every nth subject from a population (sampling interval)
ex. choose every 5th person that walks into a room to take a survey
Stratified random sampling
partitioning members into relevant population characteristics (subsets or strata)
ex) 30 students of each year of undergrad + of each DPT class currently studying to become a PT
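A minimal sketch of simple random, systematic, and stratified random sampling using Python's random module. The function names, the population, and the strata are hypothetical; a seeded random.Random instance is passed in so that a draw can be reproduced.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, interval, rng):
    """Random starting point, then every nth member (the sampling interval)."""
    start = rng.randrange(interval)
    return population[start::interval]

def stratified_random_sample(population, stratum_of, n_per_stratum, rng):
    """Partition members into strata, then sample randomly within each one."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 100 student IDs; class year derived from the ID
rng = random.Random(42)
students = list(range(100))
random_draw = simple_random_sample(students, 10, rng)
every_fifth = systematic_sample(students, 5, rng)
by_year = stratified_random_sample(students, lambda s: s % 4 + 1, 3, rng)
```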
Cluster sampling
Random sampling from each cluster (multi-stage sampling)
ex) randomly choose 1 state, then randomly choose one city, then randomly choose 10 schools, randomly choose one class in each school, and randomly select 4 students from each class
quota sampling
similar to stratified random sampling but subjects from each subgroup are not selected randomly, but by convenience sampling. Research stops once quota is reached
ex) separating into age groups and then recruiting subjects until the researcher reaches the number of subjects needed
Purposive sampling
Researcher hand picks the subjects based off specific criteria
ex) recruiting subjects over 65, with hypertension, but not on any meds
snowball sampling
ask current participants to refer you to people they know so you can ask them to participate
ex) asking a hispanic individual to refer you to more hispanic individuals
volunteer sampling
advertise your study and then wait for people to respond to the advertisement
ex)create a flyer or website for a study and wait until someone contacts you
Characteristics of experiments
independent variables
dependent variables
extraneous variables (any variable not directly related to the purpose of the study but that may affect the dependent variable.)
experiments are designed to control for a confounding influence on the independent variable
Essential characteristics of a TRUE experiment
independent variable must be manipulated by the experimenter
subjects must be randomly assigned across groups
design must have experimental and control group
Manipulation of independent variables: active variable
manipulated by the researcher - cause and effect interpretation
Manipulation of independent variable: attribute variable
inherent to the group and must be observed -- relationships
Random assignment
refers to groups being considered equivalent
equivalent does not mean groups are exactly the same
equivalent means that any difference between groups happened by chance
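A minimal sketch of random assignment to two groups, assuming equal group sizes are wanted. The function name, group labels, and subject IDs are hypothetical. Shuffling first and then dealing subjects out in turn means any difference between the groups arises by chance.

```python
import random

def randomly_assign(subjects, groups=("experimental", "control"), seed=None):
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    allocation = {group: [] for group in groups}
    for index, subject in enumerate(shuffled):
        allocation[groups[index % len(groups)]].append(subject)
    return allocation

# Hypothetical study with 20 enrolled subjects
allocation = randomly_assign(range(1, 21), seed=7)
```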
Experimental and control group
reasonable degree of equivalence between experimental and control group to draw valid comparisons
comparing groups on initial values that are considered relevant to the dependent variable, to determine if the extraneous variables did balance out -- statistical analysis
single blind
participant knows which group they are in (tx is obvious)
only experimenter or measurement team is blinded
double blind
neither participant nor experimenter knows which group the participants are in
triple blind
participant, experimenter, and data analyzer do not know which group the participants are in
strategies for controlling for inter-subject differences
selection of homogeneous subjects
blocking variable ((no) tx randomly assigned to 3 age ranges)
matching/pairing
using subjects as their own control
analysis of covariance (ANCOVA)
Handling incomplete and lost data
maximize adherence to the research protocol to limit loss of data
incomplete data:
- compromise the effect of random assignment
- decrease the power of the study
Causes of loss of data because subjects
drop out for a specific reason (attrition)
cross over to another treatment during the course of the study
refuse the assigned treatment after allocation (ethically, the pts must be allowed to receive the treatment they want)
may be excluded after randomization because they do not meet eligibility requirements
may not be compliant with assigned treatment
On-protocol or on-treatment or completer analysis
include those who completed the trial's protocol
eliminate incomplete and lost data
bias towards treatment effect
ex) those who complied with the strengthening protocol are those who tend to see positive results
treatment received analysis
analyze all subjects according to the tx that they actually did receive, regardless of original group assignment
Bias because effect of randomization has been compromised
intention to treat analysis
more conservative approach
data analyzed according to the way we intended to tx the subjects
- ideally, we include all subjects
- if drop outs, say why they dropped out (see diagram)
guards against the potential for bias if drop outs are related to outcomes or group assignments
may result in underestimating the tx effect
appropriate use of statistical procedures for analyzing data: threats to validity
low statistical power
violated assumptions of statistical tests
reliability and variance
failure to use intention to treat analysis
threats to validity: internal validity - relationship between independent and dependent variables
hx effects
maturation
attrition
threats to validity: construct validity of cause and effect
length of follow up
experimental bias
- Hawthorne effect (patients performing better when observed)
- experimenter effect (passive or active)
Threats to validity: external validity:
interactions of tx and a)selection, b) setting, c) history
systematic review:
review of medical literature that uses specific methods to systematically search, identify, appraise, and summarize literature on a specific topic
includes detailed description of the methods and criteria used to select and evaluate articles that are included
Meta analysis
systematic review that uses a statistical technique to derive an overall estimate of effect size, by combining results of several controlled trials to determine overall effectiveness of a treatment
pooling of the trials increases sample size
combines randomized controlled studies using a quantitative index to develop a single overall estimate of the intervention effect
Randomized controlled trial
assesses the relative effect of a specific intervention compared to the control condition
the control condition is either no treatment or a standard default treatment
ideally, experimental group and control group are identical except for tx received
most robust of intervention studies
random allocation of participants to conditions
often double blind
gold standard for clinical trial
Cohort study
longitudinal (prospective), observational study
individuals at risk of exposure are followed over time
the goal is to compare occurrence of a disease in exposed versus unexposed group
relative risk = ratio of incidence of disease among exposed subjects to incidence of disease among the unexposed
relative risk = [a/(a+b)] / [c/(c+d)]
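A minimal sketch of the relative risk calculation. The card does not define a, b, c, and d, so the usual 2x2 layout is assumed: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. The function name and the counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Incidence among the exposed divided by incidence among the unexposed."""
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed

# Hypothetical cohort: 100 exposed (30 develop disease), 100 unexposed (10 do)
rr = relative_risk(a=30, b=70, c=10, d=90)
```

A relative risk of about 3 here means the disease occurred three times as often in the exposed group as in the unexposed group.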
Case control study
retrospective, observational study
with one group having a particular disease, and 1 group without the disease
history of exposure + other characteristics prior to onset of disease is recorded and compared between two groups
the goal is to compare how frequently the exposure/other characteristics is present in each of the groups (those with the disease and those who are healthy)
Odds ratio = ratio of odds of exposure in persons with the disease to odds of exposure in persons without the disease
odds ratio = (a/c) / (b/d) = ad/bc
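A minimal sketch of the odds ratio calculation. The card does not define a, b, c, and d, so the usual 2x2 layout is assumed: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. The function name and the counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_exposure_cases = a / c      # persons with the disease
    odds_exposure_controls = b / d   # persons without the disease
    return odds_exposure_cases / odds_exposure_controls

# Hypothetical case-control study: 40 cases (30 exposed), 160 controls (70 exposed)
ratio = odds_ratio(a=30, b=70, c=10, d=90)
```

The result is algebraically the same as (a × d) / (b × c), which avoids the intermediate divisions.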
Cross sectional study
observational study
data observations are made at one point in time
all subjects tested at relatively the same time
the goal is to describe the relationships between a disease or condition and factors of interest that exist in a specific population at a given time
Survey
can be descriptive, exploratory or experimental study
is composed of a series of questions that are posed to a group of subjects
Case series
descriptive study
a collection of observations of similar cases
Case report
descriptive study
in depth description of an individuals condition or response to treatment
they may be used to generate theories/hypotheses for future research
they cannot test hypotheses or establish cause-effect relationships
Experimental research
comparing two or more conditions to determine cause effect relationship between independent and dependent variables
single factor for independent groups
pre-posttest control group design (basic structure of RCT)
used to compare two or more groups that are formed by random assignment
one group is the experimental group, other group is the control group
these independent groups are called the treatment arms of the study
both groups are tested pre and post treatment
changes in scores on measures post minus pre testing are compared between experimental and control group
hypothesis: changes appear in experimental group and not in control group
Factorial design
incorporates two or more independent variables with independent groups of subjects randomly assigned to various combinations of levels of two or more variables