Ch 13: Handling violations of assumptions
Terms in this set (27)
options when data do not meet assumptions
1. ignore violations of assumptions (test might be robust enough)
2. transform the data (e.g., take the logarithm of the data)
3. Use a nonparametric method (method that does not require the assumption of normality)
Detecting deviations from normality
1. graphical methods
2. formal test of normality
(1) graphical methods
evaluating whether data fit a normal distribution:
a. Histogram
b. Normal quantile plot
(a) Histogram
if the histogram is strongly skewed, strongly bimodal, or has outliers, then the data are unlikely to be normal
(b) Normal quantile plot
compares each observation in the sample with its quantile expected from the standard normal distribution. Points should fall roughly along a straight line if the data come from a normal distribution; curvature or jumps in the distribution indicate potential deviations from normality
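A normal quantile plot can be sketched with scipy's `probplot`, which returns the expected quantiles, the ordered data, and a straight-line fit. The sample below is hypothetical (randomly generated), just to illustrate the mechanics:

```python
# Sketch: a normal quantile (Q-Q) plot with scipy, using made-up sample data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=5, size=40)  # hypothetical normal sample

# probplot returns (theoretical quantiles, ordered data) and a line fit (slope, intercept, r)
(theoretical_q, ordered_data), (slope, intercept, r) = stats.probplot(sample, dist="norm")

# Points near the fitted straight line suggest normality;
# r close to 1 indicates a good linear fit.
print(round(r, 3))
```

Passing `plot=plt` (with matplotlib) would draw the plot; here only the fit statistics are inspected.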
2. Formal test of normality
tests the following hypotheses:
Ho: the data are sampled from a population having a normal distribution
Ha: the data are sampled from a population not having a normal distribution
should be used with caution: with small n the test may not have enough power to detect non-normality, while with a large sample size it may reject even when the deviation is too small to matter; e.g., the Shapiro-Wilk test
Shapiro-Wilk test
evaluates the goodness of fit of a normal distribution to a set of data randomly sampled from a population
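A sketch of the Shapiro-Wilk test using scipy's `shapiro`, run on two hypothetical samples (one normal, one strongly right-skewed) to show both outcomes:

```python
# Sketch: Shapiro-Wilk test of normality with scipy, on made-up samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
normal_sample = rng.normal(size=50)        # hypothetical normal data
skewed_sample = rng.exponential(size=50)   # hypothetical right-skewed data

# H0: the data were sampled from a population with a normal distribution
stat_n, p_n = stats.shapiro(normal_sample)
stat_s, p_s = stats.shapiro(skewed_sample)

# A small p-value rejects H0 (evidence of non-normality)
print(round(p_n, 4), round(p_s, 4))
```

As the card warns, interpret the p-value in light of the sample size, not in isolation.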
When to ignore violations of assumptions
when methods are robust: a method is robust if the answer it gives is not sensitive to violations of its assumptions; robustness applies only to methods for means (i.e., not to the F-test for testing variances); with unequal standard deviations, results are still approximately valid given moderate sample sizes (≥30 in each group) that are approximately equal, even when there is a 3-fold difference between the standard deviations (otherwise, use Welch's test or try to transform the data)
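Welch's t-test, the fallback the card names for unequal standard deviations, is available in scipy via `equal_var=False`. A minimal sketch on hypothetical groups with a 3-fold difference in standard deviation:

```python
# Sketch: Welch's t-test for unequal variances, on made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group1 = rng.normal(10, 1, 30)   # hypothetical group, sd = 1
group2 = rng.normal(10, 3, 30)   # hypothetical group, sd = 3 (a 3-fold difference)

# equal_var=False selects Welch's test, which does not pool the two variances
t_welch, p_welch = stats.ttest_ind(group1, group2, equal_var=False)
print(round(p_welch, 3))
```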
data transformation
changes each measurement by the same mathematical formula; used to improve the fit of the normal distribution to the data and to make standard deviations more similar across groups
1. log
2. arcsine
3. square root
4. reciprocal
5. square
6. antilog
1. log transformation
Y' = ln[Y]; note that the mean of ln[Y] is not the ln of the mean of Y, so you can't compare population means this way on the original scale; applied when all values are greater than zero; if the data include zero, then Y' = ln[Y + 1]
useful when:
- measurements are ratios or products of variables
- the frequency distribution of the data is skewed to the right
- the group with the larger mean also has the larger standard deviation
- the data span several orders of magnitude
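A quick sketch of the log transformation on hypothetical right-skewed data, showing how it pulls the skewness toward zero:

```python
# Sketch: log transformation of right-skewed, positive data (made-up values).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
y = rng.lognormal(mean=0, sigma=1, size=200)  # hypothetical right-skewed data

y_log = np.log(y)   # Y' = ln[Y]; use np.log(y + 1) if zeros are present

# The transformed data should be far less skewed than the original
print(round(float(stats.skew(y)), 2), round(float(stats.skew(y_log)), 2))
```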
2. arcsine transformation
used on data that are proportions:
p' = arcsin[√p]
3. square root transformation
used when data are counts:
Y' = √(Y+1/2)
helps equalize standard deviations between groups when the group with the higher mean also has the higher standard deviation
4. reciprocal transformation
when the data are skewed right:
Y' = 1/Y
numbers must be positive
5. square transformation
Y' = Y^2
numbers must be positive
6. antilog
if the square transformation doesn't work:
Y' = e^Y
Confidence intervals with transformations
calculate the CI on the transformed scale, then back-transform the lower and upper limits
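The back-transformation step can be sketched as follows for a log-transformed sample; the data are hypothetical, and note that the back-transformed interval is a CI for the geometric mean of Y, not the arithmetic mean:

```python
# Sketch: 95% CI computed on the log scale, then back-transformed (made-up data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
y = rng.lognormal(mean=1.0, sigma=0.5, size=25)  # hypothetical positive data

y_log = np.log(y)
n = len(y_log)
mean = y_log.mean()
se = y_log.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)   # two-sided 95% critical value

lower_log, upper_log = mean - t_crit * se, mean + t_crit * se
# Back-transform the limits (gives a CI for the geometric mean of Y)
lower, upper = np.exp(lower_log), np.exp(upper_log)
print(round(lower, 2), round(upper, 2))
```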
nonparametric methods
a nonparametric method makes fewer assumptions than standard parametric methods about the distributions of the variable; usually based on ranks
ranks
data points are ranked from smallest to largest; frees us from making assumptions about the probability distributions of the measurements
sign test
compares the median of a sample to a constant specified in the null hypothesis; measurements above the null-hypothesized median are "+" and those below are "-"; used in place of a one-sample t-test or paired t-test when the normality assumption cannot be met; has very little power; if n ≤ 5, it is impossible to reject the null
Ho: median difference .... = 0
calculations based on binomial test (p = 0.5)
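Since the sign test reduces to a binomial test with p = 0.5 on the number of "+" signs, it can be sketched with scipy's `binomtest`; the measurements and null median below are made up:

```python
# Sketch: a sign test via the binomial test (p = 0.5), on made-up data.
from scipy import stats

null_median = 50  # hypothetical H0: population median = 50
data = [52, 55, 49, 61, 58, 53, 47, 60, 56, 54]  # hypothetical measurements

above = sum(y > null_median for y in data)   # "+" signs
below = sum(y < null_median for y in data)   # "-" signs (ties are dropped)

# Two-sided binomial test on the number of "+" signs, with p = 0.5 under H0
result = stats.binomtest(above, n=above + below, p=0.5)
print(round(result.pvalue, 4))
```

With 8 of 10 signs positive, the two-sided p-value is about 0.109, illustrating the card's point about the test's low power.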
Mann-Whitney U-test
compares the distributions of two groups; it does not require as many assumptions as the two-sample t-test; if the two distributions have the same shape, then it compares the locations (medians or means) of the two groups
Ho: the two groups have the same distribution
Mann-Whitney U-test calculations
1. Rank data from smallest to largest; all data
2. calculate the rank-sum for one of the two groups
3. use rank-sum to calculate U1
4. calculate U2
5. choose the larger of U1 or U2 as our test statistic
6. determine the P-value by comparing the observed U with the critical value of the null distribution for U (U_(α(2), n1, n2))
rank-sum
R1 = sum of all the ranks from group 1
calculating U1
U1 = n1·n2 + (n1(n1+1))/2 - R1
number of times that a data point from sample 1 is smaller than a data point from sample 2 if we compare all possible pairs of points taken one from each sample
calculate U2
U2 = n1·n2 - U1
tied ranks
assign the average of the ranks that the tied points would have received
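The six steps above can be sketched directly, then checked against scipy's `mannwhitneyu`; the two samples are hypothetical:

```python
# Sketch: Mann-Whitney U by hand, following the steps above (made-up data).
import numpy as np
from scipy import stats

group1 = [3.1, 4.5, 2.8, 5.0, 3.9]   # hypothetical sample 1
group2 = [6.2, 5.8, 4.9, 7.1, 6.5]   # hypothetical sample 2

# Steps 1-2: rank all data together (ties would get average ranks), take group 1's rank-sum
all_data = np.concatenate([group1, group2])
ranks = stats.rankdata(all_data)          # rankdata assigns average ranks to ties
R1 = ranks[:len(group1)].sum()

# Steps 3-4: compute U1 and U2
n1, n2 = len(group1), len(group2)
U1 = n1 * n2 + n1 * (n1 + 1) / 2 - R1
U2 = n1 * n2 - U1

# Step 5: the test statistic is the larger of U1 and U2
U = max(U1, U2)
print(U)
```

For these values U = 24, which also equals scipy's U statistic or its complement n1·n2 − U, depending on which group scipy treats as "x".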
Assumptions
1. both samples are random samples from their populations
MW U-test: the distributions have the same shape (same variance and skew)
Type I and Type II error rates of nonparametric methods
By using only the ranks, nonparametric tests use less information from the data, causing the tests to have less power; less power means a lower probability of rejecting a false null hypothesis, i.e., a higher Type II error rate
nonparametric tests are typically less powerful than parametric tests