margin of error
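in statistics, the amount added to and subtracted from a sample estimate to form a confidence interval; it reflects how much the estimate is expected to vary from sample to sample because of random sampling.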
simple random sample
every member of the population has a known and equal chance of selection, and every possible sample of the same size is equally likely to be chosen
systematic sampling
A procedure in which the selected sampling units are spaced regularly throughout the population; that is, every nth unit is selected, usually after a random starting point.
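A minimal sketch of systematic sampling in Python. The function name `systematic_sample`, the `step` parameter, and the population of 100 numbered units are assumptions made for illustration; the random starting point is a common convention rather than something the card specifies.

```python
import random

def systematic_sample(population, step, rng=random):
    """Pick every step-th unit, beginning at a random position in the first step units."""
    if step < 1:
        raise ValueError("step must be at least 1")
    start = rng.randrange(step)      # random start in 0 .. step - 1
    return population[start::step]   # regularly spaced units from there on

units = list(range(1, 101))                        # hypothetical population: units 1..100
sample = systematic_sample(units, 10, random.Random(42))
print(sample)                                      # 10 units, each 10 apart
```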
stratified random sampling
divides the population into separate groups, called strata, and then selects a simple random sample from each stratum
cluster sampling
A probability sampling technique in which clusters of participants within the population of interest are selected at random, followed by data collection from all individuals in each cluster.
Nonresponse
occurs when an individual chosen for the sample can't be contacted or refuses to participate
treatment vs. control group
receives treatment being studied vs. receives no treatment or a different treatment
Case-control study
An observational study in which people who have an outcome (cases) are compared with people who do not (controls) to look for factors that may be associated with the outcome.
observational study
observes individuals and measures variables of interest but does not attempt to influence the responses
experimental study
the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables
matched pair study
a design in which one creates a set of two participants who are highly similar on a key trait and then randomly assigns individuals in the pair to different groups
symmetric data set
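a data set whose distribution is roughly a mirror image on either side of its center; the mean and median are approximately equal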
skewed data set
a data set whose distribution has a longer tail on one side; the mean is pulled toward the long tail, so it typically differs from the median
Boxplots
Used to display one quantitative variable. The data are divided into quartiles. A box is drawn from the first quartile (Q1) to the third quartile (Q3), covering the middle two quarters of the data, with a line at the median. Whiskers extend from the box to cover the lowest and highest quarters.
5-number summary
consists of the minimum and maximum, the quartiles Q1 and Q3, and the median
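The five-number summary can be sketched with Python's standard `statistics` module. Textbooks use several slightly different quartile conventions; the `method="inclusive"` choice below is an assumption, and the data values are made up for illustration.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

data = [7, 1, 9, 3, 5, 2, 8, 4, 6]   # made-up values
summary = five_number_summary(data)
print(summary)
```

These five numbers are also the values a boxplot draws: the box runs from Q1 to Q3, with a line at the median.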
SLD
Specific Learning Disability
mean
the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores
variance
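Variance is a measure of how spread out a data set is: the average of the squared deviations from the mean (with n - 1 in the denominator for a sample). In statistical modeling, it is high model variance (predictions that change a lot from one training sample to another), not low variance, that can be a sign of over-fitting.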
deviation squared
s^2 denotes the sample variance: the sum of the squared deviations from the mean, divided by n - 1
standard deviation
a measure of variability that describes an average distance of every score from the mean
Steps:
1. Find the mean.
2. Find the deviation of each value from the mean: value - mean.
3. Square the deviations.
4. Sum the squared deviations.
5. Divide the sum by (the number of values) - 1. This is the variance.
6. Take the square root of the variance. The result is the standard deviation.
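The six steps above can be sketched directly in Python. The function name and the list of scores are made up for illustration; the result is cross-checked against the standard library's `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

def sample_standard_deviation(values):
    """Follow the six steps: mean, deviations, squares, sum, divide by n - 1, square root."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n                        # step 1
    deviations = [v - mean for v in values]       # step 2
    squared = [d * d for d in deviations]         # step 3
    total = sum(squared)                          # step 4
    variance = total / (n - 1)                    # step 5
    return math.sqrt(variance)                    # step 6

scores = [2, 4, 4, 4, 5, 5, 7, 9]                 # made-up scores
sd = sample_standard_deviation(scores)
print(round(sd, 3), round(statistics.stdev(scores), 3))
```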
Empirical rule
The rule gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve
normal percentile
Gives the percentage of values in a standard Normal distribution that fall at or below a given z-score
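A short sketch of looking up a normal percentile with `statistics.NormalDist` (Python 3.8+). The helper names `normal_percentile` and `percent_within` are assumptions made for illustration; the second helper recovers the empirical-rule percentages.

```python
from statistics import NormalDist

def normal_percentile(z):
    """Percentage of a standard Normal distribution at or below z."""
    return 100 * NormalDist().cdf(z)

def percent_within(k):
    """Percentage of a standard Normal distribution within k standard deviations of the mean."""
    return normal_percentile(k) - normal_percentile(-k)

print(round(normal_percentile(0), 1))       # z = 0 sits at the 50th percentile
for k in (1, 2, 3):
    print(k, round(percent_within(k), 1))   # roughly 68, 95, and 99.7
```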
Odds (Statistical)
are an expression of relative probabilities, generally quoted as the odds in favor. The odds (in favor) of an event or a proposition is the ratio of the probability that the event will happen to the probability that the event will not happen.
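As a small illustration, odds in favor are the probability of the event divided by the probability of its complement. The function names and the probability 0.75 are assumptions chosen for the example.

```python
def odds_in_favor(p):
    """Odds in favor of an event with probability p: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

def probability_from_odds(odds):
    """Convert odds in favor back to a probability: odds / (1 + odds)."""
    return odds / (1 + odds)

odds = odds_in_favor(0.75)           # 0.75 / 0.25, i.e. odds of 3 to 1
print(odds, probability_from_odds(odds))
```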
Risk
a quantification of a situation's risk using statistical methods. These methods can be used to estimate a probability distribution for the outcome of a specific variable, or at least one or more key parameters of that distribution, and from that estimated distribution a risk function can be used to obtain a single non-negative number representing a particular conception of the risk of the situation.
increased risk
(relative risk - 1.0) × 100: the percentage by which the conditional proportion is greater in one category of x than in another; describes pattern and strength, in percent.
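A sketch of the increased-risk formula, taking relative risk to be the ratio of the two conditional proportions. The counts (30 of 100 in one group, 20 of 100 in the other) and the function names are hypothetical.

```python
def relative_risk(events_a, total_a, events_b, total_b):
    """Risk (proportion with the outcome) in group A divided by risk in group B."""
    risk_a = events_a / total_a
    risk_b = events_b / total_b
    return risk_a / risk_b

def increased_risk_percent(rr):
    """(relative risk - 1.0) x 100."""
    return (rr - 1.0) * 100

rr = relative_risk(30, 100, 20, 100)    # hypothetical counts: risks of 0.30 and 0.20
increase = increased_risk_percent(rr)
print(round(rr, 2), round(increase, 1)) # a relative risk of about 1.5, i.e. about 50% higher
```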
natures line
...
Simpson's Paradox
when a trend that appears within each of several groups reverses or disappears once the groups are combined, so the group-level averages appear to contradict the overall average
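Simpson's Paradox is easier to see with numbers. The counts below are invented purely for illustration: treatment A has the higher success rate in both the easy and the hard cases, yet B looks better overall because most of B's cases are easy ones.

```python
# Invented counts: (successes, total) for each treatment within each group.
results = {
    "easy": {"A": (9, 10), "B": (80, 90)},
    "hard": {"A": (30, 90), "B": (3, 10)},
}

def rate(successes, total):
    return successes / total

def overall_rate(treatment):
    """Success rate after pooling the easy and hard groups."""
    successes = sum(results[group][treatment][0] for group in results)
    total = sum(results[group][treatment][1] for group in results)
    return rate(successes, total)

for group in results:
    print(group, round(rate(*results[group]["A"]), 3), round(rate(*results[group]["B"]), 3))
print("overall", overall_rate("A"), overall_rate("B"))
```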
correlation
A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other.
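One common way to put a number on this is Pearson's correlation coefficient r, which ranges from -1 to 1. Whether the card has Pearson's r specifically in mind is an assumption; the function name and data below are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either list is constant.
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]        # made-up data
score = [2, 4, 6, 8, 10]       # rises in a perfect straight line with hours
r = pearson_r(hours, score)
print(round(r, 3))
```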
Contingency tables
tables that display counts of observations cross-classified by two or more categorical variables, with one variable's categories in the rows and the other's in the columns
false positive
Assessment error in which a test result is positive (a condition is reported as present) when the condition is actually absent.
false negative
Assessment error in which no pathology is noted (that is, test results are negative) when one is actually present.
2x2 table
...
observed frequencies
actual frequencies obtained from a sample
expected frequency
the frequency expected in a category if the sample data represent the population
chi-square statistic
The statistic used to test the statistical significance of the observed association in a cross-tabulation. It assists in determining whether a systematic association exists between the two variables
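A minimal sketch of computing the chi-square statistic from a table of observed frequencies. It assumes the usual test of independence, where each expected frequency is (row total × column total) / grand total; the 2x2 counts and the function name are hypothetical.

```python
def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (obs - expected) ** 2 / expected
    return chi_square

table = [[30, 70],    # hypothetical counts: group 1, outcome yes / no
         [20, 80]]    # hypothetical counts: group 2, outcome yes / no
chi2 = chi_square_statistic(table)
print(round(chi2, 3))
```

A larger statistic means the observed counts sit further from what independence would predict.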
p-value
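The probability, assuming the null hypothesis is true, of obtaining a result at least as extreme as the one actually observed. A small p-value means the observed data would be unlikely if the null hypothesis were true.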
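As an illustration, a two-sided p-value for a z test statistic can be sketched with `statistics.NormalDist`. This assumes the test statistic follows a standard Normal distribution when the null hypothesis is true; the function name is made up.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability, under a standard Normal null distribution, of a statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(1.96)
print(round(p, 3))   # close to 0.05
```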
incompatible outcome
...
independent outcomes
two or more outcomes such that the realization of one of the outcomes does not affect the realization of the other outcomes
Addition Rule of Probability
Used to determine the probability that at least one of two events will occur.
P(A or B) = P(A) + P(B) - P(A and B)
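The addition rule can be checked on a small example. The sketch below assumes one roll of a fair six-sided die, with A = "even number" and B = "greater than 4"; the events and names are chosen only for illustration.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}        # one roll of a fair die (assumed)

def probability(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

even = {2, 4, 6}                     # event A
greater_than_four = {5, 6}           # event B

# Subtract the overlap so that the outcome 6 is not counted twice.
p_a_or_b = (probability(even) + probability(greater_than_four)
            - probability(even & greater_than_four))
print(p_a_or_b, probability(even | greater_than_four))
```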
multiplication rule
Used to determine the probability that two events both occur: P(A and B) = P(A) × P(B given A). When the events are independent, this simplifies to P(A) × P(B).
Conditional Probabilities
The probability of one event given the known outcome of a (possibly) related event.
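Conditional probability and the multiplication rule fit together as P(A and B) = P(A) × P(B given A). The sketch below assumes two cards drawn without replacement from a standard 52-card deck, with A = "first card is an ace" and B = "second card is an ace"; the scenario is chosen only for illustration.

```python
from fractions import Fraction

p_first_ace = Fraction(4, 52)                # P(A): 4 aces among 52 cards
p_second_ace_given_first = Fraction(3, 51)   # P(B given A): 3 aces left among 51 cards

# Multiplication rule: P(A and B) = P(A) * P(B given A)
p_both_aces = p_first_ace * p_second_ace_given_first

# Rearranged, the same rule defines conditional probability: P(B given A) = P(A and B) / P(A)
recovered_conditional = p_both_aces / p_first_ace

print(p_both_aces, recovered_conditional)
```

Because the first draw changes what is left in the deck, the two events are not independent, so P(B given A) is used instead of P(B).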
bell shaped curve
Also referred to as a normal distribution or normal curve, a bell-shaped curve is a perfect mesokurtic curve where the mean, median, and mode are equal.
standard error
the standard deviation of a sampling distribution
95% CI
A 95% confidence interval is a range of values produced by a method that captures the true population parameter (such as the mean) in about 95% of repeated samples. This is not the same as a range that contains 95% of the values.
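A rough sketch of a confidence interval for a mean: the sample mean plus or minus a multiplier times the standard error. It uses the Normal multiplier (about 1.96 for 95%), which is an assumption that suits larger samples; for small samples a t multiplier is normally used instead. The data and function name are made up.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the population mean using a Normal multiplier."""
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)   # SD of the sample mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin_of_error = z * standard_error
    return mean - margin_of_error, mean + margin_of_error

heights = [160, 172, 168, 181, 175, 169, 177, 164, 170, 173]   # made-up measurements
low, high = mean_confidence_interval(heights)
low90, high90 = mean_confidence_interval(heights, confidence=0.90)
print(round(low, 1), round(high, 1))
```

Lowering the confidence level to 90% gives a narrower interval from the same data.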
null hypothesis
the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
alternative hypothesis
The hypothesis that states there is a difference between two or more sets of data.
level of significance
also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference
2 sided alternative hypothesis
claims that a parameter is simply not equal to the value given by the null hypothesis -- the direction does not matter.
1-sided alternative hypothesis
claims that a parameter is either larger or smaller than the value given by the null hypothesis.
test statistic
a statistic whose value helps determine whether a null hypothesis should be rejected
two types of error
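Type I and Type II errors. A Type I error is rejecting the null hypothesis when it is actually true; a Type II error is failing to reject the null hypothesis when it is actually false. Both types of error are incorrect conclusions about the null hypothesis.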
confidence interval
the range of values within which a population parameter is estimated to lie
90% CI
A 90% confidence level means that, over many repeated samples, we would expect 90% of the interval estimates to include the population parameter.