Individuals

the objects, subjects or observations described by a data set, usually denoted by Xi

Quantitative Variable

a variable that takes on a numerical value for which it makes sense to do arithmetic operations like averaging

outlier

an individual observation that falls outside the overall pattern of the data

Type 2 Error

An error made by failing to reject (accepting) the null hypothesis when in fact it is false

Law of large numbers

a property stating that, as the number of observations increases, the mean of the observed values (a statistic) gets closer and closer to the true population mean (a parameter)
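A small simulation can make this concrete. The sketch below assumes a fair six-sided die (population mean 3.5); the function name and the seed are illustrative choices, not part of the definition.

```python
import random

def mean_of_die_rolls(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# The population mean of a fair die is (1 + 2 + ... + 6) / 6 = 3.5.
# Larger samples should tend to give averages closer to 3.5.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```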

p-value

the probability, computed assuming the null hypothesis is true, of observing a test statistic as extreme as or more extreme than the value actually observed
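As one possible illustration, the sketch below computes a two-sided p-value for a z test statistic. It assumes the statistic is standard normal when the null hypothesis is true; the function name is hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. computed under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(two_sided_p_value(1.96))  # close to 0.05
print(two_sided_p_value(0.5))   # a much larger p-value: weak evidence against the null
```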

linear relationship

the relationship between two variables when a scatterplot of them shows a straight-line pattern

data analysis

the method of collecting, organizing, and then describing data using graphs and numerical summaries in order to make statistical inferences

placebo

a dummy treatment that has NO physical effect. Used in control groups

Statistically significant

an observed effect too large to attribute plausibly to chance

extrapolation

the statistically unsound method of using a regression line to make predictions outside the range of explanatory-variable values used to fit the line

categorical variable

a variable that places each individual into one of several levels or groups. A proportion is formed by counting how many of the observations belong to each group.

central limit theorem

as the size of a random sample gets large, the sampling distribution of the sample mean becomes approximately normal, regardless of the shape of the population distribution
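The simulation below is a rough sketch of this idea. It assumes a strongly right-skewed Exponential(1) population and checks what share of sample means fall within one standard deviation of their average (about 0.68 for a normal distribution). The function names, seed, and number of repetitions are illustrative assumptions.

```python
import random
from statistics import fmean, stdev

def sample_means(n, reps=5000, seed=1):
    """Means of `reps` random samples of size n from an Exponential(1) population."""
    rng = random.Random(seed)
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

def fraction_within_one_sd(values):
    """Share of values within one standard deviation of their mean."""
    m, s = fmean(values), stdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

# For n = 1 the share reflects the skewed population itself;
# as n grows it should move toward the normal value of about 0.68.
for n in (1, 5, 40):
    print(n, round(fraction_within_one_sd(sample_means(n)), 3))
```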

standard deviation

a numerical value that measures the spread of a distribution by looking at how far the observations are from the mean
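The calculation can be sketched step by step as follows. The data are made-up observations, and the sketch assumes the sample version of the formula (divisor n - 1); the standard library's `statistics.stdev` is printed alongside as a cross-check.

```python
import math
import statistics

def sample_standard_deviation(data):
    """Square root of the sum of squared distances from the mean, divided by n - 1."""
    mean = sum(data) / len(data)
    squared_deviations = [(x - mean) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / (len(data) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations with mean 5
print(sample_standard_deviation(data), statistics.stdev(data))
```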

standard normal distribution

a bell shaped distribution with mean = 0 and standard deviation = 1

explanatory variable

a variable that is used to explain or predict the observed outcomes

correlation

a numerical value that measures the strength and direction of the linear relationship between two quantitative variables
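A minimal sketch of the computation, assuming the usual formula r = Sxy / sqrt(Sxx * Syy) and two small made-up data sets:

```python
import math

def correlation(xs, ys):
    """Correlation r between paired quantitative observations xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]                          # made-up responses
print(correlation(x, y))                     # about 0.775: fairly strong, positive
print(correlation(x, [10 - v for v in x]))   # -1.0: perfect negative straight line
```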

lurking variable

a variable that has a significant effect on the response variable in the relationship between two variables in a study, but is not one of the two variables being studied

statistical inference

drawing conclusions about a population from sample data by using statistical procedures

Simpson's Paradox

a reversal of the direction of a comparison or an association when data from several groups are combined to form a single group
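A small numerical example may help. The counts below are hypothetical, chosen only so that treatment A has the higher success rate within each group while treatment B has the higher rate once the groups are combined.

```python
# Hypothetical (successes, trials) counts for two treatments in two groups.
counts = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials

def combined_rate(treatment):
    """Success rate for one treatment after pooling all groups."""
    successes = sum(counts[g][treatment][0] for g in counts)
    trials = sum(counts[g][treatment][1] for g in counts)
    return successes / trials

for group, by_treatment in counts.items():
    print(group, {t: rate(*st) for t, st in by_treatment.items()})  # A is higher in each group
print("combined", {t: combined_rate(t) for t in ("A", "B")})         # B is higher overall
```

The reversal happens here because most of treatment A's trials come from the group with low success rates, while most of treatment B's come from the group with high success rates.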

residual

the difference between an observed value and the predicted value (y - ŷ)

variable

any measurable characteristic of an individual (quantitative or categorical)

double blind experiment

an experiment in which neither the subjects nor the people conducting the experiment know which treatment a subject receives

power

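the probability that a significance test at a fixed level of significance will correctly reject a false null hypothesis (that is, reject it when a particular alternative value of the parameter is true)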
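As a rough sketch, power can be computed in closed form for a one-sided z test of a mean. The setup assumed here (known sigma, H0: mu = mu0 against Ha: mu > mu0) and the function name are illustrative choices, not the only way power is calculated.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z test of H0: mu = mu0 vs Ha: mu > mu0 when the true mean is mu1."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)            # reject H0 when the z statistic exceeds this
    shift = (mu1 - mu0) * sqrt(n) / sigma     # mean of the z statistic when mu = mu1
    return 1 - z.cdf(z_alpha - shift)

print(z_test_power(mu0=0, mu1=0.5, sigma=1, n=25))  # roughly 0.80
print(z_test_power(mu0=0, mu1=0.5, sigma=1, n=10))  # smaller sample, lower power
```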

sample

the part of the population that we actually select and examine

undercoverage

this occurs when some groups in the population are left out of the process of choosing a sample

type 1 error

an error made by rejecting the null hypothesis when in fact it is true

Voluntary Response

a statistically unsound method of survey in which people choose themselves by responding to a general appeal

Matched Pairs Design

an experiment in which the subjects are matched in pairs and each treatment is given to one subject in each pair

Biased

a study that systematically favors certain outcomes

Distribution

the pattern of a data set. It tells us what values the variable takes and how often it takes each value, and it is described by its center, spread, and shape (for example, symmetric or skewed).

Stratified Random Sample

a sample collected by dividing the population into homogeneous groups (strata) and then using an SRS to sample from each group
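The procedure can be sketched in a few lines. The strata, their labels, and the sample size per stratum below are hypothetical; `random.sample` draws a simple random sample without replacement within each stratum.

```python
import random

def stratified_sample(strata, per_stratum, seed=0):
    """Draw an SRS of size per_stratum from each stratum (a dict of name -> members)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

# Hypothetical population split into two strata.
strata = {
    "freshmen": [f"F{i}" for i in range(200)],
    "seniors": [f"S{i}" for i in range(50)],
}
sample = stratified_sample(strata, per_stratum=5)
print(sample)
```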

Robust

describes a procedure whose results (the confidence level of an interval or the P-value of a significance test) change very little when its assumptions are violated; also used for a resistant statistical procedure

Non-Response

this occurs when an individual chosen for a sample can't be contacted or refuses to cooperate

Standard Error

the standard deviation of a statistic, estimated from a sample of data
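For the most common case, the standard error of a sample mean is s / sqrt(n). The sketch below assumes that case and uses made-up observations.

```python
import math
import statistics

def standard_error_of_mean(data):
    """Estimated standard deviation of the sample mean: s / sqrt(n)."""
    return statistics.stdev(data) / math.sqrt(len(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
print(standard_error_of_mean(data))  # about 0.756
```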

Unbiased Statistic

a statistic whose sampling distribution has a mean equal to the true value of the population parameter it estimates

Random

describes a surveying or experimental phenomenon whose individual outcomes are uncertain but which shows a regular distribution of outcomes over a large number of repetitions

Null Hypothesis

the statistical hypothesis that states that there is no effect or no change in the claimed population parameter

Population

the entire group of individuals that we want information about

Parameter

a numerical value that describes a characteristic of the population, such as its mean or a proportion

Confidence Interval

an interval computed from sample data by a method that has a known probability (the confidence level) of producing an interval containing the true population parameter
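The idea that the confidence level belongs to the method can be checked by simulation. The sketch below assumes a normal population with known sigma and a 95% z interval for the mean; the population values, seed, and function names are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def z_interval(sample, sigma, confidence=0.95):
    """z confidence interval for a population mean when sigma is known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(len(sample))
    center = fmean(sample)
    return center - margin, center + margin

def coverage(mu=10.0, sigma=2.0, n=25, reps=2000, seed=2):
    """Share of simulated samples whose 95% interval contains the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= mu <= high
    return hits / reps

print(coverage())  # should land near 0.95
```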

Probability

the proportion of times the outcome would occur in a very long series of repetitions

Critical Value (upper)

a numerical value, z* or t*, that has probability p lying to the right of it under the standard normal curve (for z*) or the t curve (for t*)
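For the standard normal case, an upper critical value can be looked up with the inverse cumulative distribution function. This sketch covers only z*, since the Python standard library has no t distribution; the function name is hypothetical.

```python
from statistics import NormalDist

def upper_critical_z(p):
    """Value z* with probability p to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - p)

for p in (0.05, 0.025, 0.005):
    print(p, round(upper_critical_z(p), 3))  # 1.645, 1.96, 2.576
```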

Statistic

a numerical value computed from a sample of data

Control Group

a group of individuals or experimental units that receives a placebo or no treatment, providing a baseline for comparison

Treatment

a specific experimental condition applied to the experimental units so that its effect on the response can be observed

Experimental Units

the individuals on which the experiment is done

Least-Squares Regression Line

the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible. Used for making predictions.
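A minimal sketch of the fit, assuming the standard formulas slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x), applied to made-up data. It also prints the residuals (observed minus predicted), which sum to zero for a least-squares line with an intercept.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]  # made-up responses
intercept, slope = least_squares_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(intercept, slope)  # about 2.2 and 0.6
print(residuals)         # observed minus predicted values
print(sum(residuals))    # essentially zero
```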

Square of the Correlation

r-squared, the fraction of variation in the values of y that is explained by the least-squares regression of y on x
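One way to see this is to compute the fraction directly as 1 - SSE / SST, where SSE is the sum of squared residuals and SST is the total variation of y about its mean. The sketch below assumes that formula and uses made-up data; the function name is hypothetical.

```python
def r_squared(xs, ys):
    """Fraction of the variation in ys explained by the least-squares line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - mean_y) ** 2 for y in ys)                               # total
    return 1 - sse / sst

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]  # made-up responses
print(r_squared(x, y))  # about 0.6, the square of r (about 0.775)
```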

Time Plot

a graph that depicts each observation against the time at which it is measured

Response Variable

a variable that measures an outcome of a study

Center, Spread, Shape

ways of describing the overall pattern of a distribution's histogram or timeplot