descriptive stats
purpose:
-to organize and understand large sets of data
-to represent and describe groups of data
inferential stats
purpose:
-to make inferences about a population from sample data
discrete variable
a variable that can take on only a finite set of separate values (can be categories, ex. male or female, or whole numbers)
continuous variable
a variable that can take on an infinite set of values, including any value between two points (ex. decimals)
quantitative variable
varies by amount; it includes both discrete and continuous variables
(ex. number of siblings, height)
qualitative variable
varies by category; it includes only discrete variables
(ex. class rank, whether or not someone is in a fraternity or sorority)
nominal scale
mutually exclusive categories
(ex. gender, political affiliation, position on a team)
ordinal scale
mutually exclusive categories and logical natural order
(ex. ranks in army, list of preferred veggies)
interval scale
mutually exclusive categories, logical natural order, and equal differences between values
(ex. Fahrenheit and Celsius temp. scales, latitude, longitude)
ratio scale
mutually exclusive categories, logical natural order, equal differences between values, and a true zero point
(ex. age, height, weight)
true experiment
-can randomly assign participants to groups
-can infer causality
quasi-experiments
when you cannot randomly assign people to groups (because you cannot manipulate the independent variable)
correlational methods
examine the relationship between 2 variables without manipulating either one
operational definitions
the precise way that you define variables in your study
(ex. ways to measure aggression)
probability sampling
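types of random samples (rare in practice)
-simple random sample (every member of the population has an equal chance of being picked, ex. by computer)
-cluster sample (randomly pick whole groups from a giant population, then sample the people in those groups)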
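The two probability samples above can be sketched in Python. This is only an illustration: the population of 4 classrooms with 5 students each, the names `population`, `simple_sample`, and `cluster_sample`, and the sample sizes are all made up for the example.

```python
import random

# Hypothetical population: 4 classrooms (clusters) of 5 students each.
population = {f"room{r}": [f"r{r}_s{i}" for i in range(5)] for r in range(4)}
everyone = [s for room in population.values() for s in room]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simple random sample: every individual has an equal chance of selection.
simple_sample = rng.sample(everyone, 6)

# Cluster sample: randomly pick whole groups, then take everyone in them.
chosen_rooms = rng.sample(sorted(population), 2)
cluster_sample = [s for room in chosen_rooms for s in population[room]]
```

In the simple random sample the 6 students can come from any room; in the cluster sample all 10 students come from the 2 rooms that were drawn.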
nonprobability sampling
types of non random samples
-convenience sample (most common)
-snowball sample (no direct access to the group -> find a few people in the group -> they refer you to other members)
confounding variables
a variable other than the IV that varies along with the IV, so it could also explain the results
experimenter bias
experimenter's expectations could influence the outcome
subject bias
tendency for subjects to behave in ways different from their normal behavior
within-subjects design
each participant experiences every level of the IV
between-subjects design
each participant experiences only one level of the IV
ungrouped frequency distribution
lists every individual x value along with its frequency
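A minimal Python sketch of an ungrouped frequency distribution, using made-up quiz scores (the data and the names `scores` and `table` are assumptions for illustration):

```python
from collections import Counter

scores = [3, 5, 4, 5, 2, 5, 4, 3]  # made-up quiz scores

freq = Counter(scores)

# One row per distinct x value, highest score first.
table = [(x, freq[x]) for x in sorted(freq, reverse=True)]

for x, f in table:
    print(x, f)
```

The frequencies add up to the number of scores (8 here).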
grouped frequency distribution
-have equal intervals
-have consecutive intervals
-have non-overlapping intervals
-have 5-8 groups typically
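The rules above can be sketched in Python. The exam scores, the interval width of 10, and the helper name `grouped_frequencies` are made up for illustration; the sketch assumes whole-number scores that all fall inside the chosen intervals.

```python
def grouped_frequencies(values, start, width, n_groups):
    """Count values in equal, consecutive, non-overlapping intervals."""
    counts = [0] * n_groups
    for v in values:
        idx = (v - start) // width  # each value lands in exactly one interval
        counts[idx] += 1
    # Each row: (lower limit, upper limit, frequency)
    return [
        (start + i * width, start + (i + 1) * width - 1, counts[i])
        for i in range(n_groups)
    ]

# Made-up exam scores between 50 and 99.
scores = [52, 55, 61, 64, 67, 70, 73, 75, 78, 81, 84, 88, 90, 95, 99]

table = grouped_frequencies(scores, start=50, width=10, n_groups=5)
for low, high, f in table:
    print(f"{low}-{high}: {f}")
```

With 5 groups of width 10, the intervals are 50-59, 60-69, and so on, so no score can fall into two groups.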
histogram
bar graph where the bars touch
bar chart
bar graph used when you have a nominal or ordinal variable on the x axis (the bars do not touch)
frequency polygon
like a histogram, but instead of drawing bars you plot a dot at each frequency and connect the dots (line graph)
normal distributions
-bell shaped
-unimodal
-symmetrical
-asymptotic
unimodal
1 prominent peak
bimodal
2 prominent peaks
symmetrical
same on both sides of an imaginary line (mirror image)
skewed
not the same on both sides; the distribution has a tail that extends to the right (positive skew) or to the left (negative skew)
kurtosis
how peaked or flat the distribution is
mode
most frequently occurring score (if there are 2 modes report both)
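A short Python sketch of finding the mode, including the two-mode case. The data are made up; `multimode` comes from Python's standard `statistics` module (Python 3.8+).

```python
from statistics import multimode

one_mode = multimode([2, 3, 3, 4, 5])         # a single most frequent score
two_modes = multimode([1, 1, 2, 3, 3])        # a tie, so report both modes
nominal = multimode(["red", "blue", "red"])   # works with category labels too
```

Because the mode only counts how often each value occurs, it works even when the values are names rather than numbers.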
advantages of mode
-is unaffected by extreme scores
-is an actual score in the distribution
-can use it with nominal, ordinal, and I/R data
disadvantages of mode
-may not represent data well
-can easily change when 1 or 2 scores are added or changed
median
-middle most score
-score that divides the distribution in half
-if there is an odd number of scores, it is the one in the middle
-if there is an even number of scores, it is the average of the two numbers in the middle
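The odd and even rules above can be sketched in Python. The function name `median_score` and the data are made up for illustration.

```python
def median_score(scores):
    """Middle score of the sorted list; average of the two middle scores if n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the one in the middle
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle two

odd_case = median_score([7, 1, 5])           # sorted: 1, 5, 7
even_case = median_score([1, 3, 5, 8])       # middle two are 3 and 5
with_extreme = median_score([1, 3, 5, 800])  # the extreme score does not move it
```

In the even case the result (4.0) is not one of the actual scores, which is the "value may not exist as a real score" point.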
advantages of median
-is unaffected by extreme scores
-can use with ordinal and I/R data
-is stable
disadvantages of median
-may not represent data well
-value may not exist as a real score
-cannot use with nominal data
mean
arithmetic average equal to the sum of all values divided by the number of values
advantages of mean
-every point in the distribution contributes to it
-can be used in equations
-sample mean is an unbiased estimate of the population mean
disadvantages of mean
-is influenced by extreme scores
-can only be used with I/R data
-value may not exist in data
when does mean=mode=median?
any normal distribution or any symmetrical unimodal distribution
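A small Python check of this idea with made-up data: a symmetrical unimodal set where all three measures agree, and the same set with one extreme score swapped in, which pulls only the mean.

```python
from statistics import mean, median, mode

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]   # mirror image around 3, one peak
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 50]     # last score replaced by an extreme one

sym_stats = (mean(symmetric), median(symmetric), mode(symmetric))
skew_stats = (mean(skewed), median(skewed), mode(skewed))

print(sym_stats)   # all three are equal
print(skew_stats)  # the mean is pulled toward the tail
```

In the positively skewed set the mean moves up to 8 while the median and mode stay at 3.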
measures of central tendency
mean, median, mode
measures of variability
range, standard deviation, variance
range
distance from lowest to highest score
real range
high # minus low # plus 1
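The two range formulas, sketched in Python with made-up whole-number scores:

```python
scores = [4, 9, 6, 12, 7]  # made-up scores

score_range = max(scores) - min(scores)  # distance from lowest to highest
real_range = score_range + 1             # high minus low plus 1

print(score_range, real_range)
```

Only the highest and lowest scores enter the calculation, which is why the range depends on just 2 points in the distribution.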
advantage of range
-simple
-can be used with ordinal and I/R data
disadvantages of range
-is influenced by extreme scores
-depends on only 2 points in the distribution
-sometimes is impossible to define
standard deviation
approximately the average deviation of scores about the mean
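A step-by-step Python sketch of the standard deviation, treating a made-up set of scores as the whole population (so it divides by n). It also computes the plain average of the absolute deviations to show why the definition says "approximately".

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores, treated as a population
n = len(scores)

mean = sum(scores) / n                    # 5.0
deviations = [x - mean for x in scores]   # how far each score is from the mean

variance = sum(d ** 2 for d in deviations) / n   # mean of the squared deviations
sd = math.sqrt(variance)                         # back in the units of the scores

mean_abs_dev = sum(abs(d) for d in deviations) / n  # literal average deviation

print(sd, mean_abs_dev)
```

Here the standard deviation is 2.0 and the literal average deviation is 1.5: close, but not identical, because squaring gives extra weight to scores far from the mean.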
degrees of freedom
the number of values in a data set that are free to vary once a statistic of the data (ex. the mean) is fixed
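A small Python illustration of degrees of freedom, using made-up scores. Once the mean is fixed, the deviations from it must sum to zero, so after n - 1 of them are known the last one has no freedom left.

```python
scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores
n = len(scores)
mean = sum(scores) / n

deviations = [x - mean for x in scores]
total = sum(deviations)            # deviations about the mean sum to zero

# Pretend only the first n - 1 deviations are known.
known = deviations[:-1]
forced_last = -sum(known)          # the last one is forced by the others

df = n - 1                         # only n - 1 values were free to vary
print(forced_last, deviations[-1], df)
```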
advantages of standard deviation
-is in the same units as the scores
-can be used in equations
-every point in the distribution contributes to it
disadvantages of standard deviation
-is influenced by extreme scores
-need a normal distribution and I/R data
-sample standard deviation is a biased estimate of the population standard deviation when you divide by n; the usual correction divides by n - 1 instead
inflection point
point where the curve changes from concave to convex (on a normal curve, 1 standard deviation above and below the mean)
variance
a measure of the width of the distribution equal to the mean of the squared deviations about the mean
advantages of variance
-can be used in equations
-every point in the distribution contributes to it
disadvantages of variance
-is influenced by extreme scores
-need a normal distribution and I/R data
-sample variance is a biased estimate of the population variance unless you divide by n - 1 instead of n
-is not in the same units as the scores (it is in squared units)
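A short Python comparison of the two variance formulas (dividing by n versus n - 1), using made-up scores treated as a sample. `pvariance` and `variance` are from Python's standard `statistics` module and are used here only as a cross-check.

```python
import math
from statistics import pvariance, variance

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores
n = len(sample)
mean = sum(sample) / n

ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations

divide_by_n = ss / n                  # population formula
divide_by_n_minus_1 = ss / (n - 1)    # sample formula (uses degrees of freedom)

print(divide_by_n, divide_by_n_minus_1)
```

Dividing by n - 1 always gives the larger value, which offsets the tendency of a sample to underestimate the spread of its population.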