Statistics 1, Test 1

Statistics
A set of mathematical procedures for organizing,
summarizing, and interpreting information
Population
A group of two or more individuals or things that
share one or more common characteristics
Sample
A subgroup of two or more individuals or things
from a population
Representative Sample
A subgroup of two or more individuals or things
randomly and independently selected from a
population
Parameter
A value, usually numerical, that describes a
population.
Statistic
A value, usually numerical, that describes a
sample.
Data
Measurements or observations
Descriptive Statistics
Statistical procedures used to summarize, organize and
simplify data.
Inferential Statistics
Techniques that allow us to study samples and then
make a generalization about the population from which
they were selected.
Sampling error
The discrepancy, or amount of error, that exists
between a sample statistic and the
corresponding population parameter
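A minimal sketch of sampling error for the mean, assuming a made-up population of 100 scores and a random sample of 10. The function name `sampling_error` and the data are hypothetical, chosen only for illustration.

```python
import random
import statistics

def sampling_error(population, sample_size, seed=0):
    """Return (parameter, statistic, error) for the mean of one random sample."""
    rng = random.Random(seed)                    # fixed seed so the draw is repeatable
    sample = rng.sample(population, sample_size) # random selection without replacement
    parameter = statistics.mean(population)      # describes the population
    statistic = statistics.mean(sample)          # describes the sample
    return parameter, statistic, statistic - parameter

population = list(range(1, 101))  # hypothetical population: scores 1..100
parameter, statistic, error = sampling_error(population, sample_size=10)
print(parameter, statistic, error)
```

A different seed gives a different sample and usually a different error, which is the point: the statistic varies from sample to sample while the parameter stays fixed.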
Variable
A characteristic or condition that changes or
has different values for different individuals
Constant
A characteristic or condition that does not vary
but is the same for every individual.
Correlational Research
Naturalistic observation - Observing naturally occurring
phenomena
· Archival research
· Case histories
· Surveys
Is variable X associated with variable Y?
Correlational Research
Example: Is watching WWE related to aggressive
behavior in children?
Perhaps higher levels of WWE viewing are
associated with higher levels of aggressive
behavior
Correlational Research
A good place to start & explore (especially if
relevant theory is lacking)
- Often cheapest & easiest option
- Can look at more variables simultaneously /
greater realism
Fewer ethical issues...
Experimental Research
Analyzing causality
Manipulation of IV
- Random assignment to treatments
- Control of extraneous variables
- Eliminating threats to validity
Experimenter bias
• Affects treatments
• Affects measurements
Experimental Research Limitations
Often harder, more time consuming, &/or
expensive
- Some variables can't be manipulated
- Difficult to control for all extraneous variables
(hold them all constant)
- Difficult to make the experimental situation
realistic
- Procedural mistakes or flawed sampling can
make findings useless
Some variables shouldn't be manipulated, or
only with great caution
Theories
allow us to generate testable
hypotheses
• When hypotheses are supported by
evidence, the theory is considered the best
explanation so far
• When hypotheses are not supported, the theory is
refined or discarded
Evidence
Observations must be
- Public
- Replicable
• Can be repeated by others using
same procedures
- Reliable
• Consistent across measurements
&/or observers
Operational Definitions
Defining a construct in terms of the operation(s)
used to measure it
Ways to measure fear? attraction?
Poor operational definitions
--> bad research/misleading results
- Problems with reliability of observations
- Problems with interpretation of results
Independent variable
The variable that is manipulated by the researcher.
The independent variable consists of the antecedent
conditions that were manipulated prior to observing
the dependent variable.
Dependent variable
The variable that is measured and observed in order to assess the effect of the treatment.
Control Condition
-Individuals who do not receive experimental treatment
Experimental Condition
-Individuals who receive experimental treatment.
Confounding variable
An uncontrolled variable that is unintentionally
allowed to vary systematically with the
independent variable.
Discrete variable
each item corresponds to a
separate value of the variable
Values/categories do NOT overlap or "touch"
on the scale.
There are no values "in between"
Continuous variable
each item corresponds to an
interval on the scale of measurement.
Intervals defined by upper & lower real limits
Real limits are continuous ("they touch")
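A small sketch of real limits, assuming scores are recorded to the nearest whole unit so each limit sits half a unit from the score. The helper name `real_limits` and the `unit` parameter are hypothetical.

```python
def real_limits(score, unit=1):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return score - half, score + half

lower, upper = real_limits(8)   # a recorded score of 8 covers the interval 7.5 to 8.5
next_lower, _ = real_limits(9)  # the next interval starts where this one ends
print(lower, upper, next_lower)
```

The upper real limit of one score equals the lower real limit of the next, which is what "they touch" means.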
Nominal Scale
o Identification (Name): allows you to label
observations.
o Applies to category labels & numbers used as
labels.
o Examples: college major, any "yes/no,"
participant number, etc...
Ordinal Scale
o Magnitude (Order): allows you to make
statements about relative size or ordering/ranking
of observations.
o Applies to ordered category labels & numbers
used as ranks.
o Examples: any "high/medium/low," class rank,
etc...
Interval Scale
o Equal Intervals: allows you to assume that the
distances between numbers on the measurement
scale are equal & correspond to equal differences
in the variable being measured.
o Applies to numbers, often scores or ratings.
o Examples: attitude as preference ratings, etc...
Ratio Scale
o Absolute Zero: allows you to assume that a score
of "0" on a variable really means the absence of
that property, & that you can make meaningful
ratio statements.
o Applies to numbers, often tallies or physical
measurements.
o Examples: stress as change in BP, memory
performance as # of words recalled, etc...
Frequency Distribution Table
shows a range of possible
values for a single variable (X) & the number of
observations of each value (f).
Σf = N
X    f
1    14
2    33
(1 = Male, 2 = Female)
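As a rough sketch, the table on this card can be rebuilt from raw observations with `collections.Counter`. The raw list below is reconstructed from the card's frequencies (14 ones and 33 twos); the variable names are illustrative.

```python
from collections import Counter

# Raw observations reconstructed from the card: 1 = Male, 2 = Female
scores = [1] * 14 + [2] * 33

freq = Counter(scores)   # maps each value of X to its frequency f
N = sum(freq.values())   # sum of f equals the number of observations

print("X\tf")
for x in sorted(freq, reverse=True):  # highest X on top, as in a standard table
    print(f"{x}\t{freq[x]}")
print("N =", N)
```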
Percentile rank
The percentile rank of a particular score is defined as the percentage of individuals in the distribution with scores at or below that value.
cf
# of observations at or below a given value of X
add up frequencies from bottom of table upwards
cum%
percentage of observations at or below a given value of X
divide cf/N for each row (better—less rounding error)
OR add up percentages from bottom of table upwards
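The cf and cum% columns can be sketched as below, assuming a made-up frequency table for scores 1 to 5. The function name `cumulative_table` and the frequencies are hypothetical. The sketch uses the cf/N route for cum%, and reads a score's percentile rank straight from its cum% value.

```python
def cumulative_table(freq):
    """freq maps X -> f. Returns rows (X, f, cf, cum%) with the highest X first."""
    n = sum(freq.values())
    rows = []
    cf = 0
    for x in sorted(freq):       # start at the lowest X (bottom of the table)
        cf += freq[x]            # running count of observations at or below X
        rows.append((x, freq[x], cf, 100 * cf / n))
    return rows[::-1]            # flip so the highest X is on top

freq = {1: 2, 2: 3, 3: 5, 4: 6, 5: 4}   # hypothetical table, N = 20
table = cumulative_table(freq)

print("X\tf\tcf\tcum%")
for x, f, cf, cum_pct in table:
    print(f"{x}\t{f}\t{cf}\t{cum_pct}")

percentile_rank = {x: cum_pct for x, _, _, cum_pct in table}
```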
Normal Distribution
mean = median = mode
• symmetrical
• Many complexly-determined
traits are normally distributed,
e.g. IQ & SAT scores.
Symmetrical Bimodal Distribution
mean = median, with 2 modes
Bimodal distributions may
also be asymmetrical (mean ≠
median), & multimodal
distributions are possible.
Positively Skewed Distribution
(tail --> positive end of scale)
Mode<median<mean
Negatively Skewed Distribution
(tail --> negative end of scale)
Mean<median<mode
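A quick check of the two orderings above, using two small made-up data sets (one is the mirror image of the other). The orderings are the typical pattern for skewed distributions, not a guarantee for every data set.

```python
import statistics

def center(scores):
    """Return (mode, median, mean) for a list of scores."""
    return (statistics.mode(scores),
            statistics.median(scores),
            statistics.mean(scores))

# Hypothetical positively skewed scores: long tail toward high values
positive = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Mirror image (20 - x): long tail toward low values
negative = [20 - x for x in positive]

pos_mode, pos_median, pos_mean = center(positive)
neg_mode, neg_median, neg_mean = center(negative)

print(pos_mode, pos_median, pos_mean)   # mode < median < mean
print(neg_mode, neg_median, neg_mean)   # mean < median < mode
```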
Mean
Sum of all data points divided by the number of data points.
Advantage: More informative than median & mode
Takes all the observations/scores into account.
Takes the distance & direction of deviations/errors into
account.
Advantage: More uses than median & mode
Necessary for calculating many inferential statistics.
Limitation: Not always possible to calculate a mean
(scale)
Mean can only be calculated for interval/ratio level data
Need a different measure for nominal or ordinal level
data
Limitation: Not always appropriate to use the mean to
describe the middle of a distribution
Mean is sensitive to extreme values or "outliers"
Mean does not always reflect where the scores "pile up"
Need a different measure for asymmetrical distributions
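A minimal sketch of the mean on made-up scores, showing two points from this card: the signed deviations from the mean cancel out, and a single extreme value pulls the mean away from where most scores sit.

```python
def mean(scores):
    """Sum of all scores divided by the number of scores."""
    return sum(scores) / len(scores)

scores = [1, 2, 3, 4, 5]                # hypothetical scores
m = mean(scores)
deviations = [x - m for x in scores]    # signed distance of each score from the mean

with_outlier = [1, 2, 3, 4, 50]         # same scores, but one extreme value
m_outlier = mean(with_outlier)

print(m, sum(deviations), m_outlier)
```

In the second list four of the five scores fall below the mean, so the mean no longer describes where the scores pile up.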
Median
Divides the distribution exactly in half; 50th percentile
Odd # of scores & no "pileup" or ties at the middle:
median = the middle score
Even # of scores & no "pileup" or ties at the middle:
median = the average of the 2 middle scores
Is the most central, representative value in skewed
distributions
Advantage: Can be calculated when the mean cannot
Can be used with ranks (as well as interval/ratio data)
Can be used with open-ended distributions
Example: # of siblings (5+ siblings?)
Limitation: Not as informative as the mean
Takes only the observations/scores around the 50th %ile
into account.
Provides no information about distances between
observations.
Limitation: Fewer uses than the mean
Median is purely descriptive
Median can only be calculated for *ordinal &
interval/ratio* data
Use the median when you cannot calculate a
mean or when the distributions of interval/ratio
data are skewed by extreme values.
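The odd/even rule on this card can be sketched as follows. This assumes no pileup or ties at the middle, as the card states; it does not interpolate with real limits. The function name `simple_median` is illustrative.

```python
def simple_median(scores):
    """Median by the odd/even rule, assuming no ties at the middle."""
    if not scores:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd N: the middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: average of the 2 middle scores

odd_result = simple_median([3, 1, 5])             # middle of 1, 3, 5
even_result = simple_median([1, 2, 3, 4])         # average of 2 and 3
skewed_result = simple_median([1, 2, 3, 4, 50])   # unaffected by the extreme value
print(odd_result, even_result, skewed_result)
```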
Mode
The most frequently occurring score(s)
Advantage: Simple to find
Advantage: Can be used with any scale of measurement
Median can only be calculated for ordinal &
interval/ratio data
Mean can only be calculated for interval/ratio data
Mode can be calculated for nominal/ordinal/interval/ratio data
Advantage: Can be used to indicate >1 most frequent
value
Use to indicate bimodality, multimodality
Limitation: Not as informative as the mean or median
Takes only the most frequently observed X values into
account.
Provides no information about distances between
observations or the # of observations above/below the
mode.
Limitation: Fewer uses than the mean
Mode is purely descriptive.
Need to calculate a mean to use with inferential
statistics.
Use the mode when you cannot compute a mean or
median, or with the mean/median to describe a
bimodal/multimodal distribution.
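A short sketch of the mode on nominal data, where a mean or median cannot be computed. The college-major labels are made up; `statistics.multimode` (Python 3.8+) returns every value tied for most frequent, which is useful for showing bimodality.

```python
import statistics

# Hypothetical nominal data: college majors
majors = ["psych", "bio", "psych", "math", "bio"]

modes = statistics.multimode(majors)   # all values tied for the highest frequency
is_bimodal = len(modes) == 2

print(modes, is_bimodal)
```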