45 terms

Statistics

A set of mathematical procedures for organizing, summarizing, and interpreting information.

Population

A group of two or more individuals or things that share one or more common characteristics.

Sample

A subgroup of two or more individuals or things from a population.

Representative Sample

A subgroup of two or more individuals or things **randomly and independently selected** from a population.

Parameter

A value, usually numerical, that describes a population.

Statistic

A value, usually numerical, that describes a sample.

Data

Measurements or observations

Descriptive Statistics

Statistical procedures used to summarize, organize, and simplify data.

Inferential Statistics

Techniques that allow us to study samples and then make generalizations about the populations from which they were selected.

Sampling error

The discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.
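
The parameter, statistic, and sampling error cards can be tied together with a small sketch. The population below is made up for illustration (random integer scores, a fixed seed, and a sample size of 25 are all assumptions, not values from these notes).

```python
import random
import statistics

# Hypothetical population of 1,000 scores (illustrative values only).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(50, 150) for _ in range(1000)]

# Parameter: a value that describes the whole population.
mu = statistics.mean(population)

# Statistic: the same kind of value, computed on a random sample.
sample = rng.sample(population, 25)
m = statistics.mean(sample)

# Sampling error: the discrepancy between the statistic and the parameter.
sampling_error = m - mu
print(f"parameter = {mu:.2f}, statistic = {m:.2f}, "
      f"sampling error = {sampling_error:+.2f}")
```

Drawing a different sample would typically give a different statistic, and so a different sampling error, while the parameter stays fixed.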

Variable

A characteristic or condition that changes or has different values for different individuals.

Constant

A characteristic or condition that does not vary but is the same for every individual.

Correlational Research

Naturalistic observation - observing naturally occurring phenomena

· Archival research

· Case histories

· Surveys

Is variable X associated with variable Y?

Correlational Research

Example: Is watching WWE related to aggressive behavior in children?

Perhaps higher levels of WWE viewing are associated with higher levels of aggressive behavior.
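
One common way to quantify "is X associated with Y?" is Pearson's r. The notes do not name a specific statistic, so treat this as one possible sketch; the viewing hours and aggression scores are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: weekly viewing hours and an aggression rating per child.
viewing_hours = [0, 1, 2, 3, 4, 5]
aggression_score = [1, 2, 2, 4, 5, 5]

r = pearson_r(viewing_hours, aggression_score)
print(f"r = {r:.2f}")  # a positive r means higher X goes with higher Y
```

A value of r near +1 or -1 describes a strong association. On its own it says nothing about which variable, if either, causes the other.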

Correlational Research

A good place to start & explore (especially if relevant theory is lacking)

- Often cheapest & easiest option

- Can look at more variables simultaneously / greater realism

- Fewer ethical issues

Experimental Research

Analyzing causality

- Manipulation of the independent variable (IV)

- Random assignment to treatments

- Control of extraneous variables

- Eliminating threats to validity
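
Random assignment to treatments can be sketched as "shuffle, then deal out". The function name `randomly_assign`, the participant IDs, and the two condition labels are hypothetical choices for this example.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

# Hypothetical participant IDs P01..P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, ["control", "experimental"], seed=1)

for condition, members in groups.items():
    print(condition, members)
```

Because chance alone decides who lands in which group, participant characteristics should not vary systematically with the treatment.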

Experimenter bias

• Affects treatments

• Affects measurements

Experimental Research Limitations

- Often harder, more time-consuming, &/or expensive

- Some variables can't be manipulated

- Difficult to control for all extraneous variables (hold them all constant)

- Difficult to make the experimental situation realistic

- Procedural mistakes or flawed sampling can make findings useless

- Some variables shouldn't be manipulated, or only with great caution

Theories

allow us to generate testable hypotheses

• When hypotheses are supported by evidence, the theory is considered the best explanation so far

• When hypotheses are not supported, the theory is refined or discarded

Evidence

Observations must be

- Public

- Replicable

• Can be repeated by others using the same procedures

- Reliable

• Consistent across measurements &/or observers

Operational Definitions

Defining a construct in terms of the operation(s) used to measure it

Ways to measure fear? attraction?

Poor operational definitions

--> bad research/misleading results

- Problems with reliability of observations

- Problems with interpretation of results

Independent variable

The variable that is **manipulated** by the researcher.

The independent variable consists of the antecedent conditions that were manipulated prior to observing the dependent variable.

Dependent variable

The variable that is **measured** and observed in order to assess the effect of the treatment.

Control Condition

Individuals who do not receive the experimental treatment.

Experimental Condition

Individuals who receive the experimental treatment.

Confounding variable

An uncontrolled variable that is unintentionally allowed to vary systematically with the independent variable.

Discrete variable

each item corresponds to a separate value of the variable

Values/categories do NOT overlap or "touch" on the scale.

There are no values "in between"

Continuous variable

each item corresponds to an interval on the scale of measurement.

Intervals defined by upper & lower real limits

Real limits are continuous ("they touch")
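
A minimal sketch of real limits, assuming the usual convention that they sit half a measurement unit below and above the recorded score. The helper name `real_limits` and the example score of 150 are hypothetical.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score recorded to `unit`."""
    half = unit / 2
    return score - half, score + half

# A score recorded as 150 to the nearest whole unit covers 149.5 to 150.5.
lower, upper = real_limits(150)
print(lower, upper)

# The upper limit of 150 is the lower limit of 151: the intervals "touch".
print(real_limits(150)[1] == real_limits(151)[0])
```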

Nominal Scale

o Identification (Name): allows you to label observations.

o Applies to category labels & numbers used as labels.

o Examples: college major, any "yes/no," participant number, etc.

Ordinal Scale

o Magnitude (Order): allows you to make statements about relative size or ordering/ranking of observations.

o Applies to ordered category labels & numbers used as ranks.

o Examples: any "high/medium/low," class rank, etc.

Interval Scale

o Equal Intervals: allows you to assume that the distances between numbers on the measurement scale are equal & correspond to equal differences in the variable being measured.

o Applies to numbers, often scores or ratings.

o Examples: attitude as preference ratings, etc.

Ratio Scale

o Absolute Zero: allows you to assume that a score of "0" on a variable really means the absence of that property, & that you can make meaningful ratio statements.

o Applies to numbers, often tallies or physical measurements.

o Examples: stress as change in BP, memory performance as # of words recalled, etc.

Frequency Distribution Table

shows a range of possible values for a single variable (X) & the number of observations of each value (f).

Σf = N

X    f
1    14
2    33

1 = Male, 2 = Female
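
A frequency distribution table can be built by tallying raw observations. The raw list below is reconstructed from the counts on this card (14 observations coded 1, 33 coded 2); the use of `collections.Counter` is just one convenient way to tally.

```python
from collections import Counter

# Raw observations, coded as on the card: 1 = Male, 2 = Female.
scores = [1] * 14 + [2] * 33

freq = Counter(scores)    # maps each value of X to its frequency f
N = sum(freq.values())    # the frequencies add up to N

print("X  f")
for x in sorted(freq):
    print(f"{x}  {freq[x]}")
print("N =", N)
```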

Percentile rank

The percentile rank of a particular score is the percentage of individuals in the distribution with scores at or below that value.

cf

# of observations at or below a given value of X

add up frequencies from bottom of table upwards

cum%

percentage of observations at or below a given value of X

divide cf by N for each row & multiply by 100 (better; less rounding error)

OR add up percentages from bottom of table upwards
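
The cf and cum% columns can be sketched as a running total. The frequency table here is hypothetical (five values of X, N = 20), and the table is printed with the highest X at the top so that "bottom upwards" means lowest X first.

```python
# Hypothetical frequency table: value of X -> frequency f.
freq = {5: 1, 4: 5, 3: 8, 2: 4, 1: 2}
N = sum(freq.values())

# cf: add up frequencies starting from the lowest X (bottom of the table).
cf = {}
running = 0
for x in sorted(freq):
    running += freq[x]
    cf[x] = running

# cum%: compute from cf / N for each row rather than summing rounded percentages.
cum_pct = {x: 100 * cf[x] / N for x in cf}

print("X  f  cf  cum%")
for x in sorted(freq, reverse=True):
    print(f"{x}  {freq[x]}  {cf[x]:>2}  {cum_pct[x]:.0f}")
```

In this made-up table, 14 of the 20 observations are at or below X = 3, so its cum% (and the percentile rank of a score of 3) is 70%.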

Normal Distribution

mean = median = mode

• symmetrical

• Many complexly-determined traits are approximately normally distributed, e.g. IQ & SAT scores.

Symmetrical Bimodal Distribution

mean = median, with 2 modes

Bimodal distributions may also be asymmetrical (mean ≠ median), & multimodal distributions are possible.

Positively Skewed Distribution

(tail --> positive end of scale)

Mode < median < mean

Negatively Skewed Distribution

(tail --> negative end of scale)

Mean < median < mode
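
The orderings on the two skew cards can be checked on small made-up data sets. Both lists are invented for illustration, and the second is simply the mirror image of the first. The orderings are what skewed data typically show, not a guarantee for every data set.

```python
import statistics

def summary(scores):
    """Return (mean, median, mode) for a list of scores."""
    return (statistics.mean(scores),
            statistics.median(scores),
            statistics.mode(scores))

# Hypothetical positively skewed scores: tail toward the high end.
positive = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Mirror image (20 - x): tail toward the low end.
negative = [20 - x for x in positive]

pos_mean, pos_median, pos_mode = summary(positive)
neg_mean, neg_median, neg_mode = summary(negative)

print("positive skew:", pos_mode, "<", pos_median, "<", pos_mean)
print("negative skew:", neg_mean, "<", neg_median, "<", neg_mode)
```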

Mean

Sum of all data points divided by the number of data points.

Advantage: More informative than median & mode

- Takes all the observations/scores into account.

- Takes the distance & direction of deviations/errors into account.

Advantage: More uses than median & mode

- Necessary for calculating many inferential statistics.

Limitation: Not always possible to calculate a mean (scale)

- Mean can only be calculated for **interval/ratio** level data

- Need a different measure for nominal or ordinal level data

Limitation: Not always appropriate to use the mean to describe the middle of a distribution

- Mean is sensitive to extreme values or "outliers"

- Mean does not always reflect where the scores "pile up"

- Need a different measure for asymmetrical distributions
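
Two points from the mean card can be seen in a short sketch: the mean uses the distance and direction of every deviation (they sum to zero around it), and a single extreme value can pull it away from where the scores pile up. The scores are hypothetical.

```python
import statistics

scores = [2, 4, 6, 8, 10]
mean = statistics.mean(scores)  # sum of scores divided by number of scores

# Deviations keep both distance and direction; around the mean they sum to 0.
deviations = [x - mean for x in scores]
print(mean, deviations, sum(deviations))

# Replace the top score with an outlier: the mean moves, the median does not.
with_outlier = [2, 4, 6, 8, 100]
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```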

Median

Divides the distribution exactly in half; 50th percentile

Odd # of scores & no "pileup" or ties at the middle: median = the middle score

Even # of scores & no "pileup" or ties at the middle: median = the average of the 2 middle scores

Is the most central, representative value in skewed distributions

Advantage: Can be calculated when the mean cannot

- Can be used with ranks (as well as interval/ratio data)

- Can be used with open-ended distributions

- Example: # of siblings (5+ siblings?)

Limitation: Not as informative as the mean

- Takes only the observations/scores around the 50th %ile into account.

- Provides no information about distances between observations.

Limitation: Fewer uses than the mean

- Median is purely descriptive

Median can only be calculated for *ordinal & interval/ratio* data

Use the median when you cannot calculate a mean or when the distributions of interval/ratio data are **skewed by extreme values.**
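
The odd/even rules on the median card can be sketched as below. This covers only the case the card describes (no ties piled up at the middle). The function name `simple_median` is hypothetical, and using infinity as a stand-in for an open-ended "5+" category is an assumption made for this example.

```python
import math

def simple_median(scores):
    """Median by the odd/even rule; assumes no pileup of ties at the middle."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd N: the middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: average of middle two

print(simple_median([3, 9, 5, 1, 7]))  # odd number of scores
print(simple_median([3, 9, 5, 1]))     # even number of scores

# Open-ended distribution: one person reports "5+" siblings (stand-in: infinity).
siblings = [0, 1, 1, 2, math.inf]
print(simple_median(siblings))  # the median is still defined; the mean is not
```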

Mode

The most frequently occurring score(s)

Advantage: Simple to find

Advantage: Can be used with any scale of measurement

- Median can only be calculated for ordinal & interval/ratio data

- Mean can only be calculated for interval/ratio data

- Mode can be calculated for **nominal/ordinal/interval/ratio data**

Advantage: Can be used to indicate >1 most frequent value

- Use to indicate bimodality, multimodality

Limitation: Not as informative as the mean or median

- Takes only the most frequently observed X values into account.

- Provides no information about distances between observations or the # of observations above/below the mode.

Limitation: Fewer uses than the mean

- Mode is purely descriptive.

- Need to calculate a mean to use with inferential statistics.

Use the mode when you cannot compute a mean or median, or with the mean/median to describe a bimodal/multimodal distribution.
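
A short sketch of the mode on nominal data and on a bimodal set of scores, using Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The college majors and scores are invented for illustration.

```python
import statistics

# Nominal data: no mean or median is possible, but the mode still works.
majors = ["psych", "bio", "psych", "art", "bio", "psych"]
top_major = statistics.mode(majors)
print(top_major)

# Bimodal scores: multimode reports every value tied for most frequent.
scores = [2, 3, 3, 3, 5, 6, 8, 8, 8, 9]
modes = statistics.multimode(scores)
print(modes)
```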