Statistics
Terms in this set (82)
raw scores
The scores initially measured in a study
frequency (f)
the number of times each score occurs in a set of data, one way of digesting raw scores
frequency distribution
a distribution showing the number of times each score occurs in the data
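As a minimal sketch of building a frequency distribution from raw scores (the scores below are hypothetical, invented for illustration):

```python
from collections import Counter

# Hypothetical raw scores, invented for illustration.
raw_scores = [2, 4, 4, 5, 5, 5, 7]

# Counter maps each score X to its frequency f.
frequency = Counter(raw_scores)

# Print the frequency distribution in score order.
for score in sorted(frequency):
    print(score, frequency[score])
```

The frequencies always add up to N, the total number of scores.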
bar graph
A graph showing a vertical bar over each X score, with adjacent bars not touching; used with NOMINAL or ORDINAL scores. The bars do not touch because these are discrete scales: a score can be in one group or the next, but not in between.
histogram
A frequency graph similar to a bar graph but with adjacent bars touching, so there is no gap between the values on the x-axis (1, 2, 3, 4, 5); used with a small range of interval or ratio scores
Frequency polygon
A frequency graph showing a data point above each score, with adjacent points connected by straight lines; communicates that the variable is continuous; used with many different interval or ratio scores.
data point
a point plotted on a graph to represent a pair of x and y scores
Grouped distribution
individual scores are first combined into small groups, and then the total frequency (or other information) is reported for each group; for example, an X of 2 would fall into a 0-4 group
Normal curve
The symmetrical, bell-shaped curve produced by graphing a normal distribution
Normal distribution
A set of scores in which the middle score has the highest frequency and, proceeding toward higher or lower scores, the frequencies at first decrease slightly but then decrease drastically; the highest and lowest scores have very low frequencies that never reach zero; very common in BEHAVIORAL research
Tail of the distribution
the far-left or far-right ends of the distribution containing the relatively low frequency, extreme scores
Negatively skewed distribution
an asymmetrical distribution with low-frequency, extreme low scores, but without corresponding low-frequency, extreme high scores; its polygon has only one pronounced tail, over the lower scores.
Positively skewed distribution
an asymmetrical distribution with low-frequency, extreme high scores, but without corresponding low-frequency, extreme low scores; its polygon has only one pronounced tail, over the higher scores.
bimodal distribution
a distribution forming a symmetrical polygon with two humps of relatively high-frequency scores; the scores at the center of each hump have the same frequency
Relative frequency
the proportion of the time that a score occurs in a distribution, computed as f/N. Any proportion is a decimal number between 0 and 1 that indicates a fraction of the total.
proportion of area under the curve
The proportion of the total area under the curve at certain scores, which represents the relative frequency of those scores.
percentile
The percentage of scores in a sample that are below a certain score.
cumulative frequency
The number of scores in the data that are at or below a particular score
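Relative frequency, cumulative frequency, and percentile can be sketched together. The scores are hypothetical, the helper names are invented for this example, and the percentile follows the card's wording (scores below a given score):

```python
from collections import Counter

# Hypothetical raw scores, invented for illustration.
raw_scores = [2, 4, 4, 5, 5, 5, 7, 8]
N = len(raw_scores)
frequency = Counter(raw_scores)

def relative_frequency(score):
    # f / N: the proportion of the time the score occurs.
    return frequency[score] / N

def cumulative_frequency(score):
    # Number of scores at or below the given score.
    return sum(f for x, f in frequency.items() if x <= score)

def percentile(score):
    # Percentage of scores below the given score (the card's definition).
    below = sum(f for x, f in frequency.items() if x < score)
    return 100 * below / N

print(relative_frequency(5), cumulative_frequency(5), percentile(5))
```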
Σ (sigma)
summation sign, indicating to add together the scores
ΣX
"sum of X": find the sum of the X scores; if we must round off, round to the second decimal place
measure of central tendency
a statistic that indicates the "location" of the distribution on a variable; scores are locations, and the difference between two scores is the distance between them
mode
the score with the highest frequency in the data; the preferred measure of central tendency when scores reflect a nominal scale of measurement (participants are categorized using a qualitative variable). Limitations: it does not take most of the data into account, and it does not work with distributions in which every score has the same frequency, such as 4, 4, 5, 5, 6, 6, 7, 7
unimodal
A distribution whose frequency polygon has only one hump and thus has one score qualifying as the mode
bimodal
A distribution whose frequency polygon has two humps, each centered over a score having the highest frequency, so these are the two modes.
median (Mdn)
the score located at the 50th percentile; a distribution can have only one; usually falls around where most of the scores are located; the preferred measure of central tendency when the data are ordinal scores. Limitation: it reflects only the frequency of scores and does not consider their mathematical values
mean
The score located at the mathematical center of a distribution; the average; the distribution's balance point. $\bar{X} = \frac{\Sigma X}{N}$. Compute only with interval or ratio data; the distribution should be symmetrical and unimodal
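A short sketch of the three measures of central tendency using Python's standard `statistics` module. The scores are hypothetical; the second list is the tied example from the mode card:

```python
import statistics

# Hypothetical interval/ratio scores, invented for illustration.
scores = [2, 3, 3, 4, 8]

mean = statistics.mean(scores)      # sum of X divided by N
median = statistics.median(scores)  # score at the 50th percentile
mode = statistics.mode(scores)      # most frequent score

# When every score is tied for the highest frequency, the mode is
# uninformative: multimode returns all of the tied scores.
tied = statistics.multimode([4, 4, 5, 5, 6, 6, 7, 7])

print(mean, median, mode, tied)
```

Here the single high score of 8 pulls the mean above the median, which is one reason the mean is recommended only for roughly symmetrical distributions.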
deviation
The distance a score is from the mean; indicates how different the score is from the mean. To compute, subtract the mean from the raw score: $X - \bar{X}$. A positive deviation indicates a score greater than the mean; a negative deviation indicates a score less than the mean.
sum of the deviations around the mean
The sum of all the differences between the scores and the mean, $\Sigma (X - \bar{X})$; always equals 0; helps explain error
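A quick check, with hypothetical scores, that the deviations around the mean sum to zero:

```python
# Hypothetical scores, invented for illustration.
scores = [2, 3, 3, 4, 8]
mean = sum(scores) / len(scores)

# Deviation of each score: X minus the mean.
deviations = [x - mean for x in scores]

# Positive and negative deviations cancel out.
sum_of_deviations = sum(deviations)
print(deviations, sum_of_deviations)
```

With other data the sum may differ from zero by a tiny floating-point rounding error.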
line graph
A graph of an experiment's results when the independent variable is an interval or ratio variable; plotted by connecting data points with straight lines. In contrast, a bar graph is used when the independent variable is nominal or ordinal
the scale of measurement of the dependent variable
determines which measure of central tendency to compute (mean, median, mode)
the scale of the independent variable
determines the type of graph to create.
$\mu$
the symbol used to represent the population mean
measures of variability
statistics that summarize the extent to which the scores in a distribution differ from one another. They communicate three aspects of the data: 1) how consistent the scores are (variability is the opposite of consistency), 2) how accurately a measure of central tendency describes the distribution, and 3) how spread out the data are
range
the distance between the highest and lowest scores in the data; usually used as the sole measure of variability ONLY with nominal and ordinal data
Sample Variance ($S_X^2$)
The average of the squared deviations of scores around the sample mean. The S indicates that we are describing the sample; the X indicates that it is the variance of the X scores. $S_X^2 = \frac{\Sigma (X - \bar{X})^2}{N}$. Communicates relative variability: the larger the variance, the more the scores are spread out. Limitations: the value is unrealistically large, it is not a good representation of how much each individual score varies, and because it is a squared number it is hard to interpret
Standard Deviation ($S_X$)
the square root of the sample variance; interpreted as the "average" deviation, or as close as we come to it. $S_X = \sqrt{S_X^2}$. Allows us to envision the variability as well as see how accurate the mean is
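A sketch of the sample variance and standard deviation with a final division by N, using hypothetical scores. The standard library's `statistics.pvariance` also divides by N and is used only as a cross-check:

```python
import math
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 3, 3, 4, 8]
N = len(scores)
mean = sum(scores) / N

# Average of the squared deviations around the mean (divide by N).
sample_variance = sum((x - mean) ** 2 for x in scores) / N

# Square root of the variance: the "average" deviation.
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library, which also divides by N here.
library_variance = statistics.pvariance(scores)

print(sample_variance, sample_sd, library_variance)
```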
Area under curve
In a normal distribution, about 68% of scores fall within +/- 1 SD of the mean, about 95% fall within +/- 2 SD, and about 99.7% fall within +/- 3 SD
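These proportions can be computed for a perfect normal curve with `statistics.NormalDist` (Python 3.8+). This is a sketch for the theoretical curve; real data will only approximate it:

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    # Area under the curve between -k and +k standard deviations.
    return curve.cdf(k) - curve.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```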
Population standard deviation ($\sigma_X$)
The square root of the population variance, or the square root of the average squared deviation of scores around the population mean
population variance ($\sigma_X^2$)
The average squared deviation of scores around the population mean
biased estimators
The formula for the variance or standard deviation involving a final division by N, used to describe a sample, but tends to UNDERESTIMATE the population variability
unbiased estimators
The formula for the variance or standard deviation involving a final division by N-1, calculated using sample data to estimate the population variability
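The difference between the two formulas can be seen with the standard library, where `statistics.pvariance` divides by N and `statistics.variance` divides by N-1. The scores are hypothetical:

```python
import statistics

# Hypothetical sample scores, invented for illustration.
scores = [2, 3, 3, 4, 8]

# Biased estimator: final division by N (describes the sample).
biased_variance = statistics.pvariance(scores)

# Unbiased estimator: final division by N - 1 (estimates the population).
unbiased_variance = statistics.variance(scores)

print(biased_variance, unbiased_variance)
```

The N-1 version is always the larger of the two, which offsets the tendency of the N version to underestimate the population variability.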
relative standing
reflects the systematic evaluation of a score by comparing it to a sample or population in which the score occurs
z-score
standard score: the statistic that indicates the distance a score is from its mean when measured in standard deviation units. For a sample, $z = \frac{X - \bar{X}}{S_X}$; for a population, $z = \frac{X - \mu}{\sigma_X}$. Always include the positive or negative sign
z-distribution
the distribution produced by transforming all raw scores in the data into z-scores; describes relative standing and is used for 1) comparing scores from different distributions and 2) computing the relative frequency of scores
characteristics of z-distribution
1) A z-distribution always has the same shape as the raw score distribution, 2) the mean of any z-distribution is 0, 3) the standard deviation of any z-distribution is 1.
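A sketch that transforms hypothetical raw scores into z-scores and checks that the resulting z-distribution has a mean of 0 and a standard deviation of 1:

```python
import statistics

# Hypothetical raw scores, invented for illustration.
scores = [2, 3, 3, 4, 8]
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)  # standard deviation with division by N

# z = (X - mean) / SD for every raw score.
z_scores = [(x - mean) / sd for x in scores]

z_mean = statistics.mean(z_scores)
z_sd = statistics.pstdev(z_scores)
print([round(z, 2) for z in z_scores], z_mean, z_sd)
```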
standard normal curve
a perfect normal curve that serves as a model of any approximately normal z-distribution, requires that we have a large sample of interval or ratio scores that come close to forming the appropriate distribution
sampling distribution of means
the frequency distribution of all possible sample means that occur when an infinite number of samples of the same size N are selected from one raw score population
Central Limit Theorem
A statistical principle that defines the shape, the mean, and the standard deviation of a sampling distribution. We know that: 1) a sampling distribution is always an approximately normal distribution, 2) the mean of the sampling distribution equals the mean of the underlying raw score population used to create the sampling distribution, 3) the standard deviation of the sampling distribution is mathematically related to the standard deviation of the raw score population
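A small simulation can illustrate these points. It is a sketch under stated assumptions: the "population" is a fair six-sided die (a stand-in, not from the card), samples are drawn with replacement, and the relation checked is the usual standard error formula, $\sigma_{\bar{X}} = \sigma_X / \sqrt{N}$, which the card does not spell out:

```python
import math
import random
import statistics

# Hypothetical raw score population: outcomes of a fair die.
population = [1, 2, 3, 4, 5, 6]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

N = 30                  # size of each sample
rng = random.Random(0)  # fixed seed so the run is repeatable

# Approximate the sampling distribution with 5000 sample means.
sample_means = [sum(rng.choices(population, k=N)) / N for _ in range(5000)]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
expected_se = sigma / math.sqrt(N)

print(mu, mean_of_means)
print(expected_se, sd_of_means)
```

A finite simulation only approximates the infinite set of samples in the definition, so the printed pairs should be close but not identical.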
inferential statistics
procedures to decide whether sample data represent a particular relation in the population
parametric statistics
inferential procedures that require certain assumptions about the raw score population represented by the sample (1. the population of dependent scores should be at least approximately normally distributed; 2. the scores should be interval or ratio scores); used when we compute the mean
nonparametric statistics
inferential procedures that do not require stringent assumptions about the raw score population represented by the sample; used with the median and mode.
experimental hypothesis
two statements describing the predicted relationship that may or may not be demonstrated by a study
two-tailed test
The type of inferential test used when we do not predict whether dependent scores will increase or decrease. $H_a \neq 0$
one-tailed test
The type of inferential test used when we do predict whether dependent scores will increase or decrease; for example, $H_a < 0$ when a decrease is predicted
statistical hypothesis
statements that describe the population parameters the sample statistics represent if the predicted relationship exists or does not exist
alternative hypothesis
describes the population parameters the sample data represent if the predicted relationship occurs in nature. $H_a \neq 0$
null hypothesis
the hypothesis describing the population parameters the sample data represent if the predicted relationship does not exist in nature. $H_0 = 0$ (the subscript is 0, as in zero relationship)
z-test
the parametric procedure used in a single-sample experiment when the standard deviation of the raw score population is known. It has 4 assumptions: 1) we have randomly selected the sample, 2) the dependent variable is at least approximately normally distributed in the population and involves an interval or ratio scale, 3) we know the mean of the population of raw scores under another condition of the independent variable, 4) we know the true standard deviation of the population described by the null hypothesis
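A minimal sketch of a two-tailed z-test, assuming the standard formulas $\sigma_{\bar{X}} = \sigma_X / \sqrt{N}$ and $z_{obt} = (\bar{X} - \mu) / \sigma_{\bar{X}}$. All numbers and variable names are hypothetical:

```python
import math
from statistics import NormalDist

# Hypothetical values, invented for illustration.
sample_mean = 105.0  # mean of the sample
mu = 100.0           # population mean described by the null hypothesis
sigma = 15.0         # known population standard deviation
N = 36               # sample size
alpha = 0.05         # criterion, two-tailed

# Standard error of the mean and the obtained z.
standard_error = sigma / math.sqrt(N)
z_obt = (sample_mean - mu) / standard_error

# Critical value that cuts off alpha / 2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

significant = abs(z_obt) > z_crit
print(z_obt, round(z_crit, 2), significant)
```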
significant
describes results that are unlikely to result from sampling error when the predicted relationship does not exist; it indicates the rejection of the null hypothesis
nonsignificant
describes results that are likely to result from sampling error when the predicted relationship does not exist; it indicates failure to reject the null hypothesis
type 1 error
rejecting the null hypothesis when it is true (saying the independent variable had an effect when it did not); with a criterion of .05, this happens only .05 of the time when the null hypothesis is true
type 2 error
retaining the null hypothesis when it is false (failing to identify that the independent variable does have an effect as predicted)
power
the probability that we will detect a relationship and correctly reject a false null hypothesis; the probability of avoiding a type 2 error
one-sample t-test
The parametric procedure used in a one-sample experiment when the standard deviation of the raw score population is estimated
estimated standard error of the mean
an estimate of the standard deviation of a sampling distribution of sample means selected from a population with an unknown variance; estimate of the standard distance that sample means can be expected to deviate from the value of the population mean stated in the null hypothesis, used in calculating the one-sample t-test
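A sketch of the estimated standard error and the obtained t for a one-sample t-test, assuming the standard formulas $s_{\bar{X}} = s_X / \sqrt{N}$ and $t_{obt} = (\bar{X} - \mu) / s_{\bar{X}}$. The scores and the value of $\mu$ are hypothetical:

```python
import math
import statistics

# Hypothetical sample and null-hypothesis mean, invented for illustration.
scores = [2, 3, 3, 4, 8]
mu = 2.0

N = len(scores)
sample_mean = statistics.mean(scores)

# Estimated population standard deviation (final division by N - 1).
estimated_sd = statistics.stdev(scores)

# Estimated standard error of the mean.
estimated_se = estimated_sd / math.sqrt(N)

t_obt = (sample_mean - mu) / estimated_se
df = N - 1
print(round(estimated_se, 4), round(t_obt, 4), df)
```

The obtained t is then compared with the critical value from a t-table for the given df; Python's standard library has no t-distribution, so that lookup is left out here.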
t-distribution
the sampling distribution of all values of t that occur when samples of a particular size are selected from the raw score population described by the null hypothesis
degrees of freedom
the number of scores in a sample that reflect the variability in the population; determines the shape of the sampling distribution when estimating $\sigma_X$
point estimation
a way to estimate a population parameter by describing a point on the variable at which the population parameter is expected to fall
interval estimation
a way to estimate a population parameter by describing an interval within which the population parameter is expected to fall
margin of error
describes an interval by stating a central value plus or minus some amount
confidence interval for $\mu$
a range of values of $\mu$ within which we are confident that the actual $\mu$ is found
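A sketch of a 95% confidence interval for the population mean, assuming the usual form $\bar{X} \pm t_{crit} \cdot s_{\bar{X}}$. The scores are hypothetical, and the critical value 2.776 is the standard two-tailed t-table entry for df = 4 at the .05 level, typed in by hand:

```python
import math
import statistics

# Hypothetical sample, invented for illustration.
scores = [2, 3, 3, 4, 8]
N = len(scores)

sample_mean = statistics.mean(scores)
estimated_se = statistics.stdev(scores) / math.sqrt(N)

# Two-tailed critical t for df = N - 1 = 4 at the .05 level (from a t-table).
t_crit = 2.776

margin_of_error = t_crit * estimated_se
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error
print(round(lower, 3), round(upper, 3))
```

The sample mean by itself is the point estimate; the interval adds the margin of error on either side of it.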
independent-samples t-test
the parametric procedure used to test sample means from two independent samples
independent samples
samples created by selecting each participant for one condition without regard to participants selected for any other condition
homogeneity of variance
the requirement that the populations represented in a study have equal variances
sampling distribution of the differences between means
shows all the differences between two means that occur when samples are drawn from the population of scores that $H_0$ says we are representing
pooled variance
the weighted average of the sample variances in a two-sample t-test
standard error of the difference
the estimated standard deviation of the sampling distribution of differences between the means
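A sketch of the pooled variance, the standard error of the difference, and the obtained t for two independent samples. It assumes the textbook formulas $s^2_{pool} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ and $s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2_{pool}(1/n_1 + 1/n_2)}$, with the null hypothesis that the difference is 0. The data are hypothetical:

```python
import math
import statistics

# Hypothetical independent samples, invented for illustration.
group1 = [2, 3, 3, 4, 8]
group2 = [1, 2, 2, 3, 2]
n1, n2 = len(group1), len(group2)

# Estimated population variances (final division by n - 1).
var1 = statistics.variance(group1)
var2 = statistics.variance(group2)

# Weighted average of the two sample variances.
pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Estimated standard deviation of the sampling distribution of differences.
se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

# Obtained t, with a null-hypothesis difference of 0.
t_obt = (statistics.mean(group1) - statistics.mean(group2)) / se_difference
df = n1 + n2 - 2
print(pooled_variance, round(se_difference, 4), round(t_obt, 4), df)
```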
related-samples t-test
the parametric procedure used for testing two related samples
related samples
samples created by matching each participant in one condition with a participant in the other condition, or by repeatedly measuring the same participants under all conditions
matched-samples design
when a participant in one condition is matched with a participant in the other condition
repeated-measures design
when the same participants are measured under all levels of the independent variable
mean difference
the mean of the differences between the paired scores in a related-samples t-test, symbolized as $\bar{D}$
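A sketch of the mean difference and the obtained t for related samples, assuming the standard formula $t_{obt} = \bar{D} / (s_D / \sqrt{N})$ with a null-hypothesis mean difference of 0. The paired scores are hypothetical:

```python
import math
import statistics

# Hypothetical repeated measures on the same five participants.
before = [5, 6, 7, 6, 8]
after = [6, 8, 10, 8, 10]

# Difference score D for each pair.
differences = [a - b for a, b in zip(after, before)]
N = len(differences)

mean_difference = statistics.mean(differences)

# Estimated standard error of the mean difference (s_D uses N - 1).
se_mean_difference = statistics.stdev(differences) / math.sqrt(N)

t_obt = mean_difference / se_mean_difference
df = N - 1
print(differences, mean_difference, round(t_obt, 4), df)
```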