Stats Exam 1
a population consists of an entire set of objects, observations or scores that have something in common. POPULATION MEASUREMENTS ARE PARAMETERS
Examples of Parameters
the mean and standard deviation
a subset of a population. Gathering data from a sample is usually a better approach than measuring the population one person at a time. SAMPLE MEASUREMENTS ARE STATISTICS
random sampling; not as popular as selecting representative portions.
qualitative, attribute, characteristic, and categorical data, SUCH AS gender, marital status, religion, political party, YES OR NO RESPONSES. USE BAR GRAPHS OR PIE GRAPHS.
ranking, rating, Likert scales (strongly disagree to strongly agree, least important to most important); variables such as income level or level of happiness (SCALE 1-5), etc. USE BAR GRAPHS OR PIE GRAPHS.
quantitative, numeric data such as GPA, age, weight, etc. USE HISTOGRAM, STEM & LEAF, BOX PLOT.
A number that describes something about the "average" score of a distribution (mean, median, mode).
sample symbol is x-bar; population symbol is μ (mu); arithmetic average found primarily for scale data; affected by outliers and skewed distributions
The arithmetic mean is what is commonly called the average. When the word "mean" is used without a modifier, it can be assumed that it refers to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The formula in summation notation is:
μ = ΣX/N where μ is the population mean and N is the number of scores.
symbol is Q2, the 50th percentile; used for scale and ordinal data. The median is the middle of a distribution: half the scores are above the median and half are below. The median is less sensitive to extreme scores than the mean, and this makes it a better measure than the mean for highly skewed distributions. For example, the median income is usually more informative than the mean income.
The mean, median, and mode are equal in symmetric normal distributions. The mean is typically higher than the median in positively skewed distributions and lower than the median in negatively skewed distributions, although this may not be the case in bimodal distributions.
has no symbol. It is the most frequently occurring score in a distribution and is used as a measure of central tendency. The mode can be found for all types of data. The mode is greatly subject to sample fluctuations and is therefore not recommended as the only measure of central tendency. A further disadvantage of the mode is that many distributions have more than one mode; these are called multimodal.
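As a rough illustration of how the three measures of central tendency react to an extreme score, here is a minimal sketch using Python's standard `statistics` module. The scores are made up for the example; the last one is a deliberate outlier.

```python
from statistics import mean, median, multimode

# Hypothetical scores (not from the course); 40 is a deliberate outlier.
scores = [2, 3, 3, 4, 5, 6, 40]
without_outlier = scores[:-1]

# Mean: sum of all scores divided by the number of scores.
print(mean(scores), mean(without_outlier))

# Median: middle score after sorting; it barely moves when the outlier is dropped.
print(median(scores), median(without_outlier))

# Mode: multimode returns a list because a distribution can have several modes.
print(multimode(scores))
```

The mean drops sharply once the outlier is removed, while the median shifts only slightly, which is why the median is preferred for highly skewed data.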
A trimmed mean is calculated by discarding a certain percentage of the lowest and the highest scores and then computing the mean of the remaining scores. For example, a 5% TM is computed by discarding the lowest and highest 2.5% of the scores and taking the mean of the remaining scores. Trimmed means are often used in Olympic scoring to minimize the effects of extreme ratings possibly caused by biased judges.
no symbol. The range is the simplest measure of spread or dispersion. It is equal to the difference between the largest and smallest values. The range can be a useful measure of spread because it is so easily understood. However, it is very sensitive to extreme scores since it is based on only two values.
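A minimal sketch of a trimmed mean, assuming the number of scores dropped from each tail is rounded down to a whole number (other conventions exist). The function name `trimmed_mean` and the judge ratings are hypothetical.

```python
def trimmed_mean(scores, percent):
    """Mean after discarding percent/2 of the scores from each end."""
    ordered = sorted(scores)
    # Number of scores dropped per tail, rounded down (an assumption).
    k = int(len(ordered) * (percent / 100) / 2)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical judge ratings with one very low and one very high score.
ratings = [1.0, 8.5, 8.6, 8.7, 8.8, 8.9, 9.0, 9.1, 9.2, 10.0]

plain = sum(ratings) / len(ratings)   # ordinary mean, pulled down by the 1.0
trimmed = trimmed_mean(ratings, 20)   # 20% TM: drops the lowest and highest rating
print(round(plain, 2), round(trimmed, 2))
```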
symbol is s; population parameter is σ. The SD is very useful in that it can be added to or subtracted from the mean to interpret variability, and it underlies the empirical rule and z-scores.
symbol is s^2; population parameter is σ^2
Standard error of the mean
symbol is s with subscript x-bar; standard error of the mean = s / √n. Measures sampling error and is used to establish confidence intervals.
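The following sketch computes the standard deviation, variance, and standard error of the mean with Python's `statistics` module. The sample values are invented for illustration. `stdev` and `variance` divide by n - 1 (sample statistics), while `pstdev` divides by N (population parameter).

```python
from math import sqrt
from statistics import pstdev, stdev, variance

# Hypothetical sample of eight scores.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

s = stdev(sample)        # sample standard deviation, s
s2 = variance(sample)    # sample variance, s^2
sigma = pstdev(sample)   # SD if these eight scores were the whole population

# Standard error of the mean: s divided by the square root of n.
se = s / sqrt(len(sample))

print(round(s, 3), round(s2, 3), sigma, round(se, 3))
```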
Q1= 25th percentile, Q2= 50th percentile (median), Q3= 75th percentile displayed in Boxplots
IQR = Q3 - Q1
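A short sketch of the quartiles and the IQR, using `statistics.quantiles`. Textbooks and software use slightly different quartile conventions; the `"inclusive"` method chosen here is an assumption, and the data are hypothetical.

```python
from statistics import quantiles

# Hypothetical, already sorted data.
data = [1, 3, 5, 7, 9, 11, 13, 15, 17]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # spread of the middle 50% of the data

print(q1, q2, q3, iqr)
```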
68-95-99.7% Rule. If the histogram of the data is approximately normal shaped, then
• mean ± 1 SD will contain about 68% of the data
• mean ± 2 SD will contain about 95% of the data
• mean ± 3 SD will contain about 99.7% of the data.
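The percentages in the rule can be checked against an exact normal distribution. This sketch uses `statistics.NormalDist` (the standard normal, mean 0 and SD 1) and computes the share of the distribution within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of the distribution lying between -k and +k standard deviations.
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} SD of the mean: {share:.1%}")
# prints 68.3%, 95.4%, and 99.7%
```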
standard score or standardized score. Tells how many standard deviations are added to or subtracted from the mean to arrive at a given value. For example, if the mean = 100 and standard deviation = 15, then a value of 130 has a z-score of +2.0 (2 standard deviations above the mean); a value of 85 has a z-score of -1.0 (1 standard deviation below the mean).
is the degree of departure from symmetry of a distribution. A positively skewed distribution has a "tail" which is pulled in the positive direction; for a distribution of exam scores, it means there are many more lower scores than in a bell-shaped normal distribution. A negatively skewed distribution has a "tail" which is pulled in the negative direction; for exam scores it means that there are more higher scores than normal.
is the degree of peakedness of a distribution. A normal distribution is a mesokurtic distribution. A pure leptokurtic distribution has a higher peak than the normal distribution and has heavier tails; lepto is a Greek prefix meaning thin. A pure platykurtic distribution has a lower peak than normal and lighter tails; platy is a Greek prefix meaning flat.
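The z-score example above can be reproduced with a one-line function. The name `z_score` is just an illustrative choice.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(130, 100, 15))  # 2.0: two standard deviations above the mean
print(z_score(85, 100, 15))   # -1.0: one standard deviation below the mean
```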
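Skewness and kurtosis can be put into numbers in several ways. The sketch below assumes the simple moment-based versions (third and fourth standardized moments, with 3 subtracted from kurtosis so that a normal distribution scores about 0); statistical packages often apply extra sample-size corrections. The function names and data are hypothetical.

```python
def central_moment(data, power):
    """Average of (x - mean) ** power over the data."""
    m = sum(data) / len(data)
    return sum((x - m) ** power for x in data) / len(data)

def skewness(data):
    # Positive: tail pulled toward high values; negative: toward low values.
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    # Above 0: leptokurtic (heavier tails); below 0: platykurtic (lighter tails).
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]          # evenly spread, no skew
right_skewed = [1, 1, 2, 2, 3, 10]   # one high score pulls the tail to the right

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```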
describes the strength of an association between two variables, and is completely symmetrical: the correlation between A and B is the same as the correlation between B and A. However, if the two variables are related, it means that when one changes by a certain amount the other changes on average by a certain amount. For example, in the children described earlier, greater height is associated, on average, with greater anatomical dead space. If y represents the dependent variable and x the independent variable, this relationship is described as the regression of y on x.
the relationship can be represented by a simple equation, which means that the average value of y is a "function" of x, that is, it changes with x. The equation represents how much y changes with any given change of x and can be used to construct a regression line on a scatter diagram; in the simplest case this is assumed to be a straight line. The direction in which the line slopes depends on whether the correlation is positive or negative. When the two sets of values increase or decrease together, the line slopes positively; when one decreases as the other increases, the line slopes negatively. BEST FIT LINE
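As a sketch of how a best fit line can be found, the code below assumes the usual straight-line form y = a + b*x and the ordinary least squares formulas for the slope b and intercept a. The function name `fit_line` and the data are hypothetical, chosen so the points fall exactly on a line.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for the line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how much y changes, on average, per one-unit change in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b

xs = [1, 2, 3, 4, 5]    # independent variable (hypothetical)
ys = [3, 5, 7, 9, 11]   # dependent variable (hypothetical)

a, b = fit_line(xs, ys)
print(a, b)  # 1.0 2.0, a positive slope: y rises as x rises
```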
CAUTION: correlation/regression analysis
A significant result tells us little about the strength of a relationship. One of the flaws is that even with a very weak relationship (say r = 0.1) we would get a significant result (p < 0.05) with a large enough sample (say n over 1000). ----> Correlation and linear regression analysis do not prove a causal relationship between x and y, as the relationship could be merely coincidental (spurious).
Coefficient of Determination
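The sample-size effect can be illustrated with the usual test statistic for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2). To stay within the standard library, this sketch approximates the two-sided p-value with a normal distribution instead of the t distribution, which is an assumption that is reasonable for large n and only rough for small n. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-sided p-value for a sample correlation r with n pairs."""
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    # Normal approximation to the t distribution (assumption; fine for large n).
    return 2 * (1 - NormalDist().cdf(abs(t)))

# The same weak correlation, r = 0.1, with a small and a large sample.
p_small = approx_p_value(0.1, 30)
p_large = approx_p_value(0.1, 1000)

print(p_small > 0.05)  # True: not significant with n = 30
print(p_large < 0.05)  # True: significant with n = 1000, yet r is still only 0.1
```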
r^2. A part of the variation in one of the variables (as measured by its variance) can be thought of as being due to its relationship with the other variable and another part as due to undetermined (often "random") causes. The part due to the dependence of one variable on the other is measured by r^2.
r^2 measures the % variation in the predicted variable (y) that is explained (or measured) by the predictor variable (x) and is the correlation coefficient squared.
It is converted to a % such that if r^2 = 0.6 then it can be said that the predictor variable x explains or measures 60% of the variation in the predicted variable y.
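A small sketch of the calculation: compute the Pearson correlation r, then square it. The function name `pearson_r` and the data are hypothetical; the data were picked so that r^2 comes out at about 0.6, matching the example above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]   # predictor variable (hypothetical)
ys = [2, 4, 5, 4, 5]   # predicted variable (hypothetical)

r = pearson_r(xs, ys)
r_squared = r ** 2
print(f"r = {r:.3f}, r^2 = {r_squared:.0%} of the variation in y explained by x")
```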
We most commonly declare statistical significance when p < 0.05 (less than 5%). The p-value is the probability, if there were really no relationship in the population, of drawing a sample that shows a relationship at least as strong as the one found in the data, so that another sample might show nothing. The significance level is the chance of a Type I error when no relationship exists: the chance of concluding we have a relationship when we do not. Social scientists often use the .05 level as a cutoff, i.e. a result this strong would arise by chance alone 5% of the time or less.