PSYC 2471: Introduction to Statistical Methods
Descriptive Statistics	Describing/summarizing sets of scores (graphs); Basic probability & standardized scores
Correlational Statistics	Assesses strength of a relationship between 2 variables; Generates optimal predictions; Forecasts size of prediction error; NO cause and effect
Inferential Statistics	Makes inferences about whole populations from samples drawn from the population; Cause and Effect
4 Levels of Measurement	Categories for assigning numerals to a set of people, objects, or events according to a set of rules that differ in their levels of sophistication.
Nominal Data	A set of labels or categories by which individual cases can be classified; the size of the category number has no meaning. Simplest. Blood types, Jersey numbers; CHI SQUARE
Ordinal Data	"Rank Order"; A higher number indicates more of the thing being measured, but there is no assurance that the steps between numbers represent equal increases in the thing being measured. Class rank, Place awarded in a contest; SPEARMAN'S RHO
Interval Data	The steps between numbers are equal: numerically equal distances represent empirically equal distances; ARBITRARY ZERO; Temperature (F/C), Calendar years; T-TEST, ANOVA, REGRESSION LINE, PEARSON R
Arbitrary Zero	A zero point assigned on a scale that does not reflect a true absence of the thing measured; e.g., zero on the Celsius scale is only the point where water freezes at sea level, not the absence of temperature
Ratio Data	All properties of ordinary arithmetic hold; Ratios are meaningful; Equal step sizes; ABSOLUTE ZERO; T-TEST, ANOVA, REGRESSION LINE, PEARSON R
Absolute Zero	The absence of something to measure; what was being measured is not present; nothing present to measure.
Grouped Frequency Distribution	One type of visual presentation of data; Presents the summary of a set of scores as efficiently as possible so the information can be easily understood and the same conclusion can be drawn from it.
Real Class Interval Size	Preferred sizes: 1, 2, 3, 5, 10, 15, 25, 50, 100
4 Characteristics of a Distribution	How scores are distributed relative to each other
1. Skewness	The extent to which the distribution departs from symmetry; Not mirror images; Not equal; A greater amount of spread in the scores on one side of the center than on the other side
Positive Skew	More extreme scores far above the mean than below the mean; More extreme high scores (tail to the right)
Negative Skew	More extreme scores far below the mean than above the mean; More extreme low scores (tail to the left)
2. Kurtosis	The extent to which a distribution is peaked or flattened relative to the normal distribution
Mesokurtic	Normal distribution; "middle" amount of kurtosis
Leptokurtic	More pointed than a normal distribution; high kurtosis
Platykurtic	Less pointed than a normal distribution; low kurtosis
3. Central Tendency	Typical/representative values for a set of data; 1 or 2 numbers to represent all the data; Mean, Median, and Mode
Mode	Most frequently occurring score; Least sensitive measure
Median	The score with an equal number of scores above and below it; Pays more attention to the value of each number than the mode does; More sensitive measure
Mean	Arithmetic average; "Average"; A balancing point; Most sensitive measure; Significantly affected by outliers
Outlier	A data point that is very different from the rest of the data; The mean will move toward it.
Unimodal	One mode; One distribution
Bimodal	Two modes; Two distributions at the same height in one graph
4. Variability	The amount of scatter around the mean; Differences between the values; Range and Standard Deviation; Sum of Squares and Mean of Squares
Range	= Highest score - Lowest score + 1; Simplest measure of variability
Standard Deviation	Describes an average distance of every score from the mean; Amount of departure from normal/average; Preferred measure of variability
Z-Score	Number of standard deviations that a raw score lies above or below the mean; Standard score
Raw Score	A data point; "X"
Percentile	The percentage of scores below a given raw score; Proportion below
Centile Point	The raw score below which a given percentage of scores will fall
Proportion Above	The proportion of scores above a given raw score; NOT percentile
Proportion Below	Percentile; the proportion of scores below a given centile point (raw score)
Proportion Between	Proportion of scores that fall between two given raw scores
Probability	Number of outcomes of interest divided by the total number of possible outcomes
"And" Rule	Sample with replacement; Independent events; Multiply
"Or" Rule	Mutually exclusive events; Add
Discrete Data	Data using only whole numbers; may have decimals as averages; Gaps between numbers; 1, 2, 3, 4...
Continuous Data	Could be any possible number; very specific; precise; 1.000001, 1.000002, 1.000003...
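The variability, z-score, proportion, and probability entries above can be tied together with a small worked example. This is only a sketch: the scores are made up, the standard deviation treats the scores as the whole group, and converting a z-score to a proportion below ASSUMES the scores are normally distributed, which these notes do not state outright.

```python
from fractions import Fraction
from statistics import NormalDist, mean, pstdev

# Hypothetical raw scores (X), made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(scores)     # mean of the scores
sd = pstdev(scores)  # standard deviation, treating the scores as the whole group

# Range as defined in these notes: highest score - lowest score + 1.
score_range = max(scores) - min(scores) + 1


def z_score(x, m, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (x - m) / sd


def proportion_below(z):
    """Proportion of scores below z, ASSUMING a normal distribution."""
    return NormalDist().cdf(z)


z = z_score(9, m, sd)            # raw score 9 expressed as a z-score
below = proportion_below(z)      # proportion below (percentile as a proportion)
above = 1 - below                # proportion above
between = proportion_below(z_score(7, m, sd)) - proportion_below(z_score(3, m, sd))

# "And" rule (independent events, multiply): two heads on two fair coin flips.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

# "Or" rule (mutually exclusive events, add): rolling a 1 or a 2 on a fair die.
p_one_or_two = Fraction(1, 6) + Fraction(1, 6)

print(score_range, z, round(below, 4), round(above, 4), round(between, 4))
print(p_two_heads, p_one_or_two)
```

Proportion between is computed as one proportion below minus another, which is why only a single lookup function is needed.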
Correlation Coefficient	Describes the direction and strength of the relationship between two variables
Pearson r	Measures the strength and direction of a relationship between two sets of data; INTERVAL/RATIO data; Positive/Negative; PARAMETRIC
Spearman's Rho	Measures the strength and direction of a relationship between two sets of data using ORDINAL data; Agreement/Disagreement; NONparametric
Linear Relationship	Relationship that can be described as a straight line
Regression Line	Summarizes the points of a scatterplot and provides the means for making predictions
Forecasting Prediction Error	Standard Error of the Estimate (Sest); Standard deviation of prediction error; Distance from points to line; PREFERRED METHOD
3rd Variable	An outside variable that affects both variables that are being measured
Predictor Variable	"X"
Criterion Variable	"Y"
Scatter plot	A graph with points plotted to show a possible relationship between two sets of data.
Negative Correlation	As X increases, Y decreases
Positive Correlation	As X increases, Y also increases
Weak Relationship	A correlation coefficient of .00-.29
Moderate Relationship	A correlation coefficient of .30-.59
Strong Relationship	A correlation coefficient of .60-.89
Very Strong Relationship	A correlation coefficient of .90-1.0
Homoscedastic	The variability (spread) of the criterion variable is the same across the entire range of the predictor
Heteroscedastic	The variability (spread) of the criterion variable is NOT the same across the entire range of the predictor
Coefficient of Determination	Proportion of variance in the criterion that IS ACCOUNTED FOR by a knowledge of the predictor; EXPLAINED; SHARED; between 0 and 1; r^2
Coefficient of Non Determination	Proportion of variance in the criterion that is NOT ACCOUNTED FOR by a knowledge of the predictor; NOT EXPLAINED; NOT SHARED; 1 - r^2
Coefficient of Alienation	Amount of ERROR REMAINING after using a regression equation; K = square root of (1 - r^2)
Index of Forecasting Efficiency	ERROR ELIMINATED by using a regression equation to make predictions; E = 1 - square root of (1 - r^2)
Prediction Error	Distance between the regression line and data points
Strong Correlation	r near 1 or -1; smaller prediction error; Steep slope
Weak Correlation	r near 0; larger prediction error; Flat slope
Regression Equation	Makes optimal predictions of one variable using another variable; mathematical formula for a very specific line (X predicts Y); Interval/Ratio data
Slope	b; amount of steepness given to a regression line
Y-Intercept	a; the value of Y at the point where the regression line crosses the Y-axis
Best Guess	Will always be the mean of Y; average of the Y's; a higher X does not mean you should guess a higher value for Y
Fisher-Z	Standard correlation coefficient used to average Pearson r's
Average the Pearson r's	"Transform --> Average --> Un-transform"
Parametric Statistics	Lots of rules; VERY STRICT
NonParametric Statistics	Fewer rules; LESS STRICT
Guessing Error	Sy
Y'	Predicted score; predicted criterion
a	Y-intercept
b	Slope
X'	Predicted X
r	Pearson r
rho	Spearman's rho
Population	A complete set of people, objects, or events having some common observable characteristic
Sample	A subset of a population
Random Sample	A sample in which each member of the population has an equal chance of being included in the sample
Sample Error	Margin of error associated with the sample: how much the samples differ from the population; appears in the equation for the confidence interval for the difference in two population means; you cannot be 100% sure when a sample is taken
Confidence Level	The estimated probability that a population parameter lies within a given confidence interval: 95% sure you are right; 5% chance you are wrong (error rate; level of significance)
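The regression and correlation entries above (Pearson r, slope b, Y-intercept a, r^2, K, E, Sest, and the Fisher-Z "transform, average, un-transform" recipe) can be sketched numerically. Everything here is illustrative: the X and Y values are made up, the names are hypothetical, Sest is computed in its simple form Sy * K with Sy as the population standard deviation of Y, and the Fisher-Z average is an unweighted one, which assumes the samples being averaged are the same size.

```python
from math import atanh, sqrt, tanh
from statistics import mean, pstdev

# Hypothetical paired scores: x is the predictor, y is the criterion.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = mean(x), mean(y)
sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # sum of cross-products
ss_x = sum((xi - mx) ** 2 for xi in x)                   # sum of squares for X
ss_y = sum((yi - my) ** 2 for yi in y)                   # sum of squares for Y

r = sp / sqrt(ss_x * ss_y)  # Pearson r
b = sp / ss_x               # slope
a = my - b * mx             # Y-intercept


def predict(x_value):
    """Predicted criterion score Y' = a + bX."""
    return a + b * x_value


r_squared = r ** 2                 # coefficient of determination
non_determination = 1 - r_squared  # coefficient of non determination
k = sqrt(1 - r_squared)            # coefficient of alienation (error remaining)
e = 1 - k                          # index of forecasting efficiency (error eliminated)
s_est = pstdev(y) * k              # standard error of the estimate: Sy shrunk by K


def average_r(rs):
    """Average Pearson r's: transform to Fisher z, average, un-transform."""
    z_values = [atanh(ri) for ri in rs]
    return tanh(mean(z_values))


print(round(r, 4), round(b, 2), round(a, 2), round(predict(4), 2))
print(round(r_squared, 2), round(k, 4), round(e, 4), round(s_est, 4))
print(round(average_r([0.30, 0.60]), 4))
```

Note that K is the fraction of the guessing error Sy that remains after using the line, so Sest = Sy * K and E = 1 - K are two views of the same quantity.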
mu	Mean (average) of the population
Sample Mean	Average of the sample taken from the population
Unbiased Estimator	A sample statistic that is most likely to approximate the corresponding population parameter
Biased Estimator	Any sample statistic, such as a sample mean, obtained from a randomly selected sample that does not, on average, equal the value of its respective population parameter, such as a population mean
Confidence Interval	Way to estimate the population mean (mu) using a (single) sample
Significant Difference	A difference that is very unlikely to occur by chance; reject the null
Experimental Group	A subject or group of subjects in an experiment that is exposed to the factor or condition being tested.
Control Group	The group that does not receive the experimental treatment.
Ho ("null")	Sample means differ due to sampling error alone; there is no real difference; it didn't work
H1 ("alternative")	Sample means differ due to a "real difference" and some sampling error; it worked
Fail to Reject the Ho ("null")	There is not enough evidence to show a significant difference between the sample mean and the population mean; Didn't work; Not enough; Too small a sample
Reject the Ho ("null"), Accept the H1 ("alternative")	There is a real difference; 95% sure you are right
Independent Sample	2 samples; Dependent variable: Interval/Ratio; RANDOM PLACEMENT
Independent Variable	What is being manipulated (the factor or condition being tested); "X"
Dependent Variable	What is being measured (the outcome); "Y"
|tcalc| >/= tcrit	Reject the null and accept the alternative; there is a real difference
|tcalc| < tcrit	Fail to reject the null; there is not enough evidence of a difference
Related Sample	2 samples; Dependent variable: Interval/Ratio; MATCH subjects in PAIRS; genetic relatives (twins/siblings); BEFORE & AFTER; Test-Retest; Gives more power; Reduces sampling error; "D"
Power	Ability of a statistical test to detect a difference between two groups
Homogeneity of Variance	Variances of the dependent variable scores are equal across groups
Analysis of Variance	Analyzes the significance of data; A test of the significance of the differences among means: the parametric procedure for determining whether significant differences exist in an experiment involving two or more sample means; 3 or more groups; INTERVAL/RATIO data
Linear Additivity	The linear model is an appropriate description of the data
Between Group Variance	Treatment variance; tells us how different the group means are from each other (how much each group mean varies from the "grand mean")
Within Group Variance	Error variance; tells us how much the individual scores vary within each group
F-Ratio	The test statistic for the ANOVA; it is the ratio of the variance between groups to the variance within groups
Chi Square	Statistical test, using nominal data only, that checks whether two variables are independent
"O"	Observed frequency; number that appears in the box
"E"	Expected frequency; row sum * column sum / N
df	Degrees of freedom
Exhaustive	No other possible explanations
Mutually Exclusive	Both cannot be true simultaneously
Type I Error	Error of rejecting the null hypothesis when in fact it is true (also called a "false positive"); Rejecting the null when you should have failed to reject it: You think you found a cause-effect relationship but ONE IS NOT THERE
Type II Error	Error of failing to reject a null hypothesis when in fact it is false (also called a "false negative"); You think there is NO CAUSE-EFFECT relationship but THERE IS
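The "O", "E", and df entries above can be illustrated with a small chi square calculation. This is a sketch under stated assumptions: the 2 x 2 table of observed frequencies is made up, the statistic uses the usual sum of (O - E)^2 / E, which these notes do not spell out, df is taken as (rows - 1) * (columns - 1), and 3.841 is the tabled critical value for df = 1 at the .05 level.

```python
# Hypothetical observed frequencies ("O") for two nominal variables, 2 rows x 2 columns.
observed = [[10, 20],
            [30, 40]]

row_sums = [sum(row) for row in observed]
col_sums = [sum(col) for col in zip(*observed)]
n = sum(row_sums)

# Expected frequency ("E") for each box: row sum * column sum / N.
expected = [[r * c / n for c in col_sums] for r in row_sums]

# Chi square statistic: sum over every box of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Degrees of freedom for a contingency table.
df = (len(row_sums) - 1) * (len(col_sums) - 1)

# Tabled critical value for df = 1 at the .05 level.
CRITICAL_05_DF1 = 3.841
reject_null = chi_square >= CRITICAL_05_DF1

print(expected, round(chi_square, 4), df, reject_null)
```

With these made-up counts the calculated value falls below the critical value, so the decision is to fail to reject the null: there is not enough evidence that the two variables are related.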