Research Design, Statistics, Tests, and Measurements
Terms in this set (155)
Wilhelm Wundt
Founded the first psychology laboratory in 1879
Believed that experimental psychology has a very limited use: that methodology could not be used to study the higher mental processes such as memory, thinking, and language
Proposed cultural psychology to study the higher mental processes
Hermann Ebbinghaus
Wundt's contemporary
Showed that higher mental processes could be studied using experimental methodology
Studied memory using nonsense syllables
Showed that at least one of the higher mental processes could be studied empirically using good experimental methodology
Oswald Kulpe
Wundt's contemporary
Disagreed with Wundt and believed that there could be imageless thought; he performed experiments to prove his hypothesis (Wundt believed there could be no thought without a mental image)
James McKeen Cattell
Studied under Wundt
Introduced mental testing to the United States
Alfred Binet and Theodore Simon (1905)
Collaborated to publish the first intelligence test, known as the Binet-Simon test
Binet-Simon Test
Purpose was to assess the intelligence of French schoolchildren to ascertain which children were too intellectually disabled to benefit from ordinary schooling
Mental Age (Binet)
The age level at which a person functions intellectually, regardless of his or her actual chronological age
William Stern (IQ)
Developed an equation to compare mental age to chronological age which came to be known as the intelligence quotient
Lewis Terman (1916)
Revised the Binet-Simon test for use in the United States
The revision became known as the Stanford-Binet Intelligence Test
Hypothesis
A tentative and testable explanation of the relationship between two or more variables
Variable
A characteristic or property that varies in amount or kind, and can be measured
Operational Definitions
States how the researcher will measure the variables
Ex: How does the researcher plan to define the variables in the experiment so that the variables are measurable?
Independent Variable (IV)
The variable whose effect is being studied and is the variable that the experimenter manipulates
Dependent Variable (DV)
The response that is expected to vary with differences in the independent variable
Is said to depend on the action of the independent variable
Three basic types of research
True experiments
Quasi-experiments
Correlational studies
Correlational Study
A study in which the researcher does not manipulate the independent variable
True Experiment
The researcher uses random assignment and manipulates the IV
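Random assignment can be sketched in a few lines. This is only an illustration, not a prescribed procedure: the function name `randomly_assign`, the subject labels, and the fixed seed are all hypothetical choices made for the demo.

```python
import random

def randomly_assign(subjects, n_groups=2, seed=None):
    """Shuffle the subjects and deal them into n_groups of near-equal size."""
    rng = random.Random(seed)   # a seed is used only to make the demo repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal like cards: position i in the shuffled list goes to group i mod n_groups.
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical subject labels; each group would receive one level of the IV.
groups = randomly_assign(["s1", "s2", "s3", "s4", "s5", "s6"], n_groups=2, seed=0)
```

Because chance alone decides group membership, the groups can be assumed equal on subject variables, although they may still differ by chance.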
Quasi-Experimental Design
Researchers do not use random assignment and lack sufficient control over the variables; therefore, definitive statements about causal factors cannot be made
Naturalistic Observation (Field Study)
The researcher does not intervene at all in what is being studied and is observing what occurs naturally
Population
The group to which the researcher wishes to generalize her results
Sample
A subset of a population that an experiment is run on
Representative Sample
The sample matches as many characteristics as possible of the population as a whole
Stratified Random Sampling
Technique for selecting a sample
Assures that each subgroup of the population is randomly sampled in proportion to its size
Random Selection
Technique for selecting a sample
Each member of the population has an equal chance of being selected for the sample
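The two sampling techniques can be compared in a short sketch. The function names, the strata labels, and the population sizes below are hypothetical; proportional allocation by rounding is one simple way to do it, assumed here for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    """Random selection: every member has an equal chance of being chosen."""
    return rng.sample(population, n)

def stratified_random_sample(strata, n, rng):
    """Randomly sample each subgroup in proportion to its size.

    strata maps a subgroup label to the list of its members.
    Rounding can make the total differ slightly from n.
    """
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        k = round(n * len(members) / total)   # this subgroup's share of the sample
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(42)  # fixed seed so the demo is repeatable
strata = {
    "first_year": [f"F{i}" for i in range(60)],  # 60% of the population
    "senior": [f"S{i}" for i in range(40)],      # 40% of the population
}
sample = stratified_random_sample(strata, 10, rng)
plain = simple_random_sample(strata["first_year"] + strata["senior"], 10, rng)
```

With these sizes the stratified sample always contains 6 first-year students and 4 seniors, while the simple random sample only matches those proportions on average.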
The three options for deciding which subjects will receive the different levels of the IV
Between-Subjects Design
Matched-Subjects Design
Within-Subjects Design
Between-Subjects Design
Each subject is exposed to only one level of each independent variable
Subjects are assigned randomly to groups and subjects in a given group do not receive the same level of IV as members of another group
Can reasonably assume that the groups are equal in terms of any subject variables that might affect our DV
Possible that the groups might differ on these variables merely due to chance
Matched-Subjects Design
A technique of matching subjects on the basis of the variable that the researcher wants to control
Within-Subjects Design (Repeated-Measures Design)
The subject's own performance is the basis of comparison
Crucial thing here is that each subject is exposed to more than one condition, allowing the researcher to separate the effects of individual differences
The problem with within-subject designs
People may just do better on the second test because they are more familiar with the test format
Counterbalancing
Used to eliminate within-subject design problems
Method for controlling for potential unintended order effects by administering the conditions in all possible sequences
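A minimal sketch of full counterbalancing follows, assuming three hypothetical conditions labeled A, B, and C. The helper names are illustrative, and cycling subjects through the orders is just one simple assignment scheme.

```python
from itertools import permutations

def counterbalanced_orders(conditions):
    """Return every possible sequence in which the conditions can be given."""
    return list(permutations(conditions))

def assign_orders(subjects, conditions):
    """Cycle through the sequences so each one is used about equally often."""
    orders = counterbalanced_orders(conditions)
    return {s: orders[i % len(orders)] for i, s in enumerate(subjects)}

orders = counterbalanced_orders(["A", "B", "C"])   # 3! = 6 sequences
schedule = assign_orders(["s1", "s2", "s3", "s4", "s5", "s6"], ["A", "B", "C"])
```

The number of sequences grows factorially with the number of conditions, so full counterbalancing is practical only for a small number of conditions.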
Confounding Variables
Unintended independent variables
Control Group Design
A technique of treating experimental and control groups equally in all respects, except that one group is exposed to the treatment in the experiment, and the other group is not exposed to the treatment
Nonequivalent Group Design
The control group is not necessarily similar to the experimental group since the researcher doesn't use random assignment
Common in educational research because you can't randomly assign subjects to different classes
Experimenter Bias
The fact that due to his or her expectations, the experimenter might inadvertently treat groups of subjects differently
May influence the results of an experiment; experimenter might also let his or her expectations affect how the results of the experiment are interpreted
Double-Blinding
A way to control for experimenter bias
Neither the researcher who interacts with the subjects nor the subjects themselves know which groups received the IV or which level of the IV
Single-Blind Experiment
When the subjects do not know whether they are in the treatment or control group, but the researchers know
Demand Characteristics
Refer to any cues that suggest to subjects what the researcher expects from them
Overall effects of the situation on subject's behavior
The assumption is that if subjects have an idea of what the researcher expects, they will perform as expected
Remedy is deception
Placebo Effect (demand characteristic)
Type of demand characteristic where a placebo has a beneficial effect on the subjects
Remedy is a control group
Hawthorne Effect
Refers to the tendency of people to behave differently if they know that they are being observed
Is controlled for by a control group design
External Validity
Has to do with how generalizable the results of an experiment are
Two Basic types of Statistics
Descriptive Statistics
Inferential Statistics
Descriptive Statistics
Concerned with organizing, describing, quantifying, and summarizing a collection of actual observations
Inferential Statistics
Allows us to use a relatively small batch of actual observations to make conclusions about the entire population of interest
Researchers generalize beyond actual observations
Concerned with making an inference from the sample involved in the research to the population of interest, and providing an estimate of population characteristics
Frequency Distributions
A graphic representation of how often each value occurs
Measures of central tendency
Mode
Median
Mean
Mode
The value of the most frequent observation in a set of scores
Bimodal
When there are two values that are tied for being the most frequently occurring observation
Median
The middle value when observations are ordered from least to greatest, or from greatest to least
Not necessarily the halfway point between the numerical value of the highest score and the numerical value of the lowest score
Is the number that divides the distribution in half
Mean
Arithmetic average
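The three measures of central tendency can be computed with Python's standard `statistics` module. The scores below are hypothetical and were chosen so that the distribution is bimodal and the median differs from the halfway point between the lowest and highest scores.

```python
from statistics import mean, median, multimode

scores = [1, 3, 3, 4, 8, 8, 15]   # hypothetical scores, already in order

modes = multimode(scores)   # every value tied for most frequent: 3 and 8
mid = median(scores)        # middle value of the ordered scores: 4
avg = mean(scores)          # arithmetic average: 42 / 7 = 6

# Halfway point between the lowest and highest scores, for comparison.
halfway = (min(scores) + max(scores)) / 2   # 8.0, not the same as the median
```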
Outliers
Extreme scores that can strongly affect the mean; the median and mode are much less sensitive to them
Measures of Dispersion/Variability
Range
Standard Deviation
Variance
If the scores in the distribution are all the same...
then there is no variability
If the scores are very spread out...
then the variability is high
Range
The smallest number in the distribution subtracted from the largest number in the distribution
Standard Deviation
Provides a measure of the typical distance of scores from the mean
"Average" scatter away from the mean (also the square root of the variance)
Must be either zero or a positive number
Variance
The square of the standard deviation; describes how much the scores vary from the mean
Must be either 0 or a positive number
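The three measures of dispersion can be sketched as follows. The scores are hypothetical, and the population formulas (dividing by N) are an assumption made here; sample versions that divide by N - 1 are available as `statistics.variance` and `statistics.stdev`.

```python
from statistics import pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores with a mean of 5

score_range = max(scores) - min(scores)   # largest minus smallest: 7
variance = pvariance(scores)              # mean squared distance from the mean: 4
sd = pstdev(scores)                       # square root of the variance: 2.0

# A distribution in which every score is the same has no variability at all.
no_spread = pstdev([5, 5, 5, 5])          # 0.0
```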
Normal Distribution
Forms a symmetrical bell-shaped curve
The horizontal axis gives us the values
The vertical axis gives us the frequency of the values
Percentile
Tells us the percentage of scores that fall at or below that particular score
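In a normal distribution, about 68% of scores will fall within
1 standard deviation of the mean
In a normal distribution, about 95% of scores will fall within
2 standard deviations of the mean
In a normal distribution, about 5% of scores will fall beyond
2 standard deviations of the mean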
Z-score
A way of expressing how many standard deviations above or below the mean your score is
Determining z-score
Subtract the mean of the distribution from your score, and divide the difference by the standard deviation
Negative z-scores fall
below the mean
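The z-score rule translates directly into code. The test scores, mean of 100, and standard deviation of 15 in this sketch are hypothetical.

```python
def z_score(score, mean, sd):
    """Standard deviations by which a score lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (score - mean) / sd

above = z_score(115, mean=100, sd=15)   # 1.0: one SD above the mean
below = z_score(70, mean=100, sd=15)    # -2.0: two SDs below the mean
```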
How to find the percentile of a z-score of +1
The scores that occur below z-scores of +1 can be divided into two groups: those scores that occur between the mean and a z-score of +1 (34%) and those that occur below the mean (50%)
The total percentage scores below a z-score of +1 is 50% + 34% = 84%
In a normal distribution, 84% of scores will fall below a score with a z-score of +1
How to find the percentile occurring below a z-score of -1
We know that 50% of the scores fall below a z-score of 0 and that 34% of scores fall between z-scores of 0 and -1; subtracting 34% from 50% gives the percentage of scores that fall below a z-score of -1: 16%
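The hand approximations of 84% and 16% can be checked against the exact normal curve. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `percentile_below` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)   # distribution of z-scores

def percentile_below(z):
    """Percentage of scores in a normal distribution that fall below z."""
    return 100 * standard_normal.cdf(z)

p_plus_one = percentile_below(1)     # roughly 84
p_minus_one = percentile_below(-1)   # roughly 16
p_zero = percentile_below(0)         # 50: half the scores fall below the mean
```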
How to approximate the percentile of a z-score
Add up all the percentages of the distribution that lie to the left of it
What would happen if you converted every score in a distribution to a z-score?
If you have a distribution of z-scores and calculate the mean and standard deviation, the mean of the distribution of z-scores will always be zero and the standard deviation will always be 1
True regardless of whether the distribution is normal or not, and regardless of the mean and the standard deviation of the original distribution
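This property is easy to verify on a deliberately non-normal set of scores. The data are hypothetical, and the sketch assumes the population standard deviation (dividing by N) is used both for the conversion and for the check.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert every score in a distribution to a z-score."""
    m = mean(scores)
    sd = pstdev(scores)   # population SD (divide by N)
    return [(x - m) / sd for x in scores]

skewed = [1, 1, 2, 2, 3, 20]      # hypothetical, clearly not bell-shaped
z_scores = standardize(skewed)

z_mean = mean(z_scores)           # 0, up to floating-point rounding
z_sd = pstdev(z_scores)           # 1, up to floating-point rounding
```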
T-score
A conversion of z-score
Distribution has a mean of 50 and a standard deviation of 10
Often used in test score interpretation
Ex: 60 is 1 standard deviation above the mean
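Since the T-score distribution has a mean of 50 and a standard deviation of 10, the conversion from a z-score is T = 50 + 10z. A minimal sketch follows; the function names and the raw-score example are hypothetical.

```python
def t_score(z):
    """Convert a z-score to a T-score (mean 50, standard deviation 10)."""
    return 50 + 10 * z

def t_score_from_raw(score, mean, sd):
    """Convert a raw score to a T-score by way of its z-score."""
    return t_score((score - mean) / sd)

one_sd_above = t_score(1)                 # 60, as in the example above
at_the_mean = t_score(0)                  # 50
raw = t_score_from_raw(85, 100, 15)       # z = -1, so T = 40.0
```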
Because the normal distribution is symmetrical and has its greatest frequency in the middle...
the mean, median, and mode are identical
Correlation coefficients
Descriptive statistic that measures to what extent, if any, two variables are related (knowing the value of one variable helps you predict the value of the other)
Help us understand the relationship and degree of association between two variables
Allows us to mathematically specify how well we can predict the value of the second variable given the corresponding value of the first variable
Ranges from -1.00 to +1.00
Positive Correlation
A change in value of one of the variables tends to be associated with a change in the same direction of the value of the other variable
Negative Correlation
A change in value of one of the variables tends to be associated with a change in the opposite direction of the other variable
Numerical values of correlations
Tells us how strong the relationship is
The closer a correlation coefficient is to +1 or -1, the more sure we can be of our prediction
If you have a perfect correlation, either +1 or -1, then given a value of one variable, you can predict, with absolute certainty, the value of the second variable
As the correlation coefficient moves closer to zero, the less sure you become about your prediction
If two variables have a correlation of zero, knowing the value of the first variable does not help you predict the value of the second variable
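The cards do not name a specific coefficient, so this sketch assumes Pearson's r, the most commonly used one. The variable names and data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and each variable's sum of squares.
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]
perfect_positive = pearson_r(hours, [2, 4, 6, 8, 10])   # +1: same direction
perfect_negative = pearson_r(hours, [10, 8, 6, 4, 2])   # -1: opposite direction
no_linear = pearson_r(hours, [1, 2, 3, 2, 1])           # 0: no linear relationship
```

Note that a coefficient of zero rules out only a straight-line relationship; the last example rises and then falls, which r does not capture.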
Scatterplot
Graphical representation of correlational data
Best-fitting straight line
Most important information to get from this is the direction, or the slope, of the line
Factor Analysis
Correlation is the cornerstone of this
Attempts to account for the interrelationships found among various variables by seeing how groups of variables "hang together"
Factor
A cluster of variables highly correlated with each other is assumed to be measuring the same thing
Armchair factor analysis
The correlation between a variable and itself is +1.00
To the extent that two variables measure the same thing, the correlation between those two variables will be high
Want to look through the correlation matrix to see which variables are highly correlated with each other
A, B, and C have a lot in common with each other, since they are highly correlated
D, E, and F are highly correlated
The analysis has detected two separate factors: one measured by variables A, B, and C, and the other measured by variables D, E, and F
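The "armchair" procedure of scanning a correlation matrix for clusters can be mimicked in code. This is only a sketch of the eyeballing step, not real factor analysis (which relies on matrix decomposition). The correlation values, the 0.7 cutoff, and the function names are all hypothetical.

```python
# Hypothetical correlations, stored once per pair (upper triangle only).
matrix = {
    "A": {"B": 0.82, "C": 0.78, "D": 0.10, "E": 0.05, "F": 0.12},
    "B": {"C": 0.80, "D": 0.08, "E": 0.11, "F": 0.03},
    "C": {"D": 0.06, "E": 0.09, "F": 0.07},
    "D": {"E": 0.85, "F": 0.79},
    "E": {"F": 0.81},
}

def r(x, y):
    """Look up the correlation between x and y in either order."""
    return matrix[x][y] if y in matrix.get(x, {}) else matrix[y][x]

def find_factors(variables, threshold=0.7):
    """Group variables that are highly correlated with every member of a cluster."""
    factors = []
    for var in variables:
        for factor in factors:
            if all(r(var, other) >= threshold for other in factor):
                factor.append(var)
                break
        else:
            factors.append([var])   # no existing cluster fits: start a new one
    return factors

factors = find_factors(["A", "B", "C", "D", "E", "F"])
```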
Significance Test
Tool researchers use to draw conclusions about populations based upon research conducted on samples
Researchers try to show that one hypothesis is supported by the data by showing that other possible hypotheses are inconsistent with the data collected
Helps determine if there is a real difference by telling us the probability that our observed difference is due to chance, that is, the probability that we could have obtained such a difference if our null hypothesis (that there is no difference between the two groups) were true
research hypothesis (alternative hypothesis)
A statement that postulates that there is a difference between populations or sometimes, more specifically, that there is a difference in a certain direction, positive or negative
Null Hypothesis (H0)
States that there is no difference between the populations being compared (for example, that the two population means are the same)
If the significance tells us that the probability our observed difference is due to chance is high...
we can accept our null hypothesis and reject the alternative, or research hypothesis
If there is a low probability it means that it is unlikely that the observed difference is due to chance meaning...
we could reject the null hypothesis and accept our research, or alternative hypothesis
Statistically Significant
The observed difference that allows us to reject the null hypothesis
Criterion of Significance (Alpha Level)
Psychologists usually use 5%
Most psychologists are willing to reject the null hypothesis only if they are very sure that observed differences are not due solely to chance
Step 1 of The Significance Testing Process
Formulate alternative and null hypotheses based on your research hypothesis
Step 2 of The Significance Testing Process
Decide on a criterion of significance (usually 5%)
Step 3 of The Significance Testing Process
Collect data
Step 4 of The Significance Testing Process
Perform significance test on your data in order to obtain the significance level
Step 5 of The Significance Testing Process
Compare the obtained significance level to the criterion of significance
If the significance level is less than the criterion of significance, the results are statistically significant
Otherwise the results are statistically insignificant
Step 6 of The Significance Testing Process
If the results are statistically significant, reject the null hypothesis
If the results are statistically insignificant, accept the null hypothesis
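The six steps can be traced in code. The cards do not say which significance test to run, so this sketch swaps in an exact permutation test on the difference between two group means, chosen because it needs only the standard library. The scores and names are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference between means.

    Returns the proportion of all possible regroupings of the pooled scores
    whose mean difference is at least as large as the observed one.
    """
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding in the means.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Step 1: H0 = no difference between groups; research hypothesis = a difference.
ALPHA = 0.05                                   # Step 2: criterion of significance
treatment = [6, 7, 8, 9, 10]                   # Step 3: collect data (hypothetical)
control = [1, 2, 3, 4, 5]
p = permutation_p_value(treatment, control)    # Step 4: obtain significance level
significant = p < ALPHA                        # Step 5: compare with the criterion
decision = "reject H0" if significant else "retain H0"   # Step 6
```

Here only 2 of the 252 possible regroupings are as extreme as the observed split, so p is about 0.008 and the null hypothesis is rejected.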
Types of Errors in Significance Testing
Type I
Type II
Type I Error
Mistakenly reject the null hypothesis
There is really no difference between the population values mentioned in the null hypothesis and a statistically significant result was obtained just by chance
A true null hypothesis was rejected
Likelihood of making this error is the same as the criterion of significance
Type II Error (Beta)
Accepting the null hypothesis when it is false
A statistically insignificant result was obtained and the null hypothesis was accepted, even though the null hypothesis was false
The purpose of significance testing
To make an inference about a population on the basis of a sample
What does statistical significance not tell us?
Anything about whether or not the research is poorly designed
Whether or not the results are trivial or meaningless
The relationship between sample size and significance levels
The larger the size of the sample, the smaller the difference between the groups has to be in order to be significant
Kinds of Significance Tests
T-Test
ANOVA
Chi-square test
T-Test
Used to compare the means of two groups
ANOVA
Analysis of variance
Used for more than two groups
Estimates how much group means differ from each other by comparing the between-group variance to the within-group variance using a ratio, called the F ratio
F ratio = Between-group variance estimate / within-group variance estimate
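The F ratio for a one-way design can be computed by hand. This sketch assumes the usual variance estimates (sums of squares divided by their degrees of freedom); the three groups of scores are hypothetical.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F ratio: between-group over within-group variance estimate."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)
    grand_mean = mean(all_scores)

    # Between-group: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group: how far each score lies from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)         # df between = k - 1
    ms_within = ss_within / (n_total - k)     # df within = N - k
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical scores
f = f_ratio(groups)                           # 13 / 1 = 13
```

A large F means the group means differ by more than the scatter within groups would lead us to expect; judging significance still requires comparing F against the F distribution, which is not shown here.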
Can also be used to determine if there is any interaction between two or more IVs
Can assess interactions and this technique helps ascertain if the IV influences the DV
Chi-square tests
Significance tests that work with categorical (nominal) data, rather than numerical data
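The chi-square statistic compares observed category counts with the counts expected under the null hypothesis. This sketch assumes the simplest case, a goodness-of-fit test; the counts and the function name are hypothetical.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 50 hypothetical subjects choose between two options; H0 expects an even split.
observed = [30, 20]
expected = [25, 25]
stat = chi_square(observed, expected)   # 25/25 + 25/25 = 2.0
```

Turning the statistic into a significance level requires the chi-square distribution with the appropriate degrees of freedom, which is left out of this sketch.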
Factorial Design
Each level of a given IV occurs with each level of the other IVs
Interaction
When the effects of one IV are not consistent for all levels of the other IVs
Meta-Analysis
Statistical procedure that can be used to draw conclusions on the basis of data from different studies
The two ways that test results can be interpreted
Norm-referenced
Domain-referenced
Norm-Referenced Testing
Involves assessing an individual's performance in terms of how that individual performs in comparison to others
Norms are derived from standardized samples; the samples should be large and representative of the population to whom the particular test will be administered
The problem with norm-referenced testing
The population to whom the tests will be administered can, and often does, change
Domain-referenced Testing (criterion-referenced testing)
Concerned with the question of what the test taker knows about a specified content domain
Performance on such a test is described in terms of what the test taker knows or can do
Reliability
The consistency with which a test measures whatever it is that the test measures
High means that the test measures are dependable, reproducible, and consistent
What makes a good test?
Think about how close a person's score is to his or her true score
The standard error of measurement (SEM) is an index of how much, on average, we expect a person's observed score to vary from the score the person is capable of receiving based on actual ability
The best SEM is zero, but no test achieves this
The smaller the SEM, the better
Methods used to establish the reliability of a test
Test-retest method
Alternate-form method
Split-half reliability
A correlation coefficient is then calculated using the pairs of scores
Test-retest method
The same test is administered to the same group of people twice
Estimates the inter-individual stability of test scores over time
Alternate-form Method
The examinees are given two different forms of a test that are taken at two different times
Split-half reliability
Where test takers take only one test, but that one test is divided into equal halves
Scores on one half are correlated with the scores on the other half
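Split-half reliability can be sketched as follows. The cards do not say how the test is divided, so this example assumes an odd-even item split and Pearson's r; the item responses (1 = correct, 0 = incorrect) are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two paired lists of scores."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)

def split_half_reliability(responses):
    """Correlate each test taker's score on odd items with the score on even items."""
    odd_half = [sum(row[0::2]) for row in responses]    # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]   # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)

# One row per test taker, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
reliability = split_half_reliability(responses)   # 2.8 / 5.2, about 0.54
```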
Validity
Concerned with the extent to which a test actually measures what it purports to measure
Examine the relationship between performance on the test in question and other independent and objective sources of information about the knowledge or behaviors of interest
Evidence used depends on the nature of the test
Types of Validity
Content
Face
Criterion
Concurrent
Predictive
Construct
Convergent
Discriminant
Content Validity
Refers to the test's coverage of the particular skill or knowledge area that it is supposed to measure
Face Validity
Refers to whether or not the test items appear to measure what they are supposed to measure
Criterion Validity
Has to do with how well the test can predict an individual's performance on an established test of the same skill or knowledge area
Predictive Validity
Test is used to predict future performance
Concurrent Validity
When a test is given at the same time as the criterion measure
Cross Validation
Involves testing the criterion validity of a test on a second sample, after you have demonstrated validity using an initial sample
Construct Validity
Refers to how well performance on the test fits into the theoretical framework related to what it is you want the test to measure
Convergent Validity
Scores on the measure are related to other measures of the same construct
Discriminant Validity
Scores on the measure are not related to other measures that are theoretically different
The relationship between reliability and validity
Test with zero reliability will have zero validity
A test can have perfect reliability and very little validity
Reliability is a precondition for validity, but the opposite is not necessarily true
Four basic types of measurement scales
Nominal
Ordinal
Interval
Ratio
Nominal (Categorical) Scale
Labels observations so that observations can be categorized
Ordinal Scale
Observations are ranked in terms of size or magnitude
Interval Scale
Uses actual numbers with equal intervals between values, but has no true zero point
Ratio Scale
There is a true zero point that indicates the total absence of the quantity being measured
Two types of ability tests
Aptitude Tests
Achievement Tests
Aptitude Tests
Used to predict what one can accomplish through training
They are used to predict future performance
Include intelligence tests
Achievement Tests
Attempt to assess what one knows or can do now; they can test adequacy of learning content and skill
Adaptive Test (achievement test)
A computerized test that adapts to the test taker's ability by assessing the accuracy of previously answered questions
A test taker with high ability will be faced with more difficult questions than a test taker with low ability
Intelligence Quotient (IQ)
Measure of intelligence aptitude using an equation comparing mental age to chronological age
Mental age divided by chronological age, multiplied by 100
An IQ of 100 indicates mental age equals chronological age
Ratio IQ
Developed by William Stern
After a certain age, chronological age increases while mental age does not
Therefore, even if your mental age remains constant, your IQ will decrease with age
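The ratio IQ formula, and the problem with it, can be shown in a few lines. The ages used are hypothetical, as is the function name.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age / chronological_age

child = ratio_iq(mental_age=10, chronological_age=8)    # 125.0
average = ratio_iq(mental_age=8, chronological_age=8)   # 100.0

# The problem: hold mental age constant at 16 while chronological age rises.
adult_iqs = {age: ratio_iq(16, age) for age in (16, 32, 64)}
# {16: 100.0, 32: 50.0, 64: 25.0}
```

The same person appears to lose half of his or her IQ between ages 16 and 32 even though nothing about actual ability has changed.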
Deviation Quotients
Used in the 1960 revision of the Stanford-Binet test
Gets around the problem of the ratio IQ
Deviation IQ
Tells us how far away a person's score is from the average score for the particular age group the subject is a member of
Represents the individual's standing among his or her same-aged cohort
Wechsler Tests
Major group of intelligence tests
Have all items of a given type grouped into subtests
These items are arranged in order of increasing difficulty within each subtest
Has two broad subscales: a verbal scale which is based on information, vocab, and similar skills; and a performance scale, which is derived from tests of manipulative skill, eye-hand coordination, and speed
Three major Wechsler IQ tests
Wechsler Preschool and Primary Scale of Intelligence (WPPSI-R)
Wechsler Intelligence Scale for Children (WISC-R)
Wechsler Adult Intelligence Scale (WAIS-R)
All have been revised and they are used with preschoolers, school-aged children, and adults
WAIS-III is currently used for adult intelligence testing
Personality Tests
Frequently used in psychological research
Personality Inventory
A self-rating device usually consisting of somewhere between 100 and 500 statements
Subject is asked to determine if the given statements apply to him or her
Reliable, but the veracity of responses is not guaranteed
Perceived social acceptability of a response is just one factor that can affect the accuracy of tests that involve self-reporting
Minnesota Multiphasic Personality Inventory (MMPI)
Personality inventory
Consists of 550 statements to which subjects respond "true," "false," or "cannot say"
Yields scores on ten clinical scales, measuring things such as depression, schizophrenia, and masculinity/femininity
Has scales that can indicate whether the person is careless, faking answers, misrepresenting him or herself, or distorting responses, whether it is being done intentionally or unintentionally
Purpose is to aid in the assessment of various clinical disorders
Empirical Criterion-Keying Approach
Used to develop MMPI by Hathaway and McKinley
Tested thousands of questions and retained those that differentiated between patient and nonpatient populations, even if the item didn't seem to have anything to do with abnormality
Examined responses of patient groups with different diagnoses.
Each criterion group's responses formed the basis of a particular clinical scale, so that if a new patient answered questions in the same way that say, the depressive group did, that patient would receive a high depression score
California Psychological Inventory (CPI)
Personality inventory based on MMPI
Developed to be used with normal populations ages 13 and up
Oriented to high school and college students
Consists of 20 scales, including three validity scales, used to assess test-taking attitudes
Measures such personality traits as dominance, sociability, self-control, and femininity
All scores are expressed as standard scores with a mean and standard deviation derived from standardization samples
Projective Tests
Different from personality inventories in two basic ways:
The stimuli in these tests are relatively ambiguous
The test taker is not limited to a small number of possible responses
Test taker is presented with stimuli and asked to interpret what he or she sees
Means that the scoring of these tests is subjective, whereas the scoring of personality inventories is objective
Rorschach inkblot test
Projective test created by Hermann Rorschach
Made up of 10 cards that are reproductions of inkblots
Cards are presented to the subject in specific order with very specific instructions to describe what it is that the blots remind the subject of
Clinician then interprets the results based upon what the person saw and the spontaneous remarks that the person may have made
Thematic Apperception Test (TAT)
Devised by Christiana Morgan and Henry Murray and consists of 20 simple pictures depicting scenes that have ambiguous meanings
Test taker is told to tell a story about what is happening, to give the events leading up to what is happening in the picture, and to provide an ending
No standardized scoring method
Scoring is qualitative and the clinician has to rely on his or her clinical skills
Blacky pictures
Projective test for children
Consists of 12 cartoonlike pictures that feature a little dog named Blacky
Each picture depicts Blacky in a situation designed to correspond to a particular stage of psychosexual development
Test taker is asked to tell stories about the pictures he or she is shown
Rotter Incomplete Sentences Blank
An example of a sentence completion test, another projective technique researchers and clinicians use
The test taker is provided with 40 sentence stems and is asked to complete them
The theory is that the test taker will fill in the blanks with whatever is on his or her mind
Barnum effect
Refers to the tendency of people to accept and approve of the interpretation of their personality that you give them
Relatively simple to generate a "report" from stereotyped statements; these reports are readily accepted as accurate
Form of pseudo validation
Interest Testing
Usually used to assess an individual's interest in different lines of work
Strong-Campbell Interest Inventory
Interest test
Organized like a personality inventory and was developed using empirical criterion-keying approach
Test takers are given lists of interests and asked to indicate whether they like or dislike the interest listed
In other sections of the test, the test taker is asked to indicate his or her preference for one of two paired items
Interpretation of results based at least partly on Holland's model of occupational themes.
RIASEC system
Holland divided interests into 6 types:
1. realistic
2. investigative
3. artistic
4. social
5. enterprising
6. conventional