PSYCH 309 EXAM 1
Define Test and Item
Test: a measurement device or technique used to quantify behavior or aid in the understanding and prediction of behavior
Item: a specific stimulus to which a person can respond overtly; this response can be scored or evaluated
Be able to define, recognize, and differentiate between states and traits
traits: dispositions that distinguish one individual from another ex. shy or outgoing
states: temporary change in one's personality, ex. depressed, sad, anxious
Define achievement, aptitude, and intelligence testing
Achievement: refers to previous learning
Aptitude: refers to potential for learning or acquiring a specific skill
Intelligence: refers to a person's general potential to solve problems, adapt to changing circumstances, think abstractly, and profit from experience
If a test is reliable its results are what?
accurate, dependable, consistent, and repeatable
What are test batteries?
two or more tests used in conjunction
Define standardization? Why is it important to obtain a standardization sample?
standardization- comparing a test result to an already defined standard; the standard may have to be changed over time due to drift
Define representative sample and stratified sample. Know when and why representative and stratified samples are collected.
a sample drawn in a random fashion so that it is composed of people with characteristics similar to those for whom the test is to be used
stratified sample: a sample chosen to represent the different groups in a population, ex. 60 White, 10 Hispanic, 5 Asian
Define hypothetical construct
not tangible
not directly measurable but gives rise to measurable things
measurable phenomena --> operational definition
Define operational definition, measurable phenomenon, and hypothetical construct
Operational- identifies one or more specific, observable events or conditions such that any other researcher can independently measure and/or test for them
measurable phenomenon- something that you can measure like hours of sleep when testing for depression
hypothetical construct- explanatory variable which is not directly observable, ex. concepts of intelligence and motivation used to explain phenomena in psych
What is the difference between structured and projective personality tests?
structured: provide a statement, usually of the self-report variety, and require the subject to choose between two or more alternative responses
projective: unstructured; either the stimulus or the response, or both, are ambiguous
Define psychological testing and psychological assessment. How are they different?
testing: refers to all the possible uses, applications, and underlying concepts of psychological and educational tests
assessment: process of testing that uses a combination of techniques to help arrive at some hypotheses about a person and their behavior, personality and capabilities
What is psychometry? What are the two major properties of psychometry?
Branch of psych dealing with the measurement properties of psychological tests; the two major properties are validity and reliability
What are norm- and criterion referenced tests? How is each unique?
Norm- compares each person with a norm
criterion- describes the specific types of skills, tasks, or knowledge that the test taker can demonstrate, such as mathematical skills
What types of questions are answered by psychologists through assessment?
...
In what settings do psychologists assess and what is their primary responsibility in each?
...
What are 3 properties of scales that make scales different from one another?
magnitude
equal intervals
absolute 0
Know the 4 scales of measurement and be able to differentiate between these scales
Nominal Scale - categories
Ordinal - rank
Interval - no absolute 0, can average, can't form ratios, ex. temp and intelligence
Ratio - has everything, ex. weight and age
Concrete examples of each of the different scales of measurement
nominal- hair color, favorite food
ordinal- order of finishing a race
interval- degrees Fahrenheit
ratio- weight
Define frequency distribution and histogram? What kind of data are shown in each?
frequency- displays scores on a variable or a measure to reflect how frequently each value was obtained
histogram - groups numbers into ranges
Understand the concept of percentiles
Percentile Ranks "what percent of scores fall below a particular score"
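A minimal sketch of the percentile rank idea above, assuming the "percent of scores strictly below" definition (some texts also count half of the tied scores). The function name and data are made up for illustration.

```python
def percentile_rank(scores, x):
    """Percent of scores that fall strictly below x."""
    below = sum(1 for s in scores if s < x)
    return 100.0 * below / len(scores)

# Hypothetical set of 10 exam scores
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# 5 of the 10 scores are below 80, so the percentile rank is 50
pr_80 = percentile_rank(scores, 80)
print(pr_80)
```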
Define central tendency. Know the 3 types of central tendency and how to calculate each
mean: arithmetic average; best for normally distributed data, takes each score into account, sensitive to outliers
Median: middle score; not affected by outliers
Mode: most frequent score; the statistic used with nominal data
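A short sketch of how the three measures could be calculated with Python's standard `statistics` module. The data are hypothetical; the single extreme score (40) is there to show the mean moving while the median barely changes.

```python
import statistics

# Hypothetical scores; 40 is an outlier
with_outlier = [2, 3, 3, 4, 5, 6, 40]
without_outlier = [2, 3, 3, 4, 5, 6]

mean_with = statistics.mean(with_outlier)        # sum of scores / number of scores
median_with = statistics.median(with_outlier)    # middle score when sorted
mode_with = statistics.mode(with_outlier)        # most frequent score

mean_without = statistics.mean(without_outlier)
median_without = statistics.median(without_outlier)

print(mean_with, median_with, mode_with)
print(mean_without, median_without)
```

The outlier pulls the mean from about 3.8 up to 9, while the median only moves from 3.5 to 4.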
Define variance and standard deviation
standard deviation: approximation of the average deviation around the mean
variance: average squared deviation around the mean (square of SD)
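A small worked example of the two definitions above, assuming the population form (divide by N). Many textbooks divide by N - 1 for samples, in which case `statistics.variance` and `statistics.stdev` would be used instead. The scores are made up.

```python
import statistics

# Hypothetical scores with a mean of 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Variance: average of the squared deviations from the mean
variance = statistics.pvariance(scores)

# Standard deviation: square root of the variance
sd = statistics.pstdev(scores)

print(variance, sd)  # variance is 4 and SD is 2 for this data
```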
Understand the Normal distribution conceptually
normal distribution: symmetrical on both sides of the middle
Define skewness and be able to identify positive and negative skew
negative- tail extends to the left; bulk of the curve is on the right
positive- tail extends to the right; bulk of the curve is on the left
Define kurtosis and be able to identify its different types
Kurtosis: height
Leptokurtic: tallest curve
Mesokurtic: normal
Platykurtic: flattest
What is a z score? How is it calculated?
Mean of 0, SD of 1
Transforms data to standardized units
z= (X-Xbar)/s... (Score-mean)/SD
How are T scores different from Z scores?
For T scores the mean is 50 and the SD is 10
harder to get into negative numbers
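A quick sketch of the z score formula and the z-to-T conversion described above (T = 50 + 10z follows from a mean of 50 and SD of 10). Function names and the example numbers are hypothetical.

```python
def z_score(x, mean, sd):
    """(Score - mean) / SD: how many SDs a score is from the mean."""
    return (x - mean) / sd

def t_score(z):
    """Rescale a z score to a mean of 50 and an SD of 10."""
    return 50 + 10 * z

# Hypothetical test with mean 70 and SD 10
z_high = z_score(85, 70, 10)   # 1.5 SDs above the mean
z_low = z_score(55, 70, 10)    # 1.5 SDs below the mean, so z is negative

print(z_high, t_score(z_high))  # 1.5 65.0
print(z_low, t_score(z_low))    # -1.5 35.0
```

The second line shows why T scores are harder to push into negative numbers: a z of -1.5 is still a T of 35.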
What are quartiles? What is Interquartile range?
Quartiles: points that divide the frequency distribution into equal fourths
Median: middle number
Interquartile range: bounded by the range of scores that represent the middle 50% of the distribution
Deciles: mark 10% interval
Stanine System: converts any set of scores into a transformed scale which ranges from 1 to 9
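A sketch of finding quartiles and the interquartile range with the standard `statistics.quantiles` function (Python 3.8+). Quartile conventions differ between textbooks and software, so other tools may give slightly different cut points; the data are hypothetical.

```python
import statistics

# Hypothetical scores, already sorted
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 returns the three cut points that split the distribution into fourths
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Interquartile range: spread of the middle 50% of scores
iqr = q3 - q1

print(q1, q2, q3, iqr)  # Q2 is the median
```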
Define norm, norming, and standardization. For what is each used?
...
Define and differentiate between norm-referenced and criterion-referenced tests
norm-referenced test: compares each person with a norm
Criterion-referenced test: describes the specific types of skills, tasks, or knowledge that the test taker can demonstrate, such as mathematical skills
To avoid bias, how should error be distributed in a psychological test?
double blind
random sampling
want error to be unsystematic and random
What are the 5 characteristics of a good theory?
has explanatory power
broad scope
systematic
generative
parsimonious
What is a scatter plot? How does it work?
a picture of the relationship between 2 variables
What is the Correlation Coefficient? With what concept should correlation not be confused?
causation
Understand and be able to differentiate and plot positive, negative, and 0 correlation
positive- up and to the right
negative- down to the right
0 correlation is random
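What is the principle of least squares? How does it relate to the regression line?
the regression line is the line of best fit: it minimizes the sum of the squared deviations of the points from the line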
Define covariance
how much both variables change together
What is the principle of dilution in correlation
Variation pulls the slope away from 1 or -1
What is the Pearson product moment correlation? What meaning do the values -1.0 to 1.0 have?
1.0: perfect positive
-1.0: perfect negative
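A minimal sketch of the Pearson r calculation, written out by hand so the role of covariance is visible (how much the two variables change together, scaled by how much each varies alone). The function name and data are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products: the numerator of the covariance
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]
r_pos = pearson_r(hours, [2, 4, 6, 8, 10])   # y rises with x: perfect positive
r_neg = pearson_r(hours, [10, 8, 6, 4, 2])   # y falls as x rises: perfect negative

print(round(r_pos, 3), round(r_neg, 3))
```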
Define residual
observed Y - predicted Y; how far your score is away from the line of best fit
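A sketch of how residuals come out of a least squares line, assuming simple linear regression with one predictor. The slope and intercept formulas are the standard ones; the data and variable names are hypothetical.

```python
# Hypothetical paired scores
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least squares slope: sum of cross-products / sum of squared x deviations
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x

# Residual = observed Y - predicted Y for each point
predicted = [intercept + slope * a for a in x]
residuals = [obs - pred for obs, pred in zip(y, predicted)]

print(round(slope, 2), round(intercept, 2))
print([round(res, 2) for res in residuals])
```

For a least squares line the residuals always sum to zero (up to rounding), which is one way to check the fit.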
What is the standard error of estimate? What is its relationship to the residuals?
standard error of estimate: the standard deviation of the residuals; a measure of the accuracy of prediction
What is shrinkage?
2 samples
effect size will shrink when applied to a different group
x and y will not work as well
what is restricted range? to what does it lead?
using a sample of people who won't fit the test
or test too easy or hard
reduces range and variance
what is factor analysis
data reduction technique
when 2 items correlate highly it is because they measure the same thing
factors are useful because they simplify
What is the coefficient of determination? what is the purpose of the coefficient of determination?
r squared
proportion of the variation in Y that is explained by X
can be looked at as a %
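A tiny worked example of the r squared idea above. The correlation value is hypothetical.

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient."""
    return r ** 2

r = 0.5                                   # hypothetical correlation between X and Y
r_squared = coefficient_of_determination(r)
percent_explained = 100 * r_squared       # read as a percentage

print(r_squared, percent_explained)       # 0.25 25.0
```

So a correlation of .50 means X explains only 25% of the variation in Y.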
Know the different types of correlations and when they are used
...
What is the regression formula? understand the different components of the formula and how they are applied.
...
what is the difference between simple linear regression and multiple regression?
...
What is reliability?
tests that are relatively free of measurement error
What contributes to measurement error?
Standard error of measurement: uses the standard deviation of errors as the basic measure of error
a lot of things in psych that are measured are concepts
What components make up Classical Test Score Theory?
Classical test score theory assumes that a person has a true score that would be obtained if there were no errors in measurement
observed score= true score + error
Know what an observed score is
true score plus error
In what ways can error impact the observed score
it pulls it from the true score
Test reliability is usually estimated in one of what 3 ways? Know the major concepts in each way
Test-retest: consider consistency of test results when test is administered on different occasions
parallel forms: evaluate the test across different forms of the test
internal consistency: examine how people perform on similar subsets of items selected from the same form of the measure
What is the carryover effect?
when a test is taken twice and answers are remembered from the first time you take it
second score is influenced from the first time
Define parallel/alternate forms reliability. What are its advantages and disadvantages?
parallel- different forms of the same test
hard to make same in difficulty but better tests to administer
Define split half reliability. How is this measured?
finding the correlation between 2 halves of the same test; the Spearman-Brown formula is then used to estimate the reliability of the whole test
Pearson r is used to correlate the 2 halves
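A sketch of the Spearman-Brown step for split-half reliability. The formula used here is the standard two-halves version, 2r / (1 + r), which is not written out in these notes; the half-test correlation is a made-up value.

```python
def spearman_brown(r_half):
    """Estimate whole-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)

r_half = 0.6                      # hypothetical Pearson r between the two halves
r_whole = spearman_brown(r_half)

print(round(r_whole, 2))          # 0.75
```

The corrected value is higher than the half-test correlation because each half is only half as long as the full test, and shorter tests are less reliable.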
How do the different aspects of internal consistency differ?
examine how people perform on similar subsets of items selected from the same form of the measure
consistency of items within the same test
evaluate the extent to which the different items on a test measure the same ability or trait
Understand the major components of inter-rater reliability.
consistency among different judges who are evaluating the same behavior. Three different ways to do this:
record the percentage of times 2 or more observers agree
Kappa statistic is the best method for assessing the level of agreement among observers
What is the Kappa statistic and how does it relate to reliability?
measures inter rater reliability
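A minimal sketch of the Kappa statistic for two raters, assuming Cohen's version: observed agreement corrected for the agreement expected by chance. The function name and ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_1, rater_2):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_1)
    # Observed agreement: proportion of cases where the raters match
    p_observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Chance agreement: based on how often each rater uses each category
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    p_chance = sum(counts_1[c] * counts_2[c] for c in counts_1) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical yes/no ratings of the same 10 behaviors
rater_a = ["y", "y", "y", "y", "y", "n", "n", "n", "n", "n"]
rater_b = ["y", "y", "y", "y", "n", "n", "n", "n", "n", "y"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))  # 0.6
```

The raters agree on 80% of cases, but 50% agreement would be expected by chance alone, so kappa is (.80 - .50) / (1 - .50) = .60.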
Know the Summary of Reliability Table from lecture
...
What does the standard error of measurement do?
uses standard deviation of errors as the basic measure of error
allows us to estimate the degree to which a test provides inaccurate readings
the larger the standard error, the less certain we can be about the accuracy with which an attribute is measured
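A short sketch of how the standard error of measurement is commonly computed, assuming the standard formula SD * sqrt(1 - reliability). The formula is not given in these notes, and the SD and reliability values are made up.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SD of the test scores times the square root of (1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test with SD = 15
sem_high_rel = standard_error_of_measurement(15, 0.91)   # reliable test
sem_low_rel = standard_error_of_measurement(15, 0.64)    # less reliable test

print(round(sem_high_rel, 2), round(sem_low_rel, 2))     # 4.5 9.0
```

Lower reliability gives a larger standard error, so an observed score tells us less about the true score.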
What factors should be considered when choosing a reliability coefficient?
what type of answers they
split half vs alpha or KR20
What types of irregularities might make reliability coefficients biased or invalid?
...
How can one address/improve low reliability?
increase the number items
factor and item analysis
correction for attenuation - estimating what the correlation between tests would have been if there had been no measurement error
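A sketch of the correction described in the last item above (usually called the correction for attenuation), assuming the standard formula: observed correlation divided by the square root of the product of the two reliabilities. All values are hypothetical.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between two tests if both were free of measurement error."""
    return r_xy / math.sqrt(r_xx * r_yy)

r_observed = 0.4          # hypothetical observed correlation between tests X and Y
rel_x, rel_y = 0.8, 0.8   # hypothetical reliabilities of each test

r_corrected = correct_for_attenuation(r_observed, rel_x, rel_y)
print(round(r_corrected, 2))  # 0.5
```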
What is the purpose of factor and item analysis?
to see if a certain item is bringing the reliability down. see how many factors there are in the test
What example was given in class regarding reliability
...
What are the stages of test development
test conceptualization
test construction
test tryout
item analysis
test revision
define and know examples of incremental validity and ecological validity
incremental validity: the test adds something new (information beyond what existing measures already provide)
ecological validity: don't really need to know
Define dichotomous and polytomous format. common format? advantage?
dichotomous format: 2 alternatives for each item
true/false
easy to construct
encourages memorization
doesn't allow to show complexity
less reliable
polytomous: more than 2 alternatives
multiple choice
easy to score
distractors
poorly written distractors adversely affect quality of test
which types of questions are selected-response format?
dichotomous format and polytomous format - selecting answers
Other formats
Short answer/ fill in blank
- advantage: flexible
-disadvantage: hard to score
Essay
-advantage: in depth coverage, taps higher order skills
-disadvantage: tedious to score, interrater reliability, scoring subjectivity
2 major formats of summative scales? what type of data do they create?
likert and category
summed up to get a composite or cumulative scale score
be able to define and recognize the likert format. what scales most frequently use the likert format?
indicate the degree of agreement with a particular question
what are the primary differences between the likert and category formats?
category: like the likert format but has an even greater number of choices
- 10 point rating scale
-define end points
what are the 4 questions that should be asked when generating a pool of candidate test items?
1. what content domain should the test items cover?
2. how many items?
3. what are the demographics of population?
4. How should I word my items?
What are the 4 ways to score tests and how is each differentiated from the others?
1. cumulative scoring: summing them up
2. subscale scoring: the test is divided into groups of items (subscales) that are summed individually
3. Class or category scaling: pass or fail
4. Ipsative scoring: forced choice
Define item analysis. What 2 methods are closely associated with item analysis
a general term for a set of methods used to evaluate test items, one of the most important aspects of test construction
the basic methods involve assessment of item difficulty and item discriminability
Define item difficulty. What does the proportion of people getting the item correct indicate
proportion of people who get the item right
probability an item will be answered correctly
optimal difficulty: (1 - x)/2 + x, where x is the chance of guessing correctly (about halfway between chance and 1.0)
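A worked sketch of item difficulty and the optimal difficulty formula, reading the formula as (1 - x)/2 + x with x as the chance of guessing correctly. That reading is an assumption about the notes; the function names and responses are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of test takers who got the item correct (1 = correct, 0 = wrong)."""
    return sum(responses) / len(responses)

def optimal_difficulty(chance):
    """Halfway between chance-level performance and 1.0."""
    return (1 - chance) / 2 + chance

# Hypothetical responses of 8 test takers to one item
responses = [1, 1, 0, 1, 0, 1, 1, 0]
p = item_difficulty(responses)              # 5 of 8 correct

# Four-option multiple choice: chance of guessing correctly is 1/4
optimal_4_choice = optimal_difficulty(0.25)
# True/false: chance of guessing correctly is 1/2
optimal_true_false = optimal_difficulty(0.5)

print(p, optimal_4_choice, optimal_true_false)  # 0.625 0.625 0.75
```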
define item discriminability what is good discrimination? What are two ways to test item discriminability?
determines whether people who did well on item did well on test
extreme group method= compare people who did well with those who have done poorly on test
point biserial method: find correlation between performance on the item and performance on the total test
Will guessing help you on an exam?
yes, if there is no deduction for getting it wrong
know and be able to identify an example of a double-barreled item
2 topics in one question
Define item characteristic curve. know what info the x and y axes give as well as slope
valuable way to learn about items by graphing their characteristics
total test score (x) and proportion of examinees who get the item correct (y)
the gradual positive slope of the line demonstrates that the proportion of people who pass the item gradually increases as test scores increase
When shown an item characteristic curve, be able to determine good or poor discrimination
item characteristic curve for a good item: the proportion of test takers who get the item correct increases as a function of the total test score
bad item: people with different test scores are equally likely to get the item correct
What is systematic error variance called? Is it good or bad and why?
bias
bad, we try to avoid it
Know ceiling effects, floor effects, and indiscriminate items.
ceiling effect: too easy
floor effect: too hard
indiscriminate items: not a good question to see if someone knows it
