Measurements and Assessment Exam Kwan
How is intelligence defined?
1. "intelligence" tests
2. Defined by "intelligence" research
3. Defined by laypersons vs. psychologists
Two factor theory of Intelligence Spearman
Intelligence theory that suggests only two factors are measured by intelligence tests, a general intelligence factor (g - general factor) common to all tests and a specific factor (s - specific factor) that is distinctive in each test. Spearman believed both factors jointly determined the measured value of human intelligence on any particular test.
A general ability, proposed by Spearman as the main factor underlying all intelligent mental activity
Spearman's idea of the ability to excel in certain areas, or specific intelligences (like music, art, business)
group factor theory of intelligence
Intelligence, considered as a mental trait, is the capacity to make impulses focal at their early, unfinished stage of formation. Seven factors (Primary Mental Abilities).
Thurstone's Primary Mental Abilities
Our intelligence may be broken down into seven (7) factors, agreed with Spearman about a "g" construct.
two types of intelligence: fluid and crystallized intelligence.
how you learn. Process. Decreases as you get older.
What you know. Content. Culturally based. Stays the same as you age.
Cognitive development and adaptation (Piaget)
Piaget's theory of intelligence. Gradual orderly changes by which mental processes become more complex and sophisticated.
Four stages of cognitive development
Sensorimotor, pre-operational, concrete operational, formal operational; part of Piaget's cognitive-developmental viewpoint.
Concepts or mental frameworks that organize and interpret information (like rules or guidelines).
According to Piaget, the process by which new ideas and experiences are absorbed and incorporated into existing mental structures and behaviors
According to Piaget, a process used to modify an existing schema when assimilation doesn't achieve the desired purpose, so that the desired purpose can be achieved.
Multiple intelligences (Gardner)
A term used to refer to Gardner's theory, which proposes that there are seven (or more) forms of intelligence.
Within-group / Between-group differences
Within-group differences are larger than between-group differences.
A worldwide increase in IQ scores over the last several decades, at a rate of about 3 points per decade.
What environmental factors influence intelligence?
Cultural environment, attendance at school, family environment, and toxins in the environment.
Types of instrument bias (3)
Content bias, internal structure bias, instrument and criterion relationships bias.
What people know varies between groups. No test can be created that will entirely eliminate the influences of learning and cultural experiences. Instruments may be biased in terms of the content being more familiar or appropriate for one group as compared with another group. (i.e. Assessment that asks for state bird of Wyoming).
internal structure bias
Reliability of assessment scores between groups. Scores on an instrument may be more reliable for one group than another or scores on an assessment may be reliable for one group but not reliable for another.
instrument and criterion relationships bias
Predictive validity differs between groups.
the degree to which construct-irrelevant factors systematically affect a group's performance.
when an instrument yields validity coefficients that are significantly different for 2 or more groups.
Standardized instrument that uses closed ended questions. Very common.
project something about your personality or experience in the response that you provide. Requires more specialized training. Validity and reliability tend to be low, without norms or much research support.
Informal assessment techniques
An information-gathering technique that involves watching people and recording (written or mentally) what one has noticed or observed.
face-to-face verbal exchange in which the counselor is requesting information or expression from the client.
Problems with validity of counselor observations
Validity of the observations may be restricted because only a small sample of behaviors is observed, which might not reflect the client's typical behavior. (Two examples are representativeness and generalizability).
the client may not be completely natural with the counselor.
not being able to generalize the behavior of the client to other settings.
fundamental attribution bias
When you see someone behaving in a certain way you attribute their behavior to their personality versus their circumstance. (ex. speeding - "reckless" vs. there's an emergency).
instruments that measure different traits or aspects of a client's character.
Structured Personality Inventories
paper-and-pencil test consisting of questions that respondents answer in one of a few fixed ways. (4 = content related procedure, personality theory, empirical criterion keying, and factor analysis)
content related procedure
instrument is developed where items are based on content analysis of behavior area to be tested. Focus on content relevance relating to personality attributes. (Ex. Content scales of MMPI-2)
personality theory inventory
instrument developed to measure theory. (Ex. Myers-Briggs = Jungian theory).
empirical criterion keying
instrument developed by selecting items that separate groups; items are included on the test because they have been found to accurately distinguish between people who do and don't possess the traits measured by the test. (Ex. MMPI-2)
A statistical procedure that identifies clusters of related items (called factors) on a test; used to identify different dimensions of performance that underlie one's total score. (Ex. NEO-PI-R)
A widely used personality assessment instrument that gives scores on ten important clinical traits.
a group of test items that suggest whether or not the test taker answers are valid, tell whether test scores should be invalidated for lying, inconsistency, or "faking good".
A self-report inventory developed to measure the Big Five personality dimensions. (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism)
Myers-Briggs Type Indicator
A personality framework that evaluates people on the basis of four types of preferences: extraversion v. introversion, sensing v. intuition, thinking v. feeling, and judging v. perceiving.
A standard series of ambiguous stimuli designed to elicit unique responses that reveal inner aspects of an individual's personality. Five types: Association, Construction, Completions, Arrangement/Selection, Expression.
A type of projective technique in which the respondent is presented with a stimulus and asked to respond with the first thing that comes to mind
A projective technique in which the respondent is required to construct a response in the form of a story, dialogue, or description.
A projective technique that requires the respondent to complete an incomplete stimulus situation.
A projective technique that involves individuals selecting and arranging objects.
A projective technique that involves the client freely expressing themselves. (Ex. art, music, dance).
A type of measure that is related to the individual's evaluation of their own performance or feelings about themselves.
Psychological Theories of Personality
Freud's Psychoanalytical Theory, Type theories of personality, Phenomenological theories of personality, Behavioral and Social Learning theories of personality, Trait theories of personality.
Freud's Psychoanalytical Theory of personality
Existence of unconsciousness that influences human behavior. Id = "I want", SuperEgo = "You shall", Ego = The balancer between the two and external reality. When the Ego is not able to negotiate the three successfully, anxiety arises. Defense mechanisms are used to negotiate all three and reduce anxiety.
Type theories of personality
Attempts to sort individuals into discrete categories or type.
Phenomenological theories of personality
Essentially, your personality is however you define it. personality theory that claims that the conscious mind is ultimately the source and resolution of any conflicts; says that people have an innate drive to reach their full potential and a need for acceptance, love, and belongingness and therefore humans are inherently good.
Behavioral and Social Learning theories of personality
Theories that state that "personality" is a set of stable behaviors manifested by an individual and as such, have been learned through operant and classical conditioning, and maintained through perceptions/attributions.
Trait theories of personality
the emphasis is placed on the person rather than situation or environment. A trait is any "relatively enduring way in which one individual differs from another". Personality traits describe characteristics which are enduring across time, e.g 'caring' or 'excitable'.
Why do we assess personality?
Helps us understand the client by giving us a comprehensive feedback about the factors that need to be addressed in counseling and help us pick interventions that are more appealing to the client and his/her needs.
self-report measures/projective measures
What are some problems with assessing personality by relying on observation or by interview?
subjective on the part of the interviewer and can be biased.
Personality tests that ask individuals to answer a series of questions about their characteristic behavior. Fall into 3 categories: theory-guided, factor-analytically derived, and criterion keyed.
What are some problems with assessing personality by self-report only?
Responders can approach questions as their idealized self or their devalued self and give a distorted and inaccurate/invalid personality profile. Validity scales are used to gauge response style.
Arithmetic average of a set of scores
The score at which 50% score below and 50% above
you determine by arranging scores in order from lowest to highest and finding the middle number.
the most frequent score in a distribution.
provides a measure of the spread of scores and indicates the variability between the highest and lowest scores.
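A minimal sketch of the four statistics described in the cards above (the arithmetic average, the middle score, the most frequent score, and the spread between the highest and lowest scores), using Python's standard `statistics` module. The `scores` list is hypothetical data made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical raw scores for five examinees (illustrative only).
scores = [70, 85, 85, 90, 100]

score_mean = mean(scores)                 # arithmetic average of the scores
score_median = median(scores)             # middle value once scores are sorted
score_mode = mode(scores)                 # most frequent score
score_range = max(scores) - min(scores)   # highest score minus lowest score

print(score_mean, score_median, score_mode, score_range)
```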
A standardized measure of a sample of a person's behavior which gathers data from a systematic and more "objective" perspective involving quantifiable data, standardized data collection procedures, and reliability of test data.
measurable data that can be repeated. (Nominal, Ordinal, Interval, and Ratio).
measurement of whatever quality a tester is trying to measure. X (observed score) = T (true score) + e (error)
Observed score (test score).
True score (never absolutely known).
Error (low error higher reliability)
A procedure for gathering client information that is used to facilitate clinical decisions, provide clients with information, or for evaluative purposes.
An individual instrument in which the focus is on evaluation.
an assessment tool that typically is not related to grading. In this book, instruments includes test, scales, checklists, and inventories.
Types of Assessment Tools
Standardized vs. Non standardized
Individual vs. Group
Objective vs. Subjective
Speed vs. Power
Verbal vs. Nonverbal
Cognitive vs. Affective
Standardized Assessment Tools
there must be fixed instructions for administering and scoring the instrument. Formal assessment that allows for comparison of performance to other children of the same age.
Nonstandardized Assessment Tools
has not met these guidelines and may not provide the systematic measure of behavior that standardized instruments provide. Informal testing or assessment.
Group Assessment Tools
Systematic measure often more convenient and less time consuming, but can be difficult to observe all examinees and note behaviors while they take the instrument.
Individual Assessment Tools
a substantial amount of info can often be gained by administering an instrument to an individual and by observing the client's nonverbal behaviors
there are predetermined methods for scoring the assessment & individual doing the scoring is not required to make judgments.
require an individual to make professional judgments in scoring the assessment
Verbal Assessment Tools
require individuals to use verbal skills, or the instructions are given orally or must be read; can be problematic for those whose primary language is not English.
Nonverbal Assessment Tools
aka non language, require no language on the part of either the examiner or the examinee
ex. performance tests-require manipulation of objects with min verbal influences
items may vary in difficulty with more credit given to more difficult items.
simply examines the number of items completed in a specified period of time.
assess perceiving, processing, concrete and abstract thinking and remembering
Types of Cognitive Instruments
intelligence/general ability tests, achievement tests, aptitude tests
what you can do now
what you can do based on specific standards.
what you can do now that predicts future performance.
assess interest, attitudes, values, motives, temperaments, and the non cognitive aspects of personality.
Structured Personality Instruments
individuals respond to a set of established questions and select answers from the provided alternatives.
A type of personality assessment that provides the client with a relatively ambiguous stimulus, thus encouraging a non structured response. The assumption underlying these techniques is that the individual will project his or her personality into the response. The interpretation of projective techniques is subjective and requires extensive training in the technique.
A scale of measurement characterized by assigning numbers to name or represent mutually exclusive groups (e.g., 1=male, 2=female).
Type of measurement scale in which the degree of magnitude is indicated by the rank ordering of the data.
Type of measurement scale in which the units are in equal intervals. (Ex. intervals of weight: if you gain 5 pounds, those 5 pounds are always the same, whether you are going from 115-120 or 225-230)
A scale of measurement that has both interval data and meaningful zero (e.g., height and weight).
Instruments in which the interpretation of performance is based on the comparison of an individual's performance with that of a specified group of people (allow for a mean and standard deviation to emerge.)
Instruments designed to compare an individual's performance to a stated criterion or standard. Allow for externally-determined scores to define a test's features and a particular client's score.
unadjusted scores on an instrument before they are transformed into standard scores. An example of a raw score is the number of answers an individual gets correct on an achievement test.
A chart that summarizes the scores on an instrument and the frequency or number of people receiving that score. Scores are often grouped into intervals to provide an easy-to-understand chart that summarizes overall performance.
The extent to which the scores in a data set tend to vary from each other and from the mean. In assessments it's important to examine how scores vary so that we can determine if a person is high or low compared with others, and how much higher or lower a person's score is.
the mean square deviation
A measure of variability based on the squared deviations of the data values about the mean
A computed measure of how much scores vary around the mean score. It is the square root of the variance.
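The two cards above can be sketched as follows. This assumes the "mean square deviation" definition, which divides by N (the population form); `statistics.variance` and `statistics.stdev` divide by N - 1 instead. The `scores` list is hypothetical.

```python
from math import sqrt
from statistics import pvariance, pstdev

# Hypothetical scores (illustrative only); their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Variance by hand: the mean of the squared deviations from the mean.
score_mean = sum(scores) / len(scores)
variance_by_hand = sum((s - score_mean) ** 2 for s in scores) / len(scores)

# The same quantities from the standard library (population versions).
variance = pvariance(scores)
sd = pstdev(scores)  # square root of the variance

print(variance_by_hand, variance, sd)
```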
bell-shaped and symmetrical
occurs when the distribution of scores is normalized. ("bell curve")
distribution is not symmetrical and majority of people either scored in the low range or the high range.
Positively Skewed Distributions
majority of scores are at the lower end of the distribution.
Negatively Skewed Distributions
majority of scores are on the higher end of distribution.
A ranking that provides an indication of the percent of scores that fall at or below a given score.
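One way to compute the ranking described above, taking "at or below" literally. The function name `percentile_rank` and the `scores` list are hypothetical; other textbooks count only half of the tied scores, which this sketch does not do.

```python
def percentile_rank(scores, score):
    """Percent of scores in the group that fall at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical norm group of ten scores (illustrative only).
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_of_80 = percentile_rank(scores, 80)
print(rank_of_80)
```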
scores that represent an individual's relative deviation from the mean of the standardization sample.
standardized scores; based on the assumption that the mean is 0 and the standard deviation is ±1.
standardized scores; based on the assumption that the mean is 50 and the standard deviation is ±10.
Standard IQ scores
standardized scores; based on the assumption that the mean is 100 and the standard deviation is ±15.
Standardized Educational testing
standardized scores; based on the assumption that the mean is 500 and the standard deviation is ±100.
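The four standard-score scales above are the same deviation from the mean expressed in different units. A minimal sketch, assuming a hypothetical raw-score distribution with mean 40 and standard deviation 8; the function names are made up for illustration.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd

def rescale(z, new_mean, new_sd):
    """Express a z-score on a scale with the given mean and standard deviation."""
    return new_mean + z * new_sd

# Hypothetical raw score and norm-group statistics (illustrative only).
z = z_score(52, mean=40, sd=8)       # mean 0, SD 1
t = rescale(z, 50, 10)               # T-score scale: mean 50, SD 10
iq = rescale(z, 100, 15)             # IQ-type scale: mean 100, SD 15
edu = rescale(z, 500, 100)           # educational-testing scale: mean 500, SD 100

print(z, t, iq, edu)
```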
Item Response Theory
one way to calculate age or grade equivalent scores, items are developed to measure performance at certain developmental levels
a group that is actually being tested
larger group of interest from which the sample group is drawn.
How can we know that we can trust the Observed score to be approximately equal to that of the true score?
Based on reliability (minimization of measurement error).
As e (error) decreases, reliability increases and X (observed score) is closer to the T (true score).
How can we know that the Observed score measures something enough to make meaningful interpretations/judgments about it?
Based on validity (meaningful comparison with other indicators of the measure construct)
Sources of Error
(3) testing process, test content, test scoring
Errors in Testing Process
examples: (1) Variation in testing process (ex: environment, tester variables, test order) (2) testee variables (ex: mental state, motivation, other factors such as anxiety/mood). Minimized by constraining variation in testing process.
Errors in Test Content
examples: (1) Asking questions that assess a different construct than one you are trying to measure. (asking a biology question for a math test). (2) Practice effects or coaching.
Errors in Test Scoring
examples: when scoring the MMPI you miscount the responses for a scale which influences the observed score and makes it inaccurate.
Ability of a test to yield very similar scores for the same individual over repeated testings
Types of reliability
(1) test-retest, (2) split-half, (3) alternate forms, (4) inter-rater
method for determining the reliability of a test by comparing a test taker's scores on the same test taken on separate occasions and measuring the correlation of performance.
the instrument is given once and then split in half to determine reliability, frequently the instrument is divided in half by using scores from even and odd items.
First divide the instrument into equivalent halves then correlate the individual's score on the two halves.
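The odd/even split described above can be sketched as follows. The response matrix is hypothetical (1 = correct, 0 = incorrect). The Spearman-Brown step at the end is not mentioned in the cards; it is a standard adjustment added here as an assumption, because a half-length test tends to understate the reliability of the full instrument.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical responses: one row per examinee, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

# Split each examinee's items into odd-numbered and even-numbered halves.
odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6

# Correlate the two half-test scores.
r_half = pearson_r(odd_totals, even_totals)

# Spearman-Brown adjustment (an addition, not from the cards).
r_full = 2 * r_half / (1 + r_half)

print(round(r_half, 3), round(r_full, 3))
```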
two different forms of the same test (same format, same number of items, sampling the content domain equally). Scores on the two forms are correlated to estimate reliability, as in test-retest, using the Pearson Product-Moment Correlation.
measurement is based on ratings from experts, the higher the reliability, the more given ratings are identical.
classical test theory
suggests that every score has two hypothetical components: a true score and an error component, how much of any score is true and how much is error? (Obs = T + E)
A number between 0 and 1 that expresses the relationship between the error variance, the true variance, and the observed score. A zero correlation indicates no relationship. The closer to 1 the coefficient is, the more reliable the tool.
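A purely illustrative sketch of how the coefficient relates true, error, and observed variance under classical test theory (Obs = T + E). It assumes the usual definition of reliability as true variance divided by observed variance. True scores are never actually known, so the `true_scores` and `errors` lists are hypothetical, chosen so that error is uncorrelated with true score.

```python
from statistics import pvariance

# Hypothetical true scores and errors (in practice neither is observable).
true_scores = [80, 90, 100, 110, 120]
errors = [2, -4, 4, -4, 2]  # mean 0 and uncorrelated with the true scores

# Observed score = true score + error.
observed = [t + e for t, e in zip(true_scores, errors)]

true_var = pvariance(true_scores)
error_var = pvariance(errors)
observed_var = pvariance(observed)

# Reliability: proportion of observed variance that is true variance.
reliability = true_var / observed_var

print(true_var, error_var, observed_var, round(reliability, 3))
```

With less error variance, the ratio moves closer to 1, which matches the card's statement that a coefficient closer to 1 indicates a more reliable tool.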
Constant errors that do not fluctuate. Affects the accuracy of results. Can be eliminated by fixing source of error.
reliability error that lacks a systematic pattern, with occurrences presumed random (e.g., one person is affected because only their paper has a typo).
provides an indication of consistency by examining the relationship between two scores.
A measure of the relationship between two variables.
A finding that two factors vary systematically in opposite directions, one increasing as the other decreases.
A correlation where as one variable increases, the other also increases, or as one decreases so does the other. Both variables move in the same direction.
provides a numerical indicator of the relationship between two sets of data, the closer to 1.00 (either positive or negative) the stronger the relationship, the closer to .00 the smaller the relationship (lack of evidence of a relationship).
coefficient of determination
the percentage of variance shared between two variables. Represented by r(squared).
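A short sketch of the correlation coefficient and the coefficient of determination from the cards above. The two score lists are hypothetical, and `pearson_r` is a hand-written helper rather than a library call.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical paired scores on two measures (illustrative only).
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]

r = pearson_r(test_a, test_b)   # positive: both variables move together
r_squared = r ** 2              # proportion of variance the two measures share

print(round(r, 3), round(r_squared, 3))
```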
When is the test-retest method appropriate?
1. characteristic or trait being measured must be stable over time.
2. there should be no differential in practice effect.
3. no differential in learning between tests.
coefficient of stability
An estimate of test-retest reliability obtained during time intervals of 6 months or longer.
Internal Consistency Measures of Reliability
estimates reliability by using one administration and a single form of the instrument by dividing the instrument in different manners and correlating the scores from the different portions of the instrument.
criterion-referenced instruments
compare an individual's performance with a standard or criterion, not dependent on others' performance. (ex. achievement tests)
Standard Error of Measure
provides an estimation of the range of scores if someone were to take an instrument over and over again (how much error is present in any given score)
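The card above does not give a formula. A common classical test theory formula is SEM = SD × √(1 − reliability), and the sketch below assumes it. The standard deviation, reliability coefficient, and observed score are hypothetical, and the "about 68%" reading of the ±1 SEM band assumes normally distributed error.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); formula assumed, not from the cards."""
    return sd * sqrt(1 - reliability)

# Hypothetical instrument: SD of 15 and a reliability coefficient of .91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)

# Band of plus or minus one SEM around a hypothetical observed score of 100.
observed = 100
band = (observed - sem, observed + sem)

print(round(sem, 2), tuple(round(b, 2) for b in band))
```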
Being true. Is the test measuring what it is supposed to measure? No test is 100% valid because of testing error or bias. Validity is more important than reliability. The standard deviation statistic is used. Validity is based on the test content.
Types of Validity
(1) content, (2) construct, (3) criterion-related
aka internal validity. the degree to which the instrument's content indicates the items, questions, or tasks adequately represent the intended behavior domain. ex. a 3rd grade spelling test has words which an educator would expect these children to know how to spell.
aka external validity. Test results should provide some estimate of how the individual will perform in a related area. ex. whether SAT predicts academic performance in college.
Two types of Criterion-Related Validity
Predictive validity and Concurrent validity
determined by some measure used in the future
determined by simultaneously-given measures (where both measures are yielding similar results)
Extent to which a selection device measures a quality or trait that is not tangible but assumed to exist (e.g., intelligence or mechanical comprehension). Constructs need to be operationalized (Ex. how would you define aggression?)
on the surface, the instrument looks good, not truly an indicator of validity and should not be considered one.
differential item functioning
arises when test takers from different cultures have the same ability level on the test construct, but the item or test yields very different scores for the two cultures. (ex. Caucasian, African American, Latino - in terms of % who get answers correct)
evidence based on relations to other variables
entails examining the relationship between the assessment scores and other pertinent things.
ex. investigating whether an assessment of self-esteem correlates with other measures of self-esteem
a statistical tool often used in providing validation evidence related to an instrument's relationship with other variables
1. select an appropriate group
2. administer the instrument
3. correlate the performance on the instrument w/ criterion info
4. the result = validity coefficient
helpful to counselors in 2 ways
1. they can compare them from different instruments & select the instrument that has the highest relationship between the instrument and the relevant variables or criteria
2. it allows the counselor to examine the amount of shared variance between the instrument and pertinent variables
means that an instrument is related to other variables to which it should theoretically be positively related.
ex. if a test is designed to measure depression and correlates highly with another test that measures depression.
type of criterion-related validity, used when we want to make an immediate decision, such as a diagnosis; there is no lag between when an instrument is given and when the criterion info is gathered
type of criterion-related validity; there is a lag between when the test is administered and the time the criterion info is gathered
ex. a test to id couples who will stay married for more than 10 years, we would test right before a couple married and then wait ten years to gather the criterion-evidence
A type of correlational procedure that focuses on predicting the values of an outcome based on its correlation with another variable. "line of best fit".
regression chart example
a. pretty good predictor b/c all scores are near the line
b. not as good of a predictor b/c the points are more spread out
c. such a poor relationship that the regression line could not be determined
the equation that describes the linear relationship between the predictor variable and the criterion variable
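A minimal least-squares sketch of the "line of best fit" and the regression equation described above. The predictor and criterion scores are hypothetical, and `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cross / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical predictor (test score) and criterion (later outcome) values.
predictor = [1, 2, 3, 4, 5]
criterion = [2, 4, 5, 4, 5]

slope, intercept = fit_line(predictor, criterion)

# Regression equation: predicted criterion = intercept + slope * predictor.
predicted_for_6 = intercept + slope * 6

print(round(slope, 2), round(intercept, 2), round(predicted_for_6, 2))
```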
Like the simple regression, we have a single outcome, but more than one predictor.
A table that portrays the established relationship between test scores and expected outcome on a relevant task
standard error of estimate
indicates the margin of expected error in an individual's predicted criterion score as a result of imperfect validity
ex. a test developer of a test designed to predict suicidal behaviors would want to use this when determining a score at which a counselor should be concerned about their clients. provides a range of scores.
A measure of the scatter of points around a regression line.
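The cards above do not give a formula. A common one is SEE = SD of the criterion × √(1 − r²), where r is the validity coefficient, and the sketch below assumes it. All numbers are hypothetical, and some textbooks apply a small-sample correction that is omitted here.

```python
from math import sqrt

def standard_error_of_estimate(sd_criterion, validity):
    """SEE = SD_y * sqrt(1 - r**2); formula assumed, not from the cards."""
    return sd_criterion * sqrt(1 - validity ** 2)

# Hypothetical criterion SD of 10 and validity coefficient of .60.
see = standard_error_of_estimate(sd_criterion=10, validity=0.60)

# Band of plus or minus one SEE around a hypothetical predicted score of 50.
predicted = 50
band = (predicted - see, predicted + see)

print(round(see, 2), tuple(round(b, 2) for b in band))
```

A higher validity coefficient shrinks the band, which is the sense in which the margin of error results from imperfect validity.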
Type 1 Error
Type 2 Error
Assessment error in which pathology is reported (that is, test results are positive) when none is actually present. (Type 1 Error)
Assessment error in which no pathology is noted (that is, test results are negative) when one is actually present. (Type 2 Error)
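The two error types above can be counted from paired records of test result and actual status. The `cases` list is hypothetical data made up for illustration.

```python
# Each hypothetical case: (test result was positive, pathology actually present).
cases = [
    (True, True),    # correct positive
    (True, False),   # pathology reported when none is present
    (False, True),   # pathology missed when it is present
    (False, False),  # correct negative
    (True, False),   # pathology reported when none is present
]

# Type 1 errors: positive test result, no pathology actually present.
type_1_errors = sum(1 for positive, present in cases if positive and not present)

# Type 2 errors: negative test result, pathology actually present.
type_2_errors = sum(1 for positive, present in cases if not positive and present)

print(type_1_errors, type_2_errors)
```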
focuses on examining and evaluating each item within an assessment.
is an index that reflects the proportion of people getting the item correct.
provides an indication of the degree to which an item correctly differentiates among the examinees on the behavior domain of interest.
ex. a test that failed to tell the difference between those individuals that studied hard and knew the material and those who do not
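A sketch of the two item indices above. Item difficulty follows the card directly (proportion answering correctly). For discrimination, the cards do not name a method, so this assumes the extreme-groups approach: the proportion correct among high scorers minus the proportion correct among low scorers. The response lists are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of examinees who answered the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

def item_discrimination(upper, lower):
    """Extreme-groups index: upper-group minus lower-group proportion correct."""
    return item_difficulty(upper) - item_difficulty(lower)

# Hypothetical responses to one item, grouped by total test score.
upper_group = [1, 1, 1, 1, 0]  # examinees with the highest total scores
lower_group = [1, 0, 0, 0, 0]  # examinees with the lowest total scores

difficulty = item_difficulty(upper_group + lower_group)
discrimination = item_discrimination(upper_group, lower_group)

print(difficulty, round(discrimination, 2))
```

An index near zero would correspond to the example in the card: the item fails to separate examinees who know the material from those who do not.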
item response theory
the focus is on each item and on creating items that measure a particular ability or the respondent's level of a latent trait. a person's performance is based not on the total score but on the precise items that they answer
good in development of criterion-focused tests. (ex. freerice.com)
Tests that assess the skills of an individual that are necessary for the successful performance of a task.
A test designed to predict a person's future performance; aptitude is the capacity to learn.
Assess a person's accumulated knowledge relative to others.
Student's performances are compared to a criterion, or standard.
An individual's performance is compared to the group that was used to calculate the performance standards.
when a deficit in a basic psychological process results in academic impairment.
A condition of limited mental ability, indicated by an intelligence score of 70 or below and difficulty in adapting to the demands of life; varies from mild to profound.
Diagnosed when an individual's achievement is substantially below that expected for age, schooling, and level of intelligence (2 standard deviation difference between achievement and IQ test scores)
A psychological disorder marked by the appearance by age 7 of one or more of three key symptoms: extreme inattention, hyperactivity, and impulsivity.
Interview types (3)
structured, unstructured, semi-structured
A selection interview that consists of a predetermined set of questions for the interviewer to ask.
An interview in which the question-answer sequence is spontaneous, open-ended, and flexible.
An interview in which questions are posed in a standardized yet flexible way.