Reliability:
Consistency in measurement
Reliability coefficient
is an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
Observed score
= True score plus error (X = T + E)
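A quick simulation (not part of the original set; all numbers are made up) shows how the reliability coefficient follows from X = T + E: when true scores and random error are independent, reliability is simply the share of observed-score variance contributed by true scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Classical test theory: X = T + E, with T and E independent.
true_scores = rng.normal(loc=100, scale=15, size=10_000)  # T
error = rng.normal(loc=0, scale=5, size=10_000)           # E (random error)
observed = true_scores + error                            # X = T + E

# Reliability coefficient: true-score variance / total observed variance.
reliability = true_scores.var() / observed.var()
print(f"Estimated reliability: {reliability:.3f}")  # approx. 225 / (225 + 25) = 0.90
```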
Error
refers to the component of the observed score that is unrelated to the testtaker's true ability or to the trait being measured
Measurement Error
- made up of random error and systematic error
Sources of Error Variance
- Test construction, test administration, and test scoring and interpretation
Test-retest reliability:
An estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test • Most appropriate for variables that should be stable over time (e.g., personality) and not appropriate for variables expected to change over time (e.g., mood) • As time passes, the correlation between the scores obtained on each testing decreases
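In practice this estimate is just a Pearson correlation between the two administrations. A minimal sketch with hypothetical scores:

```python
import numpy as np

# Hypothetical scores for six testtakers on two administrations of the same test.
time_1 = np.array([12, 18, 25, 30, 34, 41])
time_2 = np.array([14, 17, 27, 29, 36, 40])

# Test-retest reliability is the Pearson r between the two sets of scores.
r_test_retest = np.corrcoef(time_1, time_2)[0, 1]
print(f"Test-retest reliability: {r_test_retest:.3f}")
```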
Parallel forms:
For each form of the test, the means and the variances of observed test scores are equal
Alternate forms:
Different versions of a test that have been constructed so as to be parallel; they do not meet the strict requirements of parallel forms but item content and difficulty are similar between tests
Reliability is checked
by administering two forms of a test to the same group; scores may be affected by error related to the state of the testtakers (e.g., practice effects, fatigue) or to item sampling
Split-half reliability:
Obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once; entails three steps: (1) divide the test into two equivalent halves, (2) compute a Pearson r between scores on the two halves, and (3) adjust the half-test reliability using the Spearman-Brown formula
Spearman-Brown formula
allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test
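The general form of the formula is r_SB = n·r / (1 + (n − 1)·r), where r is the observed correlation and n is the factor by which the test is lengthened; for split-half estimates, n = 2. A small sketch (the function name is my own):

```python
def spearman_brown(r: float, n: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by a factor n
    from an observed correlation r; n=2 converts a half-test
    correlation into a full-test reliability estimate."""
    return (n * r) / (1 + (n - 1) * r)

# A split-half correlation of .70 implies full-test reliability of about .82.
print(round(spearman_brown(0.70), 3))  # 0.824
```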
Inter-item consistency:
The degree of relatedness of items on a scale; this helps gauge the homogeneity of a test
Kuder-Richardson formula 20:
Statistic of choice for determining the inter-item consistency of dichotomous items
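The formula is KR-20 = [k / (k − 1)] × [1 − Σp·q / σ²], where k is the number of items, p is the proportion answering an item correctly, q = 1 − p, and σ² is the variance of total scores. An illustrative sketch with made-up 0/1 responses:

```python
import numpy as np

def kr20(items: np.ndarray) -> float:
    """KR-20 for a persons-by-items matrix of dichotomous (0/1) responses."""
    k = items.shape[1]                   # number of items
    p = items.mean(axis=0)               # proportion correct per item
    total_var = items.sum(axis=1).var()  # variance of total scores
    return (k / (k - 1)) * (1 - (p * (1 - p)).sum() / total_var)

responses = np.array([[1, 1, 0, 1],
                      [1, 0, 0, 0],
                      [1, 1, 1, 1],
                      [0, 0, 0, 1],
                      [1, 1, 1, 0]])
print(f"KR-20: {kr20(responses):.3f}")
```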
Coefficient alpha:
Mean of all possible split-half correlations, corrected by the Spearman-Brown formula; it is the most popular approach for internal consistency, and the values range from 0 to 1
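Equivalently, α = [k / (k − 1)] × [1 − Σσᵢ² / σ²], where σᵢ² are the item variances and σ² is the variance of total scores; with dichotomous items it reduces to KR-20. A sketch with hypothetical rating-scale data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a persons-by-items score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0)        # variance of each item
    total_var = items.sum(axis=1).var()  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

ratings = np.array([[4, 5, 3, 4],
                    [2, 3, 2, 3],
                    [5, 5, 4, 5],
                    [3, 2, 3, 3],
                    [4, 4, 4, 4]])
print(f"Coefficient alpha: {cronbach_alpha(ratings):.3f}")
```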
Validity:
Estimate of how well a test measures what it purports to measure
Validation:
the process of gathering and evaluating evidence about validity
1. Content validity
- This is a measure of validity based on an evaluation of the subjects, topics, or content covered by the items in the test
2. Criterion-related validity
- This is a measure of validity obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures
3. Construct validity
- This is a measure of validity that is arrived at by executing a comprehensive analysis of: a. How scores on the test relate to other test scores and measures. b. How scores on the test can be understood within some theoretical framework for understanding the construct that the test was designed to measure
Face validity:
A judgment concerning how relevant the test items appear to be • If a test appears to measure what it purports to measure "on the face of it," it could be said to be high in face validity
Content validity:
A judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample • Do the test items adequately represent the content that should be included in the test?
Criterion-related validity:
A judgment of how adequately a test score can be used to infer an individual's most probable standing on some measure of interest (i.e., the criterion)
Concurrent validity:
An index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently)
Predictive validity:
An index of the degree to which a test score predicts some criterion, or outcome, measure in the future; tests used to make selection decisions are often evaluated in terms of their predictive validity
Construct validity:
Judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a construct • If a test is a valid measure of a construct, then high scorers and low scorers should behave as theorized. • All types of validity evidence, including evidence from the content- and criterion-related varieties of validity, come under the umbrella of construct validity
Evidence of homogeneity
- How uniform a test is in measuring a single concept
Evidence of changes with age
- Some constructs are expected to change over time (e.g., reading rate)
Evidence of pretest-posttest changes
- Test scores change as a result of some experience between a pretest and a posttest (e.g., therapy)
Evidence from distinct groups
- Scores on a test vary in a predictable way as a function of membership in some group
Bias:
A factor inherent in a test that systematically prevents accurate, impartial measurement • Bias implies systematic variation in test scores • Prevention during test development is the best cure for test bias
Rating error:
A judgment resulting from the intentional or unintentional misuse of a rating scale • Raters may be too lenient (leniency error), too severe (severity error), or reluctant to give ratings at the extremes (central tendency error) • Halo effect: A tendency to give a particular person a higher rating than he or she objectively deserves because of a favorable overall impression
Fairness:
The extent to which a test is used in an impartial, just, and equitable way
Utility:
The usefulness or practical value of testing to improve efficiency
Costs
- One of the most basic elements of utility analysis is the financial cost associated with a test
Benefits
- We should take into account whether the benefits of testing justify the costs of administering, scoring, and interpreting the test • Benefits can be defined as profits, gains, or advantages
Utility analysis:
A family of techniques that entail a cost-benefit analysis designed to yield information relevant to a decision about the usefulness and/or practical value of a tool of assessment
Utility gain
refers to an estimate of the benefit (monetary or otherwise) of using a particular test or selection method
Cut scores
may be relative, which implies that they are determined in reference to normative data (e.g., selecting people in the top 10% of test scores)
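A relative cut score can be computed directly from the norm group's score distribution; for example (numbers hypothetical), selecting the top 10% means cutting at the 90th percentile:

```python
import numpy as np

# Hypothetical norm-group scores; a cut at the 90th percentile
# selects roughly the top 10% of testtakers.
scores = np.array([55, 61, 64, 68, 70, 73, 75, 79, 84, 91])
relative_cut = np.percentile(scores, 90)
print(f"Relative cut score: {relative_cut}")
```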
Taylor-Russell tables
provide an estimate of the percentage of employees hired by the use of a particular test who will be successful at their jobs, given different combinations of three variables: the test's validity, the selection ratio used, and the base rate
Naylor-Shine tables
help obtain the difference between the means of the selected and unselected groups to derive an index of what the test (or some other tool of assessment) is adding to already established procedures
Brogden-Cronbach-Gleser formula
is used to calculate the dollar amount of a utility gain resulting from the use of a particular selection instrument under specified conditions
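One common statement of the formula is: utility gain = (N selected) × (average tenure in years) × (validity coefficient) × (SD of job performance in dollars) × (mean standardized test score of those selected) − (N tested) × (cost per test). A sketch with entirely hypothetical figures:

```python
def bcg_utility_gain(n_selected: int, tenure_years: float, validity: float,
                     sd_perf_dollars: float, mean_z_selected: float,
                     n_tested: int, cost_per_test: float) -> float:
    """Dollar utility gain: (N)(T)(r_xy)(SD_y)(Z-bar) minus total testing cost."""
    benefit = n_selected * tenure_years * validity * sd_perf_dollars * mean_z_selected
    return benefit - n_tested * cost_per_test

# Hypothetical: hire 10 of 100 applicants, 2-year average tenure, validity .40,
# SD_y = $10,000, selected applicants average z = 1.0, $50 per test.
gain = bcg_utility_gain(10, 2, 0.40, 10_000, 1.0, 100, 50)
print(f"Utility gain: ${gain:,.0f}")  # $75,000
```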
Fixed cut scores:
Made on the basis of having achieved a minimum level of proficiency on a test (e.g., a driver's license exam)
Multiple cut scores:
The use of multiple cut scores for a single predictor (e.g., students may achieve grades of A, B, C, D, or F)
Multiple hurdles:
Achievement of a particular cut score on one test is necessary in order to advance to the next stage of evaluation in the selection process (e.g., Miss America contest)
Angoff method:
Judgments of experts are averaged to yield cut scores for the test. Can be used for personnel selection based on traits, attributes, and abilities. Problems arise when there is disagreement among experts
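An illustrative sketch (the ratings are invented): each expert estimates, item by item, the probability that a minimally competent testtaker answers correctly; summing within experts and averaging across them yields the cut score.

```python
import numpy as np

# Hypothetical: 3 experts estimate, for 5 items, the probability that a
# minimally competent testtaker answers each item correctly.
ratings = np.array([[0.9, 0.7, 0.6, 0.8, 0.5],   # expert 1
                    [0.8, 0.6, 0.7, 0.7, 0.6],   # expert 2
                    [0.9, 0.8, 0.5, 0.8, 0.4]])  # expert 3

# Sum within each expert, then average across experts.
cut_score = ratings.sum(axis=1).mean()
print(f"Angoff cut score: {cut_score:.2f} out of 5 items")
```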
Known groups method:
Entails collection of data on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest
Discriminant analysis:
A family of statistical techniques used to shed light on the relationship between identified variables (such as scores on a battery of tests) and two (and in some cases more) naturally occurring groups (such as persons judged to be successful at a job and persons judged unsuccessful at a job)
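A minimal sketch using scikit-learn's LinearDiscriminantAnalysis (the data are invented):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical battery of two test scores for six employees judged
# successful (1) or unsuccessful (0) at a job.
scores = np.array([[85, 70], [90, 80], [78, 75], [60, 55], [55, 65], [62, 50]])
success = np.array([1, 1, 1, 0, 0, 0])

lda = LinearDiscriminantAnalysis().fit(scores, success)
print(lda.predict([[80, 72]]))  # predicted group for a new testtaker
```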