Reliability
Information on the consistency of measurements.
Terms in this set (16)
True score theory of reliability
Assumes that any observation (measure) is composed of the true value plus some random error value.
var(X) = var(T) + var(eX)
The true score theory expressed as the components of variance in a measure. var(X) is the variance of the observed measure, var(T) is the variance of the true scores, and var(eX) is the variance of the random error in the measure.
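The decomposition can be checked on a small example. The sketch below uses hypothetical toy data (not from the source) in which the errors are constructed to be exactly uncorrelated with the true scores, which is the condition under which the variances add exactly.

```python
from statistics import pvariance

# Hypothetical toy data: errors are chosen to be exactly uncorrelated
# with the true scores, so the decomposition holds without rounding.
true_scores = [10, 10, 20, 20]
errors = [-1, 1, -1, 1]

# Each observed score is the true score plus its random error.
observed = [t + e for t, e in zip(true_scores, errors)]

var_t = pvariance(true_scores)   # var(T)
var_e = pvariance(errors)        # var(eX)
var_x = pvariance(observed)      # var(X)

print(var_x, var_t + var_e)
```

In real data the true scores are never observed directly, so this decomposition is a model assumption rather than something that can be computed from the measure alone.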
Benefit of aggregating items that measure the same thing.
When you add items, for example questionnaire items, that measure the same construct, the true score component increases. The random error component grows much more slowly, because random errors are positive for some items and negative for others and so tend to cancel out when summed. So by adding items, the ratio of true score to error (or, in other words, the reliability) increases. This is why we create scales consisting of multiple items that measure the same thing.
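A minimal sketch of why aggregation helps, under the added assumption (not stated in the source) that the items are parallel: every item shares the same true score and has independent random error with the same variance. In that case the true-score variance of a sum of k items grows with k squared, while the error variance grows only with k. The function name and default values are illustrative.

```python
def reliability_of_sum(k, var_true=1.0, var_error=1.0):
    """Reliability of the sum of k parallel items (assumed model)."""
    true_part = k ** 2 * var_true    # shared true scores add coherently
    error_part = k * var_error       # independent errors add only their variances
    return true_part / (true_part + error_part)

# Reliability rises as more items measuring the same construct are added.
for k in (1, 2, 5, 10):
    print(k, round(reliability_of_sum(k), 3))
```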
Measurement error
True score theory assumes that errors of measurement are random and therefore tend to sum to zero. However, the true score reflects what the items have in common, and some items may have variance due to other constructs. A factor analysis would reveal this by showing that the item "loaded" on two different dimensions. This other dimension or construct that the item measures becomes part of the error variance and so reduces the item's reliability.
Systematic error variance
Variance in a measure that is due to some factor other than the one the item is intended to measure. It may be caused by environmental conditions (e.g., a noisy room when the test is given), poor wording of the questionnaire item, or a bias in the respondent's answer (e.g., a need to answer in a socially desirable way).
Reliability
In research, the term reliability means "repeatability" or "consistency". A measure is considered reliable if it would give us the same result over and over again (assuming that what we are measuring isn't changing!).
The estimate of reliability
A ratio of the true-score variance (var T) divided by the total variance (var X), or in other words, the proportion of true-score variance in the measure. Reliability is estimated at the group level, not the individual level.
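As a small illustration of the ratio, the sketch below computes reliability from a true-score variance and an error variance. The function name and the numbers are hypothetical; in practice var(T) is not observed and has to be estimated indirectly.

```python
def reliability(var_true, var_error):
    """Proportion of total variance that is true-score variance."""
    var_total = var_true + var_error   # var(X) = var(T) + var(eX)
    if var_total <= 0:
        raise ValueError("total variance must be positive")
    return var_true / var_total

print(reliability(8.0, 2.0))   # mostly true-score variance
print(reliability(5.0, 0.0))   # no error variance at all
print(reliability(0.0, 5.0))   # nothing but error variance
```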
Test-retest reliability
When a variable is measured at two points in time, the association of the variable at time 1 with itself at time 2 is one way to ESTIMATE reliability. Test-retest reliability can be used with any type of measure: a dichotomy (yes/no), a nominal level variable (type of employment), or a quantitative measure. Test-retest reliability for a dichotomous or nominal level variable would be tested using a chi-square analysis. Test-retest reliability for a quantitative scale would be ESTIMATED with a correlation (assuming the two scores are normally distributed).
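For a quantitative scale, the test-retest estimate is the correlation between the two administrations. A minimal sketch with a hand-written Pearson correlation; the scores are hypothetical and `pearson_r` is an illustrative helper, not a library function.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five people at two points in time.
time1 = [12, 15, 9, 20, 17]
time2 = [13, 14, 10, 19, 18]

test_retest = pearson_r(time1, time2)
print(round(test_retest, 3))
```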
The importance of size in estimating reliability
An estimate of reliability has a range of zero to 1. If all the variance in a measure is true score variance, then the true score variance divided by the total variance would have a ratio of 1. The closer the estimate is to 1, the better.
Inter-rater reliability
Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.
Parallel-forms reliability
Used to assess the consistency of the results of two tests constructed in the same way from the same content domain. For example, you might have two tests of arithmetic ability. They cover the same type of arithmetic problems but the numbers used in the problems differ.
Internal consistency reliability
Used to assess the consistency of results across items within a test. For example, if you wanted to use the sum of 5 Likert scale items as a scale, the internal consistency reliability is an estimate of how well they are measuring the same construct.
Average inter-item correlation
A measure of internal consistency that uses all of the items on the instrument that are designed to measure the same construct. We first compute the correlation between each pair of items, as illustrated in the figure. For example, if we have six items we will have 15 different item pairings (i.e., 15 correlations). The average inter-item correlation is simply the average or mean of all these correlations.
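The calculation can be sketched as follows, assuming a small hypothetical response matrix (five respondents by six items) and an illustrative `pearson_r` helper. With six items there are 15 distinct pairs.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: rows are respondents, columns are six items.
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 3, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 2],
]
items = list(zip(*responses))                      # one tuple of scores per item
pairs = list(combinations(range(len(items)), 2))   # every distinct item pairing

correlations = [pearson_r(items[i], items[j]) for i, j in pairs]
average_inter_item_r = sum(correlations) / len(correlations)
print(len(pairs), round(average_inter_item_r, 3))
```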
Average item-total correlation
Another measure of internal consistency that uses the inter-item correlations. In addition, we compute a total score (the sum) for the six items and use that as a seventh variable in the analysis. The average of the correlations of the items with the total score equals the average item-total correlation.
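A minimal sketch of the item-total version, using the same kind of hypothetical five-by-six response matrix. Each item is correlated with the sum of all six items, and the six correlations are averaged. Note that the total here includes the item itself, as the definition describes.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: rows are respondents, columns are six items.
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 3, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 2],
]
items = list(zip(*responses))
totals = [sum(row) for row in responses]   # the "seventh variable"

item_total_rs = [pearson_r(item, totals) for item in items]
average_item_total_r = sum(item_total_rs) / len(item_total_rs)
print(round(average_item_total_r, 3))
```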
Split-half reliability
A measure of internal consistency that begins by randomly assigning all items that purport to measure the same construct into two sets. We administer the entire instrument to a sample of people and calculate the total score for each randomly divided half. For example, for a six-item scale, randomly assign 3 of the items to one subscale and 3 to the other subscale. The correlation between the two subscales is the split-half reliability.
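The procedure can be sketched as below, assuming hypothetical responses and a fixed random seed so the split is repeatable. The seed, data, and helper names are illustrative choices, not part of the source.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: rows are respondents, columns are six items.
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 3, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 2],
]

# Randomly assign the six item indices to two halves of three items each.
item_indices = list(range(6))
random.Random(0).shuffle(item_indices)
half_a, half_b = item_indices[:3], item_indices[3:]

# Total score per respondent on each half, then correlate the two totals.
totals_a = [sum(row[i] for i in half_a) for row in responses]
totals_b = [sum(row[i] for i in half_b) for row in responses]
split_half_r = pearson_r(totals_a, totals_b)
print(half_a, half_b, round(split_half_r, 3))
```

A different random split would generally give a somewhat different estimate, which is the motivation for averaging over splits.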
Coefficient alpha (or Cronbach's alpha)
A measure of internal consistency that is based on all possible combinations of split-half reliabilities within a set of items. In other words, it is as if you computed one split-half reliability, then randomly divided the items into another set of split halves and recomputed the split-half reliability, and kept doing this until all possible split-half estimates of reliability had been computed. The average of these estimates (computed with a formula that does not require all that work) is the final estimate of the reliability.
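The source does not give the shortcut formula. A minimal sketch using the standard variance-based form of coefficient alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), is shown below with hypothetical responses. The function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Coefficient alpha from a respondents-by-items list of scores."""
    items = list(zip(*responses))            # one tuple of scores per item
    k = len(items)                           # number of items
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical data: rows are respondents, columns are six items.
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 3, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Sample variances are used for both the items and the total. Using population variances throughout would give the same alpha, because the scaling factor cancels in the ratio.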