Assessment Midterm
Terms in this set (74)
assessment
process that integrates test results with information from other sources, including other tests and the individual's social, educational, employment, health, or psychological history. Serves for screening, diagnosis, treatment planning and goal identification, and progress evaluation
psychological test
objective, standardized measure of behavior
authentic assessment
assessment tasks that evaluate student abilities by measuring how well the student performs in real-life contexts
statistics
set of tools and techniques used for describing, organizing, and interpreting data or information
descriptive statistics
set of statistical analyses that can be used to organize and describe the characteristics of a set of data
inferential statistics
set of statistical analyses used to draw inferences from a smaller group of data (sample) that then can be applied to a larger group (population)
variable
construct that can assume multiple values, qualitative or quantitative
quantitative variables
defined using numeric values or scores
continuous data
data that can be subdivided infinitely, so any recorded value is an approximation limited by the precision of the measurement
discrete data
variables or units of measure that cannot be divided or broken down into smaller units
qualitative variables
nonnumeric in nature and defined using nominal or categorical data
nominal scale
used to classify or categorize data into groups that have different names but are not related to each other in any other systematic way (sex, ice cream flavors)
ordinal scale
used to rank order data along some type of continuum so that each value on the scale has a unique meaning and appears in an ordered relationship to every other value on the scale in terms of size and magnitude (1st, 2nd, 3rd place or S/M/L)
interval scale
used to categorize, rank order, and arrange data so that an equal interval unit appears between each of the scores (year, temperature)
ratio scale
has all the properties of an interval scale plus a true zero point, where a value of zero represents the absence of the variable being measured (age, speed)
grouped frequency distribution
distribution in which individual x-values are combined into sets known as intervals. Frequencies are counted for each of these intervals
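A minimal sketch of building a grouped frequency distribution, assuming integer scores and equal-width intervals. The function name `grouped_frequencies`, the interval width, and the scores are made up for illustration.

```python
from collections import Counter

def grouped_frequencies(scores, width, start):
    """Count how many scores fall in each interval of the given width."""
    # Map each score to the index of the interval it belongs to.
    counts = Counter((s - start) // width for s in scores)
    table = []
    for k in sorted(counts):
        low = start + k * width
        high = low + width - 1  # inclusive upper limit, assuming integer scores
        table.append((low, high, counts[k]))
    return table

scores = [52, 55, 61, 64, 67, 70, 71, 78, 83, 89]  # made-up test scores
for low, high, freq in grouped_frequencies(scores, width=10, start=50):
    print(f"{low}-{high}: {freq}")
```

Intervals with no scores are simply omitted from the table in this sketch.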
graphs
pictorial representations of the data and information that one would normally find in a frequency distribution table
histogram
graph that uses vertical bars to represent the frequencies of a set of variables. Measured values on x-axis and frequency counts on y-axis
bar graphs
used to represent nominal data. Bars do not touch to represent the discrete aspects of the data
frequency polygon
variation of a histogram where a line is drawn to connect the midpoints of the bars (one point for each measured value or interval in the distribution)
positive skew
majority of scores fall on the low end of the distribution and the asymmetrical tail extends into the positive side of the graph
negative skew
majority of scores fall on the upper end of the distribution and the asymmetrical tail extends into the negative side of the graph
central tendency
indicates the center or middle of a distribution
kurtosis
peakedness or flatness of a frequency distribution
mean
sum of all scores in a distribution divided by the total number of scores
median
middle score or the score that divides a distribution evenly in half
mode
score with the greatest frequency in a distribution
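The three measures of central tendency can be computed with Python's standard `statistics` module. This is a small sketch on made-up scores, chosen so that the three values differ.

```python
import statistics

scores = [70, 70, 75, 80, 85, 90, 97]  # made-up scores

mean = statistics.mean(scores)      # sum of scores divided by number of scores
median = statistics.median(scores)  # middle score of the sorted list
mode = statistics.mode(scores)      # most frequent score

print(mean, median, mode)
```

Here the mean (81) sits above the median (80), which sits above the mode (70), the pattern typically seen when a few high scores pull the tail toward the positive side.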
variability
describes the degree to which scores in a distribution are spread out or clustered together
range
difference between the highest and lowest value in a distribution
standard deviation
average amount by which individual scores in a distribution vary from the mean. Computed as the square root of the variance
variance
mean of all squared deviation scores
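A short sketch of the three variability measures on made-up scores. It assumes the population formulas (dividing by N), which match the "mean of all squared deviation scores" definition. `statistics.variance` and `statistics.stdev` would give the sample versions, which divide by N - 1.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores with a mean of 5

score_range = max(scores) - min(scores)   # highest minus lowest value
variance = statistics.pvariance(scores)   # mean of squared deviations from the mean
std_dev = statistics.pstdev(scores)       # square root of the variance

print(score_range, variance, std_dev)
```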
standard scores
distance an individual's raw score is above or below the mean of the reference group in terms of standard deviation units
normal curve
mean, median, and mode all share the same value and an equal percentage of scores fall on either side of the distribution
stanine
divides a data distribution into one of nine possible scores, with 1 being the lowest and 9 being the highest. Stanines have a mean of 5 and a standard deviation of 2, and each stanine is 1/2 standard deviation wide
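One common way to approximate a stanine from a z-score is to rescale to a mean of 5 and a standard deviation of 2, round to the nearest whole number, and clamp to the 1 to 9 range. The sketch below assumes that approach, and the function name `stanine` is illustrative.

```python
import math

def stanine(z):
    """Approximate stanine for a z-score (mean 5, SD 2, limited to 1-9)."""
    # floor(x + 0.5) rounds halves upward consistently at interval boundaries.
    scaled = math.floor(2 * z + 5.5)
    return max(1, min(9, scaled))

for z in (-3.0, -0.3, 0.0, 0.3, 1.0, 3.0):
    print(z, stanine(z))
```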
percentile
numeric value that indicates the percentage of people in a reference group that fall at or below the individual's raw score
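A minimal sketch of the "at or below" definition of a percentile. The function name `percentile_rank` and the reference scores are made up for illustration.

```python
def percentile_rank(score, reference_scores):
    """Percentage of the reference group scoring at or below the given score."""
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100 * at_or_below / len(reference_scores)

reference = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # made-up reference group
print(percentile_rank(80, reference))  # 6 of 10 scores are at or below 80
```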
reliability
ability of test scores to be interpreted in a consistent and dependable manner across multiple test administrations
carryover effect
occurs when an experimental treatment continues to affect a participant long after the treatment is administered and the scores on the first administration of a test influence the scores obtained on subsequent administrations of the same test
correlation
used to measure and describe a relationship between two variables
correlation coefficient
numeric value, ranging from -1 to +1, that indicates the strength and direction of the relationship between two variables
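The card does not name a specific coefficient. The sketch below assumes the Pearson product-moment correlation, one widely used choice for interval or ratio data, and the function name `pearson_r` is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4]       # made-up data
scores = [2, 4, 6, 8]
print(pearson_r(hours, scores))        # perfect positive relationship
print(pearson_r(hours, scores[::-1]))  # perfect negative relationship
```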
fatigue
situation where clients tire from multiple administrations of a test and their performance decreases as they grow weary
internal consistency
measure of reliability which evaluates how strongly items in an assessment are related in a single administration
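The card does not name a particular index. As an assumption, the sketch below uses Cronbach's alpha, one widely used internal consistency coefficient. The function name `cronbach_alpha` and the data layout (one list of respondent scores per item) are illustrative.

```python
import statistics

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of respondent scores per item."""
    k = len(item_scores)
    # Sum of the variances of the individual items.
    sum_item_var = sum(statistics.pvariance(item) for item in item_scores)
    # Variance of each respondent's total score across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    total_var = statistics.pvariance(totals)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Made-up data: three items answered identically by four respondents.
items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(items))
```

Perfectly consistent items, as in this toy data, give an alpha of 1. Real item sets fall below that.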
practice effect
individuals improve their scores across test administrations as a result of increased familiarity and comfort with a test and the content that is being assessed
Spearman-Brown prophecy formula
adjusted version of the correlation coefficient formula used to account for the fact that the correlation is being computed between two halves of a test rather than two full-length versions of a test
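For the split-half case described here, the Spearman-Brown correction is commonly written as r_full = 2 * r_half / (1 + r_half). A minimal sketch, with an illustrative function name and a made-up half-test correlation:

```python
def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between two test halves."""
    return 2 * r_half / (1 + r_half)

print(spearman_brown(0.6))  # a half-test correlation of .60 is stepped up to about .75
```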
standard error of measurement
standard deviation of the distribution of scores an individual would be expected to obtain across repeated administrations of the same test. Used to estimate how far an observed score is likely to fall from the individual's true score
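A common formula for the standard error of measurement is SEM = SD * sqrt(1 - r), where SD is the standard deviation of the test scores and r is the test's reliability coefficient. The sketch below assumes that formula, and the numbers are made up.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and reliability coefficient."""
    return sd * math.sqrt(1 - reliability)

sem = standard_error_of_measurement(sd=15, reliability=0.91)
observed = 100  # made-up observed score
# Roughly 68% of the time, the true score is expected within one SEM of the observed score.
print(round(sem, 2), round(observed - sem, 2), round(observed + sem, 2))
```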
systematic error
test measures a domain other than the trait it was designed to assess
unsystematic error
random, unpredictable error
concurrent validity
obtained when a test score and criterion performance measure are collected at the same time (does the test relate to an existing similar measure?)
construct validity
extent to which a test accurately and thoroughly measures a particular construct or trait (does the test relate to underlying theoretical concepts?)
content validity
ability of an instrument to fully assess or measure a construct of interest by sufficiently sampling from the entire universe of items for which the instrument was designed to sample
content validity ratio
statistic that compares the number of raters who deem an item to be an essential component of the construct being measured with the total number of raters who evaluate the item
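The card does not give a formula. As an assumption, the sketch below uses Lawshe's content validity ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of raters who judge the item essential and N is the total number of raters. The function name is illustrative.

```python
def content_validity_ratio(n_essential, n_raters):
    """Lawshe's CVR: ranges from -1 (no rater says essential) to +1 (all do)."""
    half = n_raters / 2
    return (n_essential - half) / half

print(content_validity_ratio(8, 10))   # 8 of 10 raters say "essential"
print(content_validity_ratio(5, 10))   # exactly half say "essential"
```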
convergent validity
scores on a test are compared to scores obtained on other tests believed to measure the same construct
criterion
score on a separate test or instrument that purports to measure the same construct or set of abilities as the test in question
criterion validity
an empirical form of measurement validity that establishes the extent to which a measure is correlated with a behavior or concrete outcome that it should be related to
discriminant validity
form of validity whereby scores on a test are contrasted with scores obtained on other tests believed to measure alternate constructs
face validity
assesses whether an instrument appears to look like it measures what it is meant to measure
factor analysis
used to determine how well items mathematically group together, thus indicating similarity and the measurement of a common construct
item analysis
series of statistical tests and procedures that can be used to assess for homogeneity in a test
predictive validity
the extent to which a score on a scale or test predicts scores on some criterion measure
test familiarity
examinee's familiarity with the materials/stimuli on an assessment
validity
degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. Is the test legitimate? Does it evaluate what it's supposed to?
validity coefficient
representative of the relative strength of an assessment's validity
accommodation
process of modifying existing schemas or creating new ones to deal with new information
standardization
process of establishing uniform procedures for an assessment so that the observation, administration, equipment, materials, and scoring rules remain the same for all who are administered the test
z-score
most common standard score. mean is 0 and standard deviation is 1
t-score
standard score with a mean of 50 and a standard deviation of 10, cannot be negative
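A minimal sketch of converting a raw score to a z-score and then to a T-score, using the means and standard deviations given on these cards. The function names and the raw score, group mean, and group standard deviation are made up.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the group mean, in standard deviation units."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z-score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

z = z_score(raw=85, mean=70, sd=10)
print(z, t_score(z))  # 1.5 standard deviations above the mean
```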
Classical Test Theory (CTT)
also called the true score model. Describes a set of psychometric procedures that can be used to test the reliability, difficulty, and discriminatory properties of test items and scales. Addresses how measurement error affects our understanding of an individual's true ability level on a test or measure
hand scoring
Most common scoring method. Benefit: low cost, conducted by counselor or client, results obtained quickly. Drawback: error is greater
computer scoring
Benefit: results provided instantly
Drawback: the administrator must ensure that clients cannot access outside information while taking the test on a computer
optical scan scoring
Often used in academic settings where several individuals need to be tested at a single time. Can be scored on site or mailed/faxed to test publisher (fee). Benefit: quick, easy to run, human error in calculating scores is mitigated
Drawback: purchasing software and materials, staff training
Griggs v. Duke Power Company
Landmark Supreme Court decision stating that tests must fairly measure the knowledge or skills required for a job
Larry P. v. Riles
Ruled that IQ tests could not be used as the primary or sole basis for placing students in special programs
Debra P. v. Turlington
Schools must prove that they have educated students sufficiently before those students can be given a graduation test
Sharif v. New York State Department of Education
U.S. District Court ruled that the SAT could not be used as the sole criterion for awarding scholarships
Soroka v. Dayton Hudson Corp.
test questions that intrude on privacy must be directly and narrowly related to the nature of the employee's duties. Relevant to content and construct validity
