Week 4 Stats
Terms in this set (5)
Stefan is interested in estimating the impact of the hard work he put into grading problem sets on exam scores to find out whether the benefit to the students is worth the cost. (a) What would be the ideal experiment? What are some reasons why we might not be able to conduct the ideal experiment?
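The ideal experiment would randomize feedback: some students get extensive feedback and others do not, with assignment decided by chance. This may not be feasible in practice, for example because deliberately giving some students less feedback could be seen as unfair to them.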
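The logic of the randomized experiment can be illustrated with a small simulation. This is a minimal sketch, not part of the original problem set: the function name `randomized_experiment`, the baseline score of 70, the standard deviation of 10, and the true effect of 5 points are all assumed values chosen for illustration.

```python
import random
import statistics


def randomized_experiment(true_effect=5.0, n=20000, seed=1):
    """Difference in mean scores when feedback is assigned at random.

    All numbers here (baseline 70, sd 10, effect 5) are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each student has an underlying ability that we cannot observe directly.
    abilities = [rng.gauss(70.0, 10.0) for _ in range(n)]
    # Randomly assign exactly half of the students to extensive feedback.
    flags = [True] * (n // 2) + [False] * (n - n // 2)
    rng.shuffle(flags)
    treated = [a + true_effect for a, t in zip(abilities, flags) if t]
    control = [a for a, t in zip(abilities, flags) if not t]
    # Because assignment is random, ability is balanced across the two groups
    # on average, so the difference in means estimates the true effect.
    return statistics.mean(treated) - statistics.mean(control)


estimate = randomized_experiment()
print(f"Estimated effect under randomization: {estimate:.2f}")
```

Under these assumptions the estimate should land close to the assumed true effect, since random assignment makes the two groups comparable in ability on average.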
(b) Why would the following more practical experiment be less ideal? Give more extensive feedback to students in 2018 and less extensive feedback to students in 2019, and compare final exam scores across the two classes.
Students in different cohorts might have different ability levels. For example, if the admissions standards of the program are increasing over time, then the 2019 class might be stronger on average. Because the 2019 class is the one that received less feedback, its higher baseline scores would shrink the gap between the two classes, and the comparison would underestimate the impact of the feedback.
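The direction of this bias can be checked with a short simulation. This is a sketch under assumed numbers, not data from the course: the function name `cohort_comparison`, the true effect of 5 points, and the 3-point ability advantage of the 2019 cohort are hypothetical.

```python
import random
import statistics


def cohort_comparison(true_effect=5.0, ability_gap=3.0, n=20000, seed=0):
    """Naive 2018-minus-2019 difference in mean scores.

    Assumed setup: 2018 gets extensive feedback; 2019 gets less feedback
    but is stronger at baseline by `ability_gap` points.
    """
    rng = random.Random(seed)
    # 2018 cohort: baseline ability around 70, plus the feedback effect.
    scores_2018 = [rng.gauss(70.0, 10.0) + true_effect for _ in range(n)]
    # 2019 cohort: higher baseline ability, no extensive feedback.
    scores_2019 = [rng.gauss(70.0 + ability_gap, 10.0) for _ in range(n)]
    return statistics.mean(scores_2018) - statistics.mean(scores_2019)


naive = cohort_comparison()
print(f"Naive cohort estimate: {naive:.2f} (assumed true effect: 5.00)")
```

In expectation the naive estimate equals the true effect minus the ability gap, so with these assumed values it sits near 2 rather than 5.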
Suppose Stefan grades all the problem sets in one sitting, without taking any breaks, and he gets more tired as he grades more problem sets. (c) Stefan considers comparing test scores for students with last names in the first half of the alphabet to test scores for students with last names in the second half of the alphabet. He consults you for advice on this proposed evaluation strategy. Do you think this would yield plausible results?
No, because last names are not randomized. Students with names at the end of the alphabet are unlikely to provide a good counterfactual for students with names at the beginning of the alphabet.
(d) Stefan considers using a regression discontinuity design to evaluate the impact of homework feedback on test scores using last names L-M as the cutoff. He consults you for advice on this proposed evaluation strategy. Do you think this would yield plausible results?
We would only want to use an RD design if there is a cutoff in a continuous running variable that assigns units near the cutoff to treatment and control as good as randomly, so that the control group provides a good counterfactual for the treated group (i.e., tells us what the outcomes in the treatment group would have been in the absence of the treatment). This setting is not appropriate for an RD design, because the cutoff in last names does not divide our sample into two otherwise similar groups: last names are correlated with various demographic characteristics, which might also be correlated with performance.
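To see why a last-name split can mislead, consider a simulation in which feedback has no effect at all, but students on one side of the L-M cutoff differ in background. This is a hypothetical sketch: the function name `name_split_difference` and the 4-point background gap are assumptions made purely for illustration, not a claim about real students.

```python
import random
import statistics
import string


def name_split_difference(background_gap=4.0, n=20000, seed=2):
    """A-L minus M-Z difference in mean scores when feedback has zero effect.

    Assumption: students with initials A-L have a background advantage of
    `background_gap` points that has nothing to do with feedback.
    """
    rng = random.Random(seed)
    first_half, second_half = [], []
    for _ in range(n):
        initial = rng.choice(string.ascii_uppercase)
        in_first_half = initial <= "L"
        # The background advantage is tied to the name group, not to grading.
        background = background_gap if in_first_half else 0.0
        score = rng.gauss(70.0, 10.0) + background
        if in_first_half:
            first_half.append(score)
        else:
            second_half.append(score)
    return statistics.mean(first_half) - statistics.mean(second_half)


gap = name_split_difference()
print(f"Score gap across the L-M cutoff with no feedback effect: {gap:.2f}")
```

The comparison reports a gap of roughly the assumed background difference even though the feedback effect is zero by construction, which is the sense in which the two name groups are not good counterfactuals for each other.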
(e) Now suppose that Stefan grades the problem sets on two days: he is fully energetic and leaves equally detailed comments for everyone on the first day, but he has reduced energy and leaves equally less-detailed comments for everyone on the second day. He is considering the evaluation strategies proposed in (c) and (d) and consults you for advice. What would you say to him?
Grading over two days does not address the concern that last names are not randomized, which applies to the evaluation strategies in both (c) and (d).