Speculation about Standards Grain Size and Exam Performance
This year in algebra-based physics, I switched to larger-grain standards that emphasize synthesis, whereas previously I had finer-grained standards and fine-tuned assessments that targeted only one specific skill at a time. Keep in mind that my standards-based assessment system (with learning goals and reassessments) runs before a common, high-stakes exam.
Typically, on the first exam, the distribution of grades from my section would have looked like this:
This semester, my grades look like this:
The average score remained about the same, but the distribution changed a lot. I think I can speculate about why, but I don’t like what I have to say. Sure, it could be random noise, but I sort of predicted this might happen based on informal observations. That said, it could still be random noise, and having predicted it, I’m subject to confirmation bias. Anyway, here’s my tentative explanation:
With the fine-grained standards, struggling students would get repeated practice on basic skills (e.g., distance vs. position vs. displacement). Non-struggling students would get it right on the first shot and not need to reassess. This system made sure that struggling students had mastered very basic skills before the exam, but perhaps left the non-struggling students with less opportunity to hone their problem-solving skills. Because of this, my old distribution had a high floor and a relatively sparse ceiling.
This semester, with the synthesis-level assessments, we get a different picture. Struggling students make lots of mistakes on the more difficult assessments, and without targeted, focused goals to practice for reassessment, they don’t develop the basic skills they did under the old system. They may just get swamped trying to figure out how to solve complex problems. Non-struggling students don’t get it right the first time, but they get close enough to learn something, and they take up opportunities in reassessments to hone their skills. Because of this, the ceiling gets more populated, but the floor drops down.
So is my assessment system now just helping students who would have done well do great? Was my old system better at helping struggling students? I can’t be sure, but I’m thinking about it.