The instruction manual for value-added data will be ignored

2/29/2012

Am I stupid if I can't turn on my stove? The picture below (or one very similar) appears in most textbooks on human factors psychology.

The arrangement of controls is spatially incompatible with the arrangement of stove elements, so if I want to turn on the back left element, I may very well turn on the front left one.

What's notable is that this stove likely came with an instruction book, describing which knob goes with which burner.

But something about that feels wrong. It feels like the designer of the stove should have known how my mind works, and taken that into account, rather than shrugging and saying "well, it's in the manual. It's not my fault if you don't read the manual."

The stove reminds me of value-added measures of teacher effectiveness.

Even the staunchest boosters of value-added measures agree that they should not be the whole story, that there should be multiple measures of teacher effectiveness. But I'm afraid that asking people to remember that fact is a little like asking people to remember which knob goes with which burner on their stove. It's not that people can't do it, but you are swimming upstream of the mind's biases.

To be clear, I don't think that there are data to prove this contention, but let me describe why I'm guessing it's true.

We're talking about a case of missing information. You tell people: "Teacher Smith's value-added score is X. By the way, value-added scores are incomplete as a measure of teacher effectiveness."

How do people interpret information that they know to be incomplete? It varies with the situation. Sometimes they assume the missing information is positive. ("I haven't heard that the roads are closed, so I guess all's well.") Sometimes they assume missing information is negative ("He left 'prior experience' blank, so I guess he doesn't have any.")

And sometimes missing information is forgotten or discounted. My guess--and I emphasize that it's a guess--is that this will be the case here. I make this guess in part by analogy to the evaluation of college applicants.

A student's high school record has lots of "soft" components, the values of which are tricky to evaluate: participation in sports and clubs, leadership positions, recommendations from teachers... Even a student's grade point average must be evaluated in light of the difficulty of the courses taken and the competitiveness of the high school.

But then there's the SAT. It has the gloss of being numeric, and it is easy to make comparisons across students. Make no mistake, I believe that the SAT does what it's supposed to do--predict success in the freshman year of college. But it's often interpreted to be much more meaningful than that. That's the problem.

I'm afraid that value-added measures will have the same problem. They are produced via a fancy formula, they make it simple to make comparisons, and they are numeric, which can lead one to conclude that they are more precise than they really are. And at this point, we don't even have any of the other "soft" measures to round out the picture of teacher effectiveness.

I don't think value-added measures are meaningless. But handing people value-added measures with the bland warning "these are incomplete" is like giving me a stove with a bad mapping plus an instruction booklet.

The solution to the stove problem is straightforward.

The solution to teacher evaluation is not straightforward, and I won't attempt to resolve it in a blog posting.

My purpose here is simply to highlight the problem in publishing value-added data for individual teachers, with the caveat "these measures are incomplete." I predict that caveat will go unnoticed or be forgotten.

Sol Stern

2/29/2012 03:51:08 am

The situation is far worse than you suggest, Dan, and you shouldn't be so cautious about your conclusions. Campbell's Law predicts that under the pressure of value added evaluations that could result in either cash rewards or severe sanctions, teachers will change what they do in the classroom. But they will deliver a worse product for the students, not a better one. They will spend a lot of classroom time teaching kids how to game the tests, at the expense of teaching broad content knowledge.

Dina

3/4/2012 05:26:14 am

Dan, I have been kicking around a related thought: how our culture (Western? American? Post-industrial?) has evolved to prefer quantitative data over all else. Hence, even I, the most constructivist of constructivist educators in orientation, immediately focus upon the numerical components of my evals. As a psychologist, do you have any insight into why this might be?

Daniel Willingham

3/5/2012 03:23:10 am

@Sol: I didn't intend for this blog posting to be my final word on VAM. I've written elsewhere about what I think the incentive structure will be, and how it will likely be destructive. This post was meant to be on this one narrow topic.
@Dina: Dunno... it feels to me to be part of the faith we have in science. There is, of course, another, opposing strain of thought that remains strong in US culture--the Romantic impulse--which disdains the measurable. I hear this stance in education sometimes when people argue that it's *impossible* to measure teaching and learning, or to characterize what effective teachers do. As a scientist I think it helps to be as clear as one can about what science can and can't do. I take up a lot of this stuff in my next book :)
