The cover story of the latest New Republic wonders whether American educators have fallen in blind love with self-control. Author Elizabeth Weil thinks we have. Titled “American Schools Are Failing Nonconformist Kids: In Defense of the Wild Child,” the article suggests that educators harping on self-regulation are really trying to turn kids into submissive little robots. And they do so because little robots are easier to control in the classroom.
But lazy teachers are not the only cause. Education policy makers are also to blame, according to Weil. She writes that “valorizing self-regulation shifts the focus away from an impersonal, overtaxed, and underfunded school system and places the burden for overcoming those shortcomings on its students.”
And the consequence of educators’ selfishness? Weil tells stories that amount to Self-Regulation Gone Wild. A boy has trouble sitting cross-legged in class—the teacher opines he should be tested because something must be wrong with him. During story time the author’s daughter doesn’t like to sit still or to raise her hand when she wants to speak. The teacher suggests occupational therapy.
I can see why Weil and her husband were angry when their daughter’s teacher suggested occupational therapy simply because the child’s behavior was an inconvenience to him. But I don’t take that to mean that there is necessarily a widespread problem in the psyche of American teachers. I take it to mean that their daughter’s teacher was acting like a selfish bastard.
The problem with stories, of course, is that there are stories to support nearly anything. For every story a parent could tell about a teacher diagnosing typical behavior as a problem, a teacher could tell a story about a child who really could do with some therapeutic help, and whose parents were oblivious to that fact.
What about evidence beyond stories?
Weil cites a study by Duncan et al (2007) that analyzed six large data sets and found that social-emotional skills were poor predictors of later success.
She also points out that creativity among American school kids dropped between 1984 and 2008 (as measured by the Torrance Test of Creative Thinking) and she notes “Not coincidentally, that decrease happened as schools were becoming obsessed with self-regulation.”
There is a problem here. Weil uses different terms interchangeably: self-regulation, grit, social-emotional skills. They are not the same thing. Self-regulation (most simply put) is the ability to hold back an impulse when you think the impulse will not serve your other interests. (The marshmallow study would fit here.) Grit refers to dedication to a long-term goal, one that might take years to achieve, like winning a spelling bee or learning to play the piano proficiently. Hence, you can have lots of self-regulation but not be very gritty. Social-emotional skills might have self-regulation as a component, but the term refers to a broader complex of skills for interacting with others.
These are not niggling academic distinctions. Weil is right that some research indicates a link between socioemotional skills and desirable outcomes, and some doesn’t. But there is quite a lot of research showing associations between self-control and positive outcomes for kids, including academic outcomes, getting along with peers, parents, and teachers, and the avoidance of bad teen outcomes (early unwanted pregnancy, problems with drugs and alcohol, et al.). I reviewed those studies here. There is another literature showing associations of grit with positive outcomes (e.g., Duckworth et al, 2007).
Of course, those positive outcomes may carry a cost. We may be getting better test scores (and fewer drug and alcohol problems) but losing kids’ personalities. Weil calls on the reader’s schema of a “wild child,” that is, an irrepressible imp who may sometimes be exasperating, but whose very lack of self-regulation is the source of her creativity and personality.
But irrepressibility and exuberance are not perfectly inversely correlated with self-regulation. The purpose of self-regulation is not to lose your exuberance. It’s to recognize that sometimes it’s not in your own best interests to be exuberant. It’s adorable when your six-year-old is at a family picnic and impulsively practices her pas de chat because she cannot resist the Call of the Dance. It’s less adorable when it happens in class while everyone else is trying to listen to a story.
So there’s a case to be made that American society is going too far in emphasizing self-regulation. But the way to make it is not to suggest that the natural consequence of this emphasis is the crushing of children’s spirits, as though self-regulation were the same thing as no exuberance. The way to make the case is to show us that we’re overdoing self-regulation: that kids feel burdened, anxious, and worried about their behavior.
Weil doesn’t have data that would bear on this point. I don’t either. But my perspective definitely differs from hers. When I visit classrooms or wander the aisles of Target, I do not feel that American kids are over-burdened by self-regulation.
As for the decline in creativity between 1984 and 2008 being linked to an increased focus on self-regulation…I have to disagree with Weil’s suggestion that it’s not a coincidence (setting aside the adequacy of the creativity measure). I think it might very well be a coincidence. Note that scores on the mathematics portion of the long-term NAEP increased during the same period. Why not suggest that kids’ improvement in a rigid, formulaic understanding of math inhibited their creativity?
Can we talk about important education issues without hyperbole?
Duckworth, A. L., Peterson, C., Matthews, M. D., & Kelly, D. R. (2007). Grit: Perseverance and passion for long-term goals. Journal of Personality and Social Psychology, 92(6), 1087.

Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P., ... & Japel, C. (2007). School readiness and later achievement. Developmental Psychology, 43(6), 1428.
Amanda Ripley's new book, The Smartest Kids in the World: And How They Got That Way, has garnered positive reviews in the Economist, the New York Times, USA Today, the Daily Beast, and US News and World Report. Is it really that good? It's pretty darn good.

As the subtitle promises, Ripley sets out to tell the education success stories of three countries: Finland and South Korea (whose 15-year-olds score very high on the PISA test) and Poland (offered as an example of a country in transition, one making significant progress). What's Ripley's answer to the subtitle? They got that way by engaging, from an early age, in rigorous work that poses significant cognitive challenge.
In other words, the open secret is the curriculum.

Along the way to this conclusion, she dispenses with various explanations for US kids' mediocre performance on the science and math portions of PISA. I've made these arguments myself, so naturally I found them persuasive:

- Poverty is higher in the US. Not compared to Poland. And other countries with low poverty (e.g., Norway) don't end up with well-educated kids. The relevant statistic is how much worse poor kids do relative to rich kids within a country, and the US fares poorly on that statistic.
- The US doesn't spend enough money on education. Actually, we outspend nearly everyone. But because of local funding we perversely shower money on schools attended by the wealthy and spend less on the schools attended by poor kids.
- The US has lots of immigrants and they score low. Other countries do a better job of educating kids who do not speak the native language.
- The kids in other countries who take PISA are the elite. Arguably true in Shanghai, but not in Korea or Finland, both of which boast higher graduation rates than the US.
- Why should we compare our kids to those of foreign countries? It's not a race. Because those other kids are showing what we could offer our own children, and are not.

What is the explanation? According to Ripley, there is a primary postulate running through the psyche of South Koreans, Finns, and Poles when it comes to education: an expectation that the work will be hard. Everything else is secondary. So anything that gets in the way, anything that compromises the work, will be downplayed or eliminated. Sports, for example. Kids do that on their own time, and it's not part of school culture.
Several consequences follow from this laser-like focus on academic rigor. For example, if schoolwork is challenging, kids are going to fail frequently. So failure is necessarily seen as a normal part of the learning process, and as an opportunity for learning, not a cause of shame.
If the academic work is going to be difficult, teachers will necessarily have to be very carefully selected and well trained. And you'll do whatever is necessary to make that happen, even if it means, as in Finland, offering teachers significant financial support during their training.
So what is the primary postulate of American education?
Ripley doesn't say, and I'm not sure Americans are sufficiently unified to name one. But two assumptions strike me as candidates.
First, that learning is natural, natural meaning that a propensity to learn is innate, instinctive and therefore inevitable. That, in turn, means that it should be easy. This assumption is pretty much the opposite of the one Ripley assigns to South Korea, Finland, and Poland.
Many Americans seem to think that it's not normal for schoolwork to be challenging enough that it takes persistence. In fact, if you have to try much harder than other kids, in our system you're a good candidate for a diagnosis and an IEP.
This expectation that things should be easy may explain our credulity for educational gimmicks, for that's what gimmicks do: they promise to make learning easy for everyone. Can't learn math? It's because your learning style hasn't been identified. Trouble with Spanish? This new app will make it fun and effortless.
The second assumption I often see is that "rigor" and "misery" are synonyms. Rigor means that you will be challenged. It means you may not succeed quickly. It means your cognitive resources will be stretched. It doesn't mean you are being punished, nor that you will be unhappy.
At the same time, I can't agree with the "play is all you need" crowd. Play can be cognitively enriching, but that doesn't mean that all play is cognitively enriching.
It's easy to create schoolwork that's rigorous but a grind, likely to make kids hate school. Ripley offers South Korea as an example. Children there are miserable, adults hate the system, and despite kids' excellent test scores, everyone sees the Korean system as dysfunctional.
It's much tougher to educate kids in a way that is challenging but engaging. That's Finland, according to Ripley. And she's here to remind us that most of what has been pointed to as responsible for the Finnish miracle is not. What's responsible is the rigor of the work kids have been asked to do.
Will Americans embrace this idea, and demand that our education system challenge our kids? Will they embrace it to the point that they will follow this primary postulate whither it may lead?
I think Ripley's right to suggest that it's essential. I think the odds that Americans will follow through are remote.
It is known that quality preschool can improve academic outcomes for kids (Barnett, 2011). Getting a handle on measuring “quality” is challenging, but if states and the federal government are to support preschool, such measurement is vital.
More than half of US states have adopted Quality Rating and Improvement Systems (QRISs) in an attempt to quantify preschool quality at the level of individual programs.
States adopt different measures to go into the QRIS, but they are uniform in that they use input measures, not child outcome measures. A study out in Science this week (Sabol et al, 2013) sought to evaluate whether QRISs work: do they identify quality preschools?
The study used two national data sets from the early 2000s to test whether the types of metrics most often included in QRISs are related to schooling outcomes. In effect, the researchers asked: “if you use the kinds of measures QRISs use to evaluate preschools and combine them as the QRISs do, are you probably measuring good learning outcomes for kids?”
The four program characteristics were (1) qualifications and experience of teachers; (2) teacher-student ratio; (3) family partnerships; and (4) how conducive the environment is to learning (as measured by the Early Childhood Environment Rating Scale, which evaluates both the physical classroom and interactions among teachers, students, and parents).
The outcome measures included math, prereading, language, and social skills. These were measured at the end of the year, controlling for beginning-of-year score and for child and family characteristics.
The figure shows the difference between preschools scoring high and low on each of the four metrics. None of the four differentiates preschools that produce good outcomes for kids from those that don’t.
Next, in separate analyses for each state, the researchers aggregated the four qualities, combining and weighting the metrics as each state does. That didn’t make much difference.
Why are QRISs such a flop? I suspect that the problem is not just that they are “input” measures. The problem is that most are quite distal from where the action takes place—the classroom. Measuring things like years of experience and parental partnership is inexpensive and easy, and that’s nice. Someone at the school can submit this sort of data online.
Classroom measures, in contrast, are expensive. Someone with training has to actually observe what’s going on. That's part of what goes into the "environment" measure, and it does look like that measure showed the most promise.
And indeed, a measure wholly focused on classroom interaction does much better. Sabol et al conducted another analysis using the Classroom Assessment Scoring System (CLASS), the brainchild of the study’s third author, Bob Pianta, which evaluates interactions between teacher and child. As you can see, the CLASS does quite well. (It’s labeled “interactions” in this graph.)
The phrase “high quality preschool” has been repeated to the point that it’s become almost meaningless. It needn’t be, and it’s not hopeless to characterize a high-quality classroom. Further, we don’t have to test every child to spot one. Sabol et al (2013) show that qualities of teachers (e.g., experience) and programs (e.g., parental involvement) may not mean much, but the quality of teacher-student interactions might be what we need.
Barnett, W. S. (2011). Effectiveness of early educational intervention. Science, 333, 975-978.
Sabol, T. J., Hong, S. L. S., Pianta, R. C., & Burchinal, M. R. (2013). Can rating pre-k programs predict children's learning? Science, 341, 845-846.
Spring, 2013 at the Harvard Initiative for Learning and Teaching.
There are some studies in psychology where you pretty much know what the results will be before you collect the data. But you gotta do 'em to be sure you're right.

One example is a recent study (Sana et al, 2013) on the effects of laptop multitasking on classroom learning. (Thanks to Twitter user @rboulle for tipping me off to this study.)
The authors had college-aged subjects come into a laboratory to listen to a 45-minute lecture on meteorology, meant to simulate the sort of experience they would have in a college classroom. Half of the subjects were given a list of secondary tasks to perform, meant to represent the sort of thing a bored student might investigate during a part of the lecture that seemed slow. For example, one question was "What is on Channel 3 tonight at 10 p.m.?" All the questions were designed to be answerable with a simple search using websites that virtually all students are familiar with (Google, YouTube, et al.).
The number of questions--twelve--seemed pretty high to me. The authors said that pilot testing indicated students could answer all twelve in about 15 minutes. Thus, students would be multitasking for one third of the 45-minute lecture. The researchers argued that other data indicate this percentage is not unreasonable, although it makes me want to cry.

A forty-item comprehension test administered 20 minutes after the lecture showed a cost to multitasking.
Experiment 2 examined what happens when you are not multitasking yourself, but someone near you is doing so. Again, you kind of know what's going to happen. Motion in your peripheral vision is distracting, a phenomenon that web page designers have capitalized on for years, much to our annoyance.
And sure enough, a peer multitasking in your view is distracting.
There is a fundamental tension here, and I don't know how to resolve it. On the one hand, I like it when students have their laptops in class. Many of them are more comfortable taking notes this way than longhand. In the middle of a lecture I might ask someone to look something up that I don't know off the top of my head.
On the other hand, the potential for distraction is terrible. I've walked into the back of many of my colleagues' classrooms and seen that perhaps 50% of the students are on the Web.
Students think that they can snap their attention back to class "when it gets interesting again." I don't have much confidence that they can. Students' judgments of their own learning are often not well calibrated, and that seems to be especially true of multitasking: they think it's cost free.
Tellingly, the researchers asked subjects in Experiment 2 to rate whether they were distracted by other people multitasking and whether other people's multitasking affected their own (the observers') learning. The average answers? It was "somewhat distracting," and it "barely" hindered their learning.
What can be done?
Some educators simply ban laptops. Some banish laptop users to the back rows. I don't like either of these solutions much, because they impose a penalty on anyone who wants to use a laptop.
I asked our IT group if the Wifi could be turned on and off in my classroom. Nope.
Some argue that students are learning how to manage distraction, although there's not much evidence that students are learning this lesson. Certainly, I don't know of anyone actively teaching them this lesson.
Got ideas? I'd love to hear them.
How should textbooks be designed? A new paper by Jennifer Kaminski and Vladimir Sloutsky shows that there can be real subtlety in the answer.

The researchers examined early elementary materials meant to teach kids how to read graphs. They were specifically interested in comparing boring, monochromatic, abstract bar graphs with colorful, fun graphs that use a graphic. (Please excuse the black & white reproduction.)
We all know that textbook publishers are eager to make books more visually appealing. And in this case, what's the harm? The graph with the objects seems like a natural scaffold to learn the concept.
Kaminski & Sloutsky found that some children shown the graph with embedded objects adopted a counting strategy to read the graph, even when they were taught to focus on bar height and the axis. The authors surmise that the counting routine is so well learned that when a child is presented with a vivid graphic with salient objects to count, it's simply very easy to go down that mental path. And of course the child does read the graph correctly.
The problem is not just that the child hasn't learned a good strategy for reading graphs, or is distracted--the child has learned a bad strategy. So when kids who adopted the counting strategy see graphs like this . . .
. . . some of them count the stripes or count the dots to "read" the graph.
The effect fades as kids get older--first graders are better than kindergartners at ignoring extraneous information when reading graphs.
On the one hand you could see this as small potatoes--kids will get over it, they will learn how to read graphs. But on the other hand, why knowingly put a stumbling block in front of kids trying to learn math? And more important, how many other small stumbling blocks are there that we don't know about?
A blog posting over at Schools Matter @ The Chalk Face has gathered a lot of interest: 78 comments, many of them outraged. The New York State Education Dept. has a website that is meant to help teachers prepare for the Common Core Standards.
Author Chris Cerrone posted a bit of a 1st grade curriculum module on early civilizations. Here it is:
Cerrone asked primary grade educators to weigh in: "what do you think of the vocabulary contained in this unit of study?"
The responses in the 78 comments were nearly uniformly negative. As you might expect from that volume of commentary, the criticisms were wide-ranging, with much of the criticism directed more generally at standardized testing and at the idea of the CCSS themselves.
But a lot of the commentary concerned cognitive development, and I want to focus there. This comment was typical.
Photo from milwaukee-montessori.org
There is an important idea at the heart of this criticism: developmental stages. This commenter specifically invokes Piaget, but you don't have to be a Piagetian to think that stages are a good way to think about children's thinking. Stage theories hold that children's thinking is relatively stable, but then undergoes a big shift in a relatively brief time (say, a few months) whereupon it stabilizes again.
So lessons would be developmentally inappropriate if they demanded a type of thinking that the child was simply incapable of, given his developmental stage.
I have argued in some detail that stage theories have two major problems: first, data from the last twenty years or so make development look continuous, rather than occurring in discrete stages. Second, children's cognition is fairly variable day to day, even when the same child tries the same task. I have argued elsewhere that taking a psychological finding and using it to draw strong conclusions about instruction--including what children are, in principle, ready for--is fraught with problems. How much more is that true when using a psychological theory rather than an experimental finding.

So if Piaget will not be our guide as to what 1st graders are ready for, what should be? The experience of early elementary educators, of course, and some of the people commenting on the blog posting are or were first grade teachers.
And almost unanimously, they thought this material was inappropriate for first graders. (Some thought kids shouldn't be learning about other religions at this age. No argument there; that's a matter of one's values. I'm only talking about what kids can cognitively handle.) But if we adopt a proof-of-the-pudding-is-in-the-eating criterion, lessons on ancient civilizations are fine, because they are in use and children are learning. The material shown above is part of the Core Knowledge sequence, around for more than a decade and used by over a thousand schools. (NB: I'm on the Board of the Core Knowledge Foundation.)

And Core Knowledge is not alone. Another curriculum has had first graders learn about ancient civilizations not for a decade, but for about a century: Montessori. (NB again: my children experienced these lessons at their school, and my wife teaches them--she's an early elementary Montessori teacher.) Montessori schools teach the same "Five Great Lessons" at the beginning of first, second, and third grades. They are:
- The history of the universe and earth
- The coming of life
- The origins of human beings
- The history of signs and writing
- The story of numbers and mathematics
Naturally, these lessons are presented in ways that make sense to young children, but they are far from devoid of content. Montessori educators see them as the foundation and the wellspring of interest for everything to come: biology, geology, mathematics, reading, writing, chemistry and so on.
If it seems impossible or highly unlikely to you that 6 year olds could really get anything out of such lessons, I'll ask you to consider this. Our understanding of any new concept is always incomplete.
For example, how do children learn that some people they hear about (Peter Pan) are made up and never lived, whereas others (the Pharaohs) were real? Not by an inevitable process of neurological maturation that makes their brain "ready" for this information, whereupon they master it quickly. They learn it bit by bit, in fits and starts, sometimes seeming to get it, other times not.
And you can't always wait until children are "ready." Think about mathematics. Children are born understanding numerosity, but they understand it on a logarithmic scale--the difference between five and ten is larger than the difference between 70 and 75. To understand elementary mathematics they must learn to think of numbers on a linear scale. In this case, teachers have to undo Nature. And if you wait until the child is "developmentally ready" to understand numbers this way, you'll never teach them mathematics. It will never happen.
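The size of that log-vs-linear mismatch is easy to check with a line of arithmetic; this snippet is just my illustration of the point, not anything from the research:

```python
import math

# The same two gaps, on a linear and a logarithmic scale.
# Linearly, 5 -> 10 and 70 -> 75 are both a gap of 5.
# Logarithmically, 5 -> 10 is a doubling (log 2 ~ 0.693),
# while 70 -> 75 is only about a 7% step (log 15/14 ~ 0.069).
for a, b in [(5, 10), (70, 75)]:
    print(f"{a} -> {b}: linear gap = {b - a}, log gap = {math.log(b / a):.3f}")
```

On the logarithmic scale a child is born with, the first gap is roughly ten times the second, even though both gaps "should" feel identical on the linear scale that arithmetic requires.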
In sum, I don't think developmental psychology is a good guide to what children should learn; it provides some help in thinking about how children learn. The best guide to "what" is what children know now, and where you want their learning to head.
I read a lot of blogs. I only comment when I think I have something to add (which is rare, even on my own blog) but I read a lot of them.
Today, I offer a plea and a suggestion for making education blogs less boring, specifically on the subject of standardized testing.
I begin with two propositions about human behavior:
- Proposition 1: If you provide incentives for X, people are more likely to do what they think will help them get X. They may even attempt to get X through means that are counterproductive.
- Proposition 2: If we use procedure Z to change Y in order to make it more like Y’, we need to measure Y in order to know whether procedure Z is working. We have to be able to differentiate Y and Y’.
A lot of blog posts on the subject of testing are boring because authors pretend that one of these propositions is false or irrelevant.
On Proposition 1: Standardized tests typically gain validity by showing that scores are associated with some outcome you care about. You seldom care about the items on the test specifically. You care about what they signify. Sometimes tests have face validity, meaning test items look like they test what they are meant to test--a purported history test asks questions about history, for example. Often they don't, but the test is still valid. A well-constructed vocabulary test can give you a pretty good idea of someone's IQ, for example.
Just as body temperature is a reliable, partial indicator of certain types of disease, a test score is a reliable, partial indicator of certain types of school outcomes. But in most circumstances your primary goal is not a normal body temperature; it’s that the body is healthy, in which case body temperature will be normal as a natural consequence of the healthy state.
Bloggers ignoring basic propositions about human behavior? What's up with that?
If you attach stakes to the outcome, you can't be surprised if some people start treating the test score as the goal itself rather than as an indicator. They focus on getting body temperature to 98.6, whatever the health of the patient. That's Proposition 1 at work. If a school board lets an administrator know that test scores had better go up or she can start looking for another job . . . well, what would you do in those circumstances? So you get test-prep frenzy. These are social consequences of tests, as typically used.
On Proposition 2: Some form of assessment is necessary. Without it, you have no idea how things are going. You won’t find many defenders of No Child Left Behind, but one thing we should remember is that the required testing did expose a number of schools—mostly ones serving disadvantaged children—where students were performing very poorly. And assessments have to be meaningful, i.e., reliable and valid. Portfolio assessments, for example, sound nice, but there are terrible problems with reliability and validity. It’s very difficult to get them to do what they are meant to do.
So here’s my plea. Admit that both Proposition 1 and Proposition 2 are true, and apply to testing children in schools.
People who are angry about the unintended social consequences of standardized testing have a legitimate point. They are not all apologists for lazy teachers or advocates of the status quo. Calling for high-stakes testing while taking no account of these social consequences, offering no solution to the problem . . . that's boring.
People who insist on standardized assessments have a legitimate point. They are not all corporate stooges and teacher-haters. Deriding “bubble sheet” testing while offering no viable alternative method of assessment . . . that's boring.
Naturally, the real goal is not to entertain me with more interesting blog posts. The goal is to move the conversation forward. The landscape will likely change consequentially in the next two years. This is the time to have substantive conversations.
Part of the fun and ongoing fascination of science is "the effect that ought not to work, yet does." The impact of values affirmation on academic performance is such an effect.
Values affirmation "undoes" the effect of stereotype threat (also called identity threat). Stereotype threat occurs when a person is concerned about confirming a negative stereotype about his or her group. In other words, a boy is so consumed with thinking "Everyone expects me to do poorly on this test because I'm African-American" that his performance actually is compromised (see Walton & Spencer, 2009 for a review).

One way to combat stereotype threat is to give the student better resources to deal with the threat--make the student feel more confident, more able to control the things that matter in his or her life. That's where values affirmation comes in. In this procedure, students are provided a list of values (e.g., relationships with family members, being good at art) and are asked to pick the three that are most important to them and to write about why they are so important. In the control condition, students pick three values they imagine might be important to someone else. Randomized controlled trials show that this brief intervention boosts school grades (e.g., Cohen et al, 2006).

Why? One theory is that values affirmation gives students a greater sense of belonging, of being more connected to other people. (The importance of social connection is an emerging theme in other research areas. For example, you may have heard about the studies showing that people are less anxious when anticipating a painful electric shock if they are holding the hand of a friend or loved one.)

A new study (Shnabel et al, 2013) directly tested the idea that writing about social belonging might be the vital element that makes values affirmation work. In Experiment 1, the researchers tested 169 Black and 186 White seventh graders in a correlational study. The students did the values-affirmation writing exercise, as described above. The dependent measure was change in GPA (pre-intervention vs. post-intervention). The experimenters found that writing about social belonging in the writing assignment was associated with a greater increase in GPA for Black students (but not for White students, indicating that the effect is due to a reduction in stereotype threat).

In Experiment 2, they used an experimental design, testing 62 male and 55 female college undergraduates on a standardized math test. Some were specifically told to write about social belonging and others were given the standard affirmation writing instructions. Female students in the former group outscored those in the latter group. (And there was no effect for male students.)
The brevity of the intervention relative to the apparent duration of the effect still surprises me. But this new study gives some insight into why the intervention works in the first place.

References:
Cohen, G. L., Garcia, J., Apfel, N., & Master, A. (2006). Reducing the racial achievement gap: A social-psychological intervention. Science, 313, 1307-1310.
Shnabel, N., Purdie-Vaughns, V., Cook, J. E., Garcia, J., & Cohen, G. L. (2013). Demystifying values-affirmation interventions: Writing about social belonging is a key to buffering against identity threat. Personality and Social Psychology Bulletin,
Walton, G. M., & Spencer, S. J. (2009). Latent ability: Grades and test
scores systematically underestimate the intellectual ability of negatively stereotyped students. Psychological Science, 20, 1132-1139.
One of the great intellectual pleasures is to hear an idea that not only seems right, but that strikes you as so terribly obvious (now that you've heard it) you're in disbelief that no one has ever made the point before.

I tasted that pleasure this week, courtesy of a paper by Walter Boot and colleagues (2013). The paper concerned the adequacy of control groups in intervention studies--interventions like (but not limited to) "brain games" meant to improve cognition, and the playing of video games, thought to improve certain aspects of perception and attention.
To appreciate the point made in this paper, consider what a control group is supposed to be and do. It is supposed to be a group of subjects as similar to the experimental group as possible, except for the critical variable under study.
The performance of the control group is to be compared to the performance of the experimental group, which should allow an assessment of the impact of the critical variable on the outcome measure.
Now consider video gaming or brain training. Subjects in an experiment might very well guess the suspected relationship between the critical variable and the outcome. They have an expectation as to what is likely to happen. If they do, then there might be a placebo effect--people perform better on the outcome test simply because they expect that the training will help, just as some people feel less pain when given a placebo that they believe is an analgesic.
The standard way to deal with that problem is to use an "active control." That means that the control group doesn't do nothing--they do something, but it's something that the experimenter does not believe will affect the outcome variable. So in some experiments testing the impact of action video games on attention and perception, the active control plays slow-paced video games like Tetris or Sims.
The purpose of the active control is to make expectations equivalent in the two groups. Boot et al.'s simple and valid point is that it probably doesn't do that: people don't believe playing Sims will improve attention.
The experimenters gathered some data on this point. They had subjects watch a brief video demonstrating what an action video game was like or what the active control game was like. Then they showed them videos of the measures of attention and perception that are often used in these experiments. And they asked subjects "if you played the video game a lot, do you think it would influence how well you would do on those other tasks?"
And sure enough, people think that action video games will help on measures of attention and perception. Importantly, they don't think that they would have an impact on a measure like story recall. And subjects who saw the game Tetris were less likely to think it would help the perception measures, but were more likely to say it would help with mental rotation.
In other words, subjects see the underlying similarities between games and the outcome measures, and they figure that higher similarity between them means a greater likelihood of transfer.
As the authors note, this problem is not limited to the video gaming literature; the need for an active control that deals with subject expectations also applies to the brain training literature.
More broadly, it applies to studies of classroom interventions. Many of these studies don't use active controls at all. The control is business-as-usual.
In that case, I suspect you have double the problem. You not only have the placebo effect affecting students, you also have one set of teachers asked to do something new, and another set teaching as they typically do. It seems at least plausible that the former will be extra reflective on their practice--they would almost have to be--and that alone might lead to improved student performance.
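A toy calculation makes the worry concrete. The numbers below are invented; the point is only that, measured against a business-as-usual control, an expectation-driven boost is indistinguishable from a real treatment effect, whereas an active control with matched expectations would cancel it out.

```python
# Toy illustration (all numbers invented). The intervention has ZERO true
# effect; subjects who expect it to work get a small placebo boost.
BASELINE = 50.0       # average outcome score with no intervention
TRUE_EFFECT = 0.0     # the intervention itself does nothing
PLACEBO_BOOST = 3.0   # gain driven purely by expecting the training to help

treated_mean = BASELINE + TRUE_EFFECT + PLACEBO_BOOST
passive_control_mean = BASELINE                  # business-as-usual: no expectation
active_control_mean = BASELINE + PLACEBO_BOOST   # expectation successfully matched

apparent_effect_passive = treated_mean - passive_control_mean
apparent_effect_active = treated_mean - active_control_mean

print(apparent_effect_passive)  # 3.0 -- spurious "effect" of a useless intervention
print(apparent_effect_active)   # 0.0 -- matched expectations cancel the placebo
```

The whole of Boot et al.'s argument is that the second line is only valid if the active control really does equate expectations--and their survey data suggest it often doesn't.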
It's hard to say how big these placebo effects might be, but this is something to watch for when you read research in the future.
Boot, W. R., Simons, D. J., Stothart, C., & Stutts, C. (2013). The pervasive problems with placebos in psychology: Why active control groups are not sufficient to rule out placebo effects. Perspectives on Psychological Science, 8, 445-454.