Ben Goldacre is a British physician and academic, and is the author of Bad Science, an exposé of bad medical practice that is based on wrong-headed science. For the last decade he has written a terrific column by the same name for the Guardian.

Goldacre has recently turned his critical scientific eye to educational practices in Britain. He was asked by the British Department for Education to comment on the use of scientific data in education and on the current state of affairs in Britain. You can download the report here.

So what does Goldacre say?

He offers an analogy of education to medicine; the former can benefit from the application of scientific methods, just as the latter has.

Goldacre touts the potential of randomised controlled trials (RCTs). You take a group of students, randomly assign them to two groups, and administer an intervention (a new instructional method for long division, say) to one group and not to the other. Then you see how each group of students did.
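The logic of an RCT can be sketched in a few lines of simulation. This is only an illustration, not anything from Goldacre's report: the student scores, the size of the "true effect," and the function name `run_trial` are all made-up assumptions. The point it shows is that random assignment lets the difference in group averages stand in for the effect of the intervention.

```python
import random
import statistics

def run_trial(students, true_effect, rng):
    """Randomly split students into two groups and compare mean outcomes."""
    shuffled = students[:]
    rng.shuffle(shuffled)  # random assignment is the heart of an RCT
    half = len(shuffled) // 2
    treated, control = shuffled[:half], shuffled[half:]
    # Hypothetical outcome: prior skill plus measurement noise,
    # plus the intervention's effect for the treated group only.
    treated_scores = [s + true_effect + rng.gauss(0, 5) for s in treated]
    control_scores = [s + rng.gauss(0, 5) for s in control]
    return statistics.mean(treated_scores) - statistics.mean(control_scores)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
students = [rng.gauss(50, 10) for _ in range(10000)]  # simulated prior skill
estimated = run_trial(students, true_effect=3.0, rng=rng)
print(f"estimated effect: {estimated:.2f} (simulated true effect: 3.0)")
```

Because assignment is random, prior skill is balanced across the two groups on average, so the estimate should land near the simulated effect.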

Goldacre also speculates on what institutions would need to do to make the British education system as a whole more research-minded. He names two significant changes:
  • There would need to be an institution that communicates the findings of scientific research (similar to the American "What Works Clearinghouse").
  • British teachers would need a better appreciation for scientific research so that they would understand why a particular practice was touted as superior, and could evaluate the evidence for the claim themselves.
I'm a booster of science in education
As someone who has written shorter and book-length treatments of the role that scientific research might play in education, I'm very excited that Goldacre has made this thoughtful and spirited contribution.

I offer no criticisms of what Goldacre suggests, but would like to add three points.

First, I agree with Goldacre that randomized trials allow the strongest conclusions. But I don't think that we should emphasize RCTs to the exclusion of all other sources of data. After all, if we continue with Goldacre's analogy to medicine, I think he would agree that epidemiology has proven useful.

As a matter of tactics, note that the What Works Clearinghouse emphasized RCTs to the near exclusion of all other types of evidence, and that came to be seen as a problem. If you exclude other types of studies, the available data will likely be thin. RCTs are simply hard to pull off: they are expensive, and they require permission from lots of people. Hence, the What Works Clearinghouse ended up being agnostic about many interventions--"no randomized controlled trials yet." Its impact has been minimal.

Other sources of data can be useful: smaller-scale studies and, especially, basic scientific work that bears on the underpinnings of an intervention.

We must also remember that each RCT--strictly interpreted--offers pretty narrow information: method A is better than method B (for these kids, as implemented by these teachers, etc.) Allowing other sources of data in the picture potentially offers a richer interpretation.

As a simple example, shouldn't laboratory studies showing the importance of phonemic awareness influence our interpretation of RCTs in preschool interventions that teach phonemic awareness skills?

Second, basic scientific knowledge gleaned from cognitive and developmental psychology (and other fields) can not only help us to interpret the results of randomized trials; that knowledge can also be useful to teachers on its own. Just as a physician uses her knowledge of human physiology to diagnose a case, a teacher can use her knowledge of cognition to "diagnose" how to best teach a particular concept to a particular child.

I don't know about Britain, but this information is not taught in most American schools of Education. I wrote a book about cognitive principles that might apply to education. The most common remark I hear from teachers is surprise (and often, anger) that they were not taught these principles when they trained.

Elsewhere I've suggested we need not just a "what works" clearinghouse to evaluate interventions, but a "what's known" clearinghouse for basic scientific knowledge that might apply to education.

Third, I'm uneasy about the medicine analogy. It too easily leads to the perception that science aims to prescribe what teachers must do, that science will identify one set of "best practices" which all must follow. Goldacre makes clear on the very first page of the report that's NOT what he's suggesting, but the non-doctors among us see medicine this way: I go to my doctor, she diagnoses what's wrong, and there is a standard way (established by scientific method) to treat the disease.

That perception may be in error, but I think it's common.

I've suggested a different analogy: architecture. When building a house an architect must respect certain basic facts set out by science. Physics and materials science will loom large for the architect; for educators it might be psychology, sociology et al. The rules represent limiting conditions, but so long as you stay within those boundaries there are lots of ways to get it right. Just as physics doesn't tell the architect what the house must look like, so too cognitive psychology doesn't tell teachers how they must teach.

RCTs play a different role. They provide proof that a standard solution to a common problem is useful. For example, architects routinely face the problem of ensuring that a wall doesn't collapse when a large window is placed in it, and there are standard solutions to this problem. Likewise, educators face common problems, and RCTs hold the promise of providing proven solutions. Just as the architect doesn't have to use any of the standard methods, the teacher needn't use a method proven by an RCT. But the architect needs to be sure that the wall stays up, and the teacher needs to be sure that the child learns.

I made one of my garage-band-quality videos on this topic.

There's more to this topic--what it will mean to train teachers to evaluate scientific evidence, the role of schools of education. Indeed, there's more in Goldacre's report and I urge you to read it. Longer term, I urge you to consider why we wouldn't want better use of science in educational practice.

 
 
An experiment is a question which science poses to Nature, and a measurement is the recording of nature’s answer. --Max Planck

You can't do science without measurement. That blunt fact might give pause when people emphasize non-cognitive factors in student success and in efforts to boost student success.

"Non-cognitive factors" is a misleading but entrenched catch-all term for factors such as motivation, grit, self-regulation, social skills. . .  in short, mental constructs that we think contribute to student success, but that don't contribute directly to the sorts of academic outcomes we measure, in the way that, say, vocabulary or working memory do.
Non-cognitive factors have become hip. (Honestly, if I hear about the Marshmallow Study just one more time, I'm going to become seriously dysregulated.) And there are plenty of data to show that researchers are on to something important. But are they on to anything that educators are likely to be able to use in the next few years? Or are we going to be defeated by the measurement problem?

There is a problem, there's little doubt. A term like "self-regulation" is used in different senses: the ability to maintain attention in the face of distraction, the inhibition of learned or automatic responses, or the quelling of emotional responses. The relation among them is not clear.
Further, these might be measured by self-ratings, teacher ratings, or various behavioral tasks.

But surprisingly enough, different measures do correlate, indicating that there is a shared core construct (Sitzmann & Ely, 2011). And Angela Duckworth (Duckworth & Quinn, 2009) has made headway in developing a standard measure of grit (distinguished from self-control by its emphasis on the pursuit of a long-term goal).

So the measurement problem in non-cognitive factors shouldn't be overstated. We're not at ground-zero on the problem. At the same time, we're far from agreed-upon measures. Just how big a problem is that?

It depends on what you want to do.

If you want to do science, it's not a problem at all. It's the normal situation. That may seem odd: how can we study self-regulation if we don't have a clear idea of what it is? Crisp definitions of constructs and taxonomies of how they relate are not prerequisites for doing science. They are the outcome of doing science. We fumble along with provisional definitions and refine them as we go along.

The problem of measurement seems more troubling for education interventions.

Suppose I'm trying to improve student achievement by increasing students' resilience in the face of failure. My intervention is to have preschool teachers model a resilient attitude toward failure and to talk about failure as a learning experience. Don't I need to be able to measure student resilience in order to evaluate whether my intervention works?

Ideally, yes, but that lack may not be an experimental deal-breaker.

My real interest is student outcomes like grades, attendance, dropout, completion of assignments, class participation and so on. There is no reason not to measure these as my outcome variables. The disadvantage is that there are surely many factors that contribute to each outcome, not just resilience. So there will be more noise in my outcome measure and consequently I'll be more likely to conclude that my intervention does nothing when in fact it's helping.
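The cost of a noisy outcome measure can be made concrete with a small simulation. This is a rough sketch under invented assumptions: the effect size, the noise levels, the group size of 50, and the helper names are all hypothetical, and the decision rule is a simple approximate two-sided test rather than a full analysis. It runs many imaginary studies of an intervention that truly helps and counts how often each study would detect it.

```python
import random
import statistics

def detects_effect(n_per_group, effect, noise_sd, rng):
    """Simulate one study; return True if the group difference looks reliable."""
    treated = [effect + rng.gauss(0, noise_sd) for _ in range(n_per_group)]
    control = [rng.gauss(0, noise_sd) for _ in range(n_per_group)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / n_per_group
          + statistics.variance(control) / n_per_group) ** 0.5
    return abs(diff) / se > 1.96  # rough two-sided 5% criterion

def detection_rate(noise_sd, rng, n_studies=500):
    """Fraction of simulated studies that detect the (real) effect."""
    hits = sum(detects_effect(50, 1.0, noise_sd, rng) for _ in range(n_studies))
    return hits / n_studies

rng = random.Random(1)  # fixed seed so the sketch is repeatable
# Same true effect in both cases; only the noise in the outcome differs.
rate_direct = detection_rate(noise_sd=1.0, rng=rng)  # stand-in for a direct measure
rate_noisy = detection_rate(noise_sd=4.0, rng=rng)   # stand-in for grades, attendance
print(f"detected with low-noise outcome:  {rate_direct:.0%}")
print(f"detected with high-noise outcome: {rate_noisy:.0%}")
```

The intervention helps equally in both scenarios. The only thing that changes is how many other influences are mixed into the outcome, and that alone is enough to make many of the noisier studies come up empty.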

The advantage is that I'm measuring the outcome I actually care about. Indeed, there would not be much point in crowing about my ability to improve my psychometrically sound measure of resilience if such improvement meant nothing to education.

There is a history of this approach in education. It was certainly possible to develop and test reading instruction programs before we understood and could measure important aspects of reading such as phonemic awareness.

In fact, our understanding of pre-literacy skills has been shaped not only by basic research, but by the success and failure of preschool interventions. The relationship between basic science and practical applications runs both ways.

So although the measurement problem is a troubling obstacle, it's neither atypical nor final.



References
Duckworth, A. L., & Quinn, P. D. (2009). Development and validation of the Short Grit Scale (GRIT–S). Journal of Personality Assessment, 91, 166-174.

Sitzmann, T., & Ely, K. (2011). A meta-analysis of self-regulated learning in work-related training and educational attainment: What we know and where we need to go. Psychological Bulletin, 137, 421-442.


 
 
I like Wikipedia. I like it enough that I have donated during their fund drives, and not simply under the mistaken impression that doing so would make the plaintive face of founder Jimmy Wales disappear from my browser.

Wikipedia is sometimes held up as a great victory for crowdsourcing, although as Jaron Lanier has wryly observed, it would have been strange indeed to have predicted in the 1980s that the digital revolution was coming, and that the crowning achievement would be a copy of something that already existed--the encyclopedia.

That's a bit too cynical in my view, but more important, it leapfrogs an important question: is Wikipedia a good encyclopedia?

For matters related to education, my tentative answer is "no." For some time now I've noticed that articles in Wikipedia got things wrong, even allowing for the fact that some topics in education are controversial.

So in a not-at-all scientific test, I looked up a few topics that came to mind.

Reading education in the United States: The third paragraph reads:

There is some debate as to whether print recognition requires the ability to perceive printed text and translate it into spoken language, or rather to translate printed text directly into meaningful symbolic models and relationships. The existence of speed reading, and its typically high comprehension rate would suggest that the translation into verbal form as an intermediate to understanding is not a prerequisite for effective reading comprehension. This aspect of reading is the crux of much of the reading debate.
There is a large literature using many different methods to assess whether sound plays a role in the decoding of experienced readers, and ample evidence that it does. For example, people are slower to read tongue-twisters than control text (McCutchen & Perfetti, 1982). Whether it is necessary to access meaning or is a byproduct of that process is more controversial. There is also pretty good evidence that speed reading can't really work, due to limitations in the speed of eye movements (Rayner, 2004).

Next I looked at mathematics education. The section of most interest is "research" and it's a grab-bag of assertions, most or all of which seem to be taken from the website of the National Council of Teachers of Mathematics. As such, the list is incomplete: no mention of the huge literatures on (1) math facts (e.g., Orrantia et al., 2010), nor of (2) spatial representations in mathematics (e.g., Newcombe, 2010). The conclusions are also, at times, sketchily drawn ("the importance of conceptual understanding:" well, sure), and on occasion, controversial ("the usefulness of homework:" a lot depends on the details).
Learning styles: You probably could predict the contents of this entry. A long recounting of various learning styles models, followed by a "criticisms" section. Actually, this Wikipedia entry was better than I thought it would be, because I expected the criticism section to be shorter than it is. Still, if you know nothing about the topic, you'd likely conclude "there's controversy" rather than "there's no supporting evidence" (Riener & Willingham, 2010).

Finally, I looked at the entry on constructivism (learning theory). This was a pretty stringent test, I'll admit, because it's a difficult topic.

The first section lists constructivists and this list includes Herb Simon, which can only be called bizarre, given that he co-authored criticisms of constructivism (Anderson, Reder & Simon, 2000).

The rest of the article is a bit of a mish-mash. It differentiates social constructivism (that learning is inherently social) from cognitive constructivism (that learners make meaning) only late in the article, though most authors consider the distinction basic. It mentions situated learning in passing, and fails to identify it as an influential third strain in constructivist thought. A couple of sections on peripheral topics ("Role Category Questionnaire," "Person-centered messages") have been added, it would appear, by enthusiasts.

Of the four passages I examined I wouldn't give better than a C- to any of them. They are, to varying degrees, disorganized, incomplete, and inaccurate.

Others have been interested in the reliability of Wikipedia, so much so that there is a Wikipedia entry devoted to the topic. Two positive results are worthy of note. First, site vandalism is usually quickly repaired. (e.g., in the history of the entry for psychologist William K. Estes one finds that someone wrote "William Estes is a martian that goes around the worl eating pizza his best freind is gondi.") The speedy repair of vandalism is testimony to the facts that most people want Wikipedia to succeed, and that the website makes it easy to make small changes.

Second, Wikipedia articles seem to fare well for accuracy compared to traditional edited encyclopedias. Here's where education may differ from other topics. The studies that I have seen compared articles on pretty arcane topics--the sort of thing that no one has an opinion on other than a handful of experts. Who is going to edit the entry on Photorefractive Keratectomy? But lots of people have opinions about the teaching of reading--and there are lots of bogus "sources" they can cite, a fact I emphasized to the point of reader exhaustion in my most recent book.

Now I only looked through four entries. Perhaps others are better. If you think so, let me know. But for the time being I'll be warning students in my Spring Educational Psychology course not to trust Wikipedia as a source.


References

Anderson, J. R., Reder, L. M., & Simon, H. A. (2000). Applications and Misapplications of Cognitive Psychology to Mathematics Instruction. Texas Education Review, 1(2), 29-49.

McCutchen, D., & Perfetti, C. A. (1982). The visual tongue-twister effect: Phonological activation in silent reading. Journal of Verbal Learning and Verbal Behavior, 21, 672-687.

Newcombe, N. S. (2010). Picture This. American Educator, 1, 29.

Orrantia, J., Rodríguez, L., & Vicente, S. (2010). Automatic activation of addition facts in arithmetic word problems. The Quarterly Journal of Experimental Psychology, 63(2), 310-319.

Radach, R. (2004). Eye movements and information processing during reading (Vol. 16, No. 1-2). Psychology Press.

Riener, C., & Willingham, D. (2010). The myth of learning styles. Change: The Magazine of Higher Learning, 42(5), 32-35.

 
 
Neuroscience reporting: unimpressive.
An op-ed in the New York Times reported on some backlash against inaccurate reporting on neuroscience. (It included name-checks for some terrific blogs, including Neurocritic, Neurobonkers, Neuroskeptic, Mind Hacks, Dorothy Bishop's Blog). The headline ("Neuroscience: Under Attack") was inaccurate, but the issue raised is important; there is some sloppy reporting and writing on neuroscience.

How does education fare in this regard?

There is definitely a lot of neuro-garbage in the education market.

Sometimes it's the use of accurate but ultimately pointless neuro-talk that's mere window dressing for something that teachers already know (e.g., explaining the neural consequences of exercise to persuade teachers that recess is a good idea for third-graders).

Other times the neuroscience is simply inaccurate (exaggerations regarding the differences between the left and right hemispheres, for example).
 
You may have thought I was going to mention learning styles.

Well, learning styles is not a neuroscientific claim; it's a claim about the mind. But it's often presented as a brain claim, and that error is perhaps the most instructive. You see, people who want to talk to teachers about neuroscience will often present behavioral findings (e.g., the spacing effect) as though they are neuroscientific findings.

What's the difference, and who cares? Why does it matter whether the science that leads to a useful classroom application is neuroscience or behavioral?

It matters because it gets to the heart of how and when neuroscience can be applied to educational practice. And when a writer doesn't seem to understand these issues, I get anxious that he or she is simply blowing smoke.

Let's start with behavior. Applying findings from the laboratory is not straightforward. Why? Consider this question. Would a terrific math tutor who has never been in a classroom before be a good teacher? Well, maybe. But we recognize that tutoring one-on-one is not the same thing as teaching a class. Kids interact, and that leads to new issues, new problems. Similarly, a great classroom teacher won't necessarily be a great principal.

This problem--that collections don't behave the same way as individuals--is pervasive.

Similarly, knowing something about a cognitive process--memory, say--is useful, but it's not guaranteed to translate "upwards" the way you expect. Just as children interact, making the classroom more than a collection of kids, so too cognitive processes interact, making a child's mind more than a collection of cognitive processes.

That's why we can't take lab findings and pop them right into the classroom. To use my favorite painfully obvious example, lab findings consistently show that repetition is good for memory. But you can't mindlessly implement that in schools--"keep repeating this til you've got it, kids." Repetition is good for memory, but terrible for motivation.

I've called this the vertical problem (Willingham, 2009). You can't assume that a finding at one level will work well at another level.

When we add neuroscience, there's a second problem. It's easiest to appreciate this way. Consider that in schools, the outcomes we care about are behavioral: reading, analyzing, calculating, remembering. These are the ways we know the child is getting something from schooling. At the end of the day, we don't really care what her hippocampus is doing, so long as these behavioral landmarks are in place.

Likewise, most of the things that we can change are behavioral. We're not going to plant electrodes in the child's brain to get her to learn--we're going to change her environment and encourage certain behaviors. A notable exception is when we suspect that there is a pharmacological imbalance, and we try to use medication to correct it. But mostly, what we do is behavioral and what we hope to see is behavioral. Neuroscience is outside the loop.
For neuroscience to be useful in the classroom we've got to translate from the behavioral side to the neural side and then back again. I've called this the horizontal problem (Willingham, 2009).

The translation to use neuroscience in education can be done--it has been done--but it isn't easy. (I wrote about four techniques for doing it here, Willingham & Lloyd, 2007).

Now, let's return to the question we started with: does it matter if claims about laboratory findings about behavior are presented as brain claims?

I'm arguing it matters because it shows a misunderstanding of the relationship of mind, brain, and educational applications.

As we've seen, behavioral sciences and neuroscience face different problems in application. Both face the vertical problem. The horizontal problem is particular to neuroscience. 

When people don't seem to appreciate the difference, that indicates sloppy thinking. Sloppy thinking is a good indicator of bad advice to educators. Bad advice means that neurophilia will become another flash in the pan, another fad of the moment in education, and in ten years' time policymakers (and funders) will say "Oh yeah, we tried that."

Neuroscience deserves better. With patience, it can add to our collective wisdom on education. At the moment, however, neuro-garbage is ascendant in education.

EDIT:
I thought it was worth elaborating on the methods whereby neuroscientific data CAN be used to improve education:
Method 1
Method 2
Method 3
Method 4
Method 5
Conclusions

Willingham, D. T. (2009). Three problems in the marriage of neuroscience and education. Cortex, 45, 544-545.
Willingham, D. T., & Lloyd, J. W. (2007). How educational theories can use neuroscientific data. Mind, Brain, & Education, 1, 140-149.
 
 
Michael Gove, Secretary of State for Education in Great Britain, delivered a speech on education policy last week called "In Praise of Tests" (text here), in which he argued for "regular, demanding, rigorous examinations."

The reasons offered included arguments invoking scientific evidence, and cited my work as examples of such evidence. That invites the question "Does Willingham think that the scientific evidence supports testing, as Gove suggested?"

This question really has two parts. Did Gove get the science right? And did he apply it in a way that is likely to work as he expects?

The answer to the first question is straightforward: yes, he got the science right. The answer to the second question is that I agree that testing is necessary, but have a different take on the scientific backing for this claim than Gove offered.

First, the science. Gove made three scientific claims. First, that people enjoy mental activity that is successful--it's fun to solve challenging problems. Much of the first chapter of Why Don't Students Like School is devoted to this idea, but it's a commonplace observation; that's why people enjoy problem-solving hobbies like crossword puzzles or reading mystery novels.

Second, Gove claimed that background knowledge is critical for higher thought, a topic I've written about in several places (e.g., here).

The only quibble I have with Gove on this topic is when he says "Memorisation is a necessary precondition of understanding." I'd have preferred "knowledge" to "memorisation" because the latter makes it sound as though one must sit down and willfully commit information to memory. This is a poor way to learn new information--it's much more desirable that the to-be-learned material is embedded in some interesting activity, so that the student will be likely to remember it as a matter of course.

It's plain that Gove agrees with me on this point, because he emphasized that exam preparation should not mean a dull drilling of facts, but rather should happen through "entertaining narratives in history, striking practical work in science and unveiling hidden patterns in maths." I think the word "memorisation" may be what led the Guardian to use a headline suggesting Gove was advocating rote learning.

Third, Gove argued that people (teachers and others) are biased in their evaluations of students, based on the student's race, ethnicity, gender, or other features that have nothing to do with the student's actual performance. A number of studies from the last forty years show that this danger is real.

So on the science, I think Gove is on firm ground. What of the policy he's advocating?

I lack expertise in policy matters, and I've argued on this blog that the world of education might be less chaotic if each of us stuck a little closer to the home territory of what we know. Worse yet, I know little about the British education system or about Gove's larger policy plans. With those caveats in place, I'll tread on Gove's territory and offer these thoughts on policy.

It's true that successful thought brings pleasure. The sort of effort I (and others) meant was the solving of a cognitive problem. Gove offers the example of a singer finishing an aria or a craftsman finishing an artefact. These works of creative productivity likely would bring the sort of pleasure I discussed. It's less certain that the passing of an examination would be "successful thought" in this sense.

Why? Because exams seldom call for the creative deployment of knowledge. Instead, they call for the straightforward recall of knowledge. That's because it's very difficult to write exams that call for creative responses, yet are psychometrically reliable and valid.

There is a second manner in which achievement can bring pleasure; I haven't written about it, but I think it's the one Gove may have in mind. It's the pleasure of overcoming a formidable obstacle that you were not sure you could surmount.

I agree that passing a difficult test could be a profound experience. Some children really don't see themselves as students. They have self-confidence, but it comes from knowing that they are effective in other activities. Passing a challenging exam might prompt a child who never really thought of himself as "a student" to recognize that he's every bit as able as other children, and that might redirect the remainder of his school experience, even his life.

But there are some obvious difficulties in reaching this goal. How do we motivate the student to work hard enough to actually pass the difficult test? The challenge of the exam is unlikely to do it--the child is much more likely to conclude that he can't possibly pass, so there is no point in trying.

The clear solution is to engage creative teachers who have the skill to work with students who begin school poorly prepared and who may come from homes where education is not a priority. But motivation was the problem we began with, the one we hoped to address. It seems to me that the motivational boost we get from kids passing a tough exam might be a good outcome of successfully motivating kids. It's not clear to me that it will motivate them.

My second concern in Gove's vision of testing is how teachers will believe they should best prepare kids for a difficult exam that demands a lot of factual recall.

Gove is exactly right when he argues that teachers ought not to construe this as a call for rote learning of lists of facts, but rather should ensure that rich factual content is embedded in rich learning activities.

My concern is that some British teachers--in particular, the ones whose performance Gove hopes to boost--won't listen to him.

I say that because of the experience in the US with the No Child Left Behind Act. In the face of mandatory testing for students, some teachers kept doing what they had been doing, which is exactly what Gove suggests: rich content interwoven with a demand for critical thinking, delivered in a way that motivates kids. These teachers were unfazed by the test, certain that their students would pass.

Other teachers changed lesson plans to emphasize factual knowledge, and focused activities on test prep. I've never met a teacher who was happy about this change. Teachers emphasized facts at the expense of all else and engaged in disheartening test prep because they thought it was necessary.

Teachers believed it was necessary because (1) they were uncertain that their old lesson plans would leave kids with the factual knowledge base to pass the test; or (2) they thought that their students entered the class so far behind that extreme measures were necessary to get them to the point of passing; or (3) they thought that the test was narrow or poorly designed and would not capture the learning that their old set of lesson plans brought to kids; or (4) some combination of these factors. So pointing out that exam prep and memorization of facts are bad practice will probably not be enough.

Despite these difficulties, I think some plan of testing is necessary.  Gove puts it this way: "Exams help those who need support to better know what support they need." A cognitive psychologist would say "learning is not possible without feedback." That learning might be an individual student mastering a subject, OR a teacher evaluating whether his students learned more from a new set of lesson plans he devised compared to last year, OR whether students at a school are learning more with block scheduling compared to their old schedule. In each case, you want to be confident that the feedback is valid, reliable, and unbiased. And if social psychology has taught us anything in the last fifty years, it's that people will believe their informal judgments are valid, reliable, and unbiased, whether they are or not.

There's more to the speech and I encourage you to read all of it. Here I've commented only on some of the centerpiece scientific claims in it. Again, I emphasize that I don't know British education and I don't know Gove's plans in their entirety, so what I've written here may be inaccurate because it lacks broader context.

I can confidently say this: hard as it is, good science is easier than good policy. 



 
 
This is a negative finding, so I'll keep it brief.

How do kids acquire new vocabulary? This process is poorly understood.

An influential theory has been that the phonological loop in working memory provides essential support. The phonological loop is like a little tape loop lasting perhaps two seconds; it allows you to keep active a sound you hear.

The idea is that a new unfamiliar word can be placed on the loop for practice and to keep it around while the surrounding context helps you figure out the meaning.

If so, you'd predict that the larger the capacity of the phonological loop and the greater the fidelity with which it "records" the better children will be able to learn new vocabulary.

The efficacy of the phonological loop is measured by having kids repeat nonsense words. Initially they are short--tozzy--but they increase in length to pose greater challenge to the phonological loop--liddynappish.

Several studies have shown correlations between phonological loop capacity and vocabulary size in children (for a review, see Melby-Lervåg & Lervåg, 2012).

The problem: it could be that having a big vocabulary makes the phonological loop test easier, because it makes it more likely that some of the nonsense words remind you of a word you already know. (And so you have the semantics of that word helping you remember the to-be-remembered word.) Indeed, even proponents of the hypothesis argue that's what happens when kids get older.

What you really need is a study that measures phonological loop capacity at time 1, and finds that it predicts vocabulary size at time 2. There is one such study (Gathercole et al, 1992) but it used a statistical analysis (cross-lagged correlation) that is now considered less than ideal.
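
For readers who want to see the cross-lagged logic spelled out, here is a minimal sketch in Python. It is not the analysis from any of the papers cited here, and the scores are made up purely for illustration. The idea is to compare two "diagonal" correlations: loop capacity at time 1 with vocabulary at time 2, and vocabulary at time 1 with loop capacity at time 2. If the first is clearly larger than the second, that was traditionally read as a hint about the direction of influence.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cross_lagged(loop_t1, vocab_t1, loop_t2, vocab_t2):
    """Return the two cross-lagged correlations for the same children."""
    return {
        "loop_t1_with_vocab_t2": pearson(loop_t1, vocab_t2),
        "vocab_t1_with_loop_t2": pearson(vocab_t1, loop_t2),
    }


# Hypothetical scores for six children, invented for this example only.
loop_t1 = [3, 4, 5, 6, 7, 8]          # nonword repetition, time 1
vocab_t1 = [20, 25, 22, 30, 28, 35]   # vocabulary, time 1
loop_t2 = [4, 5, 5, 7, 8, 9]          # nonword repetition, time 2
vocab_t2 = [30, 33, 31, 40, 41, 47]   # vocabulary, time 2

result = cross_lagged(loop_t1, vocab_t1, loop_t2, vocab_t2)
for name, r in result.items():
    print(f"{name}: r = {r:.2f}")
```

One reason a comparison like this is considered less than ideal is that it ignores how stable each measure is from time 1 to time 2, so a difference between the two correlations need not reflect causal influence.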

A new study (Melby-Lervåg et al, 2012) used probably the best methodology of any used to date. It was a longitudinal study that tested nonword repetition ability and vocabulary once each year between the ages of 3 and 7.

They used a different statistical technique--simplex models--to assess causal relationships. They found that both nonword repetition and vocabulary show growth, both show stability across children, and both are moderately correlated, but there was no evidence that one influenced the growth of the other over time.

The group then reanalyzed the Gathercole et al (1992) data and found the same pattern.

This is one depressing paper. Something we thought we knew--the phonological loop contributes to vocabulary learning--may well be wrong.

If anyone is working on a remediation program for young children that centers on improving the working of the phonological loop, it's probably time to rethink that idea.




Gathercole, S. E., Willis, C., Emslie, H., & Baddeley, A. (1992). Phonological memory and vocabulary development during the early school years: A longitudinal study. Developmental Psychology, 28, 887–898.

Melby-Lervåg, M., & Lervåg, A. (2012). Oral language skills moderate nonword repetition skills in children with dyslexia: A meta-analysis of the role of nonword repetition skills in dyslexia. Scientific Studies of Reading, 16, 1–34.

Melby-Lervåg, M., Lervåg, A., Lyster, S.-A. H., Klem, M., Hagtvet, B., & Hulme, C. (in press). Nonword-repetition ability does not appear to be a causal influence on children's vocabulary development. Psychological Science.
 
 
In an op-ed piece in August 19th's New York Times, Bronwen Hruska tells of her experiences with her son, Will, between the 3rd and 5th grade. Will was misdiagnosed with ADHD.

Hruska and her husband were initially approached by Will's teacher, who thought his behavior indicated ADHD. Though they were doubtful, they took him to a psychiatrist who said that Will did indeed have ADHD and prescribed stimulant medication. Will took the medication for two years but stopped when he concluded that Adderall is dangerous. Now that he is a happy high school sophomore, there is not much reason to think that the medication was ever necessary.

How did this happen?

The title of the piece--"Raising the Ritalin Generation"--provides a clue to the author's conclusion. Hruska suggests that our society is sick. Teachers are too quick to suggest medication for kids. Schools "want no part" of average kids; they expect kids to be exceptional, extraordinary. And we, as a society, are teaching kids that average is not good enough, and that if you're only average you should take a pill.

But there's an important piece missing from this picture--parents.

From what's written, it sure does sound like Will was misdiagnosed. But I can't help but wonder why his parents didn't know it at the time.

ADHD diagnosis requires that symptoms be present in at least two settings. So it's not enough that Will shows troubling symptoms in school: he would also need to show them at home, in social settings, or in some other context for him to be diagnosed. There's no indication of a problem outside of school.

It's also notable that the mere presence of symptoms is not enough: the symptoms must be clinically significant; in other words, they must obstruct the child's ability to function well in that setting. Yet Hruska maintains that Will seemed like a typical kid to her.

This is where Hruska loses me. Why would she accept the diagnosis if symptoms were observed in just one context, and if she believed there was limited evidence that the symptoms were clinically significant in that context? Why wouldn't she challenge the physician who diagnosed him?

I'm led to wonder if she knew the diagnostic criteria. They aren't hard to find. Google "adhd diagnosis." The first link is the CDC site that offers a reader-friendly version of the DSM-IV criteria.

Are our kids pill-happy? Are we raising a Ritalin generation? If so, the solution is not to lay all of the blame on schools and society or even on physicians who make mistakes, and to portray parents as powerless victims. The solution is for parents to make better use of the wealth of scientific information available to us, and to ask questions when a doctor or other authority makes claims that fly in the face of our experience.

 
 
It's not often that an initiative prompts grave concern in some and ridicule in others. The Gates Foundation managed it.

The Foundation has funded a couple of projects to investigate the feasibility of developing a passive measure of student engagement, using galvanic skin response (GSR).

The ridicule comes from an assumption that it won't work.

GSR basically measures how sweaty you are. Two leads are placed on the skin. One emits a very, very mild charge. The other measures the charge. The more sweat on your skin, the better it conducts the charge, so the better the second lead will pick up the charge.
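
To make the arithmetic concrete, here is a minimal sketch in Python. It assumes a typical setup (not described in the Foundation's materials) in which a small fixed voltage is applied across the two leads and the resulting current is measured; the function name and the numbers are hypothetical. By Ohm's law, conductance is current divided by voltage, so sweatier skin shows up as a higher number.

```python
def skin_conductance_microsiemens(applied_volts, measured_microamps):
    """Conductance G = I / V. Microamps divided by volts gives microsiemens."""
    if applied_volts <= 0:
        raise ValueError("applied voltage must be positive")
    return measured_microamps / applied_volts


# Hypothetical readings: the same 0.5 V applied at two moments in time.
calm = skin_conductance_microsiemens(0.5, 2.5)      # 5.0 microsiemens
aroused = skin_conductance_microsiemens(0.5, 4.0)   # 8.0 microsiemens
print(calm, aroused)
```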

Who cares how sweaty your skin is?

Sweat--as well as heart rate, respiration rate and a host of other physiological signs controlled by the peripheral nervous system--varies with your emotional state.

Can you tell whether a student is paying attention from these data? 

It's at least plausible that it could be made to work. There has long been controversy over how separable different emotional states are, based on these sorts of metrics. It strikes me as a tough problem, and we're clearly not there yet, but the idea is far from kooky, and indeed, the people who have been arguing it's possible have been making some progress--this lab group says they've successfully distinguished engagement, relaxation and stress. (Admittedly, they gathered a lot more data than just GSR and one measure they collected was EEG, a measure of the central, not peripheral, nervous system.)

The grave concern springs from the possible use to which the device would be put.

A Gates Foundation spokeswoman says the plan is that a teacher would be able to tell, in real time, whether students are paying attention in class. (Earlier the Foundation website indicated that the grant was part of a program meant to evaluate teachers, but that was apparently an error.)

Some have objected that such measurement would be insulting to teachers. After all, can't teachers tell when their students are engaged, or bored, or frustrated, etc.?

I'm sure some can, but not all of them. And it's a good bet that beginning teachers can't make these judgements as accurately as their more experienced colleagues, and beginners are just the ones who need this feedback. Presumably the information provided by the system would be redundant to teachers who can read it in their students' faces and body language, and these teachers will simply ignore it.

I would hope that classroom use would be optional--GSR bracelets would enter classrooms only if teachers requested them.

Of greater concern to me are the rights of the students. Passive reading of physiological data without consent feels like an invasion of privacy. Parental consent ought to be obligatory. Then too, what about HIPAA? What is the procedure if a system that measures heartbeat detects an irregularity?

These two concerns--the effect on teachers and the effect on students--strike me as serious, and people with more experience than I have in ethics and in the law will need to think them through with great care.

But I still think the project is a terrific idea, for two reasons, neither of which has received much attention in all the uproar.

First, even if the devices were never used in classrooms, researchers could put them to good use.

I sat in on a meeting a few years ago of researchers considering a grant submission (not to the Gates Foundation) on this precise idea--using peripheral nervous system data as an on-line measure of engagement. (The science involved here is not really in my area of expertise, and I had no idea why I was asked to be at the meeting, but that seems to be true of about two-thirds of the meetings I attend.) Our thought was that the device would be used by researchers, not teachers and administrators.

Researchers would love a good measure of engagement because the proponents of new materials or methods so often claim "increased engagement" as a benefit. But how are researchers supposed to know whether or not the claim is true? Teacher or student judgements of engagement are subject to memory loss and to well-known biases.

In addition, I see potentially great value for parents and teachers of kids with disabilities. For example, have a look at these two pictures.
This is my daughter Esprit. She's 9 years old, and she has Edwards syndrome. As a consequence, she has a host of cognitive and physical challenges, e.g., she cannot speak, and she has limited motor control and bad motor tone (she can't sit up unaided).

Esprit can never tell me that she's engaged either with words or signs. But I'm comfortable concluding that she is engaged at moments like that captured in the top photo--she's turning the book over in her hands and staring at it intently.

In the photo at the bottom, even I, her dad, am unsure of what's on her mind. (She looks sleepy, but isn't--ptosis, or drooping upper eyelids, is part of the profile.) If Esprit wore this expression while gazing towards a video, for example, I wouldn't be sure whether she was engaged by the video or was spacing out.

Are there moments that I would slap a bracelet on her if I thought it could measure whether or not she was engaged?

You bet your sweet bippy there are. 

I'm not the first to think of using physiologic data to measure engagement in people with disabilities that make it hard to make their interests known. In this article, researchers sought to reduce the communication barriers that exclude children with disabilities from social activities; the kids might be present, but because of their difficulties describing or showing their thoughts, they cannot fully participate in the group.  Researchers reported some success in distinguishing engaged from disengaged states of mind from measures of blood volume pulse, GSR, skin temperature, and respiration in nine young adults with muscular dystrophy or cerebral palsy.

I respect the concerns of those who see the potential for abuse in the passive measurement of physiological data. At the same time, I see the potential for real benefit in such a system, wisely deployed.

When we see the potential for abuse, let's quash that possibility, but let's not let it blind us to the possibility of the good that might be done.

And finally, because Esprit didn't look very cute in the pictures above, I end with this picture.

 
 
Several people have sent this post from the New York Times web site to me. The author is a Professor of Philosophy at Notre Dame named Gary Gutting, and he argues that data from the social sciences are less useful than many people seem to believe.

Here is the nub of his argument:

Point 1: The author points out that some conclusions have more data behind them than others: initially, Higgs suggested something like the boson might exist. Today, there are many more data consistent with the boson and its suggested characteristics.

Point 2: The best-supported conclusions of the core natural sciences (physics, chemistry, biology) are well established, but the best-developed social sciences (he cites economics as an example) “have nothing like this status” in terms of the reliability of their findings.

Point 3: “While many of the physical sciences produce many detailed and precise predictions, the social sciences do not. The reason is that such predictions almost always require randomized controlled experiments which are seldom possible when people are involved.”

Point 4: He closes by suggesting that results from social sciences not be ignored, but that we must recognize their limitations. “At best, they can supplement the general knowledge, practical experience, good sense and critical intelligence that we can only hope our political leaders will have.”

There are several significant problems here.

Point 1: Gutting is right. Better supported conclusions are more reliable.

Point 2: Gutting argues that findings from social sciences are less reliable than those of natural sciences, and so should not be trusted. First, if our goal is to use science to inform public policy (his stated goal) then the question is whether social science findings add value compared to not having the findings in hand. It doesn’t really matter if they are as good as some other science. Second, on the point of reliability, Gutting is wrong. Some findings in social sciences are known with a high degree of reliability, e.g., the consequences of schedules of practice on memory.

On this point and point 3, Gutting paints with an extraordinarily broad brush, seeking to characterize the nature of “social sciences” in general, and to contrast them with all of the “natural sciences.”

Point 3: Regarding the use of random control designs, Gutting is again in error. Most of the experiments I have done in my career have been laboratory studies using random control designs. Readers of this blog are of course aware that many studies in education use this design. Gutting also fails to note that some areas of biology (e.g., evolutionary biology) and of physics (e.g., astrophysics) make extensive use of data that are not produced by randomized control trials.

Point 4: Inviting data from social sciences to “supplement” good old common sense is tantamount to inviting people to ignore the data, or more likely to interpret the data as consistent with their beliefs, a phenomenon described by social scientists.

I am suspicious that this post was written as part of an experiment by nefarious social scientists to see how readers of the Times would react.



 
 
I get so many questions about learning styles that I added an FAQ to my website. You can find it here.