Daniel Willingham: Science and Education Blog

Can Children Be Taught to Comprehend What They Read?

1/8/2024

Just how much does it help to teach children to use strategies when they read--strategies like creating a graphic organizer of the passage, or summarizing as they read, or asking themselves questions and answering them?

I’ve just published an article in Educational Leadership summarizing the research on this question, and I’ll summarize it here.

In 2006, I argued that there was lots of evidence that comprehension strategy instruction worked, and in fact, yielded a big boost to comprehension. I was in good company—The National Reading Panel had drawn the same conclusion five years earlier.

But I also argued that there was no evidence that practice of these strategies provided any additional benefit. I based that conclusion on two meta-analyses—research that synthesizes the results of different studies. Meta-analysis allows one to compare relatively brief exposure to strategy instruction (a total of, say, five hours) versus more practice with strategies (twenty hours). Both meta-analyses suggested that there was no benefit to more practice.

There’s been a good deal of research since then. In my recent article, I report that the number of meta-analyses is now up to twelve, and all are in accord. Practice has no impact on the effectiveness of comprehension strategy instruction.

That observation matters for two reasons. First, and most obviously, it suggests that although it’s well worth the time to teach students comprehension strategies, there’s no reason to devote a lot of time to practicing them. A total of five or ten hours of instruction yields the same advantage as twenty or thirty hours.

Second, this finding suggests that strategy instruction works for a different reason than I suspect many people believe.

It’s tempting to think of comprehension strategy instruction as analogous to coaching in baseball. If you’re a poor hitter, a coach shows you how a good hitter swings. You practice that swing and, in time, it becomes automatic and replaces the older, less effective habit. Likewise, we might think that comprehension strategies show less competent readers the way that more competent readers approach texts.

But this hypothesized “coaching” mechanism doesn’t make any sense because it depends on practice, and the data indicate that practice doesn’t help.

Here’s an alternative interpretation. When a typically-developing child starts school they can use oral language to make inferences, connect sentences, and understand the overall gist of a message. These same mental processes are put to work to support reading comprehension. Indeed, it would be odd if the brain created specialized reading comprehension processes from scratch, rather than applying to reading the mental processes that are already in place to support oral language.

The mental processes of reading comprehension don’t require or benefit from practice because children are already quite good at them when they start school.

According to this account, strategy instruction is comparable to a strategy like “check your work” in math. It doesn’t improve the processes that actually do math. It’s a useful way of controlling those processes.

In the same way, comprehension strategy instruction probably has no impact on the processes of comprehension per se, but it reminds students that they are supposed to coordinate meaning across sentences and paragraphs, and to get the gist of the passage; in short, it reminds them that reading is not simply a matter of decoding each word until you reach the last one.

But that’s not quite the end of the story.

My description of comprehension strategy instruction could be interpreted as implying that reading instruction should end around fourth grade. Schooling should include phonics instruction, some work to support fluency, and then perhaps two weeks of comprehension strategy instruction. What’s the point of anything else, if comprehension can’t be taught? (I hadn’t thought of this implication of my account until Tim Shanahan pointed it out.)

Surely that implication can’t be right. Explaining why calls for differentiating types of comprehension.

I’ve suggested that strategies prompt children to apply already-present oral language comprehension processes.

An example would be anaphora resolution, as when a listener finds the referent for “he” in “he went to church.” Another example would be inferences supporting causality or explanation; seeking to understand why things happened seems to be a core aspect of cognition. And indeed, we know a four-year-old has no difficulty in making causal bridging inferences in everyday conversation, as when a parent says “You seem bored. Shall we go outside?”

Exactly what prompts inferences in oral language or reading has been difficult to pin down, and there are surely individual differences. I think it’s uncontroversial that the two examples I’ve offered are universal.

It’s also uncontroversial that students are asked to do things with texts that go beyond comprehension supported by oral language processes. They learn sophisticated ways of evaluating arguments; for example, to appreciate that correlation is not equivalent to causation. They learn to evaluate the quality of writing, as when they come to understand how a good paragraph is structured. They also learn tools of analysis that are discipline-specific: why a novelist uses foreshadowing, for example, or how to interpret source information when reading historical documents.

Clearly, these skills must be taught, and there is every reason to think that they are subject to practice effects.

So we should differentiate kinds of comprehension. Some comprehension is supported by processes initially acquired for oral language, and presumably these processes yield a fairly basic understanding of the who, what, where, why, and how of the text. Other comprehension processes offer more sophisticated analysis, and these need to be explicitly taught.

An implication of this hypothesis is that the comprehension tests used in strategy research lean heavily on the first type of process; comprehension tests demand a basic understanding, not a more complex analysis. That prediction has not been tested, so far as I know.

I’ve long argued for the critical importance of knowledge in reading comprehension, but knowledge isn’t everything—teaching students certain types of analysis is critical as well. Understanding how each applies to instruction can help us maximize student enjoyment of and achievement in reading.

Should Psychology PhD Programs Require the GRE?

4/7/2022

Some colleges are no longer requiring that applicants submit SAT scores (or they’re making them optional) because scores correlate with parental income and with ethnicity. Thus the test appears biased—good students are being denied a place at school because of their ethnicity or class. Paige Harden wrote an excellent piece for the Atlantic arguing that dropping the SAT in the name of equity in admissions is a bad idea.

I encourage you to read it, but to provide a little background here, Harden argues that the SAT is not biased, but America’s K-12 education system is. The SAT faithfully tracks the unequal results. Dropping it deprives us of a useful yardstick in measuring those inequities.

Furthermore, a good SAT score is the criterion for college admission that cannot be purchased. Wealthy parents can buy their children places in elite schools, and they can pay for tutors to ensure they succeed there. Wealthy kids don’t have to get part-time jobs or care for other family members, and they can afford to take prestigious unpaid internships. But although they may pay for pricey courses that claim to help you ace the SAT, these courses don’t much affect scores.

Many graduate programs have dropped the GRE. Are they making the same mistake some colleges have made?

There are parallels. In most graduate programs, the student population does not reflect the diversity of the US population—it’s disproportionately white and wealthy. GRE scores, like SAT scores, are correlated with wealth and ethnicity. And, as is true of the SAT, there’s no evidence that the test is biased—the differences reflect bias in how the educational system in the US allocates opportunity.

But whereas the SAT may allow more equitable admissions decisions, I doubt that’s true for the GRE. I think the equity problem for graduate admissions is different than the undergraduate issue and calls for a different solution.

In graduate admissions, we focus on two characteristics of the candidate: capability (can they do the work?), and motivation (will they want to do the work?).

(Note that I said “we.” Professors have nothing to do with undergraduate admissions, but everything to do with it on the graduate side. Graduate training operates on a mentorship system, so an individual professor makes the admission decision, subject to the approval of her department and school.)

Your grades and GRE score tell me whether you can do the work. Statistically the GRE adds some information to grades—the score’s not completely redundant—but it doesn’t add a lot. The GRE would help with the equity issue if there were a big halo effect for prestige schools; in other words, if I didn’t trust that the A- you earned at a not-very-selective school (say, Southern Illinois University—Carbondale, which accepts about 90% of applicants) is equivalent to an A- from Yale (which accepts about 5%). Or even equivalent to a B from Yale.

In my experience, that’s just not that big an issue. First, there are excellent faculty everywhere, and no one is more aware of that fact than university faculty. (I truly picked SIU-Carbondale at random, but when I went to their psych webpage I immediately noticed a memory researcher whose excellent work I’ve known by reputation forever, as well as one of our PhDs, now a professor there.) If I’m really concerned about whether courses were rigorous, it’s easy for me to find out exactly which textbook they used in their Introduction to Cognition course, and so on.

I can see why some faculty might figure “the more information, the better,” but it’s asking the student to go to a non-trivial amount of trouble and expense to provide me with information that I don’t value that highly.

The equity problem loads less on the capability metrics (grades and GRE), and more on the metrics of motivation. Motivation is crucial because it could easily happen that, although you enjoyed cognitive psychology as an undergraduate, you’ll later find that doing it for 50 hours per week doesn’t float your boat. No one benefits if you spend a year or two discovering that you’re sorry you came, so when I look at applications, I want evidence that the person knows what they are getting into.

Research experience is the main way to show that, because that’s what you’ll be doing when you get to grad school.

And wealth helps you gain research experience. Wealthy kids can take an unpaid research assistant position for a summer or during the school year. Wealthy kids attend colleges where faculty have elaborate labs and are given time outside the classroom to mentor students in research. In addition, wealthy kids are still cashing in on their rich high school experiences; coming in with a lot of AP credits, they can get to upper level courses that call for research earlier in their college careers.

If the GRE won’t do much to identify untapped talent, what will?

It might help to measure motivation in a more situation-specific fashion. Doing research is great, but having done a project doesn’t reveal great motivation if you attend a school where a research project is required, or where participating is just a matter of signing up.

Other students might have less research experience, but have shown determination in making the most of what was available to them in light of their environment and/or their other responsibilities.

In other words, we should be asking graduate candidates if they had special circumstances (work-related, health-related, and the like) that affected their undergraduate years. For example, when a student takes a course on structural equation modelling online, instead of wondering whether it was really comparable to an in-person course, we should credit the student for taking the initiative to take a course not offered at her school. If student A’s undergraduate research project seems somewhat unimaginative compared to that of student B, we might note that student B worked in a large lab that churns out many projects each year, whereas student A worked with a professor whose teaching responsibilities leave him little time to supervise student research, but the professor thought so highly of this particular student, he made an exception. And so on.

Standardized tests should always be interpreted in light of the purpose to which they are put. In the case of the GRE, I don’t think there’s a lot of value added, either in terms of helping us to identify great future psychologists, or specifically in terms of ensuring that we’re not leaving talent on the table. But making changes to our evaluation of student motivation might improve our decision-making.

July 28th, 2020

7/28/2020

Sorry, I had to take this down...something close to it may be published elsewhere.

Responding to a Study You Just KNOW Is Wrong

5/15/2020

Earlier this week I participated in a Zoom session organized and hosted by David Weston of the Teacher Development Trust. David asked me an interesting question, one I hadn’t considered before. He noted that learning styles theories have become a shibboleth for educators who are scientifically “in the know;” mention learning styles in any positive way and they will pounce on you with glee.

There are two problems with verbally thrashing people who say something you think lacks scientific support. First, it’s simply a bad tactic. You look like a bully and snob; sure, the flush of self-righteousness is heady, but you’re not informing the other person, although you're pretending to. You’re not even really talking to them, you’re just enjoying yourself.

The second problem caused by the uncritical dismissal of disproven theories is that you might miss new developments. I’ve written about learning styles theories many times and my view of these theories has changed (see here versus here), notably because of excellent experimental work by David Kraemer, Josh Cuevas, and others. If you hear “learning styles” and immediately shout “Nonsense! Ridiculous!” without noticing that someone is providing new data in a well-conducted study, you miss things.

Which brings me to a study published by my colleague at UVa, Vikram Jaswal, about which I tweeted earlier this week.

The paper describes an eye-tracking study in autistics who appear to be typing responses to questions on a letterboard. This sort of communication from people who cannot speak has a long and terrible history. In a previous incarnation called facilitated communication, an assistant held/supported the arm of the respondent (or in some other way touched them) as he or she typed. Research made it clear that the communication was actually coming from the assistant, although the assistant might not be aware of this influence.

The story of facilitated communication has become a staple in psychology methods courses, along with Clever Hans, and it also generated interest in a particular type of unconscious social influence and the methods by which we consciously interpret the reasons for our own behavior. (Which in turn provided Dan Wegner, my colleague at UVa at the time, and a lover of puns, the chance to publish a paper titled Clever Hands.)

The more recent version of this technique, employed in Jaswal's study, has the assistant holding the letterboard, but not touching the person who’s typing. This removes one way the assistant might be authoring the communication, but not all; the assistant might subtly indicate which letter is to be selected.

The study I tweeted about examined eye movements and pointing in a small (N=9) group of autistics who regularly use this method. The predictions are straightforward; if you think people are responding to cues, that’s a 26-choice response time task, and RTs in that sort of task would be slow, and people would make a lot of errors. You’d also predict that they would look at the assistant fairly often. If, in contrast, you think that participants are actually typing themselves, you’d expect fluent typing, you’d expect eye movements to lead finger movements without checking what the assistant was doing, and perhaps most interesting, you’d expect that response times would be slow at the start of a new word (for a multiword response) or at the beginning of the second morpheme in a compound word (like “scarecrow”). This effect is observed in touch typists, and is due to motor planning processes.

The authors claimed that the data were consistent with the interpretation that those typing were agentic—at least part of the behavior was self-generated, rather than being fully determined by external cues. It’s the first such demonstration, which helps explain why it was published in a prestigious journal.

The response to my Tweet was a series of Tweets that were highly critical of…lots of things, and were reminiscent of the sort of thing David Weston asked me about regarding learning styles. I’ll provide just a very small sample of them.

These two tweets suggested that the lead author (Jaswal) had an association with some people who do terrible things.

Jaswal doesn't run "The Tribe," btw, and never has. He taught an undergrad class which collaborated with The Tribe, and that fact is described in the "Competing Interests" section of the paper.

This person wanted the authors to conduct a different experiment.

There was also a good deal of discussion about the fact that facilitated communication has been discredited…which the introduction of the paper points out, and which draws the distinction between the method they use and facilitated communication.

All of these tweets have one thing in common: they don’t address the study. You can debate whether the researchers should have run a different study, and just how terrible some people who use this method are or aren’t, but there the data sit, waiting to be explained.

Some critics did try to address issues with the data.

These two watched the video provided by the authors as supplementary material and reckoned they could code it more accurately than the coders, who analyzed pointing and eye movements frame by frame. Coders analyzed more than 142,000 frames.

As near as I can tell, these tweeters are unaware of criteria for deciding what counts as the target of an eye movement versus gaze travelling through a location on its way to a target (dwell time > 98 ms).

This next bit would be an important criticism, except this person too appears not to have read the methods section, which describes how it was determined which letter the speller pointed to.

This tweet came closer to the mark by calling attention to the sample size.

But small sample sizes are routine in certain types of work, especially neuropsychology. If you can’t get many people with the desired characteristics, you test who you can test. You recover statistical power by administering many trials, which lets you conduct statistical analyses person by person. This is a routine method in psychophysics. But the small N is a valid concern in that it calls into question generalizability. If we take the conclusions at face value, should we find them interesting if they might apply to just a tiny fraction of autistics?
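To make the person-by-person logic concrete, here is a minimal Python sketch. Everything in it is hypothetical: the participant labels, the hit counts, the chance level, and the function name are illustrative assumptions, not values or methods from Jaswal's study. It computes an exact one-sided binomial tail probability for each participant separately, which shows how a large number of trials per person can support a test even when there are only a handful of people.

```python
from math import comb


def binomial_tail(successes: int, trials: int, chance: float) -> float:
    """Exact probability of at least `successes` hits in `trials`
    independent trials when each hit has probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: (hits, trials) for three made-up participants.
participants = {"P1": (60, 200), "P2": (12, 20), "P3": (30, 200)}
CHANCE = 0.10  # assumed chance rate, chosen only for illustration

# Each participant is tested on their own trials, not pooled with others.
for name, (hits, trials) in participants.items():
    p_value = binomial_tail(hits, trials, CHANCE)
    print(f"{name}: {hits}/{trials} hits, one-sided p = {p_value:.3g}")
```

Note what this does and does not buy you: many trials sharpen the conclusion about each individual tested, but they say nothing about how many other people the conclusion applies to, which is the generalizability concern raised above.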

I've picked out tweets that addressed methods, but for the most part, people didn’t engage with the actual study to discredit it. They attacked the experimenter and his associates, they broadly said it’s obvious this can’t work, they said the method has been discredited before.

If they really wanted to shoot for the soft spot of this study, they should have gone after things like the simulated baseline: the percentage of points to correct letters that would have been preceded by a fixation on that letter if fixations had been random. That baseline was used in the analysis that supported a key conclusion of the study, and the method the authors used to calculate it is probably open to debate.
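For readers unfamiliar with this kind of chance baseline, here is a rough Python sketch of the general idea. It is not the authors' method. It assumes, purely for illustration, that each point is preceded by a fixed number of fixations and that each fixation lands on one of 26 letters with equal probability; the function names and parameter values are hypothetical.

```python
import random


def random_fixation_baseline(n_points, fixations_per_point, n_letters=26, seed=0):
    """Simulate the share of points preceded by a fixation on the pointed-to
    letter when fixations fall on letters uniformly at random (an assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        target = rng.randrange(n_letters)
        fixated = {rng.randrange(n_letters) for _ in range(fixations_per_point)}
        if target in fixated:
            hits += 1
    return hits / n_points


def analytic_baseline(fixations_per_point, n_letters=26):
    """Closed-form value of the same quantity under the same assumption."""
    return 1 - (1 - 1 / n_letters) ** fixations_per_point


# Hypothetical setting: 3 fixations in the window before each point.
simulated = random_fixation_baseline(n_points=20000, fixations_per_point=3)
print(f"simulated: {simulated:.3f}, analytic: {analytic_baseline(3):.3f}")
```

The debatable part is the model of "random." Fixations that favor nearby letters, or letters in the center of the board, would give a different baseline than the uniform assumption used here, and that is the kind of choice a substantive critique could examine.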

Now, what do I really think of this study? Of course this study should be replicated, and ideally in a different lab. That’s always the case; researchers may have made a mistake, equipment may have been flukey, who knows. Equally obviously, the data are consistent with agency, nothing more—it’s not a test of whether a therapeutic technique or a communication method works. It’s also not a test of whether there was absolutely no influence by the assistant holding the board. The authors note that there certainly was…she sometimes finished words or interrupted, sort of the way speakers do. Their claim was that the data can’t be explained by influence alone.

Back to my conversation with David Weston about learning styles. When we feel sure we know something, disconfirming data pose a problem. You have three choices. The first of the three is the worst, and it’s mostly what we saw here; you castigate the study as terrible and obviously stupid but don’t provide any substantive evidence regarding problems with the method, analysis, or interpretation. The second is to engage with the substance of the research and critique it. But that takes a lot of time and expertise. We saw some attempts at such criticism here, and we saw transparent failure to actually read the study, as well as lack of expertise. Which brings me to the third response. You have a feeling the conclusion is probably wrong because it conflicts with a whole lot of other theory and data. But you don’t know the particulars of what’s wrong with this experiment. So you ignore it.

Lest you think I'm suggesting that people just shut up, I'll tell you that I respond in this third way all the time. I see a study that I think doesn't square with a lot of other theory and data and I think "that's probably wrong." And I ignore it. If someone replicates it or if the study becomes a big deal, I'll get worked up, but not before.

It's not bad poker, folks. A hell of a lot of studies don't replicate, as we all know.

I'm guessing the critics of Jaswal's study would say they can't ignore this study because the stakes are so high. It's my perception that there's no little indignation in many of these tweets. As in every education debate I've seen, each side feels that they are motivated by what's best for students whereas the other side is motivated by greed and evil, filtered through stupidity and stubbornness.

Which brings me back to the start of this blog. I suggested that pleasure lies behind the righteous indignation people apply to the learning styles issue. Emotion was at play here. It wasn’t positive emotion in this case, but the outcome was the same. Nobody learned anything, and nobody was convinced.

If You're Going to Write About Science of Reading, Get Your Science Right

3/23/2020

There’s a new report out about the teaching of reading. It came to my attention when Diane Ravitch tweeted about it, with the tag “There is no Science of Reading.” It turns out to be a relatively brief policy statement from the National Education Policy Center, signed by the Education Deans for Justice and Equity.

I think the statement is pretty confused, as it conflates issues that ought to be considered separately. This statement is meant to be about the science of reading, so much of the confusion arises from a failure to understand or appreciate the nature of science, how basic science applies to applied science, and the scientific literature on reading.

FIRST. The distinction between basic and applied science ought to be fundamental to any discussion of the science of reading. Basic science in this context refers to the cognitive processes that enable reading. It is descriptive. Applied science in this context refers to helping people learn to read, given a particular goal for what it means to be a successful reader. Applied science is normative. It entails goals that are not determined by science.

This distinction is analogous to the difference between the basic sciences of physics and materials science and the applied science of architecture. The latter draws on the former to help an architect predict whether a building will stand, but the sciences don’t tell you how to design a building. They can’t, because designing a “successful” building requires knowing its function: how many people will the building hold? What will people do there? Do you care what the building looks like? What’s your budget?

It may at first appear that goals couldn’t vary that much among educators—we want kids to read. But goals actually run throughout education decisions. What do you do if a practice facilitates fluency, but prompts a modest decline in reading attitudes? Or consider whether every child should become a good reader of content intended for “the intelligent layperson.” Committing to that goal commits one to a broad content curriculum, and an attendant reduction in opportunities for students to pursue personal interests in depth.

The NEPC statement conflates basic and applied science. That matters because different methods ought to be used when there’s disagreement with either. If the disagreement concerns basic science, the scientific method is appropriate, but disagreements about application are often disagreements about goals, or the collateral effects of pursuing goals, and therefore ultimately about values. For example, the report suggests that policymakers should “acknowledge and support that the greatest avenue to reading for all students is access to books and reading in their homes, their schools, and their access to libraries (school and community).” What goal does “greatest avenue to reading” refer to? That children will read more? That children will improve fluency? Gain vocabulary? That children will be able to read more challenging texts? These are not the same outcomes, and in fact there is a robust research literature on the extent to which access to books serves any of these goals.

Or take this recommendation: “[Federal or state legislators] Should adopt a wide range of types of evidence of student learning.” To what end? To what purpose is the evidence of student learning to be put? To make book recommendations for leisure reading? As a high school graduation requirement? Elsewhere the report mentions student portfolios positively. One of my children's schools uses portfolios to help children take ownership and pride in their work and, although there’s no evidence it helps in this way, I kind of like that they do. On the other hand, there’s ample evidence that portfolios are very poor in terms of predictive validity for future school outcomes (e.g., grades and graduation).

Most of the recommendations in the report are like that. They end up being empty because they tell you what to do, but they don’t specify what outcome they hope the recommendation will achieve.

SECOND. The authors of the report either don’t understand how science works or are trusting that the reader doesn’t. They write “The truth is that there is no settled science of reading. The research on reading and teaching reading is abundant, but it is diverse and always in a state of change.” On the one hand, that’s science, folks. Knowledge from science is always understood to be contingent, and we use the best model we have (of reading or whatever else) as we’re working towards the next, better model. Even though an existing model may be known to be flawed, it may nevertheless be a close enough approximation that can be usefully applied.

But on the other hand, that’s not the conclusion the report invites. Rather, the suggestion is that we don’t know anything about reading from a scientific point of view with enough certainty that it will be useful in education. This claim doesn’t hold up to even passing familiarity with the literature. (Here and here are a couple of good undergraduate textbooks on the basic science of reading. Or what the hell, read my book.) When it comes to application of science, what we know is less certain, but we still know a great deal. (This volume will get you started.)

To indicate that there is controversy and no settled science, the report cites contrarian scholars like Stephen Krashen and Jeffrey Bowers. But again, if you have any familiarity with science, you’d know this is not a sign of unusual discord in the scientific community. There are always people challenging the mostly-accepted view. That’s part of the process. Very occasionally these outsiders are vindicated and an extremely different model becomes the norm. Occasionally their criticisms lead to a moderate adjustment of the accepted model. Most often these challenges prompt those taking the central view to provide better evidence and to be more rigorous in their thinking.

THIRD. The NEPC report commits the ivory tower blunder of recommending an ideal, and ignoring realities on the ground. For example, the report says

"This “balanced literacy” approach, which stresses the importance of phonics and of authentic reading – and which stresses the importance of teachers who are professionally prepared to teach reading using a full toolbox of instructional approaches and understandings – is now strongly supported in the scholarly community and is grounded in a large research base."

We might note that balanced literacy is said to be grounded in a large research base. But presumably not scientific research, because there’s no settled science. So what research? It goes unnamed.

To return to the main point, the fuel behind the current controversy is NOT that anyone thinks that balanced literacy, as described here, is a bad idea.

Rather, journalists have been reporting as true what has been a sneaking suspicion of edu-pundits for about a decade: (1) that one component of a balanced literacy classroom—phonics instruction—is being poorly taught and/or getting little time; (2) that some balanced literacy programs include methods of instruction (e.g., multi-cuing) that applied research indicates are counterproductive; and (3) that problems 1 and 2 are exacerbated because some teachers are poorly prepared during teacher education, because most commercial reading programs do little to facilitate solid instruction, and because administrators tasked with selecting reading programs sometimes know very little about reading instruction.

There’s also a very real question of whether balanced literacy, even absent these problems, could be taught by any but a handful of the most experienced educators. The original idea was that each teacher would have a big toolbox of literacy education tools, and that children would be individually evaluated for which instructional tool would be just right at which time. It’s a lovely vision—again, who would argue with that if all the tools are good?—but that individualized instruction represents an enormous challenge for any educator. We should probably be asking whether, even with top-notch teacher education and materials, that vision is realistic in classrooms and if not, what supports would make it realistic OR how we could modify it into something doable.

CLOSING. The NEPC didn’t say “science doesn’t matter.” That would sound like climate change denial. But note too they didn’t say “they’ve got the science wrong. HERE’S the way the science of reading really works.” Instead they said “hey, this is all pretty murky and complicated…no one really knows what’s right, it’s all controversial, but those folks are pretending that they’ve got the science of reading figured out.” The authors of this report try to render science irrelevant by claiming it’s premature to apply it. This argument is undercut by their repeated demonstrations that they misunderstand science, the application of science, and the extant literature on reading.

Ironically, most of their recommendations have nothing to do with science. The report objects to policies meant to raise test scores in the short term when doing so risks longer-term harms. It objects to policymakers ignoring the impact of out-of-school factors (e.g., poverty) on student achievement. It objects to policymakers ignoring the expertise of educators when establishing policy.

If these are problems, they are a result of wrong-headed (in the NEPC’s view) paths toward educational goals, or wrong-headed educational goals. They are not a direct result of reading science. Whoever wrote this report did not know enough science to see the difference.

The Current Controversy About Teaching Reading: Comments for Those Left With Questions After Reading the New York Times Article.

2/17/2020

Over the weekend the New York Times published an article on the front page about the teaching of reading. A friend posted it on Facebook saying "I won't know what to think about this until Dan comments on it." I thought some background for people like my friend might be useful.

How is the teaching of reading still controversial? Surely they’ve sorted it out by now.

The relationship between a teacher’s actions and a child’s success is murky.

Psychologists love to point out that “complex behavior is multiply determined.” Reading is complex, therefore many factors contribute to success or failure. Phonics instruction supports children learning to decode, but some kids figure out decoding with less support. The degree to which kids need more or less phonics instruction depends on their oral language skills (vocabulary, the complexity of syntax they can unravel), their knowledge of letters and print, and their ability to hear individual speech sounds, at the least. In addition, a teacher may be fully on board with phonics instruction, but either not be great at it (lack of knowledge or skill due to poor training) OR may be hobbled by the school or district having adopted a mediocre reading program.

And once you get past measuring decoding (i.e., you're measuring comprehension) things get still murkier because other factors contribute to comprehension.

So with all those factors, how much does all this really matter? If every teacher taught decoding via phonics instruction tomorrow, how much would reading improve?

It’s hard to say precisely, but you can predict the general pattern.

First, as I noted, some kids need less phonics instruction, so they get by with the bits and pieces they are getting now, although they’d learn to decode faster and more easily with more systematic instruction. The kids who will benefit most are those with weak oral language skills and those who have a hard time hearing individual speech sounds. There’s absolutely some percentage of kids floating into mid- and upper-elementary grades with really poor decoding skills who could be doing better.

Second, “decoding” is not synonymous with “reading.” It’s necessary but not sufficient. Once a child is a fairly fluent decoder, her comprehension is heavily influenced by her vocabulary, as well as the breadth and richness of background information in memory.

So it’s not that phonics instruction would make every child a great reader. It’s that without it, some kids won’t learn to read at all.

Isn’t phonics instruction boring for the kids who don’t need it?

There’s limited data on the matter, but a nationally representative sample from 1995 showed that reading attitudes weren’t affected by decoding instruction. Although phonics instruction may seem boring, it may be that (1) decoding itself is rewarding; (2) phonics is boring, but there are still read-alouds and other stuff that support positive reading attitudes; or (3) other types of instruction aren’t as interesting as we might have thought.

Perhaps most important, in most classrooms, teachers accept that there are some things children must learn or experience that aren’t fun, but are too important to skip. You make it as fun as you can, you make a show of enthusiasm, and hope the kids are swept along.

What happened that prompted the New York Times to put a story about this on the front page?

The article made it sound like new data from eye-tracking and brain imaging “now show” that phonics is crucial (and that exposure to appealing books isn’t enough). I don’t think that’s true. The behavioral data were plenty convincing twenty years ago, although our understanding of how the mind reads is, of course, always advancing. (Also, brain-imaging and eye tracking data aren’t that new.)

This issue (how much phonics instruction is really necessary?) has been visited and revisited since the 1920s. It quieted down in the early aughts with what was supposed to be a compromise position called “balanced literacy.” This position said “look, both sides are right. You need phonics, and you need great children’s literature and read-alouds.” This position is correct, of course, but people have been worried that phonics is getting short shrift, that teachers (and those who teach them) who don’t think phonics matters much just kept doing what they’d been doing, but now called it “balanced literacy.”

I’ve never met a US reading teacher who said “Kids don’t need any phonics instruction.” The concern is that teachers are underestimating how much kids need (edit: and, as Twitter user @MiriamFein pointed out to me, the quality of instruction--the issue is not necessarily more, but better). Exactly because reading is multiply determined, it’s easy to think of reasons the child might not seem to get it very quickly…and to think that maybe he’ll get it in a few months.

Meanwhile, the instructional supports teachers get often encourage this sort of thinking. A recent review of one of the most-used reading programs in early grades concluded that support for phonics instruction was weak. In 2015 I noted in one of my books that the K-2 literacy guide for New York City Schools listed 16 activities, only one of which was phonics instruction. Yet I don’t think I was concerned enough.

The impetus behind the new controversy has been the work of Emily Hanford, a reporter who has done a thorough job of describing what’s known about how children learn to read, and who took schools of education to task for not teaching future teachers the best way to teach kids to read. Who knows, maybe the time was just right, but certainly the depth of her reporting made the moment possible.

So schools of education are to blame?

There are thousands of teacher preparation programs in this country so it’s hard to generalize. But the weekly education newspaper, Education Week, did a survey of professors regarding how they prepare future teachers to teach reading, and yeah, the results indicated that a lot of teachers are not getting very good instruction in teaching reading.

The most common misalignment I hear is this: when people think about reading, they think about it the way an already-skilled reader does it. For example, they say that readers use meaning-based cues to help figure out a word. That’s true, and there are two ways it happens. One is an unconscious process that is only in place if you are a fluent decoder who understands the rest of the text to that point; this process only nudges you towards the right interpretation, it doesn’t magically make you read it. The second is a conscious process, puzzling out what an unfamiliar word means, and ample data show readers are willing to do a little of that work, but not much. It’s frustrating and effortful. So the idea that we should teach beginning readers to use meaning-based cues has a certain logic to it—it’s what really good readers do—but it’s not a good strategy for beginners.

So what happens next?

Ideally, current and future teachers will get better instruction in how people read (I actually wrote The Reading Mind as an auxiliary textbook for schools of education with this purpose in mind) and then too in how to teach reading. There’s much more to reading than phonics instruction, and we actually know much less about how to teach those other elements (fluency, for example, or how to raise reading motivation). Decoding is the most thoroughly researched aspect of reading, and it’s the one we know the most about teaching. We really ought to take advantage of that work.

On the Reality of Dyslexia

12/13/2019

There was a mild kerfuffle on Twitter about some comments made by literacy researcher Dick Allington; he made some not-very-academic remarks about people with whom he disagrees (see here). Much of Allington's work is serious, but this didn't strike me as serious, so I merely made a joke about it on Twitter, but a few people replied in earnest, among them this one

Allington is quoted in the article as saying he's "reasonably sure" dyslexia doesn't exist. It's a valuable question: what's the basis for considering dyslexia different than just "bad at reading?"

Presumably, we’re going to measure a child’s reading proficiency with a test, and a score below some cut-off will make us suspicious that the child is having a specific reading disability. Sounds simple enough, but there’s a problem. The figure shows two graphs depicting imaginary data from 10,000 first graders. In panel A, there’s a bell curve on the right, which we would call typical readers, and then to the left a smaller number of readers who are impaired. There’s no mistaking that this second group is different than the first, so choosing a cut point (a point at which you say “score below here, and you probably have dyslexia”) is straightforward; I’ve marked it with a star.

But scores on reading tests don’t show the pattern you see in panel A. They show one bell curve, as in panel B of the Figure. I can still create a cut-off score—below there, and I’m saying you have dyslexia—but the justification is less clear. We’re describing “dyslexia” as though it’s a quality, a feature of the child, something we can point to. So the child with dyslexia is supposed to be different in kind from the typical child, but you can see the problem. How confident am I that a child just below my cut-off score is really different than the child just above it?
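To make the panel B problem concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that reading scores follow a single bell curve with a mean of 100 and a standard deviation of 15, and that the cut-off sits at the 10th percentile; none of these numbers come from a real reading test.

```python
from statistics import NormalDist

# Assumed, illustrative scale: mean 100, SD 15. Not taken from any real reading test.
scores = NormalDist(mu=100, sigma=15)

# Place the cut-off at the 10th percentile (an arbitrary choice for illustration).
cutoff = scores.inv_cdf(0.10)

# Share of all children who fall below the cut-off and would receive the label.
below_cutoff = scores.cdf(cutoff)

# Share of all children whose score lies within 2 points of the cut-off, on either side.
near_cutoff = scores.cdf(cutoff + 2) - scores.cdf(cutoff - 2)

print(f"cut-off score: {cutoff:.1f}")
print(f"share below cut-off: {below_cutoff:.1%}")
print(f"share within 2 points of cut-off: {near_cutoff:.1%}")
```

Under these assumptions, a few percent of all children land within a couple of points of the line, so a sizable fraction of the labeled group scores almost the same as children who go unlabeled.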

So is dyslexia even real? Maybe it’s more accurate to say “these are the kids in the bottom 10% (or whatever) on this measure.” Why label them “dyslexic?”

Although any cut score will lead to some errors, it’s not the case that it’s a completely arbitrary line. There is something different about many of the kids in the left tail of the graph. That’s important because it makes us want to be more active. Saying “well, there’s always going to be someone who is the worst at reading” sounds like we’ve given up. But if you accept that the child is facing special challenges, it invites searching for a solution. So what makes us think kids with dyslexia are different than typical kids?

First, there is a genetic component. The incidence of dyslexia in the population is about 9%. But if a child has a parent with dyslexia, the chances the child will too are about 35 or 40% (Pennington et al., 1991). Of course, that high figure may be a consequence of the dyslexic parent offering a less supportive literacy environment than other parents do. Better evidence comes from studies comparing identical and fraternal twins. Twins of each type are likely to attend the same school and be exposed to similar home environments, but identical twins share 100% of their genes, and fraternal twins only 50%. If a child has dyslexia, her fraternal twin has a 38% chance of having it too. But an identical twin has a 68% chance of having it (DeFries & Alarcón, 1996). Of course, the fact that there is a genetic contribution doesn’t mean that one’s genes amount to an inevitable reading destiny. Children with dyslexia can become good readers; it will just take more work.

Here’s another reason to think that dyslexia is real: kids show subtle signs of a problem before reading instruction begins (e.g., Guttorm, Leppanen, Richardson, & Lyytinen, 2001; Lyytinen et al., 2004; Richardson, Kulju, Nieminen, & Torvelainen, 2009). These studies examine children from families where one of the parents has dyslexia. Researchers test how these children hear and use language when they are quite young, and then wait for them to get older. Then they note which children struggle with reading and which do not and look back at the data collected when they were younger. The two groups show differences very early in life. At birth, the parts of their brain that handle sounds show different responses to human speech. At age two and a half, children who would later have difficulty learning to read speak in shorter, less syntactically complex sentences, and their pronunciation is less accurate. At age three, they have a smaller vocabulary. At age five, they show deficits in phonological awareness, and they know the names of fewer letters.

A third point regarding the “reality” of dyslexia: we know that it’s not simply a delay, a byproduct of the fact that kids develop at different rates. There’s no doubt that some children learn to read faster than others, but kids identified with dyslexia don’t catch up without intervention. The kids who read poorly in early elementary school continue to struggle unless they get help (Scarborough & Parker, 2003; Shaywitz et al., 1995). Children with a disability can grow up to be good readers, but even they show residual problems like slower single-word recognition and problems with spelling (Bruck, 1990; Maughan et al., 2009).

So here’s the way I think about the “reality” of dyslexia. It’s not a disease in the sense that measles is a disease, where you either have it or you don’t. Rather, we start with the theoretical claim that reading ability is a product of the home environment, teaching at school, and some ability-to-learn that is within the child. Dyslexia is a problem in the child’s ability to learn (restricted to reading and closely related to other language tasks). The three findings I’ve just listed are consistent with the idea that some children do have a specific ability-to-learn problem, but that doesn’t mean the problem is simply present or absent. The severity of the problem runs on a continuum, so in that way it’s more like high blood pressure or obesity. And like those problems, the fact that there’s not an obvious cut-point at which you can say “you have the disease, but you don’t” doesn’t mean that we shouldn’t take it seriously.

References

Bruck, M. (1990). Word-Recognition Skills of Adults With Childhood Diagnoses of Dyslexia. Developmental Psychology, 26(3), 439–454.

DeFries, J. C., & Alarcón, M. (1996). Genetics of specific reading disability. Mental Retardation and Developmental Disabilities Research Reviews, 2(1), 39–47.

Guttorm, T. K., Leppanen, P. H. T., Richardson, U., & Lyytinen, H. (2001). Event-Related Potentials and Consonant Differentiation in Newborns with Familial Risk for Dyslexia. Journal of Learning Disabilities, 34(6), 534–544.

Lyytinen, H., Aro, M., Eklund, K., Erskine, J., Guttorm, T., Laakso, M.-L., … Torppa, M. (2004). The development of children at familial risk for dyslexia: birth to early school age. Annals of Dyslexia, 54(2), 184–220.

Maughan, B., Messer, J., Collishaw, S., Pickles, A., Snowling, M., Yule, W., & Rutter, M. (2009). Persistence of literacy problems: spelling in adolescence and at mid-life. Journal of Child Psychology and Psychiatry, 50(8), 893–901.

Pennington, B. F., Gilger, J. W., Pauls, D., Smith, S. A., Smith, S. D., & DeFries, J. C. (1991). Evidence for major gene transmission of developmental dyslexia. JAMA, 266(11), 1527–1534.

Richardson, U., Kulju, P., Nieminen, L., & Torvelainen, P. (2009). Early signs of dyslexia from the speech and language processing of children. International Journal of Speech-Language Pathology, 11(5), 366–380.

Scarborough, H. S., & Parker, J. D. (2003). Matthew effects in children with learning disabilities: Development of reading, IQ, and psychosocial problems from grade 2 to grade 8. Annals of Dyslexia, 53(1), 47–71.

Shaywitz, B. A., Holford, T. R., Holahan, J. M., Fletcher, J. M., Stuebing, K. K., Francis, D. J., & Shaywitz, S. E. (1995). A Matthew Effect for IQ but not for reading: Results from a longitudinal study. Reading Research Quarterly, 30(4), 894–906.

Who to Believe on Twitter

10/13/2019

A recent tweet caught my attention. It was posted by Sherry Sanden, a professor at Illinois State, in response to a thread from APM reporter Emily Hanford, well known to educators for her reporting in the last 18 months on the best way to teach reading and the state of reading instruction in the US. Hanford was responding (I think) to an abstract of a talk Sanden and colleague Deborah MacPhee were to present at ILA, which Hanford thought was inaccurate and possibly a response to her reporting. Hanford posted a series of 14 tweets supporting various aspects of her claims about reading, many with links to the scientific research she cited.

I know this is convoluted and honestly I don't think it matters much, but I'm trying to provide some context. Here's what I really wanted to get at.
This is one of Sanden’s tweets in reply to Hanford.

I won’t take the time here to defend Hanford’s reporting—I’ve recommended her reports in the past and think they are solid, and she did a fine job of defending herself.

I’d like to comment on the obvious implication that Sanden should be taken more seriously because of her job; she’s a professor of education at Illinois State. She has credentials: she has a PhD in the relevant discipline, she publishes research on the topic, presents at professional conferences, and so on.

This is called argument from authority. In this instance, here’s the form Sanden is hoping it will take.

Proposition 1: Sanden has research-based reasons for believing that X is true about early literacy.
Proposition 2: Random tweeters don’t understand research very well, but have good reason to believe Proposition 1 is true (because of Sanden’s credentials).
Conclusion: Random tweeters believe that conclusion X is supported by research.

I considered argument from authority at more length in When Can You Trust the Experts, but here’s a short version.

Believing something because someone else believes it rather than demanding and evaluating evidence makes you sound either lazy or gullible. But we yield to the authority of others all the time. When I see my doctor I don’t ask for evidence that the treatments he prescribes are effective, and when an architect designed a new deck for my house I didn’t ask for proof that it could support the weight of my grill and outdoor furniture. I believed what they told me because of their authority.

I think education researchers don’t speak with that kind of authority and (apparently unlike Sanden) I don’t think we deserve it. I can point to two key differences between a doctor (or architect, or accountant, or electrician, etc) and education researchers.

First, I yield authority to someone who has been vetted by a credible entity. I know that, unless you break the law, you cannot practice medicine (or follow the other professions named) without being licensed by the state of Virginia. I haven’t looked into the matter, but I have no reason to think that the accrediting agencies aren’t doing an acceptable job. For one thing, most of the professionals I hire achieve what I expect them to achieve.
Education researchers, in contrast, are not licensed by a credible authority.

Anyone can take the title “education researcher.” That’s why we must point to earmarks of authority like academic degrees, training, and publications; these make the silent claim “people with expertise think I’m an expert too,” which is, of course, a bit circular. Sometimes researchers mention television, radio and public speaking appearances. That’s called “social proof,” boiling down to “other people think I’m worth listening to.”

The problem is that these earmarks of authority are not very reliable. The marketplace is cluttered with purveyors of snake oil who bear degrees, and even some who have published articles in “peer-reviewed” journals. As readers of this blog know, the idea that “peer review” is a guarantee of high quality in a journal does not bear close scrutiny.

But there’s a second, more important difference between education research and professions where people readily accept argument from authority. Those other fields have more accepted truths.

When I get an electrician to figure out why the breaker in my living room keeps flipping, I understand she may be more or less skillful in diagnosis and repair than another licensed electrician. What I don’t expect is that she could have wildly different—perhaps completely opposing—ideas about how electricity works and how to wire a house compared to someone else I might have called.

Education researchers do not speak with one voice, and that makes it hard to expect an argument from authority to work, because such arguments end up taking this form:

Proposition 1: Sanden says that when it comes to early reading, scientific research suggests “X” is true.
Proposition 2: Willingham says that when it comes to early reading, scientific research suggests “not X” is true.
Proposition 3: Random tweeter has equally good reason to believe (based on their credentials) that Proposition 1 and Proposition 2 are true.
Conclusion: ??????

We can’t make arguments from authority if equally authoritative people disagree.

Part of the problem is that people who enter these arguments actually come at the problems with different assumptions and understandings about what constitutes evidence, and indeed, what it means to know something. That’s most obvious when we have had very different training. Cognitive psychologists and researchers in critical theory address aspects of education that are largely non-overlapping, and you’ll find some of each of these folks in most schools of education, with similar credentials.

I’ve argued elsewhere that those of us in education research would do ourselves a favor if we would make our assumptions more explicit, as well as the limitations of the tools in our analytic toolbox—what problems are our methods well suited to answer and what can’t we answer? I think you don’t hear that often enough.

A final note. Later in the thread Sanden posted this

And then this…

Understood, but that’s social media. It may be frustrating and seem ludicrous that teachers use it to inform themselves, but here we are. If you want people to believe you, it's incumbent on you to explain your reasoning.

Politics' Uneasy Bedfellow

4/3/2019

The Journal of Learning Sciences has posted a Call For Papers for a special issue, “Learning In and For Collective Social Action.” It’s overtly political, and it takes a particular political stance: the first paragraph mentions furthering “progressive social movements.”

I think this special issue broadcasts the wrong message about the Journal, and will foster misunderstanding about the relationship of science and politics.

Let me start with the relationship between basic and applied sciences. Basic science aims to be value-free. It seeks to describe the world as it is. I am not claiming that science generally achieves that aim. Science is a human enterprise and humans are biased, and it’s well-established that biases creep into science, in terms of the agenda set, the interpretation of theory and data, funding, etc. My point is that when such bias is exposed it is considered a criticism. A person claiming to do basic science aims to do it in a value-free way and so must either seek to remedy the bias or give up on the claim of behaving scientifically.

In contrast, applied sciences do not aim to describe the world, but to change it, leveraging findings from basic science. Because they seek to change the world, values are part-and-parcel of what they do. Saying “yours is a political enterprise” is not a criticism of an applied science: there must be a goal, and goals are selected based on one’s values. (Naturally, one can behave in a biased manner when conducting applied research, and then deny those biases. That’s a different matter.)

The “Aims and Scope” statement of the Journal of Learning Sciences makes clear that it publishes both basic and applied research, describing itself as a “forum for research on education and learning.” In an important sense, reading intervention studies have an implicit goal: that children should read. A study that seeks to close the gender gap in higher education STEM course-taking has an implicit political subtext: men and women should be equally represented in these disciplines. These studies are in that sense political.

But it matters that these are political positions about which everyone generally agrees. As a journal editor (or thoughtful reader) you don’t need to think about viewpoint diversity when it comes to those goals. Everyone thinks children should learn to read. But once you’re including topics about which reasonable people do disagree, you’re in different territory.

I have three problems with a scientific journal plunging into political issues as the Journal of Learning Sciences has done here.

First, applying science to politics is a fraught business. Science is powerful. It is perceived by most of the public as epistemically special—that it’s a better way of understanding the world than others. It’s a problem when a group cloaks its argument in the special status of science to further an essentially political point of view. The fact that we know scientists, like everyone else, are subject to unconscious bias in their work ought to make us worried about that possibility. Those who undertake to apply science to politically controversial issues ought to show self-awareness that they have embarked on a different sort of project, and they ought to take steps to ensure that they are thoughtful about the special problems this work poses.

The very fact that this special issue is being published by a journal that does not routinely handle papers on these topics indicates the editors think there is not anything different about them. They are sending the message “sure, politics is in the purview of our journal. This is what we do. Reading, politics…it’s all the same.”

Second, the Journal of Learning Sciences did not issue an even-handed, open-minded call for applications of the learning sciences to political problems. The call refers specifically to furthering progressive social movements, and it includes a list of issues that those on the political left consider most important, with no mention of issues that those on the right find most important: Islamophobia yes, but not bias against evangelical Christians. Settler colonialism, but not the rights of the unborn. Rather than identifying controversial issues and seeking viewpoint diversity, the Journal is signaling quite clearly who is welcome and who is not. This is a mistake. Science is about open debate, not exclusion.

My third issue with the call for papers grows out of the second. This call is not only bad science, it’s bad publicity. The academy is already under suspicion for having a left-leaning political agenda and foisting leftist groupthink on students. That suspicion grows partly from the fact that professors, as a group, lean left. This doesn’t mean that progressives should abandon important work to protect the tender feelings of conservatives. There are journals devoted to education that declare a particular view of politics in their mission statements and obviously there should be such journals. This sort of call for papers belongs in such a journal. It does not belong in the Journal of Learning Sciences or any other that purports to be devoted to science.

Note: This blog began as a Tweet, but I should have known better than to use that forum. I obviously was not clear, as people quickly wanted to let me know that scientists are biased, though I thought I had acknowledged that. More peculiar was the suggestion by several Tweeters that because scientists cannot be neutral, they may as well own their biases and stop pretending. It’s peculiar because it suggests a change to a cornerstone feature of a method that has been very successful for the last few centuries, and because the logic seems to be “if you can’t *completely* remove something undesirable, you may as well add more.”

Objections to Jo Boaler's Take on Neuroscience and Math Education

3/13/2019

Guest post with Daniel Ansari, Professor and Canada Research Chair in Developmental Cognitive Neuroscience in the Department of Psychology and the Brain & Mind Institute at the University of Western Ontario in London, Ontario, where he heads the Numerical Cognition Laboratory.

On February 28th Stanford Professor Jo Boaler and one of her students, Tanya Lamar, published an article that we think is a fine example of how not to draw educational conclusions from neuroscientific data. While we’re more interested in applauding great work than pointing out problems, we feel we can’t ignore an article in a high-profile venue like Time Magazine.
The backbone of their piece includes three points:

Science has a new understanding of brain plasticity (the ability of the brain to change in response to experience), and this new understanding shows that the current teaching methods for struggling students are bad. These methods include identifying learning disabilities, providing accommodations, and working to students’ strengths.
These new findings imply that “learning disabilities are no longer a barrier to mathematical achievement” because we now understand that the brain can be changed, if we intervene in the right way.
The authors have evidence that students who thought they were “not math people” can be high math achievers, given the right environment.

There are a number of problems in this piece.

First, we know of no evidence that conceptions of brain plasticity or (in prior decades) lack of plasticity, had much (if any) influence on educators’ thinking about how to help struggling students. More to the point, conceptions of cellular processes should not influence specific educational plans or general educational outlook. The notion of the brain lacking plasticity obviously was not taken at face value by educators, nor should it have been—an unchangeable brain would be a brain incapable of learning. (For more on the difficulty of drawing educational implications from neuroscientific findings, see here and here)

Second, Boaler and Lamar mischaracterize “traditional” approaches to specific learning disability. Yes, most educators advocate for appropriate accommodations, but that does not mean educators don’t try intensive and inventive methods of practice for skills that students find difficult. Standard practice for students with a specific reading disability, for example, includes intensive practice in decoding and yes, educators have thought of the idea of trying methods other than the ones that a student seems not to learn from—methods that the authors, at the end of the article, mention were suggested for her daughter with dyslexia and auditory processing difficulties.

Third, Boaler and Lamar advocate for diversity of practice for typically developing students that we think would be unremarkable to most math educators: “making conjectures, problem-solving, communicating, reasoning, drawing, modeling, making connections, and using multiple representations.” More surprising is their charge that “There are many problems with the procedural approach to mathematics that emphasizes memorization of methods, instead of deep understanding.” We agree with the National Mathematics Advisory Panel report that students should learn (and memorize) math facts and algorithms. We also agree with the Panel (and with Boaler and Lamar) that American students struggle with conceptual understanding. Deep understanding is always more difficult than memorization, and it’s the aspect of mathematics that most kids struggle with, but that doesn’t mean that most math educators don’t care if their students understand math. In our view there is no need to reinvigorate the math wars, since an overwhelming body of scientific evidence has demonstrated that students need both procedural fluency and conceptual understanding. One cannot develop one without the other. In our view it is best to lay this false dichotomy to rest and avoid emotive and value-laden arguments, such as the claim that students who are strong in conceptual understanding of math are more creative.

Fourth, we think it’s inaccurate to suggest that “A number of different studies have shown that when students are given the freedom to think in ways that make sense to them, learning disabilities are no longer a barrier to mathematical achievement. Yet many teachers have not been trained to teach in this way.” We have no desire to argue for student limitations and absolutely agree with Boaler and Lamar’s call for educators to applaud student achievement, to set high expectations, and to express (realistic) confidence that students can reach them. But it’s inaccurate to suggest that with the “right teaching” learning disabilities in math would greatly diminish or even vanish. For some students difficulties persist despite excellent education. We don’t know which article Boaler & Lamar meant to link to in support of this point; the one linked to concerns different methods of research for typical students vs students identified with a disability.

Do some students struggle with math because of bad teaching? We’re sure some do, and we have no idea how frequently this occurs. To suggest, however, that it’s the principal reason students struggle ignores a vast literature on learning disability in mathematics. This formulation sets up teachers to shoulder the blame for “bad teaching” when students struggle.

As to the final point—that Boaler & Lamar have evidence from a mathematics camp showing that, given the right instruction, students who find math difficult can gain 2.7 years of achievement in the course of a summer—we’re excited! We look forward to seeing the peer-reviewed report detailing how it worked.

In sum, we think that findings from studies of brain plasticity do not support the implications that Boaler and Lamar suggest they do. Further, we think they have mischaracterized both the typical goals of math instructors, and the typical profile of a student with math disability.
