Daniel Willingham--Science & Education
Hypothesis non fingo

Evaluating readability measures

4/9/2014

 
This piece first appeared on RealClearEducation.com on March 26.

How do you know whether a book is at the right level of difficulty for a particular child? Or, when thinking about learning standards for a state or district, how do we make a judgment about the text difficulty that, say, a sixth-grader ought to be able to handle?

It would seem obvious that an experienced teacher would use her judgment to make such decisions. But naturally such judgments will vary from individual to individual. Hence the apparent need for something more objective. Readability formulas are intended as just such a solution. You plug some characteristics of a text into a formula and it combines them into a number, a point on a reading difficulty scale. Sounds like an easy way to set grade-level standards and to pick appropriate texts for kids.
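To make "plug some characteristics into a formula" concrete, here is a minimal sketch of one widely used formula, the Flesch-Kincaid grade level. It is shown purely as an illustration of the general idea, not as the formula any particular study relied on. The function name and the example counts are hypothetical, and the word, sentence, and syllable counts are assumed to have been tallied beforehand.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return the Flesch-Kincaid grade level from three surface counts of a text."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("text must contain at least one word and one sentence")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    # Longer sentences and longer words both push the estimated grade level up.
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 5, 150)
print(round(grade, 2))
```

Note that nothing in the calculation looks at meaning: two passages with the same counts get the same score, whatever they say.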

Of course, we’d like to know that the numbers generated are meaningful, that they really reflect “difficulty.”

Educators are often uneasy with readability formulas; the text characteristics are things like “words per sentence,” and “word frequency” (i.e., how many rare words are in the text). These seem far removed from the comprehension processes that would actually make a text more appropriate for third grade rather than fourth.

To put it another way, there’s more to reading than simple properties of words and sentences. There’s building meaning across sentences, and connecting meaning of whole paragraphs into arguments, and into themes. Readability formulas represent a gamble. The gamble is that the word- and sentence-level metrics will be highly correlated with the other, more important characteristics.

It’s not a crazy gamble, but a new study (Begeny & Greene, 2014) offers discouraging data to those who have been banking on it.

The authors evaluated 9 metrics, summarized in this table:

[Table: the nine readability metrics evaluated (image not reproduced)]

The dependent measure was student oral reading fluency, which boils down to number of words correctly read per minute. Oral fluency is sometimes used as a convenient proxy for overall reading skill. Although it obviously depends heavily on decoding fluency, there is also a contribution from higher-level meaning processing; if you are understanding what you’re reading, that primes expectations as you read, which makes reading more fluent.
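As a rough sketch of the measure just described, words correct per minute can be computed from a timed reading like this. The function and argument names are hypothetical, and the numbers in the example are invented for illustration.

```python
def words_correct_per_minute(words_attempted: int, errors: int, seconds: float) -> float:
    """Scale the number of correctly read words to a one-minute rate."""
    if seconds <= 0:
        raise ValueError("reading time must be positive")
    if errors < 0 or errors > words_attempted:
        raise ValueError("errors must be between 0 and words_attempted")
    words_correct = words_attempted - errors
    return words_correct * 60.0 / seconds


# Hypothetical reading: 112 words attempted in 60 seconds, with 7 errors.
print(words_correct_per_minute(112, 7, 60))
```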

In this experiment, second, third, fourth, and fifth graders each read six passages taken from the DIBELS test: two passages each from below, at, and above their grade level.

Previous research has shown that the various readability formulas actually disagree about grade levels (e.g., Ardoin et al., 2005). In this experiment, oral reading fluency was to referee the disagreement. Suppose that according to PSK, passage A is appropriate for second graders and passage B is appropriate for third graders. Meanwhile Spache says both are third-grade passages. If oral reading fluency is better for passage A than passage B, that supports the PSK. (“Better” was not evaluated in absolute terms alone; the comparison also took the standard error of the mean into account.)
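Here is one way the refereeing could be sketched. The post does not spell out the exact criterion the authors used, so the decision rule below (passage A counts as read faster only if its mean minus one standard error still exceeds passage B's mean plus one standard error) is an assumption made for illustration, as are the function names and the fluency scores.

```python
import math
import statistics


def mean_and_sem(scores):
    """Return the mean and the standard error of the mean for a list of scores."""
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem


def reliably_faster(scores_a, scores_b):
    """Assumed rule: A beats B only if the gap survives one SEM on each side."""
    mean_a, sem_a = mean_and_sem(scores_a)
    mean_b, sem_b = mean_and_sem(scores_b)
    return (mean_a - sem_a) > (mean_b + sem_b)


# Hypothetical words-correct-per-minute scores for two passages.
passage_a = [98, 105, 110, 102, 95]
passage_b = [88, 92, 97, 85, 90]
print(reliably_faster(passage_a, passage_b))
```

Under this rule, a formula that placed A a grade below B would be supported by these invented data, while a formula that gave both passages the same grade level would not.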

The researchers used an analytic scheme to evaluate how well each metric predicted the patterns of student oral reading fluency. Each prediction was treated as binary: the grade-level assignment predicted that there should (or should not) be a difference in oral reading fluency, and the question was whether a difference was observed. Chance, therefore, would be 50%. The data are summarized in this table:
[Table: prediction accuracy of each readability formula (image not reproduced)]

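The scoring idea (binary predictions compared with what was observed, with 50% as the chance baseline) can be sketched as follows. The function name and the example predictions are hypothetical; this is not the authors' actual analysis.

```python
CHANCE_LEVEL = 0.5  # a binary prediction is right half the time by guessing


def prediction_accuracy(predicted, observed):
    """Proportion of passage comparisons where prediction and observation agree."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(predicted)


# Hypothetical comparisons: True means "a fluency difference is expected/observed".
predicted = [True, True, False, True, False, True]
observed = [True, False, False, True, True, True]
accuracy = prediction_accuracy(predicted, observed)
print(accuracy > CHANCE_LEVEL)
```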
All of the readability formulas were more accurate for higher ability than lower ability students. But only one—the Dale-Chall—was consistently above chance.

So (excepting the Dale-Chall), this study offers no evidence that standard readability formulas provide reliable information for teachers as they select appropriate texts for their students. As always, one study is not definitive, least of all for a broad and complex issue. This work ought to be replicated with other students, and with outcome measures other than fluency. Still, it contributes to what is, overall, a discouraging picture.

References

Ardoin, S. P., Suldo, S. M., Witt, J., Aldrich, S., & McDonald, E. (2005). Accuracy of readability estimates’ predictions of CBM performance. School Psychology Quarterly, 20, 1-22.

Begeny, J. C., & Greene, D. J. (2014). Can readability formulas be used to successfully gauge difficulty of reading materials? Psychology in the Schools, 51(2), 198-215.

Tom Berend
4/9/2014 12:21:28 pm

Readability and decoding only make sense up to grade 4 books. Grade 5 books require a huge jump in comprehension skills that is not captured in readability.

The earlier books tend to be self-contained, directly-told, and plot-driven. Grade-5 books start to use imagery, setting, themes, and writing style to tell the story. Characters become more complex, their motivations more ambiguous. World knowledge starts to become more important.

We ran into that in an intensive reading intervention for grade-8 impaired readers. For the first six weeks we focused on the basics of word recognition: blending and segmenting, phonics, and morphology. We practiced finger-point reading and fluency drills. We raced them through grade 2, 3, and 4 books. They developed confidence and began to enjoy reading. Then we hit the limits of word recognition.

For a grade-5 text, we selected "The City of Ember" by Jeanne DuPrau. A proficient grade-5 reader immediately realizes that the city is a post-apocalypse underground survival bunker. But that is never stated directly - the book is written from the point of view of two 12-year-olds who have never imagined a larger world. Their strange little city is lavishly described from their point of view; the pleasure of the book is watching them discover Ember's true nature.

Skilled readers build a mental model of a story and bring their world knowledge, personal experiences, and knowledge of literary conventions to it. They challenge the author. They build a tentative hypothesis of the text's meaning, revising as new information arrives. They monitor their understanding and their own attention, re-reading where necessary. Afterwards, they reflect on what they have read.

Here's what poor comprehension looks like. Our students did not grasp that Ember was underground until they were half-way through the book - in spite of the barrage of comprehension questions that we posed to them after each chapter. They accurately decoded the words and captured the surface gloss - they knew from the first pages that Ember's sky was always dark, that Ember was running out of light bulbs, and that the flood-lamps frequently went out. They knew that the timekeeper changes the date sign and tells the light director when to turn the lights on and off, but that strange behavior did not fire any corresponding light bulbs in their heads.

Happily, there is guidance from research on how to teach reading comprehension. Palincsar & Brown (1984) is a good starting point.

Ben Rogers
4/9/2014 08:46:13 pm

My school has recently begun the Accelerated Reader programme, which uses its own reading levels. Tom has just written the response that I was planning to write (only more eloquently).

The only extra comment I would make is that reading non-fiction has as an additional variable the learner's existing subject knowledge (very Hirsch). I am sure this inhibits secondary school (high school) teachers from using texts effectively. If there is any research on this, I would be very pleased to read it.

Gavin Haque
4/10/2014 03:07:53 am

So, are teachers left to use their own judgment as paragraph two indicates? Ben and Tom, I'm starting to read more about Lexile Level scores. I find your comments to be incredibly insightful.

Tom Berend
4/10/2014 03:20:42 am

Prior subject knowledge is a necessary foundation, but a poor comprehender will struggle with learning even at his level of subject mastery. The most attractive solution is to teach comprehension skills integrated with subject matter.

Our intervention was limited to 'literacy' and we selected biography and history for our non-fiction, where the focus was on style and the subject matter was secondary. In retrospect I wish we had attempted comprehension skills training using a science textbook. The challenge, of course, is to set expectations properly: a reading intervention is not expected to teach physics, and is not allocated the necessary time.

Science teachers have the opportunity to embed comprehension into their course, but it is a time-consuming project. If someone wants to try it, a sensible-sounding set of 14 guidelines is suggested in:

Brown, R. (2002). Straddling two worlds: Self-directed comprehension instruction for middle schoolers. In C. C. Block & M. Pressley (Eds.), Comprehension Instruction (pp. 337-350). New York, NY: Guilford Press.

Nicole
4/14/2014 12:34:01 pm

Can you offer any practical solutions to this problem that can be used right now? I am asking because I frequently volunteer to read with children in my daughter's grade one classroom. The teacher has bins full of books labelled A-M, but some don't have labels and some are obviously labelled at the wrong level. I would like to help her out by reorganizing and relabelling all the books she has, so that when a child chooses a book, they are getting one they can handle. We are talking about roughly 500 books I would say. What is best practice in levelling these books? How do elementary teachers normally handle this?

Tom
4/14/2014 02:33:26 pm

Nicole: Consider using http://www.lexile.com. They have a very large database of books measured against a single readability algorithm. As this blog post points out, that metric may not be reliable.

