Daniel Willingham--Science & Education
Hypothesis non fingo

Why Americans Stink at Math

9/25/2014

 
This column originally appeared at RealClearEducation.com on July 29, 2014

Over the weekend the New York Times Magazine ran an article titled “Why Do Americans Stink at Math?” by Elizabeth Green. The article is as much an explanation of why it’s so hard not to stink as an explication of our problems. But in warning about the rough road of math improvement, I think the author may not have gone far enough.

The nub of her argument is this. Americans stink at math because the methods used to teach it are rote, don’t lead to transfer to the real world, and lead to shallow understanding. There are pedagogical methods that lead to much deeper understanding. U.S. researchers pioneered these methods, and Japanese student achievement took off when the Japanese educational system adopted them.

Green points to a particular pedagogical method as being vital to deeper understanding. Traditional classrooms are characterized by the phrase “I, we, you.” The teacher models a new mathematical procedure, the whole class practices it, and then individual students try it on their own. That’s the method that leads to rote, shallow knowledge. More desirable is “You, Y’all, We.” The teacher presents a problem which students try to solve on their own. Then they meet in small groups to compare and discuss the solutions they’ve devised. Finally, the groups share their ideas as a whole class.

Why don’t US teachers use this method? In the US, initiatives to promote such methods are adopted every thirty years or so--New Math in the ’60s, the National Council of Teachers of Mathematics Standards in the late ’80s--but they never gain traction. (Green treats the Common Core as another effort to bring a different pedagogy to classrooms. It may be interpreted that way by some, but it’s a set of standards, not a pedagogical method or curriculum.)

Green says there are two main problems: lack of support for teachers, and the fact that teachers must understand math better to use these methods. I think both reasons are right, but there’s more to it than that.

For a teacher who has not used the “You, Y’all, We” method, it’s bound to be a radical departure from her experience. A few days of professional development is not remotely enough training, but that’s typical of what American school systems provide. As Green notes, Japanese teachers have significant time built into their week to observe one another teach, and to confer.



Green’s also right when she points out that teaching mathematics in a way that leads to deep understanding in children requires that teachers themselves understand math deeply. As products of the American system, most don’t.

Green’s take is that if you hand down a mandate from on high--“teach this way”--with little training, and hand it to people with a shaky grasp of the foundations of math, the result is predictable: you get the fuzzy crap in classrooms that’s probably worse than the mindless memorization that characterizes the worst of the “I, we, you” method.

But I think there are other factors that make improving math even tougher than Green says.

First, the “You, Y’all, We” method is much harder, and not just because you need to understand math more deeply. It’s more difficult because you must make more decisions during class, in the moment. When a group comes up with a solution that is on the wrong track, what do you do? Do you try to get the class to see where it went wrong right away, or do you let them continue and play out the consequences of their solution? Once you’ve decided that, what exactly will you say to nudge them in that direction?

As a college instructor I’ve always thought that it’s a hell of a lot easier to lecture than to lead a discussion. I can only imagine that leading a classroom of younger students is that much harder.

There are also significant cultural obstacles to American adoption of this method. Green notes that Japanese teachers engage in “lesson study” together, in which one teacher presents a lesson and the others discuss it in detail. This is a key solution to the problem I mentioned: teachers discuss how students commonly react during a particular lesson, and discuss the best way to respond. That way, they are not deciding in the moment; they already know what to do.

The assumption is that teachers are finding, if not the one best way to get an idea across, then a damn good one. As Green notes, that often gets down to details such as which two-digit numbers to use for a particular example. An expectation goes with this method: that everyone will change their classroom practice according to the outcome of lesson study. This is a significant hit to teacher autonomy, and not one that American teachers are used to. It’s also noteworthy that there is no concept here of honoring or even considering differences among students. It’s assumed they will all do the same work at the same time.

The big picture Green offers is, I think, accurate (even if I might quibble with some details). Most students do not understand math well enough, and the Japanese have offered an example of one way to get there. As much as Green warns of the challenges in Americans broadly emulating this method, I think she may underestimate how hard it would be. It may be more productive to try to find some other technique to give students the math competence we aspire to.

Can traditional public schools replicate charters?

9/17/2014

 
This piece was originally published at realcleareducation.com on July 24, 2014

Although the politics concerning charter schools remain contentious, most education observers agree that some charters have had real success in helping children from impoverished homes learn more. If you believe that’s true, a natural next step is to ask what those charters are doing and whether it could be replicated in other schools. A recent study tried to do that, and the results looked disappointing. But I think the authors passed over a telling result in the data.

The researcher is Roland Fryer, and the first study was published in 2011 with Will Dobbie. They analyzed successful charter schools on a number of dimensions, and concluded that some factors one might expect to be associated with student success were not: class size, per-pupil expenditures, and teacher qualifications, for example. They identified five factors that did seem to matter: frequent feedback to teachers, the use of data to drive instruction, high-dosage tutoring of students, increased instructional time, and high expectations.

Fryer (2014) sought to inject those five factors into some public schools in high-needs districts, starting with twenty schools in Texas. They increased the number of occasions for teacher feedback from 3 times each year to 30. Staff learned instructional techniques developed by Doug Lemov and Robert Marzano. They had parents sign contracts and students wear uniforms, along with other marks of a high-expectations school culture. Outcome measures of interest were school averages on state-wide tests.

So what happened? In math, it helped a little. The effect size was around 0.15. In English Language Arts, there was no effect at all.
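
For a sense of scale (assuming the 0.15 is a standardized mean difference in test scores--that is, something like Cohen’s d, the usual metric in studies of this kind):

d = (treated mean − comparison mean) / standard deviation ≈ 0.15

Under a normal distribution, a shift of 0.15 standard deviations moves the average member of the treated group from the 50th to roughly the 56th percentile of the comparison distribution--real, but modest.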

Fryer tried the same thing in Denver (7 schools) and got identical results. In Chicago (29 schools) there was no effect in either math or reading.

Two questions arise. Why is the effect so small? And why the difference between math and reading? 

Fryer does not really take on the first question, I guess because there is an effect on math achievement. In the conclusion he claims “These results provide evidence suggesting that charter school best practices can be used systematically in previously low-performing traditional public schools to significantly increase student achievement in ways similar to the most achievement-increasing charter schools.” Whether or not the cost (about $1,800 per student) was worth the benefit is a judgment call, of course, but the benefit strikes me as modest.

Fryer does address the different impact of the intervention for reading and math. He speculates that it might be harder to move reading scores because many low-income kids hear and speak non-standard English at home. There’s some grounded speculation that hearing different dialects of English at home and at school may impact learning to read--see Seidenberg, 2013. I doubt non-standard English is decisive in fourth grade and up, and those were the students tested in this study.

My guess is that another factor is relevant to both the size of the math effect and the lack of effect in reading. Much of Fryer’s intervention is directed towards a seriousness about content. But actually getting students working seriously on content was the factor that Fryer was least able to address. The paper says “In an ideal world, we would have lengthened the school day by two hours and used the additional time to provide tutoring in math and reading in every grade level.” But due to budget constraints they could tutor in only one grade and one subject per school. They chose 4th, 6th, and 9th grades, and they chose math. Non-tutored grades got a double dose of whatever subject students were most behind in, and teachers tried to keep the double dose from cutting into other academic time. Thus, it may be that the researchers saw puny effects because they had to skimp on the most important factor: sustained engagement with challenging academic content.

This explanation is also relevant to the math/reading difference. In math, if you put a little extra time in, it’s at least obvious where that time should go. If kids are behind in mathematics, it’s not difficult to know what they need to work on.

Once kids reach upper elementary school, reading comprehension is driven primarily by background knowledge; knowing a bit about the topic of the text you’re reading confers a big advantage to comprehension. Kids from impoverished homes suffer primarily from a knowledge deficit (Hirsch, 2007).

So a bit of extra time, while better than nothing, is just a start at an attempt to build the knowledge needed for these students to make significant strides in reading comprehension. And in this particular intervention, no attempt was made to assess what knowledge was needed and to build it systematically.

This problem is not unique to Fryer’s intervention. As he notes, it’s always tougher to move the needle on reading than on math. That’s because experiences outside of the classroom make such an enormous contribution to reading ability.

Thus, I find Fryer’s study perhaps more interesting than Fryer does. On the face of it, his intervention was a modest success: no improvement in reading, but at least a small bump in math. To me, this study was another in a long series showing the primacy of curriculum in achievement.

References

Dobbie, W., & Fryer Jr, R. G. (2011). Getting beneath the veil of effective schools: Evidence from New York City (No. w17632). National Bureau of Economic Research.

Fryer, R. G. (2014). Injecting charter school best practices into traditional public schools: Evidence from field experiments. The Quarterly Journal of Economics. doi: 10.1093/qje/qju011

Hirsch, E. D. (2007). The knowledge deficit: Closing the shocking education gap for American children. Houghton Mifflin Harcourt.

 

Seidenberg, M. S. (2013). The science of reading and its educational implications. Language Learning and Development, 9(4), 331-360.

Tenure lessons from higher ed

9/10/2014

 
This article was originally published at RealClearEducation.com on July 15, 2014

Teacher tenure laws were adopted by most states during the first half of the 20th century. To advocates, tenure provides a guarantee of due process should a teacher be dismissed, and thus offers protection from capricious firings and personal vendettas. To critics, tenure is granted too readily to teachers of marginal skill, and the “due process” is so arduous, time-consuming, and expensive that it constitutes a de facto job guarantee. Thus critics see tenure as a primary reason that poor teachers stay in the profession. Which interpretation is closer to the truth? It’s been very hard to say. Tenure laws have been in place for so long that we haven’t had a counterfactual; guesses about the impact on the teacher labor force subsequent to a change in tenure procedures have been just that—guesses. Now, we’re starting to have some data on the matter (Loeb, Miller & Wyckoff, 2014).

New York City’s Department of Education changed its procedure for granting tenure in the 2009-2010 school year. Some features of the old system were retained. As before, teachers were evaluated at the end of their second year, based on the results of classroom observation, evaluations of teacher work (e.g., lesson plans), and an annual rating sheet completed by principals.

Starting in the 2009-2010 year, new information was available about student progress (including value-added measures calculated from state tests). Another new wrinkle was that principals would be required to write a justification for their decision if it differed from the conclusion the superintendent would draw about a teacher’s case.

In the following years, some small changes were made, the most interesting of which was the addition of data about teacher effectiveness based on surveys of students and parents, and feedback from colleagues.

So did these changes affect tenure decisions?

There was a sizable impact--not in teacher dismissal at tenure decision time, but in extending the time for the tenure decision. In the two years prior to the new system, about 94% of teachers got tenure, with 2 or 3% being terminated. For about the same share of teachers, principals elected to delay the decision for a year, so as to have more time and data with which to evaluate the teacher.

As shown on the graph, under the new system, the number of teachers denied tenure remains very small, but there has been a huge increase in the number of teachers for whom the decision was delayed.

[Graph: share of teachers approved, denied, or extended at the tenure decision, by year]

Most interesting was the response of the teachers when the decision was delayed. More of them transferred to a new school or exited the profession altogether.

[Graph: rates of transferring schools and exiting the profession, by tenure decision outcome]

The probability of a teacher transferring schools is 9 percentage points higher if the decision was extended than if tenure was approved. The probability of exiting the profession is 4 percentage points higher.
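
To make the units explicit (these are absolute differences in probability, not relative increases; the brief reports the gaps rather than the base rates, so those are left unspecified here):

P(transfer | decision extended) ≈ P(transfer | tenure approved) + 0.09
P(exit profession | decision extended) ≈ P(exit profession | tenure approved) + 0.04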

So on the one hand, the changes to tenure review that were meant to make the process more rigorous are not prompting principals to deny tenure any more frequently. On the other hand, principals are making much greater use of the option to delay the decision for a year. That, in turn, is having some impact on the workforce. A straightforward interpretation is that teachers rightly interpret the delayed decision as a sign that things are not going as well as they might, and some teachers figure that’s because the school they are in is a bad fit (and so transfer) or that the profession is just not for them (and so they exit).

What are we to make of the fact that the more rigorous criteria did not lead to more recommendations that teachers be fired? Although it’s possible that’s a sign of principals being reluctant to fire teachers, I doubt it. I think it’s more likely that the large number of delayed decisions reflects the belief that two years is just too early to tell. Certainly we know that teachers are still on the steep part of their learning curve at that point. They are improving, and how much more they will improve is tough to know.

It’s always tempting to think that one’s own training was optimal, but I do think higher education has a more sensible approach to tenure, simply because we take longer to make the decision. At most universities, the tenure decision is made in a professor’s sixth year (based on the first five years’ worth of performance data). There is a less rigorous review at the end of the third year. That review provides useful information to the candidate about where he or she stands and what needs to be improved in the next few years. It also gives the university a chance to fire someone if things are going really poorly.

Whether tenure makes sense at all today, and the possible consequences of eliminating it, are viable questions, but ones I’m not tackling here. But if you’re going to continue offering tenure, it’s a decision that ought to be made on more data than can be gleaned from the first two years.

Reference:

Loeb, S., Miller, L. C., & Wyckoff, J. (2014, May). Performance screens for school improvement: The case of teacher tenure reform in New York City. Retrieved July 13, 2014, from http://cepa.stanford.edu/sites/default/files/NYCTenure%20brief%20FINAL.pdf

"No screen time" study doesn't allow firm conclusion.

9/1/2014

 
NPR, the Daily Mail, and other outlets are trumpeting the results of a study published in Computers in Human Behavior: the spin is that digital devices leave kids emotionally stunted. But that conclusion is not supported by the study, which is, in fact, pretty poorly designed.

Researchers examined kids' ability to assess non-verbal emotion cues from still photos and from video scenes from which dialog had been removed. These assessments were made pre- and post-intervention.

The intervention is where things get weird. The press has it that the main intervention was the removal of electronic devices from children's lives for five days. In fact, the experimental group went to a sort of educational nature camp called the Pali Institute. While control subjects went to their regular school, experimental subjects participated in activities like these:
[Image: table of Pali Institute camp activities]
This study could almost serve as a test question in an undergraduate research methods course. In the results section, the authors conclude "We found that children who were away from screens for five days with many opportunities for in-person interaction improved significantly in reading facial emotion." As should be obvious from the Table, there were a host of differences between what the experimental kids and the control kids experienced.

In the discussion the authors do allow, "We recognize that the design of this study makes it challenging to tease out the separate effects of the group experience, the nature experience, and the withdrawal of screen-time." But they then go on to say, "but it is likely that the augmentation of in-person communication necessitated by the absence of digital communication significantly contributed to the observed experimental effect." That's a mere wish. We in fact cannot draw any conclusions about the source of the effect.

It's a shame that news outlets are not more discriminating in how they report this sort of work. 
