Daniel Willingham--Science & Education
Hypothesis non fingo

"Active learning" in college STEM courses--meta-analysis

6/20/2014

This column was originally published at RealClearEducation.com on May 20, 2014.

When you think of a college class, what image comes to mind? Probably a professor droning about economics, or biology, or something, in an auditorium with several hundred students. If you focus on the students in your mind’s eye, you’re probably imagining them looking bored and, if you’ve been in a college lecture hall recently, your image would include students shopping online and chatting with friends via social media while the oblivious professor lectures on. What could improve the learning and engagement of these students? According to a recent literature review, the results of which were reported by Science, Wired, PBS, and others, damn near anything.

Scott Freeman and his associates (Freeman et al, 2014) conducted a meta-analysis of 225 studies of college instruction that compared “traditional lecturing” vs. “active learning” in STEM courses. (STEM is an acronym for science, technology, engineering, and math.) Student performance on exams increased by about half a standard deviation. Students in the traditional lecture classes were 1.5 times as likely to fail as students in the active learning classes.
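(For readers who want the mechanics: a meta-analysis pools the standardized effect sizes from many studies, weighting precise studies more heavily. Below is a minimal sketch of fixed-effect, inverse-variance pooling in Python. The effect sizes and variances are invented for illustration; Freeman et al.'s actual analysis is more sophisticated.)

```python
import numpy as np

# Invented per-study standardized mean differences (d) and their variances;
# these are NOT Freeman et al.'s data, just stand-ins for illustration.
d = np.array([0.55, 0.30, 0.62, 0.41, 0.48])
var = np.array([0.020, 0.015, 0.050, 0.010, 0.030])

# Fixed-effect, inverse-variance pooling: precise studies get more weight.
w = 1.0 / var
d_pooled = np.sum(w * d) / np.sum(w)
se_pooled = np.sqrt(1.0 / np.sum(w))
print(f"pooled d = {d_pooled:.2f} (SE = {se_pooled:.2f})")
```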

Previous studies of college course interventions have been criticized on methodological grounds. For example, classes would experience either traditional lecture or active learning, but no effort would be made to evaluate whether the students were equivalently prepared when they started the class. Freeman et al. categorized the studies in their meta-analysis by methodological rigor, and reported that the size of the benefit was not different among studies of high or low quality.

That’s encouraging. What’s surprising is the breadth of the activities covered by the term “active learning” and how little we know about their differential effectiveness and why they work. According to the article, active learning “included approaches as diverse as occasional group problem-solving, worksheets or tutorials completed during class, use of personal response systems with or without peer instruction, and studio or workshop course designs.” The authors do not report on differential effectiveness of these methods.

In other words, in most of the studies summarized in the meta-analysis professors were still doing a whole lot of lecturing, but every now and then they would do something else. The “something else” ostensibly made students think about the course material, digest it in some way, generate a response. The authors certainly believe that that’s the source of the improvement, citing Piaget and Vygotsky as learning theorists who “challenge the traditional, instructor-focused, ‘teaching by telling’ approach.”

I’m ready to believe that that aspect of the activity was important (although not because of theory advanced by Piaget and Vygotsky nearly a century ago). But it would have been useful to evaluate the impact of an active control group--that is, to compare active learning to a class in which the professor is asked to do something new that does not entail active learning (e.g., asking the professor to show more videos). That’s important because interventions typically prompt a change for the better. John Hattie estimates that interventions boost student learning by 0.3 standard deviations, on average.

The exact figures are not reported, but it appears that for most studies the lecture condition was business-as-usual, the thing that typically happens. An active control is important to guard against the possibility that students improve because the professor is energized by doing something different, or holds higher expectations for students because she expects the “something different” to prompt improvement. It’s also possible that asking the professor to make a change in her teaching actually improves her lectures because she reorganizes them to incorporate the change.

It may seem captious to harp on the “why.” To be clear, I think that focusing on making students mentally active while they learn is a wonderful idea, and an equally wonderful idea is giving instructors rules of thumb and classroom techniques that make it likely that students will think. But knowing the source of the improvement will allow individual instructors to tailor methods to their own teaching, rather than following instructions without knowing why they help. It will also help the field collectively move to greater improvement.

Perhaps the best news is that the effectiveness of college instruction is on people’s minds. This past winter I visited a prominent research university, and an old friend told me “I’ve been here twenty-five years, and I don’t think I heard undergraduate teaching mentioned more than twice. In the last two years, that’s all anybody talks about, all over campus.”

Amen.

References

Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., & Wenderoth, M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences. doi: 10.1073/pnas.1319030111

Hattie, J. (2013). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.

What predicts college GPA?

2/18/2013

What aspects of background, personality, or achievement predict success in college--at least, "success" as measured by GPA?

A recent meta-analysis (Richardson, Abraham, & Bond, 2012) gathered articles published between 1997 and 2010, the products of 241 data sets. These articles had investigated these categories of predictors:
  • three demographic factors (age, sex, socio-economic status)
  • five traditional measures of cognitive ability or prior academic achievement (intelligence measures, high school GPA, SAT or ACT, A level points)
  • no fewer than forty-two non-intellectual measures of personality, motivation, or the like, summarized into the categories shown in the figure below
[Figure: the forty-two non-intellectual measures, grouped into categories]
Make this fun. Try to predict which of the factors correlate with college GPA.

Let's start with simple correlations.

41 out of the 50 variables examined showed statistically significant correlations. But statistical significance is a product of the magnitude of the effect AND the size of the sample--and the samples are so big that relatively puny effects end up being statistically significant. So in what follows I'll mention correlations of .20 or greater.
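(If you want to see that interplay concretely, here's a quick sketch using the standard t-test for a Pearson correlation. A tiny correlation of .05 sails past p < .05 once the sample gets big enough.)

```python
import numpy as np
from scipy import stats

def p_value_for_r(r, n):
    """Two-tailed p-value for a Pearson correlation r at sample size n."""
    t = r * np.sqrt((n - 2) / (1 - r**2))
    return 2 * stats.t.sf(abs(t), df=n - 2)

# The same puny correlation, at ever larger samples:
for n in (30, 300, 3000, 30000):
    print(f"r = .05, n = {n:>5}: p = {p_value_for_r(0.05, n):.4f}")
```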

Among the demographic factors, none of the three were strong predictors. It seems odd that socio-economic status would not be important, but bear in mind that we are talking about college students, so this is a pretty select group, and SES likely played a significant role in that selection. Most low-income kids didn't make it, and those who did likely have a lot of other strengths.

The best class of predictors (by far) is the traditional measures, which correlate from r = .20 (intelligence measures) up to r = .40 (high school GPA; ACT scores were also correlated at r = .40).

Personality traits were mostly a bust, with the exception of conscientiousness (r = .19), need for cognition (r = .19), and tendency to procrastinate (r = -.22). (Procrastination has a pretty tight inverse relationship to conscientiousness, so it strikes me as a little odd to include it.)

Motivation measures were also mostly a bust but there were strong correlations with academic self-efficacy (r = .31) and performance self-efficacy (r = .59). You should note, however, that the former is pretty much like asking students "are you good at school?" and the latter is like asking "what kind of grades do you usually get?" Somewhat more interesting is "grade goal" (r = .35) which measures whether the student is in the habit of setting a specific goal for test scores and course grades, based on prior feedback.

Among self-regulatory learning strategies, likewise, only a few factors were reliable predictors, including time/study management (r = .22) and effort regulation (r = .32), a measure of persistence in the face of academic challenges.

Not much happened in the Approach to Learning category or in psychosocial contextual influences.

We would, of course, expect that many of these variables would themselves be correlated, and that's the case, as shown in this matrix.
[Figure: correlation matrix among the predictors]
So the really interesting analyses are regressions that try to sort out which matter more.

The researchers first conducted five hierarchical linear regressions, in each case beginning with SAT/ACT, then adding high school GPA, and then investigating whether each of the five non-intellective predictors would add some predictive power. The variables were conscientiousness, effort regulation, test anxiety, academic self efficacy, and grade goal, and each did, indeed, add power in predicting college GPA after "the usual suspects" (SAT or ACT, and high school GPA) were included.
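(The logic of each of those regressions is easy to show in code. Here's a minimal sketch with statsmodels; the data file and column names are hypothetical, not from Richardson et al.)

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: one row per student, with columns college_gpa,
# sat, hs_gpa, and a candidate non-intellective predictor, grade_goal.
df = pd.read_csv("students.csv")

base = smf.ols("college_gpa ~ sat + hs_gpa", data=df).fit()
full = smf.ols("college_gpa ~ sat + hs_gpa + grade_goal", data=df).fit()

# Does the new predictor add anything beyond the usual suspects?
print(f"delta R2 = {full.rsquared - base.rsquared:.3f}")

# Formal F-test comparing the nested models.
f_stat, p_val, _ = full.compare_f_test(base)
print(f"F = {f_stat:.2f}, p = {p_val:.4f}")
```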

But what happens when you include all the non-intellective factors in the model?

The order in which they are entered matters, of course, and the researchers offer a reasonable rationale for their choice; they start with the most global characteristic (conscientiousness) and work towards the more proximal contributors to grades (effort regulation, then test anxiety, then academic self-efficacy, then grade goal).

As they ran the model, SAT and high school GPA continued to be important predictors. So were effort regulation and grade goal.

You can usually quibble about the order in which variables were entered and the rationale for that ordering, and that's the case here.  As they put the data together, the most important predictors of college grade point average are: your grades in high school, your score on the SAT or ACT, the extent to which you plan for and target specific grades, and your ability to persist in challenging academic situations.

There is not much support here for the idea that demographic or psychosocial contextual variables matter much. Broad personality traits, most motivation factors, and learning strategies matter less than I would have guessed.

No single analysis of this sort will be definitive. But aside from that caveat, it's important to note that most admissions officers would not want to use this study as a one-to-one guide for admissions decisions. Colleges are motivated to admit students who can do the work, certainly. But beyond that they have goals for the student body on other dimensions: diversity of skill in non-academic pursuits, or creativity, for example.

When I was a graduate student at Harvard, an admissions officer mentioned in passing that, if Harvard wanted to, the college could fill the freshman class with students who had perfect scores on the SAT. Every single freshman-- 800, 800. But that, he said, was not the sort of freshman class Harvard wanted.

I nodded as though I knew exactly what he meant. I wish I had pressed him for more information.

References:
Richardson, M., Abraham, C., & Bond, R. (2012). Psychological correlates of university students' academic performance: A systematic review and meta-analysis. Psychological Bulletin, 138, 353-387.


Meta-analysis: Learning from Gaming

2/10/2013

What people learn (or don't) from games is such a vibrant research area we can expect fairly frequent literature reviews. It's been about a year since the last one, so I guess we're due.

The last time I blogged on this topic, Cedar Riener remarked that it's sort of silly to frame the question as "does gaming work?" It depends on the game.

The category is so broad it can include a huge variety of experiences for students. If there were NO games from which kids seemed to learn anything, you probably ought not to conclude "kids can't learn from games." To do so would be to conclude that the distribution of learning for all possible games and all possible teaching would look something like this.
[Figure: hypothetical distributions in which learning from every possible game falls entirely below learning from every possible lecture]
But this pattern of data seems highly unlikely. It seems much more probable that the distributions overlap more, and that whether kids learn more from gaming or traditional teaching is a function of the qualities of each.

So what's the point of a meta-analysis that poses the question "do kids learn more from gaming or traditional teaching?"

I think of these reviews not as letting us know whether kids can learn from games, but as an overview of where we are--just how effective are the serious games offered to students?
The latest meta-analysis (Wouters et al., 2013) includes data from 56 studies and examined learning outcomes (77 effect sizes), retention (17 effect sizes), and motivation (31 effect sizes).

The headline result featured in the abstract is "games work!" Games are reported to be superior to conventional instruction in terms of learning (d = .29) and retention (d = .36), but, somewhat surprisingly, not motivation (d = .26).
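(A reminder of what d means here: it's the difference between group means, expressed in pooled-standard-deviation units. A minimal sketch, with invented scores:)

```python
import numpy as np

def cohens_d(x, y):
    """Difference between group means in pooled-standard-deviation units."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                  (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

# Invented exam scores for a game group and a conventional-instruction group.
rng = np.random.default_rng(0)
game = rng.normal(72, 10, 40)
conventional = rng.normal(69, 10, 40)
print(f"d = {cohens_d(game, conventional):.2f}")
```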

The authors examined a large set of moderator variables and this is where things get interesting. Here are a few of these findings:
  1. Students learn more when playing games in groups than playing alone.
  2. Peer-reviewed studies showed larger effects than others. (This analysis is meant to address the bias not to publish null results. . . but the interpretation in this case was clouded by small N's.)
  3. Age of student had no impact.

But two of the most interesting moderators significantly modify the big conclusions.

First, gaming showed no advantage over conventional instruction when the experiment used random assignment. When non-random assignment was used, gaming showed a robust advantage. So it's possible (or even likely) that games in these studies were more effective only when they interacted with some factor in the gamer that is self-selected (or selected by the experimenter or teacher). And we don't know yet what that factor is.

Second, the researchers say that gaming showed an advantage over "conventional instruction," but follow-up analyses show that gaming showed no advantage over what they called "passive instruction"--that is, teacher talk or reading a textbook. All of the advantage accrued when games were compared to "active instruction," described as "methods that explicitly prompt learners to learning activities (e.g., exercises, hypertext training)." So gaming (in this data set) is not really better than conventional instruction; it's better than one type of instruction (which in the US is probably less often encountered).

So yeah, I think the question in this review is ill-posed. What we really want to know is how do we structure better games? That requires much more fine-grained experiments on the gaming experience, not blunt variables. This will be painstaking work.

Still, you've got to start somewhere and this article offers a useful snapshot of where we are.

EDIT 5:00 a.m. EST 2/11/13. In the original post I failed to make explicit another important conclusion--there may be caveats on when and how the games examined are superior to conventional instruction, but they were almost never worse. This is not an unreasonable bar, and as a group the games tested pass it.

Wouters, P., van Nimwegen, C., van Oostendorp, H., & van der Spek, E. G. (2013). A meta-analysis of the cognitive and motivational effects of serious games. Journal of Educational Psychology. Advance online publication. doi: 10.1037/a0031311

How to Make a Young Child Smarter

2/4/2013

If the title of this post struck you as brash, I came by it honestly: it's the title of a terrific new paper by three NYU researchers (Protzko, Aronson & Blair, 2013). The authors sought to review all interventions meant to boost intelligence, and they cast a wide net, seeking any intervention for typically developing children from birth to kindergarten age that used a standard IQ test as the outcome measure and that was evaluated in a randomized controlled trial (RCT).

A feature of the paper I especially like is that none of the authors publish in the exact areas they review. Blair mostly studies self-regulation, and Aronson, gaps due to race, ethnicity or gender. (Protzko is a graduate student studying with Aronson.) So the paper is written by people with a lot of expertise, but who don't begin their review with a position they are trying to defend. They don't much care which way the data come out.

So what did they find? The paper is well worth reading in its entirety--they review a lot in just 15 pages--but there are four marquee findings.
First, the authors conclude that infant formula supplemented with long chain polyunsaturated fatty acids boosts intelligence by about 3.5 points, compared to formula without. They conclude that the same boost is observed if pregnant mothers receive the supplement. There are not sufficient data to conclude that other supplements--riboflavin, thiamine, niacin, zinc, and B-complex vitamins--have much impact, although the authors suggest (with extreme caution) that B-complex vitamins may prove helpful.

Second, interactive reading with a child raises IQ by about 6 points. The interactive aspect is key; interventions that simply encouraged reading or provided books had little impact. Effective interventions provided information about how to read to children: asking open-ended questions, answering questions children posed, following children's interests, and so on.

Third, the authors report that sending a child to preschool raises his or her IQ by a little more than 4 points. Preschools that include a specific language development component raise IQ scores by more than 7 points. There were not enough studies to differentiate what made some preschools more effective than others.

Fourth, the authors report on interventions that they describe as "intensive," meaning they involved more than preschool alone. The researchers sought to significantly alter the child's environment to make it more educationally enriching. All of these studies involved low-SES children (following the well-established finding that low-SES kids have lower IQs than their better-off counterparts due to differences in opportunity. I review that literature here.)  Such interventions led to a 4 point IQ gain, and a 7 point gain if the intervention included a center-based component. The authors note the interventions have too many features to enable them to pinpoint the cause, but they suggest that the data are consistent with the hypothesis that the cognitive complexity of the environment may be critical. They were able to confidently conclude (to their and my surprise) that earlier interventions helped no more than those starting later.

Those are the four interventions with the best track record. (Some others fared less well. Training working memory in young children "has yielded disappointing results.")

The data are mostly unsurprising, but I still find the article a valuable contribution: a reliable, easy-to-understand review on an important topic.

Even better, this looks like the beginning of what the authors hope will be a longer-term effort they are calling the Database on Raising Intelligence--a compendium of RCTs based on interventions meant to boost IQ. That may not be everything we need to know about how to raise kids, but it's a darn important piece, and such a Database will be a welcome tool.

The Good News About Spatial Skills

6/12/2012

There is a great deal of attention paid to, and controversy about, the promise of training working memory to improve academic skills, a topic I wrote about here.

But working memory is not the only cognitive process that might be a candidate for training. Spatial skills are a good predictor of success in science, mathematics, and engineering.

Now, on the basis of a new meta-analysis (Uttal, Meadow, Tipton, Hand, Alden, Warren & Newcombe, in press), researchers claim that spatial skills are eminently trainable. In fact, they claim a quite respectable average effect size of 0.47 (Hedges' g) after training (that's across 217 studies).
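(Hedges' g, for those keeping score, is the familiar standardized mean difference with a correction that shrinks it slightly for small samples. A sketch with invented scores:)

```python
import numpy as np

def hedges_g(x, y):
    """Standardized mean difference with Hedges' small-sample correction."""
    nx, ny = len(x), len(y)
    pooled_sd = np.sqrt(((nx - 1) * np.var(x, ddof=1) +
                         (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2))
    d = (np.mean(x) - np.mean(y)) / pooled_sd
    correction = 1 - 3 / (4 * (nx + ny) - 9)  # approaches 1 as samples grow
    return d * correction

# Invented spatial-test scores: trained group vs. untrained control.
rng = np.random.default_rng(1)
trained = rng.normal(105, 15, 25)
control = rng.normal(98, 15, 25)
print(f"g = {hedges_g(trained, control):.2f}")
```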

Training tasks across these many studies included things like visualizing 2D and 3D objects in a CAD program, acrobatic sports training, and learning to use a laparoscope (an angled device used by surgeons). Outcome measures were equally varied, and included standard psychometric measures (like a paper-folding test), tests that demanded imagining oneself in a landscape, and tests that required mentally rotating objects.

Even more impressive:

1) researchers found robust transfer to new tasks;
2) researchers found little, if any, effect of delay between training and test--the skills don't seem to fade with time, at least for several weeks. (Only four studies included delays of greater than one month.)

This is a long, complex analysis and I won't try to do it justice in a brief blog post. But the marquee finding is big news. What we'd love to see is an intervention that is relatively brief, not terribly difficult to implement, reliably leads to improvement, and transfers to new academic tasks.

That's a tall order, but spatial skills may fill all the requirements.

The figure below (from the paper) is a conjecture: if spatial training were widely implemented and, once scaled up, produced the average improvement seen in these studies, how many more people could be trained as engineers?
[Figure from the paper: projected increase in the number of people qualifying for engineering, given widespread spatial training]
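(The arithmetic behind that conjecture is straightforward: shift a normal distribution up by the training effect and count how many more people clear a fixed bar. The cutoff below is invented for illustration.)

```python
from scipy import stats

cutoff = 1.0  # hypothetical ability cutoff, in standard-deviation units
gain = 0.47   # the meta-analysis's average training effect (Hedges' g)

before = stats.norm.sf(cutoff)        # fraction clearing the bar untrained
after = stats.norm.sf(cutoff - gain)  # fraction clearing it after training

print(f"above cutoff before: {before:.1%}, after: {after:.1%}")
# Roughly 15.9% -> 29.8%: the qualifying pool nearly doubles.
```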
The paper is not publicly available, but there is a nice summary here from the collaborative laboratory responsible for the work. I also recommend this excellent article from American Educator on the relationship of spatial thinking to math and science, with suggestions for parents and teachers.

Uttal, D. H., Meadow, N. G., Tipton, E., Hand, L. L., Alden, A. R., Warren, C., & Newcombe, N. S. (2012, June 4). The malleability of spatial skills: A meta-analysis of training studies. Psychological Bulletin. Advance online publication. doi: 10.1037/a0028446

Newcombe, N. S. (2010). Picture this: Increasing math and science learning by improving spatial thinking. American Educator, Summer, 29-35, 43.

Should low-achieving kids be promoted or held back?

3/21/2012

One of the most troubling problems in education concerns the promotion or retention of low-achieving kids. It doesn't seem sensible to promote a child to the next grade if he's terribly far behind. But if he is asked to repeat a grade, isn't there a high likelihood that he will conclude he's not cut out for school?

Until recently, comparisons of kids who were promoted and kids who were retained indicated that retention didn't seem to help academic achievement, and in fact likely hurt. So the best practice seemed to be to promote kids to the next grade, but to try to provide extra academic support for them to handle the work.

But new studies indicate that academic outcomes for kids who are retained may be better than was previously thought, although still not what we would hope.

A meta-analysis by Chiharu Allen and colleagues indicates that the apparent effect of retention on achievement varies depending on the particulars of the research.

Two factors were especially important. First, the extent to which researchers controlled for possible differences between retained and promoted students: better studies ensured that groups were matched on many characteristics, whereas worse studies just used a generic "low achiever" control group. Second, the comparison group itself: some studies compared retained students to their age-matched cohort--who were now a year ahead in school--while others compared retained students to a grade-matched cohort or to the grade-matched norms of a standardized test.

Which comparison is more appropriate is, to some extent, a value judgment, but personally I can't see the logic in evaluating a kid's ability to do 4th grade work (relative to other 4th graders) when he's still in 3rd grade.

The authors reported two main findings:
1) Studies with poor controls indicated negative academic outcomes for retained students. Studies with better controls indicated no effect, positive or negative, of retention versus promotion.
2) When compared to students in the same grade, retained children show a short-term boost to academic achievement, but that advantage dissipates in the coming years. The authors speculate that students' academic self-efficacy increases in that first year, but they later come to believe that they are not academically capable.

This pattern--a one-year boost followed by loss--was replicated in a recently published study (Moser, West, & Hughes, in press).

The question of whether it's best to promote or retain low-achieving students is still open. But better research methodology is providing a clearer picture of the outcomes for these students. One hopes that better information will lead to better ideas for intervention.

Allen, C. S., Chen, Q., Willson, V. L., & Hughes, J. N. (2009). Quality of research design moderates effects of grade retention on achievement: A meta-analytic, multi-level analysis. Educational Evaluation and Policy Analysis, 31, 480-499.

Moser, S. E., West, S. G., & Hughes, J. N. (in press). Trajectories of math and reading achievement in low-achieving children in elementary school: Effects of early and later retention in grade. Journal of Educational Psychology.

The lab is not the "real world" . . . . right?

3/15/2012

To what extent can we trust that experimental results from a psychology laboratory will be observed outside of the laboratory?

This question is especially pertinent in education. It's difficult to conduct research in classrooms: it's hard to get the permission of the administration, and it's hard to persuade teachers to adopt an experimental practice that may or may not help students--and the ethics of that request ought to be carefully considered. The researcher must also make sure that the intervention is being implemented equivalently across classrooms and across schools. And so on. Research in the laboratory is, by comparison, easy.

But it's usually assumed that you give something up for this ease, namely, that the research lacks what is usually called ecological validity. Simply put, students may not behave in the laboratory as they would in a more natural setting (often called "the field").

A recent study sought to test the severity of this problem.

The researchers combed through the psychological literature, seeking meta-analytic studies that included a comparison of findings from the laboratory and findings from the field. For example, studies have examined the spacing effect--the boost to memory from distributing practice in time--both in the laboratory and in classrooms. Do you observe the advantage in both settings? Is it equally large in both settings?

The authors identified 217 such comparisons.
[Figure: scatterplot of laboratory effect sizes against field effect sizes]
Each dot represents one meta-analytic comparison (so each dot really summarizes a number of studies).

What this graph shows is a fairly high correlation between lab and field experiments: .639. 

If it worked in the lab, it generally worked in the field: only 30 times out of 215 did an effect reverse--for example, a procedure that reliably helped in the lab turned out to reliably hurt in the field (or vice versa).
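(Both numbers are simple to compute once you have the paired effect sizes: correlate the lab column with the field column, and count the sign disagreements. A sketch with invented pairs standing in for the real comparisons:)

```python
import numpy as np

# Invented (lab, field) effect-size pairs; NOT Mitchell's actual data.
rng = np.random.default_rng(2)
lab = rng.normal(0.4, 0.3, 217)
field = 0.6 * lab + rng.normal(0, 0.25, 217)

r = np.corrcoef(lab, field)[0, 1]
reversals = np.sum(np.sign(lab) != np.sign(field))
print(f"lab-field correlation: r = {r:.2f}")
print(f"sign reversals: {reversals} of {len(lab)}")
```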

The correlation did vary by field. It was strongest in Industrial/Organizational Psychology: there, the correlation was a whopping .89. In social psychology it was a more modest, but far from trivial, .53.

And what of education? There were only seven meta-analyses that went into the correlation, so it should be interpreted with some restraint, but the figure was quite close to that observed in the overall dataset: the correlation was .71.

So what's the upshot? Certainly, it's wise to be cautious about interpreting laboratory effects as applicable to the classroom. And I'm suspicious that the effects for which data were available were not random: in other words, there are data available for effects that researchers suspected would likely work well in the field and in the classroom.

Still, this paper is a good reminder that we should not dismiss lab findings out of hand because the lab is "not the real world." These results can replicate well in the classroom. 

Mitchell, G. (2012). Revisiting truth or triviality: The external validity of research in the psychological laboratory. Perspectives on Psychological Science, 7, 109-117.



