Category: Intelligence

Self-control Gone Wild?

9/9/2013

The cover story of the latest New Republic wonders whether American educators have fallen in blind love with self-control. Author Elizabeth Weil thinks we have. Titled “American Schools Are Failing Nonconformist Kids: In Defense of the Wild Child,” the article suggests that educators harping on self-regulation are really trying to turn kids into submissive little robots. And they do so because little robots are easier to control in the classroom.

But lazy teachers are not the only cause. Education policy makers are also to blame, according to Weil. She writes that “valorizing self-regulation shifts the focus away from an impersonal, overtaxed, and underfunded school system and places the burden for overcoming those shortcomings on its students.”

And the consequence of educators’ selfishness? Weil tells stories that amount to Self-Regulation Gone Wild. A boy has trouble sitting cross-legged in class—the teacher opines he should be tested because something must be wrong with him. During story time the author’s daughter doesn’t like to sit still and to raise her hand when she wants to speak. The teacher suggests occupational therapy.

I can see why Weil and her husband were angry when their daughter’s teacher suggested occupational therapy simply because the child’s behavior was an inconvenience to him. But I don’t take that to mean that there is necessarily a widespread problem in the psyche of American teachers. I take that to mean that their daughter’s teacher was acting like a selfish bastard.

The problem with stories, of course, is that there are stories to support nearly anything. For every story a parent could tell about a teacher diagnosing typical behavior as a problem, a teacher could tell a story about a child who really could do with some therapeutic help, and whose parents were oblivious to that fact.

What about evidence beyond stories?

Weil cites a study by Duncan et al (2007) that analyzed six large data sets and found social-emotional skills were poor predictors of later success.

She also points out that creativity among American school kids dropped between 1984 and 2008 (as measured by the Torrance Test of Creative Thinking) and she notes “Not coincidentally, that decrease happened as schools were becoming obsessed with self-regulation.”

There is a problem here. Weil uses different terms interchangeably: self-regulation, grit, social-emotional skills. They are not the same thing. Self-regulation (most simply put) is the ability to hold back an impulse when you think that the impulse will not serve other interests. (The marshmallow study would fit here.) Grit refers to dedication to a long-term goal, one that might take years to achieve, like winning a spelling bee or learning to play the piano proficiently. Hence, you can have lots of self-regulation but not be very gritty. Social-emotional skills might include self-regulation as a component, but the term refers to a broader complex of skills in interacting with others.

These are not niggling academic distinctions. Weil is right that some research indicates a link between socioemotional skills and desirable outcomes, and some doesn’t. But there is quite a lot of research showing associations between self-control and positive outcomes for kids, including academic outcomes, getting along with peers, parents, and teachers, and the avoidance of bad teen outcomes (early unwanted pregnancy, problems with drugs and alcohol, etc.). I reviewed those studies here. There is another literature showing associations of grit with positive outcomes (e.g., Duckworth et al, 2007).

Of course, those positive outcomes may carry a cost. We may be getting better test scores (and fewer drug and alcohol problems) but losing kids’ personalities. Weil calls on the reader’s schema of a “wild child,” that is, an irrepressible imp who may sometimes be exasperating, but whose very lack of self-regulation is the source of her creativity and personality.

But irrepressibility and exuberance are not perfectly inversely correlated with self-regulation. The purpose of self-regulation is not to lose your exuberance. It’s to recognize that sometimes it’s not in your own best interests to be exuberant. It’s adorable when your six-year-old is at a family picnic and impulsively practices her pas de chat because she cannot resist the Call of the Dance. It’s less adorable when it happens in class when everyone else is trying to listen to a story.

So there’s a case to be made that American society is going too far in emphasizing self-regulation. But the way to make it is not to suggest that the natural consequence of this emphasis is the crushing of children’s spirits because self-regulation is the same thing as no exuberance. The way to make the case is to show us that we’re overdoing self-regulation: that kids feel burdened, anxious, worried about their behavior.

Weil doesn’t have data that would bear on this point. I don’t either. But my perspective definitely differs from hers. When I visit classrooms or wander the aisles of Target, I do not feel that American kids are over-burdened by self-regulation.

As for the decline in creativity between 1984 and 2008 being linked to an increased focus on self-regulation…I have to disagree with Weil’s suggestion that it’s not a coincidence (setting aside the adequacy of the creativity measure). I think it might very well be a coincidence. Note that scores on the mathematics portion of the long-term NAEP increased during the same period. Why not suggest that kids’ improvement in a rigid, formulaic understanding of math inhibited their creativity?

Can we talk about important education issues without hyperbole?

References

Duckworth, A. L., Peterson, C., Matthews, M. D., & Kelly, D. R. (2007). Grit: perseverance and passion for long-term goals. Journal of personality and social psychology, 92(6), 1087.

Duncan, G. J., Dowsett, C. J., Claessens, A., Magnuson, K., Huston, A. C., Klebanov, P., ... & Japel, C. (2007). School readiness and later achievement. Developmental psychology, 43(6), 1428.

What predicts college GPA?

2/18/2013

What aspects of background, personality, or achievement predict success in college--at least, "success" as measured by GPA?

A recent meta-analysis (Richardson, Abraham, & Bond, 2012) gathered articles published between 1997 and 2010, the products of 241 data sets. These articles had investigated these categories of predictors:

three demographic factors (age, sex, socio-economic status)
five traditional measures of cognitive ability or prior academic achievement (intelligence measures, high school GPA, SAT or ACT, A level points)
no fewer than forty-two non-intellectual measures of personality, motivation, or the like, summarized into the categories shown in the figure below.

Make this fun. Try to predict which of the factors correlate with college GPA.

Let's start with simple correlations.

41 out of the 50 variables examined showed statistically significant correlations. But statistical significance is a product of the magnitude of the effect AND the size of the sample--and the samples are so big that relatively puny effects end up being statistically significant. So in what follows I'll mention correlations of .20 or greater.
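To make the point about sample size concrete, here is a minimal sketch (mine, not the meta-analysis authors'). It uses the Fisher z-transform as an approximate significance test for a correlation; the r of .05 and the two sample sizes are made-up inputs chosen only for illustration.

```python
import math
from statistics import NormalDist


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson r (Fisher z-transform)."""
    # atanh(r) * sqrt(n - 3) is roughly standard normal when the true r is 0.
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical inputs: the same puny correlation at two sample sizes.
for n in (100, 10_000):
    print(f"r = .05, n = {n}: p = {correlation_p_value(0.05, n):.4g}")
```

The effect is identical in both runs; only the sample size changes, and that alone moves the result from clearly non-significant to highly significant.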

Among the demographic factors, none of the three were strong predictors. It seems odd that socio-economic status would not be important, but bear in mind that we are talking about college students, so this is a pretty select group, and SES likely played a significant role in that selection. Most low-income kids didn't make it, and those who did likely have a lot of other strengths.
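The selection argument can be illustrated with a toy simulation. Everything in it is an assumption of mine rather than anything from the meta-analysis: the weights on SES and "other strengths," the admission rule, and the idea that we could observe a would-be college GPA for people who never enrolled.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the toy run is repeatable
people = []
for _ in range(20_000):
    ses = rng.gauss(0, 1)
    strengths = rng.gauss(0, 1)  # stand-in for everything else a student brings
    # Made-up weights: GPA depends on both, plus noise.
    gpa = 0.4 * ses + 0.6 * strengths + rng.gauss(0, 0.7)
    people.append((ses, strengths, gpa))

# Made-up admission rule: SES and other strengths can substitute for each other,
# so low-SES students get in only if their other strengths are high.
admitted = [p for p in people if p[0] + p[1] > 1.0]

r_everyone = pearson([p[0] for p in people], [p[2] for p in people])
r_admitted = pearson([p[0] for p in admitted], [p[2] for p in admitted])
print(f"SES-GPA correlation, everyone:      {r_everyone:.2f}")
print(f"SES-GPA correlation, admitted only: {r_admitted:.2f}")
```

In this toy world SES matters a good deal in the full population, yet the correlation shrinks sharply among the admitted, which is the pattern the paragraph above describes.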

The best class of predictors (by far) are the traditional correlates, all of which correlate at least r = .20 (intelligence measures) up to r = .40 (high school GPA; ACT scores were also correlated r = .40).

Personality traits were mostly a bust, with the exception of conscientiousness (r = .19), need for cognition (r = .19), and tendency to procrastinate (r = -.22). (Procrastination has a pretty tight inverse relationship to conscientiousness, so it strikes me as a little odd to include it.)

Motivation measures were also mostly a bust but there were strong correlations with academic self-efficacy (r = .31) and performance self-efficacy (r = .59). You should note, however, that the former is pretty much like asking students "are you good at school?" and the latter is like asking "what kind of grades do you usually get?" Somewhat more interesting is "grade goal" (r = .35) which measures whether the student is in the habit of setting a specific goal for test scores and course grades, based on prior feedback.

Self-regulatory learning strategies likewise showed only a few factors that provided reliable predictors, including time/study management (r = .22) and effort regulation (r = .32), a measure of persistence in the face of academic challenges.

Not much happened in the Approach to learning category nor in psychosocial contextual influences.

We would, of course, expect that many of these variables would themselves be correlated, and that's the case, as shown in this matrix.

So the really interesting analyses are regressions that try to sort out which matter more.

The researchers first conducted five hierarchical linear regressions, in each case beginning with SAT/ACT, then adding high school GPA, and then investigating whether each of the five non-intellective predictors would add some predictive power. The variables were conscientiousness, effort regulation, test anxiety, academic self-efficacy, and grade goal, and each did, indeed, add power in predicting college GPA after "the usual suspects" (SAT or ACT, and high school GPA) were included.
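For readers who have not seen a hierarchical regression, here is a rough sketch of the logic: enter predictors one at a time and watch how much R-squared grows. It works from correlations alone (the standardized case). The correlations with college GPA are the ones quoted above; the correlations among the predictors are HYPOTHETICAL placeholders, since the actual matrix is not reproduced here, so the printed numbers are illustrative only and are not the authors' results.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictor_corr, criterion_corr):
    """R^2 for standardized variables: beta weights dotted with criterion r's."""
    betas = solve(predictor_corr, criterion_corr)
    return sum(b * r for b, r in zip(betas, criterion_corr))


names = ["ACT", "high school GPA", "effort regulation"]
r_with_college_gpa = [0.40, 0.40, 0.32]  # values quoted in the post
r_between = [                            # HYPOTHETICAL intercorrelations
    [1.00, 0.40, 0.10],
    [0.40, 1.00, 0.25],
    [0.10, 0.25, 1.00],
]

# R^2 after entering the first 1, 2, then 3 predictors, in that order.
steps = [
    r_squared([row[:k] for row in r_between[:k]], r_with_college_gpa[:k])
    for k in (1, 2, 3)
]

previous = 0.0
for name, r2 in zip(names, steps):
    print(f"+ {name}: R^2 = {r2:.3f} (gain {r2 - previous:.3f})")
    previous = r2
```

The "gain" column is what each hierarchical step asks about: whether a newly entered variable explains anything the earlier ones did not. Because the predictors overlap, each gain is smaller than the variable's squared simple correlation would suggest.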

But what happens when you include all the non-intellective factors in the model?

The order in which they are entered matters, of course, and the researchers offer a reasonable rationale for their choice; they start with the most global characteristic (conscientiousness) and work towards the more proximal contributors to grades (effort regulation, then test anxiety, then academic self-efficacy, then grade goal).

As they ran the model, SAT and high school GPA continued to be important predictors. So were effort regulation and grade goal.

You can usually quibble about the order in which variables were entered and the rationale for that ordering, and that's the case here. As they put the data together, the most important predictors of college grade point average are: your grades in high school, your score on the SAT or ACT, the extent to which you plan for and target specific grades, and your ability to persist in challenging academic situations.

There is not much support here for the idea that demographic or psychosocial contextual variables matter much. Broad personality traits, most motivation factors, and learning strategies matter less than I would have guessed.

No single analysis of this sort will be definitive. But aside from that caveat, it's important to note that most admissions officers would not want to use this study as a one-to-one guide for admissions decisions. Colleges are motivated to admit students who can do the work, certainly. But beyond that they have goals for the student body on other dimensions: diversity of skill in non-academic pursuits, or creativity, for example.

When I was a graduate student at Harvard, an admissions officer mentioned in passing that, if Harvard wanted to, the college could fill the freshman class with students who had perfect scores on the SAT. Every single freshman-- 800, 800. But that, he said, was not the sort of freshman class Harvard wanted.

I nodded as though I knew exactly what he meant. I wish I had pressed him for more information.

References:
Richardson, M., Abraham, C., & Bond, R. (2012). Psychological correlates of university students' academic performance: A systematic review and meta-analysis. Psychological Bulletin, 138, 353-387.

How to Make a Young Child Smarter

2/4/2013

If the title of this blog struck you as brash, I came by it honestly: it's the title of a terrific new paper by three NYU researchers (Protzko, Aronson & Blair, 2013). The authors sought to review all interventions meant to boost intelligence, and they cast a wide net, seeking any intervention for typically-developing children from birth to kindergarten age that used a standard IQ test as the outcome measure, and that was evaluated in a randomized controlled trial (RCT).

A feature of the paper I especially like is that none of the authors publish in the exact areas they review. Blair mostly studies self-regulation, and Aronson, gaps due to race, ethnicity or gender. (Protzko is a graduate student studying with Aronson.) So the paper is written by people with a lot of expertise, but who don't begin their review with a position they are trying to defend. They don't much care which way the data come out.

So what did they find? The paper is well worth reading in its entirety--they review a lot in just 15 pages--but there are four marquee findings.

First, the authors conclude that infant formula supplemented with long chain polyunsaturated fatty acids boosts intelligence by about 3.5 points, compared to formula without. They conclude that the same boost is observed if pregnant mothers receive the supplement. There are not sufficient data to conclude that other supplements--riboflavin, thiamine, niacin, zinc, and B-complex vitamins--have much impact, although the authors suggest (with extreme caution) that B-complex vitamins may prove helpful.

Second, interactive reading with a child raises IQ by about 6 points. The interactive aspect is key; interventions that simply encouraged reading or provided books had little impact. Effective interventions provided information about how to read to children: asking open-ended questions, answering questions children posed, following children's interests, and so on.

Third, the authors report that sending a child to preschool raises his or her IQ by a little more than 4 points. Preschools that include a specific language development component raise IQ scores by more than 7 points. There were not enough studies to differentiate what made some preschools more effective than others.

Fourth, the authors report on interventions that they describe as "intensive," meaning they involved more than preschool alone. The researchers sought to significantly alter the child's environment to make it more educationally enriching. All of these studies involved low-SES children (following the well-established finding that low-SES kids have lower IQs than their better-off counterparts due to differences in opportunity; I review that literature here). Such interventions led to a 4 point IQ gain, and a 7 point gain if the intervention included a center-based component. The authors note the interventions have too many features to enable them to pinpoint the cause, but they suggest that the data are consistent with the hypothesis that the cognitive complexity of the environment may be critical. They were able to confidently conclude (to their and my surprise) that earlier interventions helped no more than those starting later.

Those are the four interventions with the best track record. (Some others fared less well. Training working memory in young children "has yielded disappointing results.")

The data are mostly unsurprising, but I still find the article a valuable contribution: a reliable, easy-to-understand review on an important topic.

Even better, this looks like the beginning of what the authors hope will be a longer-term effort they are calling the Database on Raising Intelligence--a compendium of RCTs based on interventions meant to boost IQ. That may not be everything we need to know about how to raise kids, but it's a darn important piece, and such a Database will be a welcome tool.

Yep, School Makes You Smarter

10/24/2012

Does going to school actually make you smarter (at least, as measured by standard cognitive ability tests)? Answering this question is harder than it would first appear because schooling is confounded with many other variables.

Yes, kids' cognitive abilities improve the longer they have been in school, but it's certainly plausible that better cognitive abilities make it more probable that you'll stay in school longer. And schooling is also confounded with age--kids who have been in school longer are also older and therefore have had more life experiences, and perhaps those have prompted the increases in intelligence.

One strategy is to test everyone on their birthday. That way, everyone should have had the same opportunity for life experiences, but the student with a birthday in May has had four months more schooling than the child with the January birthday.

That solves some problems, but it entails other assumptions. For example, older children within a grade might experience fewer social problems.

A new paper (Carlsson, Dahl, & Rooth, 2012) takes a different approach to addressing this difficult problem.

The authors capitalized on the fact that every male in Sweden must take a battery of cognitive tests for military service. The testing occurs near his 18th birthday, but the precise date is assigned more or less randomly (constrained by logistical factors for the military testers). So the authors could statistically control for the time-of-year effect of the birthday and in addition investigate the effects of just a few days more (or less) of schooling. The researchers were able to access a database of all the males tested between 1980 and 1994.

Students took four tests. Two tests (one of word meanings and one of reading technical prose) tap crystallized intelligence (i.e., what you know). Two others (spatial reasoning and logic) tap fluid intelligence (i.e., reasoning that is not dependent on particular knowledge).

The authors found that older students scored better on all four tests--no surprise there. What about students who were the same age, but who, because of the vagaries of the testing, happened to have had a few days more or fewer of schooling?

More schooling was associated with better performance, but only on the crystallized intelligence tests: an extra 10 days in school improved scores by about 1% of a standard deviation. Extra non-school days had no effect.

There was no measurable effect of school days on the fluid intelligence tests. This result might mean that these cognitive skills are unaffected by schooling, but it might also mean that the "dose" of schooling was too small to have an impact, or that the measure was insensitive to the effect that schooling has on fluid intelligence.

Reference
Carlsson, M., Dahl, G. B., & Rooth, D-O. (2012). The Effect of Schooling on Cognitive Skills. NBER Working Paper No. 18484, October 2012.

Working memory training: Are the studies accurate?

10/15/2012

Last June I posted a blog entry about training working memory, focusing on a study by Tom Redick and his colleagues, which concluded that training working memory might boost performance on whatever task was practiced, but it would not improve fluid intelligence.

(Measures of fluid intelligence are highly correlated with measures of working memory, and improving intelligence would be most people's purpose in undergoing working memory training.)

I recently received an email from Martin Walker, of MindSparke.com, which offers brain training. Walker sent me a polite email arguing that the study is not ecologically valid: that is, the conclusions may be accurate for the conditions used in the study, but the conditions used in the study do not match those typically encountered outside the laboratory. Here's the critical text of his email, reprinted with his permission:

"There is a significant problem with the design of the study that invalidates all of the hard work of the researchers--training frequency. The paper states that the average participant completed his or her training in 46 days. This is an average frequency of about 3 sessions per week. In our experience this frequency is insufficient. The original Jaeggi study enforced a training frequency of 5 days per week. We recommend at least 4 or 5 days per week.

With the participants taking an average of 46 days to complete the training, the majority of the participants did not train with sufficient frequency to achieve transfer. The standard deviation was 13.7 days which indicates that about 80% of the trainees trained less frequently than necessary. What’s more, the training load was further diluted by forcing each session to start at n=1 (for the first four sessions) or n=2, rather than starting where the trainee last left off."
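As a rough check on the arithmetic in that email, here is a minimal sketch. The 20 sessions, 46-day mean, and 13.7-day standard deviation come from the text above. The normal distribution of completion times is my ASSUMPTION (the actual shape is not given here), as is treating "4 sessions per week" as the cutoff for sufficient frequency.

```python
from statistics import NormalDist

SESSIONS = 20      # training sessions
MEAN_DAYS = 46.0   # average days to finish, per the email
SD_DAYS = 13.7     # standard deviation in days, per the email


def sessions_per_week(days: float, sessions: int = SESSIONS) -> float:
    """Average weekly training frequency for someone finishing in `days`."""
    return sessions / (days / 7)


avg_rate = sessions_per_week(MEAN_DAYS)

# Someone training at least 4 times a week finishes 20 sessions in 35 days.
cutoff_days = SESSIONS / 4 * 7

# ASSUMPTION: completion times are roughly normally distributed.
share_too_slow = 1 - NormalDist(MEAN_DAYS, SD_DAYS).cdf(cutoff_days)

print(f"average frequency: {avg_rate:.2f} sessions/week")
print(f"share taking longer than {cutoff_days:.0f} days: {share_too_slow:.0%}")
```

Under those assumptions the figures quoted in the email hang together: the average works out to about 3 sessions per week, and somewhat under 80% of trainees would have trained fewer than 4 times per week.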

I forwarded the email to Tom Redick, who replied:

"Your comment about the frequency of training was something that, if not in the final version of the manuscript, was questioned during the review process. Perhaps it would’ve been better to have all subjects complete all 20 training sessions (plus the mid-test transfer session) within a shorter prescribed amount of time, which would have led to the frequency of training sessions being increased per week. Logistically, having subjects from off-campus come participate complicated matters, but we did that in an effort to ensure that our sample of young adults was broader in cognitive ability than other cognitive training studies that I’ve seen. This was particularly important given that our funding came from the Office of Naval Research – having all high-ability 18-22 year old Georgia Tech students would not be particularly informative for the application of dual n-back training to enlisted recruits in the Army and Marines.

However, I don’t really know of literature that indicates the frequency of training sessions is a moderating factor of the efficacy of cognitive training, especially in regard to dual n-back training. If you know of studies that indicate 4-5 days per week is more effective than 2-3 days per week, I’d be interested in looking at it.

As mentioned in our article, the Anguera et al. (2012) article that did not include the matrix reasoning data reported in the technical report by Seidler et al. (2010) did not find transfer from dual n-back training to either BOMAT or RAPM [Bochumer Matrices Test and Raven's Advanced Progressive Matrices, both measures of fluid intelligence], despite the fact that “Participants came into the lab 4–5 days per week (average = 4.5 days) for approximately 25 min of training per session” (Anguera et al., 2012), for a minimum of 22 training sessions. In addition, Chooi and Thompson (2012) administered dual n-back to participants for either 8 or 20 days, and “Participants trained once a day (for about 30 min), four days a week”. They found no transfer to a battery of gF and gC tests, including RAPM.

In our data, I correlated the amount of dual n-back practice gain (using the same method as Jaeggi et al) during training and the number of days it took to finish all 20 practice sessions (and 1 mid-test session). I would never really trust a correlation of N = 24 subjects, but the correlation was r = -.05.

I re-analyzed our data, looking only at those dual n-back and visual search training subjects that completed the 20 training and 1 mid-test session within 23-43 days, meaning they did an average of at least 3 sessions of training per week. For the 8 gF tasks (the only ones I analyzed), there was no hint of an interaction or pattern suggesting transfer from dual n-back."

So to boil Redick's response down to a sentence, he's pointing out that other studies have observed no impact on intelligence when using a training regimen closer to that advocated by Walker, and Redick finds no such effect in a follow-up analysis of his own data (although I'm betting he would acknowledge that the experiment was not designed to address this question, and so does not offer the most powerful means of addressing it).

So it does not seem that training frequency is crucial.

A final note: Walker commented in another email that customers of MindSparke consistently feel that the training helps, and Redick remarked that subjects in his experiments have the same impression. It just doesn't bear out in performance.

New study: Fluid intelligence not trainable

6/19/2012

A few months ago the New York Times published an article on the training of working memory titled "Can You Make Yourself Smarter?" I suggested that the conclusions of the article might be a little too sunny--I pointed out that reviews of the literature by scientists suggested that having subjects practice working memory tasks (like the n-back task, shown below) led to improvement in the working memory task, but not in fluid intelligence.

I also pointed out that a significant limitation of many of these studies was the use of a single measure of intelligence. A new study solves that problem.

The study, by Thomas Redick and seven other researchers (link is 404 as I write this--I hope it will be back up soon), offers a negative result--training doesn't help--which often is not considered news. There are lots of ways of screwing up a study, most of which would lead to null results. But this null result ended up published in the excellent Journal of Experimental Psychology: General because the study is so well-designed.

Researchers gave the training plenty of opportunity to have an impact--subjects underwent 20 sessions. There were enough subjects (N=75) to afford decent statistical power to detect an effect, were one present. Researchers used a placebo control group (visual search) as well as a no-contact control group. They used multiple measures of fluid intelligence, crystallized intelligence, multi-tasking, and perceptual speed. These measures were administered before, during, and after training.

The results: people got better at what they practiced--either n-back or visual search--but there was no transfer to any other task, as shown in the Table.

One study is never fully conclusive on any issue. But given the previous uneven findings of the effects, this study represents another piece of the emerging picture: either fluid intelligence is trainable only in some specialized yet-to-be-defined circumstances, or it's not possible to make a substantial improvement in fluid intelligence through training at all.

These results make me skeptical of commercial programs offering to improve general cognitive processing.

Redick, T. S., Shipstead, Z., Harrison, T. L., Hicks, K. L., Fried, D. E., Hambrick, D. Z., Kane, M. J., & Engle, R. W. (in press). No evidence of intelligence improvement after working memory training: A randomized, placebo-controlled study. Journal of Experimental Psychology: General.

The latest on intelligence

5/10/2012

You may remember The Bell Curve. The book was published in 1994 by Richard Herrnstein and Charles Murray, and it argued that IQ is largely determined by genetics and little by the environment. It further argued that racial differences in IQ test scores were likely due to genetic differences among the races.

A media firestorm ensued, with most of the commentary issuing from people without the statistical and methodological background to address the core claims of the book.

The American Psychological Association created a panel of eminent researchers to write a summary of what was known about intelligence, which would presumably contradict many of these claims. The panel published the article in 1996, a thoughtful rebuttal of many of the inaccurate claims in The Bell Curve, but also a very useful summary of what some of the best researchers in the field could agree on when it came to intelligence.

Now there's an update.

A group of eminent scientists thought the time was ripe to provide the field with another status-of-the-field statement. They argue that there have been three big changes in the 15 years since the last report: (1) we know much more about the biology underlying intelligence; (2) we have a much better understanding of the impact of the environment on intelligence, and that impact is larger than was suspected; (3) we have a better understanding of how genes and the environment interact.

Some of the broad conclusions are listed below (please note that these are close paraphrases of the article's abstract).

The extent to which genes matter to intelligence varies by social class (genetic inheritance matters more if you're wealthy, less if you're poor).
Almost no genetic polymorphisms have been discovered that are consistently associated with variation of IQ in the normal range.
"Crystallized" and "fluid" intelligence are different, both behaviorally and biologically.
The importance of the environment for IQ is established by the 12 to 18 point increase in IQ observed when children are adopted from working-class to middle-class homes.
In most developed countries studied, gains on IQ tests have continued, and they are beginning in the developing world.
Sex differences in some aspects of intelligence are due partly to biological factors and partly to socialization factors.
The IQ gap between Blacks and Whites in the US has been reduced by 0.33 standard deviations in recent years.

The article is well worth reading in its entirety. Download it here.

Neisser, U., Boodoo, G., Bouchard, T. J., Jr., Boykin, A. W., Brody, N., Ceci, S. J., Halpern, D. F., Loehlin, J. C., et al. (1996). Intelligence: Knowns and Unknowns. American Psychologist, 51, 77.

Nisbett, R. E., Aronson, J., Blair, C., Dickens, W., Flynn, J., Halpern, D. F., & Turkheimer, E. (2012, January 2). Intelligence: New Findings and Theoretical Developments. American Psychologist, 67, 130-159.

Training working memory might make you smarter

4/20/2012

The New York Times Magazine has an article on working memory training and the possibility that it boosts one type of intelligence.

I think the article is a bit--but only a bit--too optimistic in its presentation.

The article correctly points out that a number of labs have replicated the basic finding: training with one or another working memory task leads to increases in standard measures of fluid intelligence, most notably, Raven's Progressive Matrices.

Working memory is often trained with an N-back task, shown in the figure at left from the NY Times article. You're presented with a series of stimuli, e.g., you're hearing letters. You press a button if a stimulus is the same as the one before (N=1), the time before last (N=2), or the time before that (N=3). You start with N=1 and N increases if you are successful. (Larger N makes the task harder.) To make it much harder, researchers can add a second stream of stimuli (e.g., the colored squares shown at left) and ask you to monitor BOTH streams of stimuli in an N-back task.

That is the training task that you are to practice. (And although the figure calls it a "game" it's missing one usual feature of a game; it's no fun at all.)
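For anyone who wants the rule spelled out, here is a minimal sketch of the single-stream version. The function names and the letter sequence are my own illustration, not taken from any actual training program.

```python
def nback_targets(stimuli, n):
    """One flag per stimulus: True where it matches the item n positions back."""
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


def score(responses, stimuli, n):
    """Count hits and false alarms for a list of button presses (True = pressed)."""
    targets = nback_targets(stimuli, n)
    hits = sum(r and t for r, t in zip(responses, targets))
    false_alarms = sum(r and not t for r, t in zip(responses, targets))
    return hits, false_alarms


letters = list("TKTKRR")  # hypothetical stream of heard letters
print(nback_targets(letters, 1))  # targets when N=1
print(nback_targets(letters, 2))  # targets when N=2
```

The same stream yields different targets at different N, which is why raising N forces you to hold more items in mind at once. A dual version would simply run this check on two streams at the same time.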

There are two categories of outcome measures taken after training. In a near-transfer task, subjects are given some other measure of working memory to see if their capacity has increased. In a far-transfer task, a task is administered that isn't itself a test of working memory, but of a process that we think depends on working memory capacity.

All the excitement has been about far-transfer measures, namely that this training boosts intelligence, about which more in a moment. But it's actually pretty surprising and interesting that labs are reporting near-transfer. That's a novel finding, and contradicts a lot of work that's come before, showing that working memory training tends to benefit only the particular working memory task used during training, and doesn't even transfer to other working memory tasks.

The far-transfer claim has been that the working memory training boosts fluid intelligence. Fluid intelligence is one's ability to reason, see patterns, and think logically, independent of specific experience. Crystallized intelligence, in contrast, is stuff that you know, knowledge that comes from prior experience. You can see why working memory capacity might lead to more fluid intelligence--you've got a greater workspace in which to manipulate ideas.

A standard measure of fluid intelligence is the Raven's Progressive Matrices task, in which you see a pattern of figures, and you are to say which of several choices would complete the pattern, as shown below.

So, is this finding legit? Should you buy an N-back training program for your kids?

I'd say the jury is still out.

The Times quotes Randy Engle--a highly regarded working memory researcher--on the subject, and he can hardly conceal his scorn: “May I remind you of ‘cold fusion’?”

Engle--who is not one of those scientists who have made a career out of criticizing others--has a lengthy review of the working memory training literature which you can read here.

Another recent review (which was invited for the journal Brain & Cognition) concluded "Sparse evidence coupled with lack of scientific rigor, however, leaves claims concerning the impact and duration of such brain training largely unsubstantiated. On the other hand, at least some scientific findings seem to support the effectiveness and sustainability of training for higher brain functions such as attention and working memory."

My own take is pretty close to that conclusion.

There are enough replications of this basic effect that it seems probable that something is going on. The most telling criticism of this literature is that the outcome measure is often a single task.

You can't use a single task like the Raven's and then declare that fluid intelligence has increased because NO task is a pure measure of fluid intelligence. There are always going to be other factors that contribute to task performance.

The best measure of an abstract construct like "fluid intelligence" is one that uses several tasks that look quite different from one another, but which you have reason to think all call on fluid intelligence. Then you use statistical methods to look for shared variance among the tasks.
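Here is a toy illustration of why several tasks beat one. It is a simulation under assumptions I made up: each of three hypothetical tasks reflects a person's underlying ability plus a good deal of task-specific noise, and the "statistical method" is the crudest one available, an average of z-scores, standing in for the factor-analytic methods researchers would actually use.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def zscores(xs):
    """Standardize scores to mean 0 and standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]


rng = random.Random(1)  # fixed seed so the toy run is repeatable
ability = [rng.gauss(0, 1) for _ in range(5000)]  # unobservable in real life

# Three hypothetical tasks: part shared ability, part task-specific noise.
tasks = [[0.6 * a + 0.8 * rng.gauss(0, 1) for a in ability] for _ in range(3)]

# Unit-weighted composite: average each person's z-scores across the tasks.
composite = [sum(vals) / 3 for vals in zip(*[zscores(t) for t in tasks])]

single_rs = [pearson(t, ability) for t in tasks]
composite_r = pearson(composite, ability)
print("each task vs. ability:", [round(r, 2) for r in single_rs])
print("composite vs. ability:", round(composite_r, 2))
```

The task-specific noise partly cancels when the tasks are combined, so the composite tracks the underlying ability more closely than any single task does. That is the reason a gain on one task alone is weak evidence that the construct itself has changed.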

So what we'd really like to see is better performance after working memory training on a few tasks.

The fact is that in many of these studies, researchers have tried to show transfer to more than one task, and the training transfers to one, but not the other.

Here's a table from a 2010 review by Torkel Klingberg showing this pattern.

This work is really just getting going, and the inconsistency of the findings means one of two things. Either the training regimens need to be refined, whereupon we'll see the transfer effects more consistently, OR the benefits we've seen thus far were mostly artifactual, a consequence of uninteresting quirks in the designs of studies or the tasks.

My guess is that the truth lies somewhere between these two--there's something here, but less than many people are hoping. But it's too early to say with much confidence.