The Significance of the Flynn Effect

The significance of Flynn’s assault on the meaning of general intelligence cannot be overstated. General intelligence, or g, is inferred to underlie performance on a battery of diverse tasks that seem to be quite dissimilar but which turn out to be moderately correlated. For example, students who score above average on vocabulary usually score above average on math, spatial ability, general information, puzzles, and comprehension. They may be much higher on some things than others (due to the operation of specific abilities) but the fact that they are so alike on such disparate tasks is seen as a manifestation of g.

For nearly 100 years psychometric researchers have been enamored with g, touting its ability to tie together myriad, seemingly unrelated, phenomena. For instance, a person’s level of g is the single best predictor of his school performance, occupational success, and a host of other outcomes. Importantly, it is a far better predictor than are specific cognitive and personality measures, and it remains substantially, if imperfectly, stable over an individual’s lifetime—from around early elementary school till old age. In one longitudinal analysis of individuals who were given IQ tests at age 11 in 1932 and retested at age 77, the corrected correlation between their two IQ scores was .74, showing substantial stability over their lifetimes.

To the extent that an intelligence test measures general intelligence (or, in the jargon of psychometrics, “loads” on g), it is not only a better predictor of performance in school and in the workplace, but it is also more heritable, and it is more closely related to a number of physiological indices such as neural efficiency and brain volume. IQ tests are heavily g-loaded, including ones that seem fair to children across all cultures, such as matrices that involve nonsense shapes not encountered by any children, regardless of social class or culture, in school or at home. The IQ mafia has interpreted this constellation of associations as evidence that g reflects an underlying biologically based and stable intellectual ability, rather than a specific skill learned in school or taught by parents. It (g) permeates nearly all complex tasks, and this is allegedly why IQ scores are so highly correlated with all other complex cognitive tests, such as the SATs, Civil Service exams, Armed Services Vocational Aptitude Batteries, and the GREs. The claim is that they all measure g and they all predict important life outcomes, while being highly heritable. It is but a short stone’s throw to a genetic meritocracy syllogism:

• An underlying ability (called g) is needed for all forms of cognitive performance

• g is manifest in any broad cognitive battery such as IQ

• g is related to many types of biological markers and is highly heritable

• Large individual and group differences exist in g

• Variation in g predicts differential life outcomes

• Therefore, variation in life outcomes is at least partly rooted in biological differences in g

Putting these pieces together leads some to argue that inequality in the distribution of wealth, prestige, and educational attainment is, in part, a consequence of unequal distribution of the intellectual capacity needed for high levels of functioning. Psychologists have gathered impressive data that seem to accord with each prong of this syllogism. So when Flynn revealed massive IQ gains over the course of the 20th century, he threw a spanner into the syllogism by revealing several paradoxes. How can IQ be a test of general intelligence (g) that is biologically driven and highly heritable and yet improve so quickly—often rising dramatically within a single generation?

Put aside whether one agrees with (1) Flynn’s own attempt to resolve paradoxes such as how large IQ gains can nevertheless be compatible with high heritability estimates for IQ, or (2) the claim that the gains actually reflect improvements in g (not everyone does; see Rushton & Jensen, 2006). The fact remains that he has shown beyond doubt that general intelligence fluctuates systematically over time and that this cannot be due to our having better genes than our grandparents. The average score rises by approximately .3 of an IQ point every year (6 IQ points every twenty years), and this has been found for nearly 30 nations. The gain went unnoticed before Flynn and others made this discovery because IQ tests were periodically re-normed and the average score was reset to 100, even if the average person would actually have scored 106 on the old norms. The size of the IQ gain is smallest on tests that are more directly taught in school and at home (e.g., vocabulary, arithmetic) and largest on tests that would seem unrelated to schooling (e.g., matrices, detecting similarities).
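The arithmetic behind these figures can be sketched in a few lines. This is an illustration only: the rate of .3 points per year is the figure quoted above, while the assumption of steady linear drift, the constant names, and the two helper functions are hypothetical and not part of Flynn's own analysis.

```python
# Illustrative arithmetic only. The 0.3 points/year rate is the figure quoted
# in the essay; the linear-drift model and the function names are assumptions.

RATE_PER_YEAR = 0.3  # average IQ gain per year, as quoted in the essay
NORM_MEAN = 100.0    # mean assigned to the norming sample by convention


def mean_on_old_norms(years_since_norming: float, rate: float = RATE_PER_YEAR) -> float:
    """Average score of current test-takers on norms set `years_since_norming` years ago."""
    return NORM_MEAN + rate * years_since_norming


def mean_on_new_norms(years_before_norming: float, rate: float = RATE_PER_YEAR) -> float:
    """Average score of a cohort tested `years_before_norming` years earlier,
    if that cohort were scored against the later norms."""
    return NORM_MEAN - rate * years_before_norming


if __name__ == "__main__":
    # Print both directions of the drift for a few illustrative intervals.
    for years in (10, 20, 50):
        print(years, round(mean_on_old_norms(years), 1), round(mean_on_new_norms(years), 1))
```

Under this simple model, re-norming hides the drift because every new norming sample is assigned a mean of 100 by construction; the gain becomes visible only when one cohort is scored against another cohort's norms.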

This is not what one might expect if gains were the result of environmental improvements such as more or higher quality schooling. But it highlights the curious path from everyday activities to intellectual performance: It is one thing if a child’s IQ is elevated over time because she is drilled daily on vocabulary and basic number facts (two of the subtests of major IQ batteries). But Flynn and others have shown that these are not the areas where IQ has risen much. It is in what Flynn refers to as “on-the-spot reasoning” about relations between objects that are either totally familiar to everyone, hence no one can be claimed to have a prior advantage (e.g., arranging familiar pictures so they tell a coherent story) or objects that are totally unfamiliar to everyone (e.g., nonsense shapes that have to be seriated). On these types of tests the IQ gains have been enormous. If we gave our grandparents today’s tests they’d score near the mentally retarded range, something that neither Flynn nor most researchers believe reflects their intelligence, notwithstanding their low scores.

A relatively unexplored question concerns the causal pathways running from early environments to later performance on g-loaded tests. Granted, most of us do not directly teach our children how to arrange pictures to tell a story or how to seriate or cross-classify a multidimensional matrix of shapes, but perhaps there are activities that indirectly foster elevated scores on such tests. And perhaps these activities become more common with each subsequent generation, leading to the Flynn effect. There is some support for this view. For example, research with Brazilian children demonstrates that every year of formal school attendance confers an improvement in their performance on Raven’s Matrices, the quintessential g measure. Raven’s Matrices are associated with the largest IQ gains in the 20th century, so there is clearly something about being in school that aids performance on this highly g-loaded test.

Similarly, researchers have shown that differences in the ways boys and girls spend their time (e.g., playing with Legos) (Bornstein et al., 1999), toy selection (Goldstein, 1994), and computer videogame experience (Quaiser-Pohl et al., 2006) are responsible for differences in their spatial abilities, which are also loaded on g. In a recent well-controlled study, Feng et al. (2007) found that playing action videogames significantly narrowed gender differences on mental rotation tasks, in which perspective drawings are shown at different orientations and one must determine whether they depict the same object, or on tasks in which one is asked to judge whether a 2-dimensional piece of paper can be folded into a 3-D shape. In this study, both males and females who were asked to play action videogames improved their mental rotation scores, but the improvement was much larger for females, and the performance of the females after playing such games was indistinguishable from that of the males who did not play them. Mental rotation is a g-loaded task that is related to math ability. But clearly it can be improved by certain everyday experiences that some individuals engage in.

Lest one imagine that g is driven exclusively by schooling, however, consider a direct comparison of the increases in performance on tests of general intelligence across educational age (years of schooling) versus chronological age. Brouwers and his colleagues (2006) demonstrated that school attendance by rural Indian children has a substantially smaller impact than the natural stimulation provided by their everyday experiences of herding, running errands, etc. On average, the increase in general intelligence that results from one year of chronological age is twice the increase that results from one year of attending school. This research revealed that attending school affects tests of cognitive ability primarily in academic domains (e.g., arithmetic) as opposed to on-the-spot reasoning.

We do not know where the Flynn effect is headed. I doubt it will continue at the .3 point per year pace that occurred in the 20th century, though my gut suspicion is that it will rear its head in undeveloped nations that have not had access to the environmental improvements (schooling, challenging games, parental investments) that drove the increases in developed nations. Regardless of whether this hunch proves accurate, all of us owe Flynn a deep debt of gratitude for complicating what had started to seem like a closed case.

References

Bornstein, M. H., Haynes, O., Pascual, L., Painter, K., & Galperin, C. (1999). Play in two societies: Pervasiveness of process, specificity of structure. Child Development, 70, 317–331.

Brouwers, S. A., Mishra, R. C., & Van de Vijver, F. J. R. (2006). Schooling and everyday cognitive development among Kharwar children in India: A natural experiment. International Journal of Behavioral Development, 30, 559–567.

Feng, J., Spence, I., & Pratt, J. (2007). Playing an action video game reduces gender differences in spatial cognition. Psychological Science, 18, 850–855.

Goldstein, J. H. (1994). Sex differences in toy play and use of video games. In J. H. Goldstein (Ed.), Toys, play, and child development (pp. 110–129). NY: Cambridge University Press.

Quaiser-Pohl, C., Geiser, C., & Lehmann, W. (2006). The relationship between computer game preference, gender, and mental-rotation ability. Personality and Individual Differences, 40, 609–619.

Rushton, J. P., & Jensen, A. R. (2006). The totality of available evidence shows the race IQ gap still remains. Psychological Science, 17, 921–922.

—

Stephen J. Ceci is the Helen L. Carr Professor of Developmental Psychology at Cornell University.
