Brighter Brains

Education, Humanism, Sustainability, Women's Equality


Rule-dependence model explains the commonalities between the Flynn effect and IQ gains via retesting

Posted: Tue, November 19, 2013

by Elijah L. Armstrong and Michael A. Woodley


We present a new model of the Flynn effect. It is proposed that Flynn effect gains are partly a function of the degree to which a test is dependent on rules or heuristics. This means that testees can become better at solving ‘rule-dependent’ problems over time in response to changing environments, which lead to the improvement of lower-order cognitive processes (such as implicit learning and aspects of working memory). These in turn lead to apparent IQ gains that are partially independent of general intelligence. We argue that the Flynn effect is directly analogous to IQ gains via retesting, noting that Raven’s Progressive Matrices is particularly sensitive to both the effects of retesting and the Flynn effect. After an extensive review of the relevant supporting literature, we test our thesis by developing a rule-dependence typology and then correlating the vector of a test’s position in the typology with the vector of the Flynn effect that it yields. We find a significant vector correlation of r ~ .60 (N = 14). Finally, we make a number of novel and testable predictions based on our model.

1. Introduction

The Flynn effect describes the tendency for IQ scores to rise across the board at a rate of approximately .30 points per year, or three points per decade (Flynn, 2009). Amongst developed countries the effect had its origins in the early decades of the 20th century (Lynn, 2013), but seems to have been most pronounced in the period immediately following World War II, when especially large gains were recorded in Europe and Japan (Flynn, 2009). Large gains have also been detected in South Korea in the period following the Korean War (te Nijenhuis, Cho, Murphy, & Lee, 2012). The Flynn effect has recently been detected in a number of developing countries, including Dominica (Meisenberg, Lawless, Lambert, & Newton, 2005), Saudi Arabia (Batterjee, Khaleefa, Ali, & Lynn, 2013), South Africa (te Nijenhuis, Murphy, & van Eeden, 2012), Turkey (Kagitcibasi & Biricik, 2011; Rindermann, Schott, & Baumeister, 2013), Brazil (Colom, Flores-Mendoza, & Abad, 2007), Kenya (Daley, Whaley, Sigman, Espinosa, & Neuman, 2003) and Sudan (Khaleefa, Sulman, & Lynn, 2009).

There are three related and significant issues concerning the Flynn effect: 1) To what extent does the effect constitute a ‘real’ gain in IQ, as opposed to a simple change in test-taking habits, such as an increasing reliance on guessing the answers to multiple-choice items (e.g., Brand, 1996)? 2) To what extent does the Flynn effect concern changes in the level of g, the common factor among many different cognitive ability measures, rather than more narrow sources of ability variance (e.g., Jensen, 1998a)? 3) What has caused the Flynn effect?

With respect to the first issue, several apparent real-world corollaries of the Flynn effect suggest that it is associated with actual increases in certain abilities, and is therefore not solely an artifact of changing attitudes towards test-taking. These include historical increases in GDP (Purchasing Power Parity adjusted) per capita paralleling the historical trends in the effect (Woodley, 2012a); an increase in precociousness in intellectual games such as Chess, Bridge and Go; teacher ratings indicating that students are becoming increasingly practically ‘intelligent’ (Howard, 1999, 2001); and neurological evidence indicating that the effect may be directly related to both increasing brain size (Lynn, 1989) and enhanced right hippocampal functioning (Baxendale & Smith, 2012). Recent research, however, indicates that changing test-taking attitudes, especially the tendency towards the increased use of guessing on harder items, may nevertheless account for a portion of the Flynn effect (Must & Must, in press).

With respect to the second issue, two complementary lines of evidence indicate that the Flynn effect is not occurring on g.
The first line of evidence concerns the use of the method of correlated vectors, where the g loading of an association between IQ and another variable is calculated by correlating the vector of the magnitude of the effect with the vector of the g loadings of different tests (Jensen, 1998a). Generally, the relationships between IQ and biological or part-biological sources of individual and group differences, such as subtest heritabilities, inbreeding depression scores (Rushton, 1999; Rushton & Jensen, 2010; van Bloois, Geutjes, te Nijenhuis, & de Pater, 2009), reaction time measures (Jensen, 1998a), brain size (Rushton & Ankney, 2009), fluctuating asymmetry (Prokosch, Yeo, & Miller, 2005), and dysgenic fertility (Woodley & Meisenberg, 2013a) are g loaded. Collectively, such effects are termed “Jensen effects” (Rushton, 1998). Conversely, culturally driven effects, such as the IQ gains accrued via the retesting effect and IQ gains in adopted children, are generally anti-Jensen effects in that they are significantly more pronounced on the least g loaded subtests (Jensen, 1998b; te Nijenhuis, van Vianen, & van der Flier, 2007). Given the presence of this apparent biological vs. cultural division, where does the Flynn effect fall? In other words, is it closer to being a purely biological or cultural effect?
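As a rough illustration of the method of correlated vectors described above, the following minimal Python sketch correlates a vector of subtest g loadings with a vector of effect magnitudes and reports the sign of the result. The function names (`pearson_r`, `effect_direction`) and all numeric values are hypothetical and invented for this example; they are not taken from any study cited here. Published analyses typically also test significance and correct for artifacts such as subtest unreliability, which this sketch omits.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("a constant vector has no defined correlation")
    return covariance / denominator


def effect_direction(r):
    """Label the sign of a vector correlation (ignores significance)."""
    if r > 0:
        return "Jensen effect"
    if r < 0:
        return "anti-Jensen effect"
    return "no association"


# HYPOTHETICAL data: one entry per subtest, invented for illustration only.
g_loadings = [0.80, 0.70, 0.60, 0.50]      # g loading of each subtest
effect_sizes = [0.10, 0.20, 0.30, 0.40]    # size of the effect on each subtest

r = pearson_r(g_loadings, effect_sizes)
label = effect_direction(r)
print(f"vector correlation r = {r:.2f} ({label})")
```

In this made-up case the effect is largest on the least g loaded subtests, so the vector correlation is negative, which is the pattern the text calls an anti-Jensen effect.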

The preponderance of studies indicates that the effect is either uncorrelated or mildly negatively correlated with subtest g loadings (Jensen, 1998a; Must, Must, & Raudik, 2003a, 2003b; Rushton, 1999; te Nijenhuis, 2013; te Nijenhuis & van der Flier, 2007; Woodley & Meisenberg, 2013b). A recent meta-analytic study of over 17,000 individuals revealed that the Flynn effect is in fact a statistically significant anti-Jensen effect (rho = −.38; te Nijenhuis & van der Flier, in press), indicating that it is likely to be substantially environmental in origin, given the monotonic positive relationship between g loadings and subtest heritabilities (Rushton & Jensen, 2010; van Bloois et al., 2009).

The second line of evidence demonstrating the Flynn effect’s lack of g loading comes from the study of Wicherts et al. (2004), who utilized multi-group confirmatory factor analysis to examine factorial invariance across a number of cohorts exhibiting the Flynn effect. If the effect occurs on g, it would be expected that the factor structure of g will be preserved across time between cohorts, i.e., will be invariant. The study found that lack of factorial invariance was characteristic of the Flynn effect, which indicates that the effect is associated with heterogeneous gains on specific tests rather than a gain at the level of latent variables (such as g). This finding was replicated subsequently in Estonian cohorts employing the National Intelligence Test (Must, te Nijenhuis, Must, & van Vianen, 2009), and using a different method, at the item level in the Raven’s Progressive Matrices (Fox & Mitchum, 2013). Despite the finding of no factorial or measurement invariance in the Flynn effect, te Nijenhuis and van der Flier’s (in press) finding of a modest (rather than monotonic) anti-Jensen effect suggests that some small portion of the Flynn effect may still occur on g. One possible explanation for this discrepancy is that secular gains resulting purely from changing test-taking habits may mask the size of the anti-Jensen effect on the remainder of the Flynn effect. Gains through guessing involve increased guessing on harder items, which are generally more g loaded (Must & Must, in press), and hence may mimic the Jensen effect. This is a sound theoretical reason for suspecting that the guessing-controlled Flynn effect is more strongly negatively associated with g loadings than the data currently indicate.

With respect to the third issue, the apparent independence of the Flynn effect from g permits us to better discriminate amongst potential causes (te Nijenhuis, 2013). Narrow and more ‘hollow’ sources of variance in cognitive abilities are sometimes substantially less heritable than g itself (Carroll, 1993; Rushton & Jensen, 2010; van Bloois et al., 2009). Hence, these sources are far more amenable to environmental manipulation of a sort that could give rise to relatively large gains in measured IQ over a relatively short time frame (Woodley, 2011a, 2012b). This presents a plausible solution to the so-called ‘IQ paradox’, or the observation that measured IQ has risen despite IQ exhibiting a high additive heritability (Dickens & Flynn, 2001). Proposed causes of the Flynn effect such as heterosis (Mingroni, 2004, 2007) can thus be ruled out (Flynn, 2009; Woodley, 2011a), as such gains are biological and associated with the Jensen effect (Nagoshi & Johnson, 1986). Sources of IQ gains that are associated with significant environmental and social improvements, such as decreased neurotoxic pollution (Nevin, 2000) and the expansion of the education system (Husén & Tuijnman, 1991; Teasdale & Owen, 1989; Tuddenham, 1948), are more plausible causes of the Flynn effect by comparison, since neither education nor neurotoxins seem to impact g (e.g., Christian, Bachnan, & Morrison, 2001; Lezak, 1983). However, there is substantial debate about which of the many proposed causes are predominantly involved in the effect, with different studies frequently indicating simultaneous contributions from multiple causes (e.g., Neisser, 1997; Williams, in press).

In this manuscript, we present a link between the Flynn effect and the retesting effect, i.e., the gain in IQ that accrues from retesting individuals on certain IQ tests. This link is the degree to which tests are reliant upon the repeated reapplication of solution rules, where rules are defined as specific procedures or pieces of information that can be consistently relied upon to locate solutions to specific problems. In essence, the more reliant a particular test is on the identification and repeated use of specific rule-sets, the bigger the Flynn and retesting effects. We consider the results of studies that have examined and formalized the use of rules in the solving of the Raven’s Progressive Matrices (RPM; Carpenter, Just, & Shell, 1990), and others which have found that performance improvements via familiarity with the RPM are related to increases in the efficiency with which individuals can successfully sample rules on this test (Verguts, Boeck, & Maris, 1999; Verguts & De Boeck, 2002). We connect these findings with the observation that this battery is especially sensitive to both Flynn and retesting effects, despite its high g loading. We also use this rule-dependence model to propose two subsidiary models concerning the ways in which specific sources of environmental improvement can translate into massive IQ gains. In conducting a test of the model, we infer the existence of a four-level typology into which any IQ test can be assigned based on the degree to which it is dependent upon the reapplication of specific rule-sets to problem solving, from least dependent (Level I) to most (Level IV). It is hypothesized that an IQ test’s position in this typology should be both positively and significantly correlated with the actual recorded size of the Flynn effect on that test. This is tested using real data on secular gains from 14 scales.
Finally, in the discussion, we consider the broader implications of this model in terms of testable predictions and the debate surrounding the meaningfulness of both the Flynn and retesting effects.
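The typology test outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The text does not say which correlation coefficient the authors use, so the rank-based (Spearman) coefficient here is our assumption, chosen because the typology levels are ordinal and contain many ties. The level assignments and gain values are hypothetical placeholders, not the 14 scales analyzed in the paper.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("a constant vector has no defined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / denominator


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of tie-averaged ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# HYPOTHETICAL data, invented for illustration only: one entry per test.
typology_levels = [1, 1, 2, 2, 3, 3, 4, 4]   # Level I (1) to Level IV (4)
secular_gains = [0.10, 0.15, 0.20, 0.30, 0.25, 0.35, 0.40, 0.50]  # gain per year

rho = spearman_rho(typology_levels, secular_gains)
print(f"rank correlation between typology level and gain: {rho:.2f}")
```

A positive coefficient would be in the direction the model predicts; with only a handful of tests, a significance test (omitted here) is needed before drawing any conclusion.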


