Missing Heritability: Much More Than You Wanted To Know
The Story So Far

The mid-20th century was the golden age of nurture. Psychoanalysis, behaviorism, and the spirit of the '60s convinced most experts that parents, peers, and propaganda were the most important causes of adult personality.

Starting in the 1970s, the pendulum swung the other way. Twin studies shocked the world by demonstrating that most behavioral traits - especially socially relevant traits like IQ - were substantially genetic. Typical estimates for adult IQ found it was about 60% genetic, 40% unpredictable, and barely related at all to parenting or family environment.

By the early 2000s, genetic science reached a point where scientists could start pinpointing the particular genes behind any given trait. Early candidate gene studies, which hoped to find single genes with substantial contributions to IQ, depression, or crime, mostly failed. They were replaced with genome-wide association studies (GWAS), which accepted that most interesting traits were polygenic - controlled by hundreds or thousands of genes - and trawled the whole genome searching for variants that might each explain 0.1% or even 0.01% of the pie. The goal shifted toward polygenic scores - algorithms that take thousands of genetic variants as input and spit out predictions of IQ, heart disease risk, or some other outcome of interest (a toy sketch of these mechanics appears at the end of this section).

The failed candidate gene studies had sample sizes in the three or four digits. The new genome-wide studies needed five or six digits to even get started. It was prohibitively difficult for individual studies to gather so many subjects, genotype them, and test them for the outcome of interest, so work shifted to big centralized genome repositories - most of all the UK Biobank - and to easy-to-measure traits.

Among the easiest of all was educational attainment (EA), ie how far someone had gotten in school. Were they a high school dropout? A PhD? Somewhere in between? This correlated with all the spicy outcomes of interest people wanted to debate - IQ, wealth, social class - while being objective and easy to ask about on a survey.

Twin studies suggested that IQ was about 60% genetic, and EA about 40%. This seemed to make sense at the time - how far someone gets in school depends partly on their intelligence, but partly on fuzzier social factors like class / culture / parenting.

The first genome-wide studies and polygenic scores found enough genes to explain 2 percentage points of this 40% pie. The remaining 38% - which twin studies deemed genetic, but where researchers couldn't find the genes - became known as "the missing heritability" or "the heritability gap".

Scientists came up with two hypotheses for the gap, which have been dueling ever since:

1. Maybe twin studies are wrong.
2. Maybe there are genes we haven't found yet.

For most of the 2010s, hypothesis 2 looked pretty good. Researchers gradually gathered bigger and bigger sample sizes, and found more and more of the missing heritability. A big 2018 study increased the predictive power of known genes from 2% to 10%. An even bigger 2022 study increased it to 14%, and the current state of the art is around 17%. Seems like it was sample size after all! Once the samples get big enough, we'll reach 40% and finally close the gap, right?

This post is the story of how that didn't happen, of the people trying to rehabilitate the twin-studies-are-wrong hypothesis, and of the current status of the debate.
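Since "polygenic score" will come up constantly below, here is the promised toy sketch of what a GWAS and a polygenic score actually compute: the GWAS estimates a tiny effect for each variant, one regression at a time, and the score is just a weighted sum of a person's allele counts using those estimates as weights. This is a minimal simulation under invented assumptions - the cohort size, effect magnitudes, and variable names are all made up - not anyone's real pipeline, which would adjust for ancestry, meta-analyze many cohorts, and validate out of sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cohort - all sizes and effect magnitudes here are invented.
n_people, n_variants = 5_000, 500

# Genotypes coded as 0/1/2 copies of the minor allele at each variant.
genotypes = rng.binomial(2, 0.3, size=(n_people, n_variants)).astype(float)

# A polygenic trait: hundreds of tiny true effects plus environmental noise.
true_effects = rng.normal(0.0, 0.05, size=n_variants)
trait = genotypes @ true_effects + rng.normal(0.0, 1.0, size=n_people)

# Step 1 (the GWAS): regress the trait on each variant separately and
# keep the estimated per-variant effect. Real GWAS also adjust for
# covariates like ancestry principal components, omitted here.
g_centered = genotypes - genotypes.mean(axis=0)
y_centered = trait - trait.mean()
est_effects = (g_centered * y_centered[:, None]).sum(axis=0) / (g_centered**2).sum(axis=0)

# Step 2 (the polygenic score): a weighted sum of each person's allele
# counts, using the GWAS effect estimates as weights.
pgs = genotypes @ est_effects

# Fraction of trait variance the score explains. (Evaluated in-sample
# here for brevity; real scores are validated in held-out cohorts.)
r_squared = np.corrcoef(pgs, trait)[0, 1] ** 2
print(f"variance explained by the polygenic score: {r_squared:.1%}")
```

The r² printed at the end is the quantity behind the figures in the story above: the "2 percentage points" of early scores and the 10%, 14%, and 17% that bigger samples later reached are all variance explained by a score of this general form, computed on real cohorts.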
Its most important influence/foil is Sasha Gusev, whose blog The Infinitesimal introduced me to the new anti-hereditarian movement and got me to research it further, but it's also inspired by Eric Turkheimer, Alex Young (not himself an anti-hereditarian, but his research helped ignite interest in this area), and Awais Aftab.

(While I was working on this draft, the East Hunter Substack wrote a similar post. Theirs is good and I recommend it, but I think this one adds enough that I'm publishing anyway.)