Madole & Harden (M&H) attempt to unravel the role of heredity in human behaviour by arguing that the methods of causal analysis can be applied to behavioural genetic data, thus establishing causal links between genes and behaviour. Their key argument is that “within-family genetic effects represent the product of a counterfactual comparison in the same way as average treatment effects from randomised controlled trials” (target article, abstract). Based on this argument, they “advance a framework for identifying, interpreting, and applying causal effects of genes on human behavior” (target article, abstract). While we agree with the authors that human behaviour genetics needs a sound foundation, we see at least three reasons why their proposed framework is not suitable for providing such a foundation.
The first reason is the inherent inconsistency of the proposed approach. M&H discuss whether and when behavioural genetic experiments meet four critical demands on randomised experiments. They argue that the first three demands (independence, sample homogeneity, potential exposability) can be met if the analysis is based on sibling studies, where siblings grow up in a common environment. In contrast, the fourth demand, SUTVA (stable unit treatment value assumption), requires that the siblings do not affect each other's behaviour, that is, that they grow up in different environments. The fourth demand (growing up in different environments) thus contradicts the first three (growing up in a common environment), so at least one of the demands will be violated in any genetic data set. This undermines M&H's argument that within-family genetic effects are comparable to the outcome of randomised controlled trials.
The second reason is an unfounded extrapolation from single-gene to genome-wide causation. The key argument in the target article is that Mendelian inheritance has properties similar to those of the randomisation procedure in controlled trials. Mendel's rules, however, apply to single genes or unlinked pairs of genes, while M&H are mainly interested in the causal analysis of genome-wide association studies (GWASs), where thousands of single-nucleotide polymorphisms (SNPs) are considered simultaneously.
M&H are aware of this problem, in which the physical linkage of genes on a chromosome results in the co-inheritance of alleles at linked loci and subsequent correlations across loci. Consequently, they propose to focus on “a set of alleles that are all in high linkage disequilibrium with each other (but not in linkage disequilibrium with other alleles)” (target article, sect. 3.2.1, para. 6). In this approach, it is crucial to identify such sets of alleles. If the physical linkage of gene loci were the sole (or most important) cause of linkage disequilibrium, the proposed method might be feasible, as the SNPs used in GWASs provide a physical linkage map that allows closely linked chromosomal regions to be identified. However, the term “linkage disequilibrium” is misleading. It suggests that “disequilibrium” (statistical association across loci) is mainly caused by physical linkage. Yet factors like natural and sexual selection, non-random mating, genetic drift, or gene flow can create considerable disequilibrium at unlinked loci, such as loci on different chromosomes (Hedrick, 2005). Alleles at different loci can, for example, become associated through selection if they produce a high-fitness genotype in combination (but not on their own).
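The last point can be illustrated with a minimal sketch, assuming a haploid two-locus model with hypothetical fitness values and allele frequencies (none of these numbers come from the target article or from data). Two loci start in equilibrium (D = 0) and are unlinked (recombination rate r = 0.5). Selection that acts on each allele independently (multiplicative fitness) leaves D at zero, whereas selection that favours only the combination AB creates disequilibrium, which a round of random mating halves but does not remove.

```python
# Haploid two-locus sketch: loci A and B are unlinked (r = 0.5).
# All fitness values and starting frequencies are illustrative assumptions.

def disequilibrium(x):
    """D = x_AB * x_ab - x_Ab * x_aB for haplotype frequencies x."""
    return x["AB"] * x["ab"] - x["Ab"] * x["aB"]

def select(x, w):
    """Apply one round of haploid selection with haplotype fitnesses w."""
    mean_w = sum(x[h] * w[h] for h in x)
    return {h: x[h] * w[h] / mean_w for h in x}

def recombine(x, r):
    """Random mating with recombination rate r reduces D to (1 - r) * D."""
    d = disequilibrium(x)
    return {
        "AB": x["AB"] - r * d,
        "Ab": x["Ab"] + r * d,
        "aB": x["aB"] + r * d,
        "ab": x["ab"] - r * d,
    }

# Start in equilibrium: allele frequencies 0.5 at both loci, D = 0.
start = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}

# Each allele helps on its own; effects multiply (no epistasis).
multiplicative = {"AB": 1.1 * 1.1, "Ab": 1.1, "aB": 1.1, "ab": 1.0}
# Only the combination AB has raised fitness (epistasis).
epistatic = {"AB": 1.5, "Ab": 1.0, "aB": 1.0, "ab": 1.0}

d_multiplicative = disequilibrium(select(start, multiplicative))
after_epistatic_selection = select(start, epistatic)
d_epistatic = disequilibrium(after_epistatic_selection)
d_epistatic_after_mating = disequilibrium(recombine(after_epistatic_selection, r=0.5))

print("D after multiplicative selection:", d_multiplicative)
print("D after epistatic selection:", d_epistatic)
print("D after epistatic selection and one round of mating:", d_epistatic_after_mating)
```

Because selection acts anew in every generation, the disequilibrium it generates is replenished even though recombination between unlinked loci keeps eroding it. A physical linkage map carries no information about this kind of association.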
Theoretical considerations suggest that such “epistatic effects” (statistical interactions between genotypes at two or more loci) are common. For example, the evolution of female preferences in sexual selection largely relies on the build-up of disequilibrium between sender and receiver genes (Kuijper, Pen, & Weissing, 2012). Regulatory networks (such as gene-regulatory networks, metabolic networks, or the immune network) are another important class of examples, as a large percentage of human genes are involved in such networks (Chatterjee & Ahituv, 2017). Genes underlying a regulatory network are functionally linked (through selection on the operation of the network) in intricate and unpredictable ways (Van Gestel & Weissing, 2016, 2018), and their epistatic interaction will likely result in linkage disequilibrium (even in the absence of physical linkage).
Controlled crossing experiments in animals indeed confirm ample disequilibrium caused by epistatic effects (Flint & Mackay, 2009; Mackay, 2014). Such experiments cannot be conducted on humans, but epistasis is likely to be common in our species too. The problem is that epistasis, and its associated disequilibrium, tends to remain hidden in GWASs (Mackay, 2014). This implies that a major source of statistical dependence remains invisible to the researcher, making it almost impossible to correct for linkage disequilibrium in the way suggested by M&H.
The third reason is based on previously documented pitfalls of the GWAS method. M&H have high expectations regarding the GWAS method, while this method is heavily criticised in other branches of genetics because of its low repeatability and its tendency to produce false positives (e.g., Marjoram, Zubair, & Nuzhdin, 2014; Zhou et al., 2020; Zuk, Hechter, Sunyaev, & Lander, 2012). Low repeatability is a major problem, as it indicates either that these studies have a limited ability to generalise (i.e., there are big differences between study populations in how genes cause behaviour) or that most results are actually artefacts of the model (false positives). The sensitivity of GWAS models can be set by the researcher, but this involves a difficult trade-off, especially when the method is used to find many genes with weak effects. When the sensitivity is low, only genes with strong effects can be found, which might result in a bias, as other, possibly important genes (with weaker effects) are missed. Conversely, setting the sensitivity high will result in many false positives, which might also lead to wrong conclusions. Even if the sensitivity is kept constant between studies, repeatability remains low. Statistical corrections can be added to increase repeatability, but these are generally limited in their success, as artefacts can still appear (e.g., Mills & Mathieson, 2022).
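The sensitivity trade-off can be made concrete with a rough sketch. It assumes, purely for illustration, that "sensitivity" corresponds to the per-SNP significance threshold of a two-sided z-test, that SNPs are tested independently, and that a weak true effect yields an expected test statistic of 3. The number of SNPs, the thresholds, and the effect size are all hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def expected_false_positives(n_null_snps, alpha):
    """Expected number of null SNPs passing a per-SNP threshold alpha."""
    return n_null_snps * alpha

def power(effect_z, alpha):
    """Probability that a two-sided z-test at level alpha detects a SNP
    whose test statistic has expected value effect_z."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - effect_z)
    lower_tail = STD_NORMAL.cdf(-z_crit - effect_z)
    return upper_tail + lower_tail

N_NULL_SNPS = 1_000_000   # hypothetical number of SNPs without any effect
WEAK_EFFECT_Z = 3.0       # hypothetical expected z-score of a weak true effect
STRICT_ALPHA = 5e-8       # low sensitivity (strict threshold)
LENIENT_ALPHA = 1e-4      # high sensitivity (lenient threshold)

fp_strict = expected_false_positives(N_NULL_SNPS, STRICT_ALPHA)
fp_lenient = expected_false_positives(N_NULL_SNPS, LENIENT_ALPHA)
power_strict = power(WEAK_EFFECT_Z, STRICT_ALPHA)
power_lenient = power(WEAK_EFFECT_Z, LENIENT_ALPHA)

print("strict threshold: expected false positives", fp_strict, "power", power_strict)
print("lenient threshold: expected false positives", fp_lenient, "power", power_lenient)
```

Under these assumptions, the strict threshold keeps false positives rare but almost never detects the weak effect, while the lenient threshold detects it more often at the cost of many more false positives. Real GWASs deviate from this sketch, for instance because SNPs are not independent.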
In conclusion, we argue that the causal framework proposed by M&H is not suited to understanding the effects of genes on behaviour. While we agree with the authors that human behaviour genetics needs a sound causal foundation, providing one remains a formidable challenge.
Financial support
MJB is supported by ALW-NWO Grant No. ALWOP.531.
Competing interest
None.