The contribution of dominance to the understanding of quantitative genetic variation

ROBIN WELLMANN; JÖRN BENNEWITZ

doi:10.1017/S0016672310000649

Published online by Cambridge University Press: 12 April 2011

ROBIN WELLMANN*: Affiliation:
Department of Animal Husbandry and Animal Breeding, University of Hohenheim, D-70599 Stuttgart, Germany
JÖRN BENNEWITZ: Affiliation:
Department of Animal Husbandry and Animal Breeding, University of Hohenheim, D-70599 Stuttgart, Germany
*Corresponding author: Department of Animal Husbandry and Animal Breeding, University of Hohenheim, D-70599 Stuttgart, Germany. e-mail: [email protected]

Summary

Knowledge of the genetic architecture of a quantitative trait is useful to adjust methods for the prediction of genomic breeding values and to discover the extent to which common assumptions in quantitative trait locus (QTL) mapping experiments and breeding value estimation are violated. It also affects our ability to predict the long-term response of selection. In this paper, we focus on additive and dominance effects of QTL. We derive formulae that can be used to estimate the number of QTLs that affect a quantitative trait and parameters of the distribution of their additive and dominance effects from variance components, inbreeding depression and results from QTL mapping experiments. It is shown that a lower bound for the number of QTLs depends on the ratio of squared inbreeding depression to dominance variance. That is, high inbreeding depression must be due to a sufficient number of QTLs because otherwise the dominance variance would exceed the true value. Moreover, the second moment of the dominance coefficient depends only on the ratio of dominance variance to additive variance and on the dependency between additive effects and dominance coefficients. This has implications on the relative frequency of overdominant alleles. It is also demonstrated how the expected number of large QTLs determines the shape of the distribution of additive effects. The formulae are applied to milk yield and productive life in Holstein cattle. Possible sources for a potential bias of the results are discussed.

Type: Research Papers
Information: Genetics Research, Volume 93, Issue 2, April 2011, pp. 139–154

DOI: https://doi.org/10.1017/S0016672310000649
Copyright: Copyright © Cambridge University Press 2011

1. Introduction

Quantitative traits can be controlled by many genes and environmental factors. One main concept for dealing with these traits is the infinitesimal model (Fisher, 1918; Goddard, 2001), which assumes an infinite number of genes, each with an infinitesimal effect. In fact, the number of genes is finite. Even though the assumptions of this model are violated, its use has yielded substantial genetic gain for many quantitative traits in livestock and plant breeding over the past few decades (Dekkers & Hospital, 2002). Research on the genetic architecture of quantitative traits and its deviation from the infinitesimal model has a long history; see e.g. Lynch & Walsh (1998), Mackay (2001) or Hill (2010) and references therein. Many results are trait specific and controversial. Knowledge of the number of quantitative trait loci (QTLs) and of the distribution of their effects for a particular trait would contribute to a deeper understanding of its genetic architecture. Furthermore, in animal breeding, genetic evaluation methods that use massive genetic marker data have become common in practice (Hayes et al., 2009). For the development of these models, as well as for the assessment of long-term genetic progress using this technology, it would be helpful to know the number of QTLs and the distribution of QTL effects (Goddard, 2009; Daetwyler et al., 2010).

Estimators of the number of QTLs have been proposed by Hayes & Goddard (2001) and Chamberlain et al. (2007). Hayes & Goddard (2001) estimated 50–100 segregating QTLs per trait in dairy cattle. Chamberlain et al. (2007) concluded that at least 30 QTLs are segregating. However, neither study could rule out the possibility that there are many more QTLs with small effects, because both relied on results from QTL mapping studies. Fine-mapping experiments have frequently resolved a mapped QTL into a series of additional QTLs (e.g. Mackay & Anholt, 2006), pointing towards larger numbers of QTLs. Support for a higher number of QTLs also comes from human height data (Visscher, 2008) and from genomic selection experiments. Luan et al. (2009), using 20 k single nucleotide polymorphisms (SNPs) for genomic breeding value estimation in dairy cattle, found that the method called G-BLUP produced similar or better results than Bayesian methods. This is usually only the case if many QTLs are segregating and the ratio of markers to QTLs is low (Meuwissen & Goddard, 2010). Several known estimators are not applicable to outbred populations, e.g. the Castle–Wright estimator or the estimator proposed by Otto & Jones (2000). Hill et al. (2008) noted that for a given rate of inbreeding depression, as the number of loci increases and the gene frequencies move towards 0 or 1, the dominance variance decreases towards zero. Consequently, the relationship between dominance variance and inbreeding depression, which is the mean decrease of the trait value when the inbreeding coefficient increases from 0 to 1, could be used to estimate the number of QTLs. Such an estimator would not rely on QTL mapping results but would need knowledge about inbreeding depression and dominance variance, which have been estimated for some populations and traits (Misztal, 1997).

A second main issue in the genetic architecture of quantitative traits is the distribution of QTL effects. In a meta-analysis of information from QTL mapping experiments, Hayes & Goddard (2001) found the distribution to be moderately leptokurtic, consistent with many genes of small effect and few of large effect. Some studies have reported double exponential distributions (Eyre-Walker & Keightley, 2007; Bennewitz & Meuwissen, 2010). For some traits, QTLs of large effect are known to segregate (e.g. the pleiotropic DGAT1 in Holstein cattle, see Grisart et al., 2004) even though the trait has been under selection for a long time. On the other hand, recent large-scale association studies conducted for human height revealed 54 variants which collectively explained only a few percent of the genetic variance of this highly heritable trait (Visscher, 2008), indicating that there are many QTLs with small effects and none with an exceptionally large effect for this trait. This suggests that the distribution of QTL effects is trait specific. QTL mapping studies often estimate how many QTLs with a ‘large’ effect are segregating. Naturally, these ‘large’ QTLs determine the distribution of the effects and could therefore be used to derive parameters of the distribution. However, the true number of segregating large QTLs may be even smaller than reported by QTL mapping studies due to publication bias, as negative results tend not to be published.

The distribution of the dominance coefficient was derived by Bennewitz & Meuwissen (2010) for meat production traits in pigs. The dominance coefficient $\delta_n = d_n/|a_n|$ depends on the dominance effect $d_n$ and on the absolute additive effect $|a_n|$, which is half the difference between the homozygous genotypic values. They postulated a normal distribution with a mean of $\mu_\Delta = 0{\cdot}193$ and a standard deviation of $\sigma_\Delta = 0{\cdot}312$ for segregating alleles. Thus, overdominance is a rare but not negligible event for these traits. However, if dominance is more important, e.g. for fitness-related traits, these figures might not hold and overdominance might play a more important role. For example, Ishikawa (2009) mapped an overdominant allele affecting body weight in mice at 6 weeks of age with a dominance coefficient of up to 6·6. Luo et al. (2001) found that overdominance is an important property of most loci associated with heterosis in rice. Rocha et al. (2004) found more directional dominance in fitness-related traits compared to growth or body composition traits in mice. García-Dorado et al. (1999) suggested in their review an average dominance coefficient of $0{\cdot}94 < \bar{\delta} < 0{\cdot}98$ ($0{\cdot}01 < \bar{h} = (1 - \bar{\delta})/2 < 0{\cdot}03$) for new lethal mutants and argued that the typical value $\bar{\delta} = 0{\cdot}2$ ($\bar{h} = 0{\cdot}4$) can be questioned for new non-severe deleterious mutations, but suggested $\bar{\delta} = 0{\cdot}8$ ($\bar{h} = 0{\cdot}1$). However, even if dominance is common, it does not necessarily cause much dominance variance due to the U-shaped distribution of allele frequencies (Hill et al., 2008).

Also important is the joint distribution of the additive effects $a_n$ of the mutant alleles, the dominance coefficients $\delta_n$ and the allele frequencies $p_n$. Several studies assume that they are independent, e.g. Bennewitz & Meuwissen (2010). But this is likely not true. Kacser & Burns (1981) argued that the relationship between enzyme activity and end-product is hyperbolic. Thus, if a high enzyme activity is needed to produce a large trait value, then it is expected that the allele that increases the trait value shows incomplete dominance (i.e. $\delta_n > 0$). Moreover, since a differentiable function is locally approximately linear, one would expect that alleles with small effect show little dominance (i.e. $\delta_n \approx 0$), although this is not always confirmed empirically, e.g. Caballero & Keightley (1994), Bennewitz & Meuwissen (2010). Thus, $|a_n|$ and $\delta_n$ should be positively correlated. Not only arguments based on metabolic pathways (Keightley, 1996) but also arguments based on selection point in this direction. New mutations that affect a trait that was under selection are most often deleterious and recessive. Recessivity of mutants arises as a side effect of the margin of safety built into most metabolic pathways. But the extent to which this safety margin results from natural selection remains controversial (Bourguet, 1999). Since selection on modifier alleles that act on heterozygote mutant alleles would be much stronger than that acting on homozygotes (Fisher, 1928) and since advantageous dominant alleles are less likely to be lost by random drift than recessive alleles with the same function, alleles that have been fixed are likely to be dominant over new mutant alleles. However, selection coefficients on modifier alleles would be extremely small (Wright, 1934), so modifier alleles may be of little importance. More important for the joint distribution of allelic effects and allele frequencies is the change of an allele frequency by one generation of selection, as it depends on the current allele frequency (Falconer & Mackay, 1996, p. 28). Since selection against a recessive deleterious allele becomes inefficient when the allele frequency $p_n$ is small, and since most mutant alleles are recessive and deleterious, one would expect that segregating deleterious alleles ($a_n < 0$) with small frequency ($p_n \approx 0$) tend to be recessive ($\delta_n \approx 1$). For the same reason, segregating advantageous alleles ($a_n > 0$) with high frequency ($p_n \approx 1$) tend to be dominant ($\delta_n \approx 1$). Due to the U-shaped distribution of allele frequencies, most allele frequencies are close to 0 or 1, so these arguments are valid for the majority of the alleles. Therefore, we likely have $p_n > 0{\cdot}5$ if ${\rm sign}(a_n) = {\rm sign}(\delta_n)$ and $p_n < 0{\cdot}5$ if ${\rm sign}(a_n) \ne {\rm sign}(\delta_n)$. All these arguments also confirm that $|a_n|$ and $\delta_n$ are likely positively correlated.

Table 1. Table of symbols

Allelic effects and allele frequencies affect the additive variance, the dominance variance and the inbreeding depression of a trait. However, dominance variance is not only determined by these quantities, but may be increased quite substantially by linkage disequilibrium as shown by Avery & Hill (1979). Given that alleles are coded as 0 and 1, the coefficient of linkage disequilibrium $D_{n,m}$ between loci $n$ and $m$ equals the covariance of the alleles, if they are chosen at random from the infinite gametic pool.

The objective of this paper was to find parameter settings for the simulation of a particular trait that account for dominance realistically and are consistent with literature reports. A further concern is the identification of genetic architectures for a particular trait that are consistent with available estimates of variance components and inbreeding depression. First, formulae were developed that quantify the contributions of different sources that affect the phenotypic variance. The minimum number of QTLs is calculated that is needed such that inbreeding depression on average does not cause more dominance variance than the true dominance variance of the trait. This novel estimator predicts a lower bound for the number of QTLs. Using the developed formulae, it is shown how the number of ‘large’ QTLs can be used to derive parameters of the distribution of QTL effects. Finally, the role of overdominance for cases where the ratio of dominance variance to additive genetic variance is high is investigated. The developed formulae are based on the availability of additive genetic variance as well as dominance variance components. They were applied using published variance components estimates for milk yield and productive life (PL) in Holstein dairy cattle.

2. Theory

This section is divided into several parts. In (i), formulae for the variance components are derived that account for linkage disequilibrium. Their expectations under a wide range of assumptions are derived in (ii). Part (iii) discusses the distribution of allele frequencies and linkage disequilibrium of alleles. In (iv), parameters that describe the joint distribution of additive effects and dominance coefficients are derived for different scenarios. Parameters that are estimated in (iii) and (iv) are needed to evaluate the formulae that are derived in later parts. In (v), formulae are derived to estimate the contributions of different sources to the additive and dominance variance. Lower bounds for the number of QTLs are presented in (vi). Implications for the importance of overdominance can be found in (vii). Upper bounds for the number of large QTLs are given in (viii). Part (ix) shows how the formulae can be used to estimate the number of QTLs, the variance of the additive effects, and the mean and variance of the dominance coefficients. For better readability, the proofs of the formulae are given in the electronic appendix (available at http://journals.cambridge.org/GRH). A list of symbols used in the paper is given in Table 1.

(i) Variance components and inbreeding depression

According to Falconer & Mackay (1996), the breeding value of an individual is the sum of all substitution effects that are carried by this individual, i.e. up to an additive constant the breeding value is

${\rm BV} = \sum_{n \in \mathcal{Q}} \alpha_n (v_n + m_n) \quad {\rm with}\ \alpha_n = a_n + d_n (q_n - p_n),$

where $v_n \in \{0, 1\}$ is the paternal and $m_n \in \{0, 1\}$ is the maternal allele of the individual at QTL $n$, and $\mathcal{Q}$ consists of all polymorphic QTLs. Here, it is assumed that all QTLs are biallelic with alleles 0 and 1. Moreover, we assume throughout the paper the absence of genetic interactions. The mutant allele at QTL $n$ has frequency $p_n$, additive effect $a_n$ and dominance effect $d_n$. The other allele has frequency $q_n = 1 - p_n$. The dominance deviation of the individual is

${\rm DV} = \sum_{n \in \mathcal{Q}} -2 d_n (v_n - p_n)(m_n - p_n),$

see Falconer & Mackay (1996, Table 7.3). The additive variance is the variance of the breeding value and the dominance variance is the variance of the dominance deviation of an individual whose parents are randomly chosen from the population. Here, the additive effects, the dominance effects and the allele frequencies are assumed to be fixed parameters and randomness arises from random sampling of the genotypes. That is, only the paternal and maternal alleles are random. Since maternal and paternal alleles are independent and identically distributed with ${\rm Var}(v_n) = {\rm Var}(m_n) = p_n q_n$ and $E(v_n) = E(m_n) = p_n$, we obtain (see the proofs in the electronic appendix)

(1)

$V_{\rm A} = {\rm Var}({\rm BV}) = \sum_{n \in \mathcal{Q}} h_n \alpha_n^2 + 2 \sum_{n < m \in \mathcal{Q}} 2 \alpha_n \alpha_m D_{n,m},$

where $h_n = 2 p_n q_n$ is the heterozygosity at locus $n$ in the case of Hardy–Weinberg equilibrium (HWE) and $D_{n,m} = {\rm Cov}(v_n, v_m) = {\rm Cov}(m_n, m_m)$ is the coefficient of linkage disequilibrium between loci $n$ and $m$. For the dominance variance, we obtain

(2)

$V_{\rm D} = {\rm Var}({\rm DV}) = \sum_{n \in \mathcal{Q}} h_n^2 d_n^2 + 2 \sum_{n < m \in \mathcal{Q}} 4 d_n d_m D_{n,m}^2.$
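Eqns (1) and (2) can be checked on a small case. The following is a minimal sketch, assuming two biallelic loci whose haplotype frequencies are parameterized by the allele frequencies and the linkage disequilibrium coefficient; the function names and all numerical values are hypothetical and chosen only for illustration. It enumerates every pair of paternal and maternal gametes and compares the exact variances of BV and DV with the two formulae.

```python
from itertools import product


def enumerate_variances(a, d, p, D):
    """Var(BV) and Var(DV) for two loci by enumerating all gamete pairs."""
    q = [1 - p[0], 1 - p[1]]
    alpha = [a[n] + d[n] * (q[n] - p[n]) for n in range(2)]
    # Haplotype frequencies; D = Cov(v_1, v_2) is the LD coefficient.
    hap = {
        (1, 1): p[0] * p[1] + D,
        (1, 0): p[0] * q[1] - D,
        (0, 1): q[0] * p[1] - D,
        (0, 0): q[0] * q[1] + D,
    }
    s_bv = s_bv2 = s_dv = s_dv2 = 0.0
    for (v, fv), (m, fm) in product(hap.items(), repeat=2):
        w = fv * fm  # paternal and maternal gametes are independent
        bv = sum(alpha[n] * (v[n] + m[n]) for n in range(2))
        dv = sum(-2 * d[n] * (v[n] - p[n]) * (m[n] - p[n]) for n in range(2))
        s_bv += w * bv
        s_bv2 += w * bv * bv
        s_dv += w * dv
        s_dv2 += w * dv * dv
    return s_bv2 - s_bv ** 2, s_dv2 - s_dv ** 2


def formula_variances(a, d, p, D):
    """V_A and V_D from eqns (1) and (2), specialized to two loci."""
    q = [1 - p[0], 1 - p[1]]
    h = [2 * p[n] * q[n] for n in range(2)]
    alpha = [a[n] + d[n] * (q[n] - p[n]) for n in range(2)]
    v_a = sum(h[n] * alpha[n] ** 2 for n in range(2)) + 2 * 2 * alpha[0] * alpha[1] * D
    v_d = sum(h[n] ** 2 * d[n] ** 2 for n in range(2)) + 2 * 4 * d[0] * d[1] * D ** 2
    return v_a, v_d


# Hypothetical effects, frequencies and LD coefficient (illustration only).
a, d, p, D = [1.0, -0.5], [0.4, 0.3], [0.3, 0.6], 0.05
print(enumerate_variances(a, d, p, D))
print(formula_variances(a, d, p, D))
```

Setting `D = 0` in the same call removes the second summands and leaves only the single-locus terms.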

Note that $V_{\rm A} + V_{\rm D}$ equals the genotypic variance within a line given by Avery & Hill (1979). If linkage disequilibrium is neglected, i.e. if $D_{n,m} = 0$ is assumed, then only the left summands in eqns (1) and (2) remain, which are equal to the well-known formulae given in Falconer & Mackay (1996). We assume that loci from different chromosomes are not in linkage disequilibrium, but loci from the same chromosome may be linked. Take $\mathcal{C}$ to be the set of chromosomes and for $c \in \mathcal{C}$ let $\mathcal{Q}_c$ denote the set of polymorphic QTLs on chromosome $c$. Then we have

(3)

$V_{\rm A} = \sum_{n \in \mathcal{Q}} h_n \alpha_n^2 + 2 \sum_{c \in \mathcal{C}} \sum_{n < m \in \mathcal{Q}_c} 2 \alpha_n \alpha_m D_{n,m}$

and

(4)

$V_{\rm D} = \sum_{n \in \mathcal{Q}} h_n^2 d_n^2 + 2 \sum_{c \in \mathcal{C}} \sum_{n < m \in \mathcal{Q}_c} 4 d_n d_m D_{n,m}^2.$

The first summand in eqn (3) equals

$\begin{aligned} \sum_{n \in \mathcal{Q}} h_n \alpha_n^2 = {} & \sum_{n \in \mathcal{Q}} h_n a_n^2 + 2 \sum_{n \in \mathcal{Q}} h_n (q_n - p_n) a_n d_n \\ & + \sum_{n \in \mathcal{Q}} h_n (q_n - p_n)^2 d_n^2. \end{aligned}$

Therefore, we can write

$V_{\rm A} = V_{\rm A}^{a} + V_{\rm A}^{ad} + V_{\rm A}^{d} + V_{\rm A}^{\rm LD},$

where $V_{\rm A}^{a} = \sum_{n \in \mathcal{Q}} h_n a_n^2$ and $V_{\rm A}^{d} = \sum_{n \in \mathcal{Q}} h_n (q_n - p_n)^2 d_n^2$ are the contributions of single additive effects and dominance effects to $V_{\rm A}$, respectively. $V_{\rm A}^{ad} = \sum_{n \in \mathcal{Q}} 2 h_n (q_n - p_n) a_n d_n$ is a correction term that can be negative. Finally, $V_{\rm A}^{\rm LD} = 2 \sum_{c \in \mathcal{C}} \sum_{n < m \in \mathcal{Q}_c} 2 \alpha_n \alpha_m D_{n,m}$ is the contribution that comes from covariances between linked loci. Similarly, we can write

$V_{\rm D} = V_{\rm D}^{d} + V_{\rm D}^{\rm LD},$

where $V_{\rm D}^{d} = \sum_{n \in \mathcal{Q}} h_n^2 d_n^2$ is the contribution of single dominance effects to $V_{\rm D}$ and $V_{\rm D}^{\rm LD} = 2 \sum_{c \in \mathcal{C}} \sum_{n < m \in \mathcal{Q}_c} 4 d_n d_m D_{n,m}^2$ is the contribution that comes from covariances between linked loci.
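The single-locus part of this decomposition is easy to evaluate for a given set of loci. The sketch below assumes linkage equilibrium (so the LD terms are omitted) and uses a hypothetical function name and hypothetical parameter values. It returns the three single-locus contributions to the additive variance together with the direct sum of $h_n \alpha_n^2$, so that the two can be compared.

```python
def additive_variance_parts(a, d, p):
    """Single-locus contributions to V_A (LD terms omitted)."""
    parts = {"a": 0.0, "ad": 0.0, "d": 0.0}
    total = 0.0
    for a_n, d_n, p_n in zip(a, d, p):
        q_n = 1 - p_n
        h_n = 2 * p_n * q_n                      # heterozygosity under HWE
        alpha_n = a_n + d_n * (q_n - p_n)        # substitution effect
        parts["a"] += h_n * a_n ** 2
        parts["ad"] += 2 * h_n * (q_n - p_n) * a_n * d_n
        parts["d"] += h_n * (q_n - p_n) ** 2 * d_n ** 2
        total += h_n * alpha_n ** 2
    return parts, total


# Hypothetical effects and frequencies for three loci (illustration only).
a = [1.0, -0.5, 0.2]
d = [0.4, 0.3, 0.1]
p = [0.3, 0.6, 0.9]
parts, total = additive_variance_parts(a, d, p)
print(parts, total)
```

The `"ad"` entry is the only one of the three that can take a negative value, since the other two are sums of non-negative terms.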

The expected genotypic value of an individual that is randomly chosen from the population in HWE is

$G_{\rm HWE} = \mu + \sum_{n \in \mathcal{Q}} \left( 0 \cdot q_n^2 + (a_n + d_n)\, 2 p_n q_n + 2 a_n p_n^2 \right).$

Assume that an inbred line is established from two individuals that are randomly chosen from the population. Individuals from this line carry the genotype 00 at locus $n$ with probability $q_n$ and the genotype 11 with probability $p_n$. Thus, the expected genotypic value of a completely inbred individual is

$G_{\rm inbred} = \mu + \sum_{n \in \mathcal{Q}} \left( p_n \cdot 2 a_n + q_n \cdot 0 \right).$

The inbreeding depression, i.e. the expected decrease of the genotypic value when the inbreeding coefficient increases from 0 to 1, is therefore

$\mathcal{I} = G_{\rm HWE} - G_{\rm inbred} = \sum_{n \in \mathcal{Q}} h_n d_n.$
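The step from the two expected genotypic values to the inbreeding depression can be reproduced numerically. This is a small sketch with a hypothetical function name and hypothetical parameter values; it builds both expectations locus by locus, using the genotypic values 0, $a_n + d_n$ and $2a_n$ from the text, and returns their difference.

```python
def inbreeding_depression(a, d, p, mu=0.0):
    """G_HWE - G_inbred, computed from the two expected genotypic values."""
    g_hwe = mu
    g_inbred = mu
    for a_n, d_n, p_n in zip(a, d, p):
        q_n = 1 - p_n
        # Genotypes 00, 01/10, 11 have values 0, a_n + d_n, 2 a_n.
        g_hwe += 0 * q_n ** 2 + (a_n + d_n) * 2 * p_n * q_n + 2 * a_n * p_n ** 2
        # A fully inbred individual is 11 with probability p_n, else 00.
        g_inbred += p_n * 2 * a_n + q_n * 0
    return g_hwe - g_inbred


# Hypothetical effects and frequencies (illustration only).
a = [1.0, -0.5, 0.2]
d = [0.4, 0.3, 0.1]
p = [0.3, 0.6, 0.9]
depression = inbreeding_depression(a, d, p)
closed_form = sum(2 * p_n * (1 - p_n) * d_n for d_n, p_n in zip(d, p))
print(depression, closed_form)
```

The additive effects cancel in the difference, so only the heterozygosities and the dominance effects remain.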

(ii) Expectations of variance components and inbreeding depression

In this paper, we treat a population as if it were a random outcome of a simulation study, i.e. a population is viewed as a realization from all hypothetical populations. Therefore, not only the genotypes are random but also the additive effects, the dominance effects, the allele frequencies and the coefficients of linkage disequilibrium for each pair of loci. Alternatively, randomness of the additive and dominance effects could be due to uncertainty about the true effects (Gianola et al., 2009). In both cases, $V_{\rm A}$ and $V_{\rm D}$ are random variables. In our setting, $V_{\rm A}$ and $V_{\rm D}$ are treated as random since they would attain different values in each hypothetical replicated population. Their expectations are of interest in order to relate parameters of the distributions of additive effects and dominance coefficients to the expected variance components.

For $n \in \mathcal{Q}$, the additive effect $a_n$ of the mutant allele has mean $\mu_{\rm A}$ and variance $\sigma_{\rm A}^2$, and the dominance effect $d_n$ has mean $\mu_{\rm D}$ and variance $\sigma_{\rm D}^2$. The dominance coefficient $\delta_n = d_n/|a_n|$ has mean $\mu_\Delta$ and variance $\sigma_\Delta^2$. Take $\mathcal{N}_c$ to be the set of nucleotides on chromosome $c$. The probability that a nucleotide is a segregating QTL is equal for all nucleotides, and nucleotides become QTLs independently of each other. For simplicity, we assume that all chromosomes have equal length. The expected number of segregating QTLs on one chromosome is therefore $Q/C$, where $Q = E(\#\mathcal{Q})$ is the expected total number of segregating QTLs and $C = \#\mathcal{C}$ is the number of chromosomes.

We derive expectations of the variance components under a wide range of assumptions. That is, we first derive equations using only a small number of assumptions and successively include more of them. The assumptions are

(A1) $(p_n, a_n, d_n)$ and $(p_m, a_m, d_m)$ are identically distributed,
(A2) allelic effects are independent from the allelic effects and the allele frequencies at other loci and linkage disequilibrium does not depend on the allelic effects (see below),
(A3) $D_{\rm LD}^{1,0,0} = D_{\rm LD}^{1,0,1} = D_{\rm LD}^{1,1,0} = 0$ (see below),
(A4) dominance effect $d_n$ and heterozygosity $h_n$ are independent,
(A5) $(|a_n|, d_n)$ and heterozygosity $h_n$ are independent,
(A6)
(a) $a_n \mid \delta_n, p_n$ has a symmetrical distribution with mean $\mu_{\rm A} = 0$, or
(b) $p_n > 0{\cdot}5$ if ${\rm sign}(a_n) = {\rm sign}(\delta_n)$ and $p_n < 0{\cdot}5$ if ${\rm sign}(a_n) \ne {\rm sign}(\delta_n)$, i.e. ${\rm sign}(a_n) = -{\rm sign}((q_n - p_n)\delta_n)$ (see the introduction),

given $n \ne m \in \mathcal{Q}$. Note that additive effects and dominance coefficients may be dependent. By using only (A1) we obtain

(5)

$\tilde{V}_{\rm A} = E(V_{\rm A}) = \tilde{V}_{\rm A}^{a} + \tilde{V}_{\rm A}^{d} + \tilde{V}_{\rm A}^{ad} + \tilde{V}_{\rm A}^{\rm LD},$

$\tilde{V}_{\rm D} = E(V_{\rm D}) = \tilde{V}_{\rm D}^{d} + \tilde{V}_{\rm D}^{\rm LD},$

$\tilde{\mathcal{I}} = E(\mathcal{I}) = Q\, E(h_n d_n \mid n \in \mathcal{Q}),$

where

(6)

$\tilde{V}_{\rm A}^{a} = Q\, E(h_n a_n^2 \mid n \in \mathcal{Q}),$

$\tilde{V}_{\rm A}^{d} = Q\, E(h_n (q_n - p_n)^2 d_n^2 \mid n \in \mathcal{Q}),$

$\tilde{V}_{\rm A}^{ad} = 2Q\, E(h_n (q_n - p_n) a_n d_n \mid n \in \mathcal{Q}),$

$\tilde{V}_{\rm D}^{d} = Q\, E(h_n^2 d_n^2 \mid n \in \mathcal{Q}),$

$\tilde{V}_{\rm A}^{\rm LD} = \frac{2Q^2}{C} \frac{1}{\#\mathcal{N}_c^{2}} \sum_{n \ne m \in \mathcal{N}_c} E(\alpha_n \alpha_m D_{n,m} \mid n, m \in \mathcal{Q})$

and

$\tilde{V}_{\rm D}^{\rm LD} = \frac{4Q^2}{C} \frac{1}{\#\mathcal{N}_c^{2}} \sum_{n \ne m \in \mathcal{N}_c} E(d_n d_m D_{n,m}^2 \mid n, m \in \mathcal{Q}).$

In most of the following formulae, we suppress the condition that the SNPs are polymorphic QTLs in order to improve the readability.

Note that $a_n$ denotes the effect of the mutant allele. However, in real populations it is unknown which allele is the mutant allele. One would rather use an arbitrary coding of the alleles. Take $a'_n$ to be the additive effect of the arbitrarily coded allele. We have $a'_n = a_n$ or $a'_n = -a_n$, each with probability 0·5. Although $|a'_n| = |a_n|$, the distributions of both random variables are different. Whereas the effect of the mutant allele may have a non-zero mean $\mu_{\rm A}$, the effect $a'_n$ of the arbitrarily coded allele has a symmetrical distribution with mean $\mu_{{\rm A}'} = 0$ and variance $\sigma_{{\rm A}'}^2 = E(a_n^2)$. Since $a'_n$ has mean 0, we can write $E(|a_n|) = E(|a'_n|) = \lambda \sigma_{{\rm A}'}$, where the parameter $\lambda$ depends on the standardized distribution of $a'_n$.
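To make the role of $\lambda$ concrete, the sketch below estimates it from simulated effects. The function name, the seed, the sample sizes and the chosen distributions are assumptions made for illustration only. It relies on two standard results: $\lambda = \sqrt{2/\pi}$ for a zero-mean normal distribution and $\lambda = 1/\sqrt{2}$ for a zero-mean double exponential distribution. It also shows that recoding the alleles at random leaves $\lambda$ unchanged, because only $|a_n|$ and $a_n^2$ enter.

```python
import math
import random


def lambda_hat(effects):
    """Estimate lambda = E|a'| / sigma_A', with sigma_A'^2 = E(a^2)."""
    n = len(effects)
    mean_abs = sum(abs(x) for x in effects) / n
    second_moment = sum(x * x for x in effects) / n
    return mean_abs / math.sqrt(second_moment)


rng = random.Random(1)
n_loci = 100000

# Mutant-allele effects with a non-zero mean (hypothetical values).
a_mutant = [rng.gauss(-0.3, 1.0) for _ in range(n_loci)]
# Arbitrary coding: keep or flip the sign, each with probability 0.5.
a_prime = [x if rng.random() < 0.5 else -x for x in a_mutant]

# Zero-mean normal and double exponential (Laplace) effects.
normal_sample = [rng.gauss(0.0, 2.0) for _ in range(n_loci)]
laplace_sample = [rng.expovariate(1.0) * rng.choice((-1, 1)) for _ in range(n_loci)]

print(lambda_hat(a_mutant), lambda_hat(a_prime))
print(lambda_hat(normal_sample), math.sqrt(2 / math.pi))
print(lambda_hat(laplace_sample), 1 / math.sqrt(2))
```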

Let $D_{n,m}^{i,j,k} = D_{n,m}^{i} (q_n - p_n)^{j} (q_m - p_m)^{k}$, which is needed to simplify $\tilde{V}_{\rm A}^{\rm LD}$. Now we assume additionally that linkage disequilibrium does not depend on the allelic effects and that allelic effects at different loci are independent, i.e. $a_n$ is independent from $(a_m, d_m, p_m, D_{n,m})$ and $d_n$ is independent from $(a_m, d_m, p_m, D_{n,m}, D_{n,m}^{1,1,1})$, given $n \ne m \in \mathcal{Q}$. With (A1) and (A2) we obtain

(7)

$\tilde{V}_{\rm A}^{\rm LD} = 2 \frac{Q^2}{C} \left( \mu_{\rm A}^2 D_{\rm LD}^{1,0,0} + 2 \mu_{\rm A} \mu_{\rm D} D_{\rm LD}^{1,0,1} + \mu_{\rm D}^2 D_{\rm LD}^{1,1,1} \right),$

$\tilde{V}_{\rm D}^{\rm LD} = 4 \frac{Q^2}{C} \mu_{\rm D}^2 D_{\rm LD}^{2,0,0},$

where

$D_{\rm LD}^{i,j,k} = \frac{1}{\#\mathcal{N}_c^{2}} \sum_{n \ne m \in \mathcal{N}_c} E\left( D_{n,m}^{i,j,k} \mid n, m \in \mathcal{Q} \right).$

Roughly speaking, $E(D_{n,m}^{2,0,0} \mid n, m \in \mathcal{Q})$ would be the mean $D^2$-value for loci $n$ and $m$, averaged over many hypothetical populations for which $n$ and $m$ are polymorphic QTLs. If these expected $D^2$-values are averaged over all pairs of loci on one chromosome, then $D_{\rm LD}^{2,0,0}$ is obtained. Simulations of neutral alleles showed that likely $D_{\rm LD}^{1,0,0} = D_{\rm LD}^{1,0,1} = D_{\rm LD}^{1,1,0} = 0$ but $D_{\rm LD}^{1,1,1} \ne 0$ (A3), so this will be assumed in the rest of this paper. Thus, linkage disequilibrium contributes not only to the dominance variance but also to the additive variance. If there are many QTLs per chromosome that affect the trait, the contributions from single loci can in principle be much smaller than the contributions that arise from covariances of linked loci, since the latter increase quadratically with the number of QTLs. If additionally dominance effect and heterozygosity are independent (A4), then

(8)

$\tilde{V}_{\rm A}^{d} = Q\, E(h_n) (\mu_{\rm D}^2 + \sigma_{\rm D}^2)\, \gamma,$

$\tilde{V}_{\rm D}^{d} = Q\, E(h_n) (\mu_{\rm D}^2 + \sigma_{\rm D}^2)\, \frac{1 - \gamma}{2},$

$\tilde{\mathcal{I}} = Q\, E(h_n)\, \mu_{\rm D},$

where $\gamma = E(h_n (q_n - p_n)^2)/E(h_n) = 1 - 2E(h_n^2)/E(h_n)$. The last equation in (8) shows that inbreeding depression occurs only if dominance effects do not have mean zero. It may be more convenient to express the formulae for $\tilde{V}_{\rm A}^{d}$, $\tilde{V}_{\rm D}^{d}$ and $\tilde{\mathcal{I}}$ using moments of the dominance coefficients rather than moments of the dominance effects. These can be obtained from eqn (8) with

(9)

$\mu_{\rm D} = \mu_\Delta \lambda \sigma_{{\rm A}'} (1 + r_1),$

$\mu_{\rm D}^2 + \sigma_{\rm D}^2 = (\mu_\Delta^2 + \sigma_\Delta^2)\, \sigma_{{\rm A}'}^2\, (1 + r_2),$

where

$r_j = \frac{{\rm Cov}(\delta_n^j, |a_n|^j)}{E(\delta_n^j)\, E(|a_n|^j)} \quad {\rm for}\ j = 1, 2.$

The scale invariant parameters $r_1$ and $r_2$ characterize the dependency between additive effects and dominance coefficients. Scale invariance means in particular that $r_1$ and $r_2$ do not depend on the variance of the additive effects. For biological reasons, we expect that $r_1, r_2 \geqslant 0$. If additive effects and dominance coefficients are independent, then $r_1 = r_2 = 0$.
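The parameters $r_1$ and $r_2$ and the relations in eqn (9) can be illustrated with sample moments. In the sketch below, the joint distribution of $a_n$ and $\delta_n$ is a hypothetical construction in which larger absolute effects tend to have larger dominance coefficients; the function names, seed and coefficients are assumptions. Since $d_n = \delta_n |a_n|$, $\lambda \sigma_{{\rm A}'} = E(|a_n|)$ and $\sigma_{{\rm A}'}^2 = E(a_n^2)$, both lines of eqn (9) hold exactly for sample moments when the covariance is computed with divisor $n$.

```python
import random


def mean(values):
    return sum(values) / len(values)


def r_j(delta, a, j):
    """Sample version of r_j = Cov(delta^j, |a|^j) / (E(delta^j) E(|a|^j))."""
    x = [dl ** j for dl in delta]
    y = [abs(an) ** j for an in a]
    cov = mean([xi * yi for xi, yi in zip(x, y)]) - mean(x) * mean(y)
    return cov / (mean(x) * mean(y))


rng = random.Random(7)
# Hypothetical dependence between |a_n| and delta_n (illustration only).
a = [rng.gauss(0.0, 1.0) for _ in range(50000)]
delta = [0.1 + 0.2 * abs(an) + rng.gauss(0.0, 0.3) for an in a]
d = [dl * abs(an) for dl, an in zip(delta, a)]  # d_n = delta_n |a_n|

r1, r2 = r_j(delta, a, 1), r_j(delta, a, 2)

# Left and right sides of the two relations in eqn (9).
mu_d_lhs = mean(d)
mu_d_rhs = mean(delta) * mean([abs(an) for an in a]) * (1 + r1)
m2_d_lhs = mean([dn ** 2 for dn in d])
m2_d_rhs = mean([dl ** 2 for dl in delta]) * mean([an ** 2 for an in a]) * (1 + r2)

# Scale invariance: rescaling all additive effects leaves r_j unchanged.
a_scaled = [3.0 * an for an in a]
print(r1, r_j(delta, a_scaled, 1))
print(r2, r_j(delta, a_scaled, 2))
```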

Now we assume additionally that the absolute additive effect and the heterozygosity are independent (A5). We have

(10)

$\tilde{V}_{\rm A}^{a} = Q\, \sigma_{{\rm A}'}^2\, E(h_n).$
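Eqns (8) to (10) can be combined into a small plug-in calculation. The sketch below is only an illustration: the function name, $Q$, $\sigma_{{\rm A}'}$ and the choice $r_1 = r_2 = 0$ are hypothetical, and $\lambda = \sqrt{2/\pi}$ assumes normally distributed effects of the arbitrarily coded allele. The values $\mu_\Delta = 0{\cdot}193$ and $\sigma_\Delta = 0{\cdot}312$ are those quoted in the introduction for pigs, and $E(h_n) = 0{\cdot}19$ and $\gamma = 0{\cdot}28$ are the estimates reported in Part (iii).

```python
import math


def expected_components(Q, sigma_a, mu_delta, sigma_delta, lam, r1, r2, e_h, gamma):
    """Single-locus expectations from eqns (8) to (10)."""
    mu_d = mu_delta * lam * sigma_a * (1 + r1)                          # eqn (9)
    m2_d = (mu_delta ** 2 + sigma_delta ** 2) * sigma_a ** 2 * (1 + r2)  # eqn (9)
    return {
        "VA_a": Q * sigma_a ** 2 * e_h,              # eqn (10)
        "VA_d": Q * e_h * m2_d * gamma,              # eqn (8)
        "VD_d": Q * e_h * m2_d * (1 - gamma) / 2,    # eqn (8)
        "I": Q * e_h * mu_d,                         # eqn (8)
    }


# Hypothetical Q and sigma_a; independence (r1 = r2 = 0) is an assumption.
settings = dict(mu_delta=0.193, sigma_delta=0.312, lam=math.sqrt(2 / math.pi),
                r1=0.0, r2=0.0, e_h=0.19, gamma=0.28)
out = expected_components(Q=1000, sigma_a=1.0, **settings)
out_rescaled = expected_components(Q=50, sigma_a=3.0, **settings)
print(out)
print(out["VD_d"] / out["VA_a"], out_rescaled["VD_d"] / out_rescaled["VA_a"])
```

Under these assumptions the ratio of the two single-locus terms `VD_d` and `VA_a` does not change with $Q$ or $\sigma_{{\rm A}'}$; it is determined by $\mu_\Delta^2 + \sigma_\Delta^2$, $r_2$ and $\gamma$.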

In simulation studies, the additive effect $a_n$ often has a symmetrical distribution with mean 0 given $\delta_n$ and $p_n$ (A6a). In this case, we have

(11)

$\tilde{V}_{\rm A}^{ad} = 0.$

However, we believe that (A6a) is an unrealistic assumption for real populations for several reasons. Most importantly, selection acts mainly on the additive component of genotypic values. Therefore, one would expect that in a long-term selected population alleles with small contribution to the additive variance are overrepresented. The contribution of QTL $n$ to the additive variance depends on $|\alpha_n| = |a_n + (q_n - p_n) d_n|$, which is small if $a_n$ and $(q_n - p_n) d_n$ have opposite signs, which is the condition stated in (A6b). Although (A6b) is also violated in real populations for some alleles, it likely holds for the majority of the alleles, as pointed out in the introduction. It not only provides more realistic results than (A6a), but can also easily be accounted for in simulation studies by simply choosing the sign of $a_n$ as ${\rm sign}(a_n) = -{\rm sign}((q_n - p_n)\delta_n)$. Under (A6b) we have

(12)

$\tilde{V}_{\rm A}^{ad} = -2\, \tilde{V}_{\rm A}^{a}\, E(|\delta_n|)\, (1 + r_{2,1})\, \frac{E(h_n \sqrt{1 - 2h_n})}{E(h_n)},$

where

$r_{2,1} = \frac{{\rm Cov}(a_n^2, |\delta_n|)}{\sigma_{{\rm A}'}^2\, E(|\delta_n|)}.$

We can write $E(|\delta_n|) = \mu_\Delta K$, where $K$ depends on the coefficient of variation $c_v = \sigma_\Delta/\mu_\Delta$.

Several parameters appeared in these equations that need to be estimated for the application of formulae. The next part provides estimates of parameters that are related to heterozygosity and linkage disequilibrium. In Part (iv), parameters that characterize the joint distribution of additive effect and dominance coefficient are calculated for different scenarios.

(iii) Heterozygosity and linkage disequilibrium

In accordance with the diffusion approximation (Crow & Kimura, 1970) and Hill et al. (2008), the frequency p of a segregating neutral mutant allele has the density f(p)=1/(κp) on the interval [1/(2N), 1−1/(2N)] if mutations are rare and the population is in mutation–drift equilibrium. Since f(p) is a density, we have κ=ln(2N−1), where N is the population size. Take p _k=P (k<p _n<1−k) to be the probability that a segregating allele has minor allele frequency (MAF) larger than k, where 1/(2N)⩽k⩽0·5. We have

(13)

$\gamma \equals {1 \over 3}{{\lpar N \minus 1\rpar ^{\setnum{2}} } \over {N^{\setnum{2}} }}\comma \quad E\,\lpar h_{n} \rpar \equals {1 \over \kappa }{{N \minus 1} \over N}\comma \quad {\rm and}\quad p_{k} \equals {{{\rm ln}\lpar \lpar 1\sol k\rpar \minus 1\rpar } \over \kappa }.$

Note that the population size is usually so large that the factors (N−1)²/N ² and (N−1)/N can be ignored. The effective population size is assumed to be constant in eqn (13). But for most livestock populations, the effective population size decreased in the past. In a small bottlenecked population, more alleles with extreme frequencies become lost than new mutations arise and the remaining alleles more likely have intermediate frequencies. Thus, the mean heterozygosity of segregating alleles would be larger. On the other hand, the population size is usually much larger than the effective population size, so many new mutations arise each generation and these new mutations decrease the mean heterozygosity of segregating alleles. In order to get estimates of γ, E (h _n) and p _k for a bottlenecked population, we simulated a population for which the effective population size decreased from N _e=1000 to N _e=100 within 400 generations in accordance with the results of Villa-Angulo et al. (2009). The total population size remained constant with N=1000, which was achieved by an unequal number of males and females. We obtained the estimates $\widehat {E}\,\lpar h_{n} \rpar \equals 0 {\cdot} 19$ , $\widehat {E}\,\lpar h_{n} \sqrt {1 \minus 2h_{n} } \rpar \equals 0 {\cdot} 087$ , $\widehat {\gamma} \equals 0 {\cdot} 28$ and ${\widehat {p}}_{k} \equals 0 {\cdot} 71$ , i.e. 71% of the segregating alleles had an MAF larger than k=0·01. We also obtained estimates for the LD from the simulated population resulting in ${\widehat {D}}_{{\rm LD}}^{\vskip2pt\hskip0.5pt \setnum{1}\comma \setnum{1}\comma \setnum{1}} \equals 2 {\cdot} 4 \times 10^{ \minus \setnum{5}}$ and ${\widehat {D}}_{\rm LD}^{\vskip2pt\hskip0.5pt \setnum{2}\comma \setnum{0}\comma \setnum{0}} \equals 1 {\cdot} 3 \times 10^{ \minus \setnum{4}}$ . These values are used in the examples. They are needed to apply the formulae that were derived in the previous section. For a population in mutation–selection equilibrium a more extreme L- or U-shaped distribution may be expected. The shape of this distribution would affect these parameters, but not the validity of the formulae.
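As an illustration, the closed-form values in eqn (13) can be cross-checked by numerical integration. The following Python sketch is not part of the original analysis: the function names are hypothetical, the heterozygosity is assumed to be h_n=2p_n(1−p_n), and γ is recovered through the identity (1−γ)/(2E(h_n))=E(h_n²)/E(h_n)² that appears in Part (vi).

```python
import math

def neutral_allele_parameters(N, k=0.01):
    """Closed-form values of eqn (13) for the density f(p) = 1/(kappa*p)."""
    kappa = math.log(2 * N - 1)
    gamma = (N - 1) ** 2 / (3 * N ** 2)
    mean_h = (N - 1) / (kappa * N)
    p_k = math.log(1 / k - 1) / kappa
    return kappa, gamma, mean_h, p_k

def integrate_density(g, N, steps=200_000):
    """Midpoint-rule approximation of E(g(p)) on [1/(2N), 1 - 1/(2N)]."""
    kappa = math.log(2 * N - 1)
    lo, hi = 1 / (2 * N), 1 - 1 / (2 * N)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        p = lo + (i + 0.5) * width
        total += g(p) / (kappa * p) * width
    return total

def heterozygosity(p):
    # Assumption: h = 2p(1-p), so that sqrt(1 - 2h) = |1 - 2p|.
    return 2 * p * (1 - p)

N = 1000
kappa, gamma, mean_h, p_k = neutral_allele_parameters(N)
mean_h_num = integrate_density(heterozygosity, N)
mean_h2_num = integrate_density(lambda p: heterozygosity(p) ** 2, N)
# Identity from Part (vi): 1 - gamma = 2 E(h^2) / E(h)
gamma_num = 1 - 2 * mean_h2_num / mean_h_num
print(f"kappa={kappa:.3f}  E(h)={mean_h:.4f}  gamma={gamma:.4f}  p_k={p_k:.3f}")
```

These values refer to the neutral equilibrium model only; the estimates reported above for the bottlenecked population come from the authors' simulation and are not reproduced by this sketch.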

(iv) Dependency between additive effect and dominance coefficient

The formulae that are derived could be applied by assuming r ₁=r ₂=r _2,1=0 which holds if additive effect and dominance coefficient are independent. However, the literature suggests that they are not independent. Therefore, we consider different possibilities to model dependent effects and for these scenarios we give estimates for the scale invariant parameters c _v, r ₁, r ₂, r _2,1, λ and K. In all scenarios, |a _n| and δ_n are positively correlated.

Because of the scale invariance of all relevant parameters, the distributions of a′_n and δ_n need to be specified only up to constant factors that will be estimated in the next sections. That is, we only need to specify the joint distribution of the random variables ã _n and δ̃ _n. These are multiplied by constant factors to obtain a′_n and δ_n. The parameters c _v, r ₁, r ₂, r _2,1, λ and K for (a _n, δ_n) are exactly the same as for (ã _n, δ̃ _n). In scenarios (1), (3) and (4), mean and standard deviation of δ̃ _n were chosen as 0·2 and 0·3, as suggested by Bennewitz & Meuwissen (2010).

Scenario (1) assumes that δ_n and a′_n are independent with

${\tilde{\delta }}_{n} \sim {\cal N}\lpar 0{\cdot}2 \comma 0{\cdot}3^{2} \rpar \quad {\rm and}\quad {\tilde{a}}_{n} \sim {\cal L}\lpar 0\comma 1\rpar \comma$

where ${\cal L}\lpar 0\comma 1\rpar$ denotes the Laplace distribution with mean 0 and variance 1.

In all other scenarios there is a lack of segregating additive alleles with large effect. This could arise in real populations because these alleles have been fixed due to selection or because of a hyperbolic relationship between enzyme activity and flux (Kacser & Burns, 1981).

Scenario (2) reflects the conclusions drawn from the analysis of enzyme networks. Genes of large effect show directional dominance, but for genes of small effect the dominance coefficient is close to 0. That is,

${\tilde{a}}_{n} \sim {\cal L}\lpar 0\comma 1\rpar \quad {\rm and}\quad {\tilde{\delta }}_{n} \vert {\tilde{a}}_{n} \sim {\cal N}\left( {{{ {\tilde{a}}_{n}^{\setnum{2}} } \over {1 \plus {\tilde{a}}_{n}^{\setnum{2}} }}\comma 0{\cdot}01 \vert{\tilde{a}}_{n} \vert } \right).$

Scenario (3) reflects the results reported by Caballero & Keightley (1994), i.e. alleles with large effect tend to be partially recessive or even overdominant and the heterozygous effect is above the average effect of the two homozygotes. But alleles of small effect show highly variable dominance coefficients. More precisely,

${\tilde{\delta }}_{n} \sim {\cal N}\lpar 0{\cdot}2 \comma 0{\cdot}3^{2} \rpar \quad {\rm and}\quad {\tilde{a}}_{n} \vert {\tilde{\delta }}_{n} \sim {\cal N}\lpar 0\comma {\rm exp}\lpar 3 {\tilde{\delta }}_{n} \rpar \rpar.$

Scenario (4) is similar to scenario (3), but for alleles with large effect, the heterozygous effect could also be below the average effect of the two homozygotes. This may be more suitable for traits where mutant alleles with large effect could also be recessive and advantageous. We have

${\tilde{\delta }}_{n} \sim {\cal N}\lpar 0{\cdot}2 \comma 0{\cdot}3^{2} \rpar \quad {\rm and}\quad {\tilde{a}}_{n} \vert {\tilde{\delta }}_{n} \sim {\cal N}\lpar 0 \comma \lpar 0{\cdot}5 \plus \vert {\tilde{\delta }}_{n} \vert \rpar ^{\setnum{4}} \rpar.$

Scatter plots of the distributions are shown in Fig. 1. The distributions in this figure were fitted to the trait PL in Holstein cattle (see Examples section). Table 2 shows the parameter values for all scenarios.

Fig. 1. Scatter plots of the considered joint distributions of |a _n| and δ_n from Part (iv), fitted to the trait productive life (PL) under assumption (A6b), so the signs of the additive effects are such that the alleles contribute little to the additive variance. (1) Absolute additive effect and dominance coefficient are independent. (2) Alleles with large effect show directional dominance, but alleles of small effect are additive. (3) Alleles with small effects show highly variable dominance coefficients, but for alleles with large effects, the heterozygous effect is above the average effect of the two homozygotes. (4) Alleles with small effects show highly variable dominance coefficients, but alleles with large effect are incompletely recessive or dominant.

Table 2. Characteristics of the considered joint distributions of |a_n| and δ_n that are described in Part (iv)
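The scale invariant parameters of a postulated joint distribution can be approximated by Monte Carlo sampling. The sketch below does this for scenarios (1) and (3) and is only illustrative: the function names, the sample size and the seed are arbitrary choices, σ²_A′ is assumed to equal E(ã_n²) because the additive effects have mean 0, and K is taken as E(|δ_n|)/μ_Δ as stated in Part (ii). The parameter λ is omitted because its definition lies outside this section.

```python
import math
import random

def sample_scenario_1(rng):
    """Independent effects: delta ~ N(0.2, 0.3^2), a ~ Laplace(mean 0, variance 1)."""
    delta = rng.gauss(0.2, 0.3)
    # The difference of two unit exponentials is Laplace with variance 2.
    a = (rng.expovariate(1.0) - rng.expovariate(1.0)) / math.sqrt(2.0)
    return a, delta

def sample_scenario_3(rng):
    """Dependent effects: delta ~ N(0.2, 0.3^2), a | delta ~ N(0, exp(3*delta))."""
    delta = rng.gauss(0.2, 0.3)
    a = rng.gauss(0.0, math.exp(1.5 * delta))  # std dev = sqrt(exp(3*delta))
    return a, delta

def mean(values):
    return sum(values) / len(values)

def scale_invariant_parameters(sampler, n=400_000, seed=1):
    rng = random.Random(seed)
    pairs = [sampler(rng) for _ in range(n)]
    abs_a = [abs(a) for a, _ in pairs]
    a_sq = [a * a for a, _ in pairs]
    delta = [d for _, d in pairs]
    delta_sq = [d * d for d in delta]
    abs_delta = [abs(d) for d in delta]
    mu = mean(delta)
    # Cov(X, Y) / (E(X) E(Y)) equals E(XY) / (E(X) E(Y)) - 1.
    r1 = mean([d * x for d, x in zip(delta, abs_a)]) / (mu * mean(abs_a)) - 1
    r2 = mean([d * x for d, x in zip(delta_sq, a_sq)]) / (mean(delta_sq) * mean(a_sq)) - 1
    r21 = mean([x * d for x, d in zip(a_sq, abs_delta)]) / (mean(a_sq) * mean(abs_delta)) - 1
    c_v = math.sqrt(mean(delta_sq) - mu * mu) / mu
    K = mean(abs_delta) / mu
    return {"r1": r1, "r2": r2, "r21": r21, "c_v": c_v, "K": K}

params_1 = scale_invariant_parameters(sample_scenario_1)
params_3 = scale_invariant_parameters(sample_scenario_3)
for name, params in (("scenario 1", params_1), ("scenario 3", params_3)):
    print(name, {key: round(value, 3) for key, value in params.items()})
```

Under scenario (1) the three dependency parameters should be close to zero up to sampling error, whereas scenario (3) should give clearly positive values.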

(v) Contributions of different sources to variance components

We have under assumptions (A1) and (A2)

(14)

${\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt{\rm LD}} \equals 4{{\lpar \tilde{{\cal I}} \minus Q{\rm Cov}\lpar h_{n} \comma d_{n} \rpar \rpar ^{\setnum{2}} } \over {CE\,\lpar h_{n} \rpar ^{\setnum{2}} }}D_{{\rm LD}}^{\setnum{2}\comma \hskip-1pt\setnum{0}\comma \hskip-1pt\setnum{0}}.$

Unfortunately, the covariance is unknown, so this equation cannot be used directly to estimate the contribution of linkage disequilibrium to the dominance variance. But it shows that if Cov(h _n, d _n)=0 is falsely assumed although alleles with intermediate frequencies tend to have larger dominance effects (i.e. Cov(h _n, d _n)>0 due to overdominance), then Ṽ _D^LD would be overestimated. Similarly, we obtain under (A1)–(A3) that

(15)

${\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt{\rm LD}} \equals 2{{\lpar \tilde{{\cal I}} \minus Q{\rm Cov}\lpar h_{n} \comma d_{n} \rpar \rpar ^{\setnum{2}} } \over {CE\,\lpar h_{n} \rpar ^{\setnum{2}} }}D_{{\rm LD}}^{\setnum{1}\comma \setnum{1}\comma \setnum{1}}.$

Again we have to assume that Cov(h _n, d _n)=0 to be able to evaluate this formula. Then the contributions of dominance effects of linked loci to the additive and dominance variance depend only on the squared inbreeding depression, which can be estimated, but not directly on the unknown number of QTLs.

Since Ṽ _D can be estimated from the population and Ṽ _D^LD can be estimated with eqn (14), the contribution of single loci to the dominance variance can be obtained from

${\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5ptd} \equals {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt{\rm LD}}.$

This contribution would be underestimated if Cov(h _n, d _n)=0 is falsely assumed in eqn (14) although Cov(h _n, d _n)>0. If additionally h _n and d _n are independent (A4), then the contribution of dominance effects of single loci to the additive variance can be obtained from

(16)

${\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5ptd} \equals {{2\gamma } \over {1 \minus \gamma }} {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5ptd}.$

Under (A6a) we have

${\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5ptad} \equals 0\comma$

whereas under (A6b) we have

(17)

$\hskip-2pt \eqalign{ {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5ptad} \equals \tab \minus \lpar {\tilde{V}}_{\rm A} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5ptd} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt{\rm LD}} \rpar \cr \tab \times{{2E\,\lpar \vert \delta _{n} \vert \rpar \lpar 1 \plus r_{\setnum{2}\comma\hskip-1pt \setnum{1}} \rpar E\,\lpar h_{n} \sqrt {1 \minus 2h_{n} } \rpar } \over {E\,\lpar h_{n} \rpar \minus 2E\,\lpar \vert \delta _{n} \vert \rpar \lpar 1 \plus r_{\setnum{2}\comma \hskip-1pt\setnum{1}} \rpar E\,\lpar h_{n} \sqrt {1 \minus 2h_{n} } \rpar }}}.$

The latter formula depends, however, on E(|δ_n|) and r _2,1. The contribution of additive effects of single loci to the additive variance can then be obtained from

${\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt a} \equals {\tilde{V}}_{\rm A} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt d} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt{\rm LD}} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt ad}$

In summary, these formulae make it possible to estimate the contributions of different sources to the additive and dominance variance under (A1)–(A6a).
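The bookkeeping of this part can be sketched in a few lines of Python. The function name and all numerical inputs for the variance components are hypothetical and serve only as an illustration; the linkage disequilibrium contributions are passed in as given numbers because eqns (14) and (15) involve quantities defined outside this section. Only γ, E(h_n) and E(h_n√(1−2h_n)) are taken from the simulated estimates reported in Part (iii).

```python
def decompose_variances(V_A, V_D, V_D_LD, V_A_LD, gamma, E_h, E_h_root,
                        E_abs_delta=0.0, r21=0.0, sign_assumption="A6a"):
    """Split additive and dominance variance into the sources of Part (v)."""
    V_D_d = V_D - V_D_LD                      # single-locus dominance variance
    V_A_d = 2 * gamma / (1 - gamma) * V_D_d   # eqn (16)
    remainder = V_A - V_A_d - V_A_LD
    if sign_assumption == "A6a":
        V_A_ad = 0.0
    else:
        c = 2 * E_abs_delta * (1 + r21) * E_h_root
        V_A_ad = -remainder * c / (E_h - c)   # eqn (17)
    V_A_a = remainder - V_A_ad
    return {"V_D_d": V_D_d, "V_A_d": V_A_d, "V_A_ad": V_A_ad, "V_A_a": V_A_a}

# Hypothetical variance components; gamma, E_h and E_h_root are the
# simulated estimates quoted in Part (iii).
inputs = dict(V_A=100.0, V_D=20.0, V_D_LD=0.5, V_A_LD=0.3,
              gamma=0.28, E_h=0.19, E_h_root=0.087)
under_a6a = decompose_variances(**inputs)
under_a6b = decompose_variances(**inputs, E_abs_delta=0.29, r21=0.3,
                                sign_assumption="A6b")
print(under_a6a)
print(under_a6b)
```

With these inputs the (A6b) branch returns a negative Ṽ_A^ad, which is consistent with eqn (12) applied to the returned Ṽ_A^a.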

(vi) Lower bounds for the number of QTLs

In this section, we derive lower bounds for the expected number of QTLs. We obtain from eqns (5) and (6)

${{ {\tilde{{\cal I}}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt{\rm LD}} }} \equals {{Q^{\setnum{2}} E \,{\left( {h_{n} d_{n} } \right)}^{\setnum{2}} } \over {QE\,\left( {h_{n}^{\setnum{2}} d_{n}^{\setnum{2}} } \right)}} \equals Q{{E \,{\left( {h_{n} d_{n} } \right)}^{\setnum{2}} } \over {E\,\left( {h_{n}^{\setnum{2}} d_{n}^{\setnum{2}} } \right)}}.$

Thus,

(18)

$Q \equals {{ {\tilde{{\cal I}}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt{\rm LD}} }}{{\ E\left( {h_{n}^{\setnum{2}} d_{n}^{\setnum{2}} } \right)} \over {E {\left( {h_{n} d_{n} } \right)}^{\setnum{2}} }}\ges {{ {\tilde{{\cal I}}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt{\rm LD}} }}\ges {{ {\tilde{{\cal I}}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} }}.$

The first inequality in eqn (18) holds because E(X²)⩾E(X)² for any random variable X, applied here to X=h _nd _n. Note that the proof only used (A1) and Ṽ _D^LD⩾0, so the bounds hold regardless of how intense the selection is and how deleterious new mutations tend to be. Under (A1)–(A4) the contribution of linkage disequilibrium Ṽ _D^LD can be estimated as described in the previous part if dominance variance and inbreeding depression have been estimated for the population. Note that these bounds depend neither on the distribution of the additive effects nor on the correlation between |a _n| and δ_n. However, they can be further improved under (A1)–(A4). Since

(19)

${{\ E\,\left( {h_{n}^{\setnum{2}} d_{n}^{\setnum{2}} } \right)} \over {E\, {\left( {h_{n} d_{n} } \right)}^{\setnum{2}} }} \equals {{1 \minus \gamma } \over {2E\,\lpar h_{n} \rpar }}{{\lpar 1 \plus r_{\setnum{2}} \rpar } \over {\lambda ^{\setnum{2}} \lpar 1 \plus r_{\setnum{1}} \rpar ^{\setnum{2}} }}{{\mu _{\rmDelta }^{\setnum{2}} \plus \sigma _{\rmDelta }^{\setnum{2}} } \over {\mu _{\rmDelta }^{\setnum{2}} }}$

and

(20)

${{\mu _{\rmDelta }^{\setnum{2}} \plus \sigma _{\rmDelta }^{\setnum{2}} } \over {\mu _{\rmDelta }^{\setnum{2}} }} \ges 1 \plus r_{\setnum{1}}^{\setnum{2}} {{\lambda ^{\setnum{2}} } \over {1 \minus \lambda ^{\setnum{2}} }}\comma$

we have

(21)

$Q \ges {{ {\tilde{\cal I}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{{\rm D}}^{\vskip2pt\hskip0.5pt{\rm LD}} }}\ {{1 \minus \gamma } \over {2E\,\lpar h_{n} \rpar }}\ {{1 \plus r_{\setnum{1}}^{\setnum{2}} \lambda ^{\setnum{2}} \sol \lpar 1 \minus \lambda ^{\setnum{2}} \rpar } \over {\lambda ^{\setnum{2}} \lpar 1 \plus r_{\setnum{1}} \rpar ^{\setnum{2}} }}\ \lpar 1 \plus r_{\setnum{2}} \rpar.$

The factor (1+r ₂) on the right-hand side is typically larger than 1 (see Table 2). Moreover, we have

${{1 \minus \gamma } \over {2E\,\lpar h_{n} \rpar }} \equals {{E\left( {h_{n}^{\setnum{2}} } \right)} \over {E {\left( {h_{n} } \right)}^{\setnum{2}} }}\ges 1$

and

(22)

${{1 \plus r_{\setnum{1}}^{\setnum{2}} \lambda ^{\setnum{2}} \sol \lpar 1 \minus \lambda ^{\setnum{2}} \rpar } \over {\lambda ^{\setnum{2}} \lpar 1 \plus r_{\setnum{1}} \rpar ^{\setnum{2}} }} \ges 1.$

Thus, if any of these factors is unknown, then it can be skipped in eqn (21) and the inequality still holds. In particular, we have

(23)

$Q \ges {{ {\tilde{\cal I}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{\rm D}^{{\rm LD}} }}\ {{1 \minus \gamma } \over {2E\,\lpar h_{n} \rpar }}.$
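The chain of bounds in eqns (18), (21) and (23) can be evaluated as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the inbreeding depression and variance components are made-up numbers, and λ, r₁ and r₂ are passed in as plain numbers chosen for illustration. Only γ and E(h_n) are the simulated estimates from Part (iii).

```python
def shape_factor(lam, r1):
    """Left-hand side of eqn (22)."""
    lam_sq = lam * lam
    return (1 + r1 * r1 * lam_sq / (1 - lam_sq)) / (lam_sq * (1 + r1) ** 2)

def qtl_lower_bounds(inbreeding_depression, V_D, V_D_LD, gamma, E_h,
                     lam=None, r1=None, r2=None):
    """Lower bounds for the expected number of QTLs."""
    squared = inbreeding_depression ** 2
    bounds = {
        "eqn18_last": squared / V_D,
        "eqn18_middle": squared / (V_D - V_D_LD),
    }
    bounds["eqn23"] = bounds["eqn18_middle"] * (1 - gamma) / (2 * E_h)
    if None not in (lam, r1, r2):
        bounds["eqn21"] = bounds["eqn23"] * shape_factor(lam, r1) * (1 + r2)
    return bounds

# Hypothetical trait values; gamma and E_h are the simulated estimates.
bounds = qtl_lower_bounds(inbreeding_depression=-30.0, V_D=20.0, V_D_LD=0.5,
                          gamma=0.28, E_h=0.19, lam=0.7, r1=0.675, r2=1.39)
for name, value in bounds.items():
    print(f"{name}: Q >= {value:.1f}")
```

The left-hand side of eqn (22) minus 1 can be rearranged to (1−λ²(1+r₁))² divided by the positive quantity λ²(1−λ²)(1+r₁)², which is why dropping that factor never invalidates the bound.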

(vii) The role of overdominance

Using (A1)–(A5) we obtain

$\eqalign{\tab \tilde{V}^{d} _{\rm D} \quad \underline{\underline {\lpar 8\rpar }} \quad QE\,\lpar h_{n} \rpar \lpar \mu ^{\setnum{2}} _{\rm D} \plus \sigma ^{\setnum{2}} _{\rm D} \rpar {{1 \minus \gamma } \over 2} \cr \tab \quad \quad \hskip3pt\underline{\underline {\lpar 9\rpar }} \quad QE\,\lpar h_{n} \rpar \sigma ^{\setnum{2}} _{A'} \,\lpar \mu ^{\setnum{2}} _{\rmDelta } \plus \sigma ^{\setnum{2}} _{\rmDelta } \rpar \lpar 1 \plus r_{\setnum{2}} \rpar {{1 \minus \gamma } \over 2} \cr \tab \quad \quad\underline{\underline {\lpar 10\rpar }} \quad \tilde{V}^{a} _{A} \,\lpar \mu ^{\setnum{2}} _{\rmDelta } \plus \sigma ^{\setnum{2}} _{\rmDelta } \rpar \lpar 1 \plus r_{\setnum{2}} \rpar {{1 \minus \gamma } \over 2}. \cr}$

Thus,

(24)

$E\lpar \delta _{n}^{\setnum{2}} \rpar \equals \mu _{\rmDelta }^{\setnum{2}} \plus \sigma _{\rmDelta }^{\setnum{2}} \equals {1 \over {1 \plus r_{\setnum{2}} }}{{ {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt d} } \over { {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pta} }}{2 \over {1 \minus \gamma }}.$

That is, the second moment of the dominance coefficients depends only on the ratio Ṽ ^d _D/Ṽ ^a _A that can be estimated under (A1)–(A6a), but not on the unknown number of QTLs.

A lower bound for the probability of an allele to be under- or overdominant can be derived if the dominance coefficient is normally distributed and the ratio of dominance variance to additive variance is sufficiently large. If Ṽ ^d _D/Ṽ ^a _A>d _max=(1+r ₂)(1−γ)/2, then eqn (24) shows that μ_Δ²+σ_Δ²⩾1. But then P(|δ_n|>1) is minimized if μ_Δ=0 and σ_Δ=1. It follows that

(25)

$P\lpar \vert \delta _{n} \vert \gt 1\rpar \ges 1 \minus {1 \over {\sqrt {2\pi } }}\int_{ \minus \setnum{1}}^{\setnum{1}} \,{\rm e}^{ \minus t^{\setnum{2}} \sol \setnum{2}}\, {\rm d}t \equals 0{\cdot} 317 \comma \quad \quad {\rm if}\quad {{ {\tilde{V}}_{\rm D}^{\vskip2pt \hskip.5pt d} } \over { {\tilde{V}}_{\rm A}^{\vskip2pt \hskip.5pt a} }} \gt d_{{\rm max}}.$

Such a large probability appears to be unrealistic in real populations for almost any trait, see for example Charlesworth et al. (2009). Possible reasons for Ṽ ^d _D/Ṽ ^a _A>d _max to occur are the violation of one of the assumptions (A1)–(A6) or a wrong value for r ₂. If E (δ_n²)>1 is obtained under (A6a), then (A6b) should be used instead. That is, Ṽ ^ad _A and Ṽ ^a _A should be calculated again using eqn (17) with some realistic values for E (|δ_n|) and r _2,1.
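Eqn (24), the threshold d_max and the normal probability in eqn (25) are straightforward to evaluate numerically. The following sketch uses hypothetical function names and illustrative inputs; it assumes, as in the text, that the dominance coefficient is normally distributed.

```python
import math

def second_moment_delta(V_D_d, V_A_a, gamma, r2):
    """E(delta^2) from eqn (24)."""
    return (V_D_d / V_A_a) * 2 / ((1 + r2) * (1 - gamma))

def d_max(gamma, r2):
    """Threshold for the ratio V_D^d / V_A^a above which E(delta^2) exceeds 1."""
    return (1 + r2) * (1 - gamma) / 2

def prob_abs_delta_exceeds_one(mu, sigma):
    """P(|delta| > 1) for delta ~ N(mu, sigma^2)."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))
    return 1 - (cdf(1.0) - cdf(-1.0))

gamma, r2 = 0.28, 0.0          # r2 = 0 corresponds to independent effects
threshold = d_max(gamma, r2)
bound = prob_abs_delta_exceeds_one(0.0, 1.0)
print(f"d_max = {threshold:.3f}, lower bound in eqn (25) = {bound:.3f}")
```

At a variance ratio exactly equal to d_max the second moment from eqn (24) is 1, which is the boundary case used in the argument above.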

(viii) Upper bounds for the number of large QTLs

The contribution Ṽ ^a _A of additive effects of single loci to the additive variance is what remains if the contributions of all other factors are subtracted from the additive variance. On the other hand, we have Ṽ ^a _A=σ² _A′QE (h _n). Unfortunately, this equation cannot be used directly to estimate the number of QTLs as σ² _A′ is unknown. But for many traits, the number of large QTLs is known from QTL mapping studies, where a large QTL is one whose absolute additive effect |a _n| exceeds a given threshold value s and whose MAF is larger than (say) k=0·01. This number could be used to estimate the expected number of large QTLs

$\tilde{n} \equals E\,\lpar \# \lcub n\colon \vert a_{n} \vert \gt s\comma p_{n} \in I\rcub \rpar \comma$

where I=(0·01, 0·99). Since Ṽ ^a _A=σ² _A′QE (h _n) we obtain

(26)

$\tilde{n} \equals 2p_{k} QF\,\left( { \minus v\sqrt Q } \right)\comma$

where F is the distribution function of a′_n/σ_A′, and $v \equals s\sqrt {E\lpar h_{n} \rpar \sol {\tilde{V}}_{\rm A}^{\vskip2pt\hskip.5pta} }$ . If a particular distribution function F (e.g. the normal distribution or the Laplace distribution) is assumed for the standardized additive effects and an estimate of ñ is given, then eqn (26) can be solved for the number Q of QTLs. Unfortunately, estimates for ñ obtained from real data are often rather poor and F is also not known with certainty. But eqn (26) can also be used to derive upper bounds for ñ. For a given distribution function F of the standardized additive effects an upper bound for the expected number of large QTLs is (see eqn (26))

$\tilde{n}\les \mathop{\rm max}\limits_{Q\ges Q_{{{\rm min}}} } 2p_{k} QF\,\lpar \minus v\sqrt Q \rpar \comma$

where Q _min is a lower bound for the number of QTLs. From eqn (26) it follows with the Tschebyscheff inequality that ñ⩽p _k/v ². The proof is similar to the proof of eqn (27). This bound for the expected number of large QTLs depends neither on the total number of QTLs and the distribution of the QTL effects nor on the correlation between |a _n| and δ_n. The Vysochanskij–Petunin inequality refines the Tschebyscheff inequality for unimodal distributions. That is, if Q>8/(3v ²) and a unimodal distribution is assumed for the additive effects, then the Vysochanskij–Petunin inequality gives the better bound

(27)

$\tilde{n}\les {{4p_{k} } \over 9}{{ {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt a} } \over {s^{\setnum{2}} E\lpar h_{n} \rpar }}.$

The bound depends on the contribution Ṽ ^a _A of single additive effects to the additive variance. As shown in section (v), this can be estimated under (A6a). The bound holds in principle also under (A6b). However, then Ṽ ^a _A depends on the unknown joint distribution of the additive effects and dominance coefficients and could be even larger than the additive variance Ṽ _A since Ṽ ^ad _A is negative. In the scenarios considered so far, the bound may increase by a factor of 3 under (A6b), but then the QTLs with large effect are close to dominance or recessivity, so that they contribute little to the additive variance.
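The bounds of this part can be compared numerically for a given distribution function F. The sketch below is illustrative only: the function names are hypothetical and the value used for Ṽ_A^a is made up, whereas s=2, E(h_n)=0·19, p_k=0·71 and Q_min=84 are the values quoted in this paper for PL. The Laplace distribution is parameterized with mean 0 and variance 1, as in Part (iv).

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def laplace_cdf(x):
    """Laplace distribution with mean 0 and variance 1 (scale 1/sqrt(2))."""
    if x < 0:
        return 0.5 * math.exp(math.sqrt(2.0) * x)
    return 1.0 - 0.5 * math.exp(-math.sqrt(2.0) * x)

def expected_large_qtls(Q, p_k, v, cdf):
    """Right-hand side of eqn (26)."""
    return 2.0 * p_k * Q * cdf(-v * math.sqrt(Q))

def max_expected_large_qtls(p_k, v, cdf, Q_min, Q_max):
    """Largest value of eqn (26) over integer Q in [Q_min, Q_max]."""
    return max(expected_large_qtls(Q, p_k, v, cdf)
               for Q in range(Q_min, Q_max + 1))

def chebyshev_bound(p_k, v):
    return p_k / v ** 2

def vysochanskij_petunin_bound(p_k, v):
    """Eqn (27); requires Q > 8 / (3 v^2) and a unimodal distribution."""
    return 4.0 * p_k / (9.0 * v ** 2)

s, E_h, p_k, Q_min = 2.0, 0.19, 0.71, 84
V_A_a = 10.0  # hypothetical value, for illustration only
v = s * math.sqrt(E_h / V_A_a)
for name, cdf in (("normal", normal_cdf), ("Laplace", laplace_cdf)):
    n_max = max_expected_large_qtls(p_k, v, cdf, Q_min, 5000)
    print(f"{name}: max over Q of eqn (26) = {n_max:.2f}")
print(f"Tschebyscheff bound = {chebyshev_bound(p_k, v):.2f}")
print(f"Vysochanskij-Petunin bound = {vysochanskij_petunin_bound(p_k, v):.2f}")
```

The search over Q is a plain grid over integers, which is adequate here because eqn (26) is smooth in Q; a finer optimizer could be substituted if needed.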

(ix) Parameter estimation

In this section, we show how the presented equations can be used, e.g. to design a simulation experiment for the assessment of genome-wide evaluations of quantitative traits in outbred populations. Estimates for Ṽ _A, Ṽ _D and $\tilde{{\cal I}}$ must be available. The goal is to estimate μ_Δ, σ_Δ², σ_A′², Q and the contributions of the different sources to the additive and dominance variance by the formulae such that the simulated populations have on average the desired additive variance, dominance variance and inbreeding depression. The procedure is as follows:

• Estimate the parameters p _k, E(h _n), γ and $E\lpar h_{n} \sqrt {1 \minus 2h_{n} } \rpar$ that depend on the distribution of allele frequencies and the parameters D _LD^2,0,0 and D _LD^1,1,1 that describe the effect of linkage disequilibrium. This is demonstrated in section (iii). Ensure that the simulation reproduces these values on average.
• Then Ṽ _D^LD, Ṽ _A^LD, Ṽ _D^d and Ṽ _A^d can be calculated as described in section (v).
• A joint distribution for the scaled additive effects and the scaled dominance coefficients must be postulated. For this distribution, the parameters c _v, r ₁, r ₂, r _2,1, λ and K are to be calculated. Alternatively, the values from Table 2 can be used (see section (iv)) for the proposed joint distributions.
• The expected number of QTLs can then be calculated as
(28)
$\hskip9pt Q \equals {{ {\tilde{\cal I}}^{\setnum{2}} } \over { {\tilde{V}}_{\rm D} \minus {\tilde{V}}_{\rm D}^{\vskip2pt\hskip.5pt{\rm LD}} }}\ {{1 \minus \gamma } \over {2E\lpar h_{n} \rpar }}{{\lpar 1 \plus r_{\setnum{2}} \rpar } \over {\lambda ^{\setnum{2}} \lpar 1 \plus r_{\setnum{1}} \rpar ^{\setnum{2}} }}\lpar 1 \plus c_{v}^{\setnum{2}} \rpar.$
• We have
(29)
$\eqalign{\tab \mu _{\rmDelta } \equals \pm \sqrt \alpha \quad\quad\quad\quad\quad\quad \hskip6pt{\rm under}\ \lpar {\rm A}6{\rm a}\rpar \comma \cr \tab \mu _{\rmDelta } \equals \pm \sqrt {\alpha \plus {\left( {\alpha \beta } \right)}^{\setnum{2}} } \minus \alpha \beta \quad {\rm under}\ \lpar {\rm A}6{\rm b}\rpar \comma \cr}$
and σ_Δ=μ_Δc _v, where
$\hskip9pt \eqalign{\tab \alpha \equals {1 \over {1 \plus c_{v}^{\setnum{2}} }}{1 \over {1 \plus r_{\setnum{2}} }}{{ {\tilde{V}}_{\rm D}^{\vskip2pt\hskip0.5pt d} } \over { {\tilde{V}}_{\rm A} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt d} \minus {\tilde{V}}_{\rm A}^{\vskip2pt\hskip0.5pt {\rm LD}} }}{2 \over {1 \minus \gamma }}\comma \cr \tab \beta \equals {{K\,\lpar 1 \plus r_{\setnum{2}\comma \setnum{1}} \rpar E\,\lpar h_{n} \sqrt {1 \minus 2h_{n} } \rpar } \over {E\,\lpar h_{n} \rpar }}. \cr}$
• Under (A6a) we have Ṽ _A^ad=0 and under (A6b), Ṽ _A^ad can be obtained from eqn (17) since E (|δ_n|)=μ_ΔK.
• Ṽ _A^a=Ṽ _A−Ṽ _A^d−Ṽ _A^ad−Ṽ _A^LD and σ_A′²=Ṽ _A^a/(QE (h _n)).
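The steps above can be sketched as a single Python function. This is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, the linkage disequilibrium contributions are passed in as given numbers, the positive root of eqn (29) is used, and all trait values and distribution parameters in the example call are made up except for γ, E(h_n) and E(h_n√(1−2h_n)), which are the simulated estimates from section (iii).

```python
import math

def estimate_parameters(V_A, V_D, inbreeding_depression, V_D_LD, V_A_LD,
                        gamma, E_h, E_h_root, c_v, r1, r2, r21, lam, K,
                        sign_assumption="A6b"):
    """Follow the procedure of section (ix) with the positive root of eqn (29)."""
    V_D_d = V_D - V_D_LD
    V_A_d = 2 * gamma / (1 - gamma) * V_D_d                       # eqn (16)
    remainder = V_A - V_A_d - V_A_LD
    Q = (inbreeding_depression ** 2 / V_D_d * (1 - gamma) / (2 * E_h)
         * (1 + r2) / (lam ** 2 * (1 + r1) ** 2) * (1 + c_v ** 2))  # eqn (28)
    alpha = V_D_d / remainder * 2 / ((1 + c_v ** 2) * (1 + r2) * (1 - gamma))
    beta = K * (1 + r21) * E_h_root / E_h
    if sign_assumption == "A6a":
        mu = math.sqrt(alpha)                                     # eqn (29)
        V_A_ad = 0.0
    else:
        mu = math.sqrt(alpha + (alpha * beta) ** 2) - alpha * beta  # eqn (29)
        c = 2 * mu * K * (1 + r21) * E_h_root   # uses E(|delta|) = mu * K
        V_A_ad = -remainder * c / (E_h - c)                       # eqn (17)
    V_A_a = remainder - V_A_ad
    return {"Q": Q, "mu_delta": mu, "sigma_delta": mu * c_v,
            "V_D_d": V_D_d, "V_A_d": V_A_d, "V_A_ad": V_A_ad, "V_A_a": V_A_a,
            "sigma_A_sq": V_A_a / (Q * E_h)}

# Hypothetical inputs apart from gamma, E_h and E_h_root.
settings = dict(V_A=100.0, V_D=20.0, inbreeding_depression=-30.0,
                V_D_LD=0.5, V_A_LD=0.3, gamma=0.28, E_h=0.19, E_h_root=0.087,
                c_v=1.5, r1=0.675, r2=1.39, r21=0.3, lam=0.7, K=1.45)
fit_a6a = estimate_parameters(**settings, sign_assumption="A6a")
fit_a6b = estimate_parameters(**settings, sign_assumption="A6b")
print({key: round(value, 4) for key, value in fit_a6b.items()})
```

A useful consistency check is that the fitted μ_Δ and σ_Δ reproduce eqn (24) when the returned Ṽ_A^a is inserted, under both sign assumptions.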

3. Examples

In this section we use published data for the application of the derived equations. Van Tassell et al. (2000) estimated fractions of variance accounted for by additive variance (h ²) and dominance variance (d ²) as well as inbreeding depression ($\tilde{{\cal I}}$) for milk yield, PL and other traits from Holstein data. The estimates are given in Table 3. They are based on milk records and PL records of more than 730 000 cows. Data were selected to maximize the number of full sibs in the analysis. Variance components were estimated by Method R. Properties of this method were examined by Duangjinda et al. (2001). Unfortunately, the phenotypic variances were not reported. Therefore, literature estimates were used, which are V _p=1216² for preadjusted 305-day milk yield during the first lactation (Miglior et al., 1995) and V _p=13·1² for total months in milk by 84 months of age (Van Raden & Klaaskate, 1993).

Table 3. Estimates for milk yield and PL in Holstein cattle from literature

Clearly, the conclusions drawn in this section may hold only if these parameters have been estimated with sufficient precision and if the model assumptions hold. The precision of the estimates may not be very high for two reasons. First, we cannot rule out the possibility of a bias due to confounding factors, and second, the variance components themselves were estimated and not their expectations in the sense of this paper. But the variance components are close to their expectations if the number of QTLs is sufficiently large and therefore, the estimates should give realistic values to deal with.

We obtained the bounds shown in Table 4, where the product p _kQ, with p _k=0·71, is the expected number of QTLs with MAF>0·01. We expect at least 187 QTLs for milk and 84 QTLs for PL (obtained from eqn (23)).

Table 4. Estimated parameters for milk yield and PL

Table 5. Bounds for the parameters under different assumptions

We chose the threshold values for a QTL to be large as s=200 kg for milk yield and s=2 months for PL. More than 18·9 large QTLs for milk can be excluded if a unimodal distribution of the additive effects is assumed. Furthermore, if ñ is assumed to be larger than 9·6, then neither a normal distribution nor a Laplace distribution of the QTL effects is possible. If a normal distribution is assumed for the additive effects for PL then the expected number of large QTLs is 0·7 at most. These bounds have been derived under (A6a).

Figure 1 shows scatter plots of the joint distributions of additive effects and dominance coefficients that were fitted to PL under assumption (A6b). The shapes of the distributions were given in advance for each scenario. Only the scales were adjusted. It can be seen in the most realistic scenarios (2)–(4) that QTLs with large additive effect are close to recessivity or dominance, so that they contribute little to the additive variance. Druet et al. (2001) support the hypothesis that large additive QTLs for PL are absent since the fertility, which determines length of PL, has a heritability of only 1%. The QTLs on the right-hand side of the vertical lines in Fig. 1 are those whose absolute additive effect exceeds s=2 months.

Table 5 shows the parameter estimates for all scenarios for milk yield and PL. For milk yield in scenarios (2)–(4), the estimated number of QTLs was between 736 and 1359 and the expected number of large QTLs was about 5·3–8·8. For PL, the estimated number of QTLs was between 332 and 612 and the expected number of large QTLs was about 3·1–5·3. The table also shows for the different sources their estimated contribution to the expected genotypic variance Ṽ _G=Ṽ _A+Ṽ _D. It can be seen that the genotypic variance is affected only little by linkage disequilibrium. Moreover, we have in all scenarios Ṽ _A^d+Ṽ _A^ad+Ṽ _A^LD<0, so dominance effects diminish rather than increase the additive variance.

Equation (28) suggests that the number of QTLs depends heavily on the coefficient of variation of the dominance coefficient c _v. In scenarios (1), (3) and (4), this coefficient was equal to 1·5 as it was suggested by Bennewitz & Meuwissen (2010). However, the true value may be different for some traits. In order to see how the parameter values depend on c _v we calculated them also for other values. Results are shown in Figs 2 and 3 for PL (left) and milk yield (right). The dotted line shows the results obtained under assumption (A6a) for independent Laplace distributed additive effects and normally distributed dominance coefficients. The dashed line shows the results obtained under assumption (A6a) for dependent additive effects and dominance coefficients. Here, ${\tilde{\delta }}_{n} \sim {\cal N}\lpar 0{\cdot}2 \comma \lpar 0{\cdot}2\, c_{v} \rpar ^{2} \rpar$ was used and ã _n was defined as in scenario (3). For large c _v, this choice results in a more heavy-tailed distribution of a′_n. The solid line shows the results obtained under assumption (A6b) for the same joint distribution of additive effects and dominance coefficients.

Fig. 2. Different parameters for PL and milk yield as a function of the coefficient of variation of the dominance coefficient: (a) expectation of the dominance coefficient, (b) standard deviation of the dominance coefficient and (c) standard deviation of the additive effects.

Fig. 3. Different parameters for PL and milk yield as a function of the coefficient of variation of the dominance coefficient: (a) Number of QTLs, (b) number of large QTLs.

It can be seen that under (A6b) with dependent effects (solid lines) the smallest mean and standard deviation of the dominance coefficients and therefore the smallest fraction of overdominant alleles was obtained. It also provided the highest number of large QTLs. The number of QTLs depends heavily on the coefficient of variation, i.e. if c _v is small then μ_Δ and σ_A′ are small as well, so inbreeding depression must be caused by many genes. The number of large QTLs changes only little although the total number of QTLs strongly increases for increasing c _v which is due to the more heavy tailed distribution of a′_n for large c _v.

4. Discussion

Novel methods to estimate the number of QTLs and to quantify the importance of linkage disequilibrium on the additive and dominance variance have been derived. It is demonstrated how parameters of the distribution of the QTL effects can be estimated if estimates for the variance components and inbreeding depression are available. In the following, several aspects of the approach are critically discussed.

(i) Bounds for the number of QTLs

Lower bounds for the expected number of QTLs have been derived. The bound given by eqn (18) equals the ratio of squared inbreeding depression to dominance variance. It could be derived under weak assumptions, but may be poor if the trait shows little directional dominance. The bound could be improved if dominance effects and heterozygosity are independent (see eqn (21)). This equation gives the largest bound, but the calculation requires knowledge of the distribution of the additive effects and of the dependency between additive effects and dominance coefficients. This is not needed to calculate the bound given in eqn (23). Since this bound holds even if c _v=0, although we are likely to have c _v>0, the bound is rather conservative. Larger bounds could be derived under more stringent assumptions. There could be fewer QTLs in a particular population, but the average number of QTLs (averaged over many populations with the same characteristics) would not fall below the bound.

This novel method to estimate the number of QTLs affecting a trait gives a higher number than estimates obtained from the QTL mapping experiments mentioned in the Introduction section. The reason is that inbreeding depression must be due to a sufficient number of QTLs, because otherwise it would cause more than the true dominance variance of the trait. Therefore, our estimates are larger than the estimates of other authors who neglected dominance and could not well account for QTLs with small effects. This supports the hypothesis of a large number of genes affecting a quantitative trait. An upper bound for the expected number of large QTLs is also derived.

(ii) Shape of the distribution

A critical issue is the shape of the distribution of the additive effects. It is demonstrated that the distribution of QTL effects must be heavy tailed if many QTLs with large effects are expected to segregate in a population, provided that the trait is affected by directional dominance. This means that a distribution function for standardized additive effects is admissible only if eqn (26) can be solved for Q under side condition (21).

(iii) Quality of the estimates

Parameter estimates and bounds for the number of QTLs that are obtained as described above may be imprecise for several reasons. The estimation of dominance variance requires large amounts of data with a high proportion of full sibs. For equivalent accuracy, the estimation of dominance variance requires at least 20 times as much data as the estimation of additive variance (Misztal, 1997). In particular, estimates of the ratio of squared inbreeding depression to dominance variance, which determines the bound for the number of QTLs, may be poor for traits where both quantities are small. Dominance variance is difficult to estimate because it can be confounded, for example, with maternal effects, environmental covariance of full sibs and variation in relationship. We cannot rule out the possibility that the dominance variance has been overestimated, but see Duangjinda et al. (2001). Moreover, our approach treats the population as a random realization. Realized additive variance, dominance variance and inbreeding depression deviate by chance from their expectations.

(iv) Assumptions that may be violated in the presence of selection

Our model can account for effects of selection on the distribution of allele frequencies and on the dependency between a_n and δ_n, but it cannot account for some other effects, such as a dependency between the dominance effects and the heterozygosity. Since overdominant alleles are likely to have intermediate frequencies because of selection, they contribute more to dominance variance, so a smaller proportion of overdominant alleles would be sufficient to explain a large dominance variance in a long-term selected population. Independence of a_n and D_n,m, which was assumed to calculate Ṽ_A^LD, may not hold in the presence of selection, because allelic effects and allele frequencies are then dependent and allele frequencies affect D_n,m. This was neglected in the present study. In the examples, we used estimates for LD and for parameters of the distribution of allele frequencies that were obtained for neutral alleles, although the traits under examination are not neutral. Further research is needed in order to generalize the model or to find out how robust our results are when the assumptions that have been made are violated.
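
The statement that alleles at intermediate frequencies contribute more to dominance variance can be illustrated with the textbook single-locus expression (2pqd)^2 for a biallelic locus in Hardy-Weinberg equilibrium. The sketch below is a simplification that ignores linkage disequilibrium and is not the model of this paper. The function name and the numerical values are hypothetical.

```python
def dominance_variance_contribution(p, d):
    """Single-locus dominance variance (2pqd)^2 under Hardy-Weinberg equilibrium."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return (2.0 * p * q * d) ** 2


d = 1.0  # hypothetical dominance effect, identical for both loci
intermediate = dominance_variance_contribution(0.5, d)
rare = dominance_variance_contribution(0.1, d)

print(f"p = 0.5: {intermediate:.4f}")
print(f"p = 0.1: {rare:.4f}")
print(f"ratio:   {intermediate / rare:.1f}")
```

With these values, a locus at frequency 0.5 contributes about 7.7 times as much dominance variance as an otherwise identical locus at frequency 0.1.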

(v) The role of overdominance

We showed that the second moment of the dominance coefficient is determined by the ratio of the dominance variance due to single dominance effects, Ṽ_D^d, to the additive variance due to single additive effects, Ṽ_A^a, but does not depend on the number of QTLs. If the second moment is larger than one, then the probability that an allele is overdominant becomes large (see eqn (25)). This may occur in computer simulations when populations are simulated without selection (according to assumption (A6a)) and the trait shows much dominance variance, but it is unlikely to occur in real populations. A large second moment does not occur if additive effects and dominance coefficients are dependent (r₂ ≫ 0) and a_n and (q_n − p_n) d_n have opposite signs (A6b). Frankham (2009) argued that in long-term artificially selected populations the proportion of variation due to overdominant alleles is likely much higher than in wild populations at equilibrium. Overdominance of an allele that affects PL could arise as follows. An individual has a long PL if it is good enough for a large number of traits. Consider a pleiotropic gene with alleles a and A. The allele a improves one trait (e.g. mastitis resistance) but impairs another trait (e.g. milking speed). Suppose that one a-allele is sufficient to prevent the individual from being culled because of the first trait, but that two a-alleles decrease the second trait so much that the individual would likely be culled because of the second trait. In this hypothetical example, genotype AA would be culled early because of low mastitis resistance and genotype aa would be culled early because of slow milking speed. Only genotype Aa is likely to survive several lactations. Thus, the gene would be overdominant for PL. This is called antagonistic pleiotropy.

The concept of antagonistic pleiotropy was popularized by Rose (1982) and critically analysed by Curtsinger et al. (1994) and Hedrick (1999). Curtsinger et al. argued that when many loci are involved, the conditions for intermediate equilibrium frequencies are difficult to satisfy. However, they did not take into account that the number of affected fitness components is likely to increase as well when the number of loci that affect total fitness increases. They also ignored the recurrent introduction of new alleles by mutation. Our approach could not clarify how important overdominance is for PL, since some scenarios yielded a non-negligible proportion of overdominant alleles, whereas others did not. However, the most reliable scenarios yielded little overdominance.
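
The hypothetical pleiotropic gene described above can be made concrete with a small Python sketch. All probabilities are invented for illustration, and the geometric model for productive life (independent culling decisions per trait and per lactation) is an assumption of this sketch, not part of the paper. Genotypic values a and d follow the usual convention in which a is half the difference between the homozygotes and d is the deviation of the heterozygote from their midpoint.

```python
# Hypothetical probabilities of NOT being culled in a lactation because of
# each trait. Allele a improves mastitis resistance but reduces milking speed.
SURVIVE_MASTITIS = {"AA": 0.60, "Aa": 0.90, "aa": 0.95}
SURVIVE_MILKING_SPEED = {"AA": 0.95, "Aa": 0.90, "aa": 0.70}


def expected_lactations(genotype):
    """Expected number of lactations when every cow starts a first lactation
    and reaches each further one with probability s (geometric model)."""
    s = SURVIVE_MASTITIS[genotype] * SURVIVE_MILKING_SPEED[genotype]
    return 1.0 / (1.0 - s)


pl = {g: expected_lactations(g) for g in ("AA", "Aa", "aa")}

# a: half the difference between the two homozygotes.
a = (pl["aa"] - pl["AA"]) / 2.0
# d: deviation of the heterozygote from the midpoint of the homozygotes.
d = pl["Aa"] - (pl["aa"] + pl["AA"]) / 2.0
# The locus is overdominant if the heterozygote lies outside the homozygotes.
is_overdominant = abs(d) > abs(a)

for genotype, value in pl.items():
    print(f"{genotype}: {value:.2f} expected lactations")
print(f"a = {a:.2f}, d = {d:.2f}, overdominant: {is_overdominant}")
```

In this toy example, neither trait shows overdominance on its own survival scale, yet the heterozygote has the longest expected productive life because it avoids early culling for both traits.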

(vi) Conclusion

In conclusion, formulae have been developed that can be used to estimate the number of QTLs and the distribution of their effects. The examples demonstrated the need to estimate them for each trait separately. Our method requires estimates of dominance variance and inbreeding depression. If reliable estimates are available, however, then estimates for the number and distribution of QTL effects can be obtained that are very helpful in choosing an appropriate method for the prediction of genomic breeding values, even though the dependency between marker effects and QTL effects needs clarification. Additionally, they can be used as input parameters for stochastic simulations that closely mimic the real situation.

R.W. was supported by a grant from the Deutsche Forschungsgemeinschaft (DFG). The manuscript has benefited from critical and helpful comments of the anonymous reviewers. The authors also thank Professor W. G. Hill for constructive critique and encouraging comments on an earlier version of the manuscript.

References

Avery, P. J. & Hill, W. G. (1979). Variance in quantitative traits due to linked dominant genes and variance in heterozygosity in small populations. Genetics 91, 817–844.

Bennewitz, J. & Meuwissen, T. H. E. (2010). The distribution of QTL additive and dominance effects in porcine F2 crosses. Journal of Animal Breeding and Genetics 127, 171–179.

Bourguet, D. (1999). The evolution of dominance. Heredity 83, 1–4.

Caballero, A. & Keightley, P. D. (1994). A pleiotropic nonadditive model of variation in quantitative traits. Genetics 138, 883–900.

Chamberlain, A. J., McPartlan, H. C. & Goddard, M. E. (2007). The number of loci that affect milk production traits in dairy cattle. Genetics 177, 1117–1123.

Charlesworth, D. & Willis, J. H. (2009). The genetics of inbreeding depression. Nature Reviews Genetics 10, 783–796.

Crow, J. F. & Kimura, M. (1970). An Introduction to Population Genetics Theory. New York, NY: Harper and Row.

Curtsinger, J. W., Service, P. M. & Prout, T. (1994). Antagonistic pleiotropy, reversal of dominance, and genetic polymorphism. American Naturalist 144, 210–228.

Daetwyler, H. D., Pong-Wong, R., Villanueva, B. & Woolliams, J. A. (2010). The impact of genetic architecture on genome-wide evaluation methods. Genetics 185, 1021–1031.

Dekkers, J. C. M. & Hospital, F. (2002). Utilization of molecular genetics in genetic improvement of plants and animals. Nature Reviews Genetics 3, 22–32.

Druet, T., Sölkner, J., Groen, A. F. & Gengler, N. (2001). Additive and dominance genetic variance of fertility by method R and preconditioned conjugate gradient. Journal of Dairy Science 84 (Online).

Duangjinda, M., Misztal, I., Bertrand, J. K. & Tsuruta, S. (2001). The empirical bias of estimates by restricted maximum likelihood, Bayesian method, and method R under selection for additive, maternal, and dominance models. Journal of Animal Science 79, 2991–2996.

Eyre-Walker, A. & Keightley, P. D. (2007). The distribution of fitness effects of new mutations. Nature Reviews Genetics 8, 610–618.

Falconer, D. S. & Mackay, T. F. C. (1996). Introduction to Quantitative Genetics. London: Longman.

Fisher, R. A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh 52, 399–433.

Fisher, R. A. (1928). The possible modification of the response of the wild type to recurrent mutations. American Naturalist 62, 115–126.

Frankham, R. (2009). Genetic architecture of reproductive fitness and its consequences. In Adaptation and Fitness in Animal Populations, Evolutionary and Breeding Perspectives on Genetic Resource Management (ed. van der Werf, J., Graser, H.-U., Frankham, R. & Gondro, C.), pp. 15–39. Springer Science+Business Media B. V.

García-Dorado, A., López-Fanjul, C. & Caballero, A. (1999). Properties of spontaneous mutations affecting quantitative traits. Genetical Research Cambridge 74, 341–350.

Gianola, D., de los Campos, G., Hill, W. G., Manfredi, E. & Fernando, R. (2009). Additive genetic variability and the Bayesian alphabet. Genetics 183, 347–363.

Goddard, M. E. (2001). The validity of genetic models underlying quantitative traits. Livestock Production Science 72, 117–127.

Goddard, M. E. (2009). Genomic selection: prediction of accuracy and maximisation of long term response. Genetica 136, 245–257. doi:10.1007/s10709-008-9308-0.

Grisart, B., Farnir, F., Karim, L., Cambisano, N., Kim, J. J., Kvasz, A., Mni, M., Simon, P., Frère, J.-M., Coppieters, W. & Georges, M. (2004). Genetic and functional confirmation of the causality of the DGAT1 K232A quantitative trait nucleotide in affecting milk yield and composition. Proceedings of the National Academy of Sciences of the USA 101, 2398–2403.

Hayes, B. & Goddard, M. E. (2001). The distribution of the effects of genes affecting quantitative traits in livestock. Genetics Selection Evolution 33, 209–229.

Hayes, B. J., Bowman, P. J., Chamberlain, A. J. & Goddard, M. E. (2009). Invited review: Genomic selection in dairy cattle: progress and challenges. Journal of Dairy Science 92, 433–443.

Hedrick, P. W. (1999). Antagonistic pleiotropy and genetic polymorphism: a perspective. Heredity 82, 126–133.

Hill, W. G. (2010). Understanding and using quantitative genetic variation. Philosophical Transactions of the Royal Society B 365, 73–85.

Hill, W. G., Goddard, M. E. & Visscher, P. M. (2008). Data and theory point to mainly additive genetic variance for complex traits. PLoS Genetics 4, e1000008.

Ishikawa, A. (2009). Mapping an overdominant quantitative trait locus for heterosis of body weight in mice. Journal of Heredity 100 (4), 501–504.

Kacser, H. & Burns, J. A. (1981). The molecular basis of dominance. Genetics 97, 639–666.

Keightley, P. D. (1996). A metabolic basis for dominance and recessivity. Genetics 143, 621–625.

Luan, T., Woolliams, J. A., Lien, S., Kent, M., Svendsen, M. & Meuwissen, T. H. E. (2009). The accuracy of genomic selection in Norwegian Red cattle assessed by cross-validation. Genetics 183, 1119–1126.

Luo, L. J., Li, Z.-K., Mei, H. W., Shu, Q. Y., Tabien, R., Zhong, D. B., Ying, C. S., Stansel, J. W., Khush, G. S. & Paterson, A. H. (2001). Overdominant epistatic loci are the primary genetic basis of inbreeding depression and heterosis in rice. II. Grain yield components. Genetics 158, 1755–1771.

Lynch, M. & Walsh, B. (1998). Genetics and Analysis of Quantitative Traits. Sunderland, MA: Sinauer.

Mackay, T. F. C. (2001). The genetic architecture of quantitative traits. Annual Review of Genetics 35, 303–339.

Mackay, T. F. C. & Anholt, R. R. (2006). Of flies and man: Drosophila as a model for human complex traits. Annual Review of Genomics and Human Genetics 7, 339–367.

Meuwissen, T. & Goddard, M. (2010). Accurate prediction of genetic values for complex traits by whole genome resequencing. Genetics 185, 623–631.

Miglior, F., Burnside, E. B. & Kennedy, B. W. (1995). Production traits of Holstein cattle: estimation of nonadditive genetic variance components and inbreeding depression. Journal of Dairy Science 78, 1174–1180.

Misztal, I. (1997). Estimation of variance components with large-scale dominance models. Journal of Dairy Science 80, 965–974.

Otto, S. P. & Jones, C. D. (2000). Detecting the undetected: estimating the total number of loci underlying a quantitative trait. Genetics 156, 2093–2107.

Rocha, J. L., Eisen, E. J., Siewerdt, F., Van Vleck, L. D. & Pomp, D. (2004). A large-sample QTL study in mice: III. Reproduction. Mammalian Genome 15, 878–886.

Rose, M. R. (1982). Antagonistic pleiotropy, dominance and genetic variation. Heredity 48, 63–78.

Van Raden, P. M. & Klaaskate, E. J. H. (1993). Genetic evaluation of length of productive life including predicted longevity of live cows. Journal of Dairy Science 76, 2758–2764.

van Tassell, C. P., Misztal, I. & Varona, L. (2000). Method R estimates of additive genetic, dominance genetic, and permanent environmental fraction of variance for yield and health traits of Holsteins. Journal of Dairy Science 83, 1873–1877.

Villa-Angulo, R., Matukumalli, L. K., Gill, C. A., Choi, C., Van Tassell, C. P. & Grefenstette, J. J. (2009). High-resolution haplotype block structure in the cattle genome. BMC Genetics 10, 19.

Visscher, P. M. (2008). Sizing up human height variation. Nature Genetics 40, 489–490.

Wright, S. (1934). Molecular and evolutionary theories of dominance. American Naturalist 68, 24–53.

Table 1. Table of symbols

Fig. 1. Scatter plots of the considered joint distributions of |a_n| and δ_n from Part (iv), fitted to the trait productive life (PL) under assumption (A6b), so the signs of the additive effects are such that the alleles contribute little to the additive variance. (1) Absolute additive effect and dominance coefficient are independent. (2) Alleles with large effects show directional dominance, but alleles with small effects are additive. (3) Alleles with small effects show highly variable dominance coefficients, but for alleles with large effects, the heterozygous effect is above the average effect of the two homozygotes. (4) Alleles with small effects show highly variable dominance coefficients, but alleles with large effects are incompletely recessive or dominant.

Table 2. Characteristics of the considered joint distributions of |a_n| and δ_n that are described in Part (iv)

Table 3. Estimates for milk yield and PL in Holstein cattle from literature

Table 4. Estimated parameters for milk yield and PL

Table 5. Bounds for the parameters under different assumptions

Fig. 3. Different parameters for PL and milk yield as a function of the coefficient of variation of the dominance coefficient: (a) Number of QTLs, (b) number of large QTLs.
