Abbreviations: HLA, human leucocyte antigen; IL-18R, IL-18 receptor; SH2B, Src homology-2-B; SNPs, single-nucleotide polymorphisms.
Coeliac disease is a common disorder, thought to affect approximately 0·5–1% of the population in Northern Europe. In the early 1950s, through the seminal observations of a Dutch paediatrician, Willem Dicke, the cause of coeliac disease was identified as dietary gluten(Reference Dicke1). This finding led to a highly-specific treatment for coeliac disease, the gluten-free diet. Initially, coeliac disease was considered to be a relatively rare disorder, affecting no more than one in 1000 of the population. An exception to this estimate was the west of Ireland, where the prevalence of coeliac disease was calculated to be one in 300(Reference Mylotte, Egan-Mitchell and McCarthy2). This finding led to the incorrect perception that coeliac disease was uniquely common in Ireland, especially in the western seaboard region. With better diagnostic tests and case ascertainment it is now appreciated that coeliac disease has a high prevalence in many regions of the world(Reference Green and Cellier3).
Early in the study of coeliac disease an increased incidence of this condition was noted in families, and the potential that genes contribute to the disease mechanism was considered(Reference Mylotte, Egan-Mitchell and Fottrell4). An association with genes that code for molecules of the MHC was quickly established and MHC genes located on chromosome 6 have been confirmed as the major genetic influence on coeliac disease(Reference Sollid, Markussen and Ek5).
Advances in the diagnosis of coeliac disease
Following identification of the histological lesion in coeliac disease, diagnosis of the disorder was based primarily on examination of small intestinal biopsies, with evidence that the lesion remits after institution of a gluten-free diet. Measurement of anti-gliadin antibodies in serum was used as an adjunct to biopsy; although useful, the test lacks specificity and is only moderately sensitive(Reference O'Farrelly, Kelly and Hekkens6, Reference Kelly, Feighery and Gallagher7). An assay for the detection of anti-endomysial antibodies was reported in the early 1980s(Reference Chorzelski, Sulej and Tchorzewska8) and deployment of this test became widespread in the following decade. Evidence shows that the presence of endomysial antibodies is virtually 100% specific for coeliac disease, with a reported sensitivity of 87–89%(Reference Feighery, Weir and Whelan9, Reference Collin10). Several large population studies that employed the endomysial antibody assay were subsequently performed. All of these studies revealed a high prevalence of coeliac disease: one in 140 in Northern Ireland(Reference Johnston, Watson and McMillan11), one in 200 in Italy(Reference Catassi, Ratsch and Fabiani12) and Sweden(Reference Ivarsson, Persson and Juto13) and one in 250 in the USA(Reference Not, Horvath and Hill14).
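The relationship between sensitivity, specificity and prevalence in population screening can be illustrated with a short calculation. The following is a minimal sketch, not an analysis from any of the cited studies: the function name is hypothetical, the sensitivity is taken from the 87–89% range quoted above, the prevalence from the one-in-200 estimate, and the specificity of 0·995 is an illustrative assumption standing in for 'virtually 100%'.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # probability of disease given a positive test
    npv = true_neg / (true_neg + false_neg)   # probability of no disease given a negative test
    return ppv, npv


# Sensitivity 0.88 lies within the quoted 87-89% range.
# Specificity 0.995 is an ASSUMED illustrative value, not a figure from the text.
# Prevalence 1/200 corresponds to the one-in-200 population estimate.
ppv, npv = predictive_values(sensitivity=0.88, specificity=0.995, prevalence=1 / 200)
print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

Varying the assumed specificity shows why a near-perfect value matters when screening a population in which the disease affects well under 1% of individuals: the positive predictive value falls quickly as specificity drops below 100%.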
The development of the endomysial antibody test as a highly-specific serological diagnostic assay for coeliac disease represented a major breakthrough in the investigation of this disorder. In 1997 the target auto-antigen of endomysial antibodies was identified as the enzyme tissue transglutaminase(Reference Dieterich, Ehnis and Bauer15). This finding continues to have major implications for on-going coeliac research. It is now possible, using an ELISA system, to measure antibodies to tissue transglutaminase, and raised levels of these antibodies correlate strongly with detection of endomysial antibodies(Reference Dieterich, Laag and Schopper16, Reference Sulkanens, Halttunen and Laurila17). ELISA-based assays have several advantages, which include enabling a large throughput of patient samples and providing more stringent data on antibody levels.
Advances in understanding the pathogenesis of coeliac disease
Many studies have subsequently been undertaken to investigate a potential role for tissue transglutaminase in the pathogenesis of coeliac disease. It has been demonstrated that the interaction of gluten proteins (gliadins) with the enzyme enhances the in vitro immunogenicity of these gliadin peptides(Reference Molberg, Mcadam and Korner18, Reference van de Wal, Kooy and van Veelen19). This effect is attributed to deamidation of certain glutamine residues in gliadin, thereby converting these residues to glutamic acid(Reference Quarsten, Molberg and Fugger20). This deamidation process is not a random event but highly selective for the position of glutamine amino acids relative to nearby proline residues in a gliadin peptide sequence(Reference Vader, de Ru and van der Wal21). Furthermore, it has been demonstrated that this glutamic acid modification in the peptide sequence enhances its binding to human leucocyte antigen (HLA)-DQ2 molecules, thereby increasing the immunogenicity of gliadin(Reference Vader, de Ru and van der Wal21). Currently, this mechanism is considered to be the principal manner in which tissue transglutaminase contributes to coeliac disease pathogenesis. However, since tissue transglutaminase is a multi-functional enzyme it may play a role through a diverse range of other effects, including its ability to activate the cytokine transforming growth factor β(Reference Halttunen and Mäki22).
The immune response to gliadin is considered to be the central pathogenic event in the development of coeliac disease(Reference Sollid23). Although over the past two decades much has been learned about gliadin stimulation of T lymphocytes, the final pathways that result in histological damage to the small intestine have not been fully elucidated. It is known that gliadin peptides can migrate across the epithelial cell barrier and, after modification by tissue transglutaminase, bind strongly to MHC class II molecules, in particular HLA-DQ2 molecules, which are found on antigen-presenting cells and in this setting are able to activate host T-cells. These T-cells then set in train an inflammatory response that results in tissue damage to the intestine. As such, coeliac disease has many features that resemble a classic auto-immune disorder, including the presence of a highly-specific auto-antibody (the endomysial antibody), being more common in females and associated with a specific immune response to gliadin, the antigen that is responsible for inciting the disease.
Advances in genetics of coeliac disease
As already mentioned, genes that code for MHC class II molecules are the strongest genetic influence on coeliac disease pathogenesis. In northern European populations with coeliac disease approximately 95% of patients are HLA-DQ2-positive and the majority of the remaining 5% are HLA-DQ8-positive. The genes that code for these DQ molecules are considered to account for approximately 35% of the genetic contribution to coeliac disease.
Recent studies have focused on determining the identity of the remaining contributing genes.
Many of these initial studies were based on the finding of linkage regions in family studies that identify commonly inherited regions among patients. These regions are usually large ones containing many genes. These genes were subsequently assessed as candidates and further analysed by fine mapping or sequencing. Alternatively, candidate genes could be selected for study based on their known functions. Both these approaches are constrained to a greater or lesser extent by current knowledge, and the functionality of many or even most gene products is incompletely understood.
Considering the disadvantages of these approaches, the application of novel analytical tools to the genetics of complex disease has had a profound effect on disease gene discovery. As with any mapping strategy, these methods benefit from being hypothesis-free. In essence, these methods use dense maps of genetic markers to look for regions that are more commonly found in patients with the disease compared with those without the condition. This field has been made possible by the precise cataloguing of genetic variation throughout the genome, which has been carried out by the International HapMap consortium and others. The most-commonly-used variants are termed single-nucleotide polymorphisms (SNPs), which exist as a natural element of population diversity and evolution. In the latest iteration of the HapMap (HapMap II) >3·8×10⁶ SNPs have been genotyped, on average one SNP per 700 bases(Reference Frazer and Ballinger24).
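The comparison of marker frequencies between patients and unaffected individuals can be sketched for a single SNP as follows. This is one common form of single-marker test (a 1-degree-of-freedom chi-square test on allele counts, with the corresponding odds ratio); it is offered as an illustration and is not claimed to be the method used in the studies discussed here. The function name and all allele counts are hypothetical.

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Chi-square test (1 df) and odds ratio for a 2x2 table of allele counts."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    # Pearson chi-square statistic for a 2x2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Odds of carrying the minor allele in cases relative to controls.
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# HYPOTHETICAL counts: 1000 cases and 1000 controls, two alleles per person.
chi2, p, odds = allelic_association(case_minor=600, case_major=1400,
                                    control_minor=450, control_major=1550)
print(f"chi2 = {chi2:.2f}, P = {p:.2e}, odds ratio = {odds:.2f}")
```

In a genome-wide study a test of this kind is repeated for every genotyped marker, which is why the resulting P values must be judged against a threshold that allows for the number of tests performed.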
To date, one genome-wide association study of patients with coeliac disease has been carried out using >300 000 SNP markers on a UK patient cohort. Significantly-associated SNPs were re-analysed in independent UK, Dutch and Irish patient cohorts(Reference van Heel, Franke and Hunt25). In addition to reconfirming the strong association between coeliac disease and the HLA class II DQ loci, this study has revealed a novel association with a genomic region (on chromosome 4q27) encoding four genes, notably including the IL-2 and IL-21 loci (Table 1). Following genotyping in the replication cohorts, this locus has shown a highly significant association with coeliac disease susceptibility for a range of SNPs and haplotypes (several SNPs with P<10⁻¹⁰). This highly significant statistical result is also important from a functional standpoint, given the currently-accepted view of the role of T-cells in coeliac disease pathogenesis. It is not currently possible to distinguish which gene in this region is responsible for the association and no SNPs with obvious functional consequences have been identified. Of the remaining two genes found in this region, one is testis specific and the other encodes a protein with unknown function and broad tissue expression patterns. However, both IL-2 and IL-21 represent excellent candidates for a role in an immune-mediated inflammatory disease.
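To put the reported P values in context, a simple Bonferroni correction for the number of markers gives a rough significance threshold. This is a sketch under stated assumptions: the marker count is rounded down to the 300 000 quoted above, the family-wise error rate of 0·05 is a conventional choice rather than a figure from the study, and the study's own significance criteria are not described in this text.

```python
n_markers = 300_000          # lower bound on the number of SNP markers quoted in the text
family_wise_alpha = 0.05     # ASSUMED conventional family-wise error rate

# Bonferroni correction: divide the family-wise error rate by the number of tests.
bonferroni_threshold = family_wise_alpha / n_markers

reported_bound = 1e-10       # upper bound on P reported for several 4q27 SNPs

print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")
print(f"Reported bound below threshold: {reported_bound < bonferroni_threshold}")
```

Under these assumptions the reported associations lie several orders of magnitude below the corrected threshold, which is consistent with the description of the result as highly significant.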
Notes to Table 1:
RGS1, regulator of G-protein signalling 1; IL-1RL1, IL-1 receptor-like 1; Th, T-helper; IFN, interferon; LPP, lipoma preferred partner; SH2-B, Src homology-2-B.
* A measure of the risk of developing the disease in the presence of these variants. In practice, many single-nucleotide polymorphisms (SNPs) may be associated in any given region; only the most significant values are shown for clarity.
† SNPs causing both reduced and increased risk are found in the region encoding LPP, implying that some variants of LPP may decrease the risk of developing coeliac disease while others increase it.
As a follow-on from this initial study approximately 1500 SNPs were selected for further investigation. Of these, 1000 were the most strongly associated SNPs from the initial genome-wide association study. The remainder comprised non-synonymous SNPs (i.e. SNPs that change the amino acid sequence of the encoded protein) showing some extent of association in the genome-wide association study, together with further SNPs chosen in genes on the basis of their function. These SNPs were genotyped in all three population samples and significant novel disease associations for a further seven genomic regions were revealed(Reference Hunt, Zhernakova and Turner26). It is encouraging to note that six of these regions encode one or, in some instances, several genes with clear roles in the immune system (Table 1): the chemokine gene cluster (the CCR genes) on chromosome 3p21, the genes for the IL-18 receptor (IL-18R) subunits (IL18R1 and IL18RAP; 2q11), IL12A (3q25), RGS1 (1q31), TAGAP (6q25) and SH2B3 (12q24). LPP is the exception to this trend since it does not appear to be expressed in cells of the immune system but rather in smooth muscle.
The CCR genes comprise several genes that function in the homing of lymphocytes to sites of immune activity. The region encoding the IL-18R subunits IL-18R-α (encoded by IL18R1) and IL-18R-β (encoded by IL18RAP) also harbours members of the IL-1 receptor cluster. However, IL18RAP is considered the best candidate in this region, since one of the most-strongly-associated coeliac disease SNPs in the area (denoted rs917997) is also associated with altered IL-18Rβ expression in peripheral blood lymphocytes from patients treated for coeliac disease(Reference Hunt, Zhernakova and Turner26). IL-18 signalling is highly relevant given its critical upstream regulatory role in the production of interferon-γ, which is produced in abundance in the coeliac lesion and is a hallmark of the disease(Reference Salvati, MacDonald and Bajaj-Elliott27).
IL12A codes for the IL-12p35 subunit of IL-12, which is an important immune regulator, promoting natural killer and T-cell activity and, with particular reference to coeliac disease pathogenesis, T-helper type 1 lineage differentiation(Reference Del Vecchio, Bajetta and Canova28). IL-12 also strongly promotes interferon-γ expression, emphasising its credentials as a strong functional candidate.
Regulator of G-protein signalling 1 and T-cell-activation Rho GTPase-activating protein are both GTPase-activating proteins that are expressed in cells of the immune system and are likely to attenuate cell signalling events. Regulator of G-protein signalling 1 is expressed by both T-cell αβ receptor-positive and γδ receptor-positive intraepithelial lymphocytes, in contrast to splenic and thymic T-cells, in which it does not appear to be expressed(Reference Hunt, Zhernakova and Turner26). Rgs1-knock-out mice show increased B-cell and dendritic cell migration(Reference Han, Moratz and Huang29, Reference Shi, Harrison and Han30). T-cell-activation Rho GTPase-activating protein is expressed in activated T-cells, in which it is believed to be involved in cytoskeletal changes(Reference Mao, Biery and Kobayashi31). Interestingly, TAGAP is one of a number of immunological genes recently reported to show reduced expression in children with Down's syndrome(Reference Sommer, Pavarino-Bertelli and Goloni-Bertollo32), in whom an increased prevalence of coeliac disease is known to occur(Reference Goldacre, Wotton and Seagroatt33, Reference Carnicer, Farré and Varea34).
Src homology-2-B (SH2B) 3 (also termed LNK) is one of three members of the SH2B family and is expressed in epithelial cells and a variety of immune cells including monocytes, dendritic cells and lymphocytes(Reference Huang, Li and Tanaka35, Reference Takaki, Watts and Forbush36). SH2B3 is an adaptor protein with the capacity to negatively regulate growth factor and cytokine-induced signalling pathways in immune cells(Reference Li, He and Schembri-King37, Reference Velazquez, Cheug and Fleming38) and has been shown to attenuate the ability of TNF to induce adhesion factor expression by vascular endothelial cells(Reference Fitau, Boulday and Coulon39). SH2B3 is an established susceptibility factor for type 1 diabetes(Reference Todd, Walker and Cooper40). The coeliac-associated SNP, rs3184504, is also the most strongly associated SNP in diabetes; the association may be causative given that the SNP results in a non-synonymous Arg262Trp amino acid alteration in the pleckstrin homology domain. SH2B3 is expressed in small intestinal biopsies from both patients with coeliac disease and unaffected individuals(Reference Hunt, Zhernakova and Turner26).
LPP, as mentioned earlier, is present in smooth muscle rather than cells of the immune system. The protein, lipoma preferred partner, appears to play a role in influencing cell migration and functions by linking the cytoskeleton to the cell membrane at adhesion sites and also as a transcription factor(Reference Majesky41). Activation of its transcriptional functions may be induced by changes in cell adhesion, and there is substantial evidence indicating that dysregulation of lipoma preferred partner can contribute to oncogenesis, principally by means of its transcriptional activities(Reference Majesky41). How it might promote coeliac disease is unknown, but its potential role in regulating cell adhesion and motility could indicate a structural role, e.g. in barrier function.
The observation of shared susceptibility loci with type 1 diabetes is not unexpected given that coeliac disease is more common in patients with diabetes and vice versa(Reference Scott and Losowsky42, Reference Collin, Reunala and Pukkala43). Indeed, at least three loci show some evidence of being involved in both diseases: the IL-2 and IL-21 region, the CCR gene region and the SH2B3 region. Previous studies show the 2q33 region encoding CTLA4 (but also CD28 and ICOS) to be associated with both diseases(Reference Ueda, Howson and Esposito44–Reference Brophy, Ryan and Thornton46). Other shared genes are likely to emerge as larger patient and control samples provide improved power to detect these associations. The IL-2 and IL-21 region is also implicated in several other diseases, including rheumatoid arthritis and psoriasis(Reference Zhernakova, Alizadeh and Bevova47, Reference Liu, Helms, Liao and Zaba48), whilst IL18RAP is associated with inflammatory bowel disease(Reference Zhernakova, Festen and Franke49), providing support for the theory that many of these diseases share common pathways.
Previous linkage and candidate-gene studies have implicated regions on chromosomes 11q21 and 5q33 as harbouring genes that predispose to coeliac disease(Reference Greco, Corazza and Babron50, Reference Greco, Corazza and Babron51). Similarly, several studies have implicated CTLA4 (designated the COELIAC2 locus) as being linked to coeliac disease(Reference Hunt, McGovern and Kumar45, Reference Brophy, Ryan and Thornton46). However, these findings were not replicated in the genome-wide association study. The reasons for this result are unclear. Besides the obvious possibility that the earlier findings represent false positive associations, they may relate to more complex genetic associations involving haplotypes rather than single SNPs. This latter explanation may apply in the case of CTLA4, for which individual SNPs are less strongly linked to disease than an extended haplotype(Reference Brophy, Ryan and Thornton46). Interestingly, this disease-associated haplotype appears to be under positive selection pressure in Caucasian populations(52) and carries three genes (CD28, CTLA4 and ICOS) with complementary activities in regulating the extent and duration of T-cell activation. This finding indicates that in some instances the co-inheritance of variants in more than one gene may be of greater importance than the inheritance of any individual variant; this effect will not be identified without extended haplotype analysis. Larger studies and more in-depth analysis (complex because of the enormous volume of data generated) are likely to be required to resolve these questions.
Acknowledgements
The authors declare no conflict of interest. Funding was provided by the Health Research Board, Science Foundation Ireland, the Higher Education Authority PRTLI and the Wellcome Trust. Both authors contributed equally to this review.