A Novel Approach for Pathway Analysis of GWAS Data Highlights Role of BMP Signaling and Muscle Cell Differentiation in Colorectal Cancer Susceptibility

Aniket Mishra; Stuart MacGregor

doi:10.1017/thg.2016.100

A Novel Approach for Pathway Analysis of GWAS Data Highlights Role of BMP Signaling and Muscle Cell Differentiation in Colorectal Cancer Susceptibility

Published online by Cambridge University Press: 20 January 2017

Aniket Mishra and

Stuart MacGregor

Show author details

Aniket Mishra*: Affiliation:
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) and the Colorectal Cancer Family Registry (CCFR), and Statistical Genetics, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
Stuart MacGregor: Affiliation:
Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) and the Colorectal Cancer Family Registry (CCFR), and Statistical Genetics, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
*: address for correspondence: Aniket Mishra, Statistical Genetics, QIMR Berghofer Medical Research Institute, 300 Herston Rd, Brisbane QLD 4006, Australia. E-mail: [email protected]

Article contents

Abstract
Materials and Methods
Results
Discussion
Supplementary Material
References

Abstract

Genome-wide association studies (GWAS) have revolutionized the field of gene mapping. As the GWAS field matures, it is becoming clear that for many complex traits, a proportion of the missing heritability is attributable to common variants of individually small effect. Detecting these small effects individually can be difficult, and statistical power would be increased if relevant variants could be grouped together for testing. Here, we propose a VEGAS2Pathway approach that aggregates association strength of individual markers into pre-specified biological pathways. It accounts for gene size and linkage disequilibrium between markers using simulations from the multivariate normal distribution. Pathway size is taken into account via a resampling approach. Importantly, since the approach only requires summary data, the method can easily be applied in all GWASs, including meta-analysis, singleton-based, family-based, and DNA-pooling-based designs. This approach is implemented in a user-friendly web page https://vegas2.qimrberghofer.edu.au and a command line tool. The web implementation uses gene-sets from the gene ontology (GO), curated gene-sets from MSigDB (containing canonical pathways and gene-sets from BIOCARTA, REACTOME, KEGG databases), PANTHER, and pathway commons databases, enabling analysis of a wide range of complex traits. We applied this method on a colorectal cancer GWAS meta-analysis data set (10,934 cases, 12,328 controls) from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO). We report statistically significant enrichment of association signal for the ‘BMP signaling’ and ‘muscle cell differentiation’ pathways, suggesting a possible role for these pathways onto the risk of colorectal cancer.

Keywords

GWAS VEGAS2Pathway VEGAS pathway analysis colorectal cancer

Type: Articles
Information: Twin Research and Human Genetics , Volume 20 , Issue 1 , February 2017 , pp. 1 - 9

DOI: https://doi.org/10.1017/thg.2016.100 [Opens in a new window]
Copyright: Copyright © The Author(s) 2017

Genome-wide association studies (GWASs) have substantially improved our understanding of the genetic basis of various complex human phenotypes. GWAS typically tests the association of each common (minor allele frequency >0.01 or 0.05) single nucleotide polymorphism (SNP) with a trait of interest. To control false positives, frequently only SNPs with association p value less than the genome-wide significance threshold (p value < 5 × 10⁻⁸) are reported, with relatively little attention paid toward the remaining SNPs. Since relatively few SNPs will reach genome-wide significance in a given study, it may not be clear that a subset of them act within a particular pathway. By also considering additional sub-threshold SNPs, then it may be possible to (1) prioritize SNPs for follow-up based on the pathway they lie in, and (2) to better define the pathways involved in the trait, leading to insight into the underlying molecular mechanisms.

Pathway-centric approaches that test the association of the combined effects of variants in a set of biologically or functionally related genes are becoming increasingly popular as a complementary method to GWAS (Pers et al., Reference Pers, Karjalainen, Chan, Westra, Wood, Yang and Franke2015; Wang et al., Reference Wang, Zhang, Kugathasan, Annese, Bradfield, Russell and Hakonarson2009; Wang et al., Reference Wang, Li and Hakonarson2010). Pathway-based association strategies have several advantages, but further methodological development is required. In the past few years, several reviews have been published discussing the issues related to pathway-based association tests (Robinson et al., Reference Robinson, Wray and Visscher2014; Wang et al., Reference Wang, Li and Hakonarson2010; Reference Wang, Jia, Wolfinger, Chen and Zhao2011). Some of the potential problems are as follows: (1) inadequate modeling of the linkage disequilibrium (LD) pattern within a gene, and (2) gene-set size and the number of SNPs in a gene may influence the significance of a gene-set. Here, we report VEGAS2Pathway, a versatile pathway-based approach for GWAS data, and demonstrate through simulations how it appropriately accounts for LD between SNPs, gene size, and pathway size. We report its implementation in a user-friendly web page and in command line software.

We applied our method on colorectal cancer GWAS summary data obtained from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO; Peters et al., Reference Peters, Jiao, Schumacher, Hutter, Aragaki and Baron2013) and Colorectal Cancer Family Registry (total sample size: 10,934 cases and 12,328 controls). Fifty variants so far reported are associated with colorectal cancer susceptibility (Peters et al., Reference Peters, Bien and Zubair2015). Together, these variants explain only small proportion of the heritability of colorectal cancer (Jiao et al., Reference Jiao, Peters, Berndt, Brenner, Butterbach, Caan and Hsu2014).

Materials and Methods

VEGAS2Pathway Strategy

VEGAS2Pathway is a two-step pathway analysis strategy. First, it calculates the gene-based test statistics for all genes using the VEGAS (VErsatile Gene-based Association Study) approach (Liu et al., Reference Liu, McRae, Nyholt, Medland, Wray, Brown and MacGregor2010; Mishra & MacGregor, Reference Mishra and MacGregor2015), which accounts for the LD between the SNPs within a gene through simulation. Second, for each of a set of pre-specified gene-sets, the relevant gene-based results are carried forward to compute a pathway-based test. To give users a pathway analysis platform useful for a wide range of complex traits, we incorporated gene-sets from the Gene Ontology (GO; Gene Ontology Consortium, 2008) curated gene-sets from MSigDB (Liberzon et al., Reference Liberzon, Subramanian, Pinchback, Thorvaldsdottir, Tamayo and Mesirov2011; containing canonical pathways and gene-sets from BIOCARTA, REACTOME, KEGG databases), PANTHER (Thomas et al., Reference Thomas, Campbell, Kejariwal, Mi, Karlak, Daverman and Narechania2003), and pathway commons (Cerami et al., Reference Cerami, Gross, Demir, Rodchenkov, Babur, Anwar and Sander2011) databases. We filtered these gene-sets to include only those with size between 10 and 1,000 genes. Overall, there are 6,212 gene-sets, including 18,399 genes with 511,336 annotations. To compute the pathway-based test, the relevant gene-based p values were first converted to upper-tail χ² statistics with one degree of freedom, before summing. This process of summing is potentially an informative approach for polygenic traits because each gene contributes to the pathway results concomitant with strength of evidence for association at that gene. Contrast this with the hypergeometric-based pathway approaches (Lee et al., Reference Lee, O'Dushlaine, Thomas and Purcell2012; Pers et al., Reference Pers, Karjalainen, Chan, Westra, Wood, Yang and Franke2015), where genes are either ‘in’ or ‘out’ of the pathway, based on them exceeding a particular p value. Generating accurate significance levels for very strongly associated genes is difficult because the simulation derived gene-based p value can be zero (in the default scenario this means 0 of the 1 × 10⁶ simulation replicates exceeded the result for the real data). In such cases, we allocate a p value of 1 × 10⁻⁶ for such genes. This approach may underestimate the importance of such genes but has the advantage of ensuring the resultant pathway results are not driven entirely by one or a few genes.

Under a polygenic model and assuming a non-competitive test, larger pathways will typically be more significant than smaller pathways. Since we believe the genetic architecture for many complex traits will be polygenic, a non-competitive test is unlikely to provide useful results (as large pathways will be much more likely to be significant than small ones due to the accumulation of polygenes). Instead, we implemented a competitive test in which each pathway is benchmarked against the ‘typical’ pathway of the same size. Specifically, we corrected for pathway size bias by adopting a resampling approach where the same number of genes as present in a pathway are repeatedly drawn at random from all set of genes used in the study and summed. The empirical p value of association for a pathway is calculated using following formula:

$$\begin{equation*} {\rm{Emp}}\;P = \ \frac{{\mathop \sum \nolimits_1^N I\left( {{\chi ^{2*}}\ \ge \ {\chi ^2}} \right) + 1}}{{N + 1}}, \end{equation*} $$

where N is the number of resamples performed. The χ²* is the summed χ² statistics computed per resample, which is compared against the observed summed χ² for pathway under consideration. I() is an indicator function. Figure 1 describes the schematic representation of the VEGAS2Pathway strategy.

FIGURE 1 Schematic representation of VEGAS2Pathway strategy: After reading in the two-column (SNP id and GWAS p value) text file, the gene-based test statistics are calculated using the VEGAS approach. These genes are then assigned to pathways. If a pathway contains a gene within another gene then the smaller gene is dropped. In another scenario, if a pathway contains overlapping or genes less than 500 kb away then only one gene (chosen randomly) is used to represent the association signal of the region while all nearby genes are dropped. In this way, VEGAS2Pathway ensures that all gene-based test statistics assigned to a pathway are at least 500 kb away from each other. The gene-based test statistics are then summed to compute a statistic for each pathway. An empirical p value for each pathway is computed by comparing the summed set score against that of replicates of the same size obtained through resampling of gene-based test statistics.

Simulation to Establish Properties of VEGAS2Pathway

We applied VEGAS2Pathway on 1,000 simulated quantitative phenotypes with standard normal distribution N (0, 1). We extracted the top-pathway test statistics from 1,000 simulations to allow estimation of a significance threshold (assessment of how differently the results for the 6,212 pathways behave relative to the situation where, for example, there were 6,212 completely independent pathways). We also used the set of 1,000 simulations to assess the correlation between gene size and gene p value, and between pathway size and pathway p value.

Application on GWAS Summary Data

We applied VEGAS2Pathway approach on colorectal cancer GWAS summary data obtained from GECCO (Peters et al., Reference Peters, Jiao, Schumacher, Hutter, Aragaki and Baron2013) and the Colorectal Cancer Family Registry. This meta-analysis of GWASs was performed across 11 studies totaling 10,934 cases and 12,328 controls of European ancestry. Sample selection and genotyping methods for participating studies have been previously described (Peters et al., Reference Peters, Jiao, Schumacher, Hutter, Aragaki and Baron2013). Please refer to Supplementary Text 1 for brief description on participating studies and GWAS methodology.

We used GWAS meta-analysis summary data to perform the pathway analysis. The 1000 Genomes phase 1 European data was used as a reference for VEGAS2 (Mishra & MacGregor, Reference Mishra and MacGregor2015) joint-SNP gene-based analysis. The 50 kb gene-boundary option was used to include the cis-regulatory variants. The computational burden for VEGAS2 increases dramatically with an increase in the number of SNPs per gene, and hence for a gene containing more than 1,000 variants it successively prunes the list of variants with r² criteria of 0.99, 0.90, 0.70, and 0.50. After each pruning interval, VEGAS2 checks the number of pruned SNPs. If the number of pruned SNPs is less than 1,000, then VEGAS2 uses the pruned SNPs from that interval to perform gene-based analysis; otherwise, it iteratively applies an increasingly stringent r² criterion on all SNPs in the gene. After applying a pruning criteria of r² = 0.50, it uses all pruned SNPs for analysis, irrespective of the number. Once the gene-based test statistics were calculated, we performed pathway analyses using the resampling approach.

Comparison with other Pathway Analysis Approaches

We analyzed the top five pathways identified by VEGAS2Pathway, using the pathway analysis approaches implemented in the INRICH (Lee et al., Reference Lee, O'Dushlaine, Thomas and Purcell2012) and MAGENTA (Segre et al., Reference Segre, Mootha, Daly and Altshuler2010) software. We used the same gene-pathway annotations as used in VEGAS2Pathway analysis to perform pathway analysis using INRICH and MAGENTA. The INRICH program is based on the hyper-geometric pathway analysis approach that compares the number of significant versus non-significant genes in a pathway, and control for type 1 error through permutation (Lee et al., Reference Lee, O'Dushlaine, Thomas and Purcell2012). We created the interval file using plink (with criteria: –clump-p1 1 × 10⁻⁴ –clump-p2 0.05 –clump-r2 0.5 –clump-range-border 20). The 1000 Genomes European reference data were used to compute pair-wise LD. We ran INRICH with default settings except the number of permutations parameter (-r) was changed to 1,000,000. The MAGENTA program is an extension of gene-set enrichment analysis (GSEA), a widely used approach for pathway analysis of gene expression data. It is based on a weighted Kolmogorov–Smirnov-like running sum statistic algorithm, which tests whether a gene-set is more enriched with highly ranked gene scores than would be expected by chance (Segre et al., Reference Segre, Mootha, Daly and Altshuler2010). We ran MAGENTA using default settings.

We also tested whether VEGAS2Pathway implicates the top pathways obtained using MAGENTA and INRICH programs. We performed pathway analysis on colorectal cancer GWAS data considering aforementioned 6,213 gene-sets using INRICH and MAGENTA. For INRICH, we first performed pathway analysis using default settings and reanalyzed the top 10 pathways by changing the number of permutations parameter (-r) to 1,000,000. We used default settings to perform pathway analysis using MAGENTA.

Results

Establishing Type 1 Error Cut-Off

The pathway annotations used for VEGAS2pathway analysis comprises the pathways from different sources, including hierarchical gene-sets from GO, curated gene-sets from MSigDB, PANTHER, and pathway commons. Hence, there is a frequent overlap in the set of genes included in different pathways. Since these pathways are not independent, application of traditional, multiple testing methods such as Bonferroni correction, Šidák correction, and false discovery rate (FDR) procedure would be overly conservative. To estimate the significance threshold correcting for multiple association tests performed for these overlapping pathways, we applied the VEGAS2Pathway approach on GWAS summary files from 1,000 simulated quantitative phenotypes with standard normally distributed trait values. Figure 2A shows the distribution of -log10 p values in the top pathway for 1,000 replicates of simulated phenotypes. The simulation-derived 95% empirical significance threshold for VEGAS2Pathway association test, taking into account the multiple testing of 6,213 correlated pathways, was 1.00 × 10⁻⁵ (50 of 1,000 replicates exceeded this). By using this empirical threshold, we are correcting for 5,000 effectively independent tests (since 0.05/5000 = 1.00 × 10⁻⁵). In comparison, if we were to Bonferroni correct for all 6,213 pathways, our threshold for significance would be 0.05/6213 = 8.02 × 10⁻⁶.

FIGURE 2 Simulation results. (A) The distribution of absolute log10 p values of 1,000 replicates. The line represent 5% interval of top -log10 p values. (B) The distribution of Pearson's correlation between gene-based p value and number of SNPs in the gene. (C) The distribution of Pearson's correlation between pathway-based p value and number of SNPs in the pathway. (D) Pathway type 1 error rate versus pathway size.

The effect of both gene size and pathway size are important to consider in pathway analysis. If there is a positive correlation between a gene's size and its gene-based p value, then pathways with an excess of bigger genes in them will repeatedly appear as significant in pathway analysis. Similarly, problems may also occur if pathway size is correlated to pathway significance. VEGAS2Pathway uses the VEGAS approach to perform gene-based analysis. In addition to accounting for LD between SNPs, we expect that the VEGAS approach will also deal appropriately with gene size. To account for pathway size, VEGAS2Pathway compares each gene-set with multiple resamples of the sets with the same number of genes, an approach that should deal appropriately with different pathway sizes. To assess performance of VEGAS2Pathway in practice, we calculated the Pearson's correlation coefficient between gene size and gene-based p value, and pathway size and empirical p value of pathway association. Figures 2B and 2C show the correlation distribution between and gene-size and gene-based p value, and pathway-size and pathway-based p value respectively in 1,000 replicates. The correlation is typically close to zero, both between gene size and gene-based p value (median -1.43 × 10⁻³, with 90% of simulation replicates in a -0.01 to +0.01 range), and between pathway size and pathway-based p value (median 2.16 × 10⁻⁵, with 90% of simulation replicates in a -0.08 to +0.08 range). Furthermore, we checked how the type 1 error rate varied by pathway size. Figure 2D shows that the type 1 error rate is independent of the pathway size and mean type 1 error rate is preserved to 0.05.

Application of VEGAS2Pathway on Colorectal Cancer GWAS Summary Data

We performed pathway analysis on colorectal cancer GWAS summary data using the VEGAS2Pathway approach. Three pathways reached the genome-wide, pathway-based significant p value of less than 1.00 × 10⁻⁵, which are GO:0060393_regulation_of_pathway-restricted_SMAD_ protein_phosphorylation (pathway-based p value < 1.00 × 10⁻⁶), PID_BMPPATHWAY (pathway-based p value = 2.00 × 10⁻⁶) and GO:0042692_muscle_cell_differentiation (pathway-based p value = 8.00 × 10⁻⁶). Overall, the top five pathways pointed toward the Sma/Mad-related protein (SMAD), bone morphogenetic protein (BMP), and transforming growth factor beta (TGFβ) signaling and muscle cell differentiation pathways. Table 1 shows the top five pathways observed in pathway analysis.

TABLE 1 Top Five Pathways from Pathway Analysis of Colorectal Cancer GWAS Summary Data

The GO:0060393_regulation_of_pathway-restricted_ SMAD_protein_phosphorylation pathway consists of 18 genes, of which six show a gene-based p value less than .05. Sixteen of the 18 genes in this pathway contain a top-SNP with a p value less than .05 (refer to Supplementary Table 1 for gene-based test statistics of genes in the GO:0060393_regulation_of_pathway-restricted_SMAD_ protein_phosphorylation pathway). The SMAD7 and BMP4 genes are the largest contributors to the result for this pathway. After removing all genes with top SNPs showing p value less 5.00 × 10⁻⁸, the pathway still shows evidence for an association (pathway-based p value = 1.14 × 10⁻³), showing that sub-genome-wide threshold genes make an important contribution to this pathway result. Germline mutations in BMPR1A (gene-based p value = 3.77 × 10⁻⁴, top SNP rs71503853 p value = 2.91 × 10⁻⁶) have been reported to be associated with juvenile polyposis syndrome (JPS; Yamaguchi et al., Reference Yamaguchi, Nagayama, Chino, Sakata, Yamamoto, Sato and Arai2014; OMIM 174900), which is a risk factor for colorectal cancer (Brosens et al., Reference Brosens, van Hattem, Hylind, Iacobuzio-Donahue, Romans, Axilbund and Giardiello2007; Howe et al., Reference Howe, Mitros and Summers1998). Another gene, SMAD6 (gene-based p value = 5.00 × 10⁻³, top SNP rs3809571 p value = 9.70 × 10⁻⁵), is involved in lung cancer cell growth and survival (Jeon et al., Reference Jeon, Dracheva, Yang, Meerzaman, Fukuoka, Shakoori and Jen2008). Two further genes with gene-based p values less than .05 are BMP2 and BMP7. BMP2 is located 342 kb away from the colorectal cancer associated 38 kb LD region at 20p12.3 (Study et al., Reference Study, Houlston, Webb, Broderick, Pittman, Di Bernardo and Dunlop2008). Overexpression of BMP2 is a risk factor for survival of non-small cell lung cancer patients (Chu et al., Reference Chu, Luo, Wang, Chen, Li, Bai and Zhang2014). BMP7 influences proliferation, migration and invasion of lung (Liu et al., Reference Liu, Chen, Yang, Zhang and Jiang2012), and breast cancer cells (Alarmo et al., Reference Alarmo, Parssinen, Ketolainen, Savinainen, Karhu and Kallioniemi2009).

The second significant pathway PID_BMPPATHWAY is an expert-curated pathway provided by the pathway interaction database (Schaefer et al., Reference Schaefer, Anthony, Krupa, Buchoff, Day, Hannay and Buetow2009; downloaded from MSigDB). This pathway consists of 40 genes, including 7 (6 genes with gene-based p value < .05) of the 18 genes of GO:0060393_regulation_of_pathway-restricted_SMAD_ protein_phosphorylation pathway. In addition to the six overlapping genes with GO:0060393_regulation_ of_pathway-restricted_SMAD_protein_phosphorylation pathway, there are three more genes in PID_ BMPPATHWAY showing a gene-based p value less than .05: GREM1 (gene-based p value = 3.60 × 10⁻⁵), TAB2 (gene-based p value = 4.83 × 10⁻⁵), and SMAD5 (gene-based p value = 0.011). Refer to Supplementary Table 2 for gene-based test statistics of all genes in the PID_BMPPATHWAY. Variants in the GREM1 are associated with the colorectal cancer. The PID_BMPPATHWAY shows evidence of association (pathway-based p value = 4.80 × 10⁻³) after removal of GREM1 and other genes with top SNP p value less than 5.00 × 10⁻⁸, suggesting the contribution of sub-genome-wide threshold genes.

TABLE 2 Top Five Pathways from VEGAS2Pathway Analysis of Colorectal Cancer GWAS Summary Data and their Test Statistics Using the INRICH and MAGENTA Approaches

The last significant pathway is GO:0042692_ muscle_cell_differentiation, consisting of 86 genes of which 18 show a gene-based p value less than .05 (refer to Supplementary Table 3). Interestingly, apart from BMP4 none of the other genes showed strong evidence for an association at a gene-based level, with much of the signal deriving from the combined effect of many genes with moderate effect sizes. As a result, the significance of GO:0042692_muscle_cell_differentiation was only slightly altered by removing genes, which were individually significant (those with top SNPs with p value less 5.00 × 10⁻⁸) from the analysis (pathway-based p value = 7.80 × 10⁻⁵).

Comparisons with other Pathway Analysis Approaches

To see if these pathways are also implicated by other approaches, we performed pathway analysis using INRICH and MAGENTA software. We used the same gene-pathway annotations as used for VEGAS2pathway to make association statistics comparable across approaches. Table 2 shows the association p-values for our reported top five pathways using the INRICH and MAGENTA approaches.

All top five pathways observed for colorectal cancer using VEGAS2pathway show association p values less than .05 using both INRICH and MAGENTA. In each case, the association p values obtained using INRICH and MAGENTA were less significant than the ones obtained using VEGAS2Pathway. These results are in line with the observations by Evangelou et al. (Reference Evangelou, Rendon, Ouwehand, Wernisch and Dudbridge2012), who reported that the Fisher's method approach (in which combined test statistics of all variants in a pathway is compared against combined test statistics of permuted gene-sets of same size, similar to the VEGAS2Pathway approach) is likely to be a more powerful competitive pathway analysis approach than the hypergeometric and GSEA approaches.

INRICH, a hypergeometric approach, by default uses 1.00 × 10⁻⁴ as a threshold for a variant to be called significant, hence it will be underpowered to detect a pathway containing many variants with association p value greater than 1.00 × 10⁻⁴. On the contrary, VEGAS2Pathway uses information of all associations from GWAS analysis to perform pathway analysis. Hence, VEGAS2Pathway outperforms INRICH to identify GO:0042692_muscle_cell_differentiation pathway, which contains many moderately associated genes (refer to Supplementary Table 3).

Both VEGAS2Pathway and MAGENTA are two-step, pathway analysis methods in which a gene-based association test statistics are first calculated to test the association of a pathway. MAGENTA considers only the top SNP within a gene-boundary to define gene's association test statistics, whereas VEGAS2Pathway combines association test statistics of all SNPs within a gene boundary to calculate each gene's association test statistics. Hence, VEGAS2Pathway would be more powerful than MAGENTA to identify pathways containing genes with many independently associated variants. The variation in results from VEGAS2Pathway and MAGENTA might also be due to inherent differences between hypotheses tested. As mentioned previously, MAGENTA is based on the GSEA approach, which tests whether the gene-set is enriched with highly ranked genes than expected by chance, whereas VEGAS2Pathway tests whether the combined association test statistics of genes within a pathway is significantly different from the ones obtained from randomly generated gene-sets.

We further tested whether VEGAS2Pathway also implicates the top pathways for colorectal cancer obtained using INRICH and MAGENTA. Supplementary Tables 4 and 5 show top pathways for colorectal cancer obtained using INRICH and MAGENTA respectively, and their VEGAS2Pathway association p values.

Another recent competitive pathway analysis approach (DEPICT) was not included in this comparison, since its current implementation does not allow analysis using user specified gene-pathway annotations.

Web-Based Implementation of VEGAS2Pathway

We implemented VEGAS2Pathway in VEGAS2 web page https://vegas2.qimrberghofer.edu.au/ and a command line tool. Users can select VEGAS2pathway analysis in the options box of the VEGAS2 web page to perform the pathway analysis. The web-based version is easy to use and requires only a two-column GWAS summary file with ‘rsID’ and ‘p-value.’ The user manual and scripts can also be downloaded from VEGAS2 webpage.

Discussion

We present a novel but simple competitive approach for pathway analysis of GWAS data. Our approach differs from existing approaches in a number of ways. Gene-based calculation in ALIGATOR (Holmans et al., Reference Holmans, Green, Pahwa, Ferreira, Purcell, Sklar and Craddock2009) is based on the assumption that LD within genes is constant, with the authors highlighting the difficulty in accounting for variable LD without resorting to computationally intensive permutation-based methods. A computationally efficient simulation approach (VEGAS) is used here to calculate gene-based test statistics that account for variable LD patterns within a gene and give results similar to those from a permutation approach. The hypergeometric approach (implemented in DEPICT and INRICH) requires users to pre-specify a significance threshold for SNPs in the pathway test, with less significant SNPs ignored. In contrast, our approach does not use a pre-specified inclusion threshold for SNPs/genes; instead, all genes are included, with weighting determined by the evidence for each gene from a gene-based test. Approaches such as GenGen and MAGENTA consider only a top SNP within a gene boundary to calculate gene-based test statistics. Only including one SNP per gene has the potential to discard relevant information if there are multiple independent signals in a gene. As just one example, Yang et al. (Reference Yang, Ferreira, Morris, Medland, Genetic Investigation of, Replication and Visscher2012) reported that among genes with SNPs associated with height, many had more than one associated SNP. Our default gene-based test includes all SNPs within a gene; hence, it would show improved power to detect pathway that comprises genes with multiple independent risk variants.

In addition to the adjustment of major confounders for pathway analysis of GWAS data such as LD between variants, gene size, and pathway size, the implementation of VEGAS2Pathway is easy to use. The implementation does not depend upon commercial software (as is the case with MAGENTA, which requires a commercial MATLAB licence); rather, a free user-friendly web-based implementation is provided. Our approach does not require much preprocessing of input files (as the case in INRICH) a two-column text file with rsID and p value is a sufficient input to perform default analysis. The web implementation ignores variants without rsID, whereas the command line tool allow user to incorporate variants without rsIDs in their analysis (providing the user has a data set available to estimate the relevant pair-wise LD values).

We incorporated both manually curated pathways from MSigDB (containing canonical pathways and gene-sets from BIOCRATA, REACTOME, KEGG databases), PANTHER, and pathway commons, and computationally predicted pathways from the GO database. We have provided a simulation derived multiple testing cut-off since the traditional multiple testing corrections are over-conservative for non-independent pathway tests. Including pathways from different sources should ensure our approach is applicable to a wide range of complex traits. Moreover, the VEGAS2Pathway command line implementation allows the user to specify other gene pathway annotations, giving more flexibility for analysis.

Our pathway analysis results suggest the enrichment of association signals for risk of colorectal cancer in BMP-SMAD signaling and muscle cell differentiation pathways. BMP-SMAD signaling is involved in cellular differentiation (Kobayashi et al., Reference Kobayashi, Lyons, McMahon and Kronenberg2005) and apoptosis (Guha et al., Reference Guha, Gomes, Kobayashi, Pestell and Kessler2002). Interestingly, different genes encoding regulators of BMP signaling such as the BMP antagonist (GREM1), BMPs (BMP2, BMP4 and BMP7), kinase (TAB1), BMP receptor (BMPR1A), and inhibitory SMADs (SMAD6 and SMAD7) are driving the association of the PID_BMPPATHWAY. Recently, the BMP-SMAD1 signaling pathway was shown to stabilize tumor suppressor p53, suggesting its loss of function in tumorogenesis (Chau et al., Reference Chau, Jia, Wang, Liu, Hu, Zhang and Li2012). Another study reported the inactivation of BMP signaling in majority of sporadic colorectal cancer patients (Kodach et al., Reference Kodach, Wiercinska, de Miranda, Bleuming, Musler, Peppelenbosch and Hardwick2008).

Our study is the first to report an association of the muscle cell differentiation pathway with colorectal cancer. Many genes moderately associated with colorectal cancer drive the association of muscle cell differentiation pathway including NRG1, CAPN2, NOS1, FGF10, SOX15, and ATG5. Several lines of evidence suggest biological plausibility for the role of this pathway in tumorigenesis. The high expression of the transmembrane NRG1 and NRG1/HER3 signaling in the tumor mesenchymal stem cells is associated with poor colorectal cancer prognosis and colorectal cancer cell progression, respectively (De Boeck et al., Reference De Boeck, Pauwels, Hensen, Rummens, Westbroek, Hendrix and De Wever2013). CAPN2-dependent IκBα degradation is responsible for secondary resistance of colorectal cancer cells to the CPT-11 (irinotecan) anti-cancer therapy (Fenouille et al., Reference Fenouille, Grosso, Yunchao, Mary, Pontier-Bres, Imbert and Lagadec2012). Expression of NOS1 is negatively associated with patient response to adoptive T-cell anti-cancer therapy (Liu et al., Reference Liu, Tomei, Ascierto, De Giorgi, Bedognetti, Dai and Marincola2014). FGF10 is reported to induce cell migration and invasion in pancreatic cancer cells through FGF10/ FGFR2 signaling (Nomura et al., Reference Nomura, Yoshitomi, Takano, Shida, Kobayashi, Ohtsuka and Miyazaki2008). SOX15 was reported to show tumor suppressive effects in pancreatic cancer (Thu et al., Reference Thu, Radulovich, Becker-Santos, Pikor, Pusic, Lockwood and Tsao2014). ATG5 is strongly downregulated in colorectal cancer patients (Cho et al., Reference Cho, Jo, Kim, Park and Kim2012). In aggregate, these results suggest that the variants residing in genes involved in muscle cell differentiation pathway might play an important role in colorectal cancer risk.

In the current analysis, we assigned variants that lie within 50 kb on either side of a gene's transcription site to compute its association p value. This choice of boundary might have ignored distantly located risk variants; we used this selection criterion to strike a balance between inclusions of possible cis-regulatory variants and maintaining specificity of a gene. The VEGAS2Pathway implementation allows user to specify the gene-boundary to compute gene-based p value. Furthermore, to note SNP assignment based on a wide boundary leads to inclusion of many non-risk variants, which can dilute the association signal of a gene. Assigning SNPs based on their regulatory relationship to genes is an important future direction. Similarly, future work would ideally expand this approach to include rare variants for gene- and pathway-based analyses. Further to add, our current work did not explore all aspects of VEGAS2 gene-based tests such as top-SNP and top percentage tests, but this study was performed considering default all variant test. Recently, Wojcik et al. (Reference Wojcik, Kao and Duggal2015) compared 21 different methods for gene- and pathway-level analysis of GWAS data and reported that the VEGAS approach considering the top 10% of SNP association p values performs better in terms of both sensitivity and control of type 1 error rate. In this work, we focused on the performance of three methods — VEGAS2Pathway, INRICH, and MAGENTA — on real data from colorectal cancer GWAS; further comparison of VEGAS2Pathway approach with other available methods under different simulated scenarios will be an important extension of current work.

In summary, we report the VEGAS2Pathway approach for pathway analysis of GWAS summary data. It accounts for LD between SNPs within a gene, and between neighboring genes, gene size, and pathway size. The current version of VEGAS2Pathway uses computationally predicted gene ontology pathways and expert-curated pathways from the MSigDB, PANTHER, and pathway commons databases. This approach is implemented in both a user-friendly web page and a unix command line perl script. We applied our method on colorectal cancer GWAS summary data and found evidence that genes involved in the BMP signaling and muscle cell differentiation pathways might play a role in the development of colorectal cancer.

Acknowledgments

We thank Dr Matthew Law for useful comments on the manuscript. We thank Xiaping Lin and Jonathan Davies from QIMR Berghofer IT for assistance with the web application. For the majority of work, AM was supported by an ANZ Trustees PhD scholarship. SM is supported by Australian National Health and Research Council and Australian Research Council fellowships. None of the authors have any conflicts of interest to report.

For description on the funding and acknowledgements for GECCO, please refer to Supplementary text 2.

Supplementary Material

To view supplementary material for this article, please visit https://doi.org/10.1017/thg.2016.100.

References

Alarmo, E. L., Parssinen, J., Ketolainen, J. M., Savinainen, K., Karhu, R., & Kallioniemi, A. (2009). BMP7 influences proliferation, migration, and invasion of breast cancer cells. Cancer Letters, 275, 35–43.Google Scholar

Brosens, L. A., van Hattem, A., Hylind, L. M., Iacobuzio-Donahue, C., Romans, K. E., Axilbund, J., . . . Giardiello, F. M. (2007). Risk of colorectal cancer in juvenile polyposis. Gut, 56, 965–967.CrossRef Google Scholar PubMed

Cerami, E. G., Gross, B. E., Demir, E., Rodchenkov, I., Babur, O., Anwar, N., . . . Sander, C. (2011). Pathway Commons, a web resource for biological pathway data. Nucleic Acids Research, 39, D685–690.Google Scholar

Chau, J. F., Jia, D., Wang, Z., Liu, Z., Hu, Y., Zhang, X., . . . Li, B. (2012). A crucial role for bone morphogenetic protein-Smad1 signalling in the DNA damage response. Nature Communications, 3, 836.Google Scholar

Cho, D. H., Jo, Y. K., Kim, S. C., Park, I. J., & Kim, J. C. (2012). Down-regulated expression of ATG5 in colorectal cancer. Anticancer Research, 32, 4091–4096.Google Scholar PubMed

Chu, H., Luo, H., Wang, H., Chen, X., Li, P., Bai, Y., . . . Zhang, G. (2014). Silencing BMP-2 expression inhibits A549 and H460 cell proliferation and migration. Diagnostic Pathology, 9, 123.CrossRef Google Scholar PubMed

De Boeck, A., Pauwels, P., Hensen, K., Rummens, J. L., Westbroek, W., Hendrix, A., . . . De Wever, O. (2013). Bone marrow-derived mesenchymal stem cells promote colorectal cancer progression through paracrine neuregulin 1/HER3 signalling. Gut, 62, 550–560.CrossRef Google Scholar PubMed

Evangelou, M., Rendon, A., Ouwehand, W. H., Wernisch, L., & Dudbridge, F. (2012). Comparison of methods for competitive tests of pathway analysis. PLoS One, 7, e41018.Google Scholar

Fenouille, N., Grosso, S., Yunchao, S., Mary, D., Pontier-Bres, R., Imbert, V., . . . Lagadec, P. (2012). Calpain 2-dependent IkappaBalpha degradation mediates CPT-11 secondary resistance in colorectal cancer xenografts. Journal of Pathology, 227, 118–129.Google Scholar

Gene Ontology Consortium. (2008). The gene ontology project in 2008. Nucleic Acids Research, 36, D440–444.Google Scholar

Guha, U., Gomes, W. A., Kobayashi, T., Pestell, R. G., & Kessler, J. A. (2002). In vivo evidence that BMP signaling is necessary for apoptosis in the mouse limb. Developmental Biology, 249, 108–120.CrossRef Google Scholar PubMed

Holmans, P., Green, E. K., Pahwa, J. S., Ferreira, M. A., Purcell, S. M., Sklar, P., . . . Craddock, N. (2009). Gene ontology analysis of GWA study data sets provides insights into the biology of bipolar disorder. American Journal of Human Genetics, 85, 13–24.CrossRef Google Scholar PubMed

Howe, J. R., Mitros, F. A., & Summers, R. W. (1998). The risk of gastrointestinal carcinoma in familial juvenile polyposis. Annals of Surgical Oncology, 5, 751–756.Google Scholar

Jeon, H. S., Dracheva, T., Yang, S. H., Meerzaman, D., Fukuoka, J., Shakoori, A., . . . Jen, J. (2008). SMAD6 contributes to patient survival in non-small cell lung cancer and its knockdown reestablishes TGF-beta homeostasis in lung cancer cells. Cancer Research, 68, 9686–9692.Google Scholar

Jiao, S., Peters, U., Berndt, S., Brenner, H., Butterbach, K., Caan, B. J., . . . Hsu, L. (2014). Estimating the heritability of colorectal cancer. Human Molecular Genetics, 23, 3898–3905.Google Scholar

Kobayashi, T., Lyons, K. M., McMahon, A. P., & Kronenberg, H. M. (2005). BMP signaling stimulates cellular differentiation at multiple steps during cartilage development. Proceedings of the National Academy of Sciences of the United States of America, 102, 18023–18027.Google Scholar

Kodach, L. L., Wiercinska, E., de Miranda, N. F., Bleuming, S. A., Musler, A. R., Peppelenbosch, M. P., . . . Hardwick, J. C. (2008). The bone morphogenetic protein pathway is inactivated in the majority of sporadic colorectal cancers. Gastroenterology, 134, 1332–1341.Google Scholar

Lee, P. H., O'Dushlaine, C., Thomas, B., & Purcell, S. M. (2012). INRICH: interval-based enrichment analysis for genome-wide association studies. Bioinformatics, 28, 1797–1799.Google Scholar

Liberzon, A., Subramanian, A., Pinchback, R., Thorvaldsdottir, H., Tamayo, P., & Mesirov, J. P. (2011). Molecular signatures database (MSigDB) 3.0. Bioinformatics, 27, 1739–1740.Google Scholar

Liu, J. Z., McRae, A. F., Nyholt, D. R., Medland, S. E., Wray, N. R., Brown, K. M., . . . MacGregor, S. (2010). A versatile gene-based test for genome-wide association studies. American Journal of Human Genetics, 87, 139–145.Google Scholar

Liu, Q., Tomei, S., Ascierto, M. L., De Giorgi, V., Bedognetti, D., Dai, C., . . . Marincola, F. M. (2014). Melanoma NOS1 expression promotes dysfunctional IFN signaling. Journal of Clinical Investigation, 124, 2147–2159.CrossRef Google Scholar PubMed

Liu, Y., Chen, J., Yang, Y., Zhang, L., & Jiang, W. G. (2012). Muolecular impact of bone morphogenetic protein 7, on lung cancer cells and its clinical significance. International Journal of Molecular Medicine, 29, 1016–1024.Google Scholar

Mishra, A., & MacGregor, S. (2015). VEGAS2: Software for more flexible gene-based testing. Twin Research and Human Genetics, 18, 86–91.CrossRef Google Scholar PubMed

Nomura, S., Yoshitomi, H., Takano, S., Shida, T., Kobayashi, S., Ohtsuka, M., . . . Miyazaki, M. (2008). FGF10/FGFR2 signal induces cell migration and invasion in pancreatic cancer. British Journal of Cancer, 99, 305–313.Google Scholar

Pers, T. H., Karjalainen, J. M., Chan, Y., Westra, H. J., Wood, A. R., Yang, J., . . . Franke, L. (2015). Biological interpretation of genome-wide association studies using predicted gene functions. Nature Communications, 6, 5890.Google Scholar

Peters, U., Bien, S., & Zubair, N. (2015). Genetic architecture of colorectal cancer. Gut, 64, 1623–1636 Google Scholar

Peters, U., Jiao, S., Schumacher, F. R., Hutter, C. M., Aragaki, A. K., Baron, J. A., . . . Colon Cancer Family Registry and the Genetics and Epidemiology of Colorectal Cancer Consortium. (2013). Identification of genetic susceptibility loci for colorectal tumors in a genome-wide meta-analysis. Gastroenterology, 144, 799–807.e24.Google Scholar

Robinson, M. R., Wray, N. R., & Visscher, P. M. (2014). Explaining additional genetic variation in complex traits. Trends in Genetics, 30, 124–132.Google Scholar

Schaefer, C. F., Anthony, K., Krupa, S., Buchoff, J., Day, M., Hannay, T., & Buetow, K. H. (2009). PID: The pathway interaction database. Nucleic Acids Research, 37, D674–679.Google Scholar

Segre, A. V., DIAGRAM Consortium, MAGIC investigators Groop, L., Mootha, V. K., Daly, M. J., & Altshuler, D. (2010). Common inherited variation in mitochondrial genes is not enriched for associations with type 2 diabetes or related glycemic traits. PLoS Genet, 6, e1001058.CrossRef Google Scholar PubMed

Study, C., Houlston, R. S., Webb, E., Broderick, P., Pittman, A. M., Di Bernardo, M. C., . . . Dunlop, M. G. (2008). Meta-analysis of genome-wide association data identifies four new susceptibility loci for colorectal cancer. Nature Genetics, 40, 1426–1435.Google Scholar

Thomas, P. D., Campbell, M. J., Kejariwal, A., Mi, H., Karlak, B., Daverman, R., . . . Narechania, A. (2003). PANTHER: a library of protein families and subfamilies indexed by function. Genome Research, 13, 2129–2141.Google Scholar

Thu, K. L., Radulovich, N., Becker-Santos, D. D., Pikor, L. A., Pusic, A., Lockwood, W. W., . . . Tsao, M. S. (2014). SOX15 is a candidate tumor suppressor in pancreatic cancer with a potential role in Wnt/beta-catenin signaling. Oncogene, 33, 279–288.Google Scholar

Wang, K., Li, M., & Hakonarson, H. (2010). Analysing biological pathways in genome-wide association studies. Nature Reviews Genetics, 11, 843–854.Google Scholar

Wang, K., Zhang, H., Kugathasan, S., Annese, V., Bradfield, J. P., Russell, R. K., . . . Hakonarson, H. (2009). Diverse genome-wide association studies associate the IL12/IL23 pathway with Crohn disease. American Journal of Human Genetics, 84, 399–405.Google Scholar

Wang, L., Jia, P., Wolfinger, R. D., Chen, X., & Zhao, Z. (2011). Gene set analysis of genome-wide association studies: Methodological issues and perspectives. Genomics, 98, 1–8.Google Scholar

Wojcik, G. L., Kao, W. H., & Duggal, P. (2015). Relative performance of gene- and pathway-level methods as secondary analyses for genome-wide association studies. BMC Genetics, 16, 34.Google Scholar

Yamaguchi, J., Nagayama, S., Chino, A., Sakata, A., Yamamoto, N., Sato, Y., . . . Arai, M. (2014). Identification of coding exon 3 duplication in the BMPR1A gene in a patient with juvenile polyposis syndrome. Japanese Journal of Clinical Oncology, 44, 1004–1008.CrossRef Google Scholar

Yang, J., Ferreira, T., Morris, A. P., Medland, S. E., Genetic Investigation of, A. T. C., Replication, D. I. G., . . . Visscher, P. M. (2012). Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nature Genetics, 44, 369–375.Google Scholar

TABLE 1 Top Five Pathways from Pathway Analysis of Colorectal Cancer GWAS Summary Data

TABLE 2 Top Five Pathways from VEGAS2Pathway Analysis of Colorectal Cancer GWAS Summary Data and their Test Statistics Using the INRICH and MAGENTA Approaches

Mishra and MacGregor supplementary material S1

Supplementary Table

File 49.4 KB

Mishra and MacGregor supplementary material S2

Supplementary Table

File 73.8 KB

Mishra and MacGregor supplementary material S3

Supplementary Table

File 56.8 KB

Mishra and MacGregor supplementary material S4

Mishra and MacGregor supplementary material

File 55.5 KB

A Novel Approach for Pathway Analysis of GWAS Data Highlights Role of BMP Signaling and Muscle Cell Differentiation in Colorectal Cancer Susceptibility – Erratum

Aniket Mishra and Stuart MacGregor

Twin Research and Human Genetics , Volume 20 , Issue 2

Crossref Citations

This article has been cited by the following publications. This list is generated based on data provided by Crossref.

Mishra, Aniket and MacGregor, Stuart 2017. A Novel Approach for Pathway Analysis of GWAS Data Highlights Role of BMP Signaling and Muscle Cell Differentiation in Colorectal Cancer Susceptibility – Erratum. Twin Research and Human Genetics, Vol. 20, Issue. 2, p. 186.

Vojinovic, Dina Adams, Hieab H. Jian, Xueqiu Yang, Qiong Smith, Albert Vernon Bis, Joshua C. Teumer, Alexander Scholz, Markus Armstrong, Nicola J. Hofer, Edith Saba, Yasaman Luciano, Michelle Bernard, Manon Trompet, Stella Yang, Jingyun Gillespie, Nathan A. van der Lee, Sven J. Neumann, Alexander Ahmad, Shahzad Andreassen, Ole A. Ames, David Amin, Najaf Arfanakis, Konstantinos Bastin, Mark E. Becker, Diane M. Beiser, Alexa S. Beyer, Frauke Brodaty, Henry Bryan, R. Nick Bülow, Robin Dale, Anders M. De Jager, Philip L. Deary, Ian J. DeCarli, Charles Fleischman, Debra A. Gottesman, Rebecca F. van der Grond, Jeroen Gudnason, Vilmundur Harris, Tamara B. Homuth, Georg Knopman, David S. Kwok, John B. Lewis, Cora E. Li, Shuo Loeffler, Markus Lopez, Oscar L. Maillard, Pauline El Marroun, Hanan Mather, Karen A. Mosley, Thomas H. Muetzel, Ryan L. Nauck, Matthias Nyquist, Paul A. Panizzon, Matthew S. Pausova, Zdenka Psaty, Bruce M. Rice, Ken Rotter, Jerome I. Royle, Natalie Satizabal, Claudia L. Schmidt, Reinhold Schofield, Peter R. Schreiner, Pamela J. Sidney, Stephen Stott, David J. Thalamuthu, Anbupalam Uitterlinden, Andre G. Valdés Hernández, Maria C. Vernooij, Meike W. Wen, Wei White, Tonya Witte, A. Veronica Wittfeld, Katharina Wright, Margaret J. Yanek, Lisa R. Tiemeier, Henning Kremen, William S. Bennett, David A. Jukema, J. Wouter Paus, Tomas Wardlaw, Joanna M. Schmidt, Helena Sachdev, Perminder S. Villringer, Arno Grabe, Hans Jörgen Longstreth, W T van Duijn, Cornelia M. Launer, Lenore J. Seshadri, Sudha Ikram, M Arfan and Fornage, Myriam 2018. Genome-wide association study of 23,500 individuals identifies 7 loci associated with brain ventricular volume. Nature Communications, Vol. 9, Issue. 1,

Wong, Hau Cherng Lee, Christine Siew Ken and Tong, Dong Ling 2018. Pathway Analysis of Marker Genes for Leukemia Cancer using Enhanced Genetic Algorithm-Neural Network (enGANN). p. 118.

Malik, Rainer Chauhan, Ganesh Traylor, Matthew Sargurupremraj, Muralidharan Okada, Yukinori Mishra, Aniket Rutten-Jacobs, Loes Giese, Anne-Katrin van der Laan, Sander W. Gretarsdottir, Solveig Anderson, Christopher D. Chong, Michael Adams, Hieab H. H. Ago, Tetsuro Almgren, Peter Amouyel, Philippe Ay, Hakan Bartz, Traci M. Benavente, Oscar R. Bevan, Steve Boncoraglio, Giorgio B. Brown, Robert D. Butterworth, Adam S. Carrera, Caty Carty, Cara L. Chasman, Daniel I. Chen, Wei-Min Cole, John W. Correa, Adolfo Cotlarciuc, Ioana Cruchaga, Carlos Danesh, John de Bakker, Paul I. W. DeStefano, Anita L. den Hoed, Marcel Duan, Qing Engelter, Stefan T. Falcone, Guido J. Gottesman, Rebecca F. Grewal, Raji P. Gudnason, Vilmundur Gustafsson, Stefan Haessler, Jeffrey Harris, Tamara B. Hassan, Ahamad Havulinna, Aki S. Heckbert, Susan R. Holliday, Elizabeth G. Howard, George Hsu, Fang-Chi Hyacinth, Hyacinth I. Ikram, M. Arfan Ingelsson, Erik Irvin, Marguerite R. Jian, Xueqiu Jiménez-Conde, Jordi Johnson, Julie A. Jukema, J. Wouter Kanai, Masahiro Keene, Keith L. Kissela, Brett M. Kleindorfer, Dawn O. Kooperberg, Charles Kubo, Michiaki Lange, Leslie A. Langefeld, Carl D. Langenberg, Claudia Launer, Lenore J. Lee, Jin-Moo Lemmens, Robin Leys, Didier Lewis, Cathryn M. Lin, Wei-Yu Lindgren, Arne G. Lorentzen, Erik Magnusson, Patrik K. Maguire, Jane Manichaikul, Ani McArdle, Patrick F. Meschia, James F. Mitchell, Braxton D. Mosley, Thomas H. Nalls, Michael A. Ninomiya, Toshiharu O’Donnell, Martin J. Psaty, Bruce M. Pulit, Sara L. Rannikmäe, Kristiina Reiner, Alexander P. Rexrode, Kathryn M. Rice, Kenneth Rich, Stephen S. Ridker, Paul M. Rost, Natalia S. Rothwell, Peter M. Rotter, Jerome I. Rundek, Tatjana Sacco, Ralph L. Sakaue, Saori Sale, Michele M. Salomaa, Veikko Sapkota, Bishwa R. Schmidt, Reinhold Schmidt, Carsten O. Schminke, Ulf Sharma, Pankaj Slowik, Agnieszka Sudlow, Cathie L. M. Tanislav, Christian Tatlisumak, Turgut Taylor, Kent D. Thijs, Vincent N. S. Thorleifsson, Gudmar Thorsteinsdottir, Unnur Tiedt, Steffen Trompet, Stella Tzourio, Christophe van Duijn, Cornelia M. Walters, Matthew Wareham, Nicholas J. Wassertheil-Smoller, Sylvia Wilson, James G. Wiggins, Kerri L. Yang, Qiong Yusuf, Salim Bis, Joshua C. Pastinen, Tomi Ruusalepp, Arno Schadt, Eric E. Koplev, Simon Björkegren, Johan L. M. Codoni, Veronica Civelek, Mete Smith, Nicholas L. Trégouët, David A. Christophersen, Ingrid E. Roselli, Carolina Lubitz, Steven A. Ellinor, Patrick T. Tai, E. Shyong Kooner, Jaspal S. Kato, Norihiro He, Jiang van der Harst, Pim Elliott, Paul Chambers, John C. Takeuchi, Fumihiko Johnson, Andrew D. Malik, Rainer Chauhan, Ganesh Traylor, Matthew Sargurupremraj, Muralidharan Okada, Yukinori Mishra, Aniket Rutten-Jacobs, Loes Giese, Anne-Katrin van der Laan, Sander W. Gretarsdottir, Solveig Anderson, Christopher D. Chong, Michael Adams, Hieab H. H. Ago, Tetsuro Almgren, Peter Amouyel, Philippe Ay, Hakan Bartz, Traci M. Benavente, Oscar R. Bevan, Steve Boncoraglio, Giorgio B. Brown, Robert D. Butterworth, Adam S. Carrera, Caty Carty, Cara L. Chasman, Daniel I. Chen, Wei-Min Cole, John W. Correa, Adolfo Cotlarciuc, Ioana Cruchaga, Carlos Danesh, John de Bakker, Paul I. W. DeStefano, Anita L. Hoed, Marcel den Duan, Qing Engelter, Stefan T. Falcone, Guido J. Gottesman, Rebecca F. Grewal, Raji P. Gudnason, Vilmundur Gustafsson, Stefan Haessler, Jeffrey Harris, Tamara B. Hassan, Ahamad Havulinna, Aki S. Heckbert, Susan R. Holliday, Elizabeth G. Howard, George Hsu, Fang-Chi Hyacinth, Hyacinth I. Ikram, M. Arfan Ingelsson, Erik Irvin, Marguerite R. Jian, Xueqiu Jiménez-Conde, Jordi Johnson, Julie A. Jukema, J. Wouter Kanai, Masahiro Keene, Keith L. Kissela, Brett M. Kleindorfer, Dawn O. Kooperberg, Charles Kubo, Michiaki Lange, Leslie A. Langefeld, Carl D. Langenberg, Claudia Launer, Lenore J. Lee, Jin-Moo Lemmens, Robin Leys, Didier Lewis, Cathryn M. Lin, Wei-Yu Lindgren, Arne G. Lorentzen, Erik Magnusson, Patrik K. Maguire, Jane Manichaikul, Ani McArdle, Patrick F. Meschia, James F. Mitchell, Braxton D. Mosley, Thomas H. Nalls, Michael A. Ninomiya, Toshiharu O’Donnell, Martin J. Psaty, Bruce M. Pulit, Sara L. Rannikmäe, Kristiina Reiner, Alexander P. Rexrode, Kathryn M. Rice, Kenneth Rich, Stephen S. Ridker, Paul M. Rost, Natalia S. Rothwell, Peter M. Rotter, Jerome I. Rundek, Tatjana Sacco, Ralph L. Sakaue, Saori Sale, Michele M. Salomaa, Veikko Sapkota, Bishwa R. Schmidt, Reinhold Schmidt, Carsten O. Schminke, Ulf Sharma, Pankaj Slowik, Agnieszka Sudlow, Cathie L. M. Tanislav, Christian Tatlisumak, Turgut Taylor, Kent D. Thijs, Vincent N. S. Thorleifsson, Gudmar Thorsteinsdottir, Unnur Tiedt, Steffen Trompet, Stella Tzourio, Christophe van Duijn, Cornelia M. Walters, Matthew Wareham, Nicholas J. Wassertheil-Smoller, Sylvia Wilson, James G. Wiggins, Kerri L. Yang, Qiong Yusuf, Salim Amin, Najaf Aparicio, Hugo S. Arnett, Donna K. Attia, John Beiser, Alexa S. Berr, Claudine Buring, Julie E. Bustamante, Mariana Caso, Valeria Cheng, Yu-Ching Choi, Seung Hoan Chowhan, Ayesha Cullell, Natalia Dartigues, Jean-François Delavaran, Hossein Delgado, Pilar Dörr, Marcus Engström, Gunnar Ford, Ian Gurpreet, Wander S. Hamsten, Anders Heitsch, Laura Hozawa, Atsushi Ibanez, Laura Ilinca, Andreea Ingelsson, Martin Iwasaki, Motoki Jackson, Rebecca D. Jood, Katarina Jousilahti, Pekka Kaffashian, Sara Kalra, Lalit Kamouchi, Masahiro Kitazono, Takanari Kjartansson, Olafur Kloss, Manja Koudstaal, Peter J. Krupinski, Jerzy Labovitz, Daniel L. Laurie, Cathy C. Levi, Christopher R. Li, Linxin Lind, Lars Lindgren, Cecilia M. Lioutas, Vasileios Liu, Yong Mei Lopez, Oscar L. Makoto, Hirata Martinez-Majander, Nicolas Matsuda, Koichi Minegishi, Naoko Montaner, Joan Morris, Andrew P. Muiño, Elena Müller-Nurasyid, Martina Norrving, Bo Ogishima, Soichi Parati, Eugenio A. Peddareddygari, Leema Reddy Pedersen, Nancy L. Pera, Joanna Perola, Markus Pezzini, Alessandro Pileggi, Silvana Rabionet, Raquel Riba-Llena, Iolanda Ribasés, Marta Romero, Jose R. Roquer, Jaume Rudd, Anthony G. Sarin, Antti-Pekka Sarju, Ralhan Sarnowski, Chloe Sasaki, Makoto Satizabal, Claudia L. Satoh, Mamoru Sattar, Naveed Sawada, Norie Sibolt, Gerli Sigurdsson, Ásgeir Smith, Albert Sobue, Kenji Soriano-Tárraga, Carolina Stanne, Tara Stine, O. Colin Stott, David J. Strauch, Konstantin Takai, Takako Tanaka, Hideo Tanno, Kozo Teumer, Alexander Tomppo, Liisa Torres-Aguila, Nuria P. Touze, Emmanuel Tsugane, Shoichiro Uitterlinden, Andre G. Valdimarsson, Einar M. van der Lee, Sven J. Völzke, Henry Wakai, Kenji Weir, David Williams, Stephen R. Wolfe, Charles D. A. Wong, Quenna Xu, Huichun Yamaji, Taiki Sanghera, Dharambir K. Melander, Olle Jern, Christina Strbian, Daniel Fernandez-Cadenas, Israel Longstreth, W. T. Rolfs, Arndt Hata, Jun Woo, Daniel Rosand, Jonathan Pare, Guillaume Hopewell, Jemma C. Saleheen, Danish Stefansson, Kari Worrall, Bradford B. Kittner, Steven J. Seshadri, Sudha Fornage, Myriam Markus, Hugh S. Howson, Joanna M. M. Kamatani, Yoichiro Debette, Stephanie Dichgans, Martin Sanghera, Dharambir K. Melander, Olle Jern, Christina Strbian, Daniel Fernandez-Cadenas, Israel Longstreth, W. T. Rolfs, Arndt Hata, Jun Woo, Daniel Rosand, Jonathan Pare, Guillaume Hopewell, Jemma C. Saleheen, Danish Stefansson, Kari Worrall, Bradford B. Kittner, Steven J. Seshadri, Sudha Fornage, Myriam Markus, Hugh S. Howson, Joanna M. M. Kamatani, Yoichiro Debette, Stephanie and Dichgans, Martin 2018. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nature Genetics, Vol. 50, Issue. 4, p. 524.

Schellevis, Rosa L. van Dijk, Elon H. C. Breukink, Myrte B. Altay, Lebriz Bakker, Bjorn Koeleman, Bobby P. C. Kiemeney, Lambertus A. Swinkels, Dorine W. Keunen, Jan E. E. Fauser, Sascha Hoyng, Carel B. den Hollander, Anneke I. Boon, Camiel J. F. and de Jong, Eiko K. 2018. Role of the Complement System in Chronic Central Serous Chorioretinopathy. JAMA Ophthalmology, Vol. 136, Issue. 10, p. 1128.

Chen, Biqing Zhu, Zijian Na, Ren Fang, Wan Zhang, Wenxia Zhou, Qin Zhou, Shanbi Lei, Han Huang, Ailong Chen, Tingmei Ni, Dongsheng Gu, Yuping Liu, Jianing Fang, Fang and Rao, Yi 2018. Genomic Analyses of Visual Cognition: Perceptual Rivalry and Top-Down Control. The Journal of Neuroscience, Vol. 38, Issue. 45, p. 9668.

Timmers, Paul RHJ Mounier, Ninon Lall, Kristi Fischer, Krista Ning, Zheng Feng, Xiao Bretherick, Andrew D Clark, David W Agbessi, M Ahsan, H Alves, I Andiappan, A Awadalla, P Battle, A Bonder, MJ Boomsma, D Christiansen, M Claringbould, A Deelen, P van Dongen, J Esko, T Favé, M Franke, L Frayling, T Gharib, SA Gibson, G Hemani, G Jansen, R Kalnapenkis, A Kasela, S Kettunen, J Kim, Y Kirsten, H Kovacs, P Krohn, K Kronberg-Guzman, J Kukushkina, V Kutalik, Z Kähönen, M Lee, B Lehtimäki, T Loeffler, M Marigorta, U Metspalu, A van Meurs, J Milani, L Müller-Nurasyid, M Nauck, M Nivard, M Penninx, B Perola, M Pervjakova, N Pierce, B Powell, J Prokisch, H Psaty, BM Raitakari, O Ring, S Ripatti, S Rotzschke, O Ruëger, S Saha, A Scholz, M Schramm, K Seppälä, I Stumvoll, M Sullivan, P Teumer, A Thiery, J Tong, L Tönjes, A Verlouw, J Visscher, PM Võsa, U Völker, U Yaghootkar, H Yang, J Zeng, B Zhang, F Agbessi, M Ahsan, H Alves, I Andiappan, A Awadalla, P Battle, A Bonder, MJ Boomsma, D Christiansen, M Claringbould, A Deelen, P van Dongen, J Esko, T Favé, M Franke, L Frayling, T Gharib, SA Gibson, G Hemani, G Jansen, R Kalnapenkis, A Kasela, S Kettunen, J Kim, Y Kirsten, H Kovacs, P Krohn, K Kronberg-Guzman, J Kukushkina, V Kutalik, Z Kähönen, M Lee, B Lehtimäki, T Loeffler, M Marigorta, U Metspalu, A van Meurs, J Milani, L Müller-Nurasyid, M Nauck, M Nivard, M Penninx, B Perola, M Pervjakova, N Pierce, B Powell, J Prokisch, H Psaty, BM Raitakari, O Ring, S Ripatti, S Rotzschke, O Ruëger, S Saha, A Scholz, M Schramm, K Seppälä, I Stumvoll, M Sullivan, P Teumer, A Thiery, J Tong, L Tönjes, A Verlouw, J Visscher, PM Võsa, U Völker, U Yaghootkar, H Yang, J Zeng, B Zhang, F Shen, Xia Esko, Tõnu Kutalik, Zoltán Wilson, James F and Joshi, Peter K 2019. Genomics of 1 million parent lifespans implicates novel pathways and common diseases and distinguishes survival chances. eLife, Vol. 8, Issue. ,

Schafmayer, Clemens Harrison, James William Buch, Stephan Lange, Christina Reichert, Matthias C Hofer, Philipp Cossais, François Kupcinskas, Juozas von Schönfels, Witigo Schniewind, Bodo Kruis, Wolfgang Tepel, Jürgen Zobel, Myrko Rosendahl, Jonas Jacobi, Thorsten Walther-Berends, Andreas Schroeder, Michael Vogel, Ilka Sergeev, Petr Boedeker, Hans Hinrichsen, Holger Volk, Andreas Erk, Jens-Uwe Burmeister, Greta Hendricks, Alexander Hinz, Sebastian Wolff, Sebastian Böttner, Martina Wood, Andrew R Tyrrell, Jessica Beaumont, Robin N Langheinrich, Melanie Kucharzik, Torsten Brezina, Stefanie Huber-Schönauer, Ursula Pietsch, Leonora Noack, Laura Sophie Brosch, Mario Herrmann, Alexander Thangapandi, Raghavan Veera Schimming, Hans Wolfgang Zeissig, Sebastian Palm, Stefan Focke, Gerd Andreasson, Anna Schmidt, Peter T Weitz, Juergen Krawczak, Michael Völzke, Henry Leeb, Gernot Michl, Patrick Lieb, Wolfgang Grützmann, Robert Franke, Andre Lammert, Frank Becker, Thomas Kupcinskas, Limas D’Amato, Mauro Wedel, Thilo Datz, Christian Gsur, Andrea Weedon, Michael N and Hampe, Jochen 2019. Genome-wide association analysis of diverticular disease points towards neuromuscular, connective tissue and epithelial pathomechanisms. Gut, Vol. 68, Issue. 5, p. 854.

Hosoda, Yoshikatsu Miyake, Masahiro Schellevis, Rosa L. Boon, Camiel J. F. Hoyng, Carel B. Miki, Akiko Meguro, Akira Sakurada, Yoichi Yoneyama, Seigo Takasago, Yukari Hata, Masayuki Muraoka, Yuki Nakanishi, Hideo Oishi, Akio Ooto, Sotaro Tamura, Hiroshi Uji, Akihito Miyata, Manabu Takahashi, Ayako Ueda-Arakawa, Naoko Tajima, Atsushi Sato, Takehiro Mizuki, Nobuhisa Shiragami, Chieko Iida, Tomohiro Khor, Chiea Chuen Wong, Tien Yin Yamada, Ryo Honda, Shigeru de Jong, Eiko K. Hollander, Anneke I. den Matsuda, Fumihiko Yamashiro, Kenji and Tsujikawa, Akitaka 2019. Genome-wide association analyses identify two susceptibility loci for pachychoroid disease central serous chorioretinopathy. Communications Biology, Vol. 2, Issue. 1,

White, Marquitta J. Yaspan, Brian L. Veatch, Olivia J. Goddard, Pagé Risse‐Adams, Oona S. and Contreras, Maria G. 2019. Strategies for Pathway Analysis Using GWAS and WGS Data. Current Protocols in Human Genetics, Vol. 100, Issue. 1,

Tao, Feifei Beecham, Gary W. Rebelo, Adriana P. Svaren, John Blanton, Susan H. Moran, John J. Lopez‐Anido, Camila Morrow, Jasper M. Abreu, Lisa Rizzo, Devon Kirk, Callyn A. Wu, Xingyao Feely, Shawna Verhamme, Camiel Saporta, Mario A. Herrmann, David N. Day, John W. Sumner, Charlotte J. Lloyd, Thomas E. Li, Jun Yum, Sabrina W. Taroni, Franco Baas, Frank Choi, Byung‐Ok Pareyson, Davide Scherer, Steven S. Reilly, Mary M. Shy, Michael E. and Züchner, Stephan 2019. Variation in SIPA1L2 is correlated with phenotype modification in Charcot– Marie– Tooth disease type 1A. Annals of Neurology, Vol. 85, Issue. 3, p. 316.

Nel, Melissa Mulder, Nicola Europa, Tarin A. and Heckmann, Jeannine M. 2019. Using Whole Genome Sequencing in an African Subphenotype of Myasthenia Gravis to Generate a Pathogenetic Hypothesis. Frontiers in Genetics, Vol. 10, Issue. ,

Dumitrescu, Logan Mahoney, Emily R Mukherjee, Shubhabrata Lee, Michael L Bush, William S Engelman, Corinne D Lu, Qiongshi Fardo, David W Trittschuh, Emily H Mez, Jesse Kaczorowski, Catherine Hernandez Saucedo, Hector Widaman, Keith F Buckley, Rachel Properzi, Michael Mormino, Elizabeth Yang, Hyun-Sik Harrison, Tessa Hedden, Trey Nho, Kwangsik Andrews, Shea J Tommet, Doug Hadad, Niran Sanders, R Elizabeth Ruderfer, Douglas M Gifford, Katherine A Moore, Annah M Cambronero, Francis Zhong, Xiaoyuan Raghavan, Neha S Vardarajan, Badri Pericak-Vance, Margaret A Farrer, Lindsay A Wang, Li-San Cruchaga, Carlos Schellenberg, Gerard Cox, Nancy J Haines, Jonathan L Keene, C Dirk Saykin, Andrew J Larson, Eric B Sperling, Reisa A Mayeux, Richard Bennett, David A Schneider, Julie A Crane, Paul K Jefferson, Angela L and Hohman, Timothy J 2020. Genetic variants and functional pathways associated with resilience to Alzheimer’s disease. Brain, Vol. 143, Issue. 8, p. 2561.

Camilleri, Michael Sandler, Robert S. and Peery, Anne F. 2020. Etiopathogenetic Mechanisms in Diverticular Disease of the Colon. Cellular and Molecular Gastroenterology and Hepatology, Vol. 9, Issue. 1, p. 15.

Price, Kaitlyn M. Wigg, Karen G. Feng, Yu Blokland, Kirsten Wilkinson, Margaret He, Gengming Kerr, Elizabeth N. Carter, Tasha‐Cate Guger, Sharon L. Lovett, Maureen W. Strug, Lisa J. and Barr, Cathy L. 2020. Genome‐wide association study of word reading: Overlap with risk genes for neurodevelopmental disorders. Genes, Brain and Behavior, Vol. 19, Issue. 6,

Takahashi, Yasuo Yamazaki, Keiko Kamatani, Yoichiro Kubo, Michiaki Matsuda, Koichi and Asai, Satoshi 2021. A genome-wide association study identifies a novel candidate locus at the DLGAP1 gene with susceptibility to resistant hypertension in the Japanese population. Scientific Reports, Vol. 11, Issue. 1,

Hariharan, Praveen and Dupuis, Josée 2021. Mapping gene and gene pathways associated with coronary artery disease: a CARDIoGRAM exome and multi-ancestry UK biobank analysis. Scientific Reports, Vol. 11, Issue. 1,

Kulkarni, Om Sugier, Pierre-Emmanuel Guibon, Julie Boland-Augé, Anne Lonjou, Christine Bacq-Daian, Delphine Olaso, Robert Rubino, Carole Souchard, Vincent Rachedi, Frédérique Lence-Anta, Juan Jesus Ortiz, Rosa Maria Xhaard, Constance Laurent-Puig, Pierre Mulot, Claire Guizard, Anne-Valérie Schvartz, Claire Boutron-Ruault, Marie-Christine Ostroumova, Evgenia Kesminiene, Ausrele Deleuze, Jean-François Guénel, Pascal De Vathaire, Florent Truong, Thérèse and Lesueur, Fabienne 2021. Gene network and biological pathways associated with susceptibility to differentiated thyroid carcinoma. Scientific Reports, Vol. 11, Issue. 1,

Luo, Pan Liu, Lin Hou, Weikun Xu, Ke Xu, Peng and Sheikh, Nadeem 2022. Gene Set Enrichment Analysis Detected Immune Cell‐Related Pathways Associated with Primary Sclerosing Cholangitis. BioMed Research International, Vol. 2022, Issue. 1,

Trpchevska, Natalia Freidin, Maxim B. Broer, Linda Oosterloo, Berthe C. Yao, Shuyang Zhou, Yitian Vona, Barbara Bishop, Charles Bizaki-Vallaskangas, Argyro Canlon, Barbara Castellana, Fabio Chasman, Daniel I. Cherny, Stacey Christensen, Kaare Concas, Maria Pina Correa, Adolfo Elkon, Ran Mengel-From, Jonas Gao, Yan Giersch, Anne B.S. Girotto, Giorgia Gudjonsson, Alexander Gudnason, Vilmundur Heard-Costa, Nancy L. Hertzano, Ronna Hjelmborg, Jacob v.B. Hjerling-Leffler, Jens Hoffman, Howard J. Kaprio, Jaakko Kettunen, Johannes Krebs, Kristi Kähler, Anna K. Lallemend, Francois Launer, Lenore J. Lee, I-Min Leonard, Hampton Li, Chuan-Ming Lowenheim, Hubert Magnusson, Patrik K.E. van Meurs, Joyce Milani, Lili Morton, Cynthia C. Mäkitie, Antti Nalls, Mike A. Nardone, Giuseppe Giovanni Nygaard, Marianne Palviainen, Teemu Pratt, Sheila Quaranta, Nicola Rämö, Joel Saarentaus, Elmo Sardone, Rodolfo Satizabal, Claudia L. Schweinfurth, John M. Seshadri, Sudha Shiroma, Eric Shulman, Eldad Simonsick, Eleanor Spankovich, Christopher Tropitzsch, Anke Lauschke, Volker M. Sullivan, Patrick F. Goedegebure, Andre Cederroth, Christopher R. Williams, Frances M.K. Nagtegaal, Andries Paul Metspalu, Andres Nelis, Mari Mägi, Reedik and Esko, Tõnu 2022. Genome-wide association meta-analysis identifies 48 risk variants and highlights the role of the stria vascularis in hearing loss. The American Journal of Human Genetics, Vol. 109, Issue. 6, p. 1077.

Download full list

Article contents

A Novel Approach for Pathway Analysis of GWAS Data Highlights Role of BMP Signaling and Muscle Cell Differentiation in Colorectal Cancer Susceptibility

Abstract

Keywords