Published online by Cambridge University Press: 16 June 2022
Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes or proteins, using a network of gene coexpression data that includes functional annotations. Signed distance correlation has proved useful for the construction of unweighted gene coexpression networks. However, transforming correlation values into unweighted networks may lead to a loss of important biological information related to the intensity of the correlation. Here, we introduce a principled method to construct weighted gene coexpression networks using signed distance correlation. These networks contain weighted edges only between those pairs of genes whose correlation value is higher than a given threshold. We analyze data from different organisms and find that networks generated with our method based on signed distance correlation are more stable and capture more biological information compared to networks obtained from Pearson correlation. Moreover, we show that signed distance correlation networks capture more biological information than unweighted networks based on the same metric. While we use biological data sets to illustrate the method, the approach is general and can be used to construct networks in other domains. Code and data are available on https://github.com/javier-pardodiaz/sdcorGCN.
Action Editor: Christoph Stadtfeld
A preliminary version of this paper was presented at the Ninth International Conference on Complex Networks and their Applications (COMPLEX NETWORKS 2020).