Hostname: page-component-78c5997874-8bhkd Total loading time: 0 Render date: 2024-11-20T02:34:30.091Z Has data issue: false hasContentIssue false

Examining the variability in network populations and its role in generative models

Published online by Cambridge University Press:  17 January 2020

Viplove Arora
Affiliation:
School of Industrial Engineering, Purdue University, West LafayetteIN, 47907USA
Dali Guo
Affiliation:
School of Industrial Engineering, Purdue University, West LafayetteIN, 47907USA
Katherine D. Dunbar
Affiliation:
School of Industrial Engineering, Purdue University, West LafayetteIN, 47907USA
Mario Ventresca*
Affiliation:
School of Industrial Engineering, Purdue University, West LafayetteIN, 47907USA
*
*Corresponding author. Email: [email protected]

Abstract

A principled approach to understand networks is to formulate generative models and infer their parameters from given network data. Due to the scarcity of data in the form of multiple networks that have evolved from the same process, generative models are typically formulated to learn parameters from a single network observation, hence ignoring the natural variability of the “true” process. In this paper, we highlight the importance of variability in evaluating generative models and present two ways of quantifying the variability for a finite set of networks. The first evaluation scheme compares the statistical properties of networks in a dissimilarity space, while the other relies on data-driven entropy measures to compute variability in network populations. Using these measures, we evaluate the ability of four generative models to synthesize networks that capture the variability of the “true” process. Our empirical analysis suggests that generative models fitted for a single network observation fail to capture the variability in the network population. Our work highlights the need for rethinking the way we evaluate the goodness-of-fit of new and existing network models and devising models that are capable of matching the variability of network populations when available.

Type
Research Article
Copyright
© Cambridge University Press 2020

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Footnotes

Both authors contributed equally to this manuscript.

References

Alderson, David L. (2008). OR FORUM—Catching the ‘Network Science’ Bug: Insight and Opportunity for the Operations Researcher. Operations Research, 56(5), 10471065.CrossRefGoogle Scholar
Amico, E., & Goni, J. (2018). Mapping hybrid functional-structural connectivity traits in the human connectome. Network Neuroscience, 2(3), 306322.CrossRefGoogle ScholarPubMed
Anand, K., & Bianconi, G. (2009). Entropy measures for networks: Toward an information theory of complex topologies. Physical Review E – Statistical, Nonlinear, and Soft Matter Physics, 80(4), 14.CrossRefGoogle ScholarPubMed
Anderson, C. J., Wasserman, S., & Crouch, B. (1999). A p* primer: Logit models for social networks. Social Networks, 21(1), 3766.CrossRefGoogle Scholar
Arora, V., & Ventresca, M. (2016). A Multi-objective optimization approach for generating complex networks. Proceedings of the 2016 on genetic and evolutionary computation conference companion – gecco 2016 companion (pp. 1516). New York, NY, USA: ACM Press.Google Scholar
Arora, V., & Ventresca, M. (2017). Action-based modeling of complex networks. Scientific Reports, 7(1), 6673.CrossRefGoogle ScholarPubMed
Arora, V., & Ventresca, M. (2019). Evaluating the natural variability in generative models for complex networks. Studies in computational intelligence (pp. 743754). Springer International Publishing.Google Scholar
Avena-Koenigsberger, A., Goni, J., Sole, R., & Sporns, O. (2014). Network morphospace. Journal of the Royal Society Interface, 12(103), 2014088120140881.CrossRefGoogle Scholar
Banerjee, A., Chandrasekhar, A. G., Duflo, E., & Jackson, M. O. (2013). The diffusion of microfinance. Science, 341(6144), 1236498.CrossRefGoogle ScholarPubMed
Barabasi, A.-L. (2002). Linked: The new science of networks (1st ed.) Basic Books.Google Scholar
Barabasi, A.-L. (2012). The network takeover. Nature Physics, 8(1), 1416.CrossRefGoogle Scholar
Barabasi, A.-L. (2016). Network science. Cambridge University Press.Google Scholar
Barabási, A.-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509512.CrossRefGoogle ScholarPubMed
Bender, E. A., & Canfield, E. R. (1978). The asymptotic number of labeled graphs with given degree sequences. Journal of Combinatorial Theory, Series A, 24(3), 296307.CrossRefGoogle Scholar
Bianconi, G. (2008). The entropy of randomized network ensembles. Epl (Europhysics Letters), 81(2), 28005.CrossRefGoogle Scholar
Casiraghi, G., Nanumyan, V., Scholtes, I., & Schweitzer, F. (2016). Generalized hypergeometric ensembles: Statistical hypothesis testing in complex networks. 7, arXiv, 1–5.Google Scholar
Chung, F., & Lu, L. (2002a). Connected components in random graphs with given expected degree sequences. Annals of Combinatorics, 6(2), 125145.CrossRefGoogle Scholar
Chung, F., & Lu, L. (2002b). The average distances in random graphs with given expected degrees. Proceedings of the National Academy of Sciences, 99(25), 1587915882.CrossRefGoogle ScholarPubMed
Cohen, R., & Havlin, S. (2010). Complex networks: Structure, robustness, and function. Cambridge University Press.CrossRefGoogle Scholar
Danon, L., Díaz-Guilera, A., Duch, J., & Arenas, A. (2005). Comparing community structure identification. Journal of Statistical Mechanics: Theory and Experiment, 2005(09), P09008P09008.CrossRefGoogle Scholar
De Domenico, M., & Biamonte, J. (2016). Spectral entropies as information-theoretic tools for complex network comparison. Physical Review X, 6(4), 041062.CrossRefGoogle Scholar
Dehmer, M., & Mowshowitz, A. (2011). A history of graph entropy measures. Information Sciences, 181(1), 5778.CrossRefGoogle Scholar
Donnat, C., & Holmes, S. (2018). Tracking network dynamics: A survey of distances and similarity metrics. arxiv preprint, 1.Google Scholar
Durante, D., Dunson, D. B., & Vogelstein, J. T. (2017). Nonparametric Bayes Modeling of Populations of Networks. Journal of the American Statistical Association, 112(520), 15161530.CrossRefGoogle Scholar
Emmert-Streib, F., & Dehmer, M. (2012). Exploring statistical and population aspects of network complexity. Plos One, 7(5), e34523.CrossRefGoogle ScholarPubMed
Emmert-Streib, F., Dehmer, M., & Shi, Y. (2016). Fifty years of graph matching, network alignment and network comparison. Information Sciences, 346347(6), 180197.Google Scholar
Erdös, P., & Rényi, A. (1959). On random graphs. Publicationes Mathematicae Debrecen, 6, 290297.Google Scholar
Erdös, P., & Rényi, A. (1960). On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 5(1), 1761.Google Scholar
Faloutsos, M., Faloutsos, P., & Faloutsos, C. (1999). On power-law relationships of the Internet topology. Proceedings of the conference on applications, technologies, architectures, and protocols for computer communication – sigcomm ’99, vol. 50 (251–262). New York, NY, USA: ACM Press.Google Scholar
Faust, K., & Wasserman, S. (1992). Blockmodels: Interpretation and evaluation. Social Networks, 14(1–2), 561.CrossRefGoogle Scholar
Fields, S., & Song, O.-K. (1989). A novel genetic system to detect protein–protein interactions. Nature, 340(6230), 245246.CrossRefGoogle ScholarPubMed
Fortunato, S., & Hric, D. (2016). Community detection in networks: A user guide. Physics Reports, 659(11), 144.CrossRefGoogle Scholar
Ginestet, C. E., Li, J., Balachandran, P., Rosenberg, S., & Kolaczyk, E. D. (2017). Hypothesis testing for network data in functional neuroimaging. The Annals of Applied Statistics, 11(2), 725750.CrossRefGoogle Scholar
Goldenberg, A., Zheng, A. X., Fienberg, S. E., & Airoldi, E. M. (2009). A survey of statistical network models. Foundations and Trends in Machine Learning, 2(3), 129233.CrossRefGoogle Scholar
Gollini, I., & Murphy, T. B. (2016). Joint Modeling of multiple network views. Journal of Computational and Graphical Statistics, 25(1), 246265.CrossRefGoogle Scholar
Gutfraind, A., Meyers, L. A., & Safro, I. (2012). Multiscale Network Generation. arXiv:1207.4266, 7, 28.Google Scholar
Hajibagheri, A., Lakkaraju, K., Sukthankar, G., Wigand, R. T., & Agarwal, N. (2015). Conflict and communication in massively-multiplayer online games. International conference on social computing, behavioral-cultural modeling, and prediction (pp. 6574). Springer.CrossRefGoogle Scholar
Hoff, P. D., Raftery, A. E., & Handcock, M. S. (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460), 10901098.CrossRefGoogle Scholar
Huberman, B. A. (2003). The laws of the Web: Patterns in the ecology of information. MIT Press.Google Scholar
Hunter, D. R. (2007). Curved exponential family models for social networks. Social Networks, 29(2), 216230.CrossRefGoogle ScholarPubMed
Isella, L., Stehlé, J., Barrat, A., Cattuto, C., Pinton, J.-F., & den Broeck, W. (2011). What’s in a crowd? Analysis of face-to-face behavioral networks. Journal of Theoretical Biology, 271(1), 166180.CrossRefGoogle Scholar
Ladyman, J., Lambert, J., & Wiesner, K. (2013). What is a complex system? European Journal for Philosophy of Science, 3(1), 3367.CrossRefGoogle Scholar
Leskovec, J., Kleinberg, J., & Faloutsos, C. (2007). Graph evolution: Densification and shrinking diameters. Acm Transactions on Knowledge Discovery from Data, 1(1), 2–es.CrossRefGoogle Scholar
Lunagómez, S., Olhede, S. C., & Wolfe, P. J. (2019). Modeling network populations via graph distances. arxiv preprint, 126.Google Scholar
Mahadevan, P., Krioukov, D., Fall, K., & Vahdat, A. (2006). Systematic topology analysis and generation using degree correlations. Acm Sigcomm Computer Communication Review, 36(4), 135.CrossRefGoogle Scholar
Maslov, S., & Sneppen, K. (2002). Specificity and stability in topology of protein networks. Science, 296(5569), 910913.CrossRefGoogle ScholarPubMed
McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415444.CrossRefGoogle Scholar
Menezes, T., & Roth, C. (2019). Automatic discovery of families of network generative processes. Dynamics on and of complex networks iii (pp. 83111).CrossRefGoogle Scholar
Milo, R., Shen-Orr, S., Itzkovitz, S., & Kashtan, N. (2002). Network Motif: Simple building blocks of complex networks. Science, 298(5594), 298.CrossRefGoogle Scholar
Molloy, M., & Reed, B. (1995). A critical point for random graphs with a given degree sequence. Random Structures & Algorithms, 6(2–3), 161180.CrossRefGoogle Scholar
Moreno, J. L. (1934). Who shall survive? A new approach to the problem of human interrelations. Nervous and mental disease monograph series, no. 58. Washington: Nervous and Mental Disease Publishing Co.CrossRefGoogle Scholar
Moreno, S., & Neville, J. (2013). Network hypothesis testing using mixed Kronecker product graph models. In Xiong, H., Karypis, G., Thuraisingham, B. M., Cook, D. J., & Wu, X. (eds), 13th international conference on data mining (pp. 11631168).CrossRefGoogle Scholar
Moreno, S., & Neville, J. (2009). An Investigation of the distributional characteristics of generative graph models. Proceedings of the 1st workshop on information in networks.Google Scholar
Moreno, S., Neville, J., & Kirshner, S. (2018). Tied Kronecker product graph models to capture variance in network populations. Acm Transactions on Knowledge Discovery from Data, 20(3), 140.CrossRefGoogle Scholar
Newman, M. E. J. (2003a). Mixing patterns in networks. Physical Review E, 67(2), 026126.CrossRefGoogle ScholarPubMed
Newman, M. E. J. (2003b). The structure and function of complex networks. Siam Review, 45(2), 167–256.ewman, M. (2008). The physics of networks. Physics Today, 61(11), 3338.CrossRefGoogle Scholar
Orsini, C., Dankulov, M. M., Colomer-de Simón, P., Jamakovic, A., Mahadevan, P., Vahdat, A., Bassler, K. E., Toroczkai, Z., Boguñá, M., Caldarelli, G., Fortunato, S., & Krioukov, D. (2015). Quantifying randomness in real networks. Nature communications, 6(May), 8627.CrossRefGoogle ScholarPubMed
Palla, G., Derényi, I., Farkas, I., & Vicsek, T. (2005). Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), 814818.CrossRefGoogle ScholarPubMed
Park, J., & Newman, M. E. J. (2004). Statistical mechanics of networks. Physical Review E, 70(6), 066117.CrossRefGoogle ScholarPubMed
Pastor-Satorras, R., Vázquez, A., & Vespignani, A. (2001). Dynamical and correlation properties of the internet. Physical Review Letters, 87(25), 258701–1.CrossRefGoogle ScholarPubMed
Peacock, J. A. (1983). Two-dimensional goodness-of-fit testing in astronomy. Monthly Notices of the Royal Astronomical Society, 202(3), 615627.CrossRefGoogle Scholar
Peixoto, T. P. (2012). Entropy of stochastic blockmodel ensembles. Physical Review E, 85(5), 056122.CrossRefGoogle ScholarPubMed
Peixoto, T. P. (2015). Model selection and hypothesis testing for large-scale network models with overlapping groups. Physical Review X, 5(1), 120.CrossRefGoogle Scholar
Peixoto, T. P. (2017). Nonparametric Bayesian inference of the microcanonical stochastic block model. Physical Review E, 95(1), 121.Google ScholarPubMed
Perc, M., Jordan, J. J., Rand, D. G., Wang, Z., Boccaletti, S., & Szolnoki, A. (2017). Statistical physics of human cooperation. Physics Reports, 687(5), 151.CrossRefGoogle Scholar
Pinar, A., Seshadhri, C., Kolda, T. G., Pinar, A., & Kolda, T. G. (2012). The similarity between stochastic Kronecker and Chung-Lu graph models. Zaki, M., Obradovic, Z., Tan, P. N., Banerjee, A., Kamath, C. & Parthasarathy, S. (Eds.), Twelfth Siam international conference on data mining (pp. 10711082). Philadelphia, PA: Society for Industrial and Applied Mathematics.Google Scholar
Piraveenan, M., Prokopenko, M., & Zomaya, A. Y. (2008). Local assortativeness in scale-free networks. Epl (Europhysics Letters), 84(2), 28002.CrossRefGoogle Scholar
Price, D. J. D. S. (1965). Networks of scientific papers. Science, 149(3683), 510515.CrossRefGoogle ScholarPubMed
Przulj, N., Corneil, D. G., & Jurisica, I. (2004). Modeling interactome: Scale-free or geometric? Bioinformatics, 20(18), 35083515.CrossRefGoogle ScholarPubMed
Reichardt, J., & Bornholdt, S. (2006). Statistical mechanics of community detection. Physical Review E, 74(1), 016110.CrossRefGoogle ScholarPubMed
Robins, G., Pattison, P., Kalish, Y., & Lusher, D. (2007). An introduction to exponential random graph (p*) models for social networks. Social Networks, 29(2), 173191.CrossRefGoogle Scholar
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 11181123.CrossRefGoogle Scholar
Schieber, T. A., Carpi, L., Díaz-Guilera, A., Pardalos, P. M., Masoller, C., & Ravetti, M. G. (2017). Quantification of network structural dissimilarities. Nature Communications, 8(May 2016), 13928.CrossRefGoogle ScholarPubMed
Snijders, T. A. B., Pattison, P. E., Robins, G. L., & Handcock, M. S. (2004). New specifications for exponential random graph models. Sociological Methodology, 44.Google Scholar
Soundarajan, S., Eliassi-Rad, T., & Gallagher, B. (2014). A guide to selecting a network similarity method. Proceedings of the 2014 Siam international conference on data mining (pp. 10371045). Philadelphia, PA: Society for Industrial and Applied Mathematics.Google Scholar
Stanton, I., & Pinar, A. (2012). Constructing and sampling graphs with a prescribed joint degree distribution. Journal of Experimental Algorithmics, 17(1), 3.1.CrossRefGoogle Scholar
Strauss, D. (1986). On a general class of models for interaction. Siam Review, 28(4), 513527.CrossRefGoogle Scholar
Sweet, T. M., Thomas, A. C., & Junker, B. W. (2012). Hierarchical network models for education research. Journal of Educational and Behavioral Statistics, 38(3), 295318.CrossRefGoogle Scholar
Vallès-Català, T., Peixoto, T. P., Sales-Pardo, M., & Guimerà, R. (2018). Consistencies and inconsistencies between model selection and link prediction in networks. Physical Review E, 97(6), 062316.CrossRefGoogle ScholarPubMed
Van Essen, D. C., Smith, S. M., Barch, D. M., Behrens, T. E. J., Yacoub, E., & Ugurbil, K. (2013). The WU-Minn human connectome project: An overview. Neuroimage, 80(10), 6279.CrossRefGoogle ScholarPubMed
Views, R. (2000). University of oregon route views project.Google Scholar
Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks. Psychometrika, 60, 401425.CrossRefGoogle Scholar
Wasserman, S., & Anderson, C. (1987). Stochastic a posteriori blockmodels: Construction and assessment. Social Networks, 9(1), 136.CrossRefGoogle Scholar
Watts, D. J, & Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature, 393(6684), 440442.CrossRefGoogle ScholarPubMed
White, J. G., Southgate, E., Thomson, J. N., & Brenner, S. (1986). The structure of the nervous system of the Nematode Caenorhabditis elegans. Philosophical Transactions of the Royal Society B: Biological Sciences, 314(1165), 1340.Google ScholarPubMed
Zheng, B., Wu, H., Kuang, L., Qin, J., Du, W., Wang, J., & Li, D. (2014). A simple model clarifies the complicated relationships of complex networks. Science Report, 4, 6197.CrossRefGoogle ScholarPubMed
Supplementary material: PDF

Arora et al. supplementary material

Arora et al. supplementary material

Download Arora et al. supplementary material(PDF)
PDF 1.6 MB