Bayesian model selection for the latent position cluster model for social networks

Published online by Cambridge University Press: 03 April 2017

CAITRÍONA RYAN
Affiliation:
Department of Mathematics and Statistics, University of Limerick, Limerick, Ireland
JASON WYSE
Affiliation:
Discipline of Statistics, School of Computer Science and Statistics, Trinity College Dublin, College Green, Dublin 2, Ireland
NIAL FRIEL
Affiliation:
School of Mathematics and Statistics and Insight: The National Centre for Big Data Analytics, University College Dublin, Belfield, Dublin 4, Ireland

Abstract

The latent position cluster model is a popular model for the statistical analysis of network data. It assumes that the actors lie in an underlying latent space, where their positions follow a finite mixture distribution, and that actors which are close in this space are more likely to be tied by an edge. This is appealing because the model clusters the actors, which gives the practitioner useful qualitative information. However, exploring the uncertainty in the number of latent components in the mixture distribution is a complex task. The current state of the art is to use an approximate form of BIC for this purpose, in which an approximation replaces the true log-likelihood, which is unavailable. The main contribution of this paper is to show that, through the use of conjugate prior distributions, almost all of the model parameters can be integrated out analytically, leaving a posterior distribution that depends on the allocation vector of the mixture model. This enables posterior inference over the number of components in the latent mixture distribution without trans-dimensional MCMC algorithms such as reversible jump MCMC. Our approach is compared with the state-of-the-art latentnet (Krivitsky & Handcock, 2015) and VBLPCM (Salter-Townshend & Murphy, 2013) packages.
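To make the generative idea in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. The abstract does not state the link function, so the sketch assumes the logistic form commonly used for this model, logit P(y_ij = 1) = beta - ||z_i - z_j||, with spherical Gaussian mixture components. The function name `simulate_lpcm` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_lpcm(n=60, weights=(0.5, 0.3, 0.2),
                  means=((0.0, 0.0), (4.0, 0.0), (0.0, 4.0)),
                  sd=0.5, beta=1.0, seed=0):
    """Simulate an undirected network from a latent position cluster model.

    Assumption (not taken from the abstract): ties are independent given
    the latent positions, with logit P(y_ij = 1) = beta - ||z_i - z_j||.
    """
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)

    # Allocation vector: the mixture component of each actor.
    alloc = rng.choice(len(weights), size=n, p=weights)

    # Latent positions: spherical Gaussian noise around the component mean.
    z = means[alloc] + sd * rng.standard_normal((n, means.shape[1]))

    # Pairwise Euclidean distances in the latent space.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)

    # Tie probabilities decrease as latent distance increases.
    prob = 1.0 / (1.0 + np.exp(-(beta - dist)))

    # Sample each unordered pair once, then mirror to a symmetric matrix.
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    y = (upper | upper.T).astype(int)
    return y, z, alloc


if __name__ == "__main__":
    y, z, alloc = simulate_lpcm()
    same = alloc[:, None] == alloc[None, :]
    off_diag = ~np.eye(len(alloc), dtype=bool)
    print("within-cluster tie density: ", y[same & off_diag].mean())
    print("between-cluster tie density:", y[~same].mean())
```

With well-separated component means, as in these illustrative defaults, ties should be noticeably denser within clusters than between them, which is the qualitative behaviour the abstract describes. Inferring the number of components, the paper's actual subject, is not attempted here.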

Type
Research Article
Copyright
Copyright © Cambridge University Press 2017 


References

Adamic, L. A., Lukose, R. M., Puniyani, A. R., & Huberman, B. A. (2001). Search in power-law networks. Physical Review E, 64 (Sep), 046135.
Faloutsos, M., Faloutsos, P., & Faloutsos, C. (1999). On power-law relationships of the Internet topology. SIGCOMM Computer Communication Review, 29 (4), 251–262.
Fraley, C., & Raftery, A. E. (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97 (458), 611–631.
Fraley, C., & Raftery, A. E. (2003). Enhanced model-based clustering, density estimation, and discriminant analysis software: MCLUST. Journal of Classification, 20 (2), 263–286.
Friel, N., & Wyse, J. (2012). Estimating the evidence – a review. Statistica Neerlandica, 66 (3), 288–308.
Handcock, M. S., Raftery, A. E., & Tantrum, J. M. (2007). Model-based clustering for social networks. Journal of the Royal Statistical Society: Series A (Statistics in Society), 170 (2), 301–354.
Hoff, P. D., Raftery, A. E., & Handcock, M. S. (2002). Latent space approaches to social network analysis. Journal of the American Statistical Association, 97 (460), 1090–1098.
Kolaczyk, E. D. (2009). Statistical analysis of network data: Methods and models. New York: Springer.
Krivitsky, P. N., & Handcock, M. S. (2008). Fitting latent cluster models for networks with latentnet. Journal of Statistical Software, 24 (5), 1–23.
Krivitsky, P. N., & Handcock, M. S. (2015). Latentnet: Latent position and cluster models for statistical networks. The Statnet Project (http://www.statnet.org). R package version 2.7.1.
Lusseau, D., Schneider, K., Boisseau, O. J., Haase, P., Slooten, E., & Dawson, S. M. (2003). The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behavioral Ecology and Sociobiology, 54 (4), 396–405.
Michailidis, G. (2012). Statistical challenges in biological networks. Journal of Computational and Graphical Statistics, 21 (4), 840–855.
Miller, W., & Harrison, M. T. (2016). Mixture models with a prior on the number of components. Journal of the American Statistical Association. doi:10.1080/01621459.2016.1255636
Nobile, A. (2007). Bayesian finite mixtures: A note on prior specification and posterior computation. Preprint, arXiv:0711.0458.
Nobile, A., & Fearnside, A. T. (2007). Bayesian finite mixtures with an unknown number of components: The allocation sampler. Statistics and Computing, 17 (2), 147–162.
Nowicki, K., & Snijders, T. A. B. (2001). Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96 (455), 1077–1087.
Raftery, A. E., Niu, X., Hoff, P. D., & Yeung, K. Y. (2012). Fast inference for the latent space network model using a case-control approximate likelihood. Journal of Computational and Graphical Statistics, 21 (4), 901–919.
Richardson, S., & Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 59 (4), 731–792.
Robins, G., Snijders, T., Wang, P., Handcock, M. S., & Pattison, P. (2007). Recent developments in exponential random graph (p*) models for social networks. Social Networks, 29 (2), 192–215.
Salter-Townshend, M., & Murphy, T. B. (2013). Variational Bayesian inference for the latent position cluster model. Computational Statistics and Data Analysis, 57 (1), 661–671.
Sampson, S. F. (1968). A novitiate in a period of change: An experimental and case study of social relationships. Ph.D. thesis, Cornell University, September.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6 (2), 461–464.
Shortreed, S., Handcock, M. S., & Hoff, P. (2006). Positional estimation within a latent space model for networks. Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 2 (1), 24–33.
Sibson, R. (1979). Studies in the robustness of multidimensional scaling: Perturbational analysis of classical scaling. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 41 (2), 217–229.
Wasserman, S., & Galaskiewicz, J. (1994). Advances in social network analysis: Research in the social and behavioral sciences. Thousand Oaks, California: Sage Publications.
Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 61 (3), 401–425.
Wyse, J., & Friel, N. (2012). Block clustering with collapsed latent block models. Statistics and Computing, 22 (2), 415–428.
Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33 (4), 452–473.