Mixed-membership of experts stochastic blockmodel

Published online by Cambridge University Press:  16 December 2015

ARTHUR WHITE
Affiliation:
School of Computer Science and Statistics, Trinity College Dublin, Dublin 2, Ireland (e-mail: [email protected])
THOMAS BRENDAN MURPHY
Affiliation:
School of Mathematical Sciences, University College Dublin, Dublin 4, Ireland (e-mail: [email protected])

Abstract

Social network analysis is the study of how links between a set of actors are formed. Typically, it is believed that links are formed in a structured manner, which may be due to, for example, political or material incentives, and which often may not be directly observable. The stochastic blockmodel represents this structure using latent groups which exhibit different connective properties, so that, conditional on the group membership of two actors, the probability of a link being formed between them is given by the corresponding entry of a connectivity matrix. The mixed membership stochastic blockmodel extends this model by allowing an actor to belong to different groups depending on the interaction in question, providing further flexibility.
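The two generative schemes described above can be sketched in a few lines of Python. This is only an illustrative simulation, not the authors' implementation: the function names `sample_sbm` and `sample_mmsb`, the directed-network convention, and the Dirichlet distribution on the membership vectors (the usual formulation of the mixed membership model, but not spelled out in the abstract) are all assumptions made for the example.

```python
import numpy as np


def sample_sbm(n, group_probs, connectivity, rng):
    """Simulate a directed network from a stochastic blockmodel (illustrative).

    Each actor belongs to exactly one latent group. Conditional on the groups
    of sender i and receiver j, the link i -> j is Bernoulli with probability
    connectivity[group_i, group_j].
    """
    groups = rng.choice(len(group_probs), size=n, p=group_probs)
    link_probs = connectivity[groups[:, None], groups[None, :]]
    adjacency = (rng.random((n, n)) < link_probs).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-links
    return groups, adjacency


def sample_mmsb(n, alpha, connectivity, rng):
    """Simulate a directed network from a mixed membership blockmodel (illustrative).

    Assumption: each actor has a membership vector drawn from Dirichlet(alpha).
    For every ordered pair (i, j), sender and receiver each draw a group from
    their own membership vector, so an actor's group can differ by interaction.
    """
    k = len(alpha)
    memberships = rng.dirichlet(alpha, size=n)  # shape (n, k), rows sum to 1
    adjacency = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            g_send = rng.choice(k, p=memberships[i])
            g_recv = rng.choice(k, p=memberships[j])
            adjacency[i, j] = rng.random() < connectivity[g_send, g_recv]
    return memberships, adjacency


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical 2-group connectivity matrix: dense within groups, sparse between.
    B = np.array([[0.8, 0.1],
                  [0.2, 0.6]])
    groups, A_sbm = sample_sbm(30, np.array([0.5, 0.5]), B, rng)
    memberships, A_mmsb = sample_mmsb(30, np.array([0.5, 0.5]), B, rng)
    print("SBM link density: ", A_sbm.sum() / (30 * 29))
    print("MMSB link density:", A_mmsb.sum() / (30 * 29))
```

In the first function an actor's group is fixed across all of its interactions, whereas in the second the group is redrawn for every ordered pair, which is the extra flexibility the mixed membership model provides.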

Attribute information can also play an important role in explaining network formation. Network models which do not explicitly incorporate covariate information require the analyst to compare fitted network models to additional attributes in a post-hoc manner. We introduce the mixed membership of experts stochastic blockmodel, an extension of the mixed membership stochastic blockmodel which incorporates actor covariate information directly into the model. The method is illustrated with an application to the Lazega Lawyers dataset. Model and variable selection methods are also discussed.

Type
Research Article
Copyright
Copyright © Cambridge University Press 2015 
