A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis

Published online by Cambridge University Press:  01 January 2025

Christopher J. Urban*
Affiliation:
University of North Carolina at Chapel Hill
Daniel J. Bauer
Affiliation:
University of North Carolina at Chapel Hill
*Correspondence should be made to Christopher J. Urban, L. L. Thurstone Psychometric Laboratory in the Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, USA. Email: [email protected]

Abstract

Marginal maximum likelihood (MML) estimation is the preferred approach to fitting item response theory models in psychometrics due to the MML estimator’s consistency, normality, and efficiency as the sample size tends to infinity. However, state-of-the-art MML estimation procedures such as the Metropolis–Hastings Robbins–Monro (MH-RM) algorithm as well as approximate MML estimation procedures such as variational inference (VI) are computationally time-consuming when the sample size and the number of latent factors are very large. In this work, we investigate a deep learning-based VI algorithm for exploratory item factor analysis (IFA) that is computationally fast even in large data sets with many latent factors. The proposed approach applies a deep artificial neural network model called an importance-weighted autoencoder (IWAE) for exploratory IFA. The IWAE approximates the MML estimator using an importance sampling technique wherein increasing the number of importance-weighted (IW) samples drawn during fitting improves the approximation, typically at the cost of decreased computational efficiency. We provide a real data application that recovers results aligning with psychological theory across random starts. Via simulation studies, we show that the IWAE yields more accurate estimates as either the sample size or the number of IW samples increases (although factor correlation and intercept estimates exhibit some bias) and obtains similar results to MH-RM in less time. Our simulations also suggest that the proposed approach performs similarly to and is potentially faster than constrained joint maximum likelihood estimation, a fast procedure that is consistent when the sample size and the number of items simultaneously tend to infinity.
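
The abstract's claim that drawing more IW samples improves the approximation to the marginal likelihood can be illustrated with a small numerical sketch. The example below is not the authors' IFA model or their fitting code. It assumes a toy one-factor Gaussian model (z ~ N(0, 1), x | z ~ N(z, 1)), for which the exact log marginal likelihood is known, and it uses the prior as the importance distribution in place of a trained encoder network. The names `log_normal_pdf` and `iw_bound` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def iw_bound(x, num_iw_samples, num_reps=2000, seed=0):
    """Monte Carlo estimate of the importance-weighted lower bound on log p(x).

    Assumption: the importance distribution is the prior N(0, 1), so each
    log importance weight reduces to log p(x | z). A trained encoder would
    normally supply a better-matched importance distribution.
    """
    rng = np.random.default_rng(seed)
    # One row per replication, one column per importance-weighted sample.
    z = rng.standard_normal((num_reps, num_iw_samples))
    log_w = log_normal_pdf(x, z, 1.0)
    # Numerically stable log of the average importance weight in each row.
    row_max = log_w.max(axis=1, keepdims=True)
    log_mean_w = row_max[:, 0] + np.log(np.exp(log_w - row_max).mean(axis=1))
    # Averaging over replications approximates the expectation in the bound.
    return float(log_mean_w.mean())


x = 1.5
# Exact log marginal likelihood for the toy model: x ~ N(0, 2).
log_px = float(log_normal_pdf(x, 0.0, 2.0))
bounds = {k: iw_bound(x, k) for k in (1, 5, 50)}

for k, value in bounds.items():
    print(f"IW samples = {k:2d}  bound = {value:.3f}  gap = {log_px - value:.3f}")
```

With a single IW sample the bound coincides with the usual evidence lower bound used in VI. As the number of samples grows, the gap to the exact log marginal likelihood shrinks, while the cost per evaluation grows in proportion to the number of samples, which mirrors the accuracy and efficiency trade-off described above.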

Type
Theory and Methods
Copyright
Copyright © 2021 The Psychometric Society

Footnotes

We would like to thank the Editor, the Associate Editor, and the reviewers for their many constructive comments. We are also grateful to Dr. David Thissen for his extensive suggestions, feedback, and support.
