A Latent Hidden Markov Model for Process Data

Xueying Tang

doi:10.1007/s11336-023-09938-1

Published online by Cambridge University Press: 01 January 2025

Xueying Tang*
Affiliation: University of Arizona
*Correspondence should be made to Xueying Tang, University of Arizona, 617 N. Santa Rita Ave., Tucson, AZ 85721, USA. Email: [email protected]

Abstract

Response process data from computer-based problem-solving items describe respondents’ problem-solving processes as sequences of actions. Such data provide a valuable source for understanding respondents’ problem-solving behaviors. Recently, data-driven feature extraction methods have been developed to compress the information in unstructured process data into relatively low-dimensional features. Although the extracted features can be used as covariates in regression or other models to understand respondents’ response behaviors, the results are often hard to interpret because the relationship between the extracted features and the original response process is often not explicitly defined. In this paper, we propose a statistical model for describing response processes and how they vary across respondents. The proposed model assumes that a response process follows a hidden Markov model given the respondent’s latent traits. The structure of hidden Markov models resembles problem-solving processes, with the hidden states interpreted as problem-solving subtasks or stages. Incorporating the latent traits in hidden Markov models enables us to characterize the heterogeneity of response processes across respondents in a parsimonious and interpretable way. We demonstrate the performance of the proposed model through simulation experiments and case studies of PISA process data.
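
As a rough illustration of the structure described above, the sketch below builds a toy hidden Markov model whose transition probabilities depend on a respondent's latent trait, and evaluates the probability of an action sequence with the standard forward algorithm. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar trait `theta`, the multinomial-logit link, the two hidden states, the three actions, and all parameter values and function names.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def transition_matrix(theta, intercepts, slopes):
    """Row j holds P(next state = k | current state = j, theta).

    ASSUMPTION: a multinomial-logit link (intercept + slope * theta),
    chosen only for illustration.
    """
    return [
        softmax([a + b * theta for a, b in zip(a_row, b_row)])
        for a_row, b_row in zip(intercepts, slopes)
    ]


def likelihood(actions, theta, init, intercepts, slopes, emissions):
    """Forward algorithm: P(action sequence | theta), summed over hidden paths."""
    trans = transition_matrix(theta, intercepts, slopes)
    n_states = len(init)
    # alpha[k] = P(actions so far, current hidden state = k | theta)
    alpha = [init[k] * emissions[k][actions[0]] for k in range(n_states)]
    for action in actions[1:]:
        alpha = [
            sum(alpha[j] * trans[j][k] for j in range(n_states))
            * emissions[k][action]
            for k in range(n_states)
        ]
    return sum(alpha)


def simulate(theta, length, init, intercepts, slopes, emissions, rng):
    """Draw one action sequence of the given length for a respondent."""
    trans = transition_matrix(theta, intercepts, slopes)
    state_ids = list(range(len(init)))
    action_ids = list(range(len(emissions[0])))
    state = rng.choices(state_ids, weights=init)[0]
    actions = []
    for _ in range(length):
        actions.append(rng.choices(action_ids, weights=emissions[state])[0])
        state = rng.choices(state_ids, weights=trans[state])[0]
    return actions


# Hypothetical toy parameters: state 0 = "explore", state 1 = "execute".
INIT = [1.0, 0.0]                               # always start in state 0
INTERCEPTS = [[0.0, -1.0], [-2.0, 0.0]]         # baseline transition logits
SLOPES = [[0.0, 1.5], [0.0, 0.0]]               # effect of theta on the logits
EMISSIONS = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]  # P(action | hidden state)
```

Under these made-up values, a larger `theta` raises the probability of moving from state 0 to state 1, so `likelihood([0, 0, 2, 2], theta, INIT, INTERCEPTS, SLOPES, EMISSIONS)` can be compared across values of `theta`, and `simulate(theta, 10, INIT, INTERCEPTS, SLOPES, EMISSIONS, random.Random(0))` draws a synthetic sequence. This only mirrors the idea that the latent trait accounts for differences between respondents. How the trait actually enters the model is not stated in the abstract.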

Keywords

response process; latent variable; hidden Markov models; problem-solving behaviors

Type: Theory and Methods
Information: Psychometrika, Volume 89, Issue 1, March 2024, pp. 205–240

DOI: https://doi.org/10.1007/s11336-023-09938-1
Copyright: Copyright © 2023 The Author(s), under exclusive licence to The Psychometric Society.
