
SRMR for Models with Covariates

Published online by Cambridge University Press:  03 January 2025

Daniel McNeish*
Affiliation:
Arizona State University
Tyler H. Matta
Affiliation:
HMH
*
Corresponding author: Daniel McNeish; Email:[email protected]

Abstract

The standardized root mean squared residual (SRMR) is commonly reported to evaluate approximate fit of latent variable models. As traditionally defined, SRMR summarizes the discrepancy between observed covariance elements and implied covariance elements. However, current applications of latent variable models often include additional features like overidentified mean structures and covariates, to which the traditional SRMR definition is not applicable. To date, SRMR extensions for models with covariates have received limited attention. Nonetheless, mainstream software provide SRMR for models with covariates, but values differ based on model specification and differ across programs. The goal of this paper is to formalize SRMR definitions for models with covariates. We develop possible SRMR definitions corresponding to different model specifications with covariates, discussing the advantages and disadvantages of each. Importantly, some SRMR definitions are susceptible to confounding misfit and model size such that SRMR values systematically decrease and suggest better fit when covariates are present, even if covariates have null effects. The primary conclusion is that there may not be a single unifying SRMR definition for covariates, but practically, researchers reporting SRMR with covariates should be aware (a) which definition is being used and (b) which information is and is not included in the particular definition.

Type
Theory and Methods
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Psychometric Society

1 Introduction

The standardized root mean squared residual (SRMR) has been characterized as a standardized effect size for evaluating the discrepancy between a model-implied covariance matrix and the covariance matrix from the observed data in structural equation models (Maydeu-Olivares, 2017; Maydeu-Olivares et al., 2018; Saris et al., 2009). Several recent sources have endorsed SRMR over competing fit indices like RMSEA or CFI based on advantages like a consistent interpretation that is less dependent on model characteristics (Shi et al., 2018; Ximénez et al., 2022), strong performance with small samples or small degrees of freedom (Pavlov et al., 2021; Shi et al., 2022), and the ability to put an interval around the index to account for sampling variability (Maydeu-Olivares et al., 2018; Ogasawara, 2001; Shi et al., 2020). SRMR also tends to be the least redundant with other commonly reported metrics (Hu & Bentler, 1998; Browne et al., 2002), and commonly cited resources for model fit evaluation have suggested a “two-index strategy” of reporting SRMR in conjunction with another index like RMSEA, CFI, Gamma Hat, or McDonald’s Centrality Index to minimize classification error rates (Hu & Bentler, 1999).

Although recent and classical research has extolled several benefits of SRMR, a potential limitation is that SRMR has not been rigorously studied (or formally defined) for some common types of structural equation models. The original definition of SRMR is valid for factor analyses where the mean structure is saturated or absent and where no covariates are present (Jöreskog & Sörbom, 1981; Bentler, 1995); however, the classical version of SRMR is not suitable for models concerned with aspects beyond the covariance structure. For instance, mean structure models are the norm in most current applications because accommodating common missing data techniques requires a mean structure (e.g., Enders, 2006, p. 329), the fit of which may not be perfect even if the mean structure is saturated (Asparouhov & Muthén, 2018, p. 6). The traditional SRMR definition is insensitive to potential mean structure misfit and only incorporates covariance structure misfit (e.g., Leite & Stapleton, 2011; Wu & West, 2010).

Previous work has extended definitions of SRMR to include mean structures such that discrepancies between the observed and model-implied means can be incorporated into the index (e.g., Asparouhov & Muthén, 2018). However, other model features have not received much attention. In particular, covariates are present in latent growth models, multiple indicator multiple cause (MIMIC) models, and some measurement invariance models, but there has been little formal study of the implications of covariates for SRMR definitions. Furthermore, covariates pose unique challenges related to how they are specified (i.e., fixed versus stochastic) and which model-implied moments are used (i.e., marginal versus conditional on covariates; Vonesh et al., 1996). As will be discussed shortly, these decisions impact which variables count as part of “the model” and can alter the numerator and/or the denominator of the SRMR calculation. Practically, this is relevant because different covariate specifications corresponding to the same conceptual model can have different SRMR values and implications for data-model fit.

Despite limited formal examination of SRMR extensions for models that include covariates, latent variable model software like Mplus and lavaan currently output SRMR for models with covariates. As discussed in this paper, SRMR values in software output (a) do not agree across programs, (b) employ different SRMR definitions depending on which options are selected, or (c) may attempt to correct out covariate information with varying success.

The intention of this paper is therefore to (a) highlight the complexities of defining SRMR with covariates, (b) consider different possible SRMR definitions when covariates are present, and (c) better understand the advantages and disadvantages of different definitions. The ultimate goal is to help researchers make more informed and more accurate decisions when using SRMR to evaluate the approximate fit of their models. This issue is particularly timely because software programs are currently providing users with SRMR values even though such values are not well understood or may not align with the user’s expectations.

To outline the structure of the paper, Section 2 provides a brief example to motivate the nature of the issue. Section 3 overviews SRMR for covariance structure models and discusses recent extensions to mean structures. Section 4 reviews structural equation models with covariates and factors that complicate extensions of SRMR to these models. Section 5 outlines different ways that SRMR can be defined with covariates and how different model specifications impact what is included in different SRMR definitions. Section 6 provides an empirical application of a latent growth model with covariates to highlight how different versions of SRMR behave. A small simulation also demonstrates that the patterns in the empirical example hold when the population model is known. Section 7 concludes with limitations and future directions.

2 Motivation

To motivate the nature of the problem, consider data generated in Mplus Version 8.10 from the following unconditional linear growth model with four repeated measures,

(1) $$\begin{align} & {y_{it}} = \left( {0.5 + {\zeta _{0i}}} \right) + \left( {1.0 + {\zeta _{1i}}} \right) \times Tim{e_t} + {e_{ti}} \nonumber\\ & {\mathbf{{e}}_i}\sim\mathcal{N}\left( {{\mathbf{{0}}_4},diag\left[ {.25,.75,1.25,1.75} \right]} \right) \nonumber\\ & {\mathbf{{\zeta }}_i}\sim\mathcal{N}\left( {\left[ \begin{matrix} 0 \\ 0 \\ \end{matrix} \right],\left[ \begin{matrix} {1.0} & {0.1} \\ {0.1} & {0.2} \\ \end{matrix} \right]} \right){\mkern 1mu} \end{align}$$

where ${y}_{it}$ is the outcome from person i at time t, ${\zeta}_{0i}$ is a person-specific latent intercept, ${\zeta}_{1i}$ is a person-specific latent slope, and ${e}_{ti}$ is within-person error for person i at time t. Five hundred datasets were simulated, each containing i = 1, …, 1000 people.
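
To make the data-generating process concrete, the following sketch simulates one replication from equation (1) using NumPy. It is a minimal illustration, assuming 1,000 people per dataset as described above; names like `n_persons` are illustrative rather than taken from the original simulation code.

```python
import numpy as np

rng = np.random.default_rng(1)

n_persons = 1000                         # i = 1, ..., 1000 per simulated dataset
times = np.array([0.0, 1.0, 2.0, 3.0])   # Time_t for the four repeated measures

# Population parameters from equation (1)
fixed_intercept, fixed_slope = 0.5, 1.0
zeta_cov = np.array([[1.0, 0.1],
                     [0.1, 0.2]])                   # covariance of (zeta_0i, zeta_1i)
error_var = np.array([0.25, 0.75, 1.25, 1.75])      # heteroskedastic error variances

# Person-specific latent intercepts and slopes
zeta = rng.multivariate_normal(mean=[0.0, 0.0], cov=zeta_cov, size=n_persons)

# Within-person errors, independent across time with time-specific variances
errors = rng.normal(scale=np.sqrt(error_var), size=(n_persons, 4))

# y_it = (0.5 + zeta_0i) + (1.0 + zeta_1i) * Time_t + e_ti
y = (fixed_intercept + zeta[:, [0]]) + (fixed_slope + zeta[:, [1]]) * times + errors
print(y.shape)  # (1000, 4)
```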

Five models are fit to each generated dataset using homoskedastic error variances to underparameterize the model so that fit is not perfect. The first model correctly specifies no covariates. The remaining four models add 1, 2, 3, or 4 time-invariant covariates as predictors of the latent intercept and slope, but each covariate is known to have no effect in the population. Because the null covariates do not explain any variance, the models with and without covariates are functionally the same and SRMR should seemingly not improve.

Figure 1 shows SRMR averaged across replications with default settings in Mplus Version 8.10 (Muthén & Muthén, 1998–2017) and default settings in lavaan Version 0.6.17 (Rosseel, 2012). Importantly, the SRMR values do not agree across programs, and fit appears to steadily improve as more null covariates are added, counterintuitively suggesting better fit despite null covariate effects.

Figure 1 Average SRMR values across replications for a latent growth model with four repeated measures fit with default options in lavaan and Mplus. The population model has no covariates, but null covariates were added. The SRMR value systematically decreases as a function of covariates, even though the covariates explain no variance and have no effect.

3 Overview of residual fit indices

3.1 Likelihood ratio test

Consider a population of random variables with mean ${\boldsymbol{\unicode{x3bc}}}$ and population covariance $\boldsymbol{\Sigma}$ . A random sample of size N from this population has a data matrix Y with sample mean $\overline{\mathbf{y}}$ and sample covariance S. A structural equation model proposed to model relations between variables in Y has a model-implied mean structure $\boldsymbol{\unicode{x3bc}} \left(\boldsymbol{\vartheta} \right)$ and a model-implied covariance structure $\boldsymbol{\Sigma} \left(\boldsymbol{\vartheta} \right)$ where $\boldsymbol{\vartheta}$ is the fundamental parameter vector containing the f freely estimated parameters featured in the model (Skrondal & Rabe-Hesketh, 2004). With maximum likelihood estimation, parameter estimates for $\boldsymbol{\vartheta}$ are found by minimizing the maximum likelihood discrepancy function,

(2) $$\begin{align}{F}_{\mathrm{ML}}\left(\boldsymbol{\vartheta} \right)= tr\left[\mathbf{S}{\boldsymbol{\Sigma}}^{-1}\left(\boldsymbol{\vartheta} \right)\right]+\ln \left|\boldsymbol{\Sigma} \left(\boldsymbol{\vartheta} \right)\right|-\ln \left|\mathbf{S}\right|+{\left[\overline{\mathbf{y}}-\boldsymbol{\unicode{x3bc}} \left(\boldsymbol{\vartheta} \right)\right]}^{\prime }{\boldsymbol{\Sigma}}^{-1}\left(\boldsymbol{\vartheta} \right)\left[\overline{\mathbf{y}}-\boldsymbol{\unicode{x3bc}} \left(\boldsymbol{\vartheta} \right)\right]-P.\end{align}$$

P corresponds to the number of variables in the model, dimensions of S and $\boldsymbol{\Sigma} \left(\boldsymbol{\vartheta} \right)$ are each $P\times P$ , and the dimensions of $\overline{\mathbf{y}}$ and $\boldsymbol{\unicode{x3bc}} \left(\boldsymbol{\vartheta} \right)$ are each $P\times 1$ .

To test whether the model-implied moments exactly reproduce the sample moments, a likelihood ratio test statistic can be defined by ${T}_{\mathrm{ML}}=N\times {F}_{\mathrm{ML}}\left(\hat{\boldsymbol{\vartheta}}\right)$ where N is the total sample size and ${F}_{\mathrm{ML}}\left(\hat{\boldsymbol{\vartheta}}\right)$ is the value of the discrepancy function evaluated at the maximum likelihood estimates of the parameters, $\hat{\boldsymbol{\vartheta}}$ . Under the assumption of multivariate normality, ${T}_{\mathrm{ML}}$ is asymptotically distributed ${\chi}_{P^{\ast }-f}^2$ where ${P}^{\ast}=0.5P\left(P+3\right)$ , the number of non-duplicated entries in the augmented covariance-mean matrix.
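
As an illustration of how the discrepancy function and test statistic are computed, the sketch below translates equation (2) and ${T}_{\mathrm{ML}}=N\times {F}_{\mathrm{ML}}\left(\hat{\boldsymbol{\vartheta}}\right)$ into NumPy. The two-variable inputs at the bottom are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def f_ml(ybar, S, mu_hat, sigma_hat):
    """Maximum likelihood discrepancy function from equation (2)."""
    P = S.shape[0]
    sigma_inv = np.linalg.inv(sigma_hat)
    resid_mean = ybar - mu_hat
    return (np.trace(S @ sigma_inv)
            + np.log(np.linalg.det(sigma_hat))
            - np.log(np.linalg.det(S))
            + resid_mean @ sigma_inv @ resid_mean
            - P)

def t_ml(ybar, S, mu_hat, sigma_hat, n):
    """Likelihood ratio test statistic T_ML = N * F_ML(theta_hat)."""
    return n * f_ml(ybar, S, mu_hat, sigma_hat)

# Hypothetical two-variable example: perfect fit gives F_ML = 0
S = np.array([[1.0, 0.3], [0.3, 1.0]])
ybar = np.array([0.0, 0.0])
print(t_ml(ybar, S, ybar, S, n=200))  # 0.0 up to rounding
```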

Though valued for its clear definition and inferential nature, researchers have noted that satisfying an exact fit test like ${T}_{\mathrm{ML}}$ is not always a necessary condition for a model to be useful (Bentler & Bonett, 1980; Hu et al., 1992; MacCallum, 2003). That is, models are often intended to be approximations from the outset, so tests of exact fit may be expected to be false a priori (e.g., Browne & Cudeck, 1993). Consequently, approximate fit indices like RMSEA, CFI, and SRMR have become popular supplemental metrics to summarize the practical magnitude of misspecifications throughout the model (Jöreskog & Sörbom, 1982).

Whereas ${T}_{\mathrm{ML}}$ tests for the presence of misfit between the model-implied and observed moments, approximate fit indices quantify the magnitude of the discrepancy between the model-implied and observed moments (e.g., McNeish & Wolf, 2023) and operate more like effect sizes for model misspecification (Kelley & Preacher, 2012). Commonly reported approximate fit indices like RMSEA and CFI are transformations of ${T}_{\mathrm{ML}}$, but SRMR is unique in that it is based on the model residuals (Yuan, 2005), where a “model residual” is the difference between a model-implied moment and an observed moment. SRMR can therefore have unique advantages relative to other indices and can provide non-redundant information. The remainder of this paper focuses on properties, clarifications, or extensions of SRMR.

3.2 SRMR for covariance structure models

Jöreskog and Sörbom (1982) first proposed the root mean residual (RMR) index based on the model residuals, which summarizes the difference between S and $\boldsymbol{\Sigma} \left(\hat{\boldsymbol{\vartheta}}\right)$ with a single value. RMR is unit dependent and can be unintuitive to interpret, so Bentler (1995) proposed the classic definition of SRMR to standardize the RMR such that,

(3) $$\begin{align}\mathrm{SRMR}\left(\mathbf{S},\hat{\boldsymbol{\Sigma}}\right)=\sqrt{\frac{\delta_1}{0.5P\left(P+1\right)}}.\end{align}$$

and

(3a) $$\begin{align}{\delta}_1&=\sum \mathrm{vech}{\left({\mathbf{D}}^{-1}\left(\mathbf{S}-\hat{\boldsymbol{\Sigma}}\right){\mathbf{D}}^{-1}\right)}^2\end{align}$$
(3b) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j}{\left(\frac{s_{jk}-{\hat{\sigma}}_{jk}}{\sqrt{s_{jj}{s}_{kk}}}\right)}^2\end{align}$$
(3c) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j}{\left(\frac{s_{jk}}{\sqrt{s_{jj}{s}_{kk}}}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{s_{jj}{s}_{kk}}}\right)}^2\end{align}$$
(3d) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left(\frac{s_{jk}}{\sqrt{s_{jj}{s}_{kk}}}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{s_{jj}{s}_{kk}}}\right)}^2+\sum \limits_{j\in P}{\left(\frac{s_{jj}}{s_{jj}}-\frac{{\hat{\sigma}}_{jj}}{s_{jj}}\right)}^2\end{align}$$
(3e) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left({r}_{jk}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{s_{jj}{s}_{kk}}}\right)}^2+\sum \limits_{j\in P}{\left(1-\frac{{\hat{\sigma}}_{jj}}{s_{jj}}\right)}^2\end{align}$$

For $\mathbf{d}=\operatorname{diag}{\left(\mathbf{S}\right)}^{1/2}$ , $\mathbf{D}={\mathbf{I}}_P\odot \mathbf{d}$ , $\hat{\boldsymbol{\Sigma}}=\boldsymbol{\Sigma} \left(\hat{\boldsymbol{\vartheta}}\right)$ , and ${r}_{jk}$ is a correlation element ${s}_{jk}/{d}_j{d}_k$ for $j\ne k$ .

Equations 3b and 3c illustrate that the residuals, ${s}_{jk}-{\hat{\sigma}}_{jk}$ , are scaled by the square root of the product of the sample variances ( ${s}_{jj}$ and ${s}_{kk}$ ) such that the denominator always consists of elements of S, even when the numerator is an element of $\hat{\boldsymbol{\Sigma}}$ . Correspondingly, equation 3e shows that the minuend is a sample standardized metric (standardized covariance, ${r}_{jk}$ , or standardized variance, 1) because the numerator is divided by diagonal terms from the same matrix. However, the subtrahend of equation 3e is the model-implied parameter estimate scaled by the product of the jth and kth sample standard deviations or the jth sample variance, respectively. Consequently, the model-implied elements are not necessarily completely standardized whenever the variances are not saturated because ${s}_{jj}\ne {\hat{\sigma}}_{jj}$ and ${s}_{kk}\ne {\hat{\sigma}}_{kk}$ , which may occur in a growth model (e.g., if residual variances are constrained to equality across repeated measures) (see footnote 1). The denominator in equation 3 is $0.5P\left(P+1\right)$ , which is the number of unique diagonal and off-diagonal elements of the covariance matrix.
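
For concreteness, a minimal sketch of equation (3) with Bentler standardization is given below; the three-variable matrices are hypothetical placeholders and the function name is illustrative.

```python
import numpy as np

def srmr_cov(S, sigma_hat):
    """SRMR for a covariance structure, equation (3), Bentler standardization."""
    P = S.shape[0]
    d = np.sqrt(np.diag(S))              # sample standard deviations
    scale = np.outer(d, d)               # sqrt(s_jj * s_kk) for every (j, k) pair
    std_resid = (S - sigma_hat) / scale  # standardized residuals
    lower = np.tril_indices(P)           # unique elements, diagonal included
    delta1 = np.sum(std_resid[lower] ** 2)
    return np.sqrt(delta1 / (0.5 * P * (P + 1)))

# Hypothetical 3-variable example: the implied matrix misses one covariance
S = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])
sigma_hat = np.array([[1.0, 0.5, 0.4],
                      [0.5, 1.0, 0.0],
                      [0.4, 0.0, 1.0]])
print(round(srmr_cov(S, sigma_hat), 4))
```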

The SRMR expressed in equation 3 only considers elements from the sample and model-implied covariance matrix but includes no information about the mean structure. It is, therefore, suitable for factor analysis where mean structures are absent or saturated, but not for models with overidentified mean structures like latent growth models (Leite & Stapleton, Reference Leite and Stapleton2011; Wu & West, Reference Wu and West2010). Structural equation model applications frequently feature an overidentified mean structure, so an extension of SRMR that incorporates the model residuals between the model-implied means and sample means (i.e., $\overline{\mathbf{y}}-\boldsymbol{\unicode{x3bc}} \left(\hat{\boldsymbol{\vartheta}}\right)$ ) is desirable. Such an extension is described in the next section.

3.3 SRMR with a mean structure

Define ${\overline{y}}_j$ as the jth element of $\overline{\mathbf{y}}$ and ${\hat{\mu}}_j$ as the jth element of $\boldsymbol{\unicode{x3bc}} \left(\hat{\boldsymbol{\vartheta}}\right)\equiv \hat{\boldsymbol{\unicode{x3bc}}}$ . The SRMR for a model with an overidentified mean structure therefore extends to

(4) $$\begin{align}\mathrm{SRMR}\left(\overline{\mathbf{y}},\mathbf{S},\hat{\boldsymbol{\unicode{x3bc}}},\hat{\boldsymbol{\Sigma}}\right)=\sqrt{\frac{\delta_2}{0.5P\left(P+3\right)}}\end{align}$$

where

(4a) $$\begin{align}{\delta}_2&=\sum \mathrm{vech}{\left({\mathbf{D}}^{-1}\left(\mathbf{S}-\hat{\boldsymbol{\Sigma}}\right){\mathbf{D}}^{-1}\right)}^2+\sum {\mathbf{d}}^{-1}{\left(\overline{\mathbf{y}}-\hat{\boldsymbol{\unicode{x3bc}}}\right)}^2\end{align}$$
(4b) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left(\frac{s_{jk}-{\hat{\sigma}}_{jk}}{\sqrt{s_j{s}_k}}\right)}^2+\sum \limits_{j\in P}{\left(\frac{s_j-{\hat{\sigma}}_j}{s_j}\right)}^2+\sum \limits_{j\in P}{\left(\frac{{\overline{y}}_j-{\hat{\mu}}_j}{\sqrt{s_j}}\right)}^2\end{align}$$
(4c) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left(\frac{s_{jk}}{\sqrt{s_j{s}_k}}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{s_j{s}_k}}\right)}^2+\sum \limits_{j\in P}{\left(\frac{s_j}{s_j}-\frac{{\hat{\sigma}}_j}{s_j}\right)}^2+\sum \limits_{j\in P}{\left(\frac{{\overline{y}}_j}{\sqrt{s_j}}-\frac{{\hat{\mu}}_j}{\sqrt{s_j}}\right)}^2\end{align}$$
(4d) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left({r}_{jk}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{s_j{s}_k}}\right)}^2+\sum \limits_{j\in P}{\left(1-\frac{{\hat{\sigma}}_j}{s_j}\right)}^2+\sum \limits_{j\in P}{\left({\overline{z}}_j-\frac{{\hat{\mu}}_j}{\sqrt{s_j}}\right)}^2\end{align}$$

Importantly, equation 4a adds a new term $\sum {\mathbf{d}}^{-1}{\left(\overline{\mathbf{y}}-\hat{\boldsymbol{\unicode{x3bc}}}\right)}^2$ to account for differences between the observed and model-implied means. The denominator in equation 4 changes to $0.5P\left(P+3\right)$ to incorporate elements in the mean structure. Like equation 3e, each residual in the minuend of equation 4d is a sample standardized metric ( ${r}_{jk}$ , 1, or ${\overline{z}}_j$ ) while the subtrahend is the model-implied parameter estimate divided by the square root of the product of the jth and kth sample variances, the jth sample variance, or the jth sample standard deviation, depending on the residual being standardized. In other words, the elements of $\hat{\boldsymbol{\Sigma}}$ and $\hat{\boldsymbol{\unicode{x3bc}}}$ are divided by diagonal elements of S.

Note that the third term in equation 4d corresponding to mean structure residuals is unbounded. Conversely, the first term corresponding to the covariances is approximately standardized (depending on the congruence of diagonal elements in S and $\hat{\boldsymbol{\Sigma}}$ ), and each of its residuals is bounded by a value near 2 in absolute value (e.g., the maximum occurs when the observed correlation is 1 and the model-implied correlation is –1). When discrepancies in the covariance and mean structure are summarized by a single value, a large unbounded misfit in the mean structure may overpower the covariance structure misfit. Conversely, for large models, there can be many more covariance elements than mean elements and the covariance elements can wash out the contribution of the mean structure. It can therefore be prudent to separately examine the contribution of the covariance structure misfit and mean structure misfit (e.g., Yuan et al., 2019). The lavResiduals function in lavaan will provide separate SRMR values for all elements combined, only the covariance elements, and only the mean elements.
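
The following sketch extends the previous computation to equation (4) and also returns covariance-only and mean-only summaries in the spirit of the separate values discussed above. The denominators used for the separate pieces (the number of covariance elements and the number of mean elements, respectively) are one reasonable choice and are not necessarily identical to the values reported by lavResiduals.

```python
import numpy as np

def srmr_mean_structure(ybar, S, mu_hat, sigma_hat):
    """SRMR with a mean structure, equation (4), Bentler standardization.

    Returns the overall index plus covariance-only and mean-only pieces.
    """
    P = S.shape[0]
    d = np.sqrt(np.diag(S))                       # sample standard deviations
    scale = np.outer(d, d)
    cov_resid = (S - sigma_hat) / scale           # includes variance residuals
    mean_resid = (ybar - mu_hat) / d              # standardized mean residuals
    lower = np.tril_indices(P)
    delta_cov = np.sum(cov_resid[lower] ** 2)
    delta_mean = np.sum(mean_resid ** 2)
    overall = np.sqrt((delta_cov + delta_mean) / (0.5 * P * (P + 3)))
    cov_only = np.sqrt(delta_cov / (0.5 * P * (P + 1)))   # equation (3)
    mean_only = np.sqrt(delta_mean / P)                   # one reasonable choice
    return overall, cov_only, mean_only
```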

3.4 Alternative standardization methods

Whereas equations 3 and 4 standardize with sample standard deviations (sometimes called Bentler standardization), an alternative approach is to standardize model-implied moments by model-implied standard deviations rather than observed standard deviations (sometimes referred to as Bollen standardization; Bollen, 1989). With this standardization, the numerator in equation 3 would instead be $\sum \limits_{j\in P}\sum \limits_{k\in j}{\left({s}_{jk}{\left({s}_{jj}^{1/2}{s}_{kk}^{1/2}\right)}^{-1}-{\hat{\sigma}}_{jk}{\left({\hat{\sigma}}_{jj}^{1/2}{\hat{\sigma}}_{kk}^{1/2}\right)}^{-1}\right)}^2$ . This transforms the observed and implied covariance matrices to correlation matrices prior to taking the difference, which removes potential contributions of the diagonal terms because they will always be 1 in each matrix. Consequently, the index derived from this standardization is typically referred to as a separate index (the correlation root mean square residual; CRMR, Bollen, 1989) rather than SRMR.

There are also proposed definitions that mix Bollen standardization for the covariance and mean elements with Bentler standardization for the variance elements so that the variances are not excluded (this definition is employed by default in Mplus; Asparouhov & Muthén, 2018). Specifically,

(5) $$\begin{align}{\mathrm{SRMR}}^{\ast}\left(\overline{\mathbf{y}},\mathbf{S},\hat{\boldsymbol{\unicode{x3bc}}},\hat{\boldsymbol{\Sigma}}\right)=\sqrt{\frac{\delta_3}{0.5P\left(P+3\right)}}\end{align}$$

where

(5a) $$\begin{align}{\delta}_3=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left(\frac{s_{jk}}{\sqrt{s_j{s}_k}}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{{\hat{\sigma}}_j{\hat{\sigma}}_k}}\right)}^2+\sum \limits_{j\in P}{\left(\frac{s_j-{\hat{\sigma}}_j}{s_j}\right)}^2+\sum \limits_{j\in P}{\left(\frac{{\overline{y}}_j}{\sqrt{s_j}}-\frac{{\hat{\mu}}_j}{\sqrt{{\hat{\sigma}}_j}}\right)}^2\end{align}$$
(5b) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left(\frac{s_{jk}}{\sqrt{s_j{s}_k}}-\frac{{\hat{\sigma}}_{jk}}{\sqrt{{\hat{\sigma}}_j{\hat{\sigma}}_k}}\right)}^2+\sum \limits_{j\in P}{\left(\frac{s_j}{s_j}-\frac{{\hat{\sigma}}_j}{s_j}\right)}^2+\sum \limits_{j\in P}{\left(\frac{{\overline{y}}_j}{\sqrt{s_j}}-\frac{{\hat{\mu}}_j}{\sqrt{{\hat{\sigma}}_j}}\right)}^2\end{align}$$
(5c) $$\begin{align}=\sum \limits_{j\in P}\sum \limits_{k\in j-1}{\left({r}_{jk}-{\hat{\rho}}_{jk}\right)}^2+\sum \limits_{j\in P}{\left(1-\frac{{\hat{\sigma}}_j}{s_j}\right)}^2+\sum \limits_{j\in P}{\left({z}_j-z\left({\hat{\mu}}_j\right)\right)}^2\end{align}$$

Notice that the first and third terms of equation 5a are divided by elements of $\hat{\boldsymbol{\Sigma}}$ rather than S as in equations 3 and 4 but the middle term of equation 5a continues to divide by an element of S.
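
A minimal sketch of the hybrid standardization in equation (5) is shown below, assuming complete data and Bentler scaling only for the variance residuals; all inputs are hypothetical placeholders.

```python
import numpy as np

def srmr_star(ybar, S, mu_hat, sigma_hat):
    """Hybrid Bollen-Bentler SRMR* from equation (5).

    Covariance and mean residuals standardize the model-implied side by
    model-implied standard deviations; variance residuals divide by the
    sample variances.
    """
    P = S.shape[0]
    d_s = np.sqrt(np.diag(S))                # sample standard deviations
    d_m = np.sqrt(np.diag(sigma_hat))        # model-implied standard deviations
    r_obs = S / np.outer(d_s, d_s)           # observed correlations
    r_imp = sigma_hat / np.outer(d_m, d_m)   # model-implied correlations
    off = np.tril_indices(P, k=-1)           # strictly lower triangle
    delta = np.sum((r_obs[off] - r_imp[off]) ** 2)                  # correlations
    delta += np.sum((1.0 - np.diag(sigma_hat) / np.diag(S)) ** 2)   # variances
    delta += np.sum((ybar / d_s - mu_hat / d_m) ** 2)               # means
    return np.sqrt(delta / (0.5 * P * (P + 3)))
```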

3.5 SRMR for models with covariates

As shown in equations 3–5, SRMR definitions heavily rely on the definition of P, which is prominently featured in the denominator of each definition. For models without covariates, P is unambiguous. However, when covariates are present, the situation becomes more opaque because covariates may or may not count as part of P. Additionally, models with covariates will have marginal and conditional structures depending on how a researcher wishes to treat the variance explained by covariates, which may complicate SRMR definitions.

Potential challenges of SRMR with covariates have been noted, but they have yet to be rigorously examined. Section 4 overviews details and properties of models with covariates needed to discuss different possible SRMR definitions; definitions are then provided in Section 5.

4 Structural equation models with a mean structure and covariates

A general structural equation model with a mean structure and covariates can be written as,

(6) $$\begin{align}{\mathbf{y}}_i=\boldsymbol{\unicode{x3bd}} +{\boldsymbol{\Lambda} \boldsymbol{\unicode{x3b7}}}_i+{\mathbf{Kx}}_i+{\boldsymbol{\varepsilon}}_i.\end{align}$$

where ${\mathbf{y}}_i$ is a P-dimensional vector of manifest outcome variables for person i $\left(i=1,\dots, N\right)$ , $\boldsymbol{\unicode{x3bd}}$ is a P-dimensional vector of manifest outcome intercepts, $\boldsymbol{\Lambda}$ is a $P\times M$ matrix of factor loadings, where M is the number of latent variables, ${\boldsymbol{\unicode{x3b7}}}_i$ is an M-dimensional vector of latent variables, $\mathbf{K}$ is a $P\times C$ matrix of parameters relating the C-dimensional vector of manifest covariates ${\mathbf{x}}_i$ for person i to the manifest outcomes ${\mathbf{y}}_i$ that they directly predict, and ${\boldsymbol{\varepsilon}}_i$ is a P-dimensional vector of residuals for person i such that ${\boldsymbol{\varepsilon}}_i\sim {\mathcal{N}}_P\left(\mathbf{0},\boldsymbol{\Theta} \right)$ .

The structural model for the latent variables can then be written as

(7) $$\begin{align}{\boldsymbol{\unicode{x3b7}}}_i=\boldsymbol{\unicode{x3b1}} +{\mathbf{B} \boldsymbol{\unicode{x3b7}}}_i+{\boldsymbol{\Gamma} \mathbf{x}}_i+{\boldsymbol{\unicode{x3b6}}}_i.\end{align}$$

where $\boldsymbol{\unicode{x3b1}}$ is an M-dimensional vector of latent variable means, B is an M-dimensional square matrix of structural paths between latent variables, $\boldsymbol{\Gamma}$ is an $M\times C$ matrix of parameters associating the C-dimensional ${\mathbf{x}}_i$ vector of manifest covariates for person i to the latent variables ${\boldsymbol{\unicode{x3b7}}}_i$ , and ${\boldsymbol{\unicode{x3b6}}}_i$ is an M-dimensional vector of disturbances for the latent variable for person i such that ${\boldsymbol{\unicode{x3b6}}}_i\sim {\mathcal{N}}_M\left(\mathbf{0},\boldsymbol{\Psi} \right)$ .

The fundamental parameter vector containing the unique parameters from equations 6 and 7 is $\boldsymbol{\vartheta} ={\left[{\boldsymbol{\unicode{x3bd}}}^{\prime },\mathrm{vec}{\left(\boldsymbol{\Lambda} \right)}^{\prime },\mathrm{vec}{\left(\mathbf{K} \right)}^{\prime },\mathrm{vec}\mathrm{h}{\left(\boldsymbol{\Theta} \right)}^{\prime },{\boldsymbol{\unicode{x3b1}}}^{\prime },\mathrm{vec}{\left(\mathbf{B} \right)}^{\prime },\mathrm{vec}{\left(\boldsymbol{\Gamma} \right)}^{\prime },\mathrm{vec}\mathrm{h}{\left(\boldsymbol{\Psi} \right)}^{\prime}\right]}^{\prime }$ , which is featured in the estimator in equation 2 and is the basis for the model-implied means and covariances.

4.1 Model-implied means

From the estimated parameters in $\hat{\boldsymbol{\vartheta}}$ , the model-implied conditional expectation for the manifest outcomes in y given the covariates x can be expressed as

(8) $$\begin{align}{\boldsymbol{\unicode{x3bc}}}_i\left(\hat{\boldsymbol{\vartheta}}\right)&={\hat{\boldsymbol{\unicode{x3bc}}}}_i=\mathrm{E}\left(\mathbf{y}|\mathbf{x}={\mathbf{x}}_i;\hat{\boldsymbol{\vartheta}}\right)\nonumber\\ &=\hat{\boldsymbol{\unicode{x3bd}}}+\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\left[\hat{\boldsymbol{\unicode{x3b1}}}+\hat{\boldsymbol{\Gamma}}{\mathbf{x}}_i\right]+\hat{\mathbf{K}}{\mathbf{x}}_i\end{align}$$

The i subscript on ${\hat{\boldsymbol{\unicode{x3bc}}}}_i$ indicates the expectation changes as a function of covariate values.

When fitting a model conditional on the covariates, the resulting output is typically the expected values given $\mathbf{x}=\mathbf{0}$ . This results in a different conditional expectation such that,

(9) $$\begin{align}{\boldsymbol{\unicode{x3bc}}}_0\left(\hat{\boldsymbol{\vartheta}}\right)&={\hat{\boldsymbol{\unicode{x3bc}}}}_0=\mathrm{E}\left(\mathbf{y}|\mathbf{x}=\mathbf{0};\hat{\boldsymbol{\vartheta}}\right)\nonumber\\ &=\hat{\boldsymbol{\unicode{x3bd}}}+\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\left[\hat{\boldsymbol{\unicode{x3b1}}}+\hat{\boldsymbol{\Gamma}}\mathbf{0}\right]+\hat{\mathbf{K}}\mathbf{0}\nonumber\\ &=\hat{\boldsymbol{\unicode{x3bd}}}+\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\hat{\boldsymbol{\unicode{x3b1}}}\end{align}$$

The “0” subscript denotes that the expectation is conditional on the covariate being equal to 0.

Setting the covariate values to their respective sample means, $\mathbf{x}=\overline{\mathbf{x}}$ marginalizes over the covariates to arrive at a model-implied marginal expectation for the focal outcomes where

(10) $$\begin{align}\boldsymbol{\unicode{x3bc}}\left(\hat{\boldsymbol{\vartheta}}\right)&=\hat{\boldsymbol{\unicode{x3bc}}}=\mathrm{E}\left(\mathbf{y}|\mathbf{x}=\overline{\mathbf{x}};\hat{\boldsymbol{\vartheta}}\right)\nonumber\\ &=\hat{\boldsymbol{\unicode{x3bd}}}+\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\left[\hat{\boldsymbol{\unicode{x3b1}}}+\hat{\boldsymbol{\Gamma}}\overline{\mathbf{x}}\right]+\hat{\mathbf{K}}\overline{\mathbf{x}}\end{align}$$

i subscripts are dropped in equation 10 to indicate a marginal expectation given that covariates are set to their respective sample means.

4.2 Model-implied covariance

The P × P model-implied covariance for the manifest outcomes, conditional on covariates, can be expressed as

(11) $$\begin{align}{\boldsymbol{\Sigma}}_i\left(\hat{\boldsymbol{\vartheta}}\right)&={\hat{\boldsymbol{\Sigma}}}_i= Cov\left(\mathbf{y}|\mathbf{x}={\mathbf{x}}_i;\hat{\boldsymbol{\vartheta}}\right)\nonumber\\ &=\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\hat{\boldsymbol{\Psi}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{\prime -1}{\hat{\boldsymbol{\Lambda}}}^{\prime }+\hat{\boldsymbol{\Theta}}\end{align}$$

Like the model-implied conditional expectation, an i subscript indicates that the covariance is conditional. However, the conditional model-implied covariance does not vary as a function of covariate values (i.e., x does not appear in equation 11), so ${\hat{\boldsymbol{\Sigma}}}_i={\hat{\boldsymbol{\Sigma}}}_0$ . Together, equations 8 and 11 define the conditional model-implied probability distribution such that ${\mathbf{y}}_i\mid {\mathbf{x}}_i\sim \mathcal{N}\left({\hat{\boldsymbol{\unicode{x3bc}}}}_i,{\hat{\boldsymbol{\Sigma}}}_i\right)$ or ${\mathbf{y}}_i\mid \mathbf{x}=\mathbf{0}\sim \mathcal{N}\left({\hat{\boldsymbol{\unicode{x3bc}}}}_0,{\hat{\boldsymbol{\Sigma}}}_0\right)$ .

Correspondingly, the model-implied marginal covariance matrix is

(12) $$\begin{align}\boldsymbol{\Sigma} \left(\hat{\boldsymbol{\vartheta}}\right)&=\hat{\boldsymbol{\Sigma}}\nonumber\\ &=\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\hat{\boldsymbol{\Gamma}}{\mathbf{S}}_x{\hat{\boldsymbol{\Gamma}}}^{\prime }{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{\prime -1}{\hat{\boldsymbol{\Lambda}}}^{\prime }+\hat{\mathbf{K}}{\mathbf{S}}_x{\hat{\mathbf{K}}}^{\prime }+\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\hat{\boldsymbol{\Psi}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{\prime -1}{\hat{\boldsymbol{\Lambda}}}^{\prime }+\hat{\boldsymbol{\Theta}}\nonumber\\ &=\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\hat{\boldsymbol{\Gamma}}{\mathbf{S}}_x{\hat{\boldsymbol{\Gamma}}}^{\prime }{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{\prime -1}{\hat{\boldsymbol{\Lambda}}}^{\prime }+\hat{\mathbf{K}}{\mathbf{S}}_x{\hat{\mathbf{K}}}^{\prime }+{\hat{\boldsymbol{\Sigma}}}_i.\end{align}$$

where ${\mathbf{S}}_x$ is the sample covariance matrix for the covariates. Notably, the model-implied marginal covariance is calculated from the conditional covariance matrix ( ${\hat{\boldsymbol{\Sigma}}}_i$ ) plus the covariance in the outcomes that is explained through the covariates. Together, equations 10 and 12 define the marginal model-implied probability distribution such that ${\mathbf{y}}_i\sim \mathcal{N}\left(\hat{\boldsymbol{\unicode{x3bc}}},\hat{\boldsymbol{\Sigma}}\right)$ .
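
To illustrate how the conditional and marginal model-implied moments relate, the sketch below assembles equations (10), (11), and (12) from estimated parameter matrices. All matrices are treated as hypothetical estimates; the function follows equation (12) as written, with the covariate contributions through $\boldsymbol{\Gamma}$ and $\mathbf{K}$ entering as separate terms.

```python
import numpy as np

def implied_moments(nu, Lam, B, alpha, Gamma, K, Psi, Theta, xbar, Sx):
    """Conditional and marginal model-implied moments, equations (10)-(12).

    nu, Lam, B, alpha, Gamma, K, Psi, Theta: estimated parameter matrices;
    xbar, Sx: sample mean vector and covariance of the covariates.
    """
    M = B.shape[0]
    inv = np.linalg.inv(np.eye(M) - B)
    # Conditional covariance, equation (11): does not depend on x
    sigma_cond = Lam @ inv @ Psi @ inv.T @ Lam.T + Theta
    # Marginal mean, equation (10): covariates set to their sample means
    mu_marg = nu + Lam @ inv @ (alpha + Gamma @ xbar) + K @ xbar
    # Marginal covariance, equation (12): conditional covariance plus the
    # covariance transmitted through the covariates
    lam_inv_gamma = Lam @ inv @ Gamma
    sigma_marg = (lam_inv_gamma @ Sx @ lam_inv_gamma.T
                  + K @ Sx @ K.T
                  + sigma_cond)
    return mu_marg, sigma_marg, sigma_cond
```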

4.3 Special case of continuous outcomes

When y is continuous and all covariates are exogenous, the model can be simplified based on LISREL notation. Namely,

(13) $$\begin{align}{\mathbf{v}}_i&={\boldsymbol{\unicode{x3bd}}}_v+{\boldsymbol{\Lambda}}_v{\boldsymbol{\unicode{x3b7}}}_{vi}+{\boldsymbol{\varepsilon}}_{vi}\nonumber\\ {\boldsymbol{\unicode{x3b7}}}_{vi}&={\boldsymbol{\unicode{x3b1}}}_v+{\mathbf{B}}_v{\boldsymbol{\unicode{x3b7}}}_{vi}+{\boldsymbol{\unicode{x3b6}}}_{vi}\end{align}$$

where ${\mathbf{v}}_i={\left({\mathbf{y}}_i^{\prime },{\mathbf{x}}_i^{\prime}\right)}^{\prime }$ stacks all the variables into one vector and all variables are treated as outcomes. This notation does not permit direct paths from manifest variables to latent variables (e.g., manifest variables can only indicate latent variables, but they cannot predict them; Bollen, 1989, p. 395). Instead, single-indicator latent variables are created for each manifest variable that predicts or is predicted by another manifest variable, where factor loadings are fixed to 1 and residual variances are fixed to 0 for identification (see footnote 2).

${\boldsymbol{\unicode{x3b7}}}_{vi}={\left({\boldsymbol{\unicode{x3b7}}}_i^{\prime },{\boldsymbol{\unicode{x3b7}}}_{yi}^{\prime },{\boldsymbol{\unicode{x3b7}}}_{xi}^{\prime}\right)}^{\prime }$ is then composed of three parts: (a) focal latent variables ( ${\boldsymbol{\unicode{x3b7}}}_i$ ), (b) dummy latent variables for elements of y that are predicted from elements of x ( ${\boldsymbol{\unicode{x3b7}}}_{yi}$ ), and (c) dummy latent variables for elements of x that predict elements of y or ${\boldsymbol{\unicode{x3b7}}}_i$ ( ${\boldsymbol{\unicode{x3b7}}}_{xi}$ ). All regression paths are housed in the ${\mathbf{B}}_v$ matrix rather than being split amongst Γ, K, and B as in equations 6 and 7 (Skrondal & Rabe-Hesketh, 2004, p. 78).

Mplus and lavaan rely on this notation for efficient computation with continuous variables (Muthén, 2004, p. 13; von Oertzen & Brick, 2014). Other notation systems like reticular action model notation (RAM; McArdle & McDonald, 1984) or Bentler–Weeks notation (Bentler & Weeks, 1979) can directly accommodate paths from manifest covariates to latent variables. Correspondingly, different specifications emerge for models with covariates. Section 4.4 reviews these different specifications and Section 5 discusses implications for how different specifications can have different SRMR definitions.

4.4 Specifications for models with covariates

There are two main dimensions along which model specifications with covariates can differ: the first is joint versus conditional, and the second is fixed versus stochastic. The result is four possible combinations, though one combination (conditional and stochastic) is theoretically possible but seldom serviceable, so it is not considered here. Figure 2 illustrates the differences between model path diagrams for different specifications for a hypothetical conditional linear growth model with four repeated measures and two time-invariant covariates predicting the growth factors. Figure 2a shows the joint and fixed specification, Figure 2b shows the joint and stochastic specification, and Figure 2c shows the conditional and fixed specification. More details on each specification appear in dedicated subsections below.

Figure 2 Hypothetical path diagram of a conditional latent growth model with two time-invariant covariates and four repeated measures. Panel (a) shows a joint and fixed covariate specification where the covariates are converted to latent variables whose moments are constrained to sample statistics. Panel (b) shows a joint and stochastic specification where the covariates are converted to latent variables whose moments are free parameters. Panel (c) shows a conditional and fixed specification where the manifest covariates directly predict the latent growth factors. The difference between panels (a) and (b) is subtle and is related to whether the means, variances, and covariances of $\eta_3$ and $\eta_4$ are fixed or estimated.

4.4.1 Joint and fixed

In the joint specification, a joint likelihood for the outcome variables and all covariates is built such that ${\mathbf{v}}_i\sim {\mathcal{N}}_V\left(\boldsymbol{\unicode{x3bc}},\boldsymbol{\Sigma}\right)$ . In Figure 2a, this is represented by the manifest covariates $x_1$ and $x_2$ being replaced with single-indicator latent variables with factor loadings fixed to 1 and residual variances fixed to 0. These single-indicator latent variables then predict the latent growth factors. This specification is used by default in both lavaan and Mplus.

With a joint specification, all covariates become dependent variables in the model, which has ramifications for how P is defined within SRMR calculations. Because covariates technically become outcomes (i.e., a latent variable points into them), they are pulled into ‘the model’ such that P equals the sum of the T focal outcome variables in y and the C covariates in x predicting the latent variables or manifest outcomes. This sum is defined as V where V = T + C. The model-implied moments correspond to equations 10 and 12.

With a fixed specification, the mean, variances, and covariances of the covariates are constrained to their sample values rather than being estimated. Because there are no free parameters in the covariate portion of the model, full information maximum likelihood is not applicable with this specification and missing covariates must be imputed or listwise deleted.

The full model equations for the joint and fixed specification in Figure 2a are,

(14) $$\begin{align} &\left[ {\begin{array}{c} {{y_{1i}}} \\ {{y_{2i}}} \\ {{y_{3i}}} \\ {{y_{4i}}} \\ {{x_{1i}}} \\ {{x_{2i}}}\end{array}} \right] = \left[ {\begin{array}{c} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0\end{array}} \right] + \left[ {\begin{array}{cccc} 1&0&0&0 \\ 1&1&0&0 \\ 1&2&0&0 \\ 1&3&0&0 \\ 0&0&1&0 \\ 0&0&0&1\end{array}} \right]\left[ {\begin{array}{c} {{\eta _{1i}}} \\ {{\eta _{2i}}} \\ {{\eta _{3i}}} \\ {{\eta _{4i}}}\end{array}} \right] + \left[ {\begin{array}{c} {{\epsilon_{1i}}} \\ {{\epsilon_{2i}}} \\ {{\epsilon_{3i}}} \\ {{\epsilon_{4i}}} \\ {{\epsilon_{5i}}} \\ {{\epsilon_{6i}}}\end{array}} \right] \nonumber\\ &\left[ {\begin{array}{c} {{\eta _{1i}}} \\ {{\eta _{2i}}} \\ {{\eta _{3i}}} \\ {{\eta _{4i}}}\end{array}} \right] = \left[ {\begin{array}{c} {{\alpha _1}} \\ {{\alpha _2}} \\ {{{\bar x}_1}} \\ {{{\bar x}_2}}\end{array}} \right] + \left[ {\begin{array}{cccc} 0&0&{{\beta _{13}}}&{{\beta _{14}}} \\ 0&0&{{\beta _{23}}}&{{\beta _{24}}} \\ 0&0&0&0 \\ 0&0&0&0\end{array}} \right]\left[ {\begin{array}{c} {{\eta _{1i}}} \\ {{\eta _{2i}}} \\ {{\eta _{3i}}} \\ {{\eta _{4i}}}\end{array}} \right] + \left[ {\begin{array}{c} {{\zeta _{1i}}} \\ {{\zeta _{2i}}} \\ {{\zeta _{3i}}} \\ {{\zeta _{4i}}}\end{array}} \right] \nonumber \\ &{\boldsymbol{\epsilon}_i}\sim\mathcal{N}\left( {{{\mathbf{0}}_6},{\text{diag}}\left[ {\theta ,\theta ,\theta ,\theta ,0,0} \right]} \right) \nonumber \\ &{{\mathbf{\zeta }}_i}\sim\mathcal{N}\left( {\left[ {\begin{array}{c} 0 \\ 0 \\ 0 \\ 0\end{array}} \right],\left[ {\begin{array}{cccc} {{\psi _{11}}}&{{\psi _{12}}}&0&0 \\ {{\psi _{21}}}&{{\psi _{22}}}&0&0 \\ 0&0&{\operatorname{var} \left( {{x_1}} \right)}&{\operatorname{cov} \left( {{x_1},{x_2}} \right)} \\ 0&0&{\operatorname{cov} \left( {{x_2},{x_1}} \right)}&{\operatorname{var} \left( {{x_2}} \right)}\end{array}} \right]} \right)\end{align}$$

4.4.2 Joint and stochastic

A joint and stochastic specification maintains the joint likelihood approach in Section 4.4.1 but differs in how the covariate parameters are treated. Namely, rather than fixing the covariate means, variances, and covariances to their sample values, these parameters are directly estimated. That is, the vector of latent variable means in equation 14 would change to ${\left[{\alpha}_1\kern0.5em {\alpha}_2\kern0.5em {\alpha}_3\kern0.5em {\alpha}_4\right]}^{\prime }$ and the lower-right block of the disturbance covariance matrix in equation 14 would change to $\left[\begin{smallmatrix} {\psi}_{33}& {\psi}_{34}\\ {\psi}_{43}& {\psi}_{44}\end{smallmatrix}\right]$ . This can be seen in Figure 2b where the sample statistics ${\overline{x}}_1$ , ${\overline{x}}_2$ , $\operatorname{var}\left({x}_1\right)$ , $\operatorname{var}\left({x}_2\right)$ , and $\operatorname{cov}\left({x}_2,{x}_1\right)$ from Figure 2a are replaced with freely estimated parameters. The model-implied moments are again the marginal moments from equations 10 and 12.

A main benefit of the stochastic approach is that missing data on the covariates can be handled directly with maximum likelihood assuming a missing at random mechanism because there are free parameters and distributional assumptions related to the covariates (Baraldi & Enders, 2010). In lavaan and Mplus, this specification is used whenever the mean or variance of a covariate is included in the code (or by using the fixed.x = FALSE option in lavaan). Similar to Section 4.4.1, the number of variables in the model is equal to V because all outcomes and covariates are considered part of the model.

4.4.3 Conditional and fixed

A conditional specification aligns more closely with models from the regression or mixed effect tradition, and the likelihood is conditioned on the covariates such that ${\mathbf{y}}_i\mid {\mathbf{x}}_i\sim {\mathcal{N}}_T\left({\boldsymbol{\unicode{x3bc}}}_i,{\boldsymbol{\Sigma}}_i\right)$ . In the conditional likelihood, the effects of covariates are conditioned out, and the model-implied moments correspond to the conditional moments in equations 9 and 11. The means, variances, and covariances of covariates are fixed to sample statistics as in Section 4.4.1. Consequently, P is defined only as the number of focal outcome variables T rather than V. The corresponding model equations are,

(15) $$\begin{align} &\left[\begin{array}{c}{y}_{1i}\mid \left({x}_{1i},{x}_{2i}\right)\\ {}{y}_{2i}\mid \left({x}_{1i},{x}_{2i}\right)\\ {}{y}_{3i}\mid \left({x}_{1i},{x}_{2i}\right)\\ {}{y}_{4i}\mid \left({x}_{1i},{x}_{2i}\right)\end{array}\right]=\left[\begin{array}{c}0\\ {}0\\ {}0\\ {}0\end{array}\right]+\left[\begin{array}{cc}1& 0\\ {}1& 1\\ {}1& 2\\ {}1& 3\end{array}\right]\left[\begin{array}{c}{\eta}_{1i}\\ {\eta}_{2i}\end{array}\right]+\left[\begin{array}{c}{\unicode{x3b5}}_{1i}\\ {}{\unicode{x3b5}}_{2i}\\ {}{\unicode{x3b5}}_{3i}\\ {}{\unicode{x3b5}}_{4i}\end{array}\right]\nonumber\\&\left[\begin{array}{c}{\eta}_{1i}\\ {}{\eta}_{2i}\end{array}\right]=\left[\begin{array}{c}{\alpha}_1\\ {}{\alpha}_2\end{array}\right]+\left[\begin{array}{cc}{\gamma}_{11}& {\gamma}_{12}\\ {}{\gamma}_{21}& {\gamma}_{22}\end{array}\right]\left[\begin{array}{c}{x}_{1i}\\ {}{x}_{2i}\end{array}\right]+\left[\begin{array}{c}{\zeta}_{1i}\\ {}{\zeta}_{2i}\end{array}\right]\nonumber\\ &{\boldsymbol{\unicode{x3b5}}}_i \sim \mathcal{N}\left({\mathbf{0}}_{\mathrm{T}},{\mathbf{I}}_{\mathrm{T}}\odot \boldsymbol{\unicode{x3b8}} \right)\nonumber\\ &{\boldsymbol{\unicode{x3b6}}}_i\sim \mathcal{N}\left(\left[\begin{array}{c}0\\ {}0\end{array}\right],\left[\begin{array}{cc}{\psi}_{11}& {\psi}_{12}\\ {}{\psi}_{21}& {\psi}_{22}\end{array}\right]\right)\end{align}$$

The path diagram corresponding to this specification is shown in Figure 2c. With a conditional and fixed specification, there are no distributional assumptions placed on the covariates, so missing covariates must be handled with imputation or deletion (Sterba, 2014). The conditional and fixed specification is conceptually similar to the joint and fixed specification, and the parameter estimates will closely correspond (and may be identical) even though there are different ramifications for defining SRMR.

Slope structures are not present in joint specifications but become relevant in conditional specifications (Muthén, 1984, pp. 49–50). The slope structure refers to possible pathways from the covariates in x to the outcomes y (possibly through latent variables in η) and corresponds to the covariance attributable to covariates (which is conditioned out). If the slope structure is saturated such that every covariate predicts every outcome (i.e., there are $T\times C$ covariate paths in the model), then the observed covariance attributable to covariates will equal the model-implied covariance attributable to covariates. However, in the more common case where the slope structure is overidentified (i.e., there are fewer than $T\times C$ covariate paths), there may be slope structure residuals in addition to the mean and covariance residuals for conditional specifications.

4.5 Covariate specification affects fit

Despite the conceptual similarity among specifications (especially without missing data where specifications yield identical parameter estimates), the choice of specification—whether made explicitly or implicitly by software—has implications for calculating SRMR because definitions of P are different. For the model in Figure 2, joint specifications are 6-dimensional, but the corresponding conditional specification is only 4-dimensional. Additionally, the conditional specification has different residuals because the variance attributable to covariates is removed whereas a joint specification yields marginal moments.

Section 5 provides different possible SRMR definitions that emerge from different covariate specifications and standardization methods. An empirical example and simulation follow in Section 6 to demonstrate complexities of defining model fit in complex models.

5 Defining SRMR with covariates

Sections 5.1 and 5.2 describe different SRMR definitions depending on the model specification. Key properties are summarized in Table 1. Section 5.3 discusses relevant properties to consider when choosing among different SRMR definitions for a model with covariates.

Table 1 Comparison of primary features of different possible SRMR definitions for models with covariates

Note: V = number of total variables in the model including covariates, T = number of focal outcome variables, C = number of covariates in the model. V = T + C. SRMR labels without a “*” are based on Bentler standardization that divides by sample standard deviations; SRMR labels with a “*” standardize using model-implied standard deviations for covariance and mean elements and sample standard deviation for the variance elements. Mplus column refers to Version 8.10, lavaan column refers to version 0.6.17.

5.1 Conditional specification

The likelihood for a model with conditionally specified covariates is $\mathbf{y}|\mathbf{x}=\mathbf{0}\sim {\mathcal{N}}_T\left({\boldsymbol{\unicode{x3bc}}}_0,{\boldsymbol{\Sigma}}_0\right)$ where ${\boldsymbol{\unicode{x3bc}}}_0\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{cond}}\right)$ is the vector of model-implied conditional means and ${\boldsymbol{\Sigma}}_0\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{cond}}\right)$ is the model-implied conditional covariance. The observed conditional means are then $E\left(\mathbf{y}|\mathbf{x}=\mathbf{0}\right)={\overline{\mathbf{y}}}_0=\overline{\mathbf{y}}+{\mathbf{S}}_{yx}{\mathbf{S}}_x^{-1}\left(\mathbf{0}-\overline{\mathbf{x}}\right)$ and the observed conditional covariance is $Cov\left(\mathbf{y}|\mathbf{x}\right)={\mathbf{S}}_{y\mid x}={\mathbf{S}}_y-{\mathbf{S}}_{yx}{\mathbf{S}}_x^{-1}{\mathbf{S}}_{yx}^{\prime }$ .

These conditional sample and model-implied moments can produce a residualized SRMR:

(16) $$\begin{align}{\mathrm{SRMR}}_R=\mathrm{SRMR}\left({\overline{\mathbf{y}}}_0,{\mathbf{S}}_{y\mid x},{\boldsymbol{\unicode{x3bc}}}_0\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{cond}}\right),{\boldsymbol{\Sigma}}_0\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{cond}}\right)\right)\end{align}$$

${\mathrm{SRMR}}_R$ provides an index of the standardized discrepancy between the sample and model-implied conditional means and conditional covariances of the T focal outcomes given all covariates are set to 0. This is most meaningful when covariates are centered or have natural zero points and when the interest is evaluating the discrepancy after removing variance explained by covariates.

Importantly, because ${\mathrm{SRMR}}_R$ is conditional, changing the scaling of the covariates will change the value of ${\mathrm{SRMR}}_R$ (e.g., centered versus uncentered covariates will have different values of ${\mathrm{SRMR}}_R$ ). This can be useful if the fit at specific values of the covariates is desired because the scaling of the covariates can be adjusted so that the specific values of interest are set to 0.

${\mathrm{SRMR}}_R$ in equation 16 uses Bentler standardization from equation 4, but it could use a mix of Bentler and Bollen standardization as in equation 5 such that ${\mathrm{SRMR}}_R^{\ast}={\mathrm{SRMR}}^{\ast}\left({\overline{\mathbf{y}}}_0,{\mathbf{S}}_{y\mid x},{\boldsymbol{\unicode{x3bc}}}_0\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{cond}}\right),{\boldsymbol{\Sigma}}_0\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{cond}}\right)\right)$ .

Regarding the slope structure, the observed covariance attributable to covariates is ${\mathbf{S}}_{\bullet x}={\mathbf{S}}_{yx}{\mathbf{S}}_x^{-1}{\mathbf{S}}_{yx}^{\prime }$ and the model-implied covariance attributable to covariates is ${\hat{\boldsymbol{\Sigma}}}_{\bullet x}=\hat{\boldsymbol{\Lambda}}{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{-1}\hat{\boldsymbol{\Gamma}}{\mathbf{S}}_x{\hat{\boldsymbol{\Gamma}}}^{\prime }{\left(\mathbf{I}-\hat{\mathbf{B}}\right)}^{\prime -1}{\hat{\boldsymbol{\Lambda}}}^{\prime }+\hat{\mathbf{K}}{\mathbf{S}}_x{\hat{\mathbf{K}}}^{\prime }$ . The slope structure residuals are then equal to ${\mathbf{S}}_{\bullet x}-{\hat{\boldsymbol{\Sigma}}}_{\bullet x}$ . The slope structure residuals are implicitly present in equation 16 because they are one source of misfit such that ${\mathbf{S}}_{y\mid x}$ is based on removing all covariance attributable to covariates whereas ${\hat{\boldsymbol{\Sigma}}}_0$ only removes covariance based on the specified, possibly overidentified slope structure. Separating the contribution of the slope structure residuals can help identify if misfit is potentially due to covariate-related covariance not being fully conditioned out.
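
As a sketch of the quantities defined in this section, the code below computes the observed conditional moments used by ${\mathrm{SRMR}}_R$ and the slope structure residuals from sample statistics and (hypothetical) parameter estimates; the resulting moments would then be passed to the SRMR definition in equation (4) or (5).

```python
import numpy as np

def conditional_sample_moments(ybar, xbar, Sy, Syx, Sx):
    """Observed conditional mean and covariance used by SRMR_R."""
    beta = Syx @ np.linalg.inv(Sx)                        # regression of y on x
    ybar0 = ybar + beta @ (np.zeros_like(xbar) - xbar)    # E(y | x = 0)
    S_y_given_x = Sy - beta @ Syx.T                       # Cov(y | x)
    return ybar0, S_y_given_x

def slope_structure_residuals(Syx, Sx, Lam, B, Gamma, K):
    """Observed minus model-implied covariance attributable to covariates."""
    S_dot_x = Syx @ np.linalg.inv(Sx) @ Syx.T             # observed part
    inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    lam_inv_gamma = Lam @ inv @ Gamma
    sigma_dot_x = lam_inv_gamma @ Sx @ lam_inv_gamma.T + K @ Sx @ K.T
    return S_dot_x - sigma_dot_x
```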

5.2 Joint specification

Let $\overline{\mathbf{v}}={\left[{\overline{\mathbf{y}}}^{\prime },{\overline{\mathbf{x}}}^{\prime}\right]}^{\prime }$ and ${\mathbf{S}}_V$ be the sample mean vector and covariance matrix for all V variables in a jointly specified model whose model-implied mean and covariance are $\boldsymbol{\unicode{x3bc}} \left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)$ and $\boldsymbol{\Sigma} \left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)$ . These quantities could be used directly such that

(17) $$\begin{align}{\mathrm{SRMR}}_V=\mathrm{SRMR}\left(\overline{\mathbf{v}},{\mathbf{S}}_{\mathrm{V}},\boldsymbol{\unicode{x3bc}} \left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right),\boldsymbol{\Sigma} \left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right)\end{align}$$

${\mathrm{SRMR}}_V$ uses the marginal model-implied and sample moments for all V variables in the model (i.e., the variance explained by covariates is not factored out). Essentially, ${\mathrm{SRMR}}_V$ captures how well the model reproduces the means, variances, and covariances of focal outcomes and covariates simultaneously. ${\mathrm{SRMR}}_V$ is applicable when covariates are treated as fixed or stochastic. Equation 17 uses Bentler standardization, but a corresponding ${\mathrm{SRMR}}_V^{\ast }$ could also be defined by using the hybrid Bollen–Bentler standardization from equation 5.

Asparouhov and Muthén (2018) note that ${\mathrm{SRMR}}_V$ may be problematic with a fixed covariate specification because elements related to the covariates are included in the SRMR calculation, but they are constrained to sample values rather than being estimated. These elements will therefore have no misfit and cannot contribute to the SRMR numerator, but they will contribute to the denominator. To address this, Asparouhov and Muthén (2018) describe a covariate adjustment for ${\mathrm{SRMR}}_V^{\ast }$ where

(18) $$\begin{align}{\mathrm{SRMR}}_{VC}^{\ast}\left(\overline{\mathbf{v}},{\mathbf{S}}_{\mathrm{V}},\hat{\boldsymbol{\unicode{x3bc}}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right),\hat{\boldsymbol{\Sigma}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right)=\sqrt{\frac{\delta_3}{0.5V\left(V+3\right)-0.5C\left(C+3\right)}}\end{align}$$

The denominator removes the $0.5C\left(C+3\right)$ elements associated with the covariates whose model-implied values are constrained to sample values (which will necessarily have no misfit). Elements corresponding to covariates are factored out of the denominator to avoid artificially reducing the index by inflating the denominator. If C = 0, ${\mathrm{SRMR}}_{VC}^{\ast }$ reduces to ${\mathrm{SRMR}}_V^{\ast }$ because none of the model-implied elements will be constrained to sample values. There is also a corresponding ${\mathrm{SRMR}}_{VC}$ that corrects the denominator of ${\mathrm{SRMR}}_V$ when Bentler standardization is used.
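
The denominator correction in equation (18) is simple to compute directly; the sketch below uses the Figure 2 model (T = 4 repeated measures, C = 2 covariates) as a hypothetical example.

```python
def srmr_vc_denominator(T, C):
    """Corrected denominator from equation (18): unique mean, variance, and
    covariance elements for all V = T + C variables minus the 0.5*C*(C + 3)
    elements whose model-implied values are fixed to covariate sample statistics."""
    V = T + C
    return 0.5 * V * (V + 3) - 0.5 * C * (C + 3)

# Figure 2 example: four repeated measures and two covariates
print(srmr_vc_denominator(T=4, C=2))   # 27 - 5 = 22 elements
# With C = 0 the correction vanishes and the count reduces to 0.5*T*(T + 3)
print(srmr_vc_denominator(T=4, C=0))   # 14 elements
```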

In Mplus Version 8.10, specifying a model with stochastic covariates (e.g., by including the mean or variance of the covariate in the model) will result in ${\mathrm{SRMR}}_V^{\ast }$ being reported in the output; the Mplus default specifies fixed covariates and results in ${\mathrm{SRMR}}_{VC}^{\ast }$ being reported in the output. The INFORMATION = EXPECTED option in Mplus can yield ${\mathrm{SRMR}}_V$ and ${\mathrm{SRMR}}_{VC}$ in the output but has ramifications beyond the calculation of SRMR. Specifically, it can affect the consistency of standard errors if missing data are present and not missing completely at random (Kenward & Molenberghs, 1998) and it can impact the effectiveness of robust estimators (Savalei, 2010). This may be particularly problematic for ${\mathrm{SRMR}}_V$ with a stochastic specification because a common motivation for this specification is accommodating covariates with missing values.

To extend the idea of ${\mathrm{SRMR}}_{VC}$ , the scope of SRMR can be refined further by subsetting the mean vector and covariance matrix to only include elements related to the T focal outcomes. This way, all means, variances, and covariances related to covariates are not counted in either the numerator or the denominator. Namely,

(19) $$\begin{align}{\mathrm{SRMR}}_M=\mathrm{SRMR}\left({\overline{\mathbf{y}}}_{\mathrm{T}},{\mathbf{S}}_{\mathrm{T}},{\boldsymbol{\unicode{x3bc}}}_{\mathrm{T}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right),{\boldsymbol{\Sigma}}_{\mathrm{T}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right)\end{align}$$

where ${\boldsymbol{\unicode{x3bc}}}_{\mathrm{T}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)$ and ${\boldsymbol{\Sigma}}_{\mathrm{T}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)$ are the marginal mean vector and marginal covariance matrix that only contain elements involving the T focal variables. After subsetting, ${\mathrm{SRMR}}_M$ uses the same information regardless of whether covariates are fixed or stochastic. Because it uses marginal moments, ${\mathrm{SRMR}}_M$ corresponds to the fit of the unconditional growth model. ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_R$ will be equivalent for mean-centered covariates that explain no variance because they will have the same dimension and the same moments. ${\mathrm{SRMR}}_M^{\ast }$ can be defined similarly but uses the hybrid Bollen–Bentler standardization (equivalence between ${\mathrm{SRMR}}_M^{\ast }$ and ${\mathrm{SRMR}}_R^{\ast }$ holds under the same conditions as equivalence of ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_R$ ).
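
Because ${\mathrm{SRMR}}_M$ only requires subsetting the joint moments, it can be computed by hand from a fitted model even when software does not report it. The R sketch below shows one possible way to do so for a lavaan object fit with a joint (stochastic) covariate specification so that `lavInspect()` returns moments for all V variables; the object name, outcome names, and the assumption of a single group with a mean structure are placeholders.

```r
# Sketch: Bentler-standardized SRMR_M computed from the joint moments of a
# fitted single-group lavaan model with a mean structure.
srmr_m <- function(fit, outcomes) {
  samp <- lavInspect(fit, "sampstat")       # sample moments ($cov, $mean)
  imp  <- lavInspect(fit, "implied")        # model-implied moments ($cov, $mean)

  S    <- samp$cov[outcomes, outcomes]      # subset to the T focal outcomes
  Sig  <- imp$cov[outcomes, outcomes]
  ybar <- samp$mean[outcomes]
  mu   <- imp$mean[outcomes]

  d <- sqrt(diag(S))                        # sample standard deviations
  res_cov  <- (S - Sig) / tcrossprod(d)     # standardized covariance residuals
  res_mean <- (ybar - mu) / d               # standardized mean residuals

  T_num <- length(outcomes)
  ss <- sum(res_cov[lower.tri(res_cov, diag = TRUE)]^2) + sum(res_mean^2)
  sqrt(ss / (0.5 * T_num * (T_num + 3)))    # 0.5T(T+3) unique elements
}

# Example call (placeholder object and variable names):
# srmr_m(fit_stoch, c("y1", "y2", "y3", "y4"))
```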

5.3 Choosing among different definitions

Among the definitions in Sections 5.1 and 5.2, ${\mathrm{SRMR}}_V$ and ${\mathrm{SRMR}}_V^{\ast }$ are the most susceptible to overly optimistic assessments of model fit when covariates are present, especially when a fixed specification is used for covariates. Both definitions use all variables—focal outcomes and covariates—in the numerator and denominator. The means and variances of covariates as well as covariances among covariates will fit exceedingly well because they are either explicitly fixed to sample statistics with a fixed specification (and will fit perfectly) or are the full information maximum likelihood estimates of sample statistics with a stochastic specification (Little & Rubin, Reference Little and Rubin2020).

${\mathrm{SRMR}}_V$ and ${\mathrm{SRMR}}_V^{\ast }$ therefore capitalize on the perfect or near-perfect fit of the $0.5C\left(C+3\right)$ elements involving covariates and are susceptible to deflation because elements involving covariates contribute to the denominator but contribute little or nothing to the numerator. Good fit can be achieved by simply adding many covariates to the model, which increases the proportion of elements with zero or near-zero residuals and thereby attenuates ${\mathrm{SRMR}}_V$ and ${\mathrm{SRMR}}_V^{\ast }$ .

This mechanism motivates ${\mathrm{SRMR}}_{VC}$ and ${\mathrm{SRMR}}_{VC}^{\ast }$ , which explicitly reduce the denominator by $0.5C\left(C+3\right)$ to account for elements that do not contribute to the numerator. This correction reduces, but may not entirely eliminate, the tendency of ${\mathrm{SRMR}}_{VC}$ and ${\mathrm{SRMR}}_{VC}^{\ast }$ to capitalize on the presence of covariates and produce optimistic assessments of fit. In addition to the $0.5C\left(C+3\right)$ elements that will have perfect fit, there are $C\times T$ model-implied covariances that are partially informed by covariates. For instance, in Figure 2a, ${\hat{\sigma}}_{y_1,{x}_1}$ is implied (in part) by ${s}_{x_1}{\hat{\beta}}_{13}$ and ${s}_{x_1,{x}_2}{\hat{\beta}}_{14}$ . Because these sample statistics are reproduced without error, they limit the potential magnitude of misfit in ${\hat{\sigma}}_{y_1,{x}_1}$ . This partial embedding of sample statistics throughout the model makes the effectiveness of a denominator correction for covariates uncertain because it is unclear how, or whether, to account for elements that carry partial covariate information. As a result, ${\mathrm{SRMR}}_{VC}$ and ${\mathrm{SRMR}}_{VC}^{\ast }$ remain susceptible to some deflation when more covariates are added to the model. Nonetheless, the situation is somewhat ambiguous: the model’s ability to reproduce covariances between covariates and outcomes may still be relevant because, even if these residuals are tempered, they are not guaranteed to be exactly zero.
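
To make the partial embedding concrete, consider the Figure 2a specification with the usual linear growth loadings, in which the first repeated measure loads 1 on the initial status factor ( ${\eta}_1$ ) and ${\lambda}_{s,1}$ on the linear change factor ( ${\eta}_2$ ); the following decomposition is a sketch under that assumed loading pattern:

$$\begin{align*}{\hat{\sigma}}_{y_1,{x}_1}=\operatorname{Cov}\left({\eta}_1,{x}_1\right)+{\lambda}_{s,1}\operatorname{Cov}\left({\eta}_2,{x}_1\right),\qquad \operatorname{Cov}\left({\eta}_1,{x}_1\right)={s}_{x_1}{\hat{\beta}}_{13}+{s}_{x_1,{x}_2}{\hat{\beta}}_{14}.\end{align*}$$

The sample variance ${s}_{x_1}$ and sample covariance ${s}_{x_1,{x}_2}$ therefore enter the implied covariance directly, and when ${\lambda}_{s,1}=0$ (as with time codes that start at zero), the implied covariance is built entirely from estimated paths multiplied by sample statistics.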

${\mathrm{SRMR}}_R$ , ${\mathrm{SRMR}}_R^{\ast }$ , ${\mathrm{SRMR}}_M$ , and ${\mathrm{SRMR}}_M^{\ast }$ appear least susceptible to capitalizing on covariate information because they restrict focus solely to the T outcome measures. ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ rely on the marginal mean and covariance, which preserves the interpretation as the fit of the unconditional model regardless of the number of covariates. This provides the most consistency by insulating the interpretation from the effect of covariates, although, as expanded upon in Sections 6.2 and 7, this is not necessarily a positive characteristic. ${\mathrm{SRMR}}_R$ and ${\mathrm{SRMR}}_R^{\ast }$ use the conditional mean and covariance, which is helpful for factoring out explained variance. However, they depend on the scaling of the covariates and require a meaningful zero point for the covariates in order to be interpretable. Essentially, ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ do not engage with the covariate information, which provides a constant interpretation, whereas ${\mathrm{SRMR}}_R$ and ${\mathrm{SRMR}}_R^{\ast }$ fully engage with covariate information, which results in a focused interpretation that is sensitive to changes in covariate scaling. The next section provides an empirical example to demonstrate the differences between SRMR definitions.

6 Empirical example

6.1 Model description

To demonstrate how SRMR can be affected even for modest and routine models, this section uses a subset of the 1979 National Longitudinal Survey of Youth (NLSY). These data are publicly available and were retrieved from the companion site to the popular multilevel modeling textbook by Hox et al. (Reference Hox, Moerbeek and Van de Schoot2018). The data feature Peabody Individual Achievement Test reading scores (PIAT; Dunn & Markwardt, Reference Dunn and Markwardt1970) for 221 children. The data are wave-based such that each child’s PIAT score is measured four times at two-year intervals. This subsample has no missing values and each child completed all four waves.

For this example, a taxonomy of models was fit. Model 0 is an unconditional latent growth model. Model 1 includes mother’s age when the child was born (mom_age; M = 25.59, SD = 1.87, range = 21–29) as a time-invariant covariate of the initial status and linear change growth factors. Model 2 adds a second time-invariant covariate, cognitive support provided at home (home_cog; M = 9.10, SD = 2.45, range = 3–14). Both covariates were grand-mean centered to preserve the interpretation of the growth factor means. The models were fit with maximum likelihood estimation in lavaan Version 0.6.17 (Rosseel, Reference Rosseel2012) and in Mplus Version 8.10 (Muthén & Muthén, 1998–2024). Because there are no missing data, parameter estimates are identical across specifications and programs.
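
For readers wishing to reproduce the taxonomy, a minimal lavaan sketch of the three models is shown below; the variable names (`read1`–`read4`, `mom_age`, `home_cog`) and the data object `nlsy` are placeholders for however the repeated measures and centered covariates are named in the analysis file, and the time codes 0–3 simply reflect the four equally spaced waves.

```r
library(lavaan)

# Model 0: unconditional linear latent growth model for the four PIAT readings
m0 <- '
  i =~ 1*read1 + 1*read2 + 1*read3 + 1*read4
  s =~ 0*read1 + 1*read2 + 2*read3 + 3*read4
'

# Model 1: mother's age (grand-mean centered) predicts both growth factors
m1 <- paste(m0, 'i + s ~ mom_age')

# Model 2: adds cognitive support at home (grand-mean centered)
m2 <- paste(m0, 'i + s ~ mom_age + home_cog')

fit0 <- growth(m0, data = nlsy)
fit1 <- growth(m1, data = nlsy)
fit2 <- growth(m2, data = nlsy)

# Chi-square, degrees of freedom, and the default SRMR for each model
sapply(list(fit0, fit1, fit2), fitMeasures,
       fit.measures = c("chisq", "df", "srmr"))
```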

The model ${\chi}^2$ and degrees of freedom are reported in Table 2 along with the estimated parameters for each model and the SRMR values from each definition in Section 5. The data and code for this example are provided on an Open Science Framework page associated with this paper (https://osf.io/sxp4g). Software options yielding different SRMR definitions are provided in Table 1. ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ are not currently available in either program and were computed manually.

Table 2 Parameter estimates from three latent growth models fit to the empirical reading assessment data

Note: Est = parameter estimate, SE = standard error, df = degrees of freedom, Res. Var. = residual variance, Cov = covariance, Var = variance

6.2 Model results

For each of the three models, the model ${\chi}^2$ statistic is identical across all specifications. The model ${\chi}^2$ values exceed the critical value at any conventional significance level given the model degrees of freedom, indicating that the models do not reproduce the sample moments exactly. Because the model ${\chi}^2$ test can be seen as a severe test of data-model fit (Mayo, Reference Mayo2018), SRMR can quantify the magnitude of the model residuals to contextualize the practical amount of data-model misfit. Because there are no covariates in Model 0, the Bentler-standardized indices ${\mathrm{SRMR}}_V$ , ${\mathrm{SRMR}}_{VC}$ , ${\mathrm{SRMR}}_R$ , and ${\mathrm{SRMR}}_M$ are identical. Similarly, the hybrid Bollen–Bentler standardized indices ${\mathrm{SRMR}}_V^{\ast }$ , ${\mathrm{SRMR}}_{VC}^{\ast }$ , ${\mathrm{SRMR}}_R^{\ast }$ , and ${\mathrm{SRMR}}_M^{\ast }$ are also equal to each other, but they are not equal to the Bentler-standardized SRMRs because the variance structure is not saturated and $\operatorname{diag}\left(\mathbf{S}\right)\ne \operatorname{diag}\left(\hat{\boldsymbol{\Sigma}}\right)$ .

As covariates are added, ${\mathrm{SRMR}}_V$ , ${\mathrm{SRMR}}_{VC}$ , ${\mathrm{SRMR}}_V^{\ast }$ , and ${\mathrm{SRMR}}_{VC}^{\ast }$ systematically decrease regardless of whether the covariates improve the model. That is, despite home_cog having no statistically significant effect on initial status ( $Z=1.32,p=.19$ ) or linear change ( $Z=1.42,p=.16$ ), all four of these SRMR definitions suggest improvement between Model 1 and Model 2. This behavior stems from relying on V-dimensional moments, which rewards reproduction of covariate elements.

To demonstrate, the Model 2 Bentler-standardized residual mean vector and residual covariance matrix for a joint covariate specification are

$$\begin{align*}&{\mathbf{d}}^{-1}\left(\overline{\mathbf{v}}-\boldsymbol{\unicode{x3bc}} \left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right)={\left[\begin{array}{cccccc}\mathrm{Read}1& \mathrm{Read}2& \mathrm{Read}3& \mathrm{Read}4& \mathrm{HomeCog}& \mathrm{MomAge}\\ {}-.351& .164& .083& -.147& .000& .000\end{array}\right]}^{\prime }\\&{\mathbf{D}}^{-1}\left(\mathbf{S}-\boldsymbol{\Sigma} \left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right){\mathbf{D}}^{-1}=\left[\begin{array}{rrrrrrr}& \mathrm{Read}1& \mathrm{Read}2& \mathrm{Read}3& \mathrm{Read}4& \mathrm{HomeCog}& \mathrm{MomAge}\\ {}\mathrm{Read}1& -.327& & & & & \\ {}\mathrm{Read}2& -.006& .076& & & & \\ {}\mathrm{Read}3& -.111& .097& .051& & & \\ {}\mathrm{Read}4& -.193& .012& -.010& -.128& & \\ {}\mathrm{HomeCog}& -.023& .009& .007& -.011& .000& \\ {}\mathrm{MomAge}& -.048& .031& -.001& -.010& .000& .000\end{array}\right]\end{align*}$$

Notably, there are $0.5V\left(V+3\right)=27$ unique elements across the mean vector and covariance matrix. The sum of squared standardized residuals is 0.374 (the numerator of ${\mathrm{SRMR}}_V$ ). Notice, however, that the two rightmost elements in the mean residual vector and the three elements in the rightmost lower triangle of the covariance residual matrix are necessarily zero because they only involve covariate information. Because there are no missing data, these model-implied elements are identical to the sample statistics regardless of whether a fixed or stochastic specification is used. This produces zero residuals for these elements, so they do not contribute to the numerator but they do add to the denominator, resulting in ${\mathrm{SRMR}}_V=\sqrt{.374/27}=.118$ for Model 2.

${\mathrm{SRMR}}_{VC}$ subtracts $0.5C\left(C+3\right)$ from the denominator to address the deterministic zeroes. In Model 2, C = 2, so the denominator is lowered by 5 to account for values that are necessarily 0 (i.e., ${\mathrm{SRMR}}_{VC}=\sqrt{.374/22}=.130$ ). ${\mathrm{SRMR}}_{VC}$ continues to count the $C\times T=8$ elements that are partially based on the covariates (the 8 non-zero values in the last two rows of the residual covariance matrix above). These elements represent the discrepancy between the model-implied and observed covariances of the repeated measures with the covariates.

One perspective is that these elements should count toward SRMR because reproducing the covariances between the outcomes and the covariates is relevant information to consider. From this perspective, reproduction is the main interest and the decrease in ${\mathrm{SRMR}}_{VC}$ is warranted because it indicates that the model is adequately reproducing the covariances between the repeated measures and the covariates.

An alternative perspective is that the model should not be rewarded for small residuals that are partially composed of sample statistics. Although these elements are not deterministically zero, their magnitude is moderated; for instance, these elements represent 8 of the 11 smallest non-zero residuals in the model. From this perspective, counting elements that are partially dependent on covariate sample statistics artificially deflates ${\mathrm{SRMR}}_{VC}$ because there is no substantive interest in reproducing these covariance elements, and counting them drives down ${\mathrm{SRMR}}_{VC}$ without providing substantively useful information. From this perspective, determining whether the covariates improve the model is the main interest and the decrease in ${\mathrm{SRMR}}_{VC}$ is less warranted because it does not speak to whether the covariates are useful or whether the model-implied covariances of the repeated measures more closely reproduce the observed covariances after covariates are included. This distinction is particularly pertinent for researchers intending to use ${\mathrm{SRMR}}_{VC}$ for model comparisons or fit index difference evaluations because a model reproducing the covariances between outcomes and covariates is not the same as the covariates explaining variance in the outcomes.

Regarding other definitions, ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ do not change across the three models. This stability is due to these definitions marginalizing over the covariates, so ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ will be constant whether covariates are present or not. For these data, the Bentler-standardized residual mean vector and residual covariance matrix used for ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ only include the T-dimensional elements related to the four repeated measures:

$$\begin{align*}&{{\mathbf{d}}_{\mathrm{T}}}^{-1}\left({\overline{\mathbf{y}}}_{\mathrm{T}}-{\boldsymbol{\unicode{x3bc}}}_{\mathrm{T}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right)={\left[\begin{array}{rrrr}\mathrm{Read}1& \mathrm{Read}2& \mathrm{Read}3& \mathrm{Read}4\\ {}-.351& .164& .083& -.147\end{array}\right]}^{\prime }\\&{{\mathbf{D}}_{\mathrm{T}}}^{-1}\left({\mathbf{S}}_{\mathrm{T}}-{\boldsymbol{\Sigma}}_{\mathrm{T}}\left({\hat{\boldsymbol{\vartheta}}}_{\mathrm{joint}}\right)\right){{\mathbf{D}}_{\mathrm{T}}}^{-1}=\left[\begin{array}{rrrrr}& \mathrm{Read}1& \mathrm{Read}2& \mathrm{Read}3& \mathrm{Read}4\\ {}\mathrm{Read}1& -.327& & & \\ {}\mathrm{Read}2& -.006& .076& & \\ {}\mathrm{Read}3& -.111& .097& .051& \\ {}\mathrm{Read}4& -.193& .012& -.010& -.128\end{array}\right]\end{align*}$$

where ${\mathbf{d}}_{\mathrm{T}}=\operatorname{diag}{\left({\mathbf{S}}_{\mathrm{T}}\right)}^{1/2}$ and ${\mathbf{D}}_{\mathrm{T}}={\mathbf{I}}_{\mathrm{T}}\odot {\mathbf{d}}_{\mathrm{T}}$ . There are only $0.5T\left(T+3\right)=14$ unique elements, resulting in a sum of squared residuals of .369 and ${\mathrm{SRMR}}_M=\sqrt{.369/14}=.162$ . The calculation involves no covariate information and focuses purely on the fit of the repeated measure elements.
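
As a quick arithmetic check, the three Model 2 values reported above follow directly from the residual summaries (a minimal sketch in R):

```r
# Quick check of the Model 2 values reported in the text
sqrt(0.374 / 27)  # SRMR_V  ~ .118: all 27 unique elements in the denominator
sqrt(0.374 / 22)  # SRMR_VC ~ .130: 27 - 5 covariate-only elements
sqrt(0.369 / 14)  # SRMR_M  ~ .162: 14 unique repeated-measure elements only
```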

${\mathrm{SRMR}}_R$ and ${\mathrm{SRMR}}_R^{\ast }$ happen to mirror ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ , respectively, in this example, but this will not be the case generally. The covariates in the example are grand-mean centered and explain little variance, so the marginal and conditional definitions converge. If Model 2 is instead refit using uncentered covariates, ${\mathrm{SRMR}}_R=.116$ and ${\mathrm{SRMR}}_R^{\ast}=.105$ , and the conditional moments no longer correspond to the marginal moments because SRMR is now conditional on the covariates equaling 0 on their original scales. Similarly, if the interest were the fit of Model 2 for people simultaneously at the maximum values of the two covariates, the model could be refit with the covariates centered around their maximum values. This yields ${\mathrm{SRMR}}_R=.191$ and ${\mathrm{SRMR}}_R^{\ast}=.113$ , suggesting that the model fits less well for people at the upper extreme of the covariates. The other SRMR definitions are based on marginal moments and are unaffected by covariate centering.

This reinforces the point made earlier: although ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_R$ are both resistant to artificial deflation when covariates are added and can converge in some cases, their resistance is gained through two opposing mechanisms. ${\mathrm{SRMR}}_M$ is not deflated by covariates because its interpretation is insensitive to the presence of covariates. Conversely, ${\mathrm{SRMR}}_R$ is not deflated by covariates because its interpretation is entirely dependent on the covariate information, and its value changes within the same model as a function of covariate scaling.

6.3 Expanding Figure 1

The data in Section 6.2 were empirical, so the truth is not concretely known. All SRMR definitions from Section 5 were therefore applied to the same simulated data from Section 2 that were used to create Figure 1. The results are shown in Figure 3, where Panel (a) shows the Bentler-standardized definitions and Panel (b) shows the hybrid Bollen–Bentler standardized definitions.

Figure 3 Simulation results showing average SRMR value across replications as the number of null covariates increases. Panel (a) shows the Bentler-standardized SRMR definitions and panel (b) shows the Bollen-Bentler standardized SRMR definitions. Patterns in simulated data match those in the empirical example where ${\mathrm{SRMR}}_R$ and ${\mathrm{SRMR}}_M$ are stable and unaffected when null covariates are added whereas ${\mathrm{SRMR}}_V$ decreases sharply and ${\mathrm{SRMR}}_{VC}$ decreases but more moderately.

Results in Figure 3 mirror those in the empirical example. Namely, when truly null covariates are added to the model, ${\mathrm{SRMR}}_M$ maintains a consistent value, ${\mathrm{SRMR}}_R$ is mostly consistent with small variation due to variance explained by random chance, ${\mathrm{SRMR}}_V$ sharply decreases as more null covariates are added because a larger proportion of elements are deterministic zeroes, and ${\mathrm{SRMR}}_{VC}$ decreases but less sharply because it filters out the deterministic zeroes but still includes elements that are partially based on sample statistics. Patterns are the same for either standardization method.

7 Discussion

The traditional definition of SRMR is appropriate for covariance structure models, but many structural equation model applications include additional features that change how SRMR should be defined. The current paper focused on the context of models with covariates. Although mainstream software reports SRMR values for models featuring covariates, there has been little formal methodological work exploring how to suitably extend SRMR when covariates are present.

The primary finding of this paper was that some SRMR definitions are susceptible to being systematically deflated when covariates are present ( ${\mathrm{SRMR}}_V$ , ${\mathrm{SRMR}}_V^{\ast }$ , ${\mathrm{SRMR}}_{VC}$ , and ${\mathrm{SRMR}}_{VC}^{\ast }$ ). Other definitions were less susceptible ( ${\mathrm{SRMR}}_R$ , ${\mathrm{SRMR}}_R^{\ast }$ , ${\mathrm{SRMR}}_M$ , and ${\mathrm{SRMR}}_M^{\ast }$ ), but had properties and interpretational caveats that may be undesirable. This is primarily due to joint covariate specifications, which can complicate determining which variables are "in the model" and which model residuals should contribute to the numerator and denominator of SRMR.

Consequently, this paper does not definitively solve the issue of properly defining SRMR with covariates, and it may raise more questions than it answers. It also only considered the situation with continuous outcomes and did not explore contexts where outcomes are discrete (see Section 2.7 of Asparouhov & Muthén, Reference Asparouhov and Muthén2018 for SRMR considerations with discrete outcomes). Nonetheless, the hope is that this paper at least raises awareness of these potential issues and encourages more thoughtful consideration of how to interpret SRMR when models feature covariates.

Regarding specific limitations of the definitions that did not systematically decrease when covariates were included, ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ mirror evaluating fit with no covariates because covariates are marginalized out, which does not differentiate between explained and unexplained variance. ${\mathrm{SRMR}}_M$ and ${\mathrm{SRMR}}_M^{\ast }$ are essentially oblivious to covariates because the variance is pushed around to different sources while the marginal amount of variance is unchanged; ${\mathrm{SRMR}}_M$ is not deflated by covariates simply because it is not sensitive to them. ${\mathrm{SRMR}}_R$ and ${\mathrm{SRMR}}_R^{\ast }$ are conditional, which seems more useful because the variance explained by covariates is removed. However, their interpretation depends on how the covariates are scaled.

In essence, a meaningful SRMR with covariates may require a better developed sense of what the model-implied moments should reproduce. This is closely related to the ambiguity when interpreting ${\mathrm{SRMR}}_{VC}$ and ${\mathrm{SRMR}}_{VC}^{\ast }$ where it is unclear whether elements corresponding to covariances between outcomes and covariates are part of the model and whether reducing them meaningfully corresponds to what fit should be capturing.

Moreover, it is unclear which values of any of these SRMR definitions indicate acceptable approximate fit in the presence of covariates. The traditional guideline from Hu and Bentler (Reference Hu and Bentler1999) is that SRMR < .08 indicates acceptable fit. However, this guideline is known not to generalize well beyond the confirmatory factor models from which it was derived (e.g., Fan & Sivo, Reference Fan and Sivo2007; Hancock & Mueller, Reference Hancock and Mueller2011; McNeish & Wolf, Reference McNeish and Wolf2024). Models with covariates differ in meaningful ways from Hu and Bentler’s simulation, so it is unclear which SRMR values indicate substantively important misfit when covariates are present. This is especially true for models with an overidentified mean structure because the mean and covariance structures may be weighted differently.

This also raises questions about whether combining mean structure residuals and covariance structure residuals into a single index is meaningful. As noted earlier, the mean residuals and covariance residuals can be split, with a separate SRMR value computed for each substructure, to make the index more interpretable and to better identify the location of misfit (see the sketch below). Nonetheless, the same issues discussed in this paper are relevant even when using separate SRMRs for the mean and covariance structures because there are still ambiguities about which elements should and should not be counted in each (though the mean structure is less opaque than the covariance structure).
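
As a rough sketch of that splitting, the standardized residuals can be partitioned into mean and covariance parts and summarized separately; the function below is illustrative, assumes the residuals are already standardized, and simply applies a root-mean-square summary to each partition. The example call reuses the Model 2 repeated-measure residuals reported in Section 6.2.

```r
# Sketch: separate SRMR-type summaries for the mean and covariance structures,
# given standardized mean residuals and unique standardized covariance residuals
srmr_split <- function(res_mean, res_cov_unique) {
  c(srmr_mean = sqrt(mean(res_mean^2)),
    srmr_cov  = sqrt(mean(res_cov_unique^2)))
}

# Example using the Model 2 repeated-measure residuals from Section 6.2
srmr_split(res_mean = c(-.351, .164, .083, -.147),
           res_cov_unique = c(-.327,
                              -.006, .076,
                              -.111, .097, .051,
                              -.193, .012, -.010, -.128))
```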

Of course, SRMR may simply have too many interpretational challenges to productively assess approximate fit for models with covariates. Whereas the lack of dependence on $T_{\mathrm{ML}}$ can be a helpful property of SRMR in covariance structure models, SRMR’s reliance on model residuals can complicate its calculation in situations where there is ambiguity regarding which variables and corresponding residual elements should be counted as part of "the model". If a concise summary of residuals via SRMR proves too difficult, local fit may offer another option for those looking to evaluate the fit of models with covariates.

Broadly, local fit refers to evaluating some smaller portion of the overall model (Thoemmes et al., Reference Thoemmes, Rosseel and Textor2018). This can include structural versus measurement portions (Anderson & Gerbing, Reference Anderson and Gerbing1988; Zhang & Wu, Reference Zhang and Wu2024) or can be as narrow as elementwise inspection of each individual residual element to identify elements that were not closely reproduced by the model (McDonald & Ho, Reference McDonald and Ho2002; West et al., Reference West, Wu, McNeish, Savord and Hoyle2023). Elementwise local fit is essentially the most extreme version of splitting SRMR into subcomponents and can help locate areas of local strain and ensure that misfit is not attributable to a few outlying elements that are poorly reproduced (Appelbaum et al., Reference Appelbaum, Cooper, Kline, Mayo-Wilson, Nezu and Rao2018; Kline, Reference Kline2023).

Elementwise local fit is commonly recommended as a supplement to global fit metrics like SRMR. However, reviews of empirical studies suggest that few studies report or examine model residuals and elementwise local fit, relying instead on global summary measures (Ropovik, Reference Ropovik2015; Zhang et al., Reference Zhang, Dawson and Kline2021). As models become more complex, it may be more straightforward to simply look at each residual in isolation rather than debate the merits of different possible aggregated summaries of the residuals.

Similar to global fit, elementwise local fit can be exact or approximate. In exact approaches, inferential tests are built to assess whether an individual residual element is equal to 0 (Maydeu-Olivares, Reference Maydeu-Olivares2017; Ogasawara, Reference Ogasawara2001). With approximate local fit, the intent is to identify whether the amount of misfit for an individual element is acceptably small. A typical recommendation for elementwise local approximate fit is that standardized residuals fall between [–0.10, 0.10] (Hu & Bentler, Reference Hu and Bentler1995; Goodboy & Kline, Reference Goodboy and Kline2017; Schreiber, Reference Schreiber2008). However, this recommendation is motivated by factor analysis and may not apply to models with overidentified mean structures where residuals are not bounded. Additional work that refines understanding of elementwise local fit in models that extend beyond factor analysis would be beneficial.
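
For instance, a rough lavaan-based sketch of elementwise screening might look like the following; `fit2` refers to the hypothetical fitted Model 2 object from the earlier sketch, the residual type label may vary across lavaan versions, and the ±.10 screen is only the factor-analysis-derived heuristic noted above.

```r
# Sketch: elementwise local fit screening of standardized residuals
res <- lavResiduals(fit2, type = "cor.bentler")  # Bentler-standardized residuals

res$cov   # standardized covariance residuals
res$mean  # standardized mean residuals (when a mean structure is modeled)

# Flag elements outside the +/- .10 heuristic; treat the threshold cautiously
# for mean-structure elements, as discussed above
which(abs(res$cov) > .10, arr.ind = TRUE)
which(abs(res$mean) > .10)
```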

In sum, this paper has hopefully illuminated potential issues and open problems in extending approximate fit indices like SRMR that were originally developed for the narrower context of covariance structure models. Even though there are few definitive conclusions, the hope is that researchers will have a greater appreciation for the nuance required when using any SRMR definition to understand data-model fit when covariates are present.

Author contributions

Both authors conceived the idea and contributed to the methodology. Matta led the initial draft with input from McNeish. McNeish primarily edited subsequent drafts and handled revisions to accommodate reviewer comments. Both authors contributed to the simulation and the empirical analyses.

Funding statement

This work was partially supported by the Institute of Education Sciences, Award Number R305D220003 (PI = McNeish).

Competing interests

The authors have no financial or non-financial competing interests to disclose.

Footnotes

1 Note that Mplus uses a slightly different definition of SRMR than defined by Bentler (Reference Bentler1995); see Equation 129 in Muthén (Reference Muthén2004, p. 23). This is discussed in more detail in Section 3.4.

2 If covariates have known imperfect reliability, residual variances could be fixed to a non-zero value that implies a particular reliability (Bollen, 1989, p. 312; Cole & Preacher, Reference Cole and Preacher2014).

References

Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A review and recommended two-step approach. Psychological Bulletin, 103(3), 411–423.
Appelbaum, M., Cooper, H., Kline, R., Mayo-Wilson, E., Nezu, A., & Rao, S. (2018). Journal article reporting standards for quantitative research in psychology: The APA publications and communications board task force report. American Psychologist, 73(1), 3–25.
Asparouhov, T., & Muthén, B. (2018). SRMR in Mplus. Technical Report. Mplus. Retrieved from http://www.statmodel.com/download/SRMR2.pdf
Baraldi, A. N., & Enders, C. K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48(1), 5–37.
Bentler, P. M. (1995). EQS structural equations program manual. Multivariate Software.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88(3), 588–606.
Bentler, P. M., & Weeks, D. G. (1979). Interrelations among models for the analysis of moment structures. Multivariate Behavioral Research, 14(2), 169–186.
Bollen, K. A. (1989). Structural equations with latent variables. Wiley.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In Bollen, K. A. & Long, J. S. (Eds.), Testing structural equation models (pp. 136–162). Sage.
Browne, M. W., MacCallum, R. C., Kim, C. T., Andersen, B. L., & Glaser, R. (2002). When fit indices and residuals are incompatible. Psychological Methods, 7(4), 403–421.
Cole, D. A., & Preacher, K. J. (2014). Manifest variable path analysis: Potentially serious and misleading consequences due to uncorrected measurement error. Psychological Methods, 19(2), 300–315.
Dunn, L. M., & Markwardt, F. C. (1970). Peabody individual achievement test. American Guidance System.
Enders, C. K. (2006). Analyzing structural equation models with missing data. In Hancock, G. R., & Mueller, R. O. (Eds.), Structural equation modeling: A second course (pp. 315–344). Information Age Publishing.
Fan, X., & Sivo, S. A. (2007). Sensitivity of fit indices to model misspecification and model types. Multivariate Behavioral Research, 42(3), 509–529.
Goodboy, A. K., & Kline, R. B. (2017). Statistical and practical concerns with published communication research featuring structural equation modeling. Communication Research Reports, 34(1), 68–77.
Hancock, G. R., & Mueller, R. O. (2011). The reliability paradox in assessing structural relations within covariance structure models. Educational and Psychological Measurement, 71(2), 306–324.
Hox, J. J., Moerbeek, M., & Van de Schoot, R. (2018). Multilevel analysis: Techniques and applications (3rd ed.). Routledge.
Hu, L., & Bentler, P. M. (1995). Evaluating model fit. In Hoyle, R. (Ed.), Structural equation modeling: Issues, concepts, and applications (pp. 76–99). Sage.
Hu, L. T., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3(4), 424–453.
Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.
Hu, L. T., Bentler, P. M., & Kano, Y. (1992). Can test statistics in covariance structure analysis be trusted? Psychological Bulletin, 112(2), 351–362.
Jöreskog, K., & Sörbom, D. (1981). LISREL V: Analysis of linear structural relationships by maximum likelihood and least squares methods. International Educational Services.
Jöreskog, K. G., & Sörbom, D. (1982). Recent developments in structural equation modeling. Journal of Marketing Research, 19(4), 404–416.
Kelley, K., & Preacher, K. J. (2012). On effect size. Psychological Methods, 17(2), 137–152.
Kenward, M. G., & Molenberghs, G. (1998). Likelihood based frequentist inference when data are missing at random. Statistical Science, 236–247.
Kline, R. B. (2023). Principles and practice of structural equation modeling (5th ed.). Guilford.
Leite, W. L., & Stapleton, L. M. (2011). Detecting growth shape misspecifications in latent growth models: An evaluation of fit indexes. The Journal of Experimental Education, 79(4), 361–381.
Little, R. J. A., & Rubin, D. B. (2020). Statistical analysis with missing data (3rd ed.). Wiley.
MacCallum, R. C. (2003). 2001 presidential address: Working with imperfect models. Multivariate Behavioral Research, 38(1), 113–139.
Maydeu-Olivares, A. (2017). Assessing the size of model misfit in structural equation models. Psychometrika, 82(3), 533–558.
Maydeu-Olivares, A., Shi, D., & Rosseel, Y. (2018). Assessing fit in structural equation models: A Monte-Carlo evaluation of RMSEA versus SRMR confidence intervals and tests of close fit. Structural Equation Modeling, 25(3), 389–402.
Mayo, D. (2018). Statistical inference as severe testing: How to get beyond the statistics wars. Cambridge University Press.
McArdle, J. J., & McDonald, R. P. (1984). Some algebraic properties of the reticular action model for moment structures. British Journal of Mathematical and Statistical Psychology, 37(2), 234–251.
McDonald, R. P., & Ho, M. H. R. (2002). Principles and practice in reporting structural equation analyses. Psychological Methods, 7(1), 64–82.
McNeish, D., & Wolf, M. G. (2023). Dynamic fit index cutoffs for one-factor models. Behavior Research Methods, 55(3), 1157–1174.
McNeish, D., & Wolf, M. G. (2024). Direct discrepancy dynamic fit index cutoffs for arbitrary covariance structure models. Structural Equation Modeling, advance online publication.
Muthén, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49(1), 115–132.
Muthén, B. O. (2004). Mplus technical appendices. Muthén & Muthén. https://statmodel.com/download/techappen.pdf
Muthén, L. K., & Muthén, B. O. (2017). Mplus user’s guide (8th ed.). Muthén & Muthén.
Ogasawara, H. (2001). Standard errors of fit indices using residuals in structural equation modeling. Psychometrika, 66, 421–436.
Pavlov, G., Maydeu-Olivares, A., & Shi, D. (2021). Using the standardized root mean squared residual (SRMR) to assess exact fit in structural equation models. Educational and Psychological Measurement, 81(1), 110–130.
Ropovik, I. (2015). A cautionary note on testing latent variable models. Frontiers in Psychology, 6, 163271.
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36.
Saris, W. E., Satorra, A., & Van der Veld, W. M. (2009). Testing structural equation models or detection of misspecifications? Structural Equation Modeling, 16(4), 561–582.
Savalei, V. (2010). Expected versus observed information in SEM with incomplete normal and nonnormal data. Psychological Methods, 15(4), 352–367.
Schreiber, J. B. (2008). Core reporting practices in structural equation modeling. Research in Social and Administrative Pharmacy, 4(2), 83–97.
Shi, D., Maydeu-Olivares, A., & DiStefano, C. (2018). The relationship between the standardized root mean square residual and model misspecification in factor analysis models. Multivariate Behavioral Research, 53(5), 676–694.
Shi, D., Maydeu-Olivares, A., & Rosseel, Y. (2020). Assessing fit in ordinal factor analysis models: SRMR vs RMSEA. Structural Equation Modeling, 27(1), 1–15.
Shi, D., DiStefano, C., Maydeu-Olivares, A., & Lee, T. (2022). Evaluating SEM model fit with small degrees of freedom. Multivariate Behavioral Research, 57(2–3), 179–207.
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling. Chapman & Hall/CRC.
Sterba, S. K. (2014). Handling missing covariates in conditional mixture models under missing at random assumptions. Multivariate Behavioral Research, 49(6), 614–632.
Thoemmes, F., Rosseel, Y., & Textor, J. (2018). Local fit evaluation of structural equation models using graphical criteria. Psychological Methods, 23(1), 27–41.
von Oertzen, T., & Brick, T. R. (2014). Efficient Hessian computation using sparse matrix derivatives in RAM notation. Behavior Research Methods, 46, 385–395.
Vonesh, E. F., Chinchilli, V. M., & Pu, K. (1996). Goodness-of-fit in generalized nonlinear mixed-effects models. Biometrics, 52(2), 572–587.
West, S. G., Wu, W., McNeish, D., & Savord, A. (2023). Model fit in structural equation modeling. In Hoyle, R. H. (Ed.), Handbook of structural equation modeling (2nd ed., pp. 184–205). Guilford Press.
Wu, W., & West, S. G. (2010). Sensitivity of fit indices to misspecification in growth curve models. Multivariate Behavioral Research, 45(3), 420–452.
Ximénez, C., Maydeu-Olivares, A., Shi, D., & Revuelta, J. (2022). Assessing cutoff values of SEM fit indices: Advantages of the unbiased SRMR index and its cutoff criterion based on communality. Structural Equation Modeling, 29(3), 368–380.
Yuan, K. H. (2005). Fit indices versus test statistics. Multivariate Behavioral Research, 40(1), 115–148.
Yuan, K. H., Zhang, Z., & Deng, L. (2019). Fit indices for mean structures with growth curve models. Psychological Methods, 24(1), 36–53.
Zhang, X., & Wu, H. (2024). Investigating structural model fit evaluation. Structural Equation Modeling, advance online publication.
Zhang, M. F., Dawson, J. F., & Kline, R. B. (2021). Evaluating the use of covariance-based structural equation modelling with reflective measurement in organizational and management research: A review and recommendations for best practice. British Journal of Management, 32(2), 257–272.
Figure 1 Average SRMR values across replications for a latent growth model with four repeated measures fit with default options in lavaan and Mplus. The population model has no covariates, but null covariates were added. The SRMR value systematically decreases as a function of covariates, even though the covariates explain no variance and have no effect.

Figure 2 Hypothetical path diagram of conditional latent growth model with two time-invariant covariates and four repeated measures. Panel (a) shows a joint and fixed covariate specification where the covariates are converted to latent variables whose moments are constrained to sample statistics. Panel (b) shows a joint and stochastic specification where the covariates are converted to latent variables whose moments are free parameters. Panel (c) shows a conditional and fixed specification where the manifest covariates directly predict the latent growth factors. The difference between panels (a) and (b) is subtle and is related to whether the means, variances, and covariances of η3 and η4 are fixed or estimated.

Table 1 Comparison of primary features of different possible SRMR definitions for models with covariates