
Generating Correlated, Non-normally Distributed Data Using a Non-linear Structural Model

Published online by Cambridge University Press:  01 January 2025

Max Auerswald*
Affiliation:
University of Mannheim and University of Kassel
Morten Moshagen
Affiliation:
University of Kassel
*Correspondence should be made to Max Auerswald, Institute of Psychology, University of Kassel, Holländische Straße 36-38, 34127 Kassel, Germany. Email: [email protected]

Abstract

An approach to generate non-normality in multivariate data based on a structural model with normally distributed latent variables is presented. The key idea is to create non-normality in the manifest variables by applying non-linear linking functions to the latent part, the error part, or both. The algorithm corrects the covariance matrix for the applied function by approximating the deviance using an approximated normal variable. We show that the root mean square error (RMSE) for the covariance matrix converges to zero as sample size increases and closely approximates the RMSE as obtained when generating normally distributed variables. Our algorithm creates non-normality affecting every moment, is computationally undemanding, easy to apply, and particularly useful for simulation studies in structural equation modeling.
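The key idea can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single-factor model with standardized variables, applies the same hypothetical linking function (`exp(0.5 * z)`) to both the latent and the error part, and corrects for the function by standardizing it against a large reference sample of normal draws. The names `standardized_link` and `simulate_one_factor` are illustrative only.

```python
import numpy as np


def standardized_link(link, n_ref=1_000_000, seed=0):
    """Rescale `link` to mean 0 and variance 1 under a N(0, 1) input.

    Assumption: the mean and standard deviation of link(z) are approximated
    from a large reference sample of standard normal draws.
    """
    z = np.random.default_rng(seed).standard_normal(n_ref)
    values = link(z)
    mean, sd = values.mean(), values.std()
    return lambda z: (link(z) - mean) / sd


def simulate_one_factor(n, loadings, link, rng):
    """Draw n observations from x_j = l_j * f(eta) + sqrt(1 - l_j^2) * f(eps_j).

    eta and eps_j are independent standard normal variables and f is the
    standardized linking function, so the implied covariance matrix stays
    l l' off the diagonal with unit variances on the diagonal.
    """
    loadings = np.asarray(loadings, dtype=float)
    f = standardized_link(link)
    eta = rng.standard_normal((n, 1))              # latent part
    eps = rng.standard_normal((n, loadings.size))  # error part
    return f(eta) * loadings + f(eps) * np.sqrt(1.0 - loadings**2)


loadings = np.array([0.7, 0.6, 0.8])  # hypothetical standardized loadings
rng = np.random.default_rng(1)
x = simulate_one_factor(200_000, loadings, lambda z: np.exp(0.5 * z), rng)

# Model-implied covariance matrix of the manifest variables.
implied = np.outer(loadings, loadings)
np.fill_diagonal(implied, 1.0)

# Largest absolute deviation between the sample and implied covariances.
sample_cov = np.cov(x, rowvar=False)
max_abs_error = np.abs(sample_cov - implied).max()
```

Because the latent and error parts are independent here, standardizing the linking function is enough to preserve the implied covariance matrix. With several correlated latent variables, a non-linear function also changes their correlations, so a further correction of the kind the abstract describes would be needed.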

Type
Original Paper
Copyright
Copyright © 2015 The Psychometric Society


Footnotes

Electronic supplementary material: The online version of this article (doi:10.1007/s11336-015-9468-7) contains supplementary material, which is available to authorized users.

Supplementary material: File

Auerswald and Moshagen supplementary material
