To Lag or Not to Lag?: Re-Evaluating the Use of Lagged Dependent Variables in Regression Analysis*
Published online by Cambridge University Press: 03 May 2017
Abstract
Lagged dependent variables (LDVs) have been used in regression analysis to provide robust estimates of the effects of independent variables, but some research argues that using LDVs in regressions produces negatively biased coefficient estimates, even if the LDV is part of the data-generating process. I demonstrate that these concerns are easily resolved by specifying a regression model that accounts for autocorrelation in the error term. This implies that more lags of the dependent and independent variables should be included in the specification, not fewer. Including the additional lags yields more accurate parameter estimates, which I demonstrate using the same data-generating process scholars had previously used to argue against including LDVs. I use Monte Carlo simulations to show that this specification returns much more accurate coefficient estimates for independent variables (across a wide range of parameter values) than alternatives considered in earlier research. The simulation results also indicate that improper exclusion of LDVs can lead to severe bias in coefficient estimates. Although LDVs are no panacea, scholars should continue to confidently include them as part of a robust estimation strategy.
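The comparison described in the abstract can be illustrated with a small Monte Carlo sketch. It is not the author's replication code. It assumes a simple data-generating process, y_t = alpha*y_{t-1} + beta*x_t + u_t with AR(1) errors u_t = phi*u_{t-1} + e_t and an AR(1) regressor x_t = rho*x_{t-1} + v_t, and the parameter values, sample size, and function names (`simulate`, `beta_estimates`, `monte_carlo`) are all illustrative assumptions. Under this assumed process, quasi-differencing gives y_t = (alpha + phi)*y_{t-1} - alpha*phi*y_{t-2} + beta*x_t - beta*phi*x_{t-1} + e_t, so a regression with two lags of y and one lag of x has a white-noise error.

```python
import numpy as np

def simulate(n, alpha, beta, phi, rho, rng, burn=100):
    """Draw one series from the assumed DGP (hypothetical, for illustration)."""
    total = n + burn
    e = rng.standard_normal(total)  # innovations to the error process
    v = rng.standard_normal(total)  # innovations to the regressor
    x = np.zeros(total)
    u = np.zeros(total)
    y = np.zeros(total)
    for t in range(1, total):
        x[t] = rho * x[t - 1] + v[t]                # AR(1) regressor
        u[t] = phi * u[t - 1] + e[t]                # AR(1) error term
        y[t] = alpha * y[t - 1] + beta * x[t] + u[t]
    return y[burn:], x[burn:]                       # discard burn-in draws

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def beta_estimates(y, x):
    """Estimate the coefficient on x_t under three specifications."""
    yt, y1, y2 = y[2:], y[1:-1], y[:-2]
    xt, x1 = x[2:], x[1:-1]
    c = np.ones_like(yt)
    static = ols(np.column_stack([c, xt]), yt)[1]            # no LDV
    ldv = ols(np.column_stack([c, y1, xt]), yt)[2]           # one LDV only
    adl = ols(np.column_stack([c, y1, y2, xt, x1]), yt)[3]   # extra lags
    return static, ldv, adl

def monte_carlo(reps=500, n=200, alpha=0.5, beta=1.0, phi=0.5, rho=0.5, seed=0):
    """Average each specification's estimate of beta over many simulated samples."""
    rng = np.random.default_rng(seed)
    draws = np.array([
        beta_estimates(*simulate(n, alpha, beta, phi, rho, rng))
        for _ in range(reps)
    ])
    return dict(zip(("static", "ldv", "adl"), draws.mean(axis=0)))
```

Calling `monte_carlo()` returns the mean estimate of beta for each specification, which can be compared with the true value passed in as `beta`. With the assumed values above, the specification with additional lags should average close to the true coefficient, the single-LDV model should fall below it, and the model without any LDV should overshoot it.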
- Type: Research Notes
- Copyright: © The European Political Science Association 2017
Footnotes
Arjun S. Wilkins, Department of Political Science, Stanford University, Encina Hall West, Room 100, 616 Serra St., Stanford, CA 94305-6044. I wish to thank Justin Grimmer, Simon Jackman, Bobby Gulotty, and two anonymous reviewers for their very helpful comments and advice as I worked on this paper. Any errors or omissions are the author’s responsibility. To view supplementary material for this article, please visit http://dx.doi.org/10.1017/psrm.2017.4