Estimating Regression Models in Which the Dependent Variable Is Based on Estimates
Published online by Cambridge University Press: 04 January 2017
Abstract
Researchers often use as dependent variables quantities estimated from auxiliary data sets. Estimated dependent variable (EDV) models arise, for example, in studies where counties or states are the units of analysis and the dependent variable is an estimated mean, proportion, or regression coefficient. Scholars fitting EDV models have generally recognized that variation in the sampling variance of the observations on the dependent variable will induce heteroscedasticity. We show that the most common approach to this problem, weighted least squares, will usually lead to inefficient estimates and underestimated standard errors. In many cases, OLS with White's or Efron's heteroscedasticity-consistent standard errors yields better results. We also suggest two simple alternative FGLS approaches that are more efficient and yield consistent standard error estimates. Finally, we apply the various alternative estimators to a replication of Cohen's (2004) cross-national study of presidential approval.
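To make the OLS-with-robust-standard-errors alternative concrete, here is a minimal sketch of a bivariate OLS fit that reports both the classical standard error of the slope and White's heteroscedasticity-consistent (HC0) standard error. The function name `ols_with_white_se`, the simulated data, and all parameter values are assumptions chosen for illustration; none of them come from the article, and the sketch does not implement the FGLS estimators the authors propose.

```python
import math
import random


def ols_with_white_se(x, y):
    """Fit y = a + b*x by OLS; return (slope, classical SE, White HC0 SE)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical variance of the slope assumes one common error variance.
    classical_var = sum(e ** 2 for e in resid) / (n - 2) / sxx

    # White's HC0 sandwich estimator lets each observation have its own
    # error variance, proxied by its squared residual.
    white_var = sum((xi - x_bar) ** 2 * e ** 2
                    for xi, e in zip(x, resid)) / sxx ** 2

    return slope, math.sqrt(classical_var), math.sqrt(white_var)


# Hypothetical EDV-style data: each observed y_i combines a regression
# error with a sampling error whose standard deviation differs by unit.
random.seed(0)
x = [random.uniform(0.0, 1.0) for _ in range(200)]
y = []
for xi in x:
    regression_error = random.gauss(0.0, 0.5)
    sampling_sd = 0.2 + 1.5 * xi  # assumed: noisier estimates at larger x
    sampling_error = random.gauss(0.0, sampling_sd)
    y.append(1.0 + 2.0 * xi + regression_error + sampling_error)

slope, classical_se, white_se = ols_with_white_se(x, y)
print(f"slope={slope:.3f}  classical SE={classical_se:.3f}  White SE={white_se:.3f}")
```

The point estimate is the same under both approaches; only the standard error changes. HC0 is the simplest member of the heteroscedasticity-consistent family, and small-sample variants exist that rescale the squared residuals.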
- Type
- Research Article
- Information
- Political Analysis, Volume 13, Issue 4: Special Issue on Multilevel Modeling for Large Clusters, Autumn 2005, pp. 345-364
- Copyright
- Copyright © The Author 2005. Published by Oxford University Press on behalf of the Society for Political Methodology
Footnotes
Authors' note: We thank Chris Achen, Barry Burden, Michael Herron, Gary King, Eduardo Leoni, and Lynn Vavreck for comments on earlier drafts of this article. Any remaining errors are ours alone.