Understanding Wordscores
Published online by Cambridge University Press: 04 January 2017
Abstract
Wordscores is a widely used procedure for inferring policy positions, or scores, for new documents on the basis of scores for words derived from documents with known scores. It is computationally straightforward and requires no distributional assumptions, but has unresolved practical and theoretical problems. In applications, estimated document scores are on the wrong scale and the theoretical development does not specify a statistical model, so it is unclear what assumptions the method makes about political text and how to tell whether they fit particular text analysis applications. The first part of the paper demonstrates that badly scaled document score estimates reflect deeper problems with the method. The second part shows how to understand Wordscores as an approximation to correspondence analysis, which itself approximates a statistical ideal point model for words. Problems with the method are identified with the conditions under which these layers of approximation fail to ensure consistent and unbiased estimation of the parameters of the ideal point model.
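For readers unfamiliar with the procedure, the two steps the abstract describes (word scores derived from documents with known scores, then scores for new documents) can be sketched as follows. This is a minimal illustration based on the usual description of the Wordscores calculation, not code from the paper; the function name `wordscores`, its argument names, and the toy counts are assumptions made for this example.

```python
def wordscores(ref_counts, ref_scores, new_counts):
    """Illustrative Wordscores sketch (hypothetical helper, not from the paper).

    ref_counts: word-count rows, one per reference document
    ref_scores: known score for each reference document
    new_counts: word-count rows (same vocabulary order) to be scored
    Assumes every reference row is non-empty and every new document
    contains at least one word that also occurs in a reference text.
    """
    n_words = len(ref_counts[0])
    # Relative frequency of each word within each reference document.
    rel = [[c / sum(row) for c in row] for row in ref_counts]

    word_scores = {}
    for w in range(n_words):
        total = sum(r[w] for r in rel)
        if total == 0:
            continue  # word never occurs in a reference text, so it gets no score
        # Weight each reference score by P(reference | word) and sum.
        word_scores[w] = sum(r[w] / total * s for r, s in zip(rel, ref_scores))

    doc_scores = []
    for row in new_counts:
        # Only words that received a score contribute to the document score.
        scored_total = sum(row[w] for w in word_scores)
        doc_scores.append(
            sum(row[w] / scored_total * s for w, s in word_scores.items())
        )
    return word_scores, doc_scores


# Toy data: two reference texts scored -1 and +1 over a three-word vocabulary.
ref = [[8, 2, 0], [2, 8, 0]]
word_scores, doc_scores = wordscores(ref, [-1.0, 1.0], [[5, 5, 3], [8, 2, 0]])
```

In this toy input, rescoring the first reference text (known score -1) yields roughly -0.36, a small instance of the scale compression the abstract refers to when it says estimated document scores are on the wrong scale.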
- Type
- Special Issue: The Statistical Analysis of Political Text
- Information
- Political Analysis, Volume 16, Issue 4: Special Issue: The Statistical Analysis of Political Text, Autumn 2008, pp. 356-371
- Copyright
- Copyright © The Author 2008. Published by Oxford University Press on behalf of the Society for Political Methodology
Footnotes
Author's note: I would like to thank Ken Benoit, Mik Laver, Cees van der Eijk, and Wijbrandt van Schuur for useful comments and discussion. The remaining errors are my own.