
Analyzing Text Complexity in Political Science Research

Published online by Cambridge University Press: 19 June 2014

Damon M. Cann, Utah State University
Greg Goelzhauser, Utah State University
Kaylee Johnson, Utah State University

Abstract

This article analyzes the text complexity of political science research. Using automated text analysis, we examine the text complexity of a sample of articles from three leading generalist journals and four leading subfield journals. We also examine changes in text complexity across time by analyzing a sample of articles from the discipline’s flagship journal during a 100-year span. Although the typical political science article is difficult to read, it is accessible to intelligent lay readers. We find little difference in text complexity across time or subfield.

Type: The Profession
Copyright: © American Political Science Association 2014

In 2013, the US Senate approved a measure to eliminate federal funding to the Political Science Program at the National Science Foundation (NSF) except for research “certifie[d] as promoting national security or the economic interests of the United States.” Senator Tom Coburn, who sponsored the measure, previously wrote to the NSF’s director urging a shift in funding toward “research topics…more likely to contribute to truly meaningful discoveries or knowledge.” Footnote 1 This letter singled out political science research on topics such as institutional conflict and elections, and it urged the NSF to “consider eliminating or greatly reducing the amount allocated” to the discipline.

We suspect that most political scientists think that the discipline’s research results in “truly meaningful discoveries or knowledge.” Political scientists contribute to our understanding of central issues such as elections, public-policy formation, institutional performance, and war. What is less clear, however, is whether political scientists do a good job of communicating the value of these contributions. In an article titled “What Political Science Owes the World,” Diamond (2002, 7) argued that “[p]olitical science has an obligation not only to cover the pressing issues and areas of our time but also to do so intelligibly.” The complexity of writing in the discipline, Diamond continued, “erect[s] barriers to intellectual dialogue, inhibit[s] the cross-fertilization of perspectives, and impede[s] broader access to the work of the discipline.”

Concerns about the accessibility of academic writing are not new. However, we are not aware of any attempt to systematically analyze the text complexity of political science research. Using automated text analysis, we considered the clarity of all political science articles published in three leading general-interest journals and four leading subfield journals in 2012. In addition, we examined the clarity of writing in political science over time by analyzing the readability of a sample of articles published in the American Political Science Review in the past 100 years.

MEASURING COMPLEXITY

We analyzed the clarity of all articles published in 2012 from what are generally considered the three leading generalist political science journals: American Political Science Review, American Journal of Political Science, and Journal of Politics. In addition, we analyzed all articles published in 2012 from the leading subfield journals in American politics, comparative politics, international relations, and political theory according to the Robust ISI Impact scores reported by Giles and Garand (2007). These journals are American Politics Research, Comparative Political Studies, International Organization, and Political Theory. Our sampling procedure yielded a total of 312 articles. Footnote 2

Clarity was captured with the commonly employed Flesch Reading Ease (FRE) statistic (Flesch 1948). FRE is calculated according to the following formula:

$$\text{FRE} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)$$

FRE scores range from 0 to 100. Texts with FRE scores ranging from 0 to 30 are considered very difficult to read, 31 to 50 are difficult, 51 to 60 are fairly difficult, 61 to 70 are standard, 71 to 80 are fairly easy, 81 to 90 are easy, and 91 to 100 are very easy. The FRE statistic is widely reported in applied research across a variety of disciplines (see, e.g., Coleman and Phung 2010; Lowrey 2006; Terris 1949). Moreover, policy makers sometimes require certain documents and contracts to achieve minimum FRE scores to ensure public comprehension. For example, Massachusetts requires insurance contracts to achieve a minimum FRE score of 50, and California requires financial institutions to provide notices to consumers with forms that achieve a minimum FRE score of 50 before they disclose nonpublic information. Footnote 3 The FRE statistic is also calculated as a standard feature in popular word-processing programs (e.g., Microsoft Word), making it easy for scholars to check the clarity of their own writing.
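To make the computation concrete, the following sketch implements the FRE formula in Python. The syllable counter is a naive vowel-group heuristic introduced here for illustration, not the dictionary-based counting that dedicated readability tools use, so its scores will differ somewhat from published values.

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count vowel groups, dropping a silent final 'e'.
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_reading_ease(text: str) -> float:
    # FRE = 206.835 - 1.015 (words/sentences) - 84.6 (syllables/words)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```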

To calculate FRE scores for the articles in our sample, we processed text files through an open-source Java application called Flesh. Footnote 4 FRE scores for the articles in our sample ranged from 12 (very difficult) to 65 (standard). The average FRE score was 33 (difficult), with a standard deviation of 7. We also calculated Flesch–Kincaid Grade Level (FKGL) scores, which correspond to the years of education needed by an individual to understand a text. The average FKGL score for articles in our sample was 13 (i.e., accessible to an individual with one year of college), with a standard deviation of 1. These scores are similar to the average readability of academic research in other disciplines (Hartley, Sotto, and Fox 2004; Loveland et al. 1973).
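FKGL is built from the same word, sentence, and syllable counts, combined with the standard Flesch–Kincaid grade-level coefficients. A companion sketch, reusing the import and count_syllables helper from the FRE example above (and sharing that heuristic's imprecision):

```python
def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39 (words/sentences) + 11.8 (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```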

Figure 1 places political science research in context by plotting FRE scores for the average political science article along with other documents and well-known texts. Footnote 5 It is not surprising that the average political science article is not as accessible as material in the New York Times or Reader’s Digest, which tend to be pitched to broader lay audiences. The most similar text to the average political science article is the average judicial opinion. Both types of text are pitched toward educated but diverse audiences. It is also worth noting that the average political science article is substantially less complex than text samples taken from the Uniform Commercial Code (i.e., a set of model laws adopted by each state that govern commercial transactions) or the United States Code—both of which have FRE scores lower than 15.

Figure 1. Examples of Reading Ease

EXPLAINING VARIATION IN COMPLEXITY

Although the average political science article is classified as difficult, there is considerable variation in text complexity across articles. One possibility is that this variation is essentially random, with differences based on idiosyncrasies in writing, skill, and interest in reaching diverse audiences. However, text complexity also may differ systematically with coauthorship, methodology, and subfield. To examine these possibilities, we estimated a regression model in which the dependent variable is an article’s FRE score. The coauthorship variable is an indicator coded 1 for single-authored articles and 0 otherwise. To examine whether different methodological approaches are associated with differences in text complexity, we included indicator variables for the use of formal theory; quantitative methods, as indicated by the presence of a statistical test (e.g., linear or nonlinear regression models, a t-test for a difference between means, ANOVA and related methods, or a chi-square test); and an experimental research design. Thus, the baseline category captures articles that did not use any of these methods. We captured subfields by examining each article in the sample and coding it as emphasizing American politics, comparative politics, international relations, political methodology, or political theory. Footnote 6 Comparative politics is the excluded baseline in the model presented here because it was the subfield with the highest average complexity score.
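As an illustration of this specification (not the authors’ actual estimation code), the statsmodels sketch below assumes a data frame with one row per article and hypothetical column names; comparative politics is the reference category, and robust standard errors are requested to match the notes to table 1.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file: one row per article with the variables described above.
df = pd.read_csv("articles_2012.csv")

# fre            Flesch Reading Ease score (dependent variable)
# single_author  1 if single-authored, 0 otherwise
# formal_theory, quantitative, experimental  method indicators
# subfield       "american", "comparative", "ir", "methods", or "theory"
model = smf.ols(
    "fre ~ single_author + formal_theory + quantitative + experimental"
    " + C(subfield, Treatment(reference='comparative'))",
    data=df,
).fit(cov_type="HC1")  # heteroskedasticity-robust standard errors
print(model.summary())
```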

Table 1 presents results from an ordinary least squares (OLS) regression model that explains the text clarity of political science articles from the three leading generalist journals and four subfield journals in 2012. Single-authored articles are, on average, no more or less complex than coauthored articles. Furthermore, the results suggest that, on average, articles using quantitative or experimental methods are not more difficult to read than those that do not use quantitative, experimental, or formal methods. Articles using formal theory are clearer than the baseline, on average, but the effect size of approximately 3.29 points (with a 95% confidence interval of [0.93, 5.66]) is rather modest. Although articles that use quantitative, experimental, or formal methods require a measure of subject-specific knowledge to fully understand them, the associated text is not substantively different, on average, than articles that do not use those methods.

Table 1. Regression Model of Political Science Article Clarity

Notes: Robust standard errors are in parentheses. * p < 0.05 (two-tailed). The baseline category for the methodological approach is an article that does not use quantitative, formal, or experimental methods. The baseline category for the subfield is comparative politics.

As discussed previously, comparative politics is the subfield with the lowest average FRE score (i.e., the most complex writing). The results presented in table 1 suggest that articles in American politics, international relations, and political theory are all less complex than articles in comparative politics. The estimated differences are 4.85 FRE points (95% CI [3.23, 6.46]) for American politics, 4.98 [1.72, 8.24] for international relations, and 5.80 [1.84, 9.75] for political theory. Beyond these differences relative to comparative politics, there were no substantive differences in text complexity across subfields.

COMPLEXITY OVER TIME

Next, we examined whether there have been changes in the textual complexity of political science research over time. To do this, we analyzed a sample of articles published by the American Political Science Review in the past 100 years. Specifically, we used the automated text-analysis procedures described previously to compute FRE scores for each article published in 10-year intervals from 1912 to 2012. Footnote 7 The sample included 319 articles.

The average FRE score for this sample of articles was 36, with a standard deviation of 3. The lowest average FRE score (i.e., most complex) for a year was 32 in 1952; the highest average score (i.e., least complex) for a year was 43 in 1912. Figure 2 plots the data points for each yearly average from 1912 to 2012 (at 10-year intervals) along with a fitted line and 95% confidence intervals. Although the data are somewhat noisy, there is a general trend toward more complex writing throughout the sample period. However, the slope of the line is relatively flat, and a decline of 5 points on the FRE measure translates to less than a grade level of change in reading difficulty.
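The fitted line and confidence band in figure 2 can be reproduced with a short OLS fit. The sketch below is a minimal illustration, assuming the yearly averages have already been computed (yearly_fre is a hypothetical mapping from sampled year to average FRE score).

```python
import numpy as np
import statsmodels.api as sm

def fre_trend(yearly_fre: dict[int, float]):
    # Regress yearly average FRE on year; return the slope and a 95% band.
    years = np.array(sorted(yearly_fre), dtype=float)
    scores = np.array([yearly_fre[int(y)] for y in years])
    X = sm.add_constant(years)
    fit = sm.OLS(scores, X).fit()
    band = fit.get_prediction(X).conf_int(alpha=0.05)
    return fit.params[1], band  # FRE points per year; 95% CI for fitted line

# e.g., fre_trend({1912: 43.0, 1952: 32.0, ...}) over all eleven sampled years
```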

Figure 2. Articles Published in the American Political Science Review, 1912–2012

CONCLUSION

In a widely discussed recent op-ed in the New York Times, Nicholas Kristof (2014) criticized academics, and political scientists in particular, for being inaccessible. A lack of readability was among the problems identified by Kristof, who lamented that “academics seeking tenure must encode their insights into turgid prose.” Our analysis of political science research suggests that the average political science article is difficult to read but readily comprehensible to an educated layperson. Indeed, the average political science article is about as difficult to read as Kristof’s op-ed in the New York Times (both have Flesch Reading Ease scores in the 30s, corresponding to “difficult” text, and require 13 years of education to comprehend based on Flesch–Kincaid scores). Although methodological details and subfield-specific terminology may sometimes inhibit comprehension, their provision is often a necessary tradeoff to ground research in the relevant literature and offer scientifically sound answers to pressing political questions. Outlets that target broader audiences, including journals such as Foreign Affairs and The Forum as well as blogs such as The Duck of Minerva and The Monkey Cage, help make technical research even more accessible. Overall, the results presented here suggest that political scientists are doing a commendable job making their research accessible to diverse audiences.

Damon M. Cann is an associate professor in the department of political science at Utah State University.

Greg Goelzhauser is an assistant professor in the department of political science at Utah State University.

Kaylee Johnson is an undergraduate student in the department of political science at Utah State University.

Footnotes

1. Senator Tom Coburn, letter to Subra Suresh (March 12, 2013).

2. The sample includes only original research articles.

3. Massachusetts Title XXII, Chapter 176, Section 2B; California Financial Privacy Act, Chapter 243, Section 6.

4. The application is cross-platform and is available at http://flesh.sourceforge.net.

5. Data sources: Judicial opinion data are from a sample of state supreme court decisions from 1995 to 1998 (Goelzhauser and Cann N.d.); comics, Time Magazine, and Reader’s Digest are from Flesch (2002); New York Times is from Dalecki, Lasorsa, and Lewis (2009); the Declaration of Independence was downloaded as a text file from http://www.constitution.org/usdeclar.txt; the Uniform Commercial Code (Part I of Article I) was downloaded from the Legal Information Institute; the United States Code (Title 2) was downloaded from the Government Printing Office; and The Bible and Peter Pan were downloaded from Project Gutenberg.

6. We evaluated each article in one of the generalist journals. Articles in a subfield journal were coded as emphasizing that subfield.

7. The sample includes articles published in Volumes 6, 16, 26, 36, 46, 56, 66, 76, 86, 96, and 106. Articles from Issues 2 and 3 of Volume 66 were excluded because the PDF files could not be converted to readable text files.

REFERENCES

Coleman, Brady, and Phung, Quy. 2010. “The Language of Supreme Court Briefs: A Large-Scale Quantitative Investigation.” Journal of Appellate Practice & Process 11 (1): 75–103.
Dalecki, Linden, Lasorsa, Dominic L., and Lewis, Seth C. 2009. “The News Readability Problem.” Journalism Practice 3 (1): 1–12.
Diamond, Larry. 2002. “What Political Science Owes the World.” PS: Political Science & Politics Online Forum, 113–27.
Flesch, Rudolf. 1948. “A New Readability Yardstick.” Journal of Applied Psychology 32 (3): 221–33.
Flesch, Rudolf. 2002. How to Write Plain English: A Book for Lawyers & Consumers. New York: Barnes & Noble.
Giles, Michael W., and Garand, James C. 2007. “Ranking Political Science Journals: Reputational and Citational Approaches.” PS: Political Science & Politics 40 (4): 741–51.
Goelzhauser, Greg, and Cann, Damon M. N.d. “Judicial Independence and Opinion Clarity on State Supreme Courts.” State Politics & Policy Quarterly. Forthcoming.
Hartley, James, Sotto, Eric, and Fox, Claire. 2004. “Clarity across the Disciplines: An Analysis of Texts in the Sciences, Social Sciences, and Arts and Humanities.” Science Communication 26 (2): 188–210.
Kristof, Nicholas. 2014. “Professors, We Need You!” New York Times, February 16, SR11.
Loveland, John, Whatley, Arthur, Ray, Barbara, and Reidy, Richard. 1973. “An Analysis of the Readability of Selected Management Journals.” Academy of Management Journal 16 (3): 522–24.
Lowrey, Tina M. 2006. “The Relation between Script Complexity and Commercial Memorability.” Journal of Advertising 35 (3): 7–15.
Terris, Fay. 1949. “Are Poll Questions Too Difficult?” Public Opinion Quarterly 13 (2): 314–19.