Growing evidence points to problems with “character misrepresentation” in digital surveys (Ahler et al. 2021; Chandler and Paolacci 2017; Hydock 2018; Ryan 2020; Wessling et al. 2017). We present concerning results from a nonprobability online survey of a specific subpopulation fielded through a well-known commercial firm. A set of rigorous “screeners” revealed extremely high rates of presumptive fraud: more than 81 percent of respondents appeared to misrepresent themselves as current or former US Army members – our subpopulation of interest – to complete the survey and earn compensation. Presumed falsification rates were similar across multiple established sub-vendors, indicating that the problems were not idiosyncratic to a particular panel. The data also indicate the use of deliberate tactics to circumvent one of our screeners, as well as repeated participation by a respondent or group of respondents, further raising suspicions about the data.
These irregularities point to the potential for significant identity misrepresentation rates in online nonprobability, subpopulation surveys – rates that are orders of magnitude greater than those typically reported in standard online surveys (Callegaro et al. 2014; Cornesse et al. 2020; Kennedy et al. 2020a; Mullinix et al. 2016). Although online survey firms vary markedly in their quality-control practices – and all should be assessed for data problems (Kennedy et al. 2016) – the risk of exceedingly high levels of fraud may be heightened under the conditions present in our study. Our findings call for further, systematic research into the validity of nonprobability online surveys, particularly those that sample specific subpopulations. They also underscore the imperative for researchers to develop clear tools and strategies (prior to statistical analyses and within preregistration plans) to ensure data integrity based on expert knowledge of research subjects.
Survey details
We contracted a nationally recognized market research firm, Footnote 1 whose samples have formed the basis of numerous widely cited political science studies, including articles published in the Journal of Experimental Political Science, the American Political Science Review, and the Journal of Politics. The firm, which used multiple sub-vendors, fielded the survey over two separate rounds in April–May 2021. Footnote 2
Screening process
We employed two screeners to confirm the authenticity of respondents with self-identified Army experience. First, we asked a “knowledge” question about the practice of saluting, one of the most essential elements of military protocol. Footnote 3 Answering correctly required knowing both the Army’s rank hierarchy and that enlisted soldiers salute first. Multiple former Army officers consulted for this study validated the screen, with one stating: “Anyone who is answering that question incorrectly is either not reading the question or has not served in the military, let alone the Army.” Footnote 4 Second, respondents reported specific information on their Army background, including highest rank achieved, source of officer commission, deployment years and locations, and unit type. Footnote 5 We coded responses as “non-viable” when they provided information contravening federal law or Army personnel policy. We coded a small number of responses as “highly improbable” when they were contrary to Army personnel practices or historical evidence but remained theoretically plausible. Footnote 6
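For illustration, a minimal sketch of how rule-based viability checks of this kind might be encoded appears below. The field names and the specific rules (rank/commission consistency, the statutory minimum enlistment age, deployments predating an operation) are hypothetical examples, not the coding scheme used in the study.

```python
# Illustrative sketch of rule-based viability checks on self-reported Army
# backgrounds. Field names and rules are hypothetical examples only; they are
# not the actual coding scheme used in the study.

ENLISTED_ONLY_RANKS = {"Private", "Specialist", "Sergeant", "Staff Sergeant"}
OFFICER_COMMISSION_SOURCES = {"West Point", "ROTC", "OCS", "Direct commission"}


def viability_flags(resp: dict) -> list:
    """Return reasons (if any) why a self-reported background looks non-viable."""
    flags = []

    # A career capped at an enlisted rank is inconsistent with reporting an
    # officer commissioning source.
    if (resp.get("highest_rank") in ENLISTED_ONLY_RANKS
            and resp.get("commission_source") in OFFICER_COMMISSION_SOURCES):
        flags.append("enlisted rank combined with an officer commission source")

    # Federal law sets a minimum enlistment age of 17; an earlier implied
    # entry age is non-viable.
    birth_year, entry_year = resp.get("birth_year"), resp.get("entry_year")
    if birth_year and entry_year and (entry_year - birth_year) < 17:
        flags.append("implied enlistment age below the statutory minimum")

    # A deployment that predates the relevant operation is historically
    # impossible (US operations in Afghanistan began in late 2001).
    if resp.get("deployment_location") == "Afghanistan" and \
            (resp.get("deployment_year") or 9999) < 2001:
        flags.append("Afghanistan deployment predates 2001")

    return flags


# Example: a response with several contradictions would be coded non-viable.
suspect = {"highest_rank": "Private", "commission_source": "ROTC",
           "birth_year": 1990, "entry_year": 2004,
           "deployment_location": "Afghanistan", "deployment_year": 1999}
print(viability_flags(suspect))
```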
Key findings
- Total invalid: 81.8 percent
  - 43.3 percent of total respondents failed the Army knowledge question.
  - 35.5 percent of respondents passed the knowledge screen but gave answers about Army service that were non-viable under federal law or military administrative rules.
  - 3.0 percent of respondents reported information about an Army background and career that was highly improbable.
- Total valid: 18.2 percent
Notably, the rate of invalid responses remained consistent across multiple vendors, each operating in a different survey round (83.7 percent in Round 1 and 78.9 percent in Round 2), suggesting that the problem was not isolated to a specific panel (Table 1).
Additional irregularities
1. Respondents appeared to employ deliberate tactics to circumvent the knowledge screener
Evidence. Beginning on April 17 in Round 1, the proportion of respondents who passed the knowledge question but gave non-viable demographic and career answers jumped sharply, from 15.4 to 65.9 percent (Table 2). Possible explanations for this unexpected “break” in the data include collusion (via the sharing of information on an email listserv or crowdsourced website), the retaking of surveys on devices with unique IP addresses, or the use of survey “bots.”
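As a rough illustration, assuming a timestamped response file with flags for passing the screen and for giving non-viable answers (the file name and column names are assumptions), a daily tabulation along the following lines could surface such a break:

```python
# Sketch: daily share of respondents who pass the knowledge screen but give
# non-viable career answers, to spot sudden jumps like the April 17 break.
# The file layout and column names ("date", "passed_screen", "non_viable")
# are assumptions for illustration.
import pandas as pd

df = pd.read_csv("responses.csv", parse_dates=["date"])

daily = (
    df.assign(suspect=df["passed_screen"] & df["non_viable"])
      .groupby(df["date"].dt.date)["suspect"]
      .mean()
      .mul(100)
      .round(1)
)
print(daily)

# Flag day-over-day jumps of more than 20 percentage points.
jumps = daily.diff()
print(jumps[jumps > 20])
```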
2. Respondents appeared to use deliberate tactics to detect the knowledge question as a screener
Evidence. In Round 1 – where failure to pass the knowledge question terminated the survey immediately – 49.8 percent of respondents passed the knowledge question but provided non-viable career responses. In Round 2 – where we modified the setup so that failure resulted in termination later in the survey, thereby “hiding” the screener – the share of non-viable responses fell to 13.7 percent (Table 1). Meanwhile, the percentage of respondents who failed the screen rose from 30.6 percent (Round 1) to 62.7 percent (Round 2). This suggests that, absent a “hidden” screener, later respondents in Round 1 may have gleaned information from earlier respondents or survey attempts to identify the screener and defeat it. Footnote 7
3. A respondent, or collection of respondents, appeared to take the survey numerous times
Evidence. Categorizing conservatively, we identified – at a minimum – 73 suspicious instances of repeated (and unusual) responses regarding Army background and deployment experience. Footnote 8 The sequential clustering of these responses – six distinct waves of repeating answers across the three days of responses – suggests that the repetition was not coincidental. We also observed obviously repetitive patterns in which survey takers appeared to misrepresent personal demographic information.
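One conservative way to surface such repetition, assuming the background answers are stored as columns (the column names below are illustrative assumptions), is to group on the full combination of answers and count exact repeats:

```python
# Sketch: surface combinations of Army-background answers that recur across
# many submissions. The column names are illustrative assumptions about the
# data layout, not the study's actual variables.
import pandas as pd

df = pd.read_csv("responses.csv", parse_dates=["timestamp"])

background_cols = ["highest_rank", "commission_source",
                   "deployment_location", "deployment_year", "unit_type"]

repeats = (
    df.groupby(background_cols)
      .agg(n=("timestamp", "size"),
           first_seen=("timestamp", "min"),
           last_seen=("timestamp", "max"))
      .query("n >= 3")                  # conservative threshold for "repeated"
      .sort_values("n", ascending=False)
)
print(repeats)
```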
Conclusion
We see this analysis as an opportunity for learning. Despite taking precautions to screen out invalid respondents, we found high rates of presumptive fraud. This reinforces that researchers should be especially cautious when employing online surveys that use nonprobability samples of specific subpopulations. Given that only about 7 percent of the US population is military or ex-military (Vespa 2020), our results are consistent with incentives for fraud increasing as the size of the subpopulation qualifying to participate in surveys decreases (Chandler and Paolacci 2017). Combined with other techniques, employing a diversity of screeners predicated on expert understanding of research subjects – including factors like demographics and content knowledge – can improve the odds of detecting falsified responses. Future research should systematically assess the quality of nonprobability surveys (Hauser and Schwarz 2016; Lopez and Hillygus 2018; Kennedy et al. 2020b; Thomas and Clifford 2017). Implementing rigorous screeners on diverse populations, replicated across many firms and sub-vendors, could illuminate whether our results are endemic to nonprobability surveys that sample specific subpopulations and what the broader implications are for internal and external validity.
Supplementary Material
To view supplementary material for this article, please visit https://doi.org/10.1017/XPS.2022.8
Data Availability
The data, code, and additional materials required to replicate all analyses in this article are available at the Journal of Experimental Political Science Dataverse within the Harvard Dataverse Network, at https://doi.org/10.7910/DVN/Y1FEOX
Acknowledgments
The authors thank Christopher DeSante, Timothy Ryan, and Steven Webster; two anonymous U.S. Army officers; and the anonymous reviewers for their helpful comments and suggestions.
Conflicts of Interest
The authors declare no conflicts of interest.
Ethics Statement
This survey was approved by the Indiana University-Bloomington IRB (Protocol #: 1910663858). The research adheres to APSA’s Principles and Guidance for Human Subjects Research. See Supplemental Appendix for more information.