Don't you think it is a miracle that Russian mortality [from COVID-19] is tenfold lower than in Europe and the USA?Footnote 1
The success of modern autocracies is commonly attributed to the skilful manipulation of information used to convince the public of the autocrat's competence (Guriev and Treisman 2019; Guriev and Treisman 2020); yet, little is known about the specific political and bureaucratic mechanisms that produce this manipulation. The global COVID-19 pandemic is a setting that can be used to reveal and study these mechanisms (Thomson 2020). Many autocratic governments responded to the pandemic by manipulating mortality statistics to fabricate successful handling of the health crisis (Balashov, Yan and Zhu 2021; Kilani 2021; Neumayer and Plümper 2022). The existence of this manipulation created a serious challenge for the scholarship investigating the effect of political regimes on pandemic management.Footnote 2 While some studies claim the existence of a so-called ‘autocratic advantage’ in terms of dealing with COVID-19 (Cepaluni, Dorsch and Branyiczki 2021; Karabulut et al. 2021), others argue that this statistical ‘advantage’ is a product of the deliberate under-reporting of official pandemic statistics by autocratic governments (Adiguzel, Cansunar and Corekcioglu 2020; Annaka 2021; Badman et al. 2021; Kapoor et al. 2020; Kennedy and Yam 2020; Knutsen and Kolvani 2022; Neumayer and Plümper 2022).
This under-reporting is frequently identified through the comparison of the indicators of COVID-19 deaths and overall excess mortality, with the latter measure being a substantially more reliable one (COVID-19 Excess Mortality Collaborators 2022; Karlinsky and Kobak 2021; Whittaker et al. 2021). This simple method of detecting COVID-19 mortality manipulation gives us a unique opportunity to look deeper into the mechanisms that explain the extent of data manipulation in authoritarian regimes.
Autocracies are not unitary actors. Rather, their policies are an outcome of a complex interaction of numerous players with partly contradictory interests. Data published in such regimes are produced within their bureaucracies, composed of agencies pursuing their own goals (Herrera and Kapur 2007).Footnote 3 Thus, to understand the patterns of data manipulation, we need to study this interaction and the interests of the actors involved. Previous literature has acknowledged the importance of bureaucratic incentives; yet, the primary focus has been on how bureaucracies misinform their principals and which tools authoritarian regimes establish to address this problem (see, for example, Wallace 2016; Zhou and Zeng 2018). The settings where autocrats create incentives encouraging data manipulation, and the consequences of these incentives for reported information, have received much less scholarly attention (for a recent exception, see Tang, Wang and Yi 2022). This article aims to close this gap.
We do so by studying the mechanisms leading to the manipulation of COVID-19 data by regional authorities in Russia in the early months of the pandemic in 2020. Russia is among the countries heavily affected by the COVID-19 crisis and against which accusations of data manipulation have been made relatively often.Footnote 4 At the same time, it has a large bureaucracy plagued by severe principal–agent problems (Libman and Rochlitz 2019) and a history of responding to incentives set by the central government through massive data manipulation (Kalinin 2018). We show that the set of incentives generated by the federal government triggered less accurate reporting of COVID-19 mortality in the Russian regions. To identify the effect of incentives, we rely on cross-regional variation in COVID-19 mortality reporting. Bureaucracies in different regions have different intensities of response to the central government's incentives. Regions where local governors believe they face larger political risks and could therefore lose their office should be more willing to appease the central government and, thus, to under-report COVID-19 mortality (Egorov and Sonin 2011).
We use a specific feature of the Russian political system – asynchronous election cycles in individual regions – as a source of exogenous variation, which leads to the natural randomization of political risk for incumbent governors across regions (Akhmedov and Zhuravskaya 2004).Footnote 5 The proximity to the next elections poses a risk to the survival of the governor in their office but is orthogonal to characteristics determining the spread of the pandemic in the region; hence, this setting enables us to establish a causal relationship between political risk (and, hence, susceptibility to the federal government's incentives) and under-reporting of COVID-19 mortality.
The article is structured as follows. The second section discusses the theoretical contribution and describes the Russian setting during the early phase of the COVID-19 pandemic. The third section presents the data and main variables. The fourth section reports the main findings on the causal effect of career concerns on the under-reporting of COVID-19 mortality. Finally, the fifth section presents some tentative results on the potential consequences of exposed under-reporting for trust in governmental statistics and self-isolation.
Theoretical Considerations
Data Manipulation in Authoritarian Regimes
Authoritarian regimes systematically manipulate information and, in particular, statistical data. Some regimes, such as the Soviet Union or North Korea, suppress a substantial amount of data or even publish deliberately wrong information (Eberstadt 2007; Jasny 1950). Others are more open but still inclined to manipulation. For example, Magee and Doces (2015) and Martinez (2022) provide evidence of the manipulation of economic growth statistics by autocracies. Economic news, election outcomes, media publications and even social network posts are all subject to manipulation (see, for example, Bader and van Ham 2015; Hale 2018; Harvey 2020; King, Pan and Roberts 2013; King, Pan and Roberts 2017; Moser and White 2017; Myagkov, Ordeshook and Shakin 2009; Pearce and Kendzior 2012; Rozenas and Stukal 2019). This manipulation serves several goals. First, when directed at domestic audiences, it reduces the likelihood of public protests by preventing coordination among the opposition and possibly misleading the public into perceiving the regime as competent and benevolent (Chen and Xu 2017b; Hollyer, Rosendorff and Vreeland 2015). Secondly, manipulating statistics matters for the international posture of the regime, for example, by making other countries overestimate the regime's stability and power, or raising its attractiveness to foreign business (Aragao and Linsi 2022).
Which political mechanisms, however, lead to the production of fake data? In some cases, authoritarian regimes explicitly order their subjects to manipulate information.Footnote 6 In many cases, however, the mechanism is more complex: the authoritarian leadership creates incentives for data fabrication but leaves the details on the actual means and the extent of this manipulation to be decided by individual bureaucracies. Under these conditions, information manipulation will vary across individual agencies and branches of bureaucracy: while some of them ‘underperform’, thus remaining relatively ‘honest’, others overindulge in the information manipulation to an extent surpassing the central government's desire.Footnote 7 While this mechanism appears to be plausible, there is hardly any research on how misinformation emerges from the regime nudging its bureaucrats towards it through a specific formal or, even more importantly, informal incentive structure.
The role of bureaucracies is acknowledged in a different literature on data manipulation in autocracies, one that looks at regimes at the ‘receiving end’ of misinformation, failing to obtain accurate data on the real political, economic and social developments in the country. Fundamentally, there is a large literature showing the severe problems that authoritarian regimes face in gathering information from their subjects (Kuran 1997; Wintrobe 1998) and studying the various tools autocracies use to improve information acquisition (Anderson et al. 2019; Chen, Pan and Xu 2016; Chen and Xu 2017a; Dimitrov 2014; Egorov, Guriev and Sonin 2009; Huang, Boranbay-Akan and Huang 2019; Jiang and Wallace 2017; Lorentzen 2014; Tan 2014). Bureaucracies of authoritarian regimes that pursue their own strategic goals, whether for promotion or the avoidance of punishment, are eager to embellish their achievements or hide their failures. Bureaucratic hierarchies are, generally speaking, a natural environment for bottom-up information manipulation due to widespread principal–agent problems (Gailmard and Patty 2012), but in authoritarian regimes, information distortions tend to be especially severe due to the lack of free media or public accountability as alternative sources of information.
The research on bureaucratic information manipulation in autocracies so far has focused primarily on the case of China (Fisman and Wang 2017; Merli and Raftery 2000; Wallace 2016; Zhou and Zeng 2018), where governmental incentives for local bureaucrats not only played an essential role in ensuring the high economic performance of the regime (Xu 2011), but also made data manipulation very attractive (Chen et al. 2019).Footnote 8
We are interested, however, in settings where bureaucracies' objective is not to hide information from their political principals, but to manipulate the publicly available information in line with the goals of the principals, or at least what bureaucrats perceive to be the goals of the principals. A fundamental challenge we face is, then: (1) to find evidence of data manipulation; and (2) to check whether bureaucrats indeed respond to specific political incentives. As already mentioned in the introduction, the COVID-19 pandemic provides us with an instrument to solve the first problem (comparison of official COVID-19 mortality and overall excess mortality). To deal with the second problem, we leverage the aforementioned fact of the heterogeneous response of individual bureaucracies to central incentives as the key element of our identification strategy and look for systematic patterns between data manipulation and the sensitivity of regional bureaucracies to incentives. We treat proximity to local elections in individual regions as the main trigger of this heterogeneity.
Proximity to Sub-national Elections and Bureaucratic Response
Approaching elections are well known to influence the behaviour of both politicians and bureaucrats. Since Nordhaus (1975), the voluminous political business-cycle literature has focused on how proximate elections lead incumbents to adjust fiscal and macroeconomic policies in order to create visible short-term economic results that sway voters in their favour (for a review, see Drazen 2000; Dubois 2016; Philips 2016). Its argument is straightforward: incumbents introduce policies attractive to voters, such as increasing public spending, prior to elections at the expense of policies implemented in the immediate aftermath of the elections. While the original political business-cycle literature was developed for fiscal policies, similar arguments were later used to explain the temporal dynamics of a wide variety of other policies in democracies (Ahuja 1994; Berdejó and Yuchtman 2013; Bracco 2018; Canes-Wrone and Shotts 2004; Marinov, Nomikos and Robbins 2015; Nanes 2017; Potrafke 2019; Shmuel 2021; Vadlamannati 2015).
It is, however, less clear if this logic applies in an authoritarian context (Pepinsky 2007; Shmuel 2020). Elections in autocracies are generally less important than in democracies and thus provide smaller incentives for leaders to adjust their policies. Term limits for the tenure of bureaucrats and politicians can, however, produce effects similar to elections. For instance, in the case of China, recent literature documents the existence of a sort of political business cycle tied to the term limits of local officials (Cao, Kostka and Xu 2019; Chen and Zhang 2021; Guo 2009). The core difference between this bureaucratic political cycle and the political business cycle in a democracy is that officials have incentives to ‘please’ their superiors (or the authoritarian leaders), rather than their electorate.
In Russia, formally, regional governors hold their position for a five-year period, after which they have to stand for re-election. Despite the de jure direct election of governors, in practice, the federal government almost always determines whether a governor stays in power or is replaced (Golosov 2018). Therefore, governors are likely to behave more like appointed bureaucrats than elected politicians. From the point of view of the bureaucratic political-cycle argument, they should focus on pleasing the federal centre towards the end of their tenure and be less concerned about it if the end of their tenure is farther away.Footnote 9 Sidorkin and Vorobyev (2018; 2020) show that proximity to the end of the term increases the levels of predation among Russian governors, who potentially expect to lose their position, and makes them more willing to engage in the acquisition of votes for pro-Kremlin candidates at federal elections. We expect a similar logic to apply to the behaviour of governors during the pandemic.
Proximity to the end of the term provides an exogenous variation in exposure to political risk for individual governors. The election cycles in Russian regions mostly originate from the historical precedent of the 1990s, when individual regions had substantial freedom in determining their political systems, including the timing of regional elections (Gel'man 1999; Hale 2003; Sharafutdinova 2006). The governors' elections were introduced in different regions at different points of time between 1991 and 1996; the terms of governors were further influenced by region-specific political changes and occasional early resignations. This historical variation is likely to be orthogonal to any characteristics of the regional governors occupying these positions in 2020 and thus other factors influencing governor-specific responses to the pandemic.
Perception of imminent political risk can trigger numerous types of political responses. In Russia, however, data manipulation is, as the next section shows, a particularly likely reaction on the side of bureaucrats.
Russian Bureaucracy and Data Manipulation
The Russian case presents us with an excellent opportunity to study bureaucratic data manipulation. On the one hand, Russia is a large country with extreme heterogeneity in regional economic, political and cultural conditions. On the other hand, under Vladimir Putin, Russia has developed into a consolidated authoritarian regime, where media and civil society are heavily constrained in their ability to report on local conditions openly (Gel'man 2015), offering bureaucrats a free hand in faking data.
Two features of the political organization of the Russian state gave rise to what one can refer to as a real culture of data manipulation. First, Russia is an electoral autocracy: elections at the regional and federal levels are conducted regularly and are important for the legitimation of the Russian regime. One of the crucial tasks of regional governors is to ensure electoral success: both their own and that of federal pro-regime candidates and parties. The share of votes of pro-Kremlin candidates serves as a key criterion for evaluating regional governors from the perspective of the federal administration (Gorokhov 2017; Reuter and Robertson 2012; Rochlitz 2020). In Russia, the task of ensuring favourable election outcomes is frequently achieved through electoral fraud (Enikolopov et al. 2013; Harvey 2016; Myagkov, Ordeshook and Shakin 2005; Skovoroda and Lankina 2017). Manipulating elections requires the active participation of bureaucrats at all levels, and electoral fraud constitutes a casual routine for numerous Russian state officials (Forrat 2018; Frye, Reuter and Szakonyi 2019). It stands to reason that bureaucrats who are accustomed to electoral fraud will not hesitate to manipulate data in other settings as well.
Secondly, the Russian central government heavily relies on quantitative indicators to monitor its bureaucracy. The lion's share of the salaries of Russian bureaucrats is constituted by a performance-based bonus, which is paid depending on over-fulfilling a number of quantitative indicators set by the higher-level bureaucracies. Over time, the quantitative indicators in virtually all branches of bureaucracy have become more numerous and complex (Schultz, Kozlov and Libman 2014). In addition, Russian bureaucrats are subject to regular checks by numerous controlling agencies, which again concentrate on formal regulations and quantitative data produced by bureaucracies. State officials in many agencies consider the ability to carefully fulfil all requirements for the paperwork as the essential characteristic of performance, more important than the actual tasks of the bureaucracy (Paneyakh 2014). As a result, Russian bureaucrats systematically fake data to fulfil formal requirements, as well as to avoid inspections and punishments (Kalgin 2016). Data manipulation is also widespread in the healthcare sector (Chernov and Sornette 2016).
Thus, for the Russian bureaucracy, data manipulation appears to be a routine, rather than an exceptional, practice. From this point of view, there are no reasons to expect the COVID-19 pandemic to have been met with a different set of tools than any other challenge facing the Russian bureaucracy. However, the direction and the scope of manipulation depend on the particular structure of incentives that Russian regional bureaucrats face in a specific situation.
COVID-19 in Russia, the Referendum and the Career Incentives of Russian Bureaucrats
The start of the pandemic in Russia presented Putin's regime with a challenge. For 2020, Putin had scheduled a major constitutional reform. While the amendments to the constitution were numerous, probably the most important one was that Putin would receive the right to run for presidential office after the expiration of his current term in 2024, which would otherwise be impossible due to constitutional term limits. Although the amendments could have been adopted by a simple parliamentary decision, Putin decided to turn the change of constitution into a major showcase of public loyalty to his regime, announcing a national referendum. Initially, the referendum was scheduled for April. However, the spread of the SARS-CoV-2 virus made the feasibility of the referendum questionable. Putin was forced to postpone the referendum and decided to implement it over a seven-day period from 25 June until 1 July (Pomeranz and Smyth 2021; Teague 2020).
The feasibility of the new referendum date depended upon the development of the pandemic. Organizing the referendum at the peak of the spread of the new virus would reduce the ability of the referendum to boost legitimacy and could even result in public disapproval of the carelessness of the government. The high perceived risk of contracting the virus at a mass public event would also severely reduce the turnout. Thus, reducing the contagion rates, or at least convincing the population that the pandemic was under control, became the key task of the regime (Blackburn and Petersson 2021).
During the pandemic, Putin refrained from personally introducing unpopular measures (like lockdowns); instead, he transferred the authority to deal with the pandemic to regional governors, making them de facto responsible for containing the virus (Åslund 2020; Hartwell, Otrachshenko and Popova 2021). Given the arguments outlined earlier, it appears plausible that governors responded to the COVID-19 challenge with systematic data manipulation similar to other informal tasks of the federal centre, such as ensuring favourable election outcomes (see Busygina and Filippov 2021).
The literature has already provided the first evidence of the ‘culture of silence’ (Shok and Beliakova 2020) and manipulation of COVID-19 mortality and contagion data in Russia (Belianin and Shivarov 2020; Kobak 2021). We conjecture that a major share of data manipulation is likely to originate from the regional level and result from the informal incentives faced by Russian governors. Data manipulation is determined by two conditions: the importance of suppressing the COVID-19 data for the federal centre and the individual political situation of the governors. This leads to the following two expectations:
- First, some governors should be more inclined to care about the informal federal objectives than others. In particular, in line with the reasoning of the previous section, governors who perceive their situation as risky (that is, face proximate elections) should be more likely to attempt to please the central government by excessively manipulating the data; for governors with a stronger political position (that is, more distant elections), excessive manipulation is less critical.
- Second, data manipulation should be stronger in the months preceding the referendum, when it was essential for the regime to show that the pandemic was on the decline. After the referendum, acknowledging the spread of the pandemic became less of a problem for the Russian regime. Therefore, regional governors should also be less inclined to manipulate data. Since the national referendum took place in all regions at the same time, its timing is orthogonal to region-specific characteristics.
These arguments guide the remaining part of this article.
Data
Explanatory Variable: Measuring Political Risk
In what follows, we present the key variables of our study.Footnote 10 We start with the proxy of political risk, that is, the key factor of susceptibility of governors to federal incentives. As already mentioned, we measure the individual exposure of a governor to political risk by looking at the proximity to the upcoming governor's election. Since governors' elections in Russia follow an asynchronous electoral schedule, the arrival of an exogenous shock, such as the COVID-19 pandemic, automatically splits all the regions into two categories: regions with a governor in the first half of their term; and regions with a governor in the second half of their term. Governors in the second half of their term are expected to be more concerned with their political future because their performance is under closer scrutiny from the federal centre, which eventually decides the governor's fate as the election time arrives.
For April 2020, we identify forty-three out of a total of eighty-five regions where governors were in the second half of their term. We construct our main explanatory variable, Elections approaching, as a dummy that equals 1 when the elections of the regional governor are scheduled for the years 2020–22 and 0 for governors with elections in the years 2023–24 (forty-two regions). Additionally, similar to Pulejo and Querubín (2021), we use a continuous measure of the proximity to the upcoming elections measured in full years.Footnote 11 Since the governor's term is for five years, our alternative measure takes values from 0 for governors with elections in 2020 to 4 for those who have to stand for re-election in 2024. Figure 1 presents the distribution of regional governors in Russia according to the period remaining until re-election, and Map 1 shows how these regions are spread across the Russian territory.
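The coding rule for the two proximity measures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' replication code: the function names, constants and the four placeholder regions are hypothetical, and only the cut-offs (2020–22 versus 2023–24, and a 0 to 4 range in full years) come from the text.

```python
# Coding rule taken from the text; all names below are hypothetical.
PANDEMIC_YEAR = 2020   # year in which the exogenous shock arrives
TERM_LENGTH = 5        # governors serve five-year terms


def years_to_election(election_year: int) -> int:
    """Continuous measure: full years until the next scheduled election (0 to 4)."""
    years = election_year - PANDEMIC_YEAR
    if not 0 <= years < TERM_LENGTH:
        raise ValueError(f"election year {election_year} lies outside the 2020-24 cycle")
    return years


def elections_approaching(election_year: int) -> int:
    """Dummy: 1 for elections scheduled in 2020-22, 0 for elections in 2023-24."""
    return 1 if years_to_election(election_year) <= 2 else 0


# Placeholder regions and election years, not actual data.
schedule = {"Region A": 2020, "Region B": 2022, "Region C": 2023, "Region D": 2024}
coded = {
    name: (elections_approaching(year), years_to_election(year))
    for name, year in schedule.items()
}
print(coded)
```

Rejecting election years outside the five-year window is a design choice of this sketch; it simply guards against miscoded schedules.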
Supplementary Appendix (SA) C provides balancing tests to show that both our variables for the proximity to upcoming elections are not correlated with regional characteristics, such as income, Gini index, the share of professional education, urbanization rate, population size, life expectancy, vote share for Putin in 2018 (see Table C1 in SA C) and regional political institutions (see Table C2 in SA C).
Dependent Variable: Measuring the Under-Reporting of COVID-19 Mortality
As already discussed in the introduction, we capture the degree of under-reporting of the COVID-19 pandemic by comparing the official data on COVID-19 mortality and the data on overall excess mortality (published at a later point in time). Our analysis looks exclusively at mortality from COVID-19 and not infection rates for two reasons. First, the cross-country evidence suggests that the deliberate misreporting of mortality data from COVID-19 was much more common than the misreporting of infection rates (see, for example, Balashov, Yan and Zhu 2021).Footnote 12 In Russia, this also seemed to be the case because the official mortality rates were extremely low in international comparison but, at the same time, Russia was ranked third in terms of the number of infections worldwide, according to the official data.Footnote 13 This striking discrepancy between high infections and low mortality received great attention in Russian public discourse and was even labelled ‘a Russian miracle’ by the official media representatives of the Russian Coronavirus Information Center.Footnote 14
The second reason to focus on mortality statistics is determined by our identification method of data manipulation based on using excess mortality as a reliable measure of the true toll of the pandemic, while such a non-manipulable benchmark does not exist for infection rates. Excess mortality has become widely recognized as a substantially less biased measure that can account for undetected and unreported COVID-19 cases (Beaney et al. 2020; Vestergaard and Mølbak 2020). In the Russian case, overall mortality has been traditionally more reliable than the often manipulated disaggregated mortality by causes (Danilova et al. 2016; Lysova and Shchitov 2015) because registering an act of death has important legal implications. Russian bureaucrats would face enormous difficulties if they tried to conceal the fact of death (and it would be immediately discovered by the citizens). The cause of death, on the other hand, can be easily manipulated since misspecifying a cause of death on a death certificate has no particular implications, with only a few exceptional circumstances, such as an investigation of a medical error. The surviving members of the family often pay little attention to it (for details, see SA A3). The key variables of our analysis are constructed in the following way.
Official COVID-19 mortality
We employ the data published at stopcoronavirus.rf, the government-operated website established in the first weeks of the pandemic, reporting real-time data on infections and mortality. The website was widely advertised on national television and the internet, including the leading social platforms. Importantly, stopcoronavirus.rf was recognized by the authorities as the only legitimate source of statistical information on the COVID-19 pandemic in Russia. Such a status implied that publishing any alternative estimates would be classified as ‘false information of public interest, shared under the guise of fake news’ (Sherstoboeva 2020), and penalized with up to a five-year sentence or a heavy fine (up to 300,000 roubles or $4,200) under a newly introduced amendment to the defamation law.Footnote 15 SA A2 provides information on the process of COVID-19 mortality statistics collection in Russia.
We construct the main variable for official COVID-19 mortality as the ratio of officially reported deaths from COVID-19 to the average all-cause mortality in the respective months over the previous three years (2017–19):
\[
\text{Official COVID-19 mortality}_{it} = \frac{\text{Reported deaths from COVID-19}_{it}}{\text{Past deaths from all causes}_{it}} \tag{1}
\]
where i and t indicate the region and time period, which is either the three pandemic months before the referendum (April–June) or the three months after (July–September), which are used for testing the effect of political risk on under-reporting in the absence of federal incentives.Footnote 16 Data for Reported deaths from COVID-19 are collected from the official website, stopcoronavirus.rf. Data on average deaths from all causes in the previous three years come from the Federal State Statistics Service (Rosstat).Footnote 17 Map 2 presents the spatial allocation of official COVID-19 mortality across Russia's territory before the referendum (April–June), indicating substantial regional heterogeneity in the intensity of the virus outbreaks.
Excess mortality
First, we compute the number of excess deaths as the difference between the number of current deaths from all causes and deaths from all causes in the respective period over the last three years.Footnote 18 We are interested only in the positive number of excess deaths, as it manifests the actual death toll of the pandemic. We construct the variable for Excess mortality as denoted in Equation 2:
\[
\text{Excess mortality}_{it} = \frac{\text{Current deaths from all causes}_{it} - \text{Past deaths from all causes}_{it}}{\text{Past deaths from all causes}_{it}} \tag{2}
\]
where i and t again indicate the region and time period, respectively, and Past deaths from all causes is the average number of deaths from all causes in the respective period over the last three years, as in Equation 1.
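To make the two mortality measures concrete, the sketch below computes both for a single hypothetical region-period. It rests on assumptions that should be read as such: excess deaths are divided by the same three-year average used for official COVID-19 mortality, negative excess deaths are set to zero (one reading of 'only the positive number of excess deaths'), and all function names and figures are invented for illustration.

```python
from statistics import mean


def past_deaths(deaths_2017_19):
    """Average all-cause deaths in the same months of the previous three years."""
    return mean(deaths_2017_19)


def official_covid_mortality(reported_covid_deaths, past):
    """Officially reported COVID-19 deaths relative to past all-cause deaths."""
    return reported_covid_deaths / past


def excess_mortality(current_deaths, past):
    """Excess deaths relative to past all-cause deaths.

    Assumption: negative excess deaths are truncated at zero and the result
    is normalized by the same past-deaths average as the official measure.
    """
    return max(0, current_deaths - past) / past


# Invented figures for one region over April-June, not actual data.
past = past_deaths([1000, 1040, 960])          # three-year average: 1000
official = official_covid_mortality(20, past)  # 20 reported COVID-19 deaths
excess = excess_mortality(1150, past)          # 1150 deaths from all causes in 2020
gap = excess - official                        # larger gap suggests more under-reporting
print(round(official, 3), round(excess, 3), round(gap, 3))
```

Because both measures share a denominator, their difference can be read directly as the share of past mortality that is unaccounted for by official COVID-19 deaths.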
Juxtaposing the two mortality measures, we observe that excess mortality in the pandemic months before the referendum was significantly higher than official COVID-19 deaths in most of the regions (eighty-one regions out of eighty-five). The spatial distribution for the excess mortality is also different, as illustrated by Map 3.
Importantly, the publication of all-cause mortality for May, the first month when the death toll of the pandemic was high enough to make the discrepancy between the two mortality statistics noticeable, was postponed until after the referendum was completed, that is, the data were published when the federal government no longer needed to convince the public that the COVID-19 crisis was under control. This is yet another reason to treat these data as more reliable.
Did the regions with governors in the second half of their term exhibit different rates of official COVID-19 mortality and excess mortality than regions with governors in the first half? We plot the monthly trend of both mortality measures by the two categories of regions in Figure 2. While official COVID-19 mortality was substantially lower in regions with approaching elections (see Panel A), the excess mortality rate was statistically indistinguishable between the two groups (see Panel B). The gap in official COVID-19 mortality was particularly noticeable in the months before the referendum; it became smaller (though it did not disappear entirely) after the referendum.
Medic mortality
We also use an additional variable to corroborate our results: we look at COVID-19 mortality among medical staff as an alternative estimate of the actual size of the virus outbreak in the region. The data come from a non-governmental website, Memorial List, established in the first days of the pandemic and based on colleagues and relatives reporting the deaths of medical personnel.Footnote 19 Thus, it is unaffected by governmental manipulations; the irregular governmental reports provide much lower COVID-19 mortality rates among healthcare personnel than the data from Memorial List.Footnote 20 In the first months of the pandemic, medical staff were particularly vulnerable (Domínguez-Varela 2021; Gross, Mohren and Erren 2021; Iyengar et al. 2020; Manzoni and Milillo 2020), and higher mortality in this group is thus likely to indicate a more substantial spread of the SARS-CoV-2 virus and the severity of the pandemic in Russian regions. Thus, medic mortality can be used as a robustness test.
Medic mortality is constructed as a ratio of the deaths of medical staffFootnote 21 from COVID-19 to the average deaths from all causes in the last three years, as follows:
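From the verbal definition, the measure can be written as below. This is a reconstruction, not the typeset original; the numerator label and the subscripts are assumptions, while the denominator is the same three-year baseline used in Equation 1.

```latex
\begin{equation}
\text{Medic mortality}_{it}
  = \frac{\text{Deaths of medical staff from COVID19}_{it}}
         {\text{Past deaths from all causes}_{it}} \tag{3}
\end{equation}
```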
Alternative explanation: anti-COVID-19 policies
Our analysis focuses on sensitivity to incentives (the extent of political risk) influencing the under-reporting of COVID-19 mortality. However, there is an important alternative explanation: political risk could trigger governors to implement actual measures to reduce COVID-19 mortality, rather than to fake data. This would lead to an upward bias in our estimations.Footnote 22
To check this explanation, we investigate the effect of election proximity on the actual anti-COVID-19 measures introduced by regional authorities. We use the CoronaNet Research Project, an international database on government responses to COVID-19. CoronaNet (Cheng et al. 2020) contains over 110,000 entries of individual anti-COVID-19 measures for about 200 countries in the world. For Russia, as well as for a number of other countries, data are available at the sub-national level (Schenk and Ganga 2022). We compile a variable, AntiCOVID policies it, as the number of regional policies established before the constitutional referendum. Such policies commonly included social distancing, health testing, travel restrictions, quarantine and mass-gathering regulations, lockdowns, curfews, and business and public restrictions, and were very widespread, with an average region introducing about 124 measures, starting as early as February.
The disadvantage of this proxy is that the enforcement of the anti-COVID-19 policies may also differ from region to region. Some regions may introduce multiple anti-COVID-19 measures but not enforce them. We are unable to observe enforcement directly, but we can indirectly infer it from data on the individual behaviour of people, particularly their effort to self-isolate. We employ the self-isolation index of the Russian regional population compiled by Yandex, the major Russian search-engine company. The self-isolation index measures population mobility in all urban areas based on smartphone data.Footnote 23 Its value ranges from 0, which is equivalent to the highest mobility during the pre-pandemic rush hour, to 5, which is the lowest level of mobility, for example, that which can be observed at night. Again, we check whether proximity to the elections of regional governors affects the self-isolation index.Footnote 24
Analysis
Cross-Sectional Results for the First Three Months of the Pandemic
This section presents our main results. First, we look at the cross-sectional specification to test our main hypothesis about the positive relationship between the time left until the election and the reporting of official COVID-19 mortality across the eighty-five regions of Russia before the referendum. In theory, official COVID-19 mortality would be perfectly predicted by a more reliable estimate of overall excess mortality even under a sizeable manipulation of data as long as the manipulation effort is uniform across all regions (that is, all regional authorities report the same proportion of actual cases). However, in reality, the index of correlation between the two estimates aggregated over April–June equals about 0.7. This means that in some regions, manipulation was stronger than in others. This is precisely the variation that we are interested in for our analysis.
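The argument about uniform manipulation can be illustrated numerically. In the sketch below, all numbers are made up and the helper `pearson` is a plain implementation of the Pearson correlation coefficient; if every region reports the same share of actual deaths, the correlation with excess mortality stays at 1, whereas region-specific reporting shares pull it below 1.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical excess mortality in five regions (not real data).
excess = [0.02, 0.05, 0.08, 0.12, 0.20]

# Case 1: every region reports the same 30 per cent of actual deaths.
official_uniform = [0.3 * e for e in excess]

# Case 2: the reported share differs from region to region.
shares = [0.9, 0.1, 0.6, 0.2, 0.3]
official_uneven = [s * e for s, e in zip(shares, excess)]

r_uniform = pearson(excess, official_uniform)
r_uneven = pearson(excess, official_uneven)
print(r_uniform, r_uneven)
```

The shortfall of the second correlation from 1 reflects exactly the cross-regional variation in reporting that the analysis exploits.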
We estimate the following cross-sectional equation:
where: i = 1, …, 85 indicates the region; Official COVID19 mortality i is the officially reported COVID-19 mortality over the first three pandemic months before the referendum; Election proximity i is one of the two variables for the distance to the next governor elections in region i; and Excess mortality i is the excess mortality. Additionally, we test the effect of election proximity on Official COVID19 mortality i during the three months after the referendum, when the federal incentives weakened. To corroborate the robustness of our main results, we run the same regression as in Equation 4 for other estimates of regional mortality from the virus – excess and medic mortality – and the variables for the state response to the pandemic – the number of anti-COVID-19 policies and the self-isolation index.
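Based on the variable definitions above, the cross-sectional specification can be sketched as follows. The Greek symbols and the absence of further controls are assumptions; only the three named variables are taken from the text.

```latex
\begin{equation}
\text{Official COVID19 mortality}_{i}
  = \alpha
  + \beta\,\text{Election proximity}_{i}
  + \gamma\,\text{Excess mortality}_{i}
  + \varepsilon_{i} \tag{4}
\end{equation}
```

Under this reading, \beta captures how reported mortality varies with election proximity once the actual severity of the outbreak, proxied by excess mortality, is held constant.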
The results are presented in Figure 3. The significant negative coefficient of the approaching election variable and the significant positive coefficient of election proximity in years (see Panel A) indicate significantly fewer reported deaths from the virus in regions with relatively sooner elections. The average magnitude of the effect is massive: having an election in 2020–22 decreases the reporting of COVID-19 mortality by 61 per cent of its average value, and having the election one year earlier, all things equal, reduces reporting by almost 23 per cent of its average value. This effect, however, becomes statistically insignificant in the three months after the referendum (see Panel B), suggesting that career incentives drive the manipulation of data only when there is a demand from the federal authorities (though the coefficients keep their signs). In short, political risk perception is correlated with under-reporting COVID-19 mortality prior to the referendum but not after the referendum.
The correlation we report in Panel A cannot be explained by the actual severity of the pandemic in regions with proximate elections because election proximity measures are uncorrelated with excess mortality or medic mortality, as reported in Panels C and D. Finally, we show in Panels E and F that approaching elections are not associated with an effort by local authorities to impose additional pandemic regulations or with any consequent increase in self-isolation. Thus, election proximity is likely to influence only the COVID-19 under-reporting and is unrelated to actual measures implemented by the government.
Panel Data Results
The cross-sectional approach can be subject to criticism that it fails to capture a multitude of unobserved region-specific factors potentially correlated with proximity to elections and with COVID-19 mortality. While, as mentioned, we treat the variation in proximity to elections as exogenous, as an additional check, we still estimate a region-month panel data model to eliminate region-specific heterogeneity.Footnote 25 We run the standard region fixed-effects model; additionally, we employ a generalized method of moments (GMM) estimator to account for the dynamic nature of the pandemic data. We are interested in the heterogeneous effect of election proximity conditional on the local severity of the COVID-19 outbreak for the whole period and the months before the referendum to test whether political risk amplifies the extent to which regions marked down COVID-19 mortality only before the referendum.
We estimate the following equation:
where: i = 1, …, 85 indicates the region; t = 1, …, 6 indexes months (April–September); Before referendum t is a dummy variable that equals 1 in the months before the referendum (April–June); and r i and m t represent the region and month fixed effects. We are interested in the coefficient of the triple interaction, λ, which should capture the proportion of actual deaths not being reported in the official statistics in the months before the referendum. The linear term of the proximity to election is time invariant and, thus, absorbed by the region fixed effects.
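One way to write a specification consistent with this description is sketched below. Only \lambda, r_i and m_t are named in the text; the remaining coefficient symbols, the set of lower-order interactions and the omission of any further controls are assumptions.

```latex
\begin{align}
\text{Official COVID19 mortality}_{it}
  ={}& \beta\,\text{Excess mortality}_{it}
     + \gamma\,\text{Excess mortality}_{it}\times\text{Before referendum}_{t} \notag\\
   & + \delta\,\text{Excess mortality}_{it}\times\text{Election proximity}_{i}
     + \theta\,\text{Election proximity}_{i}\times\text{Before referendum}_{t} \notag\\
   & + \lambda\,\text{Excess mortality}_{it}\times\text{Election proximity}_{i}
       \times\text{Before referendum}_{t}
     + r_{i} + m_{t} + \varepsilon_{it} \tag{5}
\end{align}
```

The linear terms Election proximity_i and Before referendum_t do not appear separately because they are absorbed by r_i and m_t, respectively.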
The results for the regressions are reported in Table 1. We start with the fixed-effects estimation of the interaction term in Equation 5, which shows a strong and significant effect of the approaching elections on the official reporting of COVID-19 mortality (see Column 1) conditional on the actual COVID-19 mortality. We observe this effect only before the referendum; after the referendum, the effect disappears. Similarly, the more years left before the election, the larger the share of the excess mortality reported as official COVID-19 mortality (see Column 3) before the referendum but not after it. Having elections in the next two years reduces the share of excess deaths reported as COVID-19 deaths almost twofold.
Notes: Standard errors clustered at the regional level in parentheses. FE OLS = fixed effects ordinary least squares. *p < 0.1; **p < 0.05; ***p < 0.01.
The fixed-effects model, however, does not account for the dynamic nature of the infectious spread; therefore, we also run a system GMM estimation that includes a one-period lag of the dependent variable, as well as excess mortality. The GMM estimations do not alter the results: the overall magnitude of the effect remains unchanged. To illustrate our central findings, we plot the conditional marginal effects of the election proximity variables for the periods before and after the referendum from Columns 2 and 4 in Figure 4. Again, the results are consistent with the findings reported in the previous section and confirm our main intuition: proximity to elections triggers more intensive COVID-19 mortality manipulation prior to the referendum.
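The logic of the fixed-effects estimate of the triple interaction can be sketched on simulated data. Everything below is illustrative: the data are randomly generated, the coefficient values in `true_coefs` are invented, and the estimator is a plain two-way within transformation followed by ordinary least squares, not the replication code and not the system GMM estimator.

```python
import random


def two_way_demean(panel):
    """Remove region (row) and month (column) means from a balanced panel; return a flat list."""
    n, t = len(panel), len(panel[0])
    row = [sum(r) / t for r in panel]
    col = [sum(panel[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row) / n
    return [panel[i][j] - row[i] - col[j] + grand for i in range(n) for j in range(t)]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(k):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][k] / m[i][i] for i in range(k)]


def fit_panel(official, excess, proximity, before):
    """OLS on two-way demeaned data; the last coefficient is the triple interaction."""
    n, t = len(excess), len(before)

    def panel(fn):
        return [[fn(i, j) for j in range(t)] for i in range(n)]

    regressors = [
        panel(lambda i, j: excess[i][j]),
        panel(lambda i, j: excess[i][j] * before[j]),
        panel(lambda i, j: excess[i][j] * proximity[i]),
        panel(lambda i, j: proximity[i] * before[j]),
        panel(lambda i, j: excess[i][j] * proximity[i] * before[j]),
    ]
    cols = [two_way_demean(p) for p in regressors]
    y = two_way_demean(official)
    xtx = [[sum(u * v for u, v in zip(a, b)) for b in cols] for a in cols]
    xty = [sum(u * v for u, v in zip(a, y)) for a in cols]
    return solve(xtx, xty)


# Simulated panel: 85 regions, 6 months, first 3 months before the referendum.
random.seed(0)
n_regions, n_months = 85, 6
before = [1, 1, 1, 0, 0, 0]
proximity = [i % 2 for i in range(n_regions)]  # 1 = approaching election (made up)
excess = [[random.uniform(0.0, 0.3) for _ in range(n_months)] for _ in range(n_regions)]
region_fe = [random.gauss(0, 0.01) for _ in range(n_regions)]
month_fe = [random.gauss(0, 0.01) for _ in range(n_months)]
true_coefs = [0.5, -0.1, 0.05, 0.01, -0.3]  # invented values; last one is the triple interaction

official = [
    [
        true_coefs[0] * excess[i][j]
        + true_coefs[1] * excess[i][j] * before[j]
        + true_coefs[2] * excess[i][j] * proximity[i]
        + true_coefs[3] * proximity[i] * before[j]
        + true_coefs[4] * excess[i][j] * proximity[i] * before[j]
        + region_fe[i]
        + month_fe[j]
        for j in range(n_months)
    ]
    for i in range(n_regions)
]

estimates = fit_panel(official, excess, proximity, before)
print(estimates)
```

Because the simulated outcome contains no noise, the estimates recover the invented coefficients up to rounding; a negative last coefficient corresponds to a smaller share of excess deaths being reported as COVID-19 deaths in regions with approaching elections before the referendum.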
Social Costs of Data Manipulation
Besides weakening individual and collective protective behaviour, under-reporting of COVID-19 mortality may be revealed to the public and, consequently, damage trust in state-provided statistics, making individuals reluctant to react to any future warning signs in official data. In this section, we test this assumption by using an opportunity to study how the exposed under-reporting of COVID-19 mortality affected the trust and self-isolation behaviour of the Russian public.
On 10 July, ten days after the end of the national referendum, Rosstat made all-cause mortality data at the aggregate and regional level for the month of May accessible to a broader public. This was the first month when the pandemic death toll caused excess mortality to be noticeable for a large number of regions, also revealing a significant gap between excess deaths and officially reported deaths.Footnote 26 This discrepancy between the two mortality measures instantly received press coverage in numerous online media outlets, amateur blogs and social media, thus exposing the under-reporting that happened to the data for May.Footnote 27 This was also the first time the Russian public learned about excess mortality as an alternative way to estimate the true death toll of the pandemic and its application in spotting the under-reporting of COVID-19 mortality in both Russia and their region.
We are interested, in particular, in two possible effects of the exposed under-reporting for the month of May: an effect on trust in governmental COVID-19 statistics; and an effect on willingness to reduce social contacts to avoid contagion. We measure the extent to which COVID-19 under-reporting in May was exposed to the public by using the ratio of excess deaths to the number of deaths reported by the governmental website. Naturally, because some regions had not yet had positive excess mortality in May, we cannot identify any under-reporting for these regions, but we include a dummy variable to control for this. It is noteworthy that only two regions with positive excess mortality have an under-reporting ratio below the value of 1, meaning an over-reporting of official deaths compared to excess mortality. The rest demonstrate an under-reporting ratio ranging from 1.3 to 82.6.Footnote 28
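The exposure indicator can be sketched as a small function. The name `under_reporting_ratio`, its arguments and the example figures are hypothetical; returning a placeholder ratio of 0 together with a dummy of 1 for regions without positive excess deaths is an assumption about how the control dummy is coded, and the function assumes at least one officially reported death.

```python
def under_reporting_ratio(current_deaths, past_deaths, reported_covid_deaths):
    """Return (ratio, no_excess_dummy) for one region in May.

    ratio = excess deaths / officially reported COVID-19 deaths.
    If excess deaths are not positive, under-reporting cannot be identified:
    the ratio is set to 0.0 and the dummy to 1 (assumed coding).
    Assumes reported_covid_deaths > 0.
    """
    excess_deaths = current_deaths - past_deaths
    if excess_deaths <= 0:
        return 0.0, 1
    return excess_deaths / reported_covid_deaths, 0


# Made-up examples: under-reporting, over-reporting and no positive excess deaths.
print(under_reporting_ratio(1300, 1000, 100))
print(under_reporting_ratio(1050, 1000, 100))
print(under_reporting_ratio(950, 1000, 10))
```

A ratio above 1 means more excess deaths than officially reported COVID-19 deaths; a ratio below 1 corresponds to the over-reporting case mentioned above.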
Trust
Trust in official statistics is essential for an adequate public response to the pandemic, in particular, for the adherence to safety regulations (Bargain and Aminjonov 2020; Pak, McBryde and Adegboye 2021), but it can be substantially damaged if the public learns about deliberate data manipulation. For measuring trust in official statistics, we employ a telephone survey (N = 1,617) carried out at the end of July by a highly reputable independent Russian pollster, Levada Center.Footnote 29 The survey provides us with self-reported trust in COVID-19 statistics as of 24 July, two weeks after the data on all-cause mortality for May were published, allowing the population to infer the extent to which the regional government under-reported COVID-19 mortality, and over three weeks after the referendum was completed. Trust is assessed via the question, ‘Do you trust the official information about the coronavirus situation in Russia?’, with possible answers being: ‘fully yes’, ‘mostly yes’, ‘only somewhat’ and ‘fully no’. The survey is representative nationally, and over 90 per cent of the respondents are from seventy-eight out of eighty-three Russian regions. It allows us to look at the relationship between individual trust in COVID-19 statistics and the degree of exposure of under-reporting in the respondent's location.
We start our analysis with simple descriptive statistics. For this purpose, we group respondents by three categories of regions: regions with no excess mortality in May; the half of the regions with excess mortality where COVID-19 mortality was reported relatively accurately; and the other half, where regional governments hid COVID-19 mortality to a larger extent. Figure 5 presents the overlay of the three histograms for respondents in every region category. Respondents in regions without excess mortality and consequently without the under-reporting of COVID-19 mortality in May are consistently the most trusting of official statistics. However, this category does not allow us to disentangle the actual determinant of relatively higher trust because it may be driven both by the absence of the COVID-19 outbreak and by the lack of data manipulation. This is different for the regions with positive excess mortality because the intensity of under-reporting is not correlated with excess mortality, meaning that any difference in trust should be attributed to the difference in under-reporting.Footnote 30 Here, we observe that respondents from regions with a larger deviation of official COVID-19 data from actual mortality report consistently lower trust levels than respondents from regions where regional governments provided more accurate information on COVID-19 mortality, despite having the same level of severity of the COVID-19 pandemic on average.
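The three-way grouping of regions can be sketched as follows. The function name, the category labels and the example ratios are hypothetical, `None` stands for a region without positive excess mortality in May, and splitting at the median with ties assigned to the more accurate half is an assumption.

```python
from statistics import median


def categorize_regions(ratio_by_region):
    """Assign each region to one of three categories by its May under-reporting ratio."""
    with_excess = [v for v in ratio_by_region.values() if v is not None]
    cutoff = median(with_excess)  # splits regions with excess mortality into two halves
    categories = {}
    for region, ratio in ratio_by_region.items():
        if ratio is None:
            categories[region] = "no excess mortality"
        elif ratio <= cutoff:
            categories[region] = "more accurate reporting"
        else:
            categories[region] = "heavier under-reporting"
    return categories


# Made-up ratios for five regions (not real data).
example = {"A": None, "B": 1.3, "C": 2.0, "D": 10.0, "E": 82.6}
groups = categorize_regions(example)
print(groups)
```

Each survey respondent would then inherit the category of the region in which they live.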
However, inferring the under-reporting from all-cause mortality data is not a straightforward task; thus, we expect this relationship to hold mostly for respondents with better analytical skills. To test this hypothesis, we split all respondents into two subgroups by education: those with and without a university degree. Figure 6 replicates Figure 5 for both subgroups. We notice that the previous pattern is more prominent in the subgroup with better education and overall mistrust of official statistics is also higher for this group.
The regression results also confirm that the under-reporting variable decreases trust only conditional on the respondent holding a university degree. The estimation results are available in Table E1 in SA E.
Self-isolation
The deliberate under-reporting negates the advantages of informed self-regulation and affects the level of self-isolation by creating a false perception of safety. Once the under-reporting is exposed and the public trust in official statistics is lost, as we showed in the previous section, the population will no longer adjust their behaviour to the official information. We test these hypotheses using the self-isolation index of the Russian regional population, as described earlier.
Our analysis rests on the following hypotheses. First, monthly official COVID-19 mortality should be positively associated with self-isolation in the regions: the higher the official mortality, the more fearful of possible contagion people in the region become. However, the extent to which these two indicators are correlated could depend on how much people trust governmental statistics. Secondly, the correlation should be highest for the months prior to the publication of the all-cause mortality data that exposed pre-referendum under-reporting. Thirdly, the correlation should be suppressed further by the extent of revealed under-reporting at the regional level.
Thus, we regress the self-isolation index on: (1) interaction terms between month dummies and COVID-19 mortality; and (2) triple interaction terms between COVID-19 mortality, month dummies and the coefficient of exposed under-reporting in May (the same variable as in the previous subsection). The regression results are presented in Figure 7. As we expected, self-isolation and official COVID-19 mortality are strongly correlated for the first month of the pandemic, and the correlation becomes statistically insignificant after July – the month after the end of the referendum when the all-cause mortality was finally published.
However, when we account for the regional differences in exposed under-reporting starting from June, we observe that responsiveness to official statistics dwindled proportionately to the exposed under-reporting of official COVID-19 mortality. This finding allows us to conclude that the general public may have discounted official information proportionally to the level of under-reporting in May in this region.
Our findings have two important implications. First, since self-isolation was relatively higher in regions with higher reported COVID-19 mortality, deliberate under-reporting potentially led to suboptimal levels of social mobility, thus increasing the risk of contagion. Secondly, the publication of information exposing the data manipulations was likely to have decreased public responsiveness to official statistics further.
Conclusion
The global pandemic caused by COVID-19 has posed an unprecedented challenge for governments around the world; yet, many autocratic regimes have responded to the pandemic in the manner they are used to – by manipulating official statistics to create an image of success instead of actually fighting the virus. However, without accurate pandemic information, it is hardly possible to assess the effectiveness of governmental policies, estimate the virus spread or make decisions on opening up borders for international travel. This article shows that in large authoritarian states, COVID-19 data manipulation could be driven by the actions of sub-national politicians reacting to the (informal) incentives set by the central government. Furthermore, we provide evidence that this data manipulation leads to declining public trust in official COVID-19 information and induces lower compliance with safety measures. Thus, under-reporting comes at a cost to the ability of society to contain the spread of the virus.
The case of the Russian Federation studied in this article suggests the following mechanism explaining under-reporting. To achieve political goals associated with the need to implement the referendum on constitutional amendments, the federal government provided informal incentives to governors to paint a ‘rosy picture’ of the COVID-19 pandemic in their regions – either by actually managing the pandemic or by doctoring the data. Since manipulating data is an everyday routine for most Russian officials, many Russian governors opted for under-reporting COVID-19 mortality to achieve the goal set by the federal centre. Governors who perceived themselves as facing larger political risks and needing support from the federal centre provided more biased reporting of COVID-19 mortality.
Our analysis finds evidence of the correlation between perceived political risk and under-reporting only for the period preceding the national referendum, that is, when political incentives from the centre were particularly strong. After the referendum, we find no consistent evidence of a significant link between political risk and under-reporting. This may be explained by the fact that data manipulation, in the eyes of regional bureaucrats, is not an effortless activity, and they are more likely to engage in it only if they face respective incentives from the centre.
Our study acknowledges several limitations. First, our measure of under-reporting is based on the assumption that excess mortality is a more accurate proxy than officially reported data on COVID-19 deaths, and while this assumption has become widely accepted by scholars of the current pandemic, this approach might still not be ideal. Secondly, while the fundamental logic of the political mechanism behind under-reporting in the case of Russia is externally valid in the context of other autocratic regimes and is causal based on the identification strategy, the findings regarding the consequences of exposed under-reporting are rather more illustrative and might not apply in other circumstances.
Still, the relevance of our findings goes beyond empirical observations on how Russia handled the COVID-19 pandemic, as they provide evidence of an important mechanism of data manipulation in authoritarian regimes that has so far remained unexplored in the scholarly literature. While existing studies acknowledge the importance of data manipulation as a legitimation strategy by autocracies, they often fail to uncover the specific mechanisms of how data manipulation emerges through the interaction of multiple agents in an authoritarian political system. Our study shows that it is essential to understand not only the motives of the authoritarian regime to manipulate data, but also the specific incentives it sets for its bureaucracies engaged in data fabrication and the factors triggering more or less intensive responses on the side of bureaucrats.
Supplementary Material
Online appendices are available at: https://doi.org/10.1017/S0007123422000527
Data Availability Statement
The dataset and full replication files for the article are available in the journal's Dataverse at: https://doi.org/10.7910/DVN/OOWHY5
Acknowledgements
We are thankful for helpful comments and suggestions from the editor and two anonymous referees, as well as Benita Combet, Antonio Farfán-Vallespín, Gerrit Gonschorek, Henry Hale, Gan Jin, Günther Schulze, Alexey Raksha, Anton Shirikov and participants of the Wisconsin Russia Project Young Scholars Workshop and the Workshop on Russian Politics of the Freie Universität Berlin. We appreciate the research assistance of Guram Kvaratskhelia. We are also grateful to Caress Schenk for drawing our attention to the CoronaNet Database. All mistakes remain our own.
Financial Support
None.
Competing Interests
None.