INTRODUCTION
There is a fundamental divide in democratic theory between “realist” approaches, which severely question the capacity of ordinary citizens to rule themselves, and deliberative approaches, which propose to rely on, or nurture, the capacity of citizens to make thoughtful and informed choices. We see this divide in Joseph Schumpeter’s contrast between the “classical” theory of democracy, in which citizens reason about the public good, and the modern “competitive” model (Posner 2005; Schumpeter 1942; Shapiro 2003), in which democracy is reduced to little more than a “competitive struggle for the people’s vote.” In that struggle, advertising, manipulation of the public, and deception are all fair game (Schumpeter 1942, 263). A central claim of the realist position is that the public’s resulting “volitions” lack meaning. In this view, the point of democracy is to have peaceful transitions of power in a system that preserves rights. This division recurs throughout a vast literature, but most prominently of late in the “realist” theory offered by Achen and Bartels (2016), as contrasted with what they call the “folk theory of democracy,” in which informed citizens would supposedly make reasoned choices. For the “realist” theory, the “will of the people” is mostly chimerical: simply the product of various manipulative techniques, or the by-product of mobilizing partisan loyalties. Relying on the capacities of the public to make thoughtful decisions about the policies on offer is, in this line of thinking, a “pipe dream hardly worth the attention of a serious person” (Posner 2005, 163) and a “fairy tale” (Achen and Bartels 2016, 7).
Advocates of “deliberative democracy” can concede that the realists have a point about current democratic practices and voting behavior, but they hold out the prospect that deliberations by the people themselves, taking place under certain conditions, can be more reason-based. Is it the capacities of the public that are so limited, or is it the nature of its opportunities within the current design of our democratic institutions and social practices? Perhaps voters are just not effectively motivated by the social context of being a citizen in mass society, subject to manipulative messages, incentives for rational ignorance, and a public sphere that seems to be decomposing into “filter bubbles” of like-minded “enclaves,” especially on contested issues (Chitra and Musco 2020; Dilko et al. 2017; Pariser 2011; Sunstein 2017; Spohr 2017; but see Zuiderveen Borgesius et al. 2016 for skepticism). Under other possible conditions, could they perform a bit more like ideal citizens? This is not a utopian question. It is a question that can guide institutional designs for possible reforms. There are now experiments around the world with different designs for deliberative mini-publics that might foster reason-based decisions by random samples of citizens both on policies and on electoral choices (Fishkin 2018; Grönlund, Bächtiger, and Setälä 2014; Karpowitz and Raphael 2014).
There has long been speculation that certain kinds of political participation might make “better citizens” (Mansbridge 1999; Pateman 1970). But what kind of participation? In what respects might “better citizens” result? Most of the speculation has focused on participation in deliberative institutions such as the New England town meeting or the jury, institutions that John Stuart Mill called “schools of public spirit” (Mill [1861] 1991) when he reacted to de Tocqueville’s report of how these institutions operated in America. As de Tocqueville said, “Town meetings are to liberty what primary schools are to science; they bring it within the people’s reach” (de Tocqueville [1835] 2019, 73). Mill envisioned similar civic effects from service on juries and parish offices in England and, he speculated, from service in the deliberative institutions of ancient Athens. In all these cases, Mill argued, a citizen would be called upon “to weigh interests not his own; to be guided, in case of conflicting claims, by another rule than his private partialities; to apply, at every turn, principles and maxims which have for their reason of existence, the general good” (Mill [1861] 1991, 79). When the public is called upon to discuss what to do about public problems, they consider each other’s reasons and learn to take responsibility. This is essentially what Pateman termed the “educative effect” of political participation (Pateman 1970, 33), here focused specifically on participation in deliberative or discursive institutions.
What are the relevant dependent variables that might be affected by this kind of participation? Much of the focus has been on political efficacy (Morrell 2005), on attention to public affairs (or knowledge gained about them) (Fishkin 2018), and on voter turnout (Gastil, Deess, and Weiser 2002; Gastil et al. 2010). For the latter, Gastil and colleagues found striking effects on turnout from participation in juries that reached a verdict (Gastil, Deess, and Weiser 2002; Gastil et al. 2010). All of these effects might plausibly fit an account of “better citizens.” Ideally, citizens will consider that they have views worth listening to (internal political efficacy), they will learn about public issues and pay attention to campaigns, and they will vote. In the ideal case, they would also make an explicit connection between their policy positions and whom they vote for.
These elements combine to fit the classical picture of voters taking their responsibilities seriously, becoming engaged and informed, having the sense of efficacy to do so and then voting based on their policy views. These are attributes of their civic capacities that speak directly to their ability to contribute to collective self-government. If deliberation were to facilitate voters behaving in this way, it would constitute a response to the realist critique, at least so far as that critique depended on the capacities of voters rather than our current design of the institutions that engage them.
To offer a response to the realist critique, advocates of a more deliberative democracy need to face five related empirical questions:
1) Does deliberation in an organized setting demonstrate that ordinary citizens can come to reason-based and evidence-based judgments about what is to be done? One criterion for success on this score would be if citizens come to conclusions that clearly depart from simple partisan-based loyalties and show a judgment on the merits of the issues.Footnote 1
2) Can deliberation in such an organized setting have lasting effects? Or do the effects simply dissipate in the hothouse environment of political competition, campaigns, and elections?
3) If deliberation has lasting effects, are they primarily in the persistence of post-deliberation policy attitudes or in the propensity to make reason-based choices, especially with respect to behaviors such as voting (both turnout and vote intentions)?
4) If deliberation has lasting effects on voting (either turnout or vote intention, or both), are there identifiable mediators, themselves affected by deliberation, that produce these effects on voting?
5) Can deliberation in an organized setting, of the kind that offers encouraging answers to the first four questions, be scaled to large numbers beyond the random samples?
We approach questions 1 through 4 with a test case, a national experiment in deliberation. The experiment took place in the fall of 2019, on the eve of the presidential primary season, and included a series of follow-up surveys with the deliberating sample and a control group. As for question 5, we have developed, based on this experiment, an approach that is now being piloted with new technology. We will sketch this approach at the end of the article; we report on those results in a separate paper (Fishkin et al. 2023).
AMERICA IN ONE ROOM
In collaboration with the Helena Group Foundation, we convened a national experiment in public deliberation about the major issues facing the United States in the period just preceding the 2020 presidential primary season. The event, entitled “America in One Room,” gathered a stratified random sample of 523 registered voters from around the country, recruited by NORC at the University of Chicago. A control group of 844 was also recruited by NORC and took essentially the same questionnaires in parallel with the experiment participants. The registered voter samples for both the participant and control groups were sourced from NORC’s probability-based and nationally representative AmeriSpeak panel. The recruitment and representativeness of the participant and control groups are discussed in the Supplementary material and in the Discussion. In advance of the initial survey, an advisory committee reflecting different points of view on the selected topics vetted the briefing materials for balance and accuracy. These materials serve as the initial basis for discussion when the sample is convened for its deliberations. The agenda focuses on policy options with balanced arguments for and against each option.
The first stage of the process resembles a normal public opinion poll: participants are surveyed with a standardized instrument in advance of seeing or discussing any information from the project. In the second stage, the random sample is brought together to a single place for extensive face-to-face discussions, usually over a long weekend. They are randomly assigned to moderated small group discussions, and they attend plenary sessions where they can pose questions to panels of experts or decision-makers with diverse views on a particular issue. At the end of the deliberations, participants take the same questionnaire as on the first contact, plus added questions for evaluation.
The participants gathered at a hotel in Dallas, Texas, on the weekend of September 19–22, 2019, arriving Thursday late afternoon and leaving Sunday after lunch. The agenda alternated small group discussions by issue area and plenary sessions, each lasting 90 minutes and running throughout the weekend. Each of the five issue domains (the economy, immigration, the environment, health care, and foreign policy) was discussed both in small group discussions and in plenary sessions with experts. Participants remained with the same small group (averaging about 13 persons) throughout the event, enabling them to get to know one another on a personal level over the course of the weekend. In the final questionnaire, completed just before departure, respondents were asked (as they had been in the pre-deliberation survey) to rate each specific policy proposal on a scale of 0 to 10, where 0 was “strongly oppose,” 10 was “strongly favor,” and 5 was a neutral midpoint.
Of the 47 proposals in these five issue areas, 26 can be classified as instances of extreme partisan polarization between Republicans and Democrats. The criteria are as follows:
a) At least 15% of respondents identifying with each party take the most extreme possible position (0 or 10) at time 1 (T1), with these Democrats and Republicans at opposite poles on the proposal.
b) A majority of those party members who take a position at T1 are on the same side of the scale as the “extremes.”
These two criteria combine to identify extreme partisan polarization because the extremes are balanced at the two poles, with Republicans on one side and Democrats on the other.
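A minimal sketch of how this classification rule might be implemented in R follows. The data frame name, column names, and the reading of “takes a position” as a non-neutral (non-5) answer are assumptions for illustration, not the project’s actual code.

# Assumes a hypothetical long-format data frame `responses_t1` with one row per
# respondent-proposal: columns `proposal`, `party` ("Democrat"/"Republican"),
# and `score` (0-10).
is_extreme_polarized <- function(dem, gop) {
  check_pole <- function(hi, lo) {
    # hi: scores of the party hypothesized at the 10 pole; lo: party at the 0 pole
    crit_a <- mean(hi == 10) >= 0.15 && mean(lo == 0) >= 0.15
    crit_b <- mean(hi[hi != 5] > 5) > 0.5 && mean(lo[lo != 5] < 5) > 0.5
    crit_a && crit_b
  }
  check_pole(dem, gop) || check_pole(gop, dem)   # try both pole assignments
}

polarized <- sapply(split(responses_t1, responses_t1$proposal), function(d) {
  is_extreme_polarized(d$score[d$party == "Democrat"],
                       d$score[d$party == "Republican"])
})
names(polarized)[polarized]   # proposals meeting both criteria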
Deliberation in this setting produced significant depolarization on 20 of the 26 extreme partisan issues. By depolarization, we mean that the means of the two parties move closer together. This does not necessarily mean that they both move toward the middle: they can both move in the same direction so long as they end up closer together. In a number of cases, the changes were massive, amounting to 40 percentage points for Republicans on the most hardline immigration questions and for Democrats on the most ambitious redistributive proposals. The control group changed hardly at all on these policy issues in the same period. The project included three follow-up waves: in July 2020 before the national party conventions, in late September/early October before the November 2020 Presidential election and, to capture self-reported actual voting, in the weeks following the election. All of these waves included both the treatment group (participants on the deliberative weekend) and the control group. We thus have data from T1 (before the weekend of deliberation), T2 (at the end of the weekend), T3 (10 months later, in July 2020), T4 (October 2020, a year later, shortly before the presidential election), and T5 (after the election, for self-reported recollection of voting). After the election, we also collected verified voting data from publicly available sources on the participants and the control group. The changes of opinion from T1 to T2 are the subject of Fishkin et al. (2021). This article focuses on the follow-up waves, including T4 and T5.
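As a concrete illustration, the depolarization criterion just described can be computed directly from the party means; the vector names below (scores on one proposal, by party and wave) are hypothetical.

# A proposal "depolarizes" when the gap between the party means shrinks from T1
# to T2, regardless of the direction either party moves.
party_gap <- function(dem, gop) abs(mean(dem, na.rm = TRUE) - mean(gop, na.rm = TRUE))

depolarized <- party_gap(dem_t2, gop_t2) < party_gap(dem_t1, gop_t1)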
To get a picture of the overall changes in policy attitudes, we use individual responses to the 26 extremely polarized issues to construct a policy-based ideology score (PBS). The PBS ranges from 0 to 10, with 0 denoting most liberal and 10 denoting most conservative. The score is constructed for each individual at T1 by averaging over their responses to each of the 26 questions. This process is similar to the weighted averaging of issue scales advanced in Ansolabehere, Rodden, and Snyder (2008).Footnote 2 Before averaging, we make sure that the response to each question is oriented to the PBS rubric (e.g., if the extreme Republican response to a question was 0, we flip everyone’s scores on that question, turning 0s into 10s, 1s into 9s, and so forth). We do the same to construct the PBS using individual responses to the 26 questions at T2 and T3. The policy questions were not asked at T4 and T5.
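A minimal sketch of this construction in R, assuming a hypothetical wide-format data frame policy_t1 (one row per respondent, one 0–10 column per proposal) and a logical vector flip marking the items whose extreme Republican answer is 0:

# Reorient items so that 10 is always the most conservative response, then
# average across the 26 items to form each respondent's PBS at T1.
scores <- as.matrix(policy_t1)
scores[, flip] <- 10 - scores[, flip]
pbs_t1 <- rowMeans(scores, na.rm = TRUE)   # 0 = most liberal, 10 = most conservative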
It is worth emphasizing that the overall PBS is composed of policy scores in five different issue areas. The overall movement thus masks differential movement within the issue areas. For example, on the economy, participants moved significantly to the right on average. However, movements to the left on proposals in the other four areas outweighed that movement to the right in the aggregation for the overall score. It is evident that, depending on the issues selected, deliberation can move people significantly in either direction.
As can be seen from Table 1, the PBS for all five issue areas moved significantly between T1 and T2. Four of the areas saw movement to the left on average, but one of the largest movements was to the right, on the economic issues (column 4). All five issue area changes between T1 and T2 were significant at the 0.01 level. For comparisons to the control group and a difference-in-difference analysis employing regressions, see Supplementary Tables A.1 and A.2.
Note: Standard errors are from paired t-tests of mean differences. The samples used for the paired t-tests are slightly different since individuals need to have taken both surveys, but this does not change the sample means reported here in meaningful ways.
*p < 0.10, **p < 0.05, ***p < 0.01.
Changes in the overall PBS are shown in a binned scatterplot in Figure 1.Footnote 3 While the control group changes very little, there are large changes in the participant group, especially between T1 and T2. Clustered in the broad middle of the PBS range, individuals with more conservative initial positions showed negative movement (down in the chart and hence to the left politically), while those with more liberal initial positions showed positive movement (up in the chart and hence to the right politically). By T3, their positions reverted significantly in the direction of those they held at T1.Footnote 4
In other words, the significant movements in the PBS between T1 and T2 reverted considerably ten months later (T3–T1). As Table 1 shows, significant differences between T1 and T3 remained, but when compared to the control group movements from T1 to T3, the long-term effects of participation appear to wash out (see Supplementary Tables A1 and A2). At first glance, it would seem that deliberation does not produce many lasting effects. After all, these voters returned from their weekend of deliberation to the hothouse of an extremely nasty campaign, one of the most polarizing in recent memory. It is hard to imagine sustained effects of a single weekend on participants ten months or a year after such a collective experience.
It is worth pausing to note that from the standpoint of deliberative mini-publics convened as a form of public consultation, it would not matter crucially if the results revert ten months or a year after the deliberations have concluded. With public consultation of a stratified random sample, we are interested in what the public, in microcosm, thinks about a topic when the issues are fresh and when it has really engaged the competing arguments. Its considered judgments soon after deliberation can be taken, collectively, as a recommendation about what should be done. After that, memories fade. People return to their customary social networks and media habits. New news events offer added, and perhaps different or one-sided, perspectives on the same issues the microcosm had deliberated. So reversion in whole or in part is to be expected and does not undermine the core purpose for which the microcosm was convened in the first place.
However, while reversion is not troubling for the core function of deliberative public consultation on a given set of issues, it is challenging for the broader aspiration, often shared among deliberative democrats, of somehow creating a more deliberative society. Unless a national mass deliberation were to be conducted soon before a national election (see Ackerman and Fishkin 2004 for one such scenario), it would seem to have little effect on collective self-government.
However, a more careful examination of the data collected following our national experiment offers a different picture. The follow-up surveys of treatment and control groups actually offer evidence of a significant effect on collective capacities for self-government, resulting in a more optimistic picture. We say “optimistic” not because of any partisan implications. In a different election, the positions of the two parties could easily be reversed. Rather, we say “optimistic” because of the effect on the civic attributes of voters that deliberation appears to stimulate.
THE PUZZLE OF LASTING EFFECTS
The follow-up surveys just before the 2020 election as well as just after the election with both the participants and the control group show significant differences in voting behavior for these samples of registered voters.Footnote 5
Table 2 shows the dramatic difference between the treatment and control groups in voting intention just before the election, a full year after the deliberative weekend. The control group showed a gap between Joseph R. Biden and Donald J. Trump of 3.8 percentage points (the actual gap in the electorate was about 3 points). But the voting intentions of the participants suggest a dramatic effect of the treatment—a gap of 28.2 percentage points between the two major candidates.Footnote 6
How is such an effect possible? The results are surprising because the accepted wisdom in political science has long been that voting behavior, deeply rooted in group attachments, is much more stable than political attitudes and presumably much harder to change (Achen and Bartels 2016; Campbell et al. 1960; Green, Palmquist, and Schickler 2004). We find significant effects on two aspects of voting behavior: whom one votes for and whether one votes at all. The second is just as puzzling as the first in that successful interventions on turnout tend to occur soon before the election (e.g., Green and Gerber 2019). All of the effects on voting behavior discussed here occur much longer after the intervention. How can deliberation possibly have such effects almost a year later?
DELIBERATIVE DEPARTURES FROM PARTISAN LOYALTIES: WHO WAS AFFECTED?
We can begin to explore these differences by looking first at who differed in voting behavior between the two groups at election time. Second, we will use predictive modeling to indicate who departed from the voting behavior that would normally be predicted by standard demographics (including party ID). Then, we will turn to causal mediation analysis to explore the effects of a latent variable, which we term a “civic awakening,” that helps to further explain the voting behavior of the treatment group.
Not only are Democratic participants in the middle (roughly 3–5) PBS range more likely to intend to vote for Biden than those in the same range in the control group, but Independents and Republicans are as well. Figure 2 illustrates this finding with a binned scatterplot that breaks participants and control group members out by party. Democrats and Independents who start off in the middle group at T1 are especially likely, relative to the control group, to intend to vote for Biden by T4. Self-described Republicans who, in fact, hold somewhat left-leaning policy positions at T1 are also more likely to vote for Biden at T4 than comparable individuals in the control group.
In addition to vote intention for Biden, the middle group of participants also sees the biggest effect on intention to vote at all. Figure 3 demonstrates this, comparing those in the middle group to those outside of it at T1 and comparing participants to control group members. The figure supports our view that individuals in the middle group are marginal voters whose political engagement can be especially awakened by democratic deliberation.
To further investigate individuals in the middle group, we build a model designed to predict vote intention based on the control group and then see where the model performs poorly when applied to the participants.
We take our control group sample and run a probit regression of the control group’s voting intentions (1 for Biden, 0 for Trump) at T4 on their characteristics at T1. The regression includes the following explanatory variables: education, gender, age, race, marital status, employment status, income level, home ownership status, metro/rural area of residence, feelings towards Republicans, feelings towards Democrats, opinion of Trump, opinion of Biden, PBS, ideological self-assessment score, and political party. The pseudo R-squared for a probit regression that includes just the demographic variables is 0.11; the pseudo R-squared for a regression that has all the variables above is 0.86.
We then take the estimated model (with all the variables) and use it to predict the participants’ intent to vote at T4 based on their T1 characteristics. We calculate the delta between actual vote intention and the model’s predictions. To do this, we take the vote intention a participant reports at T4 (1 = vote for Biden, 0 = vote for Trump) and subtract the model’s prediction for that participant (a probability of voting for Biden that ranges from 0 to 1). Thus, if a person’s delta is positive, the probability that he/she will vote for Biden is higher in reality than the model would predict. If the delta is negative, the probability that he/she will vote for Biden is lower in reality than the model would predict.
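A minimal sketch of this exercise in R follows; the data frame names and the abbreviated covariate list are hypothetical stand-ins for the full set of explanatory variables listed above.

# Fit the probit on the control group, then score the participants and compute
# the delta between reported intention and the predicted probability.
probit_fit <- glm(vote_biden_t4 ~ educ + gender + age + race + income +
                    party_id + pbs_t1 + opinion_trump + opinion_biden,
                  family = binomial(link = "probit"), data = control)

pred <- predict(probit_fit, newdata = participants, type = "response")
participants$delta <- participants$vote_biden_t4 - pred   # > 0: more pro-Biden than predicted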
In Figure 4, we plot the binned averages of the deltas on the y-axis against the participants’ Time 1 PBS on the x-axis. The model error is close to zero for most participants, except for those who are in the middle group of the PBS, in the 3–5 score range. The errors start to pick up a bit (meaning the likelihood that the participant will vote for Biden is higher in reality than predicted by the model at T1) for those with PBS between 3 and 4, and really shoot up for those with PBS between 4 and 5.
Looking at model prediction errors by demographic characteristics, we find evidence that the participants driving differences in vote intention between the participant and control groups are those without a college degree (Figure 5), especially in the middle of the PBS range. These findings corroborate our evidence that the participants who saw the largest lasting increase in political engagement as a result of deliberation are individuals who came into the experiment with the lowest levels of political knowledge. Figure 6 also demonstrates that women in the middle range of the PBS were disproportionately affected by the deliberations.
The first issue to examine is whether or not the apparent difference is the result of differential attrition or some other distortion in the composition of the participant and control groups a year or more later (T4 and T5) compared to the way they matched up at T1.
Table 3 shows that the average differences in characteristics of the control group and the participant group did not change significantly across time. Differences in the averages between participant and control groups for all the standard demographics (as well as party ID) are stable across the various waves. The dramatic differences in voting intention we saw in Table 2 thus cannot be attributed to differential attrition in either the participant group or the control group in the survey waves collected either before or after the election.
Note: Table shows differences in average characteristics of the control group and the treatment group at each time period of the study. T1 is just before the deliberations in September 2019; T2 is just after the deliberations (not shown because the sample is the same as at T1); T3 is 10 months later, in July 2020; T4 is October 2020; T5 is November–December 2020, after the general election. All characteristics are based on data collected at T1. The number of observations at each time period includes the participant and control group samples. For full balance tables for each time period of the study, please see the Appendix Tables A3–A6.
*p < 0.10, **p < 0.05, ***p < 0.01.
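A minimal sketch of this attrition check in R; the data frame panel, the treated indicator, and the wave-completion flags are hypothetical.

# Compare T1 characteristics of participants and control respondents among those
# who completed a given wave; stable differences across waves indicate no
# differential attrition.
balance_at <- function(wave_flag, var) {
  d <- panel[panel[[wave_flag]], ]
  t.test(d[[var]] ~ d$treated)   # difference in T1 means, participants vs. control
}
balance_at("in_t4", "age")       # repeat for each demographic (and party ID) at each wave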
Even though the participant group did not experience differential attrition compared to the control group, one might wonder if they started out more knowledgeable and more oriented to discussion with diverse others. Perhaps, starting out more amenable to diverse civic dialogue, they were more easily activated by the process. However, there were no significant differences between the participants and the control group on five out of seven of the questions on general political knowledge at time 1 (see Supplementary Table A8). On questions about others “who disagree with you strongly,” such as whether they “have good reasons” or whether “they are thinking clearly,” the participant and control groups show virtually no difference at time 1. This is also true for the time 1 views of the sample who answered the voting intention questions at time 4 (see Figure A2 in the Appendix). We take these questions as an approximate measure of the predisposition to engage with those with whom one most strongly disagrees. There does not seem to be any difference in this predisposition between treatment and control groups, nor any strong indication that the participants started out as more informed citizens.
Another potential explanation stems from the global COVID-19 pandemic and the Trump administration’s response, both of which occurred during the T3–T5 survey waves. One might suspect that participants in the deliberations perceived the impact of the pandemic—and the public policy responses to it—differently than the general populace, giving them a more negative view of the Trump administration. To explore this possibility, Figure 7 plots respondent PBS against their assessment of the federal government’s COVID-19 response, broken down by the participant and control groups. What we see in the figure is that, accounting for pretreatment policy positions, there is no difference at all between the two groups. This indicates that, whatever effect the deliberations had, it was not primarily through differential changes in perception of the Trump administration’s performance during the COVID-19 pandemic. This makes sense in that the effects of the pandemic should not have been localized or uniquely felt by participants; any effects that COVID-19 had on the election were likely homogeneous between the participant and the control group. COVID-19 was everywhere.
SOLVING THE PUZZLE
Our proposed solution to the puzzle of the delayed effect is that the deliberations gave rise to a latent variable, which might be termed an awakening of civic capacities, that has an effect, in turn, on voting (whether or not one votes at all) and on vote choice (whom one intends to vote for). The people who deliberated over the weekend, as compared to the control group, became more politically engaged. We take significant movement on the PBS over the weekend as an indicator that they were deeply involved in the deliberations. Those who deliberated were also more likely to follow the campaign, have a greater sense of internal efficacy (belief that their political views were “worth listening to”),Footnote 7 and acquire (and continue to acquire) general political knowledge. Deliberative change on the issues, following the campaign, feeling that you have views worth listening to, and becoming more knowledgeable are all elements of a coherent civic awakening—a picture of more engaged citizens.
These elements of a civic awakening are roughly similar to those found by Gastil, Deess, and Weiser (2002) in their study of the indirect effects on voting from serving on a jury that reached a verdict.Footnote 8 They found that the depth of deliberation (which they measured by the number of counts considered at a trial that reached a verdict) was one of the mediators in increasing the likelihood of voting. Our deliberators all considered the same number of policy issues, but we measure “depth of deliberation” through the opinion changes on the PBS over the deliberative weekend. Gastil et al. (2010) also found public affairs media use (as measured by “following the campaign”), political efficacy, and satisfaction with the deliberative processFootnote 9 all connected to the civic awakening from jury service. We use “following the campaign” and gain in general political knowledge as mediators, along with internal political efficacy.
In the jury case, the dependent variable was limited to whether or not one voted. In our analysis, we are interested both in turnout and in how one voted, and in whether that voting behavior has a connection to one’s policy positions. The latter is essential for considering the broader question of the impact of civic engagement on collective self-government.
These elements of a civic awakening are most simply captured graphically. First, we saw that the treatment stimulated significant policy change on the issues (Figure 1 and Table 1). These changes indicate who engaged in the deliberations to the point of changing their opinions significantly on the most contested issues. Second, the participants were more likely than the control group to say at T3 that they are “closely following the campaign”. Figure 8 shows how this difference is mostly (but not entirely) clustered around the moderate middle range of the policy score (based on the T1 scores).
Third, the deliberators show an increase in internal efficacy or self-efficacy between T1 and T3. They are more likely to think, at T3, that their opinions are “worth listening to” compared to the responses from the control group. Again, as pictured in Figure 9, these differences are clustered mostly, but not entirely, in the broad middle of the policy score range (based on their scores at T1).
Fourth, we have a measure of general political knowledge on items that were not explicitly the subject of the deliberations (who controls the House and who controls the Senate). This measure did not increase right after deliberation, at T2, but was significantly higher for participants at T3 (Figure 10). This suggests that 10 months after deliberation, the participants were obtaining general political knowledge on their own, a sign that the civic awakening that occurred during the course of deliberation is manifesting itself in lasting ways.
Let us review these aspects of the civic awakening and note how they are distributed in the policy space. First, as we see in Figure 8, there are clear differences between the participant and control groups in how closely respondents are following the 2020 election, regardless of the PBS. We also see a similar relationship in Figure 9 for respondents’ self-reported beliefs in the value of their own political opinions: again, participants were more likely to believe that their political opinions were “worth listening to” even at T3, a persistent change long after the deliberative weekend. The deliberative treatment also affected general political knowledge. It increased between T2 and T3, throughout the course of the election. Once again, the increases are clustered in the broad middle of the policy space and show a large difference between participant and control groups. For details on the two general political knowledge questions (as well as the policy-specific knowledge questions) and how they compared to responses from the control group at T1, T2, and T3, see Supplementary Table A8, Panels A, B, and C.
To summarize, we believe the following elements of civic awakening serve as mediators for whether citizens will vote at all and whom they intend to vote for: a) changes in the PBS over the weekend of deliberation (Table 1 and Figure 1); b) closely following the campaign (Figure 8); c) having “opinions worth listening to” (Figure 9); and d) general political knowledge (Figure 10). We will employ causal mediation analysis to demonstrate the effect of these variables on voting.
First, a few words on how we conceptualize the civic awakening. Our theory of measurement requires us to differentiate between two types of measures: “reflective” and “formative” (Sokolov 2018; Stenner, Burdick, and Stone 2008; Trochim 2001). A formative measure requires knowing all of the factors that make up a construct and including measures for all of these components. A classic example of a formative measure is socioeconomic status, which is defined as the combination of education, income, and occupational prestige (Auerbach, Lerner, and Ridge 2022). If one part were not included, the index would be incomplete and would not measure socioeconomic status. Reflective measures, by contrast, treat multiple observed outcomes as reflections of an underlying force and use latent trait modeling to identify that force. For instance, intelligence is a latent ability assessed through various types of tests: IQ tests take test question responses as reflections of an individual’s underlying ability. In this case, there is no complete corpus of intelligence components to be assembled (Coltman et al. 2008).
We view the civic awakening as a formative construct: we believe that it is a combination of individuals’ propensity to follow the campaign, to feel their opinions are worth listening to, to become knowledgeable about politics, and to have deliberated in depth (measured by whether attending the deliberation caused a general shift in their underlying political attitudes [PBS_T1 − PBS_T2]). If we treated this as a singular reflective measure, we would want to study the underlying correlation matrix and use methods that exploit shared covariance between the variables (like factor analysis). Because we believe civic awakening is a latent combination of these observable factors, such tests are inappropriate, though there are modest correlations between “follows the campaign,” “having opinions worth listening to,” and knowledge (with Pearson’s correlation coefficients ranging from 0.24 to 0.37 among the three).
Fundamentally, we believe that these four indicators are evidence of the formative construct of an individual’s unobservable civic engagement. We choose to keep them as separate indicators, rather than use an aggregation strategy, because keeping them separate makes the results of our treatment on each individual indicator clear, showing that certain indicators affect only certain outcomes. Aggregation would lose the specificity that makes our overarching story much clearer.
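The correlation check mentioned above is straightforward; a minimal sketch in R, assuming a hypothetical data frame mediators holding the three T3 indicators:

# Pairwise Pearson correlations among the three observed indicators; modest
# values are consistent with treating them as separate formative components.
cor(mediators[, c("follows_campaign", "worth_listening", "knowledge_index")],
    use = "pairwise.complete.obs", method = "pearson")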
CAUSAL MEDIATION ANALYSIS: ESTIMATING DIRECT AND INDIRECT EFFECTS OF DELIBERATION
The traditional method of exploring relationships between a treatment and outcomes is a regression model. However, this method fails to disentangle underlying causes and effects that are indirect rather than direct. In our case, we know that there is an effect of participating in the deliberations on an individual’s propensity to vote; this effect is perceivable even a year after the event. It is, however, unsatisfactory to state that participating in the deliberations is the direct cause of an increased propensity to vote and to vote in particular ways: surely, there were intermediate steps caused by the deliberations that, taken together, affect these outcomes.
When faced with the possibility of indirect effects, investigators may have prior knowledge that an explanatory variable plausibly exerts its effect on an outcome via direct and indirect pathways. In the indirect pathway, there exists a mediator that transmits the causal effect.
Suppose we have variables T and Y indicating the treatment variable and outcome variable, respectively. Mediation in its simplest form involves adding a mediator M between T and Y. The sequential ignorability assumption, critical to causal mediation analysis, states that the treatment (explanatory variable T) is first assumed to be ignorable given the pretreatment covariates, and then the mediator variable (M) is assumed to be ignorable given the observed value of the treatment as well as the pretreatment covariates (Imai, Keele, and Tingley 2010; Imai et al. 2011).
The first part is often satisfied by randomization, while the second part implies that there are no unmeasured confounding variables between the mediator and the outcome. The standard mediation analysis starts with three equations, usually modeled with continuous outcomes (though advances in methods now allow for most parametric modeling approaches for either stage of the mediation):

$ Y={i}_1+cT+{e}_1,\hskip2em (1) $

$ Y={i}_2+{c}^{\prime }T+bM+{e}_2,\hskip2em (2) $

$ M={i}_3+aT+{e}_3,\hskip2em (3) $

where i1, i2, and i3 denote intercepts; Y is the outcome variable; T is the treatment variable; M is the mediator; c is the coefficient linking T and Y (total causal effect); c’ is the coefficient for the effect of T on Y adjusting for M (direct effect); b is the effect of M on Y adjusting for explanatory variables; and a is the coefficient relating to the effect of T on M. e1, e2, and e3 are residuals that are uncorrelated with the variables on the right-hand side of each equation and are independent of each other. Under this specific model, the causal mediation effect (CME) is represented by the product of coefficients ab. Of note, Equation 3 can be substituted into Equation 2 to eliminate the term M:

$ Y=\left({i}_2+b{i}_3\right)+\left({c}^{\prime }+ab\right)T+\left({e}_2+b{e}_3\right), $

so that the total effect decomposes as c = c′ + ab. The parameters for the direct effect (c′) and the indirect effect (ab) of T on Y are thus distinct from the total effect. Testing the null hypothesis c = 0 as a precondition is therefore unnecessary, since the CME can be nonzero even when the total causal effect is zero (i.e., direct and indirect effects can run in opposite directions), which reflects the cancellation of effects from different pathways.
This standard setting for mediation analysis was refined and brought into the potential outcomes framework by Imai et al. (2011). The authors propose a set of methods that unifies the approach to identifying direct and indirect effects, relying on a set of assumptions that can be probed more readily than in classical mediation analysis.
MEDIATION AND DELIBERATIVE POLLING
We believe we have identified four variables that may mediate the effect of our treatment on voting (both whether to vote and whom to vote for). The four mediators are the change in PBS immediately following the deliberative weekend, a self-reported measure of following the campaign, a self-reported measure of whether one’s political opinions are worth listening to, and general political knowledge. We view these four collectively as indicators of an underlying civic awakening that made participants more politically and civically engaged. We believe that the effect of the treatment on these mediators caused eventual changes in two key dependent variables: a respondent’s propensity to vote at all and a respondent’s propensity to vote for Biden.
For each mediator, we estimate the following two-stage specification:
1st stage:

$ \mathit{FollowsCampaign}_{T3}\sim c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

$ \mathit{WorthListeningTo?}_{T3}\sim c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

$ \mathit{KnowledgeIndex}_{T3}\sim c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

$ {PBS}_{T1}-{PBS}_{T2}\sim c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

2nd stage:

$ \mathit{Outcome}_{1,2}\sim \mathit{FollowsCampaign}_{T3}+c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

$ \mathit{Outcome}_{1,2}\sim \mathit{WorthListeningTo?}_{T3}+c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

$ \mathit{Outcome}_{1,2}\sim \mathit{KnowledgeIndex}_{T3}+c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

$ \mathit{Outcome}_{1,2}\sim \left({PBS}_{T1}-{PBS}_{T2}\right)+c\cdot \mathit{Treatment}+{PBS}_{T1}+\mathit{DemographicControls}_{T1} $

where Outcome1,2 refers to vote at all (T4) and vote for Biden (T4), respectively. To estimate the mediation effects, we utilized a mixed effects regression framework, with demographic controls and random intercepts at the state level.Footnote 10 Demographic controls include education, gender, age, race, marital status, employment status, income level, home ownership status, metro/rural area of residence, and party ID. For consistency, we use the same sets of controls and regression modeling specifications for each of the models.
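A minimal sketch of one mediator–outcome pair under this specification, using lme4 in R; the variable and data frame names are hypothetical stand-ins for the survey items, and only a few of the demographic controls are spelled out.

library(lme4)

# Stage 1: mediator regressed on treatment, T1 PBS, and demographics, with a
# random intercept for the respondent's home state.
med_fit <- lmer(follows_campaign_t3 ~ treated + pbs_t1 + educ + gender + age +
                  race + income + party_id + (1 | state), data = dat)

# Stage 2: outcome (here, voting at all at T4) on the mediator plus the same
# right-hand side, as a mixed effects logistic regression.
out_fit <- glmer(vote_at_all_t4 ~ follows_campaign_t3 + treated + pbs_t1 + educ +
                   gender + age + race + income + party_id + (1 | state),
                 data = dat, family = binomial)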
Participation in the deliberations significantly increased campaign interest, self-efficacy, general political knowledge, and movement to the left overall on the PBS between T1 and T2, as demonstrated earlier. Because of this, it is possible that these four mediators carry significant indirect effects on the outcomes, even if the evidence for a direct effect is weaker. For the entire sample, these effects are shown in Figures 3 through 5. “Following the campaign,” having opinions “worth listening to,” and general political knowledge are significant mediators for whether or not one will vote. Table 4 shows the causal mediation effects over the full range of the PBS.
Note: Each model is fit using a generalized linear mixed effects model for both the mediators and the dependent variables—linear models for each of the mediators and logistic regression models for the dependent variables. The dependent variables are Vote at All and Vote for Biden, as indicated at the top of the table. Random intercepts were fit at the level of the respondents’ home state. We include a set of respondent demographic controls (age, gender, race, education, poverty, and party ID) in each model, as well as respondent PBS. Observations include participants and control group members. Models are fit using the “mediation” package in R, with 95% CIs in parentheses. Complete model results are available in the Supplementary Tables A14–1, A14–2, A14–3, and A14–4. *p < 0.1, **p < 0.05, ***p < 0.01.
Table 4 presents the results of this mediation analysis, with the mediators on the left-hand side of the table and the dependent variables on top. The effect listed is the average causal mediated effect, with 95% confidence intervals presented. The effects in this analysis are estimated using the “mediation” package in R (see Imai et al. 2011 for a discussion and Tingley et al. 2014 for an overview of the features of the package). The 95% credible intervals are estimated with a parametric bootstrap with 1,000 simulations and robust standard errors.
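A minimal sketch of the estimation step with the mediation package, continuing the hypothetical med_fit and out_fit models above; the treat and mediator arguments name columns in the data.

library(mediation)

# Simulate the average causal mediation effect (ACME), average direct effect,
# and total effect from the two fitted models; sims sets the number of draws.
med_out <- mediate(med_fit, out_fit,
                   treat = "treated", mediator = "follows_campaign_t3",
                   sims = 1000)
summary(med_out)   # reports point estimates with 95% intervals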
Indirect effects of the treatment were significant for voting at all when mediated through increases in respondent attention to the campaign, self-efficacy, and general political knowledge, though the relationship between these mediators and voting for Biden is not significant. However, for the group as a whole, there were significant indirect effects on intention to vote for Biden when mediated through changes in the PBS before and after treatment (movement to the left). But as Figures 8–10 suggest, these effects are likely to be more strongly felt among those in the middle range of the PBS (defined as respondents with scores of 3–5 on their T1 PBS). Because those participants start off at particularly low levels of civic engagement and lean slightly left in their policy positions, we believe they are likely to decide whether to vote and whom to vote for on the margin. Table 5 looks at the same set of models as Table 4 but restricts the analyses to just this middle group.
Note: Each model is fit using a generalized linear mixed effects model for both the mediators and the dependent variables—linear models for each of the mediators and logistic regression models for the dependent variables. The dependent variables are Vote at All and Vote for Biden, as indicated at the top of the table. Random intercepts were fit at the level of the respondents’ home state. We include a set of respondent demographic controls (age, gender, race, education, poverty, and party ID) in each model, as well as respondent PBS. Observations include participants and control group members. Models are fit using the “mediation” package in R, with 95% CIs in parentheses. Complete model results are available in the Supplementary Tables A15–1, A15–2, A15–3, and A15–4.
*p < 0.1, **p < 0.05, ***p < 0.01.
Here, we see a much larger effect on voting for Biden from changes over the weekend. However, with the smaller N of the middle range only, the other effects, except for following the campaign, are no longer significant. Before, for the full range of the PBS, the ACME was a rounded 0.02—roughly a 2-percentage-point increase in the probability of voting for Biden. Now the effect is a rounded 0.06—roughly a 6-percentage-point increase. This suggests that the indirect effect of participating in the deliberations, mediated through short-run changes to respondent PBS and thus an openness to moving one’s average position left on policy issues, was responsible for a 6-percentage-point increase in the likelihood a respondent would vote for Biden—even after conditioning on a variety of demographic controls as well as state random intercepts.
MEDIATION AND VOTE RECOLLECTION
To assess whether the long-run effect of deliberation on voting is more than a matter of stated intention, we also ran the same models from the previous section on a different measure of respondent voting behavior: vote recollection. Unlike in the previous section, where respondents were asked whether they intended to vote and how they planned to vote in the upcoming election (T4), we now rely on retrospective reports of respondent voting behavior (T5) for our dependent variables.
We perform this analysis using the mediation analysis framework from the previous section, simply substituting out the dependent variables. We are still interested in the indirect effect of deliberation on following the campaign, self-efficacy, political knowledge, and changes in ideology pre- and posttreatment. We use reported voting at all as our first dependent variable and reported voting for Biden as our second dependent variable.
In Table 6, we see two important patterns. First, the mediated effects of deliberation through knowledge and through change in ideology are basically the same for voting intention and voting recollection: deliberation increases knowledge, which increases an individual’s propensity to vote, and the shift in the PBS from deliberation between T1 and T2 had a bigger effect on an individual’s propensity to report actually voting for Biden. The second pattern concerns the first two mediators: following the campaign and respondent self-efficacy. The effect that deliberation has on respondents’ self-reported following of the campaign still has an effect on their propensity to vote; what has changed is that this same effect also makes participants more likely to self-report voting for Biden. The opposite is true for respondent self-efficacy; there is now no effect from deliberation through self-efficacy to any change in self-reported behavior. This may illustrate a difference in how prospective versus retrospective assessments of voting behavior track respondents’ self-assessments.
Note: Each model is fit using a generalized linear mixed effects model for both the mediators and the dependent variables—linear models for each of the mediators and logistic regression models for the dependent variables. The dependent variables are Vote at All and Vote for Biden, as indicated at the top of the table. Random intercepts were fit at the level of the respondents’ home state. We include a set of respondent demographic controls (age, gender, race, education, poverty, and party ID) in each model, as well as respondent PBS. Observations include participants and control group members. Models are fit using the “mediation” package in R, with 95% CIs in parentheses. Complete model results are available in the Supplementary Tables A16–1, A16–2, A16–3, and A16–4.
*p < 0.1, **p < 0.05, ***p < 0.01.
Table 7 shows the results of causal mediation analysis of the middle group, but using measures of reported voting after the November 2020 election. Similar to the relationship between Tables 4 and 6, Table 7 tells largely the same story as Table 5, with the sole exception being the emergence of an indirect effect of deliberation through following the campaign on voting for Biden. The results are otherwise largely unchanged.
Note: Each model is fit using a generalized linear mixed effects model for both the mediators and the dependent variables—linear models for each of the mediators and logistic regression models for the dependent variables. The dependent variables are Vote at All and Vote for Biden, as indicated at the top of the table. Random intercepts were fit at the level of the respondents’ home state. We include a set of respondent demographic controls (age, gender, race, education, poverty, and party ID) in each model, as well as respondent PBS. Observations include participants and control group members. Models are fit using the “mediation” package in R, with 95% CIs in parentheses. Complete model results are available in the Supplementary Tables A17–1, A17–2, A17–3, and A17–4.
*p < 0.1, **p < 0.05, ***p < 0.01.
There is a long-standing discussion about overreporting of voting in the literature (Belli, Traugott, and Beckman 2001; Bernstein, Chadha, and Montjoy 2001). But the fact that our results are essentially unchanged whether we measure voting outcomes before or after the election suggests that the effects of deliberation on this array of mediators are robust. Thus far, however, the causal mediation analyses have employed self-reported voting.
But we also collected verified votes for the participant and control samples after the election. Of course, there are well-known challenges with voter verification (Katosh and Traugott 1981; Miller et al. 2021). Some voters exaggerate whether they have voted, some move, and some have different spellings of their names or change their names. Despite these issues, we collected verified voter information for our sample and then redid the causal mediation analysis for those for whom we could definitively verify that they voted. The results are presented in Supplementary Table A12 for the whole range of the PBS and in Supplementary Table A13 for the middle range only. They show that the causal mediation results for voting at all and for voting for Biden remain essentially unchanged. They are not the result of people overreporting that they had voted, because the same causal relations hold for those for whom we could definitively verify whether or not they voted.
Thus far, we have traced elements of a civic awakening—greater efficacy, increased knowledge, and closer attention to the campaign among the deliberators. We have also seen indirect effects of the civic awakening on voting at all and voting for Biden. However, we can also explore whether there is a direct effect of their time 3 PBS scores on how they voted. Once awakened, are the deliberators more likely to take their policy preferences into account in deciding whom to vote for?
We can see this relationship in Table 8. In this set of regressions, we seek to compare how policy positions (PBS scores) measured at different times predict voting behavior in the 2020 election. We focus on three separate models that are identical in all respects except that they use a respondent’s PBS measured at three separate times. If, as we believe, participants become better spatial voters—voters who are better able to transform policy preferences into voting behavior—we should see a strong negative correlation between respondent PBS and voting for Biden (negative since higher scores mean more conservative), and we should see a significant coefficient on the interaction between participant status and the PBS score. We compare ideology pretreatment (time 1), immediately posttreatment (time 2), and in the year-later follow-up (time 3). We find results that confirm our expectations: participants are more likely to vote for Biden than members of the control group, having a higher PBS makes one less likely to vote for Biden, and, most importantly, being a participant makes this relationship statistically stronger.
Note: The dependent variable is a binary indicator for voting for Biden, conditional on having voted. The model is a logit regression with random intercepts for state. Each model includes demographic controls and party ID (the same as in the mediation results). The models use PBS scores for each respondent measured at different times. The full regression table is available in Supplementary Table A18. *p < 0.05, **p < 0.01, ***p < 0.001.
The relationship between voter policy positions at each time and whether or not they ultimately vote for Biden gets stronger for participants given their time 3 PBS, rather than their time 1 or time 2 PBS; the coefficient on the interaction goes from −0.921 for time 1 PBS interacted with treatment status, to −0.781 in time 2, and to −1.075 in time 3. This suggests that participant policy position is becoming a better predictor of voting for Biden over time; participants are voting more in line with their spatial preferences as their spatial preferences shift over time.
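A minimal sketch of one of these models in lme4, using hypothetical variable names; the three models in Table 8 differ only in which wave’s PBS is entered and interacted with participant status.

library(lme4)

# Logit with a state random intercept; the treated:pbs interaction tests whether
# policy positions predict the Biden vote more strongly among participants.
fit_t3 <- glmer(vote_biden ~ treated * pbs_t3 + educ + gender + age + race +
                  income + party_id + (1 | state),
                data = voters, family = binomial)
summary(fit_t3)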
We can speculate how these results might apply to the broader universe of eligible rather than just registered voters. Might similar effects have been found among non-registered but eligible voters? The non-registered voters are likely to be less knowledgeable and less educated. Do we think deliberation would have comparable effects on them? Figure 10 shows the biggest effects of the treatment on the less knowledgeable in the middle of the policy space (PBS). Figure 5 shows that the biggest effects on voting intention came from those who lacked a college degree. So it is worth speculating that if registration as a barrier to voter participation were somehow to disappear, deliberation could be expected to have comparable or even greater effects among those currently non-registered.
CAN THESE DELIBERATIVE EFFECTS BE SCALED?
The picture that emerges from these analyses is that deliberation in an organized setting, on the model of Deliberative Polling (Fishkin Reference Fishkin1991; Reference Fishkin2018), fostered elements of a civic awakening, particularly in the moderate and less politically engaged middle of the policy space. Those who were most affected by the deliberations during the weekend (as indicated by the changes in their PBS), those who subsequently followed the campaign more closely, those who thought they had opinions "worth listening to," and those who gained knowledge over the course of the campaign were also more likely to vote and, most particularly, more likely to vote for Biden in the 2020 Presidential election. In short, by intensively deliberating on the issues, becoming more aware of the campaign, having greater self-efficacy, and becoming more knowledgeable, they brought to life many of the elements of the "folk theory of democracy." This is not a myth beyond the competence of ordinary citizens. It is a set of capacities that can be stimulated by institutional design. We think it is remarkable that such a short intervention can have a lasting effect a year later via the mediating variables in this civic awakening, which led participants to process the campaign and their voting decisions differently from the control group.
Think of the changed distribution of this political engagement. Before deliberation, our civic mediators, plotted across the policy space, traced a kind of sunken parabola (a U shape) bottoming out in the middle range (see Figures 3–5). Those in the broad middle range were left out: less likely to "follow the campaign," less likely to think they had "opinions worth listening to," and less likely to gain general political knowledge. But deliberation brought up the middle ranges and created a distribution on these variables more like a plateau, putting everyone on a more equal footing. This is a more inclusive form of democracy, in which so many are not simply left out and in which deliberation is an intrinsic part of participation.
This is very much like the vision in "Deliberation Day," the idea of a national holiday during the Presidential campaign on which the whole country deliberates on the issues in many organized small groups and comes to a considered judgment (Ackerman and Fishkin Reference Ackerman and Fishkin2004). In anticipation of such informed voters being energized en masse, the book argues that candidates would have rational incentives to adjust their campaign strategies to appeal to voters in more thoughtful and nuanced ways. Perhaps this would disincentivize at least some of our more manipulative campaign practices. Whether or not this latter claim is correct, it is surely true that scaling the deliberative process would take voters out of their filter bubbles and engage them with diverse others as they determine their views on the issues. Activating the broad middle of the policy spectrum would change the incentives for candidates (and their allies via independent expenditure groups) to do more than simply address their party bases to stimulate turnout. The overall electorate might depolarize because the universe of more moderate and potentially persuadable voters would be enlarged by bringing those in the broad middle of the policy space back into the political arena.
If we are correct in this picture, can deliberation actually be scaled? We believe this is an area ripe for creative experimentation. One approach is the Stanford Online Deliberation Platform, which reproduces the experience of Deliberative Polling across large numbers of simultaneous small-group discussions. In fact, it has already been successfully employed as the mode of deliberation with stratified random samples in Japan, Hong Kong, Chile, Canada, and the United States, with up to 1,000 deliberators in 104 small groups (plus a separate control group; see footnote 11). In theory, the automated platform can handle any number of deliberative participants randomly assigned (with stratification) to small groups of ten or twelve. Further projects are planned to continue to expand scaling to much larger numbers and to study effects on participants. If eventual aspirations for mass participation in such processes succeed, this work suggests that we can achieve a more deliberative society.
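As an illustration of the kind of assignment involved, and not the platform's actual code, here is a toy sketch of stratified random assignment of deliberators into small groups of roughly ten to twelve; the strata (e.g., party identification) are an assumption made for the example.

```python
# Toy sketch: stratified random assignment of deliberators into small groups
# of ~10-12, so that each group roughly mirrors the strata proportions.
import random
from collections import defaultdict

def assign_groups(participants, strata_key, group_size=11, seed=0):
    """participants: list of dicts; strata_key: function mapping a participant
    to a stratum label. Returns a list of groups (lists of participants)."""
    rng = random.Random(seed)
    n_groups = max(1, round(len(participants) / group_size))
    groups = [[] for _ in range(n_groups)]

    # Bucket participants by stratum, shuffle within each stratum, then deal
    # them out round-robin so each stratum spreads evenly across groups.
    by_stratum = defaultdict(list)
    for p in participants:
        by_stratum[strata_key(p)].append(p)
    i = 0
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        for p in members:
            groups[i % n_groups].append(p)
            i += 1
    return groups

# Example: 1,000 deliberators stratified by party ID into ~91 groups of ~11.
people = [{"id": k, "party": random.choice(["D", "R", "I"])} for k in range(1000)]
groups = assign_groups(people, strata_key=lambda p: p["party"], group_size=11)
print(len(groups), [len(g) for g in groups[:3]])
```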
SUPPLEMENTARY MATERIAL
To view supplementary material for this article, please visit https://doi.org/10.1017/S0003055423001363.
DATA AVAILABILITY STATEMENT
Research documentation and data that support the findings of this study are openly available at the American Political Science Review Dataverse: https://doi.org/10.7910/DVN/ERXBAB.
ACKNOWLEDGMENTS
We would especially like to thank Henry Elkus, Sam Feinburg, and Jeff Brooks of Helena for their vision and collaboration. We also thank Larry Diamond for his invaluable contributions throughout the A1R project. We also want to thank Michael Dennis, Jennifer Carter, and their superb team at NORC at the University of Chicago. In addition, we want to thank Siddharth George for his valuable suggestions. This paper benefited from a presentation at the meetings of the American Political Science Association, September 2021. We would like to thank Helene Landemore, Jonathan Collins, Kimmo Grönlund, and John Gastil for their insights.
FUNDING STATEMENT
This research was funded by the Helena Group Foundation.
CONFLICT OF INTEREST
The authors declare no ethical issues or conflicts of interest in this research.
ETHICAL STANDARDS
The authors declare that the human subjects research in this article was reviewed and approved by the NORC IRB (#21-07-386) and by the Stanford University IRB (#35343). The authors also affirm that this article adheres to the APSA's Principles and Guidance on Human Subject Research.