Introduction
Artificial intelligence (AI) is believed to have significant potential to enhance public services’ effectiveness, efficiency and convenience. However, citizens often meet the use of AI in the public domain with skepticism (König and Wenzelburger, Reference König and Wenzelburger2021), as AI-based technologies raise important normative questions about the quality of decision-making in public policy (Bullock, Reference Bullock2019; Busuioc, Reference Busuioc2021; Grimmelikhuijsen and Meijer, Reference Grimmelikhuijsen and Meijer2022). A popular approach to counter citizens’ skepticism is objectively evaluating AI’s benefits (Cartwright and Hardie, Reference Cartwright and Hardie2012; Hortal, Reference Hortal2023). According to evidence-based policymaking (EBPM), such objective evidence can be provided through a rigorous evaluation of novel technology, for example, in pilot projects. The evaluation studies that emerge from such pilot projects are supposed to provide hard statistical evidence for the advantages and disadvantages of AI. This article examines how effective such objectification can be for the performance assessment of AI in three policy domains with varying levels of safety-criticality (Krafft et al., Reference Krafft, Zweig and König2022).
A fundamental premise of EBPM is that citizens, politicians and administrators can draw inferences from scientific evidence (Cartwright and Hardie, Reference Cartwright and Hardie2012; Oliver, Reference Oliver2013; Dhami and Sunstein, Reference Dhami and Sunstein2022). The validity of the data presented in a single report and the policy implications drawn from such an evaluation study may be open to debate, and citizens’ cognitive abilities to understand and process information may also vary. Nevertheless, any objectification strategy must presume that citizens can – in principle – correctly infer the information from simple statistical evidence presented in evaluation studies. Motivated reasoning challenges this fundamental premise of evidence-based public policy (e.g., Kunda, Reference Kunda1990; Mintz et al., Reference Mintz, Valentino and Wayne2021) and describes the phenomenon that people’s existing beliefs hinder their ability to interpret scientific evidence correctly (Kahan et al., Reference Kahan, Peters, Cantrell Dawson and Slovic2017: 56). Instead of updating their beliefs according to new evidence, they make biased evaluations in defense of their desired conclusions, even when scientific evidence suggests the opposite (Christensen and Moynihan, Reference Christensen and Moynihan2020: 3–4).
The presence of motivated reasoning has already been documented in various policy domains: people’s beliefs about the causes of climate change, for example, bias their evaluation of climate data (Druckman and McGrath, Reference Druckman and McGrath2019), and people’s ideological orientation biases their assessment of gun control policies (Redlawsk, Reference Redlawsk2002; Leeper and Slothuus, Reference Leeper and Slothuus2014; Martin et al., Reference Martin, James, Serritzlew and Van Ryzin2020) as well as their performance evaluation of public organizations (Baekgaard and Serritzlew, Reference Baekgaard and Serritzlew2016). This study tests whether citizens’ assessment of the performance of AI in three use cases is influenced by motivated reasoning. Specifically, we test whether motivated reasoning is present in the processing of information about the performance of AI and whether such biased information processing stems from citizens’ preference for state regulation of AI-based systems or their subjective attitudes toward AI.
Current research suggests that citizens’ skepticism toward AI-based technologies in the public domain tends to be conditional on the specific task delegated to or supported by AI (Wenzelburger et al., Reference Wenzelburger, König, Felfeli and Achtziger2024). Experimental studies presenting citizens with design options and regulatory frameworks for AI across multiple policy domains (Miller and Keiser, Reference Miller and Keiser2020; Kennedy et al., Reference Kennedy, Waggoner and Ward2022; König et al., Reference König, Felfeli, Achtziger and Wenzelburger2024; Grimmelikhuijsen, Reference Grimmelikhuijsen2023) confirm that technology-associated risk perceptions play a crucial role in AI acceptance. On the other hand, recent evidence on the moderating effect of ideological predispositions on regulatory preferences indicates a robust effect across contexts (Hemesath and Tepe, Reference Hemesath and Tepe2023, Reference Hemesath and Tepe2024a; König et al., Reference König, Wurster and Siewert2023). This literature shows that several preexisting beliefs influence citizens’ acceptance of AI in public policy: AI’s perceived threat to personal values and ethics (Kleizen et al., Reference Kleizen, Van Dooren, Verhoest and Tan2023), beliefs about the companies building or promoting AI for the public sector, and general beliefs about the role of technology in society (Hemesath and Tepe, Reference Hemesath and Tepe2024b).
This study contributes to this literature on AI in public policy (Gesk and Leyer, Reference Gesk and Leyer2022; Kleizen et al., Reference Kleizen, Van Dooren, Verhoest and Tan2023; Wenzelburger et al., Reference Wenzelburger, König, Felfeli and Achtziger2024) by examining citizens’ cognition of information about the performance of AI systems, testing whether citizens’ regulation preferences and attitudes toward AI are a relevant source of motivated reasoning about AI-based systems across varying public policy domains. Drawing on an established experimental design to detect motivated reasoning (Christensen and Moynihan, Reference Christensen and Moynihan2020), we conducted two online survey experiments among German citizens and examined motivated reasoning in three distinct public administration tasks that vary in complexity, safety criticality and normative considerations: routine public administration of allocating parking permits, the public safety approval of self-driving cars and the forecast of recidivism of incarcerated individuals.
Theoretical framework
Motivated reasoning builds on psychological insights (e.g., cognitive dissonance theory; Festinger, Reference Festinger1957) and describes the phenomenon that people’s preexisting beliefs shape their ability to process and interpret information, often in biased ways to protect these beliefs (Kunda, Reference Kunda1990; Taber and Lodge, Reference Taber and Lodge2006). The core assumption is that individuals’ reasoning is goal-driven, influencing their cognitive processing and judgment of information (Druckman and McGrath, Reference Druckman and McGrath2019). Rather than acknowledging novel information and updating their beliefs, people choose to optimize a tradeoff between accuracy and directional motives (Little, Reference Little2024). This study considers a narrow conceptualization of motivated reasoning as a biased evaluation of policy information.Footnote 1
Motivated reasoning about AI occurs if citizens are more likely to correctly infer statistical evidence about the performance of an AI in public policy if the evidence is congenial to their AI regulation preferences and less likely to correctly infer statistical evidence if it is uncongenial to these preferences. This mechanism offers an important addendum to existing research on public perceptions of AI, which often assumes rational evaluation of AI systems. Existing research indicates that individuals tend to base evaluations of AI on contextual cues, heuristics and prior dispositions (Schiff et al., Reference Schiff, Jackson Schiff and Pierson2022; Kleizen et al., Reference Kleizen, Van Dooren, Verhoest and Tan2023; Wenzelburger et al., Reference Wenzelburger, König, Felfeli and Achtziger2024). By directly targeting the cognitive mechanism of motivated reasoning, this study goes beyond prior research by investigating how citizens’ biases influence their interpretation of evidence about AI systems, offering a new lens to understand public perceptions of AI in policy contexts.
As a controversially discussed technology, AI intersects with key public values, such as fairness, privacy and redistribution, which are tied to individuals’ ideological orientations, trust in institutions, emotional investment and perceived risks (König and Wenzelburger, Reference König and Wenzelburger2021; Horvath et al., Reference Horvath, James, Banducci and Beduschi2023; Hemesath and Tepe, Reference Hemesath and Tepe2024b). These values make AI evaluations personally and politically salient, even in contexts where direct experiences with AI may be limited. Therefore, citizens may interpret information on the performance of an AI-based system in public policy primarily to affirm their existing beliefs about AI. Hence, motivated reasoning diminishes the likelihood that scientific evidence about the performance and reliability of AI in public policy will be assessed based on its true value. We subsequently hypothesize that citizens engage in motivated reasoning when they evaluate the performance of AI-based systems and interpret information on the performance of AI in public policy to reaffirm their existing beliefs. The baseline hypothesis states:
H1. Citizens engage in motivated reasoning when evaluating an AI-based system’s performance in public policymaking.
Two sources of directional motives toward AI
Motivated reasoning is a universal cognitive distortion that has been observed in various areas (Mintz et al., Reference Mintz, Valentino and Wayne2021). In this respect, one might say that the challenge is not whether motivated reasoning occurs but which motives are responsible for this bias in a particular context. This study tests two types of beliefs that might cause motivated reasoning toward AI in public policy.
First, citizens might have political beliefs about the general need to regulate AI and the role of the state. The public use of technology is connected to questions of resource distribution, public safety, public trust and state intervention. The state and corresponding regulations significantly influence when, how and in what form technology is and can be used. Moreover, the evaluation of AI performance in the context of EBPM occurs precisely to inform decisions about political regulation. Existing research on citizens’ preferences toward the regulation of AI, however, suggests that respondents’ self-reported ideological position has little explanatory power (Hemesath and Tepe, Reference Hemesath and Tepe2023, Reference Hemesath and Tepe2024a; König et al., Reference König, Wurster and Siewert2023). This sharply contrasts with many other policy fields (e.g., welfare politics, fiscal policy, moral policies), where citizens’ ideological orientation is crucial for policy preferences. Likewise, previous research on motivated reasoning by Christensen and Moynihan (Reference Christensen and Moynihan2020) and others (Baekgaard and Serritzlew, Reference Baekgaard and Serritzlew2016; Martin et al., Reference Martin, James, Serritzlew and Van Ryzin2020) shows that political views are responsible for motivated reasoning about the performance of public vs private organizations.
A possible explanation for these findings might be a lack of elite discourses (Druckman, Reference Druckman2012) on the regulation of AI in the past, reducing citizens’ ability to use partisan cues and heuristics for reasoning. With recent debates about how society should deal with AI and substantial legislation such as the European Union’s AI Act now passed, this issue should have become more politically salient for individuals (Laux et al., Reference Laux, Wachter and Mittelstadt2024). Specifically, we anticipate that citizens’ regulatory preference toward AI, that is, their preference for more or less state regulation of AI-based technologies, is a source of motivated reasoning toward AI-based systems in various policy domains.
H2. Citizens’ preference toward regulating AI-based technologies is a source of motivated reasoning biasing the evaluation of AI in public policymaking.
Second, rather than regulation preferences rooted in ideological beliefs, citizens might hold preexisting beliefs about AI at a more deeply rooted psychological level. Citizens’ emotions and subjective beliefs toward AI could represent a second source of motivated reasoning. Schepman and Rodway (Reference Schepman and Rodway2023: 2725–2776) conceptualize citizens’ general attitudes toward AI along a positive and a negative dimension: positive attitudes correspond to positive perceptions of the utility of AI (e.g., economic opportunities, improved performance), individual attitudes toward using such technologies (e.g., at work or in private life) and positive emotions (e.g., excitement, being impressed). Negative attitudes correspond to significant concerns about AI’s safety (e.g., unethical use, making errors) and strong negative emotions regarding AI (e.g., discomfort, finding AI sinister). In contrast to AI regulation preferences (more or less state regulation of AI-based systems), subjective AI attitudes are rooted at a subjective, emotional and psychological level. The technology acceptance model (TAM) is a popular approach to explaining consumers’ willingness to adopt technology (Davis, Reference Davis1989). However, when it comes to AI in the public sector, citizens are not consumers in the sense that they have the freedom to choose, nor does TAM capture the psychological aspects of attitudes toward AI. Schepman and Rodway (Reference Schepman and Rodway2023) show that their measure of AI attitudes correlates with respondents’ personality traits and generalized social trust. In light of this research pointing toward the importance of technology attitudes, we expect that citizens are more likely to correctly infer statistical evidence about the performance of an AI in public policy if the evidence is congenial to their subjective AI attitudes and less likely to correctly infer statistical evidence if it is uncongenial to these attitudes.
H3. Citizens’ subjective AI attitude is a source of motivated reasoning biasing the evaluation of AI in public policymaking.
While both regulation preferences and general attitudes toward AI are expected to bias citizens’ evaluations of AI systems, the literature on motivated reasoning suggests that emotionally stronger beliefs or attitudes may exert a greater influence (Kunda, Reference Kunda1990; Taber and Lodge, Reference Taber and Lodge2006). General attitudes toward AI are often deeply tied to subjective psychological and emotional factors, e.g., perceptions of fairness, risk and trust (Schepman and Rodway, Reference Schepman and Rodway2023), which are known to amplify motivated reasoning. In contrast, regulation preferences, which are connected to broader ideological beliefs about state intervention, may be weaker sources of motivated reasoning in this context, given the relatively low levels of politicization observed in related studies (Hemesath and Tepe, Reference Hemesath and Tepe2023; König et al., Reference König, Wurster and Siewert2023). Nonetheless, given the relative novelty of AI as a policy issue, this comparison remains an empirical question.
Research design
Policy domains
There is no comparative evidence on whether or how the policy domain influences motivated reasoning, or whether it is stable across varying contexts. Previous research suggests that respondents’ perceived issue salience moderates the magnitude of related cognitive biases such as framing (Lecheler et al., Reference Lecheler, de Vreese and Slothuus2009). Existing research on AI in public policy suggests that the area of application matters for citizens’ willingness to entrust tasks to AI-based systems. Some evidence points toward a general appreciation of AI-based services, which are in some instances even preferred over human decision-making (e.g., Logg et al., Reference Logg, Minson and Moore2019). Other studies revealed significant apprehensions about the use of AI in general (Dietvorst et al., Reference Dietvorst, Simmons and Massey2015) and in the public domain in particular (Wenzelburger et al., Reference Wenzelburger, König, Felfeli and Achtziger2024). Castelo et al. (Reference Castelo, Bos and Lehmann2019) suggest that algorithm apprehension is contingent on the nature of the task. They found that respondents were particularly reluctant to assign tasks to AI-based systems when considering them subjective, e.g., selecting clothing or writing a poem. The reason, they argue, is that algorithms are believed to lack human-like emotional characteristics. In contrast, allegedly objective tasks, quantifiable by figures, were more likely to be entrusted to AI-based systems. This is because algorithms are often perceived as more accurate and rational than human judgment. Wenzelburger et al. (Reference Wenzelburger, König, Felfeli and Achtziger2024: 42) provide further evidence that citizens’ acceptance of AI-based systems is contingent upon the concrete area of application. Studying the use of AI in policing and health care, they find that personal importance is an essential determinant of citizens’ acceptance of AI (Wenzelburger et al., Reference Wenzelburger, König, Felfeli and Achtziger2024: 54). In contrast, evidence on the moderating effect of beliefs and ideological predispositions on regulatory preferences for AI suggests that those beliefs have a relatively stable effect across various domains (Hemesath and Tepe, Reference Hemesath and Tepe2023, Reference Hemesath and Tepe2024a; König et al., Reference König, Wurster and Siewert2023). Hence, there are theoretical arguments for both possibilities: the sources of motivated reasoning may be context-dependent or stable across contexts.
Disentangling the various context factors (such as stakes, actor roles, task complexity and normative considerations) and potential case-specific sources presents a methodological challenge. This study starts from the opposite end and explores whether more general sources of motivated reasoning vary substantially across distinctly different use cases for AI. Specifically, we test whether citizens’ general preferences for regulation and subjective attitudes toward AI drive motivated reasoning in their evaluations of AI-based systems in three distinct public policy contexts. These contexts differ in the nature of the tasks, task complexity, the state’s role, their safety-criticality (Krafft et al., Reference Krafft, Zweig and König2022) and normative considerations. In all three cases, an AI-based system is used to fulfill a task that has so far been conducted by a human.
First, we investigate motivated reasoning in routine public service in municipal administration. Allocating parking permits represents a low-stakes, low-complexity case of state-citizen service encounters (Lipsky, Reference Lipsky2010). Previous research on citizens’ acceptance of AI in public services also considered standardized routine public services as a low-stake venue for digitalizing public services (Prokop and Tepe, Reference Prokop and Tepe2022; Grimmelikhuijsen, Reference Grimmelikhuijsen2023; Horvath et al., Reference Horvath, James, Banducci and Beduschi2023), particularly since some of these systems are already in trial use (e.g., semi-automated self-service terminalsFootnote 2). While technically relatively simple, such tasks can still provoke concerns about transparency and impartiality when citizens perceive unequal treatment.
Second, we investigate motivated reasoning in the context of self-driving cars. The approval of self-driving cars reflects a technically complex, regulatory function of the state that has considerable safety implications for individuals. In contrast to routine administrative processes, the state here no longer serves as a provider of services but is tasked with oversight. The safety implications of self-driving cars and how citizens’ risk perceptions affect their acceptance have been the topic of extensive research (Liu et al., Reference Liu, Yang and Xu2019). On a normative level, evidence (Bonnefon et al., Reference Bonnefon, Shariff and Rahwan2016; Awad et al., Reference Awad, Dsouza, Kim, Schulz, Henrich, Shariff and Rahwan2018) suggests citizens’ evaluation is highly emotional, as they prefer others to buy a self-driving car that sacrifices the passengers in favor of other traffic participants but would like to ride in a self-driving car that protects the passengers at all costs. This lack of trust is also reflected in people’s strong preferences for permanent human supervision of self-driving cars (Hemesath and Tepe, Reference Hemesath and Tepe2024a).
Third, we investigate the evaluation of AI for predicting the recidivism of incarcerated individuals. This scenario captures a less tangible and more abstract risk for personal safety relating to a fear of crime (Hart et al., Reference Hart, Chataway and Mellberg2022). Such systems are already in use in parts of the US judicial system (Dressel and Farid, Reference Dressel and Farid2018), and various studies have tested under which circumstances citizens are willing to rely on AI for predicting recidivism (Meijer and Wessels, Reference Meijer and Wessels2019; Hartmann and Wenzelburger, Reference Hartmann and Wenzelburger2021; Meijer et al., Reference Meijer, Lorenz and Wessels2021; Kennedy et al., Reference Kennedy, Waggoner and Ward2022; Wenzelburger et al., Reference Wenzelburger, König, Felfeli and Achtziger2024). These studies suggest that citizens are more likely to approve of such systems if they adhere to normative standards, such as transparency, and if human decision-makers have room to adjust and interpret the machine forecast instead of implementing it directly. Predicting recidivism using AI introduces profound ethical and normative concerns that extend beyond technical considerations (Hartmann and Wenzelburger, Reference Hartmann and Wenzelburger2021) and raise questions about fairness, bias and legitimacy, particularly when outcomes impact marginalized communities disproportionately. In this context, the dual nature of harm (potentially depriving defendants of liberty through false positives or jeopardizing public safety through false negatives) highlights its salience as a morally and politically ambivalent case. Competing concerns could shape citizens’ evaluations: protecting individual rights vs safeguarding public safety. Evaluations may thus align with individuals’ broader values and normative beliefs about justice and security. When the justice system fails to deliver impartial enforcement of laws, the state risks eroding its legitimacy, as citizens may perceive it as incapable of maintaining order or serving the common good (Tyler, Reference Tyler1990).
The three use cases are expected to provide sufficient variation to examine the stability of motivated reasoning across policy domains.
Experimental instrument
We use an established experimental instrument (Baekgaard and Serritzlew, Reference Baekgaard and Serritzlew2016; Baker et al., Reference Baker, Patel, Von Gunten, Valentine and Scherer2020; Christensen and Moynihan, Reference Christensen and Moynihan2020). In the following, we will explain how it was adjusted for the three use cases (see Figures 1–3).

Figure 1. Satisfaction with municipal service.

Figure 2. Safety approval of new cars.

Figure 3. Predicting recidivism of incarcerated persons.
Respondents are randomly assigned to one of four (I–IV) contingency tables for each use case (between-subjects design). The tables illustrate the results of a supposed evaluation study of either (1) citizens’ satisfaction with public service delivery at a municipal office, (2) the results of a safety reliability test of new types of vehicles or (3) the reliability of predictions made for determining recidivism of criminal offenders. For each use case, respondents evaluate the performance of two types of procedures: the placebo group (Tables I and II) sees neutral labeling of the procedures as Type A vs Type B, while the treatment groups (Tables III and IV) see the same results but evaluate a human decision-maker against an AI-based system. The key to correctly interpreting the presented data is to calculate the difference in ratio between the two rows (Baker et al., Reference Baker, Patel, Von Gunten, Valentine and Scherer2020: 204). The information in the contingency tables is unambiguous; the answers to which municipal office receives a better satisfaction rating, which type of vehicle is safer, or which recidivism prediction is more reliable can be identified as either correct or incorrect.
In Tables I and III, procedure Type B, respectively the AI-based system, performs better. In Tables II and IV, the numbers are reversed, so that procedure Type A, respectively the human decision-maker, performs better. In Figures 1–3, the numerically correct answer is marked in dark gray. In Tables I and II, which use the neutral frame, respondents’ ability to identify the correct answer should depend only on their numeracy. Hence, we should not observe motivated reasoning in those conditions. We utilize Tables I and II as a control or placebo test, offering a baseline against which the occurrence of motivated reasoning can be tested (Christensen and Moynihan, Reference Christensen and Moynihan2020: 8–9).Footnote 3 For the outcome variable, we ask respondents to select the correct statement on which system performs better: Type A or B in Tables I and II, or the AI system vs the human decision-maker in Tables III and IV. In each use case, the two answer options below the table were presented in random order to minimize potential bias in the dependent variable due to order effects (e.g., always picking the first option).
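To make the interpretation task concrete, consider a hypothetical contingency table in the style of this instrument (the numbers below are illustrative only, not the figures shown to respondents). The correct answer follows from comparing the share of positive outcomes in each row rather than the raw counts, which is precisely where motivated reasoning can interfere; a minimal sketch in Python:

```python
# Hypothetical contingency table for one condition (illustrative numbers only):
#                     satisfied   dissatisfied
# Human caseworker        188           62
# AI-based system         173           27

def positive_rate(positive: int, negative: int) -> float:
    """Share of positive outcomes in one row of the contingency table."""
    return positive / (positive + negative)

human = positive_rate(188, 62)  # 0.752
ai = positive_rate(173, 27)     # 0.865

# The human row has the larger absolute number of satisfied citizens (188 > 173),
# but the AI row has the higher satisfaction rate; the rate comparison, not the
# raw counts, identifies the correct answer.
better = "AI-based system" if ai > human else "Human caseworker"
print(f"Human: {human:.3f}, AI: {ai:.3f} -> better performance: {better}")
```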
Respondents’ technology attitudes are measured with two constructs (see Appendix Tables 3 and 4): First, we measure citizens’ preference toward the political regulation of AI using four statements: (1) Artificial intelligence should be more strictly regulated by the government. (2) Companies that produce artificial intelligence should be able to act on the market without restriction. (3) The government should take responsibility for ensuring that artificial intelligence benefits all people. (4) Each person should be responsible for judging artificial intelligence’s benefits and drawbacks. Second, we measure respondents’ attitudes toward AI using the General Attitudes toward Artificial Intelligence Scale (GAAIS) developed by Schepman and Rodway (Reference Schepman and Rodway2020, Reference Schepman and Rodway2023). Both item batteries are measured before respondents are asked to evaluate the information in the contingency table. At the end of the survey, all respondents were informed that the evidence presented in the contingency tables consisted of hypothetical examples (debriefing).
Samples and data analysis
The experiment was conducted on two samples of the German population drawn from the online access panel of ‘Bilendi’, employing quotas for age, gender, state of residence and level of education. The AI-based allocation of municipal parking permits was tested in Study 1; the safety approval of self-driving cars and the prediction of recidivism were tested in Study 2. An a priori power analysis assuming a small effect of f = 0.2 (alpha = 0.05; power = 0.8) resulted in a required sample size of n = 788 for both studies. Bilendi reimbursed participants as per their standing agreements. To increase sample quality, we included an attention check in the GAAIS item battery and excluded respondents who failed it. After these exclusions, Study 1 (municipal parking permits) included 1692 respondents, and Study 2 (self-driving cars and prediction of recidivism) comprised 2464 respondents.Footnote 4
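The reported sample-size target can be approximated with a standard a priori power calculation. The specification below is an assumption on our part: it treats the effect size of 0.2 as a standardized difference between two groups (Cohen's d), which is one way to arrive at a total n close to the 788 reported; the calculation actually used for the studies may have differed.

```python
# A priori power analysis sketch (assumed specification: two-group comparison,
# standardized effect size 0.2, alpha = 0.05, power = 0.8).
import math
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.2, alpha=0.05, power=0.8, ratio=1.0, alternative="two-sided"
)
print(math.ceil(n_per_group))      # 394 respondents per group
print(2 * math.ceil(n_per_group))  # 788 respondents in total
```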

Figure 4. AI regulation preference and AI attitudes.
The dependent variable measures whether the respondent chose the correct answer for a given contingency table (yes/no). The five items on AI regulation preferences show an alpha of 0.34. Since this is indicative of low internal consistency and little correlation between the individual items, we elected to use only the first item of that battery (‘Artificial intelligence should be more strictly regulated by the government.’ 1 strongly disagree, …, 5 = strongly agree) for further analysis, as it best captures our intended belief dimension. The GAAIS items (Schepman and Rodway, Reference Schepman and Rodway2023) were transformed into two indexes measuring positive and negative attitudes toward AI. For both subdimensions, we find strong internal consistency with an alpha of 0.94 for the positive scale and an alpha of 0.88 for the negative scale. We ran a factor analysis for each subdimension and extracted factor scores. The distribution of respondents’ preferences for AI regulation and the factor scores of the GAAIS subindexes are depicted in Figure 4.Footnote 5 Descriptive statistics on the distribution of socio-demographic variables (gender, age, education and region) from the two samples are summarized in Appendix Figure 1.
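The scale diagnostics described above can be sketched as follows. The snippet uses simulated Likert-type responses because the survey data are not reproduced here; the column names, the number of items and the simulated values are all hypothetical, and the software and estimator actually used may differ. It illustrates the textbook Cronbach's alpha formula and the extraction of one factor score per GAAIS subdimension.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import FactorAnalysis

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of the item sum)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Simulated 5-point Likert responses as a stand-in for the GAAIS items.
rng = np.random.default_rng(seed=1)
gaais_pos = pd.DataFrame(
    rng.integers(1, 6, size=(500, 12)),
    columns=[f"gaais_pos_{i}" for i in range(1, 13)],
)

print(f"alpha (positive subscale, simulated data): {cronbach_alpha(gaais_pos):.2f}")

# One-factor model per subdimension; the extracted factor scores feed into the
# congeniality measure used in the main analysis.
pos_factor_score = FactorAnalysis(n_components=1, random_state=0).fit_transform(gaais_pos).ravel()
```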
To identify motivated reasoning, we estimate the marginal effect (ME) of the treatment, conditional on congeniality. The ME is calculated based on two logistic interaction models that predict respondents’ correct choices. The binary logistic model for the baseline results takes the following form: Correct Choice = Congeniality × Treatment + Covariates + Error. Correct Choice is a dummy variable that equals one if the respondent picked the correct answer and zero if the answer was incorrect. We coded two different congeniality variables, one based on respondents’ AI regulation preference and the other based on the two GAAIS scales. For each measure, congeniality reports how strongly (rescaled to a range from 0 to 1) individuals’ attitudes are in accordance with the correct choice in each condition (e.g., the AI-based system is the correct answer, and respondents report a positive attitude toward AI). If respondents evaluated Tables I and III, where the AI performs better, the congeniality measure is the inverse of ‘stricter regulation of AI’ or, respectively, the index of the positive GAAIS subdimension. For Tables II and IV, where the human performs better, congeniality captures agreement with ‘stricter regulation of AI’ or, respectively, the negative GAAIS subdimension index. Treatment is a nominal variable indicating whether the respondent evaluated versions I and II (placebo group) or III and IV (treatment group) of the contingency tables. Since selecting the correct choice could also depend on individuals’ numeracy, we control for respondents’ level of education, age and gender (see Appendix Table 1).Footnote 6
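A minimal sketch of the congeniality coding and the baseline interaction model, again on simulated data; all variable names and values are hypothetical, and the snippet shows only the congeniality variable based on the regulation item (the GAAIS-based variant works analogously with the factor scores).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated respondent-level data; all variables and values are hypothetical.
rng = np.random.default_rng(seed=2)
n = 1500
df = pd.DataFrame({
    "table": rng.choice(["I", "II", "III", "IV"], size=n),  # assigned contingency table
    "correct": rng.integers(0, 2, size=n),                  # correct choice (1 = yes)
    "regulation": rng.integers(1, 6, size=n),               # 'AI should be more strictly regulated' (1-5)
    "age": rng.integers(18, 80, size=n),
    "female": rng.integers(0, 2, size=n),
    "education": rng.choice(["low", "medium", "high"], size=n),
})

# Treatment: Tables III/IV label the procedures as human vs AI; I/II are the placebo.
df["treatment"] = df["table"].isin(["III", "IV"]).astype(int)

# Congeniality (0-1): in Tables I/III the AI performs better, so the preference for
# stricter regulation is inverted; in Tables II/IV the human performs better.
regulation_01 = (df["regulation"] - 1) / 4
df["congeniality"] = np.where(df["table"].isin(["I", "III"]), 1 - regulation_01, regulation_01)

# Baseline model: Correct Choice = Congeniality x Treatment + Covariates + Error.
model = smf.logit(
    "correct ~ congeniality * treatment + age + female + C(education)", data=df
).fit()
print(model.summary())
```

Motivated reasoning then shows up as a congeniality slope in the treatment group that exceeds the (ideally flat) slope in the placebo group.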

Figure 5. Evaluation of an AI in allocating parking permits (Study 1).
Results
Baseline analysis
Findings on motivated reasoning in the three use cases are summarized in Figures 5–7. The left panel of each figure reports the results for the congeniality measure based on respondents’ preferences for stricter AI regulation, while the right panel reports the results based on the congeniality measure for respondents’ subjective attitudes toward AI (GAAIS). The y-axis denotes the probability that respondents chose the correct answer, while the x-axis represents the congeniality measure ranging from 0, indicating complete uncongeniality (respondents’ prior beliefs strongly contradict the objectively correct result), to 1, indicating high congeniality (respondents’ prior beliefs strongly align with the objectively correct result). The lines report the effect of congeniality on the probability that respondents chose the correct answer. The dotted orange line illustrates the effect in the placebo groups, and the solid blue line the effect in the treatment groups. To illustrate the distribution of varying levels of congeniality among participants, we included histograms detailing the frequency of observations.
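The treatment and placebo curves in Figures 5–7 correspond to predicted probabilities over the 0–1 congeniality range. Continuing the hypothetical sketch from the research design section (the fitted `model`, the covariate values and the grid resolution are assumptions), they could be generated as follows:

```python
import numpy as np
import pandas as pd

# Predict P(correct choice) over the congeniality range, separately for the
# placebo (treatment = 0) and treatment (treatment = 1) groups, holding the
# covariates at illustrative values.
grid = pd.DataFrame({
    "congeniality": np.tile(np.linspace(0, 1, 21), 2),
    "treatment": np.repeat([0, 1], 21),
    "age": 45,
    "female": 0,
    "education": "medium",
})
grid["p_correct"] = model.predict(grid)  # 'model' is the fitted logit from the sketch above
print(grid.pivot(index="congeniality", columns="treatment", values="p_correct").head())
```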
For the case of the allocation of municipal parking permits (Figure 5), we find that information that is in accordance with citizens’ AI regulation preferences hardly affects respondents’ likelihood of choosing the correct answer. While we see a positive slope in the treatment group, we find no significant differences from the placebo group. In contrast, for respondents’ subjective attitudes toward AI (GAAIS), we find a strong, symmetric effect of congeniality on respondents’ likelihood of choosing the correct answer. Compared to the placebo group, respondents in the treatment group are significantly more likely to select the correct answer if it aligns with their prior subjective AI attitudes. Respondents in the treatment groups are about 20 percentage points more likely to choose the correct answer if it is highly congenial to their initial beliefs (compared to respondents in the placebo group). At the same time, they are more than 20 percentage points less likely to select the correct choice if said choice is strongly uncongenial to their subjective AI attitudes. For both measures, we find no effect of congeniality in the placebo group, strengthening the robustness of the findings.
For the case of self-driving cars (Figure 6), we find a similarly strong effect of congeniality based on subjective attitudes toward AI as observed in the first scenario and a mild positive slope of congeniality based on attitudes toward technology regulation. While we observe no effect of congeniality in the placebo group for subjective attitudes toward AI, we observe a slight negative slope in the placebo group for congeniality based on preferences for AI regulation. While this strengthens the robustness of the finding on the impact of subjective attitudes, the positive effect of regulatory preferences should be interpreted with caution.

Figure 6. Evaluation of AI in self-driving cars’ safety (Study 2).

Figure 7. Evaluation of an AI in predicting recidivism (Study 2).
Last, for the case of predicting recidivism (Figure 7), we find that only information that is uncongenial to regulatory preferences significantly affects respondents’ choices, reducing their likelihood of correctly evaluating the scenario by up to 15 percentage points compared to the placebo group. We find no significant differences when the information provided is congenial to regulatory preferences. For subjective attitudes toward AI, we find a similarly asymmetrical pattern: the effect is strong and significant if the information is uncongenial to respondents’ attitudes, but we observe no significant differences between treatment groups and placebo groups if the information is highly congenial to subjective beliefs.
Overall, subjective attitudes toward AI (GAAIS) produce a robust pattern of motivated reasoning across all three use cases. When the information provided in the contingency table is uncongenial to individuals’ attitudes, individuals perform significantly worse than in the placebo groups. When the information is in accordance with respondents’ AI attitudes, they perform significantly better. In comparison, regulatory preferences are found to be only a selective source of motivated reasoning.
Secondary analysis
Our results indicate considerable differences in absolute correct choices between samples. While respondents in the first sample, the use case assessing AI in the allocation of parking permits, gave the correct answer on average 52% of the time (across treatment groups), this rate increases to 66% (self-driving cars) and 61% (prediction of recidivism) in the second sample. To probe whether this was caused by different sample quality, we examined whether our results were significantly moderated by individuals’ response time. Figures A2 and A3 report the results for both measures of congeniality, conditional on whether respondents answered with a low, medium or high response time (relative to the median response time). The lines represent the baseline effects for reference, the points illustrate the binned estimates for low, medium and high levels of congeniality, and the shapes indicate the low, medium and high response-time groups. For both measures, we find considerably stronger heterogeneity in the first sample, while results in the second sample are more robust across varying levels of response time. Disregarding individual outliers, which may reflect the limited power of this additional analysis (a three-way interaction), the general trend in the first sample nevertheless supports the conclusions drawn from the baseline results, as we find similar slopes across the different response time groups.
Second, we repeated the baseline analysis on binned subgroups to strengthen the robustness of this finding and rule out that the reported effects are due to overfitting (Hainmueller et al., Reference Hainmueller, Mummolo and Xu2019). We binned the congeniality measures based on AI regulation preferences and subjective attitudes toward AI into terciles and estimated the same regression models used in the main analysis (see Appendix Table 2). The estimation results using the binned congeniality measures are summarized in Appendix Figure 5. The robustness analysis strengthens our findings (also see Appendix Table 3).
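A sketch of this binning check, continuing the hypothetical data frame from the research design section: the continuous congeniality term is replaced by terciles interacted with the treatment, in the spirit of the binning estimator of Hainmueller et al. (2019); variable names remain assumptions.

```python
# Bin the congeniality measure into terciles and re-estimate the interaction model,
# so the treatment effect is identified within each bin rather than via a linear term.
df["congeniality_bin"] = pd.qcut(df["congeniality"], q=3, labels=["low", "medium", "high"])
binned_model = smf.logit(
    "correct ~ C(congeniality_bin) * treatment + age + female + C(education)", data=df
).fit()
print(binned_model.summary())
```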
Third, selecting only one of the items to measure preferences for regulation might induce unintended confounding by wording or comprehension. To strengthen the robustness of this measure, we repeated the analysis with the fourth item (‘Each person should be responsible themselves for judging artificial intelligence’s benefits and drawbacks’), which should yield an equivalent effect once congeniality is reversed. Appendix Figure 4 reports the results for this analysis, indicating effects that are similar in direction yet markedly more pronounced than with item 1. Here, we observe a strong symmetrical effect for the prediction of recidivism, meaning that individuals who strongly support this statement are significantly more likely to correctly evaluate the AI as better if the AI is better and significantly less likely to correctly evaluate the human as better if the human performs better.
Last, previous studies (Christensen and Moynihan, Reference Christensen and Moynihan2020) also tested for potential variation of motivated reasoning among subgroups (e.g., age, gender, level of education or political ideology). While the results from such analyses should be treated with caution, considering the issue of multiple comparisons and lack of statistical power induced by a limited sample size, we likewise tested for the moderating effect of age, gender, level of education and political ideology. The results of those tests (Appendix Figures 6a–9b) indicate no substantial variation among subgroups.
Discussion and conclusions
Objectifying the benefits of AI-based systems in the public sector through systematic evaluations is a critical element in policymakers’ strategies to overcome skepticism and increase the legitimacy of AI in public service provision (König and Wenzelburger, Reference König and Wenzelburger2021; Grimmelikhuijsen and Meijer, Reference Grimmelikhuijsen and Meijer2022). This study addressed how motivated reasoning about AI might limit the objectification of the pros and cons of AI in public policies. In a series of experiments, we tested whether citizens’ AI regulation preferences or their subjective attitudes toward AI cause motivated reasoning when they are asked to evaluate the performance of AI-based technology in three distinctive policy domains.
Experimental results from two preregistered studies conducted among German citizens confirm the existence of motivated reasoning about AI in public policy (H1), with two refinements: First, apart from the case of predicting recidivism, citizens’ regulatory preferences toward AI are not a vital source of motivated reasoning (H2). In contrast to other studies on motivated reasoning in public policies, e.g., public vs private provision of education and health services (Baekgaard and Serritzlew, Reference Baekgaard and Serritzlew2016; Martin et al., Reference Martin, James, Serritzlew and Van Ryzin2020), we do not find preferences for state regulation to be a significant source of motivated reasoning about AI in public services. This result suggests that explicit political or ideological preferences may not strongly influence citizens’ assessments of AI performance in the three policy contexts studied here. Nevertheless, this interpretation should be approached with caution. The analysis does not take into account specific political attitudes, such as beliefs about crime or trust in government, that might influence citizens’ reasoning. The findings for predicting recidivism draw attention to the possible influence of institutional trust or the salience of normative concerns, particularly in contexts where issues such as public safety and justice are closely linked to political attitudes (Harcourt, Reference Harcourt2007). Probing deeper, one could examine how specific political attitudes, such as beliefs about crime, public safety or government authority, interact with citizens’ evaluations of AI, as they represent an important dimension of trust and legitimacy in public policy.
Second, citizens’ subjective attitudes toward AI are a robust source of motivated reasoning about the performance of AI in public policies (H3). In contrast to AI regulation preferences, these attitudes are more deeply rooted on an emotional and psychological level, as the GAAIS has been shown to correlate with interpersonal trust and personality traits (Schepman and Rodway, Reference Schepman and Rodway2023). This might explain why the role of AI attitudes in motivated reasoning about AI in public policies does not differ substantially between the three use cases utilized in this study. Considering that the AI attitudes measured with the GAAIS are related to respondents’ personality and other psychological parameters, AI attitudes as a source of motivated reasoning toward using AI in public policies may be hard to overcome. Research on debiasing interventions to mitigate motivated reasoning based on prior political or ideological beliefs offers a fruitful theoretical perspective but with mixed empirical results (Christensen and Moynihan, Reference Christensen and Moynihan2020; Boissin et al., Reference Boissin, Caparos, Raoelison and De Neys2021). A follow-up consideration is how to design simple and effective interventions to mitigate the distorting effects of motivated reasoning about AI in public policy.
All of these results are based on a narrow conceptualization of motivated reasoning, asking respondents to identify a numerically correct answer from a contingency table. However, the data source itself may be perceived as a signal rather than an objective evaluation. In this case, motivated reasoning, understood as ‘once-motivated reasoning’, is less irrational (Little, Reference Little2024); instead, it becomes a process in which contradictory evidence is given less weight in respondents’ belief updating. Bayesian updating can be regarded as a complementary perspective on how individuals process information. It suggests that prior beliefs are not fixed but can be gradually updated in light of new evidence. Little (Reference Little2024) proposed a model combining Bayesian updating and ‘once-motivated reasoning’ about information signals, showing that subjects sometimes completely reject signals that lead to less pleasant beliefs. This model could better account for the increasing skepticism, even hostility, that can be observed toward scientific evaluations. To validate this promising model of the relationship between motivated reasoning and Bayesian updating, one could manipulate the source of the data in the contingency tables (e.g., evaluations by an independent scientific body or a partisan think tank).
This study is not without methodological limitations. First, this study is certainly limited by its measurement of regulatory preferences and attitudes toward AI. Future research could explore the robustness of the findings presented here by using alternative measures of respondents’ demand for regulation of AI (Heinrich and Witko, Reference Heinrich and Witko2024) and psychological resonance toward AI, such as the technophilia scale (Martínez-Córcoles et al., Reference Martínez-Córcoles, Teichmann and Murdvee2017). Second, this study does not investigate how motivated reasoning about AI influences real-world behaviors, such as voting or technology adoption. While such measures provide important insights into the cognitive processes underlying decision-making, future research should examine the behavioral implications of motivated reasoning on political decision-making in the case of AI. Finally, this study is based on survey data from a single country. While this limits the generalizability of findings to other institutional contexts and policy environments, Germany represents an interesting and appropriate case for examining motivated reasoning. Germans are known for their relatively high risk aversion, paired with comparatively high levels of trust in government and administration. These characteristics may amplify motivated reasoning about AI in public policy. A follow-up study could adopt a cross-country framework to explore whether our findings hold in countries with differing levels of trust in government, cultural attitudes toward technology and administrative traditions.
What policy implications can be drawn from these findings? This study should not be considered an argument for abandoning the ambition of EBPM despite the robust bias in citizens’ evaluations of AI performance (H1) caused by subjective attitudes toward AI. Instead, we believe these findings point to the limitations of policy objectification (Sunstein, Reference Sunstein2002; Oliver, Reference Oliver2013; Barak-Corren and Kariv-Teitelbaum, Reference Barak-Corren and Kariv-Teitelbaum2021; Dhami and Sunstein, Reference Dhami and Sunstein2022). Our study suggests that the evaluation of scientific data about the pros and cons of AI alone will not be sufficient to improve public acceptance among those with strong preexisting perceptions of AI. In addition to identifying effective debiasing measures, future studies should also address the question of how to target debiasing, i.e., how to provide objective information to groups that are receptive to it and support the evaluation of policy on its merits, whereas highly biased individuals are unlikely to be swayed by additional information in their evaluation of AI in the public sector. Finally, we have no reason to believe that motivated reasoning about AI based on subjective attitudes toward AI is limited to ordinary citizens rather than extending to public sector employees. Thus, if the phenomenon is general and also applies to public employees, their subjective attitudes toward AI in public policy should be addressed and carefully considered in strategies to digitize public services.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/bpp.2025.2.
Acknowledgements
This study was funded by the Niedersächsisches Vorab VWVN 1466 (Niedersächsisches Ministerium für Wissenschaft und Kultur). Previous versions of this study have been presented, for example, at the Tripartite Political Science Conference of the Swiss (SVPW), Austrian (ÖGPW) and German (DVPW) Political Science Associations at Linz University in 2023. We want to thank all participants for their valuable comments, which contributed to the revision of the study.