1 Introduction
Accuracy-first epistemology aims to justify all epistemic norms by showing that they can be derived from the rational pursuit of accuracy. Take, for example, probabilism—the norm that credence functions should be probability functions. Accuracy-firsters say non-probabilistic credences are irrational because they’re accuracy-dominated: For every non-probabilistic credence function, there’s some probabilistic credence function that’s more accurate no matter what.Footnote 1 Or take norms of updating, my topic in this paper. Accuracy-firsters aim to derive the rational updating rule by way of accuracy; specifically, they claim that the rational updating rule is the rule that maximizes expected accuracy.Footnote 2
Externalism, put roughly, says that we do not always know what our evidence is. Though far from universally accepted, externalism is a persuasive and widely held thesis, supported by a compelling vision about the kinds of creatures we are—creatures whose information-gathering mechanisms are fallible, and whose beliefs about most subject matters are not perfectly sensitive to the facts.
Schoenfield (2017) has shown that following the update rule Metaconditionalization maximizes expected accuracy.Footnote 3 However, as she and many other authors note, if externalism is true, Metaconditionalization is not Bayesian Conditionalization. Therefore, the externalist seems to face a dilemma: Either deny that Conditionalization is the rational update rule, thereby rejecting traditional Bayesian epistemology, or else deny that the rational update rule is the rule that maximizes expected accuracy, thereby rejecting the accuracy-first program. Call this the Bayesian Dilemma.Footnote 4
I’m not convinced by this argument. We’ll see that once we make the premises fully explicit, the argument relies on assumptions that the externalist should reject. Still, I think that the Bayesian Dilemma is a genuine dilemma. I give a new argument—I call it the continuity argument—that does not make any assumptions that the externalist rejects. Roughly, what I show is that if you’re sufficiently confident that you would follow Metaconditionalization if you adopted Metaconditionalization, then you’ll expect adopting a rule I’ll call Accurate Metaconditionalization to be more accurate than adopting Bayesian Conditionalization.
I’ll start in section 2 by introducing an accuracy-based framework for evaluating updating rules in terms of what I will call actual inaccuracy. In section 3, I’ll introduce externalism. In section 4, I turn to the Bayesian Dilemma. I present an argument purporting to show that the externalist must choose between Bayesian Conditionalization and accuracy-first epistemology, and I explain why the argument does not succeed. In section 5, I present the continuity argument showing that the Bayesian Dilemma is nevertheless a genuine dilemma. Section 6 concludes.
2 The accuracy framework: Actual inaccuracy
Accuracy-first epistemology says that our beliefs and credal states aim at accuracy, or closeness to the truth; that is, our beliefs and credal states aim to avoid inaccuracy, or distance from the truth. We said that, according to accuracy-firsters, the rational update rule is the rule that maximizes expected accuracy. There are different ways of making that thesis precise. In this section, I’ll present my own preferred way. We’ll start by getting the basics of the accuracy-first framework on the table.
2.1 Basics of the accuracy framework
For technical purposes, it is better to work with measures of inaccuracy rather than measures of accuracy. An inaccuracy measure $\mathbf{I}$ is a function that takes a world $w$ from a set of worlds $\Omega$, and a probability function $C$ defined over $\mathcal{P}(\Omega)$, and returns a number between 0 and 1. This number represents how inaccurate $C$ is in $w$. $C$ is minimally inaccurate if it assigns 1 to all truths and 0 to all falsehoods; $C$ is maximally inaccurate if it assigns 1 to all falsehoods and 0 to all truths.
The expected inaccuracy of a probability function $C$—relative to another probability function $P$—is a weighted average of $C$’s inaccuracy in all worlds, weighted by how likely it is, according to $P$, that those worlds obtain. Formally:
$$\mathbb{E}_P[\mathbf{I}(C)] = \sum_{w \in \Omega} P(w) \cdot \mathbf{I}(C, w)$$
I will make three assumptions about inaccuracy measures. Though these assumptions are not incontrovertible, they are standard in the accuracy-first literature, and I will not say much to justify them.Footnote 5
The first assumption is:
Strict Propriety
For any two distinct probability functions $P$ and $C$, $\mathbb{E}_C[\mathbf{I}(C)] < \mathbb{E}_C[\mathbf{I}(P)]$.
Strict Propriety says that probabilistic credence functions expect themselves to minimize inaccuracy. Strict Propriety is often motivated by appeal to the norm of immodesty—roughly, that rational agents should be doing best, by their own lights, in their pursuit of accuracy.
The second assumption is Additivity, which says, roughly, that the total inaccuracy score of a credence function at a world is the sum of the inaccuracy scores of each of its individual credences. More precisely:
Additivity
For any $H \in \mathcal{P}(\Omega)$, there is a local inaccuracy measure $i^H$ that takes a world $w \in \Omega$ and a credence $C(H)$ in the proposition $H$ to a real number such that:
$$\mathbf{I}(C, w) = \sum_{H \in \mathcal{P}(\Omega)} i^H_w\big(C(H)\big)$$
The third assumption is a continuity assumption for local inaccuracy measures. Specifically:
Continuity
$i^H_w(x)$ is a continuous function of $x$.
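As a concrete illustration (my own, not the paper’s), the Brier score is a standard inaccuracy measure that satisfies Strict Propriety, Additivity, and Continuity. The Python sketch below represents a credence function as a probability distribution over a finite set of worlds and normalizes by the number of propositions so that scores stay between 0 and 1; the three-world example and all function names are invented for illustration.

```python
from itertools import chain, combinations

def propositions(worlds):
    """All subsets of the set of worlds (the full algebra P(Omega))."""
    ws = list(worlds)
    return [frozenset(s) for s in chain.from_iterable(
        combinations(ws, r) for r in range(len(ws) + 1))]

def credence(C, H):
    """C's credence in proposition H: the total probability of H's worlds."""
    return sum(p for w, p in C.items() if w in H)

def brier_inaccuracy(C, w, props):
    """Brier inaccuracy of C at world w: mean squared distance from the truth values at w."""
    return sum((credence(C, H) - (1.0 if w in H else 0.0)) ** 2
               for H in props) / len(props)

def expected_inaccuracy(P, C, props):
    """Expected inaccuracy of C relative to P: a P-weighted average over worlds."""
    return sum(P[w] * brier_inaccuracy(C, w, props) for w in P)

# A toy three-world example.
worlds = ["w1", "w2", "w3"]
props = propositions(worlds)
P = {"w1": 0.5, "w2": 0.3, "w3": 0.2}
C = {"w1": 0.6, "w2": 0.2, "w3": 0.2}

# Strict Propriety: P expects itself to do strictly better than any distinct C.
print(expected_inaccuracy(P, P, props))  # smaller
print(expected_inaccuracy(P, C, props))  # larger
```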
Now that we know how to measure the inaccuracy of a credence function, we turn to updating rules. I will assume that a learning experience can be characterized by a unique proposition—the subject’s evidence. We define a learning situation as a complete specification of all learning experiences that an agent thinks she might undergo during a specific period of time—a specification of all of the propositions that the agent thinks she might learn during that time. Formally, a learning situation is an evidence function
$\mathsf{E}$ that maps each world $w$ to a proposition $\mathsf{E}(w)$, the subject’s evidence in $w$. I will write $[\mathsf{E} = \mathsf{E}(w)]$ for the proposition that the subject’s evidence is $\mathsf{E}(w)$:
$$[\mathsf{E} = \mathsf{E}(w)] = \{w' \in \Omega : \mathsf{E}(w') = \mathsf{E}(w)\}$$
We define an evidential updating rule as a function $g$ that takes a prior probability function $C$ and an evidence proposition $\mathsf{E}(w)$, and returns a credence function.Footnote 6 In the next two sections of the paper, we will be talking about two updating rules. The first is Bayesian Conditionalization.
Bayesian Conditionalization
$$g_{\rm cond}\big(C, \mathsf{E}(w)\big) = C\big(\cdot \mid \mathsf{E}(w)\big)$$
Bayesian Conditionalization says that you should respond to your evidence $\mathsf{E}(w)$ by conditioning on your evidence; for any proposition $H$, your new credence in $H$, upon receiving your new evidence, should be equal to your old credence in $H$ conditional on your new evidence. The second rule is Metaconditionalization.
Metaconditionalization
$$g_{\rm meta}\big(C, \mathsf{E}(w)\big) = C\big(\cdot \mid [\mathsf{E} = \mathsf{E}(w)]\big)$$
Metaconditionalization says that you should respond to your evidence $\mathsf{E}(w)$ by conditioning on the proposition that your evidence is $\mathsf{E}(w)$.
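To make the contrast concrete, here is a small Python sketch of my own (the clock-style evidence function is invented for illustration and is not the paper’s formal example) in which the two rules come apart: at some worlds the evidence $\mathsf{E}(w)$ is strictly weaker than the proposition $[\mathsf{E} = \mathsf{E}(w)]$.

```python
def conditionalize(C, H):
    """Bayesian conditioning of a prior C (dict world -> prob) on a proposition H."""
    total = sum(p for w, p in C.items() if w in H)
    return {w: (p / total if w in H else 0.0) for w, p in C.items()}

def E(w, worlds):
    """Evidence at w: the clock hand looks within one position of w (a blurry glimpse)."""
    return frozenset(v for v in worlds if abs(v - w) <= 1)

def evidence_is(w, worlds):
    """The proposition [E = E(w)]: the set of worlds whose evidence is exactly E(w)."""
    return frozenset(v for v in worlds if E(v, worlds) == E(w, worlds))

worlds = [1, 2, 3, 4, 5]
prior = {w: 1 / len(worlds) for w in worlds}

w = 3
print(E(w, worlds), evidence_is(w, worlds))          # {2, 3, 4} vs {3}
print(conditionalize(prior, E(w, worlds)))           # g_cond: condition on E(w)
print(conditionalize(prior, evidence_is(w, worlds))) # g_meta: condition on [E = E(w)]
# Because E(3) does not entail [E = E(3)], the two recommendations differ at world 3.
```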
2.2 Adopting rules and following rules
I will distinguish adopting an updating rule from following an updating rule. If you follow a rule, then your posterior credence function is the credence function that the rule recommends. If you adopt an updating rule, then you intend or plan to follow the rule. Of course, in general, we can intend or plan to do things without succeeding in doing those things. Intending or planning to follow an updating rule is no exception. We can intend or plan to follow an updating rule—in my terminology, we can adopt an updating rule—without following it.Footnote 7
To see how this might happen, consider Williamson’s well-known case of the unmarked clock.Footnote 8 Off in the distance you catch a brief glimpse of an unmarked clock. You can tell that the hand is pointing to the upper-right quadrant of the clock, but you can’t discern its exact location—your vision is good, but not perfect. What do you learn from this brief glimpse? What evidence do you gain? That—according to Williamson—depends on what the clock really reads. If the clock really reads 4:05, the evidence you gain is that the time is between (say) 4:04 and 4:06. If the clock really reads 4:06, the evidence you gain is that the time is between (say) 4:05 and 4:07. Suppose that you adopt Bayesian Conditionalization as your update rule, and that the clock in fact reads 4:05. Your evidence is that the time is between 4:04 and 4:06, but you mistakenly think that your evidence is that the time is between 4:05 and 4:07. As a result you misapply Bayesian Conditionalization; you condition on the wrong proposition.Footnote 9 Despite having adopted Bayesian Conditionalization as your update rule, you did not follow the rule.
The accuracy-first epistemologist says that the rational updating rule is the rule that minimizes expected inaccuracy. I said that there are different ways to make this precise. According to one common way of making it precise, the thesis is a claim about following updating rules (although the distinction between adopting and following is often not made explicit). At a first pass, we might understand this thesis as saying that we are rationally required to follow an updating rule that minimizes expected inaccuracy. But there is an immediate problem with this first-pass thesis, which others have recognized. Consider the omniscient updating rule, which tells you to assign credence 1 to all and only true propositions. The omniscient updating rule is less inaccurate than any other rule at every world, and so every probabilistic credence function expects it to uniquely minimize inaccuracy. But we do not want to say that we are rationally required to follow the omniscient updating rule. To avoid this implication, theorists refine the thesis by appeal to the notion of an available updating rule. The refined thesis says that we’re rationally required to follow an updating rule that is such that (i) following that rule is an available option, and (ii) following that rule minimizes expected inaccuracy among the available options.Footnote 10 Following the omniscient updating rule is not an available option and so we are not required to follow it.
To evaluate this proposal, we need to investigate the notion of availability at issue. A natural thought is that an act is available to you only if you are able to perform the act, and that you are able to perform an act if and only if, if you tried to perform the act, you would.Footnote 11 But on this understanding, even following Bayesian Conditionalization is not always an available option, according to the externalist. Return to the example of the unmarked clock. The clock in fact reads 4:05. Your evidence is therefore that the time is between 4:04 and 4:06. How do you update your credences? There are two cases. In the first case, you correctly identify your evidence, and as a result, you condition on your evidence. In this case, it is true that if you tried to follow Bayesian Conditionalization, you would. In the second case, you mistakenly take your evidence to be that the time is between 4:05 and 4:07, and as a result, you condition on the wrong proposition. In this case, it is not true that if you tried to follow Bayesian Conditionalization then you would, and so it is not true that you are able to follow Bayesian Conditionalization.
Of course, one might object to this account of ability. Rather than wade any further into this debate, I will simply observe that however we define availability, if we state the accuracy-first thesis in terms of following, we’ll be taking for granted that if you adopt an available updating rule, you will follow it; we’ll be ignoring possibilities in which you do not succeed in following your updating rule because you mistake your evidence. But the example of the unmarked clock suggests that cases like this are commonplace. We should take them into account. In light of this, I suggest that we understand the accuracy-first thesis as a thesis about which updating rule we are rationally required to adopt. To that end, we need to say how to evaluate the inaccuracy of adopting an updating rule.
2.3 Actual inaccuracy
I propose to measure the inaccuracy of adopting an updating rule in terms of what I will call actual inaccuracy.Footnote 12 Roughly, the actual inaccuracy of adopting an updating rule $g$ in a world $w$ is the inaccuracy, in $w$, of the credence function you would have if you adopted $g$ in $w$.Footnote 13 To give a more precise definition, I need to introduce credal selection functions.
A credal selection function is a function $f$ that takes an evidential updating rule $g$ and a world $w$, and returns a credence function—the credence function that the subject would have if she were to adopt the rule $g$ in world $w$.Footnote 14 Of course, any number of factors might play a role in determining what credence function a given subject would have if she were to adopt a certain updating rule. To keep things manageable, I am going to make some simplifying assumptions about how we are disposed to change our credal states if we adopt Bayesian Conditionalization or Metaconditionalization.
Return to the example of the unmarked clock. Suppose you adopt Bayesian Conditionalization. In fact, the clock reads 4:05 and so your evidence is that the time is between 4:04 and 4:06. How do you update your credences? There are, as before, two cases. In one case, you correctly identify your evidence: to use the terminology that I will from now on adopt, you guess correctly that your evidence is that the time is between 4:04 and 4:06. In this case, the conditional
(I) If you adopted Bayesian Conditionalization, you would follow Bayesian Conditionalization.
is true of you. In the second case, you guess incorrectly that your evidence is that the time is between 4:05 and 4:07. In this case, the conditional (I) is false—if you adopted Bayesian Conditionalization you would condition on the wrong proposition. I will assume that these are the only two cases. Either you guess correctly and condition on the right proposition, or else you guess incorrectly and condition on the wrong proposition.Footnote 15
To make this more precise, fix a set of worlds $\Omega$ and an evidence function $\mathsf{E}$ defined on $\Omega$. We will let $\mathsf{G}^{\mathsf{E}}$ be a guess function defined on $\Omega$. This is a function that takes each world $w$ to a proposition $\mathsf{G}^{\mathsf{E}}(w)$: the subject’s guess about what her evidence is in $w$.Footnote 16 Then, where $f_{C,\mathsf{G}^{\mathsf{E}}}$ is the credal selection function for any subject with guess function $\mathsf{G}^{\mathsf{E}}$ and prior $C$:Footnote 17
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big) = C\big(\cdot \mid \mathsf{G}^{\mathsf{E}}(w)\big) \tag{3}$$
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) = C\big(\cdot \mid [\mathsf{E} = \mathsf{G}^{\mathsf{E}}(w)]\big) \tag{4}$$
Equation (3) says that the credence function you would have if you adopted Bayesian Conditionalization in a world $w$, given that you have prior $C$ and guess function $\mathsf{G}^{\mathsf{E}}$, is the result of conditioning your prior $C$ on $\mathsf{G}^{\mathsf{E}}(w)$, your guess about what your evidence is in $w$. Likewise, (4) says that the credence function you would have if you adopted Metaconditionalization in a world $w$, given that you have prior $C$ and guess function $\mathsf{G}^{\mathsf{E}}$, is the result of conditioning your prior $C$ on the proposition that your evidence is $\mathsf{G}^{\mathsf{E}}(w)$, your guess about what your evidence is in $w$.
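Continuing the clock sketch from section 2.1 (and reusing its conditionalize, E, evidence_is, worlds, and prior), here is a minimal rendering of (3) and (4); the particular guess function, on which the subject mistakes her evidence only at world 3, is my own invention for illustration.

```python
def select_cond(C, guess, w):
    """(3): adopting Conditionalization, you condition on your guess about your evidence."""
    return conditionalize(C, guess[w])

def select_meta(C, guess, w, worlds):
    """(4): adopting Metaconditionalization, you condition on [E = G(w)], the
    proposition that your evidence is what you guess it is."""
    H = frozenset(v for v in worlds if E(v, worlds) == guess[w])
    return conditionalize(C, H)

# A fallible guess function: correct everywhere except world 3, where the subject
# mistakes E(4) for her evidence (as in the unmarked-clock case).
guess = {w: E(w, worlds) for w in worlds}
guess[3] = E(4, worlds)

print(select_cond(prior, guess, 3))   # conditions on the wrong proposition {3, 4, 5}
print(select_meta(prior, guess, 3, worlds))
```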
We will now use credal selection functions to define the actual inaccuracy of adopting an evidential updating rule. Let $g$ be any evidential updating rule. Let $\mathsf{G}^{\mathsf{E}}$ be any guess function. Let $C$ be any prior. We define $V_{C,\mathsf{G}^{\mathsf{E}}}(g, w)$, the actual inaccuracy, in $w$, of adopting rule $g$ given prior $C$ and guess function $\mathsf{G}^{\mathsf{E}}$, as follows:
Actual Inaccuracy
$$V_{C,\mathsf{G}^{\mathsf{E}}}(g, w) = \mathbf{I}\big(f_{C,\mathsf{G}^{\mathsf{E}}}(g, w),\, w\big)$$
The actual inaccuracy, in $w$, of adopting the updating rule $g$ given that you have guess function $\mathsf{G}^{\mathsf{E}}$ and prior $C$ is the inaccuracy, in $w$, of the credence function you would have if you adopted rule $g$ in $w$, given that $C$ is your prior and $\mathsf{G}^{\mathsf{E}}$ is your guess function.Footnote 18
Assuming (3), the actual inaccuracy of adopting Bayesian Conditionalization in a world $w$ for a subject with prior $C$ and guess function $\mathsf{G}^{\mathsf{E}}$ is
$$V_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big) = \mathbf{I}\big(C(\cdot \mid \mathsf{G}^{\mathsf{E}}(w)),\, w\big) \tag{5}$$
Assuming (4), the actual inaccuracy of adopting Metaconditionalization in a world $w$ for a subject with prior $C$ and guess function $\mathsf{G}^{\mathsf{E}}$ is
$$V_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) = \mathbf{I}\big(C(\cdot \mid [\mathsf{E} = \mathsf{G}^{\mathsf{E}}(w)]),\, w\big) \tag{6}$$
The expected actual inaccuracy of adopting Bayesian Conditionalization and of adopting Metaconditionalization are defined in (7) and (8), respectively:
$$\mathbb{E}_C\big[V_{C,\mathsf{G}^{\mathsf{E}}}(g_{\rm cond})\big] = \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big(C(\cdot \mid \mathsf{G}^{\mathsf{E}}(w)),\, w\big) \tag{7}$$
$$\mathbb{E}_C\big[V_{C,\mathsf{G}^{\mathsf{E}}}(g_{\rm meta})\big] = \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big(C(\cdot \mid [\mathsf{E} = \mathsf{G}^{\mathsf{E}}(w)]),\, w\big) \tag{8}$$
Let us now return to the accuracy-first thesis that the rational updating rule is the rule that does best in terms of accuracy. I have argued that this claim is best understood as a claim about which updating rule we should adopt. We can now make this claim more precise using the notion of actual inaccuracy. I propose to formulate the accuracy-first thesis, which I call Accuracy-First Updating, as follows:
Accuracy-First Updating
You are rationally required to adopt an evidential updating rule that minimizes expected actual inaccuracy.
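Continuing the Python sketch from earlier (and reusing brier_inaccuracy, propositions, conditionalize, E, worlds, prior, guess, select_cond, and select_meta), here is one way to compute the quantities in (7) and (8) and compare them in the spirit of Accuracy-First Updating; the helper names are mine and the numbers apply only to my invented toy model.

```python
def actual_inaccuracy(select, C, guess, w, props):
    """V(g, w): inaccuracy, at w, of the credences you would have if you adopted g at w."""
    return brier_inaccuracy(select(C, guess, w), w, props)

def expected_actual_inaccuracy(select, C, guess, props):
    """(7)/(8): the C-expected actual inaccuracy of adopting the rule encoded by `select`."""
    return sum(C[w] * actual_inaccuracy(select, C, guess, w, props) for w in C)

props = propositions(worlds)
# Wrap both rules so they share the signature (C, guess, w).
cond = lambda C, g, w: select_cond(C, g, w)
meta = lambda C, g, w: select_meta(C, g, w, worlds)

print(expected_actual_inaccuracy(cond, prior, guess, props))
print(expected_actual_inaccuracy(meta, prior, guess, props))
# Accuracy-First Updating tells the subject to adopt whichever available rule
# makes this expectation smallest.
```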
Let’s turn now to epistemic externalism.
3 Externalism
To characterize externalism, we need to first characterize internalism. Internalism says, roughly, that for certain special propositions, when those propositions are true we have a special kind of access to their truth. Let’s say that you have access to a proposition if and only if, whenever it is true, your evidence entails that it is true. Then internalism says that, for certain special propositions, whenever those propositions are true, your evidence entails that they are true. There are different brands of internalism, depending on what kinds of propositions are taken to be special. According to some, the special propositions are propositions about our own minds, such as the proposition that I am in pain. These internalists say that whenever I am in pain, my evidence entails that I am in pain—I can always tell that I am in pain by carefully attending to this evidence, my own experiences. In this paper we will be mainly interested in one form of internalism—evidence internalism. On this view, propositions about what our evidence is are special propositions in the sense that whenever they’re true, our evidence entails that they are true.
Evidence Internalism
If your evidence is the proposition $\mathsf{E}(w)$, then your evidence entails that your evidence is $\mathsf{E}(w)$.
Let evidence externalism be the denial of evidence internalism. More precisely:
Evidence Externalism
Sometimes, your evidence is some proposition $\mathsf{E}(w)$, but your evidence does not entail that your evidence is $\mathsf{E}(w)$.
Why accept evidence externalism? One standard argument appeals to our fallibility. The externalist says that all of our information-gathering mechanisms are fallible. Now, it is no surprise that our mechanisms specialized for detecting the state of our external environment—such as whether it is raining, or whether there is a computer on my desk—can lead us astray. What is controversial about externalism is its insistence that what is true of these propositions about my external environment is true of nearly all propositions, including the proposition that I am in pain or that I feel cold. The externalist says that, sometimes, I am feeling cold, but my mechanisms specialized for detecting feelings of coldness misfire, telling me that I am not feeling cold.
The externalist asks us to consider a case in which my information-gathering mechanisms have misfired. As a matter of fact, I’m feeling cold, but my mechanisms specialized for detecting feelings of coldness misfire, telling me that I’m not feeling cold. Since it is false that I’m not feeling cold, it is not part of my evidence that I’m not feeling cold. But I have no reason to believe that anything is amiss—it is not part of my evidence that it is not part of my evidence that I’m not feeling cold.Footnote 19
4 The Bayesian Dilemma and the externalist reply
In the introduction I said that some have argued that externalists face a dilemma, the Bayesian Dilemma: Either deny that we are rationally required to adopt Bayesian Conditionalization as our update rule or else deny that the rational update rule is the rule that maximizes expected accuracy, thereby rejecting the accuracy-first program. In this section, I present a core piece of that argument, Schoenfield’s result that you can expect following Metaconditionalization to be more accurate than following any other updating rule. But as we’ll see, this result cannot do the work that others have thought it can. It doesn’t follow from Schoenfield’s result that you expect adopting Metaconditionalization to be more accurate than adopting Bayesian Conditionalization, and I have argued that it is adopting, not following, that the accuracy-first updating thesis should concern.
Let’s begin by stating Schoenfield’s result.
Theorem 1 Let $\mathsf{E}$ be any learning situation. Consider any updating rule $g$ and any prior $C$ such that $g(C, \mathsf{E}(w)) \ne g_{\rm meta}(C, \mathsf{E}(w))$ for some $w$ such that $C(w) > 0$. Then
$$\sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big(g_{\rm meta}(C, \mathsf{E}(w)),\, w\big) < \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big(g(C, \mathsf{E}(w)),\, w\big)$$
Here is what Theorem 1 says. Consider any evidential updating rule $g$ that disagrees with Metaconditionalization in learning situation $\mathsf{E}$. Consider any subject who leaves open worlds where $g$ and Metaconditionalization disagree. Then, Theorem 1 says, the subject will expect the recommendation of Metaconditionalization to be strictly less inaccurate than the recommendation of $g$ in that learning situation.
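As a sanity check on Theorem 1, and again continuing my toy clock sketch (reusing conditionalize, E, evidence_is, worlds, prior, brier_inaccuracy, and propositions), the snippet below computes the prior expectation of the inaccuracy of following each rule; Theorem 1 predicts that the Metaconditionalization number is strictly smaller here, since the two rules disagree at worlds the uniform prior leaves open.

```python
def follow_cond(C, w):
    """What following Conditionalization recommends at w: condition on the true evidence E(w)."""
    return conditionalize(C, E(w, worlds))

def follow_meta(C, w):
    """What following Metaconditionalization recommends at w: condition on [E = E(w)]."""
    return conditionalize(C, evidence_is(w, worlds))

def expected_inaccuracy_of_following(follow, C, props):
    """The C-expected inaccuracy of the credences the rule recommends at each world."""
    return sum(C[w] * brier_inaccuracy(follow(C, w), w, props) for w in C)

props = propositions(worlds)
print(expected_inaccuracy_of_following(follow_meta, prior, props))
print(expected_inaccuracy_of_following(follow_cond, prior, props))
# Theorem 1 predicts the first expectation is strictly smaller than the second.
```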
But, as Schoenfield and others observe, if evidence externalism is true, Metaconditionalization is not Bayesian Conditionalization. Remember, Bayesian Conditionalization says that you should respond to your evidence $\mathsf{E}(w)$ by conditioning on $\mathsf{E}(w)$. Metaconditionalization says that you should respond to $\mathsf{E}(w)$ by conditioning on the proposition that your evidence is $\mathsf{E}(w)$, the proposition $[\mathsf{E} = \mathsf{E}(w)]$. If evidence externalism is true, then $\mathsf{E}(w)$ is not always the same proposition as $[\mathsf{E} = \mathsf{E}(w)]$. In particular, sometimes $\mathsf{E}(w)$ will not entail the proposition $[\mathsf{E} = \mathsf{E}(w)]$, and when this happens, Metaconditionalization and Bayesian Conditionalization will disagree.
Let $\mathsf{E}$ be any learning situation in which $[\mathsf{E} = \mathsf{E}(w)] \ne \mathsf{E}(w)$ for some world $w$. Consider any subject who leaves open some such worlds. Then Theorem 1 entails that the subject will expect the recommendation of Metaconditionalization to be less inaccurate than the recommendation of Bayesian Conditionalization in learning situation $\mathsf{E}$. Formally:
$$\sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big(g_{\rm meta}(C, \mathsf{E}(w)),\, w\big) < \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big(g_{\rm cond}(C, \mathsf{E}(w)),\, w\big) \tag{9}$$
But it doesn’t follow from Theorem 1 that the subject expects adopting—intending or planning to follow—Metaconditionalization to be less inaccurate than adopting Bayesian Conditionalization. That would follow from Theorem 1 only if we knew that the subject would follow Metaconditionalization if she adopted Metaconditionalization, and that she would follow Bayesian Conditionalization if she adopted Bayesian Conditionalization.
To see this, let $\mathsf{G}^{\mathsf{E}}$ be the subject’s guess function in learning situation $\mathsf{E}$. Let Guess Right be the proposition that the subject’s guess about her evidence in learning situation $\mathsf{E}$ is right. Formally:
$$Guess\ Right = \{w \in \Omega : \mathsf{G}^{\mathsf{E}}(w) = \mathsf{E}(w)\} \tag{10}$$
Let Guess Wrong be the proposition that the subject’s guess about her evidence in $\mathsf{E}$ is not right. Formally:
$$Guess\ Wrong = \{w \in \Omega : \mathsf{G}^{\mathsf{E}}(w) \ne \mathsf{E}(w)\} \tag{11}$$
Say that a subject with guess function $\mathsf{G}^{\mathsf{E}}$ is infallible in learning situation $\mathsf{E}$ if, for any $w \in \Omega$, Guess Right is true in $w$. If we assume that our subject is infallible in learning situation $\mathsf{E}$, then for all $w \in \Omega$,
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big) = g_{\rm cond}\big(C, \mathsf{E}(w)\big) \tag{12}$$
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) = g_{\rm meta}\big(C, \mathsf{E}(w)\big) \tag{13}$$
If (12) and (13) are true, then Theorem 1 entails that the subject expects adopting Metaconditionalization to be less inaccurate than adopting Bayesian Conditionalization. Formally:
$$\mathbb{E}_C\big[V_{C,\mathsf{G}^{\mathsf{E}}}(g_{\rm meta})\big] < \mathbb{E}_C\big[V_{C,\mathsf{G}^{\mathsf{E}}}(g_{\rm cond})\big] \tag{14}$$
But of course the externalist will insist that creatures like us are not infallible. Remember, the externalist says my beliefs about what evidence I have are not perfectly sensitive to the facts about what evidence I have. Return to the case of the unmarked clock. In fact my evidence is that the time is between 4:04 and 4:06. But my mechanisms specialized for detecting what evidence I have misfire, and so I mistakenly think that my evidence is some other proposition—that the time is between 4:05 and 4:07. Importantly, the externalist maintains that no amount of careful attention to my evidence will insure me against error. For the externalist, even ideally rational, maximally attentive agents are not always certain of the true answer to the question of what their evidence is. That is just to say that even ideally rational, maximally attentive agents are not always such that, if they adopted Metaconditionalization, they would follow Metaconditionalization.
In short, (13) is often false for agents like us—agents with fallible information-gathering mechanisms. But without (13), we can’t derive (14) from (9). We can’t conclude that, for fallible agents like us, adopting Metaconditionalization has lower expected actual inaccuracy than adopting Bayesian Conditionalization.
Let me summarize. If evidence externalism is true, then Theorem 1 tells us that, under certain conditions, we will expect following Metaconditionalization to be less inaccurate than following any other evidential updating rule. It doesn’t follow, however, that we expect adopting Metaconditionalization to be less inaccurate than adopting any other rule.Footnote 20 In particular, it doesn’t follow that we expect adopting Metaconditionalization to be less inaccurate than adopting Bayesian Conditionalization. That would follow only if we knew that we’re infallible, but we cannot, on pain of begging the question against the externalist, simply assume that this is so. So we have not shown that if evidence externalism is true, then we must choose between the rule that maximizes expected accuracy and Bayesian Conditionalization.Footnote 21
5 The Bayesian Dilemma reconsidered
In this section, I show that we can establish the Bayesian Dilemma without the assumption of infallibility. I give a new argument—I call it the continuity argument—showing that if you are sufficiently confident that you will correctly identify your evidence, then adopting a rule that I call Accurate Metaconditionalization will have lower expected actual inaccuracy for you than adopting Bayesian Conditionalization. In section 5.1 I’ll begin by saying what Accurate Metaconditionalization is, and then I’ll present the continuity argument. In section 5.2 I will consider whether other rules are immune to the continuity argument.
5.1 The continuity argument
Metaconditionalization said that you should respond to your evidence $\mathsf{E}(w)$ by conditioning on the proposition that your evidence is $\mathsf{E}(w)$. Accurate Metaconditionalization says that you should respond to your evidence $\mathsf{E}(w)$ by conditioning on the proposition that your evidence is $\mathsf{E}(w)$ and that you have guessed right. (Remember, $Guess\ Right = \{w \in \Omega : \mathsf{G}^{\mathsf{E}}(w) = \mathsf{E}(w)\}$.) More precisely:
Accurate Metaconditionalization
Where $C$ is any prior such that $C([\mathsf{E} = \mathsf{E}(w)] \mid Guess\ Right) > 0$ for all $w \in \Omega$,
$$g_{\rm acc}\big(C, \mathsf{E}(w)\big) = C\big(\cdot \mid [\mathsf{E} = \mathsf{E}(w)] \cap Guess\ Right\big)$$
For simplicity, I will assume:
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm acc}, w\big) = C\big(\cdot \mid [\mathsf{E} = \mathsf{G}^{\mathsf{E}}(w)] \cap Guess\ Right\big) \tag{15}$$
Equation (15) says that the credence function you would have if you adopted Accurate Metaconditionalization is the result of conditioning your prior on the proposition that your evidence is $\mathsf{G}^{\mathsf{E}}(w)$, your guess about what your evidence is in $w$, and that you have guessed right.
I am going to show that for a wide class of fallible subjects, if the subject is sufficiently confident that she will correctly identify her evidence, then adopting Accurate Metaconditionalization will have lower expected actual inaccuracy than adopting Bayesian Conditionalization for her. Here is roughly how the argument will go. I will begin by showing that we can state the expected actual inaccuracy of adopting an updating rule as a function of your credence $x$ in the proposition Guess Right. In particular, we can state the expected actual inaccuracy of adopting Accurate Metaconditionalization as a function of $x$, and we can state the expected actual inaccuracy of adopting Bayesian Conditionalization as a function of $x$. Importantly, both functions are continuous functions of $x$. We will show that when $x = 1$, adopting Bayesian Conditionalization has greater expected actual inaccuracy than adopting Accurate Metaconditionalization. Since both functions are continuous, it follows that there is some $\delta > 0$ such that if $x > 1 - \delta$, then adopting Bayesian Conditionalization has greater expected actual inaccuracy than adopting Accurate Metaconditionalization.
Let’s now turn to the details. To begin, I am going to introduce and define a new kind of function, which I’ll call a probability extension function. We can think of a probability extension function as a specification of the conditional credences of some hypothetical subject, conditional on each member of the partition $\{Guess\ Right, Guess\ Wrong\}$ that the subject leaves open. We then feed the probability extension function a possible credence $x$ in Guess Right (a real number between 0 and 1) and the function returns a (complete) probability function—the probability function determined by the conditional credence specifications, together with $x$.
To make this more precise, fix a set of worlds $\Omega$. Let $\mathsf{E}$ be any evidence function, and let $\mathsf{G}^{\mathsf{E}}$ be any guess function. Let $\Delta$ be the set of probability functions over $\mathcal{P}(\Omega)$. We define $\Delta_{\rm Right}$ as
$$\Delta_{\rm Right} = \{P \in \Delta : P(Guess\ Right) = 1\} \tag{16}$$
and we define $\Delta_{\rm Wrong}$ in a similar way:
$$\Delta_{\rm Wrong} = \{P \in \Delta : P(Guess\ Wrong) = 1\} \tag{17}$$
For each pair $\langle P_{\rm R}, P_{\rm W}\rangle$ consisting of a $P_{\rm R} \in \Delta_{\rm Right}$ and a $P_{\rm W} \in \Delta_{\rm Wrong}$, we define a probability extension function $\lambda_{\langle P_{\rm R}, P_{\rm W}\rangle}$ as a function that takes a real number $x$ between 0 and 1 and returns a probability function $\lambda_{\langle P_{\rm R}, P_{\rm W}\rangle}(x)$ over $\mathcal{P}(\Omega)$ defined as follows:
$$\lambda_{\langle P_{\rm R}, P_{\rm W}\rangle}(x) = x \cdot P_{\rm R} + (1 - x) \cdot P_{\rm W} \tag{18}$$
Each probability extension function is indexed to a pair $\langle P_{\rm R}, P_{\rm W}\rangle$. In what follows I will leave off the subscripts for the sake of readability.
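Here is a minimal, self-contained Python sketch of a probability extension function, under the assumption (mine, for illustration) that probability functions are represented as distributions over a finite set of worlds; the particular $P_{\rm R}$ and $P_{\rm W}$ below are made-up members of $\Delta_{\rm Right}$ and $\Delta_{\rm Wrong}$.

```python
def extension(P_R, P_W):
    """Probability extension function for the pair <P_R, P_W>: mixes the conditional
    credences given Guess Right and Guess Wrong, with weight x on Guess Right."""
    def ext(x):
        return {w: x * P_R.get(w, 0.0) + (1 - x) * P_W.get(w, 0.0)
                for w in set(P_R) | set(P_W)}
    return ext

# Guess Right = {1, 2, 4, 5}, Guess Wrong = {3} in this made-up example.
P_R = {1: 0.25, 2: 0.25, 4: 0.25, 5: 0.25}  # a member of Delta_Right
P_W = {3: 1.0}                              # a member of Delta_Wrong

lam = extension(P_R, P_W)
print(lam(1.0))  # agrees with P_R: full confidence in Guess Right
print(lam(0.9))  # 0.9 credence in Guess Right, conditional credences unchanged
```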
We can use probability extension functions to specify the expected actual inaccuracy of adopting an updating rule, for some subject, as a function of her credence in Guess Right. To see this, fix a learning situation $\mathsf{E}$, a guess function $\mathsf{G}^{\mathsf{E}}$, and an evidential updating rule $g$. Each probability extension function $\lambda$ determines a function that takes a credence $x$ in Guess Right and returns the expectation, relative to $\lambda(x)$, of the actual inaccuracy of adopting rule $g$ given guess function $\mathsf{G}^{\mathsf{E}}$. For example, consider
$$x \mapsto \sum_{w \in \Omega} \lambda(x)(w) \cdot \mathbf{I}\big[f_{\lambda(x),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big),\, w\big] \tag{19}$$
This is a function that takes a credence $x$ in Guess Right and returns the expectation, relative to $\lambda(x)$, of the actual inaccuracy of adopting Metaconditionalization given guess function $\mathsf{G}^{\mathsf{E}}$. Similarly, we have
$$x \mapsto \sum_{w \in \Omega} \lambda(x)(w) \cdot \mathbf{I}\big[f_{\lambda(x),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big] \tag{20}$$
This is a function that takes a credence $x$ in Guess Right and returns the expectation, relative to $\lambda(x)$, of the actual inaccuracy of adopting Bayesian Conditionalization given guess function $\mathsf{G}^{\mathsf{E}}$.
For any probability extension function $\lambda$, we define $\mathcal{C}_\lambda$ as follows:
$$\mathcal{C}_\lambda = \{\lambda(x) : x \in [0, 1]\} \tag{21}$$
We’re thinking of $\lambda$ as a specification of the conditional credences of some hypothetical subject, conditional on each member of $\{Guess\ Right, Guess\ Wrong\}$ that the subject leaves open. We can then think of $\mathcal{C}_\lambda$ as the set of all probability functions that agree with $\lambda$ with respect to those assignments of conditional credences. Importantly, every probability function $C \in \Delta$ belongs to $\mathcal{C}_\lambda$ for some probability extension function $\lambda$.Footnote 22
We will show that for any probability extension function $\lambda$ satisfying certain constraints, and any probability function $C$ in $\mathcal{C}_\lambda$, if $C(Guess\ Right)$ is sufficiently high, then the expected actual inaccuracy, relative to $C$, of adopting Accurate Metaconditionalization will be lower than the expected actual inaccuracy of adopting Bayesian Conditionalization. More precisely:
Theorem 2 Let $\mathsf{E}$ be any learning situation, $\mathsf{G}^{\mathsf{E}}$ any guess function, and $\lambda$ any probability extension function such that:
- (i) $\lambda(1)(\mathsf{E}(w)) > 0$ for all $w \in \Omega$;
- (ii) $\lambda(1)([\mathsf{E} = \mathsf{E}(w)]) > 0$ for all $w \in \Omega$;
- (iii) $g_{\rm meta}(\lambda(1), \mathsf{E}(w)) \ne g_{\rm cond}(\lambda(1), \mathsf{E}(w))$ for some $w \in Guess\ Right$.
Then there’s a $\delta_\lambda > 0$ such that, for all $C \in \mathcal{C}_\lambda$, if $C(Guess\ Right) > 1 - \delta_\lambda$, then
$$\sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big[f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm acc}, w\big),\, w\big] < \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big[f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big]$$
The proof of Theorem 2 relies on a lemma.
Lemma 1 Let $\mathsf{E}$ be any learning situation, $\mathsf{G}^{\mathsf{E}}$ any guess function, and $\lambda$ any probability extension function satisfying conditions (i) and (ii) in our statement of Theorem 2. Then
- (i) $\sum_{w \in \Omega} \lambda(x)(w) \cdot \mathbf{I}\big[f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big),\, w\big]$
- (ii) $\sum_{w \in \Omega} \lambda(x)(w) \cdot \mathbf{I}\big[f_{\lambda(x),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big]$
are both continuous (as functions of $x$) at 1.
I leave the proof of Lemma 1 to the appendix.
Proof of Theorem 2. Consider any learning situation $\mathsf{E}$, any guess function $\mathsf{G}^{\mathsf{E}}$, and any probability extension function $\lambda$ satisfying (i), (ii), and (iii). It follows from Theorem 1 that
$$\sum_{w \in \Omega} \lambda(1)(w) \cdot \mathbf{I}\big[g_{\rm meta}\big(\lambda(1), \mathsf{E}(w)\big),\, w\big] < \sum_{w \in \Omega} \lambda(1)(w) \cdot \mathbf{I}\big[g_{\rm cond}\big(\lambda(1), \mathsf{E}(w)\big),\, w\big] \tag{22}$$
This says that any subject whose prior is $\lambda(1)$ expects following Metaconditionalization in learning situation $\mathsf{E}$ to have lower expected inaccuracy than following Bayesian Conditionalization in learning situation $\mathsf{E}$. Note that $\lambda(1)(Guess\ Right) = 1$. This means that, for all $w \in \Omega$ such that $\lambda(1)(w) > 0$,
$$f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) = g_{\rm meta}\big(\lambda(1), \mathsf{E}(w)\big) \tag{23}$$
$$f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big) = g_{\rm cond}\big(\lambda(1), \mathsf{E}(w)\big) \tag{24}$$
Given (23) and (24), (22) entails
$$\sum_{w \in \Omega} \lambda(1)(w) \cdot \mathbf{I}\big[f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big),\, w\big] < \sum_{w \in \Omega} \lambda(1)(w) \cdot \mathbf{I}\big[f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big] \tag{25}$$
This says that any subject whose prior is $\lambda(1)$ and whose guess function is $\mathsf{G}^{\mathsf{E}}$ expects adopting Metaconditionalization in learning situation $\mathsf{E}$ to have lower expected inaccuracy than adopting Bayesian Conditionalization in learning situation $\mathsf{E}$.
Equation (25) and Lemma 1 together entail that there’s a $\delta_\lambda > 0$ such that, for all $x > 1 - \delta_\lambda$,
$$\sum_{w \in \Omega} \lambda(x)(w) \cdot \mathbf{I}\big[f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big),\, w\big] < \sum_{w \in \Omega} \lambda(x)(w) \cdot \mathbf{I}\big[f_{\lambda(x),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big] \tag{26}$$
We know that for all $C \in \mathcal{C}_\lambda$, $C = \lambda(C(Guess\ Right))$. Therefore, it follows from (26) that there’s a $\delta_\lambda > 0$ such that, for all $C \in \mathcal{C}_\lambda$, if $C(Guess\ Right) > 1 - \delta_\lambda$, then
$$\sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big[f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big),\, w\big] < \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big[f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big] \tag{27}$$
This says that for any subject whose prior probability function is in $\mathcal{C}_\lambda$, if the subject is sufficiently confident in Guess Right, then she will expect adopting Metaconditionalization with respect to $\lambda(1)$ to have strictly lower actual inaccuracy than adopting Bayesian Conditionalization with respect to her own prior. Remember, we’re assuming that
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm acc}, w\big) = C\big(\cdot \mid [\mathsf{E} = \mathsf{G}^{\mathsf{E}}(w)] \cap Guess\ Right\big)$$
We are also assuming that
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) = C\big(\cdot \mid [\mathsf{E} = \mathsf{G}^{\mathsf{E}}(w)]\big)$$
It follows that
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm acc}, w\big) = f_{C(\cdot \mid Guess\ Right),\,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) \tag{28}$$
We know that, for all $C \in \mathcal{C}_\lambda$, if $C(Guess\ Right) > 0$ then
$$C\big(\cdot \mid Guess\ Right\big) = \lambda(1) \tag{29}$$
Equations (28) and (29) together entail that for all $C \in \mathcal{C}_\lambda$, if $C(Guess\ Right) > 0$, then
$$f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm acc}, w\big) = f_{\lambda(1),\mathsf{G}^{\mathsf{E}}}\big(g_{\rm meta}, w\big) \tag{30}$$
Combining (30) with (27), there’s a $\delta_\lambda > 0$ such that, for all $C \in \mathcal{C}_\lambda$, if $C(Guess\ Right) > 1 - \delta_\lambda$, then
$$\sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big[f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm acc}, w\big),\, w\big] < \sum_{w \in \Omega} C(w) \cdot \mathbf{I}\big[f_{C,\mathsf{G}^{\mathsf{E}}}\big(g_{\rm cond}, w\big),\, w\big] \tag{31}$$
This says that for any subject whose prior probability function is in $\mathcal{C}_\lambda$ and whose guess function is $\mathsf{G}^{\mathsf{E}}$, if the subject is sufficiently confident in Guess Right, then she will expect adopting Accurate Metaconditionalization in learning situation $\mathsf{E}$ to have strictly lower actual inaccuracy than adopting Bayesian Conditionalization in learning situation $\mathsf{E}$. This completes the proof of Theorem 2.
Let’s take stock. In section 4, I presented Schoenfield’s result showing that following Metaconditionalization has greater expected accuracy than following Bayesian Conditionalization. But, I argued, we cannot conclude from this fact that adopting Metaconditionalization has greater expected actual accuracy than adopting Bayesian Conditionalization. That would follow only if we said that we’re infallible in every learning situation, and we cannot, on pain of begging the question against the externalist, assume that this is so. In this section I have shown that we can do without the assumption of infallibility. Theorem 2 shows that for a wide class of fallible subjects and learning situations, if the subject is sufficiently confident that she will correctly identify her evidence in that learning situation, then adopting Accurate Metaconditionalization will have greater expected actual accuracy for her than adopting Bayesian Conditionalization.Footnote 23
This is not good news for the project of reconciling accuracy-first externalism with Bayesian epistemology. The externalist who wishes to justify Bayesian Conditionalization on the basis of accuracy should hope to find a natural class of fallible agents for whom Bayesian Conditionalization is the most accurate updating procedure in expectation. We should be pessimistic about the prospects for this project on the basis of the results of this paper. Theorem 2 shows that adopting Accurate Metaconditionalization will have greater expected actual accuracy than adopting Conditionalization for some agents in any such class—so long as it includes agents who are sufficiently confident that they will correctly identify their evidence, and I can see no principled reason to exclude all such agents.
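To get a feel for how the continuity argument plays out numerically, here is a self-contained Python sketch of a toy model of my own devising (a five-world, clock-like learning situation with one Guess Wrong world) that satisfies conditions (i)–(iii) of Theorem 2; it scans the subject’s credence $x$ in Guess Right and compares the expected actual inaccuracy of adopting Accurate Metaconditionalization with that of adopting Bayesian Conditionalization. All names and numbers are illustrative assumptions, not the paper’s.

```python
from itertools import chain, combinations

worlds = [1, 2, 3, 4, 5]
# Evidence function: worlds 1-2 share evidence {1,2}, worlds 3-4 share {3,4},
# and world 5 has evidence {4,5}, which it shares with no other world.
E = {1: frozenset({1, 2}), 2: frozenset({1, 2}),
     3: frozenset({3, 4}), 4: frozenset({3, 4}), 5: frozenset({4, 5})}
# Guess function: correct everywhere except world 2, where the subject mistakes
# {3,4} for her evidence.  So Guess Right = {1,3,4,5}, Guess Wrong = {2}.
G = dict(E)
G[2] = frozenset({3, 4})
guess_right = frozenset(w for w in worlds if G[w] == E[w])

props = [frozenset(s) for s in chain.from_iterable(
    combinations(worlds, r) for r in range(len(worlds) + 1))]

def credence(C, H):
    return sum(p for w, p in C.items() if w in H)

def conditionalize(C, H):
    total = credence(C, H)
    return {w: (p / total if w in H else 0.0) for w, p in C.items()}

def brier(C, w):
    """Normalized Brier inaccuracy of C at w."""
    return sum((credence(C, H) - (1.0 if w in H else 0.0)) ** 2
               for H in props) / len(props)

def evidence_is(H):
    """[E = H]: the proposition that the evidence is H."""
    return frozenset(v for v in worlds if E[v] == H)

def adopt_cond(C, w):
    """(3): condition the prior on the guessed evidence."""
    return conditionalize(C, G[w])

def adopt_acc_meta(C, w):
    """(15): condition the prior on [E = G(w)] and Guess Right."""
    return conditionalize(C, evidence_is(G[w]) & guess_right)

def expected_actual_inaccuracy(adopt, C):
    return sum(C[w] * brier(adopt(C, w), w) for w in worlds if C[w] > 0)

# Probability extension function: P_R uniform on Guess Right, P_W concentrated on world 2.
def lam(x):
    C = {w: (x * 0.25 if w in guess_right else 0.0) for w in worlds}
    C[2] += 1 - x
    return C

for x in [0.80, 0.90, 0.95, 0.99, 1.00]:
    C = lam(x)
    print(x,
          round(expected_actual_inaccuracy(adopt_acc_meta, C), 4),
          round(expected_actual_inaccuracy(adopt_cond, C), 4))
# At x = 1 the Accurate Metaconditionalization expectation is strictly smaller
# (0.0625 vs 0.09375 in this model), and Theorem 2 guarantees that the strict
# inequality persists for all x close enough to 1.
```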
5.2 Guess conditionalization
Let me take a moment to address a concern about the significance of this result, and its relationship to other results in the literature. Those who have read Gallow (2021) or Isaacs and Russell (2023) might wonder: Haven’t these authors already shown us how fallible agents should update their credences? Gallow (2021) argues that we can use a version of a result due to Greaves and Wallace (2006) to show that a rule that I will call Guess Conditionalization is the best rule for fallible agents.Footnote 24
Guess Conditionalization
$$g_{\rm guess}\big(C, \mathsf{E}(w)\big) = C\big(\cdot \mid \mathsf{G}^{\mathsf{E}}(w)\big)$$
However, I believe that the argument that Gallow is alluding to requires certain assumptions about the nature of our fallibility that the externalist should reject. To see this, remember that our guess function $\mathsf{G}^{\mathsf{E}}$ is a function that takes each world $w$ to the subject’s guess, in $w$, about what her evidence is. If we are interested in subjects who are trying to follow Guess Conditionalization, we need another guess function $\mathsf{G}^{\mathsf{G}^{\mathsf{E}}}$ that takes each world $w$ to the subject’s guess, in $w$, about what her guess is in $w$. Let us assume that
$$f_{C,\mathsf{G}^{\mathsf{G}^{\mathsf{E}}}}\big(g_{\rm guess}, w\big) = C\big(\cdot \mid \mathsf{G}^{\mathsf{G}^{\mathsf{E}}}(w)\big) \tag{32}$$
This says that the credence function you would have if you adopted Guess Conditionalization in learning situation $\mathsf{E}$ is the result of conditioning your prior on your guess about what your guess is.Footnote 25 With this assumption in place, the Greaves and Wallace–style argument that Guess Conditionalization is the best rule for fallible agents requires us to assume that subjects are guess-infallible:
$$\mathsf{G}^{\mathsf{G}^{\mathsf{E}}}(w) = \mathsf{G}^{\mathsf{E}}(w)\ \text{for all}\ w \in \Omega \tag{33}$$
If we assume that our subject is guess-infallible then, for all $w \in \Omega$,
$$f_{C,\mathsf{G}^{\mathsf{G}^{\mathsf{E}}}}\big(g_{\rm guess}, w\big) = g_{\rm guess}\big(C, \mathsf{E}(w)\big) \tag{34}$$
This says that if the subject adopted Guess Conditionalization, then she would follow Guess Conditionalization.
But the externalist should insist that creatures like us are not guess-infallible. According to the externalist, my beliefs about what I have guessed are not perfectly sensitive to the facts about what I have guessed, and, importantly, no amount of careful attention to my guesses will insure me against error. Even ideally rational, maximally attentive agents are not always certain of the true answer to the question of what their guess is. That is just to say that even ideally rational, maximally attentive agents are not always such that, if they adopted Guess Conditionalization, they would follow Guess Conditionalization. In short, (34) is often false for agents like us—agents with fallible information-gathering mechanisms. But without (34), we can’t use the Greaves and Wallace–style argument that Gallow is alluding to in order to show that adopting Guess Conditionalization has lower expected actual inaccuracy than adopting any other rule.
6 Conclusion
It’s been said that accuracy-first epistemology poses a special threat to externalism. Schoenfield (2017) shows that the rule that maximizes expected accuracy is Metaconditionalization. But if externalism is true, Metaconditionalization is not Bayesian Conditionalization. Thus, externalists seem to face a dilemma, which I have called the Bayesian Dilemma: Either deny that Bayesian Conditionalization is required or else deny that the rational update rule is the rule that maximizes expected accuracy. I am not convinced by this argument. Schoenfield’s result shows that following Metaconditionalization has greater expected accuracy than following Bayesian Conditionalization. It doesn’t follow that adopting Metaconditionalization has greater expected accuracy than adopting Bayesian Conditionalization. That would follow only if we also said that if you adopted Metaconditionalization, you would follow Metaconditionalization. But the externalist has every reason to deny that this is always so. I have argued that the Bayesian Dilemma is nevertheless a genuine dilemma. I presented a new argument that does not make any assumptions that the externalist must reject. This argument shows that, for a wide class of fallible subjects, if the subject is sufficiently confident that she will correctly identify her evidence, then adopting Accurate Metaconditionalization will have greater expected accuracy for her than adopting Bayesian Conditionalization.
Acknowledgments
Thanks to David Boylan, Kevin Dorst, Matt Mandelkern, Alex Meehan, Dmitri Gallow, and Bernhard Salow for helpful conversations. I am especially grateful to two people: to Milo Phillips-Brown for extremely helpful feedback on earlier drafts, and to Snow Zhang for many conversations about this material, including conversations that led me to start working on accuracy and updating in the first place.
Appendix A
In this appendix, we prove Lemma 1.
Proof of Lemma 1. We start by showing that (i) is continuous. Observe that (i) is a sum of terms of the form
$$\lambda(x)(w) \cdot \mathbf{I}\big[g_{\rm meta}\big(\lambda(1), \mathsf{G}^{\mathsf{E}}(w)\big),\, w\big]$$
Notice that $\lambda(x)(w) = P_{\rm R}(w) \cdot x + P_{\rm W}(w) \cdot (1 - x)$ is a polynomial and so is continuous everywhere. Moreover, $\mathbf{I}\big[g_{\rm meta}\big(\lambda(1), \mathsf{G}^{\mathsf{E}}(w)\big),\, w\big]$ is a constant. Therefore, (i) is a linear combination of continuous functions and is therefore itself continuous.
Next we will show that (ii) is continuous at 1. To begin, observe that (ii) is a sum of terms of the form
$$\lambda(x)(w) \cdot \mathbf{I}\big[g_{\rm cond}\big(\lambda(x), \mathsf{G}^{\mathsf{E}}(w)\big),\, w\big] \tag{38}$$
Thus, to show that (ii) is continuous at 1, it suffices to show that (38) is a continuous function at 1 for all $w \in \Omega$. We have seen that $\lambda(x)(w)$ is a polynomial and so is continuous everywhere. Thus, to show that (38) is continuous at 1 it suffices to show that
$$\mathbf{I}\big[g_{\rm cond}\big(\lambda(x), \mathsf{G}^{\mathsf{E}}(w)\big),\, w\big] \tag{39}$$
is continuous at 1. By our assumption that $\mathbf{I}$ satisfies Additivity, we have that $\mathbf{I}\big[g_{\rm cond}\big(\lambda(x), \mathsf{G}^{\mathsf{E}}(w)\big),\, w\big]$ is equal to
$$\sum_{H \in \mathcal{P}(\Omega)} i^H_w\big(\lambda(x)(H \mid \mathsf{G}^{\mathsf{E}}(w))\big)$$
Fix an arbitrary $H \in \mathcal{P}(\Omega)$. To show that (39) is continuous at 1 it suffices to show that
$$f(x) = i^H_w\big(\lambda(x)(H \mid \mathsf{G}^{\mathsf{E}}(w))\big)$$
is continuous at 1. Define $h(x)$ as follows:
$$h(x) = \lambda(x)\big(H \mid \mathsf{G}^{\mathsf{E}}(w)\big)$$
Then $f(x) = i^H_w \circ h(x)$. By our assumption of Continuity for the local inaccuracy measure $i^H_w$, we know that $i^H_w$ is a continuous function of $h(x)$. Thus, to show that $f(x)$ is continuous at 1, it suffices to show that $h$ is continuous at 1. By the definition of $\lambda(x)(H \mid \mathsf{G}^{\mathsf{E}}(w))$, we have
$$h(x) = \frac{\lambda(x)\big(H \cap \mathsf{G}^{\mathsf{E}}(w)\big)}{\lambda(x)\big(\mathsf{G}^{\mathsf{E}}(w)\big)}$$
It follows from our assumption that $\lambda(1)(\mathsf{E}(w)) > 0$ for all $w \in \Omega$ that $\lambda(1)(\mathsf{G}^{\mathsf{E}}(w)) > 0$ for all $w \in \Omega$. Since the numerator and denominator are both continuous at 1 and the denominator is greater than zero when $x = 1$, it follows that $h(x)$ is continuous at 1. This completes the proof of Lemma 1.