1. Introduction
An important line of response to scepticism appeals to the best explanation (abductivism). But anti-sceptics have not engaged much with work on explanation in the philosophy of science. I plan to investigate whether plausible assumptions about best explanations really do favour anti-scepticism. I will argue that there are ways of constructing sceptical hypotheses in which the assumptions do favour anti-scepticism, but the size of the support for anti-scepticism is small, so the overall verdict is mixed.
I start by reviewing the options for the abductivist, identifying the most promising option as that of using rationality constraints on priors in a Bayesian framework. These constraints lead to the Bounded Asymmetry and Numerousness Arguments (Huemer 2009), both of which, I will argue, can provide support for anti-sceptical hypotheses. But when it comes to the quantitative question of how much support they provide, I will argue that the support is not strong.
To be clear about the dialectic, my goal is not to refute scepticism; my goal is to connect arguments based on simplicity and explanation with Bayesianism, and apply the result to the traditional sceptical challenge.
Section 2 explains the background and makes explicit some assumptions, section 3 explains the Bounded Asymmetry Argument, section 4 explains the Numerousness Argument, section 5 connects the arguments to Reichenbach's cubical world thought experiment and section 6 concludes.
2. Background
Consider two hypotheses:
Real World Hypothesis (RWH): My precise total evidence is generated by an external world which is roughly as it appears to be.
Scepticism: My precise total evidence is generated by something other than an external world which is roughly as it appears to be.
Notice that RWH and Scepticism are indexical hypotheses. They state how things are with us, not how things are objectively. I think this is the right way to set up the debate – we are primarily concerned with our own position. But my main arguments concern non-indexical hypotheses, so I'll later address how they affect the probabilities of indexical hypotheses.
Do we have justification to believe RWH? One important line of anti-sceptical argument is that RWH is the best explanation of our evidence. Call this abductivism:
Abductivism: RWH is the best explanation of my precise total evidence.
Abductivism can be traced to John LockeFootnote 1 (1690/1975: bk. iv, ch. xi), with more recent advocates including Bertrand Russell (1912, 1927, 1948), C.D. Broad (1925), A.J. Ayer (1956), J.L. Mackie (1976), Frank Jackson (1977), James Cornman (1980), Alan Goldman (1988), William Lycan (1988), Paul Moser (1989), Jonathan Vogel (1990, 2005), and Laurence BonJour (1998, 1999, 2003).
To develop this line of argument, we have to face two questions: what is an explanation? And what makes an explanation good, better, or best?
We need not dwell too long on the first question. Following Hempel and Oppenheim (1948: 135), we can take an explanation to be an answer to a why question. And we can take the question to be ‘why do I have this precise total evidence?’. It is plausible that RWH answers that question, as RWH entails that I have this precise total evidence – but so does Scepticism, and the challenge is to explain why RWH is a better explanation.Footnote 2
The philosophical controversy centres on the second question – what makes an explanation good? Here we can draw on the philosophy of science literature, which separates two factors contributing to the goodness of an explanation – how well the hypothesis predicts the data (empirical virtues) and the intrinsic qualities (theoretical virtues) of the hypothesis, such as simplicity.Footnote 3 Einstein makes clear the contrast:
The first point of view … is concerned with … confirmation … by the available empirical facts [evidence]. The second point of view is not concerned with the relation of the material of observation but with the premises of the theory itself, with what may briefly be characterized as the ‘naturalness’ or ‘logical simplicity’ of the premises … The second point of view may briefly be characterized as concerning itself with the ‘inner perfection’ of a theory, whereas the first point of view refers to the ‘external confirmation’. (Einstein ‘Autobiographical Notes’, in Schilpp 1949: 21–2)
The first factor/point of view won't help us because both RWH and Scepticism have been designed to entail the evidence.Footnote 4
Let's move on to the second factor: theoretical virtues. Popular candidates for theoretical virtues include coherence, scope, abstraction, generality, unification and simplicity.Footnote 5 The abductivist's appeal to a best explanation therefore seems to be a place-holder for a long list of purported virtues, and has little content unless we have some substantive grip on what these virtues amount to. It has proven difficult to justify any of these virtues, but, without getting bogged down in the details, I think the abductivist's best hope is to appeal to a priori inductive probabilitiesFootnote 6 which assign higher priors to simpler hypotheses.Footnote 7
My aim in this paper is to grant such priors and investigate whether anti-scepticism follows. (I remain neutral on other routes to RWH.) In particular, I will assume:
1. A Bayesian epistemology according to which agents
(i) have beliefs which can be represented by probabilities and
(ii) update those beliefs by conditionalizing when evidence is acquired (Bradley 2015).
2. There are rationality constraints on priors i.e. rationality requires that agents’ prior credences fit particular constraints.
3. The rational constraints involve weak Principles of Indifference, which assign roughly equal probability to possibilities (where we have no reason to favour one possibility over another).
Let me say a little to motivate the second and third assumptions. Rational constraints on priors have historically been contentious, but it has not been sufficiently appreciated that they are needed in order for there to be rationality constraints on posteriors.Footnote 8 Someone with crazy priors can, given arbitrary evidence, end up with crazy posteriors. And then, as Russell wrote:
the lunatic who believes that he is a poached egg is to be condemned solely on the ground that he is in a minority. (Russell 1946: 646)
So for those who deny that there are rationality constraints, the hypothesis that there is no external world will be the least of their concerns.
If there are principles for the rational distribution of priors then the only serious contenders are based on Principles of Indifference which assign roughly equal probability to possibilities (where we have no reason to favour one possibility over another). I think the core intuition is based on arbitrariness: it would be unacceptably arbitrary to assign one hypothesis a higher probability than another without a good reason for doing so.
The high-water mark of the research project of identifying and defending Principles of Indifference was Carnap's (e.g. 1950; Carnap and Jeffrey 1971) later work, in which he tried to defend a universal systematic set of principles. This project is widely regarded as having failed. Perhaps he aimed too high, but we should not throw out the baby with the bathwater. We can identify principles that apply in some cases without holding that they apply in all cases. And we might be guided by principles even if we are unable to formulate them.Footnote 9 At any rate Principles of Indifference remain a topic of lively debate.Footnote 10
I will not defend the specific principles used by the Numerousness Argument and the Bounded Asymmetry Argument in any detail because my aim is not to defend the arguments. I am asking: can plausible constraints on priors be used to answer the sceptic? My point, which bears repeating, is to grant the abductivist these assumptions and investigate whether RWH follows. For it is not obvious that the assumptions will favour RWH rather than Scepticism.Footnote 11
My verdict will be mixed. Specifically, there will be qualitative but little quantitative support for RWH. That is, RWH emerges as more likely than competitors, but the strength of support for RWH is low, and seems to be insufficient for the goals of anti-sceptics.
A different, methodological point is worth making explicit. A number of philosophers have argued that simplicity considerations are relevant to science but not to philosophy (see Thomasson 2007: 151; Bennett 2009; Huemer 2009; Shalkowski 2010; Kriegel 2013; French 2014; Willard 2014; Saatsi 2017). I hope to show how simplicity considerations can be used in philosophy.Footnote 12
Terminology: I will sometimes find it convenient to talk about reasons to believe, but my aim is purely to make claims about rational a priori probabilities. So an ‘a priori reason to believe’ means ‘a consideration that increases one's prior above what it would have been if that consideration had not been taken into account’. Relatedly, I will not be making any claims about ‘belief’ in the colloquial sense of ‘full belief’. So ‘a reason to believe’ is not necessarily a strong reason, and ‘justification to believe’ is not necessarily ‘full justification’, or ‘justification to fully believe’. The expressions simply point to a consideration which increases the prior.
3. Bounded Asymmetry
3.1. Background
The Bounded Asymmetry Argument starts from the fact that the complexity of theories is unbounded in one direction only.Footnote 13 Huemer writes:
for any given phenomenon, there is a simplest theory (allowing ties for simplest), but no most complex theory of the phenomenon: however complex a theory is, it is always possible to devise a more complicated one. This is most easily seen if we take a theory's complexity to be measured by the number of entities that it posits: one cannot posit fewer than zero entities, but for any number n, one could posit more than n entities. Similar points hold for other measures of complexity, such as the number of parameters in an equation. (Huemer 2009: 219)
Now note that the total probability across all possible hypotheses must sum to 1. It follows that theories must get diminishing probability as their complexity increases.
What probabilistic principles deliver this conclusion? We need some kind of principle of non-arbitrariness regarding complexity. (We allow arbitrariness regarding, say, the alphabetical order of the hypotheses, but not regarding their complexity.) If we did allow arbitrariness regarding complexity, then we might allow local peaks – for example, with moderately complex theories being most probable (see Figure 2). Disallowing arbitrariness regarding complexity, the only place we can put a peak is at the start, with the simplest theory. The result is diminishing probability as complexity increases (Figure 1).
Granting the rationality of such distributions, does anything follow regarding RWH?
3.2. Layered Scepticism
Is there any bounded asymmetry in the debate about scepticism? Yes – the trick is to generate a layered sceptical hypothesis.
Terminology: Call an apparent reality a ‘universe’. For example, when a (non-simulated) mad scientist creates an apparent reality for a brain-in-vat, we'll say he creates a universe. The mad scientist is in a different universe to that of the brain-in-vat. For comic-book fans, think of the way we speak of the ‘Marvel universe’ and the ‘DC universe’. The mad scientist is in the real universe and the brain-in-vat is in a simulated universe. The brain-in-vat and the mad scientist are in the same possible world.
Suppose that someone in the real universe builds a simulation universe containing deceived agents. These deceived agents are the first layer of simulation. Now we can suppose that these deceived agents live in such a rich artificial reality that they too construct computers and build their own simulations. The agents in these simulations live in the second layer of simulation. It is clear that we can multiply this hierarchy indefinitely.
This hierarchy can be extended to other sceptical scenarios. Start with the dreaming hypothesis. A hierarchy is created by allowing dreams within dreams (a scenario familiar from the film Inception). A priori there is no limit to the number of embedded dreams there might be. Next, what about the brain-in-a-vat scenario? It is possible that the mad scientist's reality is itself a simulated reality, constructed by a mad scientist one layer up. What about Descartes’ scenario, in which you are being deceived by an evil demon? Again, the demon might be living in a reality constructed by a superior demon one layer up. Our demon might be deceived by an even more powerful demon, and so on. And we can mix up these scenarios, to create even more possibilities e.g. a demon creates a simulated world for a mad scientist who builds a brain in a vat. There is no limit to such hierarchies.Footnote 14
This suggests that the traditional dichotomy between sceptical and non-sceptical possibilities is too simple. It is true that the best situation is to not be deceived. But the alternatives are not on a par. If we are deceived, it is better to be in a simulation constructed by someone non-deceived than to be in a simulation constructed by someone who is deceived, where error is layered upon error.
Let's now focus on the hierarchy of hypotheses over which we need to distribute priors. Suppose the actual world is roughly as ours appears to be, and there is never sufficient computing power to construct realistic simulations. With no simulations, this world has Complexity Level 1. Now suppose there is one scientist, across all space and time, who manages to create a realistic simulation. This world has Complexity Level 2. Next, suppose that in this one simulation, the sims are clever enough to create their own realistic simulation. This world has Complexity Level 3. The pattern should be clear. Each new level in the hierarchy adds a level of complexity. For simplicity, assume that the only deceived agents are in the simulations, so given Complexity Level 1, no agents are deceived.Footnote 15 Thus we have a sequence of (non-indexical) hypotheses:
Complexity level 1
Complexity level 2
Complexity level 3
etc.
How should we assign prior probabilities? The possible layers have the asymmetrical structure we saw above. Thus, the only non-arbitrary way to assign priors is to assign higher priors to hypotheses saying there are fewer layers. The result is that the hypothesis that there is only one universe gets a relatively high prior (Figure 3).
And this non-indexical hypothesis entails the indexical hypothesis that I am perceiving an external world which is roughly as it appears to be.Footnote 16 So we end up with a relatively high prior for RWH. The argument provides a reason to favour RWH over any specific simulation hypothesis. And on the question of how far down the possibly endless layers of deception we might be, we have a reason to favour those in which we are less deceived rather than more deceived.
But although there is some support for RWH (qualitative), the degree of support (quantitative) is small. The probability of the disjunction of all simulated possibilities will be higher than the probability of RWH. Furthermore, the assumption that the space of possible hypotheses is infinite seems to suggest that the difference between the probabilities of the first and second hypotheses is vanishingly small, and the second hypothesis is still a sceptical hypothesis. So to achieve quantitatively significant support for RWH, the Bounded Asymmetry Argument would need to be bolstered by other arguments.
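The contrast between qualitative and quantitative support can be illustrated with a toy prior. The sketch below assumes, purely for illustration, that priors over Complexity Levels decay geometrically; neither the geometric form nor the decay rate comes from Huemer or from the argument above, and the function name is hypothetical.

```python
from fractions import Fraction


def geometric_prior(level, ratio):
    """Prior for Complexity Level `level` (1, 2, 3, ...).

    Assumes a geometric decay: each level gets `ratio` times the
    prior of the level below it, and the priors sum to 1 over all levels.
    """
    return (1 - ratio) * ratio ** (level - 1)


# Hypothetical decay rate; a value near 1 models a slow decline in probability.
ratio = Fraction(9, 10)

p_level_1 = geometric_prior(1, ratio)  # one universe, no simulations
p_level_2 = geometric_prior(2, ratio)  # the simplest sceptical hypothesis
p_sceptical = 1 - p_level_1            # disjunction of all levels above 1

print(p_level_1, p_level_2, p_sceptical)
```

On these assumed numbers Complexity Level 1 is the single most probable hypothesis, yet it only narrowly beats Complexity Level 2 and is far less probable than the disjunction of all the sceptical levels.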
A couple of complications are worth mentioning before moving on. First, Bostrom (2003) has used related considerations to argue against RWH. He argued that there are likely to be many more simulated agents than non-simulated agents (as we can expect lots of simulations to be run), and that it is correspondingly more likely that I am a simulated agent. Bostrom's argument can be reconstructed as:
1. I live in a possible world with lots of simulated agents.
2. If I live in a possible world with lots of simulated agents, then I am probably a simulated agent.
3. Therefore I am probably a simulated agent.
The Bounded Asymmetry Argument reduces the prior of (1). So the Bounded Asymmetry Argument will lead you to reduce your probability that you are in a simulation, even if you are convinced of (2) and have evidence for (1). The anti-sceptical power of the Bounded Asymmetry Argument remains.
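How a lower prior for (1) feeds through to the conclusion can be sketched with the law of total probability. This is an illustration rather than a reconstruction of Bostrom's own calculation: the function name and every numerical value are hypothetical, and the conditional probabilities are simply stipulated.

```python
from fractions import Fraction


def p_simulated(p_many_sims, p_sim_if_many, p_sim_if_few):
    """Probability that I am simulated, by the law of total probability.

    p_many_sims   -- probability of premise (1): the world has lots of simulated agents
    p_sim_if_many -- probability I am simulated given (1); premise (2) says this is high
    p_sim_if_few  -- probability I am simulated given that (1) is false
    """
    return p_many_sims * p_sim_if_many + (1 - p_many_sims) * p_sim_if_few


# Hypothetical values: premise (2) is granted, so simulation is likely given (1).
p_sim_if_many = Fraction(9, 10)
p_sim_if_few = Fraction(1, 100)

# The same conditionals with a higher and a lower probability for premise (1).
before = p_simulated(Fraction(1, 2), p_sim_if_many, p_sim_if_few)
after = p_simulated(Fraction(1, 10), p_sim_if_many, p_sim_if_few)

print(before, after)
```

Holding (2) fixed, any consideration that lowers the probability of (1) lowers the probability that I am simulated, which is the sense in which the argument retains its anti-sceptical force.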
Second, suppose we drop the assumption that the only deceived agents are in the simulations. Then the hypothesis that there is only one universe does not entail the indexical hypothesis RWH. But it still confirms it.Footnote 17 As all agents in simulations are deceived, positing a simulated universe adds to the number of deceived agents (without adding to the number of non-deceived agents, we can assume), and so increases the probability that I am deceived.
4. Numerousness Argument
4.1. Background
Let's set aside the Bounded Asymmetry Argument. Specifically, suppose that instead of the increasingly complex theories getting a diminishing probability, they all get roughly the same probability. The Numerousness Argument says that there are more versions of profligate theories than versions of parsimonious theories, so the same prior probability must be shared among more theories in the profligate group than in the parsimonious group. So each profligate theory ends up with a lower prior.
For example, if the world has an equal probability of being parsimonious or profligate (1/2), and there are three versions of the parsimonious theory and five versions of the profligate theory, then each version of the profligate theory ends up with a lower probability (of 1/10 rather than 1/6).Footnote 18
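The arithmetic in this example can be checked directly. The sketch below assumes, as the example does, that each family's prior is divided equally among its versions; the function name is hypothetical.

```python
from fractions import Fraction


def per_version_prior(family_prior, n_versions):
    """Share a family's prior equally among its versions."""
    return family_prior / n_versions


# Each family (parsimonious, profligate) gets the same prior of 1/2.
parsimonious = per_version_prior(Fraction(1, 2), 3)  # three versions
profligate = per_version_prior(Fraction(1, 2), 5)    # five versions

print(parsimonious, profligate)
```

The more versions a family contains, the smaller each version's share, even though the two families are equally probable overall.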
Huemer writes:
There is some reason for thinking that ontologically complex theories are in fact more numerous than ontologically simple theories. The positing of new entities generally allows multiple theories concerning the nature of those entities; consequently, the more entities one posits, the more theories one can construct about those entities. (Huemer 2009: 220–1)
Which probabilistic principles deliver this conclusion? We need to divide groups of hypotheses into families according to their complexity, and assign each family approximately the same probability. This assumption seems plausible independently of a desire to avoid scepticism. It can be thought of as a weakened Principle of Indifference. We do not need the probabilities assigned to the families to be equal, we just need them to be ‘not wildly different’ (Huemer 2009: 230). Again, my aim is not to defend this principle, but to investigate whether it supports RWH.
4.2. Numerous Scepticisms
Given Complexity Level 1 there is only one universe; according to all other hypotheses there is more than one universe. There are many ways these extra universes might be, and so many more possibilities which need to be assigned some prior given Scepticism.Footnote 19
Let's go through an example. Take some feature that a universe may or may not have – say, that Newton's laws hold. According to Complexity Level 1 there is only one universe, so only two possibilities – either Newton's laws do hold or they don't. But now add an extra universe – Complexity Level 2. Then there is a further fact about whether Newton's laws hold in that extra universe, so there are four possibilities in total.
Thus each possibility given Complexity Level 2 is assigned a lower prior than each possibility given Complexity Level 1.
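The Newton's laws example can be made concrete by enumerating the possibilities. The sketch below assumes, for illustration only, that the two Complexity Levels receive the same prior and that each level's prior is shared equally among its possibilities; the value 1/2 and the names used are hypothetical.

```python
from fractions import Fraction
from itertools import product


def possibilities(n_universes):
    """Every way of settling, universe by universe, whether Newton's laws hold."""
    return list(product([True, False], repeat=n_universes))


level_1 = possibilities(1)  # one universe
level_2 = possibilities(2)  # the original universe plus one extra universe

# Hypothetical: give each Complexity Level the same prior and split it evenly.
level_prior = Fraction(1, 2)
prior_each_level_1 = level_prior / len(level_1)
prior_each_level_2 = level_prior / len(level_2)

print(len(level_1), len(level_2), prior_each_level_1, prior_each_level_2)
```

Each further universe doubles the number of possibilities for this one feature, so the prior available to each specific possibility is halved at every step.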
And the possibilities multiply in other ways. For example, we haven't yet taken into account the numerous ways in which one universe can be embedded in another. One degree of freedom is whether there is a deliberate deceiver. In the demon and brain-in-vat scenarios there is an agent who is being deliberately deceptive; in the dreaming and Boltzmann brain scenarios there is not. Another degree of freedom concerns the true nature of the apparent universe. In a computer simulation it is computer code; in a dream it is the result of random electrical firings in our brains (we can suppose).
The reason for all these extra degrees of freedom is the same – sceptical hypotheses posit the apparent universe and more besides. And the consequence of this proliferation of sceptical possibilities is the same – each possibility requires using up a fixed allocation of probabilities, and so each sceptical possibility ends up with a relatively low probability. Hypotheses which say that there is only one universe therefore get a relatively high probability. Such non-indexical hypotheses entail the indexical hypothesis that I am perceiving an external world which is roughly as it appears to be,Footnote 20 so we end up with a relatively high prior for RWH.
Again, the anti-sceptical power of this argument is modest. It does not shift the overall probabilities of each Complexity Level. Instead, each Complexity Level 1 possibility gets a higher prior than each Complexity Level 2 possibility. And each Complexity Level 2 possibility gets a higher prior than each Complexity Level 3 possibility, and so on.
Both the complications faced by the Bounded Asymmetry Argument are also faced by the Numerousness Argument, and can be dealt with in the same way. First, the Numerousness Argument reduces the probability that (1) I live in a possible world with lots of simulated agents (compared to an alternative prior distribution where each specific hypothesis gets the same prior). So the anti-sceptical power of the Numerousness Argument remains even if we accept Bostrom's argument for (2).
Second, suppose we drop the assumption that the only deceived agents are in the simulations. Then, as above, the hypothesis that there is only one universe does not entail the indexical hypothesis RWH, but it still confirms it.Footnote 21 As all agents in simulations are deceived, positing a simulated universe adds to the number of deceived agents (without adding to the number of non-deceived agents, we can assume), and so increases the probability that I am deceived.
What is the relation between the Bounded Asymmetry Argument and the Numerousness Argument? They are compatible. The former concerns the probability of any given Complexity Level, and the latter concerns the probability of specific versions on a given Complexity Level. Both can be applied here, increasing the overall anti-sceptical result. And as they both concern priors, they can be combined with any anti-sceptical arguments based on evidence. (Perhaps the cumulative effect of such arguments provides reasonable justification for RWH.)
This completes the main arguments of the paper. Many related arguments can be found in the literature and I want to engage with one of the most similar – Reichenbach's classic argument for the external world. I will argue that the Bounded Asymmetry Argument and Numerousness Argument vindicate Reichenbach's judgments, and can explain why we prefer positing common causes to positing independent causes.
5. Reichenbach's Cube and the Common Cause
Compare Reichenbach's (1938) argument for external world realism, given at the height of logical positivism:
We imagine a world in which the whole of mankind is imprisoned in a huge cube, the walls of which are made of sheets of white cloth, translucent as the screen of a cinema but not permeable by direct light rays. Outside this cube there live birds, the shadows of which are projected on the ceiling of the cube by the sun rays; on account of the translucent character of this screen, the shadow-figures of the birds can be seen by the men within the cube. The birds themselves cannot be seen, and their singing cannot be heard. To introduce the second set of shadow-figures on the vertical plane, we imagine a system of mirrors outside the cube which a friendly ghost has constructed in such a way that a second system of light rays running horizontally projects shadow-figures of the birds on one of the vertical walls of the cube. (Reichenbach 1938: 115)
What might justify the belief that there are objects outside the cube, and not just dark patches on the screen? The belief might be justified by Reichenbach's (1956: 158–9) Principle of the Common Cause, the underlying idea of which is that ‘simultaneous correlated events must have prior common causes’ (Arntzenius 2010). The simultaneous correlated events would be the movement of dark patches on the wall and the ceiling; the prior common cause would be a bird outside the cube.
But this just pushes the question back – why should we believe that correlated events must have prior common causes? Indeed, the claim that correlated events must have common causes is false – bread prices in Britain and sea levels in Venice have been correlated for centuries (both have been rising), yet there is presumably no common cause (Sober 1987).
Instead, I suggest we prefer hypotheses which say that correlated events have common causes. The Bounded Asymmetry Argument and Numerousness Argument can explain why we tend to prefer positing common causes to positing independent causes.
To keep things simple, assume that we have one instance of a shadow appearing on the wall and the ceiling at the same time. Compare two competing theories:Footnote 22
Common Cause: There is exactly one entity outside the cube.
Independent Causes: There are exactly two entities outside the cube.
Start with the Bounded Asymmetry Argument. Living inside the cube, what are our priors regarding how many entities there are outside the cube? There could be any number from 0 to infinity. We can therefore straightforwardly apply the Bounded Asymmetry Argument to get the result that there is more likely to be one entity than two, so Common Cause has a higher prior than Independent Causes, thus explaining our preference for the common cause. (Of course, if we get evidence against Common Cause, then credence in Common Cause will sharply fall – but we have no such evidence in the case described, so the a priori preference for Common Cause will have the knock-on effect of a preference for Common Cause in the posterior.)
Let's move on to the Numerousness Argument. Assuming Common Cause, there are questions to be answered concerning the nature of this cause. Is it a living creature or an automaton? If living, is it an agent whose behaviour can be explained by beliefs and desires or closer to a plant responding to stimuli? Various possibilities are open.
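If, on the other hand, we assume Independent Causes then we have many more open possibilities. There are the many possibilities about the nature of the first entity – plus the many possibilities about the nature of the second entity. Here are the possibilities given only the question of whether the entity is living:
Common Cause (two possibilities): the entity is living; the entity is not living.
Independent Causes (four possibilities): both entities are living; the first is living and the second is not; the first is not living and the second is; neither entity is living.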
So a priori we have reason to prefer possibilities with common causes, as this reduces the number of possibilities left open, and so increases their probability. Thus the Numerousness Argument also contributes to explaining our preference for common causes.
Notice what this doesn't rely on – it doesn't rely on the correlation between the two shadows being especially unlikely. And we are wise to avoid relying on this – after all, any precise path of the two shadows is highly unlikely. For the two shadows to take paths that are correlated is no less likely than for them to take any other specific paths. (Compare: Lottery numbers of ‘123456’ are as likely as any other six numbers.)
Then why use an example in which the shadows’ movements are correlated? Because if they weren't then there could be no straightforward cause of both. If one shadow moved in a circle and one in a straight line then a single bird could not cause both of them, ruling out Common Cause. (Or rather, a single bird could not cause both of them without some very complicated causal mechanism explaining the differing appearances – and if we posited a complicated causal mechanism then there would be numerous ways to implement it, and we would face new versions of the Numerousness and Bounded Asymmetry Arguments applied to the mechanism.)
6. Conclusion
There is a long and distinguished history of attempts to answer scepticism by appeal to simplicity, or to the best explanation. But these attempts have rarely connected up with the dominant approach to modelling beliefs and justification – Bayesianism. I have investigated whether plausible constraints on Bayesian priors lead to a favouring of the real world hypothesis over sceptical hypotheses. I have argued that constraints on Bayesian priors can provide a foundation for responses to scepticism, but the probabilistic gains remain modest, with scepticism remaining a serious alternative.Footnote 23