Published online by Cambridge University Press: 12 March 2014
This paper is intended to serve a twofold purpose. Its ultimate aim is the presentation of a syntactical definition of the degree of confirmation of a hypothesis on the basis of given evidence, a notion which is known to be of outstanding significance for the logic of inductive reasoning in the empirical sciences. The theory of confirmation to be developed here has as its foundation the theory of probability, and in order to make that foundation sufficiently secure, it was found expedient to present the theory of probability in axiomatic form and to introduce a syntactical interpretation of probability suitable for the intended application in the theory of confirmation. In this sense, then, the clarification of the logical foundations of probability theory is this paper's first aim.
1 The theory of confirmation to be presented here was developed by both authors jointly with Professor C. G. Hempel. The presentation of the theory of probability, and the execution of technical detail, are the work of the first author. Also in this latter respect Professor Hempel made valuable contributions.
A less technical exposition by C. G. Hempel and P. Oppenheim of the theory of the degree of confirmation is appearing in the April 1945 issue of Philosophy of science. This issue of Philosophy of science also contains an article by Professor R. Carnap which deals with the same topic. The approach to the problem which is to be developed in the present paper is independent of Professor Carnap's and differs from it in various respects. We wish to express our thanks to Professor Carnap for valuable comments he made in the course of an exchange of ideas on the two different studies of confirmation. We also wish to thank Dr. K. Gödel for his stimulating remarks.
For a detailed analysis of the non-metrical concept of confirmation, compare Hempel, C. G., A purely syntactical definition of confirmation, this Journal, vol. 8 (1943), pp. 122–143; and Studies in the logic of confirmation, Mind, vol. 54 (1945), pp. 1–26.
2 For a recent discussion of the axiomatic construction of probability theory, compare Halmos, P. R., The foundations of probability, The American mathematical monthly, vol. 51 (1944), pp. 493–510.
3 Huntington, E. V., New sets of independent postulates for the algebra of logic, Transactions of the American Mathematical Society, vol. 35 (1933), pp. 274–304 and 557–558.
4 Note that | and → are relations, not operations; in particular, | is not Sheffer's stroke operation, nor is → material implication.
5 On this, and on what follows in this section, compare Kolmogoroff, A., Grundbegriffe der Wahrscheinlichkeitsrechnung (Berlin 1933), and P. R. Halmos (cf. Footnote 2).
6 Quotation marks have been omitted to facilitate reading. No mention has in general been made of well-known theorems of Boolean algebra that are needed in the proofs.
7 See Kendall, M. G., Advanced theory of statistics (London 1943).
8 Compare Hosiasson-Lindenbaum, J., On confirmation, this Journal, vol. 5 (1940), pp. 133–148; and Mazurkiewicz, S., Zur Axiomatik der Wahrscheinlichkeitsrechnung, Comptes rendus des séances de la Société des Sciences et des Lettres de Varsovie, Class III, vol. 25 (1932), pp. 1–4.
9 Thus, ‘singular’ is not intended to imply reference to one individual only.
10 The normal forms here discussed are disjunctive normal forms. The dually constructed conjunctive normal forms will not be required in the sequel; they would be conjunctions of weakest (non-analytic) assertions.
11 Compare Hilbert, D. and Bernays, P., Grundlagen der Mathematik, vol. 1 (Berlin 1934), p. 9 and p. 185.
12 If n is finite we set If n is infinite the existence of the limit requires proof, which will be furnished below.
13 Fisher, R. A., Inverse probability, Proceedings of the Cambridge Philosophical Society, vol. 26 (1930), pp. 528–535. Compare also Kendall, M. G., On the method of maximum likelihood, Journal of the Royal Statistical Society, vol. 103 (1940), pp. 388–399, and Ch. VII of the work cited in Footnote 7.
14 It should be clearly understood that we do not wish to condemn legitimate applications of Bayes' Theorem, such as to situations of the following kind: Let 3 urns U1, U2, U3, containing 10 balls each, with different distributions D1, D2, D3 of red (R) and white (R̄) balls be given. If each urn is known to have a certain (e.g. the same) probability to be drawn from, and if a red ball has been drawn (H = ‘Ra’), from which urn has the ball most probably been drawn (i.e. which among ‘U1a’, ‘U2a’, ‘U3a’ is most probable)? The answer, which can be found by Bayes' Theorem, will at the same time characterize one among the urn-distributions D1, D2, D3 as the most probable. These, however, are not Δ's, since a Δ, in this case, would be a distribution of all individuals in the universe over the combinations of (at least) the properties R, U1, U2, U3.
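The urn situation described in this footnote can be worked through numerically. The following is only an illustrative sketch: the footnote does not state the distributions D1, D2, D3, so the red-ball counts below are assumed values, and the function name `posterior_given_red` is hypothetical.

```python
from fractions import Fraction

# ASSUMPTION: numbers of red balls among the 10 balls of each urn.
# The footnote leaves D1, D2, D3 unspecified; these values are illustrative.
red_counts = {"U1": 2, "U2": 5, "U3": 8}

# Equal probability of drawing from each urn (the "e.g. the same" case).
prior = {urn: Fraction(1, 3) for urn in red_counts}

def posterior_given_red(red_counts, prior, balls_per_urn=10):
    """Bayes' Theorem: P(urn | red) = P(red | urn) * P(urn) / P(red)."""
    joint = {
        urn: Fraction(reds, balls_per_urn) * prior[urn]
        for urn, reds in red_counts.items()
    }
    p_red = sum(joint.values())  # total probability of drawing a red ball
    return {urn: j / p_red for urn, j in joint.items()}

post = posterior_given_red(red_counts, prior)
most_probable = max(post, key=post.get)
```

Under these assumed counts the urn with the largest share of red balls receives the highest posterior probability, which is the sense in which the answer singles out one of the urn-distributions as the most probable.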
15 Fisher, R. A., The mathematical foundations of theoretical statistics, Philosophical transactions of the Royal Society of London, vol. 222 (1922), pp. 309–368.
16 The likelihood of Δ relative to E is defined as follows: l_E(Δ) =df p_Δ(E). Hence the phrase “maximum likelihood method”. It is to be noted that likelihood is not a probability function.
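The remark that likelihood is not a probability function can be illustrated with a toy model. This sketch is a simplified stand-in and not the paper's syntactical construction: it assumes that each candidate Δ is summarized by a single hypothetical proportion `theta` of red individuals, and that E reports one particular sequence of independent observations with a given number of red and white outcomes.

```python
from fractions import Fraction

def likelihood(theta, reds, whites):
    """Toy version of p_Δ(E): probability, under proportion theta,
    of one particular observed sequence with the given counts."""
    return theta**reds * (1 - theta)**whites

# ASSUMPTION: three candidate proportions standing in for distributions Δ.
candidates = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]

# ASSUMPTION: the evidence E reports 2 red and 1 white observation.
reds, whites = 2, 1

lik = {theta: likelihood(theta, reds, whites) for theta in candidates}
best = max(lik, key=lik.get)   # candidate singled out by maximum likelihood
total = sum(lik.values())      # need not equal 1
```

Here the likelihoods of the three candidates are 3/64, 8/64 and 9/64, so they sum to 5/16 rather than 1: with E held fixed, the likelihood is not a probability distribution over the candidates, although its maximum still selects one of them.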
17 Any similarity to the terminology of the Selective Service regulations is purely coincidental.
18 Compare our earlier discussion of the equivalence of (3) and (4), in connection with Lemma 13.
19 The denominator, 2p–ι, is not 0 since, for ι = 2p, E1 would be contradictory.
20 Waismann, F., Logische Analyse des Wahrscheinlichkeitsbegriffs, Erkenntnis, vol. 1 (1930), pp. 228–248.
21 Compare Carnap, R., Introduction to semantics (Cambridge, Mass., 1942).
22 The restriction, in the above discussion, to languages with a fixed number of individual constants and to singular sentences is inessential. It can be shown that sentences with arbitrarily many individual constants can be dealt with by using SNFi(S) (where Voc(S) ⊂ Vi) and by introducing the i-range of S. As for general sentences, either MNFi(S) can be resorted to, or else SNFi(DiS) plus a limiting process (as i → ∞).
23 Wittgenstein, L., Tractatus logico-philosophicus (New York and London 1922).
24 This criticism was first advanced by Professor Carnap.
25 See the articles by J. Hosiasson and R. Carnap referred to in Footnotes 8 and 1 respectively.
26 Compare, again, Professor Carnap's article.
27 Reichenbach, H., Wahrscheinlichkeitslehre (Leiden 1935), pp. 387 ff., and Experience and prediction (Chicago 1938), Chapter 5. The maximum likelihood method, which in our system acts as the rule of induction, represents in fact a generalization of Reichenbach's rule of induction, and we wish to acknowledge the stimulating influence which his theory of induction has had on our present approach.
28 Compare also the section on instance-confirmation in Professor Carnap's paper.
29 It is possible to interpret (i) as a special case of (ii).
30 If dc(H1, E1) and/or dc(H2, E2) are multi-valued, one might agree to interpret (56) as meaning that mdc(H1, E1) ≧ mdc(H2, E2) and, in case of mdc(H1, E1) = mdc(H2, E2), that the maximum of dc(H1, E1) exceeds the maximum of dc(H2, E2). If neither dc(H1, E1) > dc(H2, E2) nor dc(H2, E2) > dc(H1, E1), then the two dc's might be called equivalent: dc(H1, E1) = dc(H2, E2). (For the meaning of mdc see the end of Section 10.)
31 Compare Nagel, E., Principles of the theory of probability (Chicago 1939), Section 8.