Systems of syntactic analysis

Published online by Cambridge University Press:  12 March 2014

Noam Chomsky*
Affiliation:
Society of Fellows, Harvard University

Extract

During the past several decades, linguists have developed and applied widely techniques which enable them, to a considerable extent, to determine and state the structure of natural languages without semantic reference. It is of interest to inquire seriously into the formality of linguistic method and the adequacy of whatever part of it can be made purely formal, and to examine the possibilities of applying it, as has occasionally been suggested, to a wider range of problems. In order to pursue these aims it is first necessary to reconstruct carefully the set of procedures by which the linguist derives the statements of a linguistic grammar from the behaviour of language users, distinguishing clearly between formal and experimental in such a way that grammatical notions, appearing as definienda in a constructional system, will be formally derivable for any language from a fixed sample of linguistic material upon which the primitives of the system are experimentally defined. The present paper will be an attempt to formalize a certain part of the linguist's generalized syntax language.

From another point of view, this paper is an attempt to develop an adequate notion of syntactic category within an inscriptional nominalistic framework. The inscriptional approach seems natural for linguistics, particularly in view of the fact that an adequate extension of the results of this paper will have to deal with the problem of homonymity, i.e., with a statement of the conditions under which tokens of the same type must be assigned to different syntactic classes.

Type
Research Article
Copyright
Copyright © Association for Symbolic Logic 1953

References

1 Within linguistics, the source for these investigations is in the methods of structural analysis developed by Z. S. Harris; within philosophy and logic, it is in the work of N. Goodman on constructional systems and in the development of nominalistic syntax by Goodman and Quine. As general references, then, for this paper, see Harris, Methods in structural linguistics, Chicago, 1951; Goodman, The structure of appearance, Cambridge, 1951; and Goodman and Quine, Steps towards a constructive nominalism, this Journal, vol. 12 (1947), pp. 105–122. I am much indebted to Professors Harris, Goodman, and Quine, as well as to Y. Bar-Hillel, H. Hiż, and others, for many suggestions and criticisms.

2 E.g., Quine, W. V., Notes on existence and necessity, Journal of philosophy, vol. 40 (1943), pp. 120. Also, see Harris, Z. S., Discourse analysis, Language, vol. 28 (1952), pp. 1–30, for an investigation of the possibility of using methods of linguistics to determine the structure of a connected short text, thus, in a sense, setting up partial synonymity classes for it.

3 The constructions of this paper are roughly coextensive with the procedures of chapters 15, 16, Methods.

4 See Bar-Hillel, Y., On syntactic categories, this Journal, vol. 15 (1950), pp. 1–16, for a development of these notions.

5 The third suggestion is actually equivalent to the system adopted here for the special case of languages in which each sentence contains exactly two elements (morphemes).

6 The first two in particular are problems of how to apply the primitives of these systems. Thus ‘CON’ must not be predicated of homonyms, and ‘ENV’ must not be predicated of contexts such as ‘it was …’ (see below, § 2).

7 Thus we do not wish to require in principle that the ‘whole language’ be available as data. It is, however, of interest to consider this situation as well. Thus, if there are large significant classes which are subdivided into classes whose distributions cluster separately (see footnote 8), but such that the subclasses have similar distributions in terms of other classes, then the methods to be adopted here permit the construction of the large class as a ‘second-level’ class.

8 Thus we might require, for an expression to be admissible into the class of contexts, that the distributions of the elements occurring in its ‘blank space’ form a single cluster of sets. It is therefore necessary on the one hand to clarify the sense in which a set of sets can be said to be most efficiently divided into a set of clusters of sets, on the other, to investigate the actual statistics of distribution in natural languages. Precisely the same researches are necessary to resolve at least part of the homonym problem, considering homonyms as the elements whose distributions overlap two clusters of distributions. Cf. Harris, Methods, pp. 257 ff.

9 Actually, over morpheme occurrences. The linguist's morphemes are classes of conforming minimal meaning-bearing units, e.g., ‘boy,’ ‘think,’ ‘of,’ ‘ing,’ the plural ‘s’, etc. Forms such as ‘wife’ and ‘wive,’ with selection predictable given the context (thus ‘wive’ occurs only before ‘s’ plural, ‘wife’ only elsewhere), are called morpheme alternants and are considered to belong to the same morpheme. They are here considered to conform. See Methods, chap. 12, 13.

10 For a discussion of the Calculus of Individuals (and the notions of ‘sum’, ‘scattered individual’, etc.) see Leonard, H. S. and Goodman, N., The calculus of individuals and its uses, this Journal, vol. 5 (1940), pp. 45–55, and Structure, pp. 42–55.

11 D1-4 are, respectively, D2.042, D2.044, D2.045, and D2.047 of Structure, pp. 44–46.

12 A1 and A10 are, respectively, 2.41 and 2.45 of Structure, pp. 44–46. The essential idea of A13 is discussed in Structure on pp. 47–48. This axiom system is adequate only if we assume that no inscription contains infinitely many atoms, and then carry out proofs in the metalanguage, using induction on the number of atoms in an inscription. Alternatively, we could adjoin several axioms involving ‘EQL’ which would permit the derivation of all theorems in which no schematically defined terms appear within the system.

13 The non-atomic terms will in the interesting cases be what are called ‘immediate constituents’ in linguistic terminology. Thus such a linguistic form as ‘that poor fellow on the corner missed his bus’ might be analyzed into two immediate constituents, a noun phrase (‘that … corner’) and a verb phrase (‘missed his bus’), in which case it might be shown to be equivalent in the sense of the procedure to be adopted to a sentence consisting simply of a noun and a verb, e.g., ‘he fell.’ These phrases in turn can be analyzed into immediate constituents (e.g., ‘that poor fellow’ and ‘on the corner’), etc., until the ultimate constituents (morphemes) are reached. For a detailed discussion of constituent analysis and its problems see Wells, R. S., Immediate constituents, Language, vol. 23 (1947), pp. 81–117, and Methods.
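The bracketing described in footnote 13 can be pictured with a small sketch. It is only an illustration, not part of the paper's constructional system: the nested-tuple representation and the function names `flatten` and `immediate_constituents` are assumptions introduced here, and the analysis stops at the stretches the footnote itself names rather than continuing down to morphemes.

```python
# Illustrative only: nested tuples stand for analyzed constituents,
# plain strings for stretches left unanalyzed in footnote 13.
SENTENCE = (
    ("that poor fellow", "on the corner"),  # noun phrase, split as in the footnote
    "missed his bus",                        # verb phrase, left unanalyzed here
)

def flatten(node):
    """Return the words covered by a constituent, left to right."""
    if isinstance(node, str):
        return node.split()
    words = []
    for child in node:
        words.extend(flatten(child))
    return words

def immediate_constituents(node):
    """Return the text of each immediate constituent of a node.

    An unanalyzed stretch (a string) has no immediate constituents
    in this sketch.
    """
    if isinstance(node, str):
        return []
    return [" ".join(flatten(child)) for child in node]

noun_phrase, verb_phrase = immediate_constituents(SENTENCE)
```

Applying `immediate_constituents` again to the noun-phrase node `SENTENCE[0]` yields the two stretches the footnote gives as its next level of analysis.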

14 For the time being, we restrict ourselves to terms which do not cross over discontinuities. See however systems III, IV, V, pp. 15–18.

15 ‘environment-included’ will always be used in the sense of proper inclusion.

16 In the definitions themselves, the variables ‘m’, ‘n’, etc., must be taken as syntactic variables ranging over numerals; elsewhere (including range specification) it is convenient to take them as numerical variables, ranging over numbers.

17 It thus appears that ‘CON’ as explained and axiomatized above could have been defined from a simple conformity relation among atoms. The same is true of ‘PRE’. This conformity relation could, in turn, be defined as the ancestral of a non-transitive matching relation, in a way analogous to that demonstrated in Structure, pp. 234–235.

(Added November 19, 1952.) These reductions would in fact increase the complexity of the basis in the sense of Structure, pp. 59–85, because the predicate formed (in calculating complexity) by compounding ‘EQL’ and ‘CON’ would have two segments rather than one under this revision, since ‘CON’ would now hold only of atoms. However, under a more recent formulation of the notion of simplicity (Goodman, N., New notes on simplicity, this Journal, vol. 17 (1952), pp. 189–191) the two bases would be of equal simplicity.
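The idea in footnote 17 of obtaining a conformity relation as the ancestral of a non-transitive matching relation can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction of Structure, pp. 234–235: the ancestral is taken here as the transitive closure of the relation, the token names `t1`, `t2`, `t3` are hypothetical, and the matching relation is simply stipulated as a set of pairs.

```python
def ancestral(pairs):
    """Transitive closure of a binary relation given as a set of pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loops.
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical matching relation: symmetric but not transitive.
# t1 matches t2 and t2 matches t3, yet t1 does not directly match t3.
matches = {("t1", "t2"), ("t2", "t1"), ("t2", "t3"), ("t3", "t2")}

# Conformity, on this sketch, is the ancestral of matching.
conforms = ancestral(matches)
```

In this toy case `("t1", "t3")` is absent from `matches` but present in `conforms`, which is the sense in which the ancestral turns a non-transitive relation into a transitive one.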

18 As in the so-called endocentric constructions, e.g., ‘poor John,’ which belongs to the same category as ‘John.’ See Bloomfield, L., Language, p. 194.

19 It seems that this can be done by means of the devices developed by Martin, R. M. and Woodger, J. H., Towards an inscriptional semantics, this Journal, vol. 16 (1951), pp. 191–203.

20 We will call the immediate constituents, their immediate constituents, etc., down to ultimate constituents, simply the constituents of the language.

21 The systems constructed in this section will keep the symbolism of system I (as well as the numbering of definitions and theorems), but with numerical superscripts, ‘2’ for system II, etc. The symbols of system I appear without superscripts. Obviously, Kab = K²ab, EIa = EI²a, etc. Superscripts will ordinarily be dropped in such cases.