1. Introduction
Do-support is one of the most characteristic properties of English. It occurs in the following core cases, which are the focus of this article.Footnote 1
- (1)
a. Sandy did not call. [negation]
b. Did Sandy call? [interrogation]
c. Robin didn't call but Sandy did. [ellipsis]
Other languages do not have do-support, although many languages have periphrastic do of some type (van der Auwera 1999, Jäger 2006, Wichmann & Wohlgemuth, in press). My goal here is to give an account of English do-support that explains why do-support, with its attendant properties, is found uniquely in English among the world's languages. The account makes use of a particular view of constructions as a crucial part of syntactic theory. Such an account contrasts with the familiar derivational approaches of mainstream generative grammar.
The development of do-support involves a number of critical historical developments that have been addressed in the descriptive and theoretical literature. Old English, an early ancestor of Modern English, was a Germanic language with many of the properties of contemporary German.Footnote 2 Most importantly, Old English was essentially a V2 language, verbs were fully inflected, and there was no distinct subclass of modal verbs comparable to what we find in Modern English. The changes that are associated with the development of do-support include the following:Footnote 3
- (2)
a. growth of periphrastic do+Vinf;
b. loss of full V2 and reanalysis of V2 as “residual” V2 in questions and other “affective” environments;
c. formation of a distinct subclass of modal verbs;
d. loss of case and establishment of “positional licensing” of subjects;Footnote 4
e. loss of pre-verbal ne and introduction of post-verbal not as the marker of sentential negation.
Adopting the general perspective of Government and Binding Theory (GB) and later Principles and Parameters Theory (PPT), a number of scholars have sought to detail aspects of the changes in word order patterns from Old English through Early and Late Middle English to Modern English in terms of changes in the interactions between overt heads and complements, on the one hand, and abstract grammatical formatives, such as functional categories and grammatical features, on the other. This research has produced a number of proposals for changes not simply in terms of superficial grammatical patterns, but in terms of grammars. For some representative examples, see Kroch 1989, Kroch & Taylor 2000, Fischer et al. 2000, van Kemenade 1997, and Kiparsky 1997.
This research, while it differs from author to author in terms of specific proposals, shares the characteristic that it makes crucial use of abstract syntactic structure and movement. For example, the V2 property of Old English is typically (but not universally) accounted for by assuming that there is a functional head, minimally I0, which follows the subject position and precedes VP. I0 contains the finite inflection, which I denote here as [tense]. The highest V0 moves from the VP and adjoins to I0, which produces the result that the main verb functions as the head of IP.
(3)
I refer to this phenomenon as V-to-I.Footnote 5
In Modern English, in contrast, only auxiliary verbs (including modals) are adjoined to I0, on standard assumptions. An analysis along these lines captures the generalization that in languages such as German and French, which have full V-to-I, the distribution of the finite verb with respect to inversion constructions, negation, and adverbs closely parallels the distribution of the finite auxiliary verb in English, as outlined in Pollock's (1989) influential proposal.
On this view, the emergence of do-support is associated with the loss of V-to-I. Through this loss, English changes from a language of the German or French type to a language with the characteristic features of Modern English. In Modern English, finite tense marking must be licensed on a main verb through a mechanism other than V-to-I; this phenomenon is typically referred to as Affix Hopping (AH) (see Chomsky 1957 and, for recent reappraisals, Bobaljik 1995, Lasnik 1999, and Freidin 2004).Footnote 6 Do-support, on this general view, marks the position of I0 when it is not immediately adjacent to the verb and hence cannot appear on it through AH.
While this approach to do-support captures a number of important generalizations, it is problematic in two respects. First, it is not entirely satisfactory as an account of the synchronic English grammar, a fact that is well known but has never been satisfactorily resolved. Second, it accounts (in the sense of “keeping an account”) for the changes in the language but does not explain them.
I suggest that characterizing the changes in the history of English in such derivational terms, while appealing in some respects, does not constitute the most explanatory account. Section 2 argues that an analysis that makes crucial use of I0, as the V-to-I account does, is not the best way to explain the English verbal sequence, and is problematic in a number of other respects as well. Section 3 argues that a more satisfactory approach can be formulated that makes crucial use of the notion of construction. On this approach, the various word orders are realizations of different constructions that grow and contract over time and eventually end up in the form that we see in Modern English. (This account thus adopts a perspective on grammar that is explicitly rejected in derivational approaches.) Section 4 argues that it provides a better explanation for the actual course of events observed in the history of English.
2. Derivations
While there have been significant changes in syntactic theory during the more than fifty years of generative grammar, there is a common thread that runs through all of the analyses of the English verbal cluster: the tense inflection that is marked on the finite verb is syntactically isolatable as a constituent of the sentence (in fact, the head of the sentence), and its distribution determines whether or not there will be do-support. Where tense appears, and hence whether and where do appears, is a consequence of a syntactic derivation that crucially makes use of movement to create the configuration in which the elements that are responsible for the overt form are arranged with respect to one another.
In this section, I outline the essential properties of these analyses, and point out the problematic aspects of the general approach. I review briefly how syntactic variation is accounted for in these terms, and argue that positing I0 leads to an overall loss of generalization.
2.1. Affix Hopping
The earliest generative and derivational account of the English verb cluster is that of Syntactic Structures (Chomsky 1957). It is a classic analysis and widely known, so I review it only briefly in order to highlight its critical features.
Chomsky assumes that the verbal cluster excluding the main verb is generated as a complex unit, consisting of:
(4)
Each of the underlined elements is an affix, which is adjoined to the verbal element to the right of it by AH. Hence, if there is a Modal, have, or be (henceforth a Vaux), Tense is marked on this verb, as the examples in 5 demonstrate.
(5)
The sequence of terms in the rule accounts for the observed ordering, while the association of the affixes +en and +ing with their corresponding auxiliaries and AH captures the fact that have selects VPs with perfect inflection, and be selects VPs with progressive inflection.
The strict linear ordering of Modal, have, and be ensures that the verb is marked with Tense only if there is no intervening element.
- (6)
a.
b.
But if Tense is not immediately adjacent to V, for example because of the presence of not, AH is blocked. In this case, do must be inserted.
- (7)
a.
b.
Similarly, if Tense is moved away from V by SAI, Tense is not adjacent to V. AH is blocked, and do must be inserted.
- (8)
a.
b.
AH is also blocked, and do must be inserted, if V is moved away from Tense, for example by VP topicalization (as in 9a), or V is deleted, for example by VP ellipsis (in 9b) or pseudo-gapping (9c).
- (9)
a. . . . and call the manager did.
b. . . . and she did.
c. She likes beets no more than she does sweet potatoes.
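The adjacency condition on AH and the resulting distribution of do can be illustrated with a minimal sketch. It is only a toy model of the classical analysis as summarized above, not Chomsky's formalization: the token names (`PAST`, `PRES`, `EN`, `ING`), the small `VERBAL` lexicon, and the function `affix_hop` are all assumptions introduced for illustration.

```python
# Toy model of Affix Hopping (AH) with do-insertion.
# All names here are hypothetical and chosen for illustration only.
TENSE = {"PAST", "PRES"}
AFFIXES = TENSE | {"EN", "ING"}
VERBAL = {"will", "have", "be", "call"}  # toy lexicon (an assumption)


def affix_hop(seq):
    """Attach each affix to the verbal element immediately to its right.

    If Tense is not immediately followed by a verbal element (because of
    not, inversion, or a missing verb), do is inserted to carry it.
    """
    out = []
    i = 0
    while i < len(seq):
        tok = seq[i]
        if tok in AFFIXES:
            nxt = seq[i + 1] if i + 1 < len(seq) else None
            if nxt in VERBAL:
                out.append(f"{nxt}+{tok}")  # AH applies under adjacency
                i += 2
                continue
            if tok in TENSE:
                out.append(f"do+{tok}")  # AH blocked: insert do
                i += 1
                continue
            raise ValueError(f"stranded affix {tok!r}")
        out.append(tok)
        i += 1
    return out


print(affix_hop(["Sandy", "PAST", "call"]))         # adjacency: AH applies
print(affix_hop(["Sandy", "PAST", "not", "call"]))  # negation blocks AH
print(affix_hop(["PAST", "Sandy", "call"]))         # Tense fronted by SAI
print(affix_hop(["she", "PAST"]))                   # verb missing (ellipsis)
```

The sketch treats any non-verbal token after Tense as blocking AH, so it makes no attempt to distinguish not from other intervening material.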
The key insight of this analysis is that unaccompanied Tense functions just like Tense paired with a Vaux. The sequence in 10 functions as a unit with respect to negation, movement, and deletion.
(10)
However, on the classical analysis it is not a constituent. Thus, the analysis fails to “capture a generalization” in the traditional sense.
Another missed generalization is the fact that do is necessarily inserted after AH in such a way that it ends up in the same configuration with Tense that it would have if it was inserted before AH, that is, do+Tense. Thus, we have two ways of getting Tense onto the verb, one for Vaux and main verbs, and one for do.
Moreover, sequences like would not have, would not be, and would not have been show that not must precede have and be. That is, when there is no Modal, the order of elements is as in 11.
(11) Tense not have +en V
Tense not be +ing V
Since Tense-have and Tense-be also function as units with respect to SAI, it is necessary to shift the leftmost have or be to Tense before SAI. This movement is not blocked by the lack of adjacency between Tense and the auxiliary verb. {Have/be}-shift is another way in which the classical analysis misses a generalization about the distribution of Tense and do-support. The requirement that the rules of SAI, shifting {have/be} to Tense, and AH must be strictly ordered is a further loss of generality.
Finally, it turns out that only an intervening not causes do-support, and not an intervening adverb, even one that conveys negation.
- (12)
a. Sandy {will/did} not call.
b. Sandy {will/*did} {certainly/never} call.
This is a long-standing puzzle that has never been entirely resolved, and is rarely addressed (but see Pesetsky 1989 for a proposal).
2.2. The Shift to Structure: I0
It is natural to expect that the problems noted in the preceding section can be fixed by taking Tense-Vaux to be a constituent. In contemporary terms, this is the assumption that Tense(-Modal) is a realization of I0, the head of IP(=S). The verbal auxiliary is thereby brought into the general X' framework, which allows a uniform theory of phrase structure.Footnote 7 In SAI, I0 undergoes head-to-head movement to C0, and is thus local and structure-preserving.
This step is in fact the leading idea in the extension of X' theory to the “functional” categories and is the basis of many subsequent influential proposals in mainstream generative grammar, including uniform binary branching and the derivation of surface word order from branching structure (Kayne 1994), the localization of parametric variation (Borer 1984), the structure of the left periphery (Rizzi 1997), the DP hypothesis (Abney 1987), and so on.
In general, PPT defines the apparent location of I0 as the head of the projection IP, and more generally, the location of an X0 as the head of XP. If an overt head is assumed to originate in some other place than where it appears on the surface, for theoretical reasons, then there must be a derivation that moves it to its observed position. This is seen in the mainstream analysis of inversion in German and French, where the main verb moves out of the VP into I0, and from there to C0 (Pollock 1989, Den Besten 1983). In English, by contrast, only the auxiliary verbs (including modals) adjoin to I0 and function as the head of IP. The main verb remains in situ as the head of VP. Hence we have Will you eat the vegetables? but *Eat you the vegetables?
However, the assumption that I0 functions in this way does not actually improve the analysis of the English verbal cluster—it makes it worse. Below I outline the reasons for this and conclude that I0 should be abandoned. To make the analysis concrete, the following must be assumed. (The notion [tense] is intended to indicate that the tense marking is a feature.)
- (13)
a. I0 is the head of IP;
b. I0 precedes (not) VP;
c. I0 contains [tense] or Modal[tense];
d. [tense] is discharged if it locally c-commands V[tense]; do is inserted if [tense] is not discharged;
e. each head (except [tense]) selects its complement. In particular:
i. [Modal[tense]] selects VP[bare], that is, uninflected VP;
ii. have selects VP[past.prt] and be selects VP[pres.prt];
iii. the features of a projection are realized on its head;
f. {have/be} raise to I0 when I0 contains only [tense].
On this analysis, I0 is a constituent, precedes not, and undergoes SAI, which captures a generalization in configurational terms. The cost of such an analysis amounts to three stipulations: (i) that a tensed verb in situ is licensed by [tense] in I0 but not by Modal[tense] in I0, (ii) that {have/be} substitute for do, so that we do not derive *doesn't have called, *doesn't be calling, and (iii) that do must be inserted under certain circumstances; for example, as a “last resort” in the Minimalist Program (Chomsky 1995).
With the shift to I0, questions arise about how exactly to get do-support to apply. Assume that I0 consists either of [tense] alone, or Modal[tense], which are in complementary distribution in I0. The following VP then has two possibilities. First, it may be a bare VP headed by a main V0, with no tense marking. Second, it may be a tensed VP, since in a simple declarative clause the main V0 is tensed and in situ. These two possibilities combine with the two possibilities for I0 to give four combinations.
- (14)
a.
b.
c.
d.
Not all of these combinations correspond to well-formed derivations. Combination 14a is possible only if [tense] and VP are ultimately not adjacent and do is introduced. Combination 14b is possible only if I0 and VP are adjacent. Whereas 14c always leads to a well-formed sentence, regardless of what happens to VP, 14d can never yield a well-formed sentence.
To capture these facts, we need to assume that when [tense] is alone in I0, it licenses and is “discharged” by a tensed main verb V[tense]. But when [tense] appears with Modal, only a bare VP is possible.
- (15)
a. Unattached and undischarged [tense] is illformed, but it is saved by introduction of do. This takes care of 14a.Footnote 8
b. VP[tense] is licensed by a locally c-commanding [tense], which is thereby “discharged,” which accounts for 14b.Footnote 9
c. Modal[tense] either does not have to be licensed, or it is automatically licensed because it is in I0. This accounts for 14c.
d. VP[tense] is not licensed by Modal[tense], and thus 14d is ungrammatical.
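The licensing conditions in 15 amount to a small decision table over the content of I0, the form of the VP, and adjacency. The following is a minimal sketch of that table under my reading of the text, which takes 14a–d to be the pairings [tense]+bare VP, [tense]+tensed VP, Modal[tense]+bare VP, and Modal[tense]+tensed VP, respectively. The function name `license_tense` and the string labels are hypothetical.

```python
def license_tense(i0, vp, adjacent):
    """Return (well_formed, do_inserted) for one I0/VP combination.

    i0       -- "[tense]" or "Modal[tense]" (hypothetical labels)
    vp       -- "bare" or "tense"
    adjacent -- True if I0 and VP end up adjacent
    """
    if i0 not in ("[tense]", "Modal[tense]") or vp not in ("bare", "tense"):
        raise ValueError("unknown I0 or VP value")
    if i0 == "Modal[tense]":
        # 15c/15d: a modal takes only a bare VP; adjacency is irrelevant.
        return (vp == "bare", False)
    if vp == "tense":
        # 15b: V[tense] is licensed only by a locally c-commanding [tense].
        return (adjacent, False)
    # 15a: undischarged [tense] over a bare VP is saved by inserting do.
    return (not adjacent, not adjacent)


for i0 in ("[tense]", "Modal[tense]"):
    for vp in ("bare", "tense"):
        for adjacent in (True, False):
            print(i0, vp, adjacent, license_tense(i0, vp, adjacent))
```

Laid out this way, each of the four rows needs its own clause, which is the redundancy that the discussion goes on to identify.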
This analysis compares with the classical analysis as follows:
The failure of discharge of [tense] (the functional equivalent of the blocking of AH) produces do-support, as stipulated in 15a, as in the classical analysis.
The discharge of [tense] in 15b is the functional equivalent of AH, as in the classical analysis.
Since this analysis assumes that each verb is the head of its own VP, we still have to say something about the situation in which the VP is headed by an auxiliary V0 (have or be) and I0 contains only [tense]. That is, the auxiliary verb has to raise into I0 if there is only [tense] in I0. This is the equivalent to the auxiliaries moving to the right of tense in the classical analysis.Footnote 10
There are five main puzzles with this type of account of do-support.
A. Why is AH, or the equivalent, needed? In this particular analysis, why does [tense] in I0 license V0[tense], while [tense] on a modal or auxiliary does not?
B. Why does AH (or the functional equivalent) require strict adjacency between I0 and V0?
C. Why does negation block AH, or otherwise trigger do-support, while adverbs do not?
D. Why do auxiliary verbs with [tense] have to be raised into I0? And when they are raised, why do they end up having exactly the form that they would have if they were main verbs subject to AH (or the equivalent)?
E. Why is do inserted, and why does it get exactly the form that it would have if it was an underlying Modal?
These questions point to the same redundancy that we found in the classical analysis. The redundancy arises even when we formulate the analysis in terms consistent with X' theory, feature discharge, licensing, and movement. Essentially, there are four ways in which [tense] can be licensed on a verb: (i) it is “Merged” as Modal[tense], (ii) it is “Merged” as V0[tense] and licensed by I0[tense], (iii) it is the result of raising of {have/be} to I0[tense], or (iv) it is the result of do-support.
We might try to achieve some simplification by assuming that [tense] is in I0 alone, and that Modal, have, and be are each heads of their own VP projection. Then in order to function as the head of IP, a tensed Modal, have, and be must all raise to I0. The two situations are then as in 16.
- (16)
a.
b.
We assume that the Vaux that is raised discharges the [tense] in I0.Footnote 11
The peculiarity of this alternative is that the raising of Vaux to I0 is not blocked by not, while the discharge of [tense] by the main verb is blocked by not.Footnote 12
As far as I can tell, this is the best that can be done given the key assumptions of the account in which I0 is the head of the sentence, heads raise to other heads, and do is inserted as a “last resort” when [tense] is not otherwise licensed. This redundancy is the result of what is the “tyranny” of I0. Regardless of the technical details, the English verbal cluster displays idiosyncrasies that do not succumb to a systematic treatment in terms of branching structure, functional heads, and movement.
Bobaljik (1995) proposes an adaptation of the classical analysis in which do-support is triggered by the failure of Tense to adjoin to V0. On his analysis, Tense must be attached to a V0 in order to satisfy a morphological requirement; that is, it undergoes a form of AH. When attachment of Tense is blocked, do-support applies so that the features of Tense can be expressed, along the lines that we have sketched. Putting AH in the morphology and not the syntax makes the adjacency requirement more natural, although it does not avoid stipulations. Bobaljik (1995:62) must assume that intervening adverbs do not interfere with the adjacency of Tense and V0. As in the classical analysis, {have/be} must raise in his analysis, again to allow Tense to attach to V. There is a residual redundancy in that the morphology is now doing work that replicates what would happen in the syntax, if do was a modal in the first place.
These redundancies can be eliminated if we abandon the notion that I0, consisting of the sequence Tense-(Modal), is the head of the sentence and that more generally, empty functional heads are responsible for the derivation of surface order. Such a step also has the virtue of eliminating problems that I0 introduces in the analysis of V-final languages such as German and Dutch; for discussion, see Kiparsky 1996, Ackema et al. 1993, and Sternefeld 2006:507ff.
Eliminating I0 and other functional heads also simplifies the analysis of constituent ordering, both synchronically and diachronically. The possibility of raising V0 to I0 and also of moving constituents higher in the tree to specifier positions of various functional heads introduces too many degrees of freedom into the analytical arena.Footnote 13 For example, on the assumption that the canonical order in the VP is [V0 (NP) XP*], any order in which the NP and other constituents appear to the left of V must be derived by movements of the NP and XPs to the left, to the Spec position of functional heads. If the V is assumed to move as well, for example to I0, the proper analysis of any given word order is problematic at best, especially if one allows a rich set of invisible functional heads above V0 (as in, for example, Cinque 1999). The situation is further complicated if the possibility of rightward movement is also allowed.
The difficulties that such a derivational approach presents for the analysis of a V2/OV language are in fact highlighted by Kroch & Taylor (2000:135) in their discussion of verb-object order in Early Middle English. They write “we know that Old English allowed both leftward scrambling and rightward extraposition of complements and adjuncts (Kemenade 1987, Pintzuk & Kroch 1989) and these movements obscure underlying order even in the absence of verb movement (Pintzuk 1991).”
These observations echo the conclusion arrived at by Culicover & Rochemont (1990) in a critique of rightward movement and uniform binary branching. In the absence of independent stipulations regarding the direction of movement, the underlying structure, and the triggers of movement, the surface structure significantly underdetermines the analysis. The surface order alone even more severely underdetermines the analysis.
The alternative that I therefore consider here is related to the one ruled out by Kroch (2003:702), who claims that it “depends on a fragile assumption; namely, on the existence of directionally consistent drifts in usage over long periods of time that are unconnected to grammar change.” On the view that I develop, language change is still grammar change. But the grammar is formulated in terms of constructions, defined in terms of shared surface properties and systematic correspondences with meaning, and not in terms of parameter settings, such as whether or not the language has V-to-I.
3. Constructions
In section 3.1, I review briefly the constructional approach to grammar. The basic idea is that there are constructions in language that involve complex syntactic structure, but are associated with particular sets of lexical items or more general categories, restricted meanings, or both. As these constructions become more general, and drop their lexical conditions, they take on the status of fully general rules, with compositionally transparent meanings.Footnote 14
What is particularly interesting in the history of English is that there are some constructions that have arisen as the result of the narrowing of the scope of otherwise fully general constructions. This has occurred because of the growth of competing constructions that have taken away some of the domain of the earlier constructions.
3.1. Correspondences
I assume as background the general approach of Jackendoff 2002, Culicover & Jackendoff 2005 (chapter 1), and Culicover 1999. On this view, in addition to words and rules, a grammar contains complex lexical entries that combine some degree of generality with irreducible idiosyncrasy. These lexical entries express correspondences between form and meaning. Correspondences that are more complex than a word, yet not fully general, are constructions.
Briefly, a critical motivation for the constructional approach is that certain aspects of meaning appear to adhere to a complex syntactic structure, and do not follow from a narrowly compositional interpretation (what Culicover & Jackendoff (2006) call Fregean Compositionality (FC)). That is, there are aspects of the meaning of expressions that do not reside in a particular overt element in the expression, but in the syntactic structure of the expression.Footnote 15 A typical example is given in 17.
(17) The car roared around the corner.
This construction denotes movement along a path in a certain manner, in this case, roaring. The verb roar does not have the meaning MOVE or [MANNER: ROAR] associated with it; these aspects of the meaning are a product of the construction itself. The alternatives are to imbue the lexical item roar with the meaning either as an alternative entry, as in 18a, or as a lexical derivation based on the general rule, as in 18b.
- (18)
a.
b.
Culicover & Jackendoff (2005) suggest that constructions in this sense can and should be incorporated into an extended conception of the lexicon. Following Jackendoff 1990 and more recent work, the lexicon on this view consists of correspondences between syntactic structure (syntax), phonology (phonology), and conceptual structure (cs). A word is an irreducible unitary correspondence set; constructions are syntactically more complex. On this view, idioms are specialized constructions with (relatively) simple semantics. There are other constructions as well that are multiword expressions whose meaning is not completely predictable from the meanings of the parts. Some illustrative examples are given in 19a (word) and 19b (idiom). For simplicity I use the English orthography instead of the phonetic representation. Subscripts mark those parts of each representation that correspond to one another, and φi denotes the phonological form of the syntactic constituent with the same index.
(19) a.
b.
The crucial property of a representation such as 19b is that it expresses a correspondence between a particular meaning, in this case MANNER:EXTREME, and a particular syntactic form, in this case V one's heart out. This construction has the form of the more general English verb-particle construction, illustrated in 20, but in the case of heart out, the form and position of the particle are fixed (21).
- (20)
a. Terry looked the information up.
b. Terry looked up the information.
- (21)
a.
b.
Other specific constructions based on the general verb-particle pattern show the other order.
- (22)
a. Terry programmed up a storm.
b. *Terry programmed a storm up.
A crucial property of a construction is that the correspondence that it expresses relates a specific surface order to a specific meaning. The complexity of the correspondence may thus be measured in terms of how specific the description of the surface order is, and how specific the description of the meaning is. As the example of kick the bucket and V one's heart out shows, a construction may be expressed in terms of particular words and very specific meanings. The example of V one's heart out also illustrates how a construction may have a description in terms of a category—it is possible to replace heart in this construction with one of a class of lexical items that denotes an inalienable and highly personal body part, such as guts or brains.
More general correspondences are less highly specified, but constrain the order of constituents. For example, Culicover & Jackendoff (2005) propose correspondences that specify that the grammatical function of Subject corresponds to an NP that precedes the tensed verb.Footnote 16 The verb, in turn, is head-initial in VP. As in HPSG and LFG, we express constituent ordering and the correspondences between such orderings and meaning directly, and not in terms of derivation.
These correspondences are summarized graphically in 23–25. The diagram in 23 shows that (in English) when a CS argument of some CS relation F, such as EAT, corresponds to the Subject grammatical function, this argument is expressed as a left branch of the sentence S in syntactic structure.
(23)
The diagram in 24 illustrates that a temporal operator Ψtime (such as PRESENT or PAST) corresponds (in English) to tense morphology on the verb.
(24)
The diagram in 25 shows that an aspectual operator such as PERFECT corresponds (in English) to the verb have.
(25)
Correspondence rules such as these express the possible form/meaning pairings that constitute well-formed correspondences in English. For example, a representation such as 26 satisfies all three of these correspondences simultaneously.Footnote 17
(26)
3.2. Ordering the Correspondences
In the illustration shown in 26, the correspondences 23–25 apply simultaneously. It is interesting to observe that if we order the correspondences in terms of their application to a string of constituents, it is possible to reduce, if not entirely eliminate, reference to structure.Footnote 18 Consider, for example, the correspondences that govern the distribution of topic and subject in English. Assume that the topic expresses some correspondence with information structure (IS).
(27) An analysis like that, John could never subscribe to.
The generalization about where the topic goes is straightforward: it is leftmost in the clause. It is possible to capture this fact in structural terms, by assuming that there is a particular location in the syntactic structure into which the topic goes, along the lines of Rizzi 1997 and virtually all derivational analyses.
(28)
As Rizzi's analysis clearly shows, such an approach is problematic, because topics in English precede the subject, but may precede the complementizer position in wh-questions, or follow the complementizer position in that-clauses.
- (29)
a. An analysis like that, how could anyone actually subscribe to?
b. It is abundantly obvious that an analysis like that, John would never subscribe to.
Therefore, on the derivational approach it is necessary to have (at least) two Topic0 functional heads and two landing sites for a topicalized phrase. Such an account clearly misses a generalization, which is that the topic is always clause-initial, regardless of what type of clause it is. But “clause-initial” means after the complementizer in a subordinate clause.
The formulation of correspondences, and more specifically constructions, in linear terms offers a way of capturing this generalization. Reducing matters to their essentials, what we want to say is simply the following, where Topic is an XP that corresponds to a particular IS interpretation:
(30) topic-position: Topic is leftmost.
As we have just seen, it is not possible to state the property “clause-initial” in structural terms, because the initial part of one type of clause does not look like the initial part of another type of clause. But suppose that we consider the linear ordering of constituents, without considering the structure, as in 31.
(31) XP NP V[tense] . . .
Suppose further that in the CS representations, XP corresponds to an appropriate IS and NP corresponds to Subject. Then the sequence of constituents in 31 satisfies 30, since XP counts as Topic.
Consider next the position of the subject. If there is no topic, then the subject is initial in the sequence by default. (There are of course constructions that position the subject elsewhere.) If there is a topic, then the subject is initial by default in the string of constituents following the topic. The simplest generalization about the position of the subject is the following:
(32) subject: An XP that corresponds to Subject is leftmost.
Now, the correspondences topic and subject conflict, since they both say that XP is leftmost. It is clear how to eliminate the conflict: order the correspondences so that topic applies first to the string of constituents, and then subject applies to the remaining string of constituents. For example, if the string is 31, XP is leftmost in the entire string, satisfying topic, and the NP is leftmost in the remaining string, satisfying subject.
This “dynamic” approach to applying correspondences reveals an interesting relationship between structure and order. As Kayne (1994) demonstrated, it is possible to encode order in terms of structure, given certain assumptions. By the same token, it is possible to code structure in order, again given certain (different) assumptions. In this case, we make it a part of the theory that a correspondence applies only to the portion of the string that is not already accounted for by a prior correspondence. Let us call the portion of the string that a correspondence applies to the (linear) domain of the correspondence.
By ordering the correspondences appropriately, and applying this notion of domain, it is possible to eliminate redundancies and exceptions. In particular, it is not necessary to mention the right context of a correspondence if the following holds: (i) by default every correspondence applies from the left of its domain, (ii) the correspondences are (partially) ordered, and (iii) the domain of a correspondence excludes what is in the domain of a correspondence ordered before it. That is, a given correspondence applies if it is satisfied by the linear sequence of constituents beginning from the left, excluding everything further to the left that has already participated in a correspondence.
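The procedure just described can be made concrete with a small sketch. The Python fragment below is only an illustration under simplifying assumptions, not a formalization of the rules in 33 and 34: constituents are reduced to (category, role) pairs, the role labels are hypothetical stand-ins for the IS, GF, or CS element involved, and the names `leftmost` and `apply_in_order` are invented for the example. Each correspondence inspects only the leftmost element of its domain, and the domain of each correspondence excludes whatever an earlier correspondence has already claimed.

```python
from typing import Callable, List, Tuple

# A constituent is a (category, role) pair; the role stands in for the
# IS/GF/CS element it corresponds to. These labels are illustrative only.
Constituent = Tuple[str, str]
Correspondence = Tuple[str, Callable[[List[Constituent]], bool]]


def leftmost(role: str) -> Callable[[List[Constituent]], bool]:
    """Build a test that holds if the leftmost constituent of the domain bears `role`."""
    return lambda domain: bool(domain) and domain[0][1] == role


def apply_in_order(string: List[Constituent], correspondences: List[Correspondence]):
    """Apply ordered correspondences from the left of the string.

    Each correspondence sees only the part of the string not yet claimed by
    an earlier one (its linear domain). A correspondence that is not
    satisfied is skipped, as when a sentence has no topic.
    Returns the (name, category) matches and the unclaimed remainder.
    """
    start = 0
    matches = []
    for name, holds in correspondences:
        domain = string[start:]  # everything further left is already claimed
        if holds(domain):
            matches.append((name, domain[0][0]))
            start += 1
    return matches, string[start:]


# topic is ordered before subject, as in the discussion of 30-32.
ORDER = [("topic", leftmost("Topic")), ("subject", leftmost("Subject"))]

with_topic = [("XP", "Topic"), ("NP", "Subject"), ("V[tense]", "Verb")]
no_topic = [("NP", "Subject"), ("V[tense]", "Verb")]

print(apply_in_order(with_topic, ORDER))
print(apply_in_order(no_topic, ORDER))
```

Because neither correspondence mentions its right context, the same two statements cover both the string with a topic and the string without one; only the ordering and the notion of domain do the work.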
Following the outline of 23–25, but eliminating redundant structural and contextual information, it is possible to formulate correspondences using the general notation in 33.
(33)
Such a rule expresses a correspondence in which syntactic X is leftmost and corresponds to some Y that is an aspect of CS, of IS, or a GF. The notation “⇒” says that the correspondence holds when X is leftmost in the sequence.
Some default correspondences for English are found in 34. The notation “#” signifies the beginning of a sentence. For each correspondence, I give an informal description of the generalization that is expressed.
- (34)
a.
b.
c.
d.
e.
f.
The verb is not mentioned in the tense constraint, since [tense] itself is bound inflectional morphology that presupposes the presence of a verb.Footnote 19
The ordering of these constraints is summarized in 35.
(35)
This ordering produces the canonical linear constituent order for English.
(36)
This is a correct result.
There are, of course, alternative constituent orderings in English. These are themselves the responsibility of specific constructions, which have the property of relating particular constituent orderings to particular meanings, as we have seen. There is a family of constructions for subject-aux inversion (SAI), for example, and similar constructions for the distribution and interpretation of negation. For simplicity I concentrate on the case of SAI that occurs in questions and the core case of sentential negation in declaratives.
3.3. Constructional Domains
Consider yes-no and wh-questions. I use the symbol Q to represent the CS interrogative operator. The difference between inverted and noninverted clauses is that in the case of inversion, the position of [tense] must precede the position of the subject, and is restricted to Vaux.
(37)
This correspondence maps the CS operator Ψtime to [tense], thereby preempting correspondence 34e (tense). However, Vaux is available for an additional correspondence with a CS operator such as PERFECT, along the lines of 25. Thus, the ordering of constraints is as given in 38.
(38)
Finally, we need to specify where the wh-phrase goes in a wh-question. Clearly it follows the topic and precedes inversion in a root question.
- (39)
a. Tomorrow, where are we going to stay?
b. To Sandy, what do you plan to say?
The ordering of application of correspondences should be the following, where wh-position follows topic and precedes sai and subject.
(40)
The problem with this ordering is that it predicts incorrectly that it is possible to have topic before a wh-phrase in subordinate clauses.
- (41)
a. *I wonder tomorrow, where we are going to stay.
b. *I wonder to Sandy, what you are planning to give.
One possibility is to rule such cases out on the basis of an incompatibility between the information structure function of topicalization in English and indirect questions. In the absence of a comprehensive account along these lines, we must simply stipulate that the incompatibility exists.Footnote 20
By requiring in correspondence sai that Vaux appears immediately after the wh-phrase, we account for the distribution of inversion in main questions. The domain of sai only exists when neither Q(Ψtime) nor Subject has yet been mapped to syntax. According to sai, Q(Ψtime) corresponds to Vaux, and sai applies before subject applies. Hence the tensed auxiliary verb precedes the subject. However, if the wh-phrase happens to be a subject, then wh-position applies before sai, putting the wh-phrase in initial position. There is no inversion, since the GF Subject is already realized syntactically where it has to go. Because sai is a construction, it applies in the same domain as subject and tense and excludes them from applying.
Negation in English is another construction that crucially mentions the category Vaux.
(42)
This correspondence is ordered after subject and before tense. If s-neg is satisfied in a domain, tense is irrelevant, since the temporal operator Ψtime is already involved in a correspondence with tense, by satisfying 42.
3.4. Do-Support
Suppose now that we have an instance of an inversion construction, and the only relation in CS is F, which corresponds to a main verb. By the correspondence verb, F corresponds to a V in VP, while by sai, [tense] in the interrogative must be realized on Vaux to the left of the subject, which is not the position of V. A constraint that says that an element of a certain category must appear in a certain position says that there must be something in the syntactic structure that participates in the correspondence. Thus, there must be an expletive if there is no available lexical item. In this case, the requirement that Vaux must be leftmost in its domain means that Ψtime will be realized overtly as do[tense], if there is no operator in the CS that corresponds to a Vaux. More generally:
(43) Expletives: a formal (grammatical) constraint must be satisfied. If there is no lexical item available for this purpose, an expletive must be used.Footnote 21
Thus, in English the default realization of [tense] is do and the default realization of Subject is it. On this analysis, do is an expletive auxiliary verb.
This analysis of do-support generalizes from cases of inversion to other cases where [tense] cannot be realized on a verb. This occurs when there is negation, ellipsis or VP topicalization, as discussed in section 2.1.
Since do occurs only if a correspondence involves a construction that explicitly mentions Vaux, the redundancies and adjacency problems raised in connection with the classical analysis and the contemporary reformulation do not arise. For example, in contrast with s-neg, there is no construction that specifically mentions Vaux in the position of adverbs. Therefore, we do not get do-support when there is an adverb preceding the main verb. There is no loss of generality in the assignment of [tense], since the correspondence that takes care of the position of [tense] says quite generally that it is marked on the leftmost verb in the sequence. If this verb is a main verb, a modal, or an auxiliary verb, that is where [tense] will appear. Further, if a particular construction stipulates that there must be a Vaux[tense] in a particular position, that is where do[tense] will appear.
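The logic of 43 as it applies to [tense] can be sketched as a small decision procedure. The following Python fragment is a toy illustration under stated assumptions, not part of the analysis itself: the function name `tensed_element`, the flag `needs_vaux`, and the miniature operator lexicon are hypothetical, and CS is reduced to a list of operator labels ordered from highest to lowest.

```python
from typing import List

# Hypothetical toy lexicon: CS operators that have an auxiliary realization.
# PERFECT is mentioned in the text; the other entries are assumptions.
AUX_FOR_OPERATOR = {"PERFECT": "have", "PROGRESSIVE": "be", "POSSIBLE": "can"}


def tensed_element(cs_operators: List[str], main_verb: str, needs_vaux: bool) -> str:
    """Return the form that carries [tense].

    cs_operators: CS operators above the main relation, highest first.
    needs_vaux:   True if a construction (sai, s-neg, ...) mentions Vaux.
    """
    for op in cs_operators:
        if op in AUX_FOR_OPERATOR:
            # A lexical auxiliary is available, so no expletive is needed.
            return AUX_FOR_OPERATOR[op] + "[tense]"
    if needs_vaux:
        # Principle 43: nothing lexical satisfies the constraint, so the
        # expletive auxiliary is used.
        return "do[tense]"
    # Default correspondence: [tense] is marked on the leftmost verb.
    return main_verb + "[tense]"


print(tensed_element([], "call", needs_vaux=True))            # inversion or negation
print(tensed_element([], "call", needs_vaux=False))           # plain declarative, or adverb before V
print(tensed_element(["PERFECT"], "call", needs_vaux=True))   # auxiliary already available
```

On this sketch an adverb before the main verb never triggers do, simply because no construction sets `needs_vaux` in that environment.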
The cost of such an analysis is that it requires that every construction in which there is do-support must explicitly mention Vaux. This includes not only SAI and negation, but ellipsis and related constructions, and VP-topicalization.Footnote 22
The history of do-support may now be viewed as the generalization of do to the status of the default Vaux that satisfies all correspondences in which the position of Vaux is specified. The enabling conditions on this development are therefore the following:
The emergence of the category Vaux.
The emergence of constructions, or correspondences, that explicitly mention the position of Vaux.
In the next section, I review briefly the historical record that bears on this account. The emergence of do as a general Vaux was preceded by its occurrence in a number of more restricted and specialized constructions, hence the term “rise of constructions” in the title of this article. The emergence of constructions that mention Vaux, and in particular sai, is the well-attested consequence of the contraction of V2, hence the term “fall of constructions.”
4. Tracing the Changes
I start with the premise that the best account of the change from Old English to Modern English is one that makes minimal assumptions about what changed, and imposes minimal stipulations on the various stages and the changes themselves.
The search for a minimal account is plagued by the fact that there are so many degrees of freedom in what counts as a possible solution.Footnote 23 Accounts in PPT try to limit the solution space by assuming an analysis of Modern English in terms of I0 and V-to-I, and formulating the changes in terms of the behavior of V with respect to I0, and more generally in terms of movement, for example, of constituents (such as DPs) to the Spec positions of functional heads (such as Agr0). However, the centrality of movement to the analysis, even with relatively specific constraints, poses significant challenges for attempts to narrow the domain of possible scenarios (see for example the quotation from Kroch & Taylor 2000 in section 2.2).
Clearly, an account of the change depends crucially on the characterization of the endpoint of the change. In the preceding section, I outlined a grammatical account of Modern English that does not assume movement. Rather, the relative positions of constituents of the sentence are accounted for in terms of correspondences that state linearization constraints. These linearization constraints are ordered in such a way that they license certain constituent orderings that correspond to CS representations. The following are two key components of this analysis.
1. linear realization of [tense] and default realization of do when there is no Vaux;
2. linearization of Subject.
These constitute the locus of the change. Focusing on these aspects of the grammar permits a more faithful account of the actual facts seen in the course of time, and opens up some interesting possibilities for establishing causal relationships.
The historical developments that led to the current state of affairs in Modern English are well documented. Here I review only those aspects that are central to the constructional story. In sections 4.1 and 4.2, I review evidence that shows that the occurrence of periphrastic do is a widespread phenomenon in languages of the world, and discuss the history of do-support itself. Section 4.3 summarizes the transition from general V2 to SAI. This transition involves the emergence of the category Vaux, the loss of V2 in cases of topicalization under certain conditions, and positional licensing of the subject.
A key simplifying assumption that makes this account workable is that constituent order is the province of three classes of constraints: (i) those that govern the default correspondences (the grammatical rules), (ii) those that specify individual constructions, and (iii) those that concern articulation with focus, coherence of processing, and other components that are not narrowly grammatical, but are realized in terms of constituent order. The interactions between these types of constraints are discussed in section 4.3.
4.1. Periphrastic Do
As I point out in section 2, the analysis of contemporary do-support is faced with two main problems: the redundancy of AH and the fact that adverbs do not enter into the analysis. These problems are reified in the analysis of Old English, since it must be assumed that before there was do-support, negation did not block AH (or its functional equivalent). Kroch (1989) suggests that in Old English, as presumably in other languages, following the analysis of Zannuttini 1991, negation was a specifier and became a head over a period of time.
But on Zannuttini's analysis, in a number of languages negation is a head. These languages do not have do-support. So we have to assume that these languages do not have AH either. Thus, the situation in English turns out to be a remarkable coincidence in that a language, which is apparently just like German in its syntax, simultaneously manages to develop AH, negation as a head, and do as a dummy modal.
In comparison, do-periphrasis is well-attested in languages of the world. For example, Benincà & Poletto (2004) describe the Monnese dialect of Northern Italian in which fa 'do' is obligatorily used in yes-no and non-subject wh-questions, inverting with the subject. Some illustrative examples are given in 44.
- (44)
a.
b.
c.
d.
e.
f.
g.
h.
i.
j.
Crucially, fa-support in Monnese alternates with V-to-I. V-to-I occurs with negation, while do-support occurs in questions.
- (45)
a.
b.
These examples are consistent with the proposal that Monnese has a form of do-support, which is limited to questions. However, as Benincà & Poletto (2004) suggest, fa-support in Monnese can be seen as a type of light-verb construction, which has generalized to most of the lexicon and which is restricted to questions.Footnote 24
The analysis of Monnese suggests that a similar development may have occurred in Old English as a precursor to modern do-support. In fact, Jäger (2006) demonstrates that periphrastic do is very widespread in the world's languages. Its distribution does not appear to be determined by genetic relatedness or by geography, nor does it appear to be correlated with particular grammatical properties of a language. It has many functions across languages, some of which are strictly formal while others are pragmatic.
There are, however, several distinct types of morphosyntactic environments in which do-periphrasis appears. According to Jäger (2006:92), there are two types of do-periphrasis that apply to English:Footnote 25
“Type 1. The appearance of lexical or morphological material in the clause triggers verbal periphrasis, in most cases material that attaches to the lexical verb and thus prevents the realisation of regular verb morphology. This material usually belongs to a closed class and its function is similar to that of regular verb morphology, i.e. verbal categories, and/or adverbial modification.
Type 2. If a language has rigid or dominant word order, periphrasis is used to mark clause types that display a deviant or irregular word order or to maintain a close approximation of the regular word order in these, i.e. to keep the relative order of verb and object unchanged. Focalisation, topicalisation, and interrogativity are the most common functions that can be associated with periphrasis in this context.”
Based on this view, the origin of do-support may have had the consequence of keeping certain verbs free of inflectional morphology, and of keeping the main verb and its complements adjacent. Both may be viewed as having to do with the minimization of complexity. The first reduces paradigmatic complexity, whereas the second minimizes the domain in which the dependency between the verb and its complements can be computed, as in the sense of Hawkins 2004.
Jäger's study shows that English is by no means unique in having do-periphrasis. This is to be expected, if the motivations for do-periphrasis in fact have to do with the reduction of complexity. What is characteristic of English is the spread of do-periphrasis from particular lexically constrained environments to specific syntactic environments, and its ultimate generalization to all environments where Vaux is constructionally specified. However, there appears to be no reason in principle why similar developments could have not occurred in other languages. The absence of evidence bearing on this question calls for the cross-linguistic study of the diachrony of do-periphrasis, to see if we in fact find patterns that are similar, if not identical, to what we find in the history of English.
4.2. Growth of English Do-Support
The historical development of do-support has been amply documented in the literature. Kroch (1989) carefully and intensively analyzed the data in Ellegård 1953 to demonstrate that do-support developed in each of the contexts in which it currently appears in Modern English, at the same rate, but at different stages. There are six contexts distinguished in Ellegård 1953 that Kroch tracks: negative declaratives (neg decl), negative questions (neg Qs), affirmative transitive adverbial and yes/no questions (aff trans & y/n Qs), affirmative intransitive adverbial and yes/no questions (aff intrans & y/n Qs), affirmative wh-object questions (aff wh-Qs), and “contact” do, where do immediately precedes V. Each of these grows in the percentage of cases that do is used versus the percentage of cases where it could be used, from the period 1400–1425 to the period 1650–1675, with the exception of contact-do. This use peaked in the mid-16th century and then declined until it was virtually unattested by the end of the 17th century.Footnote 26
Kroch argues that the rate of development of each use of do is constant, but that some uses are more advanced than others. This can be seen from the data for the last period (1650–1700), where the percentages and total number of attested examples for the six cases are as in table 1 (the data are Ellegård's as represented in Kroch's tables).
What is interesting about this data is that do-support was virtually obligatory in some cases, optional in others, and non-existent in still others. This variability is found throughout the period of change. The subsequent development to Modern English makes all of the optional cases obligatory.
There is an apparent incompatibility between the variability of do-support at various stages of the language and the theoretical account in terms of I0 and V-to-I. Kroch (1989) suggests that at these intermediate points in the development of the language, there were two grammars, one with V-to-I for main verbs and the other without V-to-I. While such a formal solution does provide a way of accounting for apparent optionality, it is a rather crude tool, since in itself it cannot account for frequency data as seen in table 1.
It is possible to enrich the multiple grammar account with a device that determines the percentage of cases in which each grammar applies. Since the percentages are different for the different morphosyntactic contexts, this device must encode this grammatical information. Since the percentages are different for different speakers (or different writers), the coefficients must vary from speaker to speaker.
Furthermore, in order to account for the development, we must assume a mechanism that adjusts the percentages over time, for each morphosyntactic context. Kroch has convincingly demonstrated that the rate of change in each case is precisely what we would expect if the various pieces of grammatical knowledge, such as do-support in questions, do-support in negative contexts, and so on, spread through the population following a typical population dynamics pattern. Since each context is developing independently, there appear to be separate mechanisms that encode the frequency information for each context.
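The idea that each context changes at the same rate while sitting at a different stage can be illustrated numerically. The sketch below assumes the S-shaped logistic curve that is standard for modeling the spread of a variant through a population; the slope and intercept values and the context labels are hypothetical and are not fitted to Ellegård's counts. With a shared slope and context-specific intercepts, the difference between two contexts on the logit scale stays constant over time, even though their raw percentages differ at every point.

```python
import math


def logistic(t: float, slope: float, intercept: float) -> float:
    """Proportion of do at time t under logistic growth."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * t)))


def logit(p: float) -> float:
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


# Hypothetical parameters, NOT fitted to the historical data.
SLOPE = 0.02  # shared rate of change per year
INTERCEPTS = {"context A": -3.0, "context B": -5.0}  # A is more advanced than B


def proportion(context: str, years_since_start: float) -> float:
    """Proportion of do in a context, a given number of years into the change."""
    return logistic(years_since_start, SLOPE, INTERCEPTS[context])


for year in (0, 100, 200):
    a = proportion("context A", year)
    b = proportion("context B", year)
    # The last column is the logit difference, which does not change with time.
    print(year, round(a, 3), round(b, 3), round(logit(a) - logit(b), 3))
```

A device of this kind needs one intercept per morphosyntactic context (and, on the text's reasoning, per speaker), which is the sense in which the frequency information must be encoded separately for each context.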
This situation is of course precisely what we would expect if the knowledge in this area of the language developed not as a general grammatical rule in the classical sense, but as a grammatical construction. Constructions, unlike rules, are typically not fully general, but are formulated in terms of specific lexical items, meanings, or both. On this scenario, a narrowly defined construction becomes more general. It extends to additional lexical items, and possibly spreads to more generally defined grammatical contexts. As it develops, it bleeds the existing rule, which continues to apply as the default. In this case, we would say that English began to acquire do-support, but at some point still had the default construction in which the main verb is inflected and functions as the head of the sentence with respect to SAI and negation.
It is also noteworthy that contact-do follows a very different path than the other uses of do. It begins to appear at the same time, increases in use (to less than 10% of the possible environments) around 1550, and becomes very rare by the period shown in table 1. Contact-do represents the most frequent use of periphrastic do.
This use of do is also incompatible with the V-to-I analysis, since there is no reason why do should be inserted if AH (or the functional equivalent) is already available in the language, which clearly was the case. What seems to be the case, rather, is that the contact-do is a distinct construction that exists side-by-side with the construction that realizes [tense] on the main verb.
The plausibility of this account of the development of do-support is buttressed by the observation that it does not simply apply optionally in the language of individual writers, in the sense that it applies in less than 100% of the contexts in which it could apply. Rather, it appears to apply with respect to particular verbs, and not with others, as Ellegård's (1953) data make very clear. He writes (p. 166) “[o]f Machyn's 370 'do'-instances, 216 involve the verb 'preach'; the simple verb 'preach' occurs only half a dozen times” and (p. 167) “[i]n Polychronicon there are 816 'do'-instances, 243 with 'slay' (and no finite 'slay' in the past tense), as well as 70 with 'succeed', 69 with 'write', 19 with 'eat', 8 with 'fight', 7 with 'hold', 7 with 'appear' and 4 with 'add',” and “we note that verbs or phrases singled out for preferential treatment are not always the same with different writers. We should add to the list Cely Papers /204/ 'do well understand', Fitz James 'appear', Decaye of England 'think', King James Bible 'eat'.”
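The degree of lexical concentration in these counts is easy to make explicit. The short Python calculation below uses only the figures quoted from Ellegård in the preceding paragraph; the variable names are introduced here for illustration. It computes the share of each writer's do-instances that falls on the favored verbs.

```python
# Counts as quoted from Ellegård (pp. 166-167) in the text.
machyn_total = 370
machyn_preach = 216

polychronicon_total = 816
polychronicon_by_verb = {
    "slay": 243, "succeed": 70, "write": 69, "eat": 19,
    "fight": 8, "hold": 7, "appear": 7, "add": 4,
}

# Share of Machyn's do-instances that involve 'preach'.
machyn_share = machyn_preach / machyn_total

# Share of the Polychronicon do-instances that involve 'slay' alone,
# and the share covered by the eight listed verbs together.
slay_share = polychronicon_by_verb["slay"] / polychronicon_total
listed_total = sum(polychronicon_by_verb.values())
listed_share = listed_total / polychronicon_total

print(round(machyn_share, 3))  # 0.584
print(round(slay_share, 3))    # 0.298
print(listed_total, round(listed_share, 3))  # 427 0.523
```

A single verb thus accounts for more than half of Machyn's do-instances, and eight verbs account for just over half of those in the Polychronicon.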
Ellegård shows that certain verbs, such as know, say, and think, “resisted the usage of do-support, at least in negatives, until as late as the nineteenth century” (Trudgill et al. 2002:5, see also Nevalainen 2006:576 and Nurmi 1999).
These observations suggest that do-support began as a lexically restricted construction. Its development as a rule of English is a consequence of its spread through the population, and through the lexicon, until it generalized free of particular items and contexts.
4.3. Transition to SAI
The transition from V2 to SAI involves several critical changes. One is what Kiparsky calls “positional licensing” of the subject, which we express in terms of the correspondence rule subject. Kiparsky (1996:461) argues that there is a clear connection between the rise of a fixed position for the subject and the loss of overt case marking:
That there is a relationship between the loss of inflectional morphology and the development of rigid positional constraints is clear from comparative syntax. The most important point about this relationship is that it is not a vague correlation or tendency, as often assumed, but an exceptionless implication, which however holds in one direction only: lack of inflectional morphology implies fixed order of direct nominal arguments (abstracting away from A′-movement of operators). [. . .] Every Germanic language which has lost case and agreement morphology, whether VO (English, Swedish, Danish, Norwegian) or OV (Dutch, West Flemish, Frisian, Afrikaans), has imposed a strict mutual ordering requirement on its nominal arguments, without changing the headedness of its VP. The order is always that subjects precede objects, and indirect objects (NPs, not PPs) precede direct objects.
A second change is the formation of a subclass of modal verbs that lack inflection for agreement in the present (and the past). This is a distinctive characteristic of Modern English. The development of this subclass is well documented; see for example Denison 1993 and for a recent study, Bybee 2006:chapter 16. The changes that occurred in the transition to Modern English were syntactic, semantic, and morphological.
The modals lost their ability to take non-infinitival complements, and they lost their root meanings. For example, cunnan meant not only 'be able to', but 'know', and willan meant 'wish, want'.
The modals acquired epistemic meanings, having to do with possibility and necessity.
The modals acquired contraction of not to n't. According to Denison (1993:309), this is recorded “first from the fifteenth century, with assimilation or elision of part of the modal.”
The modals lost all person inflection.
The infinitive lost its inflection.
Since, as main verbs, inflected modals functioned as the head of the S even in Old English, we do not find any changes in their behavior with respect to word order in questions and negative sentences.
In German, modal verbs do not govern VP-ellipsis; rather, they require a pronominal complement.
(46)
The same use of pronouns is found in earlier forms of English (Denison 1993:307). However, Warner (1993:112) has many examples from Old English of ellipsis with modal and auxiliary verbs. The following is just one such example.
(47)
On the view that ellipsis is possible only if the governing verb is an auxiliary, such data would suggest that the modals were already a distinct subclass of verbs in Old English; see Warner 1993:113f for discussion. Warner also argues that Old English already had pseudo-gapping, where the main verb is omitted, but not the auxiliary or a portion of the VP, as in 48.
(48)
Denison (1993:336) notes that each modal followed its own path from root verb to auxiliary, as shown by their co-occurrence with infinitival complements:
In the ModE period Warner notes that will and can are the last to lose the ability to take an object, an ability he correlates with the existence of non-finite forms [. . .] must and shall, on the other hand, seem to have attained unequivocal modalhood much sooner, may falls somewhere between the two pairs. And other items have shown varying degrees of modalhood in the different periods of English history.
This fact bears on the constructional account of the change from Old English to Modern English.
Since the modal corresponds to the highest operator in CS, it is the leftmost verb, and hence the verb that appears to the left of the subject in V2 and to the left of negation. The re-specialization of V2 to contexts with fronted scopal operators would have produced a situation in which a special subcategory of verbs, namely the modals and auxiliaries, had a different distribution than the main verbs. The existence of this special subcategory makes possible the formulation of constructions that specifically govern the distribution of this subcategory, such as sai.
The third key change is the loss of full V2, which also displays the properties of constructional change. Baekken (2000:394) provides a summary of the percentage of XVS and XSV word orders in Early Modern English (see table 2).
As can be seen, there is a gradual loss of V2 from period I to period III.
Baekken breaks down the data into various contexts in which V2 was found in Old English and shows the percentage of cases in which V2 appears in these contexts in each of the three periods. These contexts are defined in terms of the type of the initial constituent X. For instance, table 3 shows the incidence of V2 with initial then, therefore, thus, and yet.
The importance of these numbers lies in the fact that not only was V2 not uniform during the transition, but it was apparently sensitive to the particular lexical item in initial position. This variability shows that V2 had the status of a construction, in the technical sense that it was a syntactically complex lexical object, some of whose properties were idiosyncratic and others fully general.
Again, it is difficult to see how to capture such facts in terms of multiple grammars in the intended sense. It is possible, of course, if we assume that for each speaker each construction is defined over a somewhat different set of lexical items. In this case, for example, we may assume that there is a rule of V2 and competing constructions for topicalization. V2 is given by the following ordered constraints. For purposes of comparison, we also list the constraints for Modern English.
The constraints for Old English are virtually the same as those for Modern English, with three major differences. First, Old English, like German, has a default rule that there must be a clause-initial constituent. This rule, along with tense, produces the V2 effect. In Modern English, the topic rule is tied to IS, and the V2 effect is associated with a restricted meaning. Second, Modern English has an ordering rule for Subject, while Old English does not. Third, Modern English has reinterpreted the V2 structure of Old English as a requirement that holds only in certain cases; Vaux is ordered before the subject.
These differences suggest the following scenario. The independent emergence of the subject correspondence produces sentences in which the subject precedes the verb. But subject is variable. Sentences with pronominal subjects are more likely to lack V2 because of the tendency of pronouns in German to appear to the left. When V2 began to be lost, there was no well-defined context that distinguished between V2 and non-V2. Therefore, the constructions that were formed are indexed to specific lexical items, along the lines suggested by Baekken's data in table 3. Ultimately, learners associate a particular meaning with V2, giving rise to constructions.Footnote 27
The variability of V2 as it evolved into SAI is also demonstrated in the following tables from Baekken 2000 that track the development of negative inversion.
It is interesting to note that negative inversion became almost obligatory in this transition period, as XSV became almost obligatory with nonnegative X (table 5). However, different initial negative elements contributed differentially to the total development of negative inversion, with inversion after nor becoming virtually obligatory and inversion after never rising to 50% (table 6).
Consider, finally, the data in table 7.
These data show that the type of verb had an effect on the rate of inversion. Baekken (2000:412) suggests that what is going on here has to do with the relative “weight” of the post-verbal material when there is or is not a direct object:
From a pragmatic point of view, it is of considerable interest that inverted structures contain high rates of intransitive verbs . . . No doubt, this is connected with the principle of end weight: in such structures the post-verbal subject provides the required sentence-final weight; consequently, it may be assumed that the subject and the verb are inverted in such structures precisely because of pragmatic requirements such as end focus and end weight.
These data support the view that during the transition there were two forces at play in the ordering of constituents, one having to do with grammatical function and the other to do with weight.Footnote 28 These are competing with one another, in the sense that given the alternations as in 49, it is not immediately obvious whether the ordering in 49a is due to the relative weight of the NP or to a constraint that says that the subject must be leftmost.
- (49)
a. XP NP V . . .
b. XP V NP . . .
If rightmost position corresponds to some extent to focus, then it is more likely that the subject is focus in an intransitive than in a transitive, other things being equal, since in the intransitive the subject is the only thematic argument. Such cases are typically presentational, as in 50a.
- (50)
a. Into the room walked a man.
b. Into the room pushed the man the cart that had been discovered in the basement.
In Modern English, such inversion is completely ruled out when the verb is transitive, as in 50b (see Culicover & Levine 2001).
5. Summary and Perspective
I have argued that a minimal account of the change from Old English to Modern English can be envisioned if we abandon the view that linear order is a consequence of syntactic configuration. Rather, syntactic configuration, or the appearance of syntactic configuration, is a consequence of linear order and its correspondence with grammatical functions (GFs) and conceptual structure (CS). The ramifications of such a proposal are far-reaching and deserve substantially more exploration than can be devoted to them here.Footnote 29 However, at least in the domain of the canonical word order and basic alternative constructions of English, an account in these terms appears to have some promise.
Even granting the essential correctness of this view, though, there are yet further issues that cannot be dealt with here. Most importantly, there is the question of exactly how constructions become rules and how rules become more restricted constructions. I have given some suggestions and sketched out a scenario, but modeling at the appropriate level of detail and detailed validation with the historical data still remain to be done.
I conclude by situating the main proposal of this paper within a broader perspective and suggesting one possible future line of development. On the model offered in Culicover & Nowak 2003 and Culicover & Jackendoff 2005, the shape that a language (and a grammar) takes is determined by these primary factors:
Universal grammar/evaluation metric
Concrete minimalism in learning
A social network
Each contributes to grammar construction in the learner, and to the persistence, diffusion, or loss of particular linguistic properties in a population over time.
Universal grammar consists of the structures and principles that constrain grammars; I include in this category universal conceptual structure (Jackendoff 1990, Jackendoff 2002) and the laws governing the correspondences between CS and strings of words (see Culicover & Jackendoff 2005).
One of the obvious concerns raised by this approach to constructions, especially in light of the central role of syntactic uniformity found in mainstream generative grammar, is that it appears to leave open the possibility that language can vary in arbitrary and unbounded ways. In Construction Grammar, this issue is addressed by taking into account the extent to which a construction shares certain properties with other constructions, including those that are characterized in terms of very general rules (such as the structure of the VP) (see Kay & Fillmore 1999:30 for one example). For instance, the sentence in example 17 (the car roared around the corner) is perfectly regular except that its interpretation departs in certain ways from the canonical one.
More generally, Culicover & Jackendoff (2005:chapter 1) suggest that English has a hierarchy of VP constructions, which range from the very specific and idiomatic (such as kick the bucket) to the very general ([VP V . . .]). Beyond the top of the language-specific hierarchy is Universal Grammar (UG), which stipulates endocentricity as the default state of affairs. On the view suggested by Culicover & Jackendoff, UG is not actually part of the grammar of any particular language. Through the evaluation metric, it is a guide to the construction of a grammar. It sets defaults from which general rules of grammar may depart, at a cost, just as the rules of any given language set defaults from which particular constructions of the language may depart.
Along related lines, Culicover (1999) (see also Hawkins 1994) argues that putatively universal constraints, such as Subjacency, are measures of simplicity that can be violated by particular constructions, but at a cost. The overall picture that emerges is one in which each level of the hierarchy, from UG on down, establishes a baseline against which deviations are measured. The extent of deviation from the baseline may then be understood as contributing to the level of complexity in the grammar. This notion of complexity, which is relative rather than absolute, offers the potential for explaining language change in terms of the pressure to simplify (that is, to achieve “economy”), while leaving open the possibility of deviation from the ideal, but at a cost.Footnote 30 The cost, in turn, translates into frequency of occurrence (among languages), and perhaps learnability and processing complexity, and instability under conditions of competition. For some discussion, see Culicover & Nowak 2002.
A theory of constructions, then, must incorporate an evaluation metric in order to explain what is natural and what is not. But this is true of any theory of grammar. In the current framework, the evaluation metric distinguishes grammars in terms of the deviation of general rules from UG and deviations of constructions from general rules. This is in fact the classical sense of the evaluation metric in syntax introduced by Chomsky (1964), rather than the sense it has in the framework of economy of derivation, which has been the focus of much recent work.
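The two-level character of this metric can be made concrete with a small sketch. It is only an illustration under stated assumptions: the feature names (endocentric, compositional), the unit cost per overridden default, and the function names deviation and grammar_cost are all hypothetical and are not drawn from the works cited above.

```python
# Hypothetical feature bundles; the feature names and the unit cost per
# overridden default are assumptions made purely for illustration.
UG_DEFAULTS = {"endocentric": True, "compositional": True}

RULES = {
    "VP": {"endocentric": True, "compositional": True},  # the general [VP V ...]
}

# Each construction names the general rule against which it is measured.
CONSTRUCTIONS = {
    "kick the bucket": ("VP", {"endocentric": True, "compositional": False}),
}


def deviation(item, baseline):
    """Count the baseline properties whose default value the item overrides."""
    return sum(1 for key, default in baseline.items()
               if key in item and item[key] != default)


def grammar_cost(ug, rules, constructions):
    """Add rule-versus-UG deviations to construction-versus-rule deviations."""
    rule_cost = sum(deviation(rule, ug) for rule in rules.values())
    construction_cost = sum(deviation(props, rules[parent])
                            for parent, props in constructions.values())
    return rule_cost + construction_cost


print(grammar_cost(UG_DEFAULTS, RULES, CONSTRUCTIONS))
```

On these assumptions, the general VP rule incurs no cost relative to UG, and the idiom incurs a cost only for the one default of the VP rule that it overrides. The measure is therefore relative to the baseline at each level rather than absolute.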
Concrete minimalism refers to the conservative strategy of a learner to construct only those hypotheses about the form-meaning correspondences that are warranted by the evidence and universal grammar. On the view that universal grammar is quite impoverished (see Culicover 1999 for arguments), concrete minimalism leads to a situation in which learners are always beginning from constructions. (See Tomasello 2003 for experimental evidence in support of this view.) They move to rules only when the weight of the evidence warrants generalization beyond the evidence. Precisely what justifies and constrains generalization is an open question, and a fundamental one, but the evidence from language acquisition appears to be consistent with this general view. For extended discussion and a computational simulation, see Culicover & Nowak 2003.
Finally, language exists in a social network. The social and physical topology of the network determines to some extent the stability of clusters of linguistic properties quite independent of their substantive properties. This point is demonstrated in Culicover & Nowak 2002 and Culicover & Nowak 2003, and holds more generally for properties of agents that are transmitted to other agents in a social network (see, for example, Latané & Nowak 1997).
Of particular relevance in the case of do-support is the possibility that we may be able to treat the spontaneous emergence of do-periphrasis as the result of “noise” in the network. We would thus expect that do-periphrasis would appear in language after language. The topology of the network in which a language resides determines whether the innovations produced by random fluctuations are washed out or sustained. It is logically possible that a language can have the enabling conditions that English had for do-support, but spread of the innovation may be inhibited by the configuration of the network. It is natural to appeal to linguistic and cognitive factors in accounting for why a particular change might spontaneously emerge. However, in order to fully understand why some changes take hold and others do not, it is also essential to understand the ways in which such an innovation is realized and transmitted in the social network.
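The dependence of an innovation's fate on network topology can be illustrated with a toy simulation. This is a minimal sketch and not the model of Culicover & Nowak: the ring topology, the local-majority update rule, and the names ring_neighbours, step, and simulate are all assumptions introduced here for illustration. Each agent either uses the innovation (1) or does not (0), adopts the majority variant in its neighbourhood, and may flip spontaneously with probability noise.

```python
import random


def ring_neighbours(n, k):
    """Hypothetical topology: each agent is linked to k agents on either side of a ring."""
    return [[(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)]


def step(state, neighbours, noise, rng):
    """One synchronous update: adopt the local majority, then apply random noise."""
    new_state = []
    for i, own in enumerate(state):
        votes = [state[j] for j in neighbours[i]] + [own]
        innovators = sum(votes)
        if 2 * innovators > len(votes):
            choice = 1
        elif 2 * innovators < len(votes):
            choice = 0
        else:
            choice = own  # on a tie, the agent keeps its current variant
        if rng.random() < noise:
            choice = 1 - choice  # spontaneous fluctuation
        new_state.append(choice)
    return new_state


def simulate(state, neighbours, noise=0.0, steps=20, seed=0):
    """Run the update rule for a fixed number of steps and return the final state."""
    rng = random.Random(seed)
    for _ in range(steps):
        state = step(state, neighbours, noise, rng)
    return state


n = 20
pair = [1 if i in (5, 6) else 0 for i in range(n)]  # two adjacent innovators
print(sum(simulate(pair, ring_neighbours(n, 1))))   # sparse links
print(sum(simulate(pair, ring_neighbours(n, 2))))   # denser links
```

With noise set to zero, the same pair of adjacent innovators persists when each agent is linked only to its two immediate neighbours, but is washed out when each agent is linked to four, since neither innovator then finds a local majority. Setting noise above zero lets innovations arise spontaneously, and whether they survive again depends on the neighbourhood structure assumed.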
Appendix
I use the abbreviations below.
AH Affix hopping
C[-wh] Non-interrogative complementizer
C[wh] Interrogative complementizer
CS Conceptual structure
DECL Declarative (clause)
EXP Experiencer
GF Grammatical function
HPSG Head-driven phrase structure grammar
IS Information structure
LFG Lexical-functional grammar
NEG Negative operator
PPT Principles and Parameters Theory
PRES Present
PRT Participle
Q Interrogative (operator)
SAI Subject-aux inversion
SUBORD Subordinate (clause)
UG Universal Grammar
V2 Verb second
Vaux Auxiliary verb
Vinf Infinitival verb