1. Introduction
The philosophy of linguistics has emerged as a robust philosophy of science (Scholz et al. 2022). Prominent topics include the ontology of language, realist versus antirealist conceptions of standard linguistic entities (Rey 2020) and the commensurability of competing theoretical frameworks (Pullum 2013). Geoffrey Pullum has contributed not only to the theory of language and formal linguistics but also to its philosophical analysis. The three main themes of his philosophy of linguistics have been the metatheory of formal syntax, the role and reality of linguistic infinity and the normative force of linguistic rules. In this article, I explore each of these topics in turn with a special eye toward a unified philosophy of linguistics along constructivist, and possibly even finitist, lines.
In Section 2, I outline the basics of the model-theoretic alternative to generative syntactic theory. I argue that, from the perspective of the philosophy of science, Pullum is too quick to draw the theoretical consequences he does from this model of grammar. In Section 3, I discuss one particularly pointed such consequence in his rejection of the linguistic infinity postulate. Here, I suggest, with Langendoen (2010), that Pullum & Scholz (2010) provide only the first half of the critique. In Section 4, I address Pullum’s recent claims as to the normative grounding of linguistic theory (Pullum 2018, forthcoming). Finally, in Section 5, I motivate a novel unification of these themes.
2. A model theory for syntax
The origins of modern generative grammar are proof-theoretic in nature (Tomalin 2006, Pullum 2011, Lobina 2017). This involves the recursive enumeration of a set by means of a finite set of rules over a finite vocabulary. To be a well-formed formula or grammatical string of a language is to be generated by the rules over the vocabulary, that is, to be an output of the grammar. Phrase structure grammars, categorial grammars, minimalist grammars and many others fall within this broad mathematical framework.
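To fix ideas, here is a minimal sketch in Python of a grammar in this proof-theoretic sense. The rules and the breadth-first enumeration strategy are invented for illustration and no particular formalism is intended; the point is only that a finite rule set over a finite vocabulary recursively enumerates the strings of the language:

    from collections import deque

    RULES = {"S": [["NP", "VP"]],
             "NP": [["she"], ["the", "cat"], ["the", "cat", "near", "NP"]],
             "VP": [["sleeps"], ["sees", "NP"]]}

    def enumerate_sentences(limit=5):
        # Breadth-first rewriting from the start symbol 'S': a string belongs to
        # the language iff some finite sequence of rule applications yields it.
        queue, sentences = deque([["S"]]), []
        while queue and len(sentences) < limit:
            form = queue.popleft()
            i = next((j for j, sym in enumerate(form) if sym in RULES), None)
            if i is None:                      # no nonterminals left: a sentence
                sentences.append(" ".join(form))
                continue
            for expansion in RULES[form[i]]:   # rewrite the leftmost nonterminal
                queue.append(form[:i] + expansion + form[i + 1:])
        return sentences

    print(enumerate_sentences())
    # ['she sleeps', 'the cat sleeps', 'she sees she', ...]

Because the rule for NP can reintroduce NP, the finite rule set determines an unbounded supply of outputs, which is exactly the feature at issue in Section 3.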
Model-theoretic syntax (MTS) replaces the centrality of that proof-theoretic notion of syntax with a concept associated more with semantics, drawing from first-order logic and model theory. Grammar formalisms such as Head-Driven Phrase Structure Grammar (HPSG) (Müller et al. 2021) and Sign-Based Construction Grammar (Michaelis 2009) are prominent examples of this approach. They are also contrasted with the derivational approach of generative grammar. Instead of a grammar generating the well-formed formulas of a language, a grammar models the rules of that language via constraints. ‘An MTS [model-theoretic syntax] grammar does not recursively define a set of expressions; it merely states necessary conditions on the syntactic structures of individual expressions’ (Pullum & Scholz 2001: 19). In this sense, a sentence or expression is well-formed iff it is a model of the grammar (defined in terms of constraints which act as the axioms of the formalism). A grammar is then just a finite, unordered set of MTS rules, which in turn are just statements about the structure of expressions, like ‘every direct object noun phrase in a transitive verb phrase immediately follows the verb’ (Pullum 2007). To be a model of the grammar is to be an expression that satisfies the grammar or meets its constraints.
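The contrast with the generative sketch above can be made concrete. In the following schematic Python rendering (the tuple encoding of trees and both constraints are invented for illustration; real MTS frameworks are far richer), a grammar is a finite, unordered set of statements, and an expression is well-formed just in case it satisfies all of them:

    def subtrees(tree):
        # Enumerate every node of a (label, child, child, ...) tuple.
        yield tree
        for child in tree[1:]:
            if isinstance(child, tuple):
                yield from subtrees(child)

    def subject_first(tree):
        # 'The subject of a clause precedes the predicate.'
        return all(t[1][0] == "NP" for t in subtrees(tree) if t[0] == "S")

    def verb_initial(tree):
        # 'The lexical head of a VP precedes its complements.'
        return all(t[1][0] == "V" for t in subtrees(tree) if t[0] == "VP")

    GRAMMAR = {subject_first, verb_initial}   # finite, unordered set of constraints

    expression = ("S", ("NP", "Kim"), ("VP", ("V", "saw"), ("NP", "Sandy")))
    print(all(c(expression) for c in GRAMMAR))   # True: the expression models the grammar

Note that nothing in GRAMMAR says, or entails, how many expressions satisfy it; this is the cardinality neutrality discussed below.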
Pullum (2013) draws out a number of comparative advantages of adopting an MTS framework over a generative one. In fact, he states that ‘[t]he question I regard as most central, in that it arises at a level more abstract than any comparison between rival theories, concerns the choice between two ways of conceptualizing grammars’ (Pullum 2013: 492). The general thrust of the argument is to establish the superiority of MTS over generative syntax on a number of significant theoretical desiderata, such as incremental grammaticality, incorporation of neologisms and language acquisition (see Steedman (this volume) for more on these). What is interesting from our perspective is the property of ‘cardinality neutrality’.
Grammars of this sort [MTS] are entirely independent of the numerosity of expressions (though conditions on the class of intended models can be stipulated at a meta-level). For example, suppose the grammar of English includes statements requiring (i) that adverb modifiers in adjective phrases precede the head adjective; (ii) that an internal complement of know must be a finite clause or NP or PP headed by of or about; (iii) that all content clause complements follow the lexical heads of their immediately containing phrases; and (iv) that the subject of a clause precedes the predicate. Such conditions can adequately represent facts like those in (I). But they are compatible with any answer to the question of how many repetitions of a modifier an adjective can have, or how deep embedding of content clauses can go, or how many sentences there are. The constraints are satisfied by expressions with the relevant structure whether there are infinitely many of them, or a huge finite number, or only a few (Pullum & Scholz 2010: 123).
I think Pullum might be reading too much into the structure of model-theoretic grammars here. Admittedly, cardinality neutrality might be the more conservative interpretation. However, as Langendoen & Postal (1984) argue somewhat controversially in their ‘vastness proof’, a non-denumerable infinity of language might also warrant model-theoretic treatment.
Nevertheless, I am going to take a different route here. Rather than emphasise the differences between generative and model-theoretic grammars, I am going to highlight their similarities. In fact, according to the scientific modelling perspective proffered by Tiede & Stout (2010) and Nefdt (2019), these are merely alternative models of grammar potentially modelling the same linguistic reality. Pullum is correct that a divergence in ontological commitment on a particular property might occasion neutrality and the possibility that the given property is an artefact of the model. However, the history of science has shown us that artefacts of models can prove to be genuine aspects of the target. For example, multiple revisions, in terms of physical interpretations, of the same mathematical formalism in classical mechanics led to the discovery of the positron (Bueno & Colyvan 2011). Dirac initially considered negative energy solutions to be mere features of the mathematical model. However, after finding physical interpretations of these solutions, he was led to revise his entire theory and predict the existence of a novel particle.
In terms of convergence, formal language theory, the mathematical theory of the structure of language, was concerned with defining classes of languages of nested complexity, known as the Chomsky hierarchy. One way of doing this is by defining languages in terms of the production rules of grammars of certain varieties. For instance, regular grammars, or familiar finite-state grammars, were argued by Chomsky (1956) to be inadequate as models of natural language syntax since they cannot capture certain common linguistic patterns (such as $a^nb^n$). The context-free languages contain regular languages as a proper subset (their grammars can generate those languages) as well as more complex structures. Similarly, as we move up the hierarchy, we approach the context-sensitive class and, above that, the recursively enumerable class, which subsumes all those below. Each grammar type can be further associated with a type of recognising machine or automaton, with the outermost class corresponding to a Turing machine. The specifics, however fascinating, will not detain us here. [Footnote 1]
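The inadequacy of finite-state grammars can be made vivid with the $a^nb^n$ pattern: recognising it requires unbounded counting, which no fixed, finite set of states supplies, whereas a single counter (a degenerate stack) suffices. A minimal Python sketch, written purely for illustration:

    def is_anbn(s):
        # Accept exactly the strings a^n b^n for n >= 1: count the a's,
        # then demand an equal run of b's and nothing else.
        n = 0
        while n < len(s) and s[n] == "a":
            n += 1
        return n >= 1 and s[n:] == "b" * n

    print([is_anbn(s) for s in ["ab", "aabb", "aab", "ba"]])
    # [True, True, False, False]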
The important part for our purposes is that this nested hierarchy of formal languages used to model the grammatical rules and features of natural languages can be defined in more than one way. Classical formal language theory pursued this task in terms of grammars and generated classes of formal languages. But there is a model-theoretic alternative. For a flavour of the results: Büchi (1960) showed that a set of strings forms a regular language if and only if it can be defined in the weak monadic second-order theory of the natural numbers with a successor. Thatcher & Wright (1968) then showed that context-free languages ‘were all and only the sets of strings forming the yield of sets of finite trees definable in the weak monadic second-order theory of multiple successors’ (Rogers 1998: 1117). It seems that the Chomsky hierarchy is not immune to equivalent model-theoretic characterisation.
At a slightly lower level, various specific grammatical formalisms under the umbrellas of generative grammar and model-theoretic syntax, respectively, have been shown to be weakly equivalent, that is, to characterise the same sets of strings (Vijay-Shanker & Weir 1994). The differences between derivational approaches and declarative (or constraint-based) ones have sometimes even been assumed to be only apparent, or ‘notational variants’ (Chomsky 2000b). This might be too strong a claim. [Footnote 2] Nevertheless, herein lies the argument: if generative syntax and MTS can characterise the same formal systems, output the same stringsets and converge on various characterisations of linguistic phenomena, then there seems to be some invariant structure across these formalisms. In other words, they might be more compatible, and less obviously in competition, than Pullum acknowledges. [Footnote 3]
Identifying the common structures could give us traction on which parts of the models are genuinely reflective of the target, that is, natural language, and which parts are merely model artefacts. However, as the positron case shows, features of particular models might also turn out to reflect linguistic reality. The argument from model-theoretic neutrality to a verdict on linguistic cardinality is therefore missing a step. At this point, we move on to Pullum’s argument to the effect that the linguistic infinity postulate is unsupported by the philosophy of linguistics.
3. Pullum on linguistic infinity
The idea that natural language is somehow an infinite capacity or even an infinite set has become more than just a metaphor in contemporary linguistics. It has informed positions on the kind of science linguistics is (Katz 1996, Chomsky 2000a, Postal 2009), views on what language itself might be (Chomsky 1995) and strong claims about the evolution of human language (Berwick & Chomsky 2016). It is also often linked with debates on linguistic creativity, compositionality and recursion (Chomsky 1980, Hauser et al. 2002).
There are a number of popular avenues for establishing linguistic infinitude (although it is often just stated without argument). The first, and earliest, attempts are proof-theoretic reconstructions of an insight, attributed to Wilhelm von Humboldt, that language is ‘infinite use of finite means’. This strategy relies on the concept of recursion and produces an analogy with the natural numbers. In the following section, I will explore this option, with special emphasis on the critique produced by Pullum & Scholz (2010).
3.1. The argument from recursion and generative grammar
The literature on recursion in linguistics is complicated. Specifically, it is unclear at which level (or even of which object) the property is meant to be interpreted. I cannot embark on a mission to disentangle the many vines of this particularly thorny issue here. Basically, the mathematical definition involves a property of self-reference, usually in two steps: a base case, which specifies the condition for termination of the recursion, and a recursive step, which reduces all other cases to the base (Tomalin 2006). [Footnote 4] For example, addition in Peano arithmetic can be defined recursively as $x + 0 = x$ and $x + Sy = S(x + y)$ (where ‘$S$’ denotes the one-place ‘successor of’ operation). Similarly, the rewrite rules for a context-free grammar are recursive: $S \to ab$ or $S \to aSb$ (where ‘$S$’ is now a unique start symbol or category and ‘$a$’ and ‘$b$’ are terminal elements like words). In early generative grammar, phrase-structure rules were instructions to rewrite symbols or strings into other strings, such as $S \to NP\ VP$ (Chomsky 1957). In fact, as noted in Pullum (2011), generative grammars are special cases of Post production systems, which are proof-theoretic devices capable of characterising the recursively enumerable sets. More importantly, they are finite rule systems that can generate infinite sets, hence fulfilling von Humboldt’s prophecy.
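The two-part schema is the same in both cases, as a small Python sketch makes plain (the encoding is mine; ‘derive’ simply compresses $n$ applications of $S \to aSb$ followed by one application of $S \to ab$):

    def add(x, y):
        # Peano addition: base case x + 0 = x; recursive step x + Sy = S(x + y).
        if y == 0:
            return x
        return add(x, y - 1) + 1

    def derive(n):
        # Apply S -> aSb n times, then terminate with S -> ab.
        return "a" * n + "ab" + "b" * n

    print(add(2, 3), derive(2))   # 5 aaabbb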
This brings us to the first argument in favour of discrete infinity, that is, that generative grammars generate infinitely many expressions or grammatical structures. If a grammar $G$ is a finite set of rules for defining stringsets (or sequences of symbols), then a language is the set of all the strings that $G$ generates. In Post production system style, ‘$G$ will be said to generate a string $w$ consisting of symbols from $\Sigma$ [the alphabet] if and only if it is possible to start with $S$ and produce $w$ through some finite sequence of rule applications’ (Jäger & Rogers 2012: 1957). What does this mean for linguistic infinity? Well, it comes down to what is known as the ‘membership problem’. Generative grammars have sharp boundaries. Some string $w$ is either generated (or derived) by the rules of $G$ or it is not. It is either in the language or it is not. There is no in between. This means that the language that the grammar generates ‘has an exact cardinality: it contains either some finite number of objects or a countable infinity of them’ (Pullum 2013: 496). Furthermore, the reason for opting for a countable infinity specifically comes from the data of natural language itself. As Chomsky states,
[T]he rules of the grammar must iterate in some manner to generate an infinite number of sentences, each with its specific sound, structure, and meaning. We make use of this “recursive” property of grammar constantly in everyday life. We construct new sentences freely and use them on appropriate occasions (Chomsky 1980: 221).
The best way to mount an analogy between language and arithmetic is by establishing a connection between the set of natural language expressions and the set of natural numbers. Firstly, we need an analogy with the successor function in Peano arithmetic. Recall that the successor function applies to an object in the set and produces another object in that same set, endlessly. Thus, we would need a similar kind of operation in language. The usual candidates come from syntax. Consider examples of iterated adverbial modification of the following sort: (a) It’s cold; (b) It’s very cold; (c) It’s very, very cold; (d) It’s very, very, very cold.
The idea is something like: English syntax is ‘closed under adverbial modification’. In other words, if (a) is in the set of English sentences or, if you prefer, generated by the rules of English, then so are (b), (c), (d) and so on ad infinitum. The same strategy can be employed for other common examples involving subordination, centre embedding, conjunction etc. Pullum & Scholz (2010) call this the ‘no maximal length claim’ (NML), that is, the claim that there is no longest English sentence (you can always create a longer one). The problem they point out is that to establish this analogy, one would have to assume both that every English expression has a successor and that this successor-like function is injective. But doing so would essentially be begging the question. What they do not show is how to account for either the syntactic facts or the intuition that prompts claims like NML. Instead, they offer three different alternative conceptions of grammar that are neutral on the cardinality question, such as the model-theoretic account discussed above.
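To see what is being assumed, consider a toy Python rendering of the putative successor operation (the example is mine). On the strings it is iterated over here, the map is injective, since each output is one word longer than its input; what begs the question is the closure assumption, namely that the output of the map is always itself a sentence of English:

    def succ(sentence):
        # Insert one more 'very' before the first occurrence of 'cold'.
        return sentence.replace("cold", "very cold", 1)

    s = "It's cold."
    for _ in range(4):
        print(s)
        s = succ(s)
    # It's cold. / It's very cold. / It's very very cold. / ...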
However, it should be noted that Chomsky (2008) does attempt to make a direct argument concerning the Merge operation of Minimalism and arithmetic. He outlines the following procedure for mimicking the successor function within linguistics.
Suppose that a language has the simplest possible lexicon: just one LI [lexical item], call it “one”. Application of Merge to that LI yields $\{one\}$, call it “two”. Application of Merge to $\{one\}$ yields $\{\{one\}\}$, call it “three”. Etc. In effect, Merge applied in this manner yields the successor function. It is straightforward to define addition in terms of Merge(X, Y), and in familiar ways, the rest of arithmetic (138).
The idea is that Merge applied over a lexicon with a single entry results in the successor function. Let us set aside for the moment the issue of whether Merge can operate on one element, as it is usually defined in binary terms (perhaps Merge(‘one’, $\varnothing$) is assumed). The problem is that Merge, or the set-theoretic operation of taking two objects and creating a new labelled unit with the two objects as (unordered) members, has nothing to do with language in and of itself. [Footnote 5] Making the argument that natural language syntax can be characterised with this simple operation is a separate task (one that Minimalists have taken up for 20 years). But merely showing that you can model the successor function with a recursive set-theoretic operation is trivial. Merge, so stated, can also model the heartbeat of mammals, the rhythm of drums or the barking of dogs. Nothing about these phenomena is intrinsically connected to language or the axiomatisation of arithmetic, for that matter.
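The triviality point is easy to verify. Here is Chomsky’s construction in a few lines of Python (the unary rendering of Merge and the depth-counting are my own simplifications): nothing in the code mentions anything distinctively linguistic.

    def merge(x):
        # Unary guise of Merge: form the singleton set {x}.
        return frozenset([x])

    one = "one"
    two = merge(one)      # {one}
    three = merge(two)    # {{one}}

    def value(x):
        # Read the 'numeral' off the nesting depth.
        return 1 if isinstance(x, str) else 1 + value(next(iter(x)))

    print(value(three))   # 3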
This brings us to an important assumption about the nature of generative grammars imbued with recursive rules, namely, that they necessarily yield or entail a discrete infinity of linguistic outputs. Again, Pullum & Scholz (2010) show that this entailment does not hold by means of a toy example using a context-sensitive grammar. The grammar contains a recursive rule ($VP \to VP\ VP$), which is only capable of a single application given the specified lexicon but nevertheless remains non-trivially recursive. Thus, the recursive element of the grammar does not ensure infinite output (even if rules of this shape can, in other grammars, create infinitely many structures).
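A reconstruction in the spirit of their example (not their exact grammar) can be run directly. In the Python sketch below, the rule rewriting VP as VP VP is genuinely recursive, but it is licensed only in the context ‘c _ d’, and its own output destroys that context, so the generated language is finite:

    RULES = [("S", "c VP d"),           # S -> c VP d
             ("c VP d", "c VP VP d"),   # VP -> VP VP, licensed only in context c _ d
             ("VP", "v")]               # VP -> v

    def generate():
        # Rewrite leftmost matches until no rule applies (sufficient for this toy),
        # collecting the terminal strings.
        agenda, seen, sentences = ["S"], {"S"}, set()
        while agenda:
            form = agenda.pop()
            rewrites = {form.replace(lhs, rhs, 1) for lhs, rhs in RULES if lhs in form}
            if not rewrites:
                sentences.add(form)
            agenda.extend(r for r in rewrites if r not in seen)
            seen |= rewrites
        return sentences

    print(generate())   # {'c v d', 'c v v d'}: finite, despite the recursive rule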
3.2. Two options for infinitude
Despite the convincing challenge laid out by Pullum & Scholz (2010), there remain a number of options for motivating natural language infinitude. I will discuss one prominent such option and another, less discussed, alternative. Both seem to require some notion of ideal competence. For instance, Huybregts (2019) insists that infinite cardinality is the default or null hypothesis of formal grammars. He specifically challenges the idea that there could have been a step-wise or gradualist approach to infinite language along a cultural or evolutionary path (starting out in linguistic finitude of some sort). He reiterates NML as a datum. What is interesting is where he locates the infinitude. He states ‘[w]e should be careful, however, not to confuse the infinite productivity of the generative procedure (a function in intension) with the cardinality of the decidable set of structures it generates (the well-formed formulas defined in extension)’ (Huybregts 2019: 3). [Footnote 6] Appreciating this caveat allows for explanations of languages with finite output or no overt recursive constructions, without giving up on infinity. Hence, the counterexample provided by Pullum & Scholz (2010) is neutralised, since the grammar ‘may generate an infinite language but only produce a finite subset of it’ (Huybregts 2019: 3).
In a related fashion, Everaert et al. (2015) have re-emphasised that structures and not strings are the driving force behind mental linguistic computations. This harks back to the distinction between the strong generative capacity of grammars (the structures or ‘structural descriptions’ they produce) and the weak generative capacity (the stringsets they generate). As Chomsky (1965: 61) notes, ‘discussion of weak generative capacity marks only a very early and primitive stage of the study of generative grammar’. He goes on to claim that ‘questions of real linguistic interest’ are those pertaining to the strong generative capacity or descriptive adequacy as well as explanatory adequacy (i.e. language acquisition). [Footnote 7]
Another strategy, which dovetails with the scientific modelling perspective discussed above, is given in Langendoen (2010). He describes the situation in the following way.
For any natural language $L$, let $L^{\mathrm{w}}$ represent the finite set of expressions known to belong to $L$ on the basis of such direct evidence. Given that $L^{\mathrm{w}}$ provides indirect evidence for genuinely recursive size-increasing operations, standard generative models project a denumerably (countably) infinite set $L^{\diamond}$ of ‘possible’ members of $L$. By not distinguishing the models from the language, proponents of such models conclude that $L = L^{\diamond}$ and so is infinite. As noted above, that conclusion may be correct, but an argument is still needed to show that the models do not overgenerate. In the absence of such an argument, all we can conclude is that $L$ lies somewhere between $L^{\mathrm{w}}$ and $L^{\diamond}$, and so may be either finite or infinite.
To establish a cardinality claim, Langendoen relies on an antecedent (from Sapir) of Katz’s ‘effability principle’: that every language can represent every meaning or proposition. Sapir’s notion of ‘formal completeness’ is based on mathematically ‘complete’ systems, such as geometry or arithmetic. The idea is that every natural language possesses a complete system of reference, independently of the resource (or cognitive) limitations of human cognisers. Just as geometry needs to have the resources to characterise any possible space, natural languages need the resources (in principle) to characterise any thought or concept. What this means for the infinity claim is that closure operators (like adverbial modification) can be assumed to get us from $L^{\mathrm{w}}$ to $L^{\diamond}$ in the natural language case, by the same logic by which operators like the successor function take us from direct evidence of numerosity to countable infinity. [Footnote 8]
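Langendoen’s projection can be pictured with a toy Python sketch (the attested set and the closure operator are invented stand-ins): the finite, directly evidenced set $L^{\mathrm{w}}$, closed under a size-increasing operator, determines the model $L^{\diamond}$, while the sketch itself stays silent on where the language $L$ sits between them.

    L_w = {"It's cold.", "It's very cold."}   # attested, finite

    def close(s):
        # A size-increasing operator: insert one more 'very'.
        return s.replace("cold", "very cold", 1)

    def L_diamond_layers(n):
        # The first n layers of the closure of L_w under `close`.
        out, layer = set(L_w), set(L_w)
        for _ in range(n):
            layer = {close(s) for s in layer}
            out |= layer
        return out

    print(len(L_diamond_layers(3)))   # 5, growing without bound in n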
Langendoen then makes an almost Lewisian move (in Lewis 1995 [2010]) in allowing for different languages to be modelled with different cardinalities. This process is determined by the needs of the linguistic community. When a particular language only requires a subset of the formal expressive power of another, then that language might indeed be considered finite in comparison. This would distinguish his view from Huybregts (2019). However, I think it would also move us further from the essence of Katz’s effability principle and similar universalist conceptions of the commonality of human language. In terms of our dialectic, it is not clear that Pullum would accept that some languages should be modelled with generative grammar and others with MTS, resulting in differing cardinalities.
The place at which we have arrived is one in which Pullum’s critique compels adherents of linguistic infinity to motivate their claims more perspicuously. This is to the clear benefit of the philosophy of linguistics. However, his arguments do not, by themselves, establish that natural languages are not discretely infinite.
4. Normativity
The last component of Pullum’s philosophy of linguistics that we will explore here is his insistence that grammar is not a matter of abstract rules grounded in human psychology or biology but rather a description of the sets of norms of speakers.
Before we continue, it should be noted that Pullum is careful to distinguish his normative view from prescriptivism about language. In fact, Pullum has been a champion of the anti-prescriptivism movement against the grammatical scruples often associated with linguistics (see Pullum 2014). Pullum (forthcoming) argues that prescriptivism is guilty of what he calls ‘dialect chauvinism’, or ‘the touting of one dialect as clearly (almost morally) better than another’ (4). At the heart of most ‘grammar rules’ of this variety is an injunction to accept the standards of one dialect (usually one associated with the ‘upper class’) over others considered to be slang or informal. If anything, linguistic theory has grown out of an opposition to unnatural rules of behaviour imposed on speakers for the sake of social identification. Generative grammar replaced this conception with one that takes grammar rules to be features of the maturation of an innate system common to all human beings. These rules characterise not the linguistic performance of individuals but their abstract competence in their languages. They are descriptive of a state of the mind/brain.
Despite the rejection of prescriptivism, Pullum insists that any grammar worth its salt must have normative force. The rules of grammar are meant to define standards of correctness for sentences, according to this view. Furthermore, he follows Kripke (1982) in questioning how such standards or norms can arise from the kinds of descriptive facts Chomskyans have attributed to brain states: ‘the world, whether on a geophysical or intracranial scale, cannot be said to be right or wrong’ (Pullum forthcoming: 4). [Footnote 9] Grammatical rules, on the other hand, are more like the rules of chess or driving, for example, ‘the bishop moves diagonally’, ‘right lane must turn right’ etc. You can choose to follow them and thus conform to the conventions of that practice or withdraw from the practice altogether. There is a Wittgensteinian, almost inferentialist, flavour to Pullum’s views on grammar here. He has even suggested a mechanism, borrowed from moral theory, to incorporate corpus linguistics and usage data into the theoretical fold. The proposed epistemology of syntax involves a reflective equilibrium between theory, judgements and corpora:
The goal is an optimal fit between a general linguistic theory (which is never complete), the proposed rules or constraints (which are not quite as conformant with the general theory as we would like), the best grammaticality judgments obtainable (which are not guaranteed to be veridical), and facts from corpora (which may always contain errors) (Pullum 2007: 37).
In order to proffer this alternative picture, he dismantles the concept of ‘rule’ in generative grammar. A generative grammar, for him, is neither a function nor a set enumerator, as some in the literature have suggested. It is a holistic, non-deterministic, random construction system (Pullum forthcoming). In other places, he traces the concept to Emil Post’s work on production systems (Pullum 2011, 2018). Whatever the interpretation of generative grammar (transformational, rewriting systems and so on), the generative notion of a rule is, for Pullum, so far divorced from the ordinary one that it is not fit for purpose.
The ordinary intuitive understanding of a rule is something that we can follow (that is, behave in a way that complies with it) or break (that is, violate or disobey it). It defines a regular pattern or practice, a way of “going on in the same way”. But nothing of the content of a generative grammar has anything like this character (Pullum 2018: 202).
It is not hard to imagine what Pullum thinks does have such a character, namely, model-theoretic syntax. We will return to that point in the next section. For now, it is clear that he does not subscribe to any concept of an ‘internalised, intensional, individualistic’ grammar or I-language. For Pullum, grammar is external, social and normative. But despite antagonism from both sides of the I-language/E-language divide, there is no reason to dismiss the possibility of a union of the two (Nefdt 2023). There is some neurolinguistic evidence for modularity and domain-specific linguistic mechanisms in the brain. In a recent review, Tuckute et al. (2024: 281) claim, among other things, that ‘the brain’s language system constitutes a distinct component of the mind and brain that is specific for language processing, separable from other cognitive systems, and relatively functionally homogeneous across its regions’. Others, such as Friederici et al. (2017), identify Broca’s area (BA 44), among other regions, as implicated in syntactic processing (along with the dorsal pathway). Describing the features of this brain state at a more abstract level, in terms of hierarchical trees and rules for deriving structures, might be less about conforming to the ordinary usage of the term ‘rule’ and closer to what generative theory has deemed parameters or settings. When this modular system’s output is externalised to the extracranial linguistic environment, it enters the world of norms and rules in the traditional sense. This option would, of course, be theory-dependent and synchronic. One might argue, with inferentialism, that language starts with external socio-linguistic rules that undergo internalisation over time, that is, from E-language to I-language. [Footnote 10] The options abound. In fact, as Seuren (2013) notes (channelling de Saussure), Chomsky’s I-language conception can only be part of the story of the social complexity of language. ‘If it were the whole picture…it would be impossible for an individual speaker to discover, say, a new word in L, since the speaker, being competent in L, should already know the word’ (Seuren 2013: 13). Of course, Chomsky’s point is more structural than lexical, but the same reasoning applies: in order to identify all the structures of a given language, we need to collate evidence from multiple I-languages. This introduces sociality, and with it comes normativity.
In sum, Pullum might be correct that grammar has normative force. However, this possibility does not rule out that something like an I-language could be characterised by a generative system of rules conceived of in a different way. Naturalism and normativity need not be in conflict.
5. A unified outlook on language
In this last section, we will attempt a unification of the different strands of Pullum’s philosophy of linguistics so far discussed. Given space limitations, I will only provide a sketch here. What makes Pullum such an inimitable scholar is his considered adherence to positions not commonly held together in the literature. Pullum is a formalist about linguistics. His model-theoretic alternative for syntax has been developed by him and others to a high level of mathematical precision. Despite this, he rejects formal properties such as discrete infinity as appropriate properties of natural languages. He believes in formally characterisable linguistic rules but attributes normative force to them and jettisons a psychological or biological grounding of such rules for language.
The formalism and the model theory are certainly not incompatible. As a guide to his interpretation of the term ‘formalism’, Scholz & Pullum (2007) distinguish between ‘formal’ as denoting systems which abstract over meaning and ‘formalization’ as a term indicating the conversion of statements of theory into precise mathematical representation. Pullum’s model-theoretic syntax is formal in both senses. The framework translates statements of grammatical constraints of a language into axioms or models of the grammar (via feature structures or attribute matrices, depending on the particular framework). In so far as these statements concerning syntactic norms or rules, such as ‘adverb modifiers in adjective phrases precede the head adjective in English’, abstract over meaning and focus on distributional patterns, they are formal in the sense of an uninterpreted calculus.
Pullum (forthcoming) further links model-theoretic syntax to the normativity of language. The best way to capture the rules of language, in a manner analogous to other common kinds of rules, is via constraints on linguistic structures. ‘What I mean by a constraint is simply a statement that may be true or not true when evaluated within a certain kind of object’ (Pullum forthcoming: 12). These can be expressed as constraints on formal tree representations, such as ‘if any node is a parent of both an NP node and a VP node, the NP node precedes’. What this kind of rule captures about Standard English is a norm of an expression type. It is an abstraction from social practices of rightness and wrongness of usage, not an abstraction of a brain state of the cogniser. Here, there is genuine disagreement between Pullum and the generativists. Trees are not in the head, but constraints on trees do characterise the linguistic norms of speakers (and signers) in the world. It is ultimately a model of behaviour.
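The quoted constraint is easy to render as a statement evaluated within a tree. In the following Python sketch (the tuple encoding of trees and the example trees are mine), the constraint is simply a predicate that a given object satisfies or violates:

    def subtrees(tree):
        # Enumerate every node of a (label, child, child, ...) tuple.
        yield tree
        for child in tree[1:]:
            if isinstance(child, tuple):
                yield from subtrees(child)

    def np_precedes_vp(tree):
        # 'If any node is a parent of both an NP node and a VP node,
        #  the NP node precedes.'
        for t in subtrees(tree):
            labels = [c[0] for c in t[1:] if isinstance(c, tuple)]
            if "NP" in labels and "VP" in labels and labels.index("NP") > labels.index("VP"):
                return False
        return True

    conforming = ("S", ("NP", "Kim"), ("VP", ("V", "left")))
    violating  = ("S", ("VP", ("V", "left")), ("NP", "Kim"))
    print(np_precedes_vp(conforming), np_precedes_vp(violating))   # True False

Like the rules of chess, the predicate says nothing about what is in anyone’s head; it simply sorts objects into those that conform to the norm and those that do not.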
The emerging picture is related to a view in the philosophy of logic and mathematics which, in my view, connects a number of streams in Pullum’s philosophy of linguistics, including his formalism, his penchant for Wittgensteinian rule-following arguments, normativity and the opposition to infinity. Constructivism encompasses combinations of all of the above to varying degrees. Pinning Pullum down is tricky here. Let us start with the basics. According to Dummett (1975: 403), constructivism in mathematics is a theoretical framework with the following properties.
[T]he meaning of all terms, including logical constants, appearing in mathematical statements must be given in relation to constructions which we are capable of effecting, and of our capacity to recognize such constructions as providing proofs of those statements.
Constructivism takes on many forms. The most extreme version, strict finitism, interprets ‘capable of’ in terms of practice and not principle. For a proof to be legitimate, it needs not merely to be finite but to be actually surveyable and intelligible in its details. Naturally, the infinite is on the chopping block. The idea that infinitary proofs in mathematics correspond to actual (completed) totalities has been something that scholars have wrestled with since Aristotle. [Footnote 11] Finitism has been advocated by many prominent mathematicians and philosophers, including Hilbert and Wittgenstein. Hilbert, for example, aimed to justify Cantorian set theory by means of a finitary arithmetic in the metatheory, whereas strict finitism rejects the transfinite conception of the natural numbers itself. Not all versions of constructivism block countable infinities, though. Intuitionistic set theory merely reinterprets the quantifiers in its expression of the axiom of infinity from ZFC. Larger cardinalities are, however, banned, which causes ripple effects for the axioms of power set and replacement used to ensure their existence. What does link constructivist approaches to Pullum’s claims about linguistic infinity is that they question the need for the axiom of infinity. ‘That we need an Axiom of Infinity for a theory of natural numbers and standard arithmetic turns out, however, to be an artificiality of (systems like) ZFC’ (Bremer 2007: 134). Similarly, as has been shown above, Pullum considers the linguistic infinitude claim to be an artefact of generative grammars. MTS can thus be seen as a constructive manoeuvre directed at its avoidance. Even the resistance to a mathematical induction for the set of natural languages discussed in Pullum & Scholz (2010) finds a voice in constructivism, wherein mathematical induction itself is challenged as a methodological tool. [Footnote 12]
What about normativity? Here, Wittgenstein’s (positive) thoughts on mathematics are relevant. For Wittgenstein, mathematics concerns a rich ‘network of norms’ (Wittgenstein 1975: VII, Section 67). Contra Platonism, mathematics is not the study of objective facts or facts of the matter, but rather the study of norms such as those that constitute the rules of language. ‘Let us remember that in mathematics we are convinced of grammatical propositions; so the expression, the result, of our being convinced is that we accept a rule’ (Wittgenstein 1975: III, Section 27). Thus, mathematical statements or sentences are perceived as primarily normative in nature.
To see how this connects to Pullum’s view of MTS axioms, consider the possible link between Wittgenstein’s metamathematics and Hilbert’s axiomatic method. According to Friederich (2010), Wittgenstein’s notion of mathematical sentences as norms can be specifically defended in terms of Hilbert’s notion of axioms as implicit definitions. The idea is that axioms defined implicitly do not report objective mathematical facts, but rather constitute normative conventions for the use of mathematical concepts. This presents grammatical axioms such as ‘if any node is a parent of both a V node and an NP node, the V node precedes’ as similar to arithmetic axioms such as ‘every natural number has a unique successor’. Instead of stating a fact or a primitive of the system, such a definition partly defines both ‘natural number’ and ‘successor’, that is, it establishes a connection (or dependence) between objects. Similarly, model-theoretic axioms in syntax, under this guise, connect VPs to NPs to PPs in terms of dependence relations.
Again, Friederich (2010) argues that the normative element of the axioms qua implicit definitions is brought out by noticing the equivalent ‘grammatical form’ of the axioms in terms of a covert ‘Let’ in front of them. For example, the sentence/axiom ‘every natural number has a unique successor’ should be understood as ‘Let every natural number have a unique successor’. Thus, ‘the grammatical form of this sentence makes it clear that its role is that of stating a norm for the usage of the concepts “natural number” and “successor” and not that of describing anything’ (Friederich 2010: 10). Pullum’s claims about grammaticality, model theory and normativity dovetail with this general Wittgensteinian picture combined with a specific form of constructivism about formal methods. [Footnote 13]
Of course, constructivism in mathematics, especially strict finitism, is a contested theory. [Footnote 14] The same applies to Wittgenstein’s views on the normativity of mathematical statements. Nothing I have stated so far hinges on their truth or acceptance. Nevertheless, the parallel between these accounts in the philosophy of mathematics and Pullum’s views in the philosophy of linguistics remains present and compelling, in my view. [Footnote 15]
6. Conclusion
In this article, I have described and interrogated a number of themes in Pullum’s philosophy of linguistics. I questioned whether model-theoretic syntax is really in competition with generative approaches. I discussed Pullum & Scholz (2010) on linguistic infinity and, in spite of general agreement with their challenges, showed how the infinitude claim might still survive as a theoretical posit. Lastly, I outlined his views on the normativity of grammar and contrasted them with the claims of biolinguistics. The hard-fought conclusion is a surprising union of these seemingly disparate elements in the philosophy of linguistics in terms of constructivism in the philosophy of mathematics, coupled with a Wittgensteinian approach to rule-following.
Acknowledgements
My appreciation for Geoff Pullum cannot be overstated. Not only did he read (and correct) an earlier draft of this article, but he has been a constant source of inspiration, critique and support in my career. I would also like to thank Agustin Rayo and fellow speakers and audience members at the Pullum tribute conference in Edinburgh in 2023 for useful feedback and engaging commentary on some of the ideas that eventually made their way into the present work.