1 Introduction
Spoken syntax produced in interactional settings poses a major problem for linguistic description as it appears largely fragmented from a formal perspective. The fragmented character mainly derives from two phenomena: the highly frequent occurrence of non-clausal, ‘elliptical’ structures (see e.g. lines 1, 4 and 7 in example (1) below) and the large number of morphosyntactically unintegrated, largely conventionalized expressions that are not obligatory from a clause-grammatical perspective as they lack a syntactic and semantic relationship to any other constituent (e.g. lines 2 and 6). These expressions form a larger class referred to as ‘pragmatic markers’ (e.g. Brinton 1996) or ‘interactives’ (Heine 2023) and include e.g. backchannels (e.g. yeah), discourse markers (e.g. I mean, so, you know, well), or interjections (e.g. oh), which may occur in different places within and around a structural unit. Example (1), taken from the Santa Barbara Corpus of Spoken American English (SBC; Du Bois et al. 2000–5), illustrates the two phenomena. Sharon and Carolyn are teachers talking about school-related issues.
(1)
1 Sharon: [These kids] are so --
2 (..) I mean,
3 their parents are so disinterested,
4 Carolyn: (..) in them?
5 Sharon: .. th- --
6 (..) Yeah=,
7 and their education, (SBC 004)
Given that syntactic theory has a long tradition of following a single-mind approach, under which a single speaker's syntactic choices are usually described without considering the dialogic-responsive character of speech (for exceptions, see section 2), almost all of the units produced in the different turns in (1) appear syntactically ‘deficient’ (the exception is line 3) as they do not correspond to full clauses, which form the basic unit of analysis in most formalist syntactic frameworks. Yet neither participant signals any difficulty in understanding the other, i.e. interlocutors can deal with such fragmented structures without any apparent problems. In fact, this is the default real-life processing experience of people involved in everyday interactions.
This article provides an analysis of syntactic fragments and their relation to surrounding structures in face-to-face conversations within a strictly interactional framework for the study of syntax as a social practice for organizing talk-in-interaction. It relies on empirical ethnomethodological principles of Conversation Analysis (Sacks 1992; Schegloff 2007) for the identification and understanding of interactively created syntactic structures within their sequential context, and on some conceptual understandings regarding syntactic phenomena in interaction (e.g. Lerner 2004; Du Bois 2014) for describing these structures. Data for analysis are based on recordings of private everyday face-to-face interactions provided by the SBC. It will be shown that fragments in speech are units of interactive syntax used as a communicative resource for coordinating interaction. Fragments will therefore be treated as communicatively ‘complete’ structural units in their own right in a ‘dual-mind’ approach to syntax. ‘Fragments’ are defined here from a formal-structural perspective as non-clausal units and from a communicative perspective as units that are produced in reaction to a prior unit in real-time conversation, thus co-occurring with another, preceding structure. This does not cover all possible kinds of fragments in spoken interaction, but those which are responsive to a preceding structural unit.
The article is organized as follows. Section 2 discusses the descriptive framework, arguing for a reorientation in syntactic description from a single-mind toward a dual-mind view on syntax. Section 3 describes the use and function of fragments in interaction and how they are linked to other syntactic units, looking at three phenomena: constituent replacements, syntactic expansions and the use of so-called ‘interactives’ (Heine 2023). Section 4 discusses the results of the qualitative-empirical analysis and how syntactic theory can benefit from the study of fragments in spoken syntax. The conclusions are presented in section 5.
2 From single-mind syntax toward dual-mind syntax
2.1 Social cognition and linguistic research
In spoken interaction, individuals no longer act solely from a first-person perspective, but adopt a reciprocity-based, bidirectional view on their speech behavior (awareness of action–reaction patterns), where at least two agents affect each other in terms of mental processing and behavior. Social interaction thus occurs under specific behavioral and cognitive conditions, which have recently received intense scholarly interest in the cognitive and neurocognitive sciences. As Schilbach (2010: 1) argues,
social cognition is fundamentally different when an individual is actively and directly interacting with others. In such cases, an individual adopts a ‘second-person perspective’ in which interaction with the other can be thought of as essential or even constitutive for social cognition, rather than merely observing others and relying on a ‘first- (or third-) person grasp’ of their mental states.
A large body of experimental research has shown that social activity involves synchronization of actions over time: when we interact with another person, our brains and bodies are no longer isolated, but immersed in an environment with the other person, in which we become a coupled unit through a continuous moment-to-moment mutual adaptation of our own actions and the actions of the other (Konvalinka & Roepstorff 2012; Dodel et al. 2011). For example, the relative success in decision tasks has been shown to correlate with the co-participants’ ability to find a common language for expressing their thoughts, i.e. to align their language use (lexical and structural choices) over time (Fusaroli et al. 2012). Teams were shown to perform better on a variety of tasks when the members mutually adapted to each other's actions rather than acting individually and playing different roles (e.g. leader vs. follower) (e.g. Konvalinka & Roepstorff 2012).
Studying syntax using paradigms that investigate language in a social ‘offline’ mode, i.e. based on structures isolated from an interactive context, rather than in the ‘online’ mode of a speaker acting in social context, i.e. from a dialogic point of view where the main task is coordination of actions (Richardson et al. 2007), means that the social signature in syntax remains largely elusive. Research programs including interaction as an integral part of linguistic analysis are mainly following the research paradigms of sociological Conversation Analysis (CA, e.g. Sacks 1992) and Interactional Linguistics (IL, e.g. Couper-Kuhlen & Selting 2018). One of the most prominent examples of what could be called a ‘social turn’ in the study of grammar and syntax is the foundational volume Interaction & Grammar by Ochs et al. (1996), in which several studies combined CA with functional linguistics in order to explore grammar as ‘part of a broader range of resources – organizations of practices, if you will – which underlay the organization of social life’ (Ochs et al. 1996: 2–3). The basic idea was that social interaction is ‘the primordial site for the use and the development of language’ (Schegloff 1996), and thus the context in which grammar is shaped. More recently, studies in this tradition have focused, for instance, on the grammar of responsive actions (Thompson et al. 2015), the noun phrase in interaction (Ono & Thompson 2020) or on ‘emergent’ syntax for conversation (Maschler et al. 2020). Such studies provide insights into dialogic structures in selected domains of language use and are important for the development of a more comprehensive theory of a dual-mind syntax of interaction.
In grammatical theory, empirical alternatives to the traditional ‘offline’ structuralist and generativist approaches have been developed with the rise of studies on structures produced in spoken interactions, such as the Grammar of speech (Brazil 1995), Spoken grammar (Carter & McCarthy 2001), Linear unit grammar (Sinclair & Mauranen 2006), Conversational grammar (Rühlemann 2006), On-line syntax (Auer 2009), Emergent grammar (e.g. Hopper 2011), or Dynamic syntax (Kempson et al. 2001). They have captured the cognitively significant real-time aspect that speakers in interaction are subjected to, e.g. lack of planning time, incrementality and unidirectional, temporal-linear emergence of structure in time, and brought us closer to a depiction of the speaker's actual experience with language (or continuous speech) in real life, where fragments or ‘chunks’ are as common as ‘full’ syntactic structures. However, inter-speaker processes and effects that cause interference and co-dependence of all participants’ contributions are still difficult to access with these approaches. An important exception in this respect is Bowie & Aarts’ (2016) study, which identifies different types of grammatical links between clausal fragments in spoken interaction. Recently, formalizations of dialogic-interactional aspects of syntax have been proposed by Wiltschko (2021) and Dorgeloh & Wanner (2023), among others; these studies seek to understand and describe syntactic structures with reference to interactional settings and the surrounding discourse. Yet a large proportion of these studies is based on structures and examples stripped of their dialogic context, which makes it difficult to see how co-participants mutually affect each other's syntactic choices in social encounters.
Syntax emerging in interaction is not merely the product of a single speaker, but based on the behavioral coupling between people in interaction and corresponding adaptive processes, as past decades of research in CA and IL have already shown: phenomena such as collaborative turn constructions (Lerner 1991), anticipatory turn completions (Lerner 2004), phrasal responses to question-word interrogatives in dialogue (Thompson et al. 2015), or various forms of alignment involving reuses of prior elements or structures (see e.g. Pickering & Garrod 2006; Garrod & Pickering 2009 from a cognitive-psychological perspective), give clear evidence of the dual-mind character of syntax. This view is also key to Du Bois’ dialogic syntax (2014), which is built on the premise that the use and interpretation of language is situated within a discursive field created by the utterances of prior speakers and that the structural organization of language crosses the boundary of single speakers. The result is a dynamic structure, the ‘diagraph’, defined as ‘a structure that emerges from the mapping of resonance relations between counterpart structures across parallel utterances’ produced in interactions (Du Bois 2014: 362). While sharing this premise, the present study differs in scope from dialogic syntax, as will soon become clear: dialogic syntax is basically about a next speaker's engagement with the prior speaker's utterance in terms of dialogic resonance (e.g. parallelisms), whereas the approach presented here – ‘dual-mind syntax’ – has a broader scope in that it looks at the general processes involved in the emergent construction of syntactic structures across single speakers. It sees syntax as a resource used by speakers to organize social interaction in general, beyond dialogic resonance.
2.2 Syntax in interaction
The basic task for co-participants in talk-in-interaction is to develop a mental model about the communicative situation and what is talked about, i.e. to come to understand what they are talking about in the same way, and the success of interaction crucially hinges on the degree to which these models become aligned or synchronized (Menenti et al. 2012). This explains why co-participants quickly begin to align (with varying degrees of conscious control) to the other participant's behavior, i.e. to mutually prime and repeat each other's linguistic and non-linguistic choices, to behave in more similar ways over the course of an interaction in various respects, and to synchronize their actions across modalities, from body posture (Shockley et al. 2003) and gestures (Willems & Hagoort 2007) to prosodic aspects and the take-over of lexical elements and syntactic patterns (Branigan et al. 2000).
Alignment is one of the best examples showing why single-mind approaches to the study of aspects related to interaction (such as syntax) are misleading: over time, co-participants interweave their activities, repeating, imitating and mutually influencing each other's behavior to such an extent that in the end behavioral patterns and linguistic choices can no longer be immediately attributed to a single individual. In other words, the interplay between individuals becomes so close that the emergent actions and structures are no longer merely the result of each individual's decisions in isolation, but co-created. This does not only include repeating the other participant's choices, but also operating on them, for example, by adding or modifying parts to/of a preceding structural unit while leaving other parts intact. In (2), we can observe how each new speaker builds his/her utterance syntactically upon an initial structure (line 1) over a longer stretch of talk. The speakers talk about a woman in the neighborhood.
(2)
1 Harold: [Does she even] have a b- a man-?
2 I guess she must.
3 Miles: … Does she have a what?
4 Jamie: [A ma=n].
5 Harold: [A ma=n].
6 Pete: @@@
7 Jamie: … She has some [kind of a] --
8 Miles: [at least ] temporarily,
9 Pete: [yeah],
10 Harold: [yeah]=.
11 Jamie: @@@ (H) [ at one time ].
12 Harold: [for about five minutes],
13 probably.
(SBC 002)
Note that, from line 4 on, all contributions are fragments, and that this fragmented character of syntax in interaction stretches over several turns. These fragments serve different communicative tasks: the simultaneous responses in lines 4 and 5 replace the interrogative pronoun in line 3 with a full lexical expression (a noun phrase) in order to remove the trouble source; the structures in lines 8, 11 and 12/13 are clearly responsive expansions of the initial structure in line 1 by adverbials creating amusement; and Jamie's structure in line 7 ‘resonates’ with the one in line 1, but, before it is cut off, includes a pre-modification of the man (‘some kind of a’), which is added to the original syntactic frame and describes the object (a man) in a jocular way. So the co-participants reuse and operate on other participants’ structures or parts of such structures rather than being oriented to building up ‘complete’ syntactic structures with each new contribution.
Structurally speaking, we have a source syntactic frame or ‘anchor structure’ whose epistemic validity is negotiated between the co-participants by modifying or adding single constituents. The anchor structure is the shared point of orientation since, in the end, all fragments are part of one coherent syntactic pattern, as illustrated in table 1.
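One way to make this slot-based reading of example (2) concrete is the following minimal Python sketch. It is an illustration under stated assumptions, not the article's own formalization: the class names, the slot labels (`operator`, `subject`, `verb`, `object`, `adverbial`) and the two operations `replace` and `expand` are hypothetical, the particle even in line 1 is left out for simplicity, and Miles's echo question in line 3 is treated as a filler of the object slot.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """One constituent position in an anchor structure (labels are hypothetical)."""
    function: str
    # Each filler is a (speaker, transcript line, text) triple, in order of production.
    fillers: list = field(default_factory=list)


@dataclass
class AnchorStructure:
    """A syntactic frame on which later fragments operate."""
    slots: list

    def _find(self, function):
        return next((s for s in self.slots if s.function == function), None)

    def replace(self, function, speaker, line, text):
        """Offer an alternative filler for a slot that already exists."""
        slot = self._find(function)
        if slot is None:
            raise KeyError(f"no slot {function!r} to replace")
        slot.fillers.append((speaker, line, text))

    def expand(self, function, speaker, line, text):
        """Add a further constituent, opening a new slot if necessary."""
        slot = self._find(function)
        if slot is None:
            slot = Slot(function)
            self.slots.append(slot)
        slot.fillers.append((speaker, line, text))

    def speakers(self):
        """All speakers who contributed to the emergent structure."""
        return {speaker for s in self.slots for speaker, _, _ in s.fillers}


# Line 1 of example (2), simplified, as the anchor structure.
anchor = AnchorStructure([
    Slot("operator", [("Harold", 1, "does")]),
    Slot("subject", [("Harold", 1, "she")]),
    Slot("verb", [("Harold", 1, "have")]),
    Slot("object", [("Harold", 1, "a man")]),
])

# Lines 3-5: the object slot is questioned and then refilled.
anchor.replace("object", "Miles", 3, "a what")
anchor.replace("object", "Jamie", 4, "a man")
anchor.replace("object", "Harold", 5, "a man")

# Lines 8, 11 and 12/13: adverbial fragments expand the frame.
anchor.expand("adverbial", "Miles", 8, "at least temporarily")
anchor.expand("adverbial", "Jamie", 11, "at one time")
anchor.expand("adverbial", "Harold", 12, "for about five minutes, probably")

for slot in anchor.slots:
    print(slot.function, [text for _, _, text in slot.fillers])
```

In this sketch no fragment stores a clause of its own: each contribution is recorded only as a filler relative to the frame from line 1, and the resulting structure carries fillers from three different speakers.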
The example shows that social interaction comes with particular affordances to syntax: the fluent back and forth of turns, getting in and out of the speaker role in group talk, relies heavily on the contribution of smaller fragments to a structure that forms the cognitive anchor of subsequent units. Psycholinguistically speaking, we can assume that there is a shared mental representation of structures during production (when speaking) and comprehension (when listening to another person): the structure progressively built up by one speaker is subjected to continuous synchronization of representations in the co-participant's mind, who is then able to produce a fitting piece of syntax (a fragment) after turn transition. This kind of cognitive ‘workspace sharing’ (Kempen et al. 2011) allows syntax-in-interaction to be distributed over two or more participants and to maintain the conversational flow.
The larger structure that emerges over time across several turns is the emergent product of the linguistic activity of interacting minds, and thus reflects structural properties of interaction. Therefore, each single speaker's contribution is essentially the result of an interactively created network of mutually interfering, co-dependent structures and meanings. The larger emergent structure is not merely the sum deriving from the single speakers’ contributions, e.g. S1 + S2 + …, but a new whole beyond the single parts, a hyperproduct SH (e.g. the entire structural network in table 1) that carries signatures of social coordination and has an additional value in that it reflects properties of emergent interaction or interacting minds, including, for example, interaction markers like yeah. SH results from the joint dynamics between interacting individuals rather than from autonomous ‘offline’ brains acting in a social vacuum. This analytic approach resonates with Du Bois’ (2014: 359) concept of the ‘diagraph’ as ‘a higher-order, supra-sentential syntactic structure that emerges from the structural coupling of two or more utterances’ (or utterance portions), though this coupling is not limited to ‘the mapping of a structured array of resonance relations between them’ (2014: 376) here.
Of course, speakers do not always merely add fragments to existing structures, but also produce syntactically independent structures. Anchor structures, i.e. structural units on which co-participants may operate in subsequent turns, alternate with co-dependent fragments at regular intervals. The point is that each new structural unit can potentially be used by other participants for modifications, expansions etc., serving as a new ‘anchor’. In (3), for instance, Lenore asks a question that diverges from Alina's current topical focus, using a syntactically independent structure (line 8).
(3)
1 Alina: (H) So October fourth [rolls around,
2 Lenore: [ (H)= ]
3 Alina: and Liza had to go do some ]thing,
4 so they're stuck babysitting Cassandra.
5 … Great.
6 .. You know,
7 just your big thrill [in the world,
8 Lenore: → [How old is she now].
9 Alina: this little] piss ass.
10 .. (H) Four,
11 five,
12 some place around there,
13 I can't remember.
(SBC 006)
Since Lenore's question is not immediately congruent with Alina's narrative project as it represents a digressing move, it is expressed in a structural frame that reflects its discursive independence, i.e. as a syntactically autonomous unit. The answer, in turn, is delivered by sharing the syntactic frame introduced by Lenore, in which the interrogative pronoun is replaced (lines 10, 11) by numerals, followed by a fragment indicating epistemic uncertainty (line 12).
2.3 The relationship between single-mind vs. dual-mind syntax
The larger structures emerging from the serial production of fragments in real-time interactions largely overlap with the internal structure of independent sentences described in traditional single-mind frameworks, as shown in table 1 above. Single-mind approaches to syntax, based on speakers acting in an interactional ‘offline’ mode, thus appear to provide the cognitive foundation underlying the co-participants’ shared mental representation of anchor structures and the internal structural coherence of the emergent SH. This will also become clear from the discussion in section 3.
However, we need to include additional assumptions and concepts that account for the social signature of spoken syntax produced in interactional contexts. The reason is that meaningful interaction requires more than just the recruitment of internalized syntactic knowledge allowing a speaker to produce ‘well-formed’ structures ready for analysis as decontextualized units. The basic requirement in social settings is to be able to take part in interaction in a dynamic way, maintaining the flow of interaction, and producing units that are responsive and maximally fitted to the emergent structural environment co-produced by multiple agents. Structurally speaking, competent speakers need to be able to mutually coordinate their syntactic choices by integrating each new syntactic fragment in a coherent overall syntactic frame spanning various speakers. Also note that, as will be shown below, syntactic practices go hand in hand with the establishment of successful ‘grounding’ (Clark & Brennan 1991), as these practices are part of the co-participants’ coordination of their distinct knowledge states and serve the ongoing process of assuring mutual understanding as part of collaborative action.
3 Fragments in dual-mind syntax: three structural phenomena
Dual-mind syntax analyzes how speakers structure language in ongoing interaction while changing their roles continuously from speakers to listeners and back as participants in the same cognitive activity. Before I give a rough outline of the descriptive procedure in dual-mind syntax using three phenomena for exemplification, a remark on the representation of the data and the categorization of fragments is in order.
Data deriving from interactional settings are usually represented in a way that shows segmentation based on speaker roles and intonation units. This format is not particularly useful for dual-mind syntax as it does not adequately represent inter-turn syntactic dependencies. We need a format that allows us to see (i) how co-participants modify single constituents of a single syntactic unit (e.g. these kids > their parents) in subsequent turns (indicated by means of boxes in (1')), (ii) how next speakers expand a syntactic unit ‘in play’ by adding possible further constituents (indicated by means of dotted boxes) that open up slots left unfilled by the prior speaker (=empty boxes), and (iii) how interaction markers (in bold) link syntactic fragments on the interactional level.
(1')
This representation shows more clearly how seemingly free-floating, unintegrated fragments produced by different speakers tend to be mutually compatible and form part of a larger coherent syntactic structure collaboratively constructed by two or more speakers. The transcripts are from now on arranged in this way in order to highlight the syntactic fit of subsequent structures while preserving prosodic and other details. All of the three structural phenomena (i)–(iii) just mentioned fall into the broader category of co-constructions in social interaction, defined as the ‘joint creation of a form, interpretation, stance, action, activity, identity, institution, skill, ideology, emotion, or other culturally meaningful reality’ (Jacoby & Ochs 1995: 171).
The example in (1') shows that we need to distinguish two types of fragments: those that express lexico-grammatical content (though not necessarily a complete proposition) and that have the potential to be a clause constituent (e.g. in them, line 4), and fragments that do not as they organize language beyond grammar and semantics on a higher discourse-interactional level (e.g. yeah, line 6, or oh, line 9) (for a similar classification, see Bowie & Aarts 2016). The latter are therefore called ‘interactives’ here (Heine 2023). Both types of fragments may be combined, as in lines 6 and 7 (yeah and their education). All fragments depend on their structural environment for their interpretation. No distinction will be made between lexico-grammatical fragments that make a ‘complete’ contribution by implementing an action (e.g. a request for confirmation, line 4) and those that do not (e.g. ‘abandoned’ units such as these kids are so-): for the scholar, the latter may play no clear semantic, grammatical or pragmatic role for analytic purposes, but we can make no judgments as to whether this also holds for the co-participants, for whom they may, for example, provide an interpretive cue relevant for predictive utterance processing (such as ‘this utterance is still about school, though not about the kids, but their parents’ for line 1) or serve as an ‘anchor structure’.
I will now describe how fragments form an integral part of co-constructions by analyzing their structural properties, their interactional functions and the ways in which they are linked to their structural environment.
3.1 Fragments replacing constituents
In all of the cases falling into this functional category, second speakers (or, with self-repairs, single speakers) produce a fragment that represents a constituent of the same syntactic category and with the same syntactic function as the one that it replaces in a preceding syntactic unit, the anchor structure (in bold from now on). These cases represent paradigmatic links involving alternatives for the same grammatical slot (see also Bowie & Aarts 2016). A common form of constituent replacement occurs in the context of other-repairs (Schegloff 2000), as in (4), where Sharon repairs and semantically specifies Kathy's candidate assertion.
(4)
In both cases (lines 3 and 6), Sharon is oriented to the syntactic structure produced by Kathy; the original head (kids) and the modifier (twelve) of a constituent in that structure (the object twelve kids) are replaced by the next speaker, who represents the epistemic source.
A similar process – this time carried out incrementally by a single speaker as a reaction to the co-participant's responses – can be observed in (5): Alice offers Annette something to eat, continually replacing the nominal referent (the direct object) by single fragments without repeating the anchor structure.
(5)
Here, the subsequent items represent alternatives that may co-exist with the prior one. The various fragments are, in the end, embedded in the structural environment of the anchor structure, which remains activated over several turns.
Functionally, the process is carried out for various communicative purposes, mainly in order to specify, repair, repeat, replace, expand or modify elements in a prior syntactic unit on the expressive or semantic level, thus allowing co-participants to negotiate meanings and to fine-tune semantic details. In (6), both speakers negotiate the meaning of Annette's utterance go out and eat by means of exemplification. This time, the anchor structure is first expanded (see section 3.2 below) by a wh-pronoun that requests an adverbial of place (like where, line 3), which is then replaced by NP fragments.
(6)
Semantic specification as part of other-initiated repair is also illustrated in (7), where Annette requests a confirmation of the replacement of the semantically underspecified pronoun that.
(7)
In (8), the semantic detail negotiated by the speakers is a price. Again, the second speaker (Richard) provides the constituent replacing the question word that functions as a variable, thus operating on the basis of the syntactic frame created by the prior speaker Fred (for a detailed discussion of similar cases, see also Thompson et al. 2015: ch. 2).
(8)
One of the main goals of verbal interaction is the joint negotiation of meanings, for example, particular places, times, persons etc. related to an action or a state. It therefore comes as no surprise that the replacement operation is very frequent in spoken data and characteristic of ‘grounding’ processes, i.e. the joint construction of a common mental model of what is talked about.
The following examples illustrate higher degrees of complexity. In (9), two speakers work out the details of a location (Richard is talking about the location of his car dealership) by producing syntactic elements of the same category (local adverbials). Each contribution in the sequence is a constituent that is built upon the anchor structure (we're on ... Firestone Boulevard). Note how the anchor structure is first expanded (Firestone where?) before further post-modifying adverbials are delivered (by the…, before the…, right past…), which are continuously replaced in order to mentally fine-tune the denoted place. The continuous replacement of a single constituent, or of parts of that constituent, is accompanied by interaction markers (yeah, exactly, right).
(9)
The replacements can also affect more than one constituent, and thus lead to a complex structural interplay of various subsequent fragments, all of which are related to the anchor structure. In (10), for instance, the participants negotiate semantic details expressed in a place and a time adverbial.
(10)
A further example is (11), where two speakers jointly work out the referent of the pronoun she that represents the subject in the anchor structure she's pregnant.
(11)
All these examples show how co-participants make their contribution syntactically fitted to a structure created by a prior speaker, thus focusing the joint negotiation of meanings on the syntactic constituent that forms the object of negotiation on the meaning level. Metaphorically speaking, it appears that co-participants jointly work on a single syntactic ‘thread’. The precision of a fragment's syntactic fit even allows speakers to use unconventional, creative expressive devices to specify referents, as illustrated in lines 7–8 in (11), where Jamie uses a dialogic element (the mocking quote get over here you nya nya nya…) that imitates the referent in a jocular way and whose meaning becomes clear since it is designed as running parallel to all the prior elements in this slot (e.g. being marked for definiteness).
Constituent overlap occurs typically with wh-questions, where next speakers replace the interrogative pronoun in the syntactic unit produced by a prior speaker, ‘filling in’ the value of a variable (Bowie & Aarts 2016). This corresponds to Fox & Thompson's (2010) observation that phrasal units are the default option for ‘no-trouble’ responses to specifying wh-questions: these units are specifically fitted to the lexicogrammar of wh-questions and thus in a maximally ‘symbiotic’ relationship with the sequential context. From this perspective, NP fragments are fully integrated in the mentally activated syntactic frame created by the prior speaker, as illustrated in (12).
(12)
Note that there is no need for Annette to change the deictic reference from you to I (and, correspondingly, the form of the verb) if we assume that, together with Alice's syntactic frame, she also implicitly takes over her perspective, from which she is a second person. (Alternatively, we can assume that the deictic shift occurs implicitly through the mere fact that there is a change of speaker roles.)
Processing efficiency, a design feature of human language use in general (see Gibson et al. 2019 for an overview) and of communicative interaction in particular, can be considered the main motivation underlying the use of syntactic fragments. Interaction requires a certain degree of efficiency in order to allow for a fluent interplay of turns and to maintain coherence of talk, especially when it centers around the negotiation of smaller semantic details. Avoiding redundancies prevents co-participants from having to reprocess units of language that have just been processed (‘anchor structures’) and that can be assumed to remain mentally activated over several turns and serve as structural hosts for subsequent syntactic fragments.
3.2 Expansions and co-completions
A second common operation in dual-mind syntax is to operate on a prior speaker's syntactic unit by expanding it with further possible syntactic constituents that do not replace a constituent, but add a further one (indicated by boxes in dotted lines). In these cases, the expanding fragment is syntactically continuous with the prior structure, which shows again how co-participants are co-oriented to a single syntactic pattern. Functionally, this operation serves to provide or ask for more details or to co-complete another speaker's structural unit. The phenomenon corresponds to what is referred to as collaborative turn-construction (Lerner 1991) and collaborative turn-sequence (Lerner 2004), which refers to the joint creation of a single structural unit, for example a sentence or a narrative construction format, in subsequent turns produced by two or more speakers. Next speakers may either expand (Schegloff 2016) a prior syntactic unit beyond a point of grammatical completion by adding further, structurally optional constituents (= recompletion), or complete an emergent syntactic unit in which some grammatical projections (e.g. the object of a verb) are still open.
In (13), Richard expands the prior speaker's structure by adding an adverbial in order to confirm and strengthen the truth value of Fred's proposition, which can be interpreted as a candidate understanding requiring confirmation (note, again, that a change of the deictic reference hired you > me is not necessary for the reasons mentioned above).
(13)
Expansions may also include adding an entire relative clause (see also Tao & McCarthy 2001) as a strategy to weave a metacomment into a prior speaker's assertion, as shown in (14).
(14)
A similar case is the that-relative clause in (15), which specifies the referent stupid form and an application, respectively.
(15)
Some authors (e.g. Carter & McCarthy 2001) argue that non-restrictive relative clauses seem more like a second main clause. In fact, it would be possible to interpret which in (14) as a discourse-linking expression establishing a relation on the textual (rather than syntactic) level, the unit following it being syntactically autonomous, one indicator being that such which-clauses can provide a comment on a larger discourse unit rather than on the immediately preceding clause. The discussion cannot be taken up here and loses much of its relevance when we do not consider structures in isolation, i.e. from a single-speaker perspective, but account for the fact that participants co-construct structures both within and beyond single syntactic configurations, creating a hyperproduct on various levels of the language system.
Expansions may co-occur with repetitions of a constituent in the anchor structure. In this case, next speakers re-create parts of the original syntactic frame in order to establish a closer link between the anchor and the expansion, as illustrated in (16).
(16)
Assertions can be expanded for negotiating meanings, for example for clarifying semantic details such as a specific location, as in (17), where the interrogative pronoun where serves as a request for an expansion of the anchor structure with an adverbial that expresses the missing information.
(17)
Such negotiations based on other-initiated repairs can also be performed by means of candidate understandings that are designed for confirmation, as with in here in (18).
(18)
Next speakers can thus ‘reopen’ a potentially completed syntactic structure, often as an implicit request for confirmation, i.e. in repair contexts. In (1''), Carolyn adds a prepositional complement (in them?) to a subject complement (so disinterested), thus expanding a syntactic structure, which is then expanded again by Sharon herself. In this way, both speakers jointly and cumulatively create higher syntactic complexity in a piecemeal way, a complexity that emerges from the activity of two interacting minds.
(1'')
As argued by Mushin & Pekarek Doehler (2021: 13), this kind of ‘incrementation’ can be understood as a syntactic practice that allows speakers to maximize the compatibility of actions performed in subsequent turns between the principles of intersubjectivity (i.e. establishing mutual understanding) and of progressivity in the sense of Heritage (2007).
So far, we have only looked at expansions of clausal units. We will now turn to completing next-speaker continuations of syntactic units in progress, i.e. units in which some of the grammatical projections emanating from the words-so-far are still open when the next speaker comes in, as in (19).
(19)
The communicative function of such co-completions partly differs from that of the expansion of clausal units above: while they may also serve as candidate items offered for confirmation in order to negotiate semantic details, they often appear to be used as a communicative resource to signal that one is listening and understanding (otherwise the next speaker would be unable to provide a candidate understanding of what the prior speaker was likely to say). Moreover, they serve as a communicative strategy to become involved in the co-construction of talk, where both participants work together to establish ideas and jointly contribute to create and maintain a particular communicative frame (e.g. joking) rather than passively observing the communicative work done by a single speaker.
In (20), Annette expands Alice's structure by adding a structurally fitting syntactic fragment (line 7) in order to signal understanding.
(20)
Note that expansions may include partial repetitions (e.g. of prepositions), which mark or strengthen the syntactic environment in which the expansion is integrated.
Further examples of other-completions, which again mainly serve the negotiation of meanings, are (21)–(22). In (21), Pete helps Marilyn out with a candidate word.
(21)
The same happens in (22), this time with a partial repeat causing overlap of a constituent.
(22)
The conclusion that we can draw from these examples is that in dual-mind syntax fragments are integrated in a mentally activated anchor structure. The pervasiveness of expansions (or incrementation) shows how important it is to conceive of syntax as a collaborative, joint process rather than as a single speaker's achievement alone. Sharing the work of constructing syntactic structures by producing fragments designed for concrete structural environments created by co-participants is intertwined with interactional tasks such as collaboratively working out meanings, helping out co-participants with candidate words and formulations, signaling involvement and shared understanding, or participating in the creation of a jocular frame, which shows that syntax is one important resource for joint interactional work.
3.3 Interactives
The social aspect of syntax-in-interaction manifests itself also in the use of expressions that deal with interaction management and the organization of talk as a social event, which encompasses functions such as turn taking management, backchanneling, response elicitation, or getting the addressee's attention. The linguistic elements serving these functions form a broad class of expressions, mainly including interjections, vocatives, backchannels/response signals, attention signals, discourse markers, or social formulae, and have recently been subsumed under the label ‘interactives’ by Heine (2023), which will be adopted here. Given that interactives are grammatically unintegrated into the structures they accompany and that many of them represent highly conventionalized, formulaic pieces of syntax (e.g. you know, I mean, I think, oh my God), they also fall into the category of fragments. They form a group of their own as they typically do not express lexico-semantic and grammatical content.
An example of the use of interactives is (23), where we find a broad range of interactives (in bold) serving various functions related to ongoing interaction.
(23)
1 Jamie: We're gonna have babies crying.
2 … [in the middle of the night].
3 Harold: [ (GROAN) ]
4 → … Well it's no worse than her screaming at em,
5 → is it?
6 Pete: → … Yeah but now you'll have both.
7 Jamie: → … Yeah right.
8 … Probably be like,
9 <VOX °shut up you ki-° VOX>,
10 → you know,
11 XX?
12 → Oh= Go=d.
13 … I feel --
14 I s- feel like such an old lady.
(SBC 002)
Initial well indicates a partial divergence (Schiffrin 1987; Heine 2023: 130) from what can be inferred from Jamie's utterance (which is that the babies will cause a change in the noise background), which is partially denied by Harold. Yeah and right are interaction markers indicating acknowledgment (yeah) and agreement (right), while you know in line 10 serves to make salient the implications of having a neighbor talking and behaving as imitated by Jamie, based on shared knowledge or plausibility that allows for jointly constructed implications (Jucker & Smith 1998: 173). Oh God displays the speaker's stance toward the situation (disapproval) and thus serves as an interpretive cue for the co-participants.
Interactives are frequent in spoken interaction: in the stretch of talk that consists of fifty words in (23), there are eight expressions classifiable as interactives, i.e. 16 percent of the words are related to interaction management. The analysis of four conversations in the SBC involving two, three or four participants (SBC 002, 004, 006, 042) yielded a relatively consistent average of 15–18 percent of interactives. So organizing interaction and ongoing discourse is an important task to which speakers devote a considerable amount of their speech activity.
From a syntactic perspective, such expressions are difficult to deal with as they are syntactically unintegrated fragments in the sense that they occur outside formal, morphosyntactic dependency relationships with any other constituent. A typical test for syntactic independence is that expressions such as yeah right, you know or oh God above can be omitted without causing any change in grammaticality or any loss in semantic content, i.e. they are detached from their syntactic environment.
Yet there is reason to assume that the use of interactives does have to do with syntax under a dual-mind approach. A framework that is useful for the description of the grammar of interactives in dual-mind syntax is the one proposed by Heine (2023). Following his analysis, structural relations involving interactives may include three components: a speaker (S), a hearer (H) and a topic (T). T can be part of ongoing discourse, such as something that the prior speaker just said (T1) or that the current speaker will say (T2). A further T-type is a ‘topic’ that derives from the communicative situation, i.e. when speakers refer to something that has not been explicitly verbalized or is not verbalized in upcoming discourse. This is, for example, the case with ouch after bumping against somebody. I will introduce the label (TS) (s = ‘situation’) for this topic type.
Interactives express a relationship between at least two of all possible components S, H, T. Initial well in (23) above, for instance, is used to signal that there is a relationship between two pieces of discourse – T1 and T2 – and indicates that this relationship is based on the attitude of the current speaker (S) towards the prior speaker's (H) view, namely partial disagreement. So well serves a relational function in four functional domains (S, H, T1, T2), i.e. all of these ‘arguments’ are required in uses of well, which can be made explicit by formulating a paraphrase that captures the concrete function of well in (23) (taken over from Heine [2023: 130]):
Paraphrase: I (S) want to slightly correct what you (H) have just said [or implied, A.H.] (T1) by drawing your attention to the fact that there is also T2.
In analogy to descriptive traditions in sentence-based syntax, we can postulate an argument structure for well that looks as follows (based on Heine 2023: 130):
Argument structure: well (T1, S, H, T2).
In line 5 in (23), the same speaker (Harold) then establishes a relationship between the structural unit just produced and the next speaker's turn by finishing his turn with a tag question (is it), which structures interaction as it prepares a transition of speaker roles. So the use of the tag question requires S and H as arguments in order to be felicitous. Moreover, it links T1 to an upcoming T2, whose production is made relevant by it. Thus, the argument structure is the same as for well: (T1, S, H, T2). Pete then produces a response marker (yeah), by which he establishes a relation between the utterance (T1) produced by Harold, who is now the hearer (H), and himself (S) as the one who reacts to it. Since yeah does not necessarily require more talk after its production, the argument structure includes only three arguments that must be present for appropriate use: (T1, S, H). Pete then continues with more talk that is introduced with but, a discourse marker indicating a contrastive relationship between T1 produced by H and upcoming talk T2 produced by himself (S), so but is characterized by the argument structure (T1, S, H, T2).
Note that, as in traditional syntax, the argument structure of interactives is determined by the environment in which they are used: ouch, for example, may involve only (S, TS) when a speaker hurts himself, but it involves (S, H, TS) when it is produced in reaction to some act performed by the addressee. Oh can be (T1, S, H) when the speaker reacts to what a prior speaker said (indicating e.g. surprise or disappointment), but it may also be (T1, S, H, T2) when the speaker uses oh to indicate the sudden rise of a new idea that is set off from prior talk and is followed by more talk, as in (24).
(24) Sabrina: →… Oh=,
→ Mom?
Kitty: → Yeah?
Sabrina: … When we turn --
% When we turn in the five hundred dollars?
(SBC 042)
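The argument structures assigned to the interactives in (23) and (24) can also be written down as a small lookup table. The following Python sketch is meant only as an illustration: the names `Role`, `ARGUMENT_STRUCTURES`, `arguments` and `projects_upcoming_talk`, as well as the context labels, are hypothetical and not part of Heine's (2023) framework. The entries merely restate the context-dependent analyses given in the text.

```python
from enum import Enum


class Role(Enum):
    """Interaction-bound components named in the text."""
    S = "speaker"
    H = "hearer"
    T1 = "prior discourse"
    T2 = "upcoming discourse"
    TS = "situation"


# Hypothetical lookup table: interactive -> context of use -> argument set.
# The context labels are illustrative shorthand, not established terminology.
ARGUMENT_STRUCTURES = {
    "well": {"turn-initial": frozenset({Role.T1, Role.S, Role.H, Role.T2})},
    "is it": {"tag question": frozenset({Role.T1, Role.S, Role.H, Role.T2})},
    "yeah": {"response marker": frozenset({Role.T1, Role.S, Role.H})},
    "but": {"contrastive": frozenset({Role.T1, Role.S, Role.H, Role.T2})},
    "ouch": {
        "self-directed": frozenset({Role.S, Role.TS}),
        "addressee-directed": frozenset({Role.S, Role.H, Role.TS}),
    },
    "oh": {
        "reaction": frozenset({Role.T1, Role.S, Role.H}),
        "new idea": frozenset({Role.T1, Role.S, Role.H, Role.T2}),
    },
}


def arguments(interactive: str, context: str) -> frozenset:
    """Return the argument set of an interactive in a given context of use."""
    return ARGUMENT_STRUCTURES[interactive][context]


def projects_upcoming_talk(interactive: str, context: str) -> bool:
    """True if the interactive takes T2, i.e. links to talk still to come."""
    return Role.T2 in arguments(interactive, context)
```

Under these assumptions, `projects_upcoming_talk("yeah", "response marker")` is false, whereas the same call for `"well"` in its turn-initial use is true, which mirrors the difference between three-place and four-place interactives described above. Keying the table by context as well as by form reflects the point that the argument structure depends on the environment of use.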
Following the descriptive schema used to illustrate relationships in dual-mind syntax in this paper, we can represent the structural relationships established by interactives as shown in (23’). The relations between the arguments (S, H) are not indicated in order not to overload the schema; they are present in all cases, but left implicit. The dotted boxes indicate a possible, non-obligatory slot for the use of the respective interactive, in analogy to possible, but not obligatory expansions discussed in section 3.2.
(23’)
The schema illustrates the complex network of interrelated structures, which is essentially what characterizes interaction and distinguishes it from two speakers merely producing monologic sequences of talk. Interactives occur at major boundaries in emergent talk and thus fulfil an important integrating function, especially in environments that are potentially disruptive as the smooth progression of talk and structural compatibility are at risk, for instance after speaker changes or when new topics are introduced. Interactives thus help speakers link syntactic units of any kind, e.g. clauses, pieces of (spoken) discourse, but also syntactic fragments (as in lines 7–8 in (23’)), to prior and upcoming structures, thus integrating them in emergent, ongoing discourse, and to structure the continuous change of participant roles from speaker to listener and vice versa.
However, interactives operate in a domain of grammar that is not based on morphosyntactic relationships and propositionality, but on relationships that encompass the social situation. I will call the domain in which such relationships are relevant ‘macrogrammar’ as these refer to aspects beyond sentence grammar. Macrogrammar is defined as that part of human language and language cognition that serves to establish relationships outside morphosyntactic and semantic relations; the domain of the latter I call microgrammar. This distinction follows a longer tradition of postulating dualistic frameworks that separate traditional sentence-based grammar from a component of grammar that deals with structural relationships on the interactional level involving coherence relations in the communicative system as a whole, for which labels such as Thetical Grammar (e.g. Heine et al. 2013), Macrogrammar (e.g. Haselow 2017), Dépendance macrosyntaxique (Debaisieux 2007) or Interactive Grammar (Heine 2023) have been proposed.
Both components of grammar are equally required in dual-mind syntax: while fragments serving constituent replacement and expansion (see sections 3.1 and 3.2 above) rely on microgrammatical principles in the sense that they are integrated in a syntactic frame shared by the co-participants, fragments serving interaction-structuring functions (‘interactives’) follow macrogrammatical principles, given that their use does not, or not necessarily, require compatibility with the morphosyntactic and semantic rule system underlying microgrammar.
For reasons of space, a detailed description of the relation between the two components cannot be provided here, but longer discussions are provided, for example, in Heine et al. (2013), Haselow (2017) or Heine (2023: ch. 7). The major distinctions deriving from ongoing research on these two domains of grammar are summarized in table 2.
The survey shows that the two grammars have complementary functions in various respects. (i) refers to what is structured: structuration may either occur within units that are composed (either collaboratively or by a single speaker) on the basis of morphosyntactic principles and determined by semantic planning, or between syntactically independent units of discourse, following principles of interaction (e.g. responding, linking subsequent turns). (ii) is about differences in meaning: microgrammar is about expressing propositional content and relations between propositions, and is thus sensitive to truth conditions, whereas macrogrammar is about meanings anchored in the communicative situation and sensitive to the course of interaction (e.g. upcoming turn transition and projection of partial disagreement). (iii) is about the locus in which syntactic rules and principles are anchored. In microgrammar, they are determined by general principles holding in a given language (e.g. word order rules, morphosyntax) and are thus rather abstract, though closely interacting with pragmatic factors, which may have effects on constituent order and movement operations (e.g. left- and right-dislocations). In macrogrammar, the use and position of expressions is determined by local interactional needs for expressing information relevant for the interaction system (e.g. eliciting a response, indicating emotive–expressive meanings). So mastering macrogrammar requires knowledge of how interaction is structured.
Further research needs to explore the details underlying the relationship between the two components of grammar: looking at the data, it appears that either of the two is always active during speech production since speakers may continually shift from one to the other while building up structural relations. For example, with responses to yes–no questions in question–answer adjacency pairs, speakers may opt for ‘interjection-type responses’ for doing the job of confirming (e.g. yeah, uh-huh, mm-hm), i.e. interactives, but they may also create fragments based on the machinery of microgrammar (e.g. he is), or both (yeah, he is). So response signals can be produced by mobilizing resources from either of the two domains of grammar (see also Heine 2023: 708–710). How exactly this continuous shift occurs in cognitive terms is still an open question.
4 Discussion
From a dual-mind perspective, fragments contribute to the emergence of a dense network of mutually interacting structures over the course of an interaction. The emergent co-construction of syntactic structures destabilizes any static conception of syntax in terms of fixed ‘units’: syntax is emergent (Hopper 2011) and remains ‘open’ for further processes (e.g. replacement of constituents, expansions) over several turns, which allows several participants to work on one single syntactic unit collaboratively. Syntactic fragments are, like actions in general (Heritage 1984: 242), both context-shaped and context-renewing in the sense that syntax is adapted to local-sequential environments provided by a prior speaker, but also provides new structural contexts. Fragments turn syntax into an organized collaborative accomplishment.
In contrast to single-mind approaches to syntax, dual-mind syntax describes syntactic relations across single speakers in interactional contexts: based on conventionalized syntactic trajectories, co-participants may expand, revise, or co-construct a syntactic thread via distributed syntactic processing at particular moments in ongoing interaction using type-fitted structural fragments. The relevant evidence has been specified above, but it goes without saying that the empirical foundation needs to be far broader.
Against the background of the frequent interweaving of syntactic structures across speakers, statements such as ‘conversation can do without the lexical and syntactic elaboration that is found in written expository registers’ (Biber et al. 2021: 1038) that we find in many accounts of spoken conversational syntax become questionable. They may be true if we look at single turns or turn-constructional units (Ochs et al. 1996), but this would ignore the interplay of structures in emergent talk, by which higher degrees of lexical and syntactic complexity may be co-created, both on the micro- and the macrolevel of grammar.
Dual-mind syntax is based on moment-to-moment decisions on how to continue an emergent syntactic thread, the leading end of which is cognitively activated in the co-participants’ minds as it forms the object of current mental processing. This implies a two-way processing mechanism by which the structural decisions made by a second speaker are coupled to those made by the prior speaker, which creates an action–perception loop between individuals that is at the core of language-based human interaction (for similar views on the action level, see e.g. Hari & Kujala 2009; Konvalinka & Roepstorff 2012). This, in turn, would explain why fragments are so frequent in interaction: co-participants are not only oriented toward building up syntactic units ‘from scratch’, but continuously synchronize with what another speaker is doing and operate with that speaker's structural decisions whenever this is congruent with their own communicative goals.
The discussion of interactives as elements of macrogrammar further contributes to an understanding of language as it is produced and structured in real-time interaction. Given that fragments, and interactives in particular, play hardly any role in traditional (sentence-based) syntax, they expand the methodological and conceptual tools required for the analysis of structures produced in interactive settings. The idea of postulating an argument structure for interactives, which goes back to Heine (2023), allows us to bring them closer to the kinds of formalization we use in ‘traditional’ syntax, and to show that they do not form a uniform class, but differ from one another in how many and what kinds of interaction-bound arguments they take and what kind of relationship they establish. They are building blocks organizing language on a macrolevel.
It should be noted that the analysis of the three major processes provided here does not and cannot capture all possible aspects of dual-mind syntax. This was not intended here. For instance, nothing has been said about structurally ‘autonomous’ fragments anchored in the communicative situation (e.g. nice shirt as a compliment), whose structural design and meaning are not bound to prior syntactic units. Such phenomena, which would certainly also belong to dual-mind syntax, have to be addressed in further studies.
5 Conclusion
Syntax is deeply involved in the structuration of verbal interaction, but its social signature can be easily overlooked when interacting speakers are studied as isolated agents (as in single-mind approaches) rather than as forming a social unit. In order to include this social signature in descriptions of spoken syntax, an analytic framework labeled ‘dual-mind syntax’ was proposed here. Under this approach, syntactic description may reach far into the real-life language experience of concrete speakers acting in a social context. Dual-mind syntax analyzes the co-dependencies and interferences between structural fragments in emergent interaction. As a result, syntactic fragments are not deficient forms of syntax, but part of a dense network of mutually dependent, coherent structures. Dual-mind syntax may thus expand our understanding of how language works by showing how syntax is shaped by its use in the complex habitat of spoken interaction. Understanding the social dimension of language structure is indispensable for expanding the dominant first-person, single-speaker-based analysis and thus for accounting for the fact that language behavior does not merely originate from autonomous processes within an individual's mind, but is also shaped by and adapted to input coming from other participants in an interactional setting.
Transcription symbols
- [ ]   overlap and simultaneous talk
- &   intonation unit continued
- ..   short pause (less than 0.5 sec)
- …   medium pause (approx. between 0.5 and 1.0 sec)
- (2.0)   measured pause
- =, ==   segmental lengthening according to duration
- you need--   truncated intonation unit
- sma-   truncated word
- (H)   inhalation
- sma@ll   talk infected with laughter
- @   laughter
- <XX>   uncertain hearing
- °word°   produced softer than surrounding talk
- .   falling intonation (terminal pitch)
- ,   continuing intonation
- ?   rising intonation
- <VOX   speaker changes voice quality