1. Introduction
Persian is a term for a collection of closely related western Iranian varieties. It is spoken in Iran, Afghanistan, and Tajikistan, and serves as an official language in these countries. This paper deals with the K-suffix in Classical New Persian of the ninth to thirteenth centuries (CNP), Contemporary Written Persian of the late nineteenth to mid-twentieth centuries (CWP), and Contemporary Spoken Persian (Tehran variety) in Iran (CSP).
In all CNP written works, a suffix of the form -ak/ek/ag is attested, primarily occurring with nouns but also with adjectives and adverbs. It has traditionally been classified as “diminutive” and presumably is cognate with several formatives containing a velar plosive [k], or a reflex thereof, in other Iranian languages (Balochi, Kurdish, and Lori) and Indo-Aryan. However, in CWP and CSP texts, a suffix of the form -e (K-suffix) is attested mostly with singular nouns. The status of this suffix in CWP is largely similar to that of the K-suffix in CNP, but in CSP it is clearly associated with definiteness. The original function of these suffixes is yet to be established with certainty, but available accounts from both CNP and CWP suggest a high degree of multifunctionality of this suffix. There is often a semantic component of “less than expected size,” but more frequently we find an evaluative component expressing the speaker's empathy, familiarity, endearment, and respect, or conversely, disdain with respect to the diminutive-marked noun.
Such evaluative connotations are widely attested cross-linguisticallyFootnote 1 and in other Iranian languages such as Balochi, Old Shirazi, and Lari.Footnote 2 Given the salience of the evaluative components (and the lack of any reference to “size” in many contexts, see below), I follow Pakendorf and Krivoshapkina in referring to the function of this morphology as evaluative rather than diminutive.Footnote 3
The paper concentrates on what we term the definitizing function of the K-suffix in Persian. It can be demonstrated that, at least in CSP, the K-suffixes are associated with definiteness in a manner approximately comparable to the better-known definite articles of the languages of Europe, e.g., English and Swedish. However, this association is still highly dependent on the speaker, genre, and setting.
Almost all previous studies on the development of definiteness marking assume a demonstrative as its origin (see Section 6). For that reason, the Persian definiteness marker has considerable implications for our understanding of definiteness systems and their emergence more generally. Looking at the function of the K-suffix in different phases of Persian (CNP, CWP, and CSP), a well-documented New Western Iranian language with available recorded material from its earlier stages, it can be stated with some certainty that the definiteness marker is not related to a demonstrative element.
To the best of my knowledge, there is no previous detailed study of the K-suffix from a diachronic perspective in Persian. The data for this work is taken from extensive corpora of the language phases under study. I complement the quantitative data with a qualitative approach, which demonstrates the various functions with authentic examples and appropriate references to context. I also refer to the results of a questionnaire-based survey of CSP speakers, which is based on the questionnaire used for Kurdish, Balochi, Shirazi, and Lori (see Section 6.2).Footnote 4
One of the most exciting aspects of the data is the high degree of inter-speaker/writer and inter-text variability, particularly in the CWP and CSP corpora. The definiteness function of the K-suffix in CSP is systematically documented for very few texts, typically only for folktales and biographical tales. This is very similar to the results from the questionnaires, which show a high degree of non-conformity and non-systematicity in the definiteness usage across the speakers.
Contrary to the Shirazi data,Footnote 5 the grammaticalization development in CSP appears to be fairly sensitive to speech contexts, typically genre rather than linguistic context.Footnote 6 Given that the usage of evaluative morphology is, by definition, primarily determined by interactional context, this finding is not surprising.
This paper is organized as follows: first, it deals with definiteness and types of definiteness contexts and provides an overview of the Persian language and data. Then it covers previous studies of the K-suffix in Persian and demonstrates the multifunctionality of the K-suffix. The evaluative function of K-suffixes in CNP and CWP is then presented, after which K-suffixes functioning as definiteness markers in CSP are illustrated. Data is presented from an extensive text corpus and questionnaire data, and a suggestion is made regarding the original K-suffix in CNP, CWP, and CSP. Finally, the findings are discussed in light of a new grammaticalization pathway from evaluative to definiteness marker.
1.1 Definiteness
Definiteness will be understood here as a property of a noun phrase that is derived from its information status in a given linguistic context. It is thus a contextual property of referring expressions rather than an inherent property of nouns. A number of different approaches to definiteness have been pursued in the literature, including a philosophical approach invoking uniqueness,Footnote 7 and a discourse-pragmatic approach.Footnote 8 I follow Lyon in considering the primary component of definiteness to be the notion of identifiability.Footnote 9 A noun phrase is considered definite if the speaker assumes that its referent is uniquely identifiable by the addressee. Languages differ cross-linguistically in the extent to which, and means by which, they systematically indicate definiteness in morphosyntax. In English, French, or Arabic, definiteness is marked fairly consistently using items generally referred to as “articles.” Other languages may mark definiteness by affixes, clitics, word-order properties, or various combinations of these strategies; alternatively, they may have no regular means for indicating definiteness. A noun phrase may have definite status by virtue of several possible contextual factors, which we broadly characterize as follows:Footnote 10
In contrast to the seven definiteness contexts outlined above, nouns may be indefinite (either specific or non-specific) or have generic or sortal reference. The correct analysis of generics is beyond the scope of this paper.Footnote 12
2. The Persian Language
Persian belongs to the Western Iranian branch of the Iranian languages, which in turn belong to the Indo-Iranian branch of Indo-European. Persian is the only Iranian language that has documents available from the Old Persian of the Achaemenids, the Middle Persian of the Sassanids, to New Persian (since the eighth century). Different delimitations of the phases in the development of New Persian have been presented by Iranian scholars. For instance, Lazard introduces the following phases: Early New Persian for the language of the tenth to eleventh centuries, and Classical New Persian for the New Persian of the twelfth to nineteenth centuries, with the twelfth century as a transitional period.Footnote 13 I find these classifications to be a bit too complicated for the present study. For the sake of brevity, I use Classical New Persian (CNP) of the ninth to thirteenth centuries, Contemporary Written Persian (CWP) of the late nineteenth to mid-twentieth centuries, and Contemporary Spoken Persian (CSP) in the present paper.
Modern Persian is a verb-final language that lacks a morphological case system and shows the same alignment system in the past and non-past tenses. Persian is mainly spoken in Iran, Afghanistan, and Tajikistan, and is considered a language of education in these countries. The area where Persian is spoken is highly diverse linguistically. Contact languages belong to four different language families and to different genera: Indo-European (Indo-Aryan and Iranian), Dravidian, Turkic, and Semitic.
Data for CNP come from critical editions of works from the ninth to thirteenth centuries (see Table 1), data for CWP come from books of fiction from the late nineteenth to mid-twentieth century (see Table 2), and data for CSP come from an extensive corpus of spoken Iranian Persian narrative and a questionnaire answered by fifteen speakers from Tehran (see Section 5). Fig. 1 presents the location of the data for Contemporary Spoken Persian.
Before we begin our journey into the K-suffixes in the Persian language, I will briefly comment on the functions of the K-suffix other than the evaluative one, viz., its derivational functions.
Derivations with the suffix *-ka- are well attested in Old Indo-Iranic (especially in Old Indo-Aryan). Edgerton offers a detailed survey in two papers with the same title, published in the consecutive issues 2–3 of volume 31 of the Journal of the American Oriental Society.Footnote 14 He identifies the core semantics of *-ka- for Proto-Indo-Iranic by comparing the Vedic, Sanskrit, and Avestan evidence:Footnote 15 “1) the formation of nouns of likeness or adjectiv[e]s of characteristic; 2) the diminutiv[e] and (perhaps) pejorativ[e] formations, 3) occasional formations with 2 ka [i.e., adjectives of appurtenance or relationship],Footnote 16 mainly pronominal adjectiv[e]s, and 4) the primary formations from verbal bases, apparently inclining towards the meaning of verbal adjectives or nouns of agent.”
The K-suffix -ak in Persian largely reflects Edgerton's classification. Iranian traditional grammarians already report a similar classification.Footnote 17
In the CNP works under study, the evaluative semantics of K-suffixes are more predominant than the other (derivational) functions, which include the derivation of adverbs from adjectives and of nouns from adjectives. Note that the K-suffix -ak is more productive as a word-creation suffix in CWP and CSP than in CNP, probably because of a national need for the creation of words.
In the following example, the adjective narm “soft” has changed into the adverb narmak, “softly, slowly.”
3. The K-suffixes in CNP: Initial ObservationsFootnote 19
Data for analyzing the K-suffixes in CNP comes from critical editions of works from the ninth to thirteenth centuries. Table 1 provides a list of these works.
Across CNP texts, a nominal suffix is found with the forms -ak/ek/ag. Footnote 22 These are likely to be reflexes of the K-suffix -ag in Middle Persian,Footnote 23 e.g., pus-ag “boy” and CNP pesar-ak “boy.”
The K-suffix has been attested with nouns, e.g., pesar-ak “boy,” darvīš-ak “dervish,” adjectives, e.g., ǰavān-ak “young,” saqīr-ak Footnote 24 “small,” andak “little,” and adverbs, ānak “now.”Footnote 25
Traditionally this suffix is referred to as a “diminutive.” The K-suffix in CNP has largely been ignored in previous research, although its existence has been reported by scholars. For Early New Judeo-Persian, Paul reports that “-ak functions as diminutive, or it appears without semantic modification, e.g., kanīzak, ‘girl’, xāharak, ‘sister’, mardumakan and šamšērak ‘sword’.”Footnote 27 Gindin, in an unpublished study on Early New Judeo-Persian, mentions -ak as a diminutive suffix, such as in “jūyz-ak,” a diminutive of jūy “river.”Footnote 28
Qarib and colleagues introduce the suffixes -ak/īk/, -čeh/, -žeh/zeh, -īk, -ū, -ek, and -e as diminutive suffixes; however, they maintain that these suffixes also cover other semantics, e.g., respect, endearment, and pejorative.Footnote 29 Similarly, Ahmadi Givi and Anvari mention -ak, -ū, and -e as diminutives.Footnote 30 Khayyampur reports that -ak, -čeh, and -ū are used as diminutive suffixes, among others.Footnote 31 Natel KhanlariFootnote 32 considers the suffix -če to be diminutive and the suffix -ak to be šebāht “a likeness suffix.”Footnote 33
3.1 Evaluative and Diminutive Usage in CNP
The most frequent usage of the K-suffix is to express evaluative or diminutive semantics, and it is even compatible with indefinite contexts. The term “diminutive” implies the descriptive content “smaller than normally expected,” and this is evident in some usages of K-suffixes. However, even in these contexts, an evaluative connotation is often discernible and, for the sake of brevity, following NourzaeiFootnote 34 I gloss the suffix with EV, as the most general indication of function, regardless of actual context.
In example (3) the K-suffix gives a description of the physical size of the branch, šāx-ak=ī “a small branch.” Note that the K-suffix is compatible with the indefiniteness context.
Similarly, in example (4), the K-suffix provides a description of the physical size of the deer's fawn. Note that the K-suffix follows a distal demonstrative ān “that.”
In example (5), the K-suffix provides a description of a small amount of water.
In example (6), the K-suffix adds a flavor of sorrow on the part of the speaker regarding the Hendu male slave, rather than a description of the physical size of the male slave.
Similar to example (6), example (7) adds a flavor of sorrow on the part of the speaker regarding the deer's mother, who was following the hunter when she repeatedly fell down, rather than a description of the physical size of the deer's mother. Note that the K-suffix follows a proximal demonstrative īn “this.”
The evaluative component is more obvious in the following examples. In example (8), Joseph's father refers to his son with a K-suffix, although the son is grown up. This is obviously a signal of endearment and affection on the part of the speaker towards the son, rather than a description of his physical size. Note that the K-suffix has been attested in both vocative and non-vocative contexts.
Similar to example (8), in the following passage, a dialogue between God and the prophet Noah, Noah refers to his son with a K-suffix, although the son is grown up. Again, this is obviously a signal of endearment and affection on the part of the speaker towards the son, rather than a description of his physical size.
In example (10), the K-suffix occurs with an “admiration and respect” connotation. The K-suffix on “Hasan” demonstrates respect towards Hasan, who was an important and influential figure in the Ghaznavid state, rather than a description of his physical size.
Similar to example (10), the K-suffix in example (11) displays admiration and respect towards Abul Abulqāsem-e Hakīm, rather than a description of his physical size.
K-suffixes also occur with pejorative connotations. This can be seen in vocative contexts such as in example (12). The following passage is taken from a dispute between the king and a dervish. Here the K-suffix reflects the king's anger and disapproval of the dervish in the given context.
This can also be observed in vocative contexts, as in example (13), which is taken from a dispute between Halāl and the holy man. Here the K-suffix reflects the king's anger and disapproval of the holy man in the given context.
Finally, we should point out that certain words typically indicating both human and non-human referents seem to include the K-suffix as part of the word stem. The suffix lacks any apparent separate semantic content.
In sum, the K-suffixes of CNP are widely attested with some kind of evaluative semantics, but also as lexicalized and semantically empty elements, which are presumably remnants of the high-frequency evaluative usage associated with certain words. We assume that the multifunctionality of the K-suffix is reasonably representative of earlier stages of Persian and is also compatible with what is known about K-suffixes in earlier stages of other New Western Iranian languages such as Shirazi, Lari, and Balochi. However, in the three phases of Persian (CNP, CWP, and CSP) being studied here, the functionality and frequency of K-suffixes have diverged quite considerably. In particular, in specific genres of CSP, the K-suffix -e/he exhibits a regular marking of definiteness in anaphoric and bridging contexts (see Section 6).
I begin with an outline of K-suffixes in CNP, before focusing on the usage of the K-suffix in CWP (Section 5) and CSP (Section 6) and presenting frequency data from the corpora (Section 7).
3.2 Analysis of the K-suffix in CNP
The K-suffix attaches to nouns, adjectives, and adverbs. The following passage shows the K-suffix with an adjective:
The K-suffix in CNP has a variety of functions, with no obvious structural constraints. However, there is one type of context that demonstrates a different reading than the normal multifunctional semantics of the K-suffix (see Sections 3.4 and 3.5).
The K-suffix in CNP is compatible with indefinite contexts, as in examples (15) and (16).
Examples (17) and (18) show that the K-suffix is compatible with proper nouns, for example, the personal names Hasan-ak “Hasan,” Mahmūd-ak “Mahmud,” gandom-ak “Gandom,” xayr-ak “Xayrak,” mār-ak ebne allsalāt “Marak ebne allsalāt,” and sarbāt-ak “Sarbātak.” Note that proper nouns such as these, where the stem and this suffix can be clearly distinguished, are very rare in the manuscripts. The lack of such examples in these works is probably indicative of the strongly interactional nature of the K-suffix in CNP.Footnote 51
We should point out that there are certain words, typically proper names, which seem to include the K-suffix as part of the word stem, i.e., sīyāmak “Siyamak,” bābak “Babak,” and āl barmak “Albarmak.”Footnote 54
As with proper nouns, the K-suffix is compatible with place names, for example, “čenāša,” “koškak,” and “ġūzak,” as in the following example:
Note that it is not at all obvious what semantic content the K-suffixes have in these contexts; they appear to be relatively vacuous. In contrast to the proper nouns, this type of noun has a high frequency across the critical editions of works, with Tārikh-e Sistān being an example.
In CNP, there is no constraint against combining the K-suffix with the plural suffix (see Sections 5 and 6 on this point in CWP and CSP). The following examples illustrate a K-suffix with evaluative sense followed by a plural marker “-ān.”
There is no restriction with the K-suffix in relation to the possessed nouns (see Section 6 on this issue).
To sum up, the K-suffix in CNP texts has various functions,Footnote 62 and is not subject to structural constraints such as obtain for CWP and CSP (see Sections 5 and 6). However, we find singular nouns, often accompanied by proximal/distal demonstratives, taking a K-suffix with no apparent connection to small size or any particular evaluative notion. Such examples are very rare and would require a larger corpus to study. However, in Old Shirazi these functions of the K-suffix predominate.Footnote 63
Before demonstrating the use of K-suffixes as signals of proximity and familiarity/recognition, it would be helpful to outline indefiniteness and definiteness strategies in CNP.
3.3 Indefiniteness and Definiteness Strategies in CNP
In CNP, discourse-new,Footnote 64 specific, singular NPs are overtly marked for indefiniteness with an enclitic =ī on the noun, e.g., dōst=ī “a friend” and zan=ī “a woman,” as in the following examples. This pattern has been attested in Middle PersianFootnote 65 and Old Shirazi.Footnote 66 Definite NPs, on the other hand, are generally considered to lack any consistent marker of definiteness and are left unmarked.
Once introduced, a referent has the status of definite (anaphoric definite). The two most common strategies for indicating definiteness across CNP (ignoring anaphoric pronouns and zero anaphora) are either combining the noun with a demonstrative pronoun, preferably the distal demonstrative ān, or using the bare form of the noun with no additional marking.Footnote 69 The following passages (taken from Dārābname) demonstrate these two possibilities. A garden is introduced as a singular indefinite in example (28):
The second mention (anaphoric definite) takes the distal demonstrative ān “that” in combination with the noun ān bāġ-rā, “that garden”:
After this introductory sequence, there are several lines of intervening text with distal demonstratives referring to the garden before it is mentioned again as a bare noun bāġ, “the garden”:
Similar examples with bare nouns can be found in comparable contexts in all works. A similar system has been noted for other Iranian languages such as Vafsi,Footnote 73 Balochi,Footnote 74 and Kurdish.Footnote 75
In sum, I can conclude that, although discourse-new, singular nouns are consistently marked for indefiniteness throughout CNP, the marking of definiteness is not consistent. The two strategies most commonly used are the demonstrative plus noun and the bare form of the noun.
3.4 K-suffixes as Signals of Proximity
The K-suffixes occur in what I will refer to as contexts of proximity. By this I mean contexts in which the referent is an item within the immediate perceptual range of the interlocutors, and will therefore often be accompanied by a proximate demonstrative. Thus, we have a combination of a proximal demonstrative and a noun carrying a K-suffix, as in example (31).
Note that this example lacks any obvious physical size connotations. Instead, it seems to be dependent on a deictic concept of proximity. This is one of the most prevalent functions of the K-suffix -ō in Old Shirazi.Footnote 78
3.5 K-suffixes as Signals of Recognition and Familiarity
The only evidence of a familiarity/recognitional reading of the K-suffixes occurs in some works under a relatively tightly constrained set of conditions, and only with the singular nouns discussed in examples (32) and (33).
The following passage is taken from an account in Nowruznāme.Footnote 79 In line 3 of the story, the boy has been introduced for the first time with pesar=ī “a boy,” and the writer refers to the same referent, “boy,” with a proximal demonstrative plus a K-suffix. Among the spectators, the king is pointing to the boy. He says “bring that boy to me,” in line 5 of the story, which refers to the same referent again with a demonstrative pronoun plus a K-suffix (when the king commands his ministers to bring that boy to the palace). Interestingly enough, at the end of the same line, he refers to him with a K-suffix without a demonstrative pronoun. In the rest of this account, the writer refers to him either with a bare noun pesar-rā “the boy” or with a proximal or distal demonstrative pronoun plus the null form, īn pesar/ān pesar “this boy/that boy.” This passage demonstrates that the K-suffix does not convey the physical size of the boy, but instead illustrates a familiarity/recognitional notion of the reference.
In the works, I only found one particular case of this. In line 1 the doctor is introduced in the discourse for the first time without the K-suffix tabīb=ī “a physician,” and in line 5 the writer refers to the same referent with a K-suffix tabīb-ak “the doctor.” In the rest of the story, the same referent appears without the K-suffix, tabīb “the physician.” Such passages demonstrate that the K-suffix does not express any physical notion about the physician. Instead, it conveys familiarity/recognition.
Note that we do not have sufficient examples of this type to draw any significant conclusion. In the later stages of Persian, for instance in Golestān Saʿdī and Totināme, we cannot find these types of passages. It would be interesting to closely examine this suffix from the fourteenth to the early nineteenth centuries to see which evaluative notions are more predominant.
Summary
The corpus data for CNP demonstrate that the K-suffix has evaluative semantics that account for most of its usage. It is compatible with indefiniteness contexts, and there are no structural constraints (see CWP and CSP on this issue). It somewhat resembles a sporadic remnant of a now defunct morphology that appears to have been incorporated into some items without any discernible change in meaning; see examples (19) and (21).
In CNP, however, we find nouns accompanied by demonstratives and nouns taking a K-suffix, with no clear connotation of small size, little amount, or clear evaluative content. These passages provide some evidence of how evaluative markers might have evolved towards definiteness marking. One of the most recent cross-linguistic studies on diminutives demonstrates that diminutives also convey meanings of endearment, familiarity, and proximity.Footnote 82 In the case of the proximity and recognitional contexts shown in examples (32) and (33), the concept of familiarity is reduced to physical proximity and shared common ground. Thus, it is not unreasonable to see an evaluative suffix becoming associated with proximity in a non-evaluative sense. We have already observed the concepts of proximity and shared common ground in the K-suffix in Balochi,Footnote 83 and it is the most prominent function of the K-suffix -ō in Old Shirazi Persian,Footnote 84 although in both Sistani Balochi and Old Shirazi, evaluative usage prevails overall. The suggestion here is that the proximate and shared-knowledge usage may have provided a bridging context for the transition from evaluative meaning to definiteness marking.
4. The K-suffix in Contemporary Written Persian: Initial Observations
Data for Contemporary Written Persian are taken from books written in colloquial Persian published from the late nineteenth to mid-twentieth centuries. Table 2 gives an overview of these books.
So far, I have given a detailed discussion of the nature of the K-suffix -ak in CNP (see Section 3). Across the works, we only found one form of the K-suffix, namely, -ak. However, in the CWP books we found four varied forms of the K-suffix (see Section 7 for a discussion of their origin):
(a) a continuation of the K-suffix -ak in CNP as an evaluative notion, e.g., Hammad-ak, “Ahmad,” dīb-ak “demon,”Footnote 85 and hamūm-ak “bathroom.”
(b) the existence of new K-suffixes, e.g., īk, in zan-īk-e, “woman,” ū, in yār-ū “friend,”Footnote 86 -ī, in Hasan-ī “Hasan,” and -e in pesar-e, “boy,” which are mostly found in colloquial and informal written texts with mostly singular nouns.Footnote 87 I assume the -ī suffix to be a short form of the -īk suffix in Hasan-ī “Hasan.”Footnote 88 Determining whether or not they derive from the same origin is not the main point of this paper; what is important is that they display similar (evaluative) semantics.
To the best of my knowledge, Qarib and colleaguesFootnote 89 and AnvariFootnote 90 present the K-suffix -e, together with -ū, -ak, and -če, as a diminutive marker in their studies. However, a definiteness effect associated with the K-suffix -e in Modern Persian has already been mentioned by various scholars.Footnote 91 In the following section, I will discuss the K-suffix -e in Contemporary Written Persian (see the following section) and Contemporary Spoken Persian in Iran (Section 5).
4.1 K-suffix -e in Contemporary Written Persian
Before we study the status of the K-suffix -e/he in CSP, I will give a detailed description of the K-suffix -e in CWP. In contrast to the K-suffix -ak in CNP (Section 3), the K-suffix -e is mostly attested in informal and colloquially written books with a handful of singular nouns.Footnote 92 Note that I found three instances of the K-suffix -e with the plural marker -hā, e.g., čerā mesl e xāle zan-īk-e-hā harf mīzanī “why are you talking like gossiping women?”Footnote 93
Its semantic domains in CWP are, to a large extent, similar to those in CNP. However, there are some examples of K-suffixes that distinguish CWP from CNP (see Section 4.2).
Analysis of the K-suffix in CWP
As in CNP, the K-suffix in CWP is compatible with indefinite contexts. See example (34).Footnote 94
It has been attested with the proper nouns ādm-e and Havvā-e, which are signals of the endearment connotations of this suffix.Footnote 97 Note that the same writer used ādm and Havvā without marking them with a K-suffix in his short story titled Afsāneye Afarīnesh.
Example (37) is an ambiguous case. The K-suffix could be interpreted as adding a flavor of sorrow/empathy on the part of the speaker regarding the fate of the small, orphaned boy. It could also be interpreted as a recognitional context, when the girl again refers to the boy after several intervening lines.
The K-suffix also occurs with pejorative connotations, as in the following examples. This can be observed in vocative contexts. Note that there are two evaluative suffixes on the items in examples (38)–(40).
In example (41) the K-suffix occurs in a vocative context:
Similar to the K-suffix -ū in modern Shirazi Persian, I find it in indefiniteness contexts, as in example (42).
Finally, I should point out that certain words, typically indicating place referents, seem to include the K-suffix as part of the word stem, such as in example (43). Note that some compound nouns, such as Albālū xošk-e “dry cherry” in ʿAlaviye khānom, need further investigation regarding the function of -e.Footnote 106
In contrast to the K-suffix in CNP, the K-suffixes are not attested with possessed nouns formed with person-marking clitics or copula verbs (see example 24). When a noun and an adjective are combined, the K-suffix is attached to the second constituent of the NP, as in pesar bozorg-e “the old brother.” See the following example.
Note that in some books written earlier in the period being studied, we find the K-suffix on the first constituent of compound nouns (a noun combined with an adjective) such as doxtar-e=ye češm sefīd “impudent girl.”Footnote 109 It seems that the movement of the K-suffix to the second constituent of the noun phrase occurred in its later stages of grammaticalization.
4.2 Attestation of the K-suffix -e in Non-evaluative ContextsFootnote 110
We have already found some contexts where the K-suffix -e does not express a diminutive or evaluative sense. Instead, the item marked with the K-suffix has a referent in the previous clauses or, in some cases, the marked items can refer to common background knowledge.
Before introducing these passages, I will briefly summarize definite and indefinite strategies in CWP. As in CNP (Section 3.3), discourse-new, specific, singular NPs are overtly marked for indefiniteness across the CWP texts. Definite NPs, on the other hand, are generally considered to lack any consistent signal of definiteness.
Indefinites are marked slightly differently than in CNP (see Section 3.3). The word ye/yek “one” preceding the noun (ye kaftār, “a hyena”) may combine with a suffix =ī (yek martīke=ī “a man”). Once introduced, a referent has the status of definite (anaphoric definite). As in CNP, there are two common strategies for indicating definiteness throughout CWP: (a) combining the noun with a demonstrative (ān doxtar, “that girl”), (b) using the bare form of the noun with no additional marking (kaftār “the hyena”).Footnote 111
In the following passage, taken from a story in ʿAlaviye khānom, the word kaftār “the hyena” is introduced in the discourse as a singular indefinite.
Following the introduction, the second mention (anaphoric definite) takes a bare noun kaftār. The writer refers to it several times in the story with a bare noun kaftār. He only marks it with the K-suffix -e once (on page 127), while in the rest of the story it appears as a bare noun.
It is evident from these passages that the K-suffix does not express an evaluative sense. Still, the K-suffix does not mark the items consistently or systematically. It is hard to find a motivation for the writer to mark the same item with a K-suffix only once, and not in the remaining passages of the story.
Similarly, in the following example, the NP “girl” is introduced for the first time in the story in a restrictive relative clause, ān doxtarī ke “that girl who.”
The second mention in line 12 takes the distal demonstrative ān doxtar, “that girl.” In line 36, the writer again refers to the girl and marks it with the K-suffix, as in the following example.
In line 38 the writer refers to the girl with a combination of the distal demonstrative and the K-suffix -e, ān doxtar-e “that girl.”
In the following example, the item abre “cloud” marked with the K-suffix -e has a referent in the previous context yek teke abr “a bit of cloud.” Note that it comes with the distal demonstrative. It is also worth noting that throughout the books, there are very few passages where the second mention (anaphoric) is marked with a K-suffix (see CSP on this issue).
Similarly, in the following example, the item doxtar-e “the girl” marked with the K-suffix -e has a referent in the previous context ye yatīm=ī “an orphan.” In the continuation of the story, the same referent appears as a bare noun and PROX+NP. It is notable that, after 17 lines, the writer refers to the girl and marks the referent with a K-suffix -e, as doxtar-e “the girl.”
Example (50) is a unique case in the corpus. In the story, pesar “the boy” appears as a bare noun. It is marked just once with the K-suffix in combination with the demonstrative when the man points to the boy and says, “he is not a painter, he is reciting a poem for this boy who is sitting in front of the shop.” In the rest of the text, the same referent “boy” appears as a bare noun.
The writer similarly marks the item zan azīz-e “beloved wife” with a K-suffix, when the woman is pointing to another woman standing close by and says to the man that the beloved wife (lit. dear woman) is over there.
After this, the writer refers back to it either with a bare NP or a combination of demonstrative plus noun.
The following examples, (53) and (54), demonstrate a mutuality reading. Mutuality involves contexts in which the identity of the referent is known by both speakers through their shared world knowledge, even though the referent has not previously been introduced in the linguistic context.
The marked noun dom=e šotor-e “the tail of the camel” does not have a referent in the previous clauses. However, the writer still marks it with the K-suffix because it is familiar to both writer and reader via their common cultural background. This usage has been reported for the K-suffix -ō in Old Shirazi.
Note that the same expression is not marked with the K-suffix in his other book Zende be gur.Footnote 121
Summary
Across the texts, the K-suffix -e of CWP is quite similar to that of CNP, with evaluative connotations accounting for the greatest amount of use. It has been attested in indefiniteness contexts. It shares deictic and recognitional uses with CNP in broader contexts. However, we also encounter some instances in which the K-suffix marks items that have a referent in a previous context and do not convey any evaluative sense. Such examples are rare, but they indicate how an evaluative suffix can develop into a definiteness marker and pave the way towards anaphoric definiteness (for discussion of this as a typical pattern in CSP, see Section 5). In contrast to the K-suffix in CNP (see examples 22–23), the K-suffix -e does not occur with plural markers and possessive constructions, typically when the latter are formed with person-marking clitics and enclitic verb copulas.
This observation can be linked to Hawkins's suggestion that each stage of grammaticalization “maintains the usage possibilities of the previous stage and introduces more ambiguity and polysemy, but expands the grammatical environments and the frequency of usage of the definite article.”Footnote 123
Finally, what should we call the K-suffix -e in CWP?Footnote 124 In my view, this is an open question. However, as we can see above and in Section 4.1, the K-suffix -e is not yet mature and has not grammaticalized as a definiteness marker as such. It is scattered unsystematically throughout the texts and largely preserves its original evaluative connotations. It is still on the way towards becoming a definiteness marker in Persian, as will be discussed in the next section.
5. Contemporary Spoken Persian
Data for CSP stem from the Persian Language Database (PLD) online corpora,Footnote 125 Taghi's corpus,Footnote 126 and my new recordings of Tehrani speakers from Tajrish and my field notes.Footnote 127 The corpus contains a total of 60,207 words (see Table 3 for an overview). In addition, I use spontaneous speech data from Bamberg-Hamedan joint online data,Footnote 128 a variety called Hamedani Persian, and my new recordings. The main speech topics are personal accounts, education, science, and so on.
5.1 Background of Speakers
I do not know the ages of the participants in the PLD corpora; I was informed only that the data were recorded from native, educated Tehrani male and female speakers who were born and lived in Tehran. The main speech topics are marriage, women's rights, tales, and free conversations, recorded in 1370/1991 and written down in Persian. I transcribed them for this work. The recorded data is about three hours long.
I use twelve texts published in Taghi.Footnote 129 These texts are recorded from two Tehrani speakers aged seventy-three and seventy-five, and written down in Persian. I transcribed them for this study. According to the information supplied by Taghi, both speakers were educated in Islamic schools (savād maktab). They were born in Tehran and lived there for their entire lives. The second speaker moved to Sweden at the age of seventy, but traveled back and forth between there and Tehran.
My data consists of recordings of biographical tales and accounts (about one hour) told by educated Tehrani speakers from Tajrish aged between forty and sixty-five years.
Regarding the Bamberg-Hamedan data, the material consists of recordings, made from 2017 onwards, of male and female Hamedani speakers aged between thirty and seventy years with different backgrounds.Footnote 130
For colloquial Tehrani Persian, I complement the quantitative data with qualitative material which illustrates the various functions with authentic examples and appropriate references to context. I also refer to the results of a questionnaire-based survey with Persian speakers based on the English version of the questionnaire used for Kurdish, Balochi, Shirazi and Lori to capture authentic colloquial speech.Footnote 131 I have modified the questionnaire slightly by reducing the number of plural NPs due to the incompatibility of the K-suffix with plural nouns.
In the previous section, I gave a detailed discussion of the K-suffix -e in CWP. Now I will discuss the status of the K-suffixes -e/he/ye in CSP. The K-suffixes -e/he have been attested in different varieties of Persian, for instance, Taghi ābād, Esfahani, Hamedani, Yazdi,Footnote 132 Najaf ābādi, Qomi, Mashhadi,Footnote 133 Birjandi, Qayeni and Neshaburi.Footnote 134 Notably, the K-suffix -e/he has not been attested in Sistani Persian, which is the variety spoken in Sistan and Balochistan province.Footnote 135
Based on the data available in Kalbasi,Footnote 136 TaghiFootnote 137 and the online Bamberg-Hamedan corpora,Footnote 138 and my data, the status of the K-suffix -e/he is almost the same across Persian varieties: it is not obligatory but is systematically used in definite contexts. For instance, Hamedani Persian speech is similar to Tehrani Persian; the K-suffix is very sensitive to genre and setting, which means that it is not attested with scientific topics that need a formal setting. The frequency and usage of the K-suffix in anaphoric contexts (particularly its combination with demonstrative pronouns) diverge in these varieties. Therefore, another study is needed of these varieties using a larger corpus.
In the present study, I will concentrate on the status of the K-suffix -e/he in the Tehrani variety of Persian, for which I already have a large corpus at my disposal. Data for this section was taken from a large contemporary spoken online corpus, Persian Language Database (PLD), published texts of Tehrani Persian in Taghi's corpusFootnote 139 and my recordings of Persian speakers from Tajrish.
Before discussing the nature of the K-suffix, I will give an overview of the system of discourse-new nouns in this phase of Persian.
The system of discourse-new nouns, specific nouns for the singular, and plural nouns is the same as in CWP: the word ye/yek “one” precedes the noun, which may combine with a suffix =ī/eFootnote 140 on the noun to give an indefinite, singular, specific meaning, as in ye olāġ=ī “a donkey” and ye šīr “a lion.”Footnote 141
Similar to CNP and CWP, the most common strategy in CSP for marking a referent with a definite status is to use bare nouns or a combination of nouns plus demonstratives. However, in some genres, typically in folktales and biographical tales, a new strategy has emerged that marks the definite nouns with the K-suffix -e/he systematically, but not obligatorily, in anaphoric contexts. In the next section, I will illustrate this usage of the K-suffix.
5.2 K-suffixes as Definiteness Markers
The common form of the K-suffix in Contemporary Spoken Persian is -e, or -he when the word ends with a vowel, for instance kūze/kūze-he “the jug,” bābā/bābā-he “the father.” These suffixes have generally not been attested in standard Persian.Footnote 143 In contrast to CWP, in CSP K-suffixes are not attested with evaluative or diminutive semantics or in indefinite contexts (see Section 4). In the following subsection I will discuss the K-suffix in CSP.
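The choice between -e and -he described above can be sketched as a small rule. This is only an illustration under stated assumptions: the function name `add_k_suffix` and the vowel inventory `VOWELS` are hypothetical and reflect the transliteration used in this paper, and the variant -ye (as in koulī-ye) is deliberately not modelled.

```python
import unicodedata

# Assumption: vowel letters of the transliteration used in this paper.
VOWELS = set("aeiouāīū")


def add_k_suffix(noun: str) -> str:
    """Attach -he after a vowel-final noun and -e otherwise (illustrative rule only)."""
    noun = unicodedata.normalize("NFC", noun)  # keep macron vowels as single characters
    return f"{noun}-he" if noun[-1] in VOWELS else f"{noun}-e"


for noun in ("kūze", "bābā", "doxtar"):
    print(noun, "->", add_k_suffix(noun))
```

The rule says nothing about whether a speaker will actually use the suffix; it only picks the form once the suffix is used.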
Anaphoric Definiteness
In CSP, singular nouns that are anaphorically definite take a K-suffix, when the relevant structural conditions obtain. The following examples (56 and 57) illustrate K-suffixes in anaphoric definite contexts, with both human and non-human nouns.
As in Shirazi Persian, the K-suffix in CSP does not appear in combination with a demonstrative pronoun in anaphoric contexts, as in the following example:
However, in Taghi's data, there are a few anaphoric contexts with a combination of a demonstrative pronoun plus a K-suffix, as in example (61). I have found a combination of the K-suffix with demonstrative pronouns in anaphoric contexts outside of the storyline when the storyteller explains the situation to the audience.Footnote 149
A combination of the K-suffix with a demonstrative pronoun is common in other Persian varieties such as Hamedani in example (62), and in the Qomi variety of Persian.Footnote 151
The appearance of double marking of definite forms is unexpected in the traditional scenario of developing definiteness marking from a demonstrative, and these instances certainly call for further investigation. However, the construction is not unexpected on the analysis suggested here, where we assume that the definiteness marking evolved from evaluative marking via the marking of proximity and shared knowledge/familiarity, which is supported by our results here (see Section 3 on CNP) and also has occurred in Balochi and Old Shirazi. If this really is the first developmental stage, then it is not surprising that it is still available here in the speech of older speakers. For Old Shirazi, we have evidence that the K-suffix always occurs with a demonstrative in earlier stages of the language. At its current stage we observe a complete absence of the demonstratives in anaphoric contexts and a tendency not to use them in situational contexts.Footnote 153
These observations support my hypothesis that in earlier stages of the grammaticalization of the K-suffix towards definiteness, it occurred with the demonstratives and used them as supporting items/hooks before becoming a pure definiteness marker. In this respect, CSP is at an earlier stage of grammaticalization of the K-suffixes, and traces of this earlier stage can still be found in the speech of older speakers.
Bridging and the K-suffix
Under the heading of bridging definiteness, we include referents that are identifiable based on their unambiguous link to another previously mentioned referent. Generally, bridging contexts appear either with a bare NP or possessed nouns such as dar “the door” and modīr-e madrase šūn “the principal of their school,” as in examples (63) and (64).
There are some cases with K-suffixes, such as doktor-e “the doctor” in example (65). The doctor had not been mentioned previously in the story, but it is common knowledge that a hospital has a doctor/several doctors.
Similarly, the singular NP dūkūndār-e “the shopkeeper” marked with the K-suffix is identifiable based on its clear connection with the shop, as it is common knowledge that every shop has a shopkeeper.
Situational Contexts
Based on the data, in situational definiteness contexts, CSP uses two strategies: a combination of demonstrative plus K-suffix or just K-suffix. This is contrary to Koroshi Balochi, which always requires a combination of demonstrative plus a K-suffix.Footnote 158 The following passage displays a situational definiteness context in which the demonstrative combines with a K-suffix with īn māšīn-e “this car.” The car has not been mentioned previously in the story. The driver points to the car and explains to the mechanic that this car transports passengers from Kerman to Tehran.
Example (68) displays a situational definiteness context in which the speaker does not combine a demonstrative with the K-suffix. The basket was previously introduced in line 3 of the story. In the example below (line 4 of the narrative) the speaker points to the basket and says, “give me this basket.”
Like example (68), example (69) displays a situational definiteness context, in which the demonstrative combines with a K-suffix in doxtar-e “the girl.”
5.3 Structural Constraints on K-suffix with Anaphoric Definiteness in CSP
As previously mentioned, anaphorically definite nouns are marked with a K-suffix in CSP. However, the presence of the K-suffix is systematically inhibited under certain conditions. In the following subsections I will describe the main systematic structural constraints on use of the K-suffix with anaphoric definiteness.
Plural
Nouns marked with a plural marker never take a K-suffix regardless of their definiteness status, as in the following examples.
Possessed Nouns
In addition to the independent pronouns, there are person-marking clitics (PC), which are used with all functions of the oblique case, direct and indirect objects, and as possessive pronouns. The K-suffix is systematically absent from possessed nouns formed with a clitic possessive pronoun, e.g., “his cow,” “your son,” and from those formed with independent pronouns, e.g., baxt-e doxtar-e mā “the fate of our daughter.” However, it appears with other possessed constructions formed with ezafe constructions, e.g., xūneye pedar-e “the father's house.” This system is similar to Shirazi PersianFootnote 163 and is contrary to Koroshi, where the K-suffix is absent from all types of possessive constructions.Footnote 164
Proper Nouns and Titles
Generally, the K-suffix is absent from titles and proper nouns, as in examples (73) and (74).Footnote 167 It is notable that, as in Central KurdishFootnote 168 and Koroshi,Footnote 169 king and mullah are considered proper nouns in Persian.Footnote 170 In Shirazi data, mullah is not considered a proper noun and is marked with a K-suffix -ū, e.g., āxūnd-ū “the mullah,” unlike pādšāh/pādošāh “king.”Footnote 171
Note that in fairy tales the K-suffix is attested with a title in āġā dīv-e “Mr. Demon.”Footnote 174
However, both the titles Mrs./Madam and Mr./Sir are marked with the K-suffix when they are used alone, as in example (75).
Some Nouns
The data show that the K-suffix is always absent with some nouns, especially those expressing conventionalized locations, such as xūne “home,” madrase “school,” šahr “city,” maktab “school,” češmeh “spring,” and hamūm “bathroom,” as in the following example.Footnote 176
Unique Referents
The data demonstrate that the K-suffix is systematically absent with unique referents: zamīn “ground,” āsemūn “sky,” xoršīd “sun,” setāre “star.”
Some Prepositions
The data demonstrate that the K-suffix is absent in some combinations with prepositions in the corpus data: sorāġ “after,” az “from,” be “to,” az bālā “above,” as in examples (78)–(81). Note that there is great variation among the speakers.
Particle ham/am
The data show a significant variation across the speakers regarding the absence of the K-suffix before the particle ham/am. The same speaker systematically does not apply the K-suffix before this particle, as in the following examples.
In the same text, example (84), the speaker uses the K-suffix before ham, as in doxtar koulī-ye ham, and does not apply it to the following clause doxtar koulī ham. Such examples certainly need more research.Footnote 185
5.4 Unexpected Absence
I have already discussed the attested constraints on the K-suffix in anaphoric contexts. There nevertheless remains a residue of nouns in definiteness contexts that lack the K-suffix; hence the term “unexpected absence” of the K-suffix.Footnote 187 The number of such unmarked definite NPs varies considerably across different speakers in our corpus (see below), indicating considerable inter-speaker variation.
In the following passage, the lion, as the main character in the tale, appears without marking with the K-suffix in the definite contexts. In both examples, the lion and the girl are the main characters in the story, and after several mentions with a K-suffix, they appear without a K-suffix. See also the NP gorbe, “cat” in example (56), which lacks a K-suffix despite the cat being one of the important characters in this tale.
Similar to examples (84)–(87), in example (88) the old lady is one of the main characters in the story. After several mentions with a K-suffix, she appears without a K-suffix.
Summary
The K-suffixes in CSP are associated with definiteness contexts, usually anaphoric, and very rarely appear in bridging contexts. They are systematically excluded from indefiniteness contexts and are not associated with obvious evaluative or diminutive semantics. In this sense, we speak of a definiteness function of the K-suffix in CSP, and in this sense CSP is distinct from CWP. However, in CSP definiteness is a necessary but not sufficient condition for the K-suffix. There are still many notionally definite NPs in our corpus that do not take a K-suffix. First of all, we noted certain structural conditions that inhibit the presence of a K-suffix:
(a) with plural marking of the noun,
(b) in combination with clitic pronouns and the copula,
(c) when the noun can be construed as a title or proper noun,
(d) after some prepositions,
(e) before the particle ham/am,
(f) with some nouns,
(g) with demonstrative pronouns.
The extent of the residue of definite but unmarked items varies from speaker to speaker and according to genre and speech situation. In the next section, we explore the quantitative data from our corpus to shed light on the nature of the changes that have occurred in Persian.
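The interaction between definiteness and the structural constraints described in Section 5.3 can be pictured as a simple filter. The sketch below is an assumption-laden simplification, not the coding scheme used for the corpus: the record `NounPhrase`, its field names, and the function `k_suffix_possible` are hypothetical, and only the constraints described as systematic are included, while the variable ones (prepositions, ham/am, demonstratives) are left out.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    """Hypothetical annotation record; field names are illustrative only."""
    definite: bool
    plural: bool = False
    clitic_possessor: bool = False       # possessed via a person-marking clitic
    title_or_proper: bool = False        # simplification: exceptions are ignored
    unique_referent: bool = False        # e.g. "sun", "sky"
    conventional_location: bool = False  # e.g. "home", "school"


def k_suffix_possible(noun: NounPhrase) -> bool:
    """True if the noun is definite and no listed constraint blocks the K-suffix.

    Definiteness is necessary but not sufficient: True means the suffix is
    licensed, not that a given speaker will actually use it.
    """
    if not noun.definite:
        return False
    blockers = (
        noun.plural,
        noun.clitic_possessor,
        noun.title_or_proper,
        noun.unique_referent,
        noun.conventional_location,
    )
    return not any(blockers)


print(k_suffix_possible(NounPhrase(definite=True)))               # licensed
print(k_suffix_possible(NounPhrase(definite=True, plural=True)))  # blocked by plural
```

A filter of this kind captures only the inhibiting conditions; the residue of “unexpected absence” lies precisely in the cases where it returns True but no suffix appears.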
6. The Emergence of Definiteness: Evidence from the Corpus and the Questionnaire
While the grammaticalization of definite markers has been a central issue in grammaticalization theory, researchers usually cite cases (the languages of Western Europe) in which the source of the definite article is some form of deictic element (a “D-element” according to HimmelmannFootnote 192), and this has become the primary paradigm for understanding the diachronic development of definiteness marking cross-linguistically. However, in our ongoing survey of Western New Iranian languages, and Persian in particular, the definiteness suffix has an entirely different source construction, as it comes from an evaluative suffix. Thanks to the existence of data from earlier phases of Persian, we can formulate some initial hypotheses regarding the developmental sequence that led to the current situation. We can see here that the definiteness marker in Persian does not originate from a demonstrative source. And in particular, its combination with the demonstrative pronoun rules out a demonstrative origin.
An overview of the corpora for CNP, CWP and CSP is provided in Table 3.
A second source of data is a questionnaire conducted between 2018 and 2021 with fourteen Tehrani speakers, which is discussed below. But first I consider two metrics from the narrative corpus: overall frequency of the K-suffix and distribution of the K-suffix across the corpora for these three phases.
6.1 Overall Frequency of K-suffixes
Overall frequency is counted as the number of occurrences of K-suffixes across all texts in the corpus per orthographic word,Footnote 193 normalized to a value of frequency per 1,000 words, to enable comparison across texts of different lengths. Consideration must be given to the fact that a value of zero is not particularly significant in a small text, while zero occurrences in a larger text is much more significant. Nine texts have fewer than 700 words overall, and in many of them, the number of K-suffixes is high; I left them out of this calculation. The results for the three phases are shown in Fig. 2. The vertical axis represents mean values and the bars give the data for each corpus.
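The normalization just described can be sketched as follows. This is a minimal illustration, assuming that each text is reduced to a pair of counts; the names `per_thousand`, `corpus_mean`, and `MIN_WORDS` are hypothetical, and the toy figures are invented and are not the paper's data.

```python
from statistics import mean

MIN_WORDS = 700  # texts with fewer words are left out, as described above


def per_thousand(k_count: int, word_count: int) -> float:
    """K-suffix occurrences normalized to a rate per 1,000 orthographic words."""
    return 1000 * k_count / word_count


def corpus_mean(texts: list[tuple[int, int]]) -> float:
    """Mean per-1,000-word rate over texts with at least MIN_WORDS words.

    Each text is a (k_count, word_count) pair; raises StatisticsError if
    no text is long enough to be included.
    """
    rates = [per_thousand(k, n) for k, n in texts if n >= MIN_WORDS]
    return mean(rates)


# Hypothetical counts, invented for illustration only.
toy_corpus = [(12, 4000), (0, 2500), (3, 500), (5, 1000)]
print(corpus_mean(toy_corpus))  # the 500-word text is excluded
```

Averaging per-text rates, rather than pooling all words, gives each text equal weight, which is why a few high-frequency texts can pull the corpus mean upwards.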
There are some points of interest here. First, the hypothesis that overall frequency would increase with a shift towards a definiteness function is confirmed. In CSP, the mean value of K-suffixes per 1,000 words is 3.2, sixteen times higher than in CWP (0.2), and just over three times more than in CNP (1.0). However, it is also clear that the higher frequency of K-suffixes in CSP is largely the result of three data outliers, with 10.0, 8.0, and 7.0 K-suffixes per 1,000 words, respectively, more than twice the figure for any other texts having a K-suffix, while eight texts still have no items marked with K-suffixes.Footnote 194
Thus, CSP is not characterized by the consistently high level of K-suffixes that one would expect if the forms were uniformly grammaticalized as definiteness markers in this language. Overall frequency is, at best, a very crude measure of grammaticalization, however.Footnote 195 Note that this is the opposite of our Shirazi results, in which the K-suffix can be found across all the texts.
Recall that the qualitative investigation of these three phases demonstrates that in CNP and CWP, K-suffixes are used with evaluative meaning in most instances of use. Given that K-suffixes in these phases are not associated with a predictable and commonly recurring function, we would not expect a uniform frequency of use. Indeed, frequency of evaluative usage may simply be a matter of genre.
In CSP, on the other hand, K-suffixes are not associated with evaluative and diminutive semantics, but are associated with definiteness. However, the association is not fully regular because, as previously mentioned, structural conditions inhibit the K-suffix. Some definite nouns also lack the expected K-suffix for reasons that are not fully understood. The use of the K-suffix is highly sensitive to speaker, setting, and genre.
The second remark concerns the decrease and increase in frequency exhibited by the K-suffix in Persian. On the one hand, we can see a significant drop in the frequency of the K-suffixes in CWP. This decrease may be due to the fact that their syntactic domain is becoming increasingly restricted, which means they can only appear with a handful of singular nouns in informal and colloquial settings. Their semantic domain (polyfunctional evaluative notions) is becoming bleached, and the suffix is moving towards definiteness.
Recall that we can find no restrictions on the K-suffix in CNP. It can be found in all parts of speech, apart from verbs and pronouns, throughout the texts. I have noticed the same result in our ongoing survey in Shirazi and Balochi.Footnote 196 It needs to be checked in Kurdish and Lori as well, which are currently being analyzed.
The third point of interest concerns the massive inter-writer/speaker and inter-genre differences found in CWP and CSP, but not in CNP. We observe that the K-suffixes are attested in all the CNP texts studied. What is significant in CNP is the region from which the author of a work comes. We find that works written in the east of Iran have a higher frequency of K-suffixes than ones written in the north. Indications that the K-suffix is developing towards a definiteness marker (see examples 32–33) are also attested in two works titled Tārikh-e Beyhaqi and Nowruznāme, the authors of which come from Khorasan. This might be connected to Lazar's observation that New Persian originated from Khorasan in eastern Iran.Footnote 197 The variety of Persian spoken in Khorasan was influenced by Semitic language earlier than Persian varieties spoken in the north of Iran.
The data from CSP demonstrates that only specific kinds of texts contain K-suffix marking. The texts with a high frequency of K-suffixes in the CSP corpus comprise three traditional folktales and two biographical tales. We cannot find the K-suffix with topics such as education, science, human rights, or the coronavirus, which require formal style. This suggests that genre is the decisive factor in CSP. Development of the definiteness marking within a specific genre has been reported for the Finnish language.Footnote 198
In the data from CWP, we also find three outliers. The three highest values (10, 0.8 and 0.7) come from a book titled Tamsilāt and two other books titled Hājī āqā and ʿAlaviye khānom. Tamsilāt is a colloquial translation into Persian from Azerbaijani Turkish. The highest values of the K-suffix are connected to the same noun, mard-ak-e “man,” with evaluative meaning. It is worth noting the attested items marked with a K-suffix are zan-ak-e “woman,” pesar-e “boy,” and doxtar-e “girl,” as well as one instance each of sawār-e “rider” and šohar-e “husband.”
The same writer, Hedāyat, wrote Hājī āqā and ʿAlaviye khānom. These are short, colloquial Persian stories. Recall that the highest values of the K-suffix belong to the same nouns, mart-ī-ke “man” in Hājī āqā and mart-īk-e and zan-īk-e “woman” in ʿAlaviye khānom, with evaluative meaning.
Surprisingly, K-suffixes have not been used consistently even by the same writer. For instance, some of the books written by Sādeq Hedāyat do not contain a single item marked with a K-suffix (such as Buf-e kur, Sag-e velgard, Parvin dokhtar-e sāsān). Another example is Hejazi's book Nasim, in which he does not mark any items with a K-suffix, even though he uses the K-suffix in another book called Zībā. The results demonstrate that as soon as a text switches to formal style, the author does not use the K-suffix.
Overall, a handful of items are marked with the K-suffix, e.g., boy, girl, man, woman, and very seldom other items, e.g., cloud, demon, hyena. The high frequency of K-suffixes in these texts is associated with evaluative and diminutive functions, as the most frequent usages. Thus, the outliers in CWP have a different underlying cause than those of CSP, where the high frequency of K-suffixes is associated with definiteness marking.
In contrast to Shirazi, the overall picture suggests a small number of speakers who use an overall higher frequency of K-suffixes in a specific genre and presumably act as innovators in the development towards definiteness usage in CSP.
Summary of the Narrative Corpus
The corpus data, combined with the qualitative analysis of the K-suffixes in these three phases of Persian, demonstrate that in CNP and CWP, the K-suffixes are largely restricted to evaluative contexts in their highest rates of usage. In contrast to the K-suffix in CNP, in CWP the overall frequencies vary considerably according to genre and content. The suffix is limited to a small number of nouns within certain structural constraints. In CNP, however, we already find signs of K-suffixes combining with nouns in recognitional and deictic contexts without any obvious evaluative or diminutive connotations (see examples 32–33). CWP and CSP also share this type of usage. I consider this to be the first stage in co-opting evaluative morphology to serve as a definiteness marker in Persian. I am already observing the same result in our ongoing survey of Shirazi and Balochi. I also found a few examples in CWP where the items marked with a K-suffix have a referent in the discourse without any obvious evaluative connection and are not dependent on immediate interaction. CSP also shares this type of usage, and systematically uses it in anaphoric contexts. I would suggest that this is the second stage of development of definiteness from an evaluative origin.
CSP differs from CNP and CWP in its almost complete lack of evaluative functions. Also, it expands some of its structural constraints regarding the use of K-suffixes (see Section 5.3) and spreads the suffixes to more items in definiteness contexts. But in CSP, especially in folktales and biographical tales, we find that the K-suffix is systematically used in anaphoric definite contexts, and not in texts discussing topics such as education, science, human rights and women's rights, which are associated with formal settings. This result is not surprising, and this is what can be expected of evaluative as opposed to descriptive or inflectional morphology. The use of evaluative morphology is situationally sensitive and can therefore be expected to adapt flexibly to content, formality, speaker style, and so on.Footnote 199
Overall, the data does not show a simple picture of a spreading out from an assumed anaphoric usage, commonly taken as prototypical for definiteness marking as suggested in grammaticalization theory for Persian.Footnote 200 In the following section I will examine the results of the questionnaire data.
6.2 Presentation of Questionnaire Data
In addition to the corpus data, I tested data from a questionnaire answered by fourteen speakers. The questionnaire used a set of 102 items built into six “mini-narratives” each representing short episodes of approximately ten sentences. In order to capture authentic colloquial speech, we circulated the English form of the questionnaire among participants and asked them to translate it orally into colloquial Persian. Their narratives were recorded with a mobile phone, and the relevant NPs were coded for presence vs. absence of K-suffixes and a number of other features. The results here are from the initial pilot in colloquial Persian based on fourteen speakers (nine female and five male), all of whom come from Tehran.
Fig. 3 presents the percentage of nouns carrying a K-suffix in the respective contexts: first mention (indefinite), bridging, anaphoric, demonstratives, possessed, personal names, unique references, and non-referential/generic (as in negated existential, such as “in those days there were no cars”). When considering the questionnaire data, we find more than half of the nouns in anaphoric contexts do not take K-suffixes. Other nouns in these contexts are bare nouns or are in the plural, and such cases are not counted here.
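The percentages reported per context can be derived from the coded NPs with a simple tabulation. The sketch below assumes that each coded NP is reduced to a context label and a presence/absence flag; the function name `k_suffix_share` and the toy records are hypothetical and do not reproduce the questionnaire results.

```python
from collections import defaultdict


def k_suffix_share(tokens):
    """Percentage of coded NPs carrying a K-suffix, per context.

    `tokens` is an iterable of (context, has_k_suffix) pairs.
    """
    totals = defaultdict(int)
    marked = defaultdict(int)
    for context, has_k in tokens:
        totals[context] += 1
        marked[context] += int(has_k)
    return {context: 100 * marked[context] / totals[context] for context in totals}


# Hypothetical coded NPs, invented for illustration only.
toy_tokens = [
    ("anaphoric", True),
    ("anaphoric", False),
    ("anaphoric", False),
    ("anaphoric", True),
    ("bridging", False),
    ("first mention", False),
]
print(k_suffix_share(toy_tokens))
```

Running the same tabulation separately for each speaker would expose the inter-speaker differences discussed next.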
As presented in Figs. 3 and 4, overall and across all speakers, we find massive inter-speaker differences in the marking of anaphoric definiteness. Only three speakers use the K-suffix in bridging contexts. The most common forms in bridging contexts are bare nouns or possessed nouns, as we observe in the corpus data.
Moreover, we find consistent observance of the structural constraint against use of K-suffixes with plural markers, possessed nouns formed with person-marking clitics, and generic nouns, along with a complete absence of K-suffixes in the indefinite. Furthermore, we find a consistent lack of K-suffixes with personal names. On the whole, this is the system that was found with the corpus data, as discussed previously. In the following section I will comment on the origin of the various K-suffixes in light of the present data.
7. Origin of the K-suffixes in Persian
7.1 K-suffix -ak
In general, the K-suffixes developing towards a definiteness marker in our New Western Iranian languages survey appear to be derived from *-ka-, presumably with diminutive (and perhaps pejorative) formations. The K-suffix -ak in CNP might derive from Middle Persian -g, as in pusar-ag > pesar-ak “boy” and duxtag > doxtar-ak “girl.”
The K-suffix -ak is attested in Persian varieties such as Shirazi Persian as an evaluative suffix, alongside the K-suffix -ū used as a definiteness marker.Footnote 201
7.2 K-suffix -e/heFootnote 202
The etymological origin of the K-suffix -e/he is not yet clear to me, and I leave it as an open question. However, I can offer the following two hypotheses:
(1) The K-suffix -e/he might be a short form of the -ak suffix in CNP. The k sound has been dropped, and the a sound has changed to e. This type of sound shift is widespread among Iranian languages, as in dastag > daste “handle.” In addition, a natural development from Middle Persian to New Persian is the change of Middle Persian -ag to -e, as is apparent in setārag > setāre, and likewise in the participle ending -ag > -e, as in kardag > kard-e.
Across the CWP corpus, however, I found many nouns with a combination of both -ak and -e suffixes, for instance, zan-ak-e “woman” and mard-ak-e “man,” as well as the following interesting variation of this combination with the same noun, “demon.” In its first mention in the story, it appears as yek dīb-ak=e sīyā, “a black demon,” and then subsequently as dīb-e “demon,” dīb-ak-e “demon,” and dīb-ak “demon.”Footnote 203 If we assume that the K-suffix -e is a short form of -ak, we should not find both suffixes combined on the same noun. The co-existence of -ak and -e on a single noun is therefore awkward for this scenario.
(2) The K-suffix -e might have originated from another source instead of being directly connected to the -ak suffix in CNP. However, both suffixes (-ak and -e/he) are originally related to the same semantic notions, that is, evaluative ones (K-suffixes).
An ongoing study by Hashabeiky on Persian (from the sixteenth to eighteenth centuries) shows that only one form of the K-suffix, -ak with an evaluative sense, has been written in an informal style, in two of her manuscripts.Footnote 205 However, Nadimi Harandi and Atayi Kachooyi provide evidence of the K-suffix -e in poetry much earlier (the poet ʿAtar-e Neshaburi, thirteenth century).Footnote 206 This finding suggests that the K-suffix -e was used by Persian speakers (in informal settings) but was not recorded in earlier texts.
As with the K-suffix -ū in Shirazi Persian, the available data on the K-suffix -e in Persian show that this suffix mostly appears with singular nouns and in informal registers. We do not have evidence of its final phonological form. For Shirazi -ū, we can trace this suffix back to -ūk, used as an evaluative suffix in other Iranian languages such as Bami, Kermani, and Sangsari,Footnote 207 while the etymological origin of the Persian -e suffix remains a puzzle for the time being.
In this regard, similar to my observation in ShiraziFootnote 208 of two K-suffixes -ak and -ū originally used as evaluative suffixes, I would suggest that there have been different K-suffixes in Persian with an evaluative meaning (-ak, -īk, -ūk/*ek). Whether or not they are related to the same origin is irrelevant here; what matters is that they show similar (evaluative) semantics. These various forms are most probably a matter of Persian dialectal variation, for which we do not have recorded material of the earlier stages. The K-suffix -e has been grammaticalized as a definiteness marker, and the -ak suffix continued to carry evaluative semantics regardless of genre in written, spoken, formal and informal language settings. However, its evaluative senses, such as endearment when used with proper nouns, have to a large extent been bleachedFootnote 209 and its pejorative meanings have become colorless.
Note that -ī, the short form of the K-suffix -īk, can still be found in Persian speech, as in māmī (my lovely mother) and xāharī (my lovely sister), although it is not frequent. This suffix is very productive as a marker of endearment in other Iranian languages, including Balochi Sistani.Footnote 210 In Sistani Persian, moreover, the K-suffixes -ak/ok are still very productive on proper nouns and convey endearment and pejorative meanings.
8. Considerations of Sources and Paths of Development
The CNP, CWP and CSP corpora studied here exhibit three different types of development of the K-suffix (the reflexes of cognate and originally evaluative morphemes), which can be interpreted as comprising a scale. CNP, the most conservative stage in the present study, lies at one end of the scale. Here the K-suffix functions as a polyfunctional evaluative morpheme covering a typical array of functions generally associated with diminutives cross-linguistically.Footnote 211 These functions are not constrained by definiteness and are not subject to structural constraints. However, already at this stage we find some passages with singular nouns in deictic and recognitional contexts.
Located in the middle, CWP shows a pre-grammaticalization stage of definiteness marking. The original evaluative meaning of the K-suffix remains its most frequent use, but the suffix is subject to structural constraints (i.e., it occurs mostly with singular nouns). It shares deictic and recognitional usages of the K-suffix with CNP. As a definiteness marker the suffix is still very immature, and is only sporadically and unsystematically used, even by the same writer, with a handful of nouns.
CSP is found at the other end of the scale. The evaluative usages are not attested, and the suffix is not compatible with indefinite contexts. It shares the constraint regarding singular nouns with CWP, but increases in frequency and becomes more closely associated with definiteness contexts. The system is not spread uniformly across speakers and genres. In the narrative texts investigated, we found a few speakers of CSP who had taken this usage (marking of the NPs with a K-suffix) a step further and now used the K-suffix systematically as a distinct marker of anaphoric definiteness, especially in folktales, biographical genres and informal settings.
This comparison between the different stages sheds light on a developmental path from evaluative morpheme to definiteness marker in Persian, as summed up in Table 4. The grammaticalization path is similar to what I have already suggested for other New Western Iranian languages, including Balochi and Shirazi.Footnote 212
These findings suggest that the development of definiteness marking can proceed down a new pathway that is entirely distinct from the demonstrative-based one generally presented from a typological perspective. Despite the different pathways, however, the endpoints may be fairly similar. Here the starting point is an evaluative marker. In the first stage of the development, evaluative usage is compatible with deictic and recognitional usage, which often occurs with demonstrative pronouns. The latter are anchored to a concrete and interactive speech context involving some form of “attention direction” on the part of the speaker. In the second stage, evaluative usages may bleach or disappear entirely. In contrast, the deictic and recognitional usages are extended to include anaphoric tracking, which is more independent of setting and not necessarily dependent on immediate interactions. In the final stages, the K-suffix is systematically associated with anaphoric definiteness contexts, although the system continues to co-exist with inherited unmarked definite strategies (bare noun and demonstrative plus noun). Thus, the basic system of definiteness marking with a K-suffix is similar to the more familiar article-based system, of which anaphoric definiteness is generally the core function.
Several differences can still be discerned, in particular the constraint that prevents definiteness marking in combination with plural marking and with possessed nouns formed with a person-marking clitic. In a recent cross-linguistic study on definiteness,Footnote 213 Becker found no typological evidence for the incompatibility of definiteness markers with plural number (although there is clear evidence for incompatibility between indefiniteness markers and plural number). Thus, the Persian constraints (along with those of Shirazi and Balochi) remain somewhat of a puzzle, compared to definiteness markers in Lori Bakhtiyari and Central Kurdish based on the same K-suffix, for which no such constraints exist. I leave this as an open question, but assume that the constraint might be due to the following facts: (a) these two suffixes (the plural marker and the K-suffix -e/he) are incompatible morphologically (since both the plural marker -hā and the short form -e are new in the language); (b) they are incompatible semantically, because the plural marker -hā already has a definiteness function and does not need to be marked again with another element (-e/he);Footnote 214 and (c) the starting point of an evaluative marker in deictic and recognitional contexts in CNP is singular nouns, which suggests a possible scenario, similar to that of the intrusion of the object marker (-rā) into the nominal system with singular nouns, for example in Balochi,Footnote 215 in which the K-suffix is initially attracted more to singular nouns than to plural nouns. I have also noticed a tendency to use the K-suffix with the plural marker in Lori spoken in Fars. This is a topic for future study.
Finally, concerning the development of the definiteness marker in Persian, I would suggest that internal developments, for example the reduction of the case system in Persian, may have favored the emergence of an additional nominal category such as definiteness. So far, the languages/dialects in our survey with a reduced case system exhibit the development of a definiteness marker, for example, Shirazi, Koroshi, Lori, and Central Kurdish. On the other hand, one should not overlook language contact (possible earlier Persian contacts with the Semitic languages); see also Haig and Khan.Footnote 216 The ongoing project suggests that several New Western Iranian languages have developed some nascent form of definiteness marking based on evaluative morphology.
Due to the extensive documented material from its earlier phases, the Persian case presented here will provide a benchmark for future studies of Iranian languages, and will broaden the database for our understanding of the development of definiteness cross-linguistically.
Acknowledgments
The author would like to express her gratitude to Geoffrey Haig for his valuable input into this research, designing the questionnaire and sharing ideas on the grammaticalization path, to Bo Utas and Judith Josephson for their comments on earlier drafts, and to the anonymous reviewers of Iranian Studies who provided careful comments during different stages of the review process.
The author is also grateful to her colleagues Carina Jahani, Agnes Korn, Thomas Jügel, Forogh Hashabeiky, Guiti Shokri, Mohammad Mahmudi, Ali Hassuri, Iran Kalbasi, and Ali Ashraf Sadeghi for their discussions of different aspects of this paper. Thanks also to Hannah Sarrazin and Alexander Brontz, who took care of the questionnaire data and provided diagrams.
The author is grateful to the Swedish Research Council (Vetenskapsrådet) for funding the research (grant number: 2018-00318).
Thanks to Christian Rammer, Frankfurt, for providing the map. The author would also like to thank Mostafa Assi, and his project assistant Saeideh Ghandi, for giving her permission to use this data. She also thanks her Tehrani speakers, in particular Fereshteh and Farzaneh Vezvai, and last but not least, she would like to thank all her anonymous Tehrani speakers for the time they took to record the questionnaire data and share their beautiful narratives. She would like to acknowledge Mohammad Rasekhmahand and Geoffrey Haig for sharing the Hamedani Persian spoken corpora with her. She would like to thank Elham Izadifar for helping her with new recordings and testing the questionnaire data on Hamedani speakers. Thanks also to Shokoufeh Taghi for double-checking her published data with their sound files. Any remaining errors are, of course, the author's own responsibility.
Abbreviations
- 1: first person
- 2: second person
- 3: third person
- []: additional information to the text
- (): additional information to the gloss
- …: incomplete sentence
- -: affix boundary
- =: clitic boundary
- ADD: additive particle
- CLM: clause linkage marker
- CNP: Classical New Persian
- COMP: comparative
- COP: copula (present indicative)
- CSP: Contemporary Spoken Persian
- CWP: Contemporary Written Persian
- DEF: definite
- DIST: distal
- EMPH: emphasis
- EV: evaluative
- EZ: ezafe particle
- IMP: imperfective
- IMPV: imperative
- IND: individuation clitic
- INF: infinitive
- NEG: negation
- NPST: non-past stem
- OBJ: object case
- PC: person-marking enclitic (person clitic)
- PL: plural
- PN: personal pronoun
- PP: past participle
- PREV: preverb
- PROX: proximal deixis
- PST: past stem
- REFL: reflexive pronoun
- SG: singular
- VOC: vocative case