Introduction
The aim of this study was to investigate the factors influencing linguistic performance by examining the pragmatic and syntactic aspects of definiteness marking through four different experimental approaches. Previous research has highlighted differences between online and offline tasks (e.g., Roberts and Liszka, 2013) and between reading and listening tasks when investigating the same linguistic phenomena (e.g., Johnson, 1992; Murphy, 1997). This study comprises four experiments: (1) a reading acceptability judgment task, (2) an auditory acceptability judgment task, (3) an online self-paced reading task, and (4) an online self-paced listening task. Together, they aimed to deepen our understanding of methodological differences and similarities in psycholinguistic research and their impact on linguistic performance.
The following sections will explore factors affecting linguistic performance in psycholinguistic experimentation by addressing two main distinctions: online vs. offline tasks and listening vs. reading tasks. We will then provide an overview of the Hebrew definite system, detailing its pragmatic and syntactic properties used in this study as a case for testing various experimental paradigms and their effects on linguistic performance. Additionally, we will review previous research on definiteness in Hebrew and present the research question and hypotheses of this study.
Linguistic performance and its influencing factors in psycholinguistics
In the generative framework, as introduced by Chomsky (1965), a key distinction is made between linguistic competence and performance. Linguistic competence refers to the inherent knowledge of language that a speaker or listener possesses, while performance relates to the actual use of language in producing and comprehending specific utterances. Inferences about a speaker’s or listener’s competence are often made based on their performance, as the actual use of language provides insights into the underlying knowledge and rules governing linguistic behavior. When examining linguistic performance, an important question arises: what factors influence it?
For monolingual neurotypical adults, linguistic competence is typically intact, yet their performance in psycholinguistic experiments can sometimes appear to indicate otherwise. Some studies have proposed that performance can be influenced by individual traits such as working memory capacity, inhibition capability, and other cognitive abilities (e.g., Bice and Kroll, 2021; Pakulak and Neville, 2011). However, a detailed discussion of these cognitive mechanisms and their interaction with linguistic performance is beyond the scope of this article.
Variation in linguistic performance can also be influenced by the nature of the experimental task itself (e.g., Cuza and Frank, 2015; Keydeniers, Eliazer, and Schaeffer, 2017; Laurinavichyute and von der Malsburg, 2024; Witzel, Witzel, and Forster, 2012). For example, Witzel et al. (2012) explored three types of structural ambiguities using four tasks: eye-tracking, self-paced reading, and two maze tasks (grammatical maze and lexical maze). In a within-subject design among 32 English-speaking monolinguals, the study revealed that different tasks elicited varying linguistic performance. Specifically, the eye-tracking patterns differed between ambiguous and nonambiguous sentences across all ambiguity types, while speakers identified only two of the three ambiguity types in the maze tasks. This discrepancy was attributed to the tasks’ nature and the potential strategies participants might develop during these tasks. For instance, in an eye-tracking task, participants might employ strategies like skimming, while in a self-paced reading task, they may develop a steady button-pressing pattern that adapts to comprehension difficulties. The maze tasks, which demanded minimal syntactic processing yet yielded low accuracy, were argued by the authors to be unsuitable for linguistic phenomena involving phrase closure.
Moreover, Cuza and Frank (2015) investigated double complementizer constructions in Spanish heritage speakers (Spanish-English bilinguals), English speakers learning Spanish as L2, and Spanish monolingual controls. They used three different tasks to assess these groups: an aural sentence completion task, a written acceptability judgment task, and a written preference task. The findings showed variability in the performance of monolingual controls across these tasks. In the sentence completion task, some participants showed inaccuracy due to omission errors. In the acceptability judgment task, they were able to distinguish between grammatical and ungrammatical constructions but did not completely reject the ungrammatical ones. Nonetheless, in the preference task, their preferences were clear-cut towards the grammatical construction. The authors suggested that this variability might be due to “different speakers, different grammars” (Dabrowska, 2012) or the idea that some ungrammatical constructions are “good enough” for communication. However, based on Witzel et al. (2012), it could also be the case that different tasks probe different linguistic performance.
Specifically for definiteness marking, which is investigated in the current study, Keydeniers et al. (2017) compared two tasks for (in)definite article choice in Dutch: the sentence elicitation task from Schaeffer and Matthewson (2005) and the noun phrase (NP) sentence elicitation task from van Hout, Harrigan, and de Villiers (2010). When comparing the performance of 23 monolingual Dutch-speaking adults, participants performed at ceiling in the sentence elicitation task by Schaeffer and Matthewson (2005), but their accuracy dropped to around 70% in the NP sentence elicitation task by van Hout, Harrigan, and de Villiers (2010). The authors concluded that the two tasks tap into different aspects of indefiniteness marking and pointed out methodological concerns in the latter task. For example, eliciting NPs rather than full sentences may have encouraged the production of bare nouns instead of marked ones.
In summary, while monolingual adults’ linguistic competence is clearly intact, their performance can vary depending on the experimental task. As shown, this variation is not confined to a specific task type (online vs. offline) or modality (reading vs. listening). Therefore, the current study seeks to determine whether the linguistic performance of Hebrew speakers is influenced by the task when it comes to definiteness use. The following sections will provide an overview of previous research on psycholinguistic experimentation: task manipulations (online vs. offline) and stimulus representations (reading vs. listening) before addressing (in)definiteness (un)marking in Hebrew.
Discrepancies across tasks (online vs. offline) and modalities (reading vs. listening)
Psycholinguistic studies use online and offline tasks to investigate linguistic intuitions. Online tasks, like self-paced reading or eye-tracking, capture real-time language processing, while offline tasks, such as grammaticality judgments, assess responses after the full stimulus is presented (Marinis, 2010). Offline tasks are easier to use but may be influenced by memory and extralinguistic factors. Online tasks avoid these issues by tracking responses during processing but require more preparation and equipment (Marinis, 2010). Both methods possess unique strengths that complement each other in advancing our understanding of linguistic performance, thereby offering valuable insights into linguistic competence.
Interestingly, research reveals varying degrees of alignment between the two methodologies. Some studies report similar results across online and offline tasks (e.g., Casasanto and Stag, 2008; Hofmeister et al., 2013; Maia, 2008; Weirick, 2021), while others highlight notable discrepancies (e.g., Francis, 2010; Jackson and Roberts, 2010; Roberts and Liszka, 2013).
One explanation for such discrepancies is that online tasks are less sensitive due to the differences in the functioning of the parser and grammar (e.g., Ferreira and Patson, 2007; Karimi and Ferreira, 2016). Online tasks, driven by the parser, prioritize speed and efficiency and thus may facilitate real-time communication at the expense of strict grammatical precision. This approach can lead to “good enough” interpretations, where the parser tolerates inconsistencies to maintain fluency in understanding (e.g., Ferreira and Patson, 2007; Karimi and Ferreira, 2016). In contrast, offline tasks engage grammatical processes that are slower but adhere more rigorously to linguistic rules, revealing violations that might be overlooked in online comprehension.
A key example is grammatical illusions, where syntactic anomalies are initially perceived as acceptable during online processing but are later identified as errors during offline judgments. For instance, Phillips, Wagers, and Lau (2011) demonstrate that the parser can sometimes fail to detect subject–verb agreement violations in the presence of interference from distractor nouns. This “selective interference” shows the parser’s susceptibility to memory-based errors in online processing, making online tasks less sensitive in capturing speakers’ true linguistic competence.
Conversely, some hypotheses suggest that online tasks provide a more accurate representation of speakers’ linguistic competence than offline tasks. The immediacy assumption (Just and Carpenter, 1980) posits that linguistic information is processed as soon as it becomes available. Online tasks, which measure real-time processing, minimize interference from extralinguistic factors such as memory load or retrospective reasoning, offering a more direct assessment of linguistic knowledge. Moreover, the online commitment (Jackson and Roberts, 2009) highlights how online tasks track participants’ immediate grammatical role assignments and linguistic decisions as they encounter input moment-to-moment. This allows online tasks to capture dynamic processing mechanisms that may remain obscured in offline measures. Indeed, several studies report that online tasks are more sensitive than offline tasks to linguistic violations (e.g., Jackson and Roberts, 2009; Roberts and Liszka, 2013). For example, Roberts and Liszka (2013) examined how tense mismatches with temporal adverbials in English were processed by monolingual and bilingual adults through both an online self-paced reading task and an offline acceptability judgment task. The self-paced reading task uncovered subtle reaction time (RT) differences, demonstrating nuanced sensitivities to mismatched grammatical structures. Specifically, past-perfect sentences paired with past adverbials, which align with evolving linguistic norms in British English, showed faster RTs for ungrammatical conditions compared to grammatical ones. These results indicate an implicit tolerance for these mismatches, detectable only in real-time processing. In contrast, the offline acceptability judgment task revealed a broad rejection of mismatched sentences. The authors attribute this outcome to participants’ reliance on extralinguistic considerations or task-specific strategies. We therefore follow this line of reasoning in our analysis, given the data indicating that online tasks provide a more accurate measure of linguistic competence than offline tasks.
Another key consideration in psycholinguistic experimentation is the mode of stimulus presentation, either visual (e.g., reading tasks) or auditory (e.g., listening tasks) (e.g., Jobard et al., 2007; Johnson, 1992; Murphy, 1997). While some studies report no differences in linguistic behavior across modalities (e.g., Kim and Nam, 2017; Silva et al., 2018), others reveal discrepancies between reading and listening tasks targeting the same linguistic phenomena (e.g., Murphy, 1997; Johnson, 1992; Haig, 1991).
One potential explanation for the discrepancies between reading and listening tasks is the separate stream hypothesis (Penney, 1980). This theory posits that auditory and visual information are processed independently along distinct “streams,” with the auditory modality offering an advantage due to its longer persistence, making it more effective for recall in short-term memory tasks. Supporting this hypothesis, some studies have found that participants perform better in listening tasks (e.g., DeDe, 2012, 2013). For example, DeDe (2013) tested the effects of word frequencies in both patients with aphasia and non-brain-damaged controls, using self-paced reading and listening tasks. Data from the control group revealed no significant differences between reading and listening tasks, yet RTs for the reading task were longer than for the listening task.
Conversely, other studies report the opposite trend, showing that listeners perform worse than readers (e.g., Johnson, 1992; Murphy, 1997; Batel, 2020). To explain this finding, Murphy (1997) proposed that the “burden of auditory processing” (p. 55), which includes factors such as the transient nature of auditory stimuli and the challenges of segmenting spoken information, leads listeners to exhibit lower accuracy and longer RTs when processing linguistic information compared to readers. Indeed, research consistently highlights the greater difficulty of listening tasks compared to reading ones (e.g., Johnson, 1992; Murphy, 1997; Batel, 2020). For example, Murphy (1997) tested oral and written grammaticality judgment tasks using English and French monolinguals and L2 learners. Strikingly, both groups of monolingual participants exhibited significantly lower accuracy and longer RTs in the listening task when contrasted with the reading task. Given the consistent evidence supporting the increased challenge of listening tasks, we will follow the second line of reasoning in our analysis, considering listening tasks to be more demanding than reading tasks.
To summarize, different tasks and presentations of the stimuli can play a role in linguistic performance. To explore the relationship between linguistic performance and experimental tasks using (in)definiteness marking in Hebrew, the study will manipulate the task (online vs. offline) and modality (reading vs. listening). In the next subsection, we focus on definiteness in Hebrew as a test case for investigating the effects of psycholinguistic paradigms.
Definiteness in Hebrew - its pragmatic and syntactic properties
Hebrew morpho-syntactically marks definiteness with the article ha-, but does not overtly mark indefiniteness (e.g., Berman, 1982, 1985; Danon, 2001, 2008; Glinert, 2004; Ritter, 1991; Wintner, 2000), as seen in the following examples (1a and 1b):


Definiteness is extensively discussed within the field of linguistics. Scholars posit that definiteness involves pragmatic knowledge such as familiarity, uniqueness, and/or a shared referent by both speaker and addressee alongside other pragmatic concepts within the discourse (e.g., Abbott, 1993, 2004, 2006; Ariel, 1990, 2001; Heim, 1982, 1983; Ludlow and Segal, 2004; von Heusinger, 1997; Szabó, 2000, among others). For example, both Szabó (2000) and Ludlow and Segal (2004) argue that definiteness use is derived from uniqueness and other pragmatic considerations. In addition, Heim (1982, 1983) argues for the familiarity of definiteness, that is, an element that is previously mentioned in the discourse and is familiar to both speaker and addressee. Finally, Roberts (2003) states that a definite NP has a presupposition of a discourse referent that is familiar and unique in the discourse context. For our purposes, we will treat the use of a definite article as a linguistic choice influenced by pragmatic considerations.
As in other languages, definiteness in Hebrew involves pragmatic knowledge (e.g., Armon-Lotem and Avram, 2005; Hacohen, 2010). For example, Armon-Lotem and Avram (2005) show that the usage of the definite article ha- and the object marker et requires speakers to distinguish between the speaker’s and hearer’s knowledge, a pragmatic concept known as “the concept of non-shared knowledge” (Schaeffer, 1997). In addition, Hacohen (2010) shows that the usage of definiteness in Hebrew is pragmatically licensed only when the unique referent is shared by both the speaker and addressee. Hence, pragmatic knowledge is required to derive definiteness marking in Hebrew.
Hebrew is unique in the sense that definiteness also participates in syntactic processes. First, the article ha- is also involved in an obligatory syntactic agreement between a definite noun and its modifying adjective, as seen in examples (2a-b) (Danon, 2001, 2008; Ritter, 1991; Wintner, 2000):


Examples (2a-b) show that syntactic definiteness and semantic definiteness are not necessarily synonymous, as the two examples bear the same semantic meaning but only (2b) is ungrammatical (cf. Danon, 2001 for similar examples and rationale). In other words, the definiteness agreement between a noun and its modifier in (2a) is purely syntactic, by which a definite noun must agree in definiteness marking with its modifying adjective. Despite the different analyses of the article ha- and where and how definiteness marking occurs, scholars agree that this definiteness agreement/concord process is a syntactic operation (e.g., Danon, 2001; Ritter, 1991; Wintner, 2000).¹
Definiteness in Hebrew also interacts with the object marker et. The distribution of et is binary; et obligatorily precedes a definite NP object, and it is ungrammatical with an indefinite one (e.g., Danon, 2001, 2008; Glinert, 2004; Ritter, 1991; Wintner, 2000). This is illustrated in the following examples:


The morphosyntactic encoding of definiteness in Hebrew exhibits distinct variations between subject and object positions, as illustrated in examples (2) and (3). Specifically, definiteness marking in the subject position relies solely on the article ha- (e.g., ha-kelev ha-afor, DEF-dog DEF-grey, “the grey dog”). In contrast, definite objects are marked with dual markers: the article ha- accompanied by the accusative marker et, which exclusively signals definite objects (Danon, 2001).
These differences underscore the critical role of perceptual saliency in language processing (e.g., Batel, 2020; Dube et al., 2016). For example, Dube et al. (2016) found that while the type of grammatical error (omission vs. commission) did not significantly affect sensitivity to subject–verb agreement violations, saliency at specific positions (e.g., medial vs. final) did influence participants’ responses. This effect is particularly pronounced in the listening modality, where the phonological reduction of ha- to a- in spoken Hebrew reduces its perceptual saliency during auditory processing (Meir, I. and Doron, 2013). As a result, definite objects benefit from additional perceptual cues, providing a processing advantage over definite subjects, a benefit that is more prominent in the auditory modality.
To summarize, the Hebrew definite system requires both pragmatic and syntactic knowledge. The pragmatic aspect is represented by a shared, unique referent that was previously stated in the discourse, and the syntactic properties are represented by definiteness agreement and the accusative marker et. Moreover, variations in definiteness marking across positions of violation can influence linguistic performance, with the object position offering greater perceptual salience than the subject position. This advantage is further amplified in the listening modality, as the phonological reduction of the definite marker affects the subject position, while the dual marking in the object position helps maintain its perceptual saliency during auditory processing.
Previous psycholinguistic investigations of definiteness in Hebrew
In addition to the extensive theoretical literature (e.g., Berman, 1982, 1985; Danon, 2001, 2008; Glinert, 2004; Ritter, 1991; Wintner, 2000; among others), definiteness in Hebrew was also experimentally tested (e.g., Armon-Lotem and Avram, 2005; Hacohen, 2010; Hacohen et al., 2021; Meir, N. et al., 2017, 2021; Plaut and Hacohen, 2022; Uziel-Karl, 2015; Zur, 1983). Most of the experimental work focused on the acquisition of definiteness in Hebrew, using production data (e.g., Meir, N. and Novogrodsky, 2023; Uziel-Karl, 2015; Zur, 1983). Fewer studies have evaluated the comprehension of definiteness in children (e.g., Armon-Lotem and Avram, 2005; Hacohen, 2010; Plaut and Hacohen, 2022). The acquisition of definiteness in Hebrew is beyond the scope of this paper, yet it is worth noting that while some studies show early mastery of the Hebrew definite system, others present incomplete acquisition even at later stages, e.g., at the age of 7–8 (cf. Plaut and Hacohen, 2022 and references therein).
Turning to monolingual Hebrew adult speakers, different aspects of definiteness were evaluated in previous studies (e.g., Hacohen, 2010; Hacohen et al., 2021; Plaut and Hacohen, 2022). For example, Hacohen (2010) tested the pragmatic aspects of definiteness in Hebrew under telicity in six speakers. Telicity is a linguistic phenomenon, whereby telic predicates, but not atelic ones, have an inherent endpoint that must be reached for the predicate to be true. Crucially, definiteness is one of the necessary conditions to derive telic predicates, as coloring the squares but not coloring squares entails its completion. The study used a felicity judgment task that manipulated definiteness in nonunique contexts (bring me the painting when there are two paintings) and unique contexts (give me the green apple when there is a green apple and a red one). The results show that definiteness was correctly chosen in unique contexts in 100% of the cases and incorrectly chosen in only 17% in nonunique ones. That is, as expected, monolingual Hebrew adults are aware of the pragmatic requirements of definiteness use in Hebrew.
In addition, Plaut and Hacohen (2022) tested the acquisition of the accusative marker et as a case of differential object marking (DOM). The study manipulated et (un)marking in three conditions: et-marked definite NPs (grammatical), et-marked indefinite NPs (ungrammatical), and definite NPs unmarked for et (infelicitous), using a listening acceptability judgment task with a 3-point Likert scale. The judgment data of 14 adults showed that et-marked definite objects were rated as completely acceptable in 97% of the cases, strongly indicating the adults’ grasp of et’s syntactic properties. Conversely, when it came to et-marked indefinites (which are ungrammatical), adults rated them as completely unacceptable in 70% of cases and partially unacceptable in 30%, that is, such structures were never rated as completely acceptable. Finally, participants also judged definite NPs unmarked for et (infelicitous sentences) as somewhat odd, not fully ruled out (29%), and not fully accepted (16%). The absence of categorical rejection or acceptance suggests the influence of pragmatic considerations (cf. Paltiel-Gedalyovich, 2011).
In conclusion, previous studies investigating definiteness in Hebrew show that adult monolingual speakers are aware of the pragmatic and syntactic properties of both the definite article ha- and the accusative marker et. However, these studies relied on small sample sizes and did not cover all the properties of definiteness within a single study, which makes it impossible to compare sensitivity to pragmatic violations with sensitivity to purely syntactic ones. Moreover, to the best of our knowledge, no study has investigated how the processing of definiteness in Hebrew unfolds in real time. Hence, we now turn to present the current study.
The current study
The aim of this study is to assess speakers’ performance in detecting definiteness violations in Hebrew through various tasks (both online and offline) and modalities (reading and listening). Four experiments (a reading acceptability judgment task, an auditory acceptability judgment task, an online self-paced reading task, and an online self-paced listening task) were carried out to assess speakers’ ability to detect pragmatic and syntactic violations of definiteness across tasks.
Our study aims to shed light on whether the performance of monolingual Hebrew speakers in identifying pragmatic and syntactic violations of (in)definiteness marking is affected by the task (online vs. offline) and/or the modality (reading vs. listening).
The null hypothesis posits that there will be no discrepancies between the tasks, implying that speakers’ performance in identifying syntactic and pragmatic definiteness violations will be consistent across both online and offline tasks, as well as between reading and listening modalities. Indeed, several studies reported converging performance across tasks (e.g., Casasanto and Stag, 2008; Hofmeister et al., 2013) and modalities (e.g., Kim and Nam, 2017; Silva et al., 2018). The alternative hypothesis posits that the experimental task can influence speakers’ linguistic behavior (e.g., Cuza and Frank, 2015; Keydeniers et al., 2017; Witzel et al., 2012).
We anticipate differences in linguistic performance between online and offline tasks, aligning with previous research by Roberts and Liszka (2013) and Jackson and Roberts (2009). Specifically, guided by the immediacy assumption (Just and Carpenter, 1980) and the concept of online commitment (Jackson and Roberts, 2009), we expect that online tasks will facilitate the detection of definiteness violations in real time. Furthermore, we predict that reading tasks will result in greater sensitivity to definiteness violations than listening tasks, which aligns with the “burden of auditory processing” assumption (Murphy, 1997) and psycholinguistic findings (e.g., Johnson, 1992; Batel, 2020). Finally, building on the concept of perceptual saliency (e.g., Dube et al., 2016) and the phonological reduction of the definite article (Meir, I. and Doron, 2013), we predict that definiteness manipulation will be more detectable in the object position than in the subject position, as definite objects are reinforced by additional morphosyntactic cues. These positional differences are anticipated to be even more pronounced in the listening modality, where the acoustic properties of Hebrew further diminish the saliency of definiteness marking in the subject position but less so in the object position.
Offline tasks
The materials, data, and analysis scripts for the four studies can be retrieved from the OSF repository (https://osf.io/unj6s/).
The study was approved by the Institutional Review Board (IRB) of Bar-Ilan University. Written/electronic consent was obtained from all participants before participating in the study. Each participant was tested individually. Each of Experiments 1–4 took around 15 minutes to complete. Experiments 1 and 2, the two offline tasks, evaluated participants’ performance in detecting pragmatic and syntactic violations of definiteness through offline acceptability judgments using both reading and listening tasks.
Methods
Participants
In Experiment 1 (the reading acceptability judgment task), 20 adults who speak Hebrew as their native language completed an online questionnaire (M age = 31 yr, SD = 14, N females = 11, Min-Max: 19–67 yr). All participants volunteered to participate in the study and self-reported to be native Hebrew speakers and to use Hebrew as the main language of their daily communication. The participants belonged to mid-high socioeconomic status as measured by their years of education (M = 15, SD = 3, Min-Max: 12–19).
In Experiment 2 (the listening acceptability judgment task), 36 adults who speak Hebrew as their native language completed an online questionnaire (M age = 34 yr, SD = 12, N females = 24, Min-Max: 19–66 yr). All participants volunteered to participate in the study and self-reported to be native Hebrew speakers and to use Hebrew as the main language of their daily communication. The participants belonged to mid-high socioeconomic status as measured by their years of education (M = 15, SD = 3, Min-Max: 11–20).
Materials and procedure: offline tasks
In Experiments 1 and 2, as in Experiments 3 and 4 (discussed later), the materials were designed to manipulate two factors: type of violation (None, Pragmatic violation, and Syntactic violation) and position of violation (subject, object), creating six experimental conditions. The “None” condition consisted of grammatical and contextually appropriate sentences. Pragmatic violations involved anaphoric errors, where a noun introduced in the first sentence remained indefinite in the second (whether in subject or object position), violating the shared knowledge between the speaker and addressee. Syntactic violations varied by position: in the subject position, the violation involved the noun–adjective agreement, where the noun was marked by ha- but the adjective lacked this marking; in the object position, the violation stemmed from the erroneous use of the accusative marker et with an indefinite NP. Each condition combined one violation type and one position of violation, with eight items per condition, totaling 48 experimental items (2 positions × 3 violation types × 8 items). Additionally, 64 filler items assessing pronoun use were included but were not analyzed (see Table 1).
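As a rough check on the item counts, the factorial design described above can be enumerated in a few lines. This is only an illustrative sketch in Python, not part of the study's materials: the field names and item numbering are hypothetical, and the actual stimuli are those available in the OSF repository.

```python
from itertools import product

# Factor levels and counts as described in the text.
POSITIONS = ["subject", "object"]
VIOLATION_TYPES = ["none", "pragmatic", "syntactic"]
ITEMS_PER_CONDITION = 8


def build_design():
    """Return one record per experimental item in the 2 x 3 design.

    The record layout (position, violation, item) is hypothetical and
    serves only to count conditions and items.
    """
    design = []
    for position, violation in product(POSITIONS, VIOLATION_TYPES):
        for item in range(1, ITEMS_PER_CONDITION + 1):
            design.append(
                {"position": position, "violation": violation, "item": item}
            )
    return design


design = build_design()
conditions = {(d["position"], d["violation"]) for d in design}
print(len(conditions), len(design))
```

Running the sketch prints 6 and 48, matching the six conditions and 48 experimental items reported above.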
Table 1. Hebrew stimuli for the offline tasks per syntactic position per condition

In the tasks, participants were presented with scenarios in which definiteness (un)marking was manipulated in the second sentence. Participants were then asked to rate on a 1–7 scale (1 = completely unnatural, 7 = very natural) how natural the second sentence was given the first sentence. The instructions were provided in written form before the experiment (translated from Hebrew): “The purpose of this questionnaire is to test language processing. In this questionnaire, you will be asked to read/listen to sentences and decide on a 7-point scale whether the sentence sounds natural to you”.
In the reading acceptability judgment task (Experiment 1), participants were asked to read sentences and provide ratings based on the written stimuli. The task was administered using Qualtrics XM, an online anonymous survey platform (Qualtrics, Provo, UT). In the listening acceptability judgment task (Experiment 2), the sentences were prerecorded by a native speaker of Hebrew. The sentences were recorded in their “None” form and were then cross-spliced to create the definiteness violations while preserving natural prosody. The listening acceptability judgment task was conducted using PCIbex (Zehr and Schwarz, 2018). In Experiment 1, participants were exposed to all 48 experimental sentences. However, due to PCIbex storage limitations and the need to maintain an appropriate filler-to-experimental item ratio in Experiment 2, the stimuli were divided into three separate lists. To ensure robust data collection despite this adjustment, we increased the number of participants, assigning each participant to one list based on their birth month.
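As a rough illustration of list assignment by birth month, the sketch below maps each month to one of three lists. The text does not state which months were mapped to which list, so the modulo rule and the `assign_list` name are assumptions made purely for illustration.

```python
def assign_list(birth_month, n_lists=3):
    """Map a birth month (1-12) to a list number (1..n_lists).

    Assumption: months are cycled through the lists in order; the
    actual month-to-list mapping used in the study is not reported.
    """
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    return (birth_month - 1) % n_lists + 1


# Each list is fed by the same number of calendar months under this rule.
counts = {}
for month in range(1, 13):
    lst = assign_list(month)
    counts[lst] = counts.get(lst, 0) + 1
print(counts)
```

Any rule of this kind only balances the lists to the extent that birth months are evenly distributed in the sample.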
Analysis: offline tasks
For the two offline tasks, we analyzed the data with linear mixed-effects models using the lme4 package (e.g., Bates et al., 2015) in R version 4.1.2 (R Core Team, 2022). To answer our research questions regarding speakers’ performance in identifying definiteness violations in Hebrew across tasks and modalities, we fitted models with random and fixed effects to evaluate ratings across violation types and positions of violation. The random effects were random intercepts for participants and items; models that also included random slopes did not converge. Fixed effects included position of violation (sum-coded, with the levels “Subject” and “Object”) and violation type (sum-coded, comparing “None,” “Pragmatic violation,” and “Syntactic violation”). We also included the interaction between these fixed effects to examine differences across sentence types. The models were built incrementally using a stepwise approach. The significance of the main fixed effects was assessed using the anova() function. Figures in the study were generated using the ggplot2 package (Wickham, 2016). We used the emmeans package (Lenth, 2019) to conduct pairwise comparisons, to which Bonferroni adjustments were applied.
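To make the sum-coding scheme concrete, the following Python sketch builds a sum (deviation) contrast matrix for a three-level factor and shows that, under this coding, the intercept corresponds to the grand mean of the cell means. It is not the authors' R code: the helper names are hypothetical, the choice of which level receives the −1 codes is an assumption, and the cell means are invented toy numbers rather than data from the study.

```python
from statistics import mean


def sum_contrasts(levels):
    """Sum (deviation) coding: k levels -> k-1 columns; last level coded -1."""
    k = len(levels)
    matrix = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            matrix[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            matrix[level] = [-1] * (k - 1)
    return matrix


def predict(intercept, coefs, codes):
    """Fitted value for one level: intercept plus coded coefficients."""
    return intercept + sum(b * c for b, c in zip(coefs, codes))


# Hypothetical cell means (NOT the study's data), for illustration only.
toy_means = {
    "None": 6.0,
    "Pragmatic violation": 3.0,
    "Syntactic violation": 3.0,
}
levels = list(toy_means)
codes = sum_contrasts(levels)

# Under sum coding the intercept is the grand mean of the cell means, and
# each coefficient is that level's deviation from the grand mean.
intercept = mean(list(toy_means.values()))
coefs = [toy_means[lv] - intercept for lv in levels[:-1]]

for lv in levels:
    print(lv, codes[lv], predict(intercept, coefs, codes[lv]))
```

With this coding, a main effect of position is evaluated at the average of the violation types rather than at a single reference level, which is why sum coding is often preferred when interactions are included.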
Results
To test the effect of modality on linguistic performance in the two offline tasks, a combined analysis was conducted that incorporated modality (reading vs. listening) as an additional fixed factor (see supplementary materials). Figure 1 presents the descriptive statistics.

Figure 1. Mean ratings by modality, violation type, and position of violation in the offline tasks.
Our initial analysis explored the effects of modality, violation type, and position of violation on participants’ ratings of definiteness violations, along with the interactions among these factors. By combining data from the two experiments, this analysis uncovered several key findings (see supplementary materials, Table S1). First, the model showed that the two tasks differed significantly, as indicated by the modality factor (p<.001), with participants providing overall lower ratings in the reading task (Experiment 1) than in the listening task (Experiment 2). Additionally, the analysis revealed a significant difference between the two positions of violation (subject and object) across tasks, as observed in the position of violation factor (p<.001). Furthermore, ratings differed significantly between no-violation sentences and pragmatic violations (p<.001) but not between “None” and syntactic violations (p=.38).
To simplify interpretation and avoid the complexity of three-way interactions, separate models are presented for each task. The model output for Experiment 1 (the offline reading acceptability judgment task) is shown in Table 2.
Table 2. The final model for Experiment 1 (the reading acceptability judgment task)

As shown in Table 2, using a sum-coded approach, the model revealed a significant difference between the subject and object positions (p=.017). Additionally, both types of violations differed significantly from sentences with no violations (p<.001 for both comparisons), reflecting clear sensitivity to both error types. However, no significant interaction was found between the position of violation and violation type, suggesting that the impact of definiteness violations did not vary between the subject and object positions. Next, we present the results of Experiment 2, the listening acceptability judgment task; the final model is presented in Table 3.
Table 3. The final model for Experiment 2 (the listening acceptability judgment task)

As shown in Table 3, the model highlights several key findings. First, there were significant differences between the subject and object positions of violation (p<.001), and between sentences with no violations and those containing definiteness errors, with both pragmatic and syntactic violations reaching significance (p<.001). Notably, unlike in Experiment 1, Experiment 2 revealed a significant interaction between the position of violation and the contrast between no violations and pragmatic violations (p<.001), which was not observed for the contrast between no violations and syntactic violations (p=.379). To further investigate the source of this interaction, we conducted post hoc comparisons with Bonferroni adjustments (see Tables 4 and 5).
Table 4. Post-hoc comparison per position of violation – Experiment 2 (the listening acceptability judgment task)

Table 5. Post-hoc comparison per violation type – Experiment 2 (the listening acceptability judgment task)

The Bonferroni-adjusted post hoc comparisons revealed that participants rated sentences with no violations significantly higher than sentences containing either syntactic or pragmatic violations, across both positions of violation (p<.001 for all comparisons). Hence, the interaction stems from the ratings of the sentences containing errors: while no-violation sentences received similar ratings across the two positions of violation (p=.23), the two error types (pragmatic and syntactic violations) were rated significantly lower in the object position than in the subject position (p=.02 and p<.001, respectively).
To summarize, the findings from the two offline tasks reveal similar patterns with notable differences across modalities. Sentences containing definiteness errors were consistently rated lower than those without violations, indicating that both Hebrew readers and listeners identify pragmatic and syntactic definiteness violations as errors. Furthermore, in both tasks, ratings varied by position of violation, with lower ratings observed in the object position. However, while the position of violation did not interact with violation type in Experiment 1, listeners in Experiment 2 rated definiteness errors in the object position significantly lower than in the subject position. This difference will be elaborated upon in the general discussion. Following these offline studies, we moved on to the final two tasks: the online self-paced reading and listening experiments.
Online tasks
Methods
Participants
In Experiment 3 (the online self-paced reading task), 20 native Hebrew-speaking adults participated (M age = 28 yr, SD = 9, N females = 14, Min-Max: 20–57). All participants volunteered to participate in the study and self-reported being monolingual Hebrew speakers above the age of 18. The participants were of mid-to-high socioeconomic status, as they were all BA or MA students.
In Experiment 4 (the online self-paced listening task), 30 native Hebrew-speaking adults participated (M age = 25.9 yr, SD = 14, N females = 22, Min-Max: 22–37). As in Experiment 3, all participants volunteered, self-reported being monolingual Hebrew speakers over the age of 18, and were of mid-to-high socioeconomic status, as they were all BA or MA students.
Materials and procedure: online tasks
To assess Hebrew speakers’ performance in identifying definiteness violations, we administered online self-paced reading and listening tasks. The stimuli in Experiments 3 and 4 were identical to those used in Experiments 1 and 2; specifically, the written sentences and audio files from Experiments 1 and 2 were used. However, in Experiments 3 and 4, the sentences were divided into segments (six segments for the subject position and seven segments for the object position) and presented one segment at a time, using the moving-window technique (Just et al., 1982) (see Tables 6 and 7 for details on the subject and object positions, respectively).
Table 6. Hebrew stimuli for the online tasks per syntactic position per condition (critical segments are marked in light grey for the pragmatic violations, dark grey for the syntactic violations)

Table 7. Hebrew stimuli for the online tasks-object position

Due to the inherent properties of definiteness marking in Hebrew, it was not possible to directly compare the two positions of violation. Furthermore, the critical regions for the violations differed: in the object position, all critical regions were located in segment 5 due to the presence/absence of the accusative marker et. In the subject position, the critical region for the pragmatic violation was in segment 3, where the noun introduced in the first sentence was indefinite in the second sentence. For the syntactic violation, the critical region was in segment 4, where noun–adjective definiteness agreement was violated by an indefinite adjective following a definite noun. Each type of violation in its critical region was compared to its no-violation counterpart within the corresponding segment, to allow testing the effect of perceptual saliency across different positions of violation. A quarter of the items were followed by a yes/no comprehension question to ensure participants were attentive to the task.
In the online self-paced reading task (Experiment 3), participants were asked to read the sentences at their own individual pace and press the space key to advance to the next segment. In the online self-paced listening task (Experiment 4), participants were asked to listen to sentences and press the space key to listen to the next segment. Both experiments were programmed and administered using E-prime software (Psychology Software Tools, Pittsburgh, PA). To control for item effects, the experimental items were divided into three random lists, each of which was assigned randomly to participants, ensuring that all participants were exposed to all experimental items.
Data trimming and analysis: online tasks
For the two online tasks, we used linear mixed-effects models to analyze the data with the lme4 package (e.g., Bates et al., 2015) in R version 4.1.2 (R Core Team, 2022). We fitted models with random and fixed effects evaluating RTs across conditions and syntactic positions. The random effects included random intercepts for participants and items.
Given that the critical regions vary by position of violation and violation type, we compared each type of violation to its corresponding no-violation counterpart within the relevant critical segment. This led to two separate models. The model focusing on pragmatic violations included position of violation (sum-coded, with the levels “Subject” and “Object”), violation type (sum-coded, comparing “None” with “Pragmatic violation”), and their interaction as fixed effects; it was fitted to segment 3 in the subject position and segment 5 in the object position to evaluate differences in RTs based on the position of violation and violation type. Similarly, the model focusing on syntactic violations included position of violation (sum-coded, with the levels “Subject” and “Object”), violation type (sum-coded, comparing “None” with “Syntactic violation”), and their interaction as fixed effects; it was fitted to segment 4 in the subject position and segment 5 in the object position to assess differences across sentence types and positions of violation. For both models, random slopes were excluded due to convergence issues, and only random intercepts for participants and items were retained.
Because the two modalities differ, data trimming was conducted differently for each task. For Experiment 3, to control for differences in length between the segments, residual RTs for each segment were calculated using word-length residuals (number of characters per segment; Ferreira and Clifton, 1986). The residual RTs were then log-transformed to meet the assumption of normality (e.g., van Dijk et al., 2022). For Experiment 4, to control for differences in length across audio files, raw RTs were transformed into residual RTs by subtracting the segment duration from the raw RTs for each segment separately. The residual RTs were log-transformed to meet the assumption of normality.
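The two residualization steps can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it fits a single ordinary least-squares line of RT on segment length (length-based residuals are often computed per participant), the function names are hypothetical, and the numbers are invented toy values. The subsequent log transformation is omitted because the text does not specify how non-positive residuals were handled.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def length_residuals(n_chars, rts):
    """Reading: observed RT minus the RT predicted from segment length."""
    intercept, slope = fit_line(n_chars, rts)
    return [rt - (intercept + slope * n) for n, rt in zip(n_chars, rts)]


def duration_residuals(durations, rts):
    """Listening: raw RT minus the duration of the audio segment."""
    return [rt - d for d, rt in zip(durations, rts)]


# Toy values in milliseconds (hypothetical, not from the study).
reading_resid = length_residuals([4, 6, 8, 10], [400.0, 520.0, 610.0, 750.0])
listening_resid = duration_residuals([500, 700], [650, 900])
print([round(r, 1) for r in reading_resid])
print(listening_resid)
```

In both cases, a positive residual indicates that a segment took longer than expected given its length or duration.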
Figures in this study were generated using the ggplot2 package (Wickham, 2016). We used the emmeans package (Lenth, 2019) to conduct pairwise comparisons, with Bonferroni adjustments applied to correct for multiple comparisons.
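For readers unfamiliar with the Bonferroni adjustment, the sketch below shows the underlying arithmetic: each p-value is multiplied by the number of comparisons in the family and capped at 1. This is a generic illustration with invented p-values, not a reimplementation of emmeans, and the `bonferroni` helper is a hypothetical name.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for a family of three comparisons.
raw = [0.125, 0.25, 0.5]
adjusted = bonferroni(raw)
print(adjusted)
```

The adjustment is conservative: the more comparisons in the family, the smaller an unadjusted p-value must be to remain below the significance threshold.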
Results: online tasks
The descriptive statistics for Experiment 3 (online self-paced reading task) are presented in Figure 2, and the model output for no violation vs. pragmatic violation is presented in Table 8 (see Footnote 3).

Figure 2. Mean log-transformed RTs per condition in Experiment 3 (self-paced reading task, light grey refers to critical region for pragmatic violations, and dark grey for critical region of syntactic violations in the subject position).
Table 8. The final model for Experiment 3 (None vs. pragmatic violation)

Table 8 shows significant differences between the subject and object positions (p<.001), as well as between the sentences with no violations and those with pragmatic violations (p<.001). A significant interaction between the position of violation and violation type emerged (p<.001), indicating that the effect of the violation type was influenced by the position of violation. To follow up on this significant interaction, we conducted post hoc comparisons with Bonferroni adjustments, which confirmed that sentences with no violation had significantly faster reaction times than sentences with pragmatic violations in both the subject position (EST=−.10, SE=.01, t=−5.53, p<.0001) and the object position (EST=−.31, SE=.01, t=−16.73, p<.0001). Notably, the effect of violation type (“None” vs. “Pragmatic violation”) was more pronounced in the object position, as reflected by the larger estimate, indicating that the interaction was driven by the differing effect sizes across positions of violation.
We now move to the results for the sentences with no violations vs. syntactic violations, presented in Table 9. There are significant differences between the subject and object positions (p<.001), between sentences with no violations vs. syntactic violation sentences (p<.001), and a significant interaction between the position of violation and violation type (p=.029). To further assess this significant interaction, we conducted post hoc comparisons with Bonferroni adjustments. As with pragmatic violations, the post hoc Bonferroni-adjusted comparisons revealed that speakers processed sentences with no violations significantly faster than those with syntactic violations across both the subject position (EST=−.12, SE=.01, t=−7.51, p<.0001) and the object position (EST=−.07, SE=.02, t = −4.42, p<.0001).
Table 9. The final model for Experiment 3 (None vs. syntactic violation)

To summarize, the results of the self-paced reading task provide several notable insights. First, as in the offline tasks, speakers showed a clear distinction between sentences with no violations and sentences with definiteness errors across the two positions of violation, as evidenced by slower RTs for the sentences with definiteness violations. Additionally, there was an effect of position of violation: sentences with definiteness violations in the object position were processed more slowly than those in the subject position, a finding that will be addressed in the general discussion.
Finally, we present the results of the final task, the self-paced listening task. The descriptive statistics for Experiment 4 (the online self-paced listening task) are presented in Figure 3, and the model output for no violation vs. pragmatic violations is presented in Table 10 (see Footnote 4).

Figure 3. Mean log-transformed RTs per condition in Experiment 4 (self-paced listening task, light grey refers to critical region for pragmatic violations, and dark grey for critical region of syntactic violations in the subject position).
Table 10. The final model for Experiment 4 (None vs. pragmatic violation)

As shown in Table 10, there was a significant difference between the subject and object positions (p=.017), but neither the effect of violation type (p=.93) nor the interaction between the position of violation and violation type (p=.22) was significant. Unlike in the previous experiments, no differences in performance were observed between sentences with no violations and those with pragmatic violations in the self-paced listening task in either position.
Finally, Table 11 presents the model comparing sentences with no violations and those with syntactic violations. The results indicated no significant effect of position of violation (p=.14), but there was a significant effect of violation type (p<.001). Importantly, there was a significant interaction between position of violation and violation type (p<.001). To follow up on this significant interaction, we conducted post hoc analyses using Bonferroni adjustments for multiple comparisons. The post hoc analyses revealed that in the object position, speakers processed sentences with no violations significantly faster than those with syntactic violations (EST=−.48, SE=.08, t=−5.88, p<.0001). Interestingly, no such effect was observed in the subject position (EST=−.08, SE=.08, t=−1.01, p=.31).
Table 11. The final model for Experiment 4 (None vs. syntactic violation)

To summarize, the results of the online tasks revealed several noteworthy patterns. In the self-paced reading task (Experiment 3), clear differences emerged between sentences containing definiteness errors and those with no violations, aligning with the findings from the offline tasks. However, the results from the self-paced listening task (Experiment 4) diverged from those of Experiments 1–3. In the subject position, no significant differences were found between no-violation sentences and those containing either pragmatic or syntactic violations. Conversely, in the object position, significant differences were observed between sentences with no violations and those with syntactic violations, while no differences were detected between no-violation sentences and those with pragmatic violations.
Another noteworthy finding is the difference between subject and object positions in the reading tasks, where violations in the object position were rated lower and processed more quickly than those in the subject position. This consistent finding will be examined further in the general discussion, to which we now turn.
General discussion
This study investigated Hebrew speakers’ linguistic performance in identifying (in)definiteness marking across various tasks (online vs. offline) and modalities (reading vs. listening). Drawing on prior research, we expected participants to detect definiteness errors in Hebrew by differentiating sentences with no violations from both syntactic and pragmatic errors across the different tasks. However, we also anticipated variations in performance based on the task type and modality.
This study is the first to examine definiteness (un)marking using a combination of online and offline tasks, as well as reading and listening modalities. To address our research question, we will discuss the results with a focus on task manipulation (offline vs. online) and modality (listening vs. reading).
Linguistic performance across tasks (online vs. offline) and stimuli presentation (reading vs. listening)
The research question of this study was whether the performance of monolingual adult speakers in identifying pragmatic and syntactic violations of (in)definiteness marking is influenced by the task and/or modality. A summary of our findings is provided in Table 12.
Table 12. Summary of findings: In Experiments 1 and 2, the symbol “>” denotes greater accuracy, while in Experiments 3 and 4, it represents slower reaction times

As expected, our findings align with previous research, demonstrating that monolingual Hebrew-speaking adults could detect both pragmatic and syntactic violations related to definiteness use across different tasks and modalities. However, methodological concerns arise as the distinction between sentences with no violations and definiteness violations did not consistently align across the four tasks. Differences in task performance were evident in the effects and interactions between the position of violation (subject and object) and type of violation (none, syntactic violation, and pragmatic violation). Specifically, in Experiment 1 (offline reading acceptability judgment task), Experiment 2 (the offline listening acceptability judgment task), and Experiment 3 (self-paced reading task), an effect of position of violation was found, with manipulations in the object position resulting in lower ratings and faster response times compared to the subject position. In Experiment 4 (the online self-paced listening task), an effect was observed exclusively for syntactic violations in the object position, where sentences with syntactic violations elicited slower RTs compared to sentences without violations.
This pattern of results can be attributed to perceptual saliency (e.g., Dube et al., 2016). In the object position, the co-occurrence of the definite article ha- and the object marker et created a facilitative effect, enhancing the perceptual saliency of definiteness violations compared to the subject position across both reading and listening modalities. This dual marking likely accounts for the more pronounced effects observed in tasks involving object violations.
In addition, as predicted, the contrast between positions of violation played a significant role in Experiment 2 (offline listening acceptability judgment task). Listeners detected definiteness errors across both positions, revealing an effect of position of violation and an interaction between position of violation and violation type. Furthermore, comparing the two offline tasks highlights key differences. In Experiment 1 (the offline reading acceptability judgment task), readers rated definiteness errors differently between the subject and object positions, but the interaction between these two variables was not statistically significant. In contrast, in Experiment 2, listeners rated definiteness violations in the object position as significantly worse than in the subject position. These findings support the enhanced perceptual saliency of the object position (Dube et al., 2016) and emphasize the impact of the acoustic properties of definiteness marking in Hebrew on auditory processing (Meir and Doron, 2013).
The findings of Experiment 4 (the online self-paced listening task) differed from those of Experiments 1–3, as listeners did not recognize pragmatic violations in either position of violation. RTs for sentences with and without pragmatic violations were similar, while differences in RTs for syntactic violations were observed only in the object position.
These unexpected results may stem from the interplay of perceptual saliency and the acoustic properties of definiteness marking in Hebrew, which likely led to more pronounced effects in the object position. This, combined with parser-grammar misalignment, could have resulted in “shallower” processing and reduced the sensitivity to definiteness violations. For example, Ivanova-Sullivan, Sekerina, Tofighi, and Polinsky (2022) tested sensitivity to clitic placement in Bulgarian in monolinguals and bilinguals using a self-paced listening task (i.e., an online task) and an auditory acceptability judgment task (i.e., an offline task). For the monolingual group, grammatical sentences received higher ratings than ungrammatical ones in the offline task, whereas in the self-paced listening task there were no differences in response times between grammatical and ungrammatical sentences. As evidence that the task is not methodologically flawed, the authors report that in a previous self-paced reading task, grammatical constructions were read much faster than ungrammatical ones; hence, the null result is not a matter of the task itself but rather of issues related to the listening modality combined with the online measurement.
In line with their findings, it is possible that the phonological complexities of Hebrew definiteness marking, coupled with the online measurement, caused the parser to create “illusions” of grammaticality in infelicitous sentences. Another potential reason for the failure to detect pragmatic violations could be that grammatical violations tend to have a stronger impact than pragmatic ones (cf. Paltiel-Gedalyovich, 2011). However, further research is needed to confirm this.
Finally, although speakers consistently distinguished between no-violation sentences and definiteness-violation sentences across tasks, significant effects related to position of violation, violation type, and their interactions were observed only in Experiment 3, the online self-paced reading task. Therefore, a question arises: why were no significant interactions observed in the reading acceptability judgment task (Experiment 1), while they were present in the online self-paced reading task (Experiment 3)?
This finding supports our prediction that participants in online tasks were better at identifying definiteness violations compared to offline tasks, as evidenced by the significant interactions in Experiment 3. This aligns with the immediacy assumption (Just and Carpenter, 1980) and the concept of online commitment (Jackson and Roberts, 2009), which suggest that linguistic information is processed incrementally as the sentence unfolds. As a result, participants in online tasks can detect definiteness errors as they occur. In contrast, offline tasks require participants to wait until the end of the sentence, allowing working memory and extralinguistic factors to influence their judgments.
This pattern did not extend to the listening modality. In the offline listening task, participants detected both types of violations, whereas in the online listening task, they identified only syntactic violations in the object position. This contrast likely reflects the interplay between the online measurement method and the listening modality, as observed in the findings of Ivanova-Sullivan et al. (2022). Additionally, differences in task demands during online processing, as highlighted by Laurinavichyute and von der Malsburg (2024), may further influence participants’ ability to detect errors. These findings underscore the need for future research to investigate how task design and modality shape error detection in linguistic processing.
While this study provides valuable insights into participants’ performance in detecting syntactic and pragmatic violations across online and offline reading and listening tasks, several limitations should be acknowledged. First, the sample size for each experiment was relatively small, particularly in Experiment 2, where participants were exposed to a significantly lower number of items, namely only a third of the total items. Future research should aim for larger sample sizes to more accurately assess and validate these findings. Additionally, while the study highlighted variations between online and offline tasks, these differences may be attributable to the tasks themselves rather than the contrast between methodologies (e.g., Laurinavichyute and von der Malsburg, 2024). Future studies should further investigate these contrasts, potentially focusing on comparisons between comprehension questions and acceptability judgment tasks within both methodologies.
To summarize, as expected, our findings indicate that Hebrew speakers are able to detect both pragmatic and syntactic violations involving definiteness (un)marking, as evidenced by higher ratings and faster processing of sentences with no violations compared to sentences with definiteness violations. However, their performance varied across tasks (online vs. offline) and modalities (reading vs. listening), with the listening modality showing poorer performance. This variation can be attributed to differences in definiteness marking between subject and object positions in spoken Hebrew, as well as to the perceptual saliency of these positions (Dube et al., 2016). Additionally, the combination of online measurements and the listening modality may have led to more “shallow” processing. The type of task, whether online or offline, may also influence linguistic performance. In the reading modality, the online task proved to be a more effective measure for detecting definiteness violations. However, in the listening modality, the complexities of online measurement combined with the acoustic properties of definiteness marking in Hebrew appeared to obscure such effects, highlighting the need for further research to better understand these dynamics.
Conclusion
The current study investigated Hebrew speakers’ performance in detecting syntactic and pragmatic violations of definiteness use across tasks (online vs. offline) and modalities (reading vs. listening). The findings show a general pattern across tasks and modalities: speakers detected pragmatic and syntactic violations, either rating sentences with no violations higher or processing them faster than sentences with definiteness violations. Nonetheless, our study showed notable differences across the tasks and modalities. For example, the self-paced listening task revealed no differences in RTs between sentences with no violations and sentences with pragmatic violations, and slower RTs were found for syntactic violations only in the object position. The differences in performance across tasks may lie in the listening modality, where the phonological reduction of the definite article reduced the saliency of definiteness marking in the subject position compared to the object position (Meir and Doron, 2013). This, combined with the demands of detecting errors in the listening modality during online processing, may affect listeners’ performance in identifying definiteness violations (e.g., Ivanova-Sullivan et al., 2022). In contrast, participants in the online self-paced reading task demonstrated the strongest effects in identifying both syntactic and pragmatic violations, suggesting that task type and stimulus presentation may influence linguistic performance. This underscores the need for future research to explore these dynamics further and test both modalities and methodologies, aiming to enhance our understanding of linguistic phenomena and provide insights into the differences and similarities across psycholinguistic approaches.
Replication Package
Data, materials, and code are available at https://osf.io/unj6s/.
Supplementary material
For supplementary material accompanying this paper visit https://doi.org/10.1017/S0142716425000037
Acknowledgements
Dana Plaut-Forckosh’s work was generously supported by the Presidential Scholarships for Outstanding Students, funded by Bar-Ilan University, and the Prof. Nathan Rotenshreich Scholarship Program for Outstanding Doctoral Students in the Humanities. This study was also partially funded by the Israel Science Foundation (ISF) through grant number 552/21, titled “Towards Understanding Heritage Language Development: The Case of Child and Adult Heritage Russian in Israel and the USA”, awarded to Natalia Meir. Their financial support has been instrumental in enabling this research, and we express our heartfelt gratitude for their generous contributions.
Competing interests
We have no known conflict of interest to disclose. All authors contributed equally to this work. Preliminary versions of these findings were showcased as poster presentations at various conferences, e.g., Architectures and Mechanisms for Language Processing (AMLaP).
Natalia Meir, Associate Editor at Applied Psycholinguistics, affirms that she had no involvement in the editorial process of the subject paper.