LEARNING DIRECTION MATTERS: A STUDY ON L2 RHYTHM ACQUISITION BY DUTCH LEARNERS OF SPANISH AND SPANISH LEARNERS OF DUTCH

Lieke van Maastricht; Emiel Krahmer; Marc Swerts; Pilar Prieto

doi:10.1017/S0272263118000062

Published online by Cambridge University Press: 21 May 2018

Lieke van Maastricht*: Affiliation:
Tilburg University, Tilburg center for Cognition and Communication, Tilburg and Radboud University, Centre for Language Studies, Nijmegen
Emiel Krahmer: Affiliation:
Tilburg University, Tilburg center for Cognition and Communication, Tilburg
Marc Swerts: Affiliation:
Tilburg University, Tilburg center for Cognition and Communication, Tilburg
Pilar Prieto: Affiliation:
Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalunya, and Universitat Pompeu Fabra, Barcelona, Catalunya
*Correspondence concerning this article should be addressed to Lieke van Maastricht, Radboud University, Postbus 9102, 6500 HC Nijmegen, Netherlands. E-mail: [email protected]

Abstract

This study examines the acquisition process of speech rhythm in Dutch learners of Spanish (DLS) and Spanish learners of Dutch (SLD) at different proficiency levels to determine whether learning direction affects the success of rhythm acquisition in a foreign language (L2). Analyses of lengthening effects showed that the two learner groups followed different developmental paths in their acquisition of accentual and final lengthening: Both groups showed transfer effects from the L1, but while the DLS systematically approached their target until attainment, the SLD showed more variability in their development. In addition, syllable structure complexity affected L2 rhythm acquisition, and to a substantially larger extent for the SLD compared to the DLS. The results support a model of L2 rhythm acquisition in which learning direction is included as a factor, and that allows for the interaction of various language-specific properties that contribute to speech rhythm, like syllable structure complexity.

Type: Research Article
Information: Studies in Second Language Acquisition, Volume 41, Issue 1, March 2019, pp. 87–121

DOI: https://doi.org/10.1017/S0272263118000062
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright: Copyright © Cambridge University Press 2018

INTRODUCTION

The acquisition of a foreign language (L2) occurs within various linguistic dimensions simultaneously. While many studies focus on L2 attainment of segmental, lexical, syntactic, and semantic properties, research on the acquisition of L2 prosody is substantially underrepresented. Furthermore, within the field of L2 prosody acquisition, the attainment of L2 rhythm has received relatively little attention compared to other suprasegmental properties, like lexical stress, phrasal prominence, and speech rate (Gut, 2009). However, native listeners as young as five days old are capable of discriminating languages that have traditionally been classified as prototypically “stress timed” and “syllable timed,” like Dutch and Spanish respectively, based only on their rhythm (Nazzi, Bertoncini, & Mehler, 1998; Ramus & Mehler, 1999; Ramus, Dupoux, & Mehler, 2003). Previous studies suggest that perceived rhythm is the result of an interaction between language-specific factors, such as timing properties, prominence, and boundary marking by means of syllable duration, and syllable structure (e.g., Abercrombie, 1967; Li & Post, 2014; Post & Payne, 2018; Prieto, Vanrell, Astruc, Payne, & Post, 2012; White & Mattys, 2007a). Thus, rhythm is related to phonemic, phonotactic, and intonational features of language, and producing speech with adequate rhythm therefore requires control in multiple areas.

Hence, it is perhaps unsurprising that many L2 learners have difficulties acquiring the speech rhythm characteristic of their L2. It has been shown that L2 speakers, especially in the early phases of learning, tend to transfer rhythmic properties of their native language (L1) to the L2 (e.g., White & Mattys, 2007b), suggesting that targetlike rhythm production is easier when the L1 and L2 are rhythmically similar (e.g., Ordin & Polyanskaya, 2015). Indeed, the similarity between the L1 and L2 as a factor of successful L2 acquisition has been studied extensively within the fields of L2 phonology and phonetics, and several theoretical models are based on it, like the Speech Learning Model (SLM, Flege, 1995), the Perceptual Assimilation Model (PAM, Best, 1995), and the Second Language Linguistic Perception model (L2LP, Escudero & Boersma, 2004). However, the direction in which languages are learned has been studied less frequently (Gut, 2009). Intuitively, learning direction is an important factor to investigate: Arguably, acquisition is more challenging from less complex languages toward more complex ones than vice versa. To our knowledge, the only study on learning direction as a function of L2 prosody acquisition is Rasier’s (2006) study on the acquisition of pitch accents to mark focus by L1 Dutch learners of French and L1 French learners of Dutch, showing that learning direction indeed affected the degree of success with which L2 learners produced targetlike pitch accent distributions. And while an analysis in which the same two languages are compared cross-directionally sheds more light on the processes underlying the role of learning direction in L2 acquisition, no study has performed such a comparison for speech rhythm acquisition.

Therefore, we explore whether the direction in which L2 acquisition occurs affects the successful attainment of speech rhythm by L2 learners of two languages that are rhythmically different, namely Dutch learners of Spanish (DLS) and Spanish learners of Dutch (SLD). To test which learner group advances more toward its target, we compare DLS and SLD with varying proficiency levels for two measures that correlate with speech rhythm, that is, accentual and final lengthening, in different phonotactic conditions. Before turning to our predictions, several concepts relevant to the current study are described in more detail.

SPEECH RHYTHM

The construct of rhythm has been operationalized in various ways. A diachronic overview of rhythm analyses generally starts with the notion of rhythm as a categorical concept dependent on isochrony (Abercrombie, 1967; Pike, 1945). Within this view, a distinction has been made between syllable-timed and stress-timed languages, where the former refers to languages in which the intervals between the beginning of all syllables were taken to be equal (e.g., Spanish), while the latter applies to languages in which only the intervals between stressed syllables were assumed to be similar (e.g., Dutch). This categorical distinction was questioned by studies showing that the idea of equal intervals, between all syllables or between stressed syllables only, was not supported by acoustic measurements (e.g., Bolinger, 1965). This initiated a shift toward the notion of rhythm as a gradient property, with the underlying assumption that no language is either completely syllable-timed or stress-timed (Dauer, 1983,^{Footnote 1} 1987). The results of acoustic and phonetic experiments provided an overview of properties that are relevant to speech rhythm and enable comparisons between languages based on these properties.

More recently, it was shown that not only phonetic and phonotactic but also prosodic features of language influence speech rhythm. Studies showed that languages differ in the extent to which they lengthen stressed and/or accented syllables vis-à-vis unstressed and unaccented syllables (e.g., Beckman & Edwards, 1994), as well as in their lengthening of syllables preceding a prosodic boundary within or at the end of an utterance (e.g., Byrd, 2000). Prieto et al. (2012) showed that the degree to which languages apply these prosodic lengthening measures contributes to cross-linguistic rhythmic differences. Resulting from these developments, more recent studies of speech rhythm rely on rhythm metrics to measure the timing patterns of utterances, occasionally in combination with lengthening analyses (Grabe & Low, 2002; Gut, 2009; Li & Post, 2014; Ordin & Polyanskaya, 2015; Prieto et al., 2012; Ramus, Nespor, & Mehler, 1999). These analyses followed from the dismissal of rhythm as a dichotomous notion, leading to a need for quantitative data to corroborate the idea of a rhythm continuum and to position a given language on it.^{Footnote 2} In this study, we base our analyses of speech rhythm on measures of accentual and final lengthening, in agreement with Dauer’s (1983, 1987) list of parametric criteria to rhythmically differentiate between languages, one of which is the presence or absence of durational variation between stressed and unstressed syllables and the use of pitch to mark prominence (also see Allen & Hawkins, 1978). The following section explains why the two languages studied in the current investigation differ significantly in their speech rhythm.

TYPOLOGICAL DIFFERENCES BETWEEN DUTCH AND SPANISH

Several typological differences between Spanish and Dutch have been hypothesized to underlie the perceptual distinction between these languages concerning rhythm. One difference concerns syllable complexity constraints: The majority of Spanish syllables have an open structure (syllables consisting of a consonant [C] followed by a vowel [V] are most frequent in two corpus studies: 58.0% of all syllables had CV structure in Navarro Tomás, 1966, and 53.9% in Hartsuiker, 2002), while the majority of Dutch syllables are closed (CVC syllables represented 62.4% in the corpus study by Hartsuiker, 2002). Moreover, Spanish allows for relatively few syllable structures that are more complex than the CV configuration. Navarro Tomás (1966) stated that the most complex syllable type in Spanish is CCVCC, as found in the first syllable of trans-for-mar, “to transform.” Conversely, Dutch is documented as more varied in its syllable structure with complex structures being the norm. Syllable complexity can increase to up to seven segments in one syllable, for instance in the word strengst (“strictest”), which has a CCCVCCC structure (Booij, 1995; Van Zon, 1997). Because Dutch and Spanish differ typologically in this respect, the current study controls for syllable structure in two out of the three conditions (using predominantly CV and CVC syllables, respectively), while in the last (Mixed) condition syllable structures are used that are typical of both languages. In addition to these phonotactic differences, the two languages also differ in prosodic properties: Spanish is known to employ little accentual and final lengthening while both are employed extensively in Dutch (Cambier-Langeveld & Turk, 1999; Cambier-Langeveld, Nespor, & Van Heuven, 1997; Delattre, 1966; Prieto et al., 2012).^{Footnote 3} In the following section, the effects of these differences on L2 rhythm acquisition are discussed.

L2 RHYTHM ACQUISITION

Prior work on L2 rhythm attainment generally concentrated on the influence of the L1 on the L2, and typically reported that although L2 learners from different L1 backgrounds increasingly approached their target, considerable transfer effects usually also occur from the L1 to the L2 (e.g., Carter, 2005; White & Mattys, 2007a). Recently, Li and Post (2014) investigated the rhythm produced by Chinese and German learners of English with intermediate or advanced proficiency level. Their analyses showed that while learners from both L1 backgrounds produced rhythm metric values and syllable durations that increasingly approached the L2 target, their development also showed signs of L1 transfer: Where intermediate learners produced values that were closer to those typical of their L1, the advanced learners produced values that were more similar to those of the L2 target. Interestingly, both learner groups performed equally well, which is surprising, because intuitively German rhythm is more similar to English rhythm than Mandarin rhythm. One might therefore assume that the German learners of English would be more successful at producing the target speech rhythm than the Chinese learners of English.

This is precisely the idea developed further in Ordin and Polyanskaya (2015), who compared French and German L2 learners of English at beginner, intermediate, or advanced proficiency level. Their results corroborated those of Li and Post (2014) in that rhythm metric values of both learner groups revealed that durational variability increased as L2 acquisition progressed, which would be an indication of universal L2 acquisition development. Conversely, their results further showed that while the most proficient German learners of English achieved target values (and for some metrics the intermediate learners did too), the French learners of English did not. Ordin and Polyanskaya considered this an indication that L1 speakers of a syllable-timed language (here French) encountered more difficulty acquiring the speech rhythm of a stress-timed language (here English), than L1 speakers of another stress-timed language (here German). However, because Ordin and Polyanskaya compared two different L1-L2 combinations, the design of their study makes it impossible to rule out the possibility that the differences between these two learner groups were due to other segmental, phonotactic, or prosodic properties in which French and German differ from each other.

THE CURRENT STUDY

In view of our limited understanding of rhythm transfer in general, and the importance of learning direction in this context specifically, this study compares DLS and SLD in their rhythm production, to determine which L2 group is more successful at producing targetlike speech rhythm. Consequently, a language model is required that allows for predictions based on more than just the similarity of the L1 and L2 because both learning directions consist of the same language combination. Unfortunately, this excludes popular models of L2 acquisition, such as the SLM (Flege, 1995), PAM (Best, 1995), and L2LP (Escudero & Boersma, 2004). Moreover, these models concern the acquisition of segmental features and are therefore difficult to apply to suprasegmental L2 properties. Other models that do allow for predictions regarding prosodic features tend to focus on prosodic cues at a lexical level only, and generally take a Universal Grammar perspective, assuming that specific parameters are organized into a hierarchical tree structure in which some are embedded within others (e.g., Archibald, 1994; Özçelik, 2016).

We therefore base our predictions on Eckman’s (1977, 2008) Markedness Differential Hypothesis (MDH), which is applicable to most areas of L2 acquisition and does not depart from a specific language acquisition theory:

(1) MDH: “The areas of difficulty that a language learner will have can be predicted such that
(a) Those areas of the target language which differ from the native language and are more marked than the native language will be difficult;
(b) The relative degree of difficulty of the areas of difference of target language which are more marked than the native language will correspond to the relative degree of markedness;
(c) Those areas of the target language which are different from the native language, but are not more marked than the native language will not be difficult.” (Eckman, 1977, p. 321)

Eckman defined markedness as follows: “A phenomenon is typologically more marked if its presence in a language implies the presence of another phenomenon; but the presence of the latter does not imply the presence of the former” (1977, pp. 320–321).

As argued, Dutch and Spanish not only differ concerning the overall perception of their rhythm, but also with respect to various phonotactic and prosodic properties that underlie this perceptual distinction. The MDH can therefore be applied on at least three levels: First, young children initially produce speech with a rhythm that has been classified as more syllable-timed, and only later acquire the rhythmic properties specific to their L1 (Allen & Hawkins, 1978; Bunta & Ingram, 2007; Grabe, Watson, & Post, 1999; Schmidt & Post, 2015). Most recently, Polyanskaya and Ordin (2015) investigated the attainment of rhythmic patterns by monolingual English children from 4–5 to 10–11 years old and adults. Their results corroborated earlier work and showed that the speech rhythm of the children developed from more syllable-timed to more stress-timed as language acquisition continued. As we know of no cases in which infants first produced a stress-timed rhythm (to later develop a syllable-timed speech rhythm if this is typical of their L1), we assume that a stress-timed rhythm implies a syllable-timed rhythm in an earlier developmental stage, but not vice versa, indicating that stress-timed rhythm is typologically more marked than syllable-timed rhythm.

Second, a similar reasoning is applicable to correlates of rhythm, such as syllable complexity (Prieto et al., 2012): The use of complex syllable structures such as CCCVCCC implies that simple syllable structures, such as CV, are also possible within a language (an example from Dutch being da-me, “lady”). However, the possibility of a CV syllable in Spanish does not imply that a syllable with CCCVCCC structure is also acceptable. From this it follows that the syllable structure of Dutch is more marked than the syllable structure of Spanish (also see Levelt & Van de Vijver, 2004; Ordin & Polyanskaya, 2015). Third, Dutch is also more marked than Spanish concerning lengthening effects, which are known to correlate with rhythm perception, as accentual and final lengthening are employed more extensively in Dutch than in Spanish. Lengthening implies a baseline that is not lengthened, but not vice versa: Not only does lengthening require more physiological effort than not lengthening (Ten Bosch, 1991), but the majority of all syllables in speech is not lengthened, whereas only a subset of the syllables is lengthened. This implies that the former is indeed the “norm” (less marked), while the latter is the “exception” (more marked). In sum, in all areas discussed, Dutch is arguably more marked than Spanish, which, according to the MDH, should make acquisition of these properties more difficult in Dutch than in Spanish. We therefore predict the following:

(2) Dutch learners of Spanish (DLS) are more successful at approaching their target rhythm than Spanish learners of Dutch (SLD).

Recently, the MDH has been used in two studies on L2 prosody acquisition: Rasier (2006), who applied it to the production of (de)accentuation patterns to signal focus in L2 French by L1 speakers of Dutch and L2 Dutch by L1 speakers of French, and Ordin and Polyanskaya (2015), who employed it in their analysis of L2 rhythm acquisition by German and French learners of English. Both reported that learners with an L1 background that is less marked than the target language (the L1 speakers of French who were learning Dutch, and the French learners of English, respectively) were less successful at attaining the L2 target than learners with an L1 background that is more marked than, or equally marked as, the target L2 (the Dutch learners of French and the German learners of English, respectively). In the next section, the collection and analysis of speech data by DLS, SLD, and L1 speakers of Spanish and Dutch is described, followed by a comparison of the two learner groups, by means of accentual and final lengthening measures.

METHOD

PARTICIPANTS

Seventy adults participated in our experiment: five L1 speakers of Dutch and five L1 speakers of Spanish, whose data serve as a baseline^{Footnote 4} to which the data of 30 DLS and 30 SLD are compared. All participants were raised in a monolingual environment and participated voluntarily (Table S1 in the Online Supplementary Materials contains those details about the speaker sample that are relevant to the experiment). The DLS were students of the Spanish program at the University of Groningen or Fuentes Academia de Español. The most proficient DLS were teachers at the Spanish Department of the Radboud University in Nijmegen or the University of Groningen. The SLD were students at the Escuela Oficial de Idiomas in Madrid or Barcelona, and the most proficient SLD were generally teachers at the Escuela Oficial de Idiomas. The L2 learners were subdivided into different proficiency groups, based on the proficiency levels of the Common European Framework of Reference for Languages, which distinguishes between six different proficiency levels ranging from A1 and A2 for beginners, to B1 and B2 for intermediate learners, and C1 and C2 for advanced speakers of an L2 (Council of Europe, 2001). Five speakers were recorded per proficiency level. The institutions already used these proficiency levels, which facilitated the process of determining the proficiency of our participants. Their level in this study corresponded to the level of the last course they had successfully completed. Participants were asked to self-evaluate their skills with respect to specific reading, writing, speaking, and listening proficiency, and the first author corroborated these self-evaluations with their teachers. In general, these were congruent with students’ overall level, with the productive skills being slightly more challenging than the receptive skills.

Because French is an obligatory subject in Dutch high schools, as is English in Spanish high schools, all participants had some knowledge of an additional West Germanic or Romance language. However, none of them spoke that language at a proficiency level higher than their target L2 proficiency level. We therefore assume that L2 learners were not influenced in their target rhythm production by other foreign languages from the same language family.

MATERIALS

Following Prieto et al. (2012), the stimuli consisted of 30 sentences per language: 5 sentences with predominantly open syllables (CV), 5 with mostly closed syllables (CVC), and 20 that reflected typical syllable structures in either Dutch or Spanish (Mixed). Consequently, syllable structure was controlled in one third of the stimuli. The Spanish CV and CVC sentences were taken from Prieto et al. (2012). The Dutch CV and CVC sentences were created by the authors to match the Spanish ones. The Mixed sentences were taken or adapted from Nazzi et al. (1998) and Prieto et al. (2012). The percentage of open syllables was 81.6% in the Dutch CV sentences and 91.9% in the Spanish CV sentences. In the Dutch CVC sentences 78.3% of the syllables were closed, while in the Spanish CVC sentences 59.0% were closed. In the Mixed sentences, 47.7% of the Dutch syllables had an open structure, in contrast to the Spanish Mixed sentences with 69.1% of all syllables open. Thus, the manipulation of syllable structure was realized as intended.^{Footnote 5} All sentences were matched for number of syllables (range: 12–19 syllables), although this may vary somewhat across individuals as a result of participant-specific pronunciation preferences. Sentences were also matched as closely as possible for orthographic words (Spanish M = 9.03, Dutch M = 9.63 per sentence) and prosodic words (Spanish M = 4.87, Dutch M = 5.26 per sentence). Infrequent words and complex sentence constructions were avoided where possible, to facilitate the task for L2 learners. Example (3) shows a stimulus sentence for each of the categories in Dutch and Spanish. The full stimulus set can be found in the Online Supplementary Materials.

(3a) CV syllable structure (16 syllables):
- D: De mama van Susana is een gezellige lerares
- S: La madre de Susana es una buena profesora
(3b) CVC syllable structure (15 syllables):
- D: De wedstrijd van de voetbalclub was niet in het sportcomplex
- S: El mitin del club de tenis no fue en el parking del club
(3c) Mixed syllable structure (16 syllables):
- D: De dader werd helaas bij gebrek aan bewijs vrijgesproken
- S: Reportan inundaciones graves en la primavera
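The open-syllable percentages reported for each condition can be reproduced from syllable-level CV coding. The following is a minimal sketch, not the authors' script: it assumes that each syllable is represented by a hypothetical skeleton string such as "CV" or "CVC", and that a syllable counts as open when its skeleton ends in a vowel.

```python
def is_open(skeleton: str) -> bool:
    """A syllable is treated as open when its CV skeleton ends in a vowel (V)."""
    return skeleton.endswith("V")


def percent_open(skeletons: list[str]) -> float:
    """Percentage of open syllables in a list of CV skeletons."""
    if not skeletons:
        raise ValueError("need at least one syllable")
    return 100.0 * sum(is_open(s) for s in skeletons) / len(skeletons)


# Hypothetical skeletons for a short stretch of speech (illustration only,
# not taken from the stimulus set).
example = ["CV", "CV", "CVC", "CV", "CCV", "CVC", "V", "CV"]
print(round(percent_open(example), 1))  # 6 of 8 syllables are open -> 75.0
```

Applied to all syllables of a condition, the same function would give the kind of figure quoted in the text; the percentage of closed syllables is simply 100 minus this value.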

PROCEDURE

Experimental sessions were performed individually and lasted approximately 10 minutes for the L1 speakers, and 20 minutes for the DLS and SLD, who performed the task in both their L1 and L2. The order in which the L2 learners performed these tasks was randomized across participants. The recordings, made with Praat (Boersma & Weenink, 2015) and the internal microphone of an Apple MacBook Pro, took place in a quiet room. Participants were instructed to read the sentences at a normal, comfortable pace from the laptop screen, and to repeat the sentence if there were hesitations or other irregularities in their speech, continuing to the next sentence at their own convenience. While a higher L2 proficiency level generally entailed fewer repetitions, this method ensured very few disfluencies in the speech by L2 learners of all proficiency levels. The few pauses and/or disfluencies that were unavoidable in the recording of the data were excluded from measurement on a syllable basis and are not included in data analysis. L2 learners could ask for translations of words and sentences, but the experiment leader did not provide phonetic coaching and refrained from pronouncing the target words herself. Participants filled in a questionnaire to verify that they met the requirements of each language group concerning L1/L2, proficiency, experience in countries where the target L2 is spoken, age, and gender, and to ensure that none of the participants had dyslexia or visual problems, which might influence their reading performance.

PROSODIC ANALYSIS

The audio recordings were analyzed prosodically in Praat: Each utterance was segmented into words, syllables, and phonemes. Segmental annotation for all utterances was first performed automatically using Praatalign, version 1.9b (Lubbers & Torreira, 2015). Subsequent segmentation and coding was performed manually by the first author, a trained phonetician who is an L1 speaker of Dutch and proficient in Spanish. Manual correction of the preprocessed speech was done by visual inspection of the speech waveforms and wideband spectrograms following standard criteria (see Peterson & Lehiste, 1960; Prieto et al., 2012; White & Mattys, 2007a).

In two additional tiers, segments were coded as consonants or vowels, and syllable boundaries were placed. For Spanish, these boundaries were positioned following Prieto et al. (2012): Prevocalic glides were coded as part of the preceding consonantal interval, and postvocalic glides as part of the preceding vocalic interval (e.g., the first syllable of buena was treated as CCV; the first syllable of Ceilán as CVV). Furthermore, CV structures were maintained whenever possible and a CV resyllabification process occurred across word boundaries. Following Schiller, Meyer, Baayen, and Levelt (1996), resyllabification also took place in the Dutch utterances, after taking into account phonological rules such as final devoicing (“aard” /ard/ becomes [art]), degemination (“komen naar” /kɔmən nar/ becomes [kɔmənar]), as well as final –n deletion after a schwa, and progressive voice assimilation (“uitvallen” /œytvɑlən/ becomes [œytfɑlə]).

To analyze final lengthening effects, each syllable was marked for its phrasal position as either non-final, intermediate phrase (ip) final or intonational phrase (IP) final following the procedure described in Prieto et al. (2012). The criterion for an IP break was a pause of at least 200 milliseconds, while a break of less than 200 milliseconds and a continuation rise characterized an ip boundary. The non-final syllables were then taken as a baseline condition to which the length of ip-final and IP-final syllables was compared. Prosodic prominence was also annotated, distinguishing between unstressed and unaccented, stressed and accented, and stressed and nuclear accented syllables. In this case, unstressed and unaccented syllables correspond to the baseline to which stressed and accented, and stressed and nuclear accented syllables were compared.^{Footnote 6} Figure 1 illustrates the orthographic, segmental, and prosodic transcription of the Spanish utterance La madre de Susana es una buena profesora (“Susana’s mother is a good teacher”) produced by an L1 speaker of Spanish. The first tier contains the orthographic transcription, the second one the phonetic segmentation, and the third the consonant/vowel coding. In the fourth tier, syllabic segmentation and syllable structure are depicted, and in the two final tiers prominence and phrasal position are coded.^{Footnote 7} In total, 2,100 utterances were collected (5 speakers × 30 utterances × 14 language groups), resulting in 35,808 analyzed syllables and 48,068 analyzed segments.
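The boundary criterion and the baseline comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not the annotation procedure itself (which was carried out manually in Praat): the function names and the sample values are hypothetical, a continuation rise with a pause shorter than 200 ms (including no measurable pause) is assumed to signal an ip boundary, and the lengthening ratio (mean duration in a condition divided by the mean baseline duration) is a descriptive summary rather than the statistical model used in the Results.

```python
from statistics import mean

IP_PAUSE_MS = 200.0  # pause threshold for an IP break, as stated in the text


def phrasal_position(pause_ms: float, continuation_rise: bool) -> str:
    """Label a syllable by the boundary that follows it.

    Assumption: a continuation rise with a pause shorter than 200 ms
    (including no measurable pause) counts as an ip boundary.
    """
    if pause_ms >= IP_PAUSE_MS:
        return "IP-final"
    if continuation_rise:
        return "ip-final"
    return "non-final"


def lengthening_ratio(durations_ms, labels, target, baseline="non-final"):
    """Mean duration of `target` syllables divided by the baseline mean."""
    base = [d for d, lab in zip(durations_ms, labels) if lab == baseline]
    targ = [d for d, lab in zip(durations_ms, labels) if lab == target]
    if not base or not targ:
        raise ValueError("both categories need at least one syllable")
    return mean(targ) / mean(base)


# Hypothetical syllables: (duration in ms, following pause in ms, continuation rise?)
syllables = [(100, 0, False), (120, 0, False), (150, 80, True),
             (110, 0, False), (200, 350, False)]
labels = [phrasal_position(p, r) for _, p, r in syllables]
durations = [d for d, _, _ in syllables]
print(labels)
print(lengthening_ratio(durations, labels, "IP-final"))
```

A ratio above 1 indicates that syllables in the target position are longer than the non-final baseline; the same function could be applied to prominence labels by passing the unstressed and unaccented category as the baseline.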

FIGURE 1. Waveform, spectrogram, F0 contour, and labeling scheme used for the Spanish utterance La madre de Susana es una buena profesora, “Susana’s mother is a good teacher.”

Intertranscriber reliability of the prosodic coding was tested with 10% (105 Dutch and 105 Spanish utterances) of our data. These utterances were randomly selected by the first author, who ensured that they equally represented all language groups, speakers, and phonotactic conditions. After discussing several examples with the first author, two transcribers (one L1 speaker of Dutch and one L1 speaker of Spanish) independently labeled the utterances for phrasal position and phrasal prominence using the guidelines provided in this section. A comparison of the prosodic transcription across the two transcribers per language revealed a high interrater reliability both in phrasal prominence and phrasal position labeling. Agreement on the choice of phrasing level was high: 99.1% consistency for Dutch (κ = .974) and 93.4% for Spanish (κ = .785). Similarly, agreement on the choice of phrasal prominence levels was 97.8% for Dutch (κ = .956) and 88.5% for Spanish (κ = .754). This is comparable to interrater reliability scores in similar studies using prosodic labeling (Prieto et al., 2012), indicating that both prosodic features were labeled reliably (Landis & Koch, 1977).
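To make the two reliability figures concrete, the sketch below computes percentage agreement and a kappa coefficient for two label sequences. It assumes that κ refers to Cohen's kappa for two raters, which this section does not state explicitly; the label sequences are invented for illustration and are not data from the study.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Proportion of items on which the two transcribers chose the same label."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    p_observed = percent_agreement(labels_a, labels_b)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    # Chance agreement: product of the raters' marginal proportions per label.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical phrasal position codes from two transcribers (not study data).
transcriber_1 = [0, 0, 2, 3, 0, 2]
transcriber_2 = [0, 0, 2, 3, 2, 2]
print(percent_agreement(transcriber_1, transcriber_2))
print(cohens_kappa(transcriber_1, transcriber_2))
```

Because kappa discounts agreement expected by chance, it is always lower than raw agreement when one label dominates, which is consistent with the gap between the percentages and the κ values reported above.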

RESULTS

In what follows, we first compare syllable duration data by L1 Dutch and L1 Spanish as a function of prosodic prominence and phrasal position to form a baseline against which we subsequently compare the DLS and SLD. All analyses are performed using a Generalized Linear Mixed Model (GLMM). Specific response variables and fixed factors are described in the relevant sections, but for all analyses subjects and items were included as random factors, including random intercepts and random slopes for fixed effects and their interaction (Barr, Levy, Scheepers, & Tily, 2013). Pairwise comparisons that explain main effects and interactions were Bonferroni adjusted.
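As a reminder of what a Bonferroni adjustment does to the p-values of pairwise comparisons, the following minimal sketch multiplies each raw p-value by the number of comparisons and caps the result at 1. This is one common way of reporting the adjustment and is an assumption here, because the statistical software used is not named in this section; the input values are invented.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values, capped at 1.0."""
    m = len(p_values)  # number of pairwise comparisons in the family
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three pairwise comparisons (not study data).
raw = [0.01, 0.04, 0.5]
print(bonferroni_adjust(raw))
```

The cap explains how an adjusted value of exactly p = 1.000 can arise: any raw p-value of at least 1/m reaches the ceiling after multiplication.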

L1 SPANISH VERSUS L1 DUTCH

A GLMM analysis was performed with syllable duration in seconds as the response variable, and Language Group (two levels: L1 Dutch, L1 Spanish), Syllable Structure (three levels: CV, CVC, Mixed), Phrasal Prominence (three levels: unstressed and unaccented, stressed and accented, stressed and nuclear accented), and Phrasal Position (three levels: non-final, ip-final, IP-final) as fixed factors. The analysis reveals significant main effects for all fixed factors and significant interactions for all relevant combinations, except the interaction between Language Group and Phrasal Position (see Appendix Table A1 for all potential main effects and interactions).

Pairwise comparisons between the three Phrasal Prominence conditions within each L1 group reveal that in both L1 Spanish and L1 Dutch increasing prominence of the syllable entails longer syllable durations. As shown in Figure 2, in L1 Spanish, all Phrasal Prominence levels differ significantly from one another (p < .001), whereas in L1 Dutch, both stressed and accented syllables, and stressed and nuclear accented syllables are significantly longer than unstressed and unaccented syllables (p < .001), but the syllable durations of stressed and accented syllables do not differ significantly from those of nuclear accented syllables (p = .099). Pairwise comparisons between Language Groups within Phrasal Prominence levels reveal that L1 Dutch and L1 Spanish have similar default syllable lengths for unstressed and unaccented syllables (p = .652), but they differ significantly from each other for the other two Phrasal Prominence levels (p < .001) as syllables are lengthened more extensively in L1 Dutch than in L1 Spanish. This confirms prior research on the degree of accentual lengthening used in both languages (Cambier-Langeveld & Turk, 1999; Cambier-Langeveld et al., 1997; Delattre, 1966; Prieto et al., 2012).

FIGURE 2. Mean syllable duration (in milliseconds) in L1 Spanish and L1 Dutch, separated by Phrasal Prominence condition. All sentences.

Controlling for Syllable Structure by examining the CV sentences only does not generate substantial differences to this pattern (see Appendix Table A2 for mean syllable durations per Phrasal Prominence condition and Language Group for both CV and all sentences). The only difference is that the values are lower for both Language Groups in the CV condition than in the complete dataset, which can be explained by the fact that in CVC and Mixed sentences syllables are usually longer, due to their more complex syllable structure. As shown in Figure 3, pairwise comparisons between Language Groups within each Phrasal Prominence level for CV sentences only reveal a similar pattern as the one found for all sentences: L1 Dutch and L1 Spanish have comparable syllable lengths for unstressed and unaccented syllables (p = .205), but they differ significantly from each other for the other two Phrasal Prominence levels (stressed and accented syllables: p = .003, stressed and nuclear accented syllables: p = .009).

FIGURE 3. Mean syllable duration (in milliseconds) in L1 Spanish and L1 Dutch, separated by Phrasal Prominence condition. CV sentences only.

Regarding final lengthening, pairwise comparisons between the three Phrasal Position conditions within L1 groups show that for both Language Groups syllable durations increase significantly with respect to the baseline when the phrasal position of a syllable precedes an ip or IP boundary (see Figure 4, p < .001 for all comparisons). Furthermore, in both Language Groups the ip-final and IP-final syllables do not differ significantly from each other (Dutch: p = .863, Spanish: p = .374). Pairwise comparisons between Language Groups within each Phrasal Position condition show that L1 Dutch and L1 Spanish differ significantly from each other for all Phrasal Position conditions (non-final and IP-final syllables: p < .001, ip-final syllables: p = .003). This could again be because syllables are longer in L1 Dutch in general, even in non-final position, due to its more complex syllable structure.
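The comparison of boundary-adjacent syllables with the non-final baseline can be illustrated descriptively by averaging durations per Phrasal Position condition and expressing each mean relative to the baseline. This is only a sketch with invented durations: the analyses reported here rely on the GLMM, not on simple ratios, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_duration_by_condition(syllables):
    """Average syllable duration (in seconds) per condition label."""
    grouped = defaultdict(list)
    for condition, duration in syllables:
        grouped[condition].append(duration)
    return {condition: mean(values) for condition, values in grouped.items()}


def lengthening_ratios(means, baseline="non-final"):
    """Express each condition mean as a multiple of the baseline mean."""
    base = means[baseline]
    return {condition: value / base for condition, value in means.items()}


# Hypothetical (condition, duration in seconds) pairs, not study data.
data = [
    ("non-final", 0.150),
    ("non-final", 0.170),
    ("ip-final", 0.240),
    ("IP-final", 0.256),
]
means = mean_duration_by_condition(data)
print(lengthening_ratios(means))
```

A ratio above 1 for the ip-final and IP-final conditions corresponds to final lengthening relative to the non-final baseline.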

FIGURE 4. Mean syllable duration (in milliseconds) in L1 Spanish and L1 Dutch, separated by Phrasal Position condition. All sentences.

Controlling for Syllable Structure by examining pairwise comparisons between Phrasal Position conditions within both Language Groups for the CV sentences only reveals a comparable pattern: In both L1s, non-final syllables are significantly shorter than ip-final and IP-final syllables (p < .001 for both), while there is no significant difference between ip-final and IP-final syllables (Dutch: p = .908, Spanish: p = .434). Pairwise comparisons show that speakers of L1 Dutch and L1 Spanish still differ significantly from each other in the non-final and IP-final conditions (non-final syllables: p = .007, IP-final syllables: p = .021), but the difference between the two L1s is not significant in the ip-final condition (p = .313) (see Figure 5). Appendix Table A3 contains the mean syllable durations per Phrasal Position and Language Group for all sentences and CV sentences only.

FIGURE 5. Mean syllable duration (in milliseconds) in L1 Spanish and L1 Dutch, separated by Phrasal Position condition. CV sentences only.

The significant interaction effect between Phrasal Position and Phrasal Prominence on syllable durations was further explored by examining the mean syllable durations per Language Group for all Phrasal Position and Phrasal Prominence combinations. Table 1 shows that both factors interact systematically: Within each Phrasal Position condition accentual lengthening effects increase as syllables are more prominent in the sentences, while increasing syllable durations are also observed between Phrasal Position conditions.

TABLE 1. Mean syllable durations in seconds (standard error) for all utterances by speakers of L1 Dutch and L1 Spanish, separated per Phrasal Position and Phrasal Prominence combination (N = 10)

L1 SPANISH VERSUS DLS

To compare the DLS to their target L1 group, a GLMM analysis was performed with syllable duration as the response variable, and Language Group (seven levels: L1 Spanish, DLS_A1, DLS_A2, DLS_B1, DLS_B2, DLS_C1, and DLS_C2), Syllable Structure (see analysis L1 speakers), Phrasal Prominence (see analysis L1 speakers), and Phrasal Position (see analysis L1 speakers) as fixed factors. The analysis reveals significant main effects for all fixed factors and significant interactions for all relevant combinations, except for the interaction between Language Group, Syllable Structure, and Phrasal Position (see Appendix Table A4 for all potential main effects and interactions).

Pairwise comparisons between all Language Groups overall reveal that the DLS gradually approach target syllable durations as their proficiency increases. The most proficient group, DLS_C2, no longer differs significantly from the target L1 Spanish (p = .735), while all other DLS groups still do (DLS_C1 and DLS_B2: p = .001, DLS_B1, DLS_A2, and DLS_A1: p < .001). This implies that while the DLS_C2 learners have attained a nativelike level in their L2, learners of all other levels still differ significantly from their target. Controlling for syllable structure by comparing the DLS to the L1 Spanish within the CV condition reveals that the effect of Language Group is partially dependent on syllable structure: Within the CV condition the L1 Spanish values are not only comparable to those of the DLS_C2 (p = 1.000), but also to those of the DLS_C1 (p = .116) and DLS_B2 (p = .064). To examine whether the DLS approach L1 values similarly for accentual and final lengthening, pairwise comparisons between Language Groups within prominence and finality conditions were performed.

Regarding Phrasal Prominence, the results show that within all Phrasal Prominence conditions the DLS_C2 are not significantly different from the L1 Spanish. Contrary to the L1 data, this effect appears susceptible to the syllable structure of the sentence in speech by L2 learners, as making the same comparisons within the CV condition reveals that the three highest proficiency levels are comparable to the L1 target (see Table 2 and Figures 6 and 7). Examination of the syllable durations of the different Phrasal Prominence conditions within all Language Groups reveals that all DLS groups show a similar pattern to the L1 Spanish, in which syllable durations are longer as syllables are more prominent within an utterance (see Appendix Table A5).

TABLE 2. p-values of pairwise comparisons between L1 Spanish (N = 5) and all DLS groups (N = 30) for Phrasal Prominence, separated by syllable structure

FIGURE 6. Mean syllable duration (in milliseconds) in L1 Spanish and DLS of all proficiency levels, separated by Phrasal Prominence condition. All sentences.

FIGURE 7. Mean syllable duration (in milliseconds) in L1 Spanish and DLS of all proficiency levels, separated by Phrasal Prominence condition. CV sentences only.

Concerning final lengthening, pairwise comparisons show that for non-final syllables the DLS_C2 and DLS_C1 are not significantly different from the L1 Spanish, and for ip-final and IP-final the DLS_C2 are not significantly different from the target L1 Spanish (see Table 3). This effect is once again influenced by syllable structure, as making the same comparisons within the CV condition reveals that the three highest proficiency levels are comparable to the L1 target.

TABLE 3. p-values of pairwise comparisons between L1 Spanish (N = 5) and all DLS groups (N = 30) for Phrasal Position, separated by syllable structure

Examination of syllable durations for the different Phrasal Position conditions within each Language Group reveals that the three most proficient DLS groups show a similar pattern as the L1 Spanish in which syllable durations are longer when syllables precede a prosodic boundary. Conversely, the values of the three lowest proficiency groups coincide more with the L1 Dutch, corroborating the presence of transfer effects in L2 rhythm acquisition (see Figures 8 and 9, and Appendix Table A6).

FIGURE 8. Mean syllable duration (in milliseconds) in L1 Spanish and DLS of all proficiency levels, separated by Phrasal Position condition. All sentences.

FIGURE 9. Mean syllable duration (in milliseconds) in L1 Spanish and DLS of all proficiency levels, separated by Phrasal Position condition. CV sentences only.

Finally, the joint effect of Phrasal Position and Phrasal Prominence on syllable durations is examined by inspecting the mean syllable durations for all Phrasal Prominence conditions within the separate Phrasal Position conditions. This reveals that both factors interact systematically within each Language Group: Within each Phrasal Position condition accentual lengthening increases as syllables are more prominent in the utterance, while increasing syllable durations are also observed between Phrasal Position conditions (see Appendix Table A7).

L1 DUTCH VERSUS SLD

To compare the SLD to the L1 speakers of Dutch, a GLMM analysis was performed with syllable duration as the response variable, and Language Group (seven levels: L1 Dutch, SLD_A1, SLD_A2, SLD_B1, SLD_B2, SLD_C1, and SLD_C2), Syllable Structure (see analysis L1 speakers), Phrasal Prominence (see analysis L1 speakers), and Phrasal Position (see analysis L1 speakers) as fixed factors. The analysis reveals significant main effects for all fixed factors and significant interactions for all relevant combinations, except for the interaction between Language Group and Phrasal Prominence (see Appendix Table A8 for all main effects and interactions). Pairwise comparisons between Language Groups overall reveal that although the SLD progressively approach target syllable durations as their proficiency increases, all the SLD groups still differ significantly from the L1 Dutch (p-values from p < .001 to p = .028). Crucially, this appears completely due to the syllable structure of the utterances because when comparing the SLD to the L1 Dutch within the CV condition, the L1 Dutch values do not differ significantly from the SLD values for all proficiency levels (p-values from p = .089 to p = 1.000).

Turning to Phrasal Prominence first, the results show that all the SLD groups differ significantly from the target L1 Dutch for both unstressed and unaccented syllables and stressed and nuclear accented syllables (see Table 4). In the stressed and accented condition, only the SLD_A1 group differs significantly from the L1 Dutch. However, this effect is highly susceptible to the syllable structure of the utterance, as making the same comparisons within the CV condition reveals that all the SLD groups for all Phrasal Prominence conditions are comparable to the L1 target.

TABLE 4. p-values of pairwise comparisons between the L1 Dutch (N = 5) and all SLD groups (N = 30) for Phrasal Prominence, separated by syllable structure

Examination of the syllable durations of the different Phrasal Prominence conditions within all Language Groups reveals that both SLD and L1 Dutch show a similar pattern in which syllable durations are longer as syllables are more prominent within an utterance (see Figures 10 and 11, and Appendix Table A9).

FIGURE 10. Mean syllable duration (in milliseconds) in L1 Dutch and SLD of all proficiency levels, separated by Phrasal Prominence condition. All sentences.

FIGURE 11. Mean syllable duration (in milliseconds) in L1 Dutch and SLD of all proficiency levels, separated by Phrasal Prominence condition. CV sentences only.

Regarding final lengthening, pairwise comparisons between the different Language Groups within the three Phrasal Position conditions show that for non-final syllables all SLD groups differ significantly from the L1 Dutch (see Table 5). However, for ip-final syllables the SLD_B1, SLD_B2, and SLD_C2 groups are comparable to the L1 Dutch and for the IP-final syllables this is the case for the SLD_B2 and SLD_C1 groups. This effect is again largely due to syllable structure, as identical comparisons in the CV condition reveal that almost all SLD groups are no longer significantly different from the L1 Dutch.

TABLE 5. p-values of pairwise comparisons between the L1 Dutch (N = 5) and all SLD groups (N = 30) for Phrasal Position, separated by syllable structure

Examination of the syllable durations of the different Phrasal Position conditions within each Language Group reveals that all SLD groups show a similar pattern as the L1 Dutch, in which syllable durations are longer when syllables precede a boundary, either within an utterance or at its end (see Figures 12 and 13, and Appendix Table A10).

FIGURE 12. Mean syllable duration (in milliseconds) in L1 Dutch and SLD of all proficiency levels, separated by Phrasal Position condition. All sentences.

FIGURE 13. Mean syllable duration (in milliseconds) in L1 Dutch and SLD of all proficiency levels, separated by Phrasal Position condition. CV sentences only.

Examining the joint effect of Phrasal Position and Phrasal Prominence on syllable durations by inspection of the mean syllable durations for all Phrasal Prominence conditions within the separate Phrasal Position conditions reveals that both factors interact systematically within each Language Group: Within each Phrasal Position condition accentual lengthening increases as syllables are more prominent in the utterance, while increasing syllable durations are also shown between Phrasal Position conditions (see Appendix Table A11). Accentual lengthening and final lengthening appear to contribute equally to the differences found between the L1 Dutch and the different SLD groups, especially when controlling for syllable structure. When only analyzing the CV sentences, all SLD groups appear to be fully on target in their syllable duration production; however, when diversifying syllable structure (consequently making it more typical of L1 Dutch), it becomes rather more difficult to discern a logical pattern in the SLD productions.

DISCUSSION AND CONCLUSION

The current study investigated whether L2 learning direction affects the successful attainment of speech rhythm by DLS and SLD. Based on the MDH, we hypothesized that DLS would be more successful at approaching their target than SLD because rhythm as a whole, and its correlates syllable structure and lengthening effects, is more marked in Dutch than in Spanish. Overall, our results indeed show that learning direction influences L2 rhythm acquisition: Our analyses reveal a different development for DLS than SLD. Comparing the two groups, we can conclude that DLS show a more systematic development toward their target, and more successful attainment of an overall rhythm pattern that coincides with the one produced by L1 Spanish speakers. Thus, our results support our hypothesis and corroborate prior work (Ordin & Polyanskaya, 2015; Rasier, 2006).

However, our results do not allow for a complete disentanglement between speech rhythm and syllable structure complexity: Our lengthening analyses revealed different acquisition processes for DLS and SLD. The DLS systematically approach L1 values in all lengthening conditions until attaining targetlike values, generally at the highest proficiency level for all sentences, and at an intermediate to advanced level for the CV sentences only. Conversely, SLD of all proficiency levels are completely on target in the CV sentences only but show no systematic attainment in the analyses including all sentences. Therefore, it seems unlikely that the nonsignificant difference between the L1 Dutch and the least proficient SLD in the CV sentences is completely due to a perfectly produced speech rhythm by the SLD. Not only do these results show that learning direction influences L2 development, they also suggest that rhythm acquisition by SLD is substantially affected by their difficulties in producing utterances with more complex and/or closed syllable structures: When syllables are more complex and predominantly closed, the SLD are unable to reach target syllable durations, yet when syllables are predominantly open and have a simple CV structure, target patterns appear attainable. In this sense, L2 rhythm acquisition resembles L1 rhythm development in which physical output constraints related to consonant (cluster) production also affect targetlike rhythm production (Ordin & Polyanskaya, 2014; Payne, Post, Prieto, Vanrell, & Astruc, 2012).

Similar to Li and Post (2014), our study shows that L2 rhythm acquisition (like L1 rhythm acquisition, see Post & Payne, 2018) is a multisystemic process that requires the simultaneous attainment of several language-specific features, both phonotactic and prosodic. Crucially, depending on the learning direction, some of these features may be more challenging than others. Gradient properties, such as accentual and final lengthening, seem challenging for both DLS and SLD. Yet other, more categorical, characteristics (e.g., syllable structure constraints) appear substantially more difficult to acquire for SLD than DLS. In addition, between-speaker variability may also influence the acquisition process. While we matched our participants to the best of our ability based on their language experience and proficiency and included subject as a random factor in our statistical analyses, individual differences in the L2 acquisition process tend to be substantial (Ellis, 1994), and some factors, such as motivation and language aptitude, could not be considered. Especially studies on the multisystemic nature of L2 prosody acquisition would benefit from careful participant selection, as variation across individual speakers might occur in all “systems” and thus be magnified even more. Our study thus reinforces the need for L2 acquisition theories and models that allow for predictions based on the multisystemic nature of L2 prosody acquisition and that accommodate the inclusion of learning direction, as well as other speaker-based characteristics, as a relevant factor.

Moreover, our results are relevant pedagogically, as they demonstrate that adequate segment production is a prerequisite for successful rhythm attainment. The acquisition of suprasegmentals is often overlooked in educational programs because they are difficult to manipulate consciously and highly context dependent. Conversely, the correct pronunciation of segments usually receives considerable attention. On its own, this might not be a bad practice, as the current research suggests that training in this area may also lead to more successful rhythm production. Interestingly, recent work by Polyanskaya, Ordin, and Busà (2017) suggests that the relative contribution of segmental characteristics and timing patterns to the assessment of accentedness differs as a function of the proficiency level of the L2 learner. In other words, while the incorrect pronunciation of segmental properties might contribute more to accentedness in speech produced by less proficient L2 learners, deviance in speech rate and rhythmic patterns could become more salient as L2 learners become more proficient. Future research might therefore be dedicated to production studies investigating this further, as well as to perception studies that may confirm both the effect of deviance in different phonetic areas and in speech by learners of different proficiency levels on judgments by L1 speakers.

Future research could also address the effect of segmental pronunciation training on rhythm acquisition in different developmental stages, in addition to the effect of learning direction for other prosodic features, like lexical stress. Furthermore, because rhythm is related to several language-specific features, the current study could be extended by similar analyses for different L1-L2 combinations. Aside from follow-up studies on L2 production, the effect of (in)correct L2 rhythm production, perhaps in combination with other prosodic features, on L1 perception might be investigated.

SUPPLEMENTARY MATERIAL

To view supplementary material for this article, please visit https://doi.org/10.1017/S0272263118000062

APPENDIX

TABLE A1. Lengthening: Overview of relevant main effects and interactions for L1 speakers (N = 10)

TABLE A2. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Dutch (N = 5) and Spanish (N = 5), separated per Phrasal Prominence condition, for all sentences and CV sentences only

TABLE A3. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Dutch (N = 5) and Spanish (N = 5), separated per Phrasal Position condition, for all sentences and CV sentences only

TABLE A4. Lengthening: Overview of relevant main effects and interactions for DLS (N = 30) in comparison to L1 speakers of Spanish (N = 5)

TABLE A5. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Spanish (N = 5) and DLS of all proficiency levels (N = 30), separated per Phrasal Prominence condition, all sentences

TABLE A6. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Spanish (N = 5) and DLS of all proficiency levels (N = 30), separated per Phrasal Position condition, all sentences

TABLE A7. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Spanish (N = 5) and DLS of all proficiency levels (N = 30), separated per Phrasal Position and Phrasal Prominence combination, all sentences

TABLE A8. Lengthening: Overview of relevant main effects and interactions for SLD (N = 30) in comparison to L1 speakers of Dutch (N = 5)

TABLE A9. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Dutch (N = 5) and SLD of all proficiency levels (N = 30), separated per Phrasal Prominence condition, all sentences

TABLE A10. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Dutch (N = 5) and SLD of all proficiency levels (N = 30), separated per Phrasal Position condition, all sentences

TABLE A11. Mean syllable durations (standard deviations) in seconds produced by L1 speakers of Dutch (N = 5) and SLD of all proficiency levels (N = 30), separated per Phrasal Position and Phrasal Prominence combination, all sentences

Footnotes

This research was made possible through grants from the Prins Bernhard Cultuurfonds (40005750/HEV/ILE), the Jo Kolk fund, the Grup de Recerca Consolidat 2017 (SGR _ 971) and Spanish Ministry of Economy and Competitiveness (FFI2015-6653). We thank Gerdientje Oggel, Johanna Sattler, Arthur Verbiest, and Astrid van Winden for their help during data collection. We gratefully acknowledge Joan Borràs-Comes, Elena Kireva, Mart Lubbers, and Paolo Mairano for their help with data analyses, and Núria Esteve-Gibert and Constantijn Kaland for coding part of the data to test coding reliability. Finally, thanks are due to three anonymous reviewers for their helpful input and feedback on the content of this manuscript. Any errors that remain are our own.

Preliminary versions of the first two studies of this paper were presented at the Speech Prosody conference in May 2016 in Boston and at the New Sounds conference in June 2016 in Aarhus, Denmark, respectively. The current paper includes a more detailed theoretical background and description of the experimental methods, more extensive discussions of the results, and more advanced statistical analyses over the complete dataset.

¹ Dauer (1983) proposes a continuum ranging from less to more stress-timed. To maintain the terminology used in previous studies on speech rhythm, we will continue to define the end points of a rhythmic continuum as “syllable-timed” and “stress-timed” and use these labels to categorize languages upon the continuum, though we agree that stress-timedness is a gradient, not categorical, feature.

² For a discussion of the suitability of rhythm metrics for this purpose, see Arvaniti (2009, 2012) and Wiget et al. (2010). The latter also present useful recommendations for researchers concerning the use of rhythm metrics in empirical studies.

³ Dutch also makes extensive use of vowel reduction (Koopmans-Van Beinum, 1980), while Spanish does so very little (Delattre, 1969; Hualde, 2005).

⁴ While comparing the L1 and L2 data of the DLS and SLD might be felicitous in minimizing the effect of individual variability, the L1 data of the L2 learners could not serve as a baseline in the current study as it has been shown that prosodic transfer from the L2 to the L1 occurs in advanced L2 learners (e.g., Mennen, 2004; Van Maastricht et al., 2016), while it is unknown whether it also occurs in less proficient speakers. To make equal comparisons between all L2 learners and a typical target baseline, it was deemed more suitable to use typical speakers of the L1.

⁵ This was further investigated by performing two chi-square analyses (one per language), which show that the number of open syllables differs significantly between the three syllable structure conditions, χ² (2, N = 566) = 53.17, p < .001 for Dutch, and χ² (2, N = 497) = 96.66, p < .001, for Spanish.

⁶ While some studies, like the current research, prefer to measure accentual and final lengthening separately, others used one combined lengthening measure (e.g., Li & Post, 2014).

⁷ To facilitate coding, the last two tiers were coded numerically. In the fifth tier, containing the phrasal position coding, “0” stands for “non-final,” “2” stands for “ip-final,” and “3” corresponds to “IP-final” syllables. In the sixth tier, which contains the phrasal prominence coding, “0” stands for “unaccented and unstressed” syllables, “2” corresponds to “stressed and accented,” and “3” to “stressed and nuclear accented” syllables.
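The numeric codes in the fifth and sixth tiers can be mapped back to their category labels with two lookup tables. The sketch below uses exactly the codes listed in this footnote; the dictionary and function names are hypothetical and serve only as an illustration of how such annotations might be decoded before analysis.

```python
# Codes as described in the footnote: fifth tier = phrasal position,
# sixth tier = phrasal prominence. Names are illustrative assumptions.
PHRASAL_POSITION = {
    0: "non-final",
    2: "ip-final",
    3: "IP-final",
}
PHRASAL_PROMINENCE = {
    0: "unaccented and unstressed",
    2: "stressed and accented",
    3: "stressed and nuclear accented",
}


def decode_syllable(position_code, prominence_code):
    """Translate the two numeric tier codes of a syllable into labels."""
    return PHRASAL_POSITION[position_code], PHRASAL_PROMINENCE[prominence_code]


print(decode_syllable(3, 3))
print(decode_syllable(0, 2))
```

An unknown code raises a KeyError, which helps to catch typing errors in the annotation tiers early.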

References

Abercrombie, D. (1967). Elements of general phonetics. Edinburgh, UK: Edinburgh University Press.Google Scholar

Allen, G. D., & Hawkins, S. (1978). The development of phonological rhythm. In Bell, A. & Hooper, J. (Eds.), Syllables and segments (pp. 172–185). Amsterdam, The Netherlands: North-Holland Publishing.Google Scholar

Archibald, J. (1994). A formal model of learning L2 prosodic phonology. Second Language Research, 10, 215–240.CrossRef Google Scholar

Arvaniti, A. (2009). Rhythm, timing and the timing of rhythm. Phonetica, 66, 46–63.CrossRef Google Scholar PubMed

Arvaniti, A. (2012). The usefulness of metrics in the quantification of speech rhythm. Journal of Phonetics, 40, 351–373.CrossRef Google Scholar

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278.CrossRef Google Scholar PubMed

Beckman, M., & Edwards, J. (1994). Articulatory evidence for differentiating stress categories, phonological structure and phonetic form. In Keating, P. A. (Ed.), Papers in laboratory phonology III (pp. 7–33). Cambridge, UK: Cambridge University Press.Google Scholar

Best, C. T. (1995). A direct-realist view of cross-language speech perception. In Strange, W. (Ed.), Speech perception and linguistic experience: Theoretical and methodological issues (pp. 171–204). Timonium, MD: York Press.Google Scholar

Boersma, P., & Weenink, D. (2015). Praat: Doing phonetics by computer (Version 6.0.04) [Computer software]. Retrieved from http://www.fon.hum.uva.nl/praat/

Bolinger, D. L. (1965). Pitch accent and sentence rhythm. In Abe, I. & Kanekiyo, T. (Eds.), Forms of English: Accent, morpheme, order (pp. 139–180). Cambridge, MA: Harvard University Press.

Booij, G. (1995). The phonology of Dutch. Oxford, UK: Clarendon Press.

Bunta, F., & Ingram, D. (2007). The acquisition of speech rhythm by bilingual Spanish- and English-speaking 4- and 5-year-old children. Journal of Speech, Language, and Hearing Research, 50, 999–1014.

Byrd, D. (2000). Articulatory vowel lengthening and coordination at phrasal junctures. Phonetica, 57, 3–16.

Cambier-Langeveld, T., & Turk, A. E. (1999). A cross-linguistic study of accentual lengthening: Dutch vs. English. Journal of Phonetics, 27, 255–280.

Cambier-Langeveld, T., Nespor, M., & van Heuven, V. J. (1997). The domain of final lengthening in production and perception in Dutch. Proceedings of the European Conference on Speech Communication and Technology, 2, 931–934.

Carter, P. M. (2005). Quantifying rhythmic differences between Spanish, English, and Hispanic English. In Gess, R. S. & Rudin, E. (Eds.), Theoretical and experimental approaches to Romance linguistics: Selected papers from the 34th linguistic symposium on romance languages, Salt Lake City, March 2004 (pp. 63–75). Amsterdam, The Netherlands: John Benjamins Publishing Company.

Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge, UK: Cambridge University Press.

Dauer, R. M. (1983). Stress-timing and syllable-timing reanalysed. Journal of Phonetics, 11, 51–62.

Dauer, R. M. (1987). Phonetic and phonological components of language rhythm. In Proceedings of the 11th International Congress of Phonetic Sciences (pp. 447–450). Tallinn, Estonia: Academy of Sciences.

Delattre, P. (1966). A comparison of syllable length conditioning among languages. International Review of Applied Linguistics in Language Teaching, 4, 183–198.

Delattre, P. (1969). An acoustic and articulatory study of vowel reduction in four languages. International Review of Applied Linguistics in Language Teaching, 7, 295–325.

Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330.

Eckman, F. (2008). Typological markedness and second language phonology. In Hansen Edwards, J. G. & Zampini, M. L. (Eds.), Phonology and second language acquisition (pp. 95–115). Philadelphia, PA: John Benjamins.

Ellis, R. (1994). The study of second language acquisition. Oxford, UK: Oxford University Press.

Escudero, P., & Boersma, P. (2004). Bridging the gap between L2 speech perception research and phonological theory. Studies in Second Language Acquisition, 26, 551–585.

Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In Strange, W. (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 233–277). Timonium, MD: York Press.

Grabe, E., & Low, E. L. (2002). Durational variability in speech and the rhythm class hypothesis. Papers in Laboratory Phonology, 7, 515–546.

Grabe, E., Watson, I., & Post, B. (1999). The acquisition of rhythmic patterns in English and French. In Ohala, J. J., Hasegawa, Y., Ohala, M., Granville, D., & Bailey, A. C. (Eds.), Proceedings of the 14th International Congress of Phonetic Sciences (pp. 1201–1204). San Francisco, CA: University of California.

Gut, U. (2009). Non-native speech: A corpus-based analysis of phonological and phonetic properties of L2 English and German. Frankfurt, Germany: Peter Lang.

Hartsuiker, R. J. (2002). The addition bias in Dutch and Spanish phonological speech errors: The role of structural context. Language and Cognitive Processes, 17, 61–96.

Hualde, J. I. (2005). The sounds of Spanish. Cambridge, UK: Cambridge University Press.

Koopmans-Van Beinum, F. J. (1980). Vowel contrast reduction: An acoustic and perceptual study of Dutch vowels in various speech conditions. Amsterdam, The Netherlands: Academische Pers.

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.

Levelt, C., & Van de Vijver, R. (2004). Syllable types in cross-linguistic and developmental grammars. In Kager, R., Pater, J., & Zonneveld, W. (Eds.), Fixing priorities: Constraints in phonological acquisition (pp. 204–218). Cambridge, UK: Cambridge University Press.

Li, A., & Post, B. (2014). L2 acquisition of prosodic properties of speech rhythm. Studies in Second Language Acquisition, 36, 223–255.

Lubbers, M., & Torreira, F. (2015). Praatalign: An interactive Praat plug-in for performing phonetic forced alignment (version 1.9b) [Computer software]. Retrieved from https://github.com/dopefishh/praatalign

Mennen, I. (2004). Bi-directional interference in the intonation of Dutch speakers of Greek. Journal of Phonetics, 32, 543–563.

Navarro Tomás, T. (1966). Estudios de fonología española [Studies on Spanish phonology]. New York, NY: Las Américas.

Nazzi, T., Bertoncini, J., & Mehler, J. (1998). Language discrimination by newborns: Toward an understanding of the role of rhythm. Journal of Experimental Psychology: Human Perception and Performance, 24, 756–766.

Ordin, M., & Polyanskaya, L. (2014). Development of timing patterns in first and second languages. System, 42, 244–257.

Ordin, M., & Polyanskaya, L. (2015). Acquisition of speech rhythm in a second language by learners with rhythmically different native languages. The Journal of the Acoustical Society of America, 138, 533–544.

Özçelik, Ö. (2016). The Prosodic Acquisition Path Hypothesis: Towards explaining variability in L2 acquisition of phonology. Glossa: A Journal of General Linguistics, 1, 1–48.

Payne, E., Post, B., Prieto, P., Vanrell, M., & Astruc, L. (2012). Measuring child rhythm. Language and Speech, 55, 203–229.

Peterson, G. E., & Lehiste, I. (1960). Duration of syllable nuclei in English. Journal of the Acoustical Society of America, 32, 693–703.

Pike, K. L. (1945). The intonation of American English. Ann Arbor, MI: University of Michigan.

Polyanskaya, L., & Ordin, M. (2015). Acquisition of speech rhythm in first language. Journal of the Acoustical Society of America, 138, 199–204.

Polyanskaya, L., Ordin, M., & Busà, M. (2017). Relative salience of speech rhythm and speech rate on perceived foreign accent in a second language. Language and Speech, 60, 333–355.

Post, B., & Payne, E. (2018). Speech rhythm in development: What is the child acquiring? In Prieto, P. & Esteve-Gibert, N. (Eds.), The development of prosody in first language acquisition (pp. 126–143). Amsterdam, The Netherlands: John Benjamins.

Prieto, P., Vanrell, M., Astruc, L., Payne, E., & Post, B. (2012). Phonotactic and phrasal properties of speech rhythm: Evidence from Catalan, English, and Spanish. Speech Communication, 54, 681–702.

Ramus, F., & Mehler, J. (1999). Language identification with suprasegmental cues: A study based on speech resynthesis. The Journal of the Acoustical Society of America, 105, 512–521.

Ramus, F., Dupoux, E., & Mehler, J. (2003). The psychological reality of rhythm classes: Perceptual studies. In Solé, M., Recasens, D., & Romero, J. (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences (pp. 337–342). Barcelona, Spain: Universitat Autònoma de Barcelona.

Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech signal. Cognition, 73, 265–292.

Rasier, L. (2006). Prosodie en vreemdetaalverwerving: accentdistributie in het Frans en in het Nederlands als vreemde taal [Prosody and second language acquisition: Accent distribution in French and Dutch as a foreign language] (Doctoral dissertation). Louvain, Belgium: Université catholique de Louvain.

Schiller, N. O., Meyer, A. S., Baayen, R. H., & Levelt, W. J. (1996). A comparison of lexeme and speech syllables in Dutch. Journal of Quantitative Linguistics, 3, 8–28.

Schmidt, E., & Post, B. (2015). The development of prosodic features and their contribution to rhythm production in simultaneous bilinguals. Language and Speech, 58, 24–47.

Ten Bosch, L. F. M. (1991). On the structure of vowel systems: Aspects of an extended vowel model using effort and contrast [Doctoral dissertation]. Amsterdam, The Netherlands: Universiteit van Amsterdam.

Van Maastricht, L., Krahmer, E., & Swerts, M. (2016). Prominence patterns in a second language: Intonational transfer from Dutch to Spanish and vice versa. Language Learning, 66(1), 124–158.

Van Zon, M. (1997). Speech processing in Dutch: A cross-linguistic approach [Doctoral dissertation]. Tilburg, The Netherlands: Katholieke Universiteit Brabant.

White, L., & Mattys, S. L. (2007a). Calibrating rhythm: First language and second language studies. Journal of Phonetics, 35, 501–522.

White, L., & Mattys, S. L. (2007b). Rhythmic typology and variation in first and second languages. In Prieto, P., Mascaró, J., & Solé, M.-J. (Eds.), Segmental and prosodic issues in Romance phonology: Current issues in linguistic theory series (pp. 237–257). Amsterdam, The Netherlands: John Benjamins.

Wiget, L., White, L., Schuppler, B., Grenon, I., Rauch, O., & Mattys, S. (2010). How stable are acoustic metrics of contrastive speech rhythm? Journal of the Acoustical Society of America, 127, 1559–1569.

FIGURE 1. Waveform, spectrogram, F0 contour, and labeling scheme used for the Spanish utterance La madre de Susana es una buena profesora, “Susana’s mother is a good teacher.”