Long words in maximum entropy phonotactic grammars*

Published online by Cambridge University Press: 15 February 2016

Robert Daland
Affiliation:
University of California, Los Angeles

Abstract

A phonotactic grammar assigns a well-formedness score to all possible surface forms. This paper considers whether phonotactic grammars should be probabilistic, and gives several arguments that they need to be. Hayes & Wilson (2008) demonstrate the promise of a maximum entropy Harmonic Grammar as a probabilistic phonotactic grammar. This paper points out a theoretical issue with maxent phonotactic grammars: they are not guaranteed to assign a well-defined probability distribution, because sequences that contain arbitrary repetitions of unmarked sequences may be underpenalised. The paper motivates a solution to this issue: include a *Struct constraint. A mathematical proof of necessary and sufficient conditions to avoid the underpenalisation problem is given in the online supplementary materials.
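The underpenalisation problem described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's proof and does not reproduce its theorems; it assumes a deliberately minimal grammar over an alphabet of `n_segments` symbols whose only constraint is a hypothetical *Struct that assigns one violation per segment, with weight `struct_weight`. Under that assumption every string of length k has maxent score exp(-struct_weight * k), and there are n_segments**k such strings, so the normalising constant over all strings is a geometric series. The function and parameter names are illustrative and do not come from the paper.

```python
import math


def partial_partition(n_segments: int, struct_weight: float, max_len: int) -> float:
    """Sum the maxent scores of all strings of length 0..max_len.

    Toy assumption: the only constraint is *Struct, violated once per
    segment, so a string of length k scores exp(-struct_weight * k).
    """
    # n_segments**k strings of length k, each scoring exp(-w * k),
    # contribute (n_segments * exp(-w))**k in total.
    ratio = n_segments * math.exp(-struct_weight)
    return sum(ratio ** k for k in range(max_len + 1))


def closed_form_partition(n_segments: int, struct_weight: float):
    """Return the limit of the geometric series, or None if it diverges."""
    ratio = n_segments * math.exp(-struct_weight)
    if ratio >= 1.0:
        # Long strings are underpenalised: the sum over all strings is
        # infinite, so no probability distribution is defined.
        return None
    return 1.0 / (1.0 - ratio)


if __name__ == "__main__":
    # With no penalty on length, the partial sums keep growing.
    print(partial_partition(2, 0.0, 10), partial_partition(2, 0.0, 20))
    # With a weight above log(n_segments), the sum converges.
    print(closed_form_partition(2, math.log(4)))
```

In this toy setting the sum over all strings is finite exactly when `struct_weight` exceeds `math.log(n_segments)`. The conditions proved in the supplementary materials concern grammars with arbitrary constraint sets and should not be read off this simplified case.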

Type
Articles
Copyright
Copyright © Cambridge University Press 2016 

Footnotes

*

I wish to acknowledge the entire LSCP (Laboratoire des Sciences Cognitives, ENS/Paris), especially Benjamin Börschinger and Abdel Fourtassi, who collaborated with me on the project that led to the insights in this paper, Mark Johnson, who pointed out that maxent should work for Σ*, and Alex Cristia, Sharon Peperkamp and Emmanuel Dupoux for inviting me to the LSCP. I also wish to acknowledge Colin Wilson and Maria Gouskova for useful discussion of the issue, and the editors of this journal for advice.

Three theorems which formalise the central contributions of this paper are discussed in the online supplementary materials, available at http://www.journals.cambridge.org/issue_Phonology/Vol32No03.

References


Anttila, Arto (1997). Deriving variation from grammar. In Hinskens, Frans, van Hout, Roeland & Wetzels, W. Leo (eds.) Variation, change and phonological theory. Amsterdam & Philadelphia: Benjamins. 35–68.
Baayen, R. Harald (2001). Word frequency distributions. Dordrecht: Kluwer.
Baayen, R. Harald & Schreuder, Robert (2000). Towards a psycholinguistic computational model for morphological parsing. Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences 358. 1281–1293.
Bane, Max & Riggle, Jason (2012). Consequences of candidate omission. LI 43. 695–706.
Boersma, Paul & Hayes, Bruce (2001). Empirical tests of the Gradual Learning Algorithm. LI 32. 45–86.
Boersma, Paul & Pater, Joe (to appear). Convergence properties of a Gradual Learning Algorithm for Harmonic Grammar. In McCarthy & Pater (to appear).
Bowers, Dustin (2014). Balancing leveling and composite URs. Paper presented at Phonology 2014, MIT.
Chi, Zhiyi & Geman, Stuart (1998). Estimation of probabilistic context-free grammars. Computational Linguistics 24. 299–305.
Chodroff, Eleanor & Wilson, Colin (2014). Phonetic vs. phonological factors in coronal-to-dorsal perceptual assimilation. Paper presented at LabPhon 14: the 14th Conference on Laboratory Phonology, Tokyo.
Chomsky, Noam (1956). Three models for the description of language. IRE Transactions on Information Theory 2:3. 113–124.
Chomsky, Noam & Halle, Morris (1965). Some controversial questions in phonological theory. JL 1. 97–138.
Chomsky, Noam & Halle, Morris (1968). The sound pattern of English. New York: Harper & Row.
Coady, Jeffry A. & Evans, Julia L. (2008). Uses and interpretations of non-word repetition tasks in children with and without specific language impairments (SLI). International Journal of Language Communication Disorders 43. 1–40.
Coetzee, Andries W. & Kawahara, Shigeto (2013). Frequency biases in phonological variation. NLLT 31. 47–89.
Coetzee, Andries W. & Pater, Joe (2011). The place of variation in phonological theory. In Goldsmith, John, Riggle, Jason & Yu, Alan (eds.) The handbook of phonological theory. 2nd edn. Malden, Mass. & Oxford: Wiley-Blackwell. 401–431.
Coleman, John & Pierrehumbert, Janet B. (1997). Stochastic phonological grammars and acceptability. In Coleman, John (ed.) Proceedings of the 3rd Meeting of the ACL Special Interest Group in Computational Phonology. Somerset, NJ: Association for Computational Linguistics. 49–56.
Daland, Robert, Börschinger, Benjamin & Fourtassi, Abdellah (2014). On lexical phonotactics and segmentability. Paper presented at LabPhon 14: the 14th Conference on Laboratory Phonology, Tokyo.
Daland, Robert, Hayes, Bruce, White, James, Garellek, Marc, Davis, Andrea & Norrmann, Ingrid (2011). Explaining sonority projection effects. Phonology 28. 197–234.
Davidson, Lisa & Shaw, Jason A. (2012). Sources of illusion in consonant cluster perception. JPh 40. 234–248.
Della Pietra, Stephen, Della Pietra, Vincent J. & Lafferty, John D. (1997). Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 19. 380–393.
Edwards, Jan, Beckman, Mary E. & Munson, Benjamin (2004). The interaction between vocabulary size and phonotactic probability effects on children's production accuracy and fluency in nonword repetition. Journal of Speech, Language, and Hearing Research 47. 421–436.
Eisner, Jason (2002). Parameter estimation for probabilistic finite-state transducers. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics. 1–8.
Elsner, Micha, Goldwater, Sharon, Feldman, Naomi & Wood, Frank (2013). A joint learning model of word segmentation, lexical acquisition, and phonetic variability. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. 42–54.
Goldrick, Matthew & Daland, Robert (2009). Linking speech errors and phonological grammars: insights from Harmonic Grammar networks. Phonology 26. 147–185.
Goldwater, Sharon & Johnson, Mark (2003). Learning OT constraint rankings using a Maximum Entropy model. In Spenader, Jennifer, Eriksson, Anders & Dahl, Östen (eds.) Proceedings of the Stockholm Workshop on Variation within Optimality Theory. Stockholm: Stockholm University. 111–120.
Gouskova, Maria (2003). Deriving economy: syncope in Optimality Theory. PhD dissertation, University of Massachusetts, Amherst.
Grenander, Ulf (1976). Pattern synthesis. New York: Springer.
Harris, Theodore E. (1963). The theory of branching processes. Berlin: Springer.
Hay, Jennifer, Pierrehumbert, Janet B. & Beckman, Mary E. (2003). Speech perception, well-formedness and the statistics of the lexicon. In Local, John, Ogden, Richard & Temple, Rosalind (eds.) Phonetic interpretation: papers in laboratory phonology VI. Cambridge: Cambridge University Press. 58–74.
Hayes, Bruce (2004). Phonological acquisition in Optimality Theory: the early stages. In Kager, René, Pater, Joe & Zonneveld, Wim (eds.) Constraints in phonological acquisition. Cambridge: Cambridge University Press. 158–203.
Hayes, Bruce (2011). Interpreting sonority-projection experiments: the role of phonotactic modeling. In Lee, Wai-Sum & Zee, Eric (eds.) Proceedings of the 17th International Congress of Phonetic Sciences, Hong Kong 2011. Hong Kong: University of Hong Kong. 835–838.
Hayes, Bruce & White, James (2013). Phonological naturalness and phonotactic learning. LI 44. 45–75.
Hayes, Bruce & Wilson, Colin (2008). A maximum entropy model of phonotactics and phonotactic learning. LI 39. 379–440.
Jäger, Gerhard (2007). Maximum entropy models and Stochastic Optimality Theory. In Zaenen, Annie, Simpson, Jane, King, Tracy Holloway, Grimshaw, Jane, Maling, Joan & Manning, Chris (eds.) Architectures, rules, and preferences: variations on themes by Joan W. Bresnan. Stanford: CSLI. 467–479.
Jarosz, Gaja (2013). Learning with hidden structure in Optimality Theory and Harmonic Grammar: beyond Robust Interpretive Parsing. Phonology 30. 27–71.
Jaynes, E. T. (1983). Papers on probability, statistics, and statistical physics. Edited by Rosenkrantz, R. D. Dordrecht: Kluwer.
Jelinek, Frederick (1997). Statistical methods for speech recognition. Cambridge, Mass.: MIT Press.
Legendre, Géraldine, Miyata, Yoshiro & Smolensky, Paul (1990). Harmonic Grammar: a formal multi-level connectionist theory of linguistic well-formedness: an application. In Proceedings of the 12th Annual Conference of the Cognitive Science Society. Hillsdale: Erlbaum. 884–891.
McCarthy, John J. & Pater, Joe (eds.) (to appear). Harmonic Grammar and Harmonic Serialism. London: Equinox.
McCarthy, John J. & Prince, Alan (1993). Prosodic morphology I: constraint interaction and satisfaction. Ms, University of Massachusetts, Amherst & Rutgers University.
McClelland, James L. & Elman, Jeffrey L. (1986). The TRACE model of speech perception. Cognitive Psychology 18. 1–86.
Magri, Giorgio (2012). Convergence of error-driven ranking algorithms. Phonology 29. 213–269.
Manning, Christopher D. & Schütze, Hinrich (1999). Foundations of statistical natural language processing. Cambridge, Mass.: MIT Press.
Mattys, Sven L. & Jusczyk, Peter W. (2001). Do infants segment words or recurring contiguous patterns? Journal of Experimental Psychology: Human Perception and Performance 27. 644–645.
Merchant, Nazarré & Tesar, Bruce (2008). Learning underlying forms by searching restricted lexical subspaces. CLS 41:2. 33–47.
Norris, Dennis & McQueen, James M. (2008). Shortlist B: a Bayesian model of continuous speech recognition. Psychological Review 115. 357–395.
Pater, Joe (2008). Gradual learning and convergence. LI 39. 334–345.
Pater, Joe (to appear). Universal Grammar with weighted constraints. In McCarthy & Pater (to appear).
Prince, Alan & Smolensky, Paul (1993). Optimality Theory: constraint interaction in generative grammar. Ms, Rutgers University & University of Colorado, Boulder. Published 2004, Malden, Mass. & Oxford: Blackwell.
Riggle, Jason (2004). Generation, recognition, and learning in finite-state Optimality Theory. PhD dissertation, University of California, Los Angeles.
Riggle, Jason (2009). Violation semirings in Optimality Theory. Research on Language and Computation 7. 1–12.
Scharenborg, Odette, Norris, Dennis, Bosch, Louis ten & McQueen, James M. (2005). How should a speech recognizer work? Cognitive Science 29. 867–918.
Smolensky, Paul & Legendre, Géraldine (eds.) (2006). The harmonic mind: from neural computation to optimality-theoretic grammar. 2 vols. Cambridge, Mass.: MIT Press.
Storkel, Holly L., Armbrüster, Jonna & Hogan, Tiffany P. (2006). Differentiating phonotactic probability and neighborhood density in adult word learning. Journal of Speech, Language, and Hearing Research 49. 1175–1192.
Tesar, Bruce, Alderete, John, Horwood, Graham, Merchant, Nazarré, Nishitani, Koichi & Prince, Alan (2003). Surgery in language learning. WCCFL 22. 477–490.
Tesar, Bruce & Prince, Alan (2003). Using phonotactics to learn phonological alternations. CLS 39:2. 209–237.
Tesar, Bruce & Smolensky, Paul (1998). Learnability in Optimality Theory. LI 29. 229–268.
Wilson, Colin & Davidson, Lisa (2013). Bayesian analysis of non-native cluster production. NELS 40. 265–278.
Wilson, Colin, Davidson, Lisa & Martin, Sean (2014). Effects of acoustic–phonetic detail on cross-language speech production. Journal of Memory and Language 77. 1–24.
Wilson, Colin & Obdeyn, Marieke (2009). Simplifying subsidiary theory: statistical evidence from Arabic, Muna, Shona, and Wargamay. Ms, Johns Hopkins University.
Supplementary material: PDF
