
An alternative to synchronous tree substitution grammars*

Published online by Cambridge University Press:  21 March 2011

ANDREAS MALETTI*
Affiliation:
Universitat Rovira i Virgili, Departament de Filologies Romàniques, Avinguda de Catalunya 35, 43002 Tarragona, Spain

Abstract

Synchronous tree substitution grammars (stsg) are a (formal) tree transformation model that is used in the area of syntax-based machine translation. A competitor that is at least as expressive as stsg is proposed and compared to stsg. The competitor is the extended multi bottom-up tree transducer (mbot), which is the bottom-up analogue with the additional feature that states have non-unary ranks. Unweighted mbot have already been investigated with respect to their basic properties, but the particular properties of the constructions that are required in the machine translation task are largely unknown. stsg and mbot are compared with respect to binarization, regular restriction, and application. Particular attention is paid to the complexity of the constructions.

Type
Papers
Copyright
Copyright © Cambridge University Press 2011

