
Statistical Translation After Source Reordering: Oracles, Context-Aware Models, and Empirical Analysis

Published online by Cambridge University Press:  14 May 2012

MAXIM KHALILOV
Affiliation:
Institute for Logic, Language and Computation, University of Amsterdam, P.O. Box 94242, 1090 GE Amsterdam, The Netherlands
KHALIL SIMA'AN
Affiliation:
Institute for Logic, Language and Computation, University of Amsterdam, P.O. Box 94242, 1090 GE Amsterdam, The Netherlands

Abstract

In source reordering, the source words are permuted to minimize word-order differences with the target sentence, and the reordered sentence is then fed to a translation model. Earlier work highlights the benefits of resolving long-distance reorderings as a pre-processing step to standard phrase-based models. However, the potential performance improvement of source reordering and its impact on the components of the subsequent translation model remain unexplored. In this paper we study both aspects of source reordering. We set up idealized source reordering (oracle) models with and without syntax, and present our own syntax-driven model of source reordering. The latter is a statistical model of inversion transduction grammar (ITG)-like tree transductions that manipulate a syntactic parse and work with novel conditional reordering parameters. Having set up the models, we report translation experiments showing significant improvement on three language pairs, and contribute an extensive analysis of the impact of source reordering (both oracle and model) on the translation model regarding the quality of its input, phrase table, and output. Our experiments show that oracle source reordering has untapped potential in improving translation system output. Besides solving difficult reorderings, we find that source reordering creates more monotone parallel training data at the back-end, leading to significantly larger phrase tables with higher coverage of phrase types in unseen data. Unfortunately, this property does not carry over to tree-constrained source reordering. Our analysis shows that, from the string-level perspective, tree-constrained reordering might selectively permute word order, leading to larger phrase tables but without an increase in phrase coverage in unseen data.
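To make the idea of an alignment-based oracle concrete, the following is a minimal sketch, not the authors' implementation. It assumes word alignments are given as (source index, target index) pairs and sorts source tokens by the mean target position they align to. The function names `oracle_reorder` and `count_crossings`, the tie-breaking rule, and the handling of unaligned words are all illustrative assumptions; the abstract does not specify these details.

```python
from collections import defaultdict


def oracle_reorder(source, alignment):
    """Permute source tokens toward target order (illustrative assumption).

    source:    list of source tokens
    alignment: iterable of (source_index, target_index) links
    Each aligned token is keyed by the mean of its target positions.
    An unaligned token inherits the key of the nearest aligned token
    to its left (a hypothetical heuristic), so it stays attached to it.
    """
    targets = defaultdict(list)
    for s, t in alignment:
        targets[s].append(t)

    keys = []
    last_key = -1.0  # unaligned tokens at the start stay in front
    for i in range(len(source)):
        if targets[i]:
            last_key = sum(targets[i]) / len(targets[i])
        keys.append((last_key, i))  # original index breaks ties

    return [source[i] for _, i in sorted(keys)]


def count_crossings(alignment):
    """Count pairs of alignment links that cross; 0 means monotone."""
    links = sorted(alignment)
    return sum(
        1
        for a in range(len(links))
        for b in range(a + 1, len(links))
        if links[a][0] < links[b][0] and links[a][1] > links[b][1]
    )


# Toy German-to-English example: "I have read the book"
src = ["ich", "habe", "das", "Buch", "gelesen"]
links = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 2)]
print(oracle_reorder(src, links))
print(count_crossings(links))
```

In this toy case the verb moves in front of its object, so the reordered source follows the target word order and the alignment becomes monotone. This is the kind of property the abstract associates with more monotone parallel training data.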

Type
Articles
Copyright
Copyright © Cambridge University Press 2012
