Influence of personal choices on lexical variability in referring expressions

Published online by Cambridge University Press:  09 July 2015

RAQUEL HERVÁS
JAVIER ARROYO
VIRGINIA FRANCISCO
FEDERICO PEINADO
PABLO GERVÁS
Affiliation:
Departamento de Ingeniería del Software e Inteligencia Artificial, Universidad Complutense de Madrid, 28040, Madrid, Spain

Abstract

Variability is inherent in human language, as different people make different choices when facing the same communicative act. In Natural Language Processing, variability is a challenge: it hinders tasks such as the evaluation of generated expressions, yet it is also a valuable resource for achieving naturalness and avoiding repetitiveness. In this work, we present a methodological approach to studying the influence of lexical variability. We apply this approach to TUNA, a corpus of referring expression lexicalizations, in order to study the use of different lexical choices. First, we reannotate the TUNA corpus with new information about lexicalization; then we analyze this reannotation to study how people lexicalize referring expressions. The results show that people tend to be consistent when generating referring expressions, while at the same time different people share certain preferences.

Type
Articles
Copyright
Copyright © Cambridge University Press 2015 
