A Semantic Scattering model for the automatic interpretation of English genitives

ADRIANA BADULESCU; DAN MOLDOVAN

doi:10.1017/S1351324908004798

Published online by Cambridge University Press: 01 April 2009

ADRIANA BADULESCU: Affiliation: Lymba Corporation, 1701 N. Collins Blvd., Suite 3000, Richardson, TX 75080, USA
DAN MOLDOVAN: Affiliation: Lymba Corporation, 1701 N. Collins Blvd., Suite 3000, Richardson, TX 75080, USA

Abstract

An important problem in knowledge discovery from text is the automatic extraction of semantic relations. This paper addresses the automatic classification of the semantic relations expressed by English genitives. A learning model is introduced based on the statistical analysis of the distribution of genitives' semantic relations in a corpus. The semantic and contextual features of the genitive's noun phrase constituents play a key role in the identification of the semantic relation. The algorithm was trained and tested on a corpus of approximately 20,000 sentences and achieved an f-measure of 79.80 per cent for of-genitives, far better than the 40.60 per cent obtained using a Decision Trees algorithm, the 50.55 per cent obtained using a Naive Bayes algorithm, or the 72.13 per cent obtained using a Support Vector Machines algorithm on the same corpus using the same features. The results were similar for s-genitives: 78.45 per cent using Semantic Scattering, 47.00 per cent using Decision Trees, 43.70 per cent using Naive Bayes, and 70.32 per cent using a Support Vector Machines algorithm. The results demonstrate the importance of word sense disambiguation and semantic generalization/specialization for this task. They also demonstrate that different patterns (in our case the two types of genitive constructions) encode different semantic information and should be treated differently in the sense that different models should be built for different patterns.
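The scores above are reported as f-measure. As a reminder of what that metric combines, here is a minimal sketch of the standard F-measure computed from raw classification counts. The function name `f_measure`, the `beta` parameter, and the counts in the usage note are illustrative assumptions; the abstract does not state which F variant or averaging scheme the authors used.

```python
def f_measure(true_positives: int, false_positives: int,
              false_negatives: int, beta: float = 1.0) -> float:
    """Standard F-measure from raw counts (hypothetical helper, not the authors' code).

    With beta = 1.0 this is the harmonic mean of precision and recall.
    """
    predicted = true_positives + false_positives  # items the classifier labeled with the relation
    actual = true_positives + false_negatives     # items that truly carry the relation
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
```

For example, with made-up counts of 80 correct labels, 20 spurious labels, and 20 missed instances, precision and recall are both 0.8, so `f_measure(80, 20, 20)` is 0.8 up to floating-point rounding.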

Type: Papers
Information: Natural Language Engineering, Volume 15, Issue 2, April 2009, pp. 215-239

DOI: https://doi.org/10.1017/S1351324908004798
Copyright: © Cambridge University Press 2008
