A state-of-the-art of semantic change computation

XURI TANG

doi:10.1017/S1351324918000220

Published online by Cambridge University Press: 18 June 2018

Affiliation: School of Foreign Languages, Huazhong University of Science and Technology, Wuhan, China. E-mail: [email protected]

Abstract

This paper reviews the state of the art of an emerging field in computational linguistics: semantic change computation. It summarizes the literature by proposing a framework that identifies five components of the field: diachronic corpus, diachronic word sense characterization, change modelling, evaluation, and data visualization. Despite the field's potential, the review shows that current studies focus mainly on testing hypotheses of semantic change drawn from theoretical linguistics, and that several core issues remain to be tackled: the need for diachronic corpora in languages other than English, the comparison and development of approaches to diachronic word sense characterization and change modelling, the need for comprehensive evaluation data, and further exploration of data visualization techniques for hypothesis justification.

Type: Survey Paper
Information: Natural Language Engineering, Volume 24, Issue 5, September 2018, pp. 649–676

DOI: https://doi.org/10.1017/S1351324918000220
Copyright: © Cambridge University Press 2018

Footnotes

The author is much obliged to the three anonymous reviewers for their inspiring comments, which have helped improve the paper's readability and comprehensiveness. This research is supported by the Fund of Chinese Natural Science (Grant 61772278) and the Innovation Fund of Huazhong University of Science and Technology (Grant 2018WKZDJC003).
