
Improving sentiment analysis with multi-task learning of negation

Published online by Cambridge University Press:  11 November 2020

Jeremy Barnes*, Erik Velldal and Lilja Øvrelid
Language Technology Group, University of Oslo, Oslo, Norway

*Corresponding author. E-mail: [email protected]

Abstract

Sentiment analysis is directly affected by compositional phenomena in language that act on the prior polarity of the words and phrases found in the text. Negation is the most prevalent of these phenomena, and in order to correctly predict sentiment, a classifier must be able to identify negation and disentangle the effect that its scope has on the final polarity of a text. This paper proposes a multi-task approach to explicitly incorporate information about negation in sentiment analysis, which we show outperforms learning negation implicitly in an end-to-end manner. We describe our approach, a cascading and hierarchical neural architecture with selective sharing of Long Short-Term Memory layers, and show that explicitly training the model with negation as an auxiliary task helps improve the main task of sentiment analysis. The effect is demonstrated across several standard English-language data sets for both tasks, and we analyze several aspects of the system that affect its performance, varying the type and amount of input data as well as the multi-task setup.
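The abstract describes a cascading, hierarchical architecture in which an auxiliary negation task supervises a lower bidirectional LSTM layer while the main sentiment task is trained on top of it. The following is a minimal PyTorch sketch of that general setup, not the authors' implementation: the class and layer names, dimensions, label counts, and the max-pooling used to obtain a sentence representation are all illustrative assumptions.

# Minimal sketch (assumed names and dimensions) of a hierarchical multi-task
# model: token-level negation tagging is supervised at the lower BiLSTM layer,
# sentence-level sentiment classification at the upper BiLSTM layer.
import torch
import torch.nn as nn

class MultiTaskSentimentNegation(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden_dim=100,
                 n_negation_tags=3, n_sentiment_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Lower BiLSTM: shared representation, also supervised by the negation task.
        self.lower_bilstm = nn.LSTM(emb_dim, hidden_dim,
                                    batch_first=True, bidirectional=True)
        # Auxiliary head: per-token negation cue/scope tags.
        self.negation_head = nn.Linear(2 * hidden_dim, n_negation_tags)
        # Upper BiLSTM: builds on the lower layer, used only by the main task.
        self.upper_bilstm = nn.LSTM(2 * hidden_dim, hidden_dim,
                                    batch_first=True, bidirectional=True)
        # Main head: sentence-level sentiment classes.
        self.sentiment_head = nn.Linear(2 * hidden_dim, n_sentiment_classes)

    def forward(self, token_ids):
        emb = self.embed(token_ids)                        # (batch, seq, emb_dim)
        lower_out, _ = self.lower_bilstm(emb)              # (batch, seq, 2*hidden)
        negation_logits = self.negation_head(lower_out)    # per-token tag scores
        upper_out, _ = self.upper_bilstm(lower_out)        # (batch, seq, 2*hidden)
        sent_repr, _ = upper_out.max(dim=1)                # max-pool over time
        sentiment_logits = self.sentiment_head(sent_repr)  # sentence-level scores
        return sentiment_logits, negation_logits

Under one common multi-task training scheme, batches are drawn alternately from the sentiment and negation corpora, and only the loss of the task a given batch is annotated for is back-propagated. Because the negation head in this sketch is attached to the lower layer, its gradients update the embeddings and lower BiLSTM but leave the sentiment-specific upper layer untouched, which is one way to realise the selective sharing mentioned in the abstract.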

Type: Article
Copyright: © The Author(s), 2020. Published by Cambridge University Press
