A syntactic approach for opinion mining on Spanish reviews

Published online by Cambridge University Press: 09 August 2013

DAVID VILARES
Affiliation:
Departamento de Computación, Universidade da Coruña, Campus de Elviña, 15071 A Coruña, Spain
MIGUEL A. ALONSO
Affiliation:
Departamento de Computación, Universidade da Coruña, Campus de Elviña, 15071 A Coruña, Spain
CARLOS GÓMEZ-RODRÍGUEZ
Affiliation:
Departamento de Computación, Universidade da Coruña, Campus de Elviña, 15071 A Coruña, Spain

Abstract

We describe an opinion mining system that classifies the polarity of Spanish texts. We propose an NLP approach that pre-processes, tokenises and POS-tags each text, and then obtains the syntactic structure of its sentences by means of a dependency parser. This structure is used to address three of the most significant linguistic constructions for polarity classification: intensification, subordinate adversative clauses and negation. We also propose a semi-automatic domain adaptation method that improves the accuracy of our system in specific application domains: semantic dictionaries are enriched using machine learning methods so that the semantic orientation of their words is adapted to a particular field. Experimental results are promising in both general and specific domains.
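To make the idea of syntax-driven polarity concrete, the following is a minimal sketch, not the system described in the article. It assumes a sentence has already been tokenised, lemmatised and dependency-parsed, and it propagates semantic orientation up the tree while treating intensifier and negator dependents specially. The lexicon entries, the intensifier percentages, the negation shift of 4.0 and the names `Token`, `orientation` and `classify` are all hypothetical choices made for illustration. Subordinate adversative clauses are not handled here.

```python
from dataclasses import dataclass

# Hypothetical toy resources; every value below is illustrative only.
LEXICON = {"bueno": 2.0, "malo": -2.0, "aburrido": -3.0}
INTENSIFIERS = {"muy": 0.25, "poco": -0.5}  # relative boost applied to the head
NEGATORS = {"no", "nunca"}
NEGATION_SHIFT = 4.0  # assumed amount by which negation shifts a score toward the opposite polarity


@dataclass
class Token:
    lemma: str
    head: int  # index of the head token, -1 for the root


def children(tokens, i):
    """Indices of the tokens whose head is token i."""
    return [j for j, t in enumerate(tokens) if t.head == i]


def orientation(tokens, i):
    """Semantic orientation of the subtree rooted at token i."""
    so = LEXICON.get(tokens[i].lemma, 0.0)
    boost = 0.0
    negated = False
    for j in children(tokens, i):
        lemma = tokens[j].lemma
        if lemma in INTENSIFIERS:
            boost += INTENSIFIERS[lemma]
        elif lemma in NEGATORS:
            negated = True
        else:
            so += orientation(tokens, j)
    so *= 1.0 + boost  # intensification scales the accumulated score
    if negated and so != 0.0:
        # Negation shifts the score toward the opposite polarity.
        so = so - NEGATION_SHIFT if so > 0 else so + NEGATION_SHIFT
    return so


def classify(tokens):
    """Label a parsed sentence as positive, negative or neutral."""
    root = next(i for i, t in enumerate(tokens) if t.head == -1)
    score = orientation(tokens, root)
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"


# "La película no es muy buena", hand-parsed with "ser" as the root (an assumed analysis).
sentence = [
    Token("el", 1), Token("película", 3), Token("no", 3),
    Token("ser", -1), Token("muy", 5), Token("bueno", 3),
]
print(classify(sentence))
```

In this sketch the scope of "muy" and "no" comes from the tree rather than from word order: "muy" scales only the subtree headed by "bueno", and "no" applies to everything under the verb it depends on.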

Type
Articles
Copyright
Copyright © Cambridge University Press 2013 
