Citation function, polarity and influence classification

MYRIAM HERNÁNDEZ-ALVAREZ; JOSÉ M. GOMEZ SORIANO; PATRICIO MARTÍNEZ-BARCO

doi:10.1017/S1351324916000346

Published online by Cambridge University Press: 09 April 2017

MYRIAM HERNÁNDEZ-ALVAREZ: Affiliation:
Escuela Politécnica Nacional, Facultad de Ingeniería de Sistemas, Quito, Ecuador e-mail: [email protected]
JOSÉ M. GOMEZ SORIANO: Affiliation:
Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, España e-mails: [email protected]; [email protected]
PATRICIO MARTÍNEZ-BARCO: Affiliation:
Dpto. de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, España e-mails: [email protected]; [email protected]

Abstract

Current methods for assessing the impact of authors and scientific media employ tools such as H-Index, Co-Citation and PageRank. These tools are primarily based on citation counting, which treats all citations as equal. Methods of this type can produce perverse incentives to publish controversial or incomplete papers, as mixed or negative reviews often generate larger citation counts and better indexes, regardless of whether the citations were critical or exerted minimal influence on the citing document. Passing citations, which are employed to establish background and have no real impact on the citing paper, are common in scientific literature. Nevertheless, these citations carry equal weight in impact evaluations. Notable researchers have emphasized the need to correct this situation by developing estimation methods that consider the different roles of citations in citing papers. To accomplish this type of evaluation, a citation context analysis should be applied to determine the nature of the citations. We propose that citations should be categorized using four dimensions (FUNCTION, POLARITY, ASPECTS and INFLUENCE), as these dimensions provide adequate information for building a qualitative method to measure the impact of a given publication on a citing paper. In this paper, we use the words influence and impact interchangeably. We present a method for obtaining this information using our proposed classification scheme and a manually annotated corpus, which is marked with meaningful keywords and labels to help identify the characteristics or properties that constitute what we call ASPECTS. We develop a classification scheme that builds on the definitions of citation purpose shared by previous works.
Our contribution is to abstract purpose classes from several other schemes and to divide a complex structure into more manageable parts, attaining a simple system that combines low-granularity dimensions but nevertheless produces a fine-grained classification. For annotators, the classification process is simple: in a first step, the coders distinguish only four primary classes, and in a second pass, they add the information contained in the ASPECTS keywords and labels to obtain the more specific functions. In this way, we obtain a high-granularity labeling that gives enough information about the citations to characterize and classify them, and we achieve this detailed coding through a straightforward process in which the level of human error can be minimized.

Type: Articles
Information: Natural Language Engineering, Volume 23, Issue 4, July 2017, pp. 561-588

DOI: https://doi.org/10.1017/S1351324916000346
Copyright: Copyright © Cambridge University Press 2017
