
A closed-domain question answering framework using reliable resources to assist students

Published online by Cambridge University Press: 10 April 2018

CANER DERİCİ
Affiliation:
Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
YİĞİT AYDIN
Affiliation:
Department of Computer Education and Educational Technology, Boğaziçi University, Istanbul, Turkey
ÇİĞDEM YENİALACA
Affiliation:
Department of Computer Education and Educational Technology, Boğaziçi University, Istanbul, Turkey
NİHAL YAĞMUR AYDIN
Affiliation:
Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
GÜNİZİ KARTAL
Affiliation:
Department of Computer Education and Educational Technology, Boğaziçi University, Istanbul, Turkey
ARZUCAN ÖZGÜR
Affiliation:
Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
TUNGA GÜNGÖR
Affiliation:
Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey

Abstract

This paper describes a question answering framework that answers student questions posed in natural language. We propose a methodology that uses only reliable resources, provides the answer as a multi-document summary for both factoid and open-ended questions, and also produces an answer from foreign resources by translating them into the native language. The resources are compiled using a question database in the selected domains, based on reliability and coverage metrics. A question is parsed with a dependency parser, its important parts are extracted by rule-based and statistical methods, the question is converted into a representation, and a query is built. Documents relevant to the query are then retrieved from the set of resources. The documents are summarized, and the answers to the question are shown to the user together with other relevant information about the topic of the question. A summary answer from the foreign resources is also built by translating the input question and the retrieved documents. The proposed approach was applied to Turkish and tested in several experiments and a pilot study. The experiments showed that the returned summaries include the answer for about 50–60 percent of the questions. The data bank built for factoid and open-ended questions in the two domains covered is made publicly available.
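As a rough illustration of the flow outlined in the abstract (query building, document retrieval, then a summary drawn from several documents), the following minimal Python sketch may help. It is not the authors' implementation: the stop-word list stands in for the dependency-parser-based question analysis, the TF-IDF style scoring stands in for the actual retrieval and summarization methods, and all function names are hypothetical. The translation step is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop list; a crude stand-in for the paper's question analysis.
STOP_WORDS = {"what", "which", "who", "how", "why", "is", "are", "does", "do",
              "the", "a", "an", "of", "in", "for", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_query(question):
    """Keep only content words of the question (assumed simplification)."""
    return [t for t in tokenize(question) if t not in STOP_WORDS]


def idf_weights(documents):
    """Smoothed inverse document frequency for every term in the collection."""
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0
            for term, count in df.items()}


def score(query, text, idf):
    """Sum of tf * idf over the query terms, normalised by text length."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    tf = Counter(tokens)
    return sum(tf[t] * idf.get(t, 0.0) for t in query) / len(tokens)


def retrieve(query, documents, idf, k=2):
    """Return the k documents that score highest against the query."""
    ranked = sorted(documents, key=lambda d: score(query, d, idf), reverse=True)
    return ranked[:k]


def summarize(query, documents, idf, max_sentences=2):
    """Pick the best-matching sentences across all retrieved documents."""
    sentences = []
    for doc in documents:
        parts = re.split(r"(?<=[.!?])\s+", doc)
        sentences.extend(p.strip() for p in parts if p.strip())
    scored = [(score(query, s, idf), s) for s in sentences]
    relevant = [pair for pair in scored if pair[0] > 0.0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in relevant[:max_sentences]]


def answer(question, documents, k=2, max_sentences=2):
    """Run the toy pipeline: query building, retrieval, summarization."""
    query = build_query(question)
    idf = idf_weights(documents)
    return summarize(query, retrieve(query, documents, idf, k), idf, max_sentences)
```

Calling `answer(question, documents)` with a list of plain-text documents returns a short list of sentences that serves as the summary answer. In the framework described above, the document list would be restricted to the vetted, reliable resources.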

Type
Article
Copyright
Copyright © Cambridge University Press 2018 


Footnotes

*This work was supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 113E036. We would like to thank Çağıl Uluşahin Sönmez for her contribution to the Google Translate interface of the research.
