A closed-domain question answering framework using reliable resources to assist students

CANER DERİCİ; YİĞİT AYDIN; ÇİĞDEM YENİALACA; NİHAL YAĞMUR AYDIN; GÜNİZİ KARTAL; ARZUCAN ÖZGÜR; TUNGA GÜNGÖR

doi:10.1017/S1351324918000141

Published online by Cambridge University Press: 10 April 2018

CANER DERİCİ: Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
YİĞİT AYDIN: Department of Computer Education and Educational Technology, Boğaziçi University, Istanbul, Turkey
ÇİĞDEM YENİALACA: Department of Computer Education and Educational Technology, Boğaziçi University, Istanbul, Turkey
NİHAL YAĞMUR AYDIN: Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
GÜNİZİ KARTAL: Department of Computer Education and Educational Technology, Boğaziçi University, Istanbul, Turkey
ARZUCAN ÖZGÜR: Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
TUNGA GÜNGÖR: Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey

Abstract

This paper describes a question answering framework that can answer student questions given in natural language. We suggest a methodology that makes use of reliable resources only, provides the answer in the form of a multi-document summary for both factoid and open-ended questions, and produces an answer also from foreign resources by translating into the native language. The resources are compiled using a question database in the selected domains based on reliability and coverage metrics. A question is parsed using a dependency parser, important parts are extracted by rule-based and statistical methods, the question is converted into a representation, and a query is built. Documents relevant to the query are retrieved from the set of resources. The documents are summarized and the answers to the question together with other relevant information about the topic of the question are shown to the user. A summary answer from the foreign resources is also built by the translation of the input question and the retrieved documents. The proposed approach was applied to the Turkish language and it was tested with several experiments and a pilot study. The experiments have shown that the summaries returned include the answer for about 50–60 percent of the questions. The data bank built for factoid and open-ended questions in the two domains covered is made publicly available.
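The pipeline outlined in the abstract (question analysis, query building, document retrieval, multi-document summarization) can be illustrated with a minimal Python sketch. This is not the authors' implementation: stopword filtering stands in for the dependency-parser-based question analysis, a plain TF-IDF score stands in for the retrieval and summarization components, and the translation step is left out. All function names, the stopword list, and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list. The paper's question analysis uses a dependency
# parser with rule-based and statistical methods, which this sketch does not model.
STOPWORDS = {"what", "is", "the", "a", "an", "of", "in", "how",
             "does", "why", "are", "to", "and"}


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_query(question):
    """Stand-in for question analysis: keep the non-stopword terms."""
    return [t for t in tokenize(question) if t not in STOPWORDS]


def idf_table(documents):
    """Smoothed inverse document frequency for every term in the collection."""
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {term: math.log((n + 1) / (count + 1)) + 1.0
            for term, count in df.items()}


def score(text, query, idf):
    """Sum of TF-IDF weights of the query terms that occur in the text."""
    tf = Counter(tokenize(text))
    return sum(tf[term] * idf.get(term, 0.0) for term in query)


def retrieve(documents, query, idf, k=2):
    """Return up to k highest-scoring documents that match the query."""
    ranked = sorted(documents, key=lambda d: score(d, query, idf), reverse=True)
    return [d for d in ranked[:k] if score(d, query, idf) > 0]


def summarize(documents, query, idf, max_sentences=2):
    """Extractive summary: the best-scoring sentences across all documents."""
    sentences = []
    for doc in documents:
        parts = re.split(r"(?<=[.!?])\s+", doc)
        sentences.extend(s.strip() for s in parts if s.strip())
    ranked = sorted(sentences, key=lambda s: score(s, query, idf), reverse=True)
    return [s for s in ranked[:max_sentences] if score(s, query, idf) > 0]


def answer(question, documents):
    """Run the whole sketch: query building, retrieval, summarization."""
    query = build_query(question)
    idf = idf_table(documents)
    return summarize(retrieve(documents, query, idf), query, idf)


# Made-up sample resources, for illustration only.
documents = [
    "Photosynthesis converts light energy into chemical energy. "
    "It takes place in chloroplasts.",
    "Erosion moves soil and rock from one place to another. "
    "Wind and water cause erosion.",
    "Plants need sunlight. Chlorophyll absorbs light during photosynthesis.",
]

for sentence in answer("What is photosynthesis?", documents):
    print(sentence)
```

The sketch returns sentences drawn from more than one retrieved document, which mirrors the idea of presenting the answer as a multi-document summary, not as a single extracted phrase.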

Type: Article
Information: Natural Language Engineering, Volume 24, Issue 5, September 2018, pp. 725-762

DOI: https://doi.org/10.1017/S1351324918000141
Copyright: Copyright © Cambridge University Press 2018

Footnotes

*This work was supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under grant number 113E036. We would like to thank Çağıl Uluşahin Sönmez for her contribution to the Google Translate interface used in the research.
