Discourse structure and language technology

B. WEBBER; M. EGG; V. KORDONI

doi:10.1017/S1351324911000337

Published online by Cambridge University Press: 08 December 2011

B. WEBBER: School of Informatics, University of Edinburgh, Edinburgh, UK
M. EGG: Department of English and American Studies, Humboldt University, Berlin, Germany
V. KORDONI: German Research Centre for Artificial Intelligence (DFKI GmbH) and Department of Computational Linguistics, Saarland University, Saarbrücken, Germany

Abstract

An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, the nature of that structure may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases on which discourse is structured, along with some of their formal properties. It then lays out the current state of the art with respect to algorithms for recognizing these different structures, and describes how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology.

Type: Articles
Information: Natural Language Engineering, Volume 18, Issue 4, October 2012, pp. 437–490

DOI: https://doi.org/10.1017/S1351324911000337
Copyright: © Cambridge University Press 2011

References

Agarwal, S., Choubey, L., and Yu, H. 2010. Automatically classifying the role of citations in biomedical articles. In Proceedings of American Medical Informatics Association (AMIA), Fall Symposium, Washington, DC, November 13–17, pp. 11–15.Google Scholar

Agarwal, S., and Yu, H. 2009. Automatically classifying sentences in full-text biomedical articles into introduction, methods, results and discussion. Bioinformatics 25 (23): 3174–80.CrossRef Google Scholar PubMed

Al-Saif, A., and Markert, K. 2010. The Leeds Arabic Discourse Treebank: annotating discourse connectives for Arabic. In Proceedings of 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta, May 17–23.Google Scholar

Al-Saif, A., and Markert, K. 2011. Modelling discourse relations for Arabic. In Proceedings of Empirical Methods in Natural Language Processing, Edinburgh, Scotland pp. 736–47.Google Scholar

Asher, N. 1993. Reference to Abstract Objects in Discourse. Boston MA: Kluwer.CrossRef Google Scholar

Asher, N., and Lascarides, A. 2003. Logics of Conversation. Cambridge, UK: Cambridge University Press.Google Scholar

Baldridge, J., Asher, N., and Hunter, J. 2007. Annotation for and robust parsing of discourse structure on unrestricted texts. Zeitschrift für Sprachwissenschaft 26: 213–39.CrossRef Google Scholar

Barzilay, R., and Elhadad, M. 1997. Using lexical chains for text summarization. In Proceedings of the ACL Workshop on Intelligent Scalable Text Summarization, Madrid, Spain, pp. 10–17.Google Scholar

Barzilay, R., and Lapata, M. 2008. Modeling local coherence: an entity-based approach. Computational Linguistics 34 (1): 1–34.Google Scholar

Barzilay, R., and Lee, L. 2004. Catching the drift: probabilistic content models with applications to generation and summarization. In Proceedings of the 2nd Human Language Technology Conference and Annual Meeting of the North American Chapter, Boston, MA, USA, pp. 113–20. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Bestgen, Y. 2006. Improving text segmentation using latent semantic analysis: a reanalysis of Choi, Wiemer-Hastings, and Moore (2001). Computational Linguistics 32 (1): 5–12.CrossRef Google Scholar

Bex, F., and Verheij, B. 2010. Story schemes for argumentation about the facts of a crime. In Proceedings, AAAI Fall Symposium on Computational Narratives. Menlo Park, CA: AAAI Press.Google Scholar

Buch-Kromann, M., and Korzen, I. 2010 (July). The unified annotation of syntax and discourse in the Copenhagen Dependency Treebanks. In Proceedings of the Fourth Linguistic Annotation Workshop, Uppsala, Sweden, July 15–16, pp. 127–31.Google Scholar

Buch-Kromann, M., Korzen, I., and Müller, H. H. 2009. Uncovering the ‘lost’ structure of translations with parallel treebanks. In Alves, F., Göpferich, S., and Mees, I. (eds.), Copenhagen Studies of Language: Methodology, Technology and Innovation in Translation Process Research, pp. 199–224. Copenhagen Studies of Language, vol. 38. Frederiksberg, Denmark: Copenhagen Business School.Google Scholar

Bunt, H., Alexandersson, J., Carletta, J., Choe, J.-W., Fang, A. C., Hasida, K., Lee, K., Petukhova, V., Popescu-Belis, A., Romary, L., Soria, C., and Traum, D. 2010. Towards an ISO standard for dialogue act annotation. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta.Google Scholar

Burchardt, A., Frank, A., Erk, K., Kowalski, A., and Padó, S. 2006. SALTO – versatile multi-level annotation tool. In Proceedings of LREC 2006, Genoa, Italy.Google Scholar

Burstein, J., Marcu, D., Andreyev, S., and Chodorow, M. 2001. Towards automatic classification of discourse elements in essays. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, Toulouse, France, pp. 98–105. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Burstein, J., Marcu, D., and Knight, K. 2003. Finding the WRITE stuff: automatic identification of discourse structure in student essays. IEEE Intelligent Systems: Special Issue on Advances in Natural Language Processing 18: 32–9.CrossRef Google Scholar

Callison-Birch, C. 2008. Syntactic constraints on paraphrases extracted from parallel corpora. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08), Honolulu, HI, USA.Google Scholar

Carlson, L., Marcu, D., and Okurowski, M. E. 2003. Building a discourse-tagged corpus in the framework of Rhetorical Structure Theory. In Kuppevelt, J. van and Smith, R. (eds.), Current Directions in Discourse and Dialogue, pp. 85–112. New York: Kluwer.CrossRef Google Scholar

Chambers, N., and Jurafsky, D. 2008. Unsupervised learning of narrative event chains. In Proceedings, Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Columbus, OH, USA, pp. 789–97.Google Scholar

Chen, H., Branavan, S. R. K., Barzilay, R., and Karger, D. 2009. Global models of document structure using latent permutations. In Proceedings, Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Boulder, CO, USA, pp. 371–9.Google Scholar

Chiarcos, C., Dipper, S., Götze, M., Leser, U., Ldeling, A., Ritz, J., and Stede, M. 2008. A flexible framework for integrating annotations from different tools and tagsets. Traitement Automatique des Langues 49: 271–93.Google Scholar

Choi, F. Y. Y., Wiemer-Hastings, P., and Moore, J. 2001. Latent semantic analysis for text segmentation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '01), Pittsburgh, PA USA, pp. 109–17.Google Scholar

Chung, G. 2009 (February). Sentence retrieval for abstracts of randomized controlled trials. BMC Medical Informatics and Decision Making 9 (10).CrossRef Google Scholar PubMed

Clarke, J., and Lapata, M. 2010. Discourse constraints for document compression. Computational Linguistics 36 (3): 411–41.Google Scholar

Dale, R. 1992. Generating Referring Expressions. Cambridge MA: MIT Press.Google Scholar

Daume, H. III, and Marcu, D. 2002. A noisy-channel model for document compression. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, pp. 449–56.Google Scholar

Do, Q. X., Chan, Y. S., and Roth, D. 2011. Minimally supervised event causality identification. In Proceedings, Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, pp. 294–303.Google Scholar

Eales, J., Stevens, R., and Robertson, D. 2008. Full-text mining: linking practice, protocols and articles in biological research. In Proceedings of the BioLink SIG, ISMB 2008, Toronto, Canada.Google Scholar

Egg, M., and Redeker, G. 2010. How complex is discourse structure? In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta, pp. 1619–23.Google Scholar

Eisenstein, J., and Barzilay, R. 2008. Bayesian unsupervised topic segmentation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, (EMNLP '08), Honolulu, HI, pp. 334–43.Google Scholar

Elsner, M., and Charniak, E. 2008a. Coreference-inspired coherence modeling. In Proceedings of ACL-HLT 2008, Columbus, OH, USA.Google Scholar

Elsner, M., and Charniak, E. 2008b. You talking to me? In Proceedings of ACL-HLT 2008, Columbus, OH, pp. 834–42.Google Scholar

Elwell, R., and Baldridge, J. 2008. Discourse connective argument identification with connective specic rankers. In Proceedings of the IEEE Conference on Semantic Computing (ICSC-08), Santa Clara, CA, USA.Google Scholar

Finlayson, M. 2009. Deriving narrative morphologies via analogical story merging. In Proceedings, 2nd International Conference on Analogy, Sofia, Bulgaria, pp. 127–36.Google Scholar

Foster, G., Isabelle, P., and Kuhn, R. 2010. Translating structured documents. In Proceedings of AMTA, Atlanta, GA, USA.Google Scholar

Galley, M., McKeown, K., Fosler-Lussier, E., and Jing, H. 2003. Discourse segmentation of multi-party conversation. In Proceedings of the 41st Annual Conference of the Association for Computational Linguistics, Sapporo, Japan.Google Scholar

Ghorbel, H., Ballim, A., and Coray, G. 2001. ROSETTA: rhetorical and semantic environment for text alignment. Proceedings of Corpus Linguistics, Lancaster, UK, pp. 224–33.Google Scholar

Ghosh, S., Johansson, R., Riccardi, G., and Tonelli, S. 2011b. Shallow discourse parsing with conditional random fields. In Proceedings, International Joint Conference on Natural Language Processing, Chiang Mai, Thailand, November 8–13.Google Scholar

Ghosh, S., Tonelli, S., Riccardi, G., and Johansson, R. 2011a. End-to-end discourse parser evaluation. In Proceedings, IEEE Conference on Semantic Computing (ICSC-11), Hong Kong.Google Scholar

Grosz, B., Joshi, A., and Weinstein, S. 1995. Centering: a framework for modelling the local coherence of discourse. Computational Linguistics 21 (2): 203–25.Google Scholar

Grosz, B., and Sidner, C. 1986. Attention, intention and the structure of discourse. Computational Linguistics 12 (3): 175–204.Google Scholar

Grosz, B., and Sidner, C. 1990. Plans for discourse. In Cohen, P., Morgan, J., and Pollack, M. (eds.), Intentions in Communication, pp. 417–44. Cambridge MA: MIT Press.CrossRef Google Scholar

Gu, Z., and Cercone, N. 2006. Segment-based hidden Markov models for information extraction. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, July 17–21, pp. 481–8. Stroudsburg PA: Association for Computational Linguistics.Google Scholar

Guillou, L. 2011. Improving Pronoun Translation for Statistical Machine Translation (SMT). M.Sc. dissertation, University of Edinburgh, Edinburgh, UK.Google Scholar

Guo, Y., Korhonen, A., Liakata, M., Silins, I., Sun, L., and Stenius, U. 2010 (July). Identifying the information structure of scientific abstracts. In Proceedings of the 2010 BioNLP Workshop, Uppsala, Sweden.Google Scholar

Halliday, M., and Hasan, R. 1976. Cohesion in English. Switzerland: Longman.Google Scholar

Hardmeier, C., and Federico, M. 2010. Modelling pronominal anaphora in Statistical Machine Translation. In Proceedings 7th Int'l Workshop on Spoken Language Translation, Paris, France, December 2–3, pp. 283–90.Google Scholar

Hardt, D., and Elming, J. 2010. Incremental re-training for post-editing SMT. In Proceedings of AMTA, Denver, CO, USA.Google Scholar

Hearst, M. 1994. Multi-paragraph segmentation of expository text. In Proceedings, 32nd Annual Meeting of the Association for Computational Linguistics, Plainsboro, NJ, USA, pp. 9–16.CrossRef Google Scholar

Hearst, M. 1997. TextTiling: segmenting text into multi-paragraph subtopic passages. Computational Linguistics 23 (1): 33–64.Google Scholar

Higgins, D., Burstein, J., Marcu, D., and Gentile, C. 2004. Evaluating multiple aspects of coherence in student essays. In Proceedings of HLT-NAACL, Boston, MA, USA, pp. 185–92. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Hirohata, K., Okazaki, N., Ananiadou, S., and Ishizuka, M. 2008. Identifying sections in scientific abstracts using conditional random fields. In Proceedings of the 3rd International Joint Conference on Natural Language Processing, Hyderabad, India, pp. 381–8.Google Scholar

Holler, A., and Irmen, L. 2007. Empirically assessing effects of the right frontier constraint. In Proceedings of the 6th Discourse Anaphora and Anaphor Resolution Conference, Lagos (Algarve), Portugal, pp. 15–27.Google Scholar

Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., and Weischedel, R. 2006. OntoNotes: the 90% solution. In Proceedings, Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 57–60. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Ide, N., Prasad, R., and Joshi, A. 2011. Towards interoperability for the Penn Discourse Treebank. In Proceedings, 6th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-6), Oxford, UK, pp. 49–55.Google Scholar

Kan, M.-Y., Klavans, J., and McKeown, K. 1998. Linear segmentation and segment significance. In Proceedings of the Sixth Workshop on Very Large Corpora, Montreal, Canada.Google Scholar

Kim, J.-D., Ohta, T., Tateisi, Y., and Tsujii, J. 2003. GENIA corpus – semantically annotated corpus for bio-textmining. Bioinformatics 19 (Suppl 1): i180–2.CrossRef Google Scholar PubMed

Kingsbury, P., and Palmer, M. 2002. From Treebank to PropBank. In Proceedings of the 3rd International Conference on Language Resources and Evalution (LREC), Spain.Google Scholar

Kintsch, W., and van Dijk, T. 1978. Towards a model of text comprehension and production. Psychological Review 85: 363–94.CrossRef Google Scholar

Knott, A. 2001. Semantic and pragmatic relations and their intended effects. In Sanders, T., Schilperoord, J., and Spooren, W. (eds.), Text Representation: Linguistic and Psycholinguistic Aspects, pp. 127–51. Amsterdam: Benjamins.CrossRef Google Scholar

Knott, A., Oberlander, J., O'Donnell, M., and Mellish, C. 2001. Beyond elaboration: the interaction of relations and focus in coherent text. In Sanders, T., Schilperoord, J., and Spooren, W. (eds.), Text Representation: Linguistic and Psycholinguistic Aspects, pp. 181–96. Amsterdam: Benjamins.Google Scholar

Koppel, M., and Ordan, N. 2011. Translationese and its dialects. In Proceedings of the 49th Annual Meeting, pp. 1318–26. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Lee, A., Prasad, R., Joshi, A., Dinesh, N., and Webber, B. 2006. Complexity of dependencies in discourse: are dependencies in discourse more complex than in syntax? In Proceedings of the 5th Workshop on Treebanks and Linguistic Theory (TLT'06), Prague, Czech Republic.Google Scholar

Lee, A., Prasad, R., Joshi, A., and Webber, B. 2008. Departures from tree structures in discourse. In Proceedings of the Workshop on Constraints in Discourse III, Potsdam, Germany.Google Scholar

Liakata, M., Teufel, S., Siddharthan, A., and Batchelor, C. 2010. Corpora for the conceptualisation and zoning of scientific papers. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta.Google Scholar

Lin, J., Karakos, D., Demner-Fushman, D., and Khudanpur, S. 2006. Generative content models for structural analysis of medical abstracts. In Proceedings of the HLT-NAACL Workshop on BioNLP, Brooklyn, New York, pp. 65–72.Google Scholar

Lin, Z., Ng, H. T., and Kan, M.-Y. 2010 (November). A PDTB-styled end-to-end discourse parser. Technical Report, Department of Computing, National University of Singapore. Available at http://arxiv.org/abs/1011.0835 Google Scholar

Lochbaum, K. 1998. A collaborative planning model of intentional structure. Computational Linguistics 24 (4), 525–72.Google Scholar

Louis, A., Joshi, A., and Nenkova, A. 2010. Discourse indicators for content selection in summarization. In Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL '10, pp. 147–56. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Louis, A., and Nenkova, A. 2011. General versus specific sentences: automatic identification and application to analysis of news summaries. Technical Report, University of Pennsylvania. Available at http://repository.upenn.edu/cis_reports/Google Scholar

Maamouri, M., and Bies, A. 2004. Developing an Arabic treebank: methods, guidelines, procedures, and tools. In Proceedings of the Workshop on Computational Approaches to Arabic Script-Based Languages, pp. 2–9. Stroudsburg, PA: ACL.Google Scholar

Malioutov, I., and Barzilay, R. 2006. Minimum cut model for spoken lecture segmentation. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics (CoLing-ACL 2006), Sydney, Australia.Google Scholar

Mandler, J. 1984. Stories, Scripts, and Scenes: Aspects of Schema Theory. Hillsdale NJ: Lawrence Erlbaum.Google Scholar

Mani, I. 2001. Automatic Summarization. Amsterdam, Netherlands: Benjamins.Google Scholar

Mann, W., and Thompson, S. 1988. Rhetorical structure theory: toward a functional theory of text organization. Text 8 (3), 243–1.Google Scholar

Marcu, D. 1999. A decision-based approach to rhetorical parsing. In Proceedings of ACL'99, Maryland, USA, pp. 365–72.Google Scholar

Marcu, D. 2000. The rhetorical parsing of unrestricted texts: a surface-based approach. Computational Linguistics 26: 395–448.CrossRef Google Scholar

Marcu, D., Carlson, L., and Watanabe, M. 2000. The automatic translation of discourse structures. In Proceedings of the 1st Conference of the North American Chapter of the ACL, Seattle, WA, pp. 9–17.Google Scholar

Marcu, D., and Echihabi, A. 2002. An unsupervised approach to recognizing discourse relations. In Proceedings of ACL'02, Philadelphia, PA, USA.Google Scholar

Marcus, M., Santorini, B., and Marcinkiewicz, M. A. 1993. Building a large-scale annotated corpus of English: the Penn TreeBank. Computational Linguistics 19: 313–30.Google Scholar

Martin, J. 2000. Beyond exchange: appraisal systems in English. In Hunston, S. and Thompson, G. (eds.), Evaluation in Text: Authorial Distance and the Construction of Discourse, pp. 142–75. Oxford, UK: Oxford University Press.CrossRef Google Scholar

Maslennikov, M., and Chua, T.-S. 2007. A multi-resolution framework for information extraction from free text. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pp. 592–99. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

McDonald, R., Crammer, K., and Pereira, F. 2005. Online large-margin training of dependency parsers. In Proceedings of ACL, Michigan, USA. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

McKeown, K. 1985. Text Generation: Using Discourse Strategies and Focus Constraints to Generate Natural Language Texts. Cambridge, UK: Cambridge University Press.CrossRef Google Scholar

McKnight, L., and Srinivasan, P. 2003. Categorization of sentence types in medical abstracts. In Proceedings of the AMIA Annual Symposium, Washington DC, pp. 440–44.Google Scholar

Meyer, T. 2011. Disambiguating temporal-contrastive connectives for machine translation. In Proceedings of the 49th Annual Meeting, Association for Computational Linguistics, Student Session, pp. 46–51. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Mitkov, R. 1999. Introduction: special issue on anaphora resolution in machine translation and multilingual NLP. Machine Translation 14: 159–61.CrossRef Google Scholar

Mizuta, Y., Korhonen, A., Mullen, T., and Collier, N. 2006. Zone analysis in biology articles as a basis for information extraction. International Journal of Medical Informatics 75: 468–87.Google Scholar

Mladová, L., Šárka, Z., and Hajičová, E. 2008. From sentence to discourse: building an annotation scheme for discourse based on the Prague Dependency Treebank. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco.Google Scholar

Moens, M.-F., Uyttendaele, C., and Dumortier, J. 1999. Information extraction from legal texts: the potential of discourse analysis. International Journal of Human-Computer Studies 51: 1155–71.CrossRef Google Scholar

Moore, J. 1995. Participating in Explanatory Dialogues. Cambridge MA: MIT Press.Google Scholar

Moore, J., and Paris, C. 1993. Planning text for advisory dialogues: capturing intentional and rhetorical information. Computational Linguistics 19 (4): 651–95.Google Scholar

Moore, J., and Pollack, M. 1992. A problem for RST: the need for multi-level discourse analysis. Computational Linguistics 18 (4): 537–44.Google Scholar

Moser, M., and Moore, J. 1996. Toward a synthesis of two accounts of discourse structure. Computational Linguistics 22 (3): 409–19.Google Scholar

Nagard, R. L., and Koehn, P. 2010. Aiding pronoun translation with co-reference resolution. In Proceedings of the 5th Joint Workshop on Statistical Machine Translation and Metrics (MATR), Uppsala, Sweden.Google Scholar

Ono, K., Sumita, K., and Miike, S. 1994. Abstract generation based on rhetorical structure extraction. In Proceedings, International Conference on Computational Linguistics (COLING), Kyoto, Japan, pp. 344–48.Google Scholar

Oza, U., Prasad, R., Kolachina, S., Sharma, D. M., and Joshi, A. 2009. The Hindi Discourse Relation Bank. In Proceedings of the 3rd ACL Language Annotation Workshop (LAW III), Singapore.Google Scholar

Palau, R. M., and Moens, M.-F. 2009. Argumentation mining: the detection, classification and structure of arguments in text. In Proceedings of the 12th International Conference on Artificial Intelligence and Law, ICAIL '09, pp. 98–107. New York: ACM.Google Scholar

Pang, B., and Lee, L. 2005. Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of ACL, pp. 115–24. Stroudsburg PA: ACL.Google Scholar

Pang, B., Lee, L., and Vaithyanathan, S. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 79–86. Stroudsburg PA: Association for Computational Linguistics.Google Scholar

Paris, C. 1988. Tailoring object descriptions to a user's level of expertise. Computational Linguistics 14 (3), 64–78.Google Scholar

Pasch, R., Brausse, U., Breindl, E., and Wassner, U. 2003. Handbuch der Deutschen Konnektoren. Berlin, Germany: Walter de Gruyter.CrossRef Google Scholar

Patwardhan, S., and Riloff, E. 2007. Effective information extraction with semantic affinity patterns and relevant regions. In Proceedings of the 2007 Conference on Empirical Methods in Natural Language Processing (EMNLP-07), Prague, Czech Republic.Google Scholar

Petukhova, V., and Bunt, H. 2009. Towards a multidimensional semantics of discourse markers in spoken dialogue. In Proceedings, 8th International Conference on Computational Semantics, Tilburg, The Netherlands, pp. 157–68.Google Scholar

Petukhova, V., Préevot, L., and Bunt, H. 2011. Multi-level discourse relations in dialogue. In Proceedings, 6th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-6), Oxford, UK, pp. 18–27.Google Scholar

Pitler, E., and Nenkova, A. 2009. Using syntax to disambiguate explicit discourse connectives in text. In Proceedings of the 47th Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing (ACL-IJCNLP '09), Singapore.Google Scholar

Pitler, E., Raghupathy, M., Mehta, H., Nenkova, A., Lee, A., and Joshi, A. 2008. Easily identifiable discourse relations. In Proceedings, International Conference on Computational Linguistics (COLING), Manchester, UK.Google Scholar

Poesio, M., Stevenson, R., Eugenio, B. D., and Hitzeman, J. 2004. Centering: a parametric theory and its instantiations. Computational Linguistics 30: 309–63.CrossRef Google Scholar

Polanyi, L., Culy, C., van den Berg, M., Thione, G. L., and Ahn, D. 2004a. A rule-based approach to discourse parsing. In Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue, p. 10. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Polanyi, L., Culy, C., van den Berg, M., Thione, G. L., and Ahn, D. 2004b. Sentential structure and discourse parsing. In Proceedings of the ACL 2004 Workshop on Discourse Annotation, Barcelona, Spain.Google Scholar

Polanyi, L., and Zaenen, A. 2004. Contextual valence shifters. In Proceedings of AAAI Spring Symposium on Attitude, Stanford CA, USA, p. 10.Google Scholar

Prasad, R., Dinesh, N., Lee, A., Joshi, A., and Webber, B. 2007. Attribution and its annotation in the Penn Discourse TreeBank. TAL (Traitement Automatique des Langues) 47 (2): 43–63.Google Scholar

Prasad, R., Dinesh, N., Lee, A., Miltsakaki, E., et al. . 2008. The Penn Discourse Treebank 2.0. In Proceedings of the 6th International Conference on Language Resources and Evaluation, Morocco.Google Scholar

Prasad, R., Joshi, A., and Webber, B. 2010a. Exploiting scope for shallow discourse parsing. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta.Google Scholar

Prasad, R., Joshi, A., and Webber, B. 2010b. Realization of discourse relations by other means: alternative lexicalizations. In Proceedings, International Conference on Computational Linguistics (COLING). Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Prasad, R., McRoy, S., Frid, N., Joshi, A., and Yu, H. 2011. The Biomedical Discourse Relation Bank. BMC Bioinformatics 12 (188): 18. http://www.biomedcentral.com/1471-2015/12/188 Google Scholar

Propp, V. 1968. The Morphology of the Folktale, 2nd ed.Austin TX: University of Texas Press. Publication of the American Folklore Society, Inc., Bibliographical & Special Series.Google Scholar

Purver, M. 2011. Topic segmentation. In: Tur, G. and de Mori, R. (eds.), Spoken Language Understanding: Systems for Extracting Semantic Information from Speech. Hoboken NJ: Wiley. Chapter 11, doi:1002/9781119992691.ch11.Google Scholar

Purver, M., Griffiths, T., Körding, K. P., and Tenenbaum, J. 2006. Unsupervised topic modelling for multi-party spoken discourse. In Proceedings, International Conference on Computational Linguistics (COLING) and the Annual Meeting of the Association for Computational Linguistics, pp. 17–24. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Pustejovsky, J., Meyers, A., Palmer, M., and Poesio, M. 2005. Merging PropBank, NomBank, TimeBank, Penn Discourse Treebank and Coreference. In CorpusAnno '05: Proceedings of the Workshop on Frontiers in Corpus Annotations II, pp. 5–12. Stroudsburg, PA: Association for Computational Linguistics.CrossRef Google Scholar

Ruch, P., Boyer, C., Chichester, C., Tbahriti, I., Geissbühler, A., Fabry, P., et al. 2007. Using argumentation to extract key sentences from biomedical abstracts. International Journal of Medical Informatics 76 (2–3): 195–200.Google Scholar

Rumelhart, D. 1975. Notes on a schema for stories. In Bobrow, D. and Collins, A. (eds.), Representation and Understanding: Studies in Cognitive Science, pp. 211–36. New York: Academic Press.Google Scholar

Sagae, K. 2009. Analysis of discourse structure with syntactic dependencies and data-driven shift-reduce parsing. In Proceedings of IWPT 2009, Paris, France.Google Scholar

Sagae, K., and Lavie, A. 2005. A classifier-based parser with linear run-time complexity. In Proceedings of IWPT 2005, Vancouver, British Columbia.Google Scholar

Say, B., Zeyrek, D., Oflazer, K., and Özge, U. 2004. Development of a corpus and a treebank for present day written Turkish. In Current Research in Turkish Linguistics, 11th International Conference on Turkish Linguistics (ICTL 2002), Eastern Mediterranean University, Northern Cyprus, pp. 183–92.Google Scholar

Schank, R., and Abelson, R. 1977. Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures. Hillsdale NJ: Lawrence Erlbaum.Google Scholar

Schilder, F. 2002. Robust discourse parsing via discourse markers, topicality and position. Natural Language Engineering 8 (3): 235–55.Google Scholar

Sibun, P. 1992. Generating text without trees. Computational Intelligence, 8 (1): 102–22.Google Scholar

Soricut, R., and Marcu, D. 2003. Sentence level discourse parsing using syntactic and lexical information. In Proceedings of HLT/NAACL 2003, Edmonton, Canada.Google Scholar

Sporleder, C., and Lascarides, A. 2008. Using automatically labelled examples to classify rhetorical relations: a critical assessment. Natural Language Engineering 14 (3): 369–416.CrossRef Google Scholar

Stede, M. 2004. The Potsdam Commentary Corpus. In ACL Workshop on Discourse Annotation. Stroudsburg, PA: ACL.Google Scholar

Stede, M. 2008a. Disambiguating rhetorical structure. Research on Language and Computation 6: 311–32.Google Scholar

Stede, M. 2008b. RST revisited: disentangling nuclearity. In Fabricius-Hansen, C. and Ramm, W. (eds.), Subordination versus Coordination in Sentence and Text, pp. 33–58. Amsterdam, Netherlands: John Benjamins.Google Scholar

Subba, R., and Eugenio, B. D. 2009. An effective discourse parser that uses rich linguistic information. In Proceedings of NAACL '09, pp. 566–74. Stroudsburg, PA: Association for Computational Linguistics.Google Scholar

Subba, R., Eugenio, B. D., and Kim, S. N. 2006. Discourse parsing: learning FOL rules based on rich verb semantic representations to automatically label rhetorical relations. In Proceedings of the EACL 2006 Workshop on Learning Structured Information in Natural Language Applications, Trento, Italy.Google Scholar

Sweetser, E. 1990. From Etymology to Pragmatics. Metaphorical and Cultural Aspects of Semantic Structure. Cambridge, UK: Cambridge University Press.Google Scholar

Taboada, M., Brooke, J., and Stede, M. 2009. Genre-based paragraph classification for sentiment analysis. In Proceedings of SIGDIAL 2009, London, UK, pp. 62–70.Google Scholar

Taboada, M., and Mann, W. 2006. Applications of rhetorical structure theory. Discourse Studies 8: 567–88.Google Scholar

Tamames, J., and de Lorenzo, V. 2010. EnvMine: a text-mining system for the automatic extraction of contextual information. BMC Bioinformatics 11: 294.Google Scholar

Teufel, S., and Moens, M. 2002. Summarizing scientific articles – experiments with relevance and rhetorical status. Computational Linguistics 28: 409–45.Google Scholar

Teufel, S., Siddharthan, A., and Batchelor, C. 2009. Towards discipline-independent argumentative zoning: evidence from chemistry and computational linguistics. In Proceedings, Conference on Empirical Methods in Natural Language Processing, Singapore, pp. 1493–502.

Thione, G., van den Berg, M., Polanyi, L., and Culy, C. 2004. Hybrid text summarization: combining external relevance measures with structural analysis. In Proceedings of the ACL 2004 Workshop Text Summarization Branches Out, Barcelona, Spain. Stroudsburg, PA: ACL.

Tonelli, S., Riccardi, G., Prasad, R., and Joshi, A. 2010. Annotation of discourse relations for conversational spoken dialogs. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta.

Toolan, M. 2006. Narrative: linguistic and structural theories. In Brown, K. (ed.), Encyclopedia of Language and Linguistics, 2nd ed., pp. 459–73. Amsterdam, Netherlands: Elsevier.

Turney, P. 2002. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 417–24. Stroudsburg, PA: Association for Computational Linguistics.

Uzêda, V. R., Pardo, T. A. S., and Nunes, M. D. G. V. 2010. A comprehensive comparative evaluation of RST-based summarization methods. ACM Transactions on Speech and Language Processing 6: 1–20.

van der Vliet, N., Berzlánovich, I., Bouma, G., Egg, M., and Redeker, G. 2011. Building a discourse-annotated Dutch text corpus. In Dipper, S. and Zinsmeister, H. (eds.), Bochumer Linguistische Arbeitsberichte, 157–71.

Versley, Y. 2010. Discovery of ambiguous and unambiguous discourse connectives via annotation projection. In Workshop on the Annotation and Exploitation of Parallel Corpora (AEPC), NODALIDA, Tartu, Estonia.

Voll, K., and Taboada, M. 2007. Not all words are created equal: extracting semantic orientation as a function of adjective relevance. In Proceedings of the 20th Australian Joint Conference on Artificial Intelligence, Gold Coast, Australia, pp. 337–46.

Walker, M., Stent, A., Mairesse, F., and Prasad, R. 2007. Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research 30: 413–56.

Wang, L., Lui, M., Kim, S. N., Nivre, J., and Baldwin, T. 2011. Predicting thread discourse structure over technical web forums. In Proceedings, Conference on Empirical Methods in Natural Language Processing, Edinburgh, Scotland, pp. 13–25.

Webber, B. 1991. Structure and ostension in the interpretation of discourse deixis. Language and Cognitive Processes 6 (2): 107–35.

Webber, B. 2006. Accounting for discourse relations: constituency and dependency. In Butt, M., Dalrymple, M., and King, T. (eds.), Intelligent Linguistic Architectures, pp. 339–60. Stanford, CA: CSLI.

Webber, B. 2009. Genre distinctions for discourse in the Penn TreeBank. In Proceedings of the Joint Conference of the 47th Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing, Suntec, Singapore.

Wellner, B. 2008. Sequence Models and Ranking Methods for Discourse Parsing. PhD thesis, Brandeis University, Waltham, MA, USA.

Wellner, B., and Pustejovsky, J. 2007. Automatically identifying the arguments of discourse connectives. In Proceedings of the 2007 Conference on Empirical Methods in Natural Language Processing (EMNLP-07), Prague, Czech Republic.

Wolf, F., and Gibson, E. 2005. Representing discourse coherence: a corpus-based study. Computational Linguistics 31: 249–87.

Woods, W. 1968. Procedural semantics for a question-answering machine. In Proceedings of the AFIPS National Computer Conference, pp. 457–71. Montvale, NJ: AFIPS Press.

Zeyrek, D., Demirşahin, I., Sevdik-Çallı, A., Ögel Balaban, H., İhsan, Y., and Turan, Ü. D. 2010. The annotation scheme of the Turkish discourse bank and an evaluation of inconsistent annotations. In Proceedings of the 4th Linguistic Annotation Workshop (LAW IV), Uppsala, Sweden.

Zeyrek, D., Turan, Ü. D., Bozsahin, C., Çakıcı, R., et al. 2009. Annotating subordinators in the Turkish discourse bank. In Proceedings of the 3rd Linguistic Annotation Workshop (LAW III), Singapore.

Zeyrek, D., and Webber, B. 2008. A discourse resource for Turkish: annotating discourse connectives in the METU corpus. In Proceedings of the 6th Workshop on Asian Language Resources (ALR6), Hyderabad, India.
