Discourse Processing and Its Applications in Text Mining --- Tutorial at ICDM-2018

Time: 18 Nov 3:40 - 5:40
Location: Virgo IV (Resorts World Sentosa), Singapore.

Tutors

Shafiq Joty, Giuseppe Carenini, Raymond T Ng, Gabriel Murray

Tutorial Abstract

Discourse processing is a suite of Natural Language Processing (NLP) tasks that uncover linguistic structures in texts at several levels, and these structures can support many text mining applications. The tasks involve identifying the topic structure, the coherence structure, the coreference structure, and, for conversational discourse, the conversation structure. Taken together, these structures can inform text summarization, essay scoring, sentiment analysis, machine translation, information extraction, question answering, and thread recovery. The tutorial starts with an overview of basic concepts in discourse analysis: monologue vs. conversation, synchronous vs. asynchronous conversation, and key linguistic structures in discourse analysis. It then covers traditional machine learning methods along with the most recent work using deep learning, and compares their performance on benchmark datasets. For each discourse structure we describe, we show its applications in downstream text mining tasks. Methods and metrics for evaluation are discussed in detail. We conclude the tutorial with an interactive discussion of future challenges and opportunities.

Tutorial Outline

Introduction

Discourse & its different forms
Linguistic structures in discourse & discourse analysis tasks
Applications of discourse analysis

Discourse Parsing & Its Applications

Discourse annotations
Discourse parsing with RST
Discourse parsing in PDTB
Applications of Discourse Parsing

Coffee Break

Coherence Models & Their Applications

Coherence models for Texts
Coherence models for Conversations
Applications (Evaluation tasks)

Conversational Structures

Discourse Structures in Conversations
Thread identification models for synchronous & asynchronous conversations
Speech act recognition models for synchronous & asynchronous conversations
Evaluation & Applications

Conclusions & Future Challenges

Learning from limited annotated data
Language & domain transfer
New emerging applications
