This repository contains the source code of our paper “A Unified Linear-Time Framework for Sentence-Level Discourse Parsing” in ACL 2019.
Getting Started These instructions will help you to run our unified discourse parser based on RST dataset.
Prerequisites * PyTorch 0.4 or higher * Python 3 * AllenNLP Dataset We train and evaluate the model with the standard RST Discourse Treebank (RST-DT) corpus. * Segmenter: we utilize all 7673 sentences for training and 991 sentences for testing.
About This package includes:
A discourse segmenter A discourse parser Evaluation metrics for discourse parsing Download Document-level Discourse Parser for English
Demo Link
Installation Required for the discourse segmenter:
Charniak’s reranking parser. Put it in Tools/CharniakParserRerank and install it. Taggers from UIUC. Download POS tagger and shallow chunker [LBJPOS.jar, LBJChunk.jar, LBJ2.jar, LBJ2Library.jar] and put these in Tools/UIUC_TOOLs/ Install scikit-learn and scipy (instructions) Install java if not installed (instructions for Ubuntu) Make sure the Tools/SPADE_UTILS/bin/edubreak is set to executable.