Resources

GlobalWoZ: Globalizing MultiWoZ to Develop Multilingual Task-Oriented Dialogue Systems

Multilingual ToD

Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation

A novel training objective for text generation

LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5

A unified framework for lifelong few-shot language learning (LFLL) based on prompt tuning of T5.

Rethinking Self-Supervision Objectives for Generalizable Coherence Modeling

We show empirically that increasing the density of negative samples improves the basic model, and using a global negative queue further improves and stabilizes the model while training with hard negative samples.

A Unified Speaker Adaptation Approach for ASR

A unified speaker adaptation approach consisting of feature adaptation and model adaptation for ASR.

MulDA: A Multilingual Data Augmentation Framework for Low-Resource Cross-Lingual NER

A data augmentation method for NER.

RST Parsing from Scratch

A novel top-down end-to-end formulation of document level discourse parsing in the Rhetorical Structure Theory (RST) framework.

UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP

We propose UXLA, a novel data augmentation framework for self-supervised learning in zero-resource transfer learning scenarios.

Rethinking Coherence Modeling: Synthetic vs. Downstream Tasks

Coherence models are typically evaluated only on synthetic tasks, which may not be representative of their performance in downstream applications. To investigate how representative the synthetic tasks are of downstream use cases, we benchmark well-known traditional and neural coherence models on synthetic sentence ordering tasks and contrast the results with their performance on three downstream applications: coherence evaluation for MT, coherence evaluation for summarization, and next utterance prediction in retrieval-based dialog.

DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks

Data augmentation for low-resource tagging.

Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test-suite

An extensive, targeted dataset that can be used as a test suite for pronoun translation into English, covering multiple source languages and different pronoun errors drawn from real system translations.

A Unified Neural Coherence Model

A unified coherence model that incorporates sentence grammar, inter-sentence coherence relations, and global coherence patterns into a common neural framework.

Hierarchical Pointer Net Parsing

Hierarchical pointer network parsers applied to dependency and sentence-level discourse parsing tasks.

Discourse Processing and Its Applications --- Tutorial at ACL-2019

Discourse processing is a suite of Natural Language Processing (NLP) tasks to uncover linguistic structures from texts at several levels, which can support many NLP applications.

Unsupervised Word Translation

Adversarial Autoencoder with Cycle Consistency and Improved Training

Malay-English Neural Machine Translation System.

A demo of a Malay-English machine translation system.

Discourse Processing and Its Applications in Text Mining --- Tutorial at ICDM-2018

Discourse processing is a suite of Natural Language Processing (NLP) tasks to uncover linguistic structures from texts at several levels, which can support many text mining applications.

Coherence Modeling of Asynchronous Conversations

A neural approach for modeling the coherence of asynchronous conversations.

A Unified Linear-Time Framework for Sentence-Level Discourse Parsing

Community Question Answering System

This search tool helps you to find good answers to your question by searching through previously asked questions in the Qatarliving forum.

Deep Learning for Crisis Computing

Python implementation of a number of deep neural networks classifiers for the classification of crisis-related data on Twitter.

Discourse Parser for English

Discourse-informed Sen2Vec

CON-S2V: A Generic Framework for Incorporating

LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space

This paper uses a semi-supervised algorithm to show that bilingual lexicon induction (BLI) works better with non-linear mapping, especially for low-resource languages.

Neural Domain Adaptation Model for Machine Translation

Neural Local Coherence Model

Neural coherence for monologue

Recurrent Neural Models for Fine-grained Opinion Analysis

Speech act recognizer for synchronous and asynchronous conversations

This resource addresses the problem of speech act recognition in written asynchronous conversations

Topic Segmenter & Labeler for Asynchronous Conversations

This parser builds a discourse tree by applying an optimal parsing algorithm to probabilities inferred from two Conditional Random Fields: one for intra-sentential parsing and the other for multi-sentential parsing.
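The idea of running an optimal parsing algorithm over inferred probabilities can be sketched as follows. This is a minimal illustration, not the parser's actual code: it assumes the "optimal parsing algorithm" is a CKY-style dynamic program over binary trees, it leaves out relation labels and nuclearity, and it replaces the two Conditional Random Fields with a hypothetical callback `span_prob(i, k, j)` that returns the probability of joining units `i..k` with units `k+1..j`.

```python
from functools import lru_cache
from math import log


def best_tree(n, span_prob):
    """Find the most probable binary tree over discourse units 0..n-1.

    span_prob(i, k, j) is a hypothetical stand-in for the CRF output: the
    probability that units i..k and k+1..j are joined into one constituent.
    Returns (log_probability, tree), where a leaf is a unit index and an
    internal node is a (left, right) tuple.
    """

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == j:
            return 0.0, i  # a single unit is a leaf with log-probability 0
        best = None
        for k in range(i, j):  # try every split point inside the span
            left_score, left_tree = solve(i, k)
            right_score, right_tree = solve(k + 1, j)
            score = left_score + right_score + log(span_prob(i, k, j))
            if best is None or score > best[0]:
                best = (score, (left_tree, right_tree))
        return best

    return solve(0, n - 1)


# Toy probabilities for three units (made-up numbers, for illustration only).
toy_probs = {(0, 0, 1): 0.9, (1, 1, 2): 0.2, (0, 0, 2): 0.5, (0, 1, 2): 0.5}
score, tree = best_tree(3, lambda i, k, j: toy_probs[(i, k, j)])
print(tree)  # prints ((0, 1), 2)
```

With these toy numbers, units 0 and 1 are attached first because their joining probability (0.9) is much higher than that of units 1 and 2 (0.2). Memoizing `solve` keeps the search cubic in the number of units instead of enumerating every tree.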

SegBot: A Generic Neural Text Segmentation Model with Pointer Network

Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation

A fully unsupervised mining method that can build synthetic parallel data for unsupervised machine translation.

Cross-model Back-translated Distillation for Unsupervised Machine Translation

A novel strategy to improve unsupervised MT by using back-translation with multiple models.

Data Diversification: A Simple Strategy For Neural Machine Translation

A simple way to boost many NMT tasks by using multiple backward and forward models.

Differentiable Window for Dynamic Local Attention

This resource contains the source code of our ACL-2020 paper entitled Differentiable Window for Dynamic Local Attention

Efficient Constituency Parsing by Pointing

This resource contains the source code of our ACL-2020 paper entitled Efficient Constituency Parsing by Pointing

Tree-Structured Attention with Hierarchical Accumulation

A novel attention mechanism that aggregates hierarchical structures to encode constituency trees for downstream tasks.