Domain Adaptation with Adversarial Training and Graph Embeddings

Abstract

In recent years, there has been growing interest in deep neural networks (DNNs) and representation learning, with applications to a myriad of NLP and data mining problems. The success of DNNs depends heavily on the availability of labeled data. However, obtaining labeled data is a major challenge in many real-world problems. In such cases, a DNN model can leverage labeled and unlabeled data from a related domain, but it has to deal with the shift in data distributions between the domains. In this paper, we study the problem of classifying social media posts during a crisis event (e.g., an earthquake). To do so, we use labeled and unlabeled data from past similar events (e.g., a flood) and unlabeled data for the current event. We propose a novel model that performs adversarial-learning-based domain adaptation to deal with distribution drifts and graph-based semi-supervised learning to leverage unlabeled data within a single unified deep learning framework. Our experiments with two real-world crisis datasets collected from Twitter demonstrate significant improvements over several baselines.

Publication
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL’18)