A fully unsupervised mining method that can build synthetic parallel data for unsupervised machine translation.
A novel strategy to improve unsupervised MT by using back-translation with multiple models.
A simple way to improve performance on many NMT tasks by using multiple backward and forward models.
A novel attention mechanism that aggregates hierarchical structures to encode constituency trees for downstream tasks.