CE7455: Deep Learning for Natural Language Processing: From Theory to Practice

NTU-NLP

Natural Language Processing lab of NTU.

Course Objectives

Natural Language Processing (NLP) is one of the most important fields in Artificial Intelligence (AI). It has become crucial in the information age because most information is in the form of unstructured text. NLP technologies are applied wherever people communicate in language: translation, web search, customer support, email, forums, advertising, and radiology reports, to name a few.

There are a number of core NLP tasks and machine learning models behind NLP applications. Deep learning, a sub-field of machine learning, has recently brought a paradigm shift from traditional task-specific feature engineering to end-to-end systems and has obtained high performance across many different NLP tasks and downstream applications. Tech companies like Google, Baidu, Alibaba, Apple, Amazon, Facebook, Tencent, and Microsoft are now actively working on deep learning methods to improve their products. For example, Google recently replaced its traditional statistical machine translation and speech-recognition systems with systems based on deep learning methods.

Optional Textbooks

Deep Learning by Goodfellow, Bengio, and Courville (free online)
Machine Learning: A Probabilistic Perspective by Kevin Murphy (online)
Natural Language Processing by Jacob Eisenstein (free online)
Speech and Language Processing by Dan Jurafsky and James H. Martin (3rd ed. draft)

Intended Learning Outcomes

In this course, students will learn state-of-the-art deep learning methods for NLP. Through lectures and practical assignments, students will learn the tricks needed to make their models work on practical problems. They will learn to implement, and possibly invent, their own deep learning models using available deep learning libraries such as PyTorch.

Our Approach

Thorough and Detailed: How to write, debug, and train deep neural models from scratch.
State of the art: Most lecture material comes from research published in the past 1-5 years.
Practical: Focus on practical techniques for training models, including on GPUs.
Fun: Cover exciting new advancements in NLP (e.g., Transformer, BERT).

Assessment Approach

Weekly Workload

Every two-hour lecture will be accompanied by practice problems implemented in PyTorch.
There will be a 30-minute office hour each week to discuss the assignments and the project.
There will be some invited talks from NLP researchers (see the schedule).
Class participation accounts for 5% of the total marks.

Assignments (individually graded)

There will be three (3) assignments contributing to 3 * 15% = 45% of the total assessment.
Assignments will be posted on NTU-Learn (see the schedule).
Late day policy
- 2 free late days; afterwards, 10% off per day late
- Not accepted after 3 late days
Students will be graded individually on the assignments. They may discuss the homework assignments with each other, but they are required to submit individual write-ups and coding exercises.
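
The late-day policy above can be read as a simple score multiplier. The sketch below is one possible reading, not an official grading script: it assumes late days are counted in whole days per assignment and that each charged day deducts 10% of the earned score. The function name and its parameters are hypothetical.

```python
def late_penalty_multiplier(days_late, free_days=2, penalty_per_day=0.10, max_days=3):
    """Return the fraction of the earned score kept after the late penalty.

    Assumptions (not stated on this page): days are whole days counted per
    assignment, and the 10% is taken off the earned score for each late day
    beyond the free ones.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > max_days:
        return 0.0  # not accepted after 3 late days
    charged_days = max(0, days_late - free_days)
    return 1.0 - penalty_per_day * charged_days
```

Under these assumptions, a submission that is two days late keeps its full score, a submission that is three days late keeps 90% of it, and anything later is not accepted.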

Final Project (Group work but individually graded)

There will be a final project contributing to the remaining 50% of the total course-work assessment.
- 1–3 people per group
- Project proposal: 5%, presentation: 10%, report: 35%
Instructions for project proposal and final report will be posted on NTU-Learn (see the schedule).
The project may be done in a group or individually, depending on the student’s preference. Students will be graded individually. The final project presentation will be used to verify each student’s understanding of the project.
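
As a quick check, the weights stated in the assessment sections (5% participation, 3 * 15% assignments, and 5% + 10% + 35% for the project) sum to 100%. The sketch below shows how an overall percentage could be computed from them; the component names and the `course_total` function are hypothetical and only illustrate the arithmetic.

```python
# Percentage weights as stated in the assessment sections above.
# The dictionary keys are illustrative names, not official identifiers.
WEIGHTS = {
    "participation": 5,
    "assignment_1": 15,
    "assignment_2": 15,
    "assignment_3": 15,
    "project_proposal": 5,
    "project_presentation": 10,
    "project_report": 35,
}


def course_total(scores):
    """Return the overall percentage given per-component scores in [0, 1]."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# The stated weights cover the whole course assessment.
assert sum(WEIGHTS.values()) == 100
```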

Course Prerequisites

Proficiency in Python (using numpy and PyTorch). There is a lecture for those who are not familiar with Python.
College Calculus, Linear Algebra
Basic Probability and Statistics
Machine Learning basics

Schedule & Course Content

Week 13: In-class Project presentation

02:30 PM - 5:30 PM 13 April 2022 Online (Teams)

Project final report guidelines

Project presentation: 10-12 min/group

Week 12: Meta & Multi-task Learning for NLP

6 April 2022 Online (Teams), NTU, Singapore

Assignment 3 in

Invited talk on Unsupervised MT by Xuan Phi

Lecture Slide

Multi-task learning
Fine-Tuning for Transfer Learning
Meta-learning problem
Two views of Meta-learning Problem
Black-box meta learning (GPT-3)
Optimization-based meta learning (MAML)
Non-parametric meta learning (prototypical nets)

Week 11: GANs, Adversarial NLP and Deep Generative Models

02:30 PM - 5:30 PM 30 March 2022 Online (Teams), NTU, Singapore

Slides on adversarial nets

Slides on adversarial attacks (prepared by Samson@Amazon)

Slides on deep generative models

Invited talk on Text Generation by Lin Xiang

Lecture Content

Generative adversarial nets (GANs)
Domain adversarial nets (DANs)
Adversarial attacks in NLP
Defense:
- Training with adversarial examples
- Consistency regularization
- Cross-view consistency
Variational inference
Auto encoders
Variational auto encoders
Conditional VAEs
Vector Quantized VAEs
Variational Generative adversarial nets

Suggested Readings

Week 10: Contextual embeddings and self-supervised learning

02:30 PM - 5:30 PM 23 March 2022 Online (Teams), NTU, Singapore

Lecture Slide

Assignment 2 in

Assignment 3 out

Project final report guidelines

Lecture Content

Pre-training and fine-tuning paradigm
- CoVe
- TagLM
- ELMo
- GPT
- ULMFiT
- BERT (+ mBERT)
- XLM
- XLNet
- BART (+ mBART)
- T5 (+ mT5)
Evaluation benchmarks
- GLUE
- SQuAD
- NER
- SuperGLUE
- XNLI

TA: Mathieu

Pre-train Fine-tune with HF

Suggested Readings

Week 9: Seq2Seq Variants and Transformer

02:30 PM - 5:30 PM 16 March 2022 Online (Teams), NTU, Singapore

Lecture Slide

Project Proposal due

Lecture Content

Seq2Seq Variants (Pointer nets, Pointer Generator Nets)
- Machine Translation
- Summarization
- Parsing
- Image/video captioning
Transformer architecture
- Self-attention
- Positional encoding
- Multi-head attention

Practical exercise with PyTorch

TA: Bosheng Ding

The Annotated Transformer

Suggested Readings

Week 8: Seq2Seq, Attention, Subwords

02:30 PM - 5:30 PM 9 March 2022 Online (Teams), NTU, Singapore

Lecture Slide

Project Proposal due

Lecture Content

Information bottleneck issue with vanilla Seq2Seq
Attention to the rescue
Details of attention mechanism
Attention variants
Morphology in MT
Subword level models

Practical exercise with PyTorch

TA: Xuan-Phi (NTU-NLP)

Neural machine translation tutorial in PyTorch

Suggested Readings

Recess Week (make-up class for CNY): Machine translation and Seq2Seq Models

02:30 PM - 5:30 PM 2 March 2022 Online (Teams), NTU, Singapore

Lecture Slide

Tutorial 4

Lecture Content

Machine translation
- Early days (1950s)
- Statistical machine translation or SMT (1990-2010)
- Alignment in SMT
- Decoding in SMT
Neural machine translation or NMT (2014 - )
Encoder-decoder model for NMT
Advantages and disadvantages of NMT
Greedy vs. beam-search decoding
MT evaluation
Other applications of Seq2Seq
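
The difference between greedy and beam-search decoding listed above can be seen on a toy example. The sketch below uses a hand-made probability table in place of a trained NMT model (the table, its tokens, and the function name are all hypothetical) and decodes a fixed number of steps. Greedy decoding is beam search with a beam size of 1.

```python
import math

# Hypothetical next-token probabilities, keyed by the prefix decoded so far.
TOY_MODEL = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.5, "y": 0.5},
    ("B",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, beam_size, length):
    """Return the best (sequence, log_probability) after `length` steps."""
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for token, prob in model[prefix].items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        # Keep only the highest-scoring partial hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0]


greedy_seq, greedy_score = beam_search(TOY_MODEL, beam_size=1, length=2)
beam_seq, beam_score = beam_search(TOY_MODEL, beam_size=2, length=2)
```

Greedy decoding commits to "A" because it is the most probable first token, and ends with a sequence of probability 0.6 * 0.5 = 0.30. With a beam of 2, the hypothesis starting with "B" survives the first step and wins with probability 0.4 * 0.9 = 0.36.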

Suggested Readings

Week 7: Recurrent Neural Nets

02:30 PM - 5:30 PM 23 February 2022 Online (Teams), NTU, Singapore

Lecture Slide

Tutorial 3

Assignment 1 in

Assignment 2 out (in NTU-Learn)

Lecture Content

Basic RNN structures
Language modeling with RNNs
Backpropagation through time
Text generation with RNN LM
Issues with Vanilla RNNs
Exploding gradient
Gated Recurrent Units (GRUs) and LSTMs
Bidirectional RNNs
Multi-layer RNNs
Sequence labeling with RNNs
Sequence classification with RNNs

TA: Saiful Bari

Practical exercise with PyTorch

Named Entity Recognition

Suggested Readings

Week 6: Window-based methods & CNNs

02:30 PM - 5:30 PM 16 February 2022 Online (Teams), NTU, Singapore

Lecture Slide

Tutorial 3

Lecture Content

Classification tasks in NLP
Window-based Approach for language modeling
Window-based Approach for NER, POS tagging, and Chunking
Convolutional Neural Net for NLP
Max-margin Training
Scaling Softmax (Adaptive input & output)

Practical exercise with PyTorch

CNN for word encoding

Suggested Readings

Week 5: Word Vectors

02:30 PM - 5:30 PM 9 February 2022 Online (Teams), NTU, Singapore

Lecture Slide

Project Proposal Instructions

Tutorial 2

Assignment 1 out (in NTU-Learn)

Lecture Content

Word meaning
Denotational semantics
Distributed representation of words
Word2Vec models (Skip-gram, CBOW)
Negative sampling
GloVe
FastText
Evaluating word vectors
- Intrinsic evaluation
- Extrinsic evaluation
Cross-lingual word vectors

TA: Mathieu

Practical exercise with PyTorch

Skip-gram training
Visualization

Suggested Readings

Week 4: Chinese New Year

02:30 PM - 5:30 PM 2 February 2022 Online (Teams), NTU, Singapore

Chinese New Year (No Lecture)

Week 3: Neural Network & Optimization Basics

02:30 PM - 5:30 PM 26 January 2022 Online (Teams), NTU, Singapore

Lecture Slide

Tutorial 2

Lecture Content

Why Deep Learning for NLP?
From Logistic Regression to Feed-forward NN
- Activation functions
SGD with Backpropagation
Adaptive SGD (Adagrad, Adam, RMSProp)
Regularization (Weight Decay, Dropout, Batch normalization, Gradient clipping)
Introduction to Word Vectors

TA: Mathieu

Practical exercise with PyTorch

PyTorch notebook

Backpropagation
Dropout
Batch normalization
Initialization
Gradient clipping
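
As an illustration of the gradient clipping item above, the sketch below rescales a flat list of gradient values so that their L2 norm does not exceed a threshold. It is a dependency-free toy, and the function name is hypothetical; PyTorch offers `torch.nn.utils.clip_grad_norm_` for the same purpose on model parameters.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Return (clipped_grads, original_norm) for a flat list of gradients.

    If the L2 norm exceeds max_norm, every gradient is scaled by the same
    factor so the direction is preserved and the norm equals max_norm.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

Here the gradient (3, 4) has norm 5, so it is scaled down to (0.6, 0.8), which has norm 1. Gradients already within the threshold are returned unchanged.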

Suggested Readings

Week 2: Machine Learning Basics

02:30 PM - 5:30 PM 19 January 2022 Online (Teams), NTU, Singapore

Lecture Slide

Tutorial 1

Lecture Content

What is Machine Learning?
Supervised vs. unsupervised learning
Linear Regression
Logistic Regression
Multi-class classification
Parameter estimation (MLE & MAP)
Gradient-based optimization & SGD

TA: Mathieu

Practical exercise with PyTorch

Deep learning with PyTorch
Linear Regression
Logistic Regression
NumPy notebook
[Supplementary]
- Numerical programming with PyTorch - PyTorch intro

Week 1: Introduction

02:30 PM - 5:30 PM 12 January 2022 Online (Teams), NTU, Singapore

Lecture Slide

Lecture Content

What is Natural Language Processing?
Why is language understanding difficult?
What is Deep Learning?
Deep learning vs. other machine learning methods?
Why deep learning for NLP?
Applications of deep learning to NLP
Knowing the target group (background, field of study, programming experience)
Expectations of the course

Python & PyTorch Basics

Programming in Python
- Jupyter Notebook and Google Colab
- Introduction to Python
- Deep Learning Frameworks
- Why PyTorch?
- Deep learning with PyTorch
[Supplementary]
- Numerical programming with NumPy/SciPy - NumPy intro
- Numerical programming with PyTorch - PyTorch intro

Further Reading: Deep Reinforcement Learning for NLP

1 January 2022 Online (Teams), NTU, Singapore

What is RL?
Key concepts: Rewards, Policy, Value Function
What is Deep RL?
Policy-based Deep RL
- Deep Policy Network
- Policy Gradient
Deep Q-Learning
Applications of Deep RL in NLP
- Abstractive summarization
- Dialogue generation
- Question answering
- Multimodal (image and video captioning)
- Machine translation

Teaching

Instructor

Shafiq Rayhan Joty

Teaching Assistants

Bosheng Ding

Qin Chengwei

Chen Hailin

M Saiful Bari

Mathieu Ravaut

Xuan Phi Nguyen

Ruochen Zhao
