-
Long Short-Term Memory Networks in PyTorch
PyTorch tutorial for LSTMs.
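As a quick primer for the tutorial above, here is a minimal sketch of PyTorch's built-in nn.LSTM module; the batch, sequence, and feature sizes are illustrative choices, not values from the tutorial.

```python
import torch
import torch.nn as nn

# Illustrative sizes (not taken from the tutorial above).
batch_size, seq_len, input_size, hidden_size = 4, 10, 8, 16

# batch_first=True -> inputs are shaped (batch, seq, feature).
lstm = nn.LSTM(input_size, hidden_size, batch_first=True)

x = torch.randn(batch_size, seq_len, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # torch.Size([4, 10, 16]): hidden state at every step
print(h_n.shape)     # torch.Size([1, 4, 16]): final hidden state
print(c_n.shape)     # torch.Size([1, 4, 16]): final cell state
```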
-
The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy
Classic blog post describing RNNs and LSTMs.
-
Understanding LSTM Networks, Christopher Olah
Blog post describing LSTMs with lots of visuals.
-
The Illustrated Transformer, Jay Alammar
High-level explanation of transformers.
-
YouTube video (10 minutes)
Recurrent Neural Networks are an extremely powerful machine learning technique, but they may be a little hard to grasp at first. For those just getting into machine learning and deep learning, this is a plain-English guide with helpful visuals to help you grok RNNs.
-
Blog post describing what attention is, and why attention-based models often outperform LSTMs.
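To make the comparison concrete, here is a minimal sketch of scaled dot-product attention, the core operation such posts typically contrast with LSTM recurrence; the function name and tensor shapes are illustrative.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_k). Every position attends to every
    # other in one matrix multiply, with no step-by-step hidden-state
    # updates as in an LSTM.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Self-attention: queries, keys, and values come from the same input.
x = torch.randn(2, 5, 16)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([2, 5, 16])
```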
-
Attention and Augmented Recurrent Neural Networks
Shows ways to augment RNNs: Neural Turing Machines, Attentional Interfaces, Adaptive Computation Time, and Neural Programmers.
-
Transformers: Attention in Disguise
Blog post describing a class of sequence-processing models known as Transformers. Transformers came on the scene not long ago and have rocked the natural language processing community with their pitch: state-of-the-art, efficient sequence processing without recurrent units or convolutions.
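To illustrate that pitch, here is a minimal sketch using PyTorch's built-in Transformer encoder; the hyperparameters are illustrative, and batch_first=True assumes PyTorch 1.9 or later.

```python
import torch
import torch.nn as nn

d_model, nhead = 32, 4  # illustrative sizes; d_model must be divisible by nhead

encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=nhead, batch_first=True
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# All 20 positions are processed in parallel via self-attention:
# no recurrent units, no convolution.
x = torch.randn(8, 20, d_model)  # (batch, seq, d_model)
y = encoder(x)
print(y.shape)  # torch.Size([8, 20, 32])
```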