Recurrent Neural Networks

Recurrent Neural Networks (RNNs) are a type of neural network that are specifically designed to process sequential data, such as text, speech, or time series data. One of the key features of RNNs is their ability to retain information from past inputs, which allows them to understand context and relationships within a sequence.

A few years ago, RNNs were considered the state-of-the-art method for neural machine translation, which is the task of automatically translating text from one language to another. The ability of RNNs to understand context and relationships within a sequence of words allows them to produce more accurate translations.

Modern architectures for large language models (LLMs) such as GPT are built on top of RNNs and use other techniques such as transformer networks to improve performance. RNNs remain a fundamental building block for many of these more advanced models.

Index

Mathematics of RNNs: Derivations for a specific example of RNNs

The Code - version 1: Python code written “from scratch” (using numpy).

The Code - version 2: PyTorch version of the code.