C. Hinsley

20 December 2023


I’d like to catalogue and comment on various attempts to explain Transformers (Vaswani et al., 2017) mathematically.

A mathematical perspective on Transformers

Formal Algorithms for Transformers

"Attention", "Transformers", in Neural Network "Large Language Models"

The Annotated Transformer