Intuition for concepts in Transformers — Attention Explained

Transformer Architecture

Attention

Seq2seq vs. Attention
Basic Flow of the Attention Mechanism

Scaled Dot-Product Attention

Self-Attention

Multi-Head Attention

Masked Attention

Other Components

Input/Output Embedding

Positional Encoding

References
