Publication of "Attention Is All You Need"

Eight researchers from Google Brain, Google Research, and the University of Toronto publish "Attention Is All You Need" (arXiv:1706.03762), introducing the Transformer architecture: a neural network built entirely on self-attention, dispensing with recurrence and convolution. The paper proposes scaled dot-product attention, multi-head attention, and sinusoidal positional encodings, and achieves state-of-the-art results on the WMT 2014 English-to-German and English-to-French translation benchmarks while training significantly faster than prior recurrent and convolutional architectures. Presented at NIPS 2017 (the conference was renamed NeurIPS in 2018) in Long Beach, California, the paper became the most cited work in deep learning history, passing 140,000 citations by 2024, and won the NeurIPS 2023 Test of Time Award. The Transformer directly spawned BERT, GPT, T5, PaLM, LLaMA, and virtually every large language model that followed, making it arguably the single most consequential architecture paper in the history of artificial intelligence.
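The scaled dot-product attention named above is the paper's core operation: queries are compared against keys, the similarity scores are scaled by the square root of the key dimension and softmax-normalized, and the result weights a sum over values. A minimal NumPy sketch (function name and toy shapes are illustrative, not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (n_q, n_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)    # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted sum of value vectors

# Toy example: 3 queries attend over 4 key/value pairs, d_k = 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Multi-head attention, also introduced in the paper, simply runs several such attention operations in parallel over learned linear projections of Q, K, and V, then concatenates the results.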

Year 2017
Date June 12
Location Mountain View, California, United States
Layer 1
Visibility PUBLIC
artificial-intelligence deep-learning transformer natural-language-processing neural-networks machine-learning attention-mechanism google-brain

Key Figures

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin

Related Moments

ImageNet: A Large-Scale Hierarchical Image Database · thematic
Minsky & Papert Publish "Perceptrons" · thematic
Rosenblatt Demonstrates the Perceptron · thematic
Hinton Publishes "A Fast Learning Algorithm for Deep Belief Nets" · thematic
AlphaGo Move 37 · same era · follows · thematic
Dartmouth Summer Research Project on Artificial Intelligence · thematic