ImageNet: A Large-Scale Hierarchical Image Database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei presented "ImageNet: A Large-Scale Hierarchical Image Database" at CVPR 2009 in Miami Beach, introducing a dataset of 3.2 million cleanly annotated images across 5,247 categories organized according to the WordNet hierarchy. The images were labeled through Amazon Mechanical Turk, a then-novel approach that the academic community initially dismissed. The dataset would eventually grow to over 14 million images across 21,841 synsets, labeled by nearly 50,000 crowdworkers from 167 countries. Begun in 2006 by Fei-Fei Li and built out by her group at Princeton, ImageNet grew from the insight that the bottleneck in computer vision was not better algorithms but better data: models needed to see the visual world at scale before they could understand it. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC), launched in 2010, provided the competitive benchmark that channeled the field's energy toward a single measurable goal. When Krizhevsky, Sutskever, and Hinton's AlexNet won the 2012 ILSVRC by a 10.8-point margin in top-5 error over the runner-up, using deep convolutional networks trained on GPUs, it was ImageNet that made the victory legible and reproducible. Without the dataset, there was no challenge; without the challenge, there was no moment of proof. ImageNet supplied the fuel that the backpropagation algorithm had been waiting 23 years to burn, igniting the modern deep learning revolution and every large-scale AI system that followed.