Labelbox Leaderboards

We are going beyond traditional benchmarks to measure the preference for and performance of leading AI models. Labelbox leaderboards measure model capabilities using the Labelbox data factory: a platform, a scientific process, and human experts.

Complex reasoning
Updated: July 10, 2025 | Total: 14 Models
Constraint bench
Updated: August 21, 2025 | Total: 9 Models
Video generation
Updated: June 6, 2025 | Total: 7 Models
Image generation
Updated: June 6, 2025 | Total: 8 Models
Speech generation
Updated: June 10, 2025 | Total: 9 Models
Multimodal reasoning
Updated: March 12, 2025 | Total: 8 Models
Agentic Search
Updated: June 13, 2025 | Total: 3 Models
Deep Research Agents
Updated: July 21, 2025 | Total: 3 Models