Beyond benchmarks

We are going beyond traditional benchmarks to measure the likability and performance of generative AI models. Labelbox leaderboards measure model capabilities using the Labelbox data factory: its platform, scientific process, and expert human evaluators.

Leaderboards

Learn more about our approach to human-centric AI evaluation.

Image generation

Last updated: February 10, 2025

Speech generation

Last updated: March 11, 2025

Video generation

Last updated: March 11, 2025

Multimodal reasoning

Last updated: March 12, 2025