Beyond benchmarks

We go beyond traditional benchmarks to measure the likability and performance of generative AI models. Labelbox leaderboards measure model capabilities using the Labelbox data factory: its platform, scientific process, and expert human evaluators.

Leaderboards

Learn more about our approach to human-centric AI evaluation.

Image generation

Last updated: February 10, 2025
Models shown: Imagen 3, DALL·E 3, Flux 1...., Stable ..., Ideogra..., Recraft v3

Speech generation

Last updated: March 11, 2025
Models shown: Eleven ..., Open AI..., AWS Polly, Cartesia, Kokoro, Deepgram, Google TTS, XTTS-V2

Video generation

Last updated: March 11, 2025
Models shown: Runway ..., Luma Ray 2, Tencent, Pika 1.5, Luma Dr...

Multimodal reasoning

Last updated: March 12, 2025
Models shown: Claude ..., Gemini ..., O1, Pixtral..., Gemini ..., GPT-4o, AWS Nov..., Llama 3...