
Multimodal reasoning

Last updated: December 9, 2024

The Labelbox multimodal reasoning leaderboard evaluates leading AI models on how well they mimic human-like understanding and decision-making. It covers four tasks: logical storytelling, detecting differences in images, generating image captions, and spatial reasoning.

Human preference evaluation

Diverse pool of US-based Alignerrs, including generalists and creative artists

Consensus of three Alignerrs per task

Standardized instructions and ontology for consistent evaluations

Carefully curated prompt generation process, balancing creativity and clarity
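
As a rough illustration of how a three-rater consensus could be resolved, here is a minimal sketch. It assumes a simple majority vote (at least two of three raters agreeing), which this page does not confirm as the actual Labelbox method. The function name `consensus_label`, the task IDs, and the model labels are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(votes: Sequence[str], min_agreement: int = 2) -> Optional[str]:
    """Return the label chosen by at least `min_agreement` raters, else None.

    Assumption: consensus means a simple majority vote; the real
    aggregation rule is not described in the source.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical ratings: three raters each pick a preferred model response.
ratings = {
    "storytelling-001": ["model_a", "model_a", "model_b"],
    "spatial-001": ["model_a", "model_b", "model_c"],
}

# Tasks with no majority resolve to None and could be sent for re-review.
results = {task: consensus_label(votes) for task, votes in ratings.items()}
```

Under this assumption, a task where all three raters disagree yields no consensus, so it would need another round of review or a tie-breaking rule.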

Storytelling

Description:

Options:

Differences

Description:

Options:

Captioning

Description:

Options:

Spatial

Description:

Options: