Leaderboards
Multimodal reasoning
Last updated: December 9, 2024
The Labelbox multimodal reasoning leaderboard evaluates AI models on their ability to mimic human-like understanding and decision making. It compares leading models on four tasks: logical storytelling, detecting differences between images, generating image captions, and spatial reasoning.
Human preference evaluation
Diverse pool of US-based Alignerrs, including generalists and creative artists
Consensus of three Alignerrs per task
Standardized instructions and ontology for consistent evaluations
Carefully curated prompt generation process, balancing creativity and clarity
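The page does not say how the three Alignerr judgments for a task are combined. As an illustration only, the sketch below assumes a simple majority rule in which a preference counts as the consensus when at least two of the three raters agree. The function name `consensus_vote`, the `min_agreement` parameter, and the labels `model_a`, `model_b`, and `tie` are hypothetical and are not taken from Labelbox.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_vote(votes: Sequence[str], min_agreement: int = 2) -> Optional[str]:
    """Return the label picked by at least `min_agreement` raters, else None.

    Assumption: consensus means a simple majority among the raters.
    The source does not describe the actual aggregation rule.
    """
    if not votes:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical ratings from three raters on two tasks.
agreed = consensus_vote(["model_a", "model_a", "model_b"])
split = consensus_vote(["model_a", "model_b", "tie"])
print(agreed, split)
```

In this sketch a task with no majority returns `None`, so it could be sent back for re-review or left out of the ranking, depending on the evaluation policy.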
Storytelling
Differences
Captioning
Spatial