Image generation
Last updated: September 9, 2024
Our image generation leaderboard evaluates AI models on their ability to generate high-quality images from textual descriptions. We assess various factors based on the project's specific criteria.
DALL•E 3
Leads most metrics: Elo rating, TrueSkill rating, user preferences, prompt alignment, and visual appeal
Imagen 3
Performs exceptionally well in average rank, suggesting strong performance in direct comparisons
Ideogram 2
Despite lower overall ratings, shows strong performance in prompt alignment
Stable Diffusion 3
Does not lead in any particular category, but shows consistent performance across all metrics
Flux 1.5
Does not lead in any particular category, but shows consistent performance across all metrics
Human preference evaluation
Diverse pool of US-based Alignerrs, including generalists and creative artists
Consensus of three Alignerrs per task
Standardized instructions and ontology for consistent evaluations
Carefully curated prompt generation process, balancing creativity and clarity
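The consensus step above can be illustrated with a simple majority vote. This is only a sketch: the leaderboard does not describe how the three judgments per task are combined, so the rule used here (at least two raters must agree, ties yield no consensus) and the function name `consensus_label` are assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(votes: Sequence[str], min_agree: int = 2) -> Optional[str]:
    """Return the label picked by at least `min_agree` raters.

    Returns None when there are no votes, when no label reaches the
    threshold, or when the top two labels are tied.
    Assumption: a plain majority rule; the real aggregation may differ.
    """
    top_two = Counter(votes).most_common(2)
    if not top_two:
        return None
    top_label, top_count = top_two[0]
    tied = len(top_two) > 1 and top_two[1][1] == top_count
    if top_count < min_agree or tied:
        return None
    return top_label


# Hypothetical votes from three raters on one task.
winner = consensus_label(["model_a", "model_b", "model_a"])
no_winner = consensus_label(["model_a", "model_b", "model_c"])
```

With three raters and five candidate models, all three raters can disagree, so a real pipeline would need some policy for tasks without consensus (for example, discarding them or collecting more votes).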
Overall preference
Prompt alignment
Visual appeal
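For readers unfamiliar with the Elo rating cited in the model summaries, the following is a minimal sketch of the standard Elo update applied to pairwise preference outcomes. The leaderboard does not publish its rating parameters or say how comparisons are paired, so the starting rating of 1000, the K-factor of 32, and the function names are assumptions chosen for illustration only.

```python
from typing import Dict, Iterable, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo probability that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(r_winner: float, r_loser: float, k: float = 32.0) -> Tuple[float, float]:
    """Return updated (winner, loser) ratings after one comparison."""
    delta = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + delta, r_loser - delta


def rate(comparisons: Iterable[Tuple[str, str]],
         initial: float = 1000.0, k: float = 32.0) -> Dict[str, float]:
    """Process (winner, loser) pairs in order and return final ratings.

    Assumption: every model starts at `initial`; the values are placeholders.
    """
    ratings: Dict[str, float] = {}
    for winner, loser in comparisons:
        r_w = ratings.get(winner, initial)
        r_l = ratings.get(loser, initial)
        ratings[winner], ratings[loser] = update_elo(r_w, r_l, k)
    return ratings


# Hypothetical outcomes: model_a preferred over model_b, then over model_c.
ratings = rate([("model_a", "model_b"), ("model_a", "model_c")])
```

Sequential Elo depends on the order in which comparisons are processed, which is one reason a leaderboard might report it alongside other measures such as TrueSkill and average rank.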
Examples
High-resolution photograph: a small, intricately decorated spool of thread on a rustic wooden table. Warm, natural lighting. Close-up perspective, capturing meticulous details. Cozy, vintage atmosphere.
DALL•E
Google Imagen 3
Stable Diffusion
Flux 1.5
Ideogram 2