Leaderboards

Speech generation

Last updated: September 9, 2024

Our speech generation leaderboard evaluates AI models on their ability to generate high-quality speech from textual descriptions. We assess factors such as speech quality, word error rate and naturalness.
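The page does not say how word error rate is computed. As a minimal sketch, assuming the common definition (word-level edit distance between the reference text and a transcript of the generated audio, divided by the number of reference words), it could look like the following. The function name `word_error_rate` and the lowercase, whitespace-based tokenization are illustrative assumptions, not the leaderboard's actual pipeline.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both strings are lowercased and split on whitespace;
    a real evaluation would likely also normalize punctuation and numbers.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("Please hold while I transfer you",
                          "please hold while transfer you"))
```

For generated speech, the hypothesis would typically come from running a speech recognizer over the audio, so the score also reflects that recognizer's own errors.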

Human preference evaluation

Diverse pool of US-based Alignerrs, including generalists and creative artists

Consensus of three Alignerrs per task

Standardized instructions and ontology for consistent evaluations

Carefully curated prompt generation process, balancing creativity and clarity
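The page does not describe how the three ratings for a task are combined. One plausible reading, sketched here as an assumption, is a simple majority vote in which a task has a consensus when at least two of the three raters agree. The function name `consensus_label` and the example labels are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_label(ratings: Sequence[str]) -> Optional[str]:
    """Return the label chosen by at least two of three raters, else None.

    Assumption: consensus means simple majority; the leaderboard may use
    a different aggregation rule.
    """
    if len(ratings) != 3:
        raise ValueError("expected exactly three ratings per task")
    label, count = Counter(ratings).most_common(1)[0]
    return label if count >= 2 else None


if __name__ == "__main__":
    print(consensus_label(["model_a", "model_a", "model_b"]))
    print(consensus_label(["model_a", "model_b", "model_c"]))
```

Returning `None` for a three-way split keeps unresolved tasks visible so they can be re-rated or excluded instead of being silently assigned a winner.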

Context awareness

Pronunciation accuracy

Speech naturalness

Examples

PROMPT

"Hello, and thank you for calling customer support. Your estimated wait time is... 3 minutes and 27 seconds. While you're waiting, did you know you can manage your account online at www.example.com? Your account number is ACC-2023-78901-XYZ. For security purposes, please have your PIN ready - that's the 4-digit number you chose when you opened your account. Remember, we will never ask for your full Social Security number or password over the phone. If you're calling about our new promotion, use code SUMMER2023 for 15% off your next purchase. Oh! Looks like a representative is available now. Please hold while I transfer you. (cheerful customer service voice, then professional)"

OpenAI

Cartesia

AWS

ElevenLabs

Deepgram

Google