Labelbox•April 24, 2025

Reinventing AI evaluation: Discover the simplicity of Labelbox's new form-based MMC editor

We are thrilled to announce a completely redesigned Labelbox Multimodal Chat (MMC) editor. The re-architected editor makes it intuitive and efficient to test and evaluate models by engaging in live, multi-turn and multimodal interactions.

We have fundamentally reimagined the user experience. Through an intuitive, form-based design, the new editor directly tackles the challenges inherent in evaluating complex frontier models. The new layout makes it easier for trainers and human evaluators to classify, rank, rate, rewrite, and evaluate step-by-step reasoning (chain-of-thought) responses.

New form-based design of the Labelbox Multimodal Chat (MMC) editor

The new MMC editor transforms how teams evaluate frontier models, helping you and your AI trainers perform model comparisons in chat arena environments, assess response accuracy efficiently, and generate high-quality training data. Read on to discover more about this major update.

A modern UX is non-negotiable for rapid frontier model evaluation

Evaluating today's sophisticated frontier models is inherently complex. Engaging these models in live, multi-turn, multimodal conversations generates vast streams of nuanced data. Capturing meaningful feedback, comparing subtle differences between model responses, and generating high-quality training data demand powerful yet intuitive tooling.

Older interface designs, including our previous MMC editor, created friction points that slowed the critical evaluation process. This was especially true during large-scale projects, where support for different data types and for a diverse group of AI trainers is essential. An inefficient user experience acts as a drag on the entire development lifecycle.

Labelbox is committed to delivering industry-leading software and services, positioning us as the data factory for the world’s leading AI teams. Our new multimodal chat editor is just one of the ways we're delivering on that goal.

Meet the new form-based Multimodal Chat (MMC) editor

Inspired by user feedback, increasingly sophisticated requirements from top AI labs, and the need for a more scalable solution, we created an editor centered around a logical, form-based workflow. This paradigm shift delivers an experience that feels familiar and intuitive, much like completing a standard online form, making it easier for AI trainers, like our global network of Alignerrs, to learn and use effectively.

New form-based design provides intuitive interfaces for common classifications, like selecting the best overall response

Along with the new design, the refreshed MMC editor introduces another powerful visual aid: the minimap. Located subtly on the right side of the window, the minimap provides simple visual alerts to any missing inputs across the entire task form. It dynamically highlights areas requiring attention, such as missing inputs for required classification fields.

As users progress through turns and generate more messages, these visual alerts populate, clearly marking the exact location of any incomplete required elements within the potentially long page of tasks. This allows trainers to quickly navigate the form and pinpoint exactly where action is needed, ensuring no required data is missed. The corresponding alert marking disappears once the required classification is completed. For tasks involving carousels where input is needed for multiple responses, the alert persists until all required inputs across all items in the carousel are finished.
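The alert behavior described above can be sketched in a few lines. This is a minimal illustration, not Labelbox's implementation: the `RequiredField`, `Section`, and `minimap_markers` names, and the pixel offsets, are assumptions made up for the example. It shows the two rules from the text: a marker exists only while a required input is missing, and a carousel keeps its marker until every item in it is complete.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RequiredField:
    """One required classification (hypothetical model, not a Labelbox type)."""
    name: str
    value: Optional[str] = None  # None means the trainer has not answered yet

    @property
    def complete(self) -> bool:
        return self.value is not None


@dataclass
class Section:
    """A pane in the task form. A carousel is a section with several items."""
    title: str
    offset_px: int  # distance of the pane from the top of the form
    items: list[list[RequiredField]] = field(default_factory=list)

    @property
    def complete(self) -> bool:
        # Complete only when every required field on every item is filled in.
        return all(f.complete for item in self.items for f in item)


def minimap_markers(sections: list[Section], form_height_px: int) -> list[tuple[str, float]]:
    """Return (title, relative position in 0..1) for each section still needing input."""
    return [
        (s.title, s.offset_px / form_height_px)
        for s in sections
        if not s.complete
    ]


# Usage: the prompt pane is done, but one carousel item is still unrated.
prompt = Section("Prompt intent", 0, [[RequiredField("intent", "coding")]])
carousel = Section("Response ratings", 500, [
    [RequiredField("accuracy", "5")],
    [RequiredField("accuracy")],  # second response not rated yet
])
print(minimap_markers([prompt, carousel], 1000))  # [('Response ratings', 0.5)]
```

Filling in the second rating makes the carousel complete, and its marker drops out of the list on the next call.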

You can see the subtle visual cues in the image below, on the right-hand side of the screen alongside the vertical scrollbar.

The subtle minimap on the right-hand side of the window clearly identifies uncompleted tasks, like missing classifications for required fields.

Subtle minimap added to the right side of the screen to help AI trainers quickly identify required tasks that they have not completed.

This redesigned interface also includes the following enhancements:

Layout transformation: The old left-side panel is gone. Instead, the editor presents tasks and content in scrolling panes or forms that follow one after another. This linear flow ensures all necessary information and tools are presented logically and sequentially, eliminating the need to hunt for hidden elements.
Integrated instructions: Addressing a major pain point, project instructions are now always visible within the workflow, typically positioned directly adjacent to the tools or tasks they relate to. This immediate context dramatically reduces the likelihood of missed guidance, leading to fewer errors, less reliance on external support channels, and ultimately, more consistent and accurate data.
Clear task status: New visual cues provide instant feedback on progress. A "Required" pill appears in the top right of any pane containing mandatory fields, preventing submission until those tasks are complete. Once all required elements in a section are addressed, a sleek "Finished" pill confirms completion.
Structured turn-level tasks: Tasks that apply to the entire conversation turn – such as selecting the best overall response based on specific criteria or performing rankings – are now presented in their own dedicated sections within the linear flow. This makes complex evaluation schemas intuitive to navigate and complete accurately.
Improved performance: Despite rendering more components simultaneously to support complex annotation tasks and data calculations, the new editor is snappy and responsive. We've implemented quasi-virtualization techniques, allowing elements to render extremely quickly as the user scrolls. This performance boost is crucial for maintaining high throughput, especially when dealing with long conversations or intricate ontologies.
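The post does not say how the editor's quasi-virtualization works. As a general illustration of the windowing idea behind list virtualization, the sketch below works out which panes overlap the viewport and renders only those plus a small buffer. The `visible_panes` function, its parameters, and the `overscan` buffer are assumptions for the example, not part of the Labelbox editor.

```python
from bisect import bisect_right
from itertools import accumulate


def visible_panes(pane_heights: list[int], scroll_top: int,
                  viewport_height: int, overscan: int = 1) -> range:
    """Return the indices of panes worth rendering at the current scroll position."""
    # tops[i] is the vertical offset at which pane i starts.
    tops = [0, *accumulate(pane_heights)][:-1]
    # First visible pane: the last pane that starts at or above the viewport top.
    first = max(bisect_right(tops, scroll_top) - 1, 0)
    # Last visible pane: the last pane that starts above the viewport bottom.
    last = bisect_right(tops, scroll_top + viewport_height - 1) - 1
    # Render a few extra panes on each side so scrolling does not reveal gaps.
    start = max(first - overscan, 0)
    stop = min(last + overscan, len(pane_heights) - 1) + 1
    return range(start, stop)


# Usage: ten panes of 300 px each, a 600 px viewport scrolled down 900 px.
print(list(visible_panes([300] * 10, scroll_top=900, viewport_height=600)))
# [2, 3, 4, 5]
```

Panes 3 and 4 are on screen here, and the buffer adds one pane on each side. The work per scroll event depends on the number of visible panes rather than the length of the conversation.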

Navigating the new MMC editor experience

Using the new editor is designed to be straightforward. Here’s a typical workflow:

Start the conversation: When you begin a new task, you'll see the initial prompt that will be sent to your selected models (e.g., Gemini, Claude, GPT-4, or other foundation models available in Labelbox).
Evaluate the initial prompt: The first pane(s) will present the prompt itself, along with any associated tasks defined in your project ontology, such as classifying the prompt's intent or selecting specific text spans. You'll interact with radio buttons, checklists, text selection tools, etc., just as before, but within this new linear structure. Subclasses will populate as expected. Remember to check for the "Required" pill.

Based on our ontology, we must annotate and comment on the prompt first, which is required as indicated by the pill box in the top right.

Review model responses: Once the prompt is processed, the next sections will display the model responses, often within the new carousel view if multiple models are being compared. Directly below each response, you'll find the fields for per-message evaluation (e.g., rating accuracy, identifying flaws). Complete these tasks for each response.
Complete turn-level tasks: Following the individual response evaluations, you'll encounter panes dedicated to turn-level tasks, such as selecting the overall best response or ranking the responses according to predefined criteria.

Review each model response and then complete the turn-level tasks, such as the redesigned step-by-step reasoning task shown here.

Continue the conversation: Once all required fields for the current turn are completed (indicated by the "Finished" pills), you can input a new prompt to continue the multi-turn conversation. The process repeats: evaluate your new prompt, review the subsequent model responses, complete turn-level tasks, and so on.
Find missing inputs with the minimap: Before submitting the final results of your last turn, use the minimap on the right-hand side of the screen to identify any errors or incomplete tasks.

The key difference is the seamless flow. Instead of navigating collapsed views or separate panels, you simply scroll through one continuous, logical form.
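The turn-by-turn gating in the workflow above can be modeled as a small state check. This is a hedged sketch rather than the editor's actual logic: `Turn`, `Conversation`, `send_prompt`, and the task names are hypothetical. It captures one rule from the steps: a new prompt is accepted only once every required task in the current turn has an answer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Turn:
    prompt: str
    # Required task name -> answer (None until the trainer completes it).
    required: dict[str, Optional[str]] = field(default_factory=dict)

    @property
    def status(self) -> str:
        """Mirror the pill: 'Finished' once every required task has an answer."""
        done = all(answer is not None for answer in self.required.values())
        return "Finished" if done else "Required"


@dataclass
class Conversation:
    required_tasks: tuple[str, ...]
    turns: list[Turn] = field(default_factory=list)

    def send_prompt(self, prompt: str) -> Turn:
        """Start a new turn, but only if the current turn is finished."""
        if self.turns and self.turns[-1].status != "Finished":
            raise ValueError("complete all required tasks before continuing")
        turn = Turn(prompt, {name: None for name in self.required_tasks})
        self.turns.append(turn)
        return turn


# Usage: finish the first turn, then continue the conversation.
chat = Conversation(required_tasks=("prompt_intent", "best_response"))
first = chat.send_prompt("Explain quicksort")
first.required["prompt_intent"] = "coding"
first.required["best_response"] = "model_a"
second = chat.send_prompt("Now show it in Rust")  # allowed: first turn is finished
print(first.status, second.status)  # Finished Required
```

Calling `send_prompt` while the latest turn still reads "Required" raises an error, which is the sketch's stand-in for the editor blocking progress until the required fields are filled in.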

See the new editor in action! Explore the streamlined workflow firsthand in our interactive click-through demo showcasing a chat arena experience.

For detailed instructions on configuration options, ontology setup, and advanced usage, please refer to the core documentation on the MMC editor.

Accelerate Your GenAI Development Today

The new Labelbox Multimodal Chat (MMC) editor represents a significant step forward in empowering AI teams. It delivers a streamlined, powerful, and intuitive experience meticulously designed for the demanding nature of modern generative AI data generation and evaluation. By simplifying complexity and boosting efficiency, this new editor helps you generate higher quality data, iterate faster, and ultimately build more capable and reliable AI models.

Ready to experience the difference? Try the new multimodal chat editor in your Labelbox projects today.

New to Labelbox? Contact our sales team to learn how our platform can accelerate your AI initiatives and meet your specific data needs.
