Introducing Recursion: the RL platform for enterprise specialist agents

Blog

Insights on AI research, reinforcement learning, evaluations, and enterprise AI systems

Introducing Recursion: The RL platform for enterprise specialist agents

Recursion is a unified reinforcement learning platform for developing, evaluating, and deploying specialist AI models that improve from real enterprise execution.

Labelbox•June 24, 2026

Bridging insight and innovation: Introducing Labelbox Applied Research

Today we’re launching Labelbox Applied Research with three flagship pillars: Labelbox Evals for unified model evaluation, Labelbox Agents for building reliable and interpretable agents, and Labelbox Robotics (LBRx) for delivering high-quality training data for advanced robotic manipulation.

Labelbox•November 19, 2025

Announcing R-ConstraintBench: A novel way to stress-test LLM reasoning abilities under interacting constraints

We've released a research paper on R-ConstraintBench, a novel benchmark for evaluating LLM reasoning on realistic resource-constrained project scheduling problems (RCPSP), a well-known NP-complete challenge.

Labelbox•August 22, 2025

Introducing Labelbox Evaluation Studio: Drive AGI advancements with real-time feedback on model performance

Labelbox Evaluation Studio is a private, real-time platform where top AI teams get tailored insights, instantly spot strengths and weaknesses, and accelerate frontier model improvements.

Labelbox•August 5, 2025

Teaching agents to use tools with human supervision: MCP support now available

Meet Labelbox's MMC editor, now with MCP support, which enables human-in-the-loop evaluation by making it easy to inspect, label, and correct agentic tool interactions.

Labelbox•July 24, 2025

Benchmarking deep research agents

Introducing Labelbox’s deep research leaderboard: an open, continuously updated scorecard that shows how top AI agents from OpenAI, Google, and Anthropic perform on long-form research tasks.

Labelbox•July 21, 2025

An economic report on the human expertise fueling frontier AI

Our latest report examines the emerging expert economy driving frontier AI, detailing the backgrounds and disciplines of these knowledge workers and their impact on cutting-edge AI systems. It also covers their earnings and the high-skill data crucial for the next stage of AI development.

Labelbox•July 17, 2025

Building true RL systems: An experiment on solving real business tasks

We tested rubric-based rewards and GRPO on a real-world e-commerce task and found they outperformed sparse rewards by 300%. This helps validate their effectiveness for complex, multi-step business workflows.

Labelbox•July 1, 2025

Agentic AI: What it takes to build AI that acts

Agentic AI is emerging as a new frontier in autonomy, where models can plan, adapt, and take action independently. In this post, we highlight three real-world projects with leading AI labs, from multi-step tool use to structured reasoning and dynamic instruction following.

Labelbox•June 30, 2025

Benchmarking agentic search

Enterprises need search-augmented LLMs that deliver fast, trustworthy, and up-to-date answers—not just polished language. Since public benchmarks rarely test for this, the Labelbox research team conducted its own study across three frontier models: Gemini 2.5 Pro, GPT-4.1, and Claude 4.0 Opus.

Labelbox•June 13, 2025

Rubric evaluations: Fueling the next wave of reinforcement learning

See how Labelbox utilizes custom rubric-based evaluations to help leading AI labs train and assess advanced frontier models with depth and nuance.

Labelbox•May 16, 2025
