Higher-quality training signal for personalized shopping AI

A Fortune 500 e-commerce company needed accurate image classification and object detection across tens of thousands of SKUs to personalize shopping. Labelbox's platform produced the expert-graded training signal and the continuous quality loop that made those models trainable.

The challenge

A Fortune 500 e-commerce enterprise set out to personalize the shopper experience with AI. Its image classification and object detection models had to correctly tag tens of thousands of product SKUs, and such models are only as good as the signal they train on. The Search and Recommendation data science team first used a service that generated labels with AI. That auto-generated signal consistently missed the team's quality bar, so the models couldn't learn what they needed to.

The approach

The team moved to Labelbox. The platform produced the expert-graded training signal that made the models trainable on real product data, and it gave the data science team one system to scope each use case, approve projects, and receive signal against a defined SLA. Labelbox automated the orchestration through its Python SDK, moving data between BigQuery and Google Cloud and connecting training jobs to Vertex AI. Model-assisted labeling tightened the loop: a baseline model produced pre-labels, and Labelbox's platform refined them into corrected, expert-graded signal, which cut the work to produce each batch by roughly 50%. A weekly review of sampled signal calibrated quality on an ongoing basis.
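For readers who want to picture the loop in code, the sketch below is one possible way to track it. It is not the team's pipeline and does not call the Labelbox SDK. It assumes each item carries a pre-label from the baseline model and a final label after expert correction, and every name in it (LabeledItem, correction_rate, weekly_review_sample, review_accuracy) is hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class LabeledItem:
    sku: str
    pre_label: str    # label proposed by the baseline model (assumed field)
    final_label: str  # label after expert correction (assumed field)


def correction_rate(items):
    """Share of items whose pre-label the expert had to change."""
    if not items:
        return 0.0
    changed = sum(1 for item in items if item.pre_label != item.final_label)
    return changed / len(items)


def weekly_review_sample(items, sample_size, seed=None):
    """Draw a random sample of finished items for the weekly review."""
    rng = random.Random(seed)
    return rng.sample(items, min(sample_size, len(items)))


def review_accuracy(sample, reviewer_labels):
    """Share of sampled items whose final label matches the reviewer's.

    reviewer_labels maps SKU -> label assigned independently in review.
    """
    if not sample:
        return 0.0
    agreed = sum(
        1 for item in sample
        if reviewer_labels.get(item.sku) == item.final_label
    )
    return agreed / len(sample)


# Illustrative data only; these SKUs and labels are made up.
batch = [
    LabeledItem("sku-001", "sneaker", "sneaker"),
    LabeledItem("sku-002", "boot", "sandal"),
    LabeledItem("sku-003", "t-shirt", "t-shirt"),
    LabeledItem("sku-004", "jacket", "coat"),
]
print(correction_rate(batch))  # 0.5: two of four pre-labels were corrected
```

A falling correction rate from round to round would suggest the baseline model's pre-labels are improving, while the accuracy on the weekly sample is a check that speed is not being bought at the cost of quality.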

The outcome

Signal quality improved enough to unblock the company's AI initiatives. The review loop and model-assisted labeling increased signal-production speed and efficiency by 50% with no drop in quality. As the models improved, the signal they helped generate got better, and each round got faster.

Where this goes

This is the pattern behind every specialist model: expert-graded signal, a tight feedback loop, and models that improve with each iteration.