
How Walmart drives innovation for their conversational AI and LLM applications

Problem

Walmart's team wanted to find faster ways to annotate conversational text from shopping chatbots and label inventory images for their object detection and classification models, which included tens of millions of diverse product SKUs.

Solution

Labelbox Annotate provided intuitive text and image editors with the ability to tag entity relationships (NER) for conversations. Integration with Google BigQuery via the Labelbox Python SDK automated manual workflows and data import.

Result

“Conversational AI is a very exciting and challenging field, because large language models are extremely sensitive to data quality,” shared Philippe Hanrigou, a Walmart director of data-science dedicated to conversational AI and chatbots. “Labelbox is a gamechanger because of the quality of the labels we can now acquire for our models. It’s hard to imagine a life now without Labelbox.”

Note: The quotes for this post were sourced from Walmart's Global Tech blog on our Sparkcubate partnership with Walmart (source).


Walmart, a leading global retailer, was searching for ways to improve how it produces labeled data for its conversational AI and LLM-powered applications. The data science team wanted faster ways to annotate conversational text from shopping chatbots and to label inventory images for their object detection and classification models, which included tens of millions of diverse product SKUs. Labeling conversations would make Walmart’s conversational AI models stronger and more natural over time and, ultimately, lead to happier customers.


The company previously relied on tech-enabled BPOs to create and manage its labeled data but found the arrangement suboptimal: the BPOs’ processes lacked transparency, giving the team little visibility into the quality of the training data being produced. In addition, tech-enabled BPOs typically did not provide dedicated software that let team members collaborate on the training data iteration process itself, which involved a range of stakeholders including natural language data scientists, ML engineers, data engineers, linguists, and software engineers. The relationship with these vendors was a black box that omitted important metrics such as individual labeler and project-level analytics, and in-house subject matter experts could not collaborate closely with external service providers.


Labelbox offered a clearer and more efficient alternative by providing an end-to-end in-app labeling workflow for the company’s conversational AI efforts. As Walmart continued to invest in chatbot and large language model capabilities, Labelbox provided a labeling interface optimized for conversations, with a high degree of consistency and quality control. The team adopted Labelbox, which delivered intuitive text and image editors with the ability to tag entity relationships for conversations (named entity recognition, NER). The company’s unstructured data for these chatbots existed as a mix of intertwined natural voice commands, text messages, images, and traditional GUI interactions. Labelbox’s data engine also allowed the company’s data science team to work with any labeling vendor, internal or external, and to collaborate easily with in-house domain experts. This fed into a labeling process in which reviewers could regularly check that quality benchmarks for the training data were being met.


Conversational AI for retail is complex because products, product types, and brands must all be labeled to fulfill what a customer wants to order. For example, a Text-to-Shop customer may want to order not just milk but organic milk, and more specifically Great Value organic milk. Each of these specifications requires a label, and Labelbox gives Walmart teams the ability to determine whether their conversational AI correctly identifies that product for the customer.
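As a rough illustration of what labeling each specification means, the sketch below represents the entities in a Text-to-Shop style utterance as character spans and checks a model's prediction against the reviewed labels. The label names (`brand`, `attribute`, `product`), the utterance, and the helper functions are hypothetical; they are not taken from Walmart's or Labelbox's actual schema.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    label: str   # hypothetical label name, e.g. "brand"
    start: int   # character offset where the span begins (inclusive)
    end: int     # character offset where the span ends (exclusive)


def tag(text: str, phrase: str, label: str) -> Entity:
    """Build an Entity for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return Entity(label, start, start + len(phrase))


def missed_entities(gold: list[Entity], predicted: list[Entity]) -> list[Entity]:
    """Return labeled entities that the model did not predict exactly."""
    predicted_set = set(predicted)
    return [entity for entity in gold if entity not in predicted_set]


utterance = "Please add Great Value organic milk to my cart"

# Reviewed labels: one span per specification in the request.
gold = [
    tag(utterance, "Great Value", "brand"),
    tag(utterance, "organic", "attribute"),
    tag(utterance, "milk", "product"),
]

# A hypothetical model output that found the product but missed the brand.
predicted = [
    tag(utterance, "organic", "attribute"),
    tag(utterance, "milk", "product"),
]

for entity in missed_entities(gold, predicted):
    print(entity.label, repr(utterance[entity.start:entity.end]))
```

Exact span matching is the strictest comparison; a real evaluation might also credit partial overlaps, which this sketch does not attempt.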


Furthermore, Labelbox’s Annotate product made it possible to automate much of the manual orchestration through the Python SDK. Labeling workflows could be initiated from Google BigQuery, which the enterprise relied on heavily as a core part of its existing data infrastructure. Labels could now be pulled from and pushed to BigQuery tables for structured data, and could be created through the Labelbox and Google Cloud integration.
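A minimal sketch of how rows pulled from a BigQuery table might be shaped into a data-row import payload before being sent for labeling. The column names (`conversation_id`, `utterance`), the function name, and the sample rows are assumptions for illustration, and the payload keys are assumed to mirror the `row_data` and `global_key` fields used by the Labelbox SDK's data-row import. The query and upload calls appear only as comments because they need credentials, and Walmart's actual pipeline is not described here.

```python
from typing import Iterable, Mapping


def to_data_rows(rows: Iterable[Mapping[str, str]]) -> list[dict[str, str]]:
    """Convert query result rows into import payloads.

    Blank utterances and repeated conversation IDs are skipped so that
    every payload has text and a unique key.
    """
    payloads = []
    seen_keys = set()
    for row in rows:
        text = row["utterance"].strip()
        key = row["conversation_id"]
        if not text or key in seen_keys:
            continue  # drop blank utterances and duplicate keys
        seen_keys.add(key)
        payloads.append({"row_data": text, "global_key": key})
    return payloads


# Stand-in for rows returned by a BigQuery query (assumed shape).
sample_rows = [
    {"conversation_id": "c-001", "utterance": "Add organic milk to my cart"},
    {"conversation_id": "c-002", "utterance": "   "},
    {"conversation_id": "c-001", "utterance": "Add organic milk to my cart"},
]

data_rows = to_data_rows(sample_rows)
print(data_rows)

# With credentials in place, the surrounding steps might look like this
# (sketch only; check the current SDK documentation before relying on it):
#   rows = bigquery.Client().query(sql).result()
#   dataset = labelbox.Client(api_key=...).create_dataset(name=...)
#   dataset.create_data_rows(to_data_rows(dict(r) for r in rows))
```

Keeping the row-shaping step as a pure function makes it easy to test without network access, whatever the surrounding orchestration looks like.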


By choosing Labelbox, the enterprise saw dramatic efficiency gains because it gained full visibility into its labeling pipeline through in-depth project analytics. Model-based pre-labeling (also known as model-assisted labeling) also sped up the labeling process by letting the team adjust annotations rather than create ground-truth labels from scratch. The company is now able to draw insights about labeling performance and take actions that translate directly into improvements in label throughput, efficiency, and quality.




In terms of ROI, the company’s labeled data accuracy improved by an estimated 25% through Labelbox’s quality assurance systems, review workflows, and real-time collaboration. Labelbox’s Boost team delivered high-quality data (95% accuracy in labeled data) with a 25% reduction in turnaround time compared to similar services. Labelbox is helping hone the technology that powers Walmart’s Text-to-Shop platform, customer service platforms, and future intelligent applications by actively working with other Walmart teams to identify LLM and GenAI initiatives that it can help power.