
P&G makes owned, trusted AI data an enterprise standard
Problem
P&G collects large amounts of unstructured data throughout prototyping, testing, and manufacturing. Refined, tagged, and classified, it becomes training signal for AI models. But because the data is proprietary, the team needed stringent privacy and full transparency throughout the process — and signal it owned and trusted.
Solution
P&G chose Labelbox after a rigorous analysis of every data annotation platform on the market — for ease of use, compatibility with its existing infrastructure, and its data security and transparency standards. With Labelbox, P&G brought in internal and external contributors as projects required, recalibrated quickly as needs evolved, and accelerated model iteration and time to market — while keeping ownership of its data.
Result
P&G first used Labelbox for computer vision defect detection on its product lines. Over three years it expanded to more AI projects. Today Labelbox is an enterprise-wide solution at P&G, producing training signal for AI across every business unit globally.

P&G's product decisions depend on AI trained on proprietary data. Labelbox gave it the AI data infrastructure to produce trusted training signal while keeping ownership and full transparency — now standard across business units worldwide.
The challenge
Procter & Gamble is a Fortune 500 consumer goods company behind Tide, Charmin, Pampers, Braun, Old Spice, and more. Testing, developing, and manufacturing its products runs on research and analysis, from studying public sentiment around brands, ingredients, and scents to detecting defects on the production line. That work means collecting large amounts of unstructured data: images, videos, and text. To train AI on it, P&G had to turn that unstructured data into high-quality structured signal. Two constraints made it hard: the signal had to be trusted and owned, and privacy was non-negotiable. Consumer data must stay private, and manufacturing and prototyping data is IP.
“There's a lot of work that goes into every little business decision that we make. And for a large company that works at scale to serve billions of consumers, the decisions that we make in terms of what we're going to manufacture, put into the market, and support, are very critical. Getting high-quality trustworthy data that we own and that we are fully confident in is critical for us,” said Kelly Anderson, Director of Data Science and AI at Procter & Gamble, during a panel at CES 2023.
The approach
P&G ran a side-by-side comparison of every data annotation platform on the market, weighing user experience, compatibility with existing infrastructure, and cost. It chose Labelbox.
“We did a side-by-side comparison of all the different data annotation platforms in the marketplace, and we really focused on things such as the user experience, compatibility with our existing solutions and infrastructure, as well as the cost model. Labelbox really stood out to us because [they] had a very compelling technical solution, we were strategically aligned on where we wanted to go with the data, and [they] also had a very clear and sustainable price structure,” said Mercy Chang, Senior Purchasing Manager at Procter & Gamble.
What set Labelbox apart was a model of ownership and transparency. Privacy was the gate every option had to pass.
“Data is the lifeblood and one of the foundation blocks of how we're able to deliver superior products to consumers. So it's incredibly important that we safeguard it. When it comes to data privacy and information security, it's non-negotiable, because our consumers, customers, and partners put so much trust in us,” said Chang.
“One of the things that we loved about Labelbox from the very beginning is their belief that the data is ours,” said Anderson.
P&G kept its data in its own cloud instance, linked it to Labelbox, and had an external high-quality labeling workforce enrich it with tags. Throughout, it could watch the signal mature, visibility that a black-box labeling service can't give.
“Having a transparent platform that allows us to keep our data in our cloud instance, link that data to the Labelbox user interface, have an external high-quality labeling workforce enrich that data with tags … we could actually see how the data is maturing,” said Anderson.
P&G could bring in whichever contributors a project needed, internal or external, and keep full ownership and transparency no matter who produced the signal. Prior labeling partners, by contrast, took the data and returned it labeled with no view into the process.
“Having a transparent agile platform that’s fluid to work with … has been a transformation in how we do our work, and has been very powerful in increasing the speed of iteration, which leads to faster time to market with better insights,” said Anderson.
The outcome
P&G first used Labelbox for computer vision defect detection on its product lines. Over three years it expanded to more AI projects, and Labelbox is now an enterprise-wide solution producing training signal for AI across every business unit globally.
“We're packaging the Labelbox solution and redistributing it across all our business units around the globe. So this is no longer just an R&D solution, it's a corporate solution,” said Chang.
Where this goes
For an enterprise, the data is the IP, and the learning loop built on it is the advantage. P&G turned trusted, owned signal into a standard the whole company runs on.