How a Fortune 500 creative software company improved the speed of their AI development by 50%

Problem

This team previously spent many engineering cycles building out their own training data infrastructure, which meant project delays and siloed AI development efforts among disparate groups who were looking to find what data was available for ML use.

Solution

Labelbox Catalog & Annotate. Using a variety of native editors and the Labelbox Boost workforce, the enterprise has dramatically streamlined their end-to-end workflows for identifying and collaborating on training data.

Result

Multiple divisions at the enterprise now leverage Labelbox for PDF, video, text, and image annotation, and the Research team has improved labeling quality and sped up their labeling operations by 50%.

A Fortune 500 software company that provides creative digital marketing and document management solutions was looking for a single platform to consolidate and unify their data-centric workflows for creating AI data and boosting model performance. For many years, the company has been powering their leading cloud products with AI under the hood, which has garnered multiple industry awards and widespread recognition.


A major division of the enterprise’s R&D arm became one of the very first adopters of Labelbox. This team's mandate was to deliver innovation through video understanding via the use of transcripts, joint vision and language, as well as document understanding. This team previously spent a lot of engineering cycles building out their own training data infrastructure, which meant project delays and siloed AI development efforts among the disparate groups who were looking to find what data was available for ML use. The team chose Labelbox because it provided a central standardized platform across their entire division for efficiently collaborating between ML & AI teams, as well as their internal and external labeling teams. 


The company also found that, prior to Labelbox, evaluating the quality of their AI data was a highly fragmented process. This led to lower confidence in their models, with AI/ML initiatives taking much longer than expected to yield a return on investment. To alleviate this problem, the company leveraged Labelbox for their primary data modalities (text, PDF, images, and videos) and tied the process of finding unstructured data to their core annotation process.


Labelbox Catalog provided the ability to quickly filter all of this unstructured data using metadata and custom embeddings, saving the team hours every day with a capability that would otherwise have taken months of custom engineering work to build. The customer was then able to leverage Labelbox Annotate to develop a consistent process for curating high-quality AI data. Using built-in quality assurance tools, the team quickly saw increases in model performance, specifically for their NLP and computer vision projects, which focused on understanding the complex structure of PDF documents.


As an additional benefit of Labelbox’s Annotate collaboration tools, taxonomists on their data science team are now able to work more effectively with internal stakeholders. This work involves defining what data ML teams are interested in, translating requests into specific labeling instructions, and evolving project ontologies as business needs change.


Six months after adopting Labelbox, multiple teams at the enterprise, including core Cloud product development, AI/ML services, and R&D, are expanding their use of the platform to enable breakthroughs faster. Using Labelbox’s full suite of editors, the Research team has improved labeling quality, sped up their labeling operations by 50%, and can ship products to market faster.