Scaling text signal for an edtech question-answering model
Problem
Rapid growth during the pandemic pushed the volume of annotated text signal the company's ML models needed beyond what its previous solutions, Amazon SageMaker Ground Truth and Prodigy, could handle.
Solution
The company used Labelbox's collaborative text annotation, with advanced user permissioning and consensus QA tools, to produce the signal.
Result
The education technology company delivered the hundreds of thousands of annotations its ML models needed at record speed, while tracking both productivity and signal quality.

An edtech company's models recommend better question-answer pairs. Labelbox produced the hundreds of thousands of expert-graded text signals it needed, with full visibility into quality.
The challenge
A leading education technology company had used Amazon SageMaker Ground Truth, Prodigy, and an in-house tool to label its text data. Rapid pandemic growth broke that setup: those tools could not scale to the volume of annotated text its models needed.
The approach
The company turned to Labelbox and its Workforce to produce hundreds of thousands of text signals for its models. Students submit their answers as screenshots, so the team uses OCR to convert them back into text for experts to annotate. It also built an in-house comparison system within Labelbox to measure the accuracy of different OCR labels. Labelbox went beyond basic text labeling, offering advanced user permissioning, text catalog dataset management, and consensus-based quality assurance. Three months in, the team had visibility it lacked before.
"Tracking productivity and quality in other services felt more like a black box because, after submitting responses, there was nothing else that we could do. In contrast, Labelbox provides the ability to count the number of labels done, revisit submitted labels, fix errors, run a full quality assurance pipeline, and manage labeler productivity."
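The case study does not describe how the in-house comparison system scores OCR labels. One common way to compare OCR outputs against an expert-corrected transcription is character error rate, and the sketch below illustrates that idea only. The function names, the engine names, and the sample strings are all hypothetical, and nothing here uses or represents the Labelbox API.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or keep when equal
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalized by reference length (lower is better)."""
    if not reference:
        return 0.0 if not ocr_output else 1.0
    return edit_distance(reference, ocr_output) / len(reference)


def rank_ocr_outputs(reference: str, candidates: dict[str, str]) -> list[tuple[str, float]]:
    """Sort candidate OCR outputs from most to least accurate."""
    scored = {name: character_error_rate(reference, text)
              for name, text in candidates.items()}
    return sorted(scored.items(), key=lambda item: item[1])


# Hypothetical example: one expert-corrected answer and two OCR outputs.
reference = "x^2 + 3x = 10"
candidates = {
    "engine_a": "x^2 + 3x = 10",
    "engine_b": "x^2 + 3x = 1O",  # letter O misread for the digit 0
}
for name, cer in rank_ocr_outputs(reference, candidates):
    print(f"{name}: CER = {cer:.3f}")
```

Normalizing by the reference length keeps scores comparable across short and long answers, which matters when student responses vary widely in size.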
The outcome
The company delivered the hundreds of thousands of annotations its models needed at record speed, while tracking both productivity and signal quality. AI now streamlines the complex question-answering process and speeds up the experts' work: populating data fields faster, making better question/answer recommendations, and ultimately improving student learning and outcomes.
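To make "signal quality" concrete, consensus-based QA can be thought of as measuring how often independent labelers agree on the same item. The sketch below is a minimal illustration under assumptions: majority voting, a hypothetical 0.6 agreement threshold, and made-up item IDs and grade labels. It is not Labelbox's consensus implementation.

```python
from collections import Counter


def consensus(labels: list[str]) -> tuple[str, float]:
    """Return the most common label and the share of labelers who chose it."""
    if not labels:
        raise ValueError("at least one label is required")
    label, votes = Counter(labels).most_common(1)[0]
    return label, votes / len(labels)


def flag_for_review(items: dict[str, list[str]], threshold: float = 0.6) -> list[str]:
    """List item IDs whose agreement falls below the (assumed) threshold."""
    return [item_id for item_id, labels in items.items()
            if consensus(labels)[1] < threshold]


# Hypothetical example: three experts grade each student answer.
graded = {
    "q1": ["correct", "correct", "correct"],    # full agreement
    "q2": ["correct", "correct", "partial"],    # two of three agree
    "q3": ["correct", "incorrect", "partial"],  # no majority
}
print(flag_for_review(graded))
```

Items that fall below the threshold would go back to reviewers, which is one way a team could revisit submitted labels and fix errors rather than treating labeling as a black box.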
Where this goes
Education is a personalization problem: the right question, the right answer, for the right student. Expert-graded signal at scale is what makes those recommendations good enough to learn from.