Build vs. Buy

On deploying a successful data labeling solution

Starting to annotate data is easy; scaling and managing it is hard

It's quick and easy to start annotating data using locally installed tools. For most simple annotation tasks performed by a single labeler, this setup works well. As data labeling needs scale, data management and quality control processes are needed to produce accurate and consistent training data. A common cause of underperforming AI systems is low-accuracy training data.

Calculator

Cost of Engineering ($)

Labelbox Feature Build Estimations

Remove / Re-estimate features as you see fit

Total estimate: 129 days

  Set up a new web service
  User Management
  Permission Management
  Single Sign-On
  Team Management
  Dataset Uploading
  On-premise Data Support
  Real-time Progress Reporting
  Review / Editing Workflow
  Audit Trails
  Dynamic Ontology Creator
  Image Segmentation Template
  Image Segmentation Editing / Class Adjustment
  Image Preloading
  Label Autosaving
  Pixel-Level Segmentation
  Optimized Keyboard Shortcuts
  Labeler Performance Reporting
  Full API Access and API Key Management
  Automatic TensorFlow Record Generation

Maintenance Cost (%)

Anyone who has built a software system understands that bugs, database upgrades, and required updates come up.

Cost of Labeler Productivity

Do you think professionally made labeling tools will be more productive than tools you would build in house?

Number of Labelers

Cost of Labeler Per Hour ($)

Number of hours per week

Productivity gains (%)

Total Cost This Year

  Build Cost: $82,560
  Maintenance Cost: $8,256
  Productivity Cost: $15,600
  Total Cost: $106,416
  Estimated Completion Date: May 5, 2019
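
The arithmetic behind these totals can be sketched in a few lines of Python. This is only an illustration, not the calculator's actual implementation: the input values below (a $640 engineering day, 10% maintenance, 5 labelers at $15 per hour for 40 hours a week, a 10% productivity gain, 52 weeks per year) do not appear on this page. They are assumptions chosen because they reproduce the displayed figures, and other combinations would give the same result. The names `Inputs` and `estimate_costs` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Inputs:
    # All defaults are assumptions picked to match the totals shown above.
    build_days: int = 129              # total feature estimate from the page
    engineer_cost_per_day: int = 640   # assumed, e.g. $80/hour * 8 hours
    maintenance_percent: int = 10      # assumed share of build cost per year
    num_labelers: int = 5              # assumed
    labeler_cost_per_hour: int = 15    # assumed
    hours_per_week: int = 40           # assumed
    productivity_gain_percent: int = 10  # assumed gain from a purchased tool
    weeks_per_year: int = 52


def estimate_costs(inp: Inputs) -> dict:
    """Return first-year build, maintenance, productivity, and total cost."""
    build = inp.build_days * inp.engineer_cost_per_day
    maintenance = build * inp.maintenance_percent / 100
    # Yearly labeler spend; the productivity cost is the share of that spend
    # lost by using a less productive in-house tool.
    labor = (inp.num_labelers * inp.labeler_cost_per_hour
             * inp.hours_per_week * inp.weeks_per_year)
    productivity = labor * inp.productivity_gain_percent / 100
    return {
        "build": build,
        "maintenance": maintenance,
        "productivity": productivity,
        "total": build + maintenance + productivity,
    }


costs = estimate_costs(Inputs())
for name, value in costs.items():
    print(f"{name}: ${value:,.0f}")
```

Changing any field of `Inputs` (for example, removing features to lower `build_days`) shows how sensitive the first-year total is to each assumption.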

Important Considerations

Developing expert human intelligence requires training by experts, and the same applies to training expert artificial intelligence. Compelling AI performance is preceded by numerous experimentation and optimization cycles. Rapid deployment of expert AI systems depends on mature data labeling infrastructure capable of producing training data that is consistent and accurate. When building data labeling infrastructure, consider the following:

Total Cost of Ownership

Homegrown tools are built to serve a particular function, but new business demands bring the cost of upgrades. Ongoing maintenance is expensive in both time and money. Technical debt accrues over time due to engineer turnover, product neglect, and evolving product demands.

Unknown and Evolving Scope

Developing an internal product requires planning, resource allocation, and preparing for the unknown. Because data labeling platforms are relatively new, it can be difficult to accurately define the scope and construct a solution that meets the needs of both engineering and product groups.

Minimum Viable Functionality

Internal tools are generally not built for usability, scalability, or cross-team support. They are built to solve an immediate pain point or provide minimum viable functionality as quickly as possible.

Data Labeling Is Cross-Functional

Turning raw data into accurate and consistent training data is a team effort. Engineers, domain experts (labelers), and managers must work together while playing different roles. Data labeling infrastructure must facilitate this by providing information and interfaces unique to these roles.

Enterprise Readiness

Productionizing AI systems takes fast, reliable, and scalable infrastructure across raw data collection, data labeling, and compute.

Data Labeling Services / Outsourced Data Labeling

Data labeling services provide cost-efficient access to labor pools. The advantage is quick turnaround of labeled data at a low per-label cost. However, the performance of your AI system is determined by both the accuracy and the quantity of training data. If a data labeling service does have the requisite domain expertise to label your data, make sure to quantify the labeling accuracy you need and communicate these requirements to the labeling service.
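
One simple way to quantify labeling accuracy is to compare a service's labels against a small reference set labeled by your own domain experts. The sketch below assumes labels are stored as a mapping from item ID to class name; the function names, the sample data, and the 95% threshold are all hypothetical and only illustrate the idea.

```python
def label_accuracy(labels: dict, reference: dict) -> float:
    """Fraction of shared items where the label matches the reference label."""
    shared = labels.keys() & reference.keys()
    if not shared:
        raise ValueError("no items in common with the reference set")
    correct = sum(labels[item] == reference[item] for item in shared)
    return correct / len(shared)


def meets_requirement(accuracy: float, required: float) -> bool:
    """Check a measured accuracy against the agreed minimum."""
    return accuracy >= required


# Hypothetical reference labels from in-house experts and labels from a service.
reference = {"img1": "cat", "img2": "dog", "img3": "cat", "img4": "bird"}
service = {"img1": "cat", "img2": "dog", "img3": "dog", "img4": "bird"}

accuracy = label_accuracy(service, reference)
print(f"accuracy: {accuracy:.2f}, acceptable: {meets_requirement(accuracy, 0.95)}")
```

A single accuracy number like this is a starting point; per-class accuracy or agreement between multiple labelers may be more informative for imbalanced or ambiguous data.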

Buying a Data Labeling Solution

Creating accurate and consistent training data requires a set of integrated tools that enable your cross-functional team of engineers, labelers, and managers to collaborate effectively. When buying a data labeling solution, prioritize the following:

  • Enterprise ready
  • Stable
  • Configurable without code
  • Intuitive
  • Well Supported

Data Labeling and Management with Labelbox

Labelbox is an enterprise-grade data labeling platform for building expert artificial intelligence. Every day, hundreds of teams use Labelbox to create and manage high quality training data.

Labelbox provides comprehensive value right from the start, including:

  • Configurable annotation tools
  • Roles & permissions
  • Labeler performance analytics
  • On-premise data
  • Built-in tools to integrate labeling services and/or a managed workforce
  • QA/QC tooling and label review workflows
  • Compatibility with your ML framework
  • Data label management
  • SLA-backed customer support
