Labelbox · September 26, 2022

Annotate documents, HTML files, and high-resolution pathology slides, and queue & review data better with custom workflows

This month's improvements to Labelbox Annotate unlock a host of new use cases and allow teams to better queue & review data with custom workflows. Teams can now use our document editor for both NER and OCR annotations, view data in a specific layout with our HTML editor, deep zoom on tiled imagery for medical use cases, and more. As we continue to release Workflows across our customer base, teams can queue and review their data rows more effectively.

Annotate everything in your documents

PDF documents are inherently complex – they often contain lots of text, images, charts, graphs, and more. Traditional OCR solutions might capture only a fraction of that information, and the lost context can limit the accuracy of the model.

Use our document editor to effectively capture both content and context.

  • The document editor is now a multimodal annotation platform for NER and OCR techniques
  • You can create named entities with a custom text layer: toggle the text layer on and it will appear any time you want to highlight an entity
  • Tokenization is available at both the word-level and character-level
  • Export named entities with your raw text file to easily identify the text, page number, groupID, and tokenID of each named entity
  • Teams can use the custom text layer to annotate text of interest alongside OCR techniques to effectively capture both content and context
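To make the two tokenization granularities concrete, here is a minimal sketch in plain Python. It is not Labelbox's implementation: the `Token` type, the function names, and the whitespace-based word splitting are all assumptions made for illustration only.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token record: the text plus its character span."""
    text: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def word_tokens(text: str) -> list[Token]:
    """Word-level: one token per run of non-whitespace characters."""
    return [Token(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def char_tokens(text: str) -> list[Token]:
    """Character-level: one token per non-whitespace character."""
    return [Token(ch, i, i + 1) for i, ch in enumerate(text) if not ch.isspace()]


def tokens_in_span(tokens: list[Token], start: int, end: int) -> list[Token]:
    """Return the tokens fully covered by a highlighted span [start, end)."""
    return [t for t in tokens if t.start >= start and t.end <= end]


if __name__ == "__main__":
    line = "Invoice No. 4821"
    print([t.text for t in word_tokens(line)])
    print([t.text for t in tokens_in_span(char_tokens(line), 12, 14)])
```

Word-level tokens are convenient when an entity always covers whole words. Character-level tokens let a highlight cover only part of a word, at the cost of many more tokens per page.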

To learn more, please refer to this guide.

Queue & review data better with customizable workflows

Earlier this year, we released the Data Rows tab and Batches to help teams better navigate & queue data for labeling.

We are continuing to release Workflows to customers on a rolling basis. With Workflows, you can create a highly customizable, step-by-step review pipeline that makes your review process more efficient and more automated. Learn more by watching the video demo below.

With the arrival of Workflows, we’ll be introducing a new way for teams to queue and review their data.

Why are we making this change?

AI teams need more granular control over labeling workflows. The new way to queue and review is meant to help you streamline and improve the creation, maintenance, and quality control of data rows.

How does this affect me?

By the end of the year, we’ll be deprecating the following features for all customers and replacing them with more robust features that help teams streamline labeling, reduce cost, and improve efficiency.

  • Review step: replaced by Workflows
  • Delete & re-queue: replaced by Workflows
  • Dataset-based queueing: replaced by Batch-based queueing
  • Labels tab: replaced by the Data Rows tab

We know that change is never easy. We’ll be reaching out to and working with affected teams as we roll out these changes across our customer base — stay tuned for more information regarding best practices in migrating your projects.

Where can I learn more?

The migration to Workflows, Batches, and the Data Rows tab will be implemented on a rolling basis. To learn more about this migration in detail, feel free to refer to our documentation. We’ve also summarized these changes in the video demo below.

In the meantime, we’d love to learn more about your current review and rework process.

View data row details in any labeling editor

Context at the data row level can be extremely helpful to labelers, so every labeling editor now gives easy access to information about the data row being labeled.

  • View information such as the external ID, asset type, when the data row was created, and which dataset the data row belongs to
  • Based on data type, view specific media attributes such as an image's width & height, the number of pages on a PDF, the frame rate and duration of a video, and more
  • Easily see whether any metadata is attached, such as MAL-imported annotations

View and classify specific data in the HTML editor

The HTML editor is a powerful way for teams to annotate data intended to be viewed in a specific manner; it provides teams with greater flexibility and a unique way to visualize data.

It can unlock specific use cases, such as:

  • Comparing two objects in the same pane
  • Ranking tasks on multiple assets
  • Annotating public webpages that have been saved as HTML files
  • Classifying text that is formatted in a specific manner
  • and more

Currently, we support only classification-type tasks (radio, checklist, and free-text). To learn more about how to import HTML files and our HTML editor capabilities, refer to our documentation.
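As an illustration of the "comparing two objects in the same pane" use case, the sketch below builds a standalone HTML page that places two images side by side, using only the Python standard library. The function name, the "Option A" / "Option B" labels, and the example URLs are hypothetical; how the resulting file is imported is covered by the documentation, not by this sketch.

```python
from html import escape


def build_comparison_html(title: str, left_url: str, right_url: str) -> str:
    """Return a standalone HTML page showing two images side by side."""

    def pane(label: str, url: str) -> str:
        # Escape the URL and label so special characters cannot break the markup.
        return (
            '<figure style="flex: 1; margin: 0 8px;">'
            f'<img src="{escape(url, quote=True)}" alt="{escape(label, quote=True)}" style="width: 100%;">'
            f"<figcaption>{escape(label)}</figcaption>"
            "</figure>"
        )

    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        '<body>\n<div style="display: flex;">\n'
        f"{pane('Option A', left_url)}\n{pane('Option B', right_url)}\n"
        "</div>\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    page = build_comparison_html(
        "Which image is sharper?",
        "https://example.com/a.png",  # placeholder URL
        "https://example.com/b.png",  # placeholder URL
    )
    with open("comparison.html", "w", encoding="utf-8") as f:
        f.write(page)
```

A page like this pairs naturally with a radio classification (for example, "Option A" versus "Option B"), since classification is the task type the editor supports today.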

Deep zoom on tiled imagery for medical use cases

Teams who are interested in annotating high-resolution pathology slides can now zoom in and out of a pathology slide with pixel-perfect resolution.

  • The tiled imagery editor now supports rendering assets from a Deep Zoom tile server
  • To get started, stand up a Deep Zoom tile server (for example, with OpenSlide) and point Labelbox's tiled imagery editor at the tile URL
  • Once configured, you can start zooming in on the details of your pathology slide to more accurately view and label your data
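For readers new to Deep Zoom, the sketch below shows how a Deep Zoom image pyramid is laid out: each level halves the resolution of the one above it, and each level is cut into fixed-size tiles addressed as `<base>_files/<level>/<column>_<row>.<format>`. The function names, the example slide dimensions, the base URL, and the tile size of 254 pixels are assumptions chosen for illustration. Tile overlap is ignored because it does not change the tile count, and this sketch does not describe how Labelbox expects the tile URL to be configured.

```python
def dz_level_count(width: int, height: int) -> int:
    """Number of pyramid levels, from a 1x1 image (level 0) up to full resolution."""
    # Integer equivalent of ceil(log2(max_dim)) + 1, without floating-point error.
    return (max(width, height) - 1).bit_length() + 1


def dz_level_size(width: int, height: int, level: int) -> tuple[int, int]:
    """Pixel dimensions of the image at a given pyramid level."""
    max_level = dz_level_count(width, height) - 1
    scale = 2 ** (max_level - level)
    # Ceiling division; every level is at least 1x1 pixel.
    return (max(1, -(-width // scale)), max(1, -(-height // scale)))


def dz_tile_grid(width: int, height: int, level: int, tile_size: int = 254) -> tuple[int, int]:
    """Number of tile columns and rows at a given level."""
    w, h = dz_level_size(width, height, level)
    return (-(-w // tile_size), -(-h // tile_size))


def dz_tile_url(base_url: str, level: int, col: int, row: int, fmt: str = "jpeg") -> str:
    """Tile address following the Deep Zoom '<name>_files/<level>/<col>_<row>' layout."""
    return f"{base_url}_files/{level}/{col}_{row}.{fmt}"


if __name__ == "__main__":
    width, height = 100_000, 80_000  # hypothetical whole-slide image size
    top = dz_level_count(width, height) - 1
    print("levels:", top + 1)
    print("tiles at full resolution:", dz_tile_grid(width, height, top))
    print(dz_tile_url("https://tiles.example.com/slide", top, 3, 5))
```

Because only the tiles visible at the current zoom level are fetched, the viewer never needs to load the full-resolution slide at once, which is what makes deep zoom practical for very large pathology images.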

In beta: Identify an object's direction with the Cuboid tool

We're working on a new tool to help teams annotate an object's direction, specifically the three key angles that make up an object’s directionality: pitch (up & down), roll (side to side), and yaw (left & right).

  • Capturing the direction of an object relative to its depth in the space can be important for teams who are hoping to train their model on the size and orientation of an object
  • The Cuboid tool will allow teams to annotate and edit an object's directionality (scale, angle, and position) in relation to its surroundings
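To show how pitch, roll, and yaw together describe an orientation, here is a minimal sketch that combines the three angles into a single rotation matrix. The axis convention (x forward, y left, z up, with yaw applied about z, pitch about y, and roll about x) and the function names are assumptions for illustration. The Cuboid tool may use a different convention, so treat this as background rather than a description of the tool's output.

```python
import math


def rotation_matrix(yaw: float, pitch: float, roll: float) -> list[list[float]]:
    """3x3 rotation R = Rz(yaw) @ Ry(pitch) @ Rx(roll); angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def rotate(matrix: list[list[float]], vector: tuple[float, float, float]) -> tuple[float, ...]:
    """Apply a 3x3 rotation matrix to a 3D vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


if __name__ == "__main__":
    # An object initially facing along +x, turned 90 degrees to the left (yaw only).
    heading = rotate(rotation_matrix(math.radians(90), 0.0, 0.0), (1.0, 0.0, 0.0))
    print([round(c, 6) for c in heading])
```

The order in which the three rotations are applied matters: applying the same angles in a different order generally yields a different orientation, which is why any exported angles need a stated convention.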

To be an early participant in the Closed Beta, please sign up.