Labelbox•December 20, 2022
All Labelbox users can now natively upload and annotate documents and conversational text to train language models for their specific business use case. Read on to learn more about the latest improvements to the video editor and updates to the new way to queue and review your data rows.
All Labelbox users can now natively upload and annotate their PDF documents for NER and OCR use cases.
Documents such as PDFs are widespread across industries such as financial services, real estate, healthcare, and many more. PDFs contain valuable information that often needs to be extracted for greater understanding or a specific business use case.
However, not only are PDFs inherently complex – as they often contain text, images, graphs, and more – but the format of PDFs can vary. If I am looking to train a model to extract key information from product brochures, such as the product name and product specifications, I’d be faced with the difficult task of capturing both text and visual information across a variety of product brochure styles. Interpreting and extracting information from native PDFs of various formats and structures is a key challenge in any AI use case based on PDF data.
The document editor is a multimodal annotation platform, allowing you to easily turn stores of PDF files and documents into performant ML models. You can annotate text with an NER text layer alongside traditional OCR to extract both text and images without losing context.
Leverage our document editor to:
All Labelbox users can now natively upload and annotate text formatted as conversations.
The rise of natural language processing (NLP) language models has given machine learning teams the opportunity to build custom-tailored experiences for their business use cases. Use cases range from improving customer support metrics to creating delightful customer experiences to preserving brand identity and loyalty.
As we’ve seen with the virality and success of OpenAI’s ChatGPT, we’ll likely continue to see AI-powered language experiences penetrate all major industries. One of our customers is using our conversational text editor to annotate conversations to better understand the intent of their current users. Rather than spending time and money on manual conversation review, they set out to build a model to identify the intent and types of questions that are frequently asked by their customers.
With our conversational text editor, you can:
Learn how to train a chatbot or how to annotate conversational text for a chatbot use case in our latest guides, or read more about our conversational text editor in our documentation.
All Labelbox users can use the cuboid tool to capture three-dimensional space in 2D images.
For certain use cases, capturing and understanding an object or person’s dimensions is important. This is especially true for instances where the angle, size, and depth of a person or object in an image can represent different things.
For example, if you’re interested in capturing the tilt and angle of a person’s head in an image, you can use the cuboid tool to capture its up-and-down, side-to-side, and left-to-right orientation. Rather than simply annotating the person’s head with a bounding box, the cuboid tool allows the model to be trained on the exact rotation and directionality of the person’s head.
To create a cuboid annotation, simply create an image project and select “cuboid” from the tool dropdown during ontology creation. Create a cuboid by drawing a bounding box over the object in the image.
Once you release the box, it automatically becomes a cuboid, and you can use the various levers on the tool to adjust the cuboid’s rotation along the x, y, and z axes. At the top of the editor, you’ll find corresponding buttons to switch to Rotate mode (x axis), Move mode (y axis), or Scale mode (z axis).
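To make the geometry concrete, here is a minimal sketch of how a cuboid with rotation about the x, y, and z axes could be represented and turned into corner points. This is not Labelbox’s annotation format or SDK: the `Cuboid` class, its field names, and the rotation order (x, then y, then z) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Cuboid:
    """Hypothetical cuboid: center, size, and rotation in radians about x, y, z."""
    center: Point
    size: Point
    rotation: Point = (0.0, 0.0, 0.0)

    def corners(self) -> List[Point]:
        """Return the 8 corners after rotating about x, then y, then z."""
        rx, ry, rz = self.rotation
        cx, cy, cz = self.center
        result = []
        for signs in product((-0.5, 0.5), repeat=3):
            # Offset of this corner from the center, before any rotation.
            x, y, z = (s * d for s, d in zip(signs, self.size))
            # Rotate about the x axis.
            y, z = (y * math.cos(rx) - z * math.sin(rx),
                    y * math.sin(rx) + z * math.cos(rx))
            # Rotate about the y axis.
            x, z = (x * math.cos(ry) + z * math.sin(ry),
                    -x * math.sin(ry) + z * math.cos(ry))
            # Rotate about the z axis.
            x, y = (x * math.cos(rz) - y * math.sin(rz),
                    x * math.sin(rz) + y * math.cos(rz))
            # Translate back to the cuboid's position.
            result.append((cx + x, cy + y, cz + z))
        return result
```

For example, `Cuboid(center=(0, 0, 0), size=(2, 2, 2), rotation=(0, 0, math.pi / 2)).corners()` returns the eight corners of a cube turned a quarter of the way around the z axis. A plain bounding box has no equivalent of the `rotation` field, which is why it cannot express head tilt or direction.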
To learn more about the cuboid tool, feel free to refer to our documentation.
The video editor currently supports both radio and checklist classifications at the frame and global level.
A highly requested feature, this update gives you greater flexibility in adjusting the keyframes within a classification segment. You can simply click and drag the keyframe to adjust it to the desired classification length.
If you’re a current Labelbox user, you can try out this new way to classify videos in our video editor today.
Compared with other data types, video labeling can be a complex task. There are often many objects of varying sizes that need to be labeled across frames. Even a short one-minute video can often take labelers tens of minutes to annotate.
We’re introducing bounding box tracking for video. A user can now simply draw a bounding box around the object of interest and the object will be tracked across frames.
To sign up for the beta, please fill out this form.
All Labelbox customers will be moving to a new way to queue and review before the end of January.
We’ve continued to roll out batch-based queueing, custom review workflows, and the Data Rows tab across our user base.
All new projects will automatically be configured with batch-based queueing, the Workflow tab, and the Data Rows tab.
Decide which data to label first with batch-based queueing
Learn more about the power of batches: How to prepare and submit a batch for labeling
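As a rough illustration of the idea behind batch-based queueing, the sketch below serves data rows from higher-priority batches before lower-priority ones. It is not Labelbox’s implementation or SDK: the `LabelingQueue` class, its method names, and the convention that a lower number means higher priority are assumptions made for this example.

```python
import heapq
from itertools import count
from typing import List, Tuple


class LabelingQueue:
    """Hypothetical queue that serves data rows from higher-priority batches first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # Tie-breaker so rows with equal priority keep their submission order.
        self._order = count()

    def submit_batch(self, data_row_ids: List[str], priority: int) -> None:
        """Add a batch of data rows; a lower number means higher priority (assumed)."""
        for row_id in data_row_ids:
            heapq.heappush(self._heap, (priority, next(self._order), row_id))

    def next_data_row(self) -> str:
        """Remove and return the next data row a labeler should work on."""
        _, _, row_id = heapq.heappop(self._heap)
        return row_id

    def __len__(self) -> int:
        return len(self._heap)
```

In this sketch, a batch submitted later with priority 1 is served before an earlier batch with priority 5, so newly important data does not have to wait behind the existing backlog.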
Customize your review process with the Workflow tab
Learn more about workflows: How to customize your annotation review process
Better search, surface, and prioritize data within a project
Learn more about the Data Rows tab: How to search, surface, and prioritize data within a project
On November 21st, we released the above update to Free, Education, and Starter users. On December 15th, we began rolling out this update to our Pro and Enterprise customers.
In Q1 2023, Labelbox will open a migration path that will allow you to move all of your old projects into this new paradigm.
We’ve compiled a list of resources to help you better familiarize yourself with this new way to queue & review:
You can harness the potential of the most powerful language models, such as ChatGPT and BERT, and tailor them to your unique business application. Domain-specific chatbots need to be trained on quality annotated data that relates to your specific use case.