A vital part of a strong data engine is the creation of large volumes of high-quality training data.
A common bottleneck for many AI teams is the lack of a holistic view across their data, which makes it hard to filter and prioritize the right data to label. This lack of visibility turns labeling operations into a black box and can cause teams to spend more time and money on labeling.
The better you’re able to search, understand, and manage your data, the faster you’ll be able to prioritize Data Rows for labeling and accelerate model development.
The Data Rows tab is the central hub for all Data Rows within a given project. It allows you to view, manage, and filter for Data Rows within your project.
With the Data Rows tab, teams have a holistic view of:
Benchmark - shows the agreement score
Consensus - shows the number of labels completed and the agreement score
You can sort Data Rows by:
To surface a subset of Data Rows, teams can filter on:
Created at
Labeled by
Reviewed by
Re-worked by (coming soon)
Skipped
The Data Rows tab works with the new Overview page to give teams a holistic picture of all their Data Rows and where each data row is in their labeling workflow.
Prior to the Data Rows tab, teams used the Labels tab to keep track of Data Row activity. While it provided a view of all Data Rows within a project, teams were limited in how quickly they could surface Data Rows of interest.
Compared with the Labels tab, the Data Rows tab supports more advanced filtering, so teams can find Data Rows and quickly understand the status of each one. Teams now have a more cohesive way of filtering and surfacing specific Data Rows across Catalog and Model.
Designed to work with batches and multi-step review workflows, the Data Rows tab gives you a better, more holistic view of your labeling operations.
You can learn more about the Data Rows tab in our documentation.
The Data Rows tab updates in sync with your project’s Overview page, so you can easily see how your Data Rows are progressing through your project’s workflow.
View all Data Rows within a specific stage of your workflow in the left panel of the Data Rows tab. Clicking into each status will bring up all the data rows within that stage of your workflow.
You can also view your Data Rows in “gallery view”, which shows each Data Row as a thumbnail and renders any bounding box annotations in the preview.
Teams can use dynamic filters to query and surface specific Data Rows of interest. Mirroring the search capabilities in Catalog, you can query for Data Rows within a project faster than ever.
With flexible querying, you can use a combination of AND/OR conditions on attributes for more granular searches. Filter on:
Created at
Labeled by
Reviewed by
Re-worked by (coming soon)
Skipped
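To make the AND/OR combination concrete, here is a minimal sketch in plain Python. It does not use the product's API: the `DataRow` record, its field names, the sample values, and the `all_of` / `any_of` helpers are all hypothetical and only illustrate how conditions on the attributes above can be nested.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class DataRow:
    # Hypothetical record; fields loosely mirror the filter attributes listed above.
    id: str
    created_at: date
    labeled_by: Optional[str] = None
    reviewed_by: Optional[str] = None
    skipped: bool = False


Predicate = Callable[[DataRow], bool]


def all_of(*preds: Predicate) -> Predicate:
    """AND: a row matches only if every condition holds."""
    return lambda row: all(p(row) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """OR: a row matches if at least one condition holds."""
    return lambda row: any(p(row) for p in preds)


# Made-up sample data for illustration only.
rows = [
    DataRow("dr-1", date(2022, 9, 1), labeled_by="ana", reviewed_by="li"),
    DataRow("dr-2", date(2022, 9, 15), labeled_by="ben"),
    DataRow("dr-3", date(2022, 10, 2), skipped=True),
    DataRow("dr-4", date(2022, 10, 5), labeled_by="ana"),
]

# (labeled by "ana" AND created on or after 2022-10-01) OR skipped
query = any_of(
    all_of(
        lambda r: r.labeled_by == "ana",
        lambda r: r.created_at >= date(2022, 10, 1),
    ),
    lambda r: r.skipped,
)

matching_ids = [r.id for r in rows if query(r)]
print(matching_ids)  # ['dr-3', 'dr-4']
```

Nesting `all_of` inside `any_of` is what gives the more granular searches: each group narrows on several attributes at once, while the outer OR widens the result to include additional groups.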
Batches are collections of Data Rows that are queued from Catalog and added to your labeling project. They are critical in enabling faster data-centric iterations and in unlocking active learning workflows that address label or model errors.
You can easily manage and view batches directly from the Data Rows tab:
How to add, view & manage batches for a Benchmark project
How to add, view & manage batches for a Consensus project
How to delete a batch
You can learn more about batches in this guide or in our documentation.
Complex projects might contain a large number of Data Rows. It’s important that teams are able to effectively manage Data Rows of interest to improve labeling efficiency.
You can perform actions in bulk by selecting multiple Data Rows and choosing one of the actions below:
For many Enterprise teams working on larger and more complex projects, a key question becomes how to structure, review, and complete training data projects in a systematic way.
Multi-step review workflows give teams the flexibility to review Data Rows at a specific step of the review process. Rather than making you review or sort through all of your Data Rows, the Data Rows tab gives teams a holistic look into all the review steps within a project’s workflow.
Flexible querying with dynamic filters:
The search capabilities in the Data Rows tab mirror those in Catalog: you can query for a specific subset of Data Rows within a project to better QA and understand all the Data Rows in that project.
Filtering by Annotation type or sorting by Function allows teams to identify and QA a subset of Data Rows that meet specific criteria. Alongside Workflows, the Data Rows tab enables ad-hoc review and flexible QA, and lets you view and manage your entire labeling operation.