Labelbox · March 2, 2023

Group, prioritize, and analyze data with slices and a new way to export your data

Selecting high-impact data is crucial for improving model performance and informing your downstream ML workflow. This month, we're introducing a way for you to easily group and investigate specific slices of data in Model, as well as analyze specific data rows based on their metadata values. We are also introducing a new way to export your data from the Data Rows tab, giving you more control over which data rows and fields are exported.

Prioritize and save high-impact data rows in Model

To accurately improve data quality and debug your model, you need a way to organize and surface specific data rows within a model run (a training experiment that contains a versioned snapshot of annotations, predictions, and metadata). These specific data rows can represent instances of rare data, edge cases, or test scenarios that you would want to monitor and inspect further.

Slices are now available in Model, allowing you to filter for specific data rows and save search queries:

  • Automatically visualize, organize, and prioritize specific slices of data within a model run
  • As slices are dynamic, they provide you with a real-time view of all current data rows that match your slice’s search criteria
  • Generate custom slices based on a specific search query tailored to your use case. For example, you can organize your data by specific search filters or confidence thresholds
  • You can edit and update the filters associated with a given slice at any time
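
To make the "dynamic" behavior concrete, here is a minimal Python sketch of a slice as a saved query rather than a saved list of rows. It does not use the Labelbox SDK. `DataRow`, `Slice`, and the `confidence` field are hypothetical names chosen for illustration, and the way Labelbox evaluates slices internally may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DataRow:
    # Hypothetical stand-in for a data row in a model run.
    row_id: str
    confidence: float  # assumed: model confidence for the row's prediction


@dataclass
class Slice:
    # A slice stores the query, not the matching rows.
    name: str
    query: Callable[[DataRow], bool]

    def rows(self, data_rows: List[DataRow]) -> List[DataRow]:
        # Re-evaluate the saved query against whatever rows exist right now.
        return [row for row in data_rows if self.query(row)]


model_run_rows = [DataRow("a", 0.95), DataRow("b", 0.42)]
low_confidence = Slice("low-confidence", lambda row: row.confidence < 0.5)
before = [row.row_id for row in low_confidence.rows(model_run_rows)]

# A new row arrives; the slice picks it up without being re-saved.
model_run_rows.append(DataRow("c", 0.10))
after = [row.row_id for row in low_confidence.rows(model_run_rows)]

# Editing a slice amounts to replacing its query.
low_confidence.query = lambda row: row.confidence < 0.2
edited = [row.row_id for row in low_confidence.rows(model_run_rows)]
```

Because only the query is stored, the slice's contents change whenever the underlying rows or the filter change, which matches the real-time view and editable filters described above.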

Auto-generated slices

  • To make it even easier to spot trends in your predictions and annotations, Labelbox automatically groups your data into slices of interest based on model predictions
  • Once you upload your model predictions to a model run, you’ll see an option to enable auto-generated slices

The auto-generated slices provide a valuable starting point for evaluating your model’s effectiveness and can help you quickly surface any data quality issues or model weaknesses.

You can explore the auto-generated slices or save your first slice in Model today.

Analyze data rows on metadata values in Model

Metadata is non-annotation information on an asset that you can customize and upload to Labelbox.

  • You can now filter and search for data rows based on their metadata values. This allows you to analyze data rows and compare your model's performance on specific metadata values
  • For example, you can surface and drill into specific data rows that were taken within a specific date range or that contain a specific tag
  • You can also save data rows that share a specific metadata value as a slice for further analysis
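
As a rough illustration of the date-range and tag example, the sketch below filters an in-memory list of rows by metadata. It is plain Python, not the Labelbox SDK. The `captured_on` and `tag` field names and the `filter_by_metadata` helper are assumptions made up for this example.

```python
from datetime import date

# Hypothetical data rows with metadata; field names are illustrative only.
data_rows = [
    {"id": "row-1", "metadata": {"captured_on": date(2023, 1, 5), "tag": "night"}},
    {"id": "row-2", "metadata": {"captured_on": date(2023, 2, 10), "tag": "day"}},
    {"id": "row-3", "metadata": {"captured_on": date(2023, 2, 20), "tag": "night"}},
]


def filter_by_metadata(rows, start=None, end=None, tag=None):
    """Return rows inside the inclusive date range and matching the tag, when given."""
    matches = []
    for row in rows:
        meta = row["metadata"]
        if start is not None and meta["captured_on"] < start:
            continue
        if end is not None and meta["captured_on"] > end:
            continue
        if tag is not None and meta["tag"] != tag:
            continue
        matches.append(row)
    return matches


february = filter_by_metadata(data_rows, start=date(2023, 2, 1), end=date(2023, 2, 28))
february_night = filter_by_metadata(
    data_rows, start=date(2023, 2, 1), end=date(2023, 2, 28), tag="night"
)
february_ids = [row["id"] for row in february]
february_night_ids = [row["id"] for row in february_night]
```

Comparing a model metric on `february_night` against the same metric on `february` is the kind of per-metadata comparison the bullets describe.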

In beta: A new way to group and export your data from projects

After you explore and filter your data, you can export it to streamline AI workflows supported by adjacent tools and training environments within your pipeline.

We are making major improvements to the way you export annotations from a labeling project. With the current export, you can choose the time range of the data you export, but you have limited control over what other information is included.

With this new and improved way of exporting data from your projects, you have more control over what type of information you choose to export:

Additional export fields

You can now export more detailed information from your data rows via the UI as well as the SDK, and choose which relevant attributes to include in or exclude from your export.

From the Data Rows tab in our UI, you can select and export the subset of data rows that interests you most, based on predefined or new parameters:

Export the entire project

  • Click 'All data rows' at the top of the Data Rows tab
  • Select 'Export data v2 (beta)'
  • Choose which export fields to include and select 'Export JSON'
  • View export status in the notifications

Export based on label status

  • Choose to export data rows based on 'To label', 'In Review', 'In Rework', or 'Done' statuses
  • Select 'Export data v2 (beta)'
  • Choose which export fields to include and select 'Export JSON'
  • View export status in the notifications

Filter and export specific data rows

  • Use filters such as "Label actions" or "Find text" in the Data Rows tab to export a specific subset of your data
  • Select 'Export data v2 (beta)'
  • Choose which export fields to include and select 'Export JSON'
  • View export status in the notifications

Export a few specific data rows

  • Hand-select data rows in gallery view using the checkboxes at the top left of each data row
  • Select 'Export data v2 (beta)'
  • Choose which export fields to include and select 'Export JSON'
  • View export status in the notifications
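
The four export paths above share one shape: pick a set of data rows, choose which fields to include, and write JSON. The following Python sketch mimics that flow on in-memory data. It does not call the Labelbox SDK and does not reproduce the actual export format. `export_rows`, its parameters, and the row fields are hypothetical.

```python
import json

# Hypothetical rows; the status values mirror those listed in the Data Rows tab.
data_rows = [
    {"id": "row-1", "status": "Done", "label": "cat", "metadata": {"tag": "night"}},
    {"id": "row-2", "status": "In Review", "label": "dog", "metadata": {"tag": "day"}},
    {"id": "row-3", "status": "To label", "label": None, "metadata": {"tag": "day"}},
]


def export_rows(rows, fields, status=None, selected_ids=None):
    """Select rows, keep only the chosen fields, and return a JSON string."""
    exported = []
    for row in rows:
        # Optional filter on label status.
        if status is not None and row["status"] != status:
            continue
        # Optional filter on hand-selected row IDs.
        if selected_ids is not None and row["id"] not in selected_ids:
            continue
        record = {"id": row["id"]}
        record.update({field: row[field] for field in fields})
        exported.append(record)
    return json.dumps(exported)


# Entire project, all fields.
everything = json.loads(export_rows(data_rows, fields=["status", "label", "metadata"]))
# Only rows with a given label status.
done_only = json.loads(export_rows(data_rows, fields=["label"], status="Done"))
# A few hand-picked rows.
hand_picked = json.loads(
    export_rows(data_rows, fields=["status"], selected_ids={"row-2", "row-3"})
)
```

For the real export, including the supported field names and how to poll export status, the documentation referenced at the end of this post is the authoritative source.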

While we will continue to support the old way of exporting data, we encourage you to try the improved export functionality, which aligns with the import format standard.

Please refer to our documentation for more detailed instructions on how to export your data through the UI or through the Python SDK.