Catalog

Curate unstructured data with precision

Quickly search and visualize all of your unstructured data in one place. With all your data, metadata, labels, and predictions at your fingertips, you can make better decisions to unblock your AI initiatives.

Search for data faster than ever

Search for labeled and unlabeled data using filters for metadata, model inferences, and other attributes like embedding similarity. No need for an engineer to write one-off query scripts just to find data. Send data directly to a labeling project in just a few clicks.

Leverage active learning to curate better data

Not all data impacts model performance equally. Through our active learning workflows and uncertainty sampling, you can filter for data with low-confidence predictions to curate and label the right data, not just more data.
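To illustrate the idea behind uncertainty sampling, here is a minimal sketch in plain Python. It is not the Labelbox API: the function name `least_confident`, the asset IDs, and the 0.6 threshold are all assumptions made for the example. It keeps only the assets whose top predicted probability falls below the threshold, least confident first.

```python
from typing import Dict, List


def least_confident(predictions: Dict[str, List[float]], threshold: float = 0.6) -> List[str]:
    """Return asset IDs whose top class probability is below `threshold`.

    Results are ordered from least to most confident, so the most
    uncertain assets are first in line for labeling.
    """
    # Pair each asset with the confidence of its most likely class.
    scored = [(max(probs), asset_id) for asset_id, probs in predictions.items()]
    return [asset_id for confidence, asset_id in sorted(scored) if confidence < threshold]


# Hypothetical model outputs: asset ID -> class probabilities.
predictions = {
    "img_001": [0.97, 0.02, 0.01],
    "img_002": [0.40, 0.35, 0.25],
    "img_003": [0.55, 0.30, 0.15],
}

to_label = least_confident(predictions)
print(to_label)
```

Least-confidence scoring is only one way to measure uncertainty; margin or entropy scores could be swapped in without changing the surrounding logic.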

Auto label data using weak supervision techniques

Assign custom metadata in bulk to assets that meet Catalog filter criteria, including embedding similarity, without manual labeling. Leverage this metadata in downstream workflows like helping curate data for labeling projects or structuring custom review workflows. Automatically apply this metadata to new data that meets the filter criteria.
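Conceptually, this kind of rule-based tagging can be sketched as a filter plus a bulk assignment. The snippet below is plain Python, not the Labelbox API: the `apply_rule` helper, the `capture_hour` and `lighting` fields, and the asset records are hypothetical names chosen for illustration. Re-running the same rule on newly arrived assets mirrors the automatic application described above.

```python
from typing import Any, Callable, Dict, List

Asset = Dict[str, Any]


def apply_rule(assets: List[Asset], rule: Callable[[Asset], bool], field: str, value: Any) -> int:
    """Set metadata[field] = value on every asset matching `rule`; return the match count."""
    tagged = 0
    for asset in assets:
        if rule(asset):
            asset.setdefault("metadata", {})[field] = value
            tagged += 1
    return tagged


def is_night(asset: Asset) -> bool:
    """Hypothetical filter: captured between 20:00 and 05:59."""
    hour = asset["metadata"]["capture_hour"]
    return hour >= 20 or hour < 6


assets = [
    {"id": "img_001", "metadata": {"capture_hour": 23}},
    {"id": "img_002", "metadata": {"capture_hour": 14}},
    {"id": "img_003", "metadata": {"capture_hour": 3}},
]
tagged_count = apply_rule(assets, is_night, "lighting", "night")

# The same rule can be applied again when new data arrives.
new_assets = [{"id": "img_004", "metadata": {"capture_hour": 21}}]
new_tagged_count = apply_rule(new_assets, is_night, "lighting", "night")
print(tagged_count, new_tagged_count)
```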

Analyze your data with metrics that matter

View a detailed class distribution of ground truth labels or model inferences to get a better understanding of your data. See how performance metrics like F1 score vary across your data so you can make the most informed decisions when curating data to label.
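For readers who want to see what these metrics measure, the following plain-Python sketch computes a class distribution and a per-class F1 score from paired ground truth and predicted labels. The label values and function names are assumptions for the example and are not tied to any Labelbox interface.

```python
from collections import Counter
from typing import Dict, List


def class_distribution(labels: List[str]) -> Dict[str, int]:
    """Count how many times each class appears."""
    return dict(Counter(labels))


def f1_per_class(truth: List[str], predicted: List[str]) -> Dict[str, float]:
    """Compute F1 = 2*TP / (2*TP + FP + FN) for each class."""
    scores = {}
    for cls in set(truth) | set(predicted):
        pairs = list(zip(truth, predicted))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denominator = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denominator if denominator else 0.0
    return scores


# Hypothetical ground truth and model inferences for five assets.
truth = ["cat", "cat", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat"]

distribution = class_distribution(truth)
scores = f1_per_class(truth, predicted)
print(distribution)
print(scores)
```

A class that is both rare in the distribution and low in F1 (here, "bird") is a natural candidate for more labeled examples.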

Use embeddings to uncover new insights

Model embeddings help you quickly uncover high-level patterns and visually similar data from across all your datasets. Labelbox offers precomputed embeddings by default, or you can upload your own via the SDK. Use Similarity Search to find more examples of low-performing classes, edge cases, or other rare data.
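The general idea of an embedding similarity search can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not how Similarity Search is implemented: cosine similarity is one common choice of metric, and the two-dimensional vectors, asset IDs, and `most_similar` helper are made up for the example.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query: List[float], embeddings: Dict[str, List[float]], top_k: int = 2) -> List[str]:
    """Return the IDs of the `top_k` assets whose embeddings are closest to `query`."""
    ranked = sorted(
        embeddings,
        key=lambda asset_id: cosine_similarity(query, embeddings[asset_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical 2-D embeddings; real embeddings have hundreds of dimensions.
embeddings = {
    "img_a": [1.0, 0.0],
    "img_b": [0.9, 0.1],
    "img_c": [0.0, 1.0],
    "img_d": [-1.0, 0.0],
}

# Embedding of a known edge case we want more examples of.
query = [1.0, 0.05]
neighbors = most_similar(query, embeddings)
print(neighbors)
```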

Share and act on insights faster

Don’t let searching for data and edge cases slow your team down or hold up conversations with stakeholders or customers. Instead of relying on one-off query scripts, search and discover data faster inside Catalog.

“Catalog is huge for us. Pre-Catalog, we only had 50% accuracy on models and would have to rely on tedious manual data selection. With Catalog in Labelbox, we can quickly search and visualize all of our unstructured data and use active learning and weak supervision techniques to target data collection on models quickly – it takes a lot of the time and effort out of the data selection process.”

Noé Barrell

ML Engineer, Deque