Datasets

Updated by Alex Cota

Below are some frequently used methods for Datasets. For a complete list of methods, see the API reference.

A Dataset is a collection of Data Rows.

Before you start

  1. Complete the installation and authentication steps.
  2. Make sure the API client is initialized:
from labelbox import Client
client = Client()
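If you would rather pass the API key explicitly when initializing the client, a minimal sketch might look like the following. It assumes the key is stored in an environment variable named LABELBOX_API_KEY. That variable name and the read_api_key helper are assumptions made for illustration and are not specified on this page.

```python
import os


def read_api_key(env=None, var="LABELBOX_API_KEY"):
    """Return the API key stored in `var`, or raise a clear error if it is missing.

    Both the variable name and this helper are assumptions made for illustration.
    """
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var} environment variable before creating a Client.")
    return key


# Usage, assuming the labelbox package is installed:
# from labelbox import Client
# client = Client(api_key=read_api_key())
```

Failing early with a readable message is usually easier to debug than an authentication error raised later by the first API call.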

Create a Dataset

Use the create_dataset method to create a dataset and attach it to a project.

project = client.get_project("<project_id>")
dataset = client.create_dataset(name="<dataset_name>", projects=project)

Where:

  • name is the name you give your Dataset.
  • projects is the project you wish to attach the Dataset to.
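Calling create_dataset repeatedly with the same name may leave you with several datasets of that name. One way to avoid this is a "get or create" helper, sketched below. The helper name get_or_create_dataset is hypothetical, client is assumed to be an initialized Client, and the name comparison is done client-side to keep the example short.

```python
def get_or_create_dataset(client, name):
    """Return the first dataset called `name`, creating one if none exists.

    Hypothetical helper: `client` is assumed to be an initialized labelbox
    Client. If several datasets share the name, the first one returned wins.
    """
    for dataset in client.get_datasets():
        if dataset.name == name:
            return dataset
    # No existing dataset matched, so create a new one with that name.
    return client.create_dataset(name=name)


# Usage, assuming `client` was initialized as shown in "Before you start":
# dataset = get_or_create_dataset(client, "<dataset_name>")
```

This sketch walks every dataset you can access, so for accounts with many datasets a server-side where filter on get_datasets is likely the better choice.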

Fetch a Dataset

Use the get_dataset method to fetch a Dataset by ID.

dataset = client.get_dataset("<dataset_id>")
print(dataset)

Fetch multiple Datasets

Below are three examples for fetching multiple datasets using the get_datasets method.

a. Fetch all datasets.

for dataset in client.get_datasets():
    print(dataset.uid, dataset.name)
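If you need to look datasets up later instead of printing them, the same loop can build a lookup table. The sketch below is one possible approach; the helper name dataset_names_by_uid is hypothetical, and client is assumed to be an initialized Client.

```python
def dataset_names_by_uid(client):
    """Map each dataset's uid to its name.

    Hypothetical helper: `client` is assumed to be an initialized labelbox
    Client whose get_datasets() yields objects with `uid` and `name`.
    """
    return {dataset.uid: dataset.name for dataset in client.get_datasets()}


# Usage, assuming `client` was initialized as shown in "Before you start":
# names = dataset_names_by_uid(client)
# print(names.get("<dataset_id>"))
```

The table is keyed by uid instead of name because the uid identifies a single dataset, whereas this page gives no guarantee that names are unique.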

b. Pass a where parameter with a standard comparison operator (==, !=, >, >=, <, <=). Then, iterate over the PaginatedCollection object.

To learn more about PaginatedCollection objects, see the Pagination docs.

from labelbox import Dataset
datasets = client.get_datasets(where=Dataset.name == "<dataset_name>")
for x in datasets:
    print(x)

c. Combine comparisons using logical expressions. Currently, the where clause supports the logical AND operator.

from labelbox import Dataset
datasets = client.get_datasets(where=(Dataset.name == "<dataset_name>") & (Dataset.description == "<dataset_description>"))
for x in datasets:
    print(x)
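To make the AND semantics concrete, the sketch below applies the same kind of combined condition client-side: a dataset matches only if every listed field equals the expected value. The matches_all helper is hypothetical and only illustrates the logic. It is not how the where clause is evaluated by the API.

```python
def matches_all(dataset, **expected):
    """Return True only if every named attribute equals its expected value.

    Hypothetical helper that mirrors a logical AND of equality comparisons.
    """
    return all(
        getattr(dataset, field, None) == value
        for field, value in expected.items()
    )


# Usage, assuming `client` was initialized as shown in "Before you start":
# matching = [
#     d for d in client.get_datasets()
#     if matches_all(d, name="<dataset_name>", description="<dataset_description>")
# ]
```

Filtering this way fetches every dataset first, so the server-side where clause is generally preferable when it can express the condition.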
