Updated by Alex Cota
Here are some general concepts that are helpful to be aware of as you work with the Labelbox Python SDK.
Access a field as an attribute of the object.
project.name # Get name of project
To update a field, use the update method to modify an updatable value on the object. Pass the field name and the new value.
project.update(name="Project Name") # Update project name
To access related objects, call the relationship as a method. For example, to get all datasets for a project, call datasets as a method on the project object.
project.datasets() # Get all datasets for project
To add a relationship between two objects, call the connect method directly on the relationship.
project.datasets.connect(dataset_1) # Connect dataset_1 to project
To update a relationship, use the disconnect method and then the connect method. Note that update() does not work for updating relationships.
project.datasets.disconnect(dataset_1) # Disconnect dataset_1
project.datasets.connect(dataset_2) # Connect dataset_2
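The disconnect-then-connect sequence can be wrapped in a small helper. The sketch below is an illustration only: swap_related and FakeRelationship are hypothetical names, and the stand-in class merely imitates the connect, disconnect, and call behavior described above so that the example runs without a server.

```python
class FakeRelationship:
    """Hypothetical stand-in imitating a relationship like project.datasets."""

    def __init__(self) -> None:
        self._related = []

    def __call__(self):
        # Calling the relationship returns the related objects.
        return iter(self._related)

    def connect(self, obj) -> None:
        self._related.append(obj)

    def disconnect(self, obj) -> None:
        self._related.remove(obj)


def swap_related(relationship, old, new) -> None:
    """Replace `old` with `new` by disconnecting first, then connecting."""
    relationship.disconnect(old)
    relationship.connect(new)


datasets = FakeRelationship()
datasets.connect("dataset_1")
swap_related(datasets, "dataset_1", "dataset_2")
related_now = list(datasets())  # only dataset_2 remains related
```

With real SDK objects, the call would presumably look like swap_related(project.datasets, dataset_1, dataset_2), assuming the relationship behaves as described above.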
Sometimes, a call to the server may result in a very large number of objects being returned. To prevent too many objects being returned at once, the Labelbox server API limits the number of returned objects. The Python SDK respects that limit and automatically paginates fetches. This is done transparently for you, but it has some implications.
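projects = client.get_projects() # Returns a PaginatedCollection; nothing is fetched yet
projects = list(projects) # Fetches all projects, page by page, into a list
project = projects[0] # Pick one project from the list
datasets = project.datasets() # Returns another PaginatedCollection
for dataset in datasets: # Pages are fetched as the loop advances
    print(dataset.name)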
There are several points of interest in the code above.
- For both the top-level object fetch, client.get_projects(), and the relationship call, project.datasets(), a PaginatedCollection object is returned. This PaginatedCollection object takes care of the paginated fetching.
- Note that nothing is fetched immediately when the PaginatedCollection object is created.
- Round-trips to the server are made only as you iterate through a PaginatedCollection. In the code above, that happens when a list is initialized with a PaginatedCollection and when a PaginatedCollection is iterated over in a for loop.
- You cannot get a count of objects in the relationship from a PaginatedCollection, nor can you access objects within it like you would in a list (using square-bracket indexing). You can only iterate over it.
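The lazy behavior described in the points above can be illustrated with a small stand-in. This sketch is not the SDK's implementation: fake_server_page, LazyCollection, and the page size of 3 are hypothetical names and values chosen only to show when round-trips would happen.

```python
from typing import Iterator, List

PAGE_SIZE = 3     # hypothetical page size, not the real server limit
ROUND_TRIPS = []  # records every simulated request, for illustration


def fake_server_page(offset: int) -> List[str]:
    """Simulate one round-trip that returns up to PAGE_SIZE items."""
    ROUND_TRIPS.append(offset)
    all_items = [f"project-{i}" for i in range(8)]
    return all_items[offset:offset + PAGE_SIZE]


class LazyCollection:
    """Hypothetical stand-in for a paginated collection: iterable only."""

    def __iter__(self) -> Iterator[str]:
        offset = 0
        while True:
            page = fake_server_page(offset)  # a fetch happens only here
            if not page:
                return
            yield from page
            offset += len(page)


collection = LazyCollection()
requests_before_iteration = len(ROUND_TRIPS)  # nothing fetched on creation

first = next(iter(collection))  # triggers exactly one page fetch
requests_after_first_item = len(ROUND_TRIPS)

everything = list(collection)  # a fresh iteration fetches every page
```

Because the stand-in defines only __iter__, calling len(collection) or indexing with collection[0] raises a TypeError, which mirrors the iterate-only restriction in the last point.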
Be careful about converting a PaginatedCollection into a list. This causes all objects in that collection to be fetched from the server. When you need only some objects (say, the first 10), it is much faster to iterate over the PaginatedCollection and stop once you have what you need.
The following code demonstrates how to do this.
data_rows = dataset.data_rows()
first_ten = []
for data_row in data_rows:
    first_ten.append(data_row)
    if len(first_ten) >= 10:
        break # Stop iterating so no further pages are fetched
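The same early stop can also be written with itertools.islice from the Python standard library, which stops pulling items once it has enough. The sketch below uses a hypothetical generator, fake_data_rows, in place of dataset.data_rows() so that it runs without a server. It assumes the real collection can be consumed like any other iterable, as the for loop above does.

```python
from itertools import islice

fetched = []  # records which items were actually produced


def fake_data_rows():
    """Hypothetical stand-in for dataset.data_rows(): yields items lazily."""
    for i in range(1000):
        fetched.append(i)
        yield f"data-row-{i}"


# Take at most 10 items; the remaining items are never produced.
first_ten = list(islice(fake_data_rows(), 10))
```

If the collection holds fewer than 10 items, islice simply returns however many exist, so no length check is needed.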
Immediate updates on the server side
Each data update using object.update() on the client side immediately performs the same update on the server side. If the client-side update does not raise an exception, you can assume that the update succeeded on the server side.
When you fetch an object from the server, the client obtains all field values for that object. When you access that obtained field value, the cached value is returned. There is no round-trip to the server to get the field value you have already fetched. Server-side updates that happen after the client-side fetch are not auto-propagated, meaning the values returned will still be the cached values.
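The effect of field caching can be sketched with a toy model. Nothing here is SDK code: server_state, CachedObject, and fetch are hypothetical stand-ins that copy field values once at fetch time, which is the behavior described above.

```python
server_state = {"name": "Original name"}  # pretend server-side record


class CachedObject:
    """Hypothetical stand-in: copies field values once, at fetch time."""

    def __init__(self, record: dict) -> None:
        self.name = record["name"]  # cached on the client from here on


def fetch() -> CachedObject:
    """Simulate fetching the object from the server."""
    return CachedObject(server_state)


project = fetch()
server_state["name"] = "Renamed elsewhere"  # change made after the fetch

stale_name = project.name  # still the value cached at fetch time
fresh_name = fetch().name  # fetching again picks up the change
```

In this model, the only way to see the newer value is to fetch the object again; reading the attribute never contacts the simulated server.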
Unlike fields, relationships are not cached. Relationships are fetched every time you call them. This is made explicit by defining relationships as callable methods on objects (refer to the section on relationships above).
project.name # Fields are accessed as attributes
project.datasets() # Relationships are called as methods
In many cases, you may not be concerned with relationship data freshness because only you will be modifying your data during small timeframes. In those situations, it is completely fine to keep references to related objects.
project_datasets = list(project.datasets())