Import annotations

Updated by Alex Cota

This page describes how to import annotations as pre-labels to Labelbox via the Python SDK. For an overview of importing predictions/annotations, see our documentation on Model-assisted labeling.

Before you start

Make sure you have the proper authentication. To learn how to create an API key, see Getting started. Also, if you are importing annotations from an NDJSON file, see our documentation on creating an NDJSON file.

Import annotations

There are three ways to import annotations using the instance method upload_annotations in the Project class.

If you are importing more than 1,000 mask annotations at a time, consider submitting separate jobs, as they can take longer than other annotation types to import.
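
One way to submit separate jobs is to split the annotation list into batches before uploading. The following is a minimal sketch, not part of the SDK: the chunked helper, the batch size of 1,000, and the job names are assumptions chosen for illustration, and the upload call is left as a comment because it needs a real project.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[dict], size: int = 1000) -> Iterator[List[dict]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical usage: one import job per batch, each with its own name.
# `project` and `mask_annotations` are assumed to exist already.
# for i, batch in enumerate(chunked(mask_annotations, 1000)):
#     project.upload_annotations(name=f"mask_import_{i}", annotations=batch)

# Stand-in data to show how 2,500 items are split.
batches = list(chunked([{"n": i} for i in range(2500)], 1000))
print([len(b) for b in batches])  # [1000, 1000, 500]
```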

Wait until the import job is complete before opening the Editor to make sure all annotations are imported properly.

1. Pass public URL to NDJSON file

This sample script uses the upload_annotations method to pass a publicly hosted URL pointing to the NDJSON file containing the annotations.

upload_job = project.upload_annotations(
    name="upload_annotation_job_1",
    annotations="https://storage.googleapis.com/public-bucket/predictions.ndjson")

If you are passing a URL to an NDJSON file, check that the host of the public URL allows standard browsers to download the file by doing the following:

  1. Navigate to your URL using any browser. It should return the expected NDJSON.
  2. Run wget -O- --user-agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36' <url> | cat. It should return the expected NDJSON.

2. Upload local NDJSON file

You can also use the upload_annotations method to import a local NDJSON file. Labelbox validates that the file is proper NDJSON by checking that every line of the file is valid JSON.

from pathlib import Path

predictions_file = Path("/home/predictions/predictions.ndjson")

upload_job = project.upload_annotations(
    name="upload_annotation_job_1",
    annotations=predictions_file)
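
Because every line must be valid JSON, it can help to run the same kind of check locally before uploading. The sketch below is an illustration only and is not the validation Labelbox performs: find_invalid_ndjson_lines is a hypothetical helper, and skipping blank lines is an assumption.

```python
import json
from pathlib import Path
from typing import List, Tuple


def find_invalid_ndjson_lines(path: Path) -> List[Tuple[int, str]]:
    """Return (line_number, error_message) for each line that is not valid JSON."""
    problems = []
    with path.open("r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                # Assumption: blank lines are ignored; a stricter check might reject them.
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, exc.msg))
    return problems


# Hypothetical usage with the path from the sample above:
# problems = find_invalid_ndjson_lines(Path("/home/predictions/predictions.ndjson"))
# if problems:
#     print(f"Fix these lines before uploading: {problems}")
```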

3. Pass a list of dictionaries

The upload_annotations method also accepts annotations as a list of dictionaries. Labelbox will automatically convert the dicts to an NDJSON file.

annotations = [
    {
        "uuid": "9fd9a92e-2560-4e77-81d4-b2e955800092",
        "schemaId": "ckappz7d700gn0zbocmqkwd9i",
        "dataRow": {
            "id": "ck1s02fqxm8fi0757f0e6qtdc"
        },
        "bbox": {
            "top": 48,
            "left": 58,
            "height": 865,
            "width": 1512
        }
    },
    {
        "uuid": "29b878f3-c2b4-4dbf-9f22-a795f0720125",
        "schemaId": "ckappz7d800gp0zboqdpmfcty",
        "dataRow": {
            "id": "ck1s02fqxm8fi0757f0e6qtdc"
        },
        "polygon": [
            {"x": 147.692, "y": 118.154},
            {"x": 142.769, "y": 404.923},
            {"x": 57.846, "y": 318.769},
            {"x": 28.308, "y": 169.846}
        ]
    }
]

upload_job = project.upload_annotations(
    name="upload_annotation_job_1",
    annotations=annotations)
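
If you generate these dictionaries from model output, a small helper keeps the structure consistent and gives each annotation a fresh uuid. This is a minimal sketch: make_bbox_annotation is a hypothetical helper, not an SDK function, and it only mirrors the bounding box layout shown in the sample above.

```python
import uuid


def make_bbox_annotation(schema_id: str, data_row_id: str,
                         top: float, left: float,
                         height: float, width: float) -> dict:
    """Build one bounding box annotation dict in the layout shown above."""
    return {
        "uuid": str(uuid.uuid4()),  # a new random identifier per annotation
        "schemaId": schema_id,
        "dataRow": {"id": data_row_id},
        "bbox": {"top": top, "left": left, "height": height, "width": width},
    }


# The IDs below are the ones from the sample above.
annotation = make_bbox_annotation(
    "ckappz7d700gn0zbocmqkwd9i", "ck1s02fqxm8fi0757f0e6qtdc",
    top=48, left=58, height=865, width=1512)
print(annotation["bbox"])
```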

Check import status

Project.upload_annotations returns a BulkImportRequest. You can use this object to check the state of the job.

BulkImportRequestState refers to the whole import job and returns one of the following states:

RUNNING

Indicates that the import job is not done yet. See the sample script below.

FAILED

If state is FAILED, you’ll only get an errorFileUrl to an NDJSON containing the error message. The statusFileUrl will be null.

FINISHED

If state is FINISHED, you’ll get a statusFileUrl to an NDJSON (expires after 24 hours) that contains a SUCCESS or FAILED status per prediction. You’ll also get an errorFileUrl to an NDJSON that has the same format as the statusFileUrl file, except that it contains only the error messages for predictions that did not import successfully.

Additionally, BulkImportRequest exposes wait_until_done. Retrieve the request by name and block with the wait_until_done method, as in the sample script below. Once the job is no longer in the RUNNING state, its state is either FAILED or FINISHED.

Note: Any import request takes at least a few minutes. To verify that it has finished running, retrieve the request instance again at a later point in time.

from labelbox import Client
from labelbox.schema.bulk_import_request import BulkImportRequest
from labelbox.schema.enums import BulkImportRequestState

client = Client(api_key="<LABELBOX_API_KEY>")

upload_job = BulkImportRequest.from_name(
    client,
    project_id="<project_id>",
    name="test_bulk_import_request")

upload_job.wait_until_done()
assert (
    upload_job.state == BulkImportRequestState.FINISHED or
    upload_job.state == BulkImportRequestState.FAILED
)
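
If you would rather cap how long your script waits, you could poll the state yourself instead of blocking indefinitely. The sketch below is a generic polling loop and does not reproduce how wait_until_done is implemented: wait_for_terminal_state, the get_state callable, and the interval and timeout values are all assumptions. In practice get_state would re-fetch the request and return the name of its state.

```python
import time
from typing import Callable

TERMINAL_STATES = {"FINISHED", "FAILED"}


def wait_for_terminal_state(get_state: Callable[[], str],
                            poll_seconds: float = 30.0,
                            timeout_seconds: float = 3600.0) -> str:
    """Call get_state until it returns FINISHED or FAILED, or raise on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"import still {state} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Stand-in for a real job: reports RUNNING twice, then FINISHED.
fake_states = iter(["RUNNING", "RUNNING", "FINISHED"])
final_state = wait_for_terminal_state(lambda: next(fake_states), poll_seconds=0)
print(final_state)  # FINISHED
```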

The state field refers to the whole import job and will be one of the following:

BulkImportRequestState.RUNNING
BulkImportRequestState.FAILED
BulkImportRequestState.FINISHED

Once the import has finished, the BulkImportRequest object also holds a status_file_url and an error_file_url. These are URLs to NDJSON files that contain the status or error of each annotation that was uploaded.

Each line in the NDJSONs will include:

uuid

Specifies the prediction for the status row.

dataRow

JSON object containing the Labelbox data row ID for the failed prediction.

status

Indicates SUCCESS or FAILED.

errors

An array of error messages. Only present if status is FAILED.

For predictions not successfully uploaded, fix the errors indicated in the error_file_url and start another bulk import with just the annotations that require a retry.
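
To pick out the annotations that need a retry, you can match the uuid of each FAILED row against the list you originally uploaded. This is a minimal sketch under stated assumptions: failed_uuids and select_for_retry are hypothetical helpers, and the status rows and error text below are made-up sample data that only follow the fields listed above.

```python
import json
from typing import Dict, List


def failed_uuids(status_ndjson: str) -> Dict[str, list]:
    """Map the uuid of each FAILED row to its list of error messages."""
    failures = {}
    for line in status_ndjson.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        if row.get("status") == "FAILED":
            failures[row["uuid"]] = row.get("errors", [])
    return failures


def select_for_retry(annotations: List[dict], failures: Dict[str, list]) -> List[dict]:
    """Keep only the annotations whose uuid appears among the failures."""
    return [a for a in annotations if a["uuid"] in failures]


# Made-up status file content for illustration.
status_text = "\n".join([
    json.dumps({"uuid": "a1", "dataRow": {"id": "row-1"}, "status": "SUCCESS"}),
    json.dumps({"uuid": "b2", "dataRow": {"id": "row-2"}, "status": "FAILED",
                "errors": ["made-up error message"]}),
])
uploaded = [{"uuid": "a1"}, {"uuid": "b2"}]

failures = failed_uuids(status_text)
to_retry = select_for_retry(uploaded, failures)
print(to_retry)  # [{'uuid': 'b2'}]
```

After fixing the reported errors in the selected annotations, you would pass that list to a new import job.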

import ndjson
import requests

print(f'Here is the status file: {upload_job.status_file_url}')
print(f'Here is the error file: {upload_job.error_file_url}')

# Download the error file, if there is one, and print each failure.
if upload_job.error_file_url:
    res = requests.get(upload_job.error_file_url)
    errors = ndjson.loads(res.text)
    for error in errors:
        print(
            "An annotation failed to import for "
            f"datarow: {error['dataRow']} due to: "
            f"{error['errors']}")
