Model-assisted labeling

Updated by Alex Cota

Model-assisted labeling workflow

The model-assisted labeling workflow in the new Editor allows you to import computer-generated predictions and load them as editable Features on an asset. This can be useful for speeding up the labeling process and supporting human labeling efforts.

Model-assisted labeling supports the following label types:

Before you start

  1. Make sure you have the proper authentication.
  2. Create a project.
  3. Create your dataset and attach data rows.
  4. Use this query to get the IDs for your data rows.
query GetDataRows {
  dataset(where: {id: "<DATASET_ID>"}) {
    dataRows(first: 100) {
      id
    }
  }
}
  5. Turn on Model-assisted labeling for your project by navigating to “Settings” > “Automation”. When this is on, you will be able to view predictions in the labeling interface. Only admins can toggle Model-assisted labeling on or off.
    You may also turn predictions on via the GraphQL API by using the following mutation.
mutation TurnPredictionsOn($projectId: ID!) {
  project(where: {id: $projectId}) {
    showPredictionsToLabelers(show: true) {
      id
      showingPredictionsToLabelers
    }
  }
}

For a code sample, reference the end-to-end example in Project setup.
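
The GetDataRows query in step 4 returns a nested JSON response. Below is a minimal Python sketch for pulling the IDs out of it, assuming the usual GraphQL response envelope (data, then dataset, then dataRows). The function name and the sample response are hypothetical. Note that the query asks for first:100, so a single call presumably returns at most 100 IDs.

```python
from typing import Any, Dict, List


def extract_data_row_ids(response: Dict[str, Any]) -> List[str]:
    """Return the data row IDs from a GetDataRows response body.

    Assumes the usual GraphQL envelope:
    {"data": {"dataset": {"dataRows": [{"id": "..."}]}}}
    """
    if response.get("errors"):
        raise ValueError(f"GraphQL errors: {response['errors']}")
    dataset = (response.get("data") or {}).get("dataset") or {}
    return [row["id"] for row in dataset.get("dataRows", [])]


# Hypothetical response, shaped like the result of the query in step 4.
sample_response = {
    "data": {"dataset": {"dataRows": [{"id": "cjxav5aa07r1g0dsq70t9eveg"}]}}
}
data_row_ids = extract_data_row_ids(sample_response)
```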

Bulk import predictions

Step 1: Create the NDJSON file

Your import file must be in newline-delimited JSON (NDJSON) format. Each prediction gets its own line in the import file and must include the following information, regardless of its type.

uuid

User-generated UUID for each prediction. See the example below for a sample uuid.

The following UUID formats are supported:

A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11

{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}

a0eebc999c0b4ef8bb6d6bb9bd380a11

a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11

{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}
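
If you generate predictions in Python, the standard uuid module is one way to produce these values. The sketch below is illustrative only (the helper names are hypothetical): it generates a random version 4 UUID and shows that Python's uuid.UUID parser reads all five spellings listed above as the same UUID.

```python
import uuid

# The five spellings listed above, all for the same UUID.
SUPPORTED_SPELLINGS = [
    "A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11",
    "{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}",
    "a0eebc999c0b4ef8bb6d6bb9bd380a11",
    "a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11",
    "{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}",
]


def new_prediction_uuid() -> str:
    """Generate a random (version 4) UUID in canonical lowercase form."""
    return str(uuid.uuid4())


def canonical(value: str) -> str:
    """Normalize a UUID string to the canonical 8-4-4-4-12 lowercase form."""
    return str(uuid.UUID(value))


# All five spellings collapse to a single canonical value.
canonical_forms = {canonical(s) for s in SUPPORTED_SPELLINGS}
```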

schemaId

ID of the Feature schema defined in the ontology. To get this value, use this query and copy the value for featureSchemaId:

query {
  project(where: {id: "ck5r5laiahj9h0a561xrcnv6b"}) {
    ontology {
      normalized
    }
  }
}

dataRow

A JSON object holding the Labelbox ID of the data row.
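
As a rough illustration of the three required fields, the Python sketch below builds the common part of a prediction and serializes a list of predictions as NDJSON, one compact JSON object per line. The helper names are hypothetical, and the type-specific fields (such as mask or bbox) still need to be added to each prediction before import.

```python
import json
import uuid
from typing import Any, Dict, Iterable


def base_prediction(schema_id: str, data_row_id: str) -> Dict[str, Any]:
    """Build the three fields every prediction needs, whatever its type."""
    return {
        "uuid": str(uuid.uuid4()),      # user-generated UUID
        "schemaId": schema_id,          # Feature schema ID from the ontology
        "dataRow": {"id": data_row_id},  # Labelbox data row ID
    }


def to_ndjson(predictions: Iterable[Dict[str, Any]]) -> str:
    """Serialize predictions as NDJSON: one JSON object per line."""
    return "".join(
        json.dumps(p, separators=(",", ":")) + "\n" for p in predictions
    )
```

For example, to_ndjson([base_prediction("<SCHEMA_ID>", "<DATA_ROW_ID>")]) returns a single line of JSON followed by a newline.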

Mask predictions

Each mask color in your import file should match the corresponding mask color on the image. Passing a URI/mask color pair for a mask color that doesn’t exist in the image will generate an error and no prediction will be created.

This is the additional information you must provide in your import file if you are importing mask predictions.

instanceURI

Public URL for the mask you wish to import. If you are importing multiple mask predictions on one data row, each mask should reference the same instanceURI.

colorRGB

An array of RGB values from 0 to 255 that indicates which color represents each given mask. Only 3-channel RGB colors are supported.

Below is a sample of 3 masks that reference the same instanceURI and data row. Before you import, make sure the masks and data row dimensions match. See a full sample import file here.

{ 
"uuid": "45b15f9d-7884-4bb7-ac01-3567e8ed6c36",
"schemaId": "ck68grts29n7w0890wv344dif",
"dataRow": {
"id": "cjxav5aa07r1g0dsq70t9eveg"
},
"mask": {
"instanceURI": "https://api.labelbox.com/masks/feature/ck7a0jw4o0nk80x9o5offz4mc",
"colorRGB": [
255,
255,
255
]
}
}
{
"uuid": "3a95ddcd-3ad0-4dc5-a24e-c05004b4b4d5",
"schemaId": "ck7wi85rnd1050757aac5ba4d",
"dataRow": {
"id": "cjxav5aa07r1g0dsq70t9eveg"
},
"mask": {
"instanceURI": "https://api.labelbox.com/masks/feature/ck7a0jw4o0nk80x9o5offz4mc",
"colorRGB": [
255,
0,
0
]
}
}
{
"uuid": "f8284cbc-ecf3-4363-9e10-138501daf5f7",
"schemaId": "ck7wi85pr1xz6079026yc0hch",
"dataRow": {
"id": "cjxav5aa07r1g0dsq70t9eveg"
},
"mask": {
"instanceURI": "https://api.labelbox.com/masks/feature/ck7a0jw4o0nk80x9o5offz4mc",
"colorRGB": [
0,
0,
0
]
}
}
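
A small Python sketch of how a mask prediction like the ones above could be assembled, with a local check that colorRGB is a 3-channel value in the 0 to 255 range. The function name is hypothetical and the check only mirrors the rules stated in this section. It cannot verify that the color actually exists in the mask image.

```python
from typing import Any, Dict, Sequence


def mask_prediction(
    uuid_str: str,
    schema_id: str,
    data_row_id: str,
    instance_uri: str,
    color_rgb: Sequence[int],
) -> Dict[str, Any]:
    """Build one mask prediction, rejecting colors that are not 3-channel RGB."""
    if len(color_rgb) != 3:
        raise ValueError("colorRGB must have exactly 3 channels")
    if not all(isinstance(c, int) and 0 <= c <= 255 for c in color_rgb):
        raise ValueError("each colorRGB channel must be an integer from 0 to 255")
    return {
        "uuid": uuid_str,
        "schemaId": schema_id,
        "dataRow": {"id": data_row_id},
        "mask": {"instanceURI": instance_uri, "colorRGB": list(color_rgb)},
    }


# Rebuilds the first sample above.
white_mask = mask_prediction(
    "45b15f9d-7884-4bb7-ac01-3567e8ed6c36",
    "ck68grts29n7w0890wv344dif",
    "cjxav5aa07r1g0dsq70t9eveg",
    "https://api.labelbox.com/masks/feature/ck7a0jw4o0nk80x9o5offz4mc",
    (255, 255, 255),
)
```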

Vector predictions

The following sample attaches bounding box, polygon, point, and polyline predictions to the same data row. Note: The geometry format in the import file matches the geometry format in the export file for each type. See a full sample import file here.

{
"uuid": "efca0c21-5206-4da6-8cb5-d6ca43649cfa",
"schemaId": "ck67grts29n7x0890atmeiahw",
"dataRow": {
"id": "cjxav4aa07r1g0dsq70t9eveg"
},
"bbox": {
"top": 153,
"left": 34,
"height": 204,
"width": 67
}
}
{
"uuid": "1b5762e9-416c-44cf-9a5f-07effb51f863",
"schemaId": "ck67grts29n7y0890q89jdcyp",
"dataRow": {
"id": "cjxav4aa07r1g0dsq70t9eveg"
},
"polygon": [
{
"x": 2,
"y": 99
},
{
"x": 93,
"y": 5
},
{
"x": 51,
"y": 106
},
{
"x": 176,
"y": 142
}
]
}
{
"uuid": "62e1d949-1c75-47f6-9ea2-e938da17d37c",
"schemaId": "ck68grts29n7z08903nvgaim5",
"dataRow": {
"id": "cjxav5aa07r1g0dsq70t9eveg"
},
"line": [
{
"x": 58,
"y": 148
},
{
"x": 135,
"y": 79
},
{
"x": 53,
"y": 191
}
]
}
{
"uuid": "532953e6-746f-4d74-945d-b4a9c2786479",
"schemaId": "ck68grts29n800890roip3u5d",
"dataRow": {
"id": "cjxav5aa07r1g0dsq70t9eveg"
},
"point": {
"x": 30,
"y": 150
}
}
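
Many models emit boxes as corner coordinates rather than top/left/height/width. The Python sketch below converts between the two. It assumes that top and left are the pixel coordinates of the box's upper-left corner, with the origin at the top-left of the image. This page does not state that explicitly, so confirm it against an export before relying on it. The function name is hypothetical.

```python
from typing import Dict


def bbox_from_corners(x_min: int, y_min: int, x_max: int, y_max: int) -> Dict[str, int]:
    """Convert (x_min, y_min, x_max, y_max) corners to a bbox geometry.

    Assumption: "top"/"left" locate the upper-left corner, in pixels.
    """
    if x_max <= x_min or y_max <= y_min:
        raise ValueError("box must have positive width and height")
    return {
        "top": y_min,
        "left": x_min,
        "height": y_max - y_min,
        "width": x_max - x_min,
    }


# Corner form of the bbox sample above.
bbox = bbox_from_corners(34, 153, 101, 357)
```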

Classifications and Object Classifications

Classifications, both global and nested within object features, are also supported. The examples below show the three types of classification supported in Model-assisted labeling, followed by an example of each nested in an object. As with vector tools, the import format should mirror the format received in exports.

Each type expects a different input: Text classifications take a String, Radio classifications take an Option Schema Id, and Checklists take a list of Option Schema Ids.

Global Classifications
# Radio Classification
{
"schemaId": "ckd11j3yk000c0z0u4xn6dc4r",
"uuid": "1278daa6-ce64-4363-be24-4fa5eadffb17",
"dataRow": {
"id": "ckd11jg6scq9c0cq43vmh6i07"
},
"answer": {
"schemaId": "ckd11j415000u0z0ubu7ee4w2"
}
}
# Checklist Classification
{
"schemaId": "ckd1295hc00640z0uapvm1xbd",
"uuid": "fb72782d-f6ed-43ba-8677-77b03197392d",
"dataRow": {
"id": "ckd1299m8cqbs0cq43mju1bvp"
},
"answers": [
{
"schemaId": "ckd1295jn00760z0u01hw4yz5"
}, {
"schemaId": "ckd1295hh006g0z0ucbxgfgec"
}
]
}
# Text Classification
{
"schemaId": "ckd1295hg006c0z0u6x41hx0d",
"uuid": "4f1fe322-7b80-49a1-81cb-5914404df378",
"dataRow": {
"id": "ckd1299m8cqck0cq42lsz5khc"
},
"answer": "Text response"
}

Object Classifications

For classifications nested in objects, as in the export, the nested classifications are stored in an array under the key “classifications”.

# Radio classification nested in a Bounding Box 
{
"uuid": "6e20a5ec-613d-4ecd-8fe5-34e47a05fea8",
"schemaId": "ckd1295hh006e0z0uh2x32i82",
"dataRow": {
"id": "ckd1299m8cqc00cq4c1by8hza"
},
"bbox": {
"top": 216,
"left": 144,
"height": 69,
"width": 67
},
"classifications": [
{
"schemaId": "ckd1295j9006k0z0udz3rh4mp", # Question Schema Id
"answer": {
"schemaId": "ckd1295jk006y0z0u1sk8h075" # Option Schema Id
}
}
]
}
# Checklist classification nested in a Point
{
"uuid": "9ee22dd1-550b-4686-8590-6cc2d3747911",
"schemaId": "ckd11j3ym000i0z0u288t78kk",
"dataRow": {
"id": "ckd11jg6scq9c0cq43vmh6i07"
},
"point": {
"x": 372,
"y": 19
},
"classifications": [
{
"schemaId": "ckd11j40y000s0z0u0hwse6ru",
"answers": [
{
"schemaId": "ckd11j45m001c0z0ub2dzah43"
}, {
"schemaId": "ckd11j45n001e0z0u2eqh4rlb"
}
]
}
]
}
# Text classification nested in a PolyLine
{
"uuid": "a84e9209-6d3a-45a6-9f83-872d12f1eb3d",
"schemaId": "ckd11j3yl000g0z0u4bmz3slk",
"dataRow": {
"id": "ckd11jg6scq9c0cq43vmh6i07"
},
"line": [
{"x": 189, "y": 161},
{"x": 80, "y": 96},
{"x": 5, "y": 214}
],
"classifications": [
{
"schemaId": "ckd11j40w000q0z0ugkerhl3u",
"answer": "a nested text response"
}
]
}
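
The three answer shapes can be sketched in Python as follows. The helper names are hypothetical. Note that the # lines in the samples above are annotations for the reader: JSON has no comment syntax, so they presumably should not appear in the import file itself.

```python
from typing import Any, Dict, List


def radio_answer(question_schema_id: str, option_schema_id: str) -> Dict[str, Any]:
    """Radio: the answer is a single option schema ID."""
    return {"schemaId": question_schema_id, "answer": {"schemaId": option_schema_id}}


def checklist_answer(question_schema_id: str, option_schema_ids: List[str]) -> Dict[str, Any]:
    """Checklist: the answers are a list of option schema IDs."""
    return {
        "schemaId": question_schema_id,
        "answers": [{"schemaId": option_id} for option_id in option_schema_ids],
    }


def text_answer(question_schema_id: str, text: str) -> Dict[str, Any]:
    """Text: the answer is a plain string."""
    return {"schemaId": question_schema_id, "answer": text}


def nest_in_object(
    object_prediction: Dict[str, Any], classifications: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a copy of an object prediction with nested classifications attached."""
    return {**object_prediction, "classifications": list(classifications)}


# Global radio classification, using the IDs from the first sample above.
global_radio = {
    **radio_answer("ckd11j3yk000c0z0u4xn6dc4r", "ckd11j415000u0z0ubu7ee4w2"),
    "uuid": "1278daa6-ce64-4363-be24-4fa5eadffb17",
    "dataRow": {"id": "ckd11jg6scq9c0cq43vmh6i07"},
}
```

A global classification carries its own uuid and dataRow, while a nested one only needs the question schemaId and the answer, because it inherits the rest from the enclosing object.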

Step 2: Import the NDJSON file

Please limit the total number of imported features to under 500k features per customer across all of your projects.

Once you have properly formatted your NDJSON import file, create a URL for your file. Then, use the createBulkImportRequest mutation to create a bulk import request.

projectId

The ID of the project to import the predictions into.

name

Each bulk import request should have a unique name per project.

fileUrl

The URL containing your predictions must be downloadable from outside your organization. Before uploading, check that the URL is public by running wget -O- <url> | cat. The <url> is public if the command prints your NDJSON.

You can check that the host of the public URL allows standard browsers to download the file in either of these ways:

  1. Navigate to your URL in any browser. It should return the expected NDJSON.
  2. Run wget -O- --user-agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36' <url> | cat. It should return the expected NDJSON.

mutation {
  createBulkImportRequest(data: {
    projectId: "ck5r3rfav005h0887x584pwak",
    name: "import_job_1",
    fileUrl: "https://foobar.com/test2file"
  }) {
    id
  }
}
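
If you script the import, the mutation text can be assembled from your own values. The Python sketch below only builds the mutation string and the JSON request body that GraphQL servers conventionally accept over HTTP POST. It does not send anything, and the function names are hypothetical. Sending the request with your API key is left to whichever HTTP client and authentication setup you already use.

```python
import json


def build_bulk_import_mutation(project_id: str, name: str, file_url: str) -> str:
    """Build the createBulkImportRequest mutation shown above.

    json.dumps produces double-quoted, escaped string literals, which are
    also valid GraphQL string values for typical IDs, names, and URLs.
    """
    return (
        "mutation { createBulkImportRequest(data: {"
        f"projectId: {json.dumps(project_id)}, "
        f"name: {json.dumps(name)}, "
        f"fileUrl: {json.dumps(file_url)}"
        "}) { id } }"
    )


def graphql_request_body(query: str) -> str:
    """Wrap a GraphQL document in the conventional {"query": ...} JSON body."""
    return json.dumps({"query": query})


mutation = build_bulk_import_mutation(
    "ck5r3rfav005h0887x584pwak", "import_job_1", "https://foobar.com/test2file"
)
body = graphql_request_body(mutation)
```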

To check the status of the import job, run this query.

query {
  bulkImportRequest(where: {
    projectId: "ck5r3rfav005h0887x584pwak",
    name: "import_job_1"
  }) {
    id
    name
    state
    statusFileUrl
    errorFileUrl
  }
}

The state field returned by the above query refers to the whole import job and will be one of the following:

RUNNING
FAILED
FINISHED

If state is FINISHED, you’ll get a statusFileUrl to an NDJSON (expires after 24 hours) that contains a SUCCESS or FAILED status per prediction.

If state is FINISHED, you’ll also get an errorFileUrl to an NDJSON which has the same format as the statusFileUrl except it contains ONLY error messages for each prediction that did not import successfully.

If the state is FINISHED and the errorFileUrl contains no errors, but predictions still do not appear in the Editor, make sure you have predictions turned on for your project (see step 5 in the "Before you start" section).

If state is FAILED, you’ll only get an errorFileUrl to an NDJSON containing the error message. The statusFileUrl will be null.
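
Because an import job stays in RUNNING until it reaches FINISHED or FAILED, a script will usually poll the status query. Below is a minimal Python sketch of such a loop. The fetch_state callable is a placeholder for your own code that runs the bulkImportRequest query and returns the state string, and the interval and attempt limit are arbitrary example values.

```python
import time
from typing import Callable

TERMINAL_STATES = {"FAILED", "FINISHED"}


def wait_for_import(
    fetch_state: Callable[[], str],
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> str:
    """Poll until the import job leaves RUNNING, then return the final state."""
    for attempt in range(max_attempts):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"import still not finished after {max_attempts} checks")
```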

These are the fields that will be included in the NDJSON files:

uuid

Specifies the prediction for the status row.

dataRow

JSON object containing the Labelbox data row ID for the failed prediction.

status

Indicates SUCCESS or FAILED.

errors

An array of error messages. Only present if status is FAILED.

For predictions not uploaded successfully the first time, fix the errors indicated in the errorFileUrl and start another bulk import with the amended input file. Labelbox will identify the failed predictions and retry the upload of the failed data rows.
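
One way to prepare the amended file is to read the status NDJSON, collect the uuids marked FAILED, and keep only those predictions from your original input. The Python sketch below assumes you have already downloaded the status file contents as text. The function names are hypothetical, and the code relies only on the uuid, status, and errors fields described above.

```python
import json
from typing import Any, Dict, List


def failed_predictions(status_ndjson: str) -> Dict[str, List[Any]]:
    """Map each failed prediction's uuid to its list of error messages."""
    failures: Dict[str, List[Any]] = {}
    for line in status_ndjson.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        if row.get("status") == "FAILED":
            failures[row["uuid"]] = row.get("errors", [])
    return failures


def predictions_to_retry(
    predictions: List[Dict[str, Any]], failures: Dict[str, List[Any]]
) -> List[Dict[str, Any]]:
    """Keep only the original predictions whose uuid failed to import."""
    return [p for p in predictions if p["uuid"] in failures]
```

After fixing the reported errors in the selected predictions, write them back out as NDJSON and start a new bulk import request with a new name.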

Load predictions in the Editor

When an asset is loaded in the labeling interface, any predictions for that asset will show up as editable Features for the user.

Segmentation in label interface

Predictions will be loaded on an asset only when the following conditions are met:

  • Model-assisted labeling toggle is on.
  • There are predictions created for the data rows.
  • There are no non-prediction annotations that have already been created by the user on the data rows.

Duration (Timer): The timer functionality is still a work in progress. Currently, if the labeler skips or submits a label without making any changes first, the timer will not start and the duration time recorded will be 0 seconds. This logic may be revised in a future update.

Update/delete predictions

When a labeler chooses to “skip” an asset with predictions, the predictions are deleted from the asset. The next time a user loads the asset in the labeling interface, the predictions are loaded again.

Overwriting predictions on the data row will not automatically overwrite any Features already generated from the original predictions. To update Features generated from predictions, you will need to:

  1. Delete the labels from the data row in the UI.
  2. Create a new bulkImportRequest containing the prediction with the same uuid.
  3. Load the asset again to see the updated predictions.

If you label an asset, then delete the label but keep it as a template, that label template will take precedence over any model-assisted labels (predictions) you import for that asset.

Export format

You can find the export formats for each prediction/Feature type here.
