Image Segmentation 101

For commercial AI teams, an enormous amount of time and energy is spent on the development and curation of datasets. This is because, unlike in academic machine learning, open source datasets are unlikely to accurately represent the real-world conditions a commercial application will face. Instead, what’s required is a new labeled dataset tailored to the specific situation.

To create this new labeled dataset, data scientists and ML engineers can choose from a variety of annotation types. In computer vision, image segmentation is a frequent choice when the goal is to differentiate between objects with the highest degree of accuracy. Without the right tooling, however, image segmentation can be prohibitive for many projects, because labeling the amount of training data needed for a performant model becomes very costly.

What is image segmentation?

With image segmentation, each annotated pixel in an image belongs to a single class. It is often used to label images for applications that require high accuracy, and it is manually intensive because the labels must be accurate at the pixel level. A single image can take 30 minutes or more to complete. The output is a mask that outlines the shape of the object in the image. Although segmentation annotations come in several types (such as semantic segmentation, instance segmentation, and panoptic segmentation), the practice of image segmentation generally describes the need to annotate every pixel of the image with a class.
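As a rough illustration of what such a mask looks like as data, the sketch below stores one class ID per pixel in a small grid. The class IDs, the tiny 4x6 "image", and the helper names are all made up for this example, and plain Python lists are used only to keep it dependency-free; real pipelines typically hold masks in array or image formats.

```python
# Hypothetical class IDs for illustration; a real project defines its own ontology.
SKY, GROUND, VEHICLE = 0, 1, 2

# A tiny 4x6 "image" where every pixel carries exactly one class ID.
mask = [
    [SKY,    SKY,     SKY,     SKY,     SKY,     SKY],
    [SKY,    SKY,     VEHICLE, VEHICLE, SKY,     SKY],
    [GROUND, VEHICLE, VEHICLE, VEHICLE, VEHICLE, GROUND],
    [GROUND, GROUND,  GROUND,  GROUND,  GROUND,  GROUND],
]


def binary_mask(mask, class_id):
    """Return a 0/1 grid marking the pixels that belong to class_id."""
    return [[1 if px == class_id else 0 for px in row] for row in mask]


def pixel_counts(mask):
    """Count how many pixels were assigned to each class ID."""
    counts = {}
    for row in mask:
        for px in row:
            counts[px] = counts.get(px, 0) + 1
    return counts


vehicle_mask = binary_mask(mask, VEHICLE)  # the shape of the vehicle only
counts = pixel_counts(mask)                # per-class pixel totals
```

Because each pixel holds a single ID, the per-class counts always add up to the total number of pixels in the image.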


Image segmentation being used to identify a specific vehicle type.


Image segmentation being used to annotate every pixel and distinguish between items such as sky, ground, and vehicle type.

What are the benefits of using image segmentation for my ML model?

The primary benefit of image segmentation can be best understood by comparing the three common annotation types within computer vision: 1) classification, 2) object detection, and 3) image segmentation.

  • With image classification, the goal is to simply identify which objects and other properties exist in an image.

  • With object detection, you go one step further to find the position (bounding boxes) of individual objects.

  • With image segmentation, the goal is to recognize and understand what's in the image at the pixel level. Every pixel in an image belongs to a single class, as opposed to object detection where the bounding boxes of objects can overlap.
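One way to see how the three annotation types relate is that a segmentation mask already contains enough information to derive the other two. The sketch below is illustrative only: the class IDs, the toy mask, and the function names are assumptions made for this example. It recovers a classification-style answer and detection-style bounding boxes from a mask, and shows that the boxes can overlap even though no pixel has more than one class.

```python
BACKGROUND, CAR, TRUCK = 0, 1, 2  # hypothetical class IDs

mask = [
    [0, 0, 0, 0, 2],
    [1, 0, 0, 2, 2],
    [1, 1, 2, 2, 2],
    [1, 1, 1, 0, 0],
]


def classes_present(mask):
    """Classification-style answer: which class IDs appear in the image."""
    return {px for row in mask for px in row}


def bounding_box(mask, class_id):
    """Detection-style answer: tightest inclusive (top, left, bottom, right)
    box around the pixels of class_id, or None if the class is absent."""
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, px in enumerate(row)
        if px == class_id
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


def boxes_overlap(a, b):
    """True when two inclusive boxes share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


car_box = bounding_box(mask, CAR)
truck_box = bounding_box(mask, TRUCK)
overlap = boxes_overlap(car_box, truck_box)
```

In this toy mask the car and truck boxes both cover the pixel at row 2, column 2, yet the mask assigns that pixel to the truck alone. Going the other way, from boxes back to exact object shapes, is not possible without extra information.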


By comparison, image segmentation is particularly useful for use cases where you need to know definitively whether an image contains the object of interest and also which parts of the image are not an object of interest. This is in contrast to other annotation types such as classification or bounding boxes, which may be faster to produce but are less accurate. In short, models trained on image segmentation annotations tend to be the most widely applicable and versatile, because the annotations describe the contents of an image in the most detail.

How does a training data platform support complex image segmentation?

Training data platforms are commonly equipped with at least one tool that allows you to outline complex shapes for image segmentation. At Labelbox, our pen tool allows you to draw freehand as well as straight lines. Having fast and ergonomic drawing tools helps reduce the time it takes to produce pixel-perfect labels consistently.


Labelbox pen tool illustrated

In addition, training data platforms typically include features that specifically help optimize your image segmentation project, including:

  • Customization based on ontology:

    The ability to configure the label editor to your exact data structure (ontology) requirements, with the ability to further classify instances that you have segmented. Ontology management includes classifications, custom attributes, hierarchical relationships and more.

  • An emphasis on performance for a wide array of devices:

    A focus on making complex labels really fast, even on lower-spec PCs and laptops. Performance becomes critical for professional labelers who are working in an annotation editor all day.

  • Support for shared borders:

    When creating image segmentation masks, it’s important to be able to share borders between objects. With the Labelbox editor, it’s simple. Whenever you draw a new object, if you overlap the border of an already existing object, the new border you’re drawing will be shared.

  • Brightness and contrast features:

    Sometimes objects in dark or night-time images can be hard to clearly distinguish from each other. Labelbox includes brightness and contrast controls to help illuminate images and bring out edges between objects for sharper delineation.
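To make the shared-border and brightness/contrast ideas above more concrete, here is a minimal sketch of how such behavior could work in principle. It is not the Labelbox implementation: the function names, the "first object keeps its pixels" rule, and the linear brightness/contrast formula centered on 128 are all assumptions chosen for illustration.

```python
UNLABELED = 0  # hypothetical ID for pixels that have no object yet


def paint_with_shared_border(mask, new_pixels, class_id):
    """Assign class_id to new_pixels, skipping pixels that already belong
    to another object, so the two objects meet along a shared border.
    Returns a new mask and leaves the input unchanged."""
    out = [row[:] for row in mask]
    for r, c in new_pixels:
        if out[r][c] == UNLABELED:
            out[r][c] = class_id
    return out


def adjust(pixel, brightness=0, contrast=1.0):
    """Linear brightness/contrast for one 8-bit grayscale value,
    clamped to the 0..255 range (an assumed, commonly used formula)."""
    value = contrast * (pixel - 128) + 128 + brightness
    return max(0, min(255, round(value)))


# An existing object (class 1) occupies the two left columns.
existing = [[1, 1, 0, 0] for _ in range(3)]

# A loosely drawn stroke for class 2 covers columns 1..3, overlapping column 1.
stroke = [(r, c) for r in range(3) for c in range(1, 4)]

merged = paint_with_shared_border(existing, stroke, class_id=2)

# Lift a dark pixel for easier viewing; the stored labels are unaffected.
brighter = adjust(40, brightness=60, contrast=1.5)
```

In this sketch the overlapping column stays with the first object, so the labeler can draw loosely across an existing edge without leaving gaps or doubly labeled pixels. The brightness/contrast adjustment applies only to what the labeler sees, not to the mask.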


Image segmentation objects with nested classifications

What are some example real-world use cases?

Image segmentation is popular for real-world ML models when the computer vision application being built requires high accuracy. Customers employing image segmentation can be found in use cases such as autonomous vehicles, medical imagery, retail applications, and more.


An example of a retail customer using image segmentation masks to better manage inventory