With AI-powered claims automation, you can now seamlessly integrate the latest advances in foundation models into your damage detection and disaster assessment models. As demand grows for real-time intelligence on residential and commercial properties, it's essential for teams to maximize compliance and minimize operational costs.
However, teams can face multiple challenges when implementing AI for damage detection, from time-consuming manual reviews to high labeling costs.
Labelbox is a data-centric AI platform that empowers businesses to transform their claims automation through advanced computer vision techniques. Instead of relying on time-consuming manual reviews, companies can leverage Labelbox’s AI-assisted data enrichment and flexible training frameworks to quickly build task-specific models that uncover actionable insights from damage assessment.
In this guide, we’ll walk through how your team can leverage Labelbox’s platform to build a task-specific model to improve building damage detection via aerial imagery. Specifically, this guide will walk through how you can explore and better understand unstructured data to make more data-driven business decisions around damage detection initiatives.
Part 1: Explore and enhance your data
Part 2: Create a model run and evaluate model performance
You can follow along with both parts of the tutorial via the Google Colab notebook. If you are following along, please make a copy of the notebook.
For this use case, we'll be working with an open dataset from the Hurricane Maria aerial assessment – with the goal of quickly curating data and finding building damage across high volumes of images.
The first step will be to gather data:
Please download the dataset and store it in an appropriate location in your environment. You'll need to update the read/write file paths throughout the notebook to reflect the relevant locations, as well as all references to API keys and to Labelbox ontology, project, and model run IDs.
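The upload step can be sketched with the Labelbox Python SDK. Below is a minimal example that builds the data-row payloads locally; the bucket URL and global keys are hypothetical placeholders, and the SDK calls (shown in comments, since they require an API key) follow the standard `create_dataset` / `create_data_rows` flow:

```python
# Build Labelbox data-row payloads for aerial images.
# The bucket URL below is a placeholder -- substitute your own storage location.

def build_data_rows(image_urls):
    """Turn a list of image URLs into Labelbox data-row dicts."""
    return [
        {
            "row_data": url,                       # public or signed URL to the image
            "global_key": url.rsplit("/", 1)[-1],  # unique key; here, the filename
        }
        for url in image_urls
    ]

image_urls = [
    "https://example-bucket.s3.amazonaws.com/maria/tile_0001.png",
    "https://example-bucket.s3.amazonaws.com/maria/tile_0002.png",
]
data_rows = build_data_rows(image_urls)

# Upload with the Labelbox SDK (requires an API key):
# import labelbox as lb
# client = lb.Client(api_key="<YOUR_API_KEY>")
# dataset = client.create_dataset(name="hurricane-maria-aerial")
# task = dataset.create_data_rows(data_rows)
# task.wait_till_done()
```

Using the filename as a global key is one simple convention; any stable, unique identifier per image works.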
Once you’ve uploaded your dataset, you should see your image data rendered in Labelbox Catalog. You can browse through the dataset and visualize your data in a no-code interface to quickly pinpoint and curate data for model training.
In this demo, we'll be using Catalog to find relevant images of buildings for our dataset with the goal of annotating footprints using foundation models.
Leverage custom and out-of-the-box smart filters and embeddings to quickly explore your imagery, surface similar data, and optimize data curation for ML.
The next step is to use foundation models to detect as many building footprints as possible.
Model Foundry enables teams to choose from a library of models and in this case, we'll be using an object detection model (Grounding DINO) to generate previews and attach them as pre-labels.
With Model Foundry, you can automate data workflows, including data labeling with world-class foundation models. Leverage a variety of open source or third-party models to accelerate pre-labeling and cut labeling costs by up to 90%.
By pre-labeling building footprints, we have significantly improved our labeling efficiency. The next step is to set up our annotation project:
1) Set up annotation project and ontology.
2) Ensure the ontology includes a sub-classification for damage severity (e.g., Low, Medium, High).
3) Include bounding box model predictions from Foundry pre-labels.
4) Zoom in to inspect each building and select the appropriate damage severity, drawing additional bounding boxes as needed.
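The ontology from steps 1–2 can be sketched in Labelbox's normalized-ontology JSON shape. The tool and classification names below ("building", "damage_severity") are illustrative choices, and the SDK calls that create and attach the ontology are shown in comments since they require an API key:

```python
# Ontology sketch: a bounding-box tool for building footprints, with a
# radio sub-classification for damage severity.

damage_options = [{"label": lvl, "value": lvl.lower()} for lvl in ("Low", "Medium", "High")]

ontology = {
    "tools": [
        {
            "tool": "rectangle",           # bounding-box tool
            "name": "building",
            "classifications": [
                {
                    "type": "radio",       # exactly one severity per box
                    "name": "damage_severity",
                    "options": damage_options,
                }
            ],
        }
    ],
    "classifications": [],                 # no image-level questions needed here
}

# Create and attach with the SDK (requires an API key):
# import labelbox as lb
# client = lb.Client(api_key="<YOUR_API_KEY>")
# lb_ontology = client.create_ontology("building-damage", ontology,
#                                      media_type=lb.MediaType.Image)
# project = client.create_project(name="building-damage-detection",
#                                 media_type=lb.MediaType.Image)
# project.setup_editor(lb_ontology)
```

A radio sub-classification (rather than a checklist) enforces a single severity rating per building, which keeps downstream training labels unambiguous.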
The next series of steps is optional and helps you bypass the manual component by uploading ground truth directly, using the annotation project ID.
1) Reference a cloud bucket that contains your annotations and corresponding damage classifications.
2) Send annotations of the matching data type to the annotation project so that you have ground-truth data.
3) By uploading annotation data directly via the Python SDK, you can access ground-truth data to send directly to your models for fine-tuning and refinement.
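A ground-truth upload along these lines can be sketched as follows. The payload dict mirrors Labelbox's NDJSON import shape for bounding boxes with a radio sub-classification; the global key, pixel coordinates, and severity value are illustrative, and the `LabelImport` call (which requires an API key and project ID) is shown in comments:

```python
import uuid

# Build one ground-truth bounding box in Labelbox's NDJSON import shape.

def build_bbox_label(global_key, top, left, height, width, severity):
    return {
        "uuid": str(uuid.uuid4()),
        "dataRow": {"globalKey": global_key},
        "name": "building",                 # must match the ontology tool name
        "bbox": {"top": top, "left": left, "height": height, "width": width},
        "classifications": [
            {"name": "damage_severity", "answer": {"name": severity}},
        ],
    }

labels = [build_bbox_label("tile_0001.png", 120, 340, 85, 110, "high")]

# Import as ground truth via the SDK (requires an API key and project ID):
# import labelbox as lb
# client = lb.Client(api_key="<YOUR_API_KEY>")
# job = lb.LabelImport.create_from_objects(
#     client=client, project_id="<PROJECT_ID>",
#     name="maria-ground-truth", labels=labels)
# job.wait_until_done()
```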
Follow along with the tutorial and walkthrough in this Colab Notebook. If you are following along, please make a copy of the notebook.
Create a model run
Once you have your labeled data in your project in Annotate, you’re ready to move on to creating a model run in Labelbox Model.
Model training occurs outside of Labelbox. Labelbox Model works with any model training and inference framework, major cloud providers (AWS, Azure, Google Cloud), and any data lake (Databricks, Snowflake).
We’ll be using this Colab notebook to train a model on the training dataset and bring back inferences from the trained model for evaluation and diagnosis.
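Bringing inferences back for evaluation can be sketched as follows. Each prediction mirrors Labelbox's NDJSON import shape for bounding boxes, with a confidence score; the global key, coordinates, and score are illustrative, and the model-run SDK calls (which require an API key and real IDs) are shown in comments:

```python
import uuid

# Build a model prediction (bounding box + confidence) in Labelbox's NDJSON
# import shape, so trained-model inferences can be attached to a model run.

def build_bbox_prediction(global_key, top, left, height, width, score):
    return {
        "uuid": str(uuid.uuid4()),
        "dataRow": {"globalKey": global_key},
        "name": "building",        # must match the ontology tool name
        "confidence": score,       # model score in [0, 1]
        "bbox": {"top": top, "left": left, "height": height, "width": width},
    }

predictions = [build_bbox_prediction("tile_0001.png", 118, 338, 90, 108, 0.87)]

# Attach to a model run with the SDK (requires an API key):
# import labelbox as lb
# client = lb.Client(api_key="<YOUR_API_KEY>")
# model = client.create_model(name="building-damage-v1",
#                             ontology_id="<ONTOLOGY_ID>")
# model_run = model.create_model_run("run-1")
# model_run.upsert_labels(project_id="<PROJECT_ID>")   # pull in ground truth
# model_run.add_predictions("maria-predictions", predictions)
```

With both ground truth and predictions in the model run, Labelbox can compute the disagreement metrics discussed below.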
You’ll be able to view the model in the ‘Experiments’ tab in Labelbox Model – ground truth labels appear in green and model predictions in red.
A disagreement between model predictions and ground truth labels can be due to a model error (poor model prediction) or a labeling mistake (ground truth is wrong).
After running error analysis, you can make more informed decisions on how to iterate and improve your model’s performance with corrective action or targeted data selection.
Curate high-impact data to drastically improve model performance
Once you’ve identified an example of a corner-case where the model might be struggling, you can easily leverage Catalog to surface similar unlabeled examples to improve model performance.
Compare model runs across iterations
Improve model development by up to 90% by leveraging Labelbox Model to compare model runs across iterations and quantify how model performance has improved over time.
With new high-impact data labeled, you can retrain the model on this improved dataset using the same steps in the Colab notebook. You can then track improvements across runs to see how each iteration has affected model performance.
By analyzing high volumes of images, videos, and text, Labelbox provides valuable human-in-the-loop insights for damage detection processes and helps assess underwriting risks, enabling insurance companies to make data-driven decisions that improve operational efficiency, compliance, and revenue.