Labelbox, March 12, 2024

Labelbox video search powered by Google Gemini Pro Vision

Labelbox is excited to introduce enhanced video search capabilities powered by Gemini Pro Vision. Google’s state-of-the-art multimodal model now powers natural language search across videos within Labelbox Catalog. The same embeddings are also used to group and view similar videos, which helps accelerate data curation and surface outliers in your datasets.

Natural language search on private video collections

Searching videos with natural language is becoming commonplace. On an iPhone or Android device, you can type a few words to find matching images and videos. For data teams working with large datasets of private images and video, providing a similar experience is not that simple: it requires a powerful AI model to extract the critical information from each video file. Thankfully, we have done the heavy lifting for you.

Check out this quick demo of how to use video search in Labelbox:

Find relevant videos using natural language

Adding video to our natural language search capabilities, which also cover images, documents, audio and more, is incredibly powerful for data curation. Within seconds, data teams can find specific events in their video datasets and build a simple classifier to automatically surface similar events in the incoming stream. This lets ML teams quickly retrain task-specific models to improve performance on those events.

Use natural language to find specific scenes from high volumes of videos
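For readers curious about the general idea behind embedding-based search, here is a minimal Python sketch. It is not Labelbox's implementation and does not call any Labelbox or Google API. It assumes that a text query and each video have already been mapped to vectors in the same embedding space, and it ranks videos by cosine similarity to the query. The file names, the three-dimensional toy vectors, and the `rank_videos` helper are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_videos(query_embedding, video_embeddings, top_k=3):
    """Return the top_k (video_id, score) pairs, most similar first."""
    scored = [
        (video_id, cosine_similarity(query_embedding, embedding))
        for video_id, embedding in video_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): real embeddings have far more dimensions.
video_embeddings = {
    "forklift_near_miss.mp4": [0.9, 0.1, 0.0],
    "empty_warehouse.mp4": [0.1, 0.9, 0.1],
    "pedestrian_crossing.mp4": [0.2, 0.1, 0.9],
}

# Stand-in for the embedding of a text query such as "forklift almost hits a worker".
query_embedding = [0.8, 0.2, 0.1]

results = rank_videos(query_embedding, video_embeddings, top_k=2)
for video_id, score in results:
    print(f"{video_id}: {score:.3f}")
```

In this toy setup the forklift clip ranks first because its vector points in nearly the same direction as the query vector. A production system would typically replace the linear scan with an approximate nearest-neighbor index.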

Identify and label outliers in your data using cluster view

Another powerful use of the embeddings generated by Google Gemini Pro Vision is similarity search. Using the new cluster view, you can quickly identify groups of similar videos. This is a quick way to spot outliers in your dataset and to find similar data to label, so you can retrain a model where it underperforms.

Labelbox's video similarity search powered by Google Gemini Pro Vision
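To make the outlier idea concrete, the following Python sketch scores each video by how far it sits from its nearest neighbor in embedding space. This is one simple heuristic chosen for illustration, not a description of how cluster view works internally. The clip names, the two-dimensional toy vectors, and the `isolation_scores` helper are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def isolation_scores(embeddings):
    """Map each id to 1 minus its highest similarity to any other item.

    Scores near 0 mean the item has a close neighbor; scores near 1 mean
    nothing else in the collection resembles it.
    """
    scores = {}
    for video_id, embedding in embeddings.items():
        best = max(
            cosine_similarity(embedding, other)
            for other_id, other in embeddings.items()
            if other_id != video_id
        )
        scores[video_id] = 1.0 - best
    return scores


# Toy data (hypothetical): three similar clips and one that differs.
embeddings = {
    "clip_a.mp4": [1.0, 0.0],
    "clip_b.mp4": [0.9, 0.1],
    "clip_c.mp4": [0.95, 0.05],
    "clip_d.mp4": [0.0, 1.0],
}

scores = isolation_scores(embeddings)
most_isolated = max(scores, key=scores.get)
print(most_isolated, round(scores[most_isolated], 3))
```

Here `clip_d.mp4` gets the highest score because none of the other vectors point in a similar direction, which makes it a candidate for review or labeling.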


Labelbox is always looking to accelerate and optimize the data labeling process for our customers. The integration of Google Gemini Pro Vision with Labelbox for video search powers our natural language search and similarity search capabilities.

With the beta release of video search, we have generated embeddings using Google Gemini Pro Vision for all videos in Catalog for current Labelbox users on free and paid plans. We are also analyzing all newly uploaded videos, so try using natural language to search your videos today.