QuerYD: A video dataset with textual and audio narrations

Published on: 2021-02-17
Contributors: A. Oncescu, J. F. Henriques, Y. Liu, A. Zisserman, S. Albanie
Datarows: 2,766
Foundation models

QuerYD is a dataset for retrieval and event localisation in video. Videos are sourced from YouTube and descriptions are provided through contributions on the YouDescribe project page.

This is a subset of the original data and does not contain the original annotations. It was used to demonstrate the power of foundation models for making videos queryable. Check out the original blog post.