Twelve Labs Develops Enhanced Search Capabilities in Video AI

April 29, 2025
In the fast-evolving field of artificial intelligence, a new frontier is emerging: video understanding. Jae Lee, co-founder of Twelve Labs, believes that AI models able to interpret video as adeptly as they interpret text could unlock groundbreaking applications.
Twelve Labs is pioneering this space by offering functionalities such as precise searches within video content, clip summarization, and responses to context-based queries like "When did the person in the red shirt enter the restaurant?" These capabilities have garnered investments from major tech firms such as Nvidia, Samsung, and Intel.
To Lee, a data scientist by training, traditional keyword search never made sense for video: it pulls up only titles, tags, and descriptions, and never reaches the actual content of the clips. Frustrated by the inadequacies of existing solutions, Lee and his team developed a model that maps text to the content of the video itself, including actions, objects, and sounds, enabling far more nuanced search than basic keyword tagging. Twelve Labs describes this as a “video-first” approach: it prioritizes video data over other formats and customizes solutions that integrate seamlessly with users' existing data, setting it apart from other offerings in the market.
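To make the idea concrete, here is a minimal sketch of embedding-based video search, not Twelve Labs' actual API: the text query and each video segment are mapped into a shared vector space, and search becomes a nearest-neighbor lookup over segment embeddings. The `embed` function, the `segments` data, and the `search` helper below are all hypothetical; `embed` is a toy bag-of-words stand-in so the example runs end to end, whereas a real system would use a learned multimodal encoder over the video's frames, motion, and audio.

```python
import re

import numpy as np

# Toy stand-in for a multimodal encoder. A real system would embed raw video
# (frames, motion, audio) and text queries into one learned vector space; here
# a bag-of-words vector over segment descriptions lets the sketch run end to end.
VOCAB: dict[str, int] = {}

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Map text to a unit vector; the shared VOCAB assigns each token a slot."""
    vec = np.zeros(dim)
    for tok in re.findall(r"[a-z]+", text.lower()):
        slot = VOCAB.setdefault(tok, len(VOCAB) % dim)
        vec[slot] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Hypothetical index: one embedding per video segment (times in seconds).
segments = [
    {"start": 0.0, "end": 12.0, "desc": "person in a red shirt enters the restaurant"},
    {"start": 12.0, "end": 30.0, "desc": "waiter pours coffee at a corner table"},
    {"start": 30.0, "end": 45.0, "desc": "dog runs across a quiet street"},
]
index = np.stack([embed(s["desc"]) for s in segments])

def search(query: str, top_k: int = 1) -> list[tuple[dict, float]]:
    """Rank segments by cosine similarity between query and segment embeddings."""
    q = embed(query)
    scores = index @ q  # unit vectors, so the dot product is cosine similarity
    order = np.argsort(scores)[::-1][:top_k]
    return [(segments[i], float(scores[i])) for i in order]

print(search("when did the person in the red shirt enter the restaurant"))
```

Running the example query returns the first segment with the highest cosine score; in a production system the index would hold embeddings for millions of segments, typically served from an approximate nearest-neighbor store.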
With technological innovation comes ethical responsibility. A significant concern, highlighted by a 2021 study, is the potential for AI models to learn and perpetuate biases. Twelve Labs is addressing these issues proactively, developing ethical benchmarks and datasets to guard against bias in its models.
Author: Mikaela Wang, 2024/2025 Articling Student-At-Law