(Photo file, courtesy of Twelve Labs)
Global chip giant Nvidia Corp. has co-led a $50 million Series A funding round for Twelve Labs, a South Korean artificial intelligence company specializing in video analysis, the tech startup said on Wednesday.
Nvidia’s venture capital arm NVentures and New Enterprise Associates, a new investor in Twelve Labs, jointly led the Series A round. Existing global investors, including Index Ventures, Radical Ventures, WndrCo, which is led by DreamWorks co-founder Jeffrey Katzenberg, and Seoul-based Korea Investment Partners, also joined the round.
The existing investors took part in a pre-Series A round of about $10 million last October, which marked Nvidia’s first investment in a Korean generative AI startup.
Twelve Labs has raised about $77 million, including the latest Series A funding, since its founding in 2021. The company said it will use the funds for research and development of its AI-based video understanding and search technologies and to hire more than 50 employees by the end of the year.
“The world-class team at Twelve Labs is leveraging Nvidia accelerated computing together with their incredible capacity for video understanding, leading to new ways for enterprise customers to take advantage of generative AI,” said Mohamed Siddeek, corporate vice president and head of NVentures.
“The large language model (LLM) market is dominated by a handful of Big Tech corporations such as OpenAI, but we believe that Twelve Labs can become a global leader in the multimodal AI industry for video understanding,” said John MJ Kim, principal at Korea Investment Partners.
Multimodal AI refers to machine learning models that combine multiple data types, including images, text, speech and numbers, with intelligent processing algorithms to produce more accurate and sophisticated outputs.
Using its multimodal models, the startup analyzes the images and sound in a video and maps them to human language. The models can also generate text based on video content, edit short-form videos and categorize videos by given criteria.
The technology boosts efficiency in creating YouTube Shorts, setting advertising strategies for videos and even finding missing persons by analyzing closed-circuit television (CCTV) footage.
Twelve Labs has integrated Nvidia hardware and software into its platform, including the Nvidia H100 Tensor Core graphics processing unit (GPU) and the Nvidia L40S GPU, to improve its video understanding technology.
In March, Twelve Labs released its multimodal model Marengo-2.6, which enables search tasks across video, text, image and audio, and launched a beta version of Pegasus-1, a model designed to understand and articulate video content.
By Eun-Yi Ko
koko@hankyung.com
Jihyun Kim edited this article.