Origin Lab raises $8M to help video game companies sell data to world-model builders

TL;DR

Origin Lab has raised $8 million in seed funding to develop a marketplace where AI research labs can purchase high-quality video game data. The platform aims to bridge the gap between game assets and world-model training, addressing a key data scarcity issue for physical AI systems.

Origin Lab, an AI startup focused on providing training data for physical world models, has raised $8 million in seed funding led by Lightspeed Ventures. The company aims to connect video game developers with AI research labs seeking high-quality, licensed data to train models that understand physical environments. This funding signals growing industry interest in leveraging gaming assets for advanced AI applications.

Other participants in the round include SV Angel, Eniac, Seven Stars, FPV, and angel investors such as Twitch co-founder Kevin Lin and Cruise founder Kyle Vogt. Origin Lab plans to build a marketplace where AI labs such as Yann LeCun’s AMI Labs or Fei-Fei Li’s World Labs can purchase licensed video game data, including rendered scenes, walkthrough footage, and other digital assets.

According to co-CEO Anne-Margot Rodde, the platform will serve as a bridge between video game companies and AI research, converting game assets into training data suitable for physical world modeling. The startup aims to resolve the licensing and quality problems that have long hindered AI labs’ access to gaming data, which is considered valuable for learning spatial and motion dynamics.

Why It Matters

The deal points to a new revenue stream for video game companies and addresses a critical bottleneck in AI research: access to high-quality, diverse training data. As physical AI systems become more prevalent, the need for realistic, varied data sources grows. The platform could accelerate the development of robots, autonomous vehicles, and other physical AI applications by giving them richer training datasets.

The funding also underscores the growing importance of data vendors in the AI ecosystem, as large labs seek scalable, high-quality data sources for their models. This trend could reshape how AI training data is sourced and sold in the coming years.

Background

Historically, AI labs have relied on publicly available datasets or data generated internally, but the growing complexity of physical world models demands more diverse and realistic datasets. Video game environments offer controlled, detailed, and scalable data that can simulate physical interactions with high fidelity. However, licensing and data quality issues have limited their use in research.

In late 2024, OpenAI faced scrutiny when its Sora video-generation model appeared to have been trained on footage from popular video games and streamers, raising questions about data sourcing. Interest from major companies like Amazon in using Twitch streams further underscores the demand for licensed, high-quality gaming data. Origin Lab aims to formalize this data pipeline so that gaming data reaches AI research through legal licenses.

“The AI systems that are being built now need to understand how the physical world works and how things move. That data essentially lives in video games.”

— Anne-Margot Rodde, co-CEO and co-founder of Origin Lab

“We’ve seen how sharp the revenue scaling can be for data vendors serving major labs. These are very well-capitalized businesses, and the bottleneck for all of them is data.”

— Faraz Fatemi, partner at Lightspeed Ventures

What Remains Unclear

It is not yet known how quickly AI labs and video game companies will adopt the platform, or how licensing negotiations will be managed at scale. Origin Lab has not detailed the specific types of data it will offer or how it will ensure data quality and legality. The effect on existing data licensing models in the gaming industry also remains to be seen.

What’s Next

Following the funding announcement, Origin Lab plans to develop its marketplace platform, establish licensing agreements with game developers, and onboard initial AI research partners. The company aims to launch a pilot version within the next six months and expand its data offerings based on early feedback and partnerships.

Key Questions

How will Origin Lab ensure the legality of the data sold?

Origin Lab is working to establish licensing agreements with game companies to ensure all data sold complies with legal standards. Details on specific licensing models are still being finalized.

What types of video game data will be available on the platform?

The platform aims to offer various assets, including rendered scenes, walkthrough videos, and other digital assets that can be used as training data for physical world models.

Why is gaming data valuable for AI research?

Gaming environments provide controlled, detailed, and scalable data that simulate physical interactions, making them highly useful for training models that need to understand motion, spatial relationships, and object dynamics.
