Elon Musk Calls Synthetic Data the Next Stage of AI Development

Elon Musk has agreed with other experts that little real-world data remains for training AI models, TechCrunch reports.

“We have essentially exhausted the entirety of human knowledge for AI training. This happened last year,” he said in a discussion with Stagwell Chairman Mark Penn.

In December, Ilya Sutskever, co-founder of OpenAI and of the AI startup Safe Superintelligence, said the industry had reached the limit of available training data. According to him, AI agents, synthetic data, and accelerated computation represent the next phase in AI’s evolution, likely leading to the emergence of superintelligence.

Musk believes synthetic data, i.e. AI-generated information, is the way forward.

“The only way to supplement [real-world data] is through synthetic data, where AI itself creates [the training information]. With such material, [the AI] essentially grades itself and undergoes self-learning,” the entrepreneur explained.

In 2024, AI startup Anthropic used synthetic data to train one of its flagship models, Claude 3.5 Sonnet. Meta refined its Llama 3.1 models using AI-generated data. OpenAI also uses synthetic data to train o1, a “reasoning” AI system.

Context

Faced with a shortage of high-quality data, AI startups have begun searching for new ways to scale.