Chinese AI startup DeepSeek has introduced a new large language model that reportedly surpasses counterparts from Meta and OpenAI in testing.

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n pic.twitter.com/p1dV9gJ2Sd
— DeepSeek (@deepseek_ai) December 26, 2024

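The model, DeepSeek-V3, has 671 billion parameters, compared with 405 billion in Meta's Llama 3.1. It uses a mixture-of-experts (MoE) design in which 37 billion of those parameters are activated. A larger parameter count generally gives a model more capacity for complex tasks, though it does not by itself guarantee more accurate responses.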

The Hangzhou-based company trained the model in just two months on a budget of $5.58 million, using only 2,048 GPUs. That is far fewer resources than major tech firms typically devote to training comparable models. DeepSeek promises the best price-to-performance ratio on the market.
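For a sense of scale, the reported figures can be turned into a rough cost per GPU-hour. The sketch below is a back-of-the-envelope calculation only: it assumes "two months" means 60 days of continuous use on all 2,048 GPUs, which the article does not state, and the function name is purely illustrative.

```python
def implied_gpu_hour_cost(budget_usd, gpus, days):
    """Return (total GPU-hours, implied USD per GPU-hour)."""
    gpu_hours = gpus * days * 24  # assumes every GPU runs around the clock
    return gpu_hours, budget_usd / gpu_hours


# Budget and GPU count come from the article; 60 days is an assumed
# reading of "two months".
gpu_hours, usd_per_hour = implied_gpu_hour_cost(5.58e6, 2048, 60)
print(f"{gpu_hours:,.0f} GPU-hours, about ${usd_per_hour:.2f} per GPU-hour")
# prints: 2,949,120 GPU-hours, about $1.89 per GPU-hour
```

Under these assumptions the budget works out to a little under $2 per GPU-hour. A different reading of the training period would shift the figure accordingly.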

🎉 What’s new in V3?

🧠 671B MoE parameters
🚀 37B activated parameters
📚 Trained on 14.8T high-quality tokens

🔗 Dive deeper here:
Model 👉 https://t.co/9iwEF6aLuk
Paper 👉 https://t.co/ruzwMFYAAH

🐋 2/n
— DeepSeek (@deepseek_ai) December 26, 2024
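The figures in the post above imply a few simple ratios, worked out in the sketch below. It uses only the three numbers quoted (671B total parameters, 37B activated, 14.8T training tokens); the function and key names are illustrative, not taken from DeepSeek.

```python
def moe_ratios(total_params, active_params, train_tokens):
    """Derive simple ratios from headline model figures."""
    return {
        # share of the model's parameters that are activated
        "active_fraction": active_params / total_params,
        # training tokens per parameter, counted against the full model
        "tokens_per_total_param": train_tokens / total_params,
        # training tokens per parameter, counted against the activated subset
        "tokens_per_active_param": train_tokens / active_params,
    }


ratios = moe_ratios(671e9, 37e9, 14.8e12)
print(f"activated share: {ratios['active_fraction']:.1%}")
print(f"tokens per total parameter: {ratios['tokens_per_total_param']:.0f}")
print(f"tokens per activated parameter: {ratios['tokens_per_active_param']:.0f}")
```

In other words, roughly 5.5% of the model's parameters are activated, and the training set amounts to about 22 tokens per total parameter, or about 400 per activated parameter.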

Future plans include introducing multimodality and “other advanced features.”

Former OpenAI team member Andrej Karpathy praised DeepSeek’s work, calling it impressive given the limited resources.

DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M).

For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being… https://t.co/EW7q2pQ94B
— Andrej Karpathy (@karpathy) December 26, 2024

“This doesn’t mean large GPU clusters are unnecessary for cutting-edge LLMs, but it shows the importance of maximizing available resources. This project demonstrates there’s still much to optimize in both data and algorithms,” Karpathy added.

Previously, DeepSeek released DeepSeek-R1-Lite-Preview, an advanced “thinking” model positioned as a “competitor to OpenAI’s o1.”

In July, Chinese company Kuaishou launched its video-generation AI model Kling, making it publicly available.

More Powerful Than Meta and OpenAI: Chinese Startup DeepSeek Unveils AI Model