China’s DeepSeek says it releases long-awaited new AI model


Chinese startup DeepSeek released a new artificial intelligence model on Friday, more than a year after it wowed the world with a low-cost reasoning model that matched the capabilities of American rivals.

DeepSeek-V4 “presents an ultra-long context of one million words,” the company said in a statement on social media platform WeChat, hailing it as “cost-effective” in a separate announcement on X.

The announcement came after Meta said it planned to cut a tenth of its staff as it seeks productivity gains from the rest of its workforce while investing heavily in artificial intelligence. Reports said Microsoft was also looking to trim its ranks.

Context length determines how much input a model is able to absorb to help it accomplish tasks. The company said DeepSeek-V4 “(achieves) leadership in both in-house and open-source domains across agent capabilities, world knowledge, and reasoning performance.”

An open-source “preliminary version” of the model is now available, the company said.

DeepSeek-V4 is released in two versions, DeepSeek-V4-Pro and DeepSeek-V4-Flash, with the latter being “a more efficient and economical choice” because it has fewer parameters.

V4-Pro has 1.6 trillion parameters while V4-Flash has 284 billion. Parameters are the internal values a model learns during training, and a larger count generally improves a model's decision-making ability.

The model is also “optimized” for popular AI agent products such as Claude Code, OpenClaw, OpenCode and CodeBuddy, the statement said.

“In world knowledge benchmarks, DeepSeek-V4-Pro significantly leads other open-source models and is only slightly surpassed by the top-tier closed-source model, (Google’s) Gemini-Pro-3.1,” the statement added.

Hangzhou-based DeepSeek burst onto the scene in January last year with a generative AI chatbot powered by its R1 reasoning model that overturned assumptions about US dominance in the strategic sector.

The so-called “DeepSeek shock” triggered a sell-off in AI-related stocks and a reckoning over business strategy, in what was also described as a “Sputnik moment” for the industry.

The chatbot performed at a similar level to ChatGPT and other leading US offerings, but the company said it took significantly less computing power to develop.

However, its sudden popularity raised questions about data privacy and censorship, with the chatbot often refusing to answer questions on sensitive topics, such as the 1989 Tiananmen crackdown.

Domestically, DeepSeek’s AI tools have been widely adopted by Chinese municipalities and healthcare institutions, as well as the financial sector and other businesses.

This has been driven in part by DeepSeek’s decision to make its systems open source, with their inner workings public – in contrast to the proprietary models sold by OpenAI and other Western rivals.

“Major AI models produced in China spearheaded the development of the global open-source AI ecosystem,” Chinese Premier Li Qiang told an annual meeting of China’s top decision-makers last month.

The AI race has intensified the rivalry between China and the United States, and the White House on Thursday accused Chinese entities of a massive attempt to steal artificial intelligence technology.

“The US has evidence that foreign entities, primarily in China, are running industrial-scale distillation campaigns to steal US AI,” chief science and technology officer Michael Kratsios said in a post on X.

“We will take action to protect American innovation.”


