DeepSeek Sets Permanent 75% Discount for V4-Pro Pricing

The artificial intelligence developer DeepSeek officially announced on Friday, May 22, 2026 (UTC), that the temporary 75% discount pricing for its flagship language model DeepSeek-V4-Pro will be maintained permanently. With this decision, developers using the company's API will not face the rate hike that was expected at the end of the promotion on May 31, 2026 (UTC), establishing one of the most affordable language processing options in the global tech market. The official announcement was directly integrated into the brand's communication channels and official pricing documentation.
A Definitive Price Break in the API Market
Originally launched in April 2026 with open weights under the MIT license, the DeepSeek-V4-Pro is a Mixture-of-Experts (MoE) model featuring 1.6 trillion total parameters, with 49 billion active during processing. The architecture was specifically developed for complex programming tasks, advanced logical reasoning, and high-density agent workflow. Practically, the definitive pricing offers an extremely competitive charge per million tokens. The entry cost in case of a cache miss was set at 3 yuan, approximately $0.435, while the output cost is 6 yuan, about $0.87 per million tokens. However, the greatest savings are in cache hits, costing just 0.025 yuan or $0.0036 per million tokens.
Behind this, maintaining these aggressive values creates an almost insurmountable economic barrier for industry giants. Market analysts like @faraz0x and @hqmank point out that the new API prices from the Asian company are between 3 and 35 times cheaper than the estimated costs for running equivalent tasks on OpenAI's GPT-5.5 or the newly released Claude Opus 4.7 from Anthropic. The result of this colossal difference is expected to be a drastic acceleration in technology adoption by startups and independent developers building systems with a large context volume, as the model supports context windows of up to 1 million tokens.
Pressure on the AI Infrastructure Ecosystem
The decision by the Chinese company not to adjust its rates has prompted an immediate reaction throughout the artificial intelligence distribution chain. Popular API aggregators and third-party cloud providers like OpenRouter and DeepInfra have begun updating their respective pass-through tables in the last few hours to reflect the new cost reality of the model. Additionally, direct compatibility with OpenAI's library facilitates the immediate migration of legacy systems, allowing companies to replace their processing backends without needing to rewrite large blocks of code.
Meanwhile, industry experts are already projecting operational developments for the second half of 2026. The expectation from new hardware suppliers is that processing costs will drop even further in the coming months with the greater local availability of Ascend line graphic acceleration chips manufactured in Asia. With more efficient servers and optimized energy costs, the company consolidates its business model based on a massive volume of requests, forcing Western providers to rethink their profit margins to retain their corporate clients.
This content was created and reviewed by our team (iatoskill.com), if you find any issues, please reach out to us


