DeepSeek, Xiaomi Just Made Frontier AI 99% Cheaper. American Labs Went the Other Way

By Decrypt
May 28, 2026

Quick explainer for the non-developers in the room: When you use ChatGPT or Claude in a browser, you're paying a flat subscription—or nothing. When a company builds a product on top of an AI model, they pay per token, where a token is roughly three-quarters of a word. Every message sent, every reply generated, every document processed: all of it adds up at a rate measured in millions of tokens.

An API is the raw pipe that makes this possible: it lets an app, an agent, a website, or any other product call the model from its own environment. Token pricing therefore determines whether an AI-powered product is economically viable or a money pit.

Token plans are a subscription wrapper on top of that. You buy credits upfront; the model eats through them. Xiaomi's billing upgrade gives users 5 to 8 times more tokens at the same price. The Max plan at $100 now gets you 82 billion tokens, up from 1.6 billion.

For context, 82 billion tokens is more than 60 billion words.
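The conversion above is simple arithmetic. As a minimal sketch, using the article's rule of thumb of three-quarters of a word per token (actual ratios vary by tokenizer and language; the function names are illustrative only):

```python
# Rule of thumb from the article: one token is roughly 3/4 of a word.
# Real ratios depend on the tokenizer and the language of the text.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: float) -> float:
    """Approximate number of words represented by a token count."""
    return tokens * WORDS_PER_TOKEN


def tokens_per_dollar(plan_tokens: float, plan_price_usd: float) -> float:
    """Tokens bought per dollar on a prepaid token plan."""
    return plan_tokens / plan_price_usd


# Figures quoted in the article for the $100 Max plan.
max_plan_tokens = 82e9
max_plan_price_usd = 100.0

words = tokens_to_words(max_plan_tokens)
rate = tokens_per_dollar(max_plan_tokens, max_plan_price_usd)
print(f"{words / 1e9:.1f} billion words")
print(f"{rate / 1e6:.0f} million tokens per dollar")
```

At the quoted figures this works out to about 61.5 billion words, consistent with "more than 60 billion."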

Why the cuts are real, not marketing

Fuli Luo, head of Xiaomi's MiMo team and a former core DeepSeek developer who co-built DeepSeek-V2, published a technical explanation on X. The biggest savings come from a smarter way of storing and reusing information the AI has already processed. Instead of repeatedly doing the same work, Xiaomi’s system can remember much more data at once—about five times more than before. That means the AI needs far less computing power, cutting storage and processing costs by around 80%.
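The general idea of reusing already-processed context can be illustrated with a toy prefix cache. This is only a conceptual sketch: it treats words as tokens, the class and method names are hypothetical, and it does not model Xiaomi's actual inference framework.

```python
class PrefixCache:
    """Toy stand-in for a KV cache: remembers which token prefixes were processed."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, ...]] = set()

    def process(self, tokens: list[str]) -> tuple[int, int]:
        """Return (cached_tokens, computed_tokens) for one request."""
        # Find the longest prefix of this request that was seen before.
        cached = 0
        for n in range(len(tokens), 0, -1):
            if tuple(tokens[:n]) in self._seen:
                cached = n
                break
        # "Compute" the rest and remember every newly processed prefix.
        for n in range(cached + 1, len(tokens) + 1):
            self._seen.add(tuple(tokens[:n]))
        return cached, len(tokens) - cached


cache = PrefixCache()
system_prompt = "You are a helpful assistant".split()  # 5 toy tokens

first = cache.process(system_prompt + ["Summarize", "this"])
second = cache.process(system_prompt + ["Translate", "that"])
print(first)   # nothing cached yet, all 7 tokens computed
print(second)  # the 5-token system prompt is reused, 2 tokens computed
```

The second request only pays full price for its two new tokens, which is why workloads with a stable shared prefix benefit most from cache-hit pricing.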

Behind the MiMo API Price Reduction: The deepest price cut, up to 99%, is for Input (Cache Hit). The core reason is our inference framework now supports hierarchical KV cache optimization for SWA. Production inference engine tests show this optimization increases cached token…

“Operating at these newly reduced API prices, our production inference engine is running at near full capacity, and we can still essentially break even,” Luo wrote. “If more architectures that save compute and KV [key-value] cache emerge, along with better inference Infra to drive down API costs, this will form an excellent virtuous cycle in the industry.”

The result is a model 98% cheaper than GPT-5.5 Pro with competitive performance.

Silicon Valley’s bet

DeepSeek V4-Pro is a 1.6-trillion-parameter model that gives you the knowledge base of a massive model at a fraction of the compute cost. It now permanently runs at $0.435 input and $0.87 output per million tokens. That's a model that scored 80.6% on SWE-bench Verified, a benchmark measuring real GitHub issue resolution rather than cherry-picked demos, against Claude Opus 4.6's 80.8%. The pricing gap between models with essentially the same coding score: 34x on output.

DeepSeek and Xiaomi aren't alone

Kimi K2.5 from Moonshot AI, with 76.8% on SWE-bench Verified, runs $0.60 input and $2.50 output. GLM-5.1 from Z.AI beat Claude Opus 4.6 on a key coding benchmark earlier this quarter. Four Chinese frontier models shipped in a 12-day window in early May, all under one-third of Opus 4.7's per-token cost.
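To see what these list prices mean for a bill, here is a rough sketch that prices one workload at the rates quoted in this article. The workload size (10 million input tokens, 2 million output tokens) is a made-up example, and the Kimi K2.5 figures are assumed to be per million tokens like the DeepSeek ones.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Cost in USD given token counts and prices per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Prices quoted in the article, in USD per million tokens (input, output).
# The per-million unit for Kimi K2.5 is an assumption.
PRICES = {
    "DeepSeek V4-Pro": (0.435, 0.87),
    "Kimi K2.5": (0.60, 2.50),
}

# Hypothetical workload, for illustration only.
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

for model, (in_price, out_price) in PRICES.items():
    cost = request_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, in_price, out_price)
    print(f"{model}: ${cost:.2f}")
```

For this example workload the totals come to roughly $6.09 and $11.00, before any cache discounts.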

This chart shows how Chinese models stack up against the three most popular American AI providers (Anthropic, OpenAI, and Meta) in terms of price-to-quality ratio.

Image: Artificialanalysis.ai

The Q2 2026 gap between Chinese and American frontier models sits at 15–30x, depending on which models you compare—and that's the baseline, before any cache discounts.

What this week's cuts do is collapse that gap further for the specific workloads that actually run in production: agent pipelines with stable system prompts, document processors, retrieval tools, things that hit cache constantly. At $0.003625 per million cached input tokens, DeepSeek V4-Pro's cost for repeated context is functionally a rounding error.
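How much this matters depends on how often a workload hits the cache. A minimal sketch of the effective input price at different cache-hit rates, using the two DeepSeek V4-Pro input prices quoted in this article (the hit rates themselves are hypothetical):

```python
# DeepSeek V4-Pro input prices quoted in the article, USD per million tokens.
UNCACHED_INPUT_PRICE = 0.435
CACHED_INPUT_PRICE = 0.003625


def blended_input_price(cache_hit_rate: float) -> float:
    """Effective input price per million tokens for a given cache-hit rate."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return (
        cache_hit_rate * CACHED_INPUT_PRICE
        + (1.0 - cache_hit_rate) * UNCACHED_INPUT_PRICE
    )


# Hypothetical hit rates, from no caching to a prompt that is almost all reused.
for rate in (0.0, 0.5, 0.9, 0.99):
    price = blended_input_price(rate)
    print(f"hit rate {rate:.0%}: ${price:.4f} per million input tokens")
```

At the quoted prices, a cached input token costs 1/120 of an uncached one, so a pipeline that reuses 90% of its context pays roughly a tenth of the list input price.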

