China's Xiaomi MiMo Is Now 15X Faster Than ChatGPT and Claude

By Decrypt
Jun 9, 2026

Most people know Xiaomi as the Chinese phone brand. The one that makes cheap electric scooters and air purifiers. Not exactly the company you'd expect to break a major AI inference speed record on a Monday morning.

Parameters are the internal numerical weights that define how a model thinks—the more you have, the more complex the patterns it can recognize. Tokens are the chunks of text the model reads and writes, roughly three-quarters of a word each on average.

Xiaomi did it on a single 8-GPU commodity node. Standard hardware, no custom chips. That changes the calculus for who can actually deploy this kind of speed in production.

Neither runs on hardware you can rent from AWS tonight.

Xiaomi did it on commodity GPUs through software alone—a combination of model-level tricks and a purpose-built inference engine called TileRT.

What's actually going on under the hood

Two techniques carry the speed. The first is FP4 quantization: instead of running the model at higher 8-bit or 16-bit numerical precision, Xiaomi shrinks the expert layers, which make up most of the 1 trillion parameters, down to 4-bit. Memory footprint drops, bandwidth pressure drops, speed goes up. The usual catch is a small loss in quality. Xiaomi's fix is surgical: only the expert layers get compressed, and everything else stays at full precision. With this approach, the quality loss is described as near-zero.
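
To make the memory argument concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Xiaomi's implementation: the article does not specify the FP4 layout or scaling scheme, so the code assumes the common E2M1 format with one scale per block, a hypothetical 90 percent share of parameters in the expert layers, and 16-bit storage for everything else.

```python
# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
# Assumption: the article does not say which FP4 layout is used.
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block_fp4(weights):
    """Round one block of weights to scaled FP4 values.

    Returns (scale, dequantized_weights) so the rounding error is easy to inspect.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return 1.0, [0.0 for _ in weights]
    # Map the largest weight in the block onto the largest FP4 magnitude (6.0).
    scale = max_abs / FP4_E2M1_MAGNITUDES[-1]
    dequantized = []
    for w in weights:
        target = abs(w) / scale
        nearest = min(FP4_E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        dequantized.append((nearest if w >= 0 else -nearest) * scale)
    return scale, dequantized


def weight_bytes(num_params, bits_per_param):
    """Raw weight storage in bytes, ignoring scale factors and metadata."""
    return num_params * bits_per_param // 8


def mixed_precision_bytes(total_params, expert_percent, expert_bits=4, other_bits=16):
    """Storage when only the expert layers are compressed.

    expert_percent is a hypothetical input; the article only says "most".
    """
    expert_params = total_params * expert_percent // 100
    other_params = total_params - expert_params
    return weight_bytes(expert_params, expert_bits) + weight_bytes(other_params, other_bits)


if __name__ == "__main__":
    total = 10**12  # 1 trillion parameters, as stated in the article
    print(weight_bytes(total, 16))           # all weights at 16-bit
    print(weight_bytes(total, 4))            # all weights at 4-bit
    print(mixed_precision_bytes(total, 90))  # experts at 4-bit, the rest at 16-bit
    print(quantize_block_fp4([6.0, -3.0, 0.4, 0.0]))
```

With these assumed numbers, 1 trillion parameters take 2 TB at 16-bit and 0.65 TB in the mixed layout, which is why compressing only the expert layers still captures most of the saving.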

The second is DFlash speculative decoding. Normal speculative decoding has a small draft model guess the next few tokens one at a time, then the big model verifies them in parallel. DFlash skips the sequential drafting entirely: it fills a whole block of masked positions in a single forward pass. In coding tasks, the big model accepts an average of 6.3 out of 8 proposed tokens per verification round. That's roughly six tokens confirmed in one step instead of one.
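
The verification half of that loop can be sketched in a few lines of Python. This is a toy version of standard greedy speculative-decoding verification, not DFlash itself: `verify_block` and `toy_target` are hypothetical names, the "model" is a trivial counting rule, and the drafted block is supplied by hand rather than produced by a block-filling drafter.

```python
def verify_block(target_next_token, context, draft_block):
    """Greedy verification of one drafted block.

    Draft tokens are kept while they match what the target model would have
    produced. In a real system the target scores every draft position in one
    parallel forward pass; this toy loop calls it position by position for
    clarity. Returns the tokens committed this round.
    """
    committed = []
    for draft_token in draft_block:
        expected = target_next_token(context + committed)
        if draft_token != expected:
            # First mismatch: keep the target's token and discard the rest.
            committed.append(expected)
            return committed
        committed.append(draft_token)
    # Every draft token matched, so the target contributes one extra token.
    committed.append(target_next_token(context + committed))
    return committed


def toy_target(tokens):
    """Stand-in for the big model: always predicts last token + 1 (mod 10)."""
    return (tokens[-1] + 1) % 10 if tokens else 0


if __name__ == "__main__":
    # Draft of 8 tokens whose fourth guess is wrong.
    print(verify_block(toy_target, [1, 2], [3, 4, 5, 9, 7, 8, 9, 0]))
```

Here three draft tokens are accepted before the mismatch, and the target's own token replaces the fourth, so the round commits four tokens for one verification step. The higher the acceptance rate, the more tokens each pass of the big model yields.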

TileRT ties it together. It keeps the entire compute pipeline continuously resident inside the GPU—no per-operator launch overhead, no execution gaps.

Xiaomi calls this approach "extreme model-system codesign," and the phrase is accurate: no single technique gets to 1,000 tokens per second on its own, but the combination of all three does.

UltraSpeed accelerates that exact MiMo V2.5 Pro model, not a stripped-down version.

Fast enough inference changes how you can use a model. You can run dozens of reasoning paths in parallel instead of waiting on one answer. Fraud detection, trading signal generation, real-time agent loops: all of these have hard latency constraints that 60 tokens per second can't meet. At 1,000 tokens per second, they come within reach.
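
A rough back-of-the-envelope calculation shows the gap. The sketch below is only arithmetic under assumptions: the 60 and 1,000 tokens-per-second figures come from the article, while the 2,000-token trace, the 10-second budget, and the 500-token path length are hypothetical values chosen for illustration. It also counts paths generated back to back on one stream, which is a simplification of running them in parallel.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


def paths_within_budget(budget_seconds, tokens_per_path, tokens_per_second):
    """How many reasoning paths fit, back to back, inside a latency budget."""
    return int(budget_seconds * tokens_per_second // tokens_per_path)


def approx_words(num_tokens):
    """Rough word count, using the three-quarters-of-a-word-per-token rule of thumb."""
    return num_tokens * 3 / 4


if __name__ == "__main__":
    for rate in (60, 1000):
        seconds = generation_seconds(2000, rate)   # hypothetical 2,000-token trace
        paths = paths_within_budget(10, 500, rate)  # hypothetical 10 s budget, 500-token paths
        print(rate, round(seconds, 1), paths)
    print(approx_words(1000))  # words per second at 1,000 tokens per second
```

Under these assumptions a 2,000-token trace takes about 33 seconds at 60 tokens per second and 2 seconds at 1,000, and a 10-second budget fits 1 path versus 20.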

