AI Models Can’t Agree on Basic Facts Most of the Time, Study Shows

By Decrypt
May 30, 2026

The study gave GPT-5.4, Claude Opus 4.7, Gemini 3 Pro, Gemini 3 Pro with Search, and Sonar Pro the same 1,000 real-world fact-check claims submitted by actual users. The models had to pick one of four labels: true, mostly true, misleading, or false.

On 672 out of 1,000 claims, at least one model broke from the majority. In 34% of cases, the disagreement was severe: one model called a claim true while another called it false.

“These aren’t benchmark items with public answer keys—they’re claims real users submitted for verification to a fact-checking platform,” the study reads. “Only one verdict bucket can be correct per claim, so any disagreement among the panel means at least one model’s verdict is label-inconsistent under this 4-bucket rubric.”
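The two measures described above (any break from the majority, and a direct true/false split) can be illustrated with a minimal Python sketch. The helper name `summarize_claim` and the tie-breaking rule are assumptions made for illustration, not the study's code; the article does not say how the researchers handled ties. The sample input uses the four verdicts the article reports for the Trump/Iran claim (Sonar Pro's verdict is not given, so it is left out).

```python
from collections import Counter

LABELS = {"true", "mostly true", "misleading", "false"}  # the 4-bucket rubric

def summarize_claim(verdicts):
    """Summarize one claim's panel verdicts (hypothetical helper)."""
    unknown = set(verdicts) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {unknown}")
    counts = Counter(verdicts)
    # Most common label; ties go to whichever label was seen first (an assumption).
    majority_label, majority_votes = counts.most_common(1)[0]
    return {
        "majority": majority_label,
        # Number of models whose verdict differs from the majority label.
        "dissenters": len(verdicts) - majority_votes,
        # Severe disagreement: one model says true while another says false.
        "severe": "true" in counts and "false" in counts,
    }

# Verdicts reported in the article for the Iran claim (four of five models).
iran_claim = ["false", "mostly true", "false", "true"]
summary = summarize_claim(iran_claim)
print(summary)
```

A claim counts toward the 672 when `dissenters` is greater than zero, and toward the severe cases when `severe` is true.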

The research used a setup that makes its results harder for AI companies to explain away. Instead of pulling claims from standard test sets, the kind that often leak into training data, the researchers used claims submitted by real people to Lenz’s fact-checking platform. “Most of these claims are unlikely to appear in any training corpus with a gold label attached—there’s no canonical answer key to pattern-match against, no benchmark leaderboard to anchor to,” the paper notes.

The statistical measure of agreement, Krippendorff’s alpha, came in at 0.639 on a scale where 1.0 means perfect agreement and 0 means agreement no better than chance. The study says this indicates “nontrivial but limited agreement.” “The models’ verdicts are structured rather than random, but not consistent enough to treat the panel as a single interchangeable judge,” the researchers note. By convention, values below 0.8 are generally considered too weak to support firm conclusions.
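For readers unfamiliar with the statistic, here is a minimal sketch of Krippendorff's alpha in Python. It assumes the nominal version of the metric, in which every mismatch between labels counts equally; the article does not say which variant the study used, and the function name and toy data below are invented for illustration.

```python
from collections import Counter
from itertools import permutations

def krippendorff_alpha_nominal(units):
    """Alpha for nominal labels. `units` holds one list of verdicts per claim."""
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a claim with a single verdict says nothing about agreement
        # Every ordered pair of verdicts on the same claim, weighted by 1 / (m - 1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1 / (m - 1)
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    # Observed disagreement: weight of pairs whose labels differ.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    # Disagreement expected if verdicts were assigned by chance.
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label ever appears")
    return 1 - observed / expected

# Toy data (not from the study): two raters, four claims.
toy = [["true", "true"], ["true", "false"], ["false", "false"], ["false", "false"]]
alpha = krippendorff_alpha_nominal(toy)
print(round(alpha, 3))  # 0.533
```

In the toy data the raters split on one claim out of four, which already pulls alpha down to about 0.53. Full agreement on every claim would give 1.0.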

When all five models did agree—which happened on only 328 out of 1,000 claims—they almost never agreed that something was misleading or mostly true. Just four claims received a unanimous “misleading” verdict. Zero received unanimous “mostly true.”

The researchers provided example claims where the AI models showed the most divergence, including "The World Bank's active portfolio in Nigeria stands at over $16.4 billion as of 2025." GPT-5.4 said it was "mostly true" while Gemini 3 Pro called it "false" and its sister model Gemini 3 Pro + Search rated it "misleading."

In another example, the models were provided with the claim: "Donald Trump said that an attack on Iran was postponed at the request of Gulf Allies." GPT-5.4 said it was false, Claude Opus 4.7 called it mostly true, Gemini 3 Pro said false, and Gemini 3 Pro + Search rated it true.

“The panel converges on definitive verdicts; the middle of the rubric is where it fractures,” the researchers found. Unanimity only happened at the extremes: either the claim was definitely true or definitely false.

AI companies love to tell you their models are getting more accurate. They publish benchmark scores showing steady improvement. But the Lenz study tested these models on the kind of jagged, ambiguous claims that real humans actually argue about—and found that the models argue too.

The paper is careful to point out that agreement among models is not the same as correctness. “A majority of frontier models is not ground truth. The majority verdict is sometimes wrong; an individual dissenting model is sometimes right. We use the majority as a structural reference point for measuring disagreement, not as a stand-in for correctness.”

On the 328 claims where all five models agreed, zero received a unanimous "mostly true." The nuance bucket emptied out completely. If AI models can only find consensus at the extremes, can they be trusted as fact checkers at all?
