The Best AI Models Still Encourage 'Harmful Intimacy' With Chatbots, Study Finds

Jun 4, 2026

As people increasingly turn to AI chatbots for advice, companionship, and emotional support, a new study suggests that even the most advanced models still struggle to maintain healthy boundaries with users.

“Large language models are increasingly used as conversational partners for companionship, emotional disclosure, and interpersonal advice, but the social dynamics of these interactions can create harms that are not captured by capability oriented or traditional safety evaluations,” the researchers wrote.

The EUDAIMONIA benchmark evaluates how AI models behave in social conversations. The study found social-alignment failures were common across leading models and argues that current AI testing focuses on reasoning and factual accuracy while paying less attention to the social dynamics that emerge when users form relationships with chatbots.

“Social-interaction harms are a core alignment problem grounded in user welfare, not only capability or conventional safety,” they wrote. “LLMs can be factually accurate and helpful while still encouraging harmful intimacy, dependence, prolonged engagement, obscuring AI identity, or positioning themselves as substitutes for human relationships.”

To measure those risks, the researchers created a Social AI Design Code that flags behaviors such as acting human, expressing emotions, replacing human relationships, and using tactics designed to keep users engaged. Using real conversations from the WildChat dataset, they evaluated 969 user inputs and ran more than 3,100 violation checks across models from OpenAI, Anthropic, Google, xAI, DeepSeek, and Alibaba.

Anthropic's Claude Opus 4.6 posted violation rates of 36.8% on in-the-wild prompts and 28.1% on rewritten prompts, while xAI's Grok 4.3 scored 42.1% and 35.7%, respectively. Of all the models tested, GPT-4o Mini recorded the highest violation rates, at 43.3% and 44.0%.

The findings also come amid growing concern that AI systems are becoming increasingly adept at deception.

Against this backdrop, the USC researchers argue that AI developers should evaluate social behavior as carefully as they evaluate factual accuracy and safety.

“Model developers and auditors should evaluate social behavior directly, especially when post-training targets warmth, personality, engagement, or user preference,” they wrote. “As LLMs become everyday conversational partners, alignment must account for the social roles they invite users to assign to them.”
