University researchers in China have found a way to alter the behavior of AI voice models by embedding hidden commands, inaudible to humans, inside audio clips. The attack succeeds up to 96% of the time, according to the research from Zhejiang University.
The attack works by modifying the numerical values inside a digital audio waveform in ways that are not perceptible to human listeners but still affect how AI models interpret the signal. Researchers said the manipulated audio can override or redirect a model’s behavior even when legitimate user instructions are included with the clip.
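To make the idea of changing waveform values by very small amounts concrete, here is a minimal Python sketch. It is not the researchers' method: a real attack would presumably optimize the perturbation against a target model, whereas this example substitutes random noise purely to show how small the per-sample changes can be relative to the signal. The function names, the `epsilon` bound, and the test tone are all assumptions made for illustration.

```python
import math
import random

SAMPLE_RATE = 16000  # samples per second (assumed for this illustration)


def perturb(samples, epsilon, seed=0):
    """Shift each sample by at most +/- epsilon, keeping values in [-1, 1].

    Random noise stands in for an optimized adversarial perturbation.
    """
    rng = random.Random(seed)
    return [max(-1.0, min(1.0, s + rng.uniform(-epsilon, epsilon)))
            for s in samples]


def snr_db(clean, modified):
    """Signal-to-noise ratio in decibels between clean and modified audio."""
    signal = sum(s * s for s in clean)
    noise = sum((m - s) ** 2 for s, m in zip(clean, modified))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)


# One second of a 440 Hz tone at half amplitude, as floats in [-1, 1].
clean = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
         for n in range(SAMPLE_RATE)]

modified = perturb(clean, epsilon=0.002)
max_change = max(abs(m - s) for s, m in zip(clean, modified))

print(f"largest per-sample change: {max_change:.6f}")
print(f"signal-to-noise ratio: {snr_db(clean, modified):.1f} dB")
```

With these assumed settings, no sample moves by more than 0.002 on a scale of -1 to 1, and the signal-to-noise ratio comes out at roughly 50 dB, so the numbers differ while the change stays tiny compared with the tone itself.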
“Many previous attacks on generative models required the attacker to have complete control over both the final audio input and original instructions given to the model, essentially acting as the user,” the study said. “Here, the attacker manipulates only the audio data being processed by the model, which makes it possible to attack a model while it’s being used by someone else.”
According to the study, possible delivery methods include online videos, music clips, voice notes, or audio from Zoom calls uploaded to AI transcription services. The team also said unpublished follow-up work demonstrated similar attacks in live AI voice chats.
The researchers said monitoring a model’s internal attention mechanisms was the most effective defense they tested. However, they also found that attackers aware of the defense could reduce the strength of the manipulation while maintaining much of the attack’s effectiveness.
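The article does not describe how the attention monitoring works, so the following Python sketch shows only one hypothetical reading of the idea: measure how much of the model's attention lands on the audio portion of the input rather than on the user's instruction, and flag the request when that share is unusually high. The function names, the toy attention values, and the 0.8 threshold are all assumptions, not details from the study.

```python
def audio_attention_share(attention_row, audio_positions):
    """Fraction of one output token's attention that falls on audio positions."""
    total = sum(attention_row)
    if total == 0:
        return 0.0
    return sum(attention_row[i] for i in audio_positions) / total


def flag_suspicious(attention_rows, audio_positions, threshold=0.8):
    """Return (flagged, mean_share) across all output tokens.

    The threshold is a hypothetical value; a real system would need
    to calibrate it on benign traffic.
    """
    shares = [audio_attention_share(row, audio_positions)
              for row in attention_rows]
    mean_share = sum(shares) / len(shares)
    return mean_share > threshold, mean_share


# Toy input: positions 0-1 hold the user's instruction, 2-5 hold audio.
AUDIO_POSITIONS = [2, 3, 4, 5]

benign_rows = [
    [0.30, 0.30, 0.10, 0.10, 0.10, 0.10],
    [0.25, 0.35, 0.10, 0.10, 0.10, 0.10],
]
hijacked_rows = [
    [0.02, 0.03, 0.25, 0.25, 0.25, 0.20],
    [0.01, 0.04, 0.30, 0.25, 0.20, 0.20],
]

print(flag_suspicious(benign_rows, AUDIO_POSITIONS))
print(flag_suspicious(hijacked_rows, AUDIO_POSITIONS))
```

A fixed threshold like this also suggests why the defense can be evaded as the researchers describe: an attacker who knows the cutoff could weaken the manipulation until the audio's share of attention falls just under it.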
“These single-point defenses struggle to resist our attack because we found it’s very hard for these models to distinguish the normal user intent and our adversary attack,” Chen said.