“We began this research after observing prompt injection attempts in public repositories using AI-assisted GitHub workflows across multiple vendors, where attacker-controlled issue or [pull requests], content is processed by the AI agent and could influence its tool use,” Microsoft wrote.
On GitHub, a pull request allows developers to propose changes to a code repository and have those changes reviewed before they are approved and merged.
According to Microsoft, attackers could use prompt injection attacks hidden in GitHub issues, pull requests, or comments to manipulate Claude Code into accessing files containing sensitive credentials.
To test the vulnerability, Microsoft created a GitHub workflow and disguised malicious instructions behind content hosted on a domain it controlled, allowing the researchers to bypass Claude's safety protections. The prompt injection attack tricked Claude into reading sensitive credentials and altering them to evade both Claude's safeguards and GitHub's secret-scanning tools. Microsoft said an attacker could then reconstruct the credential and exfiltrate it through issue comments, workflow logs, web requests, or shell commands.
“To bypass Sonnet’s refusal safety mechanisms, we obscured the shell payload behind a response from our controlled domain," the firm said. "We also enabled the workflow to be triggered by users with no 'write' permissions to ensure Anthropic’s environment variables scrub mitigations were active during our tests.”
Anthropic patched the flaw on May 5 with Claude Code version 2.1.128 after Microsoft disclosed the vulnerability through HackerOne on April 29.
Despite multiple layers of built-in security controls, Microsoft found that a determined attacker could potentially manipulate an AI agent into exposing sensitive information.
“We are entering an era where natural language is executable code, and untrusted inputs like GitHub issues must be treated as hostile by default,” it said. “A single, carefully crafted comment combined with a misunderstood trust boundary is all it takes to walk away with production credentials.”




















