BIGFISH TECHNOLOGY LIMITED
26 August 2025

The Invisible Threat: Data-Theft Prompts Hidden in AI Images

As Artificial Intelligence (AI) and Large Language Models (LLMs) become integral to business operations, they bring not only efficiency but also new risks. Recently, researchers from Trail of Bits revealed a novel attack method that hides malicious instructions in images that are downscaled before being processed by LLMs.

This technique has the potential to steal sensitive user data and demonstrates how attackers are finding creative ways to exploit AI systems across multiple platforms.

 

How the Attack Works

The attack, informally called “Downscaling-based Prompt Injection”, leverages the image-rescaling process common in AI systems:

  1. Embedding hidden instructions in full-resolution images
    • Crafted images contain instructions that are effectively invisible to the human eye at full resolution.

  2. Automatic downscaling by AI platforms
    • For performance and cost efficiency, images are downscaled using algorithms such as nearest neighbor, bilinear, or bicubic interpolation.

  3. Emergence of hidden patterns
    • Resampling introduces aliasing artifacts that cause the hidden text to appear once the image is reduced in resolution (a minimal sketch follows this list).

  4. LLM interprets hidden text as user input
    • The model treats the revealed text as part of the prompt and follows these “hidden instructions” alongside the legitimate user input, leading to actions such as data leakage or unauthorized operations.
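
A minimal sketch of the downscaling step that sits between the user and the model, assuming a Pillow-based pipeline; the file names and the 512×512 target size are illustrative placeholders, not values from the research:

    # Downscale an uploaded image the way many AI pipelines do before
    # passing it to a vision-capable model.
    from PIL import Image

    # The full-resolution image the user uploads and reviews.
    original = Image.open("uploaded_image.png")

    # Platforms shrink the image for cost and performance, e.g. with bicubic
    # interpolation. The algorithm used (nearest neighbor, bilinear, bicubic)
    # determines which pixels survive, and therefore what a crafted payload
    # looks like after resampling.
    downscaled = original.resize((512, 512), resample=Image.Resampling.BICUBIC)

    # The model "sees" only this resampled copy, not the image the user
    # reviewed, which is the gap the attack exploits.
    downscaled.save("what_the_model_sees.png")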


Real-World Example

In one demonstration, researchers used Gemini CLI to exfiltrate Google Calendar data to an external email address. The exfiltration worked because the Zapier MCP server was configured with trust=True, which approves tool calls without asking the user for confirmation.

 

Impacted Systems

The attack has been demonstrated against several widely used systems, including:

  • Google Gemini CLI
  • Vertex AI Studio (Gemini backend)
  • Gemini Web Interface
  • Gemini API via llm CLI
  • Google Assistant on Android
  • Genspark


Because the attack exploits a fundamental process—image downscaling—it is likely to extend far beyond the tested platforms.

To illustrate their findings, Trail of Bits also released Anamorpher (beta), an open-source tool that generates malicious images tailored to different resampling methods; the sketch below shows why that tailoring matters.
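
As a rough illustration (this is not Anamorpher itself; the file names and the 256×256 target size are assumptions), the same crafted image can resolve to different pixel content under different resampling algorithms, so a payload tuned for one method may stay hidden under another:

    # Compare how three common resampling methods downscale the same image.
    from PIL import Image

    src = Image.open("crafted_image.png")

    for name, method in [
        ("nearest", Image.Resampling.NEAREST),
        ("bilinear", Image.Resampling.BILINEAR),
        ("bicubic", Image.Resampling.BICUBIC),
    ]:
        out = src.resize((256, 256), resample=method)
        # Each method blends or selects pixels differently, so the text that
        # emerges after downscaling depends on the algorithm the platform uses.
        out.save(f"downscaled_{name}.png")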

 

Risks for Organizations

  • Data Exfiltration – Leakage of calendars, emails, or business-critical files.

  • Unauthorized Tool Execution – Particularly dangerous when workflows and APIs (e.g., Zapier) are involved.

  • Multimodal Prompt Injection – Highlights that threats now extend beyond text-based prompts into images, audio, and video.

 

Mitigation Strategies

Trail of Bits recommends several defenses to reduce exposure:

  1. Restrict image dimensions
    • Limit uploaded image resolution to reduce the potential for hidden data.

  2. Provide downscale previews
    • Show users the downscaled image exactly as the LLM will receive it before it is processed (a sketch combining this with the dimension check in step 1 follows this list).

  3. Require explicit user confirmation
    • Especially for sensitive tool calls, and whenever text is detected within an uploaded image.

  4. Adopt secure design patterns
    • Implement systematic defenses against prompt injection, as highlighted in recent research on LLM security architecture.
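
A minimal sketch of the first two mitigations, assuming a Pillow-based upload path; the size limit, file names, and the idea of writing the preview to a file are illustrative choices, not recommendations from the research:

    # Enforce a dimension limit and generate the exact downscaled copy the
    # model will receive, so it can be shown to the user as a preview.
    from PIL import Image

    MAX_DIM = 1024            # illustrative upper bound on uploaded images
    MODEL_INPUT = (512, 512)  # illustrative resolution the model actually receives

    def prepare_image(path: str) -> Image.Image:
        img = Image.open(path)

        # Mitigation 1: restrict image dimensions up front.
        if max(img.size) > MAX_DIM:
            raise ValueError(f"Image exceeds {MAX_DIM}px limit: {img.size}")

        # Mitigation 2: produce the downscaled copy the model will see and
        # surface it to the user before any further processing or tool call.
        preview = img.resize(MODEL_INPUT, resample=Image.Resampling.BICUBIC)
        preview.save("preview_shown_to_user.png")
        return preview

Sensitive tool calls triggered after image processing should still require explicit confirmation, as noted in step 3.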

 

Conclusion

This discovery underscores that AI security risks go far beyond text-based prompts. Multimodal systems processing images, audio, or video open new avenues for attackers. The downscaling-based prompt injection attack is a reminder that threat actors exploit even the most subtle system behaviors.

For organizations, the path forward is clear: invest in defense-in-depth, prioritize secure-by-design architectures, and ensure strong AI governance and security awareness. By doing so, businesses can confidently embrace AI while minimizing the risks of exploitation.

 

#bigfishtechnology #bigfishtec #bigfishcanada #AIsecurity #CyberSecurity #LLMsecurity #PromptInjection #DataProtection #CyberThreats #RiskManagement #AIgovernance #SecureByDesign #DefenseInDepth #TrustworthyAI #FutureOfAI #TechRisks #AIethics #CyberAwareness #AIforBusiness