I Scanned 152 Files of My Own AI-Generated Code for Invisible Unicode Malware

Two weeks ago, a supply chain attack called Glassworm compromised 150+ GitHub repositories and 72+ browser extensions by hiding malicious payloads in characters that are literally invisible in every editor, terminal, and code review tool on the planet.

I build AI infrastructure for a living. Every hook file, every automation script, every Nexus job in my homelab was generated by Claude Code. When I read the Glassworm post-mortem and saw “targets AI-generated code using invisible Private Use Area Unicode characters,” I had one thought: I should scan my own files.

What Glassworm Actually Did

The Glassworm campaign (March 3–9, 2026) exploited a fundamental property of Unicode that almost no developer thinks about: the character encoding standard has thousands of valid codepoints that render as absolutely nothing. No visual width. No pixel on screen. No line in a diff. But they exist as bytes in your files, and in the right parser context, they can change execution.

The specific technique: insert Variation Selector characters (U+FE00–U+FE0F) adjacent to identifiers or string literals in JavaScript and Python code. These characters are designed to modify the visual rendering of the preceding character — but when placed after an ASCII character, they’re simply invisible noise to the human reviewer. The parser, however, sees them.

“The attack doesn’t need to change what code looks like. It needs to change what code is.”
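The principle fits in four lines of Python. Appending a single invisible codepoint (here a Variation Selector, the same trick works with zero-width characters) produces a string that displays identically but compares unequal — my own illustration, not Glassworm's actual payload:

```python
a = "admin"
b = "admin\ufe00"  # same text plus an invisible Variation Selector-1

print(a, b)    # the two look identical in most terminals
print(a == b)  # False: five characters versus six
```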

The Glassworm operators wrapped their payload commits in AI-generated changelogs — benign descriptions of unrelated refactors, generated to pass casual review. The combination of invisible characters and AI-authored cover text made it nearly undetectable through normal review workflows.

Why My Infrastructure Was the Target Profile

I run approximately 30 Claude Code hooks that execute on every tool call, session start, and file write. I have 15 Nexus job scripts that run headlessly on a cron-driven pipeline with access to my homelab infrastructure. All of it was written by an LLM.

This is exactly the profile Glassworm targets:

  • AI-generated code trusted without byte-level review
  • Headless execution with elevated permissions
  • No human review step between generation and execution
  • Active development cycle — new files added regularly

I’m not saying Claude Code has been compromised. I’m saying the attack surface exists, the tooling to audit it is trivial to build, and I had never run it.

Building the Scanner

Python’s unicodedata module makes this straightforward. I wrote a recursive file scanner that checks every text file in my hooks and jobs directories against the known Glassworm character ranges, plus a broader set of invisible Unicode that’s been used in related attacks:

  • U+FE00–U+FE0F: Variation Selectors — Glassworm’s primary vector
  • U+E0100–U+E01EF: Variation Selectors Supplement — secondary vector with 240-character payload capacity
  • U+202A–U+202E: Bidirectional overrides — the Trojan Source attack family
  • U+200B–U+200D: Zero-width characters — string comparison bypass

The scanner runs in under 3 seconds across 152 files. It outputs per-file JSON that can be rendered into an audit report.
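Here is a minimal sketch of that scanner. The codepoint ranges come from the list above; the structure and function names are mine, not necessarily the production tool's:

```python
#!/usr/bin/env python3
"""Minimal sketch of an invisible-Unicode scanner."""
import json
import sys
from pathlib import Path

# Inclusive codepoint bounds for the ranges described in the post
SUSPICIOUS_RANGES = [
    (0xFE00, 0xFE0F),    # Variation Selectors
    (0xE0100, 0xE01EF),  # Variation Selectors Supplement
    (0x202A, 0x202E),    # Bidirectional overrides (Trojan Source)
    (0x200B, 0x200D),    # Zero-width characters
]

def scan_text(text):
    """Return findings as (line_no, col, codepoint) tuples."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            cp = ord(ch)
            if any(lo <= cp <= hi for lo, hi in SUSPICIOUS_RANGES):
                findings.append((line_no, col, f"U+{cp:04X}"))
    return findings

def scan_tree(root):
    """Recursively scan text files under root; return {path: findings}."""
    report = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        findings = scan_text(text)
        if findings:
            report[str(path)] = findings
    return report

if __name__ == "__main__" and len(sys.argv) > 1:
    print(json.dumps(scan_tree(sys.argv[1]), indent=2))
```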

What I Found

The scanner flagged 7 files with 8 CRITICAL-severity findings, all in the U+FE00–U+FE0F range — Glassworm’s primary vector. For a moment, that felt alarming.

Then I investigated each finding. Every single one was U+FE0F (Variation Selector-16) following U+26A0 (⚠). The sequence U+26A0 U+FE0F is the standard Unicode representation of the ⚠️ warning emoji. It’s present in my hooks because Claude Code writes console.log('⚠️ Warning: ...') — standard, intentional, completely benign.
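You can confirm the decomposition yourself with the standard library’s unicodedata module:

```python
import unicodedata

for ch in "⚠️":  # the warning emoji as it appears in the hook files
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
# U+26A0  WARNING SIGN
# U+FE0F  VARIATION SELECTOR-16
```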

The secondary Glassworm payload range (U+E0100–U+E01EF) returned zero results. No zero-width characters. No bidirectional overrides. The infrastructure is clean.

The Interesting Part: False Positives Are a Feature, Not a Bug

The scanner’s “false positive” on the ⚠️ emoji is actually the right behavior for a security tool. Here’s why: you want the scanner to flag every occurrence and force an investigation. A scanner that silently skips “probably fine” characters misses the attack.

A more sophisticated version would check context — VS16 is only suspicious when not preceded by a valid emoji base character. I’ve included that logic in the pre-commit hook version of the scanner. For the audit scan, I prefer to over-flag and investigate manually. The investigation confirmed clean code.
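A sketch of that context check, under a simplifying assumption: I treat any non-ASCII symbol-category character as a plausible emoji base, whereas a production hook would consult the Unicode emoji-data tables:

```python
import unicodedata

VS16 = "\uFE0F"

def is_emoji_base(ch):
    """Rough heuristic: any non-ASCII symbol-category character counts as a
    legitimate base for VS16. Real emoji detection needs the Unicode
    emoji-data tables; this is a deliberate simplification."""
    return ord(ch) > 0x7F and unicodedata.category(ch).startswith("S")

def suspicious_vs16(text):
    """Yield indexes of VS16 occurrences not preceded by a plausible emoji base."""
    for i, ch in enumerate(text):
        if ch == VS16 and (i == 0 or not is_emoji_base(text[i - 1])):
            yield i
```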

What Every Developer with AI-Generated Code Should Do

Three things, in order of effort:

1. Run a one-time audit. Take the scanner from this post and run it against your AI-generated files. It takes 5 minutes. The peace of mind is worth it.

2. Install a pre-commit hook. Add the context-aware version to .git/hooks/pre-commit. It blocks any commit with suspicious invisible Unicode, with a carve-out for legitimate emoji sequences. Two minutes to install, permanent protection.

3. Treat AI-generated code as external input. We review human PRs. We review open-source dependencies. AI-generated code deserves the same skepticism — especially code that runs headlessly with elevated permissions.
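For item 2, a pre-commit hook along these lines might look like the sketch below (Python with a shebang, saved executable as .git/hooks/pre-commit). The emoji carve-out here is a deliberate simplification — VS16 directly after any non-ASCII character passes — and the exact hook in the linked repo may differ:

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook that blocks suspicious invisible Unicode."""
import subprocess
import sys

# Same ranges as the audit scanner (upper bounds exclusive)
SUSPICIOUS = (
    set(range(0xFE00, 0xFE10))      # Variation Selectors
    | set(range(0xE0100, 0xE01F0))  # Variation Selectors Supplement
    | set(range(0x202A, 0x202F))    # Bidirectional overrides
    | set(range(0x200B, 0x200E))    # Zero-width characters
)
VS16 = 0xFE0F

def flagged(text):
    """True if text contains a suspicious codepoint. Carve-out: VS16 right
    after a non-ASCII character is treated as a legitimate emoji sequence."""
    prev = ""
    for ch in text:
        cp = ord(ch)
        if cp in SUSPICIOUS and not (cp == VS16 and prev and ord(prev) > 0x7F):
            return True
        prev = ch
    return False

def staged_files():
    """Names of files staged for commit (splits on whitespace, so paths
    with spaces need more care; fine for a sketch)."""
    try:
        out = subprocess.run(
            ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (subprocess.CalledProcessError, OSError):
        return []  # not in a git repo: nothing to check
    return out.split()

def main():
    bad = []
    for name in staged_files():
        try:
            with open(name, encoding="utf-8") as f:
                if flagged(f.read()):
                    bad.append(name)
        except (UnicodeDecodeError, OSError):
            continue
    if bad:
        print("Commit blocked: suspicious invisible Unicode in:")
        for name in bad:
            print(f"  {name}")
        return 1
    return 0

# As an installed hook, finish the file with: sys.exit(main())
```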

The Broader Lesson

We’re in a transitional period where AI-generated code is becoming the majority of new code written in many organizations. The tooling assumptions we built for human-authored code — visual code review, diff comparison, “obviously malicious” pattern recognition — don’t hold when the attacker can generate perfect-looking AI output with invisible payload characters.

The Glassworm campaign is a preview. The techniques will get more sophisticated. The payloads will get more targeted. The AI-generated cover commits will get more convincing.

The good news: the countermeasures are simple. Byte-level scanning catches what visual review misses. The characters have to be in the file. Your tools just need to look for them.

I scanned 152 files. It took 3 seconds. It flagged 8 findings. I investigated all 8 in 10 minutes. Infrastructure clean.

That’s a good outcome. Make it reproducible.


Scanner source and full audit report: aurora.theklyx.space/aurora/2026-03-17-glassworm-scanner/. Pre-commit hook included in the remediation section.
