Challenge 12: Unicode Homoglyph Substitution
Difficulty: Hard
Category: Signature Detection Bypass
Objective
Bypass the scanner’s signature detection by replacing one or more characters in the payload with visually identical characters from different Unicode blocks. The string looks the same to human eyes, but the substituted characters are stored as different bytes on disk.
Scanner Behavior
The scanner performs static byte-pattern matching against file contents. It searches for the following strings as exact ASCII byte sequences:
- malware (bytes: 6D 61 6C 77 61 72 65)
- virus
- trojan
- evil_payload
- dropper
- ransomware
- payload.exe
The scanner matches against specific byte values corresponding to ASCII characters. It does not perform visual similarity analysis, Unicode normalization, or homoglyph detection.
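The behavior described above can be modeled in a few lines. This is only a sketch under assumptions: the function name `scan` and the `SIGNATURES` list are hypothetical, the signature list is transcribed from this page, and the real scanner's implementation is not shown here.

```python
from typing import List

# Hypothetical model of the scanner: a raw byte search with no
# Unicode normalization and no homoglyph handling.
SIGNATURES = [
    b"malware", b"virus", b"trojan", b"evil_payload",
    b"dropper", b"ransomware", b"payload.exe",
]

def scan(data: bytes) -> List[str]:
    """Return every signature whose exact ASCII bytes occur in data."""
    return [sig.decode("ascii") for sig in SIGNATURES if sig in data]

# Plain ASCII text contains the byte pattern 6D 61 6C 77 61 72 65.
print(scan("this is malware".encode("utf-8")))        # ['malware']

# With U+0430 in place of the first 'a', the pattern no longer occurs.
print(scan("this is m\u0430lware".encode("utf-8")))   # []
```

Because the search works on bytes, anything that changes the encoded form of a single character in a signature is enough to make that signature invisible to this model.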
Rules
- Your script must produce a string that is visually indistinguishable from one of the blocked signatures when displayed.
- The file must use homoglyph characters that have different byte representations than their ASCII counterparts.
- The scanner must fail to detect the signature due to the different underlying bytes.
- Bonus: consider whether your script needs the string to be functionally equivalent or just visually equivalent.
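For the bonus question, it may help to see that a lookalike string is a different string to any code that compares or looks it up. The sketch below is illustrative only; the `handlers` table and its value are hypothetical.

```python
# Hypothetical lookup table keyed by the ASCII spelling.
handlers = {"dropper": "ascii-handler"}

ascii_name = "dropper"
# Same word with 'o' replaced by CYRILLIC SMALL LETTER O (U+043E).
lookalike_name = "dr\u043epper"

# The two strings render alike but are not equal.
print(ascii_name == lookalike_name)            # False

# A lookup keyed on the ASCII spelling does not find the lookalike.
print(handlers.get(lookalike_name))            # None

# Same number of characters, different number of UTF-8 bytes.
print(len(ascii_name), len(lookalike_name))    # 7 7
print(len(ascii_name.encode("utf-8")),
      len(lookalike_name.encode("utf-8")))     # 7 8
```

If the string only needs to be displayed, visual equivalence is enough. If something downstream must resolve it as a name, path, or key, the substituted form will not match the ASCII original.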
Hints
- Not all characters that look like a are actually a. The Cyrillic а (U+0430) looks identical to Latin a (U+0061) but has completely different bytes: D0 B0 in UTF-8 instead of 61.
- Unicode contains thousands of characters that are visual duplicates of ASCII letters across different scripts (Cyrillic, Greek, mathematical symbols, etc.).
- Replacing even one character with its homoglyph breaks the ASCII byte pattern match.
- The same idea is used in real-world phishing attacks (IDN homograph attacks), and it is effective against any scanner that matches raw bytes without checking for confusable characters.
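The hints can be checked directly by inspecting the encoded bytes. The following is a minimal sketch using only the Python standard library; the helper name `hex_bytes` is hypothetical.

```python
import unicodedata

latin = "malware"
# Same word with the first 'a' replaced by CYRILLIC SMALL LETTER A (U+0430).
lookalike = "m\u0430lware"

def hex_bytes(s: str) -> str:
    """Return the UTF-8 encoding of s as space-separated uppercase hex."""
    return s.encode("utf-8").hex(" ").upper()

print(hex_bytes(latin))       # 6D 61 6C 77 61 72 65
print(hex_bytes(lookalike))   # 6D D0 B0 6C 77 61 72 65

# The substituted character belongs to a different script.
print(unicodedata.name(latin[1]))       # LATIN SMALL LETTER A
print(unicodedata.name(lookalike[1]))   # CYRILLIC SMALL LETTER A

# One substitution is enough to break the ASCII byte pattern.
print(b"malware" in lookalike.encode("utf-8"))   # False

# NFKC normalization leaves the Cyrillic letter unchanged, so
# normalization alone would not map it back to the Latin letter.
print(unicodedata.normalize("NFKC", lookalike) == lookalike)   # True
```

The last check suggests why homoglyph detection is listed separately from Unicode normalization: catching this substitution requires a confusables mapping or a mixed-script check, not just a normalization form.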
AMSI Raccoon Lab