Understanding the binary formats that operating systems load is foundational to malware analysis and incident response. When you can read a PE or ELF header fluently, a packed dropper, a process-hollowed victim, or a rootkit-injected shared object stops being opaque; the anomalies jump out before you even open a disassembler. This article walks through both formats field by field, builds annotated hex mockups, and catalogs every common hiding spot with the detection command that exposes it.
PE Headers: Windows Portable Executable
Every Windows executable, DLL, driver, and COM object shares the same on-disk layout. The loader reads the headers, maps sections into memory, resolves imports, and transfers control to the entry point. Knowing exactly what the loader reads tells you exactly where an attacker can lie.
DOS Header: IMAGE_DOS_HEADER
The very first two bytes of every PE file are 0x4D 0x5A, the ASCII characters MZ, the initials of Mark Zbikowski, one of the original MS-DOS architects. The e_magic field at offset 0 holds this value. The loader checks it first; if it is absent the file is rejected immediately.
The field that matters most for PE parsing is e_lfanew at offset 0x3C. It is a 32-bit file offset (not an RVA: it is measured from the start of the file on disk) that points to the PE signature. In most real-world executables it is a small value such as 0x80 or 0x100, though other values are valid as long as the offset lands on a PE signature. Malware packers sometimes push it very far forward and pack data in the space between the DOS stub and the PE signature.
Between the DOS header and the PE signature sits the DOS stub, a tiny 16-bit program that prints “This program cannot be run in DOS mode” and exits with an error code. It is vestigial on modern systems but the loader still skips over it. Attackers occasionally replace stub bytes with shellcode, though this is uncommon because nothing executes it on NT systems.
PE Signature
At the offset given by e_lfanew you find four bytes: 0x50 0x45 0x00 0x00, i.e. PE\0\0 in ASCII. The two trailing null bytes are required. If they are wrong the loader refuses to map the file.
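The two checks described so far (MZ magic, then the PE signature at e_lfanew) can be sketched in a few lines of standard-library Python. The function name find_pe_header and the synthetic buffer are illustrative assumptions, not part of any particular tool.

```python
import struct

def find_pe_header(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ magic")
    # e_lfanew: 32-bit little-endian file offset stored at 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature at e_lfanew")
    return e_lfanew

# Synthetic example (not a loadable file): e_lfanew = 0x80
sample = bytearray(0x84)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\x00\x00"
print(hex(find_pe_header(bytes(sample))))  # 0x80
```

Every later PE offset in this article is computed relative to the value this function returns.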
COFF File Header: IMAGE_FILE_HEADER
Immediately following the PE signature is the 20-byte COFF File Header. Its fields:
- Machine: target architecture. 0x014C = x86 (i386), 0x8664 = x86-64 (AMD64), 0x01C4 = ARM Thumb-2, 0xAA64 = ARM64 (AArch64). A mismatch between this field and what the host can execute causes an immediate load failure.
- NumberOfSections: count of IMAGE_SECTION_HEADER entries that follow the optional header. Legitimate files rarely exceed 8–12 sections; highly packed or obfuscated samples sometimes have 1 (the entire binary is a single blob) or 20+ (section injection).
- TimeDateStamp: Unix epoch timestamp of when the linker produced the file. Valuable for threat intelligence clustering (same compiler run = same build infrastructure), but trivially forged. A value of zero (0x00000000, which decodes to 1 January 1970) is a common red flag, and dates far in the past or future (for example 2099) suggest deliberate tampering. Reproducible-build toolchains also write a hash here instead of a time, so an implausible date alone is not proof of forgery.
- SizeOfOptionalHeader: size of the next header. For PE32 this is typically 0x00E0, for PE32+ 0x00F0. An unexpected value breaks parsing in some tools.
- Characteristics: bitmask of file properties. IMAGE_FILE_EXECUTABLE_IMAGE (0x0002) marks a valid, runnable image and is set on both EXEs and DLLs; IMAGE_FILE_DLL (0x2000) additionally marks a DLL. A DLL therefore normally has both bits set. The anomaly to look for is IMAGE_FILE_DLL on a file presented as an EXE, or a missing IMAGE_FILE_EXECUTABLE_IMAGE bit on anything that is supposed to load.
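As a rough illustration of the field layout above, the COFF File Header can be unpacked with a single struct format string. The helper name parse_coff_header, the returned dictionary keys, and the synthetic input are assumptions made for this sketch.

```python
import struct
from datetime import datetime, timezone

MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0x01C4: "ARM Thumb-2", 0xAA64: "ARM64"}

def parse_coff_header(data: bytes, pe_offset: int) -> dict:
    """Decode the 20-byte COFF File Header that follows the PE signature."""
    (machine, nsections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "number_of_sections": nsections,
        "timestamp": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "size_of_optional_header": opt_size,
        "executable_image": bool(characteristics & 0x0002),
        "dll": bool(characteristics & 0x2000),
    }

# Synthetic header: x86-64, 3 sections, zeroed timestamp, PE32+ optional header size
blob = b"PE\x00\x00" + struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0x00F0, 0x2022)
info = parse_coff_header(blob, 0)
print(info["machine"], info["number_of_sections"], info["timestamp"].year)  # x86-64 3 1970
```

The zeroed timestamp decodes to 1970, which is exactly the pattern worth flagging during triage.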
Optional Header: IMAGE_OPTIONAL_HEADER
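Despite the name, this header is mandatory for executable images. Its fixed fields occupy 96 bytes (PE32) or 112 bytes (PE32+), followed by the data directories (normally 128 bytes), for a typical total of 224 or 240 bytes. It is the largest header and carries the most analysis-relevant fields.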
Magic identifies the sub-format: 0x010B = PE32 (32-bit), 0x020B = PE32+ (64-bit). A PE32+ binary with a 32-bit Machine field is a structural impossibility and should trigger immediate suspicion.
AddressOfEntryPoint (AEP) is a Relative Virtual Address (RVA), that is, an offset from ImageBase, to which the loader transfers control after setup. For DLLs this is DllMain; for EXEs it is the CRT startup stub before WinMain/main. A critical check: the section containing AEP should be executable (characteristics flag 0x20000000) and ideally not writable. If AEP points into a writable section (.data, .bss, or a packed section marked RW), that strongly suggests the loader stub decrypts code into that region at runtime, which is classic packer or shellcode loader behavior.
ImageBase is the preferred load address. The 32-bit defaults are 0x00400000 for EXEs and 0x10000000 for DLLs (0x140000000 and 0x180000000 for 64-bit MSVC builds), though ASLR overrides this at load time. Malware sometimes sets ImageBase to zero and relies entirely on the base relocation table, or sets it to a value that collides with a known system DLL to force that DLL to relocate.
SectionAlignment and FileAlignment control how sections are padded in memory vs. on disk respectively. SectionAlignment is almost always 0x1000 (page size); FileAlignment is typically 0x200 (sector size). If SectionAlignment < FileAlignment the loader rejects the file. If they are set equal (e.g., both 0x1000), the binary is a “raw” or “aligned” PE — sometimes used by shellcode loaders that want on-disk layout identical to in-memory layout.
SizeOfImage must equal the total in-memory footprint of the binary rounded up to SectionAlignment. Packers that map additional regions at runtime sometimes set this value larger than what the section headers account for, reserving virtual address space for dynamically allocated code.
DllCharacteristics is the security feature bitmask. The flags every analyst should know:
| Flag | Value | Meaning |
|---|---|---|
| ASLR (DYNAMIC_BASE) | 0x0040 | Binary can be loaded at a random base address |
| NX (NX_COMPAT) | 0x0100 | Compatible with Data Execution Prevention |
| NO_SEH | 0x0400 | No Structured Exception Handling used |
| FORCE_INTEGRITY | 0x0080 | Code integrity checks enforced |
| CFG | 0x4000 | Control Flow Guard enabled |
| TERMINAL_SERVER_AWARE | 0x8000 | Aware of terminal server session |
Legitimate modern binaries compiled with /DYNAMICBASE /NXCOMPAT will have both ASLR and NX set. Malware compiled with older or custom toolchains frequently has neither. The absence of CFG in a binary claiming to be a Windows system component is an anomaly worth investigating.
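A small decoder makes the bitmask in the table above easier to read during triage. This is a sketch: the function names and the choice of ASLR, NX, and CFG as the "expected" mitigations are assumptions for illustration.

```python
DLL_CHARACTERISTICS = {
    0x0040: "DYNAMIC_BASE",
    0x0080: "FORCE_INTEGRITY",
    0x0100: "NX_COMPAT",
    0x0400: "NO_SEH",
    0x4000: "GUARD_CF",
    0x8000: "TERMINAL_SERVER_AWARE",
}

def decode_dll_characteristics(value: int) -> list:
    """Return the names of the flags from the table that are set in value."""
    return [name for bit, name in sorted(DLL_CHARACTERISTICS.items()) if value & bit]

def missing_mitigations(value: int) -> list:
    """List which of ASLR, NX, and CFG are absent."""
    wanted = ((0x0040, "ASLR"), (0x0100, "NX"), (0x4000, "CFG"))
    return [name for bit, name in wanted if not value & bit]

print(decode_dll_characteristics(0x8140))  # ['DYNAMIC_BASE', 'NX_COMPAT', 'TERMINAL_SERVER_AWARE']
print(missing_mitigations(0x8140))         # ['CFG']
```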
Data Directories
The final 128 bytes of the Optional Header are the Data Directory, with 16 entries of 8 bytes each (RVA + Size), each pointing to a specific structure within the mapped binary. Not all entries are used; unused entries have both fields zeroed. The most security-relevant entries:
- Import Table (entry 1): points to the IMAGE_IMPORT_DESCRIPTOR array. Every imported DLL has an entry here listing the API names or ordinals to resolve. This is the first thing analysts check: what APIs does this binary call? kernel32.dll with only LoadLibraryA and GetProcAddress means all other API resolution happens at runtime, which is typical of loaders and shellcode runners.
- Export Table (entry 0): present in DLLs, lists functions other binaries can import. Malware DLLs (sideloading payloads, proxy DLLs) may export a single function or forward all exports to the legitimate DLL they are masquerading as.
- Resource Table (entry 2): the .rsrc section tree: icons, version info, string tables, dialogs. A high-entropy RCDATA or BITMAP resource of several hundred kilobytes is a classic sign of an encrypted payload stored for runtime extraction.
- Certificate Table (entry 4): points to the Authenticode signature (WIN_CERTIFICATE structure) appended to the file. Unlike the other entries, its address field is a file offset rather than an RVA, because this data is not mapped into memory; it lives in the file overlay. Attackers can copy a legitimate signature's certificate data onto a patched or unrelated binary; sigcheck -a and AuthentiCheck detect the mismatch.
- Base Relocation Table (entry 5): required when ASLR loads the binary at a non-preferred base address. Binaries linked without /FIXED include this; malware that patches absolute addresses into its code may strip the relocation table entirely, breaking ASLR.
- TLS Directory (entry 9): Thread Local Storage. Contains an array of callback function pointers executed before AddressOfEntryPoint. This is a favorite anti-analysis trick: the TLS callback runs before a breakpoint at the entry point can fire. pescan -t and capa detect TLS callbacks.
- Load Config Directory (entry 10): among other things, holds the address of the /GS security cookie and the CFG function table. Its absence in a binary claiming CFG support is a contradiction.
- IAT (entry 12): Import Address Table. On disk, its entries hold the RVAs of hint/name records (or ordinals); in memory the loader overwrites them with the resolved function addresses. Memory scanners check each in-memory entry against the export it should resolve to; entries pointing outside the expected module indicate IAT hooking.
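To make the directory layout concrete, the following sketch walks the 16 entries using only the struct module. It assumes the usual fixed-field sizes (96 bytes for PE32, 112 for PE32+); the function name, the short entry labels, and the synthetic buffer are illustrative choices rather than an established API.

```python
import struct

NAMES = ["Export", "Import", "Resource", "Exception", "Certificate", "BaseReloc",
         "Debug", "Architecture", "GlobalPtr", "TLS", "LoadConfig", "BoundImport",
         "IAT", "DelayImport", "CLR", "Reserved"]

def read_data_directories(data: bytes, pe_offset: int) -> dict:
    """Return {name: (address, size)} for every non-empty data directory entry."""
    opt = pe_offset + 4 + 20                      # signature + COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x010B:                           # PE32
        dir_start = opt + 96
    elif magic == 0x020B:                         # PE32+
        dir_start = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes is the last fixed field, right before the directory
    (count,) = struct.unpack_from("<I", data, dir_start - 4)
    found = {}
    for i in range(min(count, 16)):
        addr, size = struct.unpack_from("<II", data, dir_start + i * 8)
        if addr or size:
            found[NAMES[i]] = (addr, size)
    return found

# Synthetic PE32+ headers with an import table and a TLS directory
buf = bytearray(4 + 20 + 112 + 128)
buf[0:4] = b"PE\x00\x00"
struct.pack_into("<H", buf, 24, 0x020B)
struct.pack_into("<I", buf, 24 + 108, 16)
struct.pack_into("<II", buf, 24 + 112 + 1 * 8, 0x2000, 0x50)
struct.pack_into("<II", buf, 24 + 112 + 9 * 8, 0x3000, 0x28)
dirs = read_data_directories(bytes(buf), 0)
print(sorted(dirs))  # ['Import', 'TLS']
```

A non-empty TLS entry in the result is the cue to inspect the callbacks before setting any breakpoint at the entry point.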
Section Headers: IMAGE_SECTION_HEADER
Each section header is 40 bytes. The fields:
- Name: 8 bytes, null-padded (not null-terminated if exactly 8 chars). Conventional names are metadata, not enforced by the loader: the loader ignores the name entirely and only looks at the characteristics and addresses.
- VirtualSize: size of the section in memory.
- VirtualAddress: RVA where the section is mapped.
- SizeOfRawData: size of the section on disk (must be aligned to FileAlignment).
- PointerToRawData: file offset of the section's raw bytes.
- Characteristics: bitmask: 0x20000000 = execute, 0x40000000 = read, 0x80000000 = write.
Conventional sections and their expected characteristics:
| Section | Expected flags | Notes |
|---|---|---|
| .text | Execute + Read | Code. Should never be writable. |
| .data | Read + Write | Initialized global variables. |
| .rdata | Read only | Constants, import/export tables, strings. |
| .rsrc | Read only | Resources. |
| .reloc | Read only | Base relocation table. |
| .bss | Read + Write | Uninitialized data (often merged into .data). |
Red flags in section headers:
- Write + Execute simultaneously (0xA0000000, or 0xE0000000 together with Read): ordinary linker output almost never needs both. Classic sign of self-modifying shellcode, a packer stub, or process injection.
- VirtualSize ≫ SizeOfRawData: the section is much larger in memory than on disk. The extra space is zeroed by the loader and then filled at runtime, the hallmark of an unpacking stub.
- High entropy (> 7.0 bits/byte): sections of random-looking data indicate compression or encryption. Legitimate code averages 5.5–6.5; compressed/encrypted blobs approach 8.0.
- Blank or control-character names: the loader doesn't care, but tools that rely on section names for heuristics skip unnamed sections.
- Extra sections beyond what the linker produces: injected by packers or hollowing tools, usually at the end of the section table.
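The first three red flags are simple enough to compute by hand. The sketch below uses a standard Shannon entropy calculation; the function names and the 4x size ratio are assumptions chosen for illustration, and a legitimate .bss section (no raw data at all) will also trip the size check, so treat the output as a prompt for a closer look rather than a verdict.

```python
import math
from collections import Counter

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

def section_red_flags(virtual_size, size_of_raw_data, characteristics, raw=b""):
    """Return a list of heuristic findings for one section."""
    flags = []
    if characteristics & IMAGE_SCN_MEM_WRITE and characteristics & IMAGE_SCN_MEM_EXECUTE:
        flags.append("write+execute")
    # "Much larger" is a judgment call; the 4x ratio is an assumption
    if virtual_size > 4 * size_of_raw_data:
        flags.append("virtual size far exceeds raw size")
    if shannon_entropy(raw) > 7.0:
        flags.append("high entropy")
    return flags

print(section_red_flags(0x5000, 0x200, 0xE0000000, bytes(range(256)) * 2))
```

With pefile, the inputs would come from each entry of the parsed section list; here they are passed in directly so the sketch has no third-party dependency.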
PE Header Diagram
PE Hex View Mockup
The annotated hex below shows the first 0x90 bytes of a minimal PE. The MZ magic at 0x00, the e_lfanew pointer at 0x3C pointing to 0x80, and the PE signature at 0x80 are the three anchors every analyst reads first.
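A comparable view can be generated rather than copied from a real sample. The sketch below builds a 0x90-byte buffer containing only the three anchors plus the first COFF fields and prints it as an annotated dump. All other DOS header fields are left zeroed, and the Machine, section count, and timestamp values are arbitrary placeholders, so this is a teaching aid and not a loadable file.

```python
import struct

def build_mockup() -> bytes:
    """First 0x90 bytes of a schematic PE: MZ, e_lfanew, PE signature, start of COFF."""
    buf = bytearray(0x90)
    buf[0x00:0x02] = b"MZ"                         # e_magic
    struct.pack_into("<I", buf, 0x3C, 0x80)        # e_lfanew -> 0x80
    buf[0x80:0x84] = b"PE\x00\x00"                 # PE signature
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable (placeholders)
    struct.pack_into("<HHII", buf, 0x84, 0x8664, 3, 0x5F000000, 0)
    return bytes(buf)

NOTES = {
    0x00: "e_magic = 'MZ'",
    0x30: "e_lfanew at 0x3C = 0x00000080",
    0x80: "'PE\\0\\0', Machine = 0x8664, NumberOfSections = 3, TimeDateStamp",
}

def hexdump(data: bytes, notes: dict) -> str:
    lines = []
    for off in range(0, len(data), 16):
        row = " ".join(f"{b:02X}" for b in data[off:off + 16])
        note = notes.get(off)
        lines.append(f"{off:08X}  {row}" + (f"  ; {note}" if note else ""))
    return "\n".join(lines)

print(hexdump(build_mockup(), NOTES))
```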
ELF Headers — Linux / Android / Embedded
The Executable and Linkable Format is the standard binary format on Linux, Android (native code), BSD, and most embedded systems. Its design is more orthogonal than PE: it separates the runtime view (program headers / segments) from the linker view (section headers), and the two can be partially inconsistent — a property malware exploits.
ELF Ident — The First 16 Bytes
The binary opens with a 16-byte identification array (e_ident) that is architecture-independent:
| Offset | Length | Field | Typical value |
|---|---|---|---|
| 0 | 4 | Magic | 7F 45 4C 46 (\x7fELF) |
| 4 | 1 | Class | 01 = ELF32, 02 = ELF64 |
| 5 | 1 | Data | 01 = LSB (little-endian), 02 = MSB |
| 6 | 1 | Version | 01 (always 1) |
| 7 | 1 | OS/ABI | 00 = System V, 03 = Linux, 09 = FreeBSD |
| 8 | 1 | ABI version | 00 (unused by most OSes) |
| 9 | 7 | Padding | all zeros |
The magic bytes 7F 45 4C 46 are a non-printable byte followed by ELF in ASCII. The kernel checks these first and returns ENOEXEC if they are wrong. Because that check happens only at exec time, malware can scramble its own in-memory ELF header once it is running: the mappings are already established and nothing validates the header again, but memory-dumping tools that expect a well-formed header are disrupted.
ELF Header Fields (Elf64_Ehdr)
After the 16-byte ident, the remaining header fields are architecture-width-dependent. For ELF64:
- e_type: binary type. 0x0002 = ET_EXEC (position-dependent executable), 0x0003 = ET_DYN (position-independent executable or shared library), 0x0004 = ET_CORE (core dump). Modern Linux binaries compiled with -fPIE -pie are ET_DYN even when they are executables, not shared libraries; this is intentional for ASLR support.
- e_machine: target architecture: 0x0003 = x86, 0x003E = x86-64, 0x0028 = ARM (32-bit), 0x00B7 = AArch64, 0x00F3 = RISC-V.
- e_entry: virtual address of the entry point (_start, which calls __libc_start_main). Stripped binaries still have a valid entry address; it is one of the first symbols reconstructed during analysis.
- e_phoff: file offset of the Program Header Table. For a standard ELF64 this is 0x40 (immediately after the ELF header).
- e_shoff: file offset of the Section Header Table. Malware routinely zeros this field to break static analysis tools that rely on sections; the binary still executes normally because the runtime loader only needs program headers.
- e_phnum / e_shnum: counts of program headers and section headers. Injecting an extra program header requires incrementing e_phnum and finding room for the new entry in the table. Because that usually means shifting whatever follows the table, many infectors instead convert an existing PT_NOTE entry into PT_LOAD and leave e_phnum unchanged.
- e_shstrndx: index of the section containing section name strings. If e_shoff is zero, this field is meaningless.
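The fields above map directly onto a struct format string. This sketch handles only little-endian ELF64; the namedtuple, the function names, and the two anomaly checks are assumptions for illustration.

```python
import struct
from collections import namedtuple

# Elf64_Ehdr: 16-byte ident followed by the width-dependent fields (64 bytes total)
EHDR_FMT = "<16sHHIQQQIHHHHHH"
Ehdr = namedtuple("Ehdr", "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
                          "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx")

def parse_elf64_header(data: bytes) -> Ehdr:
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    return Ehdr._make(struct.unpack_from(EHDR_FMT, data, 0))

def header_anomalies(h: Ehdr) -> list:
    notes = []
    if h.e_shoff == 0 or h.e_shnum == 0:
        notes.append("section header table zeroed or absent")
    if h.e_phoff != 0x40:
        notes.append("program header table not directly after the ELF header")
    return notes

# Synthetic ET_DYN x86-64 header with e_shoff zeroed
ident = b"\x7fELF\x02\x01\x01" + bytes(9)
raw = struct.pack(EHDR_FMT, ident, 3, 0x3E, 1, 0x1040, 0x40, 0, 0, 64, 56, 13, 64, 0, 0)
hdr = parse_elf64_header(raw)
print(hex(hdr.e_machine), header_anomalies(hdr))
```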
Program Headers (Segments — Runtime View)
The program header table is what the kernel and dynamic linker read at load time. Each entry describes a segment — a contiguous region of the file mapped into a contiguous region of memory. The segment types most relevant to analysis:
PT_LOAD: loadable segment. The kernel maps each one into memory. Binaries from older toolchains have two: one R-X (code, including .text) and one RW- (data, including .data/.bss); recent binutils typically emit four (read-only headers, R-X code, read-only data, RW- data). Each entry specifies p_offset (file offset), p_vaddr (virtual address), p_filesz (bytes from file), p_memsz (bytes in memory, which may be larger for BSS), and p_flags (PF_R=4, PF_W=2, PF_X=1). A PT_LOAD with flags RWX (7) is a strong red flag: ordinary compiler output does not need a simultaneously writable and executable segment.
PT_DYNAMIC — points to the .dynamic section, which contains the dynamic linking metadata (DT_NEEDED entries, GOT/PLT addresses, symbol table pointers). Without this segment the dynamic linker cannot resolve imports.
PT_INTERP — path to the dynamic linker, typically /lib64/ld-linux-x86-64.so.2 or /lib/ld-linux.so.2 for 32-bit. Statically linked binaries have no PT_INTERP. An unusual path here (e.g., /tmp/.x/ld.so or a path in /dev/shm) means the attacker is supplying a custom loader — a sophisticated technique used by some rootkits.
PT_GNU_STACK — the p_flags of this segment control the NX policy for the stack. A legitimate binary has RW- (flags=6, no execute). If PF_X is set (flags=7), the stack is executable — either the binary uses intentional stack execution (rare, legacy) or the attacker cleared the NX protection to enable shellcode on the stack. checksec and readelf -l both show this.
PT_GNU_RELRO — marks a range of the address space as read-only after dynamic linking completes (.got, parts of .data). Its absence means GOT entries remain writable throughout execution — enabling GOT overwrite attacks.
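The segment checks described in this section can be sketched as a single pass over the program header table. The code assumes little-endian ELF64 with the standard 56-byte entry size; the function name and the wording of the findings are illustrative, not taken from readelf or checksec.

```python
import struct

PT_LOAD, PT_INTERP = 1, 3
PT_GNU_STACK, PT_GNU_RELRO = 0x6474E551, 0x6474E552
PF_X, PF_W = 1, 2
PHDR = struct.Struct("<IIQQQQQQ")  # Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz, memsz, align

def segment_findings(data: bytes, e_phoff: int, e_phnum: int) -> list:
    findings, seen = [], set()
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, _vaddr, _paddr,
         p_filesz, _memsz, _align) = PHDR.unpack_from(data, e_phoff + i * PHDR.size)
        seen.add(p_type)
        if p_type == PT_LOAD and p_flags & PF_W and p_flags & PF_X:
            findings.append(f"segment {i}: PT_LOAD is writable and executable")
        elif p_type == PT_GNU_STACK and p_flags & PF_X:
            findings.append("executable stack")
        elif p_type == PT_INTERP:
            path = data[p_offset:p_offset + p_filesz].rstrip(b"\x00")
            findings.append("interpreter: " + path.decode(errors="replace"))
    if PT_GNU_RELRO not in seen:
        findings.append("no PT_GNU_RELRO")
    return findings

# Synthetic table: one RWX PT_LOAD and an executable PT_GNU_STACK
table = PHDR.pack(PT_LOAD, 7, 0, 0, 0, 0, 0, 0x1000) + PHDR.pack(PT_GNU_STACK, 7, 0, 0, 0, 0, 0, 0x10)
for finding in segment_findings(table, 0, 2):
    print(finding)
```

In practice e_phoff and e_phnum come from the ELF header, and the interpreter path printed for PT_INTERP is the value to compare against the expected system loader.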
Section Headers (Linker View)
Section headers are a map of the binary’s internal structure for linkers and debuggers. The kernel does not need them at runtime. This means stripping them (strip --strip-all) produces a fully functional binary that is harder to analyze. Malware distributed in the wild is almost always stripped.
Important sections:
- .text: machine code. SHT_PROGBITS, flags AX (alloc + execute).
- .rodata: read-only data: string literals, jump tables, constant arrays.
- .data / .bss: initialized and uninitialized writable data.
- .dynamic: the Elf64_Dyn array driving dynamic linking.
- .dynsym + .dynstr: the minimal symbol table needed for dynamic linking. Always present in dynamically linked binaries (cannot be stripped without breaking the binary).
- .symtab + .strtab: the full symbol table. Stripped in production/malware releases. Their presence is a debugging artifact.
- .got: Global Offset Table. At load time the dynamic linker patches pointers here for each resolved symbol. GOT overwrite (writing a function pointer to redirect execution) is a well-known post-exploitation technique.
- .plt: Procedure Linkage Table. A table of small stubs; each stub either jumps through the GOT (resolved) or calls the lazy binding resolver. The PLT is in the executable segment and is never writable.
- .rela.plt / .rela.dyn: relocation tables describing which GOT slots to patch and with which symbols.
- .init_array: array of constructor function pointers called before main(). Analogous to PE TLS callbacks. Malware that installs itself as a shared library can hide in .init_array entries.
Dynamic Linking Deep-Dive
The PT_DYNAMIC segment contains a flat array of (tag, value) pairs called Elf64_Dyn:
- DT_NEEDED: one entry per required shared library (libc.so.6, libpthread.so.0, etc.). A binary with no DT_NEEDED entries is statically linked. Malware often has minimal DT_NEEDED entries or none, carrying all code statically.
- DT_RPATH / DT_RUNPATH: colon-separated list of paths searched for shared libraries before the system defaults. If this contains a writable path (e.g., ., /tmp, /home/user) an attacker can plant a malicious library with the same DT_SONAME as a legitimate one and hijack every import.
- DT_SONAME: the library's own name, which other binaries reference in their DT_NEEDED entries.
How PLT/GOT lazy binding works: on the first call to printf, the PLT stub jumps through the GOT entry for printf. Initially that GOT entry points back into the PLT resolver stub, which calls _dl_runtime_resolve. The dynamic linker finds the real printf address and patches the GOT entry. Every subsequent call jumps directly to libc. With lazy binding the GOT must stay writable during program execution, which makes it a target for any attacker with a write primitive. Full RELRO (linking with -z relro -z now) avoids this by resolving every symbol at startup and then remapping the GOT read-only.
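The control flow is easier to follow in a toy model. The class below is purely a simulation in Python: a dictionary stands in for the GOT, closures stand in for the resolver stub, and none of the names correspond to real dynamic linker internals.

```python
class ToyProcess:
    """Simulates lazy binding: GOT slots start out pointing at a resolver."""

    def __init__(self, libc: dict):
        self.libc = libc                  # symbol name -> "real" function
        self.resolutions = 0              # how many times the resolver ran
        # Before first use, every GOT slot points back at the resolver stub
        self.got = {name: self._resolver_for(name) for name in libc}

    def _resolver_for(self, name):
        def resolve(*args):
            self.resolutions += 1
            self.got[name] = self.libc[name]   # patch the GOT slot
            return self.got[name](*args)       # then complete the original call
        return resolve

    def plt_call(self, name, *args):
        # A PLT stub is just an indirect jump through the GOT slot
        return self.got[name](*args)

proc = ToyProcess({"strlen": len})
print(proc.plt_call("strlen", "abc"), proc.resolutions)  # 3 1
print(proc.plt_call("strlen", "abcd"), proc.resolutions)  # 4 1

# A write primitive against the still-writable GOT redirects every later call
proc.got["strlen"] = lambda s: -1
print(proc.plt_call("strlen", "abc"))  # -1
```

The last three lines are the GOT overwrite in miniature: the caller still goes through the same PLT stub, but the slot now points somewhere the attacker chose.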
ELF File Structure Layout
The diagram below shows the complete ELF64 file layout from byte 0 to the Section Header Table, with every major field colour-coded by region. The right-side bar names each region as the loader and linker see it.
ELF Header Hex Mockup
The 64 bytes below are the complete ELF64 header for a typical x86-64 ET_DYN binary with a program header table at offset 0x40:
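One way to reproduce such a header is to pack it from field values and dump it. In the sketch below the entry point, section header offset, and header counts are placeholder values chosen for illustration; only the magic, class, data encoding, type, machine, and the 0x40 program header offset follow from the description above.

```python
import struct

def build_elf64_header() -> bytes:
    """Pack a 64-byte Elf64_Ehdr for a schematic x86-64 ET_DYN binary."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0, 0]) + bytes(7)  # ELF64, LSB, v1, System V
    return struct.pack(
        "<16sHHIQQQIHHHHHH",
        ident,
        3,        # e_type = ET_DYN
        0x3E,     # e_machine = x86-64
        1,        # e_version
        0x1040,   # e_entry (placeholder)
        0x40,     # e_phoff: program headers right after this header
        0x3000,   # e_shoff (placeholder)
        0,        # e_flags
        64,       # e_ehsize
        56,       # e_phentsize
        13,       # e_phnum (placeholder)
        64,       # e_shentsize
        30,       # e_shnum (placeholder)
        29,       # e_shstrndx (placeholder)
    )

header = build_elf64_header()
for off in range(0, len(header), 16):
    print(f"{off:08X}  " + " ".join(f"{b:02X}" for b in header[off:off + 16]))
```

Reading the output row by row against the field list is a quick way to memorise where e_type, e_machine, e_entry, and e_phoff sit.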
Where Malware Hides — and How to Detect It
PE Hiding Spots
1. Overlay / Appended Data
After the last section’s raw data ends on disk, the PE format has no explicit end-of-file marker. Bytes appended beyond the last section are called the overlay. Installers legitimately use this (e.g., self-extracting archives append a ZIP blob), but malware uses it to store encrypted shellcode, config blobs, or secondary payloads that are loaded at runtime via SetFilePointer + ReadFile at the overlay offset.
Detection: diec sample.exe, binwalk sample.exe, or python3 -c "import pefile; p=pefile.PE('sample.exe'); print(p.get_overlay_data_start_offset())". Any overlay > a few kilobytes in a binary that isn’t a self-extractor deserves scrutiny. Measure entropy of the overlay — random-looking = encrypted.
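For environments without pefile, the overlay start can be approximated from the section table alone: it is the highest PointerToRawData + SizeOfRawData across all sections. This is a simplified sketch (the function names are hypothetical and it ignores alignment quirks that pefile handles); note that an Authenticode signature also lives past the last section, so a signed file always reports an overlay.

```python
import struct

def overlay_offset(data: bytes):
    """Return the file offset where appended data starts, or None if there is none."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    (nsections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size           # first IMAGE_SECTION_HEADER
    end = 0
    for i in range(nsections):
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20 of each entry
        raw_size, raw_ptr = struct.unpack_from("<II", data, table + i * 40 + 16)
        if raw_size:
            end = max(end, raw_ptr + raw_size)
    return end if 0 < end < len(data) else None

def tiny_pe() -> bytes:
    """Hypothetical one-section layout used only to exercise overlay_offset."""
    buf = bytearray(0x400)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)
    buf[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<HHIIIHH", buf, 0x84, 0x8664, 1, 0, 0, 0, 0xF0, 0x0022)
    # Section table at 0x80 + 24 + 0xF0 = 0x188; raw data spans 0x200..0x400
    struct.pack_into("<8sIIII", buf, 0x188, b".text", 0x100, 0x1000, 0x200, 0x200)
    return bytes(buf)

print(overlay_offset(tiny_pe()))                    # None
print(hex(overlay_offset(tiny_pe() + b"PAYLOAD")))  # 0x400
```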
2. Section Slack Space
When SizeOfRawData is larger than VirtualSize, the bytes between the end of the section's real content and the next FileAlignment boundary are padding, normally zeroed on disk. Malware can hide data in this slack. The technique is subtle: the data is present on disk but invisible to viewers that display only the first VirtualSize bytes of each section.
Detection: pe-sieve --pid <PID> compares in-memory sections to disk; hollows-hunter flags modified regions. Manually: pefile in Python can dump raw section bytes beyond the virtual size.
3. TLS Callbacks
Thread Local Storage callbacks (AddressOfCallBacks in the TLS directory) are function pointers invoked by the loader for every thread creation — including the initial thread, before AddressOfEntryPoint. Setting a breakpoint at OEP in a debugger will miss any code in a TLS callback entirely. Packers and anti-debug stubs routinely live here.
Detection: pescan -t sample.exe lists TLS callbacks. capa sample.exe flags the capability. When debugging, configure the debugger to break on TLS callbacks rather than only at the entry point; x64dbg exposes this as a TLS Callbacks option in its event settings.
4. Resource Section Abuse (.rsrc)
The resource section tree is a three-level hierarchy (type → name → language). RCDATA and BITMAP resource types accept arbitrary binary data. Malware stores encrypted payloads here — the resource subsystem is not scanned by many AV engines, and the data is conveniently extracted at runtime via FindResource + LoadResource + LockResource.
Detection: ResourceHacker or peframe sample.exe show resource content. Measure entropy per resource — a BITMAP entry with entropy > 7.5 and no valid BMP header is a payload. binwalk -e can extract embedded blobs.
5. Extra Injected Sections
After hollowing or injection, tools like Process Hacker reveal sections in memory that have no disk counterpart, or sections with names not present in the original binary. On disk, packers append new sections at the end of the section table.
Detection: iterate sections with pefile and flag any with (Characteristics & 0xA0000000) == 0xA0000000 (write + execute), entropy > 7.0, or names not in the standard set. Compare disk section count to in-memory section count.
6. Import Table Spoofing / Stomping
A loader stub that does all API resolution at runtime only needs two imports: LoadLibraryA and GetProcAddress. The import table will be almost empty, with one or two DLL entries. Everything else the binary calls is resolved at runtime, either through those two functions or, when even they are absent, by walking the PEB to find loaded modules and parsing their export tables manually (the PEB walk described in the assembly article).
Detection: pestudio Imports view. Fewer than five imports total with LoadLibraryA + GetProcAddress present is a strong indicator of runtime API resolution. capa detects the PEB walk capability directly.
7. Authenticode Stomping
The CertificateTable data directory points to a WIN_CERTIFICATE structure appended to the file. Because this data is not mapped into memory and is itself excluded from the Authenticode hash, an attacker can copy the certificate block from a legitimately signed binary and append it to a malicious one. The copied signature does not verify, since the signed hash no longer matches the file contents, but tools that only check for the presence of a certificate table or read the signer name without validating the hash are fooled.
Detection: sigcheck -a sample.exe (Sysinternals) shows both the certificate subject and whether the file hash matches. AuthentiCheck specifically tests hash validity independent of chain trust.
8. PE Header Erasure After Loading
Some loaders zero out the MZ/PE signature in memory after successfully mapping the binary. This breaks any tool that tries to dump and re-parse the binary from memory (e.g., procdump), since the result will not be a valid PE.
Detection: pe-sieve --pid <PID> compares disk vs. memory and reports HEADER_ERASED findings. The fix is to reconstruct the PE header from the section information still visible in memory.
9. Section Name vs. Characteristics Mismatch
The loader ignores section names — only Characteristics flags matter for memory permissions. A packer can name a writable+executable section .text to fool analysts and tools that key off the name. Conversely, it can mark the actual .text section as non-executable to hide code in .data.
Detection: compare every section name to its expected characteristics. Flag any .text with write permission or any .data with execute permission.
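That comparison is a few lines of code once section names and flags are in hand. The sketch below takes (name, Characteristics) pairs directly; the function name, the set of data section names, and the message strings are assumptions for illustration.

```python
MEM_EXECUTE, MEM_WRITE = 0x20000000, 0x80000000
DATA_NAMES = {".data", ".rdata", ".rsrc", ".reloc", ".bss"}

def name_flag_mismatches(sections):
    """sections: iterable of (raw 8-byte name, Characteristics). Returns findings."""
    findings = []
    for raw_name, characteristics in sections:
        name = raw_name.rstrip(b"\x00").decode("ascii", errors="replace")
        if name == ".text":
            if characteristics & MEM_WRITE:
                findings.append((name, "code section is writable"))
            if not characteristics & MEM_EXECUTE:
                findings.append((name, "code section is not executable"))
        elif name in DATA_NAMES and characteristics & MEM_EXECUTE:
            findings.append((name, "data section is executable"))
    return findings

sections = [
    (b".text\x00\x00\x00", 0xE0000020),   # RWX code
    (b".data\x00\x00\x00", 0xC0000040),   # normal RW data
    (b".rsrc\x00\x00\x00", 0x60000040),   # resources marked executable
]
for name, problem in name_flag_mismatches(sections):
    print(name, "->", problem)
```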
10. Checksum Manipulation
The CheckSum field in the Optional Header is a simple additive checksum computed over the entire file. The Windows loader ignores it for ordinary user-mode EXEs but validates it for drivers, DLLs loaded at boot time, and DLLs loaded into critical system processes. A zero or incorrect checksum in a binary claiming to be a Windows system file is a reliable indicator of tampering.
Detection: MapFileAndCheckSum (Win32 API) or pe-bear recomputes the correct checksum and flags mismatches. All legitimate Windows system DLLs have a valid, non-zero checksum.
ELF Hiding Spots
1. Stripped Symbols
strip --strip-all removes .symtab, .strtab, and debug sections. All function names become FUN_ addresses in Ghidra or sub_ in IDA. The binary executes identically — only analysis is impaired.
Detection: file sample reports “stripped”. readelf -S sample | grep -E 'symtab|strtab' returns nothing. The .dynsym/.dynstr sections remain (they cannot be stripped) and provide a minimal set of exported symbol names as starting points.
2. PT_GNU_STACK RWX
An executable stack allows shellcode injection via stack buffer overflows. Set with execstack -s sample or by omitting the PT_GNU_STACK segment and relying on the kernel default (which on some older kernels defaults to executable).
Detection: readelf -l sample | grep GNU_STACK — check the flags column. RWE or flags: 7 means executable stack. checksec --file=sample reports NX disabled.
3. GOT/PLT Overwrite
After gaining a write primitive (e.g., format string bug, heap overflow), an attacker overwrites a .got.plt entry to redirect the next call to that library function toward shellcode. This is detectable post-exploitation.
Detection: attach gdb and compare GOT entries to expected library addresses: x/gx &puts@got.plt should resolve to an address within libc.so. Values in anonymous memory regions or non-library pages indicate hooking. checksec flags RELRO status: Full RELRO means GOT is read-only after startup.
4. LD_PRELOAD Hijack
Setting LD_PRELOAD=/path/to/evil.so causes the dynamic linker to load that library first, allowing it to intercept any function via symbol override. The system-wide variant uses /etc/ld.so.preload. This is the most common persistence mechanism for Linux userland rootkits.
Detection: cat /etc/ld.so.preload, which should be absent or empty on a clean system. For running processes: cat /proc/<pid>/maps | grep -v '\.so\.' | grep '\.so' lists mapped shared objects whose file names lack a version suffix, a pattern common to preloaded libraries and rare among system ones. strace -e trace=open,openat ls reveals which libraries are actually opened at startup.
5. Injected PT_LOAD Segment
An attacker with write access to an ELF binary can insert an additional PT_LOAD entry in the program header table, increment e_phnum, and back the segment with encrypted shellcode at a high file offset. The kernel loads all PT_LOAD segments unconditionally.
Detection: readelf -l sample | grep LOAD. Typical binaries have two to four PT_LOAD entries depending on the toolchain; more than that, or a PT_LOAD whose file offset lies far beyond the rest of the file's content, is unusual. A PT_LOAD with RWX flags always justifies deeper analysis.
6. Ghost Sections
Section headers can point to file offsets or virtual addresses that fall outside any PT_LOAD segment. The kernel ignores section headers at runtime, so ghost sections are invisible to the process but visible to static analysis tools — until the malware zeros e_shoff to hide the entire section table.
Detection: cross-reference every section’s sh_addr against the ranges covered by PT_LOAD segments. Any section whose address is not covered by a loadable segment is a ghost. readelf -S vs readelf -l side by side.
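The cross-reference can be expressed as a containment test. The sketch below works on already-parsed values; the function name and tuple layouts are assumptions, and it deliberately skips sections that are neither allocated (SHF_ALLOC) nor given an address, since .symtab, .strtab, and .comment legitimately have sh_addr = 0.

```python
SHF_ALLOC = 0x2

def ghost_sections(sections, load_segments):
    """
    sections: iterable of (name, sh_addr, sh_size, sh_flags)
    load_segments: iterable of (p_vaddr, p_memsz) for each PT_LOAD
    Returns names of sections that claim an address no PT_LOAD covers.
    """
    ghosts = []
    for name, sh_addr, sh_size, sh_flags in sections:
        if not (sh_flags & SHF_ALLOC) and sh_addr == 0:
            continue  # not meant to be mapped at runtime
        covered = any(
            vaddr <= sh_addr and sh_addr + sh_size <= vaddr + memsz
            for vaddr, memsz in load_segments
        )
        if not covered:
            ghosts.append(name)
    return ghosts

segments = [(0x400000, 0x1000), (0x600000, 0x800)]
sections = [
    (".text", 0x400100, 0x200, 0x6),
    (".symtab", 0, 0x300, 0x0),
    (".ghost", 0x900000, 0x100, 0x2),
]
print(ghost_sections(sections, segments))  # ['.ghost']
```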
7. Debug Info Abuse
DWARF debug information in .debug_info, .debug_str, and related sections is parsed only by debuggers and DWARF-aware tools — not by AV scanners. Attackers have stored encrypted payloads here, relying on the section’s presence in legitimate debug builds for cover.
Detection: llvm-dwarfdump --all sample (or dwarfdump -a sample with the libdwarf tool) dumps all DWARF data. Measure section entropy: binwalk --entropy sample. High entropy in .debug_* sections that are not otherwise consistent with debug builds is suspicious.
8. RPATH / RUNPATH Manipulation
DT_RPATH and DT_RUNPATH entries in .dynamic specify library search paths that take precedence over system defaults. If they include relative paths (., ./lib) or attacker-controlled directories (/tmp, user home), any binary launched from that directory will load attacker libraries instead of system ones.
Detection: readelf -d sample | grep -E 'RPATH|RUNPATH' or patchelf --print-rpath sample. Any value other than standard system paths (/usr/lib, /lib) or empty should be investigated.
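Once the RPATH or RUNPATH string has been extracted, triaging it is a matter of splitting on colons and classifying each entry. In this sketch the list of commonly writable prefixes and the decision to accept $ORIGIN-relative entries are assumptions; adjust both to the environment being examined.

```python
WRITABLE_PREFIXES = ("/tmp", "/var/tmp", "/dev/shm", "/home")  # assumed, not exhaustive

def risky_rpath_entries(rpath: str) -> list:
    """Return (entry, reason) pairs for search path entries that deserve a look."""
    risky = []
    for entry in rpath.split(":"):
        if entry.startswith(("$ORIGIN", "${ORIGIN}")):
            continue  # resolved relative to the binary's own directory
        if not entry.startswith("/"):
            risky.append((entry, "relative or empty, depends on the working directory"))
        elif any(entry == p or entry.startswith(p + "/") for p in WRITABLE_PREFIXES):
            risky.append((entry, "commonly user-writable location"))
    return risky

for entry, reason in risky_rpath_entries("/usr/lib:.:/tmp/x:$ORIGIN/../lib"):
    print(repr(entry), "->", reason)
```

An $ORIGIN entry is still worth a manual check when the binary itself sits in a writable directory, which this sketch does not attempt to determine.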
9. .init_array / Constructor Abuse
Functions listed in .init_array are called by __libc_start_main before main(), in the same way PE TLS callbacks precede OEP. A malicious shared library can install itself permanently by planting a function in .init_array that forks, drops a rootkit, or patches /etc/ld.so.preload.
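Detection: readelf -d sample | grep INIT shows the DT_INIT and DT_INIT_ARRAY addresses. Because .init_array holds pointers rather than code, objdump -s -j .init_array sample dumps the constructor addresses, which can then be disassembled individually. In gdb, catch load stops each time a shared library is loaded, giving a chance to set breakpoints on its constructors before they matter.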
10. UPX / Custom Packer
upx --best compresses the entire ELF into a small stub plus a compressed blob; at runtime the stub decompresses the original binary into memory and transfers control to it. The packed file typically has only a few PT_LOAD segments, no PT_INTERP, and no usable section table.
Detection: file sample reports “UPX compressed”. binwalk sample detects the UPX magic and embedded ELF. upx -d sample decompresses if the magic is intact; custom packers that strip UPX headers require binwalk -e or dynamic analysis.
Analysis Workflow Cheat-Sheet
| Step | PE (Windows) | ELF (Linux) |
|---|---|---|
| Identify format | file sample.exe | file sample |
| Full header overview | diec sample.exe | readelf -h sample |
| Section list | pestudio → Sections | readelf -S sample |
| Imports / dependencies | pestudio → Imports | readelf -d sample (DT_NEEDED) |
| Entropy per section | peframe sample.exe | binwalk --entropy sample |
| Strings | strings -n 8 -e l sample.exe | strings -n 8 sample |
| TLS / init hooks | pescan -t sample.exe | readelf -d sample (INIT_ARRAY) |
| Security features | winchecksec sample.exe | checksec --file=sample |
| Packing detection | diec, exeinfope | binwalk -e sample |
| Overlay / appended | pefile overlay offset | binwalk -e sample |
| Memory dump compare | pe-sieve --pid <PID> | volatility3 linux.proc.maps |
| YARA scan | yara rules.yar sample.exe | yara rules.yar sample |
Quick YARA Rules
rule PE_TLS_Callback_NoASLR_NoNX
{
    meta:
        description = "PE with TLS directory and missing ASLR or NX: common packer/anti-debug pattern"
        author = "benjitrapp"
        date = "2026-06-23"
    condition:
        uint16(0) == 0x5A4D                          // MZ magic
        and uint32(uint32(0x3C)) == 0x00004550       // PE signature
        // TLS directory present (entry 9 has non-zero RVA).
        // Data directories start at optional header offset 0x60 (PE32) or 0x70 (PE32+);
        // the optional header itself starts 0x18 bytes after the PE signature.
        and (
            (uint16(uint32(0x3C) + 0x18) == 0x010B and uint32(uint32(0x3C) + 0x18 + 0x60 + 9 * 8) != 0)
            or (uint16(uint32(0x3C) + 0x18) == 0x020B and uint32(uint32(0x3C) + 0x18 + 0x70 + 9 * 8) != 0)
        )
        // DllCharacteristics (optional header offset 0x46) missing ASLR (0x0040) or NX (0x0100)
        and (uint16(uint32(0x3C) + 0x18 + 0x46) & 0x0140) != 0x0140
}
rule ELF_RWX_Stack
{
    meta:
        description = "ELF binary with executable stack (PT_GNU_STACK with flags RWX)"
        author = "benjitrapp"
        date = "2026-06-23"
    strings:
        // PT_GNU_STACK type = 0x6474e551 in little-endian byte order, p_flags = 7 (RWX)
        // Elf64_Phdr: p_flags immediately follows p_type
        $gnu_stack_rwx_64 = { 51 E5 74 64 07 00 00 00 }
        // Elf32_Phdr: p_flags sits at offset 24, after five 4-byte fields
        $gnu_stack_rwx_32 = { 51 E5 74 64 [20] 07 00 00 00 }
    condition:
        uint32(0) == 0x464C457F                      // \x7fELF magic
        and (
            (uint8(4) == 2 and $gnu_stack_rwx_64)    // EI_CLASS = ELF64
            or (uint8(4) == 1 and $gnu_stack_rwx_32) // EI_CLASS = ELF32
        )
}
Resources
- Microsoft PE/COFF Specification — authoritative reference for all PE structures
- corkami PE101 / ELF101 posters — visual format maps, indispensable for quick reference
- pefile — Python library for parsing PE files programmatically
- readelf / objdump — binutils suite, standard on every Linux system
- capa — Mandiant’s capability detection tool, maps PE/ELF behaviors to ATT&CK
- pe-sieve / hollows-hunter — hasherezade’s tools for in-memory PE anomaly detection
- pestudio — Marc Ochsenmeier’s static PE analysis workbench
- checksec — binary security feature checker for ELF and PE
- binwalk — firmware and binary entropy/structure scanner