PE & ELF Headers: Structure, Analysis & Malware Hiding Spots

Understanding the binary formats that operating systems load is foundational to malware analysis and incident response. When you can read a PE or ELF header fluently, a packed dropper, a process-hollowed victim, or a rootkit-injected shared object stops being opaque; the anomalies jump out before you even open a disassembler. This article walks through both formats field by field, builds annotated hex mockups, and catalogs every common hiding spot with the detection command that exposes it.

PE Headers: Windows Portable Executable

Every Windows executable, DLL, driver, and COM object shares the same on-disk layout. The loader reads the headers, maps sections into memory, resolves imports, and transfers control to the entry point. Knowing exactly what the loader reads tells you exactly where an attacker can lie.

DOS Header: `IMAGE_DOS_HEADER`

The very first two bytes of every PE file are 0x4D 0x5A, the ASCII characters MZ, the initials of Mark Zbikowski, one of the original MS-DOS architects. The e_magic field at offset 0 holds this value. The loader checks it first; if it is absent the file is rejected immediately.

The field that matters most for PE parsing is e_lfanew at offset 0x3C. It is a 32-bit file offset (not an RVA; it is measured from the start of the file) that points to the PE signature. In the vast majority of real-world executables this value is 0x80 or 0x100, though technically any value ≥ 64 is valid. Malware packers sometimes push it very far forward and pack data in the space between the DOS stub and the PE signature.

Between the DOS header and the PE signature sits the DOS stub, a tiny 16-bit program that prints “This program cannot be run in DOS mode” and exits with an error code. It is vestigial on modern systems but the loader still skips over it. Attackers occasionally replace stub bytes with shellcode, though this is uncommon because nothing executes it on NT systems.

PE Signature

At the offset given by e_lfanew you find four bytes: 0x50 0x45 0x00 0x00, i.e. PE\0\0 in ASCII. The two trailing null bytes are required. If they are wrong the loader refuses to map the file.
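The two checks described so far (MZ magic, then the signature at e_lfanew) can be sketched in a few lines of standard-library Python. The function name `find_pe_signature` and the synthetic sample buffer are illustrative assumptions, not part of any tool mentioned in this article.

```python
import struct

def find_pe_signature(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ magic")
    # e_lfanew is a little-endian 32-bit file offset stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("no PE signature at e_lfanew")
    return e_lfanew

# Synthetic buffer (hypothetical): MZ magic, zero padding, signature at 0x80.
sample = bytearray(0x84)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\0\0"

print(hex(find_pe_signature(bytes(sample))))
```

An unusually large return value is worth a second look, since it means there is a lot of room between the DOS stub and the signature.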

COFF File Header: `IMAGE_FILE_HEADER`

Immediately following the PE signature is the 20-byte COFF File Header. Its fields:

Machine: target architecture. 0x014C = x86 (i386), 0x8664 = x86-64 (AMD64), 0x01C4 = ARM Thumb-2, 0xAA64 = ARM64 (AArch64). An image whose Machine value the host cannot run is rejected at load time; note that x86 images still run on x64 Windows through WOW64, so "different from the host CPU" is not by itself a failure.
NumberOfSections: count of IMAGE_SECTION_HEADER entries that follow the optional header. Legitimate files rarely exceed 8–12 sections; highly packed or obfuscated samples sometimes have 1 (the entire binary is a single blob) or 20+ (section injection).
TimeDateStamp: Unix epoch timestamp of when the linker produced the file. Valuable for threat intelligence clustering (samples sharing a timestamp likely share a build), but trivially forged. A value of zero (0x00000000, the Unix epoch) is a common red flag, and values far in the past or future (year 1970, year 2099) indicate deliberate tampering. Reproducible builds (for example MSVC /Brepro) store a hash here instead of a time, so an implausible date is not proof of malice on its own.
SizeOfOptionalHeader: size of the next header. For PE32 this is typically 0x00E0, for PE32+ 0x00F0. An unexpected value breaks parsing in some tools.
Characteristics: bitmask of file properties. IMAGE_FILE_EXECUTABLE_IMAGE (0x0002) marks a valid, fully linked image and is set on EXEs and DLLs alike; IMAGE_FILE_DLL (0x2000) marks a DLL. A legitimate DLL therefore normally carries both bits (for example 0x2102 or 0x2022); the anomaly is IMAGE_FILE_DLL without IMAGE_FILE_EXECUTABLE_IMAGE.

Optional Header: `IMAGE_OPTIONAL_HEADER`

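Despite the name, this header is mandatory for images. Its fixed fields occupy 96 bytes (PE32) or 112 bytes (PE32+); with the 16 data directories that follow, the total is 224 or 240 bytes, which is the value stored in SizeOfOptionalHeader. It is the largest header and carries the most analysis-relevant fields.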

Magic identifies the sub-format: 0x010B = PE32 (32-bit), 0x020B = PE32+ (64-bit). A PE32+ binary with a 32-bit Machine field is a structural impossibility and should trigger immediate suspicion.

AddressOfEntryPoint (AEP) is a Relative Virtual Address (RVA), that is, an offset from ImageBase, to which the loader transfers control after setup. For DLLs this is the DLL entry routine (which ends up in DllMain); for EXEs it is the CRT startup stub that runs before WinMain/main. A critical check: the section containing AEP should be executable (characteristics flag 0x20000000) and ideally not writable. If AEP points into a writable section (.data, .bss, or a packed section marked RW), that strongly suggests a stub decrypts code into that region at runtime, which is classic packer or shellcode loader behavior.
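The entry-point check above reduces to "find the section whose virtual range contains the RVA, then test two flag bits". The sketch below assumes the section table has already been parsed into tuples; the `Section` type, the function names, and the sample section values are hypothetical.

```python
from typing import NamedTuple, Optional

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000

class Section(NamedTuple):
    name: str
    virtual_address: int
    virtual_size: int
    characteristics: int

def section_for_rva(sections, rva) -> Optional[Section]:
    """Return the section whose in-memory range contains rva, if any."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return s
    return None

def entry_point_findings(sections, entry_rva):
    """List the reasons the entry point looks suspicious (empty if none)."""
    s = section_for_rva(sections, entry_rva)
    if s is None:
        return ["entry point outside every section"]
    findings = []
    if not s.characteristics & IMAGE_SCN_MEM_EXECUTE:
        findings.append(f"{s.name}: entry section not executable")
    if s.characteristics & IMAGE_SCN_MEM_WRITE:
        findings.append(f"{s.name}: entry section writable")
    return findings

# Hypothetical section table: a normal .text and a packer-style RWX section.
sections = [
    Section(".text", 0x1000, 0x2000, 0x60000020),
    Section("UPX1", 0x3000, 0x5000, 0xE0000040),
]
print(entry_point_findings(sections, 0x1500))
print(entry_point_findings(sections, 0x3100))
```

The result is a triage hint, not a verdict: some legitimate packers and protectors produce the same pattern.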

ImageBase is the preferred load address. The default for 32-bit EXEs is 0x00400000 and for 32-bit DLLs 0x10000000 (64-bit linkers default to 0x140000000 and 0x180000000), though ASLR overrides this at load time. Malware sometimes sets ImageBase to zero and relies entirely on the base relocation table, or sets it to a value that collides with a known system DLL to force that DLL to relocate.

SectionAlignment and FileAlignment control how sections are padded in memory vs. on disk respectively. SectionAlignment is almost always 0x1000 (page size); FileAlignment is typically 0x200 (sector size). If SectionAlignment < FileAlignment the loader rejects the file. If they are set equal (e.g., both 0x1000), the binary is a “raw” or “aligned” PE — sometimes used by shellcode loaders that want on-disk layout identical to in-memory layout.

SizeOfImage must equal the total in-memory footprint of the binary rounded up to SectionAlignment. Packers that need room for code they unpack at runtime sometimes set this value larger than what the section headers account for, reserving virtual address space for dynamically allocated code.

DllCharacteristics is the security feature bitmask. The flags every analyst should know:

Flag	Value	Meaning
ASLR (DYNAMIC_BASE)	0x0040	Binary can be loaded at a random base address
NX (NX_COMPAT)	0x0100	Compatible with Data Execution Prevention
NO_SEH	0x0400	No Structured Exception Handling used
FORCE_INTEGRITY	0x0080	Code integrity checks enforced
CFG	0x4000	Control Flow Guard enabled
TERMINAL_SERVER_AWARE	0x8000	Aware of terminal server session

Legitimate modern binaries compiled with /DYNAMICBASE /NXCOMPAT will have both ASLR and NX set. Malware compiled with older or custom toolchains frequently has neither. The absence of CFG in a binary claiming to be a Windows system component is an anomaly worth investigating.
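Decoding the bitmask by hand gets tedious across many samples. The sketch below uses only the flag values from the table above; the function names are hypothetical, and the "missing mitigations" check covers just the two flags this section calls out (ASLR and NX).

```python
# Flag values taken from the DllCharacteristics table above.
DLL_CHARACTERISTICS = {
    0x0040: "DYNAMIC_BASE",
    0x0080: "FORCE_INTEGRITY",
    0x0100: "NX_COMPAT",
    0x0400: "NO_SEH",
    0x4000: "GUARD_CF",
    0x8000: "TERMINAL_SERVER_AWARE",
}

def decode_dll_characteristics(value):
    """Return the names of all known flags set in value."""
    return [name for bit, name in DLL_CHARACTERISTICS.items() if value & bit]

def missing_mitigations(value):
    """Return which of ASLR / NX are absent from value."""
    wanted = ((0x0040, "DYNAMIC_BASE"), (0x0100, "NX_COMPAT"))
    return [name for bit, name in wanted if not value & bit]

print(decode_dll_characteristics(0x8140))
print(missing_mitigations(0x8000))
```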

Data Directories

The final 128 bytes of the Optional Header are the Data Directory, with 16 entries of 8 bytes each (RVA + Size), each pointing to a specific structure within the mapped binary. Not all entries are used; unused entries have both fields zeroed. The most security-relevant entries:

Import Table (entry 1): points to the IMAGE_IMPORT_DESCRIPTOR array. Every imported DLL has an entry here listing the API names or ordinals to resolve. This is the first thing analysts check: what APIs does this binary call? kernel32.dll with only LoadLibraryA and GetProcAddress means all other API resolution is at runtime — typical of loaders and shellcode runners.
Export Table (entry 0): present in DLLs, lists functions other binaries can import. Malware DLLs (sideloading payloads, proxy DLLs) may export a single function or forward all exports to the legitimate DLL they are masquerading as.
Resource Table (entry 2): .rsrc section tree: icons, version info, string tables, dialogs. A high-entropy RCDATA or BITMAP resource of several hundred kilobytes is a classic sign of an encrypted payload stored for runtime extraction.
Certificate Table (entry 4): points to the Authenticode signature (WIN_CERTIFICATE structure) appended to the file. This data is not mapped into memory — it lives in the file overlay. Attackers can steal a legitimate signature’s certificate data and re-use it after patching the binary; sigcheck -a and AuthentiCheck detect the mismatch.
Base Relocation Table (entry 5): required when ASLR loads the binary at a non-preferred base address. Binaries compiled without /FIXED include this; malware that patches absolute addresses into its code may strip the relocation table entirely, breaking ASLR.
TLS Directory (entry 9): Thread Local Storage. Contains an array of callback function pointers executed before AddressOfEntryPoint. This is a favorite anti-analysis trick: the TLS callback runs before any breakpoint on OEP can fire. pescan -t and capa detect TLS callbacks.
Load Config Directory (entry 10): among other things, contains the /GS security cookie and the CFG function bitmap. Its absence in a binary claiming CFG support is a contradiction.
IAT (entry 12): Import Address Table. In memory, this table is patched by the loader with the resolved function addresses. On disk its entries are RVAs of hint/name records (or ordinals); memory scanners check that each in-memory entry resolves to the expected export of the expected DLL, and entries that point elsewhere indicate IAT hooking.
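Almost every data directory stores an RVA, so static parsing needs a way to turn an RVA into a file offset using the section table. A minimal sketch follows, assuming the sections are already available as tuples; the function name and the sample values are hypothetical. Recall that the Certificate Table is the exception: its address is already a file offset.

```python
def rva_to_offset(sections, rva):
    """Map an RVA to a file offset.

    sections: iterable of
        (VirtualAddress, VirtualSize, PointerToRawData, SizeOfRawData)
    Returns None when the RVA is unmapped or has no bytes on disk.
    """
    for va, vsize, raw_ptr, raw_size in sections:
        span = vsize if vsize else raw_size  # some linkers leave VirtualSize 0
        if va <= rva < va + span:
            delta = rva - va
            # Past SizeOfRawData the loader zero-fills; nothing on disk.
            return raw_ptr + delta if delta < raw_size else None
    return None

# Hypothetical two-section layout.
sections = [
    (0x1000, 0x1800, 0x0400, 0x1A00),
    (0x3000, 0x4000, 0x1E00, 0x0200),
]
print(hex(rva_to_offset(sections, 0x1234)))
print(rva_to_offset(sections, 0x3300))
```

A directory RVA that maps to None, or that lands outside every section, is itself an anomaly worth recording.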

Section Headers: `IMAGE_SECTION_HEADER`

Each section header is 40 bytes. The fields:

Name: 8 bytes, null-padded (not null-terminated if exactly 8 chars). Conventional names are metadata, not enforced by the loader: the loader ignores the name entirely and only looks at the characteristics and addresses.
VirtualSize: size of the section in memory.
VirtualAddress: RVA where the section is mapped.
SizeOfRawData: size of the section on disk (must be aligned to FileAlignment).
PointerToRawData: file offset of the section’s raw bytes.
Characteristics: bitmask: 0x20000000 = execute, 0x40000000 = read, 0x80000000 = write.

Conventional sections and their expected characteristics:

Section	Expected flags	Notes
`.text`	Execute + Read	Code. Should never be writable.
`.data`	Read + Write	Initialized global variables.
`.rdata`	Read only	Constants, import/export tables, strings.
`.rsrc`	Read only	Resources.
`.reloc`	Read only	Base relocation table.
`.bss`	Read + Write	Uninitialized data (often merged into `.data`).

Red flags in section headers:

Write + Execute simultaneously (0xA0000000, or 0xE0000000 together with read): sections produced by a normal compiler and linker almost never need both. Classic sign of self-modifying shellcode, a packer stub, or process injection.
VirtualSize >> SizeOfRawData: the section is much larger in memory than on disk. The extra space is zeroed by the loader and then filled at runtime, the hallmark of an unpacking stub.
High entropy (> 7.0 bits/byte): sections of random-looking data indicate compression or encryption. Legitimate code averages 5.5–6.5; compressed/encrypted blobs approach 8.0.
Blank or control-character names — the loader doesn’t care, but tools that rely on section names for heuristics skip unnamed sections.
Extra sections beyond what the linker produces — injected by packers or hollowing tools, usually at the end of the section table.
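The entropy figures quoted in the list above are Shannon entropy over byte values, measured in bits per byte. A minimal standard-library sketch; the function name is an assumption, and the 7.0 threshold in the usage line is simply the one from the list.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )

uniform = bytes(range(256)) * 16      # every byte value equally likely
padding = b"\x00" * 4096              # a single repeated value
for label, blob in (("uniform", uniform), ("padding", padding)):
    h = shannon_entropy(blob)
    print(label, round(h, 2), "suspicious" if h > 7.0 else "ok")
```

Compute it per section (or per resource, or over the overlay) rather than over the whole file, because one small encrypted blob is easily averaged away by surrounding code and padding.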

PE Header Diagram

DOS Header: IMAGE_DOS_HEADER (64 bytes, offset 0x00)

e_magic: 0x5A4D ("MZ", Mark Zbikowski); the loader rejects the file if it is absent.
e_cblp: bytes on last page.
e_cp: pages in file.
e_crlc: relocations.
e_cparhdr: header size (paragraphs).
e_minalloc, e_maxalloc, e_ss, e_sp, e_csum, e_ip, e_cs, e_lfarlc, e_ovno: legacy MS-DOS fields.
e_res[4]: reserved (8 bytes, always zero).
e_oemid, e_oeminfo: OEM identifier and OEM information.
e_res2[10]: reserved (20 bytes, always zero).
e_lfanew: at offset 0x3C, 32-bit pointer to the PE signature. Typically 0x80 or 0x100. Packers push this far forward to hide data between the DOS stub and the PE signature.

DOS Stub: 16-bit Program (variable size)

16-bit x86 machine code: prints "This program cannot be run in DOS mode" and exits. Never runs on NT. Attackers sometimes replace it with shellcode, but nothing executes it on Windows NT kernels.

PE Signature + COFF File Header: IMAGE_FILE_HEADER (24 bytes total)

Signature: 0x50 0x45 0x00 0x00 ("PE\0\0"). Must be exactly these 4 bytes at the offset given by e_lfanew; with wrong bytes the loader refuses to map the file.
Machine: 0x014C = x86, 0x8664 = AMD64, 0x01C4 = ARM, 0xAA64 = ARM64.
NumberOfSections: determines how many IMAGE_SECTION_HEADER entries follow.
TimeDateStamp: Unix timestamp of link time. Malware zeroes or fakes this. Legitimate Microsoft system files often have verifiable timestamps.
PointerToSymbolTable: deprecated, should be 0.
NumberOfSymbols: deprecated, should be 0.
SizeOfOptionalHeader: PE32 = 0xE0, PE32+ = 0xF0.
Characteristics: 0x0002 = executable image, 0x2000 = DLL, 0x0100 = 32-bit machine, 0x0020 = large-address-aware, 0x0200 = debug info stripped, 0x0001 = relocations stripped.

Optional Header: Standard Fields (IMAGE_OPTIONAL_HEADER)

Magic: 0x010B = PE32, 0x020B = PE32+.
MajorLinkerVersion, MinorLinkerVersion
SizeOfCode: sum of sizes of all code sections.
SizeOfInitializedData, SizeOfUninitializedData
AddressOfEntryPoint (RVA): where Windows transfers control. For malware triage, check whether this RVA falls inside a writable or non-.text section, a red flag for packed code.
BaseOfCode (RVA)
BaseOfData (RVA): PE32 only, absent in PE32+.

Optional Header: Windows-Specific Fields

ImageBase: preferred load address (4 bytes in PE32, 8 bytes in PE32+). ASLR overrides this. DLLs: 0x10000000, EXEs: 0x00400000. Hollowed processes show the wrong ImageBase in the PEB.
SectionAlignment: typically 0x1000 (4 KB page).
FileAlignment: typically 0x200 (512 bytes).
MajorOperatingSystemVersion, MinorOperatingSystemVersion, MajorImageVersion, MinorImageVersion, MajorSubsystemVersion, MinorSubsystemVersion: version fields.
Win32VersionValue: reserved, must be 0.
SizeOfImage: total virtual size, must be a multiple of SectionAlignment.
SizeOfHeaders: file offset where the first section's raw data begins.
CheckSum: 0 in most malware. A nonzero value that does not match the recomputed checksum suggests the file was modified after linking.
Subsystem: 2 = GUI, 3 = CUI, 9 = WinCE.
DllCharacteristics: 0x0040 = ASLR, 0x0100 = NX, 0x0400 = no-SEH, 0x4000 = CFG. Missing ASLR/NX is a red flag.
SizeOfStackReserve, SizeOfStackCommit, SizeOfHeapReserve, SizeOfHeapCommit
LoaderFlags: reserved, must be 0.
NumberOfRvaAndSizes: number of Data Directory entries that follow; 16 in practically every file.

Data Directories: IMAGE_DATA_DIRECTORY × 16 (each entry: 4-byte RVA + 4-byte Size)

ExportTable: DLL function exports. Walking this manually is how shellcode resolves APIs.
ImportTable: array of IMAGE_IMPORT_DESCRIPTOR that tells the loader which DLLs and functions to bind. A minimal table (only LoadLibrary/GetProcAddress) implies runtime resolution.
ResourceTable: .rsrc section root. RCDATA/BITMAP resources often carry encrypted shellcode.
ExceptionTable: x64 structured exception handlers (.pdata section).
CertificateTable: Authenticode signature. The address is a file offset, not an RVA. Stolen or weak certificates and stomped signatures show up here. Verify with sigcheck -a.
BaseRelocationTable: .reloc section, required if an ASLR rebase is needed.
Debug
Architecture: reserved, must be zero.
GlobalPtr: MIPS/IA-64 global pointer only; the Size field is zero.
TLSTable: IMAGE_TLS_DIRECTORY. Callbacks here run BEFORE AddressOfEntryPoint. Primary anti-analysis and persistence technique.
LoadConfigTable: CFG bitmap, SE handler table, stack cookie.
BoundImport
ImportAddressTable: the IAT. This is what gets patched at load time (and what API hooks overwrite at runtime).
DelayImportDescriptor: delay-loaded imports, resolved on first call.
CLRRuntimeHeader (.NET): present in managed .NET executables.
Reserved: 8 zero bytes.

Section Headers: IMAGE_SECTION_HEADER × NumberOfSections (40 bytes each)

Name[8]: 8 ASCII bytes, null-padded. Common names: .text, .rdata, .data, .rsrc, .reloc, .pdata, .tls. The name is cosmetic only; Characteristics flags define actual permissions.
VirtualSize: actual used size in memory. VirtualSize >> SizeOfRawData = compressed/packed data inside.
VirtualAddress (RVA): where the section is mapped in memory relative to ImageBase.
SizeOfRawData: size of the section on disk, aligned to FileAlignment.
PointerToRawData: file offset where section data starts; check for an overlay past the last section.
PointerToRelocations: object files only, 0 in executables.
PointerToLinenumbers: deprecated, always 0.
NumberOfRelocations, NumberOfLinenumbers
Characteristics: 0x20 = code, 0x40 = initialized data, 0x80 = uninitialized data, 0x20000000 = execute, 0x40000000 = read, 0x80000000 = write. WRITE + EXECUTE together is a red flag.

One 40-byte entry is repeated for every section (NumberOfSections total).

PE Hex View Mockup

The annotated hex below shows the first 0xA0 bytes of a minimal PE. The MZ magic at 0x00, the e_lfanew pointer at 0x3C pointing to 0x80, and the PE signature at 0x80 are the three anchors every analyst reads first.

Hex view: minimal PE (first 0xA0 bytes), offsets in hex

0000 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 ← MZ magic (e_magic = 0x5A4D)

0010 B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 DOS header fields

0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 reserved (e_res[])

0030 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00 ← e_lfanew = 0x80 (PE signature offset)

0040 0E 1F BA 0E 00 B4 09 CD 21 B8 01 4C CD 21 54 68 DOS stub — INT 21h exit routine

0050 69 73 20 70 72 6F 67 72 61 6D 20 63 61 6E 6E 6F "is program canno" (DOS stub string)

0060 74 20 62 65 20 72 75 6E 20 69 6E 20 44 4F 53 20 "t be run in DOS "

0070 6D 6F 64 65 2E 0D 0D 0A 24 00 00 00 00 00 00 00 "mode..." + $ terminator

0080 50 45 00 00 4C 01 03 00 A3 F8 4B 5E 00 00 00 00 ← PE\0\0 signature + Machine=0x014C (x86) + NumberOfSections=3 + TimeDateStamp=0x5E4BF8A3

0090 00 00 00 00 E0 00 02 01 00 00 00 00 00 00 00 00 SizeOfOptionalHeader=0xE0, Characteristics=0x0102 (EXE+32bit)
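To connect the hex view to the field layout, the sketch below unpacks the 24 bytes at offsets 0x80 to 0x97 of the mockup (signature plus COFF File Header) with Python's struct module. The variable names are assumptions; the bytes are copied from the two rows above.

```python
import struct
from datetime import datetime, timezone

# Bytes 0x80..0x97 of the mockup: PE signature + 20-byte COFF File Header.
HEX = ("50 45 00 00 4C 01 03 00 A3 F8 4B 5E 00 00 00 00 "
       "00 00 00 00 E0 00 02 01")
raw = bytes.fromhex(HEX)

signature = raw[:4]
# Little-endian: Machine, NumberOfSections, TimeDateStamp,
# PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader, Characteristics
(machine, n_sections, timestamp, sym_ptr, n_syms,
 opt_size, characteristics) = struct.unpack_from("<HHIIIHH", raw, 4)

link_time = datetime.fromtimestamp(timestamp, tz=timezone.utc)

print(signature, hex(machine), n_sections)
print(link_time.isoformat())
print(hex(opt_size), hex(characteristics))
```

The same format string applies to any PE once the offset of the signature is known from e_lfanew.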

ELF Headers — Linux / Android / Embedded

The Executable and Linkable Format is the standard binary format on Linux, Android (native code), BSD, and most embedded systems. Its design is more orthogonal than PE: it separates the runtime view (program headers / segments) from the linker view (section headers), and the two can be partially inconsistent — a property malware exploits.

ELF Ident — The First 16 Bytes

The binary opens with a 16-byte identification array (e_ident) that is architecture-independent:

Offset	Length	Field	Typical value
0	4	Magic	7F 45 4C 46 (`\x7fELF`)
4	1	Class	`01` = ELF32, `02` = ELF64
5	1	Data	`01` = LSB (little-endian), `02` = MSB
6	1	Version	`01` (always 1)
7	1	OS/ABI	`00` = System V, `03` = Linux, `09` = FreeBSD
8	1	ABI version	`00` (unused by most OSes)
9	7	Padding	all zeros

The magic bytes 7F 45 4C 46 are one non-printable byte followed by ELF in ASCII. The kernel checks these first and returns ENOEXEC if they are wrong. That check happens only when the file is executed: once its segments are mapped, malware can scramble the ELF header in memory to frustrate dump carving without affecting the running process.
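The e_ident table translates directly into a small parser. This is a sketch using only the offsets and values from the table above; the function name and the returned dictionary keys are illustrative choices.

```python
def parse_e_ident(data: bytes) -> dict:
    """Decode the 16-byte e_ident array at the start of an ELF file."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {
        "class": {1: "ELF32", 2: "ELF64"}.get(data[4], "invalid"),
        "data": {1: "little-endian", 2: "big-endian"}.get(data[5], "invalid"),
        "version": data[6],
        "osabi": data[7],
        "abi_version": data[8],
        # Non-zero padding is unusual and sometimes used as a marker.
        "padding_clean": data[9:16] == bytes(7),
    }

ident = bytes.fromhex("7F454C46020101000000000000000000")
print(parse_e_ident(ident))
```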

ELF Header Fields (`Elf64_Ehdr`)

After the 16-byte ident, the remaining header fields are architecture-width-dependent. For ELF64:

e_type — binary type. 0x0002 = ET_EXEC (position-dependent executable), 0x0003 = ET_DYN (position-independent executable or shared library), 0x0004 = ET_CORE (core dump). Modern Linux binaries compiled with -fPIE -pie are ET_DYN even when they are executables, not shared libraries — this is intentional for ASLR support.
e_machine — target architecture: 0x0003 = x86, 0x003E = x86-64, 0x0028 = ARM (32-bit), 0x00B7 = AArch64, 0x00F3 = RISC-V.
e_entry — virtual address of the entry point (_start, which calls __libc_start_main). Stripped binaries still have a valid entry address; it is one of the first symbols reconstructed during analysis.
e_phoff — file offset of the Program Header Table. For a standard ELF64 this is 0x40 (immediately after the ELF header).
e_shoff — file offset of the Section Header Table. Malware routinely zeros this field to break static analysis tools that rely on sections; the binary still executes normally because the runtime loader only needs program headers.
e_phnum / e_shnum: counts of program headers and section headers. Injecting an extra program header means incrementing e_phnum and finding room for the new entry, which often forces the whole table to be moved; a common shortcut is to overwrite an existing PT_NOTE entry with a PT_LOAD, which leaves e_phnum unchanged.
e_shstrndx — index of the section containing section name strings. If e_shoff is zero, this field is meaningless.

Program Headers (Segments — Runtime View)

The program header table is what the kernel and dynamic linker read at load time. Each entry describes a segment — a contiguous region of the file mapped into a contiguous region of memory. The segment types most relevant to analysis:

PT_LOAD: loadable segment. The kernel maps each one into the process. Older toolchains emit two: one R-X (code, including .text) and one RW- (data, including .data/.bss); recent GNU linkers that separate code from read-only data typically emit four. Each entry specifies p_offset (file offset), p_vaddr (virtual address), p_filesz (bytes from file), p_memsz (bytes in memory, which may be larger for BSS), and p_flags (PF_R=4, PF_W=2, PF_X=1). A PT_LOAD with flags RWX (7) is a strong red flag: ordinary compiled binaries have no need for a segment that is simultaneously writable and executable.

PT_DYNAMIC — points to the .dynamic section, which contains the dynamic linking metadata (DT_NEEDED entries, GOT/PLT addresses, symbol table pointers). Without this segment the dynamic linker cannot resolve imports.

PT_INTERP — path to the dynamic linker, typically /lib64/ld-linux-x86-64.so.2 or /lib/ld-linux.so.2 for 32-bit. Statically linked binaries have no PT_INTERP. An unusual path here (e.g., /tmp/.x/ld.so or a path in /dev/shm) means the attacker is supplying a custom loader — a sophisticated technique used by some rootkits.

PT_GNU_STACK — the p_flags of this segment control the NX policy for the stack. A legitimate binary has RW- (flags=6, no execute). If PF_X is set (flags=7), the stack is executable — either the binary uses intentional stack execution (rare, legacy) or the attacker cleared the NX protection to enable shellcode on the stack. checksec and readelf -l both show this.

PT_GNU_RELRO — marks a range of the address space as read-only after dynamic linking completes (.got, parts of .data). Its absence means GOT entries remain writable throughout execution — enabling GOT overwrite attacks.
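The segment checks above (RWX PT_LOAD, executable stack, missing RELRO) can be sketched against a raw Elf64_Phdr table. The sketch assumes a little-endian ELF64 file and that the caller has already sliced out the table using e_phoff and e_phnum; the function name and the synthetic three-entry table are hypothetical.

```python
import struct

PT_LOAD = 1
PT_GNU_STACK = 0x6474E551
PT_GNU_RELRO = 0x6474E552
PF_X, PF_W, PF_R = 1, 2, 4

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align  (56 bytes, little-endian)
PHDR = struct.Struct("<IIQQQQQQ")

def segment_findings(table: bytes, phnum: int) -> list:
    """Return human-readable anomalies found in a program header table."""
    findings, has_relro = [], False
    for i in range(phnum):
        p_type, p_flags, *_ = PHDR.unpack_from(table, i * PHDR.size)
        if p_type == PT_LOAD and p_flags & PF_W and p_flags & PF_X:
            findings.append(f"segment {i}: PT_LOAD is writable and executable")
        elif p_type == PT_GNU_STACK and p_flags & PF_X:
            findings.append(f"segment {i}: executable stack")
        elif p_type == PT_GNU_RELRO:
            has_relro = True
    if not has_relro:
        findings.append("no PT_GNU_RELRO segment")
    return findings

# Synthetic table (hypothetical values): clean code segment, RWX data
# segment, and an executable stack.
table = b"".join([
    PHDR.pack(PT_LOAD, PF_R | PF_X, 0, 0, 0, 0x1000, 0x1000, 0x1000),
    PHDR.pack(PT_LOAD, PF_R | PF_W | PF_X, 0x1000, 0x2000, 0x2000,
              0x800, 0x4000, 0x1000),
    PHDR.pack(PT_GNU_STACK, PF_R | PF_W | PF_X, 0, 0, 0, 0, 0, 0x10),
])
for line in segment_findings(table, 3):
    print(line)
```

On a real file, readelf -l remains the quicker check; a parser like this is mainly useful when scanning many samples or when readelf refuses a damaged header.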

Section Headers (Linker View)

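Section headers are a map of the binary’s internal structure for linkers and debuggers. The kernel does not need them at runtime, so a binary whose section header table has been removed or corrupted (for example with sstrip, or by zeroing e_shoff) still runs while being harder to analyze. The more common strip --strip-all keeps the section headers but removes .symtab, .strtab, and debug sections. Malware distributed in the wild is almost always stripped.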

Important sections:

.text — machine code. SHT_PROGBITS, flags AX (alloc + execute).
.rodata — read-only data: string literals, jump tables, constant arrays.
.data / .bss — initialized and uninitialized writable data.
.dynamic — the Elf64_Dyn array driving dynamic linking.
.dynsym + .dynstr — the minimal symbol table needed for dynamic linking. Always present in dynamically linked binaries (cannot be stripped without breaking the binary).
.symtab + .strtab — the full symbol table. Stripped in production/malware releases. Their presence is a debugging artifact.
.got — Global Offset Table. At load time the dynamic linker patches pointers here for each resolved symbol. GOT overwrite (writing a function pointer to redirect execution) is a well-known post-exploitation technique.
.plt — Procedure Linkage Table. A table of small stubs; each stub either jumps through the GOT (resolved) or calls the lazy binding resolver. The PLT is in the executable segment and is never writable.
.rela.plt / .rela.dyn — relocation tables describing which GOT slots to patch and with which symbols.
.init_array — array of constructor function pointers called before main(). Functionally equivalent to PE TLS callbacks. Malware that installs itself as a shared library can hide in .init_array entries.

Dynamic Linking Deep-Dive

The PT_DYNAMIC segment contains a flat array of (tag, value) pairs called Elf64_Dyn:

DT_NEEDED — one entry per required shared library (libc.so.6, libpthread.so.0, etc.). A binary with no DT_NEEDED entries is statically linked. Malware often has minimal DT_NEEDED entries or none, carrying all code statically.
DT_RPATH / DT_RUNPATH — colon-separated list of paths searched for shared libraries before the system defaults. If this contains a writable path (e.g., ., /tmp, /home/user) an attacker can plant a malicious library with the same DT_SONAME as a legitimate one and hijack every import.
DT_SONAME — the library’s own name, used when other binaries DT_NEEDED reference it.

How PLT/GOT lazy binding works: on the first call to printf, the PLT stub jumps to GOT entry [printf]. Initially that GOT entry points back into the PLT resolver stub, which calls _dl_runtime_resolve. The dynamic linker finds the real printf address and patches the GOT entry. Every subsequent call jumps directly to libc. This lazy resolution mechanism means the GOT is writable during program execution — making it a target for any attacker with a write primitive.
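The DT_NEEDED and DT_RPATH / DT_RUNPATH checks described above amount to walking the Elf64_Dyn array and resolving string-table offsets. The sketch below assumes a little-endian ELF64 file and that the raw .dynamic and .dynstr bytes have already been extracted; the function names, the list of "risky" prefixes, and the sample data are hypothetical.

```python
import struct

DT_NULL, DT_NEEDED, DT_RPATH, DT_RUNPATH = 0, 1, 15, 29
DYN = struct.Struct("<qQ")  # Elf64_Dyn: signed d_tag, unsigned d_val

def read_cstr(strtab: bytes, offset: int) -> str:
    """Read a NUL-terminated string from the dynamic string table."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode()

def dynamic_strings(dynamic: bytes, strtab: bytes) -> dict:
    """Collect DT_NEEDED names and RPATH/RUNPATH search directories."""
    out = {"needed": [], "search_paths": []}
    for off in range(0, len(dynamic) - DYN.size + 1, DYN.size):
        tag, val = DYN.unpack_from(dynamic, off)
        if tag == DT_NULL:          # terminator entry
            break
        if tag == DT_NEEDED:
            out["needed"].append(read_cstr(strtab, val))
        elif tag in (DT_RPATH, DT_RUNPATH):
            out["search_paths"] += read_cstr(strtab, val).split(":")
    return out

def risky_paths(paths):
    """Heuristic only: relative entries and well-known writable dirs."""
    risky = []
    for p in paths:
        relative = not p.startswith(("/", "$ORIGIN"))
        writable = p.startswith(("/tmp", "/dev/shm", "/var/tmp"))
        if relative or writable:
            risky.append(p)
    return risky

# Synthetic example data (hypothetical).
strtab = b"\0libc.so.6\0/opt/app/lib:/tmp/.x:.\0"
dynamic = (DYN.pack(DT_NEEDED, 1) + DYN.pack(DT_RUNPATH, 11)
           + DYN.pack(DT_NULL, 0))
info = dynamic_strings(dynamic, strtab)
print(info["needed"], risky_paths(info["search_paths"]))
```

The prefix list is deliberately short; whether a directory is attacker-writable depends on the host, so treat the output as a prompt to check permissions rather than a finding.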

ELF File Structure Layout

The layout below walks the complete ELF64 file from byte 0 to the Section Header Table, region by region, naming each region as the loader and linker see it.

ELF Ident: e_ident[EI_NIDENT] (16 bytes, offset 0x00)

Magic: 7f 45 4c 46 ("\x7fELF"); the loader rejects the file if it is absent.
Class: 1 = ELF32, 2 = ELF64.
Data: 1 = little-endian, 2 = big-endian.
EI_VERSION: always 1.
OS/ABI: 0 = SysV, 3 = Linux.
ABI Version: usually 0.
EI_PAD: 7 reserved zero bytes.

ELF Header Fields: Elf64_Ehdr (offset 0x10 → 0x3F)

e_type: 2 = ET_EXEC, 3 = ET_DYN, 4 = ET_CORE.
e_machine: 0x03 = x86, 0x3E = AMD64, 0x28 = ARM, 0xB7 = AArch64.
e_version: always 1.
e_entry (8 bytes): entry point virtual address, where execution starts. Shellcode loaders jump here directly.
e_phoff (8 bytes): Program Header Table file offset. Typically 0x40 (immediately after this header). The runtime loader uses only this, not e_shoff.
e_shoff (8 bytes): Section Header Table file offset. Zeroed by packers and malware to defeat readelf / static analysis; the binary still executes normally.
e_flags: architecture flags.
e_ehsize: 64 bytes.
e_phentsize: 56 bytes.
e_phnum: number of program headers.
e_shentsize: 64 bytes.
e_shnum: number of section headers.
e_shstrndx: section index of .shstrtab (section-name string table).

Program Header Table: Elf64_Phdr × e_phnum (runtime / loader view)

PT_PHDR (PF_R): location of the program header table itself.
PT_INTERP (PF_R): /lib64/ld-linux-x86-64.so.2, the path to the dynamic linker. An unusual path suggests a library hijack attempt.
PT_LOAD [0], code segment (PF_R | PF_X, read + execute): maps .text + .rodata + .plt into memory (never writable).
PT_LOAD [1], data segment (PF_R | PF_W, read + write): maps .data + .bss + .dynamic + .got into memory.
PT_DYNAMIC (PF_R | PF_W): points to the .dynamic section (DT_NEEDED, DT_RPATH, DT_INIT_ARRAY…).
PT_GNU_STACK, NX control (PF_R | PF_W means NX is on): if PF_X is present here, the stack is executable, NX is disabled, and shellcode on the stack becomes possible.
PT_GNU_RELRO (PF_R): pages marked read-only after relocation (protects .got from runtime overwrites if Full RELRO).

Sections: Code (SHF_ALLOC | SHF_EXECINSTR)

.text (SHT_PROGBITS, flags AX, segment PF_R | PF_X): compiled machine code, the function bodies.
.plt (SHT_PROGBITS, flags AX, segment PF_R | PF_X): Procedure Linkage Table, small stubs that jump through the GOT to resolve dynamic symbols.

Sections: Read-Only Data (SHF_ALLOC, not writable)

.rodata (SHT_PROGBITS, flags A, PF_R only): string literals, jump tables, const arrays; mapped in the same RX segment as .text.

Sections: Writable Data (SHF_ALLOC | SHF_WRITE)

.data (SHT_PROGBITS, flags WA, segment PF_R | PF_W): initialized global and static variables.
.bss (SHT_NOBITS, flags WA, segment PF_R | PF_W): uninitialized globals; zero file size, zeroed by the loader at map time.

Sections: Dynamic Linking

.dynamic (SHT_DYNAMIC, flags WA, segment PF_R | PF_W): Elf64_Dyn tag/value array (DT_NEEDED, DT_RPATH, DT_INIT_ARRAY, DT_SONAME…).
.got / .got.plt (SHT_PROGBITS, flags WA, segment PF_R | PF_W, writable): Global Offset Table, pointer slots patched by the loader at runtime. Primary GOT-overwrite exploitation target.
.dynsym (SHT_DYNSYM, flags A, segment PF_R): minimal symbol table for dynamic linking; cannot be stripped without breaking the binary.
.dynstr (SHT_STRTAB, flags A, segment PF_R): string table backing .dynsym; contains imported symbol and library names.
.rela.plt (SHT_RELA, flags AI, segment PF_R): relocation entries for PLT GOT slots, stating which address to patch and with which symbol.
.rela.dyn (SHT_RELA, flags A, segment PF_R): relocations for non-PLT symbols (global variables, copy relocations).

Sections: Constructors & Destructors (run before / after main)

.init_array (SHT_INIT_ARRAY, flags WA, segment PF_R | PF_W): function pointers called before main(), comparable to PE TLS callbacks. Malware persistence target in shared libraries.
.fini_array (SHT_FINI_ARRAY, flags WA, segment PF_R | PF_W): function pointers called after main() returns / at exit().

Sections: Full Symbol Table (stripped in malware / production builds)

.symtab (SHT_SYMTAB, no flags, not loaded): full symbol table (all function / variable names). Absent in stripped binaries. Presence = debug build or sloppy packer.
.strtab (SHT_STRTAB, no flags, not loaded): string table backing .symtab; also removed when stripped.
.shstrtab (SHT_STRTAB, no flags, not loaded): section name strings; present whenever the file has section headers, index stored in e_shstrndx.

Section Header Table: Elf64_Shdr × e_shnum (64 bytes each)

sh_name: offset into .shstrtab.
sh_type: SHT_PROGBITS / DYNAMIC / NOBITS…
sh_flags: SHF_ALLOC, SHF_WRITE, SHF_EXECINSTR.
sh_addr: virtual address in memory.
sh_offset: file offset of section data.
sh_size, sh_link, sh_info, sh_addralign, sh_entsize

One 64-byte entry is repeated for every section (e_shnum total).

ELF Header Hex Mockup

The 64 bytes below are the complete ELF64 header for a typical x86-64 ET_DYN binary with a program header table at offset 0x40:

Hex view — ELF64 header (0x40 bytes) · x86-64 ET_DYN PIE binary

0000 7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00 magic · class=ELF64 · LSB · ver=1 · ABI=SysV · padding

0010 03 00 3E 00 01 00 00 00 A0 10 00 00 00 00 00 00 e_type=ET_DYN · e_machine=x86-64 · e_version=1 · e_entry=0x10A0

0020 40 00 00 00 00 00 00 00 98 32 00 00 00 00 00 00 e_phoff=0x40 (PHT right after header) · e_shoff=0x3298

0030 00 00 00 00 40 00 38 00 0D 00 40 00 1C 00 1B 00 e_flags=0 · e_ehsize=64 · e_phentsize=56 · e_phnum=13 · e_shentsize=64 · e_shnum=28 · e_shstrndx=27
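The 48 bytes after e_ident can be unpacked with a single struct format string. The sketch below feeds it the four rows of the mockup; the `Ehdr` tuple, the `header_findings` helper, and the file sizes used in the usage lines are assumptions for illustration.

```python
import struct
from collections import namedtuple

HEX_DUMP = """
7F 45 4C 46 02 01 01 00 00 00 00 00 00 00 00 00
03 00 3E 00 01 00 00 00 A0 10 00 00 00 00 00 00
40 00 00 00 00 00 00 00 98 32 00 00 00 00 00 00
00 00 00 00 40 00 38 00 0D 00 40 00 1C 00 1B 00
"""
header = bytes.fromhex("".join(HEX_DUMP.split()))

Ehdr = namedtuple(
    "Ehdr",
    "e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)
# Little-endian Elf64_Ehdr fields that follow the 16-byte e_ident.
ehdr = Ehdr._make(struct.unpack_from("<HHIQQQIHHHHHH", header, 16))

def header_findings(h, file_size):
    """Flag the header-level anomalies discussed in this article."""
    findings = []
    if h.e_shoff == 0 or h.e_shnum == 0:
        findings.append("no section header table")
    elif h.e_shoff + h.e_shnum * h.e_shentsize > file_size:
        findings.append("section header table runs past end of file")
    if h.e_phoff != h.e_ehsize:
        findings.append("program header table not directly after ELF header")
    return findings

print(ehdr)
print(header_findings(ehdr, file_size=0x3998))
print(header_findings(ehdr, file_size=0x3000))
```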

Where Malware Hides — and How to Detect It

PE Hiding Spots

1. Overlay / Appended Data

After the last section’s raw data ends on disk, the PE format has no explicit end-of-file marker. Bytes appended beyond the last section are called the overlay. Installers legitimately use this (e.g., self-extracting archives append a ZIP blob), but malware uses it to store encrypted shellcode, config blobs, or secondary payloads that are loaded at runtime via SetFilePointer + ReadFile at the overlay offset.

Detection: diec sample.exe, binwalk sample.exe, or python3 -c "import pefile; p=pefile.PE('sample.exe'); print(p.get_overlay_data_start_offset())". Any overlay > a few kilobytes in a binary that isn’t a self-extractor deserves scrutiny. Measure entropy of the overlay — random-looking = encrypted.
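When pefile is not available, the overlay boundary can be derived from the section table alone: it starts where the last section's raw data ends. A minimal sketch, assuming the section table has already been parsed; the function name and sample values are hypothetical. On a signed file the Authenticode blob also sits past this point, so subtract the Certificate Table range before judging the size.

```python
def overlay_start(sections, file_size):
    """Return the file offset where the overlay begins, or None.

    sections: iterable of (PointerToRawData, SizeOfRawData) pairs.
    """
    end_of_image = max(
        (ptr + size for ptr, size in sections if size),  # skip empty sections
        default=0,
    )
    return end_of_image if end_of_image < file_size else None

# Hypothetical layout: raw data ends at 0x2400 in a 0x9000-byte file.
sections = [(0x0400, 0x1A00), (0x1E00, 0x0600)]
start = overlay_start(sections, file_size=0x9000)
print(hex(start), hex(0x9000 - start))
```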

2. Section Slack Space

When SizeOfRawData is larger than VirtualSize, the bytes between the two are alignment padding on disk and are normally zero. Malware can hide data in this slack between the end of a section’s actual content and the next FileAlignment boundary. The technique is subtle: the data is present on disk but invisible to viewers that only display a section up to its VirtualSize.

Detection: pe-sieve --pid <PID> compares in-memory sections to disk; hollows-hunter flags modified regions. Manually: pefile in Python can dump raw section bytes beyond the virtual size.
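A static check for slack abuse is to read the file bytes that lie past VirtualSize but inside SizeOfRawData and count anything that is not zero. The sketch assumes slack is defined that way and that the section table has already been parsed; the function name and sample data are hypothetical.

```python
def slack_regions(data, sections):
    """Find non-zero bytes hidden in per-section alignment padding.

    sections: iterable of (name, VirtualSize, PointerToRawData, SizeOfRawData)
    Returns (name, slack_file_offset, slack_length, nonzero_byte_count).
    """
    hits = []
    for name, vsize, raw_ptr, raw_size in sections:
        if raw_size > vsize:
            slack = data[raw_ptr + vsize: raw_ptr + raw_size]
            nonzero = sum(1 for b in slack if b)
            if nonzero:
                hits.append((name, raw_ptr + vsize, len(slack), nonzero))
    return hits

# Synthetic file image (hypothetical): six bytes planted in .text slack.
image = bytearray(0x800)
image[0x560:0x566] = b"hidden"
sections = [(".text", 0x150, 0x400, 0x200), (".data", 0x200, 0x600, 0x200)]
print(slack_regions(bytes(image), sections))
```

Some linkers leave non-zero filler in padding, so a hit is a reason to look at the bytes, not a verdict.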

3. TLS Callbacks

Thread Local Storage callbacks (AddressOfCallBacks in the TLS directory) are function pointers invoked by the loader for every thread creation — including the initial thread, before AddressOfEntryPoint. Setting a breakpoint at OEP in a debugger will miss any code in a TLS callback entirely. Packers and anti-debug stubs routinely live here.

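Detection: pescan -t sample.exe lists TLS callbacks. capa sample.exe flags the capability. In x64dbg, enable the TLS Callbacks option under Options → Preferences → Events so the debugger breaks on the callbacks and not just at the entry point; in WinDbg, break on module load (sxe ld) and set breakpoints on the callback addresses before resuming.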

4. Resource Section Abuse (`.rsrc`)

The resource section tree is a three-level hierarchy (type → name → language). RCDATA and BITMAP resource types accept arbitrary binary data. Malware stores encrypted payloads here because some scanners give resources less scrutiny than code sections, and the data is conveniently extracted at runtime via FindResource + LoadResource + LockResource.

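Detection: ResourceHacker or peframe sample.exe show resource content. Measure entropy per resource: a BITMAP entry with entropy > 7.5 and no valid bitmap header is very likely a payload. Note that BITMAP resources start directly with a BITMAPINFOHEADER, without the BM file header found in .bmp files. binwalk -e can extract embedded blobs.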

5. Extra Injected Sections

After hollowing or injection, tools like Process Hacker reveal sections in memory that have no disk counterpart, or sections with names not present in the original binary. On disk, packers append new sections at the end of the section table.

Detection: iterate sections with pefile and flag any with Characteristics & 0xE0000000 == 0xE0000000 (RWX), entropy > 7.0, or names not in the standard set. Compare disk section count to in-memory section count.

6. Import Table Spoofing / Stomping

A loader stub that does all API resolution at runtime only needs two imports: LoadLibraryA and GetProcAddress. The import table will be almost empty, with one or two DLL entries. Everything the binary actually calls is resolved through those two functions or, when even they are missing, by locating loaded modules through the PEB and walking their export tables manually (the PEB walk described in the assembly article).

Detection: pestudio Imports view. Fewer than five imports in total, with LoadLibraryA + GetProcAddress among them, is a strong indicator of runtime API resolution. capa detects the PEB walk capability directly.

7. Authenticode Stomping

The CertificateTable data directory points to a WIN_CERTIFICATE structure appended to the file. This data is not mapped into memory, and the certificate table itself is excluded from the Authenticode hash, so an attacker can copy the certificate block from a legitimately signed binary and append it to a malicious one. The transplanted signature does not validate, because the signed hash no longer matches the file, and WinVerifyTrust reports the mismatch. The trick works against products and analysts that only check whether a signature is present, or read the signer name, without verifying it.

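Detection: sigcheck -a sample.exe (Sysinternals) shows both the certificate subject and whether the signature verifies against the file’s actual hash. In PowerShell, Get-AuthenticodeSignature reports a HashMismatch status for a transplanted signature. AuthentiCheck specifically tests hash validity independent of chain trust.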

8. PE Header Erasure After Loading

Some loaders zero out the MZ/PE signature in memory after successfully mapping the binary. This breaks any tool that tries to dump and re-parse the binary from memory (e.g., procdump), since the result will not be a valid PE.

Detection: pe-sieve /pid <PID> compares disk vs. memory and reports modules whose headers were modified. The fix is to reconstruct the PE header from the section information still visible in memory.

9. Section Name vs. Characteristics Mismatch

The loader ignores section names — only Characteristics flags matter for memory permissions. A packer can name a writable+executable section .text to fool analysts and tools that key off the name. Conversely, it can mark the actual .text section as non-executable to hide code in .data.

Detection: compare every section name to its expected characteristics. Flag any .text with write permission or any .data with execute permission.
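
The comparison is easy to automate once section names and Characteristics values are available. A minimal sketch: the EXPECTED table is a hypothetical baseline for common linker output, not an authoritative list, and the input is assumed to be (name, characteristics) pairs taken from any PE parser.

```python
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

# Hypothetical baseline of expected permissions per standard section name.
EXPECTED = {".text": "r-x", ".rdata": "r--", ".data": "rw-", ".rsrc": "r--", ".reloc": "r--"}


def perms(characteristics: int) -> str:
    """Render the memory-permission bits of Characteristics as an 'rwx' string."""
    return (
        ("r" if characteristics & IMAGE_SCN_MEM_READ else "-")
        + ("w" if characteristics & IMAGE_SCN_MEM_WRITE else "-")
        + ("x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-")
    )


def audit_sections(sections):
    """Return (name, perms, reason) for every section that looks out of place."""
    findings = []
    for name, characteristics in sections:
        p = perms(characteristics)
        if "w" in p and "x" in p:
            findings.append((name, p, "writable and executable"))
        expected = EXPECTED.get(name)
        if expected is None:
            findings.append((name, p, "non-standard name"))
        elif p != expected:
            findings.append((name, p, "expected " + expected))
    return findings
```

Calling audit_sections([(".text", 0xE0000020)]) reports the section twice: once for being writable and executable, and once for deviating from the r-x baseline.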

10. Checksum Manipulation

The CheckSum field in the Optional Header is an additive checksum (not a CRC) computed over the entire file, excluding the field itself, with the file length added at the end. The Windows loader ignores it for ordinary user-mode EXEs but validates it for kernel drivers and for DLLs loaded at boot or into critical system processes. A zero or incorrect checksum in a binary claiming to be a Windows system file is a strong indicator of tampering.

Detection: MapFileAndCheckSum (Win32 API, imagehlp.dll) or pe-bear recomputes the checksum so it can be compared with the stored value. Legitimate Windows system DLLs normally carry a valid, non-zero checksum.
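
The checksum can also be recomputed offline. The sketch below follows the commonly documented algorithm (sum the file as 32-bit little-endian words with carry folding, skip the CheckSum field, fold to 16 bits, add the file length); it assumes e_lfanew is 4-byte aligned so that the field falls on a word boundary. The function names are hypothetical.

```python
import struct


def checksum_field_offset(data: bytes) -> int:
    """CheckSum sits 0x40 bytes into the optional header for PE32 and PE32+."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew + 24 + 0x40


def pe_checksum(data: bytes) -> int:
    """Recompute the PE checksum of a file held in memory."""
    skip = checksum_field_offset(data)
    padded = data + b"\0" * (-len(data) % 4)  # pad to a whole number of dwords
    total = 0
    for offset in range(0, len(padded), 4):
        if offset == skip:
            continue  # the stored checksum is not part of the sum
        total += struct.unpack_from("<I", padded, offset)[0]
        total = (total & 0xFFFFFFFF) + (total >> 32)  # fold the carry back in
    total = (total & 0xFFFF) + (total >> 16)
    total = (total + (total >> 16)) & 0xFFFF
    return total + len(data)


def stored_checksum(data: bytes) -> int:
    """Read the CheckSum value recorded in the optional header."""
    return struct.unpack_from("<I", data, checksum_field_offset(data))[0]
```

A mismatch between pe_checksum(data) and stored_checksum(data) on a file that claims to be a driver or system DLL is the condition described above.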

ELF Hiding Spots

1. Stripped Symbols

strip --strip-all removes .symtab, .strtab, and debug sections. All function names become FUN_ addresses in Ghidra or sub_ in IDA. The binary executes identically — only analysis is impaired.

Detection: file sample reports “stripped”. readelf -S sample | grep -E '\.symtab|\.strtab' returns nothing (the leading dot keeps .shstrtab, which is always present, from matching). The .dynsym/.dynstr sections remain, because removing them would break dynamic linking, and provide a minimal set of imported and exported symbol names as starting points.

2. PT_GNU_STACK RWX

An executable stack allows shellcode injection via stack buffer overflows. Set with execstack -s sample or by omitting the PT_GNU_STACK segment and relying on the kernel default (which on some older kernels defaults to executable).

Detection: readelf -l sample | grep GNU_STACK — check the flags column. RWE or flags: 7 means executable stack. checksec --file=sample reports NX disabled.
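
The same check can be scripted by walking the program header table directly. A minimal sketch that handles little-endian ELF64 only; the function name stack_is_executable is hypothetical.

```python
import struct

PT_GNU_STACK = 0x6474E551
PF_X = 0x1


def stack_is_executable(elf: bytes):
    """Return True/False from the PT_GNU_STACK flags, or None if the segment is absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        # Elf64_Phdr begins with p_type followed directly by p_flags.
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

A None result corresponds to the missing-segment case described above, where the kernel default decides.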

3. GOT/PLT Overwrite

After gaining a write primitive (e.g., format string bug, heap overflow), an attacker overwrites a .got.plt entry to redirect the next call to that library function toward shellcode. This is detectable post-exploitation.

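Detection: attach gdb and compare GOT entries to expected library addresses: x/gx &puts@got.plt should resolve to an address within libc.so once the function has been called. Under lazy binding, an entry that has not been used yet still points back into the binary’s own PLT, which is normal. Values in anonymous memory regions or non-library pages indicate hooking. checksec flags RELRO status: Full RELRO means GOT is read-only after startup.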

4. LD_PRELOAD Hijack

Setting LD_PRELOAD=/path/to/evil.so causes the dynamic linker to load that library first, allowing it to intercept any function via symbol override. The system-wide variant uses /etc/ld.so.preload. This is one of the most common persistence mechanisms for Linux userland rootkits.

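Detection: cat /etc/ld.so.preload, which should be absent or empty on a clean system. For running processes: tr '\0' '\n' < /proc/<pid>/environ | grep LD_PRELOAD shows whether the variable was set at launch, and /proc/<pid>/maps lists every mapped shared object, so look for libraries loaded from unexpected paths such as /tmp or a home directory. strace -e trace=open,openat ls reveals which libraries are actually opened at startup.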
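
The per-process check is simple to script, since /proc/<pid>/environ is a NUL-separated list of KEY=value entries. A minimal sketch; the function name preload_entries is hypothetical, and the file reflects the environment at exec time rather than later changes made by the process.

```python
import re
from typing import List


def preload_entries(environ_blob: bytes) -> List[str]:
    """Extract the libraries named in LD_PRELOAD from a /proc/<pid>/environ dump."""
    for item in environ_blob.split(b"\0"):
        key, sep, value = item.partition(b"=")
        if sep and key == b"LD_PRELOAD":
            # The dynamic linker accepts colons or whitespace as separators.
            text = value.decode(errors="replace")
            return [p for p in re.split(r"[:\s]+", text) if p]
    return []
```

On a live system the input would come from reading /proc/<pid>/environ in binary mode, which requires sufficient privileges for processes owned by other users.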

5. Injected PT_LOAD Segment

An attacker with write access to an ELF binary can insert an additional PT_LOAD entry in the program header table, increment e_phnum to match, and back the segment with encrypted shellcode at a high file offset. The kernel loads all PT_LOAD segments unconditionally.

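Detection: readelf -l sample | grep LOAD. Binaries linked with a current GNU ld typically carry four PT_LOAD entries and older toolchains two, so compare the count against other binaries from the same toolchain instead of a fixed number. A PT_LOAD with RWX flags, or one backed by file data past the end of the regular sections, justifies deeper analysis.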

6. Ghost Sections

Section headers can point to file offsets or virtual addresses that fall outside any PT_LOAD segment. The kernel ignores section headers at runtime, so ghost sections are invisible to the process but visible to static analysis tools — until the malware zeros e_shoff to hide the entire section table.

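Detection: cross-reference every section’s sh_addr against the ranges covered by PT_LOAD segments. Any section that claims to occupy memory but whose address is not covered by a loadable segment is a ghost; sections without the SHF_ALLOC flag, such as .symtab or .comment, legitimately have no address. Compare readelf -S and readelf -l side by side.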
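
The cross-reference is an interval-coverage test. A minimal sketch, assuming the section and segment tuples have already been extracted (for example from readelf output); restricting the check to SHF_ALLOC sections is a design choice of this sketch, and the function name ghost_sections is hypothetical.

```python
SHF_ALLOC = 0x2


def ghost_sections(sections, load_segments):
    """Return names of allocated sections not fully inside any PT_LOAD segment.

    sections:      iterable of (name, sh_addr, sh_size, sh_flags)
    load_segments: iterable of (p_vaddr, p_memsz)
    """
    segments = list(load_segments)
    ghosts = []
    for name, addr, size, flags in sections:
        if not flags & SHF_ALLOC:
            continue  # non-allocated sections are never mapped, so skip them
        covered = any(v <= addr and addr + size <= v + m for v, m in segments)
        if not covered:
            ghosts.append(name)
    return ghosts
```

With a single segment at (0x1000, 0x1000), a section claiming address 0x900000 is reported while .comment at address 0 is ignored.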

7. Debug Info Abuse

DWARF debug information in .debug_info, .debug_str, and related sections is parsed only by debuggers and DWARF-aware tools — not by AV scanners. Attackers have stored encrypted payloads here, relying on the section’s presence in legitimate debug builds for cover.

Detection: llvm-dwarfdump --all sample (or dwarfdump -a sample with libdwarf) dumps all DWARF data. Measure entropy across the file with binwalk --entropy sample. High entropy in .debug_* sections, or debug sections in a binary that otherwise shows no sign of being a debug build, is suspicious.

8. RPATH / RUNPATH Manipulation

DT_RPATH and DT_RUNPATH entries in .dynamic specify library search paths that take precedence over system defaults. If they include relative paths (., ./lib), which are resolved against the process’s current working directory, or attacker-controlled directories (/tmp, user home), an attacker who can place files there gets the binary to load attacker libraries instead of system ones.

Detection: readelf -d sample | grep -E 'RPATH|RUNPATH' or patchelf --print-rpath sample. Any value other than standard system paths (/usr/lib, /lib) or empty should be investigated.
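
Triage of the extracted value can be automated. A minimal sketch that classifies the colon-separated entries of an RPATH or RUNPATH string; TRUSTED_PREFIXES is a hypothetical allow-list that should be adapted to the distribution, and $ORIGIN entries are reported separately because they are common in bundled software yet still depend on who can write to the binary's directory.

```python
import posixpath

# Hypothetical allow-list of system library locations.
TRUSTED_PREFIXES = ("/lib", "/lib64", "/usr/lib", "/usr/lib64", "/usr/local/lib")


def risky_rpath_entries(rpath: str):
    """Return (entry, reason) for each search path that deserves a closer look."""
    findings = []
    if not rpath:
        return findings
    for entry in rpath.split(":"):
        if entry.startswith(("$ORIGIN", "${ORIGIN}")):
            findings.append((entry, "resolved relative to the binary's own directory"))
        elif not entry.startswith("/"):
            findings.append((entry, "relative path, resolved against the working directory"))
        else:
            norm = posixpath.normpath(entry)  # collapse /usr/lib/../../tmp style tricks
            if not any(norm == p or norm.startswith(p + "/") for p in TRUSTED_PREFIXES):
                findings.append((entry, "outside the trusted system prefixes"))
    return findings
```

The input string is what patchelf --print-rpath prints, or the bracketed value in the readelf -d output.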

9. `.init_array` / Constructor Abuse

Functions listed in .init_array are called by __libc_start_main before main(), in the same way PE TLS callbacks precede OEP. A malicious shared library can install itself permanently by planting a function in .init_array that forks, drops a rootkit, or patches /etc/ld.so.preload.

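Detection: readelf -d sample | grep INIT shows the DT_INIT and DT_INIT_ARRAY addresses. Because .init_array holds pointers and not code, dump it with objdump -s -j .init_array sample and then disassemble the addresses it lists. In gdb, catch load stops whenever a shared library is loaded, which gives you the chance to set breakpoints on its constructors before they run.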

10. UPX / Custom Packer

upx --best compresses the original ELF and prepends a small stub that decompresses it into memory and transfers control. The packed file carries no PT_INTERP of its own, because the stub runs without the dynamic linker, and the original section table is destroyed.

Detection: file sample typically reports a statically linked binary with no section header, and strings sample | grep 'UPX!' finds the packer magic. binwalk sample detects the UPX magic and embedded ELF. upx -d sample decompresses if the magic is intact; custom packers that strip UPX headers require binwalk -e or dynamic analysis.

Analysis Workflow Cheat-Sheet

Step	PE (Windows)	ELF (Linux)
Identify format	`file sample.exe`	`file sample`
Full header overview	`diec sample.exe`	`readelf -h sample`
Section list	`pestudio` → Sections	`readelf -S sample`
Imports / dependencies	`pestudio` → Imports	`readelf -d sample` (DT_NEEDED)
Entropy per section	`peframe sample.exe`	`binwalk --entropy sample`
Strings	`strings -n 8 -e l sample.exe`	`strings -n 8 sample`
TLS / init hooks	`pescan -t sample.exe`	`readelf -d sample` (INIT_ARRAY)
Security features	`winchecksec sample.exe`	`checksec --file=sample`
Packing detection	`diec`, `exeinfope`	`binwalk -e sample`
Overlay / appended	`pefile` overlay offset	`binwalk -e sample`
Memory dump compare	`pe-sieve /pid <PID>`	`vol -f mem.img linux.proc.Maps` (Volatility 3)
YARA scan	`yara rules.yar sample.exe`	`yara rules.yar sample`

Quick YARA Rules

rule PE_TLS_Callback_NoASLR_NoNX
{
    meta:
        description = "PE with TLS callback and missing ASLR or NX — common packer/anti-debug pattern"
        author      = "benjitrapp"
        date        = "2026-06-23"

    condition:
        uint16(0) == 0x5A4D                          // MZ magic
        and uint32(uint32(0x3C)) == 0x00004550       // PE signature
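        // TLS directory present (entry 9 has non-zero RVA). The data directory starts
        // 0x60 bytes into a PE32 optional header (magic 0x10B) and 0x70 bytes into PE32+ (0x20B).
        and (
            (uint16(uint32(0x3C) + 0x18) == 0x010B and uint32(uint32(0x3C) + 0x18 + 0x60 + 9 * 8) != 0)
            or (uint16(uint32(0x3C) + 0x18) == 0x020B and uint32(uint32(0x3C) + 0x18 + 0x70 + 9 * 8) != 0)
        )
        // DllCharacteristics (optional-header offset 0x46 in both formats) missing ASLR (0x0040) or NX (0x0100)
        and (
            (uint16(uint32(0x3C) + 0x18 + 0x46) & 0x0040) == 0
            or (uint16(uint32(0x3C) + 0x18 + 0x46) & 0x0100) == 0
        )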
}

rule ELF_RWX_Stack
{
    meta:
        description = "ELF binary with executable stack (PT_GNU_STACK PF_X set)"
        author      = "benjitrapp"
        date        = "2026-06-23"

    strings:
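        // Elf64_Phdr starts with p_type (4 bytes) immediately followed by p_flags (4 bytes).
        // In Elf32_Phdr, p_flags sits at offset 24, so this pattern only applies to 64-bit files.
        $gnu_stack_rwx = { 51 E5 74 64 07 00 00 00 }   // p_type=PT_GNU_STACK (0x6474e551), p_flags=RWX, little-endian

    condition:
        uint32(0) == 0x464C457F   // \x7fELF magic
        and uint8(4) == 2         // EI_CLASS = ELFCLASS64
        and uint8(5) == 1         // EI_DATA = little-endian
        and $gnu_stack_rwx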
}

Resources

Microsoft PE/COFF Specification — authoritative reference for all PE structures
corkami PE101 / ELF101 posters — visual format maps, indispensable for quick reference
pefile — Python library for parsing PE files programmatically
readelf / objdump — binutils suite, standard on every Linux system
capa — Mandiant’s capability detection tool, maps PE/ELF behaviors to ATT&CK
pe-sieve / hollows-hunter — hasherezade’s tools for in-memory PE anomaly detection
pestudio — Marc Ochsenmeier’s static PE analysis workbench
checksec — binary security feature checker for ELF and PE
binwalk — firmware and binary entropy/structure scanner

Written on June 23, 2026
