Digital Forensics and Incident Response (DFIR) is a field within cybersecurity that focuses on the identification, investigation, and remediation of cyber attacks.
DFIR has two main components:
- Digital Forensics: A subset of forensic science that examines system data, user activity, and other pieces of digital evidence to determine if an attack is in progress and who may be behind the activity.
- Incident Response: The overarching process that an organization will follow in order to prepare for, detect, contain, and recover from a data breach.
Best resources to dive into DFIR
- thedfirreport.com - Learn from attacks by reading DFIR Reports and adopt IOCs
- detection.fyi - Awesome resource for tons of Sigma Rules (thedfirreport also has great Sigma Rules)
- Sigma-CLI - Starting point for learning Sigma Rules
- Tsurugi-DFIR-Linux
- FalconHound - FalconHound is a blue team multi-tool. It allows you to utilize and enhance the power of BloodHound in a more automated fashion.
- YARA Forge
- YARA Forge Blog Post
Sigma Rules
What are Sigma Rules?
Sigma is a generic signature format for describing detections over log events. Rules are easy to write and applicable to any type of log. Best of all, Sigma rules are abstract and not tied to any particular SIEM, which makes them easy to share.
Once a Cyber Security Researcher, Defender or Analyst develops a detection method, they can use Sigma to describe and share their technique with others. Here’s a quote from Sigma HQ:
Sigma is for log files what Snort is for network traffic and YARA is for files.
At a high level, Sigma sits between log sources and SIEM-specific query languages: a rule is written once in YAML and then converted into the query format of each target platform.
Let’s look at an example from Sigma HQ:
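The rule below is a sketch modeled on Sigma HQ's ReGeorg webshell web-log rule (field values abridged; the upstream rule may differ slightly):

```yaml
title: Webshell ReGeorg Detection Via Web Logs
status: stable
description: Certain strings in the URI query, combined with an empty referrer and user agent, indicate the ReGeorg webshell
logsource:
    category: webserver
detection:
    selection:
        cs-uri-query|contains:
            - 'cmd=read'
            - 'connect&target'
            - 'cmd=connect'
            - 'cmd=disconnect'
            - 'cmd=forward'
    filter:
        cs-referer: null
        cs-user-agent: null
        cs-method: POST
    condition: selection and filter
level: high
```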
The heart of the above rule is the `detection` section. When the `condition` evaluates to true, the rule has made a detection. A condition is composed of named expressions; here, a `selection` and a `filter` expression are declared. These expressions perform tests against attributes of the logs (in this particular case, web logs).
Anatomy of a Sigma Rule
Source: Twitter - fr0gger
Sigmac Generates SQL
The sigmac compiler is used to translate an abstract Sigma rule into a concrete form that can be evaluated by an actual SIEM or processing platform. Sigmac has many backends, capable of translating a rule into queries for SIEMs such as Splunk, QRadar, Elasticsearch, ArcSight, and FortiSIEM, as well as generic SQL.
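A minimal sketch of the invocation (the rule filename is illustrative; legacy sigmac takes a target backend and a rule file, while the newer sigma-cli uses `sigma convert` with an installed backend plugin):

```bash
# legacy compiler from the SigmaHQ repo
sigmac -t sql web_webshell_regeorg.yml

# rough sigma-cli equivalent, assuming the sqlite backend plugin is installed
sigma convert -t sqlite web_webshell_regeorg.yml
```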
Using the SQL sigmac backend we can translate the rule above into:
```sql
SELECT *
FROM (
    SELECT
        (
            (cs-uri-query LIKE '%cmd=read%'
             OR cs-uri-query LIKE '%connect&target%'
             OR cs-uri-query LIKE '%cmd=connect%'
             OR cs-uri-query LIKE '%cmd=disconnect%'
             OR cs-uri-query LIKE '%cmd=forward%')
            AND (cs-referer IS NULL
                 AND cs-user-agent IS NULL
                 AND cs-method LIKE 'POST')
        ) AS web_webshell_regeorg,
        *
    FROM
        test_webserver_logs
)
WHERE
    web_webshell_regeorg = TRUE
```
These SQL statements are typically invoked by a scheduler on a particular trigger interval, say 1 hour. Every hour the detection searches through the newest events. Such a scheduler could also be built with Spark or other tooling.
However, some Sigma rules apply temporal aggregations. For example, Enumeration via the Global Catalog counts occurrences of events within a window of time:
```yaml
detection:
    selection:
        EventID: 5156
        DestPort:
            - 3268
            - 3269
    timeframe: 1h
    condition: selection | count() by SourceAddress > 2000
```
Using the batch model above, these types of queries reprocess the same events over and over, especially if the correlation window is large. Reducing detection latency by shortening the trigger interval, say to every 5 minutes, only introduces even more reprocessing of the same events.
Ideally, to avoid reprocessing the same events again and again, the detection should remember the last event it processed and the values of its counters. That is exactly what the Spark Structured Streaming framework was built for: a streaming query triggers a micro-batch every minute (configurable), reads in the new events, updates all counters, and persists them (for disaster recovery).
Spark Structured Streaming can easily evaluate the SQL produced by the sigmac compiler. To achieve this, create a streaming data frame by connecting to a queuing mechanism (Event Hubs, Pub/Sub, Kafka, …).
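A minimal sketch of that setup with PySpark, assuming a Kafka topic `webserver-logs`, a simplified log schema, and an abridged version of the generated SQL (broker address, topic, schema, and sink are all assumptions for illustration):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StringType

spark = SparkSession.builder.appName("sigma-streaming").getOrCreate()

# simplified web-log schema; real logs carry many more fields
schema = (StructType()
          .add("cs-uri-query", StringType())
          .add("cs-referer", StringType())
          .add("cs-user-agent", StringType())
          .add("cs-method", StringType()))

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
          .option("subscribe", "webserver-logs")             # assumed topic
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

events.createOrReplaceTempView("test_webserver_logs")

# abridged version of the sigmac-generated SQL shown above
hits = spark.sql("""
    SELECT * FROM test_webserver_logs
    WHERE (`cs-uri-query` LIKE '%cmd=connect%' OR `cs-uri-query` LIKE '%cmd=read%')
      AND `cs-referer` IS NULL AND `cs-user-agent` IS NULL AND `cs-method` = 'POST'
""")

# each micro-batch processes only new events; state survives restarts
query = (hits.writeStream
         .format("console")
         .option("checkpointLocation", "/tmp/sigma-checkpoint")
         .trigger(processingTime="1 minute")
         .start())
query.awaitTermination()
```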
YARA Rules
All about YARA:
“The pattern matching swiss knife for malware researchers (and everyone else)” (VirusTotal, 2020)
As the quote suggests, YARA can identify information based on both binary and textual patterns, such as hexadecimal byte sequences and strings contained within a file.
Rules are used to label these patterns. For example, YARA rules are frequently written to determine whether a file is malicious, based upon the features, or patterns, it presents. Strings are a fundamental component of programming languages; applications use strings to store data such as text.
For example, the code snippet below prints “Hello World” in Python. The text “Hello World” would be stored as a string.
```python
print("Hello World!")
```
We could write a YARA rule to search for `Hello World!` in every program on our operating system if we would like.
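A minimal sketch of such a rule (the identifier and string are ours):

```yara
rule HelloWorldExample
{
    strings:
        // the text we expect the program to contain
        $hello = "Hello World!" ascii wide
    condition:
        $hello
}
```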
Why does malware use strings? Malware, just like our `Hello World!` application, uses strings to store textual data. For example, ransomware commonly stores a hardcoded wallet address for ransom payments, while botnets often store the IP address or domain of their command-and-control server as a string.
Anatomy of a YARA Rule
Information security researcher “fr0gger_” has recently created a handy cheatsheet that breaks down and visualises the elements of a YARA rule (shown above, all image credits go to him). It’s a great reference point for getting started!
Simple Command Line
Apply rule in /foo/bar/rules to all files in the current directory. Subdirectories are not scanned:
```bash
yara /foo/bar/rules .
```
Scan all files in the /foo directory and its subdirectories:
```bash
yara /foo/bar/rules -r /foo
```
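A few other switches worth knowing (a sketch; check `yara --help` for your version):

```bash
# recurse into /target, print the matched strings, and scan with 8 threads
yara -r -s -p 8 /foo/bar/rules /target
```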
Basic Rule
Basic rule layout with various suggested meta descriptors.
```yara
rule specific_name // e.g. APT_CN_Winnti_exploit or crimeware_ZZ_RAT
{
    meta: // suggested
        author = ""
        date_created = ""
        date_last_modified = ""
        description = ""
        filetype = ""
        hash = ""
        hash = ""
        source = ""
        TLP = ""
        license = ""
        min_yara_version = ""
    strings:
        $a1 = "string type 1-1" ascii wide fullword nocase
        $a2 = "string type 1-2"
        $a3 = "string type 1-3"
        $b1 = "string type 2-1"
        $b2 = "string type 2-2"
        $b3 = "string type 2-3"
        $c1 = "string type 3-1"
        $c2 = "string type 3-2"
        $c3 = "string type 3-3"
    condition:
        any of ($a*) or 2 of ($b*) or 1 of ($c*)
}
```
Strings
- Text strings are case sensitive by default.
- Case-insensitive matching can be enabled with the `nocase` modifier.
- The `wide` modifier can be used for UTF-16 formatting, BUT it is not true UTF-16 support.
- The `ascii` modifier can be used for ASCII encoding (the default).
- `xor` can be used to search for strings with a single-byte XOR applied to them.
- You can combine all of these modifiers, e.g. `ascii wide nocase xor`.
- You can write `xor(0x01-0xff)` to try a specific range of XOR bytes rather than all 255.
- The `fullword` modifier guarantees that the string will match only if it appears in the file delimited by non-alphanumeric characters.
- Use the `fullword` keyword when a string is short.
Hexadecimal Strings
- Hexadecimal strings allow three special constructions that make them more flexible: wild-cards, jumps, and alternatives.
Wildcards
Wild-cards are placeholders that you can put into the string to indicate that some bytes are unknown and should match anything. The placeholder character is the question mark (`?`).
```yara
rule WildcardExample
{
    strings:
        $hex_string = { E2 34 ?? C8 A? FB }
    condition:
        $hex_string
}
```
Jumps
Jumps can be used when the length of some content varies or is unknown. In the rule below, `[4-6]` matches any sequence of 4 to 6 arbitrary bytes between `F4 23` and `62 B4`.
```yara
rule JumpExample
{
    strings:
        $hex_string = { F4 23 [4-6] 62 B4 }
    condition:
        $hex_string
}
```
Alternatives
Alternatives give a hex string options, similar to alternation in a regular expression. The rule below matches either `F4 23 62 B4 45` or `F4 23 56 45`.
```yara
rule AlternativesExample1
{
    strings:
        $hex_string = { F4 23 ( 62 B4 | 56 ) 45 }
    condition:
        $hex_string
}
```
Regular Expressions
- They are defined in the same way as text strings, but enclosed in forward slashes instead of double-quotes.
- E.g.
/([Cc]at|[Dd]og)/
. - can also be used in conjunction with
nocase
,ascii
,wide
, orfullword
. - Regular expression evaluation is inherently slower than plain string matching and consumes a significant amount of memory.
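A minimal sketch of a rule using a regular expression (the pattern is illustrative):

```yara
rule RegexExample
{
    strings:
        // matches an MD5 hash preceded by the literal "md5: "
        $re1 = /md5: [0-9a-fA-F]{32}/
    condition:
        $re1
}
```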
Counting strings
The number of occurrences of each string is represented by a variable whose name is the string identifier but with a # character in place of the $ character.
```yara
rule CountExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"
    condition:
        #a == 6 and #b > 10
}
```
This rule matches any file or process containing the string $a exactly six times, and more than ten occurrences of string $b.
File Size
- Can use bytes or append
KB
orMB
as required. - E.g.
filesize > 10KB
- Don’t use filesize modifier when scanning memory images.
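A minimal sketch combining both bounds (the thresholds are arbitrary):

```yara
rule FileSizeExample
{
    condition:
        // only consider files between 10 KB and 2 MB
        filesize > 10KB and filesize < 2MB
}
```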
Offsets
The `at` modifier can be used to denote a decimal offset value (or a virtual address for a running process).
The `in` modifier allows a range to be specified.
```yara
rule InExample_AtExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"
        $c = "dummy3"
    condition:
        $a in (0..100) and ($b in (100..filesize) or $c at 150)
}
```
Conditions
Conditions are nothing more than Boolean expressions, such as those found in all programming languages, for example in an if statement. They can contain the typical Boolean operators `and`, `or`, and `not`, and the relational operators `>=`, `<=`, `<`, `>`, `==`, and `!=`. The arithmetic operators (`+`, `-`, `*`, `\`, `%`) and bitwise operators (`&`, `|`, `<<`, `>>`, `~`, `^`) can also be used on numerical expressions.
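A minimal sketch combining several operator types (the thresholds are arbitrary):

```yara
rule ConditionOperatorsExample
{
    strings:
        $a = "dummy"
    condition:
        // boolean, relational and arithmetic operators in one expression
        $a and filesize < 1MB and (#a + 1) >= 2
}
```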
Checking File Headers
- If possible, check the file headers, for both speed and accuracy.
- Don't check file headers when scanning memory images.
Examples (a combined rule sketch follows):
- PE header: `uint16(0) == 0x5A4D`
- ELF header: `uint32(0) == 0x464c457f`
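A minimal sketch anchoring a rule to the PE header (the string is illustrative):

```yara
rule PEHeaderExample
{
    strings:
        $s = "dummy"
    condition:
        // cheap header check first, then the string scan
        uint16(0) == 0x5A4D and $s
}
```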
Comments
```yara
/*
    This is a multi-line comment ...
*/
rule example { // this is a single line comment
```
PE Module
The PE module allows you to create more fine-grained rules for PE files by using attributes and features of the PE file format.
Preface the rule file with `import "pe"` (a short sketch follows).
See PE Module.
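A minimal sketch using the PE module (the attribute checks are illustrative, not a vetted detection):

```yara
import "pe"

rule PEModuleExample
{
    condition:
        // suspicious combination: many sections plus a telltale export
        pe.number_of_sections > 8 and pe.exports("ReflectiveLoader")
}
```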
Math Module
The Math module allows you to calculate certain values from portions of your file and create signatures based on those results.
A useful example (a full rule sketch follows):
`math.entropy(0, filesize) >= 7`
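A minimal sketch (the 7.0 threshold is a common heuristic, not a hard rule):

```yara
import "math"

rule HighEntropyExample
{
    condition:
        // overall file entropy close to 8 suggests packing or encryption
        math.entropy(0, filesize) >= 7.0
}
```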
Rule Tips
- Don't write rules based solely on Windows APIs; this is prone to false positives.
- Don't try to match on run-time generated strings (for files on disk).
- Don't require all criteria to match, e.g. use `5 of them` rather than `all of them`.
What to match
- Mutex
- Rare User Agents
- Registry Keys
- Typos
- PDB Paths
- GUIDs
- Internal module names
- Encoded or encrypted configuration strings
Use clusters of string groups (see the basic rule example above), e.g.:
- unique artifacts found in malware
- Win APIs
- File properties and structure (sections, entropy, timestamp, filesize etc.)
Then use these clusters to create the `condition`, e.g. `any of ($a*) or 5 of ($b*) or 2 of ($c*)`.
Random Tips
- Run the command-line scanner with the `-p` (threads) option to increase performance on SSDs.
- `pe.imphash() == "blah"`, where `blah` must be lowercase hex (see the sketch below).
- Loops vs. strings: strings are faster.
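A minimal sketch (the hash value is a placeholder, not a vetted IOC):

```yara
import "pe"

rule ImphashExample
{
    condition:
        // imphash comparison must be against lowercase hex
        pe.imphash() == "0123456789abcdef0123456789abcdef"
}
```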
Random Rules from Kaspersky Webinar
Chinese-language samples that are signed, but not signed by Microsoft, and that mimic legitimate Microsoft files by re-using their PE meta information (both snippets below assume `import "pe"`):
```yara
condition:
    uint16(0) == 0x5A4D
    and filesize < 1000000
    and pe.version_info["CompanyName"] contains "Microsoft"
    and pe.number_of_signatures > 0
    and not for all i in (0..pe.number_of_signatures - 1):
        (pe.signatures[i].issuer contains "Microsoft" or pe.signatures[i].issuer contains "Verisign")
    and pe.language(0x04) // LANG_CHINESE
```
Malware claiming to target AMD64 or IA64, but with a compile timestamp that predates the first 64-bit Windows releases:
```yara
condition:
    uint16(0) == 0x5A4D
    and (pe.machine == pe.MACHINE_AMD64 or pe.machine == pe.MACHINE_IA64)
    and pe.timestamp > 631155661 // 1990-01-01
    and pe.timestamp < 1072915200 // 2004-01-01
    and filesize > 2000000
```
YARA IDDQD God Mode Rule
Source: Neo23x0/god-mode-rules/
```yara
/*
_____ __ __ ___ __
/ ___/__ ___/ / / |/ /__ ___/ /__
/ (_ / _ \/ _ / / /|_/ / _ \/ _ / -_)
\___/\___/\_,_/_/_/__/_/\___/\_,_/\__/
\ \/ / _ | / _ \/ _ | / _ \__ __/ /__
\ / __ |/ , _/ __ | / , _/ // / / -_)
/_/_/ |_/_/|_/_/ |_| /_/|_|\_,_/_/\__/
Florian Roth - v0.8.0 December 2023 - Merry Christmas!
The 'God Mode Rule' is a proof-of-concept YARA rule designed to
identify a wide range of security threats. It includes detections for
Mimikatz usage, Metasploit Meterpreter payloads, PowerShell obfuscation
and encoded payloads, various malware indicators, and specific hacking
tools. This rule also targets ransomware behaviors, such as
shadow copy deletion commands, and patterns indicative of crypto mining.
It's further enhanced to detect obfuscation techniques and signs of
advanced persistent threats (APTs), including unique strings from
well-known hacking tools and frameworks.
*/
rule IDDQD_God_Mode_Rule {
meta:
description = "Detects a wide array of cyber threats, from malware and ransomware to advanced persistent threats (APTs)"
author = "Florian Roth"
reference = "Internal Research - get a god mode rule set with THOR by Nextron Systems"
date = "2019-05-15"
modified = "2023-12-23"
score = 60
strings:
$ = "sekurlsa::logonpasswords" ascii wide nocase /* Mimikatz Command */
$ = "ERROR kuhl" wide xor /* Mimikatz Error */
$ = " -w hidden " ascii wide /* Power Shell Params */
$ = "Koadic." ascii /* Koadic Framework */
$ = "ReflectiveLoader" fullword ascii wide /* Generic - Common Export Name */
$ = "%s as %s\\%s: %d" ascii xor /* CobaltStrike indicator */
$ = "[System.Convert]::FromBase64String(" ascii /* PowerShell - Base64 encoded payload */
$ = "/meterpreter/" ascii /* Metasploit Framework - Meterpreter */
$ = / -[eE][decoman]{0,41} ['"]?(JAB|SUVYI|aWV4I|SQBFAFgA|aQBlAHgA|cgBlAG)/ ascii wide /* PowerShell encoded code */
$ = / (sEt|SEt|SeT|sET|seT) / ascii wide /* Casing Obfuscation */
$ = ");iex " nocase ascii wide /* PowerShell - compact code */
$ = "Nir Sofer" fullword wide /* Hack Tool Producer */
$ = "impacket." ascii /* Impacket Library */
$ = /\[[\+\-!E]\] (exploit|target|vulnerab|shell|inject)/ nocase /* Hack Tool Output Pattern */
$ = "0000FEEDACDC}" ascii wide /* Squiblydoo - Class ID */
$ = "vssadmin delete shadows" ascii nocase /* Shadow Copy Deletion via vssadmin - often used in ransomware */
$ = " shadowcopy delete" ascii wide nocase /* Shadow Copy Deletion via WMIC - often used in ransomware */
$ = " delete catalog -quiet" ascii wide nocase /* Shadow Copy Deletion via wbadmin - often used in ransomware */
$ = "stratum+tcp://" ascii wide /* Stratum Address - used in Crypto Miners */
$ = /\\(Debug|Release)\\(Key[lL]og|[Ii]nject|Steal|By[Pp]ass|Amsi|Dropper|Loader|CVE\-)/ /* Typical PDB strings found in malware or hack tools */
$ = /(Dropper|Bypass|Injection|Potato)\.pdb/ nocase /* Typical PDB strings found in hack tools */
$ = "Mozilla/5.0" xor(0x01-0xff) ascii wide /* XORed Mozilla user agent - often found in implants */
$ = "amsi.dllATVSH" ascii xor /* Havoc C2 */
$ = "BeaconJitter" xor /* Sliver */
$ = "main.Merlin" ascii fullword /* Merlin C2 */
$ = { 48 83 EC 50 4D 63 68 3C 48 89 4D 10 } /* Brute Ratel C4 */
$ = "}{0}\"-f " ascii /* PowerShell obfuscation - format string */
$ = "HISTORY=/dev/null" ascii /* Linux HISTORY tampering - found in many samples */
$ = " /tmp/x;" ascii /* Often used in malicious linux scripts */
$ = /comsvcs(\.dll)?[, ]{1,2}(MiniDump|#24)/ /* Process dumping method using comsvcs.dll's MiniDump */
$ = "AmsiScanBuffer" ascii wide base64 /* AMSI Bypass */
$ = "AmsiScanBuffer" xor(0x01-0xff) /* AMSI Bypass */
$ = "%%%%%%%%%%%######%%%#%%####% &%%**#" ascii wide xor /* SeatBelt */
condition:
1 of them
}
```
Resources
Yara Binaries
Yara Readthedocs
Yara Git Repo
Kaspersky Yara Webinar
yaraPCAP
Kaspersky Klara
YarGen
Benji’s Linux DFIR Script
This Bash script is designed for conducting digital forensics on Linux systems. The script automates the collection of a wide range of system and user data that could be used during DFIR investigations.
Features:
- System Information: Collects basic system information including uptime, startup time, and hardware clock readouts.
- Operating System Details: Extracts information about the operating system installation, including installer logs and file system details.
- Network Information: Gathers network configuration, IP addresses, and network interface details.
- Installed Programs: Lists all installed packages using both `rpm` and `dpkg`.
- Hardware Information: Retrieves detailed information about PCI devices, hardware summaries, and BIOS data.
- System Logs: Captures system journal logs and the contents of the `/var/log` directory.
- User Data: Extracts user-specific data such as recently used files, plus bash and zsh command history.
- Memory Dump: Performs a memory dump for detailed analysis.
- Process Information: Captures information about current running processes.
- User Login History: Records user login history and scheduled tasks.
- Secure Output Handling: Compresses and encrypts the gathered data for security.
Usage:
Install the prerequisites, make the script executable, then run it as root:

```bash
sudo apt-get install util-linux # For Debian/Ubuntu
sudo yum install util-linux     # For CentOS/RHEL
chmod +x DFLinux.sh
sudo ./DFLinux.sh
```
Output: Check the specified output directory for the collected data.
Requirements:
- The script is intended for use on Linux systems.
- Please make sure you have the necessary permissions to execute the script and access system files.
- Required tools: dump, gpg, netstat, ifconfig, lshw, dmidecode, etc., should be installed.
```bash
#!/bin/bash
output_dir="/tmp/ExtractedInfo"
logfile="$output_dir/forensics_log.txt"
initialize_logs() {
    mkdir -p "$output_dir" # ensure the output directory exists before logging
    echo "Forensic data extraction started at $(date)" > "$logfile"
}
write_output() {
    local command="$1"
    local filename="$2"
    if $command > "$output_dir/$filename" 2>&1; then
        echo "Successfully executed: $command" >> "$logfile"
    else
        echo "Failed to execute: $command" >> "$logfile"
    fi
}
check_and_write_history() {
    local user_home="$1"
    local username="$2"
    if [ -f "$user_home/.bash_history" ]; then
        write_output "cat $user_home/.bash_history" "bash_command_history_$username.txt"
    else
        echo "No .bash_history for $username" >> "$output_dir/bash_command_history_$username.txt"
    fi
    if [ -f "$user_home/.zsh_history" ]; then
        write_output "cat $user_home/.zsh_history" "zsh_command_history_$username.txt"
    else
        echo "No .zsh_history for $username" >> "$output_dir/zsh_command_history_$username.txt"
    fi
    write_output "cat $user_home/.local/share/recently-used.xbel" "recently_used_files_$username.txt"
}
initialize_logs
# System Information Extraction
write_output "uptime -p" "system_uptime.txt"
write_output "uptime -s" "system_startup_time.txt"
write_output "date" "current_system_date.txt"
write_output "date +%s" "current_unix_timestamp.txt"
if command -v hwclock &>/dev/null; then
    write_output "hwclock -r" "hardware_clock_readout.txt"
else
    echo "hwclock command not found" >> "$logfile"
fi
# Operating System Installation Date
write_output "df -P /" "root_filesystem_info.txt"
write_output "ls -l /var/log/installer" "os_installer_log.txt"
write_output "tune2fs -l /dev/sda1" "root_partition_filesystem_details.txt" # Check for correct root partition
# Network Information
write_output "ifconfig" "network_configuration.txt"
write_output "ip addr" "ip_address_info.txt"
write_output "netstat -i" "network_interfaces.txt"
# Installed Programs
write_output "dpkg -l" "dpkg_installed_packages.txt" # Replaced 'apt' with 'dpkg -l'
write_output "rpm -qa" "rpm_installed_packages.txt"
# Hardware Information
write_output "lspci" "pci_device_list.txt"
write_output "lshw -short" "hardware_summary_report.txt"
write_output "dmidecode" "dmi_bios_info.txt"
# System Logs and Usage
write_output "journalctl" "system_journal_logs.txt"
write_output "ls -lah /var/log/" "var_log_directory_listing.txt"
for user_home in /home/*; do
    username=$(basename "$user_home")
    check_and_write_history "$user_home" "$username"
done
if crontab -l &>/dev/null; then
    write_output "crontab -l" "scheduled_cron_jobs_root.txt"
else
    echo "No crontab for root" >> "$output_dir/scheduled_cron_jobs_root.txt"
fi
tar -czf "$output_dir/user_data.tar.gz" -C "$output_dir" ./*.txt --remove-files
echo "Data extraction complete. Check the $output_dir directory for output." >> "$logfile"
echo "Forensic data extraction completed at $(date)" >> "$logfile"
echo "Data extraction complete. Check the $output_dir directory for output."
```
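The feature list above mentions encryption, but the excerpt only compresses the results. A minimal sketch of adding symmetric encryption with `gpg` after the `tar` step (the cipher choice and passphrase handling are assumptions):

```bash
# encrypt the archive with a passphrase; AES256 is an explicit choice here
gpg --symmetric --cipher-algo AES256 "$output_dir/user_data.tar.gz"
# optionally remove the plaintext archive afterwards
rm "$output_dir/user_data.tar.gz"
```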