Digital Forensics and Incident Response (DFIR) is a field within cybersecurity that focuses on the identification, investigation, and remediation of cyber attacks.
DFIR has two main components:
- Digital Forensics: A subset of forensic science that examines system data, user activity, and other pieces of digital evidence to determine if an attack is in progress and who may be behind the activity.
- Incident Response: The overarching process that an organization will follow in order to prepare for, detect, contain, and recover from a data breach.
Best resources to dive into DFIR
- thedfirreport.com - Learn from attacks by reading DFIR Reports and adopt IOCs
- detection.fyi - Awesome resource for tons of Sigma Rules (thedfirreport also has great Sigma Rules)
- Sigma-CLI - Starting point for learning Sigma Rules
- Tsurugi-DFIR-Linux
- FalconHound - FalconHound is a blue team multi-tool. It allows you to utilize and enhance the power of BloodHound in a more automated fashion.
- YARA Forge
- YARA Forge Blog Post
Sigma Rules
What are Sigma Rules?
Sigma is a generic signature format for describing detections over log events. Rules are easy to write and applicable to any type of log. Best of all, Sigma rules are abstract and not tied to any particular SIEM, which makes them easy to share.
Once a Cyber Security Researcher, Defender or Analyst develops a detection method, they can use Sigma to describe and share their technique with others. Here’s a quote from Sigma HQ:
Sigma is for log files what Snort is for network traffic and YARA is for files.
At a high level, Sigma sits between log sources and SIEM-specific query languages: a rule is written once in YAML and then converted into the query format of each target platform.
Let’s look at an example from Sigma HQ:
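The rule below is a sketch modeled on Sigma HQ's ReGeorg webshell web-log rule (field values abridged; the upstream rule may differ slightly):

```yaml
title: Webshell ReGeorg Detection Via Web Logs
status: stable
description: Certain strings in the URI query, combined with an empty referrer and user agent, indicate the ReGeorg webshell
logsource:
    category: webserver
detection:
    selection:
        cs-uri-query|contains:
            - 'cmd=read'
            - 'connect&target'
            - 'cmd=connect'
            - 'cmd=disconnect'
            - 'cmd=forward'
    filter:
        cs-referer: null
        cs-user-agent: null
        cs-method: POST
    condition: selection and filter
level: high
```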
The heart of the above rule is the `detection` section. When the `condition` evaluates to true, the rule has made a detection. A condition is composed of named expressions; here, a `selection` and a `filter` expression are declared. These expressions perform tests against attributes of the logs (in this particular case, web logs).
Anatomy of a Sigma Rule
Source: Twitter - fr0gger
Sigmac Generates SQL
The sigmac compiler is used to translate an abstract Sigma rule into a concrete form that can be evaluated by an actual SIEM or processing platform. Sigmac has many backends, capable of translating a rule into queries for SIEMs such as Splunk, QRadar, Elasticsearch, ArcSight, and FortiSIEM, as well as generic SQL.
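A minimal sketch of the invocation (the rule filename is illustrative; legacy sigmac takes a target backend and a rule file, while the newer sigma-cli uses `sigma convert` with an installed backend plugin):

```bash
# legacy compiler from the SigmaHQ repo
sigmac -t sql web_webshell_regeorg.yml

# rough sigma-cli equivalent, assuming the sqlite backend plugin is installed
sigma convert -t sqlite web_webshell_regeorg.yml
```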
Using the SQL sigmac backend we can translate the rule above into:
```sql
SELECT *
FROM (
    SELECT
        (
            (cs-uri-query LIKE '%cmd=read%'
             OR cs-uri-query LIKE '%connect&target%'
             OR cs-uri-query LIKE '%cmd=connect%'
             OR cs-uri-query LIKE '%cmd=disconnect%'
             OR cs-uri-query LIKE '%cmd=forward%')
            AND (cs-referer IS NULL
                 AND cs-user-agent IS NULL
                 AND cs-method LIKE 'POST')
        ) AS web_webshell_regeorg,
        *
    FROM
        test_webserver_logs
)
WHERE
    web_webshell_regeorg = TRUE
```
These SQL statements are typically invoked by a scheduler on a particular trigger interval, say 1 hour. Every hour the detection searches through the newest events. Such a scheduler could also be built with Spark or other tooling.
However, some Sigma rules apply temporal aggregations. For example, Enumeration via the Global Catalog counts occurrences of events within a window of time:
```yaml
detection:
    selection:
        EventID: 5156
        DestPort:
            - 3268
            - 3269
    timeframe: 1h
    condition: selection | count() by SourceAddress > 2000
```
Using the batch model above, these types of queries reprocess the same events over and over, especially if the correlation window is large. Reducing detection latency by shortening the trigger interval, say to every 5 minutes, only introduces even more reprocessing of the same events.
Ideally, to avoid reprocessing the same events again and again, the detection should remember the last event it processed and the values of its counters. That is exactly what the Spark Structured Streaming framework was built for: a streaming query triggers a micro-batch every minute (configurable), reads in the new events, updates all counters, and persists them (for disaster recovery).
Spark Structured Streaming can easily evaluate the SQL produced by the sigmac compiler. To achieve this, create a streaming data frame by connecting to a queuing mechanism (Event Hubs, Pub/Sub, Kafka, …).
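A minimal sketch of that setup with PySpark, assuming a Kafka topic `webserver-logs`, a simplified log schema, and an abridged version of the generated SQL (broker address, topic, schema, and sink are all assumptions for illustration):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StringType

spark = SparkSession.builder.appName("sigma-streaming").getOrCreate()

# simplified web-log schema; real logs carry many more fields
schema = (StructType()
          .add("cs-uri-query", StringType())
          .add("cs-referer", StringType())
          .add("cs-user-agent", StringType())
          .add("cs-method", StringType()))

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
          .option("subscribe", "webserver-logs")             # assumed topic
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

events.createOrReplaceTempView("test_webserver_logs")

# abridged version of the sigmac-generated SQL shown above
hits = spark.sql("""
    SELECT * FROM test_webserver_logs
    WHERE (`cs-uri-query` LIKE '%cmd=connect%' OR `cs-uri-query` LIKE '%cmd=read%')
      AND `cs-referer` IS NULL AND `cs-user-agent` IS NULL AND `cs-method` = 'POST'
""")

# each micro-batch processes only new events; state survives restarts
query = (hits.writeStream
         .format("console")
         .option("checkpointLocation", "/tmp/sigma-checkpoint")
         .trigger(processingTime="1 minute")
         .start())
query.awaitTermination()
```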
YARA Rules
All about YARA:
“The pattern matching swiss knife for malware researchers (and everyone else)” (VirusTotal, 2020)
As the quote suggests, YARA can identify information based on both binary and textual patterns, such as hexadecimal byte sequences and strings contained within a file.
Rules are used to label these patterns. For example, YARA rules are frequently written to determine whether a file is malicious, based upon the features, or patterns, it presents. Strings are a fundamental component of programming languages; applications use strings to store data such as text.
For example, the code snippet below prints “Hello World” in Python. The text “Hello World” would be stored as a string.
```python
print("Hello World!")
```
We could write a YARA rule to search for `Hello World!` in every program on our operating system if we would like.
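A minimal sketch of such a rule (the identifier and string are ours):

```yara
rule HelloWorldExample
{
    strings:
        // the text we expect the program to contain
        $hello = "Hello World!" ascii wide
    condition:
        $hello
}
```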
Why does malware use strings? Malware, just like our `Hello World!` application, uses strings to store textual data. For example, ransomware commonly stores a hardcoded wallet address for ransom payments, while botnets often store the IP address or domain of their command-and-control server as a string.
Anatomy of a YARA Rule
Information security researcher “fr0gger_” has recently created a handy cheatsheet that breaks down and visualises the elements of a YARA rule (shown above, all image credits go to him). It’s a great reference point for getting started!
Simple Command Line
Apply rule in /foo/bar/rules to all files in the current directory. Subdirectories are not scanned:
```bash
yara /foo/bar/rules .
```
Scan all files in the /foo directory and its subdirectories:
```bash
yara /foo/bar/rules -r /foo
```
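A few other switches worth knowing (a sketch; check `yara --help` for your version):

```bash
# recurse into /target, print the matched strings, and scan with 8 threads
yara -r -s -p 8 /foo/bar/rules /target
```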
Basic Rule
Basic rule layout with various suggested meta descriptors.
```yara
rule specific_name // e.g. APT_CN_Winnti_exploit or crimeware_ZZ_RAT
{
    meta: // suggested
        author = ""
        date_created = ""
        date_last_modified = ""
        description = ""
        filetype = ""
        hash = ""
        hash = ""
        source = ""
        TLP = ""
        license = ""
        min_yara_version = ""
    strings:
        $a1 = "string type 1-1" ascii wide fullword nocase
        $a2 = "string type 1-2"
        $a3 = "string type 1-3"
        $b1 = "string type 2-1"
        $b2 = "string type 2-2"
        $b3 = "string type 2-3"
        $c1 = "string type 3-1"
        $c2 = "string type 3-2"
        $c3 = "string type 3-3"
    condition:
        any of ($a*) or 2 of ($b*) or 1 of ($c*)
}
```
Strings
- Text strings are case sensitive by default.
- Case-insensitive matching can be enabled with the `nocase` modifier.
- The `wide` modifier can be used for UTF-16 formatting, BUT it is not true UTF-16 support.
- The `ascii` modifier can be used for ASCII encoding (the default).
- `xor` can be used to search for strings with a single-byte XOR applied to them.
- You can combine all of these modifiers, e.g. `ascii wide nocase xor`.
- You can write `xor(0x01-0xff)` to try a specific range of XOR bytes rather than all 255.
- The `fullword` modifier guarantees that the string will match only if it appears in the file delimited by non-alphanumeric characters.
- Use the `fullword` keyword when a string is short.
Hexadecimal Strings
- Hexadecimal strings allow three special constructions that make them more flexible: wild-cards, jumps, and alternatives.
Wildcards
Wild-cards are placeholders that you can put into the string to indicate that some bytes are unknown and should match anything. The placeholder character is the question mark (`?`).
```yara
rule WildcardExample
{
    strings:
        $hex_string = { E2 34 ?? C8 A? FB }
    condition:
        $hex_string
}
```
Jumps
Jumps can be used when the length of some content varies or is unknown. In the rule below, `[4-6]` matches any sequence of 4 to 6 arbitrary bytes between `F4 23` and `62 B4`.
```yara
rule JumpExample
{
    strings:
        $hex_string = { F4 23 [4-6] 62 B4 }
    condition:
        $hex_string
}
```
Alternatives
Alternatives give a hex string options, similar to alternation in a regular expression. The rule below matches either `F4 23 62 B4 45` or `F4 23 56 45`.
```yara
rule AlternativesExample1
{
    strings:
        $hex_string = { F4 23 ( 62 B4 | 56 ) 45 }
    condition:
        $hex_string
}
```
Regular Expressions
- They are defined in the same way as text strings, but enclosed in forward slashes instead of double-quotes.
- E.g.
/([Cc]at|[Dd]og)/
. - can also be used in conjunction with
nocase
,ascii
,wide
, orfullword
. - Regular expression evaluation is inherently slower than plain string matching and consumes a significant amount of memory.
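A minimal sketch of a rule using a regular expression (the pattern is illustrative):

```yara
rule RegexExample
{
    strings:
        // matches an MD5 hash preceded by the literal "md5: "
        $re1 = /md5: [0-9a-fA-F]{32}/
    condition:
        $re1
}
```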
Counting strings
The number of occurrences of each string is represented by a variable whose name is the string identifier but with a # character in place of the $ character.
```yara
rule CountExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"
    condition:
        #a == 6 and #b > 10
}
```
This rule matches any file or process containing the string $a exactly six times, and more than ten occurrences of string $b.
File Size
- Can use bytes or append
KB
orMB
as required. - E.g.
filesize > 10KB
- Don’t use filesize modifier when scanning memory images.
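A minimal sketch combining both bounds (the thresholds are arbitrary):

```yara
rule FileSizeExample
{
    condition:
        // only consider files between 10 KB and 2 MB
        filesize > 10KB and filesize < 2MB
}
```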
Offsets
The `at` modifier can be used to denote a decimal offset value (or a virtual address for a running process).
The `in` modifier allows a range to be specified.
```yara
rule InExample_AtExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"
        $c = "dummy3"
    condition:
        $a in (0..100) and ($b in (100..filesize) or $c at 150)
}
```
Conditions
Conditions are nothing more than Boolean expressions, such as those found in all programming languages, for example in an if statement. They can contain the typical Boolean operators `and`, `or`, and `not`, and the relational operators `>=`, `<=`, `<`, `>`, `==`, and `!=`. The arithmetic operators (`+`, `-`, `*`, `\`, `%`) and bitwise operators (`&`, `|`, `<<`, `>>`, `~`, `^`) can also be used on numerical expressions.
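A minimal sketch combining several operator types (the thresholds are arbitrary):

```yara
rule ConditionOperatorsExample
{
    strings:
        $a = "dummy"
    condition:
        // boolean, relational and arithmetic operators in one expression
        $a and filesize < 1MB and (#a + 1) >= 2
}
```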
Checking File Headers
- If possible, check the file headers, for both speed and accuracy.
- Don't check file headers when scanning memory images.
Examples (a combined rule sketch follows):
- PE header: `uint16(0) == 0x5A4D`
- ELF header: `uint32(0) == 0x464c457f`
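A minimal sketch anchoring a rule to the PE header (the string is illustrative):

```yara
rule PEHeaderExample
{
    strings:
        $s = "dummy"
    condition:
        // cheap header check first, then the string scan
        uint16(0) == 0x5A4D and $s
}
```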
Comments
```yara
/*
    This is a multi-line comment ...
*/
rule example { // this is a single line comment
```
PE Module
The PE module allows you to create more fine-grained rules for PE files by using attributes and features of the PE file format.
Preface the rule file with `import "pe"` (a short sketch follows).
See PE Module.
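A minimal sketch using the PE module (the attribute checks are illustrative, not a vetted detection):

```yara
import "pe"

rule PEModuleExample
{
    condition:
        // suspicious combination: many sections plus a telltale export
        pe.number_of_sections > 8 and pe.exports("ReflectiveLoader")
}
```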
Math Module
The Math module allows you to calculate certain values from portions of your file and create signatures based on those results.
A useful example (a full rule sketch follows):
`math.entropy(0, filesize) >= 7`
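A minimal sketch (the 7.0 threshold is a common heuristic, not a hard rule):

```yara
import "math"

rule HighEntropyExample
{
    condition:
        // overall file entropy close to 8 suggests packing or encryption
        math.entropy(0, filesize) >= 7.0
}
```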
Rule Tips
- Don't write rules based solely on Windows APIs; this is prone to false positives.
- Don't try to match on run-time generated strings (for files on disk).
- Don't require all criteria to match, e.g. use `5 of them` rather than `all of them`.
What to match
- Mutex
- Rare User Agents
- Registry Keys
- Typos
- PDB Paths
- GUIDs
- Internal module names
- Encoded or encrypted configuration strings
Use clusters of string groups (see the basic rule example above), e.g.:
- unique artifacts found in malware
- Win APIs
- File properties and structure (sections, entropy, timestamp, filesize etc.)
Then use these clusters to create the `condition`, e.g. `any of ($a*) or 5 of ($b*) or 2 of ($c*)`.
Random Tips
- Run the command-line scanner with the `-p` (threads) option to increase performance on SSDs.
- `pe.imphash() == "blah"`, where `blah` must be lowercase hex (see the sketch below).
- Loops vs. strings: strings are faster.
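A minimal sketch (the hash value is a placeholder, not a vetted IOC):

```yara
import "pe"

rule ImphashExample
{
    condition:
        // imphash comparison must be against lowercase hex
        pe.imphash() == "0123456789abcdef0123456789abcdef"
}
```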
Random Rules from Kaspersky Webinar
Chinese-language samples that are signed, but not signed by Microsoft, and that mimic legitimate Microsoft files by re-using their PE meta information (both snippets below assume `import "pe"`):
```yara
condition:
    uint16(0) == 0x5A4D
    and filesize < 1000000
    and pe.version_info["CompanyName"] contains "Microsoft"
    and pe.number_of_signatures > 0
    and not for all i in (0..pe.number_of_signatures - 1):
        (pe.signatures[i].issuer contains "Microsoft" or pe.signatures[i].issuer contains "Verisign")
    and pe.language(0x04) // LANG_CHINESE
```
Malware claiming to target AMD64 or IA64, but with a compile timestamp that predates the first 64-bit Windows releases:
```yara
condition:
    uint16(0) == 0x5A4D
    and (pe.machine == pe.MACHINE_AMD64 or pe.machine == pe.MACHINE_IA64)
    and pe.timestamp > 631155661 // 1990-01-01
    and pe.timestamp < 1072915200 // 2004-01-01
    and filesize > 2000000
```
YARA IDDQD God Mode Rule
Source: Neo23x0/god-mode-rules/
```yara
/*
_____ __ __ ___ __
/ ___/__ ___/ / / |/ /__ ___/ /__
/ (_ / _ \/ _ / / /|_/ / _ \/ _ / -_)
\___/\___/\_,_/_/_/__/_/\___/\_,_/\__/
\ \/ / _ | / _ \/ _ | / _ \__ __/ /__
\ / __ |/ , _/ __ | / , _/ // / / -_)
/_/_/ |_/_/|_/_/ |_| /_/|_|\_,_/_/\__/
Florian Roth - v0.8.0 December 2023 - Merry Christmas!
The 'God Mode Rule' is a proof-of-concept YARA rule designed to
identify a wide range of security threats. It includes detections for
Mimikatz usage, Metasploit Meterpreter payloads, PowerShell obfuscation
and encoded payloads, various malware indicators, and specific hacking
tools. This rule also targets ransomware behaviors, such as
shadow copy deletion commands, and patterns indicative of crypto mining.
It's further enhanced to detect obfuscation techniques and signs of
advanced persistent threats (APTs), including unique strings from
well-known hacking tools and frameworks.
*/
rule IDDQD_God_Mode_Rule {
meta:
description = "Detects a wide array of cyber threats, from malware and ransomware to advanced persistent threats (APTs)"
author = "Florian Roth"
reference = "Internal Research - get a god mode rule set with THOR by Nextron Systems"
date = "2019-05-15"
modified = "2023-12-23"
score = 60
strings:
$ = "sekurlsa::logonpasswords" ascii wide nocase /* Mimikatz Command */
$ = "ERROR kuhl" wide xor /* Mimikatz Error */
$ = " -w hidden " ascii wide /* Power Shell Params */
$ = "Koadic." ascii /* Koadic Framework */
$ = "ReflectiveLoader" fullword ascii wide /* Generic - Common Export Name */
$ = "%s as %s\\%s: %d" ascii xor /* CobaltStrike indicator */
$ = "[System.Convert]::FromBase64String(" ascii /* PowerShell - Base64 encoded payload */
$ = "/meterpreter/" ascii /* Metasploit Framework - Meterpreter */
$ = / -[eE][decoman]{0,41} ['"]?(JAB|SUVYI|aWV4I|SQBFAFgA|aQBlAHgA|cgBlAG)/ ascii wide /* PowerShell encoded code */
$ = / (sEt|SEt|SeT|sET|seT) / ascii wide /* Casing Obfuscation */
$ = ");iex " nocase ascii wide /* PowerShell - compact code */
$ = "Nir Sofer" fullword wide /* Hack Tool Producer */
$ = "impacket." ascii /* Impacket Library */
$ = /\[[\+\-!E]\] (exploit|target|vulnerab|shell|inject)/ nocase /* Hack Tool Output Pattern */
$ = "0000FEEDACDC}" ascii wide /* Squiblydoo - Class ID */
$ = "vssadmin delete shadows" ascii nocase /* Shadow Copy Deletion via vssadmin - often used in ransomware */
$ = " shadowcopy delete" ascii wide nocase /* Shadow Copy Deletion via WMIC - often used in ransomware */
$ = " delete catalog -quiet" ascii wide nocase /* Shadow Copy Deletion via wbadmin - often used in ransomware */
$ = "stratum+tcp://" ascii wide /* Stratum Address - used in Crypto Miners */
$ = /\\(Debug|Release)\\(Key[lL]og|[Ii]nject|Steal|By[Pp]ass|Amsi|Dropper|Loader|CVE\-)/ /* Typical PDB strings found in malware or hack tools */
$ = /(Dropper|Bypass|Injection|Potato)\.pdb/ nocase /* Typical PDB strings found in hack tools */
$ = "Mozilla/5.0" xor(0x01-0xff) ascii wide /* XORed Mozilla user agent - often found in implants */
$ = "amsi.dllATVSH" ascii xor /* Havoc C2 */
$ = "BeaconJitter" xor /* Sliver */
$ = "main.Merlin" ascii fullword /* Merlin C2 */
$ = { 48 83 EC 50 4D 63 68 3C 48 89 4D 10 } /* Brute Ratel C4 */
$ = "}{0}\"-f " ascii /* PowerShell obfuscation - format string */
$ = "HISTORY=/dev/null" ascii /* Linux HISTORY tampering - found in many samples */
$ = " /tmp/x;" ascii /* Often used in malicious linux scripts */
$ = /comsvcs(\.dll)?[, ]{1,2}(MiniDump|#24)/ /* Process dumping method using comsvcs.dll's MiniDump */
$ = "AmsiScanBuffer" ascii wide base64 /* AMSI Bypass */
$ = "AmsiScanBuffer" xor(0x01-0xff) /* AMSI Bypass */
$ = "%%%%%%%%%%%######%%%#%%####% &%%**#" ascii wide xor /* SeatBelt */
condition:
1 of them
}
```
Resources
Yara Binaries
Yara Readthedocs
Yara Git Repo
Kaspersky Yara Webinar
yaraPCAP
Kaspersky Klara
YarGen
Benji’s Linux DFIR Script
This Bash script is designed for conducting digital forensics on Linux systems. The script automates the collection of a wide range of system and user data that could be used during DFIR investigations.
Features:
- System Information: Collects basic system information including uptime, startup time, and hardware clock readouts.
- Operating System Details: Extracts information about the operating system installation, including installer logs and file system details.
- Network Information: Gathers network configuration, IP addresses, and network interface details.
- Installed Programs: Lists all installed packages using both `rpm` and `dpkg`.
- Hardware Information: Retrieves detailed information about PCI devices, hardware summaries, and BIOS data.
- System Logs: Captures system journal logs and the contents of the `/var/log` directory.
- User Data: Extracts user-specific data such as recently used files, plus bash and zsh command history.
- Memory Dump: Performs a memory dump for detailed analysis.
- Process Information: Captures information about current running processes.
- User Login History: Records user login history and scheduled tasks.
- Secure Output Handling: Compresses and encrypts the gathered data for security.
Usage:
Install the prerequisites, make the script executable, then run it as root:

```bash
sudo apt-get install util-linux # For Debian/Ubuntu
sudo yum install util-linux     # For CentOS/RHEL
chmod +x DFLinux.sh
sudo ./DFLinux.sh
```
Output: Check the specified output directory for the collected data.
Requirements:
- The script is intended for use on Linux systems.
- Please make sure you have the necessary permissions to execute the script and access system files.
- Required tools: dump, gpg, netstat, ifconfig, lshw, dmidecode, etc., should be installed.
```bash
#!/bin/bash
output_dir="/tmp/ExtractedInfo"
logfile="$output_dir/forensics_log.txt"
initialize_logs() {
    mkdir -p "$output_dir" # ensure the output directory exists before logging
    echo "Forensic data extraction started at $(date)" > "$logfile"
}
write_output() {
    local command="$1"
    local filename="$2"
    if $command > "$output_dir/$filename" 2>&1; then
        echo "Successfully executed: $command" >> "$logfile"
    else
        echo "Failed to execute: $command" >> "$logfile"
    fi
}
check_and_write_history() {
    local user_home="$1"
    local username="$2"
    if [ -f "$user_home/.bash_history" ]; then
        write_output "cat $user_home/.bash_history" "bash_command_history_$username.txt"
    else
        echo "No .bash_history for $username" >> "$output_dir/bash_command_history_$username.txt"
    fi
    if [ -f "$user_home/.zsh_history" ]; then
        write_output "cat $user_home/.zsh_history" "zsh_command_history_$username.txt"
    else
        echo "No .zsh_history for $username" >> "$output_dir/zsh_command_history_$username.txt"
    fi
    write_output "cat $user_home/.local/share/recently-used.xbel" "recently_used_files_$username.txt"
}
initialize_logs
# System Information Extraction
write_output "uptime -p" "system_uptime.txt"
write_output "uptime -s" "system_startup_time.txt"
write_output "date" "current_system_date.txt"
write_output "date +%s" "current_unix_timestamp.txt"
if command -v hwclock &>/dev/null; then
    write_output "hwclock -r" "hardware_clock_readout.txt"
else
    echo "hwclock command not found" >> "$logfile"
fi
# Operating System Installation Date
write_output "df -P /" "root_filesystem_info.txt"
write_output "ls -l /var/log/installer" "os_installer_log.txt"
write_output "tune2fs -l /dev/sda1" "root_partition_filesystem_details.txt" # Check for correct root partition
# Network Information
write_output "ifconfig" "network_configuration.txt"
write_output "ip addr" "ip_address_info.txt"
write_output "netstat -i" "network_interfaces.txt"
# Installed Programs
write_output "dpkg -l" "dpkg_installed_packages.txt" # Replaced 'apt' with 'dpkg -l'
write_output "rpm -qa" "rpm_installed_packages.txt"
# Hardware Information
write_output "lspci" "pci_device_list.txt"
write_output "lshw -short" "hardware_summary_report.txt"
write_output "dmidecode" "dmi_bios_info.txt"
# System Logs and Usage
write_output "journalctl" "system_journal_logs.txt"
write_output "ls -lah /var/log/" "var_log_directory_listing.txt"
for user_home in /home/*; do
    username=$(basename "$user_home")
    check_and_write_history "$user_home" "$username"
done
if crontab -l &>/dev/null; then
    write_output "crontab -l" "scheduled_cron_jobs_root.txt"
else
    echo "No crontab for root" >> "$output_dir/scheduled_cron_jobs_root.txt"
fi
tar -czf "$output_dir/user_data.tar.gz" -C "$output_dir" ./*.txt --remove-files
echo "Data extraction complete. Check the $output_dir directory for output." >> "$logfile"
echo "Forensic data extraction completed at $(date)" >> "$logfile"
echo "Data extraction complete. Check the $output_dir directory for output."
```
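The feature list above mentions encryption, but the excerpt only compresses the results. A minimal sketch of adding symmetric encryption with `gpg` after the `tar` step (the cipher choice and passphrase handling are assumptions):

```bash
# encrypt the archive with a passphrase; AES256 is an explicit choice here
gpg --symmetric --cipher-algo AES256 "$output_dir/user_data.tar.gz"
# optionally remove the plaintext archive afterwards
rm "$output_dir/user_data.tar.gz"
```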