Signature Evasion
An adversary may struggle to overcome specific detections when facing an advanced anti-virus engine or EDR (Endpoint Detection & Response) solution. Even after employing some of the most common obfuscation or evasion techniques discussed in Obfuscation Principles, signatures in a malicious file may still be present.
To combat persistent signatures, adversaries can observe each individually and address them as needed.
In this room, we will understand what signatures are and how to find them, then attempt to break them following an agnostic thought process. To dive deeper and combat heuristic signatures, we will also discuss more advanced code concepts and “malware best practices.”
Learning Objectives
Understand the origins of signatures and how to observe/detect them in malicious code
Implement documented obfuscation methodology to break signatures
Leverage non-obfuscation-based techniques to break non-function-oriented signatures
Before beginning this room, familiarize yourself with basic programming logic and syntax. Knowledge of C and PowerShell is recommended but not required.
Signature Identification
Before jumping into breaking signatures, we need to understand and identify what we are looking for. As covered in Introduction to Anti-Virus, signatures are used by anti-virus engines to track and identify possible suspicious and/or malicious programs. In this task, we will observe how we can manually identify an exact byte where a signature starts.
When identifying signatures, whether manually or automated, we must employ an iterative process to determine what byte a signature starts at. By recursively splitting a compiled binary in half and testing it, we can get a rough estimate of a byte-range to investigate further.
We can use the native utilities head, dd, or split to split a compiled binary. In the command prompt below, we will walk through using head to find the first signature present in an msfvenom binary.
Once split, move the binary from your development environment to a machine with the anti-virus engine you would like to test on. If an alert appears, move to the lower half of the split binary and split it again. If an alert does not appear, move to the upper half of the split binary and split it again. Continue this pattern until you cannot determine where to go; this will typically occur around the kilobyte range.
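The halving procedure described above is a binary search over prefix lengths. Below is a minimal Python sketch of that search, under stated assumptions: first_flagged_length, toy_scanner, and MARKER are hypothetical names, and toy_scanner is only a stand-in for the manual step of moving a chunk to the test machine and checking for an alert. The sketch also assumes detection is monotonic, meaning that once a prefix is flagged, every longer prefix is flagged too.

```python
from typing import Callable


def first_flagged_length(data: bytes, is_flagged: Callable[[bytes], bool]) -> int:
    """Return the shortest prefix length of `data` that `is_flagged` reports.

    Assumes the empty prefix is clean and that detection is monotonic.
    """
    if not is_flagged(data):
        raise ValueError("the full input is not flagged; nothing to search for")

    low, high = 0, len(data)  # prefix of length `low` is clean, `high` is flagged
    while high - low > 1:
        mid = (low + high) // 2
        if is_flagged(data[:mid]):
            high = mid  # alert: the signature ends at or before `mid`
        else:
            low = mid   # no alert: the signature ends after `mid`
    return high


# Hypothetical stand-in for an engine: flags any chunk containing MARKER.
MARKER = b"EXAMPLE-SIGNATURE"


def toy_scanner(chunk: bytes) -> bool:
    return MARKER in chunk


sample = b"\x00" * 1000 + MARKER + b"\x00" * 500
end = first_flagged_length(sample, toy_scanner)
print(end)  # 1017: the offset just past the last byte of the marker
```

Because the search works on prefixes, it reports where the flagged content ends; the bytes immediately before that offset are the ones to inspect in a hex editor.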
Once you have reached the point at which you can no longer accurately split the binary, you can use a hex editor to view the end of the binary where the signature is present.
We have the location of a signature; how human-readable it is will be determined by the tool itself and the compilation method.
Now… no one wants to spend hours going back and forth trying to track down bad bytes; let’s automate it! In the next task, we will look at a few FOSS (Free and Open-Source Software) solutions to aid us in identifying signatures in compiled code.
Automating Signature Identification
The process shown in the previous task can be quite arduous. To speed it up, we can automate it using scripts to split bytes over an interval for us. Find-AVSignature will split a provided range of bytes through a given interval.
Find-AVSignature
This script relieves a lot of the manual work, but still has several limitations. Although it requires less interaction than the previous task, it still requires an appropriate interval to be set to function properly. This script will also only observe strings of the binary when dropped to disk rather than scanning using the full functionality of the anti-virus engine.
To solve this problem, we can use other FOSS tools that leverage the engines themselves to scan the file, including DefenderCheck, ThreatCheck, and AMSITrigger. In this task, we will primarily focus on ThreatCheck and briefly mention the uses of AMSITrigger at the end.
ThreatCheck
ThreatCheck is a fork of DefenderCheck and is arguably the most widely used/reliable of the three. To identify possible signatures, ThreatCheck leverages several anti-virus engines against split compiled binaries and reports where it believes bad bytes are present.
ThreatCheck does not provide a pre-compiled release to the public. For ease of use, we have already compiled the tool for you; it can be found in C:\Users\Administrator\Desktop\Tools of the attached machine.
Below is the basic syntax usage of ThreatCheck.
ThreatCheck Help Menu
For our uses we only need to supply a file and optionally an engine; however, we will primarily want to use AMSITrigger when dealing with AMSI (Anti-Malware Scan Interface), as we will discuss later in this task.
ThreatCheck
It’s that simple! No other configuration or syntax is required, and we can get straight to modifying our tooling. To use this tool efficiently, identify the first bad bytes it reports, break them, and run the tool again, repeating until no signatures are identified.
Note: There may be instances of false negatives, in which the tool will report no bad bytes even though the file is still detected. This will require your own intuition to observe and solve; however, we will discuss this further in task 4.
AMSITrigger
As covered in Runtime Detection Evasion, AMSI leverages the runtime, making signatures harder to identify and resolve. ThreatCheck also does not support certain file types, such as PowerShell scripts, that AMSITrigger does support.
AMSITrigger will leverage the AMSI engine and scan functions against a provided PowerShell script and report any specific sections of code it believes need to be alerted on.
AMSITrigger does provide a pre-compiled release on their GitHub and can also be found on the Desktop of the attached machine.
Below is the syntax usage of AMSITrigger.
AMSITrigger Help Menu
For our uses we only need to supply a file and the preferred format to report signatures.
AMSI Trigger Example
In the next task we will discuss how you can use the information gathered from these tools to break signatures.
Static Code-Based Signatures
Once we have identified a troublesome signature we need to decide how we want to deal with it. Depending on the strength and type of signature, it may be broken using simple obfuscation as covered in Obfuscation Principles, or it may require specific investigation and remedy. In this task, we aim to provide several solutions to remedy static signatures present in functions.
The Layered Obfuscation Taxonomy covers the most reliable solutions as part of the Obfuscating Methods and Obfuscating Classes layer.
Obfuscating Methods
Obfuscation Method | Purpose |
Method Proxy | Creates a proxy method or a replacement object |
Method Scattering/Aggregation | Combine multiple methods into one or scatter a method into several |
Method Clone | Create replicas of a method and randomly call each |
Obfuscating Classes
Obfuscation Method | Purpose |
Class Hierarchy Flattening | Create proxies for classes using interfaces |
Class Splitting/Coalescing | Transfer local variables or instruction groups to another class |
Dropping Modifiers | Remove class modifiers (public, private) and make all members public |
Looking at the above tables, even though they may use specific technical terms or ideas, we can group them into a core set of agnostic methods applicable to any object or data structure.
The techniques class splitting/coalescing and method scattering/aggregation can be grouped into an overarching concept of splitting or merging any given OOP (Object-Oriented Programming) function.
Other techniques such as dropping modifiers or method clone can be grouped into an overarching concept of removing or obscuring identifiable information.
Splitting and Merging Objects
The methodology required to split or merge objects is very similar to the objective of concatenation as covered in Obfuscation Principles.
The premise behind this concept is relatively simple: we are looking to create a new object function that breaks the signature while maintaining the previous functionality.
To provide a more concrete example of this, we can use the well-known case study in Covenant present in the GetMessageFormat string. We will first look at how the solution was implemented, then break it down and apply it to the obfuscation taxonomy.
Original String
Below is the original string that is detected.
Obfuscated Method
Below is the new class used to replace and concatenate the string.
Recapping this case study, class splitting is used to create a new class for the local variable to concatenate. We will cover how to recognize when to use a specific method later in this task and throughout the practical challenge.
Removing and Obscuring Identifiable Information
The core concept behind removing identifiable information is similar to obscuring variable names as covered in Obfuscation Principles. In this task, we are taking it one step further by specifically applying it to identified signatures in any objects including methods and classes.
An example of this can be found in Mimikatz, where an alert is generated for the string wdigest.dll. This can be solved by replacing every instance of the string with the same random identifier. This can be categorized in the obfuscation taxonomy under the method proxy technique.
This is almost identical to the approach discussed in Obfuscation Principles; however, it is applied to a specific situation.
Using the knowledge you have accrued throughout this task, obfuscate the following PowerShell snippet, using AMSITrigger to visualize signatures.
Static Property-Based Signatures
Various detection engines or analysts may consider different indicators rather than strings or static signatures to contribute to their hypothesis. Signatures can be attached to several file properties, including file hash, entropy, author, name, or other identifiable information to be used individually or in conjunction. These properties are often used in rule sets such as YARA or Sigma.
Some properties may be easily manipulated, while others can be more difficult, specifically when dealing with pre-compiled closed-source applications.
This task will discuss manipulating the file hash and entropy of both open-source and closed-source applications.
Note: several other properties such as PE headers or module properties can be used as indicators. Because these properties often require an agent or other measures to detect, we will not cover them in this room to keep the focus on signatures.
File Hashes
A file hash, also known as a checksum, is used to tag/identify a unique file. Hashes are commonly used to verify a file’s authenticity or its known purpose (malicious or not). File hashes are generally trivial to modify because any modification to the file changes them.
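To illustrate how sensitive a hash is to modification, the following Python sketch hashes a small byte string, changes a single bit, and hashes it again. The sample data and the helper name sha256_hex are illustrative assumptions; only the standard hashlib module is used.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"example file contents\n"

# Change exactly one bit: toggle the lowest bit of the first byte.
mutated = bytearray(original)
mutated[0] ^= 0x01
mutated = bytes(mutated)

original_hash = sha256_hex(original)
mutated_hash = sha256_hex(mutated)

print(original_hash)
print(mutated_hash)
print("hashes differ:", original_hash != mutated_hash)
```

The two inputs differ by one bit, yet the digests are unrelated, which is why a hash-only signature stops matching after any change to the file.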
If we have access to the source for an application, we can modify any arbitrary section of the code and re-compile it to create a new hash. That solution is straightforward, but what if we need a pre-compiled or signed application?
When dealing with a signed or closed-source application, we must employ bit-flipping.
Bit-flipping is a common cryptographic attack that will mutate a given application by flipping and testing each possible bit until it finds a viable bit. By flipping one viable bit, it will change the signature and hash of the application while maintaining all functionality.
We can use a script to create a bit-flipped list by flipping each bit and creating a new mutated variant (~3000 - 200000 variants). Below is an example of a Python bit-flipping implementation.
Once the list is created, we must search for intact unique properties of the file. For example, if we are bit-flipping msbuild, we need to use signtool to search for a file with a usable certificate. This will guarantee that the functionality of the file is not broken, and the application will maintain its signed attribution.
We can leverage a script to loop through the bit-flipped list and verify functional variants. Below is an example of a batch script implementation.
This technique can be very effective, although it can take a long time and will only work for a limited period until the new hash is discovered. Below is a comparison of the original MSBuild application and the bit-flipped variation.
Entropy
IBM defines entropy as “the randomness of the data in a file used to determine whether a file contains hidden data or suspicious scripts.” EDRs and other scanners often leverage entropy to identify potentially suspicious files or to contribute to an overall malicious score.
Entropy can be problematic for obfuscated scripts, specifically when obscuring identifiable information such as variables or functions.
To lower entropy, we can replace random identifiers with randomly selected English words. For example, we may change a variable from q234uf to nature.
To prove the efficacy of changing identifiers, we can observe how the entropy changes using CyberChef.
Below is the Shannon entropy of a standard English paragraph.
Shannon entropy: 4.587362034903882
Below is the Shannon entropy of a small script with random identifiers.
Shannon entropy: 5.341436973971389
Depending on the EDR employed, a “suspicious” entropy value is anything greater than roughly 6.8.
The difference between a random value and English text will become amplified with a larger file and more occurrences.
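For reference, the Shannon entropy figures above are measured in bits per byte, on a scale from 0 to 8. Below is a minimal Python sketch of that calculation; the function name shannon_entropy and the sample inputs are illustrative assumptions and are not taken from CyberChef's implementation.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        probability = count / total
        entropy += probability * math.log2(1 / probability)
    return entropy


single_value = shannon_entropy(b"aaaa")            # one repeated byte value
four_values = shannon_entropy(b"abcd")             # four equally likely values
all_values = shannon_entropy(bytes(range(256)))    # every byte value once

print(single_value)  # 0.0
print(four_values)   # 2.0
print(all_values)    # 8.0
```

Encrypted or compressed data approaches the upper end of the scale because its byte values are close to uniformly distributed, while English text uses a small and uneven subset of byte values.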
Note that entropy will generally never be used alone and only to support a hypothesis. For example, the entropy for the command pskill and the hivenightmare exploit are almost identical.
To see entropy in action, let’s look at how an EDR would use it to contribute to threat indicators.
In the white paper, An Empirical Assessment of Endpoint Detection and Response Systems against Advanced Persistent Threats Attack Vectors, SentinelOne is shown to detect a DLL due to high entropy, specifically through AES encryption.
Behavioral Signatures
Obfuscating functions and properties can achieve a lot with minimal modification. Even after breaking static signatures attached to a file, modern engines may still observe the behavior and functionality of the binary. This presents numerous problems for attackers that cannot be solved with simple obfuscation.
As covered in Introduction to Anti-Virus, modern anti-virus engines will employ two common methods to detect behavior: observing imports and hooking known malicious calls. While imports, as will be covered in this task, can be easily obfuscated or modified with minimal requirements, hooking requires complex techniques out of scope for this room. Because of the prevalence of API calls specifically, observing these functions can be a significant factor in determining if a file is suspicious, along with other behavioral tests/considerations.
Before diving too deep into rewriting or importing calls, let’s discuss how API calls are traditionally utilized and imported. We will cover C-based languages first and then briefly cover .NET-based languages later in this task.
API calls and other functions native to an operating system require a pointer to a function address and a structure to utilize them.
Structures for functions are simple; they are located in import libraries such as kernel32 or ntdll that store function structures and other core information for Windows.
The most significant issue with function imports is the function addresses. Obtaining a pointer may seem straightforward, but because of ASLR (Address Space Layout Randomization), function addresses are dynamic and must be found.
Rather than altering code at runtime, the Windows loader is employed (windows.h is the header that declares the Windows API to C code, not the loader itself). At runtime, the loader will map all modules into the process address space and list all functions from each. That handles the modules, but how are function addresses assigned?
One of the most critical structures used by the Windows loader is the IAT (Import Address Table). The IAT stores the function addresses of all imported functions, from which a pointer can be assigned to each function.
The IAT is located through the data directory of the PE (Portable Executable) header IMAGE_OPTIONAL_HEADER and is filled by the Windows loader at runtime. The Windows loader obtains the function addresses or, more precisely, thunks from a pointer table, accessed from an API call or thunk table. Check out the Windows Internals room for more information about the PE structure.
At a glance, an API is assigned a pointer to a thunk as the function address from the Windows loader. To make this a little more tangible, we can observe an example of the PE dump for a function.
The import table can provide a lot of insight into the functionality of a binary that can be detrimental to an adversary. But how can we prevent our functions from appearing in the IAT if it is required to assign a function address?
As briefly mentioned, the thunk table is not the only way to obtain a pointer for a function address. We can also utilize an API call to obtain the function address from the import library itself. This technique is known as dynamic loading and can be used to avoid the IAT and minimize the use of the Windows loader.
We will write our structures and create new arbitrary names for functions to employ dynamic loading.
At a high level, we can break up dynamic loading in C languages into four steps:
Define the structure of the call
Obtain the handle of the module the call address is present in
Obtain the process address of the call
Use the newly created call
To begin dynamically loading an API call, we must first define a structure for the call before the main function. The call structure will define any inputs or outputs that may be required for the call to function. We can find structures for a specific call in the Microsoft documentation. For example, the structure for GetComputerNameA can be found here. Because we are implementing this as a new call in C, the syntax must change a little, but the structure stays the same, as seen below.
To access the address of the API call, we must first load the library where it is defined. We will define this in the main function. This is commonly kernel32.dll or ntdll.dll for any Windows API calls. Below is an example of the syntax required to load a library into a module handle.
Using the previously loaded module, we can obtain the process address for the specified API call. This will come directly after the LoadLibrary call. We can store this call by casting it along with the previously defined structure. Below is an example of the syntax required to obtain the API call.
Although this method solves many concerns and problems, there are still several considerations that must be noted. First, GetProcAddress and LoadLibraryA are still present in the IAT. Although not a direct indicator, their presence can lead to or reinforce suspicion; this problem can be solved using PIC (Position Independent Code). Second, modern agents will also hook specific functions and monitor kernel interactions; this can be solved using API unhooking.
Using the knowledge you have accrued throughout this task, obfuscate the following C snippet, ensuring no suspicious API calls are present in the IAT.
Putting it all Together
As reiterated through both this room and Obfuscation Principles, no one method will be 100% effective or reliable.
To create a more effective and reliable methodology, we can combine several of the methods covered in this room and the previous.
When determining what order you want to begin obfuscation, consider the impact of each method. For example, is it easier to obfuscate an already broken class or is it easier to break a class that is obfuscated?
Note: In general, you should run automated obfuscation or less specific obfuscation methods after specific signature breaking; however, you will not need those techniques for this challenge.
Taking these notes into consideration, modify the provided binary to meet the specifications below.
No suspicious library calls present
No leaked function or variable names
File hash is different than the original hash
Binary bypasses common anti-virus engines
Note: When considering library calls and leaked function names, be conscious of the IAT and strings of your binary.
Once sufficiently obfuscated, compile the payload on the AttackBox or VM of your choice using GCC or another C compiler. The file name must be saved as challenge.exe. Once compiled, submit the executable to the webserver at http://MACHINE_IP/. If your payload satisfies the requirements listed, it will be run and a beacon will be sent to the provided server IP and port.
Note: It is also essential to change the C2Server and C2Port variables in the provided payload, or this challenge will not work properly and you will not receive a shell back.
Note: When compiling with GCC, you will need to add compiler options for winsock2 and ws2tcpip. These libraries can be included using the compiler flags -lwsock32 and -lws2_32.
If you are still stuck, we have provided a walkthrough of the solution below.
Conclusion
Signature evasion can kick off the process of preparing a malicious application to evade cutting-edge solutions and detection measures.
In this room, we covered how to identify signatures and break various types of signatures.
The techniques shown in this room are generally tool-agnostic and can be applied to many use cases as both tooling and defenses shift.
At this point, you can begin understanding other more advanced detection measures or analysis techniques and continue improving your offensive tool craft.