Attackers use batch files to automate and speed up their work because they allow the execution of multiple commands. This way, the attacker does not need to provide any manual input but just needs to execute the malicious script on the victim’s system.
To prevent these malicious batch files from being detected by classic antivirus software, attackers often obfuscate them further with an obfuscator. This article shows what such a case can look like.
What is Batch File Obfuscation?
Batch file obfuscation is a method to reduce the readability and understandability of a batch file by obfuscating information. The reasons why batch files are obfuscated and how batch file obfuscation can be detected can be read here: An Introduction to Batch File Obfuscation
Below, a practical example from a DFIR deployment is presented. The incident is related to the BlackBasta ransomware group. Oneconsult’s incident response team shows how they proceeded when encountering suspicious files and an unknown technique.
Suspicious Batch Files
The Oneconsult team found two strange batch files during an incident response engagement. These files aroused interest because they were stored in “C:\Windows\” and no batch files were expected in this location. First, the more straightforward batch file was analyzed, which uninstalls a monitoring component of Defender.
It became interesting when the second .bat file was opened in the editor. As seen in Figure 2, the file consisted largely of foreign characters, which made it impossible to tell how the batch file worked. Based on the file extension, it was assumed to be a .bat file using an obfuscation that the team had not encountered before.
Pursue Hypotheses
It was suspected that UTF BOMs were being used for confusion. BOM stands for Byte Order Mark and is a byte sequence placed at the very beginning of a file. It identifies the byte order and the encoding form of UCS/Unicode text, so that different systems can display text files correctly. In the past, UTF-16 was often used, which encodes each character in two or four bytes. For UTF-16 to be interpreted correctly, it must also be specified whether the most significant byte of each two-byte unit comes first or last. The first order is called Big Endian (BE) and the second Little Endian (LE). To mark a file as UTF-16 LE, the byte sequence FF FE (hexadecimal) is placed at the beginning of the file. Nowadays UTF-8 is mainly used, which encodes each character in one to four bytes.
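The BOM hypothesis can be checked by looking at the first bytes of a file. The following Python sketch is illustrative only: the function names `detect_bom` and `bom_mismatch` are hypothetical, and the mismatch check is a simple heuristic assumed here, not a method described by the incident response team.

```python
import codecs

# Longest BOMs first: the UTF-32 LE BOM (FF FE 00 00) starts with the
# UTF-16 LE BOM (FF FE), so the check order matters.
KNOWN_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return the name of the BOM that `data` starts with, or None."""
    for bom, name in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None


def bom_mismatch(data: bytes) -> bool:
    """Heuristic (an assumption of this sketch): a UTF-16 LE BOM followed
    by a body without any 0x00 byte is unlikely to be genuine UTF-16 text,
    because every ASCII character encoded as UTF-16 contains a 0x00 byte."""
    if detect_bom(data) != "UTF-16 LE":
        return False
    body = data[len(codecs.BOM_UTF16_LE):]
    return len(body) > 0 and b"\x00" not in body


print(detect_bom(b"\xff\xfe@echo off"))    # UTF-16 LE
print(bom_mismatch(b"\xff\xfe@echo off"))  # True
```

A file read with `open(path, "rb").read()` can be passed to both functions; a result of `True` from `bom_mismatch` only suggests that the BOM and the content disagree and still needs manual review.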
To investigate this, the file was uploaded to CyberChef. CyberChef can be used to quickly and easily apply different transformations to an input. This is especially helpful in DFIR deployments, as hypotheses can be tested without much effort. As can be seen in Figure 3, certain elements are directly readable.
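A comparable first look can be taken locally. The Python sketch below is an assumption about why parts of the file were readable, not the CyberChef recipe the team used: it reads the raw bytes, ignores any encoding information, and keeps only runs of printable ASCII characters. The function name `printable_runs` and the minimum length of 4 are hypothetical choices.

```python
import re


def printable_runs(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII bytes that are at least `min_len` long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


# Made-up sample bytes: a UTF-16 LE BOM, some non-ASCII bytes, and two
# readable fragments separated by a line break.
sample = b"\xff\xfe\xaf\xc3set x=1\r\n\xaf\xc3echo hi"
print(printable_runs(sample))  # ['set x=1', 'echo hi']
```

Because the bytes are never decoded according to the BOM, a misleading BOM has no influence on the result.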
Search for Suitable Obfuscator
A Google search for the term “batch file obfuscation” was performed to understand the obfuscation. One of the first hits points to a GitHub repository that provides an obfuscator for batch files. The provided example shows a significant similarity to the batch file found. The two tricks behind this obfuscation could be understood by skimming the code. As suspected, the first step was to put a wrong UTF BOM at the beginning of the malicious .bat file, in this case, UTF-16 LE. CyberChef does not interpret this BOM, so certain strings are recognizable. cmd.exe does not interpret this BOM either, so the code is executed exactly as in the original batch file. The second trick relies on Batch’s syntax for accessing substrings (see asciich.ch). A substring extracts a certain number of characters from a string. The content of the variable "%VARIABLE_NAME%" is output starting at the "START POSITION" parameter, while the number of characters to be output is determined by the "NUMBER" parameter:
%VARIABLE_NAME:~START POSITION,NUMBER%
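The substring syntax above corresponds closely to slicing in other languages. The following Python sketch emulates it for non-negative values only; the helper name `batch_substring` and the sample string are hypothetical, and negative start positions or lengths, which Batch also accepts, are deliberately left out.

```python
def batch_substring(value: str, start: int, length=None) -> str:
    """Emulate %VAR:~start% and %VAR:~start,length% for non-negative values."""
    if start < 0 or (length is not None and length < 0):
        raise ValueError("negative values are not covered by this sketch")
    if length is None:
        # %VAR:~start% : everything after the first `start` characters
        return value[start:]
    # %VAR:~start,length% : `length` characters beginning at offset `start`
    return value[start:start + length]


text = "0123456789"
print(batch_substring(text, 3))     # 3456789
print(batch_substring(text, 3, 2))  # 34
```

The start position is an offset counted from zero, so a start position of 3 skips the first three characters.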
An example of this substring approach is demonstrated in Figure 4. Here, the variable "oc" is assigned the string “I hope your week has gotten off to a great start!”. The “set” command sets the variable, and the “echo” command displays the whole string. In the third line, only the part after the first 37 characters is echoed (the start position is counted from zero); this is achieved with "%oc:~37%". With "%oc:~37,5%", only the 5 characters starting at that position are output.
Deobfuscation
In the malicious batch file, a randomly named variable "¯ÃÃÃ" was also defined and assigned a string. This string contains all characters used in the (un-obfuscated) original file. Thus, any character (and thus the program logic) of the original file can be reconstructed with the substring approach described above. As can be seen in Figure 5, "%¯ÃÃÃ:~11,1%" becomes "p" and "%¯ÃÃÃ:~15,1%" becomes "o".
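This lookup can also be done statically, without executing anything. The Python sketch below is not the approach the team took; it replaces every substring token of the form %NAME:~START,NUMBER% with the characters it refers to. The function name `resolve_substrings`, the variable name "k" and the alphabet string are hypothetical stand-ins, since the real values are only shown in the figures.

```python
import re

# Matches %NAME:~START,NUMBER% with non-negative integers.
TOKEN = re.compile(r"%([^%:\r\n]+):~(\d+),(\d+)%")


def resolve_substrings(line: str, variables: dict) -> str:
    """Replace each substring token with the characters it refers to."""
    def replace(match):
        name = match.group(1)
        start, length = int(match.group(2)), int(match.group(3))
        if name not in variables:
            return match.group(0)  # unknown variable: leave the token as-is
        return variables[name][start:start + length]

    return TOKEN.sub(replace, line)


# Hypothetical lookup string standing in for the attacker's variable.
variables = {"k": "abcdefghijklmnopqrstuvwxyz"}
print(resolve_substrings("%k:~4,1%%k:~2,1%%k:~7,1%%k:~14,1% hi", variables))
# echo hi
```

This sketch treats variable names as case-sensitive, whereas Batch variable names are case-insensitive, and it does not handle any additional symbols an obfuscator may insert between tokens.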
This process can be simplified so that it does not have to be done by hand for each character. For this, the content shown in CyberChef was copied and pasted into a new .bat file. The command “echo” was placed at the beginning of each line, except the line with “set”. The newly created batch file can be seen in Figure 6.
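For longer files, adding “echo” by hand becomes tedious. The step can be sketched in Python as follows; the function name `defang` is hypothetical, and the rule "keep lines that start with set, prefix everything else" is a simplification of the manual edit described above.

```python
def defang(lines):
    """Prefix every non-`set` line with `echo` so it is printed, not run."""
    result = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # drop empty lines
        if stripped.lower().startswith("set "):
            result.append(stripped)  # the variable must still be defined
        else:
            result.append("echo " + stripped)
    return result


print(defang(["set k=abc", "%k:~0,1%%k:~1,1%", ""]))
# ['set k=abc', 'echo %k:~0,1%%k:~1,1%']
```

Note that an echoed line is not guaranteed to be harmless: if the reconstructed text contains characters such as "&" or "|", cmd.exe may still interpret them as command separators, so the resulting file should only be run in an isolated environment.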
The created batch file can now be executed in a VM. Execution in a VM prevents the attacker’s intended commands from being run on a real and valuable system. In general, suspicious files should always be executed in a virtual environment and never on a valuable host system. Figure 7 shows the actual contents of the malicious batch file. The attacker intended to disable Windows Defender with the first two commands, and with the last command, Windows Defender is supposed to be uninstalled.
What has not been addressed so far is that the obfuscator has defined certain special rules. These insert additional symbols between certain substrings. An example of such additional symbols can be seen in Figure 8. However, this is not relevant to understanding the batch file’s purpose. If this disturbs the reconstruction, these characters can be removed manually.
Summary
In this batch file obfuscation technique, the first step is to place a UTF-16 LE BOM at the beginning of a file that is not actually UTF-16 encoded. As a result, a text editor that honors the BOM displays foreign characters, and the actual functionality cannot be understood. If this BOM is removed, or the file is opened with a tool that ignores it, some program elements can be identified. To understand the entire content, the substring syntax used by Batch must be understood. After this, the individual letters and commands can be reconstructed. Attackers use various obfuscation techniques and develop them further, so unknown techniques will keep being encountered. Getting an initial overview with a Google search is therefore always advisable.
Do you still have questions about the batch file obfuscation technique or need support? Our Digital Forensics and Incident Response Team will be happy to help. We look forward to hearing from you.