Threat Level: green Handler on Duty: Didier Stevens

SANS ISC: Architecture, compilers and black magic, or "what else affects the ability of AVs to detect malicious files" SANS ISC InfoSec Forums

Watch ISC TV. Great for NOCs, SOCs and Living Rooms: https://isctv.sans.edu

Sign Up for Free!   Forgot Password?
Log In or Sign Up for Free!
Architecture, compilers and black magic, or "what else affects the ability of AVs to detect malicious files"

In my last diary, we went over the impact of different Base encodings on the ability of anti-malware tools to detect malicious code[1]. Since results of our tests showed (among other things) that AV tools in general still struggle significantly more with detecting 64-bit malicious code then 32-bit malicious code, I thought it might be interesting to discuss another factor that might impact the ability of AVs to detect malware – specifically the choice of a compiler.

Since compilers can differ significantly in terms of their internal functionality and the use of various optimization techniques, it makes sense why they generate slightly (or significantly) different output from the same initial code. But do these differences have any impact in terms of hindering the ability of anti-malware to correctly identify malicious code? To try and answer this question, I devised a simple test.

I would start with 3 C++ programs:

  • one that would write a benign text file do disk (i.e. a “control” program that would allow us to identify the number of false-positive detections),
  • one that would drop the EICAR test file[2] on disk and
  • a program that would execute a staged Meterpreter reverse-tcp payload.

I would then compile each of these programs with 2 different compilers (TDM-GCC and Microsoft Visual C++) into both 32 and 64-bit binaries and use VirusTotal to find out whether would be any significant differences in their detections.

I used the following very simple code for both the “control” program and the EICAR dropper:

#include <iostream>
#include <fstream>
using namespace std;

int main() {
  ofstream OutFile([filename]);
  OutFile << [file contents];
  OutFile.close();
}

In the benign file, I’ve used the string “This is a benign string” as the contents. For the malicious program that would execute Meterpreter, I’ve used the same code as last time.

After all the binaries were compiled, it turned out that their VirusTotal detection scores did indeed differ – quite significantly in the case of our only “real” malicious file.

Besides a significant difference in the number of detections of the same 32-bit and 64-bit code, the chart seems to show that overall, the use Microsoft’s compiler might have negative impact on the ability of anti-malware tools to detect malicious code, at least when compared with TDM-GCC… But this isn’t actually true.

Even a small change to the initial code might result in significant changes to the resulting binary (and the ability of AVs to detect it) and these changes would be different for every compiler.

This can be seen quite clearly in the following chart, which shows detection scores of the original Meterpreter binaries and the scores for binaries that were compiled from the same source code, in which the only change was an increase (by about 1k) of the amount of memory allocated for the Meterpreter shellcode (which would not affect the basic function of the binary in this instance).

As we can see, the use of Visual C++ compiler over TDM-GCC does not always result in lower detection scores, as it might have appeared from the first chart. What is obvious from both charts, however, is that the choice of a compiler truly does make an impact when it comes to the ability of AVs to detect malicious code.

Even so, this is far from the whole story, since the choice of a compiler is not the only thing that affects the contents of a resulting binary. Use of different optimization settings might go a long way to change the result as well and the same might be true for use of different linkers. Given how simple our test code was, the differences caused by the use of different linkers should not be too significant here, but in the case of more complex code, it could certainly be noticeable... and there are many other factors as well. That will, however, be a topic for another time.

[1] https://isc.sans.edu/forums/diary/All+your+Base+arenearly+equal+when+it+comes+to+AV+evasion+but+64bit+executables+are+not/27466/
[2] https://www.eicar.org/?page_id=3950

-----------
Jan Kopriva
@jk0pr
Alef Nula

Jan

58 Posts
ISC Handler
Jun 9th 2021

Sign Up for Free or Log In to start participating in the conversation!