Decoding Pseudo-Darkleech (Part #2)
Please refer to the first part, which I posted earlier, for some background on what "pseudo-darkleech" is, and how we decoded the first stages.
The code snippet that we were left with at the end of part #1 was this:
It sure looks like JavaScript, but it is not overly readable. Adding a few line breaks works wonders in this case, too:
cat stage2.js | perl -pe 's/([;\}\{])/$1\n/g'
This adds a line break after every ";", "{" and "}".
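For readers who prefer to stay in JavaScript, the same line-break trick can be sketched as below. The function name and the sample input are made up for illustration; only the regular expression mirrors the Perl one-liner.

```javascript
// Insert a newline after every ';', '{' and '}' so each statement
// ends up on its own line (same idea as the Perl one-liner).
function addLineBreaks(source) {
  return source.replace(/([;{}])/g, "$1\n");
}

// Hypothetical one-line input, for illustration only.
const oneLiner = "var a=1;if(a){a++;}";
console.log(addLineBreaks(oneLiner));
```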
Now, two code blocks stand out. The first queries the "userAgent" (browser type) and checks it for the presence of "rv:11", "MSIE" and "MSIE 10". "rv:11" can be found both in the old Firefox 11 and in Internet Explorer 11. "MSIE" and "MSIE 10" look for Internet Explorer, and for version 10 in particular. This section is as convoluted as it is, and also includes that odd (+[window.sidebar]) section, because the bad guys are trying to fool dynamic analysis in malware sandboxes and proxy servers. On a regular browser, this code works as intended, and returns a value of "2" in the variable "ug" if the browser is IE10 or IE11. But on "SpiderMonkey" or another JavaScript interpreter that does not emulate the full range of the browser's document object model (DOM), this section will leave "ug" undefined, or set it to zero.
The second block again refers to the "evs" section, but this time, replace(/[^a-z]/g,"") strips out all the numbers, spaces and punctuation, and retains only the lowercase letters. What then follows is a loop over this resulting string, and another XOR operation to decode it. This time, it isn't a simple XOR with "9" like in the first stage, but rather
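The logic of that first block can be approximated as follows. This is a simplified sketch, not the gate's actual code: the function name, the parameters and the exact way the sidebar probe is combined with the userAgent test are assumptions. Only the searched substrings, the (+[...]) construct and the result value 2 come from the analysis above.

```javascript
// Sketch of the browser check. "sidebar" stands in for window.sidebar,
// which old Firefox defined as an object and Internet Explorer did not.
function detectUg(userAgent, sidebar) {
  // +[x] turns the one-element array into a string, then into a number:
  // +[undefined] is +"" which is 0, while +[someObject] is NaN.
  const sidebarProbe = +[sidebar];

  // Assumption: "rv:11" only counts as IE11 when the sidebar probe is 0.
  const isIE11 = userAgent.indexOf("rv:11") !== -1 && sidebarProbe === 0;
  const isIE10 = userAgent.indexOf("MSIE 10") !== -1;
  return (isIE11 || isIE10) ? 2 : 0;
}

// Typical userAgent strings for the three browsers discussed above.
const ie11 = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";
const ie10 = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)";
const ff11 = "Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0";

console.log(detectUg(ie11, undefined), detectUg(ie10, undefined), detectUg(ff11, {}));
```

An emulator that supplies a userAgent string but no realistic window object is likely to take a different path through such a check than a real browser, which is presumably the point of the construct.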
npyu="tbQos5ZSsPE3rk";
^npyu.charCodeAt(kte%npyu.length); kte++;
an XOR operation with a password (npyu) of length 14, which means that the code block is making use of a polyalphabetic cipher ("Vigenère"). The consequence of this is that the "evs" block alone cannot really be decoded without also decoding the JavaScript that contains the password. A simple XOR with 9 is trivially broken, but an XOR with a 14-character password cannot reasonably be brute forced.
So .. let's clean up the code a bit more, so that we can actually run it in SpiderMonkey, and see the result.
Instead of document.getElementById("evs"), we simply set the variable zfsp directly, and assign it the content of evs:
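To make the password-based XOR concrete, here is a minimal sketch. The key string and the replace(/[^a-z]/g,"") call are taken from the snippet above. The helper names and the demo plaintext are made up, and how the gate turns the remaining letters into numbers before the XOR is not shown here, so the sketch works on plain character codes instead.

```javascript
// Key string quoted in the diary.
const npyu = "tbQos5ZSsPE3rk";

// Stage 2 keeps only the lowercase letters of the "evs" text.
function keepLetters(evsText) {
  return evsText.replace(/[^a-z]/g, "");
}

// XOR every code with the key character at the same position,
// wrapping around when the key is exhausted (kte % key.length).
function xorWithKey(codes, key) {
  return codes.map((code, kte) => code ^ key.charCodeAt(kte % key.length));
}

const toCodes = (text) => Array.from(text, (ch) => ch.charCodeAt(0));
const fromCodes = (codes) => String.fromCharCode(...codes);

// Hypothetical plaintext. XOR is its own inverse, so applying the
// same key a second time restores the original text.
const plain = "harmless demo string";
const cipher = xorWithKey(toCodes(plain), npyu);
const decoded = fromCodes(xorWithKey(cipher, npyu));

console.log(keepLetters("a9ca7 d97,b 52 3j3"));
console.log(decoded === plain);
```

Because the key position cycles with kte, identical input values at different offsets are encoded differently, which is what separates this from the single-value XOR of the first stage.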
zfsp="a9ca7 d97,b 52 3j3 db3ax4 82 d-126 gb9u6 103cc 109d 102 -126 3fef9d 1xd22 96 -p10 -ax9 b-10cd8g. 1bm07 10bw .....
Next, we strip off all that browser detection logic, and just set the result, "ug=2". And finally, instead of actually "running" the decoded block, at the very end, we want to "print" it: print(exutkqb);. Which leaves us with:
And we are ready to rock: daniel@debian:$ js script-cleaned.js
In the end, the "pseudo-darkleech" code block just generates an IFRAME that loads an Angler Exploit Kit. All these stages of having an HTML and JavaScript block where one decodes the other first into a script and then into the IFRAME, and all this use of browser directives and even a polyalphabetic cipher is not malicious per se, since it does not exploit any vulnerabilities. It just serves to "hide" the malicious IFRAME from proxies and malware sandboxes, so that the AnglerEK really only loads on the user PC, and not in an emulator or filter that aims to detect its presence.
Decoding Pseudo-Darkleech (#1)
I'm currently going through a phase of WordPress dPression. Either my users are exceptionally adept at finding hacked and subverted WordPress sites, or there are just so many of these sites out there. This week's particular fun seems to be happening on restaurant web sites. Inevitably, when checking out the origin of some crud, I discover a dPressing installation that shows signs of having been owned for months. The subverted sites currently lead to Angler Exploit Kit (Angler EK), and are using "Pseudo Darkleech" as their gate.
Pseudo-Darkleech is not the most fortunate name for malcode, but as far as I can tell, it was "invented" by Sucuri back in December 2015, and has been taken up by others, such as fellow ISC Handler Brad over at malware-traffic-analysis.net. This is what pseudo-darkleech currently looks like:
And this is the tiny bit of code that the entire blob above decodes into:
cerfsvolants-wer4u-org showed up for the first time on April 18, and has been in use since. "Cerf-volant" is French for "kite". I hope this was a random selection, because the only other option is that this particular malware miscreant is actually making fun of us. VirusTotal shows a couple of goodies that have been observed from this site.
In this diary, we'll do a step-by-step of the decoding, to show how it can be done, and more importantly, to show how massively convoluted the encoding used in current exploit kit gates has become. If, in a corporate setting, you are wondering why you get all the AnglerEK (JS/Redirector) hits only on your workstation anti-virus, but not on your proxy content filter, this diary is for you. You'll see that it is becoming very hard (aka "impossible") to detect such malcode without actually running it in a real browser. Sit back, and get some popcorn! :).
If you look at the first picture above, you'll notice there are two elements. One is an HTML "DIV" section named "evs", filled with what looks like a garbage combination of numbers and letters. The other is a "script" section, but filled with what does not look like JavaScript at all.
For starters, let's ignore the "evs", and make sense of the "script". It seems to be a long list of variables that are assigned some values, but it is impossible to make out any rhyme or reason. When confronted with something like this, I first use a quick Perl command to make the blob more readable:
cat script.js | perl -pe 's/;/;\n/g';
This adds a line break after every ";", and thus separates out the individual JavaScript commands. The result is still far from pretty, but it lets us determine that 99% of the code really only assigns values to variables. It is only near the bottom of the code block that we find the first actual JavaScript function call:
rtmj+=qpbuzz;
rtmj+=outpp;
rrv(rtmj)();
hkgcz="\x63\x78\x63";
rtmj=hkgcz;
So it is probably fair to assume that we can replace rrv(rtmj)(); with a print(rtmj); and run the result through JS/Spidermonkey, to see what gives:
daniel@debian:$ js script-edited.js
Note how the decoded JavaScript references the "evs" section that we ignored earlier!
replace(/[^\d ]/g,"") : Everything that is not a space " " or a digit \d gets replaced with "" (empty) .. so this cuts out all the letters, and only leaves the numbers
for(i=0;...parseInt(a[i])^9 : Loops over the numbers, and does a ^9 (XOR with 9) operation on each of them
fromCharCode : Turns the decoded number into its equivalent ASCII character
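Put together, the three steps above amount to something like the following sketch. The function names are made up, and encodeStage1 exists only to produce test input with some junk letters mixed in. It is not how the gate generates its "evs" block.

```javascript
// Stage 1 decoder: strip everything but digits and spaces, split into
// numbers, XOR each number with 9, and turn the result into characters.
function decodeStage1(evsText) {
  return evsText
    .replace(/[^\d ]/g, "")
    .split(" ")
    .filter((token) => token.length > 0)
    .map((token) => String.fromCharCode(parseInt(token, 10) ^ 9))
    .join("");
}

// Hypothetical encoder for demo input: XOR each character code with 9
// and prefix it with a junk letter that the decoder will throw away.
function encodeStage1(text) {
  return Array.from(text, (ch, i) => "xq"[i % 2] + (ch.charCodeAt(0) ^ 9)).join(" ");
}

const sample = encodeStage1("var a=1;");
console.log(sample);
console.log(decodeStage1(sample));
```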
Hey, we can do this in Perl, too:
daniel@debian:$ cat evs | perl -pe 's/[^\d ]//g; s/(\d+)\s+/chr($1^9)/ge'
Even more progress :). I'll finish the analysis in a second diary that I'll post later.