Form Spam: Increasing the Attacker's work function.

Published: 2006-11-08. Last Updated: 2006-11-08 15:42:36 UTC
by Johannes Ullrich (Version: 1)
Quite a while ago (2 years?) we started using a contact form. Part of the reason for the contact form was to avoid having to post our e-mail address to the site. Because we all know that posting your e-mail address on any web site is a sure way to get spammed to death. However, over the last year, the amount of spam sent via our contact form has exploded, and it was time to figure out how to combat it.

For a brief time we used "captchas". The idea is simple: you add a hard-to-read image of a few random letters and ask the submitter to prove they are "human" by entering the text. However, the problem is obvious: to make the image hard to OCR, it has to be quite hard to read. I came across one Perl script used by a bot that can recognize simple captchas in a second. Good captchas need to use colors, distorted letters and such, making them hard to read for many humans even if they have good eyesight, and harder still if you have bad vision. Our somewhat ugly homemade captcha solution caused submissions to drop by about 30%, which wasn't acceptable.

Next, we implemented a couple of simple keyword filters. They worked OK, but keyword filtering is a poor fit for us. What about people who are trying to send us a report that they see a bot that sends "Viagra" ads?

Another approach we took (and still take to some extent) is to block spammers. We had a lot of repeat offenders. But then again, they sometimes come through proxy servers and from dynamic IPs, so we end up locking out legitimate users as well. For what it's worth: in some cases we see a couple thousand "spam attempts" from an IP address after it gets blocked.

What you are really looking for is a method that will make it harder for a machine to submit a report but will be invisible to the user. Last week I experimented with this and came up with two ways to do so. One is implemented now and works amazingly well. The other doesn't work for us but may work for you.

First the method that doesn't work for us: encrypted forms in JavaScript. It doesn't work for us because, given our audience, I don't want to use JavaScript for anything that affects the usability of our site. Sure, maybe some eye candy here and there, but that's it. Encrypted JavaScript is essentially what a lot of malware uses to disguise itself. But why not use it to "hide" a form from bots? The work to decrypt it is done by the browser, and the user will never know that the form is encrypted. A spammer could implement a JavaScript parser, but it's extra work. And now you just made their spam bot slower by a factor of 10, or however long it takes to decrypt the form.
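A rough sketch of the idea, under loud assumptions: the helper names (obfuscate_form, decode_payload), the one-byte XOR key, and the base64 wrapping are all made up for illustration, and XOR is obfuscation rather than real encryption. The point is only that the markup never appears in the page source, so a bot has to execute (or emulate) the script to find the form.

```python
import base64


def obfuscate_form(form_html: str, key: int = 0x5A) -> str:
    """Return a <script> snippet that rebuilds form_html in the browser."""
    # XOR every byte with a one-byte key, then base64-encode the result so
    # it can be embedded safely inside a single-quoted JavaScript string.
    scrambled = bytes(b ^ key for b in form_html.encode("utf-8"))
    payload = base64.b64encode(scrambled).decode("ascii")
    return (
        "<script>\n"
        f"var p = atob('{payload}'), k = {key};\n"
        "var bytes = new Uint8Array(p.length);\n"
        "for (var i = 0; i < p.length; i++) { bytes[i] = p.charCodeAt(i) ^ k; }\n"
        "document.write(new TextDecoder().decode(bytes));\n"
        "</script>\n"
        "<noscript>This contact page needs JavaScript enabled.</noscript>"
    )


def decode_payload(payload: str, key: int = 0x5A) -> str:
    """Server-side mirror of the browser decoder, handy for testing."""
    scrambled = base64.b64decode(payload)
    return bytes(b ^ key for b in scrambled).decode("utf-8")
```

The output of obfuscate_form would be placed where the form normally sits in the page. Changing the key (or the whole encoding) per request is a cheap way to keep a bot author from hard-coding the decoder.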

The second method is simpler, and does not require JavaScript. Instead, one or more fake form fields are added to the form, and style sheets are used to make them "invisible". To further confuse the attacker, the fake form fields are given names like "subject", suggesting to the bot that these are the form fields it is looking for. Whenever a form is submitted with content in one of these "hidden" fields, it is discarded. I am not talking about the classic hidden form fields (type "hidden") that are not user changeable, but about regular form fields that are hidden with a style such as "display: none".
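Here is a minimal sketch of such a form and the matching server-side check. It is not the markup of our actual contact form: the field name "topic" for the real subject line, the "/contact" action, and the helper is_probably_bot are assumptions made up for this example.

```python
from typing import Mapping

# Decoy fields a human never sees and therefore never fills in.
# "subject" is the bait; the real subject line lives in the
# (hypothetical) field "topic".
HONEYPOT_FIELDS = ("subject",)

FORM_HTML = """\
<form method="post" action="/contact">
  <input type="text" name="subject" style="display: none">
  <input type="text" name="topic">
  <textarea name="message"></textarea>
  <input type="submit" value="Send">
</form>
"""


def is_probably_bot(form: Mapping[str, str]) -> bool:
    """True if any decoy field arrived with content in it."""
    return any(form.get(name, "").strip() for name in HONEYPOT_FIELDS)
```

A handler would call is_probably_bot on the parsed POST data and silently drop the submission when it returns True. A missing or empty decoy field is treated as fine, since a regular browser submits the hidden field empty.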

Sure, in particular after I write this article, attackers may catch on. But there are many ways to mark a form field as "invisible". You can randomize the names of your form fields to further confuse them. In short: you again increased the workload on the spammer without affecting the regular user. For a sample, just take a look at our contact form. We received only about 3 or 4 pieces of spam after implementing this last week. Usually we received dozens of pieces of spam a day.
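One way the field-name randomization could look, as a sketch only: field names are derived with an HMAC from a server-side secret and a per-form identifier, so they change from one rendered form to the next but can be recomputed when the form comes back. SECRET_KEY, form_id, field_name, and decode_submission are hypothetical names, and how form_id travels with the form (a session, or a classic hidden field) is left open.

```python
import hashlib
import hmac
import secrets
from typing import Iterable, Mapping

# Assumption for this sketch: one random key per server process.
SECRET_KEY = secrets.token_bytes(32)


def field_name(logical_name: str, form_id: str, key: bytes = SECRET_KEY) -> str:
    """Map a logical name like 'message' to an opaque per-form field name."""
    msg = f"{form_id}:{logical_name}".encode("utf-8")
    digest = hmac.new(key, msg, hashlib.sha256).hexdigest()
    # Prefix with a letter so the result is a well-behaved HTML name.
    return "f" + digest[:12]


def decode_submission(
    form: Mapping[str, str],
    form_id: str,
    logical_names: Iterable[str],
    key: bytes = SECRET_KEY,
) -> dict:
    """Translate the opaque field names of a submission back to logical ones."""
    return {
        name: form.get(field_name(name, form_id, key), "")
        for name in logical_names
    }
```

With this in place a bot can no longer look for a field called "message"; it has to work out from the rendered page which field is which, every time.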

All modern browsers support style sheets, and for those that don't you can leave a little note in the form telling the user what's going on. The fact that some spam still makes it past this method suggests that there is some manual spamming going on. But it's minimal... and sure, let them hire armies of spaminators to submit these forms. Either way you succeeded in making spam more expensive and shifting the economics against it.

A couple of user feedback items:

- Margles suggests using a modified form of captcha: "Why not use an image to be identified? A house, or identify the gender of a person standing in the doorway, or a cat versus a dog. Something that would lend itself to one-word identification."
Great idea. I think that's at least easier than some of the horrible captchas.

- Ed writes: "So far I have been successful by using a session variable that is set when the form is requested via HTTP GET. If the submitted form doesn't have the session variable set, I dump the email and return a bogus error message. Also I strip any http:// or http://www from any submission so our users aren't likely to click on any links that load malware. The domain name, path and filename remain but it's not hard to reconstruct if it's a legitimate URL submission."
I did try the session variable, and it didn't work for me that well :-(. Maybe I didn't implement it quite right.
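For reference, a sketch of what Ed describes. This is my reading of his note, not his code: the session is modeled as a plain dict, the key name "form_served" and all function names are invented, and also stripping "https://" and the "www." prefix is an extra assumption on my part.

```python
import re

# Matches "http://" or "https://", optionally followed by "www.".
_SCHEME = re.compile(r"https?://(?:www\.)?", re.IGNORECASE)


def mark_form_served(session: dict) -> None:
    """Call this while handling the GET request that renders the form."""
    session["form_served"] = True


def accept_post(session: dict) -> bool:
    """Call this while handling the POST.

    Returns True only if this session fetched the form first. The flag is
    removed on use, so every submission needs a fresh GET.
    """
    return bool(session.pop("form_served", False))


def defang_links(text: str) -> str:
    """Strip the URL scheme so links in a report are not clickable."""
    return _SCHEME.sub("", text)
```

A bot that posts straight to the form handler without ever requesting the form fails accept_post. A bot that fetches the form first and keeps its cookies passes, which may be part of why this check alone did not do much for us.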

- Neal writes: "[on some site the] submission from .. asks you to enter ... text found in a gif. However, no matter what you enter the first time, it says you entered it wrong"
Mean and devious. I like it!
"The best solutions I found require semantic interpretation ... For example: Complete this sentence: I ate a ____ and it was good. ... (a) hand (b) watermelon (c) rubber band (d) tire (e) Billy" "The problems with this approach: ... non-English speakers ... 5 options is still 20% spam ..."
        ... rubber tires (drool...)
Neal also points to this method which somewhat implements what was suggested above: http://www.kittenauth.com/ . Pictures of cute kittens! How can one NOT use that approach ;-) ?



---------
Johannes Ullrich, SANS Institute.


