Quick Tip: Pastebin Monitoring & Recon

Published: 2011-11-24
Last Updated: 2011-11-24 16:56:55 UTC
by Russ McRee (Version: 1)
6 comment(s)

Happy Thanksgiving!

On the heels of Dr. Ullrich's diary regarding SCADA hacks published on Pastebin, I thought I'd mention some Pastebin monitoring and recon resources that you may find useful.

One reader wrote in to say that you could use Google Alerts to monitor Pastebin for names and keywords of interest to you, but you may prefer a Google Custom Search instead. Configure it to monitor Pastebin and other similar sites; set names and keywords that are relevant for your needs.

Or, as Lenny pointed out in his July blog entry, you could use Andrew's PasteLert or PasteBin Scraper. And in case you weren't following along, Andrew --> Paterva --> Maltego --> Pastebin Transforms.

More than one SANS certification track curriculum discusses Maltego use for good reason. :-)

Any useful Pastebin crawling/scraping tactics you'd like to share? We await your comments or contact. 

Russ McRee
@holisticinfosec

Keywords: pastebin

Comments

Pastebin.py in a console...very nice:

http://www.shellguardians.com/2011/07/monitoring-pastebin-leaks.html
Now that they've put the site: functionality back in, I use Google Alerts to monitor them all "+my +keywords +here site:pastebin.com OR site:paste2.org OR site:paste.bradleygill.com OR site:pastie.org OR site:dpaste.com OR site:paste.pocoo.org OR site:pastie.textmate.org"
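
For anyone maintaining a longer list of paste sites, the query above can be assembled programmatically. This is only a minimal sketch in Python: the function name `build_alert_query` and the `PASTE_SITES` list are illustrative names, not part of any tool mentioned here, and the `+keyword` / `site:` syntax simply mirrors the query quoted in the comment.

```python
# Illustrative helper: names are hypothetical, syntax mirrors the quoted query.
PASTE_SITES = [
    "pastebin.com",
    "paste2.org",
    "paste.bradleygill.com",
    "pastie.org",
    "dpaste.com",
    "paste.pocoo.org",
    "pastie.textmate.org",
]


def build_alert_query(keywords, sites):
    """Join required keywords and an OR-separated list of site: filters."""
    terms = " ".join("+" + kw for kw in keywords)
    site_filter = " OR ".join("site:" + site for site in sites)
    return terms + " " + site_filter


if __name__ == "__main__":
    # Paste the printed string into the alert or search box.
    print(build_alert_query(["my", "keywords", "here"], PASTE_SITES))
```

Keeping the sites in a list makes it easy to add or drop a paste site without retyping the whole query.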
I have always just hacked my own scrapers for various purposes. One has proved useful enough that I have made the output appear in the form of a nice HTML web page. It scrapes the dice.com jobs board twice a day and generates a few web pages to ease looking for software contracts. You can see it here:

http://www.elilabs.com/~rj/dice_date.html

Similar techniques can be used to scrape just about anything from anywhere. This scraper is coded as a Bash shell script, but Perl would have looked nicer in the source file.
The problem with using Google Alerts is that they aren't that timely, so pastes can expire before you get to see them. That is why I prefer a more regular and specific scan of the paste sites.
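
The core of such a regular scan might look something like the Python sketch below. It is not the PasteLert or scraper code: `find_hits` is a hypothetical name, and fetching the recent pastes is deliberately left out, so the function just assumes it is handed `(paste_id, text)` pairs from whatever source you are permitted to poll.

```python
import re


def find_hits(pastes, keywords, seen_ids):
    """Return (paste_id, matched_keywords) for each unseen paste that matches.

    pastes   -- iterable of (paste_id, text) pairs (how you fetch them is up to you)
    keywords -- literal strings to look for, matched case-insensitively
    seen_ids -- set of paste IDs already checked; updated in place
    """
    patterns = {kw: re.compile(re.escape(kw), re.IGNORECASE) for kw in keywords}
    hits = []
    for paste_id, text in pastes:
        if paste_id in seen_ids:
            continue  # already examined on an earlier pass
        seen_ids.add(paste_id)
        matched = [kw for kw, pattern in patterns.items() if pattern.search(text)]
        if matched:
            hits.append((paste_id, matched))
    return hits
```

Running this on a schedule with a persistent `seen_ids` set means each paste is checked once, soon after it appears, rather than whenever an alert happens to arrive.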

Additionally, the APIs for the various pastebins unfortunately don't offer that level of searching and alerting just yet. I tried to contact a few of them but got no reply. Scraping breaks most terms of use; however, the chaps from pastie.org said that it would be okay (which is why the pastebin scraper still works for that).

But if you are looking for a command-line version, you are welcome to hack up any of the code I hacked up for the pastebin/pastelert stuff; it's all free to do whatever you feel like (aka the i-dont-know-licensing-and-dont-care license):

http://andrewmohawk.com/tag/pastebin/

Cheers
-AM
Check out the "Morning Coffee" Firefox extension. Configure isc.sans.org, your own custom search terms (such as Alex has above), a few Twitter searches, etc. to open every day in a new Firefox tab.
To avoid upsetting the webmasters, I generally put a random delay in my scraping scripts so that they don't whack the site all at once with a big load from my machine. The idea is to take maybe 10 minutes to do what you could have done at full speed in a few seconds, so that it looks more like normal human-initiated activity.
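
One way to sketch that random-delay idea in Python is shown below. The names `random_delays` and `polite_fetch_all` are made up for illustration, the 600-second default just reflects the "maybe 10 minutes" figure above, and `fetch` is whatever function you already use to retrieve a page.

```python
import random
import time


def random_delays(n_requests, total_seconds, rng=random):
    """Split total_seconds into n_requests uneven pauses that sum to the total."""
    if n_requests == 0:
        return []
    # The 0.5 offset keeps every pause well above zero.
    weights = [rng.random() + 0.5 for _ in range(n_requests)]
    scale = total_seconds / sum(weights)
    return [w * scale for w in weights]


def polite_fetch_all(urls, fetch, total_seconds=600, sleep=time.sleep):
    """Fetch each URL in turn, pausing a random interval before each request."""
    urls = list(urls)
    results = []
    for url, delay in zip(urls, random_delays(len(urls), total_seconds)):
        sleep(delay)  # spread the load instead of hitting the site all at once
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter is only there so the pacing can be tested without actually waiting; in normal use the default `time.sleep` applies.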
