Security BSides – Web Scraping for Fun and Profit
Our Security Researcher Scott Goodwin and Senior IT Audit Manager Nick DeLena recently presented at BSides Boston.
Pastebin.com and other public ‘paste’ sites are rich sources of sensitive information. Hackers will often post their stolen ‘loot’ to websites like these for public consumption. These sources of information go largely unmonitored.
Pastebin is keenly aware of this fact and lets users create a list of alert keywords. When one of those keywords appears in a public paste, Pastebin sends the user an email. Pastebin staff also remove pastes found to contain personally identifiable information. However, we have shown that a well-designed scraper can capture this information before the Pastebin team removes it. Examples of what has turned up include:
- Suite of stolen NSA tools published to Pastebin
- NASA and other government sector breaches published to Pastebin
- Daily onslaught of compromised website credentials, Netflix accounts, proxy lists, and occasionally credit card data and even SSNs
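
The keyword-alert idea described above can be sketched in a few lines of Python. This is only an illustration, not the scraper from the talk: the names `Hit` and `find_hits` are hypothetical, and the sketch assumes paste text has already been retrieved by some other means, so it makes no network calls and does not use any Pastebin API.

```python
import re
from typing import Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One keyword match inside one paste (hypothetical structure)."""
    paste_id: str
    keyword: str


def find_hits(pastes: Iterable[Tuple[str, str]], keywords: Iterable[str]) -> List[Hit]:
    """Return a Hit for every (paste, keyword) pair where the keyword appears.

    `pastes` is an iterable of (paste_id, text) pairs that the caller has
    already collected. Matching is a case-insensitive literal substring search.
    """
    # Escape each keyword so characters like "." are matched literally.
    patterns = {kw: re.compile(re.escape(kw), re.IGNORECASE) for kw in keywords}
    hits = []
    for paste_id, text in pastes:
        for kw, pattern in patterns.items():
            if pattern.search(text):
                hits.append(Hit(paste_id, kw))
    return hits


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        ("a1", "dump: admin@Example.com:hunter2"),
        ("b2", "nothing of interest here"),
    ]
    for hit in find_hits(sample, ["example.com", "acme"]):
        print(f"paste {hit.paste_id} mentions {hit.keyword!r}")
```

A monitoring job built along these lines would store each matching paste as soon as it is seen, since the original may be removed from the site shortly afterwards.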