As a follow-up to last week’s post on how many stolen account credentials show up on Pastebin daily, I’ve now collected enough data to get a more accurate picture of the posting rate.
As a reminder, here was the first day’s data:
Start time: 20171113 2100UTC
Credentials parsed to date: 792,488
Clean (unproblematic) credentials: 734,807
Unique clean credentials: 475,653
Credentials parsed to date: For a while now, I’ve had a homebrew Pastebin scraper analyzing new pastes and watching for email addresses. This is the number of credentials it had extracted as of Start time.
Clean (unproblematic) credentials: I wrote a somewhat lazy parser that attempts to help me identify patterns in the extracted paste bodies so I can more effectively grab credentials pasted in a variety of formats. There are still some that I haven’t quite worked through yet, so this count removes those, leaving only the ones I’m confident in.
Unique clean credentials: A count of the distinct clean credentials parsed from the Pastebin data extracted as of Start time.
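To make the three counts concrete, here is a minimal sketch of how they could be tallied. It is not the scraper or parser described above: the regular expressions, the `tally` function, and the rule that a "clean" credential is a whole line of the form email, separator, password are all assumptions made for illustration, as is lowercasing the email before deduplicating.

```python
import re

# Hypothetical patterns, not the ones the actual parser uses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed "clean" format: the whole line is email, one separator, password.
CRED_RE = re.compile(
    r"^([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})[:;|](\S+)$"
)

def tally(paste_lines):
    """Return the parsed, clean, and unique counts for some paste lines."""
    parsed = 0
    clean = []
    for raw in paste_lines:
        line = raw.strip()
        if not EMAIL_RE.search(line):
            continue  # no email address, so not a candidate credential
        parsed += 1
        match = CRED_RE.match(line)
        if match:
            # Assumption: emails are compared case-insensitively.
            clean.append((match.group(1).lower(), match.group(2)))
    return {"parsed": parsed, "clean": len(clean), "unique": len(set(clean))}

sample = [
    "alice@example.com:hunter2",
    "alice@example.com:hunter2",   # exact duplicate
    "bob@example.org|s3cret",
    "contact carol@example.net for the full dump",  # email, but no clear format
    "no email on this line",
]
print(tally(sample))
```

In this toy sample, four lines contain an email address, three match the assumed format, and two of those are distinct, which mirrors the gap between the parsed, clean, and unique figures.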
The latest day’s data is:
Start time: 20171121 2100UTC
Credentials parsed to date: 988,019
Clean (unproblematic) credentials: 887,403
Unique clean credentials: 523,298
So, the results:
Potential credentials posted per day: 24,441
Identified credentials posted per day: 19,074
Unique credentials posted per day: 5,956
That works out to almost 6,000 new, unique credentials posted to Pastebin per day, modulo my script’s ability to accurately extract them.
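For anyone who wants to check the arithmetic, the per-day figures are the difference between the two snapshots divided by the elapsed time, which is eight days here. A short sketch, with variable and function names of my own choosing:

```python
from datetime import datetime

# Snapshot values copied from the post; the dictionary keys are illustrative.
first = {"start": "20171113 2100UTC", "parsed": 792_488,
         "clean": 734_807, "unique": 475_653}
latest = {"start": "20171121 2100UTC", "parsed": 988_019,
          "clean": 887_403, "unique": 523_298}

def parse_start(stamp):
    # "UTC" is matched as a literal suffix; both stamps share the same zone,
    # so naive datetimes are enough for computing the elapsed time.
    return datetime.strptime(stamp, "%Y%m%d %H%MUTC")

def daily_rates(a, b):
    """Average per-day increase of each counter between snapshots a and b."""
    elapsed = parse_start(b["start"]) - parse_start(a["start"])
    days = elapsed.total_seconds() / 86_400
    return {key: (b[key] - a[key]) / days
            for key in ("parsed", "clean", "unique")}

rates = daily_rates(first, latest)
for key, value in rates.items():
    print(f"{key}: {value:,.3f} per day")
```

The exact quotients are 24,441.375, 19,074.5, and 5,955.625, so the rounded figures above hold, with the middle one sitting exactly on the half.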