A huge 87GB archive containing 773 million unique email addresses and their passwords was recently found on an online hacking forum. The archive is being called “Collection #1” and was assembled for use in credential stuffing attacks, in which leaked email and password pairs are tried against other services. It was discovered by security researcher and “Have I Been Pwned” creator Troy Hunt and consists of 2,800 different files. This is not a new data breach but a compilation of previously circulating lists. To see if you have been part of the hack you can check here. To find out more about “Collection #1”, see Troy Hunt’s blog:
“In total, there are 1,160,253,228 unique combinations of email addresses and passwords. This is when treating the password as case sensitive but the email address as not case sensitive. This also includes some junk because hackers being hackers, they don’t always neatly format their data dumps into an easily consumable fashion. (I found a combination of different delimiter types including colons, semicolons, spaces and indeed a combination of different file types such as delimited text files, files containing SQL statements and other compressed archives.)
The unique email addresses totalled 772,904,991. This is the headline you’re seeing as this is the volume of data that has now been loaded into Have I Been Pwned (HIBP). It’s after as much clean-up as I could reasonably do and per the previous paragraph, the source data was presented in a variety of different formats and levels of “cleanliness”. This number makes it the single largest breach ever to be loaded into HIBP.
There are 21,222,975 unique passwords. As with the email addresses, this was after implementing a bunch of rules to do as much clean-up as I could including stripping out passwords that were still in hashed form, ignoring strings that contained control characters and those that were obviously fragments of SQL statements. Regardless of best efforts, the end result is not perfect nor does it need to be. It’ll be 99.x% perfect though and that x% has very little bearing on the practical use of this data. And yes, they’re all now in Pwned Passwords, more on that soon.”
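The counting and clean-up rules described in the quote can be illustrated with a small sketch. This is not Troy Hunt's actual tooling: the function names, the regular expressions, and the choice to treat a bare 32, 40 or 64 character hex string as "still hashed" are all assumptions made for illustration. It splits each line on the delimiters mentioned (colon, semicolon, space), lowercases the email address, keeps the password case sensitive, and drops passwords that look hashed, contain control characters, or resemble SQL fragments.

```python
import re

# Assumed delimiters, taken from the quote: colons, semicolons and spaces.
DELIMITERS = re.compile(r"[:; ]")
# Assumption: a plausible email has no whitespace or delimiter characters.
EMAIL = re.compile(r"[^@\s:;]+@[^@\s:;]+\.[^@\s:;]+")
# Assumption: bare hex of MD5/SHA-1/SHA-256 length means "still hashed".
HASH_LIKE = re.compile(r"(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})")
# Assumption: a few keywords are enough to spot obvious SQL fragments.
SQL_FRAGMENT = re.compile(r"\b(?:INSERT INTO|VALUES|CREATE TABLE)\b", re.IGNORECASE)
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def parse_line(line):
    """Return (email, password) or None if the line is not a credential pair."""
    line = line.rstrip("\r\n")
    if "@" not in line:
        return None
    # Look for the first delimiter that appears after the "@".
    match = DELIMITERS.search(line, line.index("@"))
    if match is None:
        return None
    email = line[:match.start()].strip().lower()  # email: case insensitive
    password = line[match.end():]                 # password: case sensitive
    if not password or not EMAIL.fullmatch(email):
        return None
    return email, password


def is_clean_password(password):
    """Apply the three clean-up rules mentioned in the quote."""
    if HASH_LIKE.fullmatch(password):
        return False
    if CONTROL_CHARS.search(password):
        return False
    if SQL_FRAGMENT.search(password):
        return False
    return True


def summarize(lines):
    """Count unique pairs, emails and passwords among the clean records."""
    pairs = set()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and is_clean_password(parsed[1]):
            pairs.add(parsed)
    return {
        "pairs": len(pairs),
        "emails": len({email for email, _ in pairs}),
        "passwords": len({password for _, password in pairs}),
    }
```

Holding everything in an in-memory set only works for small samples. At the scale of billions of rows, some form of external sorting or a database would be needed, and the sketch counts emails only from records whose password survived clean-up, which is a simplification.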