Biggest Data Breaches

Every day, as we use apps, payment services, online banking, e-commerce stores and other websites, we expect the companies behind these platforms to keep our data safe. And since data has become a valuable asset, we generally feel that these services should at least repay us for our business with solid data security.

But if recent history is anything to go by, that is not the case. If you follow security news even semi-regularly, you know that almost every month brings headline-worthy news about hackers making off with our data. Data breaches have become a regular occurrence.

And it seems that no amount of security measures and precautions, even at leading technology companies, can keep hackers from infiltrating systems and stealing mountains of customer data. In some cases, companies lose sensitive customer data, introduce new cybersecurity measures, and then watch those measures fall apart in the next data breach or cyberattack.

For the end user, it has become clear that many online identity theft incidents occur because cybercriminals use information exposed in major data breaches. You might think your information is safe or that your data has never been exposed, but you might change your mind after looking at the number of people whose sensitive information has been leaked in past data breaches.

Let’s take a look at the biggest data breaches on record as of March 2021. These data breaches impacted hundreds of millions and even billions of people around the world.

CAM4 Data Breach

The CAM4 data breach became known around May 2020, and it affected a total of 10.88 billion user records.

CAM4 provides live streaming of webcam performances. Its Elasticsearch server was breached, compromising nearly 11 billion customer data records.

The full impact of the breach remains unknown, but the leaked information did contain critical items, including:

  • Names (first and last)
  • Spam detection logs
  • Fraud detection logs
  • IP addresses
  • Password hashes
  • Token information
  • Chat transcripts of CAM4 and user interaction
  • Conversations between users
  • Payment logs such as currency, amount of money spent and credit card type
  • Usernames
  • User language
  • Device specification
  • Gender
  • Sign-up information
  • Original location
  • Email addresses

Most of the compromised email addresses were linked to cloud storage services. In other words, hackers could take their attacks one level further by launching phishing attempts against the users whose email addresses were leaked. Victims of those phishing attacks could inadvertently give hackers even more personal information, such as business cards, photos, credit card numbers, and more.

Moreover, given the sensitive nature of the content on CAM4, any information in the breach could be used by hackers and other cybercriminals to defame and blackmail victims, not just once but repeatedly.

Anurag Sen discovered the leak while working for SafetyDetectives, and according to his team, the actual size of the database was more than 7 terabytes. Such a massive amount of leaked information means that the usernames and password hashes of at least several million users were exposed. SafetyDetectives also said that hackers could further weaponize the data to target groups and individuals.

As mentioned, email addresses and full names were also compromised. Hackers can use these to identify users and then carry out additional attacks with their identity.

Yahoo Data Breach

The full extent of a massive Yahoo data breach became known in 2017, and it affected around 3 billion accounts. The company confirmed that a group of hackers had breached its systems back in August 2013 and had managed to compromise data belonging to over a billion accounts. Yahoo also revealed that hackers compromised security questions and answers on accounts that they had exposed. That significantly increased the risk of identity theft for the exposed accounts.

Yahoo did not report that it suffered a data breach soon after it happened. Instead, it waited for years and only brought it to light when the company was negotiating with Verizon for a possible sale in December 2016.

In any case, the data breach meant millions of users had to scramble to change their passwords. They also had to change their security questions and answers because Yahoo had stored them on its servers without any encryption. After the announcement, Yahoo added that it would encrypt security questions and answers in the future.

But that wasn’t the end of the Yahoo data breach. When October 2017 came around, the company made a statement that the data breach it thought had compromised 1 billion accounts had actually compromised close to 3 billion accounts.

The company said that a new investigation had revealed the wider scope, but that sensitive data such as passwords in clear text, payment card details and bank account information had not been compromised.

Whether or not Yahoo truthfully shared the details and extent of the data breach is more or less irrelevant, because it will remain one of the biggest data breaches in cybersecurity history. Some believe that Yahoo actually went through two massive data breaches within a span of four years: the first in 2013 and the other sometime in 2017. Either way, Yahoo took four years to fully account for the impact of the matter, and the company’s reputation as a secure web service crashed.

AIS (Advanced Info Service) Data Breach

This data breach affected around 8 billion records. Security researcher Justin Paine discovered it on May 7, 2020, when he found an open Elasticsearch database while browsing Shodan and BinaryEdge.

Paine reported that the database was controlled by a subsidiary of AIS (Advanced Info Service), a mobile network operator based in Thailand. AIS is one of the biggest GSM phone operators in the country and boasts over 40 million customers. The open database exposed a potent combination of NetFlow logs and DNS query logs belonging to customers of AIS’s subsidiary, AWN (Advanced Wireless Network).

At the time, Paine said that, according to BinaryEdge data, the database had likely first become exposed and publicly accessible around May 1, 2020.

Over the following three weeks or so, the database kept growing by the hour, adding roughly 200 million new rows every day.

The total number of exposed records quickly crossed the 8 billion mark, with a final tally of 8.3 billion records in a single database. AIS did confirm the data breach. Like most companies that have had to deal with data breaches, it said its security mechanisms had not done the job.

AIS thanked the security researcher for taking the time to report the data breach to both AIS and the Thailand National CERT team.

After a verification process, AIS concluded that the leaked data consisted of internet usage patterns and that there was no evidence it could enable hackers to identify the company’s customers.

Keepnet Labs Data Breach

This data breach affected over 5 billion records and was reported in March 2020. As with some of the other biggest data breaches in history, security researchers found a leaked Elasticsearch database.

Security expert Bob Diachenko put out the first report. Using techniques such as checking reverse DNS records and SSL certificates, he ascertained that the leaked Elasticsearch database was managed by a cybersecurity company based in the United Kingdom.

Diachenko also noted the irony of the discovery: the exposed database was itself a database of data breaches. Such databases are huge, and this one contained a large amount of information related to security incidents from the past several years. The researcher added that the leak did not expose customer records or even company data. What it did leak was a collection of data from previous breaches.

According to Diachenko, BinaryEdge had indexed the database, and the Elasticsearch cluster contained two collections, labeled leaks_v1 and leaks_v2. The leaks_v1 collection had over 5 billion records, while the leaks_v2 collection had over 15 million records. All records were updated in real time.

Diachenko reported that the compromised database had good structure and included information such as:

  • Sources of the previous data leaks (including past breaches that happened at companies such as VK, Tumblr, LinkedIn, Twitter, and Adobe)
  • Email domain
  • Email addresses
  • Passwords (plaintext, encrypted or hashed, depending on the original breach)
  • Leak year
  • Hash type
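
To make the shape of such a record concrete, here is a minimal Python sketch built from the fields listed above. The class name, field names and sample values are all hypothetical; Diachenko's report describes the kinds of information in the database, not its actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LeakRecord:
    """One entry in a breach-compilation index (hypothetical field names)."""
    source: str               # the earlier breach this entry came from
    leak_year: int            # year of that earlier breach
    email: str
    password: str             # plaintext, encrypted or hashed, depending on the source
    hash_type: Optional[str]  # e.g. "sha1"; None when the password is plaintext

    @property
    def email_domain(self) -> str:
        # Derive the domain from the address: everything after the last "@".
        return self.email.rsplit("@", 1)[-1].lower()

    @property
    def is_plaintext(self) -> bool:
        # A record with no hash type is treated as a plaintext password.
        return self.hash_type is None


# Made-up sample values, for illustration only.
record = LeakRecord(
    source="ExampleSite",
    leak_year=2016,
    email="alice@Example.com",
    password="<hashed value>",
    hash_type="sha1",
)
print(record.email_domain, record.is_plaintext)
```

Even in this toy form, it is easy to see why a well-structured collection like this is valuable to attackers: every entry pairs an email address with a password and enough context to know how to use it.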

About a month after Diachenko had reported the breach, Keepnet Labs put out its own public statement addressing the data leak. According to the statement, the company had begun a new partnership with another service provider in March 2020, and the new provider had performed several scheduled maintenance tasks and begun migrating the Elasticsearch database.

However, during the migration, the engineer working on the job disabled the firewall to speed up the operation, for what the engineer himself said was around 10 minutes. That 10-minute window was more than enough for BinaryEdge, an internet indexing service, to index the data in the Elasticsearch database.
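
As a rough illustration of why a few minutes without a firewall can be enough, the Python sketch below shows the kind of single unauthenticated request that tells an outsider whether an Elasticsearch cluster is open. It is an assumption-laden sketch, not a description of how BinaryEdge actually works: the function names are hypothetical, and the check relies only on the fact that an Elasticsearch root endpoint answers with a JSON document containing a version block and a tagline, while a cluster that requires credentials answers with an HTTP 401 or 403.

```python
import json
import urllib.error
import urllib.request


def classify_probe(status: int, body: str) -> str:
    """Interpret the response to an unauthenticated GET on a cluster's root URL."""
    if status in (401, 403):
        return "protected"      # the server asked for credentials
    if status == 200:
        try:
            info = json.loads(body)
        except ValueError:
            return "unknown"    # not JSON, so probably not Elasticsearch
        # An Elasticsearch root endpoint reports a version block and a tagline.
        if isinstance(info, dict) and "version" in info and "tagline" in info:
            return "open"       # readable by anyone who can reach the port
    return "unknown"


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated request. Only run this against servers you own."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify_probe(resp.status, body)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx answers arrive as exceptions that carry the status code.
        return classify_probe(err.code, "")
    except (urllib.error.URLError, OSError):
        return "unreachable"    # firewalled, offline or wrong address
```

Calling `probe("http://localhost:9200/")` against your own test cluster should return "open" if it accepts anonymous requests. With the firewall in place, the same request from outside would simply come back as "unreachable".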

Another interesting bit about this data breach story is that Keepnet approached several media outlets, and Diachenko himself, and asked them to amend their stories so as not to mislead the public about the security breach. Apparently, Keepnet wanted readers to know that the breach happened not because of weak security systems but because of an engineer’s negligence during the migration.

BlueKai Data Breach

This data breach affected records numbering in the billions and was first reported in June 2020. Anurag Sen, a security researcher, discovered an unsecured database that he could access via the open internet.

The database belonged to BlueKai, a startup acquired by Oracle in 2014 for a reported $400 million. It held several billion records that contained information, including:

  • Names
  • Email addresses
  • Home addresses
  • Web browsing history, which contained more specific information like newsletter subscriptions
  • Online purchases

According to TechCrunch, BlueKai had built up one of the largest banks of web tracking data outside the federal government. BlueKai uses tracking technologies, including website cookies, that follow users across the internet as they visit different websites and services.

Cyware reported that BlueKai tracked almost 1.2% of all internet traffic. More specifically, it tracked activity on popular websites such as ESPN, Forbes, Healthline, Glassdoor, Rotten Tomatoes, Amazon, Levi’s and The New York Times. Since these websites serve millions of users each month, it stands to reason that this data breach compromised a significant amount of data.

Aadhaar Data Breach

The Aadhaar data breach affected more than a billion people and took place in March 2018. Unlike some of the other data breaches on this list, this data breach resulted in the leakage of some of the most sensitive data that hackers could use to carry out identity fraud online.

Hackers managed to expose the personal information of over 1.1 billion Indian citizens and then put that information up for sale on the dark web.

The breach allowed hackers to make off with data from what was, at the time, the largest biometric database in the world. It was later revealed that the breach came about because of a leak in a system run by a state-owned utility service.

Regardless, the data leak enabled complete access to Aadhaar holders’ private information. It exposed Aadhaar holders’ names, financial information such as their personal bank details and, most importantly, their identity numbers (a unique 12-digit number).

And that’s not all. The data leak also resulted in more personal data such as retina scans, thumbprints and photographs of almost all Indian citizens being exposed. As mentioned above, such specific and unique identifying details can easily be used to commit identity fraud in the future.

Whisper Data Breach

While the number of people affected by the Whisper data breach is unknown, it is known that it led to the compromise of a database that contained over 900 million records and posts. The metadata generated around those posts also lost protection as a result of the breach. The exact date of the breach is also unknown, but the database was found in March 2020.

Whisper is an app that offers secret-sharing tools to strangers. The service markets itself as the safest place to share secrets on the internet. The data breach leaked the personally identifiable information of millions, including:

  • Ages
  • Locations
  • Random details
  • Intimate confessions

In addition, the leak enabled access to the posts made on the app. The app offered anonymity when those posts were made, but the database tied even more information directly to the “anonymous” posts. Fortunately for the people whose information got leaked, the data did not have any real names attached to it. However, it did include the names users chose for themselves when they signed up.

The leak exposed other personal information as well, such as:

  • Ethnicity
  • Hometown
  • Gender
  • Nickname
  • Group memberships

One Washington Post report stated that security consultants Matthew Porter and Dan Ehrlich, along with some independent researchers, discovered the leaked database.

Researchers said they had complete access to more than 900 million records belonging to users, which essentially spanned the entire lifetime of the Whisper app itself.

First American Financial Data Breach

This data breach exposed around 885 million records, and it was first discovered in May 2019. The First American Financial Corporation reported the data breach and said that over 16 years’ worth of records had been leaked, including incredibly sensitive information such as social security numbers. The records contained personal data including:

  • Social security numbers
  • Bank account information
  • Wire transactions
  • Mortgage paperwork


As you can see, even though the number of affected people is not as high as in some of the other data breaches we mentioned in this post, the information that has been leaked is of extreme sensitivity.

Data Breach

This data breach affected around 763 million users in 2019. An unprotected MongoDB instance leaked millions of email addresses, phone numbers, dates of birth and other personal information. With no password protecting the data, hackers would have had ample opportunity to use it for identity theft and other types of fraud.

The data breach that impacted over 700 million users leaked personal information such as:

  • Names
  • Phone numbers
  • Email addresses
  • Dates of birth
  • Genders
  • IP addresses

Facebook Data Breach

Facebook suffered a data breach affecting over 540 million users. The incident was reported in April 2019.

Technically, Facebook didn’t leak the data itself. According to the UpGuard Cyber Risk research team, two third-party Facebook apps exposed datasets that were left openly accessible on the public internet.

One of these apps was developed by Cultura Colectiva, a media company based in Mexico. The data it leaked contained 540 million user records, amounting to 146 GB.

It contained information such as:

  • Account names
  • Facebook IDs
  • Reactions
  • Likes
  • Comments

Sina Weibo Data Breach

In March 2020, it was revealed that around 540 million users of Weibo had their personal details leaked as a result of a data breach. Weibo is one of the biggest social networking websites in China. Attackers had hacked the platform about a year before the breach was reported and put data belonging to millions of Chinese users up for sale online.

Interestingly enough, the database containing over 540 million records sold for only $250 on the dark web. Apparently, it did not contain information that attackers could use directly to break into accounts. More specifically, it did not contain user passwords or payment information.

But that doesn’t mean the leak was entirely useless. The leaked data still included personally identifiable information such as:

  • Names
  • Usernames
  • Locations
  • Phone numbers for a portion of the users
  • Gender

It goes without saying that hackers can use such information to carry out identity theft and other types of fraud that rely on the impersonation of other people.

Conclusion: Biggest Data Breaches That Affect Millions of Users May Become Common

There are a ton of other data breaches that we left out of this list as they impacted fewer users. But that doesn’t mean they weren’t damaging for the users and companies involved. And since the number of such attacks is increasing on a yearly basis, organizations and services will have to enhance and redeploy their security and privacy strategies in order to keep customers’ data and personal information secure.

The global pandemic did not help in this regard, as it tied up finances and resources that companies would have needed for new security mechanisms to protect customer information.

Learn about how to safeguard your own data in our complete privacy guide.
