The Facebook Data Leak Explained

ReliaQuest 8 April 2021

This weekend, the press reported a significant data leak containing the records of 533 million Facebook users. The records were posted for free on multiple cybercriminal forums. The incident exposed users’ personal information, including phone numbers, email addresses, full names, occupations, and birth dates. Much of this information was likely scraped from public Facebook profiles, as Facebook alluded to in a statement made on 06 Apr 2021. However, the leak also included data that users had not made public, such as their phone numbers. In this blog, we dive into what happened, how the information was exposed, who has claimed responsibility for the attack, and the risks to affected users.

What is the Facebook Data Leak?

The initial incident started in mid-to-late 2019. It is believed that threat actors scraped Facebook’s website to acquire the information of millions of users. Web scraping refers to the process of using automated scripts or bots to harvest public information from sites, such as any information users make publicly available on their profiles (names, city, education, and so on).

Scraping is not a new technique, and it occurs daily. Cybercriminals frequently scrape Facebook, Twitter, Reddit, and many other sites. They can use the extracted data for a variety of purposes, including spamming, information gathering, and social engineering attacks. They can also sell scraped data for a profit to other cybercriminals, marketing companies, or call centers.

Figure 1: Raidforums user advertising scraped Instagram database

As previously mentioned, data scraped from sites is usually public data. If users set their emails, names, and locations to be public, then that data could be viewed and harvested by virtually anyone. However, the Facebook exposure wasn’t a typical data scraping incident. Threat actors were able to harvest users’ phone numbers, even if the users had set their number to be private on their Facebook profiles. Facebook stated that it believed cybercriminals accomplished this by exploiting Facebook’s “contact importer” feature, which allows users to find other users by their phone numbers.

This feature could have been exploited by uploading large sets of phone numbers and identifying which Facebook profiles matched the numbers. Facebook stated that this feature was fixed in September 2019, following the discovery that threat actors were abusing the feature. However, while Facebook fixed the feature in 2019, the phone numbers of 533 million users had already been harvested by malicious individuals, along with other identifying information on users.
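
The matching step described above can be illustrated with a toy model. The following is a minimal Python sketch, not Facebook’s actual implementation: the `Profile` class, the `import_contacts` and `candidate_numbers` functions, and all sample data are hypothetical. It only shows why a lookup keyed on phone numbers can reveal which profile a number belongs to, even when that number is hidden on the profile page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    user_id: int
    name: str
    phone: str
    phone_is_public: bool  # visibility setting on the profile page


# Hypothetical in-memory user directory (invented sample data).
DIRECTORY = [
    Profile(1, "Alice Example", "+15550000001", phone_is_public=False),
    Profile(2, "Bob Example", "+15550000002", phone_is_public=True),
]

PHONE_INDEX = {p.phone: p for p in DIRECTORY}


def import_contacts(uploaded_numbers):
    """Toy contact importer: return the profile matched by each uploaded number.

    The flaw being illustrated: the lookup ignores phone_is_public, so a
    caller who guesses a number learns which profile it belongs to.
    """
    return {n: PHONE_INDEX[n] for n in uploaded_numbers if n in PHONE_INDEX}


def candidate_numbers(prefix, start, count, width=4):
    """Generate a sequential block of numbers, as a bulk upload might."""
    return [f"{prefix}{i:0{width}d}" for i in range(start, start + count)]


matches = import_contacts(candidate_numbers("+1555000", 0, 10))
for number, profile in matches.items():
    print(number, profile.user_id, profile.name, profile.phone_is_public)
```

In this sketch, the private number is matched just as readily as the public one, because the visibility flag only governs what the profile page displays, not what the lookup returns.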

How Was the Facebook Data Distributed in the Cybercriminal World?

Initially, attackers offered the data at quite a steep price. As the data began circulating in open and gated cybercriminal forums in 2020, a listing on Russian-speaking cybercriminal forum XSS in August 2020 advertised the sale of this data for “only” USD 25,000 (see Figure 2). Listings were identified across several other forums, such as Raidforums. The sheer size of the data leakage and the wide geography it covered (106 countries) made the data a gold mine for cybercriminals. Therefore, these listings often caught the interest of multiple threat actors.

Figure 2: XSS user advertises Facebook leak in August 2020

The XSS user who initially shared the data was allegedly responsible for the attack. When other forum members questioned the origin of the breached data, the original poster claimed that they had exploited a zero-day vulnerability on Facebook’s website. This vulnerability allegedly allowed the threat actor to grab users’ data from their Facebook ID (see Figures 3-4). The user also stated that the data extracted dated from 01 Jan 2020, as Facebook had patched the vulnerability by then. The user did not provide further information.

Figure 3: XSS user claims that they exploited a vulnerability to crawl Facebook users

Figure 4: XSS user provides more information on how they claim to have acquired the Facebook data leak

Cybercriminals often purchase data to resell it to other cybercriminals for a profit, and the price of a dataset typically drops with each transaction. From 2019 to 2021, the data likely changed hands multiple times, an activity frequently observed in cybercriminal forums. Eventually, a dataset becomes devalued, and users release it for free to gain reputation or notoriety within a forum. This is most likely what happened with the Facebook data. On 03 April 2021, a user on the English-speaking cybercriminal forum Raidforums uploaded the entire Facebook dataset for a negligible cost of eight forum tokens (approximately USD 2.52).

Figure 5: Raidforums user exposes the Facebook data leak for free

Within five days, more than 4,800 forum members had unlocked the data with their tokens. The thread received over 1,000 replies and 200,000 views, making it one of the most viewed threads on the forum. The data was an instant success within the cybercriminal community. The leak and free download links have since been reposted across multiple deep and dark web forums, so the data can now be easily acquired by any cybercriminal who wishes to use it.

What Data Was Included in the Breach?

Virtually every individual included in the leak had their phone number exposed, including Mark Zuckerberg himself and other founding members of Facebook. Beyond phone numbers, the extent of the exposure likely depended on how much information users had left public on their profiles: any data that was public on an affected profile was likely harvested. The dataset typically included victims’ full names, locations, phone numbers, Facebook IDs, employers, and birth dates.

Figure 6: Mark Zuckerberg’s data exposed in the Facebook leak (phone number censored)

Email addresses were another high-value, sensitive piece of personal data exposed in this leak. However, not all records contained email addresses: security researchers believed that only accounts that had made their email addresses public in 2019 were affected. Digital Shadows (now ReliaQuest) identified more than 122 million email addresses in the data leak. Most of these were addresses at the Facebook.com domain, which were likely used for Facebook messages rather than being users’ personal email addresses. Removing these gives a more realistic number of emails exposed in the breach. The remaining email addresses were distributed as follows.

Total emails exposed (excluding Facebook.com emails)	3,300,747
.com emails	2,602,626
.edu emails	5,997
.org emails	3,428
.gov emails	514
Others (.de, .net, .fr, .co.uk, .ru, etc)	688,182

Table 1: Number of email addresses exposed in the Facebook data leak
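
A breakdown like Table 1 can be produced by bucketing addresses on their domain suffix. The Python sketch below is an assumption about how such a tally might be done, not the method used to build the table; the function name `tally_by_suffix`, the bucket labels, and the sample addresses are hypothetical. The last lines cross-check that the row figures in Table 1 add up to the stated total.

```python
from collections import Counter

# Suffix buckets mirroring the rows of Table 1; anything else is "other".
SUFFIXES = (".com", ".edu", ".org", ".gov")


def tally_by_suffix(emails, excluded_domain="facebook.com"):
    """Count addresses per domain suffix, skipping the excluded domain."""
    counts = Counter()
    for email in emails:
        local, sep, domain = email.strip().lower().rpartition("@")
        if not sep or not local or not domain:
            continue  # not a usable address
        if domain == excluded_domain:
            continue  # platform-issued address, not a personal one
        bucket = next((s for s in SUFFIXES if domain.endswith(s)), "other")
        counts[bucket] += 1
    return counts


# Invented sample addresses for illustration only.
sample = [
    "a@example.com",
    "b@example.edu",
    "c@example.org",
    "d@example.co.uk",
    "e@facebook.com",
    "not-an-address",
]
counts = tally_by_suffix(sample)
print(dict(counts), sum(counts.values()))

# Cross-check of Table 1: .com, .edu, .org, .gov, and "others" rows.
table_rows = [2_602_626, 5_997, 3_428, 514, 688_182]
print(sum(table_rows) == 3_300_747)
```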

What is Your Risk as a Facebook User?

If you believe that your email address or phone number was affected by the breach, you can check whether your data was exposed with the service HaveIBeenZucked.

Fear not! The leaked data included no passwords, and it is unlikely that cybercriminals can use the information by itself to hack into your accounts. However, users who had their data exposed should be wary of suspicious and unsolicited emails, phone calls, and messages from unknown sources. Considering the high interest that this leak has attracted within cybercriminal communities, it is highly likely that criminals will attempt to use the data to launch social engineering attacks or spam users with unwanted messages. Call centers may also use this data to launch vishing (voice phishing) attacks on unsuspecting victims.

While this data may be “old,” it is likely that the information has remained unchanged for most users. After all, individuals do not usually change their phone number and email address every year or two. High-profile Facebook users, such as politicians, company executives, and public figures, are the most likely targets, but all affected users should proceed with care. Data leaks such as this one are common, and if your information wasn’t affected by this leak, it might have been exposed in other incidents. As security experts often say, it is not a matter of if data has been exposed, but when. Therefore, it is crucial for users to always be cautious and follow security best practices wherever possible.

Annex A

The list of countries affected, along with the number of records exposed:

Egypt	45,183,147
Italy	35,677,337
USA	32,315,291
Saudi Arabia	28,804,686
France	19,848,557
Turkey	19,638,821
Morocco	19,147,770
Colombia	17,957,906
Iraq	17,116,398
South Africa	14,323,766
Mexico	13,330,561
Malaysia	11,675,893
United Kingdom	11,522,328
Algeria	11,505,898
Spain	10,894,206
Russia	9,996,405
Sudan	9,464,722
Nigeria	9,000,127
Peru	8,075,316
Brazil	8,064,915
Australia	7,320,478
UAE	6,978,927
Syria	6,939,528
Chile	6,889,082
Tunisia	6,247,880
India	6,162,449
Germany	6,054,422
Netherlands	5,430,387
Oman	5,048,532
Yemen	4,617,359
Kuwait	4,502,021
Libya	4,204,514
Israel	3,956,428
Bangladesh	3,816,531
Canada	3,494,385
Palestine	3,367,570
Kazakhstan	3,214,290
Belgium	3,183,540
Jordan	3,105,988
Singapore	3,073,009
Iran	3,057,522
Bolivia	2,959,209
Hong Kong	2,937,841
Qatar	2,789,724
Poland	2,669,381
Argentina	2,339,557
Portugal	2,227,361
Cameroon	1,997,658
Lebanon	1,829,661
Guatemala	1,645,068
Switzerland	1,592,039
Uruguay	1,509,317
Panama	1,502,310
Costa Rica	1,464,002
Ireland	1,449,921
Bahrain	1,424,219
Finland	1,381,569
Czech Republic	1,375,988
Austria	1,249,388
Sweden	1,092,140
Ghana	1,027,969
Philippines	889,629
Mauritius	848,558
Taiwan	734,807
China	670,334
Croatia	659,115
Denmark	639,841
Greece	617,722
Afghanistan	558,393
Angola	508,903
Albania	506,602
Norway	475,809
Bulgaria	432,473
Japan	428,615
Macao	414,284
Namibia	409,356
Jamaica	385,890
Hungary	377,045
Ecuador	318,824
Botswana	240,632
Slovenia	229,039
Lithuania	220,160
Brunei	213,798
Luxembourg	188,201
Serbia	162,898
Puerto Rico	138,183
Indonesia	130,321
South Korea	121,744
Cyprus	119,022
Malta	115,367
Azerbaijan	99,472
Georgia	95,193
Estonia	87,533
Maldives	86,337
Moldova	46,237
Iceland	31,343
Honduras	16,142
Burundi	15,709
Haiti	15,407
Djibouti	14,327
Ethiopia	12,752
Burkina Faso	6,413
Fiji	5,364
El Salvador	4,479
Cambodia	2,838

Table 2: Records exposed per country (ordered from largest to smallest)