Over the weekend, the press reported a significant data leak containing the records of 533 million Facebook users. The records were posted for free on multiple cybercriminal forums. The leak exposed users’ personal information, including phone numbers, email addresses, full names, occupations, and birth dates. Much of this information was likely scraped from public Facebook profiles, as Facebook alluded to in a statement made on 06 Apr 2021. However, the leak also included data that users had not made public, such as their phone numbers. In this blog, we dive into what happened, how the information was exposed, who has claimed responsibility for the attack, and the risks to affected users.

What is the Facebook Data Leak?

The initial incident started in mid-to-late 2019. It is believed that threat actors scraped Facebook’s website to acquire the information of millions of users. Web scraping refers to the process of using automated scripts or bots to harvest public information from sites, such as any information users make publicly available on their profiles (Names, City, Education, etc.). 

Scraping is not a new technique, and it occurs daily. Cybercriminals frequently scrape Facebook, Twitter, Reddit, and many other sites. They can use the extracted data for a variety of purposes, including spamming, information gathering, and social engineering attacks. They can also sell scraped data for a profit to other cybercriminals, marketing companies, or call centers.

Figure 1: Raidforums user advertising scraped Instagram database

As previously mentioned, data scraped from sites is usually public data. If users set their emails, names, and locations to be public, then that data can be viewed and harvested by virtually anyone. However, the Facebook exposure wasn’t a typical scraping incident. Threat actors were able to harvest users’ phone numbers, even if the users had set their number to be private on their Facebook profiles. Facebook stated that it believed cybercriminals accomplished this by exploiting Facebook’s “contact importer” feature, which allows users to find other users by their phone numbers.

This feature could have been exploited by uploading large sets of phone numbers and identifying which Facebook profiles matched the numbers. Facebook stated that this feature was fixed in September 2019, following the discovery that threat actors were abusing the feature. However, while Facebook fixed the feature in 2019, the phone numbers of 533 million users had already been harvested by malicious individuals, along with other identifying information on users.
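To make the mechanism described above concrete, here is a minimal toy sketch in Python. It is not Facebook’s API and not the attackers’ tooling: the directory, the function names, and the phone numbers (taken from a fictional 555-01xx range) are all hypothetical and live entirely in memory. It only illustrates why a "find people by phone number" lookup can reveal which numbers belong to which profiles when someone submits a large, sequential list.

```python
# Toy, in-memory illustration only; nothing here contacts a real service.
# HYPOTHETICAL_DIRECTORY stands in for a "find friends by phone number" feature.
HYPOTHETICAL_DIRECTORY = {
    "+15550100003": "profile_a",
    "+15550100007": "profile_b",
}


def import_contacts(numbers):
    """Return the profile matched to each uploaded number, if there is one."""
    return {
        number: HYPOTHETICAL_DIRECTORY[number]
        for number in numbers
        if number in HYPOTHETICAL_DIRECTORY
    }


def candidate_numbers(prefix, count):
    """Generate `count` sequential numbers under a prefix, zero-padded."""
    width = len(str(count - 1))
    return [f"{prefix}{i:0{width}d}" for i in range(count)]


# Uploading a whole block of numbers reveals which ones map to a profile,
# regardless of whether the profile owner displays the number publicly.
uploaded = candidate_numbers("+1555010000", 10)
matches = import_contacts(uploaded)

for number, profile in matches.items():
    print(number, "->", profile)
```

In this sketch, the lookup answers for every number it is given, which is why features of this kind are typically paired with limits on how many numbers one account can submit.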

How Was the Facebook Data Distributed in the Cybercriminal World?

Initially, attackers offered the data at quite a steep price. As the data began circulating in open and gated cybercriminal forums in 2020, a listing on Russian-speaking cybercriminal forum XSS in August 2020 advertised the sale of this data for “only” USD 25,000 (see Figure 2). Listings were identified across several other forums, such as Raidforums. The sheer size of the data leakage and the wide geography it covered (106 countries) made the data a gold mine for cybercriminals. Therefore, these listings often caught the interest of multiple threat actors.

Figure 2: XSS user advertises Facebook leak in August 2020

The XSS user who initially shared the data was allegedly responsible for the attack. When other forum members questioned the origin of the breached data, the original poster claimed that they had exploited a zero-day vulnerability on Facebook’s website. This vulnerability allegedly allowed the threat actor to grab users’ data from their Facebook ID (see Figures 3-4). The user also stated that the data extracted dated from 01 Jan 2020, as Facebook had patched the vulnerability by then. The user did not provide further information.

 

Figure 3: XSS user claims that they exploited a vulnerability to crawl Facebook users

Figure 4: XSS user provides more information on how they claim to have acquired the Facebook data leak

Cybercriminals often purchase data to resell it to other cybercriminals for a profit, with the price falling at each transaction. Between 2019 and 2021, the data likely changed hands multiple times, an activity frequently observed in cybercriminal forums. Eventually, the breached data loses its value, and users share it for free to gain reputation or notoriety within a cybercriminal forum. This is the most likely scenario for the Facebook breach. On 03 April 2021, a user on the English-speaking cybercriminal forum Raidforums uploaded the entire Facebook breach for a negligible cost of eight forum tokens (approximately USD 2.52).

Figure 5: Raidforums user exposes the Facebook data leak for free

Within 5 days, more than 4,800 forum members had unlocked the data with their tokens; the thread received over 1,000 replies and 200,000 views, making it one of the most viewed threads on the criminal forum. The data was an instant success within the cybercriminal community. The data leakage and free download links have since been reposted across multiple deep and dark web forums. The data can now be easily acquired by any cybercriminals who wish to use it.

What Data Was Included in the Breach?

Virtually every individual included in the data leakage had their phone number exposed, including Mark Zuckerberg himself and other founding members of Facebook. Apart from the phone number, the extent of the exposure likely depended on how much information users had left public on their profile. Any data that was public on the affected Facebook profiles was likely harvested. The dataset typically included the victim’s full name, location, phone number, Facebook ID, employer, and birth date.

Figure 6: Mark Zuckerberg’s data exposed in the Facebook leak (phone number censored)

Email addresses were also a high-value, sensitive piece of personal data exposed in this leakage. However, not all accounts contained exposed emails: security researchers suggested that only those accounts that had opted to make their email addresses public in 2019 were affected. Digital Shadows (now ReliaQuest) identified more than 122 million email addresses listed in the data leak. Most of these were addresses on the Facebook.com domain, which were likely used for Facebook messages and were not users’ personal email addresses. Removing these reveals a more realistic number of emails exposed in the breach. The remaining email addresses were distributed as follows.

Total emails exposed (excluding Facebook.com emails) 3,300,747
.com emails 2,602,626
.edu emails 5,997
.org emails 3,428
.gov emails 514
Others (.de, .net, .fr, .co.uk, .ru, etc.) 688,182
Table 1: Number of email addresses exposed in the Facebook data leak
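As a quick sanity check on Table 1, the short Python sketch below adds up the per-category counts and compares the result with the reported total. The variable names are ours, chosen for illustration; the figures are copied from the table.

```python
# Counts copied from Table 1 (Facebook.com addresses already excluded).
breakdown = {
    ".com": 2_602_626,
    ".edu": 5_997,
    ".org": 3_428,
    ".gov": 514,
    "others": 688_182,
}
reported_total = 3_300_747

# The categories should add up to the reported total.
computed_total = sum(breakdown.values())
print("computed:", computed_total, "reported:", reported_total)

# Share of each category as a percentage of the total, to two decimals.
shares = {
    category: round(100 * count / computed_total, 2)
    for category, count in breakdown.items()
}
for category, share in shares.items():
    print(f"{category}: {share}%")
```

The categories sum exactly to the reported total, and .com addresses account for roughly 79% of the non-Facebook.com emails.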

What is Your Risk as a Facebook User?

If you believe that your email address or phone number was affected by the breach, you can check whether your data was exposed with the service HaveIBeenZucked.

Fear not! The leaked data included no passwords, and it is unlikely that cybercriminals can use the information by itself to hack into your accounts. However, users who had their data exposed should be wary of suspicious and unsolicited emails, phone calls, and messages from unknown sources. Considering the high interest that this leakage has attracted within cybercriminal communities, it is highly likely that criminals will attempt to use the data to launch social engineering attacks or spam users with unwanted messages. Call centers may also use this data to launch vishing (voice phishing) attacks on unsuspecting victims.

While this data may be “old,” it is likely that the information has remained unchanged for most users. After all, individuals do not usually change their phone number and email address every year or two. High-profile Facebook users, such as politicians, company executives, and public figures, are most likely to be targeted by attacks, but all affected users should proceed with care. Data leakages such as this one are common, and if your information wasn’t affected by this leakage, it might have been exposed in other incidents. As security experts often say, it is not a matter of if data has been exposed, but when. Therefore, it is crucial for users to always be cautious and exercise security best practices wherever possible.

Annex A

The list of countries affected, along with the number of records exposed:

Egypt 45,183,147
Italy 35,677,337
USA 32,315,291
Saudi Arabia 28,804,686
France 19,848,557
Turkey 19,638,821
Morocco 19,147,770
Colombia 17,957,906
Iraq 17,116,398
South Africa 14,323,766
Mexico 13,330,561
Malaysia 11,675,893
United Kingdom 11,522,328
Algeria 11,505,898
Spain 10,894,206
Russia 9,996,405
Sudan 9,464,722
Nigeria 9,000,127
Peru 8,075,316
Brazil 8,064,915
Australia 7,320,478
UAE 6,978,927
Syria 6,939,528
Chile 6,889,082
Tunisia 6,247,880
India 6,162,449
Germany 6,054,422
Netherlands 5,430,387
Oman 5,048,532
Yemen 4,617,359
Kuwait 4,502,021
Libya 4,204,514
Israel 3,956,428
Bangladesh 3,816,531
Canada 3,494,385
Palestine 3,367,570
Kazakhstan 3,214,290
Belgium 3,183,540
Jordan 3,105,988
Singapore 3,073,009
Iran 3,057,522
Bolivia 2,959,209
Hong Kong 2,937,841
Qatar 2,789,724
Poland 2,669,381
Argentina 2,339,557
Portugal 2,227,361
Cameroon 1,997,658
Lebanon 1,829,661
Guatemala 1,645,068
Switzerland 1,592,039
Uruguay 1,509,317
Panama 1,502,310
Costa Rica 1,464,002
Ireland 1,449,921
Bahrain 1,424,219
Finland 1,381,569
Czech Republic 1,375,988
Austria 1,249,388
Sweden 1,092,140
Ghana 1,027,969
Philippines 889,629
Mauritius 848,558
Taiwan 734,807
China 670,334
Croatia 659,115
Denmark 639,841
Greece 617,722
Afghanistan 558,393
Angola 508,903
Albania 506,602
Norway 475,809
Bulgaria 432,473
Japan 428,615
Macao 414,284
Namibia 409,356
Jamaica 385,890
Hungary 377,045
Ecuador 318,824
Botswana 240,632
Slovenia 229,039
Lithuania 220,160
Brunei 213,798
Luxembourg 188,201
Serbia 162,898
Puerto Rico 138,183
Indonesia 130,321
South Korea 121,744
Cyprus 119,022
Malta 115,367
Azerbaijan 99,472
Georgia 95,193
Estonia 87,533
Maldives 86,337
Moldova 46,237
Iceland 31,343
Honduras 16,142
Burundi 15,709
Haiti 15,407
Djibouti 14,327
Ethiopia 12,752
Burkina Faso 6,413
Fiji 5,364
El Salvador 4,479
Cambodia 2,838
Table 2: Records exposed per country (ordered from largest to smallest)
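For readers who want to work with the figures in Table 2, the following Python sketch shows one way to parse rows in the "Country count" layout used above. The helper name parse_row and the three sample rows are our own illustrative choices; pasting in the full list would yield the overall record count.

```python
def parse_row(line):
    """Split 'Country Name 1,234,567' into ('Country Name', 1234567)."""
    # Country names may contain spaces, so split only on the last space.
    country, count = line.rsplit(" ", 1)
    return country, int(count.replace(",", ""))


# Three rows copied from Table 2 as a sample.
sample_rows = [
    "Egypt 45,183,147",
    "Saudi Arabia 28,804,686",
    "Cambodia 2,838",
]

records = dict(parse_row(row) for row in sample_rows)
sample_total = sum(records.values())

for country, count in records.items():
    print(f"{country}: {count:,}")
print(f"Sample total: {sample_total:,}")
```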