With Equifax’s breach of 145 million records still fresh in everyone’s memory and the recent Facebook data privacy controversy, protecting personal data has become part of the political, economic and cultural zeitgeist. Debates over how data can be misused are now commonplace, and newsfeeds are awash with stories of “yet another breach of personal information”. There’s a reason for this: data is a valuable commodity, and there’s a lot of money to be made from trading personal information or using it for fraud. Cybercriminals therefore continue to launch phishing campaigns and network intrusions designed to collect personal data.

However, our latest research report, “Too Much Information”, highlights that there is a large amount of personal data already exposed that puts your employees and customers at risk. This data is unintentionally made public through misconfigured Amazon S3 buckets, rsync, SMB, FTP, NAS drives, and misconfigured websites. Let’s focus on a few examples that illustrate the extent of this exposure.

Tax Returns

Today is tax deadline day, which means there are still people scrambling to submit their tax returns. This window affords criminals opportunities to commit tax return fraud. As we talked about in a previous blog, “It’s Accrual World: Tax Return Fraud in 2018”, criminals go to great lengths to acquire this information. Spoiler alert: there’s plenty of information already out there.

Figure 1: Types of publicly-available personal information

In fact, the most common employee data found in our research was payroll and tax return files, which accounted for 700,000 and 60,000 files respectively. Looking into many of these examples, it was common for this information to be exposed through a contractor – for instance, a boutique accounting firm that backed up their client information. A redacted exposed pay stub is shown in Figure 2.

Figure 2: A redacted example of an exposed pay stub

Unhealthy Exposure

Aside from financial information, there was also a strong medicinal flavor to the findings. Almost 5,000 patient lists were publicly available. Most surprisingly, we found over two million .dcm files (2,205,350) exposed through an open SMB port on a host based in Italy. Digital Imaging and Communications in Medicine (DICOM) is the standard format for storing and exchanging medical images, such as MRI scans, and these files contain personal health information. That’s an awful lot of files, and it doesn’t get much more personal than that.

Personally Identifiable Information versus Personal Data

Personally Identifiable Information (PII) and Personal Data are two terms that are often used interchangeably. PII is mainly used in the U.S. and is defined by NIST as:

“Any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.”

Pretty comprehensive, right? Well, not as comprehensive as “personal data”, which broadens the definition to include things like device IDs, IP addresses, and cookies. “Personal data” is the term used by the General Data Protection Regulation (GDPR), which comes fully into force next month.

Our research found that a significant portion of the exposed data was in the European Union (537,720,919 files). With GDPR firmly on the horizon, organizations must consider how they are protecting employee and consumer information across these services. With employees and contractors often backing up and archiving data on their home networks or using cloud storage solutions, organizations need to ensure they have visibility into all the places where their customers’ personal data may be exposed. Out of sight may mean out of mind, but with GDPR coming into force, this could also mean organizations may soon be “out of pocket”.

Figure 3: The top countries making up the 500 million exposed files in the European Union
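
As a starting point for that kind of visibility, the sketch below checks whether a host you administer accepts TCP connections on the default ports of three of the services named in this post (FTP, SMB and rsync). It is a minimal illustration, not the method used in our research: the port numbers are the conventional defaults, the function names are hypothetical, and an open port only shows that the service is reachable, not that any data on it is readable without authentication.

```python
import socket

# Conventional default ports for services named in this post.
# Assumption: a host may run these services on other ports.
SERVICE_PORTS = {"ftp": 21, "smb": 445, "rsync": 873}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or name resolution failed.
        return False


def exposed_services(host, ports=SERVICE_PORTS, timeout=1.0):
    """Return the sorted names of services accepting connections on host."""
    return sorted(
        name for name, port in ports.items()
        if is_port_open(host, port, timeout)
    )
```

Calling exposed_services with the address of a backup server or NAS drive you are responsible for returns a list such as the names of any of the three services that answered. Anything in that list is worth a follow-up check of its access controls.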


To learn more about the other types of sensitive data that these services are exposing, download a copy of our report. You can also find out more about the implications of GDPR in our “Path to Compliance” paper.



