As digital transformation reaches every aspect of work, organizations’ data becomes more dispersed and voluminous by the day. The pandemic accelerated the adoption of digital technology even further, as organizations sought to remain operational during lengthy lockdown periods. For instance, remote working models and cloud adoption grew sharply over the previous two years, and many studies predict this adoption will continue to grow after the pandemic ends.
As a part of their daily work, enterprises today collect a massive amount of data from various sources, such as:
- Internet of Things (IoT) devices and sensors
- Log data from network appliances, security tools, and other devices
- Customers’ data, such as Personally Identifiable Information (PII), financial records, and patients’ health records
- Information about third parties, vendors, suppliers, contractors, and subcontractors
Not all data is stored in the same format, and much of it is sensitive and must be protected from unauthorized and abnormal access. The ability to track data access and detect anomalies in behavior, usage patterns, and other unusual events is critical for successfully enforcing zero trust.
Digitizing business data and processes brings numerous advantages to enterprises; however, it also widens their attack surface and makes them more susceptible to cyberattacks.
Data breaches have become the norm. Almost every week, the news reports a major data breach exposing sensitive customer records. According to IBM’s “Cost of a Data Breach” report, the average cost of a data breach reached $4.35 million in 2022. This figure is expected to climb further as more sensitive data finds its way to cloud systems.
In today’s complex IT threat landscape, data security compliance affects businesses across all sectors. Enterprises of every size are subject to one or more data security compliance requirements when storing, processing, or transmitting customers’ data, including personal, financial, and health information. Failing to protect sensitive customer information is costly and can damage your entire business. For instance, if your enterprise becomes non-compliant, it could face a multitude of business costs, such as:
- Fines that reach millions of dollars: for example, a severe violation of GDPR can draw a fine of up to 20 million euros or 4% of annual worldwide turnover, whichever is higher
- Civil lawsuits from affected customers
- Higher insurance premiums
- Reputation loss, as customers avoid organizations that do not protect their information adequately
Organizations are deploying various security solutions to protect their IT environment from unauthorized access and enforce additional security policies to govern employees’ access to sensitive resources, especially customers’ data. However, the primary defense strategy remains knowing:
- What are all your data sources and in which type of environment does the data reside?
- What systems and applications access, process, transform, and cleanse your data, and are they compliant with regulations?
- Who has access to different data sources and what can they do with access?
- What are the patterns of data use?
- Which of your data is most sensitive, such that a breach would pose a significant threat to your bottom line, and where does that data reside?
- Are the tools and controls in place working as configured and/or expected?
Data mapping has become an essential first step toward compliance with various data privacy regulations.
Defining Data Mapping
In the cybersecurity context, data mapping is defined as the practice of having a centralized data catalog that answers the following questions about an organization’s data assets:
- Where are they stored?
- Who has access to each piece of information?
- Who owns each asset?
- What currently deployed security controls and governance measures are used to protect these assets from unauthorized access?
A centralized data catalog lets an organization know what types of data it holds and how sensitive each is to the business, so it can plan the security controls and governance policies needed to protect them. Without data mapping, complete data visibility is impossible, which leaves important data unprotected and susceptible to unauthorized or abnormal access.
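As a rough illustration, a catalog entry can be modeled as a small record that answers the four questions above. The sketch below is only that: the field names, sensitivity labels, storage locations, and the `unprotected_sensitive_assets` helper are all assumptions made for the example, not a standard schema or any product's data model.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """One catalog entry. Field names are illustrative, not a standard schema."""
    name: str
    location: str                                             # where the asset is stored
    owner: str                                                # who owns the asset
    sensitivity: str                                          # e.g. "public", "internal", "restricted"
    authorized_users: set[str] = field(default_factory=set)   # who has access
    controls: set[str] = field(default_factory=set)           # deployed security controls


def unprotected_sensitive_assets(catalog: list[DataAsset]) -> list[str]:
    """Return names of restricted assets that have no recorded security control."""
    return [a.name for a in catalog if a.sensitivity == "restricted" and not a.controls]


# Hypothetical catalog contents.
catalog = [
    DataAsset("customer_pii", "s3://example-bucket/crm", "sales-ops", "restricted",
              {"alice", "bob"}, {"encryption-at-rest"}),
    DataAsset("patient_records", "onprem-db-01/health", "clinical-it", "restricted",
              {"carol"}),
    DataAsset("press_releases", "cms/public", "marketing", "public"),
]

print(unprotected_sensitive_assets(catalog))  # ['patient_records']
```

Once every asset is recorded this way, gaps such as sensitive data without any control become a simple query instead of a manual hunt.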
Organizations’ IT environments have become increasingly hybrid, spanning cloud and on-premises infrastructure. A centralized data mapping solution that discovers all sensitive data assets is therefore critical to achieving compliance. Automated solutions that leverage Artificial Intelligence and Machine Learning can find sensitive data hiding in both structured and unstructured data, such as SQL databases, Office documents, Excel spreadsheets, and log files.
How Data Mapping Helps with Compliance
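To make the idea of automated discovery concrete, here is a deliberately simple sketch that uses plain pattern matching rather than AI or ML. It scans one unstructured log line and one structured row for two kinds of sensitive values. The two patterns and the sample data are assumptions for illustration; real discovery tools need far broader coverage and validation.

```python
import re

# Illustrative patterns only; real PII detection needs much broader coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text: str) -> dict[str, int]:
    """Count matches per pattern in a blob of text (log line, document body, cell)."""
    counts = {label: len(rx.findall(text)) for label, rx in PATTERNS.items()}
    return {label: n for label, n in counts.items() if n}


def scan_rows(rows: list[dict[str, str]]) -> dict[str, set[str]]:
    """For structured rows, report which columns hold which kinds of sensitive data."""
    hits: dict[str, set[str]] = {}
    for row in rows:
        for column, value in row.items():
            for label in scan_text(str(value)):
                hits.setdefault(column, set()).add(label)
    return hits


# Hypothetical sample data.
log_line = "2023-01-05 login ok user=jane.doe@example.com ssn=123-45-6789"
table = [{"id": "1", "contact": "joe@example.org", "note": "called twice"}]

print(scan_text(log_line))   # {'email': 1, 'us_ssn': 1}
print(scan_rows(table))      # {'contact': {'email'}}
```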
Data compliance refers to the regulations imposed by government bodies and some industries (e.g., financial and health) that govern how organizations should keep customers’ data safe from data breaches and other damages (improper use, leaks, or destruction). Such regulations apply to any sensitive data type, such as customer PII data, financial records, health and medical information, and any sensitive data belonging to employees, vendors, and third-party contractors.
When we say an organization is “compliant,” we mean it manages sensitive data (collecting, storing, processing, and transmitting it) in line with the relevant regulations.
Data mapping is used in many scenarios to streamline data processing activities and drive value to the business. However, its most important use case is in cybersecurity, where it becomes a vital component of any defense program or compliance effort. For instance, data mapping helps organizations:
- Respond better to security incidents, because all structured and unstructured data is already mapped in a centralized system
- Avoid delays in notifying affected parties when a security breach occurs. Laws such as CCPA and GDPR impose high fines on organizations that fail to notify affected parties within a defined time frame
- Lighten the load on compliance and IT teams during data and system audits
Mapping data lets your organization know where all its sensitive data is stored, and thus plan its cybersecurity controls and governance policies accordingly.
Although data protection regulations do not mention data mapping by name, popular regulations such as CCPA and GDPR encourage organizations to build a data mapping capability that makes it easier to stay compliant with their terms.
General Data Protection Regulation (GDPR)
The European General Data Protection Regulation encourages organizations to develop data mapping strategies to meet compliance requirements.
- GDPR Article 30 requires organizations to keep track of their data processing activities. Data mapping capability will allow organizations to easily track all their customer data locations and provide a comprehensive tracking record of the various processing activities that sensitive data undergo.
- GDPR requires organizations to keep customers’ personal information no longer than necessary for the purpose it was collected for, which in practice means defining retention periods. It also requires them to track the transfer of sensitive data to third parties, whether within the EU or internationally. Data mapping gives organizations the visibility needed to meet these requirements.
- Article 35 of the GDPR requires organizations to conduct a Data Protection Impact Assessment (DPIA) when processing is likely to pose a high risk to individuals. To conform to Article 35, organizations should be aware of the type and sensitivity of the data they are handling and how it is processed, transmitted, and stored. The entire path the data travels during processing, along with any vendor or third-party organizations that have access to it, should be appropriately documented. Data mapping helps organizations achieve this efficiently.
- When a data breach happens, Article 33 of the GDPR requires the affected organization to notify the supervisory authority within 72 hours of becoming aware of it, unless the breach is unlikely to put individuals’ rights and freedoms at risk. Where that risk is high, Article 34 additionally requires informing the affected individuals without undue delay. Data mapping allows organizations to identify any sensitive information exposed in a data breach so they can respond to security incidents without delay and inform affected customers, avoiding non-compliance.
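The record-keeping, retention, and transfer-tracking points above can be sketched as a small register of processing activities. Everything in this example is hypothetical: the `ProcessingRecord` fields, the retention periods, and the sample activities are assumptions chosen for illustration, and the structure is not the format the regulation prescribes.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ProcessingRecord:
    """Hypothetical record of one processing activity, built from the data map."""
    activity: str
    data_categories: list[str]
    recipients: list[str]          # vendors or third parties that receive the data
    transferred_outside_eu: bool
    collected_on: date
    retention_days: int            # retention period set by the organization's own policy

    def retention_expires(self) -> date:
        return self.collected_on + timedelta(days=self.retention_days)


def past_retention(records: list[ProcessingRecord], today: date) -> list[str]:
    """Activities whose data has outlived its retention period."""
    return [r.activity for r in records if r.retention_expires() < today]


def international_transfers(records: list[ProcessingRecord]) -> dict[str, list[str]]:
    """Map each activity that sends data outside the EU to its recipients."""
    return {r.activity: r.recipients for r in records if r.transferred_outside_eu}


# Hypothetical register contents.
records = [
    ProcessingRecord("newsletter", ["email"], ["mail-vendor"], True, date(2021, 1, 1), 365),
    ProcessingRecord("billing", ["name", "iban"], ["bank"], False, date(2023, 6, 1), 3650),
]

print(past_retention(records, today=date(2024, 1, 1)))   # ['newsletter']
print(international_transfers(records))                  # {'newsletter': ['mail-vendor']}
```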
California Consumer Privacy Act (CCPA)
The CCPA applies to businesses that collect or process the personal data of California residents and meet its applicability thresholds. Data mapping is essential to becoming compliant with the CCPA for the following reasons:
- The CCPA has a broad definition of personal information. Any piece of information that uniquely identifies an individual, such as name, phone number, or email address, is considered PII. The CCPA also treats further categories as personal information, such as biometric information, geolocation data, and household information. Data mapping allows organizations to know which of the data they hold counts as PII under the CCPA.
- The CCPA requires organizations collecting personal information of California residents to declare their collection sources. Organizations typically collect customer data from many sources; data mapping reveals these sources, which lets the organization meet the CCPA requirement to declare them.
- Responding to consumers’ requests about their data: The CCPA requires organizations to give customers access to their data upon request (to update or delete it), and requests should be fulfilled within 45 days. As more organizations keep customer data in hybrid environments, consumer information ends up spread across on-premises and cloud systems. With data mapping in place, organizations can find all relevant information about a specific customer in time and avoid falling into non-compliance.
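The consumer-request scenario above lends itself to a short sketch: given a data map that lists the records each source holds, find everything about one consumer and compute the date the request is due. The source names, record layout, and matching on an `email` key are assumptions for illustration; a real environment would query each system through its own interface.

```python
from datetime import date, timedelta

RESPONSE_WINDOW_DAYS = 45  # fulfillment window cited above


def response_deadline(received_on: date) -> date:
    """Date by which the consumer request should be fulfilled."""
    return received_on + timedelta(days=RESPONSE_WINDOW_DAYS)


def locate_consumer_data(data_map: dict[str, list[dict]], email: str) -> dict[str, list[dict]]:
    """Collect every record for one consumer from each mapped source."""
    found = {}
    for source, rows in data_map.items():
        matches = [row for row in rows if row.get("email", "").lower() == email.lower()]
        if matches:
            found[source] = matches
    return found


# Hypothetical data map: source name -> records held there.
data_map = {
    "cloud_crm": [{"email": "Ana@example.com", "plan": "pro"}],
    "onprem_billing": [{"email": "ana@example.com", "invoice": 1042},
                       {"email": "li@example.com", "invoice": 1043}],
    "support_tickets": [{"email": "li@example.com", "ticket": 7}],
}

print(response_deadline(date(2024, 3, 1)))                      # 2024-04-15
print(sorted(locate_consumer_data(data_map, "ana@example.com")))  # ['cloud_crm', 'onprem_billing']
```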
Data Classification Process
Not all information assets have the same value to a business; for instance, some business data can be available to the public; however, revealing customer PII and other sensitive business data can have catastrophic consequences.
To protect your data assets adequately, you need to classify them first. Data classification allows easy information retrieval and helps organizations set the highest security controls to protect the most sensitive data.
The following section outlines a general data classification process:
- Collect Information: The first step in any data classification effort is identifying the data to be classified. Not all data needs to be organized; some is better securely removed or destroyed than classified. For the data that must be classified, we need to:
  - Know where it is located and how many copies of it exist (e.g., a copy stored in the cloud and one in a local data center)
  - Determine its importance to the business. For example, PII and financial or health records are always considered extremely important to protect.
  - Identify who has access to it
- Develop a classification methodology or standard: Now that we know what information needs to be classified, we should develop a framework or methodology to mark and organize this data. Tags and metadata are commonly used in this step to allow automated solutions to classify data automatically.
- Enforce standards: In this step, sensitive data, such as PII, PHI, and financial information, is protected using the appropriate security controls and policies required by internal data protection rules and by government and industry standards.
- Process Classified Data: The actual work begins here, as the identified data is processed according to the data classification framework.
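The tag-driven approach in the steps above might look like the following sketch. The three-level scheme, the tag names, and the control lists are all hypothetical; each organization defines its own. Unknown or missing tags fall back to the middle level here, a choice made for the example so that unreviewed data is never treated as public.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-tier scheme; organizations define their own levels."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


# Tag -> classification level. The tags themselves are assumptions.
TAG_LEVELS = {
    "public": Level.PUBLIC,
    "internal-memo": Level.INTERNAL,
    "pii": Level.RESTRICTED,
    "phi": Level.RESTRICTED,
    "financial": Level.RESTRICTED,
}

# Controls to enforce per level, again purely illustrative.
REQUIRED_CONTROLS = {
    Level.PUBLIC: [],
    Level.INTERNAL: ["access-logging"],
    Level.RESTRICTED: ["access-logging", "encryption", "least-privilege-access"],
}


def classify(tags: set[str]) -> Level:
    """An asset takes the highest level implied by its tags; unknown or no tags mean INTERNAL."""
    return max((TAG_LEVELS.get(t, Level.INTERNAL) for t in tags), default=Level.INTERNAL)


# Hypothetical assets and their metadata tags.
assets = {
    "marketing_site": {"public"},
    "hr_notes": {"internal-memo"},
    "claims_db": {"phi", "financial"},
    "old_export": set(),
}

for name, tags in assets.items():
    level = classify(tags)
    print(name, level.name, REQUIRED_CONTROLS[level])
```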
Streamlining Data Mapping with Automation
Performing data mapping manually is daunting and error-prone. For instance, important information could be missed and never linked to its data subject, which leaves it exposed to unauthorized access and consequently exposes your organization to compliance sanctions.
Automated data mapping tools simplify compliance and make it more efficient. Using an automated solution reduces human error and allows your organization to discover all sensitive data within its IT systems and assign the security controls required to protect it. It also makes it easier to conform to data protection regulations, both current ones and those that will emerge in the future, that require organizations to track the processing activities conducted on sensitive customer information.
There are different data mapping solutions on the market, and they do not all provide the same features. When selecting a data mapping solution, ensure it supports the following key functions:
- Automatic data and user discovery
- Data asset classification and understanding at a granular level, so that each individual access can be evaluated.
- End-to-end tracking of the who, what, and how of data access across all access points and sources:
  - How is the request for data made? From which user-facing web server, through which services (microservices and/or monolithic servers), on which physical and/or virtual servers and networks?
  - What data is requested? A single field or multiple fields in a container (a database table, for example), from one or several data sources (databases, for example), on one or several data source servers (database servers, for example)?
  - What type of data is requested? Is it PII, PCI data, confidential data, or other sensitive data?
  - Who requested the data? The user account logged in to the user-facing web server might not be the account used to retrieve the data from the data source. When a cascade of services sits between the user-facing server and the data, there could be multiple data sources and/or multiple data source users.
  - This information cannot be assembled from a single log file, nor from network traffic alone, since data is often transported in encrypted form. Because many relevant and irrelevant services run on any given system or network, and the data passes through several of them, tracking the data is very hard.
  - Stitching all of this information together forms a real data flow, which is the basis for tracking user data access patterns.
- A user-friendly interface that lets you visualize the data
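To show what stitching a data flow means in the simplest possible case, the sketch below groups access events from several tiers into one flow per request and answers the how, what, and who questions. It rests on a strong assumption: every tier records a shared `request_id`. As noted above, real environments rarely offer this directly, so the identifier stands in for whatever correlation signal a real tool derives. The event fields, account names, and field classifications are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical access events from different tiers, correlated by "request_id".
events = [
    {"request_id": "r1", "tier": "web", "user": "jsmith", "ts": 1},
    {"request_id": "r1", "tier": "orders-service", "user": "svc-orders", "ts": 2},
    {"request_id": "r1", "tier": "database", "user": "db_app", "ts": 3,
     "source": "orders_db.customers", "fields": ["name", "card_number"]},
    {"request_id": "r2", "tier": "web", "user": "akhan", "ts": 4},
    {"request_id": "r2", "tier": "database", "user": "db_app", "ts": 5,
     "source": "catalog_db.products", "fields": ["title"]},
]

FIELD_TYPES = {"name": "PII", "card_number": "PCI"}  # illustrative field classification


def stitch_flows(events: list[dict]) -> dict[str, dict]:
    """Group events per request and summarize who asked, through what, for which data."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["request_id"]].append(event)

    flows = {}
    for request_id, hops in grouped.items():
        hops.sort(key=lambda e: e["ts"])
        fields = [f for hop in hops for f in hop.get("fields", [])]
        flows[request_id] = {
            "end_user": hops[0]["user"],                    # account at the earliest hop
            "path": [hop["tier"] for hop in hops],          # how the request travelled
            "data_source_users": [h["user"] for h in hops if "source" in h],
            "sources": [h["source"] for h in hops if "source" in h],
            "data_types": sorted({FIELD_TYPES.get(f, "unclassified") for f in fields}),
        }
    return flows


for request_id, flow in stitch_flows(events).items():
    print(request_id, flow)
```

In the first flow, the end user `jsmith` never touches the database directly; the data is fetched by the `db_app` account, which is exactly the mismatch between front-end and data-source identities described above.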
In today’s information age, organizations worldwide rely heavily on data to plan their future growth. Modern organizations collect data from a large number of sources as part of their daily work; some of this data is sensitive, such as customers’ and vendors’ personal information, and must be protected according to the data protection regulations enforced by governments and industry bodies.
Data mapping allows organizations to know what data they have, where it is stored, and the path it takes from collection through processing to storage. Tracking large volumes of data manually is tedious and prone to human error. Data mapping is foundational to meeting regulatory compliance, but not all solutions provide comprehensive visibility across your environment. Learn about DruvStar DataVision, a holistic data compliance solution that provides user-centric data mapping regardless of data source and access points.