Oracle Cloud Infrastructure Documentation

About Data Masking

Challenge

The amount of data that organizations collect and manage, including sensitive and personal data, grows every day. Growing security threats make it necessary to limit the exposure of sensitive data. At the same time, data privacy laws and standards such as EU GDPR, PCI-DSS, and HIPAA require you to protect personal data. Live production database environments contain valuable and sensitive data, and you need to protect this data to meet security and compliance requirements. Organizations usually implement multiple security controls in their production environments to ensure that access to sensitive data is tightly controlled.

You typically collect data to improve your products and services, provide a better user experience, and support and grow your business. To make the best use of the collected data, you need to share it with different teams, both internal and external, for use cases such as development, testing, training, and data analytics. Copying production data for non-production purposes proliferates sensitive data, expands the security and compliance boundary, and increases the likelihood of data breaches. If the data is left unprotected, contractors or offshore workers might access it and possibly move it across locations. Data privacy standards such as PCI-DSS and EU GDPR also emphasize protecting sensitive information in non-production environments because these environments are typically not as protected or monitored as production systems.

The challenge is to reduce the unnecessary spread and exposure of sensitive data while maintaining its usability for non-production purposes.

Solution

Even in non-production environments, you need to protect your sensitive data and stay compliant with data privacy regulations. The recommended solution is to mask your sensitive data before using it in non-production environments. This way, you minimize the amount of sensitive data you hold and thus reduce both the risk and the compliance boundary.

Data Masking

Data masking, also known as static data masking, is the process of permanently replacing sensitive data with fictitious yet realistic-looking data. It helps you generate realistic and fully functional data, with characteristics similar to those of the original data, to replace sensitive or confidential information. Data masking limits sensitive data proliferation by anonymizing sensitive data while enabling you to use production-like data. It ensures that malicious actors cannot benefit from the fictitious data even if they gain access to it.

Data masking is ideal for virtually any situation in which confidential or regulated data needs to be shared with non-production users. These users may include internal users, such as application developers, or external business partners, such as offshore testing companies, suppliers, and customers. Data masking contrasts with encryption, which only hides data: with the appropriate access or key, the original data can be retrieved. With data masking, the original sensitive data cannot be retrieved or accessed.
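The contrast between encryption and masking can be illustrated with a minimal Python sketch. It is not how Oracle Data Safe implements either operation. The `xor_cipher` function is a toy, insecure stand-in for a real encryption algorithm, and `mask_name` is a hypothetical masking function; both names are assumptions made for this illustration.

```python
import secrets
import string


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy reversible transform standing in for real encryption (NOT secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def mask_name(original: str) -> str:
    """Replace a value with random letters of the same length.

    No key or lookup table is kept, so the original cannot be recovered
    from the masked value.
    """
    letters = "".join(secrets.choice(string.ascii_lowercase) for _ in original)
    return letters.capitalize()


secret = "Alice Johnson"
key = secrets.token_bytes(16)

# Encryption: anyone holding the key can get the original value back.
encrypted = xor_cipher(secret.encode(), key)
recovered = xor_cipher(encrypted, key).decode()

# Masking: the replacement carries no information about the original.
masked = mask_name(secret)

print(recovered == secret)  # the encrypted value is reversible
print(masked)               # a random, same-length replacement
```

The key point is that the masked copy can be handed to non-production users without any secret that would need to be protected afterward.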

Common Data Masking Requirements

Organizations typically mask data using custom scripts or solutions. While these in-house solutions might work for a few columns, they do not work for large applications with distributed databases and thousands of columns. An enterprise data masking solution should be able to fulfill the following data masking requirements:

  • Locate sensitive data in the midst of numerous applications, databases, and environments.
  • Correctly mask sensitive data having different shapes and forms such as names, Social Security numbers, email addresses, credit card numbers (Mastercard, Visa, and so on), and blood type.
  • Ensure that the masked data is irreversible, that is, one should not be able to retrieve the original data from the masked data.
  • Ensure that the masked data is realistic enough to be useful for non-production purposes such as development and analytics.
  • Ensure that the applications continue to work with the masked data.
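As an illustration of how the last three requirements interact, the following Python sketch masks a credit card number so that the result is random (irreversible), keeps the original length and leading digit (realistic), and still passes the Luhn check that many applications use to validate card numbers (application compatibility). The function names are hypothetical, and this is a sketch under those assumptions, not the masking format that Oracle Data Safe uses.

```python
import secrets


def luhn_check_digit(payload: str) -> str:
    """Return the digit that makes payload + digit pass the Luhn check."""
    total = 0
    # Walk the payload right to left; every other digit, starting with the
    # rightmost one, is doubled (subtracting 9 if the result exceeds 9).
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(number: str) -> bool:
    """Check a digits-only card number against its final check digit."""
    return luhn_check_digit(number[:-1]) == number[-1]


def mask_card_number(original: str) -> str:
    """Keep the first digit and the length; randomize everything else."""
    digits = [c for c in original if c.isdigit()]
    body = digits[0] + "".join(str(secrets.randbelow(10)) for _ in digits[1:-1])
    return body + luhn_check_digit(body)


masked_card = mask_card_number("4111 1111 1111 1111")
print(masked_card, is_luhn_valid(masked_card))
```

Because the random digits are drawn independently of the original number, the masked value cannot be used to reconstruct it, yet validation logic in the application continues to accept it.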

Data Masking in Oracle Data Safe

The Data Masking component of Oracle Data Safe addresses the common data masking requirements and more. It simplifies the process of masking data in your non-production databases by providing an automated, flexible, and easy-to-use solution. It enables you to:

  • Maximize the business value of your data without exposing the sensitive data
  • Minimize the compliance boundary by not proliferating the sensitive production data
  • Mask your Oracle databases hosted on Oracle Cloud
  • Use various masking techniques to meet your specific business requirements
  • Preserve data integrity, ensuring that the masked data continues to work with applications

To mask sensitive data, you need to understand what sensitive data you have and where it is located. Data Discovery helps you automatically discover sensitive data and referential (parent-child) relationships, and creates a sensitive data model containing all the necessary information. The Data Masking wizard enables you to use a sensitive data model to create a masking policy defining how the data should be masked. A masking policy associates sensitive columns with masking formats, which define the logic to mask the associated sensitive column. Masking policies are used to mask data on your target database. Data masking ensures referential integrity by masking related columns consistently.
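The idea of masking related columns consistently can be sketched in Python as follows. The tables, column names, and the `build_consistent_mapping` helper are hypothetical and exist only for this illustration; the sketch does not reflect the internal implementation of Oracle Data Safe. Each distinct parent key receives one random replacement, and the same replacement is applied to the child table so that the parent-child relationship still holds after masking.

```python
import secrets


def build_consistent_mapping(values):
    """Assign each distinct original value one unique random replacement."""
    mapping = {}
    used = set()
    for value in values:
        if value in mapping:
            continue
        candidate = 100000 + secrets.randbelow(900000)
        while candidate in used:  # keep replacements unique (primary key)
            candidate = 100000 + secrets.randbelow(900000)
        mapping[value] = candidate
        used.add(candidate)
    return mapping


# Hypothetical parent table (customers) and child table (orders).
customers = [
    {"customer_id": 101, "name": "Ann"},
    {"customer_id": 102, "name": "Bob"},
]
orders = [
    {"order_id": 1, "customer_id": 101},
    {"order_id": 2, "customer_id": 102},
    {"order_id": 3, "customer_id": 101},
]

mapping = build_consistent_mapping(c["customer_id"] for c in customers)

# Apply the same mapping to the parent column and the related child column.
masked_customers = [{**c, "customer_id": mapping[c["customer_id"]]} for c in customers]
masked_orders = [{**o, "customer_id": mapping[o["customer_id"]]} for o in orders]

# In a real masking run the mapping would be discarded at this point so that
# the original keys cannot be recovered from the masked data.
print(masked_customers)
print(masked_orders)
```

In this sketch, orders 1 and 3 still point to the same (masked) customer, so joins between the two tables return the same shape of result as before masking.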

Oracle Data Safe provides a comprehensive set of masking formats to help you mask common sensitive and personal data such as names, national identifiers, credit card numbers, phone numbers, and religion. You also have masking options such as shuffling, encryption, and replacing with random numbers, strings, and dates. Oracle Data Safe lets you easily create new masking formats without requiring any technical skills. You can store these user-defined masking formats in the Oracle Data Safe Library for future use.
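Two of the generic options mentioned above, shuffling and replacing with random dates, can be sketched in Python as follows. The function names and sample values are hypothetical, and the sketch only illustrates the general techniques; the actual masking formats in Oracle Data Safe may behave differently.

```python
import random
from datetime import date, timedelta

# SystemRandom draws from the operating system's entropy source, so the
# output is not reproducible from a seed.
rng = random.SystemRandom()


def shuffle_column(values):
    """Return the same set of values in a random order.

    Shuffling keeps the column's overall distribution but breaks the link
    between each value and its original row.
    """
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def random_date(start: date, end: date) -> date:
    """Return a random date between start and end, inclusive."""
    span_days = (end - start).days
    return start + timedelta(days=rng.randint(0, span_days))


salaries = [52000, 61000, 48000, 75000, 66000]
masked_salaries = shuffle_column(salaries)
masked_birth_date = random_date(date(1960, 1, 1), date(2000, 12, 31))

print(masked_salaries)
print(masked_birth_date)
```

Shuffling is useful when aggregate statistics must stay the same, whereas random replacement is a better fit when the individual values themselves are sensitive.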

Similarly, you can create masking policies and store them in the Oracle Data Safe Library. You can use an existing masking policy to mask different target databases. You can also download a masking policy as an XML file, edit it, and upload it to the same or a different Oracle Data Safe Library.

Data Masking generates a masking report that summarizes what was masked in the database. For example, the report tells you the names of the sensitive columns masked, the masking formats used, and the total number of tables, columns, and values masked.
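The kind of totals such a report contains can be sketched in Python as follows. The record layout, the format names, and the counts are invented sample data for this illustration and do not represent the actual report structure or output of Oracle Data Safe.

```python
# Hypothetical per-column results of a masking run (sample data only).
masked_columns = [
    {"table": "CUSTOMERS", "column": "NAME", "format": "random_name", "values_masked": 1200},
    {"table": "CUSTOMERS", "column": "EMAIL", "format": "random_email", "values_masked": 1200},
    {"table": "PAYMENTS", "column": "CARD_NUMBER", "format": "credit_card", "values_masked": 3400},
]


def summarize(entries):
    """Aggregate per-column results into report-style totals."""
    return {
        "tables": len({e["table"] for e in entries}),
        "columns": len(entries),
        "values": sum(e["values_masked"] for e in entries),
        "formats": sorted({e["format"] for e in entries}),
    }


summary = summarize(masked_columns)
print(summary)
```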