
PII - How to discover PII in a version of your Project in TDM

Discussion created by dovle01 Employee on Aug 1, 2016

Introduction/Summary:

CA Test Data Manager (TDM) includes a data sampling function that lets you profile a data version within a project. Once you have sampled your data, you use Datamaker to filter the sample and determine which tables and columns potentially contain PII (Personally Identifiable Information). With judicious use of filters, you can quickly narrow in on the PII areas in your data. The out-of-the-box (OOTB) filters give you a starting point that you can refine and extend to fit your business needs.

 

Background: 

Suppose you are given one or more tables of data and asked to mask them (or obfuscate, encode, or whatever term your organization uses). That is all. You have no further information other than that it is live data and cannot be used for testing. One table with 10-15 columns is no problem. But what about many tables with hundreds of columns?

You need a mechanism that gets you as close as possible to grouping the potential PII columns by type. Datamaker provides that mechanism through profiling with filters. You first have to import the data into a project within Datamaker and assign it to a version. That version can then be profiled, and tags are applied based on the conditions in the filters. From those tags, the profiler groups your columns by PII type.
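To make the idea of filter-based profiling concrete, here is a minimal Python sketch. It is not Datamaker's implementation: the filter names, the regular expressions, the `profile_columns` function, and the 0.8 match threshold are all assumptions chosen for illustration.

```python
import re

# Hypothetical filters: tag name -> regular expression applied to each sampled value.
# These patterns are illustrative assumptions, not the filters shipped with TDM.
FILTERS = {
    "EMAIL": re.compile(r"[^@\s]+@[^@\s]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "DATE": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def profile_columns(sample, threshold=0.8):
    """Tag each column whose non-empty sampled values mostly match a filter.

    `sample` maps a column name to a list of sampled string values.
    Returns a dict mapping each tag to the sorted list of matching columns.
    """
    tagged = {tag: [] for tag in FILTERS}
    for column, values in sample.items():
        non_empty = [v for v in values if v]
        if not non_empty:
            continue  # nothing to profile in this column
        for tag, pattern in FILTERS.items():
            hits = sum(1 for v in non_empty if pattern.fullmatch(v))
            if hits / len(non_empty) >= threshold:
                tagged[tag].append(column)
    return {tag: sorted(cols) for tag, cols in tagged.items()}


# Made-up sample data standing in for a sampled project version.
sample = {
    "CONTACT": ["ann@example.com", "bob@example.org", ""],
    "TAX_ID": ["123-45-6789", "987-65-4321"],
    "NOTES": ["call back", "n/a"],
}
groups = profile_columns(sample)
```

In this sketch, `groups` collects the column names under each tag, which mirrors how tagged columns end up grouped by PII type after profiling.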

Please keep in mind that the filters are generic and broad in order to capture as many potential PII columns as possible. As a result, some columns that are not really PII may be tagged simply because the filter cast a wide net. At this point, you can use visual inspection to separate, for example, the DOB columns from the Billing Date columns that may be returned.
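The wide-net effect can be illustrated with a short Python sketch. This is an assumption-laden example, not how Datamaker evaluates filters: the date pattern, the name hints, and the functions `date_columns` and `likely_dob` are hypothetical. A value-based date filter tags both a DOB column and a billing date column, and a second, narrower pass on the column name trims the list.

```python
import re

# Broad value filter: anything that looks like an ISO date (assumed format).
DATE_VALUE = re.compile(r"\d{4}-\d{2}-\d{2}")

# Hypothetical name hints used for a second, narrower pass.
DOB_NAME_HINT = re.compile(r"dob|birth", re.IGNORECASE)


def date_columns(sample):
    """Return the sorted columns whose sampled values all look like dates."""
    return sorted(
        column
        for column, values in sample.items()
        if values and all(DATE_VALUE.fullmatch(v) for v in values)
    )


def likely_dob(columns):
    """Keep only the columns whose name hints at a date of birth."""
    return [c for c in columns if DOB_NAME_HINT.search(c)]


# Made-up sample data.
sample = {
    "CUST_DOB": ["1980-02-14", "1975-11-30"],
    "BILLING_DATE": ["2016-07-01", "2016-08-01"],
    "CITY": ["Plano", "Islandia"],
}
wide = date_columns(sample)   # the broad filter catches both date columns
narrow = likely_dob(wide)     # the name hint keeps only the DOB column
```

A name hint like this only helps when columns are named descriptively. A column called `D1` that holds birth dates would be missed by the second pass, which is why visual inspection of the broad result remains useful.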

Perhaps you know of a special set of PII columns but do not know where they are located. Using our filters as a template, you can create your own and locate those unique PII fields. Then, once you have profiled the tables and tagged the data, you can use the transformation map to quickly locate those columns and the PII data.
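As a rough illustration of deriving a custom filter from an existing one and using it to locate columns across tables, consider the Python sketch below. Everything in it is hypothetical: the `Filter` class, the template pattern, the `MEMBER_ID` format (`LY-` followed by eight digits), and the `locate` function are assumptions for illustration and do not reflect Datamaker's filter format.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Filter:
    """Hypothetical filter definition: a tag plus a value pattern."""
    tag: str
    value_pattern: str
    min_match_ratio: float = 0.8


# Stand-in for an existing filter used as a template.
TEMPLATE = Filter(tag="ACCOUNT_NUMBER", value_pattern=r"\d{10}")

# Custom filter derived from the template for an organization-specific ID.
MEMBER_ID = replace(TEMPLATE, tag="MEMBER_ID", value_pattern=r"LY-\d{8}")


def locate(filter_, tables):
    """Return sorted (table, column) pairs whose sampled values match the filter."""
    pattern = re.compile(filter_.value_pattern)
    found = []
    for table, columns in tables.items():
        for column, values in columns.items():
            if not values:
                continue  # skip columns with no sampled values
            hits = sum(1 for v in values if pattern.fullmatch(v))
            if hits / len(values) >= filter_.min_match_ratio:
                found.append((table, column))
    return sorted(found)


# Made-up sampled data from two tables.
tables = {
    "CUSTOMER": {"REF": ["LY-00012345", "LY-00098765"], "NAME": ["Ann", "Bob"]},
    "ORDERS": {"MEMBER": ["LY-00012345", "LY-00012345"], "TOTAL": ["19.99", "5.00"]},
}
locations = locate(MEMBER_ID, tables)
```

Here `locations` lists every table and column pair where the custom pattern matched, even though the columns (`REF`, `MEMBER`) have unrelated names.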

 

Environment: 

TDM 3.5 and above

 

Instructions:

Please refer to the video that covers this topic - PII - How to discover PII in a version of your Project

(https://catechnologies.webex.com/catechnologies/ldr.php?RCID=84db2883d312bdf05511bd660cafee23)

 

Additional Information:

Documentation URLs as seen in the WebEx

https://docops.ca.com/ca-test-data-manager/3-5/en/provisioning-test-data/discover-personally-identifiable-information

https://docops.ca.com/ca-test-data-manager/3-5/en/reference/filter-options-for-transformation-maps

https://www.google.com/?gws_rd=ssl#q=regular+expression
