
SEED Data while using Fast Data Masker

Question asked by mmvohal on Sep 9, 2016
Latest reply on Sep 13, 2016 by gilta03

We are using Fast Data Masker to mask data, and we would like the masked city and postal code values to stay consistent with the original country and state values. The seed data comes from a database seed table (scramble_gtsrc_reference_data) used with the Hashlov1 function, not from a seed file.

 

Issue 1:  Seed values not available in the seed table:

 

We saw that when no seed values are found in the seed table for a particular search (restrict column) value, the tool updates the DE_IDENT_IND column to P but performs no masking. Before the actual masking starts, these messages appear in the log:

 

>Column RD_REF_VALUE should contain values for RESTRICT_COLUMN

>No seed data found for value 'AE Abu Zaby (Abu Dhabi)' and category Global Country State

 

How do we insulate ourselves from this kind of missing seed data? How do clients typically handle this scenario?

Does the tool expect us to make every probable seed value available? Is there a mechanism to fall back to a default value, or to do something else, for records (existing or future) whose seed values may not be available?
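
For context, the only workaround we can think of is to detect the gaps ourselves before each masking run. Below is a minimal sketch of that pre-check, not anything provided by Fast Data Masker. It assumes a hypothetical source table customer_address with a restrict column country_state, and it assumes (based on the log message above) that rd_ref_value holds the restrict-column value while rd_ref_id holds the data category. It uses an in-memory SQLite database purely so the query can be run as written; the same NOT EXISTS anti-join should carry over to Oracle.

```python
import sqlite3

# Anti-join: restrict values that occur in the source table but have no
# matching row in the seed table for the given data category.
# customer_address and country_state are hypothetical names.
MISSING_SEED_SQL = """
SELECT DISTINCT src.country_state
FROM customer_address src
WHERE NOT EXISTS (
    SELECT 1
    FROM scramble_gtsrc_reference_data seed
    WHERE seed.rd_ref_id = ?
      AND seed.rd_ref_value = src.country_state
)
ORDER BY src.country_state
"""


def find_missing_seed_values(conn, category):
    """Return source restrict values with no seed row for the category."""
    return [row[0] for row in conn.execute(MISSING_SEED_SQL, (category,))]


def build_demo_db():
    """Create a tiny in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE scramble_gtsrc_reference_data ("
        " rd_ref_id TEXT NOT NULL, rd_ref_value TEXT NOT NULL,"
        " rd_ref_value2 TEXT, rd_ref_value3 TEXT)"
    )
    conn.execute(
        "CREATE TABLE customer_address ("
        " country_state TEXT, city TEXT, postal_code TEXT)"
    )
    # Seed data exists only for one country/state combination.
    conn.execute(
        "INSERT INTO scramble_gtsrc_reference_data VALUES (?, ?, ?, ?)",
        ("Global Country State", "US New York", "Albany", "12207"),
    )
    # The source table contains one covered and one uncovered value.
    conn.executemany(
        "INSERT INTO customer_address VALUES (?, ?, ?)",
        [
            ("US New York", "Buffalo", "14201"),
            ("AE Abu Zaby (Abu Dhabi)", "Abu Dhabi", None),
        ],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for value in find_missing_seed_values(demo, "Global Country State"):
        print("No seed data for:", value)
```

Running a check like this would at least let us add seed rows, or exclude the affected records, before the job starts. It does not answer whether the tool itself can fall back to a default.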

 

Issue 2: Use of RD_INDEX_ID:

 

We were told:

rd_index col is used as an index for a given data category. It has been introduced in the recent versions of TDM. Given that you are eventually going to upgrade to latest TDM version in future, I’d recommend keeping it on. So, your table def would look like this :

 

CREATE TABLE gtsrc_reference_data (
        rd_ref_id                        varchar2 (254)   NOT NULL,
        rd_ref_value                     varchar2 (254)   NOT NULL,
        rd_old_value                     varchar2 (254)   ,
        rd_ref_value2                    varchar2 (254)   ,
        rd_ref_value3                    varchar2 (254)   ,
        rd_ref_value4                    varchar2 (254)   ,
        rd_ref_value5                    varchar2 (254)   ,
        rd_index                         number (10, 0)    );

CREATE INDEX gtsrc_reference_data_x1 ON gtsrc_reference_data  (
        rd_ref_id                        ,
        rd_ref_value                      );

 

What is the benefit of the RD_INDEX_ID column in gtsrc_reference_data or scramble_gtsrc_reference_data? We already have an index created on the rd_ref_id column, which acts as our Data Category. Also, we have a single seed table containing seed data for multiple countries with the corresponding state, city, and postal_code values. How does the rd_index_id column work in this scenario, where we have only one Data Category (Global Country State) across the whole table?

 

If we just add the column without populating it, the seed load fails with the following error:

>loading seed data Global Country State at 2016.09.08 15:50:23.964 EDT

>Null values found for column rd_index in seed table for category Global Country State
