Create a masking job
Terminology
Metadata Package: A collection of metadata components that are supported by The Plugin.
Managed Package: These are packages owned and managed by Salesforce or Salesforce partners.
Unmanaged Package: These are packages that are created as part of customizations done by end users.
The Plugin: The Masking Plugin for Salesforce Package Management.
Task: A task is a step in The Plugin execution.
Creating a masking job
To create a masking job, select the New Job button at the top of the ‘Jobs’ page.
In the Connector section, select a connector to use as the source (and sometimes also the target) of a masking job.
The Synchronize schema option is recommended if the schema of the source has changed since the last time it was synchronized. This process can take several minutes, and the next step of the masking job cannot be triggered until it finishes.
In the Ruleset section, choose any available Ruleset within Delphix Compliance Services, or start with an empty one by selecting Define custom rules.
Selecting Define custom rules makes the Customize Rules checkbox mandatory. With an existing Ruleset, the Customize Rules checkbox is optional and can be used to modify the ruleset for this specific job.
In the Customize Rules section, the ruleset selected in the previous section can be tuned specifically for the job being created; these changes are not saved in the ruleset itself. If a given column or field from the database should not be masked, search for it and set its algorithm to None.
In the Configuration section, masking table filters can be applied with provided Salesforce Object Query Language (SOQL) WHERE clauses for any table in the job.
In the Salesforce Metadata section, select exactly which metadata components to disable as part of the masking job, and which tables to disable them on.
The following metadata components can be disabled:
Triggers
Workflows
Process Builders
Validation Rules
Feeds
Field History
DuplicateRules
Picklist
GlobalPicklist
Unique
MatchingRule
Required
LookupFilter
In the Table Optimization (Beta) section, select whether to use table optimizations.
This feature is useful if the tables to be masked are very large (on the order of 10 million rows or more).
Opting into this feature will allow Delphix to use some fine-tuned performance controls, which will improve the overall query operation time for a very large table.
In the Batch Execution section, select whether to run bulk updates in serial or parallel mode.
Parallel: By default, DCS reads, masks, and submits Salesforce asynchronous batch jobs serially. To achieve parallelization, DCS submits subsequent batches before previously submitted batches are complete or even started (queued). Running batch jobs in parallel may cause row lock issues, where multiple processes attempt to acquire a single lock. For example, the field Full_name depends on the fields First_name and Last_name. If there is a batch job updating the First_name or Last_name of a row, the Full_name field will be locked down. If another batch job wants to update the Full_name field of the same row, a row lock issue may occur at this point.
Serial mode: The serialize_updates attribute of the create job request payload is set to true. The parameter is then added to the JDBC URL connection parameters used by CDATA’s JDBC Batch APIs.
In the Summary section, review the details of the masking job configuration, then click Save.
Delphix recommends against leaving triggers enabled, as this could cause the masking job to run slowly or prevent it from running at all.
Job table
Inside the drawer of each compliance job, the disabled metadata components, table optimizations, and masking filters applied are shown.
Plugin operation stages
Setup
In the setup stage, the Package Manager prepares the data necessary to perform its work. Specifically, the Package Manager performs the following:
Retrieve table names from the ruleset of the current masking job.
Perform FlowDefinition, ApexTrigger, and CustomField processing for each table name.
Pre-task
In the pre-task stage, the Package Manager disables the metadata packages on the tables that are being masked by a masking job. This stage grooms the table(s) for data masking. Specifically, the Package Manager performs the following:
Perform Metadata Retrieval.
Perform Metadata Deploy without any modification, to check whether the metadata configuration retrieved from the Salesforce instance has any issues.
Perform XML modification to disable packages and then perform Metadata Deploy.
File Type | Parent Node | Node | Value |
---|---|---|---|
Triggers | ApexTrigger | status | InActive |
Workflows | rules | active | false |
Validation Rules | validationRules | active | false |
Feed History | fields | trackFeedHistory | false |
Field History | fields | trackHistory | false |
Process Builders | FlowDefinition | activeVersionNumber | 0 |
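The XML modification step can be sketched as follows. This is a minimal illustration using Python's standard xml.etree.ElementTree, not the plugin's actual implementation: the function name disable_components and the sample XML are assumptions, while the parent nodes, nodes, and values are copied verbatim from the table above.

```python
import xml.etree.ElementTree as ET

# Namespace used by Salesforce metadata XML files.
NS = "http://soap.sforce.com/2006/04/metadata"
# Keep the default namespace unprefixed when serializing.
ET.register_namespace("", NS)

# (parent node, node, disabled value), copied verbatim from the table above.
DISABLE_RULES = [
    ("ApexTrigger", "status", "InActive"),
    ("rules", "active", "false"),
    ("validationRules", "active", "false"),
    ("fields", "trackFeedHistory", "false"),
    ("fields", "trackHistory", "false"),
    ("FlowDefinition", "activeVersionNumber", "0"),
]


def disable_components(xml_text):
    """Return (modified XML, number of nodes changed) for one metadata file."""
    root = ET.fromstring(xml_text)
    changed = 0
    for parent, node, value in DISABLE_RULES:
        # iter() also yields the root itself, which covers root-level
        # parents such as ApexTrigger and FlowDefinition.
        for parent_el in root.iter(f"{{{NS}}}{parent}"):
            child = parent_el.find(f"{{{NS}}}{node}")
            if child is not None and child.text != value:
                child.text = value
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical object file with one active validation rule.
sample = (
    f'<CustomObject xmlns="{NS}">'
    "<validationRules><fullName>Require_Email</fullName>"
    "<active>true</active></validationRules>"
    "</CustomObject>"
)
modified_xml, changed_count = disable_components(sample)
print(changed_count, modified_xml)
```

A real implementation would also need to remember the original values so that the post-task stage can restore them; that bookkeeping is omitted here.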
Post-task
In the post-task stage, the Package Manager enables the metadata packages that were disabled in the pre-task stage. Specifically, the Package Manager performs the following:
Perform Metadata Deploy to enable Metadata packages that were disabled in Pre-Task.
Perform cleanup of the base working directory.
Package manager cleanup
Some Salesforce sandboxes, usually after masking jobs are executed, might have duplicate ListViews that cause a job to fail when triggers are disabled. A manual cleanup step must therefore be run after masking to prevent future errors. Steps below:
Open the Developer Console (under Setup) and run the following in the Query Editor:
SELECT min(id), max(name), DeveloperName, SobjectType FROM ListView GROUP BY DeveloperName,SobjectType HAVING COUNT(DeveloperName) > 1
The results list all duplicates that need to be deleted (rows where DeveloperName is All or Ideas_Last_7_Days do not count).
Each remaining row identifies a list view that needs to be deleted. SobjectType shows which table the list view must be deleted from.
Tasks need to be searched in lightning mode. Events are found on the Calendar app.
If a duplicate shows up but its correct name is unclear (i.e., it does not look like a duplicate), run the following query, replacing <developer name> with the DeveloperName of the list view being deduplicated:
SELECT id, name, DeveloperName, SobjectType FROM ListView WHERE DeveloperName='<developer name>'
This should return at least two rows showing what the visible 'name' actually is. Make sure that all duplicates are removed before running a new masking job (particularly if disabling triggers).
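If the ListView rows are exported rather than inspected in the Query Editor, the same duplicate check can be reproduced offline. The sketch below is an illustration only: it assumes the rows are available as dictionaries with the DeveloperName and SobjectType fields used in the queries above, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict

# Developer names that the text above says do not count as duplicates.
IGNORED_DEVELOPER_NAMES = {"All", "Ideas_Last_7_Days"}


def find_duplicate_list_views(rows):
    """Group rows by (DeveloperName, SobjectType); keep groups with 2+ rows."""
    groups = defaultdict(list)
    for row in rows:
        if row["DeveloperName"] in IGNORED_DEVELOPER_NAMES:
            continue
        groups[(row["DeveloperName"], row["SobjectType"])].append(row)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical rows, shaped like the result of the second query above.
rows = [
    {"Id": "A1", "Name": "My Accounts", "DeveloperName": "MyAccounts", "SobjectType": "Account"},
    {"Id": "A2", "Name": "My Accounts", "DeveloperName": "MyAccounts", "SobjectType": "Account"},
    {"Id": "A3", "Name": "All", "DeveloperName": "All", "SobjectType": "Account"},
    {"Id": "A4", "Name": "All", "DeveloperName": "All", "SobjectType": "Account"},
]
for (developer_name, sobject_type), members in find_duplicate_list_views(rows).items():
    print(developer_name, sobject_type, [m["Id"] for m in members])
```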
Disable rehearsal tool for all tables in the schema
DCS disables some automations while a masking job runs, then re-enables them when the job completes. Triggers and validation rules on the sandbox are among the automations that are disabled by default during masking.
An option is available where users can choose to disable triggers and validation rules on all tables in the sandbox, not only the masked tables. Users can tell DCS not to disable triggers and validation rules, if desired, by selecting the "Do not disable triggers and validations" option.
When users select either the "Disable triggers and validations on all tables" or the "Disable triggers and validations only on masked tables" option, the run_triggers attribute of the create job payload is set to true, and the target_trigger attribute is set to the selected value.
If users select "Do not disable triggers and validations", the run_triggers attribute is set to false.
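The mapping from the selected option to the payload attributes can be sketched like this. The option labels and the run_triggers and target_trigger attribute names come from the text above; the concrete target_trigger values ("all_tables", "masked_tables") and the function name are assumptions made for illustration, since the actual values are not documented here.

```python
# Option labels quoted in the text above.
DISABLE_ALL = "Disable triggers and validations on all tables"
DISABLE_MASKED = "Disable triggers and validations only on masked tables"
DO_NOT_DISABLE = "Do not disable triggers and validations"

# Hypothetical target_trigger values; the real ones may differ.
TARGET_TRIGGER_VALUES = {
    DISABLE_ALL: "all_tables",
    DISABLE_MASKED: "masked_tables",
}


def trigger_attributes(option):
    """Return the trigger-related attributes of the create job payload."""
    if option == DO_NOT_DISABLE:
        return {"run_triggers": False}
    if option in TARGET_TRIGGER_VALUES:
        return {
            "run_triggers": True,
            "target_trigger": TARGET_TRIGGER_VALUES[option],
        }
    raise ValueError(f"Unknown option: {option!r}")


print(trigger_attributes(DISABLE_MASKED))
print(trigger_attributes(DO_NOT_DISABLE))
```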
Run a masking job in serialized mode
When running a masking job, DCS updates Salesforce records using Salesforce Bulk API V2, which is abstracted by CDATA's JDBC Batch APIs. Salesforce bulk jobs are leveraged to achieve parallelization in future DCS updates.
Bulk job and batch job
Many Salesforce batch jobs belong to one bulk job. A bulk job has a max size of 100,000 records and a batch job has a max size of 10,000 records. Within a batch job, Salesforce processes 200 records at a time (in a chunk), which also corresponds to a transaction.
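As a worked example of these limits, the sketch below estimates how many bulk jobs, batch jobs, and chunks a given number of records needs. It assumes every unit is filled to its maximum size, so the results are lower bounds rather than what DCS will actually submit; the function and constant names are illustrative.

```python
# Limits stated in the paragraph above.
BULK_JOB_MAX_RECORDS = 100_000
BATCH_JOB_MAX_RECORDS = 10_000
CHUNK_SIZE = 200


def ceil_div(numerator, denominator):
    """Integer ceiling division without floating point."""
    return -(-numerator // denominator)


def estimate_units(record_count):
    """Minimum bulk jobs, batch jobs, and chunks for record_count records."""
    return {
        "bulk_jobs": ceil_div(record_count, BULK_JOB_MAX_RECORDS),
        "batch_jobs": ceil_div(record_count, BATCH_JOB_MAX_RECORDS),
        "chunks": ceil_div(record_count, CHUNK_SIZE),
    }


# 250,000 records need at least 3 bulk jobs, 25 batch jobs, and 1,250 chunks.
print(estimate_units(250_000))
```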
API
When the create-job endpoint is called, the user can specify the metadata components to disable in the request payload, which will be included in the generated configuration file as it is passed to the masking container.
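A request payload of this kind might be assembled as shown below. This is a sketch under stated assumptions, not the documented API schema: serialize_updates is the attribute named earlier on this page, but the key disabled_metadata_components and the function name are hypothetical, and the endpoint URL and authentication are deliberately left out. The component names are taken from the list in the Salesforce Metadata step.

```python
import json

# Component names from the Salesforce Metadata list on this page.
KNOWN_COMPONENTS = {
    "Triggers", "Workflows", "Process Builders", "Validation Rules",
    "Feeds", "Field History", "DuplicateRules", "Picklist",
    "GlobalPicklist", "Unique", "MatchingRule", "Required", "LookupFilter",
}


def build_create_job_payload(disabled_components, serialize_updates=False):
    """Build a create-job payload fragment as a plain dictionary."""
    unknown = [c for c in disabled_components if c not in KNOWN_COMPONENTS]
    if unknown:
        raise ValueError(f"Unknown metadata components: {unknown}")
    return {
        # Hypothetical key name; the real field name is not given here.
        "disabled_metadata_components": list(dict.fromkeys(disabled_components)),
        # Attribute named on this page; true selects serial mode.
        "serialize_updates": serialize_updates,
    }


payload = build_create_job_payload(
    ["Triggers", "Validation Rules"], serialize_updates=True
)
print(json.dumps(payload, indent=2))
```

Validating the component names before sending the request keeps typos from reaching the masking container's configuration file.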