Character Replacement
The Character Replacement framework allows the creation of rules to replace specific characters in a string with other characters. It was introduced in version 19.0.0.0.
The framework must wrap another string algorithm, which defaults to dlpx-core:CM Alpha-Numeric. Character replacement is often used as part of a chained string algorithm, either before or after masking.
It can be used to remove accented characters (é, ñ, etc.) and to ensure that the output of the algorithm differs from the input. There is no UI or default instance for the Character Replacement framework at this time. Character replacements are specified in a list of JSON rules. There is no enforced limit on the number of rules that can be created.
You must provide a string algorithm (via the stringAlgorithm key in the config) to create an instance of this framework.
Example
This example uses a Character Replacement algorithm with the following configuration:
stringAlgorithm: dlpx-core:CM Alpha-Numeric
replacementRules:
outputFilteredCharacters: ["a", "e", "i", "o"]
outputReplacementCharacter: "$"
filteredCharacters: ["a", "e", "i", "o", "u", "y"]
replacementCharacter: "#"
requireInputChange: false
filterAccents: BOTH
This configuration produces masked results as follows:
“Andérsoñ” => “Xpb#sh#$”
“Emma Thompson” => “C#x# U$#m$x##”
“Sophia Jones sundar clarke” => “V#g### H#j#g q#$j#$ mg#xh#”
The above algorithm configuration defines a Character Replacement algorithm and rules for replacing characters in strings.
The stringAlgorithm key is required; the user must pass the “algorithm-uuid” of a valid string algorithm present in DCS. The Character Replacement algorithm operates on strings by replacing occurrences of certain characters with specified replacements.
In this configuration, there are two replacement rules. The first states that whenever the output string contains the characters 'a', 'e', 'i', or 'o', they are replaced with the '$' symbol after the string algorithm has masked the value. The second states that any occurrence of 'a', 'e', 'i', 'o', 'u', or 'y' in either the input or the output is replaced with the '#' symbol.
outputReplacementCharacter takes priority over replacementCharacter, because replacementCharacter applies to both the input and the output, while outputReplacementCharacter applies only to the output.
The configuration sets requireInputChange to false, so the masked value is not required to differ from the input, which means values before and after masking could be identical. It also sets filterAccents to BOTH, so accent filtering is applied to both input and output strings (“é” => “e”, “ñ” => “n”).
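The rule behaviour described above can be sketched in Python. This is an illustration only, not the product's implementation: the flat CONFIG dict, the helper names, and toy_mask (a hypothetical stand-in for the wrapped string algorithm) are all assumptions, and the order of operations is inferred from the example results and the priority note. requireInputChange is not modelled, and because toy_mask is not CM Alpha-Numeric, the output will not match the masked values shown in the example.

```python
import unicodedata

# Keys mirror the example configuration. The flat dict layout is an
# assumption for this sketch, not the product's JSON schema.
CONFIG = {
    "filteredCharacters": ["a", "e", "i", "o", "u", "y"],
    "replacementCharacter": "#",
    "outputFilteredCharacters": ["a", "e", "i", "o"],
    "outputReplacementCharacter": "$",
    "filterAccents": "BOTH",
}


def strip_accents(text):
    """Drop combining marks so that 'é' becomes 'e' and 'ñ' becomes 'n'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def replace_chars(text, chars, replacement):
    """Replace every character found in `chars` with `replacement`."""
    return "".join(replacement if ch in chars else ch for ch in text)


def toy_mask(text):
    """Hypothetical stand-in for the wrapped string algorithm: rotates ASCII
    letters forward by one and leaves every other character untouched."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 1) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 1) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def character_replacement(value, string_algorithm, config):
    """Apply input rules, mask, then apply output rules (assumed order)."""
    both = config.get("filterAccents") == "BOTH"
    filtered = set(config["filteredCharacters"])
    out_filtered = set(config["outputFilteredCharacters"])

    # Input side: accent removal, then the rule that covers input and output.
    if both:
        value = strip_accents(value)
    value = replace_chars(value, filtered, config["replacementCharacter"])

    masked = string_algorithm(value)

    # Output side: the output-only rule runs first, which gives it priority
    # over replacementCharacter for characters listed in both rules.
    if both:
        masked = strip_accents(masked)
    masked = replace_chars(masked, out_filtered, config["outputReplacementCharacter"])
    return replace_chars(masked, filtered, config["replacementCharacter"])


print(character_replacement("Andérsoñ", toy_mask, CONFIG))  # B$$#st#$
```

As in the documented results, the uppercase 'A' is not replaced, because this sketch matches characters case-sensitively. Vowels in the input become '#' before masking, and 'a', 'e', 'i', or 'o' produced by masking become '$'.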