Free Text Redaction
The Free Text Redaction framework removes sensitive data that appears in free-text columns such as “Notes.” This type of algorithm requires some expertise to use because it must be configured to recognize sensitive data within a block of text.
The algorithm uses a list of lookup words to determine what information it needs to mask. Decide which words the algorithm should search for to find material such as addresses. For example, you can set the algorithm to look for “St,” “Cir,” “Blvd,” and other words that suggest an address. Pattern matching with regular expressions can also be used to identify potentially sensitive information. For example, a number that takes the form 123-45-6789 is likely to be a Social Security Number. Lookup words and regular expressions match individual words within the input text, rather than phrases.
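The word-level matching described above can be sketched in Python. This is only an illustration, not the product's implementation: the function names, the whitespace tokenization, the punctuation stripping, and the case-sensitive comparison are all assumptions made for the sketch.

```python
import re

# Lookup words and pattern taken from the examples in the text above.
LOOKUP_WORDS = {"St", "Cir", "Blvd"}
SSN_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")


def strip_punctuation(token):
    """Drop leading/trailing punctuation so that "St," compares as "St" (assumed behavior)."""
    return token.strip(".,;:!?\"'()")


def flag_tokens(text):
    """Return the individual words that match a lookup word or the pattern."""
    flagged = []
    for token in text.split():  # assumption: words are separated by whitespace
        word = strip_punctuation(token)
        # Each word is tested on its own; multi-word phrases are never compared.
        if word in LOOKUP_WORDS or SSN_PATTERN.fullmatch(word):
            flagged.append(word)
    return flagged


print(flag_tokens("Lives at 12 Main St, SSN 123-45-6789."))
# prints ['St', '123-45-6789']
```

Because each word is tested separately, a phrase such as “Main St” would not work as a single lookup entry in this sketch; only the word “St” is flagged.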
This framework can either hide or show designated information, depending on whether a DenyList or an AllowList is used.
DenyList
Designated material is redacted (replaced with a redaction value). For example, a DenyList can be set to hide patient names and addresses. The DenyList feature matches the entries in the lookup file against the input text.
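A minimal Python sketch of DenyList behavior follows, using the lookup words, regular expression, and redaction values from the example later in this section. The function name is hypothetical, and the order of operations (regular expression first, then lookup words) and the case-sensitive matching are assumptions of the sketch rather than documented behavior.

```python
import re

# Values copied from the example in this section.
LOOKUP_WORDS = ["Bob", "Jones", "agreement"]
LOOKUP_REDACTION = "XXXX"
PHONE_PATTERN = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")
REGEX_REDACTION = "YYYY"

# One alternation that matches any lookup word as a whole word.
LOOKUP_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in LOOKUP_WORDS) + r")\b"
)


def deny_list_redact(text):
    """Replace regex matches and lookup words with their redaction values."""
    text = PHONE_PATTERN.sub(REGEX_REDACTION, text)   # assumption: regex applied first
    return LOOKUP_PATTERN.sub(LOOKUP_REDACTION, text)


source = (
    "The customer Bob Jones is satisfied with the terms of the sales "
    "agreement. Please call to confirm at 718-223-7896."
)
print(deny_list_redact(source))
```

Run on the example input, this sketch yields the same text as the MASKING RESULT shown in the example.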
AllowList
ONLY designated material will be visible. For example, if a drug company wants to assess how often a particular drug is being prescribed, an AllowList can be used so that only the name of the drug appears in the notes.
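The AllowList idea can be sketched the same way. In this illustration every word that is not on the list is replaced with the redaction value; the function name, the definition of a “word,” the sample sentence, and the choice to redact word by word are assumptions, and the product may treat non-allowed text differently.

```python
import re

# Assumption: a "word" is a run of letters, digits, apostrophes, or hyphens.
WORD_PATTERN = re.compile(r"[A-Za-z0-9'-]+")


def allow_list_redact(text, allowed_words, redaction_value="XXXX"):
    """Keep only the allowed words visible; redact every other word."""
    allowed = set(allowed_words)

    def keep_or_redact(match):
        word = match.group(0)
        return word if word in allowed else redaction_value

    return WORD_PATTERN.sub(keep_or_redact, text)


note = "Patient Ann Lee was prescribed Aspirin daily."
print(allow_list_redact(note, ["Aspirin"]))
# prints XXXX XXXX XXXX XXXX XXXX Aspirin XXXX.
```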
Below is an example of the Free Text Redaction framework.
Input: The customer Bob Jones is satisfied with the terms of the sales agreement. Please call to confirm at 718-223-7896.
Redact type: DenyList
Lookup file:
Bob
Jones
agreement
Lookup file redaction value: XXXX
Regular expressions entry: [0-9]{3}-[0-9]{3}-[0-9]{4}
Regular expression redaction value: YYYY
MASKING RESULT: The customer XXXX XXXX is satisfied with the terms of the sales XXXX. Please call to confirm at YYYY.
"Bob", "Jones", "agreement" and the phone number are redacted.