Redaction types
The three types of redactions are explained in this article.
Extract text redactions are used for redacting passages of text.
- Create these redactions by highlighting text in a document's OCR section in the usual fashion.
- Set either "Type of Redaction" to "Extract text" or "Redaction Status" to "Pending".
Text redactions are case sensitive and process whole words only. Text starting with a partial word will not be found. If text ends with a partial word, the entire word will be redacted. A word is a series of characters without punctuation or spaces.
A document may contain several instances of the text being redacted. If so, after you apply the redaction, MasterFile alerts you by setting the "Redaction Status" to "Error" and asks you to set "Which instance to redact?" as explained below.
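The whole-word rules above can be illustrated with a short Python sketch. This is not MasterFile's code, only a model of the behaviour as described here: the function name find_whole_word_matches and the use of Python's string.punctuation as the definition of "punctuation" are assumptions made for illustration.

```python
import string
from typing import List, Tuple


def is_word_char(ch: str) -> bool:
    # Assumption: a "word" character is anything that is not
    # whitespace and not ASCII punctuation.
    return not ch.isspace() and ch not in string.punctuation


def find_whole_word_matches(text: str, needle: str) -> List[Tuple[int, int]]:
    """Return (start, end) spans that would be redacted for `needle`.

    Case sensitive. A hit that starts in the middle of a word is skipped;
    a hit that ends in the middle of a word is extended to the word's end.
    """
    if not needle:
        return []
    spans = []
    pos = text.find(needle)
    while pos != -1:
        starts_on_boundary = pos == 0 or not is_word_char(text[pos - 1])
        if starts_on_boundary:
            end = pos + len(needle)
            # Extend to the end of the word only if the hit ends inside one.
            if is_word_char(text[end - 1]):
                while end < len(text) and is_word_char(text[end]):
                    end += 1
            spans.append((pos, end))
        pos = text.find(needle, pos + 1)
    return spans


sample = "The contractor signed the contract. Contract terms apply."
for start, end in find_whole_word_matches(sample, "contract"):
    print(start, end, repr(sample[start:end]))
```

In this sample, searching for "contract" yields two spans, "contractor" (extended to the end of the word) and "contract"; the capitalised "Contract" is not matched, and searching for "tract" finds nothing because every hit starts mid-word. More than one span corresponds to the situation in which you would need to choose which instance to redact.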
Search string redactions are used to redact short, repeating text strings (such as a SIN, name, or phone number) within a page range or across an entire PDF.
- Click "Make Extract" from an open document profile, i.e. without highlighting any text.
- Enter each string to be redacted on a new line in the extract profile's Extract field, where the Extract's text is normally displayed.
- Set the "Segment type" to "Page" and enter the default page range in the extract's "Required Information" section.
- In the "Redaction" section, set the "Type of Redaction" to "Search strings".
Search string redactions that have been applied appear with a grey flag rather than the usual black flag, to remind you to double-check that all instances of the text strings were found and redacted. To help you confirm this, the "Redaction reporting/errors" field indicates how many instances of each search term were found and redacted.
You can optionally:
- Set a custom page range within which to redact a string. That overrides the extract's default page range. The format is:
<starting page no.>....<ending page no.>....<text to redact>
For example,
1....-1....800-123-7654
3....5....John Doe
will redact '800-123-7654' from page 1 through the last page of the PDF and 'John Doe' on pages 3 to 5. Note that the four periods, as shown, are required. Using "-1" as the ending page number extends the range to the last page, so a range of 1 to -1 covers all pages in the PDF.
- Use the REGEX: prefix for regular expression searches.
For example,
2....2....REGEX:[a-zA-Z0-9]+?@.+?\.com
will redact simple .com email addresses (that have no punctuation) on page 2.
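To make the line format above concrete, here is a minimal Python sketch of how such lines could be split into a page range, a search text, and a REGEX flag. It is an illustration only, not MasterFile's parser: the names SearchString, parse_search_line and resolve_pages are hypothetical, and the treatment of lines without a page range (falling back to the extract's default range) follows the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SEPARATOR = "...."        # the four required periods
REGEX_PREFIX = "REGEX:"


@dataclass
class SearchString:
    text: str
    start_page: Optional[int]  # None means "use the extract's default range"
    end_page: Optional[int]    # -1 means "through the last page"
    is_regex: bool


def _to_int(value: str) -> Optional[int]:
    try:
        return int(value)
    except ValueError:
        return None


def parse_search_line(line: str) -> SearchString:
    start = end = None
    text = line
    # Split at most twice so periods inside the search text are preserved.
    parts = line.split(SEPARATOR, 2)
    if len(parts) == 3:
        first, second = _to_int(parts[0]), _to_int(parts[1])
        if first is not None and second is not None:
            start, end, text = first, second, parts[2]
    is_regex = text.startswith(REGEX_PREFIX)
    if is_regex:
        text = text[len(REGEX_PREFIX):]
    return SearchString(text, start, end, is_regex)


def resolve_pages(entry: SearchString,
                  default_range: Tuple[int, int],
                  page_count: int) -> Tuple[int, int]:
    """Return the (first, last) page the entry applies to."""
    if entry.start_page is None:
        start, end = default_range
    else:
        start, end = entry.start_page, entry.end_page
    if end == -1:
        end = page_count
    return start, end


for raw in ["1....-1....800-123-7654",
            "3....5....John Doe",
            r"2....2....REGEX:[a-zA-Z0-9]+?@.+?\.com",
            "Jane Roe"]:
    entry = parse_search_line(raw)
    print(entry, resolve_pages(entry, default_range=(2, 4), page_count=12))
```

The last entry has no page range of its own, so in this sketch it falls back to the assumed default range of pages 2 to 4.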
We do not recommend using REGEX search strings without first previewing what they match, as small mistakes in an expression will lead to unintended redactions or errors. We do not provide help with REGEX search strings.
Search string redactions are case insensitive but REGEX redactions are case sensitive. As with text redactions, search string redactions process whole words only. Text starting with a partial word will not be found. If text ends with a partial word, the entire word will be redacted. A word is a series of characters without punctuation or spaces.
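One rough way to sanity-check a search string before relying on it is to try it against sample text outside MasterFile. The Python sketch below is an assumption-laden stand-in, not MasterFile's engine: it uses Python's re module, whose syntax may differ from the regular expression flavour MasterFile uses, it ignores the whole-word rule, and the function name preview_matches is hypothetical. It only mirrors the case rules described above (plain search strings case insensitive, REGEX case sensitive).

```python
import re
from typing import List


def preview_matches(pattern: str, sample: str, is_regex: bool) -> List[str]:
    """List the pieces of `sample` that `pattern` would hit."""
    if is_regex:
        # REGEX entries: case sensitive, pattern used as written.
        compiled = re.compile(pattern)
    else:
        # Plain search strings: literal text, case insensitive.
        compiled = re.compile(re.escape(pattern), re.IGNORECASE)
    return [m.group(0) for m in compiled.finditer(sample)]


print(preview_matches("john doe", "John Doe met JOHN DOE.", is_regex=False))

# The email pattern from the example above, tried on a line that also
# contains a non-.com address.
email_pattern = r"[a-zA-Z0-9]+?@.+?\.com"
print(preview_matches(email_pattern,
                      "admin@example.org wrote to jdoe@example.com",
                      is_regex=True))
```

In Python, the second call returns the entire line as a single match, because the pattern keeps scanning from the first "@" until it reaches ".com". This is the kind of unintended redaction that a preview is meant to catch.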
Rectangle redactions are used to redact a rectangular area (or an entire page) of each page in a page range. Rectangle Extracts contain coordinates rather than text and are created with either Kofax PowerPDF or Acrobat from MasterFile.
- Select the documents to redact.
- Click [R+ Redaction Tools > Redact with external tool] to start PowerPDF or Acrobat.
- Mark areas for redaction in Acrobat or PowerPDF, but do not apply the redactions, save, or close the application or any open documents.
- Switch back to MasterFile.
- Click OK. MasterFile will close all open PDFs, close the PDF application and automatically create new rectangle redaction extracts for each area marked.
Rectangles and the ability to copy profiles can be used in various ways:
- If you specify a page range, you can easily apply a drawn rectangle to redact headers, footers, etc. on those pages of the PDF.
- To redact an entire page, draw a large rectangle that covers all the text on the page, leaving a small margin around it so it is clear the page was redacted.
- If you copy the Redaction Extract profile and paste it back in the view, making sure to select the parent document before pasting, the redaction extract will be duplicated. You can then edit it and specify a different page range across which to redact the rectangle.
You can also manually enter the top, bottom, left and right coordinates of a redaction rectangle if you choose, and save and preview this redaction to check your coordinates, but it's simpler to draw the rectangle on the PDF and let MasterFile manage it as explained above.
Vector graphics (drawings that scale without pixelation like charts, graphs, icons, shapes, etc. as opposed to photographs) are redacted in a two-step process.
- Create a rectangle extract as above.
- Mark areas for redaction, and apply the redactions in Kofax PowerPDF or Acrobat.
- Mark the same areas a second time (they need not be perfectly aligned).
- Switch back to MasterFile and click OK to close any open PDFs, the open application, and automatically create new rectangle extracts for each area in MasterFile.
- Prefix the extracts' summary field with "via External Redaction - " so you are aware vector graphics were redacted manually.