A raft of state and federal laws require the redaction of social security numbers from publicly available documents. For attorneys, this usually means that documents produced in litigation must be scrubbed of SSN’s. In my experience, many parties resort to tiffing out otherwise natively producible Excel files and going page by page applying redaction stamps manually. But this is an expensive, labor-intensive, and error-prone process. In this post I describe a technique to quickly find and redact all social security numbers from a native Excel file in one fell swoop.
In this sample data there is information about a number of individuals and business entities, spread out among 5 worksheets. Some of the entries, but not all of them, have SSN’s entered. First I’ll show you how to locate all of the SSN’s, then how to replace them with generic text.
Most users are familiar with the basic Find and Find and Replace functions in Excel. There are three ways to turbocharge these functions to accomplish our goal of finding and replace all the SSNs at once.
1, Use Wild Card Characters
The question mark (?) is treated as any single character in an Excel search, so searching for ???-??-???? will return any cell with a social security number in the form 123-45-6789. The wildcard can be used in the Find & Replace function as well. This search will also tell us if there is other data in the same format that should not be redacted. For example, if there are part numbers of the form ABC-12-XY34 this search will identify those as well. (In that case we would need a different strategy to redact only the SSN’s.)
2. Use Find/Replace All
By choosing Find All instead of the default Find Next, you’re asking for a list of all the search results, which will appear in a handy box where you can click any result and go straight to the cell with the search result.
3. Search Across the Entire Workbook
By default the Find feature will only search the current worksheet. But you can hit the ‘Options’ button and click the button next to the Within field. Change the scope of the search from Sheet to Workbook. Now you’re instructing Excel to search every worksheet in the Excel workbook, not just the current worksheet. Likewise, in Find & Replace mode you’ll be replacing the target values throughout the workbook
Now let’s put these tools together to find every SSN of the form 123-45-6789 and replace it with the generic text XXX-XX-XXXX.
First, we’ll do a search to see if there are any SSN’s in the Excel file, and where they are. Generate a search by hitting Ctrl+F to launch the Find and Replace dialog box. Hit the Options button to expand the search options and choose “Within Workbook.” Next, enter ???-??-???? in the “Find what” field and hit Find All.
Excel produces a search results box which lists all of the search results on the various tabs. Clicking on any of the results will take you to the cell in question, even on a different worksheet. The results box stays on top to make it easy to scroll through the results.
This search confirms that there are SSN’s in the expected format, and that our search has not swept in any other kind of data that shouldn’t be redacted. Notice that, in this example, SSN’s were located in several different worksheets. The Status Bar indicates that 43 results were found.
Now we can proceed to replace all of these SSN’s with the text of your choice. You might use “XXX-XX-XXXX” or “[SSN redacted]”. Launch the Replace option of the Find and Replace dialog with a hearty Ctrl+H. The criteria you used for the search will already be populated. Enter the ‘Replace with’ text and click ‘Replace All.’
Success! All of the SSN’s are now replaced with an innocuous XXX-XX-XXXX.
Depending on the complexity of the file, you might want to run further searches to see if there are social security numbers lurking in other formats. Try searching for “SSN”; “social security”; or “?????????”
Now for a slightly different scenario. Federal Rule of Civil Procedure 5.2 provides that, with certain exceptions, “in an electronic or paper filing with the court that contains an individual’s social-security number . . . a party or nonparty making the filing may include only . . . the last four digits of the social-security number . . .”
Suppose you wanted to redact only the first 5 digits of the SSN, and leave the last 4 intact, as permitted by this rule? Assuming that the SSN’s occurred in a field of their own (i.e., just the SSN in the cell), you could use this formula to mask the first 5 digits:
=”XXX-XX-“&RIGHT(B2,4)
As seen in the figure below, this formula turns 120-41-0529 into XXX-XX-0529:
This formula uses the RIGHT function to grab the last 4 characters of cell B2, housing the SSN, and gloms it onto the prefix “XXX-XX-”. You could copy this formula all the way down to generate redacted versions of all the SSNs in Column B. Then replace the original SSN’s with the redacted versions. (Hint: Use Paste Special – Values.)
Finally, gentle readers, lest anyone suspect me of violating the aforementioned state and federal laws by displaying real SSN’s in this post (wouldn’t that be ironic?), I share my formula for generating the completely random social security numbers used in this exercise:
=TEXT(RANDBETWEEN(1,999999999),”000-00-0000″)
To learn more great Excel tips for attorneys, subscribe to the blog. Better yet, contact Excel Esquire to set up a CLE-accredited Excel training at your firm, bar association, company, or public interest organization. Also, click here to learn about upcoming open CLE programs in New York City.