Courts and litigants have long struggled with the question of how to describe email chains on a privilege log. Should you log only the most recent email, or log every email in the chain–or something in between? New York has recently adopted a potentially burdensome rule on this topic–one that cries out for an Excel solution.
Effective September 2, 2014, Commercial Division Rule 11-b imposes new obligations on litigants in New York Supreme who create document-by-document privilege logs, as opposed to the now-preferred “categorical privilege logs.” See here to read the rule. Among other things, entries for email chains should now indicate “the number of e-mails within the dialogue.” Rule 11-b (b)(3)(iii). That means you can log only the most recent email in a given chain, but you need to also disclose how many emails are in the chain.
How, exactly, does one figure out the number of emails in every email chain?
Option 1: Have your paralegal page through every page of every email on your log and count how many email headers occur in each chain. Time required: Hours, days . . . weeks?
Option 2: Use the available email threading metadata and a couple simple formulas. Time required: about 8 minutes.
This article will address Option 2.
Your document review platform should contain metadata fields that reflect email-threading properties for all of the emails in your collection. This information helps to understand how emails in the same chain or family are related to each other. Chances are you’ve never thought twice about those fields. The field commonly called Conversation Index is a unique email identifier that has a common beginning for related emails, and grows predictably longer as the email chain grows longer. You can parse this field to determine how many emails are in any given email chain.
Here’s a sample of metadata for several emails in an email chain with the Conversation Index displayed:
Here’s how the Conversation Index works:
When you create a brand new email in Outlook, Outlook assigns a new Conversation Index, which is a unique 44-digit alphanumeric code, like this:
When someone replies to (or forwards) this email, Outlook tacks on 10 more characters, like so:
Every reply or forward will have an index length 10 digits longer than that of the previous email. Importantly, if the same email gets three different responses, those three will have the same index length because they occupy the same position in the hierarchy.
Meanwhile, Excel has a convenient function called LEN that will count the number of characters in a cell. For example, if the 44-digit code is in cell G2, the function =LEN(G2) will return a value of 44:
Next let’s copy this formula down to calculate the length of the Conversation Index value for each email in our example:
Notice how, with each new email in the chain, the index length gets 10 characters longer.
The next step requires a bit of arithmetic: Since the first email has 44 characters, and every subsequent email in the chain has 10 more, we should subtract 34 and then divide by 10 to get the number of emails in any given email chain. Thus:
number of emails: =(H2-34)/10
Applying this formula to the four emails in this example yields the following results:
Now here is a messier, but more realistic, example. It includes all the logged emails in our fictitious conversation thread:
Notice several features of this set of emails:
- Not every email in the chain is actually represented on the log (perhaps they were not privileged, or were deleted pre-collection, not collected, etc.)
- Some emails have the same length because they are distinct responses to the same email
- When listed in chronological order, as they are here, they are not all in numerical (# of emails) order, because people are sometimes responding to older parts of the chain. For example, the penultimate email listed above is a late response to the original email, so it contains only 2 emails.
The algorithm does exactly what we wanted it to do: it correctly determines the number of emails in any given email chain, based on the Conversation Index. The emails need not be grouped together by conversation thread either–I used examples from the same thread above merely for illustrative purposes.
Here are a few more pointers to help you carry out this analysis:
1. I explained the process above in two steps for clarity, but the two functions can be combined into a single function that generates the number of emails directly:
=(LEN(G2)-34)/10 where G is the Conversation Index
2. If some of your rows have attachments, those will not have a Conversation Index at all, so your LEN will be 0, and the number of emails will appear as -3.4. To avoid seeing these strange values, you can use this formula instead:
This formula says, “As long as there is a Conversation Index, use the same formula; otherwise show “n/a”.”
3. Finally, if a user cuts an email from one window and pastes it into the body of a new email, Outlook will obviously not adjust the thread index to account for the pasted material. Do a reality check on your results to see if there might be some anomalies.
Thanks for reading! Please comment, follow, share, or re-post if you found it helpful.
Feel free to contact me if you have any questions about using Excel to create a bullet-proof privilege log.