The need to determine whether a specific message was read by an end-user comes up often in email forensics. The question is often twofold:
- How can we preserve the “read” status of messages during forensic email acquisitions?
- Can we go beyond that and determine if a user had read a message and subsequently marked it as unread? Can we find out when this happened?
While supporting Forensic Email Collector, I have answered a few queries along these lines very recently. I wanted to write this quick post to lay out some of the possibilities in this area when targeting Gmail or Google Workspace—formerly known as G Suite.
Preserving the “read” status of messages is part of virtually any forensic email preservation workflow. In the context of Gmail / Google Workspace, FEC, Google Vault, Google Takeout, and IMAP all support this in different ways, so I won’t get into the details here. Instead, we’ll get right into the more exciting stuff!
Investigating Historical Message Read Status Activity
Capturing whether a message is marked as “read” or “unread” during forensic preservation is certainly useful. But could we determine what happened in the past? For example, did the end-user read a message and then mark it as “unread”? What else did they do, and when?
The answers to these questions depend on whether you are targeting Gmail or Google Workspace, and how far back the activity occurred. Let’s take a look at some of the strategies we can use.
Email Log Search in Google Workspace (aka G Suite)
Let’s look at the post-delivery message details for five messages in Google Workspace. The end-user took the following actions on these messages:
Message #1: The end-user encountered this message in their mailbox when they logged into Gmail’s web interface, but never opened it.
Message #2: The end-user opened this message.
Message #3: The end-user opened this message, and then marked it as “unread”.
Message #4: The end-user marked this message as “read” without opening it.
Message #5: The end-user never encountered this message. That is, it was never included in the list of messages presented to the end-user when they logged into Gmail’s web interface.
We will now go over the results of an email log search. Google Workspace admins can perform these searches in the Google Admin console.
Message #1
State: Unopened and unread, Seen, Marked unimportant
Here, the Seen post-delivery message status indicates that the message was listed in the user’s view when they opened Gmail. Unopened and unread indicates that the end-user did not open or read the message. This is consistent with what we expect for this message. The Marked unimportant post-delivery message status is self-explanatory: it indicates that the message is marked unimportant. In this case, this was a system action, not a user action.
Below is a screenshot of what this looks like on the Google Admin user interface.
Message #2
State: Opened and read, Seen, Marked unimportant
Opened and read indicates that the end-user opened and read the message. This is consistent with what we would expect for this message: the end-user was presented with the message, they opened it, and it was marked “read”.
Message #3
State: Opened and marked as unread, Seen, Marked unimportant
Now things are getting interesting! Opened and marked as unread indicates that the user opened this message, and then subsequently marked it as “unread”.
Message #4
State: Unopened and marked as read, Seen, Marked unimportant
As expected, the Unopened and marked as read post-delivery message status reflects precisely what the end-user did. That is, they were presented with the message, but they marked it as “read” without opening it. One way to accomplish this in Gmail’s user interface is to check the checkbox next to the message, and then to mark it as “read” using the “Mark as read” menu item in the toolbar.
Message #5
State: Unopened and unread, Unseen, Marked unimportant
The Unseen post-delivery message status indicates that the user never encountered this message in Gmail.
To take this a step further, I created an additional message (Message #6) and waited for the message to arrive while the end-user’s Gmail was open in a browser tab without any user interaction. That is, Gmail’s web interface refreshed automatically to list the new message without any explicit user action to navigate or refresh the page. This still resulted in the Seen post-delivery message status.
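If you are reviewing many log search results, the status strings above can be turned into structured flags. Here is a minimal Python sketch, assuming the state is available as a comma-separated string exactly as quoted in this post; the function name, dictionary keys, and the idea of exporting the state as text are my own assumptions for illustration, not part of any Google tool.

```python
# Interpretation of the four read-state phrases quoted in this post.
# "manually_marked" means the read flag was changed without the
# corresponding open (or was reverted after an open).
READ_STATES = {
    "unopened and unread": {"opened": False, "read": False, "manually_marked": False},
    "opened and read": {"opened": True, "read": True, "manually_marked": False},
    "opened and marked as unread": {"opened": True, "read": False, "manually_marked": True},
    "unopened and marked as read": {"opened": False, "read": True, "manually_marked": True},
}


def interpret_state(state: str) -> dict:
    """Turn a post-delivery state string into boolean flags.

    Unknown parts (e.g. "Marked unimportant") are ignored; flags that
    cannot be determined stay None.
    """
    result = {"seen": None, "opened": None, "read": None, "manually_marked": None}
    for part in (p.strip().lower() for p in state.split(",")):
        if part == "seen":
            result["seen"] = True
        elif part == "unseen":
            result["seen"] = False
        elif part in READ_STATES:
            result.update(READ_STATES[part])
    return result


if __name__ == "__main__":
    print(interpret_state("Opened and marked as unread, Seen, Marked unimportant"))
```

Keeping undetermined flags as None rather than False avoids overstating what the log actually shows.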
How Far Back Does Email Log Search Go?
When you attempt to specify a date range within the Email Log Search user interface, you can go back for about one month. However, Email Log Search allows you to search for messages older than 30 days by using the “Older than 30 days” option from the dropdown shown below.
This is with the caveat that you only get the post-delivery message status information for these older messages, not the other details included in the screenshot above. Additionally, you are required to provide the exact recipient address as well as the Message ID for your target message. Despite these restrictions, this is still extremely useful when you are investigating a specific message!
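Since the older-than-30-days search needs the exact recipient address and the Message ID, it can help to pull both from the preserved copy of the target message. The following is a rough sketch using Python's standard email module; the function name is hypothetical, and I am assuming the To and Cc headers are a reasonable starting point for the recipient (the actual delivery address may differ, for example with Bcc or aliases).

```python
from email import message_from_string
from email.utils import getaddresses


def log_search_keys(raw_message: str):
    """Return (message_id, recipient_addresses) from a raw RFC 5322 message."""
    msg = message_from_string(raw_message)
    message_id = (msg.get("Message-ID") or "").strip()
    # Collect bare addresses from To and Cc; display names are dropped.
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr for _name, addr in getaddresses(headers) if addr]
    return message_id, recipients


if __name__ == "__main__":
    sample = (
        "Message-ID: <abc123@mail.example.com>\n"
        "To: Alice <alice@example.com>, bob@example.com\n"
        "Subject: Test\n"
        "\n"
        "Body\n"
    )
    print(log_search_keys(sample))
```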
History Records in Gmail and Google Workspace
Another investigative technique we can use to answer some of these questions is Gmail History Records. This approach has a few advantages:
- It applies to both free Gmail accounts and paid Google Workspace accounts
- It can be used to date user actions such as when a message was marked as unread
- History records also include messages that are added and deleted
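To illustrate the kind of data involved, here is a small Python sketch that picks out UNREAD label changes from a history listing. It assumes the input is a dictionary shaped like the JSON returned by the Gmail API's users.history.list method (a "history" list whose records may carry "labelsAdded" and "labelsRemoved" entries). The function name is my own, and the sample IDs are borrowed from the example later in this post; this is not how Forensic Email Collector itself processes the records.

```python
def unread_label_events(history_response: dict) -> list:
    """Return (history_id, message_id, action) for each UNREAD label change."""
    actions = (
        ("labelsRemoved", "UNREAD removed (message became read)"),
        ("labelsAdded", "UNREAD added (message marked unread)"),
    )
    events = []
    for record in history_response.get("history", []):
        for key, action in actions:
            for change in record.get(key, []):
                if "UNREAD" in change.get("labelIds", []):
                    events.append((record["id"], change["message"]["id"], action))
    return events


if __name__ == "__main__":
    # Hand-built sample in the API's response shape (assumption).
    sample = {
        "history": [
            {"id": "290073", "labelsRemoved": [
                {"message": {"id": "178607f63d53dedc"}, "labelIds": ["UNREAD"]}]},
            {"id": "290120", "messagesAdded": [
                {"message": {"id": "1788f1441e8167fe"}}]},
            {"id": "290189", "labelsAdded": [
                {"message": {"id": "178607f63d53dedc"}, "labelIds": ["UNREAD"]}]},
        ]
    }
    for event in unread_label_events(sample):
        print(event)
```

Note that a removed UNREAD label only tells us the message became “read”; on its own, it does not distinguish opening the message from marking it as read.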
Since we covered Gmail History Records in the past, I will not go into full detail here. However, let’s take a look at an example to see if we can determine when the end-user likely read a message, and when they subsequently marked the previously-read message as “unread”.
In this example, the end-user opens a message with the subject “Sisyphus and Boulder” on 4/1/2021 at 1:11 PM (PDT). A few minutes later, at 1:16 PM (PDT), they mark the message as “unread”. The relevant history records, after Forensic Email Collector correlated them with message metadata, appear as follows:
------ HISTORY RECORD ID: 290038 ------
Messages Added:
  ID: 1788efef7e6e16e4
  Folder Path: All Mail
  Subject: Message 6
  From: LMISF Test <firstname.lastname@example.org>
  To: email@example.com
  Message ID: <CAMvYnDMYmh6T_3QFYY2RFO_tziROfC+ePgPKv7igOjWii5c6dw@mail.gmail.com>
  Date: 2021-04-01 19:52:58Z

------ HISTORY RECORD ID: 290073 ------
Labels Removed:
  Removed Label ID: UNREAD
  From Message:
    ID: 178607f63d53dedc
    Folder Path: All Mail
    Subject: Sisyphus and Boulder
    From: NextDraft <firstname.lastname@example.org>
    To: <email@example.com>
    Message ID: <firstname.lastname@example.org>
    Date: 2021-03-23 18:31:08Z

------ HISTORY RECORD ID: 290120 ------
Messages Added:
  ID: 1788f1441e8167fe
  Folder Path: All Mail
  Subject: Confirm Your Subscription
  From: PLAE <email@example.com>
  To: firstname.lastname@example.org
  Message ID: <PiaWpZGKStO5fN8qu14Shg@ismtpd0177p1mdw1.sendgrid.net>
  Date: 2021-04-01 20:16:11Z

------ HISTORY RECORD ID: 290189 ------
Labels Added:
  Added Label ID: UNREAD
  To Message:
    ID: 178607f63d53dedc
    Folder Path: All Mail
    Subject: Sisyphus and Boulder
    From: NextDraft <email@example.com>
    To: <firstname.lastname@example.org>
    Message ID: <email@example.com>
    Date: 2021-03-23 18:31:08Z

------ HISTORY RECORD ID: 290257 ------
Messages Added:
  ID: 1788f16dbca40e33
  Folder Path: All Mail
  Subject: 10% off at PLAE - Welcome!
  From: PLAE <firstname.lastname@example.org>
  To: "email@example.com" <firstname.lastname@example.org>
  Message ID: <G1aUtePOQJifSN5Q_RQARg@ismtpd0128p1iad2.sendgrid.net>
  Date: 2021-04-01 20:19:02Z
The Date values in the history records are in UTC (the trailing “Z”); PDT is seven hours behind UTC. The acquired history records show that the “UNREAD” label was removed from our target message between two events: when a new message arrived on 4/1/2021 at 12:52:58 PM (PDT), and when another new message arrived on 4/1/2021 at 1:16:11 PM (PDT). This helps narrow the message read event down to an approximately 23-minute window.
Similarly, history records show that the “UNREAD” label was applied to our target message (in effect, marking it as “unread”) between two events: when a new message arrived on 4/1/2021 at 1:16:11 PM (PDT), and when another new message arrived on 4/1/2021 at 1:19:02 PM (PDT). This helps narrow the “marked as unread” event down to an approximately 3-minute window.
As I mentioned in our Gmail History Records article, it is important to forensically preserve and authenticate the messages you are using as anchor points in this type of analysis. Additionally, Gmail History Records typically do not go back more than a month.
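The bracketing logic used in this example can be sketched in a few lines of Python. This is a simplified illustration, not Forensic Email Collector's implementation: it assumes history record IDs increase in chronological order, and that the Date of each added message is a usable approximation of when that history record was created. The RECORDS list and the bracket_event function are hypothetical names; the IDs and timestamps are the ones from the records above.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
PDT = timezone(timedelta(hours=-7), "PDT")  # fixed offset, enough for this example

# (history record ID, UTC timestamp of the added message, or None for
# label events, which carry no timestamp of their own)
RECORDS = [
    (290038, datetime(2021, 4, 1, 19, 52, 58, tzinfo=UTC)),
    (290073, None),  # UNREAD removed from "Sisyphus and Boulder"
    (290120, datetime(2021, 4, 1, 20, 16, 11, tzinfo=UTC)),
    (290189, None),  # UNREAD added to "Sisyphus and Boulder"
    (290257, datetime(2021, 4, 1, 20, 19, 2, tzinfo=UTC)),
]


def bracket_event(records, target_id):
    """Return (lower, upper) timestamps of the nearest dated records around target_id."""
    lower = upper = None
    for history_id, timestamp in sorted(records, key=lambda r: r[0]):
        if timestamp is None:
            continue
        if history_id < target_id:
            lower = timestamp          # keep the latest anchor before the event
        elif history_id > target_id and upper is None:
            upper = timestamp          # first anchor after the event
    return lower, upper


if __name__ == "__main__":
    for target in (290073, 290189):
        lower, upper = bracket_event(RECORDS, target)
        print(target,
              lower.astimezone(PDT).strftime("%H:%M:%S"),
              upper.astimezone(PDT).strftime("%H:%M:%S"),
              upper - lower)
```

If either bound comes back as None, there is no dated anchor on that side of the event, and the window is open-ended.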
Opened Label in Google Vault and Takeout & Message Read Status
Another data point that can be helpful when investigating post-delivery message status is the Opened label included in Google Takeout and Vault exports. Here is how this looks in a Google Takeout mbox export:
X-Gmail-Labels: Sent,Inbox,Opened,Category personal
and in a Vault metadata XML:
<Tag TagName='Labels' TagDataType='Text' TagValue='^INBOX,^OPENED'/>
The interesting thing is that the Opened label is not accessible via the Gmail API, is not listed among the common Gmail system labels, and cannot be used to query messages via Gmail’s search feature (i.e., label:<labelname>). Although listed as a Gmail label in Takeout and Vault exports, the Opened label behaves like a special value rather than a regular Gmail label.
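Because the two exports spell the labels differently, it can be convenient to normalize them before comparing. Below is a minimal Python sketch that reads the labels from a Takeout message header and from a Vault Tag element and returns them as a set of uppercase names. The function names are my own, and I am assuming a single Tag element is passed in as a string; a real Vault metadata file would need to be walked to find the right element first.

```python
import xml.etree.ElementTree as ET
from email import message_from_string


def labels_from_takeout(raw_message: str) -> set:
    """Read X-Gmail-Labels from a raw mbox message and normalize the names."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Gmail-Labels", "")
    return {name.strip().upper() for name in header.split(",") if name.strip()}


def labels_from_vault_tag(tag_xml: str) -> set:
    """Read the Labels value from a Vault <Tag> element given as a string."""
    tag = ET.fromstring(tag_xml)
    if tag.get("TagName") != "Labels":
        return set()
    values = tag.get("TagValue", "").split(",")
    # Vault prefixes system labels with "^"; strip it for comparison.
    return {v.strip().lstrip("^").upper() for v in values if v.strip()}


if __name__ == "__main__":
    takeout = "X-Gmail-Labels: Sent,Inbox,Opened,Category personal\n\nbody\n"
    vault = "<Tag TagName='Labels' TagDataType='Text' TagValue='^INBOX,^OPENED'/>"
    print(labels_from_takeout(takeout))
    print(labels_from_vault_tag(vault))
```

User-created labels that contain commas or non-ASCII characters may need extra handling that this sketch does not attempt.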
The Opened and Unread labels are populated as follows for the 5 sample messages we discussed above:
As expected, the OPENED,UNREAD combination in Message #3 reveals that the message was marked as “unread” after it had been opened and read. Similarly, the fact that both the OPENED and UNREAD labels are missing from Message #4 shows that it was marked as “read” without being opened.
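The reasoning above amounts to a small decision table over the two labels, which can be sketched in Python as follows. The function name and the wording of the returned descriptions are my own. Only the OPENED plus UNREAD case and the neither-label case are spelled out in this post; the other two interpretations are my assumption, chosen to match the Email Log Search statuses discussed earlier.

```python
def read_history(labels) -> str:
    """Describe a message's read history from its exported label set."""
    names = {label.strip().lstrip("^").upper() for label in labels}
    opened = "OPENED" in names
    unread = "UNREAD" in names
    if opened and unread:
        return "opened, then marked as unread"
    if opened:
        return "opened and read"          # assumption, see lead-in
    if unread:
        return "unopened and unread"      # assumption, see lead-in
    return "marked as read without being opened"


if __name__ == "__main__":
    print(read_history({"^INBOX", "^OPENED", "^UNREAD"}))
    print(read_history({"Inbox"}))
```

Treat the result as a lead rather than a conclusion: the absence of both labels, for instance, is only meaningful for messages the user actually received.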
Using a combination of Email Log Search, Gmail History Records, and the Opened pseudo-label in Gmail and Google Workspace exports, forensic email examiners can answer questions such as:
- Has the end-user ever encountered the target message?
- Did they open it?
- When did they read it?
- Did they mark it as “read” without opening it?
- Did they mark it as “unread” after having read it?
Gmail History Records are particularly useful for showing both label and message deletion events and putting upper and lower time bounds on user activity.
It is important to keep in mind that time is of the essence, and Gmail History Records should be preserved as soon as possible. Additionally, any messages relied upon as anchor points for timing information should be authenticated.
Arman Gungor is a certified computer forensic examiner (CCE) and software developer. He has been appointed by courts as a neutral computer forensics expert as well as a neutral eDiscovery consultant. Arman is passionate about doing digital forensics research, developing new investigative techniques, and creating software to support them.