Peeking into msg files - revisited

Published: 2018-08-11
Last Updated: 2018-08-12 12:45:37 UTC
by Didier Stevens (Version: 1)
A reader asked how I knew stream 53 mentioned in diary entry "Peeking into msg files" contained the body of the email.

At that time, it was just trial and error. Since then, with the information posted by readers, I was able to make more sense of the different streams, and I developed a plugin for oledump to help with the analysis of MSG files: plugin_msg.

An MSG file is a "Compound File Binary Format" file, or what I like to call an OLE file. OLE files can be analyzed with my tool oledump:

The second column is the size of the stream. Back then, I just peeked into the larger streams (3, 15, 53, 54) and discovered that the email body was inside stream 53.

With the information posted by readers, I was able to make more sense of this data. The third column is the stream name. The hexadecimal number at the end of the stream name tells me what the stream contains and how it is encoded.

0x1000: Message body        <- This is the message body

Stream 53 has name __substg1.0_1000001F, and with this I know that it contains the message body (1000) and that it is UNICODE text (001F).

This information is used in plugin_msg to identify the different streams:


The plugin analyses the name of each stream, and presents the decoded information together with the beginning of the content of the stream.
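
The idea of decoding a stream name can be illustrated with a short Python sketch. This is not the source code of plugin_msg; the function name describe_stream and the two lookup tables are hypothetical, and apart from 0x1000 and 001F (both mentioned above) the table entries are assumptions taken from Microsoft's property documentation.

```python
import re

# Property IDs (first 4 hex digits). 0x1000 is from this diary entry;
# 0x0037 (subject) is an extra entry added for illustration.
PROPERTY_IDS = {
    0x1000: "Message body",
    0x0037: "Subject",
}

# Property types (last 4 hex digits). 0x001F is from this diary entry;
# the other two are assumptions based on Microsoft's documentation.
PROPERTY_TYPES = {
    0x001F: "UNICODE text",
    0x001E: "ASCII text",
    0x0102: "binary data",
}

STREAM_NAME = re.compile(r"^__substg1\.0_([0-9A-Fa-f]{4})([0-9A-Fa-f]{4})$")


def describe_stream(name):
    """Return (content, encoding) for a __substg1.0_ stream name, else None."""
    # Keep only the last path component, in case the stream sits in a storage.
    match = STREAM_NAME.match(name.split("/")[-1])
    if match is None:
        return None
    prop_id = int(match.group(1), 16)
    prop_type = int(match.group(2), 16)
    # Unknown values get a question mark, like in the plugin output.
    return (PROPERTY_IDS.get(prop_id, "?"), PROPERTY_TYPES.get(prop_type, "?"))


print(describe_stream("__substg1.0_1000001F"))
# → ('Message body', 'UNICODE text')
```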

To view just the output of the plugin, without the output of oledump, I use option -q:

Stream names that are not recognized by the plugin have a question mark (?) as description. To display only known streams, I use plugin option -k:

And with this output, it's easy to see that the message body is in stream 53, that it is UNICODE, and that it starts with "Dear Sir,".

I can now select stream 53 and display it as UNICODE, like this:
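
The same decoding step can be sketched in Python. This is a minimal sketch, not the oledump command itself: it assumes that a 001F stream holds UTF-16 little-endian text, and the helper names decode_unicode_stream and read_body are hypothetical. read_body additionally assumes that the third-party olefile module is installed.

```python
def decode_unicode_stream(data):
    """Decode the raw bytes of a 001F stream, assumed to be UTF-16LE."""
    # Strip trailing null characters that may terminate the string.
    return data.decode("utf-16-le").rstrip("\x00")


def read_body(msg_path, stream_name="__substg1.0_1000001F"):
    """Read and decode the message body stream from an MSG file."""
    import olefile  # third-party module, imported only when needed

    ole = olefile.OleFileIO(msg_path)
    try:
        return decode_unicode_stream(ole.openstream(stream_name).read())
    finally:
        ole.close()


# Example with bytes as they would appear at the start of stream 53:
raw = "Dear Sir,".encode("utf-16-le")
print(decode_unicode_stream(raw))
# → Dear Sir,
```

The raw bytes of a stream can also be obtained with oledump's options -s (select a stream) and -d (dump), and then passed to decode_unicode_stream.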



