Opening OLE2 COM Structured Storage Documents with PowerShell

COM structured storage is a popular storage format used in many legacy applications and some modern ones. Examples include:

  • Microsoft Office 97–2003 documents (.DOC, .DOT, .XLS, .XLT, .PPT, .POT, .PUB, .VSD, .MPP, .MSG)
  • Macros in Office 2007+ documents (vbaProject.bin file embedded in document)
  • Windows Installer Files (.MSI, .MSP, .MST)
  • Microsoft Picture It! / Microsoft Digital Image files (.MIX)
  • Internet Explorer RSS feeds / Windows RSS Platform files (.feed-ms)
  • Windows 7 Sticky Notes (.SNT)
  • Windows 7 Jump List files
  • Thumbs.db
  • Microsoft SQL Server 2000 DTS packages
  • Autodesk Revit
  • Autodesk Inventor
  • FlashPix
  • Altium Designer
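
All of these formats share the same container, so a quick way to tell whether an unknown file is a structured storage file is to check its first eight bytes against the compound file signature (D0 CF 11 E0 A1 B1 1A E1). The following is only a minimal sketch: Test-CompoundFileSignature is a hypothetical helper name, not part of the script in this post, and relative paths are resolved by .NET against the process working directory rather than the PowerShell location.

```powershell
# Hypothetical helper (an assumption for illustration): returns $true when the
# file starts with the 8-byte compound file signature.
function Test-CompoundFileSignature
{
    param([Parameter(Mandatory = $true)][String]$Path)

    # Signature found at offset 0 of every OLE2 compound file
    [Byte[]]$signature = 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1

    $header = New-Object Byte[] 8
    $stream = [System.IO.File]::OpenRead($Path)
    try
    {
        $read = $stream.Read($header, 0, $header.Length)
    }
    finally
    {
        $stream.Dispose()
    }

    # Files shorter than the signature cannot match
    if ($read -ne $signature.Length) { return $false }

    for ($i = 0; $i -lt $signature.Length; $i++)
    {
        if ($header[$i] -ne $signature[$i]) { return $false }
    }
    return $true
}
```

A call such as Test-CompoundFileSignature -Path "C:\support\test.doc" should return True for a Word 97–2003 document and False for a .docx file, which is a ZIP archive.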

If you look through the .NET documentation, you will find APIs for manipulating structured storage files in the System.IO.Packaging namespace, but there seems to be no public method to actually open your own file. In the early days of .NET this was possibly available, as some old MSDN documentation preserved on the Web Archive suggests.

The APIs mentioned in that documentation are still there; they are just marked internal rather than public. Using reflection we can still load and execute these functions to parse these types of files. We will load the file into a System.IO.Packaging.StorageInfo object.
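
Before calling anything through reflection, it can help to see which internal methods actually exist. The sketch below lists the method signatures declared on the internal StorageRoot type. It assumes Windows PowerShell with the .NET Framework WindowsBase assembly available; the exact set of methods printed may differ between framework versions, so treat the output as something to inspect rather than rely on.

```powershell
# Assumes Windows PowerShell with the .NET Framework WindowsBase assembly.
Add-Type -AssemblyName WindowsBase

# StorageInfo is public; the internal StorageRoot type lives in the same assembly.
$assembly = [System.IO.Packaging.StorageInfo].Assembly
$storageRootType = $assembly.GetType("System.IO.Packaging.StorageRoot", $true, $false)

# Include non-public members, but only those declared on StorageRoot itself.
$flags = [System.Reflection.BindingFlags]::Static -bor
    [System.Reflection.BindingFlags]::Instance -bor
    [System.Reflection.BindingFlags]::Public -bor
    [System.Reflection.BindingFlags]::NonPublic -bor
    [System.Reflection.BindingFlags]::DeclaredOnly

# Print one signature per line so the available overloads can be compared.
$signatures = $storageRootType.GetMethods($flags) |
    ForEach-Object { $_.ToString() } |
    Sort-Object
$signatures
```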

Note that this only gives you access to the individual streams within the structured storage file. In many of these documents the streams themselves are in a proprietary format that you also need to work out how to parse.
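
Because the streams usually need further analysis in a hex editor or a format-specific tool, one option is to copy each stream out to its own file. This is a rough sketch, not part of the original script: Export-StreamToFile is a hypothetical helper, it assumes the target directory already exists, and it replaces characters that are invalid in file names because stream names often contain control characters.

```powershell
# Hypothetical helper (an assumption for illustration): copies any readable
# .NET stream to a file and returns the path that was written.
function Export-StreamToFile
{
    param(
        [Parameter(Mandatory = $true)][System.IO.Stream]$Stream,
        [Parameter(Mandatory = $true)][String]$Name,
        [Parameter(Mandatory = $true)][String]$Directory)

    # Replace characters that are not valid in file names with "_"
    $safeName = $Name
    foreach ($c in [System.IO.Path]::GetInvalidFileNameChars())
    {
        $safeName = $safeName.Replace($c, [Char]'_')
    }

    $target = [System.IO.Path]::Combine($Directory, $safeName)
    $file = [System.IO.File]::Create($target)
    try
    {
        # Copy from the current position of the source stream to its end
        $Stream.CopyTo($file)
    }
    finally
    {
        $file.Dispose()
    }
    $target
}
```

With an open StorageInfo, each StreamInfo returned by GetStreams() could be passed in as -Stream ($stream.GetStream()) -Name $stream.Name, together with a directory of your choice.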

In this example we will get the streams from a Word 2003 document and display their byte content:

Add-Type -AssemblyName WindowsBase

Function Invoke-StorageRootMethod
{
    param(
        [System.IO.Packaging.StorageInfo]$storageRoot,
        [String]$methodName,
        [Object[]]$methodArgs)

    # StorageRoot is internal, so look it up by name in the assembly that
    # contains the public StorageInfo class
    $assembly = [System.IO.Packaging.StorageInfo].Assembly
    $storageRootType = $assembly.GetType("System.IO.Packaging.StorageRoot", $true, $false)

    # Match static and instance methods, whether public or not
    $flags = [System.Reflection.BindingFlags]::Static -bor
        [System.Reflection.BindingFlags]::Instance -bor
        [System.Reflection.BindingFlags]::Public -bor
        [System.Reflection.BindingFlags]::NonPublic -bor
        [System.Reflection.BindingFlags]::InvokeMethod

    # For static methods such as Open, $storageRoot is $null
    $result = $storageRootType.InvokeMember($methodName, $flags, $null, $storageRoot, $methodArgs)

    return $result
}

Function Format-Bytes
{
    param([Byte[]]$bytes,[Int]$Width = 20,[Int]$MaxBytes = 0)
    if ($MaxBytes -eq 0)
    {
        $MaxBytes = $bytes.Length
    }
    $stringBuilder = New-Object System.Text.StringBuilder

    For ($x = 0; $x -lt $MaxBytes; $x+=$Width )
    {
        for ($y = $x; $y -lt $x+$Width -and $y -lt $MaxBytes;$y++)
        {
            [void]$stringBuilder.Append([String]::Format("{0:X2} ",$bytes[$y]))
        }

        if ($y -lt $x+$Width)
        {
            for ($y = $y; $y -lt $x+$Width;$y++)
            {
                [void]$stringBuilder.Append("   ")
            }
        }
        [void]$stringBuilder.Append("| ")
        for ($y = $x; $y -lt $x+$Width -and $y -lt $MaxBytes;$y++)
        {
            # Show a dot for bytes outside the printable ASCII range
            if ($bytes[$y] -lt 32 -or $bytes[$y] -gt 126)
            {
                [void]$stringBuilder.Append(".")
            }
            else
            {
                [void]$stringBuilder.Append([Char]$bytes[$y])
            }
        }

        [void]$stringBuilder.AppendLine()
    }

    $stringBuilder.ToString()
}

Function Open-StructuredStorage
{
    param([String]$Path)

    # Open is a static method, so no StorageRoot instance is passed
    return (`
        Invoke-StorageRootMethod `
            -storageRoot $null `
            -methodName "Open" `
            -methodArgs @(
                $Path,
                [System.IO.FileMode]::Open,
                [System.IO.FileAccess]::Read,
                [System.IO.FileShare]::Read))
}
   
Function Close-StructuredStorage
{
    param([System.IO.Packaging.StorageInfo]$StorageInfo)
    return (`
        Invoke-StorageRootMethod `
            -storageRoot $StorageInfo `
            -methodName "Close" `
            -methodArgs $null)
}

$filename = "C:\support\test.doc"
$storageInfo = Open-StructuredStorage -Path $filename

if ($null -eq $storageInfo)
{
    Write-Error "Unable to open '$filename' as structured storage file"
    return
}
        
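ForEach ($stream in $storageInfo.GetStreams())
{
    Write-Host "Stream Name: $($stream.Name)"
    $reader = New-Object System.IO.BinaryReader($stream.GetStream())
    try
    {
        # ReadBytes takes an Int32 count, so cast the Int64 stream length
        $data = $reader.ReadBytes([Int]$reader.BaseStream.Length)
    }
    finally
    {
        # Closing the reader also closes the underlying stream
        $reader.Close()
    }
    Format-Bytes -bytes $data
}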

Close-StructuredStorage -StorageInfo $storageInfo
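
The loop above only visits streams at the root of the file, but structured storage files can also contain nested storages, which StorageInfo exposes through GetSubStorages(). A possible way to list every stream, sketched here with a hypothetical helper named Get-StorageStreamPath, is to recurse through the sub-storages. The parameter is left untyped so that the function works with anything exposing Name, GetStreams() and GetSubStorages().

```powershell
# Hypothetical helper (an assumption for illustration): emits one path string
# per stream, descending into nested storages.
function Get-StorageStreamPath
{
    param($Storage, [String]$Prefix = "")

    # Streams that live directly in this storage
    foreach ($stream in $Storage.GetStreams())
    {
        "$Prefix$($stream.Name)"
    }

    # Recurse into each nested storage, extending the path prefix
    foreach ($subStorage in $Storage.GetSubStorages())
    {
        Get-StorageStreamPath -Storage $subStorage -Prefix "$Prefix$($subStorage.Name)/"
    }
}
```

It would need to be called while the file is still open, for example Get-StorageStreamPath -Storage $storageInfo placed before the call to Close-StructuredStorage.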

About chentiangemalc

specializes in end-user computing technologies. disclaimer 1) use at your own risk. test any solution in your environment. if you do not understand the impact/consequences of what you're doing please stop, and ask advice from somebody who does. 2) views are my own at the time of posting and do not necessarily represent my current view or the view of my employer and family members/relatives. 3) over the years Microsoft/Citrix/VMWare have given me a few free shirts, pens, paper notebooks/etc. despite these gifts i will try to remain unbiased.