COM structured storage is a popular storage format used in many legacy applications and some modern ones. Examples include:
- Microsoft Office 97–2003 documents (.DOC, .DOT, .XLS, .XLT, .PPT, .POT, .PUB, .VSD, .MPP, .MSG)
- Macros in Office 2007+ documents (vbaProject.bin file embedded in document)
- Windows Installer Files (.MSI, .MSP, .MST)
- Microsoft Picture It! / Microsoft Digital Image files (.MIX)
- Internet Explorer RSS feeds / Windows RSS Platform files (.feed-ms)
- Windows 7 Sticky Notes (.SNT)
- Windows 7 jump list files
- Thumbs.db
- Microsoft SQL 2000 Server DTS packages
- Autodesk Revit
- Autodesk Inventor
- FlashPix
- Altium Designer
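To tell whether an arbitrary file uses this format, one option is to look at its first eight bytes: structured storage (compound) files start with the signature D0 CF 11 E0 A1 B1 1A E1. The following is a minimal sketch, not part of the original script; the function name `Test-CompoundFileSignature` is a hypothetical helper introduced here for illustration.

```powershell
# Hypothetical helper (an assumption, not from the original script): returns $true
# when the file starts with the 8-byte compound file signature.
function Test-CompoundFileSignature
{
    param([Parameter(Mandatory = $true)][String]$Path)

    [Byte[]]$signature = 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1

    # Resolve to a full path first; .NET file APIs use the process working
    # directory, which can differ from the current PowerShell location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $stream = [System.IO.File]::OpenRead($fullPath)
    try
    {
        $header = New-Object Byte[] 8
        $read = $stream.Read($header, 0, $header.Length)
    }
    finally
    {
        $stream.Dispose()
    }

    # Files shorter than the signature cannot match.
    if ($read -lt $signature.Length) { return $false }

    for ($i = 0; $i -lt $signature.Length; $i++)
    {
        if ($header[$i] -ne $signature[$i]) { return $false }
    }
    return $true
}
```

A matching signature only says the container is structured storage; it says nothing about which application wrote it or how its streams are laid out.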
If you look through the .NET documentation you will find the APIs for manipulating structured storage files in the System.IO.Packaging namespace, but there seems to be no public method to actually open your own file. In the early days of .NET this may have been available, as some old MSDN documentation preserved on the Web Archive suggests.
The APIs mentioned in that documentation are still there; they are just marked internal rather than public. Using reflection we can still load and execute these functions to parse these types of files. We will load the file into a System.IO.Packaging.StorageInfo object.
Note that this only gives you access to the individual streams within the structured storage file. In many of these documents the streams themselves are in a proprietary format that you will also need to work out how to parse.
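Because the approach depends on non-public members, it can be worth checking what is actually present in the installed WindowsBase assembly before relying on it. The sketch below only lists the methods declared on the internal StorageRoot type. It assumes a Windows machine where WindowsBase can be loaded; the variable names are illustrative.

```powershell
Add-Type -AssemblyName WindowsBase

# Look up the internal type by name; $false means "return $null instead of throwing".
$assembly = [System.IO.Packaging.StorageInfo].Assembly
$storageRootType = $assembly.GetType("System.IO.Packaging.StorageRoot", $false, $false)

if ($null -eq $storageRootType)
{
    Write-Warning "StorageRoot was not found in this version of WindowsBase."
}
else
{
    # Include static and instance members, public and non-public,
    # but only those declared on StorageRoot itself.
    $flags = [System.Reflection.BindingFlags]"Static, Instance, Public, NonPublic, DeclaredOnly"

    # IsAssembly is $true for methods with internal visibility.
    $storageRootType.GetMethods($flags) |
        Sort-Object Name |
        Select-Object Name, IsStatic, IsAssembly
}
```

Internal members are not covered by any compatibility guarantee, so a check like this is a cheap way to fail with a clear message if a future framework version changes them.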
In this example we will get the streams from a Word 2003 document and display their contents as bytes:
Add-Type -AssemblyName WindowsBase
Function Invoke-StorageRootMethod
{
    param(
        [System.IO.Packaging.StorageInfo]$storageRoot,
        [String]$methodName,
        [Object[]]$methodArgs)

    # StorageRoot is internal to WindowsBase, so look the type up by name.
    $assembly = [System.IO.Packaging.StorageInfo].Assembly
    $storageRootType = $assembly.GetType("System.IO.Packaging.StorageRoot", $true, $false)

    # Invoke the named method, whether static or instance, public or non-public.
    # Pass $null as $storageRoot when calling a static method such as Open.
    $result = $storageRootType.InvokeMember(`
        $methodName,`
        [System.Reflection.BindingFlags]::Static -bor `
        [System.Reflection.BindingFlags]::Instance -bor `
        [System.Reflection.BindingFlags]::Public -bor `
        [System.Reflection.BindingFlags]::NonPublic -bor `
        [System.Reflection.BindingFlags]::InvokeMethod, `
        $null, `
        $storageRoot, `
        $methodArgs)
    return $result
}
Function Format-Bytes
{
    param([Byte[]]$bytes,[Int]$Width = 20,[Int]$MaxBytes = 0)

    # A MaxBytes of 0 (or anything beyond the data) means "dump everything".
    if ($MaxBytes -le 0 -or $MaxBytes -gt $bytes.Length)
    {
        $MaxBytes = $bytes.Length
    }
    $stringBuilder = New-Object System.Text.StringBuilder
    For ($x = 0; $x -lt $MaxBytes; $x+=$Width )
    {
        # Hex column: two hex digits and a space per byte.
        for ($y = $x; $y -lt $x+$Width -and $y -lt $MaxBytes;$y++)
        {
            [void]$stringBuilder.Append([String]::Format("{0:X2} ",$bytes[$y]))
        }
        # Pad a short final row so the text column stays aligned.
        if ($y -lt $x+$Width)
        {
            for ($y = $y; $y -lt $x+$Width;$y++)
            {
                [void]$stringBuilder.Append("   ")
            }
        }
        [void]$stringBuilder.Append("| ")
        # Text column: printable ASCII as-is, everything else as a dot.
        for ($y = $x; $y -lt $x+$Width -and $y -lt $MaxBytes;$y++)
        {
            if ($bytes[$y] -lt 32 -or $bytes[$y] -gt 126)
            {
                [void]$stringBuilder.Append(".")
            }
            else
            {
                [void]$stringBuilder.Append([Char]$bytes[$y])
            }
        }
        [void]$stringBuilder.AppendLine()
    }
    $stringBuilder.ToString()
}
Function Open-StructuredStorage
{
    param([String]$Path)

    # StorageRoot.Open is static, so no instance is passed as the target.
    return (`
        Invoke-StorageRootMethod `
            -storageRoot $null `
            -methodName "Open" `
            -methodArgs @( `
                $Path, `
                [System.IO.FileMode]::Open, `
                [System.IO.FileAccess]::Read, `
                [System.IO.FileShare]::Read))
}
Function Close-StructuredStorage
{
    param([System.IO.Packaging.StorageInfo]$StorageInfo)
    return (`
        Invoke-StorageRootMethod `
            -storageRoot $StorageInfo `
            -methodName "Close" `
            -methodArgs $null)
}
$documentPath = "C:\support\test.doc"
$storageInfo = Open-StructuredStorage -Path $documentPath
if ($storageInfo -eq $null)
{
    Write-Error "Unable to open '$documentPath' as structured storage file"
    return
}
ForEach ($stream in $storageInfo.GetStreams())
{
    Write-Host "Stream Name"
    $stream.Name
    $reader = New-Object System.IO.BinaryReader($stream.GetStream())
    # ReadBytes takes an Int32 count, so cast the Int64 stream length explicitly.
    $data = $reader.ReadBytes([Int]($reader.BaseStream.Length))
    $reader.Close()
    Format-Bytes -bytes $data
}
Close-StructuredStorage -StorageInfo $storageInfo
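The loop above only visits streams at the root. Structured storage files can also contain nested storages, which StorageInfo exposes through its public GetSubStorages() method. The following is a rough sketch of a recursive walk that prints the names of all storages and streams; `Show-StorageTree` is a hypothetical helper name introduced here, and it assumes the storage is still open when it is called.

```powershell
Add-Type -AssemblyName WindowsBase

# Hypothetical helper (not part of the original script): emits one line per
# stream and storage, indented by nesting depth.
function Show-StorageTree
{
    param(
        [System.IO.Packaging.StorageInfo]$Storage,
        [Int]$Depth = 0)

    $indent = "  " * $Depth

    # Streams directly inside this storage.
    foreach ($stream in $Storage.GetStreams())
    {
        "{0}[stream]  {1}" -f $indent, $stream.Name
    }

    # Nested storages, visited recursively.
    foreach ($child in $Storage.GetSubStorages())
    {
        "{0}[storage] {1}" -f $indent, $child.Name
        Show-StorageTree -Storage $child -Depth ($Depth + 1)
    }
}

# Example usage, reusing the functions defined earlier in this post.
# Call it before Close-StructuredStorage:
#   $root = Open-StructuredStorage -Path "C:\support\test.doc"
#   Show-StorageTree -Storage $root
#   Close-StructuredStorage -StorageInfo $root
```

Some stream names begin with control characters, so they may look odd or partly blank when printed to the console.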