cfb

package module
v0.1.0 Latest
Published: May 17, 2026 License: MIT Imports: 16 Imported by: 0

README

go-cfb: Pure Go Microsoft Compound File Binary Reader & Writer

A pure Go library for reading and writing Microsoft Compound File Binary (CFB) files - the COM Structured Storage (OLE2) container format used by .msg, .doc, .xls, .msi, and similar files.

See the [MS-CFB] Compound File Binary File Format specification for details.

Installation

go get github.com/abemedia/go-cfb

Usage

Reading a Compound File

A compound file is a tree of storages (directories) and streams (files). Open one and walk its entries:

r, err := cfb.OpenReader("archive.cfb")
if err != nil {
  return err
}
defer r.Close()

var walk func(s *cfb.Storage)
walk = func(s *cfb.Storage) {
  for _, e := range s.Entries {
    // Every entry is one of exactly two types: *cfb.Storage or *cfb.Stream.
    switch e := e.(type) {
    case *cfb.Storage:
      fmt.Println(e.Name + "/")
      walk(e)
    case *cfb.Stream:
      fmt.Printf("%s (%d bytes)\n", e.Name, e.Size)
    }
  }
}
walk(r.Storage)

A specific stream or storage can be looked up by name (case-insensitive) with OpenStream and OpenStorage:

s, err := r.OpenStream("\x05SummaryInformation")
if err != nil {
  return err
}
data, err := io.ReadAll(s.Open())
if err != nil {
  return err
}

The Reader also implements fs.FS, so it works with fs.WalkDir, fs.ReadFile, and other standard library functions:

fs.WalkDir(r, ".", func(path string, d fs.DirEntry, err error) error {
  if err != nil {
    return err // propagate traversal errors before touching path
  }
  fmt.Println(path)
  return nil
})

Stream additionally implements io.ReaderAt, which is stateless and safe for concurrent use.

Writing a Compound File

Choose a version: NewWriterV3 (512-byte sectors) or NewWriterV4 (4096-byte sectors); v4 is typical for large files such as modern MSIs.

f, err := os.Create("archive.cfb")
if err != nil {
  return err
}
defer f.Close()

w := cfb.NewWriterV4(f)

s, err := w.CreateStream("hello.txt")
if err != nil {
  return err
}

if _, err = s.Write([]byte("hello world")); err != nil {
  return err
}

// Streams must be closed before the writer.
if err := s.Close(); err != nil {
  return err
}

// Make sure to check the error on Close.
if err := w.Close(); err != nil {
  return err
}

Nested storages are created with CreateStorage, which returns a writer with the same CreateStream / CreateStorage methods:

sub, err := w.CreateStorage("Data")
if err != nil {
  return err
}
s, err := sub.CreateStream("part1")
if err != nil {
  return err
}

To pack an entire fs.FS in a single call, use AddFS:

if err := w.AddFS(os.DirFS("./my-app-files")); err != nil {
  return err
}

CreateStream, CreateStorage, and concurrent Writes on distinct streams are safe to call from multiple goroutines.

See the package documentation for further examples.

Documentation

Overview

Package cfb reads and writes Microsoft Compound File Binary (CFB) files, the container format used by .msg, .doc, .xls, .msi, and other COM Structured Storage consumers.

The format is specified in MS-CFB, "Compound File Binary File Format". This implementation supports both v3 (512-byte sectors) and v4 (4096-byte sectors).
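
To make the version and sector-size relationship concrete, here is a small sketch that inspects the first bytes of a file by hand. The offsets follow MS-CFB (signature at 0, major version at 26, sector shift at 30); parseHeader, headerInfo, and fakeHeader are hypothetical helpers for illustration and say nothing about how this package parses headers internally.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// signature is the 8-byte magic number at offset 0 of a compound file.
var signature = []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}

var errNotCFB = errors.New("not a compound file")

// headerInfo holds the two header fields this sketch extracts.
type headerInfo struct {
	MajorVersion uint16
	SectorSize   int
}

// parseHeader checks the signature and derives the sector size from the
// sector shift: 9 (512 bytes) for v3, 12 (4096 bytes) for v4.
func parseHeader(b []byte) (headerInfo, error) {
	if len(b) < 32 || !bytes.Equal(b[:8], signature) {
		return headerInfo{}, errNotCFB
	}
	major := binary.LittleEndian.Uint16(b[26:28])
	shift := binary.LittleEndian.Uint16(b[30:32])
	if (major == 3 && shift == 9) || (major == 4 && shift == 12) {
		return headerInfo{MajorVersion: major, SectorSize: 1 << shift}, nil
	}
	return headerInfo{}, fmt.Errorf("unsupported version %d with sector shift %d", major, shift)
}

// fakeHeader builds a 32-byte header prefix for demonstration only;
// it is not a complete, valid CFB header.
func fakeHeader(major, shift uint16) []byte {
	b := make([]byte, 32)
	copy(b, signature)
	binary.LittleEndian.PutUint16(b[26:], major)
	binary.LittleEndian.PutUint16(b[30:], shift)
	return b
}

func main() {
	for _, hdr := range [][]byte{fakeHeader(3, 9), fakeHeader(4, 12), []byte("plain text")} {
		info, err := parseHeader(hdr)
		fmt.Println(info, err)
	}
}
```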

Constants

This section is empty.

Variables

var ErrFormat = errors.New("cfb: not a valid CFB file")

ErrFormat is returned when a CFB file's structure is invalid.

var ErrNotFound = errors.New("cfb: entry not found")

ErrNotFound is returned when a named stream or storage does not exist.

Functions

This section is empty.

Types

type Entry

type Entry interface {
	// contains filtered or unexported methods
}

Entry is the sealed sum type for storage children. Implementations are *Storage and *Stream; type-switch to discriminate.
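
The sealed-interface pattern can be sketched in a few lines: an interface with an unexported method can only be implemented inside its own package, so a type switch over the known implementations is exhaustive. The names below (entry, isEntry, storage, stream) are hypothetical lowercase stand-ins; the package's actual unexported method is not shown in this documentation.

```go
package main

import "fmt"

// entry is sealed: the unexported method prevents other packages from
// adding implementations.
type entry interface{ isEntry() }

type storage struct {
	name    string
	entries []entry
}

type stream struct {
	name string
	size int64
}

func (*storage) isEntry() {}
func (*stream) isEntry()  {}

// totalSize sums stream sizes recursively, discriminating on the two
// possible implementations with a type switch.
func totalSize(s *storage) int64 {
	var n int64
	for _, e := range s.entries {
		switch e := e.(type) {
		case *storage:
			n += totalSize(e)
		case *stream:
			n += e.size
		}
	}
	return n
}

func main() {
	root := &storage{name: "Root Entry", entries: []entry{
		&stream{name: "hello.txt", size: 11},
		&storage{name: "Data", entries: []entry{
			&stream{name: "part1", size: 100},
		}},
	}}
	fmt.Println(totalSize(root))
}
```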

type ReadCloser

type ReadCloser struct {
	*Reader
	// contains filtered or unexported fields
}

A ReadCloser is a Reader that must be closed when no longer needed.

func OpenReader

func OpenReader(name string) (*ReadCloser, error)

OpenReader opens the named CFB file.

func (*ReadCloser) Close

func (rc *ReadCloser) Close() error

Close closes the CFB file, rendering it unusable for I/O.

type Reader

type Reader struct {
	*Storage

	// Version is the CFB major version (3 for 512-byte sectors, 4 for 4096).
	Version uint16
	// contains filtered or unexported fields
}

A Reader serves content from a CFB file.

Example
// Open a compound file for reading.
r, err := cfb.OpenReader("testdata/example.cfb")
if err != nil {
	log.Fatal(err)
}
defer r.Close()

// Iterate through the streams in the compound file,
// printing some of their contents.
for _, e := range r.Entries {
	s, ok := e.(*cfb.Stream)
	if !ok {
		continue
	}
	fmt.Printf("Contents of %s:\n", s.Name)
	_, err = io.Copy(os.Stdout, s.Open())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println()
}
Output:
Contents of README.md:
This is an example CFB file.

func NewReader

func NewReader(r io.ReaderAt) (*Reader, error)

NewReader creates a new Reader reading from r.

func (*Reader) Open

func (r *Reader) Open(name string) (fs.File, error)

Open opens the named file in the CFB file using the semantics of fs.FS.Open. Paths are always slash separated, with no leading / or ../ elements.

type Storage

type Storage struct {
	// Name is the storage's name on its parent.
	Name string

	// CLSID is the GUID identifying the COM class of the entry.
	CLSID [16]byte

	// StateBits is the application-defined user flags word for the entry, opaque to CFB.
	StateBits uint32

	// Created is the time the entry was created.
	Created time.Time

	// Modified is the time the entry was last modified.
	Modified time.Time

	// Entries holds the storage's children, sorted by name length, then by
	// case-insensitive UTF-16 code-unit comparison.
	Entries []Entry
}

Storage is a directory-like CFB entry holding child Stream and Storage objects.

func (*Storage) OpenStorage

func (s *Storage) OpenStorage(name string) (*Storage, error)

OpenStorage finds a child storage by name (case-insensitive). Returns ErrNotFound if the name is unknown or refers to a stream.

func (*Storage) OpenStream

func (s *Storage) OpenStream(name string) (*Stream, error)

OpenStream finds a child stream by name (case-insensitive). Returns ErrNotFound if the name is unknown or refers to a storage.

type StorageWriter

type StorageWriter struct {
	// CLSID is the GUID identifying the COM class of the entry.
	CLSID [16]byte

	// StateBits is the application-defined user flags word for the entry, opaque to CFB.
	StateBits uint32

	// Created is the time the entry was created.
	Created time.Time

	// Modified is the time the entry was last modified.
	Modified time.Time
	// contains filtered or unexported fields
}

A StorageWriter adds a storage to a CFB file. Set the exported fields to configure entry metadata.

func (*StorageWriter) AddFS

func (sw *StorageWriter) AddFS(fsys fs.FS) error

AddFS adds the files from fs.FS to the storage. It walks the directory tree starting at the root of the filesystem adding each file to the CFB while maintaining the directory structure.

When fs.FileInfo.Sys returns a *Storage or *Stream, its metadata fields are preserved on the new entry.

func (*StorageWriter) CreateStorage

func (sw *StorageWriter) CreateStorage(name string) (*StorageWriter, error)

CreateStorage adds a child storage to the storage using the provided name and returns a *StorageWriter.

func (*StorageWriter) CreateStream

func (sw *StorageWriter) CreateStream(name string) (*StreamWriter, error)

CreateStream adds a stream to the storage using the provided name and returns a *StreamWriter to which the stream contents should be written. The StreamWriter must be closed before Writer.Close.

type Stream

type Stream struct {
	// Name is the stream's name on its parent.
	Name string

	// StateBits is the application-defined user flags word for the entry, opaque to CFB.
	StateBits uint32

	// Size is the length of the stream's content in bytes.
	Size int64
	// contains filtered or unexported fields
}

Stream is a stream entry. ReadAt is stateless and safe for concurrent use.

func (*Stream) Open

func (s *Stream) Open() io.ReadSeeker

Open returns a fresh io.ReadSeeker positioned at the start of the stream.

func (*Stream) ReadAt

func (s *Stream) ReadAt(p []byte, off int64) (n int, err error)

ReadAt reads up to len(p) bytes starting at off. Reads past Size return io.EOF; reads against a truncated chain return io.ErrUnexpectedEOF.

type StreamWriter

type StreamWriter struct {
	// StateBits is the application-defined user flags word for the entry, opaque to CFB.
	StateBits uint32
	// contains filtered or unexported fields
}

A StreamWriter adds a stream to a CFB file. Set the StateBits field to configure entry metadata.

func (*StreamWriter) Close

func (s *StreamWriter) Close() error

Close finishes writing the stream. It must be called before Writer.Close.

func (*StreamWriter) Write

func (s *StreamWriter) Write(p []byte) (int, error)

Write writes len(p) bytes from p to the stream. It returns the number of bytes written and an error, if any.

type Writer

type Writer struct {
	*StorageWriter
	// contains filtered or unexported fields
}

Writer implements a CFB file writer.

CreateStream, CreateStorage, and concurrent Writes on distinct StreamWriter values are safe to call from multiple goroutines.

Example
// Create a file to write our compound file to.
f, err := os.CreateTemp("", "example-*.cfb")
if err != nil {
	log.Fatal(err)
}
defer os.Remove(f.Name())
defer f.Close()

// Create a new compound file.
w := cfb.NewWriterV3(f)

// Add some streams to the compound file.
files := []struct {
	Name, Body string
}{
	{"readme.txt", "This archive contains some text files."},
	{"gopher.txt", "Gopher names:\nGeorge\nGeoffrey\nGonzo"},
	{"todo.txt", "Get animal handling licence.\nWrite more examples."},
}
for _, file := range files {
	s, err := w.CreateStream(file.Name)
	if err != nil {
		log.Fatal(err)
	}
	_, err = s.Write([]byte(file.Body))
	if err != nil {
		log.Fatal(err)
	}
	if err := s.Close(); err != nil {
		log.Fatal(err)
	}
}

// Make sure to check the error on Close.
if err := w.Close(); err != nil {
	log.Fatal(err)
}

func NewWriterV3

func NewWriterV3(w io.WriteSeeker) *Writer

NewWriterV3 returns a Writer that produces a CFB v3 (512-byte sector) file.

func NewWriterV4

func NewWriterV4(w io.WriteSeeker) *Writer

NewWriterV4 returns a Writer that produces a CFB v4 (4096-byte sector) file.
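
One way to reason about the choice between the two sector sizes is to estimate how many sectors a stream occupies and how much of the last sector is padding. The arithmetic below is a deliberate simplification: sectorsFor and padding are hypothetical helpers, and they ignore the header, FAT, and directory sectors as well as the mini stream that MS-CFB uses for streams smaller than 4096 bytes.

```go
package main

import "fmt"

// sectorsFor returns how many sectors of sectorSize bytes are needed to
// hold size bytes (ceiling division).
func sectorsFor(size, sectorSize int64) int64 {
	if size <= 0 {
		return 0
	}
	return (size + sectorSize - 1) / sectorSize
}

// padding returns the unused bytes in the last sector.
func padding(size, sectorSize int64) int64 {
	return sectorsFor(size, sectorSize)*sectorSize - size
}

func main() {
	const (
		v3Sector = 512  // NewWriterV3
		v4Sector = 4096 // NewWriterV4
	)
	for _, size := range []int64{5000, 1 << 20, 100 << 20} {
		fmt.Printf("%10d bytes: v3 %7d sectors (%4d padding), v4 %6d sectors (%4d padding)\n",
			size,
			sectorsFor(size, v3Sector), padding(size, v3Sector),
			sectorsFor(size, v4Sector), padding(size, v4Sector))
	}
}
```

Larger sectors mean fewer allocation-table entries for big streams at the cost of more padding per stream, which is consistent with v4 being the usual choice for large files.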

func (*Writer) Close

func (w *Writer) Close() error

Close finishes writing the CFB file. It does not close the underlying writer.

Every StreamWriter returned by CreateStream must be closed first.
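
The required order (streams first, then the writer) falls out naturally from deferred calls, which run last-in, first-out. The sketch below demonstrates only that ordering and the collection of Close errors via a named return; closer is a hypothetical stand-in that records when it is closed, not a type from this package.

```go
package main

import (
	"errors"
	"fmt"
)

// closer is a stand-in for a writer or stream; Close appends its name to a
// shared log so the order of calls can be observed.
type closer struct {
	name string
	log  *[]string
}

func (c closer) Close() error {
	*c.log = append(*c.log, c.name)
	return nil
}

// writeAll defers the writer's Close first and the stream's Close second,
// so the stream is closed before the writer. Close errors are joined into
// the named return value instead of being dropped.
func writeAll(log *[]string) (err error) {
	w := closer{name: "writer", log: log}
	defer func() { err = errors.Join(err, w.Close()) }()

	s := closer{name: "stream", log: log}
	defer func() { err = errors.Join(err, s.Close()) }()

	// Writes to the stream would go here.
	return nil
}

func main() {
	var log []string
	err := writeAll(&log)
	fmt.Println(log, err) // prints [stream writer] <nil>
}
```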

Directories

Path                              Synopsis
internal/casetablegen (command)   Generator for casetable.go in the cfb package root.
internal/cfbtest                  Package cfbtest provides utilities for CFB testing.
internal/istorage                 Package istorage wraps Windows' IStorage / IStream COM API from ole32.dll.
