gomsg

package module
v0.0.0-...-985c3a1 Latest
Published: Apr 7, 2026 License: MIT Imports: 20 Imported by: 0

README

gomsg

Go library for parsing Microsoft Outlook .msg files. Extracts subject, sender, recipients, body (text/HTML/RTF), attachments, dates, and other MAPI properties.

.msg files use the OLE2 (CFB) binary format. The library parses the container and reads MAPI property streams inside it.
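
Every OLE2 (CFB) container starts with the same 8-byte signature, so a quick check can reject files that are clearly not .msg candidates before parsing. The sketch below is standalone and does not use gomsg; `looksLikeCFB` is a hypothetical helper name, and whether the library performs this exact check internally is not stated here.

```go
package main

import (
	"bytes"
	"fmt"
)

// cfbSignature is the 8-byte magic number at the start of every
// OLE2 / Compound File Binary container.
var cfbSignature = []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}

// looksLikeCFB reports whether data begins with the CFB signature.
// Hypothetical helper for illustration; not part of the gomsg API.
func looksLikeCFB(data []byte) bool {
	return bytes.HasPrefix(data, cfbSignature)
}

func main() {
	header := []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00}
	fmt.Println(looksLikeCFB(header))
	fmt.Println(looksLikeCFB([]byte("From: someone")))
}
```

A matching signature only shows that the file is a CFB container. Other formats (for example legacy .doc and .xls) share it, so it is a pre-filter, not proof of a valid .msg file.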

Documentation in Russian

Installation

go get github.com/AkmalOt/gomsg

Usage

As a library

package main

import (
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/AkmalOt/gomsg"
)

func main() {
    msg, err := gomsg.Open("email.msg")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Subject:", msg.Subject)
    fmt.Println("From:", msg.SenderName, "<"+msg.SenderSMTP+">")
    fmt.Println("To:", msg.DisplayTo)
    fmt.Println("Date:", msg.Date)
    fmt.Println("Body:", msg.Body)

    // Extract attachments into the current directory.
    for _, a := range msg.Attachments {
        fmt.Printf("Attachment: %s (%d bytes)\n", a.DisplayName(), a.Size)
        // Keep only the base name so a crafted attachment name cannot escape the directory.
        name := filepath.Base(a.DisplayName())
        if err := os.WriteFile(name, a.Data(), 0644); err != nil {
            log.Printf("writing %s: %v", name, err)
        }
    }
}

CLI tool

The library comes with a msgdump utility:

go install github.com/AkmalOt/gomsg/cmd/msgdump@latest

# Print email summary
msgdump email.msg

# JSON output
msgdump -json email.msg

# Extract attachments to a directory
msgdump -extract ./attachments email.msg

# Print message body
msgdump -body email.msg

# Print transport headers
msgdump -headers email.msg

Example output:

Subject:      Test message
From:         John <john@example.com>
To:           jane@example.com
Date:         2024-01-15 12:30:00 UTC
Class:        IPM.Note
Importance:   Normal

Recipients (1):
  [To] Jane <jane@example.com>

Attachments (1):
  document.pdf (application/pdf, 45230 bytes)

Available fields

Field              Property
Subject            Message.Subject
Body (plain text)  Message.Body
Body (HTML)        Message.BodyHTML
Sender name        Message.SenderName
Sender email       Message.SenderSMTP
To                 Message.DisplayTo
CC                 Message.DisplayCC
BCC                Message.DisplayBCC
Recipient list     Message.Recipients
Attachments        Message.Attachments
Date sent          Message.Date
Delivery time      Message.DeliveryTime
Message class      Message.MessageClass
Importance         Message.Importance
Message-ID         Message.MessageID
Transport headers  Message.Headers

Any MAPI property can also be accessed directly via Message.Properties.
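
Message.Headers in the table above is a single string. Assuming it holds a standard RFC 5322 header block (an assumption; the field's exact format is not specified here), it can be split into individual headers with the standard library. The sketch below does not import gomsg, and `parseTransportHeaders` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"log"
	"net/mail"
	"strings"
)

// parseTransportHeaders parses a raw header block such as the string in
// Message.Headers. Hypothetical helper; it assumes RFC 5322 style input.
func parseTransportHeaders(raw string) (mail.Header, error) {
	// mail.ReadMessage expects a blank line after the headers, so
	// normalize the trailing line breaks before parsing.
	block := strings.TrimRight(raw, "\r\n") + "\r\n\r\n"
	msg, err := mail.ReadMessage(strings.NewReader(block))
	if err != nil {
		return nil, err
	}
	return msg.Header, nil
}

func main() {
	raw := "Subject: Test message\r\nFrom: John <john@example.com>\r\n"
	h, err := parseTransportHeaders(raw)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("Subject:", h.Get("Subject"))
	fmt.Println("From:", h.Get("From"))
}
```

With gomsg, the input would be `msg.Headers` from a parsed message. The field may be empty for messages that were never sent through a transport.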

Attachments

The library extracts file attachments and supports embedded .msg files (email inside email):

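for _, a := range msg.Attachments {
    if a.IsEmbeddedMessage() {
        // EmbeddedMessage returns nil if the attachment is not embedded, so guard against it.
        if inner := a.EmbeddedMessage(); inner != nil {
            fmt.Println("Embedded message:", inner.Subject)
        }
        continue
    }
    if err := os.WriteFile(a.DisplayName(), a.Data(), 0644); err != nil {
        log.Printf("writing %s: %v", a.DisplayName(), err)
    }
}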
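
Attachment names come from the message itself, so they should be treated as untrusted when used as file paths. The following standalone sketch shows one way to confine output to a target directory; `safeAttachmentPath` and the `attachment.bin` fallback name are hypothetical and not part of gomsg.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeAttachmentPath maps an untrusted attachment name to a path inside dir.
// Hypothetical helper for illustration; not part of the gomsg API.
func safeAttachmentPath(dir, name string) string {
	// Treat Windows separators as separators on every OS, then keep
	// only the final path element.
	name = strings.ReplaceAll(name, `\`, "/")
	base := filepath.Base(filepath.FromSlash(name))
	// Base returns "." for an empty name and a bare separator for a
	// name made only of separators; fall back to a fixed name.
	if base == "." || base == ".." || base == string(filepath.Separator) {
		base = "attachment.bin"
	}
	return filepath.Join(dir, base)
}

func main() {
	fmt.Println(safeAttachmentPath("attachments", "document.pdf"))
	fmt.Println(safeAttachmentPath("attachments", "../../etc/passwd"))
	fmt.Println(safeAttachmentPath("attachments", `..\..\evil.exe`))
}
```

In the loop above, the result of `safeAttachmentPath(dir, a.DisplayName())` would replace the bare `a.DisplayName()` passed to `os.WriteFile`. The sketch does not handle name collisions between attachments.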

Encoding support

Handles UTF-16LE (Unicode) strings and a range of legacy encodings: Windows codepages 1250-1258, KOI8-R, KOI8-U, ISO-8859, Shift-JIS, EUC-KR, GBK, Big5, and others. The encoding is detected automatically from message properties.

Dependencies

License

MIT

Documentation

Overview

Package gomsg parses Microsoft Outlook .msg files and extracts email fields such as subject, sender, recipients, body, and attachments.

Usage:

msg, err := gomsg.Open("email.msg")
if err != nil {
    log.Fatal(err)
}
fmt.Println(msg.Subject)
fmt.Println(msg.SenderName, msg.SenderEmail)
for _, a := range msg.Attachments {
    fmt.Println(a.FileName, a.Size)
}

Constants

This section is empty.

Variables

var (
	// ErrNotMSG indicates the file is not a valid MSG file.
	ErrNotMSG = errors.New("gomsg: not a valid MSG file")

	// ErrInvalidCFB indicates the CFB container is malformed.
	ErrInvalidCFB = errors.New("gomsg: invalid CFB container")

	// ErrNoProperties indicates the __properties_version1.0 stream is missing.
	ErrNoProperties = errors.New("gomsg: missing properties stream")

	// ErrPropertyType indicates an unexpected property type was encountered.
	ErrPropertyType = errors.New("gomsg: unexpected property type")
)
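
Sentinel errors like these are usually matched with `errors.Is`, which also works when the error has been wrapped with extra context. The sketch below uses local stand-ins (`errNotMSG`, `errInvalidCFB`) instead of importing gomsg, and whether the library wraps its errors is an assumption; `describe` and `wrap` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for gomsg.ErrNotMSG and gomsg.ErrInvalidCFB so the
// example runs without the library.
var (
	errNotMSG     = errors.New("not a valid MSG file")
	errInvalidCFB = errors.New("invalid CFB container")
)

// describe classifies an error by comparing it against the sentinels.
func describe(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errNotMSG):
		return "not an MSG file"
	case errors.Is(err, errInvalidCFB):
		return "corrupt container"
	default:
		return "other error"
	}
}

// wrap adds file context while keeping the sentinel reachable via %w.
func wrap(path string, err error) error {
	return fmt.Errorf("open %s: %w", path, err)
}

func main() {
	fmt.Println(describe(wrap("report.msg", errNotMSG)))
	fmt.Println(describe(errInvalidCFB))
	fmt.Println(describe(errors.New("disk full")))
}
```

With the real package, the comparisons would read `errors.Is(err, gomsg.ErrNotMSG)` and so on.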
var (
	PSPublicStrings = [16]byte{
		0x29, 0x03, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
		0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46,
	}
	PSInternetHeaders = [16]byte{
		0x20, 0x38, 0x60, 0x00, 0xFA, 0xB0, 0x9C, 0x01,
		0x10, 0x49, 0x00, 0x00, 0x0B, 0x6B, 0x3A, 0x03,
	}
)

Well-known property set GUIDs.
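
The byte arrays above appear to use the Windows in-memory GUID layout, where the first three fields are little-endian. That reading is an assumption, though the PSPublicStrings bytes match it: they correspond to the well-known PS_PUBLIC_STRINGS GUID 00020329-0000-0000-C000-000000000046. A standalone sketch of the conversion to the canonical text form (`guidString` is a hypothetical helper):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// guidString formats a 16-byte GUID stored in Windows layout
// (little-endian Data1, Data2, Data3, then 8 raw bytes) as text.
// Hypothetical helper; not part of the gomsg API.
func guidString(g [16]byte) string {
	return fmt.Sprintf("%08X-%04X-%04X-%X-%X",
		binary.LittleEndian.Uint32(g[0:4]),
		binary.LittleEndian.Uint16(g[4:6]),
		binary.LittleEndian.Uint16(g[6:8]),
		g[8:10], g[10:16])
}

func main() {
	// Same bytes as PSPublicStrings above, copied locally so the
	// example does not need to import gomsg.
	psPublicStrings := [16]byte{
		0x29, 0x03, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
		0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46,
	}
	fmt.Println(guidString(psPublicStrings))
}
```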

Functions

This section is empty.

Types

type AttachMethod

type AttachMethod int32

AttachMethod specifies how attachment data is stored.

const (
	AttachByValue         AttachMethod = 1 // Binary data in PidTagAttachDataBinary
	AttachByReference     AttachMethod = 2 // Path reference
	AttachByRefOnly       AttachMethod = 4 // Path reference only
	AttachEmbeddedMessage AttachMethod = 5 // Embedded MSG in sub-storage
	AttachOLE             AttachMethod = 6 // OLE object
)

type Attachment

type Attachment struct {
	FileName  string
	LongName  string
	Extension string
	MIMEType  string
	ContentID string
	Size      int64
	Method    AttachMethod

	Properties *PropertyStore
	// contains filtered or unexported fields
}

Attachment represents a file or embedded message attached to the email.

func (*Attachment) Data

func (a *Attachment) Data() []byte

Data returns the attachment's binary content.

func (*Attachment) DisplayName

func (a *Attachment) DisplayName() string

DisplayName returns the best available filename for the attachment.

func (*Attachment) EmbeddedMessage

func (a *Attachment) EmbeddedMessage() *Message

EmbeddedMessage returns the parsed embedded Message, or nil if not embedded.

func (*Attachment) IsEmbeddedMessage

func (a *Attachment) IsEmbeddedMessage() bool

IsEmbeddedMessage returns true if this attachment is an embedded .msg file.

type Importance

type Importance int32

Importance represents the message priority level.

const (
	ImportanceLow    Importance = 0
	ImportanceNormal Importance = 1
	ImportanceHigh   Importance = 2
)

type Message

type Message struct {
	Subject  string
	Body     string
	BodyHTML []byte
	BodyRTF  []byte

	SenderName  string
	SenderEmail string
	SenderSMTP  string
	SenderType  string // address type: "SMTP", "EX", etc.

	DisplayTo  string
	DisplayCC  string
	DisplayBCC string

	Recipients  []Recipient
	Attachments []Attachment

	Date         time.Time
	DeliveryTime time.Time
	MessageClass string
	Importance   Importance
	MessageID    string
	InReplyTo    string
	Headers      string

	ConversationTopic string

	Properties *PropertyStore
}

Message represents a parsed Outlook .msg file.

func Decode

func Decode(r io.Reader) (*Message, error)

Decode parses an MSG file from a reader by reading all data into memory.

func Open

func Open(path string) (*Message, error)

Open parses an MSG file from the given path.

func OpenReader

func OpenReader(r io.ReaderAt, size int64) (*Message, error)

OpenReader parses an MSG file from an io.ReaderAt.
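
Decode takes a plain io.Reader while OpenReader needs random access through io.ReaderAt. One plausible way to bridge the two, sketched below with the standard library only, is to buffer the stream and wrap it in a bytes.Reader. This is not necessarily how gomsg implements Decode; `bufferToReaderAt` and `readChunk` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"strings"
)

// bufferToReaderAt reads r fully into memory and returns a random-access
// view of the data together with its size.
func bufferToReaderAt(r io.Reader) (io.ReaderAt, int64, error) {
	data, err := io.ReadAll(r)
	if err != nil {
		return nil, 0, err
	}
	return bytes.NewReader(data), int64(len(data)), nil
}

// readChunk buffers src and then reads n bytes starting at offset off,
// which a sequential io.Reader alone could not do without re-reading.
func readChunk(src string, off int64, n int) (int64, string, error) {
	ra, size, err := bufferToReaderAt(strings.NewReader(src))
	if err != nil {
		return 0, "", err
	}
	buf := make([]byte, n)
	if _, err := ra.ReadAt(buf, off); err != nil {
		return size, "", err
	}
	return size, string(buf), nil
}

func main() {
	size, chunk, err := readChunk("stream without random access", 7, 6)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(size, chunk)
}
```

The trade-off is memory: the whole input is held in RAM. For large files already on disk, Open or OpenReader with an *os.File avoids the extra copy.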

type NamedPropertyMapping

type NamedPropertyMapping struct {
	// contains filtered or unexported fields
}

NamedPropertyMapping holds the mapping from named property IDs to their MAPI property set GUID and name/ID.

type Property

type Property struct {
	ID    PropertyID
	Type  PropertyType
	Flags uint32
	Value interface{}
}

Property represents a single MAPI property with its parsed value.

type PropertyID

type PropertyID uint16

PropertyID represents a MAPI property identifier.

const (
	PidTagMessageClass            PropertyID = 0x001A
	PidTagSubject                 PropertyID = 0x0037
	PidTagSubjectPrefix           PropertyID = 0x003D
	PidTagClientSubmitTime        PropertyID = 0x0039
	PidTagSentRepresentingName    PropertyID = 0x0042
	PidTagSentRepresentingEmail   PropertyID = 0x0065
	PidTagImportance              PropertyID = 0x0017
	PidTagSensitivity             PropertyID = 0x0036
	PidTagTransportMessageHeaders PropertyID = 0x007D
	PidTagDisplayTo               PropertyID = 0x0E04
	PidTagDisplayCC               PropertyID = 0x0E03
	PidTagDisplayBCC              PropertyID = 0x0E02
	PidTagMessageFlags            PropertyID = 0x0E07
	PidTagNormalizedSubject       PropertyID = 0x0E1D
	PidTagHasAttachments          PropertyID = 0x0E1B
	PidTagBody                    PropertyID = 0x1000
	PidTagBodyHTML                PropertyID = 0x1013
	PidTagRTFCompressed           PropertyID = 0x1009
	PidTagInternetMessageID       PropertyID = 0x1035
	PidTagInReplyToID             PropertyID = 0x1042
	PidTagSenderName              PropertyID = 0x0C1A
	PidTagSenderEmailAddress      PropertyID = 0x0C1F
	PidTagSenderAddrType          PropertyID = 0x0C1E
	PidTagSenderSMTPAddress       PropertyID = 0x5D01
	PidTagRecipientType           PropertyID = 0x0C15
	PidTagDisplayName             PropertyID = 0x3001
	PidTagEmailAddress            PropertyID = 0x3003
	PidTagAddrType                PropertyID = 0x3002
	PidTagSMTPAddress             PropertyID = 0x39FE
	PidTagAttachDataBinary        PropertyID = 0x3701
	PidTagAttachDataObject        PropertyID = 0x3701 // same ID, type 0x000D for embedded
	PidTagAttachEncoding          PropertyID = 0x3702
	PidTagAttachFilename          PropertyID = 0x3704
	PidTagAttachMethod            PropertyID = 0x3705
	PidTagAttachLongFilename      PropertyID = 0x3707
	PidTagAttachMIMETag           PropertyID = 0x370E
	PidTagAttachExtension         PropertyID = 0x3703
	PidTagAttachSize              PropertyID = 0x0E20
	PidTagAttachContentID         PropertyID = 0x3712
	PidTagInternetCodepage        PropertyID = 0x3FDE
	PidTagMessageCodepage         PropertyID = 0x3FFD
	PidTagCreationTime            PropertyID = 0x3007
	PidTagLastModificationTime    PropertyID = 0x3008
	PidTagMessageDeliveryTime     PropertyID = 0x0E06
	PidTagConversationTopic       PropertyID = 0x0070
	PidTagConversationIndex       PropertyID = 0x0071
)

Standard MAPI property IDs used in MSG files.

type PropertyStore

type PropertyStore struct {

	// Header fields from root/embedded message context.
	NextRecipientID  uint32
	NextAttachmentID uint32
	RecipientCount   uint32
	AttachmentCount  uint32
	// contains filtered or unexported fields
}

PropertyStore holds all parsed MAPI properties from a properties stream.

func (*PropertyStore) All

func (ps *PropertyStore) All() []PropertyID

All returns all property IDs present in the store.

func (*PropertyStore) Get

func (ps *PropertyStore) Get(id PropertyID) *Property

Get returns a property by its MAPI ID, or nil if not found.

func (*PropertyStore) GetBool

func (ps *PropertyStore) GetBool(id PropertyID) (bool, bool)

GetBool returns a boolean property value.

func (*PropertyStore) GetBytes

func (ps *PropertyStore) GetBytes(id PropertyID) []byte

GetBytes returns a binary property value.

func (*PropertyStore) GetInt32

func (ps *PropertyStore) GetInt32(id PropertyID) (int32, bool)

GetInt32 returns an int32 property value.

func (*PropertyStore) GetString

func (ps *PropertyStore) GetString(id PropertyID) string

GetString returns a string property value, or empty string if missing.

func (*PropertyStore) GetTime

func (ps *PropertyStore) GetTime(id PropertyID) (time.Time, bool)

GetTime returns a time.Time property value.
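
MAPI time properties (type 0x0040) are stored as Windows FILETIME values: a 64-bit count of 100-nanosecond intervals since 1601-01-01 UTC. The standalone sketch below shows the conversion to time.Time; `filetimeToTime` is a hypothetical helper, and how GetTime converts internally is not documented here.

```go
package main

import (
	"fmt"
	"time"
)

// filetimeUnixDiff is the number of 100ns intervals between the FILETIME
// epoch (1601-01-01 UTC) and the Unix epoch (1970-01-01 UTC).
const filetimeUnixDiff = 116444736000000000

// filetimeToTime converts a FILETIME tick count to a UTC time.Time.
// Values above math.MaxInt64 are not handled in this sketch.
func filetimeToTime(ft uint64) time.Time {
	ticks := int64(ft) - filetimeUnixDiff // 100ns ticks relative to 1970
	sec := ticks / 1e7
	nsec := (ticks % 1e7) * 100
	return time.Unix(sec, nsec).UTC()
}

func main() {
	fmt.Println(filetimeToTime(0))
	fmt.Println(filetimeToTime(filetimeUnixDiff))
}
```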

type PropertyType

type PropertyType uint16

PropertyType represents a MAPI property type tag.

const (
	TypeUnspecified PropertyType = 0x0000
	TypeNull        PropertyType = 0x0001
	TypeInt16       PropertyType = 0x0002
	TypeInt32       PropertyType = 0x0003
	TypeFloat32     PropertyType = 0x0004
	TypeFloat64     PropertyType = 0x0005
	TypeCurrency    PropertyType = 0x0006
	TypeAppTime     PropertyType = 0x0007
	TypeBoolean     PropertyType = 0x000B
	TypeObject      PropertyType = 0x000D
	TypeInt64       PropertyType = 0x0014
	TypeString8     PropertyType = 0x001E
	TypeUnicode     PropertyType = 0x001F
	TypeSysTime     PropertyType = 0x0040
	TypeGUID        PropertyType = 0x0048
	TypeBinary      PropertyType = 0x0102

	// Multi-valued types.
	TypeMultiInt16   PropertyType = 0x1002
	TypeMultiInt32   PropertyType = 0x1003
	TypeMultiFloat32 PropertyType = 0x1004
	TypeMultiFloat64 PropertyType = 0x1005
	TypeMultiInt64   PropertyType = 0x1014
	TypeMultiString8 PropertyType = 0x101E
	TypeMultiUnicode PropertyType = 0x101F
	TypeMultiSysTime PropertyType = 0x1040
	TypeMultiBinary  PropertyType = 0x1102
)

Standard MAPI property types.
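
A property ID and a property type combine into a 32-bit property tag, with the ID in the high 16 bits. In the .msg format, variable-length values live in CFB streams named `__substg1.0_` followed by that tag as 8 hex digits. The standalone sketch below illustrates the naming scheme; `propertyTag`, `splitTag`, and `streamName` are hypothetical helpers, not gomsg functions.

```go
package main

import "fmt"

// propertyTag packs a property ID and type into a 32-bit MAPI tag.
func propertyTag(id, typ uint16) uint32 {
	return uint32(id)<<16 | uint32(typ)
}

// splitTag is the inverse of propertyTag.
func splitTag(tag uint32) (id, typ uint16) {
	return uint16(tag >> 16), uint16(tag)
}

// streamName returns the CFB stream name that holds the value of a
// variable-length property inside a .msg file.
func streamName(id, typ uint16) string {
	return fmt.Sprintf("__substg1.0_%08X", propertyTag(id, typ))
}

func main() {
	// 0x0037 is PidTagSubject and 0x001F is TypeUnicode in the tables above.
	fmt.Println(streamName(0x0037, 0x001F))
	// 0x3701 is PidTagAttachDataBinary and 0x0102 is TypeBinary.
	fmt.Println(streamName(0x3701, 0x0102))
}
```

This is also why PidTagAttachDataBinary and PidTagAttachDataObject can share ID 0x3701: the type half of the tag (0x0102 versus 0x000D) tells them apart.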

type Recipient

type Recipient struct {
	DisplayName  string
	EmailAddress string
	SMTPAddress  string
	Type         RecipientType
	Properties   *PropertyStore
}

Recipient represents an email recipient.

type RecipientType

type RecipientType int32

RecipientType indicates whether a recipient is To, CC, or BCC.

const (
	RecipientOriginator RecipientType = 0
	RecipientTo         RecipientType = 1
	RecipientCC         RecipientType = 2
	RecipientBCC        RecipientType = 3
)

func (RecipientType) String

func (rt RecipientType) String() string

String returns a human-readable label for the recipient type.
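
A String method like this makes a type satisfy fmt.Stringer, so values print as labels instead of numbers. The sketch below uses a local stand-in type with the same numeric values as RecipientType. The label strings are assumptions, except "To", which appears in the msgdump example output.

```go
package main

import "fmt"

// recipientKind is a local stand-in for gomsg.RecipientType so the
// example runs without the library.
type recipientKind int32

const (
	kindOriginator recipientKind = 0
	kindTo         recipientKind = 1
	kindCC         recipientKind = 2
	kindBCC        recipientKind = 3
)

// String implements fmt.Stringer. The exact labels gomsg returns are
// assumed here.
func (k recipientKind) String() string {
	switch k {
	case kindOriginator:
		return "Originator"
	case kindTo:
		return "To"
	case kindCC:
		return "CC"
	case kindBCC:
		return "BCC"
	default:
		return fmt.Sprintf("Unknown(%d)", int32(k))
	}
}

func main() {
	// %s picks up the String method automatically.
	fmt.Printf("[%s] %s\n", kindTo, "Jane <jane@example.com>")
}
```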

Directories

Path
cmd/msgdump (command)
