zim

package
v0.1.2
Published: Jun 15, 2026 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package zim reads and writes the ZIM offline-archive format, the open single-file container that Kiwix uses to ship offline content. kage uses it to pack a cloned mirror into one indexable, compressed file that a reader can random-access without unpacking.

The package is pure: no network, no clock, no global state beyond a lazily built zstd codec. A ZIM file is laid out as a fixed header, a MIME-type list, three pointer lists (URL, title, cluster), a run of directory entries, a run of clusters that hold the content, and a trailing MD5. Every cross-reference is an absolute file position recorded in the header, so the writer assigns positions in one pass and emits bytes in a second. All integers are little-endian.
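
The two-pass idea can be sketched in isolation. The example below is not this package's writer: the section type and the layout and emit functions are hypothetical, the sections are opaque byte slices, and the "header" is reduced to a list of positions. Pass one assigns every section an absolute position; pass two writes the header and then the section bytes, with all integers little-endian.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// section is a hypothetical stand-in for one region of the file
// (MIME list, pointer list, cluster, ...).
type section struct {
	name string
	data []byte
	pos  uint64 // absolute file position, filled in by layout
}

// layout is pass one: it walks the sections in file order and records
// where each one starts, given a fixed-size header in front of them.
// It returns the total size of header plus sections.
func layout(headerSize uint64, secs []section) uint64 {
	pos := headerSize
	for i := range secs {
		secs[i].pos = pos
		pos += uint64(len(secs[i].data))
	}
	return pos
}

// emit is pass two: it writes a simplified header that holds each
// section position as a little-endian uint64, then the section bytes.
func emit(secs []section) []byte {
	headerSize := uint64(8 * len(secs))
	layout(headerSize, secs)

	var buf bytes.Buffer
	for _, s := range secs {
		var b [8]byte
		binary.LittleEndian.PutUint64(b[:], s.pos)
		buf.Write(b[:])
	}
	for _, s := range secs {
		buf.Write(s.data)
	}
	return buf.Bytes()
}

func main() {
	secs := []section{
		{name: "mime", data: []byte("text/html\x00\x00")},
		{name: "cluster", data: []byte("hello")},
	}
	out := emit(secs)
	for _, s := range secs {
		fmt.Printf("%s starts at %d\n", s.name, s.pos)
	}
	fmt.Println("total bytes:", len(out))
}
```

Because every position is known before the first byte is written, the output can go to a plain io.Writer with no seeking back to patch offsets.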

The writer emits the new namespace scheme (minor version 1): all content lives under the single 'C' namespace, metadata under 'M', and a 'W/mainPage' redirect points at the entry point. The reader follows redirects and handles both offset widths.

Constants

const (
	NamespaceContent   byte = 'C' // pages and assets
	NamespaceMetadata  byte = 'M' // M/Title, M/Date, ...
	NamespaceWellKnown byte = 'W' // W/mainPage redirect
)

Namespaces in the new (minor version 1) scheme.

const Magic uint32 = 0x44D495A // 72173914

Magic is the ZIM header magic number, the first four bytes of every file.
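
As a quick illustration of what the constant means on disk, the sketch below checks the first four bytes of a buffer. The hasMagic helper and the lowercase magic constant are hypothetical names that mirror Magic so the example compiles on its own; they are not part of this package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// magic mirrors the package's Magic constant (0x44D495A, decimal 72173914).
const magic uint32 = 0x44D495A

// hasMagic reports whether the first four bytes of b, read as a
// little-endian uint32, equal the ZIM magic number.
func hasMagic(b []byte) bool {
	if len(b) < 4 {
		return false
	}
	return binary.LittleEndian.Uint32(b[:4]) == magic
}

func main() {
	// Little-endian, the magic is the byte sequence 'Z', 'I', 'M', 0x04.
	fmt.Println(hasMagic([]byte("ZIM\x04rest of file")))
	fmt.Println(hasMagic([]byte("PK\x03\x04")))
}
```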

Variables

var ErrNotFound = errors.New("zim: not found")

ErrNotFound is returned by Get when no entry matches the namespace and url. Callers (such as the HTTP handler) test for it with errors.Is to map a miss to a 404.
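
A minimal sketch of that mapping follows. To stay self-contained it declares a local errNotFound as a stand-in for ErrNotFound, and statusFor is a hypothetical helper, not part of this package. The point is that errors.Is still matches when the sentinel has been wrapped with %w.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errNotFound stands in for the package's ErrNotFound so that the
// sketch compiles without importing the package.
var errNotFound = errors.New("zim: not found")

// statusFor maps a lookup error to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, errNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	// A wrapped miss is still recognised by errors.Is.
	wrapped := fmt.Errorf("get C/index.html: %w", errNotFound)
	fmt.Println(statusFor(nil))
	fmt.Println(statusFor(wrapped))
	fmt.Println(statusFor(errors.New("read failed")))
}
```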

Functions

This section is empty.

Types

type Blob

type Blob struct {
	Namespace byte
	URL       string
	Title     string
	MimeType  string
	Data      []byte
}

Blob is the result of a lookup: the resolved entry's bytes and metadata.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader provides random access to a ZIM file's entries. Open one with Open or NewReader, then look entries up by namespace and url, or fetch the main page. Decompressed clusters are cached so repeated reads from one cluster are cheap.

func NewReader

func NewReader(ra io.ReaderAt, size int64) (*Reader, error)

NewReader reads the header and MIME list from ra, which must hold size bytes.

func Open

func Open(path string) (*Reader, error)

Open opens a ZIM file on disk. Close the returned reader when done.

func (*Reader) Close

func (r *Reader) Close() error

Close releases the underlying file, if Open created one.

func (*Reader) Count

func (r *Reader) Count() uint32

Count returns the number of directory entries.

func (*Reader) Get

func (r *Reader) Get(namespace byte, url string) (Blob, error)

Get resolves the entry at (namespace, url), following any redirects along the way.
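
Redirect resolution can be pictured as a short loop. The sketch below works on an in-memory map rather than a real archive; the key and entry types, the resolve function, and the maxHops cap are all assumptions made for illustration. In particular, this page does not say whether the package limits the length of a redirect chain.

```go
package main

import (
	"errors"
	"fmt"
)

// key identifies an entry by namespace and url.
type key struct {
	ns  byte
	url string
}

// entry is a hypothetical directory entry: either content or a redirect.
type entry struct {
	data     []byte
	redirect *key // non-nil for redirect entries
}

var (
	errNotFound = errors.New("not found")
	errLoop     = errors.New("too many redirects")
)

// maxHops is an assumed cap that guards against redirect cycles.
const maxHops = 8

// resolve follows redirects from k until it reaches a content entry.
func resolve(entries map[key]entry, k key) ([]byte, error) {
	for hop := 0; hop <= maxHops; hop++ {
		e, ok := entries[k]
		if !ok {
			return nil, errNotFound
		}
		if e.redirect == nil {
			return e.data, nil
		}
		k = *e.redirect
	}
	return nil, errLoop
}

func main() {
	entries := map[key]entry{
		{'W', "mainPage"}:   {redirect: &key{'C', "index.html"}},
		{'C', "index.html"}: {data: []byte("<h1>hi</h1>")},
	}
	data, err := resolve(entries, key{'W', "mainPage"})
	fmt.Println(string(data), err)
}
```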

func (*Reader) MainPage

func (r *Reader) MainPage() (Blob, error)

MainPage returns the archive's entry point, or an error if none is set.

func (*Reader) MimeTypes

func (r *Reader) MimeTypes() []string

MimeTypes returns the archive's MIME-type list.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer accumulates entries and serialises them as a ZIM file. Build it with NewWriter, add content, redirects, and metadata, optionally set a main page, then call WriteTo. The writer holds all entries in memory, which is acceptable here because a kage mirror fits comfortably and packing is a one-shot batch job.

func NewWriter

func NewWriter() *Writer

NewWriter returns an empty Writer.

func (*Writer) AddContent

func (w *Writer) AddContent(namespace byte, url, title, mime string, data []byte)

AddContent adds a content entry. A later add with the same namespace and url replaces the earlier one. An empty title defaults to the url.
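
The replace-on-duplicate and default-title rules can be modelled with a map keyed by (namespace, url). This is a sketch of the documented behaviour only; the store, key, and item types are hypothetical and say nothing about how Writer is implemented internally.

```go
package main

import "fmt"

// key identifies an entry by namespace and url.
type key struct {
	ns  byte
	url string
}

// item holds what a content entry carries.
type item struct {
	title, mime string
	data        []byte
}

// store is a hypothetical in-memory collection of content entries.
type store struct {
	items map[key]item
}

// add inserts or replaces the entry at (ns, url). An empty title
// falls back to the url.
func (s *store) add(ns byte, url, title, mime string, data []byte) {
	if title == "" {
		title = url
	}
	if s.items == nil {
		s.items = make(map[key]item)
	}
	s.items[key{ns, url}] = item{title: title, mime: mime, data: data}
}

func main() {
	var s store
	s.add('C', "index.html", "", "text/html", []byte("v1"))
	s.add('C', "index.html", "Home", "text/html", []byte("v2")) // replaces v1
	got := s.items[key{'C', "index.html"}]
	fmt.Println(len(s.items), got.title, string(got.data))
}
```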

func (*Writer) AddMetadata

func (w *Writer) AddMetadata(name, value string)

AddMetadata adds an 'M' namespace text entry, e.g. AddMetadata("Title", "...").

func (*Writer) AddRedirect

func (w *Writer) AddRedirect(namespace byte, url, title string, targetNamespace byte, targetURL string)

AddRedirect adds a redirect from (namespace,url) to (targetNamespace,targetURL).

func (*Writer) SetMainPage

func (w *Writer) SetMainPage(namespace byte, url string)

SetMainPage marks an entry as the archive's entry point.

func (*Writer) SetNoCompress

func (w *Writer) SetNoCompress(v bool)

SetNoCompress stores every cluster uncompressed. Useful when the input is already compressed or when a reader without zstd must open the file.

func (*Writer) WriteTo

func (w *Writer) WriteTo(out io.Writer) (int64, error)

WriteTo serialises the archive to out and returns the number of bytes written.
