Documentation ¶
Overview ¶
Package zim reads and writes the ZIM offline-archive format, the open single-file container that Kiwix uses to ship offline content. kage uses it to pack a cloned mirror into one indexable, compressed file that a reader can random-access without unpacking.
The package is pure: no network, no clock, no global state beyond a lazily built zstd codec. A ZIM file is laid out as a fixed header, a MIME-type list, three pointer lists (URL, title, cluster), a run of directory entries, a run of clusters that hold the content, and a trailing MD5. Every cross-reference is an absolute file position recorded in the header, so the writer assigns positions in one pass and emits bytes in a second. All integers are little-endian.
We write the new namespace scheme (minor version 1): all content lives under the single 'C' namespace, metadata under 'M', and a 'W/mainPage' redirect points at the entry point. Reading follows redirects and handles both cluster offset widths (4-byte, and 8-byte in extended clusters).
Index ¶
- Constants
- Variables
- type Blob
- type Reader
- type Writer
- func (w *Writer) AddContent(namespace byte, url, title, mime string, data []byte)
- func (w *Writer) AddMetadata(name, value string)
- func (w *Writer) AddRedirect(namespace byte, url, title string, targetNamespace byte, targetURL string)
- func (w *Writer) SetMainPage(namespace byte, url string)
- func (w *Writer) SetNoCompress(v bool)
- func (w *Writer) WriteTo(out io.Writer) (int64, error)
Constants ¶
const (
	NamespaceContent   byte = 'C' // pages and assets
	NamespaceMetadata  byte = 'M' // M/Title, M/Date, ...
	NamespaceWellKnown byte = 'W' // W/mainPage redirect
)
Namespaces in the new (minor version 1) scheme.
const Magic uint32 = 0x44D495A // 72173914
Magic is the ZIM header magic number, stored little-endian as the first four bytes of every file (5A 49 4D 04, that is, "ZIM" followed by 0x04).
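A quick way to see how the constant relates to the bytes on disk is to decode the first four bytes as a little-endian uint32. The sketch below redeclares the value locally as `magic` so it compiles on its own; `hasMagic` is a hypothetical helper, not part of the package.

```go
package main

import "encoding/binary"

// magic mirrors the documented Magic constant (72173914).
const magic uint32 = 0x44D495A

// hasMagic reports whether header starts with the ZIM magic number.
// The value is little-endian, so on disk it reads 5A 49 4D 04 ("ZIM\x04").
func hasMagic(header []byte) bool {
	return len(header) >= 4 && binary.LittleEndian.Uint32(header[:4]) == magic
}
```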
Variables ¶
var ErrNotFound = errors.New("zim: not found")
ErrNotFound is returned by Get when no entry matches the namespace and url. Callers (such as the HTTP handler) test for it with errors.Is to map a miss to a 404.
Functions ¶
This section is empty.
Types ¶
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader provides random access to a ZIM file's entries. Open one with Open or NewReader, then look entries up by namespace and url, or fetch the main page. Decompressed clusters are cached so repeated reads from one cluster are cheap.
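To illustrate why repeated reads from one cluster are cheap, here is a minimal memoising cache. It is a sketch under assumptions, not the Reader's actual implementation: `clusterCache`, its `load` callback, and the unbounded map are all hypothetical.

```go
package main

// clusterCache memoises decompressed clusters by cluster number.
type clusterCache struct {
	load  func(n uint32) ([]byte, error) // hypothetical: reads and decompresses cluster n
	blobs map[uint32][]byte
}

// newClusterCache returns an empty cache that calls load on a miss.
func newClusterCache(load func(uint32) ([]byte, error)) *clusterCache {
	return &clusterCache{load: load, blobs: make(map[uint32][]byte)}
}

// get returns the decompressed bytes of cluster n, decompressing at most once.
// Failed loads are not cached, so a later call retries.
func (c *clusterCache) get(n uint32) ([]byte, error) {
	if b, ok := c.blobs[n]; ok {
		return b, nil
	}
	b, err := c.load(n)
	if err != nil {
		return nil, err
	}
	c.blobs[n] = b
	return b, nil
}
```

This sketch never evicts and is not safe for concurrent use; a cache serving HTTP requests would need a size bound and a mutex.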
type Writer ¶
type Writer struct {
// contains filtered or unexported fields
}
Writer accumulates entries and serialises them as a ZIM file. Build it with NewWriter, add content, redirects, and metadata, optionally set a main page, then call WriteTo. The writer holds all entries in memory; a kage mirror fits comfortably, and packing is a one-shot batch job.
func (*Writer) AddContent ¶
func (w *Writer) AddContent(namespace byte, url, title, mime string, data []byte)
AddContent adds a content entry. A later add with the same namespace and url replaces the earlier one. An empty title defaults to the url.
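The two rules above (last add wins, empty title falls back to the url) are easy to model with a map keyed by namespace and url. The `store`, `entryKey`, and `entry` types below are hypothetical and only illustrate the documented behaviour; the writer's internal representation may differ.

```go
package main

// entryKey identifies an entry by namespace and url.
type entryKey struct {
	namespace byte
	url       string
}

// entry holds the fields passed to add.
type entry struct {
	title, mime string
	data        []byte
}

// store is a hypothetical in-memory entry table.
type store struct {
	entries map[entryKey]entry
}

// add inserts or replaces the entry at (namespace, url). An empty title
// defaults to the url.
func (s *store) add(namespace byte, url, title, mime string, data []byte) {
	if title == "" {
		title = url
	}
	if s.entries == nil {
		s.entries = make(map[entryKey]entry)
	}
	s.entries[entryKey{namespace, url}] = entry{title, mime, data}
}
```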
func (*Writer) AddMetadata ¶
func (w *Writer) AddMetadata(name, value string)
AddMetadata adds an 'M' namespace text entry, e.g. AddMetadata("Title", "...").
func (*Writer) AddRedirect ¶
func (w *Writer) AddRedirect(namespace byte, url, title string, targetNamespace byte, targetURL string)
AddRedirect adds a redirect from (namespace,url) to (targetNamespace,targetURL).
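func (*Writer) SetMainPage ¶
func (w *Writer) SetMainPage(namespace byte, url string)
SetMainPage marks the entry at (namespace, url) as the archive's entry point, the target of the 'W/mainPage' redirect.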
func (*Writer) SetNoCompress ¶
func (w *Writer) SetNoCompress(v bool)
SetNoCompress stores every cluster uncompressed. Useful when the input is already compressed or when a reader without zstd must open the file.
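For context, each ZIM cluster begins with an information byte that tells a reader how to decode it. The sketch below assumes the layout given in the openZIM format description (low four bits are the compression type, with 1 for none and 5 for zstd; bit 0x10 marks an extended cluster with 8-byte offsets). The constants and helper names are hypothetical, and the package's own encoding is not shown here.

```go
package main

// Cluster information byte fields, assumed from the openZIM format description.
const (
	compressionNone byte = 1    // cluster stored uncompressed
	compressionZstd byte = 5    // cluster compressed with zstd
	flagExtended    byte = 0x10 // blob offsets are 8 bytes instead of 4
)

// clusterInfo builds the information byte for a cluster.
func clusterInfo(noCompress, extended bool) byte {
	info := compressionZstd
	if noCompress {
		info = compressionNone
	}
	if extended {
		info |= flagExtended
	}
	return info
}

// parseClusterInfo splits an information byte into its compression type
// and extended flag.
func parseClusterInfo(info byte) (compression byte, extended bool) {
	return info & 0x0F, info&flagExtended != 0
}
```

Under this layout, a reader that lacks zstd can still open any cluster whose compression type is 1, which is the case SetNoCompress is meant to serve.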