sir

v0.0.0-...-15511bd
Published: Sep 21, 2025 License: Apache-2.0 Imports: 5 Imported by: 0

README

SIR

SIR is a streamable, record-oriented binary file format with a sparse index.
It is designed to efficiently locate blocks containing specific records using a block-based storage structure and an index table.

Motivation

We needed a way to stream simple log data (timestamp + message) generated by short-lived tasks to users in real time, and to efficiently retrieve specific log ranges after the task ends. The first target platform was the Web, so a lightweight implementation was preferred.

Features

  • Indexed: Uses a monotonic unsigned 64-bit index for records.
  • Write Streamable: Append-only structure.
  • Read Streamable: Efficiently locates blocks containing a specific index.

Layout

There are four sections: Header, Blocks, Index Table, and Footer.

Header
   0      1      2      3      4      5      6      7      8
   .      .      .      .      .      .      .      .      .
00 |           Magic           | VER  | COMP |     RSV     |
08 |                  Index Table Offset                   |
10 |                  First Block Offset                   |
18 |                       Metadata                        |
  • Magic: A fixed constant to identify the file format. The first 4 bytes must be 0x53 0x49 0x52 0x00 (SIR\0).
  • VER: SIR format version. Currently, only 0x01 is supported.
  • COMP: Compression algorithm used for the payload. See Compression Algorithms.
  • RSV: Reserved bytes.
  • Index Table Offset: Start position of the Index Table in the file. If 0, refer to the Footer section to find the Index Table offset.
  • First Block Offset: Start position of the first Block in the file. If 0, refer to the Footer section.
  • Metadata: Can be used as needed.
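
The header fields above can be decoded with a few lines of Go. This is a minimal sketch, not the package's own parser: the names Header and ParseHeader are illustrative, and it assumes little-endian integers, which the README does not specify. A zero Index Table Offset means the reader should consult the Footer instead.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// headerSize is the size of the header in the diagram: 4 rows of 8 bytes.
const headerSize = 32

// magic is the 4-byte file signature "SIR\0".
var magic = []byte{0x53, 0x49, 0x52, 0x00}

// Header mirrors the fields of the header diagram.
// The type and field names are hypothetical, not taken from the package.
type Header struct {
	Version          uint8
	Compression      uint8
	IndexTableOffset uint64 // 0 means "look in the footer"
	FirstBlockOffset uint64
	Metadata         [8]byte
}

// ParseHeader decodes a header from the first 32 bytes of b.
// ASSUMPTION: multi-byte integers are little-endian.
func ParseHeader(b []byte) (Header, error) {
	var h Header
	if len(b) < headerSize {
		return h, errors.New("sir: header too short")
	}
	if !bytes.Equal(b[0:4], magic) {
		return h, errors.New("sir: bad magic")
	}
	h.Version = b[4]
	if h.Version != 0x01 {
		return h, fmt.Errorf("sir: unsupported version 0x%02x", h.Version)
	}
	h.Compression = b[5]
	// b[6:8] is the RSV field and is skipped.
	h.IndexTableOffset = binary.LittleEndian.Uint64(b[8:16])
	h.FirstBlockOffset = binary.LittleEndian.Uint64(b[16:24])
	copy(h.Metadata[:], b[24:32])
	return h, nil
}

func main() {
	raw := make([]byte, headerSize)
	copy(raw, magic)
	raw[4] = 0x01                                 // VER
	raw[5] = 0x05                                 // COMP: Zstandard
	binary.LittleEndian.PutUint64(raw[16:24], 32) // first block right after the header

	h, err := ParseHeader(raw)
	fmt.Println(h.Version, h.Compression, h.IndexTableOffset, h.FirstBlockOffset, err)
}
```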
Blocks
   0      1      2      3      4      5      6      7      8
   .      .      .      .      .      .      .      .      .
00 |     Uncompressed Size     |    Payload (variable)     | # Block 1
08 |      CRC32 Checksum       |        Sync Marker        |
10 |     Uncompressed Size     |    Payload (variable)     | # Block 2
18 |      CRC32 Checksum       |        Sync Marker        |
20 |                          ...                          |
  • First Index: The index of the first record in the block.
  • Uncompressed Size: Size of the payload before compression.
  • Payload: Record data of variable length, compressed with the algorithm given by COMP in the Header.
  • CRC32 Checksum: CRC32 value of the payload for integrity verification.
  • Sync Marker: Fixed constant to mark block boundaries, 0xDE 0xCA 0xFE 0x42.
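
The sync marker and checksum lend themselves to a short illustration. The sketch below scans a buffer for sync markers to find candidate block boundaries and checks a payload against a stored checksum. The function names are illustrative, and the use of the IEEE CRC32 polynomial over the payload bytes as stored is an assumption; the README only says "CRC32".

```go
package main

import (
	"bytes"
	"fmt"
	"hash/crc32"
)

// syncMarker is the 4-byte constant that closes every block.
var syncMarker = []byte{0xDE, 0xCA, 0xFE, 0x42}

// BlockEnds returns the offsets in buf that immediately follow a sync marker.
// Each one is only a candidate block boundary: the marker bytes can also
// occur inside a payload by chance, so a real reader should confirm a
// candidate, for example with the block's checksum.
func BlockEnds(buf []byte) []int {
	var ends []int
	for pos := 0; ; {
		i := bytes.Index(buf[pos:], syncMarker)
		if i < 0 {
			return ends
		}
		pos += i + len(syncMarker)
		ends = append(ends, pos)
	}
}

// PayloadOK reports whether payload matches the stored checksum.
// ASSUMPTION: the checksum uses the IEEE polynomial and covers the payload
// bytes exactly as they appear in the file.
func PayloadOK(payload []byte, stored uint32) bool {
	return crc32.ChecksumIEEE(payload) == stored
}

func main() {
	// Two fake "blocks" made of arbitrary bytes, each closed by a sync marker.
	buf := append([]byte{1, 2, 3}, syncMarker...)
	buf = append(buf, 9)
	buf = append(buf, syncMarker...)

	fmt.Println(BlockEnds(buf))
	fmt.Println(PayloadOK([]byte("hello"), crc32.ChecksumIEEE([]byte("hello"))))
}
```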
Index Table
      0      1      2      3      4      5      6      7      8
      .      .      .      .      .      .      .      .      .
   00 |                     First Index                       | # Group 1
   08 |                        Offset                         |
   10 |        Index Delta        |       Offset Delta        | # Delta 1
   18 |        Index Delta        |       Offset Delta        | # Delta 2
   20 |                          ...                          |
02 00 |        Index Delta        |       Offset Delta        | # Delta 63

02 08 |                     First Index                       | # Group 2
02 10 |                        Offset                         |
02 18 |        Index Delta        |       Offset Delta        | # Delta 1
02 20 |                          ...                          |

The Index Table records the location of each block in the file. It is divided into groups, each holding one absolute entry followed by 63 delta entries, so a group describes up to 64 blocks. The absolute entry stores the first record index of a block and that block's file offset; the delta entries are applied incrementally to obtain the positions of the subsequent blocks.
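
Per the diagram, one group occupies 16 + 63 × 8 = 520 bytes. The sketch below expands a group into absolute entries and then finds the block that should hold a given record index. All type and function names are illustrative, and it assumes each delta is relative to the entry just before it, which is one reading of "incrementally".

```go
package main

import (
	"fmt"
	"sort"
)

// Delta is one 8-byte delta entry: a 4-byte index delta and a 4-byte offset delta.
type Delta struct {
	Index  uint32
	Offset uint32
}

// Group is one index table group: an absolute entry plus up to 63 deltas.
type Group struct {
	FirstIndex uint64
	Offset     uint64
	Deltas     []Delta
}

// Entry is the absolute position of one block.
type Entry struct {
	FirstIndex uint64 // index of the first record in the block
	Offset     uint64 // file offset of the block
}

// Expand turns a group into absolute entries.
// ASSUMPTION: each delta is added to the entry immediately before it.
func Expand(g Group) []Entry {
	entries := make([]Entry, 0, 1+len(g.Deltas))
	cur := Entry{FirstIndex: g.FirstIndex, Offset: g.Offset}
	entries = append(entries, cur)
	for _, d := range g.Deltas {
		cur.FirstIndex += uint64(d.Index)
		cur.Offset += uint64(d.Offset)
		entries = append(entries, cur)
	}
	return entries
}

// Locate returns the last entry whose FirstIndex is <= index, i.e. the block
// that should contain the record. ok is false if index precedes every block.
func Locate(entries []Entry, index uint64) (e Entry, ok bool) {
	// i is the position of the first entry that starts after index.
	i := sort.Search(len(entries), func(i int) bool {
		return entries[i].FirstIndex > index
	})
	if i == 0 {
		return Entry{}, false
	}
	return entries[i-1], true
}

func main() {
	g := Group{FirstIndex: 100, Offset: 32, Deltas: []Delta{{10, 500}, {5, 300}}}
	entries := Expand(g)
	fmt.Println(entries)
	fmt.Println(Locate(entries, 112))
}
```

Because the entries are sorted by first index, the lookup is a binary search; a reader only needs to decompress the single block it lands on.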

Footer
   0      1      2      3      4      5      6      7      8
   .      .      .      .      .      .      .      .      .
00 |                  Index Table Offset                   |
08 |           Magic           |
  • Index Table Offset: Start position of the Index Table in the file.
  • Magic: A fixed constant to identify the end of file. The last 4 bytes must be 0x53 0x49 0x52 0x00 (SIR\0).
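
Reading the footer amounts to inspecting the last 12 bytes of the file. The following is a minimal sketch under the same little-endian assumption as elsewhere (the README does not state a byte order); ParseFooter is a hypothetical name.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// footerSize is 8 bytes of Index Table Offset plus 4 bytes of magic.
const footerSize = 12

// magic is the 4-byte signature "SIR\0" that ends the file.
var magic = []byte{0x53, 0x49, 0x52, 0x00}

// ParseFooter extracts the Index Table Offset from the tail of b, where b
// holds at least the final 12 bytes of the file.
// ASSUMPTION: the offset is stored little-endian.
func ParseFooter(b []byte) (uint64, error) {
	if len(b) < footerSize {
		return 0, errors.New("sir: file too short for a footer")
	}
	tail := b[len(b)-footerSize:]
	if !bytes.Equal(tail[8:], magic) {
		return 0, errors.New("sir: bad footer magic")
	}
	return binary.LittleEndian.Uint64(tail[:8]), nil
}

func main() {
	// 20 bytes standing in for header and blocks, then a footer that says
	// the index table starts at offset 20.
	file := make([]byte, 20)
	file = binary.LittleEndian.AppendUint64(file, 20)
	file = append(file, magic...)

	off, err := ParseFooter(file)
	fmt.Println(off, err)
}
```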

Compression Algorithms

Value  Algorithm
0x00   None
0x01   Deflate
0x02   Brotli
0x03   LZ4
0x04   Snappy
0x05   Zstandard
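
In Go, the table above maps naturally onto a small enumerated type. The type and constant names below are illustrative and are not exported by the package (its Constants section is empty).

```go
package main

import "fmt"

// Compression is the value stored in the header's COMP byte.
// Hypothetical type: the package does not export these names.
type Compression uint8

const (
	None      Compression = iota // 0x00
	Deflate                      // 0x01
	Brotli                       // 0x02
	LZ4                          // 0x03
	Snappy                       // 0x04
	Zstandard                    // 0x05
)

// names is indexed by the Compression value.
var names = [...]string{"None", "Deflate", "Brotli", "LZ4", "Snappy", "Zstandard"}

// String returns the algorithm name, or a hex form for unknown values.
func (c Compression) String() string {
	if int(c) < len(names) {
		return names[c]
	}
	return fmt.Sprintf("Compression(0x%02x)", uint8(c))
}

func main() {
	fmt.Println(Compression(0x03), Zstandard, Compression(0x7f))
}
```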

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Auto

func Auto[T constraints.Ordered](v T) T

func AutoFirst

func AutoFirst[T constraints.Ordered](vs []T) T

func Mem

func Mem[K constraints.Ordered, T any](indexer Indexer[K, T]) (Stream[K, T], Writer[T])

Types

type Indexer

type Indexer[K constraints.Ordered, T any] func(v T) K

type Reader

type Reader[T any] interface {
	Next() ([]T, error)
}
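
Since Next returns a batch of records at a time, a consumer typically loops until the reader is exhausted. The sketch below repeats the interface locally so it compiles on its own; batches and Drain are stand-ins for illustration, and treating io.EOF as the end-of-stream signal is an assumption the documentation does not confirm.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Reader repeats the interface declared above so the sketch is self-contained.
type Reader[T any] interface {
	Next() ([]T, error)
}

// batches is a toy Reader that hands out pre-built batches.
// It is not a type from the package.
type batches[T any] struct {
	pending [][]T
}

// Next returns the next pending batch.
// ASSUMPTION: the end of the stream is reported as io.EOF.
func (b *batches[T]) Next() ([]T, error) {
	if len(b.pending) == 0 {
		return nil, io.EOF
	}
	next := b.pending[0]
	b.pending = b.pending[1:]
	return next, nil
}

// Drain collects every record until r reports io.EOF.
func Drain[T any](r Reader[T]) ([]T, error) {
	var all []T
	for {
		vs, err := r.Next()
		all = append(all, vs...)
		if errors.Is(err, io.EOF) {
			return all, nil
		}
		if err != nil {
			return all, err
		}
	}
}

func main() {
	r := &batches[string]{pending: [][]string{{"a", "b"}, {"c"}}}
	logs, err := Drain[string](r)
	fmt.Println(logs, err)
}
```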

type Stream

type Stream[K constraints.Ordered, T any] interface {
	Reader(index K) Reader[T]
}

type Writer

type Writer[T any] interface {
	Write(v T) error
	Flush() error
	Close() error
}
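
Any type with these three methods satisfies Writer. The sketch below repeats the interface locally and implements a toy in-memory writer; memWriter is a stand-in, not the writer returned by Mem. It assumes that Flush seals the buffered records into one batch and that Close performs a final flush, neither of which the documentation states.

```go
package main

import (
	"errors"
	"fmt"
)

// Writer repeats the interface declared above so the sketch is self-contained.
type Writer[T any] interface {
	Write(v T) error
	Flush() error
	Close() error
}

// memWriter is a toy Writer that keeps everything in memory.
// It is not a type from the package.
type memWriter[T any] struct {
	buffered []T   // records written since the last flush
	flushed  [][]T // one slice per flush, standing in for blocks
	closed   bool
}

// Write buffers a record and rejects writes after Close.
func (w *memWriter[T]) Write(v T) error {
	if w.closed {
		return errors.New("write after close")
	}
	w.buffered = append(w.buffered, v)
	return nil
}

// Flush seals the buffered records into one batch.
// ASSUMPTION: this mirrors how a flush would end the current block.
func (w *memWriter[T]) Flush() error {
	if len(w.buffered) > 0 {
		w.flushed = append(w.flushed, w.buffered)
		w.buffered = nil
	}
	return nil
}

// Close flushes what is left and marks the writer closed.
func (w *memWriter[T]) Close() error {
	w.closed = true
	return w.Flush()
}

// Compile-time check that memWriter satisfies Writer.
var _ Writer[string] = (*memWriter[string])(nil)

func main() {
	w := &memWriter[string]{}
	w.Write("task started")
	w.Write("step 1 done")
	w.Flush()
	w.Write("task finished")
	w.Close()
	fmt.Println(len(w.flushed), w.flushed)
}
```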

func ByCount

func ByCount[T any](w Writer[T], cap int, meter func(v T) int) Writer[T]

func ByTimeout

func ByTimeout[T any](w Writer[T], d time.Duration) Writer[T]

func Tap

func Tap[T any](w Writer[T], f func(v T)) Writer[T]

func Transform

func Transform[T any, U any](w Writer[U], f func(v T) U) Writer[T]
