logfile

Version: v0.0.0-...-e715ae1
Published: May 21, 2026 | License: MIT | Imports: 6 | Imported by: 0

README

logfile

[!NOTE]
I extracted this from an in-house database engine. It's a work in progress, but I wanted to share and improve it in a separate repo. Feedback and contributions are very welcome!

logfile is a concurrent, append-only log file optimized for SSDs and high throughput in Go.

Why

Most log file implementations flush after every write. This one batches writes into a single IO + fsync, which makes a huge difference on SSDs.

The library exposes two entry points so you can pick the right tradeoff between throughput and per-record durability:

  • Write blocks until the appended record is durable. Many concurrent Write calls naturally batch into one fsync (group commit). Use this when each record must be on disk before the caller continues and you have enough concurrent writers to amortize fsync latency.
  • WriteAsync returns as soon as the record is buffered. A single writer can keep submitting records while the background flusher fsyncs the previous batch. Call Flush at a checkpoint to wait for durability. This is the high-throughput path for a small number of writers.

Internally, a single background flusher drains the buffer using a ping-pong pattern, so new appends can be staged while the previous batch is being fsynced. Once the flush completes, all waiting writers are notified and the next batch can be flushed.
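The flusher described above can be sketched with the standard library alone. This is a minimal illustration of group commit with two swapped buffers, not the library's actual implementation: `groupLog`, `memSink`, `Append`, and `runDemo` are hypothetical names, and the in-memory sink only counts sync calls instead of calling a real fsync.

```go
package main

import (
	"fmt"
	"sync"
)

// memSink stands in for the file: Write stores a batch, Sync plays the role of fsync.
type memSink struct {
	data  []byte
	syncs int
}

func (m *memSink) Write(p []byte) { m.data = append(m.data, p...) }
func (m *memSink) Sync()          { m.syncs++ }

// groupLog is a hypothetical group-commit log, not the library's LogFile.
type groupLog struct {
	mu      sync.Mutex
	cond    *sync.Cond
	out     *memSink
	active  []byte // buffer currently receiving appends
	seq     uint64 // number of records appended so far
	durable uint64 // number of records covered by a completed sync
	closed  bool
	done    chan struct{}
}

func newGroupLog(out *memSink) *groupLog {
	g := &groupLog{out: out, done: make(chan struct{})}
	g.cond = sync.NewCond(&g.mu)
	go g.flushLoop()
	return g
}

// Append stages rec and blocks until a sync has covered it.
func (g *groupLog) Append(rec []byte) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.closed {
		return fmt.Errorf("groupLog: closed")
	}
	g.active = append(g.active, rec...)
	g.seq++
	mine := g.seq
	g.cond.Broadcast() // wake the flusher
	for g.durable < mine {
		g.cond.Wait()
	}
	return nil
}

// flushLoop swaps the active buffer with a spare one, then writes and syncs
// the swapped-out batch while new appends land in the other buffer.
func (g *groupLog) flushLoop() {
	defer close(g.done)
	var spare []byte
	g.mu.Lock()
	defer g.mu.Unlock()
	for {
		for g.seq == g.durable && !g.closed {
			g.cond.Wait()
		}
		if g.seq == g.durable {
			return // closed and fully drained
		}
		batch, upTo := g.active, g.seq
		g.active = spare[:0]
		g.mu.Unlock()
		g.out.Write(batch) // one write and one sync for the whole batch
		g.out.Sync()
		g.mu.Lock()
		spare = batch
		g.durable = upTo
		g.cond.Broadcast() // release every writer covered by this sync
	}
}

// Close stops accepting appends and waits for the flusher to drain and exit.
func (g *groupLog) Close() {
	g.mu.Lock()
	g.closed = true
	g.cond.Broadcast()
	g.mu.Unlock()
	<-g.done
}

// runDemo appends one byte from each of n goroutines and reports the totals.
func runDemo(n int) (bytesWritten, syncs int) {
	out := &memSink{}
	g := newGroupLog(out)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = g.Append([]byte("x"))
		}()
	}
	wg.Wait()
	g.Close()
	return len(out.data), out.syncs
}

func main() {
	n, syncs := runDemo(100)
	fmt.Printf("%d bytes made durable with %d sync calls\n", n, syncs)
}
```

The number of sync calls varies from run to run, but it can never exceed the number of writers, and it is usually much lower because writers that arrive during a sync share the next one.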

Features

  • Group commit: concurrent Write calls are batched into one write + fsync
  • Pipelined async path: a single writer can pipeline records via WriteAsync for very high throughput
  • Optimized for SSDs: minimizes write amplification with 4KB-aligned writes
  • Zero heap allocations per write
  • Safe: no torn writes. When Write returns a nil error, the data is on disk.
  • CRC32 checksums for data integrity (optional)
  • Small API surface: two write methods plus Flush and Close
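To illustrate the 4KB-aligned writes mentioned in the feature list, here is a small sketch of the rounding arithmetic. The names `blockSize`, `alignUp`, and `padLen` are hypothetical and are not part of the package API; the sketch only assumes a 4096-byte alignment unit.

```go
package main

import "fmt"

// blockSize is the assumed alignment unit (4KB).
const blockSize = 4096

// alignUp rounds n up to the next multiple of blockSize.
// The bit trick works because blockSize is a power of two.
func alignUp(n int64) int64 {
	return (n + blockSize - 1) &^ (blockSize - 1)
}

// padLen returns how many padding bytes bring n up to a block boundary.
func padLen(n int64) int64 {
	return alignUp(n) - n
}

func main() {
	for _, n := range []int64{0, 1, 4096, 5000} {
		fmt.Printf("size=%d aligned=%d padding=%d\n", n, alignUp(n), padLen(n))
	}
}
```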

Install

go get github.com/alialaee/logfile

Benchmarks

See benchmark/README.md.

Real-disk throughput on an M4 MacBook Air at 900-byte records:

| Mode | Throughput |
|---|---|
| Single writer, Write (fsync per call) | 0.02 MB/s |
| Single writer, WriteAsync + Flush | 411 MB/s |
| 10 goroutines, WriteAsync | 406 MB/s |
| 10 goroutines, Write | 1.16 MB/s |
| 500 goroutines, Write | 58.7 MB/s |

Write throughput is bounded by records_per_fsync * record_size / fsync_latency, where records_per_fsync (the batch size) is at most the number of concurrent writers. If you need high throughput from a small number of writers, use WriteAsync.
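As a rough worked example, the bound for blocking writes can be read as batch size times record size divided by fsync latency. The sketch below computes that estimate; `estimateMBps` is a hypothetical helper, and the 5 ms fsync latency is an assumed figure for illustration, not a measurement from the benchmarks above.

```go
package main

import "fmt"

// estimateMBps returns an upper bound on blocking-write throughput in MB/s:
// each fsync makes at most recordsPerFsync records durable, and fsyncs run
// back to back, one every fsyncSeconds.
func estimateMBps(recordsPerFsync, recordBytes int, fsyncSeconds float64) float64 {
	bytesPerSecond := float64(recordsPerFsync*recordBytes) / fsyncSeconds
	return bytesPerSecond / 1e6
}

func main() {
	const recordBytes = 900    // record size used in the table above
	const fsyncSeconds = 0.005 // assumed fsync latency, for illustration only
	for _, writers := range []int{1, 10, 500} {
		fmt.Printf("%3d writers: at most %.2f MB/s\n",
			writers, estimateMBps(writers, recordBytes, fsyncSeconds))
	}
}
```

The estimate grows linearly with the batch size, which is why a single blocking writer is the slowest configuration and why WriteAsync helps when there are few writers.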

Cross-library comparison (20 concurrent writers, record sizes 512 bytes to 1 MB):

=== Logfile Write (This library)
Total bytes written: 477 MB
Time taken: 576.585416ms
Throughput: 828.85 MB/s

=== Direct File Write+Flush ===
Total bytes written: 119 MB
Time taken: 1.022756167s
Throughput: 116.82 MB/s

=== Tidwall WAL Write ===
Total bytes written: 119 MB
Time taken: 1.117973042s
Throughput: 106.87 MB/s

=== Hashicorp Raft Write ===
Total bytes written: 119 MB
Time taken: 1.197531459s
Throughput: 99.77 MB/s

=== Pebble Record Write ===
Total bytes written: 477 MB
Time taken: 678.86125ms
Throughput: 703.98 MB/s

Usage

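Blocking writes (each call returns when the record is durable)

// Error handling is omitted for brevity; check every returned error in real code.
f, _ := os.OpenFile("my.log", os.O_CREATE|os.O_RDWR, 0644)

lf, _ := logfile.New(f, 0, 1024*1024, true) // start offset 0, 1MB buffer, CRC enabled

// Write blocks until the record is durable and returns its starting offset.
offset, _ := lf.Write(context.Background(), []byte("hello world"))

// Read back from the start of the file.
reader := logfile.NewReader(f, 0)
data, _ := reader.ReadNext(nil)
fmt.Printf("offset=%d data=%q\n", offset, data)

lf.Close()
f.Close()

Pipelined writes (high throughput from one writer)

// lf is an open *logfile.LogFile; records is assumed to be a [][]byte.
for _, rec := range records {
    _, _ = lf.WriteAsync(context.Background(), rec)
}
// Block until everything submitted above is on disk.
_ = lf.Flush(context.Background())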

License

MIT

Documentation

Overview

Package logfile implements a log file optimized for SSDs and concurrent writers. It is suitable for use as a write-ahead log (WAL).

Design:

  • Group Commit: a single background flusher batches concurrent or pipelined writes into one IO write and fsync operation, maximizing SSD throughput.
  • Aligned Writes: file appends are explicitly padded to 4KB sector boundaries to optimize for SSDs.
  • Record Framing: each record is prefixed with a 9-byte header (1-byte marker, 4-byte length, and 4-byte CRC32).
  • Zero heap allocations per write.
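The record framing in the list above can be sketched as follows. Only the header size and its three fields come from the description; the field order, the marker value `0x01`, little-endian encoding, and the IEEE CRC-32 polynomial are all assumptions made for this illustration, and `appendFrame` is a hypothetical helper rather than a package function.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

const (
	headerSize   = 9    // 1-byte marker + 4-byte length + 4-byte CRC32
	recordMarker = 0x01 // assumed marker value; the real one may differ
)

// appendFrame appends a framed record (header followed by payload) to dst.
func appendFrame(dst, payload []byte) []byte {
	var hdr [headerSize]byte
	hdr[0] = recordMarker
	binary.LittleEndian.PutUint32(hdr[1:5], uint32(len(payload)))        // assumed byte order
	binary.LittleEndian.PutUint32(hdr[5:9], crc32.ChecksumIEEE(payload)) // assumed polynomial
	dst = append(dst, hdr[:]...)
	return append(dst, payload...)
}

func main() {
	frame := appendFrame(nil, []byte("hello world"))
	fmt.Printf("frame: %d bytes (header %d + payload %d)\n",
		len(frame), headerSize, len(frame)-headerSize)
}
```

Appending into a caller-supplied slice, as `appendFrame` does, is one common way to keep the framing step free of per-record heap allocations once the buffer has grown to its working size.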

Writers append into an in-memory buffer. A single background flusher drains it using a ping-pong buffer so new appends can be staged while the previous batch is being fsynced. Two entry points are exposed:

  • Write blocks until the appended record is durable. Best for many concurrent goroutines, which naturally batch into one fsync.
  • WriteAsync returns as soon as the record is buffered. Best for a single writer that pipelines many records. Pair it with Flush to wait for durability at checkpoints.

The reader sequentially reads records using the 9-byte header and a provided buffer to avoid allocations, stopping on EOF or padding.
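A sequential read loop of the kind described above might look like the sketch below. It works on an in-memory byte slice rather than a file, and it assumes a header layout (marker `0x01`, little-endian length, IEEE CRC-32) that is not confirmed by the documentation. `readRecord`, `readAll`, and `appendFrame` are hypothetical helpers, not the package's Reader.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"io"
)

const (
	headerSize   = 9
	recordMarker = 0x01 // assumed marker value
)

var (
	errInvalidRecord = errors.New("invalid record")
	errCRCMismatch   = errors.New("crc checksum mismatch")
)

// appendFrame builds test input using the assumed header layout.
func appendFrame(dst, payload []byte) []byte {
	var hdr [headerSize]byte
	hdr[0] = recordMarker
	binary.LittleEndian.PutUint32(hdr[1:5], uint32(len(payload)))
	binary.LittleEndian.PutUint32(hdr[5:9], crc32.ChecksumIEEE(payload))
	return append(append(dst, hdr[:]...), payload...)
}

// readRecord decodes the record at off and returns its payload and the
// offset of the next record. It returns io.EOF at end of data or when it
// finds zero padding where a marker byte should be.
func readRecord(log []byte, off int) (payload []byte, next int, err error) {
	if off+headerSize > len(log) || log[off] == 0 {
		return nil, off, io.EOF
	}
	if log[off] != recordMarker {
		return nil, off, errInvalidRecord
	}
	n := int(binary.LittleEndian.Uint32(log[off+1:]))
	sum := binary.LittleEndian.Uint32(log[off+5:])
	start := off + headerSize
	if n > len(log)-start {
		return nil, off, errInvalidRecord
	}
	payload = log[start : start+n]
	if crc32.ChecksumIEEE(payload) != sum {
		return nil, off, errCRCMismatch
	}
	return payload, start + n, nil
}

// readAll reads records from the start of log until EOF or padding.
func readAll(log []byte) ([][]byte, error) {
	var recs [][]byte
	off := 0
	for {
		p, next, err := readRecord(log, off)
		if err == io.EOF {
			return recs, nil
		}
		if err != nil {
			return recs, err
		}
		recs = append(recs, p)
		off = next
	}
}

func main() {
	log := appendFrame(nil, []byte("hello"))
	log = appendFrame(log, []byte("world"))
	log = append(log, make([]byte, 64)...) // zero padding after the last record
	recs, err := readAll(log)
	fmt.Println(len(recs), "records, err =", err)
}
```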

Index

Constants

This section is empty.

Variables

var (
	ErrInvalidRecord    = errors.New("logfile: invalid record")
	ErrCRCMismatch      = errors.New("logfile: crc checksum mismatch")
	ErrClosed           = errors.New("logfile: closed")
	ErrRecordTooLarge   = errors.New("logfile: record too large")
	ErrWriterInBadState = errors.New("logfile: writer in bad state")
)

Functions

This section is empty.

Types

type File

type File interface {
	io.WriterAt
	io.ReaderAt // Needed for initial tail read.
	Sync() error
}
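Because File is an interface, an in-memory stand-in can replace a real file in tests (an *os.File also has all three methods). The sketch below redeclares the interface locally so that it compiles on its own; `memFile` is a hypothetical type and is not provided by the package.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"sync"
)

// File mirrors the interface shown above so this sketch is self-contained.
type File interface {
	io.WriterAt
	io.ReaderAt
	Sync() error
}

// memFile is a hypothetical in-memory File for tests.
type memFile struct {
	mu    sync.Mutex
	data  []byte
	syncs int // number of Sync calls observed
}

var _ File = (*memFile)(nil) // compile-time interface check

func (m *memFile) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, errors.New("memFile: negative offset")
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	end := int(off) + len(p)
	if end > len(m.data) {
		// Grow with zero bytes, like a sparse file extension.
		m.data = append(m.data, make([]byte, end-len(m.data))...)
	}
	copy(m.data[off:], p)
	return len(p), nil
}

func (m *memFile) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, errors.New("memFile: negative offset")
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	if off >= int64(len(m.data)) {
		return 0, io.EOF
	}
	n := copy(p, m.data[off:])
	if n < len(p) {
		return n, io.EOF // io.ReaderAt requires an error on short reads
	}
	return n, nil
}

func (m *memFile) Sync() error {
	m.mu.Lock()
	m.syncs++
	m.mu.Unlock()
	return nil
}

func main() {
	var f File = &memFile{}
	_, _ = f.WriteAt([]byte("hello"), 0)
	buf := make([]byte, 5)
	n, err := f.ReadAt(buf, 0)
	fmt.Printf("read %d bytes: %q (err=%v)\n", n, buf[:n], err)
}
```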

type LogFile

type LogFile struct {
	// contains filtered or unexported fields
}

func New

func New(file File, startOffset int64, maxBufSize int, withCRC bool) (*LogFile, error)

New returns a LogFile.

  • maxBufSize: buffer size threshold, in bytes, above which backpressure is applied.
  • withCRC: CRC is optional and can be written on a per-record basis.

func (*LogFile) Close

func (w *LogFile) Close() error

Close stops accepting new writes, lets the flusher drain whatever is buffered, and returns once the flusher has exited.

func (*LogFile) Flush

func (w *LogFile) Flush(ctx context.Context) error

Flush blocks until all currently-buffered writes are durable.

func (*LogFile) Write

func (w *LogFile) Write(ctx context.Context, data []byte) (int64, error)

Write appends data and blocks until it's durable. Write returns the starting logical file offset of the framed record. Many concurrent Write calls naturally batch into a single fsync.

Context cancellation: If the context is canceled after data has been appended to the internal buffer but before the sync completes, the caller receives a context error. However, the data may still be flushed to disk by the background flusher. Callers should not assume that a context error means the write was not persisted.

func (*LogFile) WriteAsync

func (w *LogFile) WriteAsync(ctx context.Context, data []byte) (int64, error)

WriteAsync appends data and returns as soon as it's buffered, without waiting for fsync. This is the entry point for a single writer that wants to pipeline many records: keep calling WriteAsync, then call Flush at a checkpoint to wait for durability.

The returned offset is reserved as if the write had completed. Backpressure still applies: WriteAsync may block briefly if the buffer is full.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader provides sequential reading of framed LogFile records. Reader is not safe for concurrent use unless the caller guards it with a mutex.

func NewReader

func NewReader(file io.ReaderAt, startOffset int64) *Reader

NewReader creates a new Reader starting at the specified logical file offset. This should match an offset returned by LogFile.Write or be zero.

func (*Reader) Offset

func (r *Reader) Offset() int64

Offset returns the offset of the next unread record.

func (*Reader) ReadNext

func (r *Reader) ReadNext(buf []byte) ([]byte, error)

ReadNext reads the next record from the LogFile. Returns io.EOF if there are no more records or if zero-padding is encountered. If the provided buffer doesn't have enough capacity to hold the payload, a new, right-sized buffer is allocated.
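The buffer rule described for ReadNext (reuse the caller's buffer when its capacity is sufficient, otherwise allocate one of exactly the needed size) can be sketched as a small helper. `payloadBuf` is a hypothetical name used only to show the rule; it is not how the package is known to implement it.

```go
package main

import "fmt"

// payloadBuf returns a slice of length n, reusing buf's backing array when
// it has enough capacity and allocating a right-sized buffer otherwise.
func payloadBuf(buf []byte, n int) []byte {
	if cap(buf) >= n {
		return buf[:n]
	}
	return make([]byte, n)
}

func main() {
	scratch := make([]byte, 0, 16)

	small := payloadBuf(scratch, 8) // fits: reuses scratch
	large := payloadBuf(scratch, 64) // too big: allocates a new buffer

	fmt.Println(len(small), cap(small))
	fmt.Println(len(large), cap(large))
}
```

A caller that passes the previously returned slice back into the next read can therefore process a stream of similarly sized records without allocating on every call.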
