goverseer

package module
v0.1.0 Latest
Published: Nov 26, 2025 | License: MIT | Imports: 8 | Imported by: 0

README

Goverseer

Production-ready process supervision for Go applications, inspired by Erlang/OTP.

Features

Erlang-style supervision trees with multiple restart strategies
🔄 Restart intensity limits to prevent crash loops
⏱️ Multiple backoff policies (exponential, linear, constant, jitter)
🔌 Dynamic child management at runtime
🛡️ Panic recovery with full stack traces
🎯 Graceful shutdown with configurable timeouts
📊 Event system for logging and metrics integration
🌲 Hierarchical supervisors for complex applications
🔒 Thread-safe using actor model pattern
📦 Zero external dependencies - pure Go stdlib

Installation

go get github.com/Gappylul/goverseer

Quick Start

package main

import (
    "context"
    "fmt"
    "log"
    "time"
    
    "github.com/Gappylul/goverseer"
)

func worker(ctx context.Context) error {
    ticker := time.NewTicker(time.Second)
    defer ticker.Stop()
    
    for {
        select {
        case <-ctx.Done():
            return nil
        case <-ticker.C:
            fmt.Println("Working...")
        }
    }
}

func main() {
    sup := goverseer.New(
        goverseer.OneForOne,
        goverseer.WithName("main-supervisor"),
        goverseer.WithIntensity(5, time.Minute),
        goverseer.WithChildren(
            goverseer.ChildSpec{
                Name:    "worker-1",
                Start:   worker,
                Restart: goverseer.Permanent,
            },
        ),
    )
    
    if err := sup.Start(); err != nil {
        log.Fatal(err)
    }
    
    log.Println("Supervisor started")
    
    if err := sup.Wait(); err != nil {
        log.Fatal(err)
    }
}

Documentation

  • GoDoc - Full API documentation
  • Examples - Working code examples

Examples

All examples are in the examples/ directory.

Testing

# Run all tests
make test

# Run tests with coverage
make test-coverage

# Run benchmarks
make bench

# Run linter
make lint

License

MIT License - see LICENSE file for details.

Acknowledgments

Inspired by Erlang/OTP's supervisor behavior and similar projects in other languages.

Documentation

Overview

Package goverseer provides production-ready process supervision for Go applications. It implements Erlang/OTP-style supervision trees with restart strategies, intensity limits, backoff policies, and hierarchical supervision.

Basic usage:

sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithName("my-supervisor"),
    goverseer.WithChildren(
        goverseer.ChildSpec{
            Name:    "worker",
            Start:   workerFunc,
            Restart: goverseer.Permanent,
        },
    ),
)
if err := sup.Start(); err != nil {
    log.Fatal(err)
}
if err := sup.Wait(); err != nil {
    log.Fatal(err)
}

Constants

This section is empty.

Variables

var (
	// ErrSupervisorStopped is returned when operations are attempted on a stopped supervisor.
	ErrSupervisorStopped = errors.New("supervisor is stopped")

	// ErrIntensityExceeded is returned when restart intensity limits are exceeded.
	// This indicates too many restarts occurred in the configured time window.
	ErrIntensityExceeded = errors.New("restart intensity exceeded")

	// ErrChildNotFound is returned when a child with the given name doesn't exist.
	ErrChildNotFound = errors.New("child not found")

	// ErrChildAlreadyExists is returned when adding a child with a name that's already in use.
	ErrChildAlreadyExists = errors.New("child already exists")

	// ErrInvalidShutdownTimeout is returned when shutdown timeout is invalid.
	ErrInvalidShutdownTimeout = errors.New("shutdown timeout must be positive")
)
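
These sentinel values lend themselves to comparison with errors.Is. The following is a minimal, stdlib-only sketch of that pattern. The variables errChildNotFound and errChildAlreadyExists are local stand-ins for the goverseer variables so that the snippet compiles on its own, and wrap and describe are hypothetical helpers, not part of the package. Whether goverseer wraps these errors before returning them is not stated here; errors.Is matches in both cases.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for goverseer.ErrChildNotFound and
// goverseer.ErrChildAlreadyExists (assumption: real code would
// compare against the package-level variables instead).
var (
	errChildNotFound      = errors.New("child not found")
	errChildAlreadyExists = errors.New("child already exists")
)

// wrap adds context to err while keeping it matchable by errors.Is.
func wrap(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// describe maps an error to a short hint. errors.Is matches the
// sentinel even when it has been wrapped with %w.
func describe(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errChildNotFound):
		return "no such child"
	case errors.Is(err, errChildAlreadyExists):
		return "name already in use"
	default:
		return "unexpected: " + err.Error()
	}
}

func main() {
	fmt.Println(describe(wrap("remove worker-9", errChildNotFound)))
	fmt.Println(describe(wrap("add worker-1", errChildAlreadyExists)))
	fmt.Println(describe(nil))
}
```

Comparing with == would also work for unwrapped errors, but errors.Is keeps the check valid if a caller adds context along the way.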

Functions

This section is empty.

Types

type BackoffPolicy

type BackoffPolicy interface {
	// ComputeDelay calculates the delay before the next restart attempt.
	// The restarts parameter indicates how many times this child has already restarted.
	ComputeDelay(restarts int) time.Duration
}

BackoffPolicy calculates restart delays based on the number of restarts. This helps prevent resource exhaustion from rapid restart loops.
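
Because BackoffPolicy is a single-method interface, any type with a matching ComputeDelay method can serve as a policy. The sketch below shows one way a custom policy might look. It redeclares the interface locally so that it compiles without the module, and scheduleBackoff and newSchedule are hypothetical names invented for this illustration. It assumes restarts is zero-based, as the interface comment suggests.

```go
package main

import (
	"fmt"
	"time"
)

// BackoffPolicy mirrors the interface documented above so the snippet
// is self-contained; real code would use goverseer.BackoffPolicy.
type BackoffPolicy interface {
	ComputeDelay(restarts int) time.Duration
}

// scheduleBackoff is a hypothetical policy that walks a fixed list of
// delays and repeats the last entry once the list is exhausted.
type scheduleBackoff struct {
	delays []time.Duration
}

func newSchedule(delays ...time.Duration) BackoffPolicy {
	return scheduleBackoff{delays: delays}
}

// ComputeDelay returns the delay for the given number of prior restarts.
func (s scheduleBackoff) ComputeDelay(restarts int) time.Duration {
	if len(s.delays) == 0 {
		return 0
	}
	if restarts < 0 {
		restarts = 0
	}
	if restarts >= len(s.delays) {
		return s.delays[len(s.delays)-1]
	}
	return s.delays[restarts]
}

func main() {
	p := newSchedule(250*time.Millisecond, time.Second, 10*time.Second)
	for restarts := 0; restarts < 5; restarts++ {
		fmt.Printf("restarts=%d delay=%v\n", restarts, p.ComputeDelay(restarts))
	}
}
```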

func ConstantBackoff

func ConstantBackoff(delay time.Duration) BackoffPolicy

ConstantBackoff creates a backoff policy with a fixed delay between restarts. This is useful when you want predictable restart timing.

Example: ConstantBackoff(time.Second) - All restarts wait 1 second

func ExponentialBackoff

func ExponentialBackoff(initial, max time.Duration) BackoffPolicy

ExponentialBackoff creates a backoff policy that doubles the delay with each restart. The delay starts at initial and is capped at max.

Example: ExponentialBackoff(100*time.Millisecond, 5*time.Second)
  - 1st restart: 100ms
  - 2nd restart: 200ms
  - 3rd restart: 400ms
  - 4th restart: 800ms
  - 5th restart: 1.6s
  - 6th restart: 3.2s
  - 7th+ restart: 5s (capped)
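
The doubling rule can be worked through with a few lines of plain Go. This is an illustrative model only, not the package's implementation: expDelay is a hypothetical helper, and it assumes the first restart corresponds to restarts == 0.

```go
package main

import (
	"fmt"
	"time"
)

// expDelay models a doubling backoff: initial * 2^restarts, capped at limit.
// Assumption: restarts counts earlier restarts, so 0 means the first restart.
func expDelay(initial, limit time.Duration, restarts int) time.Duration {
	d := initial
	for i := 0; i < restarts; i++ {
		d *= 2
		if d >= limit {
			return limit // stop early, which also avoids overflow
		}
	}
	if d > limit {
		return limit
	}
	return d
}

func main() {
	for restarts := 0; restarts < 7; restarts++ {
		d := expDelay(100*time.Millisecond, 5*time.Second, restarts)
		fmt.Printf("restart #%d waits %v\n", restarts+1, d)
	}
}
```

Under strict doubling from 100ms, the sixth restart waits 3.2s and the 5s cap first applies on the seventh.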

func JitterBackoff

func JitterBackoff(base BackoffPolicy, factor float64) BackoffPolicy

JitterBackoff wraps another backoff policy and adds random jitter. This helps prevent thundering herd problems when multiple processes restart simultaneously.

The factor determines the amount of jitter: 0.0 means no jitter, 1.0 means up to 100% jitter. The jitter is applied symmetrically (can increase or decrease the delay).

Example: JitterBackoff(ExponentialBackoff(1*time.Second, 10*time.Second), 0.2)
  - A 1s delay becomes 0.8s-1.2s (±20%)
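
Symmetric jitter can be sketched as a uniform offset in the range ±(delay × factor). The helpers jitterBounds and applyJitter below are hypothetical and use math/rand from the standard library; how the package itself draws its random values is not documented here.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitterBounds returns the smallest and largest delay that symmetric
// jitter with the given factor can produce around d.
func jitterBounds(d time.Duration, factor float64) (lo, hi time.Duration) {
	spread := time.Duration(float64(d) * factor)
	return d - spread, d + spread
}

// applyJitter shifts d by a random offset drawn uniformly from
// [-d*factor, +d*factor), so the result stays within jitterBounds.
func applyJitter(d time.Duration, factor float64) time.Duration {
	offset := (rand.Float64()*2 - 1) * float64(d) * factor
	return d + time.Duration(offset)
}

func main() {
	lo, hi := jitterBounds(time.Second, 0.2)
	fmt.Printf("bounds: %v to %v\n", lo, hi)
	for i := 0; i < 3; i++ {
		fmt.Println("sample:", applyJitter(time.Second, 0.2))
	}
}
```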

func LinearBackoff

func LinearBackoff(initial, increment, max time.Duration) BackoffPolicy

LinearBackoff creates a backoff policy that increases linearly with each restart. The delay starts at initial and increases by increment for each restart, capped at max.

Example: LinearBackoff(100*time.Millisecond, 500*time.Millisecond, 10*time.Second)
  - 1st restart: 100ms
  - 2nd restart: 600ms
  - 3rd restart: 1.1s
  - 4th restart: 1.6s
  - etc., capped at 10s

type ChildFunc

type ChildFunc func(ctx context.Context) error

ChildFunc is the function signature for a supervised child process. The function receives a context that will be canceled when the supervisor wants the child to stop. Children should monitor this context and exit gracefully.

Returning nil indicates normal exit. Returning an error indicates abnormal exit. Panics are automatically recovered and treated as abnormal exits.

Example:

func worker(ctx context.Context) error {
    ticker := time.NewTicker(time.Second)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            return nil // Graceful shutdown
        case <-ticker.C:
            if err := doWork(); err != nil {
                return err // Will trigger restart based on RestartType
            }
        }
    }
}
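
To make the panic rule concrete, here is a minimal sketch of how a wrapper could turn a panic into an ordinary error with a stack trace attached. It is an illustration under assumptions, not goverseer's code: runChild, exitKind, and the three sample children are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"runtime/debug"
)

// ChildFunc mirrors the signature documented above.
type ChildFunc func(ctx context.Context) error

// runChild runs fn and converts a panic into an error that carries
// the panic value and the stack trace captured at recovery time.
func runChild(ctx context.Context, fn ChildFunc) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("child panicked: %v\n%s", r, debug.Stack())
		}
	}()
	return fn(ctx)
}

// Sample children covering the three exit paths.
func healthy(ctx context.Context) error { return nil }
func failing(ctx context.Context) error { return fmt.Errorf("connection lost") }
func panicky(ctx context.Context) error { panic("boom") }

// exitKind classifies how a child finished: nil is a normal exit,
// while an error or a recovered panic is an abnormal one.
func exitKind(fn ChildFunc) string {
	if err := runChild(context.Background(), fn); err != nil {
		return "abnormal"
	}
	return "normal"
}

func main() {
	fmt.Println("healthy:", exitKind(healthy))
	fmt.Println("failing:", exitKind(failing))
	fmt.Println("panicky:", exitKind(panicky))
}
```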

type ChildSpec

type ChildSpec struct {
	// Name is the unique identifier for this child.
	// It's used for logging, metrics, and runtime management (AddChild, RemoveChild, etc.).
	Name string

	// Start is the function that runs the child process.
	// It receives a context that will be canceled when the child should stop.
	Start ChildFunc

	// Restart determines when this child should be restarted after exit.
	// - Permanent: Always restart (use for critical services)
	// - Transient: Restart only on error/panic (use for retriable tasks)
	// - Temporary: Never restart (use for one-off tasks)
	Restart RestartType
}

ChildSpec defines a child process specification. This struct describes how a child should be started and restarted.

type Event

type Event struct {
	// Time is when the event occurred.
	Time time.Time
	// ChildName is the name of the child involved in the event (if applicable).
	ChildName string
	// Type is the type of event.
	Type EventType
	// Err is any error associated with the event (if applicable).
	Err error
	// StackTrace contains the panic stack trace for ChildPanicked events.
	StackTrace string
}

Event represents a supervisor lifecycle event. Events are emitted for significant state changes and can be used for logging, metrics collection, and monitoring.

type EventHandler

type EventHandler func(e Event)

EventHandler is a function that processes supervisor events. Multiple handlers can be registered with WithEventHandler. Handlers should return quickly to avoid blocking the supervisor.

type EventType

type EventType int

EventType represents the type of supervisor event.

const (
	// ChildStarted is emitted when a child process starts.
	ChildStarted EventType = iota
	// ChildExited is emitted when a child process exits normally.
	ChildExited
	// ChildRestarted is emitted when a child process is restarted.
	ChildRestarted
	// SupervisorStopping is emitted when the supervisor begins shutdown.
	SupervisorStopping
	// SupervisorFailedIntensity is emitted when restart intensity is exceeded.
	SupervisorFailedIntensity
	// ChildPanicked is emitted when a child process panics.
	ChildPanicked
)

func (EventType) String

func (et EventType) String() string

String returns the string representation of an EventType.

type Option

type Option func(*Supervisor)

Option configures a Supervisor during creation.

func WithBackoff

func WithBackoff(policy BackoffPolicy) Option

WithBackoff sets the backoff policy for restart delays. The policy determines how long to wait before restarting a failed child.

Example:

sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithBackoff(
        goverseer.ExponentialBackoff(100*time.Millisecond, 5*time.Second),
    ),
)

func WithChildren

func WithChildren(specs ...ChildSpec) Option

WithChildren adds initial children to the supervisor. Children are not started automatically; call Start() to begin supervision.

Example:

sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithChildren(
        goverseer.ChildSpec{Name: "worker-1", Start: worker1Func, Restart: goverseer.Permanent},
        goverseer.ChildSpec{Name: "worker-2", Start: worker2Func, Restart: goverseer.Transient},
    ),
)

func WithContext

func WithContext(ctx context.Context) Option

WithContext sets a custom context for the supervisor instead of using context.Background(). The supervisor and all its children will be canceled when this context is canceled.

Example:

ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel() // release the timer; also avoids an unused-variable compile error
sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithContext(ctx),
)

func WithEventHandler

func WithEventHandler(handler EventHandler) Option

WithEventHandler adds an event handler to receive supervisor events. Multiple handlers can be registered by calling this option multiple times. Handlers should return quickly to avoid blocking the supervisor.

Example:

sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithEventHandler(func(e goverseer.Event) {
        log.Printf("[%s] %s: %v", e.Type, e.ChildName, e.Err)
    }),
)
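
One way to keep a handler fast is to hand events to a buffered channel and drop them when the buffer is full, so slow logging or metrics code never stalls the caller. The sketch below uses only the standard library. Event is a trimmed local stand-in for goverseer.Event, and asyncSink is a hypothetical type invented for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Event is a trimmed stand-in for goverseer.Event.
type Event struct {
	ChildName string
	Err       error
}

// asyncSink queues events for a slow consumer without blocking the
// caller; events that do not fit in the buffer are counted and dropped.
type asyncSink struct {
	queue   chan Event
	dropped atomic.Int64
}

func newAsyncSink(buffer int) *asyncSink {
	return &asyncSink{queue: make(chan Event, buffer)}
}

// Handle has the shape of an EventHandler and never blocks.
func (s *asyncSink) Handle(e Event) {
	select {
	case s.queue <- e:
	default:
		s.dropped.Add(1)
	}
}

// Dropped reports how many events were discarded so far.
func (s *asyncSink) Dropped() int64 { return s.dropped.Load() }

func main() {
	sink := newAsyncSink(2)
	for _, name := range []string{"a", "b", "c"} {
		sink.Handle(Event{ChildName: name})
	}
	close(sink.queue) // no more producers in this demo
	for e := range sink.queue {
		fmt.Println("processed", e.ChildName)
	}
	fmt.Println("dropped", sink.Dropped())
}
```

With the local Event type swapped for goverseer.Event, the method value sink.Handle has the func(e Event) shape that WithEventHandler accepts. Dropping events is a design trade-off; a sink that must not lose events would need a different strategy.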

func WithIntensity

func WithIntensity(maxRestarts int, window time.Duration) Option

WithIntensity sets restart intensity limits to prevent restart loops. If more than maxRestarts occur within the time window, the supervisor stops permanently and returns ErrIntensityExceeded.

Example:

// Allow up to 10 restarts per minute
sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithIntensity(10, time.Minute),
)
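
The intensity rule amounts to a sliding-window counter. The following stdlib-only model illustrates the idea; restartWindow and its methods are hypothetical, timestamps are expressed as offsets from an arbitrary start to keep the example deterministic, and the package's own bookkeeping may differ.

```go
package main

import (
	"fmt"
	"time"
)

// restartWindow tracks restart times within a sliding window.
type restartWindow struct {
	maxRestarts int
	window      time.Duration
	stamps      []time.Duration // offsets from an arbitrary start time
}

func newRestartWindow(maxRestarts int, window time.Duration) *restartWindow {
	return &restartWindow{maxRestarts: maxRestarts, window: window}
}

// record notes a restart at offset at and reports whether the number
// of restarts inside the window is still within the limit.
func (w *restartWindow) record(at time.Duration) bool {
	cutoff := at - w.window
	kept := w.stamps[:0]
	for _, s := range w.stamps {
		if s > cutoff { // keep only restarts still inside the window
			kept = append(kept, s)
		}
	}
	w.stamps = append(kept, at)
	return len(w.stamps) <= w.maxRestarts
}

func main() {
	w := newRestartWindow(2, time.Minute)
	for _, at := range []time.Duration{0, 10 * time.Second, 20 * time.Second} {
		fmt.Printf("restart at %v allowed: %t\n", at, w.record(at))
	}
}
```

In this model the third restart within one minute exceeds a limit of two, which is the point at which a supervisor would give up with ErrIntensityExceeded.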

func WithName

func WithName(name string) Option

WithName sets the supervisor's name for logging and debugging.

Example:

sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithName("http-supervisor"),
)

func WithShutdownTimeout

func WithShutdownTimeout(timeout time.Duration) Option

WithShutdownTimeout sets the maximum time to wait for children to stop gracefully. After this timeout, the supervisor will exit even if children are still running. The default is 30 seconds. If timeout is <= 0, the default is used.

Example:

sup := goverseer.New(
    goverseer.OneForOne,
    goverseer.WithShutdownTimeout(10*time.Second),
)
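
The timeout behaviour described above follows a common Go pattern: wait on a completion channel, but give up when a timer fires. A minimal sketch, assuming a done channel that is closed once all children have exited (waitWithTimeout is a hypothetical helper, not a goverseer function):

```go
package main

import (
	"fmt"
	"time"
)

// waitWithTimeout reports whether done was closed before timeout elapsed.
func waitWithTimeout(done <-chan struct{}, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	fast := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond) // child cleans up quickly
		close(fast)
	}()
	fmt.Println("fast child stopped in time:", waitWithTimeout(fast, time.Second))

	stuck := make(chan struct{}) // never closed: child ignores ctx.Done()
	fmt.Println("stuck child stopped in time:", waitWithTimeout(stuck, 50*time.Millisecond))
}
```

A child that never checks ctx.Done() behaves like the stuck channel here, which is why children should return promptly once their context is canceled.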

type RestartType

type RestartType int

RestartType determines when a child should be restarted.

const (
	// Permanent children are always restarted, even on normal exit.
	// Use this for critical services that must always be running.
	Permanent RestartType = iota

	// Transient children are restarted only if they exit abnormally (error or panic).
	// Use this for tasks that can complete successfully but should retry on failure.
	Transient

	// Temporary children are never restarted.
	// Use this for one-off initialization tasks or operations that should not retry.
	Temporary
)
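
The three restart types reduce to a small decision table. The sketch below encodes it with locally redeclared constants so that it runs without the module; shouldRestart is a hypothetical helper, and a recovered panic is modelled as a non-nil error.

```go
package main

import (
	"errors"
	"fmt"
)

// RestartType and its constants mirror the declarations above.
type RestartType int

const (
	Permanent RestartType = iota
	Transient
	Temporary
)

// errCrash stands in for any abnormal exit (error or recovered panic).
var errCrash = errors.New("crash")

// shouldRestart reports whether a child with restart type rt is
// restarted after exiting with exitErr (nil means a normal exit).
func shouldRestart(rt RestartType, exitErr error) bool {
	switch rt {
	case Permanent:
		return true // always, even on normal exit
	case Transient:
		return exitErr != nil // only on abnormal exit
	default:
		return false // Temporary: never
	}
}

func main() {
	fmt.Println("Permanent, normal exit:", shouldRestart(Permanent, nil))
	fmt.Println("Transient, normal exit:", shouldRestart(Transient, nil))
	fmt.Println("Transient, crash:      ", shouldRestart(Transient, errCrash))
	fmt.Println("Temporary, crash:      ", shouldRestart(Temporary, errCrash))
}
```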

func (RestartType) String

func (rt RestartType) String() string

String returns the string representation of a RestartType.

type Strategy

type Strategy int

Strategy defines how children are restarted when one fails.

const (
	// OneForOne restarts only the failed child.
	// Use this when children are independent and can fail/restart individually.
	OneForOne Strategy = iota

	// OneForAll stops all children and restarts all when one fails.
	// Use this when children are tightly coupled and depend on each other.
	OneForAll

	// RestForOne stops and restarts the failed child and all children started after it.
	// Use this when children have startup dependencies (e.g., A must start before B).
	RestForOne

	// SimpleOneForOne is for dynamic worker pools where children are added/removed at runtime.
	// Behaves like OneForOne but optimized for many similar children.
	SimpleOneForOne
)
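
The difference between the strategies is easiest to see by asking which children are restarted when one of them fails. The following model answers that for a list of children in start order. It redeclares the constants locally, affected is a hypothetical helper, and SimpleOneForOne is treated like OneForOne as the comment above describes.

```go
package main

import "fmt"

// Strategy and its constants mirror the declarations above.
type Strategy int

const (
	OneForOne Strategy = iota
	OneForAll
	RestForOne
	SimpleOneForOne
)

// affected returns the children that are restarted when the child at
// index failed exits abnormally. children must be in start order.
func affected(s Strategy, children []string, failed int) []string {
	if failed < 0 || failed >= len(children) {
		return nil
	}
	switch s {
	case OneForAll:
		return append([]string(nil), children...)
	case RestForOne:
		return append([]string(nil), children[failed:]...)
	default: // OneForOne and SimpleOneForOne
		return []string{children[failed]}
	}
}

func main() {
	children := []string{"db", "cache", "api"}
	fmt.Println("OneForOne: ", affected(OneForOne, children, 1))
	fmt.Println("OneForAll: ", affected(OneForAll, children, 1))
	fmt.Println("RestForOne:", affected(RestForOne, children, 1))
}
```

With "cache" failing, OneForOne touches only "cache", OneForAll touches all three children, and RestForOne touches "cache" and "api".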

func (Strategy) String

func (s Strategy) String() string

String returns the string representation of a Strategy.

type Supervisor

type Supervisor struct {
	// contains filtered or unexported fields
}

Supervisor manages child processes with configurable restart strategies and intensity limits. It provides fault tolerance by automatically restarting failed children according to configured policies. Supervisors can be nested to create supervision trees.

All methods are safe for concurrent use. The supervisor uses an actor model internally to ensure race-free state management.
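
For readers unfamiliar with the actor pattern in Go, the sketch below shows the general idea: one goroutine owns the state and every other goroutine talks to it through a channel, so no mutex is needed. This is a generic illustration with hypothetical names (registry, addReq, countAccepted); it does not reproduce the supervisor's internals.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errExists = errors.New("child already exists")

// addReq asks the actor to register a name; the result comes back on reply.
type addReq struct {
	name  string
	reply chan error
}

// registry exposes only a request channel; its state lives in loop.
type registry struct {
	requests chan addReq
}

func newRegistry() *registry {
	r := &registry{requests: make(chan addReq)}
	go r.loop()
	return r
}

// loop is the actor: it is the only goroutine that touches names.
func (r *registry) loop() {
	names := make(map[string]bool)
	for req := range r.requests {
		if names[req.name] {
			req.reply <- errExists
			continue
		}
		names[req.name] = true
		req.reply <- nil
	}
}

// Add is safe to call from any goroutine.
func (r *registry) Add(name string) error {
	reply := make(chan error, 1)
	r.requests <- addReq{name: name, reply: reply}
	return <-reply
}

// countAccepted fires n concurrent Add calls for the same name and
// returns how many of them succeeded.
func countAccepted(r *registry, name string, n int) int {
	var wg sync.WaitGroup
	results := make(chan error, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			results <- r.Add(name)
		}()
	}
	wg.Wait()
	close(results)
	accepted := 0
	for err := range results {
		if err == nil {
			accepted++
		}
	}
	return accepted
}

func main() {
	r := newRegistry()
	fmt.Println("accepted:", countAccepted(r, "worker-1", 50))
	fmt.Println("second add:", r.Add("worker-1"))
}
```

Because requests are processed one at a time, exactly one of the concurrent Add calls for a given name succeeds, regardless of scheduling.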

func New

func New(strategy Strategy, opts ...Option) *Supervisor

func (*Supervisor) AddChild

func (s *Supervisor) AddChild(spec ChildSpec) error

AddChild dynamically adds a child to the supervisor at runtime. The child is started immediately. If a child with the same name already exists, returns ErrChildAlreadyExists.

This operation is safe to call from any goroutine.

func (*Supervisor) RemoveChild

func (s *Supervisor) RemoveChild(name string) error

RemoveChild removes a child from the supervisor and stops it gracefully. If the child doesn't exist, returns ErrChildNotFound.

This operation is safe to call from any goroutine.

func (*Supervisor) RestartChild

func (s *Supervisor) RestartChild(name string) error

RestartChild manually restarts a specific child by name. The child is stopped and a new instance is started with the same specification. If the child doesn't exist, returns ErrChildNotFound.

This operation is safe to call from any goroutine.

func (*Supervisor) Start

func (s *Supervisor) Start() error

Start starts the supervisor and all its children in order. Children are started sequentially, and if any child fails to start, Start returns an error immediately without starting remaining children.

Returns ErrSupervisorStopped if the supervisor has already been stopped.

func (*Supervisor) Stop

func (s *Supervisor) Stop() error

Stop gracefully stops the supervisor and all its children. It cancels the supervisor's context, waits for all children to exit (up to the configured shutdown timeout), and returns any final error.

This method blocks until shutdown is complete.

func (*Supervisor) Wait

func (s *Supervisor) Wait() error

Wait blocks until the supervisor stops and returns any error that caused it to stop. This includes errors from intensity limit violations or context cancellation.

Use this in your main function to keep the supervisor running:

if err := sup.Wait(); err != nil {
    log.Fatal(err)
}

Directories

Path
  examples/basic (command)
  examples/error_handling (command)
  examples/hierarchial (command)
  examples/restart_types (command)
  examples/web_server (command)
  examples/worker_pool (command)
