Documentation ¶
Overview ¶
Package goverseer provides production-ready process supervision for Go applications. It implements Erlang/OTP-style supervision trees with restart strategies, intensity limits, backoff policies, and hierarchical supervision.
Basic usage:
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithName("my-supervisor"),
	goverseer.WithChildren(
		goverseer.ChildSpec{
			Name:    "worker",
			Start:   workerFunc,
			Restart: goverseer.Permanent,
		},
	),
)
sup.Start()
sup.Wait()
Index ¶
- Variables
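- type BackoffPolicy
- func ConstantBackoff(delay time.Duration) BackoffPolicy
- func ExponentialBackoff(initial, max time.Duration) BackoffPolicy
- func JitterBackoff(base BackoffPolicy, factor float64) BackoffPolicy
- func LinearBackoff(initial, increment, max time.Duration) BackoffPolicy
- type ChildFunc
- type ChildSpec
- type Event
- type EventHandler
- type EventType
- type Option
- func WithBackoff(policy BackoffPolicy) Option
- func WithChildren(specs ...ChildSpec) Option
- func WithContext(ctx context.Context) Option
- func WithEventHandler(handler EventHandler) Option
- func WithIntensity(maxRestarts int, window time.Duration) Option
- func WithName(name string) Option
- func WithShutdownTimeout(timeout time.Duration) Option
- type RestartType
- func (rt RestartType) String() string
- type Strategy
- type Supervisor
- func New(strategy Strategy, opts ...Option) *Supervisor
- func (s *Supervisor) AddChild(spec ChildSpec) error
- func (s *Supervisor) RemoveChild(name string) error
- func (s *Supervisor) RestartChild(name string) error
- func (s *Supervisor) Start() error
- func (s *Supervisor) Stop() error
- func (s *Supervisor) Wait() error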
Constants ¶
This section is empty.
Variables ¶
var (
	// ErrSupervisorStopped is returned when operations are attempted on a stopped supervisor.
	ErrSupervisorStopped = errors.New("supervisor is stopped")
	// ErrIntensityExceeded is returned when restart intensity limits are exceeded.
	// This indicates too many restarts occurred in the configured time window.
	ErrIntensityExceeded = errors.New("restart intensity exceeded")
	// ErrChildNotFound is returned when a child with the given name doesn't exist.
	ErrChildNotFound = errors.New("child not found")
	// ErrChildAlreadyExists is returned when adding a child with a name that's already in use.
	ErrChildAlreadyExists = errors.New("child already exists")
	// ErrInvalidShutdownTimeout is returned when shutdown timeout is invalid.
	ErrInvalidShutdownTimeout = errors.New("shutdown timeout must be positive")
)
Functions ¶
This section is empty.
Types ¶
type BackoffPolicy ¶
type BackoffPolicy interface {
	// ComputeDelay calculates the delay before the next restart attempt.
	// The restarts parameter indicates how many times this child has already restarted.
	ComputeDelay(restarts int) time.Duration
}
BackoffPolicy calculates restart delays based on the number of restarts. This helps prevent resource exhaustion from rapid restart loops.
func ConstantBackoff ¶
func ConstantBackoff(delay time.Duration) BackoffPolicy
ConstantBackoff creates a backoff policy with a fixed delay between restarts. This is useful when you want predictable restart timing.
Example: ConstantBackoff(time.Second) - All restarts wait 1 second
func ExponentialBackoff ¶
func ExponentialBackoff(initial, max time.Duration) BackoffPolicy
ExponentialBackoff creates a backoff policy that doubles the delay with each restart. The delay starts at initial and is capped at max.
Example: ExponentialBackoff(100*time.Millisecond, 5*time.Second)
- 1st restart: 100ms
- 2nd restart: 200ms
- 3rd restart: 400ms
- 4th restart: 800ms
- 5th restart: 1.6s
- 6th restart: 3.2s
- 7th+ restart: 5s (capped)
func JitterBackoff ¶
func JitterBackoff(base BackoffPolicy, factor float64) BackoffPolicy
JitterBackoff wraps another backoff policy and adds random jitter. This helps prevent thundering herd problems when multiple processes restart simultaneously.
The factor determines the amount of jitter: 0.0 means no jitter, 1.0 means up to 100% jitter. The jitter is applied symmetrically (can increase or decrease the delay).
Example: JitterBackoff(ExponentialBackoff(1*time.Second, 10*time.Second), 0.2) - A 1s delay becomes 0.8s-1.2s (±20%)
func LinearBackoff ¶
func LinearBackoff(initial, increment, max time.Duration) BackoffPolicy
LinearBackoff creates a backoff policy that increases linearly with each restart. The delay starts at initial and increases by increment for each restart, capped at max.
Example: LinearBackoff(100*time.Millisecond, 500*time.Millisecond, 10*time.Second)
- 1st restart: 100ms
- 2nd restart: 600ms
- 3rd restart: 1.1s
- 4th restart: 1.6s
- etc., capped at 10s
type ChildFunc ¶
type ChildFunc func(ctx context.Context) error
ChildFunc is the function signature for a supervised child process. The function receives a context that will be canceled when the supervisor wants the child to stop. Children should monitor this context and exit gracefully.
Returning nil indicates normal exit. Returning an error indicates abnormal exit. Panics are automatically recovered and treated as abnormal exits.
Example:
func worker(ctx context.Context) error {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return nil // Graceful shutdown
		case <-ticker.C:
			if err := doWork(); err != nil {
				return err // Will trigger restart based on RestartType
			}
		}
	}
}
type ChildSpec ¶
type ChildSpec struct {
	// Name is the unique identifier for this child.
	// It's used for logging, metrics, and runtime management (AddChild, RemoveChild, etc.).
	Name string
	// Start is the function that runs the child process.
	// It receives a context that will be canceled when the child should stop.
	Start ChildFunc
	// Restart determines when this child should be restarted after exit.
	// - Permanent: Always restart (use for critical services)
	// - Transient: Restart only on error/panic (use for retriable tasks)
	// - Temporary: Never restart (use for one-off tasks)
	Restart RestartType
}
ChildSpec defines a child process specification. This struct describes how a child should be started and restarted.
type Event ¶
type Event struct {
	// Time is when the event occurred.
	Time time.Time
	// ChildName is the name of the child involved in the event (if applicable).
	ChildName string
	// Type is the type of event.
	Type EventType
	// Err is any error associated with the event (if applicable).
	Err error
	// StackTrace contains the panic stack trace for ChildPanicked events.
	StackTrace string
}
Event represents a supervisor lifecycle event. Events are emitted for significant state changes and can be used for logging, metrics collection, and monitoring.
type EventHandler ¶
type EventHandler func(e Event)
EventHandler is a function that processes supervisor events. Multiple handlers can be registered with WithEventHandler. Handlers should return quickly to avoid blocking the supervisor.
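One way to keep a handler fast is to hand events to a buffered channel and let another goroutine do the slow work. The sketch below uses simplified local copies of Event and EventHandler so it compiles on its own; bufferedHandler is a hypothetical helper, and dropping events when the queue is full is a design choice of this sketch, not documented package behavior.

```go
package main

import "fmt"

// Event and EventHandler are simplified local stand-ins for the documented types.
type Event struct {
	ChildName string
	Err       error
}

type EventHandler func(e Event)

// bufferedHandler returns a handler that never blocks: it enqueues the event
// if there is room and silently drops it otherwise.
func bufferedHandler(size int) (EventHandler, <-chan Event) {
	ch := make(chan Event, size)
	h := func(e Event) {
		select {
		case ch <- e:
		default: // queue full: drop rather than stall the caller
		}
	}
	return h, ch
}

func main() {
	h, events := bufferedHandler(8)
	h(Event{ChildName: "worker"})
	fmt.Println((<-events).ChildName)
}
```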
type EventType ¶
type EventType int
EventType represents the type of supervisor event.
const (
	// ChildStarted is emitted when a child process starts.
	ChildStarted EventType = iota
	// ChildExited is emitted when a child process exits normally.
	ChildExited
	// ChildRestarted is emitted when a child process is restarted.
	ChildRestarted
	// SupervisorStopping is emitted when the supervisor begins shutdown.
	SupervisorStopping
	// SupervisorFailedIntensity is emitted when restart intensity is exceeded.
	SupervisorFailedIntensity
	// ChildPanicked is emitted when a child process panics.
	ChildPanicked
)
type Option ¶
type Option func(*Supervisor)
Option configures a Supervisor during creation.
func WithBackoff ¶
func WithBackoff(policy BackoffPolicy) Option
WithBackoff sets the backoff policy for restart delays. The policy determines how long to wait before restarting a failed child.
Example:
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithBackoff(
		goverseer.ExponentialBackoff(100*time.Millisecond, 5*time.Second),
	),
)
func WithChildren ¶
func WithChildren(specs ...ChildSpec) Option
WithChildren adds initial children to the supervisor. Children are not started automatically; call Start() to begin supervision.
Example:
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithChildren(
		goverseer.ChildSpec{Name: "worker-1", Start: worker1Func, Restart: goverseer.Permanent},
		goverseer.ChildSpec{Name: "worker-2", Start: worker2Func, Restart: goverseer.Transient},
	),
)
func WithContext ¶
func WithContext(ctx context.Context) Option
WithContext sets a custom context for the supervisor instead of using context.Background(). The supervisor and all its children will be canceled when this context is canceled.
Example:
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel() // release the timeout's resources when the surrounding function returns
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithContext(ctx),
)
func WithEventHandler ¶
func WithEventHandler(handler EventHandler) Option
WithEventHandler adds an event handler to receive supervisor events. Multiple handlers can be registered by calling this option multiple times. Handlers should return quickly to avoid blocking the supervisor.
Example:
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithEventHandler(func(e goverseer.Event) {
		// %v is used for e.Type because no String method is documented for EventType.
		log.Printf("[%v] %s: %v", e.Type, e.ChildName, e.Err)
	}),
)
func WithIntensity ¶
func WithIntensity(maxRestarts int, window time.Duration) Option
WithIntensity sets restart intensity limits to prevent restart loops. If more than maxRestarts occur within the time window, the supervisor stops permanently and returns ErrIntensityExceeded.
Example:
// Allow up to 10 restarts per minute
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithIntensity(10, time.Minute),
)
func WithName ¶
func WithName(name string) Option
WithName sets the supervisor's name for logging and debugging.
Example:
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithName("http-supervisor"),
)
func WithShutdownTimeout ¶
func WithShutdownTimeout(timeout time.Duration) Option
WithShutdownTimeout sets the maximum time to wait for children to stop gracefully. After this timeout, the supervisor will exit even if children are still running. The default is 30 seconds. If timeout is <= 0, the default is used.
Example:
sup := goverseer.New(
	goverseer.OneForOne,
	goverseer.WithShutdownTimeout(10*time.Second),
)
type RestartType ¶
type RestartType int
RestartType determines when a child should be restarted.
const (
	// Permanent children are always restarted, even on normal exit.
	// Use this for critical services that must always be running.
	Permanent RestartType = iota
	// Transient children are restarted only if they exit abnormally (error or panic).
	// Use this for tasks that can complete successfully but should retry on failure.
	Transient
	// Temporary children are never restarted.
	// Use this for one-off initialization tasks or operations that should not retry.
	Temporary
)
func (RestartType) String ¶
func (rt RestartType) String() string
String returns the string representation of a RestartType.
type Strategy ¶
type Strategy int
Strategy defines how children are restarted when one fails.
const (
	// OneForOne restarts only the failed child.
	// Use this when children are independent and can fail/restart individually.
	OneForOne Strategy = iota
	// OneForAll stops all children and restarts all when one fails.
	// Use this when children are tightly coupled and depend on each other.
	OneForAll
	// RestForOne stops and restarts the failed child and all children started after it.
	// Use this when children have startup dependencies (e.g., A must start before B).
	RestForOne
	// SimpleOneForOne is for dynamic worker pools where children are added/removed at runtime.
	// Behaves like OneForOne but optimized for many similar children.
	SimpleOneForOne
)
type Supervisor ¶
type Supervisor struct {
	// contains filtered or unexported fields
}
Supervisor manages child processes with configurable restart strategies and intensity limits. It provides fault tolerance by automatically restarting failed children according to configured policies. Supervisors can be nested to create supervision trees.
All methods are safe for concurrent use. The supervisor uses an actor model internally to ensure race-free state management.
func New ¶
func New(strategy Strategy, opts ...Option) *Supervisor
func (*Supervisor) AddChild ¶
func (s *Supervisor) AddChild(spec ChildSpec) error
AddChild dynamically adds a child to the supervisor at runtime. The child is started immediately. If a child with the same name already exists, returns ErrChildAlreadyExists.
This operation is safe to call from any goroutine.
func (*Supervisor) RemoveChild ¶
func (s *Supervisor) RemoveChild(name string) error
RemoveChild removes a child from the supervisor and stops it gracefully. If the child doesn't exist, returns ErrChildNotFound.
This operation is safe to call from any goroutine.
func (*Supervisor) RestartChild ¶
func (s *Supervisor) RestartChild(name string) error
RestartChild manually restarts a specific child by name. The child is stopped and a new instance is started with the same specification. If the child doesn't exist, returns ErrChildNotFound.
This operation is safe to call from any goroutine.
func (*Supervisor) Start ¶
func (s *Supervisor) Start() error
Start starts the supervisor and all its children in order. Children are started sequentially, and if any child fails to start, Start returns an error immediately without starting remaining children.
Returns ErrSupervisorStopped if the supervisor has already been stopped.
func (*Supervisor) Stop ¶
func (s *Supervisor) Stop() error
Stop gracefully stops the supervisor and all its children. It cancels the supervisor's context, waits for all children to exit (up to the configured shutdown timeout), and returns any final error.
This method blocks until shutdown is complete.
func (*Supervisor) Wait ¶
func (s *Supervisor) Wait() error
Wait blocks until the supervisor stops and returns any error that caused it to stop. This includes errors from intensity limit violations or context cancellation.
Use this in your main function to keep the supervisor running:
if err := sup.Wait(); err != nil {
	log.Fatal(err)
}
Directories ¶
| Path | Synopsis |
|---|---|
| examples | |
| examples/basic (command) | |
| examples/error_handling (command) | |
| examples/hierarchial (command) | |
| examples/restart_types (command) | |
| examples/web_server (command) | |
| examples/worker_pool (command) | |