Documentation ¶
Overview ¶
Package simplevisor provides a simple, lightweight process supervisor for managing long-running goroutines with automatic restart capabilities and graceful shutdown.
Overview ¶
Simplevisor manages multiple named processes (goroutines) with configurable restart policies, panic recovery, and coordinated shutdown. It's designed for applications that need reliable background process management without complex dependencies.
Basic Usage ¶
supervisor := simplevisor.New(5*time.Second, logger)
// Register a simple process
supervisor.Register("worker", func(ctx context.Context) error {
for {
select {
case <-ctx.Done():
return ctx.Err() // Graceful shutdown
case work := <-workChan:
processWork(work)
}
}
})
// Start all processes
supervisor.Run()
// Wait for shutdown signal and cleanup
supervisor.WaitOnShutdownSignal(func() {
    // Optional cleanup callback
    cleanupResources()
})
Restart Policies ¶
Configure automatic restart behavior for processes:
// Never restart
supervisor.Register("one-shot", handler)
// Always restart (even on successful completion)
supervisor.Register("persistent", handler,
simplevisor.WithRestart(simplevisor.RestartAlways, 5, 2*time.Second))
// Only restart on failures/panics
supervisor.Register("resilient", handler,
simplevisor.WithRestart(simplevisor.RestartOnFailure, 3, 1*time.Second))
Panic Recovery ¶
Handle panics in processes with custom recovery logic:
supervisor.Register("risky-process", riskyHandler,
simplevisor.WithRecover(func(recovered interface{}) {
log.Printf("Process panicked: %v", recovered)
// Send alert, record metrics, etc.
}))
Process Monitoring ¶
Monitor process status during runtime:
// Check if a process is currently running
if supervisor.IsRunning("worker") {
    log.Println("Worker is active")
}
// Get detailed process status
status, err := supervisor.GetProcessStatus("worker")
if err != nil {
    log.Printf("Process not found: %v", err)
    return
}

switch status {
case simplevisor.StatusRunning:
    // Process is active
case simplevisor.StatusStopped:
    // Process has stopped (will restart based on policy)
case simplevisor.StatusRestarting:
    // Process is restarting after failure/completion
}
// Get total number of registered processes
count := supervisor.ProcessCount()
Graceful Shutdown ¶
Simplevisor provides coordinated shutdown with timeout protection:
// Automatic shutdown on OS signals
supervisor.WaitOnShutdownSignal(nil)

// Manual shutdown
supervisor.Shutdown()
During shutdown:

1. All process contexts are cancelled
2. Processes should handle ctx.Done() and return gracefully
3. The supervisor waits for all processes to finish (with timeout)
4. The optional teardown callback is executed
Context-Based Cancellation ¶
All processes receive a context for cancellation detection:
func workerProcess(ctx context.Context) error {
    ticker := time.NewTicker(1 * time.Second)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            // Cleanup and exit gracefully
            cleanup()
            return ctx.Err()
        case <-ticker.C:
            // Do periodic work
            doWork()
        }
    }
}
Thread Safety ¶
Simplevisor is thread-safe for:

- Process registration (before Run() is called)
- Status queries (GetProcessStatus, IsRunning, ProcessCount)
However, the supervisor itself should be used from the main goroutine, particularly for Run() and WaitOnShutdownSignal().
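For example, a monitoring goroutine can poll the thread-safe query methods while Run() and WaitOnShutdownSignal() stay in the main goroutine. This is a minimal sketch; the polling interval and the "worker" name are illustrative.

go func() {
    ticker := time.NewTicker(10 * time.Second)
    defer ticker.Stop()
    for range ticker.C {
        // Status queries are safe to call from a side goroutine.
        if !supervisor.IsRunning("worker") {
            log.Println("worker is not running")
        }
        log.Printf("registered processes: %d", supervisor.ProcessCount())
    }
}()

// Run() and WaitOnShutdownSignal() remain in the main goroutine.
supervisor.Run()
supervisor.WaitOnShutdownSignal(nil)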
Best Practices ¶
1. Register all processes before calling Run() (duplicate names will panic)
2. Use context cancellation for graceful shutdown in process handlers
3. Set appropriate restart limits to prevent infinite restart loops
4. Use panic recovery for critical processes that must stay running
5. Keep process handlers lightweight and delegate heavy work to other goroutines (see the sketch after this list)
6. Always handle ctx.Done() in process main loops
7. Let the supervisor manage process lifecycles - no manual start/stop needed
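A minimal sketch of practices 2, 5, and 6: the handler stays lightweight by delegating each job to its own goroutine while the main loop keeps watching ctx.Done(). jobChan and handleJob are placeholders for your own queue and work function.

supervisor.Register("dispatcher", func(ctx context.Context) error {
    for {
        select {
        case <-ctx.Done():
            return ctx.Err() // always honor cancellation in the main loop
        case job := <-jobChan:
            // Delegate the heavy lifting so the handler itself stays responsive.
            go handleJob(ctx, job)
        }
    }
})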
Configuration Options ¶
Restart Policy Options:
- RestartNever: Process runs once and stops (default)
- RestartAlways: Process restarts regardless of exit condition
- RestartOnFailure: Process restarts only on errors or panics
Default Values:
- Shutdown timeout: 5 seconds
- Max restarts: 3
- Restart delay: 1 second
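For illustration only, the defaults above correspond to values you could also spell out explicitly; whether omitted options fall back to exactly these values is up to the implementation.

// 5-second shutdown timeout, matching the documented default.
supervisor := simplevisor.New(5*time.Second, logger)

// No options: RestartNever, the default policy.
supervisor.Register("one-shot", handler)

// Explicitly passing the documented default restart limit and delay.
supervisor.Register("retryable", handler,
    simplevisor.WithRestart(simplevisor.RestartOnFailure, 3, 1*time.Second))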
Error Handling ¶
Processes should return errors for failure conditions:
func databaseWorker(ctx context.Context) error {
    db, err := connectDB()
    if err != nil {
        return fmt.Errorf("failed to connect: %w", err)
    }
    defer db.Close()

    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
            if err := processDBWork(db); err != nil {
                return fmt.Errorf("db work failed: %w", err)
            }
        }
    }
}
Returning an error will trigger restart behavior based on the configured policy.
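For example, pairing the handler above with RestartOnFailure means each returned error (or panic) counts toward the restart limit; the limit and delay here are arbitrary illustrative values.

supervisor.Register("db-worker", databaseWorker,
    simplevisor.WithRestart(simplevisor.RestartOnFailure, 5, 2*time.Second))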
OpenTelemetry Metrics ¶
Simplevisor provides comprehensive OpenTelemetry metrics for monitoring:
supervisor := simplevisor.New(5*time.Second, logger)
// Enable metrics (optional)
if err := supervisor.EnableMetrics(); err != nil {
    log.Fatal(err)
}
Key metrics include:

- simplevisor_processes_running: Currently running processes (UpDownCounter)
- simplevisor_process_restart_count: Current restart count per process (Gauge)
- simplevisor_process_status: Process status (Gauge: 1=running, 0=stopped, -1=restarting)
- simplevisor_process_started_total: Process start events (Counter)
- simplevisor_process_stopped_total: Process stop events by reason (Counter)
- simplevisor_process_panics_total: Process panic events (Counter)
- simplevisor_restart_limit_exceeded_total: Critical restart failures (Counter)
Metrics are automatically recorded when EnableMetrics() is called.
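Exporting these metrics still requires a meter provider configured in the application. The sketch below wires a Prometheus exporter from the OpenTelemetry Go SDK and assumes simplevisor picks up the global meter provider; that assumption, along with the port and endpoint, is illustrative rather than part of this package's API.

import (
    "log"
    "net/http"

    "github.com/prometheus/client_golang/prometheus/promhttp"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/exporters/prometheus"
    sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func setupMetricsExport() {
    // Assumption: EnableMetrics uses the global meter provider set here.
    exporter, err := prometheus.New()
    if err != nil {
        log.Fatal(err)
    }
    otel.SetMeterProvider(sdkmetric.NewMeterProvider(sdkmetric.WithReader(exporter)))

    // Expose a scrape endpoint; the path and port are illustrative.
    http.Handle("/metrics", promhttp.Handler())
    go func() { log.Fatal(http.ListenAndServe(":2112", nil)) }()
}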
Logging ¶
Simplevisor uses structured logging (slog) and logs:

- Process start/stop events
- Restart attempts with counts and delays
- Panic recovery details
- Shutdown progress and completion
Pass a custom logger to New() or use nil for default console output.
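For example, a JSON handler from the standard library's log/slog package can be passed directly to New; the handler and level below are just one choice.

logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{Level: slog.LevelInfo}))
supervisor := simplevisor.New(5*time.Second, logger)

// Or fall back to the default console output:
supervisor = simplevisor.New(5*time.Second, nil)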
Package simplevisor is a simple supervisor. The supervisor registers long-running processes, runs each one in its own goroutine, and handles panics with the recover function registered by the process or a default recover. The supervisor listens for the shutdown signal and then runs all shutdown functions registered by the processes.
Index ¶
- Constants
- type Metrics
- type Option
- type Process
- type ProcessFunc
- type ProcessStatus
- type RecoverFunc
- type RestartPolicy
- type Supervisor
- func (s *Supervisor) Context() context.Context
- func (s *Supervisor) EnableMetrics() error
- func (s *Supervisor) GetProcessStatus(name string) (ProcessStatus, error)
- func (s *Supervisor) IsRunning(name string) bool
- func (s *Supervisor) ProcessCount() int
- func (s *Supervisor) Register(name string, handler ProcessFunc, options ...Option)
- func (s *Supervisor) Run()
- func (s *Supervisor) Shutdown()
- func (s *Supervisor) WaitOnShutdownSignal(teardown func())
Constants ¶
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Metrics ¶
type Metrics struct {
    // contains filtered or unexported fields
}
Metrics holds all OpenTelemetry metrics for the supervisor
type Option ¶
type Option func(p *Process)
func WithRecover ¶
func WithRecover(handler RecoverFunc) Option
WithRecover sets the recover handler for the process.
func WithRestart ¶
func WithRestart(policy RestartPolicy, maxRestarts int, delay time.Duration) Option
WithRestart sets the restart policy for the process.
type ProcessFunc ¶
type ProcessFunc func(ctx context.Context) error
ProcessFunc is a long-running process that listens for context cancellation.
type ProcessStatus ¶
type ProcessStatus int
ProcessStatus represents the current state of a process
const (
    StatusStopped ProcessStatus = iota
    StatusRunning
    StatusRestarting
)
func (ProcessStatus) String ¶
func (s ProcessStatus) String() string
type RecoverFunc ¶
type RecoverFunc func(r any)
RecoverFunc is a function to execute when a process panics.
type RestartPolicy ¶
type RestartPolicy int
RestartPolicy defines when a process should be restarted
const (
    RestartNever     RestartPolicy = iota // Never restart the process
    RestartAlways                         // Always restart the process
    RestartOnFailure                      // Only restart on error/panic
)
func (RestartPolicy) String ¶
func (r RestartPolicy) String() string
String methods for enums to provide readable metric labels
type Supervisor ¶
type Supervisor struct {
    // contains filtered or unexported fields
}
Supervisor is responsible for managing long-running processes. Supervisor is not for concurrent use and should be used from the main goroutine of the app.
func New ¶
func New(shutdownTimeout time.Duration, sLog *slog.Logger) *Supervisor
New returns a new instance of Supervisor.
func (*Supervisor) Context ¶
func (s *Supervisor) Context() context.Context
func (*Supervisor) EnableMetrics ¶
func (s *Supervisor) EnableMetrics() error
EnableMetrics initializes OpenTelemetry metrics for the supervisor. This is optional and should be called before registering processes for best results.
func (*Supervisor) GetProcessStatus ¶
func (s *Supervisor) GetProcessStatus(name string) (ProcessStatus, error)
GetProcessStatus returns the current status of a process
func (*Supervisor) IsRunning ¶
func (s *Supervisor) IsRunning(name string) bool
func (*Supervisor) ProcessCount ¶
func (s *Supervisor) ProcessCount() int
func (*Supervisor) Register ¶
func (s *Supervisor) Register(name string, handler ProcessFunc, options ...Option)
Register registers a new process with the supervisor. It panics if the name isn't unique.
func (*Supervisor) Run ¶
func (s *Supervisor) Run()
Run spawns a new goroutine for each registered process. The spawned goroutine is responsible for handling panics.
func (*Supervisor) Shutdown ¶
func (s *Supervisor) Shutdown()
Shutdown manually shuts down the supervisor goroutine
func (*Supervisor) WaitOnShutdownSignal ¶
func (s *Supervisor) WaitOnShutdownSignal(teardown func())
WaitOnShutdownSignal waits to receive a shutdown signal. WaitOnShutdownSignal should not be called from any goroutine other than the app's main goroutine. teardown is a callback function and runs at the last stage of shutdown.