Documentation ¶
Overview ¶
Package health provides a health check framework for Go services.
The framework is built around three interfaces: Manager, Checker, and Reporter. A Manager orchestrates health checks and dispatches results to reporters. Checkers perform individual health checks against dependencies (databases, caches, HTTP endpoints). Reporters expose health state to external observers (HTTP endpoints, gRPC, Prometheus, OpenTelemetry).
The core module has zero external dependencies. Reporters with heavy dependencies (gRPC, OTel, Prometheus) are available as separate Go modules.
See https://schigh.github.io/health/ for full documentation.
Index ¶
- Variables
- type AddCheckOption
- func WithCheckFrequency(f CheckFrequency, interval, delay time.Duration) AddCheckOption
- func WithComponentType(ct string) AddCheckOption
- func WithDependsOn(deps ...string) AddCheckOption
- func WithGroup(group string) AddCheckOption
- func WithLivenessImpact() AddCheckOption
- func WithReadinessImpact() AddCheckOption
- func WithStartupImpact() AddCheckOption
- type AddCheckOptions
- type CachedChecker
- type CheckFrequency
- type CheckResult
- type Checker
- type CheckerFunc
- type Logger
- type Manager
- type NoOpLogger
- type Reporter
- type Status
Constants ¶
This section is empty.
Variables ¶
var ErrHealth = errors.New("health")
ErrHealth is the sentinel error for all health check errors. Use errors.Is to check if an error originated from this package.
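As a minimal sketch of the errors.Is check described above: the block re-declares ErrHealth locally so it compiles without the module, and it assumes errors from the package wrap the sentinel (here with %w). The helper names pingDatabase and isHealthError are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrHealth mirrors the package sentinel so this sketch is self-contained.
var ErrHealth = errors.New("health")

// errUnrelated stands in for an error from somewhere else.
var errUnrelated = errors.New("unrelated")

// pingDatabase is a hypothetical helper whose error wraps ErrHealth.
func pingDatabase() error {
	return fmt.Errorf("%w: database ping timed out", ErrHealth)
}

// isHealthError reports whether err wraps the ErrHealth sentinel.
func isHealthError(err error) bool {
	return errors.Is(err, ErrHealth)
}

func main() {
	fmt.Println(isHealthError(pingDatabase())) // true
	fmt.Println(isHealthError(errUnrelated))   // false
}
```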
Functions ¶
This section is empty.
Types ¶
type AddCheckOption ¶
type AddCheckOption func(*AddCheckOptions)
AddCheckOption is a functional option for adding a Checker to a health manager.
func WithCheckFrequency ¶
func WithCheckFrequency(f CheckFrequency, interval, delay time.Duration) AddCheckOption
WithCheckFrequency sets the CheckFrequency at which the manager runs the given Checker. If the frequency is CheckOnce, the interval parameter is ignored. If the frequency is CheckAtInterval, the interval parameter is used; an interval less than or equal to zero falls back to the default interval. A delay less than or equal to zero is ignored. This option is not additive: if it is supplied more than once, the last invocation configures the Checker.
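The rules for this option can be sketched as follows. This is an illustration of the documented behavior, not the package's implementation: the types are re-declared locally, withCheckFrequency and apply are hypothetical names, and defaultInterval is an assumed value because the real default is not stated here.

```go
package main

import (
	"fmt"
	"time"
)

type CheckFrequency uint

const (
	CheckOnce CheckFrequency = 1 << iota
	CheckAtInterval
	CheckAfter
)

// AddCheckOptions mirrors only the scheduling fields of the documented struct.
type AddCheckOptions struct {
	Frequency CheckFrequency
	Delay     time.Duration
	Interval  time.Duration
}

type AddCheckOption func(*AddCheckOptions)

// defaultInterval is an assumption; the package's real default is not shown.
const defaultInterval = 5 * time.Second

// withCheckFrequency sketches the documented rules for interval and delay.
func withCheckFrequency(f CheckFrequency, interval, delay time.Duration) AddCheckOption {
	return func(o *AddCheckOptions) {
		// Not additive: every call resets the scheduling fields.
		o.Frequency, o.Interval, o.Delay = f, 0, 0
		if f&CheckAtInterval != 0 {
			o.Interval = interval
			if interval <= 0 {
				o.Interval = defaultInterval
			}
		}
		if delay > 0 {
			o.Delay = delay
		}
	}
}

// apply runs options in order, so the last one wins.
func apply(opts ...AddCheckOption) AddCheckOptions {
	var o AddCheckOptions
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

// example applies the option twice to show that only the last call counts.
func example() AddCheckOptions {
	return apply(
		withCheckFrequency(CheckOnce, time.Minute, 0),
		withCheckFrequency(CheckAtInterval, 0, 2*time.Second),
	)
}

func main() {
	o := example()
	fmt.Println(o.Frequency == CheckAtInterval, o.Interval, o.Delay)
}
```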
func WithComponentType ¶
func WithComponentType(ct string) AddCheckOption
WithComponentType assigns a component type hint to a health check (e.g., "datastore", "http", "tcp"). Component types are included in self-describing health endpoints.
func WithDependsOn ¶
func WithDependsOn(deps ...string) AddCheckOption
WithDependsOn declares that this check depends on other named checks. Used by the discovery protocol to build dependency graphs.
func WithGroup ¶
func WithGroup(group string) AddCheckOption
WithGroup assigns a logical group to a health check (e.g., "database", "cache", "external"). Groups are included in self-describing health endpoints and can be used for filtering.
func WithLivenessImpact ¶
func WithLivenessImpact() AddCheckOption
WithLivenessImpact marks a health check as affecting the liveness of the application. If a check that affects liveness fails, readiness is also affected.
func WithReadinessImpact ¶
func WithReadinessImpact() AddCheckOption
WithReadinessImpact marks a health check as affecting the readiness of the application.
func WithStartupImpact ¶
func WithStartupImpact() AddCheckOption
WithStartupImpact marks a health check as affecting startup probes. Startup checks must all pass before liveness and readiness probes are evaluated. Once all startup checks pass, startup is considered complete and is not re-evaluated.
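One way to picture how the three impact options interact is the sketch below. It is an assumption-laden illustration, not the manager's code: the result and probes types are hypothetical, and reporting liveness and readiness as false until startup completes is an assumption (the text above only says they are not evaluated before then).

```go
package main

import "fmt"

// result holds only the fields this sketch needs from a check result.
type result struct {
	Healthy          bool
	AffectsLiveness  bool
	AffectsReadiness bool
	AffectsStartup   bool
}

// probes latches startup: once started is true it is never re-evaluated.
type probes struct {
	started bool
}

// evaluate returns (started, live, ready) for the latest results.
func (p *probes) evaluate(results []result) (bool, bool, bool) {
	if !p.started {
		for _, r := range results {
			if r.AffectsStartup && !r.Healthy {
				// Assumption: nothing else is evaluated until startup passes.
				return false, false, false
			}
		}
		p.started = true
	}
	live, ready := true, true
	for _, r := range results {
		if r.Healthy {
			continue
		}
		if r.AffectsLiveness {
			// A liveness failure also affects readiness.
			live, ready = false, false
		}
		if r.AffectsReadiness {
			ready = false
		}
	}
	return true, live, ready
}

func main() {
	p := &probes{}
	fmt.Println(p.evaluate([]result{{Healthy: false, AffectsStartup: true}}))
	fmt.Println(p.evaluate([]result{
		{Healthy: true, AffectsStartup: true},
		{Healthy: false, AffectsLiveness: true},
	}))
}
```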
type AddCheckOptions ¶
type AddCheckOptions struct {
	Frequency        CheckFrequency
	Delay            time.Duration
	Interval         time.Duration
	AffectsLiveness  bool
	AffectsReadiness bool
	AffectsStartup   bool
	Group            string
	ComponentType    string
	DependsOn        []string
}
AddCheckOptions contains the options needed to add a new health check to the manager.
type CachedChecker ¶
type CachedChecker struct {
	// contains filtered or unexported fields
}
CachedChecker wraps a Checker with TTL-based caching. During refresh, stale values are served to concurrent readers. Only one goroutine refreshes at a time (prevents thundering herd on expensive checks).
The first call always executes the underlying checker synchronously.
func WithCache ¶
func WithCache(c Checker, ttl time.Duration) *CachedChecker
WithCache wraps a Checker with TTL-based result caching.
func (*CachedChecker) Check ¶
func (c *CachedChecker) Check(ctx context.Context) *CheckResult
Check returns the cached result if still valid, otherwise refreshes.
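The caching behavior described for CachedChecker (synchronous first call, stale values during refresh, one refresher at a time) could look roughly like the sketch below. It is not the package's implementation: cachedChecker, its fields, the injectable now clock, and demo are all hypothetical, and the types are re-declared so the block compiles on its own.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

type CheckResult struct{ Timestamp time.Time }

type Checker interface {
	Check(context.Context) *CheckResult
}

type CheckerFunc func(context.Context) *CheckResult

func (cf CheckerFunc) Check(ctx context.Context) *CheckResult { return cf(ctx) }

// cachedChecker is a hypothetical stand-in for a TTL-caching wrapper.
type cachedChecker struct {
	inner Checker
	ttl   time.Duration
	now   func() time.Time // injectable clock, used here to avoid sleeping

	mu         sync.Mutex
	cached     *CheckResult
	expires    time.Time
	refreshing bool
}

func (c *cachedChecker) Check(ctx context.Context) *CheckResult {
	c.mu.Lock()
	if c.cached == nil {
		// First call: run the underlying check synchronously.
		defer c.mu.Unlock()
		c.store(c.inner.Check(ctx))
		return c.cached
	}
	if c.now().Before(c.expires) || c.refreshing {
		// Still fresh, or another goroutine is refreshing: serve the cache.
		defer c.mu.Unlock()
		return c.cached
	}
	c.refreshing = true
	c.mu.Unlock()

	res := c.inner.Check(ctx) // runs unlocked so readers can get the stale value

	c.mu.Lock()
	defer c.mu.Unlock()
	c.store(res)
	c.refreshing = false
	return res
}

// store records a result and its expiry; callers must hold c.mu.
func (c *cachedChecker) store(res *CheckResult) {
	c.cached = res
	c.expires = c.now().Add(c.ttl)
}

// demo returns the number of underlying calls within the TTL and after expiry.
func demo() (int, int) {
	calls := 0
	clock := time.Unix(0, 0)
	c := &cachedChecker{
		inner: CheckerFunc(func(context.Context) *CheckResult {
			calls++
			return &CheckResult{Timestamp: clock}
		}),
		ttl: time.Minute,
		now: func() time.Time { return clock },
	}
	ctx := context.Background()
	c.Check(ctx)
	c.Check(ctx) // served from cache
	withinTTL := calls
	clock = clock.Add(2 * time.Minute) // move past the TTL
	c.Check(ctx)
	return withinTTL, calls
}

func main() {
	fmt.Println(demo()) // 1 2
}
```

With the real package, the equivalent wrapping would presumably be a call to WithCache with a Checker and a TTL, as documented above.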
type CheckFrequency ¶
type CheckFrequency uint
CheckFrequency is a set of bit flags that control how a check is scheduled.
const (
	// CheckOnce instructs the Checker to perform its check one time. If the
	// CheckAfter flag is set, CheckOnce will perform the check after a duration
	// specified by the desired configuration.
	CheckOnce CheckFrequency = 1 << iota
	// CheckAtInterval instructs the Checker to perform its check at a specified
	// interval. If the CheckAfter flag is set, this check will begin after a
	// lapse of the combined Delay and Interval.
	CheckAtInterval
	// CheckAfter instructs the Checker to wait until after a specified time to
	// perform its check.
	CheckAfter
)
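Because the constants are declared with 1 << iota, they are bit flags with the values 1, 2, and 4, and can be combined with | and tested with &. The short sketch below re-declares them locally; the has helper is hypothetical and not part of the package.

```go
package main

import "fmt"

type CheckFrequency uint

const (
	CheckOnce       CheckFrequency = 1 << iota // 1
	CheckAtInterval                            // 2
	CheckAfter                                 // 4
)

// has reports whether flag is set in f.
func has(f, flag CheckFrequency) bool { return f&flag != 0 }

func main() {
	f := CheckAtInterval | CheckAfter
	fmt.Println(uint(f), has(f, CheckAtInterval), has(f, CheckAfter), has(f, CheckOnce))
	// prints: 6 true true false
}
```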
type CheckResult ¶
type CheckResult struct {
	// Name identifies the check. Set by the manager from the registered check name.
	Name string
	// Status is the health status of this check.
	Status Status
	// AffectsLiveness indicates whether a failing check should affect liveness. Set by manager.
	AffectsLiveness bool
	// AffectsReadiness indicates whether a failing check should affect readiness. Set by manager.
	AffectsReadiness bool
	// AffectsStartup indicates whether this check must pass before startup completes. Set by manager.
	AffectsStartup bool
	// Group is the logical group for this check (e.g., "database", "cache"). Set by manager.
	Group string
	// ComponentType is a type hint for observability tools (e.g., "datastore", "http"). Set by manager.
	ComponentType string
	// DependsOn lists service URLs this check depends on, used by the discovery protocol. Set by manager.
	DependsOn []string
	// Error is the error from the last check execution, if any. Set by checker.
	Error error
	// ErrorSince is when the error state began. Set by checker.
	ErrorSince time.Time
	// Duration is how long the check took to execute. Set by checker.
	Duration time.Duration
	// Metadata is arbitrary key-value data for observability. Set by checker.
	Metadata map[string]string
	// Timestamp is when this check result was produced. Set by checker.
	Timestamp time.Time
}
CheckResult is the outcome of a single health check execution.
Some fields are set by the checker (Status, Error, ErrorSince, Duration, Timestamp, Metadata), while others are overridden by the manager from the registered AddCheckOptions (Name, AffectsLiveness, AffectsReadiness, AffectsStartup, Group, ComponentType, DependsOn).
type Checker ¶
type Checker interface {
	// Check runs the health check and returns a check result.
	Check(context.Context) *CheckResult
}
Checker performs an individual health check and returns the result to the health manager.
type CheckerFunc ¶
type CheckerFunc func(context.Context) *CheckResult
CheckerFunc is a functional health checker.
func (CheckerFunc) Check ¶
func (cf CheckerFunc) Check(ctx context.Context) *CheckResult
Check satisfies Checker.
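CheckerFunc follows the same adapter pattern as http.HandlerFunc: a plain function becomes a Checker. The sketch below re-declares the types with only the checker-set fields it needs; newPingChecker, run, and the sample checkers are hypothetical names used for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// CheckResult mirrors a subset of the fields a checker is expected to set.
type CheckResult struct {
	Error     error
	Duration  time.Duration
	Timestamp time.Time
}

type Checker interface {
	Check(context.Context) *CheckResult
}

type CheckerFunc func(context.Context) *CheckResult

func (cf CheckerFunc) Check(ctx context.Context) *CheckResult { return cf(ctx) }

// newPingChecker adapts a hypothetical ping function into a Checker.
func newPingChecker(ping func(context.Context) error) Checker {
	return CheckerFunc(func(ctx context.Context) *CheckResult {
		start := time.Now()
		err := ping(ctx)
		return &CheckResult{
			Error:     err,
			Duration:  time.Since(start),
			Timestamp: time.Now(),
		}
	})
}

var errDown = errors.New("dependency down")

var (
	okChecker  = newPingChecker(func(context.Context) error { return nil })
	badChecker = newPingChecker(func(context.Context) error { return errDown })
)

// run executes a checker with a background context.
func run(c Checker) *CheckResult { return c.Check(context.Background()) }

func main() {
	fmt.Println(run(okChecker).Error == nil)      // true
	fmt.Println(run(badChecker).Error == errDown) // true
}
```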
type Logger ¶
type Logger interface {
	Debug(msg string, args ...any)
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
}
Logger defines the logging interface used internally. This interface is implemented by *log/slog.Logger.
func DefaultLogger ¶
func DefaultLogger() Logger
DefaultLogger returns the default slog.Logger, which satisfies the Logger interface.
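Since the interface matches the method set of *slog.Logger, any slog logger can be passed where a Logger is expected. The sketch below re-declares the interface and routes a text handler into a buffer; logStarted, capture, and hasAll are hypothetical helpers, and how a logger is handed to the manager is not shown here.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// Logger mirrors the documented interface.
type Logger interface {
	Debug(msg string, args ...any)
	Info(msg string, args ...any)
	Warn(msg string, args ...any)
	Error(msg string, args ...any)
}

// Compile-time check that *slog.Logger satisfies Logger.
var _ Logger = (*slog.Logger)(nil)

// logStarted writes one structured message through the interface.
func logStarted(l Logger) {
	l.Info("health manager started", "checks", 3)
}

// capture logs through a buffered slog text handler and returns the output.
func capture() string {
	var buf bytes.Buffer
	logStarted(slog.New(slog.NewTextHandler(&buf, nil)))
	return buf.String()
}

// hasAll reports whether s contains every substring in subs.
func hasAll(s string, subs ...string) bool {
	for _, sub := range subs {
		if !strings.Contains(s, sub) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Print(capture())
}
```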
type Manager ¶
type Manager interface {
	// Run the health check manager. Invoking this will initialize all managed
	// checks and reporters. This function returns a read-only channel of errors.
	// If a non-nil error is propagated across this channel, that means the health
	// check manager has entered an unrecoverable state, and the application
	// should halt.
	Run(context.Context) <-chan error
	// Stop the manager and all included checks and reporters. Should be called
	// when an application is shutting down gracefully.
	Stop(context.Context) error
	// AddCheck adds a named health checker to the manager. By default, an
	// added check runs once immediately upon startup and does not affect
	// liveness or readiness. Options are available to set an initial check delay,
	// a check interval, and any effects on liveness or readiness. All added
	// health checks must be named uniquely. Adding a check with the same name
	// as an existing health check (case-insensitive) will overwrite the previous
	// check. Attempting to add a check after the manager is running will return
	// an error.
	AddCheck(name string, c Checker, opts ...AddCheckOption) error
	// AddReporter adds a named health reporter to the manager. Every time a
	// health check is reported, the manager will relay the update to the
	// reporters. All added health reporters must be named uniquely.
	// Adding a reporter with the same name as an existing health reporter
	// (case-insensitive) will overwrite the previous reporter. Attempting to
	// add a reporter after the manager is running will return an error.
	AddReporter(name string, r Reporter) error
}
Manager defines a manager of health checks for the application. A Manager is a running daemon that oversees all the health checks added to it. When a Manager has new health check information, it dispatches an update to its Reporter(s).
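A caller typically has to watch the error channel returned by Run alongside its own shutdown signal. The sketch below shows one way to do that against a local mirror of the Run and Stop methods. The serve function, the fakeManager, the demo helpers, and the five-second stop timeout are all hypothetical; how a real Manager is constructed is not covered on this page.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Manager mirrors the lifecycle methods of the documented interface.
type Manager interface {
	Run(context.Context) <-chan error
	Stop(context.Context) error
}

// serve runs m until it reports a fatal error or ctx is cancelled, then stops it.
func serve(ctx context.Context, m Manager) error {
	errs := m.Run(ctx)
	var fatal error
	select {
	case err := <-errs:
		fatal = err // non-nil means the manager is in an unrecoverable state
	case <-ctx.Done():
	}
	// Use a fresh context so shutdown still has time after ctx is cancelled.
	stopCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := m.Stop(stopCtx); err != nil && fatal == nil {
		return err
	}
	return fatal
}

// fakeManager is a test double that records whether Stop was called.
type fakeManager struct {
	errs    chan error
	stopped bool
}

func (f *fakeManager) Run(context.Context) <-chan error { return f.errs }
func (f *fakeManager) Stop(context.Context) error       { f.stopped = true; return nil }

var errFatal = errors.New("manager failed")

// demoFatal simulates a manager that reports an unrecoverable error.
func demoFatal() (bool, error) {
	f := &fakeManager{errs: make(chan error, 1)}
	f.errs <- errFatal
	err := serve(context.Background(), f)
	return f.stopped, err
}

// demoCancel simulates a graceful shutdown triggered by the caller.
func demoCancel() (bool, error) {
	f := &fakeManager{errs: make(chan error)}
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	err := serve(ctx, f)
	return f.stopped, err
}

func main() {
	fmt.Println(demoFatal())
	fmt.Println(demoCancel())
}
```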
type NoOpLogger ¶
type NoOpLogger struct{}
NoOpLogger is used to suppress log output.
func (NoOpLogger) Debug ¶
func (n NoOpLogger) Debug(_ string, _ ...any)
func (NoOpLogger) Error ¶
func (n NoOpLogger) Error(_ string, _ ...any)
func (NoOpLogger) Info ¶
func (n NoOpLogger) Info(_ string, _ ...any)
func (NoOpLogger) Warn ¶
func (n NoOpLogger) Warn(_ string, _ ...any)
type Reporter ¶
type Reporter interface {
	// Run the reporter.
	Run(context.Context) error
	// Stop the reporter and release resources.
	Stop(context.Context) error
	// SetLiveness instructs the reporter to relay the liveness of the
	// application to an external observer.
	SetLiveness(context.Context, bool)
	// SetReadiness instructs the reporter to relay the readiness of the
	// application to an external observer.
	SetReadiness(context.Context, bool)
	// SetStartup instructs the reporter to relay the startup status of the
	// application to an external observer. Startup probes tell Kubernetes
	// that the application has finished initializing.
	SetStartup(context.Context, bool)
	// UpdateHealthChecks is called from the manager to update the reported
	// health checks.
	UpdateHealthChecks(context.Context, map[string]*CheckResult)
}
Reporter reports the health status of the application to a receiving output. The mechanism by which the Reporter sends this information is implementation-dependent. Some reporters, such as an HTTP server, are pull-based, while others, such as a stdout reporter, are push-based. Each reporter variant is responsible for managing the health information passed to it from the health Manager. A Manager may have multiple reporters, and a Reporter may have multiple providers. The common dialog between reporters and providers is a map of CheckResult items keyed by string. It is implied that all health checks within a system are named uniquely. A Reporter must be prepared to receive updates at any time and at any frequency.
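A minimal pull-based reporter can be sketched as an in-memory store guarded by a mutex, which satisfies the requirement to accept updates at any time and frequency. The memoryReporter type, its snapshot method, and demoReporter are hypothetical; the interface and a trimmed CheckResult are re-declared locally, and merging each update into the stored map is an assumption because the page does not say whether updates are full sets or deltas.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// CheckResult is trimmed to the fields this sketch uses.
type CheckResult struct {
	Name  string
	Error error
}

// Reporter mirrors the documented interface.
type Reporter interface {
	Run(context.Context) error
	Stop(context.Context) error
	SetLiveness(context.Context, bool)
	SetReadiness(context.Context, bool)
	SetStartup(context.Context, bool)
	UpdateHealthChecks(context.Context, map[string]*CheckResult)
}

// memoryReporter keeps the latest state in memory for readers to pull.
type memoryReporter struct {
	mu                   sync.RWMutex
	live, ready, started bool
	checks               map[string]*CheckResult
}

// Compile-time check that memoryReporter satisfies Reporter.
var _ Reporter = (*memoryReporter)(nil)

func (m *memoryReporter) Run(context.Context) error  { return nil }
func (m *memoryReporter) Stop(context.Context) error { return nil }

func (m *memoryReporter) SetLiveness(_ context.Context, ok bool) {
	m.mu.Lock()
	m.live = ok
	m.mu.Unlock()
}

func (m *memoryReporter) SetReadiness(_ context.Context, ok bool) {
	m.mu.Lock()
	m.ready = ok
	m.mu.Unlock()
}

func (m *memoryReporter) SetStartup(_ context.Context, ok bool) {
	m.mu.Lock()
	m.started = ok
	m.mu.Unlock()
}

// UpdateHealthChecks merges the update into the stored map (an assumption).
func (m *memoryReporter) UpdateHealthChecks(_ context.Context, in map[string]*CheckResult) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.checks == nil {
		m.checks = make(map[string]*CheckResult, len(in))
	}
	for name, res := range in {
		m.checks[name] = res
	}
}

// snapshot returns the current probe states and the number of known checks.
func (m *memoryReporter) snapshot() (live, ready bool, checks int) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.live, m.ready, len(m.checks)
}

// demoReporter feeds the reporter the way a manager might.
func demoReporter() (bool, bool, int) {
	ctx := context.Background()
	r := &memoryReporter{}
	r.SetLiveness(ctx, true)
	r.SetReadiness(ctx, false)
	r.UpdateHealthChecks(ctx, map[string]*CheckResult{"db": {Name: "db"}})
	r.UpdateHealthChecks(ctx, map[string]*CheckResult{"cache": {Name: "cache"}})
	return r.snapshot()
}

func main() {
	fmt.Println(demoReporter()) // true false 2
}
```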
Directories ¶
| Path | Synopsis |
|---|---|
| checker | |
| e2e | |
| cmd/gateway (command) | |
| cmd/orders (command) | |
| cmd/payments (command) | |
| examples | |
| basic (command) | |
| internal | |
| manager | |
| reporter | |