healthcheck

Version: v0.0.0-...-b47df88
Published: Oct 6, 2022 License: Apache-2.0 Imports: 11 Imported by: 0

README

Use InVisionApp/go-health

This is a temporary fork, kept so that things keep working while I replace vmware-archive/healthcheck with InVisionApp/go-health.

Documentation

Overview

Package healthcheck helps you implement Kubernetes liveness and readiness checks for your application. It supports synchronous and asynchronous (background) checks. It can optionally report each check's status as a set of Prometheus gauge metrics for cluster-wide monitoring and alerting.

It also includes a small library of generic checks for DNS, TCP, and HTTP reachability as well as Goroutine usage.

Example
// Create a Handler that we can use to register liveness and readiness checks.
health := NewHandler()

// Add a readiness check to make sure an upstream dependency resolves in DNS.
// If this fails we don't want to receive requests, but we shouldn't be
// restarted or rescheduled.
upstreamHost := "upstream.example.com"
health.AddReadinessCheck(
	"upstream-dep-dns",
	DNSResolveCheck(upstreamHost, 50*time.Millisecond))

// Add a liveness check to detect Goroutine leaks. If this fails we want
// to be restarted/rescheduled.
health.AddLivenessCheck("goroutine-threshold", GoroutineCountCheck(100))

// Serve http://0.0.0.0:8080/live and http://0.0.0.0:8080/ready endpoints.
// go http.ListenAndServe("0.0.0.0:8080", health)

// Make a request to the readiness endpoint and print the response.
fmt.Print(dumpRequest(health, "GET", "/ready"))
Output:

HTTP/1.1 503 Service Unavailable
Connection: close
Content-Type: application/json; charset=utf-8

{}
Example (Advanced)
// Create a Handler that we can use to register liveness and readiness checks.
health := NewHandler()

// Make sure we can connect to an upstream dependency over TCP in less than
// 50ms. Run this check asynchronously in the background every 10 seconds
// instead of every time the /ready or /live endpoints are hit.
//
// Async is useful whenever a check is expensive (especially if it causes
// load on upstream services).
upstreamAddr := "upstream.example.com:5432"
health.AddReadinessCheck(
	"upstream-dep-tcp",
	Async(TCPDialCheck(upstreamAddr, 50*time.Millisecond), 10*time.Second))

// Add a readiness check against the health of an upstream HTTP dependency
upstreamURL := "http://upstream-svc.example.com:8080/healthy"
health.AddReadinessCheck(
	"upstream-dep-http",
	HTTPGetCheck(upstreamURL, 500*time.Millisecond))

// Implement a custom check with a 50 millisecond timeout.
health.AddLivenessCheck("custom-check-with-timeout", Timeout(func() error {
	// Simulate some work that could take a long time
	time.Sleep(time.Millisecond * 100)
	return nil
}, 50*time.Millisecond))

// Expose the readiness endpoints on a custom path /healthz mixed into
// our main application mux.
mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Hello, world!"))
})
mux.HandleFunc("/healthz", health.ReadyEndpoint)

// Sleep for just a moment to make sure our Async handler had a chance to run
time.Sleep(500 * time.Millisecond)

// Make a sample request to the /healthz endpoint and print the response.
fmt.Println(dumpRequest(mux, "GET", "/healthz"))
Output:

HTTP/1.1 503 Service Unavailable
Connection: close
Content-Type: application/json; charset=utf-8

{}
Example (Database)
// Connect to a database/sql database
var database *sql.DB
database = connectToDatabase()

// Create a Handler that we can use to register liveness and readiness checks.
health := NewHandler()

// Add a readiness check so we don't receive requests unless we can reach
// the database with a ping in <1 second.
health.AddReadinessCheck("database", DatabasePingCheck(database, 1*time.Second))

// Serve http://0.0.0.0:8080/live and http://0.0.0.0:8080/ready endpoints.
// go http.ListenAndServe("0.0.0.0:8080", health)

// Make a request to the readiness endpoint and print the response.
fmt.Print(dumpRequest(health, "GET", "/ready?full=1"))
Output:

HTTP/1.1 200 OK
Connection: close
Content-Type: application/json; charset=utf-8

{
    "database": "OK"
}
Example (Metrics)
// Create a new Prometheus registry (you'd likely already have one of these).
registry := prometheus.NewRegistry()

// Create a metrics-exposing Handler for the Prometheus registry
// The healthcheck related metrics will be prefixed with the provided namespace
health := NewMetricsHandler(registry, "example")

// Add a simple readiness check that always fails.
health.AddReadinessCheck("failing-check", func() error {
	return fmt.Errorf("example failure")
})

// Add a liveness check that always succeeds
health.AddLivenessCheck("successful-check", func() error {
	return nil
})

// Create an "admin" listener on 0.0.0.0:9402
adminMux := http.NewServeMux()
// go http.ListenAndServe("0.0.0.0:9402", adminMux)

// Expose prometheus metrics on /metrics
adminMux.Handle("/metrics", promhttp.HandlerFor(registry, promhttp.HandlerOpts{}))

// Expose a liveness check on /live
adminMux.HandleFunc("/live", health.LiveEndpoint)

// Expose a readiness check on /ready
adminMux.HandleFunc("/ready", health.ReadyEndpoint)

// Make a request to the metrics endpoint and print the response.
fmt.Println(dumpRequest(adminMux, "GET", "/metrics"))
Output:

HTTP/1.1 200 OK
Content-Length: 245
Content-Type: text/plain; version=0.0.4; charset=utf-8

# HELP example_healthcheck_status Current check status (0 indicates success, 1 indicates failure)
# TYPE example_healthcheck_status gauge
example_healthcheck_status{check="failing-check"} 1
example_healthcheck_status{check="successful-check"} 0

Variables

var ErrNoData = errors.New("no data yet")

ErrNoData is returned by an Async()-wrapped Check until the first run of the underlying check has returned.

Types

type Check

type Check func() error

Check is a health/readiness check.
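
Because a Check is just a func() error, any function that returns nil when healthy can be registered. The sketch below redeclares the type locally so it compiles on its own; fileExistsCheck and the path it probes are hypothetical and not part of this package.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Check mirrors the package's type: it returns nil when healthy.
type Check func() error

// fileExistsCheck is a hypothetical custom check (not part of the package).
// It fails when path is missing or is a directory.
func fileExistsCheck(path string) Check {
	return func() error {
		info, err := os.Stat(path)
		if err != nil {
			return fmt.Errorf("stat %s: %w", path, err)
		}
		if info.IsDir() {
			return errors.New(path + " is a directory, expected a file")
		}
		return nil
	}
}

func main() {
	check := fileExistsCheck("/definitely/missing/ready.flag")
	if err := check(); err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Println("check passed")
}
```

With the real package, a function like this would be passed to AddReadinessCheck or AddLivenessCheck in the same way as the built-in checks.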

func Async

func Async(check Check, interval time.Duration) Check

Async converts a Check into an asynchronous check that runs in a background goroutine at a fixed interval. The check is called at a fixed rate, not with a fixed delay between invocations. If your check takes longer than the interval to execute, the next execution will happen immediately.

Note: if you need to clean up the background goroutine, use AsyncWithContext().

func AsyncWithContext

func AsyncWithContext(ctx context.Context, check Check, interval time.Duration) Check

AsyncWithContext converts a Check into an asynchronous check that runs in a background goroutine at a fixed interval. The check is called at a fixed rate, not with a fixed delay between invocations. If your check takes longer than the interval to execute, the next execution will happen immediately.

Note: if you don't need to cancel execution (because the check should run forever), use Async().
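
The following is a rough sketch of how such a wrapper could behave, not the package's actual implementation. It assumes a time.Ticker for the fixed rate and a mutex-guarded cached result; asyncWithContext, errNoData, and demo are local stand-ins for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

type Check func() error

// errNoData stands in for the package's ErrNoData.
var errNoData = errors.New("no data yet")

// asyncWithContext runs check in the background and returns a Check that
// only reports the most recent cached result.
func asyncWithContext(ctx context.Context, check Check, interval time.Duration) Check {
	var mu sync.RWMutex
	result := errNoData

	go func() {
		// A ticker fires at a fixed rate. If a run overshoots the interval,
		// a tick is already pending and the next run starts right away.
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			err := check()
			mu.Lock()
			result = err
			mu.Unlock()
			select {
			case <-ctx.Done():
				return // cancelling the context stops the goroutine
			case <-ticker.C:
			}
		}
	}()

	return func() error {
		mu.RLock()
		defer mu.RUnlock()
		return result
	}
}

// demo holds the first run behind a gate so the "no data yet" window
// can be observed reliably.
func demo() (before, after error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	gate := make(chan struct{})
	check := asyncWithContext(ctx, func() error {
		<-gate // blocks until the gate is closed
		return nil
	}, 10*time.Millisecond)

	before = check() // the first run cannot have finished yet
	close(gate)
	time.Sleep(200 * time.Millisecond) // let the first run complete
	after = check()
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println("before first run:", before)
	fmt.Println("after first run:", after)
}
```

Callers of the returned Check never wait on the underlying work, which is the point of running expensive checks asynchronously.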

func DNSResolveCheck

func DNSResolveCheck(host string, timeout time.Duration) Check

DNSResolveCheck returns a Check that makes sure the provided host can resolve to at least one IP address within the specified timeout.
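
As an illustration of the idea, here is a minimal sketch built on net.Resolver with a context deadline. It is an assumption about how such a check can be written, not the package's source; dnsResolveCheck is a local stand-in and upstream.invalid is a placeholder host under the reserved .invalid TLD.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

type Check func() error

// dnsResolveCheck fails unless host resolves to at least one address
// within timeout.
func dnsResolveCheck(host string, timeout time.Duration) Check {
	resolver := &net.Resolver{}
	return func() error {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		addrs, err := resolver.LookupHost(ctx, host)
		if err != nil {
			return err // lookup failures and timeouts both land here
		}
		if len(addrs) < 1 {
			return fmt.Errorf("could not resolve host %q", host)
		}
		return nil
	}
}

func main() {
	// Names under .invalid are reserved and should never resolve.
	err := dnsResolveCheck("upstream.invalid", 50*time.Millisecond)()
	fmt.Println("resolve error:", err)
}
```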

func DatabasePingCheck

func DatabasePingCheck(database *sql.DB, timeout time.Duration) Check

DatabasePingCheck returns a Check that validates connectivity to a database/sql.DB using Ping().
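
A sketch of the same pattern, written against a small interface so it runs without a database driver. The pinger interface, databasePingCheck, and the slowDB fake are hypothetical; *sql.DB satisfies pinger through its PingContext method. How the real function applies the timeout is not shown on this page, so the context deadline used here is an assumption.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type Check func() error

// pinger is the one method this sketch needs; *sql.DB provides it.
type pinger interface {
	PingContext(ctx context.Context) error
}

// databasePingCheck pings db and fails if no answer arrives within timeout.
func databasePingCheck(db pinger, timeout time.Duration) Check {
	return func() error {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		return db.PingContext(ctx)
	}
}

// slowDB is a fake database that answers pings after a fixed latency.
type slowDB struct{ latency time.Duration }

func (s slowDB) PingContext(ctx context.Context) error {
	select {
	case <-time.After(s.latency):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	fast := databasePingCheck(slowDB{latency: time.Millisecond}, time.Second)
	slow := databasePingCheck(slowDB{latency: time.Second}, 10*time.Millisecond)
	fmt.Println("fast database:", fast())
	fmt.Println("slow database:", slow())
}
```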

func GCMaxPauseCheck

func GCMaxPauseCheck(threshold time.Duration) Check

GCMaxPauseCheck returns a Check that fails if any recent Go garbage collection pause exceeds the provided threshold.
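
One way to express this with the standard library is runtime/debug.ReadGCStats, which exposes a history of recent pause durations. The sketch below is an assumption about the approach rather than the package's code; gcMaxPauseCheck is a local stand-in.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

type Check func() error

// gcMaxPauseCheck fails if any pause in the runtime's recent GC history
// is longer than threshold.
func gcMaxPauseCheck(threshold time.Duration) Check {
	return func() error {
		var stats debug.GCStats
		debug.ReadGCStats(&stats)
		// stats.Pause lists recent pause durations, most recent first.
		for _, pause := range stats.Pause {
			if pause > threshold {
				return fmt.Errorf("recent GC pause of %v exceeds threshold %v", pause, threshold)
			}
		}
		return nil
	}
}

func main() {
	runtime.GC() // force a collection so there is history to inspect
	fmt.Println("check result:", gcMaxPauseCheck(time.Minute)())
}
```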

func GoroutineCountCheck

func GoroutineCountCheck(threshold int) Check

GoroutineCountCheck returns a Check that fails if too many goroutines are running (which could indicate a resource leak).
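
A minimal sketch of the idea using runtime.NumGoroutine. The goroutineCountCheck name is a local stand-in, and treating the threshold as a strict upper bound (fail only when the count is greater) is an assumption.

```go
package main

import (
	"fmt"
	"runtime"
)

type Check func() error

// goroutineCountCheck fails when more than threshold goroutines exist.
func goroutineCountCheck(threshold int) Check {
	return func() error {
		if count := runtime.NumGoroutine(); count > threshold {
			return fmt.Errorf("too many goroutines (%d > %d)", count, threshold)
		}
		return nil
	}
}

func main() {
	fmt.Println("generous threshold:", goroutineCountCheck(10000)())
	// At least the main goroutine is always running, so zero must fail.
	fmt.Println("zero threshold:", goroutineCountCheck(0)())
}
```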

func HTTPGetCheck

func HTTPGetCheck(url string, timeout time.Duration) Check

HTTPGetCheck returns a Check that performs an HTTP GET request against the specified URL. The check fails if the response times out or returns a non-200 status code.
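
The behavior described above can be sketched with net/http and exercised against throwaway local servers from net/http/httptest. This is an illustration, not the package's source: httpGetCheck and statusServer are local stand-ins, and not following redirects (so that a 3xx counts as non-200) is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

type Check func() error

// httpGetCheck fails on transport errors, timeouts, and non-200 responses.
func httpGetCheck(url string, timeout time.Duration) Check {
	client := &http.Client{
		Timeout: timeout,
		// Assumption: do not follow redirects, so a 3xx is reported as non-200.
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	return func() error {
		resp, err := client.Get(url)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("GET %s returned status %d", url, resp.StatusCode)
		}
		return nil
	}
}

// statusServer starts a local test server that always answers with code.
func statusServer(code int) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(code)
	}))
}

func main() {
	healthy := statusServer(http.StatusOK)
	defer healthy.Close()
	broken := statusServer(http.StatusServiceUnavailable)
	defer broken.Close()

	fmt.Println("healthy upstream:", httpGetCheck(healthy.URL, 500*time.Millisecond)())
	fmt.Println("broken upstream:", httpGetCheck(broken.URL, 500*time.Millisecond)())
}
```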

func TCPDialCheck

func TCPDialCheck(addr string, timeout time.Duration) Check

TCPDialCheck returns a Check that checks TCP connectivity to the provided endpoint.

func Timeout

func Timeout(check Check, timeout time.Duration) Check

Timeout adds a timeout to a Check. If the underlying check takes longer than the timeout, it returns an error.
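
A common way to build such a wrapper is to run the check in a goroutine and race it against a timer. The sketch below assumes that approach; withTimeout, errTimeout, and sleepCheck are local stand-ins, and the error returned by the real package may differ.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type Check func() error

// errTimeout is a stand-in for whatever error the real wrapper returns.
var errTimeout = errors.New("check timed out")

// withTimeout reports errTimeout if check has not returned within timeout.
func withTimeout(check Check, timeout time.Duration) Check {
	return func() error {
		// Buffered so a check that finishes late can still send and exit.
		done := make(chan error, 1)
		go func() { done <- check() }()
		select {
		case err := <-done:
			return err
		case <-time.After(timeout):
			return errTimeout
		}
	}
}

// sleepCheck simulates work that takes d to complete.
func sleepCheck(d time.Duration) Check {
	return func() error {
		time.Sleep(d)
		return nil
	}
}

func main() {
	fast := withTimeout(sleepCheck(time.Millisecond), 50*time.Millisecond)
	slow := withTimeout(sleepCheck(100*time.Millisecond), 50*time.Millisecond)
	fmt.Println("fast check:", fast())
	fmt.Println("slow check:", slow())
}
```

In this sketch the wrapped function is not interrupted when the timer wins; it keeps running in the background until it returns on its own.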

type Handler

type Handler interface {
	// The Handler is an http.Handler, so it can be exposed directly and handle
	// /live and /ready endpoints.
	http.Handler

	// AddLivenessCheck adds a check that indicates that this instance of the
	// application should be destroyed or restarted. A failed liveness check
	// indicates that this instance is unhealthy, not some upstream dependency.
	// Every liveness check is also included as a readiness check.
	AddLivenessCheck(name string, check Check)

	// AddReadinessCheck adds a check that indicates that this instance of the
	// application is currently unable to serve requests because of an upstream
	// or some transient failure. If a readiness check fails, this instance
	// should no longer receive requests, but should not be restarted or
	// destroyed.
	AddReadinessCheck(name string, check Check)

	// LiveEndpoint is the HTTP handler for just the /live endpoint, which is
	// useful if you need to attach it into your own HTTP handler tree.
	LiveEndpoint(http.ResponseWriter, *http.Request)

	// ReadyEndpoint is the HTTP handler for just the /ready endpoint, which is
	// useful if you need to attach it into your own HTTP handler tree.
	ReadyEndpoint(http.ResponseWriter, *http.Request)
}

Handler is an http.Handler with additional methods that register health and readiness checks. It handles the "/live" and "/ready" HTTP endpoints.

func NewHandler

func NewHandler() Handler

NewHandler creates a new basic Handler.

func NewMetricsHandler

func NewMetricsHandler(registry prometheus.Registerer, namespace string) Handler

NewMetricsHandler returns a healthcheck Handler that also exposes metrics into the provided Prometheus registry.
