healthcheck

package module

v0.1.3 (latest) | Published: Jan 18, 2020 | License: Apache-2.0 | Imports: 13 | Imported by: 18

Repository

github.com/troian/healthcheck

README ¶

healthcheck

Healthcheck is a library for implementing Kubernetes liveness and readiness probe handlers in your Go application.

Features

Integrates easily with Kubernetes. This library explicitly separates liveness vs. readiness checks instead of lumping everything into a single category of check.
Optionally exposes each check as a Prometheus gauge metric. This allows for cluster-wide monitoring and alerting on individual checks.
Supports asynchronous checks, which run in a background goroutine at a fixed interval. These are useful for expensive checks that would otherwise add latency to the liveness and readiness endpoints.
Includes a small library of generically useful checks for validating upstream DNS, TCP, HTTP, and database dependencies as well as checking basic health of the Go runtime.

Usage

See the GoDoc examples for more detail.

Install with go get or your favorite Go dependency manager: go get -u github.com/troian/healthcheck
Import the package: import "github.com/troian/healthcheck"
Create a healthcheck.Handler:
```
health := healthcheck.NewHandler()
```

Configure some application-specific liveness checks (whether the app itself is unhealthy):

```
// Our app is not happy if we've got more than 100 goroutines running.
health.AddLivenessCheck("goroutine-threshold", healthcheck.GoroutineCountCheck(100))
```

Configure some application-specific readiness checks (whether the app is ready to serve requests):

```
// Our app is not ready if we can't resolve our upstream dependency in DNS.
health.AddReadinessCheck(
    "upstream-dep-dns",
    healthcheck.DNSResolveCheck("upstream.example.com", 50*time.Millisecond))

// Our app is not ready if we can't connect to our database (`var db *sql.DB`) in <1s.
health.AddReadinessCheck("database", healthcheck.DatabasePingCheck(db, 1*time.Second))
```

Expose the /live and /ready endpoints over HTTP (on port 8086):
```
go http.ListenAndServe("0.0.0.0:8086", health)
```

Configure your Kubernetes container with HTTP liveness and readiness probes (see the Kubernetes documentation for more detail):

```
# this is a bare bones example
# copy and paste livenessProbe and readinessProbe as appropriate for your app
apiVersion: v1
kind: Pod
metadata:
  name: heptio-healthcheck-example
spec:
  containers:
  - name: liveness
    image: your-registry/your-container

    # define a liveness probe that checks every 5 seconds, starting after 5 seconds
    livenessProbe:
      httpGet:
        path: /live
        port: 8086
      initialDelaySeconds: 5
      periodSeconds: 5

    # define a readiness probe that checks every 5 seconds
    readinessProbe:
      httpGet:
        path: /ready
        port: 8086
      periodSeconds: 5
```

If one of your readiness checks fails, Kubernetes will stop routing traffic to that pod within a few seconds (depending on periodSeconds and other factors).
If one of your liveness checks fails or your app becomes totally unresponsive, Kubernetes will restart your container.

HTTP Endpoints

When you run go http.ListenAndServe("0.0.0.0:8086", health), two HTTP endpoints are exposed:

/live: liveness endpoint (HTTP 200 if healthy, HTTP 503 if unhealthy)
/ready: readiness endpoint (HTTP 200 if healthy, HTTP 503 if unhealthy)

Pass the ?full=1 query parameter to see the full check results as JSON. These are omitted by default for performance.
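
The status-code contract described above can be sketched with the standard library alone. This is not the library's implementation: `endpoint`, `probe`, `passing`, and `failing` are hypothetical names, and the JSON shape is an assumption modelled on the example output elsewhere in this document.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Check is a local stand-in mirroring the documented signature: nil means healthy.
type Check func() error

func passing() error { return nil }
func failing() error { return errors.New("upstream unreachable") }

// endpoint returns a handler that answers 200 when every check passes and
// 503 otherwise. Per-check results are only written when ?full=1 is set.
func endpoint(checks map[string]Check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		results := make(map[string]string, len(checks))
		status := http.StatusOK
		for name, check := range checks {
			if err := check(); err != nil {
				results[name] = err.Error()
				status = http.StatusServiceUnavailable
			} else {
				results[name] = "OK"
			}
		}
		w.Header().Set("Content-Type", "application/json; charset=utf-8")
		w.WriteHeader(status)
		if r.URL.Query().Get("full") != "1" {
			// Assumption: the short form is an empty JSON object.
			_, _ = w.Write([]byte("{}\n"))
			return
		}
		_ = json.NewEncoder(w).Encode(results)
	}
}

// probe issues an in-memory GET against h and returns the status code and body.
func probe(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	ready := endpoint(map[string]Check{"always-ok": passing, "always-fail": failing})
	code, body := probe(ready, "/ready?full=1")
	fmt.Println(code)
	fmt.Print(body)
}
```

Skipping the per-check details by default keeps probe responses small, which matters when Kubernetes polls the endpoint every few seconds.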

Documentation ¶

Overview ¶

Package healthcheck helps you implement Kubernetes liveness and readiness checks for your application. It supports synchronous and asynchronous (background) checks. It can optionally report each check's status as a set of Prometheus gauge metrics for cluster-wide monitoring and alerting.

It also includes a small library of generic checks for DNS, TCP, and HTTP reachability as well as Goroutine usage.

Example ¶

// Create a Handler that we can use to register liveness and readiness checks.
health := NewHandler()

// Add a readiness check to make sure an upstream dependency resolves in DNS.
// If this fails we don't want to receive requests, but we shouldn't be
// restarted or rescheduled.
upstreamHost := "upstream.example.com"
_ = health.AddReadinessCheck(
	"upstream-dep-dns",
	DNSResolveCheck(upstreamHost, 50*time.Millisecond))

// Add a liveness check to detect Goroutine leaks. If this fails we want
// to be restarted/rescheduled.
_ = health.AddLivenessCheck("goroutine-threshold", GoroutineCountCheck(100))

// Serve http://0.0.0.0:8080/live and http://0.0.0.0:8080/ready endpoints.
// go http.ListenAndServe("0.0.0.0:8080", health)

// Make a request to the readiness endpoint and print the response.
fmt.Print(dumpRequest(health, "GET", "/ready"))

Output:

HTTP/1.1 503 Service Unavailable
Connection: close
Content-Type: application/json; charset=utf-8

{}

Example (Advanced) ¶

// Create a Handler that we can use to register liveness and readiness checks.
health := NewHandler()

// Make sure we can connect to an upstream dependency over TCP in less than
// 50ms. Run this check asynchronously in the background every 10 seconds
// instead of every time the /ready or /live endpoints are hit.
//
// Async is useful whenever a check is expensive (especially if it causes
// load on upstream services).
upstreamAddr := "upstream.example.com:5432"
_ = health.AddReadinessCheck(
	"upstream-dep-tcp",
	Async(TCPDialCheck(upstreamAddr, 50*time.Millisecond), 10*time.Second))

// Add a readiness check against the health of an upstream HTTP dependency
upstreamURL := "http://upstream-svc.example.com:8080/healthy"
_ = health.AddReadinessCheck(
	"upstream-dep-http",
	HTTPGetCheck(upstreamURL, 500*time.Millisecond))

// Implement a custom check with a 50 millisecond timeout.
_ = health.AddLivenessCheck("custom-check-with-timeout", Timeout(func() error {
	// Simulate some work that could take a long time
	time.Sleep(time.Millisecond * 100)
	return nil
}, 50*time.Millisecond))

// Expose the readiness endpoint on a custom path /healthz mixed into
// our main application mux.
mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
	_, _ = w.Write([]byte("Hello, world!"))
})
mux.HandleFunc("/healthz", health.ReadyEndpoint)

// Sleep for just a moment to make sure our Async handler had a chance to run
time.Sleep(500 * time.Millisecond)

// Make a sample request to the /healthz endpoint and print the response.
fmt.Println(dumpRequest(mux, "GET", "/healthz"))

Output:

HTTP/1.1 503 Service Unavailable
Connection: close
Content-Type: application/json; charset=utf-8

{}

Example (Database) ¶

// Connect to a database/sql database
database := connectToDatabase()

// Create a Handler that we can use to register liveness and readiness checks.
health := NewHandler()

// Add a readiness check so we don't receive requests unless we can reach
// the database with a ping in <1 second.
_ = health.AddReadinessCheck("database", DatabasePingCheck(database, 1*time.Second))

// Serve http://0.0.0.0:8080/live and http://0.0.0.0:8080/ready endpoints.
// go http.ListenAndServe("0.0.0.0:8080", health)

// Make a request to the readiness endpoint and print the response.
fmt.Print(dumpRequest(health, "GET", "/ready?full=1"))

Output:

HTTP/1.1 200 OK
Connection: close
Content-Type: application/json; charset=utf-8

{
    "database": "OK"
}

Example (Metrics) ¶

// Create a new Prometheus registry (you'd likely already have one of these).
registry := prometheus.NewRegistry()

// Create a metrics-exposing Handler for the Prometheus registry
// The healthcheck related metrics will be prefixed with the provided namespace
health := NewMetricsHandler(registry, "example")

// Add a simple readiness check that always fails.
_ = health.AddReadinessCheck("failing-check", func() error {
	return fmt.Errorf("example failure")
})

// Add a liveness check that always succeeds
_ = health.AddLivenessCheck("successful-check", func() error {
	return nil
})

// Create an "admin" listener on 0.0.0.0:9402
adminMux := http.NewServeMux()
// go http.ListenAndServe("0.0.0.0:9402", adminMux)

// Expose prometheus metrics on /metrics
adminMux.Handle("/metrics", promhttp.HandlerFor(registry, promhttp.HandlerOpts{}))

// Expose a liveness check on /live
adminMux.HandleFunc("/live", health.LiveEndpoint)

// Expose a readiness check on /ready
adminMux.HandleFunc("/ready", health.ReadyEndpoint)

// Make a request to the metrics endpoint and print the response.
fmt.Println(dumpRequest(adminMux, "GET", "/metrics"))

Output:

HTTP/1.1 200 OK
Content-Length: 245
Content-Type: text/plain; version=0.0.4

# HELP example_healthcheck_status Current check status (0 indicates success, 1 indicates failure)
# TYPE example_healthcheck_status gauge
example_healthcheck_status{check="failing-check"} 1
example_healthcheck_status{check="successful-check"} 0

Index ¶

Variables
type Check
type Checks
type Config
type Endpoints
type Handler
- func NewHandler() Handler
- func NewMetricsHandler(registry prometheus.Registerer, namespace string) Handler

Constants ¶

This section is empty.

Variables ¶

var (
	ErrAlreadyExists = errors.New("healthcheck: already exists")
	ErrNotFound      = errors.New("healthcheck: not found")
)

var ErrNoData = errors.New("no data yet")

ErrNoData is returned if the first call of an Async() wrapped Check has not yet returned.

Functions ¶

This section is empty.

Types ¶

type Check ¶

type Check func() error

Check is a health/readiness check.

func AMQPDialCheck ¶

func AMQPDialCheck(uri string) Check

func Async ¶

func Async(check Check, interval time.Duration) Check

Async converts a Check into an asynchronous check that runs in a background goroutine at a fixed interval. The check is called at a fixed rate, not with a fixed delay between invocations. If your check takes longer than the interval to execute, the next execution will happen immediately.

Note: if you need to clean up the background goroutine, use AsyncWithContext().

func AsyncWithContext ¶

func AsyncWithContext(ctx context.Context, check Check, interval time.Duration) Check

AsyncWithContext converts a Check into an asynchronous check that runs in a background goroutine at a fixed interval. The check is called at a fixed rate, not with a fixed delay between invocations. If your check takes longer than the interval to execute, the next execution will happen immediately.

Note: if you don't need to cancel execution (because this runs forever), use Async().
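
As a rough illustration of the behaviour described for Async() and AsyncWithContext(), the sketch below caches the latest result of a background check and stops when the context is cancelled. It is a minimal standard-library sketch, not the package's code: `asyncWithContext`, `errNoData`, and `demo` are hypothetical local names, and using `time.Ticker` for the fixed-rate schedule is an assumption.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// errNoData is a local stand-in for the documented ErrNoData sentinel.
var errNoData = errors.New("no data yet")

// asyncWithContext runs check in a background goroutine every interval and
// returns a Check that only reports the most recent cached result.
func asyncWithContext(ctx context.Context, check Check, interval time.Duration) Check {
	var mu sync.Mutex
	result := errNoData // reported until the first run has returned

	update := func() {
		err := check()
		mu.Lock()
		result = err
		mu.Unlock()
	}

	go func() {
		update() // first run starts immediately
		// A ticker fires at a fixed rate; if a run overruns the interval,
		// the pending tick triggers the next run right away.
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				update()
			case <-ctx.Done():
				return // cancelling ctx cleans up the goroutine
			}
		}
	}()

	return func() error {
		mu.Lock()
		defer mu.Unlock()
		return result
	}
}

// demo returns the result seen while the first run is still blocked and the
// result seen once a run has completed.
func demo() (before, after error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	release := make(chan struct{})
	slow := func() error { <-release; return nil }

	cached := asyncWithContext(ctx, slow, 10*time.Millisecond)
	before = cached()
	close(release)

	deadline := time.Now().Add(2 * time.Second)
	for after = cached(); after != nil && time.Now().Before(deadline); after = cached() {
		time.Sleep(time.Millisecond)
	}
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println("before first run finished:", before)
	fmt.Println("after first run finished:", after)
}
```

Because the returned Check only reads a cached value, hitting the endpoint never waits on the expensive check itself.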

func DNSResolveCheck ¶

func DNSResolveCheck(host string, timeout time.Duration) Check

DNSResolveCheck returns a Check that makes sure the provided host can resolve to at least one IP address within the specified timeout.
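
A check of this kind could be sketched as follows, using only the standard library. The lookup function is injected so the demo runs offline; `dnsResolveCheck`, `lookupFunc`, `fakeLookup`, and `demoTimeout` are hypothetical names, and this is not necessarily how the package implements it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// lookupFunc matches the signature of (*net.Resolver).LookupHost.
type lookupFunc func(ctx context.Context, host string) ([]string, error)

const demoTimeout = 50 * time.Millisecond

// dnsResolveCheck fails if host does not resolve to at least one address
// before the timeout expires.
func dnsResolveCheck(lookup lookupFunc, host string, timeout time.Duration) Check {
	return func() error {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		addrs, err := lookup(ctx, host)
		if err != nil {
			return err
		}
		if len(addrs) == 0 {
			return fmt.Errorf("could not resolve host %q", host)
		}
		return nil
	}
}

// fakeLookup is a hypothetical resolver so the demo needs no network access.
func fakeLookup(_ context.Context, host string) ([]string, error) {
	if host == "upstream.example.com" {
		return []string{"192.0.2.10"}, nil
	}
	return nil, errors.New("no such host")
}

func main() {
	// In a real program, pass the system resolver instead of the fake one.
	live := dnsResolveCheck(net.DefaultResolver.LookupHost, "upstream.example.com", demoTimeout)
	_ = live // not invoked here, to keep the demo offline

	fmt.Println(dnsResolveCheck(fakeLookup, "upstream.example.com", demoTimeout)())
	fmt.Println(dnsResolveCheck(fakeLookup, "missing.example.com", demoTimeout)())
}
```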

func DatabasePingCheck ¶

func DatabasePingCheck(database *sql.DB, timeout time.Duration) Check

DatabasePingCheck returns a Check that validates connectivity to a database/sql.DB using Ping().
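
One way such a check could bound the ping with the timeout is `PingContext`, sketched below. This is an assumption about the approach, not the package's source. The `pinger` interface, `fakeDB`, `fastDB`, `slowDB`, and `demoTimeout` are hypothetical; a real `*sql.DB` satisfies `pinger` because it has a `PingContext(ctx) error` method.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// pinger is the single method this sketch needs; *sql.DB satisfies it.
type pinger interface {
	PingContext(ctx context.Context) error
}

const demoTimeout = 50 * time.Millisecond

// databasePingCheck fails if the ping errors or does not finish in time.
func databasePingCheck(db pinger, timeout time.Duration) Check {
	return func() error {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		return db.PingContext(ctx)
	}
}

// fakeDB is a hypothetical stand-in that takes delay to answer a ping.
type fakeDB struct{ delay time.Duration }

func (f fakeDB) PingContext(ctx context.Context) error {
	select {
	case <-time.After(f.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

var (
	fastDB = fakeDB{delay: 0}
	slowDB = fakeDB{delay: time.Second}
)

func main() {
	fmt.Println("fast database:", databasePingCheck(fastDB, demoTimeout)())
	fmt.Println("slow database:", databasePingCheck(slowDB, demoTimeout)())
}
```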

func GoroutineCountCheck ¶

func GoroutineCountCheck(threshold int) Check

GoroutineCountCheck returns a Check that fails if too many goroutines are running (which could indicate a resource leak).
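
The idea can be illustrated with `runtime.NumGoroutine`. The sketch below is an assumption about how such a check might look, with the hypothetical name `goroutineCountCheck`; whether the real check treats the threshold as inclusive is not stated in this document.

```go
package main

import (
	"fmt"
	"runtime"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// goroutineCountCheck fails when more than threshold goroutines exist.
// Assumption: a count equal to the threshold still passes.
func goroutineCountCheck(threshold int) Check {
	return func() error {
		if count := runtime.NumGoroutine(); count > threshold {
			return fmt.Errorf("too many goroutines (%d > %d)", count, threshold)
		}
		return nil
	}
}

func main() {
	fmt.Println("threshold 100:", goroutineCountCheck(100)())
	// At least the main goroutine is always running, so a threshold of 0 fails.
	fmt.Println("threshold 0:", goroutineCountCheck(0)())
}
```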

func HTTPGetCheck ¶

func HTTPGetCheck(url string, timeout time.Duration) Check

HTTPGetCheck returns a Check that performs an HTTP GET request against the specified URL. The check fails if the response times out or returns a non-200 status code.
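
A standard-library sketch of this behaviour is shown below. `httpGetCheck` and `checkAgainst` are hypothetical names. Refusing to follow redirects is a design choice of this sketch, made so that a 3xx response counts as non-200; the document does not say how the package handles redirects.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// httpGetCheck fails if the GET errors, times out, or returns a non-200 status.
func httpGetCheck(url string, timeout time.Duration) Check {
	client := &http.Client{
		Timeout: timeout,
		// Do not follow redirects, so a 3xx is reported as a failure.
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	return func() error {
		resp, err := client.Get(url)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("returned status %d", resp.StatusCode)
		}
		return nil
	}
}

// checkAgainst starts a throwaway local server that always answers with
// status, then runs the check against it.
func checkAgainst(status int) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	}))
	defer srv.Close()
	return httpGetCheck(srv.URL, time.Second)()
}

func main() {
	fmt.Println("200 upstream:", checkAgainst(http.StatusOK))
	fmt.Println("503 upstream:", checkAgainst(http.StatusServiceUnavailable))
}
```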

func TCPDialCheck ¶

func TCPDialCheck(addr string, timeout time.Duration) Check

TCPDialCheck returns a Check that checks TCP connectivity to the provided endpoint.
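
A connectivity check like this can be approximated with `net.DialTimeout`. The sketch is illustrative only; `tcpDialCheck` and `demo` are hypothetical names, and the demo dials a local listener rather than a real upstream.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// tcpDialCheck fails if a TCP connection to addr cannot be opened in time.
func tcpDialCheck(addr string, timeout time.Duration) Check {
	return func() error {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			return err
		}
		return conn.Close()
	}
}

// demo dials a listener on an ephemeral local port, then dials the same
// address again after the listener has been closed.
func demo() (open, closed error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err, err
	}
	check := tcpDialCheck(ln.Addr().String(), time.Second)
	open = check()
	_ = ln.Close()
	closed = check()
	return open, closed
}

func main() {
	open, closed := demo()
	fmt.Println("listener open:", open)
	fmt.Println("listener closed:", closed)
}
```

A successful dial only proves that the port accepts connections; it says nothing about whether the service behind it is answering requests correctly.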

func Timeout ¶

func Timeout(check Check, timeout time.Duration) Check

Timeout adds a timeout to a Check. If the underlying check takes longer than the timeout, it returns an error.
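
The wrapper pattern can be sketched with a goroutine and a `select`. This is an illustration under assumptions, not the package's code: `withTimeout`, `slow`, `fast`, and `demoTimeout` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

const demoTimeout = 50 * time.Millisecond

// withTimeout returns a Check that reports an error if check has not
// returned within timeout.
func withTimeout(check Check, timeout time.Duration) Check {
	return func() error {
		// Buffered so a late check can still send its result and exit.
		done := make(chan error, 1)
		go func() { done <- check() }()
		select {
		case err := <-done:
			return err
		case <-time.After(timeout):
			return fmt.Errorf("timed out after %s", timeout)
		}
	}
}

// slow simulates work that takes much longer than demoTimeout.
func slow() error {
	time.Sleep(500 * time.Millisecond)
	return nil
}

// fast returns immediately.
func fast() error { return nil }

func main() {
	fmt.Println("fast:", withTimeout(fast, demoTimeout)())
	fmt.Println("slow:", withTimeout(slow, demoTimeout)())
}
```

Note that in this sketch the underlying check keeps running in its goroutine after the timeout fires; a check that must actually be interrupted would need to accept a context.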

type Checks ¶

type Checks interface {
	// AddLivenessCheck adds a check that indicates that this instance of the
	// application should be destroyed or restarted. A failed liveness check
	// indicates that this instance is unhealthy, not some upstream dependency.
	// Every liveness check is also included as a readiness check.
	AddLivenessCheck(name string, check Check) error

	// AddReadinessCheck adds a check that indicates that this instance of the
	// application is currently unable to serve requests because of an upstream
	// or some transient failure. If a readiness check fails, this instance
	// should no longer receive requests, but should not be restarted or
	// destroyed.
	AddReadinessCheck(name string, check Check) error

	RemoveLivenessCheck(name string) error

	RemoveReadinessCheck(name string) error
}
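
To make the contract of this interface concrete, here is a minimal in-memory registry with the same four methods. It is a sketch under assumptions: the document lists ErrAlreadyExists and ErrNotFound but does not say when they are returned, so returning them for duplicate adds and missing removes is a guess. `registry`, `Ready`, `passing`, and `failing` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Check is a local stand-in mirroring the documented signature.
type Check func() error

// Local stand-ins for the documented sentinel errors.
var (
	errAlreadyExists = errors.New("healthcheck: already exists")
	errNotFound      = errors.New("healthcheck: not found")
)

type registry struct {
	mu        sync.RWMutex
	liveness  map[string]Check
	readiness map[string]Check
}

func newRegistry() *registry {
	return &registry{liveness: map[string]Check{}, readiness: map[string]Check{}}
}

// add stores check under name. Assumption: duplicate names are rejected.
func (r *registry) add(m map[string]Check, name string, check Check) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := m[name]; ok {
		return errAlreadyExists
	}
	m[name] = check
	return nil
}

// remove deletes name. Assumption: unknown names are reported as not found.
func (r *registry) remove(m map[string]Check, name string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := m[name]; !ok {
		return errNotFound
	}
	delete(m, name)
	return nil
}

func (r *registry) AddLivenessCheck(name string, c Check) error  { return r.add(r.liveness, name, c) }
func (r *registry) AddReadinessCheck(name string, c Check) error { return r.add(r.readiness, name, c) }
func (r *registry) RemoveLivenessCheck(name string) error        { return r.remove(r.liveness, name) }
func (r *registry) RemoveReadinessCheck(name string) error       { return r.remove(r.readiness, name) }

// Ready runs every liveness and readiness check, since liveness checks are
// documented as also counting toward readiness, and returns a failure if any.
func (r *registry) Ready() error {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, set := range []map[string]Check{r.liveness, r.readiness} {
		for name, check := range set {
			if err := check(); err != nil {
				return fmt.Errorf("%s: %w", name, err)
			}
		}
	}
	return nil
}

func passing() error { return nil }
func failing() error { return errors.New("boom") }

func main() {
	reg := newRegistry()
	fmt.Println(reg.AddLivenessCheck("goroutines", passing))
	fmt.Println(reg.AddLivenessCheck("goroutines", passing))
	fmt.Println(reg.AddReadinessCheck("database", failing))
	fmt.Println(reg.Ready())
	fmt.Println(reg.RemoveReadinessCheck("database"))
	fmt.Println(reg.Ready())
}
```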

type Config ¶

type Config struct {
	Timeout       int    `json:"timeout" yaml:"timeout" mapstructure:"timeout"`
	LivenessName  string `json:"livenessName" yaml:"livenessName" mapstructure:"livenessName"`
	ReadinessName string `json:"readinessName" yaml:"readinessName" mapstructure:"readinessName"`
}

type Endpoints ¶

type Endpoints interface {
	// LiveEndpoint is the HTTP handler for just the /live endpoint, which is
	// useful if you need to attach it into your own HTTP handler tree.
	LiveEndpoint(http.ResponseWriter, *http.Request)

	// ReadyEndpoint is the HTTP handler for just the /ready endpoint, which is
	// useful if you need to attach it into your own HTTP handler tree.
	ReadyEndpoint(http.ResponseWriter, *http.Request)
}

type Handler ¶

type Handler interface {
	// The Handler is an http.Handler, so it can be exposed directly and handle
	// /live and /ready endpoints.
	http.Handler
	Checks
	Endpoints
}

Handler is an http.Handler with additional methods that register liveness and readiness checks. It handles the "/live" and "/ready" HTTP endpoints.

func NewHandler ¶

func NewHandler() Handler

NewHandler creates a new basic Handler.

func NewMetricsHandler ¶

func NewMetricsHandler(registry prometheus.Registerer, namespace string) Handler

NewMetricsHandler returns a healthcheck Handler that also exposes metrics into the provided Prometheus registry.
