health

Version: v0.2.1
Published: Oct 12, 2020 License: MIT Imports: 13 Imported by: 0

README

healthchecker

A dead simple health checker for Go services.

TL;DR

Checks the availability of all services your service depends on and provides /.well-known/alive and /.well-known/ready endpoints. Supports some probes out of the box and can be extended with your own readiness probes. See all available probes or create a custom probe.

Learn more about health checks.

Usage

Serve on same port with your application

func main() {
	checker := &health.Checker{}

	// This can be any *http.ServeMux
	checker.AppendHealthEndpoints(http.DefaultServeMux)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		_, _ = w.Write([]byte("Hello World!"))
	})

	_ = http.ListenAndServe(":8080", http.DefaultServeMux)
}

Serve on separate port

func main() {
    checker := &health.Checker{}
    defer checker.ServeHTTPBackground(":8080")()

    // Check an external gRPC Service
    cc, _ := grpc.Dial(...)
    checker.AddReadinessProbe("my-grpc-service", health.GrpcProbe(cc))
}

Custom Probes

A health.Probe is just a plain function that returns an error if the service cannot be reached. The probe is called every time the readiness endpoint is called, so use the simplest possible way to check whether the service you depend on is up and running.

// MyCustomServiceProbe checks whether the custom service can be reached.
func MyCustomServiceProbe(srv *customService) health.Probe {
    return func() error {
        // Check your service for availability
        available, err := srv.Ping()
        if !available || err != nil {
            // Service is not available. We return an error.
            return fmt.Errorf("service is unavailable: %v", err)
        }

        // The service we depend on is up and running!
        return nil
    }
}

Usage

checker := &health.Checker{}

srv := NewCustomService()
checker.AddReadinessProbe("my-service", MyCustomServiceProbe(srv))

Integrate with Kubernetes

This package is designed to integrate seamlessly with Kubernetes. Let's assume we have a container image named company/my-service which uses this package. Then you can add the following lines to your deployment to enable monitoring by Kubernetes. Kubernetes will then automatically restart your service if it does not report alive in three consecutive requests. The pod is also skipped by the load balancer as long as it isn't ready. Learn more about health checks.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: service1
spec:
  selector:
    matchLabels:
      app: service1
  template:
    metadata:
      labels:
        app: service1
    spec:
      containers:
        - name: service1
          image: company/my-service:latest
          ports:
            - containerPort: 80
+          livenessProbe:
+            httpGet:
+              path: /.well-known/alive
+              port: 80
+          readinessProbe:
+            httpGet:
+              path: /.well-known/ready
+              port: 80

About Health Checks

Kubernetes distinguishes between liveness and readiness checks. Thus, our services should provide two endpoints: one to check if the service is alive and one to check if the service is ready.

Alive

A service is defined as alive if it started correctly and accepts incoming requests. A service that reports not alive three times in a row is killed and automatically restarted.

Ready

A service is defined as ready if all mandatory dependent services, for example a database, can be reached and the service can work as expected. A service that reports not ready three times in a row is skipped by the internal load balancer.

A service that is alive but not ready has to recover by itself.

ℹ By default, both states are checked every 2 seconds after an initial delay of 10 seconds. If a service needs more than 5 seconds to come up (alive=true), you should increase the initial delay to twice the mean startup time.

State decision matrix

The following table contains a set of common states / events and the expected health report.

| State                                     | Alive | Ready |
|-------------------------------------------|-------|-------|
| Startup phase                             | false | false |
| Ready                                     | true  | true  |
| Deadlock                                  | false | false |
| Heavy load due to processing lots of data | true  | false |
| Database cannot be reached                | true  | false |
| Mandatory Service cannot be reached       | true  | false |
| Volume has insufficient space             | true  | false |
| Slow response from service / database     | true  | true  |
| Leader cannot be reached                  | true  | false |

Implementation

A service must implement health endpoints to check if it is alive and ready. Both have to be served via HTTP/1.1 under the same port on all interfaces. The routes should be /.well-known/alive and /.well-known/ready. These endpoints must not require any authentication or any additional header. The response should be either 200 OK or 503 Service Unavailable, with a minimal JSON body.

Both endpoints should be served independently of the main application, on a different port.

Alive

The response for the liveness probe should be a simple true or false.

/.well-known/alive: success

HTTP/1.1 200 OK
Content-Type: application/json

{
	"alive": true
}

/.well-known/alive: failure

HTTP/1.1 503 Service Unavailable
Content-Type: application/json

{
	"alive": false
}

Ready

The response for the readiness probe should be a simple true or false. For debugging purposes, the failure response can contain a list of simple reasons why a service is unhealthy. Detailed information should be reported via the metrics / telemetry endpoints.

/.well-known/ready: success

HTTP/1.1 200 OK
Content-Type: application/json

{
	"ready": true
}

/.well-known/ready: failure

HTTP/1.1 503 Service Unavailable
Content-Type: application/json

{
	"ready": false,
	"reasons": [
		"dgraph: Service unreachable"
	]
}

Documentation

Types

type Checker

type Checker struct {
	// contains filtered or unexported fields
}

A Checker can be used to provide a liveness and a readiness endpoint for your application. Use `checker.AddReadinessProbe` to add a test for readiness.

func (*Checker) AddReadinessProbe

func (h *Checker) AddReadinessProbe(service string, probe Probe)

Adds a probe that is run each time the service is checked for readiness. Example:

conn, _ := grpc.Dial(...)
checker.AddReadinessProbe("eventstore", health.GrpcProbe(conn))

func (*Checker) AppendHealthEndpoints added in v0.2.0

func (h *Checker) AppendHealthEndpoints(m *http.ServeMux)

Appends the `/.well-known/alive` and `/.well-known/ready` endpoints to the given ServeMux.

func (*Checker) ServeHTTP

func (h *Checker) ServeHTTP(addr string) error

Serves the health status endpoints via HTTP.

func (*Checker) ServeHTTPBackground

func (h *Checker) ServeHTTPBackground(addr string) func()

Serves the health endpoints in the background. Calls os.Exit(1) on error. Use with defer to shut down the server gracefully. Example:

func main() {
	health := &Checker{}
	defer health.ServeHTTPBackground(":8080")()
}

func (*Checker) Shutdown

func (h *Checker) Shutdown() error

Gracefully stops the health checker.

type GrpcStateReporter

type GrpcStateReporter interface {
	GetState() connectivity.State
}

Interface matching a gRPC client's state method.

type MongoStateReporter added in v0.2.1

type MongoStateReporter interface {
	Ping(ctx context.Context, rp *readpref.ReadPref) error
}

Interface matching a MongoDB client's ping method.

type NatsStateReporter

type NatsStateReporter interface {
	Status() nats.Status
}

Interface matching a NATS client's status method.

type Probe

type Probe func() error

A Probe is a health check for a service you depend on. Should return an error if the tested service is unhealthy.

func GrpcProbe

func GrpcProbe(conn GrpcStateReporter) Probe

Checks a gRPC connection for readiness.

Example:

cc, _ := grpc.Dial(...)
checker.AddReadinessProbe("my-grpc-service", health.GrpcProbe(cc))

func HTTPProbe added in v0.2.0

func HTTPProbe(endpoint string) Probe

Pings an HTTP endpoint for readiness. The called endpoint should return a 2xx status. **INFO:** If you check another service that uses this library, always use the `/.well-known/alive` endpoint to prevent cascading requests.

Example:

checker.AddReadinessProbe("my-http-service", health.HTTPProbe("http://my-service:8080/.well-known/alive"))

func MongoProbe added in v0.2.1

func MongoProbe(client MongoStateReporter) Probe

Checks a MongoDB connection for readiness.

Example:

client, _ := mongo.Connect(ctx, options.Client().ApplyURI(uri))
checker.AddReadinessProbe("my-mongo-client", health.MongoProbe(client))

func NatsProbe

func NatsProbe(conn NatsStateReporter) Probe

Checks a NATS connection for readiness.

Example:

sc, _ := stan.Connect(...)
checker.AddReadinessProbe("my-stan-service", health.NatsProbe(sc.NatsConn()))

func RedisPoolProbe

func RedisPoolProbe(pool *redis.Pool) Probe

Checks a pool of Redis connections for readiness.

func SQLProbe

func SQLProbe(db *sql.DB) Probe

Checks a SQL connection for readiness.

func VaultProbe

func VaultProbe(hr VaultHealthReporter) Probe

Checks a Vault connection for readiness.

type VaultHealthReporter

type VaultHealthReporter interface {
	Health() (*vault.HealthResponse, error)
}

Interface matching a Vault client's health method.
