health

v0.2.1 (latest), published Oct 12, 2020. License: MIT. Imports: 13. Imported by: 0.

Repository

github.com/regiocom/healthchecker

README ¶

healthchecker

A dead simple health checker for Go services.

TL;DR

Checks the availability of all services your service depends on and provides /.well-known/alive and /.well-known/ready endpoints. Supports several probes out of the box and can be extended with your own readiness probes. See all available probes or create a custom probe.

Learn more about health checks.

Usage

Serve on same port with your application

func main() {
	checker := health.Checker{}

	// This can be any *http.ServeMux.
	checker.AppendHealthEndpoints(http.DefaultServeMux)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		_, _ = w.Write([]byte("Hello World!"))
	})

	_ = http.ListenAndServe(":8080", http.DefaultServeMux)
}

Serve on separate port

func main() {
    checker := &health.Checker{}
    defer checker.ServeHTTPBackground(":8080")()

    // Check an external gRPC Service
    cc, _ := grpc.Dial(...)
    checker.AddReadinessProbe("my-grpc-service", health.GrpcProbe(cc))
}

Custom Probes

A health.Probe is just a plain function that returns an error if the service cannot be reached. The probe is called every time the readiness endpoint is called, so use the simplest possible check to determine whether the service you depend on is up and running.

// Checks if the custom service can be reached.
func MyCustomServiceProbe(srv *customService) health.Probe {
	return func() error {
		// Check your service for availability.
		available, err := srv.Ping()
		if !available || err != nil {
			// Service is not available. We return an error.
			return fmt.Errorf("service is unavailable: %v", err)
		}

		// The service we depend on is up and running!
		return nil
	}
}

Usage

checker := &health.Checker{}

srv := NewCustomService()
checker.AddReadinessProbe("my-service", MyCustomServiceProbe(srv))

Integrate with Kubernetes

This package is designed to integrate seamlessly with Kubernetes. Let's assume we have a container image named company/my-service which uses this package. Then you can add the following lines (marked with +) to your deployment to enable monitoring by Kubernetes. Kubernetes will then automatically restart your service if it does not report alive in three consecutive requests. The pod is also skipped by the load balancer as long as it isn't ready. Learn more about health checks.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: service1
spec:
  selector:
    matchLabels:
      app: service1
  template:
    metadata:
      labels:
        app: service1
    spec:
      containers:
        - name: service1
          image: company/my-service:latest
          ports:
            - containerPort: 80
+          livenessProbe:
+            httpGet:
+              path: /.well-known/alive
+              port: 80
+          readinessProbe:
+            httpGet:
+              path: /.well-known/ready
+              port: 80

About Health Checks

Kubernetes distinguishes between liveness and readiness checks. Thus, our services should provide two endpoints: one to check if the service is alive and one to check if the service is ready.

Alive

A service is defined as alive if it started correctly and accepts incoming requests. A service which is not alive more than three times in a row will be killed and automatically restarted.

Ready

A service is defined as ready if all mandatory dependent services, for example a database, can be reached and the service can work as expected. A service which is not ready more than three times in a row will be skipped by the internal load balancer.

A service which is alive, but not ready has to recover itself.

ℹ By default both states are checked every 2 seconds after an initial delay of 10 seconds. If a service needs more than 5 seconds to come up (alive=true), you should increase the initial delay to twice the mean startup time.

State decision matrix

The following table contains a set of common states / events and the expected health report.

State	Alive	Ready
Startup phase	false	false
Ready	true	true
Deadlock	false	false
Heavy load due to processing lots of data	true	false
Database cannot be reached	true	false
Mandatory Service cannot be reached	true	false
Volume has insufficient space	true	false
Slow response from service / database	true	true
Leader cannot be reached	true	false

Implementation

A service must implement health endpoints to check if it is alive and ready. Both have to be served via HTTP/1.1 on the same port on all interfaces. The routes should be /.well-known/alive and /.well-known/ready. Those endpoints must not require any authentication or any additional header. The response should be either 200 OK or 503 Service Unavailable, with a minimal JSON body.

Both endpoints should be served independently and next to the main application on a different port.

Alive

The response for the liveness probe should be a simple true or false.

/.well-known/alive: success

HTTP/1.1 200 OK
Content-Type: application/json

{
	"alive": true
}

/.well-known/alive: failure

HTTP/1.1 503 Service Unavailable
Content-Type: application/json

{
	"alive": false
}

Ready

The response for the readiness probe should be a simple true or false. For debugging purposes, the failure response can contain a list of simple reasons why a service is unhealthy. Detailed information should be reported via the metrics / telemetry endpoints.

/.well-known/ready: success

HTTP/1.1 200 OK
Content-Type: application/json

{
	"ready": true
}

/.well-known/ready: failure

HTTP/1.1 503 Service Unavailable
Content-Type: application/json

{
	"ready": false,
	"reasons": [
		"dgraph: Service unreachable"
	]
}

Documentation ¶

Types ¶

type Checker ¶

type Checker struct {
	// contains filtered or unexported fields
}

A Checker can be used to provide a liveness and a readiness endpoint for your application. Use `checker.AddReadinessProbe` to add a test for readiness.

func (*Checker) AddReadinessProbe ¶

func (h *Checker) AddReadinessProbe(service string, probe Probe)

Adds a probe that is run each time the service is checked for readiness. Example:

conn, _ := grpc.Dial(...)
checker.AddReadinessProbe("eventstore", health.GrpcProbe(conn))

func (*Checker) AppendHealthEndpoints ¶ added in v0.2.0

func (h *Checker) AppendHealthEndpoints(m *http.ServeMux)

Appends the `/.well-known/alive` and `/.well-known/ready` endpoints to the given ServeMux.

func (*Checker) ServeHTTP ¶

func (h *Checker) ServeHTTP(addr string) error

Serves the health status endpoints via HTTP.

func (*Checker) ServeHTTPBackground ¶

func (h *Checker) ServeHTTPBackground(addr string) func()

Serves the health endpoints in the background. Calls os.Exit(1) on error. Use with defer to gracefully shut down the server. Example:

func main() {
	health := &Checker{}
	defer health.ServeHTTPBackground(":8080")()
}

func (*Checker) Shutdown ¶

func (h *Checker) Shutdown() error

Gracefully stops health checker

type GrpcStateReporter ¶

type GrpcStateReporter interface {
	GetState() connectivity.State
}

Interface matching a gRPC client's state method.

type MongoStateReporter ¶ added in v0.2.1

type MongoStateReporter interface {
	Ping(ctx context.Context, rp *readpref.ReadPref) error
}

Interface matching a mongodb client's ping method.

type NatsStateReporter ¶

type NatsStateReporter interface {
	Status() nats.Status
}

Interface matching a nats client's status method.

type Probe ¶

type Probe func() error

A Probe is a health check for a service you depend on. Should return an error if the tested service is unhealthy.

func GrpcProbe ¶

func GrpcProbe(conn GrpcStateReporter) Probe

Checks a grpc connection for readiness.

Example:

cc, _ := grpc.Dial(...)
checker.AddReadinessProbe("my-grpc-service", health.GrpcProbe(cc))

func HTTPProbe ¶ added in v0.2.0

func HTTPProbe(endpoint string) Probe

Pings an HTTP endpoint for readiness. The called endpoint should return a 2xx status. **INFO:** If you check another service using this lib, always use the `/.well-known/alive` endpoint to prevent cascading requests.

Example:

checker.AddReadinessProbe("my-http-service", health.HTTPProbe("http://my-service:8080/.well-known/alive"))

func MongoProbe ¶ added in v0.2.1

func MongoProbe(client MongoStateReporter) Probe

Checks a mongodb connection for readiness.

Example:

client, _ := mongo.Connect(ctx, options.Client().ApplyURI(uri))
checker.AddReadinessProbe("my-mongo-client", health.MongoProbe(client))

func NatsProbe ¶

func NatsProbe(conn NatsStateReporter) Probe

Checks a nats connection for readiness.

Example:

sc, _ := stan.Connect(...)
checker.AddReadinessProbe("my-stan-service", health.NatsProbe(sc.NatsConn()))

func RedisPoolProbe ¶

func RedisPoolProbe(pool *redis.Pool) Probe

Checks a pool of redis connections for readiness.

func SQLProbe ¶

func SQLProbe(db *sql.DB) Probe

Checks a SQL connection for readiness.

func VaultProbe ¶

func VaultProbe(hr VaultHealthReporter) Probe

Checks a vault connection for readiness.

type VaultHealthReporter ¶

type VaultHealthReporter interface {
	Health() (*vault.HealthResponse, error)
}

Interface matching a vault client's health method.
