# Oh My GLB
A Global Service Load Balancing solution with a focus on cloud-native qualities, working natively in a Kubernetes context.
## Motivation and Architecture

Please see the extended documentation here.
## Installation and Configuration
### Installation with Helm3
Add the ohmyglb Helm repository and install the chart:
```sh
helm repo add ohmyglb https://absaoss.github.io/ohmyglb/
helm repo update
helm -n ohmyglb upgrade -i ohmyglb ohmyglb/ohmyglb --create-namespace --wait
```
See values.yaml for customization options.
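For example, you can pass your own overrides at install time; in this sketch, `my-values.yaml` is a hypothetical file holding any keys you want to change from the chart's `values.yaml`:

```sh
# Install with custom overrides; my-values.yaml is a hypothetical file
# containing keys from the chart's values.yaml.
helm -n ohmyglb upgrade -i ohmyglb ohmyglb/ohmyglb \
  --create-namespace -f my-values.yaml --wait
```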
### Local Playground Install
#### Environment prerequisites
- install GO 1.14
- install GIT
- install gnu-sed if you don't have it
  - if you are on a Mac, install sed via Homebrew: `brew install gnu-sed`
- install Docker
- ensure you are able to push/pull from your docker registry
- to run multiple clusters, reserve 8 GB of memory (this setting is found in the Docker for Mac preferences; its location in other Docker distributions may vary)
- install Kubectl to operate clusters
- install Helm3 to get charts
- install kind as a tool for running local Kubernetes clusters (a quick version check covering all of these tools is shown below)
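A quick way to confirm the toolchain is in place, using each tool's standard version command (`gsed` applies on macOS with gnu-sed from Homebrew):

```sh
# Print versions of the required tools; a "command not found" points
# at a missing prerequisite.
go version                # expect go1.14.x
git --version
gsed --version            # plain `sed --version` on Linux
docker version
kubectl version --client
helm version              # expect v3.x
kind version
```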
### Running project locally
To spin up a local environment using two Kind clusters and deploy a test application to both clusters, execute the command below:
```sh
make deploy-full-local-setup
```
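Once the target finishes, both Kind clusters should be listed (cluster names inferred from the kubectl contexts used below):

```sh
# kind lists clusters without its own "kind-" context prefix.
kind get clusters
# test-gslb1
# test-gslb2
```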
### Verify installation
If the local setup ran well, check that the clusters are correctly installed:
```sh
kubectl cluster-info --context kind-test-gslb1 && kubectl cluster-info --context kind-test-gslb2
```
Check that the etcd cluster is healthy:
```sh
kubectl run --rm -i --tty --env="ETCDCTL_API=3" \
  --env="ETCDCTL_ENDPOINTS=http://etcd-cluster-client:2379" \
  --namespace ohmyglb etcd-test --image quay.io/coreos/etcd \
  --restart=Never -- /bin/sh -c 'etcdctl member list'
```
As the expected output, you will see three started etcd-cluster member pods:
```
...
c3261c079f6990a7, started, etcd-cluster-5bcpvf6ngz, http://etcd-cluster-5bcpvf6ngz.etcd-cluster.ohmyglb.svc:2380, http://etcd-cluster-5bcpvf6ngz.etcd-cluster.ohmyglb.svc:2379
eb6ead15c2b92606, started, etcd-cluster-6d8pxtpklm, http://etcd-cluster-6d8pxtpklm.etcd-cluster.ohmyglb.svc:2380, http://etcd-cluster-6d8pxtpklm.etcd-cluster.ohmyglb.svc:2379
eed5a40bbfb6ee97, started, etcd-cluster-xsjmwdkdf8, http://etcd-cluster-xsjmwdkdf8.etcd-cluster.ohmyglb.svc:2380, http://etcd-cluster-xsjmwdkdf8.etcd-cluster.ohmyglb.svc:2379
...
```
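You can also ask etcd for endpoint health directly, using the same one-shot pod pattern (the pod name `etcd-health` here is arbitrary):

```sh
# One-shot pod reporting the health of the etcd client endpoint.
kubectl run --rm -i --tty --env="ETCDCTL_API=3" \
  --env="ETCDCTL_ENDPOINTS=http://etcd-cluster-client:2379" \
  --namespace ohmyglb etcd-health --image quay.io/coreos/etcd \
  --restart=Never -- /bin/sh -c 'etcdctl endpoint health'
```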
Cluster test-gslb1 exposes external DNS on the default port :5053, while test-gslb2 uses port :5054:
```sh
dig @localhost localtargets.app3.cloud.example.com -p 5053 && dig -p 5054 @localhost localtargets.app3.cloud.example.com
```
As the expected result, you should see six A records split between the nodes of both clusters:
```
...
...
;; ANSWER SECTION:
localtargets.app3.cloud.example.com. 30 IN A 172.17.0.2
localtargets.app3.cloud.example.com. 30 IN A 172.17.0.5
localtargets.app3.cloud.example.com. 30 IN A 172.17.0.3
...
...
localtargets.app3.cloud.example.com. 30 IN A 172.17.0.8
localtargets.app3.cloud.example.com. 30 IN A 172.17.0.6
localtargets.app3.cloud.example.com. 30 IN A 172.17.0.7
```
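To get just the addresses, handy for scripting, you can add `+short`:

```sh
# Compact output: only the A record values, one per line.
dig +short @localhost -p 5053 localtargets.app3.cloud.example.com
dig +short @localhost -p 5054 localtargets.app3.cloud.example.com
```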
Both clusters have podinfo installed on top. Run the following command and check that you get two JSON responses:
```sh
curl localhost:80 -H "Host:app3.cloud.example.com" && curl localhost:81 -H "Host:app3.cloud.example.com"
```
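If you have `jq` installed, a quick sketch to pull out just the region tag (the `message` field shown in the demos below):

```sh
# Extract only the region tag from each podinfo response.
curl -s localhost:80 -H "Host:app3.cloud.example.com" | jq -r .message
curl -s localhost:81 -H "Host:app3.cloud.example.com" | jq -r .message
```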
### Run integration tests
GSLB provides a wide range of scenarios, and all of them are covered by tests. To check whether everything is running properly, execute the terratests:
```sh
make terratest
```
### Cleaning
Clean up your local development clusters with:
```sh
make destroy-full-local-setup
```
## Sample demo
### Round Robin
Both clusters have podinfo installed on top, and each cluster has been tagged to serve a different region. In this demo we will hit podinfo with `wget -qO - app3.cloud.example.com` and, depending on the region, podinfo will return `us` or `eu`. In the current round-robin implementation, IP addresses are picked randomly.

See the Gslb manifest with the round robin strategy.
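For illustration only, a round-robin Gslb resource has roughly the shape below; the apiVersion, field names, and the `frontend-podinfo` service name are assumptions here, so treat the linked manifest as authoritative:

```sh
# Sketch of a Gslb resource using the roundRobin strategy (field names
# assumed; check the referenced manifest for the exact format).
cat <<EOF | kubectl --context kind-test-gslb1 apply -f -
apiVersion: ohmyglb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: test-gslb
  namespace: test-gslb
spec:
  ingress:
    rules:
      - host: app3.cloud.example.com
        http:
          paths:
            - backend:
                serviceName: frontend-podinfo
                servicePort: http
  strategy: roundRobin
EOF
```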
Run the command below several times and watch the `message` field:
```sh
make test-round-robin
```
As the expected result, you should see the podinfo message changing:
```json
{
  "hostname": "frontend-podinfo-856bb46677-8p45m",
  ...
  "message": "us",
  ...
}
{
  "hostname": "frontend-podinfo-856bb46677-8p45m",
  ...
  "message": "eu",
  ...
}
```
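The rotation happens at the DNS layer, so you can also observe it by resolving the app host repeatedly against an exposed DNS port (a sketch; assumes `app3.cloud.example.com` is served on the same port as `localtargets` above):

```sh
# Resolve the round-robin host a few times; the first returned address
# should vary between queries as the records are shuffled.
for i in $(seq 1 5); do
  dig +short @localhost -p 5053 app3.cloud.example.com | head -1
done
```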
### Failover
Both clusters have podinfo installed on top, and each cluster has been tagged to serve a different region. In this demo we will hit podinfo with `wget -qO - failover.cloud.example.com` and, depending on whether podinfo is running inside the cluster, it returns only `eu` or `us`.

See the Gslb manifest with the failover strategy.
Switch GSLB to failover mode:
```sh
make init-failover
```
Now both clusters are running in failover mode and podinfo is running on both of them.
Run the command below several times and watch the `message` field:
```sh
make test-failover
```
You will see that only the eu podinfo is responsive:
```json
{
  "hostname": "frontend-podinfo-856bb46677-8p45m",
  ...
  "message": "eu",
  ...
}
```
Stop podinfo on the current (eu) cluster:
```sh
make stop-test-app
```
Hit the application several times again:
```sh
make test-failover
```
As the expected result, you should see that only podinfo from the second cluster (us) is responding:
```json
{
  "hostname": "frontend-podinfo-856bb46677-v5nll",
  ...
  "message": "us",
  ...
}
```
It might happen that podinfo is unavailable for a while due to the DNS sync interval and the default ohmyglb DNS TTL of 30 seconds:
```
wget: server returned error: HTTP/1.1 503 Service Temporarily Unavailable
```
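While waiting, you can watch the zone converge (a sketch; assumes `failover.cloud.example.com` resolves via the same exposed DNS port as above):

```sh
# Poll the failover host every 5 seconds; within the 30s TTL window
# the eu targets should drop out, leaving only the us addresses.
watch -n 5 "dig +short @localhost -p 5053 failover.cloud.example.com"
```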
Start podinfo again on the current (eu) cluster:
```sh
make start-test-app
```
and hit podinfo several times:
```sh
make test-failover
```
After the DNS sync interval is over, eu will be back:
```json
{
  "hostname": "frontend-podinfo-6945c9ddd7-xksrc",
  ...
  "message": "eu",
  ...
}
```
Optionally, you can switch GSLB back to round-robin mode:
```sh
make init-round-robin
```
## Metrics
OhMyGLB generates Prometheus-compatible metrics.
Metrics endpoints are exposed via the `-metrics` service in the operator namespace and can be scraped by 3rd-party tools:
```yaml
spec:
  ...
  ports:
    - name: http-metrics
      port: 8383
      protocol: TCP
      targetPort: 8383
    - name: cr-metrics
      port: 8686
      protocol: TCP
      targetPort: 8686
```
Metrics can also be automatically discovered and monitored by Prometheus Operator via automatically generated ServiceMonitor CRDs, in case Prometheus Operator is deployed into the cluster.
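To inspect the metrics by hand, you can port-forward the metrics service and scrape it with curl; the exact service name depends on the operator deployment, so look it up first (`ohmyglb-metrics` below is an assumption):

```sh
# Find the metrics service, forward the general-metrics port locally,
# and fetch a sample metric.
kubectl -n ohmyglb get svc | grep metrics
kubectl -n ohmyglb port-forward svc/ohmyglb-metrics 8383:8383 &   # name assumed
curl -s localhost:8383/metrics | grep ohmyglb_gslb_healthy_records
```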
### General metrics
The controller-runtime standard metrics, extended with the OhMyGLB operator-specific metrics listed below:
`healthy_records`: Number of healthy records observed by OhMyGLB.
Example:
```
# HELP ohmyglb_gslb_healthy_records Number of healthy records observed by OhMyGLB.
# TYPE ohmyglb_gslb_healthy_records gauge
ohmyglb_gslb_healthy_records{name="test-gslb",namespace="test-gslb"} 6
```
`ingress_hosts_per_status`: Number of ingress hosts per status (NotFound, Healthy, Unhealthy), observed by OhMyGLB.
Example:
```
# HELP ohmyglb_gslb_ingress_hosts_per_status Number of managed hosts observed by OhMyGLB.
# TYPE ohmyglb_gslb_ingress_hosts_per_status gauge
ohmyglb_gslb_ingress_hosts_per_status{name="test-gslb",namespace="test-gslb",status="Healthy"} 1
ohmyglb_gslb_ingress_hosts_per_status{name="test-gslb",namespace="test-gslb",status="NotFound"} 1
ohmyglb_gslb_ingress_hosts_per_status{name="test-gslb",namespace="test-gslb",status="Unhealthy"} 2
```
Served on the `0.0.0.0:8383/metrics` endpoint.
### Custom resource specific metrics
Info metrics, automatically exposed by the operator based on the number of current instances of the operator's custom resources in the cluster.
Example:
```
# HELP gslb_info Information about the Gslb custom resource.
# TYPE gslb_info gauge
gslb_info{namespace="test-gslb",gslb="test-gslb"} 1
```
Served on the `0.0.0.0:8686/metrics` endpoint.