Gansoi Infrastructure Monitoring
Gansoi is a modern monitoring solution inspired by classical network monitoring
software (Nagios and friends), updated to follow modern best practices.
Disclaimer and Current State
At the moment this software is unusable for anyone but early adopters
and developers. It's a just-working proof-of-concept.
Current Limitations
- No useful agents implemented yet.
- The web interface is rudimentary at best. Useless at worst.
What We Currently Have
- A distributed core implemented using Raft and
BoltDB.
- Full encryption of internal cluster communication using an internal
CA.
- The project requires nothing but a Go compiler when building. There
are no runtime dependencies.
- A simple Slack notifier.
- A few example agents for testing.
- Let's Encrypt support for the web interface.
- It's quite performant. The smallest cloud server we could rent can easily
handle hundreds of checks per second, even without any optimizations yet.
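The internal CA mentioned above can be illustrated with Go's standard library alone. The following is a minimal sketch, not Gansoi's actual implementation: the function names (`newCA`, `issueNodeCert`, `verifyNode`), the node name, and the certificate lifetimes are all assumptions made for illustration. It shows a self-signed CA issuing a node certificate, and a peer verifying that certificate against the CA.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCA creates a self-signed CA certificate and its private key.
// (Hypothetical helper, not part of Gansoi.)
func newCA(name string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	// Template and parent are the same certificate, so it is self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// issueNodeCert signs a certificate for the given node name with the CA key.
// A real node would also keep the generated private key; it is dropped here
// because the sketch only demonstrates chain verification.
func issueNodeCert(ca *x509.Certificate, caKey *ecdsa.PrivateKey, name string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: name},
		DNSNames:     []string{name},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyNode returns nil if cert chains to ca and is valid for name.
func verifyNode(ca, cert *x509.Certificate, name string) error {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:     roots,
		DNSName:   name,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	})
	return err
}

func main() {
	ca, caKey, err := newCA("example-cluster-ca")
	if err != nil {
		fmt.Println("creating CA failed:", err)
		return
	}
	cert, err := issueNodeCert(ca, caKey, "node1.example")
	if err != nil {
		fmt.Println("issuing node certificate failed:", err)
		return
	}
	fmt.Println("verification error:", verifyNode(ca, cert, "node1.example"))
}
```

Because every node trusts only the cluster's own CA, a certificate issued by any other authority is rejected, regardless of the name it carries.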
Mission Statement
Gansoi's mission is to radically improve self-hosted health monitoring and
alerting for ops and devops.
Design goals
- Fault tolerant and distributed - should tolerate multiple node failures
with no single point of failure.
- Performant - should scale to thousands of checks per second on basic
hardware.
- Zero dependencies - should install and operate without any other
software.
- Geodistributed - should be deployable across the globe for full
redundancy and geographically distributed checks.
- Multidimensional severity grouping - an alert should be categorized in
both severity and urgency.
- No false positives - the system should never cry wolf.
- Easy encryption - should support Let’s Encrypt out of the box.
- Future-proof - Everything must support IPv6 out of the box.
- Binary outcome - either a human should do something or not.
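The "multidimensional severity grouping" and "binary outcome" goals can be combined in a small data model. This is only a sketch under assumptions: the type names, the severity and urgency levels, and the policy in `NeedsHuman` are hypothetical and do not come from Gansoi's source.

```go
package main

import "fmt"

// Severity describes how bad a problem is. (Hypothetical levels.)
type Severity int

const (
	Minor Severity = iota
	Major
	Critical
)

// Urgency describes how soon someone has to react. (Hypothetical levels.)
type Urgency int

const (
	Deferred Urgency = iota
	Immediate
)

// Alert is categorized along both dimensions independently.
type Alert struct {
	Check    string
	Severity Severity
	Urgency  Urgency
}

// NeedsHuman collapses both dimensions into the binary outcome:
// either a human should do something or not. The policy used here
// (immediate urgency and at least major severity) is an assumption
// chosen for illustration.
func (a Alert) NeedsHuman() bool {
	return a.Urgency == Immediate && a.Severity >= Major
}

func main() {
	alerts := []Alert{
		{Check: "disk-usage", Severity: Minor, Urgency: Deferred},
		{Check: "http-frontend", Severity: Critical, Urgency: Immediate},
	}
	for _, a := range alerts {
		fmt.Printf("%s: needs human = %v\n", a.Check, a.NeedsHuman())
	}
}
```

Keeping severity and urgency as separate fields lets the grouping stay multidimensional, while the final decision remains a single yes or no.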
Alert integrations
- Slack - it's what we all use and love.
- PagerDuty - old-timers swear by this. Let's support it.
- Email - CEOs want their alerts too.
- Twilio voice calls - for waking up your ops team at 3 AM.
Transports
Checks can be self-contained and run on a Gansoi node, or they can run on a
third-party host. The latter requires a transport.
- SSH - Gansoi should support SSH as a transport. It is universally supported
as a remote access protocol. Gansoi must support some form of keep-alive to
avoid constant reconnects and handshakes.
- NRPE - This is the industry standard and we should support it.
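The keep-alive requirement for transports boils down to reusing an open session instead of dialing for every check. The sketch below illustrates that idea with a hypothetical `Session` interface and `Pool` type. It does not use a real SSH or NRPE client: `echoSession` is a stand-in, and none of these names come from Gansoi.

```go
package main

import (
	"fmt"
	"sync"
)

// Session is a hypothetical open connection to a remote host.
type Session interface {
	Run(command string) (string, error)
	Close() error
}

// Dialer opens a new Session. In a real SSH transport this is where the
// expensive connect and handshake would happen.
type Dialer func(host string) (Session, error)

// Pool keeps one open session per host and reuses it across checks.
type Pool struct {
	mu       sync.Mutex
	dial     Dialer
	sessions map[string]Session
}

// NewPool returns an empty pool that dials with d.
func NewPool(d Dialer) *Pool {
	return &Pool{dial: d, sessions: make(map[string]Session)}
}

// Run executes command on host, dialing only if no session is cached.
// The lock is held for the whole call, which serializes checks; that is
// a simplification to keep the sketch short.
func (p *Pool) Run(host, command string) (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()

	s, ok := p.sessions[host]
	if !ok {
		var err error
		if s, err = p.dial(host); err != nil {
			return "", err
		}
		p.sessions[host] = s
	}
	out, err := s.Run(command)
	if err != nil {
		// Drop the broken session so the next call dials again.
		delete(p.sessions, host)
		s.Close()
	}
	return out, err
}

// echoSession is a stand-in for a real remote session.
type echoSession struct{ host string }

func (e *echoSession) Run(command string) (string, error) {
	return e.host + ": " + command, nil
}

func (e *echoSession) Close() error { return nil }

func main() {
	dials := 0
	pool := NewPool(func(host string) (Session, error) {
		dials++
		return &echoSession{host: host}, nil
	})
	for i := 0; i < 3; i++ {
		out, _ := pool.Run("web1.example", "check_load")
		fmt.Println(out)
	}
	fmt.Println("dials:", dials)
}
```

A production transport would also need periodic keep-alive traffic and idle timeouts, which this sketch leaves out.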
Building and development
Go 1.9 or newer is required for building Gansoi.
This should fetch all dependencies:
$ go get ./...
This should be enough to build Gansoi:
$ go build .
You can run a small local Gansoi cluster for testing and development:
$ ./test3.sh
A three-node local cluster will be started shortly. You can visit the web
interface at https://cluster.gansoi-dev.com:9002/.
Alternatively, you can launch a single node:
$ ./gansoi demo
The node will be available at https://gansoi-dev.com:9002/.
Docker
We provide a Docker image for your convenience.
Plain Docker
You can start a Gansoi node using regular Docker with a simple command:
$ docker run --rm -p 80:80 -p 443:443 gansoi/gansoi
Compose and Stack
Included in this project is a docker-compose.yml for running Gansoi with
Docker Compose or Docker Stack.
If you're using Docker Compose, you can try Gansoi by running:
$ docker-compose up
and visiting https://gansoi-dev.com/.
The docker-compose.yml file can also be used for a Docker Stack deploy on
a Swarm:
$ docker stack deploy -c docker-compose.yml gansoi
Slack Channel
If you would like, you can join other developers in the #gansoi channel on
the Gophers Slack (invites to Gophers Slack are available here).