conch
Distributed coordination wrapper for Linux processes, built on etcd v3. Provides leader election, counting semaphores, and distributed cron without requiring application-level etcd integration.
Install
Pre-built binaries (amd64, arm64) are available on the releases page.
sudo curl -fsSL "https://github.com/alien43/conch/releases/latest/download/conch-$(uname -s | tr A-Z a-z)-$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')" \
-o /usr/local/bin/conch && sudo chmod +x /usr/local/bin/conch
NixOS: use pkgs.conch from the flake.
Build
Using just:
just build
Or manually:
CGO_ENABLED=0 go build -ldflags="-s -w" -o conch cmd/conch/main.go
Global options
CONCH_ENDPOINTS etcd endpoints, comma-separated (default: localhost:2379)
CONCH_DIAL_TIMEOUT dial timeout (default: 5s)
CONCH_TTL lease TTL (default: 10s)
Can also be passed as flags.
elect
Run a command under a cluster-wide leader lease. Blocks until the office is vacant, then spawns the command. If the lease is lost, the child process is killed.
conch elect <office> [flags] -- <cmd...>
| Flag |
Default |
Description |
--restart |
off |
re-campaign after child exits |
--kill-after |
5s |
grace period before SIGKILL |
--wait |
forever |
give up after this duration |
--nonblock |
off |
exit 75 immediately if office is held |
--on-acquire |
— |
shell command run on winning the office |
--on-lose |
— |
shell command run on losing the office |
--hook-timeout |
30s |
timeout for hook commands |
# Run a daemon, restart on failure
conch elect api-leader --restart -- /usr/local/bin/api-server
# Promote/demote postgres on leadership change
conch elect db-primary --on-acquire "pg_ctl promote" --on-lose "pg_ctl demote" --restart -- postgres
# Check who holds the office
conch elect api-leader --who
conch elect api-leader --watch
# Assert this host holds the office (exit 0 yes, 1 no, 69 etcd unreachable)
conch elect api-leader --assert
sema
Limit concurrent execution to N slots across the cluster.
conch sema <name> --max N [flags] -- <cmd...>
| Flag |
Default |
Description |
--max |
required |
max concurrent holders |
--spread |
off |
at most 1 slot per hostname |
--wait |
forever |
give up after this duration |
--nonblock |
off |
exit 75 immediately if no slot available |
# At most 3 nodes run this at once
conch sema heavy-jobs --max 3 -- /usr/local/bin/process-video.sh
# One slot per host, 5 total
conch sema ingest --max 5 --spread -- /usr/local/bin/ingest
# Check who holds slots
conch sema heavy-jobs --max 3 --who
cron
Register and manage distributed cron jobs. Each tick runs exactly once across the cluster.
# Register a job
conch cron add <name> --schedule '<expr>' [--run-ttl 10m] -- <cmd...>
# Remove a job
conch cron rm <name>
# List jobs
conch cron ls [--last] [--json]
# Run conchd on each cluster node to execute scheduled jobs
conch conchd
Exit codes
| Code |
Meaning |
| 0 |
success |
| 1–63 |
child exit code (passed through) |
| 64 |
usage error |
| 69 |
etcd unreachable at startup |
| 70 |
lease lost during run; child was killed |
| 75 |
--nonblock or --wait expired |
Test
Using just:
just test
just test-chaos
Or manually:
go test -v -race ./...
./test-cluster-chaos.sh
Docs