atc

command module
v0.2.0
Published: Jun 22, 2026 License: MIT Imports: 1 Imported by: 0

README

ATC (Active Traffic Control)

ATC is a lightweight, high-performance Go service that automates the creation and management of Consul service-resolver configurations. As its name suggests, ATC monitors the state of your Consul endpoints and automatically publishes failover or redirect rules when failing services are detected.

Originally built on a heavy dependency kit, ATC has been modernised to rely mainly on Go standard library primitives, which keeps it lightweight and its dependency tree minimal.


Features

  • Consul Automation: Automatically watches service health checks and resolves configurations to create failover (failover.hcl) and redirection (redirect.hcl) service resolvers.
  • Consul Catalog Failover: Automatically monitors Consul catalog changes and registers Prepared Queries (atc-<service-name>) with geo-failover to the nearest 2 datacenters for tagged services.
  • Oscillation Dampening (Hysteresis): Protects Consul from configuration churn by debouncing catalog flapping. Supports global defaults, custom tag overrides, and operator safety boundaries.
  • Active-Passive High Availability: Run multiple ATC instances in active-passive mode. Dynamic leadership is coordinated via target-scoped Consul KV Session Locks (e.g. <lock_key>/forwarder and <lock_key>/redirector) to allow independent component failover without split-workload deadlocks, with standby nodes serving read-only dashboards and metrics.
  • Embedded React Dashboard: Glassmorphic web dashboard served at / that displays tracked services, their prepared queries, active tags, failover status, and active leader designation.
  • Native Concurrency: Built on the Go standard library context and sync packages, plus golang.org/x/sync/errgroup, for safe, coordinated background execution.
  • OpenTelemetry Integration: Full support for OpenTelemetry Traces, Metrics, and Logs. Traces and logs are automatically correlated and exported via OTLP, and metrics are exposed via OTLP and a Prometheus bridge.
  • log/slog Structured Logging: Multiplexed, structured, level-filtered logging using the standard library log/slog package, which writes to the console (stderr) and propagates context to OpenTelemetry.
  • Dual HTTP Ports: Isolates application API endpoints from metrics for production security:
    • Main HTTP Port (default :8088): Serves the React dashboard at /, exposes /ready, /services, /api/services, /api/overrides, /api/leader, /api/federation, and the MCP interface.
    • Metrics HTTP Port (default :8089): Serves OpenTelemetry metrics in Prometheus format at /metrics.
  • Model Context Protocol (MCP) Server: Hosts a native MCP server over Streamable HTTP transport at /mcp to expose system state and manual write actions directly to LLM agents and AI clients.

Important: API-to-MCP Tool Mapping Rule. For every HTTP API endpoint exposed by the ATC server, an equivalent MCP tool must be registered on the MCP server to ensure parity between standard monitoring APIs and agent capabilities.


Install

macOS

brew tap atcprojectio/tap
brew install atcprojectio/tap/atc

From Source

# Clone the repository
git clone https://github.com/atcprojectio/atc.git
cd atc

# Compile the binary
make build

Deployment

Production-grade deployment configurations for Nomad and Kubernetes are available in the deploy directory.

Kubernetes (Helm)

A Helm chart is located under deploy/helm/atc. You can install it with:

helm install atc ./deploy/helm/atc --values ./deploy/helm/atc/values.yaml

To configure strategy rules, edit the config.strategiesYaml block in values.yaml.

Nomad (HCL)

A Nomad job definition is available at deploy/nomad/atc.nomad.hcl. It runs two instances in Active-Passive HA mode using Consul session locks.

Submit the job using:

nomad job run ./deploy/nomad/atc.nomad.hcl

Demo Project

A complete, self-contained multi-datacenter demo environment is available in the atc-demo repository. It automatically spins up:

  • Two WAN-federated Consul servers (dc1 and dc2).
  • Two ATC service instances running in active-passive HA mode.
  • Mock target services and an automated Python traffic client to showcase live routing, failover, and redirect behavior.

To run it, clone the demo repository and follow the instructions in the README:

# Clone the demo repository
git clone https://github.com/atcprojectio/atc-demo.git
cd atc-demo

# Pull images, start the stack and run the demo walkthrough
make pull
make up
make run-demo

Usage

Start the ATC background watcher process:

./dist/atc server [flags]

Key Flags
  • --port (int): Port to expose main service endpoints on (default: 8088).
  • --metrics_port (int): Port to expose Prometheus-formatted metrics on (default: 8089).
  • --log_level (string): Only log messages with this severity or above (debug, info, warn, error) (default: info).
  • --target (strings): Comma-separated list of components to run (consul, forwarder, redirector, server, all) (default: all).
  • --consul_addr (string): Consul HTTP endpoint address.
  • --consul_token (string): Consul ACL token.
  • --consul_dc (string): Consul target datacenter.
  • --config (string): Path to ATC configuration file.
  • --ui-enabled (bool): Enable serving the embedded React Web UI dashboard (default: true).

Environment Variables

OpenTelemetry exporters can be configured using standard OpenTelemetry environment variables:

  • OTEL_EXPORTER_OTLP_ENDPOINT: Target OTel collector endpoint (default: http://localhost:4318 via HTTP).
  • OTEL_SERVICE_NAME: The name of the service (default: atc).
  • OTEL_SDK_DISABLED: Set to true to completely disable telemetry collection.

Configuration Validation & Linting

ATC supports linting and validating configuration files (e.g., strategies.yaml) using the validate command:

./dist/atc validate --config strategies.yaml

This parses and verifies all configuration options (such as target modules, dampening durations, HA election parameters, failover targets, and connect timeouts) against schema validation rules. It exits with code 0 if the configuration is valid. If validation fails, it prints detailed error messages to stderr and exits with code 1, which makes it suitable for integration into GitOps CI/CD pipelines.
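The exit-code contract described above can be sketched in a few lines of Go. This is only an illustration, not ATC's validator: the `config` struct, `checkDuration`, and `validate` are hypothetical names, and the only rule checked here (each duration must parse and be non-negative) is an assumption standing in for the real schema rules. The field names and sample values follow the example strategies.yaml in this README.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// config is a hypothetical, flattened subset of strategies.yaml.
// Durations stay as strings, the way they appear in the YAML file.
type config struct {
	DampeningPeriod    string
	MinDampeningPeriod string
	SessionTTL         string
}

// checkDuration reports an error when value is not a non-negative Go duration.
func checkDuration(field, value string) error {
	d, err := time.ParseDuration(value)
	if err != nil {
		return fmt.Errorf("%s: %w", field, err)
	}
	if d < 0 {
		return fmt.Errorf("%s: must not be negative, got %s", field, value)
	}
	return nil
}

// validate collects every problem instead of stopping at the first one,
// so a CI job can show all errors in a single run.
// errors.Join drops nil entries and returns nil when nothing failed.
func validate(c config) error {
	return errors.Join(
		checkDuration("dampening_period", c.DampeningPeriod),
		checkDuration("min_dampening_period", c.MinDampeningPeriod),
		checkDuration("ha.session_ttl", c.SessionTTL),
	)
}

func main() {
	c := config{DampeningPeriod: "5s", MinDampeningPeriod: "0s", SessionTTL: "15s"}
	if err := validate(c); err != nil {
		fmt.Fprintln(os.Stderr, err) // details go to stderr
		os.Exit(1)                   // non-zero exit fails the pipeline
	}
	fmt.Println("configuration is valid")
}
```

In a pipeline, the real `atc validate` command plays the role of `main` here: the job only needs to check the process exit code.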


Predefined Routing & Failover Strategies

ATC supports predefined routing strategies that admins can configure in a YAML file loaded via the --config flag. Predefined strategies are defined under strategies.failover and strategies.redirect.

If a strategy named default is configured (e.g. strategies.failover.default or strategies.redirect.default), teams can register their services using only the atc.enabled=true tag. ATC will automatically fall back to the default strategy if the specific atc.failover or atc.redirect tag is omitted.

Configuration Example (strategies.yaml)

dampening_period: "5s"
min_dampening_period: "0s"

ha:
  enabled: true
  lock_key: "atc/leader/lock"
  session_ttl: "15s"

strategies:
  failover:
    standard-failover:
      connect_timeout: "10s"
      targets:
        - datacenter: "dc2"
    multi-region-failover:
      connect_timeout: "5s"
      targets:
        - datacenter: "dc2"
        - datacenter: "dc3"
          service: "fallback-service"
  redirect:
    standard-redirect:
      datacenter: "dc2"
    geo-redirect:
      service: "geo-fallback"
      datacenter: "dc3"

Invoking Strategies and Hysteresis via Consul Tags

Teams can apply these predefined strategies and override dampening boundaries to their Consul services using the following tags in the service definition:

  • atc.failover=<strategy-name>: Specifies the predefined failover strategy to use. If omitted or not found, it defaults to failing over to the dynamically resolved target datacenter for the same service.
  • atc.redirect=<strategy-name>: Specifies the predefined redirection strategy to use when the service goes offline (is deleted from the catalog). If omitted or not found, it defaults to redirecting to the dynamically resolved target datacenter for the same service.
  • atc.dampening=<duration>: Custom hysteresis override (e.g. atc.dampening=10s, or atc.dampening=0s for immediate mode). Values below min_dampening_period are raised to that minimum.

When a service is active, the strategy and dampening tag values are persisted in the Consul service-resolver entry metadata, which allows the redirector to apply the correct policies even after the service is removed from the Consul catalog.
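The tag rules above can be read as a small lookup. The sketch below is one possible reading, not ATC's code: `parseTags` and `resolveStrategy` are hypothetical helpers, and the precedence is an assumption pieced together from the text (a named strategy that exists wins, a named strategy that does not exist falls through to the dynamically resolved datacenter, and an omitted tag uses the `default` strategy when one is configured).

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags turns Consul service tags of the form "key=value" into a map.
// Tags without "=" are ignored.
func parseTags(tags []string) map[string]string {
	out := make(map[string]string, len(tags))
	for _, t := range tags {
		if k, v, ok := strings.Cut(t, "="); ok {
			out[k] = v
		}
	}
	return out
}

// resolveStrategy picks a predefined strategy for key ("atc.failover" or
// "atc.redirect"). An empty result means "no predefined strategy applies":
// the caller would use the dynamically resolved target datacenter.
// Assumed precedence: named and defined, then "default" when the tag is
// omitted, otherwise dynamic.
func resolveStrategy(tags map[string]string, key string, defined map[string]bool) string {
	name, tagged := tags[key]
	switch {
	case tagged && defined[name]:
		return name
	case !tagged && defined["default"]:
		return "default"
	default:
		return ""
	}
}

func main() {
	defined := map[string]bool{"standard-failover": true, "default": true}
	tags := parseTags([]string{"atc.enabled=true", "atc.failover=standard-failover"})
	if tags["atc.enabled"] != "true" {
		fmt.Println("service is not managed by ATC")
		return
	}
	fmt.Println("failover strategy:", resolveStrategy(tags, "atc.failover", defined))
}
```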


Core HTTP Endpoints

Application API (Port 8088)
  • / (GET): Serves the embedded React frontend dashboard.
  • /ready (GET): Simple readiness check (200 OK).
  • /services (GET): Returns a formatted ASCII table showing the status of all active components.
  • /api/services (GET): JSON list of active Consul services tagged with atc.enabled=true.
  • /api/leader (GET): JSON representing leadership status and component details (e.g., {"leader":true,"components":{"forwarder":true,"redirector":true}}).
  • /api/federation (GET): JSON list of WAN-federated datacenters and connection statuses.
  • /api/overrides (POST): Manually override automatic failover/redirect routes in Consul.
    • JSON payload:
      {
        "service": "payment-service",
        "type": "failover|redirect",
        "target_dc": "dc2"
      }
      
  • /mcp (GET/POST): Streamable HTTP endpoint for Model Context Protocol interactions. Offers tools:
    • check_readiness
    • check_leadership
    • list_atc_enabled_services
    • list_wan_federation_status
    • purge_redirect_config
    • apply_failover_override
    • trigger_manual_redirect

Metrics API (Port 8089)
  • /metrics (GET): OpenTelemetry metrics in Prometheus format.
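As an illustration of the /api/overrides payload listed above, the following Go sketch builds the POST request from a typed struct. The struct and function names are hypothetical, the base URL assumes a local instance on the default port, and the response body format is not documented here, so the example only prints the HTTP status.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// overrideRequest mirrors the JSON payload documented for /api/overrides.
type overrideRequest struct {
	Service  string `json:"service"`
	Type     string `json:"type"` // "failover" or "redirect"
	TargetDC string `json:"target_dc"`
}

// newOverrideRequest builds the POST request without sending it, so the
// payload can be inspected or tested without a running server.
func newOverrideRequest(baseURL string, o overrideRequest) (*http.Request, error) {
	if o.Type != "failover" && o.Type != "redirect" {
		return nil, fmt.Errorf("type must be failover or redirect, got %q", o.Type)
	}
	body, err := json.Marshal(o)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/overrides", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newOverrideRequest("http://localhost:8088", overrideRequest{
		Service:  "payment-service",
		Type:     "failover",
		TargetDC: "dc2",
	})
	if err != nil {
		fmt.Println("invalid override:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("could not reach ATC:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```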

Testing & Verification

Registering and Deregistering Test Services

You can register and deregister mock services in your local Consul agent using the following make targets:

# Register a test-service with tag atc.enabled=true
make consul-register-test

# Deregister the test-service
make consul-deregister-test

Claude Desktop MCP Integration

To enable the ATC Model Context Protocol (MCP) server in Claude Desktop, you can bridge the Streamable HTTP transport to Claude's stdio interface using the mcp-remote bridge client.

  1. Open the Claude Desktop configuration file:

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
  2. Add the atc configuration block under mcpServers pointing to the remote Streamable HTTP endpoint:

{
  "mcpServers": {
    "atc": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "http://localhost:8088/mcp"
      ]
    }
  }
}
  3. Restart Claude Desktop. The agent will now have access to the following tools:
    • check_readiness
    • check_leadership
    • list_atc_enabled_services
    • list_wan_federation_status
    • purge_redirect_config
    • apply_failover_override
    • trigger_manual_redirect

Documentation & Project Resources

  • Project Documentation & Website: Located in the docs/ directory and automatically published to GitHub Pages. Contains detailed setup, architecture, and module configurations.
  • Architecture Decision Records (ADRs): Detailed records of all design and architecture decisions are stored in ADR.MD.
  • Project Roadmap & TODO List: The MoSCoW analysis and active TODO lists are located in TODO.md.

Documentation

Overview

Package main represents the entrypoint for the ATC (Active Traffic Control) command-line utility.

ATC automates the creation and management of Consul service-resolver configurations to control routing of ingress requests for failing services. By watching service checks and endpoints, ATC automatically publishes failover and redirect rules to Consul.

Usage:

atc server [flags]
atc modules
atc version

Targets:

  • consul: Runs both the Forwarder and Redirector watcher services.
  • forwarder: Watches Consul endpoints to automatically configure request forwarding.
  • redirector: Watches Consul endpoints to automatically configure geo-failover prepared queries.
  • server: Spins up the internal HTTP handler (health, services, API endpoints).
  • all: Resolves to 'consul' (runs all services).
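The target aliases above amount to a small expansion step. The sketch below shows one way to resolve a comma-separated `--target` value; `expandTargets` is a hypothetical helper, and it follows the list literally, so `all` expands to the same components as `consul`. Whether `all` also implies `server` is not stated above and is left out here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// expandTargets resolves a comma-separated --target value into a sorted,
// de-duplicated list of concrete components.
func expandTargets(value string) ([]string, error) {
	set := map[string]bool{}
	for _, t := range strings.Split(value, ",") {
		switch t = strings.TrimSpace(t); t {
		case "all", "consul":
			// Both aliases run the Forwarder and the Redirector.
			set["forwarder"], set["redirector"] = true, true
		case "forwarder", "redirector", "server":
			set[t] = true
		default:
			return nil, fmt.Errorf("unknown target %q", t)
		}
	}
	out := make([]string, 0, len(set))
	for t := range set {
		out = append(out, t)
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	for _, v := range []string{"all", "server,forwarder", "consul,server"} {
		targets, err := expandTargets(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-18s -> %v\n", v, targets)
	}
}
```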

HTTP Endpoints:

ATC server hosts two separate HTTP port listeners:

  • Main Port (default :8088): Serves the React frontend dashboard at `/` (which can be disabled via `server.ui_enabled` in the config file or the `--ui-enabled` command-line flag) and exposes /ready, /services, the JSON API service list (/api/services), manual overrides (/api/overrides), leader status (/api/leader), WAN federation status (/api/federation), and the MCP server interface. When the Web UI is disabled, requests to the static route `/` return a 404 Not Found error with a "Web UI is disabled" message, while the REST APIs and the `/ready`, `/health`, and `/mcp` endpoints remain active.
  • Metrics Port (default :8089): Exposes OpenTelemetry metrics in Prometheus format at `/metrics`.
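The port split and the disabled-UI behavior can be illustrated with two independent `http.ServeMux` values. This is a minimal sketch using only the standard library, not ATC's server code: `newMainMux`, `newMetricsMux`, and `statusFor` are hypothetical names, the handlers return placeholder bodies, and requests are served in memory through `httptest` so the example terminates on its own.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMainMux builds the handler for the main port. When uiEnabled is false,
// the dashboard route answers 404 while /ready keeps working.
func newMainMux(uiEnabled bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if !uiEnabled {
			http.Error(w, "Web UI is disabled", http.StatusNotFound)
			return
		}
		fmt.Fprintln(w, "dashboard placeholder")
	})
	return mux
}

// newMetricsMux builds the handler for the metrics port. It registers only
// /metrics, so application routes are not reachable through it.
func newMetricsMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "# metrics placeholder")
	})
	return mux
}

// statusFor sends an in-memory GET to h and returns the response status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println("main /        (UI off):", statusFor(newMainMux(false), "/"))
	fmt.Println("main /ready   (UI off):", statusFor(newMainMux(false), "/ready"))
	fmt.Println("metrics /metrics      :", statusFor(newMetricsMux(), "/metrics"))
	fmt.Println("metrics /ready        :", statusFor(newMetricsMux(), "/ready"))
}
```

A long-running service would instead hand each mux to its own `http.Server`, one listening on :8088 and one on :8089.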

API & MCP Integration:

ATC server hosts a Model Context Protocol (MCP) server over Streamable HTTP transport at the `/mcp` route for seamless integration with AI models and agents. Exposed tools include check_readiness, check_leadership, list_atc_enabled_services, list_wan_federation_status, purge_redirect_config, apply_failover_override, and trigger_manual_redirect.

Deployment:

ATC can be deployed using the production-ready Helm chart located under deploy/helm/atc, or the Nomad job specification located under deploy/nomad/atc.nomad.hcl.

Predefined Strategies:

ATC supports predefined failover and redirect strategies defined by admins in a YAML config file. Teams can assign these strategies to their Consul services using tags (e.g., `atc.failover=strategy-name` and `atc.redirect=strategy-name`). ATC's forwarder and redirector apply these strategies dynamically and persist them in the service-resolver configuration entry metadata.

If a strategy named "default" is configured (e.g. `strategies.failover.default` or `strategies.redirect.default`), teams can register their services using only the `atc.enabled=true` tag. ATC will automatically fall back to the "default" strategy configuration if the specific `atc.failover` or `atc.redirect` tag is omitted.

Oscillation Dampening (Hysteresis):

ATC protects Consul from excessive write operations by debouncing rapid health check changes. It supports a global default dampening period (e.g., `5s`), a tag-based override (`atc.dampening=duration` such as `atc.dampening=0s` for immediate mode), and an operator safety boundary (`min_dampening_period`) to prevent users from bypassing stability safeguards.
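The interaction between the global default, the tag override, and the operator minimum can be sketched as a single function. This is an illustration under stated assumptions, not ATC's implementation: `effectiveDampening` and `mustDuration` are hypothetical names, an unparsable or negative tag value is assumed to fall back to the global default, and the minimum is assumed to apply to the global default as well as to overrides.

```go
package main

import (
	"fmt"
	"time"
)

// mustDuration parses a Go duration string and panics on invalid input.
// It is only a convenience for the fixed values used in this example.
func mustDuration(s string) time.Duration {
	d, err := time.ParseDuration(s)
	if err != nil {
		panic(err)
	}
	return d
}

// effectiveDampening returns the dampening period for one service.
// tagValue is the value of the atc.dampening tag, or "" when the tag is absent.
func effectiveDampening(global, floor time.Duration, tagValue string) time.Duration {
	period := global
	if tagValue != "" {
		// Assumption: an invalid or negative override is ignored.
		if d, err := time.ParseDuration(tagValue); err == nil && d >= 0 {
			period = d
		}
	}
	// The operator boundary wins over anything shorter.
	if period < floor {
		period = floor
	}
	return period
}

func main() {
	global := mustDuration("5s")
	floor := mustDuration("2s")
	for _, tag := range []string{"", "10s", "0s", "not-a-duration"} {
		fmt.Printf("atc.dampening=%-16q -> %s\n", tag, effectiveDampening(global, floor, tag))
	}
}
```

With a minimum of `0s`, as in the example configuration, `atc.dampening=0s` passes through unchanged and gives the immediate mode described above.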

Active-Passive High Availability:

ATC can run in active-passive HA mode coordinated via Consul KV session locks. Instead of a single global lock, ATC uses target-scoped leader locking for each active reconciler workload (e.g. `atc/leader/lock/forwarder` and `atc/leader/lock/redirector`). This prevents split-workload deadlocks and allows partitioned instances running subsets of modules to fail over and run workloads independently. Standby instances keep their HTTP/metrics servers active but suspend reconciler watches. Failover is automatic when the active session lock expires.
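Two small pieces of this scheme lend themselves to a sketch: deriving the per-target lock keys from the configured `lock_key`, and reading the per-component leadership flags that `/api/leader` returns. The helper names below are hypothetical, and the JSON shape is taken from the `/api/leader` example in this README; acquiring the actual Consul session lock is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
)

// lockKeyFor derives the target-scoped KV key for one reconciler,
// e.g. "atc/leader/lock" + "forwarder" -> "atc/leader/lock/forwarder".
func lockKeyFor(base, target string) string {
	return path.Join(base, target)
}

// leaderStatus mirrors the JSON shape of the /api/leader example.
type leaderStatus struct {
	Leader     bool            `json:"leader"`
	Components map[string]bool `json:"components"`
}

// parseLeaderStatus decodes a /api/leader response body.
func parseLeaderStatus(data []byte) (leaderStatus, error) {
	var s leaderStatus
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	for _, target := range []string{"forwarder", "redirector"} {
		fmt.Println("lock key:", lockKeyFor("atc/leader/lock", target))
	}

	body := []byte(`{"leader":true,"components":{"forwarder":true,"redirector":true}}`)
	status, err := parseLeaderStatus(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, target := range []string{"forwarder", "redirector"} {
		fmt.Printf("%s holds lock: %t\n", target, status.Components[target])
	}
}
```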

Documentation & Project Resources:

  • docs/: Project documentation website directory, automatically published to GitHub Pages.
  • ADR.MD: Architecture Decision Records (ADRs) detailing core design and operational decisions.
  • TODO.md: Active roadmap and MoSCoW priorities list.

Architectural Design Rule: For every HTTP API endpoint exposed by the ATC server, a corresponding MCP tool MUST be registered.

For more options, run:

atc server --help

Directories

Path Synopsis
pkg
atc