cli

Version: v0.0.1-rc1
Published: Apr 15, 2026 License: Apache-2.0

Ruptor

Reliability testing for AI agents — Chaos Engineering meets LLM systems.

Ruptor shows you how your AI agent behaves when things go wrong (tool timeouts, invalid JSON, rate limits, empty responses) before your users find out.


Modules

| Module | What it does |
| --- | --- |
| ruptor run | Injects failures into tool calls and observes agent behavior |
| ruptor simulate | Simulates real users to evaluate goal completion and conversation quality |

Installation

Homebrew
brew install ruptor-dev/tap/ruptor
curl
curl -fsSL https://ruptor.dev/install.sh | sh
go install (requires Go 1.22+)
go install github.com/ruptor-dev/cli/cmd/ruptor@latest

macOS — First Run

On macOS, Gatekeeper may block the binary on first run with "Apple could not verify ruptor is free of malware."

This is expected for unsigned OSS binaries. Remove the quarantine attribute and run normally:

xattr -d com.apple.quarantine "$(which ruptor)"
ruptor --version

This is a one-time step; later runs are not blocked.


Quickstart — Chaos Testing

1. Point your agent's tools at Ruptor:

export TOOL_BASE_URL=http://localhost:8080

2. Create a chaos.yaml:

agent:
  name: my_agent
  entrypoint: python agent.py
  env:
    TOOL_BASE_URL: http://localhost:8080

proxy:
  port: 8080
  passthrough_url: https://my-real-tool-api.com

tests:
  - id: timeout_on_search
    tool: /search
    fault: tool_timeout
    delay_ms: 30000
    probability: 1.0

evaluation:
  max_iterations: 20
  timeout_s: 60
  llm_judge: true

output:
  format: both
  path: ./reports/

3. Run:

ruptor run chaos.yaml
ruptor run chaos.yaml --output report.html
ruptor run chaos.yaml --test timeout_on_search
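For step 1 to have any effect, the agent has to build its tool URLs from TOOL_BASE_URL rather than hard-coding the real API host. The following is a minimal sketch of what that might look like inside agent.py, assuming the agent calls its tools over plain HTTP GET. The helper names tool_url and call_tool and the 10-second default timeout are illustrative assumptions, not part of Ruptor.

```python
import os
import urllib.request
from typing import Optional

# Assumption: fall back to the quickstart's proxy address when the variable is unset.
TOOL_BASE_URL = os.environ.get("TOOL_BASE_URL", "http://localhost:8080")


def tool_url(path: str, base_url: str = TOOL_BASE_URL) -> str:
    """Join the base URL and a tool path without doubling or dropping slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def call_tool(path: str, timeout_s: float = 10.0) -> Optional[str]:
    """Call a tool endpoint (hypothetical helper).

    Returns the response body, or None when the request fails or exceeds
    the agent's own timeout.
    """
    try:
        with urllib.request.urlopen(tool_url(path), timeout=timeout_s) as resp:
            return resp.read().decode("utf-8")
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return None
```

With the timeout_on_search test above (delay_ms: 30000), an agent written this way would give up on /search after 10 seconds instead of waiting out the full delay, which is the kind of behavior the tool_timeout fault is meant to surface.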

Quickstart — Simulate

1. Create a simulate.yaml:

agent:
  name: support_agent
  base_url: http://localhost:3000

simulations:
  - id: frustrated_user
    persona: "Frustrated user who wants to resolve their issue in under 3 messages"
    goal: "Cancel subscription"
    max_turns: 10
    success_criteria: "Agent completed the cancellation"

evaluation:
  goal_completion: true
  tone_quality: true

output:
  format: both
  path: ./reports/

2. Run:

ruptor simulate simulate.yaml
ruptor simulate simulate.yaml --sim frustrated_user
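To make max_turns and success_criteria more concrete, here is a conceptual sketch of a turn-limited conversation loop. It is not Ruptor's implementation: the function run_simulation and its three callbacks are hypothetical stand-ins for the simulated user, the agent under test, and the success check.

```python
from typing import Callable, List, Tuple

Transcript = List[Tuple[str, str]]  # (speaker, message) pairs


def run_simulation(
    user_turn: Callable[[Transcript], str],
    agent_turn: Callable[[Transcript], str],
    goal_met: Callable[[Transcript], bool],
    max_turns: int,
) -> Tuple[bool, Transcript]:
    """Alternate user and agent messages until the goal is met or turns run out."""
    transcript: Transcript = []
    for _ in range(max_turns):
        transcript.append(("user", user_turn(transcript)))
        transcript.append(("agent", agent_turn(transcript)))
        if goal_met(transcript):
            return True, transcript
    # The turn budget was exhausted without meeting the success criteria.
    return False, transcript
```

In this framing, a persona such as the frustrated user shapes what user_turn says, and the goal counts as completed only if goal_met returns True within max_turns exchanges.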

Available Faults

| Fault | Description |
| --- | --- |
| tool_timeout | No response: does the agent have its own timeout? |
| slow_response | Responds after N ms: does the agent wait or give up? |
| tool_error | Returns 5xx: does the agent retry or fail gracefully? |
| invalid_json | Returns malformed JSON: does the agent handle the parse error? |
| empty_response | Returns 200 OK with an empty body: common and rarely tested |
| rate_limit | Returns 429: does the agent back off or keep hammering the tool? |
| llm_error | LLM backend returns 503: does the agent have a fallback? |
| llm_timeout | LLM call hangs indefinitely: does the agent have a deadline? |
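Each fault above corresponds to a decision the agent has to make. As a rough illustration, the sketch below maps a tool response to a coarse recovery action and computes an exponential backoff schedule. The function names, action labels, and default delays are assumptions chosen for this example; they are not defined by Ruptor.

```python
import json
from typing import List, Optional


def classify_tool_response(status: Optional[int], body: str) -> str:
    """Map a tool response to a coarse recovery action.

    status is None when the call timed out or the connection failed.
    """
    if status is None:
        return "give_up_or_fallback"  # tool_timeout, llm_timeout
    if status == 429:
        return "retry_with_backoff"   # rate_limit
    if 500 <= status <= 599:
        return "retry_then_fail"      # tool_error, llm_error
    if not body.strip():
        return "treat_as_no_data"     # empty_response
    try:
        json.loads(body)
    except json.JSONDecodeError:
        return "reject_payload"       # invalid_json
    return "ok"


def backoff_delays(attempts: int, base_s: float = 0.5, cap_s: float = 8.0) -> List[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at cap_s."""
    return [min(cap_s, base_s * (2 ** i)) for i in range(attempts)]
```

An agent with no branch for one of these cases (for example, one that passes an empty 200 body straight to the model) is the kind of gap the corresponding fault is designed to expose.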

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| TOOL_BASE_URL | Yes | Points your agent's tool calls at the ruptor proxy |
| OPENAI_API_KEY | Only for LLM judge | API key for the LLM judge evaluator |
| RUPTOR_TOKEN | Only for --cloud | Auth token for cloud reporting (coming soon) |
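The requirements above can be checked before a run. The following is a small sketch of such a preflight check, written independently of whatever ruptor doctor inspects. The function missing_env and its llm_judge and cloud flags are hypothetical names that mirror the "Required" column.

```python
from typing import List, Mapping


def missing_env(env: Mapping[str, str], llm_judge: bool = False, cloud: bool = False) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    required = ["TOOL_BASE_URL"]
    if llm_judge:
        required.append("OPENAI_API_KEY")  # needed only when the LLM judge is enabled
    if cloud:
        required.append("RUPTOR_TOKEN")    # needed only with --cloud
    return [name for name in required if not env.get(name)]
```

Calling missing_env(os.environ, llm_judge=True) in a wrapper script and exiting when the list is non-empty would catch a missing API key before the run starts rather than at evaluation time.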

Development

make build                  # compile
make test                   # run tests with race detector
make lint                   # run golangci-lint
make check                  # build + test + vet
make run-example            # run chaos example
make run-simulate-example   # run simulate example
make tools                  # install dev tools
make help                   # list all targets

Roadmap

  • 8 fault types (tool_timeout, slow_response, tool_error, invalid_json, empty_response, rate_limit, llm_error, llm_timeout)
  • ruptor run — chaos proxy with Robustness Score + HTML report
  • ruptor simulate — user simulation with goal completion scoring
  • ruptor auth — OAuth device flow (cloud, coming soon)
  • ruptor doctor — environment diagnostics
  • ruptor update — self-update
  • ruptor sync — sync run results to cloud (coming soon)
  • Cloud dashboard — run history, team reports, CI/CD integration
  • MCP proxy support
  • cascade_failure, partial_degradation enterprise scenarios

Directories

| Path | Synopsis |
| --- | --- |
| cmd/ruptor | command |
| internal/cloud | Cloud client for run-report ingestion. |
| internal/doctor | Package doctor runs preflight diagnostics against a user's local install and prints an actionable report. |
| internal/evaluator/llmjudge/prompts | Package prompts ships the LLM-judge system prompts as embedded text files. |
| internal/llmclient | Package llmclient provides HTTP clients for language-model APIs used by both the LLM judge (evaluator) and the user simulator. |
| internal/telemetry | Package telemetry wires ruptor's opt-in OpenTelemetry tracing. |
| internal/ui | Package ui, runprogress.go: the live Bubbletea program shown during `ruptor run`. |
| internal/updater | Package updater queries the GitHub releases feed for the latest published ruptor tag and reports whether the running binary is behind it. |
| pkg | |
