chaoskit

Version: v0.9.0 (latest) | Published: Nov 23, 2025 | License: MIT | Imports: 14 | Imported by: 0

Repository

github.com/rom8726/chaoskit

README ¶

ChaosKit

ChaosKit is a Go framework for code-level chaos engineering and fault injection. It enables controlled failures inside your functions — delays, panics, errors, resource pressure — and provides validators to ensure your application behaves correctly under adverse conditions.

Most chaos tools operate at the infrastructure level (containers, nodes, networks). ChaosKit focuses on what they cannot test: business logic, compensations, concurrency behavior, state invariants, and internal error-handling paths.

Features

Code-level fault injection (panic, delay, error, resource faults)
Context-based activation (no-op unless chaos context is attached)
Scenario DSL (define steps, injectors, validators, scopes)
Severity-aware validators feeding a verdict engine
Built-in reporters & success thresholds for PASS/UNSTABLE/FAIL outcomes
Metrics collector & Prometheus exporter for observability
Go testing integration helpers with configurable thresholds
Integration with ToxiProxy for network chaos
Optional monkey-patching for advanced cases
Fully opt-in via build tags (-tags=chaos)

ChaosKit is safe to include in your codebase: fault injection activates only when the binary is built with the chaos tag and a chaos context is attached, so it cannot fire by accident.

Installation

go get github.com/rom8726/chaoskit

Quick Example

scenario := chaoskit.NewScenario("workflow-stability").
    WithTarget(engine).
    Step("run workflow", ExecuteWorkflow).
    Inject("delays", injectors.RandomDelay(10*time.Millisecond, 100*time.Millisecond)).
    Inject("panic", injectors.PanicProbability(0.01)).
    Assert("goroutines", validators.GoroutineLimit(200)).
    Assert("no-deadlock", validators.NoInfiniteLoop(5*time.Second)).
    Repeat(100).
    Build()

if err := chaoskit.Run(context.Background(), scenario); err != nil {
    log.Fatalf("Chaos scenario failed: %v", err)
}

Build Tag Isolation

ChaosKit’s fault injectors are compiled only when explicitly enabled:

go build -tags=chaos .

Without the chaos tag:

all injectors become no-ops,
chaos code is excluded from the binary,
scenarios run without injecting any faults.

This ensures ChaosKit never affects production binaries unless intentionally enabled.

Context-Based Activation

Even with the build tag, chaos runs only when a special context is attached:

ctx := chaoskit.AttachChaos(context.Background(), chaoskit.NewChaosContext())

Without this context, calls like:

chaoskit.MaybePanic(ctx)
chaoskit.MaybeDelay(ctx)
if err := chaoskit.MaybeError(ctx); err != nil {
    return err
}
child, cancel := chaoskit.MaybeCancelContext(ctx)
defer cancel()

if chaoskit.ApplyChaos(child, "force-retry") {
    // Provider-specific logic
}

and event hooks such as

chaoskit.RecordRecursionDepth(ctx, depth)
chaoskit.RecordError(ctx)

are strict no-ops.
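
The no-op behavior can be pictured as a context lookup that falls through when nothing is attached. The following is a minimal, self-contained sketch of that pattern, not ChaosKit's actual implementation; `chaosState`, `attach`, `maybeError`, and `demo` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// chaosKey is an unexported key type, so no other package can collide with it.
type chaosKey struct{}

// chaosState is a hypothetical stand-in for per-context chaos configuration.
type chaosState struct {
	errFunc func() error // decides whether an error is injected
}

var errInjected = errors.New("injected failure")

// attach returns a child context that carries the chaos state.
func attach(ctx context.Context, s *chaosState) context.Context {
	return context.WithValue(ctx, chaosKey{}, s)
}

// maybeError injects an error only when chaos state is attached to ctx.
// With a plain context it returns nil and has no other effect.
func maybeError(ctx context.Context) error {
	s, ok := ctx.Value(chaosKey{}).(*chaosState)
	if !ok || s == nil || s.errFunc == nil {
		return nil
	}
	return s.errFunc()
}

// demo calls maybeError once without and once with attached chaos state.
func demo() (plainErr, chaosErr error) {
	plain := context.Background()
	chaotic := attach(plain, &chaosState{errFunc: func() error { return errInjected }})
	return maybeError(plain), maybeError(chaotic)
}

func main() {
	plainErr, chaosErr := demo()
	fmt.Println(plainErr) // <nil>
	fmt.Println(chaosErr) // injected failure
}
```

Because the check is a single context lookup, instrumented call sites cost little when no chaos context is present.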

Scenario DSL

Define multi-step, multi-injector stress or chaos tests:

chaoskit.NewScenario("example").
    WithTarget(client).
    Step("call API", callAPI).
    Inject("latency", injectors.RandomDelay(5*time.Millisecond, 50*time.Millisecond)).
    Inject("errors", injectors.ErrorWithProbability(io.ErrUnexpectedEOF, 0.02)).
    Assert("goroutines", validators.GoroutineLimit(100)).
    Repeat(50).
    Build()

ChaosKit supports:

fixed-number runs (Repeat(n))
duration-based runs (RunFor(time.Hour))

Injectors

PanicProbability(p)
RandomDelay(min, max)
ErrorWithProbability(err, p)
CompositeInjector(...)
network chaos via ToxiProxy
optional monkey-patching for advanced scenarios

Validators

GoroutineLimit(n)
RecursionDepthLimit(n)
NoInfiniteLoop(timeout)
NoSlowIteration(timeout)
MemoryUnderLimit(bytes)
MaxErrors(limit)
custom validators via:

type Validator interface {
    Name() string
    Validate(ctx context.Context, target Target) error
    Severity() chaoskit.ValidationSeverity
}
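
As an illustration of the interface above, here is a minimal sketch of a custom validator that checks a domain invariant. To stay self-contained it declares local stand-ins for `Target` and `ValidationSeverity`; the severity constant names, the `ledger` type, and the `check` helper are assumptions made for this example and are not taken from the package.

```go
package main

import (
	"context"
	"fmt"
)

// Local stand-ins so the sketch compiles on its own. In real code these
// would be chaoskit.Target and chaoskit.ValidationSeverity; the constant
// names below are assumptions, not the package's identifiers.
type Target any

type ValidationSeverity int

const (
	SeverityWarning ValidationSeverity = iota
	SeverityCritical
)

// Validator mirrors the interface shown above.
type Validator interface {
	Name() string
	Validate(ctx context.Context, target Target) error
	Severity() ValidationSeverity
}

// ledger is a hypothetical system under test.
type ledger struct{ balance int }

// nonNegativeBalance fails when the ledger balance drops below zero.
type nonNegativeBalance struct{}

func (nonNegativeBalance) Name() string                 { return "non-negative-balance" }
func (nonNegativeBalance) Severity() ValidationSeverity { return SeverityCritical }

func (nonNegativeBalance) Validate(_ context.Context, target Target) error {
	l, ok := target.(*ledger)
	if !ok {
		return fmt.Errorf("unexpected target type %T", target)
	}
	if l.balance < 0 {
		return fmt.Errorf("balance is negative: %d", l.balance)
	}
	return nil
}

// check runs a validator against a ledger holding the given balance.
func check(v Validator, balance int) error {
	return v.Validate(context.Background(), &ledger{balance: balance})
}

func main() {
	var v Validator = nonNegativeBalance{}
	fmt.Println(v.Name(), check(v, 10)) // non-negative-balance <nil>
	fmt.Println(v.Name(), check(v, -5)) // non-negative-balance balance is negative: -5
}
```

Returning an error rather than panicking lets the caller decide, based on `Severity()`, how much weight a failed invariant carries.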

Verdicts & Reports

Reporter().GetVerdict(thresholds) evaluates executions against SuccessThresholds.
Verdicts are PASS, UNSTABLE, or FAIL with matching exit codes.
Reports include categorized failures, top error patterns, and JSON/text outputs.
Threshold helpers: DefaultThresholds, StrictThresholds, RelaxedThresholds.

Metrics & Exporters

MetricsCollector tracks executions, success rates, and injector metrics.
exporters.PrometheusExporter exposes /metrics compatible with Prometheus.
HTTP helper: prom.Handler() to plug into net/http.
Collect metrics after each run via executor.Reporter().Results() and executor.Metrics().Stats().

Go Testing Integration

chaoskit/testing.RunChaos integrates scenarios with testing.T.
Options: WithRepeat, WithFailurePolicy, WithThresholds, WithoutReport, WithReportToStderr, etc.
RunChaosSimple accepts plain slices of steps, injectors, and validators for lightweight tests.

When to Use ChaosKit

Best use cases

workflow engines (Saga, orchestration)
systems with compensations or rollback logic
retry-heavy algorithms
concurrency-sensitive components
correctness-critical state machines
CI stress testing

Less suitable for

black-box testing without code access
infrastructure-level chaos (use Chaos Mesh / Litmus / Gremlin instead)

Network Chaos (ToxiProxy)

ChaosKit integrates with ToxiProxy without modifying your code:

latency
bandwidth limits
timeouts
connection cuts
packet shaping

Useful for testing clients, message brokers, databases, etc.

Why Code-Level Chaos Matters

Infrastructure chaos exposes the resilience of clusters. ChaosKit exposes the resilience of your logic.

Examples of failures ChaosKit can detect:

rollback recursion loops
leaked goroutines
unbounded retries
inconsistent state after error paths
panics during compensations
subtle timing bugs

This is the category of failures that infrastructure-level tools cannot simulate.

License (MIT)

ChaosKit is released under the MIT License.

This software is provided “as is”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability arising from the use of this software.

See LICENSE for details.

Status

ChaosKit is early-stage but stable enough for research, prototyping, and internal testing workflows. Contributions, issue reports, and design discussions are welcome.

Documentation ¶

Overview ¶

Package chaoskit provides a modular framework for chaos engineering.

ChaosKit enables systematic testing of system reliability through controlled fault injection and invariant validation.

Basic Usage ¶

scenario := chaoskit.NewScenario("test").
	WithTarget(mySystem).
	Inject("delay", injectors.RandomDelay(5*time.Millisecond, 25*time.Millisecond)).
	Assert("goroutines", validators.GoroutineLimit(100)).
	Build()

executor := chaoskit.NewExecutor()
if err := executor.Run(ctx, scenario); err != nil {
	log.Fatal(err)
}

Architecture ¶

ChaosKit follows clean architecture principles with clear separation between:

Scenarios: Define what to test
Injectors: Introduce faults into the system
Validators: Verify system invariants
Executor: Orchestrates scenario execution

Extension Points ¶

Implement the Injector or Validator interfaces to create custom chaos behaviors.

Index ¶

Constants
func ApplyChaos(context.Context, string) bool
func AttachChaos(ctx context.Context, _ *ChaosContext) context.Context
func AttachLogger(ctx context.Context, logger *slog.Logger) context.Context
func AttachRand(ctx context.Context, rng *rand.Rand) context.Context
func AttachRecorder(ctx context.Context, r EventRecorder) context.Context
func GetLogger(ctx context.Context) *slog.Logger
func GetRand(ctx context.Context) *rand.Rand
func MaybeCancelContext(ctx context.Context) (context.Context, context.CancelFunc)
func MaybeDelay(ctx context.Context)
func MaybeError(context.Context) error
func MaybeNetworkChaos(context.Context, string, int)
func MaybePanic(context.Context)
func RecordError(ctx context.Context)
func RecordPanic(ctx context.Context)
func RecordRecursionDepth(ctx context.Context, depth int)
func Run(ctx context.Context, scenario *Scenario) error
func RunWithLogger(ctx context.Context, scenario *Scenario, logger Logger) error
func RunWithSlogLogger(ctx context.Context, scenario *Scenario, logger *slog.Logger) error
type CategorizedInjector
type ChaosContext
- func GetChaos(context.Context) *ChaosContext
- func NewChaosContext() *ChaosContext
- func (*ChaosContext) GetProvider(string) (ChaosProvider, bool)
- func (*ChaosContext) RegisterProvider(ChaosProvider)
- func (*ChaosContext) SetCancellationFunc(func(context.Context) (context.Context, context.CancelFunc))
- func (*ChaosContext) SetDelayFunc(func() bool)
- func (*ChaosContext) SetErrorFunc(func() error)
- func (*ChaosContext) SetNetworkFunc(func(host string, port int) bool)
- func (*ChaosContext) SetPanicFunc(func() bool)
type ChaosContextCancellationProvider
type ChaosDelayProvider
type ChaosErrorProvider
type ChaosNetworkProvider
type ChaosPanicProvider
type ChaosProvider
type ErrorRecorder
type ErrorSummary
type EventRecorder
type ExecutionResult
type Executor
- func NewExecutor(opts ...ExecutorOption) *Executor
- func (e *Executor) Metrics() *MetricsCollector
- func (e *Executor) Reporter() *Reporter
- func (e *Executor) Run(ctx context.Context, scenario *Scenario) error
type ExecutorOption
- func WithFailurePolicy(policy FailurePolicy) ExecutorOption
- func WithJSONLogging() ExecutorOption
- func WithLogger(logger Logger) ExecutorOption
- func WithMetrics(metrics *MetricsCollector) ExecutorOption
- func WithReporter(reporter *Reporter) ExecutorOption
- func WithSlogLogger(logger *slog.Logger) ExecutorOption
type FailureAnalysis
type FailurePolicy
type GlobalInjector
type Injector
type InjectorType
type JUnitError
type JUnitFailure
type JUnitTestCase
type JUnitTestSuite
type Logger
- func NewDefaultLogger() Logger
type MetricsCollector
- func NewMetricsCollector() *MetricsCollector
- func (m *MetricsCollector) GetInjectorMetrics(injectorName string) (map[string]interface{}, bool)
- func (m *MetricsCollector) RecordExecution(result ExecutionResult)
- func (m *MetricsCollector) RecordInjectorMetrics(injectorName string, metrics map[string]interface{})
- func (m *MetricsCollector) Stats() map[string]any
type MetricsProvider
type NetworkInjectorLifecycle
type PanicRecorder
type RecursionRecorder
type Report
type Reporter
- func NewReporter() *Reporter
- func (r *Reporter) AddResult(result ExecutionResult)
- func (r *Reporter) GenerateJSON() (string, error)
- func (r *Reporter) GenerateJUnitXML(report *Report) (string, error)
- func (r *Reporter) GenerateReport() string
- func (r *Reporter) GenerateTextReport(report *Report) string
- func (r *Reporter) GetVerdict(thresholds *SuccessThresholds) (*Report, error)
- func (r *Reporter) Results() []ExecutionResult
- func (r *Reporter) SaveJSON(path string) error
- func (r *Reporter) SaveJUnitXML(report *Report, path string) error
type Resettable
type Scenario
type ScenarioBuilder
- func NewScenario(name string) *ScenarioBuilder
- func (b *ScenarioBuilder) Assert(name string, validator Validator) *ScenarioBuilder
- func (b *ScenarioBuilder) Build() *Scenario
- func (b *ScenarioBuilder) Inject(name string, injector Injector) *ScenarioBuilder
- func (b *ScenarioBuilder) Repeat(n int) *ScenarioBuilder
- func (b *ScenarioBuilder) RunFor(duration time.Duration) *ScenarioBuilder
- func (b *ScenarioBuilder) Scope(name string, fn func(*ScopeBuilder)) *ScenarioBuilder
- func (b *ScenarioBuilder) Step(name string, fn func(context.Context, Target) error) *ScenarioBuilder
- func (b *ScenarioBuilder) WithSeed(seed int64) *ScenarioBuilder
- func (b *ScenarioBuilder) WithTarget(target Target) *ScenarioBuilder
type Scope
type ScopeBuilder
- func (sb *ScopeBuilder) Inject(name string, injector Injector) *ScopeBuilder
type Step
type StepInjector
type StepWrapper
type SuccessThresholds
- func DefaultThresholds() *SuccessThresholds
- func RelaxedThresholds() *SuccessThresholds
- func StrictThresholds() *SuccessThresholds
- func (t *SuccessThresholds) Validate() error
type Target
type TimeWindow
type ValidationFailure
type ValidationSeverity
- func (s ValidationSeverity) String() string
type Validator
type Verdict
- func (v Verdict) ExitCode() int
- func (v Verdict) String() string

Constants ¶

const (
	ValidatorGoroutineLimit      = "goroutine-limit"
	ValidatorRecursionDepth      = "recursion-depth"
	ValidatorSlowIteration       = "slow-iteration"
	ValidatorMemoryLimit         = "memory-limit"
	ValidatorPanicRecovery       = "panic-recovery"
	ValidatorExecutionTime       = "execution-time"
	ValidatorRecursionDepthLimit = "recursion-depth-limit"
	ValidatorMemoryUnder         = "memory-under"
	ValidatorPanics              = "panics"
	ValidatorInfiniteLoop        = "infinite-loop"
	ValidatorMaxErrors           = "max-errors"
)

Validator identifiers

const (
	ErrorTypeGoroutineLeak = "goroutine-leak"
	ErrorTypePanic         = "panic"
	ErrorTypeRecursion     = "recursion"
	ErrorTypeTimeout       = "timeout"
	ErrorTypeMemory        = "memory"
	ErrorTypeOther         = "other"
	ErrorTypeUnknown       = "unknown"
)

Error type identifiers

Variables ¶

This section is empty.

Functions ¶

func ApplyChaos ¶

func ApplyChaos(context.Context, string) bool

func AttachChaos ¶

func AttachChaos(ctx context.Context, _ *ChaosContext) context.Context

func AttachLogger ¶

func AttachLogger(ctx context.Context, logger *slog.Logger) context.Context

AttachLogger attaches a logger to context.

func AttachRand ¶

func AttachRand(ctx context.Context, rng *rand.Rand) context.Context

AttachRand attaches a deterministic random number generator to context.

func AttachRecorder ¶

func AttachRecorder(ctx context.Context, r EventRecorder) context.Context

AttachRecorder attaches an EventRecorder to context.

func GetLogger ¶

func GetLogger(ctx context.Context) *slog.Logger

GetLogger retrieves logger from context, or returns slog.Default() if not found.

func GetRand ¶

func GetRand(ctx context.Context) *rand.Rand

GetRand retrieves the random number generator from context, or creates a new one if none is found. If a seed was set in the scenario, the generator is deterministic.
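
The effect of a fixed seed can be sketched with the standard library alone. The example below assumes the generator is a `math/rand` `*rand.Rand` carried in the context; `attachRand`, `getRand`, and `draw` are hypothetical helpers that only mirror the described behavior and are not the package's code.

```go
package main

import (
	"context"
	"fmt"
	"math/rand"
	"time"
)

// randKey is an unexported context key for the generator.
type randKey struct{}

// attachRand stores a generator in the context.
func attachRand(ctx context.Context, rng *rand.Rand) context.Context {
	return context.WithValue(ctx, randKey{}, rng)
}

// getRand returns the attached generator, or a time-seeded one if absent.
func getRand(ctx context.Context) *rand.Rand {
	if rng, ok := ctx.Value(randKey{}).(*rand.Rand); ok && rng != nil {
		return rng
	}
	return rand.New(rand.NewSource(time.Now().UnixNano()))
}

// draw returns n pseudo-random ints in [0, 100) from a generator
// created with the given seed and passed through a context.
func draw(seed int64, n int) []int {
	ctx := attachRand(context.Background(), rand.New(rand.NewSource(seed)))
	rng := getRand(ctx)
	out := make([]int, n)
	for i := range out {
		out[i] = rng.Intn(100)
	}
	return out
}

func main() {
	a, b := draw(42, 5), draw(42, 5)
	fmt.Println(a)
	fmt.Println(b) // same seed, so the same sequence as the line above
}
```

Note that a `*rand.Rand` is not safe for concurrent use, so a sketch like this would need a lock or one generator per goroutine if steps run in parallel.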

func MaybeCancelContext ¶

func MaybeCancelContext(ctx context.Context) (context.Context, context.CancelFunc)

func MaybeDelay ¶

func MaybeDelay(ctx context.Context)

func MaybeError ¶

func MaybeError(context.Context) error

func MaybeNetworkChaos ¶

func MaybeNetworkChaos(context.Context, string, int)

func MaybePanic ¶

func MaybePanic(context.Context)

func RecordError ¶ added in v0.5.1

func RecordError(ctx context.Context)

RecordError records an error via context-attached recorder (no-op if absent).

func RecordPanic ¶

func RecordPanic(ctx context.Context)

RecordPanic records a panic via context-attached recorder (no-op if absent).

func RecordRecursionDepth ¶

func RecordRecursionDepth(ctx context.Context, depth int)

RecordRecursionDepth records recursion depth via context-attached recorder (no-op if absent).

func Run ¶

func Run(ctx context.Context, scenario *Scenario) error

Run executes a scenario with default settings. This is a convenience function that creates an executor, runs the scenario, and prints the report.

Example:

scenario := chaoskit.NewScenario("test").WithTarget(mySystem).Build()
if err := chaoskit.Run(ctx, scenario); err != nil {
	log.Fatal(err)
}

func RunWithLogger ¶

func RunWithLogger(ctx context.Context, scenario *Scenario, logger Logger) error

RunWithLogger executes a scenario with a custom logger (deprecated, use RunWithSlogLogger)

func RunWithSlogLogger ¶

func RunWithSlogLogger(ctx context.Context, scenario *Scenario, logger *slog.Logger) error

RunWithSlogLogger executes a scenario with a structured logger. Use this function when you want to use structured logging with slog.

Example:

logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
scenario := chaoskit.NewScenario("test").WithTarget(mySystem).Build()
if err := chaoskit.RunWithSlogLogger(ctx, scenario, logger); err != nil {
	log.Fatal(err)
}

Types ¶

type CategorizedInjector ¶

type CategorizedInjector interface {
	Injector
	Type() InjectorType
}

CategorizedInjector provides information about injector type

type ChaosContext ¶

type ChaosContext struct {
}

func GetChaos ¶

func GetChaos(context.Context) *ChaosContext

func NewChaosContext ¶ added in v0.8.0

func NewChaosContext() *ChaosContext

func (*ChaosContext) GetProvider ¶

func (*ChaosContext) GetProvider(string) (ChaosProvider, bool)

func (*ChaosContext) RegisterProvider ¶

func (*ChaosContext) RegisterProvider(ChaosProvider)

func (*ChaosContext) SetCancellationFunc ¶ added in v0.8.0

func (*ChaosContext) SetCancellationFunc(func(context.Context) (context.Context, context.CancelFunc))

func (*ChaosContext) SetDelayFunc ¶ added in v0.8.0

func (*ChaosContext) SetDelayFunc(func() bool)

func (*ChaosContext) SetErrorFunc ¶ added in v0.8.0

func (*ChaosContext) SetErrorFunc(func() error)

func (*ChaosContext) SetNetworkFunc ¶ added in v0.8.0

func (*ChaosContext) SetNetworkFunc(func(host string, port int) bool)

func (*ChaosContext) SetPanicFunc ¶ added in v0.8.0

func (*ChaosContext) SetPanicFunc(func() bool)

type ChaosContextCancellationProvider ¶

type ChaosContextCancellationProvider interface {
	Injector
	GetChaosContext(parent context.Context) (context.Context, context.CancelFunc)
	GetCancellationProbability() float64
}

ChaosContextCancellationProvider provides context cancellation capability

type ChaosDelayProvider ¶

type ChaosDelayProvider interface {
	Injector
	GetChaosDelay(ctx context.Context) (time.Duration, bool)
}

ChaosDelayProvider provides delay injection capability

type ChaosErrorProvider ¶

type ChaosErrorProvider interface {
	Injector
	ShouldReturnError() error
}

ChaosErrorProvider provides error injection capability

type ChaosNetworkProvider ¶

type ChaosNetworkProvider interface {
	Injector
	ShouldApplyNetworkChaos(host string, port int) bool
	GetNetworkLatency(host string, port int) (time.Duration, bool)
	ShouldDropConnection(host string, port int) bool
}

ChaosNetworkProvider provides network chaos injection capability

type ChaosPanicProvider ¶

type ChaosPanicProvider interface {
	Injector
	ShouldChaosPanic() bool
	GetPanicProbability() float64
}

ChaosPanicProvider provides panic injection capability

type ChaosProvider ¶

type ChaosProvider interface {
	Name() string
	Apply(ctx context.Context) bool
}

ChaosProvider is a universal interface for context-based chaos injection

type ErrorRecorder ¶ added in v0.5.1

type ErrorRecorder interface {
	RecordError(ctx context.Context)
}

ErrorRecorder is implemented by validators that can record errors during execution.

type ErrorSummary ¶

type ErrorSummary struct {
	ErrorPattern string             `json:"error_pattern"`
	Count        int                `json:"count"`
	Severity     ValidationSeverity `json:"severity"`
	Examples     []string           `json:"examples,omitempty"`
}

ErrorSummary represents a common error pattern

type EventRecorder ¶

type EventRecorder interface {
	RecordPanic(ctx context.Context)
	RecordRecursionDepth(depth int)
	RecordError(ctx context.Context)
}

EventRecorder provides a unified interface for recording runtime events from steps.

type ExecutionResult ¶

type ExecutionResult struct {
	ScenarioName  string
	Success       bool
	Error         error
	Duration      time.Duration
	StepsExecuted int
	Timestamp     time.Time
}

ExecutionResult contains the result of a scenario execution

type Executor ¶

type Executor struct {
	// contains filtered or unexported fields
}

Executor runs scenarios

func NewExecutor ¶

func NewExecutor(opts ...ExecutorOption) *Executor

NewExecutor creates a new executor with options

func (*Executor) Metrics ¶

func (e *Executor) Metrics() *MetricsCollector

Metrics returns the metrics collector

func (*Executor) Reporter ¶

func (e *Executor) Reporter() *Reporter

Reporter returns the reporter

func (*Executor) Run ¶

func (e *Executor) Run(ctx context.Context, scenario *Scenario) error

Run executes a scenario

type ExecutorOption ¶

type ExecutorOption func(*Executor)

ExecutorOption configures an Executor

func WithFailurePolicy ¶

func WithFailurePolicy(policy FailurePolicy) ExecutorOption

WithFailurePolicy sets the failure handling policy

func WithJSONLogging ¶

func WithJSONLogging() ExecutorOption

WithJSONLogging sets JSON output format

func WithLogger ¶

func WithLogger(logger Logger) ExecutorOption

WithLogger sets a custom logger (deprecated, use WithSlogLogger)

func WithMetrics ¶

func WithMetrics(metrics *MetricsCollector) ExecutorOption

WithMetrics sets a custom metrics collector

func WithReporter ¶

func WithReporter(reporter *Reporter) ExecutorOption

WithReporter sets a custom reporter

func WithSlogLogger ¶

func WithSlogLogger(logger *slog.Logger) ExecutorOption

WithSlogLogger sets a structured logger

type FailureAnalysis ¶

type FailureAnalysis struct {
	// ByValidator counts failures per validator
	ByValidator map[string]int `json:"by_validator"`

	// ByType groups failures by error type
	ByType map[string]int `json:"by_type"`

	// TopErrors lists most common errors
	TopErrors []ErrorSummary `json:"top_errors"`

	// FailureRate over time (if duration-based test)
	FailureRateOverTime []TimeWindow `json:"failure_rate_over_time,omitempty"`
}

FailureAnalysis provides detailed failure breakdown

type FailurePolicy ¶

type FailurePolicy int

FailurePolicy defines how the executor handles failures

const (
	// FailFast stops execution on first failure
	FailFast FailurePolicy = iota
	// ContinueOnFailure continues execution even after failures
	ContinueOnFailure
)
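
The difference between the two policies can be sketched as a loop that either stops at the first failing iteration or keeps going and collects errors. This is only an illustration of the idea under that assumption; the executor's real loop is not shown on this page, and `runAll` and `sample` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// FailurePolicy mirrors the type declared above.
type FailurePolicy int

const (
	FailFast FailurePolicy = iota
	ContinueOnFailure
)

// runAll executes iterations in order and applies the policy.
// It returns how many iterations ran and the joined errors (nil if none).
func runAll(policy FailurePolicy, iterations []func() error) (int, error) {
	var errs []error
	executed := 0
	for _, it := range iterations {
		executed++
		if err := it(); err != nil {
			errs = append(errs, err)
			if policy == FailFast {
				break // stop at the first failure
			}
		}
	}
	return executed, errors.Join(errs...)
}

// sample returns three iterations; only the second one fails.
func sample() []func() error {
	ok := func() error { return nil }
	fail := func() error { return errors.New("iteration 2 failed") }
	return []func() error{ok, fail, ok}
}

func main() {
	n, err := runAll(FailFast, sample())
	fmt.Println(n, err) // 2 iteration 2 failed

	n, err = runAll(ContinueOnFailure, sample())
	fmt.Println(n, err) // 3 iteration 2 failed
}
```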

type GlobalInjector ¶

type GlobalInjector interface {
	Injector
	IsGlobal() bool
}

GlobalInjector indicates that injector applies global effects

type Injector ¶

type Injector interface {
	Name() string
	Inject(ctx context.Context) error
	Stop(ctx context.Context) error
}

Injector introduces faults into the system. Implement this interface to create custom chaos injection behaviors.

Inject() is called when the injector should start injecting faults. Stop() is called when the injector should stop and clean up.

Example implementations:

DelayInjector: Adds random delays
PanicInjector: Triggers panics with probability
NetworkInjector: Introduces network latency/drops

See the injectors package for reference implementations.
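
A custom injector only has to satisfy the three methods above. The sketch below declares a local copy of the interface so it compiles on its own; `toggleInjector` and `lifecycle` are hypothetical and simply flip a flag that application code could consult while the fault is active.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// Injector mirrors the interface shown above.
type Injector interface {
	Name() string
	Inject(ctx context.Context) error
	Stop(ctx context.Context) error
}

// toggleInjector is a hypothetical injector that raises a flag while active.
type toggleInjector struct {
	name   string
	active atomic.Bool
}

// Compile-time check that toggleInjector satisfies Injector.
var _ Injector = (*toggleInjector)(nil)

func (t *toggleInjector) Name() string { return t.name }

// Inject turns the fault on unless the context is already done.
func (t *toggleInjector) Inject(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	t.active.Store(true)
	return nil
}

// Stop turns the fault off; it is safe to call more than once.
func (t *toggleInjector) Stop(context.Context) error {
	t.active.Store(false)
	return nil
}

// Active reports whether the fault is currently switched on.
func (t *toggleInjector) Active() bool { return t.active.Load() }

// lifecycle drives Inject and Stop and reports the flag after each call.
func lifecycle(t *toggleInjector) (afterInject, afterStop bool, err error) {
	ctx := context.Background()
	if err = t.Inject(ctx); err != nil {
		return false, t.Active(), err
	}
	afterInject = t.Active()
	err = t.Stop(ctx)
	return afterInject, t.Active(), err
}

func main() {
	on, off, err := lifecycle(&toggleInjector{name: "cache-outage"})
	fmt.Println(on, off, err) // true false <nil>
}
```

Keeping `Stop` idempotent is a deliberate choice in this sketch, so cleanup can run unconditionally after a scenario ends.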

type InjectorType ¶

type InjectorType int

InjectorType defines how an injector applies its effects

const (
	// InjectorTypeGlobal applies effects globally (CPU, Memory, Network proxies)
	InjectorTypeGlobal InjectorType = iota
	// InjectorTypeContext applies effects through context (Delay, Panic)
	InjectorTypeContext
	// InjectorTypeStep applies effects before/after steps
	InjectorTypeStep
	// InjectorTypeHybrid can work in multiple modes
	InjectorTypeHybrid
)

type JUnitError ¶

type JUnitError struct {
	Message string `xml:"message,attr"`
	Type    string `xml:"type,attr"`
	Content string `xml:",chardata"`
}

JUnitError represents a test error

type JUnitFailure ¶

type JUnitFailure struct {
	Message string `xml:"message,attr"`
	Type    string `xml:"type,attr"`
	Content string `xml:",chardata"`
}

JUnitFailure represents a test failure

type JUnitTestCase ¶

type JUnitTestCase struct {
	Name      string        `xml:"name,attr"`
	Classname string        `xml:"classname,attr"`
	Time      float64       `xml:"time,attr"`
	Failure   *JUnitFailure `xml:"failure,omitempty"`
	Error     *JUnitError   `xml:"error,omitempty"`
}

JUnitTestCase represents a single test case

type JUnitTestSuite ¶

type JUnitTestSuite struct {
	XMLName   xml.Name        `xml:"testsuite"`
	Name      string          `xml:"name,attr"`
	Tests     int             `xml:"tests,attr"`
	Failures  int             `xml:"failures,attr"`
	Errors    int             `xml:"errors,attr"`
	Time      float64         `xml:"time,attr"`
	Timestamp string          `xml:"timestamp,attr"`
	TestCases []JUnitTestCase `xml:"testcase"`
}

JUnitTestSuite represents JUnit XML test suite format

type Logger ¶

type Logger interface {
	Printf(format string, v ...any)
	Println(v ...any)
}

Logger is deprecated. Use *slog.Logger instead. This type is kept for backward compatibility but will be removed in a future version.

func NewDefaultLogger ¶

func NewDefaultLogger() Logger

NewDefaultLogger creates a default logger (deprecated, use slog.Default())

type MetricsCollector ¶

type MetricsCollector struct {
	// contains filtered or unexported fields
}

MetricsCollector collects execution metrics

func NewMetricsCollector ¶

func NewMetricsCollector() *MetricsCollector

NewMetricsCollector creates a new metrics collector

func (*MetricsCollector) GetInjectorMetrics ¶

func (m *MetricsCollector) GetInjectorMetrics(injectorName string) (map[string]interface{}, bool)

GetInjectorMetrics returns metrics for a specific injector

func (*MetricsCollector) RecordExecution ¶

func (m *MetricsCollector) RecordExecution(result ExecutionResult)

RecordExecution records an execution result

func (*MetricsCollector) RecordInjectorMetrics ¶

func (m *MetricsCollector) RecordInjectorMetrics(injectorName string, metrics map[string]interface{})

RecordInjectorMetrics records metrics from an injector

func (*MetricsCollector) Stats ¶

func (m *MetricsCollector) Stats() map[string]any

Stats returns current statistics

type MetricsProvider ¶

type MetricsProvider interface {
	Injector
	GetMetrics() map[string]interface{}
}

MetricsProvider allows injectors to expose metrics

type NetworkInjectorLifecycle ¶

type NetworkInjectorLifecycle interface {
	Injector
	SetupNetwork(ctx context.Context) error
	TeardownNetwork(ctx context.Context) error
}

NetworkInjectorLifecycle manages network proxy setup/teardown

type PanicRecorder ¶

type PanicRecorder interface {
	RecordPanic(ctx context.Context)
}

PanicRecorder is implemented by validators that can record panics during execution.

type RecursionRecorder ¶

type RecursionRecorder interface {
	RecordRecursion(depth int)
}

RecursionRecorder is implemented by validators that can record recursion/rollback depth.

type Report ¶

type Report struct {
	// Verdict is the overall test outcome
	Verdict Verdict `json:"verdict"`

	// Summary is human-readable verdict explanation
	Summary string `json:"summary"`

	// ScenarioName is the name of tested scenario
	ScenarioName string `json:"scenario_name"`

	// ExecutionTime is when test was executed
	ExecutionTime time.Time `json:"execution_time"`

	// Duration is total test duration
	Duration time.Duration `json:"duration"`

	// Statistics
	TotalIterations int           `json:"total_iterations"`
	SuccessCount    int           `json:"success_count"`
	FailureCount    int           `json:"failure_count"`
	SuccessRate     float64       `json:"success_rate"`
	AvgDuration     time.Duration `json:"avg_duration"`

	// Failures categorized by severity
	CriticalFailures []ValidationFailure `json:"critical_failures"`
	Warnings         []ValidationFailure `json:"warnings"`
	InfoMessages     []ValidationFailure `json:"info_messages,omitempty"`

	// Failure analysis
	Analysis *FailureAnalysis `json:"analysis,omitempty"`

	// Thresholds used for evaluation
	Thresholds *SuccessThresholds `json:"thresholds,omitempty"`
}

Report contains comprehensive test results with verdict

type Reporter ¶

type Reporter struct {
	// contains filtered or unexported fields
}

Reporter generates execution reports

func NewReporter ¶

func NewReporter() *Reporter

NewReporter creates a new reporter

func (*Reporter) AddResult ¶

func (r *Reporter) AddResult(result ExecutionResult)

AddResult adds an execution result

func (*Reporter) GenerateJSON ¶

func (r *Reporter) GenerateJSON() (string, error)

GenerateJSON returns a JSON report with aggregate stats and executions

func (*Reporter) GenerateJUnitXML ¶

func (r *Reporter) GenerateJUnitXML(report *Report) (string, error)

GenerateJUnitXML converts report to JUnit XML format

func (*Reporter) GenerateReport ¶

func (r *Reporter) GenerateReport() string

GenerateReport generates a human-readable summary report

func (*Reporter) GenerateTextReport ¶

func (r *Reporter) GenerateTextReport(report *Report) string

GenerateTextReport generates enhanced human-readable report

func (*Reporter) GetVerdict ¶

func (r *Reporter) GetVerdict(thresholds *SuccessThresholds) (*Report, error)

GetVerdict calculates verdict based on thresholds

func (*Reporter) Results ¶

func (r *Reporter) Results() []ExecutionResult

Results returns a copy of accumulated results

func (*Reporter) SaveJSON ¶

func (r *Reporter) SaveJSON(path string) error

SaveJSON writes the JSON report to a file

func (*Reporter) SaveJUnitXML ¶

func (r *Reporter) SaveJUnitXML(report *Report, path string) error

SaveJUnitXML writes JUnit XML report to file

type Resettable ¶

type Resettable interface {
	Reset()
}

Resettable is implemented by validators that need to reset state between iterations

type Scenario ¶

type Scenario struct {
	// contains filtered or unexported fields
}

Scenario describes a chaos experiment. A scenario defines what to test (target), how to test it (steps), what faults to inject (injectors), and what invariants to verify (validators).

Scenarios are built using the ScenarioBuilder pattern:

scenario := chaoskit.NewScenario("my-test").
	WithTarget(mySystem).
	Step("step1", func(ctx context.Context, target chaoskit.Target) error {
		// Execute step logic
		return nil
	}).
	Inject("delay", injectors.RandomDelay(10*time.Millisecond, 50*time.Millisecond)).
	Assert("goroutines", validators.GoroutineLimit(100)).
	Repeat(10).
	Build()

type ScenarioBuilder ¶

type ScenarioBuilder struct {
	// contains filtered or unexported fields
}

ScenarioBuilder builds scenarios fluently

func NewScenario ¶

func NewScenario(name string) *ScenarioBuilder

NewScenario creates a new scenario builder. Use the builder methods to configure the scenario, then call Build() to create the Scenario.

Example:

scenario := chaoskit.NewScenario("test").
	WithTarget(mySystem).
	Repeat(5).
	Build()

func (*ScenarioBuilder) Assert ¶

func (b *ScenarioBuilder) Assert(name string, validator Validator) *ScenarioBuilder

Assert adds a validator

func (*ScenarioBuilder) Build ¶

func (b *ScenarioBuilder) Build() *Scenario

Build returns the built scenario

func (*ScenarioBuilder) Inject ¶

func (b *ScenarioBuilder) Inject(name string, injector Injector) *ScenarioBuilder

Inject adds a fault injector

func (*ScenarioBuilder) Repeat ¶

func (b *ScenarioBuilder) Repeat(n int) *ScenarioBuilder

Repeat sets the number of times to repeat the scenario

func (*ScenarioBuilder) RunFor ¶

func (b *ScenarioBuilder) RunFor(duration time.Duration) *ScenarioBuilder

RunFor sets the duration to run the scenario

func (*ScenarioBuilder) Scope ¶

func (b *ScenarioBuilder) Scope(name string, fn func(*ScopeBuilder)) *ScenarioBuilder

Scope adds a scope for grouping injectors

func (*ScenarioBuilder) Step ¶

func (b *ScenarioBuilder) Step(name string, fn func(context.Context, Target) error) *ScenarioBuilder

Step adds a step to the scenario

func (*ScenarioBuilder) WithSeed ¶

func (b *ScenarioBuilder) WithSeed(seed int64) *ScenarioBuilder

WithSeed sets the random seed for deterministic experiments

func (*ScenarioBuilder) WithTarget ¶

func (b *ScenarioBuilder) WithTarget(target Target) *ScenarioBuilder

WithTarget sets the target system

type Scope ¶

type Scope struct {
	// contains filtered or unexported fields
}

Scope groups injectors logically (e.g., "db", "api", "cache")

type ScopeBuilder ¶

type ScopeBuilder struct {
	// contains filtered or unexported fields
}

ScopeBuilder builds a scope fluently

func (*ScopeBuilder) Inject ¶

func (sb *ScopeBuilder) Inject(name string, injector Injector) *ScopeBuilder

Inject adds a fault injector to the scope

type Step ¶

type Step interface {
	Name() string
	Execute(ctx context.Context, target Target) error
}

Step represents a single step in a scenario. Steps are executed sequentially and can use chaos injection functions like MaybeDelay() and MaybePanic() to interact with active injectors.

Example:

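// myStep is a user-defined type that implements Step.
step := &myStep{name: "process-order"}
// ScenarioBuilder.Step takes a func(context.Context, Target) error,
// so pass the Execute method value rather than the Step itself.
scenario.Step("process", step.Execute)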

type StepInjector ¶

type StepInjector interface {
	Injector
	BeforeStep(ctx context.Context) error
	AfterStep(ctx context.Context, err error) error
}

StepInjector is an Injector that can additionally inject faults before and after step execution through its BeforeStep and AfterStep hooks.

type StepWrapper ¶

type StepWrapper interface {
	WrapStep(step Step) func(ctx context.Context, target Target) error
}

StepWrapper is implemented by validators that can wrap step execution. This allows validators to intercept and modify step behavior, such as adding timeouts, monitoring, or other cross-cutting concerns.

WrapStep receives a step and returns a wrapped version that will be executed instead of the original step. The wrapper function receives the same context and target as the original step.

Example use cases:

InfiniteLoopValidator: Wraps steps with timeout to detect hung steps
PerformanceMonitor: Wraps steps to measure execution time

type SuccessThresholds ¶

type SuccessThresholds struct {
	// MinSuccessRate is minimum acceptable success rate (0.0-1.0)
	// Example: 0.95 = 95% of iterations must succeed
	MinSuccessRate float64 `json:"min_success_rate" yaml:"min_success_rate"`

	// CriticalValidators lists validators that MUST pass (block release if fail)
	// Example: [ValidatorGoroutineLimit, ValidatorSlowIteration]
	CriticalValidators []string `json:"critical_validators" yaml:"critical_validators"`

	// WarningValidators lists validators that produce warnings (don't block)
	// Example: [ValidatorExecutionTime, "memory-pressure"]
	WarningValidators []string `json:"warning_validators,omitempty" yaml:"warning_validators,omitempty"`

	// MaxFailedIterations is maximum number of failed iterations allowed
	// If exceeded, test fails regardless of success rate
	MaxFailedIterations int `json:"max_failed_iterations,omitempty" yaml:"max_failed_iterations,omitempty"`

	// MaxAvgDuration is maximum acceptable average execution duration
	// Exceeding this produces a warning
	MaxAvgDuration time.Duration `json:"max_avg_duration,omitempty" yaml:"max_avg_duration,omitempty"`

	// RequireAllValidatorsPassing requires ALL validators to pass
	// If true, any validator failure = FAIL
	//nolint:lll
	RequireAllValidatorsPassing bool `json:"require_all_validators_passing,omitempty" yaml:"require_all_validators_passing,omitempty"`
}

SuccessThresholds defines criteria for test success

func DefaultThresholds ¶

func DefaultThresholds() *SuccessThresholds

DefaultThresholds returns sensible defaults for most systems

func RelaxedThresholds ¶

func RelaxedThresholds() *SuccessThresholds

RelaxedThresholds returns relaxed thresholds for experimental features

func StrictThresholds ¶

func StrictThresholds() *SuccessThresholds

StrictThresholds returns strict thresholds for critical systems

func (*SuccessThresholds) Validate ¶

func (t *SuccessThresholds) Validate() error

Validate checks if thresholds are valid
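The exact checks performed by Validate are not documented here. As a sketch of the kind of sanity checks the field comments suggest (a success rate within 0.0-1.0, non-negative limits), one could write something like the following. The thresholds type is a trimmed local stand-in and the rules are assumptions, not the library's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// thresholds is a trimmed, hypothetical stand-in for SuccessThresholds.
type thresholds struct {
	MinSuccessRate      float64
	MaxFailedIterations int
	MaxAvgDuration      time.Duration
}

// validate applies assumed sanity rules derived from the field comments.
func (t *thresholds) validate() error {
	if t == nil {
		return errors.New("thresholds are nil")
	}
	// Written as a negated range check so that NaN is rejected as well.
	if !(t.MinSuccessRate >= 0 && t.MinSuccessRate <= 1) {
		return fmt.Errorf("min success rate %v is outside 0.0-1.0", t.MinSuccessRate)
	}
	if t.MaxFailedIterations < 0 {
		return fmt.Errorf("max failed iterations %d is negative", t.MaxFailedIterations)
	}
	if t.MaxAvgDuration < 0 {
		return fmt.Errorf("max average duration %v is negative", t.MaxAvgDuration)
	}
	return nil
}

func main() {
	ok := &thresholds{MinSuccessRate: 0.95, MaxFailedIterations: 5, MaxAvgDuration: time.Second}
	bad := &thresholds{MinSuccessRate: 95} // a percentage where a fraction is expected
	fmt.Println(ok.validate())
	fmt.Println(bad.validate())
}
```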

type Target ¶

type Target interface {
	Name() string
	Setup(ctx context.Context) error
	Teardown(ctx context.Context) error
}

Target represents the system under test. Implement this interface to define the system that will be subject to chaos testing.

Example:

type MySystem struct{}

func (s *MySystem) Name() string { return "my-system" }

func (s *MySystem) Setup(ctx context.Context) error {
	// Initialize system resources
	return nil
}

func (s *MySystem) Teardown(ctx context.Context) error {
	// Clean up resources
	return nil
}

type TimeWindow ¶

type TimeWindow struct {
	Start       time.Time `json:"start"`
	End         time.Time `json:"end"`
	Iterations  int       `json:"iterations"`
	Failures    int       `json:"failures"`
	FailureRate float64   `json:"failure_rate"`
}

TimeWindow represents the failure rate in a time period

type ValidationFailure ¶

type ValidationFailure struct {
	ValidatorName string             `json:"validator_name"`
	Severity      ValidationSeverity `json:"severity"`
	Message       string             `json:"message"`
	Occurrences   int                `json:"occurrences"`
	FirstSeen     time.Time          `json:"first_seen"`
	LastSeen      time.Time          `json:"last_seen"`
	Details       map[string]any     `json:"details,omitempty"`
}

ValidationFailure represents a validator failure

type ValidationSeverity ¶

type ValidationSeverity int

ValidationSeverity indicates how critical a validator failure is

const (
	// SeverityCritical blocks release - must be fixed
	SeverityCritical ValidationSeverity = iota

	// SeverityWarning should be investigated but doesn't block release
	SeverityWarning

	// SeverityInfo is informational only
	SeverityInfo
)

func (ValidationSeverity) String ¶

func (s ValidationSeverity) String() string

String returns human-readable severity

type Validator ¶

type Validator interface {
	Name() string
	Validate(ctx context.Context, target Target) error

	// Severity returns the severity level of this validator
	// Added in v1.x for CI/CD integration
	Severity() ValidationSeverity
}

Validator checks system invariants. Implement this interface to verify that the system maintains expected properties during chaos testing.

Validate() is called after each scenario execution to check invariants. Return an error if the invariant is violated.

Severity() returns the severity level of this validator for CI/CD integration. This determines how failures are categorized in test reports.

Example implementations:

GoroutineLimit: Ensures goroutine count stays below threshold
RecursionDepthLimit: Verifies recursion depth doesn't exceed limit
NoSlowIteration: Detects slow loops

See the validators package for reference implementations.

type Verdict ¶

type Verdict int

Verdict represents the overall test outcome

const (
	// VerdictPass indicates all critical validators passed
	VerdictPass Verdict = iota

	// VerdictUnstable indicates warnings detected but no critical failures
	VerdictUnstable

	// VerdictFail indicates critical validators failed
	VerdictFail
)

func (Verdict) ExitCode ¶

func (v Verdict) ExitCode() int

ExitCode returns the appropriate process exit code for CI/CD: Pass=0, Unstable=0, Fail=1.
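The documented mapping can be reproduced in a few lines. The sketch below redeclares Verdict locally so it runs on its own; treating every value other than VerdictFail as exit code 0 is an assumption of the sketch, since only the three named verdicts are documented.

```go
package main

import "fmt"

// Verdict is a local copy of the documented type and constants.
type Verdict int

const (
	VerdictPass Verdict = iota
	VerdictUnstable
	VerdictFail
)

// ExitCode follows the documented mapping: only Fail is non-zero,
// so an unstable run does not break a CI pipeline.
func (v Verdict) ExitCode() int {
	if v == VerdictFail {
		return 1
	}
	return 0
}

func main() {
	for _, v := range []Verdict{VerdictPass, VerdictUnstable, VerdictFail} {
		fmt.Println(int(v), v.ExitCode())
	}
}
```

In a CI entry point, the result would typically be handed to os.Exit once the verdict is known.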

func (Verdict) String ¶

func (v Verdict) String() string

String returns human-readable verdict

Directories ¶

Path	Synopsis
cmd/report-viewer	command
exporters
injectors
testing
validators