hedge

Version: v1.0.0
Published: Mar 26, 2026 | License: MIT | Imports: 11 | Imported by: 0

README

hedge

hedge is a Go library that reduces tail latency in fan-out architectures using adaptive hedged requests. It learns your service's latency distribution in real time using DDSketch and fires backup requests only when the primary is genuinely slow - matching hand-tuned static thresholds with zero configuration. A token bucket budget prevents load amplification during outages.


Introduction

In fan-out architectures, a single user request fans out to dozens or hundreds of downstream services. Even when each service has only 1% slow responses, the probability that at least one backend is slow compounds dramatically - with 100 services, 63% of top-level requests will be delayed by at least one straggler. Dean & Barroso (2013) identified hedged requests as the most effective mitigation: send a duplicate request to another server after a brief delay, and use whichever responds first.

hedge implements this with three components. A per-host DDSketch (Masson et al., VLDB 2019) tracks the latency distribution in real time with relative-error guarantees and constant memory, automatically adapting to load changes and deployments. When a request exceeds the estimated p90 latency, a backup is fired to the same target; whichever responds first wins and the loser is cancelled. A token bucket budget caps the hedge rate at a configurable percentage of total traffic, so that during genuine outages - when every request is slow - hedging stops before it doubles backend load.

Features

  • Zero configuration - learns latency per target host automatically
  • Drop-in http.RoundTripper and gRPC UnaryClientInterceptor
  • Adaptive thresholds via DDSketch with relative-error guarantees
  • Token bucket budget prevents load amplification during outages
  • Constant memory, O(1) per-request overhead (~35ns for sketch update)
  • Full observability via Stats API

Evaluation

50,000 requests against a backend with lognormal base latency (mean=5ms) and 5% straggler probability (10× multiplier). Adaptive hedge matches the best hand-tuned static threshold at p99 (17.3ms vs 17.5ms) without requiring any manual configuration, at a comparable hedge overhead (8.9% vs 7.7%). A static threshold only works when it happens to fit the distribution: 10ms does here, while 50ms hedges too conservatively (2.1% overhead, but p99 still at 54.9ms). In production, where latency distributions shift with load and deployments, adaptive tracking avoids the stale-threshold problem entirely.

Configuration      p50     p90     p95      p99      p999      Overhead
No hedging         5.1ms   9.0ms   18.8ms   65.0ms   103.8ms   0.0%
Static 10ms        5.0ms   9.0ms   13.3ms   17.5ms   61.2ms    7.7%
Static 50ms        5.0ms   9.0ms   16.5ms   54.9ms   59.7ms    2.1%
Adaptive (hedge)   5.0ms   8.9ms   12.3ms   17.3ms   63.5ms    8.9%

Reproduce: cd benchmark/simulate && go run .


Installation

go get github.com/bhope/hedge

Quick Start

Zero configuration - the transport learns latency automatically:

import (
    "net/http"

    "github.com/bhope/hedge"
)

// hedge.New wraps any http.RoundTripper; no options are required.
client := &http.Client{
    Transport: hedge.New(http.DefaultTransport),
}
resp, err := client.Get("https://api.example.com/data")

Tuned - with explicit options and observability:

import (
    "fmt"
    "net/http"
    "time"

    "github.com/bhope/hedge"
)

// stats receives the live counters through WithStats.
var stats *hedge.Stats

client := &http.Client{
    Transport: hedge.New(http.DefaultTransport,
        hedge.WithPercentile(0.90),
        hedge.WithBudgetPercent(10),
        hedge.WithEstimatedRPS(1000),
        hedge.WithMinDelay(time.Millisecond),
        hedge.WithStats(&stats),
    ),
}

// After requests:
fmt.Printf("hedged=%d total=%d budget_exhausted=%d\n",
    stats.HedgedRequests.Load(),
    stats.TotalRequests.Load(),
    stats.BudgetExhausted.Load(),
)

Configuration

Option                 Type            Default   Description
WithPercentile(q)      float64         0.90      Quantile of the latency distribution used as the hedge trigger threshold
WithMaxHedges(n)       int             1         Maximum number of in-flight hedge requests per call
WithBudgetPercent(p)   float64         10.0      Max hedge rate as a percentage of estimated total traffic
WithEstimatedRPS(r)    float64         100       Expected requests per second; scales the token bucket capacity
WithMinDelay(d)        time.Duration   1ms       Floor on the hedge delay; prevents hedging on sub-millisecond latencies
WithStats(s)           **Stats         nil       Pointer to receive the live Stats struct for observability

gRPC Support

import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    "github.com/bhope/hedge"
)

conn, err := grpc.NewClient(target,
    grpc.WithTransportCredentials(insecure.NewCredentials()),
    grpc.WithUnaryInterceptor(hedge.NewUnaryClientInterceptor(
        hedge.WithEstimatedRPS(500),
        hedge.WithBudgetPercent(10),
    )),
)

All options from the HTTP transport are supported. Per-target latency tracking uses cc.Target() as the host key.


How It Works

DDSketch

DDSketch is a streaming quantile sketch with relative-error guarantees: the returned quantile estimate is always within a relative error ε of the true value, that is, between (1-ε)·x and (1+ε)·x for a true quantile x (default ε=1%). hedge maintains one sketch per target host, updated on every completed request in O(1) time with constant memory. A tumbling window (default 30s) ages out old observations so the sketch adapts to changing conditions.

Adaptive Hedging

When a request has been in flight longer than the estimated latency at the configured quantile (p90 by default), a backup request is fired to the same target using a child context derived from the caller's context. Whichever response arrives first is returned to the caller; the other is cancelled and its response body drained to release the connection back to the pool. If the primary still wins, it was just slow rather than a true straggler, and the caller sees no added latency.

Hedging Budget

A token bucket limits hedge rate to a configurable percentage of total traffic (default 10%). The bucket refills at estimatedRPS × budgetPercent / 100 tokens per second. When the bucket is empty, the request waits for the primary without firing a hedge. During genuine outages - when every request is slow and the bucket drains - hedging stops automatically, preventing the load-doubling spiral that would worsen the outage.


Why Not a Static Threshold?

A static 10ms threshold looks great in benchmarks with fixed distributions. In production, latency shifts with load, deployments, GC pauses, and time of day - a threshold that is perfect at 3am can push the hedge rate above 90% at peak traffic. You would need to continuously monitor per-service latency and reconfigure thresholds as conditions change across every target your client talks to. Adaptive tracking handles this automatically: the sketch updates on every request, and the hedge threshold follows the actual distribution wherever it goes.


Contributing

Contributions are welcome! Please open an issue to discuss your idea before submitting a PR.

See CONTRIBUTING.md for development setup and guidelines.


License

hedge is released under the MIT License.

Documentation

Functions

func LatencyEstimate

func LatencyEstimate(rt http.RoundTripper, host string, q float64) time.Duration

LatencyEstimate returns the current hedge-delay threshold the transport would use for the given host and quantile. Returns 0 if rt was not created by New.

func New

func New(transport http.RoundTripper, opts ...Option) http.RoundTripper

func NewUnaryClientInterceptor

func NewUnaryClientInterceptor(opts ...Option) grpc.UnaryClientInterceptor

Types

type Option

type Option func(*config)

func WithBudgetPercent

func WithBudgetPercent(pct float64) Option

func WithEstimatedRPS

func WithEstimatedRPS(rps float64) Option

func WithMaxHedges

func WithMaxHedges(n int) Option

func WithMinDelay

func WithMinDelay(d time.Duration) Option

func WithPercentile

func WithPercentile(p float64) Option

func WithStats

func WithStats(s **Stats) Option

type Stats

type Stats struct {
	TotalRequests   atomic.Int64
	HedgedRequests  atomic.Int64
	HedgeWins       atomic.Int64
	PrimaryWins     atomic.Int64
	BudgetExhausted atomic.Int64
	WarmupRequests  atomic.Int64
}

func (*Stats) HedgeRate

func (s *Stats) HedgeRate() float64

func (*Stats) Snapshot

func (s *Stats) Snapshot() StatsSnapshot

type StatsSnapshot

type StatsSnapshot struct {
	TotalRequests   int64
	HedgedRequests  int64
	HedgeWins       int64
	PrimaryWins     int64
	BudgetExhausted int64
	WarmupRequests  int64
}

Directories

Path Synopsis
Package sketch implements a DDSketch streaming quantile estimator.
