hedge

package module

Version: v1.0.0 (latest) | Published: Mar 26, 2026 | License: MIT | Imports: 11 | Imported by: 0

Repository

github.com/bhope/hedge

README ¶

hedge

hedge is a Go library that reduces tail latency in fan-out architectures using adaptive hedged requests. It learns your service's latency distribution in real time using DDSketch and fires backup requests only when the primary is genuinely slow, matching hand-tuned static thresholds at p99 with zero configuration. A token bucket budget prevents load amplification during outages.

Introduction

In fan-out architectures, a single user request fans out to dozens or hundreds of downstream services. Even when each service is slow on only 1% of responses, the probability that at least one backend is slow compounds dramatically: with 100 independent services, 63% of top-level requests will be delayed by at least one straggler. Dean & Barroso (2013) described hedged requests as an effective mitigation: send a duplicate request to another server after a brief delay, and use whichever responds first.

hedge implements this with three components. A per-host DDSketch (Masson et al., VLDB 2019) tracks the latency distribution in real time with relative-error guarantees and constant memory, automatically adapting to load changes and deployments. When a request exceeds the estimated p90 latency, a backup is fired to the same target; whichever responds first wins and the loser is cancelled. A token bucket budget caps the hedge rate at a configurable percentage of total traffic, so that during genuine outages - when every request is slow - hedging stops before it doubles backend load.

Features

Zero configuration - learns latency per target host automatically
Drop-in http.RoundTripper and gRPC UnaryClientInterceptor
Adaptive thresholds via DDSketch with relative-error guarantees
Token bucket budget prevents load amplification during outages
Constant memory, O(1) per-request overhead (~35ns for sketch update)
Full observability via Stats API

Evaluation

Setup: 50,000 requests against a backend with lognormal base latency (mean=5ms) and 5% straggler probability (10× multiplier). The adaptive hedge matches the best hand-tuned static threshold at p99 (17.3ms vs 17.5ms for static 10ms) without any manual configuration, at a slightly higher overhead (8.9% vs 7.7%). A poorly chosen static threshold hedges too conservatively (50ms: only 2.1% overhead, but p99 still at 54.9ms). In production, where latency distributions shift with load and deployments, adaptive tracking avoids the stale-threshold problem.

Configuration	p50	p90	p95	p99	p999	Overhead
No hedging	5.1ms	9.0ms	18.8ms	65.0ms	103.8ms	0.0%
Static 10ms	5.0ms	9.0ms	13.3ms	17.5ms	61.2ms	7.7%
Static 50ms	5.0ms	9.0ms	16.5ms	54.9ms	59.7ms	2.1%
Adaptive (hedge)	5.0ms	8.9ms	12.3ms	17.3ms	63.5ms	8.9%

Reproduce: cd benchmark/simulate && go run .

Installation

go get github.com/bhope/hedge

Quick Start

Zero configuration - the transport learns latency automatically:

import (
    "net/http"

    "github.com/bhope/hedge"
)

// Wrap the default transport; latency is tracked per target host.
client := &http.Client{
    Transport: hedge.New(http.DefaultTransport),
}
resp, err := client.Get("https://api.example.com/data")

Tuned - with explicit options and observability:

var stats *hedge.Stats

client := &http.Client{
    Transport: hedge.New(http.DefaultTransport,
        hedge.WithPercentile(0.90),
        hedge.WithBudgetPercent(10),
        hedge.WithEstimatedRPS(1000),
        hedge.WithMinDelay(time.Millisecond),
        hedge.WithStats(&stats),
    ),
}

// After requests:
fmt.Printf("hedged=%d total=%d budget_exhausted=%d\n",
    stats.HedgedRequests.Load(),
    stats.TotalRequests.Load(),
    stats.BudgetExhausted.Load(),
)

Configuration

Option	Type	Default	Description
WithPercentile(q)	float64	0.90	Quantile of the latency distribution used as the hedge trigger threshold
WithMaxHedges(n)	int	1	Maximum number of in-flight hedge requests per call
WithBudgetPercent(p)	float64	10.0	Max hedge rate as a percentage of estimated total traffic
WithEstimatedRPS(r)	float64	100	Expected requests per second; scales the token bucket capacity
WithMinDelay(d)	time.Duration	1ms	Floor on the hedge delay; prevents hedging on sub-millisecond latencies
WithStats(s)	**Stats	nil	Pointer to receive the live `Stats` struct for observability

gRPC Support

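import (
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"

    "github.com/bhope/hedge"
)

// Install the hedging interceptor for unary calls on this connection.
conn, err := grpc.NewClient(target,
    grpc.WithTransportCredentials(insecure.NewCredentials()),
    grpc.WithUnaryInterceptor(hedge.NewUnaryClientInterceptor(
        hedge.WithEstimatedRPS(500),
        hedge.WithBudgetPercent(10),
    )),
)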

All options from the HTTP transport are supported. Per-target latency tracking uses cc.Target() as the host key.

How It Works

DDSketch

DDSketch is a streaming quantile sketch with relative-error guarantees: the returned quantile estimate always lies between (1-ε) and (1+ε) times the true value (default ε=1%). hedge maintains one sketch per target host, updated on every completed request in O(1) time with constant memory. A tumbling window (default 30s) decays old observations so the sketch adapts to changing conditions.

Adaptive Hedging

When a request exceeds the estimated p90 latency, a backup request is fired to the same target using a child context derived from the caller's context. Whichever response arrives first is returned to the caller; the other is cancelled and its response body drained to release the connection back to the pool. If the primary completes before the threshold, no backup is fired and no overhead is incurred; if the primary wins after a backup was fired, the request was slow but not a straggler, and the backup is cancelled.

Hedging Budget

A token bucket limits hedge rate to a configurable percentage of total traffic (default 10%). The bucket refills at estimatedRPS × budgetPercent / 100 tokens per second. When the bucket is empty, the request waits for the primary without firing a hedge. During genuine outages - when every request is slow and the bucket drains - hedging stops automatically, preventing the load-doubling spiral that would worsen the outage.

Why Not a Static Threshold?

A static 10ms threshold looks great in benchmarks with fixed distributions. In production, latency shifts with load, deployments, GC pauses, and time of day: a threshold that is perfect at 3am can push the hedge rate above 90% at peak traffic. You would need to continuously monitor per-service latency and reconfigure thresholds as conditions change across every target your client talks to. Adaptive tracking handles this automatically: the sketch updates on every request, and the hedge threshold follows the actual distribution wherever it goes.

References

Jeffrey Dean and Luiz André Barroso. "The Tail at Scale." Communications of the ACM, 56(2):74–80, 2013.
Charles Masson, Jee E. Rim, and Homin K. Lee. "DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees." PVLDB, 12(12):2195–2205, 2019.

Contributing

Contributions are welcome! Please open an issue to discuss your idea before submitting a PR.

See CONTRIBUTING.md for development setup and guidelines.

License

hedge is released under the MIT License.

Documentation ¶

Index ¶

func LatencyEstimate(rt http.RoundTripper, host string, q float64) time.Duration
func New(transport http.RoundTripper, opts ...Option) http.RoundTripper
func NewUnaryClientInterceptor(opts ...Option) grpc.UnaryClientInterceptor
type Option
type Stats
- func (s *Stats) HedgeRate() float64
- func (s *Stats) Snapshot() StatsSnapshot
type StatsSnapshot

Functions ¶

func LatencyEstimate ¶

func LatencyEstimate(rt http.RoundTripper, host string, q float64) time.Duration

LatencyEstimate returns the current hedge-delay threshold the transport would use for the given host and quantile. Returns 0 if rt was not created by New.

func New ¶

func New(transport http.RoundTripper, opts ...Option) http.RoundTripper

func NewUnaryClientInterceptor ¶

func NewUnaryClientInterceptor(opts ...Option) grpc.UnaryClientInterceptor

Types ¶

type Option ¶

type Option func(*config)

func WithBudgetPercent ¶

func WithBudgetPercent(pct float64) Option

func WithEstimatedRPS ¶

func WithEstimatedRPS(rps float64) Option

func WithMaxHedges ¶

func WithMaxHedges(n int) Option

func WithMinDelay ¶

func WithMinDelay(d time.Duration) Option

func WithPercentile ¶

func WithPercentile(p float64) Option

func WithStats ¶

func WithStats(s **Stats) Option

type Stats ¶

type Stats struct {
	TotalRequests   atomic.Int64
	HedgedRequests  atomic.Int64
	HedgeWins       atomic.Int64
	PrimaryWins     atomic.Int64
	BudgetExhausted atomic.Int64
	WarmupRequests  atomic.Int64
}

func (*Stats) HedgeRate ¶

func (s *Stats) HedgeRate() float64

func (*Stats) Snapshot ¶

func (s *Stats) Snapshot() StatsSnapshot

type StatsSnapshot ¶

type StatsSnapshot struct {
	TotalRequests   int64
	HedgedRequests  int64
	HedgeWins       int64
	PrimaryWins     int64
	BudgetExhausted int64
	WarmupRequests  int64
}

Directories ¶

Path	Synopsis
budget
sketch	Package sketch implements a DDSketch streaming quantile estimator.
