Documentation
¶
Overview ¶
Package leader – etcd-backed leader election with fencing tokens.
EtcdElection implements distributed leader election over a real etcd cluster using etcd sessions + elections (go.etcd.io/etcd/client/v3/concurrency). It exposes the same fencing-token contract as the in-memory LeaseRegistry: every successful leadership acquisition issues a strictly-increasing epoch, and a stale leader can be detected by any consumer that tracks the highest epoch it has observed.
Package leader provides distributed leader election utilities for Helix Cluster OS.
Index ¶
- Constants
- Variables
- func StaleLeader(tok FencingToken, highestSeenEpoch uint64) bool
- type Clock
- type Config
- type Election
- func (e *Election) AcquireLease() (FencingToken, error)
- func (e *Election) BecomeLeader() error
- func (e *Election) Epoch() uint64
- func (e *Election) IsLeader() bool
- func (e *Election) LeaderID() string
- func (e *Election) LeaseValid() bool
- func (e *Election) Resign()
- func (e *Election) Run(ctx context.Context)
- func (e *Election) SetLeaseRegistry(r *LeaseRegistry)
- func (e *Election) SetSWIMProtocol(p *swim.Protocol)
- func (e *Election) Stop()
- type EtcdElection
- func (ee *EtcdElection) Campaign(ctx context.Context, runID string) (FencingToken, error)
- func (ee *EtcdElection) Close() error
- func (ee *EtcdElection) CurrentEpoch() uint64
- func (ee *EtcdElection) CurrentToken() FencingToken
- func (ee *EtcdElection) IsLeader() bool
- func (ee *EtcdElection) LeaderID() string
- func (ee *EtcdElection) QueryLeader(ctx context.Context) (LeaderValue, error)
- func (ee *EtcdElection) Resign(ctx context.Context) error
- type EtcdElectionBackend
- type EtcdElectionConfig
- type EtcdEpochStore
- type FencingToken
- type LeaderValue
- type LeaseRegistry
- type Option
Constants ¶
const EpochNamespace = "/helix/leader-epoch"
EpochNamespace is the parallel etcd key root under which durable fencing-epoch counters are stored. It is deliberately disjoint from any election key prefix so concurrency.Election's WithPrefix(ElectionKey) scan never matches an epoch key (which would otherwise hang Campaign — see NewEtcdElection).
Variables ¶
var ( ErrNoHealthyMembers = errors.New("no healthy members available for election") ErrNotLeader = errors.New("this node is not the leader") )
var ErrLeaseHeld = errors.New("leader: lease still held by an un-expired holder")
ErrLeaseHeld is returned by AcquireLease when a valid (un-expired) lease is still held — either by this node or, via a shared LeaseRegistry, by another node. Deterministic failover requires that a new candidate may acquire ONLY after the prior lease has expired.
Functions ¶
func StaleLeader ¶
func StaleLeader(tok FencingToken, highestSeenEpoch uint64) bool
StaleLeader checks if a given fencing token is stale relative to the highest observed epoch. Consumers call this to reject requests from former leaders whose lease was revoked.
Types ¶
type Clock ¶
Clock is an injectable time source. It exists so that leadership failover and TTL/lease expiry can be exercised deterministically in tests with a fake clock (no time.Sleep, no wall-clock dependence). The production default is realClock.
type Election ¶
type Election struct {
// contains filtered or unexported fields
}
Election implements a distributed leader election using SWIM membership. The leader is deterministically chosen as the lowest SHA256 hash of (memberID + term), and must renew its leadership within a TTL or another node will take over.
func NewElection ¶
NewElection creates a new distributed Election. Additional behavior may be supplied via Options (e.g. WithClock); with no options the construction is identical to the original API.
func (*Election) AcquireLease ¶
func (e *Election) AcquireLease() (FencingToken, error)
AcquireLease attempts to acquire (or renew) the leadership lease and returns a fresh FencingToken on success.
Hardening guarantees:
- Fencing token: every successful acquisition issues a strictly increasing epoch, so stale leaders are detectable by consumers (see FencingToken.IsStale).
- Deterministic failover: a new candidate may acquire ONLY after the current lease has expired (now >= expiry). A second acquire before expiry by a different node is rejected with ErrLeaseHeld and burns no epoch.
- Split-brain guard: when a shared LeaseRegistry is attached, the registry enforces a single valid holder per clock instant across all nodes; a contending node cannot acquire while another holds an un-expired lease.
func (*Election) BecomeLeader ¶
BecomeLeader attempts to become leader by checking deterministic ordering. If this node is the lowest hash among healthy members, it becomes leader.
func (*Election) Epoch ¶
Epoch returns the most recent fencing epoch issued to this node (0 if it has never acquired a lease).
func (*Election) LeaseValid ¶
LeaseValid reports whether this node currently holds a valid (un-expired) lease, evaluated against the injectable clock. When a shared registry is attached, validity is authoritative against the registry (so a node whose lease was taken over by a peer correctly reports false), preventing split-brain.
func (*Election) Run ¶
Run starts a background goroutine that periodically attempts to become leader and resigns if the TTL expires without renewal.
func (*Election) SetLeaseRegistry ¶
func (e *Election) SetLeaseRegistry(r *LeaseRegistry)
SetLeaseRegistry attaches a shared registry used as the cross-node split-brain guard. Elections sharing the same registry contend for one logical lease.
func (*Election) SetSWIMProtocol ¶
SetSWIMProtocol attaches a SWIM protocol for cluster membership awareness.
type EtcdElection ¶
type EtcdElection struct {
// contains filtered or unexported fields
}
EtcdElection is a distributed leader election backed by real etcd sessions and the concurrency.Election primitive. It satisfies HXC-1093.
func NewEtcdElection ¶
func NewEtcdElection(cfg EtcdElectionConfig, backend EtcdElectionBackend, epochStore EtcdEpochStore) (*EtcdElection, error)
NewEtcdElection creates a new EtcdElection.
backend is the injectable seam (pass nil to force the production path — but production callers should use NewEtcdElectionFromClient). epochStore is the injectable seam for epoch persistence (pass nil only in tests that inject a backend that manages epoch counting themselves).
func NewEtcdElectionFromClient ¶
func NewEtcdElectionFromClient(ctx context.Context, cli *clientv3.Client, cfg EtcdElectionConfig, ttl int) (*EtcdElection, error)
NewEtcdElectionFromClient creates a production EtcdElection from a real etcd client. It creates a new session and election under cfg.ElectionKey.
func (*EtcdElection) Campaign ¶
func (ee *EtcdElection) Campaign(ctx context.Context, runID string) (FencingToken, error)
Campaign blocks until this node wins the election, then records the fencing token and sets IsLeader. The per-run runID is embedded in the leader value (etcd key payload) for sink-side evidence.
func (*EtcdElection) Close ¶
func (ee *EtcdElection) Close() error
Close releases the underlying session.
func (*EtcdElection) CurrentEpoch ¶
func (ee *EtcdElection) CurrentEpoch() uint64
CurrentEpoch returns the most recently issued fencing epoch.
func (*EtcdElection) CurrentToken ¶
func (ee *EtcdElection) CurrentToken() FencingToken
CurrentToken returns the most recently issued fencing token.
func (*EtcdElection) IsLeader ¶
func (ee *EtcdElection) IsLeader() bool
IsLeader returns true if this node currently holds the etcd election.
func (*EtcdElection) LeaderID ¶
func (ee *EtcdElection) LeaderID() string
LeaderID returns the current leader node ID (empty if unknown).
func (*EtcdElection) QueryLeader ¶
func (ee *EtcdElection) QueryLeader(ctx context.Context) (LeaderValue, error)
QueryLeader reads the current leader's value from etcd and parses it. Returns an error if no leader exists.
type EtcdElectionBackend ¶
type EtcdElectionBackend interface {
// Campaign blocks until this candidate wins the election, then writes value
// as the leader value.
Campaign(ctx context.Context, value string) error
// Resign gives up leadership.
Resign(ctx context.Context) error
// Leader returns the current leader's KV, or an error if there is none.
Leader(ctx context.Context) (*clientv3.GetResponse, error)
// Observe returns a channel that fires whenever leadership changes.
Observe(ctx context.Context) <-chan clientv3.GetResponse
// LeaseID returns the underlying session lease ID (used in FencingToken).
LeaseID() clientv3.LeaseID
// Close releases the session.
Close() error
}
EtcdElectionBackend is the injectable seam that isolates EtcdElection from the real clientv3/concurrency library during unit tests. The concurrency.Session + concurrency.Election pair satisfies this interface in production; a fake satisfies it in unit tests.
type EtcdElectionConfig ¶
type EtcdElectionConfig struct {
// LocalID uniquely identifies this candidate node.
LocalID string
// ElectionKey is the etcd key prefix under which the election runs,
// e.g. "/helix/election/scheduler".
ElectionKey string
// EpochKey is the etcd key used to store the durable fencing epoch counter,
// e.g. "/helix/election/scheduler/epoch". If empty, defaults to ElectionKey+"/epoch".
EpochKey string
}
EtcdElectionConfig configures an EtcdElection.
type EtcdEpochStore ¶
type EtcdEpochStore interface {
// IncrementAndGet atomically increments the epoch stored at epochKey and
// returns the new value. It MUST be strictly monotonic.
IncrementAndGet(ctx context.Context, epochKey string) (uint64, error)
// Get reads the current epoch (0 if not yet written).
Get(ctx context.Context, epochKey string) (uint64, error)
}
EtcdEpochStore persists and retrieves the monotonic fencing epoch in etcd. This makes the epoch durable across restarts: a new leader reads the current epoch from etcd and increments it, so the token is guaranteed to be strictly greater even after a crash-restart cycle.
type FencingToken ¶
type FencingToken struct {
// HolderID is the node that acquired the lease.
HolderID string
// Epoch is the monotonic fencing number. Strictly increases across the whole
// election lifetime; never reused.
Epoch uint64
// Expiry is the clock time at which this lease becomes invalid.
Expiry time.Time
}
FencingToken is the monotonically increasing credential issued on each successful leadership acquisition. Consumers attach the token they observed from a leader to their writes; a downstream resource that tracks the highest epoch it has seen can reject any request carrying a strictly lower epoch, fencing out a stale (e.g. paused or partitioned) former leader.
func (FencingToken) IsStale ¶
func (t FencingToken) IsStale(highestSeen uint64) bool
IsStale reports whether this token is older than highestSeen, i.e. a newer leadership epoch has been issued since this token. A consumer that rejects stale tokens prevents a former leader from acting after losing leadership.
type LeaderValue ¶
type LeaderValue struct {
HolderID string `json:"holder_id"`
Epoch uint64 `json:"epoch"`
RunID string `json:"run_id"` // per-run UUID for anti-bluff evidence
}
LeaderValue is the JSON payload written to etcd by the winning candidate. Embedding the per-run UUID and epoch here makes the sink-side value inspectable.
type LeaseRegistry ¶
type LeaseRegistry struct {
// contains filtered or unexported fields
}
LeaseRegistry is the shared split-brain guard. A single registry instance is shared (via SetLeaseRegistry) by all Election instances that contend for the SAME logical leadership. It guarantees that at most one holder owns a valid (un-expired) lease at any clock instant, and that each granted lease carries a strictly increasing epoch — so two nodes can never both hold a valid lease with the same epoch.
func NewLeaseRegistry ¶
func NewLeaseRegistry() *LeaseRegistry
NewLeaseRegistry creates an empty registry.
type Option ¶
type Option func(*Election)
Option configures an Election at construction time. Options are additive and optional, preserving the original single-argument NewElection call sites.