retry

package

Version: v0.0.0-...-66343a0 | Published: May 14, 2024 | License: BSD-3-Clause | Imports: 7 | Imported by: 0

Repository

github.com/enfabrica/enkit

README ¶

Overview

The retry library is a small Go library for implementing retry logic in a simple, configurable, and reliable way.

For example, let's say you have a Scrape() function that scrapes content from a remote website: func Scrape() error. Scraping fails at times, and you want to retry this function up to 10 times, waiting 1 second between attempts. You can write:

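import (
	"fmt"
	"time"

	"github.com/enfabrica/enkit/lib/retry"
)

// DoWork returns an error so that a scraping failure can be propagated.
func DoWork() error {
	...
	// WithWait sets the delay between attempts (see the API reference below).
	if err := retry.New(retry.WithAttempts(10), retry.WithWait(1*time.Second)).Run(Scrape); err != nil {
		return fmt.Errorf("scraping failed after 10 attempts: %w", err)
	}
	return nil
}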

The main features of the retry library are:

Fuzzes delays by default (configurable), which is important to avoid the thundering herd problem in large systems.
Lets you configure the attempts, delay, logger, fuzzy time, random number generator, log message, and time source, which simplifies testing.
Captures the last n errors (configurable) in a multierror, for user-friendly messages as well as easy processing of the errors.
Lets the function stop retries by returning Fatal(err).
Lets you implement retry logic in functions that cannot block or sleep, by invoking Once (instead of Run) and rescheduling the call later.
Gives access to the original errors returned by the function through the standard errors.Unwrap, errors.As, or errors.Is, and wraps errors so that you can distinguish between a fatal error returned by the function (FatalError) and exhausted attempts (ExaustedError).
Lets you parse the retry parameters from the command line. See the example below.

Command line example:

import (
	...
	"github.com/enfabrica/enkit/lib/retry"
	...
	"github.com/enfabrica/enkit/lib/kflags"
	...
	"flag"
	"log"
)

func main() {
	retryFlags := retry.DefaultFlags()

	// "scrape-" is a prefix to give to the added flags.
	//
	// If using cobra, you can use &kcobra.FlagSet{FlagSet: ...} instead, from
	// github.com/enfabrica/enkit/lib/kflags/kcobra.
	retryFlags.Register(&kflags.GoFlagSet{FlagSet: flag.CommandLine}, "scrape-")
	...
	flag.Parse()
	...

	if err := retry.New(retry.FromFlags(retryFlags)).Run(func() error {
		return Scrape()
	}); err != nil {
		// Fatalf (not Fatal) is needed for the %v format verb to be applied.
		log.Fatalf("scrape failed: %v", err)
	}
}

In the example above, running the command with --help shows a few extra flags, such as --scrape-retry-at-most, --scrape-retry-max-errors, --scrape..., as defined by the Register method of retry.Flags.
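
To make the prefix mechanism concrete, here is a minimal sketch that uses only the standard flag package. It is not the library's code: retryFlags, register, and parseScrapeFlags are hypothetical stand-ins, the default values and help strings are assumptions, and only the two flag names spelled out above are registered.

```go
package main

import "flag"

// retryFlags is a hypothetical stand-in holding two of the documented settings.
type retryFlags struct {
	AtMost    int
	MaxErrors int
}

// register adds the flags to fs, prepending prefix to each flag name.
// The flag names mirror the ones mentioned in the text; the help strings
// are assumptions made for this sketch.
func (f *retryFlags) register(fs *flag.FlagSet, prefix string) *retryFlags {
	fs.IntVar(&f.AtMost, prefix+"retry-at-most", f.AtMost, "maximum number of attempts")
	fs.IntVar(&f.MaxErrors, prefix+"retry-max-errors", f.MaxErrors, "maximum number of errors to keep")
	return f
}

// parseScrapeFlags registers the flags with the "scrape-" prefix and parses args.
// The defaults (5 and 10) are arbitrary values chosen for illustration.
func parseScrapeFlags(args []string) (*retryFlags, error) {
	fs := flag.NewFlagSet("scrape", flag.ContinueOnError)
	flags := (&retryFlags{AtMost: 5, MaxErrors: 10}).register(fs, "scrape-")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return flags, nil
}
```

With this sketch, parseScrapeFlags([]string{"--scrape-retry-at-most=3"}) sets AtMost to 3 and leaves MaxErrors at its default.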

Documentation

All the documentation is available on pkg.go.dev.

Documentation ¶

Overview ¶

A simple library to safely retry operations.

Whenever you have an operation that can temporarily fail, your code should have a strategy to retry the operation.

But retrying is nontrivial: before retrying, your code should wait some time. It should not retry forever, and should eventually give up. It should let you handle fatal errors and temporary errors differently. If the request blocks for a long time before failing, it should probably take that time into account when deciding how long to wait before retrying.

If you are writing an application that is talking with a remote endpoint and will be running on a large number of machines, your code should also try to randomize the retry interval over a period of time, so that if a remote endpoint experiences an outage, and all clients try to reconnect, they don't all reconnect at the same time. This is important to prevent the "thundering herd" problem, which could overload the remote backend, further prolonging the outage.

To use the retry library:

1) Create a `retry.Options` object, like:

options := retry.New(retry.WithWait(5 * time.Second), retry.WithAttempts(10))

2) Run some code:

options.Run(func () error {
  ...
})

The retry library will run your function as many times as configured, until it returns no error, or until it returns a retry.FatalError (use retry.Fatal to create one) or an error wrapping a retry.FatalError (see the errors package and its wrapping/unwrapping logic).
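
The stop conditions can be illustrated with a minimal, standard-library-only sketch of such a loop. This is not the library's implementation: runWithRetry, fatalError, and errTransient are hypothetical names, and the sketch omits fuzzy delays, logging, and error collection.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a sample temporary error used to exercise the loop.
var errTransient = errors.New("transient failure")

// fatalError is a local stand-in for a non-retryable error. It exposes the
// cause through Unwrap so that errors.Is and errors.As can see through it.
type fatalError struct{ cause error }

func (f *fatalError) Error() string { return "fatal: " + f.cause.Error() }
func (f *fatalError) Unwrap() error { return f.cause }

// runWithRetry calls fn up to attempts times (attempts must be at least 1),
// sleeping wait between failed attempts. It stops early when fn succeeds or
// when fn returns an error that is, or wraps, a fatalError.
func runWithRetry(attempts int, wait time.Duration, fn func() error) error {
	var last error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(wait)
		}
		last = fn()
		if last == nil {
			return nil
		}
		var fatal *fatalError
		if errors.As(last, &fatal) {
			return last
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, last)
}
```

A function that fails twice and then succeeds is called three times. A function that returns a fatalError is called once.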

Index ¶

Variables
type ExaustedError
- func (ee *ExaustedError) Error() string
- func (ee *ExaustedError) Unwrap() error
type FatalError
- func Fatal(err error) *FatalError
- func (s *FatalError) Error() string
- func (s *FatalError) Unwrap() error
type Flags
- func DefaultFlags() *Flags
- func (fl *Flags) Register(set kflags.FlagSet, prefix string) *Flags
type Modifier
type Modifiers
- func (mods Modifiers) Apply(o *Options) *Options
type Options
- func New(mods ...Modifier) *Options
type TimeSource

Constants ¶

This section is empty.

Variables ¶

var Nil = &Options{
	logger: logger.Nil,
	Now:    time.Now,
	Flags: Flags{
		AtMost: 1,
	},
}

Nil is a set of retry options that performs a single attempt (AtMost is 1).

This is useful whenever you have an object that requires a retry config, but you want the operation to be attempted only once.

Functions ¶

This section is empty.

Types ¶

type ExaustedError ¶

type ExaustedError struct {
	// Message is a human readable error message, returned by Error().
	Message string
	// Original is a multierror.MultiError containing the last MaxErrors errors.
	Original error
}

ExaustedError is returned when the retrier has exhausted all attempts.

func (*ExaustedError) Error ¶

func (ee *ExaustedError) Error() string

func (*ExaustedError) Unwrap ¶

func (ee *ExaustedError) Unwrap() error

type FatalError ¶

type FatalError struct {
	Original error
}

func Fatal ¶

func Fatal(err error) *FatalError

Fatal turns a normal error into a fatal error.

Fatal errors will stop the retrier immediately. Fatal errors implement the Unwrap() API, allowing the use of errors.Is, errors.As, and errors.Unwrap.

func (*FatalError) Error ¶

func (s *FatalError) Error() string

func (*FatalError) Unwrap ¶

func (s *FatalError) Unwrap() error

type Flags ¶

type Flags struct {
	// How many times to retry the operation, at most.
	AtMost int
	// How long to wait between attempts.
	Wait time.Duration
	// How much of a random retry time to add.
	Fuzzy time.Duration
	// How many errors to store at most.
	MaxErrors int
}

func DefaultFlags ¶

func DefaultFlags() *Flags

func (*Flags) Register ¶

func (fl *Flags) Register(set kflags.FlagSet, prefix string) *Flags

type Modifier ¶

type Modifier func(*Options)

func FromFlags ¶

func FromFlags(fl *Flags) Modifier

FromFlags configures a retry object from command line flags.

func WithAttempts ¶

func WithAttempts(atmost int) Modifier

WithAttempts configures the number of attempts to perform.

func WithDescription ¶

func WithDescription(desc string) Modifier

WithDescription adds text used for logging, to distinguish one retry attempt from another.

If no description is provided, and retry fails, you will get a generic log entry like:

attempt #1 - FAILED - This is the string error received

If you provide a description instead, you will get a log entry:

attempt #1 Your description goes here - FAILED - This is the string error received

func WithFuzzy ¶

func WithFuzzy(fuzzy time.Duration) Modifier

WithFuzzy introduces a random offset from 0 to fuzzy time in between connection attempts.

This is very important in distributed environments, to avoid connection storms or overload because of a failure.

For example: let's say that you have 10,000 workers, connected to a server. The server crashes at 2pm. With no fuzzy time, all the 10,000 workers will likely try to reconnect at exactly the same time.

If you set fuzzy time to 10 seconds, a random retry time up to 10 seconds will be added to the normal retry time.

This will cause the server to process roughly 1,000 requests per second, rather than 10,000.
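
The arithmetic above, and the idea of adding a random offset, can be sketched as follows. This is an illustration only, not the library's code: fuzzyDelay, averageRate, and newSeededRng are hypothetical helpers, and the sketch assumes the offset is drawn uniformly from [0, fuzzy).

```go
package main

import (
	"math/rand"
	"time"
)

// fuzzyDelay returns wait plus a random offset in [0, fuzzy).
// A non-positive fuzzy disables the randomization.
func fuzzyDelay(rng *rand.Rand, wait, fuzzy time.Duration) time.Duration {
	if fuzzy <= 0 {
		return wait
	}
	return wait + time.Duration(rng.Int63n(int64(fuzzy)))
}

// averageRate estimates the requests per second seen by the server when
// workers reconnect at times spread uniformly over the fuzzy window.
// With no window, all workers reconnect together.
func averageRate(workers int, fuzzy time.Duration) float64 {
	if fuzzy <= 0 {
		return float64(workers)
	}
	return float64(workers) / fuzzy.Seconds()
}

// newSeededRng returns a deterministic generator, convenient in tests.
func newSeededRng(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}
```

For 10,000 workers and a fuzzy time of 10 seconds, averageRate returns 1000, matching the figure in the text.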

func WithLogger ¶

func WithLogger(log logger.Logger) Modifier

WithLogger configures a logger to send log messages to.

func WithRng ¶

func WithRng(rng *rand.Rand) Modifier

WithRng sets a random number generator to use. If not set, it uses the default math/rand generator. Convenient for testing, or to set a seeded / secure global generator.

func WithTimeSource ¶

func WithTimeSource(ts TimeSource) Modifier

WithTimeSource configures a different clock.

func WithWait ¶

func WithWait(duration time.Duration) Modifier

WithWait sets how long to wait between attempts.

Note that retry counts the wait time from the moment the last attempt was started.

Let's say you use retry to connect to a remote server. You set the Wait time to 10 seconds. The connection succeeds at 2pm. At 3pm, one hour later, the connection fails, and retry kicks in. Retry will retry *immediately*, as more than 10 seconds have already passed since the last connection attempt.

The server is now down, and connecting fails in 5 seconds. Retry will wait 5 more seconds to reconnect.

In general, make sure that your Wait time is greater than the timeout configured for whatever operation is attempted, otherwise it will almost always reconnect immediately.

Another way to look at it: the Wait time guarantees that there is no more than one attempt at the operation within the Wait time.
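
The timing in this example reduces to a small computation, sketched below. The helper names remainingWait and remainingSince are hypothetical, and the sketch ignores any fuzzy offset.

```go
package main

import "time"

// remainingWait returns how much longer to wait before the next attempt,
// given how much time has elapsed since the previous attempt started.
func remainingWait(wait, elapsed time.Duration) time.Duration {
	if elapsed >= wait {
		return 0 // The wait already passed: retry immediately.
	}
	return wait - elapsed
}

// remainingSince is the same computation expressed with timestamps.
func remainingSince(wait time.Duration, started, now time.Time) time.Duration {
	return remainingWait(wait, now.Sub(started))
}
```

With a Wait of 10 seconds, an attempt that started one hour ago yields a remaining wait of 0, while an attempt that started 5 seconds ago yields 5 seconds.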

type Modifiers ¶

type Modifiers []Modifier

func (Modifiers) Apply ¶

func (mods Modifiers) Apply(o *Options) *Options

type Options ¶

type Options struct {

	// How to read time.
	Now TimeSource

	Flags
	// contains filtered or unexported fields
}

Options are all the options that the Retry functions accept.

func New ¶

func New(mods ...Modifier) *Options

New creates a new retry object.

func (*Options) Delay ¶

func (o *Options) Delay() time.Duration

Delay computes how long to wait before the next attempt.

If Fuzzy is non-zero, the delay is fuzzed by a random amount less than the value of Fuzzy.

func (*Options) DelaySince ¶

func (o *Options) DelaySince(start time.Time) time.Duration

DelaySince computes how much longer to wait since a start time.

DelaySince assumes that a wait started at the start time, and computes how much longer the code still has to wait, based on a delay computed with the Delay() function.

func (*Options) Once ¶

func (o *Options) Once(attempt int, runner func() error) (time.Duration, error)

Once runs the specified function once as if it was run by Run().

attempt is the attempt number, that is, how many times the runner was invoked before. runner is the function to invoke.

Once returns the error returned by the supplied runner. If the runner fails, Once also logs messages as specified by Options, exactly like Run() would, and computes a delay indicating how long to wait before running this function again.

Once is useful in non-blocking or multithreaded code, when you cannot afford to block an entire goroutine until the function completes, but you still want to implement reasonable retry semantics based on this library.

Typically, your code will invoke Once() to run the runner, and in case of failure, re-schedule it to run later.
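
The re-scheduling pattern can be sketched with time.AfterFunc from the standard library. This is a hedged illustration, not the library's API: once here is a simplified local stand-in that returns a fixed delay, and scheduleRetries and errNotReady are hypothetical names.

```go
package main

import (
	"errors"
	"time"
)

// errNotReady is a sample temporary error used to exercise the scheduler.
var errNotReady = errors.New("not ready")

// once is a simplified stand-in for a single non-blocking attempt: it runs
// runner and, on failure, reports how long to wait before trying again.
func once(wait time.Duration, runner func() error) (time.Duration, error) {
	if err := runner(); err != nil {
		return wait, err
	}
	return 0, nil
}

// scheduleRetries runs runner without blocking the caller. Each failure
// re-schedules the next attempt with time.AfterFunc. The returned channel
// receives the final result, after success or after atMost attempts.
func scheduleRetries(atMost int, wait time.Duration, runner func() error) <-chan error {
	done := make(chan error, 1) // Buffered so the timer goroutine never blocks.
	var try func(attempt int)
	try = func(attempt int) {
		delay, err := once(wait, runner)
		if err == nil || attempt+1 >= atMost {
			done <- err
			return
		}
		time.AfterFunc(delay, func() { try(attempt + 1) })
	}
	time.AfterFunc(0, func() { try(0) })
	return done
}
```

The caller can continue with other work and read the result from the returned channel when convenient.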

func (*Options) OnceAttempt ¶

func (o *Options) OnceAttempt(attempt int, runner func(attempt int) error) (time.Duration, error)

OnceAttempt is just like Once, but invokes a runner that expects an attempt #.

OnceAttempt is to Once what RunAttempt is to Run. Read the documentation for RunAttempt and Once for details.

func (*Options) Run ¶

func (o *Options) Run(runner func() error) error

Run runs the function specified until it succeeds.

Run will keep retrying the function until either the function returns a nil error, it returns a FatalError, or all retry attempts as specified in Options are exhausted.

When Run gives up running a function, it returns the original error returned by the function, wrapped into an ExaustedError.

You can use errors.As, errors.Is, or errors.Unwrap to retrieve the original error.

func (*Options) RunAttempt ¶

func (o *Options) RunAttempt(runner func(attempt int) error) error

RunAttempt is just like Run, but propagates the attempt #.

Use RunAttempt when your callback needs to know how many attempts have been made so far. This is useful, for example, to log an extra message every x attempts, to re-initialize state on any attempt after the first, or to try harder after a number of attempts.

type TimeSource ¶

type TimeSource func() time.Time

TimeSource is a function returning the current time. Generally, it should be set to time.Now. Mainly used for testing.
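
For testing, a manually advanced clock can stand in for time.Now. The sketch below is an assumption-laden illustration: timeSource mirrors the shape of the TimeSource type, while fakeClock, newFakeClock, and elapsedSince are hypothetical helpers that are not part of the library.

```go
package main

import "time"

// timeSource mirrors the shape of the TimeSource type documented above.
type timeSource func() time.Time

// fakeClock is a manually advanced clock for deterministic tests.
type fakeClock struct{ now time.Time }

// newFakeClock starts the clock at the Unix epoch.
func newFakeClock() *fakeClock { return &fakeClock{now: time.Unix(0, 0)} }

// Now returns the current fake time. Its method value matches timeSource.
func (c *fakeClock) Now() time.Time { return c.now }

// Advance moves the fake time forward by d.
func (c *fakeClock) Advance(d time.Duration) { c.now = c.now.Add(d) }

// elapsedSince measures elapsed time through an injected time source, so
// tests control the clock instead of sleeping.
func elapsedSince(now timeSource, start time.Time) time.Duration {
	return now().Sub(start)
}
```

A test can pass clock.Now wherever a time source is expected, call Advance to simulate the passage of time, and check the computed delays without real waiting.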

Source Files ¶

retry.go
