httprc

v1.0.4
Published: Jul 19, 2022 License: MIT Imports: 10 Imported by: 7

README

httprc

httprc is an HTTP "Refresh" Cache. Its aim is to cache a remote resource that can be fetched via HTTP, while keeping the cached content up-to-date through periodic refreshing.

SYNOPSIS

package httprc_test

import (
  "context"
  "fmt"
  "net/http"
  "net/http/httptest"
  "sync"
  "time"

  "github.com/lestrrat-go/httprc"
)

const (
  helloWorld   = `Hello World!`
  goodbyeWorld = `Goodbye World!`
)

func ExampleCache() {
  var mu sync.RWMutex

  msg := helloWorld

  srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
    w.Header().Set(`Cache-Control`, fmt.Sprintf(`max-age=%d`, 2))
    w.WriteHeader(http.StatusOK)
    mu.RLock()
    fmt.Fprint(w, msg)
    mu.RUnlock()
  }))
  defer srv.Close()

  ctx, cancel := context.WithCancel(context.Background())
  defer cancel()

  errSink := httprc.ErrSinkFunc(func(err error) {
    fmt.Printf("%s\n", err)
  })

  c := httprc.NewCache(ctx,
    httprc.WithErrSink(errSink),
    httprc.WithRefreshWindow(time.Second), // force checks every second
  )

  c.Register(srv.URL,
    httprc.WithHTTPClient(srv.Client()),        // we need client with TLS settings
    httprc.WithMinRefreshInterval(time.Second), // allow max-age=1 (smallest)
  )

  payload, err := c.Get(ctx, srv.URL)
  if err != nil {
    fmt.Printf("%s\n", err)
    return
  }

  if string(payload.([]byte)) != helloWorld {
    fmt.Printf("payload mismatch: %s\n", payload)
    return
  }

  mu.Lock()
  msg = goodbyeWorld
  mu.Unlock()

  time.Sleep(4 * time.Second)

  payload, err = c.Get(ctx, srv.URL)
  if err != nil {
    fmt.Printf("%s\n", err)
    return
  }

  if string(payload.([]byte)) != goodbyeWorld {
    fmt.Printf("payload mismatch: %s\n", payload)
    return
  }

  cancel()

  // OUTPUT:
}

source: httprc_example_test.go

Sequence Diagram

sequenceDiagram
  autonumber
  actor User
  participant httprc.Cache
  participant httprc.Storage
  User->>httprc.Cache: Fetch URL `u`
  activate httprc.Storage
  httprc.Cache->>httprc.Storage: Fetch local cache for `u`
  alt Cache exists
    httprc.Storage-->>httprc.Cache: Return local cache
    httprc.Cache-->>User: Return data
    Note over httprc.Storage: If the cache exists, there's nothing more to do.<br />The cached content will be updated periodically in httprc.Refresher
    deactivate httprc.Storage
  else Cache does not exist
    activate httprc.Fetcher
    httprc.Cache->>httprc.Fetcher: Fetch remote resource `u`
    httprc.Fetcher-->>httprc.Cache: Return fetched data
    deactivate httprc.Fetcher
    httprc.Cache-->>User: Return data
    httprc.Cache-)httprc.Refresher: Enqueue into auto-refresh queue
    activate httprc.Refresher
    loop Refresh Loop
      Note over httprc.Storage,httprc.Fetcher: Cached contents are updated synchronously
      httprc.Refresher->>httprc.Refresher: Wait until next refresh
      httprc.Refresher-->>httprc.Fetcher: Request fetch
      httprc.Fetcher->>httprc.Refresher: Return fetched data
      httprc.Refresher-->>httprc.Storage: Store new version in cache
      httprc.Refresher->>httprc.Refresher: Enqueue into auto-refresh queue (again)
    end
    deactivate httprc.Refresher
  end
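
The "cache exists / cache does not exist" branch in the diagram can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `miniCache` and `fetchFunc` are hypothetical names introduced here, the refresh loop is omitted, and the lock is held during the fetch purely to keep the sketch short.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc stands in for the remote fetch step (httprc.Fetcher in the diagram).
// It is a hypothetical type used only for this sketch.
type fetchFunc func(u string) ([]byte, error)

// miniCache is a hypothetical, simplified stand-in for the Cache/Storage pair.
type miniCache struct {
	mu      sync.Mutex
	entries map[string][]byte
	fetch   fetchFunc
}

func newMiniCache(fetch fetchFunc) *miniCache {
	return &miniCache{entries: make(map[string][]byte), fetch: fetch}
}

// get returns the stored copy when one exists. Otherwise it fetches the
// resource synchronously, stores the result, and returns it.
func (c *miniCache) get(u string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if data, ok := c.entries[u]; ok {
		return data, nil // cache exists: nothing more to do
	}

	data, err := c.fetch(u) // cache does not exist: fetch remotely
	if err != nil {
		return nil, err
	}
	c.entries[u] = data
	return data, nil
}

func main() {
	calls := 0
	c := newMiniCache(func(u string) ([]byte, error) {
		calls++
		return []byte("payload for " + u), nil
	})

	for i := 0; i < 3; i++ {
		data, err := c.get("https://example.com/resource")
		if err != nil {
			fmt.Println("fetch failed:", err)
			return
		}
		fmt.Printf("%s (remote fetches so far: %d)\n", data, calls)
	}
}
```

Only the first call reaches the fetch function; the later calls are served from the map, which is the behavior the first `alt` branch of the diagram describes.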

Documentation

Overview

Package httprc implements a cache for resources available over HTTP(S). It caches these resources to save HTTP round trips, and it periodically attempts to refresh them once they are cached, based on user-specified intervals and the HTTP `Expires` and `Cache-Control` headers, thus keeping the entries _relatively_ fresh.

Types

type BodyBytes

type BodyBytes struct{}

BodyBytes is the default Transformer applied to all resources. It takes an *http.Response object and extracts the body of the response as `[]byte`.

func (BodyBytes) Transform

func (BodyBytes) Transform(_ string, res *http.Response) (interface{}, error)

type Cache

type Cache struct {
	// contains filtered or unexported fields
}

Cache represents a cache that stores resources locally, while periodically refreshing the contents based on HTTP header values and/or user-supplied hints.

Refresh is performed _periodically_, and therefore the contents are not kept up-to-date in real time. The interval between checks for refreshes is called the refresh window.

The default refresh window is 15 minutes. This means that if a resource is fetched at time T, and it is supposed to be refreshed in 20 minutes, the next refresh for this resource will happen at T+30 minutes (15+15 minutes).


func NewCache

func NewCache(ctx context.Context, options ...CacheOption) *Cache

NewCache creates a new Cache object.

The context object in the argument controls the life-cycle of the auto-refresh worker. If you cancel the `ctx`, then the automatic refresh will stop working.

Refreshes are only performed periodically; the interval between refresh checks is controlled by the `refresh window` setting. For example, if the refresh window is 5 minutes and a resource is queued to be refreshed at 7 minutes, the resource will be refreshed at 10 minutes (that is, at the second refresh window).

The refresh window can be configured by using `httprc.WithRefreshWindow` option. If you want refreshes to be performed more often, provide a smaller refresh window. If you specify a refresh window that is smaller than 1 second, it will automatically be set to the default value, which is 15 minutes.

Internally the HTTP fetching is done using a pool of HTTP fetch workers. The default number of workers is 3. You may change this number by specifying the `httprc.WithFetcherWorkerCount` option.

func (*Cache) Get

func (c *Cache) Get(ctx context.Context, u string) (interface{}, error)

Get returns the cached object.

The context.Context argument is used to control the timeout for synchronous fetches, when they need to happen. Synchronous fetches will be performed when the cache does not contain the specified resource.

func (*Cache) IsRegistered

func (c *Cache) IsRegistered(u string) bool

IsRegistered returns true if the given URL `u` has already been registered in the cache.

func (*Cache) Refresh

func (c *Cache) Refresh(ctx context.Context, u string) (interface{}, error)

Refresh is identical to Get(), except it always fetches the specified resource anew, and updates the cached content.

func (*Cache) Register

func (c *Cache) Register(u string, options ...RegisterOption) error

Register configures a URL to be stored in the cache.

For any given URL, the URL must be registered _BEFORE_ it is accessed using the `Get()` method.

func (*Cache) Snapshot

func (c *Cache) Snapshot() *Snapshot

func (*Cache) Unregister

func (c *Cache) Unregister(u string) error

Unregister removes the given URL `u` from the cache.

Subsequent calls to `Get()` will fail until `u` is registered again.

type CacheOption

type CacheOption interface {
	Option
	// contains filtered or unexported methods
}

CacheOption describes options that can be passed to `NewCache()`

func WithErrSink

func WithErrSink(v ErrSink) CacheOption

WithErrSink specifies the `httprc.ErrSink` object that handles errors that occurred during the cache's execution. For example, you will be able to intercept errors that occurred during the execution of Transformers.

func WithRefreshWindow

func WithRefreshWindow(v time.Duration) CacheOption

WithRefreshWindow specifies the interval between checks for refreshes. `httprc.Cache` does not check for refreshes in exact intervals. Instead, it wakes up at every tick that occurs in the interval specified by `WithRefreshWindow` option, and refreshes all entries that need to be refreshed within this window.

The default value is 15 minutes.

You generally do not want to make this value too small, as it can easily be considered a DoS attack, and there is no backoff mechanism for failed attempts.

type ErrSink

type ErrSink interface {
	// Error accepts errors produced during the cache queue's execution.
	// The method should never block, otherwise the fetch loop may be
	// paused for a prolonged amount of time.
	Error(error)
}

ErrSink is an abstraction that allows users to consume errors produced while the cache queue is running.

type ErrSinkFunc

type ErrSinkFunc func(err error)

func (ErrSinkFunc) Error

func (f ErrSinkFunc) Error(err error)

type FetchFetcherRegisterOption

type FetchFetcherRegisterOption interface {
	Option
	// contains filtered or unexported methods
}

func WithWhitelist

func WithWhitelist(v Whitelist) FetchFetcherRegisterOption

WithWhitelist specifies the Whitelist object that can control which URLs are allowed to be processed.

It can be passed to `httprc.NewCache` as a whitelist applied to all URLs that are fetched by the cache, or it can be passed on a per-URL basis using `(httprc.Cache).Register()`. If both are specified, the URL must satisfy _both_ the cache-wide whitelist and the per-URL whitelist.
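
The "both whitelists must allow the URL" rule can be sketched with plain predicates. The helpers below are hypothetical and use only the standard library; they have the `func(string) bool` shape that `httprc.WhitelistFunc` is declared with, but they do not show how the library combines whitelists internally.

```go
package main

import (
	"fmt"
	"net/url"
)

// allowHTTPSHosts returns a predicate that accepts only https URLs whose
// host is in the given list. It is a hypothetical helper for this sketch.
func allowHTTPSHosts(hosts ...string) func(string) bool {
	allowed := make(map[string]struct{}, len(hosts))
	for _, h := range hosts {
		allowed[h] = struct{}{}
	}
	return func(u string) bool {
		parsed, err := url.Parse(u)
		if err != nil || parsed.Scheme != "https" {
			return false
		}
		_, ok := allowed[parsed.Hostname()]
		return ok
	}
}

// both mirrors the documented rule: a URL is fetched only if the cache-wide
// check and the per-URL check both allow it.
func both(cacheWide, perURL func(string) bool) func(string) bool {
	return func(u string) bool {
		return cacheWide(u) && perURL(u)
	}
}

func main() {
	cacheWide := allowHTTPSHosts("example.com", "example.org")
	perURL := allowHTTPSHosts("example.com")
	check := both(cacheWide, perURL)

	for _, u := range []string{
		"https://example.com/keys",
		"https://example.org/keys",
		"http://example.com/keys",
	} {
		fmt.Printf("%s allowed=%t\n", u, check(u))
	}
}
```

Only the first URL passes: the second is rejected by the per-URL check, and the third is rejected because it is not https.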

type FetchOption

type FetchOption interface {
	Option
	// contains filtered or unexported methods
}

FetchOption describes options that can be passed to `(httprc.Fetcher).Fetch()`

type FetchRegisterOption

type FetchRegisterOption interface {
	Option
	// contains filtered or unexported methods
}

func WithHTTPClient

func WithHTTPClient(v HTTPClient) FetchRegisterOption

WithHTTPClient specifies the HTTP Client object that should be used to fetch the resource. For example, if you need an `*http.Client` instance that requires special TLS or Authorization setup, you might want to pass it using this option.

type Fetcher

type Fetcher interface {
	Fetch(context.Context, string, ...FetchOption) (*http.Response, error)
	// contains filtered or unexported methods
}

func NewFetcher

func NewFetcher(ctx context.Context, options ...FetcherOption) Fetcher

type FetcherOption

type FetcherOption interface {
	Option
	// contains filtered or unexported methods
}

FetcherOption describes options that can be passed to `httprc.NewFetcher()`

func WithFetcherWorkerCount

func WithFetcherWorkerCount(v int) FetcherOption

WithFetcherWorkerCount specifies the number of HTTP fetch workers that are spawned in the backend. By default 3 workers are spawned.

type HTTPClient

type HTTPClient interface {
	Get(string) (*http.Response, error)
}

HTTPClient is an abstraction over the HTTP client used to fetch resources. Any type with a `Get(string) (*http.Response, error)` method, such as `*http.Client`, satisfies it.

type InsecureWhitelist

type InsecureWhitelist struct{}

InsecureWhitelist allows any URLs to be fetched.

func (InsecureWhitelist) IsAllowed

func (InsecureWhitelist) IsAllowed(string) bool

type MapWhitelist

type MapWhitelist struct {
	// contains filtered or unexported fields
}

MapWhitelist is an httprc.Whitelist object composed of a map of strings. If the URL exists in the map, then the URL is allowed to be fetched.

func NewMapWhitelist

func NewMapWhitelist() *MapWhitelist

func (*MapWhitelist) Add

func (w *MapWhitelist) Add(pat string) *MapWhitelist

func (*MapWhitelist) IsAllowed

func (w *MapWhitelist) IsAllowed(u string) bool

type Option

type Option = option.Interface

type RefreshError

type RefreshError struct {
	URL string
	Err error
}

RefreshError is the underlying error type that is sent to the `httprc.ErrSink` objects

func (*RefreshError) Error

func (re *RefreshError) Error() string

type RegexpWhitelist

type RegexpWhitelist struct {
	// contains filtered or unexported fields
}

RegexpWhitelist is an httprc.Whitelist object composed of a list of *regexp.Regexp objects. All entries in the list are tried until one matches. If none of the *regexp.Regexp objects match, then the URL is not allowed.

func NewRegexpWhitelist

func NewRegexpWhitelist() *RegexpWhitelist

func (*RegexpWhitelist) Add

func (*RegexpWhitelist) IsAllowed

func (w *RegexpWhitelist) IsAllowed(u string) bool

IsAllowed returns true if any of the patterns in the whitelist matches the given URL.

type RegisterOption

type RegisterOption interface {
	Option
	// contains filtered or unexported methods
}

RegisterOption describes options that can be passed to `(httprc.Cache).Register()`

func WithMinRefreshInterval

func WithMinRefreshInterval(v time.Duration) RegisterOption

WithMinRefreshInterval specifies the minimum refresh interval to be used.

When we fetch the resource from a remote URL, we first look at the `max-age` directive from the `Cache-Control` response header. If this value is present, we compare the `max-age` value and the value specified by this option and take the larger one (e.g. if `max-age` = 5 minutes and `min refresh` = 10 minutes, then the next fetch will happen in 10 minutes).

Next we check for the `Expires` header, and similarly if the header is present, we compare it against the value specified by this option, and take the larger one.

Finally, if neither of the above headers is present, we use the value specified by this option as the interval until the next refresh.

If unspecified, the minimum refresh interval is 1 hour.

This value and the header values are ignored if `WithRefreshInterval` is specified.

func WithRefreshInterval

func WithRefreshInterval(v time.Duration) RegisterOption

WithRefreshInterval specifies the static interval between refreshes of resources controlled by `httprc.Cache`.

Providing this option overrides the adaptive refresh scheduling based on the Cache-Control/Expires headers (and `httprc.WithMinRefreshInterval`), and refreshes will *always* happen at this interval.

You generally do not want to make this value too small, as it can easily be considered a DoS attack, and there is no backoff mechanism for failed attempts.

func WithTransformer

func WithTransformer(v Transformer) RegisterOption

WithTransformer specifies the `httprc.Transformer` object that should be applied to the fetched resource. The `Transform()` method is only called if the HTTP request returns a `200 OK` status.

type Snapshot

type Snapshot struct {
	Entries []SnapshotEntry `json:"entries"`
}

type SnapshotEntry

type SnapshotEntry struct {
	URL         string      `json:"url"`
	Data        interface{} `json:"data"`
	LastFetched time.Time   `json:"last_fetched"`
}

type TransformFunc

type TransformFunc func(string, *http.Response) (interface{}, error)

func (TransformFunc) Transform

func (f TransformFunc) Transform(u string, res *http.Response) (interface{}, error)

type Transformer

type Transformer interface {
	// Transform receives an HTTP response object, and should
	// return an appropriate object that suits your needs.
	//
	// If you happen to use the response body, you are responsible
	// for closing the body
	Transform(string, *http.Response) (interface{}, error)
}

Transformer is responsible for converting an HTTP response into an appropriate form of your choosing.

type Whitelist

type Whitelist interface {
	IsAllowed(string) bool
}

Whitelist is an interface for a set of URL whitelists. When provided to fetching operations, URLs are checked against this object, and the object must return true for a URL to be fetched.

type WhitelistFunc

type WhitelistFunc func(string) bool

WhitelistFunc is an httprc.Whitelist object based on a function. You can perform any sort of check against the given URL to determine if it can be fetched or not.

func (WhitelistFunc) IsAllowed

func (w WhitelistFunc) IsAllowed(u string) bool
