galaxycache

v0.0.0-...-60f325a
Published: Dec 11, 2021 License: Apache-2.0 Imports: 17 Imported by: 1

README

galaxycache

galaxycache is a caching and cache-filling library, adapted from groupcache, intended as a replacement for memcached in many cases.

For API docs and examples, see http://godoc.org/github.com/localrivet/galaxycache

Quick Start

Initializing a peer
// Generate the protocol for this peer to Fetch from others with (package includes HTTP and gRPC)
grpcProto := NewGRPCFetchProtocol(grpc.WithInsecure())

// HTTP protocol as an alternative (passing the nil argument ensures use of the default basepath
// and opencensus Transport as an http.RoundTripper)
httpProto := NewHTTPFetchProtocol(nil)

// Create a new Universe with the chosen peer connection protocol and the URL of this process
u := NewUniverse(grpcProto, "my-url")

// Set the Universe's list of peer addresses for the distributed cache
u.Set("peer1-url", "peer2-url", "peer3-url")

// Define a BackendGetter (here as a function) for retrieving data
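getter := GetterFunc(func(ctx context.Context, key string, dest Codec) error {
   // Define your method for retrieving non-cached data here, e.g. from a database.
   // The placeholder value below is for illustration only; unmarshal the real data into dest.
   return dest.UnmarshalBinary([]byte("value for " + key))
})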

// Create a new Galaxy within the Universe with a name, the max capacity of cache space you would
// like to allocate, and your BackendGetter
g := u.NewGalaxy("galaxy-1", 1 << 20, getter)

// In order to receive Fetch requests from peers over HTTP or gRPC, we must register this universe
// to handle those requests

// gRPC Server registration (note: you must create the server with an ocgrpc.ServerHandler for
// opencensus metrics to propagate properly)
grpcServer := grpc.NewServer(grpc.StatsHandler(&ocgrpc.ServerHandler{}))
RegisterGRPCServer(u, grpcServer)

// HTTP Handler registration (passing nil for the second argument will ensure use of the default 
// basepath, passing nil for the third argument will ensure use of the DefaultServeMux wrapped 
// by opencensus)
RegisterHTTPHandler(u, nil, nil)

// Refer to the http/grpc godocs for information on how to serve using the registered HTTP handler
// or gRPC server, respectively:
// HTTP: https://golang.org/pkg/net/http/#Handler
// gRPC: https://godoc.org/google.golang.org/grpc#Server

Getting a value
// Create a Codec for unmarshaling data into your format of choice - the package includes 
// implementations for []byte and string formats, and the protocodec subpackage includes the 
// protobuf adapter
sCodec := StringCodec{}

// Call Get on the Galaxy to retrieve data and unmarshal it into your Codec
ctx := context.Background()
err := g.Get(ctx, "my-key", &sCodec)
if err != nil {
   // handle if Get returns an error
}

// Shutdown all open connections between peers before killing the process
u.Shutdown()

Concepts and Glossary

Consistent hash determines authority

A consistent hashing algorithm determines the sharding of keys across peers in galaxycache. Further reading can be found here and here.
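
The idea can be illustrated with a minimal ring hash. This is only a sketch under stated assumptions: the ring type and its methods are hypothetical and are not the consistenthash subpackage. The replica count (50) and crc32.ChecksumIEEE are borrowed from the defaults documented under HashOptions; how virtual nodes are named is an arbitrary choice made here.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// ring is a hypothetical, minimal consistent-hash ring (not the library's type).
type ring struct {
	hashes []uint32          // sorted hashes of every virtual node
	owners map[uint32]string // virtual-node hash -> peer
}

// newRing places `replicas` virtual nodes on the ring for each peer.
func newRing(replicas int, peers ...string) *ring {
	r := &ring{owners: make(map[uint32]string)}
	for _, p := range peers {
		for i := 0; i < replicas; i++ {
			h := crc32.ChecksumIEEE([]byte(strconv.Itoa(i) + p))
			r.hashes = append(r.hashes, h)
			r.owners[h] = p
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// owner returns the peer holding the first virtual node at or after the
// key's hash, wrapping around to the start of the ring if necessary.
func (r *ring) owner(key string) string {
	if len(r.hashes) == 0 {
		return ""
	}
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0
	}
	return r.owners[r.hashes[i]]
}

func main() {
	r := newRing(50, "peer1-url", "peer2-url", "peer3-url")
	fmt.Println(r.owner("my-key")) // the same key always maps to the same peer
}
```

The useful property of this layout is that removing a peer only reassigns the keys that peer owned; every other key keeps its owner.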

Universe

To keep galaxycache instances non-global (e.g. for multithreaded testing), a Universe object contains all of the moving parts of the cache, including the logic for connecting to peers, consistent hashing, and maintaining the set of galaxies.

Galaxy

A Galaxy is a grouping of keys based on a category determined by the user. For example, you might have a galaxy for Users and a galaxy for Video Metadata; those data types may require different fetching protocols on the backend -- separating them into different Galaxies allows for this flexibility.

Each Galaxy contains its own cache space. The cache is immutable; all cache population and eviction is handled by internal logic.

Maincache vs Hotcache

The cache within each galaxy is divided into a "maincache" and a "hotcache".

The "maincache" contains data that the local process is authoritative over. The maincache is always populated whenever data is fetched from the backend (with an LRU eviction policy).

In order to eliminate network hops, a portion of the cache space in each process is reserved for especially popular keys that the local process is not authoritative over. By default, a key and its associated data are promoted to this "hotcache" based on a requests-per-second metric. The promotion logic can be customized by supplying your own implementation of promoter.Interface (see WithPromoter).
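
As a rough illustration of what a promotion policy looks like, here is a sketch of two policies behind one interface. All names (keyStats, promoter, qpsPromoter, thresholdPromoter) are hypothetical, and the method signature is an assumption; the real promoter package may expose different types and parameters.

```go
package main

import "fmt"

// keyStats is a hypothetical summary of what a policy might look at.
type keyStats struct {
	KeyQPS        float64 // recent requests per second for the candidate key
	ColdestHotQPS float64 // QPS of the least-requested key already in the hotcache
}

// promoter is a stand-in for a promotion interface (assumed shape).
type promoter interface {
	ShouldPromote(key string, stats keyStats) bool
}

// qpsPromoter promotes a key once it is at least as busy as the coldest
// entry currently held in the hotcache.
type qpsPromoter struct{}

func (qpsPromoter) ShouldPromote(_ string, s keyStats) bool {
	return s.KeyQPS >= s.ColdestHotQPS
}

// thresholdPromoter is a custom policy: promote only above a fixed QPS.
type thresholdPromoter struct{ minQPS float64 }

func (p thresholdPromoter) ShouldPromote(_ string, s keyStats) bool {
	return s.KeyQPS >= p.minQPS
}

func main() {
	var p promoter = thresholdPromoter{minQPS: 10}
	fmt.Println(p.ShouldPromote("my-key", keyStats{KeyQPS: 25})) // prints "true"
}
```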

Step-by-Step Breakdown of a Get()

galaxycache Caching Example Diagram

When Get is called for a key in a Galaxy in some process called Process_A:

  1. The local cache (both maincache and hotcache) in Process_A is checked first
  2. On a cache miss, the PeerPicker object delegates to the peer authoritative over the requested key
  3. What happens next depends on which peer is authoritative over this key:
  • If Process_A is the authority:
    • Process_A uses its BackendGetter to get the data, and populates its local maincache
  • If Process_A is not the authority:
    • Process_A calls Fetch on the authoritative remote peer, Process_B (method determined by FetchProtocol)
    • Process_B then performs a Get to either find the data from its own local cache or use the specified BackendGetter to get the data from elsewhere, such as by querying a database
    • Process_B populates its maincache with the data before serving it back to Process_A
    • Process_A determines whether the key is hot enough to promote to the hotcache
      • If it is, then the hotcache for Process_A is populated with the key/data
  4. The data is unmarshaled into the Codec passed into Get
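
The steps above can be sketched with in-memory maps. This is a simplified model, not the library's code: process, newProcess, and the ownerOf, backend, and isHot callbacks are hypothetical stand-ins for the consistent hash, the BackendGetter, and the promotion logic. The final unmarshaling into a Codec is omitted, and values are plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoOwner = errors.New("no peer owns this key")

// process is a hypothetical stand-in for one peer.
type process struct {
	maincache map[string]string
	hotcache  map[string]string
	peers     map[string]*process              // every process by name, including this one
	ownerOf   func(key string) string          // stand-in for the consistent hash
	backend   func(key string) (string, error) // stand-in for the BackendGetter
	isHot     func(key string) bool            // stand-in for the promotion logic
}

// newProcess builds a process and registers it in the shared peers map.
func newProcess(name string, peers map[string]*process, ownerOf func(string) string,
	backend func(string) (string, error), isHot func(string) bool) *process {
	p := &process{maincache: map[string]string{}, hotcache: map[string]string{},
		peers: peers, ownerOf: ownerOf, backend: backend, isHot: isHot}
	peers[name] = p
	return p
}

func (p *process) get(key string) (string, error) {
	// Step 1: check both local caches.
	if v, ok := p.maincache[key]; ok {
		return v, nil
	}
	if v, ok := p.hotcache[key]; ok {
		return v, nil
	}
	// Step 2: find the peer that is authoritative over the key.
	owner, ok := p.peers[p.ownerOf(key)]
	if !ok {
		return "", errNoOwner
	}
	// Step 3, local authority: load from the backend and fill the maincache.
	if owner == p {
		v, err := p.backend(key)
		if err != nil {
			return "", err
		}
		p.maincache[key] = v
		return v, nil
	}
	// Step 3, remote authority: fetch from the owner, then maybe promote.
	v, err := owner.get(key)
	if err != nil {
		return "", err
	}
	if p.isHot(key) {
		p.hotcache[key] = v
	}
	return v, nil
}

func main() {
	peers := map[string]*process{}
	loads := 0
	ownerOf := func(string) string { return "Process_B" }
	backend := func(key string) (string, error) { loads++; return "value-for-" + key, nil }
	never := func(string) bool { return false }
	a := newProcess("Process_A", peers, ownerOf, backend, never)
	newProcess("Process_B", peers, ownerOf, backend, never)

	v, _ := a.get("my-key")
	a.get("my-key")       // served from Process_B's maincache, no second backend load
	fmt.Println(v, loads) // prints "value-for-my-key 1"
}
```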

Changes from groupcache

Our changes include the following:

  • Overhauled API to improve usability and configurability
  • Improvements to testing by removing global state
  • Improvement to connection efficiency between peers with the addition of gRPC
  • Added a promoter.Interface for choosing which keys get hotcached
  • Made some core functionality more generic (e.g. replaced the Sink object with a Codec marshaler interface, removed Byteview)

New architecture and API

No more global state

  • Removed all global variables to allow for multithreaded testing by implementing a Universe container that holds the Galaxies (previously a global groups map) and PeerPicker (part of what used to be HTTPPool)
  • Added methods to Universe to allow for simpler handling of most galaxycache operations (setting Peers, instantiating a Picker, etc)

New structure for fetching from peers (with gRPC support)

A smarter Hotcache with configurable promotion logic

  • New default promotion logic uses key access statistics tied to every key to make decisions about populating the hotcache
  • The promoter package provides a promoter.Interface for creating your own method to determine whether a key should be added to the hotcache
  • Newly added candidate cache keeps track of peer-owned keys (without associated data) that have not yet been promoted to the hotcache
  • Provided variadic options for Galaxy construction to override default promotion logic (with your promoter, max number of candidates, and relative hotcache size to maincache)

Comparison to memcached

See: https://github.com/golang/groupcache/blob/master/README.md

Help

Use the golang-nuts mailing list for any discussion or questions.

Documentation

Overview

Package galaxycache provides a data loading mechanism with caching and de-duplication that works across a set of peer processes.

Each data Get first consults its local cache, otherwise delegates to the requested key's canonical owner, which then checks its cache or finally gets the data. In the common case, many concurrent cache misses across a set of peers for the same key result in just one cache fill.
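
The de-duplication of concurrent misses within one process can be sketched as follows. This is an illustration of the general duplicate-call-suppression technique, not the singleflight subpackage itself; flightGroup, call, do, and waiting are hypothetical names, and the waiters counter exists only so the example can observe that callers have joined.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// call tracks one in-flight load and how many callers are waiting on it.
type call struct {
	wg      sync.WaitGroup
	waiters int
	val     string
	err     error
}

// flightGroup is a hypothetical minimal duplicate-call suppressor.
type flightGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// do runs fn once per key at a time; concurrent callers share the result.
func (g *flightGroup) do(key string, fn func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		c.wg.Wait() // a load is already in flight: wait for its result
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn() // this caller is the leader and performs the load
	c.wg.Done()

	g.mu.Lock()
	delete(g.calls, key)
	g.mu.Unlock()
	return c.val, c.err
}

// waiting reports how many callers are blocked on the in-flight load for key.
func (g *flightGroup) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

func main() {
	g := &flightGroup{}
	loads := 0
	started, release := make(chan struct{}), make(chan struct{})
	load := func() (string, error) {
		loads++ // only the leader runs this
		close(started)
		<-release
		return "value", nil
	}
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); g.do("my-key", load) }()
		if i == 0 {
			<-started // make sure a leader is in flight before the others join
		}
	}
	for g.waiting("my-key") < 2 {
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	fmt.Println(loads) // prints "1"
}
```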

Variables

var (
	MGets              = stats.Int64("galaxycache/gets", "The number of Get requests", stats.UnitDimensionless)
	MLoads             = stats.Int64("galaxycache/loads", "The number of gets/cacheHits", stats.UnitDimensionless)
	MLoadErrors        = stats.Int64("galaxycache/loads_errors", "The number of errors encountered during Get", stats.UnitDimensionless)
	MCacheHits         = stats.Int64("galaxycache/cache_hits", "The number of times that the cache was hit", stats.UnitDimensionless)
	MPeerLoads         = stats.Int64("galaxycache/peer_loads", "The number of remote loads or remote cache hits", stats.UnitDimensionless)
	MPeerLoadErrors    = stats.Int64("galaxycache/peer_errors", "The number of remote errors", stats.UnitDimensionless)
	MBackendLoads      = stats.Int64("galaxycache/backend_loads", "The number of successful loads from the backend getter", stats.UnitDimensionless)
	MBackendLoadErrors = stats.Int64("galaxycache/local_load_errors", "The number of failed backend loads", stats.UnitDimensionless)

	MCoalescedLoads        = stats.Int64("galaxycache/coalesced_loads", "The number of loads coalesced by singleflight", stats.UnitDimensionless)
	MCoalescedCacheHits    = stats.Int64("galaxycache/coalesced_cache_hits", "The number of coalesced times that the cache was hit", stats.UnitDimensionless)
	MCoalescedPeerLoads    = stats.Int64("galaxycache/coalesced_peer_loads", "The number of coalesced remote loads or remote cache hits", stats.UnitDimensionless)
	MCoalescedBackendLoads = stats.Int64("galaxycache/coalesced_backend_loads", "The number of coalesced successful loads from the backend getter", stats.UnitDimensionless)

	MServerRequests = stats.Int64("galaxycache/server_requests", "The number of Gets that came over the network from peers", stats.UnitDimensionless)
	MKeyLength      = stats.Int64("galaxycache/key_length", "The length of keys", stats.UnitBytes)
	MValueLength    = stats.Int64("galaxycache/value_length", "The length of values", stats.UnitBytes)

	MRoundtripLatencyMilliseconds = stats.Float64("galaxycache/roundtrip_latency", "Roundtrip latency in milliseconds", stats.UnitMilliseconds)

	MCacheSize    = stats.Int64("galaxycache/cache_bytes", "The number of bytes used for storing Keys and Values in the cache", stats.UnitBytes)
	MCacheEntries = stats.Int64("galaxycache/cache_entries", "The number of entries in the cache", stats.UnitDimensionless)
)

Opencensus stats

var (
	// GalaxyKey tags the name of the galaxy
	GalaxyKey = tag.MustNewKey("galaxy")

	// CacheLevelKey tags the level at which data was found on Get
	CacheLevelKey = tag.MustNewKey("cache-hit-level")

	// CacheTypeKey tags the galaxy sub-cache the metric applies to
	CacheTypeKey = tag.MustNewKey("cache-type")
)
var AllViews = []*view.View{
	{Measure: MGets, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MLoads, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MCacheHits, TagKeys: []tag.Key{GalaxyKey, CacheLevelKey}, Aggregation: view.Count()},
	{Measure: MPeerLoads, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MPeerLoadErrors, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MBackendLoads, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MBackendLoadErrors, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},

	{Measure: MCoalescedLoads, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MCoalescedCacheHits, TagKeys: []tag.Key{GalaxyKey, CacheLevelKey}, Aggregation: view.Count()},
	{Measure: MCoalescedPeerLoads, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MCoalescedBackendLoads, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},

	{Measure: MServerRequests, TagKeys: []tag.Key{GalaxyKey}, Aggregation: view.Count()},
	{Measure: MKeyLength, TagKeys: []tag.Key{GalaxyKey}, Aggregation: defaultBytesDistribution},
	{Measure: MValueLength, TagKeys: []tag.Key{GalaxyKey}, Aggregation: defaultBytesDistribution},

	{Measure: MRoundtripLatencyMilliseconds, TagKeys: []tag.Key{GalaxyKey}, Aggregation: defaultMillisecondsDistribution},
	{Measure: MCacheSize, TagKeys: []tag.Key{GalaxyKey, CacheTypeKey}, Aggregation: view.LastValue()},
	{Measure: MCacheEntries, TagKeys: []tag.Key{GalaxyKey, CacheTypeKey}, Aggregation: view.LastValue()},
}

AllViews is a slice of default views for people to use

Types

type AtomicInt

type AtomicInt int64

An AtomicInt is an int64 to be accessed atomically.

func (*AtomicInt) Add

func (i *AtomicInt) Add(n int64)

Add atomically adds n to i.

func (*AtomicInt) Get

func (i *AtomicInt) Get() int64

Get atomically gets the value of i.

func (*AtomicInt) String

func (i *AtomicInt) String() string
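
A counter with this method set can be built on sync/atomic. The sketch below uses a hypothetical counter type that mirrors the documented signatures; it is an illustration of the technique rather than the package's source.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
	"sync/atomic"
)

// counter mirrors the documented AtomicInt method set (hypothetical type).
type counter int64

// Add atomically adds n to the counter.
func (c *counter) Add(n int64) { atomic.AddInt64((*int64)(c), n) }

// Get atomically reads the counter.
func (c *counter) Get() int64 { return atomic.LoadInt64((*int64)(c)) }

// String formats the current value in base 10.
func (c *counter) String() string { return strconv.FormatInt(c.Get(), 10) }

func main() {
	var hits counter
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			hits.Add(1) // safe to call from many goroutines
		}()
	}
	wg.Wait()
	fmt.Println(hits.String()) // prints "100"
}
```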

type BackendGetter

type BackendGetter interface {
	// Get populates dest with the value identified by key
	//
	// The returned data must be unversioned. That is, key must
	// uniquely describe the loaded data, without an implicit
	// current time, and without relying on cache expiration
	// mechanisms.
	Get(ctx context.Context, key string, dest Codec) error
}

A BackendGetter loads data for a key.

type ByteCodec

type ByteCodec []byte

ByteCodec is a byte slice type that implements Codec

func (*ByteCodec) MarshalBinary

func (c *ByteCodec) MarshalBinary() ([]byte, error)

MarshalBinary on a ByteCodec returns the bytes

func (*ByteCodec) UnmarshalBinary

func (c *ByteCodec) UnmarshalBinary(data []byte) error

UnmarshalBinary on a ByteCodec sets the ByteCodec to a copy of the provided data

type CacheStats

type CacheStats struct {
	Bytes     int64
	Items     int64
	Gets      int64
	Hits      int64
	Evictions int64
}

CacheStats are returned by stats accessors on Galaxy.

type CacheType

type CacheType uint8

CacheType represents a type of cache.

const (
	// MainCache is the cache for items that this peer is the
	// owner of.
	MainCache CacheType = iota + 1

	// HotCache is the cache for items that seem popular
	// enough to replicate to this node, even though it's not the
	// owner.
	HotCache

	// CandidateCache is the cache for peer-owned keys that
	// may become popular enough to put in the HotCache
	CandidateCache
)

func (CacheType) String

func (i CacheType) String() string

type Codec

type Codec interface {
	MarshalBinary() ([]byte, error)
	UnmarshalBinary(data []byte) error
}

Codec includes both the BinaryMarshaler and BinaryUnmarshaler interfaces
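
Any type with these two methods can be used as a destination for Get. As a sketch, here is a hypothetical JSON-backed value type; the Codec interface is re-declared locally only so the example compiles by itself, and userCodec and its fields are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Codec is re-declared from the documentation above so this sketch is
// self-contained.
type Codec interface {
	MarshalBinary() ([]byte, error)
	UnmarshalBinary(data []byte) error
}

// userCodec is a hypothetical value type stored in the cache as JSON.
type userCodec struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// MarshalBinary encodes the value as JSON bytes.
func (u *userCodec) MarshalBinary() ([]byte, error) { return json.Marshal(u) }

// UnmarshalBinary decodes JSON bytes into the receiver.
func (u *userCodec) UnmarshalBinary(data []byte) error { return json.Unmarshal(data, u) }

// Compile-time check that *userCodec satisfies Codec.
var _ Codec = (*userCodec)(nil)

func main() {
	in := &userCodec{ID: 7, Name: "ada"}
	raw, err := in.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var out userCodec
	if err := out.UnmarshalBinary(raw); err != nil {
		panic(err)
	}
	fmt.Println(string(raw), out.Name) // prints `{"id":7,"name":"ada"} ada`
}
```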

type CopyingByteCodec

type CopyingByteCodec []byte

CopyingByteCodec is a byte slice type that implements Codec and returns a copy of the bytes when marshaled

func (*CopyingByteCodec) MarshalBinary

func (c *CopyingByteCodec) MarshalBinary() ([]byte, error)

MarshalBinary on a CopyingByteCodec returns a copy of the bytes

func (*CopyingByteCodec) UnmarshalBinary

func (c *CopyingByteCodec) UnmarshalBinary(data []byte) error

UnmarshalBinary on a CopyingByteCodec sets the CopyingByteCodec to a copy of the provided data
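
The practical difference between returning the bytes and returning a copy is aliasing: a caller that mutates the marshaled slice can change the stored value in the first case but not in the second. The sketch below demonstrates this with two hypothetical types, aliasingBytes and copyingBytes, written to match the documented behavior; they are not the package's own types.

```go
package main

import "fmt"

// aliasingBytes returns its own backing slice when marshaled.
type aliasingBytes []byte

func (c *aliasingBytes) MarshalBinary() ([]byte, error) { return *c, nil }

func (c *aliasingBytes) UnmarshalBinary(data []byte) error {
	*c = append([]byte(nil), data...) // store a copy of the input
	return nil
}

// copyingBytes returns a fresh copy when marshaled.
type copyingBytes []byte

func (c *copyingBytes) MarshalBinary() ([]byte, error) {
	return append([]byte(nil), *c...), nil
}

func (c *copyingBytes) UnmarshalBinary(data []byte) error {
	*c = append([]byte(nil), data...) // store a copy of the input
	return nil
}

func main() {
	a := aliasingBytes("abc")
	out, _ := a.MarshalBinary()
	out[0] = 'X'           // writes through to a
	fmt.Println(string(a)) // prints "Xbc"

	c := copyingBytes("abc")
	out, _ = c.MarshalBinary()
	out[0] = 'X'           // only the copy changes
	fmt.Println(string(c)) // prints "abc"
}
```

Copying costs an allocation per marshal, so the non-copying form is reasonable when callers treat the returned bytes as read-only.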

type FetchProtocol

type FetchProtocol interface {
	// NewFetcher instantiates the connection between the current and a
	// remote peer and returns a RemoteFetcher to be used for fetching
	// data from that peer
	NewFetcher(url string) (RemoteFetcher, error)
}

FetchProtocol defines the chosen fetching protocol to peers (namely HTTP or GRPC) and implements the instantiation method for that connection (creating a new RemoteFetcher)

type Galaxy

type Galaxy struct {

	// Stats are statistics on the galaxy.
	Stats GalaxyStats
	// contains filtered or unexported fields
}

A Galaxy is a cache namespace and associated data spread over a group of 1 or more machines.

func (*Galaxy) CacheStats

func (g *Galaxy) CacheStats(which CacheType) CacheStats

CacheStats returns stats about the provided cache within the galaxy.

func (*Galaxy) Get

func (g *Galaxy) Get(ctx context.Context, key string, dest Codec) error

Get as defined here is the primary "get" called on a galaxy to find the value for the given key, using the following logic:

  • First, try the local cache; if it's a cache hit, we're done
  • On a cache miss, determine which peer owns the key based on the consistent hash
  • If a different peer is the owner, use the corresponding fetcher to Fetch from it; otherwise, if the calling instance is the key's canonical owner, call the BackendGetter to retrieve the value (which will now be cached locally)

func (*Galaxy) Name

func (g *Galaxy) Name() string

Name returns the name of the galaxy.

func (*Galaxy) Remove

func (g *Galaxy) Remove(ctx context.Context, key string) error

Remove clears the key from our cache then forwards the remove request to all peers.

type GalaxyOption

type GalaxyOption interface {
	// contains filtered or unexported methods
}

GalaxyOption is an interface for implementing functional galaxy options

func WithHotCacheRatio

func WithHotCacheRatio(r int64) GalaxyOption

WithHotCacheRatio allows the client to specify a ratio for the main-to-hot cache sizes for the galaxy; defaults to 8:1

func WithMaxCandidates

func WithMaxCandidates(n int) GalaxyOption

WithMaxCandidates allows the client to specify the size of the candidate cache by the max number of candidates held at one time; defaults to 100

func WithPromoter

func WithPromoter(p promoter.Interface) GalaxyOption

WithPromoter allows the client to specify a promoter for the galaxy; defaults to a simple QPS comparison
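
Options of this kind usually follow the functional-options pattern: each With... function returns a value that mutates a private settings struct, and the constructor applies them over the defaults. The sketch below shows the pattern with hypothetical names (galaxyOpts, option, optionFunc, resolve); only the default values 8 and 100 come from the documentation above, and the library's internals may be organized differently.

```go
package main

import "fmt"

// galaxyOpts is a hypothetical private settings struct.
type galaxyOpts struct {
	hcRatio       int64 // main-to-hot cache size ratio
	maxCandidates int   // size of the candidate cache
}

// option has only an unexported method, so other packages cannot implement it.
type option interface {
	apply(*galaxyOpts)
}

// optionFunc adapts a plain function to the option interface.
type optionFunc func(*galaxyOpts)

func (f optionFunc) apply(o *galaxyOpts) { f(o) }

func withHotCacheRatio(r int64) option {
	return optionFunc(func(o *galaxyOpts) { o.hcRatio = r })
}

func withMaxCandidates(n int) option {
	return optionFunc(func(o *galaxyOpts) { o.maxCandidates = n })
}

// resolve starts from the documented defaults and applies options in order.
func resolve(opts ...option) galaxyOpts {
	o := galaxyOpts{hcRatio: 8, maxCandidates: 100}
	for _, opt := range opts {
		opt.apply(&o)
	}
	return o
}

func main() {
	o := resolve(withMaxCandidates(500))
	fmt.Println(o.hcRatio, o.maxCandidates) // prints "8 500"
}
```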

type GalaxyStats

type GalaxyStats struct {
	Gets              AtomicInt // any Get request, including from peers
	Loads             AtomicInt // (gets - cacheHits)
	CoalescedLoads    AtomicInt // inside singleflight
	MaincacheHits     AtomicInt // number of maincache hits
	HotcacheHits      AtomicInt // number of hotcache hits
	PeerLoads         AtomicInt // either remote load or remote cache hit (not an error)
	PeerLoadErrors    AtomicInt // errors on getFromPeer
	BackendLoads      AtomicInt // load from backend locally
	BackendLoadErrors AtomicInt // total bad local loads

	CoalescedMaincacheHits AtomicInt // maincache hit in singleflight
	CoalescedHotcacheHits  AtomicInt // hotcache hit in singleflight
	CoalescedPeerLoads     AtomicInt // peer load in singleflight
	CoalescedBackendLoads  AtomicInt // backend load in singleflight

	ServerRequests AtomicInt // gets that came over the network from peers
}

GalaxyStats are per-galaxy statistics.

type GetterFunc

type GetterFunc func(ctx context.Context, key string, dest Codec) error

A GetterFunc implements BackendGetter with a function.

func (GetterFunc) Get

func (f GetterFunc) Get(ctx context.Context, key string, dest Codec) error

Get implements Get from BackendGetter
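
The adapter lets an ordinary function satisfy BackendGetter without declaring a named type. The sketch below re-declares Codec, BackendGetter, and GetterFunc from the documentation so it compiles by itself; stringValue, mapGetter, load, and errNotFound are hypothetical helpers added for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Re-declared from the documentation above so the sketch is self-contained.
type Codec interface {
	MarshalBinary() ([]byte, error)
	UnmarshalBinary(data []byte) error
}

type BackendGetter interface {
	Get(ctx context.Context, key string, dest Codec) error
}

type GetterFunc func(ctx context.Context, key string, dest Codec) error

// Get calls the underlying function, which makes GetterFunc a BackendGetter.
func (f GetterFunc) Get(ctx context.Context, key string, dest Codec) error {
	return f(ctx, key, dest)
}

// stringValue is a hypothetical string-backed Codec.
type stringValue string

func (s *stringValue) MarshalBinary() ([]byte, error) { return []byte(*s), nil }

func (s *stringValue) UnmarshalBinary(data []byte) error {
	*s = stringValue(data)
	return nil
}

var errNotFound = errors.New("key not found")

// mapGetter is a hypothetical backend that serves values from a map.
func mapGetter(m map[string]string) BackendGetter {
	return GetterFunc(func(_ context.Context, key string, dest Codec) error {
		v, ok := m[key]
		if !ok {
			return errNotFound
		}
		return dest.UnmarshalBinary([]byte(v))
	})
}

// load fetches key from g and returns the value as a string.
func load(g BackendGetter, key string) (string, error) {
	var s stringValue
	err := g.Get(context.Background(), key, &s)
	return string(s), err
}

func main() {
	g := mapGetter(map[string]string{"my-key": "my-value"})
	v, err := load(g, "my-key")
	fmt.Println(v, err) // prints "my-value <nil>"
}
```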

type HCStatsWithTime

type HCStatsWithTime struct {
	// contains filtered or unexported fields
}

HCStatsWithTime includes a time stamp along with the hotcache stats to ensure updates happen no more than once per second

type HashOptions

type HashOptions struct {
	// Replicas specifies the number of key replicas on the consistent hash.
	// If zero, it defaults to 50.
	Replicas int

	// HashFn specifies the hash function of the consistent hash.
	// If nil, it defaults to crc32.ChecksumIEEE.
	HashFn consistenthash.Hash
}

HashOptions specifies the hash function and the number of replicas for consistent hashing

type NullFetchProtocol

type NullFetchProtocol struct{}

NullFetchProtocol implements FetchProtocol, but always returns errors. (useful for unit-testing)

func (*NullFetchProtocol) NewFetcher

func (n *NullFetchProtocol) NewFetcher(url string) (RemoteFetcher, error)

NewFetcher instantiates the connection between the current and a remote peer and returns a RemoteFetcher to be used for fetching data from that peer

type PeerPicker

type PeerPicker struct {
	// contains filtered or unexported fields
}

PeerPicker is in charge of dealing with peers: it contains the hashing options (hash function and number of replicas), consistent hash map of peers, and a map of RemoteFetchers to those peers

func (*PeerPicker) GetAll

func (pp *PeerPicker) GetAll() []RemoteFetcher

GetAll returns all the peers in the pool

type RemoteFetcher

type RemoteFetcher interface {
	Fetch(context context.Context, galaxy string, key string) ([]byte, error)
	Remove(context context.Context, galaxy string, key string) error
	// Close closes a client-side connection (may be a nop)
	Close() error
}

RemoteFetcher is the interface that must be implemented to fetch from other peers; the PeerPicker contains a map of these fetchers corresponding to each other peer address

type StringCodec

type StringCodec string

StringCodec is a string type that implements Codec

func (*StringCodec) MarshalBinary

func (c *StringCodec) MarshalBinary() ([]byte, error)

MarshalBinary on a StringCodec returns the bytes underlying the string

func (*StringCodec) UnmarshalBinary

func (c *StringCodec) UnmarshalBinary(data []byte) error

UnmarshalBinary on a StringCodec sets the StringCodec to a stringified copy of the provided data

type Universe

type Universe struct {
	// contains filtered or unexported fields
}

Universe defines the primary container for all galaxycache operations. It contains the galaxies and PeerPicker

func NewUniverse

func NewUniverse(protocol FetchProtocol, selfURL string, opts ...UniverseOpt) *Universe

NewUniverse is the main constructor for the Universe object. It is passed a FetchProtocol (to specify fetching via GRPC or HTTP) and its own URL along with options.

func NewUniverseWithOpts

func NewUniverseWithOpts(protocol FetchProtocol, selfURL string, options *HashOptions) *Universe

NewUniverseWithOpts is a deprecated constructor for the Universe object that defines a non-default hash function and number of replicas. Please use `NewUniverse` with the `WithHashOpts` option instead.

func (*Universe) DeregisterGroup

func (universe *Universe) DeregisterGroup(name string)

DeregisterGroup removes group from galaxy pool

func (*Universe) GetGalaxy

func (universe *Universe) GetGalaxy(name string) *Galaxy

GetGalaxy returns the named galaxy previously created with NewGalaxy, or nil if there's no such galaxy.

func (*Universe) NewGalaxy

func (universe *Universe) NewGalaxy(name string, cacheBytes int64, getter BackendGetter, opts ...GalaxyOption) *Galaxy

NewGalaxy creates a coordinated galaxy-aware BackendGetter from a BackendGetter.

The returned BackendGetter tries (but does not guarantee) to ensure that Get is called only once for a given key across an entire set of peer processes. Concurrent callers both in the local process and in other processes receive copies of the answer once the original Get completes.

The galaxy name must be unique for each BackendGetter.

func (*Universe) Set

func (universe *Universe) Set(peerURLs ...string) error

Set updates the Universe's list of peers (contained in the PeerPicker). Each PeerURL value should be a valid base URL, for example "example.net:8000".

func (*Universe) Shutdown

func (universe *Universe) Shutdown() error

Shutdown closes all open fetcher connections

type UniverseOpt

type UniverseOpt func(*universeOpts)

UniverseOpt is a functional Universe option.

func WithHashOpts

func WithHashOpts(hashOpts *HashOptions) UniverseOpt

WithHashOpts sets the HashOptions on a universe.

func WithRecorder

func WithRecorder(recorder stats.Recorder) UniverseOpt

WithRecorder allows you to override the default stats.Recorder used for stats.

Directories

Path Synopsis
Package consistenthash provides an implementation of a ring hash.
Package lru implements an LRU cache.
Package singleflight provides a duplicate function call suppression mechanism.
