epcache

v1.10.0
Published: May 10, 2024 License: MIT Imports: 24 Imported by: 0

README

EPCache

EPCache is Endless Paradox's cache: an Experiment-Purpose cache and an Enhanced-Performance cache. It is a lightweight, highly customizable package for building distributed cache systems, implemented in Go. Data synchronization between nodes is asynchronous, and the system offers high concurrent-access capability and ease of use.

Structure

.
├── bloomfilter
│   └── bloomfilter.go: implements a bloom filter using a bitmap and murmur3, working as a blacklist
│                         to avoid cache penetration, at the cost of a possible false-positive problem.
├── consistenthash
│   └── consistenthash.go: implements a hash ring mapping requests to a specific node serving the same group,
│                            which establishes basic load balancing across the EPCache cluster.
├── epcachepb
│   └── epcachepb.proto: defines the protobuf messages and service used by ProtoPeer and PeerAgent.
├── etcd
│   └── startup.sh: provides an example to start an etcd cluster.
├── lru
│   └── lru.go: implements an LRU cache.
├── singleflight
│   └── singleflight.go: provides a duplicate func call suppression mechanism using Mutex and WaitGroup,
│                          to avoid cache breakdown.
├── byteview.go: implements an immutable view of bytes, used inside the EPCache cluster and presented to users,
│                  which provides the benefit of decoupling from the data source and prevents users from
│                  accidentally modifying the EPCache cluster's data.
├── cache.go: wraps an LRU cache and its operations, using a Mutex to provide concurrency safety
│               and recording relevant statistics.
├── epcache.go: group, the organizational form of data, provides APIs for an EPCache cluster node,
│                 such as Get, OnUpdate and OnDelete. The rate limiter and bloom filter here can be enabled and
│                 disabled at any time.
├── getter.go: provides the interface Getter, which must be implemented by users to access the data source.
├── grpc.go: implements GrpcPool as a PeerAgent, which communicates with other nodes using gRPC;
│              it automatically handles service registration and discovery based on etcd,
│              and of course starts a gRPC server, all of which support graceful shutdown.
├── peers.go: provides the interface standards of ProtoPeer and PeerAgent, which are responsible for
│               the interaction work among the EPCache cluster nodes; also implements NoPeer as a PeerAgent.
└── protopeer.go: implements protoPeer as a ProtoPeer with a gRPC client, which is used by GrpcPool.
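The hash-ring idea behind consistenthash.go can be sketched as follows. This is an illustrative, self-contained version: the names (Ring, Add, Pick), the replica count, and the crc32 hash are assumptions for the sketch (the real package uses murmur3 and its own API).

```go
// A minimal consistent-hash ring sketch: each real node is mapped to several
// virtual points on a ring, and a key is served by the first point clockwise.
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

type Ring struct {
	replicas int            // virtual nodes per real node
	keys     []int          // sorted hash points on the ring
	owners   map[int]string // hash point -> real node
}

func NewRing(replicas int) *Ring {
	return &Ring{replicas: replicas, owners: make(map[int]string)}
}

func (r *Ring) Add(nodes ...string) {
	for _, n := range nodes {
		for i := 0; i < r.replicas; i++ {
			h := int(crc32.ChecksumIEEE([]byte(strconv.Itoa(i) + n)))
			r.keys = append(r.keys, h)
			r.owners[h] = n
		}
	}
	sort.Ints(r.keys)
}

// Pick returns the node owning the first hash point clockwise from key.
func (r *Ring) Pick(key string) string {
	if len(r.keys) == 0 {
		return ""
	}
	h := int(crc32.ChecksumIEEE([]byte(key)))
	i := sort.SearchInts(r.keys, h)
	return r.owners[r.keys[i%len(r.keys)]]
}

func main() {
	ring := NewRing(50)
	ring.Add("node-a", "node-b", "node-c")
	fmt.Println(ring.Pick("user:42") != "") // true: every key maps to some node
}
```

Virtual replicas smooth out the key distribution, so adding or removing one node only remaps the keys adjacent to its points.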

Procedure

                         y
Get -----> cache hit? -----> return
            | n
            |----> consistent hashing
                    |                 y                                 y
                    |----> remote? -----> load from peer -----> suc? -----> populate hot cache in 1/10 chance -----> return
                            | n                                  | n                            y
                            |                                    |----> due to non-existent? -----> return error
                            |                                            | n
                            |                                            |                            y
                            |-----------------------------------------> load locally -----> exist? -----> populate main cache -----> return
                                                                                             | n
                                                                                             |----> return error
OnUpdate/OnDelete -----> opt locally if exist
                          | go
                          |----> sync to peers
                                  | go all
                                  |                                      y
                                  |----> sync to one peer -----> suc? -----> opt remotely if exist
                                                                  | n
                                                                  |----> collect
                                                                          | then
                                                                          |
                                                                          |----> log all
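The Get path above can be sketched as plain Go. All helpers here are stub parameters (not EPCache's actual internals), kept only to make the decision flow explicit; only ErrNotFound's message comes from the package docs.

```go
// Sketch of the Get decision flow: local cache, then a peer chosen by
// consistent hashing, then the local data source as a fallback.
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

var ErrNotFound = errors.New("key not found in data source")

func get(key string,
	cacheGet func(string) ([]byte, bool),
	pickPeer func(string) (func(string) ([]byte, error), bool),
	loadLocally func(string) ([]byte, error),
	populate func(string, []byte),
) ([]byte, error) {
	if v, ok := cacheGet(key); ok {
		return v, nil // cache hit
	}
	if peerGet, remote := pickPeer(key); remote {
		v, err := peerGet(key)
		if err == nil {
			if rand.Intn(10) == 0 { // populate hot cache in 1/10 chance
				populate(key, v)
			}
			return v, nil
		}
		if errors.Is(err, ErrNotFound) {
			return nil, err // data truly absent: don't fall back
		}
		// peer failed for another reason: fall through to a local load
	}
	v, err := loadLocally(key)
	if err != nil {
		return nil, err
	}
	populate(key, v) // populate main cache
	return v, nil
}

func main() {
	// Standalone: no cache hit, no peers, a map as the data source.
	db := map[string][]byte{"answer": []byte("42")}
	v, err := get("answer",
		func(string) ([]byte, bool) { return nil, false },
		func(string) (func(string) ([]byte, error), bool) { return nil, false },
		func(k string) ([]byte, error) {
			if v, ok := db[k]; ok {
				return v, nil
			}
			return nil, ErrNotFound
		},
		func(string, []byte) {},
	)
	fmt.Println(string(v), err) // 42 <nil>
}
```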

Highlights

  1. Using gRPC/Protobuf to achieve efficient communication between nodes: request forwarding and data synchronization.
  2. Implementing a cache eviction strategy based on LRU, and load balancing based on consistent hashing.
  3. Using mutexes and wait groups to prevent cache breakdown, and providing bloom filters to prevent cache penetration.
  4. Providing the token bucket algorithm to limit the request flow of the cache system.
  5. Implementing service registration and discovery based on etcd, so the cluster stays in sync when nodes and their groups are dynamically adjusted.
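Highlight 4 mentions the token bucket algorithm. A minimal sketch of the idea follows; the type and field names are illustrative, not EPCache's actual implementation, and this shows only the rejecting behavior (a blocking mode would sleep until a token is available).

```go
// Token bucket: tokens accrue at a fixed rate up to a burst capacity;
// each request spends one token or is rejected.
package main

import (
	"fmt"
	"sync"
	"time"
)

type TokenBucket struct {
	mu     sync.Mutex
	rate   float64 // tokens added per second
	cap    float64 // burst capacity
	tokens float64
	last   time.Time
}

func NewTokenBucket(rate float64, cap int64) *TokenBucket {
	return &TokenBucket{rate: rate, cap: float64(cap), tokens: float64(cap), last: time.Now()}
}

// Allow reports whether one request may pass (reject-mode behavior).
func (tb *TokenBucket) Allow() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()
	now := time.Now()
	tb.tokens += now.Sub(tb.last).Seconds() * tb.rate // refill since last call
	if tb.tokens > tb.cap {
		tb.tokens = tb.cap
	}
	tb.last = now
	if tb.tokens < 1 {
		return false
	}
	tb.tokens--
	return true
}

func main() {
	tb := NewTokenBucket(1, 2) // 1 token/s, burst of 2
	fmt.Println(tb.Allow(), tb.Allow(), tb.Allow()) // true true false
}
```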

Guide

  1. You can build a cache system however you like by importing this module.
  2. Getter is implemented by you; a typical one might use a DB as the data source.
  3. ProtoPeer and PeerAgent can also be implemented by you; using protoPeer and GrpcPool is recommended when building a cluster, and NoPeer when you just need a standalone node.
  4. Set up the rate limiter and bloom filter when you need them.
  5. GrpcPool logs important events while it is up.
  6. Building an API server in front of each EPCache cluster node is the best practice.

Contributing

Issues and Pull Requests are accepted. Feel free to contribute to this project.

License

MIT © EndlessParadox1

Documentation

Overview

Package epcache implements a distributed cache system.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ByteView

type ByteView struct {
	// contains filtered or unexported fields
}

ByteView holds an immutable view of bytes; it should be used as a value type, not a pointer type.

func (ByteView) ByteSlice

func (bv ByteView) ByteSlice() []byte

ByteSlice returns a copy of the data as a byte slice.

func (ByteView) Equal

func (bv ByteView) Equal(bv2 ByteView) bool

Equal returns whether the bytes in bv are the same as the bytes in bv2.

func (ByteView) EqualBytes

func (bv ByteView) EqualBytes(b2 []byte) bool

EqualBytes returns whether the bytes in bv are the same as the bytes b2.

func (ByteView) Len

func (bv ByteView) Len() int

func (ByteView) Slice

func (bv ByteView) Slice(from, to int) ByteView

Slice slices the view between from and to.

type CacheStats

type CacheStats struct {
	Bytes  int64
	Items  int64
	Hits   int64
	Gets   int64
	Evicts int64
}

type CacheType

type CacheType int
const (
	MainCache CacheType = iota + 1
	HotCache
)

type Getter

type Getter interface {
	// Get depends on users' concrete implementation.
	// Context's deadline should be treated properly if existed.
	Get(ctx context.Context, key string) ([]byte, error)
}

Getter loads data from source, like a DB.

type GetterFunc

type GetterFunc func(ctx context.Context, key string) ([]byte, error)

GetterFunc is an adapter that lets a plain function serve as a Getter.

func (GetterFunc) Get

func (f GetterFunc) Get(ctx context.Context, key string) ([]byte, error)

type GrpcPool

type GrpcPool struct {
	pb.UnimplementedEPCacheServer
	// contains filtered or unexported fields
}

func NewGrpcPool

func NewGrpcPool(self, prefix string, registry []string, opts *GrpcPoolOptions) *GrpcPool

NewGrpcPool returns a GrpcPool instance.

prefix: The working directory of the EPCache cluster.
registry: The listening addresses of the etcd cluster.

func (*GrpcPool) Get

func (gp *GrpcPool) Get(ctx context.Context, in *pb.Request) (*pb.Response, error)

func (*GrpcPool) ListPeers

func (gp *GrpcPool) ListPeers() (ans []string)

func (*GrpcPool) PickPeer

func (gp *GrpcPool) PickPeer(key string) (ProtoPeer, bool)

func (*GrpcPool) SetNode added in v1.10.0

func (gp *GrpcPool) SetNode(node *Node)

func (*GrpcPool) SyncAll

func (gp *GrpcPool) SyncAll(data *pb.SyncData)

SyncAll tries to sync data to all peers asynchronously, logging any errors.

type GrpcPoolOptions

type GrpcPoolOptions struct {
	Replicas int
	HashFn   consistenthash.Hash
}

type LimitMode

type LimitMode int
const (
	NoLimit LimitMode = iota
	BlockMode
	RejectMode
)

type LoadError

type LoadError string
const ErrNotFound LoadError = "key not found in data source"

ErrNotFound must be returned when Getter can't find the data.

func (LoadError) Error

func (e LoadError) Error() string

type NoPeer

type NoPeer struct{}

NoPeer is an implementation of PeerAgent, used for groups running in standalone mode.

func (NoPeer) ListPeers

func (NoPeer) ListPeers() (ans []string)

func (NoPeer) PickPeer

func (NoPeer) PickPeer(_ string) (peer ProtoPeer, ok bool)

func (NoPeer) SetNode added in v1.10.0

func (NoPeer) SetNode(_ *Node)

func (NoPeer) SyncAll

func (NoPeer) SyncAll(_ *pb.SyncData)

type Node added in v1.10.0

type Node struct {
	Stats Stats
	// contains filtered or unexported fields
}

Node is a set of associated data spreading over one or more processes.

func NewNode added in v1.10.0

func NewNode(cacheBytes int64, getter Getter) *Node

func (*Node) CacheStats added in v1.10.0

func (n *Node) CacheStats(ctype CacheType) CacheStats

func (*Node) Get added in v1.10.0

func (n *Node) Get(ctx context.Context, key string) (ByteView, error)

func (*Node) OnRemove added in v1.10.0

func (n *Node) OnRemove(key string)

OnRemove removes data from the cache and then syncs to all peers in the background. This must be called when data in the source is purged.

func (*Node) OnUpdate added in v1.10.0

func (n *Node) OnUpdate(key string, value []byte)

OnUpdate updates data in the cache and then syncs to all peers in the background. This must be called when data in the source is changed.

func (*Node) RegisterPeers added in v1.10.0

func (n *Node) RegisterPeers(peers PeerAgent)

RegisterPeers specifies the PeerAgent for a group, e.g. NoPeer, GrpcPool, or anything that implements the PeerAgent interface.

func (*Node) ResetLimiter added in v1.10.0

func (n *Node) ResetLimiter()

ResetLimiter disables the rate limiter.

func (*Node) SetFilter added in v1.10.0

func (n *Node) SetFilter(size uint32)

SetFilter sets a bloom filter; a size of zero disables it. From the user's expected blacklist size, it calculates the parameters needed to build a bloom filter whose false-positive rate will be below 0.01%.
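The parameter derivation can be sketched with the standard bloom-filter formulas, m = -n·ln(p)/(ln 2)² bits and k = (m/n)·ln 2 hash functions; the exact computation inside EPCache may differ, and the function name here is illustrative.

```go
// bloomParams returns the bit-array size m and hash count k for an
// expected blacklist of n keys at target false-positive rate p.
package main

import (
	"fmt"
	"math"
)

func bloomParams(n uint32, p float64) (m uint64, k int) {
	mf := -float64(n) * math.Log(p) / (math.Ln2 * math.Ln2)
	m = uint64(math.Ceil(mf))                      // total bits needed
	k = int(math.Round(mf / float64(n) * math.Ln2)) // optimal hash count
	return
}

func main() {
	// p = 0.01% works out to roughly 19.2 bits per key and 13 hashes.
	m, k := bloomParams(1_000_000, 0.0001)
	fmt.Println(m, k)
}
```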

func (*Node) SetLimiter added in v1.10.0

func (n *Node) SetLimiter(rate float64, cap int64, mode LimitMode)

SetLimiter sets a rate limiter working on blocking or rejecting mode.

type PeerAgent

type PeerAgent interface {
	// PickPeer picks peer with the same group according to the key.
	PickPeer(key string) (ProtoPeer, bool)
	// SyncAll tries to sync data to all peers.
	SyncAll(data *pb.SyncData)
	SetNode(node *Node)
	// ListPeers lists all peers.
	ListPeers() []string
}

type ProtoPeer

type ProtoPeer interface {
	// Get loads data from remote using gRPC.
	Get(ctx context.Context, in *pb.Request) (*pb.Response, error)
}

type Stats

type Stats struct {
	Reqs          int64
	Gets          int64
	Hits          int64
	Loads         int64
	LoadsDeduped  int64 // after singleflight
	LocalLoads    int64
	LocalLoadErrs int64
	PeerLoads     int64
	PeerLoadErrs  int64
	PeerReqs      int64 // requests from peers

	LenBlacklist int64
}

Stats are statistics for a group.

Directories

Path Synopsis
Package bloomfilter avoids cache penetration.
Package consistenthash implements a ring hash.
Package lru implements a lru cache.
Package msgctl reduces messages within a specified interval into one.
Package singleflight provides a duplicate func call suppression mechanism, therefore avoiding cache breakdown.
