dscache

package
v0.0.0-...-8b7da69
This package is not in the latest version of its module.

Published: Jul 27, 2020 License: Apache-2.0 Imports: 17 Imported by: 18

Documentation

Overview

Package dscache provides a transparent cache for RawDatastore which is backed by Memcache.

Inspiration

Although this is not a port of any particular implementation, it takes inspiration from these fine libraries:

Algorithm

Memcache contains cache entries for single datastore entities. The memcache key looks like

"gae:" | vers | ":" | shard | ":" | Base64_std_nopad(SHA1(datastore.Key))

Where:

  • vers is an ascii-hex-encoded number (currently 1).
  • shard is a zero-based ascii-hex-encoded number (depends on shardsForKey).
  • SHA1 has been chosen as unlikely (p == 1e-18) to collide, given dedicated memcache sizes of up to 170 Exabytes (assuming an average entry size of 100KB including the memcache key). This is clearly overkill, but MD5 could start showing collisions at this probability in as small as a 26GB cache (and also MD5 sucks).
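The key layout above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: `hashKey` and `makeMemcacheKey` here are lowercase stand-ins that take an already-serialized key as a byte slice, since the real serialization of a datastore.Key is not shown in this document.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// memcacheVersion mirrors the documented "vers" field (currently 1).
const memcacheVersion = "1"

// hashKey returns the unpadded standard-base64 SHA1 of the serialized key.
// serializedKey is an assumed stand-in for the real datastore.Key encoding.
func hashKey(serializedKey []byte) string {
	sum := sha1.Sum(serializedKey)
	return base64.RawStdEncoding.EncodeToString(sum[:])
}

// makeMemcacheKey builds "gae:" | vers | ":" | shard | ":" | hash, with the
// shard rendered as ascii hex.
func makeMemcacheKey(shard int, serializedKey []byte) string {
	return fmt.Sprintf("gae:%s:%x:%s", memcacheVersion, shard, hashKey(serializedKey))
}

func main() {
	fmt.Println(makeMemcacheKey(10, []byte("example-key")))
}
```

A SHA1 digest is 20 bytes, so its padded base64 form is 28 characters with one trailing '='; dropping the padding leaves a fixed 27-character hash.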

The memcache value is a compression byte, indicating the scheme (See CompressionType), followed by the encoded (and possibly compressed) value. Encoding is done with datastore.PropertyMap.Write(). The memcache value may also be the empty byte sequence, indicating that this entity is deleted.
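As a rough sketch of that value layout, the following encodes a payload behind a one-byte compression marker and decodes it again. The helper names are hypothetical, the payload stands in for the output of datastore.PropertyMap.Write(), and the rule "compress when the payload is longer than the threshold" is an assumption based on the CompressionThreshold description.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

const (
	noCompression        byte = 0 // mirrors NoCompression
	zlibCompression      byte = 1 // mirrors ZlibCompression
	compressionThreshold      = 860
)

// encodeValue prefixes the payload with a compression byte and compresses
// payloads longer than the threshold (assumed rule).
func encodeValue(payload []byte) ([]byte, error) {
	if len(payload) <= compressionThreshold {
		return append([]byte{noCompression}, payload...), nil
	}
	var buf bytes.Buffer
	buf.WriteByte(zlibCompression)
	w := zlib.NewWriter(&buf)
	if _, err := w.Write(payload); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeValue reverses encodeValue. An empty value reports deleted=true.
func decodeValue(value []byte) ([]byte, bool, error) {
	if len(value) == 0 {
		return nil, true, nil
	}
	switch value[0] {
	case noCompression:
		return value[1:], false, nil
	case zlibCompression:
		zr, err := zlib.NewReader(bytes.NewReader(value[1:]))
		if err != nil {
			return nil, false, err
		}
		defer zr.Close()
		payload, err := io.ReadAll(zr)
		return payload, false, err
	default:
		return nil, false, fmt.Errorf("unknown compression byte %d", value[0])
	}
}

func main() {
	enc, err := encodeValue(bytes.Repeat([]byte("x"), 2000))
	if err != nil {
		panic(err)
	}
	fmt.Printf("scheme=%d encoded=%d bytes\n", enc[0], len(enc))
}
```

Because every live entity carries at least the compression byte, the empty byte sequence stays unambiguous as the "deleted" marker.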

The memcache entry also has a 'flags' value (see FlagValue) set to one of the following:

  • ItemHasData (1): "entity" (cached value)
  • ItemHasLock (2): "lock" (someone is mutating this entry)

Algorithm - Put and Delete

On a Put (or Delete), a lock entry is unconditionally written to memcache with a LockTimeSeconds expiration (default 31 seconds) and a memcache flag value of ItemHasLock (indicating that it's a put-locked key). The entry's value will not match the nonce of any concurrent Get, which precludes Get operations from believing that they possess the lock.

NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.

The datastore operation will then occur. Assuming success, Put will then unconditionally delete all of the memcache locks. At some point later, a Get will write its own lock, get the value from datastore, and compare and swap to populate the value (detailed below).
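The Put/Delete sequence might be sketched as follows. Everything here is illustrative: `fakeMemcache` is an in-memory stand-in with a switch for simulating an outage, lock expiration is not modeled, and `cachedPut` is a hypothetical helper rather than a function of this package.

```go
package main

import (
	"errors"
	"fmt"
)

type flagValue uint32

const (
	itemUnknown flagValue = iota // zero state, mirrors the FlagValue declaration
	itemHasData
	itemHasLock
)

type item struct {
	value []byte
	flag  flagValue
}

// fakeMemcache is an in-memory stand-in; failSet simulates an outage.
type fakeMemcache struct {
	items   map[string]item
	failSet bool
}

func (m *fakeMemcache) set(key string, it item) error {
	if m.failSet {
		return errors.New("memcache unavailable")
	}
	m.items[key] = it
	return nil
}

func (m *fakeMemcache) del(key string) { delete(m.items, key) }

// cachedPut locks every shard key, runs the datastore write, then removes
// the locks. A failed lock write aborts before the datastore is touched.
func cachedPut(mc *fakeMemcache, shardKeys []string, write func() error) error {
	for _, k := range shardKeys {
		if err := mc.set(k, item{flag: itemHasLock}); err != nil {
			return fmt.Errorf("locking %q: %w", k, err) // hard error
		}
	}
	if err := write(); err != nil {
		return err // locks are left in place to expire on their own
	}
	for _, k := range shardKeys {
		mc.del(k) // best effort in a real client
	}
	return nil
}

func main() {
	mc := &fakeMemcache{items: map[string]item{}}
	err := cachedPut(mc, []string{"gae:1:0:hash"}, func() error { return nil })
	fmt.Println("put error:", err, "entries left:", len(mc.items))
}
```

The ordering is the point of the sketch: the lock must land before the datastore write, so a stale cached entity can never outlive a successful mutation.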

Algorithm - Get

On a Get, "Add" a lock for it (which only does something if there's no entry in memcache yet) with a nonce value. We immediately Get the memcache entries back (for CAS purposes later).

If it doesn't exist (unlikely since we just Add'd it) or if its flag is "lock" and the Value != the nonce we put there, go hit the datastore without trying to update memcache.

If its flag is "entity", decode the object and return it. If the Value is the empty byte sequence, return ErrNoSuchEntity.

If its flag is "lock" and the Value equals the nonce, go get it from the datastore. If that's successful, then encode the value to bytes, and CAS the object to memcache. The CAS will succeed if nothing else touched the memcache in the meantime (like a Put, a memcache expiration/eviction, etc.).
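The Get steps above can be sketched against a small in-memory memcache that supports Add, Get and compare-and-swap. This is a simplified model under stated assumptions: `fakeMemcache`, its version counter and `cachedGet` are hypothetical, `load` stands in for the datastore read and is assumed to return non-empty encoded bytes, and expiration is not modeled.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

type flagValue uint32

const (
	itemUnknown flagValue = iota // zero state, mirrors the FlagValue declaration
	itemHasData
	itemHasLock
)

var errNoSuchEntity = errors.New("no such entity")

type item struct {
	value   []byte
	flag    flagValue
	version int // changes on every write; used for compare-and-swap
}

type fakeMemcache struct {
	items map[string]item
	next  int
}

func (m *fakeMemcache) store(key string, it item) {
	m.next++
	it.version = m.next
	m.items[key] = it
}

// add stores the item only if the key is absent.
func (m *fakeMemcache) add(key string, it item) {
	if _, ok := m.items[key]; !ok {
		m.store(key, it)
	}
}

// cas replaces the entry only if it is unchanged since seen was read.
func (m *fakeMemcache) cas(key string, seen, next item) bool {
	cur, ok := m.items[key]
	if !ok || cur.version != seen.version {
		return false
	}
	m.store(key, next)
	return true
}

// cachedGet follows the documented flow: Add a lock, read it back, then
// either serve from cache, populate via CAS, or bypass the cache.
func cachedGet(mc *fakeMemcache, key string, nonce []byte, load func() ([]byte, error)) ([]byte, error) {
	mc.add(key, item{value: nonce, flag: itemHasLock})
	seen, ok := mc.items[key]
	switch {
	case ok && seen.flag == itemHasData:
		if len(seen.value) == 0 {
			return nil, errNoSuchEntity // cached negative lookup
		}
		return seen.value, nil
	case ok && seen.flag == itemHasLock && bytes.Equal(seen.value, nonce):
		data, err := load()
		switch {
		case errors.Is(err, errNoSuchEntity):
			mc.cas(key, seen, item{flag: itemHasData}) // cache the miss
			return nil, err
		case err != nil:
			return nil, err
		}
		mc.cas(key, seen, item{value: data, flag: itemHasData}) // best effort
		return data, nil
	default:
		// Missing entry or someone else's lock: go to the datastore only.
		return load()
	}
}

func main() {
	mc := &fakeMemcache{items: map[string]item{}}
	load := func() ([]byte, error) { return []byte("entity"), nil }
	v, err := cachedGet(mc, "gae:1:0:hash", []byte("nonce"), load)
	fmt.Println(string(v), err, mc.items["gae:1:0:hash"].flag == itemHasData)
}
```

The CAS step is what keeps a slow Get from overwriting a lock that a concurrent Put wrote after the Get read its own lock back.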

Algorithm - Transactions

In a transaction, all Put memcache operations are held until the very end of the transaction. Right before the transaction is committed, all accumulated Put memcache items are unconditionally set into memcache.

NOTE: If this memcache Set fails, it's a HARD ERROR. See DANGER ZONE.

If the transaction is successfully committed (err == nil), then all the locks will be deleted.

The assumption here is that get operations apply all outstanding transactions before they return data (https://cloud.google.com/appengine/docs/go/datastore/#Go_Datastore_writes_and_data_visibility), and so it is safe to purge all the locks if the transaction is known-good.

If the transaction succeeds, but RunInTransaction returns an error (which can happen), or if the transaction fails, then the lock entries time out naturally. This will mean 31-ish seconds of direct datastore access, but it's the more-correct thing to do.

Gets and Queries in a transaction pass right through without reading or writing memcache.
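A minimal sketch of the transactional flow, assuming a hypothetical `lockStore` in place of memcache and a `commit` callback in place of the datastore commit. It only models the ordering described above: buffer during the transaction, set locks right before commit, delete them only on a known-good commit.

```go
package main

import (
	"errors"
	"fmt"
)

var errCommit = errors.New("commit failed")

// lockStore is a hypothetical stand-in for memcache lock operations.
type lockStore struct {
	locks   map[string]bool
	failSet bool
}

func (s *lockStore) setLocks(keys []string) error {
	if s.failSet {
		return errors.New("memcache unavailable")
	}
	for _, k := range keys {
		s.locks[k] = true
	}
	return nil
}

func (s *lockStore) deleteLocks(keys []string) {
	for _, k := range keys {
		delete(s.locks, k)
	}
}

// txn accumulates the memcache keys touched by Puts inside a transaction.
type txn struct{ pending []string }

func (t *txn) put(memcacheKey string) { t.pending = append(t.pending, memcacheKey) }

// runInTransaction holds all lock writes until just before the commit.
func runInTransaction(s *lockStore, body func(*txn) error, commit func() error) error {
	t := &txn{}
	if err := body(t); err != nil {
		return err // nothing has been written to memcache yet
	}
	if err := s.setLocks(t.pending); err != nil {
		return fmt.Errorf("setting locks before commit: %w", err) // hard error
	}
	if err := commit(); err != nil {
		return err // locks stay and time out naturally
	}
	s.deleteLocks(t.pending)
	return nil
}

func main() {
	s := &lockStore{locks: map[string]bool{}}
	body := func(t *txn) error { t.put("gae:1:0:hash"); return nil }
	err := runInTransaction(s, body, func() error { return errCommit })
	fmt.Println("error:", err, "locks left:", len(s.locks))
}
```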

Cache control

An entity may expose the following metadata (see datastore.PropertyLoadSaver.GetMeta) to control the behavior of its cache.

  • `gae:"$dscache.enable,<true|false>"` - whether or not this entity should be cached at all. If omitted, dscache defaults to true.
  • `gae:"$dscache.expiration,#seconds"` - the number of seconds of persistence to use when this item is cached. 0 is infinite. If omitted, defaults to 0.

In addition, the application may supply shard functions (see ShardFunction) which return the number of shards to use for a given datastore key. These functions are registered on the context with AddShardFunctions.

Shards have the effect that all write (Put/Delete) operations clear all memcache entries for the given datastore entry, and all reads read (and possibly populate) one of the memcache entries. So if an entity has 4 shards, a datastore Get for it will pull from one of the 4 possible memcache keys at random. This is good for heavily-read, but infrequently updated, entities. The purpose of sharding is to alleviate hot memcache keys, as recommended by https://cloud.google.com/appengine/articles/best-practices-for-app-engine-memcache#distribute-load .
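To make the read/write asymmetry concrete, here is a small sketch: a write enumerates every shard key for the entity, while a read picks one at random. The helper names are hypothetical, the key layout reuses the documented "gae:1:<shard>:<hash>" format, and the shard count is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"math/rand"
)

// allShardKeys returns every memcache key for an entity. A Put or Delete
// would lock and later clear all of them.
func allShardKeys(hash string, shards int) []string {
	keys := make([]string, shards)
	for i := range keys {
		keys[i] = fmt.Sprintf("gae:1:%x:%s", i, hash)
	}
	return keys
}

// randomShardKey returns the single key a Get would read (and possibly
// populate). shards must be >= 1.
func randomShardKey(hash string, shards int) string {
	return fmt.Sprintf("gae:1:%x:%s", rand.Intn(shards), hash)
}

func main() {
	fmt.Println("write touches:", allShardKeys("HASH", 4))
	fmt.Println("read touches: ", randomShardKey("HASH", 4))
}
```

The trade-off follows directly: each extra shard spreads read load over one more memcache key, and costs one more memcache write per mutation.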

Caveats

A couple things to note that may differ from other appengine datastore caching libraries (like goon, nds, or ndb).

  • It does NOT provide in-memory ("per-request") caching.
  • It's INtolerant of some memcache failures, but in exchange will not return inconsistent results. See DANGER ZONE for details.
  • Queries do not interact with the cache at all.
  • Negative lookups (e.g. ErrNoSuchEntity) are cached.

DANGER ZONE

As mentioned in the Put/Delete/Transactions sections above, if the memcache Set fails, that's a HARD ERROR. The reason for this is that otherwise in the event of transient memcache failures, the memcache may be permanently left in an inconsistent state, since there will be nothing to actually ensure that the bad value is flushed from memcache. As long as the Put is allowed to write the lock, then all will be (eventually) well, and so all other memcache operations are best effort.

So, if memcache is DOWN, you will effectively see tons of errors in the logs, and all cached datastore access will be essentially degraded to a slow read-only state. At this point, you have essentially 3 mitigation strategies:

  • wait for memcache to come back up.
  • dynamically disable all memcache access by writing the datastore entry /dscache,1 = {"Enable": false} in the default namespace. This can be done by invoking the SetGlobalEnable function. This can take up to 5 minutes to take effect. If you have very long-running backend requests, you may need to cycle them for the change to take effect. This dynamic bit is read essentially once per http request (when FilterRDS is called on the context).
  • push a new version of the application disabling the cache filter by setting InstanceEnabledStatic to false in an init() function.

On every dscache.FilterRDS invocation, it takes the opportunity to fetch this datastore value, if it hasn't been fetched in the last GlobalEnabledCheckInterval time (5 minutes). This equates to essentially once per http request, per 5 minutes, per instance.
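The throttled check could look roughly like the sketch below. It is an illustration only: `enableChecker`, `fetch` and the `at` helper are hypothetical, the clock is passed in explicitly so the behavior is easy to follow, and a real per-instance cache would also need locking for concurrent requests.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// checkInterval mirrors GlobalEnabledCheckInterval.
const checkInterval = 5 * time.Minute

var errFetch = errors.New("datastore unavailable")

// enableChecker caches the global enable bit for one instance.
type enableChecker struct {
	fetch     func() (bool, error) // hypothetical: reads Enable from /dscache,1
	enabled   bool
	lastCheck time.Time
}

func newEnableChecker(fetch func() (bool, error)) *enableChecker {
	return &enableChecker{fetch: fetch, enabled: true}
}

// isEnabled re-fetches the flag at most once per checkInterval. Any fetch
// error is treated as "enabled", matching the documented behavior.
func (c *enableChecker) isEnabled(now time.Time) bool {
	if !c.lastCheck.IsZero() && now.Sub(c.lastCheck) < checkInterval {
		return c.enabled
	}
	c.lastCheck = now
	enabled, err := c.fetch()
	if err != nil {
		enabled = true
	}
	c.enabled = enabled
	return enabled
}

// at builds a timestamp from seconds, purely for illustration.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	c := newEnableChecker(func() (bool, error) { return false, errFetch })
	fmt.Println("enabled despite fetch error:", c.isEnabled(at(0)))
}
```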

AppEngine's memcache reserves the right to evict keys at any moment. This is especially true for shared memcache, which is subject to pressures outside of your application. When eviction happens due to memory pressure, least-recently-used values are evicted first.

https://cloud.google.com/appengine/docs/go/memcache/#Go_How_cached_data_expires

Eviction presents a potential race window, as lock items that were put in memcache may be evicted prior to the locked operations completing (or failing), causing concurrent Get operations to cache bad data indefinitely.

In practice, a dedicated memcache will be safe. LRU-based eviction means that locks recently added will almost certainly not be evicted before their operations are complete, and a dedicated memcache lowers eviction pressure to a single application's operation. Production systems that have data integrity requirements are encouraged to use a dedicated memcache.

Note that flushing memcache of a running application may also induce this race. Flushes should be performed with this concern in mind.

TODO: A potential mitigation to lock eviction poisoning is to use memcache Statistics to identify the oldest memcache item and use that age to bound the lifetime of cached datastore entries. This would cause dscache items created around the time of a flush to expire quickly (instead of never), bounding the period of time when poisoned data may reside in the cache.

Constants

const (
	// MemcacheVersion will be incremented in the event that the in-memcache
	// representation of the cache data is modified.
	MemcacheVersion = "1"

	// KeyFormat is the format string used to generate memcache keys. It's
	//   gae:<version>:<shard#>:<base64_std_nopad(sha1(datastore.Key))>
	KeyFormat = "gae:" + MemcacheVersion + ":%x:%s"

	// Sha1B64Padding is the number of padding characters a base64 encoding of
	// a sha1 has.
	Sha1B64Padding = 1

	// MaxShards is the maximum number of shards a single entity can have.
	MaxShards = 256

	// MaxShardsLen is the number of characters in the key the shard field
	// occupies.
	MaxShardsLen = len("ff")

	// InternalGAEPadding is the estimated internal padding size that GAE takes
	// per memcache line.
	//   https://cloud.google.com/appengine/docs/go/memcache/#Go_Limits
	InternalGAEPadding = 96

	// ValueSizeLimit is the maximum encoded size a datastore key+entry may
	// occupy. If a datastore entity is too large, it will have an indefinite
	// lock which will cause all clients to fetch it from the datastore.
	ValueSizeLimit = (1000 * 1000) - InternalGAEPadding - MaxShardsLen

	// CacheEnableMeta is the gae metadata key name for whether or not dscache
	// is enabled for an entity type at all.
	CacheEnableMeta = "dscache.enable"

	// CacheExpirationMeta is the gae metadata key name for the default
	// expiration time (in seconds) for an entity type.
	CacheExpirationMeta = "dscache.expiration"

	// NonceBytes is the number of bytes to use in the 'lock' nonce.
	NonceBytes = 8

	// GlobalEnabledCheckInterval is how frequently IsGloballyEnabled should check
	// the globalEnabled datastore entry.
	GlobalEnabledCheckInterval = 5 * time.Minute
)

Variables

var (
	// InstanceEnabledStatic allows you to statically (e.g. in an init() function)
	// bypass this filter by setting it to false. This takes effect when the
	// application calls IsGloballyEnabled.
	InstanceEnabledStatic = true

	// LockTimeSeconds is the number of seconds that a "lock" memcache entry will
	// have its expiration set to. It's set to just over half of the frontend
	// request handler timeout (currently 60 seconds).
	LockTimeSeconds = 31

	// CacheTimeSeconds is the default number of seconds that a cached entity will
	// be retained (memcache contention notwithstanding). A value of 0 is
	// infinite. This is #seconds instead of time.Duration, because memcache
	// truncates expiration to the second, which means a sub-second amount would
	// actually truncate to an infinite timeout.
	CacheTimeSeconds = int64((time.Hour * 24).Seconds())

	// CompressionThreshold is the number of bytes of entity value after which
	// compression kicks in.
	CompressionThreshold = 860

	// DefaultShards is the default number of shards to use per key.
	DefaultShards = 1

	// DefaultEnabled indicates whether or not caching is globally enabled or
	// disabled by default. Can still be overridden by CacheEnableMeta.
	DefaultEnabled = true
)

Functions

func AddShardFunctions

func AddShardFunctions(c context.Context, shardFns ...ShardFunction) context.Context

AddShardFunctions appends the provided shardFn functions to the internal list of shard functions. Within a single call they are evaluated left to right; functions added by a later call are evaluated before those added by an earlier call.

nil functions will cause a panic.

So:

ctx = AddShardFunctions(ctx, A, B, C)
ctx = AddShardFunctions(ctx, D, E, F)

Would evaluate `D, E, F, A, B, C`
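That ordering can be modeled with plain slices. The sketch below is a hypothetical re-creation, not the package's code: `shardFunction` takes a string instead of a datastore key, the list is passed around explicitly instead of living in a context, and "the first function reporting ok wins" is an assumption based on the ShardFunction description.

```go
package main

import "fmt"

// shardFunction mirrors ShardFunction but takes a plain string key.
type shardFunction func(key string) (shards int, ok bool)

// addShardFunctions places the new group in front of the existing list, so
// later registrations are consulted first while each group keeps its
// left-to-right order. nil functions cause a panic, as documented.
func addShardFunctions(existing []shardFunction, fns ...shardFunction) []shardFunction {
	for _, fn := range fns {
		if fn == nil {
			panic("nil shard function")
		}
	}
	out := make([]shardFunction, 0, len(fns)+len(existing))
	out = append(out, fns...)
	return append(out, existing...)
}

// shardsForKey returns the answer of the first function that recognizes the
// key, or def when none does.
func shardsForKey(fns []shardFunction, key string, def int) int {
	for _, fn := range fns {
		if n, ok := fn(key); ok {
			return n
		}
	}
	return def
}

func main() {
	hot := func(key string) (int, bool) { return 8, key == "hot" }
	fns := addShardFunctions(nil, hot)
	fmt.Println(shardsForKey(fns, "hot", 1), shardsForKey(fns, "cold", 1))
}
```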

func AlwaysFilterRDS

func AlwaysFilterRDS(c context.Context) context.Context

AlwaysFilterRDS installs a caching RawDatastore filter in the context.

Unlike FilterRDS, it doesn't check GlobalConfig via an IsGloballyEnabled call; it assumes the caller already knows whether the filter should be applied.

func FilterRDS

func FilterRDS(c context.Context) context.Context

FilterRDS installs a caching RawDatastore filter in the context.

It does nothing if IsGloballyEnabled returns false. That way it is possible to disable the cache at runtime (e.g. in case the memcache service is having issues).

func HashKey

func HashKey(k *datastore.Key) string

HashKey generates just the hashed portion of the MemcacheKey.

func IsGloballyEnabled

func IsGloballyEnabled(c context.Context) bool

IsGloballyEnabled checks to see if this filter is enabled globally.

This checks InstanceEnabledStatic, and polls the datastore entity /dscache,1 (a GlobalConfig instance) once every GlobalEnabledCheckInterval.

For correctness, any error encountered returns true. If errors were instead treated as false, Put operations might skip invalidating the cache when they should not.

func MakeMemcacheKey

func MakeMemcacheKey(shard int, k *datastore.Key) string

MakeMemcacheKey generates a memcache key for the given datastore Key. This is useful for debugging.

func SetGlobalEnable

func SetGlobalEnable(c context.Context, memcacheEnabled bool) error

SetGlobalEnable is a convenience function for manipulating the GlobalConfig.

It's meant to be called from admin handlers on your app to turn dscache functionality on or off in emergencies.

Types

type CompressionType

type CompressionType byte

CompressionType is the type of compression a single memcache entry has.

const (
	NoCompression CompressionType = iota
	ZlibCompression
)

Types of compression. ZlibCompression uses "compress/zlib".

func (CompressionType) String

func (c CompressionType) String() string

type FlagValue

type FlagValue uint32

FlagValue is used to indicate if a memcache entry currently contains an item or a lock.

const (
	ItemUKNONWN FlagValue = iota
	ItemHasData
	ItemHasLock
)

States for a memcache entry. ItemUKNONWN (spelled as in the declaration above) exists to distinguish the default zero state from a valid state, but shouldn't ever be observed in memcache.

type GlobalConfig

type GlobalConfig struct {
	Enable bool
	// contains filtered or unexported fields
}

GlobalConfig is the entity definition for dscache's global configuration.

Its Enable field can be set to false to cause all dscache operations (read and write) to cease in a given application.

This should be manipulated in the GLOBAL (e.g. empty) namespace only. When written there, it affects activity in all namespaces.

type ShardFunction

type ShardFunction func(*ds.Key) (shards int, ok bool)

ShardFunction is a user-controllable function which calculates the number of shards to use for a certain datastore key. The provided key will always be valid and complete. It should return ok=true if it recognized the Key, and false otherwise.

The # of shards returned may be between 1 and 256. Values above this range will be clamped into that range. A return value of 0 means that NO cache operations should be done for this key, regardless of the dscache.enable setting.
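One way to read the clamping rule is sketched below. The `key` type and both functions are hypothetical stand-ins, and the treatment of negative return values is an assumption, since the text only covers 0 and values above 256.

```go
package main

import "fmt"

// maxShards mirrors MaxShards.
const maxShards = 256

// key is a stand-in for a datastore key; only the kind matters here.
type key struct{ kind string }

// configShards is an example shard function: it recognizes one kind and asks
// for more shards than are allowed.
func configShards(k key) (shards int, ok bool) {
	if k.kind == "Config" {
		return 1000, true
	}
	return 0, false
}

// effectiveShards applies the documented limits to a returned shard count.
// The second result is false when the key must not be cached at all.
// Assumption: negative values are treated like 0.
func effectiveShards(returned int) (int, bool) {
	switch {
	case returned <= 0:
		return 0, false
	case returned > maxShards:
		return maxShards, true
	default:
		return returned, true
	}
}

func main() {
	n, _ := configShards(key{kind: "Config"})
	shards, cacheable := effectiveShards(n)
	fmt.Println(shards, cacheable)
}
```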
