Documentation
¶
Overview ¶
Package cache provides a TTL + LRU response cache for LLM completions. Cache keys are derived from canonicalized request JSON, skipping non-deterministic fields (stream, user, request_id) and stripping timestamp prefixes from message content.
Index ¶
Constants ¶
const ( // DefaultTTL is the default time-to-live for cached entries. DefaultTTL = 10 * time.Minute // DefaultMaxSize is the default maximum number of cached entries. DefaultMaxSize = 200 // DefaultMaxItemSize is the maximum size of a single cached item (1 MB). DefaultMaxItemSize = 1 << 20 )
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Cache ¶
type Cache struct {
// contains filtered or unexported fields
}
Cache is a concurrency-safe TTL + LRU response cache.
func (*Cache) Get ¶
Get retrieves a cached response for the given request body. Returns the entry and true on hit, or a zero Entry and false on miss. If noCache is true (e.g. from a Cache-Control: no-cache header), it always returns a miss.
func (*Cache) SetEnabled ¶
SetEnabled enables or disables the cache at runtime.
type Option ¶
type Option func(*Cache)
Option configures a Cache.
func WithMaxItemSize ¶
WithMaxItemSize sets the maximum byte size of a single cached item.
func WithMaxSize ¶
WithMaxSize sets the maximum number of cached entries.