Documentation ¶
Overview ¶
Package cache provides semantic caching for routing decisions. It uses embedding similarity to cache and retrieve routing decisions, reducing latency for similar queries.
Index ¶
- type CacheEntry
- type CacheMetrics
- type EmbeddingEngine
- type SemanticCache
- func NewSemanticCache(engine EmbeddingEngine, similarityThreshold float64, maxSize int) *SemanticCache
- func (c *SemanticCache) Clear()
- func (c *SemanticCache) GetHitRate() float64
- func (c *SemanticCache) GetMetrics() CacheMetrics
- func (c *SemanticCache) GetMetricsAsMap() map[string]interface{}
- func (c *SemanticCache) GetSize() int
- func (c *SemanticCache) IsEnabled() bool
- func (c *SemanticCache) Lookup(query string) (interface{}, error)
- func (c *SemanticCache) Store(query string, embedding []float32, decision string, ...) error
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type CacheEntry ¶
type CacheEntry struct {
    // Query is the original query text
    Query string
    // Embedding is the query embedding vector
    Embedding []float32
    // Decision is the routing decision (model ID)
    Decision string
    // Metadata contains additional routing information
    Metadata map[string]interface{}
    // Timestamp is when the entry was created
    Timestamp time.Time
    // contains filtered or unexported fields
}
CacheEntry represents a cached routing decision.
type CacheMetrics ¶
type CacheMetrics struct {
    Hits          int64
    Misses        int64
    Evictions     int64
    Size          int
    AvgHitLatency time.Duration
    AvgLookup     time.Duration
}
CacheMetrics tracks cache performance statistics.
type EmbeddingEngine ¶
type EmbeddingEngine interface {
    Embed(text string) ([]float32, error)
    CosineSimilarity(a, b []float32) float64
    IsEnabled() bool
}
EmbeddingEngine defines the interface for embedding operations.
type SemanticCache ¶
type SemanticCache struct {
// contains filtered or unexported fields
}
SemanticCache provides similarity-based caching for routing decisions. It uses LRU eviction when the cache reaches maximum size.
func NewSemanticCache ¶
func NewSemanticCache(engine EmbeddingEngine, similarityThreshold float64, maxSize int) *SemanticCache
NewSemanticCache creates a new semantic cache instance.
Parameters:
- engine: The embedding engine for computing similarities
- similarityThreshold: Minimum similarity for a cache hit (0.0-1.0)
- maxSize: Maximum number of cache entries
Returns:
- *SemanticCache: A new cache instance
func (*SemanticCache) Clear ¶
func (c *SemanticCache) Clear()
Clear removes all entries from the cache.
func (*SemanticCache) GetHitRate ¶
func (c *SemanticCache) GetHitRate() float64
GetHitRate returns the cache hit rate as a fraction between 0.0 and 1.0.
Returns:
- float64: Hit rate (0.0-1.0)
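The documented range of 0.0-1.0 suggests the rate is a fraction of lookups that were hits. A minimal sketch of that calculation follows, assuming the rate is hits divided by hits plus misses; the package does not confirm this formula. The `metrics` type and `hitRate` function are hypothetical stand-ins for the `Hits` and `Misses` fields of `CacheMetrics`.

```go
package main

import "fmt"

// metrics mirrors only the Hits and Misses fields of CacheMetrics.
type metrics struct {
	Hits   int64
	Misses int64
}

// hitRate returns hits / (hits + misses) as a fraction in [0, 1].
// It returns 0 when no lookups have been recorded, which avoids a
// division by zero. The formula is an assumption about GetHitRate.
func hitRate(m metrics) float64 {
	total := m.Hits + m.Misses
	if total == 0 {
		return 0
	}
	return float64(m.Hits) / float64(total)
}

func main() {
	rate := hitRate(metrics{Hits: 30, Misses: 10})
	// Multiply by 100 when a percentage is needed for display.
	fmt.Printf("hit rate: %.2f (%.0f%%)\n", rate, rate*100)
}
```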
func (*SemanticCache) GetMetrics ¶
func (c *SemanticCache) GetMetrics() CacheMetrics
GetMetrics returns current cache performance metrics.
Returns:
- CacheMetrics: Current metrics snapshot
func (*SemanticCache) GetMetricsAsMap ¶
func (c *SemanticCache) GetMetricsAsMap() map[string]interface{}
GetMetricsAsMap returns current cache performance metrics as a map. This is used by the Lua plugin engine.
Returns:
- map[string]interface{}: Current metrics as a map
func (*SemanticCache) GetSize ¶
func (c *SemanticCache) GetSize() int
GetSize returns the current number of cached entries.
Returns:
- int: Number of entries
func (*SemanticCache) IsEnabled ¶
func (c *SemanticCache) IsEnabled() bool
IsEnabled returns whether the cache is operational.
Returns:
- bool: true if the cache is enabled
func (*SemanticCache) Lookup ¶
func (c *SemanticCache) Lookup(query string) (interface{}, error)
Lookup searches for a cached routing decision based on semantic similarity. It returns the cached decision when a sufficiently similar query is found, and nil when there is no match.
Parameters:
- query: The query text to look up
Returns:
- interface{}: The cached entry if found, or nil
- error: Any error during lookup
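One way a similarity lookup of this kind can work is a linear scan that keeps the best-scoring entry at or above the threshold. The sketch below shows that idea only; it is not the package's implementation, and the `entry`, `cosine`, and `bestMatch` names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// entry is a pared-down stand-in for CacheEntry.
type entry struct {
	Query     string
	Embedding []float32
	Decision  string
}

// cosine returns the cosine similarity of a and b, or 0 when the
// lengths differ or either vector has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// bestMatch scans every entry and returns the one most similar to the
// query embedding, provided its similarity is at least threshold.
// The boolean result is false on a cache miss.
func bestMatch(entries []entry, query []float32, threshold float64) (entry, bool) {
	best, bestSim := -1, threshold
	for i, e := range entries {
		if sim := cosine(e.Embedding, query); sim >= bestSim {
			best, bestSim = i, sim
		}
	}
	if best < 0 {
		return entry{}, false
	}
	return entries[best], true
}

func main() {
	entries := []entry{
		{Query: "weather today", Embedding: []float32{1, 0}, Decision: "model-a"},
		{Query: "write code", Embedding: []float32{0, 1}, Decision: "model-b"},
	}
	if e, ok := bestMatch(entries, []float32{0.9, 0.1}, 0.9); ok {
		fmt.Println("hit:", e.Decision)
	}
	if _, ok := bestMatch(entries, []float32{0.7, 0.7}, 0.9); !ok {
		fmt.Println("miss")
	}
}
```

A linear scan costs one similarity computation per cached entry, which is simple and adequate for small caches; larger caches would typically need an approximate nearest-neighbor index.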
func (*SemanticCache) Store ¶
func (c *SemanticCache) Store(query string, embedding []float32, decision string, metadata map[string]interface{}) error
Store adds a routing decision to the cache. If the cache is full, the least recently used entry is evicted.
Parameters:
- query: The query text
- embedding: The query embedding vector
- decision: The routing decision (model ID)
- metadata: Additional routing information
Returns:
- error: Any error during storage
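The least-recently-used eviction described for Store can be sketched with the standard `container/list` package. This is an illustration of the general technique under stated assumptions, not the package's actual code: the `lru`, `item`, `store`, and `touch` names are hypothetical, entries are keyed by exact query text, and a non-positive `maxSize` is treated as unbounded.

```go
package main

import (
	"container/list"
	"fmt"
)

// item is the payload held in each list element.
type item struct {
	query    string
	decision string
}

// lru is a hypothetical fixed-capacity store keyed by query text.
type lru struct {
	maxSize int
	order   *list.List               // front = most recently used
	items   map[string]*list.Element // query -> element in order
}

func newLRU(maxSize int) *lru {
	return &lru{
		maxSize: maxSize,
		order:   list.New(),
		items:   make(map[string]*list.Element),
	}
}

// store inserts or updates a decision. When the cache is full it
// evicts the least recently used entry and reports its query.
func (c *lru) store(query, decision string) (evicted string, ok bool) {
	if el, found := c.items[query]; found {
		el.Value.(*item).decision = decision
		c.order.MoveToFront(el)
		return "", false
	}
	if c.maxSize > 0 && c.order.Len() >= c.maxSize {
		oldest := c.order.Back()
		evicted = oldest.Value.(*item).query
		c.order.Remove(oldest)
		delete(c.items, evicted)
		ok = true
	}
	c.items[query] = c.order.PushFront(&item{query: query, decision: decision})
	return evicted, ok
}

// touch marks a query as recently used, as a cache hit would.
func (c *lru) touch(query string) bool {
	el, found := c.items[query]
	if found {
		c.order.MoveToFront(el)
	}
	return found
}

func main() {
	c := newLRU(2)
	c.store("q1", "model-a")
	c.store("q2", "model-b")
	c.touch("q1") // q2 is now the least recently used entry
	evicted, _ := c.store("q3", "model-c")
	fmt.Println("evicted:", evicted) // prints "evicted: q2"
}
```

Pairing a map with a doubly linked list keeps both the update-on-hit and the eviction at constant time per operation.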