Documentation ¶
Overview ¶
Package dedup deduplicates LLM proxy requests. Identical in-flight requests share a single upstream call, and recently completed responses are served from a cache for a short TTL.
Index ¶
Constants ¶
const (
	// DefaultTTL is how long completed responses stay cached.
	DefaultTTL = 30 * time.Second

	// DefaultMaxBodySize is the max response body size to cache (1 MB).
	DefaultMaxBodySize = 1 << 20
)
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Deduplicator ¶
type Deduplicator struct {
// contains filtered or unexported fields
}
Deduplicator coalesces identical requests and caches recent responses.
func New ¶
func New(opts ...Option) *Deduplicator
New creates a Deduplicator with the given options.
func (*Deduplicator) Do ¶
Do executes fn at most once for concurrent identical requests, where identity is determined by the JSON body. If another goroutine is already executing a request with the same key, Do blocks until that request completes and returns the same result. Recently completed responses are returned from the cache without calling fn.
Do returns the response, a flag reporting whether the result came from the cache or from a coalesced in-flight call, and any error.
func (*Deduplicator) Len ¶
func (d *Deduplicator) Len() int
Len returns the number of entries in the completed cache.
func (*Deduplicator) Prune ¶
func (d *Deduplicator) Prune() int
Prune removes expired entries from the completed cache.
type Option ¶
type Option func(*Deduplicator)
Option configures a Deduplicator.
func WithMaxBodySize ¶
WithMaxBodySize sets the maximum response body size to cache.
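Option follows Go's functional options pattern: New takes a variadic list of functions, each of which mutates the value under construction. The sketch below shows the pattern in isolation. The config, option, withMaxBodySize, and newConfig names are hypothetical, and the int parameter is an assumption, since the signature of WithMaxBodySize is not shown above.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the unexported Deduplicator fields.
type config struct {
	maxBodySize int
}

// option mirrors the shape of Option: a function that mutates the target.
type option func(*config)

// withMaxBodySize returns an option that overrides the size limit.
// Assumption: the real WithMaxBodySize may take a different integer type.
func withMaxBodySize(n int) option {
	return func(c *config) { c.maxBodySize = n }
}

// newConfig starts from a default and applies each option in order.
func newConfig(opts ...option) *config {
	c := &config{maxBodySize: 1 << 20} // same value as DefaultMaxBodySize
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	fmt.Println(newConfig().maxBodySize)                      // prints 1048576
	fmt.Println(newConfig(withMaxBodySize(4096)).maxBodySize) // prints 4096
}
```

Because options are applied in order, a later option overrides an earlier one that sets the same field.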