Documentation ¶
Index ¶
- func AdaptiveCapFor(vendor, baseURL, model string, userHint int) *adaptiveCap
- func BuildHeadersForProvider(protocol string) http.Header
- func DefaultHeadersForProtocol(protocol string) http.Header
- func DiscoverModels(ctx context.Context, resolved *config.ResolvedEndpoint) ([]string, error)
- func GetCachedContextWindow(vendor, baseURL, model string) int
- func InferContextWindowFromError(err error, currentTokenCount int, currentMaxTokens int, probeKey string, ...) int
- func IsContextOverflowError(err error) bool
- func IsImageBlockFallbackCandidate(err error) bool
- func LookupProbeCache(key string) int
- func MakeProbeKey(vendor, baseURL, model string) string
- func ProbeContextWindow(ctx context.Context, p Provider, vendor, baseURL, model string, ...)
- func ResolveImpersonationHeaders() http.Header
- func SetActiveImpersonation(preset *ImpersonationPreset, version string, customHeaders map[string]string)
- func SetProbeCache(key string, window int)
- func UserFacingError(err error) string
- type AnthropicProvider
- func (p *AnthropicProvider) Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
- func (p *AnthropicProvider) ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
- func (p *AnthropicProvider) CountTokens(ctx context.Context, messages []Message) (int, error)
- func (p *AnthropicProvider) Name() string
- func (p *AnthropicProvider) SetAdaptiveCap(c *adaptiveCap)
- type ChatResponse
- type ContentBlock
- func ImageBlock(mime, base64Data string) ContentBlock
- func TextBlock(text string) ContentBlock
- func ToolResultBlock(id, output string, isError bool) ContentBlock
- func ToolResultNamedBlock(id, name, output string, isError bool) ContentBlock
- func ToolResultWithImages(id, name, textOutput string, images []ContentImage, isError bool) ContentBlock
- func ToolUseBlock(id, name string, input json.RawMessage) ContentBlock
- type ContentImage
- type CopilotProvider
- type GeminiProvider
- func (p *GeminiProvider) Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
- func (p *GeminiProvider) ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
- func (p *GeminiProvider) CountTokens(ctx context.Context, messages []Message) (int, error)
- func (p *GeminiProvider) Name() string
- func (p *GeminiProvider) SetAdaptiveCap(c *adaptiveCap)
- func (p *GeminiProvider) UpdateRuntimeHeaders(headers http.Header)
- type HeaderMutable
- type ImpersonationPreset
- type Message
- type OpenAIProvider
- func NewOpenAIProvider(apiKey string, model string, maxTokens int) *OpenAIProvider
- func NewOpenAIProviderWithBaseURL(apiKey string, model string, maxTokens int, baseURL string) *OpenAIProvider
- func NewOpenAIProviderWithConfig(config openai.ClientConfig, apiKey, model string, maxTokens int, name string) *OpenAIProvider
- func (p *OpenAIProvider) Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
- func (p *OpenAIProvider) ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
- func (p *OpenAIProvider) CountTokens(ctx context.Context, messages []Message) (int, error)
- func (p *OpenAIProvider) Name() string
- func (p *OpenAIProvider) ReasoningEffort() string
- func (p *OpenAIProvider) SetAdaptiveCap(c *adaptiveCap)
- func (p *OpenAIProvider) SetReasoningEffort(effort string)
- func (p *OpenAIProvider) UpdateRuntimeHeaders(headers http.Header)
- type ProbeResult
- type Provider
- type ReasoningEffortProvider
- type StreamEvent
- type StreamEventType
- type TokenUsage
- type ToolCallDelta
- type ToolDefinition
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AdaptiveCapFor ¶ added in v1.1.45
AdaptiveCapFor returns the singleton adaptiveCap for the given identity. Identity is keyed on (vendor, baseURL, model) to avoid mixing learned values across distinct endpoints that share a model name.
userHint is the value the user (or default) configured. It is used as the initial `cur` only on first creation; subsequent calls return the existing learned cap (which has already been clamped by lo/hi from disk).
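The registry behaviour described above can be sketched as follows. This is a minimal illustration, not the package's code: `capState`, `capFor`, and the in-memory map are hypothetical stand-ins for the unexported adaptiveCap machinery, the `vendor|baseURL|model` key format is assumed from the ProbeResult.Key comment, and the lo/hi clamping from disk is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// capState is a hypothetical stand-in for the unexported adaptiveCap type.
type capState struct {
	cur int // current max-output-tokens value
}

var (
	capMu    sync.Mutex
	capByKey = map[string]*capState{}
)

// capFor returns one shared *capState per (vendor, baseURL, model) identity.
// userHint seeds cur only when the entry is created for the first time.
func capFor(vendor, baseURL, model string, userHint int) *capState {
	key := vendor + "|" + baseURL + "|" + model
	capMu.Lock()
	defer capMu.Unlock()
	if c, ok := capByKey[key]; ok {
		return c // the existing learned value wins over a later hint
	}
	c := &capState{cur: userHint}
	capByKey[key] = c
	return c
}

func main() {
	a := capFor("openai", "https://api.example.com/v1", "model-x", 4096)
	b := capFor("openai", "https://api.example.com/v1", "model-x", 8192)
	c := capFor("openai", "https://other.example.com/v1", "model-x", 8192)
	// Same identity shares one cap; a different baseURL gets its own.
	fmt.Println(a == b, b.cur, c.cur)
}
```

Keying on all three fields is what keeps two endpoints that serve the same model name from overwriting each other's learned value.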
func BuildHeadersForProvider ¶ added in v1.1.34
BuildHeadersForProvider returns the headers to use for a given protocol. If impersonation is active, returns impersonation headers. Otherwise returns protocol-specific defaults.
func DefaultHeadersForProtocol ¶ added in v1.1.34
DefaultHeadersForProtocol returns the default headers for a given protocol when no impersonation is active. These match the original hardcoded behavior.
func DiscoverModels ¶ added in v1.0.12
DiscoverModels fetches the latest model list for a resolved endpoint when the remote API exposes it.
func GetCachedContextWindow ¶ added in v1.2.5
GetCachedContextWindow checks the persistent cache and returns the stored context window, or 0 if not cached.
func InferContextWindowFromError ¶ added in v1.2.11
func InferContextWindowFromError(
	err error,
	currentTokenCount int,
	currentMaxTokens int,
	probeKey string,
	setMaxTokens func(int),
) int
InferContextWindowFromError is called when a context overflow error is received. It attempts to determine the model's actual context window:
- First, tries to parse an exact limit from the error message.
- If that fails, uses currentTokenCount as an upper-bound estimate.
- Matches the result to the nearest tier from contextOverflowTiers.
- If the inferred tier is strictly smaller than currentMaxTokens, updates the context manager via setMaxTokens and persists to the probe cache.
Returns the inferred context window (0 if no update was needed/possible).
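The steps above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `tiers` values stand in for contextOverflowTiers, the error wording matched by `limitRe` is invented, "nearest" is taken to mean smallest absolute distance, and the probe-cache write is omitted.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// apiError is a hypothetical error type used only for this demo.
type apiError string

func (e apiError) Error() string { return string(e) }

// tiers is a hypothetical stand-in for the package's contextOverflowTiers.
var tiers = []int{8192, 16384, 32768, 65536, 131072}

// limitRe matches an assumed message shape such as
// "maximum context length is 32768 tokens"; real providers vary.
var limitRe = regexp.MustCompile(`maximum context length is (\d+)`)

func abs(x int) int {
	if x < 0 {
		return -x
	}
	return x
}

// nearestTier returns the tier with the smallest absolute distance to n.
func nearestTier(n int) int {
	best := tiers[0]
	for _, t := range tiers[1:] {
		if abs(t-n) < abs(best-n) {
			best = t
		}
	}
	return best
}

// inferWindow follows the documented steps and returns 0 when no update
// is needed or possible.
func inferWindow(err error, tokenCount, currentMax int, setMax func(int)) int {
	estimate := tokenCount // fallback: the size of the prompt that failed
	if m := limitRe.FindStringSubmatch(err.Error()); m != nil {
		if n, convErr := strconv.Atoi(m[1]); convErr == nil {
			estimate = n // an exact limit in the message wins
		}
	}
	if estimate <= 0 {
		return 0
	}
	tier := nearestTier(estimate)
	if tier >= currentMax {
		return 0 // only ever shrink the configured window
	}
	setMax(tier)
	return tier
}

func main() {
	applied := 0
	err := apiError("400: maximum context length is 32768 tokens")
	got := inferWindow(err, 40000, 131072, func(n int) { applied = n })
	fmt.Println(got, applied)
}
```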
func IsContextOverflowError ¶ added in v1.2.7
IsContextOverflowError reports whether the error indicates that the input prompt exceeds the model's context window. These errors are never retryable: the same request will always fail until the context is compacted.
For comparison, the package's internal retry check (isRetryable) returns true for any error that is worth retrying, and it retries aggressively: only 401 (auth), 403 (forbidden), and 404 (not found) are considered permanent failures. Everything else (rate limits, server errors, timeouts, network glitches, bad gateway, and so on) gets retried.
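A retry decision that combines both rules could be sketched like this. `statusError`, `isContextOverflow`, and `shouldRetry` are hypothetical names, and the substring match used to detect an overflow is an assumption; the package's actual detection logic is not shown in this documentation.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// statusError is a hypothetical error type carrying an HTTP status code.
type statusError struct {
	Code int
	Msg  string
}

func (e statusError) Error() string { return fmt.Sprintf("%d: %s", e.Code, e.Msg) }

// isContextOverflow uses an assumed substring match for demonstration only.
func isContextOverflow(err error) bool {
	return strings.Contains(strings.ToLower(err.Error()), "context length")
}

// shouldRetry treats context overflow and 401/403/404 as permanent failures
// and retries everything else.
func shouldRetry(err error) bool {
	if isContextOverflow(err) {
		return false // the same request fails until the context is compacted
	}
	var se statusError
	if errors.As(err, &se) {
		switch se.Code {
		case http.StatusUnauthorized, http.StatusForbidden, http.StatusNotFound:
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(shouldRetry(statusError{429, "rate limited"}))
	fmt.Println(shouldRetry(statusError{401, "bad key"}))
	fmt.Println(shouldRetry(statusError{400, "maximum context length exceeded"}))
}
```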
func IsImageBlockFallbackCandidate ¶ added in v1.1.17
IsImageBlockFallbackCandidate reports whether an image-bearing request should be retried without image blocks. We intentionally gate on HTTP 400 only and only for requests that already attempted to send image content.
func LookupProbeCache ¶ added in v1.2.5
LookupProbeCache returns the cached context window for the given key. Returns 0 if not cached.
func MakeProbeKey ¶ added in v1.2.5
MakeProbeKey builds the cache key for a vendor+baseURL+model combination. Matches adaptiveCap's capKey convention.
func ProbeContextWindow ¶ added in v1.2.5
func ProbeContextWindow(ctx context.Context, p Provider, vendor, baseURL, model string, onResult func(ProbeResult))
ProbeContextWindow probes the actual context window limit for the given provider and reports it through onResult.
The call never blocks on the probe itself:
- Cache hit → onResult is called synchronously, before ProbeContextWindow returns (O(1) read + SetContextWindow under lock)
- Cache miss → onResult is called later from a background goroutine
Because onResult may run on any goroutine, the caller must make any shared state it touches thread-safe. ContextManager.SetContextWindow is already mutex-protected, so calling it from onResult is safe.
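The hit/miss split can be illustrated with a small sketch. It is not the package's implementation: `probe`, `probeResult`, and the in-memory map are hypothetical, the map stands in for the persistent cache, and `slowProbe` stands in for whatever request the real probe sends.

```go
package main

import (
	"fmt"
	"sync"
)

// probeResult mirrors the documented ProbeResult fields.
type probeResult struct {
	Key           string
	ContextWindow int
	FromCache     bool
}

var (
	cacheMu sync.RWMutex
	cache   = map[string]int{} // stand-in for the persistent probe cache
)

// probe calls onResult synchronously on a cache hit and from a background
// goroutine on a miss, so the caller never waits on slowProbe.
func probe(key string, slowProbe func() int, onResult func(probeResult)) {
	cacheMu.RLock()
	w, ok := cache[key]
	cacheMu.RUnlock()
	if ok {
		onResult(probeResult{Key: key, ContextWindow: w, FromCache: true})
		return
	}
	go func() {
		w := slowProbe()
		if w > 0 {
			cacheMu.Lock()
			cache[key] = w
			cacheMu.Unlock()
		}
		onResult(probeResult{Key: key, ContextWindow: w})
	}()
}

func main() {
	key := "vendor|https://api.example.com|model-x"
	done := make(chan probeResult, 1)
	report := func(r probeResult) { done <- r }

	probe(key, func() int { return 32768 }, report) // miss: background goroutine
	fmt.Println(<-done)
	probe(key, func() int { return 0 }, report) // hit: synchronous
	fmt.Println(<-done)
}
```

The callback here only sends on a buffered channel, which is safe from any goroutine; a callback that mutates shared state would need its own locking.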
func ResolveImpersonationHeaders ¶ added in v1.1.34
ResolveImpersonationHeaders builds the final http.Header set from the current impersonation state. Priority: custom headers > preset extra headers. Returns a new http.Header each time.
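The stated priority can be sketched with a simple layered merge. The `preset` type mirrors a subset of ImpersonationPreset; `resolveHeaders` is a hypothetical helper, and returning an empty header set for a nil preset is an assumption rather than documented behaviour.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// preset mirrors a subset of the documented ImpersonationPreset fields.
type preset struct {
	UATemplate     string
	ExtraHeaders   map[string]string
	DefaultVersion string
}

// resolveHeaders builds a fresh http.Header on every call. Custom headers
// are applied last, so they override the preset's extra headers.
func resolveHeaders(p *preset, version string, custom map[string]string) http.Header {
	h := http.Header{}
	if p == nil {
		return h // assumption: no preset means no impersonation headers
	}
	if version == "" {
		version = p.DefaultVersion
	}
	h.Set("User-Agent", strings.ReplaceAll(p.UATemplate, "{version}", version))
	for k, v := range p.ExtraHeaders {
		h.Set(k, v)
	}
	for k, v := range custom {
		h.Set(k, v)
	}
	return h
}

func main() {
	p := &preset{
		UATemplate:     "example-cli/{version}",
		ExtraHeaders:   map[string]string{"X-App": "preset"},
		DefaultVersion: "1.0.0",
	}
	h := resolveHeaders(p, "", map[string]string{"X-App": "custom"})
	fmt.Println(h.Get("User-Agent"), h.Get("X-App"))
}
```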
func SetActiveImpersonation ¶ added in v1.1.34
func SetActiveImpersonation(preset *ImpersonationPreset, version string, customHeaders map[string]string)
SetActiveImpersonation configures the global impersonation state. Pass nil preset to clear impersonation.
func SetProbeCache ¶ added in v1.2.5
SetProbeCache persists a discovered context window value.
func UserFacingError ¶ added in v1.1.34
UserFacingError translates a technical provider/API error into a concise, human-readable message suitable for display in the TUI or IM.
It strips SDK-specific noise, maps HTTP status codes to friendly text, and falls back to a generic message so users never see raw stack traces or protocol details.
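A translation layer of this kind could be sketched as below. The `apiStatusError` type, the `friendly` function, and every message string are hypothetical; the real function's wording and the error types it inspects are not documented here.

```go
package main

import (
	"errors"
	"fmt"
)

// apiStatusError is a hypothetical SDK-style error with a status code.
type apiStatusError struct {
	StatusCode int
	Body       string
}

func (e *apiStatusError) Error() string {
	return fmt.Sprintf("POST /v1/chat: %d %s", e.StatusCode, e.Body)
}

// friendly maps known status codes to short messages and falls back to a
// generic one, so raw protocol details never reach the user.
func friendly(err error) string {
	var se *apiStatusError
	if errors.As(err, &se) {
		switch {
		case se.StatusCode == 401:
			return "Authentication failed. Check your API key."
		case se.StatusCode == 429:
			return "Rate limit reached. Please try again shortly."
		case se.StatusCode >= 500:
			return "The provider is having trouble. Please try again later."
		}
	}
	return "Something went wrong while contacting the provider."
}

func main() {
	wrapped := fmt.Errorf("chat failed: %w", &apiStatusError{StatusCode: 429, Body: `{"error":"slow down"}`})
	fmt.Println(friendly(wrapped))
	fmt.Println(friendly(errors.New("dial tcp: connection reset")))
}
```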
Types ¶
type AnthropicProvider ¶
type AnthropicProvider struct {
// contains filtered or unexported fields
}
AnthropicProvider implements Provider using the Anthropic SDK.
func NewAnthropicProvider ¶
func NewAnthropicProvider(apiKey string, model string, maxTokens int) *AnthropicProvider
NewAnthropicProvider creates a new Anthropic provider.
func NewAnthropicProviderWithBaseURL ¶
func NewAnthropicProviderWithBaseURL(apiKey string, model string, maxTokens int, baseURL string) *AnthropicProvider
NewAnthropicProviderWithBaseURL creates a new Anthropic provider with a custom base URL.
func (*AnthropicProvider) Chat ¶
func (p *AnthropicProvider) Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
func (*AnthropicProvider) ChatStream ¶
func (p *AnthropicProvider) ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
func (*AnthropicProvider) CountTokens ¶
func (p *AnthropicProvider) CountTokens(ctx context.Context, messages []Message) (int, error)
func (*AnthropicProvider) Name ¶
func (p *AnthropicProvider) Name() string
func (*AnthropicProvider) SetAdaptiveCap ¶ added in v1.1.45
func (p *AnthropicProvider) SetAdaptiveCap(c *adaptiveCap)
SetAdaptiveCap installs the adaptive max-output-tokens cap.
type ChatResponse ¶
type ChatResponse struct {
Message Message
Usage TokenUsage
}
ChatResponse is the complete response from a non-streaming Chat call.
type ContentBlock ¶
type ContentBlock struct {
Type string `json:"type"` // "text", "image", "tool_use", "tool_result"
Text string `json:"text,omitempty"`
ImageMIME string `json:"image_mime,omitempty"` // MIME type for image blocks
ImageData string `json:"image_data,omitempty"` // base64-encoded image data
ToolName string `json:"tool_name,omitempty"`
ToolID string `json:"tool_id,omitempty"`
Input json.RawMessage `json:"input,omitempty"`
Output string `json:"output,omitempty"`
IsError bool `json:"is_error,omitempty"`
Images []ContentImage `json:"images,omitempty"` // images within a tool_result
ReasoningContent string `json:"reasoning_content,omitempty"` // DeepSeek reasoning content (must be echoed back)
ThinkingSignature string `json:"thinking_signature,omitempty"` // Anthropic extended thinking signature (must be echoed back)
ThinkingData string `json:"thinking_data,omitempty"` // Anthropic redacted thinking data (must be echoed back)
}
ContentBlock is a union type: text, image, tool call, or tool result.
func ImageBlock ¶
func ImageBlock(mime, base64Data string) ContentBlock
ImageBlock creates an image content block with base64-encoded data.
func ToolResultBlock ¶
func ToolResultBlock(id, output string, isError bool) ContentBlock
ToolResultBlock creates a tool result content block.
func ToolResultNamedBlock ¶ added in v1.1.14
func ToolResultNamedBlock(id, name, output string, isError bool) ContentBlock
ToolResultNamedBlock creates a tool result content block with the originating tool name.
func ToolResultWithImages ¶ added in v1.1.34
func ToolResultWithImages(id, name, textOutput string, images []ContentImage, isError bool) ContentBlock
ToolResultWithImages creates a tool result that carries both text and images.
func ToolUseBlock ¶
func ToolUseBlock(id, name string, input json.RawMessage) ContentBlock
ToolUseBlock creates a tool call content block.
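To show how the union type serializes, here is a sketch that mirrors a subset of the ContentBlock fields and JSON tags locally. The lower-case constructors are hypothetical re-implementations; which fields the real constructors populate is an assumption based on the field names and the Type comment.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// contentBlock mirrors a subset of the documented ContentBlock fields.
type contentBlock struct {
	Type     string          `json:"type"`
	Text     string          `json:"text,omitempty"`
	ToolName string          `json:"tool_name,omitempty"`
	ToolID   string          `json:"tool_id,omitempty"`
	Input    json.RawMessage `json:"input,omitempty"`
	Output   string          `json:"output,omitempty"`
	IsError  bool            `json:"is_error,omitempty"`
}

func textBlock(text string) contentBlock {
	return contentBlock{Type: "text", Text: text}
}

func toolUseBlock(id, name string, input json.RawMessage) contentBlock {
	return contentBlock{Type: "tool_use", ToolID: id, ToolName: name, Input: input}
}

func toolResultBlock(id, output string, isError bool) contentBlock {
	return contentBlock{Type: "tool_result", ToolID: id, Output: output, IsError: isError}
}

// encode marshals blocks to JSON; unused fields drop out via omitempty.
func encode(blocks []contentBlock) string {
	b, err := json.Marshal(blocks)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	blocks := []contentBlock{
		textBlock("Listing files."),
		toolUseBlock("call_1", "ls", json.RawMessage(`{"path":"."}`)),
		toolResultBlock("call_1", "main.go", false),
	}
	fmt.Println(encode(blocks))
}
```

Because every variant shares one struct, the Type field is the only reliable discriminator when reading blocks back.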
type ContentImage ¶ added in v1.1.34
ContentImage represents an image within a tool_result block.
type CopilotProvider ¶ added in v1.1.16
type CopilotProvider struct {
*OpenAIProvider
}
func NewCopilotProvider ¶ added in v1.1.16
func NewCopilotProvider(apiKey, model string, maxTokens int, baseURL string) *CopilotProvider
func (*CopilotProvider) Name ¶ added in v1.1.16
func (p *CopilotProvider) Name() string
func (*CopilotProvider) SetImpersonatedUA ¶ added in v1.1.34
func (p *CopilotProvider) SetImpersonatedUA(ua string)
SetImpersonatedUA sets the impersonated User-Agent for the copilot transport.
type GeminiProvider ¶
type GeminiProvider struct {
// contains filtered or unexported fields
}
GeminiProvider implements Provider using the Google Generative AI API.
func NewGeminiProvider ¶
func NewGeminiProvider(apiKey string, model string, maxTokens int) (*GeminiProvider, error)
NewGeminiProvider creates a new Gemini provider.
func NewGeminiProviderWithBaseURL ¶ added in v1.1.14
func NewGeminiProviderWithBaseURL(apiKey string, model string, maxTokens int, baseURL string) (*GeminiProvider, error)
NewGeminiProviderWithBaseURL creates a new Gemini provider with a custom base URL.
func (*GeminiProvider) Chat ¶
func (p *GeminiProvider) Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
func (*GeminiProvider) ChatStream ¶
func (p *GeminiProvider) ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
func (*GeminiProvider) CountTokens ¶
func (p *GeminiProvider) CountTokens(ctx context.Context, messages []Message) (int, error)
func (*GeminiProvider) Name ¶
func (p *GeminiProvider) Name() string
func (*GeminiProvider) SetAdaptiveCap ¶ added in v1.1.45
func (p *GeminiProvider) SetAdaptiveCap(c *adaptiveCap)
SetAdaptiveCap installs the adaptive max-output-tokens cap.
func (*GeminiProvider) UpdateRuntimeHeaders ¶ added in v1.1.34
func (p *GeminiProvider) UpdateRuntimeHeaders(headers http.Header)
UpdateRuntimeHeaders updates the injected headers at runtime.
type HeaderMutable ¶ added in v1.1.34
HeaderMutable is implemented by providers that support runtime header updates.
type ImpersonationPreset ¶ added in v1.1.34
type ImpersonationPreset struct {
ID string // e.g. "claude-cli", "codex-cli"
DisplayName string // e.g. "Claude CLI"
UATemplate string // User-Agent template; {version} is replaced at runtime
ExtraHeaders map[string]string // additional headers
DefaultVersion string // default version when no custom version is set
}
ImpersonationPreset defines a known CLI tool identity that can be used to set the User-Agent and other HTTP headers when communicating with LLM APIs.
func DefaultImpersonationPresets ¶ added in v1.1.34
func DefaultImpersonationPresets() []ImpersonationPreset
DefaultImpersonationPresets returns the ordered list of available presets.
func FindPresetByID ¶ added in v1.1.34
func FindPresetByID(id string) *ImpersonationPreset
FindPresetByID looks up a preset by its ID. Returns nil if not found.
func GetActiveImpersonation ¶ added in v1.1.34
func GetActiveImpersonation() (preset *ImpersonationPreset, version string, customHeaders map[string]string)
GetActiveImpersonation returns the current impersonation state.
type Message ¶
type Message struct {
Role string `json:"role"` // "user", "assistant", "system"
Content []ContentBlock `json:"content"`
}
Message represents a single message in the conversation.
type OpenAIProvider ¶
type OpenAIProvider struct {
// contains filtered or unexported fields
}
OpenAIProvider implements Provider using the OpenAI-compatible API.
func NewOpenAIProvider ¶
func NewOpenAIProvider(apiKey string, model string, maxTokens int) *OpenAIProvider
NewOpenAIProvider creates a new OpenAI provider.
func NewOpenAIProviderWithBaseURL ¶
func NewOpenAIProviderWithBaseURL(apiKey string, model string, maxTokens int, baseURL string) *OpenAIProvider
NewOpenAIProviderWithBaseURL creates a new OpenAI provider with a custom base URL.
func NewOpenAIProviderWithConfig ¶ added in v1.1.16
func NewOpenAIProviderWithConfig(config openai.ClientConfig, apiKey, model string, maxTokens int, name string) *OpenAIProvider
func (*OpenAIProvider) Chat ¶
func (p *OpenAIProvider) Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
func (*OpenAIProvider) ChatStream ¶
func (p *OpenAIProvider) ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
func (*OpenAIProvider) CountTokens ¶
func (p *OpenAIProvider) CountTokens(ctx context.Context, messages []Message) (int, error)
func (*OpenAIProvider) Name ¶
func (p *OpenAIProvider) Name() string
func (*OpenAIProvider) ReasoningEffort ¶ added in v1.3.68
func (p *OpenAIProvider) ReasoningEffort() string
func (*OpenAIProvider) SetAdaptiveCap ¶ added in v1.1.45
func (p *OpenAIProvider) SetAdaptiveCap(c *adaptiveCap)
SetAdaptiveCap installs (or replaces) the adaptive max-output-tokens cap. Used by NewProvider to share learned state across reconstructions.
func (*OpenAIProvider) SetReasoningEffort ¶ added in v1.3.68
func (p *OpenAIProvider) SetReasoningEffort(effort string)
func (*OpenAIProvider) UpdateRuntimeHeaders ¶ added in v1.1.34
func (p *OpenAIProvider) UpdateRuntimeHeaders(headers http.Header)
UpdateRuntimeHeaders updates the injected headers at runtime.
type ProbeResult ¶ added in v1.2.5
type ProbeResult struct {
Key string // "vendor|baseURL|model"
ContextWindow int // discovered value, 0 if probe failed
FromCache bool // true if value came from persistent cache
}
ProbeResult is delivered asynchronously after a probe completes.
type Provider ¶
type Provider interface {
// Name returns the provider identifier (e.g., "anthropic", "openai", "gemini").
Name() string
// Chat sends a non-streaming request and returns the complete response.
// Used for token counting, summarization, and cost estimation.
Chat(ctx context.Context, messages []Message, tools []ToolDefinition) (*ChatResponse, error)
// ChatStream sends a streaming request and returns a channel of events.
// The channel is closed when the stream ends.
ChatStream(ctx context.Context, messages []Message, tools []ToolDefinition) (<-chan StreamEvent, error)
// CountTokens returns the token count for the given messages.
// Returns an error if the provider does not support counting.
CountTokens(ctx context.Context, messages []Message) (int, error)
}
Provider is the interface every LLM backend must implement.
func NewProvider ¶
func NewProvider(resolved *config.ResolvedEndpoint) (Provider, error)
NewProvider creates a protocol adapter from a resolved endpoint.
type ReasoningEffortProvider ¶ added in v1.3.68
type StreamEvent ¶
type StreamEvent struct {
Type StreamEventType
Text string // for TextChunk, Reasoning
ThinkingSignature string // for Reasoning (Anthropic extended thinking signature)
Tool ToolCallDelta // for ToolCallChunk / ToolCallDone
Result string // for ToolResult
IsError bool // for ToolResult
Usage *TokenUsage // for Done (nil if not final)
Error error // for Error
}
StreamEvent is sent over a channel during streaming responses.
type StreamEventType ¶
type StreamEventType int
const (
	StreamEventText StreamEventType = iota
	StreamEventToolCallChunk
	StreamEventToolCallDone
	StreamEventToolResult
	StreamEventDone
	StreamEventError
	StreamEventReasoning // thinking/reasoning content (DeepSeek, etc.)
	StreamEventSystem    // system notification (retry status, etc.)
)
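A consumer of the ChatStream channel typically switches on the event type. The sketch below mirrors the event types and a subset of StreamEvent locally; `collectText` is a hypothetical helper, and stopping at the Done or Error event (rather than draining until the channel closes) is a design assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// eventType mirrors the documented StreamEventType constants.
type eventType int

const (
	eventText eventType = iota
	eventToolCallChunk
	eventToolCallDone
	eventToolResult
	eventDone
	eventError
	eventReasoning
	eventSystem
)

// event mirrors a subset of the documented StreamEvent fields.
type event struct {
	Type  eventType
	Text  string
	Error error
}

// collectText concatenates text chunks and ignores other event types.
// It stops at Done, at Error, or when the channel is closed.
func collectText(events <-chan event) (string, error) {
	var sb strings.Builder
	for ev := range events {
		switch ev.Type {
		case eventText:
			sb.WriteString(ev.Text)
		case eventError:
			return sb.String(), ev.Error
		case eventDone:
			return sb.String(), nil
		}
	}
	return sb.String(), nil
}

func main() {
	ch := make(chan event, 4)
	ch <- event{Type: eventText, Text: "Hel"}
	ch <- event{Type: eventReasoning, Text: "(thinking)"}
	ch <- event{Type: eventText, Text: "lo"}
	ch <- event{Type: eventDone}
	close(ch)

	text, err := collectText(ch)
	fmt.Println(text, err)
}
```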
type TokenUsage ¶
type TokenUsage struct {
InputTokens int `json:"input_tokens"`
OutputTokens int `json:"output_tokens"`
CacheRead int `json:"cache_read_tokens"`
CacheWrite int `json:"cache_write_tokens"`
PromptTokensTotal int `json:"prompt_tokens_total,omitempty"`
}
TokenUsage records token consumption for a single API call.
func (TokenUsage) Add ¶ added in v1.3.40
func (u TokenUsage) Add(delta TokenUsage) TokenUsage
func (TokenUsage) CacheHitPercent ¶ added in v1.3.41
func (u TokenUsage) CacheHitPercent() int
func (TokenUsage) DisplayInputTokens ¶ added in v1.3.55
func (u TokenUsage) DisplayInputTokens() int
func (TokenUsage) Total ¶ added in v1.3.40
func (u TokenUsage) Total() int
func (TokenUsage) TotalInputTokens ¶ added in v1.3.55
func (u TokenUsage) TotalInputTokens() int
type ToolCallDelta ¶
type ToolCallDelta struct {
ID string // tool call ID (stable across chunks)
Index int // position in the tool call list
Name string // tool name (may be empty in early chunks)
Arguments json.RawMessage // accumulated arguments so far
}
ToolCallDelta represents a (possibly partial) tool call from a streaming response.
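Since Arguments holds the accumulated value so far, a consumer can keep only the latest delta per Index. The sketch below mirrors ToolCallDelta locally; `mergeDeltas` is a hypothetical helper, and carrying the ID and Name forward from earlier chunks when a later one omits them is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// toolCallDelta mirrors the documented ToolCallDelta fields.
type toolCallDelta struct {
	ID        string
	Index     int
	Name      string
	Arguments json.RawMessage
}

// mergeDeltas keeps the most recent delta for each Index, preserving an
// ID or Name seen earlier if a later chunk leaves it empty.
func mergeDeltas(deltas []toolCallDelta) []toolCallDelta {
	latest := map[int]toolCallDelta{}
	for _, d := range deltas {
		if prev, seen := latest[d.Index]; seen {
			if d.Name == "" {
				d.Name = prev.Name
			}
			if d.ID == "" {
				d.ID = prev.ID
			}
		}
		latest[d.Index] = d
	}
	out := make([]toolCallDelta, 0, len(latest))
	for _, d := range latest {
		out = append(out, d)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Index < out[j].Index })
	return out
}

func main() {
	calls := mergeDeltas([]toolCallDelta{
		{ID: "call_1", Index: 0, Name: "ls", Arguments: json.RawMessage(`{"pa`)},
		{ID: "call_1", Index: 0, Arguments: json.RawMessage(`{"path":"."}`)},
		{ID: "call_2", Index: 1, Name: "cat", Arguments: json.RawMessage(`{"file":"a.go"}`)},
	})
	for _, c := range calls {
		fmt.Println(c.Index, c.ID, c.Name, string(c.Arguments))
	}
}
```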
type ToolDefinition ¶
type ToolDefinition struct {
Name string `json:"name"`
Description string `json:"description"`
Parameters json.RawMessage `json:"parameters"` // JSON Schema
}
ToolDefinition describes a tool to the LLM provider.