Documentation
¶
Overview ¶
Package llm provides an OpenAI-compatible HTTP client using only stdlib.
Package llm provides an OpenAI-compatible HTTP client using only stdlib.
Index ¶
- func ApplyCacheMarkers(messages []Message) ([]Message, []SystemBlock)
- func DiscoverModelContext(baseURL, apiKey, model string) int
- func ResetModelCache()
- type CacheControl
- type CallParams
- type CallResult
- type Client
- type FunctionDef
- type Message
- type SystemBlock
- type ThinkingConfig
- type ToolCall
- type ToolDef
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ApplyCacheMarkers ¶
func ApplyCacheMarkers(messages []Message) ([]Message, []SystemBlock)
ApplyCacheMarkers annotates messages with Anthropic-style cache_control markers to enable prompt caching. It:
- Marks the first system message (if present) with cache_control: ephemeral
- Marks the first user message with cache_control: ephemeral
Returns the updated messages and a System field (populated if the system message was moved out of the messages array for Anthropic compatibility). Providers that don't support prompt caching silently ignore these fields.
func DiscoverModelContext ¶
DiscoverModelContext queries the /models endpoint of the configured base URL to discover the context window for the given model. Returns 0 if the endpoint doesn't support model attribute discovery or the model isn't found.
Results are cached per (baseURL, apiKey) so multiple agents using the same provider share a single API call. The full model list is cached so different model lookups from the same endpoint don't re-query.
Call this at startup before creating the engine. The HTTP call uses a 5s timeout and never blocks startup for more than that.
func ResetModelCache ¶
func ResetModelCache()
ResetModelCache clears the model discovery cache. Used in tests.
Types ¶
type CacheControl ¶
type CacheControl struct {
Type string `json:"type"` // "ephemeral"
}
CacheControl marks a message or system block as cacheable by Anthropic. Providers that don't support it (OpenAI, DeepSeek) silently ignore the field.
type CallParams ¶
type CallParams struct {
Model string `json:"model"`
Messages []Message `json:"messages"`
System []SystemBlock `json:"system,omitempty"` // Anthropic-style system blocks
Tools []ToolDef `json:"tools,omitempty"`
Stream bool `json:"stream"`
MaxTokens int `json:"max_tokens,omitempty"` // max output tokens (0 = omit/provider default)
Temperature *float64 `json:"temperature,omitempty"` // 0–2, nil = provider default
Thinking *ThinkingConfig `json:"thinking,omitempty"`
ReasoningEffort string `json:"reasoning_effort,omitempty"`
}
CallParams is the request body for /chat/completions.
type CallResult ¶
type CallResult struct {
Content string // assistant text
ReasoningContent string // DeepSeek reasoning/thinking tokens
ToolCalls []ToolCall // tool calls requested by the model
InputTokens int // prompt_tokens from API usage (0 = not reported)
OutputTokens int // completion_tokens from API usage (0 = not reported)
// Cache metrics. Only populated when the provider returns them.
// Anthropic: cache_creation_input_tokens, cache_read_input_tokens
// OpenAI: prompt_tokens_details.cached_tokens
CacheCreationTokens int // Anthropic — tokens written to cache
CacheReadTokens int // Anthropic — tokens read from cache hit
CachedTokens int // OpenAI — cached tokens in prompt
}
CallResult is the parsed response from /chat/completions.
type Client ¶
type Client struct {
BaseURL string
APIKey string
Model string
Thinking string // "enabled", "disabled", "low", "medium", "high", or empty
MaxTokens int // max output tokens (0 = provider default)
Temperature float64 // 0 = use provider default, <0 = omit from request
// contains filtered or unexported fields
}
Client sends chat completion requests to any OpenAI-compatible endpoint.
func New ¶
New creates a Client with the given timeout. Pass 0 to use the default (120s). The timeout applies per HTTP request — the agent loop may have multiple requests; set a generous timeout for deep-reasoning models.
func NewWithMaxTokens ¶
func NewWithMaxTokens(baseURL, apiKey, model, thinking string, maxTokens int, timeout time.Duration) *Client
NewWithMaxTokens creates a Client with a specific max_tokens setting. maxTokens=0 means no limit (provider default).
func (*Client) Call ¶
func (c *Client) Call(ctx context.Context, messages []Message, systemBlocks []SystemBlock, tools []ToolDef) (*CallResult, error)
Call sends a chat completion request and returns the result. systemBlocks is optional — pass nil for providers that don't support the separate System field (OpenAI, DeepSeek). When non-nil, the system prompt is sent in the "system" field instead of as a system message in the messages array (Anthropic format for prompt caching).
func (*Client) SimpleCall ¶
SimpleCall sends a single-turn chat completion request and returns the text response. No tools, no streaming, no thinking config. Used for lightweight LLM calls like skill risk assessment.
type FunctionDef ¶
type FunctionDef struct {
Name string `json:"name"`
Description string `json:"description"`
Parameters any `json:"parameters"`
}
FunctionDef defines a single tool's function signature.
type Message ¶
type Message struct {
Role string `json:"role"` // "system", "user", "assistant", "tool"
Content string `json:"content"` // text content
Name string `json:"name,omitempty"` // tool name (for tool role)
ToolCallID string `json:"tool_call_id,omitempty"`
ToolCalls []ToolCall `json:"tool_calls,omitempty"` // required for assistant role with tool calls
ReasoningContent string `json:"reasoning_content,omitempty"` // DeepSeek reasoning tokens, must be echoed back
CacheControl *CacheControl `json:"cache_control,omitempty"` // Anthropic prompt caching marker
}
Message represents a chat message.
type SystemBlock ¶
type SystemBlock struct {
Type string `json:"type"` // "text"
Text string `json:"text"`
CacheControl *CacheControl `json:"cache_control,omitempty"`
}
SystemBlock represents an Anthropic-style system prompt block with optional cache control. OpenAI-compatible endpoints that don't support this format silently ignore the field.
type ThinkingConfig ¶
type ThinkingConfig struct {
Type string `json:"type"` // "enabled" or "disabled"
}
ThinkingConfig controls Deepseek's extended thinking feature.
type ToolCall ¶
type ToolCall struct {
ID string `json:"id"`
Type string `json:"type"` // always "function"
Function struct {
Name string `json:"name"`
Arguments string `json:"arguments"`
} `json:"function"`
}
ToolCall represents a single tool invocation requested by the model. Matches the OpenAI API format exactly.
type ToolDef ¶
type ToolDef struct {
Type string `json:"type"`
Function FunctionDef `json:"function"`
}
ToolDef is the JSON Schema definition of a tool.