llm

package

v1.10.0 (latest) | Published: Jun 17, 2026 | License: MIT | Imports: 11 | Imported by: 0

Repository

github.com/BackendStack21/odek

Documentation ¶

Overview ¶

Package llm provides an OpenAI-compatible HTTP client using only stdlib.

Index ¶

func ApplyCacheMarkers(messages []Message) ([]Message, []SystemBlock)
func DiscoverModelContext(baseURL, apiKey, model string) int
func ResetModelCache()
type CacheControl
type CallParams
type CallResult
type Client
- func New(baseURL, apiKey, model, thinking string, thinkingBudget int, ...) *Client
- func NewWithMaxTokens(baseURL, apiKey, model, thinking string, thinkingBudget int, maxTokens int, ...) *Client
- func (c *Client) Call(ctx context.Context, messages []Message, systemBlocks []SystemBlock, ...) (*CallResult, error)
- func (c *Client) SimpleCall(ctx context.Context, systemPrompt, userPrompt string) (string, error)
type FunctionDef
type Message
type SystemBlock
type ThinkingConfig
type ToolCall
type ToolDef

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ApplyCacheMarkers ¶

func ApplyCacheMarkers(messages []Message) ([]Message, []SystemBlock)

ApplyCacheMarkers annotates messages with Anthropic-style cache_control markers to enable prompt caching. It:

- Marks the first system message (if present) with cache_control: ephemeral
- Marks the first user message with cache_control: ephemeral

Returns the updated messages and the system blocks for the request's System field (populated only if the system message was moved out of the messages array for Anthropic compatibility). Providers that don't support prompt caching silently ignore these fields.
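The marking rule described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the types are local stand-ins for the documented ones, markForCaching is a hypothetical helper, and the sketch deliberately leaves out the step that moves the system message into separate system blocks.

```go
package main

import "fmt"

// Local stand-ins for the documented CacheControl and Message types.
type CacheControl struct {
	Type string `json:"type"`
}

type Message struct {
	Role         string        `json:"role"`
	Content      string        `json:"content"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

// markForCaching is a hypothetical helper. It copies the input slice and
// marks the first system message and the first user message as ephemeral.
func markForCaching(in []Message) []Message {
	out := make([]Message, len(in))
	copy(out, in)
	seenSystem, seenUser := false, false
	for i := range out {
		switch {
		case out[i].Role == "system" && !seenSystem:
			out[i].CacheControl = &CacheControl{Type: "ephemeral"}
			seenSystem = true
		case out[i].Role == "user" && !seenUser:
			out[i].CacheControl = &CacheControl{Type: "ephemeral"}
			seenUser = true
		}
	}
	return out
}

func main() {
	msgs := []Message{
		{Role: "system", Content: "You are terse."},
		{Role: "user", Content: "hi"},
		{Role: "user", Content: "again"},
	}
	// Prints each role and whether it carries a cache marker.
	for _, m := range markForCaching(msgs) {
		fmt.Println(m.Role, m.CacheControl != nil)
	}
}
```

Only the first message of each role is marked, so later user turns stay unmarked and the input slice itself is left untouched.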

func DiscoverModelContext ¶

func DiscoverModelContext(baseURL, apiKey, model string) int

DiscoverModelContext queries the /models endpoint of the configured base URL to discover the context window for the given model. Returns 0 if the endpoint doesn't support model attribute discovery or the model isn't found.

Results are cached per (baseURL, apiKey) so multiple agents using the same provider share a single API call. The full model list is cached so different model lookups from the same endpoint don't re-query.

Call this at startup before creating the engine. The HTTP call uses a 5s timeout, so it blocks startup for at most that long.
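The caching behavior described above (one fetch per (baseURL, apiKey), full model list retained, 0 for unknown models) could look roughly like this. The sketch assumes nothing about the package internals: modelCache, cacheKey, and the injected fetch function are hypothetical, and the HTTP request is replaced by an in-memory fake.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey identifies one provider endpoint plus credential.
type cacheKey struct{ baseURL, apiKey string }

// modelCache is a hypothetical stand-in for the discovery cache.
type modelCache struct {
	mu      sync.Mutex
	entries map[cacheKey]map[string]int // model name -> context window
	fetches int                         // number of simulated /models calls
}

// lookup returns the context window for model, calling fetch at most once
// per (baseURL, apiKey). Models missing from the list yield 0.
func (c *modelCache) lookup(baseURL, apiKey, model string, fetch func() map[string]int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := cacheKey{baseURL, apiKey}
	list, ok := c.entries[key]
	if !ok {
		list = fetch()
		c.fetches++
		if c.entries == nil {
			c.entries = make(map[cacheKey]map[string]int)
		}
		c.entries[key] = list
	}
	return list[model]
}

func main() {
	// fake stands in for the HTTP call to the /models endpoint.
	fake := func() map[string]int {
		return map[string]int{"model-a": 128000, "model-b": 32000}
	}
	var c modelCache
	const url, key = "https://example.invalid/v1", "secret"
	fmt.Println(c.lookup(url, key, "model-a", fake)) // 128000
	fmt.Println(c.lookup(url, key, "model-b", fake)) // 32000
	fmt.Println(c.lookup(url, key, "missing", fake)) // 0
	fmt.Println(c.fetches)                           // 1
}
```

Holding the mutex during the fetch keeps the sketch short; it also means concurrent callers for the same endpoint wait for the first fetch instead of issuing their own.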

func ResetModelCache ¶

func ResetModelCache()

ResetModelCache clears the model discovery cache. Used in tests.

Types ¶

type CacheControl ¶

type CacheControl struct {
	Type string `json:"type"` // "ephemeral"
}

CacheControl marks a message or system block as cacheable by Anthropic. Providers that don't support it (OpenAI, DeepSeek) silently ignore the field.

type CallParams ¶

type CallParams struct {
	Model           string          `json:"model"`
	Messages        []Message       `json:"messages"`
	System          []SystemBlock   `json:"system,omitempty"` // Anthropic-style system blocks
	Tools           []ToolDef       `json:"tools,omitempty"`
	Stream          bool            `json:"stream"`
	MaxTokens       int             `json:"max_tokens,omitempty"`  // max output tokens (0 = omit/provider default)
	Temperature     *float64        `json:"temperature,omitempty"` // 0–2, nil = provider default
	Thinking        *ThinkingConfig `json:"thinking,omitempty"`
	ReasoningEffort string          `json:"reasoning_effort,omitempty"`
}

CallParams is the request body for /chat/completions.
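To make the effect of the omitempty tags concrete, the sketch below marshals a trimmed local copy of the struct. Only the field tags come from the documentation above; callParams, encode, and the model name are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is a trimmed local copy of the documented type.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// callParams copies a subset of the documented CallParams fields and tags.
type callParams struct {
	Model       string    `json:"model"`
	Messages    []Message `json:"messages"`
	Stream      bool      `json:"stream"`
	MaxTokens   int       `json:"max_tokens,omitempty"`
	Temperature *float64  `json:"temperature,omitempty"`
}

// encode marshals p and panics on error, which cannot occur for this type.
func encode(p callParams) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	msgs := []Message{{Role: "user", Content: "hi"}}

	// MaxTokens == 0 and Temperature == nil are dropped from the body.
	fmt.Println(encode(callParams{Model: "example-model", Messages: msgs}))

	// A pointer to 0.0 is still sent, which is why the field is *float64.
	zero := 0.0
	fmt.Println(encode(callParams{Model: "example-model", Messages: msgs, MaxTokens: 256, Temperature: &zero}))
}
```

The stream field has no omitempty tag, so it is always present, even when false.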

type CallResult ¶

type CallResult struct {
	Content          string     // assistant text
	ReasoningContent string     // DeepSeek reasoning/thinking tokens
	ToolCalls        []ToolCall // tool calls requested by the model
	InputTokens      int        // prompt_tokens from API usage (0 = not reported)
	OutputTokens     int        // completion_tokens from API usage (0 = not reported)

	// Cache metrics. Only populated when the provider returns them.
	// Anthropic: cache_creation_input_tokens, cache_read_input_tokens
	// OpenAI: prompt_tokens_details.cached_tokens
	CacheCreationTokens int // Anthropic — tokens written to cache
	CacheReadTokens     int // Anthropic — tokens read from cache hit
	CachedTokens        int // OpenAI — cached tokens in prompt
}

CallResult is the parsed response from /chat/completions.
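Because the cache metrics arrive in provider-specific fields, callers that only want "how many prompt tokens were served from cache" may prefer a single accessor. The helper below is a hypothetical convenience, not part of the package, and uses a local copy of the token fields.

```go
package main

import "fmt"

// CallResult is a local copy of the documented token-accounting fields.
type CallResult struct {
	InputTokens         int
	OutputTokens        int
	CacheCreationTokens int // Anthropic: tokens written to cache
	CacheReadTokens     int // Anthropic: tokens read from cache
	CachedTokens        int // OpenAI: cached tokens in prompt
}

// cacheHitTokens is a hypothetical helper that returns whichever
// provider-specific cache-hit counter was populated, or 0 if neither was.
func cacheHitTokens(r CallResult) int {
	if r.CacheReadTokens > 0 {
		return r.CacheReadTokens
	}
	return r.CachedTokens
}

func main() {
	anthropicStyle := CallResult{InputTokens: 120, CacheReadTokens: 900}
	openAIStyle := CallResult{InputTokens: 1020, CachedTokens: 896}
	uncached := CallResult{InputTokens: 50}
	fmt.Println(cacheHitTokens(anthropicStyle), cacheHitTokens(openAIStyle), cacheHitTokens(uncached))
}
```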

type Client ¶

type Client struct {
	BaseURL        string
	APIKey         string
	Model          string
	Thinking       string  // "enabled", "disabled", "low", "medium", "high", or empty
	ThinkingBudget int     // max thinking tokens for Anthropic extended thinking (0 = use default 5000)
	MaxTokens      int     // max output tokens (0 = provider default)
	Temperature    float64 // 0 = use provider default, <0 = omit from request
	// contains filtered or unexported fields
}

Client sends chat completion requests to any OpenAI-compatible endpoint.

func New ¶

func New(baseURL, apiKey, model, thinking string, thinkingBudget int, timeout time.Duration) *Client

New creates a Client with the given timeout. Pass 0 to use the default (120s). The timeout applies to each HTTP request, not to the agent loop as a whole, which may issue several requests; set a generous timeout for deep-reasoning models.

func NewWithMaxTokens ¶

func NewWithMaxTokens(baseURL, apiKey, model, thinking string, thinkingBudget int, maxTokens int, timeout time.Duration) *Client

NewWithMaxTokens creates a Client with a specific max_tokens setting. maxTokens=0 omits max_tokens from the request, so the provider default applies.

func (*Client) Call ¶

func (c *Client) Call(ctx context.Context, messages []Message, systemBlocks []SystemBlock, tools []ToolDef) (*CallResult, error)

Call sends a chat completion request and returns the result. systemBlocks is optional — pass nil for providers that don't support the separate System field (OpenAI, DeepSeek). When non-nil, the system prompt is sent in the "system" field instead of as a system message in the messages array (Anthropic format for prompt caching).

func (*Client) SimpleCall ¶

func (c *Client) SimpleCall(ctx context.Context, systemPrompt, userPrompt string) (string, error)

SimpleCall sends a single-turn chat completion request and returns the text response. No tools, no streaming, no thinking config. Used for lightweight LLM calls like skill risk assessment.

type FunctionDef ¶

type FunctionDef struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	Parameters  any    `json:"parameters"`
}

FunctionDef defines a single tool's function signature.

type Message ¶

type Message struct {
	Role             string        `json:"role"`           // "system", "user", "assistant", "tool"
	Content          string        `json:"content"`        // text content
	Name             string        `json:"name,omitempty"` // tool name (for tool role)
	ToolCallID       string        `json:"tool_call_id,omitempty"`
	ToolCalls        []ToolCall    `json:"tool_calls,omitempty"`        // required for assistant role with tool calls
	ReasoningContent string        `json:"reasoning_content,omitempty"` // DeepSeek reasoning tokens, must be echoed back
	CacheControl     *CacheControl `json:"cache_control,omitempty"`     // Anthropic prompt caching marker
}

Message represents a chat message.

type SystemBlock ¶

type SystemBlock struct {
	Type         string        `json:"type"` // "text"
	Text         string        `json:"text"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

SystemBlock represents an Anthropic-style system prompt block with optional cache control. OpenAI-compatible endpoints that don't support this format silently ignore the field.

type ThinkingConfig ¶

type ThinkingConfig struct {
	Type         string `json:"type"`                    // "enabled" or "disabled"
	BudgetTokens int    `json:"budget_tokens,omitempty"` // Anthropic: max thinking tokens
}

ThinkingConfig controls extended thinking for DeepSeek and Anthropic models. Anthropic requires budget_tokens when type is "enabled"; DeepSeek ignores it.

type ToolCall ¶

type ToolCall struct {
	ID       string `json:"id"`
	Type     string `json:"type"` // always "function"
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
}

ToolCall represents a single tool invocation requested by the model. Matches the OpenAI API format exactly.
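In the OpenAI format, Function.Arguments is a JSON document encoded as a string, so it needs a second decoding step before use. A minimal sketch, with a local copy of the type and a hypothetical weatherArgs struct and decodeArgs helper:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall is a local copy of the documented type.
type ToolCall struct {
	ID       string `json:"id"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"`
	} `json:"function"`
}

// weatherArgs is a hypothetical argument shape for an example tool.
type weatherArgs struct {
	City string `json:"city"`
	Days int    `json:"days"`
}

// decodeArgs unmarshals the string-encoded arguments into dst.
// Models can emit malformed JSON, so the error must be checked.
func decodeArgs(tc ToolCall, dst any) error {
	return json.Unmarshal([]byte(tc.Function.Arguments), dst)
}

func main() {
	var tc ToolCall
	tc.ID = "call_1"
	tc.Type = "function"
	tc.Function.Name = "get_weather"
	tc.Function.Arguments = `{"city":"Oslo","days":3}`

	var args weatherArgs
	if err := decodeArgs(tc, &args); err != nil {
		fmt.Println("bad arguments:", err)
		return
	}
	fmt.Println(tc.Function.Name, args.City, args.Days)
}
```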

type ToolDef ¶

type ToolDef struct {
	Type     string      `json:"type"`
	Function FunctionDef `json:"function"`
}

ToolDef describes a tool offered to the model. The JSON Schema for the tool's arguments goes in Function.Parameters.
