openaicompatible

package module
v0.0.0-...-d326e01
Published: May 8, 2026 License: MIT Imports: 13 Imported by: 0

README

openai-compatible

A small, dependency-light Go client for any server that speaks the OpenAI Chat Completions wire format. Drop it in front of:

  • the official OpenAI API (https://api.openai.com/v1)
  • the AT gateway (http://<host>/gateway/v1)
  • Ollama, vLLM, LiteLLM, Together, Groq, GitHub Models, Azure OpenAI, …

Anything that exposes POST /chat/completions, POST /embeddings, and GET /models in OpenAI shape will work.

Install

go get github.com/rakunlabs/at/pkg/openai-compatible

Quick start

import oa "github.com/rakunlabs/at/pkg/openai-compatible"

client, err := oa.New(
    oa.WithBaseURL("http://localhost:8080/gateway/v1"), // or DefaultBaseURL
    oa.WithAPIKey(os.Getenv("AT_API_TOKEN")),
    oa.WithModel("openai/gpt-4o-mini"),
)
if err != nil { log.Fatal(err) }

resp, err := client.Chat(ctx, &oa.ChatRequest{
    Messages: []oa.Message{
        oa.SystemMessage("You are a helpful assistant."),
        oa.UserMessage("Hello!"),
    },
})
if err != nil { log.Fatal(err) }

fmt.Println(resp.Content())

Streaming

stream, err := client.ChatStream(ctx, &oa.ChatRequest{
    Messages: []oa.Message{oa.UserMessage("Tell me a haiku about Go.")},
})
if err != nil { log.Fatal(err) }
defer stream.Close()

for {
    ev, err := stream.Recv()
    if errors.Is(err, io.EOF) { break }
    if err != nil { log.Fatal(err) }

    for _, ch := range ev.Choices {
        fmt.Print(ch.Delta.Content)
    }
}

To get the assembled final response (with reassembled tool-call arguments):

final, err := oa.AccumulateStream(stream, func(ev *oa.StreamEvent) {
    // optional: forward each chunk to a UI
})

Tool / function calling

weather := oa.FunctionTool(
    "get_weather",
    "Return the current weather for a city.",
    map[string]any{
        "type": "object",
        "properties": map[string]any{
            "city": map[string]any{"type": "string"},
        },
        "required": []string{"city"},
    },
)

resp, _ := client.Chat(ctx, &oa.ChatRequest{
    Messages: []oa.Message{oa.UserMessage("Weather in Istanbul?")},
    Tools:    []oa.Tool{weather},
})

for _, tc := range resp.ToolCalls() {
    args, _ := tc.ArgumentsMap()
    result := callMyTool(tc.Function.Name, args)
    // Feed the result back as a tool message and call Chat again
    // (see example/main.go for the full loop).
}

Multimodal content

msg := oa.UserMessageParts(
    oa.TextPart("What's in this picture?"),
    oa.ImageURLPart("https://example.com/cat.jpg", "auto"),
)
// or attach inline base64:
oa.ImageDataPart("image/png", pngBytes, "low")
oa.InputAudioPart(wavBytes, "wav")
oa.FilePartByID("file-abc123")

Embeddings

emb, _ := client.Embeddings(ctx, &oa.EmbeddingRequest{
    Model: "text-embedding-3-small",
    Input: []string{"hello", "world"},
})
v, _ := emb.Data[0].AsFloat()

List models

list, _ := client.ListModels(ctx)
for _, m := range list.Data {
    fmt.Println(m.ID)
}

Errors

Non-2xx responses are returned as a typed *APIError. Rate-limited responses (HTTP 429 or error.type == "rate_limit_error") are returned as *RateLimitError with the parsed Retry-After header:

var rle *oa.RateLimitError
if errors.As(err, &rle) {
    time.Sleep(rle.RetryAfter)
}

Options

Option                     Purpose
WithBaseURL                API root (with or without /chat/completions)
WithAPIKey                 Bearer token
WithModel                  Default model id
WithHeader / WithHeaders   Extra request headers
WithUserAgent              Override User-Agent
WithProxy                  HTTP/HTTPS/SOCKS5 proxy
WithInsecureSkipVerify     Skip TLS verification (dev only)
WithTimeout                Overall request timeout
WithDisableRetry           Turn off retry-with-backoff
WithRetryMax               Max retry attempts (default 4)
WithHTTPClient             Use a caller-supplied *http.Client
WithOKOptions              Forward arbitrary ok.OptionClientFn options

ChatRequest.Extra is a map[string]any that gets merged into the on-the-wire JSON for any provider-specific fields the typed struct does not yet expose (e.g. web_search_options, thinking, top_k, min_p, …).
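The documented merge semantics (Extra keys never overwrite typed fields) can be sketched with a small stand-alone helper; `mergeExtra` and the `req` type here are hypothetical illustrations, not the library's implementation:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeExtra marshals the typed struct, decodes it into a map, then adds
// Extra keys only where no typed field already produced a value, so typed
// fields win on conflict.
func mergeExtra(typed any, extra map[string]any) ([]byte, error) {
	b, err := json.Marshal(typed)
	if err != nil {
		return nil, err
	}
	m := map[string]any{}
	if err := json.Unmarshal(b, &m); err != nil {
		return nil, err
	}
	for k, v := range extra {
		if _, exists := m[k]; !exists {
			m[k] = v
		}
	}
	return json.Marshal(m)
}

type req struct {
	Model string `json:"model"`
}

func main() {
	out, _ := mergeExtra(req{Model: "gpt-4o-mini"},
		map[string]any{"top_k": 40, "model": "ignored"})
	fmt.Println(string(out)) // {"model":"gpt-4o-mini","top_k":40}
}
```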

Run the example

cd pkg/openai-compatible
go run ./example \
    -base-url "http://localhost:8080/gateway/v1" \
    -api-key  "$AT_API_TOKEN" \
    -model    "openai/gpt-4o-mini"

Documentation

Overview

Package openaicompatible is a small, dependency-light Go client for any server that speaks the OpenAI Chat Completions wire format. It is designed to talk to:

  • the official OpenAI API (https://api.openai.com/v1)
  • the AT gateway (http://<host>/gateway/v1)
  • any other OpenAI-compatible endpoint (Ollama, vLLM, LiteLLM, Together, Groq, GitHub Models, Azure OpenAI, ...)

The package surface is intentionally narrow: a single Client with methods for chat, streaming, embeddings, and listing models. Everything that goes on the wire is exposed as a plain Go struct that mirrors the OpenAI shape, so users can set any field a particular server supports — including fields this library does not know about — via the [ChatRequest.Extra] map.

Basic usage:

client, err := openaicompatible.New(
    openaicompatible.WithBaseURL("https://api.openai.com/v1"),
    openaicompatible.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
)
if err != nil { ... }

resp, err := client.Chat(ctx, &openaicompatible.ChatRequest{
    Model: "gpt-4o-mini",
    Messages: []openaicompatible.Message{
        openaicompatible.SystemMessage("You are a helpful assistant."),
        openaicompatible.UserMessage("Hello!"),
    },
})

Constants

const (
	RoleSystem    = "system"
	RoleUser      = "user"
	RoleAssistant = "assistant"
	RoleTool      = "tool"
	// RoleDeveloper is OpenAI's instruction-priority role for newer models
	// (o-series, gpt-4.1+). Servers that don't recognise it will typically
	// treat it as system.
	RoleDeveloper = "developer"
)

Standard message roles. Servers may accept additional values; these are just the well-known ones for convenience and to avoid string typos.

const DefaultBaseURL = "https://api.openai.com/v1"

DefaultBaseURL is the default OpenAI v1 root used when no base URL is provided. Note: it is the API root, not the chat endpoint — the client appends "/chat/completions", "/embeddings", "/models" itself.
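The described URL handling (treat the configured URL as the API root, strip a pasted "/chat/completions" suffix, then append the endpoint path) can be sketched like this; `endpointURL` is a hypothetical helper for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// endpointURL treats base as the API root: a trailing "/chat/completions"
// (e.g. from a pasted chat URL) is stripped, then the endpoint path is
// appended.
func endpointURL(base, endpoint string) string {
	base = strings.TrimSuffix(strings.TrimRight(base, "/"), "/chat/completions")
	return strings.TrimRight(base, "/") + endpoint
}

func main() {
	fmt.Println(endpointURL("https://api.openai.com/v1", "/models"))
	fmt.Println(endpointURL("http://localhost:8080/gateway/v1/chat/completions", "/embeddings"))
}
```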

Variables

This section is empty.

Functions

func IsAPIError

func IsAPIError(err error) bool

IsAPIError reports whether err is or wraps an *APIError.

func IsRateLimit

func IsRateLimit(err error) bool

IsRateLimit reports whether err is or wraps a *RateLimitError.

func ToolChoiceFunction

func ToolChoiceFunction(name string) any

ToolChoiceFunction returns a value suitable for [ChatRequest.ToolChoice] that forces the model to call the named function.

Types

type APIError

type APIError struct {
	StatusCode int
	Status     string
	Message    string
	Type       string
	Code       string
	Param      string
	// RawBody is the original (possibly truncated) response body, useful
	// for debugging when the server returned a non-standard error shape.
	RawBody string
	// Header is the response header (kept so callers can inspect rate-limit
	// or trace headers).
	Header http.Header
}

APIError wraps a non-2xx response from the server. The HTTP status code is always populated; Type, Code, and Param come from the OpenAI-style error envelope and may be empty for non-OpenAI servers.

func (*APIError) Error

func (e *APIError) Error() string

type ChatRequest

type ChatRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`

	// Tools and tool selection.
	Tools             []Tool `json:"tools,omitempty"`
	ToolChoice        any    `json:"tool_choice,omitempty"` // "auto" | "none" | "required" | {type:"function",function:{name:"x"}}
	ParallelToolCalls *bool  `json:"parallel_tool_calls,omitempty"`

	// Sampling.
	Temperature      *float64       `json:"temperature,omitempty"`
	TopP             *float64       `json:"top_p,omitempty"`
	N                *int           `json:"n,omitempty"`
	Stop             any            `json:"stop,omitempty"` // string or []string
	Seed             *int           `json:"seed,omitempty"`
	PresencePenalty  *float64       `json:"presence_penalty,omitempty"`
	FrequencyPenalty *float64       `json:"frequency_penalty,omitempty"`
	LogitBias        map[string]int `json:"logit_bias,omitempty"`

	// Output limits.
	MaxTokens           *int `json:"max_tokens,omitempty"`
	MaxCompletionTokens *int `json:"max_completion_tokens,omitempty"` // OpenAI o-series

	// Output shape.
	ResponseFormat any    `json:"response_format,omitempty"` // {"type":"json_object"} or json_schema
	User           string `json:"user,omitempty"`

	// Reasoning / thinking.
	ReasoningEffort string `json:"reasoning_effort,omitempty"` // "low" | "medium" | "high"

	// Streaming. Callers should use [Client.ChatStream] rather than setting
	// these directly; ChatStream populates them automatically.
	Stream        bool           `json:"stream,omitempty"`
	StreamOptions *StreamOptions `json:"stream_options,omitempty"`

	// Extra carries arbitrary additional fields that will be merged into
	// the JSON body. Use it for server-specific extensions such as
	// "web_search_options", "thinking", "top_k", "min_p", etc.
	Extra map[string]any `json:"-"`
}

ChatRequest is the body sent to POST /chat/completions. Fields map 1:1 to the OpenAI API. Use Extra to set anything this struct does not model directly (provider-specific knobs, future fields, …); Extra entries are merged into the JSON body.

func (ChatRequest) MarshalJSON

func (r ChatRequest) MarshalJSON() ([]byte, error)

MarshalJSON merges Extra into the wire body without losing the typed fields. Extra keys do not overwrite already-set typed fields.

type ChatResponse

type ChatResponse struct {
	ID      string   `json:"id"`
	Object  string   `json:"object"`
	Created int64    `json:"created"`
	Model   string   `json:"model"`
	Choices []Choice `json:"choices"`
	Usage   *Usage   `json:"usage,omitempty"`

	// SystemFingerprint is a backend identifier (OpenAI feature; may be
	// empty on other servers).
	SystemFingerprint string `json:"system_fingerprint,omitempty"`
}

ChatResponse is the body of a non-streaming /chat/completions response.

func AccumulateStream

func AccumulateStream(s *Stream, onChunk func(*StreamEvent)) (*ChatResponse, error)

AccumulateStream consumes the entire stream and assembles a final *ChatResponse, joining content fragments and reassembling tool-call argument fragments back into well-formed JSON.

It calls onChunk (if non-nil) for every received event before merging it into the accumulator. onChunk should not retain references to the event past the call — its slices are reused.

The stream is left open; callers should still call s.Close() afterward.
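The joining behaviour described above can be sketched with stand-in types (a simplified illustration, not the package's accumulator; `delta` and `accumulate` are hypothetical):

```go
package main

import "fmt"

// delta mirrors the relevant parts of a stream delta for this sketch.
type delta struct {
	content string // fragment of assistant text
	toolArg string // fragment of a tool call's JSON arguments
}

// accumulate concatenates content fragments and reassembles tool-call
// argument fragments back into one JSON string, the way an accumulator
// over stream events must.
func accumulate(deltas []delta) (content, args string) {
	var c, a []byte
	for _, d := range deltas {
		c = append(c, d.content...)
		a = append(a, d.toolArg...)
	}
	return string(c), string(a)
}

func main() {
	content, args := accumulate([]delta{
		{content: "Hel"}, {content: "lo"},
		{toolArg: `{"ci`}, {toolArg: `ty":"Istanbul"}`},
	})
	fmt.Println(content, args) // Hello {"city":"Istanbul"}
}
```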

func (*ChatResponse) Content

func (r *ChatResponse) Content() string

Content returns the assistant text from the first choice, or "".

func (*ChatResponse) FirstChoice

func (r *ChatResponse) FirstChoice() *Choice

FirstChoice returns the first completion or nil if none. Convenience for the common single-choice case.

func (*ChatResponse) ToolCalls

func (r *ChatResponse) ToolCalls() []ToolCall

ToolCalls returns the tool calls from the first choice, or nil.

type Choice

type Choice struct {
	Index        int     `json:"index"`
	Message      Message `json:"message"`
	FinishReason string  `json:"finish_reason"`
	// Logprobs is left as raw bytes — wire format varies between providers.
	Logprobs json.RawMessage `json:"logprobs,omitempty"`
}

Choice is one of N completions returned by the server.

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client talks to an OpenAI-compatible server.

A Client is safe for concurrent use by multiple goroutines. Create one per (base URL, credentials) pair and reuse it for the lifetime of the process.

func New

func New(opts ...Option) (*Client, error)

New constructs a Client. If no base URL is configured via WithBaseURL, DefaultBaseURL is used.

func (*Client) BaseURL

func (c *Client) BaseURL() string

BaseURL returns the API root the client was configured with.

func (*Client) Chat

func (c *Client) Chat(ctx context.Context, req *ChatRequest) (*ChatResponse, error)

Chat issues a non-streaming POST /chat/completions request.

If req.Model is empty and the client was created with WithModel, the configured default is used. The req.Stream flag is forced to false; use Client.ChatStream for streaming.

Non-2xx responses are returned as either *APIError or *RateLimitError.

func (*Client) ChatStream

func (c *Client) ChatStream(ctx context.Context, req *ChatRequest) (*Stream, error)

ChatStream issues a streaming POST /chat/completions request.

It forces req.Stream=true and sets stream_options.include_usage so the final event carries token usage. The returned *Stream must be closed by the caller.

func (*Client) DefaultModel

func (c *Client) DefaultModel() string

DefaultModel returns the default model id, if one was configured via WithModel. Empty string if not set.

func (*Client) Embeddings

func (c *Client) Embeddings(ctx context.Context, req *EmbeddingRequest) (*EmbeddingResponse, error)

Embeddings calls POST /embeddings.

func (*Client) HTTPClient

func (c *Client) HTTPClient() *http.Client

HTTPClient returns the underlying *http.Client. Callers can use this for advanced cases (e.g. issuing a raw request to a non-standard endpoint on the same server). Modifying the returned client's Transport will affect all subsequent requests.

func (*Client) ListModels

func (c *Client) ListModels(ctx context.Context) (*ModelList, error)

ListModels calls GET /models on the configured server.

Not all OpenAI-compatible servers implement this endpoint — the AT gateway does (it returns the merged list across all configured providers in "provider/model" form), as do OpenAI, Ollama, vLLM, LiteLLM, etc.

type CompletionTokensDetails

type CompletionTokensDetails struct {
	ReasoningTokens          int `json:"reasoning_tokens,omitempty"`
	AudioTokens              int `json:"audio_tokens,omitempty"`
	AcceptedPredictionTokens int `json:"accepted_prediction_tokens,omitempty"`
	RejectedPredictionTokens int `json:"rejected_prediction_tokens,omitempty"`
}

CompletionTokensDetails is the optional breakdown of completion tokens.

type ContentPart

type ContentPart struct {
	Type string `json:"type"`

	// Text is set when Type == "text".
	Text string `json:"text,omitempty"`

	// ImageURL is set when Type == "image_url".
	ImageURL *ImageURL `json:"image_url,omitempty"`

	// InputAudio is set when Type == "input_audio".
	InputAudio *InputAudio `json:"input_audio,omitempty"`

	// File is set when Type == "file".
	File *FileContent `json:"file,omitempty"`
}

ContentPart is a single block of multimodal content within a Message. Use the helpers TextPart, ImageURLPart, ImageDataPart, InputAudioPart, FilePartByID, and FilePartInline to construct parts.

func FilePartByID

func FilePartByID(fileID string) ContentPart

FilePartByID references a previously-uploaded file by its file_id.

func FilePartInline

func FilePartInline(filename string, data []byte) ContentPart

FilePartInline embeds file bytes (base64-encoded) into the request.

func ImageDataPart

func ImageDataPart(mediaType string, data []byte, detail string) ContentPart

ImageDataPart embeds a base64-encoded image as a data URI. mediaType is e.g. "image/png", "image/jpeg".
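The data URI an inline-image helper like this presumably produces has the standard "data:<mediaType>;base64,<payload>" shape, which is simple to build by hand (`dataURI` is an illustrative stand-in, not the library's code):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// dataURI builds a standard base64 data: URI for the given media type.
func dataURI(mediaType string, data []byte) string {
	return "data:" + mediaType + ";base64," + base64.StdEncoding.EncodeToString(data)
}

func main() {
	// First four bytes of the PNG magic number, for illustration.
	fmt.Println(dataURI("image/png", []byte{0x89, 'P', 'N', 'G'}))
}
```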

func ImageURLPart

func ImageURLPart(url, detail string) ContentPart

ImageURLPart references an image by URL.

detail may be "low", "high", or "auto" (or "" to omit).

func InputAudioPart

func InputAudioPart(data []byte, format string) ContentPart

InputAudioPart embeds base64-encoded audio. format is e.g. "wav" or "mp3".

func InputAudioPartBase64

func InputAudioPartBase64(b64, format string) ContentPart

InputAudioPartBase64 embeds already-base64-encoded audio.

func TextPart

func TextPart(text string) ContentPart

TextPart builds a {"type":"text","text":...} content block.

type EmbeddingObj

type EmbeddingObj struct {
	Object string `json:"object"`
	Index  int    `json:"index"`
	// Embedding is []float64 when EncodingFormat is "float" (default), or a
	// base64 string when EncodingFormat is "base64". It is decoded as
	// json.RawMessage to support both shapes — call AsFloat / AsBase64.
	Embedding json.RawMessage `json:"embedding"`
}

EmbeddingObj is one vector in the embeddings response.

func (EmbeddingObj) AsBase64

func (e EmbeddingObj) AsBase64() (string, error)

AsBase64 returns the embedding as a base64 string. Use this when the request set EncodingFormat to "base64".

func (EmbeddingObj) AsFloat

func (e EmbeddingObj) AsFloat() ([]float64, error)

AsFloat decodes the embedding as a []float64. Use this when the request did not set EncodingFormat or set it to "float".

type EmbeddingRequest

type EmbeddingRequest struct {
	Model          string `json:"model"`
	Input          any    `json:"input"`
	EncodingFormat string `json:"encoding_format,omitempty"` // "float" (default) or "base64"
	Dimensions     *int   `json:"dimensions,omitempty"`
	User           string `json:"user,omitempty"`

	// Extra carries arbitrary additional fields merged into the JSON body.
	Extra map[string]any `json:"-"`
}

EmbeddingRequest is the body of POST /embeddings.

Input must be either a string or a []string. The server may also accept []int / [][]int for already-tokenised input — in that case use Extra to override.

func (EmbeddingRequest) MarshalJSON

func (r EmbeddingRequest) MarshalJSON() ([]byte, error)

MarshalJSON merges Extra into the wire body without overwriting typed fields.

type EmbeddingResponse

type EmbeddingResponse struct {
	Object string         `json:"object"`
	Model  string         `json:"model"`
	Data   []EmbeddingObj `json:"data"`
	Usage  *Usage         `json:"usage,omitempty"`
}

EmbeddingResponse is the body of POST /embeddings.

type FileContent

type FileContent struct {
	FileID   string `json:"file_id,omitempty"`
	Filename string `json:"filename,omitempty"`
	FileData string `json:"file_data,omitempty"` // base64
}

FileContent describes an attached file. Either FileID (already uploaded) or FileData (inline base64) should be set.

type ImageURL

type ImageURL struct {
	// URL is either an https:// URL or a data: URI of the form
	// "data:image/png;base64,<base64-data>".
	URL string `json:"url"`
	// Detail controls the model's image fidelity: "low", "high", or "auto".
	Detail string `json:"detail,omitempty"`
}

ImageURL references an image either by URL or by inline base64 data URI.

type InputAudio

type InputAudio struct {
	Data   string `json:"data"`             // base64-encoded
	Format string `json:"format,omitempty"` // e.g. "wav", "mp3"
}

InputAudio holds inline base64-encoded audio.

type Message

type Message struct {
	Role string `json:"role"`
	// Content is either string, []ContentPart, or nil.
	Content any `json:"content,omitempty"`
	// Name is optional; some servers use it to disambiguate participants
	// or to identify the function whose result is being returned.
	Name string `json:"name,omitempty"`
	// ToolCallID is set on role="tool" messages to associate the tool
	// result with the assistant's earlier tool_call.id.
	ToolCallID string `json:"tool_call_id,omitempty"`
	// ToolCalls is set on role="assistant" messages that requested one or
	// more tool invocations.
	ToolCalls []ToolCall `json:"tool_calls,omitempty"`
	// ReasoningContent is exposed by some servers (e.g. Anthropic via the
	// AT gateway, DeepSeek-R1, OpenAI o-series) to surface chain-of-thought
	// separately from the final answer. Optional.
	ReasoningContent string `json:"reasoning_content,omitempty"`
	// Refusal carries an explicit refusal message from OpenAI safety models.
	Refusal string `json:"refusal,omitempty"`
}

Message is one entry in the chat history. The on-the-wire format follows OpenAI's spec exactly, so callers can populate any field a particular server understands.

Content can be either a plain string (the most common case) or a slice of ContentPart for multimodal input. Use the constructor helpers (UserMessage, SystemMessage, AssistantMessage, ToolMessage) for readable code.

func AssistantMessage

func AssistantMessage(text string) Message

AssistantMessage builds a role="assistant" message with plain-text content.

func AssistantToolCallMessage

func AssistantToolCallMessage(content string, toolCalls ...ToolCall) Message

AssistantToolCallMessage builds a role="assistant" message that requests one or more tool invocations. Content may be empty.

func DeveloperMessage

func DeveloperMessage(text string) Message

DeveloperMessage builds a role="developer" message (OpenAI o-series and gpt-4.1+ models). Servers that don't recognise the role typically treat it as system.

func SystemMessage

func SystemMessage(text string) Message

SystemMessage builds a role="system" message with plain-text content.

func ToolMessage

func ToolMessage(toolCallID, result string) Message

ToolMessage builds a role="tool" message carrying the result of a previous assistant tool_call. toolCallID must match the ID in the assistant's ToolCall.

func UserMessage

func UserMessage(text string) Message

UserMessage builds a role="user" message with plain-text content.

func UserMessageParts

func UserMessageParts(parts ...ContentPart) Message

UserMessageParts builds a role="user" message with multimodal content. Use TextPart, ImageURLPart, ImageDataPart, InputAudioPart, FilePartByID, or FilePartInline to construct each part.

type Model

type Model struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created,omitempty"`
	OwnedBy string `json:"owned_by,omitempty"`
}

Model is one entry returned by GET /models.

type ModelList

type ModelList struct {
	Object string  `json:"object"`
	Data   []Model `json:"data"`
}

ModelList is the response shape of GET /models.

type Option

type Option func(*config)

Option configures a Client.

func WithAPIKey

func WithAPIKey(key string) Option

WithAPIKey sets the bearer token used in the Authorization header. Leave unset (or empty) for servers that do not require authentication.

func WithBaseURL

func WithBaseURL(url string) Option

WithBaseURL sets the API root URL (without the trailing /chat/completions).

If the configured URL ends with "/chat/completions", that suffix is stripped automatically, so a user who pastes the full chat URL by mistake does not break the other endpoints.

func WithDisableRetry

func WithDisableRetry(disable bool) Option

WithDisableRetry disables the built-in retry-with-backoff behaviour. By default the client retries on connection errors and 5xx responses.

func WithHTTPClient

func WithHTTPClient(h *http.Client) Option

WithHTTPClient lets callers supply their own *http.Client. When provided, proxy / TLS / retry / timeout options are still honoured by wrapping the supplied client's Transport.

func WithHeader

func WithHeader(key, value string) Option

WithHeader sets a single extra HTTP header sent on every request. Calling it again with the same key replaces the previous value.

func WithHeaders

func WithHeaders(h http.Header) Option

WithHeaders merges a header map into the per-request headers. Existing values for the same key are overwritten.

func WithInsecureSkipVerify

func WithInsecureSkipVerify(skip bool) Option

WithInsecureSkipVerify disables TLS certificate verification. Use with care — only for self-signed development servers.

func WithModel

func WithModel(model string) Option

WithModel sets a default model id used by methods when [ChatRequest.Model] or [EmbeddingRequest.Model] is empty.

func WithOKOptions

func WithOKOptions(opts ...ok.OptionClientFn) Option

WithOKOptions is an escape hatch that forwards arbitrary github.com/rakunlabs/ok.OptionClientFn values to the underlying HTTP client builder. Use it for advanced configuration (custom retry policy, round-tripper wrappers, telemetry injection, etc).

func WithProxy

func WithProxy(proxy string) Option

WithProxy routes requests through the given HTTP/HTTPS/SOCKS5 proxy URL. Empty string disables the proxy.

func WithRetryMax

func WithRetryMax(n int) Option

WithRetryMax overrides the default retry attempt count (4). Set to 0 to issue a single attempt with no retries.

func WithTimeout

func WithTimeout(d time.Duration) Option

WithTimeout sets the overall HTTP client timeout. This applies to the total time of a single request including all retries. For streaming requests the read of the SSE body is bounded by the request context, not by this timeout.

func WithUserAgent

func WithUserAgent(ua string) Option

WithUserAgent overrides the default User-Agent header.

type PromptTokensDetails

type PromptTokensDetails struct {
	CachedTokens int `json:"cached_tokens,omitempty"`
	AudioTokens  int `json:"audio_tokens,omitempty"`
}

PromptTokensDetails is the optional breakdown of the prompt token count.

type RateLimitError

type RateLimitError struct {
	APIError
	// RetryAfter is the parsed value of the Retry-After response header, if
	// present. Zero means the header was absent or unparseable; callers
	// should fall back to their own backoff policy in that case.
	RetryAfter time.Duration
}

RateLimitError is a typed error returned when the server responds with HTTP 429 or an explicit rate-limit error envelope. Callers can use errors.As to detect it and honour the suggested RetryAfter delay.

func (*RateLimitError) Error

func (e *RateLimitError) Error() string

func (*RateLimitError) Unwrap

func (e *RateLimitError) Unwrap() error

Unwrap returns the embedded *APIError so errors.As matches both *RateLimitError and *APIError.

type Stream

type Stream struct {
	// contains filtered or unexported fields
}

Stream consumes a server-sent-events response from /chat/completions.

Recv returns each parsed event as it arrives. The stream ends with io.EOF when the server emits the "[DONE]" sentinel (or closes the connection cleanly). Always call Stream.Close when done so the underlying TCP connection is returned to the pool.

For high-level use, see AccumulateStream, which assembles all deltas into a final ChatResponse.

func (*Stream) Close

func (s *Stream) Close() error

Close releases the underlying connection. Idempotent.

func (*Stream) Header

func (s *Stream) Header() http.Header

Header returns the HTTP response headers from the streaming request. Useful for inspecting trace IDs or rate-limit headers.

func (*Stream) Recv

func (s *Stream) Recv() (*StreamEvent, error)

Recv reads the next event from the stream. Returns io.EOF when the stream completes normally.

type StreamChoice

type StreamChoice struct {
	Index        int         `json:"index"`
	Delta        StreamDelta `json:"delta"`
	FinishReason *string     `json:"finish_reason,omitempty"`
	// Logprobs is left as raw bytes — wire format varies between providers.
	Logprobs json.RawMessage `json:"logprobs,omitempty"`
}

StreamChoice is one streamed choice in a StreamEvent.

type StreamDelta

type StreamDelta struct {
	Role             string     `json:"role,omitempty"`
	Content          string     `json:"content,omitempty"`
	ReasoningContent string     `json:"reasoning_content,omitempty"`
	Refusal          string     `json:"refusal,omitempty"`
	ToolCalls        []ToolCall `json:"tool_calls,omitempty"`
}

StreamDelta is the incremental content fragment in a StreamChoice.

type StreamEvent

type StreamEvent struct {
	ID                string         `json:"id"`
	Object            string         `json:"object"`
	Created           int64          `json:"created"`
	Model             string         `json:"model"`
	SystemFingerprint string         `json:"system_fingerprint,omitempty"`
	Choices           []StreamChoice `json:"choices"`
	// Usage is populated on the final empty-choices chunk when
	// stream_options.include_usage was requested.
	Usage *Usage `json:"usage,omitempty"`
}

StreamEvent is a single SSE chunk decoded from a streaming /chat/completions response. It mirrors the OpenAI shape one-to-one.

type StreamOptions

type StreamOptions struct {
	IncludeUsage bool `json:"include_usage,omitempty"`
}

StreamOptions is documented under OpenAI's stream_options request field.

type Tool

type Tool struct {
	Type     string       `json:"type"` // "function"
	Function ToolFunction `json:"function"`
}

Tool describes a function the model may call.

Currently only Type == "function" is widely supported across providers; some servers may add other tool types (web_search, code_interpreter, …) — set Type accordingly.

func FunctionTool

func FunctionTool(name, description string, parameters map[string]any) Tool

FunctionTool builds a Tool of type "function" with the given schema.

parameters should be a JSON schema document, e.g.:

parameters := map[string]any{
    "type": "object",
    "properties": map[string]any{
        "city": map[string]any{"type": "string"},
    },
    "required": []string{"city"},
}

type ToolCall

type ToolCall struct {
	// Index is set on streaming deltas so fragments of the same tool call
	// can be reassembled. nil on non-streaming responses.
	Index *int `json:"index,omitempty"`

	ID       string           `json:"id"`
	Type     string           `json:"type"` // "function"
	Function ToolCallFunction `json:"function"`
}

ToolCall is emitted by the model when it wants to invoke a function. Arguments is the raw JSON string returned by the model — call ToolCall.UnmarshalArguments to decode it into a Go value, or ToolCall.ArgumentsMap to get a map[string]any.

func (ToolCall) ArgumentsMap

func (tc ToolCall) ArgumentsMap() (map[string]any, error)

ArgumentsMap decodes the function-call arguments into a map[string]any. Returns an empty map if Arguments is empty.
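The documented behaviour (decode the raw JSON string, empty input yields an empty map rather than an error) can be sketched stand-alone; `argumentsMap` here is a hypothetical equivalent, not the method itself:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// argumentsMap decodes a tool call's raw JSON arguments string into a
// map[string]any, treating the empty string as an empty map.
func argumentsMap(raw string) (map[string]any, error) {
	if raw == "" {
		return map[string]any{}, nil
	}
	var m map[string]any
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil, err
	}
	return m, nil
}

func main() {
	m, _ := argumentsMap(`{"city":"Istanbul"}`)
	fmt.Println(m["city"]) // Istanbul
}
```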

func (ToolCall) UnmarshalArguments

func (tc ToolCall) UnmarshalArguments(v any) error

UnmarshalArguments decodes the function-call arguments JSON into v. Returns nil if Arguments is empty.

type ToolCallFunction

type ToolCallFunction struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"` // raw JSON string per OpenAI spec
}

ToolCallFunction is the function-call payload of a ToolCall.

type ToolFunction

type ToolFunction struct {
	Name        string         `json:"name"`
	Description string         `json:"description,omitempty"`
	Parameters  map[string]any `json:"parameters,omitempty"` // JSON schema
	Strict      *bool          `json:"strict,omitempty"`
}

ToolFunction defines a callable function the model can request.

type Usage

type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`

	// Some servers expose more granular breakdowns. These map directly to
	// fields seen in OpenAI / Anthropic responses; absent fields stay zero.
	PromptTokensDetails     *PromptTokensDetails     `json:"prompt_tokens_details,omitempty"`
	CompletionTokensDetails *CompletionTokensDetails `json:"completion_tokens_details,omitempty"`
}

Usage reports token consumption for a request.

Directories

Path Synopsis
Package main is a runnable example that exercises every public capability of github.com/rakunlabs/at/pkg/openai-compatible against any OpenAI-compatible server (OpenAI, the AT gateway, Ollama, vLLM, …).
