gopenrouter

package module
v0.4.0
Published: Jun 3, 2025 License: MIT Imports: 10 Imported by: 0

README

GOpenRouter

The GOpenRouter library provides convenient access to the OpenRouter REST API from applications written in Go. OpenRouter is a unified API that provides access to various AI models from different providers, including OpenAI, Anthropic, Google, and more.

🚀 New Feature: Real-time streaming support is now available for both completion and chat endpoints! Build interactive AI applications with live response generation. See the Streaming Documentation for complete details and examples.

Quick Navigation

What you want to do                 Go to
🚀 Get started quickly              Installation, Usage
💬 Build chat applications          Chat Completions
⚡ Implement real-time streaming    Streaming Documentation
🔧 Advanced configuration           Advanced Provider Routing
📊 Monitor usage and costs          Checking Credits
🎯 See working examples             Examples
🐛 Handle errors properly           Error Handling

Features

  • Complete OpenRouter API coverage
  • Text completion and chat completion support
  • Real-time streaming support for both completion and chat endpoints
  • Builder pattern for constructing requests
  • Customizable HTTP client with middleware support
  • Proper error handling and detailed error types
  • Context support for request cancellation and timeouts
  • Comprehensive documentation and examples

Installation

go get github.com/bkovacki/gopenrouter

Requirements

This library requires Go 1.24+.

Usage

Import the package in your Go code:

import "github.com/bkovacki/gopenrouter"
Creating a client

To use this library, create a new client with your OpenRouter API key:

client := gopenrouter.New("your-api-key")

You can also customize the client with optional settings:

client := gopenrouter.New(
    "your-api-key",
    gopenrouter.WithSiteURL("https://yourapp.com"),
    gopenrouter.WithSiteTitle("Your App Name"),
    gopenrouter.WithHTTPClient(customHTTPClient),
)
Text Completions
// Create a completion request using the builder pattern
request := gopenrouter.NewCompletionRequestBuilder(
    "anthropic/claude-3-opus-20240229",
    "Write a short poem about Go programming.",
).WithMaxTokens(150).
  WithTemperature(0.7).
  Build()

// Send the completion request
ctx := context.Background()
resp, err := client.Completion(ctx, *request)
if err != nil {
    log.Fatalf("Completion error: %v", err)
}

// Use the response
fmt.Println(resp.Choices[0].Text)
Chat Completions
// Create conversation messages
messages := []gopenrouter.ChatMessage{
    {
        Role:    "system",
        Content: "You are a helpful assistant that provides concise answers.",
    },
    {
        Role:    "user",
        Content: "What is the capital of France?",
    },
}

// Build chat completion request
request := gopenrouter.NewChatCompletionRequestBuilder("openai/gpt-3.5-turbo", messages).
    WithMaxTokens(100).
    WithTemperature(0.7).
    WithUsage(true).
    Build()

// Make the chat completion request
ctx := context.Background()
response, err := client.ChatCompletion(ctx, *request)
if err != nil {
    log.Fatalf("Chat completion failed: %v", err)
}

// Use the response
fmt.Printf("Assistant: %s\n", response.Choices[0].Message.Content)
Streaming Responses

The library provides comprehensive real-time streaming support for both completion and chat completion endpoints. Streaming allows you to:

  • Reduce perceived latency by displaying responses as they are generated
  • Build interactive chat interfaces with real-time feedback
  • Handle long responses efficiently without waiting for complete generation
  • Implement live AI-powered features with immediate user feedback

Quick streaming example:

import (
    "context"
    "fmt"
    "io"
    "log"
    
    "github.com/bkovacki/gopenrouter"
)

// Streaming chat completion
messages := []gopenrouter.ChatMessage{
    {Role: "user", Content: "Tell me a story"},
}

request := gopenrouter.NewChatCompletionRequestBuilder("openai/gpt-3.5-turbo", messages).Build()

ctx := context.Background()
stream, err := client.ChatCompletionStream(ctx, *request)
if err != nil {
    log.Fatal(err)
}
defer stream.Close()

fmt.Print("Assistant: ")
for {
    chunk, err := stream.Recv()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    
    for _, choice := range chunk.Choices {
        if choice.Delta.Content != nil {
            fmt.Print(*choice.Delta.Content)
        }
    }
}
fmt.Println()

📚 For comprehensive streaming documentation including:

  • Complete API reference and response types
  • Advanced usage patterns and best practices
  • Error handling and resource management
  • Performance considerations and limitations
  • Migration guide from non-streaming code

See the Streaming Documentation for complete details and examples.

Advanced Provider Routing

OpenRouter allows you to customize how your requests are routed between different AI providers:

// Create provider routing options
providerOptions := gopenrouter.NewProviderOptionsBuilder().
    WithDataCollection("deny").
    WithSort("price").
    WithOrder([]string{"Anthropic", "OpenAI"}).
    WithIgnore([]string{"Mistral"}).
    Build()

// Include provider options in your completion request
request := gopenrouter.NewCompletionRequestBuilder(
    "anthropic/claude-3-opus-20240229",
    "Write a short story about a robot learning to code.",
).WithProvider(providerOptions).
  Build()
Checking Credits and Usage
// Get your account credit information
credits, err := client.GetCredits(ctx)
if err != nil {
    log.Fatalf("Error getting credits: %v", err)
}

fmt.Printf("Total credits: %.2f\n", credits.TotalCredits)
fmt.Printf("Total usage: %.2f\n", credits.TotalUsage)
Listing Available Models
// Get a list of all available models
models, err := client.ListModels(ctx)
if err != nil {
    log.Fatalf("Error listing models: %v", err)
}

// Display model information
for _, model := range models {
    fmt.Printf("Model: %s\n", model.Name)
    fmt.Printf("  Description: %s\n", model.Description)
    if model.ContextLength != nil {
        fmt.Printf("  Context Length: %.0f tokens\n", *model.ContextLength)
    }
}
Getting Generation Details
// Get details about a specific generation by its ID
generationID := "gen_abc123"
generation, err := client.GetGeneration(ctx, generationID)
if err != nil {
    log.Fatalf("Error getting generation: %v", err)
}

fmt.Printf("Generation Cost: $%.6f\n", generation.TotalCost)
fmt.Printf("Prompt Tokens: %d\n", generation.TokensPrompt)
fmt.Printf("Completion Tokens: %d\n", generation.TokensCompletion)

Examples

The library includes comprehensive examples to help you get started:

⚠️ Cost Warning: Running these examples will make actual API calls to OpenRouter and will incur charges based on your usage. Please monitor your credits and usage to avoid unexpected costs.

Simple Completion Example

Located in examples/simple_completion/, this example demonstrates:

  • Basic text completion with usage reporting
  • Generation details retrieval using the generation endpoint
  • Credits status monitoring before and after requests
  • Cost calculation for individual requests
  • Advanced provider options with cost controls

Run the example:

export OPENROUTER_API_KEY="your-api-key-here"
go run examples/simple_completion/simple_completion.go  # Note: This will incur API charges
Chat Completion Example

Located in examples/chat_completion/, this example demonstrates:

  • Basic chat completion with system and user messages
  • Multi-turn conversations with context
  • Provider options for cost control and fallbacks
  • Different AI models (OpenAI, Anthropic, etc.)
  • Parameter tuning (temperature, penalties, etc.)

Run the example:

export OPENROUTER_API_KEY="your-api-key-here"
go run examples/chat_completion/chat_completion.go  # Note: This will incur API charges
Streaming Example

Located in examples/streaming/, this example demonstrates:

  • Real-time streaming for both completion and chat completion
  • Handling streaming responses and delta content
  • Model fallback with streaming support
  • Proper stream lifecycle management and error handling

Run the example:

export OPENROUTER_API_KEY="your-api-key-here"
go run examples/streaming/streaming.go  # Note: This will incur API charges

Error Handling

The library provides detailed error types for API errors and request errors:

resp, err := client.Completion(ctx, *request)
if err != nil {
    var apiErr *gopenrouter.APIError
    var reqErr *gopenrouter.RequestError
    
    if errors.As(err, &apiErr) {
        // Handle API-specific error
        fmt.Printf("API Error: %s (Code: %d)\n", apiErr.Message, apiErr.Code)
    } else if errors.As(err, &reqErr) {
        // Handle request error
        fmt.Printf("Request Error: %s (Status: %d)\n", reqErr.Error(), reqErr.HTTPStatusCode)
    } else {
        // Handle other errors
        fmt.Printf("Unexpected error: %v\n", err)
    }
    return
}
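
The non-streaming methods return the exported sentinel ErrCompletionStreamNotSupported when streaming is requested on the request (see the Variables section below); it can be detected with errors.Is:

if errors.Is(err, gopenrouter.ErrCompletionStreamNotSupported) {
    // Streaming was requested; use CompletionStream or ChatCompletionStream instead
}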

Development

Running Tests
make test
Linting
make lint
Coverage Report
make cover
make cover-html  # Opens coverage report in browser

Contributing

Contributions to GOpenRouter are welcome! Please feel free to submit a Pull Request.

Roadmap

  • Chat completion API support
  • Examples for common use cases
  • Streaming support for completion and chat completion requests
  • Additional helper methods for advanced use cases

License

This library is distributed under the MIT license. See the LICENSE file for more information.

Documentation

Overview

Package gopenrouter provides a Go client for the OpenRouter API. OpenRouter is a unified API that provides access to various AI models.

Index

Constants

const Version = "0.4.0"

Version is the current version of the library

Variables

var ErrCompletionStreamNotSupported = errors.New("streaming is not supported with this method. Use CompletionStream() or ChatCompletionStream() for streaming requests")

Functions

This section is empty.

Types

type APIError

type APIError struct {
	Code     int            `json:"code,omitempty"`
	Message  string         `json:"message"`
	Metadata map[string]any `json:"metadata,omitempty"`
}

APIError provides error information returned by the OpenRouter API.

func (*APIError) Error

func (e *APIError) Error() string

type Architecture

type Architecture struct {
	// InputModalities lists the types of input the model accepts (e.g., "text", "image")
	InputModalities []string `json:"input_modalities"`
	// OutputModalities lists the types of output the model produces (e.g., "text")
	OutputModalities []string `json:"output_modalities"`
	// Tokenizer indicates the tokenization method used by this model
	Tokenizer string `json:"tokenizer"`
	// InstructType specifies the instruction format for this model
	InstructType string `json:"instruct_type"`
}

Architecture describes a model's input and output capabilities.

type ChatChoice added in v0.3.0

type ChatChoice struct {
	// Message is the generated chat message response
	Message ChatMessage `json:"message"`
	// Index is the position of this choice in the array of choices
	Index int `json:"index,omitempty"`
	// FinishReason explains why the generation stopped (e.g., "stop", "length")
	FinishReason string `json:"finish_reason,omitempty"`
	// LogProbs contains log probability information for the choice (if requested)
	LogProbs *LogProbs `json:"logprobs,omitempty"`
}

ChatChoice represents a single chat completion choice from the API. The API may return multiple choices depending on the request parameters.

type ChatCompletionRequest added in v0.3.0

type ChatCompletionRequest struct {
	// Required fields
	// Model is the identifier of the AI model to use
	Model string `json:"model"`
	// Messages is the conversation history as a list of messages
	Messages []ChatMessage `json:"messages"`

	// Optional fields
	// Models provides an alternate list of models for routing overrides
	Models []string `json:"models,omitempty"`
	// Provider contains preferences for provider routing
	Provider *ProviderOptions `json:"provider,omitempty"`
	// Reasoning configures model reasoning/thinking tokens
	Reasoning *ReasoningOptions `json:"reasoning,omitempty"`
	// Usage specifies whether to include usage information in the response
	Usage *UsageOptions `json:"usage,omitempty"`
	// Transforms lists prompt transformations (OpenRouter-only feature)
	Transforms []string `json:"transforms,omitempty"`
	// Stream enables streaming of results as they are generated
	Stream *bool `json:"stream,omitempty"`
	// MaxTokens limits the maximum number of tokens in the response
	MaxTokens *int `json:"max_tokens,omitempty"`
	// Temperature controls randomness in generation (range: [0, 2])
	Temperature *float64 `json:"temperature,omitempty"`
	// Seed enables deterministic outputs with the same inputs
	Seed *int `json:"seed,omitempty"`
	// TopP controls nucleus sampling (range: (0, 1])
	TopP *float64 `json:"top_p,omitempty"`
	// TopK limits sampling to top K most likely tokens (range: [1, Infinity))
	TopK *int `json:"top_k,omitempty"`
	// FrequencyPenalty reduces repetition of token sequences (range: [-2, 2])
	FrequencyPenalty *float64 `json:"frequency_penalty,omitempty"`
	// PresencePenalty reduces repetition of topics (range: [-2, 2])
	PresencePenalty *float64 `json:"presence_penalty,omitempty"`
	// RepetitionPenalty penalizes repeated tokens (range: (0, 2])
	RepetitionPenalty *float64 `json:"repetition_penalty,omitempty"`
	// LogitBias maps token IDs to bias values for controlling token probability
	LogitBias map[string]float64 `json:"logit_bias,omitempty"`
	// TopLogProbs specifies the number of top log probabilities to return
	TopLogProbs *int `json:"top_logprobs,omitempty"`
	// MinP sets the minimum probability threshold for tokens (range: [0, 1])
	MinP *float64 `json:"min_p,omitempty"`
	// TopA is an alternate top sampling parameter (range: [0, 1])
	TopA *float64 `json:"top_a,omitempty"`
	// Logprobs enables returning log probabilities of output tokens
	Logprobs *bool `json:"logprobs,omitempty"`
	// Stop specifies sequences where the model will stop generating tokens
	Stop []string `json:"stop,omitempty"`
	// User is a stable identifier for end-users, used to help detect and prevent abuse
	User *string `json:"user,omitempty"`
}

ChatCompletionRequest represents a request for chat completion to the OpenRouter API. It contains all the parameters needed to generate chat responses from AI models.

type ChatCompletionRequestBuilder added in v0.3.0

type ChatCompletionRequestBuilder struct {
	// contains filtered or unexported fields
}

ChatCompletionRequestBuilder implements a builder pattern for constructing ChatCompletionRequest objects. It provides a fluent interface for setting request parameters with method chaining.

func NewChatCompletionRequestBuilder added in v0.3.0

func NewChatCompletionRequestBuilder(model string, messages []ChatMessage) *ChatCompletionRequestBuilder

NewChatCompletionRequestBuilder creates a new builder for ChatCompletionRequest with required fields. The model and messages parameters are required for all chat completion requests.

func (*ChatCompletionRequestBuilder) Build added in v0.3.0

Build returns the constructed ChatCompletionRequest.

func (*ChatCompletionRequestBuilder) WithFrequencyPenalty added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithFrequencyPenalty(penalty float64) *ChatCompletionRequestBuilder

WithFrequencyPenalty sets the frequency penalty parameter.

func (*ChatCompletionRequestBuilder) WithLogitBias added in v0.3.0

WithLogitBias sets the logit bias for specific tokens.

func (*ChatCompletionRequestBuilder) WithLogprobs added in v0.4.0

WithLogprobs enables or disables returning log probabilities of output tokens.

func (*ChatCompletionRequestBuilder) WithMaxTokens added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithMaxTokens(maxTokens int) *ChatCompletionRequestBuilder

WithMaxTokens sets the maximum number of tokens for the response.

func (*ChatCompletionRequestBuilder) WithMinP added in v0.3.0

WithMinP sets the minimum probability threshold.

func (*ChatCompletionRequestBuilder) WithModels added in v0.3.0

WithModels sets alternate models for routing overrides.

func (*ChatCompletionRequestBuilder) WithPresencePenalty added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithPresencePenalty(penalty float64) *ChatCompletionRequestBuilder

WithPresencePenalty sets the presence penalty parameter.

func (*ChatCompletionRequestBuilder) WithProvider added in v0.3.0

WithProvider sets provider preferences for routing.

func (*ChatCompletionRequestBuilder) WithReasoning added in v0.3.0

WithReasoning sets reasoning configuration for the request.

func (*ChatCompletionRequestBuilder) WithRepetitionPenalty added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithRepetitionPenalty(penalty float64) *ChatCompletionRequestBuilder

WithRepetitionPenalty sets the repetition penalty parameter.

func (*ChatCompletionRequestBuilder) WithSeed added in v0.3.0

WithSeed sets the seed for deterministic outputs.

func (*ChatCompletionRequestBuilder) WithStop added in v0.4.0

WithStop sets the stop sequences for token generation.

func (*ChatCompletionRequestBuilder) WithStream added in v0.3.0

WithStream enables or disables streaming for the request.

func (*ChatCompletionRequestBuilder) WithTemperature added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithTemperature(temperature float64) *ChatCompletionRequestBuilder

WithTemperature sets the sampling temperature for the request.

func (*ChatCompletionRequestBuilder) WithTopA added in v0.3.0

WithTopA sets the top-a sampling parameter.

func (*ChatCompletionRequestBuilder) WithTopK added in v0.3.0

WithTopK sets the top-k sampling parameter.

func (*ChatCompletionRequestBuilder) WithTopLogprobs added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithTopLogprobs(topLogProbs int) *ChatCompletionRequestBuilder

WithTopLogprobs sets the number of top log probabilities to return.

func (*ChatCompletionRequestBuilder) WithTopP added in v0.3.0

WithTopP sets the nucleus sampling parameter.

func (*ChatCompletionRequestBuilder) WithTransforms added in v0.3.0

func (b *ChatCompletionRequestBuilder) WithTransforms(transforms []string) *ChatCompletionRequestBuilder

WithTransforms sets prompt transformations for the request.

func (*ChatCompletionRequestBuilder) WithUsage added in v0.3.0

WithUsage sets whether to include usage information in the response.

func (*ChatCompletionRequestBuilder) WithUser added in v0.3.0

WithUser sets the user identifier for the request.

type ChatCompletionResponse added in v0.3.0

type ChatCompletionResponse struct {
	// ID is the unique identifier for this chat completion request
	ID string `json:"id"`
	// Choices contains the generated chat message responses
	Choices []ChatChoice `json:"choices"`
	// Usage provides token usage statistics for the request
	Usage Usage `json:"usage,omitzero"`
}

ChatCompletionResponse represents the response from a chat completion request. It contains the generated messages and metadata about the request.

type ChatCompletionStreamReader added in v0.4.0

type ChatCompletionStreamReader struct {
	// contains filtered or unexported fields
}

ChatCompletionStreamReader implements StreamReader for chat completion responses

func NewChatCompletionStreamReader added in v0.4.0

func NewChatCompletionStreamReader(response *http.Response) *ChatCompletionStreamReader

NewChatCompletionStreamReader creates a new stream reader for chat completion responses

func (*ChatCompletionStreamReader) Close added in v0.4.0

func (r *ChatCompletionStreamReader) Close() error

Close closes the chat completion stream reader

func (*ChatCompletionStreamReader) Recv added in v0.4.0

Recv reads the next chat completion chunk from the stream

type ChatCompletionStreamResponse added in v0.4.0

type ChatCompletionStreamResponse struct {
	// ID is the unique identifier for this chat completion request
	ID string `json:"id"`
	// Object is the type of object returned, typically "chat.completion.chunk"
	Object string `json:"object"`
	// Created is the Unix timestamp when the completion was created
	Created int64 `json:"created"`
	// Model is the identifier of the model used for this completion
	Model string `json:"model"`
	// Choices contains the streaming chat completion choices with delta content
	Choices []ChatStreamingChoice `json:"choices"`
	// Usage provides token usage statistics, typically only present in the final chunk
	Usage *Usage `json:"usage,omitempty"`
}

ChatCompletionStreamResponse represents a single chunk in a streaming chat completion response

type ChatDelta added in v0.4.0

type ChatDelta struct {
	// Role is the role of the message sender (e.g., "assistant"), typically only present in the first chunk
	Role *string `json:"role,omitempty"`
	// Content contains the incremental text content being streamed for this chunk
	Content *string `json:"content,omitempty"`
}

ChatDelta represents the incremental content in a streaming chat response

type ChatMessage added in v0.3.0

type ChatMessage struct {
	// Role defines who sent the message (system, user, or assistant)
	Role string `json:"role"`
	// Content is the text content of the message
	Content string `json:"content"`
}

ChatMessage represents a single message in a conversation. Each message has a role (system, user, assistant) and content.

type ChatStreamingChoice added in v0.4.0

type ChatStreamingChoice struct {
	// Index is the position of this choice in the array of choices
	Index int `json:"index"`
	// Delta contains the incremental content changes for this streaming chunk
	Delta ChatDelta `json:"delta"`
	// FinishReason explains why the generation stopped (e.g., "stop", "length", "content_filter")
	// This field is only present in the final chunk of the stream
	FinishReason *string `json:"finish_reason"`
	// LogProbs contains log probability information for the streaming choice (if requested)
	LogProbs *LogProbs `json:"logprobs,omitempty"`
}

ChatStreamingChoice represents a streaming chat completion choice with delta content

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client represents the OpenRouter client for making API requests. It holds API credentials and configuration for communicating with OpenRouter.

func New

func New(apiKey string, options ...Option) *Client

New creates a new OpenRouter client with the provided API key and optional customization options. By default, it uses the standard OpenRouter API URL and the default HTTP client.

func (*Client) ChatCompletion added in v0.3.0

func (c *Client) ChatCompletion(
	ctx context.Context,
	request ChatCompletionRequest,
) (response ChatCompletionResponse, err error)

ChatCompletion sends a chat completion request to the OpenRouter API.

This method allows users to generate chat responses from AI models through the OpenRouter API. The request can be customized with various parameters to control the generation process, provider selection, and output format.

The method takes a context for cancellation and timeout control, and a ChatCompletionRequest containing the conversation messages and generation parameters.

Returns a ChatCompletionResponse containing the generated messages and usage statistics, or an error if the request fails.
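
A minimal example, assuming a client created with New and an existing ctx:

messages := []gopenrouter.ChatMessage{{Role: "user", Content: "Hello"}}
request := gopenrouter.NewChatCompletionRequestBuilder("model-id", messages).Build()
response, err := client.ChatCompletion(ctx, *request)
if err != nil {
  // handle error
}
fmt.Println(response.Choices[0].Message.Content)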

func (*Client) ChatCompletionStream added in v0.4.0

func (c *Client) ChatCompletionStream(
	ctx context.Context,
	request ChatCompletionRequest,
) (*ChatCompletionStreamReader, error)

ChatCompletionStream sends a streaming chat completion request to the OpenRouter API.

This method enables real-time streaming of chat completion responses, allowing applications to display partial results as they are generated by the AI model.

The method automatically sets the stream parameter to true in the request and returns a ChatCompletionStreamReader for reading the streaming chunks.

Example usage:

messages := []gopenrouter.ChatMessage{{Role: "user", Content: "Hello"}}
request := gopenrouter.NewChatCompletionRequestBuilder("model-id", messages).Build()
stream, err := client.ChatCompletionStream(ctx, *request)
if err != nil {
  // handle error
}
defer stream.Close()

for {
  chunk, err := stream.Recv()
  if err == io.EOF {
    break // Stream finished
  }
  if err != nil {
    // handle error
  }
  // Process chunk
}

func (*Client) Completion

func (c *Client) Completion(
	ctx context.Context,
	request CompletionRequest,
) (response CompletionResponse, err error)

Completion sends a text completion request to the OpenRouter API.

This method allows users to generate text completions from AI models through the OpenRouter API. The request can be customized with various parameters to control the generation process, provider selection, and output format.

Parameters:

  • ctx: The context for the request, which can be used for cancellation and timeouts
  • request: The completion request parameters

Returns:

  • CompletionResponse: Contains the generated completions and metadata
  • error: Any error that occurred during the request, including ErrCompletionStreamNotSupported if streaming was requested
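
A minimal example, assuming a client created with New and an existing ctx:

request := gopenrouter.NewCompletionRequestBuilder("model-id", "Say hello").Build()
resp, err := client.Completion(ctx, *request)
if err != nil {
  // handle error
}
fmt.Println(resp.Choices[0].Text)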

func (*Client) CompletionStream added in v0.4.0

func (c *Client) CompletionStream(
	ctx context.Context,
	request CompletionRequest,
) (*CompletionStreamReader, error)

CompletionStream sends a streaming completion request to the OpenRouter API.

This method enables real-time streaming of completion responses, allowing applications to display partial results as they are generated by the AI model.

The method automatically sets the stream parameter to true in the request and returns a CompletionStreamReader for reading the streaming chunks.

Example usage:

request := gopenrouter.NewCompletionRequestBuilder("model-id", "prompt").Build()
stream, err := client.CompletionStream(ctx, *request)
if err != nil {
  // handle error
}
defer stream.Close()

for {
  chunk, err := stream.Recv()
  if err == io.EOF {
    break // Stream finished
  }
  if err != nil {
    // handle error
  }
  // Process chunk
}

func (*Client) GetCredits

func (c *Client) GetCredits(ctx context.Context) (data CreditsData, err error)

GetCredits retrieves information about the authenticated user's credits and usage.

This method provides a way to check the account's financial status, including the total purchased credits and how much has been consumed. This can be used for budgeting, monitoring usage, or determining when to purchase more credits.

Parameters:

  • ctx: The context for the request, which can be used for cancellation and timeouts

Returns:

  • CreditsData: Contains information about credits and usage
  • error: Any error that occurred during the request

func (*Client) GetGeneration

func (c *Client) GetGeneration(ctx context.Context, id string) (data GenerationData, err error)

GetGeneration retrieves metadata about a specific generation request by its ID.

The generation ID is provided when creating a completion or chat completion. This method allows you to retrieve detailed information about a previously made request, including its cost, token usage, and other metadata.

Parameters:

  • ctx: The context for the request, which can be used for cancellation and timeouts
  • id: The unique identifier of the generation to retrieve

Returns:

  • GenerationData: Contains the detailed generation metadata
  • error: Any error that occurred during the request

func (*Client) ListEndpoints

func (c *Client) ListEndpoints(ctx context.Context, author string, slug string) (data EndpointData, err error)

ListEndpoints retrieves information about all available endpoints for a specific model.

Each model on OpenRouter may be available through multiple providers, with each provider offering different capabilities, context lengths, and pricing. This method allows you to see all provider-specific implementations of a given model.

Parameters:

  • ctx: The context for the request, which can be used for cancellation and timeouts
  • author: The author/owner of the model
  • slug: The model identifier/slug

Returns:

  • EndpointData: Contains model information and a list of available endpoints
  • error: Any error that occurred during the request
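
A usage sketch; the author and slug values here are only illustrative:

data, err := client.ListEndpoints(ctx, "openai", "gpt-4o")
if err != nil {
  // handle error
}
for _, endpoint := range data.Endpoints {
  fmt.Printf("%s (%s): %.0f-token context\n", endpoint.Name, endpoint.ProviderName, endpoint.ContextLength)
}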

func (*Client) ListModels

func (c *Client) ListModels(ctx context.Context) (models []ModelData, err error)

ListModels retrieves information about all models available through the OpenRouter API.

The returned list includes details about each model's capabilities, pricing, and technical specifications. This information can be used to select an appropriate model for different use cases or to compare models.

Parameters:

  • ctx: The context for the request, which can be used for cancellation and timeouts

Returns:

  • []ModelData: A list of available models with their details
  • error: Any error that occurred during the request

type CompletionChoice

type CompletionChoice struct {
	// LogProbs contains log probability information for the choice (if requested)
	LogProbs *LogProbs `json:"logprobs,omitempty"`
	// FinishReason explains why the generation stopped (e.g., "length", "stop")
	FinishReason string `json:"finish_reason"`
	// NativeFinishReason is the provider's native finish reason
	NativeFinishReason string `json:"native_finish_reason"`
	// Text is the generated completion content
	Text string `json:"text"`
	// Reasoning contains reasoning tokens if available
	Reasoning *string `json:"reasoning,omitempty"`
	// Index is the position of this choice in the array of choices
	Index int `json:"index"`
}

CompletionChoice represents a single completion result from the API. The API may return multiple choices depending on the request parameters.

type CompletionRequest

type CompletionRequest struct {
	// Required fields
	// Model is the identifier of the AI model to use
	Model string `json:"model"`
	// Prompt is the text input that the model will complete
	Prompt string `json:"prompt"`

	// Optional fields
	// Models provides an alternate list of models for routing overrides
	Models []string `json:"models,omitempty"`
	// Provider contains preferences for provider routing
	Provider *ProviderOptions `json:"provider,omitempty"`
	// Reasoning configures model reasoning/thinking tokens
	Reasoning *ReasoningOptions `json:"reasoning,omitempty"`
	// Usage specifies whether to include usage information in the response
	Usage *UsageOptions `json:"usage,omitempty"`
	// Transforms lists prompt transformations (OpenRouter-only feature)
	Transforms []string `json:"transforms,omitempty"`
	// Stream enables streaming of results as they are generated
	Stream *bool `json:"stream,omitempty"`
	// MaxTokens limits the maximum number of tokens in the response
	MaxTokens *int `json:"max_tokens,omitempty"`
	// Temperature controls randomness in generation (range: [0, 2])
	Temperature *float64 `json:"temperature,omitempty"`
	// Seed enables deterministic outputs with the same inputs
	Seed *int `json:"seed,omitempty"`
	// TopP controls nucleus sampling (range: (0, 1])
	TopP *float64 `json:"top_p,omitempty"`
	// TopK limits sampling to top K most likely tokens (range: [1, Infinity))
	TopK *int `json:"top_k,omitempty"`
	// FrequencyPenalty reduces repetition of token sequences (range: [-2, 2])
	FrequencyPenalty *float64 `json:"frequency_penalty,omitempty"`
	// PresencePenalty reduces repetition of topics (range: [-2, 2])
	PresencePenalty *float64 `json:"presence_penalty,omitempty"`
	// RepetitionPenalty penalizes repeated tokens (range: (0, 2])
	RepetitionPenalty *float64 `json:"repetition_penalty,omitempty"`
	// LogitBias maps token IDs to bias values for controlling token probability
	LogitBias map[string]float64 `json:"logit_bias,omitempty"`
	// TopLogProbs specifies the number of top log probabilities to return
	TopLogProbs *int `json:"top_logprobs,omitempty"`
	// MinP sets the minimum probability threshold for tokens (range: [0, 1])
	MinP *float64 `json:"min_p,omitempty"`
	// TopA is an alternate top sampling parameter (range: [0, 1])
	TopA *float64 `json:"top_a,omitempty"`
	// Logprobs enables returning log probabilities of output tokens
	Logprobs *bool `json:"logprobs,omitempty"`
	// Stop specifies sequences where the model will stop generating tokens
	Stop []string `json:"stop,omitempty"`
}

CompletionRequest represents a request payload for the completions endpoint. It contains all parameters needed to generate text completions from AI models.

type CompletionRequestBuilder

type CompletionRequestBuilder struct {
	// contains filtered or unexported fields
}

CompletionRequestBuilder implements a builder pattern for constructing CompletionRequest objects. This makes it easier to create requests with many optional parameters.

func NewCompletionRequestBuilder

func NewCompletionRequestBuilder(model, prompt string) *CompletionRequestBuilder

NewCompletionRequestBuilder creates a new builder initialized with the required model and prompt.

Parameters:

  • model: The identifier of the AI model to use
  • prompt: The text prompt that the model will complete

Returns:

  • *CompletionRequestBuilder: A builder instance that can be used to set optional parameters

func (*CompletionRequestBuilder) Build

Build finalizes and returns the constructed CompletionRequest.

func (*CompletionRequestBuilder) WithFrequencyPenalty

func (b *CompletionRequestBuilder) WithFrequencyPenalty(penalty float64) *CompletionRequestBuilder

WithFrequencyPenalty sets the frequency penalty

func (*CompletionRequestBuilder) WithLogitBias

func (b *CompletionRequestBuilder) WithLogitBias(biases map[string]float64) *CompletionRequestBuilder

WithLogitBias sets the logit bias map

func (*CompletionRequestBuilder) WithLogprobs added in v0.4.0

func (b *CompletionRequestBuilder) WithLogprobs(logprobs bool) *CompletionRequestBuilder

WithLogprobs enables or disables returning log probabilities of output tokens

func (*CompletionRequestBuilder) WithMaxTokens

func (b *CompletionRequestBuilder) WithMaxTokens(maxTokens int) *CompletionRequestBuilder

WithMaxTokens sets the maximum tokens

func (*CompletionRequestBuilder) WithMinP

WithMinP sets the minimum probability threshold

func (*CompletionRequestBuilder) WithModels

func (b *CompletionRequestBuilder) WithModels(models []string) *CompletionRequestBuilder

WithModels sets the list of alternative models

func (*CompletionRequestBuilder) WithPresencePenalty

func (b *CompletionRequestBuilder) WithPresencePenalty(penalty float64) *CompletionRequestBuilder

WithPresencePenalty sets the presence penalty

func (*CompletionRequestBuilder) WithProvider

WithProvider sets provider routing options

func (*CompletionRequestBuilder) WithReasoning

WithReasoning sets reasoning options

func (*CompletionRequestBuilder) WithRepetitionPenalty

func (b *CompletionRequestBuilder) WithRepetitionPenalty(penalty float64) *CompletionRequestBuilder

WithRepetitionPenalty sets the repetition penalty

func (*CompletionRequestBuilder) WithSeed

WithSeed sets the seed for deterministic outputs

func (*CompletionRequestBuilder) WithStop added in v0.4.0

WithStop sets the stop sequences for token generation

func (*CompletionRequestBuilder) WithStream

WithStream enables or disables streaming

func (*CompletionRequestBuilder) WithTemperature

func (b *CompletionRequestBuilder) WithTemperature(temperature float64) *CompletionRequestBuilder

WithTemperature sets the sampling temperature

func (*CompletionRequestBuilder) WithTopA

WithTopA sets the alternate top sampling parameter

func (*CompletionRequestBuilder) WithTopK

WithTopK sets the top-k sampling value

func (*CompletionRequestBuilder) WithTopLogprobs

func (b *CompletionRequestBuilder) WithTopLogprobs(topLogProbs int) *CompletionRequestBuilder

WithTopLogprobs sets the number of top log probabilities to return

func (*CompletionRequestBuilder) WithTopP

WithTopP sets the top-p sampling value

func (*CompletionRequestBuilder) WithTransforms

func (b *CompletionRequestBuilder) WithTransforms(transforms []string) *CompletionRequestBuilder

WithTransforms sets prompt transforms

func (*CompletionRequestBuilder) WithUsage

WithUsage sets usage information option

type CompletionResponse

type CompletionResponse struct {
	// ID is the unique identifier for this completion request
	ID string `json:"id"`
	// Provider is the name of the AI provider that generated the completion
	Provider string `json:"provider"`
	// Model is the name of the model that generated the completion
	Model string `json:"model"`
	// Object is the object type, typically "chat.completion"
	Object string `json:"object"`
	// Created is the Unix timestamp when the completion was created
	Created int64 `json:"created"`
	// Choices contains the generated text completions
	Choices []CompletionChoice `json:"choices"`
	// SystemFingerprint is a unique identifier for the backend configuration
	SystemFingerprint *string `json:"system_fingerprint,omitempty"`
	// Usage provides token usage statistics for the request
	Usage Usage `json:"usage"`
}

CompletionResponse represents the API response from a text completion request. It contains the generated completions and associated metadata.

type CompletionStreamReader added in v0.4.0

type CompletionStreamReader struct {
	// contains filtered or unexported fields
}

CompletionStreamReader implements StreamReader for completion responses

func NewCompletionStreamReader added in v0.4.0

func NewCompletionStreamReader(response *http.Response) *CompletionStreamReader

NewCompletionStreamReader creates a new stream reader for completion responses

func (*CompletionStreamReader) Close added in v0.4.0

func (r *CompletionStreamReader) Close() error

Close closes the completion stream reader

func (*CompletionStreamReader) Recv added in v0.4.0

Recv reads the next completion chunk from the stream

type CompletionStreamResponse added in v0.4.0

type CompletionStreamResponse struct {
	ID                string            `json:"id"`
	Provider          string            `json:"provider"`
	Model             string            `json:"model"`
	Object            string            `json:"object"`
	Created           int64             `json:"created"`
	Choices           []StreamingChoice `json:"choices"`
	SystemFingerprint *string           `json:"system_fingerprint,omitempty"`
	Usage             *Usage            `json:"usage,omitempty"`
}

CompletionStreamResponse represents a single chunk in a streaming completion response

type CompletionTokensDetails added in v0.4.0

type CompletionTokensDetails struct {
	// ReasoningTokens is the number of tokens used for reasoning (if applicable)
	ReasoningTokens int `json:"reasoning_tokens"`
}

CompletionTokensDetails provides detailed information about completion token usage

type CreditsData

type CreditsData struct {
	// TotalCredits represents the total amount of credits purchased or added to the account
	TotalCredits float64 `json:"total_credits"`
	// TotalUsage represents the total amount of credits consumed by API requests
	TotalUsage float64 `json:"total_usage"`
}

CreditsData contains information about a user's credits and usage. This provides visibility into the account's financial standing with OpenRouter.

type Effort

type Effort string

Effort represents the level of token allocation for reasoning in AI models. Different effort levels allocate different proportions of the maximum token limit.

const (
	// EffortHigh allocates a large portion of tokens for reasoning (approximately 80% of max_tokens)
	EffortHigh Effort = "high"

	// EffortMedium allocates a moderate portion of tokens for reasoning (approximately 50% of max_tokens)
	EffortMedium Effort = "medium"

	// EffortLow allocates a smaller portion of tokens for reasoning (approximately 20% of max_tokens)
	EffortLow Effort = "low"
)

type EndpointData

type EndpointData struct {
	// ID is the unique identifier for the model
	ID string `json:"id"`
	// Name is the human-readable name of the model
	Name string `json:"name"`
	// Created is the Unix timestamp when the model was added
	Created float64 `json:"created"`
	// Description provides details about the model's capabilities
	Description string `json:"description"`
	// Architecture contains information about the model's input/output capabilities
	Architecture Architecture `json:"architecture"`
	// Endpoints is a list of provider-specific implementations of this model
	Endpoints []EndpointDetail `json:"endpoints"`
}

EndpointData contains information about a model and its available endpoints. This includes both model metadata and a list of provider-specific endpoints.

type EndpointDetail

type EndpointDetail struct {
	// Name is the identifier for this specific endpoint
	Name string `json:"name"`
	// ContextLength is the maximum context length supported by this endpoint
	ContextLength float64 `json:"context_length"`
	// Pricing contains the cost information for using this endpoint
	Pricing EndpointPricing `json:"pricing"`
	// ProviderName identifies which AI provider offers this endpoint
	ProviderName string `json:"provider_name"`
	// SupportedParameters lists the API parameters this endpoint accepts
	SupportedParameters []string `json:"supported_parameters"`
}

EndpointDetail represents a specific provider endpoint for a model. Each endpoint is a provider-specific implementation of the same model.

type EndpointPricing

type EndpointPricing struct {
	// Request is the fixed cost per API request
	Request string `json:"request"`
	// Image is the cost per image in the input
	Image string `json:"image"`
	// Prompt is the cost per token for the input/prompt
	Prompt string `json:"prompt"`
	// Completion is the cost per token for the output/completion
	Completion string `json:"completion"`
}

EndpointPricing contains pricing information for using a specific endpoint. All prices are expressed as strings representing cost per token (or per operation).
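
Because prices are strings, parse them with strconv.ParseFloat before doing arithmetic; a minimal sketch, assuming endpoint is an EndpointDetail obtained from ListEndpoints:

promptPrice, err := strconv.ParseFloat(endpoint.Pricing.Prompt, 64)
if err == nil {
  fmt.Printf("Prompt cost per token: $%.8f\n", promptPrice)
}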

type ErrorResponse

type ErrorResponse struct {
	Error *APIError `json:"error,omitempty"`
}

type ExperimentalOptions

type ExperimentalOptions struct {
	// ForceChatCompletions forces the use of chat completions API even when using the completions endpoint
	ForceChatCompletions *bool `json:"force_chat_completions,omitempty"`
}

ExperimentalOptions contains cutting-edge features that may change in future API versions. These options provide additional control for advanced use cases.

type GenerationData

type GenerationData struct {
	// ID is the unique identifier for this generation
	ID string `json:"id"`
	// TotalCost represents the total cost of the generation in credits
	TotalCost float64 `json:"total_cost"`
	// CreatedAt is the timestamp when the generation was created
	CreatedAt string `json:"created_at"`
	// Model is the name of the AI model used for the generation
	Model string `json:"model"`
	// Origin indicates the source of the generation request
	Origin string `json:"origin"`
	// Usage represents the total credit usage for this generation
	Usage float64 `json:"usage"`
	// IsBYOK indicates if this was a "Bring Your Own Key" request
	IsBYOK bool `json:"is_byok"`
	// UpstreamID is the ID assigned by the upstream provider
	UpstreamID string `json:"upstream_id"`
	// CacheDiscount represents any discount applied due to prompt caching
	CacheDiscount float64 `json:"cache_discount"`
	// AppID is the identifier of the application that made the request
	AppID int `json:"app_id"`
	// Streamed indicates whether the generation was streamed
	Streamed bool `json:"streamed"`
	// Cancelled indicates whether the generation was cancelled before completion
	Cancelled bool `json:"cancelled"`
	// ProviderName is the name of the AI provider (e.g., "openai", "anthropic")
	ProviderName string `json:"provider_name"`
	// Latency is the total latency of the request in milliseconds
	Latency int `json:"latency"`
	// ModerationLatency is the time spent on content moderation in milliseconds
	ModerationLatency int `json:"moderation_latency"`
	// GenerationTime is the time spent generating the response in milliseconds
	GenerationTime int `json:"generation_time"`
	// FinishReason describes why the generation stopped
	FinishReason string `json:"finish_reason"`
	// NativeFinishReason is the raw finish reason from the provider
	NativeFinishReason string `json:"native_finish_reason"`
	// TokensPrompt is the number of tokens in the prompt
	TokensPrompt int `json:"tokens_prompt"`
	// TokensCompletion is the number of tokens in the completion
	TokensCompletion int `json:"tokens_completion"`
	// NativeTokensPrompt is the raw token count from the provider for the prompt
	NativeTokensPrompt int `json:"native_tokens_prompt"`
	// NativeTokensCompletion is the raw token count from the provider for the completion
	NativeTokensCompletion int `json:"native_tokens_completion"`
	// NativeTokensReasoning is the number of tokens used for internal reasoning
	NativeTokensReasoning int `json:"native_tokens_reasoning"`
	// NumMediaPrompt is the count of media items in the prompt
	NumMediaPrompt int `json:"num_media_prompt"`
	// NumMediaCompletion is the count of media items in the completion
	NumMediaCompletion int `json:"num_media_completion"`
	// NumSearchResults is the number of search results included
	NumSearchResults int `json:"num_search_results"`
}

GenerationData contains detailed information about a specific generation request. This includes metadata about the request, the model used, performance metrics, token usage statistics, and other details about the generation process.

type HTTPDoer

type HTTPDoer interface {
	Do(req *http.Request) (*http.Response, error)
}

HTTPDoer is an interface for making HTTP requests. It abstracts HTTP operations to allow users to provide custom HTTP clients with their own configuration (like custom timeouts, transport settings, or middleware). This interface matches http.Client's Do method, so *http.Client satisfies it directly.
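
As an illustration, a logging wrapper that satisfies HTTPDoer might look like this (a sketch; loggingDoer is a hypothetical type, not part of this package):

import (
  "log"
  "net/http"

  "github.com/bkovacki/gopenrouter"
)

// loggingDoer delegates to another HTTPDoer and logs each request and response.
type loggingDoer struct {
  next gopenrouter.HTTPDoer
}

func (d *loggingDoer) Do(req *http.Request) (*http.Response, error) {
  log.Printf("-> %s %s", req.Method, req.URL)
  resp, err := d.next.Do(req)
  if resp != nil {
    log.Printf("<- %s", resp.Status)
  }
  return resp, err
}

// Pass it to the client:
// client := gopenrouter.New(apiKey, gopenrouter.WithHTTPClient(&loggingDoer{next: http.DefaultClient}))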

type LogProbToken added in v0.4.0

type LogProbToken struct {
	// Token is the token string
	Token string `json:"token"`
	// Bytes are the UTF-8 byte values of the token
	Bytes []int `json:"bytes"`
	// LogProb is the log probability of this token
	LogProb float64 `json:"logprob"`
}

LogProbToken represents a single token with its log probability information

type LogProbs added in v0.4.0

type LogProbs struct {
	// Content contains token-by-token log probabilities for the content
	Content []TokenLogProbs `json:"content"`
	// Refusal contains log probabilities for refusal tokens (if applicable)
	Refusal *[]TokenLogProbs `json:"refusal,omitempty"`
}

LogProbs represents log probability information for the completion

type MaxPrice

type MaxPrice struct {
	// Prompt is the maximum USD price per million tokens for the input prompt
	Prompt *float64 `json:"prompt,omitempty"`

	// Completion is the maximum USD price per million tokens for the generated completion
	Completion *float64 `json:"completion,omitempty"`

	// Image is the maximum USD price per image included in the request
	Image *float64 `json:"image,omitempty"`

	// Request is the maximum USD price per API request regardless of tokens
	Request *float64 `json:"request,omitempty"`
}

MaxPrice specifies the maximum price limits for different components of a request. All prices are in USD and allow for cost control when using the API.
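
A sketch of setting price caps through the provider options builder (the limits shown are arbitrary):

providerOptions := gopenrouter.NewProviderOptionsBuilder().
  WithMaxPromptPrice(1.0).     // max $1 per million prompt tokens
  WithMaxCompletionPrice(2.0). // max $2 per million completion tokens
  Build()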

type ModelArchitecture

type ModelArchitecture struct {
	// Modality describes the input and output types in format "input->output" (e.g., "text->text", "text+image->text")
	Modality string `json:"modality"`
	// InputModalities describes the types of input the model can accept (e.g., "text", "image")
	InputModalities []string `json:"input_modalities"`
	// OutputModalities describes the types of output the model can produce (e.g., "text")
	OutputModalities []string `json:"output_modalities"`
	// Tokenizer indicates the tokenization method used by the model (e.g., "GPT")
	Tokenizer string `json:"tokenizer"`
	// InstructType specifies the instruction format the model uses (if applicable)
	InstructType *string `json:"instruct_type,omitempty"`
}

ModelArchitecture contains information about the model's input and output capabilities.

type ModelData

type ModelData struct {
	// ID is the unique identifier for the model
	ID string `json:"id"`
	// Name is the human-readable name of the model
	Name string `json:"name"`
	// Created is the Unix timestamp when the model was added to OpenRouter
	Created float64 `json:"created"`
	// Description provides details about the model's capabilities
	Description string `json:"description"`
	// Architecture contains information about the model's input/output capabilities
	Architecture ModelArchitecture `json:"architecture"`
	// TopProvider contains information about the primary provider for this model
	TopProvider ModelTopProvider `json:"top_provider"`
	// Pricing contains the cost information for using this model
	Pricing ModelPricing `json:"pricing"`
	// ContextLength is the maximum number of tokens the model can process
	ContextLength *float64 `json:"context_length,omitempty"`
	// HuggingFaceID is the identifier for the model on Hugging Face (if available)
	HuggingFaceID *string `json:"hugging_face_id,omitempty"`
	// PerRequestLimits contains any limitations on requests to this model
	PerRequestLimits map[string]any `json:"per_request_limits,omitempty"`
	// SupportedParameters lists all parameters that can be used with this model
	// Note: This is a union of parameters from all providers; no single provider may support all parameters
	SupportedParameters []string `json:"supported_parameters,omitempty"`
}

ModelData represents information about an AI model available through OpenRouter. It contains details about the model's capabilities, pricing, and technical specifications.

type ModelPricing

type ModelPricing struct {
	// Prompt is the cost per token for the input/prompt
	Prompt string `json:"prompt"`
	// Completion is the cost per token for the output/completion
	Completion string `json:"completion"`
	// Image is the cost per image in the input
	Image string `json:"image"`
	// Request is the fixed cost per request
	Request string `json:"request"`
	// InputCacheRead is the cost for reading from the prompt cache
	InputCacheRead string `json:"input_cache_read"`
	// InputCacheWrite is the cost for writing to the prompt cache
	InputCacheWrite string `json:"input_cache_write"`
	// WebSearch is the cost for web search operations
	WebSearch string `json:"web_search"`
	// InternalReasoning is the cost for internal reasoning tokens
	InternalReasoning string `json:"internal_reasoning"`
}

ModelPricing contains the cost information for using a model. All prices are expressed as strings representing cost per token (or per operation).

type ModelTopProvider

type ModelTopProvider struct {
	// IsModerated indicates if the provider applies content moderation
	IsModerated bool `json:"is_moderated"`
	// ContextLength is the maximum context length supported by this specific provider
	ContextLength *float64 `json:"context_length,omitempty"`
	// MaxCompletionTokens is the maximum number of tokens the provider allows in completions
	MaxCompletionTokens *float64 `json:"max_completion_tokens,omitempty"`
}

ModelTopProvider contains information about the primary provider for a model.

type Option

type Option func(*Client)

Option defines a client option function for modifying Client properties. These are used with the New constructor function to customize client behavior.

func WithBaseURL

func WithBaseURL(baseURL string) Option

WithBaseURL sets a custom base URL for the OpenRouter API. This is primarily useful for testing or when using a proxy.

func WithHTTPClient

func WithHTTPClient(httpClient HTTPDoer) Option

WithHTTPClient sets a custom HTTP client for making requests. Users can provide their own http.Client (or any HTTPDoer implementation) to customize timeouts, transport settings, proxies, or add middleware for logging, metrics collection, or request/response manipulation.

func WithSiteTitle

func WithSiteTitle(siteTitle string) Option

WithSiteTitle sets the site title that will be passed in X-Title header to the OpenRouter API. This provides additional context about the origin of requests.

func WithSiteURL

func WithSiteURL(siteURL string) Option

WithSiteURL sets the site URL that will be passed in HTTP-Referer header to the OpenRouter API. This is useful for attribution and tracking usage from different applications.

type PromptTokensDetails added in v0.4.0

type PromptTokensDetails struct {
	// CachedTokens is the number of tokens that were cached from previous requests
	CachedTokens int `json:"cached_tokens"`
}

PromptTokensDetails provides detailed information about prompt token usage

type ProviderOptions

type ProviderOptions struct {
	// AllowFallbacks determines whether to try backup providers when the primary is unavailable
	AllowFallbacks *bool `json:"allow_fallbacks,omitempty"`

	// RequireParameters ensures only providers that support all request parameters are used
	RequireParameters *bool `json:"require_parameters,omitempty"`

	// DataCollection controls whether to use providers that may store data
	// Valid values: "deny", "allow"
	DataCollection string `json:"data_collection,omitempty"`

	// Order specifies the ordered list of provider names to try (e.g. ["Anthropic", "OpenAI"])
	Order []string `json:"order,omitempty"`

	// Only limits request routing to only the specified providers
	Only []string `json:"only,omitempty"`

	// Ignore specifies which providers should not be used for this request
	Ignore []string `json:"ignore,omitempty"`

	// Quantizations filters providers by their model quantization levels
	// Valid values include: "int4", "int8", "fp4", "fp6", "fp8", "fp16", "bf16", "fp32", "unknown"
	Quantizations []Quantization `json:"quantizations,omitempty"`

	// Sort specifies how to rank available providers
	// Valid values: "price", "throughput", "latency"
	Sort string `json:"sort,omitempty"`

	// MaxPrice sets the maximum pricing limits for this request
	MaxPrice *MaxPrice `json:"max_price,omitempty"`

	// Experimental contains experimental provider routing features
	Experimental *ExperimentalOptions `json:"experimental,omitempty"`
}

ProviderOptions specifies preferences for how OpenRouter should route requests to AI providers. These options allow for fine-grained control over which providers are used and how they are selected.

type ProviderOptionsBuilder

type ProviderOptionsBuilder struct {
	// contains filtered or unexported fields
}

ProviderOptionsBuilder implements a builder pattern for constructing ProviderOptions objects. This provides a fluent interface for configuring the many options available for provider routing.

func NewProviderOptionsBuilder

func NewProviderOptionsBuilder() *ProviderOptionsBuilder

NewProviderOptionsBuilder creates a new builder for configuring provider routing options. The returned builder can be used to set options through method chaining.

func (*ProviderOptionsBuilder) Build

Build finalizes and returns the constructed ProviderOptions.

Returns:

  • *ProviderOptions: A pointer to the fully configured provider options object

func (*ProviderOptionsBuilder) WithAllowFallbacks

func (b *ProviderOptionsBuilder) WithAllowFallbacks(allow bool) *ProviderOptionsBuilder

WithAllowFallbacks sets whether to allow backup providers

func (*ProviderOptionsBuilder) WithDataCollection

func (b *ProviderOptionsBuilder) WithDataCollection(policy string) *ProviderOptionsBuilder

WithDataCollection sets the data collection policy. Valid values are "allow" and "deny".

func (*ProviderOptionsBuilder) WithForceChatCompletions

func (b *ProviderOptionsBuilder) WithForceChatCompletions(force bool) *ProviderOptionsBuilder

WithForceChatCompletions sets whether to force using chat completions API

func (*ProviderOptionsBuilder) WithIgnore

func (b *ProviderOptionsBuilder) WithIgnore(providers []string) *ProviderOptionsBuilder

WithIgnore sets the list of provider names to skip

func (*ProviderOptionsBuilder) WithMaxCompletionPrice

func (b *ProviderOptionsBuilder) WithMaxCompletionPrice(price float64) *ProviderOptionsBuilder

WithMaxCompletionPrice sets the maximum price per million completion tokens

func (*ProviderOptionsBuilder) WithMaxImagePrice

func (b *ProviderOptionsBuilder) WithMaxImagePrice(price float64) *ProviderOptionsBuilder

WithMaxImagePrice sets the maximum price per image

func (*ProviderOptionsBuilder) WithMaxPrice

func (b *ProviderOptionsBuilder) WithMaxPrice(maxPrice *MaxPrice) *ProviderOptionsBuilder

WithMaxPrice sets the maximum pricing configuration

func (*ProviderOptionsBuilder) WithMaxPromptPrice

func (b *ProviderOptionsBuilder) WithMaxPromptPrice(price float64) *ProviderOptionsBuilder

WithMaxPromptPrice sets the maximum price per million prompt tokens

func (*ProviderOptionsBuilder) WithMaxRequestPrice

func (b *ProviderOptionsBuilder) WithMaxRequestPrice(price float64) *ProviderOptionsBuilder

WithMaxRequestPrice sets the maximum price per request

func (*ProviderOptionsBuilder) WithOnly

func (b *ProviderOptionsBuilder) WithOnly(providers []string) *ProviderOptionsBuilder

WithOnly sets the list of provider names to exclusively allow

func (*ProviderOptionsBuilder) WithOrder

func (b *ProviderOptionsBuilder) WithOrder(providers []string) *ProviderOptionsBuilder

WithOrder sets the list of provider names to try in order

func (*ProviderOptionsBuilder) WithQuantizations

func (b *ProviderOptionsBuilder) WithQuantizations(quantizations []Quantization) *ProviderOptionsBuilder

WithQuantizations sets the list of quantization levels to filter by

func (*ProviderOptionsBuilder) WithRequireParameters

func (b *ProviderOptionsBuilder) WithRequireParameters(require bool) *ProviderOptionsBuilder

WithRequireParameters sets whether to require providers to support all parameters

func (*ProviderOptionsBuilder) WithSort

WithSort sets the sorting strategy. Valid values are "price", "throughput", and "latency".

type Quantization

type Quantization string

Quantization represents the precision level used in model weights. Different quantization levels offer trade-offs between model size, inference speed, and prediction quality.

const (
	// QuantizationInt4 represents Integer (4 bit) quantization
	QuantizationInt4 Quantization = "int4"

	// QuantizationInt8 represents Integer (8 bit) quantization
	QuantizationInt8 Quantization = "int8"

	// QuantizationFP4 represents Floating point (4 bit) quantization
	QuantizationFP4 Quantization = "fp4"

	// QuantizationFP6 represents Floating point (6 bit) quantization
	QuantizationFP6 Quantization = "fp6"

	// QuantizationFP8 represents Floating point (8 bit) quantization
	QuantizationFP8 Quantization = "fp8"

	// QuantizationFP16 represents Floating point (16 bit) quantization
	QuantizationFP16 Quantization = "fp16"

	// QuantizationBF16 represents Brain floating point (16 bit) quantization
	QuantizationBF16 Quantization = "bf16"

	// QuantizationFP32 represents Floating point (32 bit) quantization
	QuantizationFP32 Quantization = "fp32"

	// QuantizationUnknown represents Unknown quantization level
	QuantizationUnknown Quantization = "unknown"
)
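
For example, to route only to providers serving 16-bit weights (a sketch):

providerOptions := gopenrouter.NewProviderOptionsBuilder().
  WithQuantizations([]gopenrouter.Quantization{
    gopenrouter.QuantizationFP16,
    gopenrouter.QuantizationBF16,
  }).
  Build()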

type ReasoningOptions

type ReasoningOptions struct {
	// Effort sets the proportion of tokens to allocate for reasoning
	Effort Effort `json:"effort,omitempty"`
	// MaxTokens sets the maximum number of tokens for reasoning
	MaxTokens *int `json:"max_tokens,omitempty"`
	// Exclude determines whether to include reasoning in the final response
	Exclude *bool `json:"exclude,omitempty"`
}

ReasoningOptions configures how models allocate tokens for internal reasoning. This allows models to "think" before producing a final response.
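
A sketch of attaching reasoning options to a chat request, given a []ChatMessage named messages and assuming WithReasoning accepts a *ReasoningOptions (matching the request field):

reasoning := &gopenrouter.ReasoningOptions{Effort: gopenrouter.EffortHigh}
request := gopenrouter.NewChatCompletionRequestBuilder("model-id", messages).
  WithReasoning(reasoning).
  Build()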

type RequestError

type RequestError struct {
	HTTPStatus     string
	HTTPStatusCode int
	Err            error
	Body           []byte
}

RequestError provides information about generic request errors.

func (*RequestError) Error

func (e *RequestError) Error() string

func (*RequestError) Unwrap

func (e *RequestError) Unwrap() error

type StreamingChoice added in v0.4.0

type StreamingChoice struct {
	Index              int       `json:"index"`
	Text               string    `json:"text"`
	FinishReason       *string   `json:"finish_reason"`
	NativeFinishReason *string   `json:"native_finish_reason"`
	LogProbs           *LogProbs `json:"logprobs,omitempty"`
}

StreamingChoice represents a streaming completion choice with text content

type TokenLogProbs added in v0.4.0

type TokenLogProbs struct {
	// Token is the token string
	Token string `json:"token"`
	// Bytes are the UTF-8 byte values of the token
	Bytes []int `json:"bytes"`
	// LogProb is the log probability of this token
	LogProb float64 `json:"logprob"`
	// TopLogProbs contains the most likely tokens at this position
	TopLogProbs []LogProbToken `json:"top_logprobs"`
}

TokenLogProbs represents log probability information for a token

type Usage

type Usage struct {
	// PromptTokens is the number of tokens in the input prompt
	PromptTokens int `json:"prompt_tokens"`
	// CompletionTokens is the number of tokens in the generated completion
	CompletionTokens int `json:"completion_tokens"`
	// TotalTokens is the sum of prompt and completion tokens
	TotalTokens int `json:"total_tokens"`
	// PromptTokensDetails provides detailed breakdown of prompt tokens
	PromptTokensDetails *PromptTokensDetails `json:"prompt_tokens_details,omitempty"`
	// CompletionTokensDetails provides detailed breakdown of completion tokens
	CompletionTokensDetails *CompletionTokensDetails `json:"completion_tokens_details,omitempty"`
}

Usage provides detailed information about token consumption for a request. This helps users track their API usage and optimize their requests.

type UsageOptions

type UsageOptions struct {
	// Include determines whether token usage information should be returned
	Include *bool `json:"usage,omitempty"`
}

UsageOptions controls whether to include token usage information in the response. When enabled, the API will return counts of prompt, completion, and total tokens.

Directories

Path Synopsis
examples
