README

🚀 Inference Gateway Go SDK

A powerful and easy-to-use Go SDK for the Inference Gateway


Connect to multiple LLM providers through a unified interface • Stream responses • Function calling • MCP tools support

Installation • Quick Start • Examples • Documentation


Installation

To install the SDK, use go get:

go get github.com/inference-gateway/sdk

Usage

Creating a Client

To create a client, use the NewClient function:

package main

import (
    "context"
    "fmt"
    "log"

    sdk "github.com/inference-gateway/sdk"
)

func main() {
    client := sdk.NewClient(&sdk.ClientOptions{
        BaseURL: "http://localhost:8080/v1",
    })

    // Verify the gateway is reachable before making requests.
    if err := client.HealthCheck(context.Background()); err != nil {
        log.Fatalf("health check failed: %v", err)
    }
    fmt.Println("client ready")
}
Using Custom Headers

The SDK supports custom HTTP headers that can be included with all requests. You can set headers in three ways:

  1. Initial headers via ClientOptions:
client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
    Headers: map[string]string{
        "X-Custom-Header": "my-value",
        "User-Agent":      "my-app/1.0",
    },
})
  2. Multiple headers using WithHeaders:
client = client.WithHeaders(map[string]string{
    "X-Request-ID": "abc123",
    "X-Source":     "sdk",
})
  3. Single header using WithHeader:
client = client.WithHeader("Authorization", "Bearer token123")

Headers can be combined and will override previous values if the same header name is used:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
    Headers: map[string]string{
        "X-App-Name": "my-app",
    },
})

// Add more headers
client = client.WithHeaders(map[string]string{
    "X-Request-ID": "req-123",
    "X-Version":    "1.0",
}).WithHeader("Authorization", "Bearer token")

// All subsequent requests will include all these headers
response, err := client.GenerateContent(ctx, provider, model, messages)
Listing Models

To list available models, use the ListModels method:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
})

ctx := context.Background()

// List all models from all providers
resp, err := client.ListModels(ctx)
if err != nil {
    log.Fatalf("Error listing models: %v", err)
}
fmt.Printf("All available models: %+v\n", resp.Data)

// List models for a specific provider
groqResp, err := client.ListProviderModels(ctx, sdk.Groq)
if err != nil {
    log.Fatalf("Error listing provider models: %v", err)
}
fmt.Printf("Provider: %s\n", *groqResp.Provider)
fmt.Printf("Available Groq models: %+v\n", groqResp.Data)
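Each entry in Data is a Model, so you can iterate over the results directly; a minimal sketch:

for _, model := range groqResp.Data {
    fmt.Printf("- %s (served by %s)\n", model.Id, model.ServedBy)
}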
Listing MCP Tools

To list available MCP (Model Context Protocol) tools, use the ListTools method. This functionality is only available when EXPOSE_MCP=true is set on the Inference Gateway server:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
    APIKey:  "your-api-key", // Required for MCP tools access
})

ctx := context.Background()
tools, err := client.ListTools(ctx)
if err != nil {
    log.Fatalf("Error listing tools: %v", err)
}

fmt.Printf("Found %d MCP tools:\n", len(tools.Data))
for _, tool := range tools.Data {
    fmt.Printf("- %s: %s (Server: %s)\n", tool.Name, tool.Description, tool.Server)
    if tool.InputSchema != nil {
        fmt.Printf("  Input Schema: %+v\n", *tool.InputSchema)
    }
}

Note: The MCP tools endpoint requires authentication and is only accessible when the server has EXPOSE_MCP=true configured. If the endpoint is not exposed, you'll receive a 403 error with the message "MCP tools endpoint is not exposed. Set EXPOSE_MCP=true to enable."
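If you prefer to degrade gracefully instead of exiting when the endpoint is not exposed, a minimal sketch:

tools, err := client.ListTools(ctx)
if err != nil {
    // A 403 here typically means the server was started without EXPOSE_MCP=true
    log.Printf("MCP tools unavailable: %v", err)
    return
}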

Generating Content

To generate content using a model, use the GenerateContent method:

Note: Some models support reasoning capabilities. You can use the ReasoningFormat parameter to control how reasoning is provided in the response. The model's reasoning will be available in the Reasoning or ReasoningContent fields of the response message.

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
})

ctx := context.Background()
response, err := client.GenerateContent(
    ctx,
    sdk.Ollama,
    "ollama/llama2",
    []sdk.Message{
        {
            Role:    sdk.System,
            Content: "You are a helpful assistant.",
        },
        {
            Role:    sdk.User,
            Content: "What is Go?",
        },
    },
)
if err != nil {
    log.Printf("Error generating content: %v", err)
    return
}

fmt.Printf("Generated content: %s\n", response.Choices[0].Message.Content)

// If reasoning was requested and the model supports it
if response.Choices[0].Message.Reasoning != nil {
    fmt.Printf("Reasoning: %s\n", *response.Choices[0].Message.Reasoning)
}
Using ReasoningFormat

You can enable reasoning capabilities by setting the ReasoningFormat parameter in your request:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
})

ctx := context.Background()

// Set up your messages
messages := []sdk.Message{
    {
        Role:    sdk.System,
        Content: "You are a helpful assistant. Please include your reasoning for complex questions.",
    },
    {
        Role:    sdk.User,
        Content: "What is the square root of 144 and why?",
    },
}

// Create a request with reasoning format
reasoningFormat := "parsed" // "raw" or "parsed"; defaults to "parsed" if not specified
options := &sdk.CreateChatCompletionRequest{
    ReasoningFormat: &reasoningFormat,
}

// Set options and make the request
response, err := client.WithOptions(options).GenerateContent(
    ctx,
    sdk.Anthropic,
    "anthropic/claude-3-opus-20240229",
    messages,
)

if err != nil {
    log.Fatalf("Error generating content: %v", err)
}

fmt.Printf("Content: %s\n", response.Choices[0].Message.Content)
if response.Choices[0].Message.Reasoning != nil {
    fmt.Printf("Reasoning: %s\n", *response.Choices[0].Message.Reasoning)
}
Streaming Content

To generate content using streaming mode, use the GenerateContentStream method:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
})
ctx := context.Background()
events, err := client.GenerateContentStream(
    ctx,
    sdk.Ollama,
    "ollama/llama2",
    []sdk.Message{
        {
            Role:    sdk.System,
            Content: "You are a helpful assistant.",
        },
        {
            Role:    sdk.User,
            Content: "What is Go?",
        },
    },
)
if err != nil {
    log.Fatalf("Error generating content stream: %v", err)
}

// Read events from the stream / channel
for event := range events {
    if event.Event == nil {
        continue
    }

    switch *event.Event {
    case sdk.ContentDelta:
        if event.Data != nil {
            // Parse the streaming response
            var streamResponse sdk.CreateChatCompletionStreamResponse
            if err := json.Unmarshal(*event.Data, &streamResponse); err != nil {
                log.Printf("Error parsing stream response: %v", err)
                continue
            }

            // Process each choice in the response
            for _, choice := range streamResponse.Choices {
                if choice.Delta.Content != "" {
                    // Just print the content as it comes in
                    fmt.Print(choice.Delta.Content)
                }
            }
        }

    case sdk.StreamEnd:
        // Stream has ended
        fmt.Println("\nStream ended")

    case sdk.MessageError:
        // Handle error events
        if event.Data != nil {
            var errResp struct {
                Error string `json:"error"`
            }
            if err := json.Unmarshal(*event.Data, &errResp); err != nil {
                log.Printf("Error parsing error: %v", err)
                continue
            }
            log.Printf("Error: %s", errResp.Error)
        }
    }
}
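If you want the complete response text rather than printing deltas as they arrive, an alternative version of the same loop can accumulate them with a strings.Builder (a minimal sketch; requires the strings import):

var full strings.Builder

for event := range events {
    if event.Event == nil || *event.Event != sdk.ContentDelta || event.Data == nil {
        continue
    }
    var chunk sdk.CreateChatCompletionStreamResponse
    if err := json.Unmarshal(*event.Data, &chunk); err != nil {
        continue
    }
    // Append each content delta to the builder
    for _, choice := range chunk.Choices {
        full.WriteString(choice.Delta.Content)
    }
}

fmt.Println(full.String())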
Tool-Use

To use tools with the SDK, you can define a tool and provide it to the client:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
})

// stringPtr is a small helper for optional string fields
stringPtr := func(s string) *string { return &s }

// Create tools array with our functions
tools := []sdk.ChatCompletionTool{
    {
        Type: sdk.Function,
        Function: sdk.FunctionObject{
            Name:        "get_current_weather",
            Description: stringPtr("Get the current weather in a given location"),
            Parameters: &sdk.FunctionParameters{
                "type": "object",
                "properties": map[string]interface{}{
                    "location": map[string]interface{}{
                        "type":        "string",
                        "enum":        []string{"san francisco", "new york", "london", "tokyo", "sydney"},
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": map[string]interface{}{
                        "type":        "string",
                        "enum":        []string{"celsius", "fahrenheit"},
                        "description": "The temperature unit to use",
                    },
                },
                "required": []string{"location"},
            },
        },
    },
    {
        Type: sdk.Function,
        Function: sdk.FunctionObject{
            Name:        "get_current_time",
            Description: stringPtr("Get the current time in a given location"),
            Parameters: &sdk.FunctionParameters{
                "type": "object",
                "properties": map[string]interface{}{
                    "location": map[string]interface{}{
                        "type":        "string",
                        "enum":        []string{"san francisco", "new york", "london", "tokyo", "sydney"},
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                },
                "required": []string{"location"},
            },
        },
    },
}

// Provide the tools to the client and make the request
response, err := client.WithTools(&tools).GenerateContent(ctx, provider, modelName, messages)
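When the model decides to call one of the tools, the choice's finish reason is sdk.ToolCalls and the calls are attached to the assistant message. A minimal sketch of the follow-up round trip (executeTool is a hypothetical helper standing in for your own dispatch logic):

if err != nil {
    log.Fatalf("Error generating content: %v", err)
}

choice := response.Choices[0]
if choice.FinishReason == sdk.ToolCalls && choice.Message.ToolCalls != nil {
    // Keep the assistant message that requested the calls in the history
    messages = append(messages, choice.Message)

    for _, call := range *choice.Message.ToolCalls {
        // executeTool is hypothetical: dispatch on the function name and
        // return the tool result as a string
        result := executeTool(call.Function.Name, call.Function.Arguments)

        id := call.Id // copy so the pointer is stable per iteration
        // Feed the result back as a tool message referencing the call ID
        messages = append(messages, sdk.Message{
            Role:       sdk.Tool,
            Content:    result,
            ToolCallId: &id,
        })
    }

    // Let the model continue with the tool results in context
    response, err = client.WithTools(&tools).GenerateContent(ctx, provider, modelName, messages)
}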
Health Check

To check if the API is healthy:

client := sdk.NewClient(&sdk.ClientOptions{
    BaseURL: "http://localhost:8080/v1",
})

ctx := context.Background()
err := client.HealthCheck(ctx)
if err != nil {
    log.Fatalf("Health check failed: %v", err)
}

Examples

For more detailed examples and use cases, check out the examples directory. Each example includes its own README with specific instructions and explanations.

Supported Providers

The SDK supports the following LLM providers:

  • Ollama (sdk.Ollama)
  • Groq (sdk.Groq)
  • OpenAI (sdk.Openai)
  • DeepSeek (sdk.Deepseek)
  • Cloudflare (sdk.Cloudflare)
  • Cohere (sdk.Cohere)
  • Anthropic (sdk.Anthropic)

Documentation

  1. Run: task docs
  2. Open: http://localhost:6060/pkg/github.com/inference-gateway/sdk

Contributing

Please refer to the CONTRIBUTING.md file for information about how to get involved. We welcome issues, questions, and pull requests.

License

This SDK is distributed under the MIT License, see LICENSE for more information.

Documentation

Overview

Package sdk provides primitives to interact with the openapi HTTP API.

Code generated by github.com/oapi-codegen/oapi-codegen/v2 version v2.4.1 DO NOT EDIT.

Index

Constants

const (
	BearerAuthScopes = "bearerAuth.Scopes"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type BadRequest added in v1.5.0

type BadRequest = Error

BadRequest defines model for BadRequest.

type ChatCompletionChoice added in v1.5.0

type ChatCompletionChoice struct {
	// FinishReason The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence,
	// `length` if the maximum number of tokens specified in the request was reached,
	// `content_filter` if content was omitted due to a flag from our content filters,
	// `tool_calls` if the model called a tool.
	FinishReason ChatCompletionChoiceFinishReason `json:"finish_reason"`

	// Index The index of the choice in the list of choices.
	Index int `json:"index"`

	// Message Message structure for provider requests
	Message Message `json:"message"`
}

ChatCompletionChoice defines model for ChatCompletionChoice.

type ChatCompletionChoiceFinishReason added in v1.5.0

type ChatCompletionChoiceFinishReason string

ChatCompletionChoiceFinishReason The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence, `length` if the maximum number of tokens specified in the request was reached, `content_filter` if content was omitted due to a flag from our content filters, `tool_calls` if the model called a tool.

const (
	ContentFilter ChatCompletionChoiceFinishReason = "content_filter"
	FunctionCall  ChatCompletionChoiceFinishReason = "function_call"
	Length        ChatCompletionChoiceFinishReason = "length"
	Stop          ChatCompletionChoiceFinishReason = "stop"
	ToolCalls     ChatCompletionChoiceFinishReason = "tool_calls"
)

Defines values for ChatCompletionChoiceFinishReason.

type ChatCompletionMessageToolCall added in v1.5.0

type ChatCompletionMessageToolCall struct {
	// Function The function that the model called.
	Function ChatCompletionMessageToolCallFunction `json:"function"`

	// Id The ID of the tool call.
	Id string `json:"id"`

	// Type The type of the tool. Currently, only `function` is supported.
	Type ChatCompletionToolType `json:"type"`
}

ChatCompletionMessageToolCall defines model for ChatCompletionMessageToolCall.

type ChatCompletionMessageToolCallChunk added in v1.5.0

type ChatCompletionMessageToolCallChunk struct {
	Index    int    `json:"index"`
	ID       string `json:"id,omitempty"`
	Type     string `json:"type,omitempty"`
	Function struct {
		Name      string `json:"name,omitempty"`
		Arguments string `json:"arguments,omitempty"`
	} `json:"function,omitempty"`
}

ChatCompletionMessageToolCallChunk represents a chunk of a tool call in a stream response.

type ChatCompletionMessageToolCallFunction added in v1.5.0

type ChatCompletionMessageToolCallFunction struct {
	// Arguments The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
	Arguments string `json:"arguments"`

	// Name The name of the function to call.
	Name string `json:"name"`
}

ChatCompletionMessageToolCallFunction The function that the model called.

type ChatCompletionStreamChoice added in v1.5.0

type ChatCompletionStreamChoice struct {
	Delta        ChatCompletionStreamResponseDelta `json:"delta"`
	Index        int                               `json:"index"`
	FinishReason string                            `json:"finish_reason"`
}

ChatCompletionStreamChoice represents a choice in a streaming chat completion response.

type ChatCompletionStreamOptions added in v1.5.0

type ChatCompletionStreamOptions struct {
	// IncludeUsage If set, an additional chunk will be streamed before the `data: [DONE]` message. The `usage` field on this chunk shows the token usage statistics for the entire request, and the `choices` field will always be an empty array. All other chunks will also include a `usage` field, but with a null value.
	IncludeUsage bool `json:"include_usage"`
}

ChatCompletionStreamOptions Options for streaming response. Only set this when you set `stream: true`.
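
For example, to request a usage chunk at the end of a stream, these options can be passed through WithOptions; a minimal sketch (it assumes GenerateContentStream sets the stream flag itself):

options := &sdk.CreateChatCompletionRequest{
	StreamOptions: &sdk.ChatCompletionStreamOptions{
		IncludeUsage: true, // final chunk before data: [DONE] carries usage
	},
}

events, err := client.WithOptions(options).GenerateContentStream(ctx, provider, model, messages)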

type ChatCompletionStreamResponseDelta added in v1.5.0

type ChatCompletionStreamResponseDelta struct {
	Content          string                               `json:"content,omitempty"`
	ToolCalls        []ChatCompletionMessageToolCallChunk `json:"tool_calls,omitempty"`
	Role             string                               `json:"role,omitempty"`
	Reasoning        *string                              `json:"reasoning,omitempty"`
	ReasoningContent *string                              `json:"reasoning_content,omitempty"`
	Refusal          string                               `json:"refusal,omitempty"`
}

ChatCompletionStreamResponseDelta represents a chat completion delta generated by streamed model responses.

type ChatCompletionTokenLogprob added in v1.5.0

type ChatCompletionTokenLogprob struct {
	Token   string  `json:"token"`
	Logprob float64 `json:"logprob"`
	Bytes   []int   `json:"bytes"`
}

ChatCompletionTokenLogprob represents token log probability information.

type ChatCompletionTool added in v1.5.0

type ChatCompletionTool struct {
	Function FunctionObject `json:"function"`

	// Type The type of the tool. Currently, only `function` is supported.
	Type ChatCompletionToolType `json:"type"`
}

ChatCompletionTool defines model for ChatCompletionTool.

type ChatCompletionToolType added in v1.5.0

type ChatCompletionToolType string

ChatCompletionToolType The type of the tool. Currently, only `function` is supported.

const (
	Function ChatCompletionToolType = "function"
)

Defines values for ChatCompletionToolType.

type Client

type Client interface {
	WithAuthToken(token string) *clientImpl
	WithTools(tools *[]ChatCompletionTool) *clientImpl
	WithOptions(options *CreateChatCompletionRequest) *clientImpl
	WithHeaders(headers map[string]string) *clientImpl
	WithHeader(name, value string) *clientImpl
	ListModels(ctx context.Context) (*ListModelsResponse, error)
	ListProviderModels(ctx context.Context, provider Provider) (*ListModelsResponse, error)
	ListTools(ctx context.Context) (*ListToolsResponse, error)
	GenerateContent(ctx context.Context, provider Provider, model string, messages []Message) (*CreateChatCompletionResponse, error)
	GenerateContentStream(ctx context.Context, provider Provider, model string, messages []Message) (<-chan SSEvent, error)
	HealthCheck(ctx context.Context) error
}

Client represents the SDK client interface

func NewClient

func NewClient(options *ClientOptions) Client

NewClient creates a new SDK client with the specified options.

Example:

client := sdk.NewClient(&sdk.ClientOptions{
	BaseURL: "http://localhost:8080/v1",
	APIKey: "your-api-key",
	Timeout: 30 * time.Second,
	Tools: nil,
	Headers: map[string]string{
		"X-Custom-Header": "custom-value",
		"User-Agent": "my-app/1.0",
	},
})

type ClientOptions added in v1.5.1

type ClientOptions struct {
	// APIKey is the API key to use for the client.
	APIKey string
	// BaseURL is the base URL to use for the client.
	BaseURL string
	// Timeout is the timeout to use for the client.
	Timeout time.Duration
	// Tools is the tools to use for the client.
	Tools *[]ChatCompletionTool
	// Headers is a map of custom headers to include with all requests.
	Headers map[string]string
}

ClientOptions represents the options that can be passed to the client.

type CompletionUsage added in v1.5.0

type CompletionUsage struct {
	// CompletionTokens Number of tokens in the generated completion.
	CompletionTokens int64 `json:"completion_tokens"`

	// PromptTokens Number of tokens in the prompt.
	PromptTokens int64 `json:"prompt_tokens"`

	// TotalTokens Total number of tokens used in the request (prompt + completion).
	TotalTokens int64 `json:"total_tokens"`
}

CompletionUsage Usage statistics for the completion request.
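
Usage is optional on responses, so check for nil before reading it; a minimal sketch after a non-streaming call:

if response.Usage != nil {
	fmt.Printf("tokens: prompt=%d completion=%d total=%d\n",
		response.Usage.PromptTokens,
		response.Usage.CompletionTokens,
		response.Usage.TotalTokens)
}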

type CreateChatCompletionJSONRequestBody added in v1.5.0

type CreateChatCompletionJSONRequestBody = CreateChatCompletionRequest

CreateChatCompletionJSONRequestBody defines body for CreateChatCompletion for application/json ContentType.

type CreateChatCompletionParams added in v1.5.0

type CreateChatCompletionParams struct {
	// Provider Specific provider to use (default determined by model)
	Provider *Provider `form:"provider,omitempty" json:"provider,omitempty"`
}

CreateChatCompletionParams defines parameters for CreateChatCompletion.

type CreateChatCompletionRequest added in v1.5.0

type CreateChatCompletionRequest struct {
	// MaxTokens An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
	MaxTokens *int `json:"max_tokens,omitempty"`

	// Messages A list of messages comprising the conversation so far.
	Messages []Message `json:"messages"`

	// Model Model ID to use
	Model string `json:"model"`

	// ReasoningFormat The format of the reasoning content. Can be `raw` or `parsed`.
	// When specified as raw some reasoning models will output <think /> tags. When specified as parsed the model will output the reasoning under  `reasoning` or `reasoning_content` attribute.
	ReasoningFormat *string `json:"reasoning_format,omitempty"`

	// Stream If set to true, the model response data will be streamed to the client as it is generated using [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format).
	Stream *bool `json:"stream,omitempty"`

	// StreamOptions Options for streaming response. Only set this when you set `stream: true`.
	StreamOptions *ChatCompletionStreamOptions `json:"stream_options,omitempty"`

	// Tools A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
	Tools *[]ChatCompletionTool `json:"tools,omitempty"`
}

CreateChatCompletionRequest defines model for CreateChatCompletionRequest.

type CreateChatCompletionResponse added in v1.5.0

type CreateChatCompletionResponse struct {
	// Choices A list of chat completion choices. Can be more than one if `n` is greater than 1.
	Choices []ChatCompletionChoice `json:"choices"`

	// Created The Unix timestamp (in seconds) of when the chat completion was created.
	Created int `json:"created"`

	// Id A unique identifier for the chat completion.
	Id string `json:"id"`

	// Model The model used for the chat completion.
	Model string `json:"model"`

	// Object The object type, which is always `chat.completion`.
	Object string `json:"object"`

	// Usage Usage statistics for the completion request.
	Usage *CompletionUsage `json:"usage,omitempty"`
}

CreateChatCompletionResponse Represents a chat completion response returned by model, based on the provided input.

type CreateChatCompletionStreamResponse added in v1.5.0

type CreateChatCompletionStreamResponse struct {
	ID                string                       `json:"id"`
	Choices           []ChatCompletionStreamChoice `json:"choices"`
	Created           int                          `json:"created"`
	Model             string                       `json:"model"`
	SystemFingerprint string                       `json:"system_fingerprint,omitempty"`
	Object            string                       `json:"object"`
	Usage             *CompletionUsage             `json:"usage,omitempty"`
}

CreateChatCompletionStreamResponse represents a streamed chunk of a chat completion response.

type Error added in v1.5.0

type Error struct {
	Error *string `json:"error,omitempty"`
}

Error defines model for Error.

type FunctionObject added in v1.5.0

type FunctionObject struct {
	// Description A description of what the function does, used by the model to choose when and how to call the function.
	Description *string `json:"description,omitempty"`

	// Name The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
	Name string `json:"name"`

	// Parameters The parameters the functions accepts, described as a JSON Schema object. See the [guide](/docs/guides/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format.
	// Omitting `parameters` defines a function with an empty parameter list.
	Parameters *FunctionParameters `json:"parameters,omitempty"`

	// Strict Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the `parameters` field. Only a subset of JSON Schema is supported when `strict` is `true`. Learn more about Structured Outputs in the [function calling guide](docs/guides/function-calling).
	Strict *bool `json:"strict,omitempty"`
}

FunctionObject defines model for FunctionObject.

type FunctionParameters added in v1.5.0

type FunctionParameters map[string]interface{}

FunctionParameters The parameters the functions accepts, described as a JSON Schema object. See the [guide](/docs/guides/function-calling) for examples, and the [JSON Schema reference](https://json-schema.org/understanding-json-schema/) for documentation about the format. Omitting `parameters` defines a function with an empty parameter list.

type InternalError added in v1.5.0

type InternalError = Error

InternalError defines model for InternalError.

type ListModelsParams added in v1.5.0

type ListModelsParams struct {
	// Provider Specific provider to query (optional)
	Provider *Provider `form:"provider,omitempty" json:"provider,omitempty"`
}

ListModelsParams defines parameters for ListModels.

type ListModelsResponse added in v1.3.0

type ListModelsResponse struct {
	Data     []Model   `json:"data"`
	Object   string    `json:"object"`
	Provider *Provider `json:"provider,omitempty"`
}

ListModelsResponse Response structure for listing models

type ListToolsResponse added in v1.8.0

type ListToolsResponse struct {
	// Data Array of available MCP tools
	Data []MCPTool `json:"data"`

	// Object Always "list"
	Object string `json:"object"`
}

ListToolsResponse Response structure for listing MCP tools

type MCPNotExposed added in v1.8.0

type MCPNotExposed = Error

MCPNotExposed defines model for MCPNotExposed.

type MCPTool added in v1.8.0

type MCPTool struct {
	// Description A description of what the tool does
	Description string `json:"description"`

	// InputSchema JSON schema for the tool's input parameters
	InputSchema *map[string]interface{} `json:"input_schema,omitempty"`

	// Name The name of the tool
	Name string `json:"name"`

	// Server The MCP server that provides this tool
	Server string `json:"server"`
}

MCPTool An MCP tool definition

type Message

type Message struct {
	Content string `json:"content"`

	// Reasoning The reasoning of the chunk message. Same as reasoning_content.
	Reasoning *string `json:"reasoning,omitempty"`

	// ReasoningContent The reasoning content of the chunk message.
	ReasoningContent *string `json:"reasoning_content,omitempty"`

	// Role Role of the message sender
	Role       MessageRole                      `json:"role"`
	ToolCallId *string                          `json:"tool_call_id,omitempty"`
	ToolCalls  *[]ChatCompletionMessageToolCall `json:"tool_calls,omitempty"`
}

Message Message structure for provider requests

type MessageRole added in v1.5.0

type MessageRole string

MessageRole Role of the message sender

const (
	Assistant MessageRole = "assistant"
	System    MessageRole = "system"
	Tool      MessageRole = "tool"
	User      MessageRole = "user"
)

Defines values for MessageRole.

type Model

type Model struct {
	Created  int64    `json:"created"`
	Id       string   `json:"id"`
	Object   string   `json:"object"`
	OwnedBy  string   `json:"owned_by"`
	ServedBy Provider `json:"served_by"`
}

Model Common model information

type Provider

type Provider string

Provider defines model for Provider.

const (
	Anthropic  Provider = "anthropic"
	Cloudflare Provider = "cloudflare"
	Cohere     Provider = "cohere"
	Deepseek   Provider = "deepseek"
	Groq       Provider = "groq"
	Ollama     Provider = "ollama"
	Openai     Provider = "openai"
)

Defines values for Provider.

type ProviderRequest added in v1.5.0

type ProviderRequest struct {
	Messages *[]struct {
		Content *string `json:"content,omitempty"`
		Role    *string `json:"role,omitempty"`
	} `json:"messages,omitempty"`
	Model       *string  `json:"model,omitempty"`
	Temperature *float32 `json:"temperature,omitempty"`
}

ProviderRequest defines model for ProviderRequest.

type ProviderResponse added in v1.5.0

type ProviderResponse = ProviderSpecificResponse

ProviderResponse Provider-specific response format. Examples:

OpenAI GET /v1/models?provider=openai response: ```json

{
  "provider": "openai",
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882410,
      "owned_by": "openai",
      "served_by": "openai"
    }
  ]
}

```

Anthropic GET /v1/models?provider=anthropic response: ```json

{
  "provider": "anthropic",
  "object": "list",
  "data": [
    {
      "id": "claude-3-opus-20240229",
      "object": "model",
      "created": 1687882410,
      "owned_by": "anthropic",
      "served_by": "anthropic"
    }
  ]
}

```

type ProviderSpecificResponse added in v1.5.0

type ProviderSpecificResponse = map[string]interface{}

ProviderSpecificResponse Provider-specific response format. Examples:

OpenAI GET /v1/models?provider=openai response: ```json

{
  "provider": "openai",
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882410,
      "owned_by": "openai",
      "served_by": "openai"
    }
  ]
}

```

Anthropic GET /v1/models?provider=anthropic response: ```json

{
  "provider": "anthropic",
  "object": "list",
  "data": [
    {
      "id": "claude-3-opus-20240229",
      "object": "model",
      "created": 1687882410,
      "owned_by": "anthropic",
      "served_by": "anthropic"
    }
  ]
}

```

type ProxyPatchJSONBody added in v1.5.0

type ProxyPatchJSONBody struct {
	Messages *[]struct {
		Content *string `json:"content,omitempty"`
		Role    *string `json:"role,omitempty"`
	} `json:"messages,omitempty"`
	Model       *string  `json:"model,omitempty"`
	Temperature *float32 `json:"temperature,omitempty"`
}

ProxyPatchJSONBody defines parameters for ProxyPatch.

type ProxyPatchJSONRequestBody added in v1.5.0

type ProxyPatchJSONRequestBody ProxyPatchJSONBody

ProxyPatchJSONRequestBody defines body for ProxyPatch for application/json ContentType.

type ProxyPostJSONBody added in v1.5.0

type ProxyPostJSONBody struct {
	Messages *[]struct {
		Content *string `json:"content,omitempty"`
		Role    *string `json:"role,omitempty"`
	} `json:"messages,omitempty"`
	Model       *string  `json:"model,omitempty"`
	Temperature *float32 `json:"temperature,omitempty"`
}

ProxyPostJSONBody defines parameters for ProxyPost.

type ProxyPostJSONRequestBody added in v1.5.0

type ProxyPostJSONRequestBody ProxyPostJSONBody

ProxyPostJSONRequestBody defines body for ProxyPost for application/json ContentType.

type ProxyPutJSONBody added in v1.5.0

type ProxyPutJSONBody struct {
	Messages *[]struct {
		Content *string `json:"content,omitempty"`
		Role    *string `json:"role,omitempty"`
	} `json:"messages,omitempty"`
	Model       *string  `json:"model,omitempty"`
	Temperature *float32 `json:"temperature,omitempty"`
}

ProxyPutJSONBody defines parameters for ProxyPut.

type ProxyPutJSONRequestBody added in v1.5.0

type ProxyPutJSONRequestBody ProxyPutJSONBody

ProxyPutJSONRequestBody defines body for ProxyPut for application/json ContentType.

type SSEvent added in v1.4.0

type SSEvent struct {
	Data  *[]byte       `json:"data,omitempty"`
	Event *SSEventEvent `json:"event,omitempty"`
	Retry *int          `json:"retry,omitempty"`
}

SSEvent defines model for SSEvent.

type SSEventEvent added in v1.5.0

type SSEventEvent string

SSEventEvent defines model for SSEvent.Event.

const (
	ContentDelta SSEventEvent = "content-delta"
	ContentEnd   SSEventEvent = "content-end"
	ContentStart SSEventEvent = "content-start"
	MessageEnd   SSEventEvent = "message-end"
	MessageStart SSEventEvent = "message-start"
	StreamEnd    SSEventEvent = "stream-end"
	StreamStart  SSEventEvent = "stream-start"
)

Defines values for SSEventEvent.

type Unauthorized added in v1.5.0

type Unauthorized = Error

Unauthorized defines model for Unauthorized.
