genai

package module
v0.0.0-...-0071670
Published: Mar 25, 2025 License: Apache-2.0 Imports: 11 Imported by: 1

README

genai

The high-performance, low-level native Go client for LLMs.

| Provider | Country | Chat | Streaming | Vision | PDF | Audio | Video | JSON output | JSON schema | Seed | Tools |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anthropic | 🇺🇸 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Cerebras | 🇺🇸 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Cloudflare Workers AI | 🇺🇸 | ✅ | ✅ | ⏳ | ❌ | ⏳ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Cohere | 🇨🇦 | ✅ | ✅ | ⏳ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| DeepSeek | 🇨🇳 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| Google's Gemini | 🇺🇸 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Groq | 🇺🇸 | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | ✅ |
| HuggingFace | 🇺🇸 | ✅ | ✅ | ⏳ | ⏳ | ❌ | ❌ | ⏳ | ⏳ | ✅ | ✅ |
| llama.cpp | N/A | ✅ | ✅ | ⏳ | ⏳ | ⏳ | ⏳ | ⏳ | ⏳ | ✅ | ⏳ |
| Mistral | 🇫🇷 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Ollama | N/A | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
| OpenAI | 🇺🇸 | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Perplexity | 🇺🇸 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ⏳ | ❌ | ❌ |
| TogetherAI | 🇺🇸 | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
  • ✅ Implemented
  • ⏳ To be implemented
  • ❌ Not supported
  • Streaming: chat streaming
  • Vision: ability to process an image as input; most providers support PNG, JPG, WEBP and non-animated GIF
  • PDF: ability to process a PDF as input, possibly with OCR
  • Video: ability to process a video (e.g. MP4) as input
  • JSON output/schema: ability to output JSON in free form or with a schema
  • Seed: deterministic seed for reproducibility
  • Tools: tool calling

Features

  • Full functionality: Full access to each backend-specific functionality. Access the raw API if needed, with the full message schema as Go structs.
  • Native JSON struct serialization: Pass a struct to tell the LLM what to generate, then decode the reply into your struct. No need to manually fiddle with JSON. Supports required fields, enums, descriptions, etc.
  • Native tool calling: Tell the LLM to call a tool directly, described as a Go struct (see the sketch after this list). No need to manually fiddle with JSON.
  • Streaming: Streams the completion reply as the output is being generated.
  • Vision: Process images, PDFs and videos (!) as input.
  • Unit testing friendly: Record and play back API calls at the HTTP level.
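
For instance, here is a minimal tool calling sketch, mirroring the style of the examples below; the tool name, argument struct and prompt are illustrative, not part of the API:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/maruel/genai"
	"github.com/maruel/genai/cerebras"
)

// Weather describes the arguments of the hypothetical get_weather tool.
type Weather struct {
	City string `json:"city" jsonschema_description:"City to get the weather for"`
}

func main() {
	c, err := cerebras.New("", "llama-3.3-70b")
	if err != nil {
		log.Fatal(err)
	}
	msgs := genai.Messages{
		genai.NewTextMessage(genai.User, "What is the weather in Paris? Use the tool."),
	}
	opts := genai.ChatOptions{
		Tools: []genai.ToolDef{
			{
				Name:        "get_weather",
				Description: "Get the current weather for a city.",
				InputsAs:    &Weather{},
			},
		},
	}
	resp, err := c.Chat(context.Background(), msgs, &opts)
	if err != nil {
		log.Fatal(err)
	}
	// The LLM replies with a tool call instead of plain text.
	for _, tc := range resp.ToolCalls {
		w := Weather{}
		if err := tc.Decode(&w); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s(%q)\n", tc.Name, w.City)
	}
}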

Implementation is in flux. :)


Design

  • Safe and strict API implementation. All you love from a statically typed language. Immediately fails on unknown RPC fields. Error code paths are properly implemented.
  • Stateless: no global state; clients are safe to use concurrently without locks.
  • Professional grade: unit tested on live services.
  • Optimized for speed: minimize memory allocations, compress data at the transport layer when possible.
  • Lean: Few dependencies. No unnecessary abstraction layer.
  • Easy to add new providers.

I'm poor 💸

As of March 2025, several of the providers above offer a free tier (other limits apply).

HTTP transport compression

Each service provider was manually tested to see if it accepts a compressed POST body.

As of March 2025, here's the HTTP POST compression supported by each provider:

| Provider | Compression accepted for POST data | Response compressed |
| --- | --- | --- |
| Anthropic | none | gzip |
| Cerebras | none | none |
| Cloudflare Workers AI | none | gzip |
| Cohere | none | none |
| DeepSeek | none | gzip |
| Google's Gemini | gzip | gzip |
| Groq | none | br |
| HuggingFace | gzip, br or zstd | none |
| Mistral | none | br |
| OpenAI | none | br |
| Perplexity | none | none |

This matters if you care about your ingress/egress bandwidth. Only HuggingFace accepts brotli and zstd for POST data, yet it replies uncompressed (!). Google accepts gzip both ways.
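
For illustration, here is roughly what gzip-compressing a POST body looks like at the net/http level. This is a generic sketch (the URL and payload are placeholders), not this package's transport code:

package main

import (
	"bytes"
	"compress/gzip"
	"log"
	"net/http"
)

func main() {
	// Compress the JSON payload before sending it.
	body := []byte(`{"text":"Hi"}`)
	buf := &bytes.Buffer{}
	w := gzip.NewWriter(buf)
	if _, err := w.Write(body); err != nil {
		log.Fatal(err)
	}
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
	req, err := http.NewRequest("POST", "https://example.com/v1/chat", buf) // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Tells the server the request body is gzip-compressed.
	req.Header.Set("Content-Encoding", "gzip")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}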

Look and feel

Decoding the answer as a typed struct

Tell the LLM to use a specific JSON schema to generate the response.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/maruel/genai"
	"github.com/maruel/genai/cerebras"
)

type Circle struct {
	Round bool `json:"round"`
}

func main() {
	c, err := cerebras.New("", "llama3.1-8b")
	if err != nil {
		log.Fatal(err)
	}
	msgs := genai.Messages{
		genai.NewTextMessage(genai.User, "Is a circle round? Reply as JSON."),
	}
	opts := genai.ChatOptions{
		Seed:        1,
		Temperature: 0.01,
		MaxTokens:   50,
		DecodeAs:    &Circle{},
	}
	resp, err := c.Chat(context.Background(), msgs, &opts)
	if err != nil {
		log.Fatal(err)
	}
	got := Circle{}
	if err := resp.Contents[0].Decode(&got); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Round: %v\n", got.Round)
}

Models

Snapshot of all the supported models: MODELS.md.

Try it:

go install github.com/maruel/genai/cmd/...@latest
list-models -provider huggingface

TODO

  • Audio out
  • Video out
  • Batch
  • Tuning
  • Embeddings
  • Handle rate limiting
  • Moderation
  • Thinking
  • Content Blocks
  • Citations

Documentation

Overview

Package genai is the high-performance native Go client for LLMs.

It provides a generic interface to interact with various LLM providers.

Check out the examples for a quick start.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ChatOptions

type ChatOptions struct {

	// Temperature adjusts the creativity of the sampling. Generally between 0 and 2.
	Temperature float64
	// TopP adjusts nucleus sampling, between 0 and 1. The higher the value, the
	// more diverse the output.
	TopP float64
	// MaxTokens is the maximum number of tokens to generate. Used to limit it
	// lower than the default maximum, for budget reasons.
	MaxTokens int64
	// SystemPrompt is the prompt to use for the system role.
	SystemPrompt string

	// Seed for the random number generator. Default is 0 which means
	// non-deterministic.
	Seed int64
	// TopK adjusts sampling where only the N first candidates are considered.
	TopK int64
	// Stop is the list of tokens to stop generation.
	Stop []string

	// ReplyAsJSON enforces the output to be valid JSON, any JSON. It is
	// important to tell the model to reply in JSON in the prompt itself.
	ReplyAsJSON bool
	// DecodeAs enforces a reply with a specific JSON structure. It is important
	// to tell the model to reply in JSON in the prompt itself.
	DecodeAs ReflectedToJSON
	// Tools is the list of tools that the LLM can request to call.
	Tools []ToolDef
	// contains filtered or unexported fields
}

ChatOptions is a list of frequent options supported by most ChatProvider implementations. Each provider is free to support more options through a specialized struct.

func (*ChatOptions) Validate

func (c *ChatOptions) Validate() error

Validate ensures the completion options are valid.
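
A typical construction, using only the fields above (the values are illustrative):

package main

import (
	"log"

	"github.com/maruel/genai"
)

func main() {
	opts := genai.ChatOptions{
		SystemPrompt: "You are a terse assistant.",
		Temperature:  0.7,
		MaxTokens:    256,
		Seed:         42, // non-zero requests determinism where supported
		Stop:         []string{"\n\n"},
	}
	if err := opts.Validate(); err != nil {
		log.Fatal(err)
	}
}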

type ChatProvider

type ChatProvider interface {
	// Chat runs completion synchronously.
	//
	// opts must be either nil, *ChatOptions or a provider-specialized
	// option struct.
	Chat(ctx context.Context, msgs Messages, opts Validatable) (ChatResult, error)
	// ChatStream runs completion synchronously, streaming the results to the
	// replies channel.
	//
	// opts must be either nil, *ChatOptions or a provider-specialized
	// option struct.
	ChatStream(ctx context.Context, msgs Messages, opts Validatable, replies chan<- MessageFragment) error
}

ChatProvider is the generic interface to interact with an LLM backend.

Example
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/maruel/genai"
	"github.com/maruel/genai/anthropic"
	"github.com/maruel/genai/cerebras"
	"github.com/maruel/genai/cloudflare"
	"github.com/maruel/genai/cohere"
	"github.com/maruel/genai/deepseek"
	"github.com/maruel/genai/gemini"
	"github.com/maruel/genai/groq"
	"github.com/maruel/genai/huggingface"
	"github.com/maruel/genai/llamacpp"
	"github.com/maruel/genai/mistral"
	"github.com/maruel/genai/openai"
	"github.com/maruel/genai/perplexity"
	"github.com/maruel/genai/togetherai"
)

func main() {
	// Pro-tip: Using os.Stderr so if you modify this file and append a "// Output: foo"
	// at the end of this function, "go test" will run the code and stream the
	// output to you.
	completionProviders := map[string]genai.ChatProvider{}
	// https://docs.anthropic.com/en/docs/about-claude/models/all-models
	if c, err := anthropic.New("", "claude-3-7-sonnet-latest"); err == nil {
		completionProviders["anthropic"] = c
	}
	if c, err := cerebras.New("", "llama-3.3-70b"); err == nil {
		completionProviders["cerebras"] = c
	}
	// https://developers.cloudflare.com/workers-ai/models/
	if c, err := cloudflare.New("", "", "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b"); err == nil {
		completionProviders["cloudflare"] = c
	}
	// https://docs.cohere.com/v2/docs/models
	if c, err := cohere.New("", "command-r-plus"); err == nil {
		completionProviders["cohere"] = c
	}
	if c, err := deepseek.New("", "deepseek-reasoner"); err == nil {
		completionProviders["deepseek"] = c
	}
	// https://ai.google.dev/gemini-api/docs/models/gemini
	if c, err := gemini.New("", "gemini-2.0-flash"); err == nil {
		completionProviders["gemini"] = c
	}
	// https://console.groq.com/docs/models
	if c, err := groq.New("", "qwen-qwq-32b"); err == nil {
		completionProviders["groq"] = c
	}
	// https://huggingface.co/models?inference=warm&sort=trending
	if c, err := huggingface.New("", "Qwen/QwQ-32B"); err == nil {
		completionProviders["huggingface"] = c
	}
	if false {
		// See llamacpp/llamacppsrv to see how to run a local server.
		if c, err := llamacpp.New("http://localhost:8080", nil); err == nil {
			completionProviders["llamacpp"] = c
		}
	}
	// https://docs.mistral.ai/getting-started/models/models_overview/
	if c, err := mistral.New("", "mistral-large-latest"); err == nil {
		completionProviders["mistral"] = c
	}
	// https://platform.openai.com/docs/api-reference/models
	if c, err := openai.New("", "o3-mini"); err == nil {
		completionProviders["openai"] = c
	}
	if c, err := perplexity.New("", "sonar"); err == nil {
		completionProviders["perplexity"] = c
	}
	if c, err := togetherai.New("", "deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free"); err == nil {
		completionProviders["togetherai"] = c
	}

	for name, provider := range completionProviders {
		msgs := genai.Messages{
			genai.NewTextMessage(genai.User, "Tell a story in 10 words."),
		}
		response, err := provider.Chat(context.Background(), msgs, nil)
		if err != nil {
			fmt.Fprintf(os.Stderr, "- %s: %v\n", name, err)
		} else {
			fmt.Fprintf(os.Stderr, "- %s: %v\n", name, response)
		}
	}
}
Output:

type ChatResult

type ChatResult struct {
	Message
	Usage
	// contains filtered or unexported fields
}

ChatResult is the result of a completion.

type Content

type Content struct {

	// Text is the content of the text message.
	Text string

	// Filename is the name of the file. For many providers, only the extension
	// matters: the mime-type is derived from the filename's extension. When a
	// URL is provided, or when the object provided to Document implements a
	// method with the signature `Name() string`, like an `*os.File`, Filename
	// is optional.
	Filename string
	// Document is the raw document data. It is perfectly fine to use
	// bytes.NewReader() or an *os.File.
	Document io.ReadSeeker
	// URL is the reference to the raw data. When set, the mime-type is derived from the URL.
	URL string
	// contains filtered or unexported fields
}

Content is a block of content in the message meant to be visible in a chat setting.

The content can be text or a document. The document may be audio, video, image, PDF or any other format.
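
A sketch of building a multimodal message from these fields; the file name is illustrative and vision support varies by provider (see the table above):

package main

import (
	"log"
	"os"

	"github.com/maruel/genai"
)

func main() {
	f, err := os.Open("banana.jpg") // illustrative file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	msg := genai.Message{
		Role: genai.User,
		Contents: []genai.Content{
			{Text: "Describe this picture."},
			// Filename is optional here: *os.File implements Name() string.
			{Document: f},
		},
	}
	if err := msg.Validate(); err != nil {
		log.Fatal(err)
	}
}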

func (*Content) Decode

func (c *Content) Decode(x any) error

Decode decodes the JSON message into the struct.

Requires using either ReplyAsJSON or DecodeAs in the ChatOptions.

Note: this doesn't verify the type is the same as specified in ChatOptions.DecodeAs.

func (*Content) GetFilename

func (c *Content) GetFilename() string

GetFilename returns the filename to use for the document, querying the Document's name if available.

func (*Content) ReadDocument

func (c *Content) ReadDocument(maxSize int64) (string, []byte, error)

ReadDocument reads the document content into memory.

func (*Content) Validate

func (c *Content) Validate() error

Validate ensures the block is valid.

type Message

type Message struct {
	Role Role
	User string // Only used when Role == User. Only some providers (e.g. OpenAI, Groq, DeepSeek) support it.

	Contents []Content // For example when the LLM replies with multiple content blocks, an explanation and a code block.

	// ToolCalls is the list of tool calls that the LLM requested to make.
	ToolCalls []ToolCall
	// contains filtered or unexported fields
}

Message is a message to send to the LLM as part of the exchange.

The message may contain content, information to communicate between the user and the LLM. This is the Contents section. The content can be text or a document. The document may be audio, video, image, PDF or any other format.

The message may also contain tool calls. The tool call is a request from the LLM to answer a specific question, so the LLM can continue its process.

func NewTextMessage

func NewTextMessage(role Role, text string) Message

NewTextMessage is a shorthand function to create a Message with a single text block.

func (*Message) Validate

func (m *Message) Validate() error

Validate ensures the messages are valid.

type MessageFragment

type MessageFragment struct {
	TextFragment string

	Filename         string
	DocumentFragment []byte

	// ToolCall is a tool call that the LLM requested to make.
	ToolCall ToolCall
	// contains filtered or unexported fields
}

MessageFragment is a fragment of a message the LLM is sending back as part of the ChatStream().

Only one of the fields can be set.

func (*MessageFragment) Accumulate

func (m *MessageFragment) Accumulate(msgs Messages) (Messages, error)

Accumulate accumulates the message fragment into the list of messages.

The assumption is that the fragment is always a message from the Assistant.
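
A sketch of a streaming loop combining ChatStream with Accumulate; the provider setup mirrors the examples above, and who closes the replies channel is an assumption flagged in the comments:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/maruel/genai"
	"github.com/maruel/genai/cerebras"
)

func main() {
	c, err := cerebras.New("", "llama-3.3-70b")
	if err != nil {
		log.Fatal(err)
	}
	msgs := genai.Messages{
		genai.NewTextMessage(genai.User, "Tell a story in 10 words."),
	}
	replies := make(chan genai.MessageFragment)
	done := make(chan genai.Messages)
	go func() {
		var out genai.Messages
		for f := range replies {
			fmt.Print(f.TextFragment) // print text as it is generated
			var err error
			if out, err = f.Accumulate(out); err != nil {
				log.Print(err)
			}
		}
		done <- out
	}()
	err = c.ChatStream(context.Background(), msgs, nil, replies)
	// Assumption: the caller owns the channel; adjust if ChatStream closes it.
	close(replies)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("\naccumulated %d message(s)\n", len(<-done))
}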

type Messages

type Messages []Message

Messages is a list of valid messages in an exchange with an LLM.

The messages should alternate between the User and Assistant roles or, in the case of a multi-user discussion, come from different Users.

func (Messages) Validate

func (msgs Messages) Validate() error

Validate ensures the messages are valid.

type Model

type Model interface {
	GetID() string
	String() string
	Context() int64
}

Model represents a served model by the provider.

type ModelProvider

type ModelProvider interface {
	ListModels(ctx context.Context) ([]Model, error)
}

ModelProvider represents a provider that can list models.

Example
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/maruel/genai"
	"github.com/maruel/genai/anthropic"
	"github.com/maruel/genai/cerebras"
	"github.com/maruel/genai/cloudflare"
	"github.com/maruel/genai/cohere"
	"github.com/maruel/genai/deepseek"
	"github.com/maruel/genai/gemini"
	"github.com/maruel/genai/groq"
	"github.com/maruel/genai/huggingface"
	"github.com/maruel/genai/mistral"
	"github.com/maruel/genai/openai"
	"github.com/maruel/genai/togetherai"
)

func main() {
	// Pro-tip: Using os.Stderr so if you modify this file and append a "// Output: foo"
	// at the end of this function, "go test" will run the code and stream the
	// output to you.

	modelProviders := map[string]genai.ModelProvider{}
	if c, err := anthropic.New("", ""); err == nil {
		modelProviders["anthropic"] = c
	}
	if c, err := cerebras.New("", ""); err == nil {
		modelProviders["cerebras"] = c
	}
	if c, err := cloudflare.New("", "", ""); err == nil {
		modelProviders["cloudflare"] = c
	}
	if c, err := cohere.New("", ""); err == nil {
		modelProviders["cohere"] = c
	}
	if c, err := deepseek.New("", ""); err == nil {
		modelProviders["deepseek"] = c
	}
	if c, err := gemini.New("", ""); err == nil {
		modelProviders["gemini"] = c
	}
	if c, err := groq.New("", ""); err == nil {
		modelProviders["groq"] = c
	}
	if c, err := huggingface.New("", ""); err == nil {
		modelProviders["huggingface"] = c
	}
	// llamacpp doesn't implement ModelProvider.
	if c, err := mistral.New("", ""); err == nil {
		modelProviders["mistral"] = c
	}
	if c, err := openai.New("", ""); err == nil {
		modelProviders["openai"] = c
	}
	// perplexity doesn't implement ModelProvider.
	if c, err := togetherai.New("", ""); err == nil {
		modelProviders["togetherai"] = c
	}

	for name, p := range modelProviders {
		models, err := p.ListModels(context.Background())
		fmt.Fprintf(os.Stderr, "%s:\n", name)
		if err != nil {
			fmt.Fprintf(os.Stderr, "  Failed to get models: %v\n", err)
		}
		for _, model := range models {
			fmt.Fprintf(os.Stderr, "- %s\n", model)
		}
	}
}
Output:

type ReflectedToJSON

type ReflectedToJSON any

ReflectedToJSON must be a pointer to a struct that can be decoded by encoding/json and can have jsonschema tags.

It is recommended to use jsonschema_description tags to describe each field or argument.

Use jsonschema:"enum=..." to enforce a specific value within a set.
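
For example, a struct passed as ChatOptions.DecodeAs or ToolDef.InputsAs might be tagged like this; the field semantics are illustrative, and the exact enum syntax follows the underlying jsonschema library:

// Forecast is an illustrative struct for DecodeAs or InputsAs.
type Forecast struct {
	// Described for the LLM via jsonschema_description.
	City string `json:"city" jsonschema_description:"City name, without the country"`
	// Constrained to one of the enum values.
	Unit string `json:"unit" jsonschema:"enum=celsius,enum=fahrenheit"`
}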

type Role

type Role string

Role is one of the known LLM roles.

const (
	// User is the user's inputs. There can be multiple users in a conversation.
	// They are differentiated by the Message.User field.
	User Role = "user"
	// Assistant is the LLM.
	Assistant Role = "assistant"
	// Computer is the user's computer, it replies to tool calls.
	Computer Role = "computer"
)

Known LLM roles. Not all systems support all roles.

func (Role) Validate

func (r Role) Validate() error

Validate ensures the role is valid.

type ToolCall

type ToolCall struct {
	ID        string // Unique identifier for the tool call. Necessary for parallel tool calling.
	Name      string // Tool being called.
	Arguments string // encoded as JSON
	// contains filtered or unexported fields
}

ToolCall is a tool call that the LLM requested to make.

func (*ToolCall) Decode

func (t *ToolCall) Decode(x any) error

Decode decodes the JSON tool call.

This function doesn't validate that x has the same type as InputsAs in the ToolDef.

type ToolDef

type ToolDef struct {
	// Name must be unique among all tools.
	Name string
	// Description must be an LLM-friendly short description of the tool.
	Description string
	// InputsAs enforces a tool call with a specific JSON structure for
	// arguments.
	InputsAs ReflectedToJSON
	// contains filtered or unexported fields
}

ToolDef describes a tool that the LLM can request to use.

func (*ToolDef) Validate

func (t *ToolDef) Validate() error

Validate ensures the tool definition is valid.

type Usage

type Usage struct {
	InputTokens  int64
	OutputTokens int64
	// contains filtered or unexported fields
}

Usage from the LLM provider.

type Validatable

type Validatable interface {
	Validate() error
}

Validatable is an interface to an object that can be validated.

Directories

Path Synopsis
Package anthropic implements a client for the Anthropic API, to use Claude.
Package cerebras implements a client for the Cerebras API.
Package cloudflare implements a client for the Cloudflare AI API.
cmd
Package cohere implements a client for the Cohere API.
Package deepseek implements a client for the DeepSeek API.
Package gemini implements a client for Google's Gemini API.
Package groq implements a client for the Groq API.
Package huggingface implements a client for the HuggingFace serverless inference API.
Package internal is awesome sauce.
Package internaltest is awesome sauce for unit testing.
Package llamacpp implements a client for the llama-server native API, not the OpenAI compatible one.
Package llamacppsrv downloads and starts llama-server from llama.cpp, directly from GitHub releases.
Package mistral implements a client for the Mistral API.
Package ollama implements a client for the Ollama API.
Package ollamasrv downloads and starts ollama directly from GitHub releases.
Package openai implements a client for the OpenAI API.
Package perplexity implements a client for the Perplexity API.
Package togetherai implements a client for the Together.ai API.
