genai

package module
v0.0.0-...-0071670
Published: Mar 25, 2025 License: Apache-2.0 Imports: 11 Imported by: 1

README

genai

The high-performance, low-level native Go client for LLMs.

| Provider | Country | Chat | Streaming | Vision | PDF | Audio | Video | JSON output | JSON schema | Seed | Tools |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anthropic | 🇺🇸 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Cerebras | 🇺🇸 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Cloudflare Workers AI | 🇺🇸 | ✅ | ✅ | ⏳ | ❌ | ⏳ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Cohere | 🇨🇦 | ✅ | ✅ | ⏳ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| DeepSeek | 🇨🇳 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| Google's Gemini | 🇺🇸 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Groq | 🇺🇸 | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ✅ | ✅ |
| HuggingFace | 🇺🇸 | ✅ | ✅ | ⏳ | ⏳ | ❌ | ❌ | ⏳ | ⏳ | ✅ | ✅ |
| llama.cpp | N/A | ✅ | ✅ | ⏳ | ⏳ | ⏳ | ⏳ | ⏳ | ⏳ | ✅ | ⏳ |
| Mistral | 🇫🇷 | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Ollama | N/A | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ |
| OpenAI | 🇺🇸 | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
| Perplexity | 🇺🇸 | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ⏳ | ❌ | ❌ |
| TogetherAI | 🇺🇸 | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |
  • ✅ Implemented
  • ⏳ To be implemented
  • ❌ Not supported
  • Streaming: chat streaming
  • Vision: ability to process an image as input; most providers support PNG, JPG, WEBP and non-animated GIF
  • PDF: ability to process a PDF as input, possibly with OCR
  • Video: ability to process a video (e.g. MP4) as input
  • JSON output/schema: ability to output JSON in free form or with a schema
  • Seed: deterministic seed for reproducibility
  • Tools: tool calling

Features

  • Full functionality: Full access to each backend-specific functionality. Access the raw API if needed, with the full message schema as Go structs.
  • Native JSON struct serialization: Pass a struct to tell the LLM what to generate, then decode the reply into your struct. No need to manually fiddle with JSON. Supports required fields, enums, descriptions, etc.
  • Native tool calling: Tell the LLM to call a tool directly, described as a Go struct (see the sketch after this list). No need to manually fiddle with JSON.
  • Streaming: Streams the completion reply as the output is being generated.
  • Vision: Process images, PDFs and videos (!) as input.
  • Unit testing friendly: Record and play back API calls at the HTTP level.
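
For instance, here is a minimal tool calling sketch, mirroring the style of the examples below; the tool name, argument struct and prompt are illustrative, not part of the API:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/maruel/genai"
	"github.com/maruel/genai/cerebras"
)

// Weather describes the arguments of the hypothetical get_weather tool.
type Weather struct {
	City string `json:"city" jsonschema_description:"City to get the weather for"`
}

func main() {
	c, err := cerebras.New("", "llama-3.3-70b")
	if err != nil {
		log.Fatal(err)
	}
	msgs := genai.Messages{
		genai.NewTextMessage(genai.User, "What is the weather in Paris? Use the tool."),
	}
	opts := genai.ChatOptions{
		Tools: []genai.ToolDef{
			{
				Name:        "get_weather",
				Description: "Get the current weather for a city.",
				InputsAs:    &Weather{},
			},
		},
	}
	resp, err := c.Chat(context.Background(), msgs, &opts)
	if err != nil {
		log.Fatal(err)
	}
	// The LLM replies with a tool call instead of plain text.
	for _, tc := range resp.ToolCalls {
		w := Weather{}
		if err := tc.Decode(&w); err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s(%q)\n", tc.Name, w.City)
	}
}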

Implementation is in flux. :)


Design

  • Safe and strict API implementation. All you love from a statically typed language. Immediately fails on unknown RPC fields. Error code paths are properly implemented.
  • Stateless: no global state; clients are safe to use concurrently without locks.
  • Professional grade: unit tested on live services.
  • Optimized for speed: minimize memory allocations, compress data at the transport layer when possible.
  • Lean: Few dependencies. No unnecessary abstraction layer.
  • Easy to add new providers.

I'm poor 💸

As of March 2025, several of the providers above offer a free tier (other limits apply).

HTTP transport compression

Each service provider was manually tested to see if it accepts a compressed POST body.

As of March 2025, here's the HTTP POST compression supported by each provider:

| Provider | Compression accepted for POST data | Response compressed |
| --- | --- | --- |
| Anthropic | none | gzip |
| Cerebras | none | none |
| Cloudflare Workers AI | none | gzip |
| Cohere | none | none |
| DeepSeek | none | gzip |
| Google's Gemini | gzip | gzip |
| Groq | none | br |
| HuggingFace | gzip, br or zstd | none |
| Mistral | none | br |
| OpenAI | none | br |
| Perplexity | none | none |

This matters if you care about your ingress/egress bandwidth. Only HuggingFace accepts brotli and zstd for POST data, yet it replies uncompressed (!). Google accepts gzip both ways.
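
For illustration, here is roughly what gzip-compressing a POST body looks like at the net/http level. This is a generic sketch (the URL and payload are placeholders), not this package's transport code:

package main

import (
	"bytes"
	"compress/gzip"
	"log"
	"net/http"
)

func main() {
	// Compress the JSON payload before sending it.
	body := []byte(`{"text":"Hi"}`)
	buf := &bytes.Buffer{}
	w := gzip.NewWriter(buf)
	if _, err := w.Write(body); err != nil {
		log.Fatal(err)
	}
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
	req, err := http.NewRequest("POST", "https://example.com/v1/chat", buf) // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")
	// Tells the server the request body is gzip-compressed.
	req.Header.Set("Content-Encoding", "gzip")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}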

Look and feel

Decoding the answer as a typed struct

Tell the LLM to use a specific JSON schema to generate the response.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/maruel/genai"
	"github.com/maruel/genai/cerebras"
)

type Circle struct {
	Round bool `json:"round"`
}

func main() {
	c, err := cerebras.New("", "llama3.1-8b")
	if err != nil {
		log.Fatal(err)
	}
	msgs := genai.Messages{
		genai.NewTextMessage(genai.User, "Is a circle round? Reply as JSON."),
	}
	opts := genai.ChatOptions{
		Seed:        1,
		Temperature: 0.01,
		MaxTokens:   50,
		DecodeAs:    &Circle{},
	}
	resp, err := c.Chat(context.Background(), msgs, &opts)
	if err != nil {
		log.Fatal(err)
	}
	got := Circle{}
	if err := resp.Contents[0].Decode(&got); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Round: %v\n", got.Round)
}

Models

Snapshot of all the supported models: MODELS.md.

Try it:

go install github.com/maruel/genai/cmd/...@latest
list-models -provider huggingface

TODO

  • Audio out
  • Video out
  • Batch
  • Tuning
  • Embeddings
  • Handle rate limiting
  • Moderation
  • Thinking
  • Content Blocks
  • Citations

Documentation

Overview

Package genai is the high-performance native Go client for LLMs.

It provides a generic interface to interact with various LLM providers.

Check out the examples for a quick start.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ChatOptions

type ChatOptions struct {

	// Temperature adjusts the creativity of the sampling. Generally between 0 and 2.
	Temperature float64
	// TopP adjusts nucleus sampling, between 0 and 1. The higher the value, the
	// more diverse the output.
	TopP float64
	// MaxTokens is the maximum number of tokens to generate. Used to limit it
	// lower than the default maximum, for budget reasons.
	MaxTokens int64
	// SystemPrompt is the prompt to use for the system role.
	SystemPrompt string

	// Seed for the random number generator. Default is 0 which means
	// non-deterministic.
	Seed int64
	// TopK adjusts sampling where only the N first candidates are considered.
	TopK int64
	// Stop is the list of tokens to stop generation.
	Stop []string

	// ReplyAsJSON enforces the output to be valid JSON, any JSON. It is
	// important to tell the model to reply in JSON in the prompt itself.
	ReplyAsJSON bool
	// DecodeAs enforces a reply with a specific JSON structure. It is important
	// to tell the model to reply in JSON in the prompt itself.
	DecodeAs ReflectedToJSON
	// Tools is the list of tools that the LLM can request to call.
	Tools []ToolDef
	// contains filtered or unexported fields
}

ChatOptions is a list of frequent options supported by most ChatProvider implementations. Each provider is free to support more options through a specialized struct.

func (*ChatOptions) Validate

func (c *ChatOptions) Validate() error

Validate ensures the completion options are valid.
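
A typical construction, using only the fields above (the values are illustrative):

package main

import (
	"log"

	"github.com/maruel/genai"
)

func main() {
	opts := genai.ChatOptions{
		SystemPrompt: "You are a terse assistant.",
		Temperature:  0.7,
		MaxTokens:    256,
		Seed:         42, // non-zero requests determinism where supported
		Stop:         []string{"\n\n"},
	}
	if err := opts.Validate(); err != nil {
		log.Fatal(err)
	}
}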

type ChatProvider

type ChatProvider interface {
	// Chat runs completion synchronously.
	//
	// opts must be either nil, *ChatOptions or a provider-specialized
	// option struct.
	Chat(ctx context.Context, msgs Messages, opts Validatable) (ChatResult, error)
	// ChatStream runs completion synchronously, streaming the results to the
	// replies channel.
	//
	// opts must be either nil, *ChatOptions or a provider-specialized
	// option struct.
	ChatStream(ctx context.Context, msgs Messages, opts Validatable, replies chan<- MessageFragment) error
}

ChatProvider is the generic interface to interact with an LLM backend.

Example
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/maruel/genai"
	"github.com/maruel/genai/anthropic"
	"github.com/maruel/genai/cerebras"
	"github.com/maruel/genai/cloudflare"
	"github.com/maruel/genai/cohere"
	"github.com/maruel/genai/deepseek"
	"github.com/maruel/genai/gemini"
	"github.com/maruel/genai/groq"
	"github.com/maruel/genai/huggingface"
	"github.com/maruel/genai/llamacpp"
	"github.com/maruel/genai/mistral"
	"github.com/maruel/genai/openai"
	"github.com/maruel/genai/perplexity"
	"github.com/maruel/genai/togetherai"
)

func main() {
	// Pro-tip: Using os.Stderr so if you modify this file and append a "// Output: foo"
	// at the end of this function, "go test" will run the code and stream the
	// output to you.
	completionProviders := map[string]genai.ChatProvider{}
	// https://docs.anthropic.com/en/docs/about-claude/models/all-models
	if c, err := anthropic.New("", "claude-3-7-sonnet-latest"); err == nil {
		completionProviders["anthropic"] = c
	}
	if c, err := cerebras.New("", "llama-3.3-70b"); err == nil {
		completionProviders["cerebras"] = c
	}
	// https://developers.cloudflare.com/workers-ai/models/
	if c, err := cloudflare.New("", "", "@cf/deepseek-ai/deepseek-r1-distill-qwen-32b"); err == nil {
		completionProviders["cloudflare"] = c
	}
	// https://docs.cohere.com/v2/docs/models
	if c, err := cohere.New("", "command-r-plus"); err == nil {
		completionProviders["cohere"] = c
	}
	if c, err := deepseek.New("", "deepseek-reasoner"); err == nil {
		completionProviders["deepseek"] = c
	}
	// https://ai.google.dev/gemini-api/docs/models/gemini
	if c, err := gemini.New("", "gemini-2.0-flash"); err == nil {
		completionProviders["gemini"] = c
	}
	// https://console.groq.com/docs/models
	if c, err := groq.New("", "qwen-qwq-32b"); err == nil {
		completionProviders["groq"] = c
	}
	// https://huggingface.co/models?inference=warm&sort=trending
	if c, err := huggingface.New("", "Qwen/QwQ-32B"); err == nil {
		completionProviders["huggingface"] = c
	}
	if false {
		// See llamacpp/llamacppsrv to see how to run a local server.
		if c, err := llamacpp.New("http://localhost:8080", nil); err == nil {
			completionProviders["llamacpp"] = c
		}
	}
	// https://docs.mistral.ai/getting-started/models/models_overview/
	if c, err := mistral.New("", "mistral-large-latest"); err == nil {
		completionProviders["mistral"] = c
	}
	// https://platform.openai.com/docs/api-reference/models
	if c, err := openai.New("", "o3-mini"); err == nil {
		completionProviders["openai"] = c
	}
	if c, err := perplexity.New("", "sonar"); err == nil {
		completionProviders["perplexity"] = c
	}
	if c, err := togetherai.New("", "deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free"); err == nil {
		completionProviders["togetherai"] = c
	}

	for name, provider := range completionProviders {
		msgs := genai.Messages{
			genai.NewTextMessage(genai.User, "Tell a story in 10 words."),
		}
		response, err := provider.Chat(context.Background(), msgs, nil)
		if err != nil {
			fmt.Fprintf(os.Stderr, "- %s: %v\n", name, err)
		} else {
			fmt.Fprintf(os.Stderr, "- %s: %v\n", name, response)
		}
	}
}
Output:

type ChatResult

type ChatResult struct {
	Message
	Usage
	// contains filtered or unexported fields
}

ChatResult is the result of a completion.

type Content

type Content struct {

	// Text is the content of the text message.
	Text string

	// Filename is the name of the file. For many providers, only the extension
	// matters: the mime-type is derived from the filename's extension. When a
	// URL is provided, or when the object provided to Document implements a
	// method with the signature `Name() string`, like an `*os.File`, Filename
	// is optional.
	Filename string
	// Document is the raw document data. It is perfectly fine to use
	// bytes.NewReader() or an *os.File.
	Document io.ReadSeeker
	// URL is the reference to the raw data. When set, the mime-type is derived from the URL.
	URL string
	// contains filtered or unexported fields
}

Content is a block of content in the message meant to be visible in a chat setting.

The content can be text or a document. The document may be audio, video, image, PDF or any other format.
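
A sketch of building a multimodal message from these fields; the file name is illustrative and vision support varies by provider (see the table above):

package main

import (
	"log"
	"os"

	"github.com/maruel/genai"
)

func main() {
	f, err := os.Open("banana.jpg") // illustrative file
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	msg := genai.Message{
		Role: genai.User,
		Contents: []genai.Content{
			{Text: "Describe this picture."},
			// Filename is optional here: *os.File implements Name() string.
			{Document: f},
		},
	}
	if err := msg.Validate(); err != nil {
		log.Fatal(err)
	}
}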

func (*Content) Decode

func (c *Content) Decode(x any) error

Decode decodes the JSON message into the struct.

Requires using either ReplyAsJSON or DecodeAs in the ChatOptions.

Note: this doesn't verify the type is the same as specified in ChatOptions.DecodeAs.

func (*Content) GetFilename

func (c *Content) GetFilename() string

GetFilename returns the filename to use for the document, querying the Document's name if available.

func (*Content) ReadDocument

func (c *Content) ReadDocument(maxSize int64) (string, []byte, error)

ReadDocument reads the document content into memory.

func (*Content) Validate

func (c *Content) Validate() error

Validate ensures the block is valid.

type Message

type Message struct {
	Role Role
	User string // Only used when Role == User. Only some providers (e.g. OpenAI, Groq, DeepSeek) support it.

	Contents []Content // For example when the LLM replies with multiple content blocks, an explanation and a code block.

	// ToolCalls is the list of tool calls that the LLM requested to make.
	ToolCalls []ToolCall
	// contains filtered or unexported fields
}

Message is a message to send to the LLM as part of the exchange.

The message may contain content, information to communicate between the user and the LLM. This is the Contents section. The content can be text or a document. The document may be audio, video, image, PDF or any other format.

The message may also contain tool calls. The tool call is a request from the LLM to answer a specific question, so the LLM can continue its process.

func NewTextMessage

func NewTextMessage(role Role, text string) Message

NewTextMessage is a shorthand function to create a Message with a single text block.

func (*Message) Validate

func (m *Message) Validate() error

Validate ensures the messages are valid.

type MessageFragment

type MessageFragment struct {
	TextFragment string

	Filename         string
	DocumentFragment []byte

	// ToolCall is a tool call that the LLM requested to make.
	ToolCall ToolCall
	// contains filtered or unexported fields
}

MessageFragment is a fragment of a message the LLM is sending back as part of the ChatStream().

Only one of the fields can be set.

func (*MessageFragment) Accumulate

func (m *MessageFragment) Accumulate(msgs Messages) (Messages, error)

Accumulate accumulates the message fragment into the list of messages.

The assumption is that the fragment is always a message from the Assistant.
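
A sketch of a streaming loop combining ChatStream with Accumulate; the provider setup mirrors the examples above, and who closes the replies channel is an assumption flagged in the comments:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/maruel/genai"
	"github.com/maruel/genai/cerebras"
)

func main() {
	c, err := cerebras.New("", "llama-3.3-70b")
	if err != nil {
		log.Fatal(err)
	}
	msgs := genai.Messages{
		genai.NewTextMessage(genai.User, "Tell a story in 10 words."),
	}
	replies := make(chan genai.MessageFragment)
	done := make(chan genai.Messages)
	go func() {
		var out genai.Messages
		for f := range replies {
			fmt.Print(f.TextFragment) // print text as it is generated
			var err error
			if out, err = f.Accumulate(out); err != nil {
				log.Print(err)
			}
		}
		done <- out
	}()
	err = c.ChatStream(context.Background(), msgs, nil, replies)
	// Assumption: the caller owns the channel; adjust if ChatStream closes it.
	close(replies)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("\naccumulated %d message(s)\n", len(<-done))
}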

type Messages

type Messages []Message

Messages is a list of valid messages in an exchange with an LLM.

The messages should alternate between the User and Assistant roles or, in the case of a multi-user discussion, come from different Users.

func (Messages) Validate

func (msgs Messages) Validate() error

Validate ensures the messages are valid.

type Model

type Model interface {
	GetID() string
	String() string
	Context() int64
}

Model represents a served model by the provider.

type ModelProvider

type ModelProvider interface {
	ListModels(ctx context.Context) ([]Model, error)
}

ModelProvider represents a provider that can list models.

Example
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/maruel/genai"
	"github.com/maruel/genai/anthropic"
	"github.com/maruel/genai/cerebras"
	"github.com/maruel/genai/cloudflare"
	"github.com/maruel/genai/cohere"
	"github.com/maruel/genai/deepseek"
	"github.com/maruel/genai/gemini"
	"github.com/maruel/genai/groq"
	"github.com/maruel/genai/huggingface"
	"github.com/maruel/genai/mistral"
	"github.com/maruel/genai/openai"
	"github.com/maruel/genai/togetherai"
)

func main() {
	// Pro-tip: Using os.Stderr so if you modify this file and append a "// Output: foo"
	// at the end of this function, "go test" will run the code and stream the
	// output to you.

	modelProviders := map[string]genai.ModelProvider{}
	if c, err := anthropic.New("", ""); err == nil {
		modelProviders["anthropic"] = c
	}
	if c, err := cerebras.New("", ""); err == nil {
		modelProviders["cerebras"] = c
	}
	if c, err := cloudflare.New("", "", ""); err == nil {
		modelProviders["cloudflare"] = c
	}
	if c, err := cohere.New("", ""); err == nil {
		modelProviders["cohere"] = c
	}
	if c, err := deepseek.New("", ""); err == nil {
		modelProviders["deepseek"] = c
	}
	if c, err := gemini.New("", ""); err == nil {
		modelProviders["gemini"] = c
	}
	if c, err := groq.New("", ""); err == nil {
		modelProviders["groq"] = c
	}
	if c, err := huggingface.New("", ""); err == nil {
		modelProviders["huggingface"] = c
	}
	// llamacpp doesn't implement ModelProvider.
	if c, err := mistral.New("", ""); err == nil {
		modelProviders["mistral"] = c
	}
	if c, err := openai.New("", ""); err == nil {
		modelProviders["openai"] = c
	}
	// perplexity doesn't implement ModelProvider.
	if c, err := togetherai.New("", ""); err == nil {
		modelProviders["togetherai"] = c
	}

	for name, p := range modelProviders {
		models, err := p.ListModels(context.Background())
		fmt.Fprintf(os.Stderr, "%s:\n", name)
		if err != nil {
			fmt.Fprintf(os.Stderr, "  Failed to get models: %v\n", err)
		}
		for _, model := range models {
			fmt.Fprintf(os.Stderr, "- %s\n", model)
		}
	}
}
Output:

type ReflectedToJSON

type ReflectedToJSON any

ReflectedToJSON must be a pointer to a struct that can be decoded by encoding/json and can have jsonschema tags.

It is recommended to use jsonschema_description tags to describe each field or argument.

Use jsonschema:"enum=..." to enforce a specific value within a set.
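
For example, a struct passed as ChatOptions.DecodeAs or ToolDef.InputsAs might be tagged like this; the field semantics are illustrative, and the exact enum syntax follows the underlying jsonschema library:

// Forecast is an illustrative struct for DecodeAs or InputsAs.
type Forecast struct {
	// Described for the LLM via jsonschema_description.
	City string `json:"city" jsonschema_description:"City name, without the country"`
	// Constrained to one of the enum values.
	Unit string `json:"unit" jsonschema:"enum=celsius,enum=fahrenheit"`
}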

type Role

type Role string

Role is one of the known LLM roles.

const (
	// User is the user's inputs. There can be multiple users in a conversation.
	// They are differentiated by the Message.User field.
	User Role = "user"
	// Assistant is the LLM.
	Assistant Role = "assistant"
	// Computer is the user's computer, it replies to tool calls.
	Computer Role = "computer"
)

Known LLM roles. Not all systems support all roles.

func (Role) Validate

func (r Role) Validate() error

Validate ensures the role is valid.

type ToolCall

type ToolCall struct {
	ID        string // Unique identifier for the tool call. Necessary for parallel tool calling.
	Name      string // Tool being called.
	Arguments string // encoded as JSON
	// contains filtered or unexported fields
}

ToolCall is a tool call that the LLM requested to make.

func (*ToolCall) Decode

func (t *ToolCall) Decode(x any) error

Decode decodes the JSON tool call.

This function doesn't validate that x has the same type as InputsAs in the ToolDef.

type ToolDef

type ToolDef struct {
	// Name must be unique among all tools.
	Name string
	// Description must be an LLM-friendly short description of the tool.
	Description string
	// InputsAs enforces a tool call with a specific JSON structure for
	// arguments.
	InputsAs ReflectedToJSON
	// contains filtered or unexported fields
}

ToolDef describes a tool that the LLM can request to use.

func (*ToolDef) Validate

func (t *ToolDef) Validate() error

Validate ensures the tool definition is valid.

type Usage

type Usage struct {
	InputTokens  int64
	OutputTokens int64
	// contains filtered or unexported fields
}

Usage from the LLM provider.

type Validatable

type Validatable interface {
	Validate() error
}

Validatable is an interface to an object that can be validated.

Directories

Path Synopsis
Package anthropic implements a client for the Anthropic API, to use Claude.
Package cerebras implements a client for the Cerebras API.
Package cloudflare implements a client for the Cloudflare AI API.
cmd
Package cohere implements a client for the Cohere API.
Package deepseek implements a client for the DeepSeek API.
Package gemini implements a client for Google's Gemini API.
Package groq implements a client for the Groq API.
Package huggingface implements a client for the HuggingFace serverless inference API.
Package internal is awesome sauce.
Package internaltest is awesome sauce for unit testing.
Package llamacpp implements a client for the llama-server native API, not the OpenAI compatible one.
Package llamacppsrv downloads and starts llama-server from llama.cpp, directly from GitHub releases.
Package mistral implements a client for the Mistral API.
Package ollama implements a client for the Ollama API.
Package ollamasrv downloads and starts ollama directly from GitHub releases.
Package openai implements a client for the OpenAI API.
Package perplexity implements a client for the Perplexity API.
Package togetherai implements a client for the Together.ai API.
