grail

v0.3.0
Published: Dec 17, 2025 | License: MIT | Imports: 10 | Imported by: 0

README

grail

A lightweight Go SDK that unifies multiple AI providers behind a consistent interface for text and image generation.

Design Goals

  • One client for text & image generation across providers
  • Provider-agnostic by default, extensible when needed
  • Multimodal-first: ordered text + image inputs
  • Flexible configuration: client, provider, and per-request options
  • Type-safe errors: typed error codes for predictable error handling

Installation

go get github.com/montanaflynn/grail

Quick Start

// Create a provider (reads OPENAI_API_KEY from the environment when no key is passed).
// Errors are discarded with _ throughout for brevity; handle them in real code.
provider, _ := openai.New()

// Create a client
client := grail.NewClient(provider)

// Generate text
res, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{grail.InputText("Create a haiku")},
	Output: grail.OutputText(),
})
text, _ := res.Text()
fmt.Println(text)

// Generate image
imgRes, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{grail.InputText("A beautiful sunset")},
	Output: grail.OutputImage(grail.ImageSpec{Count: 1}),
})
imgs, _ := imgRes.Images()
os.WriteFile("sunset.png", imgs[0], 0644)

// Generate image with provider-specific options
// (requires importing "github.com/montanaflynn/grail/providers/gemini" at the top of the file)
imgRes2, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{grail.InputText("A landscape photo")},
	Output: grail.OutputImage(grail.ImageSpec{Count: 1}),
	ProviderOptions: []grail.ProviderOption{
		gemini.WithImageAspectRatio(gemini.ImageAspectRatio16_9),
		gemini.WithImageSize(gemini.ImageSize2K),
	},
})

// Image understanding (text from image)
imgData, _ := os.ReadFile("photo.jpg")
imgInput := grail.InputImage(imgData)
textRes, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{
		grail.InputText("Describe this image"),
		imgInput,
	},
	Output: grail.OutputText(),
})
text, _ = textRes.Text() // text was declared above, so assign with =
fmt.Println(text)

// Multimodal image generation (image from text + image)
imgRes3, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{
		grail.InputText("Create a variation of this image"),
		imgInput,
		grail.InputText("but make it more colorful"),
	},
	Output: grail.OutputImage(grail.ImageSpec{Count: 1}),
})
imgs, _ = imgRes3.Images() // imgs was declared above, so assign with =
os.WriteFile("variation.png", imgs[0], 0644)

// PDF understanding (text from PDF)
pdfData, _ := os.ReadFile("document.pdf")
pdfRes, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{
		grail.InputText("Summarize this document"),
		grail.InputPDF(pdfData),
	},
	Output: grail.OutputText(),
})
text, _ = pdfRes.Text()
fmt.Println(text)

// Model selection: explicit model name
res, _ = client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{grail.InputText("Hello")},
	Output: grail.OutputText(),
	Model:  "gpt-4o",  // Use this specific model
})

// Model selection: tier-based (provider picks the right model)
res, _ = client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{grail.InputText("Hello")},
	Output: grail.OutputText(),
	Tier:   grail.ModelTierFast,  // Let provider pick the fast text model
})

// Query available models
models, _ := client.ListModels(ctx)
for _, m := range models {
	fmt.Printf("%s: role=%s tier=%s\n", m.Name, m.Role, m.Tier)
}

// Get specific model by role and tier
model, _ := client.GetModel(ctx, grail.ModelRoleText, grail.ModelTierBest)
fmt.Printf("Best text model: %s\n", model.Name)

Examples

See the examples/ directory for complete, runnable examples.

Providers

OpenAI

import "github.com/montanaflynn/grail/providers/openai"

// Basic usage (uses OPENAI_API_KEY env var)
provider, err := openai.New()

// With options
provider, err := openai.New(
    openai.WithAPIKey("sk-..."),
    openai.WithTextModel("gpt-4"),
    openai.WithImageModel("gpt-image-1"),
    openai.WithLogger(logger),
)

Options:

  • WithAPIKey(key string) - Set API key explicitly
  • WithAPIKeyFromEnv(env string) - Read API key from environment variable
  • WithTextModel(model string) - Override default text model (default: gpt-5.2)
  • WithImageModel(model string) - Override default image model (default: gpt-image-1)
  • WithLogger(logger *slog.Logger) - Set custom logger

Image Options:

  • WithImageFormat(format ImageFormat) - Set output format (png, jpeg, webp)
  • WithImageBackground(bg ImageBackground) - Set background (auto, transparent, opaque)
  • WithImageSize(size ImageSize) - Set image size (auto, 1024x1024, 1536x1024, 1024x1536, 256x256, 512x512, 1792x1024, 1024x1792)
  • WithImageModeration(moderation ImageModeration) - Set moderation level (auto, low)
  • WithImageOutputCompression(compression int) - Set output compression quality (0-100)

Text Options:

  • TextOptions{Model, MaxTokens, Temperature, TopP, SystemPrompt} - Provider-specific text generation options

Gemini

import "github.com/montanaflynn/grail/providers/gemini"

// Basic usage (uses GEMINI_API_KEY env var)
provider, err := gemini.New(ctx)

// With options
provider, err := gemini.New(ctx,
    gemini.WithAPIKey("..."),
    gemini.WithTextModel("gemini-3-flash-preview"),
    gemini.WithImageModel("gemini-2.5-flash-image"),
    gemini.WithLogger(logger),
)

Options:

  • WithAPIKey(key string) - Set API key explicitly
  • WithAPIKeyFromEnv(env string) - Read API key from environment variable
  • WithTextModel(model string) - Override default text model (default: gemini-3-flash-preview)
  • WithImageModel(model string) - Override default image model (default: gemini-2.5-flash-image)
  • WithLogger(logger *slog.Logger) - Set custom logger

Image Options:

  • WithImageAspectRatio(ratio ImageAspectRatio) - Set aspect ratio (1:1, 16:9, etc.)
  • WithImageSize(size ImageSize) - Set image size (1K, 2K, 4K)

Text Options:

  • TextOptions{Model, MaxTokens, Temperature, TopP, SystemPrompt} - Provider-specific text generation options

Development

# Run tests
go test ./...

# Format code
go fmt ./...

# Run linter
go vet ./...

# Or use make
make format
make lint
make test
make # runs all

Contributing

See CONTRIBUTING.md for contribution guidelines.

Changelog

See CHANGELOG.md for a detailed list of changes.

License

MIT License - see LICENSE for details.

Documentation

Overview

Package grail provides a unified interface for AI text and image generation across multiple providers (OpenAI, Gemini, etc.). It supports multimodal inputs (ordered sequences of text, images, and PDFs) and provides type-safe error handling, structured logging, and flexible configuration options.

Example usage:

provider, _ := openai.New()
client := grail.NewClient(provider)
res, _ := client.Generate(ctx, grail.Request{
	Inputs: []grail.Input{grail.InputText("Hello, world!")},
	Output: grail.OutputText(),
})

Sub-packages:

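This package provides the core client and interfaces. Provider implementations are available in sub-packages:

  • providers/openai - OpenAI implementation of the grail.Provider interface
  • providers/gemini - Google Gemini implementation of the grail.Provider interface
  • providers/mock - test double implementation of the grail.Provider interface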

Constants

const (
	MaxPDFSize  = 50 * 1024 * 1024  // 50 MB
	MaxFileSize = 100 * 1024 * 1024 // 100 MB
)

Variables

var LoggerLevels = map[string]LoggerLevel{
	"debug": LoggerLevelDebug,
	"info":  LoggerLevelInfo,
	"warn":  LoggerLevelWarn,
	"error": LoggerLevelError,
}
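
A map like LoggerLevels is convenient for turning a user-supplied string (a flag or environment variable, say) into a level. The following is a minimal sketch of that lookup using a local stand-in map over slog.Level; the parseLevel helper, the case-insensitive matching, and the fallback behaviour are assumptions for illustration and are not part of grail.

```go
package main

import (
	"fmt"
	"log/slog"
	"strings"
)

// levels is a local stand-in for grail.LoggerLevels, using slog.Level directly.
var levels = map[string]slog.Level{
	"debug": slog.LevelDebug,
	"info":  slog.LevelInfo,
	"warn":  slog.LevelWarn,
	"error": slog.LevelError,
}

// parseLevel (hypothetical helper) trims and lower-cases name, looks it up,
// and returns fallback when the name is not a known level.
func parseLevel(name string, fallback slog.Level) slog.Level {
	if lvl, ok := levels[strings.ToLower(strings.TrimSpace(name))]; ok {
		return lvl
	}
	return fallback
}

func main() {
	fmt.Println(parseLevel("WARN", slog.LevelInfo))    // known name, matched case-insensitively
	fmt.Println(parseLevel("verbose", slog.LevelInfo)) // unknown name, falls back
}
```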

Functions

func AsFileInput added in v0.2.0

func AsFileInput(input Input) ([]byte, string, string, bool)

func AsFileReaderInput added in v0.2.0

func AsFileReaderInput(input Input) (io.Reader, int64, string, string, bool)

func AsTextInput added in v0.2.0

func AsTextInput(input Input) (string, bool)

Type assertion helpers for providers

func GetJSONOutput added in v0.2.0

func GetJSONOutput(output Output) (schema any, strict bool, ok bool)

func IsRateLimited added in v0.2.0

func IsRateLimited(err error) bool

func IsRefused added in v0.2.0

func IsRefused(err error) bool

func IsRetryable added in v0.2.0

func IsRetryable(err error) bool

func IsTextOutput added in v0.2.0

func IsTextOutput(output Output) bool

Output type checking helpers for providers

func NewGrailError added in v0.2.0

func NewGrailError(code ErrorCode, message string) *grailError

func Pointer

func Pointer[T any](v T) *T

Pointer is a helper to take the address of a literal value (e.g., grail.Pointer(0.0)).
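
A helper of this shape matters when an optional field must distinguish "unset" from a meaningful zero value. The sketch below re-declares a function with the documented signature locally and uses it with a hypothetical textOptions struct; the struct and its pointer-typed field are assumptions, since the field types of the providers' TextOptions are not shown here.

```go
package main

import "fmt"

// pointer mirrors the documented signature of grail.Pointer.
func pointer[T any](v T) *T { return &v }

// textOptions is a hypothetical options struct used only for this example.
type textOptions struct {
	Temperature *float64 // nil means "use the provider default"
}

// describe reports whether a temperature was set explicitly.
func describe(o textOptions) string {
	if o.Temperature == nil {
		return "temperature: provider default"
	}
	return fmt.Sprintf("temperature: %.1f", *o.Temperature)
}

func main() {
	fmt.Println(describe(textOptions{}))                          // field left unset
	fmt.Println(describe(textOptions{Temperature: pointer(0.0)})) // explicit zero
}
```

Without the helper, taking the address of a literal such as 0.0 requires a temporary variable, because Go does not allow &0.0.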

func SniffImageMIME added in v0.2.1

func SniffImageMIME(data []byte) string

SniffImageMIME detects image MIME type from magic bytes. It supports PNG, JPEG, GIF, and WebP formats.
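
To illustrate what magic-byte detection involves, here is a minimal sketch that checks the well-known file signatures of the four formats listed. It is an independent illustration, not grail's source: the function name is local, and returning an empty string for unrecognized data is an assumption, since the documented behaviour for unknown input is not stated.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffImageMIME is an illustrative re-implementation based on standard
// file signatures. It returns "" when nothing matches (an assumption).
func sniffImageMIME(data []byte) string {
	switch {
	case bytes.HasPrefix(data, []byte("\x89PNG\r\n\x1a\n")):
		return "image/png"
	case bytes.HasPrefix(data, []byte("\xff\xd8\xff")):
		return "image/jpeg"
	case bytes.HasPrefix(data, []byte("GIF87a")), bytes.HasPrefix(data, []byte("GIF89a")):
		return "image/gif"
	// WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
	case len(data) >= 12 && string(data[0:4]) == "RIFF" && string(data[8:12]) == "WEBP":
		return "image/webp"
	}
	return ""
}

func main() {
	fmt.Println(sniffImageMIME([]byte("\x89PNG\r\n\x1a\nrest-of-file")))
	fmt.Println(sniffImageMIME([]byte("RIFF\x00\x00\x00\x00WEBPVP8 ")))
}
```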

Types

type Client

type Client interface {
	Generate(ctx context.Context, req Request) (Response, error)

	// Explicit helpers for loading remote content (HTTP/S only).
	// These helpers perform network I/O using the client's HTTP client
	// and return concrete Inputs (bytes + MIME).
	InputFileFromURI(ctx context.Context, uri string, opts ...FileOpt) (Input, error)
	InputImageFromURI(ctx context.Context, uri string, opts ...FileOpt) (Input, error)
	InputPDFFromURI(ctx context.Context, uri string, opts ...FileOpt) (Input, error)

	// ListModels returns all available models for the provider and their capabilities.
	// Returns an error if the provider doesn't support model listing.
	ListModels(ctx context.Context) ([]Model, error)

	// GetModel returns the model matching the given role and tier.
	// Returns an error if no matching model is found.
	GetModel(ctx context.Context, role ModelRole, tier ModelTier) (Model, error)
}

func NewClient

func NewClient(p Provider, opts ...ClientOption) Client

type ClientOption

type ClientOption interface {
	// contains filtered or unexported methods
}
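
An option interface whose methods are all unexported is a common Go pattern for "sealed" functional options: callers can pass options but cannot define their own. The sketch below shows one way such a pattern is typically built; every name in it (config, option, withMaxBytes, withLogFormat, newConfig) and the default values are hypothetical and do not describe grail's internals.

```go
package main

import "fmt"

// config holds hypothetical client settings; the fields are illustrative.
type config struct {
	maxBytes  int64
	logFormat string
}

// option is sealed: because apply is unexported, only this package
// can provide implementations.
type option interface {
	apply(*config)
}

// optionFunc adapts a plain function to the option interface.
type optionFunc func(*config)

func (f optionFunc) apply(c *config) { f(c) }

func withMaxBytes(n int64) option {
	return optionFunc(func(c *config) { c.maxBytes = n })
}

func withLogFormat(format string) option {
	return optionFunc(func(c *config) { c.logFormat = format })
}

// newConfig starts from defaults and applies options in order,
// so later options override earlier ones.
func newConfig(opts ...option) config {
	c := config{maxBytes: 1 << 20, logFormat: "text"}
	for _, o := range opts {
		o.apply(&c)
	}
	return c
}

func main() {
	fmt.Printf("%+v\n", newConfig())
	fmt.Printf("%+v\n", newConfig(withMaxBytes(4096), withLogFormat("json")))
}
```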

func WithDownloadLimits added in v0.2.0

func WithDownloadLimits(maxBytes int64, timeout time.Duration) ClientOption
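
The signature suggests a byte cap and a time limit on remote downloads, though the exact semantics are not documented here. As a sketch of how a byte cap is commonly enforced in Go (not a description of grail's implementation), the following reads at most one byte past the limit and rejects the body if that extra byte arrives. readLimited, errTooLarge, and newBody are hypothetical names; newBody stands in for an HTTP response body.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errTooLarge = errors.New("body exceeds download limit")

// readLimited reads r fully but fails once more than maxBytes are available.
// Reading maxBytes+1 lets us tell "exactly at the limit" from "over it".
func readLimited(r io.Reader, maxBytes int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > maxBytes {
		return nil, errTooLarge
	}
	return data, nil
}

// newBody is a stand-in for resp.Body in this example.
func newBody(s string) io.Reader { return strings.NewReader(s) }

func main() {
	_, err := readLimited(newBody("0123456789"), 4)
	fmt.Println(err)

	data, err := readLimited(newBody("ok"), 4)
	fmt.Println(string(data), err)
}
```

A time limit would normally be layered on separately, for example with context.WithTimeout on the request context.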

func WithHTTPClient added in v0.2.0

func WithHTTPClient(hc *http.Client) ClientOption

func WithLogger

func WithLogger(l *slog.Logger) ClientOption

WithLogger sets a custom logger for client-level logs.

func WithLoggerFormat

func WithLoggerFormat(format string, level LoggerLevel) ClientOption

WithLoggerFormat builds a default logger at the given level and format ("text" or "json"). This is a convenience if you don't want to construct a slog.Logger yourself.
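
For readers who do want to construct the logger themselves and pass it to WithLogger, the standard library makes the "text" versus "json" choice straightforward. This is a sketch using only log/slog; newLogger and demo are hypothetical helpers, and treating any format other than "json" as text is an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
)

// newLogger builds a slog.Logger writing to w in the given format.
// Any format other than "json" falls back to the text handler (assumption).
func newLogger(w io.Writer, format string, level slog.Level) *slog.Logger {
	opts := &slog.HandlerOptions{Level: level}
	if format == "json" {
		return slog.New(slog.NewJSONHandler(w, opts))
	}
	return slog.New(slog.NewTextHandler(w, opts))
}

// demo logs one record below and one at the configured level and
// returns whatever the handler wrote.
func demo(format string) string {
	var buf bytes.Buffer
	logger := newLogger(&buf, format, slog.LevelWarn)
	logger.Info("dropped")                   // below warn, filtered out
	logger.Warn("kept", "provider", "openai") // at warn, written
	return buf.String()
}

func main() {
	fmt.Print(demo("text"))
	fmt.Print(demo("json"))
}
```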

type ErrorCode

type ErrorCode string
const (
	InvalidArgument ErrorCode = "invalid_argument"
	Unauthorized    ErrorCode = "unauthorized"
	RateLimited     ErrorCode = "rate_limited"
	Timeout         ErrorCode = "timeout"
	Unavailable     ErrorCode = "unavailable"
	Unsupported     ErrorCode = "unsupported"
	Refused         ErrorCode = "refused"
	OutputInvalid   ErrorCode = "output_invalid"
	Internal        ErrorCode = "internal"
)

func GetErrorCode

func GetErrorCode(err error) ErrorCode
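
The usual way to recover a typed code from an arbitrary error in Go is errors.As, which also sees through wrapping with %w. The sketch below shows that mechanism with local stand-in types; codedError, getErrorCode, and the choice to map uncoded errors to "internal" and nil to the empty code are assumptions, not documented behaviour of grail.GetErrorCode.

```go
package main

import (
	"errors"
	"fmt"
)

type errorCode string

const (
	rateLimited errorCode = "rate_limited"
	internal    errorCode = "internal"
)

// codedError is a hypothetical concrete error that carries a code.
type codedError struct {
	code errorCode
	msg  string
}

func (e *codedError) Error() string   { return string(e.code) + ": " + e.msg }
func (e *codedError) Code() errorCode { return e.code }

// getErrorCode walks the wrap chain looking for anything with a Code method.
// Uncoded errors map to internal and nil maps to "" (both assumptions).
func getErrorCode(err error) errorCode {
	if err == nil {
		return ""
	}
	var ce interface{ Code() errorCode }
	if errors.As(err, &ce) {
		return ce.Code()
	}
	return internal
}

// wrapped simulates a coded error that has been wrapped by a caller.
var wrapped = fmt.Errorf("generate: %w", &codedError{code: rateLimited, msg: "slow down"})

func main() {
	fmt.Println(getErrorCode(wrapped))
	fmt.Println(getErrorCode(errors.New("boom")))
}
```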

type FileOpt added in v0.2.0

type FileOpt interface {
	// contains filtered or unexported methods
}

func WithFileName added in v0.2.0

func WithFileName(name string) FileOpt

type GrailError added in v0.2.0

type GrailError interface {
	error
	Code() ErrorCode
	Retryable() bool
	ProviderName() string
	RequestID() string
}
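
An error interface with a Retryable method lends itself to a small retry loop. The following is a sketch under stated assumptions: the retryable interface mirrors only the Retryable method shown above, and tempError, isRetryable, withRetry, demo, the attempt count, and the doubling backoff are all hypothetical choices for illustration rather than anything grail provides.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryable mirrors the Retryable method of the interface above.
type retryable interface{ Retryable() bool }

// tempError is a hypothetical transient failure.
type tempError struct{}

func (tempError) Error() string   { return "temporarily unavailable" }
func (tempError) Retryable() bool { return true }

func isRetryable(err error) bool {
	var r retryable
	return errors.As(err, &r) && r.Retryable()
}

// withRetry calls fn up to attempts times, doubling the wait after each
// retryable failure, and stops early on success, a non-retryable error,
// or context cancellation.
func withRetry(ctx context.Context, attempts int, base time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil || !isRetryable(err) {
			return err
		}
		if i == attempts-1 {
			break // no point sleeping after the final attempt
		}
		select {
		case <-time.After(base << i):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}

// demo fails the first `failures` calls, then succeeds.
func demo(failures int) (calls int, err error) {
	err = withRetry(context.Background(), 3, time.Millisecond, func() error {
		calls++
		if calls <= failures {
			return tempError{}
		}
		return nil
	})
	return calls, err
}

func main() {
	fmt.Println(demo(2)) // succeeds on the third call
	fmt.Println(demo(5)) // gives up after three calls
}
```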

type ImageOutputInfo added in v0.2.0

type ImageOutputInfo struct {
	Data []byte
	MIME string
	Name string
}

ImageOutputInfo contains image data with MIME and optional name.

type ImageSpec added in v0.2.0

type ImageSpec struct {
	Count int // default 1
}

func GetImageSpec added in v0.2.0

func GetImageSpec(output Output) (ImageSpec, bool)

type Input added in v0.2.0

type Input interface {
	// contains filtered or unexported methods
}

func InputFile added in v0.2.0

func InputFile(data []byte, mime string, opts ...FileOpt) Input

func InputFileFromPath added in v0.2.0

func InputFileFromPath(path string, opts ...FileOpt) (Input, error)

func InputFileReader added in v0.2.0

func InputFileReader(r io.Reader, size int64, mime string, opts ...FileOpt) Input

func InputImage added in v0.2.0

func InputImage(data []byte, opts ...FileOpt) Input

func InputImageFromPath added in v0.2.0

func InputImageFromPath(path string, opts ...FileOpt) (Input, error)

func InputPDF added in v0.2.0

func InputPDF(data []byte, opts ...FileOpt) Input

func InputPDFFromPath added in v0.2.0

func InputPDFFromPath(path string, opts ...FileOpt) (Input, error)

func InputText added in v0.2.0

func InputText(s string) Input

func InputTextFile added in v0.2.0

func InputTextFile(text string, mime string, opts ...FileOpt) Input

type JSONOpt added in v0.2.0

type JSONOpt interface {
	// contains filtered or unexported methods
}

func WithStrictJSON added in v0.2.0

func WithStrictJSON(strict bool) JSONOpt

type LoggerAware

type LoggerAware interface {
	SetLogger(*slog.Logger)
}

LoggerAware is an optional interface for providers to accept a logger from the client.

type LoggerLevel

type LoggerLevel slog.Level

LoggerLevel is a small enum for convenience logger construction.

const (
	LoggerLevelDebug LoggerLevel = LoggerLevel(slog.LevelDebug)
	LoggerLevelInfo  LoggerLevel = LoggerLevel(slog.LevelInfo)
	LoggerLevelWarn  LoggerLevel = LoggerLevel(slog.LevelWarn)
	LoggerLevelError LoggerLevel = LoggerLevel(slog.LevelError)
)

type Model added in v0.3.0

type Model struct {
	Name         string            // Model identifier (e.g., "gpt-5.2", "gemini-3-flash-preview")
	Role         ModelRole         // text or image
	Tier         ModelTier         // best or fast
	Capabilities ModelCapabilities // What the model can do
}

Model describes a model and its capabilities. Providers export these as package-level variables for easy reference.

func (Model) String added in v0.3.0

func (m Model) String() string

String returns the model name for use in requests.

type ModelCapabilities added in v0.3.0

type ModelCapabilities struct {
	TextGeneration     bool // Can generate text from text input
	ImageGeneration    bool // Can generate images from text input
	ImageUnderstanding bool // Can understand/describe images
	PDFUnderstanding   bool // Can understand/extract from PDFs
	JSONOutput         bool // Can output structured JSON
}

ModelCapabilities describes what a model can do.
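
Capability flags like these are typically used to filter a model list before sending a request. The sketch below mirrors the documented fields in local types and filters a small catalog; the model names are made up for the example, and namesWhere is a hypothetical helper.

```go
package main

import "fmt"

// capabilities mirrors the fields of ModelCapabilities.
type capabilities struct {
	TextGeneration     bool
	ImageGeneration    bool
	ImageUnderstanding bool
	PDFUnderstanding   bool
	JSONOutput         bool
}

type model struct {
	Name         string
	Capabilities capabilities
}

// catalog uses made-up model names purely for illustration.
var catalog = []model{
	{Name: "example-text", Capabilities: capabilities{TextGeneration: true, ImageUnderstanding: true, PDFUnderstanding: true, JSONOutput: true}},
	{Name: "example-text-lite", Capabilities: capabilities{TextGeneration: true, JSONOutput: true}},
	{Name: "example-image", Capabilities: capabilities{ImageGeneration: true}},
}

// namesWhere returns the names of models whose capabilities satisfy want,
// preserving catalog order.
func namesWhere(models []model, want func(capabilities) bool) []string {
	var names []string
	for _, m := range models {
		if want(m.Capabilities) {
			names = append(names, m.Name)
		}
	}
	return names
}

func main() {
	pdf := namesWhere(catalog, func(c capabilities) bool { return c.PDFUnderstanding })
	fmt.Println(pdf)
}
```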

type ModelCatalog added in v0.3.0

type ModelCatalog interface {
	SetBestTextModel(model Model)
	SetFastTextModel(model Model)
	SetBestImageModel(model Model)
	SetFastImageModel(model Model)

	BestTextModel() Model
	FastTextModel() Model
	BestImageModel() Model
	FastImageModel() Model

	AllModels() []Model
}

ModelCatalog is an optional interface for providers to manage model selection. Providers implement this to allow users to override default models.

type ModelDescriber added in v0.3.0

type ModelDescriber interface {
	DescribeModels(req Request) string
}

ModelDescriber describes what models will be used for a request. Providers implement this to provide accurate logging when req.Model doesn't fully describe the models (e.g., OpenAI image generation uses both a text model and an image model).

type ModelLister added in v0.3.0

type ModelLister interface {
	ListModels(ctx context.Context) ([]Model, error)
}

ModelLister is an optional interface for providers to list available models.

type ModelResolver added in v0.3.0

type ModelResolver interface {
	ResolveModel(role ModelRole, tier ModelTier) (string, error)
}

ModelResolver resolves a role+tier to a model name. Providers implement this to support tier-based selection.
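
One simple way a provider could satisfy an interface of this shape is a lookup table keyed by role and tier. This is a sketch with local stand-in types; tableResolver, the key struct, and the model names are hypothetical and not taken from any grail provider.

```go
package main

import "fmt"

type role string
type tier string

// key combines role and tier so the pair can index a map.
type key struct {
	role role
	tier tier
}

// tableResolver is a hypothetical map-backed resolver.
type tableResolver map[key]string

// ResolveModel returns the model name registered for role+tier,
// or an error when no entry exists.
func (t tableResolver) ResolveModel(r role, ti tier) (string, error) {
	if name, ok := t[key{r, ti}]; ok {
		return name, nil
	}
	return "", fmt.Errorf("no model for role=%s tier=%s", r, ti)
}

// table uses made-up model names.
var table = tableResolver{
	{"text", "best"}:  "example-text-large",
	{"text", "fast"}:  "example-text-small",
	{"image", "best"}: "example-image",
}

func main() {
	fmt.Println(table.ResolveModel("text", "fast"))
	fmt.Println(table.ResolveModel("image", "fast"))
}
```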

type ModelRole added in v0.3.0

type ModelRole string

ModelRole describes the primary function of a model.

const (
	ModelRoleText  ModelRole = "text"  // Text/language generation
	ModelRoleImage ModelRole = "image" // Image generation
)

type ModelTier added in v0.3.0

type ModelTier string

ModelTier describes the quality/speed trade-off of a model.

const (
	ModelTierBest ModelTier = "best" // Highest quality, may be slower/costlier
	ModelTierFast ModelTier = "fast" // Speed/cost optimized
)

type ModelUse added in v0.2.0

type ModelUse struct {
	Role string // "language", "image_generation", "moderation", etc.
	Name string // provider-native model identifier
}

type Output added in v0.2.0

type Output interface {
	// contains filtered or unexported methods
}

func OutputImage added in v0.2.0

func OutputImage(spec ImageSpec) Output

func OutputJSON added in v0.2.0

func OutputJSON(schema any, opts ...JSONOpt) Output

func OutputText added in v0.2.0

func OutputText() Output

type OutputPart added in v0.2.0

type OutputPart interface {
	// contains filtered or unexported methods
}

func NewImageOutputPart added in v0.2.0

func NewImageOutputPart(data []byte, mime, name string) OutputPart

func NewJSONOutputPart added in v0.2.0

func NewJSONOutputPart(jsonData []byte) OutputPart

func NewTextOutputPart added in v0.2.0

func NewTextOutputPart(text string) OutputPart

OutputPart construction helpers for providers

type Provider

type Provider interface {
	Name() string
}

type ProviderExecutor added in v0.2.0

type ProviderExecutor interface {
	Provider
	DoGenerate(ctx context.Context, req Request) (Response, error)
}

ProviderExecutor is the internal execution seam (implemented by provider packages). This is exported so provider packages can implement it, but it's not part of the public API contract - users should not implement this directly.

type ProviderInfo added in v0.2.0

type ProviderInfo struct {
	Name   string
	Route  string // provider-defined (e.g. "responses", "images")
	Models []ModelUse
}

type ProviderOption

type ProviderOption interface {
	ApplyProviderOption() // marker method - must be exported for provider packages
}
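
A marker interface like this lets one request carry options for several providers, with each provider picking out the types it recognizes. The sketch below shows that dispatch with a type switch. All types here are local stand-ins, and silently counting unrecognized options as ignored is an assumption; how grail's providers treat foreign options is not documented in this section.

```go
package main

import "fmt"

// providerOption mirrors the exported marker method shown above.
type providerOption interface{ ApplyProviderOption() }

// Two hypothetical option types owned by one provider.
type aspectRatio string

func (aspectRatio) ApplyProviderOption() {}

type imageSize string

func (imageSize) ApplyProviderOption() {}

// foreignOption stands in for an option defined by a different provider.
type foreignOption struct{}

func (foreignOption) ApplyProviderOption() {}

type imageConfig struct {
	AspectRatio string
	Size        string
}

// apply keeps the options this provider understands and counts the rest.
func apply(opts []providerOption) (cfg imageConfig, ignored int) {
	for _, o := range opts {
		switch v := o.(type) {
		case aspectRatio:
			cfg.AspectRatio = string(v)
		case imageSize:
			cfg.Size = string(v)
		default:
			ignored++
		}
	}
	return cfg, ignored
}

func main() {
	cfg, ignored := apply([]providerOption{aspectRatio("16:9"), imageSize("2K"), foreignOption{}})
	fmt.Printf("%+v ignored=%d\n", cfg, ignored)
}
```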

type Request added in v0.2.0

type Request struct {
	Inputs          []Input
	Output          Output
	Model           string    // Optional: explicit model name (highest priority)
	Tier            ModelTier // Optional: tier-based selection (if Model not set)
	ProviderOptions []ProviderOption
	Metadata        map[string]string
}
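
The field comments state that an explicit Model takes priority and that Tier applies only when Model is not set. A minimal sketch of that precedence follows, using local stand-in types. The final fallback to a default name when neither field is set is an assumption, as are the chooseModel helper and the model names.

```go
package main

import "fmt"

// request mirrors only the two selection fields of Request.
type request struct {
	Model string
	Tier  string
}

// chooseModel applies the documented priority: explicit Model first,
// then Tier. Falling back to def when neither is set is an assumption.
func chooseModel(req request, byTier map[string]string, def string) string {
	if req.Model != "" {
		return req.Model
	}
	if name, ok := byTier[req.Tier]; ok {
		return name
	}
	return def
}

// byTier uses made-up model names.
var byTier = map[string]string{
	"best": "example-large",
	"fast": "example-small",
}

func main() {
	fmt.Println(chooseModel(request{Model: "custom", Tier: "fast"}, byTier, "example-large"))
	fmt.Println(chooseModel(request{Tier: "fast"}, byTier, "example-large"))
	fmt.Println(chooseModel(request{}, byTier, "example-large"))
}
```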

type Response added in v0.2.0

type Response struct {
	Outputs   []OutputPart
	Usage     Usage
	Provider  ProviderInfo
	RequestID string
	Warnings  []Warning
}

func (Response) DecodeJSON added in v0.2.0

func (r Response) DecodeJSON(dst any) error

func (Response) ImageOutputs added in v0.2.0

func (r Response) ImageOutputs() []ImageOutputInfo

ImageOutputs returns image output parts with MIME and name information.

func (Response) Images added in v0.2.0

func (r Response) Images() ([][]byte, bool)

func (Response) Text added in v0.2.0

func (r Response) Text() (string, bool)

type Usage added in v0.2.0

type Usage struct {
	InputTokens  int
	OutputTokens int
	TotalTokens  int
}

type Warning added in v0.2.0

type Warning struct {
	Code    string
	Message string
}

Directories

Path Synopsis
examples
  gemini-image-options (command)
    Gemini-image-options demonstrates Gemini-specific image generation options.
  image-understanding (command)
    Image-understanding demonstrates text generation from image inputs.
  model-selection (command)
    Model-selection demonstrates choosing between best and fast model tiers.
  openai-image-options (command)
    Openai-image-options demonstrates OpenAI-specific image generation options.
  pdf-to-image (command)
    Pdf-to-image demonstrates image generation from PDF documents.
  pdf-understanding (command)
    Pdf-understanding demonstrates text generation from PDF documents.
  simple-text (command)
    Simple-text demonstrates minimal text generation with default settings.
  text-generation (command)
    Text-generation demonstrates text generation with provider selection.
  text-to-image (command)
    Text-to-image demonstrates image generation from text prompts.
providers
  gemini
    Package gemini provides a Google Gemini implementation of the grail.Provider interface.
  mock
    Package mock provides a test double implementation of the grail.Provider interface.
  openai
    Package openai provides an OpenAI implementation of the grail.Provider interface.
