gaugo

package module

Version: v1.1.1 (latest) | Published: May 16, 2026 | License: MIT | Imports: 13 | Imported by: 0

Repository

github.com/nnull13/gaugo

README ¶

Gaugo

Go-native evaluations for AI applications, runnable through go test.

Gaugo lets Go teams evaluate RAG systems, agents, chatbots, and other AI-backed services with deterministic test cases, optional LLM judges, concurrent execution, and structured results that fit naturally into CI. Gaugo includes 24 built-in metrics across RAG, safety, generation quality, structured output, instruction following, domain-specific checks, and deterministic contracts.

go get github.com/nnull13/gaugo

Why Gaugo

Capability	What it gives you
Native Go tests	Write AI evaluations as normal `testing` tests.
Deterministic reporting	Run cases concurrently while preserving registration order.
No-LLM checks	Catch required behavior with JSON, regex, latency, length, and `ExpectedContains` assertions.
LLM-judged metrics	Use structured-output judges for RAG, safety, answer quality, citations, summaries, and custom criteria.
Programmatic runs	Use `Runner` to feed dashboards, CLIs, and internal pipelines.
Provider adapters	Start with OpenAI, Anthropic, Gemini, xAI, or a local model service.
Extension points	Bring your own judge, metric, or reporter.

Quickstart (No Provider Needed)

This is the smallest end-to-end evaluation you can run with go test.

package rag_test

import (
	"context"
	"testing"

	"github.com/nnull13/gaugo"
)

func TestEnterprisePricing(t *testing.T) {
	suite := gaugo.New(t)

	suite.Case("answer mentions sales",
		gaugo.Question("What is enterprise pricing?"),
		gaugo.ContextDocs(
			gaugo.Document{
				ID:   "pricing.md",
				Text: "Enterprise plans are custom and sold via sales.",
			},
		),
		gaugo.ExpectedContains("sales"),
	)

	suite.Assert(context.Background(), func(ctx context.Context, in gaugo.Input) (gaugo.Output, error) {
		return gaugo.Output{Answer: "Contact sales for enterprise pricing."}, nil
	})
}

Run it like any other Go test:

go test ./...

Add an LLM judge

package rag_test

import (
	"context"
	"os"
	"testing"
	"time"

	"github.com/nnull13/gaugo"
	"github.com/nnull13/gaugo/provider/openai"
)

func TestRAGQuality(t *testing.T) {
	apiKey := os.Getenv("OPENAI_API_KEY")
	if apiKey == "" {
		t.Skip("OPENAI_API_KEY is not set")
	}

	judge, err := openai.New(openai.Config{
		APIKey: apiKey,
		Model:  "gpt-4.1-mini",
	})
	if err != nil {
		t.Fatal(err)
	}

	suite := gaugo.New(t,
		gaugo.WithJudge(judge),
		gaugo.WithParallelism(8),
		gaugo.WithCaseTimeout(15*time.Second),
	)

	suite.Case("pricing answer",
		gaugo.Question("What is enterprise pricing?"),
		gaugo.ContextDocs(gaugo.Document{
			ID:   "pricing.md",
			Text: "Enterprise pricing is custom and handled by sales.",
		}),
		gaugo.ExpectedContains("sales"),
	)

	suite.Assert(context.Background(), yourGaugoEvaluation,
		gaugo.ContextRelevancy(gaugo.WithThreshold(0.75)),
		gaugo.Faithfulness(gaugo.WithThreshold(0.8)),
		gaugo.AnswerRelevancy(gaugo.WithThreshold(0.7)),
	)
}

Hosted provider adapters validate URLs in strict mode by default: the scheme must be https and the host must be an official provider host. Set AllowUnsafeURL: true only for trusted local stubs or custom gateways.
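To make the strict-mode rule concrete, here is a minimal sketch of that kind of check using only the standard library. It is not Gaugo's validation code: the function name validateBaseURL, the allowedHosts map, and the single host in it are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// allowedHosts is an illustrative allowlist, not Gaugo's actual list.
var allowedHosts = map[string]bool{
	"api.openai.com": true,
}

// validateBaseURL accepts only https URLs whose host is on the allowlist.
// Passing allowUnsafe skips both checks (hypothetical analogue of AllowUnsafeURL).
func validateBaseURL(raw string, allowUnsafe bool) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if allowUnsafe {
		return nil
	}
	if u.Scheme != "https" {
		return fmt.Errorf("scheme %q is not https", u.Scheme)
	}
	if !allowedHosts[u.Hostname()] {
		return fmt.Errorf("host %q is not an allowed provider host", u.Hostname())
	}
	return nil
}

func main() {
	fmt.Println(validateBaseURL("https://api.openai.com/v1", false))
	fmt.Println(validateBaseURL("http://localhost:11434", false))
	fmt.Println(validateBaseURL("http://localhost:11434", true))
}
```

The opt-out is a single explicit flag, so an unsafe endpoint is visible wherever the provider is configured.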

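yourGaugoEvaluation is the adapter that calls your application and maps its response to a gaugo.Output (myApp below is a placeholder for your own code):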

func yourGaugoEvaluation(ctx context.Context, in gaugo.Input) (gaugo.Output, error) {
	answer, err := myApp.Answer(ctx, in.Question, in.Context)
	if err != nil {
		return gaugo.Output{}, err
	}
	return gaugo.Output{Answer: answer}, nil
}

Go-native positioning

Python-first evaluation frameworks such as Ragas, DeepEval, and TruLens are good options when your eval stack already lives in notebooks, Python services, or dedicated observability platforms. Gaugo's narrower focus is Go-native evaluation: keep cases beside Go application code, run them with go test, and send structured results to the CI and reporting systems your team already uses.

Docs & Next Steps

I want to...	Go to
Browse all docs from one place	Documentation index
Write my first evaluation	Getting started
Understand the evaluation model	Concepts
Use Gaugo inside `go test`	Testing with Suite
Run evaluations from a CLI or pipeline	Programmatic Runner
Configure metrics and thresholds	Metrics reference
Choose and configure an LLM provider	Provider index (OpenAI, Anthropic, Gemini, xAI, Local)
Add a custom judge, metric, or reporter	Extending Gaugo
Debug a failure	Troubleshooting

License

See LICENSE.

If Gaugo helps your team ship safer AI, consider giving it a star.

Crafted by NoName13.

Documentation ¶

Overview ¶

Package gaugo provides an idiomatic Go testing harness for AI application evaluation.

Gaugo evaluates RAG pipelines and AI systems directly within Go's testing workflow, producing deterministic, concurrent, CI-friendly results without external orchestration.

Quick start ¶

Use Suite inside a standard Go test to register cases and assert metrics:

func TestRAG(t *testing.T) {
    suite := gaugo.New(t, gaugo.WithJudge(judge))
    suite.Case("basic",
        gaugo.Question("What is Go?"),
        gaugo.ContextDocs(gaugo.Doc("d1", "Go is a programming language.")),
    )
    suite.Assert(ctx, myRAG, gaugo.Faithfulness(), gaugo.AnswerRelevancy())
}

Programmatic usage ¶

Use Runner when you need structured results outside the testing framework:

// Errors are discarded here for brevity; NewRunner, Case, and Run each return one.
runner, _ := gaugo.NewRunner(gaugo.WithJudge(judge))
runner.Case("example", gaugo.Question("Q?"), gaugo.ExpectedContains("answer"))
result, _ := runner.Run(ctx, myFunc)
fmt.Println(result.Summary())

Built-in metrics ¶

Built-in metrics cover RAG, safety, generation quality, structured output, instruction following, domain-specific checks, and deterministic contracts.

RAG and answer quality:

Faithfulness, AnswerRelevancy, ContextRelevancy
ContextPrecision, ContextRecall, AnswerCorrectness

Safety and generation quality:

Hallucination, Toxicity, Bias
Coherence, Conciseness, Completeness

Structured output and deterministic checks:

JSONValidity, SchemaCompliance, ExpectedJSON
AnswerSimilarity, Latency, AnswerLength, ExpectedRegex

Instruction and domain-specific metrics:

InstructionAdherence, GEval
CitationAccuracy, SummarizationQuality

All metrics accept WithThreshold to set a custom pass/fail score in [0,1]. Metric interfaces, shared input/output types, and built-in constructors also live in the github.com/nnull13/gaugo/metric sub-package. The root package re-exports that public surface so callers can use either Faithfulness or metric.Faithfulness interchangeably.

Provider judges ¶

Metrics that require LLM evaluation use a Judge interface. Built-in adapters are provided for OpenAI, Anthropic, Gemini, xAI, and local models (Ollama). See the provider sub-packages for configuration details.

Index ¶

Constants
Variables
func Assert(t testing.TB, result RunResult)
type Case
type CaseOption
- func ContextDocs(docs ...Document) CaseOption
- func ExpectedAnswer(answer string) CaseOption
- func ExpectedContains(substr string) CaseOption
- func ExpectedInstructions(instructions string) CaseOption
- func Question(question string) CaseOption
type CaseResult
- func (c CaseResult) Failed() bool
- func (c CaseResult) FailedMetrics() []MetricResult
- func (c CaseResult) MetricsByName(name string) []MetricResult
type Document
- func Doc(id, text string) Document
type Error
type ErrorCode
type ErrorInfo
- func ClassifyError(err error) ErrorInfo
- func MetricErrorInfo(m MetricResult) (ErrorInfo, bool)
type ErrorKind
type EvalInput
type Expected
type Input
type Judge
type JudgeRequest
type JudgeResponse
type Metric
- func AnswerCorrectness(opts ...MetricOption) Metric
- func AnswerLength(opts ...MetricOption) Metric
- func AnswerRelevancy(opts ...MetricOption) Metric
- func AnswerSimilarity(opts ...MetricOption) Metric
- func Bias(opts ...MetricOption) Metric
- func CitationAccuracy(opts ...MetricOption) Metric
- func Coherence(opts ...MetricOption) Metric
- func Completeness(opts ...MetricOption) Metric
- func Conciseness(opts ...MetricOption) Metric
- func ContextPrecision(opts ...MetricOption) Metric
- func ContextRecall(opts ...MetricOption) Metric
- func ContextRelevancy(opts ...MetricOption) Metric
- func ExpectedJSON(opts ...MetricOption) Metric
- func ExpectedRegex(pattern string, opts ...MetricOption) Metric
- func Faithfulness(opts ...MetricOption) Metric
- func GEval(criteria string, opts ...MetricOption) Metric
- func Hallucination(opts ...MetricOption) Metric
- func InstructionAdherence(opts ...MetricOption) Metric
- func JSONValidity(opts ...MetricOption) Metric
- func Latency(opts ...MetricOption) Metric
- func SchemaCompliance(opts ...MetricOption) Metric
- func SummarizationQuality(opts ...MetricOption) Metric
- func Toxicity(opts ...MetricOption) Metric
type MetricOption
- func WithExpectedFields(fields map[string]any) MetricOption
- func WithMaxLatency(d time.Duration) MetricOption
- func WithMaxLength(n int) MetricOption
- func WithMinLength(n int) MetricOption
- func WithSchema(schema json.RawMessage) MetricOption
- func WithThreshold(v float64) MetricOption
type MetricResult
type Option
- func WithCaseTimeout(d time.Duration) Option
- func WithJudge(j Judge) Option
- func WithMetricDetailsLimit(bytes int) Option
- func WithParallelism(n int) Option
- func WithReporter(r Reporter) Option
type Output
type Reporter
type RetryConfig
- func DefaultRetryConfig() RetryConfig
- func (cfg RetryConfig) Validate() error
type RunFunc
type RunResult
- func (r RunResult) Failed() bool
- func (r RunResult) PassRate() float64
- func (r RunResult) Summary() string
type Runner
- func NewRunner(opts ...Option) (*Runner, error)
- func (r *Runner) Case(name string, opts ...CaseOption) error
- func (r *Runner) Run(ctx context.Context, run RunFunc, metrics ...Metric) (RunResult, error)
type Suite
- func New(t testing.TB, opts ...Option) *Suite
- func (s *Suite) Assert(ctx context.Context, run RunFunc, metrics ...Metric)
- func (s *Suite) Case(name string, opts ...CaseOption)

Constants ¶

const (
	ErrorKindUnknown             = failure.KindUnknown
	ErrorKindConfig              = failure.KindConfig
	ErrorKindValidation          = failure.KindValidation
	ErrorKindContextCanceled     = failure.KindContextCanceled
	ErrorKindContextDeadline     = failure.KindContextDeadline
	ErrorKindPanic               = failure.KindPanic
	ErrorKindMetric              = failure.KindMetric
	ErrorKindMetricParse         = failure.KindMetricParse
	ErrorKindProviderRequest     = failure.KindProviderRequest
	ErrorKindProviderAuth        = failure.KindProviderAuth
	ErrorKindProviderRateLimit   = failure.KindProviderRateLimit
	ErrorKindProviderUnavailable = failure.KindProviderUnavailable
	ErrorKindProviderResponse    = failure.KindProviderResponse
	ErrorKindProviderRefusal     = failure.KindProviderRefusal
	ErrorKindProviderTruncated   = failure.KindProviderTruncated
)

Variables ¶

var (
	ErrConfig              = failure.ErrConfig
	ErrValidation          = failure.ErrValidation
	ErrMetric              = failure.ErrMetric
	ErrMetricParse         = failure.ErrMetricParse
	ErrProviderRequest     = failure.ErrProviderRequest
	ErrProviderAuth        = failure.ErrProviderAuth
	ErrProviderRateLimit   = failure.ErrProviderRateLimit
	ErrProviderUnavailable = failure.ErrProviderUnavailable
	ErrProviderResponse    = failure.ErrProviderResponse
	ErrProviderRefusal     = failure.ErrProviderRefusal
	ErrProviderTruncated   = failure.ErrProviderTruncated
	ErrPanic               = failure.ErrPanic
)
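Exported Err... variables like these are usually matched with errors.Is, which also sees through wrapping done with %w. The sketch below shows that pattern with a local stand-in sentinel (errProviderRateLimit) and a hypothetical callJudge function. Whether Gaugo's own errors support errors.Is in this way is an assumption; this page does not state it.

```go
package main

import (
	"errors"
	"fmt"
)

// errProviderRateLimit is a local stand-in for a sentinel such as
// gaugo.ErrProviderRateLimit; it is not imported from the library.
var errProviderRateLimit = errors.New("provider rate limit")

// callJudge is a hypothetical call that fails with a wrapped sentinel.
func callJudge() error {
	return fmt.Errorf("judge request failed: %w", errProviderRateLimit)
}

// isRateLimited reports whether err wraps the rate-limit sentinel.
func isRateLimited(err error) bool {
	return errors.Is(err, errProviderRateLimit)
}

func main() {
	err := callJudge()
	fmt.Println(isRateLimited(err))                  // true
	fmt.Println(isRateLimited(errors.New("other"))) // false
}
```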

Functions ¶

func Assert ¶

func Assert(t testing.TB, result RunResult)

Assert reports result failures through the Go testing package.

Types ¶

type Case ¶

type Case struct {
	Name     string
	Input    Input
	Expected Expected
}

Case defines one evaluation scenario.

type CaseOption ¶

type CaseOption func(*Case)

CaseOption mutates one Case definition.
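CaseOption follows Go's functional options pattern: each option is a function that edits a Case in place. The sketch below reproduces the pattern with local stand-in types. The Contains field and the newCase helper are hypothetical, since this page does not show the fields of Input or Expected or how Gaugo applies options internally.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented shapes; not imported from gaugo.
type Input struct{ Question string }

type Expected struct{ Contains []string } // hypothetical field

type Case struct {
	Name     string
	Input    Input
	Expected Expected
}

// CaseOption mutates one Case definition.
type CaseOption func(*Case)

// Question returns an option that sets the case question.
func Question(q string) CaseOption {
	return func(c *Case) { c.Input.Question = q }
}

// ExpectedContains returns an option that records a required substring.
func ExpectedContains(s string) CaseOption {
	return func(c *Case) { c.Expected.Contains = append(c.Expected.Contains, s) }
}

// newCase applies the options in the order they were passed.
func newCase(name string, opts ...CaseOption) Case {
	c := Case{Name: name}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newCase("pricing", Question("What is enterprise pricing?"), ExpectedContains("sales"))
	fmt.Printf("%+v\n", c)
}
```

Because options are plain functions, new ones can be added later without changing the signature of Case registration.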

func ContextDocs ¶

func ContextDocs(docs ...Document) CaseOption

ContextDocs sets retrieved context documents for a case.

func ExpectedAnswer ¶ added in v1.1.0

func ExpectedAnswer(answer string) CaseOption

ExpectedAnswer sets the reference answer for metrics that need ground truth.

func ExpectedContains ¶

func ExpectedContains(substr string) CaseOption

ExpectedContains requires the output answer to contain the given substring.

func ExpectedInstructions ¶ added in v1.1.0

func ExpectedInstructions(instructions string) CaseOption

ExpectedInstructions sets the reference instructions for instruction-following metrics.

func Question ¶

func Question(question string) CaseOption

Question sets the user question for a case.

type CaseResult ¶

type CaseResult struct {
	Name     string
	Metrics  []MetricResult
	RunError error
	Elapsed  time.Duration
}

CaseResult contains execution and metric results for a single case.

func (CaseResult) Failed ¶ added in v1.1.0

func (c CaseResult) Failed() bool

Failed reports whether this case has a run error or any failing metric.

func (CaseResult) FailedMetrics ¶ added in v1.1.0

func (c CaseResult) FailedMetrics() []MetricResult

FailedMetrics returns only the metrics that did not pass.

func (CaseResult) MetricsByName ¶ added in v1.1.0

func (c CaseResult) MetricsByName(name string) []MetricResult

MetricsByName returns metrics matching the given name.

type Document ¶

type Document = metric.Document

func Doc ¶ added in v1.1.0

func Doc(id, text string) Document

Doc is a convenience constructor for Document.

type Error ¶ added in v1.1.1

type Error = failure.Error

Error is Gaugo's typed operational error. It keeps a human message, structured integration metadata, and an optional wrapped cause.

type ErrorCode ¶ added in v1.1.1

type ErrorCode = failure.Code

ErrorCode is a stable machine-readable reason for a Gaugo failure.

type ErrorInfo ¶ added in v1.1.0

type ErrorInfo = failure.Info

ErrorInfo is a redacted, structured description of an operational failure.

func ClassifyError ¶ added in v1.1.0

func ClassifyError(err error) ErrorInfo

ClassifyError returns redacted operational metadata for err.

func MetricErrorInfo ¶ added in v1.1.0

func MetricErrorInfo(m MetricResult) (ErrorInfo, bool)

MetricErrorInfo extracts ErrorInfo stored in MetricResult details.

type ErrorKind ¶ added in v1.1.0

type ErrorKind = failure.Kind

ErrorKind classifies operational failures separately from low quality scores.

type EvalInput ¶

type EvalInput = metric.EvalInput

type Expected ¶

type Expected = metric.Expected

type Input ¶

type Input = metric.Input

type Judge ¶

type Judge = metric.Judge

type JudgeRequest ¶

type JudgeRequest = metric.JudgeRequest

type JudgeResponse ¶

type JudgeResponse = metric.JudgeResponse

type Metric ¶

type Metric = metric.Metric

func AnswerCorrectness ¶ added in v1.1.0

func AnswerCorrectness(opts ...MetricOption) Metric

AnswerCorrectness scores how well the answer matches Expected.Answer.

func AnswerLength ¶ added in v1.1.0

func AnswerLength(opts ...MetricOption) Metric

AnswerLength scores whether the answer length lies in [min,max] runes.
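Counting runes rather than bytes matters for non-ASCII answers. The sketch below shows the difference with the standard library. The helper name lengthInRange is an assumption, and the inclusive bounds follow the [min,max] notation above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengthInRange reports whether s has between lo and hi runes, inclusive.
func lengthInRange(s string, lo, hi int) bool {
	n := utf8.RuneCountInString(s)
	return n >= lo && n <= hi
}

func main() {
	s := "h\u00e9llo" // "héllo": 5 runes, 6 bytes in UTF-8
	fmt.Println(len(s), utf8.RuneCountInString(s)) // 6 5
	fmt.Println(lengthInRange(s, 1, 5))             // true
}
```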

func AnswerRelevancy ¶

func AnswerRelevancy(opts ...MetricOption) Metric

AnswerRelevancy scores how well the answer addresses the question.

func AnswerSimilarity ¶ added in v1.1.0

func AnswerSimilarity(opts ...MetricOption) Metric

AnswerSimilarity scores Jaccard token overlap against Expected.Answer.
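Jaccard overlap is the size of the intersection of two token sets divided by the size of their union. A minimal sketch follows. The tokenization (lowercasing and splitting on whitespace) and the score of 1 for two empty inputs are assumptions, not Gaugo's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenSet lowercases s and returns its whitespace-separated tokens as a set.
func tokenSet(s string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, tok := range strings.Fields(strings.ToLower(s)) {
		set[tok] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B| over the token sets of a and b.
func jaccard(a, b string) float64 {
	sa, sb := tokenSet(a), tokenSet(b)
	if len(sa) == 0 && len(sb) == 0 {
		return 1 // convention chosen for this sketch
	}
	inter := 0
	for tok := range sa {
		if _, ok := sb[tok]; ok {
			inter++
		}
	}
	union := len(sa) + len(sb) - inter
	return float64(inter) / float64(union)
}

func main() {
	// Shared tokens: contact, sales. Union: contact, sales, for, pricing.
	fmt.Println(jaccard("Contact sales for pricing", "contact sales")) // 0.5
}
```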

func Bias ¶ added in v1.1.0

func Bias(opts ...MetricOption) Metric

Bias scores how unbiased the answer is.

func CitationAccuracy ¶ added in v1.1.0

func CitationAccuracy(opts ...MetricOption) Metric

CitationAccuracy scores how accurate inline citations are against context.

func Coherence ¶ added in v1.1.0

func Coherence(opts ...MetricOption) Metric

Coherence scores how internally consistent the answer is.

func Completeness ¶ added in v1.1.0

func Completeness(opts ...MetricOption) Metric

Completeness scores how completely the answer addresses the question.

func Conciseness ¶ added in v1.1.0

func Conciseness(opts ...MetricOption) Metric

Conciseness scores how succinct the answer is without losing meaning.

func ContextPrecision ¶ added in v1.1.0

func ContextPrecision(opts ...MetricOption) Metric

ContextPrecision scores the fraction of context documents that are useful.

func ContextRecall ¶ added in v1.1.0

func ContextRecall(opts ...MetricOption) Metric

ContextRecall scores how much of the expected answer is supported by the context.

func ContextRelevancy ¶ added in v1.1.0

func ContextRelevancy(opts ...MetricOption) Metric

ContextRelevancy scores how relevant each context document is to the question.

func ExpectedJSON ¶ added in v1.1.0

func ExpectedJSON(opts ...MetricOption) Metric

ExpectedJSON scores how many expected JSON fields match the answer.

func ExpectedRegex ¶ added in v1.1.0

func ExpectedRegex(pattern string, opts ...MetricOption) Metric

ExpectedRegex scores whether the answer matches a Go regular expression.
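Go regular expressions use RE2 syntax, so patterns with backreferences or lookarounds do not compile. The sketch below shows a standalone match check with the standard regexp package. The helper name matches is an assumption, and it does not reflect how ExpectedRegex reports an invalid pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches compiles pattern and reports whether it matches anywhere in answer.
func matches(pattern, answer string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, fmt.Errorf("compile pattern: %w", err)
	}
	return re.MatchString(answer), nil
}

func main() {
	ok, err := matches(`(?i)\bsales\b`, "Contact Sales for enterprise pricing.")
	fmt.Println(ok, err) // true <nil>

	_, err = matches(`(`, "anything")
	fmt.Println(err != nil) // true
}
```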

func Faithfulness ¶

func Faithfulness(opts ...MetricOption) Metric

Faithfulness scores whether the answer is supported by the provided context.

func GEval ¶ added in v1.1.0

func GEval(criteria string, opts ...MetricOption) Metric

GEval scores the answer against a free-form criteria string. Criteria must be non-empty.

func Hallucination ¶ added in v1.1.0

func Hallucination(opts ...MetricOption) Metric

Hallucination scores the fraction of claims that are not hallucinated.

func InstructionAdherence ¶ added in v1.1.0

func InstructionAdherence(opts ...MetricOption) Metric

InstructionAdherence scores how closely the answer followed Expected.Instructions.

func JSONValidity ¶ added in v1.1.0

func JSONValidity(opts ...MetricOption) Metric

JSONValidity scores whether the answer parses as valid JSON.
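For reference, the standard library exposes this kind of check as json.Valid. The sketch below shows which inputs it accepts. Whether JSONValidity uses json.Valid internally is not stated on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isJSON reports whether s is exactly one well-formed JSON value.
func isJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	fmt.Println(isJSON(`{"plan":"enterprise"}`))  // true
	fmt.Println(isJSON(`{"plan":"enterprise",}`)) // false: trailing comma
	fmt.Println(isJSON(`Sure! {"plan":"x"}`))     // false: prose before the object
}
```

The last case is a common failure mode for model output: a valid object wrapped in explanatory text is not valid JSON.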

func Latency ¶ added in v1.1.0

func Latency(opts ...MetricOption) Metric

Latency scores whether the run elapsed within the configured maximum.

func SchemaCompliance ¶ added in v1.1.0

func SchemaCompliance(opts ...MetricOption) Metric

SchemaCompliance scores whether the answer matches the configured JSON schema.

func SummarizationQuality ¶ added in v1.1.0

func SummarizationQuality(opts ...MetricOption) Metric

SummarizationQuality scores summary coverage, fidelity and conciseness.

func Toxicity ¶ added in v1.1.0

func Toxicity(opts ...MetricOption) Metric

Toxicity scores how non-toxic the answer is.

type MetricOption ¶

type MetricOption = metric.Option

func WithExpectedFields ¶ added in v1.1.0

func WithExpectedFields(fields map[string]any) MetricOption

WithExpectedFields sets the dotted JSON paths and expected values used by ExpectedJSON.
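A dotted path such as plan.tier names a value nested inside JSON objects. The sketch below shows one way to resolve such paths and compute the fraction of matching fields. The helper names lookup and matchedFraction are assumptions, and the real metric may treat arrays, number types, and empty expectations differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// lookup walks a decoded JSON object along a dotted path such as "plan.tier".
func lookup(doc map[string]any, path string) (any, bool) {
	var cur any = doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = obj[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

// matchedFraction returns the share of expected fields whose values match.
func matchedFraction(answer string, expected map[string]any) (float64, error) {
	var doc map[string]any
	if err := json.Unmarshal([]byte(answer), &doc); err != nil {
		return 0, fmt.Errorf("decode answer: %w", err)
	}
	if len(expected) == 0 {
		return 1, nil // convention chosen for this sketch
	}
	matched := 0
	for path, want := range expected {
		if got, ok := lookup(doc, path); ok && reflect.DeepEqual(got, want) {
			matched++
		}
	}
	return float64(matched) / float64(len(expected)), nil
}

func main() {
	answer := `{"plan":{"tier":"enterprise","seats":50},"contact":"sales"}`
	score, err := matchedFraction(answer, map[string]any{
		"plan.tier":  "enterprise",
		"plan.seats": float64(50), // encoding/json decodes numbers as float64
		"contact":    "support",   // does not match "sales"
	})
	fmt.Println(score, err) // two of three fields match
}
```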

func WithMaxLatency ¶ added in v1.1.0

func WithMaxLatency(d time.Duration) MetricOption

WithMaxLatency sets the maximum allowed run latency for Latency.

func WithMaxLength ¶ added in v1.1.0

func WithMaxLength(n int) MetricOption

WithMaxLength sets the maximum allowed answer length in runes.

func WithMinLength ¶ added in v1.1.0

func WithMinLength(n int) MetricOption

WithMinLength sets the minimum allowed answer length in runes.

func WithSchema ¶ added in v1.1.0

func WithSchema(schema json.RawMessage) MetricOption

WithSchema sets the JSON Schema used by SchemaCompliance.

func WithThreshold ¶

func WithThreshold(v float64) MetricOption

WithThreshold sets the pass/fail threshold in [0,1]. Default is 0.7.

type MetricResult ¶

type MetricResult = metric.Result

type Option ¶

type Option func(*config) error

Option configures a Runner or Suite.

func WithCaseTimeout ¶

func WithCaseTimeout(d time.Duration) Option

WithCaseTimeout applies a per-case timeout to run and metric evaluation.

func WithJudge ¶

func WithJudge(j Judge) Option

WithJudge configures the LLM judge used by metrics.

func WithMetricDetailsLimit ¶

func WithMetricDetailsLimit(bytes int) Option

WithMetricDetailsLimit caps stored metric detail bytes per metric result. Use 0 to disable details entirely.

func WithParallelism ¶

func WithParallelism(n int) Option

WithParallelism configures the maximum number of concurrent case executions.

func WithReporter ¶

func WithReporter(r Reporter) Option

WithReporter overrides the default testing reporter.

type Output ¶

type Output = metric.Output

type Reporter ¶

type Reporter interface {
	Report(ctx context.Context, result RunResult)
}

Reporter receives completed suite results without depending on testing.T.

type RetryConfig ¶

type RetryConfig struct {
	// MaxAttempts is the total number of attempts, including the first request.
	// Zero uses the package default.
	MaxAttempts int
	// BaseDelay is the fallback delay before the second attempt.
	// Zero uses the package default.
	BaseDelay time.Duration
	// MaxDelay caps exponential backoff and Retry-After delays.
	// Zero uses the package default.
	MaxDelay time.Duration
}

RetryConfig controls retries for transient provider HTTP failures.

func DefaultRetryConfig ¶

func DefaultRetryConfig() RetryConfig

DefaultRetryConfig returns the retry defaults used by bundled providers.

func (RetryConfig) Validate ¶

func (cfg RetryConfig) Validate() error

Validate reports whether cfg is internally consistent.

type RunFunc ¶

type RunFunc func(ctx context.Context, in Input) (Output, error)

RunFunc executes the system under test for one case.

type RunResult ¶

type RunResult struct {
	Cases []CaseResult
}

RunResult is the deterministic output of a suite execution.

func (RunResult) Failed ¶ added in v1.1.0

func (r RunResult) Failed() bool

Failed reports whether any case has a run error or a failing metric.

func (RunResult) PassRate ¶ added in v1.1.0

func (r RunResult) PassRate() float64

PassRate returns the number of passed checks divided by the sum of explicit checks and run errors. Each run error counts as a failed execution in the denominator.

func (RunResult) Summary ¶ added in v1.1.0

func (r RunResult) Summary() string

Summary returns a human-readable one-line summary suitable for logs and CI output.

type Runner ¶

type Runner struct {
	// contains filtered or unexported fields
}

Runner executes registered cases and returns structured results without depending on the testing package.

func NewRunner ¶

func NewRunner(opts ...Option) (*Runner, error)

NewRunner creates a programmatic evaluation runner.

func (*Runner) Case ¶

func (r *Runner) Case(name string, opts ...CaseOption) error

Case registers one evaluation case.

func (*Runner) Run ¶

func (r *Runner) Run(ctx context.Context, run RunFunc, metrics ...Metric) (RunResult, error)

Run executes all registered cases.

type Suite ¶

type Suite struct {
	// contains filtered or unexported fields
}

Suite is a testing wrapper around Runner.

func New ¶

func New(t testing.TB, opts ...Option) *Suite

New creates a testing Suite. Configuration errors fail the test immediately.

func (*Suite) Assert ¶

func (s *Suite) Assert(ctx context.Context, run RunFunc, metrics ...Metric)

Assert executes all registered cases and reports failures through testing.

func (*Suite) Case ¶

func (s *Suite) Case(name string, opts ...CaseOption)

Case registers one evaluation case and fails the test if it is invalid.

Directories ¶

Path	Synopsis
internal
failure
metrics	Package metrics provides validation helpers shared by every built-in judge-metric sub-package under internal/metrics/*.
metrics/answercorrectness
metrics/answerrelevancy
metrics/bias
metrics/citationaccuracy
metrics/coherence
metrics/completeness
metrics/conciseness
metrics/contextprecision
metrics/contextrecall
metrics/contextrelevancy
metrics/faithfulness
metrics/hallucination
metrics/instructionadherence
metrics/summarizationquality
metrics/toxicity
prompt
provider
provider/request
provider/validate
provider/wire
provider/wire/anthropic/messages
provider/wire/gemini/generatecontent
provider/wire/ollama/nativechat
provider/wire/openai/chat
provider/wire/openai/responses
runner
strictjson	Package strictjson decodes JSON into Go values while rejecting unknown fields and trailing tokens.
metric	Package metric defines the Metric interface used by gaugo to evaluate one case at a time, plus the built-in deterministic and LLM-judge metrics.
provider
anthropic
gemini
local
openai
xai
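The strictjson synopsis above describes decoding that rejects unknown fields and trailing tokens. Both checks can be built from encoding/json, as in the minimal sketch below. The verdict type and decodeStrict function are hypothetical and are not the internal package's API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// verdict is a hypothetical judge response shape used only for this sketch.
type verdict struct {
	Score  float64 `json:"score"`
	Reason string  `json:"reason"`
}

// decodeStrict decodes exactly one JSON value into v, rejecting fields that
// v does not declare and any non-whitespace data after the value.
func decodeStrict(data []byte, v any) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(v); err != nil {
		return fmt.Errorf("decode: %w", err)
	}
	// A second Decode must hit EOF; anything else means trailing tokens.
	var extra json.RawMessage
	if err := dec.Decode(&extra); !errors.Is(err, io.EOF) {
		return errors.New("unexpected trailing data after JSON value")
	}
	return nil
}

func main() {
	var v verdict
	fmt.Println(decodeStrict([]byte(`{"score":0.9,"reason":"grounded"}`), &v))
	fmt.Println(decodeStrict([]byte(`{"score":0.9,"confidence":1}`), &v))
	fmt.Println(decodeStrict([]byte(`{"score":0.9} {"score":0.1}`), &v))
}
```

Strict decoding makes a malformed judge response fail loudly instead of silently producing a zero score.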
