Documentation ¶
Overview ¶
Package genai provides OpenLLMetry semantic convention constants and helpers for recording GenAI usage (tokens, cost, prompts) on OTel spans.
Index ¶
- Constants
- Variables
- func RecordAgentStep(span trace.Span, agentName, agentRole, step string)
- func RecordCacheHit(span trace.Span, hit bool, source string)
- func RecordInteraction(span trace.Span, prompt, completion string)
- func RecordTTFT(ctx context.Context, durationSeconds float64, modelName string)
- func RecordToolResult(span trace.Span, resultJSON string, isError bool)
- func RecordUsage(ctx context.Context, span trace.Span, inTokens, outTokens int, costUSD float64)
- func RecordUsageWithPurpose(ctx context.Context, span trace.Span, inTokens, outTokens int, costUSD float64, ...)
- func SetMaxContextLength(limit int32)
- func StartToolSpan(ctx context.Context, toolName, toolID, argsJSON string) (context.Context, trace.Span)
Constants ¶
const (
	System        = "gen_ai.system"
	RequestModel  = "gen_ai.request.model"
	ResponseModel = "gen_ai.response.model"
	Prompt        = "gen_ai.prompt"
	Completion    = "gen_ai.completion"
	InputTokens   = "gen_ai.usage.input_tokens"  // #nosec G101 -- OTel semantic convention name, not a credential
	OutputTokens  = "gen_ai.usage.output_tokens" // #nosec G101 -- OTel semantic convention name, not a credential
	CostUSD       = "gen_ai.usage.cost"

	// Tool Calling (toolsy).
	ToolName   = "gen_ai.tool.name"
	ToolID     = "gen_ai.tool.id"
	ToolArgs   = "gen_ai.tool.args"   // Expected JSON string.
	ToolResult = "gen_ai.tool.result" // Tool output (truncated if over internal limit).
	ToolError  = "gen_ai.tool.error"  // True if the tool call failed.

	// RAG and Semantic Cache.
	RetrievalSource = "gen_ai.retrieval.source"
	CacheHit        = "gen_ai.cache.hit"
	EmbeddingModel  = "gen_ai.embedding.model"

	// Multi-agent and orchestration (flowy).
	AgentName    = "gen_ai.agent.name"
	AgentRole    = "gen_ai.agent.role"
	WorkflowStep = "gen_ai.workflow.step"
	PromptType   = "gen_ai.prompt.type"

	// Operation purpose for cost tracking (e.g. generation vs guard evaluation).
	OperationPurpose = "ai.operation.purpose"
)
OpenLLMetry semantic convention attribute keys. See https://openllmetry.io/ and the OpenTelemetry GenAI semantic conventions.
Note on OpenTelemetry standard evolution: Currently, metry defines its own semantic conventions for GenAI (e.g. CostUSD = "gen_ai.usage.cost"), tracking the OpenLLMetry project. The OpenTelemetry project is actively standardizing GenAI semantic conventions (semconv). Once the official OTel GenAI semconv mature and ship in the standard Go OTel packages (e.g., > 1.42), these constants should be updated to alias the official ones. Aliasing keeps the package compatible with the wider ecosystem in the long term without breaking the public `metry/genai` API.
const (
	PurposeGeneration        = "generation"
	PurposeGuardEvaluation   = "guard_evaluation"
	PurposeQualityEvaluation = "quality_evaluation"
)
Standard values for OperationPurpose.
Variables ¶
var (
	SystemKey          = attribute.Key(System)
	RequestModelKey    = attribute.Key(RequestModel)
	ResponseModelKey   = attribute.Key(ResponseModel)
	PromptKey          = attribute.Key(Prompt)
	CompletionKey      = attribute.Key(Completion)
	InputTokensKey     = attribute.Key(InputTokens)
	OutputTokensKey    = attribute.Key(OutputTokens)
	CostUSDKey         = attribute.Key(CostUSD)
	ToolNameKey        = attribute.Key(ToolName)
	ToolIDKey          = attribute.Key(ToolID)
	ToolArgsKey        = attribute.Key(ToolArgs)
	ToolResultKey      = attribute.Key(ToolResult)
	ToolErrorKey       = attribute.Key(ToolError)
	RetrievalSourceKey = attribute.Key(RetrievalSource)
	CacheHitKey        = attribute.Key(CacheHit)
	EmbeddingModelKey  = attribute.Key(EmbeddingModel)
	AgentNameKey       = attribute.Key(AgentName)
	AgentRoleKey       = attribute.Key(AgentRole)
	WorkflowStepKey    = attribute.Key(WorkflowStep)
	PromptTypeKey      = attribute.Key(PromptType)

	// Operation purpose for cost tracking (e.g. generation vs guard evaluation).
	OperationPurposeKey = attribute.Key(OperationPurpose)
)
Attribute keys as attribute.Key for type-safe span recording.
Functions ¶
func RecordAgentStep ¶ added in v0.1.2
func RecordAgentStep(span trace.Span, agentName, agentRole, step string)
RecordAgentStep records one agent step as a span event, as in a ReAct loop (Thought -> Action -> Observation). The event is named gen_ai.agent.step, and its attributes follow the OTel GenAI semantic conventions so that dashboards can consume them. Call it from flowy on each state transition; multiple calls on the same span produce a chronological list of events.
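The call pattern described above can be illustrated with a minimal sketch. It deliberately does not import the real package or OpenTelemetry: agentSpan, stepEvent, and recordAgentStep are hypothetical stand-ins, and the agent name, role, and step strings are made-up values. The only point is that one call per state transition yields an ordered event list on a single span.

```go
package main

import "fmt"

// stepEvent is a stand-in for a span event named "gen_ai.agent.step".
type stepEvent struct {
	agentName, agentRole, step string
}

// agentSpan is a stand-in for trace.Span that only collects step events.
type agentSpan struct {
	events []stepEvent
}

// recordAgentStep mimics the call shape of RecordAgentStep: each call appends
// one event, so the slice preserves the order of the state transitions.
func (s *agentSpan) recordAgentStep(agentName, agentRole, step string) {
	s.events = append(s.events, stepEvent{agentName, agentRole, step})
}

func main() {
	span := &agentSpan{}
	// One ReAct iteration: Thought -> Action -> Observation.
	for _, step := range []string{"thought", "action", "observation"} {
		span.recordAgentStep("planner", "orchestrator", step)
	}
	for i, ev := range span.events {
		fmt.Printf("%d: %s (%s) %s\n", i, ev.agentName, ev.agentRole, ev.step)
	}
}
```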
func RecordCacheHit ¶
func RecordCacheHit(span trace.Span, hit bool, source string)
RecordCacheHit records the cache-hit flag and the retrieval source on the span. Call it from the RAG layer before issuing the LLM request.
func RecordInteraction ¶
func RecordInteraction(span trace.Span, prompt, completion string)
RecordInteraction sets prompt and completion attributes on the span (OpenLLMetry conventions). Long strings are truncated to maxContextLength to protect export pipelines.
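To make the truncation concrete, here is a minimal sketch of one way to cap a string without cutting a multi-byte character in half. truncate is a hypothetical helper, not the package's implementation. Two assumptions are baked in: the limit counts bytes (the documentation does not say whether it counts bytes or runes), and a non-positive limit means "no limit".

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncate is a hypothetical helper. It caps s at limit bytes and never
// splits a multi-byte UTF-8 character. Assumption: limit <= 0 disables the cap.
func truncate(s string, limit int) string {
	if limit <= 0 || len(s) <= limit {
		return s
	}
	cut := limit
	// Back up until cut lands on the first byte of a character.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	fmt.Println(truncate("hello, world", 5)) // prints "hello"
	fmt.Println(truncate("héllo", 2))        // prints "h": the 2-byte "é" does not fit
}
```

Cutting on a character boundary keeps the attribute value valid UTF-8, which some exporters and backends require.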
func RecordTTFT ¶
func RecordTTFT(ctx context.Context, durationSeconds float64, modelName string)
RecordTTFT records the Time To First Token (in seconds) as a histogram metric with model dimension. modelName is recorded as an attribute so dashboards can show TTFT per LLM (e.g. gpt-4o vs claude-3-5). Metrics are registered automatically when metry.Init is called with a metric exporter.
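The durationSeconds argument has to be measured by the caller. The sketch below shows one way to obtain it from a streaming response modelled as a channel. firstTokenLatency is a hypothetical helper, and the clock is assumed to start when the helper is called, i.e. right after the request is issued. The resulting value is what one would pass to RecordTTFT together with a context and the model name.

```go
package main

import (
	"fmt"
	"time"
)

// firstTokenLatency blocks until the first token arrives on tokens and returns
// that token with the elapsed time in seconds. ok is false if the stream was
// closed without producing any token.
func firstTokenLatency(tokens <-chan string) (first string, seconds float64, ok bool) {
	start := time.Now()
	first, open := <-tokens
	if !open {
		return "", 0, false
	}
	return first, time.Since(start).Seconds(), true
}

func main() {
	tokens := make(chan string)
	go func() {
		defer close(tokens)
		time.Sleep(20 * time.Millisecond) // simulated model latency
		tokens <- "Hello"
		tokens <- ", world"
	}()

	first, seconds, ok := firstTokenLatency(tokens)
	if !ok {
		fmt.Println("stream produced no tokens")
		return
	}
	// With the real package, seconds would be the durationSeconds argument.
	fmt.Printf("first token %q after %.3fs\n", first, seconds)

	for range tokens {
		// Drain the remaining tokens so the producer goroutine can exit.
	}
}
```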
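func RecordToolResult ¶ added in v0.1.2
func RecordToolResult(span trace.Span, resultJSON string, isError bool)
RecordToolResult records the result of a tool call on its own span (the one returned by StartToolSpan). resultJSON is truncated if it exceeds the internal limit; when isError is true, the span status is set to Error so the failure is visible on dashboards.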
func RecordUsage ¶
func RecordUsage(ctx context.Context, span trace.Span, inTokens, outTokens int, costUSD float64)
RecordUsage sets token usage and cost attributes on the span and, when metry.Init was called with a metric exporter, increments the OTel counters. The purpose defaults to PurposeGeneration.
func RecordUsageWithPurpose ¶ added in v0.1.1
func RecordUsageWithPurpose(ctx context.Context, span trace.Span, inTokens, outTokens int, costUSD float64, purpose string)
RecordUsageWithPurpose records usage with an explicit purpose so metrics can be split by generation vs guard_evaluation vs quality_evaluation (billing, dashboards).
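The effect of splitting by purpose can be sketched with a small in-memory aggregator. Everything here is a stand-in: usageByPurpose and its methods are hypothetical, and the real functions write to spans and OTel counters rather than to a map. The sketch mirrors only the documented relationship, in which the plain variant behaves like the explicit one called with the generation purpose.

```go
package main

import "fmt"

// Local copies of two documented purpose values.
const (
	purposeGeneration      = "generation"
	purposeGuardEvaluation = "guard_evaluation"
)

// usage accumulates token counts and cost for one purpose.
type usage struct {
	inTokens, outTokens int
	costUSD             float64
}

// usageByPurpose is a stand-in for counters that carry a purpose dimension.
type usageByPurpose map[string]usage

// recordWithPurpose adds one call's usage to the bucket for purpose.
func (u usageByPurpose) recordWithPurpose(in, out int, cost float64, purpose string) {
	t := u[purpose]
	t.inTokens += in
	t.outTokens += out
	t.costUSD += cost
	u[purpose] = t
}

// record mirrors the documented default: the purpose is "generation".
func (u usageByPurpose) record(in, out int, cost float64) {
	u.recordWithPurpose(in, out, cost, purposeGeneration)
}

func main() {
	totals := usageByPurpose{}
	totals.record(100, 20, 0.5)                                    // main completion
	totals.recordWithPurpose(50, 5, 0.25, purposeGuardEvaluation) // guard check
	for purpose, t := range totals {
		fmt.Printf("%s: in=%d out=%d cost=$%.2f\n", purpose, t.inTokens, t.outTokens, t.costUSD)
	}
}
```

Keeping the purpose as a separate dimension lets billing reports distinguish what was spent on user-facing generation from what was spent on guard or quality checks.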
func SetMaxContextLength ¶ added in v0.1.3
func SetMaxContextLength(limit int32)
SetMaxContextLength is an internal API. DO NOT use directly; configure via metry.WithMaxGenAIContextLength.
func StartToolSpan ¶ added in v0.1.3
func StartToolSpan(ctx context.Context, toolName, toolID, argsJSON string) (context.Context, trace.Span)
StartToolSpan creates a child span for a tool invocation. The caller MUST call span.End() (e.g. via defer). Use it for parallel tool calls so that each call has its own span and timing.
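The parallel pattern can be sketched without the real tracing API. In the code below, toolSpan, startToolSpan, and runTools are hypothetical stand-ins; the real function also takes and returns a context, plus a tool ID and JSON arguments, all of which are omitted here. What the sketch shows is the shape of the usage: one span per goroutine, ended through defer, so each tool call gets its own timing.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// toolSpan is a stand-in for trace.Span that records only its own timing.
type toolSpan struct {
	name     string
	start    time.Time
	duration time.Duration
	ended    bool
}

// startToolSpan mimics starting a child span for one tool invocation.
func startToolSpan(name string) *toolSpan {
	return &toolSpan{name: name, start: time.Now()}
}

// End closes the span and fixes its duration; repeated calls are no-ops.
func (s *toolSpan) End() {
	if s.ended {
		return
	}
	s.duration = time.Since(s.start)
	s.ended = true
}

// runTools invokes call once per tool name, each in its own goroutine with
// its own span. Each goroutine writes only to its own slice index.
func runTools(names []string, call func(name string)) []*toolSpan {
	spans := make([]*toolSpan, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			span := startToolSpan(name)
			defer span.End() // runs before wg.Done, so every span is ended after Wait
			spans[i] = span
			call(name)
		}(i, name)
	}
	wg.Wait()
	return spans
}

func main() {
	spans := runTools([]string{"search", "calculator"}, func(name string) {
		time.Sleep(10 * time.Millisecond) // simulated tool work
	})
	for _, s := range spans {
		fmt.Printf("%s took %v\n", s.name, s.duration)
	}
}
```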
Types ¶
This section is empty.