telemetry

package
v1.1.3 Latest
Warning: This package is not in the latest version of its module.
Published: Apr 29, 2026 License: Apache-2.0 Imports: 18 Imported by: 0

Documentation

Overview

Package telemetry harvests Copilot CLI telemetry from the workspace-scoped .copilot/ directory, correlates events with backlogit task completions, attributes tool calls to their originating MCP servers, and exposes metrics through backlogit's existing SQL query surface.

Constitutional note: read-only access to .copilot/ is a documented exception to Principle IV (Workspace Containment). All writes target .backlogit/.

Constants

const DefaultContextLimit = 200000

DefaultContextLimit is the conservative fallback used for unknown models.

Variables

var ModelContextLimits = map[string]int{
	"claude-sonnet-4":   200000,
	"claude-sonnet-4.5": 200000,
	"claude-haiku-4.5":  200000,
	"claude-opus-4":     200000,
	"claude-opus-4.5":   200000,
	"gpt-4.1":           1000000,
	"gpt-4.1-mini":      500000,
	"gpt-5":             1000000,
	"gpt-5.1":           1000000,
	"o4-mini":           200000,
}

ModelContextLimits maps model identifiers to their maximum context window size in tokens. Used to derive utilisation metrics when raw context window data is unavailable from log sources.

Unknown models fall back to DefaultContextLimit. The table is a single var declaration and trivially extensible as new models ship.

Functions

func AttributeTool

func AttributeTool(toolName string) string

AttributeTool maps a tool name to its originating MCP server name using longest-prefix-first matching against the default prefix registry. Exact tool names (e.g. "view", "edit") are matched before prefix patterns. Unknown tool names resolve to "unknown".
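
The matching order described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the registry contents (`exactTools`, `prefixServers`), the server name "builtin", and the helper `attributeTool` are hypothetical, since the default prefix registry is not shown in this documentation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// exactTools and prefixServers are hypothetical registries for illustration;
// the package's real default registry is not documented here.
var exactTools = map[string]string{
	"view": "builtin",
	"edit": "builtin",
}

var prefixServers = map[string]string{
	"backlogit-":       "backlogit",
	"backlogit-admin-": "backlogit-admin",
}

// attributeTool checks exact names first, then tries prefixes from longest
// to shortest, and falls back to "unknown".
func attributeTool(toolName string) string {
	if server, ok := exactTools[toolName]; ok {
		return server
	}
	prefixes := make([]string, 0, len(prefixServers))
	for p := range prefixServers {
		prefixes = append(prefixes, p)
	}
	// Longest prefix first, so the most specific pattern wins.
	sort.Slice(prefixes, func(i, j int) bool {
		return len(prefixes[i]) > len(prefixes[j])
	})
	for _, p := range prefixes {
		if strings.HasPrefix(toolName, p) {
			return prefixServers[p]
		}
	}
	return "unknown"
}

func main() {
	names := []string{"view", "backlogit-admin-reset", "backlogit-list", "mystery"}
	for _, name := range names {
		fmt.Printf("%s -> %s\n", name, attributeTool(name))
	}
}
```

Sorting by descending length means "backlogit-admin-reset" resolves to the more specific server even though the shorter "backlogit-" prefix also matches.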

func ContextLimitForModel

func ContextLimitForModel(model string) int

ContextLimitForModel returns the maximum context window size for model. Falls back to DefaultContextLimit for unknown models.
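
A lookup with fallback of this kind might look like the sketch below. The lowercase names are stand-ins defined locally for illustration, and the table is a trimmed copy of the documented values.

```go
package main

import "fmt"

// defaultContextLimit mirrors the documented DefaultContextLimit.
const defaultContextLimit = 200000

// modelContextLimits is a trimmed copy of the documented table.
var modelContextLimits = map[string]int{
	"claude-sonnet-4": 200000,
	"gpt-4.1":         1000000,
}

// contextLimitForModel is a hypothetical stand-in for ContextLimitForModel:
// a map lookup that falls back to the default for unknown models.
func contextLimitForModel(model string) int {
	if limit, ok := modelContextLimits[model]; ok {
		return limit
	}
	return defaultContextLimit
}

func main() {
	fmt.Println(contextLimitForModel("gpt-4.1"))
	fmt.Println(contextLimitForModel("some-future-model"))
}
```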

func GenerateReport

func GenerateReport(workspacePath string, opts ReportOptions) (string, error)

GenerateReport reads telemetry-sessions.jsonl from <workspacePath>/.backlogit/ and produces a formatted report string according to opts. Returns an informative message (not an error) when no harvested data exists.

func LoadSessionEvents

func LoadSessionEvents(sessionStateDir string) (map[string][]CompactionEvent, error)

LoadSessionEvents reads all events.jsonl files found directly under sessionStateDir (one level deep, one file per session directory) and returns a map of sessionID → CompactionEvents. The session ID is inferred from the parent directory name.

func ReadSessionStore

func ReadSessionStore(dbPath string) (map[string]SessionMeta, error)

ReadSessionStore opens the session-store.db SQLite database at dbPath (read-only) and returns a map of sessionID → SessionMeta with branch, repository, work directory, and timing data. Returns an empty map (not an error) when the database is missing or inaccessible — graceful fallback per Plan Review F3.

func SaveCheckpoint

func SaveCheckpoint(workspacePath string, cp *HarvestCheckpoint) error

SaveCheckpoint atomically writes cp to <workspacePath>/.backlogit/.telemetry-checkpoint.json via temp-file-then-rename.
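
The temp-file-then-rename pattern named above can be sketched as follows. The `checkpoint` type is a trimmed local stand-in, and `writeAtomically` and `roundTrip` are hypothetical helpers; the package's own error handling and file permissions may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// checkpoint is a trimmed stand-in for HarvestCheckpoint.
type checkpoint struct {
	FileOffsets map[string]int64 `json:"file_offsets"`
	Version     int              `json:"version"`
}

// writeAtomically writes to a temp file in the destination directory and
// renames it into place, so readers never observe a half-written file.
func writeAtomically(path string, cp checkpoint) error {
	data, err := json.MarshalIndent(cp, "", "  ")
	if err != nil {
		return err
	}
	tmp, err := os.CreateTemp(filepath.Dir(path), ".checkpoint-*.tmp")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		os.Remove(tmpName)
		return err
	}
	if err := tmp.Close(); err != nil {
		os.Remove(tmpName)
		return err
	}
	if err := os.Rename(tmpName, path); err != nil {
		os.Remove(tmpName)
		return err
	}
	return nil
}

// roundTrip writes cp into a throwaway directory and reads it back.
func roundTrip(cp checkpoint) (checkpoint, error) {
	dir, err := os.MkdirTemp("", "checkpoint-demo")
	if err != nil {
		return checkpoint{}, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, ".telemetry-checkpoint.json")
	if err := writeAtomically(path, cp); err != nil {
		return checkpoint{}, err
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		return checkpoint{}, err
	}
	var got checkpoint
	err = json.Unmarshal(raw, &got)
	return got, err
}

func main() {
	cp := checkpoint{FileOffsets: map[string]int64{"process.log": 4096}, Version: 1}
	got, err := roundTrip(cp)
	fmt.Println(got, err)
}
```

Creating the temp file in the destination directory keeps the rename on a single filesystem, which is what makes the replacement atomic on most platforms.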

func ValidateSessionSummary added in v1.1.1

func ValidateSessionSummary(s SessionSummary) bool

ValidateSessionSummary reports whether s represents a complete, harvestable session. A session is invalid (partial) when it has tool calls recorded but zero tokens — this occurs when oversized log entries are dropped by the parser, leaving tool calls without corresponding model-call token attribution.

Callers in the harvest pipeline must reject invalid sessions to prevent partial zero-token records from being written to telemetry-sessions.jsonl.
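
As a rough sketch of the rule, assuming "zero tokens" refers to the TotalTokens field, a caller might filter sessions like this. The `sessionSummary` type is a trimmed local stand-in, and `isHarvestable` and `keepValid` are hypothetical names.

```go
package main

import "fmt"

// sessionSummary is a trimmed stand-in for SessionSummary.
type sessionSummary struct {
	SessionID   string
	TotalTokens int
	ToolCalls   int
}

// isHarvestable rejects sessions that recorded tool calls but no tokens.
// Assumption: "zero tokens" is checked against TotalTokens.
func isHarvestable(s sessionSummary) bool {
	return !(s.ToolCalls > 0 && s.TotalTokens == 0)
}

// keepValid drops partial sessions before they are written out.
func keepValid(in []sessionSummary) []sessionSummary {
	out := make([]sessionSummary, 0, len(in))
	for _, s := range in {
		if isHarvestable(s) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	sessions := []sessionSummary{
		{SessionID: "complete", TotalTokens: 1200, ToolCalls: 3},
		{SessionID: "partial", TotalTokens: 0, ToolCalls: 2},
		{SessionID: "idle", TotalTokens: 0, ToolCalls: 0},
	}
	for _, s := range keepValid(sessions) {
		fmt.Println(s.SessionID)
	}
}
```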

Types

type CompactionEvent

type CompactionEvent struct {
	Timestamp           string `json:"timestamp"`
	PreCompactionTokens int    `json:"preCompactionTokens"`
	InputTokens         int    `json:"input"`
	OutputTokens        int    `json:"output"`
	CachedInputTokens   int    `json:"cachedInput"`
}

CompactionEvent holds data from a session.compaction_complete event in a session-state events.jsonl file.

func ParseSessionEvents

func ParseSessionEvents(r io.Reader) ([]CompactionEvent, error)

ParseSessionEvents reads a session-state events.jsonl stream and extracts CompactionEvents from session.compaction_complete entries. Malformed or non-compaction lines are skipped. Returns an empty slice (not an error) when no compaction events are present.

type ContextWindowMetrics

type ContextWindowMetrics struct {
	// PeakUtilization is the highest prompt_tokens/max_tokens ratio observed
	// across all model calls in the session (0.0–1.0+).
	PeakUtilization float64 `json:"peak_utilization"`
	// RemainingCapacity is max_tokens - peak_prompt_tokens at the point of peak
	// utilisation.
	RemainingCapacity int `json:"remaining_capacity"`
	// DepletionRate is total_tokens / model_calls (average tokens consumed per
	// model turn).
	DepletionRate float64 `json:"depletion_rate"`
	// MaxContextTokens is the model-limit value used for the calculations.
	MaxContextTokens int `json:"max_context_tokens"`
	// PeakPromptTokens is the highest prompt token count observed in any single
	// model call.
	PeakPromptTokens int `json:"peak_prompt_tokens"`
	// CompactionCount is the number of context-compaction events recorded for
	// this session.
	CompactionCount int `json:"compaction_count"`
}

ContextWindowMetrics holds derived context utilisation for a single session.

func ComputeContextMetrics

func ComputeContextMetrics(modelCalls []ModelCall, compactionEvents []CompactionEvent) *ContextWindowMetrics

ComputeContextMetrics derives context window utilisation from a slice of model calls and compaction events. Returns nil when modelCalls is empty to avoid division by zero and to distinguish "no data" from "zero utilisation".
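
The field comments on ContextWindowMetrics give the formulas, which can be worked through in a short sketch. The types are trimmed local stand-ins, and passing the context limit and compaction count as plain parameters is an assumption made to keep the example self-contained.

```go
package main

import "fmt"

// modelCall is a trimmed stand-in for ModelCall.
type modelCall struct {
	PromptTokens int
	TotalTokens  int
}

// contextMetrics mirrors the fields of ContextWindowMetrics.
type contextMetrics struct {
	PeakUtilization   float64
	RemainingCapacity int
	DepletionRate     float64
	MaxContextTokens  int
	PeakPromptTokens  int
	CompactionCount   int
}

// computeMetrics applies the documented formulas. It returns nil for an
// empty slice; the maxTokens guard is an extra safety check in this sketch.
func computeMetrics(calls []modelCall, compactionCount, maxTokens int) *contextMetrics {
	if len(calls) == 0 || maxTokens <= 0 {
		return nil
	}
	peak, total := 0, 0
	for _, c := range calls {
		if c.PromptTokens > peak {
			peak = c.PromptTokens
		}
		total += c.TotalTokens
	}
	return &contextMetrics{
		PeakUtilization:   float64(peak) / float64(maxTokens),
		RemainingCapacity: maxTokens - peak,
		DepletionRate:     float64(total) / float64(len(calls)),
		MaxContextTokens:  maxTokens,
		PeakPromptTokens:  peak,
		CompactionCount:   compactionCount,
	}
}

func main() {
	calls := []modelCall{
		{PromptTokens: 50000, TotalTokens: 60000},
		{PromptTokens: 150000, TotalTokens: 160000},
	}
	m := computeMetrics(calls, 1, 200000)
	fmt.Printf("%+v\n", *m)
}
```

With these inputs the peak prompt is 150000 of 200000 tokens, so utilisation is 0.75, remaining capacity is 50000, and the depletion rate is 220000 / 2 = 110000 tokens per model call.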

type CopilotCLIParser

type CopilotCLIParser struct{}

CopilotCLIParser parses Copilot CLI process log files by scanning line-by-line for cli.model_call and cli.tool_call JSON telemetry events. Supports both old single-line format (telemetry) and new multi-line format ([Telemetry]). Malformed lines are skipped with a slog debug log rather than aborting the parse.

func NewCopilotCLIParser

func NewCopilotCLIParser() *CopilotCLIParser

NewCopilotCLIParser returns a new CopilotCLIParser.

func (*CopilotCLIParser) Parse

func (p *CopilotCLIParser) Parse(r io.Reader, emit func(TelemetryEvent) error) error

Parse scans r line-by-line for Copilot CLI telemetry events, calling emit for each valid event found. Supports both old single-line telemetry format and new multi-line [Telemetry] format where JSON is spread across subsequent lines.

type EventKind

type EventKind string

EventKind identifies the type of a telemetry event.

const (
	// EventKindModelCall represents a cli.model_call event from Copilot CLI logs.
	EventKindModelCall EventKind = "cli.model_call"
	// EventKindToolCall represents a cli.tool_call event from Copilot CLI logs.
	EventKindToolCall EventKind = "cli.tool_call"
)

type HarvestCheckpoint

type HarvestCheckpoint struct {
	// FileOffsets maps a log file base-name to the byte offset from which the
	// next harvest should resume.
	FileOffsets map[string]int64 `json:"file_offsets"`
	// LastHarvest is the wall-clock time of the most recent successful harvest.
	LastHarvest time.Time `json:"last_harvest"`
	// Version is a schema version for forward-compatibility checks.
	Version int `json:"version"`
}

HarvestCheckpoint tracks the last-read byte offset per log file to enable incremental harvest. Stored as JSON in .backlogit/.telemetry-checkpoint.json.

func LoadCheckpoint

func LoadCheckpoint(workspacePath string) (*HarvestCheckpoint, error)

LoadCheckpoint reads the harvest checkpoint from <workspacePath>/.backlogit/.telemetry-checkpoint.json. Returns a zero-value checkpoint when the file does not exist or contains malformed JSON — the checkpoint is derived state so a missing file is not an error.
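
The tolerant loading behaviour might be sketched as below. The `checkpoint` type is a trimmed stand-in and the helper names are hypothetical; how the real function treats read errors other than a missing file is not documented, so returning them is an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// checkpoint is a trimmed stand-in for HarvestCheckpoint.
type checkpoint struct {
	FileOffsets map[string]int64 `json:"file_offsets"`
	Version     int              `json:"version"`
}

// decodeOrZero returns the zero value when data is not valid checkpoint JSON.
func decodeOrZero(data []byte) checkpoint {
	var cp checkpoint
	if err := json.Unmarshal(data, &cp); err != nil {
		return checkpoint{}
	}
	return cp
}

// loadCheckpoint treats a missing file as "start from scratch" rather than
// as a failure. Other read errors are returned (an assumption here).
func loadCheckpoint(path string) (checkpoint, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return checkpoint{}, nil
	}
	if err != nil {
		return checkpoint{}, err
	}
	return decodeOrZero(data), nil
}

func main() {
	cp, err := loadCheckpoint("no-such-dir/.telemetry-checkpoint.json")
	fmt.Println(cp, err)
	fmt.Println(decodeOrZero([]byte(`{"version":2,"file_offsets":{"a.log":10}}`)))
}
```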

type HarvestOptions

type HarvestOptions struct {
	// Force ignores any saved checkpoint and re-processes all log files from
	// byte offset 0, overwriting telemetry-sessions.jsonl on completion.
	Force bool
	// Since, when non-nil, excludes events whose timestamp precedes this value.
	// Events with unparseable timestamps are always included (safe default).
	Since *time.Time
}

HarvestOptions configures a telemetry harvest run. Both fields are optional; the zero value applies no time filter and resumes from the saved checkpoint when one exists (a full harvest otherwise), identical to the prior behaviour.

type HarvestResult

type HarvestResult struct {
	SessionsHarvested int
	ToolCallsIndexed  int
	TotalTokens       int
}

HarvestResult summarises the outcome of a telemetry harvest run.

func HarvestTelemetry

func HarvestTelemetry(ctx context.Context, workspacePath, copilotPath string, sqlDB *sql.DB, opts HarvestOptions) (HarvestResult, error)

HarvestTelemetry is the top-level harvest orchestrator. It:

  1. Loads (or creates) the harvest checkpoint from workspacePath/.backlogit/
  2. Parses process logs from copilotPath (.copilot/logs/) starting at saved offsets
  3. Loads session metadata from copilotPath (.copilot/session-state/, session-store.db)
  4. Correlates events with backlogit task completions via per-item logs in workspacePath/.backlogit/logs/
  5. Computes context window utilisation per session
  6. Attributes tool calls to MCP servers via the attribution registry
  7. Merges new sessions with prior JSONL records (incremental) or replaces (Force)
  8. Writes typed records to workspacePath/.backlogit/telemetry-sessions.jsonl
  9. Triggers RehydrateTelemetry to rebuild SQLite telemetry tables
  10. Saves the updated checkpoint for the next run

When opts.Force is false and a checkpoint exists, only new log data (by byte offset) is parsed and merged with existing JSONL. When opts.Force is true, all logs are re-processed from offset 0 and the JSONL is overwritten. When opts.Since is set, events whose timestamp precedes that value are excluded; events with unparseable timestamps are always included.

Returns ErrTelemetrySourceMissing when copilotPath does not exist.
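
Resuming a log from a saved byte offset (step 2 above) can be illustrated with a small sketch. `readFromOffset` and `tailOfString` are hypothetical helpers; the real harvester streams through a parser rather than reading everything into memory.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readFromOffset seeks to offset, reads the remaining bytes, and returns
// them together with the offset to store in the next checkpoint.
func readFromOffset(r io.ReadSeeker, offset int64) ([]byte, int64, error) {
	if _, err := r.Seek(offset, io.SeekStart); err != nil {
		return nil, offset, err
	}
	data, err := io.ReadAll(r)
	if err != nil {
		return nil, offset, err
	}
	return data, offset + int64(len(data)), nil
}

// tailOfString is a convenience wrapper for in-memory demonstrations.
func tailOfString(content string, offset int64) (string, int64, error) {
	data, next, err := readFromOffset(strings.NewReader(content), offset)
	return string(data), next, err
}

func main() {
	log := "line1\nline2\n"
	// A previous run consumed the first 6 bytes ("line1\n").
	fresh, next, err := tailOfString(log, 6)
	fmt.Printf("%q %d %v\n", fresh, next, err)
}
```

Passing offset 0 reproduces the Force behaviour for a single file: everything is read again from the start.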

type LogParser

type LogParser interface {
	Parse(r io.Reader, emit func(TelemetryEvent) error) error
}

LogParser is the streaming interface for parsing telemetry log sources. Implementations call emit for each parsed TelemetryEvent. Malformed input lines are skipped with a slog debug log, and parsing continues with the next line.
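
To show the shape of the streaming contract, here is a toy implementation against locally defined stand-in types. The "kind tokens" line format is invented for this example and is not the Copilot CLI log format; propagating an error returned by emit is also an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// event and logParser are local stand-ins for TelemetryEvent and LogParser.
type event struct {
	Kind   string
	Tokens int
}

type logParser interface {
	Parse(r io.Reader, emit func(event) error) error
}

// toyParser reads a made-up "<kind> <tokens>" line format.
type toyParser struct{}

func (toyParser) Parse(r io.Reader, emit func(event) error) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue // malformed line: skip and keep going
		}
		n, err := strconv.Atoi(fields[1])
		if err != nil {
			continue // malformed token count: skip and keep going
		}
		if err := emit(event{Kind: fields[0], Tokens: n}); err != nil {
			return err // assumption: an emit error stops the parse
		}
	}
	return sc.Err()
}

// collect gathers every emitted event into a slice.
func collect(p logParser, input string) ([]event, error) {
	var out []event
	err := p.Parse(strings.NewReader(input), func(e event) error {
		out = append(out, e)
		return nil
	})
	return out, err
}

func main() {
	input := "cli.model_call 120\ngarbage\ncli.tool_call x\ncli.tool_call 7\n"
	events, err := collect(toyParser{}, input)
	fmt.Println(events, err)
}
```

The callback style lets a caller aggregate or filter events without the parser holding the whole log in memory.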

type ModelCall

type ModelCall struct {
	SessionID        string `json:"session_id"`
	RequestID        string `json:"request_id"`
	Model            string `json:"model"`
	PromptTokens     int    `json:"prompt_tokens_count"`
	CompletionTokens int    `json:"completion_tokens_count"`
	TotalTokens      int    `json:"total_tokens_count"`
	CachedTokens     int    `json:"cached_tokens_count"`
	DurationMs       int    `json:"duration_ms"`
}

ModelCall holds data extracted from a cli.model_call telemetry event.

type ReportFormat added in v1.1.0

type ReportFormat string

ReportFormat identifies a supported report output encoding.

const (
	// FormatTable renders aligned plaintext tables.
	FormatTable ReportFormat = "table"
	// FormatJSON renders JSON output.
	FormatJSON ReportFormat = "json"
	// FormatMarkdown renders GitHub-flavored Markdown tables.
	FormatMarkdown ReportFormat = "markdown"
)

type ReportOptions

type ReportOptions struct {
	// SessionID, when non-empty, restricts the report to a single session.
	SessionID string
	// GroupBy controls the aggregation dimension. Valid values: "session",
	// "server".
	GroupBy string
	// Format controls the output encoding. Valid values: "table", "json", "markdown".
	Format ReportFormat
	// Limit, when > 0, restricts the number of rows returned. Used by the
	// "top" subcommand to implement top-N behaviour.
	Limit int
}

ReportOptions configures the telemetry report output produced by GenerateReport.

type SessionMeta

type SessionMeta struct {
	SessionID        string            `json:"session_id"`
	Branch           string            `json:"branch"`
	Repository       string            `json:"repository"`
	WorkDir          string            `json:"work_dir"`
	StartedAt        string            `json:"started_at"`
	CompactionEvents []CompactionEvent `json:"compaction_events"`
}

SessionMeta aggregates session metadata merged from session-state events.jsonl and the session-store.db SQLite database.

type SessionSummary

type SessionSummary struct {
	SessionID        string         `json:"session_id"`
	Branch           string         `json:"branch"`
	Repository       string         `json:"repository"`
	TotalTokens      int            `json:"total_tokens"`
	PromptTokens     int            `json:"prompt_tokens"`
	CompletionTokens int            `json:"completion_tokens"`
	CachedTokens     int            `json:"cached_tokens"`
	ModelCalls       int            `json:"model_calls"`
	ToolCalls        int            `json:"tool_calls"`
	TokensByModel    map[string]int `json:"tokens_by_model"`
	// TokensByServer maps server name to server name, acting as an attribution set.
	// Populated via AttributeTool prefix matching on tool call names.
	TokensByServer   map[string]string `json:"tokens_by_server"`
	CompletedTasks   []string          `json:"completed_tasks"`
	TokensPerTask    *float64          `json:"tokens_per_task"`
	CompactionEvents []CompactionEvent `json:"compaction_events"`
	// ContextWindow holds derived context utilisation for this session.
	// Nil when no model calls are recorded or model is unknown.
	ContextWindow *ContextWindowMetrics `json:"context_window,omitempty"`
}

SessionSummary holds fully correlated telemetry for a single session.

func Correlate

func Correlate(ctx context.Context, events []TelemetryEvent, metas map[string]SessionMeta, workspacePath string) ([]SessionSummary, error)

Correlate joins model calls, tool calls, session metadata, and backlogit task completions into per-session SessionSummary records.

Task completions are detected by scanning per-item log files under .backlogit/logs/ for status_changed events where delta.to == "done". Sessions with no task completions report TokensPerTask as nil.
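
The nil-versus-value behaviour of TokensPerTask can be sketched as below. The formula (total tokens divided by the number of completed tasks) is an assumption inferred from the field name, and `tokensPerTask` is a hypothetical helper.

```go
package main

import "fmt"

// tokensPerTask returns nil when no tasks were completed, so "no data"
// stays distinguishable from a computed ratio.
// Assumption: the ratio is totalTokens / len(completedTasks).
func tokensPerTask(totalTokens int, completedTasks []string) *float64 {
	if len(completedTasks) == 0 {
		return nil
	}
	v := float64(totalTokens) / float64(len(completedTasks))
	return &v
}

func main() {
	if v := tokensPerTask(1500, []string{"T-1", "T-2", "T-3"}); v != nil {
		fmt.Println(*v)
	}
	fmt.Println(tokensPerTask(1500, nil) == nil)
}
```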

type SessionSummaryRecord

type SessionSummaryRecord struct {
	RecordType        string         `json:"record_type"` // "session_summary"
	HarvestedAt       time.Time      `json:"harvested_at"`
	SessionID         string         `json:"session_id"`
	Branch            string         `json:"branch"`
	Repository        string         `json:"repository"`
	TotalTokens       int            `json:"total_tokens"`
	PromptTokens      int            `json:"prompt_tokens"`
	CompletionTokens  int            `json:"completion_tokens"`
	CachedTokens      int            `json:"cached_tokens"`
	ModelCalls        int            `json:"model_calls"`
	ToolCalls         int            `json:"tool_calls"`
	TokensByModel     map[string]int `json:"tokens_by_model"`
	ToolCallsByServer map[string]int `json:"tool_calls_by_server"`
	CompletedTasks    []string       `json:"completed_tasks"`
	TokensPerTask     *float64       `json:"tokens_per_task"`
	CompactionCount   int            `json:"compaction_count"`
	// Context window metrics — nil when model calls are unavailable.
	PeakUtilization   *float64 `json:"peak_utilization,omitempty"`
	RemainingCapacity *int     `json:"remaining_capacity,omitempty"`
	DepletionRate     *float64 `json:"depletion_rate,omitempty"`
	MaxContextTokens  *int     `json:"max_context_tokens,omitempty"`
}

SessionSummaryRecord is the typed JSONL record written to .backlogit/telemetry-sessions.jsonl. One record per session per harvest run.

Using typed struct fields instead of map[string]any enforces the contract that token counts are always integers and server/model maps are always map[string]int (Plan Review F4).
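
A typed record serialises to one JSON object per line with the standard library encoder. The sketch below uses a trimmed stand-in, `sessionRecord`, with a few of the documented fields; `encodeJSONL` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// sessionRecord is a trimmed stand-in for SessionSummaryRecord.
type sessionRecord struct {
	RecordType      string   `json:"record_type"`
	SessionID       string   `json:"session_id"`
	TotalTokens     int      `json:"total_tokens"`
	PeakUtilization *float64 `json:"peak_utilization,omitempty"`
}

// encodeJSONL writes one JSON object per line; json.Encoder appends the
// newline after each record.
func encodeJSONL(records []sessionRecord) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	for _, r := range records {
		if err := enc.Encode(r); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	peak := 0.75
	out, err := encodeJSONL([]sessionRecord{
		{RecordType: "session_summary", SessionID: "s1", TotalTokens: 42},
		{RecordType: "session_summary", SessionID: "s2", TotalTokens: 7, PeakUtilization: &peak},
	})
	fmt.Print(out)
	fmt.Println(err)
}
```

The nil pointer with `omitempty` drops the key entirely, which matches the documented use of pointer fields for metrics that may be unavailable.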

type TelemetryEvent

type TelemetryEvent struct {
	Kind      EventKind  `json:"kind"`
	ModelCall *ModelCall `json:"model_call,omitempty"`
	ToolCall  *ToolCall  `json:"tool_call,omitempty"`
	// Timestamp is the wall-clock time extracted from the log-line prefix.
	// Zero when unavailable or malformed.
	Timestamp time.Time `json:"timestamp,omitempty"`
}

TelemetryEvent is a union of model and tool telemetry events parsed from a Copilot CLI process log. Exactly one of ModelCall or ToolCall is set. Timestamp is populated from the log-line prefix (e.g. "2026-04-09T00:00:02.000Z") and used for --since filtering. It is zero when the prefix is absent or unparseable.
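
The interaction between a zero Timestamp and the Since filter can be sketched as follows. `parsePrefix` and `keepEvent` are hypothetical helpers, and parsing the prefix as RFC 3339 is an assumption based on the example timestamp above.

```go
package main

import (
	"fmt"
	"time"
)

// parsePrefix returns the zero time when the prefix cannot be parsed.
// Assumption: the log-line prefix is an RFC 3339 timestamp.
func parsePrefix(s string) time.Time {
	t, err := time.Parse(time.RFC3339Nano, s)
	if err != nil {
		return time.Time{}
	}
	return t
}

// keepEvent applies the documented Since rule: events before since are
// dropped, and events with a zero (unparseable) timestamp are always kept.
func keepEvent(ts time.Time, since *time.Time) bool {
	if since == nil || ts.IsZero() {
		return true
	}
	return !ts.Before(*since)
}

func main() {
	since := parsePrefix("2026-04-09T00:00:02.000Z")
	for _, raw := range []string{
		"2026-04-09T00:00:01.000Z",
		"2026-04-09T00:00:03.000Z",
		"not a timestamp",
	} {
		fmt.Println(raw, keepEvent(parsePrefix(raw), &since))
	}
}
```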

type ToolCall

type ToolCall struct {
	SessionID   string `json:"session_id"`
	ModelCallID string `json:"model_call_id"`
	ToolName    string `json:"tool_name"`
	ResultType  string `json:"result_type"`
	DurationMs  int    `json:"duration_ms"`
}

ToolCall holds data extracted from a cli.tool_call telemetry event.

type ToolUsageRecord

type ToolUsageRecord struct {
	RecordType  string    `json:"record_type"` // "tool_usage"
	HarvestedAt time.Time `json:"harvested_at"`
	SessionID   string    `json:"session_id"`
	ServerName  string    `json:"server_name"`
	ToolName    string    `json:"tool_name"`
	CallCount   int       `json:"call_count"`
	TotalDurMs  int       `json:"total_duration_ms"`
}

ToolUsageRecord is the typed JSONL record for per-server tool call counts within a single session. The composite (session_id, server_name, tool_name) is unique per harvest (Plan Review F7).
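
One way to obtain a unique row per (session_id, server_name, tool_name) is to aggregate on a struct key, as in this sketch. The types are trimmed local stand-ins, and the attribution function is passed in as a parameter so the example does not depend on the real registry.

```go
package main

import "fmt"

// toolCall is a trimmed stand-in for ToolCall.
type toolCall struct {
	SessionID  string
	ToolName   string
	DurationMs int
}

// usageKey is the composite key that must be unique per harvest.
type usageKey struct {
	SessionID  string
	ServerName string
	ToolName   string
}

// usageTotals carries the aggregated counters for one key.
type usageTotals struct {
	CallCount  int
	TotalDurMs int
}

// aggregate groups calls by composite key, summing counts and durations.
func aggregate(calls []toolCall, attribute func(string) string) map[usageKey]usageTotals {
	out := make(map[usageKey]usageTotals)
	for _, c := range calls {
		k := usageKey{SessionID: c.SessionID, ServerName: attribute(c.ToolName), ToolName: c.ToolName}
		t := out[k]
		t.CallCount++
		t.TotalDurMs += c.DurationMs
		out[k] = t
	}
	return out
}

func main() {
	calls := []toolCall{
		{SessionID: "s1", ToolName: "view", DurationMs: 10},
		{SessionID: "s1", ToolName: "view", DurationMs: 5},
		{SessionID: "s2", ToolName: "view", DurationMs: 1},
	}
	// The attribution function here is a placeholder for AttributeTool.
	totals := aggregate(calls, func(string) string { return "builtin" })
	for k, v := range totals {
		fmt.Printf("%+v %+v\n", k, v)
	}
}
```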
