Documentation ¶
Overview ¶
Package telemetry harvests Copilot CLI telemetry from the workspace-scoped .copilot/ directory, correlates events with backlogit task completions, attributes tool calls to their originating MCP servers, and exposes metrics through backlogit's existing SQL query surface.
Constitutional note: read-only access to .copilot/ is a documented exception to Principle IV (Workspace Containment). All writes target .backlogit/.
Index ¶
- Constants
- Variables
- func AttributeTool(toolName string) string
- func ContextLimitForModel(model string) int
- func GenerateReport(workspacePath string, opts ReportOptions) (string, error)
- func LoadSessionEvents(sessionStateDir string) (map[string][]CompactionEvent, error)
- func ReadSessionStore(dbPath string) (map[string]SessionMeta, error)
- func SaveCheckpoint(workspacePath string, cp *HarvestCheckpoint) error
- func ValidateSessionSummary(s SessionSummary) bool
- type CompactionEvent
- type ContextWindowMetrics
- type CopilotCLIParser
- type EventKind
- type HarvestCheckpoint
- type HarvestOptions
- type HarvestResult
- type LogParser
- type ModelCall
- type ReportFormat
- type ReportOptions
- type SessionMeta
- type SessionSummary
- type SessionSummaryRecord
- type TelemetryEvent
- type ToolCall
- type ToolUsageRecord
Constants ¶
const DefaultContextLimit = 200000
DefaultContextLimit is the conservative fallback used for unknown models.
Variables ¶
var ModelContextLimits = map[string]int{
"claude-sonnet-4": 200000,
"claude-sonnet-4.5": 200000,
"claude-haiku-4.5": 200000,
"claude-opus-4": 200000,
"claude-opus-4.5": 200000,
"gpt-4.1": 1000000,
"gpt-4.1-mini": 500000,
"gpt-5": 1000000,
"gpt-5.1": 1000000,
"o4-mini": 200000,
}
ModelContextLimits maps model identifiers to their maximum context window size in tokens. Used to derive utilisation metrics when raw context window data is unavailable from log sources.
Unknown models fall back to DefaultContextLimit. The table is a single var declaration and trivially extensible as new models ship.
Functions ¶
func AttributeTool ¶
func AttributeTool(toolName string) string
AttributeTool maps a tool name to its originating MCP server name using longest-prefix-first matching against the default prefix registry. Exact tool names (e.g. "view", "edit") are matched before prefix patterns. Unknown tool names resolve to "unknown".
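The longest-prefix-first strategy can be sketched as below. The registry contents here are illustrative only, not the package's actual attribution table:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// attributeTool is a minimal sketch of longest-prefix-first attribution.
// Exact tool names win over prefix patterns; unknown names fall back to
// "unknown". The exact/prefix registries are supplied by the caller here
// purely for illustration.
func attributeTool(toolName string, exact, prefixes map[string]string) string {
	if server, ok := exact[toolName]; ok {
		return server
	}
	// Sort prefixes longest-first so the most specific pattern wins.
	keys := make([]string, 0, len(prefixes))
	for k := range prefixes {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool { return len(keys[i]) > len(keys[j]) })
	for _, p := range keys {
		if strings.HasPrefix(toolName, p) {
			return prefixes[p]
		}
	}
	return "unknown"
}

func main() {
	exact := map[string]string{"view": "builtin", "edit": "builtin"}
	prefixes := map[string]string{"github-": "github", "github-mcp-": "github-mcp"}
	fmt.Println(attributeTool("view", exact, prefixes))                   // builtin
	fmt.Println(attributeTool("github-mcp-list_issues", exact, prefixes)) // github-mcp
	fmt.Println(attributeTool("mystery", exact, prefixes))                // unknown
}
```

Sorting prefixes longest-first is what guarantees that "github-mcp-list_issues" attributes to the more specific "github-mcp-" pattern rather than "github-".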
func ContextLimitForModel ¶
func ContextLimitForModel(model string) int
ContextLimitForModel returns the maximum context window size for model. Falls back to DefaultContextLimit for unknown models.
func GenerateReport ¶
func GenerateReport(workspacePath string, opts ReportOptions) (string, error)
GenerateReport reads telemetry-sessions.jsonl from <workspacePath>/.backlogit/ and produces a formatted report string according to opts. Returns an informative message (not an error) when no harvested data exists.
func LoadSessionEvents ¶
func LoadSessionEvents(sessionStateDir string) (map[string][]CompactionEvent, error)
LoadSessionEvents reads all events.jsonl files found directly under sessionStateDir (one level deep, one file per session directory) and returns a map of sessionID → CompactionEvents. The session ID is inferred from the parent directory name.
func ReadSessionStore ¶
func ReadSessionStore(dbPath string) (map[string]SessionMeta, error)
ReadSessionStore opens the session-store.db SQLite database at dbPath (read-only) and returns a map of sessionID → SessionMeta with branch, repository, work directory, and timing data. Returns an empty map (not an error) when the database is missing or inaccessible — graceful fallback per Plan Review F3.
func SaveCheckpoint ¶
func SaveCheckpoint(workspacePath string, cp *HarvestCheckpoint) error
SaveCheckpoint atomically writes cp to <workspacePath>/.backlogit/.telemetry-checkpoint.json via temp-file-then-rename.
func ValidateSessionSummary ¶ added in v1.1.1
func ValidateSessionSummary(s SessionSummary) bool
ValidateSessionSummary reports whether s represents a complete, harvestable session. A session is invalid (partial) when it has tool calls recorded but zero tokens — this occurs when oversized log entries are dropped by the parser, leaving tool calls without corresponding model-call token attribution.
Callers in the harvest pipeline must reject invalid sessions to prevent partial zero-token records from being written to telemetry-sessions.jsonl.
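The invariant amounts to a single check. A sketch, using bare counts rather than the full SessionSummary struct:

```go
package main

import "fmt"

// validateSessionSummary sketches the documented invariant: a session with
// tool calls recorded but zero total tokens is a partial parse (oversized
// log entries were dropped) and must be rejected before writing.
func validateSessionSummary(toolCalls, totalTokens int) bool {
	return !(toolCalls > 0 && totalTokens == 0)
}

func main() {
	fmt.Println(validateSessionSummary(12, 0))     // tool calls, no tokens: partial
	fmt.Println(validateSessionSummary(12, 48000)) // complete session
	fmt.Println(validateSessionSummary(0, 0))      // empty session is still valid
}
```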
Types ¶
type CompactionEvent ¶
type CompactionEvent struct {
Timestamp string `json:"timestamp"`
PreCompactionTokens int `json:"preCompactionTokens"`
InputTokens int `json:"input"`
OutputTokens int `json:"output"`
CachedInputTokens int `json:"cachedInput"`
}
CompactionEvent holds data from a session.compaction_complete event in a session-state events.jsonl file.
func ParseSessionEvents ¶
func ParseSessionEvents(r io.Reader) ([]CompactionEvent, error)
ParseSessionEvents reads a session-state events.jsonl stream and extracts CompactionEvents from session.compaction_complete entries. Malformed or non-compaction lines are skipped. Returns an empty slice (not an error) when no compaction events are present.
type ContextWindowMetrics ¶
type ContextWindowMetrics struct {
// PeakUtilization is the highest prompt_tokens/max_tokens ratio observed
// across all model calls in the session (0.0–1.0+).
PeakUtilization float64 `json:"peak_utilization"`
// RemainingCapacity is max_tokens - peak_prompt_tokens at the point of peak
// utilisation.
RemainingCapacity int `json:"remaining_capacity"`
// DepletionRate is total_tokens / model_calls (average tokens consumed per
// model turn).
DepletionRate float64 `json:"depletion_rate"`
// MaxContextTokens is the model-limit value used for the calculations.
MaxContextTokens int `json:"max_context_tokens"`
// PeakPromptTokens is the highest prompt token count observed in any single
// model call.
PeakPromptTokens int `json:"peak_prompt_tokens"`
// CompactionCount is the number of context-compaction events recorded for
// this session.
CompactionCount int `json:"compaction_count"`
}
ContextWindowMetrics holds derived context utilisation for a single session.
func ComputeContextMetrics ¶
func ComputeContextMetrics(modelCalls []ModelCall, compactionEvents []CompactionEvent) *ContextWindowMetrics
ComputeContextMetrics derives context window utilisation from a slice of model calls and compaction events. Returns nil when modelCalls is empty to avoid division by zero and to distinguish "no data" from "zero utilisation".
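The derivations match the field comments on ContextWindowMetrics. A sketch assuming a single known context limit, with an ok flag standing in for the package's nil return:

```go
package main

import "fmt"

type modelCall struct {
	PromptTokens int
	TotalTokens  int
}

// computeContextMetrics sketches the documented derivations: peak
// utilisation is the highest prompt/limit ratio, remaining capacity is
// measured at that peak, and depletion rate is total tokens per model call.
// ok is false when there are no model calls, mirroring the nil return.
func computeContextMetrics(calls []modelCall, maxTokens int) (peakUtil, depletion float64, remaining int, ok bool) {
	if len(calls) == 0 {
		return 0, 0, 0, false // no data, not zero utilisation
	}
	peakPrompt, total := 0, 0
	for _, c := range calls {
		if c.PromptTokens > peakPrompt {
			peakPrompt = c.PromptTokens
		}
		total += c.TotalTokens
	}
	peakUtil = float64(peakPrompt) / float64(maxTokens)
	remaining = maxTokens - peakPrompt
	depletion = float64(total) / float64(len(calls))
	return peakUtil, depletion, remaining, true
}

func main() {
	calls := []modelCall{
		{PromptTokens: 50000, TotalTokens: 52000},
		{PromptTokens: 150000, TotalTokens: 153000},
	}
	u, d, r, _ := computeContextMetrics(calls, 200000)
	fmt.Printf("peak=%.2f depletion=%.1f remaining=%d\n", u, d, r)
	// peak=0.75 depletion=102500.0 remaining=50000
}
```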
type CopilotCLIParser ¶
type CopilotCLIParser struct{}
CopilotCLIParser parses Copilot CLI process log files by scanning line-by-line for cli.model_call and cli.tool_call JSON telemetry events. Supports both old single-line format (telemetry) and new multi-line format ([Telemetry]). Malformed lines are skipped with a slog debug log rather than aborting the parse.
func NewCopilotCLIParser ¶
func NewCopilotCLIParser() *CopilotCLIParser
NewCopilotCLIParser returns a new CopilotCLIParser.
func (*CopilotCLIParser) Parse ¶
func (p *CopilotCLIParser) Parse(r io.Reader, emit func(TelemetryEvent) error) error
Parse scans r line-by-line for Copilot CLI telemetry events, calling emit for each valid event found. Supports both old single-line telemetry format and new multi-line [Telemetry] format where JSON is spread across subsequent lines.
type HarvestCheckpoint ¶
type HarvestCheckpoint struct {
// FileOffsets maps a log file base-name to the byte offset from which the
// next harvest should resume.
FileOffsets map[string]int64 `json:"file_offsets"`
// LastHarvest is the wall-clock time of the most recent successful harvest.
LastHarvest time.Time `json:"last_harvest"`
// Version is a schema version for forward-compatibility checks.
Version int `json:"version"`
}
HarvestCheckpoint tracks the last-read byte offset per log file to enable incremental harvest. Stored as JSON in .backlogit/.telemetry-checkpoint.json.
func LoadCheckpoint ¶
func LoadCheckpoint(workspacePath string) (*HarvestCheckpoint, error)
LoadCheckpoint reads the harvest checkpoint from <workspacePath>/.backlogit/.telemetry-checkpoint.json. Returns a zero-value checkpoint when the file does not exist or contains malformed JSON — the checkpoint is derived state so a missing file is not an error.
type HarvestOptions ¶
type HarvestOptions struct {
// Force ignores any saved checkpoint and re-processes all log files from
// byte offset 0, overwriting telemetry-sessions.jsonl on completion.
Force bool
// Since, when non-nil, excludes events whose timestamp precedes this value.
// Events with unparseable timestamps are always included (safe default).
Since *time.Time
}
HarvestOptions configures a telemetry harvest run. Both fields are optional; the zero value performs a full re-harvest identical to the prior behaviour.
type HarvestResult ¶
HarvestResult summarises the outcome of a telemetry harvest run.
func HarvestTelemetry ¶
func HarvestTelemetry(ctx context.Context, workspacePath, copilotPath string, sqlDB *sql.DB, opts HarvestOptions) (HarvestResult, error)
HarvestTelemetry is the top-level harvest orchestrator. It:
- Loads (or creates) the harvest checkpoint from workspacePath/.backlogit/
- Parses process logs from copilotPath (.copilot/logs/) starting at saved offsets
- Loads session metadata from copilotPath (.copilot/session-state/, session-store.db)
- Correlates events with backlogit task completions via per-item logs in workspacePath/.backlogit/logs/
- Computes context window utilisation per session
- Attributes tool calls to MCP servers via the attribution registry
- Merges new sessions with prior JSONL records (incremental) or replaces (Force)
- Writes typed records to workspacePath/.backlogit/telemetry-sessions.jsonl
- Triggers RehydrateTelemetry to rebuild SQLite telemetry tables
- Saves the updated checkpoint for the next run
When opts.Force is false and a checkpoint exists, only new log data (by byte offset) is parsed and merged with existing JSONL. When opts.Force is true, all logs are re-processed from offset 0 and the JSONL is overwritten. When opts.Since is set, events whose timestamp precedes that value are excluded; events with unparseable timestamps are always included.
Returns ErrTelemetrySourceMissing when copilotPath does not exist.
type LogParser ¶
type LogParser interface {
Parse(r io.Reader, emit func(TelemetryEvent) error) error
}
LogParser is the streaming interface for parsing telemetry log sources. Implementations call emit for each parsed TelemetryEvent. Malformed input lines are skipped with a slog debug log; the parse continues on error-free lines.
type ModelCall ¶
type ModelCall struct {
SessionID string `json:"session_id"`
RequestID string `json:"request_id"`
Model string `json:"model"`
PromptTokens int `json:"prompt_tokens_count"`
CompletionTokens int `json:"completion_tokens_count"`
TotalTokens int `json:"total_tokens_count"`
CachedTokens int `json:"cached_tokens_count"`
DurationMs int `json:"duration_ms"`
}
ModelCall holds data extracted from a cli.model_call telemetry event.
type ReportFormat ¶ added in v1.1.0
type ReportFormat string
ReportFormat identifies a supported report output encoding.
const (
	// FormatTable renders aligned plaintext tables.
	FormatTable ReportFormat = "table"
	// FormatJSON renders JSON output.
	FormatJSON ReportFormat = "json"
	// FormatMarkdown renders GitHub-flavored Markdown tables.
	FormatMarkdown ReportFormat = "markdown"
)
type ReportOptions ¶
type ReportOptions struct {
// SessionID, when non-empty, restricts the report to a single session.
SessionID string
// GroupBy controls the aggregation dimension. Valid values: "session",
// "server".
GroupBy string
// Format controls the output encoding. Valid values: "table", "json", "markdown".
Format ReportFormat
// Limit, when > 0, restricts the number of rows returned. Used by the
// "top" subcommand to implement top-N behaviour.
Limit int
}
ReportOptions configures the telemetry report output produced by GenerateReport.
type SessionMeta ¶
type SessionMeta struct {
SessionID string `json:"session_id"`
Branch string `json:"branch"`
Repository string `json:"repository"`
WorkDir string `json:"work_dir"`
StartedAt string `json:"started_at"`
CompactionEvents []CompactionEvent `json:"compaction_events"`
}
SessionMeta aggregates session metadata merged from session-state events.jsonl and the session-store.db SQLite database.
type SessionSummary ¶
type SessionSummary struct {
SessionID string `json:"session_id"`
Branch string `json:"branch"`
Repository string `json:"repository"`
TotalTokens int `json:"total_tokens"`
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
CachedTokens int `json:"cached_tokens"`
ModelCalls int `json:"model_calls"`
ToolCalls int `json:"tool_calls"`
TokensByModel map[string]int `json:"tokens_by_model"`
// TokensByServer records the set of MCP server names attributed to this
// session's tool calls (a set encoded as a map whose keys and values are
// both the server name). Populated via AttributeTool prefix matching on
// tool call names.
TokensByServer map[string]string `json:"tokens_by_server"`
CompletedTasks []string `json:"completed_tasks"`
TokensPerTask *float64 `json:"tokens_per_task"`
CompactionEvents []CompactionEvent `json:"compaction_events"`
// ContextWindow holds derived context utilisation for this session.
// Nil when no model calls are recorded or model is unknown.
ContextWindow *ContextWindowMetrics `json:"context_window,omitempty"`
}
SessionSummary holds fully correlated telemetry for a single session.
func Correlate ¶
func Correlate(ctx context.Context, events []TelemetryEvent, metas map[string]SessionMeta, workspacePath string) ([]SessionSummary, error)
Correlate joins model calls, tool calls, session metadata, and backlogit task completions into per-session SessionSummary records.
Task completions are detected by scanning per-item log files under .backlogit/logs/ for status_changed events where delta.to == "done". Sessions with no task completions report TokensPerTask as nil.
type SessionSummaryRecord ¶
type SessionSummaryRecord struct {
RecordType string `json:"record_type"` // "session_summary"
HarvestedAt time.Time `json:"harvested_at"`
SessionID string `json:"session_id"`
Branch string `json:"branch"`
Repository string `json:"repository"`
TotalTokens int `json:"total_tokens"`
PromptTokens int `json:"prompt_tokens"`
CompletionTokens int `json:"completion_tokens"`
CachedTokens int `json:"cached_tokens"`
ModelCalls int `json:"model_calls"`
ToolCalls int `json:"tool_calls"`
TokensByModel map[string]int `json:"tokens_by_model"`
ToolCallsByServer map[string]int `json:"tool_calls_by_server"`
CompletedTasks []string `json:"completed_tasks"`
TokensPerTask *float64 `json:"tokens_per_task"`
CompactionCount int `json:"compaction_count"`
// Context window metrics — nil when model calls are unavailable.
PeakUtilization *float64 `json:"peak_utilization,omitempty"`
RemainingCapacity *int `json:"remaining_capacity,omitempty"`
DepletionRate *float64 `json:"depletion_rate,omitempty"`
MaxContextTokens *int `json:"max_context_tokens,omitempty"`
}
SessionSummaryRecord is the typed JSONL record written to .backlogit/telemetry-sessions.jsonl. One record per session per harvest run.
Using typed struct fields instead of map[string]any enforces the contract that token counts are always integers and server/model maps are always map[string]int (Plan Review F4).
type TelemetryEvent ¶
type TelemetryEvent struct {
Kind EventKind `json:"kind"`
ModelCall *ModelCall `json:"model_call,omitempty"`
ToolCall *ToolCall `json:"tool_call,omitempty"`
// Timestamp is the wall-clock time extracted from the log-line prefix.
// Zero when unavailable or malformed.
Timestamp time.Time `json:"timestamp,omitempty"`
}
TelemetryEvent is a union of model and tool telemetry events parsed from a Copilot CLI process log. Exactly one of ModelCall or ToolCall is set. Timestamp is populated from the log-line prefix (e.g. "2026-04-09T00:00:02.000Z") and used for --since filtering. It is zero when the prefix is absent or unparseable.
type ToolCall ¶
type ToolCall struct {
SessionID string `json:"session_id"`
ModelCallID string `json:"model_call_id"`
ToolName string `json:"tool_name"`
ResultType string `json:"result_type"`
DurationMs int `json:"duration_ms"`
}
ToolCall holds data extracted from a cli.tool_call telemetry event.
type ToolUsageRecord ¶
type ToolUsageRecord struct {
RecordType string `json:"record_type"` // "tool_usage"
HarvestedAt time.Time `json:"harvested_at"`
SessionID string `json:"session_id"`
ServerName string `json:"server_name"`
ToolName string `json:"tool_name"`
CallCount int `json:"call_count"`
TotalDurMs int `json:"total_duration_ms"`
}
ToolUsageRecord is the typed JSONL record for per-server tool call counts within a single session. The composite (session_id, server_name, tool_name) is unique per harvest (Plan Review F7).