execution

package
v0.21.0
Warning

This package is not in the latest version of its module.

Published: Mar 12, 2026 License: MIT Imports: 12 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func UpdateOutcomeUsage

func UpdateOutcomeUsage(outcome *models.EvaluationOutcome, engine AgentEngine)

UpdateOutcomeUsage replaces fallback per-turn usage data in the outcome with authoritative post-shutdown usage data from the engine, then re-aggregates the digest-level usage totals. Call after engine.Shutdown().
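The replace-then-re-aggregate step described above can be sketched as follows. This is an illustration only, not the package's implementation: `usageStats`, `run`, `outcome`, `usageSource`, and `refreshUsage` are hypothetical stand-ins, since the fields of models.EvaluationOutcome and models.UsageStats are not shown on this page.

```go
package main

import "fmt"

// usageStats is a hypothetical stand-in for models.UsageStats.
type usageStats struct {
	InputTokens  int
	OutputTokens int
}

// run is a hypothetical stand-in for one recorded execution in an outcome.
type run struct {
	SessionID string
	Usage     *usageStats // fallback per-turn data captured during the run
}

// outcome is a hypothetical stand-in for models.EvaluationOutcome.
type outcome struct {
	Runs  []run
	Total usageStats
}

// usageSource mirrors only the SessionUsage method of AgentEngine.
type usageSource interface {
	SessionUsage(sessionID string) *usageStats
}

// mapSource is a fake usage source keyed by session ID.
type mapSource map[string]*usageStats

func (m mapSource) SessionUsage(id string) *usageStats { return m[id] }

// refreshUsage swaps in post-shutdown usage where the source has it, keeps
// the fallback otherwise, and then recomputes the totals from scratch.
func refreshUsage(o *outcome, src usageSource) {
	o.Total = usageStats{}
	for i := range o.Runs {
		if final := src.SessionUsage(o.Runs[i].SessionID); final != nil {
			o.Runs[i].Usage = final
		}
		if u := o.Runs[i].Usage; u != nil {
			o.Total.InputTokens += u.InputTokens
			o.Total.OutputTokens += u.OutputTokens
		}
	}
}

func main() {
	o := &outcome{Runs: []run{
		{SessionID: "s1", Usage: &usageStats{InputTokens: 10, OutputTokens: 5}},
		{SessionID: "s2", Usage: &usageStats{InputTokens: 7, OutputTokens: 3}},
	}}
	// Only s1 has authoritative data; s2 keeps its fallback numbers.
	src := mapSource{"s1": {InputTokens: 12, OutputTokens: 6}}
	refreshUsage(o, src)
	fmt.Println(o.Total.InputTokens, o.Total.OutputTokens) // 19 9
}
```

Recomputing the totals from zero, rather than adjusting them by a delta, keeps the aggregate consistent even if the function is called more than once.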

Types

type AgentEngine

type AgentEngine interface {
	// Initialize sets up the engine
	Initialize(ctx context.Context) error

	// Execute runs a test with the given stimulus
	Execute(ctx context.Context, req *ExecutionRequest) (*ExecutionResponse, error)

	// Shutdown cleans up resources. It is safe to call multiple times;
	// subsequent calls after the first are no-ops. After Shutdown returns,
	// SessionUsage results include data from session termination events.
	Shutdown(ctx context.Context) error

	// SessionUsage returns the final usage stats for a session, including
	// data from session.shutdown events that fire during Shutdown().
	// Returns nil if no usage data is available for the given session.
	SessionUsage(sessionID string) *models.UsageStats
}

AgentEngine is the interface for executing test prompts
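A minimal sketch of the call order the interface implies: Initialize, Execute, Shutdown, and only then SessionUsage. The types below (`engine`, `fakeEngine`, `request`, `response`, `usageStats`, `runOnce`) are hypothetical stand-ins that mirror the method set above so the example compiles on its own. The fake publishes usage only at shutdown, which is a simplification chosen to make the ordering visible.

```go
package main

import (
	"context"
	"fmt"
)

// Hypothetical stand-ins for models.UsageStats, ExecutionRequest, ExecutionResponse.
type usageStats struct{ TotalTokens int }
type request struct{ Message string }
type response struct{ SessionID, FinalOutput string }

// engine mirrors the method set of AgentEngine using the stand-in types.
type engine interface {
	Initialize(ctx context.Context) error
	Execute(ctx context.Context, req *request) (*response, error)
	Shutdown(ctx context.Context) error
	SessionUsage(sessionID string) *usageStats
}

// fakeEngine publishes usage only when Shutdown runs.
type fakeEngine struct {
	pending map[string]int
	final   map[string]*usageStats
}

func (f *fakeEngine) Initialize(context.Context) error {
	f.pending = map[string]int{}
	f.final = map[string]*usageStats{}
	return nil
}

func (f *fakeEngine) Execute(_ context.Context, req *request) (*response, error) {
	id := fmt.Sprintf("session-%d", len(f.pending)+1)
	f.pending[id] = len(req.Message) // pretend one token per byte
	return &response{SessionID: id, FinalOutput: "echo: " + req.Message}, nil
}

func (f *fakeEngine) Shutdown(context.Context) error {
	for id, n := range f.pending {
		f.final[id] = &usageStats{TotalTokens: n}
	}
	return nil
}

func (f *fakeEngine) SessionUsage(id string) *usageStats { return f.final[id] }

// runOnce drives the lifecycle in order and reads usage only after Shutdown.
func runOnce(ctx context.Context, e engine, msg string) (*response, *usageStats, error) {
	if err := e.Initialize(ctx); err != nil {
		return nil, nil, err
	}
	resp, execErr := e.Execute(ctx, &request{Message: msg})
	// Shut down even when Execute failed so resources are always released.
	if err := e.Shutdown(ctx); err != nil && execErr == nil {
		execErr = err
	}
	if execErr != nil {
		return nil, nil, execErr
	}
	return resp, e.SessionUsage(resp.SessionID), nil
}

func main() {
	resp, usage, err := runOnce(context.Background(), &fakeEngine{}, "hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resp.FinalOutput, usage.TotalTokens) // echo: hello 5
}
```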

type CopilotClient

type CopilotClient interface {
	// CreateSession maps to [copilot.Client.CreateSession]
	CreateSession(ctx context.Context, config *copilot.SessionConfig) (CopilotSession, error)

	// Start maps to [copilot.Client.Start]
	Start(ctx context.Context) error

	// Stop maps to [copilot.Client.Stop]
	Stop() error

	// ResumeSessionWithOptions maps to [copilot.Client.ResumeSessionWithOptions]
	ResumeSessionWithOptions(ctx context.Context, sessionID string, config *copilot.ResumeSessionConfig) (CopilotSession, error)

	// DeleteSession maps to [copilot.Client.DeleteSession]
	DeleteSession(ctx context.Context, sessionID string) error
}

CopilotClient is just an interface over *copilot.Client

type CopilotEngine

type CopilotEngine struct {
	// contains filtered or unexported fields
}

CopilotEngine integrates with the GitHub Copilot SDK.

func (*CopilotEngine) Execute

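func (e *CopilotEngine) Execute(ctx context.Context, req *ExecutionRequest) (*ExecutionResponse, error)

Execute runs a test with the Copilot SDK.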

func (*CopilotEngine) Initialize

func (e *CopilotEngine) Initialize(ctx context.Context) error

Initialize sets up the Copilot client

func (*CopilotEngine) SessionUsage

func (e *CopilotEngine) SessionUsage(sessionID string) *models.UsageStats

SessionUsage returns the final usage stats for a session. Call after Shutdown() to get data from session.shutdown events (ModelMetrics, TotalPremiumRequests).

func (*CopilotEngine) Shutdown

func (e *CopilotEngine) Shutdown(ctx context.Context) error

Shutdown cleans up resources, deleting session and workspace data. It is safe to call multiple times; subsequent calls after the first are no-ops that return the original error.
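One common way to get this behavior (cleanup runs once, later calls return the first result) is sync.Once with a stored error. The sketch below shows that general pattern under the assumption that it fits here; the page does not say how CopilotEngine implements it, and `closer` and `newFailingCloser` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// closer is a hypothetical resource owner with an idempotent Shutdown.
type closer struct {
	once    sync.Once
	err     error // result of the first (and only) cleanup
	calls   int   // how many times cleanup actually ran
	cleanup func(ctx context.Context) error
}

// Shutdown runs cleanup at most once and always returns that first result.
func (c *closer) Shutdown(ctx context.Context) error {
	c.once.Do(func() {
		c.calls++
		c.err = c.cleanup(ctx)
	})
	return c.err
}

// newFailingCloser builds a closer whose cleanup always fails with msg.
func newFailingCloser(msg string) *closer {
	return &closer{cleanup: func(context.Context) error { return errors.New(msg) }}
}

func main() {
	c := newFailingCloser("delete session: boom")
	first := c.Shutdown(context.Background())
	second := c.Shutdown(context.Background())
	fmt.Println(first == second, c.calls) // true 1
}
```

Because the first error is retained, a deferred second Shutdown call reports the same failure instead of silently returning nil.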

type CopilotEngineBuilder

type CopilotEngineBuilder struct {
	// contains filtered or unexported fields
}

CopilotEngineBuilder builds a CopilotEngine with options

func NewCopilotEngineBuilder

func NewCopilotEngineBuilder(defaultModelID string, options *CopilotEngineBuilderOptions) *CopilotEngineBuilder

NewCopilotEngineBuilder creates a builder for CopilotEngine

  • defaultModelID - the model used when session creation does not specify a model ID. It can be blank, in which case the Copilot CLI chooses its own fallback model.

func (*CopilotEngineBuilder) Build

func (b *CopilotEngineBuilder) Build() *CopilotEngine

type CopilotEngineBuilderOptions

type CopilotEngineBuilderOptions struct {
	NewCopilotClient func(clientOptions *copilot.ClientOptions) CopilotClient
}
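The NewCopilotClient field looks like a constructor-injection hook: a test can supply a factory that returns a fake client. The sketch below shows that pattern with hypothetical stand-ins (`client`, `clientOptions`, `builderOptions`, `builder`, `engine`). The fallback to a default factory when the option is nil is an assumption for illustration, not documented behavior.

```go
package main

import "fmt"

// clientOptions and client are hypothetical stand-ins for
// copilot.ClientOptions and the CopilotClient interface.
type clientOptions struct{}

type client interface {
	Start() error
	Stop() error
}

// defaultClient stands in for a real SDK-backed client.
type defaultClient struct{}

func (defaultClient) Start() error { return nil }
func (defaultClient) Stop() error  { return nil }

// fakeClient records whether it was started, for use in tests.
type fakeClient struct{ started bool }

func (f *fakeClient) Start() error { f.started = true; return nil }
func (f *fakeClient) Stop() error  { f.started = false; return nil }

// builderOptions mirrors the shape of CopilotEngineBuilderOptions.
type builderOptions struct {
	NewClient func(opts *clientOptions) client
}

type engine struct{ client client }

type builder struct {
	defaultModel string
	newClient    func(opts *clientOptions) client
}

// newBuilder uses the injected factory when present, else a default one
// (the nil fallback is an assumption made for this sketch).
func newBuilder(defaultModel string, opts *builderOptions) *builder {
	b := &builder{
		defaultModel: defaultModel,
		newClient:    func(*clientOptions) client { return defaultClient{} },
	}
	if opts != nil && opts.NewClient != nil {
		b.newClient = opts.NewClient
	}
	return b
}

func (b *builder) build() *engine {
	return &engine{client: b.newClient(&clientOptions{})}
}

func main() {
	fake := &fakeClient{}
	opts := &builderOptions{NewClient: func(*clientOptions) client { return fake }}
	e := newBuilder("", opts).build()
	_ = e.client.Start()
	fmt.Println(fake.started) // true
}
```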

type CopilotSession

type CopilotSession interface {
	// Disconnect maps to [copilot.Session.Disconnect]. It closes the session and releases resources, but it
	// does not delete data; the session remains resumable until deleted via [copilot.Client.DeleteSession].
	Disconnect() error

	// On maps to [copilot.Session.On]
	On(handler copilot.SessionEventHandler) func()

	// SendAndWait maps to [copilot.Session.SendAndWait]
	SendAndWait(ctx context.Context, options copilot.MessageOptions) (*copilot.SessionEvent, error)

	// SessionID returns [copilot.Session.SessionID]
	SessionID() string
}

CopilotSession is just an interface over *copilot.Session

type ExecutionRequest

type ExecutionRequest struct {
	ModelID   string
	Message   string
	Context   map[string]any
	Resources []ResourceFile

	SessionID string
	SkillName string

	SourceDir  string   // Base directory for resolving relative paths to workspace items, such as skills
	SkillPaths []string // Directories to search for skills

	Timeout time.Duration

	// PermissionHandler is called when the Copilot SDK needs to decide whether a tool can be used.
	// Default: allows all tools.
	PermissionHandler copilot.PermissionHandlerFunc
}

ExecutionRequest represents a test execution request

type ExecutionResponse

type ExecutionResponse struct {
	FinalOutput      string
	Events           []copilot.SessionEvent
	ModelID          string
	SkillInvocations []SkillInvocation
	DurationMs       int64
	ToolCalls        []models.ToolCall
	ErrorMsg         string
	Success          bool
	WorkspaceDir     string // Path to workspace directory (for file grading)
	SessionID        string // Copilot session ID
	Usage            *models.UsageStats
}

ExecutionResponse represents the result of an execution

func (*ExecutionResponse) ContainsText

func (r *ExecutionResponse) ContainsText(text string) bool

ContainsText checks if output contains text (case-insensitive)
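A case-insensitive substring check is usually written by lowercasing both sides before comparing. The helper below, `containsFold`, is a hypothetical free function showing that approach. It is an assumption about how such a check can be done, not the method's actual body.

```go
package main

import (
	"fmt"
	"strings"
)

// containsFold reports whether text occurs in output, ignoring case.
// Lowercasing both strings is simple and adequate for typical ASCII output.
func containsFold(output, text string) bool {
	return strings.Contains(strings.ToLower(output), strings.ToLower(text))
}

func main() {
	fmt.Println(containsFold("Deployment SUCCEEDED", "succeeded")) // true
	fmt.Println(containsFold("Deployment SUCCEEDED", "failed"))    // false
}
```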

func (*ExecutionResponse) ExtractMessages

func (r *ExecutionResponse) ExtractMessages() []string

ExtractMessages gets all assistant messages from events
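Extracting assistant messages amounts to filtering the event list by type. The sketch below assumes a simplified event shape: `sessionEvent`, its `Type` and `Content` fields, and the type strings are hypothetical stand-ins for copilot.SessionEvent, whose real layout is not shown on this page.

```go
package main

import "fmt"

// sessionEvent is a hypothetical, simplified stand-in for copilot.SessionEvent.
type sessionEvent struct {
	Type    string // e.g. "assistant.message" (assumed name)
	Content string
}

// extractMessages returns the content of every non-empty assistant message,
// in the order the events were recorded.
func extractMessages(events []sessionEvent) []string {
	var msgs []string
	for _, ev := range events {
		if ev.Type == "assistant.message" && ev.Content != "" {
			msgs = append(msgs, ev.Content)
		}
	}
	return msgs
}

func main() {
	events := []sessionEvent{
		{Type: "user.message", Content: "run the tests"},
		{Type: "assistant.message", Content: "Running them now."},
		{Type: "tool.start"},
		{Type: "assistant.message", Content: "All tests passed."},
	}
	fmt.Println(extractMessages(events)) // [Running them now. All tests passed.]
}
```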

type MockEngine

type MockEngine struct {
	// contains filtered or unexported fields
}

MockEngine is a simple mock implementation for testing

func NewMockEngine

func NewMockEngine(modelID string) *MockEngine

NewMockEngine creates a new mock engine

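func (*MockEngine) Execute

func (m *MockEngine) Execute(ctx context.Context, req *ExecutionRequest) (*ExecutionResponse, error)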

func (*MockEngine) Initialize

func (m *MockEngine) Initialize(ctx context.Context) error

func (*MockEngine) SessionUsage

func (m *MockEngine) SessionUsage(sessionID string) *models.UsageStats

func (*MockEngine) Shutdown

func (m *MockEngine) Shutdown(ctx context.Context) error

type ResourceFile

type ResourceFile struct {
	Path    string
	Content []byte
}

ResourceFile represents a file resource

type SessionEventsCollector

type SessionEventsCollector struct {
	// SkillInvocations is a chronological list of skills invoked during the session
	SkillInvocations []SkillInvocation
	// contains filtered or unexported fields
}

func NewSessionEventsCollector

func NewSessionEventsCollector() *SessionEventsCollector

NewSessionEventsCollector creates a new SessionEventsCollector.

func (*SessionEventsCollector) Done

func (coll *SessionEventsCollector) Done() <-chan struct{}

Done returns the channel that is closed when the session completes.
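A closed-on-completion channel is typically consumed with select, so a caller can also give up after a timeout. The sketch below shows that pattern with a hypothetical `collector` stand-in; `finish` and `waitForDone` are illustrative names, not part of this package.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// collector is a hypothetical stand-in exposing a Done channel.
type collector struct {
	once sync.Once
	done chan struct{}
}

func newCollector() *collector { return &collector{done: make(chan struct{})} }

// Done returns a channel that is closed when the session completes.
func (c *collector) Done() <-chan struct{} { return c.done }

// finish closes the channel exactly once, even if called repeatedly.
func (c *collector) finish() { c.once.Do(func() { close(c.done) }) }

// waitForDone reports whether done was closed before the timeout elapsed.
func waitForDone(done <-chan struct{}, timeout time.Duration) bool {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	c := newCollector()
	go func() {
		time.Sleep(10 * time.Millisecond) // simulate the session finishing
		c.finish()
	}()
	fmt.Println("completed:", waitForDone(c.Done(), time.Second))
}
```

Closing the channel, rather than sending on it, lets any number of waiters observe completion, and a receive after completion returns immediately.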

func (*SessionEventsCollector) ErrorMessage

func (coll *SessionEventsCollector) ErrorMessage() string

ErrorMessage returns the error message, if any.

func (*SessionEventsCollector) On

On is a callback intended to be passed to copilot.Session.On so that the collector receives events in real time.

func (*SessionEventsCollector) OutputParts

func (coll *SessionEventsCollector) OutputParts() []string

OutputParts returns the collected output text parts.

func (*SessionEventsCollector) SessionEvents

func (coll *SessionEventsCollector) SessionEvents() []copilot.SessionEvent

SessionEvents returns the collected session events.

func (*SessionEventsCollector) ToolCalls

func (coll *SessionEventsCollector) ToolCalls() []models.ToolCall

ToolCalls walks the collected session events and correlates each tool start with its outcome (Success). The resulting tool calls are not cached; if you need them repeatedly, store the result locally.
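Correlating starts with outcomes can be done in one pass by indexing each start by a call identifier and filling in the result when the matching completion arrives. The sketch below uses hypothetical types (`event`, `toolCall`) and assumed event kind names; the real copilot.SessionEvent and models.ToolCall shapes are not shown on this page.

```go
package main

import "fmt"

// event is a hypothetical, simplified session event.
type event struct {
	Kind     string // "tool.start" or "tool.complete" (assumed names)
	CallID   string
	ToolName string
	Success  bool
}

// toolCall is a hypothetical stand-in for models.ToolCall.
type toolCall struct {
	Name      string
	Completed bool
	Success   bool
}

// correlate pairs each start with its completion by CallID, preserving the
// order in which tools were started. Starts with no completion stay pending.
func correlate(events []event) []toolCall {
	var calls []toolCall
	index := map[string]int{} // CallID -> position in calls
	for _, ev := range events {
		switch ev.Kind {
		case "tool.start":
			index[ev.CallID] = len(calls)
			calls = append(calls, toolCall{Name: ev.ToolName})
		case "tool.complete":
			if i, ok := index[ev.CallID]; ok {
				calls[i].Completed = true
				calls[i].Success = ev.Success
			}
		}
	}
	return calls
}

func main() {
	calls := correlate([]event{
		{Kind: "tool.start", CallID: "1", ToolName: "bash"},
		{Kind: "tool.start", CallID: "2", ToolName: "edit"},
		{Kind: "tool.complete", CallID: "2", Success: false},
		{Kind: "tool.complete", CallID: "1", Success: true},
	})
	for _, c := range calls {
		fmt.Println(c.Name, c.Completed, c.Success)
	}
	// bash true true
	// edit true false
}
```

Since the slice is rebuilt on every call, a caller that needs the result more than once should keep it in a local variable, as the note above advises.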

type SessionUsageCollector

type SessionUsageCollector struct {
	// contains filtered or unexported fields
}

SessionUsageCollector tracks token and premium request usage from Copilot SDK session events. Its On method implements copilot.SessionEventHandler and should be registered via session.On(collector.On).

Usage data arrives through two channels:

  • Per-turn events (AssistantUsage) — accumulated as a fallback.
  • Session termination events (SessionIdle, SessionShutdown) — authoritative totals that override per-turn data when available.
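The precedence rule above (session-level totals win, per-turn sums are the fallback) can be sketched as below. `usageEvent`, `usageCollector`, and the event kind strings are hypothetical stand-ins. The real collector works on copilot.SessionEvent and returns *models.UsageStats, or nil when nothing was collected, whereas this sketch returns an `ok` flag.

```go
package main

import (
	"fmt"
	"sync"
)

// usageEvent is a hypothetical, simplified usage-bearing event.
type usageEvent struct {
	Kind   string // "assistant.usage", "session.idle", "session.shutdown" (assumed)
	Tokens int
}

// usageCollector keeps both sources and prefers the session-level total.
type usageCollector struct {
	mu       sync.Mutex
	perTurn  int  // running sum of per-turn events (fallback)
	sawTurn  bool
	session  int  // most recent session-level total (authoritative)
	sawFinal bool
}

// On records usage from a single event; safe for concurrent callers.
func (c *usageCollector) On(ev usageEvent) {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch ev.Kind {
	case "assistant.usage":
		c.perTurn += ev.Tokens
		c.sawTurn = true
	case "session.idle", "session.shutdown":
		c.session = ev.Tokens
		c.sawFinal = true
	}
}

// Total returns the authoritative total if one arrived, else the per-turn
// sum; ok is false when no usage data was collected at all.
func (c *usageCollector) Total() (tokens int, ok bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch {
	case c.sawFinal:
		return c.session, true
	case c.sawTurn:
		return c.perTurn, true
	default:
		return 0, false
	}
}

func main() {
	c := &usageCollector{}
	c.On(usageEvent{Kind: "assistant.usage", Tokens: 40})
	c.On(usageEvent{Kind: "assistant.usage", Tokens: 60})
	fmt.Println(c.Total()) // 100 true
	c.On(usageEvent{Kind: "session.shutdown", Tokens: 105})
	fmt.Println(c.Total()) // 105 true
}
```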

func NewSessionUsageCollector

func NewSessionUsageCollector() *SessionUsageCollector

func (*SessionUsageCollector) On

On handles a single session event, extracting any usage data it carries. Pass this method to session.On as a copilot.SessionEventHandler.

func (*SessionUsageCollector) UsageStats

func (s *SessionUsageCollector) UsageStats() *models.UsageStats

UsageStats returns the collected usage statistics. Returns nil if no usage data was collected. Session-level data (from SessionIdle/SessionShutdown) is preferred as the authoritative source; per-turn accumulated data (from AssistantUsage) is used as fallback.

type SkillInvocation

type SkillInvocation struct {
	// Name of the invoked skill
	Name string
	// Path of the invoked SKILL.md
	Path string
}
