Documentation ¶
Index ¶
- Constants
- func UCBScore(avgReward float64, totalSessions, timesUsed int, explorationC float64) float64
- type ContextProfile
- type ContextTrimmer
- type EvaluatorService
- func (s *EvaluatorService) ClassifyTask(text string) string
- func (s *EvaluatorService) EvaluateSession(ctx context.Context, sessionID string) error
- func (s *EvaluatorService) GetActiveSkills(ctx context.Context, taskType string) ([]Skill, error)
- func (s *EvaluatorService) GetStats(ctx context.Context) (*Stats, error)
- func (s *EvaluatorService) IsEnabled() bool
- func (s *EvaluatorService) NewContextTrimmer() *ContextTrimmer
- func (s *EvaluatorService) RecordTemplateSelection(_ context.Context, sessionID, templateID string)
- func (s *EvaluatorService) SelectTemplate(ctx context.Context, sectionName string) (*PromptTemplate, error)
- type Judge
- type JudgeMeta
- type JudgeOutput
- type PromptTemplate
- type RewardResult
- type Service
- type Skill
- type Stats
- type TemplateStats
- type ToolInfo
Constants ¶
const ContextProfileKey contextKey = "context_profile"
ContextProfileKey is used to store a ContextProfile in the request context.
const SelectedTemplateKey contextKey = "selected_template_id"
SelectedTemplateKey is used to store the selected template ID in context.
Variables ¶
This section is empty.
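Functions ¶
func UCBScore ¶
func UCBScore(avgReward float64, totalSessions, timesUsed int, explorationC float64) float64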
Types ¶
type ContextProfile ¶ added in v0.407.0
type ContextProfile struct {
    // TaskType is the classified task type (e.g. "code", "debug", "refactor").
    TaskType string
    // RelevantToolNames lists the tool names to keep for this task.
    // An empty slice means keep all tools (no filtering).
    RelevantToolNames []string
    // SkipSections lists prompt section names that are not needed for this task.
    SkipSections []string
    // Confidence is a 0.0–1.0 measure of the trimmer's certainty.
    // Profiles with Confidence < 0.5 should be ignored and defaults used.
    Confidence float64
}
ContextProfile is produced by the ContextTrimmer at session start. It guides the PromptBuilder and agent tool assembly to include only what is relevant for the current task.
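The field comments above imply a simple consumption rule, which can be sketched as follows. This is a minimal illustration, not the package's own code: `filterTools` and the tool names are hypothetical, and the struct is redeclared locally so the example compiles on its own.

```go
package main

import "fmt"

// ContextProfile mirrors the documented struct so the sketch is self-contained.
type ContextProfile struct {
	TaskType          string
	RelevantToolNames []string
	SkipSections      []string
	Confidence        float64
}

// filterTools is a hypothetical helper. It returns the tool names to expose
// for a session, falling back to the full list when the profile is unusable.
func filterTools(p *ContextProfile, all []string) []string {
	// A nil profile, Confidence < 0.5, or an empty list all mean "keep everything".
	if p == nil || p.Confidence < 0.5 || len(p.RelevantToolNames) == 0 {
		return all
	}
	keep := make(map[string]bool, len(p.RelevantToolNames))
	for _, name := range p.RelevantToolNames {
		keep[name] = true
	}
	var out []string
	for _, name := range all {
		if keep[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	all := []string{"grep", "edit", "fetch"} // hypothetical tool names
	p := &ContextProfile{
		TaskType:          "debug",
		RelevantToolNames: []string{"grep", "edit"},
		Confidence:        0.8,
	}
	fmt.Println(filterTools(p, all)) // [grep edit]

	p.Confidence = 0.3 // below the documented 0.5 threshold
	fmt.Println(filterTools(p, all)) // [grep edit fetch]
}
```

Treating a nil profile the same as a low-confidence one keeps the fallback path in a single place.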
type ContextTrimmer ¶ added in v0.407.0
type ContextTrimmer struct {
    // contains filtered or unexported fields
}
ContextTrimmer uses a cheap LLM to produce a ContextProfile at session start. It complements the post-session Judge: instead of "what went wrong?", it asks "what is needed for this task?".
func NewContextTrimmer ¶ added in v0.407.0
func NewContextTrimmer(cfg config.EvaluatorConfig, j *Judge) *ContextTrimmer
NewContextTrimmer creates a ContextTrimmer that reuses the evaluator's judge infrastructure. Returns nil if the evaluator has no judge configured.
func (*ContextTrimmer) ProfileTask ¶ added in v0.407.0
func (ct *ContextTrimmer) ProfileTask(ctx context.Context, firstMessage string, availableTools []ToolInfo) (*ContextProfile, error)
ProfileTask analyzes the user's first message and returns a ContextProfile describing which tools and prompt sections are relevant for the task.
The LLM call is bounded by a 3-second timeout. On timeout or error, the method returns nil so that callers can fall back to defaults. Results are cached under a hash of (firstMessage + toolNames) to avoid redundant LLM calls.
type EvaluatorService ¶
type EvaluatorService struct {
    // contains filtered or unexported fields
}
EvaluatorService is the concrete implementation of Service.
func New ¶
func New(cfg config.EvaluatorConfig, q db.Querier, msgs message.Service) (*EvaluatorService, error)
New creates an EvaluatorService. It returns nil if the evaluator is disabled.
func (*EvaluatorService) ClassifyTask ¶ added in v0.407.0
func (s *EvaluatorService) ClassifyTask(text string) string
ClassifyTask returns a task type label from the user's first message. It uses compiled patterns from config, evaluated in order; first match wins. Returns "general" if no pattern matches or the text is empty.
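The "ordered patterns, first match wins" rule can be sketched as below. The `rule` type, the `classify` function, and the three patterns are hypothetical; the real patterns come from the evaluator's config, which this page does not show.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a compiled pattern with the label it assigns.
type rule struct {
	label string
	re    *regexp.Regexp
}

// rules is a hypothetical ordered list; order decides ties between patterns.
var rules = []rule{
	{"debug", regexp.MustCompile(`(?i)\b(bug|crash|error|panic)\b`)},
	{"refactor", regexp.MustCompile(`(?i)\b(refactor|rename|clean ?up)\b`)},
	{"code", regexp.MustCompile(`(?i)\b(implement|add|write)\b`)},
}

// classify returns the label of the first matching rule, or "general" when
// the text is empty or nothing matches.
func classify(text string) string {
	if text == "" {
		return "general"
	}
	for _, r := range rules {
		if r.re.MatchString(text) {
			return r.label
		}
	}
	return "general"
}

func main() {
	// Matches both "debug" (crash) and "code" (add); the earlier rule wins.
	fmt.Println(classify("Add a test for the crash in parser.go")) // debug
	fmt.Println(classify("Refactor the config loader"))            // refactor
	fmt.Println(classify("hello"))                                 // general
	fmt.Println(classify(""))                                      // general
}
```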
func (*EvaluatorService) EvaluateSession ¶
func (s *EvaluatorService) EvaluateSession(ctx context.Context, sessionID string) error
EvaluateSession triggers evaluation of a completed session.
func (*EvaluatorService) GetActiveSkills ¶
func (s *EvaluatorService) GetActiveSkills(ctx context.Context, taskType string) ([]Skill, error)
GetActiveSkills returns active skills for a task type.
func (*EvaluatorService) GetStats ¶
func (s *EvaluatorService) GetStats(ctx context.Context) (*Stats, error)
GetStats returns system statistics for TUI display.
func (*EvaluatorService) IsEnabled ¶
func (s *EvaluatorService) IsEnabled() bool
IsEnabled returns whether the evaluator is active.
func (*EvaluatorService) NewContextTrimmer ¶ added in v0.407.0
func (s *EvaluatorService) NewContextTrimmer() *ContextTrimmer
NewContextTrimmer creates a ContextTrimmer backed by this service's judge infrastructure. Returns nil if the evaluator has no judge configured or if the evaluator itself is nil.
func (*EvaluatorService) RecordTemplateSelection ¶
func (s *EvaluatorService) RecordTemplateSelection(_ context.Context, sessionID, templateID string)
RecordTemplateSelection stores the template used in this session for later evaluation.
func (*EvaluatorService) SelectTemplate ¶
func (s *EvaluatorService) SelectTemplate(ctx context.Context, sectionName string) (*PromptTemplate, error)
SelectTemplate returns the best template for a section using UCB.
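This page does not show how the UCB score is computed. As a point of reference, the sketch below uses the classic UCB1 formula, avgReward + c * sqrt(ln(totalSessions) / timesUsed), with parameters named after the UCBScore signature in the index. The `ucb1`, `arm`, and `pick` names are hypothetical, and the package's actual formula may differ.

```go
package main

import (
	"fmt"
	"math"
)

// ucb1 is a hypothetical scoring function based on the classic UCB1 formula.
// It is not taken from the package source.
func ucb1(avgReward float64, totalSessions, timesUsed int, c float64) float64 {
	if timesUsed == 0 {
		return math.Inf(1) // untried templates are explored first
	}
	if totalSessions < 1 {
		return avgReward // no history yet, so no exploration bonus
	}
	bonus := c * math.Sqrt(math.Log(float64(totalSessions))/float64(timesUsed))
	return avgReward + bonus
}

// arm holds the per-template statistics the score needs.
type arm struct {
	id        string
	avgReward float64
	timesUsed int
}

// pick returns the id of the arm with the highest score.
func pick(arms []arm, totalSessions int, c float64) string {
	best, bestScore := "", math.Inf(-1)
	for _, a := range arms {
		if s := ucb1(a.avgReward, totalSessions, a.timesUsed, c); s > bestScore {
			best, bestScore = a.id, s
		}
	}
	return best
}

func main() {
	arms := []arm{{"v1", 0.70, 40}, {"v2", 0.65, 5}}
	fmt.Println(pick(arms, 45, 0))   // v1: pure exploitation favors the higher average
	fmt.Println(pick(arms, 45, 1.0)) // v2: the rarely used template gets a larger bonus
}
```

A larger exploration constant shifts selection toward templates with little history; a value of zero reduces the rule to picking the best average.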
type Judge ¶ added in v0.244.0
type Judge struct {
    // contains filtered or unexported fields
}
Judge calls an LLM to evaluate session quality.
type JudgeMeta ¶
type JudgeMeta struct {
    TemplateName    string
    TemplateVersion int
    Corrections     int
    Tokens          int64
    Transcript      string
}
JudgeMeta holds metadata passed to the judge prompt.
type JudgeOutput ¶
type JudgeOutput struct {
    Reasoning  string   `json:"reasoning"`
    KeyPoints  []string `json:"key_points"`
    NewSkill   string   `json:"new_skill"`
    TaskType   string   `json:"task_type"`
    Confidence float64  `json:"confidence"`
}
JudgeOutput is the structured response from the LLM judge model.
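The struct tags determine the JSON shape the judge is expected to return. The sketch below decodes a sample reply with the standard library. The `parseJudgeOutput` helper and the sample payload are hypothetical, and the struct is redeclared so the example runs on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JudgeOutput mirrors the documented struct, including its JSON tags.
type JudgeOutput struct {
	Reasoning  string   `json:"reasoning"`
	KeyPoints  []string `json:"key_points"`
	NewSkill   string   `json:"new_skill"`
	TaskType   string   `json:"task_type"`
	Confidence float64  `json:"confidence"`
}

// parseJudgeOutput is a hypothetical helper that decodes a raw judge reply.
func parseJudgeOutput(raw string) (*JudgeOutput, error) {
	var out JudgeOutput
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		return nil, fmt.Errorf("decode judge output: %w", err)
	}
	return &out, nil
}

func main() {
	// Invented sample payload, shaped after the struct tags above.
	raw := `{
		"reasoning": "The agent needed two corrections before the fix landed.",
		"key_points": ["read the failing test first", "avoid broad rewrites"],
		"new_skill": "Run the failing test before editing.",
		"task_type": "debug",
		"confidence": 0.8
	}`
	out, err := parseJudgeOutput(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.TaskType, len(out.KeyPoints), out.Confidence) // debug 2 0.8
}
```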
type PromptTemplate ¶
type PromptTemplate struct {
    ID        string
    Name      string
    Section   string
    Content   string
    Version   int
    IsDefault bool
}
PromptTemplate represents a versioned prompt template variant.
type RewardResult ¶
type RewardResult struct {
    Total            float64
    SuccessScore     float64
    EfficiencyScore  float64
    PromptTokens     int64
    CompletionTokens int64
    MessageCount     int64
    UserCorrections  int
}
RewardResult holds the decomposed reward calculation for a session.
type Service ¶
type Service interface {
    // EvaluateSession triggers evaluation of a completed session (async if configured).
    EvaluateSession(ctx context.Context, sessionID string) error
    // SelectTemplate returns the best prompt template for a section using UCB.
    // Returns nil if insufficient history or evaluator disabled.
    SelectTemplate(ctx context.Context, sectionName string) (*PromptTemplate, error)
    // GetActiveSkills returns skills to inject into prompts for a given task type.
    GetActiveSkills(ctx context.Context, taskType string) ([]Skill, error)
    // GetStats returns current UCB rankings and skill library summary.
    GetStats(ctx context.Context) (*Stats, error)
    // IsEnabled returns whether the evaluator is active.
    IsEnabled() bool
    // RecordTemplateSelection records which template was selected for a session.
    RecordTemplateSelection(ctx context.Context, sessionID, templateID string)
    // ClassifyTask returns a task type label from the user's first message.
    // Returns "general" if no pattern matches.
    ClassifyTask(text string) string
}
Service defines the evaluator interface used by other packages.
type Skill ¶
type Skill struct {
    ID          string
    Title       string
    Content     string
    TaskType    string
    SuccessRate float64
    UsageCount  int
}
Skill represents a learned optimization rule from the Skill Library.
type Stats ¶
type Stats struct {
    TotalEvaluations int
    Templates        []TemplateStats
    SkillCount       int
    TopSkills        []Skill
    AvgReward        float64
    LastEvaluation   time.Time
    IsEnabled        bool
}
Stats holds overall statistics for the self-improvement system.
type TemplateStats ¶
type TemplateStats struct {
    Template  PromptTemplate
    TimesUsed int
    AvgReward float64
    UCBScore  float64
    Rank      int
}
TemplateStats holds UCB statistics for a template (for TUI display).