safety

package
v0.3.0 Latest
Warning

This package is not in the latest version of its module.

Published: May 26, 2026 License: MIT Imports: 12 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func BuildRejectionMessage

func BuildRejectionMessage(result *GroundingResult) string

BuildRejectionMessage constructs a human-readable message explaining which claims in the response were not supported by context.

func FormatAssessment

func FormatAssessment(assessment *RiskAssessment) string

FormatAssessment produces a human-readable formatted string of the assessment.

func FormatGroundingResult

func FormatGroundingResult(result *GroundingResult) string

FormatGroundingResult returns a human-readable summary of the grounding analysis.

func GenerateMitigations

func GenerateMitigations(assessment *RiskAssessment) []string

GenerateMitigations produces mitigation suggestions based on the assessment.

func ShouldProceed

func ShouldProceed(assessment *RiskAssessment) bool

ShouldProceed returns false for "critical" risk level, indicating the change should not proceed without further review.

func ToolNeedsPermission

func ToolNeedsPermission(name string, args map[string]interface{}) bool

ToolNeedsPermission returns true for tools that modify state.

func ToolSummary

func ToolSummary(name string, args map[string]interface{}) string

ToolSummary generates a human-readable summary of what a tool call will do.

Types

type AutonomyConfig

type AutonomyConfig struct {
	Level           AutonomyLevel
	AutoContinue    bool
	AutoApplyEdits  bool
	AutoExecuteBash bool
	AutoCommit      bool
}

AutonomyConfig holds the derived permission flags for an autonomy level.

func PresetConfig

func PresetConfig(level AutonomyLevel) AutonomyConfig

PresetConfig returns the AutonomyConfig for a given level.

func (AutonomyConfig) NeedsPermission

func (c AutonomyConfig) NeedsPermission(toolName string, isSafe bool) bool

NeedsPermission returns true when the tool call should prompt the user. isSafe indicates whether the specific invocation has been classified as safe (e.g. a non-destructive bash command).

type AutonomyLevel

type AutonomyLevel int

AutonomyLevel controls how much the agent can do without asking the user.

const (
	// AutonomySupervised asks for permission on every tool call.
	AutonomySupervised AutonomyLevel = 0
	// AutonomyBasic auto-allows read-only tools.
	AutonomyBasic AutonomyLevel = 1
	// AutonomySemi auto-allows reads and writes, asks for Bash.
	AutonomySemi AutonomyLevel = 2
	// AutonomyFull auto-allows everything except destructive commands.
	AutonomyFull AutonomyLevel = 3
	// AutonomyYOLO never asks for permission.
	AutonomyYOLO AutonomyLevel = 4
)
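
The level comments above can be read as a small decision table. The following is a minimal sketch of that table, not the package's implementation: the `toolKind` classification and the `needsPrompt` helper are hypothetical names introduced for illustration (the real NeedsPermission takes a tool name), and unknown levels are assumed to fail closed by prompting.

```go
package main

import "fmt"

// Level mirrors the documented AutonomyLevel values 0 through 4.
type Level int

const (
	Supervised Level = iota // ask on every tool call
	Basic                   // auto-allow read-only tools
	Semi                    // auto-allow reads and writes, ask for Bash
	Full                    // auto-allow everything except destructive commands
	YOLO                    // never ask
)

// toolKind is a hypothetical classification used only in this sketch.
type toolKind int

const (
	kindRead toolKind = iota
	kindWrite
	kindBash
)

// needsPrompt reports whether the user should be asked before the call runs.
// isSafe marks an invocation already classified as non-destructive.
func needsPrompt(l Level, k toolKind, isSafe bool) bool {
	switch l {
	case Basic:
		return k != kindRead
	case Semi:
		return k == kindBash
	case Full:
		return !isSafe
	case YOLO:
		return false
	default:
		// Supervised, and any unknown level, prompts (assumed fail-closed).
		return true
	}
}

func main() {
	fmt.Println("Semi, write:", needsPrompt(Semi, kindWrite, true))
	fmt.Println("Full, unsafe bash:", needsPrompt(Full, kindBash, false))
}
```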

func ParseAutonomyLevel

func ParseAutonomyLevel(s string) AutonomyLevel

ParseAutonomyLevel converts a string name or number to an AutonomyLevel.
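
As an illustration of parsing "a string name or number", here is a hedged sketch. The name spellings ("supervised", "basic", and so on) and the fallback to level 0 for unrecognized input are assumptions; the documentation above does not state either, and `parseLevel` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLevel accepts either a digit 0-4 or an (assumed) level name.
// Unrecognized input falls back to 0, the most restrictive level;
// that fallback is an assumption of this sketch.
func parseLevel(s string) int {
	s = strings.ToLower(strings.TrimSpace(s))
	if n, err := strconv.Atoi(s); err == nil && n >= 0 && n <= 4 {
		return n
	}
	// Hypothetical names derived from the constant identifiers.
	names := map[string]int{
		"supervised": 0,
		"basic":      1,
		"semi":       2,
		"full":       3,
		"yolo":       4,
	}
	if n, ok := names[s]; ok {
		return n
	}
	return 0
}

func main() {
	for _, in := range []string{"3", " YOLO ", "bogus"} {
		fmt.Printf("%q -> %d\n", in, parseLevel(in))
	}
}
```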

func (AutonomyLevel) String

func (l AutonomyLevel) String() string

String returns the human-readable name of an autonomy level.

type GroundingResult

type GroundingResult struct {
	Score             float64
	SupportedClaims   []string
	UnsupportedClaims []string
	TotalClaims       int
	Grounded          bool
}

GroundingResult holds the outcome of checking a response against context.

type HallucinationGuard

type HallucinationGuard struct {
	Enabled    bool
	Threshold  float64 // fraction of claims that must be grounded (e.g. 0.7 = 70%)
	MaxRetries int
	// contains filtered or unexported fields
}

HallucinationGuard validates agent outputs against source context, rejecting responses that contain unsupported factual claims.

func NewHallucinationGuard

func NewHallucinationGuard() *HallucinationGuard

NewHallucinationGuard creates a HallucinationGuard with sensible defaults.

func (*HallucinationGuard) Check

func (hg *HallucinationGuard) Check(response string, context []string) *GroundingResult

Check validates a response against the provided context sources. It extracts factual claims, verifies each against context, and returns a grounding result indicating whether the response is sufficiently supported.
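
One plausible reading of how Check turns per-claim verdicts into a GroundingResult is sketched below. It assumes the score is the fraction of supported claims and that a response is grounded when the score meets the threshold; treating a response with no claims as grounded is also an assumption. The `result` type and `ground` function are local stand-ins, not the package's code.

```go
package main

import "fmt"

// result is a local stand-in for the documented GroundingResult fields.
type result struct {
	Score       float64
	Supported   []string
	Unsupported []string
	Total       int
	Grounded    bool
}

// ground scores claims with a caller-supplied verifier and compares the
// supported fraction against threshold (e.g. 0.7 = 70%).
func ground(claims []string, verify func(string) bool, threshold float64) result {
	r := result{Total: len(claims)}
	for _, c := range claims {
		if verify(c) {
			r.Supported = append(r.Supported, c)
		} else {
			r.Unsupported = append(r.Unsupported, c)
		}
	}
	if r.Total == 0 {
		// Nothing factual to check: assumed to count as grounded.
		r.Score, r.Grounded = 1, true
		return r
	}
	r.Score = float64(len(r.Supported)) / float64(r.Total)
	r.Grounded = r.Score >= threshold
	return r
}

func main() {
	known := map[string]bool{"a": true, "b": true}
	r := ground([]string{"a", "b", "c"}, func(c string) bool { return known[c] }, 0.7)
	fmt.Printf("score=%.2f grounded=%v unsupported=%v\n", r.Score, r.Grounded, r.Unsupported)
}
```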

func (*HallucinationGuard) ExtractClaims

func (hg *HallucinationGuard) ExtractClaims(text string) []string

ExtractClaims splits text into sentences and filters to those that contain specific factual assertions (names, numbers, paths, technical terms). Opinions, questions, and hedged statements are excluded.

func (*HallucinationGuard) ExtractKeyTerms

func (hg *HallucinationGuard) ExtractKeyTerms(claim string) []string

ExtractKeyTerms removes stop words and returns nouns, numbers, paths, and identifiers from the claim text.

func (*HallucinationGuard) VerifyClaim

func (hg *HallucinationGuard) VerifyClaim(claim string, context []string) bool

VerifyClaim checks whether a claim's key terms appear sufficiently in the provided context. The claim is accepted only when the word overlap is greater than 0.5.
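
The overlap rule can be illustrated with a small sketch. It assumes overlap is the fraction of key terms found in the context by case-insensitive substring search; the real matching may be stricter (for example, whole-word matching). `overlap` and `verify` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// overlap returns the fraction of terms that occur somewhere in context.
// Substring matching is a simplification assumed for this sketch.
func overlap(terms, context []string) float64 {
	if len(terms) == 0 {
		return 0
	}
	joined := strings.ToLower(strings.Join(context, "\n"))
	found := 0
	for _, t := range terms {
		if strings.Contains(joined, strings.ToLower(t)) {
			found++
		}
	}
	return float64(found) / float64(len(terms))
}

// verify applies the documented rule: overlap must be strictly above 0.5.
func verify(terms, context []string) bool {
	return overlap(terms, context) > 0.5
}

func main() {
	ctx := []string{"The parser targets Go"}
	fmt.Println(verify([]string{"parser", "Go", "1.22"}, ctx)) // 2 of 3 terms found
	fmt.Println(verify([]string{"parser", "Rust"}, ctx))       // exactly half found
}
```

Because the comparison is strict, a claim with exactly half of its terms present is rejected.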

type OutputRedactor

type OutputRedactor struct {
	Patterns     []*RedactPattern
	KnownSecrets map[string]string
	Stats        RedactStats
	// contains filtered or unexported fields
}

OutputRedactor strips sensitive information from tool outputs before they reach the LLM.

func NewOutputRedactor

func NewOutputRedactor() *OutputRedactor

NewOutputRedactor creates an OutputRedactor pre-loaded with 25+ built-in patterns covering common secret formats.

func (*OutputRedactor) AddKnownSecret

func (r *OutputRedactor) AddKnownSecret(name, value string)

AddKnownSecret registers a specific value to always redact from outputs. The name is used in the replacement placeholder.

func (*OutputRedactor) AddPattern

func (r *OutputRedactor) AddPattern(name string, pattern string, category string) error

AddPattern registers a new redaction pattern at runtime. Returns an error if the pattern cannot be compiled.

func (*OutputRedactor) FormatStats

func (r *OutputRedactor) FormatStats() string

FormatStats returns a human-readable summary of redaction statistics.

func (*OutputRedactor) IsClean

func (r *OutputRedactor) IsClean(output string) bool

IsClean performs a quick check to determine whether the output contains any detectable secrets. Returns true if no secrets are found.

func (*OutputRedactor) Redact

func (r *OutputRedactor) Redact(output string) string

Redact applies all patterns and known secrets to the output, replacing matches with [REDACTED:<category>] placeholders. Returns the sanitized string.

func (*OutputRedactor) RedactEnvVars

func (r *OutputRedactor) RedactEnvVars(output string) string

RedactEnvVars scans the output for values of known environment variables whose names suggest they contain secrets, and replaces them.

func (*OutputRedactor) RedactPaths

func (r *OutputRedactor) RedactPaths(output string, homedir string) string

RedactPaths replaces the user's home directory in output with ~/ to avoid leaking the full filesystem path structure.
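
A hedged sketch of the home-directory substitution follows. It assumes a plain string replacement of the directory prefix and treats an empty homedir as a no-op; `redactHome` is a hypothetical helper and only handles forward-slash paths.

```go
package main

import (
	"fmt"
	"strings"
)

// redactHome rewrites occurrences of home + "/" to "~/".
// Matching on the trailing slash keeps "/home/alice2" from being
// rewritten when home is "/home/alice".
func redactHome(output, home string) string {
	home = strings.TrimRight(home, "/")
	if home == "" {
		return output // nothing sensible to replace
	}
	return strings.ReplaceAll(output, home+"/", "~/")
}

func main() {
	fmt.Println(redactHome("open /home/alice/src/main.go", "/home/alice/"))
}
```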

type PermissionEngine

type PermissionEngine struct {
	Memory     *PermissionMemory
	AutoMode   *permissions.AutoModeState
	Classifier *permissions.Classifier
	BypassKill *permissions.BypassKillswitch
	Mode       PermissionMode
	Autonomy   AutonomyLevel
	PromptFn   func(PermissionRequest) // callback to ask user
}

PermissionEngine encapsulates all permission-checking logic. Extracted from Session to keep the god object lean.

func NewPermissionEngine

func NewPermissionEngine() *PermissionEngine

NewPermissionEngine creates a PermissionEngine with sensible defaults.

func (*PermissionEngine) ApplyToolState

func (pe *PermissionEngine) ApplyToolState(name string)

ApplyToolState updates permission mode based on plan mode tools.

func (*PermissionEngine) CheckTool

func (pe *PermissionEngine) CheckTool(ctx context.Context, tc ToolCallInfo) (bool, string)

CheckTool determines whether a tool call is allowed, denied, or needs a user prompt. It returns (granted bool, denyReason string). If the user must be asked, it blocks on PromptFn with a 5-minute timeout.

func (*PermissionEngine) SetMode

func (pe *PermissionEngine) SetMode(mode string) error

SetMode applies a permission mode string.

type PermissionMemory

type PermissionMemory struct {
	// contains filtered or unexported fields
}

PermissionMemory stores always-allow and always-deny rules.

func NewPermissionMemory

func NewPermissionMemory() *PermissionMemory

func (*PermissionMemory) AllowSpec

func (pm *PermissionMemory) AllowSpec(spec string)

AllowSpec applies an archive-style permission rule, e.g. "Bash(git:*)".

func (*PermissionMemory) AlwaysAllow

func (pm *PermissionMemory) AlwaysAllow(toolName string)

AlwaysAllow marks a tool as always allowed.

func (*PermissionMemory) AlwaysAllowPattern

func (pm *PermissionMemory) AlwaysAllowPattern(pattern string)

AlwaysAllowPattern adds a pattern rule (e.g. "bash:go *").

func (*PermissionMemory) AlwaysDeny

func (pm *PermissionMemory) AlwaysDeny(toolName string)

AlwaysDeny marks a tool as always denied.

func (*PermissionMemory) AlwaysDenyPattern

func (pm *PermissionMemory) AlwaysDenyPattern(pattern string)

AlwaysDenyPattern adds a deny pattern rule.

func (*PermissionMemory) Check

func (pm *PermissionMemory) Check(toolName string, summary string) *bool

Check returns a pointer to true if the call is allowed, a pointer to false if it is denied, and nil if the user should be asked.
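
A caller has to handle three outcomes from a *bool result. The sketch below shows one way to produce and consume such a value; the `memory` type is a simplified stand-in keyed only by tool name, and giving deny rules precedence over allow rules is an assumption.

```go
package main

import "fmt"

// memory is a simplified stand-in that remembers exact tool names only.
type memory struct {
	allow map[string]bool
	deny  map[string]bool
}

// check returns &true (allowed), &false (denied), or nil (ask the user).
// Deny rules are checked first; that ordering is assumed for this sketch.
func (m *memory) check(tool string) *bool {
	if m.deny[tool] {
		v := false
		return &v
	}
	if m.allow[tool] {
		v := true
		return &v
	}
	return nil
}

// describe shows the three-way branch a caller needs.
func describe(d *bool) string {
	switch {
	case d == nil:
		return "ask"
	case *d:
		return "allow"
	default:
		return "deny"
	}
}

func main() {
	m := &memory{
		allow: map[string]bool{"Read": true},
		deny:  map[string]bool{"Bash": true},
	}
	for _, tool := range []string{"Read", "Bash", "Write"} {
		fmt.Println(tool, "->", describe(m.check(tool)))
	}
}
```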

func (*PermissionMemory) DenySpec

func (pm *PermissionMemory) DenySpec(spec string)

DenySpec applies an archive-style deny rule, e.g. "Write(*.env)".

type PermissionMode

type PermissionMode string

PermissionMode controls how permission prompts are handled.

const (
	PermissionModeDefault           PermissionMode = "default"
	PermissionModeAcceptEdits       PermissionMode = "acceptEdits"
	PermissionModeBypassPermissions PermissionMode = "bypassPermissions"
	PermissionModeDontAsk           PermissionMode = "dontAsk"
	PermissionModePlan              PermissionMode = "plan"
)

type PermissionRequest

type PermissionRequest struct {
	ToolName string
	ToolID   string
	Summary  string
	Response chan bool
}

PermissionRequest is sent from engine to TUI when a tool needs approval.

type ProtectedPaths

type ProtectedPaths struct {
	// contains filtered or unexported fields
}

ProtectedPaths tracks file paths that are read-only within the session. Tools that write or edit files should check IsProtected before proceeding.

func NewProtectedPaths

func NewProtectedPaths() *ProtectedPaths

NewProtectedPaths creates an empty ProtectedPaths set.

func (*ProtectedPaths) Add

func (p *ProtectedPaths) Add(path string)

Add marks a path as protected (read-only). The path is cleaned before storage for consistent lookups.

func (*ProtectedPaths) Format

func (p *ProtectedPaths) Format() string

Format returns a human-readable block suitable for system prompt injection.

func (*ProtectedPaths) IsProtected

func (p *ProtectedPaths) IsProtected(path string) bool

IsProtected returns true when path (or any ancestor directory) is protected.
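
The ancestor rule can be sketched by walking up the directory tree with path/filepath. This is an illustrative reimplementation under the assumption that paths are compared after filepath.Clean; the `protected` type is a hypothetical stand-in for ProtectedPaths.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// protected is a minimal stand-in: a set of cleaned paths.
type protected map[string]struct{}

// add stores the cleaned path so "/etc/app/" and "/etc/app" are the same key.
func (p protected) add(path string) {
	p[filepath.Clean(path)] = struct{}{}
}

// isProtected checks path and then each ancestor directory in turn.
func (p protected) isProtected(path string) bool {
	cur := filepath.Clean(path)
	for {
		if _, ok := p[cur]; ok {
			return true
		}
		parent := filepath.Dir(cur)
		if parent == cur {
			return false // reached the root (or "."), nothing matched
		}
		cur = parent
	}
}

func main() {
	p := protected{}
	p.add("/etc/app/")
	fmt.Println(p.isProtected("/etc/app/conf.d/x.yaml"))
	fmt.Println(p.isProtected("/etc/application"))
}
```

Walking whole path components, rather than testing string prefixes, keeps a sibling such as /etc/application from matching a protected /etc/app.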

func (*ProtectedPaths) List

func (p *ProtectedPaths) List() []string

List returns a sorted slice of all protected paths.

func (*ProtectedPaths) Remove

func (p *ProtectedPaths) Remove(path string)

Remove unmarks a path so it is no longer protected.

type RedactPattern

type RedactPattern struct {
	Name        string
	Pattern     *regexp.Regexp
	Replacement string
	Category    string // "api_key", "token", "password", "cert", "connection_string"
}

RedactPattern defines a regex pattern used to detect and redact sensitive information.

type RedactStats

type RedactStats struct {
	TotalRedacted int
	ByCategory    map[string]int
	BytesSaved    int
}

RedactStats tracks cumulative redaction statistics.

type RiskAssessment

type RiskAssessment struct {
	Score          float64
	Level          string
	Factors        []RiskFactor
	Mitigations    []string
	Recommendation string
}

RiskAssessment holds the result of evaluating how risky a proposed code change is.

type RiskAssessor

type RiskAssessor struct {
	Factors []RiskFactorDef
	// contains filtered or unexported fields
}

RiskAssessor evaluates risk of proposed code changes.

func NewRiskAssessor

func NewRiskAssessor() *RiskAssessor

NewRiskAssessor creates a new RiskAssessor with built-in factors.

func (*RiskAssessor) Assess

func (ra *RiskAssessor) Assess(ctx *RiskContext) *RiskAssessment

Assess evaluates all risk factors and returns a RiskAssessment.
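
How per-factor results combine into one score is not specified above. The sketch below assumes a weighted average of factor scores in the range 0 to 1 and maps the result to a level; the thresholds and every level name other than "critical" are hypothetical, as are the `riskCtx`, `factorDef`, `assess`, and `shouldProceed` names.

```go
package main

import "fmt"

// riskCtx carries a subset of the documented RiskContext fields.
type riskCtx struct {
	LinesChanged int
	TestsExist   bool
}

// factorDef mirrors the shape of RiskFactorDef: a weight plus an evaluator.
type factorDef struct {
	Name     string
	Weight   float64
	Evaluate func(*riskCtx) float64
}

// assess returns the weighted average of factor scores and a level label.
func assess(defs []factorDef, ctx *riskCtx) (float64, string) {
	var sum, weights float64
	for _, d := range defs {
		sum += d.Weight * d.Evaluate(ctx)
		weights += d.Weight
	}
	score := 0.0
	if weights > 0 {
		score = sum / weights
	}
	// Thresholds are illustrative assumptions.
	switch {
	case score >= 0.8:
		return score, "critical"
	case score >= 0.5:
		return score, "high"
	case score >= 0.25:
		return score, "medium"
	default:
		return score, "low"
	}
}

// shouldProceed follows the documented rule: only "critical" blocks.
func shouldProceed(level string) bool { return level != "critical" }

func main() {
	defs := []factorDef{
		{"large change", 1, func(c *riskCtx) float64 {
			if c.LinesChanged > 500 {
				return 1
			}
			return 0
		}},
		{"no tests", 1, func(c *riskCtx) float64 {
			if c.TestsExist {
				return 0
			}
			return 1
		}},
	}
	score, level := assess(defs, &riskCtx{LinesChanged: 1000, TestsExist: false})
	fmt.Println(score, level, shouldProceed(level))
}
```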

type RiskContext

type RiskContext struct {
	Files             []string
	Diff              string
	TestsExist        bool
	IsExported        bool
	HasBreakingChange bool
	LinesChanged      int
	FilesAffected     int
	Complexity        int
}

RiskContext provides the context needed to assess risk of a change.

type RiskFactor

type RiskFactor struct {
	Name        string
	Weight      float64
	Score       float64
	Description string
}

RiskFactor represents an individual evaluated risk factor.

type RiskFactorDef

type RiskFactorDef struct {
	Name       string
	Weight     float64
	EvaluateFn func(ctx *RiskContext) float64
}

RiskFactorDef defines a risk factor with its evaluation function.

type ToolCallInfo

type ToolCallInfo struct {
	Name string
	ID   string
	Args map[string]interface{}
}

ToolCallInfo is a minimal struct for permission checking.
