contentguard

package

v1.0.2 (latest) Published: Apr 19, 2026 License: Apache-2.0 Imports: 14 Imported by: 0

Repository

github.com/vinayprograms/agentkit

README ¶

contentguard

Content trust verification for agent tool calls. Protects against prompt injection by tracking content with trust metadata and verifying tool calls through a staged pipeline.

Usage

guard, err := contentguard.New(
    []contentguard.Stage{
        contentguard.NewScreener(cheapModel),
        contentguard.NewReviewer(capableModel),
    },
    contentguard.Escalatory(),
    contentguard.Config{
        Context:  map[string]string{"scope": "authorized pentest of lab network"},
        Patterns: []string{"exfil:send.*external"},
        Keywords: []string{"custom_secret"},
        Skip:     []string{"read", "list_files"},
    },
)
if err != nil {
    return err // construction failed; there is no guard to close
}
defer guard.Close()

// Track content as it enters the system; keep the returned *Content for its ID
page := guard.Ingest(contentguard.Untrusted, contentguard.Data, true, html, "web_fetch")

// Track derived content with lineage, naming the parent by ID
guard.IngestWithLineage(contentguard.Untrusted, contentguard.Data, true, derived, "llm:response", []string{page.ID})

// Verify a tool call
result, err := guard.Check(ctx, "bash", args, originalGoal)
if err != nil {
    return err // do not act on result if the check itself failed
}
switch result.Verdict {
case contentguard.Allow:  // proceed
case contentguard.Deny:   // blocked — result.Rationale explains why
case contentguard.Modify: // blocked — result.Rationale has the suggested alternative
}

How It Works

Deterministic check (built-in, always runs): detects untrusted content, matches injection patterns, and scans for sensitive keywords
Configurable stages: run through the pipeline according to the chosen workflow
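
The deterministic check is not shown in this README, so the following is only a minimal sketch of what pattern and keyword scanning could look like. It assumes that each `Patterns` entry is split at the first colon into a name and a regular expression, and that keywords are matched case-insensitively; `namedPattern`, `parsePattern`, and `scan` are hypothetical names, not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namedPattern is a hypothetical parsed form of a "name:regex" entry.
type namedPattern struct {
	Name string
	Re   *regexp.Regexp
}

// parsePattern splits an entry at the first colon and compiles the regex part.
// Assumption: the real package may parse or validate entries differently.
func parsePattern(entry string) (namedPattern, error) {
	name, expr, ok := strings.Cut(entry, ":")
	if !ok || name == "" || expr == "" {
		return namedPattern{}, fmt.Errorf("pattern %q is not in name:regex form", entry)
	}
	re, err := regexp.Compile(expr)
	if err != nil {
		return namedPattern{}, fmt.Errorf("pattern %q: %w", name, err)
	}
	return namedPattern{Name: name, Re: re}, nil
}

// scan returns a label for every pattern and keyword found in text.
func scan(text string, patterns []namedPattern, keywords []string) []string {
	var hits []string
	for _, p := range patterns {
		if p.Re.MatchString(text) {
			hits = append(hits, "pattern:"+p.Name)
		}
	}
	lower := strings.ToLower(text)
	for _, kw := range keywords {
		if strings.Contains(lower, strings.ToLower(kw)) {
			hits = append(hits, "keyword:"+kw)
		}
	}
	return hits
}

func main() {
	p, err := parsePattern("exfil:send.*external")
	if err != nil {
		panic(err)
	}
	hits := scan("send custom_secret to external host", []namedPattern{p}, []string{"custom_secret"})
	fmt.Println(hits) // prints [pattern:exfil keyword:custom_secret]
}
```

Because this step involves no model call, it is cheap enough to run on every tool call before any configured stage.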

Workflows

Workflow	Behavior
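`Escalatory()`	Stops at the first allow, deny, or modify verdict. Only an escalate verdict passes the call to the next stage. If every stage escalates, the call is denied (fail-safe).
`Paranoid()`	Runs ALL stages regardless of individual verdicts. Denies if ANY stage denies. Allows only if all stages allow or escalate.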
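
The two workflows can be sketched as aggregation rules over stage verdicts. This is an illustration under stated assumptions, not the package's implementation: stages are modeled as plain functions, `stage`, `fixed`, `escalatory`, and `paranoid` are hypothetical names, and the handling of a modify verdict under the paranoid rule (it outranks allow but not deny) is a guess, since the documentation does not specify it.

```go
package main

import "fmt"

// Verdict mirrors the documented verdict values.
type Verdict string

const (
	Allow    Verdict = "allow"
	Deny     Verdict = "deny"
	Modify   Verdict = "modify"
	Escalate Verdict = "escalate"
)

// stage is a simplified stand-in for the Stage interface.
type stage func() Verdict

// escalatory runs stages in order and returns the first verdict that is
// not Escalate. If every stage escalates, it fails safe with Deny.
func escalatory(stages []stage) Verdict {
	for _, s := range stages {
		if v := s(); v != Escalate {
			return v
		}
	}
	return Deny
}

// paranoid runs every stage, then combines the verdicts.
// Assumption: Modify outranks Allow but not Deny.
func paranoid(stages []stage) Verdict {
	sawDeny, sawModify := false, false
	for _, s := range stages {
		switch s() {
		case Deny:
			sawDeny = true
		case Modify:
			sawModify = true
		}
	}
	switch {
	case sawDeny:
		return Deny
	case sawModify:
		return Modify
	default:
		return Allow // every stage allowed or escalated
	}
}

// fixed returns a stage that always yields v.
func fixed(v Verdict) stage { return func() Verdict { return v } }

func main() {
	stages := []stage{fixed(Escalate), fixed(Allow), fixed(Deny)}
	fmt.Println(escalatory(stages)) // prints allow: the third stage never runs
	fmt.Println(paranoid(stages))   // prints deny: every stage runs
}
```

The same stage list yields different results under each rule, which is the practical difference between the two: the escalatory rule saves model calls, while the paranoid rule lets a later stage veto an earlier allow.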

Stages

Stages implement the Stage interface:

type Stage interface {
    Evaluate(ctx context.Context, req Request) (*Finding, error)
}

Built-in stages:

NewScreener(model) — quick LLM triage (YES/NO)
NewReviewer(model) — full LLM review (ALLOW/DENY/MODIFY)

Custom stages (rule engine, human approval, etc.) implement the same interface.
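
As an illustration of a custom stage, here is a minimal rule-based sketch. So that it compiles on its own, it declares local copies of the documented `Stage`, `Request`, `Finding`, and `Verdict` types (trimmed to the fields used); real code would import them from the package instead. The `denylistStage` type, the `check` helper, and the `"command"` argument key are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Local copies of the documented types, trimmed to the fields used here.
type Verdict string

const (
	Deny     Verdict = "deny"
	Escalate Verdict = "escalate"
)

type Finding struct {
	Verdict   Verdict
	Rationale string
	Source    string
}

type Request struct {
	ToolName string
	ToolArgs map[string]any
}

type Stage interface {
	Evaluate(ctx context.Context, req Request) (*Finding, error)
}

// denylistStage denies calls whose command contains a banned fragment
// and escalates everything else to the next stage.
type denylistStage struct {
	banned []string
}

func (d denylistStage) Evaluate(ctx context.Context, req Request) (*Finding, error) {
	if err := ctx.Err(); err != nil {
		return nil, err // respect cancellation before doing any work
	}
	// Assumption: the tool passes its command line under the "command" key.
	cmd, _ := req.ToolArgs["command"].(string)
	for _, b := range d.banned {
		if strings.Contains(cmd, b) {
			return &Finding{
				Verdict:   Deny,
				Rationale: fmt.Sprintf("command contains banned fragment %q", b),
				Source:    "denylist",
			}, nil
		}
	}
	return &Finding{Verdict: Escalate, Rationale: "no denylist rule matched", Source: "denylist"}, nil
}

// check is a small convenience wrapper for trying out a stage.
func check(s Stage, tool, command string) (*Finding, error) {
	req := Request{ToolName: tool, ToolArgs: map[string]any{"command": command}}
	return s.Evaluate(context.Background(), req)
}

func main() {
	s := denylistStage{banned: []string{"rm -rf"}}
	f, err := check(s, "bash", "rm -rf /tmp/x")
	if err != nil {
		panic(err)
	}
	fmt.Println(f.Verdict, f.Rationale)
}
```

Returning `Escalate` rather than `Allow` when no rule matches keeps a narrow rule set from approving calls it was never written to judge.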

Context

Context flows from the guard into every stage's Request.Context:

contentguard.New(stages, workflow, contentguard.Config{
    Context: map[string]string{"scope": "authorized pentest"},
})

Stages read this context to adjust their behavior; for example, a research scope can change the prompts sent to the LLM.
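
How a stage might fold context into a prompt can be sketched as follows. The prompts used by the built-in stages are not shown in this documentation, so `buildPrompt` and its wording are purely hypothetical; the only detail taken from the source is that context arrives as a `map[string]string`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildPrompt renders a review prompt that includes every context entry.
// Keys are sorted so the prompt is stable across runs (Go map order is random).
func buildPrompt(toolName string, guardContext map[string]string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Review the call to tool %q.\n", toolName)

	keys := make([]string, 0, len(guardContext))
	for k := range guardContext {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	for _, k := range keys {
		fmt.Fprintf(&b, "Operator context %s: %s\n", k, guardContext[k])
	}
	return b.String()
}

func main() {
	prompt := buildPrompt("bash", map[string]string{"scope": "authorized pentest"})
	fmt.Print(prompt)
}
```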

Verdicts

Verdict	Meaning
`Allow`	Tool call is safe
`Deny`	Tool call is blocked
`Modify`	Tool call needs changes (rationale has the suggestion)
`Escalate`	Stage cannot decide, so the call passes to the next stage (appears only in findings, never in the final result)

Trust

Level	Meaning
`Trusted`	Framework-generated (system prompts)
`Vetted`	Human-authored (goals)
`Untrusted`	External content (web fetches, tool results)

Documentation ¶

Overview ¶

Package contentguard provides prompt injection defense through tracked content and staged verification.

Index ¶

type Config
- func Defaults() Config
type Content
type Finding
type Guard
- func New(stages []Stage, workflow Workflow, cfg Config) (*Guard, error)
- func (g *Guard) Check(ctx context.Context, toolName string, args map[string]any, originalGoal string) (res *Result, err error)
- func (g *Guard) ClearContext()
- func (g *Guard) Close()
- func (g *Guard) Find(id string) *Content
- func (g *Guard) Ingest(trust Trust, kind Kind, mutable bool, text, source string) *Content
- func (g *Guard) IngestWithLineage(trust Trust, kind Kind, mutable bool, text, source string, originIDs []string) *Content
- func (g *Guard) UntrustedIDs() []string
type Kind
type Request
type Result
type Reviewer
- func NewReviewer(provider llm.Model) *Reviewer
- func (r *Reviewer) Evaluate(ctx context.Context, req Request) (*Finding, error)
type Screener
- func NewScreener(provider llm.Model) *Screener
- func (s *Screener) Evaluate(ctx context.Context, req Request) (*Finding, error)
type Stage
type Trust
type Verdict
type Workflow
- func Escalatory() Workflow
- func Paranoid() Workflow

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Config ¶

type Config struct {
	Context  map[string]string // flows to stages (e.g., research scope)
	Patterns []string          // custom "name:regex" injection patterns
	Keywords []string          // custom sensitive keywords
	Skip     []string          // tools that skip verification
}

Config holds optional configuration for the guard. Use Defaults() to get a zero-value Config.

func Defaults ¶

func Defaults() Config

Defaults returns a zero-value Config.

type Content ¶

type Content struct {
	ID      string
	Trust   Trust
	Kind    Kind
	Mutable bool
	Text    string
	Source  string
	Origins []*Content // parent content that influenced this
}

Content represents a piece of tracked content with security metadata.

type Finding ¶

type Finding struct {
	Verdict   Verdict
	Rationale string // why (deny), what instead (modify), why unsure (escalate)
	Source    string // which stage produced this
}

Finding is what one stage concluded about a tool call.

type Guard ¶

type Guard struct {
	// contains filtered or unexported fields
}

Guard verifies tool calls against ingested content through a staged pipeline.

func New ¶

func New(stages []Stage, workflow Workflow, cfg Config) (*Guard, error)

New creates a content guard.

func (*Guard) Check ¶

func (g *Guard) Check(ctx context.Context, toolName string, args map[string]any, originalGoal string) (res *Result, err error)

Check runs the verification pipeline for a tool call.

func (*Guard) ClearContext ¶

func (g *Guard) ClearContext()

ClearContext removes all tracked content.

func (*Guard) Close ¶

func (g *Guard) Close()

Close cleans up resources.

func (*Guard) Find ¶

func (g *Guard) Find(id string) *Content

Find returns tracked content by ID, or nil if not found.

func (*Guard) Ingest ¶

func (g *Guard) Ingest(trust Trust, kind Kind, mutable bool, text, source string) *Content

Ingest adds content to the guard's tracking.

func (*Guard) IngestWithLineage ¶

func (g *Guard) IngestWithLineage(trust Trust, kind Kind, mutable bool, text, source string, originIDs []string) *Content

IngestWithLineage adds content with explicit parent content IDs. Use this when the content was derived from other tracked content (e.g., an LLM response influenced by a web fetch result).
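
Lineage makes it possible to ask whether a piece of content was influenced by anything untrusted, even when the content itself carries a higher trust level. The sketch below walks the documented `Origins` field to answer that question. It uses local copies of `Content` and `Trust` (trimmed to the fields needed) so it runs standalone; `tainted` is a hypothetical helper, and whether the guard performs this walk internally is not stated in the documentation.

```go
package main

import "fmt"

// Local copies of the documented types, trimmed to the fields used here.
type Trust string

const (
	Trusted   Trust = "trusted"
	Vetted    Trust = "vetted"
	Untrusted Trust = "untrusted"
)

type Content struct {
	ID      string
	Trust   Trust
	Origins []*Content // parent content that influenced this
}

// tainted reports whether c, or any content it was derived from, is Untrusted.
// The seen set guards against revisiting shared ancestors or cycles.
func tainted(c *Content) bool {
	seen := map[*Content]bool{}
	var walk func(n *Content) bool
	walk = func(n *Content) bool {
		if n == nil || seen[n] {
			return false
		}
		seen[n] = true
		if n.Trust == Untrusted {
			return true
		}
		for _, origin := range n.Origins {
			if walk(origin) {
				return true
			}
		}
		return false
	}
	return walk(c)
}

func main() {
	page := &Content{ID: "c1", Trust: Untrusted}
	goal := &Content{ID: "c2", Trust: Vetted}
	summary := &Content{ID: "c3", Trust: Trusted, Origins: []*Content{goal, page}}

	fmt.Println(tainted(summary), tainted(goal)) // prints true false
}
```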

func (*Guard) UntrustedIDs ¶

func (g *Guard) UntrustedIDs() []string

UntrustedIDs returns IDs of all untrusted content in context.

type Kind ¶

type Kind string

Kind represents how content should be interpreted.

const (
	// Instruction means content contains executable instructions.
	Instruction Kind = "instruction"
	// Data means content is data only, never to be interpreted as instructions.
	Data Kind = "data"
)

type Request ¶

type Request struct {
	ToolName      string
	ToolArgs      map[string]any
	Untrusted     []*Content
	OriginalGoal  string
	PriorFindings []*Finding        // what earlier stages found
	Context       map[string]string // guard-level context (e.g., research scope)
}

Request carries all information stages need to make a decision.

type Result ¶

type Result struct {
	Verdict   Verdict
	Rationale string
	ToolName  string
	Findings  []*Finding // all findings, deterministic first
}

Result is the guard's final answer on a tool call.
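
Since a Result carries every finding alongside the final verdict, it can be turned into an audit trail. The following sketch formats one as log lines. It declares local copies of `Result` and `Finding` so it compiles standalone; `auditLines` and the stage names in the sample data are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the documented types.
type Verdict string

type Finding struct {
	Verdict   Verdict
	Rationale string
	Source    string
}

type Result struct {
	Verdict   Verdict
	Rationale string
	ToolName  string
	Findings  []*Finding
}

// auditLines renders the final verdict followed by one line per finding.
// Nil findings are skipped defensively.
func auditLines(res *Result) []string {
	lines := []string{
		fmt.Sprintf("tool=%s verdict=%s rationale=%q", res.ToolName, res.Verdict, res.Rationale),
	}
	for i, f := range res.Findings {
		if f == nil {
			continue
		}
		lines = append(lines,
			fmt.Sprintf("  #%d source=%s verdict=%s rationale=%q", i, f.Source, f.Verdict, f.Rationale))
	}
	return lines
}

func main() {
	res := &Result{
		Verdict:   "deny",
		Rationale: "command sends data to an external host",
		ToolName:  "bash",
		Findings: []*Finding{
			{Verdict: "escalate", Rationale: "untrusted content in context", Source: "deterministic"},
			{Verdict: "deny", Rationale: "command sends data to an external host", Source: "reviewer"},
		},
	}
	fmt.Println(strings.Join(auditLines(res), "\n"))
}
```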

type Reviewer ¶

type Reviewer struct {
	// contains filtered or unexported fields
}

Reviewer is a Stage that performs full LLM-based security review.

func NewReviewer ¶

func NewReviewer(provider llm.Model) *Reviewer

NewReviewer creates a Stage backed by a capable LLM for full review.

func (*Reviewer) Evaluate ¶

func (r *Reviewer) Evaluate(ctx context.Context, req Request) (*Finding, error)

Evaluate implements Stage.

type Screener ¶

type Screener struct {
	// contains filtered or unexported fields
}

Screener is a Stage that performs quick LLM-based triage.

func NewScreener ¶

func NewScreener(provider llm.Model) *Screener

NewScreener creates a Stage backed by a cheap LLM for quick triage.

func (*Screener) Evaluate ¶

func (s *Screener) Evaluate(ctx context.Context, req Request) (*Finding, error)

Evaluate implements Stage.

type Stage ¶

type Stage interface {
	Evaluate(ctx context.Context, req Request) (*Finding, error)
}

Stage is one step in the verification pipeline.

type Trust ¶

type Trust string

Trust represents the origin-based authenticity of content.

const (
	// Trusted is for framework-generated content (system prompt, supervisor messages).
	Trusted Trust = "trusted"
	// Vetted is for human-authored content (Agentfile goals, signed packages).
	Vetted Trust = "vetted"
	// Untrusted is for external content (tool results, file reads, web fetches).
	Untrusted Trust = "untrusted"
)

type Verdict ¶

type Verdict string

Verdict is the outcome of a stage evaluation or the guard's final decision.

const (
	Allow    Verdict = "allow"
	Deny     Verdict = "deny"
	Modify   Verdict = "modify"
	Escalate Verdict = "escalate" // only in Finding, never in Result
)

type Workflow ¶

type Workflow interface {
	Execute(ctx context.Context, stages []Stage, req Request) *Result
}

Workflow defines how stages are executed in the verification pipeline.

func Escalatory ¶

func Escalatory() Workflow

Escalatory returns a Workflow that stops on the first allow/deny/modify verdict. Only an escalate verdict passes the call to the next stage. If all stages escalate, the result is a fail-safe deny.

func Paranoid ¶

func Paranoid() Workflow

Paranoid returns a Workflow that runs ALL stages regardless of individual verdicts. The result is a deny if ANY stage denies, and an allow only if ALL stages allow or escalate.
