inferenceproxy

package
v0.0.0-...-c16d419 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 14, 2026 License: Apache-2.0 Imports: 9 Imported by: 0

Documentation

Overview

Package inferenceproxy implements a guarded inference proxy for Helix Cluster OS. The proxy fronts an inference Backend and enforces an ADMISSION gate (an Authorizer) before any request is forwarded. The ordering guarantee is the security-critical contract: authorization runs FIRST, and a denied request never reaches the backend. Every request, allowed or denied, is recorded to an audit sink as an AuditEntry (subject, model, decision, prompt byte length, timestamp, and any error).

SCOPE / HONESTY: this package is self-contained and stdlib-only. It does NOT import pkg/inference, the model registry, or the digital.vasic.security submodule. Real serving engines plug in behind the Backend interface; real identity/policy systems plug in behind the Authorizer interface. The bundled TokenAuthorizer is a WORKING reference that verifies request tokens with a real HMAC-SHA256 keyed MAC and constant-time comparison (crypto/hmac, crypto/sha256, crypto/subtle); it is not a no-op stub labeled "real".


Constants

This section is empty.

Variables

var (
	// ErrMissingToken indicates the request carried no token.
	ErrMissingToken = errors.New("inferenceproxy: missing token")
	// ErrBadToken indicates the token's MAC did not verify against the secret.
	ErrBadToken = errors.New("inferenceproxy: token signature invalid")
	// ErrScopeDenied indicates a valid token lacks the scope the request needs.
	ErrScopeDenied = errors.New("inferenceproxy: scope not permitted")
)

Authorization-failure reasons returned by TokenAuthorizer. The proxy wraps them under ErrUnauthorized, so callers can match either ErrUnauthorized or the specific reason with errors.Is.

var (
	// ErrUnauthorized is returned (wrapped) when the admission gate denies a
	// request. When the proxy returns an error that Is ErrUnauthorized, the
	// backend was NEVER invoked.
	ErrUnauthorized = errors.New("inferenceproxy: request not authorized")
	// ErrNilBackend is returned by NewProxy when no backend is supplied.
	ErrNilBackend = errors.New("inferenceproxy: backend is required")
	// ErrNilAuthorizer is returned by NewProxy when no authorizer is supplied.
	ErrNilAuthorizer = errors.New("inferenceproxy: authorizer is required")
)

Sentinel errors so callers (and tests) can branch on the failure class with errors.Is rather than string matching.

Functions

This section is empty.

Types

type AuditEntry

type AuditEntry struct {
	// Subject is the caller identity ("who").
	Subject string
	// Model is the requested model.
	Model string
	// Decision is allow or deny.
	Decision Decision
	// PromptBytes is the byte length of the request prompt.
	PromptBytes int
	// Time is when the decision was recorded.
	Time time.Time
	// Err is the reason for a deny (or a propagated backend error on an allow);
	// nil on a clean allow.
	Err error
}

AuditEntry is a single sink-side record of a request passing through the proxy. One entry is produced per Infer call regardless of the decision.

type AuditSink

type AuditSink interface {
	// Record appends an audit entry.
	Record(entry AuditEntry)
}

AuditSink receives one AuditEntry per request. Implementations must be safe to call from the goroutine that invokes Infer.
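
A custom sink might look like the following minimal sketch. The types are redeclared locally so the example stands alone, and tallySink with its Count method is a hypothetical implementation, not part of the package. It guards its state with a mutex on the assumption that Infer may be called from several goroutines at once.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Local stand-ins for the package types.
type Decision string

const (
	DecisionAllow Decision = "allow"
	DecisionDeny  Decision = "deny"
)

type AuditEntry struct {
	Subject     string
	Model       string
	Decision    Decision
	PromptBytes int
	Time        time.Time
	Err         error
}

type AuditSink interface {
	Record(entry AuditEntry)
}

// tallySink is a hypothetical AuditSink that counts decisions per subject.
type tallySink struct {
	mu     sync.Mutex
	counts map[string]map[Decision]int
}

func newTallySink() *tallySink {
	return &tallySink{counts: make(map[string]map[Decision]int)}
}

// Record increments the counter for the entry's subject and decision.
func (s *tallySink) Record(e AuditEntry) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.counts[e.Subject] == nil {
		s.counts[e.Subject] = make(map[Decision]int)
	}
	s.counts[e.Subject][e.Decision]++
}

// Count reports how many entries were recorded for subject with decision d.
func (s *tallySink) Count(subject string, d Decision) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.counts[subject][d]
}

func main() {
	sink := newTallySink()
	var _ AuditSink = sink // compile-time interface check

	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sink.Record(AuditEntry{Subject: "alice", Decision: DecisionAllow, Time: time.Now()})
		}()
	}
	wg.Wait()
	sink.Record(AuditEntry{Subject: "alice", Decision: DecisionDeny, Time: time.Now()})

	fmt.Println(sink.Count("alice", DecisionAllow), sink.Count("alice", DecisionDeny))
}
```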

type Authorizer

type Authorizer interface {
	// Authorize decides whether the request may proceed to the backend.
	Authorize(req Request) error
}

Authorizer is the admission gate. Authorize returns nil to ADMIT the request and a non-nil error to DENY it. It MUST NOT mutate or forward the request.

type Backend

type Backend interface {
	// Infer runs a single request and returns its response or an error.
	Infer(ctx context.Context, req Request) (Response, error)
}

Backend is the downstream inference provider fronted by the proxy. It is a minimal in-package seam; a real engine implements this interface.

type Clock

type Clock func() time.Time

Clock supplies the current time; injectable for deterministic tests.

type Decision

type Decision string

Decision is the recorded outcome of an admission check.

const (
	// DecisionAllow marks a request that passed admission and reached the backend.
	DecisionAllow Decision = "allow"
	// DecisionDeny marks a request rejected by admission; backend never called.
	DecisionDeny Decision = "deny"
)

type MemAudit

type MemAudit struct {
	// contains filtered or unexported fields
}

MemAudit is an in-memory AuditSink that retains every recorded entry in order. It is safe for concurrent use and intended for tests and small embedded deployments where audit history fits in memory.

func NewMemAudit

func NewMemAudit() *MemAudit

NewMemAudit returns an empty in-memory audit sink.

func (*MemAudit) Entries

func (m *MemAudit) Entries() []AuditEntry

Entries returns a snapshot copy of all recorded entries in record order.
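
The snapshot-copy behavior can be pictured with the following sketch. memAudit here is a hypothetical reimplementation with a trimmed AuditEntry, written under the assumption that a mutex plus a copied slice is enough to give the documented semantics; the real MemAudit internals are unexported and may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// AuditEntry is a trimmed local stand-in (the real type has more fields).
type AuditEntry struct {
	Subject  string
	Decision string
}

// memAudit is a hypothetical in-memory sink.
type memAudit struct {
	mu      sync.Mutex
	entries []AuditEntry
}

// Record appends a copy of the entry (entries are passed by value).
func (m *memAudit) Record(e AuditEntry) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.entries = append(m.entries, e)
}

// Entries returns a fresh slice so callers cannot alter the stored history.
func (m *memAudit) Entries() []AuditEntry {
	m.mu.Lock()
	defer m.mu.Unlock()
	out := make([]AuditEntry, len(m.entries))
	copy(out, m.entries)
	return out
}

// Len returns the number of recorded entries.
func (m *memAudit) Len() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.entries)
}

func main() {
	m := &memAudit{}
	m.Record(AuditEntry{Subject: "alice", Decision: "allow"})
	m.Record(AuditEntry{Subject: "bob", Decision: "deny"})

	snap := m.Entries()
	snap[0].Subject = "mallory" // mutates the snapshot only
	snap = append(snap, AuditEntry{Subject: "extra"})

	fmt.Println(m.Len(), m.Entries()[0].Subject, len(snap))
}
```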

func (*MemAudit) Len

func (m *MemAudit) Len() int

Len returns the number of recorded entries.

func (*MemAudit) Record

func (m *MemAudit) Record(entry AuditEntry)

Record implements AuditSink by appending a copy of the entry.

type Proxy

type Proxy struct {
	// contains filtered or unexported fields
}

Proxy is a guarded inference proxy. It authorizes every request BEFORE forwarding, rejects denied requests without ever touching the backend, and records every request to the audit sink.

func NewProxy

func NewProxy(backend Backend, authorizer Authorizer, audit AuditSink, clock Clock) (*Proxy, error)

NewProxy constructs a Proxy. backend and authorizer are required. A nil audit sink is replaced with a no-op sink; a nil clock defaults to time.Now.

func (*Proxy) Infer

func (p *Proxy) Infer(ctx context.Context, req Request) (Response, error)

Infer applies the admission gate and, only on success, forwards to the backend. The ordering is the contract: Authorize runs FIRST. On denial the backend is never invoked, an error wrapping ErrUnauthorized is returned, and a DecisionDeny audit entry is recorded. On admission the request is forwarded; whether the backend succeeds or errors, a DecisionAllow audit entry is recorded and the backend's exact response/error is returned.

type Request

type Request struct {
	// Subject identifies the caller (the "who" in the audit record).
	Subject string
	// Model is the requested model name.
	Model string
	// Prompt is the input text; its byte length is recorded for audit.
	Prompt string
	// Token is the bearer credential the Authorizer validates.
	Token string
	// Scope is the capability the request needs (e.g. "infer").
	Scope string
}

Request is a single inference request flowing through the proxy. It is defined locally so the package does not depend on pkg/inference.

type Response

type Response struct {
	// Model echoes the served model name.
	Model string
	// Text is the generated output.
	Text string
	// Tokens is the number of tokens emitted.
	Tokens int
}

Response is the result returned by a Backend.

type TokenAuthorizer

type TokenAuthorizer struct {
	// contains filtered or unexported fields
}

TokenAuthorizer is a WORKING reference Authorizer. A request is admitted iff its Token is a valid HMAC-SHA256(secret, Subject) tag (hex-encoded) AND the request Scope is in the permitted set. Verification uses crypto/hmac with a constant-time comparison (crypto/subtle), so it resists timing oracles and forged tokens. This is a real keyed-MAC check, not a placeholder.

func NewTokenAuthorizer

func NewTokenAuthorizer(secret []byte, permittedScopes ...string) *TokenAuthorizer

NewTokenAuthorizer builds an authorizer keyed by secret that permits the given scopes. An empty scope set permits any scope (scope check disabled); a non-empty set admits only requests whose Scope is listed.
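
The scope rule can be illustrated with a small set type. scopeSet, newScopeSet, and permits are hypothetical names for this sketch; how TokenAuthorizer stores its scopes internally is not documented.

```go
package main

import "fmt"

// scopeSet is a hypothetical helper mirroring the documented rule:
// an empty set permits any scope, a non-empty set permits only listed scopes.
type scopeSet map[string]struct{}

func newScopeSet(scopes ...string) scopeSet {
	s := make(scopeSet, len(scopes))
	for _, sc := range scopes {
		s[sc] = struct{}{}
	}
	return s
}

func (s scopeSet) permits(scope string) bool {
	if len(s) == 0 {
		return true // scope check disabled
	}
	_, ok := s[scope]
	return ok
}

func main() {
	open := newScopeSet()
	strict := newScopeSet("infer")

	fmt.Println(open.permits("admin"), strict.permits("infer"), strict.permits("admin"))
}
```

Because an empty set disables the check, a caller who wants deny-by-default behavior needs to pass at least one permitted scope.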

func (*TokenAuthorizer) Authorize

func (a *TokenAuthorizer) Authorize(req Request) error

Authorize implements Authorizer. It verifies the token MAC over the subject, then enforces scope. It returns nil only when BOTH checks pass.

func (*TokenAuthorizer) MintToken

func (a *TokenAuthorizer) MintToken(subject string) string

MintToken returns the canonical token for a subject: the hex-encoded HMAC-SHA256(secret, subject). A client presents this as Request.Token. This is the same computation Authorize verifies, so a minted token always admits.
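
The mint and verify computation described above can be sketched with the standard library alone. mint and verify are hypothetical free functions standing in for MintToken and the MAC half of Authorize; comparing the hex strings (rather than decoded bytes) is an assumption of this sketch.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// mint computes hex(HMAC-SHA256(secret, subject)).
func mint(secret []byte, subject string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(subject)) // hash.Hash.Write never returns an error
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the expected tag and compares it to the presented token
// in constant time. Tokens of the wrong length are rejected outright.
func verify(secret []byte, subject, token string) bool {
	want := mint(secret, subject)
	return subtle.ConstantTimeCompare([]byte(want), []byte(token)) == 1
}

func main() {
	secret := []byte("demo-secret-not-for-production")
	token := mint(secret, "alice")

	fmt.Println(len(token))                                // hex of a 32-byte tag
	fmt.Println(verify(secret, "alice", token))            // same subject, same key
	fmt.Println(verify(secret, "bob", token))              // token bound to another subject
	fmt.Println(verify([]byte("other"), "alice", token))   // wrong key
}
```

Since the tag covers only the subject, a token minted for one subject does not verify for another, which is what binds the credential to the caller identity in this scheme.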
