provider

package
v0.0.0-...-d3a3bb4 Latest

This package is not in the latest version of its module.

Published: Apr 25, 2026 License: AGPL-3.0 Imports: 12 Imported by: 0

Documentation

Overview

Package provider defines the Provider struct and the Attester and RequestPreparer interfaces used by all TEE-capable AI backends.

Dependency flow: attestation → e2ee → provider → proxy → cmd

Provider uses attestation types but is not imported by attestation.

Constants

This section is empty.

Variables

This section is empty.

Functions

func FetchAttestationJSON

func FetchAttestationJSON(ctx context.Context, client *http.Client, url, apiKey string, limit int64) ([]byte, error)

FetchAttestationJSON performs a GET to url with a Bearer token, reads up to limit bytes, and returns the response body. Returns an error with the truncated body for non-200 responses.
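
The behavior described above (Bearer-authenticated GET, capped read, error on non-200) can be sketched as follows. This is a minimal illustration, not the package's implementation: fetchJSON and demo are hypothetical names, and the error wording and the 200-byte error snippet cutoff are assumptions.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchJSON is a hypothetical stand-in for FetchAttestationJSON.
func fetchJSON(ctx context.Context, client *http.Client, url, apiKey string, limit int64) ([]byte, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// Cap the read so an oversized response cannot exhaust memory.
	body, err := io.ReadAll(io.LimitReader(resp.Body, limit))
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		// Assumed cutoff: keep at most 200 bytes of the body in the error.
		snippet := string(body)
		if len(snippet) > 200 {
			snippet = snippet[:200] + "..."
		}
		return nil, fmt.Errorf("unexpected status %d: %s", resp.StatusCode, snippet)
	}
	return body, nil
}

// demo fetches from a local test server with a 5-byte read limit.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer secret" {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"quote":"abc"}`)
	}))
	defer srv.Close()

	body, err := fetchJSON(context.Background(), srv.Client(), srv.URL, "secret", 5)
	return string(body), err
}

func main() {
	body, err := demo()
	fmt.Println(body, err) // {"quo <nil>
}
```

Note that a limit smaller than the response silently yields a truncated body, so callers that parse the result as JSON would want a limit comfortably above the expected payload size.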

func NormalizeUncompressedKey

func NormalizeUncompressedKey(key string) string

NormalizeUncompressedKey prepends the "04" uncompressed-point prefix to 128-char hex public keys that omit it.

func ParseChutesFormat

func ParseChutesFormat(ctx context.Context, body []byte, prefix string) (*attestation.RawAttestation, error)

ParseChutesFormat parses the gateway-wrapped chutes attestation format. This format is returned by gateway providers like phalacloud/RedPill and NanoGPT when routing to chutes-based backends.

func Truncate

func Truncate(s string, n int) string

Truncate returns s truncated to n characters with "..." appended if needed.
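
The description leaves open whether n counts bytes or runes and whether the "..." suffix counts toward n. The sketch below picks one reading (bytes, suffix not counted, n non-negative) purely for illustration; truncate is a hypothetical name and the real function may behave differently.

```go
package main

import "fmt"

// truncate is a hypothetical stand-in for Truncate. Assumptions: n counts
// bytes, n is non-negative, and the "..." suffix is not counted toward n.
func truncate(s string, n int) string {
	if len(s) <= n {
		return s
	}
	return s[:n] + "..."
}

func main() {
	fmt.Println(truncate("attestation failed", 11)) // attestation...
	fmt.Println(truncate("ok", 11))                 // ok
}
```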

func UnwrapDoubleEncoded

func UnwrapDoubleEncoded(data []byte) []byte

UnwrapDoubleEncoded handles dstack fields that may be JSON-encoded as either a raw object or a string containing JSON. Returns the inner bytes.
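
A minimal sketch of the unwrapping idea, using only encoding/json: if the input is a JSON string, decode it and return its contents; otherwise return the input unchanged. The name unwrap and the fall-through behavior on malformed input are assumptions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// unwrap is a hypothetical stand-in for UnwrapDoubleEncoded.
func unwrap(data []byte) []byte {
	trimmed := bytes.TrimSpace(data)
	// Only a JSON string literal can be double-encoded.
	if len(trimmed) == 0 || trimmed[0] != '"' {
		return data
	}
	var inner string
	if err := json.Unmarshal(trimmed, &inner); err != nil {
		// Assumption: malformed input is returned untouched.
		return data
	}
	return []byte(inner)
}

func main() {
	raw := []byte(`{"a":1}`)         // already a raw object
	wrapped := []byte(`"{\"a\":1}"`) // a string containing JSON
	fmt.Println(string(unwrap(raw)))     // {"a":1}
	fmt.Println(string(unwrap(wrapped))) // {"a":1}
}
```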

Types

type Attester

type Attester interface {
	FetchAttestation(ctx context.Context, model string, nonce attestation.Nonce) (*attestation.RawAttestation, error)
}

Attester fetches raw attestation data from a TEE provider. Implementations are in the provider-specific sub-packages.

type E2EEMaterial

type E2EEMaterial struct {
	InstanceID string
	E2EPubKey  string // base64-encoded ML-KEM-768 public key
	E2ENonce   string // single-use nonce from /e2e/instances
	ChuteID    string // resolved chute UUID
}

E2EEMaterial holds the minimum information needed to encrypt a single Chutes E2EE request without full re-attestation: instance ID, ML-KEM public key, single-use nonce, and resolved chute UUID.

type E2EEMaterialFetcher

type E2EEMaterialFetcher interface {
	FetchE2EEMaterial(ctx context.Context, model string) (*E2EEMaterial, error)
	MarkFailed(chuteID, instanceID string)
	Invalidate(chuteID string)
}

E2EEMaterialFetcher provides lightweight E2EE key material from a nonce pool without full re-attestation. Used by Chutes to avoid the expensive /chutes/{id}/evidence + TDX verification roundtrip on every request. MarkFailed records that an instance produced an error so the pool can prefer other instances. Invalidate discards all cached material for a chute, forcing a fresh fetch on the next request.

type ModelFilter

type ModelFilter interface {
	Models(ctx context.Context) (map[string]struct{}, error)
}

ModelFilter returns the set of model names that should be included in a filtered model listing. Implementations must be safe for concurrent use.

type ModelLister

type ModelLister interface {
	ListModels(ctx context.Context) ([]json.RawMessage, error)
}

ModelLister fetches the list of available models from a provider. Each entry is a json.RawMessage conforming to the OpenAI model object schema. Implementations may cache results internally.

func NewFilteredModelLister

func NewFilteredModelLister(baseURL, apiKey string, client *http.Client, filter ModelFilter) ModelLister

NewFilteredModelLister returns a ModelLister that fetches the full model catalog from baseURL/v1/models (with apiKey auth) and then filters to only include models present in the filter set.

func NewModelLister

func NewModelLister(baseURL, apiKey string, client *http.Client) ModelLister

NewModelLister returns a ModelLister that fetches from baseURL/v1/models.

type PinnedHandler

type PinnedHandler interface {
	HandlePinned(ctx context.Context, req *PinnedRequest) (*PinnedResponse, error)
}

PinnedHandler handles chat requests on a connection-pinned TLS connection where attestation and inference share the same TCP connection. Used by providers like NEAR AI where the TLS cert is verified via attestation rather than a traditional CA chain.

type PinnedRequest

type PinnedRequest struct {
	Method  string
	Path    string      // e.g. "/v1/chat/completions"
	Headers http.Header // forwarded headers (Authorization, Content-Type, etc.)
	Body    []byte      // raw request body
	Model   string      // upstream model name (for endpoint resolution)
	E2EE    bool        // encrypt message contents for the model backend

	// SigningKey is the model's attested public key, provided by the caller
	// from its signing key cache. Used on SPKI cache hits when E2EE is
	// active and no fresh attestation provides a signing key.
	SigningKey string
}

PinnedRequest is the input to a pinned chat handler.

type PinnedResponse

type PinnedResponse struct {
	StatusCode int
	Header     http.Header
	Body       io.ReadCloser

	// Report is the verification report from attestation, if attestation was
	// performed on this connection. Nil on SPKI cache hits.
	Report *attestation.VerificationReport

	// SigningKey is the attested model key returned on cache misses. It allows
	// callers to refresh signing-key caches without a second attestation fetch.
	SigningKey string

	// Session is the E2EE session established during the pinned request.
	// Non-nil when E2EE was active; callers use it for response decryption.
	Session e2ee.Decryptor
}

PinnedResponse is a raw HTTP response from a pinned connection.

type Provider

type Provider struct {
	// Name is the canonical provider identifier (e.g. "venice", "neardirect").
	Name string

	// BaseURL is the upstream API root (e.g. "https://api.venice.ai").
	BaseURL string

	// APIKey is the resolved API key. Never log this directly; use
	// config.RedactKey.
	APIKey string

	// ChatPath is the API path for chat completions (e.g. "/api/v1/chat/completions").
	ChatPath string

	// EmbeddingsPath is the upstream API path for embeddings (e.g. "/v1/embeddings").
	// Empty means the provider does not support embeddings via this proxy.
	EmbeddingsPath string

	// AudioPath is the upstream API path for audio transcriptions
	// (e.g. "/v1/audio/transcriptions"). Empty means unsupported.
	AudioPath string

	// ImagesPath is the upstream API path for image generations
	// (e.g. "/v1/images/generations"). Empty means unsupported.
	ImagesPath string

	// RerankPath is the upstream API path for reranking
	// (e.g. "/v1/rerank"). Empty means unsupported.
	RerankPath string

	// E2EE indicates whether this provider supports end-to-end encryption.
	E2EE bool

	// Encryptor encrypts outgoing chat request bodies for the provider's
	// E2EE protocol. Non-nil when E2EE is true.
	Encryptor RequestEncryptor

	// SkipSigningKeyCache indicates the provider needs fresh attestation for
	// each E2EE request (e.g. Chutes requires per-request instance/nonce data).
	SkipSigningKeyCache bool

	// E2EEMaterialFetcher provides lightweight E2EE material from a nonce
	// pool for providers that separate attestation from E2EE key exchange
	// (Chutes). When set, buildUpstreamBody uses this instead of full
	// re-attestation for cache-hit E2EE requests.
	E2EEMaterialFetcher E2EEMaterialFetcher

	// Attester fetches raw attestation from the provider's attestation endpoint.
	// May be nil if the provider does not support attestation.
	Attester Attester

	// Preparer injects provider-specific headers into outgoing requests.
	// May be nil if no special headers are needed.
	Preparer RequestPreparer

	// ReportDataVerifier validates REPORTDATA binding for this provider.
	// May be nil if the provider does not support REPORTDATA verification.
	ReportDataVerifier ReportDataVerifier

	// PinnedHandler handles chat requests on a connection-pinned TLS
	// connection. Set for providers that require same-connection attestation
	// (e.g. NEAR AI). When non-nil, the proxy uses this instead of the
	// standard http.Client path.
	PinnedHandler PinnedHandler

	// SPKIDomainForModel returns the domain string used as the SPKI cache
	// key for a given model. Required for E2EE providers with a PinnedHandler
	// so the proxy can evict the correct SPKI entries when the attestation
	// report cache expires. Returns ("", false) if the domain cannot be
	// determined (the proxy must fail closed in that case).
	SPKIDomainForModel func(ctx context.Context, model string) (string, bool)

	// SupplyChainPolicy defines the allowed container image repos for this
	// provider. May be nil if the provider has no policy.
	SupplyChainPolicy *attestation.SupplyChainPolicy

	// MeasurementPolicy is the merged TDX measurement allowlist for this
	// provider's model backend CVM (Go defaults + global TOML + per-provider TOML).
	MeasurementPolicy attestation.MeasurementPolicy

	// GatewayMeasurementPolicy is the merged TDX measurement allowlist for
	// this provider's gateway CVM. Zero value for non-gateway providers.
	GatewayMeasurementPolicy attestation.MeasurementPolicy

	// ModelLister fetches available models from the provider's discovery API.
	// May be nil if the provider does not support model listing.
	ModelLister ModelLister
}

Provider is a fully wired TEE-capable AI backend. It combines the data from config.Provider with the behavioral interfaces Attester and RequestPreparer (the Attester and Preparer fields).

The zero value is not useful; construct with New or fill fields directly.

type ReportDataVerifier

type ReportDataVerifier interface {
	VerifyReportData(reportData [64]byte, raw *attestation.RawAttestation, nonce attestation.Nonce) (detail string, err error)
}

ReportDataVerifier validates that TDX REPORTDATA binds the expected identity. Each provider implements its own binding scheme (e.g. Venice uses keccak256-derived address, NEAR uses sha256(signing_address + tls_fingerprint)).
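
The general shape of a REPORTDATA binding check is sketched below. The layout is purely hypothetical and is not claimed to match any provider listed here: the first 32 bytes are assumed to hold sha256(identity) and the last 32 bytes a caller-supplied nonce. The helper names and the simplified signature (no RawAttestation argument) are also assumptions.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"errors"
	"fmt"
)

// buildReportData produces REPORTDATA in the HYPOTHETICAL layout
// sha256(identity) || nonce.
func buildReportData(identity []byte, nonce [32]byte) [64]byte {
	var rd [64]byte
	h := sha256.Sum256(identity)
	copy(rd[:32], h[:])
	copy(rd[32:], nonce[:])
	return rd
}

// verifyBinding checks both halves of the hypothetical layout using
// constant-time comparisons.
func verifyBinding(reportData [64]byte, identity []byte, nonce [32]byte) (string, error) {
	want := sha256.Sum256(identity)
	if subtle.ConstantTimeCompare(reportData[:32], want[:]) != 1 {
		return "", errors.New("REPORTDATA does not bind the expected identity")
	}
	if subtle.ConstantTimeCompare(reportData[32:], nonce[:]) != 1 {
		return "", errors.New("REPORTDATA does not contain the expected nonce")
	}
	return fmt.Sprintf("identity hash prefix %x matched", want[:4]), nil
}

func main() {
	identity := []byte("example-signing-address")
	var nonce [32]byte
	nonce[0] = 7

	rd := buildReportData(identity, nonce)
	_, err := verifyBinding(rd, identity, nonce)
	fmt.Println(err) // <nil>

	rd[0] ^= 0xff // tamper with the identity hash
	_, err = verifyBinding(rd, identity, nonce)
	fmt.Println(err != nil) // true
}
```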

type RequestEncryptor

type RequestEncryptor interface {
	EncryptRequest(body []byte, raw *attestation.RawAttestation, endpointPath string) ([]byte, e2ee.Decryptor, *e2ee.ChutesE2EE, error)
}

RequestEncryptor encrypts an outgoing request body for a provider's E2EE protocol. The endpointPath (e.g. "/v1/chat/completions", "/v1/images/generations") tells the encryptor which fields to encrypt. Returns the encrypted body, a Decryptor for response decryption, optional Chutes metadata, and any error.

For Chutes, Decryptor is nil; crypto state is carried in *e2ee.ChutesE2EE instead (the Chutes protocol uses a different relay path). For Venice and NearCloud, *e2ee.ChutesE2EE is nil.

type RequestPreparer

type RequestPreparer interface {
	PrepareRequest(req *http.Request, e2eeHeaders http.Header, meta *e2ee.ChutesE2EE, stream bool, path string) error
}

RequestPreparer injects provider-specific headers into an outgoing upstream request. e2eeHeaders contains pre-built E2EE protocol headers (may be nil for plaintext or Chutes paths). meta is non-nil for Chutes requests. path is the endpoint path for this request (e.g. "/v1/embeddings"); used by Chutes to set X-E2E-Path dynamically per endpoint type.

Directories

Path Synopsis
Package chutes implements the Attester and RequestPreparer interfaces for the Chutes direct TEE attestation API.
Package nanogpt implements the Attester interface for NanoGPT's TEE attestation API.
Package nearcloud implements the Attester and PinnedHandler for the NEAR AI cloud gateway (cloud-api.near.ai).
Package neardirect implements the Attester and RequestPreparer interfaces for NEAR AI's direct TEE attestation API.
Package phalacloud implements the Attester and RequestPreparer interfaces for Phala Cloud's TEE attestation API (RedPill gateway).
Package venice implements the Attester and RequestPreparer interfaces for Venice AI's TEE attestation and E2EE API.