LLM Proxy

A transparent, secure proxy for OpenAI's API with token management, rate limiting, logging, and an admin UI.

Features

  • OpenAI API Compatibility
  • Withering Tokens: Expiration, revocation, and rate-limiting
  • Project-based Access Control with lifecycle management
    • Soft Deactivation: Projects and tokens use activation flags instead of destructive deletes
    • Individual Token Operations: GET, PATCH, DELETE with comprehensive audit trails
    • Bulk Token Management: Revoke all tokens for a project
    • Project Activation Controls: Deactivate projects to block token generation and API access
    • Admin UI Actions: Edit/revoke tokens, activate/deactivate projects, bulk operations
  • HTTP Response Caching: Redis-backed cache with configurable TTL, auth-aware shared caching, and streaming response support. Enable with HTTP_CACHE_ENABLED=true.
  • Admin UI: Web interface for management
  • Comprehensive Logging & Audit Events: Full lifecycle operation tracking for compliance
  • Async Instrumentation Middleware: Non-blocking, streaming-capable instrumentation for all API calls. See docs/instrumentation.md for advanced usage and extension.
  • Async Event Bus & Dispatcher: All API instrumentation events are handled via an always-on, fully asynchronous event bus (in-memory or Redis) with support for multiple subscribers, batching, retry logic, and graceful shutdown. Persistent event logging is handled by a dispatcher CLI or the --file-event-log flag.
  • OpenAI Token Counting: Accurate prompt and completion token counting using tiktoken-go.
  • Metrics Endpoint (provider-agnostic): Optional JSON metrics endpoint; Prometheus scraping/export is supported but not required by core features
  • SQLite Storage
  • Docker Deployment

Quick Start

docker pull ghcr.io/sofatutor/llm-proxy:latest
mkdir -p ./llm-proxy/data
docker run -d \
  --name llm-proxy \
  -p 8080:8080 \
  -v $(pwd)/llm-proxy/data:/app/data \
  -e MANAGEMENT_TOKEN=your-secure-management-token \
  ghcr.io/sofatutor/llm-proxy:latest
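
To verify the container is running, you can query the health endpoint used by the Docker healthcheck (see Containerization Notes below):

curl http://localhost:8080/health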

With Redis Caching
# Start Redis
docker run -d --name redis -p 6379:6379 redis:alpine

# Start proxy with caching enabled
docker run -d \
  --name llm-proxy \
  -p 8080:8080 \
  -v $(pwd)/llm-proxy/data:/app/data \
  -e MANAGEMENT_TOKEN=your-secure-management-token \
  -e HTTP_CACHE_ENABLED=true \
  -e HTTP_CACHE_BACKEND=redis \
  -e REDIS_CACHE_URL=redis://redis:6379/0 \
  --link redis \
  ghcr.io/sofatutor/llm-proxy:latest

From Source
git clone https://github.com/sofatutor/llm-proxy.git
cd llm-proxy
make build
MANAGEMENT_TOKEN=your-secure-management-token ./bin/llm-proxy

Configuration (Essentials)

  • MANAGEMENT_TOKEN (required): Admin API access
  • LISTEN_ADDR: Default :8080
  • DATABASE_PATH: Default ./data/llm-proxy.db
  • LOG_LEVEL: Default info
  • LOG_FILE: Path to log file (stdout if empty)
  • LOG_MAX_SIZE_MB: Rotate log after this size in MB (default 10)
  • LOG_MAX_BACKUPS: Number of rotated log files to keep (default 5)
  • AUDIT_ENABLED: Enable audit logging (default true)
  • AUDIT_LOG_FILE: Audit log file path (default ./data/audit.log)
  • AUDIT_STORE_IN_DB: Store audit events in database (default true)
  • AUDIT_CREATE_DIR: Create audit log directories (default true)
  • OBSERVABILITY_ENABLED: Deprecated; the async event bus is now always enabled
  • OBSERVABILITY_BUFFER_SIZE: Event buffer size for instrumentation events (default 1000)
  • FILE_EVENT_LOG: Path to persistent event log file (enables file event logging via dispatcher)
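
For example, a minimal environment for local development might look like this (values are illustrative placeholders):

MANAGEMENT_TOKEN=your-secure-management-token
LISTEN_ADDR=:8080
DATABASE_PATH=./data/llm-proxy.db
LOG_LEVEL=debug
AUDIT_LOG_FILE=./data/audit.log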

Caching Configuration
  • HTTP_CACHE_ENABLED: Enable HTTP response caching (default true)
  • HTTP_CACHE_BACKEND: Cache backend (redis or in-memory, default in-memory)
  • REDIS_CACHE_URL: Redis connection URL (default redis://localhost:6379/0 when backend=redis)
  • REDIS_CACHE_KEY_PREFIX: Cache key prefix (default llmproxy:cache:)
  • HTTP_CACHE_MAX_OBJECT_BYTES: Maximum cached object size in bytes (default 1048576)
  • HTTP_CACHE_DEFAULT_TTL: Default TTL in seconds when upstream doesn't specify (default 300)
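
For example, enabling the Redis-backed cache (matching the Quick Start Redis container; values are illustrative):

HTTP_CACHE_ENABLED=true
HTTP_CACHE_BACKEND=redis
REDIS_CACHE_URL=redis://localhost:6379/0
HTTP_CACHE_DEFAULT_TTL=300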

See docs/api-configuration.md and docs/instrumentation.md for all options and advanced usage.

Advanced Example

The per-API configuration below whitelists allowed model parameter values, restricts CORS origins, and requires specific request headers for the openai API:
apis:
  openai:
    param_whitelist:
      model:
        - gpt-4o
        - gpt-4.1-*
    allowed_origins:
      - https://www.sofatutor.com
      - http://localhost:4000
    required_headers:
      - origin

See docs/issues/phase-7-param-cors-whitelist.md for advanced configuration and rationale.

Main API Endpoints

Management API
  • /manage/projects — Project lifecycle management
    • GET /manage/projects — List all projects
    • POST /manage/projects — Create a new project (defaults to active)
  • /manage/projects/{projectId}
    • GET — Get project details
    • PATCH — Update a project (supports the is_active field)
    • DELETE — Returns 405 Method Not Allowed (no destructive deletes)
  • /manage/projects/{projectId}/tokens/revoke — Bulk token operations
    • POST — Revoke all tokens for the project
  • /manage/tokens — Token lifecycle management
    • GET /manage/tokens — List all tokens (filter by project and active status)
    • POST /manage/tokens — Generate a new token (blocked if the project is inactive)
  • /manage/tokens/{tokenId}
    • GET — Get token details
    • PATCH — Update a token (activate/deactivate)
    • DELETE — Revoke a token (soft deactivation)

All management endpoints require:

Authorization: Bearer <MANAGEMENT_TOKEN>

Example (curl):
# Create active project
curl -X POST http://localhost:8080/manage/projects \
  -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Project", "openai_api_key": "sk-..."}'

# Update project activation status
curl -X PATCH http://localhost:8080/manage/projects/<project-id> \
  -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"is_active": false}'

# Bulk revoke project tokens
curl -X POST http://localhost:8080/manage/projects/<project-id>/tokens/revoke \
  -H "Authorization: Bearer $MANAGEMENT_TOKEN"

# Revoke individual token
curl -X DELETE http://localhost:8080/manage/tokens/<token-id> \
  -H "Authorization: Bearer $MANAGEMENT_TOKEN"

Proxy
  • POST /v1/* — Forwarded to OpenAI; requires a withering token

Example:

curl -H "Authorization: Bearer <withering-token>" \
     -H "Content-Type: application/json" \
     -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}]}' \
     http://localhost:8080/v1/chat/completions
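
Streaming responses are supported as well; for example, with the standard OpenAI stream flag (curl -N disables output buffering):

curl -N -H "Authorization: Bearer <withering-token>" \
     -H "Content-Type: application/json" \
     -d '{"model":"gpt-4","messages":[{"role":"user","content":"Hello"}],"stream":true}' \
     http://localhost:8080/v1/chat/completions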

Note: The proxy API is not documented with Swagger/OpenAPI except for authentication and allowed paths/methods. For backend schemas, refer to the provider's documentation.

Admin UI
  • /admin/ — Web interface with lifecycle management
    • Project activation/deactivation controls
    • Token revocation and editing
    • Bulk token management by project
    • Audit event viewing (when enabled)

CLI Management Tool

The CLI provides full management of projects and tokens via the llm-proxy manage command with lifecycle operations. All subcommands support the --manage-api-base-url flag (default: http://localhost:8080) and require a management token (via --management-token or MANAGEMENT_TOKEN env).

Project Management
# List projects with activation status
llm-proxy manage project list --manage-api-base-url http://localhost:8080 --management-token <token>

# Get project details
llm-proxy manage project get <project-id> --manage-api-base-url http://localhost:8080 --management-token <token>

# Create project (defaults to active)
llm-proxy manage project create --name "My Project" --openai-key sk-... --manage-api-base-url http://localhost:8080 --management-token <token>

# Update project (supports activation changes)
# Note: --is-active flag not yet available in CLI; use direct API calls for activation control
curl -X PATCH http://localhost:8080/manage/projects/<project-id> \
  -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"is_active": false}'

# CLI currently supports name and API key updates
llm-proxy manage project update <project-id> --name "New Name" --manage-api-base-url http://localhost:8080 --management-token <token>

# Project deletion not supported (405) - use deactivation instead
# llm-proxy manage project delete <project-id>  # This will fail with 405
Token Management
# Generate token (blocked by API validation if the project is inactive)
llm-proxy manage token generate --project-id <project-id> --duration 24 --manage-api-base-url http://localhost:8080 --management-token <token>

# Note: Token listing, details, and revocation not yet available in CLI
# Use direct API calls for these operations:

# List tokens with filtering
curl -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  "http://localhost:8080/manage/tokens?project_id=<project-id>&active_only=true"

# Get token details
curl -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  "http://localhost:8080/manage/tokens/<token-id>"

# Revoke individual token
curl -X DELETE -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  "http://localhost:8080/manage/tokens/<token-id>"

# Bulk revoke project tokens  
curl -X POST -H "Authorization: Bearer $MANAGEMENT_TOKEN" \
  "http://localhost:8080/manage/projects/<project-id>/tokens/revoke"

Flags
  • --manage-api-base-url — Set the management API base URL (default: http://localhost:8080)
  • --management-token — Provide the management token (or set MANAGEMENT_TOKEN env)
  • --json — Output results as JSON (optional)
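
For example, combining --json with jq for scripting (assumes jq is installed and that project listing returns an array of objects with a name field):

llm-proxy manage project list --management-token $MANAGEMENT_TOKEN --json | jq '.[].name'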

Event Dispatcher CLI

The LLM Proxy includes a pluggable dispatcher system for sending observability events to external services. The dispatcher supports multiple backends and can run as a separate service.

Supported Backends
  • file: Write events to JSONL file
  • lunary: Send events to Lunary.ai platform
  • helicone: Send events to Helicone platform

Basic Usage
# File output  
llm-proxy dispatcher --service file --endpoint events.jsonl

# Lunary integration
export LLM_PROXY_API_KEY="your-lunary-api-key"
llm-proxy dispatcher --service lunary

# Helicone integration
llm-proxy dispatcher --service helicone --api-key your-helicone-key

# Custom batch size and buffer
llm-proxy dispatcher --service lunary --api-key $API_KEY --batch-size 50 --buffer 2000

Deployment Options

The dispatcher can be deployed in multiple ways:

  • Standalone Process: Run as a separate service for production
  • Sidecar Container: Deploy alongside the main proxy in Kubernetes
  • Background Mode: Use --detach flag for daemon-like operation
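
For example, a detached file dispatcher combining the flags shown above (a sketch; exact daemon behavior is described in docs/instrumentation.md):

llm-proxy dispatcher --service file --endpoint events.jsonl --detach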

See docs/instrumentation.md for detailed configuration and architecture.

Warning: Event loss can occur if the Redis event log is configured with TTL/max length values that are too low for your dispatcher lag and throughput. In production, increase Redis TTL and list length to cover worst-case backlogs and keep the dispatcher running with sufficient batch size/throughput. For strict guarantees, use a durable queue (e.g., Redis Streams with consumer groups or Kafka). See the Production Reliability section in docs/instrumentation.md.

Using Redis for Distributed Event Bus (Local Development)

Note: The in-memory event bus only works within a single process. For multi-process setups (e.g., running the proxy and dispatcher as separate processes or containers), you must use Redis as the event bus backend.

Local Setup with Docker Compose

A redis service is included in the docker-compose.yml for local development:

redis:
  image: redis:7
  container_name: llm-proxy-redis
  ports:
    - "6379:6379"
  restart: unless-stopped
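
With the service defined as above, Redis can be started on its own (assuming Docker Compose v2 and the service name redis):

docker compose up -d redis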

Configuring the Proxy and Dispatcher to Use Redis

Set the event bus backend to Redis by using the appropriate environment variable or CLI flag (see documentation for exact flag):

LLM_PROXY_EVENT_BUS=redis llm-proxy ...
LLM_PROXY_EVENT_BUS=redis llm-proxy dispatcher ...

This ensures both the proxy and dispatcher share events via Redis, enabling full async pipeline testing and production-like operation.

Project Structure

  • /cmd — Entrypoints (proxy, eventdispatcher)
  • /internal — Core logic (token, database, proxy, admin, logging, eventbus, dispatcher)
  • /api — OpenAPI specs
  • /web — Admin UI static assets
  • /docs — Full documentation

Security & Production Notes

  • Tokens support expiration, revocation, and rate limits
  • Management API protected by MANAGEMENT_TOKEN
  • Admin UI uses basic auth (ADMIN_USER, ADMIN_PASSWORD)
  • Logs stored locally and/or sent to external backends
  • Use HTTPS in production (via reverse proxy)
  • See docs/security.md and docs/production.md for best practices

Containerization Notes
  • Multi-stage Dockerfile builds a static binary and ships a minimal Alpine runtime
  • Runs as non-root user appuser with read-only filesystem by default
  • Healthcheck hits /health; see docker-compose.yml or Dockerfile HEALTHCHECK
  • Volumes: /app/data, /app/logs, /app/config, /app/certs
  • Example local build/test:
make docker-build
make docker-run
make docker-smoke

Publishing

Images are built and published to GitHub Container Registry on pushes to main and on tags matching v*.

Registry: ghcr.io/sofatutor/llm-proxy

Workflow: .github/workflows/docker.yml builds for linux/amd64 and linux/arm64 and pushes images with the appropriate tags and labels.

Documentation

This README provides a quick overview and getting started guide. For comprehensive documentation, including the complete documentation index and developer-focused guides, see the /docs directory.

License

MIT License
