QueuePriority
Priority queues for LLM requests.
QueuePriority extends BatchQueue with priority levels. Enterprise customers jump ahead of free tier. Reserved capacity and SLA tracking.
Quickstart
export OPENAI_API_KEY=sk-...
npx @stockyard/queuepriority
# Your app: http://localhost:6590/v1/chat/completions
# Dashboard: http://localhost:6590/ui
What You Get
- Priority levels per key/tenant
- Reserved capacity
- SLA tracking
- Queue depth per priority
- Auto-promote on timeout
- Dashboard with queue analytics
Config
# queuepriority.yaml
port: 6590
queuepriority:
levels:
critical: { weight: 10, reserved: 5 }
high: { weight: 5 }
normal: { weight: 1 }
low: { weight: 0 }
Docker
docker run -p 6590:6590 -e OPENAI_API_KEY=sk-... stockyard/queuepriority
Part of Stockyard
QueuePriority is part of Stockyard — an open-source LLM proxy and control plane. MIT licensed.