Recipe 04: OpenAI-Compatible Server
Serve a GGUF model behind an OpenAI-compatible HTTP API. Clients that speak the OpenAI API (curl, the Python openai library, LangChain, etc.) can connect directly: just point them at http://localhost:8080.
Endpoints:
- POST /v1/chat/completions (chat)
- POST /v1/completions (text completion)
- POST /v1/embeddings (embeddings)
- GET /v1/models (model listing)
- GET /health (health check)
Usage:
go run ./docs/cookbook/04-openai-server/ --model path/to/model.gguf
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"model":"default","messages":[{"role":"user","content":"Hello"}]}'