Documentation
Constants ¶
const DefaultBaseURL = "http://localhost:11434"
DefaultBaseURL is the address Ollama listens on by default. Users running it on a non-default host or port set GETDEBUG_OLLAMA_URL.
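A minimal sketch of how the override described above could be resolved. The helper name `resolveBaseURL` and the trailing-slash trimming are assumptions for illustration, not part of this package's documented API; only the constant and the variable name GETDEBUG_OLLAMA_URL come from the text above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// DefaultBaseURL mirrors the documented constant.
const DefaultBaseURL = "http://localhost:11434"

// resolveBaseURL is a hypothetical helper. It returns the value of
// GETDEBUG_OLLAMA_URL when it is set and non-empty, otherwise DefaultBaseURL.
// The lookup function is injected so the logic can be exercised without
// touching the real environment. Trailing slashes are trimmed (an assumption)
// so a path such as "/api/embed" can be appended directly.
func resolveBaseURL(getenv func(string) string) string {
	if v := strings.TrimSpace(getenv("GETDEBUG_OLLAMA_URL")); v != "" {
		return strings.TrimRight(v, "/")
	}
	return DefaultBaseURL
}

func main() {
	fmt.Println(resolveBaseURL(os.Getenv))
}
```

Passing `os.Getenv` as a parameter keeps the fallback rule a pure function, which is a design choice of this sketch rather than something the package is known to do.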
const DefaultDim = 768
DefaultDim matches nomic-embed-text v1.5.
const DefaultModel = "nomic-embed-text"
DefaultModel is small and fast, with decent recall on prose and code, and takes about 270MB on disk. The user installs it once: `ollama pull nomic-embed-text`.
const MaxInputBytes = 24 * 1024
MaxInputBytes caps each input before sending. nomic-embed-text accepts up to 8192 tokens; at ~4 chars/token that is roughly 32KB. We stay at 24KB, 25% below that estimate, for two reasons: embedding a truncated chunk still gives a useful vector (filename + first ~600 lines is plenty for retrieval), and a clear cap is better than guessing the model's exact tokeniser behaviour.
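The cap can be illustrated with a short sketch. The helper `truncateInput` is hypothetical, and backing up to a UTF-8 rune boundary is an assumption about what a careful truncation would do; the documentation above only states that inputs are capped at MaxInputBytes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// MaxInputBytes mirrors the documented constant: 24KB, which is 25% below
// the ~32KB estimate (8192 tokens * ~4 chars/token).
const MaxInputBytes = 24 * 1024

// truncateInput is a hypothetical helper. It returns s unchanged when it
// fits within max bytes; otherwise it returns the longest prefix of at most
// max bytes that does not end in the middle of a multi-byte UTF-8 rune.
func truncateInput(s string, max int) string {
	if max <= 0 {
		return ""
	}
	if len(s) <= max {
		return s
	}
	cut := max
	// Step back while the byte at the cut point is a continuation byte,
	// so the prefix ends on a rune boundary.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	big := strings.Repeat("é", 20000) // 2 bytes per rune, 40000 bytes total
	out := truncateInput(big, MaxInputBytes)
	fmt.Println(len(out) <= MaxInputBytes, utf8.ValidString(out))
}
```

Cutting on a rune boundary keeps the prefix valid UTF-8, which avoids sending a broken final character in the JSON request body.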
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Client ¶
Client is the minimal HTTP client we need. It is stateless apart from the configured base URL and http.Client, so it is safe to reuse.
func New ¶
New returns a client with sane defaults. The per-request timeout is 5 minutes because nomic-embed-text on CPU can take 60-120s to chew through a batch of 16 large chunks. The outer context (set by the command) is the real wall-clock budget; the timeout is only the per-HTTP-call guard.
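The relationship between the two limits can be sketched as follows. The constructor `newHTTPClient` and the 30-minute outer budget are assumptions for illustration; the signature of New is not shown here, and the real outer deadline is whatever the calling command sets. The request is built but not sent.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// newHTTPClient is a hypothetical stand-in for the http.Client that New
// configures: a 5-minute timeout that bounds each individual HTTP call.
func newHTTPClient() *http.Client {
	return &http.Client{Timeout: 5 * time.Minute}
}

func main() {
	client := newHTTPClient()

	// The outer context is the overall wall-clock budget for the command.
	// 30 minutes is an arbitrary value chosen for this sketch.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
	defer cancel()

	// Each request carries the outer context, so it is cancelled by
	// whichever limit expires first: the context or the client timeout.
	req, err := http.NewRequestWithContext(ctx, http.MethodPost,
		"http://localhost:11434/api/embed", nil)
	if err != nil {
		fmt.Println("build request:", err)
		return
	}

	deadline, _ := ctx.Deadline()
	fmt.Println("per-call timeout:", client.Timeout)
	fmt.Println("request:", req.Method, req.URL.Path)
	fmt.Println("outer budget still open:", time.Until(deadline) > 0)
}
```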
func (*Client) EmbedBatch ¶
func (c *Client) EmbedBatch(ctx context.Context, model string, inputs []string) ([][]float32, error)
EmbedBatch sends a batch of texts in one /api/embed call. Ollama returns vectors in the same order as the inputs; we return them as []float32 so downstream math and storage (vector(N) BLOB) don't pay a float64 tax. Inputs are truncated at MaxInputBytes: over-long content is a real possibility in generated or minified files, and we'd rather index a prefix than drop the chunk.
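The request and response shape can be sketched against a fake server. Everything below is illustrative: `embedBatch`, `fakeOllama`, and `demo` are hypothetical names, the fake server returns made-up 2-element vectors, and the length check is an assumption about sensible validation. The JSON field names (`model`, `input`, `embeddings`) follow Ollama's /api/embed endpoint. This is not the package's actual implementation.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type embedRequest struct {
	Model string   `json:"model"`
	Input []string `json:"input"`
}

type embedResponse struct {
	Embeddings [][]float32 `json:"embeddings"`
}

// embedBatch posts all inputs in one /api/embed call and decodes the
// vectors directly into float32. Hypothetical sketch, not the real method.
func embedBatch(ctx context.Context, hc *http.Client, baseURL, model string, inputs []string) ([][]float32, error) {
	body, err := json.Marshal(embedRequest{Model: model, Input: inputs})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, baseURL+"/api/embed", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := hc.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("embed: unexpected status %s", resp.Status)
	}
	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	// Vectors are matched to inputs by position, so the counts must agree.
	if len(out.Embeddings) != len(inputs) {
		return nil, fmt.Errorf("embed: got %d vectors for %d inputs", len(out.Embeddings), len(inputs))
	}
	return out.Embeddings, nil
}

// fakeOllama stands in for a real server. For each input it returns the
// made-up vector [index, byte length], preserving input order.
func fakeOllama() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var in embedRequest
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		out := embedResponse{Embeddings: make([][]float32, len(in.Input))}
		for i, s := range in.Input {
			out.Embeddings[i] = []float32{float32(i), float32(len(s))}
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(out)
	}))
}

// demo runs one batch of two inputs against the fake server.
func demo() ([][]float32, error) {
	srv := fakeOllama()
	defer srv.Close()
	inputs := []string{"package main", "func main() {}"}
	return embedBatch(context.Background(), srv.Client(), srv.URL, "nomic-embed-text", inputs)
}

func main() {
	vecs, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, v := range vecs {
		fmt.Println(i, v)
	}
}
```

Decoding straight into `[][]float32` avoids an intermediate float64 slice, which matches the storage motivation given above.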