localembed

package

Version: v0.3.0 (latest) | Published: Jun 4, 2026 | License: MIT | Imports: 9 | Imported by: 0

Repository

github.com/getdebug-ai/cli

Documentation ¶

Index ¶

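Constants
type Client
- func New(baseURL string) *Client
- func (c *Client) EmbedBatch(ctx context.Context, model string, inputs []string) ([][]float32, error)
- func (c *Client) EmbedOne(ctx context.Context, model, input string) ([]float32, error)
- func (c *Client) Ping(ctx context.Context, model string) error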

Constants ¶

const DefaultBaseURL = "http://localhost:11434"

DefaultBaseURL is where Ollama listens by default. Users running Ollama on a non-default host or port set GETDEBUG_OLLAMA_URL to override it.

const DefaultDim = 768

DefaultDim is the embedding dimension of nomic-embed-text v1.5.

const DefaultModel = "nomic-embed-text"

DefaultModel is small and fast, with decent recall on prose and code. It takes ~270MB on disk. The user installs it once with `ollama pull nomic-embed-text`.

const MaxInputBytes = 24 * 1024

MaxInputBytes caps each input before sending. nomic-embed-text accepts up to 8192 tokens; at ~4 chars/token that is roughly 32KB. We stay at 24KB for a 25% safety margin. Embedding a truncated chunk still gives a useful vector (filename + first ~600 lines is plenty for retrieval), and a clear cap is better than guessing the model's exact tokeniser behaviour.

Types ¶

type Client ¶

type Client struct {
	BaseURL string
	HTTP    *http.Client
}

Client is the minimal HTTP client we need. It is stateless aside from the configured base URL and http.Client, so it is safe to reuse.

func New ¶

func New(baseURL string) *Client

New returns a client with sane defaults. The per-request timeout is 5 minutes because nomic-embed-text on CPU can take 60-120s to chew through a batch of 16 large chunks. The outer context (set by the command) is the real wall-clock budget; this timeout is the per-HTTP-call guard.

func (*Client) EmbedBatch ¶

func (c *Client) EmbedBatch(ctx context.Context, model string, inputs []string) ([][]float32, error)

EmbedBatch sends a batch of texts in one /api/embed call. Ollama returns vectors in the same order as the inputs; each vector is returned as []float32 so downstream math and storage (vector(N) BLOB) don't pay a float64 tax. Inputs are truncated at MaxInputBytes: over-long content is a real possibility in generated or minified files, and we'd rather index a prefix than drop the chunk.

func (*Client) EmbedOne ¶

func (c *Client) EmbedOne(ctx context.Context, model, input string) ([]float32, error)

EmbedOne is a convenience wrapper for single-input queries (the search path). It reuses EmbedBatch under the hood, since Ollama treats single and batch inputs the same way.

func (*Client) Ping ¶

func (c *Client) Ping(ctx context.Context, model string) error

Ping checks that the Ollama server is reachable and the requested model is available. It returns a human-readable error that tells the user how to fix the problem, because this failure mode is the v0.2 onboarding flow for new users.
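
One way to implement such a check is to list installed models through Ollama's /api/tags endpoint and look for the requested one. The source does not say which endpoint Ping uses, so treat the sketch below as an assumption throughout: `ping` and `modelAvailable` are hypothetical names, and the error wording and tag-matching rule are illustrative.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// tagsResponse holds the part of Ollama's /api/tags reply used here.
type tagsResponse struct {
	Models []struct {
		Name string `json:"name"`
	} `json:"models"`
}

// modelAvailable reports whether want is among the installed model names.
// A name without a tag ("nomic-embed-text") matches any tag of that model
// ("nomic-embed-text:latest"); a tagged name must match exactly.
func modelAvailable(installed []string, want string) bool {
	for _, name := range installed {
		if name == want {
			return true
		}
		if !strings.Contains(want, ":") && strings.HasPrefix(name, want+":") {
			return true
		}
	}
	return false
}

// ping is a hypothetical stand-in for Client.Ping. Each failure is turned
// into a message that names the command likely to fix it.
func ping(ctx context.Context, hc *http.Client, baseURL, model string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/api/tags", nil)
	if err != nil {
		return err
	}
	resp, err := hc.Do(req)
	if err != nil {
		return fmt.Errorf("cannot reach Ollama at %s (is `ollama serve` running?): %w", baseURL, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("Ollama at %s answered %s", baseURL, resp.Status)
	}

	var tags tagsResponse
	if err := json.NewDecoder(resp.Body).Decode(&tags); err != nil {
		return fmt.Errorf("unexpected reply from %s/api/tags: %w", baseURL, err)
	}
	names := make([]string, 0, len(tags.Models))
	for _, m := range tags.Models {
		names = append(names, m.Name)
	}
	if !modelAvailable(names, model) {
		return fmt.Errorf("model %q is not installed; run: ollama pull %s", model, model)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	if err := ping(ctx, http.DefaultClient, "http://localhost:11434", "nomic-embed-text"); err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Println("ready")
}
```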

Source Files ¶

ollama.go
