indexer

package
v1.0.0

This package is not in the latest version of its module.
Published: May 14, 2026 License: MPL-2.0 Imports: 13 Imported by: 0

Documentation

Overview

Package indexer orchestrates the walk → chunk → embed → store pipeline.

Indexer is constructed with four injected dependencies — a chunker, an embedder, a store, and a walker — and knows nothing about the concrete implementations behind those interfaces. indexer never imports langs; the caller builds the *chunker.Chunker with the desired language configs and passes it in, keeping CGo out of the library layer.
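The dependency-injection shape described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four interfaces, the Pipeline type, and the toy implementations are invented here and are much simpler than the real chunker, embedder, store, and walker types, which this page does not show.

```go
package main

import (
	"fmt"
	"strings"
)

// The four interfaces below are simplified stand-ins invented for this
// sketch; the real chunker, embedder, store, and walker types differ.
type Walker interface{ Files() []string }
type Chunker interface{ Chunk(path string) []string }
type Embedder interface{ Embed(chunk string) []float32 }
type Store interface {
	Put(path string, vectors [][]float32)
}

// Pipeline (hypothetical) depends only on the interfaces above and never
// on a concrete implementation.
type Pipeline struct {
	w Walker
	c Chunker
	e Embedder
	s Store
}

// Run performs walk -> chunk -> embed -> store for every file, sequentially.
func (p Pipeline) Run() {
	for _, path := range p.w.Files() {
		var vectors [][]float32
		for _, chunk := range p.c.Chunk(path) {
			vectors = append(vectors, p.e.Embed(chunk))
		}
		p.s.Put(path, vectors)
	}
}

// Toy implementations, supplied by the caller rather than by Pipeline.
type listWalker []string

func (l listWalker) Files() []string { return l }

type dotChunker struct{}

// Chunk splits the path on "." purely to produce some chunks to embed.
func (dotChunker) Chunk(path string) []string { return strings.Split(path, ".") }

type lenEmbedder struct{}

// Embed returns a one-dimensional "vector" holding the chunk length.
func (lenEmbedder) Embed(chunk string) []float32 { return []float32{float32(len(chunk))} }

type mapStore map[string][][]float32

func (m mapStore) Put(path string, v [][]float32) { m[path] = v }

func main() {
	s := mapStore{}
	Pipeline{listWalker{"main.go", "README.md"}, dotChunker{}, lenEmbedder{}, s}.Run()
	fmt.Println(len(s), len(s["main.go"])) // 2 2
}
```

Because the caller picks the implementations, swapping the toy chunker for one backed by a heavier dependency would not change Pipeline itself.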

Types

type Indexer

type Indexer struct {
	// contains filtered or unexported fields
}

Indexer orchestrates the walk → chunk → embed → store pipeline.

func New

func New(c *chunker.Chunker, e embedder.Embedder, s store.Store, w *walker.Walker, concurrency int, maxFileSize int64, writeBatchSize int, onIndexed func(), progress Progress) *Indexer

New constructs an Indexer from its dependencies.

c may be nil for hash-only indexing (no chunking); when non-nil it must have been built with the desired language configs (from langs.AllLanguages() in cmd/). indexer does not import langs directly.

The caller retains ownership of w and must close it separately; the indexer does not close the walker.

concurrency controls the maximum number of files processed in parallel; values <= 0 are treated as 1.

writeBatchSize controls how many FileRecords entries are written per store call; values <= 0 default to 50.

progress is optional; pass nil to disable per-file notifications.
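The defaulting rules for concurrency and writeBatchSize can be sketched as follows. This is not the package's code: normalize, batches, and processAll are hypothetical helpers, and the semaphore-based limiter is just one common way to bound parallelism in Go.

```go
package main

import (
	"fmt"
	"sync"
)

// normalize applies the documented defaults: concurrency <= 0 becomes 1
// and writeBatchSize <= 0 becomes 50. The helper itself is hypothetical.
func normalize(concurrency, writeBatchSize int) (int, int) {
	if concurrency <= 0 {
		concurrency = 1
	}
	if writeBatchSize <= 0 {
		writeBatchSize = 50
	}
	return concurrency, writeBatchSize
}

// batches returns the sizes of the store writes needed for n records
// when each write carries at most size entries (size must be > 0).
func batches(n, size int) []int {
	var out []int
	for n > 0 {
		b := size
		if n < b {
			b = n
		}
		out = append(out, b)
		n -= b
	}
	return out
}

// processAll calls work for every path with at most limit calls running
// at once. limit must already be normalized to a value >= 1.
func processAll(paths []string, limit int, work func(string)) {
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for _, p := range paths {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are busy
		go func(p string) {
			defer wg.Done()
			defer func() { <-sem }()
			work(p)
		}(p)
	}
	wg.Wait()
}

func main() {
	c, b := normalize(0, -3)
	fmt.Println(c, b)            // 1 50
	fmt.Println(batches(120, b)) // [50 50 20]

	done := make(chan string, 3)
	processAll([]string{"a.go", "b.go", "c.go"}, c, func(p string) { done <- p })
	fmt.Println(len(done)) // 3
}
```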

func (*Indexer) Index

func (idx *Indexer) Index(ctx context.Context) error

Index walks the project root, chunks every matched file, embeds each chunk, and upserts the resulting records into the store. Files whose content hash matches the stored hash are skipped, and records for files that no longer exist on disk are removed from the store. On context cancellation Index returns nil; any partial progress left in the store remains valid.
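The skip and cleanup rules can be sketched as a planning step over two maps. The plan function and its map-based inputs are hypothetical, and SHA-256 is an assumption: this page does not say which hash function the indexer uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashOf returns a hex-encoded content hash. SHA-256 is assumed here.
func hashOf(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// plan compares the files found on disk with the hashes already stored.
// onDisk maps path -> content; stored maps path -> last indexed hash.
func plan(onDisk map[string][]byte, stored map[string]string) (reindex, skip, remove []string) {
	for path, content := range onDisk {
		if stored[path] == hashOf(content) {
			skip = append(skip, path) // unchanged since the last run
		} else {
			reindex = append(reindex, path) // new or modified
		}
	}
	for path := range stored {
		if _, ok := onDisk[path]; !ok {
			remove = append(remove, path) // no longer exists on disk
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(reindex)
	sort.Strings(skip)
	sort.Strings(remove)
	return reindex, skip, remove
}

func main() {
	onDisk := map[string][]byte{
		"a.go": []byte("package a"),
		"b.go": []byte("package b // edited"),
		"c.go": []byte("package c"),
	}
	stored := map[string]string{
		"a.go":   hashOf([]byte("package a")),
		"b.go":   hashOf([]byte("package b")),
		"old.go": hashOf([]byte("package old")),
	}
	reindex, skip, remove := plan(onDisk, stored)
	fmt.Println(reindex, skip, remove) // [b.go c.go] [a.go] [old.go]
}
```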

func (*Indexer) IndexFiles

func (idx *Indexer) IndexFiles(ctx context.Context, paths []string) error

IndexFiles indexes a specific set of paths. Files that no longer exist on disk are removed from the store. Files excluded by the walker's filter are silently skipped.
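The three outcomes for a requested path can be sketched as a small classifier. The action type, classify, and missing are hypothetical names, the suffix-based filter is invented for the example, and checking for a missing file before consulting the filter is an assumption; the documentation above does not state the order.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// action is a hypothetical label for what happens to one requested path.
type action int

const (
	actionIndex action = iota
	actionRemove
	actionSkip
)

func (a action) String() string {
	return [...]string{"index", "remove", "skip"}[a]
}

// missing reports whether path no longer exists on disk.
func missing(path string) bool {
	_, err := os.Stat(path)
	return errors.Is(err, fs.ErrNotExist)
}

// classify decides the outcome for one path. The order of the checks
// (missing first, then the filter) is an assumption of this sketch.
func classify(path string, isMissing, excluded func(string) bool) action {
	switch {
	case isMissing(path):
		return actionRemove // drop the record from the store
	case excluded(path):
		return actionSkip // silently ignored
	default:
		return actionIndex
	}
}

func main() {
	dir, err := os.MkdirTemp("", "indexfiles")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	kept := filepath.Join(dir, "kept.go")
	logFile := filepath.Join(dir, "debug.log")
	for _, p := range []string{kept, logFile} {
		if err := os.WriteFile(p, []byte("x"), 0o644); err != nil {
			panic(err)
		}
	}
	gone := filepath.Join(dir, "gone.go") // never created

	// Stand-in for the walker's filter: exclude log files.
	excluded := func(p string) bool { return strings.HasSuffix(p, ".log") }

	for _, p := range []string{kept, logFile, gone} {
		fmt.Println(filepath.Base(p), classify(p, missing, excluded))
	}
	// kept.go index
	// debug.log skip
	// gone.go remove
}
```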

type Progress

type Progress interface {
	// FileStarted is called when a file begins processing in the read+hash stage.
	FileStarted(path string)
	// FileProcessed is called after a file has been successfully embedded and stored.
	FileProcessed(path string)
	// FileSkipped is called when a file is bypassed due to a hash match (content
	// unchanged), an unsupported language extension, or exceeding maxFileSize.
	FileSkipped(path string)
	// FileFailed is called when a file could not be indexed because at least
	// one of its chunks failed embedding. The error is the joined per-chunk
	// failure cause. The file's existing store record (if any) is left
	// untouched so it will be retried on the next index run.
	FileFailed(path string, err error)
}

Progress receives per-file notifications during indexing. Implementations must be safe for concurrent use — stages run inside errgroup goroutines.
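One way to satisfy the concurrency requirement is to keep counters in atomics and guard the failure map with a mutex. The sketch below is an illustration only: the counter type is hypothetical, the Progress interface is redeclared locally so the example compiles on its own, and plain goroutines stand in for the indexer's stages.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// Progress mirrors the interface documented above so that this sketch
// is self-contained.
type Progress interface {
	FileStarted(path string)
	FileProcessed(path string)
	FileSkipped(path string)
	FileFailed(path string, err error)
}

// counter is a hypothetical Progress implementation that is safe for
// concurrent use: counts are atomic and the map is mutex-protected.
type counter struct {
	started, processed, skipped atomic.Int64

	mu       sync.Mutex
	failures map[string]error
}

func newCounter() *counter {
	return &counter{failures: make(map[string]error)}
}

func (c *counter) FileStarted(string)   { c.started.Add(1) }
func (c *counter) FileProcessed(string) { c.processed.Add(1) }
func (c *counter) FileSkipped(string)   { c.skipped.Add(1) }

func (c *counter) FileFailed(path string, err error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.failures[path] = err
}

// Compile-time check that *counter implements Progress.
var _ Progress = (*counter)(nil)

func main() {
	c := newCounter()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			path := fmt.Sprintf("file%d.go", i)
			c.FileStarted(path)
			switch {
			case i%10 == 0:
				c.FileFailed(path, errors.New("embedding failed"))
			case i%2 == 0:
				c.FileSkipped(path)
			default:
				c.FileProcessed(path)
			}
		}(i)
	}
	wg.Wait()
	// started, processed, skipped, failed
	fmt.Println(c.started.Load(), c.processed.Load(), c.skipped.Load(), len(c.failures)) // 100 50 40 10
}
```

atomic.Int64 requires Go 1.19 or later; on older toolchains the same counters can be kept behind the mutex instead.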
