Documentation ¶
Index ¶
- func SaveCache(path string, entries []CacheEntry, modelID, sentexVersion string, dims int) error
- func Similarity(a, b string) float64
- func Trigrams(term string) []string
- type BM25
- type CacheEntry
- type CachedChunk
- type Chunk
- type Graph
- type Index
- type Result
- type SearchResult
- type Trigram
- type VectorIndex
- type VectorResult
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func SaveCache ¶
func SaveCache(path string, entries []CacheEntry, modelID, sentexVersion string, dims int) error
SaveCache writes entries to path atomically (temp file + rename). Every save is a full rewrite of the cache.
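The temp-file-plus-rename pattern mentioned above can be sketched as follows. This is a minimal illustration and not the package's actual code: writeFileAtomic and demo are hypothetical names, and the sketch assumes the temp file lives in the same directory as the target so the rename stays on one filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper (not part of this package).
// It writes data to a temp file next to path, then renames it over path,
// so a reader sees either the old file or the new one, never a partial write.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	defer os.Remove(tmpName) // fails harmlessly once the rename has succeeded

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush to disk before the rename
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// demo saves twice to the same path and returns the final file content.
// Each save is a full rewrite, so only the second payload survives.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "vectors.cache")
	for _, payload := range []string{"first save", "second save"} {
		if err := writeFileAtomic(path, []byte(payload)); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```

Because every save rewrites the whole file, the cost of a save grows with the total cache size rather than with the size of the change.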
func Similarity ¶
Similarity computes the Jaccard similarity of the trigram sets of two strings.
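As a rough illustration of trigram Jaccard similarity, the sketch below builds trigram sets and divides the intersection size by the union size. The helper names trigramSet and jaccard are hypothetical, and the lowercasing, the absence of padding, and the result of 0 for two empty sets are assumptions that may differ from what Similarity and Trigrams actually do.

```go
package main

import (
	"fmt"
	"strings"
)

// trigramSet returns the set of 3-rune substrings of s, lowercased.
// Assumption: no padding characters are added at the string boundaries.
func trigramSet(s string) map[string]struct{} {
	r := []rune(strings.ToLower(s))
	set := make(map[string]struct{})
	for i := 0; i+3 <= len(r); i++ {
		set[string(r[i:i+3])] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B|, or 0 when both sets are empty.
func jaccard(a, b map[string]struct{}) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 0
	}
	inter := 0
	for t := range a {
		if _, ok := b[t]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

func main() {
	// "search" has 4 trigrams, "searches" has 6, and they share 4.
	fmt.Printf("%.3f\n", jaccard(trigramSet("search"), trigramSet("searches")))
}
```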
Types ¶
type BM25 ¶
type BM25 struct {
// contains filtered or unexported fields
}
BM25 is a weighted-field BM25 inverted index.
type CacheEntry ¶
type CacheEntry struct {
PageName string
ContentHash string
Chunks []CachedChunk
}
CacheEntry holds the cached embeddings for a single page.
func LoadCache ¶
func LoadCache(path string, modelID, sentexVersion string, dims int) ([]CacheEntry, error)
LoadCache reads the cache from path. It returns an empty slice and a nil error when:
- the file does not exist (first run)
- any header field mismatches (stale cache: the model, version, or dims changed)
It returns nil and a non-nil error when the file exists but cannot be decoded (corrupt).
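The return contract above can be illustrated with a small stand-in loader. Everything here is hypothetical: the header and entry types, the save and load helpers, and the use of encoding/gob are assumptions made for the sketch, since the real on-disk format is not described in this documentation.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// header and entry are hypothetical stand-ins for the real cache layout.
type header struct {
	ModelID string
	Version string
	Dims    int
}

type entry struct {
	PageName    string
	ContentHash string
}

// save writes the header followed by the entries (gob is an assumption).
func save(path string, h header, entries []entry) error {
	var buf bytes.Buffer
	enc := gob.NewEncoder(&buf)
	if err := enc.Encode(h); err != nil {
		return err
	}
	if err := enc.Encode(entries); err != nil {
		return err
	}
	return os.WriteFile(path, buf.Bytes(), 0o644)
}

// load mirrors the documented contract: empty slice for a missing or stale
// cache, nil plus an error for a file that cannot be decoded.
func load(path string, want header) ([]entry, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return []entry{}, nil // first run
	}
	if err != nil {
		return nil, err
	}
	dec := gob.NewDecoder(bytes.NewReader(data))
	var got header
	if err := dec.Decode(&got); err != nil {
		return nil, fmt.Errorf("decode header: %w", err)
	}
	if got != want {
		return []entry{}, nil // stale: caller re-embeds everything
	}
	var entries []entry
	if err := dec.Decode(&entries); err != nil {
		return nil, fmt.Errorf("decode entries: %w", err)
	}
	return entries, nil
}

func describe(entries []entry, err error) string {
	if err != nil {
		return "error"
	}
	return fmt.Sprintf("%d entries", len(entries))
}

// demo runs the missing, fresh, stale, and corrupt cases in that order.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "load-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "cache")
	h := header{ModelID: "model-a", Version: "v1", Dims: 384}

	out := []string{describe(load(path, h))}
	if err := save(path, h, []entry{{PageName: "Home", ContentHash: "abc123"}}); err != nil {
		return nil, err
	}
	out = append(out, describe(load(path, h)))
	out = append(out, describe(load(path, header{ModelID: "model-b", Version: "v1", Dims: 384})))
	if err := os.WriteFile(path, []byte("garbage"), 0o644); err != nil {
		return nil, err
	}
	out = append(out, describe(load(path, h)))
	return out, nil
}

func main() {
	fmt.Println(demo())
}
```

Treating a stale cache as empty rather than as an error lets a caller fall through to a full re-embed without special handling, while a corrupt file still surfaces as a failure.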
type CachedChunk ¶
CachedChunk holds the embedding vector and line range for a single chunk.
type Chunk ¶
type Chunk struct {
Text string // chunk content, prefixed with the page's # Title line
StartLine int // 1-indexed start line in the original page content
EndLine int // 1-indexed end line in the original page content (inclusive)
}
Chunk represents a semantically-meaningful portion of a page with line-range tracking for search result anchoring.
func ChunkPage ¶
ChunkPage splits a page into semantically-meaningful chunks using section headings as the primary split strategy, falling back to paragraph breaks when no headings are present. Every chunk is prefixed with the page's "# Title" line. Chunks whose body contains fewer than minChunkTokens tokens are merged with an adjacent chunk.
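A simplified sketch of heading-based chunking with line-range tracking is shown below. It is not ChunkPage itself: splitByHeadings and the chunk type are hypothetical, only "## " lines are treated as split points, the title is passed in separately, and both the paragraph fallback and the merging of small chunks are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk mirrors the shape of Chunk for illustration only.
type chunk struct {
	Text      string
	StartLine int // 1-indexed
	EndLine   int // 1-indexed, inclusive
}

// splitByHeadings starts a new chunk at every line beginning with "## "
// and prefixes each chunk with the "# title" line.
func splitByHeadings(title, content string) []chunk {
	lines := strings.Split(content, "\n")
	var chunks []chunk
	start := 0 // 0-indexed first line of the chunk being built

	// flush emits lines[start:end] as a chunk; end is exclusive and 0-indexed,
	// which makes it equal to the 1-indexed inclusive end line.
	flush := func(end int) {
		if end <= start {
			return
		}
		body := strings.Join(lines[start:end], "\n")
		if strings.TrimSpace(body) == "" {
			return
		}
		chunks = append(chunks, chunk{
			Text:      "# " + title + "\n" + body,
			StartLine: start + 1,
			EndLine:   end,
		})
	}

	for i, line := range lines {
		if strings.HasPrefix(line, "## ") && i > start {
			flush(i)
			start = i
		}
	}
	flush(len(lines))
	return chunks
}

func main() {
	content := "intro line\n\n## Setup\nstep one\n## Usage\nrun it"
	for _, c := range splitByHeadings("Demo", content) {
		fmt.Printf("lines %d-%d: %q\n", c.StartLine, c.EndLine, c.Text)
	}
}
```

Keeping the line range on each chunk is what lets a search hit point back to a specific location in the original page.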
type Graph ¶
type Graph struct {
// contains filtered or unexported fields
}
Graph is a bidirectional wikilink graph.
func (*Graph) Add ¶
Add adds or replaces a page and its outbound wikilinks in the graph. If the page was previously indexed, old link relationships are cleaned up first.
func (*Graph) LinkedFrom ¶
LinkedFrom returns the canonical names of pages that link to the given target.
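One way to picture a bidirectional link graph with replace-on-add semantics is a pair of maps, one per direction. The sketch below uses hypothetical names (linkGraph, add, linkedFrom) and skips name canonicalization entirely, so it only approximates the behaviour described for Graph.

```go
package main

import (
	"fmt"
	"sort"
)

// linkGraph keeps outbound and inbound edges so both directions are cheap to query.
type linkGraph struct {
	out map[string]map[string]struct{} // page -> pages it links to
	in  map[string]map[string]struct{} // page -> pages that link to it
}

func newLinkGraph() *linkGraph {
	return &linkGraph{
		out: make(map[string]map[string]struct{}),
		in:  make(map[string]map[string]struct{}),
	}
}

// add records page with its outbound links. If page was added before,
// its old inbound references are removed first so stale edges do not linger.
func (g *linkGraph) add(page string, links []string) {
	for target := range g.out[page] {
		delete(g.in[target], page)
		if len(g.in[target]) == 0 {
			delete(g.in, target)
		}
	}
	set := make(map[string]struct{}, len(links))
	for _, target := range links {
		set[target] = struct{}{}
		if g.in[target] == nil {
			g.in[target] = make(map[string]struct{})
		}
		g.in[target][page] = struct{}{}
	}
	g.out[page] = set
}

// linkedFrom returns the pages that link to target, sorted for stable output.
func (g *linkGraph) linkedFrom(target string) []string {
	names := make([]string, 0, len(g.in[target]))
	for page := range g.in[target] {
		names = append(names, page)
	}
	sort.Strings(names)
	return names
}

func main() {
	g := newLinkGraph()
	g.add("a", []string{"b", "c"})
	g.add("d", []string{"b"})
	fmt.Println(g.linkedFrom("b"))
	g.add("a", []string{"c"}) // re-adding "a" drops its old link to "b"
	fmt.Println(g.linkedFrom("b"))
}
```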
type Index ¶
type Index struct {
// contains filtered or unexported fields
}
Index is the composite search index combining BM25, trigram fuzzy matching, a bidirectional wikilink graph for link-boost, and an optional vector index.
func NewIndex ¶
NewIndex creates an empty composite Index. When model is non-nil a VectorIndex is created and wired in; when nil the index behaves as BM25 + trigram + graph only. cachePath is the path to the .memento-vectors sidecar file used for embedding write-through. An empty cachePath disables cache persistence.
func (*Index) Add ¶
Add indexes a page, replacing any existing entry with the same name. When a cachePath is set and a model is available, the resulting chunk embeddings are written through to the cache file.
func (*Index) AddFromCache ¶
func (ix *Index) AddFromCache(page pages.Page, entry CacheEntry)
AddFromCache indexes a page using pre-computed chunk vectors from the given CacheEntry, bypassing the (expensive) embedding step. The BM25, graph, and trigram sub-indexes are updated exactly as in Add.
func (*Index) LinkedFrom ¶
LinkedFrom returns the canonical names of pages that link to the given page. It delegates to the underlying graph.
func (*Index) LinksTo ¶
LinksTo returns the canonical names of pages that the given page links to. It delegates to the underlying graph.
func (*Index) Remove ¶
Remove removes a page from all sub-indexes. When a cachePath is set, the cache file is updated (write-through).
func (*Index) Search ¶
Search executes the full search pipeline and returns up to limit results.
Pipeline (with vector model):
BM25 + vector cosine search → normalize & merge → graph boost → relevance threshold.
Pipeline (nil model, backward-compatible):
BM25 → trigram fallback if <3 results → graph boost → relevance threshold.
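The "normalize & merge" and "relevance threshold" stages could look roughly like the sketch below. The details are assumptions, not the package's documented behaviour: scores are divided by the per-source maximum, the two sources are averaged with equal weights, and the graph-boost stage is omitted. The names normalize, merge, and scored are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

type scored struct {
	Page  string
	Score float64
}

// normalize divides every score by the largest one.
// Assumption: scores are non-negative, so results fall in [0, 1].
func normalize(in map[string]float64) map[string]float64 {
	top := 0.0
	for _, s := range in {
		if s > top {
			top = s
		}
	}
	out := make(map[string]float64, len(in))
	for page, s := range in {
		if top > 0 {
			out[page] = s / top
		} else {
			out[page] = 0
		}
	}
	return out
}

// merge averages the normalized BM25 and vector scores per page, drops pages
// below threshold, and returns up to limit results, best first.
func merge(bm25, vec map[string]float64, threshold float64, limit int) []scored {
	combined := make(map[string]float64)
	for page, s := range normalize(bm25) {
		combined[page] += 0.5 * s
	}
	for page, s := range normalize(vec) {
		combined[page] += 0.5 * s
	}
	var results []scored
	for page, s := range combined {
		if s >= threshold {
			results = append(results, scored{Page: page, Score: s})
		}
	}
	sort.Slice(results, func(i, j int) bool {
		if results[i].Score != results[j].Score {
			return results[i].Score > results[j].Score
		}
		return results[i].Page < results[j].Page // stable tie-break
	})
	if len(results) > limit {
		results = results[:limit]
	}
	return results
}

func main() {
	bm25 := map[string]float64{"a": 10, "b": 5}
	vec := map[string]float64{"b": 0.8, "c": 0.4}
	fmt.Println(merge(bm25, vec, 0.3, 10))
}
```

Normalizing before merging matters because raw BM25 scores are unbounded while cosine similarities are not, so adding them directly would let one source dominate.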
type Result ¶
type Result struct {
Page string
Score float64
Snippet string
Line int
IsDirect bool // true if the page matched the query directly (BM25), false if graph-boosted only
}
Result is a single result from the composite Index search.
type SearchResult ¶
SearchResult is a single result from a BM25 search.
type Trigram ¶
type Trigram struct {
// contains filtered or unexported fields
}
Trigram is an in-memory fuzzy-match index based on trigram Jaccard similarity.
type VectorIndex ¶
type VectorIndex struct {
// contains filtered or unexported fields
}
VectorIndex stores per-chunk embeddings and supports Add/Remove/Search with case-insensitive page-name matching.
func NewVectorIndex ¶
func NewVectorIndex(model *embed.Model) *VectorIndex
NewVectorIndex creates an empty vector index backed by the given embedding model.
func (*VectorIndex) Add ¶
func (vi *VectorIndex) Add(page pages.Page) error
Add chunks the page, embeds each chunk, and stores the resulting vectors. If the page was previously indexed its old chunks are replaced.
func (*VectorIndex) AddFromCache ¶
func (vi *VectorIndex) AddFromCache(page pages.Page, chunks []CachedChunk)
AddFromCache loads pre-computed chunk vectors for a page, bypassing embedding. If the page was previously indexed its old chunks are replaced.
func (*VectorIndex) Remove ¶
func (vi *VectorIndex) Remove(name string)
Remove removes all stored chunks for the named page (case-insensitive).
func (*VectorIndex) Search ¶
func (vi *VectorIndex) Search(query string, limit int) []VectorResult
Search embeds the query, scores all stored chunk vectors via cosine similarity, deduplicates to one result per page (best-scoring chunk wins), then returns up to limit results sorted by score descending. Returns nil when the index is empty.
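The scoring and deduplication steps described above can be sketched as follows. The sketch starts from an already-embedded query vector (the embedding call is not shown), assumes all vectors have the same length, and uses hypothetical names (storedChunk, hit, cosine, search) rather than the package's own types.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

type storedChunk struct {
	Page string
	Line int // 1-indexed start line of the chunk
	Vec  []float64
}

type hit struct {
	Page  string
	Score float64
	Line  int
}

// cosine returns the cosine similarity of a and b, or 0 if either is all zeros.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// search scores every chunk, keeps the best chunk per page (page names are
// compared case-insensitively), and returns up to limit hits, best first.
func search(query []float64, chunks []storedChunk, limit int) []hit {
	if len(chunks) == 0 {
		return nil
	}
	best := make(map[string]hit)
	for _, c := range chunks {
		s := cosine(query, c.Vec)
		key := strings.ToLower(c.Page)
		if h, ok := best[key]; !ok || s > h.Score {
			best[key] = hit{Page: c.Page, Score: s, Line: c.Line}
		}
	}
	hits := make([]hit, 0, len(best))
	for _, h := range best {
		hits = append(hits, h)
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].Score != hits[j].Score {
			return hits[i].Score > hits[j].Score
		}
		return hits[i].Page < hits[j].Page // stable tie-break
	})
	if len(hits) > limit {
		hits = hits[:limit]
	}
	return hits
}

func main() {
	chunks := []storedChunk{
		{Page: "Home", Line: 1, Vec: []float64{1, 0}},
		{Page: "home", Line: 10, Vec: []float64{0, 1}},
		{Page: "Notes", Line: 3, Vec: []float64{1, 1}},
	}
	for _, h := range search([]float64{1, 0}, chunks, 5) {
		fmt.Printf("%s line %d score %.3f\n", h.Page, h.Line, h.Score)
	}
}
```

This brute-force scan is linear in the number of stored chunks, which is usually acceptable for a single wiki but would need an approximate index at larger scale.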
type VectorResult ¶
type VectorResult struct {
Page string
Score float64 // cosine similarity, range [-1, 1] but typically [0, 1] for text
Line int // 1-indexed start line of the best-matching chunk
}
VectorResult is a single result from a vector similarity search.