package index

Version: v0.0.0-...-ec03379 | Published: Apr 20, 2026 | License: MIT | Imports: 16 | Imported by: 0

Repository

github.com/edgetools/memento

Documentation ¶

Index ¶

func SaveCache(path string, entries []CacheEntry, modelID, sentexVersion string, dims int) error
func Similarity(a, b string) float64
func Trigrams(term string) []string
type BM25
- func NewBM25() *BM25
- func (b *BM25) Add(page pages.Page)
- func (b *BM25) Remove(name string)
- func (b *BM25) Search(query string, limit int) []SearchResult
type CacheEntry
- func LoadCache(path string, modelID, sentexVersion string, dims int) ([]CacheEntry, error)
type CachedChunk
type Chunk
- func ChunkPage(page pages.Page) []Chunk
type Graph
- func NewGraph() *Graph
- func (g *Graph) Add(page pages.Page)
- func (g *Graph) LinkedFrom(name string) []string
- func (g *Graph) LinksTo(name string) []string
- func (g *Graph) Remove(name string)
type Index
- func NewIndex(model *embed.Model, cachePath string) *Index
- func (ix *Index) Add(page pages.Page)
- func (ix *Index) AddFromCache(page pages.Page, entry CacheEntry)
- func (ix *Index) LinkedFrom(name string) []string
- func (ix *Index) LinksTo(name string) []string
- func (ix *Index) Remove(name string)
- func (ix *Index) Search(query string, limit int) []Result
type Result
type SearchResult
type Trigram
- func NewTrigram() *Trigram
- func (ti *Trigram) Add(term string)
- func (ti *Trigram) FuzzyMatch(query string, threshold float64) []string
type VectorIndex
- func NewVectorIndex(model *embed.Model) *VectorIndex
- func (vi *VectorIndex) Add(page pages.Page) error
- func (vi *VectorIndex) AddFromCache(page pages.Page, chunks []CachedChunk)
- func (vi *VectorIndex) Remove(name string)
- func (vi *VectorIndex) Search(query string, limit int) []VectorResult
type VectorResult

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func SaveCache ¶

func SaveCache(path string, entries []CacheEntry, modelID, sentexVersion string, dims int) error

SaveCache writes entries to path atomically (temp file + rename). Every save is a full rewrite of the cache.
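
The temp file + rename pattern mentioned above can be sketched as follows. This is a minimal illustration, not memento's implementation: writeFileAtomic and demoAtomicWrite are hypothetical helpers, and the payload is raw bytes rather than encoded cache entries.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic is a hypothetical helper. It writes data to a temp file in
// the same directory as path, then renames the temp file over path, so readers
// see either the old file or the complete new one.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Best-effort cleanup. After a successful rename the temp name no longer
	// exists and this call fails harmlessly.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	// Flush to stable storage before the rename makes the file visible.
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// demoAtomicWrite rewrites the same path twice (each save is a full rewrite)
// and returns the final file content.
func demoAtomicWrite() (string, error) {
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "vectors.cache")
	if err := writeFileAtomic(path, []byte("first")); err != nil {
		return "", err
	}
	if err := writeFileAtomic(path, []byte("second")); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := demoAtomicWrite()
	fmt.Println(got, err)
}
```

Creating the temp file in the destination directory keeps the rename on one filesystem, which is what makes the replacement atomic on POSIX systems.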

func Similarity ¶

func Similarity(a, b string) float64

Similarity computes the Jaccard similarity of the trigram sets of two strings.

func Trigrams ¶

func Trigrams(term string) []string

Trigrams returns the set of unique 3-character sliding windows for term. For terms shorter than 3 characters, it returns the term itself as a single element.
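
The behaviour documented for Trigrams and Similarity can be sketched in a few lines. The lowercase names trigrams and similarity are stand-ins, and two details are assumptions: windows are taken over runes rather than bytes, and no case folding is applied.

```go
package main

import "fmt"

// trigrams returns the unique 3-rune sliding windows of term in first-seen
// order. Terms shorter than 3 runes yield the term itself.
func trigrams(term string) []string {
	r := []rune(term)
	if len(r) < 3 {
		return []string{term}
	}
	seen := make(map[string]bool)
	var out []string
	for i := 0; i+3 <= len(r); i++ {
		g := string(r[i : i+3])
		if !seen[g] {
			seen[g] = true
			out = append(out, g)
		}
	}
	return out
}

// similarity is the Jaccard index |A ∩ B| / |A ∪ B| of the two trigram sets.
func similarity(a, b string) float64 {
	setA := make(map[string]bool)
	for _, g := range trigrams(a) {
		setA[g] = true
	}
	setB := make(map[string]bool)
	for _, g := range trigrams(b) {
		setB[g] = true
	}
	inter := 0
	for g := range setB {
		if setA[g] {
			inter++
		}
	}
	union := len(setA) + len(setB) - inter
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

func main() {
	fmt.Println(trigrams("banana"))           // ban, ana, nan ("ana" occurs twice, kept once)
	fmt.Println(similarity("hello", "hallo")) // one shared trigram (llo) out of five
}
```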

Types ¶

type BM25 ¶

type BM25 struct {
	// contains filtered or unexported fields
}

BM25 is a weighted-field BM25 inverted index.

func (*BM25) Add ¶

func (b *BM25) Add(page pages.Page)

Add indexes a page, replacing any existing entry with the same name.

func (*BM25) Remove ¶

func (b *BM25) Remove(name string)

Remove deletes a page from the index by name.

func (*BM25) Search ¶

func (b *BM25) Search(query string, limit int) []SearchResult

Search returns up to limit pages ranked by BM25 score for query.
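
For readers unfamiliar with BM25 ranking, the following is a minimal single-field sketch of the scoring formula. It does not reproduce the field weighting of the BM25 type. The tokenizer (lowercase, whitespace split), the parameters k1 = 1.2 and b = 0.75, and the name bm25Search are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

// Common default BM25 parameters (assumed, not taken from memento).
const (
	bm25K1 = 1.2
	bm25B  = 0.75
)

type scored struct {
	Name  string
	Score float64
}

// bm25Search scores every document against query and returns up to limit
// hits, best first. It rebuilds all statistics on each call to stay short;
// a real index would maintain them incrementally.
func bm25Search(docs map[string]string, query string, limit int) []scored {
	tf := make(map[string]map[string]int, len(docs)) // doc -> term -> count
	df := make(map[string]int)                       // term -> docs containing it
	docLen := make(map[string]int, len(docs))
	total := 0
	for name, text := range docs {
		counts := make(map[string]int)
		toks := strings.Fields(strings.ToLower(text))
		for _, t := range toks {
			counts[t]++
		}
		for t := range counts {
			df[t]++
		}
		tf[name] = counts
		docLen[name] = len(toks)
		total += len(toks)
	}
	if total == 0 {
		return nil
	}
	n := float64(len(docs))
	avgLen := float64(total) / n

	var out []scored
	for name, counts := range tf {
		score := 0.0
		for _, q := range strings.Fields(strings.ToLower(query)) {
			f := float64(counts[q])
			if f == 0 {
				continue
			}
			d := float64(df[q])
			idf := math.Log(1 + (n-d+0.5)/(d+0.5))
			norm := 1 - bm25B + bm25B*float64(docLen[name])/avgLen
			score += idf * f * (bm25K1 + 1) / (f + bm25K1*norm)
		}
		if score > 0 {
			out = append(out, scored{Name: name, Score: score})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Name < out[j].Name
	})
	if limit >= 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	docs := map[string]string{
		"go":    "go modules and go tooling",
		"rust":  "rust ownership and borrowing",
		"mixed": "notes on go and rust and many other unrelated programming languages",
	}
	for _, r := range bm25Search(docs, "go", 10) {
		fmt.Printf("%s %.3f\n", r.Name, r.Score)
	}
}
```

The short document that mentions the term twice ranks above the long document that mentions it once, which is the length normalization at work.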

type CacheEntry ¶

type CacheEntry struct {
	PageName    string
	ContentHash string
	Chunks      []CachedChunk
}

CacheEntry holds the cached embeddings for a single page.

func LoadCache ¶

func LoadCache(path string, modelID, sentexVersion string, dims int) ([]CacheEntry, error)

LoadCache reads the cache from path. It returns an empty slice and no error when:

- the file does not exist (first run)
- any header field mismatches (stale cache: the model, version, or dims changed)

It returns nil and an error when the file exists but cannot be decoded (corrupt).
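
The three outcomes described above (missing file, stale header, corrupt file) can be sketched like this. The JSON layout, the header and cacheFile types, the sample values, and the loadCache helper are all hypothetical; the real on-disk format is not documented here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// header and cacheFile form a hypothetical JSON layout for illustration.
type header struct {
	ModelID       string
	SentexVersion string
	Dims          int
}

type cacheFile struct {
	Header  header
	Entries []string // page names stand in for real cache entries
}

// loadCache returns an empty slice for a missing or stale cache and an error
// only when the file exists but cannot be read or decoded.
func loadCache(path string, want header) ([]string, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return []string{}, nil // first run
	}
	if err != nil {
		return nil, err
	}
	var cf cacheFile
	if err := json.Unmarshal(data, &cf); err != nil {
		return nil, fmt.Errorf("decode cache %s: %w", path, err)
	}
	if cf.Header != want {
		return []string{}, nil // stale: embeddings must be recomputed
	}
	return cf.Entries, nil
}

type outcome struct {
	Missing, Fresh, Stale int
	CorruptErr            error
}

// demoLoadCache exercises each outcome against a temporary file.
func demoLoadCache() (outcome, error) {
	var o outcome
	dir, err := os.MkdirTemp("", "cache-demo")
	if err != nil {
		return o, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "vectors.json")
	want := header{ModelID: "model-a", SentexVersion: "v1", Dims: 384}

	entries, err := loadCache(path, want)
	if err != nil {
		return o, err
	}
	o.Missing = len(entries)

	data, err := json.Marshal(cacheFile{Header: want, Entries: []string{"Home", "Notes"}})
	if err != nil {
		return o, err
	}
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return o, err
	}
	if entries, err = loadCache(path, want); err != nil {
		return o, err
	}
	o.Fresh = len(entries)

	stale := want
	stale.Dims = 768
	if entries, err = loadCache(path, stale); err != nil {
		return o, err
	}
	o.Stale = len(entries)

	if err := os.WriteFile(path, []byte("not json"), 0o644); err != nil {
		return o, err
	}
	_, o.CorruptErr = loadCache(path, want)
	return o, nil
}

func main() {
	o, err := demoLoadCache()
	fmt.Printf("%+v %v\n", o, err)
}
```

Treating a header mismatch as an empty cache rather than an error lets callers fall back to re-embedding without special-casing model upgrades.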

type CachedChunk ¶

type CachedChunk struct {
	StartLine int
	EndLine   int
	Vector    []float32
}

CachedChunk holds the embedding vector and line range for a single chunk.

type Chunk ¶

type Chunk struct {
	Text      string // chunk content, prefixed with the page's # Title line
	StartLine int    // 1-indexed start line in the original page content
	EndLine   int    // 1-indexed end line in the original page content (inclusive)
}

Chunk represents a semantically-meaningful portion of a page with line-range tracking for search result anchoring.

func ChunkPage ¶

func ChunkPage(page pages.Page) []Chunk

ChunkPage splits a page into semantically-meaningful chunks using section headings as the primary split strategy, falling back to paragraph breaks when no headings are present. Every chunk is prefixed with the page's "# Title" line. Chunks whose body content is below minChunkTokens tokens are merged with an adjacent chunk.
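
The split strategy can be sketched as below, under several assumptions: the page is passed as a separate title and content string (the fields of pages.Page are not shown here), only lines starting with "## " count as section headings, and the merge step for chunks under minChunkTokens is omitted. The names chunk and chunkPage are stand-ins.

```go
package main

import (
	"fmt"
	"strings"
)

type chunk struct {
	Text      string
	StartLine int // 1-indexed
	EndLine   int // 1-indexed, inclusive
}

// chunkPage splits content at "## " headings, or at blank lines when the page
// has no such headings. Every chunk is prefixed with "# title".
func chunkPage(title, content string) []chunk {
	lines := strings.Split(content, "\n")
	isHeading := func(l string) bool { return strings.HasPrefix(l, "## ") }

	hasHeadings := false
	for _, l := range lines {
		if isHeading(l) {
			hasHeadings = true
			break
		}
	}

	var chunks []chunk
	start := 0 // 0-based index of the first line of the current chunk
	// emit records lines[start:end] as a chunk unless it is empty.
	emit := func(end int) {
		body := strings.TrimSpace(strings.Join(lines[start:end], "\n"))
		if body != "" {
			chunks = append(chunks, chunk{
				Text:      "# " + title + "\n\n" + body,
				StartLine: start + 1,
				EndLine:   end,
			})
		}
	}
	for i, l := range lines {
		switch {
		case hasHeadings && isHeading(l):
			emit(i)
			start = i // the heading line opens the next chunk
		case !hasHeadings && strings.TrimSpace(l) == "":
			emit(i)
			start = i + 1 // skip the blank separator line
		}
	}
	emit(len(lines))
	return chunks
}

func main() {
	content := "Intro text.\n\n## Setup\nInstall it.\n\n## Usage\nRun it."
	for _, c := range chunkPage("Guide", content) {
		fmt.Printf("lines %d-%d: %q\n", c.StartLine, c.EndLine, c.Text)
	}
}
```

In heading mode the recorded range covers the whole section, including any trailing blank line before the next heading.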

type Graph ¶

type Graph struct {
	// contains filtered or unexported fields
}

Graph is a bidirectional wikilink graph.

func NewGraph ¶

func NewGraph() *Graph

NewGraph creates an empty Graph.

func (*Graph) Add ¶

func (g *Graph) Add(page pages.Page)

Add adds or replaces a page and its outbound wikilinks in the graph. If the page was previously indexed, old link relationships are cleaned up first.

func (*Graph) LinkedFrom ¶

func (g *Graph) LinkedFrom(name string) []string

LinkedFrom returns the canonical names of pages that link to the given target.

func (*Graph) LinksTo ¶

func (g *Graph) LinksTo(name string) []string

LinksTo returns the canonical names of pages that the given page links to, in the order the links appear in the page source.

func (*Graph) Remove ¶

func (g *Graph) Remove(name string)

Remove removes a page and cleans up all its outbound link relationships.
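
A bidirectional link graph with the Add, Remove, LinksTo, and LinkedFrom semantics described above could look like the following sketch. It is an assumption-laden stand-in: the lowercase graph type takes the outbound link targets as a plain slice instead of parsing wikilinks from a pages.Page, does no name canonicalization, and sorts LinkedFrom results for determinism.

```go
package main

import (
	"fmt"
	"sort"
)

type graph struct {
	out map[string][]string        // page -> targets, in source order
	in  map[string]map[string]bool // target -> set of pages linking to it
}

func newGraph() *graph {
	return &graph{
		out: make(map[string][]string),
		in:  make(map[string]map[string]bool),
	}
}

// Add records a page and its outbound links, replacing any earlier entry.
func (g *graph) Add(name string, links []string) {
	g.Remove(name) // clean up old link relationships first
	seen := make(map[string]bool)
	var targets []string
	for _, t := range links {
		if seen[t] {
			continue // keep only the first occurrence of a repeated link
		}
		seen[t] = true
		targets = append(targets, t)
		if g.in[t] == nil {
			g.in[t] = make(map[string]bool)
		}
		g.in[t][name] = true
	}
	g.out[name] = targets
}

// Remove drops a page and its outbound links. Links that other pages hold to
// it are left in place.
func (g *graph) Remove(name string) {
	for _, t := range g.out[name] {
		delete(g.in[t], name)
		if len(g.in[t]) == 0 {
			delete(g.in, t)
		}
	}
	delete(g.out, name)
}

// LinksTo returns a copy of the page's targets in source order.
func (g *graph) LinksTo(name string) []string {
	return append([]string(nil), g.out[name]...)
}

// LinkedFrom returns the pages linking to name, sorted alphabetically.
func (g *graph) LinkedFrom(name string) []string {
	var out []string
	for src := range g.in[name] {
		out = append(out, src)
	}
	sort.Strings(out)
	return out
}

func main() {
	g := newGraph()
	g.Add("A", []string{"B", "C"})
	g.Add("B", []string{"C"})
	fmt.Println(g.LinkedFrom("C"))
	g.Add("A", []string{"B"}) // re-adding A drops its old link to C
	fmt.Println(g.LinkedFrom("C"))
}
```

Keeping both directions in memory makes backlink lookups a map read instead of a scan over every page.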

type Index ¶

type Index struct {
	// contains filtered or unexported fields
}

Index is the composite search index combining BM25, trigram fuzzy matching, a bidirectional wikilink graph for link-boost, and an optional vector index.

func NewIndex ¶

func NewIndex(model *embed.Model, cachePath string) *Index

NewIndex creates an empty composite Index. When model is non-nil a VectorIndex is created and wired in; when nil the index behaves as BM25 + trigram + graph only. cachePath is the path to the .memento-vectors sidecar file used for embedding write-through. An empty cachePath disables cache persistence.

func (*Index) Add ¶

func (ix *Index) Add(page pages.Page)

Add indexes a page, replacing any existing entry with the same name. When a cachePath is set and a model is available, the resulting chunk embeddings are written through to the cache file.

func (*Index) AddFromCache ¶

func (ix *Index) AddFromCache(page pages.Page, entry CacheEntry)

AddFromCache indexes a page using pre-computed chunk vectors from the given CacheEntry, bypassing the (expensive) embedding step. The BM25, graph, and trigram sub-indexes are updated exactly as in Add.

func (*Index) LinkedFrom ¶

func (ix *Index) LinkedFrom(name string) []string

LinkedFrom returns the canonical names of pages that link to the given page. It delegates to the underlying graph.

func (*Index) LinksTo ¶

func (ix *Index) LinksTo(name string) []string

LinksTo returns the canonical names of pages that the given page links to. It delegates to the underlying graph.

func (*Index) Remove ¶

func (ix *Index) Remove(name string)

Remove removes a page from all sub-indexes. When a cachePath is set, the cache file is updated (write-through).

func (*Index) Search ¶

func (ix *Index) Search(query string, limit int) []Result

Search executes the full search pipeline and returns up to limit results.

Pipeline (with vector model):

BM25 + vector cosine search → normalize & merge → graph boost → relevance threshold.

Pipeline (nil model, backward-compatible):

BM25 → trigram fallback if fewer than 3 results → graph boost → relevance threshold.
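
To make the "normalize & merge → graph boost → relevance threshold" stages concrete, here is one possible shape of such a pipeline. Every numeric choice is an assumption made for illustration: scores are divided by the best score in each list, the two lists are summed with equal weight, each direct hit passes a fixed fraction of its score to the pages it links to, and anything under the threshold is dropped. The actual weights and formulas used by Index.Search are not documented here.

```go
package main

import (
	"fmt"
	"sort"
)

type result struct {
	Page     string
	Score    float64
	IsDirect bool // false when the page appears through graph boost only
}

// normalize scales scores so the best hit is 1.0. Lists whose best score is
// not positive are dropped entirely.
func normalize(scores map[string]float64) map[string]float64 {
	top := 0.0
	for _, s := range scores {
		if s > top {
			top = s
		}
	}
	out := make(map[string]float64, len(scores))
	if top <= 0 {
		return out
	}
	for p, s := range scores {
		out[p] = s / top
	}
	return out
}

// search merges two score lists, applies a link boost, filters by threshold,
// and returns up to limit results, best first.
func search(bm25, vec map[string]float64, linksTo map[string][]string,
	boost, threshold float64, limit int) []result {

	direct := make(map[string]float64)
	for p, s := range normalize(bm25) {
		direct[p] += s
	}
	for p, s := range normalize(vec) {
		direct[p] += s
	}

	final := make(map[string]float64, len(direct))
	for p, s := range direct {
		final[p] = s
	}
	// Graph boost: pages linked from a direct hit inherit part of its score.
	for p, s := range direct {
		for _, t := range linksTo[p] {
			final[t] += boost * s
		}
	}

	var out []result
	for p, s := range final {
		if s < threshold {
			continue
		}
		_, isDirect := direct[p]
		out = append(out, result{Page: p, Score: s, IsDirect: isDirect})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Page < out[j].Page
	})
	if limit >= 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	bm25 := map[string]float64{"a": 4, "b": 2}
	vec := map[string]float64{"a": 0.8, "c": 0.4}
	links := map[string][]string{"a": {"d"}}
	for _, r := range search(bm25, vec, links, 0.25, 0.3, 10) {
		fmt.Printf("%+v\n", r)
	}
}
```

Normalizing before merging matters because raw BM25 scores are unbounded while cosine similarities lie in [-1, 1]; without it one signal would dominate the sum.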

type Result ¶

type Result struct {
	Page     string
	Score    float64
	Snippet  string
	Line     int
	IsDirect bool // true if the page matched the query directly (BM25), false if graph-boosted only
}

Result is a single result from the composite Index search.

type SearchResult ¶

type SearchResult struct {
	Name  string
	Score float64
}

SearchResult is a single result from a BM25 search.

type Trigram ¶

type Trigram struct {
	// contains filtered or unexported fields
}

Trigram is an in-memory fuzzy-match index based on trigram Jaccard similarity.

func NewTrigram ¶

func NewTrigram() *Trigram

NewTrigram creates an empty trigram index.

func (*Trigram) Add ¶

func (ti *Trigram) Add(term string)

Add adds a term to the trigram index.

func (*Trigram) FuzzyMatch ¶

func (ti *Trigram) FuzzyMatch(query string, threshold float64) []string

FuzzyMatch returns all indexed terms whose Jaccard similarity with query is at or above threshold.
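
One way to answer such threshold queries without comparing the query against every indexed term is a trigram posting list: only terms sharing at least one trigram with the query are scored. The sketch below shows that idea. Whether Trigram works this way internally is not stated, and the trigramIndex type, the rune-based grams helper, and the sorted output are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// grams returns the unique 3-rune windows of s, or s itself when shorter.
func grams(s string) []string {
	r := []rune(s)
	if len(r) < 3 {
		return []string{s}
	}
	seen := make(map[string]bool)
	var out []string
	for i := 0; i+3 <= len(r); i++ {
		g := string(r[i : i+3])
		if !seen[g] {
			seen[g] = true
			out = append(out, g)
		}
	}
	return out
}

type trigramIndex struct {
	size     map[string]int             // term -> number of unique trigrams
	postings map[string]map[string]bool // trigram -> terms containing it
}

func newTrigramIndex() *trigramIndex {
	return &trigramIndex{
		size:     make(map[string]int),
		postings: make(map[string]map[string]bool),
	}
}

func (ti *trigramIndex) Add(term string) {
	gs := grams(term)
	ti.size[term] = len(gs)
	for _, g := range gs {
		if ti.postings[g] == nil {
			ti.postings[g] = make(map[string]bool)
		}
		ti.postings[g][term] = true
	}
}

// FuzzyMatch returns terms whose Jaccard similarity with query is at or above
// threshold. Terms sharing no trigram with the query are never scored, so a
// threshold of 0 does not return the whole index in this sketch.
func (ti *trigramIndex) FuzzyMatch(query string, threshold float64) []string {
	qs := grams(query)
	shared := make(map[string]int) // term -> size of trigram intersection
	for _, g := range qs {
		for term := range ti.postings[g] {
			shared[term]++
		}
	}
	var out []string
	for term, inter := range shared {
		union := len(qs) + ti.size[term] - inter
		if float64(inter)/float64(union) >= threshold {
			out = append(out, term)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	ti := newTrigramIndex()
	for _, t := range []string{"memento", "memory", "graph"} {
		ti.Add(t)
	}
	fmt.Println(ti.FuzzyMatch("mement", 0.5))
	fmt.Println(ti.FuzzyMatch("mement", 0.1))
}
```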

type VectorIndex ¶

type VectorIndex struct {
	// contains filtered or unexported fields
}

VectorIndex stores per-chunk embeddings and supports Add/Remove/Search with case-insensitive page-name matching.

func NewVectorIndex ¶

func NewVectorIndex(model *embed.Model) *VectorIndex

NewVectorIndex creates an empty vector index backed by the given embedding model.

func (*VectorIndex) Add ¶

func (vi *VectorIndex) Add(page pages.Page) error

Add chunks the page, embeds each chunk, and stores the resulting vectors. If the page was previously indexed its old chunks are replaced.

func (*VectorIndex) AddFromCache ¶

func (vi *VectorIndex) AddFromCache(page pages.Page, chunks []CachedChunk)

AddFromCache loads pre-computed chunk vectors for a page, bypassing embedding. If the page was previously indexed its old chunks are replaced.

func (*VectorIndex) Remove ¶

func (vi *VectorIndex) Remove(name string)

Remove removes all stored chunks for the named page (case-insensitive).

func (*VectorIndex) Search ¶

func (vi *VectorIndex) Search(query string, limit int) []VectorResult

Search embeds the query, scores all stored chunk vectors via cosine similarity, deduplicates to one result per page (best-scoring chunk wins), then returns up to limit results sorted by score descending. Returns nil when the index is empty.
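
The scoring and deduplication steps can be sketched without an embedding model by passing a ready-made query vector. The storedChunk and vectorResult types and the search function are hypothetical stand-ins; lowercasing the page name as the dedup key is an assumption based on the case-insensitive matching mentioned for VectorIndex.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"strings"
)

type storedChunk struct {
	Page string
	Line int // 1-indexed start line of the chunk
	Vec  []float32
}

type vectorResult struct {
	Page  string
	Score float64
	Line  int
}

// cosine returns the cosine similarity of a and b, or 0 when the lengths
// differ or either vector has zero norm.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// search scores every chunk, keeps the best chunk per page, and returns up to
// limit results sorted by score descending.
func search(chunks []storedChunk, query []float32, limit int) []vectorResult {
	if len(chunks) == 0 {
		return nil
	}
	best := make(map[string]vectorResult)
	for _, c := range chunks {
		key := strings.ToLower(c.Page)
		s := cosine(query, c.Vec)
		if cur, ok := best[key]; !ok || s > cur.Score {
			best[key] = vectorResult{Page: c.Page, Score: s, Line: c.Line}
		}
	}
	out := make([]vectorResult, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Page < out[j].Page
	})
	if limit >= 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	chunks := []storedChunk{
		{Page: "Go", Line: 1, Vec: []float32{1, 0}},
		{Page: "Go", Line: 10, Vec: []float32{0.6, 0.8}},
		{Page: "Rust", Line: 1, Vec: []float32{0, 1}},
	}
	for _, r := range search(chunks, []float32{0, 1}, 10) {
		fmt.Printf("%s line %d score %.2f\n", r.Page, r.Line, r.Score)
	}
}
```

Returning the line of the winning chunk is what lets a caller anchor the result at the most relevant part of the page.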

type VectorResult ¶

type VectorResult struct {
	Page  string
	Score float64 // cosine similarity, range [-1, 1] but typically [0, 1] for text
	Line  int     // 1-indexed start line of the best-matching chunk
}

VectorResult is a single result from a vector similarity search.
