engine

package
v0.1.3
Published: Apr 28, 2026 License: MIT Imports: 10 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func CosineSimilarity

func CosineSimilarity(a, b []float32) float64

CosineSimilarity computes the cosine similarity between two float32 vectors.

func LexicalScore

func LexicalScore(query, desc string) float64

LexicalScore computes Jaccard similarity with synonym expansion, context-aware stopwords, role boosting, and prefix matching. It returns a value in [0, 1].
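The core of this score is Jaccard similarity over token sets. The sketch below shows only that core and deliberately omits synonym expansion, stopwords, role boosting, and prefix matching. The names tokenSet and jaccard are hypothetical, and whitespace tokenization with lowercasing is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenSet lowercases s and splits it on whitespace into a set of tokens.
// Assumption: the real tokenizer may differ.
func tokenSet(s string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, tok := range strings.Fields(strings.ToLower(s)) {
		set[tok] = struct{}{}
	}
	return set
}

// jaccard returns |Q ∩ D| / |Q ∪ D| for the token sets of query and desc.
// It returns 0 when both inputs are empty.
func jaccard(query, desc string) float64 {
	q, d := tokenSet(query), tokenSet(desc)
	if len(q) == 0 && len(d) == 0 {
		return 0
	}
	inter := 0
	for tok := range q {
		if _, ok := d[tok]; ok {
			inter++
		}
	}
	union := len(q) + len(d) - inter
	return float64(inter) / float64(union)
}

func main() {
	// Two of three distinct tokens are shared, so the score is 2/3.
	fmt.Printf("%.2f\n", jaccard("submit button", "Submit form button"))
}
```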

func LexicalScoreWithFrequency added in v0.1.1

func LexicalScoreWithFrequency(query, desc string, ef *ElementFrequency) float64

LexicalScoreWithFrequency computes lexical similarity with optional snapshot-level inverse element frequency (IEF) weighting. Passing a nil ef keeps the default equal-weight behavior.

Types

type CombinedMatcher

type CombinedMatcher struct {

	// LexicalWeight and EmbeddingWeight should sum to 1.0 for
	// interpretable scores. Defaults: 0.6 / 0.4.
	LexicalWeight   float64
	EmbeddingWeight float64
	// contains filtered or unexported fields
}

CombinedMatcher fuses lexical and embedding scores:

score = LexicalWeight * lexical + EmbeddingWeight * embedding

With the default weights this is:

score = 0.6 * lexical + 0.4 * embedding
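The fusion formula can be sketched in a few lines. The weights type and its fuse method below are hypothetical and only mirror the documented formula. How CombinedMatcher obtains the two input scores is not shown here.

```go
package main

import "fmt"

// weights is a hypothetical mirror of the two exported CombinedMatcher fields.
type weights struct {
	Lexical   float64
	Embedding float64
}

// fuse applies score = Lexical*lexical + Embedding*embedding.
func (w weights) fuse(lexical, embedding float64) float64 {
	return w.Lexical*lexical + w.Embedding*embedding
}

func main() {
	w := weights{Lexical: 0.6, Embedding: 0.4} // documented defaults
	// A strong lexical match with a weak embedding match.
	fmt.Printf("%.2f\n", w.fuse(0.9, 0.2))
}
```

When the weights sum to 1.0 and both inputs lie in [0, 1], the fused score stays in [0, 1] as well, which is why that constraint makes scores easier to interpret.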

func NewCombinedMatcher

func NewCombinedMatcher(embedder Embedder) *CombinedMatcher

NewCombinedMatcher creates a matcher that fuses lexical and embedding strategies with default weights (0.6 lexical, 0.4 embedding).

func (*CombinedMatcher) Find

func (*CombinedMatcher) Strategy

func (c *CombinedMatcher) Strategy() string

type ElementFrequency added in v0.1.1

type ElementFrequency struct {
	// contains filtered or unexported fields
}

ElementFrequency holds per-snapshot token document frequencies. It is used to compute inverse element frequency (IEF) token weights.

func BuildElementFrequency added in v0.1.1

func BuildElementFrequency(elements []types.ElementDescriptor) *ElementFrequency

BuildElementFrequency creates and fills frequency statistics for one snapshot.

func (*ElementFrequency) Build added in v0.1.1

func (ef *ElementFrequency) Build(elements []types.ElementDescriptor)

Build recomputes token frequencies from a snapshot.

func (*ElementFrequency) IEF added in v0.1.1

func (ef *ElementFrequency) IEF(token string) float64

IEF returns inverse element frequency for a token.
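To make the idea concrete, here is a minimal sketch of per-snapshot document frequencies and an IEF weight. The exact formula used by the package is not documented here, so log(1 + N/df) is an assumption, as are the names elementFreq, buildFreq, and ief. Plain description strings stand in for types.ElementDescriptor.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// elementFreq is a hypothetical stand-in for ElementFrequency.
type elementFreq struct {
	total int            // number of elements in the snapshot
	df    map[string]int // number of elements containing each token
}

// buildFreq counts, for every token, how many descriptions contain it.
func buildFreq(descriptions []string) *elementFreq {
	ef := &elementFreq{total: len(descriptions), df: make(map[string]int)}
	for _, desc := range descriptions {
		seen := make(map[string]bool)
		for _, tok := range strings.Fields(strings.ToLower(desc)) {
			if !seen[tok] {
				seen[tok] = true
				ef.df[tok]++
			}
		}
	}
	return ef
}

// ief returns log(1 + N/df). Assumption: unseen tokens are treated as df = 1.
func (ef *elementFreq) ief(token string) float64 {
	df := ef.df[strings.ToLower(token)]
	if df == 0 {
		df = 1
	}
	return math.Log(1 + float64(ef.total)/float64(df))
}

func main() {
	ef := buildFreq([]string{"submit button", "cancel button", "search button"})
	// "button" occurs in every element, so it carries less weight than "submit".
	fmt.Printf("button=%.3f submit=%.3f\n", ef.ief("button"), ef.ief("submit"))
}
```

Tokens shared by most elements on a page say little about which element a query means, so down-weighting them lets the rarer, more distinctive tokens dominate the score.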

type Embedder

type Embedder interface {
	// Embed converts a batch of text strings into float32 vectors.
	// All returned vectors must have the same dimensionality.
	Embed(texts []string) ([][]float32, error)

	// Strategy returns the name of the embedding strategy (e.g. "hashing", "openai").
	Strategy() string
}

Embedder converts text into dense vectors. See NewHashingEmbedder.

type EmbeddingMatcher

type EmbeddingMatcher struct {
	// contains filtered or unexported fields
}

EmbeddingMatcher scores elements using cosine similarity on dense vectors produced by an Embedder.

func NewEmbeddingMatcher

func NewEmbeddingMatcher(e Embedder) *EmbeddingMatcher

NewEmbeddingMatcher creates an embedding-based matcher.

func NewEmbeddingMatcherWithNeighborWeight added in v0.1.1

func NewEmbeddingMatcherWithNeighborWeight(e Embedder, weight float64) *EmbeddingMatcher

NewEmbeddingMatcherWithNeighborWeight creates an embedding matcher and sets how much immediate neighbors influence each element embedding.
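One plausible reading of neighbor weighting is that each element's vector is mixed with the average of the vectors immediately before and after it. The sketch below illustrates that reading only. The function blendNeighbors is hypothetical, and both the blending formula and the definition of "neighbor" as adjacent slice positions are assumptions rather than the package's documented behavior.

```go
package main

import "fmt"

// blendNeighbors returns new vectors where each one is its original plus
// weight times the mean of its immediate neighbors (previous and next).
// Assumption: all vectors share the same dimensionality.
func blendNeighbors(vecs [][]float32, weight float32) [][]float32 {
	out := make([][]float32, len(vecs))
	for i, v := range vecs {
		blended := make([]float32, len(v))
		copy(blended, v)

		var neighbors [][]float32
		if i > 0 {
			neighbors = append(neighbors, vecs[i-1])
		}
		if i+1 < len(vecs) {
			neighbors = append(neighbors, vecs[i+1])
		}
		for _, n := range neighbors {
			for d := range blended {
				blended[d] += weight * n[d] / float32(len(neighbors))
			}
		}
		out[i] = blended
	}
	return out
}

func main() {
	vecs := [][]float32{{1, 0}, {0, 1}, {1, 1}}
	fmt.Println(blendNeighbors(vecs, 0.5))
}
```

With a weight of 0 the vectors are returned unchanged, so in this sketch the parameter acts as a dial between purely local and context-aware embeddings.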

func (*EmbeddingMatcher) Find

func (*EmbeddingMatcher) Strategy

func (m *EmbeddingMatcher) Strategy() string

type HashingEmbedder

type HashingEmbedder struct {
	// contains filtered or unexported fields
}

HashingEmbedder uses the hashing trick (Weinberger et al. 2009) to produce fixed-dimension vectors from word unigrams and character n-grams. It needs no vocabulary construction and has no external dependencies.

func NewHashingEmbedder

func NewHashingEmbedder(dim int) *HashingEmbedder

NewHashingEmbedder creates a hashing-based embedder with the given vector dimensionality. Default: 128.
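The hashing trick itself can be sketched briefly. In the version below, each word and each character trigram is hashed with FNV-1a into one of dim buckets, with one hash bit used as a sign. The type hashEmbedder, the choice of FNV-1a, byte-level trigrams, and the "w:" and "c:" feature prefixes are all assumptions made for illustration. The package's actual hash function and n-gram sizes are not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
	"strings"
)

// hashEmbedder is a hypothetical stand-in that has the same method set
// as the Embedder interface.
type hashEmbedder struct{ dim int }

// bucket maps a feature string to a vector index and a +1/-1 sign.
func (h hashEmbedder) bucket(feature string) (int, float32) {
	f := fnv.New32a()
	f.Write([]byte(feature))
	sum := f.Sum32()
	sign := float32(1)
	if sum&0x80000000 != 0 { // top bit decides the sign
		sign = -1
	}
	return int(sum % uint32(h.dim)), sign
}

// Embed hashes word unigrams and byte trigrams of each text into a vector.
func (h hashEmbedder) Embed(texts []string) ([][]float32, error) {
	if h.dim <= 0 {
		return nil, errors.New("dimension must be positive")
	}
	out := make([][]float32, len(texts))
	for i, text := range texts {
		vec := make([]float32, h.dim)
		for _, word := range strings.Fields(strings.ToLower(text)) {
			idx, sign := h.bucket("w:" + word)
			vec[idx] += sign
			for j := 0; j+3 <= len(word); j++ {
				idx, sign := h.bucket("c:" + word[j:j+3])
				vec[idx] += sign
			}
		}
		out[i] = vec
	}
	return out, nil
}

func (h hashEmbedder) Strategy() string { return "hashing" }

func main() {
	e := hashEmbedder{dim: 128}
	vecs, err := e.Embed([]string{"submit button", "search box"})
	if err != nil {
		panic(err)
	}
	fmt.Println(e.Strategy(), len(vecs), len(vecs[0]))
}
```

Because the bucket index comes straight from the hash, no vocabulary has to be built or stored, and the same text always maps to the same vector.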

func (*HashingEmbedder) Embed

func (h *HashingEmbedder) Embed(texts []string) ([][]float32, error)

func (*HashingEmbedder) EmbedContext added in v0.1.2

func (h *HashingEmbedder) EmbedContext(ctx context.Context, texts []string) ([][]float32, error)

func (*HashingEmbedder) Strategy

func (h *HashingEmbedder) Strategy() string

type LexicalMatcher

type LexicalMatcher struct{}

LexicalMatcher scores elements using Jaccard similarity with synonym expansion, context-aware stopwords, role boosting, and prefix matching.

func NewLexicalMatcher

func NewLexicalMatcher() *LexicalMatcher

NewLexicalMatcher creates a stateless lexical matcher.

func (*LexicalMatcher) Find

func (*LexicalMatcher) Strategy

func (m *LexicalMatcher) Strategy() string

type OrdinalConstraint added in v0.1.2

type OrdinalConstraint struct {
	HasOrdinal bool
	Last       bool
	Position   int
}

type ParsedQuery added in v0.1.2

type ParsedQuery = types.ParsedQuery

Query grammar:

<positive tokens> [NEGATIVE_TRIGGER <negative token>...]+

A NEGATIVE_TRIGGER is one of: not, without, exclude, excluding, except, no, ignore. After a trigger, all following tokens are classified as negative until another trigger or the end of the query.

func ParseQuery added in v0.1.2

func ParseQuery(raw string) ParsedQuery

ParseQuery tokenizes and classifies tokens into positive and negative terms.
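The grammar above can be sketched as a single pass over the tokens. The parsed type and its Positive and Negative fields are hypothetical, since the fields of types.ParsedQuery are not shown here. Lowercasing and whitespace tokenization are also assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// negativeTriggers lists the trigger words named in the query grammar.
var negativeTriggers = map[string]bool{
	"not": true, "without": true, "exclude": true, "excluding": true,
	"except": true, "no": true, "ignore": true,
}

// parsed is a hypothetical result type, not types.ParsedQuery.
type parsed struct {
	Positive []string
	Negative []string
}

// parse classifies tokens as positive until the first trigger and as
// negative afterwards. Trigger words themselves are dropped.
func parse(raw string) parsed {
	var p parsed
	negative := false
	for _, tok := range strings.Fields(strings.ToLower(raw)) {
		if negativeTriggers[tok] {
			negative = true
			continue
		}
		if negative {
			p.Negative = append(p.Negative, tok)
		} else {
			p.Positive = append(p.Positive, tok)
		}
	}
	return p
}

func main() {
	p := parse("submit button not disabled")
	fmt.Println(p.Positive, p.Negative)
}
```

Since every token after the first trigger is negative, a later trigger only starts a new negative group and never switches back to positive terms.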

type QueryContext added in v0.1.2

type QueryContext struct {
	Base     ParsedQuery
	Exclude  []string
	HasScope bool
	Ordinal  OrdinalConstraint
}

func ParseQueryContext added in v0.1.2

func ParseQueryContext(raw string) QueryContext
