Documentation ¶
Index ¶
- func CosineSimilarity(a, b []float32) float64
- func LexicalScore(query, desc string) float64
- func LexicalScoreWithFrequency(query, desc string, ef *ElementFrequency) float64
- type CombinedMatcher
- type ElementFrequency
- type Embedder
- type EmbeddingMatcher
- type HashingEmbedder
- type LexicalMatcher
- type OrdinalConstraint
- type ParsedQuery
- type QueryContext
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CosineSimilarity ¶
CosineSimilarity computes the cosine similarity between two float32 vectors.
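For reference, cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. The sketch below is a standalone illustration, not the package's source: the name `cosine` is hypothetical, and returning 0 for mismatched lengths or zero-norm vectors is an assumption about edge-case handling.

```go
package main

import (
	"fmt"
	"math"
)

// cosine is a hypothetical stand-in for CosineSimilarity.
// Assumption: mismatched lengths, empty input, or a zero-norm vector yield 0.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosine([]float32{1, 0}, []float32{1, 0})) // parallel vectors
	fmt.Println(cosine([]float32{1, 0}, []float32{0, 1})) // orthogonal vectors
}
```

Accumulating in float64 keeps rounding error low even though the inputs are float32, which matches the float64 return type in the signature.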
func LexicalScore ¶
LexicalScore computes Jaccard similarity with synonym expansion, context-aware stopwords, role boosting, and prefix matching. The result lies in [0, 1].
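The core of this score is Jaccard similarity: the size of the intersection of two token sets divided by the size of their union. The sketch below shows only that baseline. The names `tokenSet` and `jaccard` are hypothetical, tokenization by lowercasing and splitting on whitespace is an assumption, and the synonym expansion, stopword handling, role boosting, and prefix matching described above are deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenSet lowercases s and splits it on whitespace into a set of tokens.
func tokenSet(s string) map[string]struct{} {
	set := make(map[string]struct{})
	for _, t := range strings.Fields(strings.ToLower(s)) {
		set[t] = struct{}{}
	}
	return set
}

// jaccard returns |A ∩ B| / |A ∪ B| for the token sets of query and desc.
// Assumption: two empty inputs score 0 rather than 1.
func jaccard(query, desc string) float64 {
	a, b := tokenSet(query), tokenSet(desc)
	if len(a) == 0 && len(b) == 0 {
		return 0
	}
	inter := 0
	for t := range a {
		if _, ok := b[t]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

func main() {
	// Two shared tokens out of three distinct tokens overall.
	fmt.Println(jaccard("submit button", "button Submit form"))
}
```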
func LexicalScoreWithFrequency ¶ added in v0.1.1
func LexicalScoreWithFrequency(query, desc string, ef *ElementFrequency) float64
LexicalScoreWithFrequency computes lexical similarity with optional snapshot-level IEF weighting (nil keeps default equal-weight behavior).
Types ¶
type CombinedMatcher ¶
type CombinedMatcher struct {
// LexicalWeight and EmbeddingWeight should sum to 1.0 for
// interpretable scores. Defaults: 0.6 / 0.4.
LexicalWeight float64
EmbeddingWeight float64
// contains filtered or unexported fields
}
CombinedMatcher fuses lexical and embedding scores:
score = LexicalWeight * lexical + EmbeddingWeight * embedding
With the default weights this is score = 0.6 * lexical + 0.4 * embedding.
func NewCombinedMatcher ¶
func NewCombinedMatcher(embedder Embedder) *CombinedMatcher
NewCombinedMatcher creates a matcher that fuses lexical and embedding strategies with default weights (0.6 lexical, 0.4 embedding).
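As a worked instance of the fusion formula, the sketch below applies the default weights to one pair of scores. The helper `fuse` is hypothetical and exists only to show the arithmetic; it is not part of this package.

```go
package main

import "fmt"

// fuse is a hypothetical helper that mirrors the documented formula:
// score = LexicalWeight * lexical + EmbeddingWeight * embedding.
func fuse(lexical, embedding, lexicalWeight, embeddingWeight float64) float64 {
	return lexicalWeight*lexical + embeddingWeight*embedding
}

func main() {
	// Default weights: 0.6 lexical, 0.4 embedding.
	fmt.Printf("%.2f\n", fuse(0.5, 1.0, 0.6, 0.4)) // prints 0.70
}
```

Because both input scores lie in [0, 1], weights that sum to 1.0 keep the fused score in [0, 1] as well, which is why the field comment recommends that constraint.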
func (*CombinedMatcher) Find ¶
func (c *CombinedMatcher) Find(ctx context.Context, query string, elements []types.ElementDescriptor, opts types.FindOptions) (types.FindResult, error)
func (*CombinedMatcher) Strategy ¶
func (c *CombinedMatcher) Strategy() string
type ElementFrequency ¶ added in v0.1.1
type ElementFrequency struct {
// contains filtered or unexported fields
}
ElementFrequency holds per-snapshot token document frequencies. It is used to compute inverse element frequency (IEF) token weights.
func BuildElementFrequency ¶ added in v0.1.1
func BuildElementFrequency(elements []types.ElementDescriptor) *ElementFrequency
BuildElementFrequency creates and fills frequency statistics for one snapshot.
func (*ElementFrequency) Build ¶ added in v0.1.1
func (ef *ElementFrequency) Build(elements []types.ElementDescriptor)
Build recomputes token frequencies from a snapshot.
func (*ElementFrequency) IEF ¶ added in v0.1.1
func (ef *ElementFrequency) IEF(token string) float64
IEF returns inverse element frequency for a token.
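The exact IEF formula is not stated here. The sketch below illustrates the general idea under a loudly labeled assumption: it uses a smoothed IDF-style weight, log(1 + N / (1 + df)), where N is the number of elements in the snapshot and df is the number of elements containing the token. The type `freq` and its methods are hypothetical stand-ins that take plain description strings instead of types.ElementDescriptor.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// freq is a hypothetical stand-in for ElementFrequency.
type freq struct {
	n  int            // number of elements in the snapshot
	df map[string]int // number of elements that contain each token
}

// buildFreq counts, for every token, how many descriptions contain it.
func buildFreq(descs []string) *freq {
	f := &freq{n: len(descs), df: make(map[string]int)}
	for _, d := range descs {
		seen := make(map[string]bool)
		for _, t := range strings.Fields(strings.ToLower(d)) {
			if !seen[t] {
				seen[t] = true
				f.df[t]++
			}
		}
	}
	return f
}

// ief returns a smoothed inverse-frequency weight.
// Assumption: the formula is illustrative, not the package's actual one.
func (f *freq) ief(token string) float64 {
	return math.Log(1 + float64(f.n)/float64(1+f.df[token]))
}

func main() {
	f := buildFreq([]string{"button submit", "button cancel", "link help"})
	// "button" appears in two elements, "help" in one, so "help" weighs more.
	fmt.Println(f.ief("button") < f.ief("help"))
}
```

The point of any such weighting is that tokens shared by many elements in a snapshot discriminate poorly between them, so rarer tokens contribute more to the lexical score.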
type Embedder ¶
type Embedder interface {
// Embed converts a batch of text strings into float32 vectors.
// All returned vectors must have the same dimensionality.
Embed(texts []string) ([][]float32, error)
// Strategy returns the name of the embedding strategy (e.g. "hashing", "openai").
Strategy() string
}
Embedder converts text into dense vectors. See NewHashingEmbedder.
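To show what satisfying this interface involves, here is a minimal sketch of a custom implementation. The interface is copied locally so the example compiles on its own; `letterCountEmbedder` and its "letter-count" strategy name are hypothetical and far too crude for real matching.

```go
package main

import "fmt"

// Embedder is copied from the documentation so this sketch is self-contained.
type Embedder interface {
	Embed(texts []string) ([][]float32, error)
	Strategy() string
}

// letterCountEmbedder is a hypothetical toy embedder with one dimension
// per ASCII letter a-z, counted case-insensitively.
type letterCountEmbedder struct{}

// Embed returns one 26-dimensional vector per input text.
func (letterCountEmbedder) Embed(texts []string) ([][]float32, error) {
	out := make([][]float32, len(texts))
	for i, t := range texts {
		vec := make([]float32, 26)
		for _, r := range t {
			switch {
			case r >= 'a' && r <= 'z':
				vec[r-'a']++
			case r >= 'A' && r <= 'Z':
				vec[r-'A']++
			}
		}
		out[i] = vec
	}
	return out, nil
}

// Strategy names this embedding strategy.
func (letterCountEmbedder) Strategy() string { return "letter-count" }

func main() {
	var e Embedder = letterCountEmbedder{}
	vecs, err := e.Embed([]string{"Save", "Cancel"})
	if err != nil {
		panic(err)
	}
	fmt.Println(e.Strategy(), len(vecs), len(vecs[0]))
}
```

Every vector has the same length regardless of input, which is the one contract the interface comment spells out.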
type EmbeddingMatcher ¶
type EmbeddingMatcher struct {
// contains filtered or unexported fields
}
EmbeddingMatcher scores elements using cosine similarity on dense vectors produced by an Embedder.
func NewEmbeddingMatcher ¶
func NewEmbeddingMatcher(e Embedder) *EmbeddingMatcher
NewEmbeddingMatcher creates an embedding-based matcher.
func NewEmbeddingMatcherWithNeighborWeight ¶ added in v0.1.1
func NewEmbeddingMatcherWithNeighborWeight(e Embedder, weight float64) *EmbeddingMatcher
NewEmbeddingMatcherWithNeighborWeight creates an embedding matcher and sets how much immediate neighbors influence each element embedding.
func (*EmbeddingMatcher) Find ¶
func (m *EmbeddingMatcher) Find(ctx context.Context, query string, elements []types.ElementDescriptor, opts types.FindOptions) (types.FindResult, error)
func (*EmbeddingMatcher) Strategy ¶
func (m *EmbeddingMatcher) Strategy() string
type HashingEmbedder ¶
type HashingEmbedder struct {
// contains filtered or unexported fields
}
HashingEmbedder uses the hashing trick (Weinberger et al. 2009) to produce fixed-dimension vectors from word unigrams and character n-grams. It needs no vocabulary construction and has zero external dependencies.
func NewHashingEmbedder ¶
func NewHashingEmbedder(dim int) *HashingEmbedder
NewHashingEmbedder creates a hashing-based embedder with the given vector dimensionality. Default: 128.
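The hashing trick can be sketched in a few lines: each feature string is hashed to a bucket index, and a signed count is accumulated there. The function `hashEmbed` below is hypothetical and makes several assumptions that may differ from HashingEmbedder: FNV-1a as the hash, character trigrams as the n-grams, the top hash bit as the sign, L2 normalization of the result, and a fallback to 128 dimensions when `dim` is not positive.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"strings"
)

// hashEmbed is a hypothetical sketch of a hashing embedder for one text.
func hashEmbed(text string, dim int) []float32 {
	if dim <= 0 {
		dim = 128 // assumption: non-positive dim falls back to the default
	}
	vec := make([]float32, dim)
	add := func(feature string) {
		h := fnv.New32a()
		h.Write([]byte(feature))
		sum := h.Sum32()
		idx := int(sum % uint32(dim))
		// The top bit picks the sign, a simplification of the separate
		// sign hash used in the original formulation.
		if sum&(1<<31) != 0 {
			vec[idx]--
		} else {
			vec[idx]++
		}
	}
	for _, w := range strings.Fields(strings.ToLower(text)) {
		add("w:" + w) // word unigram
		r := []rune(w)
		for i := 0; i+3 <= len(r); i++ {
			add("c:" + string(r[i:i+3])) // character trigram
		}
	}
	// L2-normalize so cosine similarity reduces to a dot product.
	var norm float64
	for _, v := range vec {
		norm += float64(v) * float64(v)
	}
	if norm > 0 {
		scale := float32(1 / math.Sqrt(norm))
		for i := range vec {
			vec[i] *= scale
		}
	}
	return vec
}

func main() {
	v := hashEmbed("Submit order", 128)
	fmt.Println(len(v)) // prints 128
}
```

Character n-grams are what let near-miss spellings such as "submit" and "submitting" land in overlapping buckets even though their word unigrams differ.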
func (*HashingEmbedder) Embed ¶
func (h *HashingEmbedder) Embed(texts []string) ([][]float32, error)
func (*HashingEmbedder) EmbedContext ¶ added in v0.1.2
func (*HashingEmbedder) Strategy ¶
func (h *HashingEmbedder) Strategy() string
type LexicalMatcher ¶
type LexicalMatcher struct{}
LexicalMatcher scores elements using Jaccard similarity with synonym expansion, context-aware stopwords, role boosting, and prefix matching.
func NewLexicalMatcher ¶
func NewLexicalMatcher() *LexicalMatcher
NewLexicalMatcher creates a stateless lexical matcher.
func (*LexicalMatcher) Find ¶
func (m *LexicalMatcher) Find(ctx context.Context, query string, elements []types.ElementDescriptor, opts types.FindOptions) (types.FindResult, error)
func (*LexicalMatcher) Strategy ¶
func (m *LexicalMatcher) Strategy() string
type OrdinalConstraint ¶ added in v0.1.2
type ParsedQuery ¶ added in v0.1.2
type ParsedQuery = types.ParsedQuery
Query grammar:
<positive tokens> [NEGATIVE_TRIGGER <negative token>...]+
A NEGATIVE_TRIGGER is one of: not, without, exclude, excluding, except, no, ignore. After a trigger, all following tokens are classified as negative until another trigger or the end of the query.
func ParseQuery ¶ added in v0.1.2
func ParseQuery(raw string) ParsedQuery
ParseQuery tokenizes and classifies tokens into positive and negative terms.
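The grammar above can be illustrated with a small state machine: tokens are positive until the first trigger, and negative from then on. The function `splitQuery` is a hypothetical stand-in that returns two slices, because the fields of types.ParsedQuery are not shown here; lowercasing and whitespace tokenization are also assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// negativeTriggers lists the trigger words named in the query grammar.
var negativeTriggers = map[string]bool{
	"not": true, "without": true, "exclude": true, "excluding": true,
	"except": true, "no": true, "ignore": true,
}

// splitQuery is a hypothetical stand-in for ParseQuery.
// Triggers themselves are dropped; every token after one is negative.
func splitQuery(raw string) (positive, negative []string) {
	inNegative := false
	for _, tok := range strings.Fields(strings.ToLower(raw)) {
		if negativeTriggers[tok] {
			inNegative = true
			continue
		}
		if inNegative {
			negative = append(negative, tok)
		} else {
			positive = append(positive, tok)
		}
	}
	return positive, negative
}

func main() {
	pos, neg := splitQuery("submit button not disabled")
	fmt.Println(pos, neg) // prints [submit button] [disabled]
}
```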
type QueryContext ¶ added in v0.1.2
type QueryContext struct {
Base ParsedQuery
Exclude []string
HasScope bool
Ordinal OrdinalConstraint
}
func ParseQueryContext ¶ added in v0.1.2
func ParseQueryContext(raw string) QueryContext