Semantic search over a document corpus using model embeddings and cosine similarity.
This example embeds a set of documents and a user query, then ranks documents by similarity to find the most relevant matches. This is the retrieval component of a RAG (retrieval-augmented generation) pipeline.
How it works
1. Loads a GGUF model using the zerfoo.Load one-line API.
2. Embeds all corpus documents with model.Embed.
3. Embeds the user query.
4. Ranks documents by cosine similarity using Embedding.CosineSimilarity.
Command embedding-search demonstrates semantic search using model embeddings.
It embeds a corpus of documents and a query, then ranks documents by cosine
similarity to find the most relevant matches. This is the retrieval half of
a RAG (retrieval-augmented generation) system.