Documentation
Index ¶
- func ChunkText(text string, chunkSize, overlap int) []string
- func ChunkWithMarkdownHierarchy(content string) []string
- func SplitMarkdownBySections(markdown string) []string
- func SplitTextWithDelimiter(text string, delimiter string) []string
- type MarkdownChunk
- func ParseMarkdownHierarchy(content string) []MarkdownChunk
- type MemoryVectorStore
- func (mvs *MemoryVectorStore) GetAll() ([]VectorRecord, error)
- func (mvs *MemoryVectorStore) Load(storeFilePath string) error
- func (mvs *MemoryVectorStore) Persist(storeFilePath string) error
- func (mvs *MemoryVectorStore) ResetMemory() error
- func (mvs *MemoryVectorStore) Save(vectorRecord VectorRecord) (VectorRecord, error)
- func (mvs *MemoryVectorStore) SearchSimilarities(embeddingFromQuestion VectorRecord, limit float64) ([]VectorRecord, error)
- func (mvs *MemoryVectorStore) SearchTopNSimilarities(embeddingFromQuestion VectorRecord, limit float64, max int) ([]VectorRecord, error)
- type VectorRecord
- type VectorStore
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ChunkText ¶
ChunkText takes a text string and divides it into chunks of a specified size with a given overlap. It returns a slice of strings, where each string represents a chunk of the original text.
Parameters:
- text: The input text to be chunked.
- chunkSize: The size of each chunk.
- overlap: The amount of overlap between consecutive chunks.
Returns:
- []string: A slice of strings representing the chunks of the original text.
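A minimal sketch of fixed-size chunking with overlap, to make the relationship between chunkSize and overlap concrete. The function name chunkTextSketch is hypothetical, and the details are assumptions rather than the package's confirmed behavior: it counts runes instead of bytes, advances by chunkSize minus overlap, and returns nil when the arguments cannot make progress.

```go
package main

import "fmt"

// chunkTextSketch is a hypothetical stand-in for ChunkText.
// Assumption: each chunk starts (chunkSize - overlap) runes after the previous one.
func chunkTextSketch(text string, chunkSize, overlap int) []string {
	// Guard against arguments that would never advance through the text.
	if chunkSize <= 0 || overlap < 0 || overlap >= chunkSize {
		return nil
	}
	runes := []rune(text) // work on runes so multi-byte characters are not split
	step := chunkSize - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
	}
	return chunks
}

func main() {
	for i, c := range chunkTextSketch("abcdefghij", 4, 1) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

With a chunk size of 4 and an overlap of 1, each chunk repeats the last character of the previous one, so context that straddles a boundary appears in both chunks.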
func ChunkWithMarkdownHierarchy ¶
ChunkWithMarkdownHierarchy processes markdown content into formatted chunks that carry their hierarchical context.
func SplitMarkdownBySections ¶
SplitMarkdownBySections splits markdown content into sections at header boundaries.
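Splitting at header boundaries could look roughly like the sketch below. The names splitBySectionsSketch and isHeader are hypothetical, and the rules are assumptions: a header is any line made of one to six '#' characters followed by a space, each section keeps its header line, and empty sections are dropped. The sketch does not skip '#' lines inside fenced code blocks.

```go
package main

import (
	"fmt"
	"strings"
)

// isHeader reports whether a line looks like an ATX header:
// one to six '#' characters followed by a space.
func isHeader(line string) bool {
	rest := strings.TrimLeft(line, "#")
	level := len(line) - len(rest)
	return level >= 1 && level <= 6 && strings.HasPrefix(rest, " ")
}

// splitBySectionsSketch is a hypothetical stand-in for SplitMarkdownBySections.
func splitBySectionsSketch(markdown string) []string {
	var sections []string
	var current []string
	// flush stores the lines collected so far as one section, skipping empty ones.
	flush := func() {
		if section := strings.TrimSpace(strings.Join(current, "\n")); section != "" {
			sections = append(sections, section)
		}
		current = nil
	}
	for _, line := range strings.Split(markdown, "\n") {
		if isHeader(line) {
			flush() // a header closes the previous section
		}
		current = append(current, line)
	}
	flush()
	return sections
}

func main() {
	md := "# Title\nintro\n## Setup\nsteps\n## Usage\nexamples"
	for i, s := range splitBySectionsSketch(md) {
		fmt.Printf("%d: %q\n", i, s)
	}
}
```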
func SplitTextWithDelimiter ¶
SplitTextWithDelimiter splits the given text using the specified delimiter and returns a slice of strings.
Parameters:
- text: The text to be split.
- delimiter: The delimiter used to split the text.
Returns:
- []string: A slice of strings containing the split parts of the text.
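As an illustration of delimiter-based splitting, the sketch below assumes the behavior matches the standard library's strings.Split. The wrapper name splitWithDelimiterSketch is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitWithDelimiterSketch is a hypothetical stand-in for SplitTextWithDelimiter.
// Assumption: it behaves like strings.Split.
func splitWithDelimiterSketch(text, delimiter string) []string {
	return strings.Split(text, delimiter)
}

func main() {
	doc := "first part\n---\nsecond part\n---\nthird part"
	for i, part := range splitWithDelimiterSketch(doc, "\n---\n") {
		fmt.Printf("%d: %q\n", i, part)
	}
}
```

Under that assumption, a text that does not contain the delimiter comes back as a single-element slice holding the whole text.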
Types ¶
type MarkdownChunk ¶
type MarkdownChunk struct {
Header string
Content string
Level int
Prefix string
ParentLevel int
ParentHeader string
ParentPrefix string
Hierarchy string
SimpleMetaData string // Additional metadata if needed
Metadata map[string]interface{} // additional metadata
KeyWords []string // Keywords that could be extracted from the content
}
MarkdownChunk represents a parsed markdown section with hierarchical context.
func ParseMarkdownHierarchy ¶
func ParseMarkdownHierarchy(content string) []MarkdownChunk
ParseMarkdownHierarchy parses the given markdown content and returns a slice of MarkdownChunk structs that preserve the hierarchical context.
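One way to preserve hierarchical context is to keep a stack of the currently open headers while scanning the document. The sketch below is an illustration under assumptions, not the package's parser: chunkSketch mirrors only a subset of MarkdownChunk's fields, parseHierarchySketch is a hypothetical name, and the " > " separator in Hierarchy is invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkSketch mirrors a subset of MarkdownChunk's fields.
type chunkSketch struct {
	Header       string
	Content      string
	Level        int
	ParentHeader string
	Hierarchy    string
}

// parseHierarchySketch is a hypothetical stand-in for ParseMarkdownHierarchy.
func parseHierarchySketch(content string) []chunkSketch {
	var chunks []chunkSketch
	var stack []chunkSketch // currently open headers, outermost first
	for _, line := range strings.Split(content, "\n") {
		title := strings.TrimLeft(line, "#")
		level := len(line) - len(title)
		if level < 1 || level > 6 || !strings.HasPrefix(title, " ") {
			// Body line: attach it to the most recent chunk.
			// Lines before the first header are dropped in this sketch.
			if len(chunks) > 0 {
				chunks[len(chunks)-1].Content += line + "\n"
			}
			continue
		}
		// Close every open header at the same or a deeper level.
		for len(stack) > 0 && stack[len(stack)-1].Level >= level {
			stack = stack[:len(stack)-1]
		}
		c := chunkSketch{Header: strings.TrimSpace(title), Level: level}
		path := make([]string, 0, len(stack)+1)
		for _, parent := range stack {
			path = append(path, parent.Header)
		}
		if len(stack) > 0 {
			c.ParentHeader = stack[len(stack)-1].Header
		}
		c.Hierarchy = strings.Join(append(path, c.Header), " > ")
		stack = append(stack, c)
		chunks = append(chunks, c)
	}
	return chunks
}

func main() {
	md := "# Guide\nintro\n## Install\nrun it\n### Linux\napt\n## Usage\ncall it"
	for _, c := range parseHierarchySketch(md) {
		fmt.Printf("level %d, parent %q: %s\n", c.Level, c.ParentHeader, c.Hierarchy)
	}
}
```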
type MemoryVectorStore ¶
type MemoryVectorStore struct {
Records map[string]VectorRecord
}
MemoryVectorStore implements VectorStore using in-memory storage.
func (*MemoryVectorStore) GetAll ¶
func (mvs *MemoryVectorStore) GetAll() ([]VectorRecord, error)
GetAll returns all vector records stored in the MemoryVectorStore.
func (*MemoryVectorStore) Load ¶
func (mvs *MemoryVectorStore) Load(storeFilePath string) error
Load reads vector records from a JSON file and populates the MemoryVectorStore.
func (*MemoryVectorStore) Persist ¶
func (mvs *MemoryVectorStore) Persist(storeFilePath string) error
Persist saves the MemoryVectorStore to a JSON file.
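Persist and Load together form a JSON round trip. The sketch below illustrates that round trip with the standard library only. The lowercase types and the roundTrip helper are hypothetical, and the on-disk layout (the Records map marshaled directly as a JSON object keyed by ID) is an assumption, since the actual file format is not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// record mirrors the JSON-tagged fields of VectorRecord.
type record struct {
	Id        string    `json:"id"`
	Prompt    string    `json:"prompt"`
	Embedding []float64 `json:"embedding"`
}

// store is a hypothetical stand-in for MemoryVectorStore.
type store struct {
	Records map[string]record
}

// persist writes the records as indented JSON (assumed layout: a map keyed by ID).
func (s *store) persist(path string) error {
	data, err := json.MarshalIndent(s.Records, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

// load replaces the in-memory records with the contents of the file.
func (s *store) load(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	records := map[string]record{}
	if err := json.Unmarshal(data, &records); err != nil {
		return err
	}
	s.Records = records
	return nil
}

// roundTrip persists a store to a temporary file and loads it into a fresh store.
func roundTrip(in *store) (*store, error) {
	dir, err := os.MkdirTemp("", "vector-store-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "store.json")
	if err := in.persist(path); err != nil {
		return nil, err
	}
	out := &store{}
	if err := out.load(path); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	in := &store{Records: map[string]record{
		"a": {Id: "a", Prompt: "hello", Embedding: []float64{0.1, 0.2}},
	}}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%+v\n", out.Records["a"])
}
```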
func (*MemoryVectorStore) ResetMemory ¶
func (mvs *MemoryVectorStore) ResetMemory() error
ResetMemory clears all vector records from the MemoryVectorStore.
func (*MemoryVectorStore) Save ¶
func (mvs *MemoryVectorStore) Save(vectorRecord VectorRecord) (VectorRecord, error)
Save saves a vector record to the MemoryVectorStore. If the record does not have an ID, it generates a new UUID for it. It returns the saved vector record and an error if any occurred during the save operation. If the record already exists, it will be overwritten.
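The save semantics described above (assign an ID when missing, overwrite when the ID already exists) can be sketched as follows. The types and the newID helper are hypothetical. To stay within the standard library, the ID is a random version 4 UUID string built from crypto/rand, which is an assumption about the ID format and not necessarily how the package generates it.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// record mirrors part of VectorRecord.
type record struct {
	Id        string
	Prompt    string
	Embedding []float64
}

// store is a hypothetical stand-in for MemoryVectorStore.
type store struct {
	Records map[string]record
}

// newID returns a random version 4 UUID string (hypothetical helper).
func newID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// save stores the record, generating an ID when it has none.
// A record whose ID already exists replaces the stored one.
func (s *store) save(r record) (record, error) {
	if r.Id == "" {
		id, err := newID()
		if err != nil {
			return record{}, err
		}
		r.Id = id
	}
	if s.Records == nil {
		s.Records = map[string]record{}
	}
	s.Records[r.Id] = r
	return r, nil
}

func main() {
	s := &store{}
	first, err := s.save(record{Prompt: "hello"})
	if err != nil {
		fmt.Println("save failed:", err)
		return
	}
	// Saving again with the same ID overwrites instead of adding a record.
	if _, err := s.save(record{Id: first.Id, Prompt: "updated"}); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	fmt.Println(len(s.Records), s.Records[first.Id].Prompt)
}
```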
func (*MemoryVectorStore) SearchSimilarities ¶
func (mvs *MemoryVectorStore) SearchSimilarities(embeddingFromQuestion VectorRecord, limit float64) ([]VectorRecord, error)
SearchSimilarities searches for vector records in the MemoryVectorStore whose cosine similarity to the given record is greater than or equal to the given limit.
Parameters:
- embeddingFromQuestion: the vector record to compare similarities with.
- limit: the minimum cosine similarity threshold.
Returns:
- []VectorRecord: a slice of vector records whose cosine similarity is greater than or equal to the limit.
- error: an error if any occurred during the search.
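The threshold search can be illustrated with a small cosine similarity routine. This is a sketch under assumptions: the names cosine and searchSketch are hypothetical, records are held in a slice rather than the store's map so the output order is deterministic, and mismatched lengths or zero vectors are scored as 0.

```go
package main

import (
	"fmt"
	"math"
)

// record mirrors part of VectorRecord.
type record struct {
	Id               string
	Embedding        []float64
	CosineSimilarity float64
}

// cosine returns dot(a, b) / (|a| * |b|).
// Assumption: mismatched lengths and zero vectors score 0.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// searchSketch keeps the records whose similarity to the query is at least limit
// and stores the score on each returned copy.
func searchSketch(records []record, query record, limit float64) []record {
	var matches []record
	for _, r := range records {
		r.CosineSimilarity = cosine(query.Embedding, r.Embedding)
		if r.CosineSimilarity >= limit {
			matches = append(matches, r)
		}
	}
	return matches
}

func main() {
	records := []record{
		{Id: "same", Embedding: []float64{1, 0}},
		{Id: "close", Embedding: []float64{1, 1}},
		{Id: "orthogonal", Embedding: []float64{0, 1}},
	}
	query := record{Embedding: []float64{1, 0}}
	for _, r := range searchSketch(records, query, 0.7) {
		fmt.Printf("%s %.3f\n", r.Id, r.CosineSimilarity)
	}
}
```

Cosine similarity ranges from -1 to 1, so a limit near 1 keeps only near-identical directions, while a limit of 0 also admits orthogonal vectors.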
func (*MemoryVectorStore) SearchTopNSimilarities ¶
func (mvs *MemoryVectorStore) SearchTopNSimilarities(embeddingFromQuestion VectorRecord, limit float64, max int) ([]VectorRecord, error)
SearchTopNSimilarities searches for the top N similar vector records based on the given embedding from a question. It returns a slice of vector records and an error if any. The limit parameter specifies the minimum similarity score for a record to be considered similar. The max parameter specifies the maximum number of vector records to return.
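The interaction between limit and max can be sketched as a filter followed by a sort and a truncation. The names scored and topNSketch are hypothetical, the scores are taken as already computed, and ordering the results from most to least similar is an assumption that the description above implies but does not state.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical record that already carries its similarity score.
type scored struct {
	Id               string
	CosineSimilarity float64
}

// topNSketch keeps records scoring at least limit, orders them from most to
// least similar, and returns at most max of them.
func topNSketch(records []scored, limit float64, max int) []scored {
	var kept []scored
	for _, r := range records {
		if r.CosineSimilarity >= limit {
			kept = append(kept, r)
		}
	}
	// Stable sort so records with equal scores keep their input order.
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].CosineSimilarity > kept[j].CosineSimilarity
	})
	if max >= 0 && len(kept) > max {
		kept = kept[:max]
	}
	return kept
}

func main() {
	records := []scored{
		{Id: "a", CosineSimilarity: 0.2},
		{Id: "b", CosineSimilarity: 0.9},
		{Id: "c", CosineSimilarity: 0.6},
		{Id: "d", CosineSimilarity: 0.8},
	}
	for _, r := range topNSketch(records, 0.5, 2) {
		fmt.Printf("%s %.1f\n", r.Id, r.CosineSimilarity)
	}
}
```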
type VectorRecord ¶
type VectorRecord struct {
Id string `json:"id"`
Prompt string `json:"prompt"`
Embedding []float64 `json:"embedding"`
CosineSimilarity float64
}
VectorRecord represents a stored vector with metadata and similarity score.
type VectorStore ¶
type VectorStore interface {
GetAll() ([]VectorRecord, error)
Save(vectorRecord VectorRecord) (VectorRecord, error)
SearchSimilarities(embeddingFromQuestion VectorRecord, limit float64) ([]VectorRecord, error)
SearchTopNSimilarities(embeddingFromQuestion VectorRecord, limit float64, max int) ([]VectorRecord, error)
}
VectorStore defines the interface for storing and searching vector embeddings.