Documentation ¶
Index ¶
- func FormatGroundTruth(neighbors []int, distances []float64, k, maxResults int) string
- func FormatResults(results []core.Neighbor, maxResults int) string
- func LoadCSV(index core.Index, path string, skipHeader bool) error
- func LoadDataset(index core.Index, dir string) (testVectors [][]float32, trueNeighbors [][]int, trueDistances [][]float64, ...)
- func LoadTestDataset(dir string) ([][]float32, [][]int, [][]float64, error)
- func LoadTrainingVectors(dir string) (map[int][]float32, error)
- func RecallAtK(predicted []core.Neighbor, groundTruth []int, k int) float64
- func RunDataset(factory IndexFactory, dataset, root string, k, numQueries, maxResults int)
- type IndexFactory
- type QueryResult
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func FormatGroundTruth ¶
FormatGroundTruth returns a formatted string of ground-truth neighbor results. maxResults specifies how many items to include.
func FormatResults ¶
FormatResults returns a formatted string of neighbor results. maxResults specifies how many items to include.
func LoadDataset ¶
func LoadDataset(index core.Index, dir string) (
	testVectors [][]float32,
	trueNeighbors [][]int,
	trueDistances [][]float64,
	err error,
)
LoadDataset loads a dataset from a directory into the given index. The directory must contain the following files:
- train.csv (vectors to add to the index)
- test.csv (query vectors, not added to the index)
- neighbors.csv (expected neighbor IDs per query)
- distances.csv (expected distances per query)
func LoadTestDataset ¶
LoadTestDataset loads the test vectors and ground-truth data from the specified directory. It returns the test vectors, true neighbor IDs, and true distances (ground-truth).
func LoadTrainingVectors ¶
LoadTrainingVectors loads training vectors from "train.csv" in the specified directory. It returns a map from id (row number, 0-indexed) to the vector.
func RecallAtK ¶
RecallAtK computes Recall@k as the fraction of all ground-truth items that appear in the top k predictions.
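A minimal sketch of that definition follows. The Neighbor type here is a stand-in for core.Neighbor, whose field names are assumed, and recallAtK is an illustrative reimplementation, not the package's code. Returning 0 for empty ground truth and clamping k to the number of predictions are also assumptions.

```go
package main

import "fmt"

// Neighbor is a stand-in for core.Neighbor; the field names are assumed.
type Neighbor struct {
	ID       int
	Distance float64
}

// recallAtK returns the fraction of ground-truth IDs found among the
// first k predictions. The denominator is the full ground-truth size.
func recallAtK(predicted []Neighbor, groundTruth []int, k int) float64 {
	if len(groundTruth) == 0 {
		return 0 // assumed behavior for empty ground truth
	}
	if k > len(predicted) {
		k = len(predicted)
	}
	if k < 0 {
		k = 0
	}
	truth := make(map[int]struct{}, len(groundTruth))
	for _, id := range groundTruth {
		truth[id] = struct{}{}
	}
	hits := 0
	for _, n := range predicted[:k] {
		if _, ok := truth[n.ID]; ok {
			hits++
			delete(truth, n.ID) // count each ground-truth ID at most once
		}
	}
	return float64(hits) / float64(len(groundTruth))
}

func main() {
	predicted := []Neighbor{{ID: 7}, {ID: 3}, {ID: 9}}
	groundTruth := []int{3, 7, 8, 1}
	// IDs 7 and 3 are hits, so 2 of 4 ground-truth items are recovered.
	fmt.Println(recallAtK(predicted, groundTruth, 3)) // prints 0.5
}
```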
func RunDataset ¶
func RunDataset(factory IndexFactory, dataset, root string, k, numQueries, maxResults int)
RunDataset loads the dataset, builds the index using the provided factory, and runs kNN queries on a subset of the test queries. If numQueries is negative or exceeds the number of available test vectors, all test vectors are used. It prints the predicted results and, when not benchmarking, the ground truth, and it computes Recall@k along with per-query response times, the average response time, and the overall runtime. When benchmarking, a progress bar is displayed. The number of worker threads is read from the HANN_BENCH_NTRD environment variable.
Types ¶
type IndexFactory ¶
IndexFactory is a function that creates a new index.
type QueryResult ¶
type QueryResult struct {
// contains filtered or unexported fields
}
QueryResult holds the results for a single query.