umap

v0.0.0-...-f6085fb Latest
Warning

This package is not in the latest version of its module.
Published: Jan 30, 2026 License: BSD-3-Clause Imports: 5 Imported by: 0

README

umap

UMAP in Go

Documentation

Overview

Package umap implements the UMAP (Uniform Manifold Approximation and Projection) dimensionality reduction algorithm.

UMAP is a dimension reduction technique that can be used for visualization similarly to t-SNE, but also for general non-linear dimension reduction.

This is a Go port of the original Python implementation by Leland McInnes: https://github.com/lmcinnes/umap

Basic usage:

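// data is a [][]float32 with one row per sample.
model := umap.New(umap.DefaultConfig())
// embedding has one row per input sample and NComponents columns.
embedding := model.FitTransform(data)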

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Config

type Config struct {
	// NNeighbors is the number of neighbors for k-NN graph construction.
	// Larger values capture more global structure but are slower.
	// Default: 15
	NNeighbors int

	// NComponents is the dimensionality of the target embedding.
	// Default: 2
	NComponents int

	// Metric is the distance metric to use.
	// Options: "euclidean", "manhattan", "cosine", "correlation", etc.
	// Default: "euclidean"
	Metric string

	// MinDist is the effective minimum distance between embedded points.
	// Smaller values create tighter clusters but may lose global structure.
	// Default: 0.1
	MinDist float32

	// Spread is the effective scale of embedded points.
	// In combination with MinDist, this controls the clumpiness of the embedding.
	// Default: 1.0
	Spread float32

	// NEpochs is the number of training epochs.
	// Larger values result in more accurate embeddings but take longer.
	// Default: 200 for large datasets, 500 for small datasets
	NEpochs int

	// LearningRate is the initial learning rate for SGD.
	// Default: 1.0
	LearningRate float32

	// NegativeSampleRate is the number of negative samples per positive sample.
	// Default: 5
	NegativeSampleRate int

	// Init is the initialization method.
	// Options: "spectral" or "random"
	// Default: "spectral"
	Init string

	// LocalConnectivity controls how local the connectivity estimate is.
	// Default: 1.0
	LocalConnectivity float64

	// SetOpMixRatio controls the blend between fuzzy set union and intersection.
	// 0.0 = pure intersection, 1.0 = pure union
	// Default: 1.0
	SetOpMixRatio float64

	// Seed for random number generation.
	// Use a fixed seed for reproducible results.
	// Default: 42
	Seed int64

	// NumWorkers for parallel processing.
	// 0 = auto-detect based on CPU cores.
	// Default: 0
	NumWorkers int

	// Verbose enables progress output.
	// Default: false
	Verbose bool

	// ProgressCallback is called after each epoch with (epoch, totalEpochs).
	// Default: nil
	ProgressCallback func(epoch, total int)
}

Config configures the UMAP algorithm.
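
The SetOpMixRatio field blends a fuzzy union with a fuzzy intersection. The sketch below shows one way such a blend can be computed for a single pair of directed membership strengths, following the formula used by the reference Python implementation. The function name blendMembership is hypothetical and not part of this package, and the Go port may organize the computation differently.

```go
package main

import "fmt"

// blendMembership combines two directed membership strengths,
// a (i -> j) and b (j -> i), into a single symmetric weight.
// mix = 1.0 yields the fuzzy union a + b - a*b;
// mix = 0.0 yields the fuzzy intersection a*b.
// Hypothetical helper for illustration only.
func blendMembership(a, b, mix float64) float64 {
	product := a * b
	return mix*(a+b-product) + (1-mix)*product
}

func main() {
	a, b := 0.8, 0.5
	for _, mix := range []float64{0.0, 0.5, 1.0} {
		fmt.Printf("mix=%.1f weight=%.2f\n", mix, blendMembership(a, b, mix))
	}
}
```

With the default of 1.0, an edge is kept if either point considers the other a neighbor; lower values increasingly require both directions to agree.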

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns the default UMAP configuration.
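
ProgressCallback is nil by default, so progress reporting has to be supplied by the caller. The following is a minimal sketch of a function with the documented signature func(epoch, total int). The helper newProgressReporter is hypothetical, and the sketch assumes epochs are numbered from 1; the loop in main only stands in for the optimizer.

```go
package main

import "fmt"

// newProgressReporter returns a callback with the signature
// func(epoch, total int). It reports every `every` epochs and
// once more on the final epoch.
// Assumption: epochs are numbered from 1; adjust if they start at 0.
func newProgressReporter(every int, report func(string)) func(epoch, total int) {
	if every < 1 {
		every = 1
	}
	return func(epoch, total int) {
		if epoch%every == 0 || epoch == total {
			report(fmt.Sprintf("epoch %d/%d", epoch, total))
		}
	}
}

func main() {
	cb := newProgressReporter(50, func(msg string) { fmt.Println(msg) })
	// Stand-in for the optimizer loop, which would invoke the callback itself.
	for epoch := 1; epoch <= 200; epoch++ {
		cb(epoch, 200)
	}
}
```

A function built this way could be assigned to the ProgressCallback field of a Config before passing it to New.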

type UMAP

type UMAP struct {
	Config Config
	// contains filtered or unexported fields
}

UMAP is the main UMAP model.

func New

func New(config Config) *UMAP

New creates a new UMAP model with the given configuration.

func (*UMAP) Embedding

func (u *UMAP) Embedding() [][]float32

Embedding returns the current embedding.

func (*UMAP) Fit

func (u *UMAP) Fit(data [][]float32)

Fit fits the model to the training data.

func (*UMAP) FitTransform

func (u *UMAP) FitTransform(data [][]float32) [][]float32

FitTransform fits the model to the data and returns the embedding.

func (*UMAP) Transform

func (u *UMAP) Transform(newData [][]float32) [][]float32

Transform maps new data into the embedding space of the fitted model. Note: This is a simplified implementation that uses nearest-neighbor lookup.
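
To illustrate what a nearest-neighbor lookup can look like, the sketch below places each new point at the embedding of its closest training point under squared Euclidean distance. This is an assumption about the general approach, not the package's actual code: the names nearestNeighborTransform and sqDist are hypothetical, and the real implementation may use a different metric or combine several neighbors.

```go
package main

import (
	"fmt"
	"math"
)

// sqDist returns the squared Euclidean distance between a and b.
// It assumes both slices have the same length.
func sqDist(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// nearestNeighborTransform copies, for every row of newData, the embedding
// of the closest row in train. train[i] must correspond to embedding[i].
// Hypothetical helper for illustration only.
func nearestNeighborTransform(train, embedding, newData [][]float32) [][]float32 {
	out := make([][]float32, len(newData))
	for i, p := range newData {
		best := -1
		bestDist := float32(math.MaxFloat32)
		for j, t := range train {
			if d := sqDist(p, t); d < bestDist {
				best, bestDist = j, d
			}
		}
		if best >= 0 {
			// Copy so callers cannot mutate the stored embedding.
			out[i] = append([]float32(nil), embedding[best]...)
		}
	}
	return out
}

func main() {
	train := [][]float32{{0, 0}, {10, 10}}
	embedding := [][]float32{{1, 2}, {3, 4}}
	fmt.Println(nearestNeighborTransform(train, embedding, [][]float32{{9, 9}, {1, 0}}))
}
```

A lookup of this kind is cheap and deterministic, but new points can only land on positions already occupied by training points.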

Directories

Path               Synopsis
cmd/umap           Command umap provides a CLI for running UMAP on data files.
distance           Package distance provides distance metrics for UMAP.
graph              Package graph provides fuzzy simplicial set construction for UMAP.
init               Package init provides initialization methods for UMAP embeddings.
internal/heap      Package heap provides max-heap implementations for k-NN tracking.
internal/math      Package math provides float32 math utilities for UMAP.
internal/parallel  Package parallel provides parallel execution helpers.
internal/rand      Package rand provides random number generation compatible with NumPy's RandomState.
layout             Package layout provides the optimization layout algorithms for UMAP.
nn                 Package nn provides nearest neighbor search algorithms for UMAP.
