clustering-go

Open-source Go library for data clustering, inspired by the OPTICS implementations available in Python.
Features
- OPTICS Algorithm: Density-based clustering that extends DBSCAN
- Fast Performance: Optimized implementation with efficient data structures
- Well Tested: Comprehensive test suite with edge cases and benchmarks
- Easy to Use: Simple API with clear documentation
- Open Source: MIT licensed, community maintained
Installation
go get github.com/damian-pramparo/clustering-go
Basic usage
OPTICS Clustering
package main

import (
	"fmt"
	"log"

	"github.com/damian-pramparo/clustering-go/clustering"
	"github.com/damian-pramparo/clustering-go/clustering/optics"
)

func main() {
	// Create a dataset with two distinct clusters
	data := clustering.Dataset{
		{0, 0}, {0, 1}, {1, 0}, {1, 1}, // Cluster 1
		{10, 10}, {10, 11}, {11, 10}, {11, 11}, // Cluster 2
		{5, 5}, // Noise point
	}

	// Initialize OPTICS clusterer
	clusterer := optics.NewAdapter(2, 2.0)

	// Fit and predict
	result, err := clusterer.FitPredict(data)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("Found %d clusters\n", len(result.Clusters))
	fmt.Printf("Point labels: %v\n", result.Labels)
}
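The labels printed above can be regrouped into per-cluster index lists. The following is a minimal sketch, not part of the library: it assumes `result.Labels` is a `[]int` with one entry per input point and that noise points carry a negative label. Neither assumption is confirmed by this README, and `groupByLabel` is a hypothetical helper.

```go
package main

import "fmt"

// groupByLabel splits point indices into clusters keyed by label.
// ASSUMPTION: a negative label marks a noise point; check the library's
// actual convention before relying on this.
func groupByLabel(labels []int) (map[int][]int, []int) {
	clusters := make(map[int][]int)
	var noise []int
	for i, label := range labels {
		if label < 0 {
			noise = append(noise, i)
			continue
		}
		clusters[label] = append(clusters[label], i)
	}
	return clusters, noise
}

func main() {
	// Hypothetical labels for the nine points in the example dataset.
	labels := []int{0, 0, 0, 0, 1, 1, 1, 1, -1}
	clusters, noise := groupByLabel(labels)
	fmt.Println("clusters:", clusters)
	fmt.Println("noise indices:", noise)
}
```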
K-Means Clustering
package main

import (
	"fmt"
	"log"

	"github.com/damian-pramparo/clustering-go/clustering"
	"github.com/damian-pramparo/clustering-go/kmeans"
)

func main() {
	// Create a dataset
	data := clustering.Dataset{
		{0, 0}, {0, 1}, {1, 0}, {1, 1},
		{10, 10}, {10, 11}, {11, 10}, {11, 11},
	}

	// Initialize K-Means clusterer
	clusterer := kmeans.New(2, 100, 1e-4)

	// Fit and predict
	result, err := clusterer.FitPredict(data)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("Found %d clusters\n", len(result.Clusters))
	fmt.Printf("Centroids: %v\n", clusterer.GetCentroids())
	fmt.Printf("Inertia: %.4f\n", clusterer.GetInertia())
}
Available Algorithms
Clustering Algorithms
- OPTICS: Density-based clustering that extends DBSCAN
- K-Means: Centroid-based clustering algorithm
- More coming soon: DBSCAN, Hierarchical clustering, etc.
Dimensionality Reduction (Planned)
- PCA: Principal Component Analysis
- t-SNE: t-Distributed Stochastic Neighbor Embedding
- UMAP: Uniform Manifold Approximation and Projection
Architecture
The library uses a modular architecture with common interfaces:
- clustering.Clusterer: Base interface for all clustering algorithms
- clustering.DensityBasedClusterer: Extended interface for density-based algorithms
- clustering.CentroidBasedClusterer: Extended interface for centroid-based algorithms
- dimensionality.DimensionalityReducer: Interface for dimensionality reduction
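To illustrate how a shared interface lets calling code stay independent of the algorithm, here is a self-contained sketch. The type and method shapes are assumptions inferred from the usage examples (`FitPredict`, `Labels`, `Clusters`, `GetCentroids`, `GetInertia`); the library's real definitions may differ. `singleCluster` and `countClusters` are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Dataset mirrors the literal form used in the usage examples (assumed shape).
type Dataset [][]float64

// Result is an assumed shape: one label per point, plus point indices per cluster.
type Result struct {
	Labels   []int
	Clusters [][]int
}

// Clusterer is a sketch of a base interface, not the library's definition.
type Clusterer interface {
	FitPredict(data Dataset) (*Result, error)
}

// CentroidBasedClusterer extends Clusterer by embedding it (assumed methods).
type CentroidBasedClusterer interface {
	Clusterer
	GetCentroids() [][]float64
	GetInertia() float64
}

// singleCluster is a toy Clusterer that puts every point in cluster 0.
type singleCluster struct{}

func (singleCluster) FitPredict(data Dataset) (*Result, error) {
	if len(data) == 0 {
		return nil, errors.New("empty dataset")
	}
	res := &Result{Labels: make([]int, len(data)), Clusters: [][]int{{}}}
	for i := range data {
		res.Clusters[0] = append(res.Clusters[0], i)
	}
	return res, nil
}

// countClusters works with any Clusterer, whatever algorithm is behind it.
func countClusters(c Clusterer, data Dataset) (int, error) {
	res, err := c.FitPredict(data)
	if err != nil {
		return 0, err
	}
	return len(res.Clusters), nil
}

func main() {
	n, err := countClusters(singleCluster{}, Dataset{{0, 0}, {1, 1}})
	fmt.Println(n, err)
}
```

Embedding the base interface in the extended ones means a centroid-based implementation can be passed anywhere a plain `Clusterer` is expected.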
Algorithm Parameters
OPTICS
- minPts: Minimum number of points required to form a cluster
- epsilon: Maximum distance between two points to be considered neighbors
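How minPts and epsilon interact can be sketched with the core distance that OPTICS computes for each point. This is an illustration under stated assumptions, not the library's code: it uses Euclidean distance, a brute-force neighbor scan, and counts the point itself toward minPts. The helper names `euclidean` and `coreDistance` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// euclidean returns the Euclidean distance between two points of equal dimension.
func euclidean(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// coreDistance returns the distance from data[i] to its minPts-th nearest
// neighbor within epsilon. ASSUMPTION: the point itself counts as a neighbor.
// ok is false when fewer than minPts points lie within epsilon, meaning
// data[i] is not a core point.
func coreDistance(data [][]float64, i, minPts int, epsilon float64) (dist float64, ok bool) {
	var dists []float64
	for j := range data {
		if d := euclidean(data[i], data[j]); d <= epsilon {
			dists = append(dists, d)
		}
	}
	if minPts < 1 || len(dists) < minPts {
		return 0, false
	}
	sort.Float64s(dists)
	return dists[minPts-1], true
}

func main() {
	data := [][]float64{
		{0, 0}, {0, 1}, {1, 0}, {1, 1},
		{10, 10}, {10, 11}, {11, 10}, {11, 11},
		{5, 5},
	}
	for i, p := range data {
		if cd, ok := coreDistance(data, i, 2, 2.0); ok {
			fmt.Printf("%v: core distance %.2f\n", p, cd)
		} else {
			fmt.Printf("%v: not a core point\n", p)
		}
	}
}
```

With minPts = 2 and epsilon = 2.0, the isolated point {5, 5} has no neighbor within range other than itself, so under these assumptions it is not a core point.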
K-Means
- k: Number of clusters to form
- maxIterations: Maximum number of iterations
- tolerance: Convergence tolerance
Contributing
Contributions are welcome! Please see CONTRIBUTING.md.
License
This project is licensed under the MIT License. See LICENSE.
Support the project
If you find this project useful, consider making a donation to support its development and maintenance.
Thank you for your support!
Development and tests
To run the tests:
go test ./...