clustering

package
v0.0.0-...-516b90f Latest
Warning

This package is not in the latest version of its module.

Published: Mar 30, 2018 License: MIT Imports: 3 Imported by: 0

README

go.geo/clustering

Package clustering provides simple hierarchical clustering. See the clustering godoc for more information.

Example

package main

import (
	"fmt"

	"github.com/paulmach/go.geo"
	"github.com/paulmach/go.geo/clustering"
)

func main() {
	pointers := []geo.Pointer{
		&Event{Location: geo.NewPoint(1, 1)},
		&Event{Location: geo.NewPoint(2, 2)},
		&Event{Location: geo.NewPoint(5, 5)},
	}

	clusters := clustering.ClusterPointers(
		pointers,
		clustering.CentroidDistance{},
		2, // distance threshold, merge until clusters are at least this far apart
	)

	for i, c := range clusters {
		fmt.Printf("cluster %d:\n", i+1)
		for _, p := range c.Pointers {
			e := p.(*Event)
			fmt.Printf("   %+v\n", e)
		}
	}
	// Output:
	// cluster 1:
	//    &{Location:[1.000000, 1.000000]}
	//    &{Location:[2.000000, 2.000000]}
	// cluster 2:
	//    &{Location:[5.000000, 5.000000]}
}

// Event is an example of an object implementing the geo.Pointer interface.
type Event struct {
	Location *geo.Point
}

func (e *Event) CenterPoint() *geo.Point {
	return e.Location
}

Example for Geo data

The ClusterGeoPointers function first projects the points using Mercator (EPSG:3857), scales the threshold accordingly, and then clusters using a euclidean distance. Prefer it when it makes sense, i.e., when the data is fairly local. Benchmarks found this to be 40% faster than clustering with a geo distancer for a 555-point set.

package main

import (
	"fmt"

	"github.com/paulmach/go.geo"
	"github.com/paulmach/go.geo/clustering"
)

func main() {
	pointers := []geo.Pointer{
		&Event{Location: geo.NewPoint(-122.548081, 37.905995)},
		&Event{Location: geo.NewPoint(-122.548091, 37.905987)},
		&Event{Location: geo.NewPoint(-122.54807, 37.905995)},
		&Event{Location: geo.NewPoint(-122.54807, 37.905995)},
		&Event{Location: geo.NewPoint(-122.54807, 37.905987)},
	}

	threshold := 1.0 // meters
	clusters := clustering.ClusterGeoPointers(
		pointers,
		threshold,
	)

	for i, c := range clusters {
		fmt.Printf("cluster %d:\n", i+1)
		for _, p := range c.Pointers {
			e := p.(*Event)
			fmt.Printf("   %+v\n", e)
		}
	}
	// Output:
	// cluster 1:
	//    &{Location:[-122.548081, 37.905995]}
	// cluster 2:
	//    &{Location:[-122.548091, 37.905987]}
	// cluster 3:
	//    &{Location:[-122.548070, 37.905995]}
	//    &{Location:[-122.548070, 37.905995]}
	//    &{Location:[-122.548070, 37.905987]}
}

// Event is an example of an object implementing the geo.Pointer interface.
type Event struct {
	Location *geo.Point
}

func (e *Event) CenterPoint() *geo.Point {
	return e.Location
}

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type CentroidDistance

type CentroidDistance struct{}

CentroidDistance implements the ClusterDistancer interface where the distance is just the euclidean distance between the cluster centroids.

func (CentroidDistance) ClusterDistance

func (cd CentroidDistance) ClusterDistance(c1, c2 *Cluster) float64

ClusterDistance computes the distance between the cluster centroids.

type CentroidGeoDistance

type CentroidGeoDistance struct{}

CentroidGeoDistance implements the ClusterDistancer interface where the distance is just the geo distance between the cluster centroids. If possible, it is recommended to project the lat/lng points into a euclidean space and use CentroidSquaredDistance.

func (CentroidGeoDistance) ClusterDistance

func (cgd CentroidGeoDistance) ClusterDistance(c1, c2 *Cluster) float64

ClusterDistance computes the geo distance between the cluster centroids.

type CentroidSquaredDistance

type CentroidSquaredDistance struct{}

CentroidSquaredDistance implements the ClusterDistancer interface where the distance is just the squared euclidean distance between the cluster centroids. This distancer is recommended over CentroidDistance; just pass the squared threshold.
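The relationship between the two distancers can be illustrated with a minimal sketch. The point type and helper functions below are hypothetical stand-ins, not part of this package; the sketch only shows that comparing the squared distance against the squared threshold makes the same decision as comparing the plain distance against the threshold, without the square root.

```go
package main

import (
	"fmt"
	"math"
)

// point is a hypothetical stand-in for a cluster centroid.
type point struct{ X, Y float64 }

// squaredDistance returns the squared euclidean distance between a and b.
func squaredDistance(a, b point) float64 {
	dx, dy := a.X-b.X, a.Y-b.Y
	return dx*dx + dy*dy
}

// distance returns the euclidean distance between a and b.
func distance(a, b point) float64 {
	return math.Sqrt(squaredDistance(a, b))
}

func main() {
	a, b := point{1, 1}, point{2, 2}
	threshold := 2.0

	// Both comparisons make the same merge decision;
	// the second one never computes a square root.
	fmt.Println(distance(a, b) < threshold)
	fmt.Println(squaredDistance(a, b) < threshold*threshold)
}
```

Because squaring is monotonic for non-negative values, the two comparisons agree for any pair of points and any non-negative threshold.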

func (CentroidSquaredDistance) ClusterDistance

func (csd CentroidSquaredDistance) ClusterDistance(c1, c2 *Cluster) float64

ClusterDistance computes the squared euclidean distance between the cluster centroids.

type Cluster

type Cluster struct {
	Centroid *geo.Point
	Pointers []geo.Pointer
}

A Cluster is a cluster of pointers plus their centroid. It defines a center/centroid for easy centroid distance computation.

func ClusterClusters

func ClusterClusters(clusters []*Cluster, distancer ClusterDistancer, threshold float64) []*Cluster

ClusterClusters can be used if you've already created cluster objects using a prefilterer or something else. The original clusters are copied, so the original set is left unchanged.

func ClusterGeoClusters

func ClusterGeoClusters(clusters []*Cluster, threshold float64) []*Cluster

ClusterGeoClusters can be used if you've already created cluster objects using a prefilterer or something else.

func ClusterGeoPointers

func ClusterGeoPointers(pointers []geo.Pointer, threshold float64) []*Cluster

ClusterGeoPointers will take a set of Pointers and cluster them. It will project the points using mercator, scale the threshold, cluster, and project back. Performance is about 40% faster than simply using a geo distancer. This may not make sense for all geo datasets.

func ClusterPointers

func ClusterPointers(pointers []geo.Pointer, distancer ClusterDistancer, threshold float64) []*Cluster

ClusterPointers will take a set of Pointers and cluster them using the distancer and threshold.

func NewCluster

func NewCluster(pointers ...geo.Pointer) *Cluster

NewCluster creates the point cluster and finds the center of the given pointers.
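As a rough illustration of finding a center, the sketch below computes the arithmetic mean of a set of points. This is an assumption about what "center" means here; the point type and centroid function are hypothetical and are not taken from this package.

```go
package main

import "fmt"

// point is a hypothetical stand-in for *geo.Point.
type point struct{ X, Y float64 }

// centroid returns the arithmetic mean of pts.
// The boolean is false when pts is empty and no center exists.
func centroid(pts []point) (point, bool) {
	if len(pts) == 0 {
		return point{}, false
	}
	var sx, sy float64
	for _, p := range pts {
		sx += p.X
		sy += p.Y
	}
	n := float64(len(pts))
	return point{sx / n, sy / n}, true
}

func main() {
	c, ok := centroid([]point{{1, 1}, {2, 2}})
	fmt.Println(c, ok)
}
```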

func NewClusterWithCentroid

func NewClusterWithCentroid(centroid *geo.Point, pointers ...geo.Pointer) *Cluster

NewClusterWithCentroid creates a point cluster stub from the given centroid and optional pointers.

type ClusterDistancer

type ClusterDistancer interface {
	ClusterDistance(c1, c2 *Cluster) float64
}

A ClusterDistancer defines how to compute the distance between point clusters.
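Since ClusterDistancer is a one-method interface, a custom distance can be plugged in by implementing ClusterDistance. The sketch below uses local stand-in types (point, cluster, clusterDistancer) so it compiles without the go.geo module; manhattanDistance is a hypothetical distancer, not one this package provides.

```go
package main

import (
	"fmt"
	"math"
)

// point and cluster are local stand-ins for *geo.Point and Cluster.
type point struct{ X, Y float64 }

type cluster struct {
	Centroid point
}

// clusterDistancer mirrors the one-method shape of ClusterDistancer.
type clusterDistancer interface {
	ClusterDistance(c1, c2 *cluster) float64
}

// manhattanDistance is a hypothetical distancer that sums the absolute
// coordinate differences between the two centroids.
type manhattanDistance struct{}

func (manhattanDistance) ClusterDistance(c1, c2 *cluster) float64 {
	return math.Abs(c1.Centroid.X-c2.Centroid.X) +
		math.Abs(c1.Centroid.Y-c2.Centroid.Y)
}

func main() {
	var d clusterDistancer = manhattanDistance{}
	a := &cluster{Centroid: point{1, 1}}
	b := &cluster{Centroid: point{2, 3}}
	fmt.Println(d.ClusterDistance(a, b))
}
```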

type Combiner

type Combiner interface {
	Combine(c Combiner) Combiner
	DistanceFromCombiner(c Combiner) float64
}

A Combiner is something that can be combined.

func ClusterCombiners

func ClusterCombiners(combiners []Combiner, threshold float64) []Combiner

ClusterCombiners will do a simple hierarchical clustering of the combiners. It will modify the input slice, as items are combined into each other.
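To make the Combiner contract concrete, here is a minimal sketch of a type that satisfies it, together with a naive merge loop. The Combiner interface is repeated locally so the example is self-contained; the weighted type and the clusterNaive function are hypothetical and are not this package's implementation. Unlike ClusterCombiners, the loop works on a copy of the input slice, and it assumes merging stops once the closest pair is at least the threshold apart.

```go
package main

import (
	"fmt"
	"math"
)

// Combiner mirrors the interface documented above.
type Combiner interface {
	Combine(c Combiner) Combiner
	DistanceFromCombiner(c Combiner) float64
}

// weighted is a hypothetical Combiner: a 1-D value with a weight.
type weighted struct {
	Value  float64
	Weight float64
}

// Combine returns a new item holding the weighted mean of both values.
func (w *weighted) Combine(c Combiner) Combiner {
	o := c.(*weighted)
	total := w.Weight + o.Weight
	return &weighted{
		Value:  (w.Value*w.Weight + o.Value*o.Weight) / total,
		Weight: total,
	}
}

// DistanceFromCombiner returns the absolute difference of the values.
func (w *weighted) DistanceFromCombiner(c Combiner) float64 {
	return math.Abs(w.Value - c.(*weighted).Value)
}

// clusterNaive repeatedly merges the closest pair until no pair is closer
// than threshold. It is cubic in the worst case and for illustration only.
func clusterNaive(in []Combiner, threshold float64) []Combiner {
	items := append([]Combiner(nil), in...) // copy, leave the input alone
	for len(items) > 1 {
		bi, bj, best := -1, -1, math.Inf(1)
		for i := 0; i < len(items); i++ {
			for j := i + 1; j < len(items); j++ {
				if d := items[i].DistanceFromCombiner(items[j]); d < best {
					bi, bj, best = i, j, d
				}
			}
		}
		if best >= threshold {
			break
		}
		items[bi] = items[bi].Combine(items[bj])
		items = append(items[:bj], items[bj+1:]...)
	}
	return items
}

func main() {
	out := clusterNaive([]Combiner{
		&weighted{1, 1}, &weighted{2, 1}, &weighted{10, 1},
	}, 3)
	for _, c := range out {
		fmt.Printf("%+v\n", *c.(*weighted))
	}
}
```

With a threshold of 3, the values 1 and 2 merge into a single item of weight 2, while 10 stays on its own.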

type Sortable

type Sortable []*Cluster

Sortable implements the sorting interface allowing for sorting.

func (Sortable) Len

func (s Sortable) Len() int

Len returns the length of the sortable cluster. This is to implement the sort interface.

func (Sortable) Less

func (s Sortable) Less(i, j int) bool

Less returns true if i > j, so bigger will be first. This is to implement the sort interface.

func (Sortable) Sort

func (s Sortable) Sort()

Sort will sort the set. Usage: clustering.Sortable(clusters).Sort()

func (Sortable) Swap

func (s Sortable) Swap(i, j int)

Swap interchanges two elements. This is to implement the sort interface.
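The four Sortable methods follow the standard sort.Interface pattern plus a convenience Sort method. The sketch below reproduces that pattern with local stand-in types. It assumes "bigger" means a cluster holding more pointers, which is an interpretation of the Less description rather than something stated in the docs; cluster and bySize are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// cluster is a local stand-in for Cluster; strings replace geo.Pointer.
type cluster struct {
	Pointers []string
}

// bySize orders clusters so that the one with the most pointers comes first.
type bySize []*cluster

func (s bySize) Len() int           { return len(s) }
func (s bySize) Less(i, j int) bool { return len(s[i].Pointers) > len(s[j].Pointers) }
func (s bySize) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// Sort sorts in place, mirroring the Sortable(clusters).Sort() usage.
func (s bySize) Sort() { sort.Sort(s) }

func main() {
	clusters := []*cluster{
		{Pointers: []string{"a"}},
		{Pointers: []string{"b", "c", "d"}},
		{Pointers: []string{"e", "f"}},
	}

	bySize(clusters).Sort()

	for _, c := range clusters {
		fmt.Println(len(c.Pointers), c.Pointers)
	}
}
```

Converting the slice with bySize(clusters) shares the same backing array, so the original slice is reordered in place.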
