Back to godoc.org
github.com/xtgo/set

Package set

v1.0.0
Latest Go to latest

The highest tagged major version is .

Published: Jan 18, 2018 | License: BSD-2-Clause | Module: github.com/xtgo/set

Overview

Package set implements type-safe, non-allocating algorithms that operate on ordered sets.

Most functions take a data parameter of type sort.Interface and a pivot parameter of type int; data represents two sets covering the ranges [0:pivot] and [pivot:Len], each of which is expected to be sorted and free of duplicates. sort.Sort may be used for sorting, and Uniq may be used to filter away duplicates.

All mutating functions swap elements as necessary from the two input sets to form a single output set, returning its size: the output set will be in the range [0:size], and will be in sorted order and free of duplicates. Elements which were moved into the range [size:Len] will have undefined order and may contain duplicates.

All pivots must be in the range [0:Len]. A panic may occur when invalid pivots are passed into any of the functions.

Convenience functions exist for slices of int, float64, and string element types, and also serve as examples for implementing utility functions for other types.

Elements will be considered equal if `!Less(i,j) && !Less(j,i)`. An implication of this is that NaN values are equal to each other.

Example

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
)

func main() {
	s := set.Strings([]string{"alpha", "gamma", "alpha"})
	fmt.Println("set:", s)

	s = set.StringsDo(set.Union, s, "beta")
	fmt.Println("set + beta:", s)

	fmt.Println(s, "contains any [alpha delta]:",
		set.StringsChk(set.IsInter, s, "alpha", "delta"))

	fmt.Println(s, "contains all [alpha delta]:",
		set.StringsChk(set.IsSuper, s, "alpha", "delta"))

}
set: [alpha gamma]
set + beta: [alpha beta gamma]
[alpha beta gamma] contains any [alpha delta]: true
[alpha beta gamma] contains all [alpha delta]: false

Index

Examples

func Apply

func Apply(op Op, data sort.Interface, pivots []int) (size int)

Apply concurrently applies op to all the sets terminated by pivots. pivots must contain one higher than the final index in each set, with the final element of pivots being equal to data.Len(); this deviates from the pivot semantics of other functions (which treat pivot as a delimiter) in order to make initializing the pivots slice simpler.

data.Swap and data.Less are assumed to be concurrent-safe. Only associative operations should be used (Diff is not associative); see the Apply (Diff) example for a workaround. The result of applying SymDiff will contain elements that exist in an odd number of sets.

The implementation runs op concurrently on pairs of neighbor sets in-place; when any pair has been merged, the resulting set is re-paired with one of its neighbor sets and the process repeats until only one set remains. The process is adaptive (large sets will not prevent small pairs from being processed), and strives for data-locality (only adjacent neighbors are paired and data shifts toward the zero index).

Example

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
	"sort"
)

func main() {
	sets := []sort.IntSlice{
		{1, 3, 5, 7, 9},  // odds from 1
		{3, 5, 7, 9, 11}, // odds from 3
		{5, 10, 15, 20},  // 5-multiples
		{2, 3, 5, 7, 11}, // primes
	}

	pivots := make([]int, len(sets))
	var orig, data sort.IntSlice

	// concatenate the sets together for use with the set package
	for i, set := range sets {
		pivots[i] = len(set)
		orig = append(orig, set...)
	}

	// transform set sizes into pivots
	pivots = set.Pivots(pivots...)

	tasks := []struct {
		name string
		op   set.Op
	}{
		{"union", set.Union},
		{"inter", set.Inter},
		{"sdiff", set.SymDiff},
	}

	for _, task := range tasks {
		// make a copy of the original data (Apply rearranges the input)
		data = append(data[:0], orig...)
		size := set.Apply(task.op, data, pivots)
		data = data[:size]
		fmt.Printf("%s: %v\n", task.name, data)
	}

}
union: [1 2 3 5 7 9 10 11 15 20]
inter: [5]
sdiff: [1 2 3 7 10 15 20]
Example (Diff)

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
	"sort"
)

func main() {
	// a -  b - c - d  cannot be used with Apply (Diff is non-associative)
	// a - (b + c + d) equivalent, using Apply (Union is associative)

	sets := []sort.IntSlice{
		{0, 2, 4, 6, 8, 10},  // positive evens
		{0, 1, 2, 3, 5, 8},   // set of fibonacci numbers
		{5, 10, 15},          // positive 5-multiples
		{2, 3, 5, 7, 11, 13}, // primes
	}

	var data sort.IntSlice

	// for use with (b + c + d)
	exprsets := sets[1:]
	pivots := make([]int, len(exprsets))

	// concatenate the sets together for use with the set package
	for i, set := range exprsets {
		pivots[i] = len(set)
		data = append(data, set...)
	}

	// transform set sizes into pivots
	pivots = set.Pivots(pivots...)

	// calculate x = (b + c + d)
	size := set.Apply(set.Union, data, pivots)

	// concatenate the result to the end of a
	data = append(sets[0], data[:size]...)

	// calculate a - x
	size = set.Diff(data, len(sets[0]))
	data = data[:size]

	fmt.Println("diff:", data)

}
diff: [4 6]

func Diff

func Diff(data sort.Interface, pivot int) (size int)

Diff performs an in-place difference on the two sets [0:pivot] and [pivot:Len]; the resulting set will occupy [0:size]. Diff is neither associative nor commutative.

func Float64s

func Float64s(data []float64) []float64

Float64s sorts and deduplicates a slice of float64s in place, returning the resulting set.

func Float64sChk

func Float64sChk(cmp Cmp, s []float64, t ...float64) bool

Float64sChk compares s and t according to cmp.

func Float64sDo

func Float64sDo(op Op, s []float64, t ...float64) []float64

Float64sDo applies op to the float64 sets, s and t, returning the result. s and t must already be individually sorted and free of duplicates.

func Inter

func Inter(data sort.Interface, pivot int) (size int)

Inter performs an in-place intersection on the two sets [0:pivot] and [pivot:Len]; the resulting set will occupy [0:size]. Inter is both associative and commutative.

Example

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
	"sort"
)

func main() {
	data := sort.IntSlice{3, 5, 7} // create a set (it must be sorted)
	pivot := len(data)             // store the length of our first set

	data = append(data, 1, 3, 5)   // append a second set (which also must be sorted)
	size := set.Inter(data, pivot) // perform set intersection

	// trim data to contain just the result set
	data = data[:size]

	fmt.Println("inter:", data)

}
inter: [3 5]

func Ints

func Ints(data []int) []int

Ints sorts and deduplicates a slice of ints in place, returning the resulting set.

func IntsChk

func IntsChk(cmp Cmp, s []int, t ...int) bool

IntsChk compares s and t according to cmp.

func IntsDo

func IntsDo(op Op, s []int, t ...int) []int

IntsDo applies op to the int sets, s and t, returning the result. s and t must already be individually sorted and free of duplicates.

func IsEqual

func IsEqual(data sort.Interface, pivot int) bool

IsEqual returns true if the sets [0:pivot] and [pivot:Len] are equal.

func IsInter

func IsInter(data sort.Interface, pivot int) bool

IsInter returns true if any element in the range [0:pivot] is also present in the range [pivot:Len]. IsInter is especially useful for partial membership testing.

Example

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
	"sort"
)

func main() {
	data := sort.StringSlice{"b", "c", "d"} // create a set (it must be sorted)
	pivot := len(data)                      // store the length of our first set

	data = append(data, "d", "z")         // append a second set (which also must be sorted)
	contained := set.IsInter(data, pivot) // check the first set is a superset of the second

	fmt.Printf("%v intersects %v = %t\n", data[:pivot], data[pivot:], contained)

	data = data[:pivot] // trim off the second set

	data = append(data, "s")             // append a new singleton set to compare against
	contained = set.IsInter(data, pivot) // check for membership

	fmt.Printf("%v intersects %v = %t\n", data[:pivot], data[pivot:], contained)

}
[b c d] intersects [d z] = true
[b c d] intersects [s] = false

func IsSub

func IsSub(data sort.Interface, pivot int) bool

IsSub returns true only if all elements in the range [0:pivot] are also present in the range [pivot:Len].

func IsSuper

func IsSuper(data sort.Interface, pivot int) bool

IsSuper returns true only if all elements in the range [pivot:Len] are also present in the range [0:pivot]. IsSuper is especially useful for full membership testing.

Example

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
	"sort"
)

func main() {
	data := sort.StringSlice{"b", "c", "d"} // create a set (it must be sorted)
	pivot := len(data)                      // store the length of our first set

	data = append(data, "c", "d")         // append a second set (which also must be sorted)
	contained := set.IsSuper(data, pivot) // check the first set is a superset of the second

	fmt.Printf("%v superset of %v = %t\n", data[:pivot], data[pivot:], contained)

	data = data[:pivot] // trim off the second set

	data = append(data, "s")             // append a new singleton set to compare against
	contained = set.IsSuper(data, pivot) // check for membership

	fmt.Printf("%v superset of %v = %t\n", data[:pivot], data[pivot:], contained)

}
[b c d] superset of [c d] = true
[b c d] superset of [s] = false

func Pivots

func Pivots(sizes ...int) []int

Pivots transforms set-relative sizes into data-absolute pivots. Pivots is mostly only useful in conjunction with Apply. The sizes slice sizes may be modified by the call.

func Strings

func Strings(data []string) []string

Strings sorts and deduplicates a slice of strings in place, returning the resulting set.

func StringsChk

func StringsChk(cmp Cmp, s []string, t ...string) bool

StringsChk compares s and t according to cmp.

func StringsDo

func StringsDo(op Op, s []string, t ...string) []string

StringsDo applies op to the string sets, s and t, returning the result. s and t must already be individually sorted and free of duplicates.

func SymDiff

func SymDiff(data sort.Interface, pivot int) (size int)

SymDiff performs an in-place symmetric difference on the two sets [0:pivot] and [pivot:Len]; the resulting set will occupy [0:size]. SymDiff is both associative and commutative.

func Union

func Union(data sort.Interface, pivot int) (size int)

Union performs an in-place union on the two sets [0:pivot] and [pivot:Len]; the resulting set will occupy [0:size]. Union is both associative and commutative.

func Uniq

func Uniq(data sort.Interface) (size int)

Uniq swaps away duplicate elements in data, returning the size of the unique set. data is expected to be pre-sorted, and the resulting set in the range [0:size] will remain in sorted order. Uniq, following a sort.Sort call, can be used to prepare arbitrary inputs for use as sets.

Example

Code:

package main

import (
	"fmt"
	"github.com/xtgo/set"
	"sort"
)

func main() {
	data := sort.IntSlice{5, 7, 3, 3, 5}

	sort.Sort(data)     // sort the data first
	n := set.Uniq(data) // Uniq returns the size of the set
	data = data[:n]     // trim the duplicate elements

	fmt.Println(data)
}
[3 5 7]

type Cmp

type Cmp func(data sort.Interface, pivot int) bool

The Cmp type can be used to represent any of the comparison functions, such as IsInter.

type Op

type Op func(data sort.Interface, pivot int) (size int)

The Op type can be used to represent any of the mutating functions, such as Inter.

Package Files

BUGs

  • All ops should use binary search when runs are detected

  • Union currently uses a multi-pass implementation

  • SymDiff currently uses a multi-pass implementation

Documentation was rendered with GOOS=linux and GOARCH=amd64.

Jump to identifier

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to identifier