triego

package module
v0.0.0-...-215f9bf Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 20, 2016 License: MIT Imports: 2 Imported by: 0

README

Triego Build Status

Despite the name the internal of this data structure is implemented using a Radix Tree.

This project is one of the funding blocks of typeflow-webservice but, although mostly designed around it, it features a rather general purpose API around a Radix Tree.

Implementation considerations

The whole implementation has been designed with prefix iteration efficiency in mind. This implies that feeding the radix tree is way slower than a simple trie, but looking for words or prefixes is way faster due to the reduced amount of nodes to traverse. So, although insertion should be still MAX(O(N*Log(N))), the algorithm requires some additional time mostly due to memory allocations and/or resizing of existing nodes.

Basic usage

package main

import (
  "github.com/alediaferia/triego"
  "fmt"
)

func countTrieNodes(trie *triego.Trie, i *int) {
	if len(trie.Children) == 0 {
		*i = *i + 1
		return
	}
	for _, v := range trie.Children {
		countTrieNodes(v, i)
	}

	*i = *i + 1
}

func main() {
    trie := triego.NewTrie()
    
    trie.AppendWord("trial")
    trie.AppendWord("trie")
    
    fmt.Println("Stored words:")
    for _, w := range trie.Words() {
        fmt.Println(w)
    }
    
    fmt.Println("")
    var nodes int = 0
    countTrieNodes(trie, &nodes)
    fmt.Printf("Number of allocated nodes: %d\n", nodes)
}

Output:

Stored words:
trie
trial

Number of allocated nodes: 4

That is, 3 actual nodes plus the root node. This is because 1 node is required for the "tri" prefix and just 2 additional nodes for "e" and "al":

  • tri
    • e
    • al

Advanced concepts

Prefixes iteration

The provided API can be used to iterate over all the prefixes in the radix.

You need to provide the EachPrefix API with a callback having this signature:

type PrefixIteratorCallback   func(PrefixInfo) (skip_subtree, halt bool)

PrefixInfo holds information about the current prefix. You can find out here. The callback can also be used to skip entire subtrees making the search even faster if it meets certain conditions of your choice. To do so, simply return true as the first return argument.


radix.EachPrefix(func(info PrefixInfo) (skip_subtree, halt bool) {
    value := some_computation(info.Prefix)
    
    if value > MAX_THRESHOLD {
        // this value is too high
        // and we are not interested in
        // further prefixes containing
        // the current one so we skip
        // the current subtree altogether
        return true, false
    }
    
    ...
    
    return false, false
})

License

The code in this repository is released under the terms of the MIT license. Copyright (c) Alessandro Diaferia alediaferia@gmail.com

Documentation

Overview

This package exposes a simple Trie implementation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type PrefixInfo

type PrefixInfo struct {
	Prefix string
	IsWord bool
	Depth  int

	// SharedLength gives
	// information about how
	// much length is shared with
	// the previous prefix,
	// aka, how many characters
	// of the previous prefix are shared
	// with the current one
	SharedLength int
}

type PrefixIteratorCallback

type PrefixIteratorCallback func(PrefixInfo) (skip_subtree, halt bool)

This call back is used by EachPrefix function and called for every prefix in the radix tree. Returning true for the skip_subtree value makes the algorithm skip the entire subtree that is about to explore. This feature can enable very fast tree iterations despite the DFS nature of the traversal.

type Stack

type Stack struct {
	// contains filtered or unexported fields
}

func NewStack

func NewStack() *Stack

func (*Stack) Pop

func (s *Stack) Pop() (elem *Trie)

func (*Stack) Push

func (s *Stack) Push(elem ...*Trie)

func (*Stack) Size

func (s *Stack) Size() int

func (*Stack) Top

func (s *Stack) Top() (elem *Trie)

type Trie

type Trie struct {
	IsWord bool
	Parent *Trie

	Children []*Trie
	// contains filtered or unexported fields
}

func NewTrie

func NewTrie() (t *Trie)

Initializes a new radix tree

func (*Trie) AppendWord

func (t *Trie) AppendWord(phrase string)

Appends a word to the trie The algorithm follows the BFS traversal principles implemented iteratively. Given suffix is treated as a full word.

func (*Trie) AppendWords

func (t *Trie) AppendWords(words ...string)

func (*Trie) ClosestWords

func (t *Trie) ClosestWords(word string) []interface{}

Returns an array of objects that are associated with the words closest to the specified word param

func (*Trie) EachPrefix

func (t *Trie) EachPrefix(callback PrefixIteratorCallback)

Iterates for each prefix in the radix tree calling the given callback. The given callback can be used to guide the tree traversal. The traversal is based on a DFS implementation backed by a stack to handle the nodes to iterate to. A special implementation of the stack allows for a O(1) push for all the node children at once keeping the whole traversal MAX(O(N)) where N is the number of nodes.

func (*Trie) HasWord

func (t *Trie) HasWord(word string) bool

Returns true if the word is found in the radix tree

func (*Trie) IsRoot

func (t *Trie) IsRoot() bool

Returns true if this radix tree node is root

func (*Trie) Words

func (t *Trie) Words() (words []interface{})

Returns a list with all the words present in the radix tree

type TrieNode

type TrieNode Trie

func (*TrieNode) Depth

func (t *TrieNode) Depth() int

Returns the depth of the node within the whole radix tree it belongs to

func (*TrieNode) IsRoot

func (t *TrieNode) IsRoot() bool

type TriePtr

type TriePtr *Trie

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL