Bits

package
v0.0.0-...-1e96df0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 26, 2021 License: Unlicense Imports: 2 Imported by: 0

Documentation

Overview

A Succinct Trie for Go

By Siong-Ui Te Released to the public domain. translated From:

A Succinct Trie for Javascript

By Steve Hanov Released to the public domain.

This file contains functions for creating a succinctly encoded trie structure from a list of words. The trie is encoded to a succinct bit string using the method of Jacobson (1989). The bitstring is then encoded using BASE-64.

The resulting trie does not have to be decoded to be used. This file also contains functions for looking up a word in the BASE-64 encoded data, in O(mlogn) time, where m is the number of letters in the target word, and n is the number of nodes in the trie.

Objects for encoding:

TrieNode Trie BitWriter

Objects for decoding: BitString FrozenTrieNode FrozenTrie

QUICK USAGE:

Suppose we let data be some output of the demo encoder:

var data = {
   "nodeCount": 37,
   "directory": "BMIg",
   "trie": "v2qqqqqqqpIUn4A5JZyBZ4ggCKh55ZZgBA5ZZd5vIEl1wx8g8A"
};

var frozenTrie = new FrozenTrie( Data.trie, Data.directory, Data.nodeCount);

alert( frozenTrie.lookup( "hello" ) ); // outputs true alert( frozenTrie.lookup( "kwijibo" ) ); // outputs false

Index

Constants

This section is empty.

Variables

View Source
var BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"
View Source
var BASE64_CACHE = map[string]uint{
	"A": 0, "B": 1, "C": 2, "D": 3, "E": 4, "F": 5, "G": 6, "H": 7,
	"I": 8, "J": 9, "K": 10, "L": 11, "M": 12, "N": 13, "O": 14,
	"P": 15, "Q": 16, "R": 17, "S": 18, "T": 19, "U": 20, "V": 21,
	"W": 22, "X": 23, "Y": 24, "Z": 25, "a": 26, "b": 27, "c": 28,
	"d": 29, "e": 30, "f": 31, "g": 32, "h": 33, "i": 34, "j": 35,
	"k": 36, "l": 37, "m": 38, "n": 39, "o": 40, "p": 41, "q": 42,
	"r": 43, "s": 44, "t": 45, "u": 46, "v": 47, "w": 48, "x": 49,
	"y": 50, "z": 51, "0": 52, "1": 53, "2": 54, "3": 55, "4": 56,
	"5": 57, "6": 58, "7": 59, "8": 60, "9": 61, "-": 62, "_": 63,
}

*

Returns the decimal value of the given character unit.
View Source
var BitsInByte = [256]uint{}/* 256 elements not displayed */
View Source
var L1 uint = 32 * 32

*

Fixed values for the L1 and L2 table sizes in the Rank Directory
View Source
var L2 uint = 32
View Source
var MaskTop = [7]uint{
	0x3f, 0x1f, 0x0f, 0x07, 0x03, 0x01, 0x00,
}
View Source
var W uint = 6

*

The width of each unit of the encoding, in bits. Here we use 6, for base-64
encoding.

Functions

func CHR

func CHR(id uint) string

*

Returns the character unit that represents the given value. If this were
binary data, we would simply return id.

func ORD

func ORD(ch string) uint

Types

type BitString

type BitString struct {
	// contains filtered or unexported fields
}

*

Given a string of data (eg, in BASE-64), the BitString class supports
reading or counting a number of bits from an arbitrary position in the
string.

func (*BitString) Count

func (bs *BitString) Count(p, n uint) uint

*

Counts the number of bits set to 1 starting at position p and
ending at position p + n

func (*BitString) Get

func (bs *BitString) Get(p, n uint) uint

*

Returns a decimal number, consisting of a certain number, n, of bits
starting at a certain position, p.

func (*BitString) GetData

func (bs *BitString) GetData() string

*

Returns the internal string of bytes

func (*BitString) Init

func (bs *BitString) Init(data string)

func (*BitString) Rank

func (bs *BitString) Rank(x uint) uint

*

Returns the number of bits set to 1 up to and including position x.
This is the slow implementation used for testing.

type BitWriter

type BitWriter struct {
	// contains filtered or unexported fields
}

*

The BitWriter will create a stream of bytes, letting you write a certain
number of bits at a time. This is part of the encoder, so it is not
optimized for memory or speed.

func (*BitWriter) GetData

func (bw *BitWriter) GetData() string

*

Get the bitstring represented as a javascript string of bytes

func (*BitWriter) GetDebugString

func (bw *BitWriter) GetDebugString(group uint) string

*

Returns the bits as a human readable binary string for debugging

func (*BitWriter) Write

func (bw *BitWriter) Write(data, numBits uint)

*

Write some data to the bit string. The number of bits must be 32 or
fewer.

type FrozenTrie

type FrozenTrie struct {
	// contains filtered or unexported fields
}

*

The FrozenTrie is used for looking up words in the encoded trie.

@param data A string representing the encoded trie.

@param directoryData A string representing the RankDirectory. The global L1
and L2 constants are used to determine the L1Size and L2size.

@param nodeCount The number of nodes in the trie.

func (*FrozenTrie) GetNodeByIndex

func (f *FrozenTrie) GetNodeByIndex(index uint) FrozenTrieNode

*

Retrieve the FrozenTrieNode of the trie, given its index in level-order.
This is a private function that you don't have to use.

func (*FrozenTrie) GetRoot

func (f *FrozenTrie) GetRoot() FrozenTrieNode

*

Retrieve the root node. You can use this node to obtain all of the other
nodes in the trie.

func (*FrozenTrie) Init

func (f *FrozenTrie) Init(data, directoryData string, nodeCount uint)

func (*FrozenTrie) Lookup

func (f *FrozenTrie) Lookup(word string) bool

*

Look-up a word in the trie. Returns true if and only if the word exists
in the trie.

type FrozenTrieNode

type FrozenTrieNode struct {
	// contains filtered or unexported fields
}

*

This class is used for traversing the succinctly encoded trie.

func (*FrozenTrieNode) GetChild

func (f *FrozenTrieNode) GetChild(index uint) FrozenTrieNode

*

Returns the FrozenTrieNode for the given child.

@param index The 0-based index of the child of this node. For example, if
the node has 5 children, and you wanted the 0th one, pass in 0.

func (*FrozenTrieNode) GetChildCount

func (f *FrozenTrieNode) GetChildCount() uint

*

Returns the number of children.

type RankDirectory

type RankDirectory struct {
	// contains filtered or unexported fields
}

*

The rank directory allows you to build an index to quickly compute the
rank() and select() functions. The index can itself be encoded as a binary
string.

func CreateRankDirectory

func CreateRankDirectory(data string, numBits, l1Size, l2Size uint) RankDirectory

*

Used to build a rank directory from the given input string.

@param data A javascript string containing the data, as readable using the
BitString object.

@param numBits The number of bits to index.

@param l1Size The number of bits that each entry in the Level 1 table
summarizes. This should be a multiple of l2Size.

@param l2Size The number of bits that each entry in the Level 2 table
summarizes.

func (*RankDirectory) GetData

func (rd *RankDirectory) GetData() string

*

Returns the string representation of the directory.

func (*RankDirectory) Init

func (rd *RankDirectory) Init(directoryData, bitData string, numBits, l1Size, l2Size uint)

func (*RankDirectory) Rank

func (rd *RankDirectory) Rank(which, x uint) uint

*

Returns the number of 1 or 0 bits (depending on the "which" parameter) to
to and including position x.

func (*RankDirectory) Select

func (rd *RankDirectory) Select(which, y uint) uint

*

Returns the position of the y'th 0 or 1 bit, depending on the "which"
parameter.

type Trie

type Trie struct {
	// contains filtered or unexported fields
}

func (*Trie) Apply

func (t *Trie) Apply(fn func(*TrieNode))

*

Apply a function to each node, traversing the trie in level order.

func (*Trie) Encode

func (t *Trie) Encode() string

*

Encode the trie and all of its nodes. Returns a string representing the
encoded data.

func (*Trie) GetNodeCount

func (t *Trie) GetNodeCount() uint

*

Returns the number of nodes in the trie

func (*Trie) Init

func (t *Trie) Init()

func (*Trie) Insert

func (t *Trie) Insert(word string)

*

Inserts a word into the trie. This function is fastest if the words are
inserted in alphabetical order.

type TrieNode

type TrieNode struct {
	// contains filtered or unexported fields
}

*

A Trie node, for use in building the encoding trie. This is not needed for
the decoder.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL