vecstring

package module
v0.0.0-...-51b53e6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 13, 2014 License: MIT Imports: 2 Imported by: 1

README

vecstring

vecstring is a Go library for a vector representation of strings.

Conceptually, vecstring represents a vector V[0...num), and each value V[i] represents a string of variable length. Strings are concatenated in a byte array, and each offset and its length are encoded by unary coding, and stored using rsdic package.

Since rsdic supports constant time decode of unary code (Select and RunZeros) the offset and its length can be decoded in constant time, and vecstring can retrieve any string in constant time.

For strings of total lengths = TotalLen, VecString stores them in at most TotalLen + Num * (2 + log_2 TotalLen / Num) / 8 bytes.

Usage

import "github.com/hillbig/vecstring"

vs := vecstring.New()  // vs represents a vector of string
vs.PushBack("Hello") // Push String
vs.PushBack("New")
vs.PushBack("World")

fmt.Printf("%d %d\n", vs.Num(), vs.TotalLen()) // 3 13

for i := 0; i < vs.Num(); i++ {
	fmt.Printf("%d:%s\n", i, vs.Get(i))
}
// Hello
// New
// World

bytes, err := vs.MarshalBinary() // Encode to binary representation
newvs := vecstring.New()
err := newvs.UnmarshalBinary(bytes) // Decode from binary presentation

Documentation

Overview

package vecstring provides a space-efficient representation of a vector of strings of variable length.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type VecString

type VecString interface {
	// Get returns V[ind]
	Get(ind uint64) string

	// Find returns true if c is found in V[ind], and returns false otherwise.
	Find(ind uint64, c byte) bool

	// Num returns the number of strings
	Num() uint64

	// TotalLen returns the total length of strings
	TotalLen() uint64

	// ExactMatch returns true if V[ind] == str, and returns false otherwise
	ExactMatch(ind uint64, str string) bool

	// PrefixMatch returns true if V[ind] == Prefix of str, and returns false otherwise
	PrefixMatch(ind uint64, str string) (uint64, bool)

	// LenAndOffset returns the length of V[ind] and len(V[0]+...+len(V[ind-1])
	LenAndOffset(ind uint64) (uint64, uint64)

	// MarshalBinary encodes VecString into a binary form and returns the result.
	MarshalBinary() ([]byte, error)

	// UnmarshalBinary decodes the FixVec form a binary from generated MarshalBinary
	UnmarshalBinary([]byte) error

	// PushBack set V[Num] = v, and Num++
	PushBack(v string)
}

VecString represents a vector V[0...Num), and each value V[i] represents a string Internally, VecString are stored in space-efficient way. Strings are concatenated in a byte array, and each offset, length information is encoded in unary coding, and represented by rsdic to achieve further space reduction. // For strings of total lengths = TotalLen, VecString stores them in at most TotalLen + Num * (2 + log_2 TotalLen / Num) / 8 bytes.

func New

func New() VecString

New returns VecString with 0 strings.

type VecStringForWX

type VecStringForWX interface {
	VecString

	// GetByte returns the (offset+1)-th byte in the concatenated strings.
	GetByte(offset uint64) byte

	// FindZeroRank finds c in V[ind], and returns (offset, true) if found,
	// and returns (0, false) otherwise.
	FindZeroRank(ind uint64, c byte) (uint64, bool)

	// IthCharInd returns the index corresponding to i-th child.
	// lens.Rank(lens.Select(i-1, false), true) - 1
	IthCharInd(i uint64) uint64
}

VecStringForWX represents a vector of strings, and provides extended interface for library WX.

func NewForWX

func NewForWX() VecStringForWX

New returns VecStringForWX with 0 strings.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL