indexer

v0.0.0-...-4eef59b
Published: Feb 13, 2018 License: Apache-2.0 Imports: 23 Imported by: 0

README

indexer

Indexing library written in Go, similar to Lucene (https://lucene.apache.org/core/) and Bleve (https://github.com/blevesearch/bleve).

It supports numerical fields and text fields. A numerical value can be a multi-dimensional uint64 point. A text value can be a UTF-8 string, which is tokenized by splitting the text on UTF-8 whitespace characters.

Documentation

https://godoc.org/github.com/deepfabric/indexer

Documentation

Constants

const (
	MaxUint = ^uint(0)
	MinUint = 0
	MaxInt  = int(MaxUint >> 1)
	MinInt  = -MaxInt - 1
)
const (
	// DefaultIndexerMaxOpN is the default value for Indexer.MaxOpN.
	DefaultIndexerMaxOpN = uint64(1000000)
)
const (
	LiveDocs string = "__liveDocs" // the directory that stores Index.liveDocs
)
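
The first constant block derives its limits from bit operations instead of fixed literals, so the values follow the platform's word size. The standalone sketch below repeats those four definitions and compares them with the standard library's `math` constants (available since Go 1.17). It is an illustration only, not code taken from the package.

```go
package main

import (
	"fmt"
	"math"
)

const (
	MaxUint = ^uint(0)          // all bits set
	MinUint = 0                 // smallest unsigned value
	MaxInt  = int(MaxUint >> 1) // drop the top bit to get the largest signed value
	MinInt  = -MaxInt - 1       // two's complement lower bound
)

func main() {
	// Each comparison prints true on both 32-bit and 64-bit platforms.
	fmt.Println(MaxUint == math.MaxUint)
	fmt.Println(MaxInt == math.MaxInt)
	fmt.Println(MinInt == math.MinInt)
}
```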

Variables

var (
	ErrUnknownProp = errors.New("unknown property")
	ErrDocExist    = errors.New("document already exist")
)
var (
	ErrIdxExist    = errors.New("index already exist")
	ErrIdxNotExist = errors.New("index not exist")
)

Functions

func CopyDir

func CopyDir(src string, dst string) (err error)

CopyDir recursively copies a directory tree, attempting to preserve permissions. Source directory must exist, destination directory must *not* exist. Symlinks are ignored and skipped.

func CopyFile

func CopyFile(src, dst string) (err error)

CopyFile copies the contents of the file named src to the file named dst. The file is created if it does not already exist. If the destination file exists, all its contents are replaced by the contents of the source file. The file mode is copied from the source, and the copied data is synced/flushed to stable storage.
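
The behavior described for CopyFile (create or truncate, copy the mode, sync to disk) can be sketched with the standard library alone. `copyFileSketch` and `demo` are hypothetical names introduced for this illustration; the package's actual implementation may differ.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFileSketch is a hypothetical stand-in for CopyFile, not the package's code.
func copyFileSketch(src, dst string) (err error) {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	info, err := in.Stat()
	if err != nil {
		return err
	}

	// O_CREATE makes the file if it is missing; O_TRUNC discards old contents.
	out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, info.Mode().Perm())
	if err != nil {
		return err
	}
	defer func() {
		// Report a Close error only if nothing failed earlier.
		if cerr := out.Close(); err == nil {
			err = cerr
		}
	}()

	if _, err = io.Copy(out, in); err != nil {
		return err
	}
	// Flush the copied data to stable storage.
	if err = out.Sync(); err != nil {
		return err
	}
	// Chmod covers the case where dst already existed with a different mode.
	return os.Chmod(dst, info.Mode().Perm())
}

// demo copies a short file over a longer existing one and returns the result.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "copyfile")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "src.txt")
	dst := filepath.Join(dir, "dst.txt")
	if err := os.WriteFile(src, []byte("hello"), 0o600); err != nil {
		return "", err
	}
	if err := os.WriteFile(dst, []byte("old and longer contents"), 0o644); err != nil {
		return "", err
	}
	if err := copyFileSketch(src, dst); err != nil {
		return "", err
	}
	data, err := os.ReadFile(dst)
	return string(data), err
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```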

func ParseWords

func ParseWords(text string) (words []string)

ParseWords parses text (encoded in UTF-8) for words. A word is either a lowercased sequence of ASCII characters that contains no ASCII space, or a single non-ASCII character that is neither a Unicode space nor a Chinese punctuation mark. Note: words are not de-duplicated.
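
The tokenization rule stated for ParseWords can be approximated in a few lines. This is a minimal sketch under two assumptions: `parseWordsSketch` is a hypothetical name, and `unicode.IsPunct` stands in for the "Chinese punctuation" test, whose exact character set is not given here.

```go
package main

import (
	"fmt"
	"unicode"
)

// parseWordsSketch approximates the documented rule; it is not the package's code.
func parseWordsSketch(text string) []string {
	var words []string
	var cur []rune // the ASCII word currently being collected

	flush := func() {
		if len(cur) > 0 {
			words = append(words, string(cur))
			cur = cur[:0]
		}
	}

	for _, r := range text {
		switch {
		case r <= unicode.MaxASCII && unicode.IsSpace(r):
			flush() // ASCII whitespace ends the current word
		case r <= unicode.MaxASCII:
			cur = append(cur, unicode.ToLower(r))
		default:
			flush() // a non-ASCII character also ends an ASCII word
			// Assumption: IsPunct approximates the Chinese punctuation check.
			if !unicode.IsSpace(r) && !unicode.IsPunct(r) {
				words = append(words, string(r))
			}
		}
	}
	flush()
	return words
}

func main() {
	fmt.Printf("%q\n", parseWordsSketch("Hello World 你好，世界 hello"))
	// prints ["hello" "world" "你" "好" "世" "界" "hello"]
}
```

The repeated "hello" in the output reflects the note that words are not de-duplicated.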

Types

type Index

type Index struct {
	MainDir string
	DocProt *cql.DocumentWithIdx //document prototype. persisted to an index-specific file
	// contains filtered or unexported fields
}

Index is created by CqlCreate

func NewIndex

func NewIndex(docProt *cql.DocumentWithIdx, mainDir string) (ind *Index, err error)

NewIndex creates an index according to the given conf, overwriting any existing files.

func NewIndexExt

func NewIndexExt(mainDir, name string) (ind *Index, err error)

NewIndexExt creates an index from existing files.

func (*Index) Close

func (ind *Index) Close() (err error)

Close closes index

func (*Index) Del

func (ind *Index) Del(docID uint64) (found bool, err error)

Del executes CqlDel. It only marks the document as deleted; the caller must rebuild the index to reclaim disk space.
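
Mark-deletion can be pictured as a live-document set consulted at query time while the stored postings stay untouched. The sketch below is a toy model under that assumption; `markDeleteIndex` and its methods are hypothetical and do not reflect how Index stores liveDocs internally.

```go
package main

import "fmt"

// markDeleteIndex is a hypothetical toy index that never rewrites postings on delete.
type markDeleteIndex struct {
	postings map[string][]uint64 // term -> doc IDs, append-only
	live     map[uint64]bool     // set of documents not yet deleted
}

func newMarkDeleteIndex() *markDeleteIndex {
	return &markDeleteIndex{
		postings: make(map[string][]uint64),
		live:     make(map[uint64]bool),
	}
}

func (ix *markDeleteIndex) insert(docID uint64, terms ...string) {
	ix.live[docID] = true
	for _, t := range terms {
		ix.postings[t] = append(ix.postings[t], docID)
	}
}

// del only clears the live flag; the postings keep the stale doc ID.
func (ix *markDeleteIndex) del(docID uint64) (found bool) {
	found = ix.live[docID]
	delete(ix.live, docID)
	return found
}

// query filters the postings through the live set.
func (ix *markDeleteIndex) query(term string) []uint64 {
	var out []uint64
	for _, id := range ix.postings[term] {
		if ix.live[id] {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	ix := newMarkDeleteIndex()
	ix.insert(1, "go")
	ix.insert(2, "go")
	fmt.Println(ix.del(1), ix.del(1))                   // true false
	fmt.Println(ix.query("go"), len(ix.postings["go"])) // [2] 2
}
```

The postings list still has two entries after the delete, which is why a rebuild is needed before any space is recovered.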

func (*Index) Destroy

func (ind *Index) Destroy() (err error)

Destroy removes data and conf files on disk.

func (*Index) GetDocIDFragList

func (ind *Index) GetDocIDFragList() (numList []uint64)

GetDocIDFragList returns DocID fragment list. Each fragment's size is pilosa.SliceWidth
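
A fragment number follows from a document ID by integer division with the slice width. The sketch below assumes a width of 1<<20 columns, which is the value pilosa.SliceWidth is believed to have; `sliceWidth` and `fragList` are hypothetical names, and sorting the result is a choice made here for deterministic output.

```go
package main

import (
	"fmt"
	"sort"
)

// sliceWidth is an assumed stand-in for pilosa.SliceWidth (1<<20 columns per slice).
const sliceWidth = 1 << 20

// fragList returns the distinct fragment numbers touched by the given doc IDs.
func fragList(docIDs []uint64) []uint64 {
	seen := make(map[uint64]struct{})
	for _, id := range docIDs {
		seen[id/sliceWidth] = struct{}{}
	}
	out := make([]uint64, 0, len(seen))
	for n := range seen {
		out = append(out, n)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	fmt.Println(fragList([]uint64{5, 1048575, 1048576, 3 << 20})) // [0 1 3]
}
```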

func (*Index) Insert

func (ind *Index) Insert(doc *cql.DocumentWithIdx) (err error)

Insert executes CqlInsert

func (*Index) Open

func (ind *Index) Open() (err error)

Open opens an existing index. It assumes MainDir and DocProt are already populated.

func (*Index) Select

func (ind *Index) Select(q *cql.CqlSelect) (qr *QueryResult, err error)

Select executes CqlSelect.

func (*Index) Sync

func (ind *Index) Sync() (err error)

Sync synchronizes index to disk

type Indexer

type Indexer struct {
	MainDir string // the main directory that stores all indices
	// Number of operations performed before performing a snapshot.
	MaxOpN uint64
	// contains filtered or unexported fields
}

Indexer should be used as a singleton.
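
The MaxOpN field describes a count of operations after which a snapshot is taken. One plausible reading is a simple counter that triggers and resets, sketched below. `opCounter` is a hypothetical type, and the exact trigger condition inside Indexer is an assumption.

```go
package main

import "fmt"

// opCounter is a hypothetical model of "snapshot every maxOpN operations".
type opCounter struct {
	maxOpN    uint64 // threshold, in the role of Indexer.MaxOpN
	opN       uint64 // operations since the last snapshot
	snapshots int    // number of snapshots taken so far
}

// record counts one operation and reports whether it triggered a snapshot.
func (c *opCounter) record() bool {
	c.opN++
	if c.opN < c.maxOpN {
		return false
	}
	c.snapshots++ // a real implementation would persist state here
	c.opN = 0
	return true
}

func main() {
	c := &opCounter{maxOpN: 3}
	for i := 0; i < 7; i++ {
		c.record()
	}
	fmt.Println(c.snapshots, c.opN) // 2 1
}
```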

func NewIndexer

func NewIndexer(mainDir string, overwirte bool, enableWal bool) (ir *Indexer, err error)

NewIndexer creates an Indexer.

func (*Indexer) ApplySnapshot

func (ir *Indexer) ApplySnapshot(snapDir string) (err error)

func (*Indexer) Close

func (ir *Indexer) Close() (err error)

Close closes the indexer.

func (*Indexer) CreateIndex

func (ir *Indexer) CreateIndex(docProt *cql.DocumentWithIdx) (err error)

CreateIndex creates an index.

func (*Indexer) CreateSnapshot

func (ir *Indexer) CreateSnapshot(snapDir string) (numList []uint64, err error)

func (*Indexer) Del

func (ir *Indexer) Del(idxName string, docID uint64) (found bool, err error)

Del executes CqlDel. The given index is allowed not to exist.

func (*Indexer) Destroy

func (ir *Indexer) Destroy() (err error)

Destroy closes the indexer and removes the index files.

func (*Indexer) DestroyIndex

func (ir *Indexer) DestroyIndex(name string) (err error)

DestroyIndex destroys the given index.

func (*Indexer) GetDocIDFragList

func (ir *Indexer) GetDocIDFragList() (numList []uint64)

func (*Indexer) GetDocProt

func (ir *Indexer) GetDocProt(name string) (docProt *cql.DocumentWithIdx)

GetDocProt returns docProt of given index

func (*Indexer) GetDocProts

func (ir *Indexer) GetDocProts() (sdump string)

GetDocProts dumps docProts

func (*Indexer) Insert

func (ir *Indexer) Insert(doc *cql.DocumentWithIdx) (err error)

Insert executes CqlInsert

func (*Indexer) Open

func (ir *Indexer) Open() (err error)

Open opens all indices. Assumes ir.MainDir is already populated.

func (*Indexer) Select

func (ir *Indexer) Select(q *cql.CqlSelect) (qr *QueryResult, err error)

Select executes CqlSelect.

func (*Indexer) Summary

func (ir *Indexer) Summary() (sum string, err error)

Summary returns a summary of all indices.

func (*Indexer) Sync

func (ir *Indexer) Sync() (err error)

Sync synchronizes index to disk

func (*Indexer) WriteMeta

func (ir *Indexer) WriteMeta() (err error)

WriteMeta persists Conf and DocProts to files.

type IntFrame

type IntFrame struct {
	// contains filtered or unexported fields
}

IntFrame represents an integer field of an index. Refers to pilosa.Frame and pilosa.View.

func NewIntFrame

func NewIntFrame(path, index, name string, bitDepth uint, overwrite bool) (f *IntFrame, err error)

NewIntFrame returns a new instance of frame, and initializes it.

func (*IntFrame) BitDepth

func (f *IntFrame) BitDepth() uint

BitDepth returns the bit depth the frame was initialized with.

func (*IntFrame) Close

func (f *IntFrame) Close() (err error)

Close closes all fragments without removing files on disk. It's allowed to invoke Close multiple times.

func (*IntFrame) Destroy

func (f *IntFrame) Destroy() (err error)

Destroy closes all fragments, removes all files on disk. It's allowed to invoke Close before or after Destroy.

func (*IntFrame) DoIndex

func (f *IntFrame) DoIndex(docID uint64, val uint64) (err error)

DoIndex parses and indexes a field.

func (*IntFrame) FragmentPath

func (f *IntFrame) FragmentPath(slice uint64) string

FragmentPath returns the path to a fragment

func (*IntFrame) GetFragList

func (f *IntFrame) GetFragList() (numList []uint64)

GetFragList returns fragments' numbers

func (*IntFrame) GetValue

func (f *IntFrame) GetValue(docID uint64) (val uint64, exists bool, err error)

GetValue returns value of a column within the frame.

func (*IntFrame) Index

func (f *IntFrame) Index() string

Index returns the index name the frame was initialized with.

func (*IntFrame) Name

func (f *IntFrame) Name() string

Name returns the name the frame was initialized with.

func (*IntFrame) Open

func (f *IntFrame) Open() (err error)

Open opens an existing frame

func (*IntFrame) Path

func (f *IntFrame) Path() string

Path returns the path the frame was initialized with.

func (*IntFrame) QueryRange

func (f *IntFrame) QueryRange(op pql.Token, predicate uint64) (bm *pilosa.Bitmap, err error)

QueryRange queries which documents have a value that satisfies the comparison given by op and predicate.

func (*IntFrame) QueryRangeBetween

func (f *IntFrame) QueryRangeBetween(predicateMin, predicateMax uint64) (bm *pilosa.Bitmap, err error)

QueryRangeBetween queries which documents have a value between predicateMin and predicateMax.
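
The bitDepth parameter suggests a bit-sliced layout, where bit i of every value lives in its own plane of document IDs. The sketch below models DoIndex and GetValue that way. It is a simplification under stated assumptions: `bsiSketch` is hypothetical, the bounds of `rangeBetween` are treated as inclusive, and the range query reconstructs each value one by one, whereas a real bit-sliced index would combine the planes with bitmap operations.

```go
package main

import (
	"fmt"
	"sort"
)

// bsiSketch is a hypothetical bit-sliced integer field, not the package's IntFrame.
type bsiSketch struct {
	bitDepth uint
	planes   []map[uint64]bool // planes[i] holds the docs whose bit i is set
	exists   map[uint64]bool   // docs that have a value at all
}

func newBSISketch(bitDepth uint) *bsiSketch {
	planes := make([]map[uint64]bool, bitDepth)
	for i := range planes {
		planes[i] = make(map[uint64]bool)
	}
	return &bsiSketch{bitDepth: bitDepth, planes: planes, exists: make(map[uint64]bool)}
}

// doIndex stores val for docID, one bit per plane.
func (b *bsiSketch) doIndex(docID, val uint64) error {
	if b.bitDepth < 64 && val>>b.bitDepth != 0 {
		return fmt.Errorf("value %d does not fit in %d bits", val, b.bitDepth)
	}
	b.exists[docID] = true
	for i := uint(0); i < b.bitDepth; i++ {
		if val&(1<<i) != 0 {
			b.planes[i][docID] = true
		} else {
			delete(b.planes[i], docID)
		}
	}
	return nil
}

// getValue rebuilds the value of docID from the planes.
func (b *bsiSketch) getValue(docID uint64) (val uint64, exists bool) {
	if !b.exists[docID] {
		return 0, false
	}
	for i := uint(0); i < b.bitDepth; i++ {
		if b.planes[i][docID] {
			val |= 1 << i
		}
	}
	return val, true
}

// rangeBetween returns the sorted docs whose value lies in [lo, hi] (naive scan).
func (b *bsiSketch) rangeBetween(lo, hi uint64) []uint64 {
	var out []uint64
	for id := range b.exists {
		if v, _ := b.getValue(id); v >= lo && v <= hi {
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	f := newBSISketch(4)
	f.doIndex(1, 5)
	f.doIndex(2, 12)
	f.doIndex(3, 7)
	fmt.Println(f.getValue(2))        // 12 true
	fmt.Println(f.rangeBetween(5, 7)) // [1 3]
	fmt.Println(f.doIndex(4, 16) != nil) // true: 16 needs 5 bits
}
```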

func (*IntFrame) Sync

func (f *IntFrame) Sync() (err error)

Sync synchronizes storage bitmap to disk and reopens it.

type QueryResult

type QueryResult struct {
	Bm *pilosa.Bitmap               // used when no OrderBy given
	Oa *datastructures.OrderedArray // used when OrderBy given
}

QueryResult holds the result of a query.

func NewQueryResult

func NewQueryResult(limit int) (qr *QueryResult)

NewQueryResult creates an empty QueryResult

func (*QueryResult) Merge

func (qr *QueryResult) Merge(other *QueryResult)

Merge merges other into qr; other is left unchanged.

type TermDict

type TermDict struct {
	Dir string
	// contains filtered or unexported fields
}

TermDict stores terms in a map. Note that the term dict is insertion-only.

func NewTermDict

func NewTermDict(directory string, overwrite bool) (td *TermDict, err error)

NewTermDict creates and initializes a term dict

func (*TermDict) Close

func (td *TermDict) Close() (err error)

Close clears the in-memory dictionary and closes the file.

func (*TermDict) Count

func (td *TermDict) Count() (cnt uint64)

Count returns the count of terms

func (*TermDict) CreateTermIfNotExist

func (td *TermDict) CreateTermIfNotExist(term string) (id uint64, err error)

CreateTermIfNotExist gets the id of the given term, inserting the term implicitly if it is not in the dict.

func (*TermDict) CreateTermsIfNotExist

func (td *TermDict) CreateTermsIfNotExist(terms []string) (ids []uint64, err error)

CreateTermsIfNotExist is the bulk version of CreateTermIfNotExist.
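
An insertion-only dictionary of this kind can be modeled as a map from term to numeric ID, with lookups that never remove entries. In the sketch below, `termDictSketch` is a hypothetical in-memory type; the dense, zero-based ID scheme is an assumption, and the on-disk persistence of the real TermDict is left out.

```go
package main

import "fmt"

// termDictSketch is a hypothetical in-memory, insertion-only term dictionary.
type termDictSketch struct {
	ids map[string]uint64
}

func newTermDictSketch() *termDictSketch {
	return &termDictSketch{ids: make(map[string]uint64)}
}

// createTermIfNotExist returns the term's ID, assigning a new one when needed.
func (td *termDictSketch) createTermIfNotExist(term string) uint64 {
	if id, ok := td.ids[term]; ok {
		return id
	}
	id := uint64(len(td.ids)) // assumption: IDs are dense and start at 0
	td.ids[term] = id
	return id
}

// createTermsIfNotExist is the bulk form; duplicates map to the same ID.
func (td *termDictSketch) createTermsIfNotExist(terms []string) []uint64 {
	ids := make([]uint64, len(terms))
	for i, t := range terms {
		ids[i] = td.createTermIfNotExist(t)
	}
	return ids
}

// getTermID looks a term up without inserting it.
func (td *termDictSketch) getTermID(term string) (uint64, bool) {
	id, ok := td.ids[term]
	return id, ok
}

func main() {
	td := newTermDictSketch()
	fmt.Println(td.createTermsIfNotExist([]string{"go", "index", "go"})) // [0 1 0]
	_, found := td.getTermID("missing")
	fmt.Println(found, len(td.ids)) // false 2
}
```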

func (*TermDict) Destroy

func (td *TermDict) Destroy() (err error)

Destroy clears the dictionary in memory and on disk.

func (*TermDict) GetTermID

func (td *TermDict) GetTermID(term string) (id uint64, found bool)

GetTermID gets the id of the given term.

func (*TermDict) Open

func (td *TermDict) Open() (err error)

Open opens an existing term dict

func (*TermDict) Sync

func (td *TermDict) Sync() (err error)

Sync synchronizes terms to disk

type TextFrame

type TextFrame struct {
	// contains filtered or unexported fields
}

TextFrame represents a string field of an index. Refers to pilosa.Frame and pilosa.View.

func NewTextFrame

func NewTextFrame(path, index, name string, overwrite bool) (f *TextFrame, err error)

NewTextFrame returns a new instance of frame, and initializes it.

func (*TextFrame) Bits

func (f *TextFrame) Bits() (bits map[uint64][]uint64, err error)

Bits returns bits set in frame.

func (*TextFrame) Close

func (f *TextFrame) Close() (err error)

Close closes all fragments without removing files on disk. It's allowed to invoke Close multiple times.

func (*TextFrame) Count

func (f *TextFrame) Count() (cnt uint64, err error)

Count returns number of bits set in frame.

func (*TextFrame) Destroy

func (f *TextFrame) Destroy() (err error)

Destroy closes all fragments, removes all files on disk. It's allowed to invoke Close before or after Destroy.

func (*TextFrame) DoIndex

func (f *TextFrame) DoIndex(docID uint64, text string) (err error)

DoIndex parses and indexes a field.

func (*TextFrame) FragmentPath

func (f *TextFrame) FragmentPath(slice uint64) string

FragmentPath returns the path to a fragment

func (*TextFrame) GetFragList

func (f *TextFrame) GetFragList() (numList []uint64)

GetFragList returns fragments' numbers

func (*TextFrame) Index

func (f *TextFrame) Index() string

Index returns the index name the frame was initialized with.

func (*TextFrame) Name

func (f *TextFrame) Name() string

Name returns the name the frame was initialized with.

func (*TextFrame) Open

func (f *TextFrame) Open() (err error)

Open opens an existing frame

func (*TextFrame) Path

func (f *TextFrame) Path() string

Path returns the path the frame was initialized with.

func (*TextFrame) Query

func (f *TextFrame) Query(text string) (bm *pilosa.Bitmap)

Query queries which documents contain the given term.
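
DoIndex and Query on a text field together behave like an inverted index: indexing records which documents contain each term, and a query reads that set back. The sketch below is a minimal model of that idea. `textFrameSketch` is hypothetical, it uses `strings.Fields` with lowercasing as a stand-in tokenizer, and it returns a sorted slice where the real method returns a *pilosa.Bitmap.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// textFrameSketch is a hypothetical inverted index: term -> set of doc IDs.
type textFrameSketch struct {
	postings map[string]map[uint64]struct{}
}

func newTextFrameSketch() *textFrameSketch {
	return &textFrameSketch{postings: make(map[string]map[uint64]struct{})}
}

// doIndex tokenizes text on whitespace and records docID under each term.
func (f *textFrameSketch) doIndex(docID uint64, text string) {
	for _, w := range strings.Fields(strings.ToLower(text)) {
		set, ok := f.postings[w]
		if !ok {
			set = make(map[uint64]struct{})
			f.postings[w] = set
		}
		set[docID] = struct{}{}
	}
}

// query returns the sorted IDs of documents containing the term.
func (f *textFrameSketch) query(term string) []uint64 {
	set := f.postings[strings.ToLower(term)]
	out := make([]uint64, 0, len(set))
	for id := range set {
		out = append(out, id)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	f := newTextFrameSketch()
	f.doIndex(1, "Go indexing library")
	f.doIndex(2, "Lucene is a Java library")
	f.doIndex(3, "go GO go")
	fmt.Println(f.query("library")) // [1 2]
	fmt.Println(f.query("go"))      // [1 3]
	fmt.Println(f.query("rust"))    // []
}
```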

func (*TextFrame) Sync

func (f *TextFrame) Sync() (err error)

Sync synchronizes storage bitmap to disk and reopens it.

Directories

Path     Synopsis
cmd
cql      Package cql is a generated protocol buffer package.
test
wal
walpb    Package walpb is a generated protocol buffer package.
