riot

package module

v0.0.0-...-fee7c4b (latest) | Published: Jan 22, 2018 | License: Apache-2.0 | Imports: 24 | Imported by: 0

Repository

github.com/shaorxcn/riot

README ¶

riot full text search engine

Simplified Chinese

Efficient indexing and search (1M blog posts, 500M of data, indexed in 28 seconds; 1.65 ms search response time; 19K search QPS)
Supports logical search
Supports Chinese word segmentation (uses the gse word segmentation package with concurrent segmentation, at 27 MB/s)
Supports computing how close the search keywords are to each other in the text (token proximity)
Supports BM25 relevance scoring
Supports custom scoring fields and scoring rules
Supports adding to and deleting from the index online
Supports heartbeat
Supports multiple persistent storage backends
Supports distributed indexing and search

See the word segmentation rules.
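
To make the BM25 feature above concrete, here is a minimal, self-contained sketch of the per-term BM25 score. It is an illustration only and not riot's actual ranking code: the function name bm25Term, its parameters, and the k1 and b values are assumptions chosen for the example, and the IDF uses the common non-negative variant log(1 + (N - n + 0.5) / (n + 0.5)).

```go
package main

import (
	"fmt"
	"math"
)

// bm25Term returns the BM25 contribution of a single query term to one
// document's score. All names here are hypothetical, not part of riot.
//
// tf: occurrences of the term in the document
// docLen, avgDocLen: document length and average document length, in tokens
// numDocs: documents in the collection; docFreq: documents containing the term
// k1, b: the usual BM25 free parameters
func bm25Term(tf, docLen, avgDocLen float64, numDocs, docFreq int, k1, b float64) float64 {
	if tf == 0 || numDocs == 0 || avgDocLen == 0 {
		return 0
	}
	n, df := float64(numDocs), float64(docFreq)
	// Non-negative IDF variant: rarer terms get a larger weight.
	idf := math.Log(1 + (n-df+0.5)/(df+0.5))
	// Term-frequency saturation with document-length normalization.
	norm := tf * (k1 + 1) / (tf + k1*(1-b+b*docLen/avgDocLen))
	return idf * norm
}

func main() {
	// Hypothetical collection: 3 documents, the term occurs in 1 of them.
	short := bm25Term(2, 10, 12, 3, 1, 2.0, 0.75)
	long := bm25Term(2, 30, 12, 3, 1, 2.0, 0.75)
	fmt.Printf("short doc: %.4f, long doc: %.4f\n", short, long)
}
```

With the same term frequency, the shorter document scores higher, which is the length normalization that b controls.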

Riot v0.10.0 was released in Nov 2017; see the Changelog for full details.

Requirements

Go version >= 1.8

Installation/Update

go get -u github.com/go-ego/riot

Build-tools

go get -u github.com/go-ego/re

re riot

To create a new riot application

$ re riot my-riotapp

re run

To run the application we just created, you can navigate to the application folder and execute:

$ cd my-riotapp && re run

Usage:

Look at an example

package main

import (
	"log"

	"github.com/go-ego/riot"
	"github.com/go-ego/riot/types"
)

var (
	// searcher is goroutine safe
	searcher = riot.Engine{}
)

func main() {
	// Init
	searcher.Init(types.EngineOpts{
		Using:       4,
		NotUsingGse: true})
	defer searcher.Close()

	text := "Google Is Experimenting With Virtual Reality Advertising"
	text1 := `Google accidentally pushed Bluetooth update for Home
	speaker early`
	text2 := `Google is testing another Search results layout with 
	rounded cards, new colors, and the 4 mysterious colored dots again`
	
	// Add the document to the index, docId starts at 1
	searcher.IndexDoc(1, types.DocIndexData{Content: text})
	searcher.IndexDoc(2, types.DocIndexData{Content: text1}, false)
	searcher.IndexDoc(3, types.DocIndexData{Content: text2}, true)

	// Wait for the index to refresh
	searcher.FlushIndex()

	// The search output format is found in the types.SearchResp structure
	log.Print(searcher.Search(types.SearchReq{Text:"google testing"}))
}

It is very simple!

Use default engine:

package main

import (
	"log"

	"github.com/go-ego/riot"
	"github.com/go-ego/riot/types"
)

var (
	searcher = riot.New("zh")
)

func main() {
	data := types.DocIndexData{Content: `I wonder how, I wonder why
		, I wonder where they are`}
	data1 := types.DocIndexData{Content: "留给真爱你的人"}
	data2 := types.DocIndexData{Content: "也没有理由"}
	searcher.IndexDoc(1, data)
	searcher.IndexDoc(2, data1)
	searcher.IndexDoc(3, data2)
	searcher.FlushIndex()

	req := types.SearchReq{Text: "真爱"}
	search := searcher.Search(req)
	log.Println("search...", search)
}

Look at more Examples

Look at Store example

Look at Logic search example

Look at Pinyin search example

Look at different dict and language search example

Look at benchmark example

Riot search engine templates, client and dictionaries

Donate

Support riot: buy me a coffee.

Paypal

Donate via PayPal to my account: vzvway@gmail.com

License

Riot is primarily distributed under the terms of the Apache License (Version 2.0) and is based on wukong.

Documentation ¶

Overview ¶

Package riot is the riot engine, a full text search engine.

Index ¶

Constants
func GetVersion() string
func Try(fun func(), handler func(interface{}))
type Engine
- func New(dict ...string) *Engine
type Map
type StopTokens
- func (st *StopTokens) Init(stopTokenFile string)
- func (st *StopTokens) IsStopToken(token string) bool

Constants ¶

const (

	// NumNanosecondsInAMillisecond is the number of nanoseconds in a millisecond
	NumNanosecondsInAMillisecond = 1000000
	// StorageFilePrefix is the persistent storage file prefix
	StorageFilePrefix = "riot"
)
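
As a small illustration of how a constant like NumNanosecondsInAMillisecond is typically used, the sketch below converts an elapsed time in nanoseconds to fractional milliseconds. The local constant mirrors the documented value, and the helper nanosToMillis is hypothetical; how riot itself uses the constant is not shown in this page.

```go
package main

import (
	"fmt"
	"time"
)

// Local copy of the documented value, so the sketch is self-contained.
const numNanosecondsInAMillisecond = 1000000

// nanosToMillis converts a nanosecond count to fractional milliseconds.
// Hypothetical helper, not part of riot.
func nanosToMillis(ns int64) float64 {
	return float64(ns) / numNanosecondsInAMillisecond
}

func main() {
	elapsed := 1650 * time.Microsecond
	fmt.Println(nanosToMillis(elapsed.Nanoseconds())) // prints 1.65
}
```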

Functions ¶

func GetVersion ¶

func GetVersion() string

GetVersion returns the version.

func Try ¶

func Try(fun func(), handler func(interface{}))

Try runs fun and, if it panics, passes the recovered value to handler.
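
The signature suggests a recover-based wrapper. The sketch below shows one plausible way such a helper can be written and used; the lowercase try is a hypothetical stand-in written for this example, and it is an assumption that riot's Try behaves the same way.

```go
package main

import "fmt"

// try runs fun and, if fun panics, hands the recovered value to handler
// instead of letting the panic propagate. Hypothetical stand-in for Try.
func try(fun func(), handler func(interface{})) {
	defer func() {
		if err := recover(); err != nil {
			handler(err)
		}
	}()
	fun()
}

func main() {
	try(func() {
		var docs []string
		_ = docs[3] // index out of range on an empty slice: panics
	}, func(err interface{}) {
		fmt.Println("recovered:", err)
	})
	fmt.Println("still running")
}
```

The handler is only called when fun panics; a fun that returns normally leaves it untouched.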

Types ¶

type Engine ¶

type Engine struct {
	// contains filtered or unexported fields
}

Engine is the search engine.

func New ¶

func New(dict ...string) *Engine

New creates a new engine.

func (*Engine) CheckMem ¶

func (engine *Engine) CheckMem()

CheckMem checks memory usage; when usage exceeds 99.99%, the engine uses the persistent storage.

func (*Engine) Close ¶

func (engine *Engine) Close()

Close closes the engine.

func (*Engine) FlushIndex ¶

func (engine *Engine) FlushIndex()

FlushIndex blocks until all pending index additions have completed.

func (*Engine) ForSplitData ¶

func (engine *Engine) ForSplitData(splData []string, num int) (Map, int)

ForSplitData processes split segmentation data (segspl).

func (*Engine) GetAllDocIds ¶

func (engine *Engine) GetAllDocIds() []uint64

GetAllDocIds iterates over the storage database and returns all DocIds.

func (*Engine) GetAllIds ¶

func (engine *Engine) GetAllIds() []uint64

GetAllIds iterates over the storage database and returns all DocIds.

func (*Engine) IndexDoc ¶

func (engine *Engine) IndexDoc(docId uint64, data types.DocIndexData, forceUpdate ...bool)

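IndexDoc adds the document to the index.

Parameters:

docId	      identifies the document and must be unique; docId == 0 denotes an invalid document (used to force an index flush), [1, +oo) denotes valid documents
data	      see the DocIndexData comments
forceUpdate whether to force a cache flush; if true, the document is added to the index as soon as possible, otherwise it waits until the cache is full and everything is added in one batch

Notes:

This function is thread safe; call it concurrently where possible to speed up indexing.
This call is asynchronous: when it returns, the document may not have been added to the index yet, so an immediate Search may not find it. To force an index flush, call FlushIndex.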
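
The asynchronous behavior described in the notes can be illustrated with a toy index built on a channel and a sync.WaitGroup. This is a sketch of the general enqueue-then-flush pattern only: toyIndex and all of its methods are hypothetical names invented for this example and do not reflect riot's internals.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// doc is a queued indexing request.
type doc struct {
	id      uint64
	content string
}

// toyIndex is a hypothetical in-memory inverted index with async writes.
type toyIndex struct {
	mu       sync.RWMutex
	postings map[string][]uint64 // token -> ids of documents containing it
	queue    chan doc
	pending  sync.WaitGroup
}

func newToyIndex() *toyIndex {
	idx := &toyIndex{postings: map[string][]uint64{}, queue: make(chan doc, 16)}
	go idx.worker()
	return idx
}

// worker drains the queue and updates the postings lists.
func (idx *toyIndex) worker() {
	for d := range idx.queue {
		idx.mu.Lock()
		for _, tok := range strings.Fields(strings.ToLower(d.content)) {
			idx.postings[tok] = append(idx.postings[tok], d.id)
		}
		idx.mu.Unlock()
		idx.pending.Done()
	}
}

// indexDoc only enqueues the document; indexing happens in the worker.
func (idx *toyIndex) indexDoc(id uint64, content string) {
	idx.pending.Add(1)
	idx.queue <- doc{id: id, content: content}
}

// flush blocks until every enqueued document has been indexed.
func (idx *toyIndex) flush() { idx.pending.Wait() }

// search returns a copy of the ids of documents containing token.
func (idx *toyIndex) search(token string) []uint64 {
	idx.mu.RLock()
	defer idx.mu.RUnlock()
	return append([]uint64(nil), idx.postings[strings.ToLower(token)]...)
}

func main() {
	idx := newToyIndex()
	idx.indexDoc(1, "Google Is Experimenting With Virtual Reality Advertising")
	idx.indexDoc(2, "Google accidentally pushed Bluetooth update")
	idx.flush() // without this, search could run before indexing finishes
	fmt.Println(idx.search("google")) // prints [1 2]
}
```

Dropping the flush call makes the result depend on goroutine scheduling, which is the same reason the notes recommend calling FlushIndex before searching for freshly added documents.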

func (*Engine) Indexer ¶

func (engine *Engine) Indexer(options types.EngineOpts)

Indexer initializes the indexer channel.

func (*Engine) Init ¶

func (engine *Engine) Init(options types.EngineOpts)

Init initializes the engine.

func (*Engine) InitStorage ¶

func (engine *Engine) InitStorage()

InitStorage initializes the persistent storage channel.

func (*Engine) NumDocsIndexed ¶

func (engine *Engine) NumDocsIndexed() uint64

NumDocsIndexed returns the number of documents indexed.

func (*Engine) NumDocsRemoved ¶

func (engine *Engine) NumDocsRemoved() uint64

NumDocsRemoved returns the number of documents removed.

func (*Engine) NumTokenIndexAdded ¶

func (engine *Engine) NumTokenIndexAdded() uint64

NumTokenIndexAdded returns the number of token index entries added.

func (*Engine) PinYin ¶

func (engine *Engine) PinYin(hans string) []string

PinYin returns the pinyin spellings and abbreviations of the given Chinese text.

func (*Engine) Rank ¶

func (engine *Engine) Rank(request types.SearchReq,
	RankOpts types.RankOpts, tokens []string,
	rankerReturnChan chan rankerReturnReq) (output types.SearchResp)

Rank ranks docs by types.ScoredIDs.

func (*Engine) Ranker ¶

func (engine *Engine) Ranker(options types.EngineOpts)

Ranker initializes the ranker channel.

func (*Engine) Ranks ¶

func (engine *Engine) Ranks(request types.SearchReq,
	RankOpts types.RankOpts, tokens []string,
	rankerReturnChan chan rankerReturnReq) (output types.SearchResp)

Ranks ranks docs by types.ScoredDocs.

func (*Engine) RemoveDoc ¶

func (engine *Engine) RemoveDoc(docId uint64, forceUpdate ...bool)

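RemoveDoc removes the document from the index.

Parameters:

docId	      identifies the document and must be unique; docId == 0 denotes an invalid document (used to force an index flush), [1, +oo) denotes valid documents
forceUpdate whether to force a cache flush; if true, the document is removed from the index as soon as possible, otherwise it waits until the cache is full and everything is removed in one batch

Notes:

This function is thread safe; call it concurrently where possible to speed up indexing.
This call is asynchronous: when it returns, the document may not have been removed from the index yet, so an immediate Search may still return it. To force an index flush, call FlushIndex.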

func (*Engine) Search ¶

func (engine *Engine) Search(request types.SearchReq) (output types.SearchResp)

Search finds the documents that satisfy the search criteria. This function is thread safe.

func (*Engine) Segment ¶

func (engine *Engine) Segment(content string) (keywords []string)

Segment returns the word segmentation result of the text; it only segments the text and filters out stop words.

func (*Engine) Storage ¶

func (engine *Engine) Storage()

Storage starts the persistent storage work connection.

func (*Engine) Tokens ¶

func (engine *Engine) Tokens(request types.SearchReq) (tokens []string)

Tokens returns the engine tokens for the request.

type Map ¶

type Map map[string][]int

Map defines the type map[string][]int

type StopTokens ¶

type StopTokens struct {
	// contains filtered or unexported fields
}

StopTokens is the stop token map.

func (*StopTokens) Init ¶

func (st *StopTokens) Init(stopTokenFile string)

Init reads stop tokens from stopTokenFile, one token per line; these stop tokens are skipped when the document index is built.

func (*StopTokens) IsStopToken ¶

func (st *StopTokens) IsStopToken(token string) bool

IsStopToken reports whether token is a stop token.

Directories ¶

Path	Synopsis
core	Package core is riot core
data
client
riot
riot/heartb
riot1
riot1/heartb
engine	Package engine is riot engine
examples
benchmark	riot performance test
codelab	A Weibo search example.
dict
logic
new
pinyin
pinyin_weibo	A Weibo pinyin search example.
simple
simple/zh
store
weibo
geo
net	Package net is riot net
com
grpc
grpc/riot-pb	Package doc is a generated protocol buffer package.
heartb
http
rpcx
storage
types	Package types is riot types
utils