riot

Version: v0.0.0-...-fee7c4b
Published: Jan 22, 2018 License: Apache-2.0 Imports: 24 Imported by: 0

README

riot full text search engine

Riot v0.10.0 was released in Nov 2017; see the Changelog for full details.

Requirements

Go version >= 1.8

Installation/Update

go get -u github.com/go-ego/riot

Build-tools

go get -u github.com/go-ego/re

re riot

To create a new riot application:

$ re riot my-riotapp

re run

To run the application we just created, navigate to the application folder and execute:

$ cd my-riotapp && re run

Usage:

Look at an example:

package main

import (
	"log"

	"github.com/go-ego/riot"
	"github.com/go-ego/riot/types"
)

var (
	// searcher is goroutine-safe
	searcher = riot.Engine{}
)

func main() {
	// Initialize the engine
	searcher.Init(types.EngineOpts{
		Using:       4,
		NotUsingGse: true,
	})
	defer searcher.Close()

	text := "Google Is Experimenting With Virtual Reality Advertising"
	text1 := `Google accidentally pushed Bluetooth update for Home
	speaker early`
	text2 := `Google is testing another Search results layout with 
	rounded cards, new colors, and the 4 mysterious colored dots again`
	
	// Add the document to the index, docId starts at 1
	searcher.IndexDoc(1, types.DocIndexData{Content: text})
	searcher.IndexDoc(2, types.DocIndexData{Content: text1}, false)
	searcher.IndexDoc(3, types.DocIndexData{Content: text2}, true)

	// Wait for the index to refresh
	searcher.FlushIndex()

	// The search output format is defined by the types.SearchResp structure
	log.Print(searcher.Search(types.SearchReq{Text: "google testing"}))
}

It is very simple!

Use default engine:

package main

import (
	"log"

	"github.com/go-ego/riot"
	"github.com/go-ego/riot/types"
)

var (
	searcher = riot.New("zh")
)

func main() {
	data := types.DocIndexData{Content: `I wonder how, I wonder why
		, I wonder where they are`}
	data1 := types.DocIndexData{Content: "留给真爱你的人"}
	data2 := types.DocIndexData{Content: "也没有理由"}
	searcher.IndexDoc(1, data)
	searcher.IndexDoc(2, data1)
	searcher.IndexDoc(3, data2)
	searcher.FlushIndex()

	req := types.SearchReq{Text: "真爱"}
	search := searcher.Search(req)
	log.Println("search...", search)
}

Look at more examples
Look at the Store example
Look at the Logic search example
Look at the Pinyin search example
Look at the different dict and language search example
Look at the benchmark example
Riot search engine templates, client and dictionaries

Donate

To support riot, buy me a coffee.

Paypal

Donate via PayPal to my account vzvway@gmail.com

License

Riot is primarily distributed under the terms of the Apache License (Version 2.0), and is based on wukong.

Documentation

Overview

Package riot is the riot engine, a full-text search engine.

Index

Constants

const (
	// NumNanosecondsInAMillisecond is the number of nanoseconds in a millisecond
	NumNanosecondsInAMillisecond = 1000000
	// StorageFilePrefix is the persistent storage file prefix
	StorageFilePrefix = "riot"
)

Variables

This section is empty.

Functions

func GetVersion

func GetVersion() string

GetVersion returns the version string.

func Try

func Try(fun func(), handler func(interface{}))

Try calls fun and, if fun panics, passes the recovered value to handler.
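
The signature suggests a recover-based wrapper. The following is a minimal sketch of that pattern, assuming Try runs fun and hands any recovered panic value to handler. tryCall is a hypothetical stand-in written for this sketch, not the package's own code.

```go
package main

import "fmt"

// tryCall is a hypothetical stand-in for Try: it runs fun and, if fun
// panics, passes the recovered value to handler instead of crashing.
func tryCall(fun func(), handler func(interface{})) {
	defer func() {
		if err := recover(); err != nil {
			handler(err)
		}
	}()
	fun()
}

func main() {
	tryCall(func() {
		panic("index out of sync")
	}, func(err interface{}) {
		fmt.Println("recovered:", err)
	})
}
```

If fun returns normally, handler is never called, so the wrapper adds no behavior on the success path.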

Types

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine is the search engine.

func New

func New(dict ...string) *Engine

New creates a new engine.

func (*Engine) CheckMem

func (engine *Engine) CheckMem()

CheckMem checks memory usage; when usage exceeds 99.99%, the engine uses the storage.

func (*Engine) Close

func (engine *Engine) Close()

Close closes the engine.

func (*Engine) FlushIndex

func (engine *Engine) FlushIndex()

FlushIndex blocks until all pending indexes have been added.

func (*Engine) ForSplitData

func (engine *Engine) ForSplitData(splData []string, num int) (Map, int)

ForSplitData handles split segmentation data (segspl).

func (*Engine) GetAllDocIds

func (engine *Engine) GetAllDocIds() []uint64

GetAllDocIds gets all DocIds from the storage database and returns them.

func (*Engine) GetAllIds

func (engine *Engine) GetAllIds() []uint64

GetAllIds gets all DocIds from the storage database and returns them.

func (*Engine) IndexDoc

func (engine *Engine) IndexDoc(docId uint64, data types.DocIndexData, forceUpdate ...bool)

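IndexDoc adds the document to the index.

Parameters:

docId	      document identifier; must be unique. docId == 0 denotes an invalid document (used to force an index flush); [1, +oo) denotes valid documents
data	      see the DocIndexData comments
forceUpdate whether to force a cache flush. If true, the document is added to the index as soon as possible; otherwise it is added in one batch once the cache is full

Notes:

  1. This function is thread-safe; call it concurrently where possible to speed up indexing.
  2. This function is asynchronous: the document may not yet be in the index when the function returns, so an immediate call to Search may not find it. To force an index flush, call FlushIndex.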
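
The two notes above describe an asynchronous add followed by a blocking flush. The sketch below illustrates that general pattern with a toy in-memory index. asyncIndex and all of its methods are hypothetical names invented for this example and do not reflect riot's internal implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// doc is one queued document.
type doc struct {
	id      uint64
	content string
}

// asyncIndex is a hypothetical toy index, not riot's implementation.
// Add queues a document and returns; Flush blocks until every queued
// document has been applied by the background worker.
type asyncIndex struct {
	mu      sync.RWMutex
	docs    map[uint64]string
	queue   chan doc
	pending sync.WaitGroup
}

func newAsyncIndex() *asyncIndex {
	idx := &asyncIndex{docs: make(map[uint64]string), queue: make(chan doc, 16)}
	go idx.worker()
	return idx
}

// worker applies queued documents in the background.
func (idx *asyncIndex) worker() {
	for d := range idx.queue {
		idx.mu.Lock()
		idx.docs[d.id] = d.content
		idx.mu.Unlock()
		idx.pending.Done()
	}
}

// Add is safe to call from many goroutines. The document may not be
// visible to Has until Flush has returned.
func (idx *asyncIndex) Add(id uint64, content string) {
	idx.pending.Add(1)
	idx.queue <- doc{id, content}
}

// Flush blocks until all documents queued so far are indexed.
func (idx *asyncIndex) Flush() { idx.pending.Wait() }

// Has reports whether the document is already visible in the index.
func (idx *asyncIndex) Has(id uint64) bool {
	idx.mu.RLock()
	defer idx.mu.RUnlock()
	_, ok := idx.docs[id]
	return ok
}

func main() {
	idx := newAsyncIndex()
	var wg sync.WaitGroup
	for id := uint64(1); id <= 3; id++ {
		wg.Add(1)
		go func(id uint64) { // concurrent callers, as note 1 recommends
			defer wg.Done()
			idx.Add(id, fmt.Sprintf("document %d", id))
		}(id)
	}
	wg.Wait()   // every Add call has returned
	idx.Flush() // every queued document is now indexed
	fmt.Println(idx.Has(1), idx.Has(2), idx.Has(3))
}
```

With riot itself, the equivalent sequence is the one the README uses: call IndexDoc for each document, then FlushIndex before the first Search.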

func (*Engine) Indexer

func (engine *Engine) Indexer(options types.EngineOpts)

Indexer initializes the indexer channel.

func (*Engine) Init

func (engine *Engine) Init(options types.EngineOpts)

Init initializes the engine.

func (*Engine) InitStorage

func (engine *Engine) InitStorage()

InitStorage initializes the persistent storage channel.

func (*Engine) NumDocsIndexed

func (engine *Engine) NumDocsIndexed() uint64

NumDocsIndexed returns the number of documents indexed.

func (*Engine) NumDocsRemoved

func (engine *Engine) NumDocsRemoved() uint64

NumDocsRemoved returns the number of documents removed.

func (*Engine) NumTokenIndexAdded

func (engine *Engine) NumTokenIndexAdded() uint64

NumTokenIndexAdded returns the number of token index entries added.

func (*Engine) PinYin

func (engine *Engine) PinYin(hans string) []string

PinYin returns the pinyin (Chinese romanization) of hans and its abbreviations.

func (*Engine) Rank

func (engine *Engine) Rank(request types.SearchReq,
	RankOpts types.RankOpts, tokens []string,
	rankerReturnChan chan rankerReturnReq) (output types.SearchResp)

Rank ranks docs by types.ScoredIDs.

func (*Engine) Ranker

func (engine *Engine) Ranker(options types.EngineOpts)

Ranker initializes the ranker channel.

func (*Engine) Ranks

func (engine *Engine) Ranks(request types.SearchReq,
	RankOpts types.RankOpts, tokens []string,
	rankerReturnChan chan rankerReturnReq) (output types.SearchResp)

Ranks ranks docs by types.ScoredDocs.

func (*Engine) RemoveDoc

func (engine *Engine) RemoveDoc(docId uint64, forceUpdate ...bool)

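RemoveDoc removes the document from the index.

Parameters:

docId	      document identifier; must be unique. docId == 0 denotes an invalid document (used to force an index flush); [1, +oo) denotes valid documents
forceUpdate whether to force a cache flush. If true, the document is removed from the index as soon as possible; otherwise removals are applied in one batch once the cache is full

Notes:

  1. This function is thread-safe; call it concurrently where possible to speed up indexing.
  2. This function is asynchronous: the document may not yet have been removed from the index when the function returns, so an immediate call to Search may still return it. To force an index flush, call FlushIndex.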

func (*Engine) Search

func (engine *Engine) Search(request types.SearchReq) (output types.SearchResp)

Search finds the documents that satisfy the search criteria. This function is thread-safe.

func (*Engine) Segment

func (engine *Engine) Segment(content string) (keywords []string)

Segment returns the word segmentation result of the text. It only segments the text and filters out stop words.

func (*Engine) Storage

func (engine *Engine) Storage()

Storage starts the persistent storage worker connection.

func (*Engine) Tokens

func (engine *Engine) Tokens(request types.SearchReq) (tokens []string)

Tokens returns the engine tokens for the request.

type Map

type Map map[string][]int

Map defines the type map[string][]int

type StopTokens

type StopTokens struct {
	// contains filtered or unexported fields
}

StopTokens is the stop-token map.

func (*StopTokens) Init

func (st *StopTokens) Init(stopTokenFile string)

Init reads stop tokens from stopTokenFile, one token per line. These stop tokens are skipped when the document index is built.

func (*StopTokens) IsStopToken

func (st *StopTokens) IsStopToken(token string) bool

IsStopToken reports whether token is a stop token.

Directories

Path              Synopsis
core              Package core is riot core
engine            Package engine is riot engine
examples
  benchmark       riot performance test
  codelab         A Weibo search example.
  new
  pinyin_weibo    A Weibo pinyin search example.
net               Package net is riot net
  com
  grpc/riot-pb    Package doc is a generated protocol buffer package.
types             Package types is riot types
