core

package

v0.0.0-...-5465ba9 Published: Dec 7, 2022 License: MIT Imports: 9 Imported by: 0

Repository

github.com/mangenotwork/search

Documentation ¶

Index ¶

Variables
func Find(term string) []string
func GetDocTerm(filePath string) ([]string, error)
func GetSearchFile(theme, term, sortTypeType string, pg int) []*entity.PL
func NewMetaData()
func SetDocTerm(filePath string, data []string)
func SetPostings(theme, docId, text string, docStamp, orderInt int64)
func SetPostingsAuthor(theme, docId, text string, docStamp, orderInt int64)
func TermExtract(str string) []*entity.Term
type PLFileInfo
type PLInfo

Constants ¶

This section is empty.

Variables ¶

var NewMetaDataObj *utils.Consistent

Functions ¶

func Find ¶

func Find(term string) []string

func GetDocTerm ¶

func GetDocTerm(filePath string) ([]string, error)

func GetSearchFile ¶

func GetSearchFile(theme, term, sortTypeType string, pg int) []*entity.PL

func NewMetaData ¶

func NewMetaData()

func SetDocTerm ¶

func SetDocTerm(filePath string, data []string)

func SetPostings ¶

func SetPostings(theme, docId, text string, docStamp, orderInt int64)

func SetPostingsAuthor ¶

func SetPostingsAuthor(theme, docId, text string, docStamp, orderInt int64)

func TermExtract ¶

func TermExtract(str string) []*entity.Term

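TermExtract extracts index terms from str. Every token is emitted as a term except punctuation, particles, modal particles, adjectives, interjections, and adverbs.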

Types ¶

type PLFileInfo ¶

type PLFileInfo struct {
	FileId   int     // file number
	FileName string  // file name
	ValMax   float64 // largest sort value in the file
	ValMin   float64 // smallest sort value in the file
	Num      int     // number of records in the file
}

type PLInfo ¶

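type PLInfo struct {
	PLDir   string        // directory where the data is stored
	PLTFile []*PLFileInfo // acts as a cursor; data files sorted by time
	PLOFile []*PLFileInfo // data files sorted by sort value
	PLFFile []*PLFileInfo // data files sorted by term frequency
	PLTFNum int           // number of files
	PLOFNum int
	PLFFNum int
	StartF  int // start number; begins at 1 and holds the data with the largest sort value
	EndF    int // end number; equals FNum and holds the data with the smallest sort value
}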

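PLInfo is the index information file.

.pli: []*pliFile with file_name, valMax, valMin; fNum is the number of files.

Each index contains one information file, which mainly records the storage structure of the index. Each data file stores at most 100 records.
Drawbacks of this design: wasted space and slow writes.
Advantages of this design: fast reads, because the results are already sorted separately along three dimensions.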

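Files:

.plt files (postingList time): data stored sorted by time. k: docId, v: time. Sort rule: two dimensions only, document time and then term frequency (t > f).
.plo files (postingList orderInt): data stored sorted by sort value. k: docId, v: orderInt. Sort rule: three dimensions, custom sort value, then document time, then term frequency (o > t > f).
.plf files (postingList Freq): data stored sorted by term frequency. k: docId, v: Freq. Sort rule: two dimensions only, term frequency and then document time (f > t).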

Structure: records are stored as []*d{docId, value (used for sorting), start, end}.
