Documentation ¶
Index ¶
- Variables
- type Idf
- type Segment
- type Segments
- type StopWord
- type TagExtracter
- func (t *TagExtracter) ExtractTags(sentence string, topK int) (tags Segments)
- func (t *TagExtracter) LoadDict(fileName ...string) error
- func (t *TagExtracter) LoadIdf(fileName ...string) error
- func (t *TagExtracter) LoadStopWords(fileName ...string) error
- func (t *TagExtracter) WithGse(segs gse.Segmenter)
- type TextRanker
Constants ¶
This section is empty.
Variables ¶
var StopWordMap = map[string]bool{
	"the": true, "of": true, "is": true, "and": true, "to": true, "in": true,
	"that": true, "we": true, "for": true, "an": true, "are": true, "by": true,
	"be": true, "as": true, "on": true, "with": true, "can": true, "if": true,
	"from": true, "which": true, "you": true, "it": true, "this": true, "then": true,
	"at": true, "have": true, "all": true, "not": true, "one": true, "has": true,
	"or": true,
}
StopWordMap contains the default set of stop words.
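As an illustration of how a map like this is typically used, the following minimal sketch drops stop words from a token list. It does not call this package: `stopWords` and `filterStopWords` are hypothetical names, and lowercasing each token before the lookup is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a small, hypothetical subset of a stop-word map.
var stopWords = map[string]bool{
	"the": true, "of": true, "is": true, "and": true, "to": true,
}

// filterStopWords returns the tokens that are not present in stop.
// Lowercasing before the lookup is an assumption of this sketch.
func filterStopWords(tokens []string, stop map[string]bool) []string {
	kept := make([]string, 0, len(tokens))
	for _, tok := range tokens {
		if stop[strings.ToLower(tok)] {
			continue
		}
		kept = append(kept, tok)
	}
	return kept
}

func main() {
	tokens := strings.Fields("The history of Go is short and well documented")
	fmt.Println(filterStopWords(tokens, stopWords))
}
```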
Functions ¶
This section is empty.
Types ¶
type Idf ¶
type Idf struct {
// contains filtered or unexported fields
}
Idf represents a dictionary of all words with their IDF (inverse document frequency) values.
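For readers unfamiliar with IDF, the sketch below computes one common textbook variant, idf(w) = ln(N / df(w)), where N is the number of documents and df(w) is the number of documents containing w. This is only an assumption for illustration: the page does not say how the package's IDF values are derived, and `computeIDF` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// computeIDF returns ln(N / df) for every word in docs, where each
// document is a slice of tokens and df counts documents, not occurrences.
func computeIDF(docs [][]string) map[string]float64 {
	df := make(map[string]int)
	for _, doc := range docs {
		seen := make(map[string]bool)
		for _, w := range doc {
			if !seen[w] {
				seen[w] = true
				df[w]++
			}
		}
	}

	n := float64(len(docs))
	idf := make(map[string]float64, len(df))
	for w, c := range df {
		idf[w] = math.Log(n / float64(c))
	}
	return idf
}

func main() {
	docs := [][]string{
		{"go", "is", "fast"},
		{"go", "is", "fun"},
		{"rust", "is", "fast"},
	}
	for w, v := range computeIDF(docs) {
		fmt.Printf("%s: %.3f\n", w, v)
	}
}
```

Under this definition a word that appears in every document scores zero, and rarer words score higher.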
type Segment ¶
type Segment struct {
// contains filtered or unexported fields
}
Segment represents a word with weight.
type StopWord ¶
type StopWord struct {
// contains filtered or unexported fields
}
StopWord is a dictionary for all stop words.
func NewStopWord ¶
func NewStopWord() *StopWord
NewStopWord creates a new StopWord with the default stop words.
func (*StopWord) IsStopWord ¶
IsStopWord checks whether a given word is a stop word.
func (*StopWord) RemoveStop ¶ added in v0.63.0
RemoveStop removes a token from the StopWord dictionary.
type TagExtracter ¶
type TagExtracter struct {
	Idf *Idf
	// contains filtered or unexported fields
}
TagExtracter is used to extract tags from a sentence.
func (*TagExtracter) ExtractTags ¶
func (t *TagExtracter) ExtractTags(sentence string, topK int) (tags Segments)
ExtractTags extracts the top topK keywords from sentence.
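A common way to pick top keywords with an IDF dictionary is to rank words by term frequency multiplied by IDF. The sketch below shows that general technique on pre-tokenized input. It is not taken from this package's source: `scored` and `topKByTFIDF` are hypothetical, and falling back to `defaultIDF` for words missing from the dictionary is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical stand-in for a word with a weight.
type scored struct {
	Word   string
	Weight float64
}

// topKByTFIDF ranks tokens by term frequency times IDF and returns at
// most topK entries. Words missing from idf use defaultIDF (an assumption).
func topKByTFIDF(tokens []string, idf map[string]float64, defaultIDF float64, topK int) []scored {
	counts := make(map[string]float64)
	for _, t := range tokens {
		counts[t]++
	}

	out := make([]scored, 0, len(counts))
	for w, c := range counts {
		v, ok := idf[w]
		if !ok {
			v = defaultIDF
		}
		tf := c / float64(len(tokens))
		out = append(out, scored{Word: w, Weight: tf * v})
	}

	// Highest weight first; ties are broken alphabetically for stable output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Weight != out[j].Weight {
			return out[i].Weight > out[j].Weight
		}
		return out[i].Word < out[j].Word
	})
	if topK >= 0 && len(out) > topK {
		out = out[:topK]
	}
	return out
}

func main() {
	tokens := []string{"go", "go", "tags", "the"}
	idf := map[string]float64{"go": 1.0, "tags": 3.0, "the": 0.1}
	for _, s := range topKByTFIDF(tokens, idf, 1.0, 2) {
		fmt.Printf("%s %.3f\n", s.Word, s.Weight)
	}
}
```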
func (*TagExtracter) LoadDict ¶
func (t *TagExtracter) LoadDict(fileName ...string) error
LoadDict reads the given file and creates a new dictionary.
func (*TagExtracter) LoadIdf ¶
func (t *TagExtracter) LoadIdf(fileName ...string) error
LoadIdf reads the given file and creates a new Idf dictionary.
func (*TagExtracter) LoadStopWords ¶
func (t *TagExtracter) LoadStopWords(fileName ...string) error
LoadStopWords reads the given file and creates a new StopWord dictionary.
func (*TagExtracter) WithGse ¶
func (t *TagExtracter) WithGse(segs gse.Segmenter)
WithGse registers a gse segmenter.
type TextRanker ¶
type TextRanker struct {
	HMM bool
	// contains filtered or unexported fields
}
TextRanker is used to extract tags from a sentence.
func (*TextRanker) LoadDict ¶
func (t *TextRanker) LoadDict(fileName ...string) error
LoadDict reads the given file and creates a new dictionary for TextRanker.
func (*TextRanker) TextRank ¶
func (t *TextRanker) TextRank(sentence string, topK int) Segments
TextRank extracts keywords from sentence using the TextRank algorithm. The topK parameter specifies the maximum number of top keywords to return.
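The general idea behind TextRank can be sketched as a PageRank-style iteration over a word co-occurrence graph. The code below is a simplified, self-contained illustration on pre-tokenized input, not this package's implementation: `ranked`, `textRank`, the window size, the iteration count, and the 0.85 damping factor are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// ranked is a hypothetical stand-in for a word with a score.
type ranked struct {
	Word  string
	Score float64
}

// textRank scores words with a PageRank-style iteration over an undirected
// co-occurrence graph built from a sliding window over tokens.
func textRank(tokens []string, window, iterations, topK int) []ranked {
	const damping = 0.85

	// Give each distinct word an index, in order of first appearance.
	index := make(map[string]int)
	var words []string
	for _, t := range tokens {
		if _, ok := index[t]; !ok {
			index[t] = len(words)
			words = append(words, t)
		}
	}
	n := len(words)

	// weight[a][b] counts co-occurrences of words a and b within the window.
	weight := make([][]float64, n)
	for i := range weight {
		weight[i] = make([]float64, n)
	}
	for i := range tokens {
		for j := i + 1; j < len(tokens) && j < i+window; j++ {
			a, b := index[tokens[i]], index[tokens[j]]
			if a != b {
				weight[a][b]++
				weight[b][a]++
			}
		}
	}

	// outSum[i] is the total edge weight attached to node i.
	outSum := make([]float64, n)
	for i := range weight {
		for _, w := range weight[i] {
			outSum[i] += w
		}
	}

	score := make([]float64, n)
	for i := range score {
		score[i] = 1
	}
	for it := 0; it < iterations; it++ {
		next := make([]float64, n)
		for i := 0; i < n; i++ {
			sum := 0.0
			for j := 0; j < n; j++ {
				if weight[j][i] > 0 {
					sum += weight[j][i] / outSum[j] * score[j]
				}
			}
			next[i] = (1 - damping) + damping*sum
		}
		score = next
	}

	out := make([]ranked, n)
	for i, w := range words {
		out[i] = ranked{Word: w, Score: score[i]}
	}
	// Highest score first; ties are broken alphabetically.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Word < out[j].Word
	})
	if topK >= 0 && len(out) > topK {
		out = out[:topK]
	}
	return out
}

func main() {
	tokens := []string{"keyword", "extraction", "ranks", "keyword", "graph", "nodes", "keyword", "scores"}
	for _, r := range textRank(tokens, 3, 30, 3) {
		fmt.Printf("%s %.3f\n", r.Word, r.Score)
	}
}
```

Words that co-occur with many other words accumulate score from their neighbors, which is why well-connected terms rise to the top.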
func (*TextRanker) TextRankWithPOS ¶
func (t *TextRanker) TextRankWithPOS(sentence string, topK int, allowPOS []string) Segments
TextRankWithPOS extracts keywords from sentence using the TextRank algorithm. The allowPOS parameter supplies a custom list of allowed part-of-speech (POS) tags.
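To illustrate what a POS allow-list does in general, the sketch below keeps only words whose tag is in the list before any ranking would take place. It does not use this package: `taggedWord` and `filterByPOS` are hypothetical, the tag names are examples only, and treating an empty list as "keep everything" is an assumption rather than documented behavior.

```go
package main

import "fmt"

// taggedWord is a hypothetical token paired with its part-of-speech tag.
type taggedWord struct {
	Text string
	POS  string
}

// filterByPOS keeps only the words whose tag appears in allowPOS.
// Treating an empty allowPOS as "keep everything" is an assumption.
func filterByPOS(words []taggedWord, allowPOS []string) []string {
	allowed := make(map[string]bool, len(allowPOS))
	for _, p := range allowPOS {
		allowed[p] = true
	}

	kept := make([]string, 0, len(words))
	for _, w := range words {
		if len(allowed) == 0 || allowed[w.POS] {
			kept = append(kept, w.Text)
		}
	}
	return kept
}

func main() {
	words := []taggedWord{
		{Text: "compiler", POS: "n"},
		{Text: "quickly", POS: "d"},
		{Text: "builds", POS: "v"},
		{Text: "binaries", POS: "n"},
	}
	fmt.Println(filterByPOS(words, []string{"n", "v"}))
}
```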
func (*TextRanker) WithGse ¶
func (t *TextRanker) WithGse(segs gse.Segmenter)
WithGse registers a gse segmenter.