idf

package

Version: v0.70.2 (latest) Published: May 19, 2022 License: Apache-2.0 Imports: 6 Imported by: 2

Repository

github.com/go-ego/gse

Documentation ¶

Index ¶

Variables
type Idf
- func NewIdf() *Idf
type Segment
- func (s Segment) Text() string
- func (s Segment) Weight() float64
type Segments
type StopWord
- func NewStopWord() *StopWord
type TagExtracter
type TextRanker

Constants ¶

This section is empty.

Variables ¶

var StopWordMap = map[string]bool{
	"the":   true,
	"of":    true,
	"is":    true,
	"and":   true,
	"to":    true,
	"in":    true,
	"that":  true,
	"we":    true,
	"for":   true,
	"an":    true,
	"are":   true,
	"by":    true,
	"be":    true,
	"as":    true,
	"on":    true,
	"with":  true,
	"can":   true,
	"if":    true,
	"from":  true,
	"which": true,
	"you":   true,
	"it":    true,
	"this":  true,
	"then":  true,
	"at":    true,
	"have":  true,
	"all":   true,
	"not":   true,
	"one":   true,
	"has":   true,
	"or":    true,
}

StopWordMap is the default set of stop words.
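A map of this shape is typically used as a set: a token is dropped when its key is present. The following is a minimal sketch of that pattern using only the standard library. The names `stopWords` and `filterStopWords` are hypothetical and not part of this package, and the case-insensitive lookup is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a small illustrative set with the same shape as StopWordMap.
var stopWords = map[string]bool{
	"the": true,
	"of":  true,
	"is":  true,
	"and": true,
}

// filterStopWords returns the tokens that are not in the stop set.
// Lowercasing before the lookup is an assumption of this sketch.
func filterStopWords(tokens []string, stop map[string]bool) []string {
	kept := make([]string, 0, len(tokens))
	for _, tok := range tokens {
		if !stop[strings.ToLower(tok)] {
			kept = append(kept, tok)
		}
	}
	return kept
}

func main() {
	tokens := strings.Fields("The frequency of the word is high")
	fmt.Println(filterStopWords(tokens, stopWords))
}
```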

Functions ¶

This section is empty.

Types ¶

type Idf ¶

type Idf struct {
	// contains filtered or unexported fields
}

Idf is a dictionary of words and their IDF (inverse document frequency) values.
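For readers unfamiliar with the term, the sketch below computes inverse document frequency with the common textbook definition, log(N / df). It does not show how this package builds or stores its dictionary. The function `computeIDF` and the sample documents are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// computeIDF returns log(N / df) for every word, where N is the number of
// documents and df is the number of documents that contain the word.
// This is the textbook formula, assumed here for illustration only.
func computeIDF(docs [][]string) map[string]float64 {
	df := make(map[string]int)
	for _, doc := range docs {
		seen := make(map[string]bool) // count each word once per document
		for _, word := range doc {
			if !seen[word] {
				seen[word] = true
				df[word]++
			}
		}
	}
	idf := make(map[string]float64, len(df))
	n := float64(len(docs))
	for word, count := range df {
		idf[word] = math.Log(n / float64(count))
	}
	return idf
}

func main() {
	docs := [][]string{
		{"go", "module", "go"},
		{"go", "segment"},
		{"go", "keyword"},
		{"go", "module"},
	}
	idf := computeIDF(docs)
	fmt.Printf("go=%.3f module=%.3f segment=%.3f\n",
		idf["go"], idf["module"], idf["segment"])
}
```

A word that appears in every document gets an IDF of zero, and rarer words get larger values, which is why IDF is useful for weighting keywords.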

func NewIdf ¶

func NewIdf() *Idf

NewIdf creates a new Idf.

func (*Idf) AddToken ¶

func (i *Idf) AddToken(text string, freq float64, pos ...string) error

AddToken adds a new word with its IDF to the dictionary.

func (*Idf) Freq ¶ added in v0.69.7

func (i *Idf) Freq(key string) (float64, string, bool)

Freq returns the IDF of the word.

func (*Idf) LoadDict ¶

func (i *Idf) LoadDict(files ...string) error

LoadDict loads the IDF dictionary.

func (*Idf) NumTokens ¶ added in v0.69.7

func (i *Idf) NumTokens() int

NumTokens returns the number of IDF tokens.

func (*Idf) TotalFreq ¶ added in v0.69.7

func (i *Idf) TotalFreq() float64

TotalFreq returns the total IDF frequency.

type Segment ¶

type Segment struct {
	// contains filtered or unexported fields
}

Segment is a word with a weight.

func (Segment) Text ¶

func (s Segment) Text() string

Text returns the segment's text.

func (Segment) Weight ¶

func (s Segment) Weight() float64

Weight returns the segment's weight.

type Segments ¶

type Segments []Segment

Segments is a slice of Segment.

func (Segments) Len ¶

func (ss Segments) Len() int

func (Segments) Less ¶

func (ss Segments) Less(i, j int) bool

func (Segments) Swap ¶

func (ss Segments) Swap(i, j int)
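Len, Less and Swap together satisfy the standard library's sort.Interface, so a value of such a slice type can be passed to sort.Sort. The sketch below shows the pattern with hypothetical stand-in types (`scored`, `scoredList`) and a hypothetical helper `topK`. The ordering chosen in Less (ascending weight, ties broken by text) is an assumption; the source does not state how Segments orders its elements.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical stand-in for Segment: a word plus a weight.
type scored struct {
	text   string
	weight float64
}

// scoredList implements sort.Interface through Len, Less and Swap.
type scoredList []scored

func (s scoredList) Len() int { return len(s) }

// Less orders by ascending weight and breaks ties by text.
// This ordering is an assumption of the sketch.
func (s scoredList) Less(i, j int) bool {
	if s[i].weight == s[j].weight {
		return s[i].text < s[j].text
	}
	return s[i].weight < s[j].weight
}

func (s scoredList) Swap(i, j int) { s[i], s[j] = s[j], s[i] }

// topK returns up to k entries with the highest weights, leaving the
// input slice unmodified.
func topK(list scoredList, k int) scoredList {
	sorted := append(scoredList(nil), list...)
	sort.Sort(sort.Reverse(sorted)) // reverse Less to get descending order
	if k < 0 {
		k = 0
	}
	if k < len(sorted) {
		sorted = sorted[:k]
	}
	return sorted
}

func main() {
	list := scoredList{{"alpha", 0.2}, {"beta", 0.9}, {"gamma", 0.5}}
	for _, s := range topK(list, 2) {
		fmt.Println(s.text, s.weight)
	}
}
```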

type StopWord ¶

type StopWord struct {
	// contains filtered or unexported fields
}

StopWord is a dictionary for all stop words.

func NewStopWord ¶

func NewStopWord() *StopWord

NewStopWord creates a new StopWord with the default stop words.

func (*StopWord) AddStop ¶ added in v0.63.0

func (s *StopWord) AddStop(text string)

AddStop adds a token to the StopWord dictionary.

func (*StopWord) IsStopWord ¶

func (s *StopWord) IsStopWord(word string) bool

IsStopWord reports whether the word is a stop word.

func (*StopWord) LoadDict ¶

func (s *StopWord) LoadDict(files ...string) error

LoadDict loads the idf stop word dictionary.

func (*StopWord) RemoveStop ¶ added in v0.63.0

func (s *StopWord) RemoveStop(text string)

RemoveStop removes a token from the StopWord dictionary.

type TagExtracter ¶

type TagExtracter struct {
	Idf *Idf
	// contains filtered or unexported fields
}

TagExtracter is a struct for extracting tags.

func (*TagExtracter) ExtractTags ¶

func (t *TagExtracter) ExtractTags(text string, topK int) (tags Segments)

ExtractTags extracts the topK keywords from text.
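As background, a common way to pick the top K keywords from an IDF dictionary is TF-IDF: each word's term frequency in the text is multiplied by its IDF and the highest products win. The sketch below illustrates that general technique on pre-split tokens. It is not this package's implementation. The names `keyword` and `extractTags`, the `defaultIDF` fallback for unknown words, and the sample IDF values are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// keyword is a hypothetical result type: a word and its TF-IDF weight.
type keyword struct {
	text   string
	weight float64
}

// extractTags scores each distinct token by (count / total tokens) * IDF
// and returns up to topK entries in descending weight order.
// Words missing from idf fall back to defaultIDF (an assumption).
func extractTags(tokens []string, idf map[string]float64, defaultIDF float64, topK int) []keyword {
	counts := make(map[string]float64)
	for _, tok := range tokens {
		counts[tok]++
	}
	total := float64(len(tokens))
	out := make([]keyword, 0, len(counts))
	for word, count := range counts {
		w, ok := idf[word]
		if !ok {
			w = defaultIDF
		}
		out = append(out, keyword{text: word, weight: count / total * w})
	}
	// Sort by descending weight; break ties by text for stable output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].weight != out[j].weight {
			return out[i].weight > out[j].weight
		}
		return out[i].text < out[j].text
	})
	if topK >= 0 && topK < len(out) {
		out = out[:topK]
	}
	return out
}

func main() {
	idf := map[string]float64{ // made-up values for illustration
		"go": 1.0, "modules": 3.0, "make": 0.5, "builds": 1.5, "predictable": 4.0,
	}
	tokens := strings.Fields("go modules make go builds predictable")
	for _, k := range extractTags(tokens, idf, 2.0, 2) {
		fmt.Printf("%s %.3f\n", k.text, k.weight)
	}
}
```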

func (*TagExtracter) LoadDict ¶

func (t *TagExtracter) LoadDict(fileName ...string) error

LoadDict loads and creates a new dictionary from the file.

func (*TagExtracter) LoadIdf ¶

func (t *TagExtracter) LoadIdf(fileName ...string) error

LoadIdf loads and creates a new Idf dictionary from the file.

func (*TagExtracter) LoadStopWords ¶

func (t *TagExtracter) LoadStopWords(fileName ...string) error

LoadStopWords loads and creates a new StopWord dictionary from the file.

func (*TagExtracter) WithGse ¶

func (t *TagExtracter) WithGse(segs gse.Segmenter)

WithGse registers the gse segmenter.

type TextRanker ¶

type TextRanker struct {
	HMM bool
	// contains filtered or unexported fields
}

TextRanker is a struct for extracting tags with the TextRank algorithm.

func (*TextRanker) LoadDict ¶

func (t *TextRanker) LoadDict(fileName ...string) error

LoadDict loads and creates a new dictionary from the file for TextRanker.

func (*TextRanker) TextRank ¶

func (t *TextRanker) TextRank(text string, topK int) Segments

TextRank extracts keywords from text using the TextRank algorithm. The topK parameter specifies the maximum number of top keywords to return.

func (*TextRanker) TextRankWithPOS ¶

func (t *TextRanker) TextRankWithPOS(text string, topK int, allowPOS []string) Segments

TextRankWithPOS extracts keywords from text using the TextRank algorithm. The allowPOS parameter is a []string list of allowed parts of speech.
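For orientation, the general TextRank idea can be sketched as follows: build an undirected graph linking words that co-occur within a sliding window, then run a PageRank-style iteration over it. This is a simplified illustration on pre-split tokens, not the code used by TextRanker. The function `textRank`, its parameters (`window`, `iters`, `damping`) and the sample tokens are assumptions of the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// textRank scores words on an undirected co-occurrence graph.
// window is the span size in tokens, so window = 2 links adjacent tokens only.
func textRank(tokens []string, window, iters int, damping float64) map[string]float64 {
	// Edge weight = number of times two distinct words co-occur in a window.
	graph := make(map[string]map[string]float64)
	addEdge := func(a, b string) {
		if graph[a] == nil {
			graph[a] = make(map[string]float64)
		}
		graph[a][b]++
	}
	for i, a := range tokens {
		for j := i + 1; j < len(tokens) && j < i+window; j++ {
			if b := tokens[j]; a != b {
				addEdge(a, b)
				addEdge(b, a)
			}
		}
	}
	scores := make(map[string]float64, len(graph))
	weightSum := make(map[string]float64, len(graph)) // total edge weight per node
	for node, edges := range graph {
		scores[node] = 1.0
		for _, w := range edges {
			weightSum[node] += w
		}
	}
	for it := 0; it < iters; it++ {
		next := make(map[string]float64, len(graph))
		for node, edges := range graph {
			sum := 0.0
			for neighbor, w := range edges {
				// Each neighbor passes on its score in proportion to edge weight.
				sum += w / weightSum[neighbor] * scores[neighbor]
			}
			next[node] = (1 - damping) + damping*sum
		}
		scores = next
	}
	return scores
}

func main() {
	tokens := []string{"rank", "graph", "rank", "word", "rank", "text"}
	scores := textRank(tokens, 2, 30, 0.85)
	words := make([]string, 0, len(scores))
	for w := range scores {
		words = append(words, w)
	}
	sort.Slice(words, func(i, j int) bool {
		if scores[words[i]] != scores[words[j]] {
			return scores[words[i]] > scores[words[j]]
		}
		return words[i] < words[j]
	})
	for _, w := range words {
		fmt.Printf("%s %.3f\n", w, scores[w])
	}
}
```

Words that co-occur with many others, such as "rank" in the sample, end up with the highest scores. A part-of-speech filter like allowPOS would, in this picture, restrict which tokens enter the graph.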

func (*TextRanker) WithGse ¶

func (t *TextRanker) WithGse(segs gse.Segmenter)

WithGse registers the gse segmenter.
