rank

package
v2.1.3
Published: Jul 8, 2021 License: MIT Imports: 2 Imported by: 5

Documentation

Constants

const ByQty = 0

ByQty selects filtering by word occurrence count.

const ByRelation = 1

ByRelation selects filtering by phrase weight.

Variables

This section is empty.

Functions

func Calculate

func Calculate(ranks *Rank, algorithm Algorithm)

Calculate ranks the words using the given Algorithm implementation.

Types

type Algorithm

type Algorithm interface {
	WeightingRelation(
		word1ID int,
		word2ID int,
		rank *Rank,
	) float32

	WeightingHits(
		wordID int,
		rank *Rank,
	) float32
}

The Algorithm interface and its methods make the weighting process polymorphic: any type that implements both methods can be used for weighting.
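As an illustration of that polymorphism, here is a minimal sketch of a custom implementation. To stay self-contained it redefines small local stand-ins for Rank and Word that carry only the fields used here; real code would import the rank package instead. The HitsOnly type and its scoring rule are hypothetical and are not part of the package.

```go
package main

import "fmt"

// Word and Rank are local stand-ins that mirror only the documented
// fields needed for this sketch. They are not the package's full types.
type Word struct {
	Qty int
}

type Rank struct {
	Words map[int]*Word
}

// Algorithm repeats the documented interface.
type Algorithm interface {
	WeightingRelation(word1ID int, word2ID int, rank *Rank) float32
	WeightingHits(wordID int, rank *Rank) float32
}

// HitsOnly is a hypothetical implementation: it scores a word by its raw
// occurrence count and a phrase by the sum of both words' counts.
type HitsOnly struct{}

func (HitsOnly) WeightingHits(wordID int, rank *Rank) float32 {
	return float32(rank.Words[wordID].Qty)
}

func (h HitsOnly) WeightingRelation(word1ID int, word2ID int, rank *Rank) float32 {
	return h.WeightingHits(word1ID, rank) + h.WeightingHits(word2ID, rank)
}

func main() {
	rank := &Rank{Words: map[int]*Word{0: {Qty: 3}, 1: {Qty: 2}}}

	// Any type with both methods satisfies the interface.
	var algorithm Algorithm = HitsOnly{}
	fmt.Println(algorithm.WeightingHits(0, rank))
	fmt.Println(algorithm.WeightingRelation(0, 1, rank))
}
```

Because Calculate accepts the interface rather than a concrete type, a value like this could be passed in place of the bundled implementations.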

type AlgorithmChain

type AlgorithmChain struct{}

AlgorithmChain is a combined implementation of Algorithm. It is a good example of how weighting can be changed by a different implementation. It can weight a word or a phrase by comparing them.

func NewAlgorithmChain

func NewAlgorithmChain() *AlgorithmChain

NewAlgorithmChain returns a pointer to a new AlgorithmChain.

func (*AlgorithmChain) WeightingHits

func (a *AlgorithmChain) WeightingHits(
	wordID int,
	rank *Rank,
) float32

WeightingHits ranks a word by its occurrence count.

func (*AlgorithmChain) WeightingRelation

func (a *AlgorithmChain) WeightingRelation(
	word1ID int,
	word2ID int,
	rank *Rank,
) float32

WeightingRelation weights a phrase using a combined algorithm of text rank and word occurrence.

type AlgorithmDefault

type AlgorithmDefault struct{}

AlgorithmDefault is the basic implementation of Algorithm. It can weight a word or a phrase by comparing them.

func NewAlgorithmDefault

func NewAlgorithmDefault() *AlgorithmDefault

NewAlgorithmDefault returns a pointer to a new AlgorithmDefault.

func (*AlgorithmDefault) WeightingHits

func (a *AlgorithmDefault) WeightingHits(
	wordID int,
	rank *Rank,
) float32

WeightingHits ranks a word by its occurrence count.

func (*AlgorithmDefault) WeightingRelation

func (a *AlgorithmDefault) WeightingRelation(
	word1ID int,
	word2ID int,
	rank *Rank,
) float32

WeightingRelation weights a phrase using the traditional text rank algorithm.

type Phrase

type Phrase struct {
	LeftID  int
	RightID int
	Left    string
	Right   string
	Weight  float32
	Qty     int
}

Phrase contains a single phrase and its data.

LeftID is the ID of the first word.

RightID is the ID of the second word.

Left is the token of the first word.

Right is the token of the second word.

Weight is between 0.00 and 1.00.

Qty is the occurrence count of the phrase.

func FindPhrases

func FindPhrases(ranks *Rank) []Phrase

FindPhrases has the wrapper textrank.FindPhrases. Use the wrapper instead.
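Once a []Phrase is available, a caller will often want only the strongest phrases. The sketch below filters by a weight threshold. It uses a local copy of the documented Phrase struct and hand-written sample data. The topPhrases helper is hypothetical and does not exist in the package.

```go
package main

import "fmt"

// Phrase is a local copy of the documented struct.
type Phrase struct {
	LeftID  int
	RightID int
	Left    string
	Right   string
	Weight  float32
	Qty     int
}

// topPhrases is a hypothetical helper that keeps the phrases whose weight
// is at least threshold, preserving their original order.
func topPhrases(phrases []Phrase, threshold float32) []Phrase {
	var kept []Phrase
	for _, phrase := range phrases {
		if phrase.Weight >= threshold {
			kept = append(kept, phrase)
		}
	}
	return kept
}

func main() {
	// Sample values are invented for illustration only.
	phrases := []Phrase{
		{LeftID: 0, RightID: 1, Left: "text", Right: "rank", Weight: 1.0, Qty: 4},
		{LeftID: 1, RightID: 2, Left: "rank", Right: "word", Weight: 0.25, Qty: 1},
		{LeftID: 2, RightID: 3, Left: "word", Right: "weight", Weight: 0.75, Qty: 3},
	}
	for _, phrase := range topPhrases(phrases, 0.5) {
		fmt.Printf("%s %s %.2f\n", phrase.Left, phrase.Right, phrase.Weight)
	}
}
```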

type Rank

type Rank struct {
	Max         float32
	Min         float32
	Relation    Relation
	SentenceMap map[int]string
	Words       map[int]*Word
	WordValID   map[string]int
}

Rank contains all original raw sentences, words, tokens, phrases, indexes, word hits, phrase hits and minimum and maximum values.

Max is the occurrence count of the most used word.

Min is the occurrence count of the least used word. It is always greater than 0.

Relation is the Relation object, which contains the phrases.

SentenceMap contains the raw sentences. The index is the sentence ID, the value is the sentence itself.

Words contains Word objects. The index is the word ID, the value is a pointer to the Word for that word/token.

WordValID contains words. The index is the word/token, the value is its ID.
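The Max and Min fields, together with weights documented as lying between 0.00 and 1.00, suggest min-max scaling. The sketch below shows how a raw occurrence count could be mapped into that range. This is an assumption for illustration, not the package's confirmed formula, and normalize is a hypothetical helper.

```go
package main

import "fmt"

// normalize is a hypothetical helper that applies min-max scaling.
// It returns 0 when lowest equals highest to avoid dividing by zero.
func normalize(value, lowest, highest float32) float32 {
	if highest == lowest {
		return 0
	}
	return (value - lowest) / (highest - lowest)
}

func main() {
	// Assumed counts: the least used word occurs once, the most used five times.
	var lowest, highest float32 = 1, 5
	for _, qty := range []float32{1, 3, 5} {
		fmt.Printf("qty=%v weight=%.2f\n", qty, normalize(qty, lowest, highest))
	}
}
```

Under this scheme the least used word gets weight 0.00 and the most used word gets weight 1.00.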

func NewRank

func NewRank() *Rank

NewRank returns a pointer to a new Rank.

func (*Rank) AddNewWord

func (rank *Rank) AddNewWord(word string, prevWordIdx int, sentenceID int) (wordID int)

AddNewWord adds a new word to the rank object and assigns its ID.

func (*Rank) GetWordData

func (rank *Rank) GetWordData() map[int]*Word

GetWordData returns all words as a map from word ID to Word pointer.

func (*Rank) IsWordExist

func (rank *Rank) IsWordExist(word string) bool

IsWordExist returns true when the given word is already in the rank.

func (*Rank) UpdateRightConnection

func (rank *Rank) UpdateRightConnection(wordID int, rightWordID int)

UpdateRightConnection adds the right connection to the word. It can be used once a word has been added and the next word is known.

func (*Rank) UpdateWord

func (rank *Rank) UpdateWord(word string, prevWordIdx int, sentenceID int) (wordID int)

UpdateWord updates a word that already exists in the rank object. It returns the word's ID.

type Relation

type Relation struct {
	Max  float32
	Min  float32
	Node map[int]map[int]Score
}

Relation contains the phrase data.

Max is the occurrence count of the most used phrase.

Min is the occurrence count of the least used phrase. It is always greater than 0.

Node contains the Scores. The first ID is the first word, the second ID is the second word, and the value is the Score that contains the data about their relation.
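The nested map layout of Node can be exercised as follows. This is a minimal sketch with a local copy of the documented Score struct. The addRelation function is a hypothetical helper and not the package's AddRelation method; in particular, storing only the (first word, second word) direction is an assumption.

```go
package main

import "fmt"

// Score is a local copy of the documented struct.
type Score struct {
	Qty         int
	Weight      float32
	SentenceIDs []int
}

// addRelation is a hypothetical helper that records one occurrence of the
// word pair in the given sentence.
func addRelation(node map[int]map[int]Score, word1ID, word2ID, sentenceID int) {
	if node[word1ID] == nil {
		node[word1ID] = map[int]Score{}
	}
	score := node[word1ID][word2ID] // zero value when the pair is new
	score.Qty++
	score.SentenceIDs = append(score.SentenceIDs, sentenceID)
	node[word1ID][word2ID] = score // Score is stored by value, so write it back
}

func main() {
	node := map[int]map[int]Score{}
	addRelation(node, 0, 1, 0)
	addRelation(node, 0, 1, 2)
	fmt.Println(node[0][1].Qty, node[0][1].SentenceIDs)
}
```

Because the inner map holds Score values rather than pointers, an element cannot be modified in place and has to be read, changed and stored again.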

func (*Relation) AddRelation

func (relation *Relation) AddRelation(wordID int, relatedWordID int, sentenceID int)

AddRelation adds a new relation to the Relation object.

type Score

type Score struct {
	Qty         int
	Weight      float32
	SentenceIDs []int
}

Score contains data about the relation of two words.

Qty is the occurrence count of the phrase.

Weight is the weight of the phrase, between 0.00 and 1.00.

SentenceIDs contains the IDs of all sentences that contain the phrase.

type Sentence

type Sentence struct {
	ID    int
	Value string
}

Sentence contains a single sentence and its data.

func FindSentences

func FindSentences(ranks *Rank, kind int, limit int) []Sentence

FindSentences has the wrappers textrank.FindSentencesByRelationWeight and textrank.FindSentencesByWordQtyWeight. Use the wrappers instead.

func FindSentencesByPhrases

func FindSentencesByPhrases(ranks *Rank, words []string) []Sentence

FindSentencesByPhrases has the wrapper textrank.FindSentencesByPhraseChain. Use the wrapper instead.

func FindSentencesFrom

func FindSentencesFrom(ranks *Rank, id int, limit int) []Sentence

FindSentencesFrom has the wrapper textrank.FindSentencesFrom. Use the wrapper instead.

type SingleWord

type SingleWord struct {
	ID     int
	Word   string
	Weight float32
	Qty    int
}

SingleWord contains a single word and its data.

ID is the ID of the word.

Word is the word itself, the token.

Weight is the weight of the word, between 0.00 and 1.00.

Qty is the occurrence count of the word.

func FindSingleWords

func FindSingleWords(ranks *Rank) []SingleWord

FindSingleWords has the wrapper textrank.FindSingleWords. Use the wrapper instead.

type Word

type Word struct {
	ID              int
	SentenceIDs     []int
	ConnectionLeft  map[int]int
	ConnectionRight map[int]int
	Token           string
	Qty             int
	Weight          float32
}

Word contains all data about a word.

If a word occurs multiple times in the text, every occurrence points to the same ID, so each Word is unique.

SentenceIDs contains the IDs of all sentences that contain the word.

ConnectionLeft contains all words that are connected to this word on the left side. The map index is the ID of the related word and the value is the occurrence count.

ConnectionRight contains all words that are connected to this word on the right side. The map index is the ID of the related word and the value is the occurrence count.

Token is the word itself in tokenized form, not the original text.

Qty is the occurrence count of the word.

Weight is the weight of the word, between 0.00 and 1.00.
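The uniqueness rule and the two connection maps can be illustrated with a small sketch that indexes a token sequence. It uses a simplified local stand-in for Word and does not call the package. The indexTokens helper and the way it assigns IDs are assumptions made for this example.

```go
package main

import "fmt"

// word is a simplified stand-in for the documented Word struct.
type word struct {
	ID              int
	Token           string
	Qty             int
	ConnectionLeft  map[int]int
	ConnectionRight map[int]int
}

// indexTokens is a hypothetical helper: it assigns one ID per distinct
// token and counts the left and right neighbours of every word.
func indexTokens(tokens []string) (map[int]*word, map[string]int) {
	words := map[int]*word{}
	ids := map[string]int{}
	prev := -1 // no previous word yet
	for _, token := range tokens {
		id, seen := ids[token]
		if !seen {
			id = len(ids)
			ids[token] = id
			words[id] = &word{
				ID:              id,
				Token:           token,
				ConnectionLeft:  map[int]int{},
				ConnectionRight: map[int]int{},
			}
		}
		words[id].Qty++
		if prev >= 0 {
			words[id].ConnectionLeft[prev]++
			words[prev].ConnectionRight[id]++
		}
		prev = id
	}
	return words, ids
}

func main() {
	words, ids := indexTokens([]string{"go", "is", "fun", "go", "is", "fast"})
	goWord := words[ids["go"]]
	fmt.Println(len(words))                        // distinct words
	fmt.Println(goWord.Qty)                        // occurrences of "go"
	fmt.Println(goWord.ConnectionRight[ids["is"]]) // times "is" follows "go"
	fmt.Println(goWord.ConnectionLeft[ids["fun"]]) // times "fun" precedes "go"
}
```

In this sample, "go" appears twice but maps to a single entry, and its right connection to "is" is counted twice.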
