snowballword

package
v0.6.0
Published: Oct 29, 2018 License: MIT Imports: 1 Imported by: 37

Documentation

Overview

This package defines a SnowballWord struct that is used to encapsulate most of the "state" variables we must track when stemming a word. The SnowballWord struct also has a few methods common to stemming in a variety of languages.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type SnowballWord

type SnowballWord struct {

	// A slice of runes
	RS []rune

	// The index in RS where the R1 region begins
	R1start int

	// The index in RS where the R2 region begins
	R2start int

	// The index in RS where the RV region begins
	RVstart int
}

SnowballWord represents a word that is going to be stemmed.
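
Because the word is stored as a slice of runes, the region offsets are rune indices rather than byte offsets. The following is a minimal sketch of that idea, not the package's own code: the `word` type and its `r1` method are hypothetical stand-ins, and the `R1start` value is hand-picked for illustration rather than computed by any stemming rule.

```go
package main

import "fmt"

// word is a hypothetical stand-in for SnowballWord, reduced to the
// rune slice and a single region offset.
type word struct {
	RS      []rune
	R1start int
}

// r1 returns the runes from R1start to the end of the word.
// It assumes 0 <= R1start <= len(RS).
func (w *word) r1() []rune {
	return w.RS[w.R1start:]
}

func main() {
	// "\u00ef" is a single rune that occupies two bytes in UTF-8.
	s := "na\u00efvely"
	w := &word{RS: []rune(s), R1start: 4} // offset chosen by hand

	fmt.Println(len(s), len(w.RS)) // byte length vs. rune length
	fmt.Println(string(w.r1()))
}
```

Indexing by rune keeps suffix lengths and region offsets consistent for words that contain multi-byte characters.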

func New

func New(in string) (word *SnowballWord)

New creates a SnowballWord from the input string.

func (*SnowballWord) DebugString

func (w *SnowballWord) DebugString() string

func (*SnowballWord) FirstPrefix

func (w *SnowballWord) FirstPrefix(prefixes ...string) (foundPrefix string, foundPrefixRunes []rune)

Returns the first of `prefixes` that the word starts with, or the empty string if none matches.
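
A first-match search like this depends on the order of the candidates. The sketch below is a standalone illustration under that reading; `firstPrefix` is a hypothetical free function operating on a plain rune slice, not the package's method.

```go
package main

import "fmt"

// firstPrefix returns the first candidate, in argument order, that rs
// starts with. It returns "" and nil when no candidate matches.
func firstPrefix(rs []rune, prefixes ...string) (string, []rune) {
	for _, p := range prefixes {
		pr := []rune(p)
		if len(pr) > len(rs) {
			continue // candidate is longer than the word
		}
		match := true
		for i, r := range pr {
			if rs[i] != r {
				match = false
				break
			}
		}
		if match {
			return p, pr
		}
	}
	return "", nil
}

func main() {
	rs := []rune("generally")
	p, _ := firstPrefix(rs, "gen", "gener")
	fmt.Println(p)
	p, _ = firstPrefix(rs, "gener", "gen")
	fmt.Println(p)
}
```

With this kind of search, callers that want the longest match list the longer candidates first.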

func (*SnowballWord) FirstSuffix

func (w *SnowballWord) FirstSuffix(suffixes ...string) (suffix string, suffixRunes []rune)

Returns the first of `suffixes` that the word ends with, or the empty string if none matches.

func (*SnowballWord) FirstSuffixIfIn

func (w *SnowballWord) FirstSuffixIfIn(startPos, endPos int, suffixes ...string) (suffix string, suffixRunes []rune)

Finds the first of the provided suffixes that ends at `endPos` in the word, then checks whether that suffix begins after `startPos`. If it does, the suffix is returned; otherwise the empty string and an empty rune slice are returned. This may seem a counterintuitive way to search, but it matches what the Snowball stemmer steps require most of the time.
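
The difference between "match first, then check the position" and "search only inside the window" is easiest to see side by side. The sketch below uses hypothetical free functions (`endsWith`, `firstSuffixIfIn`, `firstSuffixIn`) on a plain rune slice. It assumes that "begins after startPos" means at or after that index, and that the plain "In" variant looks only at `rs[startPos:endPos]`; both are readings of the descriptions on this page rather than confirmed behavior.

```go
package main

import "fmt"

// endsWith reports whether rs ends with suffix.
func endsWith(rs, suffix []rune) bool {
	if len(suffix) > len(rs) {
		return false
	}
	return string(rs[len(rs)-len(suffix):]) == string(suffix)
}

// firstSuffixIfIn picks the first listed suffix that ends at endPos,
// then accepts it only if it starts at or after startPos (assumed).
func firstSuffixIfIn(rs []rune, startPos, endPos int, suffixes ...string) (string, []rune) {
	for _, s := range suffixes {
		sr := []rune(s)
		if endsWith(rs[:endPos], sr) {
			if endPos-len(sr) >= startPos {
				return s, sr
			}
			// The first match starts too early: give up rather than
			// falling through to later candidates.
			return "", []rune{}
		}
	}
	return "", []rune{}
}

// firstSuffixIn only ever looks inside rs[startPos:endPos] (assumed).
func firstSuffixIn(rs []rune, startPos, endPos int, suffixes ...string) (string, []rune) {
	for _, s := range suffixes {
		sr := []rune(s)
		if endsWith(rs[startPos:endPos], sr) {
			return s, sr
		}
	}
	return "", []rune{}
}

func main() {
	rs := []rune("rationally")
	a, _ := firstSuffixIfIn(rs, 5, len(rs), "ationally", "ally")
	b, _ := firstSuffixIn(rs, 5, len(rs), "ationally", "ally")
	fmt.Printf("IfIn=%q In=%q\n", a, b)
}
```

In this example the "IfIn" search stops at "ationally", which starts before index 5, so it reports no suffix, while the windowed search never sees "ationally" and settles on "ally".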

func (*SnowballWord) FirstSuffixIn

func (w *SnowballWord) FirstSuffixIn(startPos, endPos int, suffixes ...string) (suffix string, suffixRunes []rune)

func (*SnowballWord) FitsInR1

func (w *SnowballWord) FitsInR1(x int) bool

Returns true if `x` runes would fit into R1.

func (*SnowballWord) FitsInR2

func (w *SnowballWord) FitsInR2(x int) bool

Returns true if `x` runes would fit into R2.

func (*SnowballWord) FitsInRV

func (w *SnowballWord) FitsInRV(x int) bool

Returns true if `x` runes would fit into RV.
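
The three FitsIn checks share one piece of arithmetic: a region that starts at some index and runs to the end of the word can hold `x` runes only if the start index is no greater than the word length minus `x`. The sketch below illustrates that reading with a hypothetical `fitsIn` helper; it is not the package's code.

```go
package main

import "fmt"

// fitsIn reports whether x runes fit in a region that begins at
// regionStart and extends to the end of a word of wordLen runes.
func fitsIn(regionStart, wordLen, x int) bool {
	return regionStart <= wordLen-x
}

func main() {
	// A 9-rune word whose region starts at index 5 has 4 runes in it.
	fmt.Println(fitsIn(5, 9, 4)) // a 4-rune suffix fits
	fmt.Println(fitsIn(5, 9, 5)) // a 5-rune suffix does not
}
```

A stemming step can use a check of this shape to confirm that a suffix lies wholly inside a region before removing it.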

func (*SnowballWord) HasSuffixRunes

func (w *SnowballWord) HasSuffixRunes(suffixRunes []rune) bool

Returns true if `w` ends with `suffixRunes`.

func (*SnowballWord) HasSuffixRunesIn

func (w *SnowballWord) HasSuffixRunesIn(startPos, endPos int, suffixRunes []rune) bool

Returns true if `w.RS[startPos:endPos]` ends with `suffixRunes`. That is, the slice of runes between `startPos` and `endPos` has `suffixRunes` as its suffix.
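
A windowed suffix test of this kind can be written by comparing runes backwards from `endPos`. The following is a minimal sketch with a hypothetical free function, assuming `0 <= startPos <= endPos <= len(rs)`.

```go
package main

import "fmt"

// hasSuffixRunesIn reports whether rs[startPos:endPos] ends with suffix.
func hasSuffixRunesIn(rs []rune, startPos, endPos int, suffix []rune) bool {
	if len(suffix) > endPos-startPos {
		return false // suffix cannot fit inside the window
	}
	// Walk backwards from the end of the window.
	for i := 1; i <= len(suffix); i++ {
		if rs[endPos-i] != suffix[len(suffix)-i] {
			return false
		}
	}
	return true
}

func main() {
	rs := []rune("running")
	fmt.Println(hasSuffixRunesIn(rs, 0, 7, []rune("ing")))
	fmt.Println(hasSuffixRunesIn(rs, 5, 7, []rune("ing")))
	fmt.Println(hasSuffixRunesIn(rs, 0, 4, []rune("nn")))
}
```

The second call fails even though the word ends in "ing", because the window `rs[5:7]` is only two runes long.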

func (*SnowballWord) R1

func (w *SnowballWord) R1() []rune

Return the R1 region as a slice of runes

func (*SnowballWord) R1String

func (w *SnowballWord) R1String() string

Return the R1 region as a string

func (*SnowballWord) R2

func (w *SnowballWord) R2() []rune

Return the R2 region as a slice of runes

func (*SnowballWord) R2String

func (w *SnowballWord) R2String() string

Return the R2 region as a string

func (*SnowballWord) RV

func (w *SnowballWord) RV() []rune

Return the RV region as a slice of runes

func (*SnowballWord) RVString

func (w *SnowballWord) RVString() string

Return the RV region as a string

func (*SnowballWord) RemoveFirstSuffix

func (w *SnowballWord) RemoveFirstSuffix(suffixes ...string) (suffix string, suffixRunes []rune)

Removes the first of `suffixes` found at the end of the word.

func (*SnowballWord) RemoveFirstSuffixIfIn

func (w *SnowballWord) RemoveFirstSuffixIfIn(startPos int, suffixes ...string) (suffix string, suffixRunes []rune)

Finds the first of the provided suffixes that the word ends with, then checks whether it begins after `startPos`. If it does, the suffix is removed.

func (*SnowballWord) RemoveFirstSuffixIn

func (w *SnowballWord) RemoveFirstSuffixIn(startPos int, suffixes ...string) (suffix string, suffixRunes []rune)

Removes the first suffix found that is in `w.RS[startPos:len(w.RS)]`.

func (*SnowballWord) RemoveLastNRunes

func (w *SnowballWord) RemoveLastNRunes(n int)

Remove the last `n` runes from the SnowballWord.

func (*SnowballWord) ReplaceSuffix

func (w *SnowballWord) ReplaceSuffix(suffix, replacement string, force bool) bool

Replace a suffix and adjust R1start and R2start as needed. If `force` is false, check to make sure the suffix exists first.

func (*SnowballWord) ReplaceSuffixRunes

func (w *SnowballWord) ReplaceSuffixRunes(suffixRunes []rune, replacementRunes []rune, force bool) bool

Replace a suffix and adjust R1start and R2start as needed. If `force` is false, check to make sure the suffix exists first.
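
The sketch below illustrates one plausible reading of "adjust R1start and R2start as needed": after the word shrinks, any region start that now points past the end is clamped to the new length. The `word` type and its methods are hypothetical stand-ins, and the clamping rule is an assumption, not behavior confirmed by this page.

```go
package main

import "fmt"

// word is a hypothetical stand-in for SnowballWord with only the
// fields this sketch needs.
type word struct {
	RS      []rune
	R1start int
	R2start int
}

// hasSuffix reports whether the word ends with suffix.
func (w *word) hasSuffix(suffix []rune) bool {
	if len(suffix) > len(w.RS) {
		return false
	}
	return string(w.RS[len(w.RS)-len(suffix):]) == string(suffix)
}

// replaceSuffix swaps suffix for replacement. When force is false the
// suffix is verified first; when force is true the caller vouches for it.
func (w *word) replaceSuffix(suffix, replacement string, force bool) bool {
	sr := []rune(suffix)
	if !force && !w.hasSuffix(sr) {
		return false
	}
	w.RS = append(w.RS[:len(w.RS)-len(sr)], []rune(replacement)...)
	// Assumed adjustment: clamp region starts that now point past the end.
	if w.R1start > len(w.RS) {
		w.R1start = len(w.RS)
	}
	if w.R2start > len(w.RS) {
		w.R2start = len(w.RS)
	}
	return true
}

func main() {
	w := &word{RS: []rune("ties"), R1start: 4, R2start: 4}
	fmt.Println(w.replaceSuffix("ing", "", false), string(w.RS))
	fmt.Println(w.replaceSuffix("ies", "i", false), string(w.RS), w.R1start, w.R2start)
}
```

Passing `force` as true skips the check, which is only safe when the caller has already matched the suffix, for example with one of the FirstSuffix methods.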

func (*SnowballWord) String

func (w *SnowballWord) String() string

Return the SnowballWord as a string
