sentencepiece

package

Version: v0.7.0 (latest) | Published: May 24, 2021 | License: BSD-2-Clause | Imports: 5 | Imported by: 0

Repository

github.com/nlpodyssey/spago

Documentation ¶

Index ¶

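type Tokenizer
- func NewFromModelFolder(path string, lowercase bool) (*Tokenizer, error)
- func (t *Tokenizer) Detokenize(tokens []string) string
- func (t *Tokenizer) IDsToTokens(ids []int) []string
- func (t *Tokenizer) Tokenize(text string) []string
- func (t *Tokenizer) TokensToIDs(tokens []string) []int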

Types ¶

type Tokenizer ¶

type Tokenizer struct {
	// contains filtered or unexported fields
}

Tokenizer is a SentencePiece tokenizer.

func NewFromModelFolder ¶

func NewFromModelFolder(path string, lowercase bool) (*Tokenizer, error)

NewFromModelFolder returns a new Tokenizer loaded from the model folder at path, or an error if the model cannot be loaded.

func (*Tokenizer) Detokenize ¶

func (t *Tokenizer) Detokenize(tokens []string) string

Detokenize flattens and merges a list of tokens into a single string.
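
To illustrate what merging pieces into a string usually involves, here is a minimal sketch based on the general SentencePiece convention that "▁" (U+2581) marks a preceding space. The function `detokenizeSketch` is a hypothetical stand-in, not this package's implementation, which may handle additional cases.

```go
package main

import (
	"fmt"
	"strings"
)

// wordBoundary is the "▁" (U+2581) marker that SentencePiece models
// conventionally use to encode a preceding space.
const wordBoundary = "\u2581"

// detokenizeSketch is a hypothetical stand-in, not the package's code.
// It concatenates the pieces, turns boundary markers back into spaces,
// and trims the leading space produced by the first marker.
func detokenizeSketch(pieces []string) string {
	joined := strings.Join(pieces, "")
	return strings.TrimSpace(strings.ReplaceAll(joined, wordBoundary, " "))
}

func main() {
	pieces := []string{"\u2581Hello", "\u2581wor", "ld"}
	fmt.Println(detokenizeSketch(pieces)) // prints "Hello world"
}
```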

func (*Tokenizer) IDsToTokens ¶

func (t *Tokenizer) IDsToTokens(ids []int) []string

IDsToTokens returns a list of string tokens from a list of token IDs. It panics if an ID is not found in the vocabulary.
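
Because the method panics rather than returning an error, callers that receive IDs from untrusted input may want to convert the panic into an error. The following is one possible sketch using `recover`. The names `safeIDsToTokens` and `toyConvert` are hypothetical, and `toyConvert` only imitates the documented panic behavior over a made-up three-entry vocabulary.

```go
package main

import "fmt"

// idsToTokensFunc mirrors the documented signature of (*Tokenizer).IDsToTokens.
type idsToTokensFunc func(ids []int) []string

// safeIDsToTokens calls convert and turns a panic into an ordinary error.
func safeIDsToTokens(convert idsToTokensFunc, ids []int) (tokens []string, err error) {
	defer func() {
		if r := recover(); r != nil {
			tokens = nil
			err = fmt.Errorf("converting IDs to tokens: %v", r)
		}
	}()
	return convert(ids), nil
}

// toyConvert is a hypothetical converter over a made-up vocabulary.
// Like the documented method, it panics on an ID it does not know.
func toyConvert(ids []int) []string {
	vocab := []string{"<unk>", "\u2581Hello", "\u2581world"}
	out := make([]string, len(ids))
	for i, id := range ids {
		if id < 0 || id >= len(vocab) {
			panic(fmt.Sprintf("unknown token ID %d", id))
		}
		out[i] = vocab[id]
	}
	return out
}

func main() {
	tokens, err := safeIDsToTokens(toyConvert, []int{1, 2})
	fmt.Println(len(tokens), err) // prints "2 <nil>"

	_, err = safeIDsToTokens(toyConvert, []int{1, 99})
	fmt.Println(err) // prints "converting IDs to tokens: unknown token ID 99"
}
```

With a real tokenizer value, the method value (for example `tok.IDsToTokens`) should fit the `idsToTokensFunc` parameter, since it has the same signature.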

func (*Tokenizer) Tokenize ¶

func (t *Tokenizer) Tokenize(text string) []string

Tokenize performs sentence-piece tokenization.
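
The documented methods compose into an encode/decode round trip. The sketch below shows that composition against an interface that lists the four method signatures documented on this page. The interface `pieceTokenizer`, the helpers `encode` and `decode`, and the whitespace-based `toyTokenizer` are all hypothetical. The toy type exists only to make the example runnable and does not perform real SentencePiece segmentation.

```go
package main

import (
	"fmt"
	"strings"
)

// pieceTokenizer lists the four methods documented for *Tokenizer, so any
// value with this method set (including a test double) can be passed in.
type pieceTokenizer interface {
	Tokenize(text string) []string
	TokensToIDs(tokens []string) []int
	IDsToTokens(ids []int) []string
	Detokenize(tokens []string) string
}

// encode turns text into token IDs.
func encode(t pieceTokenizer, text string) []int {
	return t.TokensToIDs(t.Tokenize(text))
}

// decode turns token IDs back into text.
func decode(t pieceTokenizer, ids []int) string {
	return t.Detokenize(t.IDsToTokens(ids))
}

// toyTokenizer is a hypothetical whitespace-based stand-in.
type toyTokenizer struct {
	vocab []string // index 0 is the toy unknown token
}

func (t toyTokenizer) Tokenize(text string) []string {
	words := strings.Fields(text)
	for i, w := range words {
		words[i] = "\u2581" + w
	}
	return words
}

func (t toyTokenizer) TokensToIDs(tokens []string) []int {
	ids := make([]int, len(tokens))
	for i, tok := range tokens {
		ids[i] = 0 // default to the toy unknown token
		for id, entry := range t.vocab {
			if entry == tok {
				ids[i] = id
				break
			}
		}
	}
	return ids
}

func (t toyTokenizer) IDsToTokens(ids []int) []string {
	tokens := make([]string, len(ids))
	for i, id := range ids {
		tokens[i] = t.vocab[id] // panics on an out-of-range ID
	}
	return tokens
}

func (t toyTokenizer) Detokenize(tokens []string) string {
	joined := strings.Join(tokens, "")
	return strings.TrimSpace(strings.ReplaceAll(joined, "\u2581", " "))
}

func main() {
	tok := toyTokenizer{vocab: []string{"<unk>", "\u2581hello", "\u2581world"}}
	ids := encode(tok, "hello world")
	fmt.Println(ids)              // prints "[1 2]"
	fmt.Println(decode(tok, ids)) // prints "hello world"
}
```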

func (*Tokenizer) TokensToIDs ¶

func (t *Tokenizer) TokensToIDs(tokens []string) []int

TokensToIDs returns a list of token IDs from a list of string tokens. It panics if a token is not found in the vocabulary and no unknown token is available as a fallback.
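
The lookup behavior described here (use the token's ID, otherwise fall back to the unknown token, otherwise panic) can be sketched as follows. The `toyVocab` type and its fields are hypothetical, and the package's internal vocabulary representation may differ.

```go
package main

import "fmt"

// toyVocab is a hypothetical vocabulary used only for illustration.
type toyVocab struct {
	ids map[string]int
	unk string // name of the unknown token; empty means none is defined
}

// tokensToIDs maps each token to its ID, falls back to the unknown
// token's ID, and panics when neither lookup succeeds.
func (v toyVocab) tokensToIDs(tokens []string) []int {
	out := make([]int, len(tokens))
	for i, tok := range tokens {
		id, ok := v.ids[tok]
		if !ok && v.unk != "" {
			id, ok = v.ids[v.unk]
		}
		if !ok {
			panic(fmt.Sprintf("token %q not in vocabulary and no unknown token defined", tok))
		}
		out[i] = id
	}
	return out
}

func main() {
	v := toyVocab{
		ids: map[string]int{"<unk>": 0, "\u2581Hello": 1, "\u2581world": 2},
		unk: "<unk>",
	}
	// "▁moon" is not in the toy vocabulary, so it maps to the unknown ID.
	fmt.Println(v.tokensToIDs([]string{"\u2581Hello", "\u2581moon"})) // prints "[1 0]"
}
```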

Source Files ¶

tokenizer.go

Directories ¶

Path	Synopsis
internal
sentencepiece	Package sentencepiece implements the SentencePiece encoder (Kudo and Richardson, 2018).