multilang

package
v0.0.0-...-ef9048a
Warning

This package is not in the latest version of its module.
Published: Jul 25, 2024 License: AGPL-3.0 Imports: 39 Imported by: 0

Documentation

Constants

const (
	// Name is the name used for all components.
	Name = "multilang"
	// LangDivider is a special symbol appended to the end of the input;
	// the name of the detected language is stored after it.
	LangDivider = byte('_')
)
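
The tagging convention described by LangDivider can be illustrated with a minimal sketch. The helpers `appendLang` and `splitLang` below are hypothetical and not part of this package; the sketch also assumes the language name itself never contains the divider, so the last `_` in the tagged input is always the one added by the filter.

```go
package main

import (
	"bytes"
	"fmt"
)

// langDivider mirrors the documented LangDivider constant.
const langDivider = byte('_')

// appendLang is a hypothetical helper: it returns input + '_' + lang
// without modifying the caller's slice.
func appendLang(input []byte, lang string) []byte {
	out := make([]byte, 0, len(input)+1+len(lang))
	out = append(out, input...)
	out = append(out, langDivider)
	return append(out, lang...)
}

// splitLang is a hypothetical helper: it splits tagged input at the LAST
// divider, so underscores inside the original text are preserved.
func splitLang(tagged []byte) (text []byte, lang string, ok bool) {
	i := bytes.LastIndexByte(tagged, langDivider)
	if i < 0 {
		return tagged, "", false
	}
	return tagged[:i], string(tagged[i+1:]), true
}

func main() {
	tagged := appendLang([]byte("hello_world"), "en")
	text, lang, _ := splitLang(tagged)
	fmt.Printf("%s | %s | %s\n", tagged, text, lang)
	// prints "hello_world_en | hello_world | en"
}
```

Splitting at the last divider rather than the first is what keeps input such as `hello_world` intact in this sketch.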

Variables

This section is empty.

Functions

func Register

func Register(detector lingua.LanguageDetector, defaultLang string)

Register registers the multilang analyzer, using the given language detector and default language.

Types

type CharFilter

type CharFilter struct {
	// contains filtered or unexported fields
}

CharFilter detects the language of the input and appends the language name to the input bytes.

func (*CharFilter) Filter

func (c *CharFilter) Filter(input []byte) []byte

Filter detects the language of the input and appends the language name to the end of the input bytes.

type Tokenizer

type Tokenizer struct {
	// contains filtered or unexported fields
}

Tokenizer converts input bytes into a token stream using the analyzer for the detected language.

func (*Tokenizer) Tokenize

func (d *Tokenizer) Tokenize(input []byte) analysis.TokenStream

Tokenize converts input bytes into a token stream using the analyzer for the detected language.
