dictionary

package
v0.0.181
Warning

This package is not in the latest version of its module.

Published: Mar 19, 2024 License: Apache-2.0 Imports: 16 Imported by: 5

Documentation


Constants

This section is empty.

Variables

This section is empty.

Functions

func LoadDictKeys added in v0.0.101

func LoadDictKeys(appConfig config.AppConfig) (*map[string]bool, error)

LoadDictKeys loads only the dictionary keys from static files

func Ngrams added in v0.0.141

func Ngrams(chars []string, minLen int) []string

Ngrams returns the set of all substrings of chars that are longer than minLen characters
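The behavior described above can be illustrated with a minimal sketch. This is not the package's implementation: ngramsSketch is a hypothetical stand-in, and it assumes that each element of chars is one character and that "longer than minLen" is a strict comparison.

```go
package main

import "fmt"

// ngramsSketch is a hypothetical stand-in for Ngrams, not the package's code.
// It returns every distinct contiguous substring of chars whose length, counted
// in elements, is strictly greater than minLen, in order of first appearance.
func ngramsSketch(chars []string, minLen int) []string {
	seen := make(map[string]bool)
	out := []string{}
	for i := 0; i < len(chars); i++ {
		s := ""
		for j := i; j < len(chars); j++ {
			s += chars[j] // s is now chars[i..j] joined together
			if j-i+1 > minLen && !seen[s] {
				seen[s] = true
				out = append(out, s)
			}
		}
	}
	return out
}

func main() {
	// Each element is assumed to hold a single Chinese character.
	fmt.Println(ngramsSketch([]string{"大", "乘", "佛", "教"}, 1))
	// prints [大乘 大乘佛 大乘佛教 乘佛 乘佛教 佛教]
}
```

Taking a []string rather than a string keeps the sketch independent of how the caller splits text into characters.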

func ValidateDict added in v0.0.18

func ValidateDict(wdict map[string]*dicttypes.Word, validator Validator) error

ValidateDict checks the Chinese-English dictionary for errors

Types

type Dictionary added in v0.0.66

type Dictionary struct {
	// Forward dictionary, lookup by Chinese word
	Wdict       map[string]*dicttypes.Word
	HeadwordIds map[int]*dicttypes.Word
}

Dictionary is a struct to hold word dictionary indexes

func LoadDictFile added in v0.0.66

func LoadDictFile(appConfig config.AppConfig) (*Dictionary, error)

LoadDictFile loads all words from static files

func LoadDictURL added in v0.0.66

func LoadDictURL(appConfig config.AppConfig, url string) (*Dictionary, error)

LoadDictURL loads all words from a URL

func NewDictionary added in v0.0.66

func NewDictionary(wdict map[string]*dicttypes.Word) *Dictionary

type FsClient added in v0.0.148

type FsClient interface {
	Collection(path string) *firestore.CollectionRef
}

FsClient defines the subset of the Firestore client interface that this package needs

type HeadwordSubstrings added in v0.0.142

type HeadwordSubstrings struct {
	HeadwordId  int64    `firestore:"headword_id"`
	Simplified  string   `firestore:"simplified"`
	Traditional string   `firestore:"traditional"`
	Pinyin      string   `firestore:"pinyin"`
	Substrings  []string `firestore:"substrings"`
}

HeadwordSubstrings holds substrings of a headword

type NotesExtractor added in v0.0.92

type NotesExtractor struct {
	// contains filtered or unexported fields
}

NotesExtractor extracts multilingual equivalents from notes using regular expressions.

func NewNotesExtractor added in v0.0.92

func NewNotesExtractor(patternList string) (*NotesExtractor, error)

NewNotesExtractor creates a new NotesExtractor.

func (NotesExtractor) Extract added in v0.0.92

func (n NotesExtractor) Extract(notes string) []string

Extract extracts multilingual equivalents from the given note
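As a rough illustration of regex-based extraction, the sketch below collects the first capture group of every match. It does not reproduce NotesExtractor: extractorSketch, newExtractorSketch, and the example patterns are hypothetical, and because the format of patternList is not documented here, the sketch accepts the patterns as separate strings.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// extractorSketch is a hypothetical stand-in for NotesExtractor.
type extractorSketch struct {
	patterns []*regexp.Regexp
}

// newExtractorSketch compiles each expression and fails on the first invalid one.
func newExtractorSketch(exprs ...string) (*extractorSketch, error) {
	ex := &extractorSketch{}
	for _, e := range exprs {
		re, err := regexp.Compile(e)
		if err != nil {
			return nil, fmt.Errorf("compiling %q: %w", e, err)
		}
		ex.patterns = append(ex.patterns, re)
	}
	return ex, nil
}

// extract returns the first capture group of every match of every pattern.
func (ex *extractorSketch) extract(notes string) []string {
	out := []string{}
	for _, re := range ex.patterns {
		for _, m := range re.FindAllStringSubmatch(notes, -1) {
			if len(m) > 1 {
				out = append(out, strings.TrimSpace(m[1]))
			}
		}
	}
	return out
}

func main() {
	// The patterns and the note text are made up for this example.
	ex, err := newExtractorSketch(`Sanskrit equivalent: ([^,;]+)`, `Pali: ([^,;]+)`)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ex.extract("Sanskrit equivalent: dharma, Pali: dhamma; a Buddhist term"))
	// prints [dharma dhamma]
}
```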

type NotesProcessor added in v0.0.67

type NotesProcessor struct {
	// contains filtered or unexported fields
}

NotesProcessor processes notes with a regular expression

func NewNotesProcessor added in v0.0.67

func NewNotesProcessor(patternList, replaceList string) NotesProcessor

NewNotesProcessor creates a new NotesProcessor. Params:

patternList a list of regular expression patterns to match, quoted and delimited by commas
replaceList a list of replacement expressions, with the same cardinality as patternList

func (NotesProcessor) Process added in v0.0.67

Process checks all senses in the word and replaces the notes using the regular expressions
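The pairing of patterns with replacements can be sketched as follows. This is an illustration only: processorSketch and its functions are hypothetical, the sketch takes already-split slices instead of parsing the quoted, comma-delimited lists, and it rewrites a single note string rather than the senses of a word.

```go
package main

import (
	"fmt"
	"regexp"
)

// processorSketch is a hypothetical stand-in for NotesProcessor.
type processorSketch struct {
	patterns     []*regexp.Regexp
	replacements []string
}

// newProcessorSketch pairs each pattern with the replacement at the same index.
func newProcessorSketch(patterns, replacements []string) (processorSketch, error) {
	if len(patterns) != len(replacements) {
		return processorSketch{}, fmt.Errorf("got %d patterns but %d replacements",
			len(patterns), len(replacements))
	}
	p := processorSketch{replacements: replacements}
	for _, expr := range patterns {
		re, err := regexp.Compile(expr)
		if err != nil {
			return processorSketch{}, fmt.Errorf("compiling %q: %w", expr, err)
		}
		p.patterns = append(p.patterns, re)
	}
	return p, nil
}

// process applies every pattern and replacement pair to the note, in order.
func (p processorSketch) process(note string) string {
	for i, re := range p.patterns {
		note = re.ReplaceAllString(note, p.replacements[i])
	}
	return note
}

func main() {
	// The pattern and replacement are made up for this example.
	p, err := newProcessorSketch(
		[]string{`Scientific name: (\w+ \w+)`},
		[]string{`Scientific name: <i>${1}</i>`},
	)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(p.process("Scientific name: Panthera tigris (tiger)"))
	// prints Scientific name: <i>Panthera tigris</i> (tiger)
}
```

Writing the group reference as ${1} rather than $1 avoids ambiguity when the replacement text continues with letters or digits.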

type Results

type Results struct {
	Words []dicttypes.Word
}

Results encapsulates term lookup results

type ReverseIndex added in v0.0.92

type ReverseIndex interface {
	// Find searches from English, pinyin, or multilingual equivalents contained in notes to Chinese
	Find(ctx context.Context, query string) ([]dicttypes.WordSense, error)
}

ReverseIndex searches the dictionary by reverse lookup, for example from English to Chinese

func NewReverseIndex added in v0.0.92

func NewReverseIndex(dict *Dictionary, nExtractor *NotesExtractor) ReverseIndex

type SubstringIndex added in v0.0.92

type SubstringIndex interface {
	LookupSubstr(ctx context.Context, query, topic_en, subtopic_en string) (*Results, error)
}

func NewSubstringIndexFS added in v0.0.145

func NewSubstringIndexFS(client FsClient, corpus string, generation int, dict *Dictionary) (SubstringIndex, error)

NewSubstringIndexFS initializes an implementation of SubstringIndex using the index saved in Firestore

func NewSubstringIndexMem added in v0.0.92

func NewSubstringIndexMem(ctx context.Context) (SubstringIndex, error)

NewSubstringIndexMem initializes a SubstringIndexMem

type SubstringIndexMem added in v0.0.92

type SubstringIndexMem struct {
}

SubstringIndexMem looks up substrings from a map loaded from a file.

func (SubstringIndexMem) LookupSubstr added in v0.0.92

func (searcher SubstringIndexMem) LookupSubstr(ctx context.Context, query, topic_en, subtopic_en string) (*Results, error)

LookupSubstr looks up a term based on a substring and a topic
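One way to picture an in-memory substring lookup is a map from each substring to the headwords that contain it, filtered by topic at query time. The sketch below is an assumption about the general technique, not the code behind SubstringIndexMem: the word and results types are simplified stand-ins for the dicttypes and Results types, the index is built from a slice rather than loaded from a file, the context parameter is omitted, and an empty topic or subtopic is treated as "no filter".

```go
package main

import "fmt"

// word and results are simplified, hypothetical stand-ins for the real types.
type word struct {
	Simplified string
	TopicEn    string
	SubtopicEn string
}

type results struct {
	Words []word
}

// memIndexSketch maps every substring of a headword to the words containing it.
type memIndexSketch struct {
	bySubstr map[string][]word
}

// newMemIndexSketch indexes each word under all of its distinct substrings.
func newMemIndexSketch(words []word) memIndexSketch {
	idx := memIndexSketch{bySubstr: make(map[string][]word)}
	for _, w := range words {
		r := []rune(w.Simplified)
		seen := make(map[string]bool)
		for i := 0; i < len(r); i++ {
			for j := i + 1; j <= len(r); j++ {
				s := string(r[i:j])
				if !seen[s] {
					seen[s] = true
					idx.bySubstr[s] = append(idx.bySubstr[s], w)
				}
			}
		}
	}
	return idx
}

// lookupSubstr returns the words containing query that match the topic filters.
func (m memIndexSketch) lookupSubstr(query, topicEn, subtopicEn string) *results {
	res := &results{}
	for _, w := range m.bySubstr[query] {
		if topicEn != "" && w.TopicEn != topicEn {
			continue
		}
		if subtopicEn != "" && w.SubtopicEn != subtopicEn {
			continue
		}
		res.Words = append(res.Words, w)
	}
	return res
}

func main() {
	// The entries and topic names are made up for this example.
	idx := newMemIndexSketch([]word{
		{Simplified: "佛教", TopicEn: "Buddhism"},
		{Simplified: "佛法", TopicEn: "Buddhism"},
		{Simplified: "教育", TopicEn: "Education"},
	})
	fmt.Println(len(idx.lookupSubstr("教", "", "").Words))         // prints 2
	fmt.Println(len(idx.lookupSubstr("教", "Buddhism", "").Words)) // prints 1
}
```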

type Validator added in v0.0.18

type Validator interface {
	Validate(pos, domain string) error
}

Validator performs validation of dictionary entries. Use NewValidator to create a Validator.

func NewValidator added in v0.0.18

func NewValidator(posReader io.Reader, domainReader io.Reader) (Validator, error)

NewValidator creates a Validator with the given readers. Params:

posReader Reader to load the valid parts of speech from
domainReader Reader to load the valid subject domains

Returns:

An initialized Validator
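A custom implementation of the Validator interface might look like the sketch below. It is not the package's validator: the local validator interface only mirrors the Validate method shown above, setValidator and its helpers are hypothetical, and the input format (one allowed value per line) is an assumption, since the format expected by NewValidator is not documented here.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// validator mirrors the Validate method of the Validator interface.
type validator interface {
	Validate(pos, domain string) error
}

// setValidator is a hypothetical implementation backed by two sets.
type setValidator struct {
	pos     map[string]bool
	domains map[string]bool
}

// readSet reads one value per line (an assumed format), skipping blank lines.
func readSet(r io.Reader) (map[string]bool, error) {
	set := make(map[string]bool)
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if v := strings.TrimSpace(sc.Text()); v != "" {
			set[v] = true
		}
	}
	return set, sc.Err()
}

// newSetValidator loads the valid parts of speech and subject domains.
func newSetValidator(posReader, domainReader io.Reader) (validator, error) {
	pos, err := readSet(posReader)
	if err != nil {
		return nil, fmt.Errorf("reading parts of speech: %w", err)
	}
	domains, err := readSet(domainReader)
	if err != nil {
		return nil, fmt.Errorf("reading domains: %w", err)
	}
	return setValidator{pos: pos, domains: domains}, nil
}

// Validate reports an error if either value is not in its allowed set.
func (v setValidator) Validate(pos, domain string) error {
	if !v.pos[pos] {
		return fmt.Errorf("invalid part of speech: %q", pos)
	}
	if !v.domains[domain] {
		return fmt.Errorf("invalid domain: %q", domain)
	}
	return nil
}

// exampleValidator builds a validator from made-up in-memory lists.
func exampleValidator() (validator, error) {
	return newSetValidator(
		strings.NewReader("noun\nverb\n"),
		strings.NewReader("Buddhism\nModern Chinese\n"),
	)
}

func main() {
	v, err := exampleValidator()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(v.Validate("noun", "Buddhism")) // prints <nil>
	fmt.Println(v.Validate("nuon", "Buddhism")) // prints invalid part of speech: "nuon"
}
```

Accepting io.Reader values, as NewValidator does, lets tests supply in-memory lists instead of files.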
