gofiler

v0.1.6
Published: Aug 27, 2021 License: MIT Imports: 6 Imported by: 0

README

Go-Profiler

Cgo bindings for the profiler.

Dependencies:

  • cmake
  • g++/clang++
  • make
  • go/cgo
  • icu
  • libutfcpp-dev

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Candidate

type Candidate struct {
	HistPatterns []PosPat // Historical patterns.
	OCRPatterns  []PosPat // OCR patterns/errors.
	Suggestion   string   // Correction suggestion.
	Modern       string   // Modern lexicon entry.
	Dictionary   string   // Name of the dictionary.
	Weight       float64  // Weight of the candidate.
	Distance     int      // Levenshtein distance.
}

Candidate represents a profiler correction candidate.
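
As a rough illustration of how the Candidate fields might be consumed, the sketch below picks the candidate with the highest Weight from a slice. The struct definitions are local mirrors of the documented types so that the example compiles without the cgo library; the helper bestCandidate, the ranking by Weight, and the sample values are assumptions made for this example and are not part of the package.

```go
package main

import "fmt"

// PosPat and Candidate are local mirrors of the documented structs.
// They exist only so this sketch builds without the cgo bindings.
type PosPat struct {
	Left, Right string
	Pos         int
}

type Candidate struct {
	HistPatterns []PosPat
	OCRPatterns  []PosPat
	Suggestion   string
	Modern       string
	Dictionary   string
	Weight       float64
	Distance     int
}

// bestCandidate is a hypothetical helper that returns the candidate with
// the highest Weight. The boolean is false if the slice is empty.
// Ranking by Weight is an assumption of this sketch.
func bestCandidate(cands []Candidate) (Candidate, bool) {
	if len(cands) == 0 {
		return Candidate{}, false
	}
	best := cands[0]
	for _, c := range cands[1:] {
		if c.Weight > best.Weight {
			best = c
		}
	}
	return best, true
}

func main() {
	// Made-up sample values, for illustration only.
	cands := []Candidate{
		{Suggestion: "theil", Modern: "teil", Weight: 0.25, Distance: 1},
		{Suggestion: "theile", Modern: "teile", Weight: 0.70, Distance: 2},
	}
	if best, ok := bestCandidate(cands); ok {
		fmt.Printf("%s (weight %.2f, distance %d)\n",
			best.Suggestion, best.Weight, best.Distance)
	}
}
```

With the real package, the same loop would presumably run over the Candidates slice of a Token after profiling.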

func (Candidate) String added in v0.1.6

func (c Candidate) String() string

type Document

type Document struct {
	// contains filtered or unexported fields
}

Document wraps a cgo document.

func NewDocument

func NewDocument() Document

NewDocument creates a new document.

func (Document) AddToken

func (doc Document) AddToken(token string) error

AddToken appends a token to the document.

func (Document) AddTokenWithCorrection

func (doc Document) AddTokenWithCorrection(token, cor string) error

AddTokenWithCorrection appends a token with its correction to the document.

func (Document) At

func (doc Document) At(i int) Token

At returns the i-th token in the document.

func (Document) Close

func (doc Document) Close() error

Close closes the document and frees the underlying resources.

func (Document) Len

func (doc Document) Len() int

Len returns the number of tokens in the document.

type PosPat

type PosPat struct {
	Left, Right string // Left and right parts of the pattern.
	Pos         int    // Position where the pattern applies.
}

PosPat represents an error or historic rewrite pattern.

func (PosPat) String added in v0.1.6

func (p PosPat) String() string

type Profiler

type Profiler struct {
	// contains filtered or unexported fields
}

Profiler wraps an underlying cgo profiler.

func New

func New() Profiler

New creates a new profiler.

func (Profiler) Close

func (p Profiler) Close() error

Close closes the Profiler and frees the underlying resources.

func (Profiler) GetAdaptive

func (p Profiler) GetAdaptive() bool

GetAdaptive reports whether adaptive profiling is enabled.

func (Profiler) GetIterations

func (p Profiler) GetIterations() int

GetIterations returns the number of iterations.

func (Profiler) GetTypes

func (p Profiler) GetTypes() bool

GetTypes reports whether the type-based operation is enabled.

func (Profiler) Profile

func (p Profiler) Profile(doc Document) error

Profile profiles the given document. ReadConfig must have been called before Profile.

func (Profiler) ReadConfig

func (p Profiler) ReadConfig(config string) error

ReadConfig reads the profiler configuration file. It must be called before any call to Profile.
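
The documented call order (ReadConfig first, then Profile) can be sketched as follows. This is a minimal sketch, not the package's own example: the document and profiler interfaces are hypothetical and only list the documented methods used here, the real package exposes concrete struct types instead, and fakeDoc and fakeProfiler are stand-ins so the code runs without the cgo library. The path "profiler.ini" is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// Token is a reduced local mirror of the documented Token struct.
type Token struct {
	OCR string
}

// document and profiler are hypothetical interfaces that mirror a subset
// of the documented method sets. They are not defined by the package.
type document interface {
	AddToken(token string) error
	Len() int
	At(i int) Token
	Close() error
}

type profiler interface {
	ReadConfig(config string) error
	Profile(doc document) error
	Close() error
}

// profileTokens follows the documented order: ReadConfig, then AddToken
// for each token, then Profile. Closing is left to the caller.
func profileTokens(p profiler, doc document, config string, tokens []string) error {
	if err := p.ReadConfig(config); err != nil {
		return fmt.Errorf("read config: %w", err)
	}
	for _, t := range tokens {
		if err := doc.AddToken(t); err != nil {
			return fmt.Errorf("add token %q: %w", t, err)
		}
	}
	if err := p.Profile(doc); err != nil {
		return fmt.Errorf("profile: %w", err)
	}
	return nil
}

// fakeDoc and fakeProfiler are in-memory stand-ins for the cgo-backed types.
type fakeDoc struct{ tokens []Token }

func (d *fakeDoc) AddToken(token string) error {
	d.tokens = append(d.tokens, Token{OCR: token})
	return nil
}
func (d *fakeDoc) Len() int       { return len(d.tokens) }
func (d *fakeDoc) At(i int) Token { return d.tokens[i] }
func (d *fakeDoc) Close() error   { return nil }

type fakeProfiler struct{ configured bool }

func (p *fakeProfiler) ReadConfig(config string) error {
	p.configured = true
	return nil
}

// Profile rejects calls made before ReadConfig, as the documentation requires.
func (p *fakeProfiler) Profile(doc document) error {
	if !p.configured {
		return errors.New("Profile called before ReadConfig")
	}
	return nil
}
func (p *fakeProfiler) Close() error { return nil }

func main() {
	p, doc := &fakeProfiler{}, &fakeDoc{}
	// Release resources when done; the fakes have nothing to free.
	defer p.Close()
	defer doc.Close()

	if err := profileTokens(p, doc, "profiler.ini", []string{"vnd", "theil"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	for i := 0; i < doc.Len(); i++ {
		fmt.Println(doc.At(i).OCR)
	}
}
```

Closing is deferred in the caller rather than inside profileTokens so that the tokens can still be read with At after profiling.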

func (Profiler) SetAdaptive

func (p Profiler) SetAdaptive(val bool)

SetAdaptive enables/disables adaptive profiling.

func (Profiler) SetIterations

func (p Profiler) SetIterations(val int)

SetIterations sets the number of iterations.

func (Profiler) SetTypes

func (p Profiler) SetTypes(val bool)

SetTypes enables/disables the type-based operation of the profiler.

type Token

type Token struct {
	OCR        string      // Recognized token from the OCR.
	Cor        string      // Correction for this token (if any).
	GT         string      // Ground-truth for this token (if any).
	Candidates []Candidate // List of correction candidates (if any).
}

Token represents a token in the document. Every token in the document has its OCR value set. If ground-truth or correction information is available, the corresponding fields are non-empty. After the document has been profiled, suspicious tokens have a list of correction candidates. Note that lexicon entries also have one candidate in their Candidates slice (the corresponding lexicon entry with a Levenshtein distance of 0).

func (Token) IsLexiconEntry

func (t Token) IsLexiconEntry() bool

IsLexiconEntry returns true if the token is a lexicon entry.

func (Token) IsUnknown

func (t Token) IsUnknown() bool

IsUnknown returns true if the token is unknown, i.e. there are no correction suggestions for the token.
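
The token descriptions above suggest a three-way split after profiling: unknown tokens, lexicon entries, and suspicious tokens with real correction candidates. The sketch below shows one way to express that split using only the Candidates slice. It uses reduced local mirrors of the documented structs, and the helpers isUnknown, isLexiconEntry, and classify are hypothetical; in particular, the lexicon check (one candidate with distance 0) is inferred from the Token description and may differ from how IsLexiconEntry is actually implemented.

```go
package main

import "fmt"

// Candidate and Token are reduced local mirrors of the documented structs.
type Candidate struct {
	Suggestion string
	Distance   int
}

type Token struct {
	OCR        string
	Candidates []Candidate
}

// isUnknown follows the documented meaning of IsUnknown:
// the token has no correction suggestions.
func isUnknown(t Token) bool {
	return len(t.Candidates) == 0
}

// isLexiconEntry is an assumption based on the Token description:
// a lexicon entry has one candidate with a Levenshtein distance of 0.
func isLexiconEntry(t Token) bool {
	return len(t.Candidates) == 1 && t.Candidates[0].Distance == 0
}

// classify labels a profiled token as unknown, lexicon entry, or suspicious.
func classify(t Token) string {
	switch {
	case isUnknown(t):
		return "unknown"
	case isLexiconEntry(t):
		return "lexicon entry"
	default:
		return "suspicious"
	}
}

func main() {
	// Made-up sample tokens, for illustration only.
	tokens := []Token{
		{OCR: "xqzt"},
		{OCR: "teil", Candidates: []Candidate{{Suggestion: "teil", Distance: 0}}},
		{OCR: "tcil", Candidates: []Candidate{{Suggestion: "teil", Distance: 1}}},
	}
	for _, t := range tokens {
		fmt.Printf("%s: %s\n", t.OCR, classify(t))
	}
}
```

With the real package, the documented methods Token.IsUnknown and Token.IsLexiconEntry should be preferred over these inferred checks.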
