Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
func EscDetector
EscDetector checks whether the buffer contains escape sequences commonly used in certain character encodings.
func HighByteDetector
HighByteDetector checks whether the buffer contains any byte with a value >= 0x80.
func WinByteDetector
WinByteDetector checks whether the buffer contains Windows-specific byte values in the range 0x80-0x9F.
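The three checks above can be sketched as plain byte scans. This is a minimal illustration, not the package's implementation: the signatures of the real functions are not shown on this page, so `hasEscape`, `hasHighByte`, and `hasWinByte` are hypothetical names. The escape check assumes the same markers as Python's chardet (the ESC byte 0x1B, or the `~{` sequence used by HZ-GB-2312), which may differ from what `EscDetector` looks for.

```go
package main

import (
	"bytes"
	"fmt"
)

// hasEscape reports whether buf contains an ESC byte (0x1B) or the "~{"
// marker. Which sequences EscDetector really matches is an assumption here.
func hasEscape(buf []byte) bool {
	return bytes.IndexByte(buf, 0x1B) >= 0 || bytes.Contains(buf, []byte("~{"))
}

// hasHighByte reports whether buf contains any byte >= 0x80.
func hasHighByte(buf []byte) bool {
	for _, b := range buf {
		if b >= 0x80 {
			return true
		}
	}
	return false
}

// hasWinByte reports whether buf contains any byte in the range 0x80-0x9F.
func hasWinByte(buf []byte) bool {
	for _, b := range buf {
		if b >= 0x80 && b <= 0x9F {
			return true
		}
	}
	return false
}

func main() {
	samples := map[string][]byte{
		"ascii":   []byte("plain ASCII"),
		"latin1":  {'c', 'a', 'f', 0xE9},  // 0xE9 is a high byte outside 0x80-0x9F
		"cp1252":  {0x93, 'h', 'i', 0x94}, // 0x93 and 0x94 fall inside 0x80-0x9F
		"iso2022": {0x1B, '$', 'B'},       // begins with the ESC byte
	}
	for name, buf := range samples {
		fmt.Printf("%-8s high=%v win=%v esc=%v\n",
			name, hasHighByte(buf), hasWinByte(buf), hasEscape(buf))
	}
}
```

Every byte that satisfies the Windows-range check also satisfies the high-byte check, so the second is a strict refinement of the first.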
Types
type Result
type Result struct {
	// Encoding is the detected character encoding name
	Encoding string `json:"encoding,omitempty"`
	// Confidence indicates how confident the detector is about the result (0.0-1.0)
	Confidence float64 `json:"confidence,omitempty"`
	// Language represents the detected language (if applicable)
	Language string `json:"language,omitempty"`
}
Result represents the character encoding detection result.
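Because every field carries an `omitempty` tag, zero-valued fields disappear from the JSON output. The sketch below redeclares `Result` locally so it compiles on its own; `encode` is a hypothetical helper, and the field values are illustrative rather than real detector output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result is a local copy of the documented struct, including its JSON tags.
type Result struct {
	Encoding   string  `json:"encoding,omitempty"`
	Confidence float64 `json:"confidence,omitempty"`
	Language   string  `json:"language,omitempty"`
}

// encode marshals r and returns the JSON text.
func encode(r Result) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Language is empty, so it is omitted from the output.
	full, _ := encode(Result{Encoding: "utf-8", Confidence: 0.99})
	fmt.Println(full) // {"encoding":"utf-8","confidence":0.99}

	// A zero-valued Result serializes to an empty object.
	empty, _ := encode(Result{})
	fmt.Println(empty) // {}
}
```

One consequence is that a confidence of exactly 0.0 cannot be distinguished from a missing confidence once the result has been serialized.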
type UniversalDetector
type UniversalDetector struct {
	// MinimumThreshold is the minimum confidence threshold for detection
	MinimumThreshold float64
	// IsoWinMap maps ISO encodings to Windows encodings
	IsoWinMap map[string]string
	// contains filtered or unexported fields
}
UniversalDetector implements universal character encoding detection.
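One way the two exported fields could interact is sketched below. It assumes they play the same role as the similarly named constants in Python's chardet: a candidate is rejected when its confidence does not exceed the threshold, and an ISO-8859 name is swapped for its Windows counterpart when bytes in 0x80-0x9F were seen. The helper `finalize`, the threshold value, the map entries, and the strictness of the comparison are all assumptions, not the package's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// minimumThreshold and isoWinMap are illustrative stand-ins for the
// MinimumThreshold and IsoWinMap fields; the real defaults are not shown.
const minimumThreshold = 0.20

var isoWinMap = map[string]string{
	"iso-8859-1": "Windows-1252",
	"iso-8859-2": "Windows-1250",
}

// finalize is a hypothetical helper. It returns ok=false when confidence does
// not exceed the threshold; otherwise it returns the encoding name, remapped
// to a Windows code page if Windows-range bytes were seen and a mapping exists.
func finalize(encoding string, confidence float64, sawWinBytes bool) (name string, ok bool) {
	if confidence <= minimumThreshold {
		return "", false
	}
	if sawWinBytes {
		if win, found := isoWinMap[strings.ToLower(encoding)]; found {
			return win, true
		}
	}
	return encoding, true
}

func main() {
	fmt.Println(finalize("ISO-8859-1", 0.73, true))  // remapped
	fmt.Println(finalize("ISO-8859-1", 0.73, false)) // kept as ISO
	fmt.Println(finalize("ISO-8859-1", 0.10, true))  // rejected: below threshold
}
```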
func NewUniversalDetector
func NewUniversalDetector(filter consts.LangFilter) *UniversalDetector
NewUniversalDetector creates a new UniversalDetector instance with the specified language filter.
func (*UniversalDetector) Feed
func (u *UniversalDetector) Feed(buf []byte)
Feed processes a chunk of bytes for character encoding detection. It analyzes the input data and updates the detector's internal state accordingly.
func (*UniversalDetector) GetResult
func (u *UniversalDetector) GetResult() Result
GetResult returns the final character encoding detection result. If detection is not yet complete, it finalizes the detection process first.
func (*UniversalDetector) Reset
func (u *UniversalDetector) Reset()
Reset resets the detector to its initial state.
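The three methods suggest a reset, feed, get-result cycle for streaming input. The sketch below shows that cycle against a local `detector` interface that mirrors the documented method set. The package's import path and the available `consts.LangFilter` values are not given on this page, so the real constructor is not called. `asciiOnly` is a toy stand-in used only to make the example runnable; it is not the package's algorithm, and `detectChunks` is a hypothetical helper.

```go
package main

import "fmt"

// Result mirrors the documented struct (JSON tags omitted for brevity).
type Result struct {
	Encoding   string
	Confidence float64
	Language   string
}

// detector captures the three methods documented on *UniversalDetector.
type detector interface {
	Feed(buf []byte)
	GetResult() Result
	Reset()
}

// detectChunks resets d, feeds every chunk in order, and returns the result.
func detectChunks(d detector, chunks [][]byte) Result {
	d.Reset()
	for _, c := range chunks {
		d.Feed(c)
	}
	return d.GetResult()
}

// asciiOnly is a toy detector: it reports "ascii" unless it has seen a
// byte >= 0x80, in which case it returns an empty Result.
type asciiOnly struct{ sawHigh bool }

func (a *asciiOnly) Feed(buf []byte) {
	for _, b := range buf {
		if b >= 0x80 {
			a.sawHigh = true
		}
	}
}

func (a *asciiOnly) GetResult() Result {
	if a.sawHigh {
		return Result{}
	}
	return Result{Encoding: "ascii", Confidence: 1.0}
}

func (a *asciiOnly) Reset() { a.sawHigh = false }

func main() {
	d := &asciiOnly{}
	r := detectChunks(d, [][]byte{[]byte("hello, "), []byte("world")})
	fmt.Printf("%q %.2f\n", r.Encoding, r.Confidence)

	// Reusing the same detector is safe because detectChunks resets it first.
	r = detectChunks(d, [][]byte{{0xE9}})
	fmt.Printf("%q %.2f\n", r.Encoding, r.Confidence)
}
```

Since `*UniversalDetector` has the same three methods with the same signatures, it should satisfy an interface like this one, provided the local `Result` is replaced by the package's own type.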