mbcs

v0.4.2
Published: Apr 30, 2023 License: MIT Imports: 14 Imported by: 11

README

go-windows-mbcs

Converts between UTF-8 and non-UTF-8 (ANSI) character encodings using the Windows APIs MultiByteToWideChar and WideCharToMultiByte.
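
Both Windows APIs named above exchange text as UTF-16 ("wide char"): MultiByteToWideChar decodes code-page bytes into UTF-16 units, and WideCharToMultiByte encodes UTF-16 units back into bytes. The following is only a portable sketch of that intermediate form using the standard unicode/utf16 package. It does not call the Windows APIs and is not this package's implementation; the helper names toWide and fromWide are hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// toWide converts a UTF-8 Go string into UTF-16 code units,
// the "wide character" form exchanged with the Windows APIs.
func toWide(s string) []uint16 {
	return utf16.Encode([]rune(s))
}

// fromWide converts UTF-16 code units back into a UTF-8 Go string.
func fromWide(w []uint16) string {
	return string(utf16.Decode(w))
}

func main() {
	w := toWide("寿限無")
	// Print the number of UTF-16 units and the round-tripped text.
	fmt.Println(len(w), fromWide(w))
}
```

Characters outside the Basic Multilingual Plane occupy two UTF-16 units (a surrogate pair), so the unit count is not always the character count.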

Convert from ANSI-bytes to UTF8-strings

package main

import (
    "bufio"
    "fmt"
    "os"

    "github.com/nyaosorg/go-windows-mbcs"
)

func main() {
    sc := bufio.NewScanner(os.Stdin)
    for sc.Scan() {
        text, err := mbcs.AnsiToUtf8(sc.Bytes(), mbcs.ACP)
        if err != nil {
            fmt.Fprintln(os.Stderr, err.Error())
            os.Exit(1)
        }
        fmt.Println(text)
    }
    if err := sc.Err(); err != nil {
        fmt.Fprintln(os.Stderr, err.Error())
        os.Exit(1)
    }
}

mbcs.ACP stands for the currently active code page.

On Windows
$ chcp 932
$ go run examples\AnsiToUtf8.go < testdata\jugemu-cp932.txt | nkf32 --guess
UTF-8 (LF)
On Linux
$ env LC_ALL=ja_JP.Shift_JIS go run examples/AnsiToUtf8.go < testdata/jugemu-cp932.txt | nkf --guess
UTF-8 (LF)

On operating systems other than Windows, the current encoding is determined from $LC_ALL and $LANG.
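
As a rough illustration of how an encoding name can be read out of those variables, here is a minimal sketch. It assumes the usual POSIX locale form language_TERRITORY.codeset@modifier and assumes that LC_ALL takes precedence over LANG. The functions charsetFromLocale and currentCharset are hypothetical names, and the package's own lookup may differ.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// charsetFromLocale extracts the codeset part of a POSIX locale name
// such as "ja_JP.Shift_JIS" or "de_DE.ISO-8859-15@euro".
// It returns "" when the name has no codeset part (for example "C").
func charsetFromLocale(locale string) string {
	_, rest, found := strings.Cut(locale, ".")
	if !found {
		return ""
	}
	// Drop an optional "@modifier" suffix.
	charset, _, _ := strings.Cut(rest, "@")
	return charset
}

// currentCharset reads the first non-empty variable among LC_ALL and LANG
// (assumed precedence) and returns its codeset part.
func currentCharset(getenv func(string) string) string {
	for _, name := range []string{"LC_ALL", "LANG"} {
		if value := getenv(name); value != "" {
			return charsetFromLocale(value)
		}
	}
	return ""
}

func main() {
	fmt.Println(currentCharset(os.Getenv))
}
```

Passing the lookup function as a parameter keeps the sketch testable without touching the real environment.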

Convert from UTF8-strings to ANSI-bytes

package main

import (
    "bufio"
    "fmt"
    "os"

    "github.com/nyaosorg/go-windows-mbcs"
)

func main() {
    sc := bufio.NewScanner(os.Stdin)
    for sc.Scan() {
        bytes, err := mbcs.Utf8ToAnsi(sc.Text(), mbcs.ACP)
        if err != nil {
            fmt.Fprintln(os.Stderr, err.Error())
            os.Exit(1)
        }
        os.Stdout.Write(bytes)
        os.Stdout.Write([]byte{'\n'})
    }
    if err := sc.Err(); err != nil {
        fmt.Fprintln(os.Stderr, err.Error())
        os.Exit(1)
    }
}
On Windows
$ chcp 932
$ go run examples\Utf8ToAnsi.go < testdata\jugemu-utf8.txt | nkf32 --guess
Shift_JIS (LF)
On Linux
$ env LC_ALL=ja_JP.Shift_JIS go run examples/Utf8ToAnsi.go < testdata/jugemu-utf8.txt | nkf --guess
Shift_JIS (LF)

Use golang.org/x/text/transform

Convert from ANSI-reader to UTF8-reader
package main

import (
    "bufio"
    "fmt"
    "os"

    "golang.org/x/text/transform"

    "github.com/nyaosorg/go-windows-mbcs"
)

func main() {
    sc := bufio.NewScanner(transform.NewReader(os.Stdin, mbcs.NewDecoder(mbcs.ACP)))
    for sc.Scan() {
        fmt.Println(sc.Text())
    }
    if err := sc.Err(); err != nil {
        fmt.Fprintln(os.Stderr, err.Error())
        os.Exit(1)
    }
}
On Windows
$ chcp 932
$ go run examples\NewDecoder.go < testdata\jugemu-cp932.txt | nkf32 --guess
UTF-8 (LF)
On Linux
$ env LC_ALL=ja_JP.Shift_JIS go run examples/NewDecoder.go < testdata/jugemu-cp932.txt | nkf --guess
UTF-8 (LF)
Convert from UTF8-reader to ANSI-reader
package main

import (
    "bufio"
    "fmt"
    "os"

    "golang.org/x/text/transform"

    "github.com/nyaosorg/go-windows-mbcs"
)

func main() {
    sc := bufio.NewScanner(transform.NewReader(os.Stdin, mbcs.NewEncoder(mbcs.ACP)))
    for sc.Scan() {
        os.Stdout.Write(sc.Bytes())
        os.Stdout.Write([]byte{'\n'})
    }
    if err := sc.Err(); err != nil {
        fmt.Fprintln(os.Stderr, err.Error())
        os.Exit(1)
    }
}
On Windows
$ chcp 932
$ go run examples\NewEncoder.go < testdata\jugemu-utf8.txt | nkf32 --guess
Shift_JIS (LF)
On Linux
$ env LC_ALL=ja_JP.Shift_JIS go run examples/NewEncoder.go < testdata/jugemu-utf8.txt | nkf --guess
Shift_JIS (LF)

Documentation

Constants

const ACP = 0

ACP is the constant for the active code page of the OS.

const CP_ACP = 0

Code generated by go-importconst DO NOT EDIT.

const CP_MACCP = 2
const CP_OEMCP = 1
const CP_SYMBOL = 42
const CP_UTF7 = 65000
const CP_UTF8 = 65001
const MB_COMPOSITE = 2
const MB_ERR_INVALID_CHARS = 8
const MB_PRECOMPOSED = 1
const MB_USEGLYPHCHARS = 4
const THREAD_ACP = 3

THREAD_ACP is the constant for the active code page of the current thread.

const WC_COMPOSITECHECK = 512
const WC_ERR_INVALID_CHARS = 128
const WC_NO_BEST_FIT_CHARS = 1024

Variables

var ErrUnsupportedOs = errors.New("Unsupported OS")

ErrUnsupportedOs is the error returned when AtoU or UtoA is called on an OS other than Windows.
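
A caller may want to treat this error differently from a genuine conversion failure. The sketch below shows one way to do that with errors.Is. To stay self-contained it uses a local stand-in, errUnsupportedOs, in place of mbcs.ErrUnsupportedOs, and the helper toUTF8OrRaw and the converter type are hypothetical. Falling back to the raw bytes is an illustrative policy, not something the package prescribes.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupportedOs is a local stand-in for mbcs.ErrUnsupportedOs so that
// this sketch compiles without the package.
var errUnsupportedOs = errors.New("Unsupported OS")

// converter is any function that turns ANSI bytes into a UTF-8 string.
type converter func(ansi []byte) (string, error)

// toUTF8OrRaw (hypothetical helper) calls conv and, when the conversion is
// unavailable on this OS, returns the bytes unchanged as a string.
// Any other result, including other errors, is passed through.
func toUTF8OrRaw(conv converter, ansi []byte) (string, error) {
	text, err := conv(ansi)
	if errors.Is(err, errUnsupportedOs) {
		return string(ansi), nil
	}
	return text, err
}

func main() {
	// unsupported simulates a conversion call on a non-Windows system.
	unsupported := func([]byte) (string, error) {
		return "", fmt.Errorf("convert: %w", errUnsupportedOs)
	}
	text, err := toUTF8OrRaw(unsupported, []byte("hello"))
	fmt.Println(text, err)
}
```

errors.Is is used instead of == so that the check still matches when the error has been wrapped with %w.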

Functions

func AnsiToUtf8 added in v0.2.0

func AnsiToUtf8(ansi []byte, codepage uintptr) (utf8 string, err error)

AnsiToUtf8 converts ANSI bytes to a UTF-8 string.

Example

ExampleAnsiToUtf8 reads ANSI text from STDIN and writes it to STDOUT as UTF-8.

sc := bufio.NewScanner(os.Stdin)
for sc.Scan() {
	text, err := mbcs.AnsiToUtf8(sc.Bytes(), mbcs.ACP)
	if err != nil {
		fmt.Fprintln(os.Stderr, err.Error())
		os.Exit(1)
	}
	fmt.Println(text)
}

func AtoU deprecated

func AtoU(ansi []byte, codepage uintptr) (string, error)

Deprecated: use AnsiToUtf8

func ConsoleCP

func ConsoleCP() uintptr

ConsoleCP returns the code page number of the console.

func NewAutoDecoder added in v0.4.0

func NewAutoDecoder(cp uintptr) transform.Transformer

NewAutoDecoder returns a transform.Transformer that converts text whose encoding is not known in advance (either ANSI or UTF-8) to UTF-8.
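
The documentation does not say how the encoding is detected. One common approach, shown here purely as an assumption, is to keep the input when it is already valid UTF-8 and to decode it with the fallback code page otherwise. In this sketch autoDecode and decodeLatin1 are hypothetical helpers, and ISO-8859-1 stands in for the ANSI code page so that only the standard library is needed.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeLatin1 is a stand-in "ANSI" decoder for this sketch: ISO-8859-1
// maps every byte to the code point with the same value.
func decodeLatin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

// autoDecode returns b unchanged when it is already valid UTF-8 and
// otherwise decodes it with the given fallback decoder.
func autoDecode(b []byte, fallback func([]byte) string) string {
	if utf8.Valid(b) {
		return string(b)
	}
	return fallback(b)
}

func main() {
	fmt.Println(autoDecode([]byte("caf\xc3\xa9"), decodeLatin1)) // UTF-8 input
	fmt.Println(autoDecode([]byte("caf\xe9"), decodeLatin1))     // Latin-1 input
}
```

A validity check like this can misjudge ANSI text that happens to form valid UTF-8 byte sequences, which is one reason any detection of this kind is a guess.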

func NewDecoder added in v0.4.0

func NewDecoder(cp uintptr) transform.Transformer

func NewEncoder added in v0.4.0

func NewEncoder(cp uintptr) transform.Transformer

func Utf8ToAnsi added in v0.2.0

func Utf8ToAnsi(utf8 string, codepage uintptr) (ansi []byte, err error)

Utf8ToAnsi converts a UTF-8 string to ANSI bytes.

Example

ExampleUtf8ToAnsi reads UTF-8 text from STDIN and writes it to STDOUT as ANSI.

sc := bufio.NewScanner(os.Stdin)
for sc.Scan() {
	bytes, err := mbcs.Utf8ToAnsi(sc.Text(), mbcs.ACP)
	if err != nil {
		fmt.Fprintln(os.Stderr, err.Error())
		os.Exit(1)
	}
	os.Stdout.Write(bytes)
	os.Stdout.Write([]byte{'\n'})
}

func UtoA deprecated

func UtoA(utf8 string, codepage uintptr) (ansi []byte, err error)

Deprecated: use Utf8ToAnsi

Types

type AutoDecoder added in v0.3.0

type AutoDecoder struct {
	CP uintptr
}

AutoDecoder is an implementation of transform.Transformer that converts text whose encoding is not known in advance (either ANSI or UTF-8) to UTF-8.

func (AutoDecoder) Reset added in v0.3.0

func (f AutoDecoder) Reset()

Reset does nothing in AutoDecoder

func (AutoDecoder) Transform added in v0.3.0

func (f AutoDecoder) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error)

Transform converts the text in src, whose encoding is not known in advance (either ANSI or UTF-8), to a UTF-8 string and stores it in dst.

type Filter

type Filter struct {
	// contains filtered or unexported fields
}

Filter is a type like bufio.Scanner, but on Windows it detects the encoding and converts the text to UTF-8. On other operating systems it behaves like bufio.Scanner.

func NewFilter

func NewFilter(r io.Reader, codepage uintptr) *Filter

NewFilter is the constructor for Filter

func (*Filter) Err

func (f *Filter) Err() error

Err is like bufio.Scanner.Err for Filter

func (*Filter) ForceGuessAlways

func (f *Filter) ForceGuessAlways()

ForceGuessAlways should be called when the encoding has to be guessed again for every line.
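
The difference between guessing once and guessing per line can be sketched as follows. This is not the package's Filter: guessingScanner, latin1 and decodeLines are hypothetical, ISO-8859-1 stands in for the ANSI code page, and the "sticky" default (once a non-UTF-8 line is seen, every later line is treated as ANSI) is an assumption about what not calling ForceGuessAlways could mean.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"unicode/utf8"
)

// guessingScanner is a hypothetical line scanner, not the package's Filter.
type guessingScanner struct {
	sc          *bufio.Scanner
	fallback    func([]byte) string // decoder for lines judged non-UTF-8
	guessAlways bool                // re-guess the encoding on every line
	nonUTF8     bool                // sticky verdict when guessAlways is false
	text        string
}

// Scan reads the next line and stores it as UTF-8 text.
func (g *guessingScanner) Scan() bool {
	if !g.sc.Scan() {
		return false
	}
	line := g.sc.Bytes()
	if g.guessAlways {
		g.nonUTF8 = !utf8.Valid(line)
	} else if !g.nonUTF8 && !utf8.Valid(line) {
		g.nonUTF8 = true // the verdict sticks for the remaining lines
	}
	if g.nonUTF8 {
		g.text = g.fallback(line)
	} else {
		g.text = string(line)
	}
	return true
}

// Text returns the most recently scanned line.
func (g *guessingScanner) Text() string { return g.text }

// latin1 decodes ISO-8859-1 bytes, standing in for an ANSI code page.
func latin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

// decodeLines runs the scanner over input and collects the decoded lines.
func decodeLines(input string, guessAlways bool) []string {
	g := &guessingScanner{
		sc:          bufio.NewScanner(strings.NewReader(input)),
		fallback:    latin1,
		guessAlways: guessAlways,
	}
	var lines []string
	for g.Scan() {
		lines = append(lines, g.Text())
	}
	return lines
}

func main() {
	// First line is Latin-1, second line is UTF-8.
	input := "caf\xe9\ncaf\xc3\xa9\n"
	fmt.Printf("%q\n", decodeLines(input, true))
	fmt.Printf("%q\n", decodeLines(input, false))
}
```

With per-line guessing both lines decode correctly. With the sticky verdict the second, UTF-8 line is decoded as Latin-1 and comes out garbled, which is the mixed-encoding case where per-line guessing helps.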

func (*Filter) Scan

func (f *Filter) Scan() bool

Scan is like bufio.Scanner.Scan for Filter

func (*Filter) Text

func (f *Filter) Text() string

Text is like bufio.Scanner.Text for Filter
