utf8reader

package

v0.0.5 Latest Latest Go to latest Published: Aug 28, 2015 License: MIT, BSD-2-Clause, BSD-3-Clause, + 3 more Imports: 4 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/linkernetworks/gotty

Links

Open Source Insights

README ¶

UTF8Reader for Go

UTF8Reader is a simple wrapper Reader that fills the given buffer with a "tail-safe" UTF8 byte sequence.

Tail-Safe?

Let's say you have a buffer of 7 bytes and your Reader is going to fill your buffer with a UTF8 byte sequence.

buf := make([]byte, 7)
reader := strings.NewReader("いろは")

reader.Read(buf)

The byte length of UTF8 characters is not fixed and some characters like the examples above have 3 byte length. There are others which have a single byte, 2 byte and 4 byte length as well. This means your buffer will be sometimes filled with incomplete bytes as an Unicode character at the tail.

By reader.Read(buf), your buf will be like below:

[]byte{
	// い
	byte(0xe3), // 1
	byte(0x81), // 2
	byte(0x84), // 3
	// ろ
	byte(0xe3), // 4
	byte(0x82), // 5
	byte(0x8d), // 6
	// は (incomplete)
	byte(0xe3), // 7
}

The last character は is incomplete and the buffer is now invalid as a UTF8 string.

UTF8Reader detects incomplete bytes like above and aborts filling up the buffer in such cases.

buf := make([]byte, 7)
reader := strings.NewReader("いろは")
utfReader := utf8reader.New(reader)

utfReader.Read(buf)

Then you will get:

[]byte{
	// い
	byte(0xe3), // 1
	byte(0x81), // 2
	byte(0x84), // 3
	// ろ
	byte(0xe3), // 4
	byte(0x82), // 5
	byte(0x8d), // 6
}

Of course, bytes left behind will be used to fill up the buffer on next Read().

Note

UTF8Reader just checks incomplete bytes at the tail of the buffer. Even if the original byte sequence given to UTF8Reader is broken, UTF8Reader reports no errors and just fills up the buffer.

License

The MIT License

Documentation ¶

Index ¶

Variables
type UTF8Reader
- func New(reader io.Reader) *UTF8Reader
- func (r *UTF8Reader) Read(p []byte) (n int, err error)

Constants ¶

This section is empty.

Variables ¶

View Source

var SmallBufferError = errors.New("Buffer size must be larger than utf8.UTFMax.")

Functions ¶

This section is empty.

Types ¶

type UTF8Reader ¶

type UTF8Reader struct {
	// contains filtered or unexported fields
}

func New ¶

func New(reader io.Reader) *UTF8Reader

func (*UTF8Reader) Read ¶

func (r *UTF8Reader) Read(p []byte) (n int, err error)

Source Files ¶

View all Source files

utf8reader.go

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL