charset

package
v0.0.0-...-d539043
Published: Jul 15, 2015 License: ISC Imports: 11 Imported by: 0

Documentation

Overview

The charset package implements translation between character sets. It uses Unicode as the intermediate representation. Because it can be large, the character set data is separated from the charset package. It can be embedded in the Go executable by importing the data package:

import _ "code.google.com/p/go-charset/data"

It can also be made available in a data directory (by setting CharsetDir).

Variables

var CharsetDir = "/usr/local/lib/go-charset/datafiles"

CharsetDir gives the location of the default data file directory. This directory will be used for files with names that have not been registered with RegisterDataFile.

Functions

func Names

func Names() []string

Names returns the canonical names of all supported character sets, in alphabetical order.

func NewReader

func NewReader(charset string, r io.Reader) (io.Reader, error)

NewReader returns a new Reader that translates from the named character set to UTF-8 as it reads r.

Example
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"strings"

	"github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/charset"
	// Imported for its side effect: it embeds the character set data.
	_ "github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/data"
)

func main() {
	// The input is Latin-1: byte 0xa3 is '£' and byte 0xe9 is 'é'.
	r, err := charset.NewReader("latin1", strings.NewReader("\xa35 for Pepp\xe9"))
	if err != nil {
		log.Fatal(err)
	}
	// Reading from r yields the same text encoded as UTF-8.
	result, err := ioutil.ReadAll(r)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", result)
}
Output:

£5 for Peppé

func NewTranslatingReader

func NewTranslatingReader(r io.Reader, tr Translator) io.Reader

NewTranslatingReader returns a new Reader that translates data using the given Translator as it reads r.
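As a rough illustration of what a translating reader has to do, the sketch below wraps an io.Reader around a Translator. It is not the package's implementation: the Translator interface is redeclared locally so the program compiles on its own, and translatingReader, latin1Decoder, and decodeLatin1 are hypothetical names introduced here.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Translator mirrors the interface documented in this package.
type Translator interface {
	Translate(data []byte, eof bool) (n int, cdata []byte, err error)
}

// latin1Decoder is a hypothetical ISO-8859-1 to UTF-8 translator.
type latin1Decoder struct{ buf []byte }

func (d *latin1Decoder) Translate(data []byte, eof bool) (int, []byte, error) {
	d.buf = d.buf[:0]
	for _, c := range data {
		// Each Latin-1 byte is the Unicode code point of the same value.
		d.buf = append(d.buf, string(rune(c))...)
	}
	return len(data), d.buf, nil
}

// translatingReader is a hypothetical reader that feeds r through tr.
type translatingReader struct {
	r       io.Reader
	tr      Translator
	pending []byte // input read from r but not yet consumed by tr
	out     []byte // translated bytes not yet handed to the caller
	eof     bool
}

func (t *translatingReader) Read(p []byte) (int, error) {
	for len(t.out) == 0 {
		if t.eof {
			return 0, io.EOF
		}
		var chunk [512]byte
		n, err := t.r.Read(chunk[:])
		t.pending = append(t.pending, chunk[:n]...)
		if err == io.EOF {
			t.eof = true
		} else if err != nil {
			return 0, err
		}
		used, cdata, err := t.tr.Translate(t.pending, t.eof)
		if err != nil {
			return 0, err
		}
		// Copy cdata: the translator may overwrite it on the next call.
		t.out = append(t.out[:0], cdata...)
		// Keep whatever input the translator did not consume.
		t.pending = append(t.pending[:0], t.pending[used:]...)
	}
	n := copy(p, t.out)
	t.out = t.out[n:]
	return n, nil
}

// decodeLatin1 reads all of s through the sketch reader.
func decodeLatin1(s string) (string, error) {
	r := &translatingReader{r: strings.NewReader(s), tr: &latin1Decoder{}}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	s, err := decodeLatin1("\xa35 for Pepp\xe9")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The two buffers reflect the Translate contract: unconsumed input has to be offered again on the next call, and the returned slice has to be copied before the translator is called again.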

func NewTranslatingWriter

func NewTranslatingWriter(w io.Writer, tr Translator) io.WriteCloser

NewTranslatingWriter returns a new WriteCloser writing to w. It passes the written bytes through the given Translator.

func NewWriter

func NewWriter(charset string, w io.Writer) (io.WriteCloser, error)

NewWriter returns a new WriteCloser writing to w. It converts writes of UTF-8 text into writes on w of text in the named character set. The Close is necessary to flush any remaining partially translated characters to the output.

Example
package main

import (
	"bytes"
	"fmt"
	"log"

	"github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/charset"
	// Imported for its side effect: it embeds the character set data.
	_ "github.com/mjibson/goread/_third_party/code.google.com/p/go-charset/data"
)

func main() {
	buf := new(bytes.Buffer)
	w, err := charset.NewWriter("latin1", buf)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Fprintf(w, "£5 for Peppé")
	// Close flushes any partially translated characters to buf.
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%q\n", buf.Bytes())
}
Output:

"\xa35 for Pepp\xe9"
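The reason Close matters can be seen in a small sketch of a translating writer. This is an illustration under stated assumptions, not the package's code: the Translator interface is redeclared locally, latin1Encoder, translatingWriter, and encodeChunks are hypothetical names, and substituting '?' for unrepresentable or malformed input is a choice made for this sketch only.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"unicode/utf8"
)

// Translator mirrors the interface documented in this package.
type Translator interface {
	Translate(data []byte, eof bool) (n int, cdata []byte, err error)
}

// latin1Encoder is a hypothetical UTF-8 to ISO-8859-1 translator.
type latin1Encoder struct{ buf []byte }

func (e *latin1Encoder) Translate(data []byte, eof bool) (int, []byte, error) {
	e.buf = e.buf[:0]
	n := 0
	for n < len(data) {
		if !eof && !utf8.FullRune(data[n:]) {
			break // incomplete UTF-8 sequence: leave it for the next call
		}
		r, size := utf8.DecodeRune(data[n:])
		if r > 0xff {
			r = '?' // not representable in Latin-1, or malformed input
		}
		e.buf = append(e.buf, byte(r))
		n += size
	}
	return n, e.buf, nil
}

// translatingWriter is a hypothetical writer that passes data through tr.
type translatingWriter struct {
	w       io.Writer
	tr      Translator
	pending []byte // input not yet consumed by tr
}

func (t *translatingWriter) Write(p []byte) (int, error) {
	t.pending = append(t.pending, p...)
	n, out, err := t.tr.Translate(t.pending, false)
	if err != nil {
		return 0, err
	}
	t.pending = append(t.pending[:0], t.pending[n:]...)
	if _, err := t.w.Write(out); err != nil {
		return 0, err
	}
	return len(p), nil
}

// Close translates the leftover input with eof set and writes the result.
func (t *translatingWriter) Close() error {
	_, out, err := t.tr.Translate(t.pending, true)
	t.pending = nil
	if err != nil {
		return err
	}
	_, err = t.w.Write(out)
	return err
}

// encodeChunks writes each chunk separately, closes, and returns the output.
func encodeChunks(chunks ...string) string {
	var buf bytes.Buffer
	w := &translatingWriter{w: &buf, tr: &latin1Encoder{}}
	for _, c := range chunks {
		if _, err := w.Write([]byte(c)); err != nil {
			panic(err)
		}
	}
	if err := w.Close(); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	// 'é' is 0xc3 0xa9 in UTF-8; here it is split across two writes.
	fmt.Printf("%q\n", encodeChunks("Pepp\xc3", "\xa9"))
}
```

When a multi-byte character is split across writes, the first half stays in the pending buffer until the rest arrives. If the input ends there, only Close can tell the translator that no more bytes are coming.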

func NormalizedName

func NormalizedName(s string) string

NormalizedName returns s with all Roman capitals mapped to lower case, and '_' mapped to '-'.
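The mapping described above can be sketched in a few lines. The function normalizeName below is a hypothetical stand-in written for illustration, and it assumes that "Roman capitals" means the ASCII letters A to Z.

```go
package main

import "fmt"

// normalizeName is a hypothetical stand-in for the documented behavior:
// ASCII capitals become lower case and '_' becomes '-'. All other bytes
// pass through unchanged.
func normalizeName(s string) string {
	b := []byte(s)
	for i, c := range b {
		switch {
		case c >= 'A' && c <= 'Z':
			b[i] = c + ('a' - 'A')
		case c == '_':
			b[i] = '-'
		}
	}
	return string(b)
}

func main() {
	fmt.Println(normalizeName("ISO_8859-1"))
	fmt.Println(normalizeName("Latin1"))
}
```

Normalizing names this way lets spellings such as "ISO_8859-1" and "iso-8859-1" refer to the same character set.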

func Register

func Register(factory Factory)

Register registers a new Factory which will be consulted when NewReader or NewWriter needs a character set translator for a given name.

func RegisterDataFile

func RegisterDataFile(name string, open func() (io.ReadCloser, error))

RegisterDataFile registers the existence of a given data file with the given name that may be used by a character-set converter. It is intended to be used by packages that wish to embed data in the executable binary, and should not be used normally.

Types

type Charset

type Charset struct {
	Name    string   // Canonical name of character set.
	Aliases []string // Known aliases.
	Desc    string   // Description.
	NoFrom  bool     // Not possible to translate from this charset.
	NoTo    bool     // Not possible to translate to this charset.
}

Charset holds information about a given character set.

func Info

func Info(name string) *Charset

Info returns information about a character set, or nil if the character set is not found.

type Factory

type Factory interface {
	// TranslatorFrom creates a translator that will translate from the named character
	// set to UTF-8.
	TranslatorFrom(name string) (Translator, error)

	// TranslatorTo creates a translator that will translate from UTF-8 to the named character set.
	TranslatorTo(name string) (Translator, error)

	// Names returns all the character set names accessible through the factory.
	Names() []string

	// Info returns information on the named character set. It returns nil if the
	// factory doesn't recognise the given name.
	Info(name string) *Charset
}

A Factory can be used to make character set translators.
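A minimal sketch of a Factory implementation follows. To keep it self-contained, the Charset, Translator, and Factory types are redeclared locally to mirror the ones documented here; utf8Factory and passthrough are hypothetical names, and the factory knows only a single, illustrative character set.

```go
package main

import "fmt"

// The three types below mirror the ones documented in this package.
type Charset struct {
	Name    string
	Aliases []string
	Desc    string
	NoFrom  bool
	NoTo    bool
}

type Translator interface {
	Translate(data []byte, eof bool) (n int, cdata []byte, err error)
}

type Factory interface {
	TranslatorFrom(name string) (Translator, error)
	TranslatorTo(name string) (Translator, error)
	Names() []string
	Info(name string) *Charset
}

// passthrough is a hypothetical Translator that returns its input unchanged.
type passthrough struct{}

func (passthrough) Translate(data []byte, eof bool) (int, []byte, error) {
	return len(data), data, nil
}

// utf8Factory is a hypothetical Factory that knows only "utf-8".
type utf8Factory struct{}

func (utf8Factory) lookup(name string) (Translator, error) {
	if name != "utf-8" {
		return nil, fmt.Errorf("unknown character set %q", name)
	}
	return passthrough{}, nil
}

func (f utf8Factory) TranslatorFrom(name string) (Translator, error) { return f.lookup(name) }

func (f utf8Factory) TranslatorTo(name string) (Translator, error) { return f.lookup(name) }

func (utf8Factory) Names() []string { return []string{"utf-8"} }

// Info returns nil for names the factory does not recognise.
func (utf8Factory) Info(name string) *Charset {
	if name != "utf-8" {
		return nil
	}
	return &Charset{Name: "utf-8", Desc: "UTF-8 passthrough (illustrative)"}
}

func main() {
	var f Factory = utf8Factory{}
	fmt.Println(f.Names())
	if _, err := f.TranslatorFrom("latin1"); err != nil {
		fmt.Println("error:", err)
	}
}
```

With the real package, a value like this would be passed to Register so that NewReader and NewWriter can consult it.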

type Translator

type Translator interface {
	Translate(data []byte, eof bool) (n int, cdata []byte, err error)
}

Translator represents a character set converter. The Translate method translates the given data, and returns the number of bytes of data consumed, a slice containing the converted data (which may be overwritten on the next call to Translate), and any conversion error. If eof is true, the data represents the final bytes of the input.
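To make the Translate contract concrete, here is a sketch of a translator from ISO-8859-1 to UTF-8. It is illustrative only: the interface is redeclared locally so the program compiles without the charset package, and latin1Decoder is a hypothetical type, not the converter the package ships.

```go
package main

import "fmt"

// Translator mirrors the interface documented above.
type Translator interface {
	Translate(data []byte, eof bool) (n int, cdata []byte, err error)
}

// latin1Decoder is a hypothetical ISO-8859-1 to UTF-8 translator.
type latin1Decoder struct {
	buf []byte // reused between calls, as the contract allows
}

func (d *latin1Decoder) Translate(data []byte, eof bool) (int, []byte, error) {
	d.buf = d.buf[:0]
	for _, c := range data {
		// Each Latin-1 byte is the Unicode code point of the same value,
		// so converting it to a rune and then to a string yields UTF-8.
		d.buf = append(d.buf, string(rune(c))...)
	}
	// Latin-1 is a single-byte encoding, so all input is always consumed.
	return len(data), d.buf, nil
}

func main() {
	var tr Translator = &latin1Decoder{}
	n, out, err := tr.Translate([]byte("\xa35 for Pepp\xe9"), true)
	if err != nil {
		panic(err)
	}
	fmt.Println(n, string(out))
}
```

Because the returned slice is reused, a caller that needs the converted data after the next call to Translate has to copy it first.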

func TranslatorFrom

func TranslatorFrom(charset string) (Translator, error)

TranslatorFrom returns a translator that will translate from the named character set to UTF-8.

func TranslatorTo

func TranslatorTo(charset string) (Translator, error)

TranslatorTo returns a translator that will translate from UTF-8 to the named character set.
