gaga

v0.0.2
Published: Sep 21, 2020 License: MIT Imports: 3 Imported by: 1

README

go-gaga (Japanese language utility)

Installation

For using the library:

Linux:

$ go get github.com/y-bash/go-gaga

Windows:

>go get github.com/y-bash/go-gaga

Next, install the commands (if you want the binaries):

Linux:

$ cd $GOPATH/src/github.com/y-bash/go-gaga
$ make install

Windows:

>cd %GOPATH%\src\github.com\y-bash\go-gaga
>go install ./...

Usage

Library
Norm
import (
	"fmt"
	"log"

	"github.com/y-bash/go-gaga"
)

s := "GaGa is not がガガ"
fmt.Println(s)

// gaga.Fold == gaga.LatinToNarrow | gaga.KanaToWide
n, err := gaga.Norm(gaga.Fold)
if err != nil {
	log.Fatal(err)
}
fmt.Println(n.String(s))

// SetFlag switches the normalizer to the newly specified flags.
n.SetFlag(gaga.LatinToWide | gaga.AlphaToUpper | gaga.KanaToHiragana)
fmt.Println(n.String(s))

Output:

GaGa is not がガガ
GaGa is not がガガ
GAGA IS NOT ががが
Vert
import "github.com/y-bash/go-gaga"

s := gaga.Vert("閑さや\n岩にしみ入る\n蝉の声")
fmt.Print(s)

Output:

蝉岩閑
のにさ
声しや
  み  
  入  
  る
Commands
Norm

Linux:

$ echo "ABCアイウ" | norm
ABCアイウ

Windows:
(using the wecho command included in gaga)

>wecho ABCアイウ | norm
ABCアイウ
Vert

Linux:

$ echo -e "閑さや\n岩にしみ入る\n蝉の声" | vert
蝉岩閑
のにさ
声しや
  み  
  入  
  る

Windows:
(using the wecho command included in gaga)

>wecho 閑さや\n岩にしみ入る\n蝉の声 | vert
蝉岩閑
のにさ
声しや
  み
  入
  る
Norm & Vert

Linux:

$ echo -e "閑さや\n岩にしみ入る\n蝉の声" | norm -flag KanaToWideKatakana | vert
蝉岩閑
ノニサ
声シヤ
  ミ
  入
  ル

Windows:
(using the wecho command included in gaga)

>wecho 閑さや\n岩にしみ入る\n蝉の声 | norm -flag KanaToWideKatakana | vert
蝉岩閑
ノニサ
声シヤ
  ミ
  入
  ル

License

This software is released under the MIT License, see LICENSE.

Author

y-bash

Documentation

Overview

Package gaga implements simple functions to manipulate UTF-8 encoded Japanese strings.

Here is a simple example, converting the character type and printing it vertically.

First, import gaga.

import "github.com/y-bash/go-gaga"

Define a normalizer by calling Norm() with normalization flags. The call below creates a normalizer that converts Latin characters to half-width and Katakana characters to full-width Hiragana.

n, err := gaga.Norm(gaga.LatinToNarrow | gaga.KanaToHiragana)
if err != nil {
	log.Fatal(err)
}

Once the normalizer is defined, call its String method to normalize a string according to those flags.

s := n.String("GaGaはがガガではありません")
fmt.Println(s)

Output is:

GaGaはがががではありません
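
The Katakana-to-Hiragana part of this conversion can be pictured with a small standalone sketch. It relies only on the Unicode layout, in which the full-width Katakana U+30A1 to U+30F6 sit exactly 0x60 above their Hiragana counterparts. The helper katakanaToHiragana is hypothetical and not part of gaga; it ignores half-width Katakana and voicing marks, which the library also handles.

```go
package main

import "fmt"

// katakanaToHiragana is a hypothetical helper (not gaga's implementation).
// It shifts full-width Katakana letters down by 0x60 to reach Hiragana and
// leaves every other rune untouched.
func katakanaToHiragana(s string) string {
	out := []rune(s)
	for i, r := range out {
		if r >= 0x30A1 && r <= 0x30F6 {
			out[i] = r - 0x60
		}
	}
	return string(out)
}

func main() {
	fmt.Println(katakanaToHiragana("アイウエオ"))
}
```

Runes outside the Katakana range, such as Latin letters or Kanji, pass through unchanged.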

Using VertShrink(), lay this string out vertically within a matrix 3 columns wide and 5 rows high.

vs := gaga.VertShrink(s, 3, 5)
fmt.Print(vs[0])

Output is:

あが G
りが a
まが G
せで a
んはは

Constants

const (
	// LatinToNarrow is a combination of normalization flags for converting
	// all the full-width Latin characters to their half-width.
	//
	//          | CHARACTER     | CONVERT TO
	// ---------+---------------+----------------
	//          | Wide Alphabet | Narrow Alphabet
	// Category | Wide Digit    | Narrow Digit
	//          | Wide Symbol   | Narrow Symbol
	// ---------+---------------+----------------
	// Example  | "Ａ１？"      | "A1?"
	//
	LatinToNarrow = AlphaToNarrow | DigitToNarrow | SymbolToNarrow

	// LatinToWide is a combination of normalization flags for converting
	// all the half-width Latin characters to their full-width.
	//
	//          | CHARACTER       | CONVERT TO
	// ---------+-----------------+--------------
	//          | Narrow Alphabet | Wide Alphabet
	// Category | Narrow Digit    | Wide Digit
	//          | Narrow Symbol   | Wide Symbol
	// ---------+-----------------+--------------
	// Example  | "A1?"           | "Ａ１？"
	//
	LatinToWide = AlphaToWide | DigitToWide | SymbolToWide

	// KanaToNarrow is a combination of normalization flags for converting
	// the full-width Hiragana-Katakana characters to their half-width as
	// much as possible.
	//
	//          | CHARACTER                       | CONVERT TO
	// ---------+---------------------------------+-------------------
	//          | Hiragana                        | Narrow Katakana
	// Category | Wide Katakana                   | Narrow Katakana
	//          | Wide Kana Symbol                | Narrow Kana Symbol
	//          | Voiced/Semi-voiced Kana Letter  | Legacy composed
	//          | Isolated Voicing Modifier (VOM) | Narrow VOM
	// ---------+---------------------------------+-------------------
	// Example  | "あイ、が゛"                    | "ｱｲ､ｶﾞﾞ"
	//
	KanaToNarrow = HiraganaToNarrow | KatakanaToNarrow | KanaSymbolToNarrow |
		IsolatedVomToNarrow | ComposeVom

	// KanaToWide is a combination of normalization flags for converting
	// all the half-width Katakana characters to their full-width.
	//
	//          | CHARACTER                       | CONVERT TO
	// ---------+---------------------------------+-----------------
	//          | Narrow Katakana                 | Wide Katakana
	// Category | Narrow Kana Symbol              | Wide Kana Symbol
	//          | Voiced/Semi-voiced Kana Letter  | Legacy composed
	//          | Isolated Voicing Modifier (VOM) | Wide VOM
	// ---------+---------------------------------+-----------------
	// Example  | "ア、ガ゙"                         | "ア、ガ゛"
	//
	KanaToWide = KatakanaToWide | KanaSymbolToWide | IsolatedVomToWide |
		ComposeVom

	// KanaToWideKatakana is a combination of normalization flags for
	// converting all the half-width Katakana characters to their full-width,
	// and the Hiragana characters to their full-width Katakana as much as
	// possible.
	//
	//          | CHARACTER                       | CONVERT TO
	// ---------+---------------------------------+-----------------
	//          | Hiragana                        | Wide Katakana
	// Category | Narrow Katakana                 | Wide Katakana
	//          | Narrow Kana Symbol              | Wide Kana Symbol
	//          | Voiced/Semi-voiced Kana Letter  | Legacy composed
	//          | Isolated Voicing Modifier (VOM) | Wide VOM
	// ---------+---------------------------------+-----------------
	// Example  | "あイ、ガ゙"                       | "アイ、ガ゛"
	//
	KanaToWideKatakana = KatakanaToWide | HiraganaToKatakana | KanaSymbolToWide |
		IsolatedVomToWide | ComposeVom

	// KanaToNarrowKatakana is a combination of normalization flags for
	// converting the full-width Katakana characters to their half-width,
	// and the Hiragana characters to their half-width Katakana as much as
	// possible.
	//
	//          | CHARACTER                       | CONVERT TO
	// ---------+---------------------------------+-------------------
	//          | Hiragana                        | Narrow Katakana
	// Category | Wide Katakana                   | Narrow Katakana
	//          | Wide Kana Symbol                | Narrow Kana Symbol
	//          | Voiced/Semi-voiced Kana Letter  | Legacy composed
	//          | Isolated Voicing Modifier (VOM) | Narrow VOM
	// ---------+---------------------------------+-------------------
	// Example  | "あイ、が゛"                    | "ｱｲ､ｶﾞﾞ"
	//
	KanaToNarrowKatakana = KatakanaToNarrow | HiraganaToNarrow |
		KanaSymbolToNarrow | IsolatedVomToNarrow | ComposeVom

	// KanaToHiragana is a combination of normalization flags for
	// converting the full-width Katakana characters to their Hiragana
	// as much as possible, and all the half-width Katakana characters
	// to their Hiragana.
	//
	//          | CHARACTER                       | CONVERT TO
	// ---------+---------------------------------+----------------------
	//          | Wide Katakana                   | Hiragana
	// Category | Narrow Katakana                 | Hiragana
	//          | Narrow Kana Symbol              | Wide Kana Symbol
	//          | Voiced/Semi-voiced Kana Letter  | Legacy composed
	//          | Isolated Voicing Modifier (VOM) | Wide VOM
	// ---------+---------------------------------+----------------------
	// Example  | "アイ、ガ゛"                      | "あい、が゛"
	//
	KanaToHiragana = KatakanaToHiragana | KanaSymbolToWide |
		IsolatedVomToWide | ComposeVom

	// Fold is a combination of normalization flags for converting
	// the Latin characters and the Hiragana-Katakana characters to
	// their canonical width.
	//
	//          | CHARACTER                       | CONVERT TO
	// ---------+---------------------------------+-----------------
	//          | Wide Alphabet                   | Narrow Alphabet
	//          | Wide Digit                      | Narrow Digit
	//          | Wide Symbol                     | Narrow Symbol
	// Category | Narrow Katakana                 | Wide Katakana
	//          | Narrow Kana Symbol              | Wide Kana Symbol
	//          | Voiced/Semi-voiced Kana Letter  | Legacy composed
	//          | Isolated Voicing Modifier (VOM) | Wide VOM
	// ---------+---------------------------------+-----------------
	// Example  | "Ａ１？ｱ､ｶﾞﾞ"                   | "A1?ア、ガ゛"
	//
	Fold = LatinToNarrow | KanaToWide
)

Combination of normalization flags
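
As a rough illustration of what the Latin width flags describe, the sketch below converts between full-width and half-width ASCII using the fixed Unicode offset between U+FF01..U+FF5E and U+0021..U+007E. The helpers latinToNarrow and latinToWide are hypothetical names chosen for this sketch; they are not gaga functions, and they do not treat the space character or any Kana.

```go
package main

import "fmt"

// widthOffset is the distance between a full-width ASCII variant
// (U+FF01..U+FF5E) and its half-width counterpart (U+0021..U+007E).
const widthOffset = 0xFEE0

// latinToNarrow is a hypothetical helper: full-width ASCII to half-width.
func latinToNarrow(s string) string {
	out := []rune(s)
	for i, r := range out {
		if r >= 0xFF01 && r <= 0xFF5E {
			out[i] = r - widthOffset
		}
	}
	return string(out)
}

// latinToWide is the reverse hypothetical helper: half-width to full-width.
func latinToWide(s string) string {
	out := []rune(s)
	for i, r := range out {
		if r >= 0x21 && r <= 0x7E {
			out[i] = r + widthOffset
		}
	}
	return string(out)
}

func main() {
	fmt.Println(latinToNarrow("\uFF21\uFF11\uFF1F")) // full-width A, 1, ?
	fmt.Println(latinToWide("A1?"))
}
```

Because letters, digits, and symbols share the same offset, the library's separate Alpha, Digit, and Symbol flags can be thought of as restricting this mapping to sub-ranges.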

Functions

func Vert

func Vert(s string) string

Vert returns the vertical conversion of s. This function is equivalent to VertShrink(s, 40, 25).

Example
in := "閑さや\n岩にしみ入る\n蝉の声"
out := gaga.Vert(in)
fmt.Print(out)
Output:

蝉岩閑
のにさ
声しや
  み
  入
  る
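
To make the layout rule concrete, here is a minimal sketch of vertical conversion, assuming every character is full-width and no wrapping is needed. Each input line becomes one column, columns run from right to left, and shorter columns are padded with two spaces. The function vertical is a hypothetical stand-in written for this sketch, not gaga's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// vertical is a hypothetical sketch: each input line becomes a column,
// and the first line ends up as the rightmost column.
func vertical(s string) string {
	lines := strings.Split(s, "\n")
	cols := make([][]rune, len(lines))
	height := 0
	for i, l := range lines {
		cols[i] = []rune(l)
		if len(cols[i]) > height {
			height = len(cols[i])
		}
	}
	var b strings.Builder
	for row := 0; row < height; row++ {
		var line strings.Builder
		for c := len(cols) - 1; c >= 0; c-- {
			if row < len(cols[c]) {
				line.WriteRune(cols[c][row])
			} else {
				line.WriteString("  ") // pad a missing full-width cell
			}
		}
		// Drop padding at the end of the row, then terminate it.
		b.WriteString(strings.TrimRight(line.String(), " "))
		b.WriteByte('\n')
	}
	return b.String()
}

func main() {
	fmt.Print(vertical("閑さや\n岩にしみ入る\n蝉の声"))
}
```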

func VertFix

func VertFix(in string, w int, h int) []string

VertFix returns the vertical conversion of in. The result is wrapped so that no column is taller than h. If in contains half-width (narrow) characters, whitespace is added to the left of each of them. If the converted text fits in a matrix of width w and height h, the result is a string slice with one element; otherwise, the result is a string slice with multiple elements.

Example
s := "閑さや\n岩にしみ入る\n蝉の声"

ss := gaga.VertFix(s, 6, 6)
fmt.Println(" 1 2 3 4 5 6")
fmt.Print(ss[0])

ss = gaga.VertFix(s, 6, 3)
fmt.Println("\n 1 2 3 4 5 6")
fmt.Print(ss[0])

ss = gaga.VertFix(s, 3, 3)
fmt.Println("\n-Page1\n 1 2 3")
fmt.Print(ss[0])

fmt.Println("\n-Page2\n 1 2 3")
fmt.Print(ss[1])
Output:

 1 2 3 4 5 6
      蝉岩閑
      のにさ
      声しや
        み
        入
        る

 1 2 3 4 5 6
    蝉み岩閑
    の入にさ
    声るしや

-Page1
 1 2 3
み岩閑
入にさ
るしや

-Page2
 1 2 3
    蝉
    の
    声
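
The wrapping step visible in the second example (height 3 turns the six-character line into two columns) can be sketched as follows. This is only an assumption about the behavior inferred from the output above; wrapColumns is a hypothetical helper and not part of gaga.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapColumns is a hypothetical sketch: it splits each input line into
// columns of at most h runes, keeping the original reading order.
func wrapColumns(s string, h int) []string {
	if h <= 0 {
		return nil // a non-positive height cannot hold any text
	}
	var cols []string
	for _, line := range strings.Split(s, "\n") {
		rs := []rune(line)
		for len(rs) > h {
			cols = append(cols, string(rs[:h]))
			rs = rs[h:]
		}
		cols = append(cols, string(rs))
	}
	return cols
}

func main() {
	for i, c := range wrapColumns("閑さや\n岩にしみ入る\n蝉の声", 3) {
		fmt.Println(i+1, c)
	}
}
```

With h set to 3, the three input lines yield four columns, which matches the four-column layout shown in the example.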

func VertShrink

func VertShrink(in string, w, h int) []string

VertShrink returns the vertical conversion of in. If the text fits in the w by h matrix without wrapping, the result is laid out so that it fits in the smallest possible matrix.

Example
s := "閑さや\n岩にしみ入る\n蝉の声"

ss := gaga.VertShrink(s, 6, 6)
fmt.Println(" 1 2 3 4 5 6")
fmt.Print(ss[0])

ss = gaga.VertShrink(s, 6, 3)
fmt.Println("\n 1 2 3 4 5 6")
fmt.Print(ss[0])

ss = gaga.VertShrink(s, 3, 3)
fmt.Println("\n-Page1\n 1 2 3")
fmt.Print(ss[0])

fmt.Println("\n-Page2\n 1 2 3")
fmt.Print(ss[1])
Output:

 1 2 3 4 5 6
蝉岩閑
のにさ
声しや
  み
  入
  る

 1 2 3 4 5 6
蝉み岩閑
の入にさ
声るしや

-Page1
 1 2 3
み岩閑
入にさ
るしや

-Page2
 1 2 3
    蝉
    の
    声
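
The last example produces two pages because four wrapped columns do not fit in a width of 3. A minimal sketch of that paging step, assuming columns are simply grouped w at a time in reading order, might look like this. The function paginate is hypothetical, and the grouping rule is inferred from the output above rather than taken from the library's source.

```go
package main

import "fmt"

// paginate is a hypothetical sketch: it groups already wrapped columns
// into pages holding at most w columns each.
func paginate(cols []string, w int) [][]string {
	if w <= 0 {
		return nil // a non-positive width cannot hold any column
	}
	var pages [][]string
	for len(cols) > w {
		pages = append(pages, cols[:w])
		cols = cols[w:]
	}
	if len(cols) > 0 {
		pages = append(pages, cols)
	}
	return pages
}

func main() {
	cols := []string{"閑さや", "岩にし", "み入る", "蝉の声"}
	for i, p := range paginate(cols, 3) {
		fmt.Println("page", i+1, p)
	}
}
```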

Types

type NormFlag

type NormFlag int

NormFlag is the normalization rule used by Normalizer.

const (

	// AlphaToNarrow converts all the full-width Latin letters to
	// their half-width.
	// Example: [Ａ] => [A]
	AlphaToNarrow NormFlag

	// AlphaToWide converts all the half-width Latin letters to
	// their full-width.
	// Example: [A] => [Ａ]
	AlphaToWide

	// AlphaToUpper converts all the lower case Latin letters to
	// their upper case.
	// Examples: [a] => [A],  [ａ] => [Ａ]
	AlphaToUpper

	// AlphaToLower converts all the upper case Latin letters to
	// their lower case.
	// Examples: [A] => [a],  [Ａ] => [ａ]
	AlphaToLower

	// DigitToNarrow converts all the full-width Latin digits to
	// their half-width.
	// Example: [１] => [1]
	DigitToNarrow

	// DigitToWide converts all the half-width Latin digits to
	// their full-width.
	// Example: [1] => [１]
	DigitToWide

	// SymbolToNarrow converts all the full-width Latin symbols to
	// their half-width.
	// Example: [？] => [?]
	SymbolToNarrow

	// SymbolToWide converts all the half-width Latin symbols to
	// their full-width.
	// Example: [?] => [？]
	SymbolToWide

	// HiraganaToNarrow converts the full-width Hiragana letters to
	// their half-width Katakana as much as possible.
	// Example: [あ] => [ｱ]
	HiraganaToNarrow

	// HiraganaToKatakana converts the full-width Hiragana letters to
	// their full-width Katakana as much as possible.
	// Example: [あ] => [ア]
	HiraganaToKatakana

	// KatakanaToNarrow converts the full-width Katakana letters to
	// their half-width Katakana as much as possible.
	// Example: [ア] => [ｱ]
	KatakanaToNarrow

	// KatakanaToWide converts all the half-width Katakana letters to
	// their full-width Katakana.
	// Example: [ｱ] => [ア]
	KatakanaToWide

	// KatakanaToHiragana converts the half-width or full-width Katakana
	// letters to their full-width Hiragana as much as possible.
	// Examples: [ア] => [あ],  [ｱ] => [あ]
	KatakanaToHiragana

	// KanaSymbolToNarrow converts the full-width Hiragana-Katakana
	// symbols to their half-width as much as possible.
	// Example: [、] => [､]
	KanaSymbolToNarrow

	// KanaSymbolToWide converts all the half-width Katakana symbols
	// to their full-width.
	// Example: [､] => [、]
	KanaSymbolToWide

	// ComposeVom composes the voiced or semi-voiced sound letters in
	// the most conventional way.
	// Examples:
	//  [が]     => [が],  [か][゛] => [が],    [か][\u3099] => [が],
	//  [か][ﾞ]  => [が],  [ｶ][゛]  => [ｶ][ﾞ],  [ｶ][ﾞ]       => [ｶ][ﾞ],
	//  [は][゜] => [ぱ],  [ヰ][゛] => [ヸ],    [ゐ][゛]     => [ゐ][゛]
	ComposeVom

	// DecomposeVom decomposes the voiced or semi-voiced sound letters
	// in a way similar to the Unicode canonical decomposition mappings.
	// Examples:
	//  [が]         => [か][\u3099],  [か][゛] => [か][\u3099],
	//  [か][\u3099] => [か][\u3099],  [か][ﾞ]  => [か][\u3099],
	//  [ｶ][゛]      => [ｶ][\u3099],   [ｶ][ﾞ]   => [ｶ][\u3099],
	//  [ぱ]         => [は][\u309A],  [ヰ][゛] => [ヰ][\u3099],
	//  [ゐ][゛]     => [ゐ][\u3099]
	DecomposeVom

	// IsolatedVomToNarrow converts an isolated voicing modifier
	// which was not combined into a base letter into a half-width
	// voiced or semi-voiced sound letter.
	// Examples:
	//  [゛] => [ﾞ],  [\u3099] => [ﾞ],  [゜] => [ﾟ],  [\u309A] => [ﾟ]
	IsolatedVomToNarrow

	// IsolatedVomToWide converts an isolated voicing modifier
	// which was not combined into a base letter into a full-width
	// voiced or semi-voiced sound letter.
	// Examples:
	//  [\u3099] => [゛],  [ﾞ] => [゛],  [\u309A] => [゜],  [ﾟ] => [゜]
	IsolatedVomToWide

	// IsolatedVomToNonspace converts an isolated voicing
	// modifier which was not combined into a base letter into a
	// combining voiced or semi-voiced sound letter.
	// Examples:
	//  [゛] => [\u3099],  [ﾞ] => [\u3099],  [゜] => [\u309A],  [ﾟ] => [\u309A]
	IsolatedVomToNonspace
)

Constants to identify various normalization flags.
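
Since the flags are combined with the | operator throughout this page, they presumably behave as bit flags. The sketch below shows that general Go pattern with hypothetical names and values; it does not reproduce gaga's actual constant values.

```go
package main

import "fmt"

// flag is a hypothetical bit-flag type used only for this sketch.
type flag int

const (
	toNarrow   flag = 1 << iota // 1
	toUpper                     // 2
	toHiragana                  // 4
)

// narrowUpper combines two flags, in the same way Fold combines
// LatinToNarrow and KanaToWide in the constants above.
const narrowUpper = toNarrow | toUpper

// has reports whether every bit of g is set in f.
func (f flag) has(g flag) bool { return f&g == g }

func main() {
	f := narrowUpper
	fmt.Println(f.has(toNarrow), f.has(toHiragana))
}
```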

func ParseNormFlag

func ParseNormFlag(names string) (flags NormFlag, err error)

ParseNormFlag returns the flags parsed from names.
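
The documentation does not state how several names are written in one string. As a sketch only, the code below assumes names separated by "|" and looks each one up in a table. The function parseFlags, the flag type, and the table contents are all hypothetical and are not gaga's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// flag and its values are hypothetical stand-ins for this sketch.
type flag int

const (
	alphaToNarrow flag = 1 << iota
	alphaToUpper
	kanaToHiragana
)

// flagNames maps textual names to the hypothetical flag values.
var flagNames = map[string]flag{
	"AlphaToNarrow":  alphaToNarrow,
	"AlphaToUpper":   alphaToUpper,
	"KanaToHiragana": kanaToHiragana,
}

// parseFlags assumes a "|"-separated list of names (an assumption,
// not documented behavior) and ORs the matching values together.
func parseFlags(names string) (flag, error) {
	var f flag
	for _, name := range strings.Split(names, "|") {
		name = strings.TrimSpace(name)
		v, ok := flagNames[name]
		if !ok {
			return 0, fmt.Errorf("unknown flag name: %q", name)
		}
		f |= v
	}
	return f, nil
}

func main() {
	f, err := parseFlags("AlphaToNarrow | AlphaToUpper")
	fmt.Println(f, err)
}
```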

func (NormFlag) String

func (f NormFlag) String() string

String returns the name of a flag.

type Normalizer

type Normalizer struct {
	// contains filtered or unexported fields
}

Normalizer normalizes the input provided and returns the normalized string.

func Norm

func Norm(flag NormFlag) (*Normalizer, error)

Norm creates a new Normalizer with specified flag (LatinToNarrow etc.). If successful, methods on the returned Normalizer can be used for normalization.

func (*Normalizer) Rune

func (n *Normalizer) Rune(r rune) string

Rune normalizes r according to the current normalization mode. In most cases it returns a string containing a single rune, but in some modes the voicing modifier is separated from its base letter, so the result may contain two runes.
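
To see why one input rune can turn into two, consider decomposing a voiced letter into its base letter plus the combining voiced sound mark U+3099. The sketch below hard-codes a few Hiragana pairs; decomposeVoiced is a hypothetical helper for illustration, not the library's Rune method.

```go
package main

import "fmt"

// voicedBase maps a few voiced Hiragana letters to their unvoiced base
// letters. It is a deliberately small, hypothetical table.
var voicedBase = map[rune]rune{
	0x304C: 0x304B, // ga -> ka
	0x304E: 0x304D, // gi -> ki
	0x3050: 0x304F, // gu -> ku
	0x3052: 0x3051, // ge -> ke
	0x3054: 0x3053, // go -> ko
}

// decomposeVoiced returns the base letter followed by U+3099 for the
// letters in the table, and the rune unchanged otherwise.
func decomposeVoiced(r rune) string {
	if base, ok := voicedBase[r]; ok {
		return string([]rune{base, 0x3099})
	}
	return string(r)
}

func main() {
	for _, r := range []rune{0x304C, 0x3042} {
		out := decomposeVoiced(r)
		fmt.Printf("%U -> %d rune(s)\n", r, len([]rune(out)))
	}
}
```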

func (*Normalizer) SetFlag

func (n *Normalizer) SetFlag(flag NormFlag) error

SetFlag changes the normalization mode with the newly specified flag.

func (*Normalizer) String

func (n *Normalizer) String(s string) string

String normalizes s according to the current normalization mode.

Example
s := "GaGa is not がガガ"
fmt.Println(0, s)

n, _ := gaga.Norm(gaga.LatinToNarrow)
fmt.Println(1, n.String(s))

n.SetFlag(gaga.KanaToWide)
fmt.Println(2, n.String(s))

n.SetFlag(gaga.KanaToHiragana)
fmt.Println(3, n.String(s))

n.SetFlag(gaga.KanaToNarrowKatakana)
fmt.Println(4, n.String(s))

n.SetFlag(gaga.LatinToNarrow | gaga.AlphaToUpper | gaga.KanaToWideKatakana)
fmt.Println(5, n.String(s))
Output:

0 GaGa is not がガガ
1 GaGa is not がガガ
2 GaGa is not がガガ
3 GaGa is not ががが
4 GaGa is not ガガガ
5 GAGA IS NOT ガガガ

Directories

Path Synopsis
cmd/norm: Norm is a utility to normalize Japanese language text files.
cmd/vert: Vert is a utility to convert text files to vertical printing.
cmd/wecho: Wecho is an echo command that writes UTF-8 text to standard output.
gen/lib: Package lib implements gaga's auto-generation utility.
