stringtokenizer

package module
v1.0.0
Published: Jul 9, 2023 License: Apache-2.0 Imports: 3 Imported by: 2

README

stringtokenizer

This is a port of Java's StringTokenizer class to Go. It is built on bufio.Scanner from the Go standard library.

Why

This library provides a string tokenizer for Go, which is useful for splitting a string into tokens. It can be used to build a lexer or parser.

The main features of this library are:

  • It can return the delimiters as tokens, which is great for lexical analysis
  • It supports UTF-8 delimiters
  • One or more delimiters can be provided, which enables flexible tokenization; a sketch of how these features map onto bufio.Scanner follows this list
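
These features map naturally onto bufio.Scanner, which the package is built on. Purely as an illustration of that approach, and not the package's actual code, the sketch below shows one way a rune-set split function could be written; the name splitOnRunes and the partial-rune handling are assumptions of this sketch.

package main

import (
	"bufio"
	"fmt"
	"strings"
	"unicode/utf8"
)

// splitOnRunes returns a bufio.SplitFunc that emits the text between any of
// the delimiter runes; when includeDelims is true, the delimiter runes are
// emitted as tokens as well.
func splitOnRunes(delims string, includeDelims bool) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (advance int, token []byte, err error) {
		start := 0
		// When delimiters are not wanted as tokens, skip any leading delimiters.
		if !includeDelims {
			for start < len(data) {
				r, size := utf8.DecodeRune(data[start:])
				if r == utf8.RuneError && size == 1 && !atEOF {
					return start, nil, nil // possibly a rune split across reads; wait for more data
				}
				if !strings.ContainsRune(delims, r) {
					break
				}
				start += size
			}
		}
		// Scan forward until the next delimiter rune.
		for i := start; i < len(data); {
			r, size := utf8.DecodeRune(data[i:])
			if r == utf8.RuneError && size == 1 && !atEOF {
				return start, nil, nil // partial rune; wait for more data
			}
			if strings.ContainsRune(delims, r) {
				if i > start {
					return i, data[start:i], nil // the text before the delimiter
				}
				return i + size, data[i : i+size], nil // the delimiter itself (includeDelims is true here)
			}
			i += size
		}
		// At EOF, emit any trailing text; otherwise ask the Scanner for more data.
		if atEOF && len(data) > start {
			return len(data), data[start:], nil
		}
		return start, nil, nil
	}
}

func main() {
	sc := bufio.NewScanner(strings.NewReader("name=markw,age=23"))
	sc.Split(splitOnRunes("=,", true))
	for sc.Scan() {
		fmt.Println(sc.Text()) // prints: name, =, markw, ",", age, =, 23 (one per line)
	}
}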

Usage

Below is an example of lexical analysis of key/value data:

func ExampleStringTokenizer_NextToken_include_delimiters() {

	input := "name=markw,age=23,cyclist=true"
	tokenizer := NewStringTokenizer(strings.NewReader(input), "=,", true /* includeDelimiters */)

	for tokenizer.HasMoreTokens() {
		token := tokenizer.NextToken()
		fmt.Println(token)
	}

	// Output:
	// name
	// =
	// markw
	// ,
	// age
	// =
	// 23
	// ,
	// cyclist
	// =
	// true
}
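
As a follow-on to this example, here is a small sketch of how the delimiter tokens can drive a simple key/value lexer. It assumes the same in-package context and imports as the example above; parseKeyValues is a name invented for illustration, not part of the library.

// parseKeyValues turns "k1=v1,k2=v2,..." into a map, relying on
// includeDelimiters=true so that "=" and "," arrive as their own tokens.
func parseKeyValues(input string) map[string]string {
	tokenizer := NewStringTokenizer(strings.NewReader(input), "=,", true /* includeDelimiters */)

	result := map[string]string{}
	var key string
	expectValue := false

	for tokenizer.HasMoreTokens() {
		switch token := tokenizer.NextToken(); token {
		case "=":
			expectValue = true // the next plain token is a value
		case ",":
			expectValue = false // the next plain token is a key
		default:
			if expectValue {
				result[key] = token
			} else {
				key = token
			}
		}
	}
	return result
}

Because "=" and "," come back as their own tokens, the lexer only needs to remember whether the next plain token is a key or a value; parseKeyValues("name=markw,age=23,cyclist=true") yields map[age:23 cyclist:true name:markw].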

Below is an example using ⌘ or 鸡 as a delimiter, illustrating how multi-byte (UTF-8) delimiters are handled:

func ExampleStringTokenizer_NextToken() {

	input := "a⌘b鸡c"
	tokenizer := NewStringTokenizer(strings.NewReader(input), "⌘鸡", false /* includeDelimiters */)

	for tokenizer.HasMoreTokens() {
		token := tokenizer.NextToken()
		fmt.Println(token)
	}

	// Output:
	// a
	// b
	// c
}
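
For the same input, passing true for includeDelimiters should also surface the delimiter runes themselves as tokens. The sketch below follows the pattern of the first example; the expected output is inferred from the include-delimiters behaviour shown above rather than taken from a published example, hence the non-testable output comment.

func ExampleStringTokenizer_NextToken_multibyte_include_delimiters() {

	input := "a⌘b鸡c"
	tokenizer := NewStringTokenizer(strings.NewReader(input), "⌘鸡", true /* includeDelimiters */)

	for tokenizer.HasMoreTokens() {
		fmt.Println(tokenizer.NextToken())
	}

	// Expected output (inferred):
	// a
	// ⌘
	// b
	// 鸡
	// c
}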

License

This project is released under the Apache 2.0 license and is copyright Mark Wolfe.

Documentation

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type StringTokenizer

type StringTokenizer struct {
	// contains filtered or unexported fields
}

func NewStringTokenizer

func NewStringTokenizer(rdr io.Reader, delimiters string, include bool) *StringTokenizer
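
Because the constructor accepts any io.Reader, the input does not have to be an in-memory string. Below is a minimal sketch of tokenizing a file on commas and newlines; it assumes fmt, log, and os are imported, and config.csv is a hypothetical file name.

f, err := os.Open("config.csv") // hypothetical input file
if err != nil {
	log.Fatal(err)
}
defer f.Close()

// ",\n" treats both the comma and the newline rune as delimiters.
tokenizer := NewStringTokenizer(f, ",\n", false /* includeDelimiters */)
for tokenizer.HasMoreTokens() {
	fmt.Println(tokenizer.NextToken())
}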

func (*StringTokenizer) HasMoreTokens

func (st *StringTokenizer) HasMoreTokens() bool

func (*StringTokenizer) NextToken

func (st *StringTokenizer) NextToken() string
Example
input := "a⌘b鸡c"
tokenizer := NewStringTokenizer(strings.NewReader(input), "⌘鸡", false /* includeDelimiters */)

for tokenizer.HasMoreTokens() {
	token := tokenizer.NextToken()
	fmt.Println(token)
}
Output:

a
b
c
Example (Include_delimiters)
input := "name=markw,age=23,cyclist=true"
tokenizer := NewStringTokenizer(strings.NewReader(input), "=,", true /* includeDelimiters */)

for tokenizer.HasMoreTokens() {
	token := tokenizer.NextToken()
	fmt.Println(token)
}
Output:

name
=
markw
,
age
=
23
,
cyclist
=
true
