gorkov


Markov chains for Go.

Documentation


Constants

const (
	// LiteralType is the type used for literal tokens.
	LiteralType = "l"
)

Variables

var (
	// End is a pseudo-token that ends a chain.
	End = NewToken("e", "")
)

Functions

func TokensEqual

func TokensEqual(a, b Token) bool

TokensEqual checks two tokens for equality. Two tokens are considered equal if their type and identifier match.
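
A short sketch of the equality rule (assuming the package is imported as gorkov; Literal and NewToken are described below):

a := gorkov.Literal("hello")
b := gorkov.NewToken(gorkov.LiteralType, "hello")

fmt.Println(gorkov.TokensEqual(a, b))          // true: both have type "l" and identifier "hello"
fmt.Println(gorkov.TokensEqual(a, gorkov.End)) // false: End has type "e"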

Types

type ReaderTokenizer

type ReaderTokenizer struct {
	// contains filtered or unexported fields
}

ReaderTokenizer turns data from an io.Reader into a stream of tokens. It turns newlines ('\n') into End tokens and returns everything else as literal tokens. Each literal token consists either entirely of whitespace and punctuation or entirely of other characters; two consecutive tokens never contain the same class of characters.

Punctuation and whitespace are everything that is a Unicode punctuation character (category P) or has Unicode's White Space property. See the unicode package for details.
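
As an illustration (a sketch; the import name gorkov and the exact split are inferred from the description above), the input "Hello, world!\n" should produce:

t := gorkov.NewTokenizer(strings.NewReader("Hello, world!\n"))
// Successive calls to t.Next() should yield:
//   Literal("Hello")  only non-punctuation, non-whitespace characters
//   Literal(", ")     only punctuation and whitespace
//   Literal("world")
//   Literal("!")
//   End               the '\n' is turned into an End token
// and then io.EOF.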

func NewTokenizer

func NewTokenizer(r io.Reader) *ReaderTokenizer

NewTokenizer creates a new ReaderTokenizer for the given reader.

func (*ReaderTokenizer) Next

func (t *ReaderTokenizer) Next() (Token, error)

Next returns the next token. See the description of ReaderTokenizer for an explanation of what kinds of tokens to expect.
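
A typical consumption loop (a sketch with simplified error handling; import name gorkov assumed) reads until io.EOF and treats End tokens as line breaks:

t := gorkov.NewTokenizer(os.Stdin)
for {
	tok, err := t.Next()
	if err == io.EOF {
		break // input exhausted
	}
	if err != nil {
		log.Fatal(err)
	}
	if gorkov.TokensEqual(tok, gorkov.End) {
		fmt.Println() // End corresponds to a '\n' in the input
		continue
	}
	fmt.Print(tok.Value())
}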

type Token

type Token interface {
	// Type provides a string identifying the type of the token. Type must
	// always return the same string, and that string must not be used by
	// any other type of token in one Gorkov instance. It may not contain
	// any null bytes.
	//
	// Tokens generated by this package will only use type names consisting
	// of a single ASCII letter or digit (a-z, A-Z and 0-9).
	Type() string

	// Identifier returns a string identifying this particular token.
	// Identifier must always return the same string for one token and
	// that string may not contain any null bytes.
	//
	// For two tokens a and b, the expression
	//   a.Type() == b.Type() && a.Identifier() == b.Identifier()
	// must be true if and only if a and b are considered equal.
	Identifier() string

	// Value returns the string that is used when generating a text using
	// this token. This is usually a static string, but can also be
	// dynamically generated.
	Value() string
}

Token is a single element of a Markov chain, usually a word or a run of whitespace.
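
Any type implementing the interface can serve as a token. As a hypothetical sketch (clockToken is not part of this package), a token with a dynamically generated value could look like this:

// clockToken produces the current time whenever it is used in generated
// text. Its type "c" is chosen not to collide with the types "l" and "e"
// used by this package.
type clockToken struct{}

func (clockToken) Type() string       { return "c" }
func (clockToken) Identifier() string { return "clock" }
func (clockToken) Value() string      { return time.Now().Format("15:04") }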

func Literal

func Literal(value string) Token

Literal creates a new token for a literal. This is a convenience function and is equivalent to calling NewToken(LiteralType, value).

func NewToken

func NewToken(t, value string) Token

NewToken creates a new token with a static value. The value is also used as the identifier. It can be used for static tokens, such as literal words.
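
A minimal sketch (the type name "w" is an arbitrary example):

tok := gorkov.NewToken("w", "hello")
// tok.Type() == "w", tok.Identifier() == "hello", tok.Value() == "hello"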

type Tokenizer

type Tokenizer interface {
	// Next returns the next token, if possible. If there are no more
	// tokens, io.EOF will be returned. Next can also return other errors,
	// so consumers have to check for them.
	Next() (Token, error)
}

Tokenizer allows consuming a stream of tokens.

type TokenizerFunc

type TokenizerFunc func() (Token, error)

TokenizerFunc can be used to turn an ordinary function into a Tokenizer.
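
For instance, this sketch (import name gorkov assumed) serves a fixed slice of literals and then reports io.EOF:

i, words := 0, []string{"Hello", " ", "world"}
var tz gorkov.Tokenizer = gorkov.TokenizerFunc(func() (gorkov.Token, error) {
	if i == len(words) {
		return nil, io.EOF
	}
	tok := gorkov.Literal(words[i])
	i++
	return tok, nil
})
// tz can now be used anywhere a Tokenizer is expected.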

func (TokenizerFunc) Next

func (t TokenizerFunc) Next() (Token, error)

Next calls t().

Directories

internal
