parse

package

v0.1.1 (latest) Published: Dec 20, 2022 License: MIT Imports: 7 Imported by: 0

Repository

github.com/adnsv/go-parse

Documentation ¶

Index ¶

Constants
func ExtractHex32n(src Source, max_chars int) (v uint32, overflow bool, n_chars int)
func ExtractHex64n(src Source, max_chars int) (v uint64, overflow bool, n_chars int)
func ExtractOct32n(src Source, max_chars int) (v uint32, overflow bool, n_chars int)
func Static(buf []byte, lc *LineCol) *static_impl
func Tokenize[T Key](buf []byte, bindings []*Binding[T], on_token func(k T, c *Context, lc LineCol)) error
type Binding
- func Bind[K Key](key K, descr string, sequence ...any) *Binding[K]
type Context
- func (c *Context) Reset()
type ErrAtLineCol
- func (e *ErrAtLineCol) Error() string
type ErrCode
- func EOF(src Source, ctx *Context) ErrCode
- func EOL(src Source, ctx *Context) ErrCode
- func HexCodepoint_XXXX(src Source, ctx *Context) ErrCode
- func HexCodepoint_XXXXXXXX(src Source, ctx *Context) ErrCode
- func HexCodeunit_XX(src Source, ctx *Context) ErrCode
- func HexCodeunit_Xn(src Source, ctx *Context) ErrCode
- func OctCodeunit_X3n(src Source, ctx *Context) ErrCode
- func (ec ErrCode) String() string
type ErrContent
- func Expected(v string) *ErrContent
- func Invalid(v string) *ErrContent
- func Unexpected(v string) *ErrContent
- func Unpaired(v string) *ErrContent
- func Unterminated(v string) *ErrContent
- func (e *ErrContent) Error() string
type Key
type LineCol
- func (lc *LineCol) String() string
type Location
- func (l *Location) ColumnNumber() int
type Source
type Term
type TermFunc
- func AnyOf(args ...string) TermFunc
- func Between(prefix, terminator any, content ...any) TermFunc
- func Codepoint(r rune) TermFunc
- func CodepointFunc(m func(rune) bool) TermFunc
- func Escaped(prefix rune, escapers map[rune]any) TermFunc
- func FirstOf(args ...any) TermFunc
- func HexCodeunit_XXXX(first_prefix, second_prefix string) TermFunc
- func HexN[T unsigned](prefix string) TermFunc
- func Literal(s string) TermFunc
- func OneOrMore[T Term](a T) TermFunc
- func Optional[T Term](a T) TermFunc
- func Sequence(args ...any) TermFunc
- func Skip(content ...any) TermFunc
- func Uint[T unsigned | signed](prefix string, base uint, maxval T) TermFunc
- func ZeroOrMore[T Term](a T) TermFunc

Constants ¶

const (
	ErrCodeNone = ErrCode(iota)
	ErrCodeUnexpected
	ErrCodeExpected
	ErrCodeUnterminated
	ErrCodeIncomplete
	ErrCodeUnpaired
	ErrCodeInvalid
	ErrCodeOverflow

	ErrCodeUnmatched = ErrCode(-1)
)

const Unmatched = rune(0x7fffffff)

Variables ¶

This section is empty.

Functions ¶

func ExtractHex32n ¶

func ExtractHex32n(src Source, max_chars int) (v uint32, overflow bool, n_chars int)

func ExtractHex64n ¶

func ExtractHex64n(src Source, max_chars int) (v uint64, overflow bool, n_chars int)

func ExtractOct32n ¶

func ExtractOct32n(src Source, max_chars int) (v uint32, overflow bool, n_chars int)

func Static ¶

func Static(buf []byte, lc *LineCol) *static_impl

Static returns a Source implementation that reads content from an in-memory buffer.

func Tokenize ¶

func Tokenize[T Key](buf []byte, bindings []*Binding[T], on_token func(k T, c *Context, lc LineCol)) error

Types ¶

type Binding ¶

type Binding[K Key] struct {
	// contains filtered or unexported fields
}

func Bind ¶

func Bind[K Key](key K, descr string, sequence ...any) *Binding[K]

type Context ¶

type Context struct {
	strings.Builder
	Values []any
}

func (*Context) Reset ¶

func (c *Context) Reset()

type ErrAtLineCol ¶

type ErrAtLineCol struct {
	Err error
	Loc LineCol
}

func (*ErrAtLineCol) Error ¶

func (e *ErrAtLineCol) Error() string

type ErrCode ¶

type ErrCode int

func EOF ¶

func EOF(src Source, ctx *Context) ErrCode

func EOL ¶

func EOL(src Source, ctx *Context) ErrCode

func HexCodepoint_XXXX ¶

func HexCodepoint_XXXX(src Source, ctx *Context) ErrCode

HexCodepoint_XXXX captures four hexadecimal digits and interprets them as a UTF-16 codeunit. This codeunit is then converted to a UTF-8 sequence and inserted into the captured string.

Returned values are:

`ErrCodeUnmatched` if src does not start with a hex digit
`ErrCodeIncomplete` if src contains fewer than 4 hex digits
`ErrCodeInvalid` if the decoded value is a surrogate
`ErrCodeNone` if src contains 4 hex digits that represent a valid codepoint

If src contains more than 4 hex digits, this function consumes only the first 4 of them.
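The decoding rule described above can be illustrated without the package. The following is a minimal standalone sketch, not the library's implementation: `decodeHex4`, `hexVal`, and the string status values are hypothetical names chosen for illustration, and the number of bytes reported as consumed in the incomplete case is this sketch's own choice.

```go
package main

import (
	"fmt"
	"unicode/utf16"
	"unicode/utf8"
)

// hexVal returns the value of an ASCII hex digit, or -1 for any other byte.
func hexVal(c byte) int {
	switch {
	case c >= '0' && c <= '9':
		return int(c - '0')
	case c >= 'a' && c <= 'f':
		return int(c-'a') + 10
	case c >= 'A' && c <= 'F':
		return int(c-'A') + 10
	}
	return -1
}

// decodeHex4 is a hypothetical helper, not part of the package. It reads
// exactly four hex digits from the start of s, appends the UTF-8 encoding of
// the resulting value to dst, and reports how many bytes it consumed.
func decodeHex4(dst []byte, s string) ([]byte, int, string) {
	if len(s) == 0 || hexVal(s[0]) < 0 {
		return dst, 0, "unmatched"
	}
	var v rune
	for i := 0; i < 4; i++ {
		if i >= len(s) || hexVal(s[i]) < 0 {
			return dst, i, "incomplete"
		}
		v = v<<4 | rune(hexVal(s[i]))
	}
	// A lone surrogate is not a valid codepoint, so it is rejected.
	if utf16.IsSurrogate(v) {
		return dst, 4, "invalid"
	}
	return utf8.AppendRune(dst, v), 4, "ok"
}

func main() {
	out, n, st := decodeHex4(nil, "00e9rest")
	fmt.Printf("%q %d %s\n", out, n, st)
	_, _, st = decodeHex4(nil, "d83d")
	fmt.Println(st)
}
```

Only the first four digits are read, so any trailing input is left for the next matcher.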

func HexCodepoint_XXXXXXXX ¶

func HexCodepoint_XXXXXXXX(src Source, ctx *Context) ErrCode

HexCodepoint_XXXXXXXX captures 8 hexadecimal digits and interprets them as a UTF-32 codeunit. This codeunit is then converted to a UTF-8 sequence and inserted into the captured string. This function performs no validation and does not check for surrogates.

Returned values are:

`ErrCodeUnmatched` if src does not start with a hex digit
`ErrCodeIncomplete` if src contains fewer than 8 hex digits
`ErrCodeNone` if src contains 8 hex digits

If src contains more than 8 hex digits, this function consumes only the first 8 of them.

func HexCodeunit_XX ¶

func HexCodeunit_XX(src Source, ctx *Context) ErrCode

HexCodeunit_XX reads two hexadecimal digits from src and inserts the corresponding numeric value into the captured string as a UTF-8 codeunit. The codeunit is inserted as-is, without any validation.

Returned values are:

`ErrCodeUnmatched` if src does not start with a hex digit
`ErrCodeIncomplete` if src contains only one hex digit
`ErrCodeNone` if src contains two hex digits

If src contains more than two hex digits, this function consumes only the first two of them.

func HexCodeunit_Xn ¶

func HexCodeunit_Xn(src Source, ctx *Context) ErrCode

HexCodeunit_Xn reads one or more hexadecimal digits from src and inserts the corresponding numeric value into the captured string as a UTF-8 codeunit. The codeunit is inserted as-is, without any validation.

Returned values are:

`ErrCodeUnmatched` if src does not start with a hex digit
`ErrCodeInvalid` if the obtained value exceeds 255
`ErrCodeNone` if src contains a value in the [0..255] range

This function consumes all the hex digits, regardless of overflow.
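To make the "consume everything, then range-check" behaviour concrete, here is a small standalone sketch. It does not use the package; `hexByteRun` and `hexVal` are hypothetical helpers, and the boolean result stands in for the distinction between `ErrCodeNone` and `ErrCodeInvalid`.

```go
package main

import "fmt"

// hexVal returns the value of an ASCII hex digit, or -1 for any other byte.
func hexVal(c byte) int {
	switch {
	case c >= '0' && c <= '9':
		return int(c - '0')
	case c >= 'a' && c <= 'f':
		return int(c-'a') + 10
	case c >= 'A' && c <= 'F':
		return int(c-'A') + 10
	}
	return -1
}

// hexByteRun is a hypothetical helper, not part of the package. It consumes
// the entire run of hex digits at the start of s, even after the accumulated
// value no longer fits into a byte, and reports whether the value is in range.
func hexByteRun(s string) (v byte, consumed int, inRange bool) {
	acc := 0
	for consumed < len(s) {
		d := hexVal(s[consumed])
		if d < 0 {
			break
		}
		// Stop accumulating once the value is out of range so that acc
		// cannot wrap around, but keep consuming digits.
		if acc <= 0xff {
			acc = acc<<4 | d
		}
		consumed++
	}
	if consumed == 0 || acc > 0xff {
		return 0, consumed, false
	}
	return byte(acc), consumed, true
}

func main() {
	for _, in := range []string{"41;", "1ffz", "zz"} {
		v, n, ok := hexByteRun(in)
		fmt.Printf("%q: value=%#x consumed=%d inRange=%t\n", in, v, n, ok)
	}
}
```

In this sketch, a result with zero consumed digits corresponds to the unmatched case, and a non-zero count with an out-of-range value corresponds to the invalid case.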

func OctCodeunit_X3n ¶

func OctCodeunit_X3n(src Source, ctx *Context) ErrCode

OctCodeunit_X3n reads one to three octal digits from src and inserts the corresponding numeric value into the captured string as a UTF-8 codeunit. The codeunit is inserted as-is, without any validation.

Returned values are:

`ErrCodeUnmatched` if src does not start with an octal digit
`ErrCodeInvalid` if the obtained value exceeds 255
`ErrCodeNone` if src contains a value in the [0..255] range
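Three octal digits can encode values up to 0o777 (511), which is why a range check is needed even with a fixed digit limit. The sketch below shows the arithmetic in isolation; `octByte` is a hypothetical helper and is not taken from the package.

```go
package main

import "fmt"

// octByte is a hypothetical helper, not part of the package. It reads at
// most three octal digits from the start of s and reports the value, the
// number of digits consumed, and whether the value fits into one byte.
func octByte(s string) (v byte, consumed int, inRange bool) {
	acc := 0
	for consumed < 3 && consumed < len(s) && s[consumed] >= '0' && s[consumed] <= '7' {
		acc = acc*8 + int(s[consumed]-'0')
		consumed++
	}
	if consumed == 0 || acc > 255 {
		return 0, consumed, false
	}
	return byte(acc), consumed, true
}

func main() {
	for _, in := range []string{"101", "7x", "777", "1010"} {
		v, n, ok := octByte(in)
		fmt.Printf("%q: value=%d consumed=%d inRange=%t\n", in, v, n, ok)
	}
}
```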

func (ErrCode) String ¶

func (ec ErrCode) String() string

type ErrContent ¶

type ErrContent struct {
	Code ErrCode
	What string
}

func Expected ¶

func Expected(v string) *ErrContent

func Invalid ¶

func Invalid(v string) *ErrContent

func Unexpected ¶

func Unexpected(v string) *ErrContent

func Unpaired ¶

func Unpaired(v string) *ErrContent

func Unterminated ¶

func Unterminated(v string) *ErrContent

func (*ErrContent) Error ¶

func (e *ErrContent) Error() string

type Key ¶

type Key = any

type LineCol ¶

type LineCol struct {
	LineIndex   int // 0-based
	ColumnIndex int // 0-based
}

func (*LineCol) String ¶

func (lc *LineCol) String() string

type Location ¶

type Location struct {
	Offset     int
	LineNumber int
	LineOffset int
}

func (*Location) ColumnNumber ¶

func (l *Location) ColumnNumber() int

type Source ¶

type Source interface {
	// Done indicates that there is no more content available in the input.
	Done() bool

	// Peek previews the codepoint without consuming it. Returns the Unmatched
	// sentinel if the source is at the end of input.
	Peek() rune

	// Hop consumes one codepoint if it matches c.
	Hop(c rune) bool

	// Leap consumes len(seq) bytes only if all the bytes match.
	Leap(seq string) bool

	// Fetch consumes and returns one codepoint if its value is matched by f.
	// Otherwise, it returns the Unmatched sentinel.
	Fetch(f func(rune) bool) rune

	// Skip consumes len(seq) bytes only if all the bytes match and the codepoint
	// that follows matches the term. This is similar to Leap followed by Fetch.
	Skip(seq string, term func(rune) bool) rune
}
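As an illustration of the contract spelled out in the comments above, here is a minimal string-backed type with the same method set. It is a sketch only: `stringSource` and `unmatched` are hypothetical names, the package's own Static implementation may behave differently, and the reading of Skip (consume both the sequence and the following codepoint, or nothing at all) is an assumption based on the "Leap followed by Fetch" remark.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// unmatched mirrors the value of the package's Unmatched sentinel.
const unmatched = rune(0x7fffffff)

// stringSource is a hypothetical in-memory reader, not part of the package.
type stringSource struct {
	s   string
	pos int // byte offset of the next unread codepoint
}

func (p *stringSource) Done() bool { return p.pos >= len(p.s) }

func (p *stringSource) Peek() rune {
	if p.Done() {
		return unmatched
	}
	r, _ := utf8.DecodeRuneInString(p.s[p.pos:])
	return r
}

func (p *stringSource) Hop(c rune) bool {
	if p.Done() {
		return false
	}
	r, n := utf8.DecodeRuneInString(p.s[p.pos:])
	if r != c {
		return false
	}
	p.pos += n
	return true
}

func (p *stringSource) Leap(seq string) bool {
	if !strings.HasPrefix(p.s[p.pos:], seq) {
		return false
	}
	p.pos += len(seq)
	return true
}

func (p *stringSource) Fetch(f func(rune) bool) rune {
	if p.Done() {
		return unmatched
	}
	r, n := utf8.DecodeRuneInString(p.s[p.pos:])
	if !f(r) {
		return unmatched
	}
	p.pos += n
	return r
}

// Skip is all-or-nothing in this sketch (an assumption): it consumes seq and
// the codepoint after it only when both match, and returns that codepoint.
func (p *stringSource) Skip(seq string, term func(rune) bool) rune {
	rest := p.s[p.pos:]
	if !strings.HasPrefix(rest, seq) || len(rest) == len(seq) {
		return unmatched
	}
	r, n := utf8.DecodeRuneInString(rest[len(seq):])
	if !term(r) {
		return unmatched
	}
	p.pos += len(seq) + n
	return r
}

func main() {
	src := &stringSource{s: "let x"}
	fmt.Println(src.Leap("let"))
	fmt.Println(src.Hop(' '))
	fmt.Printf("%c\n", src.Fetch(unicode.IsLetter))
	fmt.Println(src.Done())
}
```

Every method either consumes input and reports a match, or leaves the position untouched, which is what lets higher-level matchers try alternatives in order.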

type Term ¶

type Term interface {
	TermFunc | rune | string | func(rune) bool
}

type TermFunc ¶

type TermFunc = func(Source, *Context) ErrCode

func AnyOf ¶

func AnyOf(args ...string) TermFunc

AnyOf matches and captures any of the provided literal sequences.

func Between ¶

func Between(prefix, terminator any, content ...any) TermFunc

func Codepoint ¶

func Codepoint(r rune) TermFunc

func CodepointFunc ¶

func CodepointFunc(m func(rune) bool) TermFunc

func Escaped ¶

func Escaped(prefix rune, escapers map[rune]any) TermFunc

Escaped creates a matcher for escape sequences (typically found inside string literals).

Supported escaper types (assuming prefix is `\`):

struct{}   self-mapping     \z    -> z
byte       maps to byte     \z    -> byte code unit
rune       maps to rune     \z    -> utf8-encoded codepoint
string     maps to string   \z    -> literal string sequence
TermFunc   uses termfunc    \z... -> invokes TermFunc to decode `...`

A key value in the supplied map may be specified as Unmatched.

func FirstOf ¶

func FirstOf(args ...any) TermFunc

func HexCodeunit_XXXX ¶

func HexCodeunit_XXXX(first_prefix, second_prefix string) TermFunc

HexCodeunit_XXXX is specialized for escape sequences that may decode into a UTF-16 surrogate pair which, in turn, needs to be reassembled into a single codepoint. JSON is a good example.
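The surrogate reassembly step can be shown with the standard library alone. This is a standalone sketch rather than the package's code: `decodeUnits` and `hex4` are hypothetical helpers, the input is assumed to start right after the first prefix has been consumed, and the `secondPrefix` parameter is only modelled on the signature above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf16"
)

// hex4 parses exactly four hex digits from the start of s.
func hex4(s string) (rune, bool) {
	if len(s) < 4 {
		return 0, false
	}
	v, err := strconv.ParseUint(s[:4], 16, 16)
	if err != nil {
		return 0, false
	}
	return rune(v), true
}

// decodeUnits is a hypothetical helper, not part of the package. It decodes
// one 4-digit UTF-16 codeunit and, if that unit is a surrogate, requires
// secondPrefix plus a second unit and joins the pair into one codepoint.
func decodeUnits(s, secondPrefix string) (r rune, consumed int, ok bool) {
	first, ok := hex4(s)
	if !ok {
		return 0, 0, false
	}
	if !utf16.IsSurrogate(first) {
		return first, 4, true
	}
	rest := s[4:]
	if !strings.HasPrefix(rest, secondPrefix) {
		return 0, 4, false // unpaired surrogate
	}
	second, ok := hex4(rest[len(secondPrefix):])
	if !ok {
		return 0, 4, false
	}
	// DecodeRune returns U+FFFD when the two units do not form a valid pair.
	r = utf16.DecodeRune(first, second)
	if r == '\uFFFD' {
		return 0, 4, false
	}
	return r, 4 + len(secondPrefix) + 4, true
}

func main() {
	// JSON-style input "\ud83d\ude00" with the leading "\u" already consumed.
	r, n, ok := decodeUnits(`d83d\ude00`, `\u`)
	fmt.Printf("%U %d %t\n", r, n, ok)
}
```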

func HexN ¶

func HexN[T unsigned](prefix string) TermFunc

func Literal ¶

func Literal(s string) TermFunc

func OneOrMore ¶

func OneOrMore[T Term](a T) TermFunc

func Optional ¶

func Optional[T Term](a T) TermFunc

Optional matches zero or one: (a)?

func Sequence ¶

func Sequence(args ...any) TermFunc

func Skip ¶

func Skip(content ...any) TermFunc

func Uint ¶

func Uint[T unsigned | signed](prefix string, base uint, maxval T) TermFunc

Uint captures a numeric value from a sequence of one or more digits.
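A digit-accumulation loop with an upper bound has to detect overflow before multiplying, otherwise the accumulator can wrap around. The sketch below illustrates one common way to do that; `parseUint` and `digitVal` are hypothetical helpers, and whether the package keeps consuming digits after an overflow is not stated in the documentation, so that behaviour here is an assumption.

```go
package main

import "fmt"

// digitVal returns the value of an ASCII digit or letter, or 255 for any
// other byte (larger than any base this sketch supports).
func digitVal(c byte) uint64 {
	switch {
	case c >= '0' && c <= '9':
		return uint64(c - '0')
	case c >= 'a' && c <= 'z':
		return uint64(c-'a') + 10
	case c >= 'A' && c <= 'Z':
		return uint64(c-'A') + 10
	}
	return 255
}

// parseUint is a hypothetical helper, not part of the package. It reads a run
// of digits in the given base (2..36) and reports the value, the number of
// digits consumed, and whether the value stayed within maxval.
func parseUint(s string, base, maxval uint64) (v uint64, consumed int, ok bool) {
	overflow := false
	for consumed < len(s) {
		d := digitVal(s[consumed])
		if d >= base {
			break
		}
		// v*base+d > maxval is checked without performing the multiplication.
		if d > maxval || v > (maxval-d)/base {
			overflow = true
		} else if !overflow {
			v = v*base + d
		}
		consumed++
	}
	if consumed == 0 || overflow {
		return 0, consumed, false
	}
	return v, consumed, true
}

func main() {
	fmt.Println(parseUint("255,", 10, 255))
	fmt.Println(parseUint("256", 10, 255))
	fmt.Println(parseUint("ff", 16, 0xffff))
}
```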

func ZeroOrMore ¶

func ZeroOrMore[T Term](a T) TermFunc
