json

Version: v0.10.0
Published: Feb 17, 2025 License: MIT Imports: 6 Imported by: 1

README

Yet another json library. It's created to process unstructured json in a convenient and efficient way.

There is also a set of jq filters implemented on top of json.Decoder.

json usage

Decoder is stateless. Most methods take the source buffer and the index at which to start parsing, and return a result along with the index at which parsing stopped.

No method copies data or allocates, except those that take a destination buffer as an argument.
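The calling convention described above can be sketched with the standard library alone. This is not the library's code: skipSpaces, rawInt and errNotNumber are hypothetical helpers written for this illustration. They only show how a stateless function can take a buffer and a start index and return a subslice together with the index where it stopped.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotNumber is a hypothetical error used only by this sketch.
var errNotNumber = errors.New("not a number")

// skipSpaces returns the index of the first non-whitespace byte at or after i.
func skipSpaces(b []byte, i int) int {
	for i < len(b) && (b[i] == ' ' || b[i] == '\t' || b[i] == '\n' || b[i] == '\r') {
		i++
	}
	return i
}

// rawInt returns the run of ASCII digits that starts at the first
// non-space byte at or after st, and the index just past that run.
// The result is a subslice of b: nothing is copied or allocated.
func rawInt(b []byte, st int) (v []byte, i int, err error) {
	st = skipSpaces(b, st)
	i = st
	for i < len(b) && b[i] >= '0' && b[i] <= '9' {
		i++
	}
	if i == st {
		return nil, st, errNotNumber
	}
	return b[st:i], i, nil
}

func main() {
	data := []byte("12  345\n6")

	var err error
	for i := skipSpaces(data, 0); i < len(data); i = skipSpaces(data, i) {
		var v []byte
		v, i, err = rawInt(data, i) // plain assignment, so the loop's i advances
		if err != nil {
			fmt.Println("error at", i, err)
			return
		}
		fmt.Printf("%s ends at %d\n", v, i)
	}
	// Output:
	// 12 ends at 2
	// 345 ends at 7
	// 6 ends at 9
}
```

All position state lives in the caller's i, so the same helpers can be shared between goroutines without synchronization.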

The code is from examples.

// Parsing single object.

var d json.Decoder
data := []byte(`{"key": "value", "another": 1234}`)

i := 0 // initial position
i, err := d.Enter(data, i, json.Object)
if err != nil {
	// not an object
}

var key []byte // declared here so the loop can assign with = and not shadow i and err with :=

// extracted values
var value, another []byte

for d.ForMore(data, &i, json.Object, &err) {
	key, i, err = d.Key(data, i) // Key reads a string but doesn't decode '\n', '\"', '\uXXXX' and other escapes
	if err != nil {
		// ...
	}

	switch string(key) {
	case "key":
		value, i, err = d.DecodeString(data, i, value[:0]) // reuse value buffer if we are in a loop or something
	case "another":
		another, i, err = d.Raw(data, i)
	default: // skip additional keys
		i, err = d.Skip(data, i)
	}

	// check error for all switch cases
	if err != nil {
		// ...
	}
}
if err != nil {
	// ForMore error
}

// Parsing jsonl: newline (or space, or comma) delimited values.

var err error // declared here so the loop can assign with = and not shadow i with :=
var d json.Decoder
data := []byte(`"a", 2 3
["array"]
`)

for i := d.SkipSpaces(data, 0); i < len(data); i = d.SkipSpaces(data, i) { // skip whitespace so we don't try to read a value from the trailing "\n"
	i, err = processOneObject(data, i) // do not use := here: it would shadow i and the loop would restart from the same index
	if err != nil {
		// ...
	}
}

jq usage

Deprecated in favour of nikand.dev/go/jq (https://pkg.go.dev/nikand.dev/go/jq).

The advantage of this implementation is that filters are stateless, so they can be used by multiple goroutines at once. The rest are disadvantages: the code is more complicated and therefore less reliable, it supports only json, it is less efficient, and it implements fewer filters.

The jq package is a set of Filters that take data from one buffer, process it, and append the result to another buffer.

A filter also takes and returns a state, which it uses to return multiple values one by one. The caller must pass nil on the first iteration and the returned state on every following iteration. Iteration must stop when the returned state is nil. A filter may or may not add a value to the dst buffer. The Empty filter, for example, adds no value and returns a nil state.

The destination buffer is returned even in case of an error. This mostly avoids allocations when the buffer was grown before the error happened.
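The state protocol can be sketched as a small loop. This is a simplified illustration, not the jq package's API: nextField and collect are hypothetical, and the sketch drops the source index and the error that the real Next method also handles. It only shows the nil-in, nil-out handshake between caller and filter.

```go
package main

import (
	"bytes"
	"fmt"
)

// nextField is a hypothetical multi-value filter: each call appends one
// comma-separated field of src to dst, followed by a newline.
// The state is the offset of the next field; a nil state means "start".
func nextField(dst, src []byte, state any) ([]byte, any) {
	if len(src) == 0 {
		return dst, nil // like an empty filter: no value, nil state
	}
	off, _ := state.(int) // nil state yields offset 0
	end := bytes.IndexByte(src[off:], ',')
	if end < 0 {
		dst = append(dst, src[off:]...)
		return append(dst, '\n'), nil // last value: nil state ends the iteration
	}
	dst = append(dst, src[off:off+end]...)
	return append(dst, '\n'), off + end + 1
}

// collect drives the filter the way the text describes: nil state on the
// first call, the returned state afterwards, stop on a nil state.
func collect(src []byte) []byte {
	var dst []byte
	var state any
	for {
		dst, state = nextField(dst, src, state)
		if state == nil {
			return dst
		}
	}
}

func main() {
	fmt.Printf("%q\n", collect([]byte("a,b,c")))
	// Output:
	// "a\nb\nc\n"
}
```

Because the iteration position travels through the state value rather than living inside the filter, one filter value can serve several callers at once.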

The code is from examples.

// Extract some inside value.

data := []byte(`{"key0":"skip it", "key1": {"next_key": ["array", null, {"obj":"val"}, "trailing element"]}}  "next"`)

f := jq.Query{"key1", "next_key", 2} // string keys and int array indexes are supported

var res []byte // reusable buffer
var i int      // start index

// Most filters parse only a single value and return the index where the value ended.
// Use jq.ApplyToAll(f, res[:0], data, 0, []byte("\n")) to process all values in a buffer.
res, i, _, err := f.Next(res[:0], data, i, nil)
if err != nil {
	// i is an index in a source buffer where the error occurred.
}

fmt.Printf("value: %s\n", res)
fmt.Printf("final position: %d of %d\n", i, len(data)) // the filter parsed only the first value in the buffer
_ = i < len(data)                                      // and stopped immediately after it

// Output:
// value: {"obj":"val"}
// final position: 92 of 100

This is especially convenient if you need to extract a value from json inside base64 inside json. Yes, I've seen such cases and this is how this library came to life.

// generated by command
// jq -nc '{key3: "value"} | {key2: (. | tojson)} | @base64 | {key1: .}'
data := []byte(`{"key1":"eyJrZXkyIjoie1wia2V5M1wiOlwidmFsdWVcIn0ifQ=="}`)

f := jq.NewPipe(
	jq.Key("key1"),
	&jq.Base64d{
		Encoding: base64.StdEncoding,
	},
	&jq.JSONDecoder{},
	jq.Key("key2"),
	&jq.JSONDecoder{},
	jq.Key("key3"),
)

res, _, _, err := f.Next(nil, data, 0, nil)
if err != nil {
	panic(err)
}

// res is []byte(`"value"`)
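For comparison, here is roughly the same extraction written with only the standard library. It is a sketch to make the three decoding layers explicit, not a description of how the pipe is implemented; extract is a hypothetical helper. Unlike the pipe, it allocates intermediate values and returns the decoded Go string rather than the raw json `"value"`.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// extract unwraps json -> base64 -> json -> json string -> json.
func extract(data []byte) (string, error) {
	var outer struct {
		Key1 string `json:"key1"`
	}
	if err := json.Unmarshal(data, &outer); err != nil {
		return "", fmt.Errorf("outer json: %w", err)
	}

	raw, err := base64.StdEncoding.DecodeString(outer.Key1)
	if err != nil {
		return "", fmt.Errorf("base64: %w", err)
	}

	var middle struct {
		Key2 string `json:"key2"` // a json document encoded as a string
	}
	if err := json.Unmarshal(raw, &middle); err != nil {
		return "", fmt.Errorf("middle json: %w", err)
	}

	var inner struct {
		Key3 string `json:"key3"`
	}
	if err := json.Unmarshal([]byte(middle.Key2), &inner); err != nil {
		return "", fmt.Errorf("inner json: %w", err)
	}

	return inner.Key3, nil
}

func main() {
	data := []byte(`{"key1":"eyJrZXkyIjoie1wia2V5M1wiOlwidmFsdWVcIn0ifQ=="}`)

	v, err := extract(data)
	if err != nil {
		panic(err)
	}

	fmt.Println(v)
	// Output:
	// value
}
```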

Documentation

Constants

const (
	None    = 0 // never returned in successful case
	Null    = 'n'
	Bool    = 'b'
	String  = 's'
	Array   = '['
	Object  = '{'
	Number  = '1'
	Comment = '/'
)

Value types returned by Decoder.

Variables

var (
	ErrBadNumber   = errors.New("bad number")
	ErrShortBuffer = io.ErrShortBuffer
	ErrSyntax      = errors.New("syntax error")
	ErrType        = errors.New("incompatible type")
)

Decoder errors, plus Str errors from the skip module.

var (
	ErrNoSuchKey   = errors.New("no such object key")
	ErrOutOfBounds = errors.New("out of array bounds")
)

Seek errors.

Functions

func SkipSpaces added in v0.6.0

func SkipSpaces(b []byte, i int) int

SkipSpaces skips whitespaces.

Types

type Decoder added in v0.7.0

type Decoder struct{}

Decoder is a group of methods to parse JSON. Decoder is stateless. All the needed state is passed through arguments and return values.

Most of the methods take a buffer with json and a start position and return a value, the end position and a possible error.

Example
package main

import (
	"fmt"

	"nikand.dev/go/json"
)

func main() {
	var d json.Decoder
	data := []byte(`{"key": "value", "another": 1234}`)

	i := 0 // initial position
	i, err := d.Enter(data, i, json.Object)
	if err != nil {
		// not an object
	}

	var key []byte // declared here so the loop can assign with = and not shadow i and err with :=

	// extracted values
	var value, another []byte

	for d.ForMore(data, &i, json.Object, &err) {
		key, i, err = d.Key(data, i) // Key reads a string but doesn't decode '\n', '\"', '\uXXXX' and other escapes
		if err != nil {
			// ...
		}

		switch string(key) {
		case "key":
			value, i, err = d.DecodeString(data, i, value[:0]) // reuse value buffer if we are in a loop or something
		case "another":
			another, i, err = d.Raw(data, i)
		default: // skip additional keys
			i, err = d.Skip(data, i)
		}

		// check error for all switch cases
		if err != nil {
			// ...
		}
	}
	if err != nil {
		// ForMore error
	}

	fmt.Printf("key: %s\nanother: %s\n", value, another)

}
Output:

key: value
another: 1234

Example (MultipleValues)
package main

import (
	"fmt"

	"nikand.dev/go/json"
)

func main() {
	var err error // to not to shadow i in a loop
	var d json.Decoder
	data := []byte(`"a", 2 3
["array"]
`)

	processOneObject := func(data []byte, st int) (int, error) {
		raw, i, err := d.Raw(data, st)

		fmt.Printf("value: %s\n", raw)

		return i, err
	}

	for i := d.SkipSpaces(data, 0); i < len(data); i = d.SkipSpaces(data, i) { // skip trailing whitespace so we don't try to read a value from the string "\n"
		i, err = processOneObject(data, i) // do not use := here: it would shadow i and the loop would restart from the same index
		if err != nil {
			// ...
		}
	}

}
Output:

value: "a"
value: 2
value: 3
value: ["array"]

func (*Decoder) Break added in v0.7.0

func (d *Decoder) Break(b []byte, st, depth int) (i int, err error)

Break exits depth levels of the enclosing arrays or objects, moving to the end of each. As a special case, with depth=0 it skips the next value, which is exactly what Skip and Raw do.

It's intended for exiting out of arrays and objects when their content is not needed anymore (all the needed indexes or keys are already parsed) and we want to parse the next array or object.

func (*Decoder) DecodeString added in v0.7.0

func (d *Decoder) DecodeString(b []byte, st int, buf []byte) (s []byte, i int, err error)

DecodeString reads the next string, decodes escape sequences (\n, \uXXXX), and appends the result to the buf.

func (*Decoder) DecodedStringLength added in v0.7.0

func (d *Decoder) DecodedStringLength(b []byte, st int) (bs, rs, i int, err error)

DecodedStringLength reads and decodes the next string but only returns the resulting length. Unlike DecodeString, it doesn't allocate.

func (*Decoder) Enter added in v0.7.0

func (d *Decoder) Enter(b []byte, st int, typ byte) (i int, err error)

Enter enters an Array or an Object. typ is checked against the actual container type. Use More or its more convenient form, ForMore, to iterate over the container. See the examples for the usage pattern.

func (*Decoder) ForMore added in v0.7.0

func (d *Decoder) ForMore(b []byte, i *int, typ byte, errp *error) bool

ForMore is a convenient wrapper for More which makes iterating code shorter and simpler.

func (*Decoder) IterFunc added in v0.8.0

func (d *Decoder) IterFunc(b []byte, st int, tp byte, f func(k, v []byte) error) (i int, err error)

IterFunc is a little helper on top of the Enter and ForMore methods. It iterates over an object or array and calls f for each value. When iterating over an array, k is nil. When iterating over an object, k is read using Key, which doesn't decode escape sequences. It reads the object or array to the end unless f returns an error.

func (*Decoder) Key added in v0.7.0

func (d *Decoder) Key(b []byte, st int) (k []byte, i int, err error)

Key reads the next string removing quotes but not decoding the string value. So escape sequences (\n, \uXXXX) are not decoded. They are returned as is. This is intended for object keys as they usually contain alpha-numeric symbols only. This is faster and does not require additional buffer for decoding.
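The difference between returning a string as is and decoding it can be shown with the standard library. rawContent and decoded below are hypothetical stand-ins for the two behaviours, not the library's methods: the first strips the quotes and leaves escapes untouched, the second decodes them into a new string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawContent returns the bytes between the quotes of a JSON string literal
// without decoding escapes. It assumes lit is a complete, valid literal.
// The result is a subslice of lit, so no extra buffer is needed.
func rawContent(lit []byte) []byte {
	return lit[1 : len(lit)-1]
}

// decoded returns the string value with escape sequences decoded.
// This needs new memory because the decoded text differs from the source.
func decoded(lit []byte) (string, error) {
	var s string
	err := json.Unmarshal(lit, &s)
	return s, err
}

func main() {
	lit := []byte(`"a\nb"`) // six bytes: quote, a, backslash, n, b, quote

	raw := rawContent(lit)

	dec, err := decoded(lit)
	if err != nil {
		panic(err)
	}

	fmt.Printf("raw:     %q (%d bytes)\n", raw, len(raw))
	fmt.Printf("decoded: %q (%d bytes)\n", dec, len(dec))
	// Output:
	// raw:     "a\\nb" (4 bytes)
	// decoded: "a\nb" (3 bytes)
}
```

For keys made of plain alpha-numeric characters both forms are byte-for-byte identical, which is why the cheaper raw form is usually enough for a switch over key names.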

func (*Decoder) Length added in v0.7.0

func (d *Decoder) Length(b []byte, st int) (n, i int, err error)

Length calculates number of elements in Array or Object.

func (*Decoder) More added in v0.7.0

func (d *Decoder) More(b []byte, st int, typ byte) (more bool, i int, err error)

More iterates over an Array or an Object elements entered by the Enter method.

func (*Decoder) Raw added in v0.7.0

func (d *Decoder) Raw(b []byte, st int) (v []byte, i int, err error)

Raw skips the next value and returns a subslice containing the value with surrounding whitespace trimmed.

func (*Decoder) Seek added in v0.8.0

func (d *Decoder) Seek(b []byte, st int, path ...interface{}) (i int, err error)

Seek seeks to the beginning of the value at the path, a list of object keys and array indexes. If you parse multiple objects and only need one value from each, it's good to use Break(len(path)) to move to the beginning of the next object.
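To make the path semantics concrete, here is a standard-library sketch that resolves the same kind of path (string keys and int indexes) on a fully decoded value. lookup and its errors are hypothetical and are not how Seek works: Seek moves an index through the source buffer, while this version decodes the whole document into maps and slices first.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Hypothetical errors for this sketch, named after the package's Seek errors.
var (
	errNoSuchKey   = errors.New("no such object key")
	errOutOfBounds = errors.New("out of array bounds")
	errType        = errors.New("incompatible type")
)

// lookup walks path through the decoded document: a string element selects
// an object key, an int element selects an array index.
func lookup(data []byte, path ...any) (any, error) {
	var v any
	if err := json.Unmarshal(data, &v); err != nil {
		return nil, err
	}

	for _, p := range path {
		switch p := p.(type) {
		case string:
			obj, ok := v.(map[string]any)
			if !ok {
				return nil, errType
			}
			next, ok := obj[p]
			if !ok {
				return nil, errNoSuchKey
			}
			v = next
		case int:
			arr, ok := v.([]any)
			if !ok {
				return nil, errType
			}
			if p < 0 || p >= len(arr) {
				return nil, errOutOfBounds
			}
			v = arr[p]
		default:
			return nil, errType
		}
	}

	return v, nil
}

func main() {
	data := []byte(`{"key1": {"next_key": ["array", null, {"obj":"val"}]}}`)

	v, err := lookup(data, "key1", "next_key", 2)
	if err != nil {
		panic(err)
	}

	fmt.Println(v)
	// Output:
	// map[obj:val]
}
```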

Example (SeekIter)
package main

import (
	"fmt"
	"strconv"

	"nikand.dev/go/json"
	"nikand.dev/go/json/benchmarks_data"
)

func main() {
	err := func(b []byte) error {
		var d json.Decoder

		i, err := d.Seek(b, 0, "topics", "topics")
		if err != nil {
			return fmt.Errorf("seek topics: %w", err)
		}

		i, err = d.Enter(b, i, json.Array)

		for err == nil && d.ForMore(b, &i, json.Array, &err) {
			var id int
			var title []byte

			i, err = d.IterFunc(b, i, json.Object, func(k, v []byte) error {
				switch string(k) {
				case "id":
					x, err := strconv.ParseInt(string(v), 10, 64)
					if err != nil {
						return fmt.Errorf("parse id: %w", err)
					}

					id = int(x)
				case "title":
					title, _, err = d.DecodeString(v, 0, title[:0])
					if err != nil {
						return fmt.Errorf("decode title: %w", err)
					}
				}

				return nil
			})

			fmt.Printf("> %3d %s\n", id, title)

		}

		if err != nil {
			return fmt.Errorf("iter topics: %w", err)
		}

		return nil
	}(benchmarks_data.LargeFixture)
	if err != nil {
		fmt.Printf("error: %v\n", err)
	}

}
Output:

>   8 Welcome to Metabase's Discussion Forum
> 169 Formatting Dates
> 168 Setting for google api key
> 167 Cannot see non-US timezones on the admin
> 164 External (Metabase level) linkages in data schema
> 155 Query working on "Questions" but not in "Pulses"
> 161 Pulses posted to Slack don't show question output
> 152 Should we build Kafka connecter or Kafka plugin
> 147 Change X and Y on graph
> 142 Issues sending mail via office365 relay
> 137 I see triplicates of my mongoDB collections
> 140 Google Analytics plugin
> 138 With-mongo-connection failed: bad connection details:
> 133 "We couldn't understand your question." when I query mongoDB
> 129 My bar charts are all thin
> 106 What is the expected return order of columns for graphing results when using raw SQL?
> 131 Set site url from admin panel
> 127 Internationalization (i18n)
> 109 Returning raw data with no filters always returns We couldn't understand your question
> 103 Support for Cassandra?
> 128 Mongo query with Date breaks [solved: Mongo 3.0 required]
>  23 Can this connect to MS SQL Server?
> 121 Cannot restart metabase in docker
>  85 Edit Max Rows Count
>  96 Creating charts by querying more than one table at a time
>  90 Trying to add RDS postgresql as the database fails silently
>  17 Deploy to Heroku isn't working
> 100 Can I use DATEPART() in SQL queries?
>  98 Feature Request: LDAP Authentication
>  87 Migrating from internal H2 to Postgres

func (*Decoder) Skip added in v0.7.0

func (d *Decoder) Skip(b []byte, st int) (i int, err error)

Skip skips the next value.

func (*Decoder) SkipSpaces added in v0.7.0

func (d *Decoder) SkipSpaces(b []byte, i int) int

SkipSpaces skips whitespaces.

func (*Decoder) Type added in v0.7.0

func (d *Decoder) Type(b []byte, st int) (tp byte, i int, err error)

Type finds the beginning of the next value and detects its type. It doesn't parse the value so it can't detect if it's incorrect.
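One way such detection can work is by looking only at the first significant byte. The sketch below is an assumption about the general technique, not the library's implementation: peekType is hypothetical, it returns no error, and the constants are copied from the Constants section. It shows why a truncated literal such as `tru` is still reported as a Bool.

```go
package main

import "fmt"

// Value type markers, copied from the Constants section of this package.
const (
	None    = 0
	Null    = 'n'
	Bool    = 'b'
	String  = 's'
	Array   = '['
	Object  = '{'
	Number  = '1'
	Comment = '/'
)

// peekType skips whitespace from st and classifies the next value by its
// first byte only. It returns the type and the index where the value starts.
func peekType(b []byte, st int) (tp byte, i int) {
	i = st
	for i < len(b) && (b[i] == ' ' || b[i] == '\t' || b[i] == '\n' || b[i] == '\r') {
		i++
	}
	if i == len(b) {
		return None, i
	}

	switch c := b[i]; {
	case c == 'n':
		return Null, i
	case c == 't' || c == 'f':
		return Bool, i
	case c == '"':
		return String, i
	case c == '[' || c == '{':
		return c, i // Array and Object markers equal the opening byte
	case c == '-' || (c >= '0' && c <= '9'):
		return Number, i
	case c == '/':
		return Comment, i
	}

	return None, i
}

func main() {
	// "tru" is not valid json, but the first byte alone says Bool.
	tp, i := peekType([]byte("  tru"), 0)
	fmt.Printf("%c at %d\n", tp, i)
	// Output:
	// b at 2
}
```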

type Encoder added in v0.7.0

type Encoder struct{}

func (*Encoder) AppendKey added in v0.9.0

func (e *Encoder) AppendKey(w, s []byte) []byte

func (*Encoder) AppendString added in v0.8.0

func (e *Encoder) AppendString(w, s []byte) []byte

AppendString encodes a string in a JSON-compatible way.

func (*Encoder) AppendStringContent added in v0.8.0

func (e *Encoder) AppendStringContent(w, s []byte) []byte

AppendStringContent does the same as AppendString but does not add quotes. It can be used to build a string from multiple parts. However, if a symbol that must be escaped is split between parts, each part of the symbol is encoded separately.

type RawMessage added in v0.10.0

type RawMessage []byte

func (RawMessage) MarshalJSON added in v0.10.0

func (x RawMessage) MarshalJSON() ([]byte, error)

func (*RawMessage) UnmarshalJSON added in v0.10.0

func (x *RawMessage) UnmarshalJSON(d []byte) error

type Reader added in v0.8.0

type Reader struct {
	// contains filtered or unexported fields
}

func NewReader added in v0.8.0

func NewReader(b []byte, rd io.Reader) *Reader

NewReader creates a new Reader. It reads first from b and then from rd. To provide b only as buffer space rather than as initial data, slice it to zero length (b[:0]).

func (*Reader) Break added in v0.8.0

func (r *Reader) Break(depth int) (err error)

Break exits depth levels of the enclosing arrays or objects, moving to the end of each. As a special case, with depth=0 it skips the next value, which is exactly what Skip and Raw do.

It's intended for exiting out of arrays and objects when their content is not needed anymore (all the needed indexes or keys are already parsed) and we want to parse the next array or object.

func (*Reader) DecodeString added in v0.8.0

func (r *Reader) DecodeString(buf []byte) (s []byte, err error)

DecodeString reads the next string, decodes escape sequences (\n, \uXXXX), and appends the result to the buf.

Data is appended to the provided buffer, and the Reader does not keep the buffer afterwards.

func (*Reader) DecodedStringLength added in v0.8.0

func (r *Reader) DecodedStringLength() (bs, rs int, err error)

DecodedStringLength reads and decodes the next string but only returns the resulting length. Unlike DecodeString, it doesn't allocate.

func (*Reader) Enter added in v0.8.0

func (r *Reader) Enter(typ byte) (err error)

Enter enters an Array or an Object. typ is checked against the actual container type. Use More or its more convenient form, ForMore, to iterate over the container. See the examples for the usage pattern.

func (*Reader) ForMore added in v0.8.0

func (r *Reader) ForMore(typ byte, errp *error) bool

ForMore is a convenient wrapper for More which makes iterating code shorter and simpler.

func (*Reader) IterFunc added in v0.8.0

func (r *Reader) IterFunc(tp byte, f func(k, v []byte) error) (err error)

IterFunc is a little helper on top of the Enter and ForMore methods. See Decoder.IterFunc for more details.

func (*Reader) Key added in v0.8.0

func (r *Reader) Key() ([]byte, error)

Key reads the next string removing quotes but not decoding the string value. So escape sequences (\n, \uXXXX) are not decoded. They are returned as is. This is intended for object keys as they usually contain alpha-numeric symbols only. This is faster and does not require additional buffer for decoding.

The returned buffer is only valid until the next reading method is called. Its memory may be reused when more data needs to be read from the underlying reader.

func (*Reader) Length added in v0.8.0

func (r *Reader) Length() (n int, err error)

Length calculates number of elements in Array or Object.

func (*Reader) Lock added in v0.8.0

func (r *Reader) Lock() int

Lock locks the internal buffer so the data is not overwritten when more data is read from the underlying reader. It's used to return to the locked position in the stream and reread part of it. The internal buffer grows to the size of the locked data plus additional space for the next Read. Lock must be followed by Unlock, just as with sync.Mutex. Rewind returns to the latest Lock position. Multiple nested locks are allowed. Lock returns the number of locks acquired and not yet released, that is, the lock depth.

func (*Reader) More added in v0.8.0

func (r *Reader) More(typ byte) (more bool, err error)

More iterates over an Array or an Object elements entered by the Enter method.

func (*Reader) Offset added in v0.8.0

func (r *Reader) Offset() int64

Offset returns current position in the stream.

func (*Reader) Raw added in v0.8.0

func (r *Reader) Raw() ([]byte, error)

Raw skips the next value and returns a subslice containing the value with surrounding whitespace trimmed.

The returned buffer is only valid until the next reading method is called. Its memory may be reused when more data needs to be read from the underlying reader.

func (*Reader) Reset added in v0.8.0

func (r *Reader) Reset(b []byte, rd io.Reader)

Reset resets the reader. As with NewReader, it reads first from b and then from rd.

func (*Reader) Rewind added in v0.8.0

func (r *Reader) Rewind()

Rewind returns stream position to the latest Lock.

func (*Reader) Seek added in v0.8.0

func (r *Reader) Seek(path ...interface{}) (err error)

Seek seeks to the beginning of the value at the path, a list of object keys and array indexes. If you parse multiple objects and only need one value from each, it's good to use Break(len(path)) to move to the beginning of the next object.

func (*Reader) Skip added in v0.8.0

func (r *Reader) Skip() error

Skip skips the next value.

func (*Reader) Type added in v0.8.0

func (r *Reader) Type() (tp byte, err error)

Type finds the beginning of the next value and detects its type. It doesn't parse the value so it can't detect if it's incorrect.

func (*Reader) Unlock added in v0.8.0

func (r *Reader) Unlock() int

Unlock releases the latest buffer Lock. It returns the number of remaining active Locks.

type StatedEncoder added in v0.10.0

type StatedEncoder struct {
	Buf []byte

	Encoder
	// contains filtered or unexported fields
}

func NewStatedEncoder added in v0.10.0

func NewStatedEncoder(b []byte) *StatedEncoder

func (*StatedEncoder) ArrEnd added in v0.10.0

func (e *StatedEncoder) ArrEnd() *StatedEncoder

func (*StatedEncoder) ArrStart added in v0.10.0

func (e *StatedEncoder) ArrStart() *StatedEncoder

func (*StatedEncoder) Int added in v0.10.0

func (e *StatedEncoder) Int(v int) *StatedEncoder

func (*StatedEncoder) Int64 added in v0.10.0

func (e *StatedEncoder) Int64(v int64) *StatedEncoder

func (*StatedEncoder) Key added in v0.10.0

func (e *StatedEncoder) Key(s string) *StatedEncoder

func (*StatedEncoder) KeyInt added in v0.10.0

func (e *StatedEncoder) KeyInt(k string, v int) *StatedEncoder

func (*StatedEncoder) KeyInt64 added in v0.10.0

func (e *StatedEncoder) KeyInt64(k string, v int64) *StatedEncoder

func (*StatedEncoder) KeyString added in v0.10.0

func (e *StatedEncoder) KeyString(k, v string) *StatedEncoder

func (*StatedEncoder) KeyStringBytes added in v0.10.0

func (e *StatedEncoder) KeyStringBytes(k string, v []byte) *StatedEncoder

func (*StatedEncoder) Newline added in v0.10.0

func (e *StatedEncoder) Newline() *StatedEncoder

func (*StatedEncoder) NextIsKey added in v0.10.0

func (e *StatedEncoder) NextIsKey() *StatedEncoder

func (*StatedEncoder) ObjEnd added in v0.10.0

func (e *StatedEncoder) ObjEnd() *StatedEncoder

func (*StatedEncoder) ObjStart added in v0.10.0

func (e *StatedEncoder) ObjStart() *StatedEncoder

func (*StatedEncoder) Reset added in v0.10.0

func (e *StatedEncoder) Reset() *StatedEncoder

func (*StatedEncoder) Result added in v0.10.0

func (e *StatedEncoder) Result() []byte

func (*StatedEncoder) String added in v0.10.0

func (e *StatedEncoder) String(s string) *StatedEncoder

func (*StatedEncoder) StringBytes added in v0.10.0

func (e *StatedEncoder) StringBytes(s []byte) *StatedEncoder
