ujson

Version: v1.1.0
Published: Jan 9, 2021 License: MIT Imports: 5 Imported by: 2

README

µjson

A fast and minimal JSON parser and transformer that works on unstructured json.

Motivation

Sometimes we just want to make some minimal changes to a json document, or do some generic transformations, without fully unmarshalling it. For example, removing blacklisted fields from a response json. Why pay the full cost of unmarshalling into a map[string]interface{} just to immediately marshal it again?

Read more on dev.to/olvrng.

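Example use cases:
  1. Walk through unstructured json:
       • Print all keys and values
       • Extract some values
  2. Transform unstructured json:
       • Remove all spaces
       • Reformat
       • Remove blacklist fields
       • Wrap int64 in string for processing by JavaScript

without fully unmarshalling it into a map[string]interface{}.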

See usage and examples on godoc.org and dev.to/olvrng.

Important: Behaviour is undefined on invalid json. Use on trusted input only. For untrusted input, you may want to run it through json.Valid() first.
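
The validation step mentioned above can be sketched with the standard library alone. In this minimal sketch, walkChecked and errInvalidJSON are hypothetical names, and the walk parameter stands in for a closure that calls ujson.Walk, so the snippet runs without the dependency:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errInvalidJSON is a hypothetical sentinel error used only by this sketch.
var errInvalidJSON = errors.New("input is not valid json")

// walkChecked rejects malformed input before handing it to walk.
// walk stands in for a closure such as:
//   func(b []byte) error { return ujson.Walk(b, callback) }
func walkChecked(input []byte, walk func([]byte) error) error {
	if !json.Valid(input) {
		return errInvalidJSON
	}
	return walk(input)
}

func main() {
	// Placeholder walker so the sketch runs on its own.
	walk := func(b []byte) error {
		fmt.Printf("walking %d bytes\n", len(b))
		return nil
	}
	fmt.Println(walkChecked([]byte(`{"id": 12345}`), walk))
	fmt.Println(walkChecked([]byte(`{"id": 12345`), walk)) // missing closing brace
}
```

Note that json.Valid makes its own pass over the input, so the check adds some of the cost that the package is meant to avoid; it is probably worth paying only for untrusted input.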

Usage

The single most important function is Walk(input, callback), which parses the input json and calls the callback function for each key/value pair processed.

Let's see an example:

{
   "id": 12345,
   "name": "foo",
   "numbers": ["one", "two"],
   "tags": {"color": "red", "priority": "high"},
   "active": true
}

Calling Walk() with the above input will produce:

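| level | key        | value   |
|-------|------------|---------|
|   0   |            | {       |
|   1   | "id"       | 12345   |
|   1   | "name"     | "foo"   |
|   1   | "numbers"  | [       |
|   2   |            | "one"   |
|   2   |            | "two"   |
|   1   |            | ]       |
|   1   | "tags"     | {       |
|   2   | "color"    | "red"   |
|   2   | "priority" | "high"  |
|   1   |            | }       |
|   1   | "active"   | true    |
|   0   |            | }       |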

level indicates the indentation of the key/value pair as if the json were formatted properly. Keys and values are provided as raw literals. Strings are always double-quoted. To get the original string, use Unquote().
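
One way to use level is to rebuild the path of the current key, which covers the "extract some values" use case. The sketch below is only an illustration: finder is a hypothetical helper, the rows are copied by hand from the table above instead of coming from Walk, keys are unquoted by slicing off the quotes, and the path is assumed to run through objects only (arrays are not tracked precisely).

```go
package main

import (
	"fmt"
	"strings"
)

// finder looks up one value by a dotted key path such as "tags.color".
// It is a hypothetical helper for this sketch, not part of ujson.
type finder struct {
	want  string   // dotted path to look for
	stack []string // stack[i] is the current key at level i+1
	found []byte   // copy of the raw value once the path matches
}

// callback has the same shape as the function passed to Walk.
func (f *finder) callback(level int, key, value []byte) bool {
	if level == 0 || len(key) < 2 {
		return true // root brackets, array items and closing brackets carry no key
	}
	// Grow or shrink the stack so that it holds exactly level entries.
	for len(f.stack) < level {
		f.stack = append(f.stack, "")
	}
	f.stack = f.stack[:level]
	f.stack[level-1] = string(key[1 : len(key)-1]) // strip the quotes
	if strings.Join(f.stack, ".") == f.want {
		f.found = append([]byte(nil), value...) // copy instead of retaining the slice
	}
	return true
}

func main() {
	f := &finder{want: "tags.color"}
	// Rows copied by hand from the table; in real use, pass f.callback to Walk.
	rows := []struct {
		level      int
		key, value string
	}{
		{0, "", "{"},
		{1, `"id"`, "12345"},
		{1, `"tags"`, "{"},
		{2, `"color"`, `"red"`},
		{2, `"priority"`, `"high"`},
		{1, "", "}"},
		{0, "", "}"},
	}
	for _, r := range rows {
		f.callback(r.level, []byte(r.key), []byte(r.value))
	}
	fmt.Printf("%s\n", f.found)
}
```

The found value is still a raw literal, so a string result keeps its double quotes until it is unquoted.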

value will never be empty (for valid json). You can test the first byte (value[0]) to get its type:

  • n: Null (null)
  • f, t: Boolean (false, true)
  • -, 0-9: Number (a negative number starts with -)
  • ": String, see Unquote()
  • [, ]: Array
  • {, }: Object
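
The first-byte test can be wrapped in a small helper. This is a minimal sketch: kindOf is a hypothetical name, not part of the package, and the '-' case is included on the assumption that negative numbers arrive as raw literals like any other value.

```go
package main

import "fmt"

// kindOf classifies a raw value by its first byte.
// '-' is accepted as a number prefix because JSON numbers may be negative.
func kindOf(value []byte) string {
	if len(value) == 0 {
		return "invalid" // should not happen for valid json
	}
	switch c := value[0]; {
	case c == 'n':
		return "null"
	case c == 't' || c == 'f':
		return "boolean"
	case c == '-' || (c >= '0' && c <= '9'):
		return "number"
	case c == '"':
		return "string"
	case c == '[' || c == ']':
		return "array"
	case c == '{' || c == '}':
		return "object"
	}
	return "invalid"
}

func main() {
	for _, v := range []string{`null`, `true`, `12345`, `-1.5`, `"foo"`, `[`, `}`} {
		fmt.Printf("%-7s %s\n", v, kindOf([]byte(v)))
	}
}
```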

When processing arrays and objects, first the open bracket ([, {) will be provided as value, followed by its children, and finally the close bracket (], }). When encountering an open bracket, you can make the callback function return false to skip the array/object entirely.

Documentation

Overview

Package µjson implements a fast and minimal JSON parser and transformer that works on unstructured json. Example use cases:

  1. Walk through unstructured json:
       • Print all keys and values
       • Extract some values
  2. Transform unstructured json:
       • Remove all spaces
       • Reformat
       • Remove blacklist fields
       • Wrap int64 in string for processing by JavaScript

without fully unmarshalling it into a map[string]interface{}.

CAUTION: Behaviour is undefined on invalid json. Use on trusted input only.

The single most important function is "Walk()", which parses the given json and calls the callback function for each key/value pair processed.

{
    "id": 12345,
    "name": "foo",
    "numbers": ["one", "two"],
    "tags": {"color": "red", "priority": "high"},
    "active": true
}

Calling "Walk()" with the above input will produce:

| level | key        | value   |
|-------|------------|---------|
|   0   |            | {       |
|   1   | "id"       | 12345   |
|   1   | "name"     | "foo"   |
|   1   | "numbers"  | [       |
|   2   |            | "one"   |
|   2   |            | "two"   |
|   1   |            | ]       |
|   1   | "tags"     | {       |
|   2   | "color"    | "red"   |
|   2   | "priority" | "high"  |
|   1   |            | }       |
|   1   | "active"   | true    |
|   0   |            | }       |

"level" indicates the indentation of the key/value pair as if the json is formatted properly. Keys and values are provided as raw literal. Strings are always double-quoted. To get the original string, use "Unquote".

"value" will never be empty (for valid json). You can test the first byte ("value[0]") to get its type:

  • 'n' : Null ("null")
  • 'f', 't': Boolean ("false", "true")
  • '-', '0'-'9' : Number
  • '"' : String, see "Unquote"
  • '[', ']': Array
  • '{', '}': Object

When processing arrays and objects, first the open bracket ("[", "{") will be provided as "value", followed by its children, and finally the close bracket ("]", "}"). When encountering open brackets, you can make the callback function return "false" to skip the array/object entirely.

Variables

var ErrSyntax = strconv.ErrSyntax

ErrSyntax indicates that the value has invalid syntax.

Functions

func AppendQuote

func AppendQuote(dst []byte, s []byte) []byte

AppendQuote appends a double-quoted string, valid for json key and value, to dst and returns the extended buffer.

func AppendQuoteString

func AppendQuoteString(dst []byte, s string) []byte

AppendQuoteString appends a double-quoted string, valid for json key or value, to dst and returns the extended buffer.

func AppendQuoteToASCII

func AppendQuoteToASCII(dst []byte, s []byte) []byte

AppendQuoteToASCII appends a double-quoted string, valid for json key and value, to dst and returns the extended buffer.

func AppendQuoteToGraphic

func AppendQuoteToGraphic(dst []byte, s []byte) []byte

AppendQuoteToGraphic appends a double-quoted string, valid for json key and value, to dst and returns the extended buffer.

func Reconstruct

func Reconstruct(input []byte) ([]byte, error)

Reconstruct walks through the input json and rebuilds it. It's put here as an example of using Walk.

func ShouldAddComma

func ShouldAddComma(value []byte, lastChar byte) bool

ShouldAddComma decides if a comma should be appended while constructing output json. See Reconstruct for an example of rebuilding the json.
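
To make the decision concrete, here is one plausible rule written as a standalone function. needsComma is a hypothetical stand-in inferred from how the examples call ShouldAddComma (before the key is appended, with the last byte of the output so far); it is not the library's source and may differ from it.

```go
package main

import "fmt"

// needsComma is a hypothetical stand-in for ShouldAddComma. A comma goes
// before the next key/value unless the value closes a container, or the
// output so far ends with an opening bracket or a comma.
func needsComma(value []byte, lastChar byte) bool {
	if len(value) == 0 {
		return false // should not happen for valid json
	}
	switch value[0] {
	case '}', ']':
		return false
	}
	switch lastChar {
	case '{', '[', ',':
		return false
	}
	return true
}

func main() {
	fmt.Println(needsComma([]byte(`12`), '{')) // first member of an object
	fmt.Println(needsComma([]byte(`12`), '5')) // follows a previous number
	fmt.Println(needsComma([]byte(`]`), '3'))  // closing bracket
}
```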

func Unquote

func Unquote(s []byte) ([]byte, error)

Unquote decodes a double-quoted string key or value to retrieve the original string value. It will avoid allocation whenever possible.

The code is inspired by strconv.Unquote, but only accepts valid json string.

func Walk

func Walk(input []byte, callback func(level int, key, value []byte) bool) error

Walk parses the given json and calls "callback" for each key/value pair. See examples for sample callback params.

The function "callback":

  • may convert key and value to string for processing
  • may return false to skip processing the current object or array
  • must not modify any slice it receives.
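
The last rule can be respected by transforming a copy instead of the slice itself. The sketch below assumes the slices handed to the callback alias the input buffer, which is why modifying them in place would corrupt the document. upperKeys is a hypothetical helper, and the callback is invoked by hand here so the snippet runs without the dependency.

```go
package main

import (
	"bytes"
	"fmt"
)

// upperKeys returns a function with the same shape as the callback passed to
// Walk. It collects upper-cased keys into out without touching its arguments.
func upperKeys(out *[][]byte) func(level int, key, value []byte) bool {
	return func(_ int, key, _ []byte) bool {
		if len(key) > 0 {
			// bytes.ToUpper returns a new slice, so key itself is left intact.
			*out = append(*out, bytes.ToUpper(key))
		}
		return true
	}
}

func main() {
	input := []byte(`{"name": "foo"}`)
	// Sub-slices of input, standing in for what a walker would hand over.
	key, value := input[1:7], input[9:14]

	var keys [][]byte
	upperKeys(&keys)(1, key, value)
	fmt.Printf("%s %s\n", keys[0], input)
}
```

The same idea applies when a callback needs to keep a key or value after it returns: append it to a fresh slice instead of storing the one it was given.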
Example
package main

import (
	"fmt"

	"github.com/olvrng/ujson"
)

func main() {
	input := []byte(`{"order_id": 12345678901234, "number": 12, "item_id": 12345678905678, "counting": [1,"2",3]}`)

	err := ujson.Walk(input, func(level int, key, value []byte) bool {
		fmt.Println(level, string(key), string(value))
		return true
	})
	if err != nil {
		panic(err)
	}
}
Output:

0  {
1 "order_id" 12345678901234
1 "number" 12
1 "item_id" 12345678905678
1 "counting" [
2  1
2  "2"
2  3
1  ]
0  }
Example (Reconstruct)
package main

import (
	"fmt"

	"github.com/olvrng/ujson"
)

func main() {
	input := []byte(`{"order_id": 12345678901234, "number": 12, "item_id": 12345678905678, "counting": [1,"2",3]}`)

	b := make([]byte, 0, 256)
	err := ujson.Walk(input, func(level int, key, value []byte) bool {
		if len(b) != 0 && ujson.ShouldAddComma(value, b[len(b)-1]) {
			b = append(b, ',')
		}
		if len(key) > 0 {
			b = append(b, key...)
			b = append(b, ':')
		}
		b = append(b, value...)
		return true
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", b)
}
Output:

{"order_id":12345678901234,"number":12,"item_id":12345678905678,"counting":[1,"2",3]}
Example (Reformat)
package main

import (
	"fmt"

	"github.com/olvrng/ujson"
)

func main() {
	input := []byte(`{"order_id": 12345678901234, "number": 12, "item_id": 12345678905678, "counting": [1,"2",3]}`)

	b := make([]byte, 0, 256)
	err := ujson.Walk(input, func(level int, key, value []byte) bool {
		if len(b) != 0 && ujson.ShouldAddComma(value, b[len(b)-1]) {
			b = append(b, ',')
		}
		b = append(b, '\n')
		for i := 0; i < level; i++ {
			b = append(b, '\t')
		}
		if len(key) > 0 {
			b = append(b, key...)
			b = append(b, `: `...)
		}
		b = append(b, value...)
		return true
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", b)
}
Output:

{
	"order_id": 12345678901234,
	"number": 12,
	"item_id": 12345678905678,
	"counting": [
		1,
		"2",
		3
	]
}
Example (RemoveBlacklistFields)
package main

import (
	"bytes"
	"fmt"

	"github.com/olvrng/ujson"
)

func main() {
	input := []byte(`{
        "id": 12345,
        "name": "foo",
        "numbers": ["one", "two"],
        "tags": {"color": "red", "priority": "high"},
        "active": true
    }`)

	blacklistFields := [][]byte{
		[]byte(`"numbers"`), // note the quotes
		[]byte(`"active"`),
	}
	b := make([]byte, 0, 1024)
	err := ujson.Walk(input, func(_ int, key, value []byte) bool {
		if len(key) != 0 {
			for _, blacklist := range blacklistFields {
				if bytes.Equal(key, blacklist) {
					// remove the key and value from the output
					return false
				}
			}
		}
		// write to output
		if len(b) != 0 && ujson.ShouldAddComma(value, b[len(b)-1]) {
			b = append(b, ',')
		}
		if len(key) > 0 {
			b = append(b, key...)
			b = append(b, ':')
		}
		b = append(b, value...)
		return true
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", b)
}
Output:

{"id":12345,"name":"foo","tags":{"color":"red","priority":"high"}}
Example (RemoveBlacklistFields2)

This example was taken from StackOverflow: https://stackoverflow.com/questions/35441254/making-minimal-modification-to-json-data-without-a-structure-in-golang

package main

import (
	"bytes"
	"fmt"

	"github.com/olvrng/ujson"
)

func main() {
	input := []byte(`
{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "q": "solo",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 2,
    "start": 0,
    "docs": [
      { "name": "foo" },
      { "name": "bar" }
    ]
  }
}`)

	blacklistFields := [][]byte{
		[]byte(`"responseHeader"`), // note the quotes
	}
	b := make([]byte, 0, 1024)
	err := ujson.Walk(input, func(_ int, key, value []byte) bool {
		if len(key) != 0 {
			for _, blacklist := range blacklistFields {
				if bytes.Equal(key, blacklist) {
					// remove the key and value from the output
					return false
				}
			}
		}
		// write to output
		if len(b) != 0 && ujson.ShouldAddComma(value, b[len(b)-1]) {
			b = append(b, ',')
		}
		if len(key) > 0 {
			b = append(b, key...)
			b = append(b, ':')
		}
		b = append(b, value...)
		return true
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", b)
}
Output:

{"response":{"numFound":2,"start":0,"docs":[{"name":"foo"},{"name":"bar"}]}}
Example (WrapInt64InString)
package main

import (
	"bytes"
	"fmt"

	"github.com/olvrng/ujson"
)

func main() {
	input := []byte(`{"order_id": 12345678901234, "number": 12, "item_id": 12345678905678, "counting": [1,"2",3]}`)

	suffix := []byte(`_id`)
	b := make([]byte, 0, 256)
	err := ujson.Walk(input, func(_ int, key, value []byte) bool {
		// unquote key
		if len(key) != 0 {
			key = key[1 : len(key)-1]
		}

		// Test for field with suffix _id and value is an int64 number. For
		// valid json, value will never be empty, so we can safely test only the
		// first byte.
		wrap := bytes.HasSuffix(key, suffix) && value[0] > '0' && value[0] <= '9'

		// transform the input, wrap values in double quote
		if len(b) != 0 && ujson.ShouldAddComma(value, b[len(b)-1]) {
			b = append(b, ',')
		}
		if len(key) > 0 {
			b = append(b, '"')
			b = append(b, key...)
			b = append(b, '"')
			b = append(b, ':')
		}
		if wrap {
			b = append(b, '"')
		}
		b = append(b, value...)
		if wrap {
			b = append(b, '"')
		}
		return true
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s", b)
}
Output:

{"order_id":"12345678901234","number":12,"item_id":"12345678905678","counting":[1,"2",3]}
