jsontokenizer

v0.3.1
Published: Mar 13, 2026 License: MIT Imports: 2 Imported by: 1

README

JSON Tokenizer

Zero-allocation JSON tokenizer.

Features

  • Fast. Roughly 13x faster than encoding/json.Decoder at the default buffer size, and up to about 17x with larger buffers. See benchmarks below.
  • Similar API to encoding/json.Decoder.
  • No reflection.
  • No allocations beyond a small buffer for reading.
  • Can be reused with a call to Reset.

Anti-Features

  • Does NOT parse JSON and does not verify that the input is well-formed. The input [} produces 2 tokens without errors.
  • Needs an io.Writer to write numbers and strings into. Depending on the use case, this can be os.Stdout, a bytes.Buffer, a ByteBuffer, etc.
  • Does not unescape strings. "he is 5'11\"." will be exactly that.
  • Does not parse numbers into floats/ints. Use strconv.Atoi or strconv.ParseFloat if needed.
  • Not safe for concurrent use. Guard it with a sync.Mutex or the like to prevent simultaneous calls.

Quick Start

import (
	"io"

	json "pitr.ca/jsontokenizer"
)

func example(in io.Reader) error {
	tk := json.New(in)

	for {
		tok, err := tk.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		switch tok {
		case json.TokNull:
			println("got null")
		case json.TokTrue, json.TokFalse:
			println("got bool")
		case json.TokArrayOpen, json.TokArrayClose, json.TokObjectOpen, json.TokObjectClose, json.TokObjectColon, json.TokComma:
			println("got delimiter")
		case json.TokNumber:
			println("got number")
			_, err := tk.ReadNumber(io.Discard)
			if err != nil {
				return err
			}
		case json.TokString:
			println("got string")
			_, err := tk.ReadString(io.Discard)
			if err != nil {
				return err
			}
		}
	}
}

Benchmarks

The size in each benchmark name is the read buffer size, which can be set with NewWithSize; the default is 64. The Tokenizer is reused between benchmark iterations, but this does not affect performance.

BenchmarkBuiltinDecoder is encoding/json.Decoder.

BenchmarkTokenizer/size=8-8         	    1419	    788208 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=16-8         	    1668	    688656 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=32-8         	    1792	    628601 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=64-8         	    2040	    571411 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=128-8        	    2228	    520646 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=256-8        	    2392	    482151 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=512-8        	    2516	    460283 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=1024-8       	    2553	    458148 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=2048-8       	    2618	    451937 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=4096-8       	    2499	    451601 ns/op	       0 B/op	       0 allocs/op
BenchmarkTokenizer/size=8192-8       	    2610	    443493 ns/op	       0 B/op	       0 allocs/op

BenchmarkBuiltinDecoder-8            	     157	   7607729 ns/op	 1755495 B/op	  107836 allocs/op

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type TokType

type TokType int

A TokType identifies the kind of a JSON token.

const (
	TokNull TokType = iota
	TokTrue
	TokFalse
	TokNumber
	TokString
	TokArrayOpen
	TokArrayClose
	TokObjectOpen
	TokObjectClose
	TokObjectColon
	TokComma
)

The following TokTypes are defined.
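
Since the tokenizer does not check structure, a caller that cares about inputs such as [} can layer its own check on top of the token stream. The sketch below is one possible approach, not part of the package: it verifies only that array and object delimiters nest correctly. The TokType declarations are mirrored locally so the example compiles on its own, and checkNesting and errUnbalanced are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// TokType and its constants mirror the package's declarations; real code
// would use the package's identifiers instead.
type TokType int

const (
	TokNull TokType = iota
	TokTrue
	TokFalse
	TokNumber
	TokString
	TokArrayOpen
	TokArrayClose
	TokObjectOpen
	TokObjectClose
	TokObjectColon
	TokComma
)

var errUnbalanced = errors.New("unbalanced brackets")

// checkNesting verifies that every closing delimiter matches the most recent
// unclosed opening delimiter. It ignores commas, colons and values, so it is
// far from a full JSON validator.
func checkNesting(toks []TokType) error {
	var stack []TokType
	for _, t := range toks {
		switch t {
		case TokArrayOpen, TokObjectOpen:
			stack = append(stack, t)
		case TokArrayClose, TokObjectClose:
			want := TokArrayOpen
			if t == TokObjectClose {
				want = TokObjectOpen
			}
			if len(stack) == 0 || stack[len(stack)-1] != want {
				return errUnbalanced
			}
			stack = stack[:len(stack)-1]
		}
	}
	if len(stack) != 0 {
		return errUnbalanced
	}
	return nil
}

func main() {
	// The "[}" case from the Anti-Features list.
	fmt.Println(checkNesting([]TokType{TokArrayOpen, TokObjectClose}))
	fmt.Println(checkNesting([]TokType{TokArrayOpen, TokNumber, TokArrayClose}))
}
```

In practice the same logic would run inside the Token loop rather than over a slice. The stack grows with nesting depth, so this check gives up the zero-allocation property unless the stack is preallocated and reused.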

type Tokenizer

type Tokenizer interface {
	// Token returns the next token. TokString and TokNumber tokens must be
	// consumed by ReadString and ReadNumber respectively.
	Token() (TokType, error)
	// ReadNumber consumes a number token by writing it into the provided io.Writer.
	ReadNumber(into io.Writer) (n int, err error)
	// ReadString consumes a string token by writing it into the provided io.Writer.
	ReadString(into io.Writer) (n int, err error)
	// Reset resets the state of the Tokenizer so it can be reused with another Reader.
	Reset(in io.Reader)
}

Tokenizer reads and tokenizes JSON from an input stream.

func New

func New(in io.Reader) Tokenizer

New returns a new Tokenizer with the default buffer size (64).

func NewWithSize

func NewWithSize(in io.Reader, size int) Tokenizer

NewWithSize returns a new Tokenizer with a custom buffer size.
