flexjson

package module
v0.0.0-...-c643c44 Latest Latest
Published: Feb 25, 2025 License: MIT Imports: 3 Imported by: 0

README

FlexJSON: A Robust Partial and Streaming JSON Parser for Go

FlexJSON parses incomplete or streaming JSON data, such as JSON emitted token by token by an LLM. Standard JSON parsers require complete, valid input; FlexJSON accepts partial fragments and character streams and extracts as much structured data as it can.

🌟 Features

  • Partial JSON Parsing: Extract data from incomplete JSON fragments

    • {"key": 123 → map[string]any{"key": 123}
    • {"key": 1234, "key2": → map[string]any{"key": 1234, "key2": nil}
  • Character-by-Character Streaming: Process JSON one character at a time

    • Ideal for network streams, telemetry data, or large files
    • Updates an output map in real-time as data arrives
  • Nested Structure Support: Handles complex nested objects and arrays

    • Properly tracks hierarchy in deeply nested structures
    • Maintains context across partial fragments
  • Resilient Parsing: Recovers gracefully from unexpected input

    • No panic on malformed input
    • Extracts maximum valid data even from corrupted JSON
  • LLM Integration: Perfect for processing streaming JSON responses from LLMs

    • Extracts structured data as tokens arrive
    • Enables real-time UI updates with partial LLM outputs
  • Zero Dependencies: Pure Go implementation with no external dependencies

📦 Installation

go get github.com/jpoz/flexjson

🚀 Quick Start

Parsing Partial JSON
package main

import (
    "fmt"
    "github.com/jpoz/flexjson"
)

func main() {
    // Parse incomplete JSON
    partialJSON := `{"name": "John", "age": 30, "city":`
    
    result, err := flexjson.Parse(partialJSON)
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }
    
    fmt.Printf("Parsed result: %v\n", result)
    // Output: Parsed result: map[age:30 city:<nil> name:John]
    // (fmt prints map keys in sorted order)
}
Streaming JSON Character-by-Character
package main

import (
	"fmt"

	"github.com/jpoz/flexjson"
)

func main() {
	// Example JSON document, split into chunks as it might arrive from a stream
	jsonStrs := []string{`{"name":"John Doe"`, `,"age":30,`, `"email":"johndoe@example.com"}`}

	// Create output map
	output := map[string]any{}

	// Create streaming parser
	sp := flexjson.NewStreamingParser(&output)

	// Process each string
	for _, str := range jsonStrs {
		fmt.Printf("Processing %s\n", str)
		err := sp.ProcessString(str)
		if err != nil {
			fmt.Printf("Error: %v\n", err)
			return
		}

		// The output map is updated after each string
		fmt.Printf("Current state: %v\n", output)
	}

	fmt.Printf("Final result: %v\n", output)
}

🤖 LLM Integration Benefits

FlexJSON is particularly well-suited for applications working with LLMs:

  • Real-time Processing: Parse JSON data as it streams from LLM APIs token by token
  • Immediate Feedback: Update UIs with structured data before the LLM completes its response
  • Resilient Handling: Continue processing even if the LLM produces malformed JSON
  • Progressive Rendering: Display complex nested structures as they become available
  • Efficient Resource Usage: Begin processing data immediately rather than waiting for complete responses
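
As a rough illustration of the real-time processing described above, the sketch below forwards chunks from a channel to a parser and fires a callback after each one. The `chunkSink` interface, the `feed` helper, and the `recordingSink` stand-in are hypothetical names introduced only for this example; going by the documented signature, `*flexjson.StreamingParser` has a matching `ProcessString` method, but the stand-in is used here so the sketch runs without the library.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkSink is a hypothetical interface for this sketch. Any type with a
// ProcessString(chunk string) error method satisfies it.
type chunkSink interface {
	ProcessString(chunk string) error
}

// feed forwards each chunk to the sink and calls onUpdate after every chunk,
// which is where a UI refresh could go. It stops at the first error.
func feed(chunks <-chan string, sink chunkSink, onUpdate func()) error {
	for chunk := range chunks {
		if err := sink.ProcessString(chunk); err != nil {
			return fmt.Errorf("processing chunk %q: %w", chunk, err)
		}
		onUpdate()
	}
	return nil
}

// recordingSink is a stand-in that only concatenates what it receives.
type recordingSink struct{ buf strings.Builder }

func (r *recordingSink) ProcessString(chunk string) error {
	r.buf.WriteString(chunk)
	return nil
}

func main() {
	chunks := make(chan string)
	go func() {
		defer close(chunks)
		for _, c := range []string{`{"name":"John Doe"`, `,"age":30,`, `"email":"johndoe@example.com"}`} {
			chunks <- c
		}
	}()

	sink := &recordingSink{}
	updates := 0
	if err := feed(chunks, sink, func() { updates++ }); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d updates, received %s\n", updates, sink.buf.String())
}
```

In a real application the callback would read the output map that was passed to the parser and re-render from it.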

⚙️ How It Works

FlexJSON uses a custom lexer and parser for partial JSON parsing, and a state machine for streaming parsing:

  1. Lexer: Tokenizes the input string into JSON tokens (strings, numbers, booleans, etc.)
  2. Parser: Converts tokens into a structured map representation
  3. StreamingParser: Maintains stacks of containers and keys to track position in the JSON hierarchy
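
To make the idea of a container stack concrete, here is a minimal sketch that tracks nesting one byte at a time and keeps its state across chunks. It is not FlexJSON's implementation: the `tracker` type and its methods are hypothetical, and it only records open containers and string state, not keys or values.

```go
package main

import "fmt"

// tracker follows the nesting of a JSON stream one byte at a time.
type tracker struct {
	stack    []byte // open containers, each '{' or '['
	inString bool   // true while inside a string literal
	escaped  bool   // true right after a backslash inside a string
}

// feed consumes one byte and updates the stack and string state.
func (t *tracker) feed(c byte) error {
	if t.inString {
		switch {
		case t.escaped:
			t.escaped = false
		case c == '\\':
			t.escaped = true
		case c == '"':
			t.inString = false
		}
		return nil
	}
	switch c {
	case '"':
		t.inString = true
	case '{', '[':
		t.stack = append(t.stack, c)
	case '}', ']':
		want := byte('{')
		if c == ']' {
			want = '['
		}
		if len(t.stack) == 0 || t.stack[len(t.stack)-1] != want {
			return fmt.Errorf("unexpected %q", c)
		}
		t.stack = t.stack[:len(t.stack)-1]
	}
	return nil
}

// feedString consumes a chunk byte by byte.
func (t *tracker) feedString(s string) error {
	for i := 0; i < len(s); i++ {
		if err := t.feed(s[i]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	tr := &tracker{}
	// The first chunk ends inside a string; the bracket that opens the
	// second chunk is therefore string content, not a closing delimiter.
	for _, chunk := range []string{`{"user":{"tags":["a","b`, `]"],"id":1`, `}}`} {
		if err := tr.feedString(chunk); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("depth=%d inString=%v\n", len(tr.stack), tr.inString)
	}
}
```

Because the state lives in the struct rather than in the call stack, a chunk boundary can fall anywhere, including in the middle of a string.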

The library handles incomplete input by:

  • Treating unexpected EOF as valid termination
  • Providing default values (nil) for incomplete key-value pairs
  • Maintaining context across nested structures
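
The first two rules above can be sketched for the simplest case, a flat object with no nesting. This is an independent illustration rather than FlexJSON's code: `parseFlat` and its helpers are hypothetical, escape sequences and nested containers are out of scope, and numbers are returned as `float64` purely as a choice for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const space = " \t\r\n"

// skip advances i past any bytes of s that appear in chars.
func skip(s string, i int, chars string) int {
	for i < len(s) && strings.IndexByte(chars, s[i]) >= 0 {
		i++
	}
	return i
}

// readString reads the string whose opening quote is at s[i]. It returns the
// content, the index after it, and whether a closing quote was found.
func readString(s string, i int) (string, int, bool) {
	end := strings.IndexByte(s[i+1:], '"')
	if end < 0 {
		return s[i+1:], len(s), false
	}
	return s[i+1 : i+1+end], i + 2 + end, true
}

// readValue reads a string, number, true, false, or null starting at s[i].
func readValue(s string, i int) (any, int) {
	if s[i] == '"' {
		str, next, _ := readString(s, i) // a truncated string keeps what arrived
		return str, next
	}
	j := i
	for j < len(s) && strings.IndexByte(",}"+space, s[j]) < 0 {
		j++
	}
	switch tok := s[i:j]; tok {
	case "true":
		return true, j
	case "false":
		return false, j
	case "null":
		return nil, j
	default:
		if n, err := strconv.ParseFloat(tok, 64); err == nil {
			return n, j
		}
		return nil, j // incomplete literal such as `tru`
	}
}

// parseFlat parses a possibly truncated flat JSON object.
func parseFlat(input string) (map[string]any, error) {
	s := strings.TrimSpace(input)
	out := map[string]any{}
	if s == "" {
		return out, nil
	}
	if s[0] != '{' {
		return nil, fmt.Errorf("expected '{', got %q", s[0])
	}
	i := 1
	for {
		i = skip(s, i, space+",")
		if i >= len(s) || s[i] == '}' {
			return out, nil // end of input counts as a valid end of the object
		}
		if s[i] != '"' {
			return nil, fmt.Errorf("expected key at offset %d", i)
		}
		key, next, closed := readString(s, i)
		if !closed {
			return out, nil // key cut off mid-string: drop it
		}
		i = skip(s, next, space)
		if i < len(s) && s[i] != ':' {
			return nil, fmt.Errorf("expected ':' at offset %d", i)
		}
		i = skip(s, i+1, space)
		if i >= len(s) {
			out[key] = nil // key arrived but its value did not
			return out, nil
		}
		out[key], i = readValue(s, i)
	}
}

func main() {
	result, err := parseFlat(`{"name": "John", "age": 30, "city":`)
	fmt.Println(result, err) // map[age:30 city:<nil> name:John] <nil>
}
```

The key point is that running out of input is a normal return path rather than an error, and a key whose value has not arrived yet is recorded with `nil`.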

🧪 Testing

The library includes tests for both partial and streaming parsing. Run them with:

go test -v github.com/jpoz/flexjson

Documentation

Functions

func Parse

func Parse(input string) (map[string]any, error)

Parse parses a partial JSON string into a map[string]any

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer tokenizes JSON input

func NewLexer

func NewLexer(input string) *Lexer

NewLexer creates a new JSON lexer

func (*Lexer) Tokenize

func (l *Lexer) Tokenize() []Token

Tokenize converts the input string into tokens

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser parses tokens into a JSON value

func NewParser

func NewParser(tokens []Token) *Parser

NewParser creates a new JSON parser

func (*Parser) Parse

func (p *Parser) Parse() (interface{}, error)

Parse parses tokens into a JSON value

type StreamingParser

type StreamingParser struct {
	// contains filtered or unexported fields
}

StreamingParser is a simplified JSON parser that processes JSON character by character and updates an output map as it goes along.

func NewStreamingParser

func NewStreamingParser(output *map[string]any) *StreamingParser

NewStreamingParser creates a new StreamingParser that will update the provided map

func (*StreamingParser) GetCurrentOutput

func (sp *StreamingParser) GetCurrentOutput() map[string]any

GetCurrentOutput returns the current output map

func (*StreamingParser) ProcessChar

func (sp *StreamingParser) ProcessChar(c string) error

ProcessChar processes a single character in the JSON stream

func (*StreamingParser) ProcessString

func (sp *StreamingParser) ProcessString(chunk string) error

ProcessString processes a chunk of JSON data character by character

func (*StreamingParser) Reset

func (sp *StreamingParser) Reset()

Reset resets the parser state

func (*StreamingParser) SetDebug

func (sp *StreamingParser) SetDebug(value bool)

type Token

type Token struct {
	Type  TokenType
	Value string
}

Token represents a JSON token

type TokenType

type TokenType int

Token types used by the lexer

const (
	TokenError TokenType = iota
	TokenEOF
	TokenLeftBrace
	TokenRightBrace
	TokenLeftBracket
	TokenRightBracket
	TokenColon
	TokenComma
	TokenString
	TokenNumber
	TokenTrue
	TokenFalse
	TokenNull
)
