aijson

package module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 18, 2026 License: Apache-2.0 Imports: 3 Imported by: 0

README

aijson

encoding/json for JSON produced by LLMs.

One line to switch.

-import "encoding/json"
+import "github.com/docker/aijson"

-err := json.Unmarshal(data, &args)
+err := aijson.Unmarshal(data, &args)

That's the entire change. Wherever you parse JSON from a language model — tool calls today, structured outputs, JSON-mode responses, streaming partials next — drop in aijson.Unmarshal and stop losing turns to shape-only mistakes.

Why

A strict json.Unmarshal rejects paths: "foo" when the schema says paths: []string. The model retries with the same mistake — the error message it sees is unreadable JSON noise. A turn burns, the user waits.

aijson.Unmarshal does the obvious thing first: strict json.Unmarshal into your typed value. The 95% case pays nothing. Only when that fails does it apply a narrow, named set of shape repairs and try again. Repairs that don't fit return the original strict-parse error — no made-up complaints, no silent corruption.

The four repairs

Kind Before After
unwrap_string_array "paths": "[\"a\",\"b\"]" "paths": ["a","b"]
wrap_in_array "paths": "foo.txt" "paths": ["foo.txt"]
wrap_object_in_array "paths": {"path": "foo.txt"} "paths": ["foo.txt"]
drop_null "n": null (custom UnmarshalJSON) field removed

Each repair fires only when the destination type expects a specific shape the input doesn't have. The schema is the prior.

Install

go get github.com/docker/aijson

Telemetry

If you want to know when repairs fired (per-model, per-tool dashboards), pass an OnRepair callback:

err := aijson.Unmarshal(data, &args, aijson.OnRepair(func(kinds []aijson.Kind) {
    slog.Info("aijson_repaired", "kinds", kinds)
}))

The callback only fires when repairs actually fired. Valid input pays nothing — neither in CPU nor in callback noise.

Design

Validate-then-repair, not preprocess-then-validate

The naive approach is to walk every input and "fix" things that look broken before parsing. That silently corrupts: imagine a write_file tool whose content is a string that happens to look like "[1,2,3]". A generic preprocessor would turn it into an array and write garbage to disk.

aijson avoids this because repairs only run at field paths where the strict parse already failed. If content is typed as string, the parse succeeds and we never touch it. If paths is typed as []string, the parse fails with a type mismatch at paths, and only there do we try unwrapping.

There is a dedicated test (TestUnmarshal_SchemaIsThePrior) pinning this property.

Ordering is load-bearing

unwrap_string_array must run before wrap_in_array, otherwise a literal stringified array like "[\"a\",\"b\"]" would be wrapped as a single- element array and we'd silently corrupt the input. There is a dedicated test (TestUnmarshal_OrderingPreventsDoubleWrap) pinning this invariant.

API

func Unmarshal(data []byte, v any, opts ...Option) error
func OnRepair(fn func([]Kind)) Option

type Kind string
const (
    KindDropNull          Kind = "drop_null"
    KindUnwrapStringArray Kind = "unwrap_string_array"
    KindWrapObjectInArray Kind = "wrap_object_in_array"
    KindWrapInArray       Kind = "wrap_in_array"
)

Unmarshal's signature is intentionally identical to encoding/json.Unmarshal, extended only by variadic options.

Origin

Extracted from docker-agent (PR docker/docker-agent#2635), which generalised an edit_file-specific repair layer to every tool with an array-typed field.

License

Apache-2.0

Documentation

Overview

Package aijson is a drop-in replacement for encoding/json that is tolerant of the small set of shape mistakes LLMs repeat when emitting JSON for tool calls.

The headline API is Unmarshal, which has the same signature as encoding/json.Unmarshal:

var args Args
if err := aijson.Unmarshal(data, &args); err != nil {
    return err
}

Unmarshal first tries a strict encoding/json.Unmarshal. On success it returns immediately — valid input pays nothing. On failure it applies a narrow, named set of shape repairs targeted at the four mistakes open-weights models repeat across every tool that has an array field, then re-tries the parse. If the repair does not produce a parseable payload, the original strict parse error is returned so callers see the schema's complaint rather than a synthesised one.

The four repairs are:

  1. Stringified array — `paths: "[\"a\",\"b\"]"` becomes `paths: ["a","b"]`.
  2. Bare scalar where an array is expected — `paths: "foo"` becomes `paths: ["foo"]`.
  3. Single-key object placeholder — `paths: {"path": "foo"}` becomes `paths: ["foo"]`.
  4. Null for primitive scalar — `n: null` is dropped so the zero value wins.

Ordering matters: the stringified-array unwrap (1) must run before the bare-string-wrap (2), otherwise a literal stringified array would be wrapped as a single element and the input would be silently corrupted.

Validate-then-repair, not preprocess-then-validate. The destination type is the prior; repairs only fire at field paths where the strict parse disagreed with the schema. A `write_file` tool whose `content` is typed as `string` will never have its content "repaired" into an array, even if the string happens to look like JSON.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Unmarshal

func Unmarshal(data []byte, v any, opts ...Option) error

Unmarshal parses the JSON-encoded data and stores the result in the value pointed to by v, exactly like encoding/json.Unmarshal — except that when the strict parse fails because of one of the four documented shape mistakes, the data is repaired and re-parsed transparently.

v must be a non-nil pointer (same constraint as encoding/json.Unmarshal). When repairs fire, the OnRepair callback (if supplied via OnRepair) is invoked with the list of [Kind]s applied, before v is populated.

Types

type Kind

type Kind string

Kind identifies which shape repair was applied to a single field. Repairs are kept narrow and named so per-(model, tool) telemetry can be aggregated and so unintended repairs are obvious in logs.

const (
	// KindDropNull removes a field whose value is JSON null when the
	// field type is a primitive scalar. In Go this is rarely needed
	// (json.Unmarshal accepts null for slices, pointers, maps, and
	// interfaces and treats it as a no-op for primitive scalars) but it
	// stays here primarily as a safety net for fields whose custom
	// UnmarshalJSON may otherwise reject null.
	KindDropNull Kind = "drop_null"

	// KindUnwrapStringArray turns a JSON-encoded array delivered as a
	// string into a real array. Models routinely send
	//   "paths": "[\"a\",\"b\"]"
	// instead of
	//   "paths": ["a","b"]
	// This must run BEFORE [KindWrapInArray], otherwise '["a","b"]'
	// (a literal stringified array) would be wrapped as ['["a","b"]'].
	KindUnwrapStringArray Kind = "unwrap_string_array"

	// KindWrapObjectInArray turns a single-key object placeholder into a
	// one-element array. Models sometimes emit
	//   "paths": {"path": "foo.txt"}
	// when the schema expects ["foo.txt"]. We only fire this when the
	// object has exactly one entry whose value matches the slice's
	// element kind, to keep the repair narrow.
	KindWrapObjectInArray Kind = "wrap_object_in_array"

	// KindWrapInArray wraps a bare scalar in a one-element array when
	// the schema expects an array of that scalar's kind. Catches the
	// common
	//   "paths": "foo.txt"  →  ["foo.txt"]
	// failure mode.
	KindWrapInArray Kind = "wrap_in_array"
)

type Option

type Option func(*config)

Option configures an Unmarshal call.

func OnRepair

func OnRepair(fn func([]Kind)) Option

OnRepair returns an Option that invokes fn with the list of repair kinds that fired during an Unmarshal call. The callback is only invoked when repairs actually fired (i.e. the strict parse failed and a repair produced a parseable payload). It is the integration point for telemetry — for example a slog event tagged with the tool name.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL