gcf

package module
v0.1.2
Published: Jun 5, 2026 License: MIT Imports: 6 Imported by: 0

README

gcf-go

Go implementation of GCF (Graph Compact Format) — the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.

79% fewer input tokens than JSON. 75% fewer output tokens. 52% smaller than TOON. 100% LLM comprehension at 500 symbols, where JSON fails at 66.7%.

Docs: gcformat.com · Playground · GCF vs TOON

Install

go get github.com/blackwell-systems/gcf-go

Zero dependencies. Single package. Don't want to change code? Use the MCP proxy for zero-code adoption.

Quick Start

import gcf "github.com/blackwell-systems/gcf-go"

p := &gcf.Payload{
    Tool:        "context_for_task",
    TokenBudget: 5000,
    TokensUsed:  1847,
    Symbols: []gcf.Symbol{
        {QualifiedName: "pkg.AuthMiddleware", Kind: "function", Score: 0.78, Provenance: "lsp_resolved", Distance: 0},
        {QualifiedName: "pkg.NewServer", Kind: "function", Score: 0.54, Provenance: "lsp_resolved", Distance: 1},
    },
    Edges: []gcf.Edge{
        {Source: "pkg.NewServer", Target: "pkg.AuthMiddleware", EdgeType: "calls"},
    },
}

output := gcf.Encode(p)

Output:

GCF tool=context_for_task budget=5000 tokens=1847 symbols=2
## targets
@0 fn pkg.AuthMiddleware 0.78 lsp_resolved
## related
@1 fn pkg.NewServer 0.54 lsp_resolved
## edges
@0<@1 calls
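
The output above has three parts: a header line, symbols grouped into sections by distance and given local IDs (`@0`, `@1`), and edges that refer to those IDs. The sketch below reproduces that layout for the two-symbol example. It is inferred only from the sample output, not from the library source: the types `sym` and `edge`, the function `encodeSketch`, and the reading of the edge line as `@target<@source` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// sym and edge are hypothetical stand-ins for gcf.Symbol and gcf.Edge.
type sym struct {
	Name, Kind, Provenance string
	Score                  float64
	Distance               int
}

type edge struct{ Source, Target, Type string }

// encodeSketch mimics the sample output: header, distance sections, edges.
// Symbols are assumed to arrive already grouped by distance.
func encodeSketch(tool string, budget, used int, syms []sym, edges []edge) string {
	abbrev := map[string]string{"function": "fn"} // subset of KindAbbrev
	sections := []string{"targets", "related"}   // distance 0 and 1 only

	var b strings.Builder
	fmt.Fprintf(&b, "GCF tool=%s budget=%d tokens=%d symbols=%d\n", tool, budget, used, len(syms))

	ids := make(map[string]int, len(syms)) // qualified name -> local ID
	last := -1
	for i, s := range syms {
		ids[s.Name] = i
		// Start a new section whenever the distance changes.
		if s.Distance != last && s.Distance >= 0 && s.Distance < len(sections) {
			fmt.Fprintf(&b, "## %s\n", sections[s.Distance])
			last = s.Distance
		}
		fmt.Fprintf(&b, "@%d %s %s %.2f %s\n", i, abbrev[s.Kind], s.Name, s.Score, s.Provenance)
	}
	if len(edges) > 0 {
		b.WriteString("## edges\n")
	}
	for _, e := range edges {
		// Assumed notation: target on the left, source on the right.
		fmt.Fprintf(&b, "@%d<@%d %s\n", ids[e.Target], ids[e.Source], e.Type)
	}
	return b.String()
}

func main() {
	out := encodeSketch("context_for_task", 5000, 1847,
		[]sym{
			{Name: "pkg.AuthMiddleware", Kind: "function", Provenance: "lsp_resolved", Score: 0.78, Distance: 0},
			{Name: "pkg.NewServer", Kind: "function", Provenance: "lsp_resolved", Score: 0.54, Distance: 1},
		},
		[]edge{{Source: "pkg.NewServer", Target: "pkg.AuthMiddleware", Type: "calls"}},
	)
	fmt.Print(out)
}
```

The sketch only shows why the format is compact: each qualified name appears once, and every edge costs a few characters. It omits signatures, pack roots, and sections for distance 2 and beyond.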

Decode

p, err := gcf.Decode(input)
if err != nil {
    log.Fatal(err)
}
fmt.Println(p.Tool, len(p.Symbols), "symbols", len(p.Edges), "edges")
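
How Decode parses its input is not documented here. As a rough illustration of a first parsing step, the sketch below splits a header line into key/value pairs, assuming the header is a `GCF` marker followed by space-separated `key=value` fields as in the Quick Start output. `parseHeader` is a hypothetical helper, not part of gcf-go.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeader splits a header line such as
// "GCF tool=context_for_task budget=5000 tokens=1847 symbols=2"
// into key/value pairs. Hypothetical helper, not the library's parser.
func parseHeader(line string) (map[string]string, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 || fields[0] != "GCF" {
		return nil, fmt.Errorf("not a GCF header: %q", line)
	}
	kv := make(map[string]string, len(fields)-1)
	for _, f := range fields[1:] {
		key, value, ok := strings.Cut(f, "=")
		if !ok {
			return nil, fmt.Errorf("malformed header field %q", f)
		}
		kv[key] = value
	}
	return kv, nil
}

func main() {
	kv, err := parseHeader("GCF tool=context_for_task budget=5000 tokens=1847 symbols=2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(kv["tool"], kv["budget"], kv["tokens"], kv["symbols"])
}
```

Returning an error rather than a partial map mirrors the `(*Payload, error)` shape of Decode: callers check the error once before using the result.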

Session Deduplication

Track transmitted symbols across multiple tool responses. Previously-sent symbols become bare references instead of full declarations:

sess := gcf.NewSession()

out1 := gcf.EncodeWithSession(payload1, sess) // full declarations
out2 := gcf.EncodeWithSession(payload2, sess) // reused symbols as "@N  # previously transmitted"

By the 5th call in a session: 92.7% token savings vs JSON.
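
The idea behind session deduplication can be sketched with a mutex-guarded map from qualified name to ID. This is an assumption about the general technique, not the library's implementation: `sessionSketch` and `emit` are hypothetical names, and the real `Session` keeps its fields unexported.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionSketch is a hypothetical tracker illustrating the idea behind
// gcf.Session; the real type may work differently.
type sessionSketch struct {
	mu  sync.Mutex
	ids map[string]int // qualified name -> session-wide ID
}

func newSessionSketch() *sessionSketch {
	return &sessionSketch{ids: make(map[string]int)}
}

// emit returns a full declaration the first time a name is seen and a bare
// reference on every later call.
func (s *sessionSketch) emit(qname, declaration string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if id, ok := s.ids[qname]; ok {
		return fmt.Sprintf("@%d # previously transmitted", id)
	}
	id := len(s.ids) // next unused ID
	s.ids[qname] = id
	return fmt.Sprintf("@%d %s", id, declaration)
}

func main() {
	s := newSessionSketch()
	decl := "fn pkg.AuthMiddleware 0.78 lsp_resolved"
	fmt.Println(s.emit("pkg.AuthMiddleware", decl)) // first call: full declaration
	fmt.Println(s.emit("pkg.AuthMiddleware", decl)) // second call: bare reference
}
```

The savings grow with the number of calls because a symbol's name, kind, score, and provenance are paid for once, and each later mention costs only the short reference.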

Delta Encoding

When the consumer already has a prior context pack, send only what changed:

delta := &gcf.DeltaPayload{
    Tool:     "context_for_task",
    BaseRoot: "aaa111",
    NewRoot:  "bbb222",
    Removed:  []gcf.Symbol{{QualifiedName: "pkg.OldFunc", Kind: "function"}},
    Added:    []gcf.Symbol{{QualifiedName: "pkg.NewFunc", Kind: "function", Score: 0.85, Provenance: "rwr"}},
    DeltaTokens: 30,
    FullTokens:  200,
}

output := gcf.EncodeDelta(delta)

81.2% savings on re-queries where the pack changed slightly.
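
The `Added` and `Removed` lists have to come from somewhere. One way to derive them, sketched under the assumption that two packs can be compared by qualified name, is a pair of set differences. `diffNames` and `savingsPercent` are hypothetical helpers; the library may compute deltas differently.

```go
package main

import (
	"fmt"
	"sort"
)

// diffNames returns the names present only in next (added) and only in
// prev (removed), each sorted for stable output. Hypothetical helper.
func diffNames(prev, next []string) (added, removed []string) {
	inPrev := make(map[string]bool, len(prev))
	for _, n := range prev {
		inPrev[n] = true
	}
	inNext := make(map[string]bool, len(next))
	for _, n := range next {
		inNext[n] = true
		if !inPrev[n] {
			added = append(added, n)
		}
	}
	for _, n := range prev {
		if !inNext[n] {
			removed = append(removed, n)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

// savingsPercent reports how much smaller the delta is than a full resend.
func savingsPercent(deltaTokens, fullTokens int) float64 {
	if fullTokens == 0 {
		return 0
	}
	return 100 * (1 - float64(deltaTokens)/float64(fullTokens))
}

func main() {
	prev := []string{"pkg.AuthMiddleware", "pkg.OldFunc"}
	next := []string{"pkg.AuthMiddleware", "pkg.NewFunc"}
	added, removed := diffNames(prev, next)
	fmt.Println("added:", added, "removed:", removed)
	fmt.Printf("%.0f%% smaller than a full resend\n", savingsPercent(30, 200))
}
```

With the example's `DeltaTokens: 30` and `FullTokens: 200`, the delta is 85% smaller than the full pack. That figure describes this one example only and is separate from the benchmark number quoted above.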

Generic Encoding

Encode any Go value (not just graph payloads) into GCF tabular format:

data := map[string]any{
    "employees": []map[string]any{
        {"id": 1, "name": "Alice", "department": "Engineering", "salary": 95000},
        {"id": 2, "name": "Bob", "department": "Sales", "salary": 72000},
    },
}
output := gcf.EncodeGeneric(data)

Output:

## employees [2]{id,name,department,salary}
1|Alice|Engineering|95000
2|Bob|Sales|72000

Works on maps, slices, structs, and primitives. Arrays of uniform objects get tabular rows. Nested objects use ## key section headers.
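
The tabular layout states the field names once in the header and then emits one pipe-separated line per row. A minimal sketch of that layout follows. `encodeTable` is a hypothetical function, and the column order is passed in explicitly because Go map iteration order is random; how gcf-go chooses the order, and how it escapes a `|` inside a value, is not covered here.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeTable renders uniform rows in the tabular layout shown above.
// Hypothetical sketch: no escaping, and missing keys print as "<nil>".
func encodeTable(name string, cols []string, rows []map[string]any) string {
	var b strings.Builder
	// Header: section name, row count, and the shared column list.
	fmt.Fprintf(&b, "## %s [%d]{%s}\n", name, len(rows), strings.Join(cols, ","))
	for _, row := range rows {
		cells := make([]string, len(cols))
		for i, c := range cols {
			cells[i] = fmt.Sprint(row[c])
		}
		b.WriteString(strings.Join(cells, "|"))
		b.WriteByte('\n')
	}
	return b.String()
}

func main() {
	rows := []map[string]any{
		{"id": 1, "name": "Alice", "department": "Engineering", "salary": 95000},
		{"id": 2, "name": "Bob", "department": "Sales", "salary": 72000},
	}
	fmt.Print(encodeTable("employees", []string{"id", "name", "department", "salary"}, rows))
}
```

Compared with JSON, the keys `id`, `name`, `department`, and `salary` are written once instead of once per row, which is where most of the savings on uniform data come from.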

API

Function                                          Description
Encode(p *Payload) string                         Encode a graph payload to GCF text
EncodeGeneric(data any) string                    Encode any value to GCF tabular format
Decode(input string) (*Payload, error)            Parse GCF text back to a Payload
EncodeWithSession(p *Payload, s *Session) string  Encode with session deduplication
EncodeDelta(d *DeltaPayload) string               Encode a delta (added/removed only)
NewSession() *Session                             Create a new session tracker (thread-safe)

Types

Type                     Purpose
Payload                  Full GCF payload: tool, budget, symbols, edges, pack root
Symbol                   Graph node: qualified name, kind, score, provenance, distance
Edge                     Directed relationship: source, target, edge type
DeltaPayload             Diff between two packs: added/removed symbols and edges
Session                  Thread-safe tracker for multi-call deduplication
KindAbbrev / KindExpand  Bidirectional kind abbreviation maps

Comprehension Eval

The eval/ submodule contains a rigorous 3-way benchmark (GCF vs TOON vs JSON) at 500 symbols, 200 edges. Six structured extraction questions sent to an LLM:

Format  Accuracy     Tokens  vs JSON
GCF     100% (6/6)   11,090  79% fewer
TOON    100% (6/6)   16,378  69% fewer
JSON    66.7% (4/6)  53,341  baseline

JSON failed on counting tasks. GCF and TOON both achieved perfect accuracy, and GCF did so with 32% fewer tokens than TOON.

cd eval && GOWORK=off go test -run TestComprehension -v -timeout 15m
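
The percentages in the comprehension table follow directly from the token counts. The short check below recomputes them; `fewerPercent` is a helper written for this illustration and is not part of the eval harness.

```go
package main

import "fmt"

// fewerPercent returns how many percent fewer tokens a uses than b.
func fewerPercent(a, b int) float64 {
	return 100 * float64(b-a) / float64(b)
}

func main() {
	// Token counts copied from the comprehension table.
	const gcfTokens, toonTokens, jsonTokens = 11090, 16378, 53341

	fmt.Printf("GCF vs JSON:  %.1f%% fewer\n", fewerPercent(gcfTokens, jsonTokens))
	fmt.Printf("TOON vs JSON: %.1f%% fewer\n", fewerPercent(toonTokens, jsonTokens))
	fmt.Printf("GCF vs TOON:  %.1f%% fewer\n", fewerPercent(gcfTokens, toonTokens))
}
```

Rounded to whole numbers, the three results are 79%, 69%, and 32%, matching the figures quoted for this benchmark.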

Token Efficiency (TOON's Own Benchmark)

Running TOON's benchmark harness with GCF inserted (their datasets, their tokenizer):

Track                                   GCF      TOON     Result
Mixed-structure (nested, semi-uniform)  169,554  227,896  GCF 26% smaller
Flat-only (tabular)                     66,026   67,837   GCF 3% smaller
Semi-uniform event logs                 107,269  154,032  GCF 30% smaller

GCF is smaller on every dataset except deeply nested config, where it trails by 75 tokens on a 618-token payload. On semi-uniform data, GCF uses 30% fewer tokens than TOON (equivalently, TOON uses 44% more tokens than GCF).

Reproducible: blackwell-systems/toon@gcf-comparison

License

MIT

Documentation

Overview

Package gcf implements the GCF (Graph Compact Format) encoder and decoder.

GCF is a compact, text-only, graph-native wire format designed for MCP tool responses. It exploits referential identity (local IDs), graph topology (edges as references), and hierarchical grouping (distance-based sections) to achieve 84% token savings over JSON while remaining human-readable.

Specification: https://github.com/blackwell-systems/gcf

Encode a payload:

out := gcf.Encode(&gcf.Payload{
    Tool: "context_for_task",
    Symbols: []gcf.Symbol{{QualifiedName: "pkg.Func", Kind: "function", Score: 0.9, Provenance: "lsp_resolved"}},
})

Decode a payload:

p, err := gcf.Decode(input)

Session deduplication (previously-transmitted symbols as bare references):

sess := gcf.NewSession()
out1 := gcf.EncodeWithSession(&payload1, sess) // full declarations
out2 := gcf.EncodeWithSession(&payload2, sess) // reused symbols as @N refs

Delta encoding (only added/removed symbols):

out := gcf.EncodeDelta(&gcf.DeltaPayload{...})

Variables

var KindAbbrev = map[string]string{
	"function":      "fn",
	"type":          "type",
	"method":        "method",
	"interface":     "iface",
	"var":           "var",
	"const":         "const",
	"resource":      "resource",
	"table":         "table",
	"class":         "class",
	"selector":      "selector",
	"field":         "field",
	"route_handler": "route",
	"external":      "ext",
	"file":          "file",
	"package":       "pkg",
	"service":       "svc",
}

KindAbbrev maps full kind names to short GCF abbreviations.

var KindExpand = map[string]string{
	"fn":       "function",
	"type":     "type",
	"method":   "method",
	"iface":    "interface",
	"var":      "var",
	"const":    "const",
	"resource": "resource",
	"table":    "table",
	"class":    "class",
	"selector": "selector",
	"field":    "field",
	"route":    "route_handler",
	"ext":      "external",
	"file":     "file",
	"pkg":      "package",
	"svc":      "service",
}

KindExpand is the reverse of KindAbbrev.
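
Because the two maps must stay in sync, one can be derived from the other and checked for collisions. The sketch below does this for a subset of the entries listed above. `invert` is a hypothetical helper, and nothing here implies the package builds `KindExpand` this way.

```go
package main

import "fmt"

// invert builds the reverse of a string map and reports an error if two
// keys share a value, since the reverse mapping would then be ambiguous.
func invert(m map[string]string) (map[string]string, error) {
	out := make(map[string]string, len(m))
	for k, v := range m {
		if prev, dup := out[v]; dup {
			return nil, fmt.Errorf("value %q used by both %q and %q", v, prev, k)
		}
		out[v] = k
	}
	return out, nil
}

func main() {
	// Subset of KindAbbrev, copied from the listing above.
	abbrev := map[string]string{
		"function":      "fn",
		"interface":     "iface",
		"route_handler": "route",
		"type":          "type",
	}
	expand, err := invert(abbrev)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(expand["fn"], expand["route"], expand["type"])
}
```

A check like this fits naturally in a unit test: it fails as soon as someone adds a kind whose abbreviation collides with an existing one.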

Functions

func Encode

func Encode(p *Payload) string

Encode serializes a Payload into GCF text format.

func EncodeDelta

func EncodeDelta(d *DeltaPayload) string

EncodeDelta serializes a DeltaPayload into GCF delta format.

func EncodeGeneric added in v0.1.1

func EncodeGeneric(data any) string

EncodeGeneric encodes any value into GCF tabular format. Unlike Encode (which handles the graph Payload type), EncodeGeneric works on arbitrary maps, slices, and primitives using GCF's tabular encoding grammar.

func EncodeWithSession

func EncodeWithSession(p *Payload, sess *Session) string

EncodeWithSession encodes a payload using GCF with session deduplication. Symbols that were already transmitted in prior responses are emitted as bare references (`@N # previously transmitted`) instead of full declarations. After encoding, newly-sent symbols are recorded in the session.

Types

type Components

type Components struct {
	BlastRadius float64 // number of callers (normalized)
	Confidence  float64 // edge provenance confidence
	Recency     float64 // git recency signal
	Distance    float64 // graph distance penalty
}

Components holds the score breakdown for a symbol.

type DeltaPayload

type DeltaPayload struct {
	Tool         string
	BaseRoot     string // pack_root the consumer has
	NewRoot      string // pack_root of the current result
	Removed      []Symbol
	Added        []Symbol
	RemovedEdges []Edge
	AddedEdges   []Edge
	DeltaTokens  int
	FullTokens   int
}

DeltaPayload represents the diff between a prior context pack and the current result. Used for incremental context delivery.

type Edge

type Edge struct {
	Source   string // qualified name of source symbol
	Target   string // qualified name of target symbol
	EdgeType string
	Status   string // optional: "added", "removed", "unchanged" (for diff responses)
}

Edge represents a directed relationship in a GCF payload.

type Payload

type Payload struct {
	Tool        string   // producing tool name (e.g., "context_for_task")
	TokensUsed  int      // actual tokens consumed by this payload
	TokenBudget int      // token budget requested by the consumer
	PackRoot    string   // content-addressed identity (hex SHA-256), enables delta encoding
	Symbols     []Symbol // ordered by score descending within each distance group
	Edges       []Edge   // directed relationships between symbols
}

Payload is the input/output structure for GCF encoding/decoding.
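
The `Symbols` comment describes an ordering: grouped by distance, and by descending score within each group. Whether Encode sorts its input or expects it pre-sorted is not stated here, so the sketch below simply shows one way to produce that order before building a payload. The `symbol` type and `orderSymbols` function are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sort"
)

// symbol is a hypothetical stand-in carrying only the fields that
// matter for ordering.
type symbol struct {
	QualifiedName string
	Score         float64
	Distance      int
}

// orderSymbols sorts in place: distance ascending, then score descending.
// SliceStable keeps the input order for symbols that tie on both.
func orderSymbols(syms []symbol) {
	sort.SliceStable(syms, func(i, j int) bool {
		if syms[i].Distance != syms[j].Distance {
			return syms[i].Distance < syms[j].Distance
		}
		return syms[i].Score > syms[j].Score
	})
}

func main() {
	syms := []symbol{
		{QualifiedName: "pkg.C", Score: 0.40, Distance: 1},
		{QualifiedName: "pkg.A", Score: 0.54, Distance: 1},
		{QualifiedName: "pkg.B", Score: 0.78, Distance: 0},
		{QualifiedName: "pkg.D", Score: 0.90, Distance: 2},
	}
	orderSymbols(syms)
	for _, s := range syms {
		fmt.Println(s.Distance, s.Score, s.QualifiedName)
	}
}
```

Note that `pkg.D` sorts last despite having the highest score: distance takes priority, and score only breaks ties within a distance group.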

func Decode

func Decode(input string) (*Payload, error)

Decode parses GCF text back into a Payload.

type Session

type Session struct {
	// contains filtered or unexported fields
}

Session tracks symbols that have been transmitted to a client, enabling subsequent responses to reference them by ID without full retransmission. This makes multi-call workflows progressively cheaper.

Thread-safe: multiple tool handlers may encode concurrently within a session.

func NewSession

func NewSession() *Session

NewSession creates a new empty session.

func (*Session) GetID

func (s *Session) GetID(qname string) int

GetID returns the session-global ID for a previously transmitted symbol. Returns -1 if not found.

func (*Session) Record

func (s *Session) Record(symbols []Symbol)

Record marks symbols as transmitted and assigns session-global IDs. Call this after a successful encode to register newly-sent symbols.

func (*Session) Reset

func (s *Session) Reset()

Reset clears the session state.

func (*Session) Size

func (s *Session) Size() int

Size returns the number of symbols tracked in this session.

func (*Session) Transmitted

func (s *Session) Transmitted(qname string) bool

Transmitted returns true if the symbol has been sent in a previous response.

type Symbol

type Symbol struct {
	QualifiedName string     // fully qualified identifier (e.g., "pkg/auth.Middleware")
	Kind          string     // node type: "function", "type", "method", etc.
	Score         float64    // relevance score (0.0 to 1.0)
	Provenance    string     // discovery method: "lsp_resolved", "ast_inferred", etc.
	Distance      int        // hops from query center (0=target, 1=related, 2+=extended)
	Signature     string     // optional: function/method signature
	Components    Components // optional: score breakdown
}

Symbol represents a node in a GCF payload.

Directories

Path               Synopsis
cmd/gcf (command)  gcf is a command-line tool for encoding and decoding GCF (Graph Compact Format).
