pathway

package module
v0.0.0-...-b5646d9 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 24, 2026 License: BSD-2-Clause Imports: 11 Imported by: 0

README

Pathway

Pathway is an experimental Go library for an embedded, persistent graph database based on the Pebble key-value database. It provides a fluid, Gremlin-like query interface for natural graph traversals.

NOTE: This is an experimental library that was not validated for production use.

Vibe coded using Google Antigravity.

Installation

go get github.com/npclaudiu/pathway

Quickstart

Initialize the database, perform transactions, and run queries.

package main

import (
 "context"
 "log"

 "github.com/google/uuid"
 "github.com/npclaudiu/pathway"
)

func main() {
 // Open an in‑memory database for the example.
 db, err := pathway.Open(":memory:")
 if err != nil {
  log.Fatalf("failed to open db: %v", err)
 }
 defer db.Close()

 ctx := context.Background()

 // Create a node.
 nodeID := uuid.New()
 if err := db.Update(ctx, func(tx *pathway.Tx) error {
  if err := tx.PutNode(nodeID, "User"); err != nil {
   return err
  }
  // Set a property.
  return tx.SetProperties(nodeID, map[string]interface{}{"name": "alice"})
 }); err != nil {
  log.Fatalf("failed to create node: %v", err)
 }

 // Query the node back.
 if err := db.View(ctx, func(tx *pathway.Tx) error {
  label, exists, err := tx.GetNode(nodeID)
  if err != nil {
   return err
  }
  if exists {
   log.Printf("node %s has label %s", nodeID, label)
  }
  return nil
 }); err != nil {
  log.Fatalf("read transaction failed: %v", err)
 }
}

Architecture & Performance

Data Model
  • Nodes: Atomic entities identified by 16-byte UUIDs.
  • Edges: Directed connections with a Label and properties.
  • Properties: Key-Value pairs attached to nodes/edges.
  • Constraints:
    • Labels: Recommended max 255 bytes.
    • IDs: UUIDs only.
    • Properties: Supports standard JSON types. Encoded with type prefix for type safety.

Performance

The following benchmarks measure the performance of key graph operations using the standard Go testing framework. Each scenario is executed in two storage modes:

  1. In-Memory Storage (:memory:): Measures the overhead of the Pathway library logic, graph traversal engine, and encoding layer without disk I/O.
  2. Disk Storage: Measures real-world performance using a temporary directory on the local filesystem. This includes the cost of ACID transactions and fsync operations provided by the underlying Pebble storage engine.
Interpretation
  • Ns/Op: Nanoseconds per operation (lower is better).
  • Bytes/Op: Average memory allocated per operation.
  • Allocs/Op: Average number of heap allocations per operation.

Disclaimer: The results below are a sample run on specific hardware. Actual performance will vary depending on your machine, operating system, filesystem configuration, and data characteristics.

Benchmarks were run on an Apple M2 Pro (darwin/arm64).

In-Memory Storage (:memory:)
Benchmark Operations Ns/Op Bytes/Op Allocs/Op
GetNode 1,495,675 764.5 152 5
FindNodes (Scan) 5,725 207,038 389 8
InsertNode 303,726 4,348 1,608 11
BatchInsertNode (100) 5,612 189,410 164,367 998
InsertEdge 187,784 20,639 2,262 25
TraverseOut (1-hop) 110,222 10,742 3,584 70
BFS_2Hop 67,443 18,006 6,645 132
Disk Storage (SSD)
Benchmark Operations Ns/Op Bytes/Op Allocs/Op
GetNode 1,446,253 813.2 152 5
FindNodes (Scan) 5,700 208,467 389 8
InsertNode 517 2,490,864 602 10
BatchInsertNode (100) 378 3,301,567 47,189 934
InsertEdge 506 2,670,096 1,184 22
TraverseOut (1-hop) 101,926 11,111 3,593 70
BFS_2Hop 63,694 17,768 6,666 133

Note: Disk write performance reflects full ACID compliance with fsync enabled for every transaction. Batch operations significantly amortize this cost.

Visual Comparison

Read Latency

Write Latency

Concurrency & Thread Safety

  • Graph Handle: The *Database instance is safe for concurrent use.
  • Transactions: Individual Tx (Read-Write) and ReadTx (Read-Only) objects are NOT thread-safe. They must be confined to a single goroutine.
  • Isolation: Read transactions see a consistent snapshot of the database at the time of creation, isolated from concurrent writes.

Fluid Query Capabilities

Pathway supports a rich set of traversal steps inspired by Gremlin:

  • Traversal: Out, In, Both, OutE, InV
  • Filtering: Has, HasLabel, Where
  • Projection: Values, Limit, Count, Path
  • Recursion: Repeat, Until, Times

Documentation

For detailed API documentation, including all types and methods, please refer to the API Reference.

For a practical guide on data modeling and graph queries, please see the Social Network Tutorial.

Documentation

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrInvalidDatabase = errors.New("invalid database instance")
	ErrInvalidSnapshot = errors.New("invalid snapshot instance")
	ErrKeyNotFound     = errors.New("key not found")
	ErrNodeNotFound    = errors.New("node not found")
	ErrEdgeNotFound    = errors.New("edge not found")
	ErrDanglingEdge    = errors.New("cannot create edge: source or target node does not exist")
)

Core library errors

Functions

This section is empty.

Types

type Database

type Database struct {
	// contains filtered or unexported fields
}

Database represents a connection to the embedded graph database. It is safe for concurrent use by multiple goroutines.

func Open

func Open(path string) (*Database, error)

Open creates or opens a graph database at the given path with default options.

Usage:

// Open a database on disk
db, err := pathway.Open("data/graph.db")

// Open an in-memory database (useful for testing)
db, err := pathway.Open(":memory:")

func OpenWithOptions

func OpenWithOptions(path string, opts Options) (*Database, error)

OpenWithOptions opens the database with specific options. This allows configuration of logging, monitoring hooks, and underlying storage engine settings.

func (*Database) Close

func (d *Database) Close() error

Close closes the database connection and releases all resources. It is important to call Close() to ensure all data is flushed to disk (if persistent) and locks are released.

func (*Database) Compact

func (g *Database) Compact(ctx context.Context) error

Compact triggers Pebble's manual compaction for the entire key range. It can be used to reclaim disk space after deleting a large number of nodes or edges. Note: This operation can be expensive and should typically not be run during high load.

func (*Database) NewReadTx

func (d *Database) NewReadTx(ctx context.Context) (*Tx, error)

NewReadTx creates a new read-only transaction. The caller is responsible for calling Tx.Close() on the returned transaction. Consider using View() instead which manages the transaction lifecycle automatically.

func (*Database) NewSnapshot

func (g *Database) NewSnapshot(ctx context.Context) (*Snapshot, error)

NewSnapshot creates a snapshot of the current database state. The snapshot is tied to the lifetime of the underlying Graph (Database) instance.

func (*Database) Update

func (d *Database) Update(ctx context.Context, fn func(tx *Tx) error) error

Update executes a function within a read-write transaction. The transaction is committed if the function returns nil, or rolled back if it returns an error.

Usage:

err := db.Update(ctx, func(tx *pathway.Tx) error {
    return tx.PutNode(uuid.New(), "Person")
})

func (*Database) View

func (d *Database) View(ctx context.Context, fn func(tx *Tx) error) error

View executes a function within a read-only transaction. The transaction provides a consistent snapshot of the database at the start of the call. It is automatically rolled back after the function returns.

Usage:

err := db.View(ctx, func(tx *pathway.Tx) error {
    node, exists, err := tx.GetNode(id)
    // ...
    return nil
})

type EdgeIterator

type EdgeIterator interface {
	Iterator // Embed generic iterator
	// Edge returns: EdgeID, TargetNodeID, Label, Error
	Edge() (uuid.UUID, uuid.UUID, string, error)
}

EdgeIterator iterates over edges returning typed data.

type Iterator

type Iterator interface {
	// Next advances the iterator to the next element. Returns false if exhausted/error.
	Next() bool
	// SeekGE moves to the first key greater than or equal to the given key.
	SeekGE(key []byte) bool
	// Key returns the current key.
	Key() []byte
	// Value returns the current value.
	Value() []byte
	// Valid returns true if the iterator is positioned at a valid element.
	Valid() bool
	// Close releases resources.
	Close() error
	// Error returns any accumulated error.
	Error() error

	// Path returns the current path history for the element.
	Path() []interface{}
}

Iterator is the generic interface for iterating over key-value pairs. It abstracts the underlying storage iterator. Users typically interact with specific interfaces like NodeIterator or EdgeIterator.

type Logger

type Logger interface {
	// Infof logs a formatted information message.
	Infof(format string, args ...interface{})
	// Errorf logs a formatted error message.
	Errorf(format string, args ...interface{})
}

Logger defines a simple logging interface for internal database logs. It allows users to plug in their own logging implementation (e.g., standard log, zap, logrus).

type NodeIterator

type NodeIterator interface {
	Iterator // Embed generic iterator
	// Node returns the ID and Label of the current node.
	Node() (uuid.UUID, string, error)
}

NodeIterator iterates over nodes returning typed data.

type Options

type Options struct {
	// OnQueryStart is called before a query executes.
	// This is useful for auditing or tracing query execution.
	OnQueryStart func(ctx context.Context, query string)

	// OnQueryEnd is called after execution, providing duration and error/success status.
	// This is useful for monitoring performance and logging slow queries.
	OnQueryEnd func(ctx context.Context, query string, duration time.Duration, err error)

	// Logger interface for internal debug logs.
	// If nil, no internal logging will be performed.
	Logger Logger

	// PebbleOptions allows customizing the underlying storage engine (cockroachdb/pebble).
	// Use this to tune cache sizes, compaction settings, or file system options.
	PebbleOptions *pebble.Options
}

Options configuration for the database.

Example:

opts := pathway.Options{
    OnQueryStart: func(ctx context.Context, query string) {
        fmt.Printf("Starting query: %s\n", query)
    },
}

type Predicate

type Predicate func(val interface{}) bool

Predicate is a function that tests a value.

func Contains

func Contains(substr string) Predicate

Contains returns a predicate that checks if a string contains substr.

func Eq

func Eq(expected interface{}) Predicate

Eq returns a predicate that checks for strict equality.

Usage:

pathway.Eq("Person")

func Gt

func Gt(expected interface{}) Predicate

Gt returns a predicate that checks if value > expected. Only supports int/float64 for now directly.

Usage:

pathway.Gt(18)

func Lt

func Lt(expected interface{}) Predicate

Lt returns a predicate that checks if value < expected.

Usage:

pathway.Lt(100.0)

func Prefix

func Prefix(prefix string) Predicate

Prefix returns a predicate that checks if a string starts with prefix.

Usage:

pathway.Prefix("User-")

type RepeatConfig

type RepeatConfig struct {
	// contains filtered or unexported fields
}

RepeatConfig holds configuration for repeat steps (loops).

type Snapshot

type Snapshot struct {
	// contains filtered or unexported fields
}

Snapshot represents a read‑only view of the database at a point in time. It holds a Pebble snapshot which guarantees a consistent view even while writes are occurring.

func (*Snapshot) Close

func (s *Snapshot) Close() error

Close releases the snapshot resources.

func (*Snapshot) Get

func (s *Snapshot) Get(key []byte) ([]byte, error)

Get returns a Pebble reader that can be used to read keys from the snapshot.

type Step

type Step func(tx *Tx, prev Iterator) Iterator

Step defines a processing step in the traversal pipeline. It takes a transaction context and a previous iterator, and returns a new iterator.

type Traversal

type Traversal struct {
}

Traversal represents a fluent query builder. Currently empty but kept for type safety and potential future state.

type TraversalPipeline

type TraversalPipeline struct {
	// contains filtered or unexported fields
}

TraversalPipeline represents a chain of query steps.

func (*TraversalPipeline) Emit

Emit causes the Repeat loop to emit the current element at each iteration, effectively returning intermediate results as well as the final results.

func (*TraversalPipeline) HasLabel

func (tp *TraversalPipeline) HasLabel(labels ...string) *TraversalPipeline

HasLabel filters the current stream of elements, keeping only those with the specified label(s).

func (*TraversalPipeline) In

func (tp *TraversalPipeline) In(labels ...string) *TraversalPipeline

In moves to incoming neighbor nodes, optionally filtering by edge label.

Usage:

g.V().In("EMPLOYED_BY")...

func (*TraversalPipeline) Out

func (tp *TraversalPipeline) Out(labels ...string) *TraversalPipeline

Out moves to outgoing neighbor nodes, optionally filtering by edge label.

Usage:

g.V().Out("KNOWS")...

func (*TraversalPipeline) Path

Path transforms the current stream to return the full path history of each element. Usage:

g.V().Out().Path()

func (*TraversalPipeline) Repeat

Repeat repeats the provided sub-traversal. It is used in conjunction with Until(), Times(), or Emit() to control the loop.

Usage:

// 2-hop friends
g.V().Repeat(func(t *TraversalPipeline) { return t.Out("KNOWS") }).Times(2)

func (*TraversalPipeline) Times

func (tp *TraversalPipeline) Times(n int) *TraversalPipeline

Times terminates a Repeat loop after a fixed number of iterations.

func (*TraversalPipeline) ToList

func (tp *TraversalPipeline) ToList() ([]interface{}, error)

ToList executes the traversal pipeline and returns the results as a list. This triggers the actual database transaction.

func (*TraversalPipeline) Until

Until terminates a Repeat loop when the predicate is true for the current element.

func (*TraversalPipeline) Values

func (tp *TraversalPipeline) Values(keys ...string) *TraversalPipeline

Values extracts property values from the current elements. Not fully implemented in Phase 1 (returns raw elements).

type TraversalSource

type TraversalSource struct {
	// contains filtered or unexported fields
}

TraversalSource is the starting point for graph traversals. It holds a reference to the database and spawns TraversalPipelines.

func NewTraversalSource

func NewTraversalSource(db *Database) *TraversalSource

NewTraversalSource creates a new traversal source from a database instance.

Usage:

g := pathway.NewTraversalSource(db)

func (*TraversalSource) V

func (ts *TraversalSource) V(ids ...string) *TraversalPipeline

V starts a traversal. If ids are provided, it starts at the specified nodes. If no ids are provided, it starts a scan of all nodes in the graph.

Usage:

// Start at specific node
g.V("uuid-string").Out()...

// Start at all nodes (scan)
g.V().HasLabel("Person")...

type Tx

type Tx struct {
	// contains filtered or unexported fields
}

Tx represents a database transaction. It can be either read-only (created via View) or read-write (created via Update).

func (*Tx) Access

func (tx *Tx) Access(fn func(tx *Tx) error) error

Access allows executing a function within the transaction context. Useful for executing multiple operations on the same transaction (e.g. Iterators).

func (*Tx) Close

func (tx *Tx) Close() error

Close closes the transaction. This releases the underlying snapshot. For write transactions (Update), it's a no-op as the batch is managed by the DB.Update method, but for read-only (View/NewReadTx), it releases the read lease.

func (*Tx) Delete

func (tx *Tx) Delete(key []byte, opts *pebble.WriteOptions) error

Delete deletes the raw value for a given key. Returns an error if the transaction is read-only.

func (*Tx) DeleteEdge deprecated

func (tx *Tx) DeleteEdge(edgeID uuid.UUID) error

DeleteEdge removes a specific edge. Note: Currently, deleting by EdgeID alone is not fully supported efficiently without an index. Use DeleteEdgeBetween(src, dst, label) if available (planned for future).

Deprecated: Use DeleteNode or specific edge removal logic when API expands.

func (*Tx) DeleteNode

func (tx *Tx) DeleteNode(id uuid.UUID) error

DeleteNode deletes a node and all its incident edges (both outgoing and incoming). This ensures graph consistency so no dangling edges remain. Note: This operation can be expensive for highly connected nodes as it requires scanning and deleting all edges.

func (*Tx) FindNodes

func (tx *Tx) FindNodes(label, propKey, propValue string) NodeIterator

FindNodes scans the index. Not implemented in Phase 1.

func (*Tx) Get

func (tx *Tx) Get(key []byte) ([]byte, error)

Get retrieves the raw value for a given key. It handles the difference between read-only readers and write batches.

func (*Tx) GetNode

func (tx *Tx) GetNode(id uuid.UUID) (string, bool, error)

GetNode retrieves a node's label by its ID. Returns the label, a boolean indicating existence, and any error.

func (*Tx) GetProperties

func (tx *Tx) GetProperties(id uuid.UUID) (map[string]interface{}, error)

GetProperties retrieves the properties map for a given node or edge ID. Returns nil if no properties exist.

func (*Tx) InEdges

func (tx *Tx) InEdges(id uuid.UUID, labels ...string) EdgeIterator

InEdges returns an iterator for incoming edges to the given node ID. Optionally filters by edge labels.

func (*Tx) Load

func (tx *Tx) Load(id uuid.UUID, dest interface{}) error

Load populates the destination struct with properties from the node/edge with the given ID. dest must be a pointer to a struct.

func (*Tx) NewIterator

func (tx *Tx) NewIterator(opts *pebble.IterOptions) (Iterator, error)

NewIterator creates a new low-level iterator for the transaction. This is primarily for internal use; users should typically use high-level iterators like ScanNodes, OutEdges, etc.

func (*Tx) OutEdges

func (tx *Tx) OutEdges(id uuid.UUID, labels ...string) EdgeIterator

OutEdges returns an iterator for outgoing edges from the given node ID. Optionally filters by edge labels.

Usage:

iter := tx.OutEdges(nodeID, "KNOWS", "WORKS_WITH")
defer iter.Close()
for iter.Next() { ... }

func (*Tx) PutEdge

func (tx *Tx) PutEdge(srcID, dstID uuid.UUID, label string) (uuid.UUID, error)

PutEdge creates a directed edge between two nodes. It performs a dual-write, creating both an outgoing key (for traversals from source) and an incoming key (for traversals to target).

Returns an error if either source or destination node does not exist.

Usage:

edgeID, err := tx.PutEdge(srcID, dstID, "KNOWS")

func (*Tx) PutNode

func (tx *Tx) PutNode(id uuid.UUID, label string) error

PutNode creates or updates a node with the given label.

Usage:

id := uuid.New()
err := tx.PutNode(id, "Person")

func (*Tx) ScanNodes

func (tx *Tx) ScanNodes() NodeIterator

ScanNodes scans all nodes in the database. This is a full table scan and can be slow for large datasets.

func (*Tx) Set

func (tx *Tx) Set(key, value []byte, opts *pebble.WriteOptions) error

Set sets the raw value for a given key. Returns an error if the transaction is read-only.

func (*Tx) SetProperties

func (tx *Tx) SetProperties(id uuid.UUID, props map[string]interface{}) error

SetProperties sets a map of properties for a given node or edge. This completely replaces any existing properties for that entity.

Usage:

err := tx.SetProperties(id, map[string]interface{}{
    "name": "Alice",
    "age":  30,
})

Directories

Path Synopsis
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL