golem

v0.1.0 (latest) · Published: May 4, 2026 · License: MIT

Repository

github.com/mab-go/golem

Golem

An autonomous AI agent (powered by Claude) that plays Minecraft as a genuine co-op survival partner.

A Minecraft Adventurer

Golem is an autonomous agentic AI system that allows Claude to play Minecraft. It is not a Minecraft bot -- there are plenty of those already. Rather, Golem lets Claude actually play the game of Minecraft.

How it Works

A Text-Based Adventure (With a Twist)

Through Golem, Claude perceives the world as a text-based adventure game such as Colossal Cave Adventure or Zork. Every "tick", Claude perceives their world, decides what to do, acts autonomously, and then perceives the consequences. Between sessions, Claude's journal and world knowledge persist on disk, so there's always continuity between rounds: Claude always returns to a place they recognize, with goals they set for themselves, players they've encountered, the house they built two rounds ago, and the creeper that blew a hole in it last round 😄

Claude acts through verbs like navigate_to, harvest_block, survey_area, assess_threat, and many more. Claude keeps a journal of what they've done and learned, and wakes up remembering "yesterday". When stuck, Claude can stop and think. When something catches their eye, Claude can stop and take a screenshot using the take_screenshot tool to see what they're looking at.

Using Golem, Claude joins your survival world as a genuine thinking, reasoning, goal-oriented co-op partner.

A Go + Node.js Tech Stack

The main component of the Golem system is a Go agent that drives a Perceive -> Think -> Act -> Remember loop. It communicates with a Node.js Mineflayer sidecar via gRPC, and persists its memory to disk in the form of Markdown and JSON files. The project includes two binaries: golem (the headless agent) and golem-tui (a terminal-based mission control UI).

Why Build Golem?

A typical Minecraft bot is a programmed controller optimized for a specific task (or a small set of tasks). Golem is much more. It is an experiment in something completely different: what happens when you give an AI a persistent identity in a world that it can explore, reason about, reshape, and remember?

Large Language Models (LLMs) are powerful constructs, but they lack any sort of real autonomy. In contrast, agentic systems allow developers to extend the capabilities of generative AI systems with tool use and deterministic control loops. This enables us to create intelligent, powerful systems that far surpass the abilities of predictive language models on their own.

Golem is such an agentic system. It is the result of asking not "why?" but "why not?"; of asking "can we?" and answering "oh, most definitely".

Architecture Overview

Golem is a two-process system connected by gRPC:

graph LR
    subgraph agent["Go Agent"]
        direction TB
        A1["Perceive -> Think -> Act -> Remember"]
        A2["Claude API -- Anthropic SDK"]
        A3["Persistent memory -- disk"]
        A4["Event classification and gatekeeper"]
        A5["Tool dispatch -- 4 tiers"]
    end

    subgraph sidecar["Node.js Sidecar -- Mineflayer"]
        direction TB
        S1["Pure game I/O -- no AI knowledge"]
    end

    agent <-- "gRPC / protobuf" --> sidecar
    sidecar --> mc["Minecraft Server -- 1.21.9"]

Minds

Golem doesn't run on a single model. Different cognitive functions call for different speeds and depths, so the system splits the work across four model tiers:

Tier	Role	Default Model	What it does
Player	Conscious mind	claude-sonnet-4-6	Moment-to-moment gameplay decisions
Writer	Reflective voice	claude-sonnet-4-6	Journal entries, knowledge summaries
Workhorse	Peripheral awareness	claude-haiku-4-5	Event classification, gatekeeper wake/sleep
Deep	Strategic advisor	claude-opus-4-7	On-demand escalation via `think_deeply`

The gatekeeper (Workhorse tier) runs on a fast tick, classifying incoming game events and deciding whether the situation warrants waking the Player brain. A creeper hissing nearby? Wake up 🫨 Wheat growing one stage taller? Sleep on 😴

Actions

Minecraft is complex. A raw bot API offers hundreds of low-level calls, which is overwhelming for a language model trying to think about what to do rather than how to do it. Of Golem's 41 tools, 31 are organized into four tiers that bridge the gap between intent and execution (the remaining 10 cover perception, agent memory, and escalation; see the Tool Catalog):

Tier	Abstraction	Examples
0	Atomic Mineflayer wrappers	`move_to`, `dig_block`, `place_block`
1	Text-adventure verbs	`navigate_to`, `craft_item`, `harvest_block`
2	Goal-oriented streaming tasks	`gather`, `build_structure`, `farm`
3	Read-only planning queries	`survey_area`, `assess_threat`, `what_can_i_craft`

Tier 0 is the raw vocabulary. Tier 1 is where gameplay feels natural -- these are the verbs of a text adventure, handling pathfinding, crafting recipes, and multi-step sequences so Claude can say craft_item("wooden_pickaxe") instead of reasoning through each placement on a crafting grid. Tier 2 runs as background tasks with progress streaming, so Claude can gather 64 cobblestone without monopolizing their attention. Tier 3 is pure observation -- Claude can survey the landscape, assess threats, or ask what they could craft without committing to anything.
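One way to picture the tiering is a lookup from tool name to tier. The sketch below uses only the example tools named in the table; the `Tier` type, the `tiers` map, and both helper functions are hypothetical and are not taken from Golem's `internal/game` package.

```go
package main

import (
	"fmt"
	"sort"
)

// Tier is the abstraction level of a tool, following the table above.
type Tier int

const (
	Atomic   Tier = iota // Tier 0: atomic Mineflayer wrappers
	Verb                 // Tier 1: text-adventure verbs
	Task                 // Tier 2: goal-oriented streaming tasks
	Planning             // Tier 3: read-only planning queries
)

// tiers lists the example tools named in the table. The map itself is a
// hypothetical structure, not Golem's actual registry.
var tiers = map[string]Tier{
	"move_to": Atomic, "dig_block": Atomic, "place_block": Atomic,
	"navigate_to": Verb, "craft_item": Verb, "harvest_block": Verb,
	"gather": Task, "build_structure": Task, "farm": Task,
	"survey_area": Planning, "assess_threat": Planning, "what_can_i_craft": Planning,
}

// Lookup returns the tier for a tool name.
func Lookup(tool string) (Tier, error) {
	t, ok := tiers[tool]
	if !ok {
		return 0, fmt.Errorf("unknown tool %q", tool)
	}
	return t, nil
}

// ToolsIn returns the tool names in a tier, sorted for stable output.
func ToolsIn(t Tier) []string {
	var names []string
	for name, tier := range tiers {
		if tier == t {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	for t := Atomic; t <= Planning; t++ {
		fmt.Printf("tier %d: %v\n", t, ToolsIn(t))
	}
}
```

A dispatcher could use such a lookup to decide, for instance, that a Tier 2 call should be handed to the background-task machinery while a Tier 3 call can be answered immediately.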

Package Map

Package	Description	Files
`cmd/golem`	Package main is the main package for the golem application.	2
`cmd/golem-tui`	Package main is the entry point for the golem-tui binary.	25
`internal/agent`	Package agent implements the Perceive-Think-Act-Remember agentic loop.	10
`internal/claude`	Package claude wraps the Anthropic SDK for Claude API communication.	13
`internal/game`	Package game provides action handlers bridging Claude tool calls to gRPC.	8
`internal/grpc`	Package grpc implements the gRPC client for the Mineflayer sidecar.	2
`internal/logging`	Package logging provides a logging system for the application.	8
`internal/memory`	Package memory manages the agent's persistent memory files on disk.	2
`internal/perception`	Package perception formats raw game data into text-adventure descriptions.	3
`internal/publisher`	Package publisher defines the EventPublisher interface used to bridge the agent loop to external ...	2
`internal/task`	Package task manages the single active background-task slot for Tier 2 streaming operations (gath...	2
`internal/version`	Package version holds build metadata injected via ldflags.	1

Prerequisites

Building

Go 1.26.1 or later
Node.js (for the Mineflayer sidecar)
protoc (only if modifying proto/minecraft.proto)

Running

Minecraft server 1.21.9 with offline auth enabled
Anthropic API key with access to Sonnet, Haiku, and Opus models

Quick Start

git clone git@github.com:mab-go/golem.git
cd golem
make setup  # Install Go tools + sidecar npm deps

export GOLEM_ANTHROPIC_API_KEY="sk-ant-..."

# Terminal 1: Start the sidecar
cd sidecar && npm start

# Terminal 2: Start the agent
make run ARGS="serve"

For the full walkthrough (Minecraft server setup, Docker, TUI), see docs/setup.md.

Installation

Build from Source

git clone git@github.com:mab-go/golem.git
cd golem
make build  # Build bin/golem, bin/golem-tui, and sidecar

Binaries are written to ./bin/ with version metadata from git.

Docker Compose

docker compose up --build

This starts both the Go agent and the Mineflayer sidecar. You still need an external Minecraft server -- configure its address via environment variables in docker-compose.yml.

Configuration

All configuration uses Viper with the GOLEM env prefix. Every flag can also be set as an environment variable (e.g., --memory-dir -> GOLEM_MEMORY_DIR).
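The flag-to-variable naming rule can be reproduced with the standard library alone. The sketch below illustrates only the convention shown in the example above; it does not use Viper, and `EnvName` is a hypothetical helper rather than a function from Golem.

```go
package main

import (
	"fmt"
	"strings"
)

// EnvName converts a CLI flag such as "--memory-dir" into the matching
// environment variable name under the given prefix.
func EnvName(prefix, flag string) string {
	name := strings.TrimLeft(flag, "-")       // drop the leading dashes
	name = strings.ReplaceAll(name, "-", "_") // dashes become underscores
	return strings.ToUpper(prefix + "_" + name)
}

func main() {
	fmt.Println(EnvName("GOLEM", "--memory-dir")) // prints GOLEM_MEMORY_DIR
}
```

In a Viper-based program the same effect is usually obtained with `SetEnvPrefix`, `AutomaticEnv`, and a key replacer; how Golem wires these together is not shown in this README.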

Required

Variable	Description
`GOLEM_ANTHROPIC_API_KEY`	Anthropic API key (also reads `ANTHROPIC_API_KEY` as fallback)

Model Overrides

Variable	Default	Description
`GOLEM_MODEL_PLAYER`	`claude-sonnet-4-6`	Conscious mind -- gameplay decisions
`GOLEM_MODEL_WRITER`	`claude-sonnet-4-6`	Prose synthesis -- journal/knowledge
`GOLEM_MODEL_WORKHORSE`	`claude-haiku-4-5-20251001`	Reflexes -- gatekeeper, classification
`GOLEM_MODEL_DEEP`	`claude-opus-4-7`	Strategic advisor -- `think_deeply` escalation

Agent Tunables

Variable	Default	Description
`GOLEM_SIDECAR_ADDRESS`	`localhost:50051`	Sidecar gRPC address
`GOLEM_MINECRAFT_USERNAME`	`claude`	Bot username in Minecraft
`GOLEM_MEMORY_DIR`	`./memory`	Directory for persistent memory files
`GOLEM_PERCEPTION_FORMAT`	`prose`	Perception text format (`prose` or `structured`)
`GOLEM_PERCEPTION_RADIUS`	`16`	Block radius for perception
`GOLEM_HISTORY_MESSAGES`	`80`	Retained conversation history length
`GOLEM_PERCEPTION_TICK`	`3s`	Gatekeeper perception tick interval
`GOLEM_HEARTBEAT`	`45s`	Heartbeat interval for temporal awareness
`GOLEM_GATEKEEPER_TIMEOUT`	`5s`	Timeout for gatekeeper Haiku calls
`GOLEM_TASK_TIMEOUT`	`10m`	Max duration for background Tier 2 tasks
`GOLEM_MAX_TOKENS`	`4096`	Max tokens per API response

Binaries

golem

Headless agent. Connects to the sidecar and runs the agentic loop.

golem serve [flags]        Start the agent
golem test-actions [flags] Run integration tests against the sidecar
golem --version            Print version

golem-tui

Terminal mission control built with Bubble Tea. Manages the sidecar subprocess, optionally manages a Docker Minecraft server, and provides a multi-pane interface for monitoring and interacting with the agent.

golem-tui [flags]          Start the TUI
golem-tui --demo           Run with simulated data (no backend required)
golem-tui --no-agent       Start server + sidecar only (dev mode)
golem-tui --server         Manage a Minecraft server via Docker

TUI-specific flags: --sidecar-dir, --sidecar-port, --sidecar-restart, --log-dir, --server-image, --server-port, --server-name, --server-remove, --server-volume, --server-remove-data.

Tool Catalog

Tier 0 -- atomic actions (10)

move_to
look_at
place_block
dig_block
equip_item
use_item
attack_entity
jump
set_sneak
send_chat

Tier 1 -- action verbs (9)

navigate_to
interact_with_entity
harvest_block
open_container
withdraw_from_container
deposit_to_container
craft_item
smelt_item
eat

Tier 2 -- goal-oriented tasks (background) (7)

gather
build_structure
process_all
organize_inventory
clear_area
farm
cancel_task

Tier 3 -- strategic / planning (read-only) (5)

survey_area
find_nearest
what_can_i_craft
assess_threat
plan_path

Perception (3)

look_around
check_inventory
take_screenshot

Meta -- operate on agent state / memory (6)

set_verbosity
write_journal
update_goals
update_world_knowledge
update_inventory_notes
update_social

Escalation -- invoke deeper reasoning (1)

think_deeply

Development

make setup    # first-time: install Go tools + sidecar npm deps
make build    # build both binaries + sidecar
make test     # run tests with -race
make lint     # golangci-lint
make fmt      # goimports (Go) + Prettier (sidecar; see sidecar/.prettierrc)
make cyclo    # cyclomatic complexity (threshold: 10)
make proto    # regenerate Go + TypeScript from proto/minecraft.proto
make help     # full target list

All five verification targets must pass before any change is considered done:

make fmt && make build && make test && make lint && make cyclo

Adding a New Tool

Every new tool touches 7 files across two languages. See CLAUDE.md for the full checklist.

Documentation

This project uses hierarchical READMEs -- every package directory has its own README.md combining hand-written narrative with auto-generated sections.

Generated sections are delimited by HTML comment markers:

<!-- BEGIN:generated:section-name -->
(auto-generated content)
<!-- END:generated:section-name -->

Run make docs to refresh all generated sections. See docs/templates/package-readme.md for the package README template.
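Refreshing a generated section amounts to replacing whatever sits between a matching pair of markers. The sketch below shows that operation on a string; it is an illustration of the marker convention only, and `ReplaceSection` is a hypothetical helper, not what `make docs` actually runs.

```go
package main

import (
	"fmt"
	"strings"
)

// ReplaceSection swaps the text between the BEGIN and END markers for the
// named section and leaves the markers themselves in place.
func ReplaceSection(doc, name, body string) (string, error) {
	begin := "<!-- BEGIN:generated:" + name + " -->"
	end := "<!-- END:generated:" + name + " -->"

	i := strings.Index(doc, begin)
	if i < 0 {
		return "", fmt.Errorf("missing begin marker for %q", name)
	}
	start := i + len(begin)
	j := strings.Index(doc[start:], end)
	if j < 0 {
		return "", fmt.Errorf("missing end marker for %q", name)
	}
	return doc[:start] + "\n" + body + "\n" + doc[start+j:], nil
}

func main() {
	doc := "Intro\n<!-- BEGIN:generated:files -->\nold\n<!-- END:generated:files -->\nOutro"
	out, err := ReplaceSection(doc, "files", "new")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Hand-written text outside the markers is untouched, which is what lets narrative and generated content share one file.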

License

MIT. See LICENSE.

Directories

Path	Synopsis
cmd/golem (command)	Package main is the main package for the golem application.
cmd/golem-tui (command)	Package main is the entry point for the golem-tui binary.
internal/agent	Package agent implements the Perceive-Think-Act-Remember agentic loop.
internal/claude	Package claude wraps the Anthropic SDK for Claude API communication.
internal/game	Package game provides action handlers bridging Claude tool calls to gRPC.
internal/grpc	Package grpc implements the gRPC client for the Mineflayer sidecar.
internal/grpc/pb	
internal/logging	Package logging provides a logging system for the application.
internal/memory	Package memory manages the agent's persistent memory files on disk.
internal/perception	Package perception formats raw game data into text-adventure descriptions.
internal/publisher	Package publisher defines the EventPublisher interface used to bridge the agent loop to external consumers (e.g.
internal/task	Package task manages the single active background-task slot for Tier 2 streaming operations (gather, build, farm, etc.).
internal/version	Package version holds build metadata injected via ldflags.
