golem

Version: v0.6.3 (latest) | Published: Feb 25, 2026 | License: MIT

Repository

github.com/MEKXH/golem

README

Golem (גּוֹלֶם)

Your AI agent. Your terminal. Your rules.

Golem is a terminal-first personal AI assistant built with Go and Eino. It can chat, call tools, run shell commands, manage files, search and fetch web content, keep memory, schedule cron jobs, and run as a background service across multiple channels. It also supports provider auth login and audio transcription for channel messages.

Golem (גולם): In Jewish folklore, a golem is an animated being made from inanimate matter, created to serve.

Why Golem

One binary, zero runtime dependency bloat (no Python/Node/Docker required).
Provider-agnostic model access through a unified OpenAI-compatible layer.
Real agent loop with tool calling, not just plain text chat.
Works both interactively (golem chat) and as a long-running service (golem run).
Built-in channels, gateway API, cron scheduler, heartbeat service, and skill system.
Built-in auth commands, voice transcription pipeline, and restart-safe heartbeat routing.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                           Golem Architecture                         │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌──────────────┐     ┌──────────────┐     ┌──────────────────┐    │
│  │   Channels   │     │    Agent     │     │     Providers    │    │
│  │  (Telegram,  │────▶│    Loop      │────▶│  (Claude, OpenAI,│    │
│  │  Discord,    │     │              │     │   DeepSeek...)   │    │
│  │  Slack...)   │     └──────┬───────┘     └──────────────────┘    │
│  └──────────────┘            │                                       │
│         │                    │                                       │
│         ▼                    ▼                                       │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                         Message Bus                          │    │
│  │           (Inbound/Outbound async message queue)             │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                              │                                       │
│         ┌────────────────────┼────────────────────┐                 │
│         ▼                    ▼                    ▼                 │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────────┐       │
│  │   Session   │     │   Skills    │     │     Tools       │       │
│  │  (History)  │     │ (Prompts)   │     │(exec, file, web)│       │
│  └─────────────┘     └─────────────┘     └─────────────────┘       │
│         │                    │                    │                 │
│         └────────────────────┼────────────────────┘                 │
│                              ▼                                       │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │                     Supporting Services                      │    │
│  │    (Memory | Cron | Heartbeat | Gateway | Skills)           │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Core Components

Component	Path	Description
Agent Loop	`internal/agent/`	Main processing loop with tool calling, capped by `max_tool_iterations` (20 in the example config)
Message Bus	`internal/bus/`	Event-driven message routing via Go channels
Channel System	`internal/channel/`	Multi-platform integrations (Telegram, Discord, Slack, etc.)
Provider	`internal/provider/`	Unified LLM interface via Eino's OpenAI wrapper
Session	`internal/session/`	Persistent JSONL-based conversation history
Tools	`internal/tools/`	Built-in tools: file, shell, memory, web, cron, message, subagent, workflow
Memory	`internal/memory/`	Long-term memory and daily diary system
Skills	`internal/skills/`	Extensible Markdown-based prompt packs
Cron	`internal/cron/`	Scheduled job management
Heartbeat	`internal/heartbeat/`	Periodic health probe and status reporting
Gateway	`internal/gateway/`	HTTP API server (`/health`, `/version`, `/chat`)

Core Features

Interaction Modes

Terminal TUI chat (golem chat)
Multi-channel bot mode (golem run): Telegram, WhatsApp, Feishu, Discord, Slack, QQ, DingTalk, MaixCam
Gateway HTTP API (/health, /version, /chat)

Latest Additions

Auth workflow commands: golem auth login, golem auth logout, golem auth status
Heartbeat target persistence across restarts (last active channel/chat is restored automatically)
Audio transcription in Telegram/Discord/Slack with fallback placeholders when transcription fails
File mutation tools edit_file and append_file for safer incremental edits
Outbound channel reliability policy (channels.outbound): retry, rate-limit, dedup window, and bounded send concurrency

Built-in Tools

Tool	Description
`exec`	Run shell commands (workspace restriction supported)
`read_file` / `write_file` / `edit_file` / `append_file`	File read/write/edit/append in workspace
`list_dir`	List directory contents
`read_memory` / `write_memory`	Persistent memory access
`append_diary`	Append daily notes
`web_search`	Web search (Brave when API key exists; fallback available)
`web_fetch`	Fetch and extract web page content
`manage_cron`	Manage scheduled jobs
`message`	Send messages to channels
`spawn` / `subagent` / `workflow`	Delegate tasks to subagents and orchestrated workflows

LLM Providers

OpenRouter, Claude, OpenAI, DeepSeek, Gemini, Ark, Qianfan, Qwen, Ollama.

Subagent System

Golem can delegate tasks to subagents, including in parallel, through three tools:

spawn: Asynchronous subagent, returns task ID immediately, notifies via message bus
subagent: Synchronous subagent, blocks until completion, returns result directly
workflow: Built-in workflow orchestration tool (decompose task, run sequential/parallel subtasks, aggregate per-step results)

All modes use isolated sessions and propagate origin channel/chat for result delivery.
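
The difference between the blocking and non-blocking modes can be sketched in a few lines of Go. This is a minimal illustration, not Golem's implementation: `Runner`, `Result`, `RunSync`, and `Spawn` are hypothetical names, a plain Go channel stands in for the message bus, and the buffered-channel semaphore is just one way to honor a limit such as `max_concurrency`.

```go
package main

import "fmt"

// Result is what a finished subtask reports back (hypothetical type).
type Result struct {
	ID     string
	Output string
}

// Runner bounds concurrent subtasks with a buffered channel used as a semaphore.
type Runner struct {
	slots chan struct{}
}

// NewRunner allows at most maxConcurrency subtasks to run at once.
func NewRunner(maxConcurrency int) *Runner {
	return &Runner{slots: make(chan struct{}, maxConcurrency)}
}

// RunSync mirrors the "subagent" mode: it blocks until the task finishes.
func (r *Runner) RunSync(task func() string) string {
	r.slots <- struct{}{}        // acquire a slot
	defer func() { <-r.slots }() // release it when the task returns
	return task()
}

// Spawn mirrors the "spawn" mode: it returns the task ID at once and
// delivers the result later on the results channel.
func (r *Runner) Spawn(id string, task func() string, results chan<- Result) string {
	go func() {
		results <- Result{ID: id, Output: r.RunSync(task)}
	}()
	return id
}

func main() {
	r := NewRunner(3)
	fmt.Println(r.RunSync(func() string { return "sync result" }))

	results := make(chan Result, 1)
	id := r.Spawn("task-1", func() string { return "async result" }, results)
	fmt.Println("spawned", id)
	res := <-results
	fmt.Println(res.ID, res.Output)
}
```

Sharing one semaphore between both paths is what makes the limit apply across modes, which matches how the README describes `max_concurrency`.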

Memory System

Two-tier memory architecture:

Long-term Memory: Single MEMORY.md file for persistent knowledge
Daily Diary: YYYY-MM-DD.md files for timestamped journal entries

Heartbeat Service

When enabled, server mode can periodically run a health probe and send heartbeat output to the latest active channel/session. The latest target is persisted in workspace state, so routing survives process restarts.

Installation

Option A: Download Binary

Download Windows/Linux binaries from Releases.

Option B: Install from Source

go install github.com/MEKXH/golem/cmd/golem@latest

Quick Start

1. Initialize config

golem init

This creates ~/.golem/config.json and workspace directories.

2. Bootstrap with the example config

Use the provided template as your starting point:

cp config/config.example.json ~/.golem/config.json

PowerShell:

Copy-Item config/config.example.json "$HOME/.golem/config.json"

Then edit ~/.golem/config.json and set at least one provider key (for example providers.openai.api_key).

Create an environment file from the template (recommended for keeping local, staging, and production settings separate):

cp .env.example .env.local

PowerShell:

Copy-Item .env.example .env.local

Fill in the required secrets in .env.local (at least one provider key, plus GOLEM_GATEWAY_TOKEN for exposed deployments).

Optional (token/OAuth auth store):

golem auth login --provider openai --token "$OPENAI_API_KEY"

3. Run smoke checks

make smoke

Without make:

go test ./...
go run ./cmd/golem status
go run ./cmd/golem chat "ping"

4. Start chatting

golem chat

One-shot:

golem chat "Analyze the current directory structure"

5. Start server mode

golem run

CLI Commands

Command	Description
`golem init`	Initialize config and workspace
`golem chat [message]`	Start TUI chat or send one-shot message
`golem run`	Start server mode
`golem status [--json]`	Show system status summary (human-readable or JSON)
`golem auth login --provider <name> [--token <token> | --device-code | --browser]`	Save provider credentials via token or OAuth
`golem auth logout [--provider <name>]`	Remove one provider credential or all credentials
`golem auth status`	Show current auth credential status
`golem channels list`	List configured channels
`golem channels status`	Show detailed channel status
`golem channels start <channel>`	Enable one channel in config
`golem channels stop <channel>`	Disable one channel in config
`golem cron list`	List scheduled jobs
`golem cron add -n <name> -m <msg> [--every <sec> | --cron <expr> | --at <ts>]`	Add a job
`golem cron run <job_id>`	Run a job immediately
`golem cron remove <job_id>`	Remove a job
`golem cron enable <job_id>`	Enable a job
`golem cron disable <job_id>`	Disable a job
`golem approval list`	List pending approval requests
`golem approval approve <id> --by <name> [--note <text>]`	Approve a pending request
`golem approval reject <id> --by <name> [--note <text>]`	Reject a pending request
`golem skills list`	List installed skills
`golem skills install <owner/repo>`	Install skill from GitHub
`golem skills remove <name>`	Remove installed skill
`golem skills show <name>`	Show skill content
`golem skills search [keyword]`	Search remote skill index

Authentication

Credentials are stored in ~/.golem/auth.json. Provider clients can use auth-store tokens as API credentials when config keys are empty.

Examples:

golem auth login --provider openai --device-code
golem auth status
golem auth logout --provider openai

Cron Scheduling

Schedule types:

--every <seconds>: fixed interval
--cron "<expr>": standard 5-field cron expression
--at "<RFC3339>": one-shot execution

Examples:

golem cron add -n "hourly-check" -m "Check system status and report" --every 3600
golem cron add -n "morning-brief" -m "Give me a morning briefing" --cron "0 9 * * *"
golem cron add -n "meeting-reminder" -m "Remind me about the team meeting" --at "2026-02-14T09:00:00Z"

Skills System

Skills are Markdown instruction packs loaded into the agent prompt.

Skill discovery precedence:

workspace/skills
~/.golem/skills
builtin skills directory (default: ~/.golem/builtin-skills, override via GOLEM_BUILTIN_SKILLS_DIR)

Install from GitHub:

golem skills install owner/repo

Search remote skills:

golem skills search
golem skills search weather

Configuration

Main file: ~/.golem/config.json

Template file in repo: config/config.example.json

{
  "agents": {
    "defaults": {
      "workspace_mode": "default",
      "workspace": "",
      "model": "anthropic/claude-sonnet-4-5",
      "max_tokens": 8192,
      "temperature": 0.7,
      "max_tool_iterations": 20
    },
    "subagent": {
      "timeout_seconds": 300,
      "retry": 1,
      "max_concurrency": 3
    }
  },
  "channels": {
    "telegram": {
      "enabled": false,
      "token": "",
      "allow_from": []
    },
    "outbound": {
      "max_concurrent_sends": 16,
      "retry_max_attempts": 3,
      "retry_base_backoff_ms": 200,
      "retry_max_backoff_ms": 2000,
      "rate_limit_per_second": 20,
      "dedup_window_seconds": 30
    }
  },
  "providers": {
    "claude": {
      "api_key": ""
    },
    "openai": {
      "api_key": ""
    },
    "ollama": {
      "base_url": "http://localhost:11434"
    }
  },
  "policy": {
    "mode": "strict",
    "off_ttl": "",
    "allow_persistent_off": false,
    "require_approval": ["exec"]
  },
  "mcp": {
    "servers": {
      "localfs": {
        "enabled": true,
        "transport": "stdio",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
      }
    }
  },
  "tools": {
    "exec": {
      "timeout": 60,
      "restrict_to_workspace": true
    },
    "web": {
      "search": {
        "api_key": "",
        "max_results": 5
      }
    },
    "voice": {
      "enabled": false,
      "provider": "openai",
      "model": "gpt-4o-mini-transcribe",
      "timeout_seconds": 30
    }
  },
  "gateway": {
    "host": "0.0.0.0",
    "port": 18790,
    "token": ""
  },
  "heartbeat": {
    "enabled": true,
    "interval": 30,
    "max_idle_minutes": 720
  },
  "log": {
    "level": "info",
    "file": ""
  }
}

workspace_mode values:

default: use ~/.golem/workspace
cwd: use current working directory
path: use agents.defaults.workspace
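
The three modes map naturally onto a small switch. The following Go sketch is an assumption-laden illustration (the `resolveWorkspace` name and the error cases are hypothetical), intended only to show how the documented values relate to a directory.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// resolveWorkspace maps a workspace_mode value to a directory.
// configured stands for the agents.defaults.workspace setting.
func resolveWorkspace(mode, configured string) (string, error) {
	switch mode {
	case "default":
		home, err := os.UserHomeDir()
		if err != nil {
			return "", err
		}
		return filepath.Join(home, ".golem", "workspace"), nil
	case "cwd":
		return os.Getwd()
	case "path":
		if configured == "" {
			return "", errors.New(`workspace_mode is "path" but agents.defaults.workspace is empty`)
		}
		return configured, nil
	default:
		return "", fmt.Errorf("unknown workspace_mode %q", mode)
	}
}

func main() {
	for _, mode := range []string{"default", "cwd", "path"} {
		dir, err := resolveWorkspace(mode, "/srv/golem")
		fmt.Println(mode, dir, err)
	}
}
```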

agents.subagent runtime values:

timeout_seconds: delegated subtask timeout (default 300)
retry: retry count per subtask (default 1, total attempts = retry + 1)
max_concurrency: max concurrent subtask executions across spawn/subagent/workflow (default 3)

channels.outbound reliability values:

max_concurrent_sends: max concurrent outbound sends (default 16)
retry_max_attempts: max attempts per outbound message on retriable channels (default 3)
retry_base_backoff_ms / retry_max_backoff_ms: exponential backoff window in milliseconds
rate_limit_per_second: global outbound send rate limit (default 20)
dedup_window_seconds: dedup window for same channel+chat_id+request_id (default 30)

policy.mode values:

strict: enforce require_approval list before tool execution
relaxed: allow execution without approval gate
off: disable policy checks (use off_ttl for temporary bypass)
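
Read literally, the three modes reduce to a single check before a tool runs. The Go sketch below captures that reading; `needsApproval` is a hypothetical function, and it ignores `off_ttl` and `allow_persistent_off`, whose exact semantics are not spelled out here.

```go
package main

import "fmt"

// needsApproval reports whether a tool call must wait for approval.
// Only "strict" consults the require_approval list in this sketch.
func needsApproval(mode string, requireApproval []string, tool string) bool {
	if mode != "strict" {
		return false
	}
	for _, name := range requireApproval {
		if name == tool {
			return true
		}
	}
	return false
}

func main() {
	list := []string{"exec"}
	fmt.Println(needsApproval("strict", list, "exec"))      // true
	fmt.Println(needsApproval("strict", list, "read_file")) // false
	fmt.Println(needsApproval("relaxed", list, "exec"))     // false
}
```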

Approval and audit state files:

workspace/state/approvals.json
workspace/state/audit.jsonl

Environment Variables

Every config key can also be set through an environment variable with the GOLEM_ prefix:

export GOLEM_PROVIDERS_OPENROUTER_APIKEY="your-key"
export GOLEM_PROVIDERS_CLAUDE_APIKEY="your-key"
export GOLEM_LOG_LEVEL=debug

Recommended profile files:

.env.local: local development defaults
.env.staging: pre-release integration environment
.env.production: production deployment

You can start from .env.example and keep policy.mode=strict and policy.allow_persistent_off=false as safe defaults.

Minimum required secrets:

At least one provider API key (or use golem auth login --provider <name>).
GOLEM_GATEWAY_TOKEN for staging/production, where the gateway is network-accessible.

Gateway API

Available in server mode (golem run):

GET /health
GET /version
POST /chat

POST /chat example:

{
  "message": "Summarize the latest logs",
  "session_id": "ops-room",
  "sender_id": "api-client"
}

If gateway.token is configured, include:

Authorization: Bearer <token>

Data Flow

User Input (CLI/Telegram/Discord/Slack...)
         │
         ▼
    Channel (receives & validates message)
         │
         ▼
    Bus.PublishInbound() ──▶ MessageBus.inbound
         │
         ▼
    Agent Loop (processes message)
         │
    ┌────┴────┐
    ▼         ▼         ▼
Session  Context  LLM Generate
(History) Builder  (with tools bound)
              │           │
              │           ▼
              │      Tools.Execute()
              │      (tool calls)
              │           │
              └─────┬─────┘
                    ▼
         Bus.PublishOutbound()
                    │
                    ▼
         Channel Manager (routes)
                    │
                    ▼
         Channel.Send() ──▶ User
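
The flow above can be reduced to a toy program built on Go channels. This is only a sketch of the pattern: `PublishInbound` and `PublishOutbound` are names from the diagram, but their signatures, the `Message` and `Bus` types, and `runAgent` are assumptions, and the echo handler stands in for the LLM and tool calls.

```go
package main

import "fmt"

// Message carries routing info so a reply can return to its origin.
type Message struct {
	Channel string
	ChatID  string
	Content string
}

// Bus holds the inbound and outbound queues as buffered Go channels.
type Bus struct {
	inbound  chan Message
	outbound chan Message
}

func NewBus(size int) *Bus {
	return &Bus{inbound: make(chan Message, size), outbound: make(chan Message, size)}
}

func (b *Bus) PublishInbound(m Message)  { b.inbound <- m }
func (b *Bus) PublishOutbound(m Message) { b.outbound <- m }

// runAgent consumes inbound messages and publishes one reply per message.
// It closes the outbound queue once the inbound queue is closed and drained.
func runAgent(b *Bus, handle func(Message) string) {
	for m := range b.inbound {
		b.PublishOutbound(Message{Channel: m.Channel, ChatID: m.ChatID, Content: handle(m)})
	}
	close(b.outbound)
}

func main() {
	bus := NewBus(8)
	go runAgent(bus, func(m Message) string { return "echo: " + m.Content })

	bus.PublishInbound(Message{Channel: "cli", ChatID: "local", Content: "ping"})
	close(bus.inbound)

	// The channel manager's role: route each outbound message to its channel.
	for out := range bus.outbound {
		fmt.Printf("%s/%s: %s\n", out.Channel, out.ChatID, out.Content)
	}
}
```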

Bootstrap Files

The agent's system prompt is built from these files, which are looked up in the workspace:

IDENTITY.md - Agent identity and persona
SOUL.md - Core beliefs and values
USER.md - User-specific context
TOOLS.md - Custom tool descriptions
AGENTS.md - Subagent definitions

Operations

For incident handling, restart/rollback flow, and production guidance:

Development

Common commands:

make build
make test
make lint
make smoke

Without make, run before pushing:

go test ./...
go test -race ./...
go vet ./...

Build:

go build -o golem ./cmd/golem

License

MIT

Directories

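Path	Synopsis
cmd/golem	command
cmd/golem/commands
internal/agent
internal/approval
internal/audit
internal/auth
internal/bus
internal/channel
internal/channel/dingtalk
internal/channel/discord
internal/channel/feishu
internal/channel/maixcam
internal/channel/qq
internal/channel/slack
internal/channel/telegram
internal/channel/whatsapp
internal/command
internal/config
internal/cron
internal/gateway
internal/heartbeat
internal/mcp
internal/memory
internal/metrics
internal/policy
internal/provider
internal/render
internal/session
internal/skills
internal/state
internal/tools
internal/version
internal/voice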
