cli

command module
v0.50.0
Published: Sep 17, 2025 License: MIT Imports: 1 Imported by: 0

README

Inference Gateway CLI

A powerful command-line interface for managing and interacting with the Inference Gateway. This CLI provides tools for configuration, monitoring, and management of inference services.

⚠️ Warning

Early Development Stage: This project is in its early development stage and breaking changes are expected until it reaches a stable version.

Always use pinned versions by specifying a specific version tag when downloading binaries or using install scripts.

Features

  • Status Monitoring: Check gateway health and resource usage
  • Interactive Chat: Chat with models using an interactive interface
  • Conversation History: Store and retrieve past conversations with multiple storage backends
  • Configuration Management: Manage gateway settings via YAML config
  • Project Initialization: Set up local project configurations
  • Tool Execution: LLMs can execute whitelisted commands and tools including:
    • Bash: Execute safe shell commands
    • Read: Read file contents with optional line ranges
    • Write: Write content to files with security controls
    • Grep: Fast ripgrep-powered search with regex support and multiple output modes
    • WebSearch: Search the web using DuckDuckGo or Google
    • WebFetch: Fetch content from whitelisted URLs
    • Github: Interact with GitHub API to fetch issues, pull requests, and create content
    • Tree: Display directory structure with polyfill support
    • Delete: Delete files and directories with security controls
    • Edit: Perform exact string replacements in files
    • MultiEdit: Make multiple edits to a file as one atomic operation
    • TodoWrite: Create and manage structured task lists

Installation

Using Go Install
go install github.com/inference-gateway/cli@latest
Using Container Image

For containerized environments, you can use the official container image:

# Run the CLI directly
docker run --rm -it ghcr.io/inference-gateway/cli:latest --help

# With volume mount for config persistence
docker run --rm -it -v ~/.infer:/home/infer/.infer ghcr.io/inference-gateway/cli:latest

# Example: Run chat command
docker run --rm -it -v ~/.infer:/home/infer/.infer ghcr.io/inference-gateway/cli:latest chat

Using specific version:

docker run --rm -it ghcr.io/inference-gateway/cli:0.48.12

Available architectures: linux/amd64, linux/arm64

Using Install Script

For quick installation, you can use our install script:

Unix/macOS/Linux:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash

With specific version:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --version v0.1.1

Custom install directory:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --install-dir $HOME/.local/bin

The install script will:

  • Detect your operating system and architecture automatically
  • Download the appropriate binary from GitHub releases
  • Install to /usr/local/bin by default (or a custom directory with --install-dir)
  • Make the binary executable
  • Verify the installation
Manual Download

Download the latest release binary for your platform from the releases page.

Verifying Release Binaries

All release binaries are signed with Cosign for supply chain security. You can verify the integrity and authenticity of downloaded binaries using the following steps:

1. Download the binary, checksums, and signature files:

# Download binary (replace with your platform)
curl -L -o infer-darwin-amd64 \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/infer-darwin-amd64

# Download checksums and signature files
curl -L -o checksums.txt \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt
curl -L -o checksums.txt.pem \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt.pem
curl -L -o checksums.txt.sig \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt.sig

2. Verify SHA256 checksum:

# Calculate checksum of downloaded binary
shasum -a 256 infer-darwin-amd64

# Compare with checksums in checksums.txt
grep infer-darwin-amd64 checksums.txt
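
The manual comparison in step 2 can also be scripted. The following is a minimal sketch in Python, not part of the CLI. It assumes checksums.txt uses the common `<hex digest>  <filename>` layout written by sha256sum-style tools, which this README does not confirm, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksums_path, filename):
    """Look up the digest recorded for filename (assumed '<digest>  <name>' lines)."""
    for line in Path(checksums_path).read_text().splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify(binary_path, checksums_path):
    """True only if the binary's digest matches its entry in the checksums file."""
    expected = expected_digest(checksums_path, Path(binary_path).name)
    return expected is not None and expected == sha256_of(binary_path)
```

Calling `verify("infer-darwin-amd64", "checksums.txt")` returns `False` both when the digests differ and when the file has no entry at all, so a missing entry is never treated as a pass. This check complements the Cosign signature verification and does not replace it.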

3. Verify Cosign signature (requires Cosign to be installed):

# Decode base64 encoded certificate
cat checksums.txt.pem | base64 -d > checksums.txt.pem.decoded

# Verify the signature
cosign verify-blob \
  --certificate checksums.txt.pem.decoded \
  --signature checksums.txt.sig \
  --certificate-identity "https://github.com/inference-gateway/cli/.github/workflows/release.yml@refs/heads/main" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  checksums.txt

4. Make binary executable and install:

chmod +x infer-darwin-amd64
sudo mv infer-darwin-amd64 /usr/local/bin/infer

Note: Replace v0.29.1 with the desired release version and infer-darwin-amd64 with your platform's binary name.

Build from Source
git clone https://github.com/inference-gateway/cli.git
cd cli
go build -o infer .

Quick Start

  1. Initialize project configuration:

    infer init --model deepseek/deepseek-chat
    

    Using --model is recommended as it enables AI project analysis and generates a comprehensive AGENTS.md file tailored to your specific project.

  2. Check gateway status:

    infer status
    
  3. Start an interactive chat:

    infer chat
    

Commands

infer init

Initialize a new project with Inference Gateway CLI. This creates:

  • .infer/ directory with:
    • config.yaml - Main configuration file for the project
    • .gitignore - Ensures sensitive files are not committed to version control
  • AGENTS.md - AI-generated project documentation in the repository root (only when --model is specified)

This is the recommended command to start working with Inference Gateway CLI in a new project.

Options:

  • --overwrite: Overwrite existing files if they already exist
  • --userspace: Initialize configuration in user home directory (~/.infer/)

Examples:

# Initialize project-level configuration (default)
infer init
infer init --overwrite

# Initialize userspace configuration (global fallback)
infer init --userspace

infer config

Manage CLI configuration settings including models, system prompts, and tools.

infer config init

Initialize a new .infer/config.yaml configuration file in the current directory. This creates only the configuration file with default settings.

For complete project initialization, use infer init instead.

Options:

  • --overwrite: Overwrite existing configuration file
  • --userspace: Initialize configuration in user home directory (~/.infer/)

Examples:

# Initialize project-level configuration (default)
infer config init
infer config init --overwrite

# Initialize userspace configuration (global fallback)
infer config init --userspace

infer config agent set-model

Set the default model for chat sessions. When set, chat sessions will automatically use this model without showing the model selection prompt.

Examples:

infer config agent set-model openai/gpt-4-turbo
infer config agent set-model anthropic/claude-opus-4-1-20250805

infer config agent set-system

Set a system prompt that will be included with every chat session, providing context and instructions to the AI model.

Examples:

infer config agent set-system "You are a helpful assistant."
infer config agent set-system "You are a Go programming expert."

infer config tools

Manage tool execution settings for LLMs, including enabling/disabling tools, managing whitelists, and security settings.

Subcommands:

  • enable: Enable tool execution for LLMs
  • disable: Disable tool execution for LLMs
  • list [--format text|json]: List whitelisted commands and patterns
  • validate <command>: Validate if a command is whitelisted
  • exec <command> [--format text|json]: Execute a whitelisted command directly
  • safety: Manage safety approval settings
    • enable: Enable safety approval prompts
    • disable: Disable safety approval prompts
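    • status: Show current safety approval status
    • set <tool> enabled|disabled: Require or skip approval for a single tool
    • unset <tool>: Remove the tool-specific setting and fall back to the global one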
  • sandbox: Manage sandbox directories for security
    • list: List all sandbox directories
    • add <path>: Add a protected path to the sandbox
    • remove <path>: Remove a protected path from the sandbox

Examples:

# Enable/disable tool execution
infer config tools enable
infer config tools disable

# List whitelisted commands
infer config tools list
infer config tools list --format json

# Validate and execute commands
infer config tools validate "ls -la"
infer config tools exec "git status"

# Manage global safety settings (approval prompts)
infer config tools safety enable   # Enable approval prompts for all tool execution
infer config tools safety disable  # Disable approval prompts (execute tools immediately)
infer config tools safety status   # Show current safety approval status

# Manage tool-specific safety settings (granular control)
infer config tools safety set Bash enabled        # Require approval for Bash tool only
infer config tools safety set WebSearch disabled  # Skip approval for WebSearch tool
infer config tools safety unset Bash              # Remove tool-specific setting (use global)

# Manage excluded paths
infer config tools sandbox list
infer config tools sandbox add ".github/"
infer config tools sandbox remove "test.txt"

infer status

Check the status of the inference gateway including health checks and resource usage.

Examples:

infer status

infer chat

Start an interactive chat session. You select a model when the session starts (unless a default model is configured) and then converse with it in the terminal.

Features:

  • Interactive model selection
  • Conversational interface
  • Real-time streaming responses
  • Scrollable chat history with mouse wheel and keyboard support

Navigation Controls:

  • Mouse wheel: Scroll up/down through chat history
  • Arrow keys (↑/↓) or Vim keys (k/j): Scroll one line at a time
  • Page Up/Page Down: Scroll by page
  • Home/End: Jump to top/bottom of chat history
  • Shift+↑/Shift+↓: Half-page scrolling
  • Ctrl+R: Toggle expanded view of tool results

System Reminders:

The chat interface supports configurable system reminders that can provide periodic contextual information to the AI model during conversations. These reminders help maintain context and provide relevant guidance throughout the session.

  • Customizable interval: Set how often reminders appear (in number of messages)
  • Dynamic content: Reminders can contain contextual information based on the current state
  • Non-intrusive: Reminders are sent to the AI model but don't interrupt the user experience
  • Configurable: Enable/disable and customize reminder content through configuration

Examples:

infer chat

infer agent

Execute a task using an autonomous agent in background mode. The CLI will work iteratively until the task is considered complete. Particularly useful for SCM tickets like GitHub issues.

Features:

  • Autonomous execution: Agent works independently to complete tasks
  • Iterative processing: Continues until task completion criteria are met
  • Tool integration: Full access to all available tools (Bash, Read, Write, etc.)
  • Parallel tool execution: Executes multiple tool calls simultaneously for improved efficiency
  • Background operation: Runs without interactive user input
  • Task completion detection: Automatically detects when tasks are complete
  • Configurable concurrency: Control the maximum number of parallel tool executions (default: 5)
  • JSON output: Structured JSON output for easy parsing and integration

Options:

  • -m, --model: Model to use for the agent (e.g., openai/gpt-4)

Examples:

# Execute a task described in a GitHub issue
infer agent "Please fix the github issue 38"

# Use a specific model for the agent
infer agent --model "openai/gpt-4" "Implement the feature described in issue #42"

# Debug a failing test
infer agent "Debug the failing test in PR 15"

# Refactor code
infer agent "Refactor the authentication module to use JWT tokens"

infer version

Display version information for the Inference Gateway CLI.

Examples:

infer version

Available Tools for LLMs

When tool execution is enabled, LLMs can use the following tools to interact with the system:

Tree Tool

Display directory structure in a tree format, similar to the Unix tree command. Provides a polyfill implementation when the native tree command is unavailable.

Parameters:

  • path (optional): Directory path to display tree structure for (default: current directory)
  • max_depth (optional): Maximum depth to traverse (unlimited by default)
  • show_hidden (optional): Whether to show hidden files and directories (default: false)
  • respect_gitignore (optional): Whether to exclude patterns from .gitignore (default: true)
  • format (optional): Output format - "text" or "json" (default: "text")

Examples:

  • Basic tree: Uses current directory with default settings
  • Tree with depth limit: max_depth: 2 - Shows only 2 levels deep
  • Tree with hidden files: show_hidden: true
  • Tree ignoring gitignore: respect_gitignore: false - Shows all files including those in .gitignore
  • JSON output: format: "json" - Returns structured data

Features:

  • Native Integration: Uses system tree command when available for optimal performance
  • Polyfill Implementation: Falls back to custom implementation when tree is not installed
  • Pattern Exclusion: Supports glob patterns to exclude specific files and directories
  • Depth Control: Limit traversal depth to prevent overwhelming output
  • Hidden File Control: Toggle visibility of hidden files and directories
  • Multiple Formats: Text output for readability, JSON for structured data

Security:

  • Respects configured path exclusions for security
  • Validates directory access permissions
  • Limited by the same security restrictions as other file tools
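
To illustrate what a polyfill for the tree command involves, here is a minimal sketch in Python. It is not the CLI's implementation: the function name is hypothetical, only the path, max_depth, and show_hidden parameters are modelled, and .gitignore handling and JSON output are left out.

```python
import os


def render_tree(path=".", max_depth=None, show_hidden=False):
    """Return the tree as a list of lines; max_depth=None means unlimited."""
    lines = [path]

    def walk(directory, prefix, depth):
        # Stop once the requested number of levels has been listed.
        if max_depth is not None and depth > max_depth:
            return
        names = sorted(os.listdir(directory))
        if not show_hidden:
            names = [name for name in names if not name.startswith(".")]
        for index, name in enumerate(names):
            last = index == len(names) - 1
            lines.append(prefix + ("└── " if last else "├── ") + name)
            child = os.path.join(directory, name)
            # Do not follow symlinks, to avoid cycles.
            if os.path.isdir(child) and not os.path.islink(child):
                walk(child, prefix + ("    " if last else "│   "), depth + 1)

    walk(path, "", 1)
    return lines
```

For example, `print("\n".join(render_tree(".", max_depth=2)))` lists two levels below the current directory, mirroring the "Tree with depth limit" example above.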
Bash Tool

Execute whitelisted bash commands securely with validation against configured command patterns.
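
The validation step can be pictured with a short Python sketch. The rule it applies is an assumption, not the CLI's confirmed behaviour: a command passes if its first word is in the whitelisted command list, or if the whole line matches one of the whitelisted regex patterns. The lists are a subset of the default configuration in this README, and the function name is hypothetical.

```python
import re

# Subset of the defaults under tools.bash.whitelist in this README.
COMMANDS = ["ls", "pwd", "echo", "wc", "sort", "uniq", "gh", "task"]
PATTERNS = [
    r"^git branch( --show-current)?$",
    r"^git status$",
    r"^git log --oneline -n [0-9]+$",
    r"^git push( --set-upstream)?( origin)?( [a-zA-Z0-9/_-]+)?$",
]


def is_whitelisted(command, commands=COMMANDS, patterns=PATTERNS):
    """Assumed rule: first word is a listed command, or the full line matches a pattern."""
    command = command.strip()
    if not command:
        return False
    if command.split()[0] in commands:
        return True
    return any(re.search(pattern, command) for pattern in patterns)


print(is_whitelisted("ls -la"))              # True
print(is_whitelisted("git status"))          # True
print(is_whitelisted("git status --short"))  # False: the pattern is anchored with $
print(is_whitelisted("rm -rf /"))            # False
```

This sketch does not reject shell metacharacters such as `;` or `&&`, so a real validator would need additional checks before executing anything.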

Read Tool

Read file content from the filesystem with optional line range specification.

Write Tool

Write content to files on the filesystem with security controls and directory creation support.

Parameters:

  • file_path (required): The path to the file to write
  • content (required): The content to write to the file
  • create_dirs (optional): Whether to create parent directories if they don't exist (default: true)
  • overwrite (optional): Whether to overwrite existing files (default: true)
  • format (optional): Output format - "text" or "json" (default: "text")

Features:

  • Directory Creation: Automatically creates parent directories when needed
  • Overwrite Control: Configurable behavior for existing files
  • Security Validation: Respects path exclusions and security restrictions
  • Performance Optimized: Efficient file writing with proper error handling

Security:

  • Approval Required: Write operations require approval by default (secure by default)
  • Path Exclusions: Respects configured excluded paths (e.g., .infer/ directory)
  • Pattern Matching: Supports glob patterns for path exclusions
  • Validation: Validates file paths and content before writing

Examples:

  • Create new file: file_path: "output.txt", content: "Hello, World!"
  • Write to subdirectory: file_path: "logs/app.log", content: "log entry", create_dirs: true
  • Safe overwrite: file_path: "config.json", content: "{...}", overwrite: false
WebSearch Tool

Search the web using DuckDuckGo or Google search engines to find information.

WebFetch Tool

Fetch content from whitelisted URLs or GitHub references using the format example.com.

Github Tool

Interact with GitHub API to fetch issues, pull requests, create comments, and create pull requests with authentication support. This is a standalone tool separate from WebFetch.

Parameters:

  • owner (required): Repository owner (username or organization)
  • repo (required): Repository name
  • resource (optional): Resource type to fetch or create (default: "issue")
    • issue: Fetch a specific issue
    • issues: Fetch a list of issues
    • pull_request: Fetch a specific pull request
    • comments: Fetch comments for an issue/PR
    • create_comment: Create a comment on an issue/PR
    • create_pull_request: Create a new pull request
  • issue_number (required for issue/pull_request/comments/create_comment): Issue or PR number
  • comment_body (required for create_comment): Comment body text
  • title (required for create_pull_request): Pull request title
  • body (optional for create_pull_request): Pull request body/description
  • head (required for create_pull_request): Head branch name
  • base (optional for create_pull_request): Base branch name (default: "main")
  • state (optional): Filter by state for issues list ("open", "closed", "all", default: "open")
  • per_page (optional): Number of items per page for lists (1-100, default: 30)

Features:

  • GitHub API Integration: Direct access to GitHub's REST API v3
  • Authentication: Supports GitHub personal access tokens via environment variables
  • Multiple Resources: Fetch issues, pull requests, comments, and create new content
  • Structured Data: Returns properly typed GitHub data structures
  • Error Handling: Comprehensive error handling with GitHub API error messages
  • Rate Limiting: Respects GitHub API rate limits
  • Security: Configurable timeout and response size limits
  • Environment Variables: Supports token resolution via %GITHUB_TOKEN% syntax
  • Security Controls: Owner validation for secure repository access

Configuration:

tools:
  github:
    enabled: true
    token: "%GITHUB_TOKEN%"  # Environment variable reference
    base_url: "https://api.github.com"
    owner: "your-username"  # Default owner for security
    repo: "your-repo"       # Default repository (optional)
    safety:
      max_size: 1048576  # 1MB
      timeout: 30        # 30 seconds
    require_approval: false
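
The %GITHUB_TOKEN% reference in the configuration above is resolved from the environment. A minimal Python sketch of that kind of substitution follows. The function name is hypothetical, and replacing an unset variable with an empty string is an assumption; the CLI may behave differently.

```python
import os
import re

# Matches references such as %GITHUB_TOKEN%.
_REFERENCE = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def resolve_env_refs(value, environ=None):
    """Replace each %NAME% with the variable's value; unset names become ''."""
    environ = os.environ if environ is None else environ
    return _REFERENCE.sub(lambda match: environ.get(match.group(1), ""), value)
```

With `GITHUB_TOKEN` exported in the shell, `resolve_env_refs("%GITHUB_TOKEN%")` returns the token, so the secret itself never has to be written into config.yaml.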

Examples:

  • Fetch specific issue: owner: "octocat", repo: "Hello-World", resource: "issue", issue_number: 1
  • List open issues: owner: "octocat", repo: "Hello-World", resource: "issues", state: "open", per_page: 10
  • Fetch pull request: owner: "octocat", repo: "Hello-World", resource: "pull_request", issue_number: 5
  • Get issue comments: owner: "octocat", repo: "Hello-World", resource: "comments", issue_number: 1
  • Create comment: owner: "octocat", repo: "Hello-World", resource: "create_comment", issue_number: 1, comment_body: "Great work!"
  • Create pull request: owner: "octocat", repo: "Hello-World", resource: "create_pull_request", title: "Add feature", body: "New feature implementation", head: "feature-branch", base: "main"
Delete Tool

Delete files or directories from the filesystem with security controls. Supports wildcard patterns for batch operations.

Parameters:

  • path (required): The path to the file or directory to delete
  • recursive (optional): Whether to delete directories recursively (default: false)
  • force (optional): Whether to force deletion (ignore non-existent files, default: false)

Features:

  • Wildcard Support: Delete multiple files using patterns like *.txt or temp/*
  • Recursive Deletion: Remove directories and their contents
  • Safety Controls: Respects configured path exclusions and security restrictions
  • Validation: Validates file paths and permissions before deletion

Security:

  • Approval Required: Delete operations require approval by default
  • Path Exclusions: Respects configured excluded paths for security
  • Pattern Matching: Supports glob patterns for path exclusions
  • Validation: Validates file paths and prevents deletion of protected directories

Examples:

  • Delete single file: path: "temp.txt"
  • Delete directory recursively: path: "temp/", recursive: true
  • Delete with wildcard: path: "*.log"
  • Force delete: path: "missing.txt", force: true
Edit Tool

Perform exact string replacements in files with security validation and preview support.

Parameters:

  • file_path (required): The path to the file to modify
  • old_string (required): The text to replace (must match exactly)
  • new_string (required): The text to replace it with
  • replace_all (optional): Replace all occurrences of old_string (default: false)

Features:

  • Exact Matching: Requires exact string matches for safety
  • Preview Support: Shows diff preview before applying changes
  • Atomic Operations: Either all changes succeed or none are applied
  • Security Validation: Respects path exclusions and file permissions

Security:

  • Read Tool Requirement: Requires Read tool to be used first on the file
  • Approval Required: Edit operations require approval by default
  • Path Exclusions: Respects configured excluded paths
  • Validation: Validates file paths and prevents editing protected files

Examples:

  • Single replacement: file_path: "config.txt", old_string: "port: 3000", new_string: "port: 8080"
  • Replace all occurrences: file_path: "script.py", old_string: "print", new_string: "logging.info", replace_all: true
MultiEdit Tool

Make multiple edits to a single file as one atomic operation: either all edits succeed or none are applied.

Parameters:

  • file_path (required): The path to the file to modify
  • edits (required): Array of edit operations to perform sequentially
    • old_string: The text to replace (must match exactly)
    • new_string: The text to replace it with
    • replace_all (optional): Replace all occurrences (default: false)

Features:

  • Atomic Operations: All edits succeed or none are applied
  • Sequential Processing: Edits are applied in the order provided
  • Preview Support: Shows comprehensive diff preview
  • Security Validation: Respects all security restrictions

Security:

  • Read Tool Requirement: Requires Read tool to be used first on the file
  • Approval Required: MultiEdit operations require approval by default
  • Path Exclusions: Respects configured excluded paths
  • Validation: Validates all edits before execution

Examples:

{
  "file_path": "config.yaml",
  "edits": [
    {
      "old_string": "port: 3000",
      "new_string": "port: 8080"
    },
    {
      "old_string": "debug: true",
      "new_string": "debug: false"
    }
  ]
}
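
The all-or-nothing behaviour can be sketched in Python by applying every edit to an in-memory copy and writing the file only if all of them succeed. This is an illustration under assumptions, not the tool's code: the function names are hypothetical, and treating a missing old_string as the only failure condition is a simplification.

```python
def apply_edits(text, edits):
    """Apply edits in order to a copy of text; raise if any edit cannot be applied."""
    result = text
    for index, edit in enumerate(edits):
        old, new = edit["old_string"], edit["new_string"]
        if not old or old not in result:
            raise ValueError(f"edit {index}: old_string not found: {old!r}")
        # A count of -1 replaces every occurrence; 1 replaces only the first.
        count = -1 if edit.get("replace_all", False) else 1
        result = result.replace(old, new, count)
    return result


def multi_edit(path, edits):
    """Write the file only after every edit has applied cleanly in memory."""
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    updated = apply_edits(original, edits)  # raises before anything is written
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(updated)
    return updated
```

Because edits run sequentially, a later edit sees the text produced by the earlier ones. If the second edit fails, the file on disk is left exactly as it was.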
Grep Tool

A powerful search tool with configurable backend (ripgrep or Go implementation).

Parameters:

  • pattern (required): The regular expression pattern to search for
  • path (optional): File or directory to search in (default: current directory)
  • output_mode (optional): Output mode - "content", "files_with_matches", or "count" (default: "files_with_matches")
  • -i (optional): Case insensitive search
  • -n (optional): Show line numbers in output
  • -A (optional): Number of lines to show after each match
  • -B (optional): Number of lines to show before each match
  • -C (optional): Number of lines to show before and after each match
  • glob (optional): Glob pattern to filter files (e.g., "*.js", "*.{ts,tsx}")
  • type (optional): File type to search (e.g., "js", "py", "rust")
  • multiline (optional): Enable multiline mode where patterns can span lines
  • head_limit (optional): Limit output to first N results

Features:

  • Dual Backend: Uses ripgrep when available for optimal performance, falls back to Go implementation
  • Full Regex Support: Supports complete regex syntax
  • Multiple Output Modes: Content matching, file lists, or count results
  • Context Lines: Show lines before and after matches
  • File Filtering: Filter by glob patterns or file types
  • Multiline Matching: Patterns can span multiple lines
  • Automatic Exclusions: Automatically excludes common directories and files (.git, node_modules, .infer, etc.)
  • Gitignore Support: Respects .gitignore patterns in your repository
  • User-Configurable Exclusions: Additional exclusion patterns can be configured by users (not by the LLM)

Security & Exclusions:

  • Path Exclusions: Respects configured excluded paths and patterns
  • Automatic Exclusions: The tool automatically excludes:
    • Version control directories (.git, .svn, etc.)
    • Dependency directories (node_modules, vendor, etc.)
    • Build artifacts (dist, build, target, etc.)
    • Cache and temp files (.cache, *.tmp, *.log, etc.)
    • Security-sensitive files (.env, secrets, etc.)
  • Gitignore Integration: Automatically reads and respects .gitignore patterns
  • Validation: Validates search patterns and file access
  • Performance Limits: Configurable result limits to prevent overwhelming output

Examples:

  • Basic search: pattern: "error", output_mode: "content"
  • Case insensitive: pattern: "TODO", -i: true, output_mode: "content"
  • With context: pattern: "function", -C: 3, output_mode: "content"
  • File filtering: pattern: "interface", glob: "*.go", output_mode: "files_with_matches"
  • Count results: pattern: "log.*Error", output_mode: "count"
TodoWrite Tool

Create and manage structured task lists for LLM-assisted development workflows.

Parameters:

  • todos (required): Array of todo items with status tracking
    • id (required): Unique identifier for the task
    • content (required): Task description
    • status (required): Task status - "pending", "in_progress", or "completed"

Features:

  • Structured Task Management: Organized task tracking with status
  • Real-time Updates: Mark tasks as in_progress/completed during execution
  • Progress Tracking: Visual representation of task completion
  • LLM Integration: Designed for LLM-assisted development workflows

Security:

  • No File System Access: Pure memory-based operation
  • Validation: Validates todo structure and status values
  • Size Limits: Configurable limits on todo list size

Examples:

{
  "todos": [
    {
      "id": "1",
      "content": "Update README with new tool documentation",
      "status": "in_progress"
    },
    {
      "id": "2",
      "content": "Add test cases for new features",
      "status": "pending"
    }
  ]
}
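
The structure and status validation mentioned above could look roughly like the following Python sketch. It is an assumption-laden illustration: the function name is hypothetical, and the specific checks (required fields, allowed statuses, unique ids) are inferred from the parameter list rather than taken from the CLI's source.

```python
VALID_STATUSES = {"pending", "in_progress", "completed"}


def validate_todos(payload):
    """Return a list of problems; an empty list means the payload is well formed."""
    todos = payload.get("todos")
    if not isinstance(todos, list):
        return ["'todos' must be an array"]
    problems, seen = [], set()
    for index, todo in enumerate(todos):
        for field in ("id", "content", "status"):
            if not todo.get(field):
                problems.append(f"todo {index}: missing '{field}'")
        status = todo.get("status")
        if status and status not in VALID_STATUSES:
            problems.append(f"todo {index}: invalid status {status!r}")
        todo_id = todo.get("id")
        if todo_id is not None:
            if todo_id in seen:
                problems.append(f"todo {index}: duplicate id {todo_id!r}")
            seen.add(todo_id)
    return problems
```

Passing the example payload above to `validate_todos` yields an empty list, while a status such as "done" would be reported as invalid.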

Security Notes:

  • All tools respect configured safety settings and exclusion patterns
  • Commands require approval when safety approval is enabled
  • File access is restricted to allowed paths and excludes sensitive directories

Configuration

The CLI supports a 2-layer configuration system that allows for both user-level and project-level configuration with proper precedence handling. For detailed configuration documentation and examples, see CONFIG.md.

Configuration Layers
  1. Userspace Configuration (~/.infer/config.yaml)

    • Global configuration for the user across all projects
    • Used as a fallback when no project-level configuration exists
    • Can be created with: infer init --userspace or infer config init --userspace
  2. Project Configuration (.infer/config.yaml in current directory)

    • Project-specific configuration that takes precedence over userspace config
    • Default location for most commands
    • Can be created with: infer init or infer config init
Configuration Precedence

Configuration values are merged with the following precedence (highest to lowest):

  1. Project-level config (.infer/config.yaml) - Highest Priority
  2. Userspace config (~/.infer/config.yaml)
  3. Built-in defaults - Lowest Priority

Example: If your userspace config sets agent.model: "anthropic/claude-4" and your project config sets agent.model: "deepseek/deepseek-chat", the project config wins and deepseek/deepseek-chat will be used. However, if the project config doesn't specify a model but does specify other settings, the userspace model will be preserved while project settings take precedence for their specific values.
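
The precedence rules can be illustrated with a small Python sketch of a layered merge. It is not the CLI's loader: the function names are hypothetical, and the choice to merge nested mappings key by key while replacing lists wholesale is an assumption.

```python
def deep_merge(base, override):
    """Return base updated with override; nested dicts are merged key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def effective_config(defaults, userspace, project):
    """Apply layers from lowest to highest priority: defaults, userspace, project."""
    return deep_merge(deep_merge(defaults, userspace), project)


defaults = {"agent": {"model": "", "max_turns": 50}}
userspace = {"agent": {"model": "anthropic/claude-4"}}

# Project sets a model: the project value wins.
print(effective_config(defaults, userspace, {"agent": {"model": "deepseek/deepseek-chat"}}))
# {'agent': {'model': 'deepseek/deepseek-chat', 'max_turns': 50}}

# Project sets only another key: the userspace model is preserved.
print(effective_config(defaults, userspace, {"agent": {"max_turns": 10}}))
# {'agent': {'model': 'anthropic/claude-4', 'max_turns': 10}}
```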

Usage Examples
# Create userspace configuration (global fallback)
infer init --userspace

# Create project configuration (takes precedence)
infer init

# Both configurations will be automatically merged when commands are run

You can also specify a custom config file with the --config flag, which overrides the automatic 2-layer loading.

Default Configuration
gateway:
  url: http://localhost:8080
  api_key: ""
  timeout: 200
client:
  timeout: 200
  retry:
    enabled: true
    max_attempts: 3
    initial_backoff_sec: 5
    max_backoff_sec: 60
    backoff_multiplier: 2
    retryable_status_codes: [400, 408, 429, 500, 502, 503, 504]
logging:
  debug: false
tools:
  enabled: true # Tools are enabled by default with safe read-only commands
  sandbox:
    directories: [".", "/tmp"] # Allowed directories for tool operations
    protected_paths: # Paths excluded from tool access for security
      - .infer/
      - .git/
      - "*.env" # Quoted: an unquoted leading * is parsed as a YAML alias
  bash:
    enabled: true
    whitelist:
      commands: # Exact command matches
        - ls
        - pwd
        - echo
        - wc
        - sort
        - uniq
        - gh
        - task
      patterns: # Regex patterns for more complex commands
        - ^git branch( --show-current)?$
        - ^git checkout -b [a-zA-Z0-9/_-]+( [a-zA-Z0-9/_-]+)?$
        - ^git checkout [a-zA-Z0-9/_-]+
        - ^git add [a-zA-Z0-9/_.-]+
        - ^git diff+
        - ^git remote -v$
        - ^git status$
        - ^git log --oneline -n [0-9]+$
        - ^git commit -m ".+"$
        - ^git push( --set-upstream)?( origin)?( [a-zA-Z0-9/_-]+)?$
  read:
    enabled: true
    require_approval: false
  write:
    enabled: true
    require_approval: true # Write operations require approval by default for security
  edit:
    enabled: true
    require_approval: true # Edit operations require approval by default for security
  delete:
    enabled: true
    require_approval: true # Delete operations require approval by default for security
  grep:
    enabled: true
    backend: auto # "auto", "ripgrep", or "go"
    require_approval: false
  tree:
    enabled: true
    require_approval: false
  web_fetch:
    enabled: true
    whitelisted_domains:
      - golang.org
    safety:
      max_size: 4096 # 4KB
      timeout: 30 # 30 seconds
      allow_redirect: true
    cache:
      enabled: true
      ttl: 3600 # 1 hour
      max_size: 52428800 # 50MB
  web_search:
    enabled: true
    default_engine: duckduckgo
    max_results: 10
    engines:
      - duckduckgo
      - google
    timeout: 10
  todo_write:
    enabled: true
    require_approval: false
  github:
    enabled: true
    token: "%GITHUB_TOKEN%"
    base_url: "https://api.github.com"
    owner: ""
    safety:
      max_size: 1048576  # 1MB
      timeout: 30        # 30 seconds
    require_approval: false
  safety:
    require_approval: true
compact:
  output_dir: .infer # Directory for compact command exports
  summary_model: "" # Model to use for summarization (optional)
agent:
  model: "" # Default model for agent operations
  system_prompt: | # System prompt for agent sessions
    Autonomous software engineering agent. Execute tasks iteratively until completion.

    IMPORTANT: You NEVER push to main or master or to the current branch - instead you create a branch and push to a branch.
    IMPORTANT: You NEVER read all the README.md - start by reading 300 lines

    RULES:
    - Security: Defensive only (analysis, detection, docs)
    - Style: no emojis/comments unless asked, use conventional commits
    - Code: Follow existing patterns, check deps, no secrets
    - Tasks: Use TodoWrite, mark progress immediately
    - Chat exports: Read only "## Summary" to "---" section
    - Tools: Batch calls, prefer Grep for search

    WORKFLOW:
    When asked to implement features or fix issues:
    1. Plan with TodoWrite
    2. Search codebase to understand context
    3. Implement solution
    4. Run tests with: task test
    5. Run lint/format with: task fmt and task lint
    6. Commit changes (only if explicitly asked)
    7. Create a pull request (only if explicitly asked)
  system_reminders:
    enabled: true
    interval: 4
    reminder_text: |
      System reminder text for maintaining context
  verbose_tools: false
  max_turns: 50 # Maximum number of turns for agent sessions
  max_tokens: 4096 # The maximum number of tokens that can be generated per request
  optimization:
    enabled: false
    max_history: 10
    compact_threshold: 20
    truncate_large_outputs: true
    skip_redundant_confirmations: true
chat:
  theme: tokyo-night

Configuration Options

Gateway Settings:

  • gateway.url: The URL of the inference gateway
  • gateway.api_key: API key for authentication (if required)
  • gateway.timeout: Request timeout in seconds

Client Settings:

  • client.timeout: HTTP client timeout in seconds
  • client.retry.enabled: Enable automatic retries for failed requests
  • client.retry.max_attempts: Maximum number of retry attempts
  • client.retry.initial_backoff_sec: Initial delay between retries in seconds
  • client.retry.max_backoff_sec: Maximum delay between retries in seconds
  • client.retry.backoff_multiplier: Backoff multiplier for exponential delay
  • client.retry.retryable_status_codes: HTTP status codes that trigger retries (e.g., [400, 408, 429, 500, 502, 503, 504])
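
The retry keys above describe an exponential backoff schedule. The following is a minimal sketch of how such a schedule is typically computed, not the CLI's actual implementation: the `RetryConfig` struct, the `backoffDelay` function, and the sample values are all hypothetical and only mirror the names of the `client.retry.*` keys.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// RetryConfig mirrors the client.retry.* keys. The struct is hypothetical
// and exists only for this illustration.
type RetryConfig struct {
	MaxAttempts       int
	InitialBackoffSec float64
	MaxBackoffSec     float64
	BackoffMultiplier float64
}

// backoffDelay returns the wait before retry number attempt (1-based):
// initial * multiplier^(attempt-1), capped at the configured maximum.
func backoffDelay(cfg RetryConfig, attempt int) time.Duration {
	secs := cfg.InitialBackoffSec * math.Pow(cfg.BackoffMultiplier, float64(attempt-1))
	if secs > cfg.MaxBackoffSec {
		secs = cfg.MaxBackoffSec
	}
	return time.Duration(secs * float64(time.Second))
}

func main() {
	// Illustrative values, not the CLI's documented defaults.
	cfg := RetryConfig{MaxAttempts: 5, InitialBackoffSec: 1, MaxBackoffSec: 10, BackoffMultiplier: 2}
	for attempt := 1; attempt <= cfg.MaxAttempts; attempt++ {
		fmt.Printf("retry %d: wait %s\n", attempt, backoffDelay(cfg, attempt))
	}
}
```

With these sample values the waits grow as 1s, 2s, 4s, 8s and are then capped at 10s by the maximum backoff.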

Logging Settings:

  • logging.debug: Enable debug logging for verbose output

Tool Settings:

  • tools.enabled: Enable/disable tool execution for LLMs (default: true)
  • tools.sandbox.directories: Allowed directories for tool operations (default: [".", "/tmp"])
  • tools.sandbox.protected_paths: Paths excluded from tool access for security (default: [".infer/", ".git/", "*.env"])
  • tools.whitelist.commands: List of allowed commands (supports arguments)
  • tools.whitelist.patterns: Regex patterns for complex command validation
  • tools.safety.require_approval: Prompt user before executing any command (default: true)
  • Individual tool settings: Each tool (Bash, Read, Write, Edit, Delete, Grep, Tree, WebFetch, WebSearch, TodoWrite) has:
    • enabled: Enable/disable the specific tool
    • require_approval: Override global safety setting for this tool (optional)
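
To make the sandbox settings more concrete, here is a rough sketch of how a path could be checked against `tools.sandbox.directories` and `tools.sandbox.protected_paths`. The matching rules are assumptions made for illustration (patterns ending in `/` match a directory name anywhere in the path, other patterns are globs on the file name); the CLI's real rules may differ, and the function names `insideSandbox` and `isProtected` are hypothetical. Symlinks are not resolved in this sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// insideSandbox reports whether path lies within one of the allowed directories.
func insideSandbox(path string, dirs []string) bool {
	absPath, err := filepath.Abs(path)
	if err != nil {
		return false
	}
	for _, dir := range dirs {
		absDir, err := filepath.Abs(dir)
		if err != nil {
			continue
		}
		rel, err := filepath.Rel(absDir, absPath)
		// A relative path that climbs out with ".." is outside the directory.
		if err == nil && rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
			return true
		}
	}
	return false
}

// isProtected applies two assumed rules: a pattern ending in "/" matches a
// directory name anywhere in the path; any other pattern is a glob on the file name.
func isProtected(path string, patterns []string) bool {
	parts := strings.Split(filepath.ToSlash(filepath.Clean(path)), "/")
	for _, p := range patterns {
		if strings.HasSuffix(p, "/") {
			dirName := strings.TrimSuffix(p, "/")
			for _, part := range parts {
				if part == dirName {
					return true
				}
			}
			continue
		}
		if ok, _ := filepath.Match(p, filepath.Base(path)); ok {
			return true
		}
	}
	return false
}

func main() {
	dirs := []string{".", "/tmp"}                      // documented default directories
	protected := []string{".infer/", ".git/", "*.env"} // documented default protected paths
	for _, p := range []string{"/tmp/notes.txt", "/tmp/.git/config", "/tmp/prod.env", ".infer/config.yaml"} {
		allowed := insideSandbox(p, dirs) && !isProtected(p, protected)
		fmt.Printf("%-22s allowed=%v\n", p, allowed)
	}
}
```

Under these assumed rules only `/tmp/notes.txt` is allowed; the other three paths are inside the sandbox but hit a protected pattern.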

Compact Settings:

  • compact.output_dir: Directory for compact command exports (default: ".infer")

Chat Settings:

  • chat.default_model: Default model for chat sessions (skips model selection when set)
  • chat.system_prompt: System prompt included with every chat session
  • chat.system_reminders.enabled: Enable/disable system reminders (default: true)
  • chat.system_reminders.interval: Number of messages between reminders (default: 10)
  • chat.system_reminders.text: Custom reminder text to provide contextual guidance

Agent Settings:

  • agent.model: Default model for agent operations
  • agent.system_prompt: System prompt for agent sessions
  • agent.system_reminders.enabled: Enable system reminders during agent sessions
  • agent.system_reminders.interval: Number of messages between reminders (default: 4)
  • agent.system_reminders.reminder_text: Custom reminder text for agent context
  • agent.verbose_tools: Enable verbose tool output (default: false)
  • agent.max_turns: Maximum number of turns for agent sessions (default: 50)
  • agent.max_tokens: Maximum tokens per agent request (default: 8192)
  • agent.optimization.enabled: Enable optimization features (default: false)
  • agent.optimization.max_history: Maximum conversation history to maintain (default: 10)
  • agent.optimization.compact_threshold: Threshold for compacting conversation (default: 20)
  • agent.optimization.truncate_large_outputs: Truncate large tool outputs (default: true)
  • agent.optimization.skip_redundant_confirmations: Skip redundant confirmation messages (default: true)

Web Search Settings:

  • web_search.enabled: Enable/disable web search tool for LLMs (default: true)
  • web_search.default_engine: Default search engine to use ("duckduckgo" or "google", default: "duckduckgo")
  • web_search.max_results: Maximum number of search results to return (1-50, default: 10)
  • web_search.engines: List of available search engines
  • web_search.timeout: Search timeout in seconds (default: 10)

Chat Interface Settings:

  • chat.theme: Chat interface theme name (default: "tokyo-night")
    • Available themes: tokyo-night, github-light, dracula
    • Can be changed during chat using /theme [theme-name] shortcut
    • Affects colors and styling of the chat interface

Web Search API Setup (Optional)

Both search engines work out of the box, but for better reliability and performance in production, you can configure API keys:

Google Custom Search Engine:

  1. Create a Custom Search Engine:

    • Go to Google Programmable Search Engine
    • Click "Add" to create a new search engine
    • Enter a name for your search engine
    • In "Sites to search", enter * to search the entire web
    • Click "Create"
  2. Get your Search Engine ID:

    • In your search engine settings, note the "Search engine ID" (cx parameter)
  3. Get a Google API Key:

    • Go to the Google Cloud Console
    • Create a new project or select an existing one
    • Enable the "Custom Search JSON API"
    • Go to "Credentials" and create an API key
    • Restrict the API key to the Custom Search JSON API for security
  4. Configure Environment Variables:

    export GOOGLE_SEARCH_API_KEY="your_api_key_here"
    export GOOGLE_SEARCH_ENGINE_ID="your_search_engine_id_here"
    

DuckDuckGo API (Optional):

export DUCKDUCKGO_SEARCH_API_KEY="your_api_key_here"

Note: Both engines have built-in fallback methods that work without API configuration. However, using official APIs provides better reliability and performance for production use.
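
To check that the two Google environment variables are set and usable, you can build a Custom Search JSON API request URL yourself. This is a sketch only: how the CLI calls Google internally is not shown in this README, and `googleSearchURL` is a hypothetical helper. The endpoint and the `key`, `cx`, `q`, and `num` query parameters belong to Google's public Custom Search JSON API.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// googleSearchURL builds a Custom Search JSON API request URL.
// Note that the API itself accepts at most 10 results per request (num).
func googleSearchURL(apiKey, engineID, query string, num int) string {
	params := url.Values{}
	params.Set("key", apiKey)  // GOOGLE_SEARCH_API_KEY
	params.Set("cx", engineID) // GOOGLE_SEARCH_ENGINE_ID
	params.Set("q", query)
	params.Set("num", fmt.Sprint(num))
	return "https://www.googleapis.com/customsearch/v1?" + params.Encode()
}

func main() {
	apiKey := os.Getenv("GOOGLE_SEARCH_API_KEY")
	engineID := os.Getenv("GOOGLE_SEARCH_ENGINE_ID")
	if apiKey == "" || engineID == "" {
		fmt.Println("Google credentials are not set")
		return
	}
	// The URL contains your API key, so avoid pasting it into logs or issues.
	fmt.Println(googleSearchURL(apiKey, engineID, "golang context package", 10))
}
```

Fetching the printed URL with any HTTP client should return JSON results if the key and engine ID are valid.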

Global Flags

  • -c, --config: Config file (default is ./.infer/config.yaml)
  • -v, --verbose: Verbose output
  • -h, --help: Help for any command

Examples

Basic Workflow
# Initialize project configuration
infer init

# Check if gateway is running
infer status

# Start interactive chat
infer chat

Configuration Management
# Use custom config file
infer --config ./my-config.yaml status

# Get verbose output
infer --verbose status

# Set default model for chat sessions
infer config agent set-model openai/gpt-4-turbo

# Set system prompt
infer config agent set-system "You are a helpful assistant."

# Enable tool execution with safety approval
infer config tools enable
infer config tools safety enable

# Configure sandbox directories for security
infer config tools sandbox add "/home/user/projects"
infer config tools sandbox add "/tmp/work"

# Add protected paths to prevent accidental modification
infer config tools sandbox add ".env"
infer config tools sandbox add ".git/"

# Configure individual tool safety settings
infer config tools safety set Read disabled    # Skip approval for Read tool
infer config tools safety set Write enabled    # Require approval for Write tool
infer config tools safety set Delete enabled   # Require approval for Delete tool
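
The per-tool safety commands above set an optional `require_approval` override on top of the global `tools.safety.require_approval` value. A minimal sketch of that resolution logic, assuming an unset per-tool value falls back to the global one (the `ToolConfig` type and `needsApproval` function are hypothetical):

```go
package main

import "fmt"

// ToolConfig is a hypothetical stand-in for one tool's settings block.
// RequireApproval is a pointer so that "not set" (nil) differs from false.
type ToolConfig struct {
	Enabled         bool
	RequireApproval *bool
}

// needsApproval resolves the effective setting: a per-tool value wins when
// present, otherwise the global tools.safety.require_approval applies.
func needsApproval(globalRequire bool, tool ToolConfig) bool {
	if tool.RequireApproval != nil {
		return *tool.RequireApproval
	}
	return globalRequire
}

func main() {
	skip := false
	read := ToolConfig{Enabled: true, RequireApproval: &skip} // like "safety set Read disabled"
	other := ToolConfig{Enabled: true}                        // no per-tool override
	fmt.Println("Read needs approval:", needsApproval(true, read))
	fmt.Println("other needs approval:", needsApproval(true, other))
}
```

Using a pointer (or any other "unset" marker) is what lets a tool explicitly opt out of approval while the global default stays enabled.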

Development

Building
go build -o infer .

Testing
go test ./...

Dependencies

Extensible Shortcuts System

The CLI provides an extensible shortcuts system that allows you to quickly execute common commands with /shortcut-name syntax.

Built-in Shortcuts

Core Shortcuts
  • /clear - Clear conversation history
  • /exit - Exit the chat session
  • /help [shortcut] - Show available shortcuts or specific shortcut help
  • /switch [model] - Switch to a different model
  • /theme [theme-name] - Switch chat interface theme or list available themes
  • /config <show|get|set|reload> [key] [value] - Manage configuration settings
  • /compact [format] - Export conversation to markdown

Git Shortcuts
  • /git <command> [args...] - Execute git commands (supports commit, push, status, etc.)
  • /git commit [flags] - NEW: Commit staged changes with AI-generated message
  • /git push [remote] [branch] [flags] - NEW: Push commits to remote repository

The git shortcuts provide intelligent commit message generation using AI when no message is provided with /git commit.

User-Defined Shortcuts

You can create custom shortcuts by adding YAML configuration files in the .infer/shortcuts/ directory.

Configuration File Format

Create files named custom-*.yaml (e.g., custom-1.yaml, custom-dev.yaml) in .infer/shortcuts/:

shortcuts:
  - name: "tests"
    description: "Run all tests in the project"
    command: "go"
    args: ["test", "./..."]
    working_dir: "."  # Optional: set working directory

  - name: "build"
    description: "Build the project"
    command: "go"
    args: ["build", "-o", "infer", "."]

  - name: "lint"
    description: "Run linter on the codebase"
    command: "golangci-lint"
    args: ["run"]

Configuration Fields
  • name (required): The shortcut name (used as /name)
  • description (required): Human-readable description shown in /help
  • command (required): The executable command to run
  • args (optional): Array of arguments to pass to the command
  • working_dir (optional): Working directory for the command (defaults to current)

Using Shortcuts

With the configuration above, you can use:

  • /tests - Runs go test ./...
  • /build - Runs go build -o infer .
  • /lint - Runs golangci-lint run

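You can also pass additional arguments, which are appended after the configured args:

  • /tests -v - Runs go test ./... -v
  • /build --race - Runs go build -o infer . --race (note that go build expects flags before the package path, so a flag such as -race belongs in the shortcut's args rather than at the end)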
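
The appending behaviour described above can be sketched as follows. This is an illustration rather than the CLI's code: the `Shortcut` struct simply mirrors the documented YAML fields, and `commandLine` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// Shortcut mirrors the fields of one entry in a custom-*.yaml file.
type Shortcut struct {
	Name, Description, Command, WorkingDir string
	Args                                   []string
}

// commandLine returns the full argument vector: the command, its configured
// args, then any extra args typed after the shortcut name.
func commandLine(s Shortcut, extra []string) []string {
	argv := append([]string{s.Command}, s.Args...)
	return append(argv, extra...)
}

func main() {
	tests := Shortcut{Name: "tests", Command: "go", Args: []string{"test", "./..."}, WorkingDir: "."}
	// Simulates typing "/tests -v" in the chat; the first field is the shortcut name.
	fields := strings.Fields("/tests -v")
	fmt.Println(strings.Join(commandLine(tests, fields[1:]), " "))
}
```

Because extra arguments always land at the end, they work best with commands that accept trailing flags, such as go test.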

Example Custom Shortcuts

Here are some useful shortcuts you might want to add:

Development Shortcuts (custom-dev.yaml)
shortcuts:
  - name: "fmt"
    description: "Format all Go code"
    command: "go"
    args: ["fmt", "./..."]

  - name: "mod tidy"
    description: "Tidy up go modules"
    command: "go"
    args: ["mod", "tidy"]

  - name: "version"
    description: "Show current version"
    command: "git"
    args: ["describe", "--tags", "--always", "--dirty"]

Docker Shortcuts (custom-docker.yaml)
shortcuts:
  - name: "docker build"
    description: "Build Docker image"
    command: "docker"
    args: ["build", "-t", "myapp", "."]

  - name: "docker run"
    description: "Run Docker container"
    command: "docker"
    args: ["run", "-p", "8080:8080", "myapp"]

Project-Specific Shortcuts (custom-project.yaml)
shortcuts:
  - name: "migrate"
    description: "Run database migrations"
    command: "./scripts/migrate.sh"
    working_dir: "."

  - name: "seed"
    description: "Seed database with test data"
    command: "go"
    args: ["run", "cmd/seed/main.go"]

Tips

  1. File Organization: Use descriptive names for your config files (e.g., custom-dev.yaml, custom-docker.yaml)
  2. Command Discovery: Use /help to see all available shortcuts including your custom ones
  3. Error Handling: If a custom shortcut fails to load, it will be skipped with a warning
  4. Reloading: Restart the chat session to reload custom shortcuts after making changes
  5. Security: Be careful with custom shortcuts as they execute system commands

Troubleshooting

  • Shortcut not appearing: Check YAML syntax and file naming (custom-*.yaml)
  • Command not found: Ensure the command is available in your PATH
  • Permission denied: Check file permissions and executable rights
  • Invalid YAML: Use a YAML validator to check your configuration syntax
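
When a shortcut does not appear, a quick way to rule out a naming problem is to test the file name against the documented custom-*.yaml pattern. The sketch below assumes the pattern is matched strictly against the base file name; whether the CLI also accepts other extensions such as .yml is not stated in this README, and `isShortcutFile` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// isShortcutFile reports whether the base name of a path matches custom-*.yaml.
func isShortcutFile(name string) bool {
	ok, err := filepath.Match("custom-*.yaml", filepath.Base(name))
	return err == nil && ok
}

func main() {
	names := []string{
		".infer/shortcuts/custom-dev.yaml",
		".infer/shortcuts/custom-docker.yml",
		".infer/shortcuts/dev.yaml",
	}
	for _, name := range names {
		fmt.Printf("%-36s matches=%v\n", name, isShortcutFile(name))
	}
}
```

Only the first name matches here; the other two miss either the .yaml extension or the custom- prefix.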

License

This project is licensed under the MIT License.
