cli

command module

v0.50.0 (latest) | Published: Sep 17, 2025 | License: MIT | Imports: 1 | Imported by: 0

Repository

github.com/inference-gateway/cli

Inference Gateway CLI

A powerful command-line interface for managing and interacting with the Inference Gateway. This CLI provides tools for configuration, monitoring, and management of inference services.

⚠️ Warning

Early Development Stage: This project is in its early development stage and breaking changes are expected until it reaches a stable version.

Always use pinned versions by specifying a specific version tag when downloading binaries or using install scripts.

Features

Status Monitoring: Check gateway health and resource usage
Interactive Chat: Chat with models using an interactive interface
Conversation History: Store and retrieve past conversations with multiple storage backends
- Conversation Storage - Detailed storage backend documentation
- Conversation Title Generation - AI-powered title generation system
Configuration Management: Manage gateway settings via YAML config
Project Initialization: Set up local project configurations
Tool Execution: LLMs can execute whitelisted commands and tools including:
- Bash: Execute safe shell commands
- Read: Read file contents with optional line ranges
- Write: Write content to files with security controls
- Grep: Fast ripgrep-powered search with regex support and multiple output modes
- WebSearch: Search the web using DuckDuckGo or Google
- WebFetch: Fetch content from whitelisted URLs
- Github: Interact with GitHub API to fetch issues, pull requests, and create content
- Tree: Display directory structure with polyfill support
- Delete: Delete files and directories with security controls
- Edit: Perform exact string replacements in files
- MultiEdit: Make multiple edits to a file in one atomic operation
- TodoWrite: Create and manage structured task lists

Installation

Using Go Install

go install github.com/inference-gateway/cli@latest

Using Container Image

For containerized environments, you can use the official container image:

# Run the CLI directly
docker run --rm -it ghcr.io/inference-gateway/cli:latest --help

# With volume mount for config persistence
docker run --rm -it -v ~/.infer:/home/infer/.infer ghcr.io/inference-gateway/cli:latest

# Example: Run chat command
docker run --rm -it -v ~/.infer:/home/infer/.infer ghcr.io/inference-gateway/cli:latest chat

Using specific version:

docker run --rm -it ghcr.io/inference-gateway/cli:0.48.12

Available architectures: linux/amd64, linux/arm64

Using Install Script

For quick installation, you can use our install script:

Unix/macOS/Linux:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash

With specific version:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --version v0.1.1

Custom install directory:

curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --install-dir $HOME/.local/bin

The install script will:

Detect your operating system and architecture automatically
Download the appropriate binary from GitHub releases
Install to /usr/local/bin by default (or a custom directory with --install-dir)
Make the binary executable
Verify the installation

Manual Download

Download the latest release binary for your platform from the releases page.

Verifying Release Binaries

All release binaries are signed with Cosign for supply chain security. You can verify the integrity and authenticity of downloaded binaries using the following steps:

1. Download the binary, checksums, and signature files:

# Download binary (replace with your platform)
curl -L -o infer-darwin-amd64 \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/infer-darwin-amd64

# Download checksums and signature files
curl -L -o checksums.txt \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt
curl -L -o checksums.txt.pem \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt.pem
curl -L -o checksums.txt.sig \
  https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt.sig

2. Verify SHA256 checksum:

# Calculate checksum of downloaded binary
shasum -a 256 infer-darwin-amd64

# Compare with checksums in checksums.txt
grep infer-darwin-amd64 checksums.txt

3. Verify Cosign signature (requires Cosign to be installed):

# Decode base64 encoded certificate
cat checksums.txt.pem | base64 -d > checksums.txt.pem.decoded

# Verify the signature
cosign verify-blob \
  --certificate checksums.txt.pem.decoded \
  --signature checksums.txt.sig \
  --certificate-identity "https://github.com/inference-gateway/cli/.github/workflows/release.yml@refs/heads/main" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  checksums.txt

4. Make binary executable and install:

chmod +x infer-darwin-amd64
sudo mv infer-darwin-amd64 /usr/local/bin/infer

Note: Replace v0.29.1 with the desired release version and infer-darwin-amd64 with your platform's binary name.
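
The checksum comparison in step 2 can also be automated. The following is a minimal Go sketch, not part of the CLI: it assumes checksums.txt uses the usual `<hex digest>  <file name>` layout produced by shasum, and the function names (`sha256Hex`, `expectedChecksum`, `verify`) are illustrative. It uses in-memory stand-ins so that it runs on its own; in practice you would load the binary and checksums.txt with `os.ReadFile`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// sha256Hex returns the lowercase hex SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// expectedChecksum looks up name in checksums.txt-style content, assuming
// one "<hex digest>  <file name>" entry per line (the shasum layout).
func expectedChecksum(checksums, name string) (string, bool) {
	for _, line := range strings.Split(checksums, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), true
		}
	}
	return "", false
}

// verify returns an error unless data hashes to the digest recorded for name.
func verify(data []byte, checksums, name string) error {
	want, ok := expectedChecksum(checksums, name)
	if !ok {
		return fmt.Errorf("no checksum entry for %s", name)
	}
	if got := sha256Hex(data); got != want {
		return fmt.Errorf("checksum mismatch for %s: got %s, want %s", name, got, want)
	}
	return nil
}

func main() {
	// In-memory stand-ins; in practice load both files with os.ReadFile.
	binary := []byte("demo binary contents")
	checksums := sha256Hex(binary) + "  infer-darwin-amd64\n"

	if err := verify(binary, checksums, "infer-darwin-amd64"); err != nil {
		fmt.Println("FAILED:", err)
		return
	}
	fmt.Println("checksum OK")
}
```

A matching checksum only shows that the binary agrees with checksums.txt; the Cosign step is what ties checksums.txt to the release workflow.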

Build from Source

git clone https://github.com/inference-gateway/cli.git
cd cli
go build -o infer .

Quick Start

Initialize project configuration:
```
infer init --model deepseek/deepseek-chat
```
Using --model is recommended as it enables AI project analysis and generates a comprehensive AGENTS.md file tailored to your specific project.
Check gateway status:
```
infer status
```
Start an interactive chat:
```
infer chat
```

Commands

`infer init`

Initialize a new project with Inference Gateway CLI. This creates:

.infer/ directory with:
- config.yaml - Main configuration file for the project
- .gitignore - Ensures sensitive files are not committed to version control
AGENTS.md - AI-generated project documentation in the repository root (only when --model is specified)

This is the recommended command to start working with Inference Gateway CLI in a new project.

Options:

--overwrite: Overwrite existing files if they already exist
--userspace: Initialize configuration in user home directory (~/.infer/)

Examples:

# Initialize project-level configuration (default)
infer init
infer init --overwrite

# Initialize userspace configuration (global fallback)
infer init --userspace

`infer config`

Manage CLI configuration settings including models, system prompts, and tools.

`infer config init`

Initialize a new .infer/config.yaml configuration file in the current directory. This creates only the configuration file with default settings.

For complete project initialization, use infer init instead.

Options:

--overwrite: Overwrite existing configuration file
--userspace: Initialize configuration in user home directory (~/.infer/)

Examples:

# Initialize project-level configuration (default)
infer config init
infer config init --overwrite

# Initialize userspace configuration (global fallback)
infer config init --userspace

`infer config agent set-model`

Set the default model for chat sessions. When set, chat sessions will automatically use this model without showing the model selection prompt.

Examples:

infer config agent set-model openai/gpt-4-turbo
infer config agent set-model anthropic/claude-opus-4-1-20250805

`infer config agent set-system`

Set a system prompt that will be included with every chat session, providing context and instructions to the AI model.

Examples:

infer config agent set-system "You are a helpful assistant."
infer config agent set-system "You are a Go programming expert."

`infer config tools`

Manage tool execution settings for LLMs, including enabling/disabling tools, managing whitelists, and security settings.

Subcommands:

enable: Enable tool execution for LLMs
disable: Disable tool execution for LLMs
list [--format text|json]: List whitelisted commands and patterns
validate <command>: Validate if a command is whitelisted
exec <command> [--format text|json]: Execute a whitelisted command directly
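safety: Manage safety approval settings
- enable: Enable safety approval prompts
- disable: Disable safety approval prompts
- status: Show current safety approval status
- set <tool> enabled|disabled: Require or skip approval for a single tool
- unset <tool>: Remove the tool-specific setting and fall back to the global one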
sandbox: Manage sandbox directories for security
- list: List all sandbox directories
- add <path>: Add a protected path to the sandbox
- remove <path>: Remove a protected path from the sandbox

Examples:

# Enable/disable tool execution
infer config tools enable
infer config tools disable

# List whitelisted commands
infer config tools list
infer config tools list --format json

# Validate and execute commands
infer config tools validate "ls -la"
infer config tools exec "git status"

# Manage global safety settings (approval prompts)
infer config tools safety enable   # Enable approval prompts for all tool execution
infer config tools safety disable  # Disable approval prompts (execute tools immediately)
infer config tools safety status   # Show current safety approval status

# Manage tool-specific safety settings (granular control)
infer config tools safety set Bash enabled        # Require approval for Bash tool only
infer config tools safety set WebSearch disabled  # Skip approval for WebSearch tool
infer config tools safety unset Bash              # Remove tool-specific setting (use global)

# Manage sandbox protected paths
infer config tools sandbox list
infer config tools sandbox add ".github/"
infer config tools sandbox remove "test.txt"

`infer status`

Check the status of the inference gateway including health checks and resource usage.

Examples:

infer status

`infer chat`

Start an interactive chat session. Unless a default model is configured (see infer config agent set-model), the CLI first prompts you to select a model.

Features:

Interactive model selection
Conversational interface
Real-time streaming responses
Scrollable chat history with mouse wheel and keyboard support

Navigation Controls:

Mouse wheel: Scroll up/down through chat history
Arrow keys (↑/↓) or Vim keys (k/j): Scroll one line at a time
Page Up/Page Down: Scroll by page
Home/End: Jump to top/bottom of chat history
Shift+↑/Shift+↓: Half-page scrolling
Ctrl+R: Toggle expanded view of tool results

System Reminders:

The chat interface supports configurable system reminders that can provide periodic contextual information to the AI model during conversations. These reminders help maintain context and provide relevant guidance throughout the session.

Customizable interval: Set how often reminders appear (in number of messages)
Dynamic content: Reminders can contain contextual information based on the current state
Non-intrusive: Reminders are sent to the AI model but don't interrupt the user experience
Configurable: Enable/disable and customize reminder content through configuration

Examples:

infer chat

`infer agent`

Execute a task using an autonomous agent in background mode. The CLI works iteratively until the task is considered complete. This is particularly useful for SCM tickets such as GitHub issues.

Features:

Autonomous execution: Agent works independently to complete tasks
Iterative processing: Continues until task completion criteria are met
Tool integration: Full access to all available tools (Bash, Read, Write, etc.)
Parallel tool execution: Executes multiple tool calls simultaneously for improved efficiency
Background operation: Runs without interactive user input
Task completion detection: Automatically detects when tasks are complete
Configurable concurrency: Control the maximum number of parallel tool executions (default: 5)
JSON output: Structured JSON output for easy parsing and integration
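
As an illustration of bounded parallel tool execution, here is a minimal Go sketch. It is not the CLI's implementation: `runLimited` and the stand-in tool calls are hypothetical, and the sketch only shows one common way to cap concurrency (a buffered channel used as a semaphore) while keeping results in request order.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited executes every task with at most limit running at once and
// returns the results in the same order as tasks.
func runLimited(limit int, tasks []func() string) []string {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	results := make([]string, len(tasks))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are in flight
		go func(i int, task func() string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = task() // each goroutine writes its own slot
		}(i, task)
	}
	wg.Wait()
	return results
}

func main() {
	// Hypothetical stand-ins for tool calls requested in one agent turn.
	calls := []func() string{
		func() string { return "Read: ok" },
		func() string { return "Grep: ok" },
		func() string { return "Tree: ok" },
	}
	for _, result := range runLimited(5, calls) {
		fmt.Println(result)
	}
}
```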

Options:

-m, --model: Model to use for the agent (e.g., openai/gpt-4)

Examples:

# Execute a task described in a GitHub issue
infer agent "Please fix the github issue 38"

# Use a specific model for the agent
infer agent --model "openai/gpt-4" "Implement the feature described in issue #42"

# Debug a failing test
infer agent "Debug the failing test in PR 15"

# Refactor code
infer agent "Refactor the authentication module to use JWT tokens"

`infer version`

Display version information for the Inference Gateway CLI.

Examples:

infer version

Available Tools for LLMs

When tool execution is enabled, LLMs can use the following tools to interact with the system:

Tree Tool

Display directory structure in a tree format, similar to the Unix tree command. Provides a polyfill implementation when the native tree command is unavailable.

Parameters:

path (optional): Directory path to display tree structure for (default: current directory)
max_depth (optional): Maximum depth to traverse (unlimited by default)
show_hidden (optional): Whether to show hidden files and directories (default: false)
respect_gitignore (optional): Whether to exclude patterns from .gitignore (default: true)
format (optional): Output format - "text" or "json" (default: "text")

Examples:

Basic tree: Uses current directory with default settings
Tree with depth limit: max_depth: 2 - Shows only 2 levels deep
Tree with hidden files: show_hidden: true
Tree ignoring gitignore: respect_gitignore: false - Shows all files including those in .gitignore
JSON output: format: "json" - Returns structured data

Features:

Native Integration: Uses system tree command when available for optimal performance
Polyfill Implementation: Falls back to custom implementation when tree is not installed
Pattern Exclusion: Supports glob patterns to exclude specific files and directories
Depth Control: Limit traversal depth to prevent overwhelming output
Hidden File Control: Toggle visibility of hidden files and directories
Multiple Formats: Text output for readability, JSON for structured data

Security:

Respects configured path exclusions for security
Validates directory access permissions
Limited by the same security restrictions as other file tools

Bash Tool

Execute whitelisted bash commands securely with validation against configured command patterns.
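
A rough Go sketch of this kind of validation is shown below. It is an assumption-laden illustration rather than the CLI's actual logic: the `Whitelist` type and `NewWhitelist` are hypothetical, and it assumes that a whitelisted command name matches on the first word (so arguments are allowed) while patterns are matched against the whole command line.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Whitelist mirrors the shape of the bash whitelist in the default config:
// plain command names plus regex patterns.
type Whitelist struct {
	commands []string
	patterns []*regexp.Regexp
}

// NewWhitelist compiles the patterns up front so invalid ones fail early.
func NewWhitelist(commands, patterns []string) (*Whitelist, error) {
	w := &Whitelist{commands: commands}
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		w.patterns = append(w.patterns, re)
	}
	return w, nil
}

// Allowed reports whether command passes the whitelist.
// Assumption: command names match on the first word only.
func (w *Whitelist) Allowed(command string) bool {
	fields := strings.Fields(command)
	if len(fields) == 0 {
		return false
	}
	for _, name := range w.commands {
		if fields[0] == name {
			return true
		}
	}
	for _, re := range w.patterns {
		if re.MatchString(command) {
			return true
		}
	}
	return false
}

func main() {
	w, err := NewWhitelist(
		[]string{"ls", "pwd", "echo"},
		[]string{`^git status$`, `^git log --oneline -n [0-9]+$`},
	)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, cmd := range []string{"ls -la", "git status", "git log --oneline -n 5", "rm -rf /"} {
		fmt.Printf("%q allowed=%v\n", cmd, w.Allowed(cmd))
	}
}
```

First-word matching alone would also accept input such as `ls; rm -rf /`, so a real validator would additionally need to reject shell metacharacters or avoid passing the string to a shell.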

Read Tool

Read file content from the filesystem with optional line range specification.

Write Tool

Write content to files on the filesystem with security controls and directory creation support.

Parameters:

file_path (required): The path to the file to write
content (required): The content to write to the file
create_dirs (optional): Whether to create parent directories if they don't exist (default: true)
overwrite (optional): Whether to overwrite existing files (default: true)
format (optional): Output format - "text" or "json" (default: "text")

Features:

Directory Creation: Automatically creates parent directories when needed
Overwrite Control: Configurable behavior for existing files
Security Validation: Respects path exclusions and security restrictions
Performance Optimized: Efficient file writing with proper error handling

Security:

Approval Required: Write operations require approval by default (secure by default)
Path Exclusions: Respects configured excluded paths (e.g., .infer/ directory)
Pattern Matching: Supports glob patterns for path exclusions
Validation: Validates file paths and content before writing

Examples:

Create new file: file_path: "output.txt", content: "Hello, World!"
Write to subdirectory: file_path: "logs/app.log", content: "log entry", create_dirs: true
Safe overwrite: file_path: "config.json", content: "{...}", overwrite: false

WebSearch Tool

Search the web using DuckDuckGo or Google search engines to find information.

WebFetch Tool

Fetch content from whitelisted URLs or GitHub references using the format example.com.

Github Tool

Interact with GitHub API to fetch issues, pull requests, create comments, and create pull requests with authentication support. This is a standalone tool separate from WebFetch.

Parameters:

owner (required): Repository owner (username or organization)
repo (required): Repository name
resource (optional): Resource type to fetch or create (default: "issue")
- issue: Fetch a specific issue
- issues: Fetch a list of issues
- pull_request: Fetch a specific pull request
- comments: Fetch comments for an issue/PR
- create_comment: Create a comment on an issue/PR
- create_pull_request: Create a new pull request
issue_number (required for issue/pull_request/comments/create_comment): Issue or PR number
comment_body (required for create_comment): Comment body text
title (required for create_pull_request): Pull request title
body (optional for create_pull_request): Pull request body/description
head (required for create_pull_request): Head branch name
base (optional for create_pull_request): Base branch name (default: "main")
state (optional): Filter by state for issues list ("open", "closed", "all", default: "open")
per_page (optional): Number of items per page for lists (1-100, default: 30)

Features:

GitHub API Integration: Direct access to GitHub's REST API v3
Authentication: Supports GitHub personal access tokens via environment variables
Multiple Resources: Fetch issues, pull requests, comments, and create new content
Structured Data: Returns properly typed GitHub data structures
Error Handling: Comprehensive error handling with GitHub API error messages
Rate Limiting: Respects GitHub API rate limits
Security: Configurable timeout and response size limits
Environment Variables: Supports token resolution via %GITHUB_TOKEN% syntax
Security Controls: Owner validation for secure repository access

Configuration:

tools:
  github:
    enabled: true
    token: "%GITHUB_TOKEN%"  # Environment variable reference
    base_url: "https://api.github.com"
    owner: "your-username"  # Default owner for security
    repo: "your-repo"       # Default repository (optional)
    safety:
      max_size: 1048576  # 1MB
      timeout: 30        # 30 seconds
    require_approval: false
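
The `%GITHUB_TOKEN%` value above is a reference to an environment variable rather than a literal token. A minimal Go sketch of how such a reference could be resolved follows; `expandRefs` is a hypothetical helper, and the behavior for unset variables (expanding to an empty string) is an assumption, not documented CLI behavior.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// envRef matches %NAME% references such as %GITHUB_TOKEN%.
var envRef = regexp.MustCompile(`%([A-Za-z_][A-Za-z0-9_]*)%`)

// expandRefs replaces each %NAME% in value with lookup(NAME).
// Text without a reference is returned unchanged.
func expandRefs(value string, lookup func(string) string) string {
	return envRef.ReplaceAllStringFunc(value, func(ref string) string {
		return lookup(ref[1 : len(ref)-1]) // strip the surrounding % signs
	})
}

func main() {
	// os.Getenv returns "" for unset variables (assumed behavior here).
	token := expandRefs("%GITHUB_TOKEN%", os.Getenv)
	fmt.Println("token configured:", token != "")
	fmt.Println(expandRefs("https://api.github.com", os.Getenv))
}
```

Keeping the token in the environment means the config file can be committed or shared without exposing the secret.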

Examples:

Fetch specific issue: owner: "octocat", repo: "Hello-World", resource: "issue", issue_number: 1
List open issues: owner: "octocat", repo: "Hello-World", resource: "issues", state: "open", per_page: 10
Fetch pull request: owner: "octocat", repo: "Hello-World", resource: "pull_request", issue_number: 5
Get issue comments: owner: "octocat", repo: "Hello-World", resource: "comments", issue_number: 1
Create comment: owner: "octocat", repo: "Hello-World", resource: "create_comment", issue_number: 1, comment_body: "Great work!"
Create pull request: owner: "octocat", repo: "Hello-World", resource: "create_pull_request", title: "Add feature", body: "New feature implementation", head: "feature-branch", base: "main"

Delete Tool

Delete files or directories from the filesystem with security controls. Supports wildcard patterns for batch operations.

Parameters:

path (required): The path to the file or directory to delete
recursive (optional): Whether to delete directories recursively (default: false)
force (optional): Whether to force deletion (ignore non-existent files, default: false)

Features:

Wildcard Support: Delete multiple files using patterns like *.txt or temp/*
Recursive Deletion: Remove directories and their contents
Safety Controls: Respects configured path exclusions and security restrictions
Validation: Validates file paths and permissions before deletion

Security:

Approval Required: Delete operations require approval by default
Path Exclusions: Respects configured excluded paths for security
Pattern Matching: Supports glob patterns for path exclusions
Validation: Validates file paths and prevents deletion of protected directories

Examples:

Delete single file: path: "temp.txt"
Delete directory recursively: path: "temp/", recursive: true
Delete with wildcard: path: "*.log"
Force delete: path: "missing.txt", force: true

Edit Tool

Perform exact string replacements in files with security validation and preview support.

Parameters:

file_path (required): The path to the file to modify
old_string (required): The text to replace (must match exactly)
new_string (required): The text to replace it with
replace_all (optional): Replace all occurrences of old_string (default: false)

Features:

Exact Matching: Requires exact string matches for safety
Preview Support: Shows diff preview before applying changes
Atomic Operations: Either all changes succeed or none are applied
Security Validation: Respects path exclusions and file permissions

Security:

Read Tool Requirement: Requires Read tool to be used first on the file
Approval Required: Edit operations require approval by default
Path Exclusions: Respects configured excluded paths
Validation: Validates file paths and prevents editing protected files

Examples:

Single replacement: file_path: "config.txt", old_string: "port: 3000", new_string: "port: 8080"
Replace all occurrences: file_path: "script.py", old_string: "print", new_string: "logging.info", replace_all: true
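
The replacement semantics can be sketched in Go as follows. This is an illustration only: `applyEdit` is a hypothetical function, and replacing just the first occurrence when replace_all is false is an assumption (the tool could equally reject ambiguous matches).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// applyEdit performs an exact (non-regex) replacement of oldStr with newStr.
// With replaceAll false, only the first occurrence is replaced (assumed).
func applyEdit(content, oldStr, newStr string, replaceAll bool) (string, error) {
	if oldStr == "" {
		return "", errors.New("old_string must not be empty")
	}
	if !strings.Contains(content, oldStr) {
		return "", fmt.Errorf("old_string %q not found", oldStr)
	}
	if replaceAll {
		return strings.ReplaceAll(content, oldStr, newStr), nil
	}
	return strings.Replace(content, oldStr, newStr, 1), nil
}

func main() {
	config := "port: 3000\nhost: localhost\n"
	updated, err := applyEdit(config, "port: 3000", "port: 8080", false)
	if err != nil {
		fmt.Println("edit failed:", err)
		return
	}
	fmt.Print(updated)

	// A string that does not match exactly is reported, not silently ignored.
	if _, err := applyEdit(config, "Port: 3000", "port: 8080", false); err != nil {
		fmt.Println("edit failed:", err)
	}
}
```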

MultiEdit Tool

Make multiple edits to a single file as one atomic operation: either all edits succeed or none are applied.

Parameters:

file_path (required): The path to the file to modify
edits (required): Array of edit operations to perform sequentially
- old_string: The text to replace (must match exactly)
- new_string: The text to replace it with
- replace_all (optional): Replace all occurrences (default: false)

Features:

Atomic Operations: All edits succeed or none are applied
Sequential Processing: Edits are applied in the order provided
Preview Support: Shows comprehensive diff preview
Security Validation: Respects all security restrictions

Security:

Read Tool Requirement: Requires Read tool to be used first on the file
Approval Required: MultiEdit operations require approval by default
Path Exclusions: Respects configured excluded paths
Validation: Validates all edits before execution

Examples:

{
  "file_path": "config.yaml",
  "edits": [
    {
      "old_string": "port: 3000",
      "new_string": "port: 8080"
    },
    {
      "old_string": "debug: true",
      "new_string": "debug: false"
    }
  ]
}
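
One way to obtain the all-or-nothing behavior is to apply the edits to an in-memory copy and only keep the result when every edit succeeds. The Go sketch below illustrates that idea under stated assumptions: the `Edit` type and `applyEdits` are hypothetical names, and a non-matching old_string is treated as the failure condition.

```go
package main

import (
	"fmt"
	"strings"
)

// Edit mirrors one entry of the edits array.
type Edit struct {
	OldString  string
	NewString  string
	ReplaceAll bool
}

// applyEdits runs edits in order against a working copy. On any failure it
// returns the original content unchanged, so callers never see a partially
// edited result.
func applyEdits(content string, edits []Edit) (string, error) {
	working := content
	for i, e := range edits {
		if e.OldString == "" || !strings.Contains(working, e.OldString) {
			return content, fmt.Errorf("edit %d: old_string %q not found", i+1, e.OldString)
		}
		n := 1 // replace the first occurrence only
		if e.ReplaceAll {
			n = -1 // replace every occurrence
		}
		working = strings.Replace(working, e.OldString, e.NewString, n)
	}
	return working, nil
}

func main() {
	original := "port: 3000\ndebug: true\n"
	edits := []Edit{
		{OldString: "port: 3000", NewString: "port: 8080"},
		{OldString: "debug: true", NewString: "debug: false"},
	}
	updated, err := applyEdits(original, edits)
	if err != nil {
		fmt.Println("no changes applied:", err)
		return
	}
	fmt.Print(updated)
}
```

Because edits are sequential, a later edit sees the output of the earlier ones, which is why the order of the array matters.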

Grep Tool

A powerful search tool with configurable backend (ripgrep or Go implementation).

Parameters:

pattern (required): The regular expression pattern to search for
path (optional): File or directory to search in (default: current directory)
output_mode (optional): Output mode - "content", "files_with_matches", or "count" (default: "files_with_matches")
-i (optional): Case insensitive search
-n (optional): Show line numbers in output
-A (optional): Number of lines to show after each match
-B (optional): Number of lines to show before each match
-C (optional): Number of lines to show before and after each match
glob (optional): Glob pattern to filter files (e.g., "*.js", "*.{ts,tsx}")
type (optional): File type to search (e.g., "js", "py", "rust")
multiline (optional): Enable multiline mode where patterns can span lines
head_limit (optional): Limit output to first N results

Features:

Dual Backend: Uses ripgrep when available for optimal performance, falls back to Go implementation
Full Regex Support: Supports complete regex syntax
Multiple Output Modes: Content matching, file lists, or count results
Context Lines: Show lines before and after matches
File Filtering: Filter by glob patterns or file types
Multiline Matching: Patterns can span multiple lines
Automatic Exclusions: Automatically excludes common directories and files (.git, node_modules, .infer, etc.)
Gitignore Support: Respects .gitignore patterns in your repository
User-Configurable Exclusions: Additional exclusion patterns can be configured by users (not by the LLM)

Security & Exclusions:

Path Exclusions: Respects configured excluded paths and patterns
Automatic Exclusions: The tool automatically excludes:
- Version control directories (.git, .svn, etc.)
- Dependency directories (node_modules, vendor, etc.)
- Build artifacts (dist, build, target, etc.)
- Cache and temp files (.cache, *.tmp, *.log, etc.)
- Security-sensitive files (.env, secrets, etc.)
Gitignore Integration: Automatically reads and respects .gitignore patterns
Validation: Validates search patterns and file access
Performance Limits: Configurable result limits to prevent overwhelming output

Examples:

Basic search: pattern: "error", output_mode: "content"
Case insensitive: pattern: "TODO", -i: true, output_mode: "content"
With context: pattern: "function", -C: 3, output_mode: "content"
File filtering: pattern: "interface", glob: "*.go", output_mode: "files_with_matches"
Count results: pattern: "log.*Error", output_mode: "count"

TodoWrite Tool

Create and manage structured task lists for LLM-assisted development workflows.

Parameters:

todos (required): Array of todo items with status tracking
- id (required): Unique identifier for the task
- content (required): Task description
- status (required): Task status - "pending", "in_progress", or "completed"

Features:

Structured Task Management: Organized task tracking with status
Real-time Updates: Mark tasks as in_progress/completed during execution
Progress Tracking: Visual representation of task completion
LLM Integration: Designed for LLM-assisted development workflows

Security:

No File System Access: Pure memory-based operation
Validation: Validates todo structure and status values
Size Limits: Configurable limits on todo list size

Examples:

{
  "todos": [
    {
      "id": "1",
      "content": "Update README with new tool documentation",
      "status": "in_progress"
    },
    {
      "id": "2",
      "content": "Add test cases for new features",
      "status": "pending"
    }
  ]
}
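
A payload like the one above could be validated along these lines. This Go sketch is illustrative only: `Todo` and `parseTodos` are hypothetical names, and the checks (required fields, unique ids, the three status values) are derived from the parameter list above rather than from the CLI's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Todo mirrors one item of the todos array.
type Todo struct {
	ID      string `json:"id"`
	Content string `json:"content"`
	Status  string `json:"status"`
}

// parseTodos decodes a TodoWrite-style payload and validates each item.
func parseTodos(raw string) ([]Todo, error) {
	var payload struct {
		Todos []Todo `json:"todos"`
	}
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		return nil, fmt.Errorf("invalid JSON: %w", err)
	}
	seen := make(map[string]bool)
	for _, t := range payload.Todos {
		if t.ID == "" || t.Content == "" {
			return nil, fmt.Errorf("todo needs both id and content")
		}
		if seen[t.ID] {
			return nil, fmt.Errorf("duplicate todo id %q", t.ID)
		}
		seen[t.ID] = true
		switch t.Status {
		case "pending", "in_progress", "completed":
			// valid status
		default:
			return nil, fmt.Errorf("todo %q: invalid status %q", t.ID, t.Status)
		}
	}
	return payload.Todos, nil
}

func main() {
	raw := `{"todos": [
		{"id": "1", "content": "Update README", "status": "in_progress"},
		{"id": "2", "content": "Add test cases", "status": "pending"}
	]}`
	todos, err := parseTodos(raw)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	for _, t := range todos {
		fmt.Printf("[%s] %s (%s)\n", t.ID, t.Content, t.Status)
	}
}
```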

Security Notes:

All tools respect configured safety settings and exclusion patterns
Commands require approval when safety approval is enabled
File access is restricted to allowed paths and excludes sensitive directories
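
To make the path restriction concrete, here is a small Go sketch of a protected-path check. The matching rules are assumptions chosen for illustration (entries ending in "/" protect a directory tree relative to the project root, other entries are globs matched against the file's base name), and `isProtected` is a hypothetical helper, not the CLI's API.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isProtected reports whether path is covered by any protected entry.
// Assumed semantics: "dir/" entries protect everything under dir, and
// other entries are glob patterns applied to the base name.
func isProtected(path string, protected []string) bool {
	clean := filepath.ToSlash(filepath.Clean(path))
	for _, entry := range protected {
		if strings.HasSuffix(entry, "/") {
			dir := strings.TrimSuffix(entry, "/")
			if clean == dir || strings.HasPrefix(clean, dir+"/") {
				return true
			}
			continue
		}
		if ok, err := filepath.Match(entry, filepath.Base(clean)); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	protected := []string{".infer/", ".git/", "*.env"}
	for _, p := range []string{".infer/config.yaml", "./.git/HEAD", "deploy/prod.env", "main.go"} {
		fmt.Printf("%-20s protected=%v\n", p, isProtected(p, protected))
	}
}
```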

Configuration

The CLI supports a 2-layer configuration system that allows for both user-level and project-level configuration with proper precedence handling. For detailed configuration documentation and examples, see CONFIG.md.

Configuration Layers

Userspace Configuration (~/.infer/config.yaml)
- Global configuration for the user across all projects
- Used as a fallback when no project-level configuration exists
- Can be created with: infer init --userspace or infer config init --userspace
Project Configuration (.infer/config.yaml in current directory)
- Project-specific configuration that takes precedence over userspace config
- Default location for most commands
- Can be created with: infer init or infer config init

Configuration Precedence

Configuration values are merged with the following precedence (highest to lowest):

Project-level config (.infer/config.yaml) - Highest Priority
Userspace config (~/.infer/config.yaml)
Built-in defaults - Lowest Priority

Example: If your userspace config sets agent.model: "anthropic/claude-4" and your project config sets agent.model: "deepseek/deepseek-chat", the project config wins and deepseek/deepseek-chat will be used. However, if the project config doesn't specify a model but does specify other settings, the userspace model will be preserved while project settings take precedence for their specific values.
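
The precedence rule can be pictured as a layered merge in which later layers override earlier ones key by key. The Go sketch below is a simplified model, not the CLI's loader: `Settings` is a hypothetical flat map keyed by dotted path, whereas the real configuration is nested YAML.

```go
package main

import "fmt"

// Settings is a hypothetical flattened view of config values keyed by
// dotted path; a missing key means "not set in this layer".
type Settings map[string]string

// mergeLayers applies layers from lowest to highest priority, so a key set
// in a later layer replaces the same key from an earlier one.
func mergeLayers(layers ...Settings) Settings {
	merged := Settings{}
	for _, layer := range layers {
		for key, value := range layer {
			merged[key] = value
		}
	}
	return merged
}

func main() {
	defaults := Settings{"gateway.url": "http://localhost:8080", "agent.model": ""}
	userspace := Settings{"agent.model": "anthropic/claude-4"}
	project := Settings{"agent.model": "deepseek/deepseek-chat"}

	cfg := mergeLayers(defaults, userspace, project)
	fmt.Println(cfg["agent.model"]) // deepseek/deepseek-chat
	fmt.Println(cfg["gateway.url"]) // http://localhost:8080

	// A project layer that omits the model leaves the userspace value intact.
	cfg = mergeLayers(defaults, userspace, Settings{"gateway.timeout": "60"})
	fmt.Println(cfg["agent.model"]) // anthropic/claude-4
}
```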

Usage Examples

# Create userspace configuration (global fallback)
infer init --userspace

# Create project configuration (takes precedence)
infer init

# Both configurations will be automatically merged when commands are run

You can also specify a custom config file using the --config flag which will override the automatic 2-layer loading.

Default Configuration

gateway:
  url: http://localhost:8080
  api_key: ""
  timeout: 200
client:
  timeout: 200
  retry:
    enabled: true
    max_attempts: 3
    initial_backoff_sec: 5
    max_backoff_sec: 60
    backoff_multiplier: 2
    retryable_status_codes: [400, 408, 429, 500, 502, 503, 504]
logging:
  debug: false
tools:
  enabled: true # Tools are enabled by default with safe read-only commands
  sandbox:
    directories: [".", "/tmp"] # Allowed directories for tool operations
    protected_paths: # Paths excluded from tool access for security
      - .infer/
      - .git/
      - "*.env"
  bash:
    enabled: true
    whitelist:
      commands: # Exact command matches
        - ls
        - pwd
        - echo
        - wc
        - sort
        - uniq
        - gh
        - task
      patterns: # Regex patterns for more complex commands
        - ^git branch( --show-current)?$
        - ^git checkout -b [a-zA-Z0-9/_-]+( [a-zA-Z0-9/_-]+)?$
        - ^git checkout [a-zA-Z0-9/_-]+
        - ^git add [a-zA-Z0-9/_.-]+
        - ^git diff+
        - ^git remote -v$
        - ^git status$
        - ^git log --oneline -n [0-9]+$
        - ^git commit -m ".+"$
        - ^git push( --set-upstream)?( origin)?( [a-zA-Z0-9/_-]+)?$
  read:
    enabled: true
    require_approval: false
  write:
    enabled: true
    require_approval: true # Write operations require approval by default for security
  edit:
    enabled: true
    require_approval: true # Edit operations require approval by default for security
  delete:
    enabled: true
    require_approval: true # Delete operations require approval by default for security
  grep:
    enabled: true
    backend: auto # "auto", "ripgrep", or "go"
    require_approval: false
  tree:
    enabled: true
    require_approval: false
  web_fetch:
    enabled: true
    whitelisted_domains:
      - golang.org
    safety:
      max_size: 4096 # 4KB
      timeout: 30 # 30 seconds
      allow_redirect: true
    cache:
      enabled: true
      ttl: 3600 # 1 hour
      max_size: 52428800 # 50MB
  web_search:
    enabled: true
    default_engine: duckduckgo
    max_results: 10
    engines:
      - duckduckgo
      - google
    timeout: 10
  todo_write:
    enabled: true
    require_approval: false
  github:
    enabled: true
    token: "%GITHUB_TOKEN%"
    base_url: "https://api.github.com"
    owner: ""
    safety:
      max_size: 1048576  # 1MB
      timeout: 30        # 30 seconds
    require_approval: false
  safety:
    require_approval: true
compact:
  output_dir: .infer # Directory for compact command exports
  summary_model: "" # Model to use for summarization (optional)
agent:
  model: "" # Default model for agent operations
  system_prompt: | # System prompt for agent sessions
    Autonomous software engineering agent. Execute tasks iteratively until completion.

    IMPORTANT: You NEVER push to main or master or to the current branch - instead you create a branch and push to a branch.
    IMPORTANT: You NEVER read all the README.md - start by reading 300 lines

    RULES:
    - Security: Defensive only (analysis, detection, docs)
    - Style: no emojis/comments unless asked, use conventional commits
    - Code: Follow existing patterns, check deps, no secrets
    - Tasks: Use TodoWrite, mark progress immediately
    - Chat exports: Read only "## Summary" to "---" section
    - Tools: Batch calls, prefer Grep for search

    WORKFLOW:
    When asked to implement features or fix issues:
    1. Plan with TodoWrite
    2. Search codebase to understand context
    3. Implement solution
    4. Run tests with: task test
    5. Run lint/format with: task fmt and task lint
    6. Commit changes (only if explicitly asked)
    7. Create a pull request (only if explicitly asked)
  system_reminders:
    enabled: true
    interval: 4
    reminder_text: |
      System reminder text for maintaining context
  verbose_tools: false
  max_turns: 50 # Maximum number of turns for agent sessions
  max_tokens: 4096 # The maximum number of tokens that can be generated per request
  optimization:
    enabled: false
    max_history: 10
    compact_threshold: 20
    truncate_large_outputs: true
    skip_redundant_confirmations: true
chat:
  theme: tokyo-night

Configuration Options

Gateway Settings:

gateway.url: The URL of the inference gateway
gateway.api_key: API key for authentication (if required)
gateway.timeout: Request timeout in seconds

Client Settings:

client.timeout: HTTP client timeout in seconds
client.retry.enabled: Enable automatic retries for failed requests
client.retry.max_attempts: Maximum number of retry attempts
client.retry.initial_backoff_sec: Initial delay between retries in seconds
client.retry.max_backoff_sec: Maximum delay between retries in seconds
client.retry.backoff_multiplier: Backoff multiplier for exponential delay
client.retry.retryable_status_codes: HTTP status codes that trigger retries (e.g., [400, 408, 429, 500, 502, 503, 504])
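
The retry keys above describe an exponential backoff. As a minimal sketch of how such settings are commonly combined (the `RetryConfig` struct, the `BackoffDelay` function, and the example values are assumptions for illustration, not the CLI's actual implementation or defaults):

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// RetryConfig mirrors the client.retry.* keys. The struct itself is hypothetical.
type RetryConfig struct {
	MaxAttempts       int
	InitialBackoffSec int
	MaxBackoffSec     int
	BackoffMultiplier float64
}

// BackoffDelay returns the wait before retry number attempt (1-based).
// The delay grows by BackoffMultiplier each attempt and is capped at MaxBackoffSec.
func BackoffDelay(cfg RetryConfig, attempt int) time.Duration {
	secs := float64(cfg.InitialBackoffSec) * math.Pow(cfg.BackoffMultiplier, float64(attempt-1))
	if limit := float64(cfg.MaxBackoffSec); secs > limit {
		secs = limit
	}
	return time.Duration(secs * float64(time.Second))
}

func main() {
	// Illustrative values only, not the CLI defaults.
	cfg := RetryConfig{MaxAttempts: 5, InitialBackoffSec: 2, MaxBackoffSec: 10, BackoffMultiplier: 2}
	for attempt := 1; attempt <= cfg.MaxAttempts; attempt++ {
		fmt.Printf("retry %d: wait %s\n", attempt, BackoffDelay(cfg, attempt))
	}
}
```

With these example values the waits are 2s, 4s, 8s, then 10s for every later attempt, because the cap takes over once the exponential value exceeds `MaxBackoffSec`.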

Logging Settings:

logging.debug: Enable debug logging for verbose output

Tool Settings:

tools.enabled: Enable/disable tool execution for LLMs (default: true)
tools.sandbox.directories: Allowed directories for tool operations (default: [".", "/tmp"])
tools.sandbox.protected_paths: Paths excluded from tool access for security (default: [".infer/", ".git/", "*.env"])
tools.whitelist.commands: List of allowed commands (supports arguments)
tools.whitelist.patterns: Regex patterns for complex command validation
tools.safety.require_approval: Prompt user before executing any command (default: true)
Individual tool settings: Each tool (Bash, Read, Write, Edit, Delete, Grep, Tree, WebFetch, WebSearch, TodoWrite) has:
- enabled: Enable/disable the specific tool
- require_approval: Override global safety setting for this tool (optional)
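
The per-tool `require_approval` override can be read as "use the tool's value if it is set, otherwise fall back to `tools.safety.require_approval`". A minimal sketch of that resolution, assuming a hypothetical `ToolConfig` type (the CLI's real types may differ):

```go
package main

import "fmt"

// ToolConfig is a hypothetical per-tool entry. RequireApproval is a pointer
// so that "not set" can be told apart from an explicit false.
type ToolConfig struct {
	Enabled         bool
	RequireApproval *bool
}

// NeedsApproval returns the tool's own setting when present,
// otherwise the global safety setting.
func NeedsApproval(global bool, tool ToolConfig) bool {
	if tool.RequireApproval != nil {
		return *tool.RequireApproval
	}
	return global
}

func main() {
	off := false
	read := ToolConfig{Enabled: true, RequireApproval: &off} // explicit override
	write := ToolConfig{Enabled: true}                        // inherits the global value

	fmt.Println("Read needs approval:", NeedsApproval(true, read))
	fmt.Println("Write needs approval:", NeedsApproval(true, write))
}
```

Here Read skips approval because of its explicit override, while Write inherits the global `true`.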

Compact Settings:

compact.output_dir: Directory for compact command exports (default: ".infer")

Chat Settings:

chat.default_model: Default model for chat sessions (skips model selection when set)
chat.system_prompt: System prompt included with every chat session
chat.system_reminders.enabled: Enable/disable system reminders (default: true)
chat.system_reminders.interval: Number of messages between reminders (default: 10)
chat.system_reminders.text: Custom reminder text to provide contextual guidance

Agent Settings:

agent.model: Default model for agent operations
agent.system_prompt: System prompt for agent sessions
agent.system_reminders.enabled: Enable system reminders during agent sessions
agent.system_reminders.interval: Number of messages between reminders (default: 4)
agent.system_reminders.reminder_text: Custom reminder text for agent context
agent.verbose_tools: Enable verbose tool output (default: false)
agent.max_turns: Maximum number of turns for agent sessions (default: 50)
agent.max_tokens: Maximum tokens per agent request (default: 4096)
agent.optimization.enabled: Enable optimization features (default: false)
agent.optimization.max_history: Maximum conversation history to maintain (default: 10)
agent.optimization.compact_threshold: Threshold for compacting conversation (default: 20)
agent.optimization.truncate_large_outputs: Truncate large tool outputs (default: true)
agent.optimization.skip_redundant_confirmations: Skip redundant confirmation messages (default: true)
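
One way to read `agent.system_reminders.interval` is "inject the reminder text after every N messages". The sketch below illustrates that reading only. `ReminderDue` is a hypothetical helper, and the CLI may count messages differently:

```go
package main

import "fmt"

// ReminderDue reports whether a reminder should be injected after the
// messageCount-th message, given the configured interval.
func ReminderDue(enabled bool, interval, messageCount int) bool {
	if !enabled || interval <= 0 || messageCount <= 0 {
		return false
	}
	return messageCount%interval == 0
}

func main() {
	const interval = 4 // documented default for agent sessions
	for n := 1; n <= 8; n++ {
		if ReminderDue(true, interval, n) {
			fmt.Printf("message %d: inject reminder\n", n)
		}
	}
}
```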

Web Search Settings:

web_search.enabled: Enable/disable web search tool for LLMs (default: true)
web_search.default_engine: Default search engine to use ("duckduckgo" or "google", default: "duckduckgo")
web_search.max_results: Maximum number of search results to return (1-50, default: 10)
web_search.engines: List of available search engines
web_search.timeout: Search timeout in seconds (default: 10)

Chat Interface Settings:

chat.theme: Chat interface theme name (default: "tokyo-night")
- Available themes: tokyo-night, github-light, dracula
- Can be changed during chat using /theme [theme-name] shortcut
- Affects colors and styling of the chat interface

Web Search API Setup (Optional)

Both search engines work out of the box, but for better reliability and performance in production, you can configure API keys:

Google Custom Search Engine:

Create a Custom Search Engine:
- Go to Google Programmable Search Engine
- Click "Add" to create a new search engine
- Enter a name for your search engine
- In "Sites to search", enter * to search the entire web
- Click "Create"
Get your Search Engine ID:
- In your search engine settings, note the "Search engine ID" (cx parameter)
Get a Google API Key:
- Go to the Google Cloud Console
- Create a new project or select an existing one
- Enable the "Custom Search JSON API"
- Go to "Credentials" and create an API key
- Restrict the API key to the Custom Search JSON API for security

Configure Environment Variables:

export GOOGLE_SEARCH_API_KEY="your_api_key_here"
export GOOGLE_SEARCH_ENGINE_ID="your_search_engine_id_here"

DuckDuckGo API (Optional):

export DUCKDUCKGO_SEARCH_API_KEY="your_api_key_here"

Note: Both engines have built-in fallback methods that work without API configuration. However, using official APIs provides better reliability and performance for production use.
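
To check from Go whether the Google variables above are set before relying on the API path, a small helper along these lines could be used. `GoogleAPIConfigured` is a hypothetical name, and how the CLI itself decides between the API and its fallback is not documented here:

```go
package main

import (
	"fmt"
	"os"
)

// GoogleAPIConfigured reports whether both variables needed for the
// Custom Search JSON API are non-empty. lookup is injectable so the
// check can be tested without touching the real environment.
func GoogleAPIConfigured(lookup func(string) string) bool {
	return lookup("GOOGLE_SEARCH_API_KEY") != "" &&
		lookup("GOOGLE_SEARCH_ENGINE_ID") != ""
}

func main() {
	if GoogleAPIConfigured(os.Getenv) {
		fmt.Println("google: API credentials found")
	} else {
		fmt.Println("google: no API credentials set, the built-in fallback would apply")
	}
}
```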

Global Flags

-c, --config: Config file (default is ./.infer/config.yaml)
-v, --verbose: Verbose output
-h, --help: Help for any command

Examples

Basic Workflow

# Initialize project configuration
infer init

# Check if gateway is running
infer status

# Start interactive chat
infer chat

Configuration Management

# Use custom config file
infer --config ./my-config.yaml status

# Get verbose output
infer --verbose status

# Set default model for chat sessions
infer config agent set-model openai/gpt-4-turbo

# Set system prompt
infer config agent set-system "You are a helpful assistant."

# Enable tool execution with safety approval
infer config tools enable
infer config tools safety enable

# Configure sandbox directories for security
infer config tools sandbox add "/home/user/projects"
infer config tools sandbox add "/tmp/work"

# Add protected paths to prevent accidental modification
infer config tools sandbox add ".env"
infer config tools sandbox add ".git/"

# Configure individual tool safety settings
infer config tools safety set Read disabled    # Skip approval for Read tool
infer config tools safety set Write enabled    # Require approval for Write tool
infer config tools safety set Delete enabled   # Require approval for Delete tool

Development

Building

go build -o infer .

Testing

go test ./...

Dependencies

Cobra - CLI framework
YAML v3 - YAML parsing

Extensible Shortcuts System

The CLI provides an extensible shortcuts system that allows you to quickly execute common commands with /shortcut-name syntax.

Built-in Shortcuts

Core Shortcuts

/clear - Clear conversation history
/exit - Exit the chat session
/help [shortcut] - Show available shortcuts or specific shortcut help
/switch [model] - Switch to a different model
/theme [theme-name] - Switch chat interface theme or list available themes
/config <show|get|set|reload> [key] [value] - Manage configuration settings
/compact [format] - Export conversation to markdown

Git Shortcuts

/git <command> [args...] - Execute git commands (supports commit, push, status, etc.)
/git commit [flags] - NEW: Commit staged changes with AI-generated message
/git push [remote] [branch] [flags] - NEW: Push commits to remote repository

When /git commit is run without a message, the git shortcut generates a commit message with AI.

User-Defined Shortcuts

You can create custom shortcuts by adding YAML configuration files in the .infer/shortcuts/ directory.

Configuration File Format

Create files named custom-*.yaml (e.g., custom-1.yaml, custom-dev.yaml) in .infer/shortcuts/:

shortcuts:
  - name: "tests"
    description: "Run all tests in the project"
    command: "go"
    args: ["test", "./..."]
    working_dir: "."  # Optional: set working directory

  - name: "build"
    description: "Build the project"
    command: "go"
    args: ["build", "-o", "infer", "."]

  - name: "lint"
    description: "Run linter on the codebase"
    command: "golangci-lint"
    args: ["run"]

Configuration Fields

name (required): The shortcut name (used as /name)
description (required): Human-readable description shown in /help
command (required): The executable command to run
args (optional): Array of arguments to pass to the command
working_dir (optional): Working directory for the command (defaults to current)

Using Shortcuts

With the configuration above, you can use:

/tests - Runs go test ./...
/build - Runs go build -o infer .
/lint - Runs golangci-lint run

You can also pass additional arguments:

/tests -v - Runs go test ./... -v
/lint --fix - Runs golangci-lint run --fix
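
The argument handling described above amounts to appending whatever follows the shortcut name to the configured `args`. A minimal sketch, assuming a hypothetical `Shortcut` struct and `Expand` function and simple whitespace splitting (the CLI's own parser may handle quoting differently):

```go
package main

import (
	"fmt"
	"strings"
)

// Shortcut holds the fields from the YAML format that matter for expansion.
type Shortcut struct {
	Name    string
	Command string
	Args    []string
}

// Expand turns chat input such as "/tests -v" into the argv to run.
// It reports false when the input does not invoke this shortcut.
func Expand(s Shortcut, input string) ([]string, bool) {
	fields := strings.Fields(input)
	if len(fields) == 0 || fields[0] != "/"+s.Name {
		return nil, false
	}
	// Configured args come first, extra user arguments are appended after them.
	argv := append([]string{s.Command}, s.Args...)
	return append(argv, fields[1:]...), true
}

func main() {
	tests := Shortcut{Name: "tests", Command: "go", Args: []string{"test", "./..."}}
	if argv, ok := Expand(tests, "/tests -v"); ok {
		fmt.Println(strings.Join(argv, " "))
	}
}
```

Running this prints `go test ./... -v`. Because extra arguments always land at the end, they only help for commands that accept flags after their positional arguments.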

Example Custom Shortcuts

Here are some useful shortcuts you might want to add:

Development Shortcuts (`custom-dev.yaml`)

shortcuts:
  - name: "fmt"
    description: "Format all Go code"
    command: "go"
    args: ["fmt", "./..."]

  - name: "mod-tidy"
    description: "Tidy up go modules"
    command: "go"
    args: ["mod", "tidy"]

  - name: "version"
    description: "Show current version"
    command: "git"
    args: ["describe", "--tags", "--always", "--dirty"]

Docker Shortcuts (`custom-docker.yaml`)

shortcuts:
  - name: "docker-build"
    description: "Build Docker image"
    command: "docker"
    args: ["build", "-t", "myapp", "."]

  - name: "docker-run"
    description: "Run Docker container"
    command: "docker"
    args: ["run", "-p", "8080:8080", "myapp"]

Project-Specific Shortcuts (`custom-project.yaml`)

shortcuts:
  - name: "migrate"
    description: "Run database migrations"
    command: "./scripts/migrate.sh"
    working_dir: "."

  - name: "seed"
    description: "Seed database with test data"
    command: "go"
    args: ["run", "cmd/seed/main.go"]

Tips

File Organization: Use descriptive names for your config files (e.g., custom-dev.yaml, custom-docker.yaml)
Command Discovery: Use /help to see all available shortcuts including your custom ones
Error Handling: If a custom shortcut fails to load, it will be skipped with a warning
Reloading: Restart the chat session to reload custom shortcuts after making changes
Security: Be careful with custom shortcuts as they execute system commands

Troubleshooting

Shortcut not appearing: Check YAML syntax and file naming (custom-*.yaml)
Command not found: Ensure the command is available in your PATH
Permission denied: Check file permissions and executable rights
Invalid YAML: Use a YAML validator to check your configuration syntax
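
The first two checks can be scripted. The sketch below uses only the Go standard library. `IsShortcutFile` and `CommandAvailable` are hypothetical helpers for local debugging, not part of the CLI:

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// IsShortcutFile reports whether a file name matches the custom-*.yaml pattern.
func IsShortcutFile(path string) bool {
	ok, err := filepath.Match("custom-*.yaml", filepath.Base(path))
	return err == nil && ok
}

// CommandAvailable reports whether the command can be resolved,
// either through PATH or as an explicit path such as ./scripts/migrate.sh.
func CommandAvailable(command string) bool {
	_, err := exec.LookPath(command)
	return err == nil
}

func main() {
	for _, f := range []string{".infer/shortcuts/custom-dev.yaml", ".infer/shortcuts/dev.yml"} {
		fmt.Printf("%s matches custom-*.yaml: %v\n", f, IsShortcutFile(f))
	}
	fmt.Println("golangci-lint available:", CommandAvailable("golangci-lint"))
}
```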

License

This project is licensed under the MIT License.

Documentation

There is no documentation for this package.

Source Files

main.go

Directories

Path	Synopsis
cmd
config
internal
app
constants
container
domain
domain/filewriter
handlers
infra/adapters
infra/storage
logger
services
services/filewriter
services/tools
shortcuts
ui
ui/components
ui/history
ui/keybinding
ui/keys
ui/shared
ui/styles
ui/styles/colors
ui/styles/icons
utils
tests
mocks/generated	Code generated by counterfeiter.
