README
¶
Inference Gateway CLI
A powerful command-line interface for managing and interacting with the Inference Gateway. This CLI provides tools for configuration, monitoring, and management of inference services.
⚠️ Warning
Early Development Stage: This project is in its early development stage and breaking changes are expected until it reaches a stable version.
Always use pinned versions by specifying a specific version tag when downloading binaries or using install scripts.
Table of Contents
- Features
- Installation
- Quick Start
- Commands
- Extensible Shortcuts System
- Available Tools for LLMs
- Configuration
- Global Flags
- Examples
- Development
- License
Features
- Status Monitoring: Check gateway health and resource usage
- Interactive Chat: Chat with models using an interactive interface
- Conversation History: Store and retrieve past conversations with multiple storage backends
- Conversation Storage - Detailed storage backend documentation
- Conversation Title Generation - AI-powered title generation system
- Configuration Management: Manage gateway settings via YAML config
- Project Initialization: Set up local project configurations
- Tool Execution: LLMs can execute whitelisted commands and tools including:
- Bash: Execute safe shell commands
- Read: Read file contents with optional line ranges
- Write: Write content to files with security controls
- Grep: Fast ripgrep-powered search with regex support and multiple output modes
- WebSearch: Search the web using DuckDuckGo or Google
- WebFetch: Fetch content from whitelisted URLs
- Github: Interact with GitHub API to fetch issues, pull requests, and create content
- Tree: Display directory structure with polyfill support
- Delete: Delete files and directories with security controls
- Edit: Perform exact string replacements in files
- MultiEdit: Make multiple edits to files in atomic operations
- TodoWrite: Create and manage structured task lists
Installation
Using Go Install
go install github.com/inference-gateway/cli@latest
Using Container Image
For containerized environments, you can use the official container image:
# Run the CLI directly
docker run --rm -it ghcr.io/inference-gateway/cli:latest --help
# With volume mount for config persistence
docker run --rm -it -v ~/.infer:/home/infer/.infer ghcr.io/inference-gateway/cli:latest
# Example: Run chat command
docker run --rm -it -v ~/.infer:/home/infer/.infer ghcr.io/inference-gateway/cli:latest chat
Using specific version:
docker run --rm -it ghcr.io/inference-gateway/cli:0.48.12
Available architectures: linux/amd64
, linux/arm64
Using Install Script
For quick installation, you can use our install script:
Unix/macOS/Linux:
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash
With specific version:
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --version v0.1.1
Custom install directory:
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash -s -- --install-dir $HOME/.local/bin
The install script will:
- Detect your operating system and architecture automatically
- Download the appropriate binary from GitHub releases
- Install to
/usr/local/bin
by default (or custom directory with--dir
) - Make the binary executable
- Verify the installation
Manual Download
Download the latest release binary for your platform from the releases page.
Verifying Release Binaries
All release binaries are signed with Cosign for supply chain security. You can verify the integrity and authenticity of downloaded binaries using the following steps:
1. Download the binary, checksums, and signature files:
# Download binary (replace with your platform)
curl -L -o infer-darwin-amd64 \
https://github.com/inference-gateway/cli/releases/download/v0.29.1/infer-darwin-amd64
# Download checksums and signature files
curl -L -o checksums.txt \
https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt
curl -L -o checksums.txt.pem \
https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt.pem
curl -L -o checksums.txt.sig \
https://github.com/inference-gateway/cli/releases/download/v0.29.1/checksums.txt.sig
2. Verify SHA256 checksum:
# Calculate checksum of downloaded binary
shasum -a 256 infer-darwin-amd64
# Compare with checksums in checksums.txt
grep infer-darwin-amd64 checksums.txt
3. Verify Cosign signature (requires Cosign to be installed):
# Decode base64 encoded certificate
cat checksums.txt.pem | base64 -d > checksums.txt.pem.decoded
# Verify the signature
cosign verify-blob \
--certificate checksums.txt.pem.decoded \
--signature checksums.txt.sig \
--certificate-identity "https://github.com/inference-gateway/cli/.github/workflows/release.yml@refs/heads/main" \
--certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
checksums.txt
4. Make binary executable and install:
chmod +x infer-darwin-amd64
sudo mv infer-darwin-amd64 /usr/local/bin/infer
Note: Replace
v0.29.1
with the desired release version andinfer-darwin-amd64
with your platform's binary name.
Build from Source
git clone https://github.com/inference-gateway/cli.git
cd cli
go build -o infer .
Quick Start
-
Initialize project configuration:
infer init --model deepseek/deepseek-chat
Using
--model
is recommended as it enables AI project analysis and generates a comprehensive AGENTS.md file tailored to your specific project. -
Check gateway status:
infer status
-
Start an interactive chat:
infer chat
Commands
infer init
Initialize a new project with Inference Gateway CLI. This creates:
.infer/
directory with:config.yaml
- Main configuration file for the project.gitignore
- Ensures sensitive files are not committed to version control
AGENTS.md
- AI-generated project documentation in the repository root (only when--model
is specified)
This is the recommended command to start working with Inference Gateway CLI in a new project.
Options:
--overwrite
: Overwrite existing files if they already exist--userspace
: Initialize configuration in user home directory (~/.infer/
)
Examples:
# Initialize project-level configuration (default)
infer init
infer init --overwrite
# Initialize userspace configuration (global fallback)
infer init --userspace
infer config
Manage CLI configuration settings including models, system prompts, and tools.
infer config init
Initialize a new .infer/config.yaml
configuration file in the current
directory. This creates only the configuration file with default settings.
For complete project initialization, use infer init
instead.
Options:
--overwrite
: Overwrite existing configuration file--userspace
: Initialize configuration in user home directory (~/.infer/
)
Examples:
# Initialize project-level configuration (default)
infer config init
infer config init --overwrite
# Initialize userspace configuration (global fallback)
infer config init --userspace
infer config agent set-model
Set the default model for chat sessions. When set, chat sessions will automatically use this model without showing the model selection prompt.
Examples:
infer config agent set-model openai/gpt-4-turbo
infer config agent set-model anthropic/claude-opus-4-1-20250805
infer config agent set-system
Set a system prompt that will be included with every chat session, providing context and instructions to the AI model.
Examples:
infer config agent set-system "You are a helpful assistant."
infer config agent set-system "You are a Go programming expert."
infer config agent set-system "You are a helpful assistant."
infer config agent set-system "You are a Go programming expert."
infer config tools
Manage tool execution settings for LLMs, including enabling/disabling tools, managing whitelists, and security settings.
Subcommands:
enable
: Enable tool execution for LLMsdisable
: Disable tool execution for LLMslist [--format text|json]
: List whitelisted commands and patternsvalidate <command>
: Validate if a command is whitelistedexec <command> [--format text|json]
: Execute a whitelisted command directlysafety
: Manage safety approval settingsenable
: Enable safety approval promptsdisable
: Disable safety approval promptsstatus
: Show current safety approval status
sandbox
: Manage sandbox directories for securitylist
: List all sandbox directoriesadd <path>
: Add a protected path to the sandboxremove <path>
: Remove a protected path from the sandbox
Examples:
# Enable/disable tool execution
infer config tools enable
infer config tools disable
# List whitelisted commands
infer config tools list
infer config tools list --format json
# Validate and execute commands
infer config tools validate "ls -la"
infer config tools exec "git status"
# Manage global safety settings (approval prompts)
infer config tools safety enable # Enable approval prompts for all tool execution
infer config tools safety disable # Disable approval prompts (execute tools immediately)
infer config tools safety status # Show current safety approval status
# Manage tool-specific safety settings (granular control)
infer config tools safety set Bash enabled # Require approval for Bash tool only
infer config tools safety set WebSearch disabled # Skip approval for WebSearch tool
infer config tools safety unset Bash # Remove tool-specific setting (use global)
# Manage excluded paths
infer config tools sandbox list
infer config tools sandbox add ".github/"
infer config tools sandbox remove "test.txt"
infer status
Check the status of the inference gateway including health checks and resource usage.
Examples:
infer status
infer chat
Start an interactive chat session with model selection. Provides a conversational interface where you can select models and have conversations.
Features:
- Interactive model selection
- Conversational interface
- Real-time streaming responses
- Scrollable chat history with mouse wheel and keyboard support
Navigation Controls:
- Mouse wheel: Scroll up/down through chat history
- Arrow keys (
↑
/↓
) or Vim keys (k
/j
): Scroll one line at a time - Page Up/Page Down: Scroll by page
- Home/End: Jump to top/bottom of chat history
- Shift+↑/Shift+↓: Half-page scrolling
- Ctrl+R: Toggle expanded view of tool results
System Reminders:
The chat interface supports configurable system reminders that can provide periodic contextual information to the AI model during conversations. These reminders help maintain context and provide relevant guidance throughout the session.
- Customizable interval: Set how often reminders appear (in number of messages)
- Dynamic content: Reminders can contain contextual information based on the current state
- Non-intrusive: Reminders are sent to the AI model but don't interrupt the user experience
- Configurable: Enable/disable and customize reminder content through configuration
Examples:
infer chat
infer agent
Execute a task using an autonomous agent in background mode. The CLI will work iteratively until the task is considered complete. Particularly useful for SCM tickets like GitHub issues.
Features:
- Autonomous execution: Agent works independently to complete tasks
- Iterative processing: Continues until task completion criteria are met
- Tool integration: Full access to all available tools (Bash, Read, Write, etc.)
- Parallel tool execution: Executes multiple tool calls simultaneously for improved efficiency
- Background operation: Runs without interactive user input
- Task completion detection: Automatically detects when tasks are complete
- Configurable concurrency: Control the maximum number of parallel tool executions (default: 5)
- JSON output: Structured JSON output for easy parsing and integration
Options:
-m, --model
: Model to use for the agent (e.g., openai/gpt-4)
Examples:
# Execute a task described in a GitHub issue
infer agent "Please fix the github issue 38"
# Use a specific model for the agent
infer agent --model "openai/gpt-4" "Implement the feature described in issue #42"
# Debug a failing test
infer agent "Debug the failing test in PR 15"
# Refactor code
infer agent "Refactor the authentication module to use JWT tokens"
infer version
Display version information for the Inference Gateway CLI.
Examples:
infer version
Available Tools for LLMs
When tool execution is enabled, LLMs can use the following tools to interact with the system:
Tree Tool
Display directory structure in a tree format, similar to the Unix tree
command. Provides a polyfill
implementation when the native tree
command is unavailable.
Parameters:
path
(optional): Directory path to display tree structure for (default: current directory)max_depth
(optional): Maximum depth to traverse (unlimited by default)show_hidden
(optional): Whether to show hidden files and directories (default: false)respect_gitignore
(optional): Whether to exclude patterns from .gitignore (default: true)format
(optional): Output format - "text" or "json" (default: "text")
Examples:
- Basic tree: Uses current directory with default settings
- Tree with depth limit:
max_depth: 2
- Shows only 2 levels deep - Tree with hidden files:
show_hidden: true
- Tree ignoring gitignore:
respect_gitignore: false
- Shows all files including those in .gitignore - JSON output:
format: "json"
- Returns structured data
Features:
- Native Integration: Uses system
tree
command when available for optimal performance - Polyfill Implementation: Falls back to custom implementation when
tree
is not installed - Pattern Exclusion: Supports glob patterns to exclude specific files and directories
- Depth Control: Limit traversal depth to prevent overwhelming output
- Hidden File Control: Toggle visibility of hidden files and directories
- Multiple Formats: Text output for readability, JSON for structured data
Security:
- Respects configured path exclusions for security
- Validates directory access permissions
- Limited by the same security restrictions as other file tools
Bash Tool
Execute whitelisted bash commands securely with validation against configured command patterns.
Read Tool
Read file content from the filesystem with optional line range specification.
Write Tool
Write content to files on the filesystem with security controls and directory creation support.
Parameters:
file_path
(required): The path to the file to writecontent
(required): The content to write to the filecreate_dirs
(optional): Whether to create parent directories if they don't exist (default: true)overwrite
(optional): Whether to overwrite existing files (default: true)format
(optional): Output format - "text" or "json" (default: "text")
Features:
- Directory Creation: Automatically creates parent directories when needed
- Overwrite Control: Configurable behavior for existing files
- Security Validation: Respects path exclusions and security restrictions
- Performance Optimized: Efficient file writing with proper error handling
Security:
- Approval Required: Write operations require approval by default (secure by default)
- Path Exclusions: Respects configured excluded paths (e.g.,
.infer/
directory) - Pattern Matching: Supports glob patterns for path exclusions
- Validation: Validates file paths and content before writing
Examples:
- Create new file:
file_path: "output.txt"
,content: "Hello, World!"
- Write to subdirectory:
file_path: "logs/app.log"
,content: "log entry"
,create_dirs: true
- Safe overwrite:
file_path: "config.json"
,content: "{...}"
,overwrite: false
WebSearch Tool
Search the web using DuckDuckGo or Google search engines to find information.
WebFetch Tool
WebFetch content from whitelisted URLs or GitHub references using the format example.com
.
Github Tool
Interact with GitHub API to fetch issues, pull requests, create comments, and create pull requests with authentication support. This is a standalone tool separate from WebFetch.
Parameters:
owner
(required): Repository owner (username or organization)repo
(required): Repository nameresource
(optional): Resource type to fetch or create (default: "issue")issue
: Fetch a specific issueissues
: Fetch a list of issuespull_request
: Fetch a specific pull requestcomments
: Fetch comments for an issue/PRcreate_comment
: Create a comment on an issue/PRcreate_pull_request
: Create a new pull request
issue_number
(required for issue/pull_request/comments/create_comment): Issue or PR numbercomment_body
(required for create_comment): Comment body texttitle
(required for create_pull_request): Pull request titlebody
(optional for create_pull_request): Pull request body/descriptionhead
(required for create_pull_request): Head branch namebase
(optional for create_pull_request): Base branch name (default: "main")state
(optional): Filter by state for issues list ("open", "closed", "all", default: "open")per_page
(optional): Number of items per page for lists (1-100, default: 30)
Features:
- GitHub API Integration: Direct access to GitHub's REST API v3
- Authentication: Supports GitHub personal access tokens via environment variables
- Multiple Resources: Fetch issues, pull requests, comments, and create new content
- Structured Data: Returns properly typed GitHub data structures
- Error Handling: Comprehensive error handling with GitHub API error messages
- Rate Limiting: Respects GitHub API rate limits
- Security: Configurable timeout and response size limits
- Environment Variables: Supports token resolution via
%GITHUB_TOKEN%
syntax - Security Controls: Owner validation for secure repository access
Configuration:
tools:
github:
enabled: true
token: "%GITHUB_TOKEN%" # Environment variable reference
base_url: "https://api.github.com"
owner: "your-username" # Default owner for security
repo: "your-repo" # Default repository (optional)
safety:
max_size: 1048576 # 1MB
timeout: 30 # 30 seconds
require_approval: false
Examples:
- Fetch specific issue:
owner: "octocat", repo: "Hello-World", resource: "issue", issue_number: 1
- List open issues:
owner: "octocat", repo: "Hello-World", resource: "issues", state: "open", per_page: 10
- Fetch pull request:
owner: "octocat", repo: "Hello-World", resource: "pull_request", issue_number: 5
- Get issue comments:
owner: "octocat", repo: "Hello-World", resource: "comments", issue_number: 1
- Create comment:
owner: "octocat", repo: "Hello-World", resource: "create_comment", issue_number: 1, comment_body: "Great work!"
- Create pull request:
owner: "octocat", repo: "Hello-World", resource: "create_pull_request", title: "Add feature", body: "New feature implementation", head: "feature-branch", base: "main"
Delete Tool
Delete files or directories from the filesystem with security controls. Supports wildcard patterns for batch operations.
Parameters:
path
(required): The path to the file or directory to deleterecursive
(optional): Whether to delete directories recursively (default: false)force
(optional): Whether to force deletion (ignore non-existent files, default: false)
Features:
- Wildcard Support: Delete multiple files using patterns like
*.txt
ortemp/*
- Recursive Deletion: Remove directories and their contents
- Safety Controls: Respects configured path exclusions and security restrictions
- Validation: Validates file paths and permissions before deletion
Security:
- Approval Required: Delete operations require approval by default
- Path Exclusions: Respects configured excluded paths for security
- Pattern Matching: Supports glob patterns for path exclusions
- Validation: Validates file paths and prevents deletion of protected directories
Examples:
- Delete single file:
path: "temp.txt"
- Delete directory recursively:
path: "temp/", recursive: true
- Delete with wildcard:
path: "*.log"
- Force delete:
path: "missing.txt", force: true
Edit Tool
Perform exact string replacements in files with security validation and preview support.
Parameters:
file_path
(required): The path to the file to modifyold_string
(required): The text to replace (must match exactly)new_string
(required): The text to replace it withreplace_all
(optional): Replace all occurrences of old_string (default: false)
Features:
- Exact Matching: Requires exact string matches for safety
- Preview Support: Shows diff preview before applying changes
- Atomic Operations: Either all changes succeed or none are applied
- Security Validation: Respects path exclusions and file permissions
Security:
- Read Tool Requirement: Requires Read tool to be used first on the file
- Approval Required: Edit operations require approval by default
- Path Exclusions: Respects configured excluded paths
- Validation: Validates file paths and prevents editing protected files
Examples:
- Single replacement:
file_path: "config.txt", old_string: "port: 3000", new_string: "port: 8080"
- Replace all occurrences:
file_path: "script.py", old_string: "print", new_string: "logging.info", replace_all: true
MultiEdit Tool
Make multiple edits to a single file in atomic operations. All edits succeed or none are applied.
Parameters:
file_path
(required): The path to the file to modifyedits
(required): Array of edit operations to perform sequentiallyold_string
: The text to replace (must match exactly)new_string
: The text to replace it withreplace_all
(optional): Replace all occurrences (default: false)
Features:
- Atomic Operations: All edits succeed or none are applied
- Sequential Processing: Edits are applied in the order provided
- Preview Support: Shows comprehensive diff preview
- Security Validation: Respects all security restrictions
Security:
- Read Tool Requirement: Requires Read tool to be used first on the file
- Approval Required: MultiEdit operations require approval by default
- Path Exclusions: Respects configured excluded paths
- Validation: Validates all edits before execution
Examples:
{
"file_path": "config.yaml",
"edits": [
{
"old_string": "port: 3000",
"new_string": "port: 8080"
},
{
"old_string": "debug: true",
"new_string": "debug: false"
}
]
}
Grep Tool
A powerful search tool with configurable backend (ripgrep or Go implementation).
Parameters:
pattern
(required): The regular expression pattern to search forpath
(optional): File or directory to search in (default: current directory)output_mode
(optional): Output mode - "content", "files_with_matches", or "count" (default: "files_with_matches")-i
(optional): Case insensitive search-n
(optional): Show line numbers in output-A
(optional): Number of lines to show after each match-B
(optional): Number of lines to show before each match-C
(optional): Number of lines to show before and after each matchglob
(optional): Glob pattern to filter files (e.g., ".js", ".{ts,tsx}")type
(optional): File type to search (e.g., "js", "py", "rust")multiline
(optional): Enable multiline mode where patterns can span lineshead_limit
(optional): Limit output to first N results
Features:
- Dual Backend: Uses ripgrep when available for optimal performance, falls back to Go implementation
- Full Regex Support: Supports complete regex syntax
- Multiple Output Modes: Content matching, file lists, or count results
- Context Lines: Show lines before and after matches
- File Filtering: Filter by glob patterns or file types
- Multiline Matching: Patterns can span multiple lines
- Automatic Exclusions: Automatically excludes common directories and files (.git, node_modules, .infer, etc.)
- Gitignore Support: Respects .gitignore patterns in your repository
- User-Configurable Exclusions: Additional exclusion patterns can be configured by users (not by the LLM)
Security & Exclusions:
- Path Exclusions: Respects configured excluded paths and patterns
- Automatic Exclusions: The tool automatically excludes:
- Version control directories (.git, .svn, etc.)
- Dependency directories (node_modules, vendor, etc.)
- Build artifacts (dist, build, target, etc.)
- Cache and temp files (.cache, *.tmp, *.log, etc.)
- Security-sensitive files (.env, secrets, etc.)
- Gitignore Integration: Automatically reads and respects .gitignore patterns
- Validation: Validates search patterns and file access
- Performance Limits: Configurable result limits to prevent overwhelming output
Examples:
- Basic search:
pattern: "error", output_mode: "content"
- Case insensitive:
pattern: "TODO", -i: true, output_mode: "content"
- With context:
pattern: "function", -C: 3, output_mode: "content"
- File filtering:
pattern: "interface", glob: "*.go", output_mode: "files_with_matches"
- Count results:
pattern: "log.*Error", output_mode: "count"
TodoWrite Tool
Create and manage structured task lists for LLM-assisted development workflows.
Parameters:
todos
(required): Array of todo items with status trackingid
(required): Unique identifier for the taskcontent
(required): Task descriptionstatus
(required): Task status - "pending", "in_progress", or "completed"
Features:
- Structured Task Management: Organized task tracking with status
- Real-time Updates: Mark tasks as in_progress/completed during execution
- Progress Tracking: Visual representation of task completion
- LLM Integration: Designed for LLM-assisted development workflows
Security:
- No File System Access: Pure memory-based operation
- Validation: Validates todo structure and status values
- Size Limits: Configurable limits on todo list size
Examples:
{
"todos": [
{
"id": "1",
"content": "Update README with new tool documentation",
"status": "in_progress"
},
{
"id": "2",
"content": "Add test cases for new features",
"status": "pending"
}
]
}
Security Notes:
- All tools respect configured safety settings and exclusion patterns
- Commands require approval when safety approval is enabled
- File access is restricted to allowed paths and excludes sensitive directories
Configuration
The CLI supports a 2-layer configuration system that allows for both user-level and project-level configuration with proper precedence handling. For detailed configuration documentation and examples, see CONFIG.md.
Configuration Layers
-
Userspace Configuration (
~/.infer/config.yaml
)- Global configuration for the user across all projects
- Used as a fallback when no project-level configuration exists
- Can be created with:
infer init --userspace
orinfer config init --userspace
-
Project Configuration (
.infer/config.yaml
in current directory)- Project-specific configuration that takes precedence over userspace config
- Default location for most commands
- Can be created with:
infer init
orinfer config init
Configuration Precedence
Configuration values are merged with the following precedence (highest to lowest):
- Project-level config (
.infer/config.yaml
) - Highest Priority - Userspace config (
~/.infer/config.yaml
) - Built-in defaults - Lowest Priority
Example: If your userspace config sets agent.model: "anthropic/claude-4"
and your project config sets
agent.model: "deepseek/deepseek-chat"
, the project config wins and deepseek/deepseek-chat
will be used.
However, if the project config doesn't specify a model but does specify other settings, the userspace model will be preserved
while project settings take precedence for their specific values.
Usage Examples
# Create userspace configuration (global fallback)
infer init --userspace
# Create project configuration (takes precedence)
infer init
# Both configurations will be automatically merged when commands are run
You can also specify a custom config file using the --config
flag which will override the automatic 2-layer loading.
Default Configuration
gateway:
url: http://localhost:8080
api_key: ""
timeout: 200
client:
timeout: 200
retry:
enabled: true
max_attempts: 3
initial_backoff_sec: 5
max_backoff_sec: 60
backoff_multiplier: 2
retryable_status_codes: [400, 408, 429, 500, 502, 503, 504]
logging:
debug: false
tools:
enabled: true # Tools are enabled by default with safe read-only commands
sandbox:
directories: [".", "/tmp"] # Allowed directories for tool operations
protected_paths: # Paths excluded from tool access for security
- .infer/
- .git/
- *.env
bash:
enabled: true
whitelist:
commands: # Exact command matches
- ls
- pwd
- echo
- wc
- sort
- uniq
- gh
- task
patterns: # Regex patterns for more complex commands
- ^git branch( --show-current)?$
- ^git checkout -b [a-zA-Z0-9/_-]+( [a-zA-Z0-9/_-]+)?$
- ^git checkout [a-zA-Z0-9/_-]+
- ^git add [a-zA-Z0-9/_.-]+
- ^git diff+
- ^git remote -v$
- ^git status$
- ^git log --oneline -n [0-9]+$
- ^git commit -m ".+"$
- ^git push( --set-upstream)?( origin)?( [a-zA-Z0-9/_-]+)?$
read:
enabled: true
require_approval: false
write:
enabled: true
require_approval: true # Write operations require approval by default for security
edit:
enabled: true
require_approval: true # Edit operations require approval by default for security
delete:
enabled: true
require_approval: true # Delete operations require approval by default for security
grep:
enabled: true
backend: auto # "auto", "ripgrep", or "go"
require_approval: false
tree:
enabled: true
require_approval: false
web_fetch:
enabled: true
whitelisted_domains:
- golang.org
safety:
max_size: 4096 # 4KB
timeout: 30 # 30 seconds
allow_redirect: true
cache:
enabled: true
ttl: 3600 # 1 hour
max_size: 52428800 # 50MB
web_search:
enabled: true
default_engine: duckduckgo
max_results: 10
engines:
- duckduckgo
- google
timeout: 10
todo_write:
enabled: true
require_approval: false
github:
enabled: true
token: "%GITHUB_TOKEN%"
base_url: "https://api.github.com"
owner: ""
safety:
max_size: 1048576 # 1MB
timeout: 30 # 30 seconds
require_approval: false
safety:
require_approval: true
compact:
output_dir: .infer # Directory for compact command exports
summary_model: "" # Model to use for summarization (optional)
agent:
model: "" # Default model for agent operations
system_prompt: | # System prompt for agent sessions
Autonomous software engineering agent. Execute tasks iteratively until completion.
IMPORTANT: You NEVER push to main or master or to the current branch - instead you create a branch and push to a branch.
IMPORTANT: You NEVER read all the README.md - start by reading 300 lines
RULES:
- Security: Defensive only (analysis, detection, docs)
- Style: no emojis/comments unless asked, use conventional commits
- Code: Follow existing patterns, check deps, no secrets
- Tasks: Use TodoWrite, mark progress immediately
- Chat exports: Read only "## Summary" to "---" section
- Tools: Batch calls, prefer Grep for search
WORKFLOW:
When asked to implement features or fix issues:
1. Plan with TodoWrite
2. Search codebase to understand context
3. Implement solution
4. Run tests with: task test
5. Run lint/format with: task fmt and task lint
6. Commit changes (only if explicitly asked)
7. Create a pull request (only if explicitly asked)
system_reminders:
enabled: true
interval: 4
reminder_text: |
System reminder text for maintaining context
verbose_tools: false
max_turns: 50 # Maximum number of turns for agent sessions
max_tokens: 4096 # The maximum number of tokens that can be generated per request
optimization:
enabled: false
max_history: 10
compact_threshold: 20
truncate_large_outputs: true
skip_redundant_confirmations: true
chat:
theme: tokyo-night
Configuration Options
Gateway Settings:
- gateway.url: The URL of the inference gateway
- gateway.api_key: API key for authentication (if required)
- gateway.timeout: Request timeout in seconds
Client Settings:
- client.timeout: HTTP client timeout in seconds
- client.retry.enabled: Enable automatic retries for failed requests
- client.retry.max_attempts: Maximum number of retry attempts
- client.retry.initial_backoff_sec: Initial delay between retries in seconds
- client.retry.max_backoff_sec: Maximum delay between retries in seconds
- client.retry.backoff_multiplier: Backoff multiplier for exponential delay
- client.retry.retryable_status_codes: HTTP status codes that trigger retries (e.g., [400, 408, 429, 500, 502, 503, 504])
Logging Settings:
- logging.debug: Enable debug logging for verbose output
Tool Settings:
- tools.enabled: Enable/disable tool execution for LLMs (default: true)
- tools.sandbox.directories: Allowed directories for tool operations (default: [".", "/tmp"])
- tools.sandbox.protected_paths: Paths excluded from tool access for security (default: [".infer/", ".git/", "*.env"])
- tools.whitelist.commands: List of allowed commands (supports arguments)
- tools.whitelist.patterns: Regex patterns for complex command validation
- tools.safety.require_approval: Prompt user before executing any command (default: true)
- Individual tool settings: Each tool (Bash, Read, Write, Edit, Delete, Grep, Tree, WebFetch, WebSearch, TodoWrite) has:
- enabled: Enable/disable the specific tool
- require_approval: Override global safety setting for this tool (optional)
Compact Settings:
- compact.output_dir: Directory for compact command exports (default: ".infer")
Chat Settings:
- chat.default_model: Default model for chat sessions (skips model selection when set)
- chat.system_prompt: System prompt included with every chat session
- chat.system_reminders.enabled: Enable/disable system reminders (default: true)
- chat.system_reminders.interval: Number of messages between reminders (default: 10)
- chat.system_reminders.text: Custom reminder text to provide contextual guidance
Agent Settings:
- agent.model: Default model for agent operations
- agent.system_prompt: System prompt for agent sessions
- agent.system_reminders.enabled: Enable system reminders during agent sessions
- agent.system_reminders.interval: Number of messages between reminders (default: 4)
- agent.system_reminders.reminder_text: Custom reminder text for agent context
- agent.verbose_tools: Enable verbose tool output (default: false)
- agent.max_turns: Maximum number of turns for agent sessions (default: 50)
- agent.max_tokens: Maximum tokens per agent request (default: 8192)
- agent.optimization.enabled: Enable optimization features (default: false)
- agent.optimization.max_history: Maximum conversation history to maintain (default: 10)
- agent.optimization.compact_threshold: Threshold for compacting conversation (default: 20)
- agent.optimization.truncate_large_outputs: Truncate large tool outputs (default: true)
- agent.optimization.skip_redundant_confirmations: Skip redundant confirmation messages (default: true)
Web Search Settings:
- web_search.enabled: Enable/disable web search tool for LLMs (default: true)
- web_search.default_engine: Default search engine to use ("duckduckgo" or "google", default: "duckduckgo")
- web_search.max_results: Maximum number of search results to return (1-50, default: 10)
- web_search.engines: List of available search engines
- web_search.timeout: Search timeout in seconds (default: 10)
Chat Interface Settings:
- chat.theme: Chat interface theme name (default: "tokyo-night")
- Available themes:
tokyo-night
,github-light
,dracula
- Can be changed during chat using
/theme [theme-name]
shortcut - Affects colors and styling of the chat interface
- Available themes:
Web Search API Setup (Optional)
Both search engines work out of the box, but for better reliability and performance in production, you can configure API keys:
Google Custom Search Engine:
-
Create a Custom Search Engine:
- Go to Google Programmable Search Engine
- Click "Add" to create a new search engine
- Enter a name for your search engine
- In "Sites to search", enter
*
to search the entire web - Click "Create"
-
Get your Search Engine ID:
- In your search engine settings, note the "Search engine ID" (cx parameter)
-
Get a Google API Key:
- Go to the Google Cloud Console
- Create a new project or select an existing one
- Enable the "Custom Search JSON API"
- Go to "Credentials" and create an API key
- Restrict the API key to the Custom Search JSON API for security
-
Configure Environment Variables:
export GOOGLE_SEARCH_API_KEY="your_api_key_here" export GOOGLE_SEARCH_ENGINE_ID="your_search_engine_id_here"
DuckDuckGo API (Optional):
export DUCKDUCKGO_SEARCH_API_KEY="your_api_key_here"
Note: Both engines have built-in fallback methods that work without API configuration. However, using official APIs provides better reliability and performance for production use.
Global Flags
-c, --config
: Config file (default is./.infer/config.yaml
)-v, --verbose
: Verbose output-h, --help
: Help for any command
Examples
Basic Workflow
# Initialize project configuration
infer init
# Check if gateway is running
infer status
# Start interactive chat
infer chat
Configuration Management
# Use custom config file
infer --config ./my-config.yaml status
# Get verbose output
infer --verbose status
# Set default model for chat sessions
infer config agent set-model openai/gpt-4-turbo
# Set system prompt
infer config agent set-system "You are a helpful assistant."
# Enable tool execution with safety approval
infer config tools enable
infer config tools safety enable
# Configure sandbox directories for security
infer config tools sandbox add "/home/user/projects"
infer config tools sandbox add "/tmp/work"
# Add protected paths to prevent accidental modification
infer config tools sandbox add ".env"
infer config tools sandbox add ".git/"
# Configure individual tool safety settings
infer config tools safety set Read disabled # Skip approval for Read tool
infer config tools safety set Write enabled # Require approval for Write tool
infer config tools safety set Delete enabled # Require approval for Delete tool
Development
Building
go build -o infer .
Testing
go test ./...
Dependencies
Extensible Shortcuts System
The CLI provides an extensible shortcuts system that allows you to quickly execute common commands with
/shortcut-name
syntax.
Built-in Shortcuts
Core Shortcuts
/clear
- Clear conversation history/exit
- Exit the chat session/help [shortcut]
- Show available shortcuts or specific shortcut help/switch [model]
- Switch to a different model/theme [theme-name]
- Switch chat interface theme or list available themes/config <show|get|set|reload> [key] [value]
- Manage configuration settings/compact [format]
- Export conversation to markdown
Git Shortcuts
/git <command> [args...]
- Execute git commands (supports commit, push, status, etc.)/git commit [flags]
- NEW: Commit staged changes with AI-generated message/git push [remote] [branch] [flags]
- NEW: Push commits to remote repository
The git shortcuts provide intelligent commit message generation using AI when no message is provided with /git commit
.
User-Defined Shortcuts
You can create custom shortcuts by adding YAML configuration files in the .infer/shortcuts/
directory.
Configuration File Format
Create files named custom-*.yaml
(e.g., custom-1.yaml
, custom-dev.yaml
) in .infer/shortcuts/
:
shortcuts:
- name: "tests"
description: "Run all tests in the project"
command: "go"
args: ["test", "./..."]
working_dir: "." # Optional: set working directory
- name: "build"
description: "Build the project"
command: "go"
args: ["build", "-o", "infer", "."]
- name: "lint"
description: "Run linter on the codebase"
command: "golangci-lint"
args: ["run"]
Configuration Fields
- name (required): The shortcut name (used as
/name
) - description (required): Human-readable description shown in
/help
- command (required): The executable command to run
- args (optional): Array of arguments to pass to the command
- working_dir (optional): Working directory for the command (defaults to current)
Using Shortcuts
With the configuration above, you can use:
/tests
- Runsgo test ./...
/build
- Runsgo build -o infer .
/lint
- Runsgolangci-lint run
You can also pass additional arguments:
/tests -v
- Runsgo test ./... -v
/build --race
- Runsgo build -o infer . --race
Example Custom Shortcuts
Here are some useful shortcuts you might want to add:
Development Shortcuts (custom-dev.yaml
)
shortcuts:
- name: "fmt"
description: "Format all Go code"
command: "go"
args: ["fmt", "./..."]
- name: "mod tidy"
description: "Tidy up go modules"
command: "go"
args: ["mod", "tidy"]
- name: "version"
description: "Show current version"
command: "git"
args: ["describe", "--tags", "--always", "--dirty"]
Docker Shortcuts (custom-docker.yaml
)
shortcuts:
- name: "docker build"
description: "Build Docker image"
command: "docker"
args: ["build", "-t", "myapp", "."]
- name: "docker run"
description: "Run Docker container"
command: "docker"
args: ["run", "-p", "8080:8080", "myapp"]
Project-Specific Shortcuts (custom-project.yaml
)
shortcuts:
- name: "migrate"
description: "Run database migrations"
command: "./scripts/migrate.sh"
working_dir: "."
- name: "seed"
description: "Seed database with test data"
command: "go"
args: ["run", "cmd/seed/main.go"]
Tips
- File Organization: Use descriptive names for your config files (e.g.,
custom-dev.yaml
,custom-docker.yaml
) - Command Discovery: Use
/help
to see all available shortcuts including your custom ones - Error Handling: If a custom shortcut fails to load, it will be skipped with a warning
- Reloading: Restart the chat session to reload custom shortcuts after making changes
- Security: Be careful with custom shortcuts as they execute system commands
Troubleshooting
- Shortcut not appearing: Check YAML syntax and file naming (
custom-*.yaml
) - Command not found: Ensure the command is available in your PATH
- Permission denied: Check file permissions and executable rights
- Invalid YAML: Use a YAML validator to check your configuration syntax
License
This project is licensed under the MIT License.
Documentation
¶
There is no documentation for this package.
Directories
¶
Path | Synopsis |
---|---|
internal
|
|
tests
|
|
mocks/generated
Code generated by counterfeiter.
|
Code generated by counterfeiter. |