agentcall

module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 19, 2026 License: MIT

README

AgentCall

Build Status Lint Status Go Report Card Docs License

An MCP plugin that enables voice calls via phone for AI coding assistants. Start a task, walk away. Your phone rings when the AI is done, stuck, or needs a decision.

Supports: Claude Code, AWS Kiro CLI, Gemini CLI

Built with the agentplexus stack - showcasing a complete voice AI architecture in Go.

Inspiration

This project is inspired by ZeframLou/call-me, an excellent TypeScript MCP plugin that pioneered the "AI calls you" pattern. We wanted to:

  1. Build a Go implementation - Leverage Go's deployment simplicity and performance
  2. Exercise the AgentPlexus libraries - Demonstrate the omnivoice abstraction layer with pluggable providers
  3. Use premium voice providers - ElevenLabs for natural TTS, Deepgram for accurate STT
Comparison
Aspect agentcall (Go) call-me (TypeScript)
Deployable size 53 MB (single binary) 68 MB (node_modules)
Runtime required None Node.js/Bun
Dependencies Compiled in 122 npm packages
Distribution Single file copy npm install
TTS Provider ElevenLabs (streaming) OpenAI
STT Provider Deepgram (streaming) OpenAI
Phone Provider Twilio Twilio/Telnyx

The Go binary is self-contained with no runtime dependencies, making deployment as simple as copying a single file.

Architecture

┌────────────────────────────────────────────────────────────────────────────┐
│                           agentcall                                        │
├────────────────────────────────────────────────────────────────────────────┤
│  MCP Tools (via mcpkit)                                                    │
│  ├── initiate_call  - Start a new call to the user                         │
│  ├── continue_call  - Continue conversation on active call                 │
│  ├── speak_to_user  - Speak without waiting for response                   │
│  └── end_call       - End the call with optional goodbye                   │
├────────────────────────────────────────────────────────────────────────────┤
│  Call Manager                                                              │
│  - Orchestrates calls, TTS, STT                                            │
│  - Manages call state and conversation history                             │
├────────────────────────────────────────────────────────────────────────────┤
│  omnivoice (abstraction layer)                                             │
│  ├── tts.StreamingProvider  - Text-to-Speech interface                     │
│  ├── stt.StreamingProvider  - Speech-to-Text interface                     │
│  ├── transport.Transport    - Audio streaming interface                    │
│  └── callsystem.CallSystem  - Phone call management interface              │
├────────────────────────────────────────────────────────────────────────────┤
│  Provider Implementations                                                  │
│  ├── go-elevenlabs  - ElevenLabs streaming TTS (natural voices)            │
│  ├── omnivoice-deepgram    - Deepgram streaming STT (accurate transcripts) │
│  └── omnivoice-twilio                                                      │
│      ├── Transport via Twilio Media Streams WebSocket                      │
│      └── CallSystem via Twilio REST API                                    │
├────────────────────────────────────────────────────────────────────────────┤
│  mcpkit                                                                    │
│  - MCP server with HTTP/SSE transport                                      │
│  - Built-in ngrok integration for public webhooks                          │
│  - Library-mode for direct function calls                                  │
└────────────────────────────────────────────────────────────────────────────┘

The agentplexus Stack

This project demonstrates the agentplexus voice AI stack:

Package Role Description
omnivoice Abstraction Provider-agnostic interfaces for TTS, STT, Transport, CallSystem
go-elevenlabs TTS Provider ElevenLabs streaming TTS with natural voices
omnivoice-deepgram STT Provider Deepgram streaming STT with accurate transcription
omnivoice-twilio Phone Provider Twilio transport and call system
mcpkit Server MCP server runtime with ngrok and multiple transport modes
Why This Architecture?
  1. Provider Independence: Switch providers by swapping implementations (e.g., OpenAI instead of ElevenLabs)
  2. Testability: Mock interfaces for unit testing without real phone calls
  3. Composability: Mix and match TTS/STT/Phone providers independently
  4. Premium Quality: ElevenLabs voices + Deepgram accuracy for professional call quality

Installation

Prerequisites
  • Go 1.24+
  • Twilio account with:
    • Account SID and Auth Token
    • Phone number capable of making outbound calls
  • ngrok account with auth token
Build
cd /path/to/agentcall
go mod tidy
go build -o agentcall ./cmd/agentcall

Configuration

Set the following environment variables:

# Required: Twilio credentials
export AGENTCALL_PHONE_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export AGENTCALL_PHONE_AUTH_TOKEN=your_auth_token
export AGENTCALL_PHONE_NUMBER=+15551234567      # Your Twilio number
export AGENTCALL_USER_PHONE_NUMBER=+15559876543  # Your personal phone

# Required: ElevenLabs TTS
export AGENTCALL_ELEVENLABS_API_KEY=your_elevenlabs_key
# or: export ELEVENLABS_API_KEY=your_elevenlabs_key

# Required: Deepgram STT
export AGENTCALL_DEEPGRAM_API_KEY=your_deepgram_key
# or: export DEEPGRAM_API_KEY=your_deepgram_key

# Required: ngrok
export NGROK_AUTHTOKEN=your_ngrok_authtoken

# Optional: ElevenLabs voice (default: Rachel)
export AGENTCALL_TTS_VOICE=Rachel

# Optional: ElevenLabs model (default: eleven_turbo_v2_5)
export AGENTCALL_TTS_MODEL=eleven_turbo_v2_5

# Optional: Deepgram model (default: nova-2)
export AGENTCALL_STT_MODEL=nova-2

# Optional: STT language (default: en-US)
export AGENTCALL_STT_LANGUAGE=en-US

# Optional: Server port (default: 3333)
export AGENTCALL_PORT=3333

# Optional: Custom ngrok domain (requires paid plan)
export AGENTCALL_NGROK_DOMAIN=myapp.ngrok.io
ElevenLabs Voices

Popular voices: Rachel, Adam, Antoni, Arnold, Bella, Domi, Elli, Josh, Sam

See ElevenLabs Voice Library for the full list.

Usage

Running the Server
./agentcall

Output:

Starting agentcall MCP server...
Using agentplexus stack:
  - omnivoice (voice abstraction)
  - omnivoice-twilio (Twilio implementation)
  - mcpruntime (MCP server with ngrok)
MCP server ready
  Local:  http://localhost:3333/mcp
  Public: https://abc123.ngrok.io/mcp
Twilio webhooks configured:
  Voice:   https://abc123.ngrok.io/voice
  Stream:  https://abc123.ngrok.io/media-stream
  Status:  https://abc123.ngrok.io/status
Multi-Tool Support

agentcall supports multiple AI coding assistants. Generate configuration files for your preferred tool:

# Generate for a specific tool
go run ./cmd/generate-plugin claude .   # Claude Code
go run ./cmd/generate-plugin kiro .     # AWS Kiro CLI
go run ./cmd/generate-plugin gemini .   # Gemini CLI

# Generate for all tools
go run ./cmd/generate-plugin all ./plugins
Claude Code Integration

Option 1: Use generated plugin files

go run ./cmd/generate-plugin claude .

This creates:

  • .claude-plugin/plugin.json - Plugin manifest
  • skills/phone-input/SKILL.md - Voice calling skill
  • commands/call.md - /call slash command
  • .claude/settings.json - Lifecycle hooks

Option 2: Manual MCP configuration

Add to ~/.claude/settings.json or .claude/settings.json:

{
  "mcpServers": {
    "agentcall": {
      "command": "/path/to/agentcall",
      "env": {
        "AGENTCALL_PHONE_ACCOUNT_SID": "ACxxx",
        "AGENTCALL_PHONE_AUTH_TOKEN": "xxx",
        "AGENTCALL_PHONE_NUMBER": "+15551234567",
        "AGENTCALL_USER_PHONE_NUMBER": "+15559876543",
        "NGROK_AUTHTOKEN": "xxx"
      }
    }
  }
}
Kiro CLI Integration
go run ./cmd/generate-plugin kiro .

This creates:

  • .kiro/settings/mcp.json - MCP server configuration
  • .kiro/agents/voice-caller.json - Voice calling agent

Manual configuration - Add to .kiro/settings/mcp.json:

{
  "mcpServers": {
    "agentcall": {
      "command": "/path/to/agentcall",
      "env": {
        "AGENTCALL_PHONE_ACCOUNT_SID": "${AGENTCALL_PHONE_ACCOUNT_SID}",
        "AGENTCALL_PHONE_AUTH_TOKEN": "${AGENTCALL_PHONE_AUTH_TOKEN}",
        "AGENTCALL_PHONE_NUMBER": "${AGENTCALL_PHONE_NUMBER}",
        "AGENTCALL_USER_PHONE_NUMBER": "${AGENTCALL_USER_PHONE_NUMBER}",
        "NGROK_AUTHTOKEN": "${NGROK_AUTHTOKEN}"
      }
    }
  }
}
Gemini CLI Integration
go run ./cmd/generate-plugin gemini .

This creates:

  • gemini-extension.json - Extension manifest
  • commands/call.toml - Call command
  • agents/voice-caller.toml - Voice calling agent

MCP Tools

initiate_call

Start a new call to the user.

{
  "message": "Hey! I finished implementing the feature. Want me to walk you through it?"
}

Returns:

{
  "call_id": "call-1-1234567890",
  "response": "Sure, go ahead and explain what you built."
}
continue_call

Continue an active call with another message.

{
  "call_id": "call-1-1234567890",
  "message": "I added authentication using JWT. Should I also add refresh tokens?"
}
speak_to_user

Speak without waiting for a response (useful for status updates).

{
  "call_id": "call-1-1234567890",
  "message": "Let me search for that in the codebase. Give me a moment..."
}
end_call

End the call with an optional goodbye message.

{
  "call_id": "call-1-1234567890",
  "message": "Perfect! I'll get started on that. Talk soon!"
}

Use Cases

Ideal for:

  • Reporting significant task completion
  • Requesting clarification when blocked
  • Discussing complex decisions
  • Walking through code changes
  • Multi-step processes needing back-and-forth

Not ideal for:

  • Simple yes/no questions (use text)
  • Status updates that don't need discussion
  • Information already in the conversation

Development

Project Structure
agentcall/
├── cmd/
│   └── agentcall/
│       └── main.go          # Entry point
├── pkg/
│   ├── callmanager/
│   │   └── manager.go       # Call orchestration
│   ├── config/
│   │   └── config.go        # Configuration
│   └── tools/
│       └── tools.go         # MCP tool definitions
├── go.mod
└── README.md
Dependencies
  • github.com/agentplexus/omnivoice - Voice abstraction layer
  • github.com/agentplexus/go-elevenlabs - ElevenLabs TTS provider
  • github.com/agentplexus/omnivoice-deepgram - Deepgram STT provider
  • github.com/agentplexus/omnivoice-twilio - Twilio transport and call system
  • github.com/agentplexus/mcpkit - MCP server runtime
  • github.com/modelcontextprotocol/go-sdk - MCP protocol SDK

Cost Estimate

Service Cost
Twilio outbound calls ~$0.014/min
Twilio phone number ~$1.15/month
ElevenLabs TTS $0.30/1K chars ($0.03/min of speech)
Deepgram STT ~$0.0043/min (Nova-2)
ngrok (free tier) $0
Total per minute ~$0.05/min

Note: Costs vary by plan and usage. ElevenLabs and Deepgram offer free tiers for testing.

License

MIT

Credits

Inspired by ZeframLou/call-me (TypeScript).

Built with the agentplexus stack:

Directories

Path Synopsis
cmd
agentcall command
Package main is the entry point for the agentcall MCP server.
Package main is the entry point for the agentcall MCP server.
generate-plugin command
Package main generates AI assistant integration files using assistantkit bundle.
Package main generates AI assistant integration files using assistantkit bundle.
publish command
Package main publishes the agentcall plugin to the Claude Code marketplace.
Package main publishes the agentcall plugin to the Claude Code marketplace.
pkg
callmanager
Package callmanager orchestrates voice calls using the omnivoice stack.
Package callmanager orchestrates voice calls using the omnivoice stack.
config
Package config provides configuration management for agentcall.
Package config provides configuration management for agentcall.
tools
Package tools defines the MCP tools for agentcall.
Package tools defines the MCP tools for agentcall.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL