package voice

Version: v0.3.0 (latest) | Published: Feb 23, 2026 | License: MIT | Imports: 7 | Imported by: 0


Repository

github.com/agentplexus/omniagent


Documentation ¶

Overview ¶

Package voice provides voice processing capabilities for omniagent.

Index ¶

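type Config
type Processor
- func New(config Config, logger *slog.Logger) (*Processor, error)
- func (p *Processor) Close() error
- func (p *Processor) ResponseMode() string
- func (p *Processor) SynthesizeSpeech(ctx context.Context, text string) ([]byte, string, error)
- func (p *Processor) TranscribeAudio(ctx context.Context, audio []byte, mimeType string) (string, error)
type STTConfig
type TTSConfig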

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Config ¶

type Config struct {
	// Enabled indicates whether voice processing is enabled.
	Enabled bool
	// ResponseMode controls when to respond with voice: "auto", "always", "never".
	// "auto" responds with voice when the user sends a voice message.
	ResponseMode string
	// STT configures speech-to-text.
	STT STTConfig
	// TTS configures text-to-speech.
	TTS TTSConfig
}

Config configures voice processing.
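As an illustration of how these fields fit together, the following is a minimal sketch that builds a Config and checks ResponseMode against the three documented values. The struct types are local mirrors of the documented ones so the sketch compiles on its own. validateConfig is a hypothetical helper and not part of this package, and the API key and model strings are placeholders. Whether New performs a similar check is not stated in the documentation.

```go
package main

import "fmt"

// Local mirrors of the documented structs, so this sketch is self-contained.
type STTConfig struct {
	Provider, APIKey, Model, Language string
}

type TTSConfig struct {
	Provider, APIKey, Model, VoiceID string
}

type Config struct {
	Enabled      bool
	ResponseMode string
	STT          STTConfig
	TTS          TTSConfig
}

// validateConfig is a hypothetical helper (not part of package voice).
// It accepts only the three response modes named in the Config comments.
func validateConfig(c Config) error {
	if !c.Enabled {
		return nil // nothing to validate when voice processing is off
	}
	switch c.ResponseMode {
	case "auto", "always", "never":
		return nil
	default:
		return fmt.Errorf("unknown response mode %q", c.ResponseMode)
	}
}

func main() {
	cfg := Config{
		Enabled:      true,
		ResponseMode: "auto",
		STT: STTConfig{
			Provider: "deepgram",
			APIKey:   "YOUR_API_KEY",      // placeholder
			Model:    "placeholder-model", // placeholder, provider-specific
			Language: "en-US",             // BCP-47; empty means auto-detection
		},
		TTS: TTSConfig{
			Provider: "deepgram",
			APIKey:   "YOUR_API_KEY",      // placeholder
			Model:    "placeholder-model", // placeholder, provider-specific
			VoiceID:  "placeholder-voice", // placeholder, provider-specific
		},
	}
	if err := validateConfig(cfg); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```

Validating the mode string before constructing a Processor keeps a typo such as "allways" from surfacing later as unexpected runtime behavior.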

type Processor ¶

type Processor struct {
	// contains filtered or unexported fields
}

Processor handles voice transcription and synthesis using OmniVoice interfaces.

func New ¶

func New(config Config, logger *slog.Logger) (*Processor, error)

New creates a new voice processor with the configured providers.

func (*Processor) Close ¶

func (p *Processor) Close() error

Close releases provider resources.

func (*Processor) ResponseMode ¶

func (p *Processor) ResponseMode() string

ResponseMode returns the voice response mode (see Config.ResponseMode): "auto", "always", or "never".
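A caller would typically use the returned mode to decide whether to reply with audio. The sketch below encodes the rule described in the Config comments. shouldRespondWithVoice is a hypothetical helper, not part of this package, and treating unrecognized values like "never" is an assumption.

```go
package main

import "fmt"

// shouldRespondWithVoice is a hypothetical helper (not part of package voice).
// mode is the value returned by Processor.ResponseMode; incomingWasVoice
// reports whether the user's message was a voice message.
func shouldRespondWithVoice(mode string, incomingWasVoice bool) bool {
	switch mode {
	case "always":
		return true
	case "auto":
		// "auto" mirrors the user: voice in, voice out.
		return incomingWasVoice
	default:
		// "never" and, as an assumption, any unrecognized value.
		return false
	}
}

func main() {
	for _, mode := range []string{"auto", "always", "never"} {
		fmt.Printf("%-6s voice-in=%t text-in=%t\n",
			mode,
			shouldRespondWithVoice(mode, true),
			shouldRespondWithVoice(mode, false))
	}
}
```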

func (*Processor) SynthesizeSpeech ¶

func (p *Processor) SynthesizeSpeech(ctx context.Context, text string) ([]byte, string, error)

SynthesizeSpeech converts text to audio using the configured TTS provider. It returns the audio bytes, the MIME type of the audio, and an error.

func (*Processor) TranscribeAudio ¶

func (p *Processor) TranscribeAudio(ctx context.Context, audio []byte, mimeType string) (string, error)

TranscribeAudio converts audio to text using the configured STT provider.

type STTConfig ¶

type STTConfig struct {
	// Provider is the STT provider name (e.g., "deepgram").
	Provider string
	// APIKey is the provider API key.
	APIKey string //nolint:gosec // G117: APIKey loaded from config file
	// Model is the provider-specific model identifier.
	Model string
	// Language is the BCP-47 language code. Empty for auto-detection.
	Language string
}

STTConfig configures the speech-to-text provider.

type TTSConfig ¶

type TTSConfig struct {
	// Provider is the TTS provider name (e.g., "deepgram").
	Provider string
	// APIKey is the provider API key.
	APIKey string //nolint:gosec // G117: APIKey loaded from config file
	// Model is the provider-specific model identifier.
	Model string
	// VoiceID is the provider-specific voice identifier.
	VoiceID string
}

TTSConfig configures the text-to-speech provider.
