Documentation ¶
Overview ¶
Package stt provides STT service implementations (OpenAI Whisper, Groq Whisper).
Index ¶
Constants ¶
const DefaultGroqWhisperModel = "whisper-large-v3"
DefaultGroqWhisperModel is the default Groq Whisper model when none is specified.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type GroqService ¶
type GroqService struct {
// contains filtered or unexported fields
}
GroqService implements services.STTService using Groq's Whisper API. It performs speech-to-text conversion on Groq's infrastructure while remaining compatible with the OpenAI transcription request format.
func NewGroq ¶
func NewGroq(apiKey string) *GroqService
NewGroq creates a Groq Whisper STT service with the default model (whisper-large-v3). If apiKey is empty, config.GetEnv("GROQ_API_KEY", "") is used.
func NewGroqWithModel ¶
func NewGroqWithModel(apiKey, model string) *GroqService
NewGroqWithModel creates a Groq Whisper STT service with the given model. If apiKey is empty, config.GetEnv("GROQ_API_KEY", "") is used. If model is empty, DefaultGroqWhisperModel is used.
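The fallback rules described for NewGroq and NewGroqWithModel can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: getEnv is a hypothetical stand-in for config.GetEnv (whose exact behavior for empty values is assumed here), and resolveGroqSettings is an invented helper name.

```go
package main

import (
	"fmt"
	"os"
)

// DefaultGroqWhisperModel mirrors the constant documented above.
const DefaultGroqWhisperModel = "whisper-large-v3"

// getEnv is a hypothetical stand-in for config.GetEnv. It is assumed to
// return the variable's value, or fallback when the variable is unset or empty.
func getEnv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

// resolveGroqSettings (hypothetical helper) applies the documented fallbacks:
// an empty apiKey is read from GROQ_API_KEY, and an empty model becomes
// DefaultGroqWhisperModel.
func resolveGroqSettings(apiKey, model string) (string, string) {
	if apiKey == "" {
		apiKey = getEnv("GROQ_API_KEY", "")
	}
	if model == "" {
		model = DefaultGroqWhisperModel
	}
	return apiKey, model
}

func main() {
	key, model := resolveGroqSettings("", "")
	fmt.Printf("key set: %t, model: %s\n", key != "", model)
}
```

Under these assumptions, passing empty strings for both arguments yields whatever GROQ_API_KEY holds in the environment together with the default model, while explicit arguments are returned unchanged.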
func (*GroqService) Transcribe ¶
func (s *GroqService) Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]*frames.TranscriptionFrame, error)
Transcribe sends audio to Groq Whisper and returns a single final TranscriptionFrame.
func (*GroqService) TranscribeStream ¶
func (s *GroqService) TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- frames.Frame)
TranscribeStream buffers audio from audioCh and sends final TranscriptionFrame(s) to outCh.
type OpenAIService ¶
type OpenAIService struct {
// contains filtered or unexported fields
}
OpenAIService implements services.STTService using OpenAI Whisper.
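Because GroqService and OpenAIService are both documented as implementing services.STTService, calling code can presumably depend on the interface instead of a concrete type. The sketch below assumes the interface contains at least the two methods whose signatures appear on this page. STTService, Frame, TranscriptionFrame (including its Text and Final fields), fakeService, and firstText are all simplified, hypothetical stand-ins, since the real definitions are not shown here.

```go
package main

import (
	"context"
	"fmt"
)

// Frame and TranscriptionFrame are simplified stand-ins for the types in the
// frames package; their real fields are not shown in this documentation.
type Frame interface{}

type TranscriptionFrame struct {
	Text  string
	Final bool
}

// STTService is an assumed shape for services.STTService, inferred from the
// method signatures shared by GroqService and OpenAIService.
type STTService interface {
	Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]*TranscriptionFrame, error)
	TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- Frame)
}

// fakeService is a test double that reports the input size instead of
// calling a remote API.
type fakeService struct{}

func (fakeService) Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]*TranscriptionFrame, error) {
	text := fmt.Sprintf("%d bytes at %d Hz", len(audio), sampleRate)
	return []*TranscriptionFrame{{Text: text, Final: true}}, nil
}

func (fakeService) TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- Frame) {
	// Intentionally empty: streaming is not exercised in this sketch.
}

// Compile-time check that fakeService satisfies the assumed interface.
var _ STTService = fakeService{}

// firstText runs a one-shot transcription of 16 kHz mono audio through any
// STTService and returns the text of the first frame, if there is one.
func firstText(svc STTService, audio []byte) (string, error) {
	frames, err := svc.Transcribe(context.Background(), audio, 16000, 1)
	if err != nil {
		return "", err
	}
	if len(frames) == 0 {
		return "", nil
	}
	return frames[0].Text, nil
}

func main() {
	text, err := firstText(fakeService{}, make([]byte, 320))
	if err != nil {
		fmt.Println("transcription failed:", err)
		return
	}
	fmt.Println(text)
}
```

With the real package, a value returned by NewGroq or NewOpenAI would take the place of fakeService, provided the actual interface matches this assumed shape.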
func NewOpenAI ¶
func NewOpenAI(apiKey string) *OpenAIService
NewOpenAI creates an OpenAI Whisper STT service.
func (*OpenAIService) Transcribe ¶
func (s *OpenAIService) Transcribe(ctx context.Context, audio []byte, sampleRate, numChannels int) ([]*frames.TranscriptionFrame, error)
Transcribe sends audio to Whisper and returns a single final TranscriptionFrame. The audio is written to a temporary file because the client expects a file path.
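The temporary-file step can be illustrated with the standard library alone. This is a sketch of the general technique, not the package's implementation: writeTempAudio and fileSize are hypothetical helpers, and the "stt-*.wav" name pattern is an arbitrary choice. The bytes are written as-is, so the sketch assumes the audio is already in a container format the client accepts.

```go
package main

import (
	"fmt"
	"os"
)

// writeTempAudio (hypothetical helper) writes audio to a new temporary file
// and returns its path plus a cleanup function that removes the file.
func writeTempAudio(audio []byte) (string, func(), error) {
	f, err := os.CreateTemp("", "stt-*.wav")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.Remove(f.Name()) }
	if _, err := f.Write(audio); err != nil {
		f.Close()
		cleanup()
		return "", nil, err
	}
	// Close before handing the path to another reader so the data is flushed.
	if err := f.Close(); err != nil {
		cleanup()
		return "", nil, err
	}
	return f.Name(), cleanup, nil
}

// fileSize reports the size in bytes of the file at path.
func fileSize(path string) (int64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	path, cleanup, err := writeTempAudio([]byte{0, 1, 2, 3})
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	defer cleanup()

	size, err := fileSize(path)
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Println("temp file holds", size, "bytes")
}
```

Returning a cleanup function keeps removal of the file next to its creation, so a caller can defer it immediately and avoid leaving audio on disk after the request completes.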
func (*OpenAIService) TranscribeStream ¶
func (s *OpenAIService) TranscribeStream(ctx context.Context, audioCh <-chan []byte, sampleRate, numChannels int, outCh chan<- frames.Frame)
TranscribeStream buffers audio from audioCh and sends final TranscriptionFrame(s) to outCh. OpenAI Whisper is not truly streaming; this method batches incoming audio and transcribes it when the context is done or the buffer is flushed.
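The batching behavior described for TranscribeStream can be approximated with a select loop that accumulates chunks until the input channel closes or the context ends. This is a simplified sketch under stated assumptions: collectAudio, transcribeBuffered, and runDemo are hypothetical names, plain strings stand in for TranscriptionFrame values, the transcribe callback stands in for the batch Whisper call, and errors are dropped for brevity, which may differ from the real method.

```go
package main

import (
	"context"
	"fmt"
)

// collectAudio appends chunks from audioCh into one buffer until the channel
// is closed or ctx is done, then returns whatever has been gathered.
func collectAudio(ctx context.Context, audioCh <-chan []byte) []byte {
	var buf []byte
	for {
		select {
		case <-ctx.Done():
			return buf
		case chunk, ok := <-audioCh:
			if !ok {
				return buf
			}
			buf = append(buf, chunk...)
		}
	}
}

// transcribeBuffered collects all audio, runs one batch transcription, and
// sends the result to outCh. Empty input and errors produce no output here.
func transcribeBuffered(ctx context.Context, audioCh <-chan []byte, outCh chan<- string, transcribe func([]byte) (string, error)) {
	audio := collectAudio(ctx, audioCh)
	if len(audio) == 0 {
		return
	}
	text, err := transcribe(audio)
	if err != nil {
		return
	}
	outCh <- text
}

// runDemo feeds two chunks through transcribeBuffered with a stub transcriber
// that only reports how many bytes it received.
func runDemo() string {
	audioCh := make(chan []byte, 2)
	audioCh <- []byte("ab")
	audioCh <- []byte("cde")
	close(audioCh)

	outCh := make(chan string, 1)
	stub := func(audio []byte) (string, error) {
		return fmt.Sprintf("transcribed %d bytes", len(audio)), nil
	}
	transcribeBuffered(context.Background(), audioCh, outCh, stub)
	return <-outCh
}

func main() {
	fmt.Println(runDemo())
}
```

Because the whole buffer is transcribed in one request, latency grows with the amount of audio collected; a caller that needs earlier results would have to close the channel or cancel the context more often.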