Documentation ¶
Overview ¶
Go bindings for the Vosk speech recognition toolkit. Vosk is an offline, open-source speech-to-text API for Android, iOS, Raspberry Pi and servers. It provides speech recognition models for 18 languages and dialects: English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino and Ukrainian.
Index ¶
- func GPUInit()
- func GPUThreadInit()
- func SetLogLevel(logLevel int)
- type VoskModel
- type VoskRecognizer
- func (r *VoskRecognizer) AcceptWaveform(buffer []byte) int
- func (r *VoskRecognizer) FinalResult() string
- func (r *VoskRecognizer) Free()
- func (r *VoskRecognizer) PartialResult() string
- func (r *VoskRecognizer) Reset()
- func (r *VoskRecognizer) Result() string
- func (r *VoskRecognizer) SetEndpointerDelays(startMax, end, max float64)
- func (r *VoskRecognizer) SetGrm(grammar string)
- func (r *VoskRecognizer) SetMaxAlternatives(maxAlternatives int)
- func (r *VoskRecognizer) SetPartialWords(words int)
- func (r *VoskRecognizer) SetSpkModel(spkModel *VoskSpkModel)
- func (r *VoskRecognizer) SetWords(words int)
- type VoskSpkModel
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GPUInit ¶
func GPUInit()
GPUInit automatically selects a CUDA device and allows multithreading.
func GPUThreadInit ¶
func GPUThreadInit()
GPUThreadInit initializes the CUDA device in a multi-threaded environment.
func SetLogLevel ¶
func SetLogLevel(logLevel int)
SetLogLevel sets the log level for Kaldi messages.
Types ¶
type VoskModel ¶
type VoskModel struct {
// contains filtered or unexported fields
}
VoskModel contains a reference to the C VoskModel.
type VoskRecognizer ¶
type VoskRecognizer struct {
// contains filtered or unexported fields
}
VoskRecognizer contains a reference to the C VoskRecognizer.
func NewRecognizer ¶
func NewRecognizer(model *VoskModel, sampleRate float64) (*VoskRecognizer, error)
NewRecognizer creates a new VoskRecognizer instance.
func NewRecognizerGrm ¶
func NewRecognizerGrm(model *VoskModel, sampleRate float64, grammar string) (*VoskRecognizer, error)
NewRecognizerGrm creates a new VoskRecognizer instance with the phrase list given in grammar.
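The grammar argument is a plain string, so it helps to see how a phrase list might be encoded. The sketch below assumes the format used by the underlying Vosk C API, a JSON array of phrases in which "[unk]" stands for out-of-vocabulary speech. The helper buildGrammar is hypothetical and not part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildGrammar encodes phrases as a JSON array of strings.
// Hypothetical helper: it only prepares the string that would be passed
// as the grammar argument of NewRecognizerGrm or SetGrm.
func buildGrammar(phrases ...string) (string, error) {
	if phrases == nil {
		// Encode an empty list as [] rather than null.
		phrases = []string{}
	}
	b, err := json.Marshal(phrases)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Assumption: "[unk]" marks speech outside the phrase list.
	grammar, err := buildGrammar("one two three", "yes", "no", "[unk]")
	if err != nil {
		panic(err)
	}
	fmt.Println(grammar)
}
```

Building the string with encoding/json rather than by hand keeps quoting and escaping correct when phrases contain quotes or non-ASCII characters.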
func NewRecognizerSpk ¶
func NewRecognizerSpk(model *VoskModel, sampleRate float64, spkModel *VoskSpkModel) (*VoskRecognizer, error)
NewRecognizerSpk creates a new VoskRecognizer instance with a speaker model.
func (*VoskRecognizer) AcceptWaveform ¶
func (r *VoskRecognizer) AcceptWaveform(buffer []byte) int
AcceptWaveform accepts and processes a new chunk of the voice data.
func (*VoskRecognizer) FinalResult ¶
func (r *VoskRecognizer) FinalResult() string
FinalResult returns a speech recognition result. It is the same as Result, but does not wait for silence.
func (*VoskRecognizer) Free ¶
func (r *VoskRecognizer) Free()
func (*VoskRecognizer) PartialResult ¶
func (r *VoskRecognizer) PartialResult() string
PartialResult returns a partial speech recognition result.
func (*VoskRecognizer) Result ¶
func (r *VoskRecognizer) Result() string
Result returns a speech recognition result.
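The result methods return strings rather than structured values. The sketch below assumes that these strings are JSON objects, with the recognized text under "text" for Result and FinalResult and under "partial" for PartialResult, as produced by the underlying Vosk library. The type transcript, the function parseTranscript and the sample inputs are illustrative, not part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transcript mirrors the two fields assumed to appear in recognizer output.
type transcript struct {
	Text    string `json:"text"`    // assumed key for Result and FinalResult
	Partial string `json:"partial"` // assumed key for PartialResult
}

// parseTranscript decodes one raw result string.
func parseTranscript(raw string) (transcript, error) {
	var t transcript
	err := json.Unmarshal([]byte(raw), &t)
	return t, err
}

func main() {
	// Hand-written sample strings standing in for recognizer output.
	final, err := parseTranscript(`{"text": "hello world"}`)
	if err != nil {
		panic(err)
	}
	partial, err := parseTranscript(`{"partial": "hello"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println("final:", final.Text)
	fmt.Println("partial:", partial.Partial)
}
```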
func (*VoskRecognizer) SetEndpointerDelays ¶
func (r *VoskRecognizer) SetEndpointerDelays(startMax, end, max float64)
SetEndpointerDelays sets the recognition timeouts in seconds: startMax is the timeout for stopping recognition when the audio starts with silence (usually around 5), end is the timeout for stopping recognition after something has been recognized (usually around 0.5-1.0), and max is the timeout for forcing the end of an utterance (usually around 20-30).
func (*VoskRecognizer) SetGrm ¶
func (r *VoskRecognizer) SetGrm(grammar string)
SetGrm sets which phrases to recognize on an already initialized recognizer.
func (*VoskRecognizer) SetMaxAlternatives ¶
func (r *VoskRecognizer) SetMaxAlternatives(maxAlternatives int)
SetMaxAlternatives configures the recognizer to output n-best results.
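When n-best output is enabled, the result string has a different shape from the single-hypothesis case. The sketch below assumes the layout produced by the underlying Vosk library: an "alternatives" array whose entries carry "confidence" and "text". The types, the function bestAlternative and the sample values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// alternative is one hypothesis in the assumed n-best layout.
type alternative struct {
	Confidence float64 `json:"confidence"`
	Text       string  `json:"text"`
}

// nbest is the assumed top-level object returned when alternatives are enabled.
type nbest struct {
	Alternatives []alternative `json:"alternatives"`
}

// bestAlternative returns the hypothesis with the highest confidence.
func bestAlternative(raw string) (alternative, error) {
	var out nbest
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		return alternative{}, err
	}
	if len(out.Alternatives) == 0 {
		return alternative{}, errors.New("no alternatives in result")
	}
	best := out.Alternatives[0]
	for _, a := range out.Alternatives[1:] {
		if a.Confidence > best.Confidence {
			best = a
		}
	}
	return best, nil
}

func main() {
	// Hand-written sample; real confidence values may use a different scale.
	raw := `{"alternatives":[
		{"confidence":0.29,"text":"wreck a nice beach"},
		{"confidence":0.71,"text":"recognize speech"}]}`
	best, err := bestAlternative(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(best.Text)
}
```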
func (*VoskRecognizer) SetPartialWords ¶
func (r *VoskRecognizer) SetPartialWords(words int)
SetPartialWords enables words with times in the partial output.
func (*VoskRecognizer) SetSpkModel ¶
func (r *VoskRecognizer) SetSpkModel(spkModel *VoskSpkModel)
SetSpkModel adds a speaker model to an already initialized recognizer.
func (*VoskRecognizer) SetWords ¶
func (r *VoskRecognizer) SetWords(words int)
SetWords enables words with times in the output.
type VoskSpkModel ¶
type VoskSpkModel struct {
// contains filtered or unexported fields
}
VoskSpkModel contains a reference to the C VoskSpkModel.
func NewSpkModel ¶
func NewSpkModel(spkModelPath string) (*VoskSpkModel, error)
NewSpkModel creates a new VoskSpkModel instance.
func (*VoskSpkModel) Free ¶
func (s *VoskSpkModel) Free()