leopard

v1.2.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 27, 2023 License: Apache-2.0 Imports: 14 Imported by: 4

README

Leopard Speech-to-Text Engine

Made in Vancouver, Canada by Picovoice

Leopard is an on-device speech-to-text engine. Leopard is:

  • Private: all voice processing runs locally.
  • Accurate
  • Compact and Computationally-Efficient
  • Cross-Platform:
    • Linux (x86_64), macOS (x86_64, arm64), and Windows (x86_64)
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge
    • Raspberry Pi (4, 3) and NVIDIA Jetson Nano

Compatibility

  • Go 1.16+
  • Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), Raspberry Pi (4, 3), and NVIDIA Jetson Nano.

Installation

go get github.com/Picovoice/leopard/binding/go

AccessKey

Leopard requires a valid Picovoice AccessKey at initialization. The AccessKey acts as your credentials when using Leopard SDKs. You can get your AccessKey for free. Make sure to keep your AccessKey secret. Sign up or log in to Picovoice Console to get your AccessKey.

Usage

Create an instance of the engine and transcribe an audio file:

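import (
    "fmt"

    . "github.com/Picovoice/leopard/binding/go"
)

leopard := NewLeopard("${ACCESS_KEY}")
err := leopard.Init()
if err != nil {
    // handle init error
}
// release native resources when done
defer leopard.Delete()

transcript, words, err := leopard.ProcessFile("${AUDIO_PATH}")
if err != nil {
    // handle process error
}

fmt.Println(transcript)
// each word carries its timing and a confidence score
for _, word := range words {
    fmt.Printf("%s [%.2fs - %.2fs] confidence: %.2f\n", word.Word, word.StartSec, word.EndSec, word.Confidence)
}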

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${AUDIO_PATH} with the path to an audio file. When done, be sure to explicitly release the resources using leopard.Delete().

Language Model

The Leopard Go SDK comes preloaded with a default English language model (.pv file). Default models for other supported languages can be found in lib/common.

Create custom language models using the Picovoice Console. Here you can train language models with custom vocabulary and boost words in the existing vocabulary.

Pass in the .pv file by setting .ModelPath on an instance of Leopard before initializing:

leopard := NewLeopard("${ACCESS_KEY}")
leopard.ModelPath = "${MODEL_PATH}"
err := leopard.Init()

Demos

Check out the Leopard Go demos here.

Documentation

Index

Constants

This section is empty.

Variables

var (
	// SampleRate Audio sample rate accepted by Picovoice.
	SampleRate int

	// Version Leopard version
	Version string
)

Functions

This section is empty.

Types

type Leopard

type Leopard struct {

	// AccessKey obtained from Picovoice Console (https://console.picovoice.ai/).
	AccessKey string

	// Absolute path to the file containing model parameters.
	ModelPath string

	// Absolute path to Leopard's dynamic library.
	LibraryPath string

	// Flag to enable automatic punctuation insertion.
	EnableAutomaticPunctuation bool
	// contains filtered or unexported fields
}

Leopard struct

func NewLeopard added in v1.1.0

func NewLeopard(accessKey string) Leopard

NewLeopard returns a Leopard struct with default parameters

func (*Leopard) Delete

func (leopard *Leopard) Delete() error

Delete releases resources acquired by Leopard.

func (*Leopard) Init

func (leopard *Leopard) Init() error

Init initializes Leopard. It must be called before Process or ProcessFile.

func (*Leopard) Process

func (leopard *Leopard) Process(pcm []int16) (string, []LeopardWord, error)

Process transcribes the given audio data. The audio must be single-channel, 16-bit linearly-encoded PCM with a sample rate equal to `.SampleRate`. To process audio in a different sample rate or format, consider using `ProcessFile`. Returns the inferred transcription along with word-level metadata.
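
Audio often arrives as raw bytes rather than as the `[]int16` slice that Process takes. The sketch below shows one way to convert it, assuming the bytes are little-endian 16-bit samples at the right sample rate; `bytesToPCM` is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// bytesToPCM (hypothetical helper) converts raw little-endian 16-bit PCM
// bytes into int16 samples. A trailing odd byte, if any, is ignored.
func bytesToPCM(raw []byte) []int16 {
	pcm := make([]int16, len(raw)/2)
	for i := range pcm {
		pcm[i] = int16(binary.LittleEndian.Uint16(raw[2*i:]))
	}
	return pcm
}

func main() {
	// Three samples: 0x0001, 0xffff and 0x8000 in little-endian byte order.
	raw := []byte{0x01, 0x00, 0xff, 0xff, 0x00, 0x80}
	fmt.Println(bytesToPCM(raw))
}
```

The resulting slice is what would be passed as the `pcm` argument of Process.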

func (*Leopard) ProcessFile

func (leopard *Leopard) ProcessFile(audioPath string) (string, []LeopardWord, error)

ProcessFile transcribes the given audio file. The supported formats are: `3gp (AMR)`, `FLAC`, `MP3`, `MP4/m4a (AAC)`, `Ogg`, `WAV`, `WebM`. Returns the inferred transcription along with word-level metadata.

type LeopardError

type LeopardError struct {
	StatusCode PvStatus
	Message    string
}

func (*LeopardError) Error

func (e *LeopardError) Error() string

type LeopardWord added in v1.1.0

type LeopardWord struct {
	// Transcribed word
	Word string

	// Start of word in seconds.
	StartSec float32

	// End of word in seconds.
	EndSec float32

	// Transcription confidence. It is a number within [0, 1].
	Confidence float32
}
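
The Confidence field makes it possible to flag words that may need review. Below is a minimal sketch using a stand-in copy of the struct; the `lowConfidence` helper, the 0.5 threshold, and the sample words are all made up for illustration.

```go
package main

import "fmt"

// LeopardWord is a stand-in copy of the documented struct.
type LeopardWord struct {
	Word       string
	StartSec   float32
	EndSec     float32
	Confidence float32
}

// lowConfidence (hypothetical helper) returns the words whose confidence
// is below threshold, in their original order.
func lowConfidence(words []LeopardWord, threshold float32) []LeopardWord {
	var out []LeopardWord
	for _, w := range words {
		if w.Confidence < threshold {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	// Made-up sample data; real values would come from Process or ProcessFile.
	words := []LeopardWord{
		{Word: "hello", StartSec: 0.0, EndSec: 0.4, Confidence: 0.95},
		{Word: "wurld", StartSec: 0.5, EndSec: 0.9, Confidence: 0.42},
	}
	for _, w := range lowConfidence(words, 0.5) {
		fmt.Printf("%q at %.2fs-%.2fs (confidence %.2f)\n", w.Word, w.StartSec, w.EndSec, w.Confidence)
	}
}
```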

type PvStatus

type PvStatus int

PvStatus type

const (
	SUCCESS                  PvStatus = 0
	OUT_OF_MEMORY            PvStatus = 1
	IO_ERROR                 PvStatus = 2
	INVALID_ARGUMENT         PvStatus = 3
	STOP_ITERATION           PvStatus = 4
	KEY_ERROR                PvStatus = 5
	INVALID_STATE            PvStatus = 6
	RUNTIME_ERROR            PvStatus = 7
	ACTIVATION_ERROR         PvStatus = 8
	ACTIVATION_LIMIT_REACHED PvStatus = 9
	ACTIVATION_THROTTLED     PvStatus = 10
	ACTIVATION_REFUSED       PvStatus = 11
)

Possible status return codes from the Leopard library
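
When logging errors, a readable name is often more useful than the numeric code. The sketch below maps the values listed above to their names. It uses a stand-in PvStatus type, and `statusName` is a hypothetical helper; whether the package offers its own string conversion is not shown on this page.

```go
package main

import "fmt"

// PvStatus is a stand-in copy of the documented type.
type PvStatus int

// statusNames lists the documented constant names, indexed by value.
var statusNames = []string{
	"SUCCESS",
	"OUT_OF_MEMORY",
	"IO_ERROR",
	"INVALID_ARGUMENT",
	"STOP_ITERATION",
	"KEY_ERROR",
	"INVALID_STATE",
	"RUNTIME_ERROR",
	"ACTIVATION_ERROR",
	"ACTIVATION_LIMIT_REACHED",
	"ACTIVATION_THROTTLED",
	"ACTIVATION_REFUSED",
}

// statusName (hypothetical helper) returns the constant name for s, or
// UNKNOWN(n) for values outside the documented range.
func statusName(s PvStatus) string {
	if s >= 0 && int(s) < len(statusNames) {
		return statusNames[s]
	}
	return fmt.Sprintf("UNKNOWN(%d)", int(s))
}

func main() {
	fmt.Println(statusName(8))
	fmt.Println(statusName(42))
}
```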
