Documentation
Overview
Package mfcc computes Mel-frequency cepstral coefficients (MFCCs) from raw sample data.
For more information about MFCC, see Wikipedia: https://en.wikipedia.org/wiki/Mel-frequency_cepstrum
Constants
const (
	DefaultWindow    = time.Millisecond * 20
	DefaultOverlap   = time.Millisecond * 10
	DefaultFFTSize   = 512
	DefaultLowFreq   = 300
	DefaultHighFreq  = 8000
	DefaultMelCount  = 26
	DefaultKeepCount = 13
)
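To make the time-based defaults concrete, the sketch below converts DefaultWindow and DefaultOverlap into sample counts. The helper samplesIn and the 16 kHz sample rate are assumptions made for illustration; the package may round differently when it performs this conversion internally.

```go
package main

import (
	"fmt"
	"time"
)

// Values copied from the package constants listed above.
const (
	DefaultWindow  = time.Millisecond * 20
	DefaultOverlap = time.Millisecond * 10
)

// samplesIn is a hypothetical helper (not part of the package). It
// converts a duration to a whole number of samples at the given
// sample rate, truncating any fractional sample.
func samplesIn(d time.Duration, sampleRate int) int {
	return int(d * time.Duration(sampleRate) / time.Second)
}

func main() {
	const sampleRate = 16000 // assumed input rate in Hz

	window := samplesIn(DefaultWindow, sampleRate)
	overlap := samplesIn(DefaultOverlap, sampleRate)

	fmt.Println("samples per window:", window)
	fmt.Println("samples shared by adjacent windows:", overlap)
	fmt.Println("step between window starts:", window-overlap)
}
```

At 16 kHz the default window spans 320 samples and adjacent windows share 160 of them, so a new window starts every 160 samples.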
Variables
This section is empty.
Functions
This section is empty.
Types
type CoeffSource
type CoeffSource interface {
	// NextCoeffs returns the next batch of coefficients,
	// or an error if the underlying Source ended with one.
	//
	// This will never return a non-nil batch along with
	// an error.
	NextCoeffs() ([]float64, error)
}
CoeffSource computes MFCCs (or augmented MFCCs) from an underlying audio source.
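A consumer typically calls NextCoeffs in a loop until it reports an error. The sketch below shows one way to do that. The interface is redeclared locally so the example is self-contained, fakeSource and drain are hypothetical names, and treating io.EOF as a clean end of stream is an assumption rather than something this page states.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// CoeffSource mirrors the interface documented above.
type CoeffSource interface {
	NextCoeffs() ([]float64, error)
}

// fakeSource is a hypothetical stand-in that yields canned batches
// and then reports io.EOF.
type fakeSource struct {
	batches [][]float64
}

func (f *fakeSource) NextCoeffs() ([]float64, error) {
	if len(f.batches) == 0 {
		return nil, io.EOF
	}
	b := f.batches[0]
	f.batches = f.batches[1:]
	return b, nil
}

// drain collects every batch until the source reports an error.
// io.EOF is treated as a normal end of stream (an assumption);
// any other error is passed back to the caller.
func drain(c CoeffSource) ([][]float64, error) {
	var all [][]float64
	for {
		batch, err := c.NextCoeffs()
		if err != nil {
			if errors.Is(err, io.EOF) {
				return all, nil
			}
			return all, err
		}
		all = append(all, batch)
	}
}

func main() {
	src := &fakeSource{batches: [][]float64{{1, 2}, {3, 4}}}
	all, err := drain(src)
	fmt.Println("batches:", len(all), "error:", err)
}
```

Because NextCoeffs never returns a batch together with an error, the loop can check the error first and ignore the batch whenever one is present.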
func AddVelocities
func AddVelocities(c CoeffSource) CoeffSource
AddVelocities generates a CoeffSource which wraps c and augments every vector of coefficients with an additional vector of coefficient velocities.
For example, for input coefficients [a,b,c], the resulting source would produce coefficients [a,b,c,da,db,dc] where d stands for derivative.
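The [a,b,c] to [a,b,c,da,db,dc] layout can be illustrated with a small standalone sketch. It estimates each derivative as the difference from the previous frame and uses zero for the first frame. Both choices are assumptions, since this page does not say how AddVelocities estimates velocities, and withVelocities is a hypothetical name.

```go
package main

import "fmt"

// withVelocities is a hypothetical helper, not the package's
// implementation. It appends a first-difference estimate of each
// coefficient's derivative to every frame. The first frame has no
// predecessor, so its velocities are set to zero. All frames are
// assumed to have the same length.
func withVelocities(frames [][]float64) [][]float64 {
	out := make([][]float64, len(frames))
	for i, frame := range frames {
		aug := make([]float64, 0, 2*len(frame))
		aug = append(aug, frame...)
		for j, v := range frame {
			if i == 0 {
				aug = append(aug, 0)
			} else {
				aug = append(aug, v-frames[i-1][j])
			}
		}
		out[i] = aug
	}
	return out
}

func main() {
	frames := [][]float64{{1, 2, 3}, {2, 4, 3}}
	for _, f := range withVelocities(frames) {
		fmt.Println(f)
	}
}
```

Each output frame is twice as long as its input frame: the original coefficients come first, followed by one velocity per coefficient.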
func MFCC
func MFCC(source Source, sampleRate int, options *Options) CoeffSource
MFCC generates a CoeffSource that computes the MFCCs of the given Source.
After source returns its first error, the last window will be padded with zeroes and used to compute a final batch of MFCCs before returning the error.
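The zero-padding of the final window can be pictured with the following sketch, which splits a sample slice into overlapping windows and pads the last one. The function name frames and its exact stopping rule are assumptions for illustration; the package reads from a streaming Source and may lay out its windows differently.

```go
package main

import "fmt"

// frames is a hypothetical helper. It splits samples into windows of
// `window` samples whose starts are `step` samples apart (step must
// be positive). If the last window runs past the end of the input,
// the missing samples are left as zeros.
func frames(samples []float64, window, step int) [][]float64 {
	var out [][]float64
	for start := 0; start < len(samples); start += step {
		frame := make([]float64, window) // zero-initialised
		copy(frame, samples[start:])     // copies at most `window` samples
		out = append(out, frame)
		if start+window >= len(samples) {
			break // this window already reached the end of the input
		}
	}
	return out
}

func main() {
	samples := []float64{1, 2, 3, 4, 5}
	for _, f := range frames(samples, 4, 2) {
		fmt.Println(f)
	}
}
```

With five samples, a window of four, and a step of two, the second window holds the last three samples followed by a single zero.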
type Options
type Options struct {
	// Window is the amount of time represented in each
	// MFCC frame.
	// If this is 0, DefaultWindow is used.
	Window time.Duration

	// Overlap is the amount of overlapping time between
	// adjacent windows.
	// If this is 0, DefaultOverlap is used.
	Overlap time.Duration

	// DisableOverlap can be set to disable overlap.
	// If this is set to true, Overlap is ignored.
	DisableOverlap bool

	// FFTSize is the number of FFT bins to compute for
	// each window.
	// This must be a power of 2.
	// If this is 0, DefaultFFTSize is used.
	//
	// It may be noted that the FFT size influences the
	// upsampling/downsampling behavior of the converter.
	FFTSize int

	// LowFreq is the minimum frequency for Mel banks.
	// If this is 0, DefaultLowFreq is used.
	LowFreq float64

	// HighFreq is the maximum frequency for Mel banks.
	// If this is 0, DefaultHighFreq is used.
	// In practice, this may be bounded by the FFT window
	// size.
	HighFreq float64

	// MelCount is the number of Mel banks to compute.
	// If this is 0, DefaultMelCount is used.
	MelCount int

	// KeepCount is the number of MFCCs to keep after the
	// discrete cosine transform is complete.
	// If this is 0, DefaultKeepCount is used.
	KeepCount int
}
Options stores all of the configuration options for computing MFCCs.
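The "if this is 0, the default is used" rule documented on each field can be sketched as a single resolution step. The types and constants below are local copies so the example is self-contained. The helpers withDefaults and isPowerOfTwo are hypothetical and only illustrate the documented rules, not the package's internal code.

```go
package main

import (
	"fmt"
	"time"
)

// Local copies of the documented constants.
const (
	DefaultWindow    = time.Millisecond * 20
	DefaultOverlap   = time.Millisecond * 10
	DefaultFFTSize   = 512
	DefaultLowFreq   = 300
	DefaultHighFreq  = 8000
	DefaultMelCount  = 26
	DefaultKeepCount = 13
)

// Options is a local copy of the documented struct.
type Options struct {
	Window         time.Duration
	Overlap        time.Duration
	DisableOverlap bool
	FFTSize        int
	LowFreq        float64
	HighFreq       float64
	MelCount       int
	KeepCount      int
}

// withDefaults is a hypothetical helper that replaces zero-valued
// fields with the documented defaults. DisableOverlap wins over
// Overlap, as the field comment describes.
func withDefaults(o Options) Options {
	if o.Window == 0 {
		o.Window = DefaultWindow
	}
	if o.DisableOverlap {
		o.Overlap = 0
	} else if o.Overlap == 0 {
		o.Overlap = DefaultOverlap
	}
	if o.FFTSize == 0 {
		o.FFTSize = DefaultFFTSize
	}
	if o.LowFreq == 0 {
		o.LowFreq = DefaultLowFreq
	}
	if o.HighFreq == 0 {
		o.HighFreq = DefaultHighFreq
	}
	if o.MelCount == 0 {
		o.MelCount = DefaultMelCount
	}
	if o.KeepCount == 0 {
		o.KeepCount = DefaultKeepCount
	}
	return o
}

// isPowerOfTwo reports whether n satisfies the FFTSize requirement.
func isPowerOfTwo(n int) bool {
	return n > 0 && n&(n-1) == 0
}

func main() {
	o := withDefaults(Options{MelCount: 40, DisableOverlap: true})
	fmt.Printf("%+v\n", o)
	fmt.Println("FFT size valid:", isPowerOfTwo(o.FFTSize))
}
```

One consequence of this convention is that a field cannot be set to an explicit zero, which is why overlap needs the separate DisableOverlap flag.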
type SliceSource
type SliceSource struct {
	Slice []float64

	// Offset is the current offset into the slice.
	// This will be increased as samples are read.
	Offset int
}
A SliceSource is a Source which returns pre-determined samples from a slice.
func (*SliceSource) ReadSamples
func (s *SliceSource) ReadSamples(out []float64) (n int, err error)
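This page gives no description for ReadSamples, so the sketch below is only a guess at how a slice-backed reader with this signature might behave. The type sliceReader is hypothetical and merely modelled on SliceSource. Returning io.EOF once the slice cannot fill the output buffer is an assumption, not documented behaviour.

```go
package main

import (
	"fmt"
	"io"
)

// sliceReader is a hypothetical type modelled on SliceSource.
type sliceReader struct {
	Slice  []float64
	Offset int
}

// ReadSamples copies as many remaining samples as fit into out and
// advances Offset. It returns io.EOF when fewer samples were
// available than requested (an assumed convention).
func (s *sliceReader) ReadSamples(out []float64) (n int, err error) {
	n = copy(out, s.Slice[s.Offset:])
	s.Offset += n
	if n < len(out) {
		err = io.EOF
	}
	return n, err
}

func main() {
	r := &sliceReader{Slice: []float64{1, 2, 3}}
	buf := make([]float64, 2)
	for {
		n, err := r.ReadSamples(buf)
		fmt.Println(buf[:n], err)
		if err != nil {
			break
		}
	}
}
```

Under these assumptions, the Offset field advances by the number of samples returned on each call, which matches the field comment on SliceSource.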