mfcc

package
v0.0.0-...-26fe002
Published: Jun 24, 2017 License: BSD-2-Clause Imports: 3 Imported by: 1

Documentation

Overview

Package mfcc computes Mel-frequency cepstral coefficients (MFCCs) from raw sample data.

For more information about MFCC, see Wikipedia: https://en.wikipedia.org/wiki/Mel-frequency_cepstrum

Constants

const (
	DefaultWindow  = time.Millisecond * 20
	DefaultOverlap = time.Millisecond * 10

	DefaultFFTSize   = 512
	DefaultLowFreq   = 300
	DefaultHighFreq  = 8000
	DefaultMelCount  = 26
	DefaultKeepCount = 13
)
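
To make the time-based defaults concrete, the following sketch converts DefaultWindow and DefaultOverlap into sample counts at a 16 kHz sample rate. The helper samplesFor and the lowercase constant names are hypothetical and only mirror the values listed above; the package may round or resample differently.

```go
package main

import (
	"fmt"
	"time"
)

// Values copied from the package constants above.
const (
	defaultWindow  = time.Millisecond * 20
	defaultOverlap = time.Millisecond * 10
)

// samplesFor converts a duration into a whole number of samples at the
// given sample rate, using integer arithmetic to avoid rounding error.
// This helper is hypothetical and not part of the package.
func samplesFor(d time.Duration, sampleRate int) int {
	return int(int64(d) * int64(sampleRate) / int64(time.Second))
}

func main() {
	const sampleRate = 16000
	window := samplesFor(defaultWindow, sampleRate)
	overlap := samplesFor(defaultOverlap, sampleRate)
	// The step between window starts is the window minus the overlap.
	fmt.Println(window, overlap, window-overlap)
}
```

At 16 kHz the default window spans 320 samples and adjacent windows share 160 of them.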

Variables

This section is empty.

Functions

This section is empty.

Types

type CoeffSource

type CoeffSource interface {
	// NextCoeffs returns the next batch of coefficients,
	// or an error if the underlying Source ended with one.
	//
	// This will never return a non-nil batch along with
	// an error.
	NextCoeffs() ([]float64, error)
}

CoeffSource computes MFCCs (or augmented MFCCs) from an underlying audio source.
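
A typical consumer calls NextCoeffs in a loop until it returns an error. The sketch below re-declares the interface locally so it compiles on its own; fakeCoeffs and readAll are hypothetical names, and treating io.EOF as a clean end of stream is an assumption, since the documentation above does not say which error the underlying Source reports.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// CoeffSource mirrors the interface documented above so that this
// sketch is self-contained.
type CoeffSource interface {
	NextCoeffs() ([]float64, error)
}

// fakeCoeffs is a hypothetical stand-in that yields fixed batches
// and then reports io.EOF.
type fakeCoeffs struct {
	batches [][]float64
}

func (f *fakeCoeffs) NextCoeffs() ([]float64, error) {
	if len(f.batches) == 0 {
		return nil, io.EOF
	}
	b := f.batches[0]
	f.batches = f.batches[1:]
	return b, nil
}

// readAll collects batches until the source reports an error.
// io.EOF is treated as a normal end of stream (an assumption).
func readAll(c CoeffSource) ([][]float64, error) {
	var out [][]float64
	for {
		batch, err := c.NextCoeffs()
		if err != nil {
			if errors.Is(err, io.EOF) {
				return out, nil
			}
			return out, err
		}
		out = append(out, batch)
	}
}

func main() {
	src := &fakeCoeffs{batches: [][]float64{{1, 2}, {3, 4}}}
	frames, err := readAll(src)
	fmt.Println(len(frames), err)
}
```

Because a batch is never returned together with an error, the loop only needs to check err before using the batch.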

func AddVelocities

func AddVelocities(c CoeffSource) CoeffSource

AddVelocities generates a CoeffSource which wraps c and augments every vector of coefficients with an additional vector of coefficient velocities.

For example, for input coefficients [a,b,c], the resulting source would produce coefficients [a,b,c,da,db,dc] where d stands for derivative.
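
The layout of an augmented vector can be illustrated with a simple first difference between consecutive frames. This is only a sketch: withVelocity is a hypothetical helper, and the documentation does not state how the package estimates the derivative, so the one-step difference used here is an assumption.

```go
package main

import "fmt"

// withVelocity appends a velocity estimate to cur, computed as the
// difference between cur and the previous frame prev. Both slices are
// assumed to have the same length. This one-step difference is an
// illustrative choice, not necessarily what AddVelocities does.
func withVelocity(prev, cur []float64) []float64 {
	out := make([]float64, 0, 2*len(cur))
	out = append(out, cur...)
	for i, x := range cur {
		out = append(out, x-prev[i])
	}
	return out
}

func main() {
	prev := []float64{1, 2, 3}
	cur := []float64{2, 4, 6}
	fmt.Println(withVelocity(prev, cur)) // prints [2 4 6 1 2 3]
}
```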

func MFCC

func MFCC(source Source, sampleRate int, options *Options) CoeffSource

MFCC generates a CoeffSource that computes the MFCCs of the given Source.

After source returns its first error, the last window will be padded with zeroes and used to compute a final batch of MFCCs before returning the error.
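
The zero padding of the final, partially filled window can be pictured as follows. padWindow is a hypothetical helper written for this illustration; it is not taken from the package source.

```go
package main

import "fmt"

// padWindow returns a copy of partial extended with zeroes up to size
// samples. If partial is longer than size, it is truncated.
// Hypothetical helper for illustration only.
func padWindow(partial []float64, size int) []float64 {
	padded := make([]float64, size) // zero-initialized by Go
	copy(padded, partial)
	return padded
}

func main() {
	// Two samples arrived before the source ended; the window holds four.
	fmt.Println(padWindow([]float64{0.5, -0.25}, 4)) // prints [0.5 -0.25 0 0]
}
```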

type Options

type Options struct {
	// Window is the amount of time represented in each
	// MFCC frame.
	// If this is 0, DefaultWindow is used.
	Window time.Duration

	// Overlap is the amount of overlapping time between
	// adjacent windows.
	// If this is 0, DefaultOverlap is used.
	Overlap time.Duration

	// DisableOverlap can be set to disable overlap.
	// If this is set to true, Overlap is ignored.
	DisableOverlap bool

	// FFTSize is the number of FFT bins to compute for
	// each window.
	// This must be a power of 2.
	// If this is 0, DefaultFFTSize is used.
	//
	// Note that the FFT size influences the
	// upsampling/downsampling behavior of the converter.
	FFTSize int

	// LowFreq is the minimum frequency for Mel banks.
	// If this is 0, DefaultLowFreq is used.
	LowFreq float64

	// HighFreq is the maximum frequency for Mel banks.
	// If this is 0, DefaultHighFreq is used.
	// In practice, this may be bounded by the FFT window
	// size.
	HighFreq float64

	// MelCount is the number of Mel banks to compute.
	// If this is 0, DefaultMelCount is used.
	MelCount int

	// KeepCount is the number of MFCCs to keep after the
	// discrete cosine transform is complete.
	// If this is 0, DefaultKeepCount is used.
	KeepCount int
}

Options stores all of the configuration options for computing MFCCs.
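
Because a zero field means "use the default", an Overlap of 0 cannot express "no overlap"; that is what DisableOverlap is for. The sketch below shows one way this resolution could work. The trimmed-down Options struct and the effectiveOverlap helper are hypothetical and only mirror the field comments above.

```go
package main

import (
	"fmt"
	"time"
)

// defaultOverlap mirrors DefaultOverlap from the constants above.
const defaultOverlap = time.Millisecond * 10

// Options is a local copy of the overlap-related fields only.
type Options struct {
	Overlap        time.Duration
	DisableOverlap bool
}

// effectiveOverlap resolves the overlap following the field comments:
// DisableOverlap wins, then an explicit Overlap, then the default.
// Hypothetical helper; the package may resolve options differently.
func effectiveOverlap(o Options) time.Duration {
	if o.DisableOverlap {
		return 0
	}
	if o.Overlap == 0 {
		return defaultOverlap
	}
	return o.Overlap
}

func main() {
	fmt.Println(effectiveOverlap(Options{}))                                // default
	fmt.Println(effectiveOverlap(Options{Overlap: 5 * time.Millisecond}))   // explicit
	fmt.Println(effectiveOverlap(Options{DisableOverlap: true}))            // disabled
}
```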

type SliceSource

type SliceSource struct {
	Slice []float64

	// Offset is the current offset into the slice.
	// This will be increased as samples are read.
	Offset int
}

A SliceSource is a Source which returns predetermined samples from a slice.

func (*SliceSource) ReadSamples

func (s *SliceSource) ReadSamples(out []float64) (n int, err error)
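
ReadSamples carries no documentation of its own. A minimal sketch of how such a method could behave, modeled on io.Reader, is shown below. The lowercase sliceSource type is a local re-creation of the struct above, and returning io.EOF once the slice is exhausted is an assumption.

```go
package main

import (
	"fmt"
	"io"
)

// sliceSource is a local re-creation of the documented SliceSource.
type sliceSource struct {
	Slice  []float64
	Offset int
}

// ReadSamples copies as many remaining samples as fit into out and
// advances Offset. Returning io.EOF when nothing is left is an
// assumption borrowed from io.Reader conventions.
func (s *sliceSource) ReadSamples(out []float64) (n int, err error) {
	if s.Offset >= len(s.Slice) {
		return 0, io.EOF
	}
	n = copy(out, s.Slice[s.Offset:])
	s.Offset += n
	return n, nil
}

func main() {
	src := &sliceSource{Slice: []float64{0.1, 0.2, 0.3}}
	buf := make([]float64, 2)
	for {
		n, err := src.ReadSamples(buf)
		if err != nil {
			break
		}
		fmt.Println(buf[:n], "offset:", src.Offset)
	}
}
```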

type Source

type Source interface {
	ReadSamples(s []float64) (n int, err error)
}

A Source is a place from which audio sample data can be read. This interface is very similar to io.Reader, except that it deals with samples instead of bytes.
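
Any type with a matching ReadSamples method satisfies Source. As an illustration, the sketch below implements a generator that produces a fixed number of sine-wave samples. The toneSource type and its fields are hypothetical, the interface is re-declared locally so the example compiles on its own, and ending the stream with io.EOF is an assumption carried over from io.Reader.

```go
package main

import (
	"fmt"
	"io"
	"math"
)

// Source mirrors the interface documented above.
type Source interface {
	ReadSamples(s []float64) (n int, err error)
}

// toneSource is a hypothetical Source that emits Remaining samples of
// a sine wave at Freq Hz, sampled at SampleRate Hz.
type toneSource struct {
	Freq       float64
	SampleRate int
	Remaining  int
	pos        int // index of the next sample to generate
}

// Compile-time check that *toneSource implements Source.
var _ Source = (*toneSource)(nil)

func (t *toneSource) ReadSamples(s []float64) (int, error) {
	if t.Remaining == 0 {
		return 0, io.EOF
	}
	n := len(s)
	if n > t.Remaining {
		n = t.Remaining
	}
	for i := 0; i < n; i++ {
		phase := 2 * math.Pi * t.Freq * float64(t.pos) / float64(t.SampleRate)
		s[i] = math.Sin(phase)
		t.pos++
	}
	t.Remaining -= n
	return n, nil
}

func main() {
	src := &toneSource{Freq: 440, SampleRate: 16000, Remaining: 5}
	buf := make([]float64, 4)
	for {
		n, err := src.ReadSamples(buf)
		if err != nil {
			break
		}
		fmt.Println(n, "samples read")
	}
}
```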
