github.com/btracey/mixent

Package mixent

v0.0.0-...-dc3ef5b

Published: Oct 28, 2017 | License: BSD-3-Clause | Module: github.com/btracey/mixent

Overview

mixent implements routines for estimating the entropy of a mixture distribution. See

Estimating Mixture Entropy using Pairwise Distances by A. Kolchinsky and
B. Tracey

for more information. Documentation notation: the mixture distribution is

p(x) = \sum_{i=1}^N w_i p_i(x)

The symbol X represents the mixture distribution p(x), and X_i represents p_i(x), the distribution of the i^th mixture component. The documentation also uses \sum_i as shorthand for \sum_{i=1}^N.

Please note that some of the types implemented here work for all mixture distributions, while others only work for special cases, such as mixture of Gaussians.

type AvgEnt

type AvgEnt struct{}

AvgEnt provides a lower bound on the mixture entropy using the average entropy of the components. See the MixtureEntropy method for more information.

func (AvgEnt) MixtureEntropy

func (AvgEnt) MixtureEntropy(components []Component, weights []float64) float64

MixtureEntropy computes the entropy of the mixture conditional on the mixture weight, which is a lower bound to the true entropy of the mixture distribution.

H(X) ≥ H(X|W) = \sum_i w_i H(X_i)

i.e. the average entropy of the components.

If weights is nil, all components are assumed to have the same mixture weight.
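As a minimal sketch of this bound (not the package's implementation), the standalone program below computes \sum_i w_i H(X_i) from precomputed component entropies. The names avgEntropy, gaussianEntropy, and closeTo are hypothetical helpers, and the nil-weights branch assumes uniform weights 1/N.

```go
package main

import (
	"fmt"
	"math"
)

// gaussianEntropy returns the differential entropy (in nats) of a 1-D
// Gaussian with variance v: 0.5*ln(2*pi*e*v).
func gaussianEntropy(v float64) float64 {
	return 0.5 * math.Log(2*math.Pi*math.E*v)
}

// avgEntropy returns \sum_i w_i H(X_i). A nil weights slice is treated as
// uniform weights 1/N (an assumption of this sketch).
func avgEntropy(entropies, weights []float64) float64 {
	n := len(entropies)
	var h float64
	for i, e := range entropies {
		w := 1 / float64(n)
		if weights != nil {
			w = weights[i]
		}
		h += w * e
	}
	return h
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	ents := []float64{gaussianEntropy(1), gaussianEntropy(4)}
	fmt.Println("uniform weights: ", avgEntropy(ents, nil))
	fmt.Println("explicit weights:", avgEntropy(ents, []float64{0.25, 0.75}))
	// Explicit equal weights should match the nil-weights result.
	fmt.Println(closeTo(avgEntropy(ents, nil), avgEntropy(ents, []float64{0.5, 0.5}), 1e-12))
}
```

Because the sketch takes entropies directly, it applies to any component type that can report its own entropy, which mirrors the role of the Component interface.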

type Component

type Component interface {
	Entropy() float64
}

Component is an individual component in a mixture distribution.

type ComponentCenters

type ComponentCenters struct{}

ComponentCenters estimates the entropy based on the probability at the component centers.

func (ComponentCenters) MixtureEntropy

func (comp ComponentCenters) MixtureEntropy(components []Component, weights []float64) float64

MixtureEntropy computes the estimate of the entropy based on the mixture probability evaluated at the component centers.

H(X) ≈ - \sum_i w_i ln \sum_j w_j p_j(μ_i)

If weights is nil, all components are assumed to have the same mixture weight.

Currently only coded for Gaussian and Uniform components.
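The formula above can be sketched for one-dimensional Gaussian components as follows. This is an illustration under stated assumptions, not the package's code: normalPDF, centersEntropy, and closeTo are hypothetical names, components are described by a mean and a variance, and nil weights are taken to mean uniform weights.

```go
package main

import (
	"fmt"
	"math"
)

// normalPDF evaluates the 1-D Gaussian density with mean mu and variance v at x.
func normalPDF(x, mu, v float64) float64 {
	d := x - mu
	return math.Exp(-d*d/(2*v)) / math.Sqrt(2*math.Pi*v)
}

// centersEntropy returns -\sum_i w_i ln \sum_j w_j p_j(mu_i) for 1-D Gaussian
// components. A nil weights slice is treated as uniform (assumption).
func centersEntropy(mus, vars, weights []float64) float64 {
	n := len(mus)
	w := weights
	if w == nil {
		w = make([]float64, n)
		for i := range w {
			w[i] = 1 / float64(n)
		}
	}
	var h float64
	for i := range mus {
		var p float64 // mixture density at the center of component i
		for j := range mus {
			p += w[j] * normalPDF(mus[i], mus[j], vars[j])
		}
		h -= w[i] * math.Log(p)
	}
	return h
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	mus := []float64{0, 3}
	vars := []float64{1, 1}
	fmt.Println(centersEntropy(mus, vars, nil))
}
```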

type DistNormaler

type DistNormaler interface {
	DistNormal(l, r *distmv.Normal) float64
}

DistNormaler is a type that can compute the distance between two Normal distributions.

type DistUniformer

type DistUniformer interface {
	DistUniform(l, r *distmv.Uniform) float64
}

DistUniformer is a type that can compute the distance between two Uniform distributions.

type Distancer

type Distancer interface {
	Distance(a, b Component) float64
}

Distancer is a type that can compute the distance between two components. The distance returned must be non-negative, and must equal zero if a and b are identical distributions.

type ELK

type ELK struct{}

ELK implements the Expected Likelihood Kernel.

ELK(X_i, X_j) = \int_x p_i(x) p_j(x) dx

The Expected Likelihood Kernel can be used to find a lower bound on the mixture entropy. See the MixtureEntropy method for more information.

func (ELK) LogKernelNormal

func (ELK) LogKernelNormal(l, r *distmv.Normal) float64

LogKernelNormal computes the log of the Expected Likelihood Kernel for two Gaussians.

ELK = 𝒩(μ_i; μ_j, Σ_i+Σ_j)
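For intuition, the closed form above reduces in one dimension to a Gaussian density in the difference of the means with the variances added. The sketch below evaluates its log directly. The names logELK1D and closeTo are hypothetical, and the function takes means and variances rather than *distmv.Normal values.

```go
package main

import (
	"fmt"
	"math"
)

// logELK1D returns ln N(muI; muJ, varI+varJ), the log of the Expected
// Likelihood Kernel between two 1-D Gaussians given by mean and variance.
func logELK1D(muI, varI, muJ, varJ float64) float64 {
	v := varI + varJ // variance of the difference of the two variables
	d := muI - muJ
	return -0.5*math.Log(2*math.Pi*v) - d*d/(2*v)
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	fmt.Println("same mean:   ", logELK1D(0, 0.5, 0, 0.5))
	fmt.Println("shifted mean:", logELK1D(0, 0.5, 2, 0.5))
	// The kernel is symmetric in its two arguments.
	fmt.Println(closeTo(logELK1D(0, 1, 3, 2), logELK1D(3, 2, 0, 1), 1e-12))
}
```

Working in log space avoids underflow when the components are far apart, which is presumably why the method returns the log of the kernel.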

func (ELK) LogKernelUniform

func (ELK) LogKernelUniform(l, r *distmv.Uniform) float64

func (ELK) MixtureEntropy

func (elk ELK) MixtureEntropy(components []Component, weights []float64) float64

MixtureEntropy computes an estimate of the mixture entropy using the Expected Likelihood Kernel. The estimate, which is a lower bound on the mixture entropy, is

H(X) ≈ -\sum_i w_i ln(\sum_j w_j z_{i,j})

where z_{i,j} = ELK(X_i, X_j).

Currently only works with components that are *distmv.Normal or *distmv.Uniform.
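A hedged one-dimensional sketch of this estimate follows. It is not the package's code: elkEntropy, logELK1D, and closeTo are hypothetical names, components are given as means and variances, and nil weights are assumed to mean uniform weights.

```go
package main

import (
	"fmt"
	"math"
)

// logELK1D returns ln N(muI; muJ, varI+varJ) for two 1-D Gaussians.
func logELK1D(muI, varI, muJ, varJ float64) float64 {
	v := varI + varJ
	d := muI - muJ
	return -0.5*math.Log(2*math.Pi*v) - d*d/(2*v)
}

// elkEntropy returns -\sum_i w_i ln(\sum_j w_j z_{i,j}), where z_{i,j} is the
// Expected Likelihood Kernel between components i and j.
func elkEntropy(mus, vars, weights []float64) float64 {
	n := len(mus)
	w := weights
	if w == nil { // assumption: nil means uniform weights
		w = make([]float64, n)
		for i := range w {
			w[i] = 1 / float64(n)
		}
	}
	var h float64
	for i := range mus {
		var z float64
		for j := range mus {
			z += w[j] * math.Exp(logELK1D(mus[i], vars[i], mus[j], vars[j]))
		}
		h -= w[i] * math.Log(z)
	}
	return h
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	fmt.Println(elkEntropy([]float64{0, 2}, []float64{0.5, 0.5}, nil))
}
```

For a single Gaussian with variance 0.5 the sketch returns 0.5*ln(2π), which is below that component's true entropy 0.5*ln(πe), consistent with the estimate being a lower bound.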

type ELKDist

type ELKDist struct{}

ELKDist is a distance metric based on the expected likelihood distance. It is equal to

- ln(ELK(X_i, X_j)/sqrt(ELK(X_i,X_i)*ELK(X_j,X_j)))
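To make the normalization concrete, the sketch below evaluates this distance for one-dimensional Gaussians in log space. The names elkDist1D, logELK1D, and closeTo are hypothetical, and the closed form for the kernel is the Gaussian one, N(μ_i; μ_j, Σ_i+Σ_j), specialized to scalars.

```go
package main

import (
	"fmt"
	"math"
)

// logELK1D returns ln N(muI; muJ, varI+varJ) for two 1-D Gaussians.
func logELK1D(muI, varI, muJ, varJ float64) float64 {
	v := varI + varJ
	d := muI - muJ
	return -0.5*math.Log(2*math.Pi*v) - d*d/(2*v)
}

// elkDist1D returns -ln(ELK_ij / sqrt(ELK_ii * ELK_jj)), computed from the
// log kernels so that no intermediate value underflows.
func elkDist1D(muI, varI, muJ, varJ float64) float64 {
	cross := logELK1D(muI, varI, muJ, varJ)
	selfI := logELK1D(muI, varI, muI, varI)
	selfJ := logELK1D(muJ, varJ, muJ, varJ)
	return -(cross - 0.5*(selfI+selfJ))
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	fmt.Println("identical:", elkDist1D(1, 2, 1, 2))
	fmt.Println("shifted:  ", elkDist1D(0, 0.5, 2, 0.5))
}
```

Dividing by the geometric mean of the self-kernels is what makes the distance zero for identical components.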

func (ELKDist) DistNormal

func (ELKDist) DistNormal(l, r *distmv.Normal) float64

func (ELKDist) DistUniform

func (ELKDist) DistUniform(l, r *distmv.Uniform) float64

type Estimator

type Estimator interface {
	MixtureEntropy(components []Component, weights []float64) float64
}

Estimator is a type that can estimate the entropy of a mixture of Components.

type JointEntropy

type JointEntropy struct{}

JointEntropy provides an upper bound on the mixture entropy using the joint entropy of the mixture and the weights. See the MixtureEntropy method for more information.

func (JointEntropy) MixtureEntropy

func (JointEntropy) MixtureEntropy(components []Component, weights []float64) float64

MixtureEntropy computes an upper bound to the mixture entropy for arbitrary mixture components.

The joint entropy between the mixture distribution and the weight distribution is an upper bound on the entropy of the distribution

H(X) ≤ H(X, W) = H(X|W) + H(W) = \sum_i w_i H(X_i) + H(W)

If weights is nil, all components are assumed to have the same mixture weight.
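The upper bound can be sketched with the standard library alone. The following is an illustration, not the package's implementation: jointEntropy and closeTo are hypothetical names, the inputs are precomputed component entropies, and nil weights are assumed to mean uniform weights.

```go
package main

import (
	"fmt"
	"math"
)

// jointEntropy returns \sum_i w_i H(X_i) + H(W), where
// H(W) = -\sum_i w_i ln w_i is the entropy of the weights.
func jointEntropy(entropies, weights []float64) float64 {
	n := len(entropies)
	var h float64
	for i, e := range entropies {
		w := 1 / float64(n) // assumption: nil weights means uniform
		if weights != nil {
			w = weights[i]
		}
		if w == 0 {
			continue // a zero-weight component contributes nothing (0*ln 0 = 0)
		}
		h += w*e - w*math.Log(w)
	}
	return h
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	ents := []float64{1, 3}
	fmt.Println("uniform weights:   ", jointEntropy(ents, nil))
	fmt.Println("degenerate weights:", jointEntropy(ents, []float64{1, 0}))
}
```

With two equally weighted components the result exceeds the average component entropy by ln 2, which is H(W) for a fair binary choice.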

type NormalDistance

type NormalDistance struct {
	DistNormaler
}

NormalDistance wraps a DistNormaler for use with components.

func (NormalDistance) Distance

func (n NormalDistance) Distance(l, r Component) float64

Distance computes the distance between two Normal components. Distance panics if the underlying type of the components is not *distmv.Normal or if the dimensions are unequal.

type PairwiseDistance

type PairwiseDistance struct {
	Distancer Distancer
}

PairwiseDistance estimates the entropy using a pairwise distance metric. See the MixtureEntropy method for more information.

func (PairwiseDistance) MixtureEntropy

func (p PairwiseDistance) MixtureEntropy(components []Component, weights []float64) float64

MixtureEntropy estimates the entropy using a distance metric between each pair of distributions. It implements the pairwise-distance estimator, that is

H(X) ≈ \sum_i w_i H(X_i) - \sum_i w_i ln(\sum_j w_j exp(-D(X_i || X_j)))

As shown in

Estimating Mixture Entropy using Pairwise Distances by A. Kolchinsky and
B. Tracey

this estimator has several nice properties (see the paper for the necessary conditions on D for the following to hold).

1) The estimate returned is never smaller than the conditional-entropy lower bound (AvgEnt) and never larger than the joint-entropy upper bound (JointEntropy). These two bounds differ by H(W), the entropy of the weights, so this provides a bound on the error.

2) The pairwise estimator becomes an exact estimate of the entropy as the mixture components become clustered. The number of clusters is arbitrary, so this estimator is exact when all mixture components are the same, all are very far apart from one another, or clustered into k far apart clusters.

3) With the Bhattacharyya distance the estimate is a lower bound on the mixture entropy, and with the Kullback-Leibler divergence it is an upper bound. Any distance function D_B ≤ D ≤ D_KL will provide an estimate in between these two bounds.

If weights is nil, all components are assumed to have the same mixture weight.
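The estimator and its ordering properties can be illustrated with a small standalone sketch for one-dimensional Gaussians. Nothing here calls the package: gauss, kl, bhattacharyya, pairwiseEntropy, and closeTo are hypothetical names, the two distance functions are the standard closed forms for scalar Gaussians, and nil weights are assumed to mean uniform weights.

```go
package main

import (
	"fmt"
	"math"
)

// gauss is a 1-D Gaussian described by its mean and variance.
type gauss struct{ mu, v float64 }

// entropy returns 0.5*ln(2*pi*e*v) in nats.
func (g gauss) entropy() float64 { return 0.5 * math.Log(2*math.Pi*math.E*g.v) }

// kl returns the Kullback-Leibler divergence KL(a || b).
func kl(a, b gauss) float64 {
	d := a.mu - b.mu
	return 0.5*math.Log(b.v/a.v) + (a.v+d*d)/(2*b.v) - 0.5
}

// bhattacharyya returns the Bhattacharyya distance between a and b.
func bhattacharyya(a, b gauss) float64 {
	d := a.mu - b.mu
	s := a.v + b.v
	return d*d/(4*s) + 0.5*math.Log(s/(2*math.Sqrt(a.v*b.v)))
}

// pairwiseEntropy returns
// \sum_i w_i H(X_i) - \sum_i w_i ln(\sum_j w_j exp(-D(X_i || X_j))).
func pairwiseEntropy(comps []gauss, weights []float64, dist func(a, b gauss) float64) float64 {
	n := len(comps)
	w := weights
	if w == nil { // assumption: nil means uniform weights
		w = make([]float64, n)
		for i := range w {
			w[i] = 1 / float64(n)
		}
	}
	var h float64
	for i, ci := range comps {
		var s float64
		for j, cj := range comps {
			s += w[j] * math.Exp(-dist(ci, cj))
		}
		h += w[i] * (ci.entropy() - math.Log(s))
	}
	return h
}

// closeTo reports whether a and b agree to within tol.
func closeTo(a, b, tol float64) bool { return math.Abs(a-b) <= tol }

func main() {
	comps := []gauss{{mu: 0, v: 1}, {mu: 2, v: 1}}
	fmt.Println("Bhattacharyya estimate:", pairwiseEntropy(comps, nil, bhattacharyya))
	fmt.Println("KL estimate:           ", pairwiseEntropy(comps, nil, kl))
}
```

In this example the Bhattacharyya estimate is the smaller of the two, and both fall between the average component entropy and that average plus ln 2, matching properties 1) and 3) above.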

type UniformDistance

type UniformDistance struct {
	DistUniformer
}

UniformDistance wraps a DistUniformer for use with components.

func (UniformDistance) Distance

func (u UniformDistance) Distance(l, r Component) float64

Distance computes the distance between two Uniform components. Distance panics if the underlying type of the components is not *distmv.Uniform or if the dimensions are unequal.

Documentation was rendered with GOOS=linux and GOARCH=amd64.
