Documentation ¶
Overview ¶
Package rational implements "learnable rational functions".
They can be used as activations or simply as univariate learnable functions -- they are used for KANs (Kolmogorov-Arnold Networks) in the KAT (Kolmogorov-Arnold Transformer) paper [1].
Rational functions take the form f(x) = w*P(x)/Q(x), where P(x) and Q(x) are polynomials in x of degrees m and n respectively (degree m/n for short).
See details in New.
Several sources of inspiration for this implementation:
[1] "Kolmogorov-Arnold Transformer" by Xingyi Yang and Xinchao Wang, https://arxiv.org/abs/2409.10594 [2] https://github.com/ml-research/rational_activations/ [3] "Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks" by
Alejandro Molina, Patrick Schramowski, Kristian Kersting, https://arxiv.org/abs/1907.06732
[4] "Rational neural networks" by Nicolas Boullé, Yuji Nakatsukasa, Alex Townsend, https://arxiv.org/abs/2004.01902
Index ¶
- Constants
- type Config
- func (c *Config) Approximate(activation string) *Config
- func (c *Config) Done() *Node
- func (c *Config) Version(version string) *Config
- func (c *Config) WithDegrees(numerator, denominator int) *Config
- func (c *Config) WithInitialValues(numeratorInit, denominatorInit *tensors.Tensor) *Config
- func (c *Config) WithInputGroups(numInputGroups int) *Config
- func (c *Config) WithMultipleOutputs(numOutputsPerInput int) *Config
- func (c *Config) WithMultiplier(useMultiplier bool) *Config
- func (c *Config) WithMultiplierInitVariance(initializerVariance float64) *Config
- func (c *Config) WithNoise(randomDeviation float64) *Config
Constants ¶
const IdentityApproximation = "identity"
IdentityApproximation is a value to use in Config.Approximate to initialize the rational function with an identity function.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
// contains filtered or unexported fields
}
Config holds the configuration for a learnable rational function.
Call its various configuration methods, and then Config.Done when configuration is finished to apply the rational function.
func New ¶
New creates the configuration for a "learnable rational function".
They can be used as activations or simply as univariate learnable functions -- they are used for KANs (Kolmogorov-Arnold Networks) in the KAT (Kolmogorov-Arnold Transformer) paper [1].
It comes with sane defaults, but it can be further configured using the various configuration methods. Once configuration is finished, call Config.Done and it will return the application of the resulting rational function.
By default, it is configured to create one learnable function per input feature (the last dimension of x). So if x has shape [batch_size=64, feature_dim=32], it will create 32 functions initialized the same, but which will learn separately. See Config.WithInputGroups and Config.WithMultipleOutputs for other options.
New returns a Config object that can be further configured. Once finished, call Config.Done and it will return the result of "rational(x)" with the same shape as x.Shape(), except if configured with Config.WithMultipleOutputs, in which case there is an extra output axis with dimension equal to the number of outputs.
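For illustration, a minimal usage sketch. The wrapper name is hypothetical, and it assumes New takes the model's *context.Context and the input *Node, as is the convention for layers in this module -- check New's signature if in doubt:

import (
	"github.com/gomlx/gomlx/graph"
	"github.com/gomlx/gomlx/ml/context"
	"github.com/gomlx/gomlx/ml/layers/rational"
)

// RationalActivation applies one learnable rational function per feature of x
// (x shaped [batch_size, feature_dim]); the output has the same shape as x.
// Assumption: rational.New takes (ctx, x), as is common for layers in this module.
func RationalActivation(ctx *context.Context, x *graph.Node) *graph.Node {
	return rational.New(ctx, x).Done()
}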
func (*Config) Approximate ¶
Approximate takes as input an activation function name (see package activations for valid names), and it will initialize the parameters of the rational function such that it approximates the given function.
The rational package contains a table of initial values for various (version, degrees, activation) combinations; if the requested combination is not in the table, it will fail when Done is called. It is easy to generate values for new combinations: see the notebook https://github.com/gomlx/gomlx/blob/main/ml/layers/rational/rational.ipynb . In the notebook you enter the univariate function you want to approximate (the `target` function), and it fits the rational function and generates a cache entry that can be added to the `cache.go` file. Alternatively, you can set the values manually with Config.WithInitialValues and Config.WithMultiplierInitVariance.
The default is "identity" (IdentityApproximation), and its alias "".
See also Config.WithInitialValues.
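A hedged sketch (same imports and assumptions as the example under New); "gelu" is only illustrative and must have a cached entry for the configured version and degrees:

// GeluLikeRational initializes the rational function to approximate GeLU.
// "gelu" is illustrative -- Done fails if there is no cached entry for the
// chosen (version, degrees, activation) combination.
func GeluLikeRational(ctx *context.Context, x *graph.Node) *graph.Node {
	return rational.New(ctx, x).Approximate("gelu").Done()
}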
func (*Config) Done ¶
func (c *Config) Done() *Node
Done creates and applies the learnable rational function configured, returning the result of Rational(x).
The returned shape is the same as x.Shape(), except if configured with Config.WithMultipleOutputs, in which case there is an extra output axis with dimension equal to the number of outputs.
func (*Config) Version ¶
Version of Rational to use. Rational(x) = w*P(x)/Q(x), where
P(x) = (a_0 + a_1 * x + a_2 * x^2 + ... + a_n * x^n) and
- "A": Q(x) = (1 + |b_0 * x| + | b_1 * x^2| + ... + | b_m * x^{m+1}|)
- "B": Q(x) = (1 + |b_0 * x + b_1 * x^2 + ... + b_m * x^{m + 1}|)
- "C": Q(x) = (0.1 + |b_0 + b_1 * x + b_2 * x^2 + ... + b_m * x^m|)
- "D": like `B` with noised coefficients a_i and b_i. See WithRandomDeviation to set the amount of noise.
Noise only applied during training. No noise during inference.
Based on https://github.com/ml-research/rational_activations/blob/master/rational/keras/rationals.py, using the same version notation for compatibility.
Default is version "B".
func (*Config) WithDegrees ¶
WithDegrees configures the degrees of the numerator and denominator polynomials. It defaults to 5,4 (numerator degree 5, denominator degree 4).
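A sketch with custom degrees (same assumptions as the example under New); note that the chosen (version, degrees, approximation) combination must have a cached initialization, or explicit initial values must be given, otherwise Done fails:

// HigherDegreeRational requests a numerator of degree 6 and a denominator of
// degree 5 instead of the default 5/4.
func HigherDegreeRational(ctx *context.Context, x *graph.Node) *graph.Node {
	return rational.New(ctx, x).WithDegrees(6, 5).Done()
}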
func (*Config) WithInitialValues ¶
WithInitialValues takes the given tensors as the initial values for the numerator's and denominator's learnable coefficients.
The shape of the numerator tensor should be [1+degree(numerator)], and of the denominator tensor [degree(denominator)] -- there is one less parameter in the denominator.
If set, this supersedes Config.Approximate.
By default, this is unset (nil).
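A sketch with explicit coefficients for degrees 2/1, so 3 numerator values and 1 denominator value per the rule above. It assumes the imports from the example under New plus "github.com/gomlx/gomlx/types/tensors", and that tensors.FromValue builds a tensor from a Go slice; adjust to your tensor-construction helper:

// SquareInitRational starts the learnable function near f(x) = x^2.
func SquareInitRational(ctx *context.Context, x *graph.Node) *graph.Node {
	numerInit := tensors.FromValue([]float32{0, 0, 1}) // a_0, a_1, a_2 => P(x) = x^2
	denomInit := tensors.FromValue([]float32{0})       // b_0 = 0 => Q(x) = 1 for version "B"
	return rational.New(ctx, x).
		WithDegrees(2, 1).
		WithInitialValues(numerInit, denomInit).
		Done()
}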
func (*Config) WithInputGroups ¶
WithInputGroups allows multiple inputs to share the same learnable rational function. The numInputGroups must be a divisor of the features dimension, that is, the last dimension of the input x.
So if x is shaped [64, 32] and numInputGroups == 2, the inputs x[:, 0:16] use one learnable function, and x[:, 16:] use the second.
If numInputGroups is 0, the default, numInputGroups is set to x.Shape().Dim(-1), that is, one function per input feature.
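A sketch sharing functions across groups of features (same assumptions as the example under New):

// GroupedRational: with x shaped [64, 32] and 4 input groups, features 0..7
// share one learnable function, features 8..15 the next, and so on. The number
// of groups must divide the last dimension of x.
func GroupedRational(ctx *context.Context, x *graph.Node) *graph.Node {
	return rational.New(ctx, x).WithInputGroups(4).Done()
}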
func (*Config) WithMultipleOutputs ¶
WithMultipleOutputs allows the layer to generate multiple outputs per input feature (last dimension of the input x). This can be useful for instance for KANs (see KAT [1]) using rational functions, where each input feature is used (with its own learnable rational function) in calculating each of the outputs.
The default is 1, so the output is the same size and shape as the input.
If the value is different from 1, the output of Config.Done will have one extra axis appended to the end with dimension equal to numOutputsPerInput.
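A KAN-style sketch (same assumptions as the example under New): each input feature gets its own rational function per output, and the results are summed over the input features, roughly as in the GR-KAN layer of [1]. The summation step here is an assumption about how one would use the extra axis, not something this package does for you:

// KANRationalLayer: x is shaped [batch, numInputs]; the rational layer returns
// [batch, numInputs, numOutputs], which is then summed over the input-features
// axis to produce [batch, numOutputs].
func KANRationalLayer(ctx *context.Context, x *graph.Node, numOutputs int) *graph.Node {
	y := rational.New(ctx, x).WithMultipleOutputs(numOutputs).Done()
	return graph.ReduceSum(y, 1) // sum over the input-features axis
}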
func (*Config) WithMultiplier ¶
WithMultiplier, if set to true, adds a learnable multiplier weight w that multiplies the rational function P(x)/Q(x). The default is false, which means no multiplier term.
func (*Config) WithMultiplierInitVariance ¶
WithMultiplierInitVariance defines the variance of the normal distribution used to initialize the values of w, the multiplier.
Set initializerVariance to 0 to have a default initializer variance selected for you, based on the Config.Approximate function chosen, in order to keep the variance constant from layer to layer.
See KAT/GR-KAN paper [1] for details.
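A sketch enabling the multiplier with an explicit initialization variance (same assumptions as the example under New; the value 1.0 is only illustrative):

// MultiplierRational enables the learnable multiplier w and initializes it
// from a normal distribution with variance 1.0. Pass 0 to keep the
// automatically selected variance described above.
func MultiplierRational(ctx *context.Context, x *graph.Node) *graph.Node {
	return rational.New(ctx, x).
		WithMultiplier(true).
		WithMultiplierInitVariance(1.0).
		Done()
}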