Package batchnorm

Documentation

Overview

Package batchnorm implements a batch normalization layer, and associated tools. It's a very common normalization technique that greatly facilitates training of deeper models.

See details and examples in New.

Based on paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Sergey Ioffe, Christian Szegedy), https://arxiv.org/abs/1502.03167.

Constants

const (
	// AveragesUpdatesTriggerParam is a boolean parameter set in case batch normalization was used.
	// See UpdateAverages.
	AveragesUpdatesTriggerParam = "batch_normalization_averages_updates_trigger"
)
const (
	// BatchNormalizationScopeName is used as sub-scope for all batch normalization variables.
	BatchNormalizationScopeName = "batch_normalization"
)

Variables

This section is empty.

Functions

func ResetWeights

func ResetWeights(ctx *context.Context)

ResetWeights resets the weights of the moving averages, forcing them to be reinitialized to 0. It searches for all variables under the scope named "batch_normalization".

It is a no-op if no batch-normalization was used.

Usually this method is not used directly, instead use UpdateAverages.

func UpdateAverages

func UpdateAverages(trainer *train.Trainer, oneEpochDS train.Dataset) bool

UpdateAverages resets the weights of the moving averages and recalculates them over the given oneEpochDS dataset, using the given trainer. It uses the context assigned to the trainer.

It is a no-op if no batch-normalization was used.

The oneEpochDS dataset (typically the same as a training-data evaluation dataset) should iterate over the training data for exactly one epoch, and it can use evaluation batch sizes. If oneEpochDS is nil, updating of the averages is disabled.

It returns whether batch normalization was used and averages were updated.

See discussions:
- https://www.mindee.com/blog/batch-normalization
- https://discuss.pytorch.org/t/batch-norm-instability/32159/14
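
As a hedged sketch of the typical call right before evaluation (the helper name and import paths are assumptions; only UpdateAverages comes from this package):

import (
	"fmt"

	"github.com/gomlx/gomlx/ml/layers/batchnorm"
	"github.com/gomlx/gomlx/ml/train"
)

// refreshBatchNormAverages is a hypothetical helper: after training, it
// recomputes the batch normalization moving averages over one epoch of
// training data, using the context assigned to the trainer.
func refreshBatchNormAverages(trainer *train.Trainer, oneEpochDS train.Dataset) {
	if batchnorm.UpdateAverages(trainer, oneEpochDS) {
		fmt.Println("batch normalization averages were updated")
	}
}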

Types

type Config

type Config struct {
	// contains filtered or unexported fields
}

Config for a batch normalization layer. Create it with New, set the desired parameters, and when all is set, call Done.

func New

func New(ctx *context.Context, x *Node, featureAxis int) *Config

New creates a builder that applies a batch normalization layer to the input. It includes a scaling and offset factor, and normalization over the batch entries. It maintains moving averages of the mean and variance of the inputs, which are later used during inference.

featureAxis is the axis over which **not to normalize**: this will normalize over the other dimensions, calculating the mean and variance by reducing all other dimensions. E.g.: if your input is `[batch_size, features]` you should use featureAxis=1 (same as -1) to normalize over the batch; if your input is an image of shape `[batch_size, height, width, channels]` you should use featureAxis=3 (same as -1) to normalize over the batch and all the pixels, so each channel is normalized differently, but normalization happens over all the pixels of the whole batch.

Notice the difference from LayerNormalization, which normalizes over the feature dimensions, as opposed to the batch dimension.

To ease setting its parameters, it returns a Config object for configuration. Once it is set up, call `Config.Done` and it will return the normalized x. Browse through Config to see the capabilities and the defaults.

Batch normalization behaves differently during training and inference: during training it normalizes over the batch (so it likely won't work well for very small batch sizes), and during inference it normalizes using the collected moving averages of the mean and variance.

Based on paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Sergey Ioffe, Christian Szegedy), https://arxiv.org/abs/1502.03167.

See also UpdateAverages to update the running averages after the model is trained -- or just before evaluations. Because during training the target mean and variance keep moving (as the model changes), the collected averages are often biased and suboptimal; UpdateAverages fixes that and often provides significant gains.

Future work:
1. Support padding by not normalizing parts that weren't touched.
2. Support selection of multiple feature axes.
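
A minimal usage sketch (the import paths and the surrounding model function are assumptions for illustration; only New and the Config methods come from this package):

import (
	"github.com/gomlx/gomlx/graph"
	"github.com/gomlx/gomlx/ml/context"
	"github.com/gomlx/gomlx/ml/layers/batchnorm"
)

// modelLayer is a hypothetical graph-building step: x is shaped
// [batch_size, features], so featureAxis=-1 keeps per-feature statistics
// and normalizes over the batch.
func modelLayer(ctx *context.Context, x *graph.Node) *graph.Node {
	return batchnorm.New(ctx, x, -1).
		Momentum(0.99). // The default, made explicit for illustration.
		Epsilon(1e-3).  // The default, made explicit for illustration.
		Done()
}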

func (*Config) Center

func (builder *Config) Center(value bool) *Config

Center defines whether the batch normalization tries to center the input by adding a learned offset. Defaults to true.

This is also called the β (beta) parameter, and referred to as a "learnable offset".

func (*Config) CurrentScope

func (builder *Config) CurrentScope() *Config

CurrentScope configures New not to create a new sub-scope named BatchNormalizationScopeName for its variables. This allows more control over scope names, but it breaks functionality that relies on batch normalization variables being under BatchNormalizationScopeName (e.g.: ResetWeights).

func (*Config) Done

func (builder *Config) Done() *Node

Done finishes configuring the batch normalization layer and generates the graph computation to normalize the input.

func (*Config) Epsilon

func (builder *Config) Epsilon(value float64) *Config

Epsilon is a small float added to variance to avoid dividing by zero. It defaults to 1e-3.

Notice that Keras' default is 1e-3 (the one we use), while PyTorch's default is 1e-5.

func (*Config) FrozenAverages

func (builder *Config) FrozenAverages(frozen bool) *Config

FrozenAverages defines whether the moving averages for mean and variance should be kept frozen for this layer.

This is useful in transfer learning, when the sub-model being incorporated was trained on a different distribution, and we don't want to affect its collected averages.
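
For example, a sketch for a sub-model incorporated via transfer learning (names and imports are illustrative, as in the earlier sketch under New):

// transferredLayer is a hypothetical layer reused from a pre-trained
// sub-model: the pre-trained running mean and variance are kept frozen,
// while the learned scale and offset can still adapt if the layer remains
// trainable.
func transferredLayer(ctx *context.Context, x *graph.Node) *graph.Node {
	return batchnorm.New(ctx, x, -1).
		FrozenAverages(true).
		Done()
}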

func (*Config) Momentum

func (builder *Config) Momentum(value float64) *Config

Momentum sets the momentum of the moving averages collected for the mean and variance of the values. New maintains moving averages of the mean and variance during training; these averaged mean and variance are used during inference for normalization. The default is 0.99.

Notice that Keras' default is 0.99 (the one we use), while PyTorch's default is 0.9.

This has no effect if one sets `Trainable(false)`.
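
The Keras-style momentum suggests an exponential moving average update of roughly the following form (the exact internal update is an assumption here, shown only to illustrate the role of the momentum):

// updateMovingAverage illustrates the assumed exponential-moving-average
// update applied to the collected mean and variance: with momentum close
// to 1, the averages change slowly across batches.
func updateMovingAverage(moving, batchValue, momentum float64) float64 {
	return momentum*moving + (1.0-momentum)*batchValue
}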

func (*Config) Scale

func (builder *Config) Scale(value bool) *Config

Scale defines whether the batch normalization tries to scale the input by multiplying it by a learned scale factor. Defaults to true.

This is also called the γ (gamma) parameter.

func (*Config) Trainable

func (builder *Config) Trainable(trainable bool) *Config

Trainable defines whether the batch normalization is trainable. If set to `false` it is frozen, and none of its parameters are changeable. The default is `true`.

Independently of the value set here, if the context is not set for training (see `context.Context.IsTraining()`), as during evaluation and inference, the Config will generate code for inference only.

func (*Config) UseBackendInference

func (builder *Config) UseBackendInference(value bool) *Config

UseBackendInference selects whether to use the backend's version of batch normalization inference. The alternative is a manually defined batch normalization inference, which is differentiable. It is only used if training is false.

The default is true.
