Package batchnorm

Documentation

Overview

Package batchnorm implements a batch normalization layer, and associated tools. It's a very common normalization technique that greatly facilitates training of deeper models.

See details and examples in New.

Based on paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Sergey Ioffe, Christian Szegedy), https://arxiv.org/abs/1502.03167.

Constants

const (
	// AveragesUpdatesTriggerParam is a boolean parameter set in case batch normalization was used.
	// See UpdateAverages.
	AveragesUpdatesTriggerParam = "batch_normalization_averages_updates_trigger"
)
const (
	// BatchNormalizationScopeName is used as sub-scope for all batch normalization variables.
	BatchNormalizationScopeName = "batch_normalization"
)

Variables

This section is empty.

Functions

func ResetWeights

func ResetWeights(ctx *context.Context)

ResetWeights resets the weights of the moving averages, forcing them to be reinitialized to 0. It searches for all variables under the scope named "batch_normalization".

It is a no-op if no batch-normalization was used.

Usually this method is not used directly, instead use UpdateAverages.

func UpdateAverages

func UpdateAverages(trainer *train.Trainer, oneEpochDS train.Dataset) bool

UpdateAverages resets the weights of the moving averages and recalculates them over the given oneEpochDS dataset, using the given trainer. It uses the context assigned to the trainer.

It is a no-op if no batch-normalization was used.

The oneEpochDS dataset (typically the same as a training-data evaluation dataset) should iterate over the training data for exactly one epoch, and it can use evaluation batch sizes. If oneEpochDS is nil, updating of the averages is disabled.

It returns whether batch normalization was used and averages were updated.

See discussions:
- https://www.mindee.com/blog/batch-normalization
- https://discuss.pytorch.org/t/batch-norm-instability/32159/14
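
As a hedged sketch of the typical call right before evaluation (the helper name and import paths are assumptions; only UpdateAverages comes from this package):

import (
	"fmt"

	"github.com/gomlx/gomlx/ml/layers/batchnorm"
	"github.com/gomlx/gomlx/ml/train"
)

// refreshBatchNormAverages is a hypothetical helper: after training, it
// recomputes the batch normalization moving averages over one epoch of
// training data, using the context assigned to the trainer.
func refreshBatchNormAverages(trainer *train.Trainer, oneEpochDS train.Dataset) {
	if batchnorm.UpdateAverages(trainer, oneEpochDS) {
		fmt.Println("batch normalization averages were updated")
	}
}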

Types

type Config

type Config struct {
	// contains filtered or unexported fields
}

Config for a batch normalization layer. Create it with New, set the desired parameters, and when all is set, call Done.

func New

func New(ctx *context.Context, x *Node, featureAxis int) *Config

New creates a builder that applies a batch normalization layer to the input. It includes a scaling and offset factor, and normalization over the batch entries. It maintains moving averages of the mean and variance of the inputs, which are later used during inference.

featureAxis is the axis over which **not to normalize**: this will normalize over the other dimensions, calculating the mean and variance by reducing all other dimensions. E.g.: if your input is `[batch_size, features]` you should use featureAxis=1 (same as -1) to normalize over the batch; if your input is an image of shape `[batch_size, height, width, channels]` you should use featureAxis=3 (same as -1) to normalize over the batch and all the pixels, so each channel is normalized differently, but normalization happens over all the pixels of the whole batch.

Notice the difference from LayerNormalization, which normalizes over the feature dimensions, as opposed to the batch dimension.

To ease setting its parameters, it returns a Config object for configuration. Once it is set up, call `Config.Done` and it will return the normalized x. Browse through Config to see the capabilities and the defaults.

Batch normalization behaves differently during training and inference: during training it normalizes over the batch (so it likely won't work well for very small batch sizes), and during inference it normalizes using the collected moving averages of the mean and variance.

Based on paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Sergey Ioffe, Christian Szegedy), https://arxiv.org/abs/1502.03167.

See also UpdateAverages to update the running averages after the model is trained -- or just before evaluations. Because during training the target mean and variance keep moving (as the model changes), the collected averages are often biased and suboptimal; UpdateAverages fixes that and often provides significant gains.

Future work:
1. Support padding by not normalizing parts that weren't touched.
2. Support selection of multiple feature axes.
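
A minimal usage sketch (the import paths and the surrounding model function are assumptions for illustration; only New and the Config methods come from this package):

import (
	"github.com/gomlx/gomlx/graph"
	"github.com/gomlx/gomlx/ml/context"
	"github.com/gomlx/gomlx/ml/layers/batchnorm"
)

// modelLayer is a hypothetical graph-building step: x is shaped
// [batch_size, features], so featureAxis=-1 keeps per-feature statistics
// and normalizes over the batch.
func modelLayer(ctx *context.Context, x *graph.Node) *graph.Node {
	return batchnorm.New(ctx, x, -1).
		Momentum(0.99). // The default, made explicit for illustration.
		Epsilon(1e-3).  // The default, made explicit for illustration.
		Done()
}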

func (*Config) Center

func (builder *Config) Center(value bool) *Config

Center defines whether the batch normalization tries to center the input by adding a learned offset. Defaults to true.

This is also called the β (beta) parameter, and referred to as a "learnable offset".

func (*Config) CurrentScope

func (builder *Config) CurrentScope() *Config

CurrentScope configures New not to create a new sub-scope named BatchNormalizationScopeName for its variables. This allows more control over scope names, but it breaks functionality that relies on batch normalization variables being under BatchNormalizationScopeName (e.g.: ResetWeights).

func (*Config) Done

func (builder *Config) Done() *Node

Done finishes configuring the batch normalization layer and generates the graph computation to normalize the input.

func (*Config) Epsilon

func (builder *Config) Epsilon(value float64) *Config

Epsilon is a small float added to variance to avoid dividing by zero. It defaults to 1e-3.

Notice that Keras' default is 1e-3 (the one we use), while PyTorch's default is 1e-5.

func (*Config) FrozenAverages

func (builder *Config) FrozenAverages(frozen bool) *Config

FrozenAverages defines whether the moving averages for mean and variance should be kept frozen for this layer.

This is useful in transfer learning, when the sub-model being incorporated was trained on a different distribution, and we don't want to affect its collected averages.
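
For example, a sketch for a sub-model incorporated via transfer learning (names and imports are illustrative, as in the earlier sketch under New):

// transferredLayer is a hypothetical layer reused from a pre-trained
// sub-model: the pre-trained running mean and variance are kept frozen,
// while the learned scale and offset can still adapt if the layer remains
// trainable.
func transferredLayer(ctx *context.Context, x *graph.Node) *graph.Node {
	return batchnorm.New(ctx, x, -1).
		FrozenAverages(true).
		Done()
}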

func (*Config) Momentum

func (builder *Config) Momentum(value float64) *Config

Momentum sets the momentum of the moving averages collected for the mean and variance of the values. New maintains moving averages of the mean and variance during training; these averaged mean and variance are used during inference for normalization. The default is 0.99.

Notice that Keras' default is 0.99 (the one we use), while PyTorch's default is 0.9.

This has no effect if one sets `Trainable(false)`.
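
The Keras-style momentum suggests an exponential moving average update of roughly the following form (the exact internal update is an assumption here, shown only to illustrate the role of the momentum):

// updateMovingAverage illustrates the assumed exponential-moving-average
// update applied to the collected mean and variance: with momentum close
// to 1, the averages change slowly across batches.
func updateMovingAverage(moving, batchValue, momentum float64) float64 {
	return momentum*moving + (1.0-momentum)*batchValue
}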

func (*Config) Scale

func (builder *Config) Scale(value bool) *Config

Scale defines whether the batch normalization tries to scale the input by multiplying it by a learned scale factor. Defaults to true.

This is also called the γ (gamma) parameter.

func (*Config) Trainable

func (builder *Config) Trainable(trainable bool) *Config

Trainable defines whether the batch normalization is trainable. If set to `false` it is frozen, and none of its parameters are changeable. The default is `true`.

Independently of the value set here, if the context is not set for training (see `context.Context.IsTraining()`), as during evaluation and inference, the Config will generate code for inference only.

func (*Config) UseBackendInference

func (builder *Config) UseBackendInference(value bool) *Config

UseBackendInference selects whether to use the backend's version of batch normalization inference. The alternative is a manually defined batch normalization inference, which is differentiable. It is only used if training is false.

The default is true.
