Documentation ¶
Overview ¶
Package batchnorm implements a batch normalization layer and associated tools. It is a very common normalization technique that greatly facilitates the training of deeper models.
See details and examples in New.
Based on paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Sergey Ioffe, Christian Szegedy), https://arxiv.org/abs/1502.03167.
Index ¶
- Constants
- func ResetWeights(ctx *context.Context)
- func UpdateAverages(trainer *train.Trainer, oneEpochDS train.Dataset) bool
- type Config
- func (builder *Config) Center(value bool) *Config
- func (builder *Config) CurrentScope() *Config
- func (builder *Config) Done() *Node
- func (builder *Config) Epsilon(value float64) *Config
- func (builder *Config) FrozenAverages(frozen bool) *Config
- func (builder *Config) Momentum(value float64) *Config
- func (builder *Config) Scale(value bool) *Config
- func (builder *Config) Trainable(trainable bool) *Config
- func (builder *Config) UseBackendInference(value bool) *Config
Constants ¶
const (
// AveragesUpdatesTriggerParam is a boolean parameter set in case batch normalization was used.
// See UpdateAverages.
AveragesUpdatesTriggerParam = "batch_normalization_averages_updates_trigger"
)
const (
// BatchNormalizationScopeName is used as sub-scope for all batch normalization variables.
BatchNormalizationScopeName = "batch_normalization"
)
Variables ¶
This section is empty.
Functions ¶
func ResetWeights ¶
func ResetWeights(ctx *context.Context)
ResetWeights resets the weights of the moving averages, forcing them to be reinitialized to 0. It searches for all variables under the scope named "batch_normalization".
It is a no-op if no batch-normalization was used.
Usually this method is not used directly; instead, use UpdateAverages.
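A minimal sketch of a direct call, assuming `ctx` is the `*context.Context` that holds the model variables:

    // Force the moving averages of every batch normalization layer stored under
    // ctx to be reinitialized to 0 the next time the model runs:
    batchnorm.ResetWeights(ctx)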
func UpdateAverages ¶
func UpdateAverages(trainer *train.Trainer, oneEpochDS train.Dataset) bool
UpdateAverages resets the weights of the moving averages and recalculates them over the given oneEpochDS dataset, using the trainer. It uses the context assigned to the trainer.
It is a no-op if no batch-normalization was used.
The oneEpochDS dataset should cover one epoch of the training data (typically it is the same dataset used to evaluate on the training data), and it can use evaluation batch sizes. If oneEpochDS is nil, updating of the averages is disabled.
It returns whether batch normalization was used and averages were updated.
See discussions:
- https://www.mindee.com/blog/batch-normalization
- https://discuss.pytorch.org/t/batch-norm-instability/32159/14
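A hedged sketch of the typical call site, assuming `trainer` is the `*train.Trainer` used to train the model and `oneEpochTrainDS` is a hypothetical `train.Dataset` yielding exactly one epoch of the training data:

    // After training (or just before evaluations), refresh the moving averages of
    // mean and variance by running one epoch of training data through the model:
    if batchnorm.UpdateAverages(trainer, oneEpochTrainDS) {
        // Batch normalization was used and its averages were updated.
    }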
Types ¶
type Config ¶
type Config struct {
// contains filtered or unexported fields
}
Config for a batch normalization layer. Create it with New, set the desired parameters, and when all is set, call Done.
func New ¶
New creates a builder that performs a batch normalization layer on the input. It includes a scaling and offset factor, and normalization over the batch entries. It maintains a moving average of the mean and variance of the inputs, which is later used during inference.
featureAxis is the axis over which **not to normalize**: this will normalize over the other dimensions, calculating the mean and variance by reducing all other dimensions. E.g.: if your input is `[batch_size, features]` you should use featureAxis=1 (same as -1) to normalize over the batch; if your input is an image of shape `[batch_size, height, width, channels]` you should use featureAxis=3 (same as -1) to normalize over the batch and all the pixels, so each channel is normalized differently, but normalization happens over all the pixels of the whole batch.
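A minimal sketch of the two cases above, assuming `ctx` is a `*context.Context`, `x` and `image` are `*Node` inputs of the given shapes, and that New takes the context, the input and the feature axis as described here:

    // Dense input of shape [batch_size, features]: featureAxis=1, same as -1.
    x = batchnorm.New(ctx, x, -1).Done()

    // Image of shape [batch_size, height, width, channels]: featureAxis=3, same as -1,
    // so each channel keeps its own mean and variance, computed over the batch and all pixels.
    image = batchnorm.New(ctx, image, -1).Done()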
Notice the difference from LayerNormalization, which normalizes over the feature dimensions, as opposed to the batch dimension.
To ease setting its parameters, it returns a Config object for configuration. Once it is set up, call `Config.Done` and it will return the normalized x. Browse through Config to see the capabilities and the defaults.
Batch normalization behaves differently during training and inference: during training it normalizes over the batch (so it likely won't work well for very small batch sizes), and in inference, it normalizes using the collected moving average of the mean and variance.
Based on paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" (Sergey Ioffe, Christian Szegedy), https://arxiv.org/abs/1502.03167.
See also UpdateAverages to update the running averages after the training of the model -- or just before evaluations. Because during training the target averages of the mean and variance are moving (as the model changes), they are often biased and suboptimal; UpdateAverages fixes that and often provides significant gains.
FutureWork:
1. Support padding by not normalizing parts that weren't touched.
2. Support selection of multiple feature axes.
func (*Config) Center ¶
Center defines whether the batch normalization tries to center the input by adding a learned offset. It defaults to true.
This is also called the β (beta) parameter, and referred to as a "learnable offset".
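For instance, a sketch (using the same assumed `ctx` and `x` as in New's example) that drops the learned offset, e.g. when the following layer already adds a bias of its own:

    // Normalize and scale, but skip the learned β offset.
    x = batchnorm.New(ctx, x, -1).Center(false).Done()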
func (*Config) CurrentScope ¶
CurrentScope configures New not to create a new sub-scope named BatchNormalizationScopeName for its variables. This allows more control over scope names, but it breaks anything that relies on batch normalization variables being under BatchNormalizationScopeName (e.g.: ResetWeights).
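A sketch (same assumed `ctx` and `x`; the scope name "my_norm" and the use of `ctx.In` to enter it are illustrative) keeping the variables directly under the caller's scope:

    // Variables live directly under ctx.In("my_norm"), with no
    // "batch_normalization" sub-scope -- note this breaks ResetWeights.
    x = batchnorm.New(ctx.In("my_norm"), x, -1).CurrentScope().Done()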
func (*Config) Done ¶
func (builder *Config) Done() *Node
Done finishes the configuration started with New and generates the graph computation that normalizes the input.
func (*Config) Epsilon ¶
Epsilon is a small float added to the variance to avoid dividing by zero. It defaults to 1e-3.
Notice that Keras' default is 1e-3 (the one we use), while PyTorch's default is 1e-5.
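For example, a sketch (same assumed `ctx` and `x`) matching PyTorch's default, e.g. when porting weights from a PyTorch model:

    // Use PyTorch's default epsilon instead of the default 1e-3.
    x = batchnorm.New(ctx, x, -1).Epsilon(1e-5).Done()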
func (*Config) FrozenAverages ¶
FrozenAverages defines whether the moving averages for mean and variance should be kept frozen for this layer.
This is useful in transfer learning, when the sub-model being incorporated was trained on a different distribution and we don't want to change the averages it already collected.
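A sketch of that transfer-learning case (same assumed `ctx` and `x`):

    // Keep the pre-trained moving averages of mean and variance untouched,
    // while the rest of the layer still trains normally.
    x = batchnorm.New(ctx, x, -1).FrozenAverages(true).Done()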
func (*Config) Momentum ¶
Momentum sets the momentum of the moving averages collected for the mean and variance of the values. New maintains moving averages for the mean and variance during training. These averaged mean and variance are used during inference for normalization. The default is 0.99.
Notice that Keras' default is 0.99 (the one we use), while PyTorch's default is 0.9.
This has no effect if one sets `Trainable(false)`.
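For example, a sketch (same assumed `ctx` and `x`) using PyTorch's default instead:

    // Use a momentum of 0.9 (PyTorch's default) instead of 0.99.
    x = batchnorm.New(ctx, x, -1).Momentum(0.9).Done()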
func (*Config) Scale ¶
Scale defines whether the batch normalization tries to scale the input by multiplying it by a learned scale. It defaults to true.
This is also called the γ (gamma) parameter.
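For instance, a sketch (same assumed `ctx` and `x`) dropping the learned scale, e.g. when the following layer is linear and can absorb it:

    // Normalize and center, but skip the learned γ scale.
    x = batchnorm.New(ctx, x, -1).Scale(false).Done()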
func (*Config) Trainable ¶
Trainable defines whether the batch normalization is trainable. If set to `false` it is frozen, and none of its parameters are changeable. The default is `true`.
Independently of the value set here, if the context is not set for training (see `context.Context.IsTraining()`), as during evaluation and inference, the Config will generate code for inference only.
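A sketch (same assumed `ctx` and `x`) freezing the layer, e.g. while fine-tuning only other parts of the model:

    // Freeze this batch normalization layer entirely: no parameter updates,
    // and the moving averages stay as they are.
    x = batchnorm.New(ctx, x, -1).Trainable(false).Done()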
func (*Config) UseBackendInference ¶
UseBackendInference selects whether to use the backend's version of batch normalization for inference. The alternative is a manually defined batch normalization inference, which is differentiable. This is only used when not training.
The default is true.
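A sketch (same assumed `ctx` and `x`) opting into the manual, differentiable inference path:

    // Use the manually defined (differentiable) inference computation instead of
    // the backend's version.
    x = batchnorm.New(ctx, x, -1).UseBackendInference(false).Done()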