Documentation ¶
Overview ¶
Package collector implements DataCollector, which attaches itself to a computation graph executor and collects the values of selected computation graph nodes. This data can then be used for data analysis.
Index ¶
- Constants
- func CollectGradL2(ctx *context.Context, g *graph.Graph, grads []*graph.Node, ...)
- func CollectLogStepSizeL2(ctx *context.Context, g *graph.Graph)
- type DataCollector
- func (c *DataCollector) AttachToExecutor(exec HasNodeLogger) error
- func (c *DataCollector) AttachToLoop(loop *train.Loop) error
- func (c *DataCollector) Collect(node *graph.Node, series string)
- func (c *DataCollector) CollectAll() *DataCollector
- func (c *DataCollector) EveryNSteps(n int) *DataCollector
- func (c *DataCollector) GetAllSeriesNames() []string
- func (c *DataCollector) GetSeriesValues(series string) []float64
- func (c *DataCollector) KeepNPoints(n int) *DataCollector
- type HasNodeLogger
Constants ¶
const (
	// OptimizerGradsL2CollectorKey is a key to `Context.Params`. If set to a DataCollector, most optimizers
	// will use it to collect the L2 norm of the gradients at each step.
	OptimizerGradsL2CollectorKey = "optimizer_grads_l2_collector"

	// OptimizerGradsL2Series is the name of the series used when collecting the L2 norm of gradients in the
	// optimizers, if a collector is defined with OptimizerGradsL2CollectorKey.
	OptimizerGradsL2Series = "optimizer_grads_l2norm"

	// OptimizerGlobalStepSeries is the name of the series where the global step is stored by optimizers, if a
	// collector is defined with OptimizerGradsL2CollectorKey.
	OptimizerGlobalStepSeries = "optimizer_global_step"

	// TrainerLogStepSizeL2CollectorKey is a key to `Context.Params`. If set to a DataCollector, the train.Trainer
	// will collect the log of the L2 norm of the updates to all trainable variables. This is a measure of how
	// much the weights of the model actually moved. It's usually a combination of the gradient and the learning
	// rate calculated by the optimizer, but it can also be affected by other projections.
	TrainerLogStepSizeL2CollectorKey = "trainer_log_step_size_l2_collector"

	// TrainerLogStepSizeL2Series is the step-size L2 norm series name with the values collected by the
	// train.Trainer, if a collector is defined with TrainerLogStepSizeL2CollectorKey.
	TrainerLogStepSizeL2Series = "trainer_log_step_size_l2norm"
)
These are Context.Params keys used to hold a DataCollector for specialized uses. This is merely a convention; not every implementation may adhere to it.
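For example, a minimal sketch of wiring a collector through this convention -- it assumes the context's usual SetParam accessor for Context.Params; adjust to however your setup writes params:

```
// Register a DataCollector so optimizers that honor the convention collect
// the L2 norm of the gradients (and the global step) at each training step.
c := collector.NewDataCollector().CollectAll()
ctx.SetParam(collector.OptimizerGradsL2CollectorKey, c)

// ... attach c to the training loop and train ...

// Read back the collected series afterwards:
gradsL2 := c.GetSeriesValues(collector.OptimizerGradsL2Series)  // []float64
steps := c.GetSeriesValues(collector.OptimizerGlobalStepSeries) // []float64
```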
Variables ¶
This section is empty.
Functions ¶
func CollectGradL2 ¶
func CollectGradL2(ctx *context.Context, g *graph.Graph, grads []*graph.Node, globalStep *graph.Node)
CollectGradL2 will calculate and collect the L2 norm of the gradients, if a DataCollector is configured in the current context. See `collector.OptimizerGradsL2CollectorKey`.
It also collects the globalStep (useful for an x-axis in a plot), if it is not nil.
This is useful to debug/monitor different optimizers.
TODO: Add a hook in the optimizers to call this.
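Until that hook exists, a custom optimizer can call it directly while building its update graph. A minimal sketch -- loss, params and globalStepNode are illustrative names for nodes the optimizer would already have:

```
// Inside the optimizer's graph-building code:
grads := graph.Gradient(loss, params...) // however the optimizer computes its gradients
collector.CollectGradL2(ctx, g, grads, globalStepNode) // globalStepNode may be nil
```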
func CollectLogStepSizeL2 ¶
func CollectLogStepSizeL2(ctx *context.Context, g *graph.Graph)
CollectLogStepSizeL2 will calculate and collect the log of the L2 norm of the "step sizes" (how much the trainable variables changed), if a DataCollector is configured in the current context. See `collector.TrainerLogStepSizeL2CollectorKey`.
It also collects the globalStep (useful for an x-axis in a plot), if it is not nil.
This is useful to debug/monitor different optimizers.
TODO: Add a hook to the trainer to call this.
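To enable it, register a collector under the key before training -- a sketch assuming the same SetParam convention as above and that the trainer honors the key as described under Constants:

```
c := collector.NewDataCollector().EveryNSteps(10)
ctx.SetParam(collector.TrainerLogStepSizeL2CollectorKey, c)
// ... train ...
logStepL2 := c.GetSeriesValues(collector.TrainerLogStepSizeL2Series) // []float64
```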
Types ¶
type DataCollector ¶
type DataCollector struct {
// contains filtered or unexported fields
}
DataCollector attaches itself to a computation graph executor -- graph.Exec or context.Exec -- and listens to the graph.Node values that are set to be logged. Those selected for collection are intercepted and stored here. They can then be used for plotting.
It only works for scalar nodes (though it could be extended to collect tensors of other shapes).
Example: collecting the L2 norm of the last layer and of the logits.
```
var collector = NewDataCollector().CollectAll() // pick one of CollectAll, EveryNSteps or KeepNPoints

func modelGraph(ctx *context.Context, inputs []*Node) (logits *Node) {
	// ... build model ...
	lastLayer := layers.Dense(...)
	collector.Collect(L2Norm(lastLayer), "lastLayer")
	logits = layers.Dense(layers.Relu(lastLayer), ...)
	collector.Collect(L2Norm(logits), "logits")
	return logits
}

func train(...) {
	...
	trainer := train.NewTrainer(manager, modelGraph, ...)
	...
	loop := train.NewLoop(trainer)
	collector.AttachToLoop(loop)
	...
	_, err = loop.RunSteps(dsTrain, *flagNumSteps)
	...
	lastLayerL2 := collector.GetSeriesValues("lastLayer") // []float64
	logitsL2 := collector.GetSeriesValues("logits")       // []float64
}
```
func NewDataCollector ¶
func NewDataCollector() *DataCollector
NewDataCollector returns a new DataCollector. Before it's ready to use, one needs to complete two steps:
- Select how often to collect data with one of EveryNSteps, KeepNPoints or CollectAll.
- Attach it to an executor (graph.Exec or context.Exec) with AttachToExecutor. Notice that a train.Trainer uses an executor to train; it can be accessed with train.Trainer.GetTrainExec (or GetEvalExec).
Once set up, mark the graph.Node values to collect in your graph-building function with Collect. Each time the graph is executed, the data points are collected. When ready (after training), read the collected data with GetSeriesValues.
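The two setup steps in code -- a minimal sketch where trainer is an existing train.Trainer:

```
// Step 1: choose a collection frequency; step 2: attach to the train executor.
c := collector.NewDataCollector().EveryNSteps(10)
if err := c.AttachToExecutor(trainer.GetTrainExec()); err != nil {
	// The collector was likely already attached to another executor.
	panic(err)
}
```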
func (*DataCollector) AttachToExecutor ¶
func (c *DataCollector) AttachToExecutor(exec HasNodeLogger) error
AttachToExecutor attaches the DataCollector to the executor (or anything that supports the HasNodeLogger interface) and starts listening to logged tensors, collecting those marked for the DataCollector.
This is an alternative to the AttachToLoop function, if not using the standard Trainer/Loop objects.
Any logged tensor that was not marked for the DataCollector (see the Collect method) is passed through to the previously registered LoggerFn. That means multiple DataCollectors can be active at the same time.
A DataCollector can be installed in only one executor; it returns an error if installed somewhere else.
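Because unclaimed tensors are passed through, several collectors can share one executor -- a sketch, where exec is any value implementing HasNodeLogger:

```
// Each collector only intercepts its own marked nodes; the rest chain through
// to the previously registered logger.
c1 := collector.NewDataCollector().CollectAll()
c2 := collector.NewDataCollector().EveryNSteps(10)
if err := c1.AttachToExecutor(exec); err != nil {
	panic(err)
}
if err := c2.AttachToExecutor(exec); err != nil {
	panic(err)
}
```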
func (*DataCollector) AttachToLoop ¶
func (c *DataCollector) AttachToLoop(loop *train.Loop) error
AttachToLoop attaches the DataCollector to the trainer of the given Loop variable.
It simply calls AttachToExecutor on the executor associated with the Loop / Trainer.
A DataCollector can be installed in only one Loop / Trainer (and underlying executor); it returns an error if installed somewhere else.
func (*DataCollector) Collect ¶
func (c *DataCollector) Collect(node *graph.Node, series string)
Collect indicates that the graph node value should be "collected" (saved) at every execution and stored in the named series.
Note that a node marked for collection can't also be logged -- collection and logging use the same mechanism.
func (*DataCollector) CollectAll ¶
func (c *DataCollector) CollectAll() *DataCollector
CollectAll sets the DataCollector to collect every data point. Memory consumption grows unbounded.
One and only one of CollectAll, KeepNPoints or EveryNSteps must be set.
It returns itself, so method calls can be chained.
func (*DataCollector) EveryNSteps ¶
func (c *DataCollector) EveryNSteps(n int) *DataCollector
EveryNSteps configures the DataCollector to collect data only every N steps.
One and only one of CollectAll, KeepNPoints or EveryNSteps must be set.
It returns itself, so method calls can be chained.
func (*DataCollector) GetAllSeriesNames ¶
func (c *DataCollector) GetAllSeriesNames() []string
GetAllSeriesNames returns the list of series names used so far.
func (*DataCollector) GetSeriesValues ¶
func (c *DataCollector) GetSeriesValues(series string) []float64
GetSeriesValues returns the values stored for a given series name.
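For instance, to dump everything that was collected (a minimal sketch; fmt is the standard library package):

```
// Print every collected series with its values.
for _, name := range c.GetAllSeriesNames() {
	fmt.Printf("%s: %v\n", name, c.GetSeriesValues(name))
}
```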
func (*DataCollector) KeepNPoints ¶
func (c *DataCollector) KeepNPoints(n int) *DataCollector
KeepNPoints configures the DataCollector to collect at most N data points. It starts by collecting every point, and whenever the buffer fills it halves the collection frequency. In the end it will have collected anywhere between N/2 and N-1 points.
One and only one of CollectAll, KeepNPoints or EveryNSteps must be set.
It returns itself, so method calls can be chained.
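The three frequency policies side by side -- exactly one must be chosen per collector:

```
c1 := collector.NewDataCollector().CollectAll()      // every step; memory grows unbounded
c2 := collector.NewDataCollector().EveryNSteps(100)  // one point every 100 steps
c3 := collector.NewDataCollector().KeepNPoints(1000) // adaptive: keeps between 500 and 999 points
```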