nn

package
v0.1.0
Published: Dec 9, 2020 License: BSD-2-Clause Imports: 14 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Affine

func Affine(g *ag.Graph, xs ...ag.Node) ag.Node

Affine performs an affine transformation over an arbitrary (odd) number of nodes held in the input. The first node is the "bias", which is added to the output as-is. The remaining nodes form "Wx" pairs: each W is multiplied by its x and the products are summed. Except for the first pair, pairs whose "x" is nil are skipped. y = b + W1x1 + W2x2 + ... + Wnxn
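
For instance, a two-pair combination y = b + W1x1 + W2x2 can be built as in the following sketch. The graph and variable constructors come from the ag and mat packages; their exact names here, and the spaGO import paths, are assumptions and not part of this page.

	import (
		"github.com/nlpodyssey/spago/pkg/mat"
		"github.com/nlpodyssey/spago/pkg/ml/ag"
		"github.com/nlpodyssey/spago/pkg/ml/nn"
	)

	func affineSketch() ag.Node {
		g := ag.NewGraph()
		b := g.NewVariable(mat.NewVecDense([]float64{0.1, 0.2}), true)
		w1 := g.NewVariable(mat.NewDense(2, 3, []float64{0.1, 0.2, 0.3, 0.4, 0.5, 0.6}), true)
		x1 := g.NewVariable(mat.NewVecDense([]float64{1, 2, 3}), false)
		w2 := g.NewVariable(mat.NewDense(2, 3, []float64{0.6, 0.5, 0.4, 0.3, 0.2, 0.1}), true)
		x2 := g.NewVariable(mat.NewVecDense([]float64{4, 5, 6}), false)
		// y = b + W1·x1 + W2·x2
		return nn.Affine(g, b, w1, x1, w2, x2)
	}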

func BiAffine

func BiAffine(g *ag.Graph, w, u, v, b, x1, x2 ag.Node) ag.Node

BiAffine performs a biaffine transformation.

func BiLinear

func BiLinear(g *ag.Graph, w, x1, x2 ag.Node) ag.Node

BiLinear performs a bilinear transformation of the form (x1ᵀ W x2).
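
A corresponding sketch for a single bilinear score, using the same assumed constructors as the Affine sketch above:

	g := ag.NewGraph()
	w := g.NewVariable(mat.NewDense(3, 3, []float64{
		0.1, 0.0, 0.2,
		0.0, 0.3, 0.0,
		0.4, 0.0, 0.5,
	}), true)
	x1 := g.NewVariable(mat.NewVecDense([]float64{1, 2, 3}), false)
	x2 := g.NewVariable(mat.NewVecDense([]float64{4, 5, 6}), false)
	score := nn.BiLinear(g, w, x1, x2) // a 1x1 node holding x1ᵀ W x2
	_ = score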

func ClearSupport

func ClearSupport(m Model)

ClearSupport clears the support structure of all the model's parameters (including sub-params). TODO: use ParamsIterator?

func Conv2D

func Conv2D(g *ag.Graph, w, x ag.Node, xStride, yStride int) ag.Node

Conv2D performs a 2D convolution.

func DumpParamsVector

func DumpParamsVector(model Model) *mat.Dense

DumpParamsVector returns a single dense vector containing the values of all the model's parameters. TODO: use ParamsIterator?

func ForEachParam

func ForEachParam(m Model, callback func(param *Param))

ForEachParam iterates over all the parameters of a model, also exploring the sub-parameters recursively. TODO: don't loop over the fields every time, use a lazily initialized "params list" instead.
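
For example, counting the trainable coefficients of a model (a sketch; model is any nn.Model, and it is assumed that Size on mat.Matrix returns rows × columns):

	total := 0
	nn.ForEachParam(model, func(param *nn.Param) {
		if param.RequiresGrad() {
			total += param.Value().Size()
		}
	})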

func ForEachParamStrict

func ForEachParamStrict(m Model, callback func(param *Param))

ForEachParamStrict iterates over all the parameters of a model without exploring the sub-models.

func LinearAttention

func LinearAttention(g *ag.Graph, qs, ks, vs []ag.Node, mappingFunction MappingFunc, eps float64) []ag.Node

LinearAttention performs self-attention as a linear dot-product of kernel feature maps. It operates with O(N) complexity, where N is the sequence length. Reference: "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" by Katharopoulos et al. (2020)
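
A call sketch with a simple positive feature map. The ReLU operator and the epsilon value are assumptions; any kernel feature map with the MappingFunc signature can be used.

	g := ag.NewGraph()
	newVec := func(v []float64) ag.Node { return g.NewVariable(mat.NewVecDense(v), false) }
	qs := []ag.Node{newVec([]float64{1, 0}), newVec([]float64{0, 1})}
	ks := []ag.Node{newVec([]float64{1, 1}), newVec([]float64{0, 2})}
	vs := []ag.Node{newVec([]float64{3, 4}), newVec([]float64{5, 6})}
	phi := func(g *ag.Graph, x ag.Node) ag.Node { return g.ReLU(x) } // kernel feature map
	context := nn.LinearAttention(g, qs, ks, vs, phi, 1e-12)
	_ = context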

func LoadParamsVector

func LoadParamsVector(model Model, vector *mat.Dense)

LoadParamsVector loads the values of the given vector into the model's parameters. TODO: use ParamsIterator?

func PayloadMarshalBinaryTo

func PayloadMarshalBinaryTo(supp *Payload, w io.Writer) (int, error)

PayloadMarshalBinaryTo returns the number of bytes written into w and an error, if any.

func ScaledDotProductAttention

func ScaledDotProductAttention(g *ag.Graph, qs, ks, vs []ag.Node, scaleFactor float64) (context []ag.Node, prob []mat.Matrix)

ScaledDotProductAttention is a self-attention mechanism relating different positions of a single sequence in order to compute a representation of that same sequence. This method requires that the query, key and value vectors have already been obtained from the input sequence. The scale factor is the square root of the dimension of the key vectors.
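
A call sketch, with qs, ks and vs built as in the LinearAttention sketch above. It is assumed here that scaleFactor is applied as a multiplier on the dot products, so the conventional value 1/sqrt(dk) is passed; math is the standard library package.

	dk := 2.0 // dimension of the key vectors
	context, probs := nn.ScaledDotProductAttention(g, qs, ks, vs, 1.0/math.Sqrt(dk))
	_, _ = context, probs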

func ScaledDotProductAttentionConcurrent

func ScaledDotProductAttentionConcurrent(g *ag.Graph, qs, ks, vs []ag.Node, scaleFactor float64) (context []ag.Node, prob []mat.Matrix)

ScaledDotProductAttentionConcurrent does the same thing as ScaledDotProductAttention but processes input concurrently.

func Separate

func Separate(g *ag.Graph, x ag.Node) [][]ag.Node

Separate returns a matrix of Node(s) represented as a slice of slices containing the elements extracted from the input. The dimensions of the resulting matrix are the same as those of the input.

func SeparateVec

func SeparateVec(g *ag.Graph, x ag.Node) []ag.Node

SeparateVec returns a slice of Node(s) containing the elements extracted from the input. The size of the vector equals the number of input elements. You can think of this method as the inverse of the ag.Concat operator.
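
For instance (a sketch using the same assumed constructors as above):

	g := ag.NewGraph()
	v := g.NewVariable(mat.NewVecDense([]float64{1, 2, 3}), false)
	elems := nn.SeparateVec(g, v) // len(elems) == 3, one scalar node per element
	_ = elems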

func SplitVec

func SplitVec(g *ag.Graph, x ag.Node, chunks int) []ag.Node

SplitVec splits the input vector node into the given number of chunks, returning one node per chunk. TODO: optimize, this is extremely inefficient!

func ZeroGrad

func ZeroGrad(m Model)

ZeroGrad sets the gradients of all the model's parameters (including sub-params) to zero. TODO: use ParamsIterator?

Types

type BaseProcessor

type BaseProcessor struct {
	Model             Model
	Mode              ProcessingMode
	Graph             *ag.Graph
	FullSeqProcessing bool
}

BaseProcessor satisfies some methods of the Processor interface. It is meant to be embedded in other processors to reduce the amount of boilerplate code.

func (*BaseProcessor) GetGraph

func (p *BaseProcessor) GetGraph() *ag.Graph

GetGraph returns the computational graph on which the processor operates.

func (*BaseProcessor) GetMode

func (p *BaseProcessor) GetMode() ProcessingMode

GetMode returns whether the processor is being used for training or inference.

func (*BaseProcessor) GetModel

func (p *BaseProcessor) GetModel() Model

GetModel returns the model the processor belongs to.

func (*BaseProcessor) RequiresFullSeq

func (p *BaseProcessor) RequiresFullSeq() bool

RequiresFullSeq returns whether the processor needs the complete sequence to start processing (as in the case of BiRNN and other bidirectional models), or not.

type Context

type Context struct {
	// Graph is the computational graph on which the processor(s) operate.
	Graph *ag.Graph
	// Mode regulates the different usage of some operations whether you're doing training or inference.
	Mode ProcessingMode
}

Context is used to instantiate a processor to operate on a graph, according to the desired ProcessingMode. If a processor contains other sub-processors, you must instantiate them using the same context to make sure you are operating on the same graph and in the same mode.
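
For example, a training-time processor is typically obtained as in this sketch (model is any nn.Model; xs are input nodes built on the same graph):

	g := ag.NewGraph()
	proc := model.NewProc(nn.Context{Graph: g, Mode: nn.Training})
	ys := proc.Forward(xs...)
	_ = ys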

type DefaultParamsIterator

type DefaultParamsIterator struct {
	// contains filtered or unexported fields
}

func NewDefaultParamsIterator

func NewDefaultParamsIterator(models ...Model) *DefaultParamsIterator

func (*DefaultParamsIterator) ParamsList

func (i *DefaultParamsIterator) ParamsList() []*Param

type MappingFunc

type MappingFunc func(g *ag.Graph, x ag.Node) ag.Node

type Model

type Model interface {
	// NewProc returns a new processor to execute the forward step.
	NewProc(ctx Context) Processor
}

Model contains the serializable parameters.

func MakeNewModels

func MakeNewModels(n int, callback func(i int) Model) []Model

MakeNewModels returns n new models. The callback is responsible for returning a new model for each index i.
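
For example (NewMyModel is a hypothetical constructor, not part of this package):

	models := nn.MakeNewModels(4, func(i int) nn.Model {
		return NewMyModel() // hypothetical constructor: one independent model per index
	})
	_ = models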

type Param

type Param struct {
	// contains filtered or unexported fields
}

func NewParam

func NewParam(value mat.Matrix, opts ...ParamOption) *Param

NewParam returns a new param.
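
For example (the mat constructors used here are assumptions, and it is assumed that params require gradients by default):

	w := nn.NewParam(mat.NewEmptyDense(2, 3))                         // trainable by default (assumption)
	b := nn.NewParam(mat.NewEmptyVecDense(2), nn.RequiresGrad(false)) // frozen
	w.SetName("w")
	b.SetName("b")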

func (*Param) ApplyDelta

func (r *Param) ApplyDelta(delta mat.Matrix)

ApplyDelta updates the value of the underlying storage by applying the delta.

func (*Param) ClearPayload

func (r *Param) ClearPayload()

ClearPayload clears the support structure.

func (*Param) Grad

func (r *Param) Grad() mat.Matrix

Grad returns the gradients accumulated during the backward pass.

func (*Param) HasGrad

func (r *Param) HasGrad() bool

HasGrad returns true if there are accumulated gradients.

func (*Param) MarshalBinary

func (r *Param) MarshalBinary() ([]byte, error)

MarshalBinary satisfies the encoding/gob custom marshaling interface.

func (*Param) Name

func (r *Param) Name() string

Name returns the param's name (it can be an empty string).

func (*Param) Payload

func (r *Param) Payload() *Payload

Payload returns the optimizer support structure (can be nil).

func (*Param) PropagateGrad

func (r *Param) PropagateGrad(grad mat.Matrix)

PropagateGrad accumulates the gradients.

func (*Param) ReplaceValue

func (r *Param) ReplaceValue(value mat.Matrix)

ReplaceValue replaces the value of the parameter and clears the support structure.

func (*Param) RequiresGrad

func (r *Param) RequiresGrad() bool

RequiresGrad returns true if the param requires gradients.

func (*Param) ScalarValue

func (r *Param) ScalarValue() float64

ScalarValue returns the scalar value of the node. It panics if the value is not a scalar. Note that it is not possible to start the backward step from a scalar value.

func (*Param) SetName

func (r *Param) SetName(name string)

SetName sets the param's name (it can be an empty string).

func (*Param) SetPayload

func (r *Param) SetPayload(payload *Payload)

func (*Param) SetType

func (r *Param) SetType(name string)

SetType sets the param's type (weights, biases, undefined).

func (*Param) Type

func (r *Param) Type() ParamsType

Type returns the param's type (weights, biases, undefined).

func (*Param) UnmarshalBinary

func (r *Param) UnmarshalBinary(data []byte) error

UnmarshalBinary satisfies the encoding/gob custom marshaling interface.

func (*Param) Value

func (r *Param) Value() mat.Matrix

Value returns the current value of the parameter.

func (*Param) ZeroGrad

func (r *Param) ZeroGrad()

ZeroGrad clears the gradients.

type ParamOption

type ParamOption func(*Param)

func RequiresGrad

func RequiresGrad(value bool) ParamOption

func SetStorage

func SetStorage(storage kvdb.KeyValueDB) ParamOption

type ParamSerializer

type ParamSerializer struct {
	*Param
}

func (*ParamSerializer) Deserialize

func (s *ParamSerializer) Deserialize(r io.Reader) (n int, err error)

func (*ParamSerializer) Serialize

func (s *ParamSerializer) Serialize(w io.Writer) (int, error)

type ParamsIterator

type ParamsIterator interface {
	ParamsList() []*Param
}

type ParamsSerializer

type ParamsSerializer struct {
	Model
}

func NewParamsSerializer

func NewParamsSerializer(m Model) *ParamsSerializer

func (*ParamsSerializer) Deserialize

func (m *ParamsSerializer) Deserialize(r io.Reader) (n int, err error)

Deserialize assigns to the params the values read from the reader. TODO: use ParamsIterator?

func (*ParamsSerializer) Serialize

func (m *ParamsSerializer) Serialize(w io.Writer) (n int, err error)

Serialize dumps the values of the params to the writer. TODO: use ParamsIterator?
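
A sketch of dumping a model's parameters to a file and loading them back (model is any nn.Model; os is the standard library package; error handling omitted):

	f, _ := os.Create("model.bin")
	_, _ = nn.NewParamsSerializer(model).Serialize(f)
	f.Close()

	f, _ = os.Open("model.bin")
	_, _ = nn.NewParamsSerializer(model).Deserialize(f)
	f.Close()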

type ParamsType

type ParamsType int

const (
	Weights ParamsType = iota
	Biases
	Undefined
)

func ToType

func ToType(s string) ParamsType

ToType converts a string to a ParamsType. It returns Undefined if the string doesn't match any ParamsType.

func (ParamsType) String

func (t ParamsType) String() string

type Payload

type Payload struct {
	Label int
	Data  []mat.Matrix
}

Payload contains the support data used, for example, by the optimization methods.

func NewEmptySupport

func NewEmptySupport() *Payload

NewEmptySupport returns an empty support structure, not connected to any optimization method.

func NewPayloadUnmarshalBinaryFrom

func NewPayloadUnmarshalBinaryFrom(r io.Reader) (*Payload, int, error)

type ProcessingMode

type ProcessingMode int

ProcessingMode regulates the different usage of some operations (e.g. Dropout, BatchNorm, etc.) inside a Processor, depending on whether you're doing training or inference. Failing to set the right mode will yield inconsistent inference results.

const (
	// Training is to be used during the training phase of a model. For example, dropouts are enabled.
	Training ProcessingMode = iota
	// Inference keeps weights fixed while using the model and disables some operations (e.g. skip dropout).
	Inference
)

type Processor

type Processor interface {
	// GetModel returns the model the processor belongs to.
	GetModel() Model
	// GetMode returns whether the processor is being used for training or inference.
	GetMode() ProcessingMode
	// GetGraph returns the computational graph on which the processor operates.
	GetGraph() *ag.Graph
	// RequiresFullSeq returns whether the processor needs the complete sequence to start processing
	// (as in the case of BiRNN and other bidirectional models), or not.
	RequiresFullSeq() bool
	// Forward performs the forward step for each input and returns the result.
	// Recurrent networks treat the input nodes as a sequence.
	// In contrast, feed-forward networks are stateless, so every computation is independent.
	Forward(xs ...ag.Node) []ag.Node
}

Processor performs the operations on the computational graphs using the model's parameters.
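
Putting the pieces together, a minimal model/processor pair could look like the following sketch. It is not code from this package: the toy linear layer, the NewWrap call used to bring params into the graph, and the Add/Mul operators are assumptions about the ag API.

	// Linear is a toy model holding a weight matrix and a bias vector.
	type Linear struct {
		W *nn.Param
		B *nn.Param
	}

	// NewProc binds the model to a graph and a processing mode.
	func (m *Linear) NewProc(ctx nn.Context) nn.Processor {
		return &LinearProcessor{
			BaseProcessor: nn.BaseProcessor{
				Model:             m,
				Mode:              ctx.Mode,
				Graph:             ctx.Graph,
				FullSeqProcessing: false,
			},
			w: ctx.Graph.NewWrap(m.W), // assumed way to wrap a param into the graph
			b: ctx.Graph.NewWrap(m.B),
		}
	}

	// LinearProcessor embeds BaseProcessor to inherit GetModel, GetMode,
	// GetGraph and RequiresFullSeq, and only adds Forward.
	type LinearProcessor struct {
		nn.BaseProcessor
		w, b ag.Node
	}

	// Forward computes y = Wx + b independently for each input node.
	func (p *LinearProcessor) Forward(xs ...ag.Node) []ag.Node {
		ys := make([]ag.Node, len(xs))
		for i, x := range xs {
			ys[i] = p.Graph.Add(p.Graph.Mul(p.w, x), p.b)
		}
		return ys
	}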

Directories

Path Synopsis
Bidirectional Recurrent Neural Network (BiRNN) with a Conditional Random Fields (CRF) on top.
Implementation of the Broad Learning System (BLS) described in "Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture" by C. L. Philip Chen and Zhulin Liu, 2017.
gnn
slstm
slstm Reference: "Sentence-State LSTM for Text Representation" by Zhang et al, 2018.
startransformer
StarTransformer is a variant of the model introduced by Qipeng Guo, Xipeng Qiu et al.
LSH-Attention as in `Reformer: The Efficient Transformer` by N. Kitaev, Ł. Kaiser, A. Levskaya.
normalization
adanorm
Reference: "Understanding and Improving Layer Normalization" by Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin (2019).
fixnorm
Reference: "Improving Lexical Choice in Neural Machine Translation" by Toan Q. Nguyen and David Chiang (2018) (https://arxiv.org/pdf/1710.01329.pdf)
layernorm
Reference: "Layer normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton (2016).
layernormsimple
Reference: "Understanding and Improving Layer Normalization" by Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin (2019).
rmsnorm
Reference: "Root Mean Square Layer Normalization" by Biao Zhang and Rico Sennrich (2019).
Implementation of the recursive auto-encoder strategy described in "Towards Lossless Encoding of Sentences" by Prato et al., 2019.
This package contains built-in Residual Connections (RC).
rec
cfn
gru
horn
Higher Order Recurrent Neural Networks (HORN)
lstmsc
LSTM enriched with a PolicyGradient to enable Dynamic Skip Connections.
ltm
mist
Implementation of the MIST (MIxed hiSTory) recurrent network as described in "Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies" by Di Pietro et al., 2018 (https://arxiv.org/pdf/1702.07805.pdf).
nru
Implementation of the NRU (Non-Saturating Recurrent Units) recurrent network as described in "Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies" by Chandar et al., 2019.
ran
rla
RLA (Recurrent Linear Attention) "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" by Katharopoulos et al., 2020.
srn
srnn
srnn implements the SRNN (Shuffling Recurrent Neural Networks) by Rotman and Wolf, 2020.
tpr
This is an implementation of the Synthetic Attention described in: "SYNTHESIZER: Rethinking Self-Attention in Transformer Models" by Tay et al., 2020.
