Documentation ¶
Index ¶
- func Affine(g *ag.Graph, xs ...ag.Node) ag.Node
- func BiAffine(g *ag.Graph, w, u, v, b, x1, x2 ag.Node) ag.Node
- func BiLinear(g *ag.Graph, w, x1, x2 ag.Node) ag.Node
- func ClearSupport(m Model)
- func Conv2D(g *ag.Graph, w, x ag.Node, xStride, yStride int) ag.Node
- func DumpParamsVector(model Model) *mat.Dense
- func ForEachParam(m Model, callback func(param *Param))
- func ForEachParamStrict(m Model, callback func(param *Param))
- func LinearAttention(g *ag.Graph, qs, ks, vs []ag.Node, mappingFunction MappingFunc, eps float64) []ag.Node
- func LoadParamsVector(model Model, vector *mat.Dense)
- func PayloadMarshalBinaryTo(supp *Payload, w io.Writer) (int, error)
- func ScaledDotProductAttention(g *ag.Graph, qs, ks, vs []ag.Node, scaleFactor float64) (context []ag.Node, prob []mat.Matrix)
- func ScaledDotProductAttentionConcurrent(g *ag.Graph, qs, ks, vs []ag.Node, scaleFactor float64) (context []ag.Node, prob []mat.Matrix)
- func Separate(g *ag.Graph, x ag.Node) [][]ag.Node
- func SeparateVec(g *ag.Graph, x ag.Node) []ag.Node
- func SplitVec(g *ag.Graph, x ag.Node, chunks int) []ag.Node
- func ZeroGrad(m Model)
- type BaseProcessor
- type Context
- type DefaultParamsIterator
- type MappingFunc
- type Model
- type Param
- func (r *Param) ApplyDelta(delta mat.Matrix)
- func (r *Param) ClearPayload()
- func (r *Param) Grad() mat.Matrix
- func (r *Param) HasGrad() bool
- func (r *Param) MarshalBinary() ([]byte, error)
- func (r *Param) Name() string
- func (r *Param) Payload() *Payload
- func (r *Param) PropagateGrad(grad mat.Matrix)
- func (r *Param) ReplaceValue(value mat.Matrix)
- func (r *Param) RequiresGrad() bool
- func (r *Param) ScalarValue() float64
- func (r *Param) SetName(name string)
- func (r *Param) SetPayload(payload *Payload)
- func (r *Param) SetType(name string)
- func (r *Param) Type() ParamsType
- func (r *Param) UnmarshalBinary(data []byte) error
- func (r *Param) Value() mat.Matrix
- func (r *Param) ZeroGrad()
- type ParamOption
- type ParamSerializer
- type ParamsIterator
- type ParamsSerializer
- type ParamsType
- type Payload
- type ProcessingMode
- type Processor
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Affine ¶
func Affine(g *ag.Graph, xs ...ag.Node) ag.Node
Affine performs an affine transformation over an arbitrary (odd) number of nodes held in the input. The first node is the “bias”, which is added to the output as-is. The remaining nodes are interpreted as (W, x) pairs, whose products are summed. Any pair after the first whose "x" is nil is skipped. y = b + W1x1 + W2x2 + ... + Wnxn
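As a usage sketch, a single transformation y = b + Wx can be built as follows. The graph and matrix constructors (ag.NewGraph, NewVariable, mat.NewDense, mat.NewVecDense) and the import paths are assumed from the surrounding library, not defined in this package.

package main

import (
	"fmt"

	"github.com/nlpodyssey/spago/pkg/mat"
	"github.com/nlpodyssey/spago/pkg/ml/ag"
	"github.com/nlpodyssey/spago/pkg/ml/nn"
)

func main() {
	g := ag.NewGraph()

	// Bias b (2-dim), weights W (2x3) and input x (3-dim) as graph variables.
	b := g.NewVariable(mat.NewVecDense([]float64{0.1, -0.2}), true)
	w := g.NewVariable(mat.NewDense(2, 3, []float64{
		0.5, 0.1, -0.3,
		0.2, -0.4, 0.6,
	}), true)
	x := g.NewVariable(mat.NewVecDense([]float64{1.0, 2.0, 3.0}), false)

	// The bias comes first, followed by the (W, x) pair: y = b + Wx.
	y := nn.Affine(g, b, w, x)
	fmt.Println(y.Value())
}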
func ClearSupport ¶
func ClearSupport(m Model)
ClearSupport clears the support structure of all the model's parameters (including sub-params). TODO: use ParamsIterator?
func ForEachParam ¶
func ForEachParam(m Model, callback func(param *Param))
ForEachParam iterates over all the parameters of a model, also exploring the sub-parameters recursively. TODO: don't loop the field every time, use a lazy initialized "params list" instead.
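For example, the callback can collect the names of every parameter that requires gradients. This is a minimal sketch using only the Param methods listed in this package; the helper name and package are illustrative.

package modelutil // hypothetical helper package for this sketch

import "github.com/nlpodyssey/spago/pkg/ml/nn"

// namesOfTrainableParams collects the names of all parameters that require
// gradients, visiting sub-models recursively via ForEachParam.
func namesOfTrainableParams(m nn.Model) []string {
	var names []string
	nn.ForEachParam(m, func(p *nn.Param) {
		if p.RequiresGrad() {
			names = append(names, p.Name())
		}
	})
	return names
}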
func ForEachParamStrict ¶
func ForEachParamStrict(m Model, callback func(param *Param))
ForEachParamStrict iterates over all the parameters of a model without exploring the sub-models.
func LinearAttention ¶
func LinearAttention(g *ag.Graph, qs, ks, vs []ag.Node, mappingFunction MappingFunc, eps float64) []ag.Node
LinearAttention performs the self-attention as a linear dot-product of kernel feature maps. It operates with O(N) complexity, where N is the sequence length. Reference: "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" by Katharopoulos et al. (2020)
func LoadParamsVector ¶
func LoadParamsVector(model Model, vector *mat.Dense)
LoadParamsVector loads the values of the given vector into the model's parameters. TODO: use ParamsIterator?
func PayloadMarshalBinaryTo ¶
func PayloadMarshalBinaryTo(supp *Payload, w io.Writer) (int, error)
PayloadMarshalBinaryTo marshals the Payload into w. It returns the number of bytes written and an error, if any.
func ScaledDotProductAttention ¶
func ScaledDotProductAttention(g *ag.Graph, qs, ks, vs []ag.Node, scaleFactor float64) (context []ag.Node, prob []mat.Matrix)
ScaledDotProductAttention is a self-attention mechanism relating different positions of a single sequence in order to compute a representation of the same sequence. This method requires that the query, key, and value vectors have already been obtained from the input sequence. The scale factor is typically the reciprocal of the square root of the key vectors' dimension (1/√dk).
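A minimal sketch of a call follows. The graph and vector constructors are assumed from the surrounding library, and the choice of 1/√dk for scaleFactor is the common convention rather than something the signature enforces.

package main

import (
	"math"

	"github.com/nlpodyssey/spago/pkg/mat"
	"github.com/nlpodyssey/spago/pkg/ml/ag"
	"github.com/nlpodyssey/spago/pkg/ml/nn"
)

func main() {
	g := ag.NewGraph()
	newVec := func(vs ...float64) ag.Node {
		return g.NewVariable(mat.NewVecDense(vs), false)
	}

	// Three positions; queries and keys of size dk = 2, values of size 2.
	qs := []ag.Node{newVec(0.1, 0.2), newVec(0.3, 0.1), newVec(0.0, 0.5)}
	ks := []ag.Node{newVec(0.2, 0.1), newVec(0.4, 0.3), newVec(0.1, 0.1)}
	vs := []ag.Node{newVec(1.0, 0.0), newVec(0.0, 1.0), newVec(0.5, 0.5)}

	scaleFactor := 1.0 / math.Sqrt(2.0) // 1/sqrt(dk), the usual choice

	context, probs := nn.ScaledDotProductAttention(g, qs, ks, vs, scaleFactor)
	_ = context // one context node per input position
	_ = probs   // the attention distribution computed for each query
}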
func ScaledDotProductAttentionConcurrent ¶
func ScaledDotProductAttentionConcurrent(g *ag.Graph, qs, ks, vs []ag.Node, scaleFactor float64) (context []ag.Node, prob []mat.Matrix)
ScaledDotProductAttentionConcurrent does the same thing as ScaledDotProductAttention but processes input concurrently.
func Separate ¶
func Separate(g *ag.Graph, x ag.Node) [][]ag.Node
Separate returns a matrix of Node(s) represented as a slice of slices containing the elements extracted from the input. The dimensions of the resulting matrix are the same as those of the input.
func SeparateVec ¶
func SeparateVec(g *ag.Graph, x ag.Node) []ag.Node
SeparateVec returns a slice of Node(s) containing the elements extracted from the input. The size of the vector equals the number of input elements. You can think of this method as the inverse of the ag.Concat operator.
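For instance, concatenating two vectors and then separating the result yields one scalar node per element. This is a sketch; g.Concat and the variable constructors are assumed from the ag and mat packages.

package main

import (
	"github.com/nlpodyssey/spago/pkg/mat"
	"github.com/nlpodyssey/spago/pkg/ml/ag"
	"github.com/nlpodyssey/spago/pkg/ml/nn"
)

func main() {
	g := ag.NewGraph()
	a := g.NewVariable(mat.NewVecDense([]float64{1, 2}), false)
	b := g.NewVariable(mat.NewVecDense([]float64{3}), false)

	joined := g.Concat(a, b)           // a single 3-element vector node
	elems := nn.SeparateVec(g, joined) // three scalar nodes: 1, 2, 3
	_ = elems
}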
Types ¶
type BaseProcessor ¶
type BaseProcessor struct {
	Model             Model
	Mode              ProcessingMode
	Graph             *ag.Graph
	FullSeqProcessing bool
}
BaseProcessor satisfies some methods of the Processor interface. It is meant to be embedded in other processors to reduce the amount of boilerplate code.
func (*BaseProcessor) GetGraph ¶
func (p *BaseProcessor) GetGraph() *ag.Graph
GetGraph returns the computational graph on which the processor operates.
func (*BaseProcessor) GetMode ¶
func (p *BaseProcessor) GetMode() ProcessingMode
GetMode returns whether the processor is being used for training or inference.
func (*BaseProcessor) GetModel ¶
func (p *BaseProcessor) GetModel() Model
GetModel returns the model the processor belongs to.
func (*BaseProcessor) RequiresFullSeq ¶
func (p *BaseProcessor) RequiresFullSeq() bool
RequiresFullSeq returns whether the processor needs the complete sequence to start processing (as in the case of BiRNN and other bidirectional models), or not.
type Context ¶
type Context struct {
	// Graph is the computational graph on which the processor(s) operate.
	Graph *ag.Graph
	// Mode regulates the different usage of some operations whether you're doing training or inference.
	Mode ProcessingMode
}
Context is used to instantiate a processor to operate on a graph, according to the desired ProcessingMode. If a processor contains other sub-processors, you must instantiate them using the same context to make sure you are operating on the same graph and in the same mode.
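For example, the same model can be instantiated once for training and once for inference. This is a sketch: model stands for any value implementing Model and xs for previously created input nodes.

// Training: operations such as dropout are active.
g := ag.NewGraph()
trainer := model.NewProc(nn.Context{Graph: g, Mode: nn.Training})
ys := trainer.Forward(xs...)

// Inference: a fresh graph and a processor created in Inference mode.
g2 := ag.NewGraph()
predictor := model.NewProc(nn.Context{Graph: g2, Mode: nn.Inference})
_, _ = ys, predictor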
type DefaultParamsIterator ¶
type DefaultParamsIterator struct {
// contains filtered or unexported fields
}
func NewDefaultParamsIterator ¶
func NewDefaultParamsIterator(models ...Model) *DefaultParamsIterator
func (*DefaultParamsIterator) ParamsList ¶
func (i *DefaultParamsIterator) ParamsList() []*Param
type Model ¶
type Model interface {
	// NewProc returns a new processor to execute the forward step.
	NewProc(ctx Context) Processor
}
Model contains the serializable parameters.
type Param ¶
type Param struct {
// contains filtered or unexported fields
}
func NewParam ¶
func NewParam(value mat.Matrix, opts ...ParamOption) *Param
NewParam returns a new param.
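For example, a trainable and a frozen parameter can be created as follows. This is a sketch: the matrix constructors come from the mat package, and the literal strings passed to SetType are assumptions based on the ParamsType constant names.

// A 2x3 weight parameter that takes part in the backward step.
w := nn.NewParam(mat.NewDense(2, 3, make([]float64, 6)), nn.RequiresGrad(true))
w.SetName("w")
w.SetType("weights")

// A frozen vector parameter, excluded from gradient computation.
frozen := nn.NewParam(mat.NewVecDense([]float64{0, 0}), nn.RequiresGrad(false))
frozen.SetName("frozen")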
func (*Param) ApplyDelta ¶
func (r *Param) ApplyDelta(delta mat.Matrix)
ApplyDelta updates the value of the underlying storage by applying the delta.
func (*Param) ClearPayload ¶
func (r *Param) ClearPayload()
ClearPayload clears the support structure.
func (*Param) MarshalBinary ¶
func (r *Param) MarshalBinary() ([]byte, error)
MarshalBinary satisfies the custom marshaling interface of package encoding/gob.
func (*Param) PropagateGrad ¶
func (r *Param) PropagateGrad(grad mat.Matrix)
PropagateGrad accumulates the gradients.
func (*Param) ReplaceValue ¶
ReplaceValue replaces the value of the parameter and clears the support structure.
func (*Param) RequiresGrad ¶
RequiresGrad returns true if the param requires gradients.
func (*Param) ScalarValue ¶
func (r *Param) ScalarValue() float64
ScalarValue returns the scalar value of the node. It panics if the value is not a scalar. Note that it is not possible to start the backward step from a scalar value.
func (*Param) SetPayload ¶
func (*Param) Type ¶
func (r *Param) Type() ParamsType
Type returns the params type (weights, biases, undefined).
func (*Param) UnmarshalBinary ¶
func (r *Param) UnmarshalBinary(data []byte) error
UnmarshalBinary satisfies the custom marshaling interface of package encoding/gob.
type ParamOption ¶
type ParamOption func(*Param)
func RequiresGrad ¶
func RequiresGrad(value bool) ParamOption
func SetStorage ¶
func SetStorage(storage kvdb.KeyValueDB) ParamOption
type ParamSerializer ¶
type ParamSerializer struct {
*Param
}
func (*ParamSerializer) Deserialize ¶
func (s *ParamSerializer) Deserialize(r io.Reader) (n int, err error)
type ParamsIterator ¶
type ParamsIterator interface {
ParamsList() []*Param
}
type ParamsSerializer ¶
type ParamsSerializer struct {
Model
}
func NewParamsSerializer ¶
func NewParamsSerializer(m Model) *ParamsSerializer
func (*ParamsSerializer) Deserialize ¶
func (m *ParamsSerializer) Deserialize(r io.Reader) (n int, err error)
Deserialize assigns to the params the values obtained from the reader. TODO: use ParamsIterator?
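A minimal sketch of restoring a model's parameters from a file using only the functions listed here; the helper name, package, and file path are illustrative.

package modelio // hypothetical helper package for this sketch

import (
	"os"

	"github.com/nlpodyssey/spago/pkg/ml/nn"
)

// loadParams restores the parameter values of m from a previously
// serialized stream, delegating the decoding to ParamsSerializer.
func loadParams(m nn.Model, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = nn.NewParamsSerializer(m).Deserialize(f)
	return err
}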
type ParamsType ¶
type ParamsType int
const (
	Weights ParamsType = iota
	Biases
	Undefined
)
func ToType ¶
func ToType(s string) ParamsType
ToType converts a string to a ParamsType. It returns Undefined if the string doesn't match any ParamsType.
func (ParamsType) String ¶
func (t ParamsType) String() string
type Payload ¶
Payload contains the support data used, for example, by the optimization methods.
func NewEmptySupport ¶
func NewEmptySupport() *Payload
NewEmptySupport returns an empty support structure, not connected to any optimization method.
type ProcessingMode ¶
type ProcessingMode int
ProcessingMode regulates the different usage of some operations (e.g. Dropout, BatchNorm, etc.) inside a Processor, depending on whether you're doing training or inference. Failing to set the right mode will yield inconsistent inference results.
const (
	// Training is to be used during the training phase of a model. For example, dropouts are enabled.
	Training ProcessingMode = iota
	// Inference keeps weights fixed while using the model and disables some operations (e.g. skip dropout).
	Inference
)
type Processor ¶
type Processor interface {
	// GetModel returns the model the processor belongs to.
	GetModel() Model
	// GetMode returns whether the processor is being used for training or inference.
	GetMode() ProcessingMode
	// GetGraph returns the computational graph on which the processor operates.
	GetGraph() *ag.Graph
	// RequiresFullSeq returns whether the processor needs the complete sequence to start processing
	// (as in the case of BiRNN and other bidirectional models), or not.
	RequiresFullSeq() bool
	// Forward performs the forward step for each input and returns the result.
	// Recurrent networks treat the input nodes as a sequence.
	// By contrast, feed-forward networks are stateless, so every computation is independent.
	Forward(xs ...ag.Node) []ag.Node
}
Processor performs the operations on the computational graphs using the model's parameters.
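To make the contract concrete, here is a toy model and processor (not part of the package) that embed BaseProcessor for the bookkeeping methods and implement only Forward; the import path for ag is an assumption.

package toy // hypothetical package, for illustration only

import (
	"github.com/nlpodyssey/spago/pkg/ml/ag"
	"github.com/nlpodyssey/spago/pkg/ml/nn"
)

// echoModel has no parameters; it exists only to show the Model/Processor wiring.
type echoModel struct{}

// NewProc wires the Context into a processor bound to the given graph and mode.
func (m *echoModel) NewProc(ctx nn.Context) nn.Processor {
	return &echoProcessor{
		BaseProcessor: nn.BaseProcessor{
			Model:             m,
			Mode:              ctx.Mode,
			Graph:             ctx.Graph,
			FullSeqProcessing: false, // stateless: the full sequence is not needed
		},
	}
}

// echoProcessor embeds BaseProcessor, which provides GetModel, GetMode,
// GetGraph and RequiresFullSeq; only Forward is implemented here.
type echoProcessor struct {
	nn.BaseProcessor
}

// Forward returns the inputs unchanged: the minimum possible forward step.
func (p *echoProcessor) Forward(xs ...ag.Node) []ag.Node {
	return xs
}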
Directories ¶
Path | Synopsis
---|---
 | Bidirectional Recurrent Neural Network (BiRNN) with a Conditional Random Fields (CRF) on top.
 | Implementation of the Broad Learning System (BLS) described in "Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture" by C. L. Philip Chen and Zhulin Liu, 2017.
gnn |
slstm | Reference: "Sentence-State LSTM for Text Representation" by Zhang et al., 2018.
startransformer | StarTransformer is a variant of the model introduced by Qipeng Guo, Xipeng Qiu et al.
 | LSH-Attention as in "Reformer: The Efficient Transformer" by N. Kitaev, Ł. Kaiser, A. Levskaya.
normalization |
adanorm | Reference: "Understanding and Improving Layer Normalization" by Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin (2019).
fixnorm | Reference: "Improving Lexical Choice in Neural Machine Translation" by Toan Q. Nguyen and David Chiang (2018) (https://arxiv.org/pdf/1710.01329.pdf).
layernorm | Reference: "Layer normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton (2016).
layernormsimple | Reference: "Understanding and Improving Layer Normalization" by Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin (2019).
rmsnorm | Reference: "Root Mean Square Layer Normalization" by Biao Zhang and Rico Sennrich (2019).
 | Implementation of the recursive auto-encoder strategy described in "Towards Lossless Encoding of Sentences" by Prato et al., 2019.
 | This package contains built-in Residual Connections (RC).
rec |
horn | Higher Order Recurrent Neural Networks (HORN).
lstmsc | LSTM enriched with a PolicyGradient to enable Dynamic Skip Connections.
mist | Implementation of the MIST (MIxed hiSTory) recurrent network as described in "Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies" by Di Pietro et al., 2018 (https://arxiv.org/pdf/1702.07805.pdf).
nru | Implementation of the NRU (Non-Saturating Recurrent Units) recurrent network as described in "Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies" by Chandar et al., 2019.
rla | RLA (Recurrent Linear Attention) as in "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" by Katharopoulos et al., 2020.
srnn | srnn implements the SRNN (Shuffling Recurrent Neural Networks) by Rotman and Wolf, 2020.
 | Implementation of the Synthetic Attention described in "SYNTHESIZER: Rethinking Self-Attention in Transformer Models" by Tay et al., 2020.