attention

package
v0.7.0 Latest
Published: May 24, 2021 License: BSD-2-Clause Imports: 3 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func LinearAttention

func LinearAttention(g *ag.Graph, qkv QKV, mappingFunction MappingFunc, eps mat.Float) []ag.Node

LinearAttention performs self-attention as a linear dot-product of kernel feature maps. It operates with O(N) complexity, where N is the sequence length. Reference: "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" by Katharopoulos et al. (2020)
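
For illustration, a minimal sketch of a call, assuming spaGO v0.7.0 import paths (github.com/nlpodyssey/spago/pkg/mat32 as mat, .../pkg/ml/ag, .../pkg/ml/nn/attention) and using a plain ReLU feature map purely as a stand-in for the paper's elu(x)+1; the eps value is also illustrative:

g := ag.NewGraph()

// Toy sequence of three 4-dimensional input vectors.
xs := []ag.Node{
	g.NewVariable(mat.NewVecDense([]mat.Float{0.1, 0.2, 0.3, 0.4}), false),
	g.NewVariable(mat.NewVecDense([]mat.Float{0.5, 0.6, 0.7, 0.8}), false),
	g.NewVariable(mat.NewVecDense([]mat.Float{0.9, 1.0, 1.1, 1.2}), false),
}

// Illustrative non-negative feature map (the reference paper uses elu(x)+1).
phi := func(g *ag.Graph, x ag.Node) ag.Node {
	return g.ReLU(x)
}

// Queries = Keys = Values = xs, as produced by ToQKV.
qkv := attention.ToQKV(xs)

// One context vector per input position.
context := attention.LinearAttention(g, qkv, phi, 1e-12)
_ = context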

func MakeCausalMask added in v0.5.0

func MakeCausalMask(curIndex, seqLength int) []mat.Float

MakeCausalMask returns a slice of size seqLength filled with zeros until curIndex, and the rest with -inf.
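
For example, a mask for the third position (index 2) of a five-step sequence looks like this (assuming, as is usual for causal attention, that the current position itself stays visible):

mask := attention.MakeCausalMask(2, 5)
// mask ≈ [0, 0, 0, -Inf, -Inf]: later positions are excluded from the attention.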

func ScaledDotProductAttention

func ScaledDotProductAttention(g *ag.Graph, qkv QKV, scaleFactor mat.Float, useCausalMask bool) (context []ag.Node, prob []mat.Matrix)

ScaledDotProductAttention is a self-attention mechanism relating different positions of a single sequence to compute a representation of the same sequence. This method requires that the query, key and value vectors have already been obtained from the input sequence. The scale factor is derived from the square root of the dimension of the key vectors (typically its reciprocal, 1/√dₖ, as in "Attention Is All You Need").
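
A hedged usage sketch, reusing the import assumptions from the LinearAttention example above plus the standard library math package, and passing 1/√dₖ as the scale factor:

g := ag.NewGraph()
xs := []ag.Node{
	g.NewVariable(mat.NewVecDense([]mat.Float{0.1, 0.2, 0.3, 0.4}), false),
	g.NewVariable(mat.NewVecDense([]mat.Float{0.5, 0.6, 0.7, 0.8}), false),
}

// Here queries, keys and values are simply the inputs themselves.
qkv := attention.ToQKV(xs)

dk := 4.0 // dimension of the key vectors
scaleFactor := mat.Float(1.0 / math.Sqrt(dk))

// context: one attended vector per position; prob: the attention weights per position.
context, prob := attention.ScaledDotProductAttention(g, qkv, scaleFactor, true)
_, _ = context, prob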

func ScaledDotProductAttentionConcurrent

func ScaledDotProductAttentionConcurrent(g *ag.Graph, qkv QKV, scaleFactor mat.Float) (context []ag.Node, prob []mat.Matrix)

ScaledDotProductAttentionConcurrent does the same thing as ScaledDotProductAttention but processes input concurrently.

Types

type KeysValuesPair added in v0.5.0

type KeysValuesPair struct {
	Keys   []ag.Node
	Values []ag.Node
}

KeysValuesPair contains Keys and Values.

type MappingFunc

type MappingFunc func(g *ag.Graph, x ag.Node) ag.Node

MappingFunc is a mapping function used by LinearAttention.

type Output added in v0.5.0

type Output struct {
	// AttOutput is the result of the self-attention.
	AttOutput []ag.Node
	// AttWeights are the attention scores for each element of the sequence.
	AttWeights []mat.Matrix
	// ProjKeysValues is the list of Keys and Values used to compute the self-attention.
	ProjKeysValues KeysValuesPair
}

Output aggregates the multiple outputs of the self-attention, including the attention scores and the last projected keys and values.

type QKV

type QKV struct {
	Queries []ag.Node
	Keys    []ag.Node
	Values  []ag.Node
}

QKV groups queries, keys and values useful for self-attention functions, as described in "Attention Is All You Need" (Vaswani et al., 2017 - http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf).

func ToQKV

func ToQKV(xs []ag.Node) QKV

ToQKV creates a new QKV struct with queries = keys = values = xs.
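
In other words, for self-attention over raw inputs the two forms below are equivalent (xs is assumed to be a previously built []ag.Node):

qkv := attention.ToQKV(xs)

// is the same as
qkv = attention.QKV{Queries: xs, Keys: xs, Values: xs}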

Directories

Path Synopsis
lshattention	Package lshattention provides an implementation of the LSH-Attention model, as described in `Reformer: The Efficient Transformer` by N. Kitaev, Ł. Kaiser, A. Levskaya (https://arxiv.org/pdf/2001.04451.pdf).
syntheticattention	Package syntheticattention provides an implementation of the Synthetic Attention described in: "SYNTHESIZER: Rethinking Self-Attention in Transformer Models" by Tay et al., 2020.
