spago

command module
v0.1.0
Published: Dec 9, 2020 License: BSD-2-Clause Imports: 5 Imported by: 0

README


If you like the project, please ★ star this repository to show your support! 🤩

A beautiful and maintainable machine learning library written in Go. It is designed to support relevant neural architectures in Natural Language Processing.

spaGO is compatible with 🤗 BERT-like Transformers and with the Flair sequence labeler architecture.

Features

Automatic differentiation
  • You write the forward(), it does all the backward() derivatives for you (see the sketch after this feature list):
    • Define-by-Run (default, just like PyTorch does)
    • Define-and-Run (similar to the static graph of TensorFlow)
Optimization methods
  • Gradient descent:
    • Adam, RAdam, RMS-Prop, AdaGrad, SGD
  • Differential Evolution
Neural networks
  • Feed-forward models (Linear, Highway, Convolution, ...)
  • Recurrent models (LSTM, GRU, BiLSTM...)
  • Attention mechanisms (Self-Attention, Multi-Head Attention, ...)
  • Recursive auto-encoders
Natural Language Processing
  • Memory-efficient Word Embeddings (with badger key–value store)
  • Character Language Models
  • Recurrent Sequence Labeler with CRF on top (e.g. Named Entity Recognition)
  • Transformer models (BERT-like)
    • Masked language model
    • Next sentence prediction
    • Token Classification
    • Text Classification (e.g. Sentiment Analysis)
    • Question Answering
    • Textual Entailment
    • Text Similarity
Compatible with pre-trained state-of-the-art neural models, such as 🤗 BERT-like Transformers and the Flair sequence labeler.
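
To give a feel for the define-by-run idea (you write the forward pass as ordinary code; the library records each operation and replays it backwards to compute gradients), here is a minimal, self-contained sketch in plain Go. Nothing below is spaGO's actual API: the `node` type and the `mul`/`add` helpers are hypothetical names used only to illustrate the mechanism.

```go
package main

import "fmt"

// node is a scalar value in a dynamically built computation graph.
// Each operation records a closure that propagates gradients backwards.
type node struct {
	value    float64
	grad     float64
	backward func()
}

func newNode(v float64) *node { return &node{value: v} }

// mul records z = a * b together with how to push dL/dz back to a and b.
func mul(a, b *node) *node {
	z := newNode(a.value * b.value)
	z.backward = func() {
		a.grad += b.value * z.grad
		b.grad += a.value * z.grad
	}
	return z
}

// add records z = a + b.
func add(a, b *node) *node {
	z := newNode(a.value + b.value)
	z.backward = func() {
		a.grad += z.grad
		b.grad += z.grad
	}
	return z
}

func main() {
	// Forward pass: y = w*x + b, built "by run" as ordinary Go code.
	w, x, b := newNode(2), newNode(3), newNode(1)
	wx := mul(w, x)
	y := add(wx, b)

	// Backward pass: seed dL/dy = 1 and replay the recorded closures in reverse
	// creation order (a real engine would topologically sort the graph for us).
	y.grad = 1
	for _, n := range []*node{y, wx} {
		n.backward()
	}
	fmt.Println(w.grad, x.grad, b.grad) // prints: 3 2 1
}
```

A define-and-run (static graph) mode works on the same recorded structure; the difference is simply that the graph is built once up front and then reused for many forward/backward passes.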

Documentation

Usage

Requirements:

Clone this repo or get the library:

go get -u github.com/nlpodyssey/spago

To get started, you can find some tutorials on the Wiki about the Machine Learning Framework.

Several demo programs can be used to tour spaGO's current capabilities. The demos are documented on the Wiki.

There is also a repo with handy examples, such as MNIST classification.

Project Goals

Is spaGO right for me?

Are you looking for a highly optimized, scalable, battle-tested, production-ready machine-learning/NLP framework? Are you also a Python lover who enjoys manipulating tensors? If so, you won't find much to your satisfaction here.

PyTorch, plus the wonders of our friends at Hugging Face, is the answer you seek!

If instead you prefer a statically typed, compiled programming language, and what you need is a simpler yet well-structured machine-learning framework that is almost ready to use, then you are in the right place!

The idea is that you could have written spaGO. Most of it, from the computational graph to the LSTM, is straightforward Go code :)

Why spaGO?

I've been writing more or less the same software for almost 20 years. I guess it's my way of learning a new language. Now it's Go's turn, and spaGO is the result of a few days of pure fun!

Let me explain a little further. It's not precisely the very same software I've been writing for 20 years: I've been working in NLP for that long, experimenting with different approaches and techniques, and therefore writing software in the same field. I've always taken satisfaction in limiting the use of third-party dependencies, writing the algorithms that interest me most firsthand. So, I took the opportunity to speed up my understanding of the deep learning techniques and methodologies underlying cutting-edge NLP results, implementing them almost from scratch in straightforward Go code. I'm aware that reinventing the wheel is an anti-pattern; nevertheless, I wanted to build something with my own concepts in my own (Italian) style: that's the way I learn best, and it could be your best chance to understand what's going on under the hood of artificial intelligence :)

When I start programming in a new language, I usually do not know much of it. I often combine the techniques I have acquired by writing in other languages and other paradigms, so some choices may not be the most idiomatic... but who cares, right?

It's with this approach that I jumped on Go and created spaGO: a work-in-progress, (hopefully) understandable, easy-to-use library for machine learning and natural language processing.

What direction did you take for the development of spaGO?

I started spaGO to deepen my first-hand understanding of the mechanisms underlying a machine learning framework. In doing so, I thought it was an excellent opportunity to design the library so that non-experts, too, could use and understand such algorithms.

In my experience, the first barrier to (deep) machine learning for developers who do not enjoy mathematics, at least not too much, is getting familiar with the use of tensors rather than understanding neural architectures. Well, in spaGO, we only use well-known 2D matrices, with which we can represent vectors and scalars too. That's all we need (performance aside). You won't lose sleep anymore over tensor axes when figuring out how to do math operations.
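
As a rough illustration of this choice (a hypothetical minimal type, not spaGO's actual mat API), a column vector is just an n×1 matrix and a scalar a 1×1 matrix, so a single dense 2D representation with plain nested loops covers all cases:

```go
package main

import "fmt"

// dense is a hypothetical row-major 2D matrix; vectors are n×1, scalars 1×1.
type dense struct {
	rows, cols int
	data       []float64
}

func newDense(rows, cols int, data []float64) *dense {
	return &dense{rows: rows, cols: cols, data: data}
}

// mul computes the matrix product a×b with plain nested loops.
func mul(a, b *dense) *dense {
	out := newDense(a.rows, b.cols, make([]float64, a.rows*b.cols))
	for i := 0; i < a.rows; i++ {
		for j := 0; j < b.cols; j++ {
			var sum float64
			for k := 0; k < a.cols; k++ {
				sum += a.data[i*a.cols+k] * b.data[k*b.cols+j]
			}
			out.data[i*out.cols+j] = sum
		}
	}
	return out
}

func main() {
	w := newDense(2, 3, []float64{1, 0, 2, 0, 1, 3}) // 2×3 weight matrix
	x := newDense(3, 1, []float64{1, 2, 3})          // a "vector" is just a 3×1 matrix
	bias := newDense(1, 1, []float64{0.5})           // a "scalar" is just a 1×1 matrix
	y := mul(w, x)                                   // 2×1 result, no tensor axes involved
	fmt.Println(y.data, bias.data[0])                // [7 11] 0.5
}
```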

Since it's a counter-trend decision, let me argue it a little more. It has happened a few times that friends and colleagues, who are super cool full-stack developers, tried to understand the NLP algorithms I was programming in PyTorch. Sometimes they gave up just because "the forward() method doesn't look like the usual code" to them.

Honestly, I don't find it hard to believe that by combining Python's dynamism with the versatility of tensors, the flow of a program can become hard to digest. It is undoubtedly essential to devote a good amount of time to reading the documentation, which may not be immediately available. Hence, you find yourself forced to inspect the content of the variables at runtime with your favorite IDE (PyCharm, of course). This happens in general, but I believe it happens in machine learning in particular.

In other words, I wanted to limit as much as possible the use of tensors larger than two dimensions, preferring built-in types such as slices and maps. For example, batches are explicit slices of nodes, not part of the same forward() computation. Too much detail here, sorry. In the end, I guess we do gain static code analysis this way, by shifting the focus from tensor operations back to traditional control flow. Of course, the type checker still can't verify the correct shapes of matrices and the like; that still requires runtime checks (panics and so on). I agree that it is hard to see where to draw the line, but so far, I'm pretty happy with my decision.
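
To make that concrete, here is a hypothetical sketch (again, not spaGO's API): the batch is a plain Go slice, and the "batched" forward pass is an ordinary loop that the compiler, linters, and the type checker can follow like any other control flow.

```go
package main

import "fmt"

// forward is a stand-in for a model's per-example computation.
func forward(x []float64) float64 {
	var sum float64
	for _, v := range x {
		sum += v
	}
	return sum
}

func main() {
	// A batch is just a slice of examples, not an extra tensor dimension.
	batch := [][]float64{
		{1, 2, 3},
		{4, 5, 6},
	}

	// The "batched" computation is an explicit, readable loop.
	outputs := make([]float64, 0, len(batch))
	for _, example := range batch {
		outputs = append(outputs, forward(example))
	}
	fmt.Println(outputs) // [6 15]
}
```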

Does spaGO support GPU?

Sadly, since it does not use tensors, spaGO is not GPU or TPU friendly by design. You bet I'm going to do some experiments integrating CUDA, but I can already tell you that I will not reach satisfactory performance levels.

In spaGO, using slices of (slices of) matrices, we have to "loop" often to do mathematical operations, whereas with tensors they are performed in one go. Any time your code has a loop, it is not GPU or TPU friendly.

The first thing mainstream tensor-based machine-learning frameworks such as PyTorch and TensorFlow want to do is convert whatever you're doing into a big matrix-multiplication problem, which is where the GPU does its best. Yeah, that's an overstatement, but not so far from reality. Storing all data in tensors and applying batched operations to them is the way to go for hardware acceleration. On GPU it's a must, and even on CPU it can give a 10x speedup or more with cache-aware BLAS libraries.
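
For contrast, here is the kind of rewrite those frameworks push you towards, sketched with plain slices and hypothetical helpers: the per-example mat-vec loop becomes a single matrix-matrix product with the whole batch stacked as columns, and it is that one call that maps well onto a BLAS routine or a GPU kernel.

```go
package main

import "fmt"

// matVec multiplies an m×n matrix (row-major) by an n-vector: one example at a time.
func matVec(w []float64, m, n int, x []float64) []float64 {
	y := make([]float64, m)
	for i := 0; i < m; i++ {
		for k := 0; k < n; k++ {
			y[i] += w[i*n+k] * x[k]
		}
	}
	return y
}

// matMul multiplies an m×n matrix by an n×b matrix: the whole batch in one go.
// This single call is what tensor frameworks hand to a BLAS library or a GPU kernel.
func matMul(w []float64, m, n int, x []float64, b int) []float64 {
	y := make([]float64, m*b)
	for i := 0; i < m; i++ {
		for j := 0; j < b; j++ {
			for k := 0; k < n; k++ {
				y[i*b+j] += w[i*n+k] * x[k*b+j]
			}
		}
	}
	return y
}

func main() {
	w := []float64{1, 0, 2, 0, 1, 3} // 2×3 weights

	// Loop style: one mat-vec per example (readable, but not accelerator friendly).
	for _, x := range [][]float64{{1, 2, 3}, {4, 5, 6}} {
		fmt.Println(matVec(w, 2, 3, x)) // [7 11], then [16 23]
	}

	// Batched style: both examples stacked as the columns of a 3×2 matrix.
	xBatch := []float64{1, 4, 2, 5, 3, 6}
	fmt.Println(matMul(w, 2, 3, xBatch, 2)) // [7 16 11 23]
}
```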

Beyond that, I think there are a lot of basic design improvements that would be necessary before spaGO could be fit for mainstream use. A lot of boilerplate could go away by using reflection, or more simply by careful engineering. It's perfectly normal; the more I program in Go, the more choices I find that I would revisit.

Is spaGO stable?

We're not at v1.0.0 yet, so spaGO is currently an experimental work in progress. It's pretty easy to get your hands on, though, so you might want to use it in your real applications. Early adopters may use it in production today, as long as they understand and accept that spaGO is not fully tested and that its APIs will change (maybe extensively).

If you're wondering, I haven't used spaGO in production myself yet, but I plan to do the first integration tests soon.

That said, spaGO has been running smoothly for a couple of months now in a system that analyzes thousands of news items a day!

Contact

I encourage you to open an issue. This helps the community grow.

If you really want to write to me privately, please email Matteo Grella with your questions or comments.

Acknowledgments

spaGO is a personal project that is part of the open-source NLP Odyssey initiative, started by members of the EXOP team. I would therefore like to thank EXOP GmbH here, which fully supports development by promoting the project and giving it increasing importance.

Documentation

There is no documentation for this package.

Directories

Path Synopsis
cmd
ner
This is the first attempt to launch a sequence labeling server from the command line.
embeddings
graphviz module
nn
approxlinear Module
pkg
mat
mat/internal/asm/f64
Package f64 provides float64 vector primitives.
ml/ag/fn
SparseMax implementation based on https://github.com/gokceneraslan/SparseMax.torch
ml/nn/birnncrf
Bidirectional Recurrent Neural Network (BiRNN) with a Conditional Random Fields (CRF) on top.
ml/nn/bls
Implementation of the Broad Learning System (BLS) described in "Broad Learning System: An Effective and Efficient Incremental Learning System Without the Need for Deep Architecture" by C. L. Philip Chen and Zhulin Liu, 2017.
ml/nn/gnn/slstm
slstm Reference: "Sentence-State LSTM for Text Representation" by Zhang et al, 2018.
ml/nn/gnn/startransformer
StarTransformer is a variant of the model introduced by Qipeng Guo, Xipeng Qiu et al.
ml/nn/lshattention
LSH-Attention as in `Reformer: The Efficient Transformer` by N. Kitaev, Ł. Kaiser, A. Levskaya.
ml/nn/normalization/adanorm
Reference: "Understanding and Improving Layer Normalization" by Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin (2019).
ml/nn/normalization/fixnorm
Reference: "Improving Lexical Choice in Neural Machine Translation" by Toan Q. Nguyen and David Chiang (2018) (https://arxiv.org/pdf/1710.01329.pdf)
ml/nn/normalization/layernorm
Reference: "Layer normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E Hinton (2016).
ml/nn/normalization/layernormsimple
Reference: "Understanding and Improving Layer Normalization" by Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin (2019).
ml/nn/normalization/rmsnorm
Reference: "Root Mean Square Layer Normalization" by Biao Zhang and Rico Sennrich (2019).
ml/nn/rae
Implementation of the recursive auto-encoder strategy described in "Towards Lossless Encoding of Sentences" by Prato et al., 2019.
ml/nn/rc
This package contains built-in Residual Connections (RC).
ml/nn/rec/horn
Higher Order Recurrent Neural Networks (HORN)
ml/nn/rec/lstmsc
LSTM enriched with a PolicyGradient to enable Dynamic Skip Connections.
ml/nn/rec/mist
Implementation of the MIST (MIxed hiSTory) recurrent network as described in "Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies" by Di Pietro et al., 2018 (https://arxiv.org/pdf/1702.07805.pdf).
ml/nn/rec/nru
Implementation of the NRU (Non-Saturating Recurrent Units) recurrent network as described in "Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies" by Chandar et al., 2019.
ml/nn/rec/rla
RLA (Recurrent Linear Attention) "Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention" by Katharopoulos et al., 2020.
ml/nn/rec/srnn
srnn implements the SRNN (Shuffling Recurrent Neural Networks) by Rotman and Wolf, 2020.
ml/nn/syntheticattention
This is an implementation of the Synthetic Attention described in: "SYNTHESIZER: Rethinking Self-Attention in Transformer Models" by Tay et al., 2020.
nlp/charlm
CharLM implements a character-level language model that uses a recurrent neural network as its backbone.
nlp/contextualstringembeddings
Implementation of the "Contextual String Embeddings" of words (Akbik et al., 2018).
nlp/evolvingembeddings
A word embedding model that evolves itself by dynamically aggregating contextual embeddings over time during inference.
nlp/sequencelabeler
Implementation of a sequence labeling architecture composed by Embeddings -> BiRNN -> Scorer -> CRF.
nlp/stackedembeddings
StackedEmbeddings is a convenient module that stacks multiple word embedding representations by concatenating them.
nlp/tokenizers
This package is an interim solution while developing `gotokenizers` (https://github.com/nlpodyssey/gotokenizers).
nlp/tokenizers/basetokenizer
BaseTokenizer is a very simple tokenizer that splits per white-spaces (and alike) and punctuation symbols.
nlp/transformers/bert
Reference: "Attention Is All You Need" by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin (2017) (http://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf).
