skogul

Version: v0.5.2-pre1
Published: Jan 30, 2020 License: LGPL-2.1 Imports: 8 Imported by: 2

README

.. image:: https://goreportcard.com/badge/github.com/telenornms/skogul
   :target: https://goreportcard.com/report/github.com/telenornms/skogul

.. image:: https://godoc.org/github.com/telenornms/skogul?status.svg
   :target: https://godoc.org/github.com/telenornms/skogul

.. image:: https://cloud.drone.io/api/badges/telenornms/skogul/status.svg
   :target: https://cloud.drone.io/telenornms/skogul

======================================
Skogul - generic metric/data collector
======================================

Skogul is a generic tool for moving metric data around. It can serve as a
collector of data, but is primarily designed to be a framework for building
bridges between data collectors and storage engines.

This repository contains the Skogul library/package, and ``cmd/skogul``,
which parses a JSON-config to set up Skogul.

A copy of the auto-generated manual for skogul is also provided, which is
aimed at end-users. See ``skogul.rst`` (or ``man ./skogul.1``).

.. contents:: Table of contents
   :depth: 2
   :local:

Quickstart
----------

You need to install a reasonably recent version of Go, either from your
favorite Linux distro or through https://golang.org/dl/ .

Building ``skogul``, including cloning::

   $ git clone https://github.com/telenornms/skogul
   (...)
   $ cd skogul/cmd/skogul
   $ go build
   $ 
   # (No output from go build is good)

Alternatively, you can use ``go install`` instead of ``go build`` to
install to ``$GOPATH/bin``, which is typically ``~/go/bin``.

To build from locally vendored packages instead of downloading them
directly (e.g. if a system does not have direct internet access, or you wish
to keep a local copy of the code in its entirety, including dependencies),
first make a vendored copy on an internet-attached computer. Checksums in
the repo will be verified::

   $ cd skogul
   $ go mod vendor
   $
   ( skogul/vendor is now populated after a while )

Copy the repository directory to the target computer, then run::

   $ cd skogul/cmd/skogul
   $ go build -mod vendor
   $

(or ``go install -mod vendor``)


About
-----

Skogul is written to solve a myriad of issues that typically arise when
dealing with metric data and complex systems. It can be used for very
simple setups, and expanded to large, multi-datacenter infrastructures with
a mixture of new and old systems attached to it.

To accomplish this, you set up chains that define how data is received, how
it is treated, where it goes and what happens if something goes wrong.

A Skogul chain is built from one or more independent receivers which
receive data and pass it on to a sender. A sender can either transmit data
to an external system (including another Skogul instance), or add some
internal routing logic before passing it on to one or more other senders.

.. image:: docs/imgs/basic.png

Unlike most APIs or collectors of metrics, Skogul does NOT have a
preference when it comes to storage engine. It is explicitly designed to
disconnect the task of how data is collected from how it is stored.

The rationale is that the problem of writing an efficient SNMP collector
should not be tightly coupled to where you store the data. And where you
store the data should not be tightly coupled with how you receive it, or
what you do with it.

This enables an organization to gradually shift from older to newer stacks,
as Skogul can receive data on both old and new transport mechanisms,
and store it in both new and old systems. That way, older collectors can
continue working how they always have worked, but send data to Skogul.
During testing/maturing, Skogul can store the data in both the legacy system
and the replacement system. When the legacy system is removed, no change is
needed on the side of the collector.

Extra care has been put into making it trivial to write senders and
receivers. For example, an author of a new sender only has to add tags
to their data structure to have that exposed as documentation.

See the package documentation over at godoc for development-related
documentation: 
https://godoc.org/github.com/telenornms/skogul

End-user documentation is found in the manual page, which Skogul can
generate on demand, or you can review a copy on github: 
https://github.com/telenornms/skogul/blob/master/skogul.rst

More discussion on architecture can be found in `docs/`.

Performance
-----------

Skogul is meant to scale well. Early tests on a laptop proved to work very
well:

.. image:: docs/imgs/skogul-rates.png

The above graph is from a very simple test on a laptop (with a quad core
i7), using the provided tester to write data to influxdb. It demonstrates
that despite well-known weaknesses at the time (especially in the
influx-writer), we're able to push roughly 600-800k values/s through
Skogul. This has since been exceeded.

The laptop in question was using about 150-190% CPU for skogul and 400% for
InfluxDB, the rest went to the testers. No real attempt at tuning was done,
but a few different scenarios were tested.

Note that the general values/s rate is decent both with a ton of values for
each metric, and with just a handful of values per metric but tons of
metrics per container.

Update:

As of September 2019, TLS was enabled and Skogul was tested again, just for
TLS. Skogul was seen sending roughly 2 million key:values/s over HTTPS on
the same laptop. The batch sender has also proven to be very valuable.

Name
----

Skogul is a Valkyrie. After extensive research (5 minutes on Wikipedia with
a cross-check on duckduckgo), this name was selected because it is
reasonably unique and is also a Valkyrie, like Gondul, a sister-project.

Hacking
-------

There is little "exotic" about Skogul hacking, so the following sections
are aimed mostly at people who are unfamiliar with Go.

A few sources for more documentation:

- docs/CODE_OF_CONDUCT.md
- docs/CONTRIBUTING
- docs/CODING
- doc.go

Testing
.......

To run test cases, ``go test`` can be run. This can be used either in
individual directories, or at the top directory, with ``go test -short ./...``
(note the triple dots. This is a go-ism for recursive behavior). Tests are
run automatically if you create a pull request.

The ``-short`` argument disables integration tests that would otherwise
fail unless you've set up a compatible postgres and mysql database locally.

To produce coverage analysis, use::

   $ cd skogul
   $ go test -short ./... -covermode=count -coverprofile=coverage.out
   $ go tool cover -html coverage.out
   // Opens a browser with coverage analysis

Tests are extracted from ``*_test.go`` files, and start with the name
``Test`` followed by a function or data structure, optionally followed by
an underscore and an arbitrary name to support multiple tests of the same
function/type. E.g. ``TestValidate()``, ``TestHTTP_foobar()`` etc.

Formatting etc
..............

The "go report" at the top of this document is a decent test of
marginal OK-ish-ness.

Tools you should use:

- `gofmt`, to format code according to Go coding style. Use ``gofmt -d .`` to
  see a local diff, or ``gofmt -w .`` to fix it.
- `golint` to lint your code. ``golint .``

Installing these tools is left as an exercise to the reader.

Documentation
.............

Documentation comes in two forms. One is aimed at end-users. This is
provided mainly by adding proper labels to your data structures (see any
sender or receiver implementation), and through hard-coded text found in
``cmd/skogul/main.go``. In addition to this, stand-alone examples of setups
are provided in the ``examples/`` directory.

For development, documentation is written and maintained using code
comments and runnable examples, following the ``godoc`` approach. Some
architecture comments are kept in ``docs/``, but by and large,
documentation should be consumed from godoc.

See https://godoc.org/github.com/telenornms/skogul for the online
version, or use ``go doc github.com/telenornms/skogul`` or similar,
as you would any other go package.

Examples are part of the test suite and thus extracted from ``*_test.go``
where applicable.

Roadmap
-------

We are doing frequent releases on github, with an ambition of reaching a
1.0 version within some reasonable time frame, I'm guessing 2020. It
doesn't really mean much.

Short term work is defined in milestones on github.

Overall, the core modules and the scaffolding are getting pretty good. The
new config engine is still receiving periodic updates, but the actual
configuration hasn't changed much.

Future work to get us to 1.0 will be rounding out the new logrus-based
logging by both rewriting the log receiver and overhauling each module to
make our approach to logging consistent across all modules.

Similarly, test cases need to be refreshed. Tests are written in a very
isolated fashion, and a good bit of spaghetti-logic has arisen. We have
decent coverage, but it's getting trickier to scale test case writing.

Other than that, there are modules to be written and features to be added
which are mostly a matter of what needs arise.

Documentation

Overview

Package skogul is a framework for receiving, processing and forwarding data, typically metric data or event-oriented data, at high throughput.

It is designed to be as agnostic as possible with regard to how it transmits data and how it receives it, and the processors in between need not worry about how the data got there or how it will be treated in the next chain.

This means you can use Skogul to receive data on an influxdb-like line-based TCP interface and send it on to postgres - or influxdb - without having to write explicit support, just set up the chain.

The guiding principles of Skogul are:

- Make as few assumptions as possible about how data is received

- Be stupid fast

End users should only need to worry about the cmd/skogul tool, which comes fully equipped with self-contained documentation.

Adding new logic to Skogul should also be fairly easy. New developers should focus on understanding two things:

1. The skogul.Container data structure - which is the heart of Skogul.

2. The relationship from receiver to handler to sender.

The Container is documented in this very package.

Receivers are where data originates within Skogul. The typical Receiver will receive data from the outside world, e.g. by other tools posting data to an HTTP endpoint. Receivers can also be used to "create" data, either test data or, for example, log data. When skogul starts, it will start all receivers that are configured.

Handlers determine what is done with the data once received. They are responsible for parsing raw data and optionally transforming it. This is the only place where data may be _modified_. Today, the only transformer is the "templater", which allows a collection of metrics which share certain attributes (e.g.: all collected at the same time and from the same machine) to provide these shared attributes in a template which the "templater" transformer then applies to all metrics.

Other examples of transformations that make sense are:

- Adding a metadata field

- Removing a metadata field

- Removing all but a specific set of fields

- Converting nested metrics to multiple metrics or flatten them

Once a handler has done its deed, it sends the Container to the sender, and this is where "the fun begins" so to speak.

Senders consist of just a data structure that implements the Send() interface. They are not allowed to change the container, but besides that, they can do "whatever". The most obvious example is to send the container to a suitable storage system - e.g., a time series database.

So if you want to add support for a new time series database in Skogul, you will write a sender.

In addition to that, many senders serve only to add internal logic and pass data on to other senders. Each sender should only do one specific thing. For example, if you want to write data both to InfluxDB and MySQL, you need three senders: The "MySQL" and "InfluxDB" senders, and the "dupe" sender, which just takes a list of other senders and sends whatever it receives on to all of them.
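The fan-out idea can be sketched in a few lines. This is a minimal illustration, not the actual "dupe" sender: the Container, Metric and Sender types are reduced local stand-ins for the skogul types, and countSender is a hypothetical backend that only counts metrics.

```go
package main

import "fmt"

// Metric and Container are reduced local stand-ins for the skogul types.
type Metric struct {
	Data map[string]interface{}
}

type Container struct {
	Metrics []*Metric
}

// Sender mirrors the shape of the documented Sender interface.
type Sender interface {
	Send(c *Container) error
}

// countSender is a hypothetical storage backend that only counts metrics.
type countSender struct {
	name  string
	count int
}

func (s *countSender) Send(c *Container) error {
	s.count += len(c.Metrics)
	return nil
}

// Dupe passes every container, unmodified, to each sender in Next.
type Dupe struct {
	Next []Sender
}

// Send tries all senders and reports the first error, if any.
// (How errors are combined is an assumption of this sketch.)
func (d *Dupe) Send(c *Container) error {
	var first error
	for _, s := range d.Next {
		if err := s.Send(c); err != nil && first == nil {
			first = err
		}
	}
	return first
}

func main() {
	influx := &countSender{name: "influx"}
	mysql := &countSender{name: "mysql"}
	dupe := &Dupe{Next: []Sender{influx, mysql}}

	c := &Container{Metrics: []*Metric{{Data: map[string]interface{}{"x": 1}}}}
	if err := dupe.Send(c); err != nil {
		fmt.Println("send failed:", err)
	}
	fmt.Println(influx.name, influx.count, mysql.name, mysql.count)
}
```

Because Dupe itself satisfies Sender, it can be placed anywhere a storage sender could, which is what lets small single-purpose senders be chained.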

Today, Senders and Receivers both have an identical "Auto"-system, found in auto.go of the relevant directories. This is how the individual implementations are made discoverable to the configuration system, and how documentation is provided. Documentation for the settings of a sender/receiver is handled as struct tags.

Once more parsers/transformers are added, they will likely also use a similar system.

Index

Constants

This section is empty.

Variables

var AssertErrors int

AssertErrors counts the number of assert errors

var HandlerMap []*HandlerRef

HandlerMap keeps track of which named handlers exist. A configuration engine needs to iterate over this and back-fill the real handlers.

var SenderMap []*SenderRef

SenderMap is a list of all referenced senders. This is used during configuration loading and should not be used afterwards. However, it needs to be exported so skogul.config can reach it, and it needs to be outside of skogul.config to avoid circular dependencies.

var TransformerMap []*TransformerRef

TransformerMap keeps track of the named transformers.

Functions

func Assert

func Assert(x bool, v ...interface{})

Assert panics if x is false, useful mainly for doing error-checking for "impossible scenarios" we can't really handle anyway.

Keep in mind that net/http steals panics, but you can check AssertErrors, which is incremented with each assert error encountered.
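The interplay between the panic and the counter can be illustrated with a small stand-alone sketch. It re-declares Assert and AssertErrors locally; the panic message format is an assumption, and the anonymous function with recover() only imitates a server loop that swallows panics.

```go
package main

import "fmt"

// AssertErrors mirrors the documented counter of failed assertions.
var AssertErrors int

// Assert is a sketch of the documented behavior: count the failure, then
// panic. The message format is an assumption of this example.
func Assert(x bool, v ...interface{}) {
	if x {
		return
	}
	AssertErrors++
	panic(fmt.Sprint(v...))
}

func main() {
	values := []int{}

	// Imitate a caller that recovers from panics, hiding the failure.
	func() {
		defer func() { _ = recover() }()
		Assert(len(values) > 0, "no values to process")
	}()

	// The counter still reveals that an assertion failed.
	fmt.Println("assert errors seen:", AssertErrors)
}
```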

func ConfigureLogger added in v0.2.0

func ConfigureLogger(requestedLoglevel string, logtimestamp bool)

ConfigureLogger sets up the logger based on calling parameters

func ExtractNestedObject

func ExtractNestedObject(object map[string]interface{}, keys []string) (map[string]interface{}, error)

ExtractNestedObject extracts an object from a nested object structure. All intermediate objects have to be map[string]interface{}.
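A minimal sketch of this kind of traversal follows. The helper name extractNested is hypothetical, and the exact semantics are an assumption: here every key in the list is followed and the object found at the end is returned, which may differ from what ExtractNestedObject does with the final key.

```go
package main

import "fmt"

// extractNested follows keys one level at a time. Every value on the path
// must itself be a map[string]interface{}, otherwise an error is returned.
// This is an illustrative helper, not the skogul implementation.
func extractNested(obj map[string]interface{}, keys []string) (map[string]interface{}, error) {
	cur := obj
	for _, k := range keys {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("key %q is missing or not an object", k)
		}
		cur = next
	}
	return cur, nil
}

func main() {
	data := map[string]interface{}{
		"ports": map[string]interface{}{
			"ge-0/0/0": map[string]interface{}{"ifHCInOctets": 5},
		},
	}

	port, err := extractNested(data, []string{"ports", "ge-0/0/0"})
	fmt.Println(port, err)

	_, err = extractNested(data, []string{"ports", "missing"})
	fmt.Println(err)
}
```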

func GetLogLevelFromString added in v0.3.0

func GetLogLevelFromString(requestedLevel string) logrus.Level

GetLogLevelFromString returns the matching logrus.Level from a string

func Logger added in v0.2.0

func Logger(category, implementation string) *logrus.Entry

Logger returns a logrus.Entry pre-populated with standard Skogul fields. category is the typical family of the code/module: sender/receiver/parser/transformer/core, while implementation is the local implementation (http, json, protobuf, udp, etc).

Types

type Container

type Container struct {
	Template *Metric   `json:"template,omitempty"`
	Metrics  []*Metric `json:"metrics"`
}

Container is the top-level object for one or more metrics.

If a Template is provided, it will be used as the initial value for each of the metrics - this is expanded by the transformers.Template transformers, and internal code does not need to worry about this.

A single Container instance is typically the result of a single POST to the HTTP receiver or similar.

func (Container) String

func (c Container) String() string

func (*Container) Validate

func (c *Container) Validate() error

Validate checks the validity of the container, verifying that it follows the expected spec. It should be called by any HTTP receiver after accepting a Container from an external source. It is NOT required nor recommended to use Validate in senders - the data is already validated by that time.
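To make the role of Template concrete, here is a rough sketch of template expansion. It is not the template transformer itself: the types are reduced local copies (Time is left out), expandTemplate and fill are hypothetical helpers, and the rule that a metric's own values win over the template is an assumption.

```go
package main

import "fmt"

// Metric and Container are reduced local copies of the documented types.
type Metric struct {
	Metadata map[string]interface{}
	Data     map[string]interface{}
}

type Container struct {
	Template *Metric
	Metrics  []*Metric
}

// fill copies entries from src into dst without overwriting existing keys.
func fill(dst, src map[string]interface{}) map[string]interface{} {
	if dst == nil && len(src) > 0 {
		dst = make(map[string]interface{}, len(src))
	}
	for k, v := range src {
		if _, set := dst[k]; !set {
			dst[k] = v
		}
	}
	return dst
}

// expandTemplate applies the shared template fields to every metric and
// then drops the template, so later stages only see complete metrics.
func expandTemplate(c *Container) {
	if c.Template == nil {
		return
	}
	for _, m := range c.Metrics {
		m.Metadata = fill(m.Metadata, c.Template.Metadata)
		m.Data = fill(m.Data, c.Template.Data)
	}
	c.Template = nil
}

func main() {
	c := &Container{
		Template: &Metric{Metadata: map[string]interface{}{"device": "routera"}},
		Metrics: []*Metric{
			{Data: map[string]interface{}{"uptime": 124125124}},
			{Data: map[string]interface{}{"cputemp": 22}},
		},
	}
	expandTemplate(c)
	for _, m := range c.Metrics {
		fmt.Println(m.Metadata, m.Data)
	}
}
```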

type Duration

type Duration struct {
	time.Duration
}

Duration provides a wrapper around time.Duration to add JSON marshalling and unmarshalling. It is used whenever time.Duration needs to be exposed in configuration.

func (Duration) MarshalJSON

func (d Duration) MarshalJSON() ([]byte, error)

MarshalJSON provides JSON marshalling for Duration

func (*Duration) UnmarshalJSON

func (d *Duration) UnmarshalJSON(b []byte) error

UnmarshalJSON provides JSON unmarshalling for Duration
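One common way to build such a wrapper is shown below. The struct matches the documented type, but the method bodies are an assumption: this sketch uses the string form understood by time.ParseDuration (e.g. "1m30s"), and the real implementation may accept other encodings.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Duration wraps time.Duration so it can appear in JSON configuration.
type Duration struct {
	time.Duration
}

// MarshalJSON writes the duration as a quoted string such as "1m30s".
func (d Duration) MarshalJSON() ([]byte, error) {
	return json.Marshal(d.String())
}

// UnmarshalJSON reads a quoted string and parses it with time.ParseDuration.
func (d *Duration) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	parsed, err := time.ParseDuration(s)
	if err != nil {
		return err
	}
	d.Duration = parsed
	return nil
}

func main() {
	// A hypothetical configuration fragment with a single interval setting.
	var cfg struct {
		Interval Duration `json:"interval"`
	}
	if err := json.Unmarshal([]byte(`{"interval":"1m30s"}`), &cfg); err != nil {
		fmt.Println("bad config:", err)
		return
	}
	out, _ := json.Marshal(cfg)
	fmt.Println(cfg.Interval.Seconds(), string(out))
}
```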

type Error

type Error struct {
	Reason string
	Source string
	Next   error
}

Error is a typical skogul error. All Skogul functions should provide Source and Reason. I'm not entirely sure why, except that it allows chaining errors?

If the Next field is provided, error messages will recurse to the bottom, thus propagating errors from the bottom and up.

func (Error) Container

func (e Error) Container() Container

Container returns a skogul container representing the error

func (Error) Error

func (e Error) Error() string

Error for use in regular error messages. Also outputs to log.Print(). Will also include e.Next, if present.
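The chaining through Next can be sketched as follows. The struct is copied from the documentation above, while the message layout ("source: reason: next") is an assumption, and the logging side effect mentioned above is deliberately left out.

```go
package main

import "fmt"

// Error is a local copy of the documented struct.
type Error struct {
	Reason string
	Source string
	Next   error
}

// Error renders this error and then recurses into Next, so the innermost
// cause ends up at the end of the message. The layout is an assumption.
func (e Error) Error() string {
	msg := e.Source + ": " + e.Reason
	if e.Next != nil {
		msg += ": " + e.Next.Error()
	}
	return msg
}

func main() {
	// Hypothetical sources and reasons, for illustration only.
	inner := Error{Source: "sql-sender", Reason: "connection refused"}
	outer := Error{Source: "dupe-sender", Reason: "send failed", Next: inner}
	fmt.Println(outer)
}
```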

type Handler

type Handler struct {
	Parser       Parser
	Transformers []Transformer
	Sender       Sender
}

Handler determines what a receiver will do with data received. It requires a parser to interpret the raw data, 0 or more transformers to mutate Containers, and a sender to call after data is parsed and mutated and ready to be dealt with.

Whenever a new Container is created, it should pass that to a Handler, not directly to a Sender. This goes for artificially created data too, e.g. if a sender wants to emit statistics. This ensures that transformers can be used in the future.

To make it configurable, a HandlerRef should be used.

func (*Handler) Handle

func (h *Handler) Handle(b []byte) error

Handle parses the byte array using the configured parser, issues transformers and sends the data off.

func (*Handler) Transform

func (h *Handler) Transform(c *Container) error

Transform runs all available transformers in order and enforces any rules.

func (*Handler) TransformAndSend

func (h *Handler) TransformAndSend(c *Container) error

TransformAndSend transforms the already parsed container and sends the data off.

func (Handler) Verify

func (h Handler) Verify() error

Verify the basic integrity of a handler. Quite shallow.
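The parse, transform, send sequence can be sketched with local stand-ins. The interfaces and the Handler struct follow the declarations in this document, but the body of Handle is an assumption, and jsonParser, addMeta and collect are hypothetical implementations written only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Metric struct {
	Metadata map[string]interface{} `json:"metadata,omitempty"`
	Data     map[string]interface{} `json:"data,omitempty"`
}

type Container struct {
	Metrics []*Metric `json:"metrics"`
}

type Parser interface {
	Parse(data []byte) (*Container, error)
}

type Transformer interface {
	Transform(c *Container) error
}

type Sender interface {
	Send(c *Container) error
}

type Handler struct {
	Parser       Parser
	Transformers []Transformer
	Sender       Sender
}

// Handle parses raw bytes, runs each transformer in order, then sends.
func (h *Handler) Handle(b []byte) error {
	c, err := h.Parser.Parse(b)
	if err != nil {
		return err
	}
	for _, t := range h.Transformers {
		if err := t.Transform(c); err != nil {
			return err
		}
	}
	return h.Sender.Send(c)
}

// jsonParser decodes a JSON-encoded container (hypothetical).
type jsonParser struct{}

func (jsonParser) Parse(data []byte) (*Container, error) {
	c := &Container{}
	if err := json.Unmarshal(data, c); err != nil {
		return nil, err
	}
	return c, nil
}

// addMeta sets one metadata field on every metric (hypothetical).
type addMeta struct{ key, value string }

func (a addMeta) Transform(c *Container) error {
	for _, m := range c.Metrics {
		if m.Metadata == nil {
			m.Metadata = map[string]interface{}{}
		}
		m.Metadata[a.key] = a.value
	}
	return nil
}

// collect keeps every container it is given (hypothetical).
type collect struct{ got []*Container }

func (s *collect) Send(c *Container) error {
	s.got = append(s.got, c)
	return nil
}

func main() {
	out := &collect{}
	h := &Handler{
		Parser:       jsonParser{},
		Transformers: []Transformer{addMeta{key: "site", value: "lab"}},
		Sender:       out,
	}
	err := h.Handle([]byte(`{"metrics":[{"data":{"uptime":42}}]}`))
	fmt.Println(err, len(out.got), out.got[0].Metrics[0].Metadata)
}
```

A receiver in this model only needs to hand raw bytes to Handle; it never has to know which parser or sender is configured.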

type HandlerRef

type HandlerRef struct {
	H    *Handler
	Name string
}

HandlerRef references a named handler. Used whenever a handler is defined by configuration.

func (*HandlerRef) MarshalJSON

func (hr *HandlerRef) MarshalJSON() ([]byte, error)

MarshalJSON just returns the Name of the handler reference.

func (*HandlerRef) UnmarshalJSON

func (hr *HandlerRef) UnmarshalJSON(b []byte) error

UnmarshalJSON will create an entry on the HandlerMap for the parsed handler reference, so the real handler can be substituted later.

type LoggerCopyHook added in v0.3.0

type LoggerCopyHook struct {
	Writer *logrus.Logger
}

LoggerCopyHook is simply a wrapper around a logrus logger

func (*LoggerCopyHook) Fire added in v0.3.0

func (l *LoggerCopyHook) Fire(entry *logrus.Entry) error

Fire logs the log entry onto a copied logger to stdout

func (*LoggerCopyHook) Levels added in v0.3.0

func (l *LoggerCopyHook) Levels() []logrus.Level

Levels returns the levels this hook will support

type Metric

type Metric struct {
	Time     *time.Time             `json:"timestamp,omitempty"`
	Metadata map[string]interface{} `json:"metadata,omitempty"`
	Data     map[string]interface{} `json:"data,omitempty"`
}

Metric is a collection of measurements related to the same metadata and point in time.

A good example of a single metric is port statistics for a single interface on a router.

Both Metadata and Data are provided. The difference is generally in how data is accessed. Metadata is data you will search for - e.g. the name of the router, the name of the interface, etc. It is also possible to add more dynamic data as metadata, such as OS versions, but exactly how this will be handled will be up to underlying storage engines. E.g.: for influxdb, anything in metadata will be an (indexed) tag, so having reasonably rich metadata is perfectly fine, but you may want to keep an eye on the granularity.

A simple rule of thumb:

Metadata is what you search with.

Data is what you search for.

Example:

{
	"timestamp": "2019-03-25T12:00:00Z",
	"metadata": {
		"device": "routera",
		"os": "JUNOS 15.4R1",
		"chassisId": "something"
	},
	"data": {
		"uptime": 124125124,
		"cputemp": 22
	}
}

It is possible to have nested data, however, it is NOT a requirement that a sender accepts this. And in general, it is better to "flatten" out data into multiple metrics instead. This can be done with a (custom) transformer.

Example of a nested structure:

{
	"timestamp": "2019-03-25T12:00:00Z",
	"metadata": {
		"device": "routera",
		"os": "JUNOS 15.4R1",
		"chassisId": "something"
	},
	"data": {
		"ports": {
			"ge-0/0/0": {
				"ifHCInOctets":  5,
				"ifHCOutOctets": 10
			},
			"ge-0/0/1": {
				"ifHCInOctets":  2,
				"ifHCOutOctets": 20
			}
		}
	}
}

This is legal, but it's probably wise to use a transformer to change it into:

{
	"timestamp": "2019-03-25T12:00:00Z",
	"metadata": {
		"device": "routera",
		"os": "JUNOS 15.4R1",
		"chassisId": "something",
		"port": "ge-0/0/0"
	},
	"data": {
		"ifHCInOctets":  5,
		"ifHCOutOctets": 10
	}
},
{
	"timestamp": "2019-03-25T12:00:00Z",
	"metadata": {
		"device": "routera",
		"os": "JUNOS 15.4R1",
		"chassisId": "something",
		"port": "ge-0/0/1"
	},
	"data": {
		"ifHCInOctets":  2,
		"ifHCOutOctets": 20
	}
}
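A custom transformer for the flattening described above could look roughly like this. It is a sketch under several assumptions: the types are reduced local copies, flattenPorts is a hypothetical name, the nested key is hard-coded to "ports", the new metadata field is called "port", and entries that are not objects are silently skipped.

```go
package main

import (
	"fmt"
	"sort"
)

// Metric and Container are reduced local copies of the documented types.
type Metric struct {
	Metadata map[string]interface{}
	Data     map[string]interface{}
}

type Container struct {
	Metrics []*Metric
}

// flattenPorts replaces each metric that has a nested "ports" object with
// one metric per port, moving the port name into the metadata.
func flattenPorts(c *Container) {
	var out []*Metric
	for _, m := range c.Metrics {
		ports, ok := m.Data["ports"].(map[string]interface{})
		if !ok {
			out = append(out, m) // nothing to flatten, keep as-is
			continue
		}
		names := make([]string, 0, len(ports))
		for name := range ports {
			names = append(names, name)
		}
		sort.Strings(names) // deterministic output order
		for _, name := range names {
			fields, ok := ports[name].(map[string]interface{})
			if !ok {
				continue // not an object, skipped in this sketch
			}
			meta := make(map[string]interface{}, len(m.Metadata)+1)
			for k, v := range m.Metadata {
				meta[k] = v
			}
			meta["port"] = name
			out = append(out, &Metric{Metadata: meta, Data: fields})
		}
	}
	c.Metrics = out
}

func main() {
	c := &Container{Metrics: []*Metric{{
		Metadata: map[string]interface{}{"device": "routera"},
		Data: map[string]interface{}{"ports": map[string]interface{}{
			"ge-0/0/0": map[string]interface{}{"ifHCInOctets": 5, "ifHCOutOctets": 10},
			"ge-0/0/1": map[string]interface{}{"ifHCInOctets": 2, "ifHCOutOctets": 20},
		}},
	}}}
	flattenPorts(c)
	for _, m := range c.Metrics {
		fmt.Println(m.Metadata, m.Data)
	}
}
```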

type Module added in v0.3.0

type Module struct {
	Name    string             // short name of the module (e.g: "http")
	Aliases []string           // optional aliases (e.g. "https")
	Alloc   func() interface{} // allocation of a blank module structure
	Extras  []interface{}      // Optional additional custom data structures that should be exposed in documentation.
	Help    string             // Human-readable help description.
}

Module is metadata for a skogul module. It is used by the receiver, sender and transformer packages. The Alloc() function must return a data structure that implements the relevant module interface, which is checked primarily in config.Parse.

See */auto.go for how to utilize this, and config/help.go, cmd/skogul/main.go for how to extract information/help, and ultimately config/parse.go for how it is applied.

type ModuleMap added in v0.3.0

type ModuleMap map[string]*Module

ModuleMap maps a name of a module to the Module data structure. Each type of module has its own module map. E.g.: receiver.Auto, sender.Auto and transformer.Auto.

func (*ModuleMap) Add added in v0.3.0

func (amap *ModuleMap) Add(item Module) error

Add adds a module to a module map, ensuring basic sanity and announcing it to the world, so to speak.
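A registry of this shape can be sketched as below. The Module struct is a trimmed copy of the one documented above, while the specific sanity checks (non-empty name, non-nil Alloc, no duplicate names or aliases) and the hypothetical httpModule type are assumptions about what "basic sanity" might include.

```go
package main

import (
	"errors"
	"fmt"
)

// Module is a trimmed local copy of the documented struct.
type Module struct {
	Name    string
	Aliases []string
	Alloc   func() interface{}
	Help    string
}

type ModuleMap map[string]*Module

// Add registers item under its name and all aliases. The checks performed
// here are assumptions of this sketch.
func (amap *ModuleMap) Add(item Module) error {
	if *amap == nil {
		*amap = make(ModuleMap)
	}
	if item.Name == "" || item.Alloc == nil {
		return errors.New("module needs both a name and an Alloc function")
	}
	names := append([]string{item.Name}, item.Aliases...)
	for _, n := range names {
		if _, exists := (*amap)[n]; exists {
			return fmt.Errorf("module %q is already registered", n)
		}
	}
	for _, n := range names {
		(*amap)[n] = &item
	}
	return nil
}

// httpModule is a hypothetical module configuration structure.
type httpModule struct {
	URL string
}

func main() {
	var auto ModuleMap
	err := auto.Add(Module{
		Name:    "http",
		Aliases: []string{"https"},
		Alloc:   func() interface{} { return &httpModule{} },
		Help:    "Sends data over HTTP.",
	})
	fmt.Println(err, len(auto))

	// Look a module up by alias and allocate a blank instance of it.
	blank := auto["https"].Alloc()
	fmt.Printf("%T\n", blank)
}
```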

type Parser

type Parser interface {
	Parse(data []byte) (*Container, error)
}

Parser is the interface for parsing arbitrary data into a Container

type Receiver

type Receiver interface {
	Start() error
}

Receiver is how we get data. Receivers are responsible for getting raw data and the outer boundaries of a Container, but should explicitly avoid parsing raw data. This ensures that how data is transported is not bound by how it is parsed.

type Sender

type Sender interface {
	Send(c *Container) error
}

Sender accepts data through Send() - and "sends it off". The canonical sender is one that implements a storage backend or outgoing API. E.g.: accept data, send to influx.

Senders are not allowed to modify the Container - there could be multiple goroutines running with the same Container. If modification is required, the Sender needs to take a copy.

A sender should assume that the container has been validated, and is non-null. Slightly counter to common sense, it is NOT recommended to verify the input data again, since multiple senders are likely chained and will thus likely redo the same verifications.

Senders that pass data off to other senders should use a SenderRef instead, to facilitate configuration.

type SenderRef

type SenderRef struct {
	S    Sender
	Name string
}

SenderRef is a reference to a named sender. This is required to allow references to be resolved after all senders are loaded. Wherever a Sender is loaded from configuration, a SenderRef should be used in its place. The maintenance of the sender is handled in the configuration system.

func (*SenderRef) MarshalJSON

func (sr *SenderRef) MarshalJSON() ([]byte, error)

MarshalJSON for a reference just prints the name

func (*SenderRef) UnmarshalJSON

func (sr *SenderRef) UnmarshalJSON(b []byte) error

UnmarshalJSON will unmarshal a sender reference by creating a SenderRef object and putting it on the SenderMap list. The configuration system in question needs to iterate over SenderMap after it has completed the first pass of configuration.
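The two-pass idea (record names first, resolve them once all senders exist) can be sketched as follows. SenderRef and SenderMap are local copies of the documented declarations; the resolve helper, the discard sender and the configuration fragment are hypothetical and only illustrate the second pass.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Container struct{}

type Sender interface {
	Send(c *Container) error
}

// SenderRef and SenderMap are local copies of the documented declarations.
type SenderRef struct {
	S    Sender
	Name string
}

var SenderMap []*SenderRef

// UnmarshalJSON records only the name and remembers the reference so it
// can be resolved after the whole configuration has been read.
func (sr *SenderRef) UnmarshalJSON(b []byte) error {
	if err := json.Unmarshal(b, &sr.Name); err != nil {
		return err
	}
	SenderMap = append(SenderMap, sr)
	return nil
}

// resolve is the hypothetical second pass: fill in every recorded
// reference from the set of senders that were actually configured.
func resolve(known map[string]Sender) error {
	for _, ref := range SenderMap {
		s, ok := known[ref.Name]
		if !ok {
			return fmt.Errorf("reference to unknown sender %q", ref.Name)
		}
		ref.S = s
	}
	return nil
}

// discard is a hypothetical sender that drops everything.
type discard struct{}

func (discard) Send(c *Container) error { return nil }

func main() {
	// A hypothetical config fragment for a sender that forwards to others.
	var cfg struct {
		Next []*SenderRef `json:"next"`
	}
	if err := json.Unmarshal([]byte(`{"next":["influx","mysql"]}`), &cfg); err != nil {
		fmt.Println("bad config:", err)
		return
	}
	known := map[string]Sender{"influx": discard{}, "mysql": discard{}}
	if err := resolve(known); err != nil {
		fmt.Println("resolve failed:", err)
		return
	}
	for _, ref := range cfg.Next {
		fmt.Println(ref.Name, ref.S != nil)
	}
}
```

Deferring resolution this way means the order in which senders appear in the configuration does not matter.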

type Transformer

type Transformer interface {
	Transform(c *Container) error
}

Transformer mutates a collection before it is passed to a sender. Transformers should be very fast, but are the only means of modifying the data.

type TransformerRef added in v0.3.0

type TransformerRef struct {
	T    Transformer
	Name string
}

TransformerRef is a string mapping to a Transformer. It is used during configuration/transformer setup.

func (*TransformerRef) MarshalJSON added in v0.3.0

func (tr *TransformerRef) MarshalJSON() ([]byte, error)

MarshalJSON just returns the Name of the transformer reference.

func (*TransformerRef) UnmarshalJSON added in v0.3.0

func (tr *TransformerRef) UnmarshalJSON(b []byte) error

UnmarshalJSON will create an entry on the TransformerMap for the parsed transformer reference, so the real transformer can be substituted later.

type Verifier

type Verifier interface {
	Verify() error
}

Verifier is an *optional* interface for senders and receivers. If implemented, the configuration engine will issue Verify() after all configuration is parsed. The sender/receiver should never modify state upon Verify(), but should simply check that internal state is usable.
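An optional interface like this is typically probed with a type assertion. The sketch below assumes a hypothetical verifyAll helper and two hypothetical module types; it does not show how the configuration engine actually performs the check.

```go
package main

import (
	"errors"
	"fmt"
)

// Verifier is a local copy of the documented interface.
type Verifier interface {
	Verify() error
}

// urlSender is a hypothetical module that can check its own settings.
type urlSender struct {
	URL string
}

// Verify only inspects state; it does not modify anything.
func (s *urlSender) Verify() error {
	if s.URL == "" {
		return errors.New("missing URL")
	}
	return nil
}

// plainSender is a hypothetical module that does not implement Verifier.
type plainSender struct{}

// verifyAll calls Verify on every module that implements Verifier and
// skips the rest, since the interface is optional.
func verifyAll(modules map[string]interface{}) error {
	for name, m := range modules {
		v, ok := m.(Verifier)
		if !ok {
			continue
		}
		if err := v.Verify(); err != nil {
			return fmt.Errorf("module %q: %w", name, err)
		}
	}
	return nil
}

func main() {
	modules := map[string]interface{}{
		"good":  &urlSender{URL: "http://localhost:8080/"},
		"plain": plainSender{},
	}
	fmt.Println(verifyAll(modules))

	modules["bad"] = &urlSender{}
	fmt.Println(verifyAll(modules))
}
```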

Directories

Path            Synopsis
cmd/skogul      cmd/skogul parses a json-based config file and starts skogul.
config          Package config handles Skogul configuration parsing.
gen             Package gen consists of auto-generated protobuf code for the junos streaming telemetry interface.
internal/mqtt   Package mqtt provides a bit of glue common between Skogul's MQTT sender and receiver.
parser          Package parser is responsible for interpreting raw byte data into Containers.
receiver        Package receiver provides various skogul Receivers that accept data and execute a handler.
sender          Package sender is a set of types that implement skogul.Sender.
transformer     Package transformer provides the means to mutate a container as part of a skogul.Handler, before it is passed on to a Sender.
