ftdc

package module
v0.0.0-...-79636ce Latest Latest
Warning

This package is not in the latest version of its module.

Published: Aug 25, 2023 License: Apache-2.0 Imports: 25 Imported by: 1

README

======================================================
``ftdc`` -- Golang FTDC Parsing and Generating Library
======================================================

Overview
--------

FTDC, originally short for *full time diagnostic data capture*, is MongoDB's
internal diagnostic data collection facility. It encodes data in a
space-efficient format, which allows MongoDB to record diagnostic information
every second, and store weeks of data with only a few hundred megabytes of
storage.

The FTDC data format is based on BSON, and provides ways to store series of
documents with the same schema/structure in a compressed columnar format.

This library provides a fully-featured and easy to use toolkit for
interacting with data stored in this format in Go programs. The library
itself originated as a `project by 2016 Summer interns at MongoDB
<https://github.com/10gen/ftdc-utils>`_ but has diverged substantially
since then, and adds features for generating data in this format.

Use
---

All documentation is in the `godoc
<https://godoc.org/github.com/tychoish/birch/x/ftdc>`_.

Features
--------

This library supports parsing of the FTDC data format and
several ways of iterating these results. Additionally, it provides the
ability to create FTDC payloads, and is the only extant (?) tool for
generating FTDC data outside of the MongoDB code base.

The library includes tools for generating FTDC payloads and document
streams as well as iterators and tools for accessing data from FTDC
files. All functionality is part of the ``ftdc`` package, and the API
is fully documented.

The ``events`` and ``metrics`` sub-packages provide higher level functionality
for collecting data from performance tests (events), and more generalized
system metrics collection (metrics).

Development
-----------

This project emerged from work at MongoDB. This fork drops support for
older versions of Golang, thereby adding support for modules.

Pull requests are welcome. Feel free to create issues with enhancements or
bugs.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertFromCSV

func ConvertFromCSV(ctx context.Context, bucketSize int, input io.Reader, output io.Writer) error

ConvertFromCSV takes an input stream and writes ftdc compressed data to the provided output writer.

If the number of fields changes between rows of the CSV input, the first row with the changed number of fields becomes the header for the subsequent documents in the stream.
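The header-switching rule described above can be illustrated with the standard library alone. The following is a minimal sketch, not the package's implementation: the `segment` type and the `splitOnWidthChange` and `splitString` helpers are hypothetical names introduced only for this example.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// segment is a hypothetical type: one header row plus the data rows
// that have the same number of fields as that header.
type segment struct {
	Header []string
	Rows   [][]string
}

// splitOnWidthChange groups CSV records into segments. The first record,
// and any record whose field count differs from the current header,
// starts a new segment and is treated as that segment's header.
func splitOnWidthChange(r io.Reader) ([]segment, error) {
	reader := csv.NewReader(r)
	reader.FieldsPerRecord = -1 // allow the field count to vary between records

	var out []segment
	for {
		record, err := reader.Read()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		if len(out) == 0 || len(record) != len(out[len(out)-1].Header) {
			out = append(out, segment{Header: record})
			continue
		}
		last := &out[len(out)-1]
		last.Rows = append(last.Rows, record)
	}
}

// splitString is a convenience wrapper for in-memory input.
func splitString(input string) ([]segment, error) {
	return splitOnWidthChange(strings.NewReader(input))
}

func main() {
	segments, err := splitString("a,b\n1,2\n3,4\nx,y,z\n5,6,7\n")
	if err != nil {
		panic(err)
	}
	for _, s := range segments {
		fmt.Println(s.Header, len(s.Rows))
	}
}
```

In this sketch each segment would correspond to one run of documents sharing a schema; how the real function maps those runs onto FTDC chunks is not shown here.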

func DumpCSV

func DumpCSV(ctx context.Context, iter *fun.Iterator[*Chunk], prefix string) error

DumpCSV writes a sequence of chunks to CSV files, creating new files if the iterator detects a schema change, using only the number of fields in the chunk to detect schema changes. DumpCSV writes a header row to each file.

The file names are constructed as "prefix.<count>.csv".

func FlushCollector

func FlushCollector(c Collector, writer io.Writer) error

FlushCollector writes the contents of a collector out to an io.Writer. This is useful in the context of any collector, but is particularly useful in the context of streaming collectors, which flush data periodically and may have cached data.

func NewWriterCollector

func NewWriterCollector(chunkSize int, writer io.WriteCloser) io.WriteCloser

func ReadChunks

func ReadChunks(r io.Reader) *fun.Iterator[*Chunk]

ReadChunks creates a ChunkIterator from an underlying FTDC data source.

func WriteCSV

func WriteCSV(ctx context.Context, iter *fun.Iterator[*Chunk], writer io.Writer) error

WriteCSV exports the contents of a stream of chunks as CSV. Returns an error if the number of metrics changes between points, or if there are any errors writing data.

Types

type Chunk

type Chunk struct {
	Metrics []Metric
	// contains filtered or unexported fields
}

Chunk represents a 'metric chunk' of data in the FTDC.

func (*Chunk) GetMetadata

func (c *Chunk) GetMetadata() *birch.Document

func (*Chunk) Iterator

func (c *Chunk) Iterator(ctx context.Context) *Iterator

Iterator returns an iterator that you can use to read documents for each sample period in the chunk. Documents are returned in collection order, with keys flattened and dot-separated fully qualified paths.

The documents are constructed from the metrics data lazily.
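To make the "flattened, dot-separated" key format concrete, here is a small stand-alone sketch. It does not use `birch.Document`; the `field` type and `flatten` function are hypothetical and exist only to show how nested keys could collapse into fully qualified paths while preserving field order.

```go
package main

import (
	"fmt"
	"strings"
)

// field is a hypothetical ordered key/value pair. A Value that is
// itself a []field represents a nested document.
type field struct {
	Key   string
	Value any
}

// flatten walks the fields in order and emits every leaf value under
// a dot-separated path built from its parent keys.
func flatten(prefix []string, doc []field) []field {
	var out []field
	for _, f := range doc {
		// Copy the prefix so sibling fields never share a backing array.
		path := append(append([]string{}, prefix...), f.Key)
		if nested, ok := f.Value.([]field); ok {
			out = append(out, flatten(path, nested)...)
			continue
		}
		out = append(out, field{Key: strings.Join(path, "."), Value: f.Value})
	}
	return out
}

func main() {
	doc := []field{
		{Key: "ts", Value: int64(1000)},
		{Key: "mem", Value: []field{
			{Key: "resident", Value: int64(64)},
			{Key: "virtual", Value: int64(256)},
		}},
	}
	for _, f := range flatten(nil, doc) {
		fmt.Printf("%s=%v\n", f.Key, f.Value)
	}
}
```

An ordered slice is used instead of a map because field order matters for this data: samples in a series are expected to have identical structure, including field order.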

func (*Chunk) Len

func (c *Chunk) Len() int

func (*Chunk) Size

func (c *Chunk) Size() int

func (*Chunk) StructuredIterator

func (c *Chunk) StructuredIterator(ctx context.Context) *Iterator

StructuredIterator returns the contents of the chunk as a sequence of documents that (mostly) resemble the original source documents (with the non-metrics fields omitted.) The output documents mirror the structure of the input documents.

type ChunkIterator

type ChunkIterator struct {
	*fun.Iterator[*Chunk]
	// contains filtered or unexported fields
}

ChunkIterator is a simple iterator for reading off of an FTDC data source (e.g. file). The iterator processes chunks (batches of metrics) lazily, reading from the io.Reader every time the iterator is advanced.

Use the iterator as follows:

iter := ReadChunks(file)

for iter.Next(ctx) {
    chunk := iter.Chunk()

    // <manipulate chunk>

}

if err := iter.Close(); err != nil {
    return err
}

You MUST call the Chunk() method no more than once per iteration.

You should check the Err() method when the iterator is complete to see if there were any issues encountered when decoding chunks.

func (*ChunkIterator) Close

func (iter *ChunkIterator) Close() error

Close releases resources of the iterator. Use this method to release those resources if you stop iterating before the iterator is exhausted. Canceling the context that you used to create the iterator has the same effect. Close returns a non-nil error if the iterator encountered any errors during iteration.

type Collector

type Collector interface {
	// SetMetadata sets the metadata document for the collector or
	// chunk. This document is optional. Pass a nil to unset it,
	// or a different document to override a previous operation.
	SetMetadata(any) error

	// Add extracts metrics from a document and appends it to the
	// current collector. These documents MUST all be
	// identical including field order. Returns an error if there
	// is a problem parsing the document or if the number of
	// metrics collected changes.
	Add(any) error

	// Resolve renders the existing documents and outputs the full
	// FTDC chunk as a byte slice to be written out to storage.
	Resolve() ([]byte, error)

	// Reset clears the collector for future use.
	Reset()

	// Info reports on the current state of the collector for
	// introspection and to support schema change and payload
	// size.
	Info() CollectorInfo
}

Collector describes the interface for collecting and constructing FTDC data series. Implementations may have different efficiencies and handling of schema changes.

The SetMetadata and Add methods both take any values. These are converted to bson documents; however it is an error to pass a type based on a map.

func NewBaseCollector

func NewBaseCollector(maxSize int) Collector

NewBaseCollector provides a basic FTDC data collector that mirrors the server's implementation. The Add method will error if you attempt to add more than the specified number of records (plus one, as the reference/schema document doesn't count).
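The "plus one" capacity rule can be sketched without the library. The `boundedCollector` type and `errFull` value below are hypothetical stand-ins, and samples are simplified to `[]int64`; the sketch only shows that the first (reference) sample is not counted against the limit.

```go
package main

import (
	"errors"
	"fmt"
)

// errFull is a hypothetical sentinel error for a collector at capacity.
var errFull = errors.New("collector is full")

// boundedCollector is a hypothetical, simplified collector. The first
// sample becomes the reference and does not count toward maxSize.
type boundedCollector struct {
	maxSize      int
	hasReference bool
	reference    []int64
	samples      [][]int64
}

// Add stores a sample, rejecting it when the metric count changes or
// when maxSize samples have already been stored after the reference.
func (c *boundedCollector) Add(sample []int64) error {
	if !c.hasReference {
		c.reference, c.hasReference = sample, true
		return nil
	}
	if len(sample) != len(c.reference) {
		return errors.New("number of metrics changed")
	}
	if len(c.samples) >= c.maxSize {
		return errFull
	}
	c.samples = append(c.samples, sample)
	return nil
}

// Reset clears the collector for future use.
func (c *boundedCollector) Reset() {
	c.hasReference, c.reference, c.samples = false, nil, nil
}

func main() {
	c := &boundedCollector{maxSize: 2}
	for i := 0; i < 4; i++ {
		// With maxSize 2, the reference plus two samples are accepted.
		fmt.Println(i, c.Add([]int64{int64(i)}))
	}
}
```

A caller hitting the capacity error would typically resolve the accumulated data and reset the collector before adding more samples.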

func NewBatchCollector

func NewBatchCollector(maxSamples int) Collector

NewBatchCollector constructs a collector implementation that builds data chunks with payloads of the specified number of samples. This implementation allows you to break data into smaller components for more efficient read operations.

func NewBufferedCollector

func NewBufferedCollector(ctx context.Context, size int, coll Collector) Collector

NewBufferedCollector wraps an existing collector with a buffer to normalize throughput to an underlying collector implementation.

func NewDynamicCollector

func NewDynamicCollector(maxSamples int) Collector

NewDynamicCollector constructs a Collector that records metrics from documents, creating new chunks when either the number of samples collected exceeds the specified max sample count OR the schema changes.

There is some overhead associated with detecting schema changes, particularly for documents with more complex schemas, so you may wish to opt for a simpler collector in some cases.
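The two chunk-splitting conditions (sample count and schema change) can be sketched as follows. This is an illustration under assumptions, not the collector's code: the `sample` type and the `sameSchema` and `chunkSamples` helpers are hypothetical, and a "schema" is reduced here to an ordered list of key names.

```go
package main

import "fmt"

// sample is a hypothetical flattened document: parallel slices of
// keys and values.
type sample struct {
	Keys   []string
	Values []int64
}

// sameSchema reports whether two key lists match, including order.
func sameSchema(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// chunkSamples groups samples into chunks, starting a new chunk when
// the current one already holds maxSamples samples or when the key
// list differs from the first sample of the current chunk.
func chunkSamples(maxSamples int, samples []sample) [][]sample {
	var chunks [][]sample
	for _, s := range samples {
		n := len(chunks)
		if n == 0 || len(chunks[n-1]) >= maxSamples || !sameSchema(chunks[n-1][0].Keys, s.Keys) {
			chunks = append(chunks, []sample{s})
			continue
		}
		chunks[n-1] = append(chunks[n-1], s)
	}
	return chunks
}

func main() {
	ab := []string{"a", "b"}
	abc := []string{"a", "b", "c"}
	samples := []sample{
		{Keys: ab, Values: []int64{1, 2}},
		{Keys: ab, Values: []int64{3, 4}},
		{Keys: ab, Values: []int64{5, 6}},
		{Keys: abc, Values: []int64{7, 8, 9}},
	}
	for i, chunk := range chunkSamples(2, samples) {
		fmt.Println(i, len(chunk), chunk[0].Keys)
	}
}
```

Comparing every key on every Add is the kind of per-sample cost the paragraph above refers to; a collector that only compares the number of metrics avoids it at the price of missing renamed fields.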

func NewSamplingCollector

func NewSamplingCollector(minimumInterval time.Duration, collector Collector) Collector

NewSamplingCollector wraps a different collector implementation and provides an implementation of the Add method that skips collection of results if the specified minimumInterval has not elapsed since the last collection.
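The interval check can be sketched with a small wrapper around a one-method interface. Everything named here (`adder`, `recorder`, `samplingAdder`, `countAccepted`) is hypothetical, the clock is injected so the behavior is deterministic, and the sketch assumes skipped samples are dropped silently rather than reported as errors.

```go
package main

import (
	"fmt"
	"time"
)

// adder is a hypothetical one-method subset of a collector interface.
type adder interface {
	Add(any) error
}

// recorder keeps every document it receives.
type recorder struct{ docs []any }

func (r *recorder) Add(doc any) error {
	r.docs = append(r.docs, doc)
	return nil
}

// samplingAdder forwards a document only when minimumInterval has
// elapsed since the last forwarded document.
type samplingAdder struct {
	minimumInterval time.Duration
	last            time.Time
	now             func() time.Time // injectable clock for testing
	next            adder
}

func (s *samplingAdder) Add(doc any) error {
	current := s.now()
	if !s.last.IsZero() && current.Sub(s.last) < s.minimumInterval {
		return nil // assumption: too-early samples are skipped silently
	}
	s.last = current
	return s.next.Add(doc)
}

// countAccepted replays arrivals at the given millisecond offsets and
// reports how many documents reached the wrapped recorder.
func countAccepted(intervalMillis int, offsetsMillis []int) int {
	start := time.Date(2023, time.August, 25, 0, 0, 0, 0, time.UTC)
	current := start
	rec := &recorder{}
	s := &samplingAdder{
		minimumInterval: time.Duration(intervalMillis) * time.Millisecond,
		now:             func() time.Time { return current },
		next:            rec,
	}
	for _, offset := range offsetsMillis {
		current = start.Add(time.Duration(offset) * time.Millisecond)
		_ = s.Add(offset)
	}
	return len(rec.docs)
}

func main() {
	// Arrivals at 0ms, 1000ms and 2100ms are forwarded; the rest are skipped.
	fmt.Println(countAccepted(1000, []int{0, 200, 999, 1000, 1500, 2100}))
}
```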

func NewStreamingCollector

func NewStreamingCollector(maxSamples int, writer io.Writer) Collector

NewStreamingCollector wraps the underlying collector, writing the data to the underlying writer after the underlying collector is filled. This is similar to the batch collector, but allows the collector to drop FTDC data from memory. Chunks are flushed to disk when the collector has collected the "maxSamples" number of samples during the Add operation.
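The flush-when-full behavior, and the reason a final explicit flush matters for streaming collectors, can be sketched as below. The `streamingBuffer` type and `stream` helper are hypothetical; a batch is written as a plain text line rather than as an FTDC chunk.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// streamingBuffer is a hypothetical stand-in for a streaming collector:
// it holds at most maxSamples values in memory before writing them out.
type streamingBuffer struct {
	maxSamples int
	pending    []int64
	w          io.Writer
}

// Add caches the value and writes the batch once it is full.
func (s *streamingBuffer) Add(v int64) error {
	s.pending = append(s.pending, v)
	if len(s.pending) < s.maxSamples {
		return nil
	}
	return s.Flush()
}

// Flush writes any cached values and clears the in-memory batch.
func (s *streamingBuffer) Flush() error {
	if len(s.pending) == 0 {
		return nil
	}
	_, err := fmt.Fprintln(s.w, s.pending)
	s.pending = s.pending[:0]
	return err
}

// stream feeds values through a streamingBuffer and returns everything
// that reached the writer.
func stream(maxSamples int, values []int64, flushAtEnd bool) (string, error) {
	var out strings.Builder
	s := &streamingBuffer{maxSamples: maxSamples, w: &out}
	for _, v := range values {
		if err := s.Add(v); err != nil {
			return "", err
		}
	}
	if flushAtEnd {
		if err := s.Flush(); err != nil {
			return "", err
		}
	}
	return out.String(), nil
}

func main() {
	partial, _ := stream(2, []int64{1, 2, 3}, false)
	complete, _ := stream(2, []int64{1, 2, 3}, true)
	fmt.Printf("%q\n%q\n", partial, complete)
}
```

Without the final flush the third value stays cached in memory, which is the situation the FlushCollector helper is described as addressing.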

func NewStreamingDynamicCollector

func NewStreamingDynamicCollector(max int, writer io.Writer) Collector

NewStreamingDynamicCollector has the same semantics as the dynamic collector but wraps the streaming collector rather than the batch collector. Chunks are flushed during the Add() operation when the schema changes or the chunk is full.

func NewStreamingDynamicUncompressedCollectorBSON

func NewStreamingDynamicUncompressedCollectorBSON(maxSamples int, writer io.Writer) Collector

NewStreamingDynamicUncompressedCollectorBSON constructs a collector that resolves data into a stream of BSON documents. The output of these uncompressed collectors does not use the FTDC encoding for data, and can be read with bsondump and other related utilities.

The metadata for this collector is rendered as the first document in the stream. Additionally, the collector will automatically handle schema changes by flushing the previous batch.

All data is written to the writer when the underlying collector has captured its target number of samples and is automatically flushed to the writer during the write operation or when you call Close. You can also use the FlushCollector helper.

func NewStreamingDynamicUncompressedCollectorJSON

func NewStreamingDynamicUncompressedCollectorJSON(maxSamples int, writer io.Writer) Collector

NewStreamingDynamicUncompressedCollectorJSON constructs a collector that resolves data into a stream of JSON documents. The output of these uncompressed collectors does not use the FTDC encoding for data, and can be read as newline-separated JSON.

The metadata for this collector is rendered as the first document in the stream. Additionally, the collector will automatically handle schema changes by flushing the previous batch.

All data is written to the writer when the underlying collector has captured its target number of samples and is automatically flushed to the writer during the write operation or when you call Close. You can also use the FlushCollector helper.

func NewStreamingUncompressedCollectorBSON

func NewStreamingUncompressedCollectorBSON(maxSamples int, writer io.Writer) Collector

NewStreamingUncompressedCollectorBSON constructs a collector that resolves data into a stream of BSON documents. The output of these uncompressed collectors does not use the FTDC encoding for data, and can be read with bsondump and other related utilities.

This collector will not allow you to collect documents with different schema (determined by the number of top-level fields.)

The metadata for this collector is rendered as the first document in the stream.

All data is written to the writer when the underlying collector has captured its target number of samples and is automatically flushed to the writer during the write operation or when you call Close. You can also use the FlushCollector helper.

func NewStreamingUncompressedCollectorJSON

func NewStreamingUncompressedCollectorJSON(maxSamples int, writer io.Writer) Collector

NewStreamingUncompressedCollectorJSON constructs a collector that resolves data into a stream of JSON documents. The output of these uncompressed collectors does not use the FTDC encoding for data, and can be read as newline-separated JSON.

This collector will not allow you to collect documents with different schema (determined by the number of top-level fields.)

The metadata for this collector is rendered as the first document in the stream.

All data is written to the writer when the underlying collector has captured its target number of samples and is automatically flushed to the writer during the write operation or when you call Close. You can also use the FlushCollector helper.

func NewSynchronizedCollector

func NewSynchronizedCollector(coll Collector) Collector

NewSynchronizedCollector wraps an existing collector in a synchronized wrapper that guards against incorrect concurrent access.
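A synchronized wrapper of this kind is commonly built with a mutex around each method call. The sketch below assumes that approach; the `adder`, `counter`, `synchronized`, and `addConcurrently` names are hypothetical, and only an Add method is wrapped.

```go
package main

import (
	"fmt"
	"sync"
)

// adder is a hypothetical one-method subset of a collector interface.
type adder interface {
	Add(any) error
}

// counter is not safe for concurrent use on its own.
type counter struct{ n int }

func (c *counter) Add(any) error {
	c.n++
	return nil
}

// synchronized serializes calls to the wrapped adder with a mutex.
type synchronized struct {
	mu   sync.Mutex
	next adder
}

func (s *synchronized) Add(doc any) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.next.Add(doc)
}

// addConcurrently calls Add from several goroutines through the
// wrapper and returns the total number of calls the counter observed.
func addConcurrently(workers, perWorker int) int {
	c := &counter{}
	s := &synchronized{next: c}

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				_ = s.Add(j)
			}
		}()
	}
	wg.Wait() // all writes are complete before c.n is read
	return c.n
}

func main() {
	fmt.Println(addConcurrently(8, 1000))
}
```

A wrapper like this protects individual calls only; a sequence such as "check state, then add" still needs coordination by the caller.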

func NewUncompressedCollectorBSON

func NewUncompressedCollectorBSON(maxSamples int) Collector

NewUncompressedCollectorBSON constructs a collector that resolves data into a stream of BSON documents. The output of these uncompressed collectors does not use the FTDC encoding for data, and can be read with bsondump and other related utilities.

This collector will not allow you to collect documents with different schema (determined by the number of top-level fields.)

If you do not resolve the collector after receiving the maximum number of samples, then additional Add operations will fail.

The metadata for this collector is rendered as the first document in the stream.

func NewUncompressedCollectorJSON

func NewUncompressedCollectorJSON(maxSamples int) Collector

NewUncompressedCollectorJSON constructs a collector that resolves data into a stream of JSON documents. The output of these uncompressed collectors does not use the FTDC encoding for data, and can be read as newline separated JSON.

This collector will not allow you to collect documents with different schema (determined by the number of top-level fields.)

If you do not resolve the collector after receiving the maximum number of samples, then additional Add operations will fail.

The metadata for this collector is rendered as the first document in the stream.

type CollectorInfo

type CollectorInfo struct {
	MetricsCount int
	SampleCount  int
}

CollectorInfo reports on the current state of the collector and provides introspection for testing, transparency, and to support more complex collector features, including payload size controls and schema change handling.

type Iterator

type Iterator struct {
	*fun.Iterator[*birch.Document]
	// contains filtered or unexported fields
}

func ReadMatrix

func ReadMatrix(ctx context.Context, r io.Reader) *Iterator

ReadMatrix returns a "matrix format" for the data in a chunk. The documents returned by the iterator represent the entire chunk, in flattened form, with each field representing a single metric as an array of all values for the event.

The matrix documents have full type fidelity, but are not substantially less expensive to produce than full iteration.
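The shape of a "matrix" document, one array per metric, can be illustrated by transposing rows of samples into columns. This is a sketch with hypothetical names (`column`, `toColumns`) and `int64` values only; it does not reproduce the type handling of the real iterator.

```go
package main

import (
	"errors"
	"fmt"
)

// column is a hypothetical pairing of a flattened key with every value
// recorded for it in one chunk.
type column struct {
	Key    string
	Values []int64
}

// toColumns transposes row-oriented samples into one column per key.
// Every row must have exactly one value per key.
func toColumns(keys []string, rows [][]int64) ([]column, error) {
	cols := make([]column, len(keys))
	for i, key := range keys {
		cols[i] = column{Key: key, Values: make([]int64, 0, len(rows))}
	}
	for _, row := range rows {
		if len(row) != len(keys) {
			return nil, errors.New("row width does not match the number of keys")
		}
		for i, v := range row {
			cols[i].Values = append(cols[i].Values, v)
		}
	}
	return cols, nil
}

func main() {
	cols, err := toColumns(
		[]string{"ts", "mem.resident"},
		[][]int64{{1000, 64}, {2000, 65}, {3000, 65}},
	)
	if err != nil {
		panic(err)
	}
	for _, c := range cols {
		fmt.Println(c.Key, c.Values)
	}
}
```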

func ReadMetrics

func ReadMetrics(ctx context.Context, r io.Reader) *Iterator

ReadMetrics returns a standard document iterator that reads FTDC chunks. The Documents returned by the iterator are flattened.

func ReadSeries

func ReadSeries(ctx context.Context, r io.Reader) *Iterator

ReadSeries is similar to the ReadMatrix format, and produces a single document per chunk that contains the flattened keys for that chunk, mapped to arrays of all the values of the chunk.

The matrix documents have better type fidelity than raw chunks but do not properly collapse the bson timestamp type. To use the values produced by the iterator, consider marshaling them directly to map[string]any and using a type switch on the values in the map, such as:

switch v.(type) {
case []int32:
       // ...
case []int64:
       // ...
case []bool:
       // ...
case []time.Time:
       // ...
case []float64:
       // ...
}

Note, however, that the *birch.Document type also supports iteration directly.

func ReadStructuredMetrics

func ReadStructuredMetrics(ctx context.Context, r io.Reader) *Iterator

ReadStructuredMetrics returns a standard document iterator that reads FTDC chunks. The Documents returned by the iterator retain the structure of the input documents.

func (*Iterator) Close

func (iter *Iterator) Close() error

func (*Iterator) Metadata

func (iter *Iterator) Metadata() *birch.Document

type Metric

type Metric struct {
	// For metrics that were derived from nested BSON documents,
	// this preserves the path to the field, in support of being
	// able to reconstitute metrics/chunks as a stream of BSON
	// documents.
	ParentPath []string

	// KeyName is the specific field name of a metric. It is
	// *not* fully qualified with its parent document path, use
	// the Key() method to access a value with more appropriate
	// user facing context.
	KeyName string

	// Values is an array of each value collected for this metric.
	// During decoding, this attribute stores delta-encoded
	// values, but those are expanded during decoding and should
	// never be visible to the user.
	Values []int64
	// contains filtered or unexported fields
}

Metric represents an item in a chunk.
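The comment on Values mentions delta encoding. As a rough illustration of the idea only, the sketch below stores each value as its difference from the previous one and restores the originals with a running sum. The function names are hypothetical, and the actual FTDC wire format involves more than this (for example, how deltas are packed and compressed), which is not shown.

```go
package main

import "fmt"

// deltaEncode replaces each value with its difference from the
// previous value. The first value is kept as a delta from zero.
func deltaEncode(values []int64) []int64 {
	out := make([]int64, len(values))
	var prev int64
	for i, v := range values {
		out[i] = v - prev
		prev = v
	}
	return out
}

// deltaDecode reverses deltaEncode with a running sum.
func deltaDecode(deltas []int64) []int64 {
	out := make([]int64, len(deltas))
	var sum int64
	for i, d := range deltas {
		sum += d
		out[i] = sum
	}
	return out
}

func main() {
	values := []int64{100, 101, 103, 103}
	deltas := deltaEncode(values)
	fmt.Println(deltas)
	fmt.Println(deltaDecode(deltas))
}
```

Slowly changing metrics produce many small or zero deltas, which is what makes a columnar layout of this kind compress well.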

func (*Metric) Key

func (m *Metric) Key() string

Directories

Path Synopsis
events Package events contains a number of different data types and formats that you can use to populate ftdc metrics series.
metrics Package metrics includes data types used for Golang runtime and system metrics collection
