monitor

package
v1.0.2 Latest
This package is not in the latest version of its module.
Published: Jul 3, 2023 License: Apache-2.0 Imports: 25 Imported by: 0

Documentation

Overview

Package monitor provides core Blip components that, together, monitor one MySQL instance. Most monitoring logic happens in this package, but package metrics is closely related: the latter actually collects metrics, but it is driven by this package. Other Blip packages mostly set up and support monitors.

Constants

This section is empty.

Variables

var CollectParallel = 2

CollectParallel sets how many domains to collect in parallel. Currently, this is not configurable via Blip config; it can only be changed via integration.
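
As a rough illustration of what a parallelism limit such as CollectParallel implies, the sketch below runs a collect function for a list of domains with at most a fixed number of goroutines active at once, using a buffered channel as a semaphore. The collectAll function and the domain names are hypothetical and only for illustration; this is not Blip's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// collectAll calls collect once per domain, running at most parallel
// calls at the same time. It returns the highest concurrency observed,
// which makes the limit easy to check.
func collectAll(domains []string, parallel int, collect func(domain string)) int {
	sem := make(chan struct{}, parallel) // counting semaphore
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, d := range domains {
		wg.Add(1)
		sem <- struct{}{} // blocks while parallel collectors are running
		go func(domain string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			collect(domain)

			mu.Lock()
			running--
			mu.Unlock()
		}(d)
	}
	wg.Wait()
	return peak
}

func main() {
	// Hypothetical domain names.
	domains := []string{"status.global", "var.global", "innodb", "repl"}
	peak := collectAll(domains, 2, func(string) {})
	fmt.Println("peak concurrency:", peak)
}
```

The semaphore is acquired before the goroutine starts, so the loop itself stalls once the limit is reached instead of creating goroutines that immediately block.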

var ErrStopLoss = errors.New("stop-loss prevents reloading")

var Now func() time.Time = time.Now
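
Now has no doc comment. A package-level clock function like this is commonly declared so that tests can substitute a fixed time. Assuming that is the intent here, the sketch below redeclares the variable locally and overrides it; the elapsed and demo helpers are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Now mirrors the declaration above: a replaceable clock.
var Now func() time.Time = time.Now

// elapsed is a hypothetical helper that reads the clock through Now
// instead of calling time.Now directly.
func elapsed(since time.Time) time.Duration {
	return Now().Sub(since)
}

// demo pins the clock five seconds after a fixed start time, measures
// the elapsed time, then restores the real clock.
func demo() time.Duration {
	start := time.Date(2023, 7, 3, 0, 0, 0, 0, time.UTC)
	Now = func() time.Time { return start.Add(5 * time.Second) }
	defer func() { Now = time.Now }()
	return elapsed(start)
}

func main() {
	fmt.Println(demo()) // prints "5s"
}
```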

Functions

func NewLevelCollector

func NewLevelCollector(args LevelCollectorArgs) *lpc

func NewPlanChanger

func NewPlanChanger(args PlanChangerArgs) *planChanger

func TickerDuration

func TickerDuration(d time.Duration)

TickerDuration sets the internal ticker duration. It exists only for testing; do not call it outside of tests.

Types

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine does the real work: collect metrics.

func NewEngine

func NewEngine(cfg blip.ConfigMonitor, db *sql.DB) *Engine

func (*Engine) Collect

func (e *Engine) Collect(ctx context.Context, levelName string) (*blip.Metrics, error)

func (*Engine) DB

func (e *Engine) DB() *sql.DB

func (*Engine) MonitorId

func (e *Engine) MonitorId() string

func (*Engine) Prepare

func (e *Engine) Prepare(ctx context.Context, plan blip.Plan, before, after func()) error

Prepare prepares the monitor to collect metrics for the plan. The monitor must be successfully prepared for Collect() to work because Prepare() initializes metrics collectors for every level of the plan. Prepare() can be called again when, for example, the PlanChanger detects a state change and calls the LevelCollector to change plans, which then calls this func with the new state plan.

Do not call this func concurrently! It does not guard against concurrent calls. Serialization is handled by the only caller: LevelCollector.ChangePlan().
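
The division of responsibility described above (an unguarded prepare step, serialized by its single caller) can be sketched as follows. The engine and collector types are hypothetical stand-ins, not Blip's types. The before and after callbacks are called around the plan swap purely for illustration; the documentation above does not say when Blip calls them.

```go
package main

import (
	"fmt"
	"sync"
)

// engine stands in for Engine. Its prepare method has no locking and
// must not be called concurrently.
type engine struct {
	plan     string
	prepares int
}

func (e *engine) prepare(plan string, before, after func()) {
	before()
	e.plan = plan
	e.prepares++
	after()
}

// collector stands in for the level collector. changePlan holds a mutex,
// so only one prepare call runs at a time no matter how many goroutines
// request a plan change.
type collector struct {
	mu     sync.Mutex
	engine *engine
}

func (c *collector) changePlan(plan string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.engine.prepare(plan, func() {}, func() {})
}

// changeConcurrently requests n plan changes from n goroutines.
func changeConcurrently(n int) *engine {
	c := &collector{engine: &engine{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.changePlan(fmt.Sprintf("plan-%d", i))
		}(i)
	}
	wg.Wait()
	return c.engine
}

func main() {
	e := changeConcurrently(50)
	fmt.Println("prepare calls:", e.prepares)
}
```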

func (*Engine) Stop

func (e *Engine) Stop()

Stop stops the engine and cleans up any metrics associated with it. TODO: There is a possible race condition when this is called. Because Engine.Collect is called as a goroutine, an invocation of Collect could block waiting for Engine.Stop to runlock planMux, and then run after cleanup has been called. This could result in a panic, though that should be caught and logged. Since the monitor is stopping anyway, this isn't a huge issue.

type Exporter

type Exporter struct {
	*sync.Mutex
	// contains filtered or unexported fields
}

Exporter emulates a Prometheus mysqld_exporter. It implements prom.Exporter.

func NewExporter

func NewExporter(cfg blip.ConfigExporter, plan blip.Plan, engine *Engine) *Exporter

func (Exporter) Collect

func (e Exporter) Collect(ch chan<- prometheus.Metric)

Collect collects metrics. It is called indirectly via Scrape.

func (Exporter) Describe

func (e Exporter) Describe(descs chan<- *prometheus.Desc)

func (Exporter) Plan

func (e Exporter) Plan() blip.Plan

func (Exporter) Scrape

func (e Exporter) Scrape() (string, error)

Scrape collects and returns metrics in Prometheus exposition format. This function is called in response to GET /metrics.
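
For readers unfamiliar with the Prometheus text exposition format that Scrape returns, the sketch below renders a few gauge values in that format using only the standard library. The expose function is hypothetical. The Exporter itself works through prometheus.Metric values, as the Collect signature shows, so this illustrates only the shape of the output, not how Blip produces it.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// expose renders gauge values in Prometheus text exposition format:
// a "# TYPE" line followed by a "name value" line for each metric.
// Names are sorted so the output is deterministic.
func expose(gauges map[string]float64) string {
	names := make([]string, 0, len(gauges))
	for name := range gauges {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "# TYPE %s gauge\n%s %g\n", name, name, gauges[name])
	}
	return b.String()
}

func main() {
	fmt.Print(expose(map[string]float64{
		"mysql_up":                            1,
		"mysql_global_status_threads_running": 12,
	}))
}
```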

type LevelCollector

type LevelCollector interface {
	// Run runs the collector to collect metrics; it's a blocking call.
	Run(stopChan, doneChan chan struct{}) error

	// ChangePlan changes the plan; it's called by an Pl
	ChangePlan(newState, newPlanName string) error

	// Pause pauses metrics collection until ChangePlan is called.
	Pause()
}

LevelCollector collects metrics according to a plan. It doesn't collect metrics directly; as part of a Monitor, it calls the Engine when it's time to collect metrics for a certain level, based on the frequency the user specifies for each level. After the Engine returns metrics, the collector (or "LPC" for short) calls blip.Plugin.TransformMetrics (if specified), then sends the metrics to all sinks specified for the monitor. Then it waits until it's time to collect metrics for the next level. Consequently, the LPC drives metrics collection, but the Engine does the actual work of collecting metrics.
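
The collect, transform, send cycle described above can be sketched with simulated ticks. Everything here is a hypothetical stand-in: level and metrics are not Blip's types, and drive is not the LPC. How Blip handles a tick on which several levels are due is not covered above; this sketch simply collects each due level, and it skips the sinks when the transform fails.

```go
package main

import "fmt"

// level is a hypothetical stand-in for one plan level.
type level struct {
	name string
	freq int // collect every freq ticks (think: seconds)
}

// metrics is a hypothetical stand-in for blip.Metrics.
type metrics struct {
	level  string
	values map[string]float64
}

// dueLevels returns the names of levels whose frequency divides tick.
func dueLevels(levels []level, tick int) []string {
	var due []string
	for _, l := range levels {
		if l.freq > 0 && tick%l.freq == 0 {
			due = append(due, l.name)
		}
	}
	return due
}

// drive simulates ticks 1..ticks. For each due level it asks the engine
// (collect) for metrics, applies the optional transform, then sends the
// result to every sink. It returns how many collections reached the sinks.
func drive(levels []level, ticks int, collect func(string) *metrics,
	transform func(*metrics) error, sinks []func(*metrics)) int {
	sent := 0
	for tick := 1; tick <= ticks; tick++ {
		for _, name := range dueLevels(levels, tick) {
			m := collect(name)
			if transform != nil {
				if err := transform(m); err != nil {
					continue // a choice made for this sketch
				}
			}
			for _, send := range sinks {
				send(m)
			}
			sent++
		}
	}
	return sent
}

func main() {
	levels := []level{{"kpi", 5}, {"standard", 20}}
	collect := func(name string) *metrics {
		return &metrics{level: name, values: map[string]float64{"threads_running": 3}}
	}
	sink := func(m *metrics) { fmt.Println("sink got level", m.level) }
	n := drive(levels, 20, collect, nil, []func(*metrics){sink})
	fmt.Println("collections sent:", n)
}
```

With a 5-tick and a 20-tick level over 20 ticks, the first level is due four times and the second once, so five collections reach the sink.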

type LevelCollectorArgs

type LevelCollectorArgs struct {
	Config           blip.ConfigMonitor
	DB               *sql.DB
	PlanLoader       *plan.Loader
	Sinks            []blip.Sink
	TransformMetrics func(*blip.Metrics) error
}

type LoadFunc

type LoadFunc func(blip.Config) ([]blip.ConfigMonitor, error)

LoadFunc is a callback that matches blip.Plugin.LoadMonitors. It's an arg to NewLoader, if specified by the user.

type Loader

type Loader struct {
	*sync.Mutex
	// contains filtered or unexported fields
}

Loader is the singleton monitor loader and repo. It's created by the server and only used there (and via API calls). It's dynamic so monitors can be loaded (created) and unloaded (destroyed) while Blip is running, but the normal case is one load and start on Blip startup: Server.Boot calls Load, then Server.Run calls StartMonitors. The user can make API calls to reload while Blip is running.

Loader is safe for concurrent use, but it's currently only called by the Server.

func NewLoader

func NewLoader(args LoaderArgs) *Loader

NewLoader creates a new Loader singleton. It's called in Server.Boot and Server.Run.

func (*Loader) Count

func (ml *Loader) Count() uint

Count returns the number of loaded monitors. It's used by the API for status.

func (*Loader) Load

func (ml *Loader) Load(ctx context.Context) error

Load loads all configured monitors and unloads (stops and removes) monitors that have been removed or changed since the last call to Load. It does not start new monitors. Call StartMonitors after Load to start new (or previously stopped) monitors.

Server.Boot calls Load, then Server.Run calls StartMonitors.

Load checks for stop-loss and does local MySQL auto-detection, if these two features are enabled.

If Load returns error, the currently loaded monitors are not affected. The error indicates a problem loading monitors or a validation error.

This function is safe for concurrent use, but calls are serialized.
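
The load and unload decision described above amounts to a diff between what is loaded and what is configured. A minimal sketch, assuming each monitor is identified by its monitor ID and that a change can be detected by comparing some representation of its config (here, a plain string); the diff function is hypothetical and is not Blip's implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// diff compares loaded and configured monitors, both keyed by monitor ID
// and valued by a string that represents the monitor's config.
//
//   - new monitors (configured, not loaded) are returned in load
//   - removed monitors (loaded, not configured) are returned in unload
//   - changed monitors are returned in both: unload the old, load the new
func diff(loaded, configured map[string]string) (load, unload []string) {
	for id, cfg := range configured {
		old, ok := loaded[id]
		if !ok {
			load = append(load, id)
		} else if old != cfg {
			unload = append(unload, id)
			load = append(load, id)
		}
	}
	for id := range loaded {
		if _, ok := configured[id]; !ok {
			unload = append(unload, id)
		}
	}
	sort.Strings(load)
	sort.Strings(unload)
	return load, unload
}

func main() {
	loaded := map[string]string{"db1": "a", "db2": "b", "db3": "c"}
	configured := map[string]string{"db1": "a", "db2": "B", "db4": "d"}
	load, unload := diff(loaded, configured)
	fmt.Println("load:", load)     // prints "load: [db2 db4]"
	fmt.Println("unload:", unload) // prints "unload: [db2 db3]"
}
```

Computing the full diff before touching any running monitor is one way to get the behavior described above, where an error leaves the currently loaded monitors unaffected.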

func (*Loader) Monitor

func (ml *Loader) Monitor(monitorId string) *Monitor

Monitor returns one monitor by ID. It's used by the API to get single monitor status.

func (*Loader) Monitors

func (ml *Loader) Monitors() []*Monitor

Monitors returns a list of all currently loaded monitors.

func (*Loader) Print

func (ml *Loader) Print() string

Print prints all loaded monitors in blip.ConfigMonitor YAML format. It's used for --print-monitors.

func (*Loader) Start

func (ml *Loader) Start(monitorId string, lock bool) error

Start starts a monitor if it's not already running.

func (*Loader) StartMonitors

func (ml *Loader) StartMonitors()

StartMonitors starts all monitors that have been loaded but not started. This should be called after Load. On Blip startup, the server calls Load in Server.Boot, then StartMonitors in Server.Run. The user can reload by calling the server API: /monitors/reload.

This function is safe for concurrent use, but calls are serialized.

func (*Loader) Stop

func (ml *Loader) Stop(monitorId string, lock bool) error

Stop stops a monitor but does not unload it. It can be started again by calling Start.

func (*Loader) Unload

func (ml *Loader) Unload(monitorId string, lock bool) error

Unload stops and removes a monitor.

type LoaderArgs

type LoaderArgs struct {
	Config     blip.Config
	Factories  blip.Factories
	Plugins    blip.Plugins
	PlanLoader *plan.Loader
	RDSLoader  aws.RDSLoader
}

type Monitor

type Monitor struct {
	// contains filtered or unexported fields
}

Monitor monitors one MySQL instance. The monitor is a high-level component that runs (and keeps running) four monitor subsystems:

  • Plan changer (PCH)
  • Level collector (LCO)
  • Blip heartbeat writer
  • Exporter (Prometheus)

Each subsystem is optional based on the config, but LCO runs by default because it contains the Engine component that does actual metrics collection. If any subsystem crashes (returns for any reason or panics), the monitor stops and restarts all subsystems. The monitor doesn't stop until Stop is called. Consequently, if a monitor is not configured correctly (for example, it can't connect to MySQL), it keeps trying and reporting forever.
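
The restart-until-stopped behavior can be sketched as a small supervisor loop. This is a simplified, hypothetical illustration: it supervises a single function instead of four subsystems, and the runOnce and supervise names are not part of Blip.

```go
package main

import (
	"errors"
	"fmt"
)

// runOnce runs fn and converts a panic into an error, so a crash in fn
// does not take down the supervisor.
func runOnce(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic: %v", r)
		}
	}()
	return fn()
}

// supervise runs fn again every time it returns or panics, until stop
// is closed. It returns how many times fn was run.
func supervise(fn func() error, stop <-chan struct{}) int {
	runs := 0
	for {
		select {
		case <-stop:
			return runs
		default:
		}
		runs++
		if err := runOnce(fn); err != nil {
			fmt.Println("subsystem stopped:", err)
		}
	}
}

func main() {
	stop := make(chan struct{})
	calls := 0
	fn := func() error {
		calls++
		switch calls {
		case 1:
			panic("lost connection")
		case 2:
			return errors.New("returned early")
		default:
			close(stop) // simulate Stop being called
			return nil
		}
	}
	fmt.Println("runs:", supervise(fn, stop)) // prints "runs: 3"
}
```

A production supervisor would normally wait between restarts; that detail is omitted here because the text above does not describe it.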

Monitors are loaded, created, and initially started only by the Loader. A monitor can be stopped and started (again) via the server API.

A monitor is uniquely identified by its monitor ID, which should be included in all output by the monitor and its subsystems. The monitor ID is set when the monitor is loaded by the Loader, which calls blip.MonitorId to determine the value.

A monitor is completely self-contained and independent. For example, each monitor has its own LCO, engine, and metric collectors.

func NewMonitor

func NewMonitor(args MonitorArgs) *Monitor

NewMonitor creates a new Monitor with the given arguments. The caller must call Boot then, if that does not return an error, Run to start monitoring the MySQL instance.

func (*Monitor) Config

func (m *Monitor) Config() blip.ConfigMonitor

Config returns the monitor config.

func (*Monitor) DSN

func (m *Monitor) DSN() string

DSN returns the redacted DSN (no password).

func (*Monitor) MonitorId

func (m *Monitor) MonitorId() string

MonitorId returns the monitor ID.

func (*Monitor) Start

func (m *Monitor) Start() error

Start starts the monitor. If it's already running, it returns an error. It can be called again after calling Stop.

Start/stop monitors only through the Loader. DO NOT call Start or Stop directly, else the running state of the monitor and the Loader will be out of sync.

func (*Monitor) Stop

func (m *Monitor) Stop() error

Stop stops the monitor. It is idempotent and thread-safe.

Start/stop monitors only through the Loader. DO NOT call Start or Stop directly, else the running state of the monitor and the Loader will be out of sync.
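
The Start and Stop contract documented above (Start fails when already running, Stop is idempotent, and Start works again after Stop) can be sketched with a mutex-guarded flag. The monitor type below is a hypothetical stand-in, not Blip's Monitor.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// monitor is a hypothetical stand-in that tracks only the running state.
type monitor struct {
	mu      sync.Mutex
	running bool
}

// Start marks the monitor as running. It returns an error if the monitor
// is already running.
func (m *monitor) Start() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.running {
		return errors.New("monitor already running")
	}
	m.running = true
	return nil
}

// Stop marks the monitor as stopped. Calling it on a stopped monitor is
// a no-op, which makes it idempotent.
func (m *monitor) Stop() error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.running = false
	return nil
}

func main() {
	m := &monitor{}
	fmt.Println(m.Start()) // prints "<nil>"
	fmt.Println(m.Start()) // prints "monitor already running"
	fmt.Println(m.Stop())  // prints "<nil>"
	fmt.Println(m.Stop())  // prints "<nil>"
	fmt.Println(m.Start()) // prints "<nil>"
}
```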

type MonitorArgs

type MonitorArgs struct {
	Config          blip.ConfigMonitor
	DbMaker         blip.DbFactory
	PlanLoader      *plan.Loader
	Sinks           []blip.Sink
	TransformMetric func(metrics *blip.Metrics) error
}

MonitorArgs are required arguments to NewMonitor.

type PlanChanger

type PlanChanger interface {
	Run(stopChan, doneChan chan struct{}) error
}

PlanChanger changes the plan based on database instance state.
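
Both PlanChanger.Run and LevelCollector.Run take a stopChan and a doneChan, but their exact semantics are not documented here. A common Go convention, assumed in the sketch below, is that the caller closes stopChan to request a stop and Run closes doneChan just before it returns. The run function, its work callback, and checkInterval are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// checkInterval is a hypothetical interval between units of work.
const checkInterval = 10 * time.Millisecond

// run blocks, calling work on every tick, until stopChan is closed.
// It closes doneChan when it returns so the caller can wait for it.
func run(stopChan, doneChan chan struct{}, work func()) error {
	defer close(doneChan)
	ticker := time.NewTicker(checkInterval)
	defer ticker.Stop()
	for {
		select {
		case <-stopChan:
			return nil
		case <-ticker.C:
			work()
		}
	}
}

func main() {
	stop := make(chan struct{})
	done := make(chan struct{})
	go run(stop, done, func() {})

	close(stop) // request stop
	<-done      // wait until run has returned
	fmt.Println("stopped")
}
```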

type PlanChangerArgs

type PlanChangerArgs struct {
	MonitorId string
	Config    blip.ConfigPlanChange
	DB        *sql.DB
	LCO       LevelCollector
	HA        ha.Manager
}

type StartMonitorFunc

type StartMonitorFunc func(blip.ConfigMonitor) bool

StartMonitorFunc is a callback that matches blip.Plugin.StartMonitor.
