observability

package module
v0.0.2 Latest
Published: Oct 24, 2025 License: MIT Imports: 23 Imported by: 0

README

gobservability

Production-ready observability stack for Go applications with OpenTelemetry integration.

Overview

gobservability provides a unified, zero-configuration observability solution for Go services. It handles metrics, traces, logs, and profiling through a single initialization call, with strict environment-based configuration for production safety.

Features

  • Zero-configuration setup - Single Init() call to initialize everything
  • Environment-based configuration - Strict validation with fail-fast behavior
  • OpenTelemetry integration - OTLP export for metrics, traces, and logs
  • Pre-configured metrics - HTTP, database, cache, runtime, and system metrics ready to use
  • Graceful shutdown - Proper cleanup and metric flushing
  • Production-ready - Battle-tested defaults and comprehensive error handling

Installation

go get github.com/gath-stack/gobservability

Quick Start

package main

import (
    "context"
    "net/http"
    
    "github.com/go-chi/chi/v5"
    observability "github.com/gath-stack/gobservability"
    "go.uber.org/zap"
)

func main() {
    log, err := zap.NewProduction()
    if err != nil {
        panic(err)
    }
    
    // Initialize observability stack
    stack, err := observability.Init(log, nil)
    if err != nil {
        log.Fatal("failed to initialize observability", zap.Error(err))
    }
    defer func() {
        if err := stack.Shutdown(context.Background()); err != nil {
            log.Error("failed to shutdown observability", zap.Error(err))
        }
    }()
    
    // Setup HTTP server with automatic metrics
    router := chi.NewRouter()
    router.Use(stack.HTTPMetricsMiddleware())
    router.Get("/api/users", handleGetUsers)
    
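    // ListenAndServe blocks until the server fails. Log the error instead of
    // exiting so the deferred Shutdown above still runs.
    if err := http.ListenAndServe(":8080", router); err != nil {
        log.Error("http server stopped", zap.Error(err))
    }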
}

func handleGetUsers(w http.ResponseWriter, r *http.Request) {
    // Your handler code here
}

Configuration

All configuration is done through environment variables. The package enforces strict validation and fails fast on misconfiguration.

Required Variables

APP_NAME=my-service
APP_VERSION=1.0.0
APP_ENV=production  # development, staging, or production

Feature Flags

Enable specific observability components:

OBSERVABILITY_METRICS_ENABLED=true
OBSERVABILITY_TRACING_ENABLED=false
OBSERVABILITY_LOGS_ENABLED=false
OBSERVABILITY_PROFILING_ENABLED=false

Endpoints

Required when features are enabled:

OBSERVABILITY_OTLP_ENDPOINT=otel-collector:4317
OBSERVABILITY_PYROSCOPE_SERVER_ADDRESS=http://pyroscope:4040  # only for profiling

Optional Configuration

OBSERVABILITY_METRIC_EXPORT_INTERVAL=10      # seconds (default: 10)
OBSERVABILITY_TRACE_SAMPLING_RATE=0.1        # 0.0-1.0 (default: environment-based)
OBSERVABILITY_METRIC_BATCH_SIZE=1024         # default: 1024
OBSERVABILITY_TRACE_BATCH_SIZE=512           # default: 512
DEPLOYMENT_ID=deployment-xyz                 # optional deployment identifier
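The strict, fail-fast validation described in this section could look roughly like the sketch below. It is not the package's actual loader: `envConfig` and `loadEnvConfig` are hypothetical names, and only rules stated in this README are checked (the three required variables, the allowed `APP_ENV` values, and the 0.0-1.0 range for the sampling rate).

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// envConfig is a hypothetical holder for a few of the documented variables;
// the package's real config type may look different.
type envConfig struct {
	AppName, AppVersion, AppEnv string
	TraceSamplingRate           *float64 // nil when the variable is unset
}

// loadEnvConfig reads variables through getenv and reports every problem at once.
func loadEnvConfig(getenv func(string) string) (envConfig, error) {
	var errs []error
	cfg := envConfig{
		AppName:    getenv("APP_NAME"),
		AppVersion: getenv("APP_VERSION"),
		AppEnv:     getenv("APP_ENV"),
	}
	if cfg.AppName == "" {
		errs = append(errs, errors.New("APP_NAME is required"))
	}
	if cfg.AppVersion == "" {
		errs = append(errs, errors.New("APP_VERSION is required"))
	}
	switch cfg.AppEnv {
	case "development", "staging", "production":
	default:
		errs = append(errs, fmt.Errorf("APP_ENV %q must be development, staging, or production", cfg.AppEnv))
	}
	if raw := getenv("OBSERVABILITY_TRACE_SAMPLING_RATE"); raw != "" {
		rate, err := strconv.ParseFloat(raw, 64)
		// The negated range check also rejects NaN.
		if err != nil || !(rate >= 0 && rate <= 1) {
			errs = append(errs, fmt.Errorf("OBSERVABILITY_TRACE_SAMPLING_RATE %q must be within 0.0-1.0", raw))
		} else {
			cfg.TraceSamplingRate = &rate
		}
	}
	// errors.Join returns nil when no problems were collected.
	return cfg, errors.Join(errs...)
}
```

Passing `os.Getenv` as the lookup function reads the real environment; taking the lookup as a parameter keeps the validation easy to exercise with a map in tests.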

Usage Examples

HTTP Metrics

Automatic HTTP request tracking with Chi router:

stack, _ := observability.Init(log, nil)
router := chi.NewRouter()
router.Use(stack.HTTPMetricsMiddleware())

Manual HTTP metrics recording:

start := time.Now()
// ... handle request ...
duration := time.Since(start)

stack.HTTP.RecordRequest(ctx, 
    "GET", 
    "/api/users", 
    200, 
    duration, 
    requestSize, 
    responseSize)

Database Metrics

start := time.Now()
rows, err := db.Query(ctx, "SELECT * FROM users WHERE active = ?", true)
duration := time.Since(start)

stack.DB.RecordQuery(ctx, "SELECT", "users", duration, err == nil)

Track connection pool:

// Connection acquired
stack.DB.UpdateConnections(ctx, +1)
defer stack.DB.UpdateConnections(ctx, -1)

Cache Metrics

start := time.Now()
value, found := cache.Get(key)
duration := time.Since(start)

var size int64
if found {
    size = int64(len(value))
}

stack.Cache.RecordGet(ctx, "get", found, duration, size)

Runtime and System Metrics

Runtime and system metrics are collected automatically. No manual instrumentation required.

Runtime metrics include:

  • Goroutine count
  • Memory usage (heap, stack, GC stats)
  • GC pause times
  • CPU usage

System metrics include:

  • CPU utilization
  • Disk usage and I/O
  • Network I/O

To disable system metrics:

stack, err := observability.Init(log, &observability.InitOptions{
    DisableSystemMetrics: true,
})

Custom Metrics

For advanced use cases, access the OpenTelemetry meter directly:

meter := stack.Meter()

counter, _ := meter.Int64Counter(
    "custom.operations.total",
    metric.WithDescription("Total custom operations"),
)

counter.Add(ctx, 1, metric.WithAttributes(
    attribute.String("operation", "process"),
    attribute.String("status", "success"),
))

Architecture

Components

  • Config: Environment-based configuration with validation
  • Metrics: OpenTelemetry metrics with OTLP export
  • HTTP Metrics: Request count, duration, size, status codes
  • DB Metrics: Query performance, connection pool, operation types
  • Cache Metrics: Hit/miss rates, operation duration, data size
  • Runtime Metrics: Go runtime statistics (automatic)
  • System Metrics: OS-level resource usage (automatic)

Metric Export

All metrics are exported via OTLP (OpenTelemetry Protocol) to a configured collector endpoint. The default export interval is 10 seconds.

Best Practices

Initialization

Always initialize observability early in your application startup:

func main() {
    log := initLogger()
    
    stack, err := observability.Init(log, nil)
    if err != nil {
        log.Fatal("observability init failed", zap.Error(err))
    }
    defer func() {
        if err := stack.Shutdown(context.Background()); err != nil {
            log.Error("observability shutdown failed", zap.Error(err))
        }
    }()
    
    // Continue with application setup
}

Graceful Shutdown

Always call Shutdown() to flush metrics and clean up resources:

defer func() {
    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()
    
    if err := stack.Shutdown(ctx); err != nil {
        log.Error("failed to shutdown observability", zap.Error(err))
    }
}()

Environment-Specific Configuration

Use different sampling rates and export intervals per environment:

Development:

APP_ENV=development
OBSERVABILITY_TRACE_SAMPLING_RATE=1.0  # 100% sampling
OBSERVABILITY_METRIC_EXPORT_INTERVAL=5  # faster feedback

Production:

APP_ENV=production
OBSERVABILITY_TRACE_SAMPLING_RATE=0.1  # 10% sampling
OBSERVABILITY_METRIC_EXPORT_INTERVAL=10

Error Handling

The package follows fail-fast principles. Configuration errors cause immediate failure:

// This will fail fast if environment is misconfigured
stack, err := observability.Init(log, nil)
if err != nil {
    // Log the error and exit - don't continue with invalid config
    log.Fatal("observability configuration invalid", zap.Error(err))
}

Context Propagation

Always pass context through your call chain for proper metric attribution:

func handleRequest(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()
    
    users, err := getUsersFromDB(ctx)  // Pass context down
    if err != nil {
        // Handle error
    }
}

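func getUsersFromDB(ctx context.Context) ([]User, error) {
    start := time.Now()
    var (
        users []User
        err   error
    )
    // ... database query that fills users and err ...

    // Report the real outcome rather than a hard-coded success flag.
    stack.DB.RecordQuery(ctx, "SELECT", "users", time.Since(start), err == nil)
    return users, err
}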

Integration Examples

Docker Compose

version: '3.8'

services:
  app:
    build: .
    environment:
      - APP_NAME=my-service
      - APP_VERSION=1.0.0
      - APP_ENV=production
      - OBSERVABILITY_METRICS_ENABLED=true
      - OBSERVABILITY_OTLP_ENDPOINT=otel-collector:4317
    depends_on:
      - otel-collector

  otel-collector:
    image: otel/opentelemetry-collector:latest
    ports:
      - "4317:4317"
    volumes:
      - ./otel-config.yaml:/etc/otel-config.yaml
    command: ["--config=/etc/otel-config.yaml"]

Kubernetes

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_NAME: "my-service"
  APP_VERSION: "1.0.0"
  APP_ENV: "production"
  OBSERVABILITY_METRICS_ENABLED: "true"
  OBSERVABILITY_OTLP_ENDPOINT: "otel-collector.observability.svc.cluster.local:4317"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
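spec:
  # apps/v1 Deployments require a selector that matches the pod template labels.
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
      - name: app
        image: my-service:1.0.0
        envFrom:
        - configMapRef:
            name: app-config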

Troubleshooting

Metrics not appearing

Check that:

  1. OBSERVABILITY_METRICS_ENABLED=true is set
  2. OBSERVABILITY_OTLP_ENDPOINT points to a valid collector
  3. The OTLP collector is reachable from your service
  4. The collector is properly configured to receive OTLP metrics

High memory usage

Adjust batch sizes to reduce memory footprint:

OBSERVABILITY_METRIC_BATCH_SIZE=512  # Reduce from default 1024
OBSERVABILITY_TRACE_BATCH_SIZE=256   # Reduce from default 512

Initialization failures

The package validates configuration strictly. Common issues:

  • Missing required environment variables (APP_NAME, APP_VERSION, APP_ENV)
  • Invalid APP_ENV value (must be: development, staging, or production)
  • Invalid OBSERVABILITY_OTLP_ENDPOINT format (must be host:port)
  • OTLP endpoint required when features are enabled

Check logs for detailed validation errors.
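To check an endpoint value before deploying, the host:port rule can be approximated with the standard library. This is a sketch under the assumption that the package wants a bare host and numeric port with no scheme; `validateEndpoint` is a hypothetical helper, not part of the package, and it does not contact the collector.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// validateEndpoint checks that endpoint has the host:port shape and a usable port.
func validateEndpoint(endpoint string) error {
	host, port, err := net.SplitHostPort(endpoint)
	if err != nil {
		// Values with a scheme, such as http://host:4317, fail here.
		return fmt.Errorf("endpoint %q is not host:port: %w", endpoint, err)
	}
	if host == "" {
		return fmt.Errorf("endpoint %q has an empty host", endpoint)
	}
	n, err := strconv.Atoi(port)
	if err != nil || n < 1 || n > 65535 {
		return fmt.Errorf("endpoint %q has an invalid port %q", endpoint, port)
	}
	return nil
}
```

With this check, `otel-collector:4317` passes, while `http://otel-collector:4317` and `otel-collector` are rejected.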

Requirements

  • Go 1.21 or later
  • OpenTelemetry Collector or compatible OTLP endpoint

Dependencies

  • go.opentelemetry.io/otel
  • go.opentelemetry.io/otel/sdk
  • go.opentelemetry.io/otel/exporters/otlp/otlpmetric
  • go.uber.org/zap (for logging interface)

License

MIT License - see LICENSE file for details

Contributing

Contributions are welcome. Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Submit a pull request

Ensure all tests pass and code follows the existing style.

Support

For issues and questions:

Documentation

Overview

Package observability provides a production-ready observability stack for Go applications.

This package offers unified initialization and management of metrics, traces, logs, and profiles using OpenTelemetry standards. Configuration is strictly environment-based for production safety.

Features

  • Zero-configuration setup with Init()
  • Environment-only configuration with strict validation
  • OpenTelemetry metrics export via OTLP
  • Automatic runtime and system metrics collection
  • Graceful shutdown with proper cleanup
  • Fail-fast behavior for misconfiguration

Quick Start

import observability "github.com/gath-stack/gobservability"

func main() {
    log := logger.Get()

    // One line to initialize everything!
    stack, err := observability.Init(log, nil)
    if err != nil {
        log.Fatal("failed to init observability", zap.Error(err))
    }
    defer func() {
        if err := stack.Shutdown(context.Background()); err != nil {
            log.Error("Failed to shutdown observability", zap.Error(err))
        }
    }()

    // All metrics are ready to use
    stack.HTTP.RecordRequest(ctx, "GET", "/api/users", 200, duration, reqSize, respSize)
    stack.DB.RecordQuery(ctx, "SELECT", "users", duration, true)
    // Runtime and System metrics: automatic!
}

Environment Variables

Required environment variables:

  • APP_NAME: Service name
  • APP_VERSION: Service version
  • APP_ENV: Environment (development, staging, production)
  • OBSERVABILITY_OTLP_ENDPOINT: OTLP collector endpoint (if features enabled)

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type InitOptions

type InitOptions struct {
	// DisableSystemMetrics prevents system metrics collection if true.
	// This is useful in containerized environments where system metrics
	// may not be meaningful or accessible.
	DisableSystemMetrics bool

	// SystemDiskPath specifies the disk path to monitor for disk metrics.
	// Defaults to "/" if not specified.
	SystemDiskPath string
}

InitOptions configures optional behaviors during observability stack initialization.
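Because Init accepts a nil *InitOptions, the options have to be resolved to concrete values somewhere. A minimal sketch of that step follows; `resolveOptions` is a hypothetical helper, and the struct is copied here only so the example compiles on its own. The single default applied is the documented "/" disk path.

```go
package main

// InitOptions mirrors the documented struct so the sketch is self-contained.
type InitOptions struct {
	DisableSystemMetrics bool
	SystemDiskPath       string
}

// resolveOptions accepts nil and fills in the documented default disk path.
// The caller's struct is copied, never modified.
func resolveOptions(opts *InitOptions) InitOptions {
	resolved := InitOptions{}
	if opts != nil {
		resolved = *opts
	}
	if resolved.SystemDiskPath == "" {
		resolved.SystemDiskPath = "/"
	}
	return resolved
}
```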

type Logger

type Logger interface {
	Debug(msg string, fields ...zap.Field)
	Info(msg string, fields ...zap.Field)
	Warn(msg string, fields ...zap.Field)
	Error(msg string, fields ...zap.Field)
}

Logger is the interface for logging operations used throughout the observability stack. It is compatible with zap.Logger and other structured logging packages that follow the same field-based logging pattern.

type Observability

type Observability struct {
	// contains filtered or unexported fields
}

Observability manages the lifecycle of the observability stack components.

This is a lower-level type used internally by Stack. Most applications should use Init() to create a Stack rather than working with Observability directly.

func MustNew

func MustNew(log Logger) *Observability

MustNew creates a new Observability instance or panics on error.

This is the panic-on-error variant of New().
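The relationship between a constructor and its Must variant usually follows the pattern sketched here. The `client`, `newClient`, and `mustNewClient` names are stand-ins chosen for illustration; the real New reads its configuration from environment variables rather than taking an argument.

```go
package main

import "fmt"

// client stands in for *Observability.
type client struct{ name string }

// newClient stands in for New: it reports problems as an error.
func newClient(name string) (*client, error) {
	if name == "" {
		return nil, fmt.Errorf("name is required")
	}
	return &client{name: name}, nil
}

// mustNewClient stands in for MustNew: same work, but it panics on error.
func mustNewClient(name string) *client {
	c, err := newClient(name)
	if err != nil {
		panic(fmt.Sprintf("observability: %v", err))
	}
	return c
}
```

The Must form suits package-level setup in main, where a failure should stop the program; library code is usually better served by the error-returning form.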

func New

func New(log Logger) (*Observability, error)

New creates a new Observability instance by loading configuration from environment variables.

This is a lower-level function compared to Init(). It requires manual calls to Start() and metric initialization. For most use cases, Init() is preferred.

Example:

obs, err := observability.New(log)
if err != nil {
    return err
}
if err := obs.Start(ctx); err != nil {
    return err
}
// Manually initialize metrics as needed

func (*Observability) Config

func (o *Observability) Config() config.Config

Config returns the current observability configuration.

func (*Observability) IsInitialized

func (o *Observability) IsInitialized() bool

IsInitialized returns true if the observability stack has been initialized.

func (*Observability) LogsProvider

func (o *Observability) LogsProvider() *logs.LogsProvider

func (*Observability) Meter

func (o *Observability) Meter() metric.Meter

Meter returns the global OpenTelemetry meter for creating custom metric instruments.

This method panics if observability is not initialized or metrics are not enabled. For typical use cases, use the pre-configured metrics in Stack (HTTP, DB, Cache, etc.) rather than creating custom metrics.

Example:

meter := obs.Meter()
counter, _ := meter.Int64Counter("custom.requests")
counter.Add(ctx, 1)

func (*Observability) Shutdown

func (o *Observability) Shutdown(ctx context.Context) error

Shutdown performs graceful shutdown of all observability components.

This method ensures metrics are flushed and resources are properly cleaned up. It will attempt to shut down all components even if some fail, and will return an error describing all failures that occurred.

If Shutdown is called on an uninitialized Observability instance, it returns immediately without error.
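The "keep going and report every failure" behavior can be expressed with errors.Join (Go 1.20 or later). The sketch below illustrates that pattern only; `component`, `shutdownAll`, and `exampleShutdown` are hypothetical names, and the package's internal implementation may differ.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// component is a hypothetical stand-in for a metrics, tracing, or logs provider.
type component struct {
	name     string
	shutdown func(context.Context) error
}

// shutdownAll stops every component, even after a failure, and joins the errors.
func shutdownAll(ctx context.Context, components []component) error {
	var errs []error
	for _, c := range components {
		if err := c.shutdown(ctx); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", c.name, err))
		}
	}
	return errors.Join(errs...) // nil when nothing failed
}

// exampleShutdown stops three stand-in components, the second of which fails,
// and returns the names that were stopped together with the joined error.
func exampleShutdown() (stopped []string, err error) {
	record := func(name string, result error) component {
		return component{name: name, shutdown: func(context.Context) error {
			stopped = append(stopped, name)
			return result
		}}
	}
	err = shutdownAll(context.Background(), []component{
		record("metrics", nil),
		record("tracing", errors.New("exporter unreachable")),
		record("logs", nil),
	})
	return stopped, err
}
```

In `exampleShutdown` the logs component is still stopped after tracing fails, and the returned error names only the tracing failure.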

func (*Observability) Start

func (o *Observability) Start(ctx context.Context) error

Start initializes all enabled components of the observability stack.

Start must be called before using any observability features. It will return an error if called multiple times or if component initialization fails.

If no components are enabled in the configuration, Start succeeds immediately and logs a message indicating the stack is disabled.
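A guard against repeated Start calls can be as small as a mutex and a flag. This sketch uses hypothetical names (`lifecycle`, `errAlreadyStarted`) and assumes that a failed setup leaves the instance unstarted so the caller may retry; the documentation above does not say how the package treats that case.

```go
package main

import (
	"errors"
	"sync"
)

var errAlreadyStarted = errors.New("observability: already started")

// lifecycle is a hypothetical stand-in for the state Observability keeps internally.
type lifecycle struct {
	mu      sync.Mutex
	started bool
}

// start runs setup once; any call after a successful one returns errAlreadyStarted.
func (l *lifecycle) start(setup func() error) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.started {
		return errAlreadyStarted
	}
	if err := setup(); err != nil {
		return err // assumption: stay unstarted so the caller can retry
	}
	l.started = true
	return nil
}

// isInitialized reports whether start has completed successfully.
func (l *lifecycle) isInitialized() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.started
}
```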

func (*Observability) Tracer

func (o *Observability) Tracer() trace.Tracer

type Stack

type Stack struct {

	// HTTP provides metrics for HTTP request/response tracking.
	HTTP *metrics.HTTPMetrics

	// DB provides metrics for database query operations.
	DB *metrics.DBMetrics

	// Cache provides metrics for cache operations (hits, misses, etc.).
	Cache *metrics.CacheMetrics

	// Runtime provides automatic Go runtime metrics (goroutines, memory, GC, etc.).
	Runtime *metrics.RuntimeMetrics

	// System provides system-level metrics (CPU, disk, network).
	// May be nil if DisableSystemMetrics is true.
	System *metrics.SystemMetrics

	// Tracing provides distributed tracing for HTTP, DB, and Cache operations.
	// May be nil if TracingEnabled is false.
	Tracing *TracingStack

	Logs *logs.LogsProvider
	// contains filtered or unexported fields
}

Stack represents a fully initialized observability stack with pre-configured metrics collectors.

Stack provides convenient access to HTTP, database, cache, runtime, and system metrics without requiring manual initialization of each component. All metrics are ready to use immediately after calling Init().

The Stack must be properly shut down using Shutdown() to ensure metrics are flushed and resources are cleaned up.

func Init

func Init(log Logger, opts *InitOptions) (*Stack, error)

Init initializes the complete observability stack automatically.

This is the main entry point for adding observability to your application. It loads configuration from environment variables, initializes all enabled components, and returns a ready-to-use Stack with pre-configured metrics.

Init will return an error if:

  • Required environment variables are missing
  • The OTLP endpoint cannot be reached (if metrics are enabled)
  • Metrics initialization fails

Example:

stack, err := observability.Init(log, nil)
if err != nil {
    log.Fatal("failed to init observability", zap.Error(err))
}
defer func() {
    if err := stack.Shutdown(context.Background()); err != nil {
        log.Error("Failed to shutdown observability", zap.Error(err))
    }
}()

// Use pre-configured metrics
stack.HTTP.RecordRequest(ctx, "GET", "/api/users", 200, duration, reqSize, respSize)

func MustInit

func MustInit(log Logger, opts *InitOptions) *Stack

MustInit is like Init but panics if initialization fails.

This is useful for simplifying application startup when an observability initialization failure should terminate the application.

Example:

func main() {
    log := logger.Get()
    stack := observability.MustInit(log, nil)
    defer stack.Shutdown(context.Background())
    // ...
}

func (*Stack) Config

func (s *Stack) Config() config.Config

Config returns the current observability configuration.

func (*Stack) EnableLogsExport

func (s *Stack) EnableLogsExport(log Logger) error

EnableLogsExport hooks the application logger to send logs to OTLP.

This method reconfigures the logger to use a TeeCore that sends logs to both console and OTLP simultaneously.

Must be called after Init() and after the logger is initialized.

Example:

log := logger.Get()
obsStack, _ := observability.Init(log, nil)
if err := obsStack.EnableLogsExport(log); err != nil {
    log.Fatal("Failed to enable logs export", zap.Error(err))
}
log.Info("This log goes to both console and OTLP")

func (*Stack) HTTPMetricsMiddleware

func (s *Stack) HTTPMetricsMiddleware() func(http.Handler) http.Handler

HTTPMetricsMiddleware returns a Chi-compatible middleware that automatically records HTTP metrics for all requests.

The middleware tracks request count, duration, request/response sizes, and groups metrics by method, path, and status code. It integrates seamlessly with the go-chi/chi router.

This is a convenience wrapper around HTTPMetrics.Middleware() for easy access from the Stack.

Example:

stack, _ := observability.Init(log, nil)
router := chi.NewRouter()
router.Use(stack.HTTPMetricsMiddleware())
router.Get("/api/users", handler)
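For readers curious how a middleware of this shape can observe the status code and response size, here is a standard-library sketch. It is not the package's implementation: `requestRecord`, `statusWriter`, `metricsMiddleware`, and `exampleRequest` are hypothetical names, and the callback stands in for a call such as stack.HTTP.RecordRequest. It also records the raw URL path, whereas the text above describes grouping by path, which in practice usually means the route pattern.

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// requestRecord is a hypothetical summary of one handled request.
type requestRecord struct {
	Method, Path string
	Status       int
	Duration     time.Duration
	RequestSize  int64
	ResponseSize int64
}

// statusWriter remembers the first status code and counts response bytes.
type statusWriter struct {
	http.ResponseWriter
	status int
	bytes  int64
	wrote  bool
}

func (w *statusWriter) WriteHeader(code int) {
	if !w.wrote { // net/http only honors the first WriteHeader call
		w.status, w.wrote = code, true
	}
	w.ResponseWriter.WriteHeader(code)
}

func (w *statusWriter) Write(p []byte) (int, error) {
	w.wrote = true // a Write without WriteHeader implies 200
	n, err := w.ResponseWriter.Write(p)
	w.bytes += int64(n)
	return n, err
}

// metricsMiddleware calls record once per request, after the handler returns.
func metricsMiddleware(record func(requestRecord)) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			start := time.Now()
			sw := &statusWriter{ResponseWriter: w, status: http.StatusOK}
			next.ServeHTTP(sw, r)
			reqSize := r.ContentLength
			if reqSize < 0 { // -1 means the length is unknown
				reqSize = 0
			}
			record(requestRecord{
				Method: r.Method, Path: r.URL.Path, Status: sw.status,
				Duration: time.Since(start), RequestSize: reqSize, ResponseSize: sw.bytes,
			})
		})
	}
}

// exampleRequest drives the middleware in memory and returns what was recorded.
func exampleRequest() requestRecord {
	var got requestRecord
	handler := metricsMiddleware(func(rec requestRecord) { got = rec })(
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.WriteHeader(http.StatusCreated)
			w.Write([]byte("created"))
		}))
	req := httptest.NewRequest(http.MethodPost, "/api/users", strings.NewReader(`{"name":"a"}`))
	handler.ServeHTTP(httptest.NewRecorder(), req)
	return got
}
```

The `func(http.Handler) http.Handler` shape is what makes a middleware usable with router.Use in Chi and with any other net/http-compatible router.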

func (*Stack) HTTPObservabilityMiddleware

func (s *Stack) HTTPObservabilityMiddleware() func(http.Handler) http.Handler

HTTPObservabilityMiddleware returns a Chi-compatible middleware that records both HTTP metrics and distributed traces in a single pass.

This middleware uses a "lazy capture" approach: it captures the route pattern AFTER the handler executes, when Chi has fully populated the RouteContext. This allows it to work as a global middleware while still using route patterns.
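The lazy capture idea can be illustrated without Chi. In the sketch below, a mutable `routeInfo` value placed in the request context plays the role of Chi's RouteContext, and a toy `route` wrapper plays the role of the router filling it in during matching. All names are hypothetical. With Chi itself, the equivalent read after the handler returns would be `chi.RouteContext(r.Context()).RoutePattern()`.

```go
package main

import (
	"context"
	"net/http"
	"net/http/httptest"
)

// routeInfo is a mutable value shared through the request context.
type routeInfo struct{ pattern string }

type routeInfoKey struct{}

// lazyPatternMiddleware reads the pattern only after next has run.
func lazyPatternMiddleware(observe func(pattern string)) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			info := &routeInfo{}
			ctx := context.WithValue(r.Context(), routeInfoKey{}, info)
			// At this point info.pattern is still empty.
			next.ServeHTTP(w, r.WithContext(ctx))
			// Routing has happened inside next, so the pattern is now known.
			observe(info.pattern)
		})
	}
}

// route is a toy router entry that records its pattern during dispatch.
func route(pattern string, h http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if info, ok := r.Context().Value(routeInfoKey{}).(*routeInfo); ok {
			info.pattern = pattern
		}
		h.ServeHTTP(w, r)
	})
}

// exampleLazyCapture sends one request and returns the pattern the middleware saw.
func exampleLazyCapture() string {
	var seen string
	handler := lazyPatternMiddleware(func(p string) { seen = p })(
		route("/api/users/{id}", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.WriteHeader(http.StatusNoContent)
		})))
	handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/api/users/42", nil))
	return seen
}
```

Recording the pattern (`/api/users/{id}`) instead of the concrete path (`/api/users/42`) keeps the number of distinct metric label values bounded.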

func (*Stack) IsInitialized

func (s *Stack) IsInitialized() bool

IsInitialized returns true if the stack has been initialized.

func (*Stack) LogsProvider

func (s *Stack) LogsProvider() *logs.LogsProvider

func (*Stack) Meter

func (s *Stack) Meter() metric.Meter

Meter returns the OpenTelemetry meter for creating advanced custom metrics.

For most use cases, the pre-configured metrics (HTTP, DB, Cache, Runtime, System) are sufficient. Use this method only when you need to create custom metric instruments that aren't covered by the built-in metrics.

Example:

meter := stack.Meter()
customCounter, _ := meter.Int64Counter("app.custom.operations")
customCounter.Add(ctx, 1, metric.WithAttributes(
    attribute.String("operation", "process"),
))

func (*Stack) Shutdown

func (s *Stack) Shutdown(ctx context.Context) error

Shutdown performs graceful shutdown of the observability stack.

Shutdown flushes any pending metrics and cleans up resources. It should be called during application shutdown, typically in a defer statement.

The provided context controls the shutdown timeout. If the context expires before shutdown completes, an error is returned but cleanup continues for remaining components.

Example:

stack, _ := observability.Init(log, nil)
defer stack.Shutdown(context.Background())
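Since the context controls the shutdown timeout, passing context.Background() lets shutdown wait indefinitely. A bounded variant could look like the sketch below. `shutdowner`, `shutdownWithTimeout`, and `slowStack` are hypothetical names; the interface lists only the one Stack method the helper needs.

```go
package main

import (
	"context"
	"time"
)

// shutdowner is the single method of Stack that this sketch relies on.
type shutdowner interface {
	Shutdown(ctx context.Context) error
}

// shutdownWithTimeout gives s at most limit to flush and clean up.
func shutdownWithTimeout(s shutdowner, limit time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()
	return s.Shutdown(ctx)
}

// slowStack is a stand-in whose flush takes a fixed time unless ctx expires first.
type slowStack struct{ flush time.Duration }

func (s slowStack) Shutdown(ctx context.Context) error {
	select {
	case <-time.After(s.flush):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}
```

With a real stack this would be called as `defer shutdownWithTimeout(stack, 30*time.Second)`, ideally with the returned error logged. Deferred calls do not run after os.Exit, which zap's Fatal invokes, so a bounded shutdown only helps on code paths that return from main normally.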

func (*Stack) Tracer

func (s *Stack) Tracer() trace.Tracer

type TracingStack

type TracingStack struct {
	// HTTP provides HTTP request tracing and middleware.
	HTTP *tracing.HTTPTracing

	// DB provides database operation tracing.
	DB *tracing.DBTracing

	// Cache provides cache operation tracing.
	Cache *tracing.CacheTracing
}

TracingStack holds all tracing components.

Directories

Path Synopsis
cmd/example (command): cmd/example/main.go
internal/config: Package config handles all configuration loading and validation for the observability stack.
internal/logs: Package logs provides OpenTelemetry logs integration.
