Published: Feb 6, 2026 License: MIT


Tapio

Edge Intelligence for Kubernetes

eBPF-based agent that captures kernel-level events, filters to anomalies at the edge (~1%), and sends enriched events to AHTI for root cause analysis.


What Makes Tapio Different

Tapio doesn't just collect data - it learns baselines and only sends what matters.

| Traditional Observability | Tapio (Edge Intelligence) |
|---|---|
| Send everything | Filter to ~1% (anomalies only) |
| Central processing | Edge filtering |
| High bandwidth | Low bandwidth |
| Noise | Signal |

```
eBPF Kernel Events (millions/sec)
        │
        ▼
   Edge Filtering
   (RTT baseline learning,
    memory pressure detection)
        │
        ▼
   ~1% Anomalies ──────▶ AHTI (Central Intelligence)
                         └─▶ Root cause analysis
```

Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    TAPIO (Edge - Per Node)                       │
│                                                                  │
│  ┌──────────────────────┐                                        │
│  │   eBPF Observers     │                                        │
│  │                      │                                        │
│  │  • network (TCP/DNS) │                                        │
│  │  • container (OOM)   │                                        │
│  │  • node (PMC)        │                                        │
│  └──────────┬───────────┘                                        │
│             │                                                    │
│             ▼                                                    │
│      Filter (~1%)                                                │
│      (anomalies)                                                 │
│             │                                                    │
│             ▼                                                    │
│        POLKU ────────────────────────────────────────────────────┤
└─────────────────────────┬───────────────────────────────────────┘
                          │
┌─────────────────────────┴───────────────────────────────────────┐
│                    PORTTI (Cluster - 1-2 replicas)               │
│                                                                  │
│  K8s API Watcher: Deployments, Pods, Services, Nodes, Events     │
│  Sends 100% (low volume, causal anchors)                         │
│             │                                                    │
│             ▼                                                    │
│        POLKU ────────────────────────────────────────────────────┤
└─────────────────────────┬───────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    AHTI (Central)                                │
│                                                                  │
│              Receives → Learns → Correlates                      │
│                                                                  │
│      "Deployment X at T=0 → OOM at T=5min → Root Cause"          │
│                                                                  │
│                   Never watches anything                         │
└──────────────────────────────────────────────────────────────────┘
```

Key insight:

  • TAPIO: eBPF kernel events → Filter to 1% (only anomalies)
  • PORTTI: K8s API events → Send 100% (they're causal anchors)
  • AHTI: Never watches - only receives and correlates

Observers

eBPF Observers (Kernel Level)
| Observer | Captures | Filters To |
|---|---|---|
| Network | TCP states, DNS, retransmits | RTT spikes >2x baseline, connection failures |
| Container | OOM kills, process exits | OOM, error exits (code ≠ 0) |
| Node | PMC, cgroup metrics | Memory pressure >80%, CPU throttling |
Prometheus Scraping
| Observer | Captures | Sends |
|---|---|---|
| Scheduler | kube-scheduler metrics | Scheduling latency, queue depth |

Note: K8s API watching (Deployments, Pods, Services, Nodes, Events) moved to PORTTI.
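Scraping the scheduler amounts to fetching its `/metrics` text exposition and keeping the series of interest. A minimal sketch that parses a sample payload in memory; the metric name and payload are illustrative, and a real observer would issue an HTTP GET to the scheduler's metrics endpoint instead:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseMetrics extracts (series, value) pairs from a Prometheus text
// exposition, skipping comment and blank lines.
func parseMetrics(payload string) [][2]string {
	var out [][2]string
	sc := bufio.NewScanner(strings.NewReader(payload))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// A simple sample line is "name{labels} value".
		fields := strings.Fields(line)
		if len(fields) == 2 {
			out = append(out, [2]string{fields[0], fields[1]})
		}
	}
	return out
}

func main() {
	payload := `# HELP scheduler_pending_pods Number of pending pods.
scheduler_pending_pods{queue="active"} 3
scheduler_pending_pods{queue="backoff"} 1
`
	for _, m := range parseMetrics(payload) {
		fmt.Println(m[0], "=", m[1])
	}
}
```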


Status

Production Ready
| Component | Coverage | Description |
|---|---|---|
| Supervisor | 89.8% | Observer lifecycle |
| Network Observer | 78% | eBPF TCP/DNS/RTT |
| Scheduler Observer | 85.2% | Prometheus scraping |
| K8s Context | - | Pod metadata enrichment |
In Progress
| Component | Status |
|---|---|
| Container Observer | eBPF code written, needs compilation |
| Node Observer | PMC metrics, cgroup integration |

Quick Start

```bash
# Prerequisites: Linux, Go 1.24+, Kubernetes cluster

git clone https://github.com/yairfalse/tapio
cd tapio

# Build
make build

# Run with OTLP export (FREE tier)
./bin/tapio --observer=network

# Run with POLKU (connects to AHTI via gRPC gateway)
./bin/tapio --observer=network --polku=localhost:50051
```

Environment Variables

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317   # OTLP collector
POLKU_ENDPOINT=localhost:50051               # POLKU gateway for AHTI
KUBECONFIG=~/.kube/config                    # K8s access
```

Edge Filtering Examples

Network Observer

```c
// In eBPF: Only emit when RTT spikes >2x baseline OR >500ms
if (rtt_us > (baseline->baseline_us * 2) || rtt_us > 500000) {
    emit_rtt_spike_event();  // ~1% of traffic
}
```

Container Observer

```go
// Only send OOM kills and error exits
if evt.Type == EventTypeOOMKill || classification.Category == ExitCategoryError {
    publish(event)  // Skip normal exits
}
```
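The `classification.Category` check implies exit codes are bucketed before publishing. A hypothetical bucketing: the category names mirror the snippet above, and mapping 137 (128+SIGKILL) to OOM is a common container convention, not necessarily Tapio's exact rule.

```go
package main

import "fmt"

// ExitCategory buckets container exit codes; names mirror the README
// snippet but are not Tapio's actual types.
type ExitCategory int

const (
	ExitCategoryNormal ExitCategory = iota
	ExitCategoryOOM
	ExitCategoryError
)

// classifyExit maps an exit code to a category. 137 = 128+SIGKILL often
// accompanies an OOM kill; any other non-zero code counts as an error.
func classifyExit(code int) ExitCategory {
	switch {
	case code == 0:
		return ExitCategoryNormal
	case code == 137:
		return ExitCategoryOOM
	default:
		return ExitCategoryError
	}
}

func main() {
	// Only the non-Normal categories would survive the edge filter.
	for _, code := range []int{0, 1, 137} {
		fmt.Println(code, "->", classifyExit(code) != ExitCategoryNormal)
	}
}
```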

Project Structure

```
tapio/
├── internal/
│   ├── observers/
│   │   ├── network/        # eBPF TCP/DNS/RTT
│   │   ├── container/      # eBPF OOM/exits
│   │   ├── node/           # eBPF PMC/cgroup
│   │   ├── deployments/    # K8s API
│   │   └── scheduler/      # K8s Events
│   ├── runtime/
│   │   └── supervisor/     # Observer lifecycle
│   └── services/
│       └── k8scontext/     # Pod metadata enrichment
├── pkg/
│   ├── domain/             # Event types (ObserverEvent)
│   └── intelligence/       # POLKU routing
└── docs/
    └── designs/            # Architecture docs
```

Documentation


| Project | Description |
|---|---|
| PORTTI | K8s API watcher - Deployments, Pods, Services, Nodes |
| AHTI | Central Intelligence - receives events, builds causality graph |
| POLKU | Event router - transforms raw events to AhtiEvent |
| Sykli | CI in your language |

Why "Tapio"?

Finnish god of forests. Watches over the ecosystem.

Tapio watches Kubernetes at the kernel level - network packets, container lifecycle, node health. It sees what APM tools miss.


License

Apache 2.0

Directories

| Path | Synopsis |
|---|---|
| api/v1alpha1 | Package v1alpha1 contains API Schema definitions for the tapio v1alpha1 API group. |
| internal/services/k8scontext | Package k8scontext provides in-memory K8s metadata lookup for eBPF enrichment. |
| pkg/publisher | Package publisher provides the PolkuPublisher for sending events to POLKU gateway. |
| correlation | module |
| humanoutput | module |
