Directories
| Path | Synopsis |
|---|---|
| cmd | |
| cmd/ingero (command) | Package main is the entry point for the ingero CLI. |
| cmd/ingero-alerter (command) | ingero-alerter is a sidecar that consumes straggler events from the Ingero agent's remediation UDS socket and dispatches them to Slack incoming webhooks and/or PagerDuty Events API v2. |
| cmd/straggler-sink (command) | Command straggler-sink is a reference consumer for the ingero remediation UDS event stream. |
| docs | Package docs exposes documentation files that ship inside the ingero binary so subcommands can use them as offline fallbacks. |
| internal | |
| internal/alerter | Package alerter consumes straggler NDJSON events from the agent's remediation UDS socket and dispatches them to one or more notification backends (Slack incoming webhooks, PagerDuty Events API v2). |
| internal/auth | Package auth holds bearer-token parsing + constant-time compare. |
| internal/cgroup | Package cgroup extracts container IDs from Linux cgroup paths. |
| internal/cli | Package cli: explain.go implements `ingero explain`, which produces automated incident reports with multi-layer causal chains. |
| internal/config | Package config loads the agent's YAML configuration. |
| internal/correlate | Package correlate provides cross-layer correlation between host kernel events, system metrics, and CUDA latency statistics. |
| internal/dashboard | Package dashboard provides an HTTPS dashboard server for Ingero. |
| internal/discover | Package discover detects CUDA processes, libraries, and system capabilities. |
| internal/ebpf/blockio | Package blockio manages eBPF tracepoints for block I/O request tracing. |
| internal/ebpf/cuda | Package cuda manages eBPF uprobes for CUDA Runtime API tracing. |
| internal/ebpf/cudagraph | Package cudagraph manages eBPF uprobes for CUDA Graph lifecycle tracing. |
| internal/ebpf/host | Package host manages eBPF tracepoints for host kernel event tracing. |
| internal/ebpf/kernellaunch | Package kernellaunch is the v0.15 item M loader. |
| internal/ebpf/memfrag | Package memfrag is the v0.15 W1 IOCTL-kprobe loader. |
| internal/ebpf/ncclprobe | Package ncclprobe attaches eBPF uprobes to NCCL collective entry points (ncclCommInitRank, ncclCommDestroy, ncclAllReduce, ncclAllGather, ncclReduceScatter, ncclBcast) in libnccl.so or statically-linked-NCCL hosts (libtorch_cuda.so etc). |
| internal/ebpf/net | Package net manages eBPF tracepoints for network socket I/O tracing. |
| internal/ebpf/parity | Package parity hosts cross-architecture parity assertions for the per-arch BPF artifacts produced by bpf2go's `-target amd64,arm64` mode. |
| internal/ebpf/pytrace | Package pytrace provides Go-side helpers for the in-kernel CPython frame walker. |
| internal/ebpf/tcp | Package tcp manages eBPF tracepoints for TCP retransmission tracing. |
| internal/export | Package export provides OTEL-compatible metric and trace export. |
| internal/filter | Package filter provides suppression filters for system metric snapshots. |
| internal/fleet | Package fleet provides an HTTP fan-out client for querying multiple Ingero nodes and concatenating results. |
| internal/health | Package health implements the agent-side health score used by Ingero Fleet to detect stragglers across a GPU cluster. |
| internal/k8s | Package k8s provides a lightweight Kubernetes API client for pod metadata enrichment and GPU pod discovery. |
| internal/kprobe | Package kprobe holds the experimental closed-driver kprobe surface (W1 memfrag, W2 throttle, kernel grid/block dims) shipped behind the --enable-experimental-kprobes flag. |
| internal/mcp | Package mcp provides an MCP (Model Context Protocol) server for Ingero. |
| internal/migrate | Package migrate ships the v0.11 framework for SQLite schema migrations on the local trace DB at ~/.ingero/ingero.db. |
| internal/nvml | Package nvml exposes a minimal, pure-Go wrapper around NVML clock-throttle reasons via a `nvidia-smi` subprocess. |
| internal/orchestrator | Package orchestrator detects which workload orchestrator the agent is running under (Slurm, Docker, ECS, K8s, or none) and surfaces a stable per-process identity for cost-attribution and per-job correlation. |
| internal/procpath | Package procpath provides helpers for accessing files in another process's mount namespace via /proc/<pid>/root/. |
| internal/provider | Package provider auto-detects the cloud provider an agent is running on, by probing the standard instance-metadata endpoints. |
| internal/sampling | Package sampling provides a mode-controlled, edge-triggered event sampler. |
| internal/stats | Package stats provides rolling latency statistics, time-fraction breakdown, periodic spike detection, and anomaly flagging for events from all sources. |
| internal/store | Package store provides SQLite-based persistent storage for events. |
| internal/straggler | Package straggler detects CPU scheduling contention correlated with GPU throughput drops and emits StraggleState messages for remediation. |
| internal/support | Package support builds a single tarball that an operator can attach to a support case. |
| internal/symtab | Package symtab provides userspace symbol resolution for stack traces. |
| internal/synth | gpu_steal.go: GPU time-slicing scenario (renamed from gpu-contention). |
| internal/sysinfo | Package sysinfo reads system-level CPU, memory, and load metrics from /proc. |
| internal/update | Package update provides a background version update checker. |
| internal/version | Package version provides build-time version information. |
| pkg | |
| pkg/contract | Package contract defines the shared constants for the Ingero Fleet interface contract. |
| pkg/events | Package events defines shared event types mirroring bpf/common.bpf.h structs. |
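
The auth entry above mentions bearer-token parsing and a constant-time compare. As a rough illustration of that technique only, the sketch below uses the standard library's `crypto/subtle`. The names `parseBearer` and `tokensEqual` are hypothetical and are not taken from the internal auth package, whose actual API is not shown on this page.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
	"strings"
)

// parseBearer extracts the token from an Authorization header value of the
// form "Bearer <token>". Hypothetical helper, not the real auth API.
func parseBearer(header string) (string, bool) {
	scheme, token, found := strings.Cut(header, " ")
	if !found || !strings.EqualFold(scheme, "Bearer") {
		return "", false
	}
	token = strings.TrimSpace(token)
	return token, token != ""
}

// tokensEqual compares two tokens without an early exit on the first
// differing byte. Both inputs are hashed first so the compared slices
// always have the same length.
func tokensEqual(got, want string) bool {
	g := sha256.Sum256([]byte(got))
	w := sha256.Sum256([]byte(want))
	return subtle.ConstantTimeCompare(g[:], w[:]) == 1
}

func main() {
	token, ok := parseBearer("Bearer s3cret")
	fmt.Println(ok && tokensEqual(token, "s3cret"))
}
```

Hashing before the comparison is one common way to keep the token length from influencing timing. Whether the package does the same is an assumption.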
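
The cgroup entry above describes extracting container IDs from Linux cgroup paths. A minimal sketch of one way to do that is shown here, assuming the ID appears as a 64-character lowercase hex string somewhere in the path (as it commonly does for Docker and containerd). The function name `containerID` and the sample path are hypothetical and do not come from the package itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// containerIDRe matches a run of 64 lowercase hex characters, the usual
// shape of a full container ID. This pattern is an assumption for the sketch.
var containerIDRe = regexp.MustCompile(`[0-9a-f]{64}`)

// containerID returns the first 64-hex-character run found in a cgroup path,
// and false when the path contains none.
func containerID(cgroupPath string) (string, bool) {
	id := containerIDRe.FindString(cgroupPath)
	return id, id != ""
}

func main() {
	// Illustrative path only; real layouts vary by runtime and cgroup driver.
	path := "/system.slice/docker-0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef.scope"
	id, ok := containerID(path)
	fmt.Println(id, ok)
}
```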