AgentSmith-HUB
A high-performance security data pipeline with a real-time rules engine and deeply integrated LLM agents — built for modern SOC and detection engineering teams.
Process, enrich, detect, and respond at scale — with simple XML-based rules, CEP, rich plugins, and AI-powered analysis wired directly into the stream.

Why AgentSmith-HUB?
If you work in security operations, you probably deal with massive volumes of raw logs and alerts every day. You need to normalize, enrich, correlate, and route them — and ideally detect threats in real time, not in batch jobs. AgentSmith-HUB is built to handle all of this in a single, opinionated platform:
- High-signal detections, not dashboards — Design real-time detections and data transformations with simple, readable XML rules instead of ad‑hoc scripts
- Blazing fast at scale — 3.90M messages/sec on just 2 vCPUs (benchmark); built to sit directly in front of your SIEM / lake
- All-in-one pipeline — Input, normalization, enrichment, correlation, and output in one flow; no more glue scripts between Kafka, ES, ClickHouse, and “rule engines”
- First-class CEP — Detect ordered event sequences, absence patterns, and multi-source correlations over time with <sequence>, <threshold>, <iterator>, and <checklist>
- LLM agents in the stream — Drop LLM-powered agents into the same pipeline for alert triage, enrichment, rule authoring, and auto-whitelisting
- Comment-to-memory learning loop — Convert reviewer comments from Agent Tools Logs into durable memory_notes, auto-commit updates, and continuously improve agent behavior
- Skills system — Attach knowledge bases and operational tools to agents via Skills, with progressive disclosure so prompts stay small and fast
- Rich plugin ecosystem — Threat intel (VirusTotal, ThreatBook, Shodan), GeoIP, encoding, regex, time/window helpers, LLM calls, and more
- Production features out of the box — Cluster mode, health checks, daily stats, sample data, Push Changes / review workflow, and a modern Web UI for rule and project orchestration
Who is this for?
- SOC / CERT / CSIRT teams that want an opinionated place to run detections, triage alerts, and reduce false positives without building their own engine from scratch.
- Detection engineers / threat hunters who care about CEP, thresholds, and precise control over when an alert fires (and when it must not).
- Security platform / data teams who already own Kafka / ES / ClickHouse and want a thin, fast, open platform to orchestrate security data flows and LLM-powered analysis.
How It Works
AgentSmith-HUB uses a straightforward pipeline model:
INPUT (Kafka / SLS / ...) → RULESET / AGENT → RULESET / AGENT → OUTPUT (Kafka / ES / ClickHouse / SLS / ...)
Rulesets and agents can be freely chained within a Project, giving you full control over data flow and allowing you to mix “hard” rules with “soft” LLM judgement in the same stream.
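As a rough mental model only (this is not how AgentSmith-HUB is implemented), the chain can be pictured as an ordered list of stages, where each stage receives an event and either returns a possibly enriched event or drops it. The stage functions `ruleset_high_severity` and `agent_triage` and the field values below are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, Optional

Event = dict
# A stage returns the (possibly modified) event, or None to drop it.
Stage = Callable[[Event], Optional[Event]]


def ruleset_high_severity(event: Event) -> Optional[Event]:
    """Hypothetical 'hard' rule: keep only high-severity events."""
    return event if event.get("severity") == "high" else None


def agent_triage(event: Event) -> Optional[Event]:
    """Hypothetical stand-in for an LLM agent: attaches a fixed score."""
    return {**event, "llm_confidence": 0.5}


def run_pipeline(events: Iterable[Event], stages: list[Stage]) -> Iterator[Event]:
    """Pass each event through the stages in order; stop early if dropped."""
    for event in events:
        current: Optional[Event] = event
        for stage in stages:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            yield current


events = [{"id": 1, "severity": "high"}, {"id": 2, "severity": "low"}]
results = list(run_pipeline(events, [ruleset_high_severity, agent_triage]))
print(results)
```

Only the first event survives the rule stage, so only it reaches the agent stage and picks up the extra field.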

Core Components at a Glance
- INPUT: Connects to streaming sources like Kafka, Aliyun SLS, and cloud-managed Kafka variants; supports Grok parsing and JSON so you normalize once and reuse everywhere.
- RULESET: XML-based real-time rules engine with checks, checklists, thresholds (count / SUM / CLASSIFY), CEP sequences, iterators, and data append/modify/del — all executed strictly in the order you write them.
- AGENT: LLM-powered node that runs in the same pipeline as rulesets; for each event it can call an LLM (with tools and skills) to score, enrich, or auto-generate rules/whitelists, then forward the enriched event downstream.
- OUTPUT: Sends processed data to Kafka, Elasticsearch (v7/v8/v9), ClickHouse, or simple print, with batching, time-based flush, TLS/auth, and idempotent Kafka producers for safe delivery.
- SKILL: Reusable capability module for agents — knowledge skills provide on-demand reference content, builtin skills expose Go-implemented tools like hub_ruleset_editor for ruleset CRUD.
- PLUGIN: Extensible function system powering checks, enrichment, and actions: GeoIP, URL parsing, encoding, time window helpers, threat intelligence lookups, single-shot LLM calls, and more — all composable directly in rules.
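The OUTPUT entry mentions batching combined with a time-based flush. The sketch below illustrates that general pattern in Python; it is not the HUB's code, and the names `BatchBuffer`, `max_size`, and `max_age_s` are assumptions made up for this example. A batch is sent when it is full, or when its oldest event has waited too long.

```python
import time
from typing import Callable, Optional


class BatchBuffer:
    """Collects events and flushes on batch size or elapsed time."""

    def __init__(self, flush: Callable[[list], None], max_size: int = 3,
                 max_age_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self.flush_fn = flush
        self.max_size = max_size
        self.max_age_s = max_age_s
        self.clock = clock
        self.items: list = []
        self.first_at: Optional[float] = None

    def add(self, event: dict) -> None:
        if not self.items:
            self.first_at = self.clock()  # age is measured from the first event
        self.items.append(event)
        if len(self.items) >= self.max_size:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes a partial batch that has waited too long."""
        if self.items and self.clock() - self.first_at >= self.max_age_s:
            self.flush()

    def flush(self) -> None:
        if self.items:
            batch, self.items = self.items, []
            self.first_at = None
            self.flush_fn(batch)


# A fake clock keeps the demonstration deterministic.
now = [0.0]
sent: list = []
buf = BatchBuffer(sent.append, max_size=3, max_age_s=5.0, clock=lambda: now[0])
for i in range(4):
    buf.add({"id": i})
now[0] = 6.0
buf.tick()
print([len(batch) for batch in sent])
```

The first three events leave as a full batch; the fourth leaves on its own once the age limit passes.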
Web UI & API Highlights
- Visual rule and project editing: Rich browser UI for editing rulesets with syntax help, validation, and GIF-level feedback; drag-style project orchestration to define INPUT → RULESET / AGENT → OUTPUT flows.
- One-click testing everywhere: Built-in test runners for Output, Ruleset, Plugin, Agent, and Project components (including sample data capture), so you can validate changes before they hit real outputs.
- Operations, errors, and cluster view: Dedicated views for error logs, operations history (project start/stop/restart, config changes, agent tool calls), and basic cluster status so you can see what is running where.
- Safe change management: All edits go through temporary configs, diff & review, and then Push Changes to apply — the platform automatically figures out affected projects and restarts them safely.
- HTTP API for automation: JSON APIs mirror the UI capabilities (component CRUD, project lifecycle, testing), so you can integrate AgentSmith-HUB into CI/CD, internal portals, or automation scripts.
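The change-management point says the platform works out which projects a change affects. How it does so is not described here; one plausible approach, sketched below under the assumption that each project can be reduced to the set of component names it uses, is a simple set intersection. The project names and most component names are hypothetical.

```python
def affected_projects(projects: dict[str, set[str]], changed: set[str]) -> list[str]:
    """Return, sorted, the projects that use at least one changed component."""
    return sorted(
        name for name, components in projects.items() if components & changed
    )


# Hypothetical projects, each reduced to the components it references.
projects = {
    "alert_triage": {"INPUT.kafka_alerts", "AGENT.alert_reviewer", "OUTPUT.enriched_alerts"},
    "k8s_audit": {"INPUT.k8s_audit", "RULESET.k8s_baseline", "OUTPUT.enriched_alerts"},
    "sysmon": {"INPUT.sysmon", "RULESET.sysmon_baseline", "OUTPUT.sysmon_alerts"},
}

shared_output = affected_projects(projects, {"OUTPUT.enriched_alerts"})
single_rule = affected_projects(projects, {"RULESET.sysmon_baseline"})
print(shared_output, single_rule)
```

Editing a shared output touches every project that writes to it, while editing one ruleset touches only the project that uses it.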
Rules Engine in 60 Seconds
At the heart of AgentSmith-HUB is a streaming rules engine designed for security detections:
- Checks & checklists: Match on strings, numbers, regex, and plugins; combine conditions with AND/OR/NOT using logical expressions.
- Thresholds & windows: Detect frequency, sums, or distinct counts over sliding time windows (e.g. brute-force, spray, exfil).
- CEP sequences: Express ordered multi-event patterns and absence (e.g. login -> !mfa, recon -> exploit -> exfil) with <sequence>.
- Data shaping: Enrich, modify, or delete fields in place, and call plugins to pull in external context or compute derived fields.
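To make the thresholds-and-windows idea concrete, here is a small Python sketch of a per-key sliding-window count. It illustrates the concept only and is not the engine's implementation; the class name `SlidingCount`, its parameters, and the sample data are hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque


class SlidingCount:
    """Fires when a key is seen at least `limit` times within `window_s` seconds."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self.seen: dict[str, deque] = defaultdict(deque)

    def observe(self, key: str, ts: float) -> bool:
        times = self.seen[key]
        times.append(ts)
        # Evict timestamps that have fallen out of the window.
        while times and times[0] <= ts - self.window_s:
            times.popleft()
        return len(times) >= self.limit


# Hypothetical brute-force check: 3 failed logins from one IP within 60 seconds.
detector = SlidingCount(limit=3, window_s=60)
failed_logins = [("10.0.0.5", 0), ("10.0.0.5", 20), ("10.0.0.9", 25),
                 ("10.0.0.5", 90), ("10.0.0.5", 100), ("10.0.0.5", 110)]
hits = [(ip, ts) for ip, ts in failed_logins if detector.observe(ip, ts)]
print(hits)
```

The two early attempts from 10.0.0.5 age out before the later burst, so the detector fires only on the third attempt of that burst. In AgentSmith-HUB itself this kind of logic is declared in rules rather than written by hand.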
A minimal example that enriches with threat intel and then detects on the enriched field:
<rule id="enrich_and_detect" name="Enrich with TI then alert">
  <append type="PLUGIN" field="threat_info">threatbook(src_ip)</append>
  <check type="EQU" field="threat_info.severity">high</check>
  <append field="alert_level">critical</append>
</rule>
For the full syntax (all operations, modes, and best practices), see the Complete Guide.
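The absence pattern login -> !mfa listed among the CEP capabilities can be hard to picture, because a missing event can only be confirmed after a waiting period. The Python sketch below shows one way to think about it; it is an illustration under stated assumptions, not the engine's algorithm. The class `LoginWithoutMfa`, the event fields `ts`, `user`, and `type`, and the 30-second window are all hypothetical, and events are assumed to arrive in timestamp order.

```python
class LoginWithoutMfa:
    """Flags users whose login is not followed by an MFA event within `window_s`."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        self.pending: dict[str, float] = {}  # user -> login timestamp

    def observe(self, event: dict) -> list[str]:
        """Feed one event (in timestamp order); return users that just timed out."""
        now, user, kind = event["ts"], event["user"], event["type"]
        # Absence can only be confirmed once the window has fully elapsed.
        expired = sorted(u for u, t in self.pending.items() if now - t > self.window_s)
        for u in expired:
            del self.pending[u]
        if kind == "login":
            self.pending[user] = now
        elif kind == "mfa":
            self.pending.pop(user, None)
        return expired


matcher = LoginWithoutMfa(window_s=30)
stream = [
    {"ts": 0, "user": "alice", "type": "login"},
    {"ts": 5, "user": "bob", "type": "login"},
    {"ts": 10, "user": "alice", "type": "mfa"},
    {"ts": 40, "user": "carol", "type": "login"},
]
alerts = [user for event in stream for user in matcher.observe(event)]
print(alerts)
```

Alice completes MFA in time and is cleared. Bob never does, and is reported when a later event shows that his window has passed. A real engine would typically also use a timer so that an alert does not have to wait for the next event.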
LLM Agents & Skills
Agents are LLM-powered components that sit in the pipeline alongside rulesets. They process events independently, call an LLM with tool-use support, and forward enriched results downstream.
# Agent: AI-powered alert triage
model: gpt-4o-mini
system_prompt: |
  For each alert, add llm_confidence (0-1) and llm_analysis fields.
skills:
  - hub_ruleset_expert   # Knowledge skill: rules engine reference
tools: all               # Expose all plugins as LLM tools
max_rounds: 3
timeout: 30s
# Optional long-term memory (recommended as YAML sequence)
memory_notes:
  - Keep output JSON compact and stable.
  - Treat routine CI scanner traffic as lower priority unless other signals exist.
Skills provide modular capabilities to agents:
- Knowledge skills — Reference docs loaded on-demand (progressive disclosure)
- Builtin skills — Go-implemented tools (e.g., hub_ruleset_editor for reading/writing rulesets)
Quick production tips:
- Prefer tools: [] by default and allowlist only the plugin tools you need.
- Use tools: all only for broad assistant agents (rule-authoring / deep triage).
- In cluster mode, memory write/generate actions must go to the leader node.
Use agents in your project like any other component:
content: |
  INPUT.kafka_alerts -> AGENT.alert_reviewer
  AGENT.alert_reviewer -> OUTPUT.enriched_alerts
For full agent details (fields like reasoning_mode, reasoning_budget_tokens, memory_notes, and memory workflow in UI/API), see the Complete Guide.
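The project content above is a list of edges of the form TYPE.name -> TYPE.name. For automation scripts that need to inspect such a flow, a minimal Python sketch of turning it into an adjacency map might look like the following. The function `parse_flow` is hypothetical and assumes exactly one edge per line; the HUB's own parser may accept more than this.

```python
from collections import defaultdict


def parse_flow(content: str) -> dict[str, list[str]]:
    """Turn 'A -> B' lines into an adjacency map of component names."""
    edges: dict[str, list[str]] = defaultdict(list)
    for raw in content.splitlines():
        line = raw.strip()
        if not line:
            continue
        src, sep, dst = line.partition("->")
        if not sep or not src.strip() or not dst.strip():
            raise ValueError(f"not a flow edge: {line!r}")
        edges[src.strip()].append(dst.strip())
    return dict(edges)


flow = parse_flow("""
INPUT.kafka_alerts -> AGENT.alert_reviewer
AGENT.alert_reviewer -> OUTPUT.enriched_alerts
""")
print(flow)
```

The resulting map can then be walked to check, for example, that every path starting at an INPUT eventually reaches an OUTPUT.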
Built-in Detection Rulesets
AgentSmith-HUB ships with production-ready detection rulesets that you can deploy immediately — no rule-writing required. All rules are mapped to MITRE ATT&CK for seamless integration with your security workflows.
Built-in K8s Ruleset Files
AgentSmith-HUB includes Kubernetes security rulesets out of the box. You can use them directly without writing custom XML first:
- config/ruleset/k8s_security/k8s_audit_baseline.xml
- config/ruleset/k8s_security/k8s_audit_intrusion.xml
Recommended onboarding flow:
- Import both built-in rulesets.
- Route Kubernetes audit logs to these rulesets in your Project.
- Verify detections in test mode with real sample events.
- Tune thresholds (if needed) for your cluster's normal behavior.
Sysmon Endpoint Security (Windows)
Two Sysmon detection rulesets are provided for medium/high-confidence endpoint detection use cases, along with a strict allowlist template:
- config/ruleset/sysmon_security/sysmon_baseline.xml
- config/ruleset/sysmon_security/sysmon_intrusion.xml
- config/ruleset/sysmon_security/sysmon_exclude.xml (strict allowlist template)
Recommended onboarding flow for Sysmon:
- Ensure your input normalizes core Sysmon fields used by rulesets.
- Import sysmon_baseline.xml first and validate behavior in test mode.
- Import sysmon_intrusion.xml and tune based on your endpoint baseline.
- Add environment-specific allowlists with a separate EXCLUDE ruleset if needed.
More built-in rulesets for additional data sources are on the roadmap. Contributions are welcome!
Features at a Glance
- Rule Editing
- Rule Testing
- Project Orchestration
- Plugin Testing
- Input Connection Check
- Search
- Error Logs & Operations History
- Comment-to-memory learning loop
Deployment
- Download and extract the release archive to /opt/agentsmith-hub
- Copy the config folder:
  cp -r /opt/agentsmith-hub/config /opt/hub_config
- Configure Redis in /opt/hub_config/config.yaml
- Start the service:
  # Leader mode (default)
  ./start.sh
  # Follower mode (uses the same Redis as leader)
  ./start.sh --follower
  # See all options
  ./start.sh --help
- The access token is generated at /etc/hub/.token on first run
- Install and configure Nginx:
  sudo cp /opt/agentsmith-hub/nginx/nginx.conf /etc/nginx/
  sudo nginx -s reload
- Open http://your-host in your browser (port 80)
Documentation
License
AgentSmith-HUB is licensed under the Apache License 2.0 with the Commons Clause restriction.
You are free to use, modify, and deploy this software — the restriction only prevents selling the software itself as a commercial product or service. Internal enterprise use is fully permitted.







