Cerebro

Security Data Platform for Cloud and SaaS Posture Management

Cerebro is a comprehensive security platform that combines cloud asset discovery, policy evaluation, compliance reporting, AI-powered investigation, and automated remediation workflows.

Features

  • Cloud Asset Discovery - Ingest configurations from AWS, GCP, Azure, and Kubernetes via native scanners
  • Policy Engine - Cedar-style policies for security evaluation with custom condition support
  • Parallel Scanning - High-performance scanning with configurable worker pools
  • Compliance Frameworks - Pre-built mappings for SOC 2, CIS, PCI DSS, HIPAA, NIST 800-53
  • AI Agents - LLM-powered security investigation with Anthropic Claude and OpenAI GPT
  • Deep Research Agent - Code-to-cloud security analysis bridging source code and live cloud inspection
  • Distributed Job Queue - SQS + DynamoDB based job system for scalable distributed processing
  • Identity Governance - Access reviews, stale access detection, and risk scoring
  • Attack Path Analysis - Graph-based visualization of potential attack paths
  • Integrations - Jira, Linear, Slack, PagerDuty, and custom webhooks
  • Scheduled Operations - Automated scanning with configurable intervals

Architecture

┌────────────────────────────────────────────────────────────────────────────┐
│                              CEREBRO PLATFORM                              │
│                                                                            │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐  │
│  │   CLI    │   │ REST API │   │ Webhooks │   │Scheduler │   │  Agents  │  │
│  └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘   └────┬─────┘  │
│       └──────────────┴──────────────┴──────────────┴──────────────┘        │
│                                     │                                      │
│                       ┌─────────────▼─────────────┐                        │
│                       │   Application Container   │                        │
│                       │  Policy│Scanner│Findings  │                        │
│                       └─────────────┬─────────────┘                        │
│                                     │                                      │
└─────────────────────────────────────┼──────────────────────────────────────┘
                                      │
        ┌─────────────────────────────┼─────────────────────────────┐
        ▼                             ▼                             ▼
  ┌───────────┐                ┌─────────────┐                ┌───────────┐
  │ Snowflake │◀───────────────│ Native Sync │                │ External  │
  │ (Storage) │                │ (Ingestion) │                │   APIs    │
  └───────────┘                └─────────────┘                └───────────┘
        │                             │                             │
  AWS/GCP/Azure                Cloud Providers                Jira/Slack/PD
   Kubernetes                        SaaS Apps                 Anthropic/OpenAI

Quick Start

Prerequisites
  • Go 1.23+
  • Snowflake account
Installation
# Clone repository
git clone https://github.com/writer/cerebro.git
cd cerebro

# Install dependencies
make setup

# Build
make build
Configuration
# Copy environment template
cp .env.example .env

# Required: Snowflake connection
export SNOWFLAKE_CONNECTION_STRING="user:pass@account/CEREBRO/CEREBRO"

# Optional: AI agents
export ANTHROPIC_API_KEY="sk-ant-..."

# Optional: Notifications
export SLACK_WEBHOOK_URL="https://hooks.slack.com/..."

# Optional: Ticketing
export JIRA_BASE_URL="https://company.atlassian.net"
export JIRA_API_TOKEN="..."
Local Mode (No Snowflake)

For local development, you can run Cerebro without Snowflake credentials:

unset SNOWFLAKE_PRIVATE_KEY SNOWFLAKE_ACCOUNT SNOWFLAKE_USER
export CEREBRO_DB_PATH=.cerebro/cerebro.db
make serve

In local mode, findings are persisted to SQLite. Snowflake-backed capabilities (for example direct data-lake query endpoints and security graph population) are reduced or unavailable.

Running
# Start API server
./bin/cerebro serve

# Or with make
make serve

# Development mode
make dev

CLI Commands

# Start API server
cerebro serve

# Start distributed job worker
cerebro worker

# Run code-to-cloud security analysis
cerebro agent run --repo-url https://github.com/org/repo
cerebro agent run --resource arn:aws:s3:::my-bucket --aws-region us-east-1

# Run distributed analysis (enqueue jobs to SQS)
cerebro agent run --repo-url https://github.com/org/repo --distributed --wait

# Sync cloud data via native scanners
cerebro sync
cerebro sync --gcp --gcp-project my-project
cerebro sync --azure

# Policy management
cerebro policy list
cerebro policy validate
cerebro policy test <policy-id> <asset.json>

# Query Snowflake
cerebro query "SELECT * FROM aws_s3_buckets LIMIT 10"
cerebro query --format json "SELECT * FROM aws_iam_users"

# Bootstrap database
cerebro bootstrap

API Overview

Endpoint                                             Description
GET   /health                                        Health check
GET   /ready                                         Readiness with dependency status
GET   /metrics                                       Prometheus metrics
GET   /api/v1/tables                                 List Snowflake tables
POST  /api/v1/query                                  Execute SQL query
GET   /api/v1/policies                               List loaded policies
POST  /api/v1/policies/evaluate                      Evaluate policy
GET   /api/v1/findings                               List findings
POST  /api/v1/findings/scan                          Trigger policy scan
GET   /api/v1/compliance/frameworks                  List frameworks
GET   /api/v1/compliance/frameworks/{id}/pre-audit   Pre-audit check
POST  /api/v1/agents/sessions                        Create agent session
POST  /api/v1/agents/sessions/{id}/messages          Send message to agent
GET   /api/v1/identity/stale-access                  Detect stale access
POST  /api/v1/attack-paths/analyze                   Analyze attack paths
POST  /api/v1/webhooks                               Register webhook

See API Reference for complete documentation.
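
As a quick illustration, here is a minimal Go client for the query endpoint. The request body shape ({"query": ...}) and the generic response decoding are assumptions for this sketch, not a documented contract; consult the API Reference for the actual schema.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Request body shape is an assumption for this sketch.
	body, err := json.Marshal(map[string]string{
		"query": "SELECT * FROM aws_s3_buckets LIMIT 10",
	})
	if err != nil {
		panic(err)
	}
	resp, err := http.Post("http://localhost:8080/api/v1/query",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Decode generically since the response schema is not shown here.
	var result map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
		panic(err)
	}
	fmt.Println(resp.Status, result)
}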


Policies

Policies are JSON files defining security checks:

{
    "id": "aws-s3-bucket-no-public-access",
    "name": "S3 Bucket Public Access",
    "description": "S3 buckets should not allow public access",
    "effect": "forbid",
    "conditions": ["block_public_acls != true"],
    "severity": "critical",
    "tags": ["cis-aws-2.1.5", "security", "s3"]
}
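
To make the condition semantics concrete, the sketch below shows one way a simple "field != value" condition could be checked against an asset's attribute map. It is illustrative only; the actual engine implements a richer Cedar-style grammar with custom condition support.

package main

import (
	"fmt"
	"strings"
)

// evaluate reports whether the condition holds for the asset.
// Only the "!=" operator is handled in this sketch.
func evaluate(condition string, asset map[string]any) bool {
	parts := strings.SplitN(condition, "!=", 2)
	if len(parts) != 2 {
		return false // unsupported operator in this sketch
	}
	field := strings.TrimSpace(parts[0])
	want := strings.TrimSpace(parts[1])
	got := fmt.Sprintf("%v", asset[field])
	return got != want
}

func main() {
	asset := map[string]any{"block_public_acls": false}
	// A "forbid" policy fires when its conditions match the asset.
	fmt.Println(evaluate("block_public_acls != true", asset)) // true → finding
}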
Policy Directory
policies/
├── aws/           # AWS policies (S3, IAM, EC2, RDS)
├── gcp/           # GCP policies (Storage, Compute, IAM)
├── azure/         # Azure policies (Storage, VM)
└── kubernetes/    # Kubernetes policies (Pods, RBAC)

See Policy Documentation for writing custom policies.


Compliance

Supported Frameworks
  • SOC 2 Type II - Trust Services Criteria
  • CIS AWS Foundations - v1.4.0 Benchmark
  • CIS GCP Foundations - v1.3.0 Benchmark
  • PCI DSS - v4.0
  • HIPAA - Security Rule
  • NIST 800-53 - Rev 5
Pre-Audit Check
curl http://localhost:8080/api/v1/compliance/frameworks/soc2/pre-audit

Returns estimated audit outcome, failing controls, and remediation recommendations.


AI Agents

Cerebro includes AI-powered security investigation agents:

Available Agents
Agent                Provider           Purpose
security-analyst     Anthropic Claude   Security finding investigation
incident-responder   OpenAI GPT         Incident triage and response
Usage
# Create session
curl -X POST http://localhost:8080/api/v1/agents/sessions \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "security-analyst", "user_id": "analyst@company.com"}'

# Send message
curl -X POST http://localhost:8080/api/v1/agents/sessions/{id}/messages \
  -H "Content-Type: application/json" \
  -d '{"content": "Investigate the public S3 bucket findings"}'
Agent Tools
  • query_snowflake - Execute SQL queries
  • list_findings - List security findings
  • get_asset - Get asset details
  • evaluate_policy - Test policy against asset
  • search_logs - Search audit logs

Identity & Access Review

Stale Access Detection
curl http://localhost:8080/api/v1/identity/stale-access

Detects:

  • Inactive users (90+ days)
  • Unused access keys
  • Stale service accounts
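
A minimal sketch of the inactivity check implied by the 90-day threshold above; the function and field names are hypothetical, not Cerebro's internals.

package main

import (
	"fmt"
	"time"
)

const staleAfter = 90 * 24 * time.Hour

// isStale reports whether an identity's last activity exceeds the threshold.
func isStale(lastActivity time.Time) bool {
	return time.Since(lastActivity) > staleAfter
}

func main() {
	lastLogin := time.Now().Add(-120 * 24 * time.Hour) // example: 120 days ago
	fmt.Println(isStale(lastLogin))                    // true
}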
Access Reviews
# Create review
curl -X POST http://localhost:8080/api/v1/identity/reviews \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Q1 2024 Access Review",
    "type": "user_access",
    "scope": {"providers": ["aws", "gcp"]}
  }'

Webhooks

Register webhooks for real-time event notifications:

curl -X POST http://localhost:8080/api/v1/webhooks \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/webhook",
    "events": ["finding.created", "scan.completed"],
    "secret": "webhook-secret"
  }'
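
On the receiving side, you will typically verify deliveries with the registered secret. The sketch below assumes HMAC-SHA256 over the raw body with the hex digest in an X-Cerebro-Signature header; the header name and signing scheme are assumptions, so check the webhook documentation for the actual contract.

package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"io"
	"log"
	"net/http"
)

func verify(secret string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "read error", http.StatusBadRequest)
			return
		}
		// Recompute the HMAC over the raw body and compare in constant time.
		mac := hmac.New(sha256.New, []byte(secret))
		mac.Write(body)
		want := hex.EncodeToString(mac.Sum(nil))
		got := r.Header.Get("X-Cerebro-Signature") // header name assumed
		if !hmac.Equal([]byte(got), []byte(want)) {
			http.Error(w, "bad signature", http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusNoContent) // delivery accepted
	}
}

func main() {
	log.Fatal(http.ListenAndServe(":9000", verify("webhook-secret")))
}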
Event Types
  • finding.created / finding.resolved / finding.suppressed
  • scan.completed
  • review.started / review.completed
  • attack_path.found
  • ticket.created

Distributed Job System

Cerebro includes a distributed job queue for scalable security analysis across large repositories and cloud environments.

Architecture
┌──────────────────────────────────────────────────────────────────────────┐
│                          DISTRIBUTED JOB SYSTEM                          │
│                                                                          │
│  ┌────────────────┐      ┌──────────────┐      ┌────────────────┐        │
│  │    API/CLI     │─────▶│  SQS Queue   │◀─────│    Workers     │        │
│  │ (Orchestrator) │      │              │      │ (N instances)  │        │
│  └───────┬────────┘      └──────┬───────┘      └───────┬────────┘        │
│          │                      │                      │                 │
│          │                      ▼                      │                 │
│          │              ┌──────────────┐               │                 │
│          └─────────────▶│   DynamoDB   │◀─────────────┘                 │
│                         │  Job Store   │                                │
│                         └──────────────┘                                │
└──────────────────────────────────────────────────────────────────────────┘
Components
  • Job Manager: Enqueues inspection jobs and tracks batch completion
  • SQS Queue: Distributes work with visibility timeout and dead-letter queue
  • DynamoDB Store: Persists job state with lease-based claiming for exactly-once execution (see the sketch after this list)
  • Workers: Poll SQS, claim jobs, execute inspections, update results
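
The lease-based claim can be pictured as a single DynamoDB conditional write: a worker takes a job only if it is unleased or its previous lease has expired. The sketch below uses aws-sdk-go-v2 with assumed table and attribute names (job_id, worker_id, lease_until); Cerebro's actual schema may differ.

package jobs

import (
	"context"
	"errors"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

// claimJob attempts to take a lease on jobID for workerID. It returns
// false (without error) when another worker already holds a live lease.
func claimJob(ctx context.Context, db *dynamodb.Client, table, jobID, workerID string, lease time.Duration) (bool, error) {
	now := time.Now().UTC()
	_, err := db.UpdateItem(ctx, &dynamodb.UpdateItemInput{
		TableName: aws.String(table),
		Key: map[string]types.AttributeValue{
			"job_id": &types.AttributeValueMemberS{Value: jobID},
		},
		// The conditional write is what makes claiming atomic:
		// succeed only if no lease exists or the old one expired.
		ConditionExpression: aws.String("attribute_not_exists(lease_until) OR lease_until < :now"),
		UpdateExpression:    aws.String("SET worker_id = :w, lease_until = :u"),
		ExpressionAttributeValues: map[string]types.AttributeValue{
			":now": &types.AttributeValueMemberS{Value: now.Format(time.RFC3339)},
			":w":   &types.AttributeValueMemberS{Value: workerID},
			":u":   &types.AttributeValueMemberS{Value: now.Add(lease).Format(time.RFC3339)},
		},
	})
	var ccf *types.ConditionalCheckFailedException
	if errors.As(err, &ccf) {
		return false, nil // another worker holds the lease
	}
	return err == nil, err
}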
Usage
# Set up infrastructure (via Pulumi)
cd infra && pulumi up --stack prod

# Run orchestrator to enqueue jobs
cerebro agent run --repo-url https://github.com/org/repo --distributed

# Run workers (scale horizontally)
cerebro worker --concurrency 4

# Or wait for completion
cerebro agent run --repo-url https://github.com/org/repo --distributed --wait
Infrastructure (Pulumi)

The distributed job infrastructure is managed via Pulumi in infra/:

  • SQS queue with dead-letter queue for failed jobs
  • DynamoDB table with GSI for group/status queries
  • Worker ECS service with auto-scaling based on queue depth
  • CloudWatch alarms for DLQ messages and queue backlog

Development

# Run tests
make test

# Run with coverage
go test -v -cover ./...

# Lint
make lint

# Build Docker image
make docker-build

See Development Guide for detailed instructions.


Documentation

Document        Description
Architecture    System architecture and design
API Reference   Complete API documentation
Packages        Internal package documentation
Configuration   Environment variables and setup
Policies        Policy authoring guide
Development     Development guide

Environment Variables

Variable                      Description                          Default
API_PORT                      Server port                          8080
LOG_LEVEL                     Log verbosity                        info
SNOWFLAKE_CONNECTION_STRING   Snowflake DSN                        -
POLICIES_PATH                 Policy directory                     policies
ANTHROPIC_API_KEY             Claude API key                       -
OPENAI_API_KEY                OpenAI API key                       -
JIRA_BASE_URL                 Jira instance                        -
SLACK_WEBHOOK_URL             Slack webhook                        -
SCAN_INTERVAL                 Scan frequency                       -
JOB_QUEUE_URL                 SQS queue URL for distributed jobs   -
JOB_TABLE_NAME                DynamoDB table for job state         -
JOB_REGION                    AWS region for job infrastructure    -
JOB_WORKER_CONCURRENCY        Concurrent jobs per worker           4

See Configuration for all options.


Stack

Component        Technology
Language         Go 1.23+
API Framework    Chi
Database         Snowflake
Data Ingestion   Native scanners
Policy Engine    Cedar-style JSON
CLI              Cobra
Metrics          Prometheus
AI               Anthropic, OpenAI

License

Apache 2.0

Directories

Path               Synopsis
cmd
  cerebro          command
  policy-enhancer  Command policy-enhancer adds compliance framework mappings and risk categories to policies
internal
  api
  app              Package app provides the main application container that wires together all Cerebro services and manages their lifecycle.
  auth             Package auth provides role-based access control (RBAC) and multi-tenant authentication capabilities for the Cerebro platform.
  cerrors          Package cerrors provides sentinel errors and error handling utilities for Cerebro.
  cli
  compliance       Package compliance provides compliance framework definitions and report generation.
  findings         Package findings provides unified context-aware risk scoring for vulnerability prioritization.
  health           Package health provides health check functionality for monitoring application component status.
  k8s
  lineage          Package lineage provides deployment lineage tracking to connect runtime cloud assets back to their source code, container images, and IaC definitions.
  policy           Package policy implements a policy engine for evaluating cloud security policies against cloud resources.
  remediation      Package remediation provides automated response and remediation capabilities for security findings.
  runtime          Package runtime provides real-time threat detection and response capabilities for cloud-native workloads.
  scm
  server           Package server provides HTTP server with graceful shutdown support.
  threatintel      Package threatintel provides EPSS (Exploit Prediction Scoring System) integration for probability-based vulnerability prioritization.
  worker           Package worker provides utilities for managing concurrent work with proper error handling and context cancellation.
tools
  linters          module
