s3spectre

Version: v0.2.1
Published: Feb 23, 2026
License: MIT

S3Spectre

Static and runtime auditor for AWS S3 bucket drift, unused resources, and lifecycle misconfigurations.

Part of the Spectre family of infrastructure cleanup tools.

What it is

S3Spectre correlates S3 bucket references found in your codebase against live AWS state. It operates in two modes:

  • Scan mode cross-references code against AWS to detect drift: missing buckets, stale prefixes, version sprawl, and lifecycle gaps.
  • Discover mode inspects your AWS account directly to find unused, unencrypted, or publicly accessible buckets without requiring any code.

Both modes produce deterministic, machine-readable output suitable for CI/CD gating.

What it is NOT

  • Not a replacement for AWS Config Rules or GuardDuty. S3Spectre does not monitor in real time.
  • Not a data scanner. It never reads object contents, only metadata.
  • Not a remediation tool. It reports problems and lets you decide what to do.
  • Not a cost calculator. It identifies waste but does not estimate dollar amounts.
  • Not a security scanner. Encryption and public access checks are surface-level flags, not a compliance audit.

Philosophy

Principiis obsta -- resist the beginnings.

Infrastructure drift is not a detection problem. It is a structural problem. By the time a missing bucket breaks a deployment, the damage is done. S3Spectre is designed to surface these conditions early -- in CI, in code review, in scheduled audits -- so they can be addressed before they matter.

The tool presents evidence and lets humans decide. It does not auto-remediate, does not guess intent, and does not assign confidence scores where deterministic checks suffice.

Installation

# Homebrew
brew install ppiankov/tap/s3spectre

# Docker
docker pull ghcr.io/ppiankov/s3spectre:latest

# From source
git clone https://github.com/ppiankov/s3spectre.git
cd s3spectre && make build

Quick start

# Scan mode: correlate code references against AWS
s3spectre scan --repo .

# Discover mode: audit all buckets in your AWS account
s3spectre discover

# SARIF output for GitHub Security tab
s3spectre scan --repo . --format sarif --output results.sarif

# Baseline: suppress known findings on repeat runs
s3spectre scan --repo . --format json --output baseline.json --update-baseline
s3spectre scan --repo . --baseline baseline.json

Requires valid AWS credentials (environment, profile, or IAM role).
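
The baseline workflow in the quick start can be pictured as a set difference: findings already recorded in the baseline are dropped, and only new ones are reported. The sketch below is an illustration under assumptions, not S3Spectre's implementation. The `Finding` type, its fields, and the `suppressKnown` helper are hypothetical, since the README does not document the report schema.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is a hypothetical record; the real report schema is not documented here.
type Finding struct {
	Bucket string
	Prefix string
	Status string // for example "MISSING_BUCKET" or "STALE_PREFIX"
}

// key builds a stable identity so the same finding matches across runs.
func (f Finding) key() string {
	return f.Status + "|" + f.Bucket + "|" + f.Prefix
}

// suppressKnown returns the findings in current that are absent from baseline,
// sorted by key so repeated runs print them in the same order.
func suppressKnown(baseline, current []Finding) []Finding {
	known := make(map[string]struct{}, len(baseline))
	for _, f := range baseline {
		known[f.key()] = struct{}{}
	}
	fresh := []Finding{}
	for _, f := range current {
		if _, seen := known[f.key()]; !seen {
			fresh = append(fresh, f)
		}
	}
	sort.Slice(fresh, func(i, j int) bool { return fresh[i].key() < fresh[j].key() })
	return fresh
}

func main() {
	baseline := []Finding{
		{Bucket: "logs-archive", Prefix: "2019/", Status: "STALE_PREFIX"},
	}
	current := []Finding{
		{Bucket: "logs-archive", Prefix: "2019/", Status: "STALE_PREFIX"},
		{Bucket: "media-uploads", Status: "MISSING_BUCKET"},
	}
	// Only the finding that is not in the baseline is printed.
	for _, f := range suppressKnown(baseline, current) {
		fmt.Printf("%s %s %s\n", f.Status, f.Bucket, f.Prefix)
	}
}
```

Keying on status, bucket, and prefix is one possible choice; a real baseline may use a different identity.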

Usage

Scan mode

Cross-references S3 references in code with live AWS state.

# Basic scan
s3spectre scan --repo ./my-repo

# Specific AWS profile and regions
s3spectre scan --repo . --aws-profile production --regions us-east-1,eu-west-1

# JSON output for CI/CD
s3spectre scan --repo . --format json --output report.json

# Fail the pipeline on drift
s3spectre scan --repo . --fail-on-missing --fail-on-stale --stale-days 60

# Enable unused bucket detection
s3spectre scan --repo . --check-unused --fail-on-unused

# Include file-level reference details
s3spectre scan --repo . --include-references --format json
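
The `--fail-on-*` flags can be read as a mapping from finding statuses to a process exit code. The following is a minimal sketch of that idea, not the tool's code. The `GateOptions` struct and `exitCode` function are hypothetical, the exit value `1` is an assumption (the README only says "non-zero"), and the pairing of each flag with exactly one status is inferred from the flag descriptions.

```go
package main

import "fmt"

// GateOptions mirrors the documented --fail-on-* scan flags.
// The struct and its field names are hypothetical.
type GateOptions struct {
	FailOnMissing       bool
	FailOnStale         bool
	FailOnVersionSprawl bool
	FailOnUnused        bool
}

// exitCode returns 1 when any enabled gate matches a finding status, else 0.
// Assumption: the tool's actual non-zero value is not documented.
func exitCode(opts GateOptions, statuses []string) int {
	gated := map[string]bool{
		"MISSING_BUCKET": opts.FailOnMissing,
		"STALE_PREFIX":   opts.FailOnStale,
		"VERSION_SPRAWL": opts.FailOnVersionSprawl,
		"UNUSED_BUCKET":  opts.FailOnUnused,
	}
	for _, s := range statuses {
		if gated[s] {
			return 1
		}
	}
	return 0
}

func main() {
	opts := GateOptions{FailOnMissing: true, FailOnStale: true}
	statuses := []string{"OK", "VERSION_SPRAWL", "STALE_PREFIX"}
	// VERSION_SPRAWL is not gated here, but STALE_PREFIX is.
	fmt.Println("exit code:", exitCode(opts, statuses))
}
```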

Scan flags:

Flag                      Default  Description
--repo, -r                .        Repository path to scan
--aws-profile             (none)   AWS profile
--aws-region              (none)   Single region mode
--all-regions             true     Scan all enabled regions
--regions                 (none)   Specific regions (comma-separated)
--stale-days              90       Stale prefix threshold (days)
--check-unused            false    Enable unused bucket scoring
--unused-threshold-days   180      Unused bucket threshold (days)
--concurrency             10       Max concurrent S3 API calls
--format, -f              text     Output format: text, json, or sarif
--output, -o              stdout   Output file
--fail-on-missing         false    Exit non-zero on missing buckets
--fail-on-stale           false    Exit non-zero on stale prefixes
--fail-on-version-sprawl  false    Exit non-zero on version sprawl
--fail-on-unused          false    Exit non-zero on unused buckets
--include-references      false    Include reference details in output
--no-progress             false    Disable TTY progress indicators

Discover mode

Audits all S3 buckets in an AWS account without requiring code references.

# Discover all buckets across all regions
s3spectre discover

# Security surface checks
s3spectre discover --check-encryption --check-public

# Custom staleness thresholds
s3spectre discover --age-threshold-days 730 --inactive-days 365

# CI/CD gating
s3spectre discover --fail-on-unused --fail-on-risky --format json
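
The `--age-threshold-days` and `--inactive-days` flags suggest two independent day-count checks per bucket. The sketch below illustrates that reading only. `BucketInfo`, `flagBucket`, the labels `OLD` and `INACTIVE`, and the fixed reference time are all hypothetical, and how S3Spectre actually combines the two signals is not documented in this README.

```go
package main

import (
	"fmt"
	"time"
)

// BucketInfo is a hypothetical summary of bucket metadata.
type BucketInfo struct {
	Name         string
	Created      time.Time
	LastModified time.Time // most recent modification observed in the bucket
}

// refNow is a fixed reference time so the example is reproducible.
var refNow = time.Date(2026, time.February, 23, 0, 0, 0, 0, time.UTC)

// daysAgo returns the instant n days before now.
func daysAgo(now time.Time, n int) time.Time {
	return now.Add(-time.Duration(n) * 24 * time.Hour)
}

// flagBucket returns hypothetical labels for a bucket that exceeds the
// age threshold, the inactivity threshold, or both.
func flagBucket(b BucketInfo, now time.Time, ageDays, inactiveDays int) []string {
	const day = 24 * time.Hour
	flags := []string{}
	if now.Sub(b.Created) > time.Duration(ageDays)*day {
		flags = append(flags, "OLD")
	}
	if now.Sub(b.LastModified) > time.Duration(inactiveDays)*day {
		flags = append(flags, "INACTIVE")
	}
	return flags
}

func main() {
	b := BucketInfo{
		Name:         "legacy-exports",
		Created:      daysAgo(refNow, 800),
		LastModified: daysAgo(refNow, 200),
	}
	// Thresholds from the custom example above: 730 and 365 days.
	fmt.Println(b.Name, flagBucket(b, refNow, 730, 365))
	// Documented defaults: 365 and 180 days.
	fmt.Println(b.Name, flagBucket(b, refNow, 365, 180))
}
```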

Discover flags:

Flag                  Default  Description
--aws-profile         (none)   AWS profile
--all-regions         true     Scan all enabled regions
--regions             (none)   Specific regions (comma-separated)
--age-threshold-days  365      Flag buckets older than N days
--inactive-days       180      Flag buckets inactive for N days
--check-encryption    false    Flag missing encryption
--check-public        false    Flag public access
--concurrency         10       Max concurrent S3 API calls
--format, -f          text     Output format: text or json
--output, -o          stdout   Output file
--fail-on-unused      false    Exit non-zero on unused buckets
--fail-on-risky       false    Exit non-zero on risky configs
--no-progress         false    Disable TTY progress indicators

Drift classifications

Scan mode classifies each bucket and prefix into one of:

Status               Meaning
MISSING_BUCKET       Referenced in code, does not exist in AWS
UNUSED_BUCKET        Exists in AWS, not referenced in code
MISSING_PREFIX       Code references a prefix with no objects
STALE_PREFIX         Prefix exists but has not been modified for N days (N is set by --stale-days)
VERSION_SPRAWL       Versioning enabled, no lifecycle rules
LIFECYCLE_MISCONFIG  Many objects, no lifecycle rules
OK                   Bucket and prefix match expected usage
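
The statuses in the table above can be expressed as a pure function over observed facts, which is also what makes the classification deterministic. This is a sketch under assumptions rather than the analyzer's actual logic. The `Observation` type, the order in which the rules are checked, the `>=` comparison for staleness, and the `manyObjects` threshold are all hypothetical, since the README does not define precedence or what counts as "many objects".

```go
package main

import "fmt"

// Observation is a hypothetical bundle of facts about one bucket and prefix.
type Observation struct {
	ReferencedInCode  bool
	ExistsInAWS       bool
	PrefixObjects     int // objects found under the referenced prefix
	DaysSinceModified int // days since the prefix was last modified
	VersioningEnabled bool
	HasLifecycleRules bool
	ObjectCount       int // total objects in the bucket
}

// classify maps an observation to one status from the table.
// Assumption: rules are evaluated top to bottom and the first match wins.
func classify(o Observation, staleDays, manyObjects int) string {
	switch {
	case o.ReferencedInCode && !o.ExistsInAWS:
		return "MISSING_BUCKET"
	case !o.ReferencedInCode && o.ExistsInAWS:
		return "UNUSED_BUCKET"
	case o.PrefixObjects == 0:
		return "MISSING_PREFIX"
	case o.DaysSinceModified >= staleDays:
		return "STALE_PREFIX"
	case o.VersioningEnabled && !o.HasLifecycleRules:
		return "VERSION_SPRAWL"
	case !o.HasLifecycleRules && o.ObjectCount >= manyObjects:
		return "LIFECYCLE_MISCONFIG"
	default:
		return "OK"
	}
}

func main() {
	o := Observation{
		ReferencedInCode:  true,
		ExistsInAWS:       true,
		PrefixObjects:     42,
		DaysSinceModified: 120,
	}
	// With the default --stale-days of 90, a prefix untouched for 120 days is stale.
	fmt.Println(classify(o, 90, 100000))
}
```

Because `classify` reads no clock, network, or global state, the same observation always yields the same status.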

Architecture

s3spectre/
├── cmd/s3spectre/main.go       # Entry point, delegates to commands
├── internal/
│   ├── commands/               # Cobra CLI: scan, discover, version
│   │   ├── root.go
│   │   ├── scan.go
│   │   ├── discover.go
│   │   ├── helpers.go          # Shared: error enhancement, status output
│   │   └── version.go
│   ├── scanner/                # Repository scanning (regex, YAML, Terraform, JSON, .env)
│   │   ├── scanner.go          # Orchestrator: walks files, dispatches to parsers
│   │   ├── regex.go            # S3 URL and bucket name pattern matching
│   │   ├── yaml.go
│   │   ├── terraform.go
│   │   ├── json.go
│   │   ├── env.go
│   │   └── types.go
│   ├── s3/                     # AWS S3 integration
│   │   ├── client.go           # S3 client wrapper with retry and backoff
│   │   ├── inspector.go        # Concurrent bucket and prefix inspection
│   │   └── types.go
│   ├── analyzer/               # Drift analysis and scoring
│   │   ├── analyzer.go         # Scan mode: code-vs-AWS correlation
│   │   ├── discovery.go        # Discover mode: account-wide heuristics
│   │   └── types.go
│   └── report/                 # Output generation
│       ├── text.go
│       ├── json.go
│       ├── discovery.go
│       └── types.go
├── Makefile
├── go.mod
└── go.sum

Key design decisions:

  • cmd/s3spectre/main.go is minimal -- a single Execute() call.
  • All logic lives in internal/ to prevent external import.
  • S3 API calls use a bounded worker pool (--concurrency) with exponential backoff.
  • Scanner dispatches files to format-specific parsers based on extension.
  • Analysis is deterministic: same inputs always produce the same classifications.
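
The bounded worker pool with exponential backoff mentioned above can be sketched as follows. This is an illustration of the general pattern, not the code in `internal/s3`. The `inspectAll` function, its parameters, and the `errThrottled` sentinel are hypothetical, and the retry count and base delay are arbitrary example values.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errThrottled is a hypothetical error used to simulate a retryable failure.
var errThrottled = errors.New("simulated throttling")

// inspectAll calls inspect for every bucket using at most `concurrency`
// worker goroutines. Each failed call is retried up to maxAttempts times,
// doubling the delay between attempts.
func inspectAll(buckets []string, concurrency, maxAttempts int, baseDelay time.Duration,
	inspect func(bucket string) error) map[string]error {

	if concurrency < 1 {
		concurrency = 1
	}
	results := make(map[string]error, len(buckets))
	jobs := make(chan string)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for bucket := range jobs {
				var err error
				delay := baseDelay
				for attempt := 1; attempt <= maxAttempts; attempt++ {
					if err = inspect(bucket); err == nil {
						break
					}
					if attempt < maxAttempts {
						time.Sleep(delay)
						delay *= 2 // exponential backoff
					}
				}
				mu.Lock()
				results[bucket] = err
				mu.Unlock()
			}
		}()
	}
	for _, b := range buckets {
		jobs <- b
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	var mu sync.Mutex
	calls := map[string]int{}
	// flaky fails the first call for each bucket, then succeeds.
	flaky := func(bucket string) error {
		mu.Lock()
		defer mu.Unlock()
		calls[bucket]++
		if calls[bucket] < 2 {
			return errThrottled
		}
		return nil
	}
	buckets := []string{"alpha", "beta", "gamma"}
	results := inspectAll(buckets, 2, 3, 10*time.Millisecond, flaky)
	for _, b := range buckets {
		fmt.Printf("%s: err=%v\n", b, results[b])
	}
}
```

A production client would typically also add jitter and distinguish retryable errors from permanent ones such as access denied; both are omitted here for brevity.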

Known limitations

  • No object-level scanning. S3Spectre inspects bucket and prefix metadata. It does not list or read individual objects beyond what is needed for prefix existence and staleness checks.
  • Regex-based code scanning. The scanner uses pattern matching, not AST parsing. It will miss dynamically constructed bucket names and may produce false positives on commented-out code.
  • No cost estimation. The tool identifies unused resources but does not calculate storage costs.
  • IAM permissions required. Needs s3:ListBucket, s3:ListAllMyBuckets, s3:GetBucketLocation, s3:GetBucketVersioning, s3:GetLifecycleConfiguration, and s3:GetBucketTagging. Missing permissions produce access-denied errors, not silent failures.
  • No real-time monitoring. S3Spectre is a point-in-time scanner, not a daemon. Run it in CI or on a schedule.
  • Single AWS account. Cross-account scanning is not supported.
  • Progress line artifacts. The TTY progress indicator uses carriage return without clearing the full line, so shorter bucket names leave trailing characters from the previous name. Cosmetic only.
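
The regex-based scanning limitation above is easy to reproduce. The sketch below uses a hypothetical `s3://` pattern (S3Spectre's actual patterns are not shown in this README) and a hypothetical `findBuckets` helper to show why a commented-out reference is still reported while a bucket name assembled at runtime is missed.

```go
package main

import (
	"fmt"
	"regexp"
)

// s3URL is a hypothetical pattern for s3://bucket/prefix references.
// It is an assumption for illustration, not the scanner's real pattern.
var s3URL = regexp.MustCompile(`s3://([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])(/[^\s"']*)?`)

// findBuckets returns every bucket name the pattern matches in src,
// in order of appearance.
func findBuckets(src string) []string {
	names := []string{}
	for _, m := range s3URL.FindAllStringSubmatch(src, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	src := `
upload("s3://media-uploads/raw/")      // literal reference: matched
// upload("s3://old-bucket/tmp/")      commented out: still matched
bucket := "s3://" + env + "-reports"   // built at runtime: not matched
`
	fmt.Println(findBuckets(src))
}
```

The pattern has no notion of comments or string concatenation, so it matches text wherever it appears and cannot see values that only exist at runtime.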

Roadmap

  • Cost estimation for unused and stale resources
  • Deep prefix scanning with pagination
  • Replication rule validation
  • IAM access path analysis
  • Naming convention enforcement
  • Historical trend tracking
  • SpectreHub integration for cross-tool correlation

License

MIT License -- see LICENSE.

Contributing

See CONTRIBUTING.md. Issues and pull requests welcome.

Part of the Spectre family: VaultSpectre | ClickSpectre | KafkaSpectre

Directories

Path           Synopsis
cmd/s3spectre  command
internal/s3    (none)
