s3spectre

Version v0.2.1 (latest), published Feb 23, 2026. License: MIT

Repository

github.com/ppiankov/s3spectre

README

S3Spectre

Static and runtime auditor for AWS S3 bucket drift, unused resources, and lifecycle misconfigurations.

Part of the Spectre family of infrastructure cleanup tools.

What it is

S3Spectre correlates S3 bucket references found in your codebase against live AWS state. It operates in two modes:

Scan mode cross-references code against AWS to detect drift: missing buckets, stale prefixes, version sprawl, and lifecycle gaps.
Discover mode inspects your AWS account directly to find unused, unencrypted, or publicly accessible buckets without requiring any code.

Both modes produce deterministic, machine-readable output suitable for CI/CD gating.

What it is NOT

Not a replacement for AWS Config Rules or GuardDuty. S3Spectre does not monitor in real time.
Not a data scanner. It never reads object contents, only metadata.
Not a remediation tool. It reports problems and lets you decide what to do.
Not a cost calculator. It identifies waste but does not estimate dollar amounts.
Not a security scanner. Encryption and public access checks are surface-level flags, not a compliance audit.

Philosophy

Principiis obsta -- resist the beginnings.

Infrastructure drift is not a detection problem. It is a structural problem. By the time a missing bucket breaks a deployment, the damage is done. S3Spectre is designed to surface these conditions early -- in CI, in code review, in scheduled audits -- so they can be addressed before they matter.

The tool presents evidence and lets humans decide. It does not auto-remediate, does not guess intent, and does not assign confidence scores where deterministic checks suffice.

Installation

# Homebrew
brew install ppiankov/tap/s3spectre

# Docker
docker pull ghcr.io/ppiankov/s3spectre:latest

# From source
git clone https://github.com/ppiankov/s3spectre.git
cd s3spectre && make build

Quick start

# Scan mode: correlate code references against AWS
s3spectre scan --repo .

# Discover mode: audit all buckets in your AWS account
s3spectre discover

# SARIF output for GitHub Security tab
s3spectre scan --repo . --format sarif --output results.sarif

# Baseline: suppress known findings on repeat runs
s3spectre scan --repo . --format json --output baseline.json --update-baseline
s3spectre scan --repo . --baseline baseline.json

Requires valid AWS credentials (environment, profile, or IAM role).
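
The baseline workflow above can be pictured as set subtraction over finding fingerprints. The following is a minimal Go sketch of that idea, not S3Spectre's actual implementation: the `Finding` struct, the `fingerprint` key, and `filterBaseline` are hypothetical names, and the real baseline file format is not documented in this README.

```go
package main

import "fmt"

// Finding is a hypothetical, simplified finding record.
// The real baseline format may carry different fields.
type Finding struct {
	Status string // e.g. "MISSING_BUCKET"
	Bucket string
	Prefix string
}

// fingerprint builds a stable key so the same finding matches across runs.
func fingerprint(f Finding) string {
	return f.Status + "|" + f.Bucket + "|" + f.Prefix
}

// filterBaseline returns the findings in current that are absent from baseline.
func filterBaseline(current, baseline []Finding) []Finding {
	known := make(map[string]struct{}, len(baseline))
	for _, f := range baseline {
		known[fingerprint(f)] = struct{}{}
	}
	var fresh []Finding
	for _, f := range current {
		if _, ok := known[fingerprint(f)]; !ok {
			fresh = append(fresh, f)
		}
	}
	return fresh
}

func main() {
	baseline := []Finding{{Status: "STALE_PREFIX", Bucket: "logs", Prefix: "2019/"}}
	current := []Finding{
		{Status: "STALE_PREFIX", Bucket: "logs", Prefix: "2019/"},
		{Status: "MISSING_BUCKET", Bucket: "uploads"},
	}
	// Only the finding that is not in the baseline is reported.
	for _, f := range filterBaseline(current, baseline) {
		fmt.Printf("%s %s\n", f.Status, f.Bucket)
	}
}
```

Keying on status, bucket, and prefix keeps the comparison independent of run order, which fits the deterministic output described earlier.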

Usage

Scan mode

Cross-references S3 references in code with live AWS state.

# Basic scan
s3spectre scan --repo ./my-repo

# Specific AWS profile and regions
s3spectre scan --repo . --aws-profile production --regions us-east-1,eu-west-1

# JSON output for CI/CD
s3spectre scan --repo . --format json --output report.json

# Fail the pipeline on drift
s3spectre scan --repo . --fail-on-missing --fail-on-stale --stale-days 60

# Enable unused bucket detection
s3spectre scan --repo . --check-unused --fail-on-unused

# Include file-level reference details
s3spectre scan --repo . --include-references --format json

Scan flags:

Flag	Default	Description
`--repo, -r`	`.`	Repository path to scan
`--aws-profile`		AWS profile
`--aws-region`		Single region mode
`--all-regions`	`true`	Scan all enabled regions
`--regions`		Specific regions (comma-separated)
`--stale-days`	`90`	Stale prefix threshold
`--check-unused`	`false`	Enable unused bucket scoring
`--unused-threshold-days`	`180`	Unused bucket threshold
`--concurrency`	`10`	Max concurrent S3 API calls
`--format, -f`	`text`	Output format: `text`, `json`, or `sarif` (as used in Quick start)
`--output, -o`	stdout	Output file
`--fail-on-missing`	`false`	Exit non-zero on missing buckets
`--fail-on-stale`	`false`	Exit non-zero on stale prefixes
`--fail-on-version-sprawl`	`false`	Exit non-zero on version sprawl
`--fail-on-unused`	`false`	Exit non-zero on unused buckets
`--include-references`	`false`	Include reference details in output
`--no-progress`	`false`	Disable TTY progress indicators
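
One way to picture how the `--fail-on-*` flags turn findings into a pipeline result is the Go sketch below. It is illustrative only: `FailFlags` and `exitCode` are hypothetical names, and the exit value `1` is an assumption, since the table above only promises a non-zero exit.

```go
package main

import "fmt"

// FailFlags mirrors the --fail-on-* switches from the scan flags table.
type FailFlags struct {
	Missing, Stale, VersionSprawl, Unused bool
}

// exitCode returns 1 if any status enabled in flags appears in statuses, else 0.
// The concrete value 1 is an assumption; the README only says "non-zero".
func exitCode(statuses []string, flags FailFlags) int {
	gate := map[string]bool{
		"MISSING_BUCKET": flags.Missing,
		"STALE_PREFIX":   flags.Stale,
		"VERSION_SPRAWL": flags.VersionSprawl,
		"UNUSED_BUCKET":  flags.Unused,
	}
	for _, s := range statuses {
		if gate[s] {
			return 1
		}
	}
	return 0
}

func main() {
	statuses := []string{"OK", "STALE_PREFIX"}
	// Stale prefixes are present, but only missing buckets are gated here.
	fmt.Println(exitCode(statuses, FailFlags{Missing: true}))
	// Gating on stale prefixes as well makes the same findings fail the run.
	fmt.Println(exitCode(statuses, FailFlags{Missing: true, Stale: true}))
}
```

Because every gate defaults to `false`, a scan with no `--fail-on-*` flags reports findings without failing the pipeline.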

Discover mode

Audits all S3 buckets in an AWS account without requiring code references.

# Discover all buckets across all regions
s3spectre discover

# Security surface checks
s3spectre discover --check-encryption --check-public

# Custom staleness thresholds
s3spectre discover --age-threshold-days 730 --inactive-days 365

# CI/CD gating
s3spectre discover --fail-on-unused --fail-on-risky --format json

Discover flags:

Flag	Default	Description
`--aws-profile`		AWS profile
`--all-regions`	`true`	Scan all enabled regions
`--regions`		Specific regions (comma-separated)
`--age-threshold-days`	`365`	Flag buckets older than N days
`--inactive-days`	`180`	Flag buckets inactive for N days
`--check-encryption`	`false`	Flag missing encryption
`--check-public`	`false`	Flag public access
`--concurrency`	`10`	Max concurrent S3 API calls
`--format, -f`	`text`	Output format: `text` or `json`
`--output, -o`	stdout	Output file
`--fail-on-unused`	`false`	Exit non-zero on unused buckets
`--fail-on-risky`	`false`	Exit non-zero on risky configs
`--no-progress`	`false`	Disable TTY progress indicators

Drift classifications

Scan mode classifies each bucket and prefix into one of:

Status	Meaning
`MISSING_BUCKET`	Referenced in code, does not exist in AWS
`UNUSED_BUCKET`	Exists in AWS, not referenced in code
`MISSING_PREFIX`	Code references a prefix with no objects
`STALE_PREFIX`	Prefix exists but has not been modified for at least `--stale-days` days
`VERSION_SPRAWL`	Versioning enabled, no lifecycle rules
`LIFECYCLE_MISCONFIG`	Many objects, no lifecycle rules
`OK`	Bucket and prefix match expected usage
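
The table above can be read as a decision procedure over a handful of observed facts. The Go sketch below shows one possible ordering of those checks. Everything in it is an assumption made for illustration: the `Observation` struct, the `manyObjects` threshold (the README does not quantify "many objects"), and the precedence between statuses are not taken from S3Spectre's source.

```go
package main

import "fmt"

// Observation is a hypothetical bundle of facts about one bucket/prefix pair.
type Observation struct {
	ReferencedInCode  bool
	BucketExists      bool
	PrefixHasObjects  bool
	DaysSinceModified int
	VersioningEnabled bool
	LifecycleRules    int
	ObjectCount       int
}

// classify maps an observation to one status from the table.
// The order of the cases is an assumed precedence, not the tool's documented one.
func classify(o Observation, staleDays, manyObjects int) string {
	switch {
	case o.ReferencedInCode && !o.BucketExists:
		return "MISSING_BUCKET"
	case !o.ReferencedInCode && o.BucketExists:
		return "UNUSED_BUCKET"
	case !o.PrefixHasObjects:
		return "MISSING_PREFIX"
	case o.DaysSinceModified >= staleDays:
		return "STALE_PREFIX"
	case o.VersioningEnabled && o.LifecycleRules == 0:
		return "VERSION_SPRAWL"
	case o.LifecycleRules == 0 && o.ObjectCount >= manyObjects:
		return "LIFECYCLE_MISCONFIG"
	default:
		return "OK"
	}
}

func main() {
	o := Observation{
		ReferencedInCode:  true,
		BucketExists:      true,
		PrefixHasObjects:  true,
		DaysSinceModified: 120,
	}
	// 120 days without changes exceeds the default --stale-days of 90.
	fmt.Println(classify(o, 90, 100000))
}
```

A pure function of its inputs like this is what makes the classification repeatable: the same observation and thresholds always yield the same status.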

Architecture

s3spectre/
├── cmd/s3spectre/main.go       # Entry point, delegates to commands
├── internal/
│   ├── commands/               # Cobra CLI: scan, discover, version
│   │   ├── root.go
│   │   ├── scan.go
│   │   ├── discover.go
│   │   ├── helpers.go          # Shared: error enhancement, status output
│   │   └── version.go
│   ├── scanner/                # Repository scanning (regex, YAML, Terraform, JSON, .env)
│   │   ├── scanner.go          # Orchestrator: walks files, dispatches to parsers
│   │   ├── regex.go            # S3 URL and bucket name pattern matching
│   │   ├── yaml.go
│   │   ├── terraform.go
│   │   ├── json.go
│   │   ├── env.go
│   │   └── types.go
│   ├── s3/                     # AWS S3 integration
│   │   ├── client.go           # S3 client wrapper with retry and backoff
│   │   ├── inspector.go        # Concurrent bucket and prefix inspection
│   │   └── types.go
│   ├── analyzer/               # Drift analysis and scoring
│   │   ├── analyzer.go         # Scan mode: code-vs-AWS correlation
│   │   ├── discovery.go        # Discover mode: account-wide heuristics
│   │   └── types.go
│   └── report/                 # Output generation
│       ├── text.go
│       ├── json.go
│       ├── discovery.go
│       └── types.go
├── Makefile
├── go.mod
└── go.sum

Key design decisions:

cmd/s3spectre/main.go is minimal -- a single Execute() call.
All logic lives in internal/ to prevent external import.
S3 API calls use a bounded worker pool (--concurrency) with exponential backoff.
Scanner dispatches files to format-specific parsers based on extension.
Analysis is deterministic: same inputs always produce the same classifications.
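
The bounded concurrency and exponential backoff mentioned above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code in `internal/s3`: it bounds in-flight calls with a semaphore channel instead of a fixed set of workers, and `withBackoff`, `inspectAll`, `flakyInspector`, and `errThrottled` are hypothetical names. The retry count (3) and base delay (10 ms) are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// errThrottled stands in for a retryable AWS error (hypothetical).
var errThrottled = errors.New("throttled")

// withBackoff calls fn up to attempts times, doubling the wait after each failure.
func withBackoff(attempts int, base time.Duration, fn func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

// inspectAll runs inspect once per bucket with at most concurrency calls in flight.
func inspectAll(buckets []string, concurrency int, inspect func(string) error) map[string]error {
	if concurrency < 1 {
		concurrency = 1
	}
	results := make(map[string]error, len(buckets))
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, concurrency) // one slot per allowed in-flight call

	for _, b := range buckets {
		wg.Add(1)
		sem <- struct{}{} // blocks while all slots are taken
		go func(name string) {
			defer wg.Done()
			defer func() { <-sem }()
			err := withBackoff(3, 10*time.Millisecond, func() error { return inspect(name) })
			mu.Lock()
			results[name] = err
			mu.Unlock()
		}(b)
	}
	wg.Wait()
	return results
}

// flakyInspector simulates an API that fails failures[name] times before succeeding.
func flakyInspector(failures map[string]int) func(string) error {
	var mu sync.Mutex
	return func(name string) error {
		mu.Lock()
		defer mu.Unlock()
		if failures[name] > 0 {
			failures[name]--
			return errThrottled
		}
		return nil
	}
}

func main() {
	res := inspectAll([]string{"a", "b", "c"}, 2, flakyInspector(map[string]int{"b": 2, "c": 5}))
	for _, name := range []string{"a", "b", "c"} {
		fmt.Println(name, res[name])
	}
}
```

In this run, bucket `b` succeeds on its third attempt while `c` exhausts its retries and surfaces the error. Results are keyed by bucket name, so the outcome does not depend on goroutine scheduling order.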

Known limitations

No object-level scanning. S3Spectre inspects bucket and prefix metadata. It does not list or read individual objects beyond what is needed for prefix existence and staleness checks.
Regex-based code scanning. The scanner uses pattern matching, not AST parsing. It will miss dynamically constructed bucket names and may produce false positives on commented-out code.
No cost estimation. The tool identifies unused resources but does not calculate storage costs.
IAM permissions required. Needs s3:ListBucket, s3:ListAllMyBuckets, s3:GetBucketLocation, s3:GetBucketVersioning, s3:GetLifecycleConfiguration, and s3:GetBucketTagging. Missing permissions produce access-denied errors, not silent failures.
No real-time monitoring. S3Spectre is a point-in-time scanner, not a daemon. Run it in CI or on a schedule.
Single AWS account. Cross-account scanning is not supported.
Progress line artifacts. The TTY progress indicator uses carriage return without clearing the full line, so shorter bucket names leave trailing characters from the previous name. Cosmetic only.
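
The regex-scanning limitation above is easy to reproduce with a small Go sketch. The pattern below is a simplified, hypothetical stand-in for whatever `internal/scanner/regex.go` uses; `s3URL` and `findBuckets` are illustrative names only.

```go
package main

import (
	"fmt"
	"regexp"
)

// s3URL is a simplified, hypothetical pattern; the real scanner's patterns may differ.
var s3URL = regexp.MustCompile(`s3://([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])(/[^\s"']*)?`)

// findBuckets returns the bucket names matched in src, in order of appearance.
func findBuckets(src string) []string {
	var names []string
	for _, m := range s3URL.FindAllStringSubmatch(src, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	src := `
upload("s3://prod-assets/images/")        # literal reference: found
# upload("s3://old-archive/dump/")        # commented out: still found
bucket = "prod-" + env + "-logs"          # built at runtime: missed
`
	fmt.Println(findBuckets(src)) // prints [prod-assets old-archive]
}
```

The commented-out reference is reported because the pattern has no notion of comments, and the concatenated name is missed because no complete bucket name ever appears in the source text.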

Roadmap

Cost estimation for unused and stale resources
Deep prefix scanning with pagination
Replication rule validation
IAM access path analysis
Naming convention enforcement
Historical trend tracking
SpectreHub integration for cross-tool correlation

License

MIT License -- see LICENSE.

Contributing

See CONTRIBUTING.md. Issues and pull requests welcome.

Part of the Spectre family: VaultSpectre | ClickSpectre | KafkaSpectre

Directories

Path	Synopsis
cmd/s3spectre	command
internal/analyzer
internal/baseline
internal/commands
internal/config
internal/logging
internal/report
internal/s3
internal/scanner