castai-focus

Published: Mar 30, 2026 License: Apache-2.0


Kubernetes-native FOCUS 1.3 cost exporter for Cast AI

castai-focus extracts granular workload-level cost data from Cast AI and exports it as FOCUS 1.3-compliant Parquet files to S3, enabling integration with FinOps tools like Ternary.


Overview

castai-focus is a Kubernetes-native Go application that:

  1. Connects to Cast AI API - Retrieves granular workload cost data from your Kubernetes clusters
  2. Fetches enrichment data - Pulls pricing, enterprise usage, platform features, and discount data from Cast AI billing APIs
  3. Transforms to FOCUS 1.3 - Maps Cast AI cost data to the FinOps Open Cost and Usage Specification (FOCUS 1.3) with accurate list, contracted, and billed cost differentiation
  4. Writes Parquet to S3 - Outputs cost data as columnar Parquet files with Hive-style partitioning
  5. Enables FinOps tools - Makes your cost data available to tools like Ternary via "Bring Your Own Data"
Architecture
flowchart TD
    subgraph CastAI[Cast AI]
        AIK8s[Kubernetes Clusters]
        AIAPI[Cast AI API]
        PricingAPI[Pricing APIs]
        BillingAPI[Billing APIs]
        AIK8s -->|Cost Data| AIAPI
    end

    subgraph CastAIFocus[castai-focus]
        AIAPI -->|GET /workload-costs| CFClient[REST Client]
        PricingAPI -->|cluster prices + default pricing| CFEnrich[Enrichment Fetcher]
        BillingAPI -->|enterprise usage + platform features + discounts| CFEnrich
        CFClient -->|WorkloadData| CFTransform[FOCUS Transformer]
        CFEnrich -->|EnrichmentContext| CFTransform
        CFTransform -->|FocusRows| CFWriter[Parquet/CSV Writer]
        CFWriter -->|Local Temp File| CFS3[S3 Uploader]
    end

    subgraph AWS[S3 Bucket]
        CFS3 -->|version=focus-1.3/year=...| S3Data[Parquet Files]
        S3Data -->|Ternary BYOD| Ternary[Ternary]
    end

    Ternary -->|Cost Analytics| FinOps[FinOps Dashboard]
What is FOCUS 1.3?

FOCUS 1.3 (FinOps Open Cost and Usage Specification) is an open standard for cost and usage data that enables interoperability between cloud cost tools. It provides a common schema for:

  • Cost breakdowns by resource, service, and time
  • Usage metrics and pricing information
  • Tag-based cost allocation
What is Cast AI?

CAST AI is a Kubernetes optimization platform that helps teams reduce cloud costs, improve resource efficiency, and automate cluster management. It provides fine-grained cost data for Kubernetes workloads down to the pod level.


Features

Feature | Description
FOCUS 1.3 Compliant | Full FOCUS 1.3 schema with mandatory, conditional, and custom x_ columns
Parquet + CSV Support | Columnar Parquet for analytics, CSV for compatibility
Workload-Level Granularity | Cost data at Deployment, StatefulSet, DaemonSet, Job, and Pod level
Multi-Cloud | Supports EKS, GKE, AKS, and other Kubernetes providers
Helm Chart | Kubernetes-native CronJob deployment with IRSA support
Configurable Filtering | Include/exclude clusters and namespaces
Hive Partitioning | version=focus-1.3/year=/month=/day=/ structure for efficient queries
Ternary BYOD | Automatic duplicate upload to ternary/ path prefix for Ternary integration
IRSA / Workload Identity | Native cloud-native authentication without static credentials
Historical Backfill | Backfill months of cost data for initial setup
Hourly/Daily Granularity | Choose between hourly or daily aggregation
Cost Differentiation | Separate ListCost, ContractedCost, and BilledCost via pricing API enrichment
Tag Filtering | Allowlist-based tag filtering to control which labels appear in FOCUS output
Enterprise Sub-Accounts | SubAccountId/Name resolved to org-level when enterprise report is available

Quick Start

Prerequisites
  • Kubernetes cluster (v1.25+)
  • Helm v3.8+
  • Cast AI account with API credentials
  • S3 bucket for cost data storage
  • AWS IAM permissions for S3 upload (or IRSA)
Installation
  1. Add the Helm repository:
helm repo add castai https://castai.github.io/helm-charts
helm repo update
  2. Create a values file (values.yaml):
castai:
  organizationId: "your-org-id"
  apiKey: "your-api-key"  # Or use existing Secret

s3:
  bucket: "your-cost-bucket"
  region: "us-east-1"

focus:
  publisherName: "Cast AI"         # Publisher name in FOCUS rows
  billingAccountName: ""           # Optional override (defaults to org ID)
  tagAllowlist: []                 # Allowlist of label keys to include as tags
  enterpriseId: ""                 # Enterprise org ID for org-level sub-account resolution

# Optional: enable Ternary integration
ternary:
  enabled: true
  pathPrefix: "ternary/"
  3. Install the chart:
helm install castai-focus castai/castai-focus -f values.yaml

The CronJob runs every 6 hours by default, exporting the last 24 hours of cost data.

Manual Trigger
# Get the CronJob name
kubectl get cronjob

# Create a manual Job
kubectl create job --from=cronjob/castai-focus castai-focus-manual

Configuration Reference

All configuration is via environment variables (set via Helm values or ConfigMap).

Cast AI Configuration
Variable Required Default Description
CASTAI_API_URL No https://api.cast.ai Cast AI API base URL
CASTAI_API_KEY Yes - Cast AI API key
CASTAI_ORGANIZATION_ID Yes - Cast AI organization ID
Collection Configuration
Variable Required Default Description
COLLECTION_LOOKBACK_HOURS No 24 Number of hours to look back for cost data
COLLECTION_GRANULARITY No hourly Data granularity: hourly or daily
COLLECTION_INCLUDE_CLUSTER_IDS No - Comma-separated list of cluster IDs to include (mutually exclusive with EXCLUDE)
COLLECTION_EXCLUDE_CLUSTER_IDS No - Comma-separated list of cluster IDs to exclude (mutually exclusive with INCLUDE)
COLLECTION_INCLUDE_NAMESPACES No - Comma-separated list of namespaces to include (mutually exclusive with EXCLUDE)
COLLECTION_EXCLUDE_NAMESPACES No - Comma-separated list of namespaces to exclude (mutually exclusive with INCLUDE)
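
The include/exclude pairs above are described as comma-separated and mutually exclusive. A minimal sketch of how such a pair could be parsed and validated follows. The names `splitList`, `Filter`, and `newFilter` are hypothetical and not taken from the castai-focus source; in practice the raw strings would come from `os.Getenv`, for example `os.Getenv("COLLECTION_INCLUDE_NAMESPACES")`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitList parses a comma-separated value, trimming whitespace and
// dropping empty entries. An empty input yields a nil slice.
func splitList(raw string) []string {
	var out []string
	for _, part := range strings.Split(raw, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// Filter holds one include/exclude pair (hypothetical type).
type Filter struct {
	Include []string
	Exclude []string
}

var errBothSet = errors.New("include and exclude lists are mutually exclusive")

// newFilter builds a Filter and rejects configurations that set both lists.
func newFilter(include, exclude string) (Filter, error) {
	f := Filter{Include: splitList(include), Exclude: splitList(exclude)}
	if len(f.Include) > 0 && len(f.Exclude) > 0 {
		return Filter{}, errBothSet
	}
	return f, nil
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// Allows reports whether name passes the filter. With neither list set,
// everything is allowed.
func (f Filter) Allows(name string) bool {
	if len(f.Include) > 0 {
		return contains(f.Include, name)
	}
	return !contains(f.Exclude, name)
}

func main() {
	f, err := newFilter("", "kube-system, monitoring")
	fmt.Println(f.Allows("default"), f.Allows("kube-system"), err)

	_, err = newFilter("prod", "dev")
	fmt.Println(err)
}
```

Failing fast at startup when both lists are set is one possible design; the actual exporter may report the conflict differently.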
FOCUS Configuration
Variable | Required | Default | Description
FOCUS_BILLING_CURRENCY | No | USD | ISO 4217 billing currency code
FOCUS_SERVICE_PROVIDER_NAME | No | CAST AI | Service provider name in FOCUS rows
FOCUS_INVOICE_ISSUER_NAME | No | CAST AI | Invoice issuer name in FOCUS rows
FOCUS_PUBLISHER_NAME | No | Cast AI | Publisher name in FOCUS rows
FOCUS_BILLING_ACCOUNT_NAME | No | (org ID) | Display name for the billing account; defaults to organization ID if not set
FOCUS_TAG_ALLOWLIST | No | (all tags) | Comma-separated list of label/tag keys to include in the Tags column; if empty, all tags are included
FOCUS_ENTERPRISE_ID | No | - | Enterprise parent organization ID; when set, enables enterprise usage report fetch and resolves SubAccountId/Name to org-level
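
The FOCUS_TAG_ALLOWLIST behavior in the table above (an empty list keeps every tag) can be illustrated with a short sketch. The function name `filterTags` is hypothetical, and the real transformer may differ, for instance in how it treats key case.

```go
package main

import "fmt"

// filterTags returns a copy of tags restricted to keys in allowlist.
// An empty allowlist keeps every tag, matching the documented default.
func filterTags(tags map[string]string, allowlist []string) map[string]string {
	out := make(map[string]string, len(tags))
	if len(allowlist) == 0 {
		for k, v := range tags {
			out[k] = v
		}
		return out
	}
	allowed := make(map[string]struct{}, len(allowlist))
	for _, k := range allowlist {
		allowed[k] = struct{}{}
	}
	for k, v := range tags {
		if _, ok := allowed[k]; ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	labels := map[string]string{
		"team":              "payments",
		"app":               "api",
		"pod-template-hash": "5d4f",
	}
	fmt.Println(filterTags(labels, []string{"team", "app"}))
	fmt.Println(len(filterTags(labels, nil)))
}
```

Returning a copy rather than mutating the input keeps the original labels available for other columns, which is a choice made for this sketch only.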
S3 Configuration
Variable Required Default Description
S3_BUCKET Yes - S3 bucket name for cost data
S3_REGION Yes - AWS region for the S3 bucket
S3_PREFIX No focus/ Key prefix for uploaded objects
S3_FORMAT No parquet Output format: parquet or csv
S3_ACCESS_KEY_ID No - Static S3 access key (for non-IRSA auth)
S3_SECRET_ACCESS_KEY No - Static S3 secret key (for non-IRSA auth)
Ternary Configuration
Variable Required Default Description
TERNARY_ENABLED No false Enable Ternary integration
TERNARY_PATH_PREFIX No ternary/ Path prefix for Ternary uploads
Job Configuration
Variable Required Default Description
JOB_BACKFILL_DAYS No 0 Number of days to backfill on first run (0 = disabled)
Logging Configuration
Variable Required Default Description
LOG_LEVEL No info Log level: debug, info, warn, or error

Cast AI API Endpoints

castai-focus calls the following Cast AI API endpoints to collect and enrich cost data:

Workload Cost Data
Endpoint Purpose
GET /v1/kubernetes/external-clusters List all clusters in the organization
GET /v1/cost-reports/clusters/{clusterId}/workload-costs Per-cluster workload-level cost metrics (CPU, RAM, GPU, storage) broken down by pricing tier (on-demand, spot, spot-fallback)
Pricing Enrichment
Endpoint Purpose
GET /pricing/v1/organizations/{orgId}/clusters/prices Per-cluster contracted pricing (CPU and memory hourly rates) used to compute ContractedCost and ContractedUnitPrice
GET /pricing/v1beta/organizations/{orgId}/anywhere-cluster-default-pricing Organization-level default/list pricing used to compute ListCost and ListUnitPrice
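
The fallback rules documented for the cost columns (ListCost falls back to BilledCost, ContractedCost falls back to ListCost, EffectiveCost equals BilledCost) can be sketched as below. This assumes each cost is simply CPU hours multiplied by a CPU hourly rate; the actual transformer may also use memory rates, and the names `Costs` and `differentiate` are hypothetical.

```go
package main

import "fmt"

// Costs holds the four FOCUS cost columns for one charge (hypothetical type).
type Costs struct {
	Billed, List, Contracted, Effective float64
}

// differentiate applies the documented fallback chain. listPrice and
// contractedPrice are CPU hourly rates; nil means the rate is unavailable.
// Assumption: cost = cpuHours * rate (CPU only).
func differentiate(billed, cpuHours float64, listPrice, contractedPrice *float64) Costs {
	c := Costs{Billed: billed, Effective: billed}

	// ListCost falls back to BilledCost when no list price is available.
	c.List = billed
	if listPrice != nil {
		c.List = cpuHours * *listPrice
	}

	// ContractedCost falls back to ListCost when no custom pricing exists.
	c.Contracted = c.List
	if contractedPrice != nil {
		c.Contracted = cpuHours * *contractedPrice
	}
	return c
}

func main() {
	list, contracted := 0.5, 0.25
	fmt.Printf("%+v\n", differentiate(4, 10, &list, &contracted))
	fmt.Printf("%+v\n", differentiate(4, 10, &list, nil))
	fmt.Printf("%+v\n", differentiate(4, 10, nil, nil))
}
```

Using pointers for the optional rates makes "unavailable" distinguishable from a genuine zero price.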
Billing Enrichment
Endpoint Purpose
GET /v1/billing/enterprise/usage-report Enterprise usage report; when FOCUS_ENTERPRISE_ID is set, provides per-org sub-account names and IDs for multi-org environments
GET /v1/billing/platform-usage-report Platform feature usage summary (e.g., Workload Optimization billable CPU hours)
GET /v1/discounts Active discount rules for the organization

All enrichment endpoints are fetched in parallel. Individual failures degrade gracefully — enrichment context is partially populated rather than causing the export job to fail.

Extending the enrichment layer: The internal/enrichment package is designed to be extensible. Additional Cast AI APIs (e.g., commitment/reservation data, spot interruption history, node-level cost allocation) can be added to FetchEnrichment in context.go following the same sync.WaitGroup pattern with graceful degradation.


Helm Chart

Installation
# Using values file
helm install castai-focus castai/castai-focus -f values.yaml

# Inline values (quick test)
helm install castai-focus castai/castai-focus \
  --set castai.organizationId=your-org-id \
  --set castai.apiKey=your-api-key \
  --set s3.bucket=your-bucket \
  --set s3.region=us-east-1
Key Values
Parameter Description Default
image.repository Container image repository ghcr.io/castai/castai-focus
image.tag Image tag .Chart.AppVersion
schedule CronJob schedule "0 */6 * * *" (every 6 hours)
castai.apiKey Cast AI API key ""
castai.organizationId Cast AI organization ID ""
s3.bucket S3 bucket name ""
s3.region AWS region ""
focus.publisherName Publisher name in FOCUS rows "Cast AI"
focus.billingAccountName Billing account display name ""
focus.tagAllowlist Label keys to include as tags []
focus.enterpriseId Enterprise org ID ""
ternary.enabled Enable Ternary integration false
IRSA Configuration
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/castai-focus-role

s3:
  auth:
    useIRSA: true  # Don't set S3_ACCESS_KEY_ID/S3_SECRET_ACCESS_KEY
Manual Job Trigger
# List CronJobs
kubectl get cronjob

# Create a manual Job
kubectl create job --from=cronjob/castai-focus castai-focus-manual

# View Job logs
kubectl logs job/castai-focus-manual -f

FOCUS 1.3 Column Mapping

The following FOCUS columns are populated from Cast AI data:

Mandatory Columns
FOCUS Column Cast AI Source Description
BilledCost WorkloadCostItem.CostOnDemand/Spot/Fallback Total cost for the charge (Cast AI billed rate)
BillingAccountId Cluster.OrganizationID Organization ID
BillingAccountName FOCUS_BILLING_ACCOUNT_NAME or Cluster.OrganizationID Display name for the billing account
BillingCurrency Config (FOCUS_BILLING_CURRENCY) Defaults to "USD"
BillingPeriodStart Calculated (1st of month) Inclusive billing period start
BillingPeriodEnd Calculated (1st of next month) Exclusive billing period end
ChargeCategory Fixed Always "Usage"
ChargeClass N/A Null (not a correction)
ChargeDescription Formatted e.g., "on-demand default/nginx-deployment in my-cluster (vCPU)"
ChargePeriodStart WorkloadCostItem.Timestamp - granularity Inclusive charge period start
ChargePeriodEnd WorkloadCostItem.Timestamp Exclusive charge period end (API returns interval end)
ContractedCost Pricing API (/clusters/prices) Cost at negotiated/contracted per-cluster rates; falls back to list cost if no custom pricing
EffectiveCost BilledCost Equal to BilledCost in Phase 1 (commitment amortization not yet modeled)
HostProviderName Cluster.ProviderType "AWS", "Google Cloud", or "Microsoft Azure"
InvoiceIssuerName Config (FOCUS_INVOICE_ISSUER_NAME) Defaults to "CAST AI"
ListCost Pricing API (/anywhere-cluster-default-pricing) Cost at public/list rates; falls back to BilledCost if pricing API unavailable
PricingQuantity WorkloadCostItem.CpuCountOnDemand/Spot/Fallback CPU count (primary pricing dimension)
PricingUnit Fixed "vCPU Hours"
PublisherName Config (FOCUS_PUBLISHER_NAME) Defaults to "Cast AI"
ServiceCategory Fixed "Management and Governance"
ServiceName Fixed "Workload Optimization"
ServiceProviderName Config (FOCUS_SERVICE_PROVIDER_NAME) Defaults to "CAST AI"
Conditional Columns
FOCUS Column Cast AI Source Description
AvailabilityZone N/A Null (not available at workload level)
ChargeFrequency Fixed "Usage-Based"
ConsumedQuantity Calculated Primary resource dimension (CPU > GPU > RAM)
ConsumedUnit Calculated "vCPU", "GPU", or "GiB"
ContractedUnitPrice Pricing API (/clusters/prices) Contracted CPU hourly price for the cluster
InvoiceId Generated "CASTAI-{orgId}-{year}-{month}"
ListUnitPrice Pricing API (/anywhere-cluster-default-pricing) Default/list CPU hourly price
PricingCategory Pricing tier "Standard" (on-demand) or "Dynamic" (spot/spot-fallback)
RegionId Cluster.Region.Name Provider region identifier
RegionName Cluster.Region.DisplayName Human-readable region name
ResourceId Generated "castai://{clusterId}/{namespace}/{workloadType}/{workloadName}"
ResourceName WorkloadReportItem.WorkloadName Workload name
ResourceType WorkloadReportItem.WorkloadType Deployment, StatefulSet, DaemonSet, etc.
ServiceSubcategory Fixed "Kubernetes Optimization"
SkuId Generated "castai://Workload Optimization/{tier}"
SkuPriceId Generated "castai://Workload Optimization/{tier}/{clusterId}"
SubAccountId Enterprise report (org ID) or Cluster.ID Org ID when enterprise enrichment available, otherwise cluster ID
SubAccountName Enterprise report (org name) or Cluster.Name Org name when enterprise enrichment available, otherwise cluster name
Tags Merged (cluster tags + workload labels) Kubernetes labels as cost allocation tags; filtered by FOCUS_TAG_ALLOWLIST if set
Custom x_ Columns (CAST AI Extensions)
FOCUS Column Cast AI Source Description
x_CastAIClusterId Cluster.ID CAST AI cluster identifier
x_CastAIWorkloadType WorkloadReportItem.WorkloadType Kubernetes workload type
x_CastAINamespace WorkloadReportItem.Namespace Kubernetes namespace
x_CastAIPricingTier Calculated "on-demand", "spot", or "spot-fallback"
x_CastAICpuCost WorkloadCostItem.CpuCost* CPU component cost
x_CastAIRamCost WorkloadCostItem.RamCost* RAM component cost
x_CastAIGpuCost WorkloadCostItem.GpuCost* GPU component cost
x_CastAIStorageCost WorkloadCostItem.StorageCost Storage component cost

S3 Output Structure

File Path Convention

Files are stored with Hive-style partitioning:

{s3_prefix}version=focus-1.3/year=2025/month=03/day=10/castai-focus-{orgId}-{date}-{hour}.{format}

Example:

focus/version=focus-1.3/year=2025/month=03/day=10/castai-focus-org123-2025-03-10-14.parquet
Ternary Path

When Ternary integration is enabled, files are uploaded to:

ternary/{s3_prefix}version=focus-1.3/year=2025/month=03/day=10/castai-focus-{orgId}-{date}-{hour}.{format}
File Format Details
  • Parquet: Columnar format with Snappy compression, optimized for analytics
  • CSV: Human-readable format with headers

Development

Prerequisites
  • Go 1.23+
  • Docker (optional, for building images)
  • Helm (optional, for testing chart)
Build
# Build binary
make build

# Run tests
make test

# Lint code
make lint

# Format code
make fmt

# Tidy dependencies
make tidy
Docker Build
# Build image
make docker-build

# Push image
make docker-push
Project Structure
.
├── cmd/
│   └── castai-focus/
│       └── main.go              # Application entry point
├── internal/
│   ├── castai/                  # Cast AI API client
│   │   ├── api.go              # API methods (ListClusters, GetWorkloadCosts, GetClusterPrices, GetDefaultPricing, GetEnterpriseUsageReport, GetPlatformUsageReport, ListDiscounts)
│   │   ├── client.go           # HTTP client with retries/rate limiting
│   │   └── types.go            # API response types (workloads, pricing, billing, discounts)
│   ├── enrichment/              # Enrichment data fetching and context
│   │   ├── context.go          # EnrichmentContext struct and FetchEnrichment (parallel fetch with graceful degradation)
│   │   └── context_test.go     # Enrichment fetch tests
│   ├── focus/                   # FOCUS 1.3 schema and transformation
│   │   ├── schema.go           # FocusRow struct, constants, ColumnNames()
│   │   ├── transformer.go      # Cast AI → FOCUS transformation with enrichment-aware cost differentiation
│   │   └── validator.go        # Parquet schema validation
│   ├── output/                  # Output writers
│   │   ├── parquet.go          # Parquet file writer
│   │   ├── csv.go              # CSV file writer
│   │   └── s3.go               # S3 uploader with IRSA support
│   └── config/                  # Configuration management
│       └── config.go           # Environment variable loading
├── helm/
│   └── castai-focus/            # Helm chart
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/
│           ├── cronjob.yaml
│           ├── serviceaccount.yaml
│           ├── secret.yaml
│           └── configmap.yaml
├── go.mod                       # Go module definition
├── Makefile                     # Build targets
└── Dockerfile                   # Container image definition
Running Locally
# Set environment variables
export CASTAI_API_KEY=your-api-key
export CASTAI_ORGANIZATION_ID=your-org-id
export S3_BUCKET=your-bucket
export S3_REGION=us-east-1

# Optional enrichment config
export FOCUS_PUBLISHER_NAME="Cast AI"
export FOCUS_TAG_ALLOWLIST="environment,team,app"
export FOCUS_ENTERPRISE_ID="your-enterprise-org-id"  # Only for enterprise accounts

# Build and run
go build -o bin/castai-focus ./cmd/castai-focus
./bin/castai-focus

Ternary Integration

Ternary is a FinOps analytics platform that supports "Bring Your Own Data" (BYOD) via S3 integration.

Setup
  1. Enable Ternary integration in your values file:
ternary:
  enabled: true
  pathPrefix: "ternary/"  # Default, can be customized
  2. Configure S3 bucket policy to allow Ternary access:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ternary.dev"
      },
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::your-bucket",
        "arn:aws:s3:::your-bucket/*"
      ]
    }
  ]
}
  3. In Ternary console:
    • Navigate to Settings → Data Sources
    • Add S3 as a data source
    • Enter your bucket name
    • Select the version=focus-1.3/ folder
Data Schema

Ternary automatically detects the FOCUS 1.3 schema from the Parquet files. All columns are mapped correctly for cost analysis, workload attribution, and FinOps reporting.


License

MIT License

Copyright © 2025 CAST AI

Directories

Path | Synopsis
cmd/castai-focus (command) | Package main is the entry point for the castai-focus exporter.
internal/castai | Package castai provides an HTTP client for the Cast AI API.
internal/config | Package config provides configuration management for castai-focus.
internal/enrichment | Package enrichment fetches supplementary data from Cast AI APIs to enrich FOCUS cost rows with pricing, org hierarchy, and discount information.
internal/focus | Package focus defines the FOCUS 1.3 (FinOps Open Cost and Usage Specification) schema for the castai-focus exporter.
internal/output | Package output writes FOCUS 1.3 rows to various output formats.
