Documentation ¶
Overview ¶
Package docker implements Fred's Backend interface for Docker container provisioning. It provisions ephemeral containers with SKU-based resource profiles, registry allowlisting, and port mapping for tenant connectivity, and it is the production backend bundled with Fred.
For operators and tenants, see the README.md alongside this package (internal/backend/docker/README.md) for the full configuration reference, HTTP API, lease state machine, callback protocol, and Traefik integration.
Architecture overview (for developers) ¶
The package is organized around a single concurrency primitive: the per-lease actor. Every active lease owns one goroutine that serializes all state-mutating operations for that lease through a stateless state machine. The actor and SM implementations are substrate-agnostic and live in internal/backend/shared/leasesm; this package supplies the Docker-specific seams via the closure-builder factory in lease_actor_factory.go. All Docker calls happen outside any shared mutex; linearization comes from the actor's inbox.
The actor model is what gives the backend its key properties:
- No held locks during slow I/O (image pull, container create/start)
- Deterministic preemption — a Deprovision arriving mid-provisioning cancels the in-flight worker via OnExit and transitions cleanly
- Blast-radius-contained panics — recover() in each handler keeps unrelated leases unaffected
- One terminal callback per lease — emission lives only in SM entry actions, never in worker goroutines
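The properties above can be illustrated with a minimal sketch of the actor pattern. This is not the leasesm implementation: the `message` and `leaseActor` types and the `runDemo` helper are hypothetical names chosen for illustration. It shows only two of the listed properties, serialized handling through an inbox and panic containment via recover().

```go
package main

import "fmt"

// message is a hypothetical inbox item: a named operation plus its handler.
type message struct {
	name   string
	handle func()
}

// leaseActor is an illustrative stand-in for the real actor in leasesm.
type leaseActor struct {
	inbox chan message
	done  chan struct{}
}

func newLeaseActor(inboxCap int) *leaseActor {
	a := &leaseActor{inbox: make(chan message, inboxCap), done: make(chan struct{})}
	go a.run()
	return a
}

// run drains the inbox on a single goroutine, so handlers for one lease
// never run concurrently with each other.
func (a *leaseActor) run() {
	defer close(a.done)
	for m := range a.inbox {
		a.safeHandle(m)
	}
}

// safeHandle recovers from a panicking handler so the actor keeps running.
func (a *leaseActor) safeHandle(m message) {
	defer func() {
		if r := recover(); r != nil {
			fmt.Printf("handler %q panicked: %v\n", m.name, r)
		}
	}()
	m.handle()
}

// stop closes the inbox and waits for the actor goroutine to exit.
func (a *leaseActor) stop() {
	close(a.inbox)
	<-a.done
}

// runDemo sends three messages, the second of which panics, and returns the
// order in which the surviving handlers ran.
func runDemo() []string {
	var order []string
	a := newLeaseActor(8)
	a.inbox <- message{"provision", func() { order = append(order, "provision") }}
	a.inbox <- message{"boom", func() { panic("simulated handler bug") }}
	a.inbox <- message{"deprovision", func() { order = append(order, "deprovision") }}
	a.stop()
	return order
}

func main() {
	fmt.Println(runDemo())
}
```

Because only the actor goroutine touches `order`, no mutex is needed; the channel is the only synchronization point.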
Major components ¶
- internal/backend/shared/leasesm: per-lease actor + state machine (substrate-agnostic; consumed by every backend, not just Docker)
- lease_actor_factory.go, lease_actor_routing.go: factory wiring Docker dependencies into leasesm.NewLeaseActor, plus Backend-side routing/dispatch around the actor inbox (b.actors map, routeToLease, DebugActors)
- leasesm_adapters.go, leasesm_metrics.go: Docker implementations of leasesm.InstanceInspector / DiagnosticsGatherer / LeaseProvisionStore / SMMetrics
- internal/backend/shared/workbarrier: per-actor worker reference counter (used by OnExit to wait for canceled goroutines before completing the transition)
- provision.go, deprovision.go, restart_update.go: the lifecycle workers that the actor spawns for each long-running operation
- recover.go: state recovery from Docker labels on startup and during each reconciliation cycle (via RefreshState)
- compose.go, compose_project.go: Compose-based stack provisioning
- reconcile_custom_domain.go: Traefik label sync for tenant custom domains
- volume.go (+ volume_btrfs.go, volume_xfs.go, volume_zfs.go): filesystem-specific quota enforcement for stateful SKUs
- ingress.go: Traefik label generation for routable ports
- metrics.go: Prometheus metrics under fred_docker_backend_*
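The workbarrier entry in the list above describes a per-actor worker reference counter that OnExit waits on. A rough sketch of that idea, assuming a sync.WaitGroup plus a cancellable context, might look like the following. The `workBarrier` type and its methods are hypothetical and are not the API of internal/backend/shared/workbarrier.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// workBarrier is a hypothetical per-actor worker tracker.
type workBarrier struct {
	wg     sync.WaitGroup
	ctx    context.Context
	cancel context.CancelFunc
}

func newWorkBarrier() *workBarrier {
	ctx, cancel := context.WithCancel(context.Background())
	return &workBarrier{ctx: ctx, cancel: cancel}
}

// spawn starts a worker and counts it until it returns.
func (b *workBarrier) spawn(fn func(ctx context.Context)) {
	b.wg.Add(1)
	go func() {
		defer b.wg.Done()
		fn(b.ctx)
	}()
}

// cancelAndWait cancels all workers and blocks until every one has returned,
// which is what a state-exit hook would call before completing a transition.
func (b *workBarrier) cancelAndWait() {
	b.cancel()
	b.wg.Wait()
}

// runBarrierDemo reports whether the worker had exited by the time
// cancelAndWait returned.
func runBarrierDemo() bool {
	b := newWorkBarrier()
	exited := false
	b.spawn(func(ctx context.Context) {
		<-ctx.Done() // stands in for a slow image pull that honors cancellation
		exited = true
	})
	b.cancelAndWait()
	return exited
}

func main() {
	fmt.Println("worker exited before transition completed:", runBarrierDemo())
}
```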
Container hardening ¶
Every container is created with: dropped capabilities, no-new-privileges, read-only rootfs, tmpfs for /tmp and /run, PID limits, no swap, restart policy disabled (for crash detection), and per-tenant network isolation. See the README for the full list and operator-facing knobs.
Index ¶
- Constants
- func ComputeFQDN(subdomain, wildcardDomain string) string
- func ComputeSubdomain(leaseUUID, serviceName string, instanceIndex, quantity int) string
- func CustomDomainRouterName(leaseUUID, serviceName string) string
- func RouterName(leaseUUID, serviceName string, instanceIndex, quantity int) string
- func SelectIngressPort(ports map[string]manifest.PortConfig) (int, bool)
- func TenantNetworkName(tenant string) string
- func TraefikCustomDomainLabels(cfg IngressConfig, customRouterName, customDomain string, containerPort int) map[string]string
- func TraefikLabels(cfg IngressConfig, networkName, routerName, fqdn string, containerPort int) map[string]string
- type ActorSnapshot
- type Backend
- func (b *Backend) DebugActors() []ActorSnapshot
- func (b *Backend) Deprovision(ctx context.Context, leaseUUID string) error
- func (b *Backend) GetInfo(ctx context.Context, leaseUUID string) (*backend.LeaseInfo, error)
- func (b *Backend) GetLoadStats(_ context.Context) (*backend.LoadStats, error)
- func (b *Backend) GetLogs(ctx context.Context, leaseUUID string, tail int) (map[string]string, error)
- func (b *Backend) GetProvision(_ context.Context, leaseUUID string) (*backend.ProvisionInfo, error)
- func (b *Backend) GetReleases(_ context.Context, leaseUUID string) ([]backend.ReleaseInfo, error)
- func (b *Backend) Health(ctx context.Context) error
- func (b *Backend) ListProvisions(_ context.Context) ([]backend.ProvisionInfo, error)
- func (b *Backend) ListProvisionsPage(ctx context.Context, after string, limit int) ([]backend.ProvisionInfo, string, error)
- func (b *Backend) ListRetentions(_ context.Context) ([]backend.RetainedLease, error)
- func (b *Backend) ListRetentionsPage(_ context.Context, after string, limit int) ([]backend.RetainedLease, string, error)
- func (b *Backend) LookupProvisions(_ context.Context, uuids []string) ([]backend.ProvisionInfo, error)
- func (b *Backend) Name() string
- func (b *Backend) Provision(ctx context.Context, req backend.ProvisionRequest) error
- func (b *Backend) ReconcileCustomDomain(ctx context.Context, leaseUUID string, items []backend.LeaseItem) error
- func (b *Backend) RefreshState(ctx context.Context) error
- func (b *Backend) Restart(ctx context.Context, req backend.RestartRequest) error
- func (b *Backend) Restore(ctx context.Context, req backend.RestoreRequest) error
- func (b *Backend) Start(ctx context.Context) error
- func (b *Backend) Stats() shared.ResourceStats
- func (b *Backend) Stop() error
- func (b *Backend) Update(ctx context.Context, req backend.UpdateRequest) error
- type Config
- func (c *Config) GetHostBindIP() string
- func (c *Config) GetPidsLimit() *int64
- func (c *Config) GetSKUProfile(sku string) (SKUProfile, error)
- func (c *Config) GetTmpfsSizeMB() int
- func (c *Config) HasStatefulSKUs() bool
- func (c *Config) IsNetworkIsolation() bool
- func (c *Config) IsReadonlyRootfs() bool
- func (c *Config) Validate() error
- type ContainerEvent
- type ContainerInfo
- type ContainerMount
- type CreateContainerParams
- type DaemonSecurityInfo
- type DockerClient
- func (d *DockerClient) Close() error
- func (d *DockerClient) ContainerEvents(ctx context.Context) (<-chan ContainerEvent, <-chan error)
- func (d *DockerClient) ContainerLogs(ctx context.Context, containerID string, tail int) (string, error)
- func (d *DockerClient) CreateContainer(ctx context.Context, params CreateContainerParams, timeout time.Duration) (string, error)
- func (d *DockerClient) DaemonInfo(ctx context.Context) (DaemonSecurityInfo, error)
- func (d *DockerClient) DetectVolumeOwner(ctx context.Context, imageName string, volumePaths []string) (uid, gid int, err error)
- func (d *DockerClient) DetectWritablePaths(ctx context.Context, imageName string, uid int, candidateParents []string) ([]string, error)
- func (d *DockerClient) EnsureTenantNetwork(ctx context.Context, tenant string) (string, error)
- func (d *DockerClient) ExtractImageContent(ctx context.Context, imageName string, paths []string, destDir string, ...) map[string]error
- func (d *DockerClient) InspectContainer(ctx context.Context, containerID string) (*ContainerInfo, error)
- func (d *DockerClient) InspectImage(ctx context.Context, imageName string) (*ImageInfo, error)
- func (d *DockerClient) ListManagedContainers(ctx context.Context) ([]ContainerInfo, error)
- func (d *DockerClient) ListManagedNetworks(ctx context.Context) ([]networktypes.Inspect, error)
- func (d *DockerClient) Ping(ctx context.Context) error
- func (d *DockerClient) PullImage(ctx context.Context, imageName string, timeout time.Duration) error
- func (d *DockerClient) RemoveContainer(ctx context.Context, containerID string) error
- func (d *DockerClient) RemoveTenantNetworkIfEmpty(ctx context.Context, tenant string) error
- func (d *DockerClient) RenameContainer(ctx context.Context, containerID string, newName string) error
- func (d *DockerClient) ResolveImageUser(ctx context.Context, imageName string, userOverride string) (uid, gid int, err error)
- func (d *DockerClient) StartContainer(ctx context.Context, containerID string, timeout time.Duration) error
- func (d *DockerClient) StopContainer(ctx context.Context, containerID string, timeout time.Duration) error
- type HealthStatus
- type ImageInfo
- type IngressConfig
- type PortBinding
- type SKUProfile
- type TenantQuotaConfig
Constants ¶
const (
	LabelManaged       = "fred.managed"
	LabelLeaseUUID     = "fred.lease_uuid"
	LabelTenant        = "fred.tenant"
	LabelProviderUUID  = "fred.provider_uuid"
	LabelSKU           = "fred.sku"
	LabelCreatedAt     = "fred.created_at"
	LabelInstanceIndex = "fred.instance_index"
	LabelFailCount     = "fred.fail_count"
	LabelCallbackURL   = "fred.callback_url"
	LabelBackendName   = "fred.backend_name"
	LabelServiceName   = "fred.service_name"
	LabelFQDN          = "fred.fqdn"
	LabelCustomDomain  = "fred.custom_domain"
)
Labels used for tracking managed containers.
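As a rough illustration of how such labels can drive state recovery, the sketch below groups container IDs by lease UUID. The `container` type, the `groupByLease` helper, and the assumption that a managed container carries the value "true" under fred.managed are all hypothetical; the package's actual recovery logic lives in recover.go and may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Label keys copied from the constants above.
const (
	labelManaged   = "fred.managed"
	labelLeaseUUID = "fred.lease_uuid"
)

// container is a hypothetical, trimmed-down view of an inspected container.
type container struct {
	ID     string
	Labels map[string]string
}

// groupByLease keeps only containers carrying the managed label and groups
// their IDs by lease UUID. Treating "true" as the managed value is an
// assumption of this sketch.
func groupByLease(all []container) map[string][]string {
	byLease := make(map[string][]string)
	for _, c := range all {
		if c.Labels[labelManaged] != "true" {
			continue
		}
		uuid := c.Labels[labelLeaseUUID]
		if uuid == "" {
			continue // managed but unattributable; skipped in this sketch
		}
		byLease[uuid] = append(byLease[uuid], c.ID)
	}
	for _, ids := range byLease {
		sort.Strings(ids) // deterministic order for callers
	}
	return byLease
}

// demoGroupByLease runs groupByLease on a small fixed input.
func demoGroupByLease() map[string][]string {
	return groupByLease([]container{
		{ID: "c2", Labels: map[string]string{labelManaged: "true", labelLeaseUUID: "lease-a"}},
		{ID: "c1", Labels: map[string]string{labelManaged: "true", labelLeaseUUID: "lease-a"}},
		{ID: "c3", Labels: map[string]string{"unrelated": "yes"}},
	})
}

func main() {
	fmt.Println(demoGroupByLease())
}
```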
const DefaultMaxRequestBodySize int64 = 2 << 20 // 2 MiB
DefaultMaxRequestBodySize caps inbound HTTP request bodies for the docker backend. It is deliberately larger than providerd's config.DefaultMaxRequestBodySize (1 MiB): providerd caps the raw tenant body, then re-serializes and wraps it (JSON envelope + base64) before forwarding, so a manifest that just cleared providerd can exceed 1 MiB on the backend hop and would otherwise be rejected with an opaque 400. Configurable via max_request_body_size / DOCKER_BACKEND_MAX_REQUEST_BODY_SIZE. (ENG-448 / F42)
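The size argument can be checked with a short sketch. Base64 alone expands a payload by roughly 4/3, so a body that just fits providerd's 1 MiB cap no longer fits a 1 MiB cap after wrapping. The `limitBody` and `statusFor` helpers are hypothetical, and answering with 413 is a choice made for this sketch, not the backend's documented behavior; only http.MaxBytesReader and base64.StdEncoding.EncodedLen are real standard-library calls.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

const (
	providerdCap = 1 << 20 // 1 MiB, providerd's cap per the text above
	backendCap   = 2 << 20 // 2 MiB, DefaultMaxRequestBodySize
)

// limitBody wraps a handler so that reads past maxBytes fail.
func limitBody(maxBytes int64, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
		next.ServeHTTP(w, r)
	})
}

// statusFor posts an n-byte body through a handler capped at maxBytes and
// returns the resulting status code.
func statusFor(n int, maxBytes int64) int {
	h := limitBody(maxBytes, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if _, err := io.Copy(io.Discard, r.Body); err != nil {
			http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}))
	body := strings.NewReader(strings.Repeat("a", n))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/provision", body))
	return rec.Code
}

// wrappedSize is the base64-encoded size of a body that exactly fills
// providerd's cap (the JSON envelope would add a little more).
func wrappedSize() int {
	return base64.StdEncoding.EncodedLen(providerdCap)
}

func main() {
	n := wrappedSize()
	fmt.Println("wrapped bytes:", n)
	fmt.Println("status with 1 MiB cap:", statusFor(n, providerdCap))
	fmt.Println("status with 2 MiB cap:", statusFor(n, backendCap))
}
```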
Variables ¶
This section is empty.
Functions ¶
func ComputeFQDN ¶
ComputeFQDN returns subdomain + "." + wildcardDomain.
func ComputeSubdomain ¶
ComputeSubdomain derives a unique, human-friendly subdomain from lease and service metadata. The result is guaranteed to be at most 63 characters (the DNS label limit); long service names are truncated to fit.
The hash suffix is derived from all discriminating fields (leaseUUID, serviceName, instanceIndex) to prevent cross-pattern collisions — e.g., service "web" instance 0 vs. service "web-0" instance 0.
Matrix:
serviceName == "" && quantity <= 1 → {hash7}
serviceName == "" && quantity > 1 → {idx}-{hash7}
serviceName != "" && quantity <= 1 → {svc}-{hash7}
serviceName != "" && quantity > 1 → {svc}-{idx}-{hash7}
func CustomDomainRouterName ¶
CustomDomainRouterName returns a Traefik router name shared across all instances of (leaseUUID, serviceName). Custom domain is per-LeaseItem (per-service, not per-instance), so this name is the same regardless of instance index — every instance container of the service emits the same secondary-router labels and Traefik aggregates them into one router with N backends. The "-custom" suffix avoids any collision with the per-instance primary router name produced by RouterName.
func RouterName ¶
RouterName returns a router name derived from lease metadata.
func SelectIngressPort ¶
func SelectIngressPort(ports map[string]manifest.PortConfig) (int, bool)
SelectIngressPort picks the best TCP port for ingress routing. Preference: ingress hint > 80 > 8080 > lowest TCP port number. Returns (port, true) if a suitable port is found, (0, false) otherwise.
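The preference order can be sketched as follows. The shape of manifest.PortConfig is not shown in this documentation, so the `portConfig` type, its `Protocol` and `Ingress` fields, and the use of the port number as the map key are assumptions made for illustration. Breaking ties between multiple hinted ports by lowest number is likewise a choice of this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// portConfig is a hypothetical stand-in for manifest.PortConfig.
type portConfig struct {
	Protocol string // "tcp" or "udp" (assumed field)
	Ingress  bool   // explicit ingress hint (assumed field)
}

// selectIngressPort applies: ingress hint > 80 > 8080 > lowest TCP port.
func selectIngressPort(ports map[string]portConfig) (int, bool) {
	hinted, lowest := 0, 0
	has80, has8080 := false, false
	for key, cfg := range ports {
		p, err := strconv.Atoi(key)
		if err != nil || p <= 0 || cfg.Protocol != "tcp" {
			continue // only well-formed TCP ports are candidates
		}
		if cfg.Ingress && (hinted == 0 || p < hinted) {
			hinted = p
		}
		if p == 80 {
			has80 = true
		}
		if p == 8080 {
			has8080 = true
		}
		if lowest == 0 || p < lowest {
			lowest = p
		}
	}
	switch {
	case hinted != 0:
		return hinted, true
	case has80:
		return 80, true
	case has8080:
		return 8080, true
	case lowest != 0:
		return lowest, true
	}
	return 0, false
}

func main() {
	fmt.Println(selectIngressPort(map[string]portConfig{
		"3000": {Protocol: "tcp"},
		"8080": {Protocol: "tcp"},
		"53":   {Protocol: "udp"},
	}))
}
```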
func TenantNetworkName ¶
TenantNetworkName returns a deterministic network name for a tenant address.
func TraefikCustomDomainLabels ¶
func TraefikCustomDomainLabels(cfg IngressConfig, customRouterName, customDomain string, containerPort int) map[string]string
TraefikCustomDomainLabels generates the secondary Traefik labels that route a tenant-supplied custom domain to the same per-tenant container(s) served by the primary router. The returned map is intended to be merged into a container's label set alongside the primary TraefikLabels output.
Multi-instance services (quantity > 1) are load-balanced by emitting these byte-identical labels on every instance container: Traefik's Docker provider deduplicates the router by name and aggregates the service into one with N backends. For single-instance services this is the same path with a degenerate single backend.
The router uses cfg.CustomDomainCertResolver (default "http01") for per-domain ACME and cfg.CustomDomainMiddlewares (default "security-headers@file") for transport-level hardening. The entrypoint reuses the existing IngressConfig.Entrypoint so the secondary router shares the primary's TLS termination posture.
func TraefikLabels ¶
func TraefikLabels(cfg IngressConfig, networkName, routerName, fqdn string, containerPort int) map[string]string
TraefikLabels generates the Docker labels that Traefik uses for auto-discovery and routing configuration. networkName is the per-tenant Docker network that Traefik should use to reach the container (set via traefik.docker.network). This is the only Traefik-specific function; everything else is proxy-agnostic.
Routers are emitted with tls=true and no certresolver; see IngressConfig for the wildcard-cert provisioning contract.
The explicit `.service` label binding is required even though Traefik's docker provider can auto-bind a router to a same-named service: auto-binding only fires when exactly one service is declared on the container. As soon as a second service appears (notably TraefikCustomDomainLabels' secondary service for tenant custom domains), auto-binding becomes ambiguous and Traefik leaves the router orphaned (returns HTTP 404 on the primary URL). Always emitting the explicit binding is cheap and immune to future label additions.
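For orientation, a label set of the kind described above might look like the sketch below. The label keys follow Traefik's documented Docker-provider naming, but `sketchTraefikLabels`, its parameter list, and the choice to name the service after the router are assumptions of this example rather than the output of TraefikLabels.

```go
package main

import (
	"fmt"
	"strconv"
)

// sketchTraefikLabels builds a primary-router label set with an explicit
// router-to-service binding. It is a hypothetical stand-in for TraefikLabels.
func sketchTraefikLabels(entrypoint, networkName, routerName, fqdn string, containerPort int) map[string]string {
	router := "traefik.http.routers." + routerName
	service := "traefik.http.services." + routerName
	return map[string]string{
		"traefik.enable":         "true",
		"traefik.docker.network": networkName,
		router + ".rule":         fmt.Sprintf("Host(`%s`)", fqdn),
		router + ".entrypoints":  entrypoint,
		router + ".tls":          "true", // no certresolver, matching the text above
		// Explicit binding: without it, a second service on the same
		// container makes auto-binding ambiguous.
		router + ".service":                  routerName,
		service + ".loadbalancer.server.port": strconv.Itoa(containerPort),
	}
}

func main() {
	labels := sketchTraefikLabels("websecure", "fred-tenant-net", "lease1-web", "web-abc1234.example.com", 8080)
	for k, v := range labels {
		fmt.Printf("%s=%s\n", k, v)
	}
}
```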
Types ¶
type ActorSnapshot ¶
type ActorSnapshot struct {
LeaseUUID string `json:"lease_uuid"`
SMState string `json:"sm_state"` // current SM state
InboxDepth int `json:"inbox_depth"` // pending messages not yet processed
InboxCap int `json:"inbox_cap"`
}
ActorSnapshot is a point-in-time view of one lease actor's state for operator introspection. Safe to marshal to JSON for a /debug/actors endpoint when integrated with the HTTP layer.
type Backend ¶
type Backend struct {
// contains filtered or unexported fields
}
Backend implements the backend.Backend interface for Docker containers.
func (*Backend) DebugActors ¶
func (b *Backend) DebugActors() []ActorSnapshot
DebugActors returns a snapshot of every live lease actor. The returned slice is a copy, so it stays stable for the caller even though the registry may grow or change state after the call returns. Intended for ops introspection during incidents; pair it with a /debug/actors HTTP handler that JSON-encodes the result.
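A handler of the kind suggested above could be wired roughly as follows. The package does not ship this handler in the documentation shown here, so `debugActorsHandler`, the local `actorSnapshot` mirror type, and `fetchSnapshots` are illustrative names. The sketch takes a function value so it can be exercised without a real Backend.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// actorSnapshot mirrors the documented ActorSnapshot fields and JSON tags.
type actorSnapshot struct {
	LeaseUUID  string `json:"lease_uuid"`
	SMState    string `json:"sm_state"`
	InboxDepth int    `json:"inbox_depth"`
	InboxCap   int    `json:"inbox_cap"`
}

// debugActorsHandler JSON-encodes whatever snapshot() returns. In real use,
// snapshot would be a method value such as b.DebugActors.
func debugActorsHandler(snapshot func() []actorSnapshot) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		// Encoding errors after headers are sent can only be logged.
		_ = json.NewEncoder(w).Encode(snapshot())
	})
}

// fetchSnapshots calls the handler in-process and decodes the response.
func fetchSnapshots() (int, []actorSnapshot) {
	h := debugActorsHandler(func() []actorSnapshot {
		return []actorSnapshot{{LeaseUUID: "lease-a", SMState: "Ready", InboxDepth: 0, InboxCap: 16}}
	})
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/debug/actors", nil))
	var got []actorSnapshot
	if err := json.Unmarshal(rec.Body.Bytes(), &got); err != nil {
		return rec.Code, nil
	}
	return rec.Code, got
}

func main() {
	code, snaps := fetchSnapshots()
	fmt.Println(code, snaps)
}
```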
func (*Backend) Deprovision ¶
Deprovision is the public shim: it routes the request through the lease's actor so that container-death and deprovision messages serialize per lease. Routing forces a Ready/Failing/Failed → Deprovisioning SM transition whose Failing.OnExit cancels the in-flight diag goroutine — the structural suppression of stale Failed callbacks.
func (*Backend) GetInfo ¶
GetInfo returns lease information including connection details. Always populates `Services` keyed by service name (the primary post-Tasks-3-9 source of truth). `Instances` is a flattened convenience view computed by concatenating service instances in deterministic service-name order so older tooling that consumes the flat array keeps working.
func (*Backend) GetLoadStats ¶ added in v0.5.0
GetLoadStats returns the backend's current CPU-load snapshot for least-loaded provision routing (ENG-318). It wraps the resource pool's Stats() — the same source the HTTP GET /stats endpoint serves — so the in-process docker backend exposes the same load signal as the HTTP path.
func (*Backend) GetLogs ¶
func (b *Backend) GetLogs(ctx context.Context, leaseUUID string, tail int) (map[string]string, error)
GetLogs returns the last N lines of stdout/stderr for each container in a lease, keyed by "serviceName/instanceIndex" (e.g., "web/0", "db/0"). Falls back to the diagnostics store when the provision is not in memory (e.g., after deprovision). Returns ErrNotProvisioned only if both miss. On partial failure (some containers succeed, some fail), the successful logs are returned along with error placeholders, and the errors are logged.
Post-Tasks-3-9 the live path always populates `prov.ServiceContainers`, so the legacy "key by instance index only" branch is gone — every lease is stack-shaped from the in-memory state's perspective.
func (*Backend) GetProvision ¶
GetProvision returns a single provision by lease UUID.
When the lease is not in the in-memory map (e.g., after close/expire), the retention store is consulted BEFORE the diagnostics fallback so a soft-deleted lease surfaces as Status=retained (with RetainedUntil + Items for the restore shape) for the offline tenant to self-serve within the grace window — and never regresses to a stale Status=failed diagnostics entry. Falls back to the diagnostics store otherwise. Returns ErrNotProvisioned only if all sources miss.
func (*Backend) GetReleases ¶
GetReleases returns the release history for a lease.
func (*Backend) Health ¶
Health checks that the Docker daemon is reachable AND the persistence stores are readable. Probing the bbolt stores (not just docker.Ping) means a locked/corrupt/read-only retention or release store surfaces as unhealthy instead of the backend reporting healthy while soft-delete/restore silently fail — the most data-loss-sensitive subsystem must not be the unmonitored one. (ENG-448 / F31)
func (*Backend) ListProvisions ¶
ListProvisions returns all currently provisioned resources.
func (*Backend) ListProvisionsPage ¶ added in v0.7.0
func (b *Backend) ListProvisionsPage(ctx context.Context, after string, limit int) ([]backend.ProvisionInfo, string, error)
ListProvisionsPage returns one keyset page of provisions for the /provisions handler — the same paged-handler role ListRetentionsPage plays for /retentions, but NOT the same performance profile. The docker provision store is an in-memory map with no ordered index, so this snapshots it via ListProvisions and paginates in memory (O(N log N) per page, no disk I/O), whereas ListRetentionsPage serves O(limit) pages directly from the ordered bbolt index via a cursor seek. A true store-level O(limit) provision read is tracked as ENG-455 (it needs the ENG-381 ordered snapshot, since recoverState rebuilds the whole map each tick).
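In-memory keyset pagination of the kind described above can be sketched as: copy, sort, seek past the cursor, slice. The `pageByUUID` helper is hypothetical and paginates bare UUID strings rather than backend.ProvisionInfo values. The cursor semantics (last UUID of the page, empty when no further page exists) and the treatment of limit <= 0 as "return everything" are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// pageByUUID returns the page of UUIDs strictly after the cursor, plus the
// cursor for the next page ("" when this is the last page).
func pageByUUID(all []string, after string, limit int) (page []string, next string) {
	sorted := append([]string(nil), all...) // snapshot; never sort the caller's slice
	sort.Strings(sorted)                    // the O(N log N) step paid on every page

	// Seek to the first element strictly greater than the cursor.
	start := sort.Search(len(sorted), func(i int) bool { return sorted[i] > after })

	if limit <= 0 || start+limit >= len(sorted) {
		return sorted[start:], ""
	}
	page = sorted[start : start+limit]
	return page, page[len(page)-1]
}

func main() {
	all := []string{"c", "a", "d", "b"}
	page, next := pageByUUID(all, "", 2)
	fmt.Println(page, next)
	page, next = pageByUUID(all, next, 2)
	fmt.Println(page, next)
}
```

A keyset cursor stays valid when entries are added or removed between calls, which an offset-based scheme would not guarantee.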
func (*Backend) ListRetentions ¶ added in v0.5.0
ListRetentions returns the leases this docker backend currently retains (soft-deleted, awaiting restore or grace-reap), read from the retention store. Used by fred's reconciler for restore backend affinity (ENG-333). Returns an empty slice when the retention store is not configured.
func (*Backend) ListRetentionsPage ¶ added in v0.7.0
func (b *Backend) ListRetentionsPage(_ context.Context, after string, limit int) ([]backend.RetainedLease, string, error)
ListRetentionsPage returns one keyset page of retained lease UUIDs, served directly from the retention store's ordered bbolt index via a cursor Seek (O(limit) per page) rather than reading the whole set and paginating in memory. It is the paged sibling of ListRetentions used by the /retentions handler. limit is clamped to backend.MaxPageLimit, mirroring PaginateRetentions; limit <= 0 is the unpaginated passthrough.
func (*Backend) LookupProvisions ¶
func (b *Backend) LookupProvisions(_ context.Context, uuids []string) ([]backend.ProvisionInfo, error)
LookupProvisions returns provision info for the requested lease UUIDs. Missing leases are absent from the returned slice (not an error). O(k) lookups against the in-memory provisions map, where k = len(uuids).
func (*Backend) Provision ¶
Provision starts async provisioning of containers. For multi-unit leases (quantity > 1), multiple containers are created. For multi-SKU leases, containers are created with the appropriate profile for each SKU.
Pre-flight validation errors (unknown SKU, invalid manifest, disallowed image, insufficient resources) are returned synchronously so the caller can respond with an appropriate HTTP status. Only truly asynchronous failures (image pull, container create/start) are communicated via callback.
func (*Backend) ReconcileCustomDomain ¶
func (b *Backend) ReconcileCustomDomain(ctx context.Context, leaseUUID string, items []backend.LeaseItem) error
ReconcileCustomDomain reapplies the per-LeaseItem custom_domain values from chain onto the running provision. When at least one item's CustomDomain differs from the in-memory state, the backend computes the diff read-only, then routes the redeploy through the lease actor via routeReplaceRestart. The actor commits prov.Items on success — no off-actor mutation, no CAS rollback. A failed redeploy leaves prov.Items untouched so the next reconciler tick retries (ENG-278).
Reconciliation only runs when the provision is in ProvisionStatusReady. Any other state (Provisioning, Restarting, Updating, Failing, Failed, Deprovisioning, Unknown) is treated as "not the right time": skip without error and let the periodic reconciler call back when the provision settles.
The reconcile uses a two-RLock-pass candidate-only shape (ENG-277): a candidate pre-pass (RLock) finds which incoming domains actually differ from what's emitted and need DNS resolution; only those are resolved off-lock via b.dnsGateAllows; a main pass (RLock) re-reads fresh prov and runs computeCustomDomainOverrides read-only. A steady-state lease performs zero DNS lookups; a not-Ready lease short-circuits in the pre-pass with zero DNS.
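The two-pass shape described above can be sketched in isolation. Everything named here (`leaseState`, its `emitted` map, `reconcile`, and the `dnsAllows` callback) is a hypothetical simplification; the real code works on prov.Items and b.dnsGateAllows. The sketch shows only why a steady-state or not-Ready lease performs zero lookups and why no lock is held during DNS.

```go
package main

import (
	"fmt"
	"sync"
)

// leaseState is a hypothetical slice of provision state guarded by an RWMutex.
type leaseState struct {
	mu      sync.RWMutex
	ready   bool
	emitted map[string]string // service name -> custom domain currently emitted
}

// reconcile returns the per-service overrides to apply and the number of DNS
// lookups performed.
func (s *leaseState) reconcile(incoming map[string]string, dnsAllows func(domain string) bool) (map[string]string, int) {
	// Pass 1 (RLock): find domains that differ and therefore need resolving.
	s.mu.RLock()
	if !s.ready {
		s.mu.RUnlock()
		return nil, 0
	}
	var candidates []string
	for svc, domain := range incoming {
		if domain != "" && s.emitted[svc] != domain {
			candidates = append(candidates, domain)
		}
	}
	s.mu.RUnlock()

	// Off-lock: resolve only the candidates.
	allowed := make(map[string]bool, len(candidates))
	for _, d := range candidates {
		allowed[d] = dnsAllows(d)
	}

	// Pass 2 (RLock): re-read fresh state and compute overrides read-only.
	s.mu.RLock()
	defer s.mu.RUnlock()
	if !s.ready {
		return nil, len(candidates)
	}
	overrides := make(map[string]string)
	for svc, domain := range incoming {
		if s.emitted[svc] == domain {
			continue
		}
		if domain == "" || allowed[domain] {
			overrides[svc] = domain
		}
	}
	return overrides, len(candidates)
}

func main() {
	s := &leaseState{ready: true, emitted: map[string]string{"web": "old.example.com"}}
	allowAll := func(string) bool { return true }
	fmt.Println(s.reconcile(map[string]string{"web": "old.example.com"}, allowAll))
	fmt.Println(s.reconcile(map[string]string{"web": "new.example.com"}, allowAll))
}
```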
No-op when this backend is configured with ingress disabled. Without ingress, applyIngressLabels emits no Traefik labels (primary or secondary) at all, so a Restart triggered for custom-domain drift would recreate the containers with no LabelCustomDomain — and the next recoverState tick would rebuild prov.Items[].CustomDomain back to "" from the unlabeled containers, putting the reconciler into a permanent restart loop against the chain's non-empty value. Returning early here avoids the loop.
func (*Backend) RefreshState ¶
RefreshState synchronizes in-memory provision state with Docker.
func (*Backend) Restart ¶
Restart restarts containers for a lease without changing the manifest. State machine: Ready|Failed → Restarting → Ready|Failed
SEAM CLOSED (ENG-230). This prelude is read-only: it fast-fails on ErrNotProvisioned / ErrInvalidState under provisionsMu, snapshots the fields the worker needs, then does pure work (manifest marshal + release-store Append). It performs NO write to prov.Status / prov.CallbackURL — the lease actor's onEnterRestarting entry action is the sole writer of those fields, firing inside handleRestartRequested BEFORE the ack. Because Restart() returns only after observing that ack, the "Restart() returns => prov.Status == Restarting" invariant the HTTP handler's event-broker publish depends on (api/handlers.go: RestartLease) is preserved without an off-actor write.
The prelude's fast-fail is only a route-time precondition — it does NOT guarantee the lease is still Ready/Failed when the actor dequeues the message. The real serialization is the actor inbox (the only path that mutates prov.Status). So a same-lease concurrent restart that passes the route-time check but loses the race (the winner already ran onEnterRestarting) is REJECTED by the actor, not prevented here: handleRestartRequested's classifyReplaceReject returns ErrInvalidState for the busy SM, which this function forwards and api/handlers.go maps to a clean 409.
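The distinction between the route-time check and the actor's own rejection can be illustrated with a request/ack sketch. The `lease`, `restartReq`, and `raceTwoRestarts` names are hypothetical, and the status strings are simplified; the point is only that the status is written by a single goroutine, so exactly one of two racing restarts is accepted and the other receives an invalid-state error.

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalidState = errors.New("invalid state")

// restartReq carries a reply channel so the caller can wait for the ack.
type restartReq struct {
	ack chan error
}

// lease owns its status; only the run goroutine reads or writes it.
type lease struct {
	status string
	inbox  chan restartReq
}

// run is the actor loop: it accepts a restart only from ready or failed.
func (l *lease) run() {
	for req := range l.inbox {
		if l.status != "ready" && l.status != "failed" {
			req.ack <- errInvalidState // the loser of the race is rejected here
			continue
		}
		l.status = "restarting" // entry action runs before the ack
		req.ack <- nil
	}
}

// restart returns only after the actor has acknowledged or rejected.
func (l *lease) restart() error {
	req := restartReq{ack: make(chan error, 1)}
	l.inbox <- req
	return <-req.ack
}

// raceTwoRestarts fires two concurrent restarts at one ready lease.
func raceTwoRestarts() (accepted, rejected int) {
	l := &lease{status: "ready", inbox: make(chan restartReq)}
	go l.run()
	results := make(chan error, 2)
	for i := 0; i < 2; i++ {
		go func() { results <- l.restart() }()
	}
	for i := 0; i < 2; i++ {
		if err := <-results; err == nil {
			accepted++
		} else if errors.Is(err, errInvalidState) {
			rejected++
		}
	}
	close(l.inbox)
	return accepted, rejected
}

func main() {
	fmt.Println(raceTwoRestarts())
}
```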
Since no off-actor Status write remains, there is nothing to roll back on a marshal / Append / routing / ack failure: the error paths just return (the release-store Append is on a separate bbolt store; a "deploying" record left behind on routing/ack failure is cosmetic — recover.go skips non-active releases and deprovision deletes them).
func (*Backend) Restore ¶ added in v0.5.0
Restore adopts a soft-deleted lease's retained volumes into a NEW lease and brings up its stack from the retained manifest (ENG-325). The new lease is reserved at Provisioning and driven through the existing replace machinery via evRestoreRequested (Provisioning→Restarting→Ready|Failed).
The flow is the reviewed Rev 5 design; ordering is load-bearing:
(a) validate against the retained record (read-only),
(b) reserve the new-lease provision at Provisioning (reject if live),
(c) allocate pool slots,
(d) ATOMICALLY claim active→restoring (closes the prelude-vs-reaper race),
(e) adopt: rename retained→canonical (full rollback on failure),
(f) hand off to the actor; doRestore's terminal defer owns success/failure/panic.
Synchronous errors (validation, already-provisioned, insufficient resources, not-retained, not-restorable) are returned to the caller; asynchronous outcomes flow via the lease callback.
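Step (d) of the flow above, the atomic active→restoring claim, can be sketched as a compare-and-set on the retention record. The `retentionStore` type, its in-memory map, and the "reaped" state name are hypothetical; the real store is bbolt-backed. The sketch shows why a restore and a reaper racing on the same record cannot both succeed.

```go
package main

import (
	"fmt"
	"sync"
)

// retentionStore is a hypothetical in-memory stand-in for the retention store.
type retentionStore struct {
	mu    sync.Mutex
	state map[string]string // lease UUID -> "active", "restoring", or "reaped"
}

// transition moves a record from one state to another only if it is still in
// the expected state, and reports whether the move happened.
func (s *retentionStore) transition(uuid, from, to string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.state[uuid] != from {
		return false
	}
	s.state[uuid] = to
	return true
}

// raceClaimVsReaper runs a restore claim and a reaper concurrently against
// the same active record and reports which one won.
func raceClaimVsReaper() (restoreWon, reaperWon bool) {
	s := &retentionStore{state: map[string]string{"lease-a": "active"}}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		restoreWon = s.transition("lease-a", "active", "restoring")
	}()
	go func() {
		defer wg.Done()
		reaperWon = s.transition("lease-a", "active", "reaped")
	}()
	wg.Wait()
	return restoreWon, reaperWon
}

func main() {
	r, p := raceClaimVsReaper()
	fmt.Println("restore won:", r, "reaper won:", p)
}
```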
func (*Backend) Stats ¶
func (b *Backend) Stats() shared.ResourceStats
Stats returns current resource usage statistics.
func (*Backend) Update ¶
Update deploys a new manifest for a lease, replacing containers. State machine: Ready|Failed → Updating → Ready|Failed
SEAM CLOSED (ENG-230) — see the extended comment on Backend.Restart. Like Restart, the prelude is read-only: it fast-fails / validates under provisionsMu, snapshots fields, then records the release. It performs NO write to prov.Status / prov.CallbackURL — the actor's onEnterUpdating entry action is the sole writer, firing inside handleUpdateRequested BEFORE the ack, so the "Update() returns => Status is Updating" contract holds without an off-actor write. No rollback is needed on any failure path (nothing on prov was mutated).
type Config ¶
type Config struct {
// LogLevel controls the log verbosity (debug, info, warn, error).
// When empty, defaults to "info" at startup (see cmd/docker-backend/main.go).
LogLevel string `yaml:"log_level"`
// Name is the backend identifier.
Name string `yaml:"name"`
// ListenAddr is the address the HTTP server listens on.
ListenAddr string `yaml:"listen_addr"`
// MaxRequestBodySize caps inbound HTTP request bodies (bytes). It must
// exceed providerd's request cap plus forward-wrapping overhead; defaults to
// DefaultMaxRequestBodySize when unset or non-positive. (ENG-448 / F42)
MaxRequestBodySize int64 `yaml:"max_request_body_size"`
// ProductionMode tightens startup checks beyond basic validation. When true,
// Validate rejects dev-only insecure toggles — currently
// callback_insecure_skip_verify, which disables TLS verification on the
// backend → Fred callback hop. Mirrors providerd's production_mode (which
// gates the reverse providerd → backend tls_skip_verify). Defaults to false.
ProductionMode bool `yaml:"production_mode"`
// TLSCertFile and TLSKeyFile enable HTTPS on the listener when both are
// set; otherwise it serves plaintext HTTP (the default). Loaded once at
// startup — rotation requires a restart (see ENG-294).
TLSCertFile string `yaml:"tls_cert_file"`
TLSKeyFile string `yaml:"tls_key_file"`
// TLSClientCAFile turns on mutual TLS when set: the listener requires and
// verifies a client certificate signed by this CA. Requires TLSCertFile and
// TLSKeyFile (the listener must be on TLS first).
TLSClientCAFile string `yaml:"tls_client_ca_file"`
// TLSClientAllowedNames optionally pins the mTLS client's identity: the
// presented certificate's CommonName or a DNS SAN must be in this list.
// Empty accepts any certificate signed by TLSClientCAFile. Requires
// TLSClientCAFile. Use this whenever the client CA is not dedicated solely
// to providerd.
TLSClientAllowedNames []string `yaml:"tls_client_allowed_names"`
// DockerHost is the Docker daemon socket path or URL.
DockerHost string `yaml:"docker_host"`
// TotalCPUCores is the total CPU cores available in the resource pool.
TotalCPUCores float64 `yaml:"total_cpu_cores"`
// TotalMemoryMB is the total memory available in MB.
TotalMemoryMB int64 `yaml:"total_memory_mb"`
// TotalDiskMB is the total disk space available in MB.
TotalDiskMB int64 `yaml:"total_disk_mb"`
// SKUMapping maps on-chain SKU UUIDs to profile names.
// This allows the backend to translate chain SKU UUIDs to local resource profiles.
// Example: {"019c1ee7-1aaf-7000-802c-ad775c72cc27": "docker-small"}
SKUMapping map[string]string `yaml:"sku_mapping"`
// SKUProfiles maps SKU names to resource profiles.
SKUProfiles map[string]SKUProfile `yaml:"sku_profiles"`
// AllowedRegistries is the list of allowed container registries.
AllowedRegistries []string `yaml:"allowed_registries"`
// CallbackSecret is the HMAC secret for signing callbacks.
CallbackSecret config.Secret `yaml:"callback_secret"`
// HostAddress is the external address for port mappings.
HostAddress string `yaml:"host_address"`
// ImagePullTimeout is the timeout for pulling images.
ImagePullTimeout time.Duration `yaml:"image_pull_timeout"`
// ContainerCreateTimeout is the timeout for creating containers.
ContainerCreateTimeout time.Duration `yaml:"container_create_timeout"`
// ContainerStartTimeout is the timeout for starting containers.
ContainerStartTimeout time.Duration `yaml:"container_start_timeout"`
// ContainerStopTimeout is the grace period for stopping containers.
// Containers receive SIGTERM and have this long to shut down gracefully
// before being force-killed (SIGKILL). Defaults to 30 seconds.
ContainerStopTimeout time.Duration `yaml:"container_stop_timeout"`
// ReconcileInterval is how often to reconcile state with Docker.
ReconcileInterval time.Duration `yaml:"reconcile_interval"`
// CallbackInsecureSkipVerify skips TLS certificate verification for callbacks.
// WARNING: This disables TLS certificate validation, enabling MITM attacks.
// NEVER enable in production. Only use for local development with self-signed certificates.
CallbackInsecureSkipVerify bool `yaml:"callback_insecure_skip_verify"`
// CallbackDBPath is the path to the bbolt database for persisting pending callbacks.
// Defaults to "callbacks.db".
CallbackDBPath string `yaml:"callback_db_path"`
// ProvisionTimeout is the maximum time allowed for the entire provisioning
// operation (image pull + container creation + start). If exceeded, the
// provisioning is canceled and a failure callback is sent.
ProvisionTimeout time.Duration `yaml:"provision_timeout"`
// HostBindIP is the IP address to bind container ports to.
// Defaults to "0.0.0.0" (all interfaces).
HostBindIP string `yaml:"host_bind_ip"`
// NetworkIsolation enables per-tenant Docker network isolation.
// When true, each tenant's containers are placed in a separate bridge network.
// Provides inter-tenant isolation. The network is created with Internal:false
// — required for port publishing (moby#36174) — so containers retain outbound
// internet access as a side effect. Defaults to true.
NetworkIsolation *bool `yaml:"network_isolation"`
// ContainerReadonlyRootfs sets the container's root filesystem to read-only.
// When true, /tmp and /run are mounted as tmpfs. Defaults to true.
ContainerReadonlyRootfs *bool `yaml:"container_readonly_rootfs"`
// ContainerPidsLimit limits the number of PIDs in each container.
// Defaults to 256.
ContainerPidsLimit *int64 `yaml:"container_pids_limit"`
// ContainerTmpfsSizeMB sets the tmpfs size in MB for /tmp and /run when
// readonly rootfs is enabled. Defaults to 64.
ContainerTmpfsSizeMB int `yaml:"container_tmpfs_size_mb"`
// StartupVerifyDuration is how long to wait after starting containers before
// verifying they're still running. This catches containers that crash immediately
// on startup (e.g., bad config, read-only filesystem errors, missing dependencies).
// The success callback is only sent after verification passes.
// Defaults to 5 seconds. Setting to 0 uses the default (verification cannot be disabled).
StartupVerifyDuration time.Duration `yaml:"startup_verify_duration"`
// TenantQuota configures per-tenant resource limits.
// When set, prevents any single tenant from consuming the entire pool.
TenantQuota *TenantQuotaConfig `yaml:"tenant_quota"`
// VolumeDataPath is the host directory for managed volumes.
// Required when any SKU profile has DiskMB > 0.
// Each container gets a quota-enforced subdirectory under this path.
VolumeDataPath string `yaml:"volume_data_path"`
// VolumeFilesystem specifies the filesystem type for volume quota enforcement.
// Supported values: "btrfs", "xfs", "zfs". If empty, auto-detected from VolumeDataPath.
VolumeFilesystem string `yaml:"volume_filesystem"`
// CallbackMaxAge is the maximum age of a persisted callback entry.
// Entries older than this are removed by the callback store's background cleanup.
// Defaults to 24h.
CallbackMaxAge time.Duration `yaml:"callback_max_age"`
// DiagnosticsDBPath is the path to the bbolt database for persisting failure diagnostics.
// Defaults to "diagnostics.db".
DiagnosticsDBPath string `yaml:"diagnostics_db_path"`
// DiagnosticsMaxAge is the maximum age of a persisted diagnostic entry.
// Entries older than this are removed by the diagnostics store's background cleanup.
// Defaults to 7 days.
DiagnosticsMaxAge time.Duration `yaml:"diagnostics_max_age"`
// ReleasesDBPath is the path to the bbolt database for persisting release history.
// Defaults to "releases.db".
ReleasesDBPath string `yaml:"releases_db_path"`
// ReleasesMaxAge is the maximum age of a persisted release entry.
// Entries older than this are removed by the release store's background cleanup.
// Defaults to 90 days.
ReleasesMaxAge time.Duration `yaml:"releases_max_age"`
// RetainOnClose controls whether a lease's managed VOLUMES are soft-deleted
// (renamed into the fred-retained- namespace and recorded in the retention
// store) instead of immediately destroyed when the lease is closed. The
// lease's containers are still stopped and removed either way; only the
// volumes are retained. When false (default), the volumes are destroyed
// immediately on close.
RetainOnClose bool `yaml:"retain_on_close"`
// RetentionDBPath is the path to the bbolt database for persisting
// soft-deleted leases awaiting restore or reaping.
// Defaults to "retention.db".
RetentionDBPath string `yaml:"retention_db_path"`
// RetentionMaxAge is the grace window after which a soft-deleted lease
// becomes eligible for reaping. 0 disables reaping entirely.
// Defaults to 90 days.
RetentionMaxAge time.Duration `yaml:"retention_max_age"`
// RetentionReapInterval is how often the background reaper sweeps for
// expired soft-deleted leases. Decoupled from RetentionMaxAge so the
// sweep cadence can be tuned independently of the grace window.
// Defaults to 1h.
RetentionReapInterval time.Duration `yaml:"retention_reap_interval"`
// MaxRetainedLeasesPerTenant caps how many soft-deleted leases a single
// tenant may have in the retention store at once. 0 means unlimited.
MaxRetainedLeasesPerTenant int `yaml:"max_retained_leases_per_tenant"`
// RetentionOrphanConfirmations is the number of consecutive retention sweeps a
// soft-deleted record must be observed with ALL its retained volumes missing
// before the record is pruned (ENG-370). It is a SWEEP COUNT, not a duration:
// the effective confirmation window is N × RetentionReapInterval (≈3h at the
// default 1h interval), so shortening RetentionReapInterval proportionally
// shrinks the window — re-tune N to keep a fixed grace. 0 is valid and
// disables orphan pruning entirely (kill-switch); negative values are rejected
// by Validate. Defaults to 3.
RetentionOrphanConfirmations int `yaml:"retention_orphan_confirmations"`
// MaxRetainedDiskMB caps the aggregate disk (MB) the provider will hold in
// the retained (soft-deleted) tier across ALL tenants. When retaining a
// closing lease would push the total over this cap, the lease is destroyed
// immediately instead of retained (refuse-to-retain) — never evicting
// another tenant's in-grace data. 0 means unlimited (default; retained
// volumes still count against total_disk_mb via the admission gate, but are
// not separately bounded). Must be <= total_disk_mb and, when set, >= the
// largest single stateful SKU's disk_mb (else a SKU-legal lease could never
// be retained). Independent of tenant_quota.max_disk_mb: it may be smaller,
// in which case a tenant's max-sized lease is SKU-legal yet refused
// retention. Value is a plain integer in MB (mebibytes, 2^20 bytes —
// consistent with total_disk_mb and the SKU disk_mb fields).
MaxRetainedDiskMB int64 `yaml:"max_retained_disk_mb"`
// MigrationGracePeriod is how long the renamed `-prev` legacy container
// lingers after a successful recover-time migration before forced
// removal. Preserves rollback potential if the operator interrupts fred
// in the migration window to inspect. Defaults to 1m.
MigrationGracePeriod time.Duration `yaml:"migration_grace_period"`
// MigrationReadyTimeout caps how long the recover-time migration waits
// for the new stack-form container to reach `healthy` (or `running`
// when no health check is declared) before declaring the migration
// failed for that lease. Defaults to 90s.
MigrationReadyTimeout time.Duration `yaml:"migration_ready_timeout"`
// Ingress configures optional reverse proxy integration.
// When enabled, containers with routable TCP ports get proxy labels
// pointing Traefik at the per-tenant network for HTTPS auto-discovery.
// Requires network_isolation to be enabled.
Ingress IngressConfig `yaml:"ingress"`
}
Config holds the configuration for the Docker backend.
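The retention fields above carry two pieces of arithmetic: the orphan confirmation window (N × RetentionReapInterval) and the refuse-to-retain check against MaxRetainedDiskMB. A minimal sketch of both follows. The function names orphanWindow, confirmationsFor and canRetain are hypothetical helpers for illustration and are not part of this package.

```go
package main

import "time"

// orphanWindow returns the effective confirmation window for orphan pruning:
// confirmations × reapInterval. A count of 0 disables pruning, so this sketch
// reports a zero window in that case.
func orphanWindow(confirmations int, reapInterval time.Duration) time.Duration {
	if confirmations <= 0 {
		return 0
	}
	return time.Duration(confirmations) * reapInterval
}

// confirmationsFor returns the smallest sweep count whose window covers grace.
// Both arguments are assumed to be positive.
func confirmationsFor(grace, reapInterval time.Duration) int {
	n := int((grace + reapInterval - 1) / reapInterval) // ceiling division
	if n < 1 {
		n = 1
	}
	return n
}

// canRetain reports whether retaining a closing lease keeps the retained tier
// at or under the cap. capMB == 0 means unlimited.
func canRetain(retainedMB, leaseMB, capMB int64) bool {
	if capMB == 0 {
		return true
	}
	return retainedMB+leaseMB <= capMB
}
```

With the defaults, orphanWindow(3, time.Hour) is 3h. If the reap interval is shortened to 15m, confirmationsFor(3*time.Hour, 15*time.Minute) suggests raising retention_orphan_confirmations to 12 to keep the same grace.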
func DefaultConfig ¶
func DefaultConfig() Config
DefaultConfig returns a Config with sensible defaults.
SKUProfiles is intentionally left empty: tier sizing is operator policy, not a code default. yaml.v3 merges map keys, so seeding defaults here would silently leak into any partial sku_profiles: block in YAML and trip the bidirectional sku_mapping/sku_profiles reachability check in Validate (see ENG-238). Operators must declare sku_profiles in their config; Validate enforces non-empty.
func (*Config) GetHostBindIP ¶
GetHostBindIP returns the configured bind IP, defaulting to "0.0.0.0".
func (*Config) GetPidsLimit ¶
GetPidsLimit returns the PID limit for containers. Defaults to 256.
func (*Config) GetSKUProfile ¶
func (c *Config) GetSKUProfile(sku string) (SKUProfile, error)
GetSKUProfile returns the profile for a SKU, or an error if not found. It first checks if the SKU is a UUID that maps to a profile name via SKUMapping, then falls back to direct profile lookup.
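The two-step lookup can be sketched as follows. This is an illustration under assumptions, not the package's implementation: skuProfile is a stand-in with only a DiskMB field, lookupProfile is a hypothetical name, and the behavior when a mapping points at a missing profile (an error here) is assumed.

```go
package main

import "fmt"

// skuProfile is a hypothetical stand-in for SKUProfile; only DiskMB is shown.
type skuProfile struct {
	DiskMB int64
}

// lookupProfile resolves sku through the UUID-to-name mapping first, then
// looks the resulting name up in the profile table. A SKU that is not in the
// mapping is treated as a profile name directly.
func lookupProfile(sku string, mapping map[string]string, profiles map[string]skuProfile) (skuProfile, error) {
	name := sku
	if mapped, ok := mapping[sku]; ok {
		name = mapped
	}
	p, ok := profiles[name]
	if !ok {
		return skuProfile{}, fmt.Errorf("no profile for SKU %q", sku)
	}
	return p, nil
}

// hasStateful mirrors the HasStatefulSKUs rule: any profile with DiskMB > 0.
func hasStateful(profiles map[string]skuProfile) bool {
	for _, p := range profiles {
		if p.DiskMB > 0 {
			return true
		}
	}
	return false
}
```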
func (*Config) GetTmpfsSizeMB ¶
GetTmpfsSizeMB returns the tmpfs size in MB. Defaults to 64.
func (*Config) HasStatefulSKUs ¶
HasStatefulSKUs returns true if any SKU profile has DiskMB > 0, indicating that volume management is needed.
func (*Config) IsNetworkIsolation ¶
IsNetworkIsolation returns whether per-tenant network isolation is enabled. Defaults to true (secure by default).
func (*Config) IsReadonlyRootfs ¶
IsReadonlyRootfs returns whether containers should have a read-only root filesystem. Defaults to true (secure by default).
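The getters above follow a common pattern: pointer-typed fields distinguish "unset" (nil) from an explicit false or zero, so that an omitted YAML key still yields the secure default. A minimal sketch of that pattern, using a hypothetical hardening struct that copies only a few Config fields; treating a non-positive tmpfs size as "unset" is an assumption of the sketch.

```go
package main

// hardening is a hypothetical stand-in for a subset of Config.
type hardening struct {
	HostBindIP           string
	NetworkIsolation     *bool
	ContainerPidsLimit   *int64
	ContainerTmpfsSizeMB int
}

// isNetworkIsolation defaults to true when the field was not set.
func (h *hardening) isNetworkIsolation() bool {
	if h.NetworkIsolation == nil {
		return true
	}
	return *h.NetworkIsolation
}

// pidsLimit defaults to 256 when the field was not set.
func (h *hardening) pidsLimit() int64 {
	if h.ContainerPidsLimit == nil {
		return 256
	}
	return *h.ContainerPidsLimit
}

// tmpfsSizeMB defaults to 64; a non-positive value is treated as unset here.
func (h *hardening) tmpfsSizeMB() int {
	if h.ContainerTmpfsSizeMB <= 0 {
		return 64
	}
	return h.ContainerTmpfsSizeMB
}

// hostBindIP defaults to all interfaces.
func (h *hardening) hostBindIP() string {
	if h.HostBindIP == "" {
		return "0.0.0.0"
	}
	return h.HostBindIP
}
```

A plain bool could not express this: its zero value is false, so an omitted network_isolation key would silently disable isolation.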
type ContainerEvent ¶
ContainerEvent represents a container lifecycle event from the Docker daemon. This keeps Docker SDK types out of the interface boundary.
type ContainerInfo ¶
type ContainerInfo struct {
ContainerID string
LeaseUUID string
Tenant string
ProviderUUID string
SKU string
ServiceName string // Stack service name (empty for single-container leases)
InstanceIndex int
FailCount int
CallbackURL string
Image string
Status string
Health HealthStatus // Health check status (HealthStatusHealthy, HealthStatusUnhealthy, HealthStatusStarting, or HealthStatusNone)
ExitCode int // Process exit code (meaningful when Status is "exited"/"dead")
OOMKilled bool // True if container was killed by the OOM killer
CreatedAt time.Time
Ports map[string]PortBinding
FQDN string
CustomDomain string // Tenant-supplied custom FQDN (empty when not set)
// Name is the human-readable container name (without the leading "/"
// the Docker API prepends). Populated by both InspectContainer and
// ListManagedContainers. Used by the recover-time migration to filter
// out already-migrated `-prev` remnants.
Name string
// Mounts is the set of bind/volume mounts attached to the container.
// Populated by InspectContainer always; populated by
// ListManagedContainers from `types.Container.Mounts` (which the
// list-containers API includes inline, so no extra Inspect round-trip
// is needed at startup). Used by the recover-time migration to locate
// managed volumes that need renaming under the new naming convention.
Mounts []ContainerMount
}
ContainerInfo holds information about a managed container.
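As an illustration of how the Name field might be used, the sketch below strips the leading "/" that the Docker API prepends and drops `-prev` remnants. Matching remnants by name suffix is an assumption made for this example, and normalizeName and withoutPrevRemnants are hypothetical helpers, not functions of this package.

```go
package main

import "strings"

// prevSuffix is an assumption: remnants are identified here by a name suffix.
const prevSuffix = "-prev"

// normalizeName removes the leading "/" the Docker API prepends to names.
func normalizeName(apiName string) string {
	return strings.TrimPrefix(apiName, "/")
}

// withoutPrevRemnants returns the normalized names that do not look like
// already-migrated `-prev` remnants.
func withoutPrevRemnants(apiNames []string) []string {
	kept := make([]string, 0, len(apiNames))
	for _, raw := range apiNames {
		name := normalizeName(raw)
		if strings.HasSuffix(name, prevSuffix) {
			continue
		}
		kept = append(kept, name)
	}
	return kept
}
```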
type ContainerMount ¶
type ContainerMount struct {
Source string
Target string
Type string // "bind" | "volume" | "tmpfs"
}
ContainerMount mirrors the subset of Docker's mount data that fred needs for the recover-time migration: where the host bind comes from, where it is mounted inside the container, and the type discriminator (bind / volume / tmpfs). Type is a plain string rather than the typed `mount.Type` because the list-containers and inspect-container APIs surface it as a free-form string; copying those semantics here keeps callers from caring about which API shape they are consuming.
type CreateContainerParams ¶
type CreateContainerParams struct {
LeaseUUID string
Tenant string
ProviderUUID string
SKU string
ServiceName string // Stack service name (empty for single-container leases)
Manifest *manifest.Manifest
Profile SKUProfile
InstanceIndex int // For multi-unit leases (0-based index)
// Retry tracking
FailCount int
// CallbackURL is persisted as a label so failure callbacks can be
// sent after a docker-backend restart (when in-memory state is lost).
CallbackURL string
// Hardening parameters
HostBindIP string
ReadonlyRootfs bool
PidsLimit *int64
TmpfsSizeMB int
NetworkConfig *networktypes.NetworkingConfig
// VolumeBinds are bind mounts from host to container for managed volumes.
// Each entry maps a host path to a container path.
// Used for stateful containers (disk_mb > 0).
VolumeBinds map[string]string
// ImageVolumes are VOLUME paths from the image that need tmpfs overrides
// (for ephemeral containers only, when VolumeBinds is nil).
ImageVolumes []string
// WritablePathBinds are bind mounts from managed volume subdirectories
// to auto-detected writable directories in the container. Each entry
// maps a host path (with pre-extracted image content) to a container path.
WritablePathBinds map[string]string
// User overrides the container's runtime user (e.g., "999:999").
// When set, container.Config.User is set to this value so the container
// runs directly as the target user instead of relying on the entrypoint
// to switch users (which requires CAP_CHOWN that we drop).
User string
// BackendName identifies the backend instance that created this container,
// stored as a label to scope list/filter operations per backend.
BackendName string
// Ingress holds the reverse proxy configuration.
// When Enabled, proxy labels are injected into the container.
// NetworkName must be non-empty when Ingress is enabled (enforced by
// Config.Validate requiring network_isolation).
Ingress IngressConfig
NetworkName string // Per-tenant network name for traefik.docker.network label
Quantity int // Total quantity for this service (used in subdomain computation)
// CustomDomain is the optional tenant-supplied FQDN for this LeaseItem.
// When non-empty (and a routable HTTP port exists), CreateContainer
// emits a secondary Traefik router routing Host(<CustomDomain>) to the
// shared per-service load balancer. The value is validated here as defense
// in depth before emission; the chain performs the authoritative validation upstream.
CustomDomain string
}
CreateContainerParams holds parameters for creating a container.
type DaemonSecurityInfo ¶
type DaemonSecurityInfo struct {
StorageDriver string
BackingFilesystem string
SecurityOptions []string
IPv4Forwarding bool
Warnings []string
}
DaemonSecurityInfo contains Docker daemon capabilities relevant to container hardening validation.
type DockerClient ¶
type DockerClient struct {
// contains filtered or unexported fields
}
DockerClient wraps the Docker client for container lifecycle operations.
func NewDockerClient ¶
func NewDockerClient(host string, backendName string) (*DockerClient, error)
NewDockerClient creates a new Docker client. The backendName parameter scopes all list/filter/event operations so that multiple backend instances sharing the same Docker daemon do not interfere with each other.
func (*DockerClient) ContainerEvents ¶
func (d *DockerClient) ContainerEvents(ctx context.Context) (<-chan ContainerEvent, <-chan error)
ContainerEvents subscribes to Docker container lifecycle events, filtering for "die" events on containers managed by fred (label fred.managed=true). Returns a channel of ContainerEvent and a channel of errors. Both channels are closed when the context is canceled or the Docker event stream closes.
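Because both channels are closed on cancellation or stream end, a consumer can loop until both are drained. The sketch below shows one way to do that with the nil-channel idiom. It uses a hypothetical containerEvent type (the real ContainerEvent fields are not listed here) and a hypothetical drainEvents helper.

```go
package main

// containerEvent is a hypothetical stand-in for ContainerEvent.
type containerEvent struct {
	ContainerID string
	Action      string
}

// drainEvents calls onDie for every event until both channels are closed and
// returns the first error received, if any.
func drainEvents(events <-chan containerEvent, errs <-chan error, onDie func(containerEvent)) error {
	var firstErr error
	for events != nil || errs != nil {
		select {
		case ev, ok := <-events:
			if !ok {
				events = nil // a nil channel is never selected again
				continue
			}
			onDie(ev)
		case err, ok := <-errs:
			if !ok {
				errs = nil
				continue
			}
			if firstErr == nil {
				firstErr = err
			}
		}
	}
	return firstErr
}
```

Setting a closed channel to nil keeps the select from spinning on it while the other channel is still open.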
func (*DockerClient) ContainerLogs ¶
func (d *DockerClient) ContainerLogs(ctx context.Context, containerID string, tail int) (string, error)
ContainerLogs returns the last `tail` lines of combined stdout/stderr for a container. If tail is <= 0 it defaults to 100.
func (*DockerClient) CreateContainer ¶
func (d *DockerClient) CreateContainer(ctx context.Context, params CreateContainerParams, timeout time.Duration) (string, error)
CreateContainer creates a new container with the specified configuration. For ephemeral port bindings, it retries up to portBindRetries times on port conflict errors since these can be transient.
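The retry behavior can be sketched as a bounded loop that retries only on port conflicts. Everything named here is an assumption for illustration: the value 3 for portBindRetries, the errPortConflict sentinel, and the choice to count total attempts rather than retries after the first failure.

```go
package main

import (
	"errors"
	"fmt"
)

// errPortConflict is a hypothetical sentinel for "port is already allocated".
var errPortConflict = errors.New("port is already allocated")

// portBindRetries is an assumed value; the real constant is not shown here.
const portBindRetries = 3

// createWithRetry calls create until it succeeds, fails with a non-conflict
// error, or has been attempted portBindRetries times.
func createWithRetry(create func() (string, error)) (string, error) {
	var lastErr error
	for attempt := 1; attempt <= portBindRetries; attempt++ {
		id, err := create()
		if err == nil {
			return id, nil
		}
		if !errors.Is(err, errPortConflict) {
			return "", err // not transient: fail immediately
		}
		lastErr = err
	}
	return "", fmt.Errorf("port conflict persisted after %d attempts: %w", portBindRetries, lastErr)
}
```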
func (*DockerClient) DaemonInfo ¶
func (d *DockerClient) DaemonInfo(ctx context.Context) (DaemonSecurityInfo, error)
DaemonInfo returns Docker daemon capabilities for hardening validation.
func (*DockerClient) DetectVolumeOwner ¶
func (d *DockerClient) DetectVolumeOwner(ctx context.Context, imageName string, volumePaths []string) (uid, gid int, err error)
DetectVolumeOwner inspects the ownership of VOLUME directories inside an image by creating a temporary container (never started) and reading the tar headers from CopyFromContainer. If all volume paths share the same non-root UID:GID, those values are returned. If the paths have mixed ownership, are owned by root, or a path doesn't exist, (0, 0, nil) is returned.
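The decision rule in the last two sentences can be sketched on its own, separate from the Docker calls. In this illustration a nil entry stands for a path that does not exist, and "owned by root" is taken to mean UID 0; both are assumptions, as are the owner type and the commonNonRootOwner name.

```go
package main

// owner is a hypothetical UID:GID pair read from a tar header.
type owner struct {
	UID, GID int
}

// commonNonRootOwner returns the shared UID:GID when every path exists and
// all paths have the same non-root owner. Otherwise it returns (0, 0).
func commonNonRootOwner(owners []*owner) (uid, gid int) {
	if len(owners) == 0 {
		return 0, 0
	}
	first := owners[0]
	if first == nil || first.UID == 0 {
		return 0, 0 // missing path or root-owned
	}
	for _, o := range owners[1:] {
		if o == nil || *o != *first {
			return 0, 0 // missing path or mixed ownership
		}
	}
	return first.UID, first.GID
}
```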
func (*DockerClient) DetectWritablePaths ¶
func (d *DockerClient) DetectWritablePaths(ctx context.Context, imageName string, uid int, candidateParents []string) ([]string, error)
DetectWritablePaths scans candidate parent directories inside an image for depth-1 subdirectories owned by uid. When uid is 0 (root image), it matches directories owned by any non-root user — this handles images like neo4j that run as root but chown directories to a service user during build.
func (*DockerClient) EnsureTenantNetwork ¶
EnsureTenantNetwork creates or returns the existing network for a tenant. The network is a per-tenant bridge that provides inter-tenant isolation. Note: Internal must be false because Docker internal networks do not support port publishing (moby/moby#36174). Outbound internet access from containers is a side effect.
func (*DockerClient) ExtractImageContent ¶
func (d *DockerClient) ExtractImageContent(ctx context.Context, imageName string, paths []string, destDir string, maxBytes int64) map[string]error
ExtractImageContent extracts pre-existing image content for the given paths into destDir on the host filesystem. For each path, it creates a temporary container from the image (never started), streams the tar archive of that path via CopyFromContainer, sanitizes the tar stream, and extracts it to the appropriate subdirectory under destDir.
Returns nil on full success, or a map of path → error for failures. Callers should log failures but not fail the provision (graceful degradation).
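The return convention (nil on full success, otherwise a map of path to error) can be sketched as below. The extract callback, the extractAll name and the errMissing sentinel are hypothetical; the sketch only illustrates how per-path failures are collected without aborting the remaining paths.

```go
package main

import "errors"

// errMissing is a hypothetical error used to illustrate a per-path failure.
var errMissing = errors.New("path not found in image")

// extractAll runs extract for every path and collects failures. The map is
// allocated lazily so that full success returns nil.
func extractAll(paths []string, extract func(path string) error) map[string]error {
	var failures map[string]error
	for _, p := range paths {
		if err := extract(p); err != nil {
			if failures == nil {
				failures = make(map[string]error)
			}
			failures[p] = err
		}
	}
	return failures
}
```

A caller following the documented guidance would range over the returned map, log each entry, and continue provisioning.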
func (*DockerClient) InspectContainer ¶
func (d *DockerClient) InspectContainer(ctx context.Context, containerID string) (*ContainerInfo, error)
InspectContainer returns detailed information about a container.
func (*DockerClient) InspectImage ¶
InspectImage inspects a pulled image and returns its metadata.
func (*DockerClient) ListManagedContainers ¶
func (d *DockerClient) ListManagedContainers(ctx context.Context) ([]ContainerInfo, error)
ListManagedContainers returns all containers managed by Fred. When backendName is set, only containers belonging to this backend are returned.
func (*DockerClient) ListManagedNetworks ¶
func (d *DockerClient) ListManagedNetworks(ctx context.Context) ([]networktypes.Inspect, error)
ListManagedNetworks returns all networks created by Fred, with full details. When backendName is set, only networks belonging to this backend are returned.
func (*DockerClient) Ping ¶
func (d *DockerClient) Ping(ctx context.Context) error
Ping checks connectivity to the Docker daemon.
func (*DockerClient) PullImage ¶
func (d *DockerClient) PullImage(ctx context.Context, imageName string, timeout time.Duration) error
PullImage pulls a container image with timeout.
func (*DockerClient) RemoveContainer ¶
func (d *DockerClient) RemoveContainer(ctx context.Context, containerID string) error
RemoveContainer removes a container. It is idempotent: a container that is already gone, or that the daemon is already removing, is not treated as an error. It returns nil only after the removal has actually completed, so callers can safely proceed with operations that assume the container is physically gone (volume destroy, name re-use).
func (*DockerClient) RemoveTenantNetworkIfEmpty ¶
func (d *DockerClient) RemoveTenantNetworkIfEmpty(ctx context.Context, tenant string) error
RemoveTenantNetworkIfEmpty removes the tenant's network if no containers are connected.
func (*DockerClient) RenameContainer ¶
func (d *DockerClient) RenameContainer(ctx context.Context, containerID string, newName string) error
RenameContainer changes the name of a container. The container can be running or stopped. This is used during updates/restarts to free the canonical name for the replacement container while keeping the old one available for rollback.
func (*DockerClient) ResolveImageUser ¶
func (d *DockerClient) ResolveImageUser(ctx context.Context, imageName string, userOverride string) (uid, gid int, err error)
ResolveImageUser resolves a container user specification to numeric UID/GID. If userOverride is non-empty, it is used instead of the image's Config.User. This is needed for images like postgres that start as root and expect to chown data directories — since we drop CAP_CHOWN, the manifest must specify the target user explicitly. If both userOverride and Config.User are empty, returns (0, 0, nil) (root). Numeric UID/GID values are parsed directly. Non-numeric usernames are resolved by reading /etc/passwd (and optionally /etc/group) from a temporary container created from the image.
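The numeric half of this resolution can be sketched without Docker. The helper below parses "uid" and "uid:gid" and reports ok=false for anything that would need an /etc/passwd lookup. It is an illustration only: parseNumericUser is a hypothetical name, and returning gid 0 for a bare uid is an assumption of the sketch rather than documented behavior.

```go
package main

import (
	"strconv"
	"strings"
)

// parseNumericUser parses a user spec of the form "uid" or "uid:gid".
// An empty spec means root. ok is false when the spec contains a name and
// therefore has to be resolved from the image's /etc/passwd or /etc/group.
func parseNumericUser(spec string) (uid, gid int, ok bool) {
	if spec == "" {
		return 0, 0, true
	}
	userPart, groupPart, hasGroup := strings.Cut(spec, ":")
	u, err := strconv.Atoi(userPart)
	if err != nil || u < 0 {
		return 0, 0, false
	}
	if !hasGroup {
		return u, 0, true // assumption: no group given, report gid 0
	}
	g, err := strconv.Atoi(groupPart)
	if err != nil || g < 0 {
		return 0, 0, false
	}
	return u, g, true
}
```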
func (*DockerClient) StartContainer ¶
func (d *DockerClient) StartContainer(ctx context.Context, containerID string, timeout time.Duration) error
StartContainer starts a container.
func (*DockerClient) StopContainer ¶
func (d *DockerClient) StopContainer(ctx context.Context, containerID string, timeout time.Duration) error
StopContainer gracefully stops a running container with a timeout. After the timeout, the container is forcefully killed.
type HealthStatus ¶
type HealthStatus string
HealthStatus represents the health check status of a Docker container.
const (
	HealthStatusHealthy   HealthStatus = "healthy"
	HealthStatusUnhealthy HealthStatus = "unhealthy"
	HealthStatusStarting  HealthStatus = "starting"
	HealthStatusNone      HealthStatus = "" // No health check configured
)
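The MigrationReadyTimeout comment describes readiness as `healthy`, or `running` when no health check is declared. A minimal sketch of that predicate follows, using local lowercase copies of the type and constants so the block stands alone. isReady is a hypothetical helper, and requiring Status to be "running" in the healthy case is an assumption.

```go
package main

// healthStatus is a local copy of HealthStatus for this sketch.
type healthStatus string

const (
	healthHealthy   healthStatus = "healthy"
	healthUnhealthy healthStatus = "unhealthy"
	healthStarting  healthStatus = "starting"
	healthNone      healthStatus = "" // no health check configured
)

// isReady reports whether a container counts as ready: it must be running,
// and either report healthy or have no health check at all.
func isReady(status string, health healthStatus) bool {
	if status != "running" {
		return false
	}
	return health == healthHealthy || health == healthNone
}
```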
type ImageInfo ¶
type ImageInfo struct {
// ID is the content-addressable image ID (e.g., "sha256:...").
// Immutable for a given image build, suitable as a cache key.
ID string
// Volumes are the VOLUME declarations from the Dockerfile.
Volumes map[string]struct{}
// User is the USER directive from the Dockerfile (may be name, uid, uid:gid, or name:group).
User string
}
ImageInfo holds metadata from a container image inspection.
type IngressConfig ¶
type IngressConfig struct {
Enabled bool `yaml:"enabled"`
WildcardDomain string `yaml:"wildcard_domain"`
Entrypoint string `yaml:"entrypoint"`
// CustomDomainCertResolver is the Traefik certresolver name used for
// per-tenant custom domains (HTTP-01 by default). Defaults to "http01"
// when empty.
CustomDomainCertResolver string `yaml:"custom_domain_cert_resolver"`
// CustomDomainMiddlewares is the list of Traefik middleware references
// applied to the secondary custom-domain router. Defaults to
// ["security-headers@file"] when empty.
CustomDomainMiddlewares []string `yaml:"custom_domain_middlewares"`
// CustomDomainDNSResolvers are the DNS servers (host:port) fred queries to
// check whether a tenant custom domain resolves to this host before
// emitting its HTTP-01 router (ENG-266). Public resolvers are used so the
// answer matches what the ACME CA sees. Defaults to Cloudflare 1.1.1.1:53 /
// Google 8.8.8.8:53 / Quad9 9.9.9.9:53.
CustomDomainDNSResolvers []string `yaml:"custom_domain_dns_resolvers"`
// CustomDomainDNSQuorum is how many of the resolvers must independently see
// the domain at this host before the gate opens (ENG-266). 0 (default) ==
// a majority of CustomDomainDNSResolvers. Clamped to [1, len(resolvers)],
// or 1 when no resolvers are configured.
CustomDomainDNSQuorum int `yaml:"custom_domain_dns_quorum"`
// CustomDomainDNSCheckDisabled turns OFF the readiness gate (ENG-266),
// reverting to emitting the custom-domain router immediately. Default false.
CustomDomainDNSCheckDisabled bool `yaml:"custom_domain_dns_check_disabled"`
}
IngressConfig holds configuration for reverse proxy integration. When Enabled, containers with routable TCP ports get proxy labels pointing Traefik at the per-tenant network for auto-discovery. Requires network_isolation to be enabled (validated at config load time). Currently generates Traefik-specific labels; the config layer is proxy-agnostic so a future backend swap only changes label generation.
The wildcard certificate covering *.WildcardDomain must be provisioned at the Traefik level (e.g. via a DNS-01 ACME resolver with domains in static config, or a default cert in tls.stores). Fred does not drive per-domain ACME challenges — routers are emitted with tls=true but name no certresolver, so Traefik serves whichever cert matches SNI.
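The CustomDomainDNSQuorum rules (0 selects a majority, the result is clamped to [1, len(resolvers)], and 1 is used when no resolvers are configured) can be sketched as below. effectiveQuorum and quorumMet are hypothetical names, and "majority" is taken to mean floor(n/2) + 1.

```go
package main

// effectiveQuorum derives the quorum from the configured value and the
// number of resolvers, following the rules in the field comment.
func effectiveQuorum(configured, resolvers int) int {
	if resolvers <= 0 {
		return 1
	}
	q := configured
	if q == 0 {
		q = resolvers/2 + 1 // strict majority
	}
	if q < 1 {
		q = 1
	}
	if q > resolvers {
		q = resolvers
	}
	return q
}

// quorumMet reports whether enough resolvers saw the domain at this host.
func quorumMet(agreeing, configured, resolvers int) bool {
	return agreeing >= effectiveQuorum(configured, resolvers)
}
```

With the three default resolvers and the default quorum of 0, two resolvers must agree before the gate opens.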
func (*IngressConfig) Validate ¶
func (ic *IngressConfig) Validate() error
Validate checks that all required IngressConfig fields are set when enabled.
type PortBinding ¶
PortBinding represents a port mapping.
type SKUProfile ¶
type SKUProfile = shared.SKUProfile
Type aliases for readability within the docker package.
type TenantQuotaConfig ¶
type TenantQuotaConfig = shared.TenantQuotaConfig
Source Files ¶
- backend.go
- capability.go
- capability_linux.go
- compose.go
- compose_project.go
- config.go
- custom_domain_dns.go
- deprovision.go
- doc.go
- info.go
- ingress.go
- lease_actor_factory.go
- lease_actor_routing.go
- leasesm_adapters.go
- leasesm_metrics.go
- lifecycle.go
- metrics.go
- migrate.go
- provision.go
- quota_backfill.go
- reconcile_custom_domain.go
- recover.go
- recovered_provision.go
- restart_update.go
- restore.go
- retention.go
- retention_accounting.go
- retention_writable_path.go
- volume.go
- volume_btrfs.go
- volume_xfs.go
- volume_zfs.go