Documentation ¶
Overview ¶
Package federation provides multi-cluster client management for the MCP Kubernetes server.
This package enables the MCP server to operate across multiple Kubernetes clusters in a federated environment, specifically designed for Giant Swarm's Management Cluster and Workload Cluster architecture using Cluster API (CAPI).
Architecture Overview ¶
The federation package implements a "Hub-and-Spoke" model where:
- The Management Cluster (MC) acts as the central hub containing CAPI resources
- Workload Clusters (WC) are dynamically discovered and accessed through kubeconfig secrets
- All operations are executed under the authenticated user's identity via Kubernetes impersonation
Core Components ¶
ClusterClientManager is the primary interface for multi-cluster operations:
// NewManager returns both a manager and an error
manager, err := federation.NewManager(localClient, localDynamic, logger)
// Get a client for a specific workload cluster
client, err := manager.GetClient(ctx, "my-cluster", userInfo)
// List all available clusters
clusters, err := manager.ListClusters(ctx, userInfo)
Security Model ¶
The federation package enforces security through:
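- User impersonation: All cluster operations carry Impersonate-User and Impersonate-Group headers derived from the authenticated user
- RBAC delegation: Authorization is delegated to each cluster's RBAC policies
- No credential exposure: Admin credentials are used only for cluster discovery, kubeconfig retrieval, and TLS establishment, never to execute operations on a user's behalf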
Thread Safety ¶
All operations in this package are thread-safe. The ClusterClientManager uses internal synchronization to handle concurrent access from multiple tool handlers.
Error Handling ¶
The package defines sentinel errors for common failure scenarios:
- ErrClusterNotFound: The requested cluster does not exist or the user lacks permission to access it
- ErrKubeconfigSecretNotFound: CAPI kubeconfig secret is missing
- ErrKubeconfigInvalid: Secret contains malformed kubeconfig data
- ErrConnectionFailed: Network or TLS issues connecting to the cluster
Example Usage ¶
// Initialize the manager with the Management Cluster client
manager, err := federation.NewManager(localClient, localDynamic, logger)
if err != nil {
return err
}
defer manager.Close()
// Get user info from OAuth token
userInfo := &federation.UserInfo{
Email: "user@example.com",
Groups: []string{"developers"},
}
// Get a client for a workload cluster with user impersonation
client, err := manager.GetClient(ctx, "production-cluster", userInfo)
if err != nil {
return fmt.Errorf("failed to get cluster client: %w", err)
}
// Use the client for Kubernetes operations
pods, err := client.CoreV1().Pods("default").List(ctx, metav1.ListOptions{})
Integration with MCP Server ¶
The federation package is designed to integrate with the ServerContext pattern:
serverCtx, err := server.NewServerContext(ctx,
server.WithK8sClient(k8sClient),
server.WithFederationManager(federationManager),
)
Tool handlers can then access the federation manager to perform multi-cluster operations.
Index ¶
- Constants
- Variables
- func AnonymizeEmail(email string) string
- func AnonymizeUserInfo(user *UserInfo) map[string]interface{}
- func ValidateClusterName(name string) error
- func ValidateUserInfo(user *UserInfo) error
- type ClusterClientManager
- type ClusterNotFoundError
- type ClusterPhase
- type ClusterSummary
- type ConnectionError
- type KubeconfigError
- type Manager
- func (m *Manager) Close() error
- func (m *Manager) GetClient(ctx context.Context, clusterName string, user *UserInfo) (kubernetes.Interface, error)
- func (m *Manager) GetClusterSummary(ctx context.Context, clusterName string, user *UserInfo) (*ClusterSummary, error)
- func (m *Manager) GetDynamicClient(ctx context.Context, clusterName string, user *UserInfo) (dynamic.Interface, error)
- func (m *Manager) ListClusters(ctx context.Context, user *UserInfo) ([]ClusterSummary, error)
- type UserInfo
- type ValidationError
Constants ¶
const (
// ImpersonateUserHeader is the header name for the impersonated user.
ImpersonateUserHeader = "Impersonate-User"
// ImpersonateGroupHeader is the header name for impersonated groups.
ImpersonateGroupHeader = "Impersonate-Group"
// ImpersonateExtraHeaderPrefix is the prefix for extra impersonation headers.
ImpersonateExtraHeaderPrefix = "Impersonate-Extra-"
)
ImpersonationHeaders contains the header names used for Kubernetes user impersonation.
const (
// MaxEmailLength is the maximum allowed length for an email address.
MaxEmailLength = 254
// MaxGroupNameLength is the maximum allowed length for a group name.
MaxGroupNameLength = 256
// MaxGroupCount is the maximum number of groups allowed per user.
MaxGroupCount = 100
// MaxExtraKeyLength is the maximum allowed length for an extra header key.
MaxExtraKeyLength = 256
// MaxExtraValueLength is the maximum allowed length for an extra header value.
MaxExtraValueLength = 1024
// MaxExtraCount is the maximum number of extra headers allowed.
MaxExtraCount = 50
// MaxClusterNameLength is the maximum allowed length for a cluster name.
// Kubernetes names are limited to 253 characters.
MaxClusterNameLength = 253
)
Validation constants for security limits.
const CAPISecretKey = "value"
CAPISecretKey is the key within the kubeconfig secret that contains the actual kubeconfig YAML data.
const CAPISecretSuffix = "-kubeconfig"
CAPISecretSuffix is the suffix used by CAPI for kubeconfig secrets. The full secret name is: ${CLUSTER_NAME}-kubeconfig
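The two constants together describe how a kubeconfig is located: the cluster name plus the suffix gives the secret name, and the "value" key inside that secret holds the YAML. Below is a small sketch of that lookup over a plain `map[string][]byte` (the shape of a Kubernetes Secret's data). The helper names `kubeconfigSecretName` and `kubeconfigFromSecretData` are hypothetical and not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented constants, so the sketch is self-contained.
const (
	capiSecretSuffix = "-kubeconfig"
	capiSecretKey    = "value"
)

// kubeconfigSecretName derives the secret name from a cluster name
// following the ${CLUSTER_NAME}-kubeconfig convention.
func kubeconfigSecretName(clusterName string) string {
	return clusterName + capiSecretSuffix
}

// kubeconfigFromSecretData extracts the kubeconfig bytes from a secret's
// data map, failing when the expected key is absent or empty.
func kubeconfigFromSecretData(data map[string][]byte) ([]byte, error) {
	raw, ok := data[capiSecretKey]
	if !ok || len(raw) == 0 {
		return nil, errors.New("kubeconfig secret has no data under the \"value\" key")
	}
	return raw, nil
}

func main() {
	fmt.Println(kubeconfigSecretName("production-cluster"))

	data := map[string][]byte{"value": []byte("apiVersion: v1\nkind: Config\n")}
	raw, err := kubeconfigFromSecretData(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d bytes of kubeconfig\n", len(raw))
}
```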
Variables ¶
var (
// ErrClusterNotFound indicates that the requested cluster does not exist
// or the user does not have permission to access it.
ErrClusterNotFound = errors.New("cluster not found")
// ErrKubeconfigSecretNotFound indicates that the CAPI kubeconfig secret
// for the cluster is missing from the Management Cluster.
ErrKubeconfigSecretNotFound = errors.New("kubeconfig secret not found")
// ErrKubeconfigInvalid indicates that the kubeconfig secret contains
// malformed or unparseable kubeconfig data.
ErrKubeconfigInvalid = errors.New("kubeconfig data is invalid")
// ErrConnectionFailed indicates a network or TLS error when attempting
// to connect to the target cluster.
ErrConnectionFailed = errors.New("failed to connect to cluster")
// ErrImpersonationFailed indicates that the user impersonation could not
// be configured on the cluster client.
ErrImpersonationFailed = errors.New("failed to configure user impersonation")
// ErrManagerClosed indicates that the ClusterClientManager has been closed
// and can no longer be used.
ErrManagerClosed = errors.New("federation manager is closed")
)
Sentinel errors for common federation failure scenarios. These errors can be checked using errors.Is() for programmatic error handling.
var (
// ErrUserInfoRequired indicates that user information is required but was not provided.
ErrUserInfoRequired = fmt.Errorf("user information is required for cluster operations")
// ErrInvalidEmail indicates that the email address format is invalid.
ErrInvalidEmail = fmt.Errorf("invalid email address format")
// ErrInvalidGroupName indicates that a group name is invalid.
ErrInvalidGroupName = fmt.Errorf("invalid group name")
// ErrInvalidExtraHeader indicates that an extra header key or value is invalid.
ErrInvalidExtraHeader = fmt.Errorf("invalid extra header")
// ErrInvalidClusterName indicates that a cluster name is invalid.
ErrInvalidClusterName = fmt.Errorf("invalid cluster name")
)
Validation errors.
Functions ¶
func AnonymizeEmail ¶
func AnonymizeEmail(email string) string
AnonymizeEmail returns a hashed representation of an email for logging purposes. This allows correlation of log entries without exposing PII.
func AnonymizeUserInfo ¶
func AnonymizeUserInfo(user *UserInfo) map[string]interface{}
AnonymizeUserInfo returns anonymized user identifiers for logging. Returns a map with "user_hash" and "group_count" for safe logging.
func ValidateClusterName ¶
func ValidateClusterName(name string) error
ValidateClusterName validates a cluster name against Kubernetes naming conventions.
func ValidateUserInfo ¶
func ValidateUserInfo(user *UserInfo) error
ValidateUserInfo validates the UserInfo struct for security. Returns ErrUserInfoRequired if user is nil. Returns a ValidationError if any field fails validation.
Types ¶
type ClusterClientManager ¶
type ClusterClientManager interface {
// GetClient returns a Kubernetes client for the target cluster,
// configured to impersonate the provided user.
// If clusterName is empty, returns the local (Management Cluster) client.
//
// The returned client has Impersonate-User and Impersonate-Group headers
// configured based on the UserInfo, ensuring all operations are executed
// under the authenticated user's identity.
GetClient(ctx context.Context, clusterName string, user *UserInfo) (kubernetes.Interface, error)
// GetDynamicClient returns a dynamic client for the target cluster,
// useful for working with CRDs like CAPI resources.
// If clusterName is empty, returns the local (Management Cluster) dynamic client.
//
// Like GetClient, the returned client is configured for user impersonation.
GetDynamicClient(ctx context.Context, clusterName string, user *UserInfo) (dynamic.Interface, error)
// ListClusters returns a list of available workload clusters.
// The list is filtered based on the user's RBAC permissions - only clusters
// the user has access to view will be returned.
//
// This method queries CAPI Cluster resources on the Management Cluster.
ListClusters(ctx context.Context, user *UserInfo) ([]ClusterSummary, error)
// GetClusterSummary returns detailed information about a specific cluster.
// Returns ErrClusterNotFound if the cluster doesn't exist or the user
// doesn't have permission to access it.
GetClusterSummary(ctx context.Context, clusterName string, user *UserInfo) (*ClusterSummary, error)
// Close releases all cached clients and resources.
// After Close is called, all other methods will return ErrManagerClosed.
Close() error
}
ClusterClientManager manages Kubernetes clients for multi-cluster operations. It retrieves clients for both the local Management Cluster and remote Workload Clusters, with support for user impersonation.
All methods are thread-safe and can be called concurrently from multiple tool handlers.
type ClusterNotFoundError ¶
ClusterNotFoundError provides detailed context about a cluster lookup failure.
func (*ClusterNotFoundError) Error ¶
func (e *ClusterNotFoundError) Error() string
Error implements the error interface.
func (*ClusterNotFoundError) Unwrap ¶
func (e *ClusterNotFoundError) Unwrap() error
Unwrap returns the underlying sentinel error for use with errors.Is().
func (*ClusterNotFoundError) UserFacingError ¶
func (e *ClusterNotFoundError) UserFacingError() string
UserFacingError returns a sanitized error message safe for end users. This prevents leaking internal cluster names and namespace structure.
type ClusterPhase ¶
type ClusterPhase string
ClusterPhase represents the lifecycle phase of a CAPI cluster.
const (
// ClusterPhasePending indicates the cluster is awaiting provisioning.
ClusterPhasePending ClusterPhase = "Pending"
// ClusterPhaseProvisioning indicates the cluster is being created.
ClusterPhaseProvisioning ClusterPhase = "Provisioning"
// ClusterPhaseProvisioned indicates the cluster is fully operational.
ClusterPhaseProvisioned ClusterPhase = "Provisioned"
// ClusterPhaseDeleting indicates the cluster is being deleted.
ClusterPhaseDeleting ClusterPhase = "Deleting"
// ClusterPhaseFailed indicates the cluster encountered a fatal error.
ClusterPhaseFailed ClusterPhase = "Failed"
// ClusterPhaseUnknown indicates the cluster phase cannot be determined.
ClusterPhaseUnknown ClusterPhase = "Unknown"
)
Standard CAPI cluster phases.
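Because ClusterPhase is a plain string type, any value read from a Cluster resource converts without error. A caller that wants to work only with the known phases can normalize unrecognized input to the Unknown phase, as in this small sketch; `parsePhase` is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// clusterPhase and its constants mirror the documented type locally.
type clusterPhase string

const (
	phasePending      clusterPhase = "Pending"
	phaseProvisioning clusterPhase = "Provisioning"
	phaseProvisioned  clusterPhase = "Provisioned"
	phaseDeleting     clusterPhase = "Deleting"
	phaseFailed       clusterPhase = "Failed"
	phaseUnknown      clusterPhase = "Unknown"
)

// parsePhase returns the matching known phase, or phaseUnknown for empty
// or unrecognized input. Matching is case-sensitive.
func parsePhase(s string) clusterPhase {
	switch p := clusterPhase(s); p {
	case phasePending, phaseProvisioning, phaseProvisioned, phaseDeleting, phaseFailed:
		return p
	default:
		return phaseUnknown
	}
}

func main() {
	for _, s := range []string{"Provisioned", "", "provisioned", "Scaling"} {
		fmt.Printf("%q -> %s\n", s, parsePhase(s))
	}
}
```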
type ClusterSummary ¶
type ClusterSummary struct {
// Name is the unique identifier of the cluster within its namespace.
// This corresponds to the Cluster API Cluster resource name.
Name string `json:"name"`
// Namespace is the organization namespace on the Management Cluster
// where the CAPI Cluster resource is located.
Namespace string `json:"namespace"`
// Provider indicates the infrastructure provider (e.g., "aws", "azure", "vsphere").
// This is extracted from the CAPI infrastructure reference.
Provider string `json:"provider,omitempty"`
// Release is the Giant Swarm release version running on the cluster.
// Format follows semver, e.g., "19.3.0".
Release string `json:"release,omitempty"`
// KubernetesVersion is the Kubernetes version running on the cluster.
// Format follows semver, e.g., "1.28.5".
KubernetesVersion string `json:"kubernetesVersion,omitempty"`
// Status indicates the current lifecycle phase of the cluster.
// Common values: "Provisioned", "Provisioning", "Deleting", "Failed".
Status string `json:"status"`
// Ready indicates whether the cluster is fully operational and
// ready to accept workloads.
Ready bool `json:"ready"`
// ControlPlaneReady indicates whether the control plane components
// are healthy and operational.
ControlPlaneReady bool `json:"controlPlaneReady"`
// InfrastructureReady indicates whether the underlying infrastructure
// (VMs, networks, etc.) is provisioned and healthy.
InfrastructureReady bool `json:"infrastructureReady"`
// NodeCount is the current number of worker nodes in the cluster.
// This may differ from the desired count during scaling operations.
NodeCount int `json:"nodeCount,omitempty"`
// CreatedAt is the timestamp when the cluster was initially created.
CreatedAt time.Time `json:"createdAt"`
// Labels contains the Kubernetes labels applied to the Cluster resource.
// These often include organization, team, or environment tags.
Labels map[string]string `json:"labels,omitempty"`
// Annotations contains the Kubernetes annotations on the Cluster resource.
// May include operational metadata or external references.
Annotations map[string]string `json:"annotations,omitempty"`
}
ClusterSummary provides basic information about a workload cluster. It is returned by ListClusters and GetClusterSummary and contains metadata useful for cluster selection and display.
type ConnectionError ¶
ConnectionError provides detailed context about cluster connection failures.
func (*ConnectionError) Error ¶
func (e *ConnectionError) Error() string
Error implements the error interface.
func (*ConnectionError) Unwrap ¶
func (e *ConnectionError) Unwrap() error
Unwrap returns the underlying error for use with errors.Is() and errors.As().
func (*ConnectionError) UserFacingError ¶
func (e *ConnectionError) UserFacingError() string
UserFacingError returns a sanitized error message safe for end users. This prevents leaking internal host URLs and network topology.
type KubeconfigError ¶
type KubeconfigError struct {
ClusterName string
SecretName string
Namespace string
Reason string
Err error
// NotFound indicates the kubeconfig secret was not found (vs other errors like invalid data).
// When true, Unwrap() returns ErrKubeconfigSecretNotFound instead of ErrKubeconfigInvalid.
NotFound bool
}
KubeconfigError provides detailed context about kubeconfig retrieval failures.
func (*KubeconfigError) Error ¶
func (e *KubeconfigError) Error() string
Error implements the error interface.
func (*KubeconfigError) Unwrap ¶
func (e *KubeconfigError) Unwrap() error
Unwrap returns the underlying error for use with errors.Is() and errors.As().
func (*KubeconfigError) UserFacingError ¶
func (e *KubeconfigError) UserFacingError() string
UserFacingError returns a sanitized error message safe for end users. This prevents leaking internal secret names and namespace structure.
type Manager ¶
type Manager struct {
// contains filtered or unexported fields
}
Manager implements ClusterClientManager for CAPI-based multi-cluster federation.
func NewManager ¶
func NewManager(localClient kubernetes.Interface, localDynamic dynamic.Interface, logger *slog.Logger) (*Manager, error)
NewManager creates a new ClusterClientManager with the provided local clients.
Parameters:
- localClient: Kubernetes clientset for the Management Cluster
- localDynamic: Dynamic client for the Management Cluster (for CAPI CRDs)
- logger: Structured logger for operational messages (can be nil)
The local clients should be configured with admin credentials for the Management Cluster. These credentials are only used to:
- Read CAPI Cluster resources for discovery
- Read kubeconfig Secrets for Workload Cluster access
- Establish TLS connections to Workload Clusters
All actual operations are executed under user impersonation.
func (*Manager) GetClient ¶
func (m *Manager) GetClient(ctx context.Context, clusterName string, user *UserInfo) (kubernetes.Interface, error)
GetClient returns a Kubernetes client for the target cluster. Returns ErrUserInfoRequired if user is nil (to prevent privilege escalation). Returns ErrInvalidClusterName if the cluster name fails validation.
func (*Manager) GetClusterSummary ¶
func (m *Manager) GetClusterSummary(ctx context.Context, clusterName string, user *UserInfo) (*ClusterSummary, error)
GetClusterSummary returns information about a specific cluster. Returns ErrUserInfoRequired if user is nil (to prevent privilege escalation). Returns ErrInvalidClusterName if the cluster name fails validation.
func (*Manager) GetDynamicClient ¶
func (m *Manager) GetDynamicClient(ctx context.Context, clusterName string, user *UserInfo) (dynamic.Interface, error)
GetDynamicClient returns a dynamic client for the target cluster. Returns ErrUserInfoRequired if user is nil (to prevent privilege escalation). Returns ErrInvalidClusterName if the cluster name fails validation.
func (*Manager) ListClusters ¶
func (m *Manager) ListClusters(ctx context.Context, user *UserInfo) ([]ClusterSummary, error)
ListClusters returns all available workload clusters. Returns ErrUserInfoRequired if user is nil (to prevent privilege escalation).
type UserInfo ¶
type UserInfo struct {
// Email is the user's email address from the OAuth token's email claim.
// This is used as the Impersonate-User header value.
Email string
// Groups contains the user's group memberships from OAuth claims.
// These are passed via Impersonate-Group headers for RBAC evaluation.
Groups []string
// Extra contains additional claims from the OAuth token that should be
// propagated to the Kubernetes API server via Impersonate-Extra headers.
// Common examples include organization IDs, tenant identifiers, or custom claims.
Extra map[string][]string
}
UserInfo contains the authenticated user's identity information extracted from the OAuth token. This information is used to configure Kubernetes user impersonation headers.
type ValidationError ¶
type ValidationError struct {
Field string
Value string // Sanitized value (may be truncated or anonymized)
Reason string
Err error
}
ValidationError provides detailed context about a validation failure.
func (*ValidationError) Error ¶
func (e *ValidationError) Error() string
Error implements the error interface.
func (*ValidationError) Unwrap ¶
func (e *ValidationError) Unwrap() error
Unwrap returns the underlying error for use with errors.Is() and errors.As().
func (*ValidationError) UserFacingError ¶
func (e *ValidationError) UserFacingError() string
UserFacingError returns a sanitized error message safe for end users.