machine-deletion-remediation

v0.3.0
Published: Oct 19, 2023 License: Apache-2.0 Imports: 18 Imported by: 0

README

Machine-API Driven Remediation

This operator conforms to the External Remediation of NodeHealthCheck and is designed to work with Node Health Check to reprovision unhealthy nodes using the Machine API. It follows the annotation on the Node to the associated Machine object, confirms that the Machine has an owning controller (e.g. MachineSetController), and deletes the Machine. Once the Machine CR has been deleted, the owning controller creates a replacement.
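The lookup described above can be sketched in Go. This is a minimal, self-contained illustration, not the operator's actual code: Node, Machine and OwnerRef are simplified stand-ins for the Kubernetes objects, and the annotation key machine.openshift.io/machine with a namespace/name value is an assumption based on the OpenShift Machine API convention (this README does not name the annotation).

```go
package main

import (
	"fmt"
	"strings"
)

// machineAnnotation is an ASSUMED key (OpenShift Machine API convention);
// its value is expected to look like "<namespace>/<machine-name>".
const machineAnnotation = "machine.openshift.io/machine"

// OwnerRef is a simplified stand-in for a Kubernetes owner reference.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool // true when this owner is the managing controller
}

// Node and Machine carry only the fields this sketch needs.
type Node struct {
	Name        string
	Annotations map[string]string
}

type Machine struct {
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// machineKey follows the Node's annotation to the namespace and name
// of the associated Machine.
func machineKey(n Node) (namespace, name string, err error) {
	v, ok := n.Annotations[machineAnnotation]
	if !ok {
		return "", "", fmt.Errorf("node %s has no %s annotation", n.Name, machineAnnotation)
	}
	parts := strings.Split(v, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("node %s: malformed machine annotation %q", n.Name, v)
	}
	return parts[0], parts[1], nil
}

// hasController reports whether the Machine has an owning controller.
// Without one, deleting the Machine would not produce a replacement.
func hasController(m Machine) bool {
	for _, o := range m.Owners {
		if o.Controller {
			return true
		}
	}
	return false
}

func main() {
	node := Node{
		Name:        "worker-0-21",
		Annotations: map[string]string{machineAnnotation: "machines-ns/worker-0-21-machine"},
	}
	ns, name, err := machineKey(node)
	if err != nil {
		fmt.Println("skip remediation:", err)
		return
	}
	machine := Machine{
		Namespace: ns,
		Name:      name,
		Owners:    []OwnerRef{{Kind: "MachineSet", Name: "workers", Controller: true}},
	}
	if hasController(machine) {
		fmt.Printf("would delete Machine %s/%s\n", machine.Namespace, machine.Name)
	} else {
		fmt.Println("skip remediation: Machine has no owning controller")
	}
}
```

The owner check comes before the delete on purpose: a Machine without a controller would simply disappear, leaving the cluster one node short instead of reprovisioned.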

Pre-requisites

  • Machine API based cluster that is able to programmatically destroy and create cluster nodes
  • Nodes are associated with Machines
  • Machines are declaratively managed
  • Node Health Check is installed and running

Installation

  • Deploy MDR (Machine-deletion-remediation) to the cluster, where it runs as a container in a pod. Try make deploy; official images are coming soon.
  • Load the YAML manifest of the MDR template (see below).
  • Modify the NodeHealthCheck CR to use MDR as its remediator. This is a specific use case of an External Remediation of NodeHealthCheck. To set it up, make sure that Node Health Check is running and that the Machine-deletion-remediation controller exists, then create the necessary CRs.

Example CRs

An example MDR template object.

   apiVersion: machine-deletion-remediation.medik8s.io/v1alpha1
   kind: MachineDeletionRemediationTemplate
   metadata:
     name: group-x
     namespace: default
   spec:
     template:
       spec: {}

These CRs are created by the admin and are used as a template by NodeHealthCheck for creating the CRs that represent a request for a Node to be recovered.

Configuring NodeHealthCheck to use the example group-x template above.

apiVersion: remediation.medik8s.io/v1alpha1
kind: NodeHealthCheck
metadata:
  name: nodehealthcheck-sample
spec:
  remediationTemplate:
    kind: MachineDeletionRemediationTemplate
    apiVersion: machine-deletion-remediation.medik8s.io/v1alpha1
    name: group-x
    namespace: default

While the admin may define many NodeHealthCheck domains, they can all use the same MDR template if desired.

An example remediation request for Node worker-0-21 (NOTE: uid is the nodehealthcheck-sample's UID).

apiVersion: machine-deletion-remediation.medik8s.io/v1alpha1
kind: MachineDeletionRemediation
metadata:
  name: worker-0-21
  namespace: default
spec: {}

These CRs are created by NodeHealthCheck when it detects a failed node. The MDR operator watches for them to be created, looks up the Machine CR associated with the Node, and deletes that Machine. MDR CRs are deleted by NodeHealthCheck when it sees the Node is healthy again.
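To make the relationship between the template and the remediation request concrete, here is a hedged Go sketch of how a request might be stamped out from a template. The Template and Remediation types and the newRemediation helper are hypothetical. The rules it applies (request named after the Node, created in the template's namespace, kind equal to the template kind without the Template suffix, spec copied from the template) are inferred from the example CRs in this README, not taken from NodeHealthCheck's source.

```go
package main

import (
	"fmt"
	"strings"
)

// Template is a simplified stand-in for MachineDeletionRemediationTemplate.
// Spec models spec.template.spec, which is empty ({}) in the example above.
type Template struct {
	APIVersion string
	Kind       string
	Name       string
	Namespace  string
	Spec       map[string]string
}

// Remediation is a simplified stand-in for the per-Node request CR.
type Remediation struct {
	APIVersion string
	Kind       string
	Name       string
	Namespace  string
	Spec       map[string]string
}

// newRemediation (hypothetical helper) builds a request for nodeName
// from the template, copying the spec so the two do not share state.
func newRemediation(t Template, nodeName string) Remediation {
	spec := make(map[string]string, len(t.Spec))
	for k, v := range t.Spec {
		spec[k] = v
	}
	return Remediation{
		APIVersion: t.APIVersion,
		Kind:       strings.TrimSuffix(t.Kind, "Template"),
		Name:       nodeName,
		Namespace:  t.Namespace,
		Spec:       spec,
	}
}

func main() {
	tmpl := Template{
		APIVersion: "machine-deletion-remediation.medik8s.io/v1alpha1",
		Kind:       "MachineDeletionRemediationTemplate",
		Name:       "group-x",
		Namespace:  "default",
		Spec:       map[string]string{},
	}
	r := newRemediation(tmpl, "worker-0-21")
	fmt.Printf("%s %s/%s\n", r.Kind, r.Namespace, r.Name)
	// prints "MachineDeletionRemediation default/worker-0-21"
}
```

Because the request carries the Node's name, one template can serve any number of unhealthy Nodes, which matches the note above that several NodeHealthCheck domains can share a single MDR template.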

Directories

  • api/v1alpha1: Package v1alpha1 contains API Schema definitions for the machine-deletion-remediation v1alpha1 API group (+kubebuilder:object:generate=true, +groupName=machine-deletion-remediation.medik8s.io).
  • Package version provides information about the version of the operator.