About
The PagerDuty operator automates integrating OpenShift Dedicated clusters, provisioned via https://cloud.redhat.com/, with PagerDuty.
This operator runs on Hive and watches for new cluster deployments. Hive is an API-driven OpenShift service that provides OpenShift Dedicated provisioning and management.
- The PagerDutyIntegration controller watches for changes to PagerDutyIntegration CRs, and also for changes to appropriately labeled ClusterDeployment CRs (and ConfigMap/Secret/SyncSet resources owned by such a ClusterDeployment).
- For each PagerDutyIntegration CR, it gets the list of matching ClusterDeployments that have the spec.installed field set to true.
- For each of these ClusterDeployments, the PagerDuty operator creates a secret containing the integration key required to communicate with the PagerDuty web application.
- The PagerDuty operator then creates a SyncSet with the relevant information for Hive to send the PagerDuty secret to the newly provisioned cluster.
- Hive uses this SyncSet to deploy the PagerDuty secret to the provisioned cluster, so that the relevant SRE team gets notified of alerts on the cluster.
- The PagerDuty secret is deployed to the coordinates specified in the spec.targetSecretRef field of the PagerDutyIntegration CR.
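For illustration, the SyncSet generated for a cluster might look roughly like the following. This is only a sketch based on the Hive SyncSet API (secretMappings); the names and exact fields the operator emits may differ.
apiVersion: hive.openshift.io/v1
kind: SyncSet
metadata:
  name: example-cluster-pd-sync        # placeholder name
  namespace: example-cluster-namespace # same namespace as the ClusterDeployment
spec:
  clusterDeploymentRefs:
    - name: example-cluster
  secretMappings:
    - sourceRef:
        name: pd-secret                # secret created by the operator on the hive cluster
        namespace: example-cluster-namespace
      targetRef:
        name: pd-secret                # coordinates taken from spec.targetSecretRef
        namespace: openshift-monitoring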
Development
Set up local OpenShift cluster
For example, install crc as described in its documentation.
Deploy dependencies
Create the hive CRDs. To do so, clone the hive repo and run:
oc apply -f config/crds
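For example, assuming the upstream hive repository location:
git clone https://github.com/openshift/hive.git
cd hive
oc apply -f config/crds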
Deploy the namespace, role, etc. from pagerduty-operator:
oc apply -f manifests/01-namespace.yaml
oc apply -f manifests/02-role.yaml
oc apply -f manifests/03-service_account.yaml
oc apply -f manifests/04-role_binding.yaml
oc apply -f deploy/crds/pagerduty.openshift.io_pagerdutyintegrations.yaml
Create a secret with the PagerDuty API key, for example using a trial account. You can then create an API key at https://{your-account}.pagerduty.com/api_keys.
The following is an example secret to adjust and apply with oc apply -f ${FILENAME}:
apiVersion: v1
data:
  PAGERDUTY_API_KEY: bXktYXBpLWtleQ== # echo -n ${PAGERDUTY_API_KEY} | base64
kind: Secret
metadata:
  name: pagerduty-api-key
  namespace: pagerduty-operator
type: Opaque
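Alternatively, assuming the pagerduty-operator namespace already exists, an equivalent secret can be created directly from the raw (unencoded) key:
oc create secret generic pagerduty-api-key -n pagerduty-operator --from-literal=PAGERDUTY_API_KEY=${PAGERDUTY_API_KEY}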
Option 1: Run operator locally
export OPERATOR_NAME=pagerduty-operator
go run main.go
In another terminal, create the pagerduty-operator namespace if needed:
oc create namespace pagerduty-operator
Continue to Create PagerDutyIntegration.
Option 2: Run locally built operator in CRC
Build your local code modifications and push the image to your own quay.io account:
$ ALLOW_DIRTY_CHECKOUT=true IMAGE_REPOSITORY=${USER_ID} make docker-build
[...]
Successfully tagged quay.io/${USER_ID}/pagerduty-operator:latest
$ podman login quay.io
$ podman push quay.io/${USER_ID}/pagerduty-operator:latest
Generate a pull secret with your quay.io credentials and apply it:
oc project pagerduty-operator
oc apply -f ~/Downloads/${USER_ID}-secret.yml -n pagerduty-operator
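If you don't have a downloaded secret file from quay.io, a pull secret with the same name can also be created directly; the username and password below are placeholders for your own quay.io credentials or robot token:
oc create secret docker-registry ${USER_ID}-pull-secret --docker-server=quay.io --docker-username=${USER_ID} --docker-password=<password-or-token> -n pagerduty-operator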
Create a copy of manifests/05-operator.yaml and modify it to use your image from quay.io:
...
  imagePullSecrets:
    - name: ${USER_ID}-pull-secret
  containers:
    - name: pagerduty-operator
      image: quay.io/${USER_ID}/pagerduty-operator
...
Deploy the modified operator manifest:
oc apply -f path/to/modified/operator.yaml
Note: In some cases, the pagerduty-operator pod in the pagerduty-operator namespace doesn't start, failing with the following error:
Warning FailedScheduling 3m5s default-scheduler 0/1 nodes are available:
1 Insufficient memory. preemption: 0/1 nodes are available: 1 No preemption
victims found for incoming pod.
To remedy this, lower the requested resources in the manifests/05-operator.yaml deployment (e.g. lower memory from 2G to 0.5G).
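For example, the container's resources section in your copy of the manifest could be reduced to something along these lines (the exact stanza in the manifest may differ; the value is illustrative):
resources:
  requests:
    memory: 512Mi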
Create PagerDutyIntegration
There's an example at deploy-extras/pagerduty_v1alpha1_pagerdutyintegration_cr.yaml that you can edit and apply to your cluster.
You'll need to use a valid escalation policy ID from your PagerDuty account. You can get this by clicking on your policy at https://{your-account}.pagerduty.com/escalation_policies#. The ID will be visible in the URL after the # character.
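As a rough sketch, the edited CR might look like the following. Field names mirror the shipped example CR, but all values here are placeholders; check deploy-extras/pagerduty_v1alpha1_pagerdutyintegration_cr.yaml for the authoritative fields.
apiVersion: pagerduty.openshift.io/v1alpha1
kind: PagerDutyIntegration
metadata:
  name: example-pagerdutyintegration
  namespace: pagerduty-operator
spec:
  acknowledgeTimeout: 21600            # illustrative value
  resolveTimeout: 14400                # illustrative value
  escalationPolicy: ABC123             # replace with your escalation policy ID
  servicePrefix: example
  pagerdutyApiKeySecretRef:
    name: pagerduty-api-key
    namespace: pagerduty-operator
  clusterDeploymentSelector:
    matchLabels:
      api.openshift.com/test: "true"
  targetSecretRef:
    name: pd-secret                    # placeholder target secret coordinates
    namespace: openshift-monitoring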
Create the ClusterDeployment object
pagerduty-operator doesn't start reconciling clusters until spec.installed is set to true.
You can create a dummy ClusterDeployment by copying a real one from an active hive:
real-hive$ oc get cd -n ${NAMESPACE} ${CD_NAME} -o yaml > /tmp/fake-clusterdeployment.yaml
...
$ oc create namespace ${NAMESPACE}
$ oc apply -f /tmp/fake-clusterdeployment.yaml
Perform the following modifications:
oc edit clusterdeployment ${CD_NAME} -n ${NAMESPACE}
- Add the following finalizer (i.e. under metadata.finalizers): pd.managed.openshift.io/example-pagerdutyintegration (the suffix comes from the PagerDutyIntegration name).
- Add the following label: api.openshift.com/test: "true" (this is used by the ClusterDeployment selector in the default PagerDutyIntegration).
- Set spec.installed to true (the combined result is sketched below).
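For reference, the relevant parts of the edited ClusterDeployment would then look roughly like this (all other fields omitted; names match the dummy cluster deleted below):
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: fake-cluster
  namespace: fake-cluster-namespace
  labels:
    api.openshift.com/test: "true"
  finalizers:
    - pd.managed.openshift.io/example-pagerdutyintegration
spec:
  installed: true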
Delete the ClusterDeployment object
To trigger pagerduty-operator to remove the service in PagerDuty, delete the ClusterDeployment:
oc delete clusterdeployment fake-cluster -n fake-cluster-namespace
You may need to remove dangling finalizers from the ClusterDeployment object:
oc edit clusterdeployment fake-cluster -n fake-cluster-namespace
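If you just want to clear every remaining finalizer on the dummy object in one step, a merge patch also works (this removes all finalizers, which is only appropriate for a throwaway ClusterDeployment):
oc patch clusterdeployment fake-cluster -n fake-cluster-namespace --type=merge -p '{"metadata":{"finalizers":[]}}'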