Operator for installation and lifecycle management of the CodeFlare distributed workload stack, starting with MCAD and InstaScale
CodeFlare Stack Compatibility Matrix
| Component                  | Version |
|----------------------------|---------|
| CodeFlare Operator         | v0.2.2  |
| Multi-Cluster App Dispatcher | v1.34.0 |
| CodeFlare-SDK              | v0.7.0  |
| InstaScale                 | v0.0.7  |
| KubeRay                    | v0.5.0  |
Development
Testing
The e2e tests can be executed locally by running the following commands:
Use an existing cluster, or set up a test cluster, e.g.:
# Create a KinD cluster
make kind-e2e
# Install the CRDs
make install
[!NOTE]
Some e2e tests access services via Ingresses, as end users would, which requires reaching the Ingress controller load balancer by its IP.
On macOS, this requires installing docker-mac-net-connect.
Start the operator locally:
make run
Alternatively, you can run the operator from your IDE / debugger.
Set up the test CodeFlare stack:
make setup-e2e
[!NOTE]
On OpenShift, the KubeRay operator pod is assigned a random user, which is then used to run the Ray cluster.
However, the random user assigned by OpenShift doesn't have permission to store the dataset downloaded during test execution, causing tests to fail.
To prevent this failure on OpenShift, enforce user 1000 for KubeRay and the Ray cluster by creating this SCC in the KubeRay operator namespace (replace the namespace placeholder):
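A minimal sketch of what such an SCC might look like, written to a file so it can be reviewed before being applied. The namespace `kuberay-system`, the service account name `kuberay-operator`, the SCC name `run-as-ray-user`, and the file name are all assumptions for illustration; adjust them to match your installation, and note that your cluster may require additional SCC fields.

```shell
# Hypothetical values: adjust both to match your KubeRay installation.
NAMESPACE="${NAMESPACE:-kuberay-system}"                 # assumed KubeRay operator namespace
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-kuberay-operator}"   # assumed operator service account

# Write the SCC manifest. runAsUser.type=MustRunAs with uid 1000 pins the user
# for pods created by the listed service account.
cat > run-as-ray-user-scc.yaml <<EOF
kind: SecurityContextConstraints
apiVersion: security.openshift.io/v1
metadata:
  name: run-as-ray-user
seLinuxContext:
  type: MustRunAs
runAsUser:
  type: MustRunAs
  uid: 1000
users:
  - 'system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}'
EOF

# Apply it with sufficient privileges (left commented out so the sketch has no side effects):
# oc apply -f run-as-ray-user-scc.yaml
```

Running it with `NAMESPACE=my-namespace` set in the environment substitutes that namespace into the `users` entry.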
Once the PR is merged, announce the new release on Slack and on mailing lists, if any.
Update the Distributed Workloads component in ODH (also copy/update the compatibility matrix). This may require YAML and test updates depending on the release. Make sure to create a tag and a release in the Distributed Workloads repository that match the project-codeflare release version.
Update the README, Markdown, and YAML files in odh-manifests as required.
Releases involving part of the stack
A new CodeFlare stack release may require releasing only a subset of the stack components, for example a hotfix for a specific component. In these instances: