trustyai-service-operator

command module
v1.19.0
Published: Apr 19, 2024 License: Apache-2.0 Imports: 15 Imported by: 0

README

TrustyAI Kubernetes Operator

Overview

The TrustyAI Kubernetes Operator simplifies the deployment and management of the TrustyAI service on Kubernetes and OpenShift clusters. It watches for custom resources of kind TrustyAIService in the trustyai.opendatahub.io API group and manages the deployments, services, and, optionally, routes and ServiceMonitors that correspond to these resources.

The operator ensures the service is properly configured, is discoverable by Prometheus for metrics scraping (on both Kubernetes and OpenShift), and is accessible via a Route on OpenShift.

Prerequisites

  • Kubernetes cluster v1.19+ or OpenShift cluster v4.6+
  • kubectl v1.19+ or oc client v4.6+

Installation using pre-built Operator image

This operator is available as an image on Quay.io. To deploy it on your cluster:

  1. Install the Custom Resource Definition (CRD):

    Apply the CRD to your cluster (replace the URL with the relevant one, if using another repository):

    kubectl apply -f https://raw.githubusercontent.com/trustyai-explainability/trustyai-service-operator/main/config/crd/bases/trustyai.opendatahub.io_trustyaiservices.yaml
    
  2. Deploy the Operator:

    Apply the following Kubernetes manifest to deploy the operator:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: trustyai-operator
      namespace: trustyai-operator-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          control-plane: trustyai-operator
      template:
        metadata:
          labels:
            control-plane: trustyai-operator
        spec:
          containers:
            - name: trustyai-operator
              image: quay.io/trustyai/trustyai-service-operator:latest
              command:
                - /manager
              resources:
                limits:
                  cpu: 100m
                  memory: 30Mi
                requests:
                  cpu: 100m
                  memory: 20Mi
    

    Alternatively, apply the example manifest from the repository:

    kubectl apply -f https://raw.githubusercontent.com/trustyai-explainability/trustyai-service-operator/main/artifacts/examples/deploy-operator.yaml
    

Usage

Once the operator is installed, you can create TrustyAIService resources, and the operator will create corresponding TrustyAI deployments, services, and (on OpenShift) routes.

Here's an example TrustyAIService manifest:

apiVersion: trustyai.opendatahub.io/v1alpha1
kind: TrustyAIService
metadata:
  name: trustyai-service-example
spec:
  storage:
    format: "PVC"
    folder: "/inputs"
    size: "1Gi"
  data:
    filename: "data.csv"
    format: "CSV"
  metrics:
    schedule: "5s"
    batchSize: 5000 # Optional, defaults to 5000

You can apply this manifest with

kubectl apply -f <file-name.yaml> -n $NAMESPACE

to create a service, where $NAMESPACE is the namespace where you want to deploy it.

Additionally, in that namespace:

  • a ServiceMonitor will be created to allow Prometheus to scrape metrics from the service.
  • (if on OpenShift) a Route will be created to allow external access to the service.

Custom Image Configuration using ConfigMap

You can specify a custom TrustyAI service image by adding parameters to the TrustyAI operator's KfDef, for example:

apiVersion: kfdef.apps.kubeflow.org/v1
kind: KfDef
metadata:
  name: trustyai-service-operator
  namespace: opendatahub
spec:
  applications:
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: config
      parameters:
         - name: trustyaiServiceImage
           value: NEW_IMAGE_NAME
    name: trustyai-service-operator
  repos:
  - name: manifests
    uri: https://github.com/trustyai-explainability/trustyai-service-operator/tarball/main
  version: v1.0.0

If these parameters are unspecified, the default image and tag will be used.

To change the service image or tag after deploying the operator, update the parameters in the KfDef. Any TrustyAI service deployed afterwards will use the new image and tag.

TrustyAIService Status Updates

The TrustyAIService custom resource tracks the availability of InferenceServices and PersistentVolumeClaims (PVCs) through its status field. Below are the status types and reasons that are available:

InferenceService Status

Status Type               Status Reason              Description
InferenceServicesPresent  InferenceServicesNotFound  InferenceServices were not found.
InferenceServicesPresent  InferenceServicesFound     InferenceServices were found.

PersistentVolumeClaim (PVCs) Status

Status Type   Status Reason  Description
PVCAvailable  PVCNotFound    PersistentVolumeClaim not found.
PVCAvailable  PVCFound       PersistentVolumeClaim found.

Status Behavior

  • If a PVC is not available, the Ready status of TrustyAIService will be set to False.
  • However, if InferenceServices are not found, the Ready status of TrustyAIService will not be affected: if it is Ready by all other conditions, it will remain so.
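
The two rules above can be sketched as a small Go function. This is an illustrative sketch only, not the operator's actual reconciliation code: the Condition type, the isReady function, and the otherChecksOK parameter are hypothetical names introduced here, and only the condition type and reason strings come from the tables above.

```go
package main

import "fmt"

// Condition is a hypothetical, simplified stand-in for one status condition.
// It is NOT the operator's real API type.
type Condition struct {
	Type   string // e.g. "PVCAvailable" or "InferenceServicesPresent"
	Reason string // e.g. "PVCFound" or "InferenceServicesNotFound"
	Status bool   // true when the condition holds
}

// isReady sketches the documented behavior:
//   - a PVCAvailable condition that is false forces Ready to false;
//   - InferenceServicesPresent never changes the result.
// otherChecksOK stands for every readiness check not described in this section.
func isReady(conds []Condition, otherChecksOK bool) bool {
	for _, c := range conds {
		if c.Type == "PVCAvailable" && !c.Status {
			return false
		}
	}
	return otherChecksOK
}

func main() {
	conds := []Condition{
		{Type: "InferenceServicesPresent", Reason: "InferenceServicesNotFound", Status: false},
		{Type: "PVCAvailable", Reason: "PVCFound", Status: true},
	}
	// Missing InferenceServices alone do not affect readiness.
	fmt.Println(isReady(conds, true))

	// A missing PVC makes the service not ready.
	conds[1] = Condition{Type: "PVCAvailable", Reason: "PVCNotFound", Status: false}
	fmt.Println(isReady(conds, true))
}
```

Under these assumptions, the first call reports ready and the second reports not ready, which mirrors the asymmetry between the two conditions.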

Contributing

Please see the CONTRIBUTING.md file for more details on how to contribute to this project.

License

This project is licensed under the Apache License Version 2.0 - see the LICENSE file for details.

Directories

Path: api/v1alpha1
Synopsis: Package v1alpha1 contains API Schema definitions for the trustyai.opendatahub.io v1alpha1 API group.
