# Kubeflow Bootstrap
Bootstrap is a tool to manage a ksonnet application configured to take advantage of a user's cluster. See the dev guide below for more details.
The tool collects information from a variety of sources:
- Kubeconfig file
- K8s master
- User input
- Cloud environment
Based on these inputs, it chooses good values for various Kubeflow parameters.
Requires ksonnet 0.10.0
- The app generated won't work with earlier versions (0.9) of ksonnet
- You can use the version of ksonnet built in the docker container as illustrated below.
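If you're not sure which ksonnet version you have, `ks version` reports it (run it locally or inside the container described below):

```
ks version
```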
## Usage

### Quick start

```
kubectl create -f bootstrapper.yaml
```
You should now have Kubeflow components deployed inside your k8s cluster. The generated ksonnet application is stored in `app-dir`.
Exec into the pod `kubeflow-bootstrapper-0` in namespace `kubeflow-admin` if you need to edit your ksonnet app.
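For example, a minimal way to get a shell in that pod (assuming bash is available in the image):

```
kubectl exec -it kubeflow-bootstrapper-0 --namespace=kubeflow-admin -- /bin/bash
```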
The default components are defined in default.yaml. Users can customize which components to deploy by pointing the `--config` arg in bootstrapper.yaml to their own config (e.g. a ConfigMap in the k8s cluster). This bootstrapper example config can help explain how config customization works.
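For illustration, one possible way to wire in a custom config might look like the following (a hedged sketch; the ConfigMap and file names are hypothetical, and the exact mounting details depend on your setup):

```
# Create a ConfigMap from your own component config (hypothetical names).
kubectl create configmap my-bootstrap-config \
  --namespace=kubeflow-admin \
  --from-file=config.yaml=./my-config.yaml
# Then edit bootstrapper.yaml to mount the ConfigMap into the pod and point the
# --config arg at the mounted file, e.g. --config=/etc/kubeflow/my-config/config.yaml
```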
### Interactive-use container
```
TAG=latest
APP_DIR_HOST=$HOME/kfBootstrap
GITHUB_TOKEN=<Get a GitHub token to avoid API Limits>

# Start the container.
# We need to map config files like kubeconfig and gcloud config into the container.
docker run -ti \
  -e GITHUB_TOKEN=${GITHUB_TOKEN} \
  -e GROUP_ID=`id -g ${USER}` \
  -e USER_ID=`id -u ${USER}` \
  -e USER=${USER} \
  -v ${APP_DIR_HOST}:/home/${USER}/kfBootstrap \
  -v ${HOME}/.kube:/home/${USER}/.kube \
  -v ${HOME}/.config:/home/${USER}/.config \
  gcr.io/kubeflow-images-public/bootstrapper:${TAG}
```
Check how to avoid getting a GitHub rate limit exceeded error.
Inside the container, choose one of the following ways to generate Kubeflow apps:
- On GKE, without Google Sign-in:

  ```
  /opt/kubeflow/bootstrapper --app-dir=/home/${USER}/kfBootstrap/<your_app_name> --namespace=<new_namespace_for_bootstrap> --email=<GCP_account>
  ```

- Outside GKE:

  ```
  /opt/kubeflow/bootstrapper --app-dir=/home/${USER}/kfBootstrap/<your_app_name> --namespace=<new_namespace_for_bootstrap>
  ```
Now the ksonnet app for deploying Kubeflow will be available in `${APP_DIR_HOST}/<your_app_name>`.
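As a quick sanity check, you can list what was generated (paths assume the volume mapping shown above; `ks component list` is a standard ksonnet command):

```
# On the host:
ls ${APP_DIR_HOST}/<your_app_name>
# Or inside the container, using ksonnet itself:
cd /home/${USER}/kfBootstrap/<your_app_name>
ks component list
```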
(Optional) To enable usage reporting:

```
ks param set kubeflow-core reportUsage true
ks param set kubeflow-core usageId $(uuidgen)
```
To deploy it:

```
# Inside container:
cd /home/${USER}/kfBootstrap/<your_app_name>
ks apply default
```
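To verify the components came up, list the pods in the namespace you passed to the bootstrapper (run wherever kubectl is configured):

```
kubectl get pods --namespace=<Namespace for bootstrap>
```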
To connect to your Jupyter Notebook:
```
# On your local machine:
PODNAME=`kubectl get pods --namespace=<Namespace for bootstrap> --selector="app=tf-hub" --output=template --template="{{with index .items 0}}{{.metadata.name}}{{end}}"`
kubectl port-forward --namespace=<Namespace for bootstrap> $PODNAME 8000:8000
```
Then, open [http://127.0.0.1:8000](http://127.0.0.1:8000) in your browser.
## Explanation
For Kubeflow we want a low bar and a high ceiling.
Low bar means we want Kubeflow to be easily accessible.
High ceiling means we want to allow advanced users to customize Kubeflow in complex ways to meet their needs.
Ksonnet creates a high ceiling that allows Kubeflow users to manage Kubeflow declaratively.
However, ksonnet raises the bar for getting started with Kubeflow:
- Users have to download and install the ksonnet tool chain.
- Users have to learn how to use ksonnet to deploy Kubeflow.
- We lack a mechanism for auto-configuring the ksonnet app based on a user's K8s setup.
The config manager aims to solve this by providing a binary/server that can be used to generate the Kubeflow ksonnet app.
### Goals

- Provide a tool to auto-generate and deploy Kubeflow
- Optimize the initial Kubeflow config for the user's K8s setup
  * Eventually we'd like to provide a web-ui to allow for a guided onboarding experience
- Allow advanced customization by emitting the resulting ksonnet application so that users can do arbitrary manipulations
### Non Goals
- Kubeflow lifecycle management
  * The current thinking is that lifecycle management should be handled using the application CRD that sig-apps is developing ([KEP PR in the SIG Apps community repo](https://github.com/kubernetes/community/pull/1629))
- Wrap or replace ks/kubectl
## Bootstrapper Dev Guide
The bootstrapper creates your ksonnet application by generating the set of components specified in a config file (config example). In other words, the bootstrapper gives users an easy way to use the ksonnet API within a k8s cluster by editing a config file.
Currently we only support using the local registries bundled in the bootstrapper image. Edit the image config to control which registries are included in the bootstrapper image; users can then build a custom image to bootstrap from.
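For instance, a hedged sketch of building and pushing such a custom image after editing the image config (registry, project, and tag are placeholders):

```
docker build -t gcr.io/<your-project>/bootstrapper:custom .
docker push gcr.io/<your-project>/bootstrapper:custom
# Then point the image field in bootstrapper.yaml at your custom image.
```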
### Background
Here are some of the difficulties with deploying Kubeflow today.
#### ksonnet is a barrier
As mentioned above, learning and setting up ksonnet is a barrier.
#### No mechanism for auto-configuration
Kubeflow is adding options to control how Kubeflow is deployed; examples include
- Using persistent storage to back Jupyter notebooks
- How Jupyter auth is handled
- Ingress
While these options raise the ceiling, they make it difficult to get started.
Ksonnet supports default values but finding defaults that work across all K8s deployments is difficult. kubeflow/kubeflow#336 is one example where relying on a default storage class caused problems for some users.
We need a mechanism to auto-configure Kubeflow for a particular K8s setup so that we don't end up defaulting to the lowest common denominator.
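As one concrete example of the kind of signal such auto-configuration could use, a tool (or a user) can check whether the cluster actually has a default StorageClass before relying on one (plain kubectl, nothing Kubeflow-specific):

```
kubectl get storageclass
# A class marked "(default)" is used when a PVC omits storageClassName.
```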
#### Non-K8s dependencies
Another problem is what to do about non-K8s dependencies. For example, as part of deploying Kubeflow users may want to reserve an IP address to use for ingress.
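On GCP, for example, reserving a static IP for ingress is exactly such a non-K8s step (a minimal sketch; the address name is arbitrary):

```
gcloud compute addresses create kubeflow-ingress-ip --global
```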
Some options for dealing with this:
- Create the resources in the bootstrapper
  * A downside of this is that it violates the K8s philosophy of managing infrastructure declaratively
  * It also means salient details about the deployment aren't stored in the configs (ksonnet application) and versioned in source control
- Wrap the resource creation/management in a CRD using kube-metacontroller
  * One potential downside is that this may require extra permissions.
#### No ordering to deployments
There's currently no explicit mechanism in K8s to control the order in which resources are created. Phase ordering is one problem highlighted in Brian Grant's doc about declarative application management in K8s.
Potential solutions:
- Implicit phase ordering
  * e.g. if a pod depends on a volume or ConfigMap, that pod won't start running until the ConfigMap exists (see the sketch below)
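As a small illustration of that implicit ordering (a sketch with hypothetical names), the pod below mounts a ConfigMap volume, so its container will not start until `demo-config` exists:

```
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ordering-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "cat /etc/demo/key; sleep 3600"]
    volumeMounts:
    - name: cfg
      mountPath: /etc/demo
  volumes:
  - name: cfg
    configMap:
      name: demo-config   # the container stays in ContainerCreating until this ConfigMap exists
EOF
```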