envd

Version: v0.0.1
Published: Jun 2, 2022 License: Apache-2.0

Development Environment for Data Scientists

envd is a development environment management tool for data scientists.

🐍 No Docker, only Python - Write Python code to build the development environment; envd takes care of Docker for you.

🖨️ Built-in Jupyter/VSCode - Jupyter and the VSCode remote extension have first-class support.

⏱️ Save time - Better cache management saves you time, so you can focus on the model instead of its dependencies.

☁️ Local & cloud - Run the environment locally or in the cloud, without any code change.

🐳 Container native - Leverage container technologies without having to learn them; envd optimizes them for you.

🤟 Infrastructure as code - Describe your project in a declarative way, 100% reproducible.

Why use envd?

It is still too difficult to configure development environments and reproduce results for data scientists and AI/ML researchers.

They have to play with Docker, conda, CUDA, GPU Drivers, and even Kubernetes if the training jobs are running in the cloud, to make things happen.

Thus, researchers have to find infra guys to help them. But the infra guys also struggle to build environments for machine learning. Infra guys love immutable infrastructure, while researchers optimize AI/ML models by trial and error: the environment is updated, modified, or rebuilt in place, again and again. Researchers do not have the bandwidth to become Dockerfile experts. They prefer docker commit, which produces images that are error-prone and hard to maintain or debug.

envd provides another way to solve the problem. As infra guys, we accept the reality of the differences between AI/ML and traditional workloads. We do not expect researchers to learn the basics of infrastructure. Instead, we build tools that help researchers manage their development environments easily and in a cloud-native way.

envd provides a build language similar to Python and has first-class support for Jupyter, VSCode, and Python dependencies in container technologies.

How does envd work?

Install

From binary

You can download the binary from the latest release page.

After the download, run envd bootstrap to finish the setup.

From source code

git clone https://github.com/tensorchord/envd
cd envd
go mod tidy
make
./bin/envd --version

Quickstart

Check out the examples, and configure envd with the manifest build.envd:

vscode(plugins=[
    "ms-python.python",
])

base(os="ubuntu20.04", language="python3")
pip_package(name=[
    "tensorflow",
    "numpy",
])
cuda(version="11.6", cudnn="8")
shell("zsh")
jupyter(password="", port=8888)

Then you can run envd up to create the development environment.

TODO: illustrate that the cache will be persistent.

$ envd up
[+] ⌚ parse build.envd and download/cache dependencies 0.0s ✅ (finished)
 => 💽 (cached) download oh-my-zsh                                            0.0s
 => 💽 (cached) download ms-python.python                                     0.0s
[+] 🐋 build envd environment 7.7s (24/25)
 => 💽 (cached) (built-in packages) apt-get install curl openssh-client g     0.0s
 => 💽 (cached) create user group envd                                        0.0s
 => 💽 (cached) create user envd                                              0.0s
 => 💽 (cached) add user envd to sudoers                                      0.0s
 => 💽 (cached) (user-defined packages) apt-get install screenfetch           0.0s
 => 💽 (cached) install system packages                                       0.0s
 => 💽 (cached) pip install jupyter                                           0.0s
 => 💽 (cached) install PyPI packages                                         0.0s
 => 💽 (cached) install envd-ssh                                              0.0s
 => 💽 (cached) install vscode plugin ms-python.python                        0.0s
 => 💽 (cached) copy /oh-my-zsh /home/envd/.oh-my-zsh                         0.0s
 => 💽 (cached) mkfile /home/envd/install.sh                                  0.0s
 => 💽 (cached) install oh-my-zsh                                             0.0s
...
# You are in the docker container for dev
(envd 🐳)  ➜  mnist-dev git:(master) python3 ./main.py
...

A Jupyter notebook service and an sshd server run inside the container. You can use Jupyter or the VSCode remote-ssh extension to develop AI/ML models.

$ envd get envs
NAME         JUPYTER                 SSH TARGET   CONTEXT  IMAGE      GPU  CUDA  CUDNN  STATUS      CONTAINER ID 
mnist        http://localhost:9999   mnist.envd   /mnist   mnist:dev  true 11.6  8      Up 23 hours 74a9f1007004
$ envd get images
NAME         CONTEXT GPU     CUDA    CUDNN   IMAGE ID        CREATED         SIZE   
mnist:dev    /mnist  true    11.6    8       034ae55c5f4f    23 hours ago    7.28GB
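
To make the declarative style of build.envd more concrete, here is a minimal Python sketch of how a Python-like manifest can be evaluated into a list of recorded directives. It is an illustration only and not envd's implementation: the helpers make_namespace and evaluate_manifest are hypothetical, and the directive names are simply the ones used in the Quickstart manifest above.

```python
from typing import Any, Callable, Dict, List, Tuple

# One recorded directive: (name, positional args, keyword args).
Call = Tuple[str, tuple, Dict[str, Any]]


def make_namespace(names: List[str], calls: List[Call]) -> Dict[str, Any]:
    """Build one recording stub per directive name (hypothetical helper)."""

    def stub(name: str) -> Callable[..., None]:
        def record(*args: Any, **kwargs: Any) -> None:
            # Nothing is built here; the call is only remembered.
            calls.append((name, args, kwargs))

        return record

    return {name: stub(name) for name in names}


def evaluate_manifest(source: str) -> List[Call]:
    """Run manifest text against the stubs and return the recorded calls."""
    calls: List[Call] = []
    # Assumption: only the directives from the Quickstart manifest are known.
    names = ["vscode", "base", "pip_package", "cuda", "shell", "jupyter"]
    namespace = make_namespace(names, calls)
    # Hide Python builtins from the manifest. This is NOT a real sandbox.
    namespace["__builtins__"] = {}
    exec(source, namespace)
    return calls


MANIFEST = '''
vscode(plugins=[
    "ms-python.python",
])

base(os="ubuntu20.04", language="python3")
pip_package(name=[
    "tensorflow",
    "numpy",
])
cuda(version="11.6", cudnn="8")
shell("zsh")
jupyter(password="", port=8888)
'''

calls = evaluate_manifest(MANIFEST)
for name, args, kwargs in calls:
    print(name, args, kwargs)
```

The point of the sketch is that the manifest describes what the environment should contain, and a separate step decides how to build it. A real tool would translate the recorded directives into build steps instead of printing them.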

Features

Pause and resume

$ envd pause --env mnist
mnist
$ envd get envs
NAME         JUPYTER                 SSH TARGET   CONTEXT  IMAGE      GPU  CUDA  CUDNN  STATUS              CONTAINER ID 
mnist        http://localhost:9999   mnist.envd   /mnist   mnist:dev  true 11.6  8      Up 23 hours(Paused) 74a9f1007004
$ envd resume --env mnist
$ ssh mnist.envd
(envd 🐳) $ # The environment is resumed!

Configure mirrors

envd supports PyPI mirror and apt source configuration. You can configure them in build.envd, or in $HOME/.config/envd/config.envd to apply them to all environments.

$ cat ~/.config/envd/config.envd
ubuntu_apt(source="""
deb https://mirror.sjtu.edu.cn/ubuntu focal main restricted
deb https://mirror.sjtu.edu.cn/ubuntu focal-updates main restricted
deb https://mirror.sjtu.edu.cn/ubuntu focal universe
deb https://mirror.sjtu.edu.cn/ubuntu focal-updates universe
deb https://mirror.sjtu.edu.cn/ubuntu focal multiverse
deb https://mirror.sjtu.edu.cn/ubuntu focal-updates multiverse
deb https://mirror.sjtu.edu.cn/ubuntu focal-backports main restricted universe multiverse
deb http://archive.canonical.com/ubuntu focal partner
deb https://mirror.sjtu.edu.cn/ubuntu focal-security main restricted universe multiverse
""")
pip_index(url = "https://mirror.sjtu.edu.cn/pypi/web/simple")
vscode(plugins = [
    "ms-python.python",
    "github.copilot"
])

Join Us

envd is backed by TensorChord and licensed under Apache-2.0. We are actively hiring engineers to build developer tools for machine learning practitioners in open source.

Contribute

We welcome all kinds of contributions from the open-source community, individuals, and partners.

Directories

Path                              Synopsis
cmd/envd-ssh                      ssh is the CLI running in the container as the sshd.
pkg/buildkitd/mock                Package mock is a generated GoMock package.
pkg/lang/frontend/starlark/mock   Package mock is a generated GoMock package.
pkg/progress/compileui/mock       Package mock is a generated GoMock package.
pkg/ssh                           https://gist.github.com/stefanprodan/2d20d0c6fdab6f14ce8219464e8b4b9a
