Docker for Cluster Computing in Python

Olivier Grisel - EuroSciPy 2016

Inria scikit-learn
1 / 46

Outline

Distributed ML with joblib and dask/distributed

A quick intro to Docker

Cluster orchestration with Docker and Kubernetes

2 / 46

Distributed ML with joblib and dask/distributed

3 / 46
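
The pattern behind this section is a fan-out of independent tasks, for instance scoring every point of a hyperparameter grid. Below is a minimal sketch of that pattern using only the standard library's `concurrent.futures` thread pool as a stand-in: it is not joblib or dask/distributed, which provide the same kind of map over processes or cluster workers. The `evaluate` objective and the parameter names `c` and `gamma` are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(params):
    """Toy stand-in for fitting and scoring one model (hypothetical objective)."""
    c, gamma = params
    # The score peaks at c=1.0, gamma=0.1; a real task would train a model here.
    return 1.0 - (c - 1.0) ** 2 - (gamma - 0.1) ** 2


def grid_search(c_values, gamma_values, max_workers=4):
    """Score every parameter combination concurrently and return the best one."""
    grid = list(product(c_values, gamma_values))
    # Tasks are independent, so the pool may run them in any order;
    # map() still returns the scores in the order of the grid.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(evaluate, grid))
    best_score, best_params = max(zip(scores, grid))
    return best_params, best_score


best_params, best_score = grid_search([0.1, 1.0, 10.0], [0.01, 0.1, 1.0])
print(best_params, best_score)
```

Because each task only needs its own parameters, the same code shape carries over when the pool is replaced by workers running on other machines.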

A cluster for sklearn

Cluster diagram
4 / 46

Demo

https://github.com/ogrisel/docker-distributed

Have a look at the notebooks in the examples folder.

5 / 46

What is Docker?

6 / 46

Docker is:

Not a virtual machine system

Linux container + Layered file-system + Abstract network

It can run in a VM (or not)

7 / 46

VM vs Container

VM-based hosting vs Container-based hosting
8 / 46

Why is Docker nicer than VMs?

Less overhead: isolation vs virtualization

Layered FS: incremental diffs + common base images

Nice command line based UX

Simple reproducible builds and deployment

9 / 46
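
The "incremental diffs + common base images" idea can be pictured with a toy model: read-only image layers shared by several containers, each container adding its own writable layer on top. The sketch below uses `collections.ChainMap` with made-up file contents; it only illustrates the lookup and write behaviour and is not how Docker's storage drivers are implemented.

```python
from collections import ChainMap

# Read-only layers shared by every image built on the same base (toy contents).
base_layer = {"/bin/sh": "shell", "/etc/os-release": "debian"}
python_layer = {"/usr/bin/python": "python 3.5"}


def new_container_fs(*image_layers):
    """Stack a fresh writable layer on top of read-only image layers."""
    # ChainMap looks keys up from left to right and sends writes to the
    # first mapping, so the top layer shadows the image layers below it.
    return ChainMap({}, *reversed(image_layers))


fs_a = new_container_fs(base_layer, python_layer)
fs_b = new_container_fs(base_layer, python_layer)

# Writing in container A only touches A's own top layer.
fs_a["/etc/os-release"] = "modified in container A"

print(fs_a["/etc/os-release"])  # value from A's writable layer
print(fs_b["/etc/os-release"])  # value from the shared base layer
print(fs_a.maps[0])             # the diff that container A accumulated
```

Both containers share the same base dictionaries, which mirrors why images built on a common base only need to store and transfer their differences.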

Docker architecture

Architecture of Docker
10 / 46

Some use cases

Host applications / services efficiently

Fast developer environment setup

Flexible build / Continuous Integration

Quickly provision a compute cluster

Reproducible science

11 / 46

Getting started with Docker

https://docs.docker.com/engine/installation/

Windows: daemon in a Linux VM with Hyper-V or VirtualBox

OS X: daemon in a Linux VM with xhyve or VirtualBox

Linux: daemon on host without VM

Cross-platform docker client binary

12 / 46

Running in a container

$ docker run --rm python:3.5 python -c "print(40 + 2)"
Unable to find image 'python:3.5' locally
3.5: Pulling from library/python
357ea8c3d80b: Already exists
52befadefd24: Pull complete
3c0732d5313c: Pull complete
ceb711c7e301: Pull complete
4211bb537697: Pull complete
71f9074c0739: Pull complete
3e5349707036: Pull complete
Digest: sha256:a755ad5a30b2[...]
Status: Downloaded newer image for python:3.5
42
$ docker run --rm python:3.5 python -c "print(40 + 3)"
43
19 / 46
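
The same invocation can be assembled from Python before handing it to `subprocess`. This is a minimal sketch: the helper name `docker_run_command` and its parameters are hypothetical, and only the `--rm`, `-ti` and `--volume` flags shown on these slides are covered. Nothing is executed here, so it works without a Docker daemon.

```python
import shlex


def docker_run_command(image, command, remove=True, interactive=False, volumes=None):
    """Build the argument list for a `docker run` invocation (hypothetical helper)."""
    argv = ["docker", "run"]
    if remove:
        argv.append("--rm")  # delete the container once the command exits
    if interactive:
        argv.append("-ti")  # allocate a terminal and keep stdin open
    for host_path, container_path in (volumes or {}).items():
        argv += ["--volume", f"{host_path}:{container_path}"]
    argv.append(image)
    argv += list(command)
    return argv


argv = docker_run_command("python:3.5", ["python", "-c", "print(40 + 2)"])
print(shlex.join(argv))
# To actually run it (requires a working Docker installation):
# import subprocess; subprocess.run(argv, check=True)
```

Passing a list of arguments rather than a single shell string avoids quoting problems with commands such as `print(40 + 2)`.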

Interactive sessions

$ docker run --rm -ti python:3.5 bash
root@10d2dfedb935:/# ps
  PID TTY          TIME CMD
    1 ?        00:00:00 bash
    8 ?        00:00:00 ps
root@10d2dfedb935:/# python
Python 3.5.2 (default, Aug 9 2016, 20:58:38)
[GCC 4.9.2] on linux
>>> 40 + 2
42
>>> ^D
root@10d2dfedb935:/# exit
24 / 46

A real-life example

$ cd bokeh/bokehjs
$ npm install -g gulp
$ gulp build
[Bunch of errors caused by a broken nodejs install]
25 / 46

A real-life example

$ cd bokeh
$ docker run --rm --volume $PWD:/io node \
bash -c "cd /io/bokehjs && npm install -g gulp && gulp build"
npm info it worked if it ends with ok
npm info using npm@3.10.3
npm info using node@v6.4.0
npm info attempt registry request try #1 at 12:45:51 PM
npm http request GET https://registry.npmjs.org/gulp
[...] # Download the Internet
[13:03:42] Finished 'build' after 10 s
$ ls bokehjs/node_modules | wc -l
604
28 / 46

Building an image

29 / 46
$ git clone https://github.com/ogrisel/docker-distributed
$ ls docker-distributed
docker-compose.yml
Dockerfile
examples
kubernetes
README.md
requirements.txt
30 / 46
FROM debian:jessie
MAINTAINER Olivier Grisel <olivier.grisel@ensta.org>

RUN apt-get update -yqq && apt-get install -yqq wget bzip2 git \
    && rm -rf /var/lib/apt/lists/*

# Configure environment
ENV LC_ALL=C.UTF-8 LANG=C.UTF-8
RUN mkdir /work
WORKDIR /work

# Install Python 3 from miniconda
RUN wget -O miniconda.sh \
      https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && bash miniconda.sh -b -p /work/miniconda \
    && rm miniconda.sh
ENV PATH="/work/bin:/work/miniconda/bin:$PATH"
RUN conda install -y \
      pip \
      notebook \
      pandas \
      scikit-learn \
    && conda clean -tipsy

# Install the master branch of distributed and dask
COPY requirements.txt .
RUN pip install -r requirements.txt && rm -rf ~/.cache/pip/

# Add the example notebooks
COPY examples .
33 / 46

docker build & docker push

$ docker build -t ogrisel/distributed .
Sending build context to Docker daemon 178.2 kB
Step 1 : FROM debian:jessie
---> 1b01529cc499
Step 2 : MAINTAINER Olivier Grisel <olivier.grisel@ensta.org>
---> Using cache
---> 37887ee139f1
Step 3 : RUN apt-get update -yqq && apt-get install -yqq wget [...]
---> Using cache
---> 3c2b8caccb80
[...]
$ docker run --rm ogrisel/distributed \
python -c "import sklearn; print(sklearn.__version__)"
0.17.1
$ docker push ogrisel/distributed
35 / 46
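
The "Using cache" lines in the build output come from layer caching: a step is reused as long as neither it nor any step before it has changed. The sketch below is a deliberately simplified model of that rule. The hashing scheme and the names `layer_id` and `build` are assumptions made for illustration, and the real builder also takes the contents of copied files into account for `COPY` and `ADD`.

```python
import hashlib


def layer_id(parent_id, instruction):
    """Derive a toy layer identifier from the parent layer and the instruction."""
    digest = hashlib.sha256(f"{parent_id}\n{instruction}".encode("utf-8"))
    return digest.hexdigest()[:12]


def build(instructions, cache):
    """Replay instructions, reusing cached layers until the first change."""
    parent, log = "", []
    for instruction in instructions:
        current = layer_id(parent, instruction)
        log.append((instruction, "cached" if current in cache else "built"))
        cache.add(current)
        # The next step depends on this layer, so any change invalidates
        # every step that follows it.
        parent = current
    return log


dockerfile = [
    "FROM debian:jessie",
    "RUN apt-get install -yqq wget bzip2 git",
    "COPY requirements.txt .",
]
cache = set()
first = build(dockerfile, cache)

# Editing the second step forces it and every later step to be rebuilt.
edited = dockerfile[:1] + ["RUN apt-get install -yqq wget bzip2 git curl"] + dockerfile[2:]
second = build(edited, cache)
for instruction, status in second:
    print(status, instruction)
```

This is why slow, rarely changing steps (system packages, conda install) are best placed before the steps that change often (copying requirements and notebooks).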

Orchestration

36 / 46

Orchestration tools

Docker Swarm and Docker Compose

Kubernetes

DC/OS and Mesos

37 / 46

Kubernetes and GKE

$ gcloud config set compute/zone europe-west1-d
$ gcloud container clusters create cluster-1 \
--num-nodes 3 \
--machine-type n1-highcpu-32 \
--scopes bigquery,storage-rw \
--wait
Creating cluster cluster-1...done.
Created [https://container.googleapis.com/v1/.../cluster-1].
kubeconfig entry generated for cluster-1.
NAME       ZONE            ...  MACHINE_TYPE   NUM_NODES  STATUS
cluster-1  europe-west1-d  ...  n1-highcpu-32  3          RUNNING
$ gcloud container clusters get-credentials cluster-1
Fetching cluster endpoint and auth data.
kubeconfig entry generated for cluster-1.
39 / 46

Kubernetes and GKE

$ git clone https://github.com/ogrisel/docker-distributed
$ cd docker-distributed
$ kubectl create -f kubernetes/
service "dscheduler" created
service "dscheduler-status" created
replicationcontroller "dscheduler" created
replicationcontroller "dworker" created
service "jupyter-notebook" created
replicationcontroller "jupyter-notebook" created
$ kubectl get services
NAME                CLUSTER-IP       EXTERNAL-IP      PORT(S)
dscheduler          10.115.249.189   <none>           8786/TCP,9786/TCP
dscheduler-status   10.115.244.201   130.211.50.206   8787/TCP
jupyter-notebook    10.115.254.255   146.148.114.90   80/TCP
kubernetes          10.115.240.1     <none>           443/TCP
41 / 46

distributed-worker.yml

apiVersion: v1
kind: ReplicationController
metadata:
  name: dworker-controller
spec:
  replicas: 3
  selector:
    name: dworker
  template:
    metadata:
      labels:
        name: dworker
    spec:
      containers:
      - name: dworker
        image: ogrisel/distributed:latest
        args: ["dask-worker", "dscheduler:8786"]
42 / 46
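
When the number of workers has to vary between experiments, the manifest can be generated instead of edited by hand. The following is a minimal sketch that mirrors the fields of the worker manifest above and serializes it as JSON, a format `kubectl create -f` accepts alongside YAML. The function name `worker_controller` and its parameters are hypothetical and not part of the docker-distributed repository.

```python
import json


def worker_controller(replicas, image="ogrisel/distributed:latest",
                      scheduler="dscheduler:8786"):
    """Build a ReplicationController manifest for dask workers (illustrative)."""
    labels = {"name": "dworker"}
    return {
        "apiVersion": "v1",
        "kind": "ReplicationController",
        "metadata": {"name": "dworker-controller"},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels, otherwise the
            # controller cannot find the pods it is supposed to manage.
            "selector": dict(labels),
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [{
                        "name": "dworker",
                        "image": image,
                        "args": ["dask-worker", scheduler],
                    }]
                },
            },
        },
    }


manifest = worker_controller(replicas=8)
print(json.dumps(manifest, indent=2))
```

The printed document can be saved to a file such as `distributed-worker.json` (an arbitrary name) and passed to `kubectl create -f`.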

Conclusion

Docker is a useful developer tool

Containers make cluster setup reproducible and cloud agnostic

Docker Swarm vs Kubernetes vs Mesos: learn any one and you will understand the others

43 / 46

Thank you

Rackspace for free cloud resources to support SciPy projects and for helping the Python community in general.

Google for GCP credits to test distributed Python with Kubernetes.

Inria for supporting my work on scikit-learn and related projects.

44 / 46

Thank you!

Sample configuration:

These slides:

45 / 46

Image credits

46 / 46
