PaddlePaddle in Docker Containers
=================================

Docker containers are currently the only officially supported way to
run PaddlePaddle.  This is reasonable because Docker now runs on all
major operating systems, including Linux, Mac OS X, and Windows.
Please be aware that you will need to change `Docker's settings
<https://github.com/PaddlePaddle/Paddle/issues/627>`_ to make full use
of your hardware resources on Mac OS X and Windows.

Working With Docker
-------------------

Here we describe the basic Docker concepts used in this tutorial.

- A *container* is an environment for running applications.

- An *image* is an immutable snapshot of a Docker container. You can
  run a container based on a Docker image with the command
  :code:`docker run docker_image_name`.

- By default, a Docker container has an isolated file system namespace,
  so files in the host file system are not visible inside it. A
  *volume* makes mounted host files visible inside the container.
  The following command mounts the current directory at :code:`/data`
  inside a container started from the :code:`debian` image and runs
  :code:`ls /data` in it.

  .. code-block:: bash

     docker run --rm -v $(pwd):/data debian ls /data
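As a minimal sketch of the volume concept above, the helper below
builds the :code:`-v` argument from a host directory and a mount
point. The function name :code:`volume_arg` is hypothetical and not
part of Docker or PaddlePaddle; the optional :code:`ro` suffix is
Docker's standard way to mount a volume read-only.

.. code-block:: bash

   # Print a -v argument that mounts a host directory into a container.
   # Usage: volume_arg <host-dir> <container-dir> [ro]
   # (hypothetical helper, for illustration only)
   volume_arg() {
       local host_dir container_dir=$2 mode=${3:-}
       # Docker expects an absolute host path, so resolve relative ones.
       # Fails if the host directory does not exist.
       host_dir=$(cd "$1" && pwd) || return 1
       if [ "$mode" = ro ]; then
           echo "-v ${host_dir}:${container_dir}:ro"
       else
           echo "-v ${host_dir}:${container_dir}"
       fi
   }

For example, :code:`docker run --rm $(volume_arg . /data ro) debian ls
/data` lists the current directory from inside the container without
allowing writes to it. The unquoted expansion assumes that the path
contains no whitespace.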

Usage of CPU-only and GPU Images
----------------------------------

For each version of PaddlePaddle, we release two types of Docker
images: a development image and production images. The production
images come in a CPU-only version and a CUDA GPU version, each with a
no-AVX variant. We publish the Docker images on `dockerhub.com
<https://hub.docker.com/r/paddledev/paddle/>`_. You can find the
latest versions under the "tags" tab at dockerhub.com.

1. Production images. These come in multiple variants:

   - GPU/AVX: :code:`paddlepaddle/paddle:<version>-gpu`
   - GPU/no-AVX: :code:`paddlepaddle/paddle:<version>-gpu-noavx`
   - CPU/AVX: :code:`paddlepaddle/paddle:<version>`
   - CPU/no-AVX: :code:`paddlepaddle/paddle:<version>-noavx`

   Please be aware that the default CPU-only and GPU images both use
   the AVX instruction set, which older CPUs do not support (AVX first
   shipped in x86 processors in 2011).  The following command checks
   whether your Linux computer supports AVX:

   .. code-block:: bash

      if grep -qi avx /proc/cpuinfo; then echo Yes; else echo No; fi

   To run the CPU-only image as an interactive container:

   .. code-block:: bash

      docker run -it --rm paddlepaddle/paddle:0.10.0rc2 /bin/bash

   The same method works with the GPU image too, but the recommended
   way is to use `nvidia-docker <https://github.com/NVIDIA/nvidia-docker>`_.

   Please install nvidia-docker first by following this `tutorial
   <https://github.com/NVIDIA/nvidia-docker#quick-start>`_.

   Now you can run a GPU image:

   .. code-block:: bash

      nvidia-docker run -it --rm paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash

2. Development image: :code:`paddlepaddle/paddle:<version>-dev`

   This image packages the development tools and the runtime
   environment. Users and developers can use it instead of their own
   local computer for development, building, releasing, writing
   documentation, and so on. Because different versions of Paddle may
   depend on different versions of libraries and tools, you must pay
   attention to those versions if you set up a local environment
   instead.  The development image contains:

   - gcc/clang
   - nvcc
   - Python
   - sphinx
   - woboq
   - sshd

   Many developers use servers with GPUs. They can log in to the
   server over SSH and run :code:`docker exec` to enter the Docker
   container and start working.  Alternatively, they can start a
   development container that runs the sshd service and log in to the
   container directly.
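The image variants and the AVX check described above can be combined
into a small helper that picks a production image name. This is only
a sketch: the function :code:`paddle_image` is hypothetical, and it
assumes the tag scheme listed above and a Linux-style
:code:`/proc/cpuinfo`.

.. code-block:: bash

   # Print the production image name for a given version.
   # Usage: paddle_image <version> [cpu|gpu] [cpuinfo-file]
   # (hypothetical helper, for illustration only)
   paddle_image() {
       local version=$1 device=${2:-cpu} cpuinfo=${3:-/proc/cpuinfo}
       local tag=$version
       if [ "$device" = gpu ]; then
           tag="${tag}-gpu"
       fi
       # Fall back to the no-AVX variant when no avx flag is found.
       # A missing cpuinfo file is also treated as "no AVX".
       if ! grep -qi avx "$cpuinfo" 2>/dev/null; then
           tag="${tag}-noavx"
       fi
       echo "paddlepaddle/paddle:${tag}"
   }

With this in place, :code:`docker run -it --rm "$(paddle_image
0.10.0rc2)" /bin/bash` would start the CPU-only image that matches the
local processor.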

Train Model Using Python API
----------------------------

Our official Docker image provides a runtime for PaddlePaddle
programs. A typical workflow is as follows:

Create a directory as a workspace:

.. code-block:: bash

   mkdir ~/workspace

Edit a PaddlePaddle Python program using your favourite editor:

.. code-block:: bash

   emacs ~/workspace/example.py

Run the program using docker:

.. code-block:: bash

   docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 python /workspace/example.py

Or, if you are using a GPU for training:

.. code-block:: bash

   nvidia-docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu python /workspace/example.py

The commands above start a Docker container that runs :code:`python
/workspace/example.py`. The container stops once the program
finishes.

Another way is to tell Docker to start a :code:`/bin/bash` session and
run the PaddlePaddle program interactively:

.. code-block:: bash

   docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 /bin/bash
   # now we are inside docker container
   cd /workspace
   python example.py

Running with GPU is identical:

.. code-block:: bash

   nvidia-docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash
   # now we are inside docker container
   cd /workspace
   python example.py
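The CPU and GPU commands above differ only in the runner and the image
tag, so a wrapper can choose between them. The sketch below assumes
that the presence of :code:`nvidia-docker` on :code:`PATH` means a GPU
should be used; :code:`paddle_run` is a hypothetical name, not a
PaddlePaddle tool.

.. code-block:: bash

   # Run a script from ~/workspace in a PaddlePaddle container, using
   # the GPU image when nvidia-docker is available and the CPU-only
   # image otherwise.
   # Usage: paddle_run <version> <script-name>
   # (hypothetical helper, for illustration only)
   paddle_run() {
       local version=$1 script=$2
       local runner=docker image="paddlepaddle/paddle:${version}"
       if command -v nvidia-docker >/dev/null 2>&1; then
           runner=nvidia-docker
           image="${image}-gpu"
       fi
       "$runner" run -it --rm -v "$HOME/workspace:/workspace" \
           "$image" python "/workspace/$script"
   }

For example, :code:`paddle_run 0.10.0rc2 example.py` runs
:code:`~/workspace/example.py` with whichever runner is installed.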

Develop PaddlePaddle or Train Model Using C++ API
---------------------------------------------------

We will use the PaddlePaddle development image since it contains all
the compilation tools and dependencies.

Let's clone the PaddlePaddle repo first:

.. code-block:: bash

   git clone https://github.com/PaddlePaddle/Paddle.git && cd Paddle

Mount both the workspace folder and the Paddle source folder into the
Docker container so that we can access them inside it. There are two
ways of using the PaddlePaddle development image:

- Run an interactive bash session directly:

  .. code-block:: bash

     # use nvidia-docker instead of docker if you need to use GPU
     docker run -it -v ~/workspace:/workspace -v $(pwd):/paddle paddlepaddle/paddle:0.10.0rc2-dev /bin/bash
     # now we are inside docker container

- Or run it as a daemon container:

  .. code-block:: bash

     # use nvidia-docker instead of docker if you need to use GPU
     docker run -d -p 2202:22 -p 8888:8888 -v ~/workspace:/workspace -v $(pwd):/paddle paddlepaddle/paddle:0.10.0rc2-dev /usr/sbin/sshd -D

  Then SSH to this container using the password :code:`root`:

  .. code-block:: bash

     ssh -p 2202 root@localhost

  An advantage of this approach is that we can run the PaddlePaddle
  container on a remote server and SSH to it from a laptop.
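When the container runs on a remote server, the same SSH command
applies with the server's address in place of :code:`localhost`. A
minimal sketch, in which :code:`paddle_ssh` and the host name are
hypothetical and port 2202 is assumed to be reachable from the laptop:

.. code-block:: bash

   # Open an SSH session to a PaddlePaddle development container.
   # Usage: paddle_ssh <host> [port]
   # (hypothetical helper, for illustration only)
   paddle_ssh() {
       local host=$1 port=${2:-2202}
       ssh -p "$port" "root@${host}"
   }

For example, :code:`paddle_ssh gpu-server.example.com` would connect
to a container started with :code:`-p 2202:22` on that (hypothetical)
server.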

When developing PaddlePaddle, you can edit the PaddlePaddle source
code from outside the Docker container using your favorite editor. To
compile PaddlePaddle, run the following inside the container:

.. code-block:: bash

   WITH_GPU=OFF WITH_AVX=ON WITH_TEST=ON bash /paddle/paddle/scripts/docker/build.sh

This builds PaddlePaddle in :code:`/paddle/build`, where we can then
run the unit tests:

.. code-block:: bash

   cd /paddle/build
   ctest

When training a model using the C++ API, we can edit the Paddle
program in :code:`~/workspace` outside of Docker and build it from
:code:`/workspace` inside the container.

PaddlePaddle Book
------------------

The Jupyter Notebook is an open-source web application that allows
you to create and share documents that contain live code, equations,
visualizations, and explanatory text, all viewed in a browser.

PaddlePaddle Book is an interactive Jupyter Notebook for users and
developers. We have already exposed port 8888 for this book. If you
want to dig deeper into deep learning, PaddlePaddle Book is a good
place to start.

We provide a packaged book image. Simply issue the command:

.. code-block:: bash

    docker run -p 8888:8888 paddlepaddle/book

Then open the following address in your local browser:

.. code-block:: text

    http://localhost:8888/

That's all. Enjoy your journey!

Documentation
-------------

Paddle Docker images include an HTML version of C++ source code
generated using `woboq code browser
<https://github.com/woboq/woboq_codebrowser>`_.  This makes it easy
for users to browse and understand the C++ source code.

As long as we give the Paddle Docker container a name, we can run an
additional Nginx Docker container to serve the volume from the Paddle
container:

.. code-block:: bash

   docker run -d --name paddle-cpu-doc paddle:<version>
   docker run -d --volumes-from paddle-cpu-doc -p 8088:80 nginx


Then we can direct our Web browser to the HTML version of source code
at http://localhost:8088/paddle/
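The two commands above can be wrapped so that the version and the host
port become parameters. This is a sketch under the same assumptions as
those commands: the image name :code:`paddle:<version>` and the
container name :code:`paddle-cpu-doc` are taken from them, and
:code:`serve_paddle_docs` is a hypothetical name.

.. code-block:: bash

   # Start a named Paddle container and an Nginx container that serves
   # its volumes on the given host port.
   # Usage: serve_paddle_docs <version> [host-port]
   # (hypothetical helper, for illustration only)
   serve_paddle_docs() {
       local version=$1 port=${2:-8088}
       # The name lets the Nginx container refer to this container's volumes.
       docker run -d --name paddle-cpu-doc "paddle:${version}" || return 1
       docker run -d --volumes-from paddle-cpu-doc -p "${port}:80" nginx
   }

Calling :code:`serve_paddle_docs <version> 9000` would then make the
source browser available on port 9000 instead of 8088.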