From e19861c567b544b7b43ae2e0315371abe6d86fa0 Mon Sep 17 00:00:00 2001
From: Helin Wang
Date: Tue, 28 Mar 2017 12:06:15 -0700
Subject: [PATCH] refine doc for run paddle on docker

---
 .../build_and_install/docker_install_en.rst | 281 ++++++++++--------
 1 file changed, 159 insertions(+), 122 deletions(-)

diff --git a/doc/getstarted/build_and_install/docker_install_en.rst b/doc/getstarted/build_and_install/docker_install_en.rst
index f43e83d129..add666261f 100644
--- a/doc/getstarted/build_and_install/docker_install_en.rst
+++ b/doc/getstarted/build_and_install/docker_install_en.rst
@@ -8,199 +8,236 @@

Please be aware that you will need to change `Docker settings`_ to make
full use of your hardware resources on Mac OS X and Windows.

Working With Docker
-------------------

Here we describe the basic Docker concepts used in this tutorial; a
short command walk-through follows the list.

- A *container* is an environment for running applications.

- An *image* is an immutable snapshot of a Docker container. One can run
  a container based on an image with the command :code:`docker run
  docker_image_name`.

- By default, a Docker container has an isolated file system namespace,
  so it cannot see the files on the host file system. By using a
  *volume*, files mounted from the host become visible inside the
  container. The following command mounts the current directory into
  :code:`/data` inside a container started from the :code:`debian`
  image and runs :code:`ls /data` in it:

  .. code-block:: bash

     docker run --rm -v $(pwd):/data debian ls /data
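To make these concepts concrete, the following optional commands pull a
public image, list the images available locally, and show containers
that have already run. They use only standard Docker CLI commands and
nothing PaddlePaddle specific:

.. code-block:: bash

   docker pull debian      # download the debian image from a registry
   docker images           # list images stored on this machine
   docker ps --all         # list containers, including stopped ones

The same commands apply to the PaddlePaddle images used below.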
Usage of CPU-only and GPU Images
----------------------------------

For each version of PaddlePaddle, we release two types of Docker images:
the development image and the production image. The production image
comes in a CPU-only version, a CUDA GPU version, and their no-AVX
variants. We put the Docker images on `dockerhub.com`_; you can find the
latest versions under the "tags" tab at dockerhub.com.

1. Production images. This image might have multiple variants:

   - GPU/AVX: :code:`paddlepaddle/paddle:<version>-gpu`
   - GPU/no-AVX: :code:`paddlepaddle/paddle:<version>-gpu-noavx`
   - CPU/AVX: :code:`paddlepaddle/paddle:<version>`
   - CPU/no-AVX: :code:`paddlepaddle/paddle:<version>-noavx`

   Please be aware that both the CPU-only and the GPU images use the
   AVX instruction set, but old computers produced before 2008 do not
   support AVX. The following command checks whether your Linux
   computer supports AVX:

   .. code-block:: bash

      if cat /proc/cpuinfo | grep -i avx; then echo Yes; else echo No; fi

   If it does not, use the no-AVX images instead (see the pull example
   at the end of this section).

   To run the CPU-only image as an interactive container:

   .. code-block:: bash

      docker run -it --rm paddlepaddle/paddle:0.10.0rc2 /bin/bash

   The method above works with the GPU image too; the recommended way
   is to use `nvidia-docker <https://github.com/NVIDIA/nvidia-docker>`_.
   Please install nvidia-docker first, following its `tutorial`_.

   Now you can run a GPU image:

   .. code-block:: bash

      nvidia-docker run -it --rm paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash

2. Development image: :code:`paddlepaddle/paddle:<version>-dev`

   This image packs the related development tools and the runtime
   environment. Users and developers can use this image instead of
   their own local computer for development, building, releasing,
   document writing, and so on. Different versions of PaddlePaddle may
   depend on different versions of libraries and tools, so if you want
   to set up a local environment, you must pay attention to the
   versions. The development image contains:

   - gcc/clang
   - nvcc
   - Python
   - sphinx
   - woboq
   - sshd

   Many developers use servers with GPUs: they can SSH into the server
   and run :code:`docker exec` to enter the Docker container and start
   their work. They can also start a development container with the
   SSHD service, so that they can log in to the container and start
   working there.
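As a concrete illustration of picking the right production image, the
following sketch combines the AVX check above with :code:`docker pull`.
The :code:`0.10.0rc2` tag is simply the release used in the examples of
this document, so substitute the version you actually want:

.. code-block:: bash

   # pull the AVX or the no-AVX CPU-only image, depending on CPU support
   if grep -qi avx /proc/cpuinfo; then
       docker pull paddlepaddle/paddle:0.10.0rc2
   else
       docker pull paddlepaddle/paddle:0.10.0rc2-noavx
   fi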
Train Model Using Python API
----------------------------

Our official Docker images provide a runtime for PaddlePaddle programs.
The typical workflow is as follows.

Create a directory as the workspace:

.. code-block:: bash

   mkdir ~/workspace

Edit a PaddlePaddle Python program using your favourite editor:

.. code-block:: bash

   emacs ~/workspace/example.py

Run the program using Docker:

.. code-block:: bash

   docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 python /workspace/example.py

Or, if you are using a GPU for training:

.. code-block:: bash

   nvidia-docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu python /workspace/example.py

The commands above start a Docker container that runs :code:`python
/workspace/example.py`; the container stops once :code:`python
/workspace/example.py` finishes.

Another way is to tell Docker to start a :code:`/bin/bash` session and
run the PaddlePaddle program interactively:

.. code-block:: bash

   docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 /bin/bash
   # now we are inside the docker container
   cd /workspace
   python example.py

Running with a GPU is identical:

.. code-block:: bash

   nvidia-docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash
   # now we are inside the docker container
   cd /workspace
   python example.py
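Because :code:`~/workspace` is mounted as a volume, anything the program
writes under :code:`/workspace` stays on the host after the container
exits. For example, the following sketch (the :code:`train.log` file
name is only illustrative) captures the program's output on the host:

.. code-block:: bash

   docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 \
       /bin/bash -c "cd /workspace && python example.py > train.log 2>&1"
   cat ~/workspace/train.log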
Develop PaddlePaddle or Train Model Using C++ API
---------------------------------------------------

We will use the PaddlePaddle development image, since it contains all
the compilation tools and dependencies.

Let's clone the PaddlePaddle repo first:

.. code-block:: bash

   git clone https://github.com/PaddlePaddle/Paddle.git && cd Paddle

Mount both the workspace folder and the Paddle source folder into the
Docker container, so that we can access them inside it. There are two
ways of using the PaddlePaddle development image:

- Run an interactive bash session directly:

  .. code-block:: bash

     # use nvidia-docker instead of docker if you need to use GPU
     docker run -it -v ~/workspace:/workspace -v $(pwd):/paddle paddlepaddle/paddle:0.10.0rc2-dev /bin/bash
     # now we are inside the docker container

- Or run it as a daemon container:

  .. code-block:: bash

     # use nvidia-docker instead of docker if you need to use GPU
     docker run -d -p 2202:22 -p 8888:8888 -v ~/workspace:/workspace -v $(pwd):/paddle paddlepaddle/paddle:0.10.0rc2-dev /usr/sbin/sshd -D

  and SSH into this container using the password :code:`root`:

  .. code-block:: bash

     ssh -p 2202 root@localhost

  An advantage of this approach is that we can run the PaddlePaddle
  container on a remote server and SSH into it from a laptop.

When developing PaddlePaddle, you can edit the PaddlePaddle source code
from outside the Docker container using your favourite editor. To
compile PaddlePaddle, run the following inside the container:

.. code-block:: bash

   WITH_GPU=OFF WITH_AVX=ON WITH_TEST=ON bash /paddle/paddle/scripts/docker/build.sh

This builds everything about Paddle in :code:`/paddle/build`, and we can
run the unit tests there:

.. code-block:: bash

   cd /paddle/build
   ctest

When training a model using the C++ API, we can edit the Paddle program
in :code:`~/workspace` outside of Docker and build it from
:code:`/workspace` inside the container.
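During development it is often convenient to run only a subset of the
unit tests. The sketch below uses standard :code:`ctest` options; the
test name pattern is only an example, so replace it with the tests you
care about:

.. code-block:: bash

   cd /paddle/build
   ctest -R cross_entropy      # run only tests whose names match the pattern
   ctest --output-on-failure   # print the output of any failing test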
PaddlePaddle Book
------------------

The Jupyter Notebook is an open-source web application that allows you
to create and share documents that contain live code, equations,
visualizations, and explanatory text in a single browser.

PaddlePaddle Book is an interactive Jupyter Notebook for users and
developers. We already expose port 8888 for this book. If you want to
dig deeper into deep learning, PaddlePaddle Book is definitely your best
choice.

We provide a packaged book image; simply issue the command:

.. code-block:: bash

   docker run -p 8888:8888 paddlepaddle/book

Then, copy and paste the address into your local browser:

.. code-block:: text

   http://localhost:8888/

That's all. Enjoy your journey!