refine doc for run paddle on docker

e19861c5 · Helin Wang · 06b2e4d2 · e19861c5

Showing 159 additions and 122 deletions

doc/getstarted/build_and_install/docker_install_en.rst +159 -122

--- a/doc/getstarted/build_and_install/docker_install_en.rst
+++ b/doc/getstarted/build_and_install/docker_install_en.rst
@@ -8,200 +8,237 @@ Please be aware that you will need to change `Dockers settings
 <https://github.com/PaddlePaddle/Paddle/issues/627>`_ to make full use
 of your hardware resource on Mac OS X and Windows.

+Working With Docker
+-------------------

-Usage of CPU-only and GPU Images
----------------------------------
-
-For each version of PaddlePaddle, we release 2 types of Docker images: development
-image and production image. Production image includes CPU-only version and a CUDA
-GPU version and their no-AVX versions. We put the docker images on
-`dockerhub.com <https://hub.docker.com/r/paddledev/paddle/>`_. You can find the
-latest versions under "tags" tab at dockerhub.com.
-1. development image :code:`paddlepaddle/paddle:<version>-dev`
-
-    This image has packed related develop tools and runtime environment. Users and
-    developers can use this image instead of their own local computer to accomplish
-    development, build, releasing, document writing etc. While different version of
-    paddle may depends on different version of libraries and tools, if you want to
-    setup a local environment, you must pay attention to the versions.
-    The development image contains:
-    - gcc/clang
-    - nvcc
-    - Python
-    - sphinx
-    - woboq
-    - sshd
-    Many developers use servers with GPUs, they can use ssh to login to the server
-    and run :code:`docker exec` to enter the docker container and start their work.
-    Also they can start a development docker image with SSHD service, so they can login to
-    the container and start work.
+Here we will describe the basic docker concepts that we will be using
+in this tutorial.

-    To run the CPU-only image as an interactive container:
-
-    .. code-block:: bash
+- *container* is an environment for running applications

-        docker run -it --rm paddledev/paddle:<version> /bin/bash
+- *image* is an immutable snapshot of a docker container. One can run
+  a container based on a docker image by using command :code:`docker
+  run docker_image_name`.

-    or, we can run it as a daemon container
+- By default a docker container has an isolated file system namespace,
+  so it cannot see files in the host file system. With a *volume*,
+  files mounted from the host become visible inside the container.
+  The following command mounts the current directory at /data inside
+  a container started from the debian image and runs the command
+  :code:`ls /data` in it.

  .. code-block:: bash

-        docker run -d -p 2202:22 -p 8888:8888 paddledev/paddle:<version>
+     docker run --rm -v $(pwd):/data debian ls /data

-    and SSH to this container using password :code:`root`:
-
-    .. code-block:: bash
-
-        ssh -p 2202 root@localhost
+Usage of CPU-only and GPU Images
+----------------------------------

-    An advantage of using SSH is that we can connect to PaddlePaddle from
-    more than one terminals.  For example, one terminal running vi and
-    another one running Python interpreter.  Another advantage is that we
-    can run the PaddlePaddle container on a remote server and SSH to it
-    from a laptop.
+For each version of PaddlePaddle, we release 2 types of Docker images:
+development image and production image. Production image includes
+CPU-only version and a CUDA GPU version and their no-AVX versions. We
+put the docker images on `dockerhub.com
+<https://hub.docker.com/r/paddledev/paddle/>`_. You can find the
+latest versions under "tags" tab at dockerhub.com.

+1. Production images, this image might have multiple variants:

-2. Production images, this image might have multiple variants:
   - GPU/AVX：:code:`paddlepaddle/paddle:<version>-gpu`
   - GPU/no-AVX：:code:`paddlepaddle/paddle:<version>-gpu-noavx`
   - CPU/AVX：:code:`paddlepaddle/paddle:<version>`
   - CPU/no-AVX：:code:`paddlepaddle/paddle:<version>-noavx`

-    Please be aware that the CPU-only and the GPU images both use the AVX
-    instruction set, but old computers produced before 2008 do not support
-    AVX.  The following command checks if your Linux computer supports
-    AVX:
+   Please be aware that the CPU-only and the GPU images both use the
+   AVX instruction set, but old computers produced before 2008 do not
+   support AVX.  The following command checks if your Linux computer
+   supports AVX:

   .. code-block:: bash

      if cat /proc/cpuinfo | grep -i avx; then echo Yes; else echo No; fi

   
-       If it doesn't, we will use the non-AVX images.
-
-    Above methods work with the GPU image too -- just please don't forget
-    to install GPU driver. To support GPU driver, we recommend to use 
-    [nvidia-docker](https://github.com/NVIDIA/nvidia-docker). Run using
+   To run the CPU-only image as an interactive container:

   .. code-block:: bash

-        nvidia-docker run -it --rm paddledev/paddle:0.10.0rc1-gpu /bin/bash
+      docker run -it --rm paddlepaddle/paddle:0.10.0rc2 /bin/bash
+
+   The above method works with the GPU image too; the recommended way
+   is to use `nvidia-docker <https://github.com/NVIDIA/nvidia-docker>`_.
+
+   Please install nvidia-docker first following this `tutorial
+   <https://github.com/NVIDIA/nvidia-docker#quick-start>`_.

-    Note: If you would have a problem running nvidia-docker, you may try the old method we have used (not recommended).
+   Now you can run a GPU image:

   .. code-block:: bash

-        export CUDA_SO="$(\ls /usr/lib64/libcuda* | xargs -I{} echo '-v {}:{}') $(\ls /usr/lib64/libnvidia* | xargs -I{} echo '-v {}:{}')"
-        export DEVICES=$(\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')
-        docker run ${CUDA_SO} ${DEVICES} -it paddledev/paddle:<version>-gpu
+      nvidia-docker run -it --rm paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash

+2. Development image :code:`paddlepaddle/paddle:<version>-dev`

-3. Use production image to release you AI application
-    Suppose that we have a simple application program in :code:`a.py`, we can test and run it using the production image:
+   This image packs the related development tools and runtime
+   environment. Users and developers can use this image instead of
+   their own local computer for development, building, releasing,
+   document writing, etc. Different versions of paddle may depend on
+   different versions of libraries and tools, so if you want to set
+   up a local environment, you must pay attention to the versions.
+   The development image contains:
   
-    ```bash
-    docker run -it -v $PWD:/work paddle /work/a.py
-    ```
+   - gcc/clang
+   - nvcc
+   - Python
+   - sphinx
+   - woboq
+   - sshd
     
-    But this works only if all dependencies of :code:`a.py` are in the production image. If this is not the case, we need to build a new Docker image from the production image and with more dependencies installs.
+   Many developers use servers with GPUs. They can use ssh to log in
+   to the server and run :code:`docker exec` to enter the docker
+   container and start their work. They can also start a development
+   docker container with the SSHD service, so they can log in to the
+   container and start work.
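
As an illustration of the four production image variants and the AVX check described above, the following is a minimal sketch of a shell helper that picks a tag. The function name ``paddle_image_tag`` and its arguments are hypothetical and not part of PaddlePaddle; it assumes a Linux-style cpuinfo file and falls back to the no-AVX variant when that file cannot be read.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with PaddlePaddle):
#   paddle_image_tag VERSION [cpu|gpu] [CPUINFO_FILE]
# Prints one of the four production image tags listed in the doc.
paddle_image_tag() {
    version="$1"
    device="${2:-cpu}"
    cpuinfo="${3:-/proc/cpuinfo}"

    tag="paddlepaddle/paddle:${version}"
    if [ "$device" = "gpu" ]; then
        tag="${tag}-gpu"
    fi
    # Same idea as the doc's one-liner: look for "avx" in the CPU flags.
    # An unreadable or missing file is treated as "no AVX" (an assumption).
    if ! grep -qi avx "$cpuinfo" 2>/dev/null; then
        tag="${tag}-noavx"
    fi
    echo "$tag"
}
```

For example, ``paddle_image_tag 0.10.0rc2 gpu`` prints the tag to pass to ``nvidia-docker run``. On Mac OS X or Windows there is no ``/proc/cpuinfo`` on the host, so the sketch would choose the no-AVX variant there.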


-PaddlePaddle Book
------------------
+Train Model Using Python API
+----------------------------

-The Jupyter Notebook is an open-source web application that allows
-you to create and share documents that contain live code, equations,
-visualizations and explanatory text in a single browser.
+Our official docker image provides a runtime for PaddlePaddle
+programs. The typical workflow will be as follows:

-PaddlePaddle Book is an interactive Jupyter Notebook for users and developers.
-We already exposed port 8888 for this book. If you want to
-dig deeper into deep learning, PaddlePaddle Book definitely is your best choice.
+Create a directory as workspace:

-We provide a packaged book image, simply issue the command:
+.. code-block:: bash
+
+   mkdir ~/workspace
+
+Edit a PaddlePaddle python program using your favourite editor:

 .. code-block:: bash

-    docker run -p 8888:8888 paddlepaddle/book
+   emacs ~/workspace/example.py

-Then, you would back and paste the address into the local browser:
+Run the program using docker:

-.. code-block:: text
+.. code-block:: bash

-    http://localhost:8888/
+   docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 python /workspace/example.py

-That's all. Enjoy your journey!
+Or if you are using GPU for training:

-Development Using Docker
------------------------
+.. code-block:: bash

-Developers can work on PaddlePaddle using Docker.  This allows
-developers to work on different platforms -- Linux, Mac OS X, and
-Windows -- in a consistent way.
+   nvidia-docker run -it --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu python /workspace/example.py

-1. Build the Development Docker Image
+The above commands start a docker container that runs :code:`python
+/workspace/example.py`. The container stops once :code:`python
+/workspace/example.py` finishes.

-   .. code-block:: bash
+Another way is to tell docker to start a :code:`/bin/bash` session and
+run the PaddlePaddle program interactively:

-      git clone --recursive https://github.com/PaddlePaddle/Paddle
-      cd Paddle
-      docker build -t paddle:dev .
+.. code-block:: bash

-   Note that by default :code:`docker build` wouldn't import source
-   tree into the image and build it.  If we want to do that, we need docker the
-   development docker image and then run the following command:
+   docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 /bin/bash
+   # now we are inside docker container
+   cd /workspace
+   python example.py

-   .. code-block:: bash
+Running with GPU is identical:
+
+.. code-block:: bash

-      docker run -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=ON" -e "TEST=OFF" paddle:dev
+   nvidia-docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash
+   # now we are inside docker container
+   cd /workspace
+   python example.py
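
The CPU and GPU invocations above differ only in the runner and the image tag, which the following sketch makes explicit. The function name ``paddle_run_cmd`` is hypothetical; it only prints the command (it does not run docker), and it assumes the workspace lives at ``$HOME/workspace`` as in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: paddle_run_cmd [cpu|gpu] [SCRIPT_NAME]
# Prints the docker command from the doc for the chosen device.
paddle_run_cmd() {
    device="${1:-cpu}"
    script="${2:-example.py}"
    if [ "$device" = "gpu" ]; then
        # GPU training goes through nvidia-docker and the -gpu image.
        runner="nvidia-docker"
        image="paddlepaddle/paddle:0.10.0rc2-gpu"
    else
        runner="docker"
        image="paddlepaddle/paddle:0.10.0rc2"
    fi
    echo "$runner run -it --rm -v $HOME/workspace:/workspace $image python /workspace/$script"
}
```

Printing the command rather than executing it keeps the sketch safe to try on a machine without Docker installed; the output can be copied into a terminal once it looks right.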


-2. Run the Development Environment
+Develop PaddlePaddle or Train Model Using C++ API
+---------------------------------------------------

-   Once we got the image :code:`paddle:dev`, we can use it to develop
-   Paddle by mounting the local source code tree into a container that
-   runs the image:
+We will be using the PaddlePaddle development image since it contains
+all compiling tools and dependencies.

-   .. code-block:: bash
+Let's clone the PaddlePaddle repo first:
+
+.. code-block:: bash

-      docker run -d -p 2202:22 -p 8888:8888 -v $PWD:/paddle paddle:dev sshd
+   git clone https://github.com/PaddlePaddle/Paddle.git && cd Paddle

-   This runs a container of the development environment Docker image
-   with the local source tree mounted to :code:`/paddle` of the
-   container.
+Mount both the workspace folder and the paddle code folder into the
+docker container, so we can access them inside it. There are two ways
+of using the PaddlePaddle development docker image:

-   The above :code:`docker run` commands actually starts
-   an SSHD server listening on port 2202.  This allows us to log into
-   this container with:
+- run interactive bash directly

  .. code-block:: bash

-      ssh root@localhost -p 2202
+     # use nvidia-docker instead of docker if you need to use GPU
+     docker run -it -v ~/workspace:/workspace -v $(pwd):/paddle paddlepaddle/paddle:0.10.0rc2-dev /bin/bash
+     # now we are inside docker container

-   Usually, I run above commands on my Mac.  I can also run them on a
-   GPU server :code:`xxx.yyy.zzz.www` and ssh from my Mac to it:
+- or, we can run it as a daemon container

  .. code-block:: bash

-      my-mac$ ssh root@xxx.yyy.zzz.www -p 2202
-
-3. Build and Install Using the Development Environment
+     # use nvidia-docker instead of docker if you need to use GPU
+     docker run -d -p 2202:22 -p 8888:8888 -v ~/workspace:/workspace -v $(pwd):/paddle paddlepaddle/paddle:0.10.0rc2-dev /usr/sbin/sshd -D

-   Once I am in the container, I can use
-   :code:`paddle/scripts/docker/build.sh` to build, install, and test
-   Paddle:
+  and SSH to this container using password :code:`root`:

  .. code-block:: bash

-      /paddle/paddle/scripts/docker/build.sh
+     ssh -p 2202 root@localhost

-   This builds everything about Paddle in :code:`/paddle/build`.  And
-   we can run unit tests there:
+  An advantage is that we can run the PaddlePaddle container on a
+  remote server and SSH to it from a laptop.

-   .. code-block:: bash
+When developing PaddlePaddle, you can edit the PaddlePaddle source code
+from outside the docker container using your favorite editor. To
+compile PaddlePaddle, run inside the container:
+
+.. code-block:: bash
+
+   WITH_GPU=OFF WITH_AVX=ON WITH_TEST=ON bash /paddle/paddle/scripts/docker/build.sh
+
+This builds everything about Paddle in :code:`/paddle/build`.  And we
+can run unit tests there:
+
+.. code-block:: bash

   cd /paddle/build
   ctest

+When training a model using the C++ API, we can edit the paddle
+program in ~/workspace outside of docker, and build it from
+/workspace inside of docker.
+
+PaddlePaddle Book
+------------------
+
+The Jupyter Notebook is an open-source web application that allows
+you to create and share documents that contain live code, equations,
+visualizations and explanatory text in a single browser.
+
+PaddlePaddle Book is an interactive Jupyter Notebook for users and developers.
+We already exposed port 8888 for this book. If you want to
+dig deeper into deep learning, PaddlePaddle Book definitely is your best choice.
+
+We provide a packaged book image, simply issue the command:
+
+.. code-block:: bash
+
+    docker run -p 8888:8888 paddlepaddle/book
+
+Then go back and paste the address into the local browser:
+
+.. code-block:: text
+
+    http://localhost:8888/
+
+That's all. Enjoy your journey!
+

 Documentation
 -------------