# Building PaddlePaddle

## Goals

We want the build procedure to generate Docker images so that we can run PaddlePaddle applications on Kubernetes clusters.

We want it to generate .deb packages so that enterprises without Docker support can run PaddlePaddle applications as well.

We want to minimize the size of the generated Docker images and .deb packages to reduce deployment cost.

We want to encapsulate build tools and dependencies in a *development* Docker image so that developers do not have to install the tools themselves.

We want developers to be able to use whatever editing tools they like (emacs, vim, Eclipse, Jupyter Notebook), so the development Docker image contains only build tools, not editing tools. Developers are expected to git clone the source code onto their development computers, not into the container running the development Docker image.

We want the procedure and tools to work for testing, continuous integration, and releasing as well.


## Docker Images

We want two Docker images for each version of PaddlePaddle:

1. `paddle:<version>-dev`

   This is the development image; it contains only the development tools.  This standardizes the build tools and procedure.  Users include:

   - developers -- no longer need to install development tools on the host, and can build their current work on the host (development computer).
   - release engineers -- use this to build the official release from a certain branch/tag on GitHub.
   - document writers / Website developers -- Our documents are in the source repo in the form of .md/.rst files and comments in source code.  We need tools to extract the information, typeset, and generate Web pages.

   Of course, developers can install build tools on their development computers.  But different versions of PaddlePaddle might require different sets or versions of build tools.  Also, collaborative debugging is easier if all developers use a unified development environment.

   The development image should include the following tools:

   - gcc/clang
   - nvcc
   - Python
   - sphinx
   - woboq
   - sshd

   Here, `sshd` makes it easy for developers to open multiple terminals into the container.  `docker exec` works too, but if the container is running on a remote machine, it is easier to ssh directly into the container than to ssh to the machine and then run `docker exec`.

1. `paddle:<version>`

   This is the production image, generated using the development image. This image might have multiple variants:

   - GPU/AVX   `paddle:<version>-gpu`
   - GPU/no-AVX  `paddle:<version>-gpu-noavx`
   - no-GPU/AVX  `paddle:<version>`
   - no-GPU/no-AVX  `paddle:<version>-noavx`

   We'd like to give users the choice between GPU and no-GPU, because the GPU image is much larger than the no-GPU image.

   We'd like to give users the choice between AVX and no-AVX, because some cloud providers don't provide AVX-enabled VMs.
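
The naming scheme for the production variants can be sketched as a small shell function. This is only an illustration of the tag list above: the function name `paddle_image_tag` and the version string `0.10.0` are hypothetical and do not come from the PaddlePaddle repo.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the PaddlePaddle repo): map a version and
# the GPU/AVX switches to the production image tag described above.
paddle_image_tag() {
  local version="$1"
  local gpu="${2:-OFF}"   # ON selects the GPU variant
  local avx="${3:-ON}"    # anything other than ON selects the no-AVX variant
  local tag="paddle:${version}"
  if [ "$gpu" = "ON" ]; then tag="${tag}-gpu"; fi
  if [ "$avx" != "ON" ]; then tag="${tag}-noavx"; fi
  printf '%s\n' "$tag"
}

# "0.10.0" is a placeholder version used only for illustration.
paddle_image_tag "0.10.0" OFF ON   # prints paddle:0.10.0
paddle_image_tag "0.10.0" ON OFF   # prints paddle:0.10.0-gpu-noavx
```

The defaults (no GPU, AVX on) mirror the untagged `paddle:<version>` variant in the list.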


## Development Environment

Here we describe how to use the two images above.  We start by considering our daily development environment.

Developers work on a computer, which is usually a laptop or desktop:

![](doc/paddle-development-environment.png)

or they might rely on a more powerful machine (for example, one with GPUs):

![](doc/paddle-development-environment-gpu.png)

A basic principle is that source code lies on the development computer (host), so that editing tools like Eclipse can parse the source code and support auto-completion.

## Usage

### Build the Development Docker Image

The following commands check out the source code on the development computer (host) and build the development image `paddle:dev`:

```bash
git clone https://github.com/PaddlePaddle/Paddle paddle
cd paddle
docker build -t paddle:dev .
```

The `docker build` command assumes that the `Dockerfile` is in the root of the source tree.  This is reasonable because, in this design, this Dockerfile is the only one in our repo.

### Build PaddlePaddle from Source Code

Given the development image `paddle:dev`, the following command builds PaddlePaddle from the source tree on the development computer (host):

```bash
docker run -v $PWD:/paddle -e "GPU=OFF" -e "AVX=ON" -e "TEST=ON" paddle:dev
```

This command mounts the source directory on the host at `/paddle` in the container, so the default entrypoint of `paddle:dev`, `build.sh`, builds the source code including any local changes.  When it writes to `/paddle/build` in the container, it actually writes to `$PWD/build` on the host.
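
Because the build is controlled by the `GPU` and `AVX` environment variables, one build command per variant can be generated with a small loop. The following is a dry-run sketch that only prints the commands. The function name `print_variant_builds` is hypothetical, and it assumes that `build.sh` accepts `ON`/`OFF` for both variables as in the command above.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print (do not run) one build command per GPU/AVX combination.
# GPU, AVX, and TEST are the environment variables from the command above;
# the loop itself is an illustration and not part of the repo.
print_variant_builds() {
  local gpu avx
  for gpu in ON OFF; do
    for avx in ON OFF; do
      # \$PWD is escaped so the printed command expands it when it is run.
      echo "docker run -v \$PWD:/paddle -e GPU=${gpu} -e AVX=${avx} -e TEST=ON paddle:dev"
    done
  done
}

print_variant_builds
```

Each real run writes to the same `$PWD/build` directory, so the outputs of one variant would need to be saved before building the next.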

`build.sh` builds the following:

- PaddlePaddle binaries,
- `$PWD/build/paddle-<version>.deb` for production installation, and
- `$PWD/build/Dockerfile`, which builds the production Docker image.
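
Before building the production image, it can be useful to confirm that the artifacts listed above were actually produced. The check below is a minimal sketch under the assumption that the file names match the list exactly. The function `check_build_outputs` is hypothetical and is not part of `build.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical check (not part of build.sh): verify that a build directory
# contains the production Dockerfile and a paddle-<version>.deb package.
check_build_outputs() {
  local build_dir="${1:-build}"
  local missing=0
  if [ ! -f "${build_dir}/Dockerfile" ]; then
    echo "missing: ${build_dir}/Dockerfile" >&2
    missing=1
  fi
  # The .deb name embeds the version, so match it with a glob. If nothing
  # matches, the pattern stays literal and the -f test fails.
  set -- "${build_dir}"/paddle-*.deb
  if [ ! -f "$1" ]; then
    echo "missing: ${build_dir}/paddle-<version>.deb" >&2
    missing=1
  fi
  return "$missing"
}
```

Run from the source root as `check_build_outputs build`; it returns a non-zero status and names the missing file if either artifact is absent.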


### Build the Production Docker Image

The following command builds the production image:

```bash
docker build -t paddle -f build/Dockerfile .
```

This production image is minimal -- it includes the `paddle` binary, the shared library `libpaddle.so`, and the Python runtime.

### Run PaddlePaddle Applications

Again, development happens on the host.  Suppose that we have a simple application program in `a.py`; we can test and run it using the production image:

```bash
docker run -it -v $PWD:/work paddle /work/a.py
```

But this works only if all dependencies of `a.py` are in the production image. If this is not the case, we need to build a new Docker image from the production image with the additional dependencies installed.

### Build and Run PaddlePaddle Applications

We need a Dockerfile in https://github.com/paddlepaddle/book that builds the Docker image `paddlepaddle/book:<version>`, based on the PaddlePaddle production image:

```
FROM paddlepaddle/paddle:<version>
RUN pip install -U matplotlib jupyter ...
COPY . /book
EXPOSE 8080
CMD ["jupyter"]
```

The book image is an example of a PaddlePaddle application image.  We can build it as follows:

```bash
git clone https://github.com/paddlepaddle/book
cd book
docker build -t book .
```

### Build and Run Distributed Applications

In our [API design doc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/api.md#distributed-training), we proposed an API that starts a distributed training job on a cluster.  This API needs to build a PaddlePaddle application into a Docker image as above and call kubectl to run it on the cluster.  It might need to generate a Dockerfile like the one above and call `docker build`.

Of course, we can manually build an application image and launch the job using the kubectl tool:

```bash
docker build -f some/Dockerfile -t myapp .
docker tag myapp me/myapp
docker push me/myapp
kubectl ...
```