# Building PaddlePaddle

## Goals

We want the build procedure to:

1. Be static and easy to reproduce.
1. Generate Python `whl` packages that can be used widely across many distributions.
1. Build different binaries per release to satisfy different environments:
    - Binaries for different CUDA and cuDNN versions, such as CUDA 7.5, 8.0, and 9.0.
    - Binaries containing only capi.
    - Binaries for Python with or without wide Unicode support.
1. Build Docker images with PaddlePaddle pre-installed, so that we can run
PaddlePaddle applications directly in Docker or on Kubernetes clusters.

To achieve this, we maintain a Docker Hub repository, https://hub.docker.com/r/paddlepaddle/paddle,
which provides pre-built environment images for building PaddlePaddle and generating the corresponding `whl`
binaries. (**We strongly recommend building PaddlePaddle in our pre-specified Docker environment.**)

## Development Workflow

This section describes the development workflow, starting with the daily development environment.

Developers work on a computer, which is usually a laptop or desktop:

<img src="doc/paddle-development-environment.png" width=500 />

Or they might rely on a more powerful machine, such as one with GPUs:

<img src="doc/paddle-development-environment-gpu.png" width=500 />

A principle here is that source code lies on the development computer (host) so that editors like Eclipse can parse the source code to support auto-completion.

## Build With Docker

### Build Environments

The latest pre-built build environment images are:

| Image | Tag |
| ----- | --- |
| paddlepaddle/paddle | latest-dev |

### Start Build

```bash
git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle
./paddle/scripts/paddle_docker_build.sh build
```

After the build finishes, you can find the output `whl` package under
`build/python/dist`.
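
To pick up the package from a script, something like the following could be used. It is a minimal sketch: `latest_wheel` is a hypothetical helper, not part of the PaddlePaddle scripts, and it assumes only the output directory named above.

```bash
# Hypothetical helper: print the newest .whl file in a directory
# (default: build/python/dist). Returns non-zero if none is found.
latest_wheel() {
  local dir="${1:-build/python/dist}" wheel
  # ls -t lists newest first; errors are silenced when nothing matches.
  wheel="$(ls -t "$dir"/*.whl 2>/dev/null | head -n 1)"
  [ -n "$wheel" ] || return 1
  printf '%s\n' "$wheel"
}

# Report the wheel only if the build actually produced one.
if wheel="$(latest_wheel)"; then
  echo "found: $wheel"   # next step would be: pip install "$wheel"
else
  echo "no wheel found under build/python/dist; did the build finish?"
fi
```

The `if` guard keeps the script from trying to install a package that does not exist when the build failed or has not run yet.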

The `paddle_docker_build.sh build` command downloads the most recent dev image from Docker Hub, starts a container in the background, and then runs the build script `/paddle/paddle/scripts/paddle_build.sh build` in the container.
The container mounts the source directory on the host into `/paddle`.
When the build writes to `/paddle/build` in the container, it is actually writing to `$PWD/build` on the host.
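
The mount relationship can be illustrated with a small path-mapping helper. This is only a sketch: `container_to_host` is a hypothetical function written for this explanation, and it assumes the host source directory is the one mounted at `/paddle`.

```bash
# Map a path inside the build container to the matching path on the host,
# assuming the host source directory (first argument) is mounted at /paddle.
container_to_host() {
  local host_src="$1" container_path="$2"
  case "$container_path" in
    /paddle)   printf '%s\n' "$host_src" ;;
    /paddle/*) printf '%s\n' "$host_src/${container_path#/paddle/}" ;;
    *)         echo "not under the /paddle mount: $container_path" >&2; return 1 ;;
  esac
}

# Prints the host directory that receives the wheel built in the container.
container_to_host "$PWD" /paddle/build/python/dist
```

Paths outside `/paddle` are rejected because files written there stay inside the container and are lost when it exits.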

### Build Options

Users can specify the following Docker build arguments. Each takes either "ON" or "OFF", except `PYTHON_ABI`, which takes an ABI tag:

| Option | Default | Description |
| ------ | -------- | ----------- |
| `WITH_GPU` | OFF | Generates NVIDIA CUDA GPU code and relies on CUDA libraries. |
| `WITH_AVX` | OFF | Set to "ON" to enable AVX support. |
| `WITH_TESTING` | OFF | Build unit test binaries. |
| `WITH_MKL` | ON | Build with [Intel® MKL](https://software.intel.com/en-us/mkl) and [Intel® MKL-DNN](https://github.com/01org/mkl-dnn) support. |
| `WITH_PYTHON` | ON | Build with Python support. Turn this off if the build is only for capi. |
| `WITH_STYLE_CHECK` | ON | Check the code style when building. |
| `PYTHON_ABI` | "" | Build for a specific Python ABI; can be `cp27-cp27m` or `cp27-cp27mu`. |
| `RUN_TEST` | OFF | Run unit tests immediately after the build. |
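
One plausible way to hand these options to the build is as environment variables on the container. The sketch below only prints a command and does not run Docker. The `paddle_build_cmd` helper is hypothetical, and forwarding options with `-e` is an assumption that should be checked against `paddle_docker_build.sh` before relying on it.

```bash
# Print (without running) a docker command that forwards build options
# to the container as environment variables. Forwarding via `-e` is an
# assumption made for illustration, not confirmed behavior of the scripts.
paddle_build_cmd() {
  local env_args="" opt
  for opt in "$@"; do
    case "$opt" in
      PYTHON_ABI=*|*=ON|*=OFF) env_args="$env_args -e $opt" ;;
      *) echo "expected NAME=ON or NAME=OFF, got: $opt" >&2; return 1 ;;
    esac
  done
  echo "docker run --rm -v \$PWD:/paddle$env_args paddlepaddle/paddle:latest-dev /paddle/paddle/scripts/paddle_build.sh build"
}

# Example: a GPU build that also builds the unit tests for the cp27mu ABI.
paddle_build_cmd WITH_GPU=ON WITH_TESTING=ON PYTHON_ABI=cp27-cp27mu
```

Validating the values up front catches typos such as `WITH_GPU=yes` before a long build starts.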

## Docker Images

You can get the latest PaddlePaddle Docker images by running
`docker pull paddlepaddle/paddle:<version>`, or build one yourself.

### Official Docker Releases

Official Docker images are available
[here](https://hub.docker.com/r/paddlepaddle/paddle/tags/).
You can choose either `latest` or an image with a release tag such as `0.10.0`.
Currently available tags are:

|   Tag  | Description |
| ------ | --------------------- |
| latest | latest CPU only image |
| latest-gpu | latest binary with GPU support |
| 0.10.0 | release 0.10.0 CPU only binary image |
| 0.10.0-gpu | release 0.10.0 with GPU support |
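
The naming pattern in this table (an optional `-gpu` suffix on `latest` or a release number) can be captured in a few lines of shell. This is a sketch for scripting convenience; `paddle_image` is a hypothetical helper, and it assumes future tags keep the same pattern.

```bash
# Compose an image reference from a version ("latest" or a release such
# as 0.10.0) and an optional "gpu" variant, following the tag table.
paddle_image() {
  local version="${1:-latest}" variant="${2:-cpu}"
  if [ "$variant" = "gpu" ]; then
    echo "paddlepaddle/paddle:${version}-gpu"
  else
    echo "paddlepaddle/paddle:${version}"
  fi
}

paddle_image             # paddlepaddle/paddle:latest
paddle_image 0.10.0 gpu  # paddlepaddle/paddle:0.10.0-gpu
```

The result can be passed straight to `docker pull`, for example `docker pull "$(paddle_image 0.10.0 gpu)"`.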

### Build Your Own Image

Building PaddlePaddle Docker images is quite simple, since PaddlePaddle can
be installed by just running `pip install`. A sample `Dockerfile` is:

```dockerfile
FROM nvidia/cuda:7.5-cudnn5-runtime-centos6
RUN yum install -y centos-release-SCL
RUN yum install -y python27
# This whl package is generated by previous build steps.
ADD python/dist/paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl /
RUN pip install /paddlepaddle-0.10.0-cp27-cp27mu-linux_x86_64.whl && rm -f /*.whl
```

Then build the image by running `docker build -t [REPO]/paddle:[TAG] .` in
the directory containing your own `Dockerfile`.

We also release a script and Dockerfile for building PaddlePaddle Docker images
across different CUDA versions. To build these Docker images, run:

```bash
bash ./build_docker_images.sh
docker build -t [REPO]/paddle:[TAG] -f [generated_docker_file] .
```

- NOTE: You can choose a different base image for your environment. All available versions are listed [here](https://hub.docker.com/r/nvidia/cuda/).

### Use Docker Images

Suppose that you have written an application program `train.py` using
PaddlePaddle. You can test and run it using Docker:

```bash
docker run --rm -it -v $PWD:/work paddlepaddle/paddle /work/train.py
```

But this works only if all dependencies of `train.py` are in the production image. If this is not the case, we need to build a new Docker image from the production image with the additional dependencies installed.
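
A derived image of that kind needs only a couple of Dockerfile lines. The sketch below generates them from a list of package names. `app_dockerfile` is a hypothetical helper, and `scipy` and `pandas` are illustrative stand-ins rather than actual requirements of `train.py`.

```bash
# Print a Dockerfile that extends the production image with extra
# Python packages given as arguments (illustrative names only).
app_dockerfile() {
  echo "FROM paddlepaddle/paddle"
  if [ "$#" -gt 0 ]; then
    echo "RUN pip install $*"
  fi
}

# Show the generated Dockerfile for two example dependencies.
app_dockerfile scipy pandas
```

Redirect the output to a file (for example `app_dockerfile scipy pandas > Dockerfile.app`), build it with `docker build -t myapp -f Dockerfile.app .`, and then run `train.py` as before with `myapp` in place of `paddlepaddle/paddle`.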

### Run PaddlePaddle Book In Docker

Our [book repo](https://github.com/paddlepaddle/book) also provides a Docker
image that starts a Jupyter notebook inside Docker, so that you can run the book
using Docker:

```bash
docker run -d -p 8888:8888 paddlepaddle/book
```

Please refer to https://github.com/paddlepaddle/book if you want to build this
Docker image yourself.

### Run Distributed Applications

In our [API design doc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/api.md#distributed-training), we proposed an API that starts a distributed training job on a cluster. This API needs to build a PaddlePaddle application into a Docker image as above and call kubectl to run it on the cluster. To do so, it might need to generate a Dockerfile like the one above and call `docker build`.

Of course, we can manually build an application image and launch the job using the kubectl tool:

```bash
docker build -f some/Dockerfile -t myapp .
docker tag myapp me/myapp
docker push me/myapp
kubectl ...
```


## More Options

### Build Without Docker

Follow the *Dockerfile* in the PaddlePaddle repo to set up your local dev environment, then run:

```bash
./paddle/scripts/paddle_build.sh build
```

### Additional Tasks

You can get the help menu for the build scripts by running them with no options:

```bash
./paddle/scripts/paddle_build.sh
# or
./paddle/scripts/paddle_docker_build.sh
```