Merge pull request #67 from tink2123/new_dev

copy Paddle-develop content to develop

c176d8d7 · Shan Yi · GitHub · bb5cda68 · edea80e0 · bb5cda68
19 changed file
--- a/doc/fluid/advanced_usage/deploy/build_and_install_lib_cn.rst
+++ b/doc/fluid/advanced_usage/deploy/build_and_install_lib_cn.rst
-.. _install_or_build_cpp_inference_lib:
-
-Install and Build the C++ Inference Library
-=============================================
-
-Download and Install Directly
--------------------------------
-
-======================   ========================================
-Version                  C++ Inference Library
-======================   ========================================
-cpu_avx_mkl              `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxCp27cp27mu/.lastSuccessful/fluid.tgz>`_ 
-cpu_avx_openblas         `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxOpenblas/.lastSuccessful/fluid.tgz>`_
-cpu_noavx_openblas       `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuNoavxOpenblas/.lastSuccessful/fluid.tgz>`_
-cuda7.5_cudnn5_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda75cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz>`_
-cuda8.0_cudnn5_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda80cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz>`_
-cuda8.0_cudnn7_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda8cudnn7cp27cp27mu/.lastSuccessful/fluid.tgz>`_
-cuda9.0_cudnn7_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda90cudnn7avxMkl/.lastSuccessful/fluid.tgz>`_
-======================   ========================================
-
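-Build from Source
-------------------
-Users can also build the C++ inference library from the PaddlePaddle core source code by setting the following options at build time: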
-
-=================   =========
-Option              Value
-=================   =========
-CMAKE_BUILD_TYPE    Release
-FLUID_INSTALL_DIR   installation path
-WITH_FLUID_ONLY     ON (recommended)
-WITH_SWIG_PY        OFF (recommended)
-WITH_PYTHON         OFF (recommended)
-WITH_GPU            ON/OFF
-WITH_MKL            ON/OFF
-=================   =========
-
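-Using the recommended values is advised, because it avoids linking unnecessary libraries. Set the other optional build options as needed.
-
-The following snippet pulls the latest code from GitHub and configures the build options (replace PADDLE_ROOT with the installation path of the PaddlePaddle inference library):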
-
-  .. code-block:: bash
-
-     pip install paddlepaddle-gpu
-     PADDLE_ROOT=/path/of/capi
-     git clone https://github.com/PaddlePaddle/Paddle.git
-     cd Paddle
-     mkdir build
-     cd build
-     cmake -DFLUID_INSTALL_DIR=$PADDLE_ROOT \
-           -DCMAKE_BUILD_TYPE=Release \
-           -DWITH_FLUID_ONLY=ON \
-           -DWITH_SWIG_PY=OFF \
-           -DWITH_PYTHON=OFF \
-           -DWITH_MKL=OFF \
-           -DWITH_GPU=OFF  \
-           ..
-     make
-     make inference_lib_dist
-
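-After a successful build, everything needed to use the C++ inference library is stored under the PADDLE_ROOT directory: (1) the compiled PaddlePaddle inference library and headers, (2) third-party libraries and headers, and (3) version and build-option information. The directory layout is as follows: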
-
-  .. code-block:: text
-
-     PaddleRoot/
-     ├── CMakeCache.txt
-     ├── paddle
-     │   └── fluid
-     │       ├── framework
-     │       ├── inference
-     │       ├── memory
-     │       ├── platform
-     │       ├── pybind
-     │       └── string
-     ├── third_party
-     │   ├── boost
-     │   │   └── boost
-     │   ├── eigen3
-     │   │   ├── Eigen
-     │   │   └── unsupported
-     │   └── install
-     │       ├── gflags
-     │       ├── glog
-     │       ├── mklml
-     │       ├── protobuf
-     │       ├── snappy
-     │       ├── snappystream
-     │       └── zlib
-     └── version.txt
-     
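-version.txt records the version information of the inference library, including the Git commit ID, the math library used (OpenBLAS or MKL), and the CUDA/cuDNN version numbers. For example: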
-
-  .. code-block:: text
-
-     GIT COMMIT ID: c95cd4742f02bb009e651a00b07b21c979637dc8
-     WITH_MKL: ON
-     WITH_GPU: ON
-     CUDA version: 8.0
-     CUDNN version: v5
--- a/doc/fluid/advanced_usage/development/contribute_to_paddle.md
+++ b/doc/fluid/advanced_usage/development/contribute_to_paddle.md
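-# How to Contribute Code
-
-We sincerely appreciate your contributions. You are welcome to submit code through GitHub's fork and pull request workflow.
-
-## Code Requirements
-- Code comments should follow the [Doxygen](http://www.stack.nl/~dimitri/doxygen/) style.
-- Make sure the build option `WITH_STYLE_CHECK` is turned on and that the build passes the code style check.
-- All code must have unit tests.
-- All unit tests must pass.
-- Follow the code submission conventions described in the last section of this guide.
-
-The following tutorial walks you through submitting code.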
-## [Fork](https://help.github.com/articles/fork-a-repo/)
-
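-Go to the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) GitHub home page and click the `Fork` button to create a copy of the repository under your own account, for example <https://github.com/USERNAME/Paddle>.
-
-## Clone
-
-Clone the remote repository to your local machine: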
-
-```bash
-➜  git clone https://github.com/USERNAME/Paddle
-➜  cd Paddle
-```
-
-
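-## Create a Local Branch
-
-Paddle currently uses the [Git flow branching model](http://nvie.com/posts/a-successful-git-branching-model/) for development, testing, release, and maintenance. For details, see the [Paddle branch conventions](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/releasing_process.md#paddle-分支规范).
-
-All feature and bug-fix work should be done on a new branch, usually created from the `develop` branch.
-
-Use `git checkout -b` to create and switch to a new branch.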
-
-```bash
-➜  git checkout -b my-cool-stuff
-```
-
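-Note that the working directory of the current branch should be clean before you check out; otherwise, untracked files are carried over to the new branch. You can check this with `git status`.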
-
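-## Use the `pre-commit` Hook
-
-Paddle developers use the [pre-commit](http://pre-commit.com/) tool to manage Git pre-commit hooks. It helps format the source code (C++, Python) and automatically checks a few basics before each commit, such as ensuring that every file ends with exactly one EOL and that no large files are added to Git.
-
-The `pre-commit` checks are part of the unit tests in Travis-CI, and a PR that does not satisfy the hooks cannot be merged into Paddle. First install the tool and run it in the current directory: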
-
-```bash
-➜  pip install pre-commit
-➜  pre-commit install
-```
-
-Paddle uses `clang-format` to format C/C++ source code. Make sure your `clang-format` version is 3.8 or later.
-
-Note: the `yapf` versions installed through `pip install pre-commit` and `conda install -c conda-forge pre-commit` differ slightly. Paddle developers use `pip install pre-commit`.
-
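-## Start Developing
-
-In this example, I deleted one line from README.md and created a new file.
-
-Use `git status` to view the current state, which reports the changes in the current directory. You can also use `git diff` to see exactly what was modified in each file.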
-
-```bash
-➜  git status
-On branch test
-Changes not staged for commit:
-  (use "git add <file>..." to update what will be committed)
-  (use "git checkout -- <file>..." to discard changes in working directory)
-
-	modified:   README.md
-
-Untracked files:
-  (use "git add <file>..." to include in what will be committed)
-
-	test
-
-no changes added to commit (use "git add" and/or "git commit -a")
-```
-
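-## Build and Test
-
-Building the PaddlePaddle source code and generating the documentation requires a number of development tools. For convenience, our standard workflow packs all of these tools into a Docker image called the *development image*, usually named `paddle:latest-dev` or `paddle:[version tag]-dev`, for example `paddle:0.11.0-dev`. Wherever `cmake && make` would be used (for instance, in an IDE configuration), use `docker run paddle:latest-dev` instead.
-
-To build the development image, run the following in the root of the source tree: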
-
-```bash
-➜  docker build -t paddle:latest-dev .
-```
-
-You can then use the development image to build the PaddlePaddle source code. For example, to build PaddlePaddle without GPU support but with AVX instructions and unit tests enabled, run:
-
-```bash
-➜  docker run -v $(pwd):/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=ON" -e "WITH_TESTING=ON" paddle:latest-dev
-```
-
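-Besides compiling PaddlePaddle into `./build/libpaddle.so` and producing a `./build/paddle.deb` file, this process also generates a `build/Dockerfile`. Running the following command packages the compiled PaddlePaddle into a *production image* (`paddle:prod`):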
-
-```bash
-➜  docker build -t paddle:prod -f build/Dockerfile .
-```
-
-To run all unit tests, use the following command:
-
-```bash
-➜  docker run -it -v $(pwd):/paddle paddle:latest-dev bash -c "cd /paddle/build && ctest"
-```
-
-For more information about building and testing, see [Install and Run with Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/v2/build_and_install/docker_install_cn.rst).
-
-## Commit
-
-Next, we discard the changes to README.md and then commit the newly added test file.
-
-```bash
-➜  git checkout -- README.md
-➜  git status
-On branch test
-Untracked files:
-  (use "git add <file>..." to include in what will be committed)
-
-	test
-
-nothing added to commit but untracked files present (use "git add" to track)
-➜  git add test
-```
-
-Every Git commit needs a commit message so that others know what the commit changed. Use `git commit` to do this.
-
-```bash
-➜  git commit
-CRLF end-lines remover...............................(no files to check)Skipped
-yapf.................................................(no files to check)Skipped
-Check for added large files..............................................Passed
-Check for merge conflicts................................................Passed
-Check for broken symlinks................................................Passed
-Detect Private Key...................................(no files to check)Skipped
-Fix End of Files.....................................(no files to check)Skipped
-clang-formater.......................................(no files to check)Skipped
-[my-cool-stuff c703c041] add test file
- 1 file changed, 0 insertions(+), 0 deletions(-)
- create mode 100644 233
-```
-
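-## Keep the Local Repository Up to Date
-
-Before opening a pull request, synchronize with the latest code from the original repository (<https://github.com/PaddlePaddle/Paddle>).
-
-First, use `git remote` to check the names of the current remote repositories.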
-
-```bash
-➜  git remote
-origin
-➜  git remote -v
-origin	https://github.com/USERNAME/Paddle (fetch)
-origin	https://github.com/USERNAME/Paddle (push)
-```
-
-Here, origin is the name of the remote repository we cloned, that is, the Paddle fork under your own user name. Next, we add the original Paddle repository as a remote named upstream.
-
-```bash
-➜  git remote add upstream https://github.com/PaddlePaddle/Paddle
-➜  git remote
-origin
-upstream
-```
-
-Fetch the latest code from upstream and update the current branch.
-
-```bash
-➜  git fetch upstream
-➜  git pull upstream develop
-```
-
-## Push to the Remote Repository
-
-Push your local changes to GitHub, that is, to https://github.com/USERNAME/Paddle.
-
-```bash
-# Push to the my-cool-stuff branch of the remote repository origin
-➜  git push origin my-cool-stuff
-```
-
-## Create an Issue and Complete the Pull Request
-
-Create an issue that describes the problem, and note its number.
-
-Switch to the branch you created, then click `New pull request`.
-
-<img width="295" alt="screen shot 2017-04-26 at 9 09 28 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436054/a6d98c66-2ac4-11e7-9cb1-18dd13150230.png">
-
-Select the target branch:
-
-<img width="750" alt="screen shot 2017-04-26 at 9 11 52 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436139/f83b1e6c-2ac4-11e7-8c0e-add499023c46.png">
-
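-If you write `resolve #<issue number>` in the PR description, the corresponding issue is closed automatically once the PR is merged. For details, see <https://help.github.com/articles/closing-issues-via-commit-messages/>.
-
-Next, wait for the review. If changes are requested, update the corresponding branch in origin by following the steps above.
-
-## Delete the Remote Branch
-
-After the PR has been merged into the main repository, you can delete the branch of the remote repository from the PR page.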
-
-<img width="775" alt="screen shot 2017-04-26 at 9 18 24 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436457/e4cdd472-2ac5-11e7-9272-badc76c4a23e.png">
-
-You can also delete a remote branch with `git push origin :<branch name>`, for example:
-
-```bash
-➜  git push origin :my-cool-stuff
-```
-
-## Delete the Local Branch
-
-Finally, delete the local branch.
-
-```bash
-# Switch to the develop branch
-➜  git checkout develop
-
-# Delete the my-cool-stuff branch
-➜  git branch -D my-cool-stuff
-```
-
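-This completes one full code contribution.
-
-## Code Submission Conventions
-
-To help reviewers focus on the code itself, please follow these conventions every time you submit code:
-
-1. Make sure the unit tests in Travis-CI pass. If they do not, the submitted code has problems and reviewers generally will not review it.
-2. Before submitting a pull request:
-   - Pay attention to the number of commits:
-     - Reason: if you modify only one file but submit a dozen commits, each with a small change, reviewers have a hard time. They must go through every commit to find out what changed, and changes in different commits may even overwrite one another.
-     - Suggestion: keep the number of commits as small as possible for each submission. You can use `git commit --amend` to add to the previous commit. For multiple commits that have already been pushed to the remote repository, see [squash commits after push](http://stackoverflow.com/questions/5667884/how-to-squash-commits-in-git-after-they-have-been-pushed).
-   - Pay attention to the name of each commit: it should reflect what the commit contains and should not be arbitrary.
-3. If the pull request resolves an issue, add `fix #issue_number` to the **first** comment box of the pull request so that the issue is closed automatically when the pull request is merged. The supported keywords are close, closes, closed, fix, fixes, fixed, resolve, resolves, and resolved; choose the appropriate one. For details, see [Closing issues via commit messages](https://help.github.com/articles/closing-issues-via-commit-messages).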
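
The keyword list in item 3 can be checked mechanically before a pull request is opened. The following is only a minimal Python 3 sketch: the helper name `issues_closed_by` and the regular expression are illustrative assumptions, not GitHub's actual matching logic.

```python
import re

# Keywords taken from the convention above.
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

# Assumed pattern: a keyword, whitespace, then "#<number>".
_CLOSING_RE = re.compile(
    r"\b(?:" + "|".join(CLOSING_KEYWORDS) + r")\s+#(\d+)",
    re.IGNORECASE,
)


def issues_closed_by(description):
    """Return the issue numbers that the description asks to close."""
    return [int(number) for number in _CLOSING_RE.findall(description)]


print(issues_closed_by("Fix #123 and resolves #45"))  # [123, 45]
```

A description that merely mentions an issue, such as `see #9`, yields an empty list, which is a hint that the issue would stay open after the merge.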
-
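-In addition, please follow these conventions when replying to reviewers' comments:
-
-1. Reply to every review comment (this is basic courtesy in the open-source community: when someone helps you, say thanks):
-   - If you agree with a comment and have made the change, a simple `Done` is enough;
-   - If you disagree with a comment, give your reasons.
-2. If there are many review comments:
-   - Summarize the overall changes.
-   - Reply using [start a review](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/) rather than replying to each comment directly, because every direct reply sends an email and can flood inboxes.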
+../../../dev/contribute_to_paddle_cn.md
\ No newline at end of file
--- a/doc/fluid/advanced_usage/development/cpu_profiling_cn.md
+++ b/doc/fluid/advanced_usage/development/cpu_profiling_cn.md
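-# CPU Performance Tuning
-
-This tutorial explains how to use Python's cProfile package, the Python library yep, and Google perftools for profiling and performance tuning.
-
-Profiling means finding performance bottlenecks. The real bottlenecks in a system can be very different from what programmers imagine during development. Tuning means removing those bottlenecks. Performance optimization is usually a repeated cycle of profiling and tuning.
-
-PaddlePaddle users generally write deep learning programs by calling the Python API. Most Python API calls invoke libpaddle.so, which is written in C++. PaddlePaddle profiling and tuning therefore has two parts:
-
-* Profiling Python code
-* Profiling mixed Python and C++ code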
-
-
-## Profiling Python Code
-
-### Generate the Profile
-
-The Python standard library provides a profiling package, [cProfile](https://docs.python.org/2/library/profile.html). The command to generate a Python profile is:
-
-```bash
-python -m cProfile -o profile.out main.py
-```
-
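-Here `main.py` is the program to profile, and `-o` specifies the name of the output file that stores the profiling result. If no file is specified, `cProfile` prints to standard output.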
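
The same profile can also be collected from inside a script with the standard-library `cProfile` and `pstats` modules. The sketch below assumes Python 3 (the tutorial's own output comes from Python 2.7); `busy_sum` and `profile_call` are hypothetical names used only for illustration.

```python
import cProfile
import io
import pstats


def busy_sum(n):
    """A stand-in workload; replace it with the code you want to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report sorted by tottime)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("tottime").print_stats(5)  # show the five most expensive entries
    return result, stream.getvalue()


result, report = profile_call(busy_sum, 100000)
print(report)
```

Sorting by `tottime` puts the functions that spend the most time in their own body at the top of the report.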
-
-### View the Profile
-
-`cProfile` writes `profile.out` after main.py finishes. We can view the result with [`cprofilev`](https://github.com/ymichael/cprofilev), a third-party Python library. It starts an HTTP service and presents the profiling result as a web page:
-
-```bash
-cprofilev -a 0.0.0.0 -p 3214 -f profile.out main.py
-```
-
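-Here `-a` is the IP address the HTTP service binds to; `0.0.0.0` allows access from external networks. `-p` is the port of the HTTP service, `-f` is the profiling result file, and `main.py` is the source file being profiled.
-
-Open the corresponding URL in a web browser to see the profiling result: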
-
-```
-   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
-        1    0.284    0.284   29.514   29.514 main.py:1(<module>)
-     4696    0.128    0.000   15.748    0.003 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/executor.py:20(run)
-     4696   12.040    0.003   12.040    0.003 {built-in method run}
-        1    0.144    0.144    6.534    6.534 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/__init__.py:14(<module>)
-```
-
-The columns have the following meanings:
-
-<table>
-<thead>
-<tr>
-<th>Column</th>
-<th>Meaning</th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>ncalls</td>
-<td>Number of times the function was called</td>
-</tr>
-<tr>
-<td>tottime</td>
-<td>Total time actually spent in the function itself, excluding time spent in the functions it calls</td>
-</tr>
-<tr>
-<td>percall</td>
-<td>Average tottime per call</td>
-</tr>
-<tr>
-<td>cumtime</td>
-<td>Cumulative time of the function, including time spent in the functions it calls</td>
-</tr>
-<tr>
-<td>percall</td>
-<td>Average cumtime per call</td>
-</tr>
-<tr>
-<td>filename:lineno(function)</td>
-<td>File name, line number, and function name</td>
-</tr>
-</tbody>
-</table>
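
As a quick check of how the two `percall` columns relate to the others, the values can be recomputed from the sample output above. This is a small Python sketch; the helper name `percall` is illustrative only.

```python
def percall(total_seconds, ncalls):
    """Average time per call, rounded to three decimals as in the report."""
    return round(total_seconds / ncalls, 3)


# {built-in method run}: tottime 12.040 s over 4696 calls
print(percall(12.040, 4696))  # 0.003
# executor.py:20(run): cumtime 15.748 s over 4696 calls
print(percall(15.748, 4696))  # 0.003
```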
-
-
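-### Find Performance Bottlenecks
-
-`tottime` and `cumtime` are usually the key metrics for finding bottlenecks, because they reflect how long a function actually runs.
-
-Sorting the profiling result by tottime gives: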
-
-```text
-     4696   12.040    0.003   12.040    0.003 {built-in method run}
-   300005    0.874    0.000    1.681    0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/dataset/mnist.py:38(reader)
-   107991    0.676    0.000    1.519    0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:219(__init__)
-     4697    0.626    0.000    2.291    0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:428(sync_with_cpp)
-        1    0.618    0.618    0.618    0.618 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/__init__.py:1(<module>)
-```
-
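-The most time-consuming function is the C++ `run` function. Tuning it requires the mixed Python and C++ profiling described in the second part of this tutorial. The `sync_with_cpp` function also has a long total time and a long time per call, so we can click on the details of `sync_with_cpp` to examine its call relationships.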
-
-```text
-Called By:
-
-   Ordered by: internal time
-   List reduced from 4497 to 2 due to restriction <'sync_with_cpp'>
-
-Function                                                                                                 was called by...
-                                                                                                             ncalls  tottime  cumtime
-/home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:428(sync_with_cpp)  <-    4697    0.626    2.291  /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:562(sync_with_cpp)
-/home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:562(sync_with_cpp)  <-    4696    0.019    2.316  /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:487(clone)
-                                                                                                                  1    0.000    0.001  /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:534(append_backward)
-
-
-Called:
-
-   Ordered by: internal time
-   List reduced from 4497 to 2 due to restriction <'sync_with_cpp'>
-```
-
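-Looking at the call relationships between hot functions, together with the code on the corresponding lines, usually reveals where the problematic code is. After making a performance fix, profile again to check whether the change actually improves the program's performance.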
-
-
-
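-## Profiling Mixed Python and C++ Code
-
-### Generate the Profile
-
-There are many profiling tools for C++; common ones include `gprof`, `valgrind`, and `google-perftools`. However, profiling a dynamic library used from Python is much more complex than profiling a plain binary. Fortunately, the third-party Python library `yep` provides a convenient way to work with `google-perftools`, so we use `yep` here to profile mixed Python and C++ code.
-
-Before using `yep`, install the `google-perftools` and `yep` packages. On Ubuntu, the commands are: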
-
-```bash
-apt update
-apt install libgoogle-perftools-dev
-pip install yep
-```
-
-After installation, run the following command to generate the profile:
-
-```bash
-python -m yep -v main.py
-```
-
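-The generated profile is `main.py.prof`.
-
-The `-v` option prints the analysis result on the command line after the profile is generated, which gives a quick look at the outcome. Unlike Python, C++ code may be compiled without debug information, and multithreading at run time can produce confusing, unreadable profiling results. To obtain more readable results, consider the following:
-
-1. Pass `-g` at compile time to generate debug information. With cmake, set CMAKE_BUILD_TYPE to `RelWithDebInfo`.
-2. Always enable optimization at compile time. The performance of a plain `Debug` build differs greatly from `-O2` or `-O3`, so performance testing in `Debug` mode is meaningless.
-3. When profiling, start with a single thread, then enable multiple threads, and then multiple machines, since single-threaded runs are easier to debug. Set the environment variable `OMP_NUM_THREADS=1` to turn off OpenMP parallelism.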
-
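-### View the Profile
-
-Profiling produces a result file, which we can display with [`pprof`](https://github.com/google/pprof). Note that this is the version of `pprof` rewritten in `Go`, because it offers a web interface and presents results better.
-
-`pprof` is installed like any other `Go` program: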
-
-```bash
-go get github.com/google/pprof
-```
-
-We can then start an HTTP service with the following command:
-
-```bash
-pprof -http=0.0.0.0:3213 `which python`  ./main.py.prof
-```
-
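-In this command, `-http` starts the HTTP service. `which python` yields the full path of the current Python binary, which tells pprof where the Python executable is. `./main.py.prof` is the profiling result to read.
-
-Open the corresponding URL to view the profiling result, as shown in the figure below: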
-
-![result](./pprof_1.png)
-
-
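-### Find Performance Bottlenecks
-
-As with pure Python code, finding bottlenecks in mixed Python and C++ code relies on `tottime` and `cumtime`. The call graph shown by `pprof` also helps reveal performance problems.
-
-For example, consider the figure below: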
-
-![kernel_perf](./pprof_2.png)
-
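-In one training run, multiplication and its gradient take about 2% to 4% of the computation time, while `MomentumOp` takes about 17%. Clearly, `MomentumOp` has a performance problem.
-
-`pprof` marks the critical path in red. Checking the critical path first and the other parts afterwards makes the optimization more orderly.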
+../../../howto/optimization/cpu_profiling_cn.md
\ No newline at end of file
--- a/doc/fluid/advanced_usage/development/host_memory_profiling_cn.md
+++ b/doc/fluid/advanced_usage/development/host_memory_profiling_cn.md
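-# Heap Memory Profiling and Optimization
-
-Any computer program can leak memory. A **memory leak** usually occurs when a program allocates memory on the heap and never frees it. As the program runs, it occupies more and more memory, which hurts the stability of the program (it may run slower and slower or hit OOM) and can even destabilize the machine it runs on and bring it down.
-
-
-There are many tools for analyzing memory leaks. Classic ones include [valgrind](http://valgrind.org/docs/manual/quick-start.html#quick-start.intro) and [gperftools](https://gperftools.github.io/gperftools/).
-
-Because Fluid runs a C++ core driven by Python, analyzing it directly with valgrind is very difficult: you have to build a dedicated debug version of Python with valgrind support, and most of the output consists of Python's own symbols and call information, which is hard to analyze. valgrind also makes the program run very slowly, so it is not recommended.
-
-This tutorial focuses on how to use [gperftools](https://gperftools.github.io/gperftools/).
-
-gperftools mainly supports the following four features:
-
-- thread-caching malloc
-- heap-checking using tcmalloc
-- heap-profiling using tcmalloc
-- CPU profiler
-
-Paddle also provides a gperftools-based [CPU profiling tutorial](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/howto/optimization/cpu_profiling_cn.md).
-
-For heap analysis, we mainly use thread-caching malloc and heap-profiling using tcmalloc.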
-
-## Environment
-
-This tutorial is based on the Docker development environment paddlepaddle/paddle:latest-dev provided by Paddle, which runs Ubuntu 16.04.4 LTS.
-
-## Workflow
-
-- Install google-perftools
-
-```
-apt-get install libunwind-dev 
-apt-get install google-perftools
-```
-
-- Install pprof
-
-```
-go get -u github.com/google/pprof
-```
-
-- Set up the runtime environment
-
-```
-export PPROF_PATH=/root/gopath/bin/pprof
-export PPROF_BINARY_PATH=/root/gopath/bin/pprof
-export LD_PRELOAD=/usr/lib/libtcmalloc.so.4
-```
-
-- Run the Python program with the heap profiler. In essence, this periodically takes a snapshot of heap allocations.
-
-```
-# HEAPPROFILE sets the directory and file-name prefix of the generated heap profiles
-# HEAP_PROFILE_ALLOCATION_INTERVAL sets how many bytes are allocated between dumps; the default is 1 GB
-env HEAPPROFILE="./perf_log/test.log" HEAP_PROFILE_ALLOCATION_INTERVAL=209715200 python trainer.py
-```
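
The interval `209715200` in the command above is 200 MiB expressed in bytes. The Python sketch below shows one way to build these environment variables programmatically; `mib_to_bytes` and `heap_profile_env` are hypothetical helper names, and only the two variable names come from the command above.

```python
import os


def mib_to_bytes(mib):
    """Convert mebibytes to bytes."""
    return int(mib) * 1024 * 1024


def heap_profile_env(prefix, interval_mib, base_env=None):
    """Return a copy of the environment with the heap-profiler variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HEAPPROFILE"] = prefix
    env["HEAP_PROFILE_ALLOCATION_INTERVAL"] = str(mib_to_bytes(interval_mib))
    return env


env = heap_profile_env("./perf_log/test.log", 200, base_env={})
print(env["HEAP_PROFILE_ALLOCATION_INTERVAL"])  # 209715200
```

The resulting dictionary could then be passed as the `env` argument of `subprocess.run(["python", "trainer.py"], env=env)`, assuming `LD_PRELOAD` has been set as shown in the previous step.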
-
-As the program runs, many files are generated in the perf_log directory, for example:
-
-```
--rw-r--r-- 1 root root 1.0M Jun  1 15:00 test.log.0001.heap
--rw-r--r-- 1 root root 1.0M Jun  1 15:00 test.log.0002.heap
--rw-r--r-- 1 root root 1.0M Jun  1 15:00 test.log.0003.heap
--rw-r--r-- 1 root root 1.0M Jun  1 15:00 test.log.0004.heap
--rw-r--r-- 1 root root 1.0M Jun  1 15:00 test.log.0005.heap
--rw-r--r-- 1 root root 1.0M Jun  1 15:00 test.log.0006.heap
-```
-
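-- Analyze the heap files with pprof. There are two modes:
-	- Full mode. This analyzes the current heap and shows the call paths that allocated the memory currently in use.
-
-	```
-	pprof --pdf python test.log.0012.heap
-	```
-	The command above generates a file named profile00x.pdf that can be opened directly, for example [memory_cpu_allocator](https://github.com/jacquesqiao/Paddle/blob/bd2ea0e1f84bb6522a66d44a072598153634cade/doc/fluid/howto/optimization/memory_cpu_allocator.pdf). The figure below shows that, when the CPU version of Fluid runs, the module that allocates the most memory is CPUAllocator. Other modules allocate comparatively little and are therefore omitted, which is inconvenient for finding memory leaks: a leak is a slow process and cannot be seen in this kind of graph.
-	
-	![result](https://user-images.githubusercontent.com/3048612/40964027-a54033e4-68dc-11e8-836a-144910c4bb8c.png)
-	
-	- Diff mode. This takes the diff of the heap at two points in time, drops the modules whose allocations did not change, and shows only the increase.
-	```
-	pprof --pdf --base test.log.0010.heap python test.log.1045.heap
-	```
-	The result is: [`memory_leak_protobuf`](https://github.com/jacquesqiao/Paddle/blob/bd2ea0e1f84bb6522a66d44a072598153634cade/doc/fluid/howto/optimization/memory_leak_protobuf.pdf)
-	
-	The figure shows that the ProgramDesc structure grew by more than 200 MB between the two snapshots, so a memory leak here is very likely. The final outcome confirmed that the leak was indeed here.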
-	
-	![result](https://user-images.githubusercontent.com/3048612/40964057-b434d5e4-68dc-11e8-894b-8ab62bcf26c2.png)
-	![result](https://user-images.githubusercontent.com/3048612/40964063-b7dbee44-68dc-11e8-9719-da279f86477f.png)
-	
+../../../howto/optimization/host_memory_profiling_cn.md
\ No newline at end of file
--- a/doc/fluid/advanced_usage/development/new_op.md
+++ b/doc/fluid/advanced_usage/development/new_op.md
-# How to Write a New Operator
-
- - Concepts
- - Implement the C++ Classes
-   - Define the ProtoMaker Class
-   - Define the Operator Class
-   - Define the OpKernel Class
-   - Register the Operator
-   - Build
- - Python Binding
- - Implement Unit Tests
-   - Forward Operator Unit Test
-   - Backward Operator Unit Test
-   - Build and Run
- - Notes
-
-
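-## Concepts
-
-This section briefly introduces the base classes you need. For details, see the design documents.
-
-- `framework::OperatorBase`: the base class of Operator (Op for short).
-- `framework::OpKernel`: the base class of an Op's computation function, called a Kernel.
-- `framework::OperatorWithKernel`: inherits from OperatorBase; the Op has a computation function, that is, it has a Kernel.
-- `class OpProtoAndCheckerMaker`: describes the inputs, outputs, attributes, and comments of the Op; it is mainly used to generate the Python API.
-
-Ops fall into two kinds depending on whether they contain a Kernel: the definition of an Op with a Kernel inherits from `OperatorWithKernel`, and that of an Op without a Kernel inherits from `OperatorBase`. This tutorial focuses on how to write an Op with a Kernel. In brief, an Op needs to contain the following: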
-
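-<table>
-<thead>
-<tr>
-<th>Content</th>
-<th>Where it is defined</th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>OpProtoMaker definition</td>
-<td>.cc file; a backward Op does not need an OpProtoMaker</td>
-</tr>
-<tr>
-<td>Op definition</td>
-<td>.cc file</td>
-</tr>
-<tr>
-<td>Kernel implementation</td>
-<td>A Kernel shared by CPU and CUDA is implemented in the .h file; otherwise, the CPU implementation goes in the .cc file and the CUDA implementation in the .cu file.</td>
-</tr>
-<tr>
-<td>Op registration</td>
-<td>The Op is registered in the .cc file; the CPU Kernel is registered in the .cc file and the CUDA Kernel in the .cu file</td>
-</tr>
-</tbody>
-</table>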
-
-
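-New Ops are all added under the directory [paddle/fluid/operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/fluid/operators). File names end with `*_op.h` (if any), `*_op.cc`, and `*_op.cu` (if any). **The build system automatically builds the Op and its Python extension based on the file names.**
-
-
-The following uses matrix multiplication, that is, [MulOp](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/mul_op.cc), as an example of how to write an Operator with a Kernel.
-
-
-## Implement the C++ Classes
-
-
-### Define the ProtoMaker Class
-
-The formula for matrix multiplication is $Out = X * Y$, so the computation has two inputs and one output.
-
-First, define a `ProtoMaker` to describe the inputs and outputs of the Op and add comments: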
-
-```cpp
-class MulOpMaker : public framework::OpProtoAndCheckerMaker {
- public:
-  MulOpMaker(OpProto *proto, OpAttrChecker *op_checker)
-      : OpProtoAndCheckerMaker(proto, op_checker) {
-    AddInput("X", "(Tensor), 2D tensor of size (M x K)");
-    AddInput("Y", "(Tensor), 2D tensor of size (K x N)");
-    AddOutput("Out", "(Tensor), 2D tensor of size (M x N)");
-    AddComment(R"DOC(
-Two Element Mul Operator.
-The equation is: Out = X * Y
-)DOC");
-  }
-};
-```
-
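-[`MulOpMaker`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/mul_op.cc#L76-L127) inherits from `framework::OpProtoAndCheckerMaker`. Its constructor takes two parameters:
-
-   - `framework::OpProto`: stores the inputs, outputs, and attributes of the Op and is used to generate the Python API.
-   - `framework::OpAttrChecker`: checks the validity of the attributes.
-
-The constructor adds input parameters with `AddInput`, output parameters with `AddOutput`, and the Op's comment with `AddComment`. These functions add the corresponding content to `OpProto`.
-
-The code above adds two inputs, `X` and `Y`, and one output, `Out`, to `MulOp` and explains the meaning of each. Names should follow the [naming convention](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/dev/name_convention.md).
-
-
-Take [`ScaleOp`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/scale_op.cc#L38-L55) as another example: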
-
-```cpp
-template <typename AttrType>
-class ScaleOpMaker : public framework::OpProtoAndCheckerMaker {
- public:
-  ScaleOpMaker(OpProto *proto, OpAttrChecker *op_checker)
-      : OpProtoAndCheckerMaker(proto, op_checker) {
-    AddInput("X", "(Tensor) Input tensor of scale operator.");
-    AddOutput("Out", "(Tensor) Output tensor of scale operator.");
-    AddComment(R"DOC(
-Scale operator
-$$Out = scale*X$$
-)DOC");
-    AddAttr<AttrType>("scale",
-                      "(float, default 1.0)"
-                      "The scaling factor of the scale operator.")
-        .SetDefault(1.0);
-  }
-};
-```
-
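-This example contains `AddAttr<AttrType>("scale", "...").SetDefault(1.0);`, which adds the `scale` factor as an attribute and sets its default value to 1.0.
-
-### Define the GradProtoMaker Class
-Every Op must have a corresponding GradProtoMaker. If no GradProtoMaker is customized for a forward Op, Fluid provides DefaultGradProtoMaker, whose default registration uses all inputs and outputs, including Input, Output, Output@Grad, and so on. Using variables that are not needed wastes GPU memory.
-The following example defines the GradProtoMaker of ScaleOp.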
-
-```cpp
-class ScaleGradMaker : public framework::SingleGradOpDescMaker {
- public:
-  using framework::SingleGradOpDescMaker::SingleGradOpDescMaker;
-
-  std::unique_ptr<framework::OpDesc> Apply() const override {
-    auto *grad_op = new framework::OpDesc();
-    grad_op->SetType("scale");
-    grad_op->SetInput("X", OutputGrad("Out"));
-    grad_op->SetOutput("Out", InputGrad("X"));
-    grad_op->SetAttr("scale", GetAttr("scale"));
-    return std::unique_ptr<framework::OpDesc>(grad_op);
-  }
-};
-```
-
-### Define the Operator Class
-
-The following code defines MulOp:
-
-```cpp
-class MulOp : public framework::OperatorWithKernel {
- public:
-  using framework::OperatorWithKernel::OperatorWithKernel;
-
- protected:
-  void InferShape(const framework::InferShapeContext &ctx) const override {
-    auto dim0 = ctx.Input<Tensor>("X")->dims();
-    auto dim1 = ctx.Input<Tensor>("Y")->dims();
-    PADDLE_ENFORCE_EQ(dim0.size(), 2,
-                      "input X(%s) should be a tensor with 2 dims, a matrix",
-                      ctx.op_.Input("X"));
-    PADDLE_ENFORCE_EQ(dim1.size(), 2,
-                      "input Y(%s) should be a tensor with 2 dims, a matrix",
-                      ctx.op_.Input("Y"));
-    PADDLE_ENFORCE_EQ(
-        dim0[1], dim1[0],
-        "First matrix's width must be equal with second matrix's height.");
-    ctx.Output<Tensor>("Out")->Resize({dim0[0], dim1[1]});
-  }
-};
-```
-
-[`MulOp`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/mul_op.cc#L22) inherits from `OperatorWithKernel`. Its `public` member is:
-
-```cpp
-using framework::OperatorWithKernel::OperatorWithKernel;
-```
-
-This line uses the constructors of the base class `OperatorWithKernel`. It can also be written as:
-
-```cpp
-MulOp(const std::string &type, const framework::VariableNameMap &inputs,
-      const framework::VariableNameMap &outputs,
-      const framework::AttributeMap &attrs)
-  : OperatorWithKernel(type, inputs, outputs, attrs) {}
-```
-
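-You also need to override the `InferShape` interface. `InferShape` is a const function and cannot modify the member variables of the Op. Its parameter is `const framework::InferShapeContext &ctx`, through which the inputs, outputs, and attributes can be obtained. Its job is to:
-
-  - 1). Perform checks and report errors as early as possible: verify that the dimensions, types, and so on of the input data are valid.
-  - 2). Set the shape of the output Tensor.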
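
The two responsibilities of `InferShape` can be illustrated without any Paddle code. The sketch below is plain Python under the assumption that shapes are tuples of integers; `infer_mul_shape` is a hypothetical helper that mirrors the checks in `MulOp::InferShape`, not part of the Paddle API.

```python
def infer_mul_shape(x_dims, y_dims):
    """Validate the input shapes of a matrix multiply and return the output shape."""
    # 1) Check early and fail with a descriptive message.
    if len(x_dims) != 2:
        raise ValueError("input X should be a 2-D tensor, got %d dims" % len(x_dims))
    if len(y_dims) != 2:
        raise ValueError("input Y should be a 2-D tensor, got %d dims" % len(y_dims))
    if x_dims[1] != y_dims[0]:
        raise ValueError(
            "width of X (%d) must equal height of Y (%d)" % (x_dims[1], y_dims[0])
        )
    # 2) Derive the output shape: (M x K) * (K x N) -> (M x N).
    return (x_dims[0], y_dims[1])


print(infer_mul_shape((32, 84), (84, 100)))  # (32, 100)
```

Failing before any computation starts keeps the error close to its cause, which is the reason the checks come first.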
-
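-The definitions of the `OpProtoMaker` and `Op` classes are usually written in the `.cc` file, together with the registration functions described below.
-
-### Define the OpKernel Class
-
-`MulKernel` inherits from `framework::OpKernel` and has the following two template parameters:
-
-- `typename DeviceContext`: the device type. This template parameter is needed when different devices (CPU, CUDA) share the same Kernel and is omitted when they do not. An example that does not share a Kernel is [`OnehotCrossEntropyOpKernel`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/cross_entropy_op.h#L43).
-
-- `typename T`: the data type, such as `float` or `double`.
-
-The `MulKernel` class needs to override the `Compute` interface.
-- `Compute` takes one input parameter: `const framework::ExecutionContext& context`.
-- Compared with `InferShapeContext`, `ExecutionContext` adds the device type; it likewise provides access to the inputs, outputs, and attributes.
-- The `Compute` function implements the actual computation logic of the `OpKernel`.
-
-The following is the implementation of `Compute` for `MulKernel`: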
-
-  ```cpp
-  template <typename DeviceContext, typename T>
-  class MulKernel : public framework::OpKernel<T> {
-  public:
-  void Compute(const framework::ExecutionContext& context) const override {
-    auto* X = context.Input<Tensor>("X");
-    auto* Y = context.Input<Tensor>("Y");
-    auto* Z = context.Output<Tensor>("Out");
-    Z->mutable_data<T>(context.GetPlace());
-    auto& device_context = context.template device_context<DeviceContext>();
-    math::matmul<DeviceContext, T>(*X, false, *Y, false, 1, Z, 0, device_context);
-  }
-  };
-  ```
-
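-Note: **different devices (CPU, CUDA) share one Op definition; whether they also share the same `OpKernel` depends on whether the functions called by `Compute` support different devices.**
-
-The CPU and CUDA implementations of `MulOp` share the same `Kernel`. For an example in which the `OpKernel` is not shared, see [`OnehotCrossEntropyOpKernel`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/cross_entropy_op.h#L43).
-
-To make the computation in an `OpKernel` easier to write and to let CPU and CUDA code be reused, we usually implement the `Compute` interface with the Eigen unsupported Tensor module. For how to use the Eigen library in PaddlePaddle, see the [usage document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/dev/use_eigen_cn.md).
-
-This completes the implementation of the forward Op. Next, the Op and its kernel need to be registered in the `.cc` file.
-The definitions of the backward Op class and the backward OpKernel are similar to those of the forward Op and are not repeated here. **Note, however, that a backward Op has no `ProtoMaker`.**
-
-### Register the Operator
-
-- Register the forward and backward Op classes and the CPU Kernel in the `.cc` file.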
-
-    ```cpp
-    namespace ops = paddle::operators;
-    REGISTER_OPERATOR(mul, ops::MulOp, ops::MulOpMaker,
-                  paddle::framework::DefaultGradOpDescMaker<true>);
-    REGISTER_OPERATOR(mul_grad, ops::MulGradOp);
-    REGISTER_OP_CPU_KERNEL(mul, ops::MulKernel<paddle::platform::CPUDeviceContext, float>);
-    REGISTER_OP_CPU_KERNEL(mul_grad,
-                  ops::MulGradKernel<paddle::platform::CPUDeviceContext, float>);
-    ```
-
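-   In the code above:
-
-    - `REGISTER_OPERATOR`: registers the `ops::MulOp` class under the type name `mul`, with `ops::MulOpMaker` as its `ProtoMaker`, and registers `ops::MulGradOp` under the type name `mul_grad`.
-    - `REGISTER_OP_CPU_KERNEL`: registers the `ops::MulKernel` class, specializing the template parameters to `paddle::platform::CPUDeviceContext` and `float`; `ops::MulGradKernel` is registered in the same way.
-
-
-- Register the CUDA Kernel in the `.cu` file.
-    - Note that if the CUDA Kernel is implemented with the Eigen unsupported module, add the macro definition `#define EIGEN_USE_GPU` at the beginning of the `.cu` file. For example: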
-
-    ```cpp
-    // Define this before including header files if the Eigen unsupported module is used
-    #define EIGEN_USE_GPU
-
-    namespace ops = paddle::operators;
-    REGISTER_OP_CUDA_KERNEL(mul, ops::MulKernel<paddle::platform::CUDADeviceContext, float>);
-    REGISTER_OP_CUDA_KERNEL(mul_grad,
-                           ops::MulGradKernel<paddle::platform::CUDADeviceContext, float>);
-    ```
-
-### Build
-
-Run the following command to build:
-
-```
-make mul_op
-```
-
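-## Python Binding
-
-The build system automatically creates Python bindings for newly added Ops and links them into the generated library.
-
-## Implement Unit Tests
-
-Unit tests compare the forward Op across devices (CPU, CUDA), compare the backward Op across devices (CPU, CUDA), and check the gradient of the backward Op. The following describes the [unit test of `MulOp`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/unittests/test_mul_op.py).
-
-### Forward Operator Unit Test
-
-Op unit tests inherit from `OpTest`. The more specific tests are done in `TestMulOp`. To test an Operator, you need to:
-
-1. Define the inputs, outputs, and related attributes in the `setUp` function.
-2. Generate random input data.
-3. Implement the same computation as the forward operator in the Python script to obtain the output values, and compare them with the output of the operator's forward computation.
-4. The backward computation is already integrated into the test framework; simply call the corresponding interfaces.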
-
-
-  ```python
-  import unittest
-  import numpy as np
-  from op_test import OpTest
-
-
-  class TestMulOp(OpTest):
-      def setUp(self):
-          self.op_type = "mul"
-          self.inputs = {
-              'X': np.random.random((32, 84)).astype("float32"),
-              'Y': np.random.random((84, 100)).astype("float32")
-          }
-          self.outputs = {'Out': np.dot(self.inputs['X'], self.inputs['Y'])}
-
-      def test_check_output(self):
-          self.check_output()
-
-      def test_check_grad_normal(self):
-          self.check_grad(['X', 'Y'], 'Out', max_relative_error=0.5)
-
-      def test_check_grad_ingore_x(self):
-          self.check_grad(
-              ['Y'], 'Out', max_relative_error=0.5, no_grad_set=set("X"))
-
-      def test_check_grad_ingore_y(self):
-          self.check_grad(
-              ['X'], 'Out', max_relative_error=0.5, no_grad_set=set('Y'))
-  ```
-
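-The code above first imports the required packages. The important variables set in the `setUp` function are explained below:
-
-- `self.op_type = "mul" `: defines the type, which must match the type registered for the operator.
-- `self.inputs`: defines the inputs, of type `numpy.array`, and initializes them.
-- `self.outputs`: defines the outputs; the Python script performs the same computation as the operator and returns the result computed on the Python side.
-
-### Backward Operator Unit Test
-
-In the backward test:
-- `test_check_grad_normal` calls `check_grad`, which uses a numerical method to check the correctness and stability of the gradient.
-  - The first argument, `["X", "Y"]`, specifies that the gradients of the input variables `X` and `Y` are checked.
-  - The second argument, `"Out"`, specifies the final output variable `Out` of the forward network.
-  - The third argument, `max_relative_error`, specifies the maximum error tolerated when checking the gradient.
-- The `test_check_grad_ingore_x` and `test_check_grad_ingore_y` branches test the cases in which the gradient of only one input needs to be computed.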
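
To make the idea of a numerical gradient check concrete, here is a minimal pure-Python sketch using central differences. It is not the `OpTest.check_grad` implementation; `numeric_gradient` and `max_relative_error` are hypothetical helpers, and the step size and the `eps` floor are assumptions chosen for illustration.

```python
def numeric_gradient(f, x, delta=1e-5):
    """Approximate df/dx_i for a scalar function f with central differences."""
    x = list(x)
    grads = []
    for i in range(len(x)):
        original = x[i]
        x[i] = original + delta
        f_pos = f(x)
        x[i] = original - delta
        f_neg = f(x)
        x[i] = original  # restore the perturbed element
        grads.append((f_pos - f_neg) / (2.0 * delta))
    return grads


def max_relative_error(analytic, numeric, eps=1e-3):
    """Largest |a - n| / max(|n|, eps) over all elements."""
    return max(abs(a - n) / max(abs(n), eps) for a, n in zip(analytic, numeric))


# f(x) = sum(x_i * y_i), so the analytic gradient with respect to x is y.
y = [3.0, 4.0, -2.0]
f = lambda v: sum(a * b for a, b in zip(v, y))
numeric = numeric_gradient(f, [0.5, -1.0, 2.0])
print(max_relative_error(y, numeric) < 0.5)  # True
```

A wrong analytic gradient, for example all zeros, gives a relative error close to 1 and would fail a tolerance of 0.5.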
-
-
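-### Build and Run
-
-New `test_*.py` unit tests under `python/paddle/fluid/tests/unittests/` are automatically added to the project build.
-
-Note that, **unlike building and testing a single Op, running the unit tests requires building the whole project**, with `WITH_TESTING` turned on, that is, `cmake paddle_dir -DWITH_TESTING=ON`. After a successful build, run the unit tests with the following command: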
-
-```bash
-make test ARGS="-R test_mul_op -V"
-```
-
-or:
-
-```bash
-ctest -R test_mul_op
-```
-
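-## Notes
-
-- The type name used when registering an Op must match the Op's name. Registering `REGISTER_OPERATOR(B, ...)` inside `A_op.cc` is not allowed, as it causes unit test failures.
-- If an Op has no CUDA kernel, do not create an empty `*_op.cu`; this causes unit test failures.
-- If several Ops depend on shared functions, put those functions in a file that does not match the `*_op.*` pattern, such as `gather.h`.
-
-### Notes on using PADDLE_ENFORCE
-
-When implementing an Op, validate data with macros such as PADDLE_ENFORCE and PADDLE_ENFORCE_EQ. The basic forms are:
-
-```
-PADDLE_ENFORCE(expression, error message)
-PADDLE_ENFORCE_EQ(object A, object B, error message)
-```
-
-If the expression is true, or object A equals object B, the check passes. Otherwise the program terminates and reports the error message to the user.
-To keep these messages friendly and easy to understand, developers should pay attention to how the macros are used.
-
-#### General principle
-
-Every check that uses PADDLE_ENFORCE or one of the `PADDLE_ENFORCE_*` macros must carry an appropriately detailed explanation. The **error message** must not be empty!
-
-#### Guidelines for writing messages
-
-1. [required] What went wrong, and why?
-    - For example: `ValueError: Mismatched label shape`
-2. [optional] What input was expected, and what was actually received?
-    - For example: `Expected labels dimension=1. Received 4.`
-3. [optional] Can a fix be suggested?
-    - For example: `Suggested Fix: If your classifier expects one-hot encoding label, check your n_classes argument to the estimator and/or the shape of your label. Otherwise, check the shape of your label.`
-
-If some of these points are unnecessary, or a concise description already makes them clear, adapt the message as appropriate.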
-
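The three-part structure above can be sketched as a small helper. This is only an illustration in Python: `build_enforce_message` and its parameters are hypothetical names invented for this example, and in real Op code the message is a C++ string literal passed to PADDLE_ENFORCE.

```python
def build_enforce_message(what, expected=None, received=None, fix=None):
    """Assemble an error message from the required and optional parts."""
    if not what or not what.strip():
        # The required part: a message must never be empty.
        raise ValueError("the error message must say what went wrong")
    parts = [what.strip()]
    if expected is not None and received is not None:
        # Optional: what was expected versus what actually arrived.
        parts.append("Expected %s. Received %s." % (expected, received))
    if fix:
        # Optional: a suggestion the user can act on.
        parts.append("Suggested Fix: " + fix.strip())
    return " ".join(parts)


message = build_enforce_message(
    "Mismatched label shape.",
    expected="labels dimension=1",
    received="4",
    fix="Check the shape of your label.")
print(message)
```

Keeping the required part mandatory and the other two optional mirrors the `[required]` and `[optional]` markers in the list.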
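-##### FAQ: typical problems
-
-1. No error message, or a message too terse to give the user a useful hint.
-
-Problem example 1: no message written
-```
-PADDLE_ENFORCE(ctx->HasInput("X"), "");
-```
-Problem example 2: message too terse
-```
-PADDLE_ENFORCE(i != nullptr, "i must be set"); // What is i?
-```
-
-2. The error message uses developer-defined variable abbreviations that are hard to understand.
-
-Problem example:
-```
-PADDLE_ENFORCE(forward_pd != nullptr,
-                    "Fail to find eltwise_fwd_pd in device context");  // users may not understand eltwise_fwd_pd
-```
-
-3. Calling a forbidden interface inside an Op: `Output = ShareDataWith(Input)`
-
-Problem example:
-```cpp
-auto *out = ctx.Output<framework::LoDTensor>("Out");
-auto *in = ctx.Input<framework::LoDTensor>("X");
-out->ShareDataWith(*in);
-```
-If `Output = ShareDataWith(Input)` appears inside an Op, the operator graph effectively contains a hidden edge connecting Input and Output. This edge cannot be represented in graph analysis and leads to errors in graph-based optimization.
-
-4. Performance practice for Op implementations
-
-Eigen operations such as broadcast and chop can be several times slower than a hand-written CUDA kernel. In such cases the CPU implementation can reuse Eigen, while the GPU implementation can use a dedicated CUDA kernel.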
-
-
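-#### Special notes on messages for Op InferShape checks
-
-- When checking input and output variables, use the following uniform format:
-`Input(variable name) of <Op name> operator should not be null.`  
-
-Correct example:
-```
-PADDLE_ENFORCE(ctx->HasInput("Input"),
-                        "Input(Input) of LSTMP operator should not be null.");
-```
-
-- When checking the inputs and outputs of a backward Op, state the backward Op's name.
-
-Correct example:
-```
-PADDLE_ENFORCE(ctx->HasInput("X"),
-                        "Input(X) of LoDResetGrad operator should not be null.");
-```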
+../../../dev/new_op_cn.md
\ No newline at end of file
--- a/doc/fluid/advanced_usage/development/timeline_cn.md
+++ b/doc/fluid/advanced_usage/development/timeline_cn.md
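-# How to use the timeline tool for performance analysis
-
-1. Add `profiler.start_profiler(...)` and `profiler.stop_profiler(...)` around the main training loop. After the run, a profile record file is written to `/tmp/profile`.
-
-	**Tip:**
-	Do not run too many iterations while the timeline is recording, because the number of records in the timeline is proportional to the number of iterations.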
-
-	```python
-    for pass_id in range(pass_num):
-        for batch_id, data in enumerate(train_reader()):
-            if pass_id == 0 and batch_id == 5:
-                profiler.start_profiler("All")
-            elif pass_id == 0 and batch_id == 10:
-                profiler.stop_profiler("total", "/tmp/profile")
-            exe.run(fluid.default_main_program(),
-                    feed=feeder.feed(data),
-                    fetch_list=[])
-	            ...
-	```
-
-1. Run `python paddle/tools/timeline.py` to process `/tmp/profile`. By default the program generates a `/tmp/timeline` file; the path can be changed with command-line arguments, see [timeline.py](https://github.com/PaddlePaddle/Paddle/blob/develop/tools/timeline.py).
-```bash
-python Paddle/tools/timeline.py --profile_path=/tmp/profile --timeline_path=timeline
-```
-
-1. Open the Chrome browser, go to <chrome://tracing/>, and use the `load` button to load the generated `timeline` file.
-
-	![chrome tracing](./tracing.jpeg)
-
-1. The result looks like the figure below; zoom in to inspect the details of the timeline.
-
-	![chrome timeline](./timeline.jpeg)
+../../../howto/optimization/timeline_cn.md
\ No newline at end of file
--- a/doc/fluid/advanced_usage/development/write_docs.rst
+++ b/doc/fluid/advanced_usage/development/write_docs.rst
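-################################
-How to Contribute Documentation
-################################
-
-PaddlePaddle's documentation comes in Chinese and English. All of it is generated by ``sphinx`` driven through ``cmake``. The PaddlePaddle.org tool can run this build for us and provides a better preview.
-
-How to Build the Documentation
-================================
-
-There are two ways to build the PaddlePaddle documentation: with the PaddlePaddle.org tool and without it. Each has its advantages: the former makes previewing convenient, the latter makes debugging convenient for developers. Each way can in turn be used with or without Docker.
-
-We recommend building the documentation with the PaddlePaddle.org tool.
-
-Using the PaddlePaddle.org Tool
---------------------------------
-This is the currently recommended approach. Besides building the documentation automatically, it lets you preview it directly in a web page. Note that the other approaches described later can also preview the documentation, but the styling differs from the official site; only a build with the PaddlePaddle.org tool produces a preview consistent with the official site's style.
-
-The PaddlePaddle.org tool can be used together with Docker, which must be installed first. For Docker installation, see the `Docker website <https://docs.docker.com/>`_ . Once Docker is installed, start the tool with the following commands: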
-
-..  code-block:: bash
-
-    mkdir paddlepaddle # Create paddlepaddle working directory
-    cd paddlepaddle
-
-    # Clone the content repositories
-    git clone https://github.com/PaddlePaddle/Paddle.git
-    git clone https://github.com/PaddlePaddle/book.git
-    git clone https://github.com/PaddlePaddle/models.git
-    git clone https://github.com/PaddlePaddle/Mobile.git
-
-    # Please specify the working directory through -v
-    docker run -it -p 8000:8000 -v `pwd`:/var/content paddlepaddle/paddlepaddle.org:latest
-
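-Note: PaddlePaddle.org runs its commands against the content repositories specified by -v (volume).
-Then open http://localhost:8000 in a browser to generate the required documentation on the web page.
-The built files are stored in the working directory <paddlepaddle working directory>/.ppo_workspace/content.
-
-If you prefer not to use Docker, you can start the tool's server directly by running the Django framework. Use the commands below.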
-
-..  code-block:: bash
-
-    mkdir paddlepaddle # Create paddlepaddle working directory
-    cd paddlepaddle
-
-    # Clone the content repositories and PaddlePaddle.org
-    git clone https://github.com/PaddlePaddle/Paddle.git
-    git clone https://github.com/PaddlePaddle/book.git
-    git clone https://github.com/PaddlePaddle/models.git
-    git clone https://github.com/PaddlePaddle/Mobile.git
-    git clone https://github.com/PaddlePaddle/PaddlePaddle.org.git
-
-    # Please specify the PaddlePaddle working directory. In the current setting, it should be pwd
-    export CONTENT_DIR=<path_to_paddlepaddle_working_directory>
-    export ENV=''
-    cd PaddlePaddle.org/portal/
-    pip install -r requirements.txt
-    python manage.py runserver
-
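-The tool's server reads the environment variable CONTENT_DIR to locate the repositories. Set CONTENT_DIR to the PaddlePaddle working directory.
-Then open http://localhost:8000 in a browser to generate the required documentation on the web page.
-The built files are stored in the working directory <paddlepaddle working directory>/.ppo_workspace/content.
-
-For more details about the PaddlePaddle.org tool, `see here <https://github.com/PaddlePaddle/PaddlePaddle.org/blob/develop/README.cn.md>`_ .
-
-Without the PaddlePaddle.org Tool
-----------------------------------
-
-To build the PaddlePaddle documentation with Docker, Docker must first be installed; see the `Docker website <https://docs.docker.com/>`_ . The method is similar to `building PaddlePaddle from source <http://paddlepaddle.org/docs/develop/documentation/zh/build_and_install/build_from_source_cn.html>`_ : build a Docker image from the source tree that can compile the documentation, run it, and inside the container use the script in the source tree to build the documentation. The steps are: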
-
-.. code-block:: bash
-
-   git clone https://github.com/PaddlePaddle/Paddle.git
-   cd Paddle
-
-   # Build a Docker image from source that can compile the PaddlePaddle documentation
-   docker build -t paddle:dev .
-   docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_TESTING=OFF" -e "WITH_DOC=ON" paddle:dev /bin/bash
-
-   # Inside the Docker container, build the PaddlePaddle documentation with build.sh
-   bash -x /paddle/paddle/scripts/docker/build.sh
-
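-Note: the commands above map the current directory (the source root) to the :code:`/paddle` directory in the container.
-
-After the build completes, two directories, ``doc/v2`` and ``doc/fluid`` , are produced, each containing three subdirectories: ``cn/html/`` , ``en/html`` and ``api/en/html`` . Enter each of these directories and run: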
-
-.. code-block:: bash
-
-   python -m SimpleHTTPServer 8088
-
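-Open http://localhost:8088 in a browser to see the generated Chinese and English documentation pages and the English API pages for both ``v2`` and ``fluid`` .
-
-If you prefer not to use Docker, you can also build the PaddlePaddle documentation directly with the following commands: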
-
-.. code-block:: bash
-
-   git clone https://github.com/PaddlePaddle/Paddle.git
-   cd Paddle
-   mkdir -p build
-   cd build
-   cmake .. -DCMAKE_BUILD_TYPE=Release -DWITH_GPU=OFF -DWITH_MKL=OFF -DWITH_DOC=ON
-
-   # To build only the user documentation, run
-   make -j $processors paddle_docs
-
-   # To build only the API documentation, run
-   make -j $processors paddle_apis
-
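-Here $processors is the number of parallel build processes, typically one per CPU core; set it according to the number of cores on your machine.
-
-After the build completes, the ``doc/v2`` and ``doc/fluid`` directories are likewise produced. Building the documentation generates the ``cn/html/`` and ``en/html`` subdirectories in each of them; building the API generates ``api/en/html`` in each of them. Enter each of these subdirectories and run: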
-
-.. code-block:: bash
-
-   python -m SimpleHTTPServer 8088
-
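-Open http://localhost:8088 in a browser to see the generated Chinese and English documentation pages and the English API pages for both ``v2`` and ``fluid`` . The figure below shows the home page of the generated ``v2`` English documentation. Note that the example uses the stock sphinx theme, so the page style differs from the official site, but this does not affect debugging.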
-
-..  image:: src/doc_en.png
-    :align: center
-    :scale: 60 %
-
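-How to Write Documentation
-============================
-
-The PaddlePaddle documentation is generated automatically with `sphinx`_ ; refer to the sphinx tutorial when writing.
-
-How to Update www.paddlepaddle.org
-====================================
-
-Submit documentation updates to GitHub as a PR; for how to submit, see `How to Contribute Documentation <http://www.paddlepaddle.org/docs/develop/documentation/zh/dev/write_docs_cn.html>`_ .
-The documentation for PaddlePaddle's develop branch is currently updated automatically; the latest versions are available as `Chinese documentation <http://www.paddlepaddle.org/docs/develop/documentation/zh/getstarted/index_cn.html>`_ and
-`English documentation <http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/index_en.html>`_ .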
-
-
-..  _cmake: https://cmake.org/
-..  _sphinx: http://www.sphinx-doc.org/en/1.4.8/
+../../../dev/write_docs_cn.rst
\ No newline at end of file
--- a/doc/fluid/dev/contribute_to_paddle_cn.md
+++ b/doc/fluid/dev/contribute_to_paddle_cn.md
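-# How to Contribute Code
-
-We sincerely appreciate your contribution. You are welcome to submit code through GitHub's fork and pull request workflow.
-
-## Code requirements
-- Code comments should follow the [Doxygen](http://www.stack.nl/~dimitri/doxygen/) style.
-- Make sure the compiler option `WITH_STYLE_CHECK` is turned on and that the build passes the code style check.
-- All code must have unit tests.
-- All unit tests must pass.
-- Follow [the conventions for submitting code](#提交代码的一些约定).
-
-The tutorial below walks you through submitting code.
-## [Fork](https://help.github.com/articles/fork-a-repo/)
-
-Go to the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) GitHub home page and click the `Fork` button to create a repository under your own account, for example <https://github.com/USERNAME/Paddle>.
-
-## Clone
-
-Clone the remote repository to your local machine: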
-
-```bash
-➜  git clone https://github.com/USERNAME/Paddle
-➜  cd Paddle
-```
-
-
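-## Create a local branch
-
-Paddle currently uses the [Git flow branching model](http://nvie.com/posts/a-successful-git-branching-model/) for development, testing, release and maintenance; see the [Paddle branching conventions](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/releasing_process.md#paddle-分支规范) for details.
-
-All feature and bug fix work should be done on a new branch, usually created from the `develop` branch.
-
-Use `git checkout -b` to create and switch to a new branch.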
-
-```bash
-➜  git checkout -b my-cool-stuff
-```
-
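-Note that the working directory of the current branch should be clean before the checkout; otherwise untracked files are carried over to the new branch. Check this with `git status`.
-
-## Use the `pre-commit` hook
-
-Paddle developers use the [pre-commit](http://pre-commit.com/) tool to manage Git pre-commit hooks. It helps format source code (C++, Python) and automatically checks a few basics before each commit (such as each file ending with exactly one EOL, and no large files being added to Git).
-
-The `pre-commit` checks are part of the unit tests in Travis-CI, and a PR that does not satisfy the hooks cannot be submitted to Paddle. First install it and run it in the current directory: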
-
-```bash
-➜  pip install pre-commit
-➜  pre-commit install
-```
-
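-Paddle uses `clang-format` to format C/C++ source code; make sure your `clang-format` version is 3.8 or later.
-
-Note: the `yapf` installed via `pip install pre-commit` differs slightly from the one installed via `conda install -c conda-forge pre-commit`; Paddle developers use `pip install pre-commit`.
-
-## Start developing
-
-In this example, I removed a line from README.md and created a new file.
-
-Use `git status` to view the current state, which lists the changes in the working directory; use `git diff` to see exactly what was modified in each file.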
-
-```bash
-➜  git status
-On branch test
-Changes not staged for commit:
-  (use "git add <file>..." to update what will be committed)
-  (use "git checkout -- <file>..." to discard changes in working directory)
-
-	modified:   README.md
-
-Untracked files:
-  (use "git add <file>..." to include in what will be committed)
-
-	test
-
-no changes added to commit (use "git add" and/or "git commit -a")
-```
-
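-## Build and test
-
-Compiling the PaddlePaddle source and generating the documentation requires a variety of development tools. For convenience, our standard development workflow packs these tools into a Docker image, called the *development image*, usually named `paddle:latest-dev` or `paddle:[version tag]-dev`, e.g. `paddle:0.11.0-dev`. Everywhere `cmake && make` would be used (for example in IDE configuration), use `docker run paddle:latest-dev` instead.
-
-To build this development image, run the following in the root of the source tree: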
-
-```bash
-➜  docker build -t paddle:latest-dev .
-```
-
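-You can then use this development image to build the PaddlePaddle source. For example, to build a PaddlePaddle that does not depend on GPU, supports the AVX instruction set, and includes unit tests: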
-
-```bash
-➜  docker run -v $(pwd):/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=ON" -e "WITH_TESTING=ON" paddle:latest-dev
-```
-
-Besides compiling PaddlePaddle into `./build/libpaddle.so` and producing a `./build/paddle.deb` file, this process also outputs a `build/Dockerfile`. Run the following command to package the compiled PaddlePaddle into a *production image* (`paddle:prod`):
-
-```bash
-➜  docker build -t paddle:prod -f build/Dockerfile .
-```
-
-To run all the unit tests, use:
-
-```bash
-➜  docker run -it -v $(pwd):/paddle paddle:latest-dev bash -c "cd /paddle/build && ctest"
-```
-
-For more about building and testing, see [Install and run with Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/v2/build_and_install/docker_install_cn.rst).
-
-## Commit
-
-Next we discard the change to README.md and then commit the newly added test file.
-
-```bash
-➜  git checkout -- README.md
-➜  git status
-On branch test
-Untracked files:
-  (use "git add <file>..." to include in what will be committed)
-
-	test
-
-nothing added to commit but untracked files present (use "git add" to track)
-➜  git add test
-```
-
-Every Git commit needs a commit message so that others know what the commit changed. Write it when running `git commit`.
-
-```bash
-➜  git commit
-CRLF end-lines remover...............................(no files to check)Skipped
-yapf.................................................(no files to check)Skipped
-Check for added large files..............................................Passed
-Check for merge conflicts................................................Passed
-Check for broken symlinks................................................Passed
-Detect Private Key...................................(no files to check)Skipped
-Fix End of Files.....................................(no files to check)Skipped
-clang-formater.......................................(no files to check)Skipped
-[my-cool-stuff c703c041] add test file
- 1 file changed, 0 insertions(+), 0 deletions(-)
- create mode 100644 233
-```
-
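-## Keep the local repository up to date
-
-Before opening a Pull Request, sync the latest code from the original repository (<https://github.com/PaddlePaddle/Paddle>).
-
-First use `git remote` to list the names of the current remotes.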
-
-```bash
-➜  git remote
-origin
-➜  git remote -v
-origin	https://github.com/USERNAME/Paddle (fetch)
-origin	https://github.com/USERNAME/Paddle (push)
-```
-
-Here origin is the name of the remote we cloned, i.e. the Paddle repository under your own username. Next, add the original Paddle repository as a remote named upstream.
-
-```bash
-➜  git remote add upstream https://github.com/PaddlePaddle/Paddle
-➜  git remote
-origin
-upstream
-```
-
-Fetch the latest code from upstream and update the current branch.
-
-```bash
-➜  git fetch upstream
-➜  git pull upstream develop
-```
-
-## Push to the remote repository
-
-Push your local changes to GitHub, i.e. to https://github.com/USERNAME/Paddle .
-
-```bash
-# Push to the my-cool-stuff branch of the remote origin
-➜  git push origin my-cool-stuff
-```
-
-## Create an Issue and complete the Pull Request
-
-Create an Issue describing the problem and note its number.
-
-Switch to the branch you created, then click `New pull request`.
-
-<img width="295" alt="screen shot 2017-04-26 at 9 09 28 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436054/a6d98c66-2ac4-11e7-9cb1-18dd13150230.png">
-
-Select the target branch:
-
-<img width="750" alt="screen shot 2017-04-26 at 9 11 52 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436139/f83b1e6c-2ac4-11e7-8c0e-add499023c46.png">
-
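-In the PR description, writing `resolve #<issue number>` closes the corresponding Issue automatically once the PR is merged; see <https://help.github.com/articles/closing-issues-via-commit-messages/>.
-
-Then wait for review. If changes are requested, follow the steps above to update the corresponding branch in origin.
-
-## Delete the remote branch
-
-After the PR is merged into the main repository, the remote branch can be deleted from the PR page.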
-
-<img width="775" alt="screen shot 2017-04-26 at 9 18 24 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436457/e4cdd472-2ac5-11e7-9272-badc76c4a23e.png">
-
-You can also delete the remote branch with `git push origin :<branch name>`, for example:
-
-```bash
-➜  git push origin :my-cool-stuff
-```
-
-## Delete the local branch
-
-Finally, delete the local branch.
-
-```bash
-# Switch to the develop branch
-➜  git checkout develop
-
-# Delete the my-cool-stuff branch
-➜  git branch -D my-cool-stuff
-```
-
-That completes one round of code contribution.
-
-## 提交代码的一些约定
-
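-So that reviewers can focus on the code itself, please follow these conventions each time you submit code:
-
-1. Make sure the unit tests in Travis-CI pass. If they do not, the submitted code has problems and reviewers will generally not review it.
-2. Before opening a Pull Request:
-   - Mind the number of commits:
-     - Reason: if only one file is changed but a dozen commits are submitted, each making a small change, this is a real burden for reviewers. They have to go through every commit to find out what changed, and changes in different commits may overwrite each other.
-     - Suggestion: keep the number of commits as small as possible. Use `git commit --amend` to add to the previous commit. For multiple commits already pushed to the remote, see [squash commits after push](http://stackoverflow.com/questions/5667884/how-to-squash-commits-in-git-after-they-have-been-pushed).
-   - Mind the name of each commit: it should reflect the commit's content and not be arbitrary.
-3. If the change resolves an Issue, add `fix #issue_number` to the **first** comment box of the Pull Request so that the Issue is closed automatically when the Pull Request is merged. Keywords include close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved; choose the appropriate one. For details see [Closing issues via commit messages](https://help.github.com/articles/closing-issues-via-commit-messages).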
-
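To make the keyword convention concrete, here is a minimal Python sketch that finds issue numbers referenced with one of the nine keywords listed above. The pattern (keyword, whitespace, `#`, digits) and the names `CLOSING_PATTERN` and `closed_issues` are assumptions for illustration; GitHub's actual matching rules may differ in detail.

```python
import re

# The nine keywords listed above, followed by "#<issue number>".
CLOSING_KEYWORDS = ("close", "closes", "closed",
                    "fix", "fixes", "fixed",
                    "resolve", "resolves", "resolved")
CLOSING_PATTERN = re.compile(
    r"\b(?:%s)\s+#(\d+)" % "|".join(CLOSING_KEYWORDS), re.IGNORECASE)


def closed_issues(text):
    """Return the issue numbers referenced with a closing keyword."""
    return [int(number) for number in CLOSING_PATTERN.findall(text)]


print(closed_issues("fix #1234: correct the gradient of mul_op"))
print(closed_issues("Related to #99, resolves #100 and Closes #101"))
```

A bare `#99` without a keyword is only a reference, so it does not appear in the result.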
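-In addition, when responding to reviewers' comments, please follow these conventions:
-
-1. Reply to every reviewer comment (this is basic courtesy in an open-source community: when someone helps, say thanks):
-   - If you agree with a comment and have made the change, a simple `Done` is enough;
-   - If you disagree with a comment, give your reasons.
-2. If there are many review comments:
-   - Summarize the overall changes.
-   - Reply using [start a review](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/) rather than replying directly, because every direct reply sends an email and floods reviewers' inboxes.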
+../../v2/dev/contribute_to_paddle_cn.md
\ No newline at end of file
--- a/doc/fluid/dev/contribute_to_paddle_en.md
+++ b/doc/fluid/dev/contribute_to_paddle_en.md
-# Contribute Code
-
-You are welcome to contribute to project PaddlePaddle. To contribute to PaddlePaddle, you have to agree with the 
-[PaddlePaddle Contributor License Agreement](https://gist.github.com/wangkuiyi/0c22c7b1bd3bb7eb27d76f85c3a3e329).
-
-We sincerely appreciate your contribution.  This document explains our workflow and work style.
-
-## Workflow
-
-PaddlePaddle uses this [Git branching model](http://nvie.com/posts/a-successful-git-branching-model/).  The following steps guide usual contributions.
-
-1. Fork
-
-   Our development community has been growing fast; it doesn't make sense for everyone to write into the official repo.  So, please file Pull Requests from your fork.  To make a fork,  just head over to the GitHub page and click the ["Fork" button](https://help.github.com/articles/fork-a-repo/).
-
-1. Clone
-
-   To make a copy of your fork on your local computer, please run
-
-   ```bash
-   git clone https://github.com/your-github-account/paddle
-   cd paddle
-   ```
-
-1. Create the local feature branch
-
-   For daily work like adding a new feature or fixing a bug, please open your feature branch before coding:
-
-   ```bash
-   git checkout -b my-cool-stuff
-   ```
-
-1. Commit
-
-   Before issuing your first `git commit` command, please install [`pre-commit`](http://pre-commit.com/) by running the following commands:
-
-   ```bash
-   pip install pre-commit
-   pre-commit install
-   ```
-
-   Our pre-commit configuration requires clang-format 3.8 for auto-formatting C/C++ code and yapf for Python.
-
-   Once installed, `pre-commit` checks the style of code and documentation in every commit.  We will see something like the following when you run `git commit`:
-
-   ```
-   ➜  git commit
-   CRLF end-lines remover...............................(no files to check)Skipped
-   yapf.................................................(no files to check)Skipped
-   Check for added large files..............................................Passed
-   Check for merge conflicts................................................Passed
-   Check for broken symlinks................................................Passed
-   Detect Private Key...................................(no files to check)Skipped
-   Fix End of Files.....................................(no files to check)Skipped
-   clang-formater.......................................(no files to check)Skipped
-   [my-cool-stuff c703c041] add test file
-    1 file changed, 0 insertions(+), 0 deletions(-)
-    create mode 100644 233
-   ```
-
-	NOTE: The `yapf` installed by `pip install pre-commit` and `conda install -c conda-forge pre-commit` is slightly different. Paddle developers use `pip install pre-commit`.
-
-1. Build and test
-
-   Users can build PaddlePaddle natively on Linux and Mac OS X.  But to unify the building environment and to make it easy for debugging, the recommended way is [using Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/howto/dev/build_en.md).
-
-1. Keep pulling
-
-   An experienced Git user pulls from the official repo often -- daily or even hourly, so they notice conflicts with others' work early, and it's easier to resolve smaller conflicts.
-
-   ```bash
-   git remote add upstream https://github.com/PaddlePaddle/Paddle
-   git pull upstream develop
-   ```
-
-1. Push and file a pull request
-
-   You can "push" your local work into your forked repo:
-
-   ```bash
-   git push origin my-cool-stuff
-   ```
-
-   The push allows you to create a pull request, requesting owners of this [official repo](https://github.com/PaddlePaddle/Paddle) to pull your change into the official one.
-
-   To create a pull request, please follow [these steps](https://help.github.com/articles/creating-a-pull-request/).
-
-   If your change fixes an issue, please write ["Fixes <issue-URL>"](https://help.github.com/articles/closing-issues-using-keywords/) in the description section of your pull request.  GitHub will close the issue when the owners merge your pull request.
-
-   Please remember to specify some reviewers for your pull request.  If you don't know who the right ones are, please follow GitHub's recommendation.
-
-
-1. Delete local and remote branches
-
-   To keep your local workspace and your fork clean, you might want to remove merged branches:
-
-   ```bash
-   git push origin :my-cool-stuff
-   git checkout develop
-   git pull upstream develop
-   git branch -d my-cool-stuff
-   ```
-
-### Code Review
-
-- Please feel free to ping your reviewers by sending them the URL of your pull request via IM or email.  Please do this after your pull request passes the CI.
-
-- Please answer reviewers' every comment.  If you are to follow the comment, please write "Done"; please give a reason otherwise.
-
-- If you don't want your reviewers to get overwhelmed by email notifications, you might reply to their comments [in a batch](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/).
-
-- Reduce the unnecessary commits.  Some developers commit often.  It is recommended to append a sequence of small changes into one commit by running `git commit --amend` instead of `git commit`.
-
-
-## Coding Standard
-
-### Code Style
-
-Our C/C++ code follows the [Google style guide](http://google.github.io/styleguide/cppguide.html).
-
-Our Python code follows the [PEP8 style guide](https://www.python.org/dev/peps/pep-0008/).
-
-Our build process helps to check the code style.  In [`build.sh`](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/paddle/scripts/docker/build.sh#L42), the entry point of our [builder Docker image](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/Dockerfile#L88), the CMake argument `WITH_STYLE_CHECK` is set to `ON` by default.  This flag is on
-
-Please install pre-commit, which automatically reformats the changes to C/C++ and Python code whenever we run `git commit`.  To check the whole codebase, we can run the command `pre-commit run -a`, as in the [`check_style.sh` file](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/paddle/scripts/travis/check_style.sh#L30), which is invoked by [our Travis CI configuration](https://github.com/PaddlePaddle/Paddle/blob/b84e8226514b8bb4405c3c28e54aa5077193d179/.travis.yml#L43).
-
-### Unit Tests
-
-Please remember to add related unit tests.
-
-- For C/C++ code, please follow [`google-test` Primer](https://github.com/google/googletest/blob/master/googletest/docs/Primer.md).
-
-- For Python code, please use [Python's standard `unittest` package](http://pythontesting.net/framework/unittest/unittest-introduction/).
-
-
-### Writing Logs
-
-We use [glog](https://github.com/google/glog) for logging in our C/C++ code.
-
-For general information, please use `LOG`.  For debug information, please use [`VLOG`](http://htmlpreview.github.io/?https://github.com/google/glog/blob/master/doc/glog.html#verbose).  The reason is explained [here](https://groups.google.com/a/chromium.org/d/msg/chromium-dev/3NDNd1KzXeY/AZKMMx37fdQJ).
-
-`VLOG` requires a *verbose level* parameter.  For example:
-
-```c++
-VLOG(3) << "Operator FC is taking " << num_inputs << " inputs.";
-```
-
-When we run a PaddlePaddle application or test, we can specify a verbose threshold.  For example:
-
-```bash
-GLOG_vmodule=buddy_allocator=2 \
-GLOG_v=10 \
-python \
-../python/paddle/v2/framework/tests/test_recurrent_op.py
-```
-
-This enables VLOG messages up to level 10 everywhere, except in `buddy_allocator.{h,cc}`, where the per-module setting limits them to level 2. You will therefore see the example VLOG message above, which is at level 3.  This suggests that we output overall messages in lower verbose levels, so they display with higher probability.  When coding C++, please follow the verbose level convention as follows:
-
-- verbose level 1: [framework](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework)
-- verbose level 3: [operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators)
-- verbose level 5: [memory](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/memory), [platform](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/platform)
-- verbose level 7: [math](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/legacy/math)
+../../v2/dev/contribute_to_paddle_en.md
\ No newline at end of file
--- a/doc/fluid/dev/releasing_process_cn.md
+++ b/doc/fluid/dev/releasing_process_cn.md
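 # PaddlePaddle Releasing Process

-PaddlePaddle uses the git-flow branching model for branch management and the [Semantic Versioning](http://semver.org/) standard for PaddlePaddle version numbers.
+PaddlePaddle uses Trunk Based Development and the [Semantic Versioning](http://semver.org/) standard for PaddlePaddle version numbers.

 Each new PaddlePaddle release follows this process:

 1. Create a new branch from `develop` named `release/<version>`, for example `release/0.10.0`.
-1. Tag the new branch with `<version>rc.<patch number>`. The first tag is `0.10.0rc1`, the second `0.10.0rc2`, and so on.
-1. For the commits of this version, do the following:
-  * Use the Regression Test List as a checklist to test the correctness of this release.
-	  * If it fails, record all failing cases, fix all bugs on this `release/<version>` branch, increment the patch number, and go back to step 2.
-	* Modify the version information in `python/setup.py.in` and set the `istaged` field to `True`.
-	* Publish the python wheel package of this version to pypi.
-	* Update the Docker images (see the operational details below).
-1. After step 3 is complete, merge the `release/<version>` branch into master and tag the merge commit on master with `<version>`. Then merge `master` into `develop`.
-1. Collaborate on writing the Release Note.
+2. Tag the new branch with `<version>-rc<patch number>`. For example, the first tag is `0.10.0-rc0`.
+3. The new branch generally does not accept new features or optimizations. QA tests on the release branch. Developers continue to work on the latest develop.
+4. Bugs found by QA and developers are fixed and verified on develop, then cherry-picked to the release branch, until the release branch is reasonably stable.
+5. If needed, put a new tag such as `0.10.0-rc1` on the latest code of the release branch so that more users can join the testing. Repeat steps 3-4.
+6. Once the release branch is stable, apply the official release tag, such as `0.10.0`.
+7. Publish the python wheel package of this version to pypi.
+8. Update the Docker images (see the operational details below).

 Notes:

-* Once the `release/<version>` branch is created, merging from `develop` into it is generally no longer allowed. This keeps the feature set of the `release/<version>` branch closed, making it easier for testers to test PaddlePaddle's behavior.
-* While the `release/<version>` branch exists, any bugfix branch must be merged into all three of `master`, `develop` and `release/<version>`.
+* Bug fixes must be made on develop first and then brought into the release branch, rather than being developed directly on the release branch.
+
+* In principle the release branch only accepts fixes, not new features.

 ## Publish the wheel package to pypi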

@@ -61,24 +60,21 @@ docker push [镜像]:[version]

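 ## PaddlePaddle branching conventions

-The PaddlePaddle development process uses the [git-flow](http://nvie.com/posts/a-successful-git-branching-model/) branching conventions, with some adjustments to fit GitHub's features.
-
-* The main PaddlePaddle repository follows the [git-flow](http://nvie.com/posts/a-successful-git-branching-model/) branching conventions, where:
-	* The `master` branch is the stable branch. Every version on `master` has passed unit tests and regression tests.
-	* The `develop` branch is the development branch. Every version on `develop` has passed unit tests but not regression tests.
-	* `release/<version>` branches are temporary branches created for each release. Code at this stage is undergoing regression testing.
+The PaddlePaddle development process uses the [Trunk Based Development](https://trunkbaseddevelopment.com/) conventions.

-* Other users' forks need not strictly follow the [git-flow](http://nvie.com/posts/a-successful-git-branching-model/) branching conventions, but every branch of every fork is effectively a feature branch.
-	* Recommended: use the fork's `develop` branch to sync with the main repository's `develop` branch.
-	* Recommended: in the fork, create your own feature branch from `develop`.
-	* When the feature branch is finished, submit a `Pull Request` to the main PaddlePaddle repository for code review.
-		* During review, developers can keep committing changes to their own feature branch.
+* The `develop` branch is the development branch. Every version on `develop` has passed unit tests and also goes through model regression tests.
+* `release/<version>` branches are temporary branches created for each release. A release branch is mainly used for testing, bug fixing and the final release.
+* The `master` branch is deprecated for historical reasons.

-* BugFix branches are also maintained in the developer's own fork. Unlike feature branches, a BugFix branch must open a `Pull Request` against each of the main repository's `master`, `develop` and any existing `release/<version>` branch.
+* Feature branches in other developers' forks.
+	* Recommended: keep the feature branch in sync with the main repository's `develop` branch.
+	* Recommended: base the feature branch on the main repository's `develop` branch.
+	* When the feature branch is finished, submit a `Pull Request` to the main PaddlePaddle repository for code review.
+		* During review, developers can keep committing changes to their own feature branch.

 ## PaddlePaddle regression test list

-This list describes the feature points that must be tested before a PaddlePaddle release.
+TODO

 ### All chapters of PaddlePaddle Book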


--- a/doc/fluid/dev/releasing_process_en.md
+++ b/doc/fluid/dev/releasing_process_en.md
 # PaddlePaddle Releasing Process

-PaddlePaddle manages its branches using "git-flow branching model", and [Semantic Versioning](http://semver.org/) as its version number semantics.
+PaddlePaddle manages its branches using Trunk Based Development, and [Semantic Versioning](http://semver.org/) as its version number semantics.

 Each time we release a new PaddlePaddle version, we should follow the below steps:

-1. Fork a new branch from `develop` named `release/[version]`, e.g. `release/0.10.0`.
-1. Push a new tag on the release branch, the tag name should be like `[version]rc.patch`. The
-   first tag should be `0.10.0rc1`, and the second should be `0.10.0rc2` and so on.
-1. After that, we should do:
-  * Run all regression tests on the Regression Test List (see PaddlePaddle TeamCity CI), to confirm
-      that this release has no major bugs.
-        * If regression test fails, we must fix those bugs and create a new `release/[version]`
-          branch from previous release branch.
-    * Modify `python/setup.py.in`, change the version number and change `ISTAGED` to `True`.
-    * Publish PaddlePaddle release wheel packages to pypi (see below instructions for detail).
-    * Update the Docker images (see below instructions for detail).
-1. After above step, merge `release/[version]` branch to master and push a tag on the master commit,
-   then merge `master` to `develop`.
-1. Update the Release Note.          
-
-***NOTE:***
-
-* Do ***NOT*** merge commits from develop branch to release branches to keep the release branch contain
-  features only for current release, so that we can test on that version.
-* If we want to fix bugs on release branches, we must merge the fix to master, develop and release branch.
+1. Create a new release branch from `develop`, named `release/[version]`, e.g. `release/0.10.0`.
+2. Create a new tag for the release branch, tag format: `[version]-rc[patch]`, e.g. the first tag is `0.10.0-rc0`.
+3. A new release branch normally doesn't accept new features or optimizations. QA will test on the release branch. Developers should develop based on the `develop` branch.
+4. If QA or developers find bugs, they should first fix and verify them on the `develop` branch, then cherry-pick the fix to the release branch. Repeat until the release branch is stable.
+5. If necessary, create a new tag on the release branch, e.g. `0.10.0-rc1`. Involve more users to try it and repeat steps 3-4.
+6. After the release branch is stable, create the official release tag, such as `0.10.0`.
+7. Release the python wheel package to pypi.
+8. Update the docker image (More details below).
+
+NOTE:
+
+* Bug fixes should happen on the `develop` branch and then be cherry-picked to the release branch. Avoid developing directly on the release branch.
+
+* The release branch normally accepts only bug fixes. Don't add new features.
+
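The tag sequence in the steps above (`0.10.0-rc0`, `0.10.0-rc1`, then `0.10.0`) can be sketched in Python as follows. The tag shape assumed here, `MAJOR.MINOR.PATCH` with an optional `-rcN` suffix, is inferred from the examples only, and `parse_tag` and `next_tag` are hypothetical helpers, not part of any PaddlePaddle release tooling.

```python
import re

# Assumed tag shape: MAJOR.MINOR.PATCH with an optional "-rcN" suffix.
TAG_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-rc(\d+))?$")


def parse_tag(tag):
    """Split a tag into (major, minor, patch, rc); rc is None for a final release."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError("unrecognized release tag: %r" % tag)
    major, minor, patch, rc = match.groups()
    return int(major), int(minor), int(patch), (None if rc is None else int(rc))


def next_tag(tag, stable=False):
    """Return the next candidate tag, or the final tag once the branch is stable."""
    major, minor, patch, rc = parse_tag(tag)
    version = "%d.%d.%d" % (major, minor, patch)
    if rc is None:
        raise ValueError("%s is already a final release tag" % tag)
    return version if stable else "%s-rc%d" % (version, rc + 1)


print(next_tag("0.10.0-rc0"))
print(next_tag("0.10.0-rc1", stable=True))
```

Refusing to advance a final tag reflects the rule that a release branch ends with the official release tag.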

 ## Publish Wheel Packages to pypi

@@ -97,26 +92,22 @@ You can then checkout the latest pushed tags at https://hub.docker.com/r/paddlep

 ## Branching Model

-We use [git-flow](http://nvie.com/posts/a-successful-git-branching-model/) as our branching model,
-with some modifications:
-
-* `master` branch is the stable branch. Each version on the master branch is tested and guaranteed.
-* `develop` branch is for development. Each commit on develop branch has passed CI unit test, but no
-  regression tests are run.
-* `release/[version]` branch is used to publish each release. Latest release version branches have
-  bugfix only for that version, but no feature updates.
-* Developer forks are not required to follow
-  [git-flow](http://nvie.com/posts/a-successful-git-branching-model/)
-  branching model, all forks are like feature branches.
-    * Advise: developer fork's develop branch is used to sync up with main repo's develop branch.
-    * Advise: developers use their fork's develop branch to fork a new branch and start developing.
-  * Use that branch on developer's fork to create pull requests and start reviews.
-      * developer can push new commits to that branch when the pull request is open.
-* Bug fixes are also started from developers forked repo. And, bug fixes branch can merge to
-  `master`, `develop` and `releases`.
+PaddlePaddle uses [Trunk Based Development](https://trunkbaseddevelopment.com/) as its branching model.
+
+* `develop` branch is used for development. Each commit to the `develop` branch goes through unit tests and model regression tests.
+* `release/[version]` branch is used for each release. The release branch is used for tests, bug fixes and the eventual release.
+* `master` branch has been deprecated for historical reasons.
+
+* Developer's feature branch:
+	* Developer's feature branch should sync with the upstream `develop` branch.
+	* Developer's feature branch should be forked from the upstream `develop` branch.
+	* After the feature branch is ready, create a `Pull Request` against the Paddle repo and go through code review.
+	   * During the review process, developers modify the code and push to their own feature branch.

 ## PaddlePaddle Regression Test List

+TODO
+
 ### All Chapters of PaddlePaddle Book

 We need to guarantee that all the chapters of PaddlePaddle Book can run correctly. Including

--- a/doc/fluid/dev/write_docs_cn.rst
+++ b/doc/fluid/dev/write_docs_cn.rst
-#############
-如何贡献文档
-#############
-
-PaddlePaddle的文档包括中英文两个部分。文档都是通过 ``cmake`` 驱动 ``sphinx`` 编译生成的，PaddlePaddle.org工具可以帮助我们实现这一编译过程，并提供更好的预览效果。
-
-如何构建文档
-============
-
-PaddlePaddle的文档构建有两种方式，分别为使用paddlepaddle.org工具和不使用paddlepaddle.org工具，两种方式都有各自的优点，前者方便预览，后者方便开发者进行调试。这两种方式中又分别有使用docker和不使用docker的两种构建方法。
-
-我们建议使用PaddlePaddle.org工具来构建文档。
-
-使用PaddlePaddle.org工具
------------------------
-这个是目前推荐的使用方法。除了可以自动编译文档，还可以直接在网页中预览文档，需要注意的是，采用后续说明的其它方式虽然也可以预览文档，但是文档的样式与官网文档是不一致的，使用PaddlePaddle.org工具进行编译才能产生与官网文档样式一致的预览效果。
-
-PaddlePaddle.org工具可以配合Docker使用，需要在系统里先安装好Docker工具包。Docker安装请参考 `Docker的官网 <https://docs.docker.com/>`_ 。安装好Docker之后即可用以下命令启动工具
-
-..  code-block:: bash
-
-    mkdir paddlepaddle # Create paddlepaddle working directory
-    cd paddlepaddle
-
-    # Clone the content repositories
-    git clone https://github.com/PaddlePaddle/Paddle.git
-    git clone https://github.com/PaddlePaddle/book.git
-    git clone https://github.com/PaddlePaddle/models.git
-    git clone https://github.com/PaddlePaddle/Mobile.git
-
-    # Please specify the working directory through -v
-    docker run -it -p 8000:8000 -v `pwd`:/var/content paddlepaddle/paddlepaddle.org:latest
-
-注意: PaddlePaddle.org 会在 -v (volume) 指定的内容存储库运行命令
-之后再用网页连到 http://localhost:8000 就可以在网页上生成需要的文档
-编译后的文件将被存储在工作目录 <paddlepaddle working directory>/.ppo_workspace/content。
-
-如果不想使用Docker，你还可以通过运行Django框架直接激活工具的服务器。使用下面的命令来运行它。
-
-..  code-block:: bash
-
-    mkdir paddlepaddle # Create paddlepaddle working directory
-    cd paddlepaddle
-
-    # Clone the content repositories and PaddlePaddle.org
-    git clone https://github.com/PaddlePaddle/Paddle.git
-    git clone https://github.com/PaddlePaddle/book.git
-    git clone https://github.com/PaddlePaddle/models.git
-    git clone https://github.com/PaddlePaddle/Mobile.git
-    git clone https://github.com/PaddlePaddle/PaddlePaddle.org.git
-
-    # Please specify the PaddlePaddle working directory. In the current setting, it should be pwd
-    export CONTENT_DIR=<path_to_paddlepaddle_working_directory>
-    export ENV=''
-    cd PaddlePaddle.org/portal/
-    pip install -r requirements.txt
-    python manage.py runserver
-
-工具服务器将读取环境变量 CONTENT_DIR 搜索代码库。请指定的PaddlePaddle工作目录给环境变量 CONTENT_DIR。
-之后再用网页连到 http://localhost:8000 就可以在网页上生成需要的文档。
-编译后的文件将被存储在工作目录 <paddlepaddle working directory>/.ppo_workspace/content。
-
-想了解更多PaddlePaddle.org工具的详细信息，可以 `点击这里 <https://github.com/PaddlePaddle/PaddlePaddle.org/blob/develop/README.cn.md>`_ 。
-
-不使用PaddlePaddle.org工具
--------------------------
-
-使用Docker构建PaddlePaddle的文档，需要在系统里先安装好Docker工具包。Docker安装请参考 `Docker的官网 <https://docs.docker.com/>`_ 。该方法与 `从源码编译PaddlePaddle <http://paddlepaddle.org/docs/develop/documentation/zh/build_and_install/build_from_source_cn.html>`_ 相似，通过从源码中构建可用于编译PaddlePaddle文档的Docker镜像并运行，在进入Docker容器后使用源码中的脚本构建PaddlePaddle文档，具体步骤如下：
-
-.. code-block:: bash
-
-   git clone https://github.com/PaddlePaddle/Paddle.git
-   cd Paddle
-
-   # 从源码中构建可用于编译PaddlePaddle文档的Docker镜像
-   docker build -t paddle:dev .
-   docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_TESTING=OFF" -e "WITH_DOC=ON" paddle:dev /bin/bash
-
-   # 进入Docker容器后使用build.sh脚本构建PaddlePaddle文档
-   bash -x /paddle/paddle/scripts/docker/build.sh
-
-注：上述命令把当前目录（源码根目录）映射为 container 里的 :code:`/paddle` 目录。
-
-编译完成后，会产生 ``doc/v2`` 和 ``doc/fluid`` 两个目录，在这两个目录下分别都生成 ``cn/html/`` 、 ``en/html`` 、 ``api/en/html`` 共三个子目录，分别进入这些目录下，执行以下命令：
-
-.. code-block:: bash
-
-   python -m SimpleHTTPServer 8088
-
-在浏览器中输入 http://localhost:8088 就可以看到编译生成的 ``v2`` 和 ``fluid`` 两种版本的中/英文的文档页面和英文的API页面。
-
-如果不想使用Docker，也可以使用以下命令直接构建PaddlePaddle文档，即
-
-.. code-block:: bash
-
-   git clone https://github.com/PaddlePaddle/Paddle.git
-   cd Paddle
-   mkdir -p build
-   cd build
-   cmake .. -DCMAKE_BUILD_TYPE=Release -DWITH_GPU=OFF -DWITH_MKL=OFF -DWITH_DOC=ON
-
-   # 如果只需要构建使用文档，则执行以下命令
-   make -j $processors paddle_docs
-
-   # 如果只需要构建API，则执行以下命令
-   make -j $processors paddle_apis
-
-其中$processors代表启动和CPU核一样多的进程来并行编译，可以根据本机的CPU核数设置相应的值。
-
-编译完成后，同样会产生 ``doc/v2`` 和 ``doc/fluid`` 两个目录，如果选择构建文档则会在这两个目录下分别都生成 ``cn/html/`` 、 ``en/html`` 两个子目录，选择构建API则会在这两个目录下分别生成 ``api/en/html`` 目录，分别进入这些子目录下，执行以下命令：
-
-.. code-block:: bash
-
-   python -m SimpleHTTPServer 8088
-
-在浏览器中输入 http://localhost:8088 就可以看到编译生成的 ``v2`` 和 ``fluid`` 两种版本的中/英文的文档页面和英文的API页面。下图为生成的 ``v2`` 英文文档首页示例。注意，示例中由于使用了sphinx的原始主题，所以页面的风格与官网并不一致，但这并不影响开发者进行调试。
-
-..  image:: src/doc_en.png
-    :align: center
-    :scale: 60 %
-
-如何书写文档
-============
-
-PaddlePaddle文档使用 `sphinx`_ 自动生成，用户可以参考sphinx教程进行书写。
-
-如何更新www.paddlepaddle.org
-============================
-
-更新的文档以PR的形式提交到github中，提交方式参见 `如何贡献文档 <http://www.paddlepaddle.org/docs/develop/documentation/zh/dev/write_docs_cn.html>`_ 。
-目前PaddlePaddle的develop分支的文档是自动触发更新的，用户可以分别查看最新的 `中文文档 <http://www.paddlepaddle.org/docs/develop/documentation/zh/getstarted/index_cn.html>`_ 和
-`英文文档 <http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/index_en.html>`_ 。
-
-
-..  _cmake: https://cmake.org/
-..  _sphinx: http://www.sphinx-doc.org/en/1.4.8/
+../../v2/dev/write_docs_cn.rst
\ No newline at end of file
--- a/doc/fluid/dev/write_docs_en.rst
+++ b/doc/fluid/dev/write_docs_en.rst
-########################
-Contribute Documentation
-########################
-
-PaddlePaddle's documentation includes both Chinese and English versions. The documentation is built using the ``cmake`` command to drive the ``sphinx`` compiler. The PaddlePaddle.org tool helps us to implement this compilation process and provides better preview results.
-
-How to build Documentation
-===========================
-
-PaddlePaddle's documentation is built in two ways: using the PaddlePaddle.org tool and without using it. Both methods have their own advantages. The former facilitates previewing, while the latter facilitates debugging by the developer. We could choose to build the documentation with Docker or without it in each of the above ways.
-
-We recommend using PaddlePaddle.org tool to build documentation.
-
-Using PaddlePaddle.org tool
-----------------------------
-This is the recommended method to build documentation, because it can automatically compile the documentation and preview the documentation directly in a web page. Note that, although you can preview the documentation in other ways, its style may not be consistent with the official website. Compiling with the PaddlePaddle.org tool produces a preview that will be consistent with the official website documentation style.
-
-The PaddlePaddle.org tool can be used with Docker and Docker needs to be installed first. Please refer to `Docker's official website <https://docs.docker.com/>`_ on how to install Docker. After installing Docker, you may use the following commands to activate the tool
-
-..  code-block:: bash
-
-    mkdir paddlepaddle # Create paddlepaddle working directory
-    cd paddlepaddle
-
-    # Clone the content repositories. You may only clone the contents you need
-    git clone https://github.com/PaddlePaddle/Paddle.git
-    git clone https://github.com/PaddlePaddle/book.git
-    git clone https://github.com/PaddlePaddle/models.git
-    git clone https://github.com/PaddlePaddle/Mobile.git
-
-    # Please specify the working directory through -v
-    docker run -it -p 8000:8000 -v `pwd`:/var/content paddlepaddle/paddlepaddle.org:latest
-
-Note: PaddlePaddle.org will read the content repos specified in the -v (volume) flag of the docker run commands
-Use a web browser and navigate to http://localhost:8000. Click the buttons to compile the documentation.
-The compiled documentations will be stored in <paddlepaddle working directory>/.ppo_workspace/content
-
-
-If you don't wish to use Docker, you can also activate the tool through Django. Use the following the commands to set up
-
-..  code-block:: bash
-
-    mkdir paddlepaddle # Create paddlepaddle working directory
-    cd paddlepaddle
-
-    # Clone the content repositories and PaddlePaddle.org
-    git clone https://github.com/PaddlePaddle/Paddle.git
-    git clone https://github.com/PaddlePaddle/book.git
-    git clone https://github.com/PaddlePaddle/models.git
-    git clone https://github.com/PaddlePaddle/Mobile.git
-    git clone https://github.com/PaddlePaddle/PaddlePaddle.org.git
-
-    # Please specify the PaddlePaddle working directory. In the current setting, it should be pwd
-    export CONTENT_DIR=<path_to_paddlepaddle_working_directory>
-    export ENV=''
-    cd PaddlePaddle.org/portal/
-    pip install -r requirements.txt
-    python manage.py runserver
-
-Specify the PaddlePaddle working directory for the environment variable CONTENT_DIR so that the tool could find where the working directory is.
-
-Use a web browser and navigate to http://localhost:8000. Click the buttons to compile the documentation
-The compiled documentations will be stored in <paddlepaddle working directory>/.ppo_workspace/content
-
-Please `click here <https://github.com/PaddlePaddle/PaddlePaddle.org/blob/develop/README.md>`_ for more information about the PaddlePaddle.org tool.
-
-
-Manually Building the Documentation
-------------------------------------
-
-Build PaddlePaddle's documentation with Docker，you need to install Docker first. Please refer to `Docker's official website <https://docs.docker.com/>`_ on how to install Docker. This method is quite similar to ` Build From Sources <http://paddlepaddle.org/docs/develop/documentation/en/build_and_install/build_from_source_en.html>`_ , by constructing, from source code, a docker image that can be used to build PaddlePaddle documentation. Enter the Docker container and use the script ``build.sh`` in the source directory to build the PaddlePaddle documentation. The specific steps are as follows:
-
-.. code-block:: bash
-
-   git clone https://github.com/PaddlePaddle/Paddle.git
-   cd Paddle
-
-   # Construct a docker image from source code
-   docker build -t paddle:dev .
-   docker run -it -v $PWD:/paddle -e "WITH_GPU=OFF" -e "WITH_TESTING=OFF" -e "WITH_DOC=ON" paddle:dev /bin/bash
-
-   # Use build.sh to build PaddlePaddle documentation
-   bash -x /paddle/paddle/scripts/docker/build.sh
-
-Note: The above commands maps the current directory (source root directory) to the :code:`/paddle` directory in the container.
-
-After compiling, there should be two generated directories: ``doc/v2`` and ``doc/fluid``, where three subdirectories ``cn/html/``, ``en/html`` and ``api/en/html`` are generated. Please enter these directories respectively and execute the following commands:
-
-.. code-block:: bash
-
-   python -m SimpleHTTPServer 8088
-
-Use a web browser and navigate to http://localhost:8000, you could see the compiled  ``v2`` 's and ``fluid`` 's Chinese/English documents page and English APIs page.
-
-If you do not wish to use Docker, you can also use the following commands to directly build the PaddlePaddle documentation.
-
-.. code-block:: bash
-
-
-   git clone https://github.com/PaddlePaddle/Paddle.git
-   cd Paddle
-   mkdir -p build
-   cd build
-   cmake .. -DCMAKE_BUILD_TYPE=Release -DWITH_GPU=OFF -DWITH_MKL=OFF -DWITH_DOC=ON
-
-   # If you only need to build documents, use the following commands
-   make -j $processors paddle_docs
-
-   # If you only need to build APIs, use the following commands
-   make -j $processors paddle_apis
-
-$processors indicates that as many processes as the CPU cores are started to compile in parallel. It should be set according to the number of CPU cores of your machine.
-
-After compiling, there also should be two generated directories: ``doc/v2`` and ``doc/fluid`` . If you chose to build documents, two subdirectories ``cn/html/`` and ``en/html``  will be generated in both two directories. If you chose to build APIs，a subdirectory ``api/en/html`` will be generated. Please enter these directories respectively and execute the following commands:
-
-.. code-block:: bash
-
-   python -m SimpleHTTPServer 8088
-
-Use a web browser and navigate to http://localhost:8000, you could see the compiled  ``v2`` 's and ``fluid`` 's Chinese/English documents page and English APIs page. The following figure is an example of the built ``v2`` 's English documents home page. Note that due to the sphinx's original theme used in the example, the style of the page is not consistent with the official website, but this does not affect the developer's debugging.
-
-..  image:: src/doc_en.png
-    :align: center
-    :scale: 60 %
-
-How to write Documentation
-===========================
-
-PaddlePaddle uses `sphinx`_ to compile documentation，Please check sphinx official website for more detail.
-
-How to update www.paddlepaddle.org
-===================================
-
-Please create PRs and submit them to github, please check `Contribute Code <http://www.paddlepaddle.org/docs/develop/documentation/en/howto/dev/contribute_to_paddle_en.html>`_ 。
-PaddlePaddle develop branch will update the documentation once the PR is merged. User may check latest `Chinese Docs <http://www.paddlepaddle.org/docs/develop/documentation/zh/getstarted/index_cn.html>`_ and
-`English Docs <http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/index_en.html>`_ 。
-
-..  _cmake: https://cmake.org/
-..  _sphinx: http://www.sphinx-doc.org/en/1.4.8/
+../../v2/dev/write_docs_en.rst
\ No newline at end of file
--- a/doc/fluid/user_guides/howto/debug/visualdl.md
+++ b/doc/fluid/user_guides/howto/debug/visualdl.md
@@ -104,6 +104,7 @@ visualDL --logdir=scratch_log --port=8080

 # 访问 http://127.0.0.1:8080
 ```
+If you see `TypeError: __init__() got an unexpected keyword argument 'file'`, the installed protobuf is older than 3.5. Running `pip install --upgrade protobuf` resolves it.
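
The protobuf requirement noted here (version 3.5 or later) can be checked before launching VisualDL. The sketch below is an illustration only: the helper names `parse_version` and `protobuf_needs_upgrade` are hypothetical, and the simple numeric parsing is an assumption that ignores suffixes such as `rc1`.

```python
def parse_version(text):
    """Turn a version string such as '3.5.1' into a tuple like (3, 5, 1).

    Only the leading digits of each dot-separated piece are used; parsing
    stops at the first piece that does not start with a digit.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def protobuf_needs_upgrade(installed, minimum="3.5"):
    """Return True when `installed` is older than `minimum` (hypothetical helper)."""
    return parse_version(installed) < parse_version(minimum)


if __name__ == "__main__":
    # Example version strings; on a real system the installed version can be
    # read from google.protobuf.__version__.
    for version in ["3.4.0", "3.5.0", "3.10.1"]:
        print(version, protobuf_needs_upgrade(version))
```

Comparing integer tuples rather than raw strings matters here, because a plain string comparison would rank `3.10.1` below `3.5`.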

 If you still run into installation problems inside a virtual environment, try the following.


--- a/doc/fluid/user_guides/howto/inference/build_and_install_lib_cn.rst
+++ b/doc/fluid/user_guides/howto/inference/build_and_install_lib_cn.rst
@@ -9,13 +9,13 @@
 ======================   ========================================
 版本说明                            C++预测库   
 ======================   ========================================
-cpu_avx_mkl              `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxCp27cp27mu/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_ 
-cpu_avx_openblas         `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxOpenblas/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_
-cpu_noavx_openblas       `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuNoavxOpenblas/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_
-cuda7.5_cudnn5_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda75cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_
-cuda8.0_cudnn5_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda80cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_
-cuda8.0_cudnn7_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda8cudnn7cp27cp27mu/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_
-cuda9.0_cudnn7_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda90cudnn7avxMkl/.lastSuccessful/fluid.tgz/?branch=0.15.0>`_
+cpu_avx_mkl              `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxCp27cp27mu/.lastSuccessful/fluid.tgz>`_ 
+cpu_avx_openblas         `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuAvxOpenblas/.lastSuccessful/fluid.tgz>`_
+cpu_noavx_openblas       `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_CpuNoavxOpenblas/.lastSuccessful/fluid.tgz>`_
+cuda7.5_cudnn5_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda75cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz>`_
+cuda8.0_cudnn5_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda80cudnn5cp27cp27mu/.lastSuccessful/fluid.tgz>`_
+cuda8.0_cudnn7_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda8cudnn7cp27cp27mu/.lastSuccessful/fluid.tgz>`_
+cuda9.0_cudnn7_avx_mkl   `fluid.tgz <https://guest:@paddleci.ngrok.io/repository/download/Manylinux1_Cuda90cudnn7avxMkl/.lastSuccessful/fluid.tgz>`_
 ======================   ========================================

 从源码编译
@@ -40,6 +40,7 @@ WITH_MKL            ON/OFF

  .. code-block:: bash

+    pip install paddlepaddle-gpu
     PADDLE_ROOT=/path/of/capi
     git clone https://github.com/PaddlePaddle/Paddle.git
     cd Paddle

--- a/doc/fluid/user_guides/howto/inference/native_infer.rst
+++ b/doc/fluid/user_guides/howto/inference/native_infer.rst
@@ -4,7 +4,7 @@ Paddle 预测 API
 为了更简单方便的预测部署，Fluid 提供了一套高层 API
 用来隐藏底层不同的优化实现。

-`预测库相关代码 <https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/fluid/inference/api>`__
+`预测库相关代码 <https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/fluid/inference/api>`_
 包括

 -  头文件 ``paddle_inference_api.h`` 定义了所有的接口

--- a/doc/fluid/user_guides/howto/prepare_data/index.rst
+++ b/doc/fluid/user_guides/howto/prepare_data/index.rst
@@ -38,7 +38,6 @@ PaddlePaddle Fluid支持两种传入数据的方式:
   :maxdepth: 2

   feeding_data
-   use_recordio_reader

 Python Reader
 #############

--- a/doc/fluid/user_guides/howto/prepare_data/use_recordio_reader.rst
+++ b/doc/fluid/user_guides/howto/prepare_data/use_recordio_reader.rst
-.. _user_guide_use_recordio_as_train_data:
-
-############################
-使用RecordIO文件作为训练数据
-############################
-
-相比于 :ref:`user_guide_use_numpy_array_as_train_data`，
-:ref:`user_guide_use_recordio_as_train_data` 的性能更好；
-但是用户需要先将训练数据集转换成RecordIO文件格式，再使用
-:code:`fluid.layers.open_files()` 层在神经网络配置中导入 RecordIO 文件。
-用户还可以使用 :code:`fluid.layers.double_buffer()` 加速数据从内存到显存的拷贝，
-使用 :code:`fluid.layers.Preprocessor` 工具进行数据增强。
-
-将训练数据转换成RecordIO文件格式
-################################
-
-:code:`fluid.recordio_writer` 中，每个记录都是一个
-:code:`vector<LoDTensor>`, 即一个支持序列信息的Tensor数组。这个数组包括训练所需
-的所有特征。例如对于图像分类来说，这个数组可以包含图片和分类标签。
-
-用户可以使用 :code:`fluid.recordio_writer.convert_reader_to_recordio_file()` 可以将
-:ref:`user_guide_reader` 转换成一个RecordIO文件。或者可以使用
-:code:`fluid.recordio_writer.convert_reader_to_recordio_files()` 将一个
-:ref:`user_guide_reader` 转换成多个RecordIO文件。
-
-具体使用方法为:
-
-.. code-block:: python
-
-   import paddle.fluid as fluid
-   import numpy
-
-   def reader_creator():
-       def __impl__():
-           for i in range(1000):
-               yield [
-                        numpy.random.random(size=[3,224,224], dtype="float32"),
-                        numpy.random.random(size=[1], dtype="int64")
-                     ]
-       return __impl__
-
-   img = fluid.layers.data(name="image", shape=[3, 224, 224])
-   label = fluid.layers.data(name="label", shape=[1], dtype="int64")
-   feeder = fluid.DataFeeder(feed_list=[img, label], place=fluid.CPUPlace())
-
-   BATCH_SIZE = 32
-   reader = paddle.batch(reader_creator(), batch_size=BATCH_SIZE)
-   fluid.recordio_writer.convert_reader_to_recordio_file(
-      "train.recordio", feeder=feeder, reader_creator=reader)
-
-其中 :code:`reader_creator` 创建了一个 :code:`Reader`。
-:ref:`_api_fluid_data_feeder_DataFeeder`
-是将 :code:`Reader` 转换成 :code:`LoDTensor` 的工具。详细请参考
-:ref:`user_guide_reader` 。
-
-上述程序将 :code:`reader_creator` 的数据转换成了 :code:`train.recordio` 文件，
-其中每一个record 含有 32 条样本。如果batch size会在训练过程中调整，
-用户可以将每一个Record的样本数设置成1。并参考
-:ref:`user_guide_use_recordio_as_train_data_use_op_create_batch`。
-
-
-配置神经网络, 打开RecordIO文件
-##############################
-
-RecordIO文件转换好之后，用户可以使用 :code:`fluid.layers.open_files()`
-打开文件，并使用 :code:`fluid.layers.read_file` 读取文件内容。
-简单使用方法如下:
-
-.. code-block:: python
-
-   import paddle.fluid as fluid
-
-   file_obj = fluid.layers.open_files(
-     filenames=["train.recordio"],
-     shape=[[3, 224, 224], [1]],
-     lod_levels=[0, 0],
-     dtypes=["float32", "int64"],
-     pass_num=100
-   )
-
-   image, label = fluid.layers.read_file(file_obj)
-
-其中如果设置了 :code:`pass_num` ，那么当所有数据读完后，会重新读取数据，
-直到读取了 :code:`pass_num` 遍。
-
-
-
-进阶使用
-########
-
-
-使用 :code:`fluid.layers.double_buffer()`
------------------------------------------
-
-:code:`Double buffer` 使用双缓冲技术，将训练数据从内存中复制到显存中。配置双缓冲
-需要使用 :code:`fluid.layers.double_buffer()` 修饰文件对象。 例如:
-
-.. code-block:: python
-
-   import paddle.fliud as fluid
-   file_obj = fluid.layers.open_files(...)
-   file_obj = fluid.layers.double_buffer(file_obj)
-
-   image, label = fluid.layers.read_file(file_obj)
-
-双缓冲技术可以参考
-`Multiple buffering <https://en.wikipedia.org/wiki/Multiple_buffering>`_ 。
-
-配置数据增强
------------
-
-使用 :code:`fluid.layers.Preprocessor` 可以配置文件的数据增强方法。例如
-
-.. code-block:: python
-
-   import paddle.fluid as fluid
-   file_obj = fluid.layers.open_files(...)
-   preprocessor = fluid.layers.Preprocessor(reader=data_file)
-   with preprocessor.block():
-       image, label = preprocessor.inputs()
-       image = image / 2
-       label = label + 1
-       preprocessor.outputs(image, label)
-
-如上代码所示，使用 :code:`Preprocessor` 定义了一个数据增强模块，并在
-:code:`with preprocessor.block()` 中定义了数据增强的具体操作。 用户通过配置
-:code:`preprocessor.inputs()` 获得数据文件中的各个字段。 并用
-:code:`preprocessor.outputs()` 标记预处理后的输出。
-
-.. _user_guide_use_recordio_as_train_data_use_op_create_batch:
-
-使用Op组batch
-------------
-
-使用 :code:`fluid.layers.batch()` 可以在训练的过程中动态的组batch。例如
-
-.. code-block:: python
-
-   import paddle.fluid as fluid
-   file_obj = fluid.layers.open_files(...)
-   file_obj = fluid.layers.batch(file_obj, batch_size=32)
-
-   img, label = fluid.layers.read_file(file_obj)
-
-需要注意的是，如果数据集中的最后几个样本不能组成 :code:`batch_size` 大小的批量数据，
-那么这几个样本直接组成一个批量数据进行训练。
-
-读入数据的shuffle
-----------------
-
-使用 :code:`fluid.layers.shuffle()` 可以在训练过程中动态重排训练数据。例如
-
-.. code-block:: python
-
-   import paddle.fluid as fluid
-   file_obj = fluid.layers.open_files(...)
-   file_obj = fliud.layers.shuffle(file_obj, buffer_size=8192)
-
-   img, label = fliud.layers.read_file(file_obj)
-
-需要注意的是:
-
-1. :code:`shuffle` 实现方法是:
-先读入 :code:`buffer_size` 条样本，再随机的选出样本进行训练。
-
-2. :code:`shuffle` 中 :code:`buffer_size` 会占用训练内存，需要确定训练过程中内存
-足够支持缓存 :code:`buffer_size` 条数据。
--- a/doc/fluid/user_guides/models/index.rst
+++ b/doc/fluid/user_guides/models/index.rst
-../../../../external/models/fluid/README.cn.rst
\ No newline at end of file
+../../../../external/models/fluid/README.cn.rst