Unverified · Commit 3fe206d6 · authored by chentianyu03 · committed via GitHub

add paddle dir (#2498)

Parent: f597cfb0

Too many changes to show: to preserve performance only 1000 of 1000+ files are displayed.
```diff
@@ -81,11 +81,9 @@ def convert_markdown_into_html(argv=None):
     for filename in args.filenames:
         with open(
                 re.sub(r"README", "index", re.sub(r"\.md$", ".html",
-                                                  filename)),
-                "w",
-                encoding="utf-8") as output:
+                filename)), "w") as output:
             output.write(HEAD)
-            with open(filename, encoding="utf-8") as input:
+            with open(filename) as input:
                 for line in input:
                     output.write(line)
             output.write(TAIL)
```
# How to contribute documentation
PaddlePaddle welcomes documentation contributions. If your written or translated documents meet our requirements, they will be published on the paddlepaddle.org website and on GitHub for PaddlePaddle users.
Paddle's documentation is mainly divided into the following modules:
- Beginner's Guide: installation instructions, basic knowledge of deep learning, learning materials, etc., designed to help users install PaddlePaddle and get started quickly;
- User Guides: data preparation, network configuration, training, debugging, inference deployment, and the model library, explaining the basic usage of PaddlePaddle;
- Advanced User Guides: server-side and mobile-side deployment, how to contribute code or documentation, how to tune performance, etc., designed to meet the needs of developers;
Our documentation accepts contributions in [reStructuredText](http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html) and [Markdown](https://guides.github.com/features/mastering-markdown/) (GitHub style) format.
Once a document is written, you can use the preview tool to check how it will appear on the official website and verify that it renders correctly.
## How to use the preview tool
If you are modifying a code document (i.e. an API) and using PaddlePaddle in a Docker container, perform the following steps inside that container, because the API document generator depends on PaddlePaddle.
If you have only improved text or media content (no need to install or build PaddlePaddle), or are building PaddlePaddle on the host machine, perform the following steps on the host.
### 1. Clone the repositories you wish to update or test:
First download the full documentation repository; `--recurse-submodules` also checks out the submodules of FluidDoc (all submodules live in `FluidDoc/external`) so that every document renders properly:
```
git clone --recurse-submodules https://github.com/PaddlePaddle/FluidDoc
```
Other repositories you may pull are:
```
git clone https://github.com/PaddlePaddle/book.git
git clone https://github.com/PaddlePaddle/models.git
git clone https://github.com/PaddlePaddle/Mobile.git
```
You can place these local copies in any directory on your computer; we will point PaddlePaddle.org at these repositories when starting it later.
### 2. Pull PaddlePaddle.org in a new directory and install its dependencies
Before doing this, make sure your operating system has the Python build dependencies installed.
Taking Ubuntu as an example, run:
```
sudo apt-get update && sudo apt-get install -y python-dev build-essential
```
Then:
```
git clone https://github.com/PaddlePaddle/PaddlePaddle.org.git
cd PaddlePaddle.org/portal
```
Next, install the dependencies. Make sure to install them under Python 2.7.15 or 2.7.16; we recommend creating a suitable virtual environment with Anaconda or virtualenv first.
Install the dependencies:
```
pip install -r requirements.txt
```
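For example, a minimal virtualenv setup might look like the sketch below (a hedged example: it assumes a `python2.7` interpreter is already on your PATH, and the environment name `venv` is arbitrary):
```bash
# Create and activate an isolated Python 2.7 environment, then install
# the portal's dependencies into it.
virtualenv -p python2.7 venv
source venv/bin/activate
pip install -r requirements.txt
```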
**Optional**: If you wish to work on the Chinese/English site conversion to improve PaddlePaddle.org, please install [GNU gettext](https://www.gnu.org/software/gettext/).
### 3. Run PaddlePaddle.org locally
Pass the list of directories you wish to load and build content from (options include: --paddle, --book, --models, --mobile).
Run:
```
./runserver --paddle <path_to_FluidDoc_dir>
```
**Note:** `<path_to_FluidDoc_dir>` is the local path of the FluidDoc copy from step 1.
If you need to work on documents that depend on content from the `book`, `models`, or `mobile` repositories, add one or more of the optional flags:
```
./runserver --paddle <path_to_fluiddoc_dir> \
    --book <path_to_fluiddoc_dir>/external/book \
    --models <path_to_fluiddoc_dir>/external/models \
    --mobile <path_to_fluiddoc_dir>/external/mobile
```
Then: open your browser and navigate to http://localhost:8000.
> *The site may take a few seconds to load because building takes a certain amount of time.*
> *If you are running these steps in a Docker environment, check the IP and make sure port 8000 is mapped to your host.*
## Contribute new documentation or update the API
All content should be written in [Markdown](https://guides.github.com/features/mastering-markdown/) (GitHub style), although there is some legacy content in .rst format in the documentation.
After completing the installation steps, you will also need to:
- Before you start writing, we recommend reviewing the following guidelines on contributing content.
---
**Contribute new documents**
- Create a new `.md` file or modify an existing article in the repository you are working on
- Add the new document's name to the corresponding index file
---
**Contribute or modify the Python API**
In the Docker container used to compile the code, or at the corresponding location on the host:
- Run the script `paddle/scripts/paddle_build.sh` (under the Paddle repo)
```bash
# Build paddle's python library
cd Paddle
./paddle/scripts/paddle_docker_build.sh gen_doc_lib full
cd ..
```
- Run the preview tool
```
# Run the preview tool in the docker image used to compile paddle
docker run -it -v /Users/xxxx/workspace/paddlepaddle_workplace:/workplace -p 8000:8000 [images_id] /bin/bash
```
> Replace `/Users/xxxx/workspace/paddlepaddle_workplace` with your local paddle workspace and `/workplace` with the corresponding working directory inside Docker. This mapping lets us build the python library, modify FluidDoc, and use the preview tool all at the same time.
> [images_id] is the image id of the paddlepaddle image you use in Docker.
- Set environment variables
```
# In the docker environment
# Set `PYTHONPATH` so the preview tool can find paddle's python library
export PYTHONPATH=/workplace/Paddle/build/python/
```
- Clean up old files
```
# Remove previously generated files; skip this step if you are using the preview tool for the first time
rm -rf /workplace/FluidDoc/doc/fluid/menu.json /workplace/FluidDoc/doc/fluid/api/menu.json /tmp/docs/ /tmp/api/
```
- Launch the preview tool
```
cd /workplace/PaddlePaddle.org/portal
pip install -r requirements.txt
./runserver --paddle /workplace/FluidDoc/
```
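Once the server reports it is running, a quick way to confirm it is reachable (a hedged example; the endpoint is simply the portal's root page) is:
```bash
# Expect an HTTP 200 response once the portal has finished starting up.
curl -I http://localhost:8000
```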
---
**Preview the changes**
Open your browser and navigate to http://localhost:8000.
On the page you are updating, click Refresh Content in the top right corner.
When you first enter a documentation page, the API section contains no content. To preview the API documents, click the API directory; after a few minutes you will see the generated API reference.
## Submit changes
If you wish to modify code, please follow [How to contribute code](../contribute_code/index_cn.html) under the `Paddle` repository.
If you only modify documentation:
- If the changes are in the `doc` folder, you only need to submit a `PR` in the `FluidDoc` repository.
- If the changes are in the `external` folder:
1. Submit a PR in the repository you modified, because the `FluidDoc` repository is only a wrapper that brings together links to other repositories ("submodules" in git terms).
2. Once your changes are approved, update the corresponding `submodule` in FluidDoc to the latest commit-id of the source repository.
> For example, if you updated documents on the develop branch of the book repository:
> - Go to the `FluidDoc/external/book` directory
> - Update the commit-id to the latest commit: `git pull origin develop`
> - Commit your changes in `FluidDoc`
3. Submit a PR for your changes in the `FluidDoc` repository
The steps for submitting changes and PRs are described in [How to contribute code](../contribute_code/index_cn.html)
## Help improve the preview tool
We welcome contributions to all aspects of the platform and supported content. You can fork or clone the repository, ask questions and give feedback, or file bugs on issues. See the [Development Guide](https://github.com/PaddlePaddle/PaddlePaddle.org/blob/develop/DEVELOPING.md) for details.
## Copyright and Licensing
PaddlePaddle.org is provided under the Apache-2.0 license.
# How to contribute documentation
PaddlePaddle encourages you to contribute documentation. If your written or translated documents meet our requirements, they will be published on the paddlepaddle.org website and on GitHub for PaddlePaddle users.
Paddle's documentation is mainly divided into the following modules:
- Beginner's Guide: installation instructions, basic knowledge of deep learning, learning materials, etc., designed to help users get started quickly;
- User Guides: data preparation, network configuration, training, debugging, inference deployment, and the model library, providing a tutorial of basic operations in PaddlePaddle;
- Advanced User Guides: server-side and mobile-side deployment, how to contribute code or documentation, how to optimize performance, etc., designed to meet the needs of developers;
Our documentation supports contributions in [reStructuredText](http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html) and [Markdown](https://guides.github.com/features/mastering-markdown/) (GitHub style) format.
Once a document is written, you can use the preview tool to check how it appears on the official website and verify that it is displayed correctly.
## How to use the preview tool
If you are modifying a code document (i.e. an API) and using PaddlePaddle in a Docker container, perform the following steps in that container, because the API document generator depends on PaddlePaddle.
If you have only improved text or media content (no need to install or build PaddlePaddle), or are building PaddlePaddle on the host machine, please continue with the following steps on the host.
### 1. Clone the related repositories that you wish to update or test:
First download the full documentation repository, where `--recurse-submodules` will also update the submodules in FluidDoc (all submodules are in `FluidDoc/external`) to ensure that all documents are displayed properly:
```
git clone --recurse-submodules https://github.com/PaddlePaddle/FluidDoc
```
Other pullable repositories are:
```
git clone https://github.com/PaddlePaddle/book.git
git clone https://github.com/PaddlePaddle/models.git
git clone https://github.com/PaddlePaddle/Mobile.git
```
You can place these local copies in any directory on your computer, and we will specify the location of these repositories when we start PaddlePaddle.org.
### 2. Pull PaddlePaddle.org in a new directory and install its dependencies
Before doing this, please make sure your operating system has python dependencies installed.
Take the ubuntu system as an example, run:
```
sudo apt-get update && sudo apt-get install -y python-dev build-essential
```
Then:
```
git clone https://github.com/PaddlePaddle/PaddlePaddle.org.git
cd PaddlePaddle.org/portal
```
Then install the requirements. Make sure to install them under Python 2.7.15 or 2.7.16; we recommend creating an appropriate virtual environment with Anaconda or virtualenv first.
Install requirements:
```
pip install -r requirements.txt
```
**Optional**: If you wish to work on the Chinese-English website conversion to improve PaddlePaddle.org, please install [GNU gettext](https://www.gnu.org/software/gettext/).
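On Debian-based systems it can typically be installed from the system package manager (a hedged example; `gettext` is the usual package name):
```bash
sudo apt-get install -y gettext
```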
### 3. Run PaddlePaddle.org locally
Add the list of directories you want to load and build content from (options include: --paddle, --book, --models, --mobile).
Run:
```
./runserver --paddle <path_to_FluidDoc_dir>
```
**Note:** `<path_to_FluidDoc_dir>` is the directory of the local FluidDoc copy specified in the first step.
If you need to work with documents that depend on the contents of the `book`, `models`, or `mobile` repositories, you can add one or more options:
```
./runserver --paddle <path_to_fluiddoc_dir> \
--book <path_to_fluiddoc_dir>/external/book \
--models <path_to_fluiddoc_dir>/external/models \
--mobile <path_to_fluiddoc_dir>/external/mobile
```
Then: open your browser and navigate to http://localhost:8000.
> *The site may take a few seconds to load because building takes a certain amount of time.*
> *If you are running these steps in a docker environment, please check the IP to make sure port 8000 can be mapped to your host.*
## Contribute new documentation or update API
All content should be written in [Markdown](https://guides.github.com/features/mastering-markdown/) (GitHub style) (although there are some legacy content with the .rst format in the documentation).
After completing the installation, you will also need to:
- Before you start writing, we suggest that you review the following tips for contributing content.
---
**Contribute new documents**
- Create a new `.md` file or modify an existing article in the repository you are currently working on
- Add the new document name to the corresponding index file
---
**Contribute or modify the Python API**
In the docker container that compiles the code, or the corresponding location in the host machine:
- Run the script `paddle/scripts/paddle_build.sh` (under Paddle repo)
```bash
# Compile paddle's python library
cd Paddle
./paddle/scripts/paddle_docker_build.sh gen_doc_lib full
cd ..
```
- Run the preview tool
```
# Run the preview tool in docker image which compiled paddle
docker run -it -v /Users/xxxx/workspace/paddlepaddle_workplace:/workplace -p 8000:8000 [images_id] /bin/bash
```
> Where `/Users/xxxx/workspace/paddlepaddle_workplace` should be replaced with your local host paddle workspace directory, `/workplace` should be replaced with the working directory in the docker. This mapping will ensure that we compile the python library, modify FluidDoc and use the preview tool at the same time.
> [images_id] is the image id of the paddlepaddle you use in docker.
- Set environment variables
```
# In docker environment
# Set the environment variable `PYTHONPATH` so that the preview tool can find the python library for paddle
export PYTHONPATH=/workplace/Paddle/build/python/
```
- Clean up old files
```
# Clear the previously generated file, if you are using the preview tool for the first time, you can skip this step
rm -rf /workplace/FluidDoc/doc/fluid/menu.json /workplace/FluidDoc/doc/fluid/api/menu.json /tmp/docs/ /tmp/api/
```
- Launch preview tool
```
cd /workplace/PaddlePaddle.org/portal
pip install -r requirements.txt
./runserver --paddle /workplace/FluidDoc/
```
---
**Preview modification**
Open your browser and navigate to http://localhost:8000 .
On the page to be updated, click Refresh Content at the top right corner.
After entering a documentation page, the API section does not contain content yet. To preview the API documents, please click on the API directory; after a few minutes you will see the generated API reference.
## Submit changes
If you wish to modify the code, please refer to [How to contribute code](../contribute_code/index_en.html) under the `Paddle` repository.
If you just modify the document:
- The modified content is in the `doc` folder, you only need to submit `PR` in the `FluidDoc` repository.
- The modified content is in the `external` folder:
1. Submit the PR in the repository you modified. This is because the `FluidDoc` repository is just a wrapper that brings together the links of other repositories (namely, the "submodules" in git context).
2. When your changes are approved, update the corresponding `submodule` in FluidDoc to the latest commit-id of the source repository.
> For example, you updated the document on the develop branch in the book repository:
> - Go to the `FluidDoc/external/book` directory
> - Update commit-id to the latest commit: `git pull origin develop`
> - Commit your changes in `FluidDoc`
3. Submit a PR for your changes in the `FluidDoc` repository
The steps for submitting changes and PRs are described in [How to contribute code](../contribute_code/index_en.html)
## Help improve preview tool
We encourage your contributions to all aspects of the platform and supportive contents. You can Fork or Clone repository, ask questions and feedback, or submit bugs on issues. For details, please refer to the [Development Guide](https://github.com/PaddlePaddle/PaddlePaddle.org/blob/develop/DEVELOPING.md).
## Copyright and Licensing
PaddlePaddle.org is available under the Apache-2.0 license.
.. _contribute_to_paddle_faq:

###################
FAQ
###################

.. contents::

1. What should I do if the CLA signing fails?
---------------------------------------------

Since the `CLA <https://github.com/cla-assistant/cla-assistant>`_ is a third-party open-source service, it can be unstable at times. If you are sure you have signed the CLA but it was not triggered successfully, you can try the following:

* Close and reopen this PR to re-trigger the CLA: click :code:`Close pull request`, then :code:`Reopen pull request`, and wait a few minutes.
* If the above has no effect after two attempts, please open a new PR or leave a message in the comment area.

2. What should I do if CI is not triggered?
-------------------------------------------

* Add the correct CI trigger rule to your commit message:

  * for the develop branch, add :code:`test=develop`
  * for a release branch, add e.g. :code:`test=release/1.4` to trigger CI for the release/1.4 branch
  * for a documentation preview, add :code:`test=document_preview`

* The CI trigger rule works per commit: for the same PR, even if earlier commits already carried the keyword, a new commit that should trigger CI must carry it again.
* If some CI jobs still fail to trigger after adding the rule, close and reopen this PR to re-trigger CI.
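A hedged example of a commit whose message carries the develop-branch trigger keyword (the message text itself is arbitrary):

.. code-block:: bash

    # The trailing keyword is what the CI system scans for.
    git commit -m "fix typos in api docs, test=develop"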
3. What should I do if CI fails randomly, i.e. the error is unrelated to this PR?
----------------------------------------------------------------------------------

Because the code on the develop branch can be unstable, CI may fail randomly.
If you are sure the CI error is unrelated to this PR, please paste a screenshot of the error and the error link in the comment area.

4. How do I modify API.spec?
----------------------------

To keep the API interfaces and documentation stable, we monitor the APIs through the API.spec file.
For how to modify it, please refer to `diff_api.py <https://github.com/PaddlePaddle/Paddle/blob/ddfc823c73934d483df36fa9a8b96e67b19b67b4/tools/diff_api.py#L29-L34>`_.
**Note**: after submitting the PR, please check the diff and make sure you have not touched any API outside the scope of this PR.
######################
How to contribute code
######################

.. toctree::
    :maxdepth: 1

    local_dev_guide.md
    submit_pr_guide.md
    faq.rst
#################################
How to contribute code to Paddle
#################################
.. toctree::
:maxdepth: 1
local_dev_guide_en.md
submit_pr_guide_en.md
# Guide of local development
This document will guide you through developing code locally.
## Requirements of coding
- Please follow the [Doxygen](http://www.doxygen.nl/) style for code comments.
- Make sure the compiler option `WITH_STYLE_CHECK` is on and that the build passes the code style check.
- All code must have unit tests.
- Pass all unit tests.
- Please follow the [regulations about submitting code](#regulations-about-submitting-code).
The following tutorial will guide you through submitting code.
## [Fork](https://help.github.com/articles/fork-a-repo/)
Go to the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) GitHub home page and click the `Fork` button to create a repository under your own account, such as <https://github.com/USERNAME/Paddle>.
## Clone
Clone the remote repository to your local machine:
```bash
➜  git clone https://github.com/USERNAME/Paddle
cd Paddle
```
## Create a local branch
Paddle currently uses the [Git branching model](http://nvie.com/posts/a-successful-git-branching-model/) for development, testing, release, and maintenance; see the [Paddle branch regulations](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/others/releasing_process.md) for details.
All feature and bug-fix work should be done on a new branch, generally created from the `develop` branch.
Use `git checkout -b` to create and switch to a new branch.
```bash
➜  git checkout -b my-cool-stuff
```
Note that before the checkout you need to keep the current branch directory clean, otherwise untracked files will be carried over to the new branch; this can be checked with `git status`.
## Use `pre-commit` hooks
Paddle developers use the [pre-commit](http://pre-commit.com/) tool to manage Git pre-commit hooks. It helps us format source code (C++, Python) and automatically check some basic things before committing (such as having only one EOL per file, and not adding large files to Git).
The `pre-commit` checks are part of the unit tests in Travis-CI; a PR that does not satisfy the hooks cannot be submitted to Paddle. Install it first and run it in the current directory:
```bash
➜  pip install pre-commit
➜  pre-commit install
```
Paddle uses `clang-format` to adjust the C/C++ source code format; make sure the `clang-format` version is above 3.8.
Note: the `yapf` installed via `pip install pre-commit` differs slightly from the one installed via `conda install -c conda-forge pre-commit`; Paddle developers use `pip install pre-commit`.
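Before committing, you can check the tool version and run all hooks by hand; a hedged example using standard pre-commit and clang-format invocations:
```bash
# Verify the clang-format version is above 3.8, then run every configured
# hook against the whole working tree instead of waiting for `git commit`.
clang-format --version
pre-commit run --all-files
```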
## Start development
In this example, I delete a line from README.md and create a new file.
Check the current state via `git status`, which shows the changes in the current directory; you can also inspect the exact modifications to a file via `git diff`.
```bash
➜ git status
On branch test
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: README.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
no changes added to commit (use "git add" and/or "git commit -a")
```
## Build and unit tests
For building the PaddlePaddle source code, see [Compile From Source Code](../../../install/compile/fromsource.html) and choose the corresponding operating system.
For unit tests, see [Op Unit Tests](../new_op/new_op.html#id7) for how to run them.
## Commit
Next we discard the change to the README.md file and commit the newly added test file.
```bash
➜ git checkout -- README.md
➜ git status
On branch test
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
nothing added to commit but untracked files present (use "git add" to track)
➜ git add test
```
Every Git commit needs a commit message, which lets others know what changes the commit makes; this is done via `git commit`.
```bash
➜ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
```
## Keep your local repository up to date
Before opening a Pull Request, you need to synchronize with the latest code in the original repository (<https://github.com/PaddlePaddle/Paddle>).
First check the name of the current remote repository with `git remote`.
```bash
➜ git remote
origin
➜ git remote -v
origin https://github.com/USERNAME/Paddle (fetch)
origin https://github.com/USERNAME/Paddle (push)
```
Here origin is the name of the remote repository we cloned, i.e. the Paddle under your own username. Next we add a remote for the original Paddle repository, named upstream.
```bash
➜ git remote add upstream https://github.com/PaddlePaddle/Paddle
➜ git remote
origin
upstream
```
Fetch the latest code from upstream and update the current branch.
```bash
➜ git fetch upstream
➜ git pull upstream develop
```
## Push to the remote repository
Push your local changes to GitHub, i.e. https://github.com/USERNAME/Paddle.
```bash
# push to the my-cool-stuff branch of the remote repository origin
➜ git push origin my-cool-stuff
```
# Guide of local development
This document will guide you through developing code in a local environment.
## Requirements of coding
- Please follow the code comment style of [Doxygen](http://www.doxygen.nl/).
- Make sure the build option `WITH_STYLE_CHECK` is on and that the build passes the code style check.
- Unit tests are needed for all code.
- Pass all unit tests.
- Please follow the [regulations about submitting code](#certain-regulations-about-submitting-code).
The following guidance tells you how to submit code.
## [Fork](https://help.github.com/articles/fork-a-repo/)
Go to the GitHub home page of [PaddlePaddle](https://github.com/PaddlePaddle/Paddle), then click the `Fork` button to generate a repository under your own account, such as <https://github.com/USERNAME/Paddle>.
## Clone
Clone remote git to local:
```bash
➜ git clone https://github.com/USERNAME/Paddle
cd Paddle
```
## Create local branch
At present, the [Git branching model](http://nvie.com/posts/a-successful-git-branching-model/) is applied to Paddle for development, testing, release, and maintenance. Please refer to the [branch regulation of Paddle](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/others/releasing_process.md) for details.
All development work on features and bug fixes should be finished in a new branch, generally extended from the `develop` branch.
Create and switch to a new branch with the command `git checkout -b`.
```bash
➜ git checkout -b my-cool-stuff
```
It is worth noting that before the checkout you need to keep the current branch directory clean, otherwise untracked files will be brought to the new branch; this can be checked with `git status`.
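For example, one way to check for a clean tree and set aside any leftover changes before branching (a hedged sketch using standard git commands; `git stash -u` also stashes untracked files):
```bash
# An empty output from --porcelain means the working tree is clean.
git status --porcelain
# Otherwise, stash tracked and untracked changes before creating the branch.
git stash -u
```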
## Use `pre-commit` hook
Paddle developers use the [pre-commit](http://pre-commit.com/) tool to manage Git pre-commit hooks. It helps us format the source code (C++, Python) and automatically check some basic things before committing (such as having only one EOL per file, not adding large files in Git, etc.).
The `pre-commit` test is part of the unit test in Travis-CI. A PR that does not satisfy the hook cannot be submitted to Paddle. Install `pre-commit` first and then run it in current directory:
```bash
➜ pip install pre-commit
➜ pre-commit install
```
Paddle uses `clang-format` to adjust the format of C/C++ source code. Make sure the version of `clang-format` is above 3.8.
Note: the `yapf` installed with `pip install pre-commit` differs slightly from the one installed with `conda install -c conda-forge pre-commit`. Paddle developers use `pip install pre-commit`.
## Start development
In this example, I delete a line from README.md and create a new file.
View the current state via `git status`, which will show the changes in the current directory; you can also view a file's specific changes via `git diff`.
```bash
➜ git status
On branch test
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: README.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
no changes added to commit (use "git add" and/or "git commit -a")
```
## Build and test
Please refer to [Compile From Source Code](../../../install/compile/fromsource_en.html) for more information about building the PaddlePaddle source code.
Please refer to [Op Unit Tests](../new_op/new_op_en.html#unit-tests) for more information about running unit tests.
## Commit
Next we discard the modification of README.md and commit the newly added test file.
```bash
➜ git checkout -- README.md
➜ git status
On branch test
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
nothing added to commit but untracked files present (use "git add" to track)
➜ git add test
```
A commit message is required for every Git commit, through which other developers will know what changes have been made. Type `git commit` to create one.
```bash
➜ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
```
## Keep your local repository up to date
You need to keep up with the latest code of the original repository (<https://github.com/PaddlePaddle/Paddle>) before opening a Pull Request.
Check the name of the current remote repository with `git remote`.
```bash
➜ git remote
origin
➜ git remote -v
origin https://github.com/USERNAME/Paddle (fetch)
origin https://github.com/USERNAME/Paddle (push)
```
origin is the name of the remote repository that we cloned, which is also the Paddle under your own account. Next we add a remote for the original Paddle repository and name it upstream.
```bash
➜ git remote add upstream https://github.com/PaddlePaddle/Paddle
➜ git remote
origin
upstream
```
Get the latest code of upstream and update current branch.
```bash
➜ git fetch upstream
➜ git pull upstream develop
```
## Push to remote repository
Push your local modifications to GitHub (https://github.com/USERNAME/Paddle).
```bash
# push to the my-cool-stuff branch of the remote repository origin
➜ git push origin my-cool-stuff
```
# Notes on submitting a PR
## Create an Issue and finish a Pull Request
Create an Issue describing the problem and record its number.
Switch to the branch you created, then click `New pull request`.
<img width="295" alt="screen shot 2017-04-26 at 9 09 28 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436054/a6d98c66-2ac4-11e7-9cb1-18dd13150230.png">
Select the target branch:
<img width="750" alt="screen shot 2017-04-26 at 9 11 52 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436139/f83b1e6c-2ac4-11e7-8c0e-add499023c46.png">
Writing `resolve #<Issue number>` in the PR description will automatically close the corresponding Issue after this PR is merged; see [here](https://help.github.com/articles/closing-issues-via-commit-messages/) for details.
Next, wait for review. If anything needs to be changed, update the corresponding branch in origin following the steps above.
## Sign the CLA and pass the unit tests
### Sign the CLA
When you submit a Pull Request to PaddlePaddle for the first time, you need to sign the CLA (Contributor License Agreement) once so that your code can be merged. The steps are as follows:
- In the Check section of the PR, find license/cla and click detail on the right to enter the CLA website
<div align="center">
<img src="https://github.com/PaddlePaddle/FluidDoc/blob/release/1.1/doc/fluid/advanced_usage/development/contribute_to_paddle/img/cla_unsigned.png?raw=true" height="40" width="500">
</div>
- Click "Sign in with GitHub to agree" on the CLA website; after that you will be redirected back to your Pull Request page
<div align="center">
<img src="https://github.com/PaddlePaddle/FluidDoc/blob/release/1.1/doc/fluid/advanced_usage/development/contribute_to_paddle/img/sign_cla.png?raw=true" height="330" width="400">
</div>
### Pass the unit tests
Every new commit in your Pull Request triggers the CI unit tests; make sure your commit message contains the necessary trigger keywords, see [commit](local_dev_guide.html#permalink-8--commit-).
Please follow the progress of the CI unit tests in your Pull Request; they will finish within a few hours.
You only need to pay attention to the CI jobs relevant to the branch you submitted to; for example, if you submit code to the develop branch, there is no need to check whether the release/1.1 job passes.
When green check marks appear after all required tests, your commit has passed all unit tests.
If a red cross appears after a required test, your commit has failed that unit test. In this case, click detail to view the error details, take a screenshot of the error, and add it as a comment in your Pull Request; our staff will help you look into it.
## Delete the remote branch
After the PR is merged into the main repository, we can delete the branch of the remote repository on the PR page.
<img width="775" alt="screen shot 2017-04-26 at 9 18 24 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436457/e4cdd472-2ac5-11e7-9272-badc76c4a23e.png">
You can also delete the remote branch with `git push origin :<branch_name>`, for example:
```bash
➜ git push origin :my-cool-stuff
```
## Delete the local branch
Finally, delete the local branch.
```bash
# Switch to the develop branch
➜  git checkout develop
# Delete the my-cool-stuff branch
➜  git branch -D my-cool-stuff
```
At this point we have completed one full round of code contribution.
## Regulations about submitting code
So that reviewers can focus on the code itself during review, please follow these conventions every time you submit code:
1) Make sure the unit tests in Travis-CI pass. If they do not, the submitted code has problems and reviewers generally will not review it.
2) Before submitting a Pull Request:
- Pay attention to the number of commits:
Reason: if only one file is modified but a dozen commits are submitted, each with only a small change, it puts a heavy burden on reviewers. They have to inspect each commit one by one to understand the changes, and the changes in different commits may even overwrite each other.
Suggestion: keep as few commits as possible per submission; you can fold follow-up changes into the previous commit with `git commit --amend`. For multiple commits already pushed to the remote repository, see [squash commits after push](http://stackoverflow.com/questions/5667884/how-to-squash-commits-in-git-after-they-have-been-pushed); a sketch follows this list.
- Pay attention to the name of each commit: it should reflect the content of the commit and not be arbitrary.
3) If you solved an Issue, add `fix #issue_number` to the **first** comment box of the Pull Request; the corresponding Issue will then be closed automatically when the Pull Request is merged. Keywords include: close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved; choose the appropriate one. See [Closing issues via commit messages](https://help.github.com/articles/closing-issues-via-commit-messages) for details.
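As referenced above, a hedged sketch of squashing the last few commits before review (this rewrites history on your fork, so the push must be forced; the count `3` is just an example):
```bash
# Interactively mark the later commits as "squash"/"fixup", then rewrite
# the already-pushed branch on your fork.
git rebase -i HEAD~3
git push -f origin my-cool-stuff
```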
In addition, when replying to reviewers' comments, please follow these conventions:
1) Reply to every comment from reviewers (this is basic courtesy in the open-source community; when someone helps, say thank you):
- If you agree with a comment and have made the change, a simple `Done` is enough;
- If you disagree with a comment, please give your reasons.
2) If there are many review comments:
- Please give an overall summary of the changes.
- Please reply using [start a review](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/) rather than replying to each comment directly, because every direct reply sends an email and causes an email flood.
# Guide of submitting PR to GitHub
## Create an Issue and finish Pull Request
Create an Issue to describe your problem and record its number.
Switch to the branch you have created and click `New pull request`.
<img width="295" alt="screen shot 2017-04-26 at 9 09 28 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436054/a6d98c66-2ac4-11e7-9cb1-18dd13150230.png">
Select the target branch:
<img width="750" alt="screen shot 2017-04-26 at 9 11 52 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436139/f83b1e6c-2ac4-11e7-8c0e-add499023c46.png">
A note of `resolve #<Issue number>` in the PR description results in the automatic close of the corresponding Issue after the merge of the PR. More details can be viewed [here](https://help.github.com/articles/closing-issues-via-commit-messages/).
Then please wait for review. If any modification is needed, you can update the corresponding branch in origin following the steps above.
## Sign CLA and pass unit tests
### Sign CLA
The first time you submit a Pull Request, you need to sign the CLA (Contributor License Agreement) to ensure your code can be merged. The specific steps are as follows:
- Please check the Check section in the PR to find license/cla, and click detail on the right to enter the CLA website.
<div align="center">
<img src="https://github.com/PaddlePaddle/FluidDoc/blob/release/1.1/doc/fluid/advanced_usage/development/contribute_to_paddle/img/cla_unsigned.png?raw=true" height="40" width="500">
</div>
- Please click "Sign in with GitHub to agree" on the CLA website. It will redirect back to your Pull Request page after the click.
<div align="center">
<img src="https://github.com/PaddlePaddle/FluidDoc/blob/release/1.1/doc/fluid/advanced_usage/development/contribute_to_paddle/img/sign_cla.png?raw=true" height="330" width="400">
</div>
### Pass unit tests
Every new commit in your Pull Request will trigger the CI unit tests, so please make sure the necessary trigger notes have been included in your commit message. Please refer to [commit](local_dev_guide.html#permalink-8--commit-).
Please keep an eye on the CI unit tests in your Pull Request; they will finish within several hours.
You only need to focus on the CI jobs associated with the branch you submitted to. For example, there is no need to check whether release/1.1 passes if you submit code to the develop branch.
Green check marks after all tests mean that your commit has passed all unit tests.
A red cross after a test means your commit has not passed that unit test. In that case, please click detail to view the error details, take a screenshot of the error, and add it as a comment in your Pull Request. Our staff will help you check it.
## Delete remote branch
After your PR has been merged into the main repository, you can delete the remote branch on the PR page.
<img width="775" alt="screen shot 2017-04-26 at 9 18 24 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436457/e4cdd472-2ac5-11e7-9272-badc76c4a23e.png">
You can also delete the remote branch with `git push origin :the_branch_name`, for example:
```bash
➜ git push origin :my-cool-stuff
```
## Delete local branch
Finally, delete the local branch:
```bash
# Switch to develop branch
➜ git checkout develop
# delete my-cool-stuff branch
➜ git branch -D my-cool-stuff
```
And now we have finished a full process of code contribution.
## Certain regulations about submitting code
So that reviewers can focus on the code itself during review, please follow these rules every time you submit code:
1) Make sure the unit tests in Travis-CI pass. If they fail, it means problems exist in the submitted code, and reviewers generally will not review it.
2) Before submitting a Pull Request:
- Please mind the number of commits:
Reason: it bothers reviewers a lot if a dozen commits are submitted after the modification of only one file, each updating only a few lines. Reviewers have to check commits one by one to figure out the modification, and the changes in different commits may even overwrite each other.
Suggestion: keep commits as few as possible for every submission. You can amend the previous commit with `git commit --amend`. For several commits that have already been pushed to the remote repository, refer to [squash commits after push](http://stackoverflow.com/questions/5667884/how-to-squash-commits-in-git-after-they-have-been-pushed).
- Pay attention to the name of every commit: it should reflect the content of the commit and not be arbitrary.
3) If you have fixed an Issue, please add `fix #issue_number` to the **first** comment area of the Pull Request; the corresponding Issue will then be closed automatically after the merge of the Pull Request. Keywords include: close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved. Please select the appropriate word. Please refer to [Closing issues via commit messages](https://help.github.com/articles/closing-issues-via-commit-messages) for more details.
In addition, please follow these conventions when responding to reviewers' comments:
1) Reply to every comment from reviewers (this is basic courtesy in the open-source community; when others help, an expression of appreciation is due):
- If you adopt a reviewer's suggestion and modify the code accordingly, a simple `Done` reply is enough.
- If you disagree with a suggestion, please clarify your reasons.
2) If there are many suggestions:
- Please summarize the overall modifications.
- Please use [start a review](https://help.github.com/articles/reviewing-proposed-changes-in-a-pull-request/) to reply instead of replying directly, because every direct reply sends an email and causes an email flood.
# Design Principles
## Introduction
This document introduces the underlying design principles of PaddlePaddle (hereafter Paddle) to help users better understand how the framework operates.
After reading this document, you will learn about:
- Paddle's internal execution process
- How a Program describes the model
- How the Executor performs computation
## 1. Paddle's internal execution process
Paddle uses a compiler-like execution process that is divided into compile time and runtime: at compile time a Program is defined, and an Executor is created to run the Program.
The flow chart of executing a local training task is as follows:
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/FluidDoc/develop/doc/fluid/advanced_usage/design_idea/image/fluid_process.png" width=800>
</p>
1. At compile time, the user writes a python program that adds variables (Tensor) and operations on those variables (Operators or Layers) to a Program by calling operators provided by Paddle. The user only needs to describe the core forward computation and does not need to care about backward computation, or about how computation is carried out in distributed settings or on heterogeneous devices.
2. The original Program is converted inside the framework to an intermediate description language: `ProgramDesc`.
3. A `Transpiler` accepts a `ProgramDesc` and outputs a transformed `ProgramDesc` as the Program ultimately executed by the backend `Executor`. The `Transpiler` step is optional.
4. The Operators defined in the `ProgramDesc` are executed (they can be compared to instructions in a programming language); during execution, the required inputs and outputs are created for each Operator and managed by the framework.
## 2. Design principles of Program
After the network definition is complete, there are usually 2 Programs in a Paddle program:
1. fluid.default_startup_program: defines operations such as model parameter initialization, optimizer parameter initialization, and reader initialization.
The default_startup_program can be generated automatically by the framework and does not need to be created explicitly.
If a call changes the default initialization of a parameter, the framework automatically adds the related operations to the default_startup_program.
2. fluid.default_main_program: defines the neural network model, the forward and backward computation, and operations such as model parameter updates and optimizer parameter updates.
The core of using Paddle is building the default_main_program.
<a name="ProgramsAndBlocks"></a>
### Programs and Blocks
The basic structure of a Paddle Program is a set of nested blocks, similar in form to a C++ or Java program.
A block contains:
- definitions of local variables
- a series of operators
The concept of a block is the same as in general-purpose programs. For example, the following C++ code contains three blocks:
``` cpp
#include <iostream>
int main() {
int x = 5; // block 0
int y = 4; // block 0
int out; // block 0
if (x < y) { // block 0
out = 1; // block 1
} else {
out = 0; // block 2
}
std::cout << out << std::endl;
return 0;
}
```
Similarly, the following Paddle Program contains three blocks:
```python
import paddle.fluid as fluid
x = fluid.data(name='x', shape=[1], dtype='int64') # block 0
y = fluid.data(name='y', shape=[1], dtype='int64') # block 0
def true_block():
return fluid.layers.fill_constant(dtype='int64', value=1, shape=[1]) # block 1
def false_block():
return fluid.layers.fill_constant(dtype='int64', value=0, shape=[1]) # block 2
condition = fluid.layers.less_than(x, y) # block 0
out = fluid.layers.cond(condition, true_block, false_block) # block 0
```
### BlockDesc and ProgramDesc
The block and program information described by the user is saved in Paddle in [protobuf](https://en.wikipedia.org/wiki/Protocol_Buffers) format; all `protobuf` messages are defined in `framework.proto`, and in Paddle they are called BlockDesc and ProgramDesc. The concepts of ProgramDesc and BlockDesc are similar to an [abstract syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree).
A `BlockDesc` contains the definitions of local variables [vars](../../api_guides/low_level/program.html#variable) and a series of operators `ops`:
```cpp
message BlockDesc {
required int32 idx = 1;
required int32 parent_idx = 2;
repeated VarDesc vars = 3;
repeated OpDesc ops = 4;
}
```
parent_idx identifies the parent block, so an operator inside a block can reference variables defined locally as well as variables defined in ancestor blocks.
Every block in a Program is flattened and stored in an array; the block ID is the index of the block in this array.
```cpp
message ProgramDesc {
repeated BlockDesc blocks = 1;
}
```
### Operators that use Blocks
In the [Programs and Blocks](#ProgramsAndBlocks) example, the IfElseOp operator contains two blocks: the true branch and the false branch.
The following definition of OpDesc describes which attributes an operator can contain:
```cpp
message OpDesc {
AttrDesc attrs = 1;
...
}
```
An attribute can be of type block, which is in fact the block ID described above:
```cpp
message AttrDesc {
required string name = 1;
enum AttrType {
INT = 1,
STRING = 2,
...
BLOCK = ...
}
required AttrType type = 2;
optional int32 block = 10; // when type == BLOCK
...
}
```
<a name="ExecutorDesignPrinciples"></a>
## 3. Design principles of Executor
At runtime, the Executor accepts a `ProgramDesc`, a `block_id`, and a `Scope`. The `ProgramDesc` is a list of `block`s, each containing the `protobuf` definitions of all parameters and `operator`s in the `block`; the `block_id` specifies the entry block; the `Scope` is the container of all variable instances.
A `Scope` contains a mapping from `name` to `Variable`; all variables are defined inside a `Scope`. Most APIs, such as `Executor.run`, use the `global_scope` by default, but you can also specify that a network runs in a particular `Scope`; one network can run in different `Scope`s and update different `Variable`s within each `Scope`.
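A hedged sketch of running in a private `Scope` instead of the default `global_scope`, using the Fluid 1.x `fluid.Scope` and `fluid.scope_guard` APIs:
```python
import paddle.fluid as fluid

# Variables created while the guard is active live in this private Scope,
# not in the default global_scope.
scope = fluid.Scope()
with fluid.scope_guard(scope):
    exe = fluid.Executor(fluid.CPUPlace())
    exe.run(fluid.default_startup_program())
```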
The specific process of compilation and execution is shown in the following figure:
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/FluidDoc/develop/doc/fluid/advanced_usage/design_idea/image/executor_design.png" width=600>
</p>
1. The Executor creates a Scope for each block. Blocks are nestable, so Scopes are nestable as well.
2. All variables in the Scope are created.
3. All operators are created and executed.
The C++ implementation of the Executor is roughly as follows:
```cpp
class Executor {
 public:
  void Run(const ProgramDesc& pdesc, Scope* scope, int block_id) {
    auto& block = pdesc.Block(block_id);
    // Create all variables
    for (auto& var : block.AllVars()) {
      scope->Var(var->Name());
    }
    // Create the OPs and run them in order
    for (auto& op_desc : block.AllOps()) {
      auto op = CreateOp(*op_desc);
      op->Run(*scope, place_);
    }
  }
};
```
**Creating an Executor**
In Paddle an Executor is created with fluid.Executor(place); the place attribute is defined by the user and indicates where the program will run.
The following code creates an Executor that runs on the CPU:
```python
cpu=fluid.CPUPlace()
exe = fluid.Executor(cpu)
```
**Running an Executor**
Paddle uses Executor.run to run the program. In the definition, data is supplied through the feed map and results are retrieved through fetch\_list:
```python
...
x = numpy.random.random(size=(10, 1)).astype('float32')
outs = exe.run(
feed={'X': x},
fetch_list=[loss.name])
```
## Code example
This section walks through the simple linear regression example from the [Programming Guide](../../../beginners_guide/basic_concept/programming_guide/programming_guide.html) to show how the above is realized in code.
**Defining the Program**
You can freely define your own data and network structure; the result of the definition is received by Paddle as a Program. The basic structure of a Program is a set of blocks; the Program in this section contains only block 0:
```python
# Load the library
import paddle.fluid as fluid #block 0
import numpy
# Define the data
train_data=numpy.array([[1.0],[2.0],[3.0],[4.0]]).astype('float32')
y_true = numpy.array([[2.0],[4.0],[6.0],[8.0]]).astype('float32')
# Define the network
x = fluid.data(name="x",shape=[None, 1],dtype='float32')
y = fluid.data(name="y",shape=[None, 1],dtype='float32')
y_predict = fluid.layers.fc(input=x,size=1,act=None)
# Define the loss function
cost = fluid.layers.square_error_cost(input=y_predict,label=y)
avg_cost = fluid.layers.mean(cost)
# Define the optimization method
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
sgd_optimizer.minimize(avg_cost)
```
Completing the definition above also completes the construction of fluid.default_main_program, which carries the neural network model, the forward and backward computation, and the updates the optimization algorithm applies to the learnable parameters of the network.
At this point you can print the Program to inspect the defined network:
```python
print(fluid.default_main_program().to_string(True))
```
The complete ProgramDesc can be viewed locally; only the first three variables are excerpted here:
```
blocks {
idx: 0
parent_idx: -1
vars {
name: "mean_1.tmp_0"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: 1
}
}
}
persistable: false
}
vars {
name: "square_error_cost_1.tmp_1"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: -1
dims: 1
}
lod_level: 0
}
}
persistable: false
}
vars {
name: "square_error_cost_1.tmp_0"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: -1
dims: 1
}
lod_level: 0
}
}
persistable: false
...
```
As the output shows, the whole definition process is converted inside the framework to a ProgramDesc, indexed by block idx. There is only 1 block in this linear regression model, so the ProgramDesc contains only one BlockDesc, block 0.
The BlockDesc contains the defined vars and a series of ops. Take input x as an example: in the python code, x is defined as 1-D data of data type "float32":
```python
x = fluid.data(name="x",shape=[None, 1],dtype='float32')
```
In the BlockDesc, the variable x is described as:
```
vars {
name: "x"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: -1
dims: 1
}
lod_level: 0
}
}
persistable: false
```
All data types in Paddle are LoD-Tensor; for data without sequence information (such as the variable x here), lod_level=0.
dims denotes the dimensions of the data; here x has dimensions [-1, 1], where -1 is the batch dimension. When the concrete value cannot be determined, Paddle automatically uses -1 as a placeholder.
The `persistable` parameter indicates whether the variable is persistent across the whole training process.
**Creating the Executor**
Paddle uses an Executor to perform network training; for details of how the Executor runs, see the [design principles of Executor](#ExecutorDesignPrinciples). As a user, you do not actually need to understand the internal mechanism.
Creating an Executor only requires calling fluid.Executor(place); before that, define the place variable according to where training will run:
```python
# Train on the CPU
cpu = fluid.CPUPlace()
# Create the Executor
exe = fluid.Executor(cpu)
```
**Running the Executor**
Paddle uses Executor.run to run a Program.
Before network training actually starts, parameter initialization must be executed first; the default_startup_program defines model parameter initialization, optimizer parameter initialization, reader initialization, and so on.
```python
# Initialize the parameters
exe.run(fluid.default_startup_program())
```
Since the incoming and outgoing data have multiple columns, Paddle defines the data to be passed in through the feed map, and fetches the expected results through fetch_list:
```python
# Start training
outs = exe.run(
    feed={'x':train_data,'y':y_true},
    fetch_list=[y_predict.name,avg_cost.name])
```
The code above feeds train_data into the variable x and y_true into the variable y, and outputs the predicted values of y and the cost of the last round.
The output is:
```
[array([[1.5248038],
[3.0496075],
[4.5744114],
[6.099215 ]], dtype=float32), array([1.6935859], dtype=float32)]
```
By now you have learned the core concepts of Paddle's internal execution process; for more details on using the framework, see the [User Guides](../../../user_guides/index_cn.html).
# Design Principles of Fluid
## Introduction
This document mainly introduces the underlying design principles of Fluid to help users better understand the operation process of the framework.
After reading this document, you will learn about:
- the internal execution process of Fluid
- how a Program expresses the model
- how the Executor performs operations
## 1. Internal execution process of Fluid
Fluid uses a compiler-like execution process, which is divided into compile time and runtime. Specifically, it includes: the compiler defines a Program, and an Executor is created to run the Program.
The flow chart of executing a local training task is as follows:
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/FluidDoc/develop/doc/fluid/advanced_usage/design_idea/image/fluid_process.png" width=800>
</p>
1. At compile time, the user writes a python program that adds variables (Tensor) and operations on variables (Operators or Layers) to a Program by calling the operators provided by Fluid. Users only need to describe the core forward calculations, and do not need to care about backward computing, distributed computing, or calculations on heterogeneous devices.
2. The original Program is converted to an intermediate descriptive language, ``ProgramDesc``, within the platform.
3. The most important functional module at compile time is the ``Transpiler``. The ``Transpiler`` accepts a ``ProgramDesc`` and outputs a *transformed* ``ProgramDesc`` as the Program to be executed ultimately by the backend ``Executor``.
4. The backend ``Executor`` accepts the Program output by the ``Transpiler`` and executes the Operators in the Program in sequence (they can be analogized to the instructions in a programming language). During the execution, the Executor creates the required inputs and outputs for each Operator and manages them.
## 2. Design Principles of Program
After completing the network definition, there are usually 2 Programs in a Fluid program:
1. ``fluid.default_startup_program``: defines operations such as the creation of model parameters, input and output, and the initialization of learnable parameters in the model.
The default_startup_program can be automatically generated by the framework and used without being created manually.
If a call changes the default initialization method of a parameter, the framework will automatically add the relevant changes to the default_startup_program.
2. ``fluid.default_main_program``: defines the neural network model, forward and backward calculations, and the updates to the learnable parameters of the network by the optimization algorithm.
The core of using Fluid is to build the ``default_main_program``.
### Programs and Blocks
The basic structure of Fluid's Program is some nested blocks that are similar in form to a C++ or Java program.
The blocks contain:
- Definition of local variables
- A series of operators
The concept of a block is the same as that in generic programs. For example, there are three blocks in the following C++ code:
``` cpp
#include <iostream>
int main() {
int x = 5; // block 0
int y = 4; // block 0
int out; // block 0
if (x < y) { // block 0
out = 1; // block 1
} else {
out = 0; // block 2
}
std::cout << out << std::endl;
return 0;
}
```
Similarly, the following Program contains 3 blocks:
```python
import paddle.fluid as fluid
x = fluid.data(name='x', shape=[1], dtype='int64') # block 0
y = fluid.data(name='y', shape=[1], dtype='int64') # block 0
def true_block():
return fluid.layers.fill_constant(dtype='int64', value=1, shape=[1]) # block 1
def false_block():
return fluid.layers.fill_constant(dtype='int64', value=0, shape=[1]) # block 2
condition = fluid.layers.less_than(x, y) # block 0
out = fluid.layers.cond(condition, true_block, false_block) # block 0
```
### BlockDesc and ProgramDesc
The block and program information described by the user is saved in Fluid in [protobuf](https://en.wikipedia.org/wiki/Protocol_Buffers) format, and all ``protobuf`` information is defined in ``framework.proto`` . In Fluid it is called BlockDesc and ProgramDesc. The concepts of ProgramDesc and BlockDesc are similar to an [abstract syntax tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree).
`BlockDesc` contains the definition of the local variables `vars`, and a series of operators `ops`:
```cpp
message BlockDesc {
required int32 idx = 1;
required int32 parent_idx = 2;
repeated VarDesc vars = 3;
repeated OpDesc ops = 4;
}
```
``parent_idx`` identifies the parent block, so the operators in a block can not only reference local variables but also reference variables defined in ancestor blocks.
Each block in the Program is flattened and stored in an array. The block ID is the index of the block in this array.
```cpp
message ProgramDesc {
repeated BlockDesc blocks = 1;
}
```
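As a hedged illustration of this flattened layout: the Fluid 1.x Python API exposes the block array on a Program, and assuming the `Block` objects carry `idx` and `parent_idx` attributes as in that API, the array can be walked like this:
```python
import paddle.fluid as fluid

# Print each block's index in the flattened array and its parent's index
# (-1 marks the root block 0).
prog = fluid.default_main_program()
for block in prog.blocks:
    print(block.idx, block.parent_idx)
```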
### Operators that use Blocks
In the example in [Programs and Blocks](#programs-and-blocks), the IfElseOp operator contains two blocks -- the true branch and the false branch.
The following OpDesc definition describes which attributes an operator can contain:
```cpp
message OpDesc {
AttrDesc attrs = 1;
...
}
```
The attribute can be of type block, which is actually the block ID described above:
```cpp
message AttrDesc {
required string name = 1;
enum AttrType {
INT = 1,
STRING = 2,
...
BLOCK = ...
}
required AttrType type = 2;
optional int32 block = 10; // when type == BLOCK
...
}
```
<a name="executor-design-ideas"></a>
## 3. Design Principles of Executor
The Executor accepts a `ProgramDesc`, a `block_id`, and a `Scope` at runtime. The `ProgramDesc` is a list of `block`s, each containing all the parameters in the `block` and the `protobuf` definitions of its `operator`s; the `block_id` specifies the entry block; the `Scope` is the container for all variable instances.
The specific process of compilation and execution is shown in the following figure:
<p align="center">
<img src="https://raw.githubusercontent.com/PaddlePaddle/FluidDoc/develop/doc/fluid/advanced_usage/design_idea/image/executor_design.png" width=600>
</p>
1. The Executor creates a Scope for each block. Blocks are nestable, so Scopes are nestable as well.
2. All variables in the Scope are created.
3. All operators are created and run in order.
The C++ implementation of the Executor is roughly as follows:
```cpp
class Executor {
 public:
  void Run(const ProgramDesc& pdesc, Scope* scope, int block_id) {
    auto& block = pdesc.Block(block_id);
    // Create all variables
    for (auto& var : block.AllVars()) {
      scope->Var(var->Name());
    }
    // Create the OPs and execute them in order
    for (auto& op_desc : block.AllOps()) {
      auto op = CreateOp(*op_desc);
      op->Run(*scope, place_);
    }
  }
};
```
**Create Executor**
Fluid uses fluid.Executor(place) to create an Executor. The place attribute is defined by the user and represents where the program will be run.
The following code example creates an Executor that runs on CPU:
```python
cpu=fluid.CPUPlace()
exe = fluid.Executor(cpu)
```
**Run Executor**
Fluid uses Executor.run to run the program. In the definition, the data is obtained through the ``Feed`` dict, and the result is obtained through ``fetch_list``:
```python
...
x = numpy.random.random(size=(10, 1)).astype('float32')
outs = exe.run(
feed={'X': x},
fetch_list=[loss.name])
```
## Code Instance
This section introduces how the above is implemented in your code through a simple linear regression example in the [Fluid Programming Guide](../../../beginners_guide/basic_concept/programming_guide/programming_guide_en.html).
**Define Program**
You can freely define your own data and network structure, which will be received as a Program by Fluid. The basic structure of Program is some blocks. The Program in this section contains only block 0:
```python
#Load function library
import paddle.fluid as fluid #block 0
import numpy
# Define data
train_data=numpy.array([[1.0],[2.0],[3.0],[4.0]]).astype('float32')
y_true = numpy.array([[2.0],[4.0],[6.0],[8.0]]).astype('float32')
# Define the network
x = fluid.data(name="x",shape=[None, 1],dtype='float32')
y = fluid.data(name="y",shape=[None, 1],dtype='float32')
y_predict = fluid.layers.fc(input=x,size=1,act=None)
#definition loss function
cost = fluid.layers.square_error_cost(input=y_predict,label=y)
avg_cost = fluid.layers.mean(cost)
#defined optimization method
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.01)
sgd_optimizer.minimize(avg_cost)
```
Finishing the above definition also finishes the construction of fluid.default_main_program, which carries the neural network model, the forward and backward calculation, and the optimization algorithm that updates the learnable parameters in the network.
At this point you can output this Program to observe the defined network:
```python
print(fluid.default_main_program().to_string(True))
```
The complete ProgramDesc can be viewed locally. The first three variables are excerpted and displayed as follows:
```
blocks {
idx: 0
parent_idx: -1
vars {
name: "mean_1.tmp_0"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: 1
}
}
}
persistable: false
}
vars {
name: "square_error_cost_1.tmp_1"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: -1
dims: 1
}
lod_level: 0
}
}
persistable: false
}
vars {
name: "square_error_cost_1.tmp_0"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: -1
dims: 1
}
lod_level: 0
}
}
persistable: false
...
```
As you can see from the output, the entire definition process is transformed into a ProgramDesc inside the framework, indexed by block idx. There is only one block in this linear regression model, and the ProgramDesc has only one BlockDesc, namely block 0.
The BlockDesc contains the defined vars and a series of ops. Take input x as an example. In the python code, x is defined as 1-D data of data type "float32":
```python
x = fluid.data(name="x",shape=[None, 1],dtype='float32')
```
In BlockDesc, the variable x is described as:
```
vars {
name: "x"
type {
type: LOD_TENSOR
lod_tensor {
tensor {
data_type: FP32
dims: -1
dims: 1
}
lod_level: 0
}
}
persistable: false
```
All data in Fluid are LoD-Tensor, and for data without sequence information (such as variable X here), lod_level=0.
``dims`` represents the dimension of the data, and in the example x has the dimension of [-1,1], where -1 is the dimension of the batch. When the specific value cannot be determined, Fluid automatically uses the -1 as placeholder.
The parameter ``persistable`` indicates whether the variable is a persistent variable throughout the training process.
**Create Executor**
Fluid uses an Executor to perform network training. For details on how the Executor operates, please refer to [Executor Design Principles](#executor-design-ideas). As a user, there is actually no need to understand the internal mechanism.
To create an Executor, simply call fluid.Executor(place). Before that, please define a place variable based on where training will run:
```python
#Execute training on CPU
cpu = fluid.CPUPlace()
#Create Executor
exe = fluid.Executor(cpu)
```
**Run Executor**
Fluid uses Executor.run to run a program.
Before network training is actually performed, parameter initialization must be executed first. The default_startup_program defines operations such as the creation of model parameters, input and output, and the initialization of learnable parameters in the model.
```python
#Parameter initialization
exe.run(fluid.default_startup_program())
```
Since there are multiple columns of incoming and outgoing data, fluid defines the data to be passed in through the feed mapping and takes out the expected results through fetch_list:
```python
# Start training
outs = exe.run(
feed={'x':train_data,'y':y_true},
fetch_list=[y_predict.name,avg_cost.name])
```
The above code specifies that train_data is passed into the variable x and y_true into the variable y, and outputs the predicted values of y and the cost of the last round.
The output is:
```
[array([[1.5248038],
[3.0496075],
[4.5744114],
[6.099215 ]], dtype=float32), array([1.6935859], dtype=float32)]
```
By now you have learned the core concepts of the internal execution process of Fluid. For more details on the usage of the framework, please refer to the [User Guide](../../../user_guides/index_en.html).
.. _addon_development:

#################
Addon Development
#################

.. toctree::
    :maxdepth: 1

    design_idea/fluid_design_idea.md
    new_op/index_cn.rst
    contribute_code/index_cn.rst
.. _addon_development_en:

#################
Addon Development
#################
.. toctree::
:maxdepth: 1
design_idea/fluid_design_idea_en.md
new_op/index_en.rst
contribute_code/index_en.rst
# How to customize a C++ OP outside the framework
Usually, if the operation you need is missing from PaddlePaddle's Operator (OP) library, try to compose it from existing OPs first. If the operation cannot be composed, you can try `fluid.layers.py_func`, or write a custom C++ OP following this tutorial. Of course, if the performance of an OP composed from several OPs does not meet your requirements, you can also write a custom C++ OP.
Customizing an OP takes the following steps:
1. Implement and register the OP, exactly as when writing an OP inside the framework, following the conventions and steps of "How to write a new C++ OP". Implementing the Gradient OP is optional.
2. Compile it into a shared library.
3. Wrap the OP in a Python interface.
4. Write unit tests for the OP.
The following walks through a concrete example step by step, implementing a relu op.
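Before diving in, here is a hedged preview of what step 3 might look like once the shared library from step 2 exists (the library name `relu2_op.so` matches what is built later in this tutorial; the sketch uses the Fluid 1.x `fluid.load_op_library` entry point and the usual `LayerHelper` pattern, and should be treated as a sketch rather than the definitive wrapper):
```python
import paddle.fluid as fluid
from paddle.fluid.layer_helper import LayerHelper

# Load the compiled shared library so that the `relu2` OP registered in C++
# becomes visible to the framework.
fluid.load_op_library('relu2_op.so')

def relu2(x, name=None):
    # Append a relu2 OP to the current program, following the standard
    # LayerHelper pattern for wrapping operators.
    helper = LayerHelper('relu2', **locals())
    out = helper.create_variable_for_type_inference(dtype=x.dtype)
    helper.append_op(type='relu2', inputs={'X': x}, outputs={'Y': out})
    return out
```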
## Implementing the custom OP
The implementation of the OP is the same as in the "How to write a new C++ OP" tutorial. In short, you need to: 1) define the OP's ProtoMaker, i.e. describe the OP's inputs, outputs, and attributes; 2) implement the OP's definition and InferShape, as well as the OP's kernel function, and similarly for the backward OP; 3) register the OP and its compute kernels.
The CPU implementation of the ReLU OP, file ``relu_op.cc``:
```
// relu_op.cc
#include "paddle/fluid/framework/op_registry.h"
namespace paddle {
namespace operators {
// Input X, output Y and attributes of the forward OP
class Relu2OpMaker : public framework::OpProtoAndCheckerMaker {
public:
void Make() override {
AddInput("X", "The input tensor.");
AddOutput("Y", "Output of relu_op");
AddComment(R"DOC(
Relu Operator.
Y = max(X, 0)
)DOC");
}
};
// Definition and InferShape of the forward OP: set the shape of output Y
class Relu2Op : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
void InferShape(framework::InferShapeContext* ctx) const override {
auto in_dims = ctx->GetInputDim("X");
ctx->SetOutputDim("Y", in_dims);
}
};
// Implement the kernel compute function of the forward OP: Y = max(0, X)
using Tensor = framework::Tensor;
template <typename DeviceContext, typename T>
class Relu2Kernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto* in_t = ctx.Input<Tensor>("X");
auto* out_t = ctx.Output<Tensor>("Y");
auto x = in_t->data<T>();
// mutable_data allocates the memory and returns the pointer
auto y = out_t->mutable_data<T>(ctx.GetPlace());
for (int i = 0; i < in_t->numel(); ++i) {
y[i] = std::max(static_cast<T>(0.), x[i]);
}
}
};
// Define the inputs Y and dY, the output dX, and the attributes of the backward OP:
template <typename T>
class Relu2GradMaker : public framework::SingleGradOpMaker<T> {
public:
using framework::SingleGradOpMaker<T>::SingleGradOpMaker;
void Apply(GradOpPtr<T> op) const override {
op->SetType("relu2_grad");
op->SetInput("Y", this->Output("Y"));
op->SetInput(framework::GradVarName("Y"), this->OutputGrad("Y"));
op->SetAttrMap(this->Attrs());
op->SetOutput(framework::GradVarName("X"), this->InputGrad("X"));
}
};
// Define the backward OP and its InferShape: set the shape of dX
class Relu2GradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
void InferShape(framework::InferShapeContext* ctx) const override {
auto in_dims = ctx->GetInputDim(framework::GradVarName("Y"));
ctx->SetOutputDim(framework::GradVarName("X"), in_dims);
}
};
// Implement the kernel function of the backward OP: dx = dy * (y > 0. ? 1. : 0.)
template <typename DeviceContext, typename T>
class Relu2GradKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto* dy_t = ctx.Input<Tensor>(framework::GradVarName("Y"));
auto* y_t = ctx.Input<Tensor>("Y");
auto* dx_t = ctx.Output<Tensor>(framework::GradVarName("X"));
auto dy = dy_t->data<T>();
auto y = y_t->data<T>();
auto dx = dx_t->mutable_data<T>(ctx.GetPlace());
for (int i = 0; i < y_t->numel(); ++i) {
dx[i] = dy[i] * (y[i] > static_cast<T>(0) ? 1. : 0.);
}
}
};
} // namespace operators
} // namespace paddle
namespace ops = paddle::operators;
using CPU = paddle::platform::CPUDeviceContext;
// Register the forward and backward OPs
// To distinguish it from the relu inside the framework, the OP type registered here is relu2
REGISTER_OPERATOR(relu2,
ops::Relu2Op,
ops::Relu2OpMaker,
ops::Relu2GradMaker<paddle::framework::OpDesc>,
ops::Relu2GradMaker<paddle::imperative::OpBase>);
REGISTER_OPERATOR(relu2_grad, ops::Relu2GradOp);
// Register the CPU kernels
REGISTER_OP_CPU_KERNEL(relu2,
ops::Relu2Kernel<CPU, float>,
ops::Relu2Kernel<CPU, double>);
REGISTER_OP_CPU_KERNEL(relu2_grad,
ops::Relu2GradKernel<CPU, float>,
ops::Relu2GradKernel<CPU, double>);
```
The GPU implementation of the ReLU OP, file ``relu_op.cu``:
```
// relu_op.cu
#include "paddle/fluid/framework/op_registry.h"
namespace paddle {
namespace operators {
using Tensor = framework::Tensor;
template <typename T>
__global__ void KeRelu2(const T* x, const int num, T* y) {
int gid = blockIdx.x * blockDim.x + threadIdx.x;
for (int i = gid; i < num; i += blockDim.x * gridDim.x) {
y[i] = max(x[i], static_cast<T>(0.));
}
}
// GPU implementation of the forward kernel
template <typename DeviceContext, typename T>
class Relu2CUDAKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto* in_t = ctx.Input<Tensor>("X");
auto* out_t = ctx.Output<Tensor>("Y");
auto x = in_t->data<T>();
auto y = out_t->mutable_data<T>(ctx.GetPlace());
auto& dev_ctx = ctx.template device_context<DeviceContext>();
int num = in_t->numel();
int block = 512;
int grid = (num + block - 1) / block;
KeRelu2<T><<<grid, block, 0, dev_ctx.stream()>>>(x, num, y);
}
};
template <typename T>
__global__ void KeRelu2Grad(const T* y, const T* dy, const int num, T* dx) {
int gid = blockIdx.x * blockDim.x + threadIdx.x;
for (int i = gid; i < num; i += blockDim.x * gridDim.x) {
dx[i] = dy[i] * (y[i] > 0 ? 1. : 0.);
}
}
// GPU implementation of the backward kernel
template <typename DeviceContext, typename T>
class Relu2GradCUDAKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto* dy_t = ctx.Input<Tensor>(framework::GradVarName("Y"));
auto* y_t = ctx.Input<Tensor>("Y");
auto* dx_t = ctx.Output<Tensor>(framework::GradVarName("X"));
auto dy = dy_t->data<T>();
auto y = y_t->data<T>();
auto dx = dx_t->mutable_data<T>(ctx.GetPlace());
auto& dev_ctx = ctx.template device_context<DeviceContext>();
int num = dy_t->numel();
int block = 512;
int grid = (num + block - 1) / block;
KeRelu2Grad<T><<<grid, block, 0, dev_ctx.stream()>>>(y, dy, num, dx);
}
};
} // namespace operators
} // namespace paddle
using CUDA = paddle::platform::CUDADeviceContext;
// Register the forward GPU kernels
REGISTER_OP_CUDA_KERNEL(relu2,
paddle::operators::Relu2CUDAKernel<CUDA, float>,
paddle::operators::Relu2CUDAKernel<CUDA, double>);
// Register the backward GPU kernels
REGISTER_OP_CUDA_KERNEL(relu2_grad,
paddle::operators::Relu2GradCUDAKernel<CUDA, float>,
paddle::operators::Relu2GradCUDAKernel<CUDA, double>);
```
Notes:
1. The OP type must not be the same as any existing PaddlePaddle OP type; otherwise an error will be raised when the OP is used from Python.
## Compiling the Custom OP
The C++/CUDA implementation must be compiled into a dynamic library. Below we compile with g++/nvcc directly; you may of course write a Makefile or CMake configuration instead.
Compilation must include PaddlePaddle's header files (such as `paddle/fluid/framework/op_registry.h` used above) and link against PaddlePaddle's libraries. Their locations can be obtained as follows:
```
# python
>>> import paddle
>>> print(paddle.sysconfig.get_include())
/paddle/pyenv/local/lib/python2.7/site-packages/paddle/include
>>> print(paddle.sysconfig.get_lib())
/paddle/pyenv/local/lib/python2.7/site-packages/paddle/libs
```
The following commands build the dynamic library:
```
include_dir=$( python -c 'import paddle; print(paddle.sysconfig.get_include())' )
lib_dir=$( python -c 'import paddle; print(paddle.sysconfig.get_lib())' )
echo $include_dir
echo $lib_dir
# For PaddlePaddle >= 1.6.1, only ${include_dir} and ${include_dir}/third_party need to be included
nvcc relu_op.cu -c -o relu_op.cu.o -ccbin cc -DPADDLE_WITH_CUDA -DEIGEN_USE_GPU -DPADDLE_USE_DSO -DPADDLE_WITH_MKLDNN -Xcompiler -fPIC -std=c++11 -w --expt-relaxed-constexpr -O3 -DNVCC \
  -I ${include_dir} \
  -I ${include_dir}/third_party

g++ relu_op.cc relu_op.cu.o -o relu2_op.so -shared -fPIC -std=c++11 -O3 -DPADDLE_WITH_MKLDNN \
  -I ${include_dir} \
  -I ${include_dir}/third_party \
  -L /usr/local/cuda/lib64 \
  -L ${lib_dir} -lpaddle_framework -lcudart
```
Notes:
1. When compiling the CUDA source file with NVCC, the options `-DPADDLE_WITH_CUDA -DEIGEN_USE_GPU -DPADDLE_USE_DSO` must be passed, since the framework source uses these macros for conditional compilation. When compiling a user-defined C++ OP, whether these options are enabled must match the build configuration of the core framework. For example, `EIGEN_USE_GPU` is required when using the GPU implementation of the Eigen math library.
2. If the PaddlePaddle installation package does not include the MKLDNN library, the option `-DPADDLE_WITH_MKLDNN` must be removed. The core framework source (e.g. tensor.h) uses this macro for conditional compilation, so whether it is enabled must likewise match the core framework build. The default PaddlePaddle packages do include MKLDNN.
3. Multiple OPs can be compiled into the same dynamic library.
4. PaddlePaddle packages installed via pip are built with GCC 4.8. Since the **C++11 ABI is incompatible** between GCC 4.8 and GCC 5+, your custom OP must be compiled with GCC 4.8. If you work in a GCC 5+ environment, we recommend [installing PaddlePaddle with Docker](https://www.paddlepaddle.org.cn/install/doc/docker), so that Paddle and your custom OP are compiled with the same GCC version.
## Wrapping a Python Layer Interface
The dynamic library is loaded with the `fluid.load_op_library` interface, so that the user-defined OP becomes usable in the main PaddlePaddle process.
```
# custom_op.py
import paddle.fluid as fluid
# load the dynamic library with load_op_library
fluid.load_op_library('relu2_op.so')
from paddle.fluid.layer_helper import LayerHelper
def relu2(x, name=None):
# the type "relu2" must match the OP type registered in C++
helper = LayerHelper("relu2", **locals())
# create the output Variable
out = helper.create_variable_for_type_inference(dtype=x.dtype)
helper.append_op(type="relu2", inputs={"X": x}, outputs={"Y": out})
return out
```
Notes:
1. A dynamic library only needs to be loaded once with `fluid.load_op_library`, after `paddle.fluid` has been imported.
2. The Python wrapper works the same way as the wrappers inside the PaddlePaddle framework; for more examples, read the code in `python/paddle/fluid/layers/nn.py`.
## Unit Testing
A simple Python program can verify that the computation is correct:
```
import numpy as np
import paddle.fluid as fluid
from custom_op import relu2
data = fluid.layers.data(name='data', shape=[32], dtype='float32')
relu = relu2(data)
use_gpu = True # or False
place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
exe = fluid.Executor(place)
x = np.random.uniform(-1, 1, [4, 32]).astype('float32')
out, = exe.run(feed={'data': x}, fetch_list=[relu])
print(np.allclose(out, np.maximum(x, 0.)))  # expected: True
```
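To exercise the backward kernel as well, the forward test can be extended with a small gradient check. The sketch below is our illustrative addition rather than part of the original test; it assumes `relu2_op.so` was compiled as above and `custom_op.py` is importable:
```python
import numpy as np
import paddle.fluid as fluid
from custom_op import relu2

data = fluid.layers.data(name='data', shape=[32], dtype='float32')
data.stop_gradient = False  # data layers stop gradients by default
out = relu2(data)
loss = fluid.layers.reduce_sum(out)
# append the backward ops so that relu2_grad computes data@GRAD
fluid.backward.append_backward(loss)

exe = fluid.Executor(fluid.CPUPlace())
x = np.random.uniform(-1, 1, [4, 32]).astype('float32')
dx, = exe.run(feed={'data': x}, fetch_list=['data@GRAD'])
# d(sum(relu(x)))/dx is 1 where x > 0 and 0 elsewhere
print(np.allclose(dx, (x > 0).astype('float32')))
```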
Now you can use your custom OP in a model!
## Using the OP in the C++ Inference Library
Using custom OPs in the C++ inference library is not supported yet; examples for the C++ inference library will be added later.
## FAQ
1. Q: What should I do about errors like `relu2_op.so: cannot open shared object file: No such file or directory` or `libpaddle_framework.so: cannot open shared object file: No such file or directory`?
A: Add the path containing `relu2_op.so` and the path containing `libpaddle_framework.so` (i.e. the path returned by `paddle.sysconfig.get_lib()`) to the LD_LIBRARY_PATH environment variable:
```
# Assuming relu2_op.so is located at paddle/test, set on Linux:
export LD_LIBRARY_PATH=paddle/test:$( python -c 'import paddle; print(paddle.sysconfig.get_lib())'):$LD_LIBRARY_PATH
```
##############
Adding New OPs
##############
This section explains how to add new Operators, together with the necessary caveats:
- `How to write a new C++ op <./new_op.html>`_
- `Notes on C++ ops <./op_notes.html>`_
- `How to write a new Python op <./new_python_op.html>`_
- `How to customize a C++ op outside the framework <./custom_op.html>`_
.. toctree::
    :hidden:

    new_op.md
    op_notes.md
    new_python_op.md
    custom_op.md
###################
Write New Operators
###################
This section will guide you through adding an operator, and it also includes some necessary notes.
- `How to write a new operator <new_op_en.html>`_ : a guide to writing new operators
- `Op notes <op_notes_en.html>`_ : notes on developing new operators
.. toctree::
    :hidden:

    new_op_en.md
    op_notes_en.md
# How to Write a New Python OP
PaddlePaddle Fluid supports customizing an OP on the Python side through the `py_func` interface. The design rests on the fact that Paddle's LoDTensor can be converted to and from numpy arrays conveniently, so a Python OP can be implemented with the numpy API.
## Overview of the py_func interface
The signature of `py_func` is:
```Python
def py_func(func, x, out, backward_func=None, skip_vars_in_backward_input=None):
    pass
```
where:
- `x` is the input of the Python OP; it may be a single `Variable` | `tuple[Variable]` | `list[Variable]`. Multiple Variables are passed as a tuple[Variable] or list[Variable], and each Variable holds a LoDTensor or Tensor.
- `out` is the output of the Python OP; it may be a single `Variable` | `tuple[Variable]` | `list[Variable]`. Each element may be a Variable (holding a LoDTensor or Tensor) or a numpy array.
- `func` is the forward function of the Python OP. When the network runs forward, the framework calls `out = func(*x)`, computing the forward output `out` from the forward input `x`. Inside ``func``, it is recommended to convert the LoDTensors to numpy arrays explicitly first, so that numpy operations can be used freely; without the conversion, some operations may be incompatible.
- `backward_func` is the backward function of the Python OP. If `backward_func` is `None`, the Python OP has no backward computation; otherwise the framework calls `backward_func` when the network runs backward, to compute the gradient of the forward input `x`.
- `skip_vars_in_backward_input` lists the inputs that the backward function `backward_func` does not need; it may be a single `Variable` | `tuple[Variable]` | `list[Variable]`.
## Writing a Python OP with py_func
Taking tanh as an example, the following shows how to write a Python OP with `py_func`.
- Step 1: define the forward and backward functions
Both the forward and backward functions are written in Python, so a custom OP can be implemented conveniently with the Python and numpy APIs.
If the forward function has inputs `x_1`, `x_2`, ..., `x_n` and outputs `y_1`, `y_2`, ..., `y_m`, it is defined as:
```Python
def forward_func(x_1, x_2, ..., x_n):
    ...
    return y_1, y_2, ..., y_m
```
By default, the backward function receives, in order: all forward inputs, then all forward outputs, then the gradients of all forward outputs. The corresponding backward function is therefore defined as:
```Python
def backward_func(x_1, x_2, ..., x_n, y_1, y_2, ..., y_m, dy_1, dy_2, ..., dy_m):
    ...
    return dx_1, dx_2, ..., dx_n
```
If the backward function does not need some of the forward inputs or outputs, they can be excluded with `skip_vars_in_backward_input` (Step 3 shows how).
Note: x_1, ..., x_n are multiple input LoDTensors; pass them to py_func as a tuple(Variable) or list[Variable]. It is recommended to convert each LoDTensor to a numpy array via numpy.array first; otherwise some Python/numpy operations may not work on a LoDTensor.
Here we use the numpy API to write tanh's forward and backward functions. Several example forward and backward function definitions follow:
```Python
import numpy as np
# forward function 1: a tanh activation
def tanh(x):
# a LoDTensor can be passed to np.tanh directly
return np.tanh(x)
# forward function 2: add two 2-D LoDTensors; pass multiple LoDTensors as list[Variable] or tuple(Variable)
def element_wise_add(x, y):
# the LoDTensors must be converted to numpy arrays first, otherwise numpy's shape attribute is unavailable
x = np.array(x)
y = np.array(y)
if x.shape != y.shape:
raise AssertionError("the shape of inputs must be the same!")
result = np.zeros(x.shape, dtype='int32')
for i in range(len(x)):
for j in range(len(x[0])):
result[i][j] = x[i][j] + y[i][j]
return result
# forward function 3: can be used to debug a running network (print values)
def debug_func(x):
# a LoDTensor can be passed to print directly
print(x)
# backward function for forward function 1; the default input order is: x, out, gradient of out
def tanh_grad(x, y, dy):
# 必须先手动将LodTensor转换为numpy数组,否则"+/-"等操作无法使用
return np.array(dy) * (1 - np.square(np.array(y)))
```
Note that the inputs of both the forward and backward functions are of type `LoDTensor`, while the outputs can be numpy arrays or `LoDTensor`s.
Since `LoDTensor` implements Python's buffer protocol, it can either be converted to a numpy array directly via `numpy.array`, or be passed directly as an argument to numpy functions. Still, converting to a numpy array explicitly first is recommended, because then all Python and numpy operations (such as "+/-" and numpy array shapes) can be used freely.
Since tanh's backward function does not need the forward input x, we can define a backward function without x and exclude x later via `skip_vars_in_backward_input`:
```Python
def tanh_grad_without_x(y, dy):
return np.array(dy) * (1 - np.square(np.array(y)))
```
- Step 2: create the forward output Variables
The forward output Variables are created by calling `Program.current_block().create_var`, which requires the variable's name, dtype and shape:
```Python
import paddle.fluid as fluid
def create_tmp_var(program, name, dtype, shape):
return program.current_block().create_var(name=name, dtype=dtype, shape=shape)
in_var = fluid.layers.data(name='input', dtype='float32', shape=[-1, 28, 28])
# manually create the forward output Variable
out_var = create_tmp_var(fluid.default_main_program(), name='output', dtype='float32', shape=[-1, 28, 28])
```
- Step 3: call `py_func` to build the network
`py_func` is called as follows:
```Python
fluid.layers.py_func(func=tanh, x=in_var, out=out_var, backward_func=tanh_grad)
```
If we do not want the forward input to appear among the backward function's arguments, we can exclude it with `skip_vars_in_backward_input`, simplifying the backward function's parameter list:
```Python
fluid.layers.py_func(func=tanh, x=in_var, out=out_var, backward_func=tanh_grad_without_x,
skip_vars_in_backward_input=in_var)
```
This completes the steps for writing a Python OP with `py_func`. The resulting network can be trained and used for inference just like any other OP; the full example below puts the steps together.
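For reference, the pieces above can be combined into one small runnable script. This is a minimal sketch assuming CPU execution; the variable names are illustrative:
```Python
import numpy as np
import paddle.fluid as fluid

def tanh(x):
    return np.tanh(np.array(x))

def tanh_grad(x, y, dy):
    return np.array(dy) * (1 - np.square(np.array(y)))

def create_tmp_var(program, name, dtype, shape):
    return program.current_block().create_var(name=name, dtype=dtype, shape=shape)

in_var = fluid.layers.data(name='input', dtype='float32', shape=[-1, 4])
out_var = create_tmp_var(fluid.default_main_program(),
                         name='output', dtype='float32', shape=[-1, 4])
fluid.layers.py_func(func=tanh, x=in_var, out=out_var, backward_func=tanh_grad)

exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
x = np.random.uniform(-1, 1, [2, 4]).astype('float32')
y, = exe.run(feed={'input': x}, fetch_list=[out_var])
print(np.allclose(y, np.tanh(x)))  # expected: True
```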
## Notes
- The forward and backward functions of `py_func` must not call `fluid.layers.xxx` inside, because the forward and backward functions are invoked while the network is running, with C++-side `LoDTensor` arguments, whereas `fluid.layers.xxx` is invoked while the network is being built, with Python-side `Variable` arguments.
- `skip_vars_in_backward_input` can only skip forward input and forward output variables, never the gradients of forward outputs.
- If a forward output variable has no gradient, `backward_func` will receive `None` for it. If a forward input variable needs no gradient, `backward_func` should explicitly return `None` for it.
# Notes on operator development
## Building logic of Fluid's op
### 1.Building logic of Fluid's op
All Ops in Fluid are derived from `OperatorBase`, and all Ops are stateless. Each Op contains only four members: type, inputs, outputs, and attributes.
The core method of an Op is Run. The Run method requires two kinds of resources: data resources and computing resources, obtained from `Scope` and `Place` respectively. Inside the framework there is a global `DeviceContextPool` that records the mapping from `Place` to `DeviceContext`: each `Place` has exactly one corresponding `DeviceContext`, and the `DeviceContext` holds the computing resources of the current device. For example, for GPU these resources include `cudnn_handle`, `cublas_handle`, `stream`, and so on. All internal computations of an Op (data copies, CUDA kernels, etc.) must be done through the `DeviceContext`.
The Fluid framework is designed to run on a variety of devices and third-party libraries, and some Op implementations may vary across devices or third-party libraries. Therefore, Fluid introduced the OpKernel mechanism, which means an Op can have multiple OpKernels. Such Ops are derived from `OperatorWithKernel`; a representative example is conv, whose OpKernels are `GemmConvKernel`, `CUDNNConvOpKernel` and `ConvMKLDNNOpKernel`, each supporting both the double and float data types. Ops that do not need an OpKernel include `WhileOp` and so on.
Operator inheritance diagram:
![op_inheritance_relation_diagram](./op_inheritance_relation_diagram.png)
For further information, please refer to: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices) , [scope](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/scope.md) , [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/release/1.2/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md)
### 2.Op's registration logic
The registration entries for each Operator include:
```C++
OpCreator creator_;
GradOpMakerFN grad_op_maker_;
proto::OpProto* proto_{nullptr};
OpAttrChecker* checker_{nullptr};
InferVarTypeFN infer_var_type_;
InferShapeFN infer_shape_;
```
<table>
<thead>
<tr>
<th>Registration Entry</th>
<th>Type</th>
<th>Description</th>
<th>Usage</th>
</tr>
</thead>
<tbody>
<tr>
<td>proto::OpProto </td>
<td>Class </td>
<td>Store the input/output/properties/Op type of Op </td>
<td>Call at compile time </td>
</tr>
<tr>
<td>GradOpMakerFN </td>
<td>Functor </td>
<td> Return a set of OpDescs of the reverse Op corresponding to the current Op, because the reverse ones of the forward Op may consist of multiple Ops </td>
<td>Call at compile time </td>
</tr>
<tr>
<td>OpAttrChecker </td>
<td>Class </td>
<td>Check the Op's attr </td>
<td>Call at compile time </td>
</tr>
<tr>
<td>InferVarTypeFN </td>
<td>Functor </td>
<td> Used to infer the type of the output Var, such as LoDTensor, SelectedRows, or others </td>
<td>Call at compile time </td>
</tr>
<tr>
<td>InferShapeFN </td>
<td>Functor </td>
<td> Used to infer the Shape of the Output </td>
<td> The usage is different at compile time and runtime. At compile time, it is called in Python side; If the Op is derived from OperatorWithKernel, at the runtime it will be called at op.run </td>
</tr>
<tr>
<td>OpCreator </td>
<td>Functor </td>
<td>Create a new OperatorBase for each call </td>
<td>Call at runtime </td>
</tr>
</tbody>
</table>
Usually you need to call REGISTER_OPERATOR to register an Op, as follows:
```
REGISTER_OPERATOR(op_type,
OperatorBase,
Op_maker_and_checker_maker,
Op_grad_opmaker,
Op_infer_var_shape,
Op_infer_var_type)
```
**Note:**
1. For all Ops, the first three parameters are required: op_type specifies the name of the op, OperatorBase is the class implementing this Op, and op_maker_and_checker_maker is the op's maker together with the checker of the op's attrs.
2. If an Op has a backward pass, it must have an op_grad_opmaker, because in backward the maker of the backward Op is obtained from the forward Op.
3. The framework provides a default op_grad_opmaker, `DefaultGradOpDescMaker`, which takes the inputs and outputs of the forward Op as inputs of the backward Op, takes the gradients of the forward Op's inputs as outputs of the backward Op, and copies the forward Op's attributes. **Note:** DefaultGradOpDescMaker takes *all* inputs and outputs of the forward Op as inputs of the backward Op, even those that are not necessary, which prevents memory optimization for the unused variables.
4. The framework does not provide a default op_infer_var_shape method. If the Op has no OpKernel, you usually need to add the corresponding op_infer_var_shape method. If the Op has OpKernel, you need to implement the `InferShape` method of `OperatorWithKernel`. You don't need to provide the op_infer_var_shape method. For details, refer to [while_op.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/controlflow/while_op.cc), [conv_op.cc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/conv_op.cc).
5. The framework does not provide a default op_infer_var_type method, the user needs to add op_infer_var_type according to the actual situation. Strictly speaking, every Op should register an InferVarType, and op_infer_var_type infers the type and dtype of the output Var according to the type and dtype of the input Var. **Note:** In the Python-side LayerHelper, the create_variable_for_type_inference operation returns a Variable which is a LoDTensor. The C++-side InferVarType can modify the type and dtype of the `Variable`.
For more details, please refer to: [How to write a new Op](new_op_en.html)
## Notes on Writing an Op
### 1. input and output types supported by Op
The input and output of Fluid's Ops are `Variable`. In design, `Variable` can store any type. Op's input and output `Variable` may be of any type, and usually the `Variable` stores `LoDTensor` and `SelectedRows` .
**Note:**
- `context.Input<Tensor>("Input")` often appears in the code. It does not mean that the `Variable` of "Input" is `Tensor`, but indicates that the `Tensor` is obtained from `LoDTensor` in the `Variable` of the "Input". If the `Variable` of "Input" is `SelectedRows`, an error will be reported.
- If "Input" is `SelectedRows`, `context->GetInputDim("Input")` will return `var->Get<SelectedRows>().GetCompleteDims()` instead of Dim of `Tensor` in `SelectedRows` .
### 2. Do not modify the input data inside Op.
Never make any modification of the input data inside Op, as there may be other Ops that need to read this input.
### 3. The data type needs to be registered for OpKernel
Currently all OpKernels are required to register both the double and float data types.
### 4.Op compatibility issue
The modification of an Op needs to consider compatibility. Please ensure that previous models can still be loaded and run after the Op is modified, i.e. models trained with an old version can be loaded and run with the Paddle inference library of a new version. <font color="#FF0000">**So developers should ensure that the Inputs, Outputs and Attributes of existing OPs are not modified (except for documentation) or deleted. Developers can add Inputs, Outputs and Attributes, but an added Input or Output must be set to be dispensable, and an added Attribute must have a default value. For more details, please refer to [OP Input/Output/Attribute Compatibility Modification](https://github.com/PaddlePaddle/Paddle/wiki/OP-Input-Output-Attribute-Compatibility-Modification(English-Version))**</font>.
### 5.Call ShareDataWith
ShareDataWith makes two Tensors share the same underlying buffer. Pay special attention when calling it: inside an Op, ShareDataWith must not be applied to the Op's output; in other words, the Tensor of an Op's output must come from Malloc.
### 6. Sparse gradient parameter's update method
At present, the sparse gradient will first merge the gradient when updating, which is to add up the gradients of the same parameter, and then update the parameters and additional parameters (such as velocity).
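For context, sparse gradients typically come from lookup-table style layers. The snippet below is our illustrative sketch of enabling them, not an excerpt from the framework source:
```python
import paddle.fluid as fluid

word = fluid.layers.data(name='word', shape=[1], dtype='int64')
# is_sparse=True makes the embedding's gradient a SelectedRows that holds only
# the rows actually looked up; duplicate rows are merged before the update.
emb = fluid.layers.embedding(input=word, size=[10000, 64], is_sparse=True)
loss = fluid.layers.reduce_sum(emb)
fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)
```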
### 7. (Video) Memory optimization
If the reverse of Op does not require all of the input and output of the forward op as its input, please do not use `DefaultGradOpDescMaker`, which will prevent Memory/Video Memory optimization for unused variables.
### 8. Calls made on Hybrid device
Since the GPU is executed asynchronously, the GPU side may not be actually executed after the CPU call returns. Therefore, if you create a temporary variable in Op that you need to use at the GPU runtime, when the GPU starts running, the temporary variable may have been released on the CPU side, which may cause GPU calculation errors.
Some of the synchronous and asynchronous operations in the GPU:
```
The following device operations are asynchronous with respect to the host:
Kernel launches;
Memory copies within a single device's memory;
Memory copies from host to device of a memory block of 64 KB or less;
Memory copies performed by functions that are suffixed with Async;
Memory set function calls.
```
Notes on cudaMemcpy and cudaMemcpyAsync:
- If the data transfer is from the GPU side to non-pinned CPU memory, the data transfer will be synchronous, even if an asynchronous copy operation is called.
- If the data is transferred from the CPU side to the CPU side, the data transfer will be synchronous, even if an asynchronous copy operation is called.
For more information, please refer to: [Asynchronous Concurrent Execution](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#asynchronous-concurrent-execution) , [API synchronization behavior](https://Docs.nvidia.com/cuda/cuda-runtime-api/api-sync-behavior.html#api-sync-behavior)
## Op Performance Optimization
### 1. Selection of third-party libraries
When writing an Op, prefer the operations provided by high-performance libraries (such as cudnn, mkldnn, mklml, eigen, etc.), but always benchmark them: some operations in these libraries can be slower in deep learning tasks. The operations provided by such libraries are more generalized, and their performance is not always sufficient; the amount of data in deep learning models is often small, so in some cases a high-performance library may actually be slower. An example is the Elementwise family of Ops (forward and backward). Elementwise operations are called relatively frequently in models, especially elementwise_add, which adds a bias in many operations. The previous implementation of elementwise_op called the Eigen library directly; since elementwise operations often need to broadcast data, and experiments found Eigen's broadcast to be slow, this was changed; the details are described in PR [#6229](https://github.com/PaddlePaddle/Paddle/pull/6229).
### 2.Op performance optimization
The computation speed of an Op is related to the amount of input data. For some Ops, different computation methods can be chosen according to the Op's attributes and the shape of the input data. For example, in concat_op with axis>=1, concatenating multiple tensors requires many copies for each tensor; on GPU this means calling cudaMemcpy. Relative to the CPU, the GPU is an external device, so each call to the GPU carries a certain overhead, and the more copies are required, the more prominent the overhead becomes. At present, concat_op selects different calling methods according to the shape and the axis value of the input data: if there are relatively many input tensors and axis is not 0, the many copy operations are converted into one CUDA kernel; if there are few input tensors and axis is 0, direct copies are used. The relevant experiments are described in PR [#8669](https://github.com/PaddlePaddle/Paddle/pull/8669).
Since calling a CUDA kernel has a certain overhead, multiple CUDA kernel calls in an Op may affect its execution speed. For example, the previous sequence_expand_op contained many CUDA kernels, which usually processed small amounts of data, so frequent calls to such kernels affected the Op's computation speed; in such cases it is better to combine the small CUDA kernels into one. This idea was used when optimizing sequence_expand_op (see PR [#9289](https://github.com/PaddlePaddle/Paddle/pull/9289)); the optimized sequence_expand_op is about twice as fast as the previous implementation, and the relevant experiments are described in that PR.
Reduce the number of copy and sync operations between the CPU and the GPU. For example, the fetch operation will update the model parameters and get a loss after each iteration, and the copy of the data from the GPU to the Non-Pinned-Memory CPU is synchronous, so frequent fetching for multiple parameters will reduce the model training speed.
## Op numerical stability
### 1. Some Ops have numerical stability problems
The main cause of numerical instability is that when a program is run multiple times, the order in which floating-point data is processed may differ, producing different final results. GPUs accelerate computation through multi-threaded parallelism, so a non-fixed order of floating-point operations is commonplace.
At present, the convolution operations in cudnn, MaxPooling in cudnn, CudaAtomicXX in CUDA, and the aggregation of parameter gradients in the Reduce mode of ParallelExecutor have been found to be non-deterministic.
For this purpose, some FLAGS are provided in Fluid. For example, FLAGS_cudnn_deterministic forces cudnn to use deterministic algorithms, and FLAGS_cpu_deterministic forces CPU-side computation to use deterministic methods.
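These flags are usually set through environment variables before Paddle is imported. A sketch, assuming the flags are read from the environment at import time:
```python
import os
# force deterministic algorithms; set before importing paddle.fluid
os.environ['FLAGS_cudnn_deterministic'] = 'true'
os.environ['FLAGS_cpu_deterministic'] = 'true'

import paddle.fluid as fluid
```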
### 2.On/Off of WITH_FAST_MATH
If WITH_FAST_MATH is ON, NVCC will use --use_fast_math when compiling Paddle and Eigen. This may make some CUDA operations (such as log, exp, tanh) faster at the cost of some precision, but it may also produce wrong results for some operations, such as pow; see [torch/DEPRECEATED-torch7-distro#132](https://github.com/torch/DEPRECEATED-torch7-distro/issues/132) for the specific reasons.
## Other
### 1. Error message
The message of an Enforce check must not be empty: a meaningful error message makes it much quicker and easier to analyze the cause of an error.
### 2.Op's mathematical formula
If Op has a mathematical formula, be sure to write the mathematical formula in the code and display it in the Doc of the Python API, because the user may need to understand how Paddle implements Op when comparing the calculation results among different frameworks.
**Note:** The formula preview must be done before the merge to the develop branch. Example: [dynamic_lstmp](../../../api/layers/nn.html#dynamic-lstmp).
### 3. The order of parameters in the Python-side Op interface
The order of the parameters in the Python API is generally ranked by importance, taking fc as an example:
```
def fc(input,
size,
num_flatten_dims=1,
param_attr=None,
bias_attr=None,
act=None,
is_test=False,
name=None)
```
.. _user_guide_use_numpy_array_as_train_data:

########################
Synchronous Data Reading
########################
PaddlePaddle Fluid supports configuring data layers with :code:`fluid.data()`;
the training data, given as numpy arrays or as C++ :code:`fluid.LoDTensor` objects
created directly from Python, is then passed to :code:`fluid.Executor` or
:code:`fluid.ParallelExecutor` through :code:`Executor.run(feed=...)`.
Configuring the Data Layer
##########################
The data layers needed by the neural network are configured with :code:`fluid.data()`, as follows:
.. code-block:: python
import paddle.fluid as fluid
image = fluid.data(name="image", shape=[None, 3, 224, 224])
label = fluid.data(name="label", shape=[None, 1], dtype="int64")
# use image/label as layer input
prediction = fluid.layers.fc(input=image, size=1000, act="softmax")
loss = fluid.layers.cross_entropy(input=prediction, label=label)
...
In the code above, :code:`image` and :code:`label` are two input data layers created by
:code:`fluid.data`. :code:`image` is float data of shape :code:`[None, 3, 224, 224]`;
:code:`label` is integer data of shape :code:`[None, 1]`. Note that:
1. When the Executor runs, it checks whether the :code:`shape` and :code:`dtype` of the fed data match those of the defined data layers; if they do not match, the program reports an error and exits. In some tasks, certain dimensions of the data vary across iterations; such a dimension can be set to None. For example, if the 0th dimension varies, the :code:`shape` can be set to :code:`[None, 3, 224, 224]`.
2. The data type used for category labels in Fluid is :code:`int64`, and labels start from 0. For the available data types, see :ref:`user_guide_paddle_support_data_types`.
.. _user_guide_feed_data_to_executor:
Passing Training Data to the Executor
#####################################
Both :code:`Executor.run` and :code:`ParallelExecutor.run` accept a :code:`feed` argument.
This argument is a Python dict whose keys are the names of the data layers (e.g. :code:`image` in the code above)
and whose values are the corresponding numpy arrays.
For example:
.. code-block:: python
exe = fluid.Executor(fluid.CPUPlace())
# init Program
exe.run(fluid.default_startup_program())
exe.run(feed={
"image": numpy.random.random(size=(32, 3, 224, 224)).astype('float32'),
"label": numpy.random.random(size=(32, 1)).astype('int64')
})
Advanced Usage
##############
How to Feed Sequence Data
-------------------------
Sequence data is a special data type supported by PaddlePaddle Fluid; :code:`LoDTensor` can be used
as the input data type. Feeding it requires: 1. all the data of one mini-batch to be trained;
2. the length of each sequence.
:code:`fluid.create_lod_tensor` can be used to create a :code:`LoDTensor`.
When feeding sequence information, the sequence nesting depth :code:`lod_level` must be set.
For instance, if the training data are sentences made of words, :code:`lod_level=1`; if the training data
are paragraphs made of sentences, which are in turn made of words, then :code:`lod_level=2`.
For example:
.. code-block:: python
sentence = fluid.data(name="sentence", dtype="int64", shape=[None, 1], lod_level=1)
...
exe.run(feed={
"sentence": create_lod_tensor(
data=numpy.array([1, 3, 4, 5, 3, 6, 8], dtype='int64').reshape(-1, 1),
recursive_seq_lens=[[4, 1, 2]],
place=fluid.CPUPlace()
)
})
The training data :code:`sentence` contains three samples, whose lengths are :code:`4, 1, 2`;
they are :code:`data[0:4]`, :code:`data[4:5]` and :code:`data[5:7]` respectively.
How to Set Per-Device Training Data with ParallelExecutor
---------------------------------------------------------
When passing data to :code:`ParallelExecutor.run(feed=...)`,
the data for each training device (e.g. each GPU) can be specified explicitly.
The user passes a list to the :code:`feed` argument, where each element of the list is a dict.
The keys of the dict are the names of the data layers and the values are the values of those layers.
For example:
.. code-block:: python
parallel_executor = fluid.ParallelExecutor()
parallel_executor.run(
feed=[
{
"image": numpy.random.random(size=(32, 3, 224, 224)).astype('float32'),
"label": numpy.random.random(size=(32, 1)).astype('int64')
},
{
"image": numpy.random.random(size=(16, 3, 224, 224)).astype('float32'),
"label": numpy.random.random(size=(16, 1)).astype('int64')
},
]
)
In the code above, GPU0 trains on 32 samples while GPU1 trains on 16 samples.
.. _user_guide_paddle_support_data_types:

Data Types Currently Supported by Fluid
---------------------------------------
The data types currently supported by PaddlePaddle Fluid are:
* float16: supported by some operations
* float32: the primary real number type
* float64: a secondary real number type, supported by most operations
* int32: a secondary label type
* int64: the primary label type
* uint64: a secondary label type
* bool: the control-flow data type
* int16: a secondary label type
* uint8: an input data type, usable for image pixels
.. _user_guide_use_numpy_array_as_train_data_en:
#################################
Take Numpy Array as Training Data
#################################
PaddlePaddle Fluid supports configuring data layer with :code:`fluid.data()` .
Then you can use Numpy Array or directly use Python to create C++
:code:`fluid.LoDTensor` , and then feed it to :code:`fluid.Executor` or :code:`fluid.ParallelExecutor`
through :code:`Executor.run(feed=...)` .
Configure Data Layer
############################
With :code:`fluid.data()` , you can configure data layer in neural network. Details are as follows:
.. code-block:: python
import paddle.fluid as fluid
image = fluid.data(name="image", shape=[None, 3, 224, 224])
label = fluid.data(name="label", shape=[None, 1], dtype="int64")
# use image/label as layer input
prediction = fluid.layers.fc(input=image, size=1000, act="softmax")
loss = fluid.layers.cross_entropy(input=prediction, label=label)
...
In the code above, :code:`image` and :code:`label` are two input data layers created by :code:`fluid.data` . :code:`image` is float data of shape :code:`[None, 3, 224, 224]` ; :code:`label` is the int data of shape :code:`[None, 1]` . Note that:
1. When the program executes, the executor checks whether the :code:`shape` and :code:`dtype` of the fed data are consistent with those defined; if they are inconsistent, the program exits with an error. In some tasks, a dimension changes across training steps; in this case the value of that dimension can be set to None. For example, the :code:`shape` can be set to :code:`[None, 3, 224, 224]` when the 0th dimension changes.
2. The data type of category labels in Fluid is :code:`int64` and the labels start from 0. About the supported data types, please refer to :ref:`user_guide_paddle_support_data_types_en` .
.. _user_guide_feed_data_to_executor_en:
Transfer Train Data to Executor
################################
Both :code:`Executor.run` and :code:`ParallelExecutor.run` receive a parameter :code:`feed` .
The parameter is a dict in Python. Its key is the name of data layer,such as :code:`image` in code above. And its value is the corresponding numpy array.
For example:
.. code-block:: python
exe = fluid.Executor(fluid.CPUPlace())
# init Program
exe.run(fluid.default_startup_program())
exe.run(feed={
"image": numpy.random.random(size=(32, 3, 224, 224)).astype('float32'),
"label": numpy.random.random(size=(32, 1)).astype('int64')
})
Advanced Usage
###############
How to feed Sequence Data
--------------------------
Sequence data is a unique data type supported by PaddlePaddle Fluid. You can take :code:`LoDTensor` as input data type.
You need to:
1. Feed all data to be trained in a mini-batch.
2. Get the length of each sequence.
You can use :code:`fluid.create_lod_tensor` to create :code:`LoDTensor` .
To feed sequence information, it is necessary to set the sequence nested depth :code:`lod_level` .
For instance, if the training data are sentences consisting of words, :code:`lod_level=1`; if the training data are paragraphs, which consist of sentences that in turn consist of words, :code:`lod_level=2` .
For example:
.. code-block:: python
sentence = fluid.data(name="sentence", dtype="int64", shape=[None, 1], lod_level=1)
...
exe.run(feed={
"sentence": create_lod_tensor(
data=numpy.array([1, 3, 4, 5, 3, 6, 8], dtype='int64').reshape(-1, 1),
recursive_seq_lens=[[4, 1, 2]],
place=fluid.CPUPlace()
)
})
Training data :code:`sentence` contains three samples, the lengths of which are :code:`4, 1, 2` respectively.
They are :code:`data[0:4]`, :code:`data[4:5]` and :code:`data[5:7]` respectively.
How to prepare training data for every device in ParallelExecutor
-------------------------------------------------------------------
When you feed data to :code:`ParallelExecutor.run(feed=...)` ,
you can explicitly assign data for every training device (such as GPU).
You need to feed a list to :code:`feed` . Each element of the list is a dict.
The key of the dict is name of data layer and the value of dict is value of data layer.
For example:
.. code-block:: python
parallel_executor = fluid.ParallelExecutor()
parallel_executor.run(
feed=[
{
"image": numpy.random.random(size=(32, 3, 224, 224)).astype('float32'),
"label": numpy.random.random(size=(32, 1)).astype('int64')
},
{
"image": numpy.random.random(size=(16, 3, 224, 224)).astype('float32'),
"label": numpy.random.random(size=(16, 1)).astype('int64')
},
]
)
In the code above, GPU0 will train 32 samples and GPU1 will train 16 samples.
.. _user_guide_paddle_support_data_types_en:
Data types supported by Fluid
-------------------------------
Data types supported by PaddlePaddle Fluid contains:
* float16: supported by part of operations
* float32: major data type of real number
* float64: minor data type of real number, supported by most operations
* int32: minor data type of labels
* int64: major data type of labels
* uint64: minor data type of labels
* bool: type of control flow data
* int16: minor type of labels
* uint8: input data type, used for pixel of picture
.. _user_guide_prepare_data:

############
Prepare Data
############
This chapter explains in detail how to provide data to a neural network, covering data preprocessing as well as the subsequent synchronous and asynchronous reading.
.. toctree::
    :maxdepth: 1

    prepare_steps.rst
    reader_cn.md
    use_py_reader.rst
    feeding_data.rst
.. _user_guide_prepare_data_en:
#############
Prepare Data
#############
This document mainly introduces how to provide data to the network, including the synchronous method and the asynchronous method.
.. toctree::
    :maxdepth: 1

    prepare_steps_en.rst
    reader.md
    use_py_reader_en.rst
    feeding_data_en.rst
.. _user_guide_prepare_steps:

#################
Preparation Steps
#################
Preparing data with PaddlePaddle Fluid takes three steps:
Step 1: Define a reader to generate training/prediction data
##############################################################
The generated data can be numpy arrays or LoDTensors. Depending on the form of the data returned, readers are divided into batch-level readers and sample-level readers.
A batch-level reader returns one batch of data at a time, while a sample-level reader returns a single sample at a time.
If your data is sample-level, we provide a tool for data preprocessing and batch building: :code:`Python Reader`.
Step 2: Define data layer variables in the network configuration
##################################################################
Data layer variables are defined in the network with :code:`fluid.data`, which requires each layer's name, dtype and shape. For example:
.. code-block:: python
import paddle.fluid as fluid
image = fluid.data(name='image', dtype='float32', shape=[None, 28, 28])
label = fluid.data(name='label', dtype='int64', shape=[None, 1])
Here, None denotes a dimension that is not known in advance; in this example, None means the batch size.
Step 3: Feed the data into the network for training/prediction
################################################################
Fluid provides two ways to feed data, the asynchronous DataLoader API and the synchronous feed method:
- Asynchronous DataLoader API: the user first defines a DataLoader object with :code:`fluid.io.DataLoader`, then sets its data source with one of the DataLoader's set methods. With the DataLoader API, data transfer and model training/prediction run asynchronously, which is efficient and recommended.
- Synchronous feed method: the user constructs the input data and passes it to :code:`fluid.Executor` or :code:`fluid.ParallelExecutor` with :code:`executor.run(feed=...)`. Data preparation and model training/prediction run synchronously, which is less efficient.
The two methods compare as follows:

======================== ================================== ==================================
Comparison item          Synchronous feed method            Asynchronous DataLoader API
======================== ================================== ==================================
API                      :code:`executor.run(feed=...)`     :code:`fluid.io.DataLoader`
Data format              numpy array or LoDTensor           numpy array or LoDTensor
Data augmentation        done in Python with other libs     done in Python with other libs
Speed                    slow                               fast
Recommended use          model debugging                    industrial training
======================== ================================== ==================================

How the Reader's data type affects usage
##########################################
Depending on the Reader's data type, the concrete operation of the steps above differs, as described below:
Reading data from a sample-level Reader
+++++++++++++++++++++++++++++++++++++++
If the custom Reader returns a single sample at a time, feed the data as follows:
Step 1. Build batches
=====================
Call Fluid's Reader-related interfaces to build batches and do part of the preprocessing; see `Data Preprocessing Tools <./reader_cn.html>`_ for details.
Step 2. Feed the data
=====================
When using the asynchronous DataLoader API, call :code:`set_sample_generator` or :code:`set_sample_list_generator`; see :ref:`user_guides_use_py_reader` for details.
When using the synchronous feed method, use the DataFeeder interface to convert the Reader's data into LoDTensor format before feeding it to the network; see :ref:`cn_api_fluid_DataFeeder` for details. A minimal sketch is given below.
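The sketch below illustrates the synchronous path; ``sample_reader`` stands for a user-defined sample-level reader and is not part of the API:

.. code-block:: python

    import paddle
    import paddle.fluid as fluid

    image = fluid.data(name='image', dtype='float32', shape=[None, 28, 28])
    label = fluid.data(name='label', dtype='int64', shape=[None, 1])

    # DataFeeder converts a batch of reader samples into LoDTensors keyed by name
    feeder = fluid.DataFeeder(feed_list=[image, label], place=fluid.CPUPlace())
    exe = fluid.Executor(fluid.CPUPlace())
    exe.run(fluid.default_startup_program())

    for mini_batch in paddle.batch(sample_reader, batch_size=64)():
        exe.run(feed=feeder.feed(mini_batch))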
Reading data from a batch-level Reader
++++++++++++++++++++++++++++++++++++++
Step 1. Build batches
=====================
Since the data is already batched, the requirement of Step 1 is already met and you can go directly to Step 2.
Step 2. Feed the data
=====================
When using the asynchronous DataLoader API, call the DataLoader's :code:`set_batch_generator`; see :ref:`user_guides_use_py_reader` for details. A minimal sketch is given below.
When using the synchronous feed method, see :ref:`user_guide_use_numpy_array_as_train_data`.
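For the asynchronous path, a minimal sketch of wiring a batch-level reader into a DataLoader; ``batch_reader`` stands for a user-defined batch-level reader:

.. code-block:: python

    import paddle.fluid as fluid

    image = fluid.data(name='image', dtype='float32', shape=[None, 28, 28])
    label = fluid.data(name='label', dtype='int64', shape=[None, 1])

    loader = fluid.io.DataLoader.from_generator(feed_list=[image, label], capacity=16)
    # batch_reader already yields whole mini-batches, so set_batch_generator is used
    loader.set_batch_generator(batch_reader, places=fluid.cpu_places())

    exe = fluid.Executor(fluid.CPUPlace())
    exe.run(fluid.default_startup_program())
    for data in loader():
        exe.run(feed=data)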
.. _user_guide_prepare_steps_en:
#############
Prepare Steps
#############
Data preparation in PaddlePaddle Fluid can be separated into 3 steps.
Step 1: Define a reader to generate training/testing data
##########################################################
The generated data type can be Numpy Array or LoDTensor. According to the different data formats returned by the reader, readers are divided into batch readers and sample readers.
A batch reader yields a mini-batch of data at a time, while a sample reader yields a single sample at a time.
If your reader yields single samples, we provide a data augmentation and batching tool for you: :code:`Python Reader` .
Step 2: Define data layer variables in network
###############################################
Users should use :code:`fluid.data` to define data layer variables. Name, dtype and shape are required when defining. For example,
.. code-block:: python
import paddle.fluid as fluid
image = fluid.data(name='image', dtype='float32', shape=[None, 28, 28])
label = fluid.data(name='label', dtype='int64', shape=[None, 1])
None means that the dimension is uncertain. In this example, None means the batch size.
Step 3: Send the data to network for training/testing
######################################################
PaddlePaddle Fluid provides 2 methods for sending data to the network: Asynchronous DataLoader API, and Synchronous Feed Method.
- Asynchronous DataLoader API
User should use :code:`fluid.io.DataLoader` to define a DataLoader object and use its setter method to set the data source.
When using the DataLoader API, the process of data sending works asynchronously with network training/testing.
It is an efficient way of sending data and is recommended.
- Synchronous Feed Method
User should create the feeding data beforehand and use :code:`executor.run(feed=...)` to send the data to :code:`fluid.Executor` or :code:`fluid.ParallelExecutor` .
Data preparation and network training/testing work synchronously, which is less efficient.
A comparison of these 2 methods is as follows:
========================== ================================= ======================================
Comparison item Synchronous Feed Method Asynchronous DataLoader API
========================== ================================= ======================================
API :code:`executor.run(feed=...)` :code:`fluid.io.DataLoader`
Data type Numpy Array or LoDTensor Numpy Array or LoDTensor
Data augmentation use Python for data augmentation use Python for data augmentation
Speed slow rapid
Recommended applications model debugging industrial training
========================== ================================= ======================================
Choose different usages for different data formats
###################################################
According to the different data formats of reader, users should choose different usages for data preparation.
Read data from sample reader
+++++++++++++++++++++++++++++
If user-defined reader is a sample reader, users should use the following steps:
Step 1. Batching
=================
Use the data reader interfaces in PaddlePaddle Fluid for data augmentation and batching. Please refer to `Python Reader <./reader.html>`_ for details.
Step 2. Sending data
=====================
If using Asynchronous DataLoader API, please use :code:`set_sample_generator` or :code:`set_sample_list_generator` to set the data source for DataLoader. Please refer to :ref:`user_guide_use_py_reader_en` for details.
If using Synchronous Feed Method, please use DataFeeder to convert the reader data to LoDTensor before sending to the network. Please refer to :ref:`api_fluid_DataFeeder` for details.
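As an illustration, a minimal sketch of the asynchronous path for a sample-level reader; ``sample_reader`` stands for a user-defined reader that yields single samples:

.. code-block:: python

    import paddle
    import paddle.fluid as fluid

    image = fluid.data(name='image', dtype='float32', shape=[None, 28, 28])
    label = fluid.data(name='label', dtype='int64', shape=[None, 1])

    loader = fluid.io.DataLoader.from_generator(feed_list=[image, label], capacity=16)
    # paddle.batch groups single samples into lists of samples, which is the
    # format expected by set_sample_list_generator
    loader.set_sample_list_generator(paddle.batch(sample_reader, batch_size=64),
                                     places=fluid.cpu_places())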
Read data from batch reader
++++++++++++++++++++++++++++
Step 1. Batching
=================
Since the reader has been a batch reader, this step can be skipped.
Step 2. Sending data
=====================
If using Asynchronous DataLoader API, please use :code:`set_batch_generator` to set the data source for DataLoader. Please refer to :ref:`user_guide_use_py_reader_en` for details.
If using Synchronous Feed Method, please refer to :ref:`user_guide_use_numpy_array_as_train_data_en` for details.
# Python Reader
During the training and testing phases, PaddlePaddle programs need to read data. To help users write code that reads input data, we define the following:
- A *reader*: A function that reads data (from file, network, random number generator, etc) and yields the data items.
- A *reader creator*: A function that returns a reader function.
- A *reader decorator*: A function, which takes in one or more readers, and returns a reader.
- A *batch reader*: A function that reads data (from *reader*, file, network, random number generator, etc) and yields a batch of data items.
We also provide a function to convert a reader into a batch reader, as well as frequently used reader creators and reader decorators.
## Data Reader Interface
*Data reader* doesn't have to be a function that reads and yields data items. It can just be any function without any parameters that creates an iterable (anything can be used in `for x in iterable`) as follows:
```
iterable = data_reader()
```
The item produced from the iterable should be a **single** entry of data and **not** a mini batch. The entry of data could be a single item or a tuple of items. Item should be of one of the supported types (e.g., numpy 1d array of float32, int, list of int etc.)
An example implementation for single item data reader creator is as follows:
```python
def reader_creator_random_image(width, height):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height)
return reader
```
An example implementation for multiple item data reader creator is as follows:
```python
def reader_creator_random_image_and_label(width, height, label):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height), label
return reader
```
## Batch Reader Interface
*Batch reader* can be any function without any parameters that creates an iterable (anything can be used in `for x in iterable`). The output of the iterable should be a batch (list) of data items. Each item inside the list should be a tuple.
Here are some valid outputs:
```python
# a mini batch of three data items. Each data item consists of three columns of data.
[(1, 1, 1),
(2, 2, 2),
(3, 3, 3)]
# a mini batch of three data items, each data item is a list (single column).
[([1,1,1],),
([2,2,2],),
([3,3,3],)]
```
Please note that each item inside the list must be a tuple, below is an invalid output:
```python
# wrong, [1,1,1] needs to be inside a tuple: ([1,1,1],).
# Otherwise it is ambiguous whether [1,1,1] means a single column of data [1, 1, 1],
# or three columns of data, each of which is 1.
[[1,1,1],
[2,2,2],
[3,3,3]]
```
It is easy to convert from a reader to a batch reader:
```python
mnist_train = paddle.dataset.mnist.train()
mnist_train_batch_reader = paddle.batch(mnist_train, 128)
```
It is also straightforward to create a custom batch reader:
```python
def custom_batch_reader():
while True:
batch = []
for i in xrange(128):
batch.append((numpy.random.uniform(-1, 1, 28*28),)) # note that it's a tuple being appended.
yield batch
mnist_random_image_batch_reader = custom_batch_reader
```
## Usage
The following shows how to use a reader with PaddlePaddle:
the batch reader, a mapping from item(s) to data layer, the batch size and the total number of passes are passed into `paddle.train` as follows:
```python
# two data layers are created:
image_layer = paddle.layer.data("image", ...)
label_layer = paddle.layer.data("label", ...)
# ...
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
paddle.train(batch_reader, {"image":0, "label":1}, 128, 10, ...)
```
## Data Reader Decorator
The *Data reader decorator* takes in a single reader or multiple data readers and returns a new data reader. It is similar to a [python decorator](https://wiki.python.org/moin/PythonDecorators), but it does not use `@` in the syntax.
Since we have a strict interface for data readers (no parameters and return a single data item), a data reader can be used in a flexible way using data reader decorators. Following are a few examples:
### Prefetch Data
Since reading data may take some time and training cannot proceed without data, it is generally a good idea to prefetch the data.
Use `paddle.reader.buffered` to prefetch data:
```python
buffered_reader = paddle.reader.buffered(paddle.dataset.mnist.train(), 100)
```
`buffered_reader` will try to buffer (prefetch) `100` data entries.
### Compose Multiple Data Readers
For example, if we want to use a source of real images (say reusing mnist dataset), and a source of random images as input for [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661).
We can do the following:
```python
def reader_creator_random_image(width, height):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height)
return reader
def reader_creator_bool(t):
def reader():
while True:
yield t
return reader
true_reader = reader_creator_bool(True)
false_reader = reader_creator_bool(False)
reader = paddle.reader.compose(paddle.dataset.mnist.train(), reader_creator_random_image(20, 20), true_reader, false_reader)
# Skipped 1 because paddle.dataset.mnist.train() produces two items per data entry.
# And we don't care about the second item at this time.
paddle.train(paddle.batch(reader, 128), {"true_image":0, "fake_image": 2, "true_label": 3, "false_label": 4}, ...)
```
### Shuffle
Given the shuffle buffer size `n`, `paddle.reader.shuffle` returns a data reader that buffers `n` data entries and shuffles them before a data entry is read.
Example:
```python
reader = paddle.reader.shuffle(paddle.dataset.mnist.train(), 512)
```
## Q & A
### Why does a reader return only a single entry, and not a mini batch?
Returning a single entry makes reusing existing data readers much easier (for example, if an existing reader returns 3 entries instead of a single entry, the training code will be more complicated because it needs to handle cases like a batch size 2).
We provide a function: `paddle.batch` to turn (a single entry) reader into a batch reader.
### Why do we need a batch reader? Isn't it sufficient to give the reader and batch_size as arguments during training?
In most of the cases, it would be sufficient to give the reader and batch_size as arguments to the train method. However sometimes the user wants to customize the order of data entries inside a mini batch, or even change the batch size dynamically. For these cases using a batch reader is very efficient and helpful.
### Why use a dictionary instead of a list to provide mapping?
Using a dictionary (`{"image":0, "label":1}`) instead of a list (`["image", "label"]`) gives the advantage that the user can easily reuse the items (e.g., using `{"image_a":0, "image_b":0, "label":1}`) or even skip an item (e.g., using `{"image_a":0, "label":2}`).
### How to create a custom data reader creator ?
```python
def image_reader_creator(image_path, label_path, n):
def reader():
f = open(image_path)
l = open(label_path)
images = numpy.fromfile(
f, 'ubyte', count=n * 28 * 28).reshape((n, 28 * 28)).astype('float32')
images = images / 255.0 * 2.0 - 1.0
labels = numpy.fromfile(l, 'ubyte', count=n).astype("int")
for i in xrange(n):
yield images[i, :], labels[i] # a single entry of data is created each time
f.close()
l.close()
return reader
# image_reader_creator creates a reader
reader = image_reader_creator("/path/to/image_file", "/path/to/label_file", 1024)
paddle.train(paddle.batch(reader, 128), {"image":0, "label":1}, ...)
```
# Data Preprocessing Tools
During training and prediction, PaddlePaddle programs need to read training or prediction data. To help you write data-reading code, we provide the following interfaces:
- *reader*: a sample-level reader; a function that reads data (from a file, the network, a random number generator, etc.) and yields one sample at a time.
- *reader creator*: a function that returns a reader function.
- *reader decorator*: a function that takes one or more readers and returns a new reader.
- *batch reader*: a function that reads data (from a reader, a file, the network, a random number generator, etc.) and yields one batch of data at a time.
In addition, a function for converting a reader into a batch reader is provided; reader creators and reader decorators are used frequently.
## The Data Reader Interface
A data reader does not have to be a function that reads and yields data items. It can be any parameterless function that returns an iterable (i.e. any object usable in `for x in iterable`):
```
iterable = data_reader()
```
The iterable should yield data one entry at a time, either a single item or a tuple of items, rather than a mini-batch. The items yielded should be of the [supported types](./feeding_data.html#fluid), e.g. a float32 numpy 1-d array, an int, a list of ints, etc.
Below is an example implementation of a reader creator for single-item entries:
```python
def reader_creator_random_image(width, height):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height)
return reader
```
Below is an example implementation of a reader creator for multi-item entries:
```python
def reader_creator_random_image_and_label(width, height, label):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height), label
return reader
```
## The Batch Reader Interface
A *batch reader* can be any parameterless function that returns an iterable (i.e. any object usable in `for x in iterable`). The iterable should yield one batch (list) of data entries at a time, and every entry in the list must be a tuple.
Here are some valid outputs:
```python
# a mini-batch of three data entries; each entry has three columns of data.
[(1, 1, 1),
(2, 2, 2),
(3, 3, 3)]
# a mini-batch of three data entries; each entry is a list (a single column).
[([1,1,1],),
([2,2,2],),
([3,3,3],)]
```
Note that every entry in the list must be a tuple; the output below is invalid:
```python
# wrong: [1,1,1] must be inside a tuple: ([1,1,1],).
# Otherwise it is ambiguous whether [1,1,1] means a single column of data [1, 1, 1],
# or three columns of data, each of which is 1.
[[1,1,1],
[2,2,2],
[3,3,3]]
```
It is easy to convert a reader into a batch reader:
```python
mnist_train = paddle.dataset.mnist.train()
mnist_train_batch_reader = paddle.batch(mnist_train, 128)
```
A custom batch reader can also be created directly:
```python
def custom_batch_reader():
while True:
batch = []
for i in xrange(128):
batch.append((numpy.random.uniform(-1, 1, 28*28),)) # note that it's a tuple being appended.
yield batch
mnist_random_image_batch_reader = custom_batch_reader
```
## Usage
The following shows how to use a reader with PaddlePaddle:
the batch reader, a mapping from item(s) to data layers, the batch size and the total number of passes are passed into `paddle.train`:
```python
# two data layers are created:
image_layer = paddle.layer.data("image", ...)
label_layer = paddle.layer.data("label", ...)
# ...
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
paddle.train(batch_reader, {"image":0, "label":1}, 128, 10, ...)
```
## Data Reader Decorators
A *data reader decorator* takes one or more readers as arguments and returns a new reader. It is similar to a [python decorator](https://wiki.python.org/moin/PythonDecorators), but does not require the `@` syntax.
Since the data reader interface is strictly limited (no parameters, yields a single entry), readers can be combined flexibly with data reader decorators. Some examples follow:
### Prefetching (Buffering) Data
Since reading data takes some time and training cannot proceed without data, prefetching the data is generally a good idea.
Use `paddle.reader.buffered` to prefetch data:
```python
buffered_reader = paddle.reader.buffered(paddle.dataset.mnist.train(), 100)
```
`buffered_reader` will try to buffer (prefetch) `100` data entries.
### Composing Multiple Data Readers
For example, suppose we want to use a source of real images (reusing the mnist dataset) and a source of random images as input for a [Generative Adversarial Network](https://arxiv.org/abs/1406.2661).
We can do the following:
```python
def reader_creator_random_image(width, height):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height)
return reader
def reader_creator_bool(t):
def reader():
while True:
yield t
return reader
true_reader = reader_creator_bool(True)
false_reader = reader_creator_bool(False)
reader = paddle.reader.compose(paddle.dataset.mnist.train(), reader_creator_random_image(20, 20), true_reader, false_reader)
# Index 1 is skipped because paddle.dataset.mnist.train() produces two items per data entry,
# and we do not care about the second item here.
paddle.train(paddle.batch(reader, 128), {"true_image":0, "fake_image": 2, "true_label": 3, "false_label": 4}, ...)
```
### Shuffling
Given a shuffle buffer of size `n`, `paddle.reader.shuffle` returns a data reader that buffers `n` data entries and shuffles them before a data entry is read.
Example:
```python
reader = paddle.reader.shuffle(paddle.dataset.mnist.train(), 512)
```
## Q & A
### 为什么一个reader只返回单项而不是mini batch?
返回单项,可以更容易地复用已有的data reader,例如如果一个已有的reader返回3项而不是一个单项,这样训练代码会更复杂,因为需要处理像batch_size为2这样的例子。
我们提供一个函数来将一个单项reader转换成一个batch reader。
### 为什么需要一个batch reader,在训练过程中给出reader和batch_size参数这样不足够吗?
在大多数情况下,在训练方法中给出reader和batch_size参数是足够的。但有时用户想自定义mini batch里数据项的顺序,或者动态改变batch_size。在这些情况下用batch reader会非常高效有用。
### 为什么用字典而不是列表进行映射?
使用字典(`{"image":0, "label":1}`)而不是列表`["image", "label"]`)有利于用户易于复用数据项,例如使用`{"image_a":0, "image_b":0, "label":1}`,或者甚至跳过数据项,例如使用`{"image_a":0, "label":2}`
### 如何创建一个自定义data reader?
```python
def image_reader_creator(image_path, label_path, n):
def reader():
f = open(image_path)
l = open(label_path)
images = numpy.fromfile(
f, 'ubyte', count=n * 28 * 28).reshape((n, 28 * 28)).astype('float32')
images = images / 255.0 * 2.0 - 1.0
labels = numpy.fromfile(l, 'ubyte', count=n).astype("int")
for i in xrange(n):
yield images[i, :], labels[i] # a single entry of data is created each time
f.close()
l.close()
return reader
# image_reader_creator creates a reader
reader = image_reader_creator("/path/to/image_file", "/path/to/label_file", 1024)
```
####################
Distributed Training
####################
.. toctree::
    :maxdepth: 1

    cluster_quick_start.rst
    fleet_api_howto_cn.rst
.. _user_guide_distribute_en:

######################
Distributed Training
######################
.. toctree::
    :maxdepth: 1

    cluster_quick_start_en.rst
    cluster_howto_en.rst
###################
Multi-node Training
###################
.. toctree::
    :maxdepth: 1

    cluster_quick_start.rst
    cluster_howto.rst
    fleet_api_howto_cn.rst
####################
Multi-node Training
####################
.. toctree::
    :maxdepth: 1

    cluster_quick_start_en.rst
    cluster_howto_en.rst
    train_on_baidu_cloud_en.rst