提交 98486ca8 编写于 作者: chrisxu2014's avatar chrisxu2014

Merge branch 'gh-pages' of https://github.com/chrisxu2016/Paddle into gh-pages

......@@ -13,79 +13,108 @@
### 训练数据的存储
选择GlusterFS作为训练数据的存储服务(后续的实现考虑HDFS)
选择CephFS作为训练数据的存储服务
在Kubernetes上运行的不同的计算框架,可以通过Volume或PersistentVolume挂载存储空间到每个容器中。
GlusterFS存储系统中的公开目录,需要保存一些预置的公开数据集(比如MNIST, BOW, imagenet数据集等),并且可以被提交的job直接使用。
CephFS存储系统中的公开目录,需要保存一些预置的公开数据集(比如MNIST, BOW, ImageNet数据集等),并且可以被提交的job直接使用。
### 上传训练文件
### 文件预处理
使用下面命令,可以把本地的训练数据上传到存储集群中,并指定上传数据的`dataset-name`
在数据集可以被训练之前,文件需要预先被转换成PaddlePaddle集群内部的存储格式(SSTable)。我们提供两个转换方式
```
paddle upload train_data.list "dataset-name"
```
- 提供给用户本地转换的库,用户可以编写程序完成转换。
- 用户可以上传自己的数据集,在集群运行MapReduce job完成转换。
其中`.list`文件描述了训练数据的文件和对应的label,对于图像类数据,`.list文件`样例如下,每一行包含了图片文件的路径和其label(用tab分隔开)
转换生成的文件名会是以下格式
```
./data/image1.jpg 1
./data/image2.jpg 5
./data/image3.jpg 2
./data/image4.jpg 5
./data/image5.jpg 1
./data/image6.jpg 8
...
```text
name_prefix-aaaaa-of-bbbbb
```
对于文本类训练数据样例如下(机器翻译),一行中包含源语言,目标语言的文本(label):
"aaaaa"和"bbbbb"都是五位的数字,每一个文件是数据集的一个shard,"aaaaa"代表shard的index,"bbbbb"代表这个shard的最大index。
比如ImageNet这个数据集可能被分成1000个shard,它们的文件名是:
```text
imagenet-00000-of-00999
imagenet-00001-of-00999
...
imagenet-00999-of-00999
```
L' inflation , en Europe , a dérapé sur l' alimentation Food : Where European inflation slipped up
L' inflation accélérée , mesurée dans la zone euro , est due principalement à l' augmentation rapide des prix de l' alimentation . The skyward zoom in food prices is the dominant force behind the speed up in eurozone inflation .
...
#### 转换库
无论是在本地或是云端转换,我们都提供Python的转换库,接口是:
```python
def convert(output_path, reader, num_shards, name_prefix)
```
### 使用reader
- `output_path`: directory in which output files will be saved.
- `reader`: a [data reader](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/reader/README.md#data-reader-interface), from which the convert program will read data instances.
- `num_shards`: the number of shards that the dataset will be partitioned into.
- `name_prefix`: the name prefix of generated files.
用户在使用v2 API编写训练任务时,可以使用paddle内置的reader完成对GlusterFS存储中的训练数据的读取,返回文件中的各列,然后在调用`trainer.train()`时传入,完成训练数据的读取
`reader`每次输出一个data instance,这个instance可以是单个值,或者用tuple表示的多个值
```python
reader = paddle.dist.reader("dataset-name")
trainer.train(reader, ...)
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
trainer.train(batch_reader, ...)
yield 1 # 单个值
yield numpy.random.uniform(-1, 1, size=28*28) # 单个值
yield numpy.random.uniform(-1, 1, size=28*28), 0 # 多个值
```
trainer.train内部会获取reader的内容:
每个值的类型可以是整形、浮点型数据、字符串,或者由它们组成的list,以及numpy.ndarray。如果是其它类型,会被Pickle序列化成字符串。
```
def paddle.train(batch_reader):
r = batch_reader() # create a iterator for one pass of data
for batch in r:
# train
### 示例程序
#### 使用转换库
以下`reader_creator`生成的`reader`每次输出一个data instance,每个data instance包涵两个值:numpy.ndarray类型的值和整型的值:
```python
def reader_creator():
def reader():
for i in range(1000):
yield numpy.random.uniform(-1, 1, size=28*28), 0 # 多个值
return reader
```
这里面batch是含有128个data instance的mini-batch。每一个data instance会是一个tuple,tuple元素的顺序与`.list`文件文件中每一列的顺序是一致的。每一个data instance会是(raw_image_file_binary_data, label)。其中raw_image_file_binary_data是对应图像文件的没有解码的原始二进制数据,用户需要自己解码。label是文本类型(比如:“1“,”2“),这里用户需要的其实是整形,用户需要自己转换成整形。
把`reader_creator`生成的`reader`传入`convert`函数即可完成转换:
```python
convert("./", reader_creator(), 100, random_images)
```
### 实现reader
以上命令会在当前目录下生成100个文件:
```text
random_images-00000-of-00099
random_images-00001-of-00099
...
random_images-00099-of-00099
```
reader的实现需要考虑本地训练程序实现之后,可以不修改程序直接提交集群进行分布式训练。要达到这样的目标,需要实现下面的功能:
#### 进行训练
paddle会封装一个在集群中使用的reader: `paddle.dist.reader()`。在集群训练时需要使用这个reader指定要使用的数据集开始训练。用户的训练程序需要按照如下方式初始化reader
PaddlePaddle提供专用的[data reader creator](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/reader/README.md#python-data-reader-design-doc),生成给定SSTable文件对应的data reader。**无论在本地还是在云端,reader的使用方式都是一致的**
```python
if os.getenv("PADDLE_TRAIN_LOCAL"):
reader = my_local_reader("dataset-name")
else:
reader = paddle.dist.reader("dataset-name")
# ...
reader = paddle.reader.creator.SSTable("/home/random_images-*-of-*")
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
trainer.train(batch_reader, ...)
```
用户训练程序提交到集群之后,集群会自动设置`PADDLE_TRAIN_LOCAL`环境变量,reader会被配置成集群训练的版本。其中`paddle.dist.reader()`需要从master的队列中获得需要开始执行的训练task,并找到对应的训练数据文件,开始训练任务。如果用户的训练数据源来自于其他服务,比如从集群中的Kafka,zeromq队列读取,也可以根据实际情况实现集群中运行的reader程序
以上代码的reader输出的data instance与生成数据集时,reader输出的data instance是一模一样的
### 上传训练文件
使用下面命令,可以把本地的数据上传到存储集群中。
```bash
paddle cp filenames pfs://home/folder/
```
比如,把之前示例中转换完毕的random_images数据集上传到云端的`/home/`可以用以下指令:
```bash
paddle cp random_images-*-of-* pfs://home/
```
## TODO
### 支持将数据合并成内部的文件格式(key-value),方便sharding与顺序读取
### 支持用户自定义的数据预处理job
......@@ -1505,9 +1505,15 @@ to maintain tractability.</p>
<span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">last_time_step_output</span>
<span class="k">return</span> <span class="n">simple_rnn</span>
<span class="n">generated_word_embedding</span> <span class="o">=</span> <span class="n">GeneratedInput</span><span class="p">(</span>
<span class="n">size</span><span class="o">=</span><span class="n">target_dictionary_dim</span><span class="p">,</span>
<span class="n">embedding_name</span><span class="o">=</span><span class="s2">&quot;target_language_embedding&quot;</span><span class="p">,</span>
<span class="n">embedding_size</span><span class="o">=</span><span class="n">word_vector_dim</span><span class="p">)</span>
<span class="n">beam_gen</span> <span class="o">=</span> <span class="n">beam_search</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;decoder&quot;</span><span class="p">,</span>
<span class="n">step</span><span class="o">=</span><span class="n">rnn_step</span><span class="p">,</span>
<span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">StaticInput</span><span class="p">(</span><span class="n">encoder_last</span><span class="p">)],</span>
<span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">StaticInput</span><span class="p">(</span><span class="n">encoder_last</span><span class="p">),</span>
<span class="n">generated_word_embedding</span><span class="p">],</span>
<span class="n">bos_id</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">eos_id</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">beam_size</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
......@@ -1529,7 +1535,8 @@ sharing a same set of weights.</p>
<p>You can refer to the first parameter of recurrent_group, or
demo/seqToseq/seqToseq_net.py for more details.</p>
</li>
<li><strong>input</strong> (<em>list</em>) &#8211; Input data for the recurrent unit</li>
<li><strong>input</strong> (<em>list</em>) &#8211; Input data for the recurrent unit, which should include the
previously generated words as a GeneratedInput object.</li>
<li><strong>bos_id</strong> (<em>int</em>) &#8211; Index of the start symbol in the dictionary. The start symbol
is a special token for NLP task, which indicates the
beginning of a sequence. In the generation task, the start
......
因为 它太大了无法显示 source diff 。你可以改为 查看blob
......@@ -13,79 +13,108 @@
### 训练数据的存储
选择GlusterFS作为训练数据的存储服务(后续的实现考虑HDFS)
选择CephFS作为训练数据的存储服务
在Kubernetes上运行的不同的计算框架,可以通过Volume或PersistentVolume挂载存储空间到每个容器中。
GlusterFS存储系统中的公开目录,需要保存一些预置的公开数据集(比如MNIST, BOW, imagenet数据集等),并且可以被提交的job直接使用。
CephFS存储系统中的公开目录,需要保存一些预置的公开数据集(比如MNIST, BOW, ImageNet数据集等),并且可以被提交的job直接使用。
### 上传训练文件
### 文件预处理
使用下面命令,可以把本地的训练数据上传到存储集群中,并指定上传数据的`dataset-name`
在数据集可以被训练之前,文件需要预先被转换成PaddlePaddle集群内部的存储格式(SSTable)。我们提供两个转换方式
```
paddle upload train_data.list "dataset-name"
```
- 提供给用户本地转换的库,用户可以编写程序完成转换。
- 用户可以上传自己的数据集,在集群运行MapReduce job完成转换。
其中`.list`文件描述了训练数据的文件和对应的label,对于图像类数据,`.list文件`样例如下,每一行包含了图片文件的路径和其label(用tab分隔开)
转换生成的文件名会是以下格式
```
./data/image1.jpg 1
./data/image2.jpg 5
./data/image3.jpg 2
./data/image4.jpg 5
./data/image5.jpg 1
./data/image6.jpg 8
...
```text
name_prefix-aaaaa-of-bbbbb
```
对于文本类训练数据样例如下(机器翻译),一行中包含源语言,目标语言的文本(label):
"aaaaa"和"bbbbb"都是五位的数字,每一个文件是数据集的一个shard,"aaaaa"代表shard的index,"bbbbb"代表这个shard的最大index。
比如ImageNet这个数据集可能被分成1000个shard,它们的文件名是:
```text
imagenet-00000-of-00999
imagenet-00001-of-00999
...
imagenet-00999-of-00999
```
L&apos; inflation , en Europe , a dérapé sur l&apos; alimentation Food : Where European inflation slipped up
L&apos; inflation accélérée , mesurée dans la zone euro , est due principalement à l&apos; augmentation rapide des prix de l&apos; alimentation . The skyward zoom in food prices is the dominant force behind the speed up in eurozone inflation .
...
#### 转换库
无论是在本地或是云端转换,我们都提供Python的转换库,接口是:
```python
def convert(output_path, reader, num_shards, name_prefix)
```
### 使用reader
- `output_path`: directory in which output files will be saved.
- `reader`: a [data reader](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/reader/README.md#data-reader-interface), from which the convert program will read data instances.
- `num_shards`: the number of shards that the dataset will be partitioned into.
- `name_prefix`: the name prefix of generated files.
用户在使用v2 API编写训练任务时,可以使用paddle内置的reader完成对GlusterFS存储中的训练数据的读取,返回文件中的各列,然后在调用`trainer.train()`时传入,完成训练数据的读取
`reader`每次输出一个data instance,这个instance可以是单个值,或者用tuple表示的多个值
```python
reader = paddle.dist.reader("dataset-name")
trainer.train(reader, ...)
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
trainer.train(batch_reader, ...)
yield 1 # 单个值
yield numpy.random.uniform(-1, 1, size=28*28) # 单个值
yield numpy.random.uniform(-1, 1, size=28*28), 0 # 多个值
```
trainer.train内部会获取reader的内容:
每个值的类型可以是整形、浮点型数据、字符串,或者由它们组成的list,以及numpy.ndarray。如果是其它类型,会被Pickle序列化成字符串。
```
def paddle.train(batch_reader):
r = batch_reader() # create a iterator for one pass of data
for batch in r:
# train
### 示例程序
#### 使用转换库
以下`reader_creator`生成的`reader`每次输出一个data instance,每个data instance包涵两个值:numpy.ndarray类型的值和整型的值:
```python
def reader_creator():
def reader():
for i in range(1000):
yield numpy.random.uniform(-1, 1, size=28*28), 0 # 多个值
return reader
```
这里面batch是含有128个data instance的mini-batch。每一个data instance会是一个tuple,tuple元素的顺序与`.list`文件文件中每一列的顺序是一致的。每一个data instance会是(raw_image_file_binary_data, label)。其中raw_image_file_binary_data是对应图像文件的没有解码的原始二进制数据,用户需要自己解码。label是文本类型(比如:“1“,”2“),这里用户需要的其实是整形,用户需要自己转换成整形。
把`reader_creator`生成的`reader`传入`convert`函数即可完成转换:
```python
convert("./", reader_creator(), 100, random_images)
```
### 实现reader
以上命令会在当前目录下生成100个文件:
```text
random_images-00000-of-00099
random_images-00001-of-00099
...
random_images-00099-of-00099
```
reader的实现需要考虑本地训练程序实现之后,可以不修改程序直接提交集群进行分布式训练。要达到这样的目标,需要实现下面的功能:
#### 进行训练
paddle会封装一个在集群中使用的reader: `paddle.dist.reader()`。在集群训练时需要使用这个reader指定要使用的数据集开始训练。用户的训练程序需要按照如下方式初始化reader
PaddlePaddle提供专用的[data reader creator](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/reader/README.md#python-data-reader-design-doc),生成给定SSTable文件对应的data reader。**无论在本地还是在云端,reader的使用方式都是一致的**
```python
if os.getenv("PADDLE_TRAIN_LOCAL"):
reader = my_local_reader("dataset-name")
else:
reader = paddle.dist.reader("dataset-name")
# ...
reader = paddle.reader.creator.SSTable("/home/random_images-*-of-*")
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
trainer.train(batch_reader, ...)
```
用户训练程序提交到集群之后,集群会自动设置`PADDLE_TRAIN_LOCAL`环境变量,reader会被配置成集群训练的版本。其中`paddle.dist.reader()`需要从master的队列中获得需要开始执行的训练task,并找到对应的训练数据文件,开始训练任务。如果用户的训练数据源来自于其他服务,比如从集群中的Kafka,zeromq队列读取,也可以根据实际情况实现集群中运行的reader程序
以上代码的reader输出的data instance与生成数据集时,reader输出的data instance是一模一样的
### 上传训练文件
使用下面命令,可以把本地的数据上传到存储集群中。
```bash
paddle cp filenames pfs://home/folder/
```
比如,把之前示例中转换完毕的random_images数据集上传到云端的`/home/`可以用以下指令:
```bash
paddle cp random_images-*-of-* pfs://home/
```
## TODO
### 支持将数据合并成内部的文件格式(key-value),方便sharding与顺序读取
### 支持用户自定义的数据预处理job
# 如何贡献代码
我们真诚地感谢您的贡献,欢迎通过 GitHub 的 fork 和 pull request 流程来提交代码。
## 代码要求
- 你的代码必须完全遵守 [doxygen](http://www.stack.nl/~dimitri/doxygen/) 的样式。
- 确保编译器选项 WITH\_STYLE\_CHECK 已打开,并且编译能通过代码样式检查。
- 代码注释请遵守 [Doxygen](http://www.stack.nl/~dimitri/doxygen/) 的样式。
- 确保编译器选项 `WITH_STYLE_CHECK` 已打开,并且编译能通过代码样式检查。
- 所有代码必须具有单元测试。
- 通过所有单元测试。
以下教程将指导您提交代码。
## [Fork](https://help.github.com/articles/fork-a-repo/)
跳转到[PaddlePaddle](https://github.com/PaddlePaddle/Paddle) GitHub首页,然后单击 `Fork` 按钮。
跳转到[PaddlePaddle](https://github.com/PaddlePaddle/Paddle) GitHub首页,然后单击 `Fork` 按钮,生成自己目录下的仓库,比如 <https://github.com/USERNAME/Paddle>
## 克隆(Clone)
Paddle 目前使用[git流分支模型](http://nvie.com/posts/a-successful-git-branching-model/)进行开发,测试,发行和维护。
**develop** 是主分支,其他用户分支是特征分支(feature branches)。
将远程仓库 clone 到本地:
```bash
➜ git clone https://github.com/USERNAME/Paddle
➜ cd Paddle
```
## 创建本地分支
Paddle 目前使用[Git流分支模型](http://nvie.com/posts/a-successful-git-branching-model/)进行开发,测试,发行和维护,具体请参考 [Paddle 分支规范](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/releasing_process.md#paddle-分支规范)。
一旦你创建了一个fork,你可以使用你最喜欢的 git 客户端克隆你的仓库(repo)或只是直接在命令行输入:
所有的 feature 和 bug fix 的开发工作都应该在一个新的分支上完成,一般从 `develop` 分支上创建新分支。
```shell
# 克隆 fork 到本地
git clone --branch develop https://github.com/USERNAME/Paddle.git
使用 `git checkout -b` 创建并切换到新分支。
```bash
➜ git checkout -b my-cool-stuff
```
如果你的仓库不包含 **develop** 分支,你只需自己创建它。
```shell
git clone https://github.com/USERNAME/Paddle.git Paddle
cd Paddle
git checkout -b develop # 创建 develop 分支
git remote add upstream https://github.com/PaddlePaddle/Paddle.git # 添加 upstream 到 baidu/Paddle
git pull upstream develop # 更新 upstream
值得注意的是,在 checkout 之前,需要保持当前分支目录 clean,否则会把 untracked 的文件也带到新分支上,这可以通过 `git status` 查看。
## 使用 `pre-commit` 钩子
Paddle 开发人员使用 [pre-commit](http://pre-commit.com/) 工具来管理 Git 预提交钩子。 它可以帮助我们格式化源代码(C++,Python),在提交(commit)前自动检查一些基本事宜(如每个文件只有一个 EOL,Git 中不要添加大文件等)。
`pre-commit`测试是 Travis-CI 中单元测试的一部分,不满足钩子的 PR 不能被提交到 Paddle,首先安装并在当前目录运行它:
```bash
➜ pip install pre-commit
➜ pre-commit install
```
然后你可以通过做一个本地开发分支开始开发
Paddle 使用 `clang-format` 来调整 C/C++ 源代码格式,请确保 `clang-format` 版本在 3.8 以上。
```shell
git checkout -b MY_COOL_STUFF_BRANCH
## 开始开发
在本例中,我删除了 README.md 中的一行,并创建了一个新文件。
通过 `git status` 查看当前状态,这会提示当前目录的一些变化,同时也可以通过 `git diff` 查看文件具体被修改的内容。
```bash
➜ git status
On branch test
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: README.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
no changes added to commit (use "git add" and/or "git commit -a")
```
## 使用 `pre-commit` 钩子
## 构建和测试
编译 PaddlePaddle 的源码以及生成文档需要多种开发工具。为了方便大家,我们的标准开发流程是把这些工具都装进一个Docker image,称为*开发镜像*,通常名字是 `paddle:dev`。然后所有用 `cmake && make` 的地方(比如IDE配置里)都用 `docker run paddle:dev`来代替。
Paddle 开发人员使用 [pre-commit](http://pre-commit.com/) 工具来管理git预提交钩子。 它可以帮助我们格式化源代码(cpp,python),在提交前检查一些基本事宜(每个文件只有一个 EOL
,git 中不要添加大文件)。 `pre-commit`测试是 Travis-CI 中单元测试的一部分,不满足钩子
的 PR 不能提交代码到 Paddle。
如要build这个开发镜像,在源码目录树的根目录中运行:
你可以通过 `pip install pre-commit` 安装 [pre-commit](http://pre-commit.com/),
目前 Paddle 使用 `clang-format` 来调整C/C++源代码格式。请确保 clang-format 版本在3.8以上。
```bash
➜ docker build -t paddle:dev .
```
然后只需在 Paddle clone 目录中运行 `pre-commit install` 。当你
提交你的代码时,pre-commit 钩子会检查本地代码是否存在
不适合提交的东西,等等。
随后可以用这个开发镜像开build PaddlePaddle的源码。比如如果要build一个不依赖GPU,但是支持AVX指令集,并且包括unit tests的PaddlePaddle,可以:
## 提交(Commit)
```bash
➜ docker run -v $(pwd):/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=ON" -e "WITH_TEST=ON" paddle:dev
```
提交你的代码
这个过程除了编译PaddlePaddle为 `./build/libpaddle.so`,并且输出一个 `./build/paddle.deb`文件之外,还会输出一个 `build/Dockerfile`。我们只需要运行下面命令把编译好的PaddlePaddle打包成一个*生产镜像*(`paddle:prod`)
```shell
# 显示工作树状态
git status
# 添加修改过的文件
git add xx
env EDITOR=vim git commit # 你可以用 vim/nano/emacs 写下你的注释
```bash
➜ docker build -t paddle:prod -f build/Dockerfile .
```
提交信息的第一行是标题,其他行可以添加一些细节(如果有必要的话)。
## 保持 Fork 状态最新
如果要运行所有的单元测试,可以用如下命令:
在拉(pull)你的请求(request)之前,你应该从最新的 PaddlePaddle 同步代码。
为此,你需要首先添加远程(remote):
```bash
➜ docker run -it -v $(pwd):/paddle paddle:dev bash -c "cd /paddle/build && ctest"
```
```shell
# 观察当前远程仓库配置
git remote -v
# 添加上游(upstream)仓库
git remote add upstream https://github.com/PaddlePaddle/Paddle.git
# 验证新的 upstream
git remote -v
关于构建和测试的更多信息,请参见[这篇文档](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/getstarted/build_and_install/docker_install_cn.rst)。
## 提交(commit)
接下来我们取消对 README.md 文件的改变,然后提交新添加的 test 文件。
```bash
➜ git checkout -- README.md
➜ git status
On branch test
Untracked files:
(use "git add <file>..." to include in what will be committed)
test
nothing added to commit but untracked files present (use "git add" to track)
➜ git add test
```
Git 每次提交代码,都需要写提交说明,这可以让其他人知道这次提交做了哪些改变,这可以通过`git commit` 完成。
```bash
➜ git commit
CRLF end-lines remover...............................(no files to check)Skipped
yapf.................................................(no files to check)Skipped
Check for added large files..............................................Passed
Check for merge conflicts................................................Passed
Check for broken symlinks................................................Passed
Detect Private Key...................................(no files to check)Skipped
Fix End of Files.....................................(no files to check)Skipped
clang-formater.......................................(no files to check)Skipped
[my-cool-stuff c703c041] add test file
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 233
```
## 保持本地仓库最新
在准备发起 Pull Request 之前,需要同步原仓库(<https://github.com/PaddlePaddle/Paddle>)最新的代码。
首先通过 `git remote` 查看当前远程仓库的名字。
```bash
➜ git remote
origin
➜ git remote -v
origin https://github.com/USERNAME/Paddle (fetch)
origin https://github.com/USERNAME/Paddle (push)
```
用最新的 upstream 更新你的 fork:
这里 origin 是我们 clone 的远程仓库的名字,也就是自己用户名下的 Paddle,接下来我们创建一个原始 Paddle 仓库的远程主机,命名为 upstream。
```shell
git pull --rebase upstream develop
```bash
➜ git remote add upstream https://github.com/PaddlePaddle/Paddle
➜ git remote
origin
upstream
```
如果本地没有提交,git 将简单地执行快进。但是,如果你一直在做一些改变(绝大多数情况下不应该),你可能要处理冲突。
现在,你的本地主分支与上游修改的一致并是最新的
获取 upstream 的最新代码并更新当前分支
## 推送(Push)到 GitHub
```bash
➜ git fetch upstream
➜ git pull upstream develop
```
## Push 到远程仓库
将本地的修改推送到 GitHub 上,也就是 https://github.com/USERNAME/Paddle。
```shell
# 在 GitHub 上 push 你的仓库
git push -u origin MY_COOL_STUFF_BRANCH # 创建远程分支 MY_COOL_STUFF_BRANCH 到 origin.
```bash
# 推送到远程仓库 origin 的 my-cool-stuff 分支上
➜ git push origin my-cool-stuff
```
## 拉取请求(Pull Request)
## 建立 Issue 并完成 Pull Request
建立一个 Issue 描述问题,并记录它的编号。
切换到所建分支,然后点击 `New pull request`。
<img width="295" alt="screen shot 2017-04-26 at 9 09 28 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436054/a6d98c66-2ac4-11e7-9cb1-18dd13150230.png">
转到 GitHub上 你 fork 的页面,选择你的开发分支并单击 **pull request 按钮**。
选择目标分支:
## 使用最新版本更新你的 pull 请求
<img width="750" alt="screen shot 2017-04-26 at 9 11 52 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436139/f83b1e6c-2ac4-11e7-8c0e-add499023c46.png">
代码审查(code review)期间,由于 baidu/Paddle 中新的提交导致你的 pull 请求可能会失效。如果没有冲突,GitHub允许自动更新。 你可以点击 pull request 页面中的“更新分支(Update Branch)”按钮。 但是如果存在代码冲突,你需要手动进行更新。你需要在本地仓库执行如下命令:
PR 的描述说明中,填写 `resolve #Issue编号` 可以在这个 PR 被 merge 后,自动关闭对应的 Issue,具体请见 <https://help.github.com/articles/closing-issues-via-commit-messages/>。
```shell
git checkout MY_COOL_STUFF_BRANCH
git pull upstream develop
# 你可能需要根据git提示解决冲突
# 创建并测试你的代码
git push origin MY_COOL_STUFF_BRANCH
接下来等待 review,如果有需要修改的地方,参照上述步骤更新 origin 中的对应分支即可。
## 删除远程分支
在 PR 被 merge 进主仓库后,我们可以在 PR 的页面删除远程仓库的分支。
<img width="775" alt="screen shot 2017-04-26 at 9 18 24 pm" src="https://cloud.githubusercontent.com/assets/11692045/25436457/e4cdd472-2ac5-11e7-9272-badc76c4a23e.png">
也可以使用 `git push origin :分支名` 删除远程分支,如:
```bash
➜ git push origin :my-cool-stuff
```
现在你的 Pull Request 是最新的了。
## 修改你的 pull request
## 删除本地分支
当根据审阅者的意见修改 pull 请求时,请使用“git commit”而不是“git commit --amend”来提交更改,以便审阅者可以看到新的请求和旧的请求之间的区别
最后,删除本地分支
可能的命令是
```bash
# 切换到 develop 分支
➜ git checkout develop
```shell
git checkout MY_COOL_STUFF_BRANCH
git pull upstream develop # 将本地更新到最新的代码库
# 可能会发生一些冲突
# 开始开发吧!
env EDITOR=vim git commit # 添加修改日志
git push origin MY_COOL_STUFF_BRANCH
# 删除 my-cool-stuff 分支
➜ git branch -D my-cool-stuff
```
至此,我们就完成了一次代码贡献的过程。
......@@ -1509,9 +1509,15 @@ to maintain tractability.</p>
<span class="n">simple_rnn</span> <span class="o">+=</span> <span class="n">last_time_step_output</span>
<span class="k">return</span> <span class="n">simple_rnn</span>
<span class="n">generated_word_embedding</span> <span class="o">=</span> <span class="n">GeneratedInput</span><span class="p">(</span>
<span class="n">size</span><span class="o">=</span><span class="n">target_dictionary_dim</span><span class="p">,</span>
<span class="n">embedding_name</span><span class="o">=</span><span class="s2">&quot;target_language_embedding&quot;</span><span class="p">,</span>
<span class="n">embedding_size</span><span class="o">=</span><span class="n">word_vector_dim</span><span class="p">)</span>
<span class="n">beam_gen</span> <span class="o">=</span> <span class="n">beam_search</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s2">&quot;decoder&quot;</span><span class="p">,</span>
<span class="n">step</span><span class="o">=</span><span class="n">rnn_step</span><span class="p">,</span>
<span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">StaticInput</span><span class="p">(</span><span class="n">encoder_last</span><span class="p">)],</span>
<span class="nb">input</span><span class="o">=</span><span class="p">[</span><span class="n">StaticInput</span><span class="p">(</span><span class="n">encoder_last</span><span class="p">),</span>
<span class="n">generated_word_embedding</span><span class="p">],</span>
<span class="n">bos_id</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
<span class="n">eos_id</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span>
<span class="n">beam_size</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
......@@ -1533,7 +1539,8 @@ sharing a same set of weights.</p>
<p>You can refer to the first parameter of recurrent_group, or
demo/seqToseq/seqToseq_net.py for more details.</p>
</li>
<li><strong>input</strong> (<em>list</em>) &#8211; Input data for the recurrent unit</li>
<li><strong>input</strong> (<em>list</em>) &#8211; Input data for the recurrent unit, which should include the
previously generated words as a GeneratedInput object.</li>
<li><strong>bos_id</strong> (<em>int</em>) &#8211; Index of the start symbol in the dictionary. The start symbol
is a special token for NLP task, which indicates the
beginning of a sequence. In the generation task, the start
......
此差异已折叠。
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 4d7a146cda87e1e0222ce8a24b0ea6b4
tags: 645f666f9bcd5a90fca523b33c5a78b7
ABOUT
=======
PaddlPaddle is an easy-to-use, efficient, flexible and scalable deep learning platform,
which is originally developed by Baidu scientists and engineers for the purpose of applying deep learning to many products at Baidu.
PaddlePaddle is now open source but far from complete, which is intended to be built upon, improved, scaled, and extended.
We hope to build an active open source community both by providing feedback and by actively contributing to the source code.
Credits
--------
We owe many thanks to `all contributors and developers <https://github.com/PaddlePaddle/Paddle/graphs/contributors>`_ of PaddlePaddle!
API
===
.. toctree::
:maxdepth: 1
v2/model_configs.rst
v2/data.rst
v2/run_logic.rst
Introduction
==============
DataProvider is a module that loads training or testing data into cpu or gpu
memory for the following triaining or testing process.
For simple use, users can use Python :code:`PyDataProvider` to dynamically reads
the original data in any format or in any form, and then transfer them into a
data format PaddlePaddle requires. The process is extremly flexible and highly
customized, with sacrificing the efficiency only a little. This is extremly
useful when you have to dynamically generate certain kinds of data according to,
for example, the training performance.
Besides, users also can customize a C++ :code:`DataProvider` for a more
complex usage, or for a higher efficiency.
The following parameters are required to define in the PaddlePaddle network
configuration file (trainer_config.py): which DataProvider is chosen to used,
and specific parameters for DataProvider, including training file list
(train.list) and testing file list (test.list).
Train.list and test.list are simply two plain text files, which defines path
of training or testing data. It is recommended that directly placing them into
the training directory, and reference to them by using a relative path (
relative to the PaddePaddle program).
Testing or evaluating will not be performed during training if the test.list is
not set or set to None. Otherwise, PaddlePaddle will evaluate the trained model
by the specified tesing data while training, every testing period (a user
defined command line parameter in PaddlePaddle) to prevent over-fitting.
Each line of train.list and test.list is an absolute or relative path (relative
to the PaddePaddle program runtime) of data file. Fascinatingly more, each line
can also be a HDFS file path or a SQL connection string. As long as the user
assures how to access each file in DataProvider.
.. _api_pydataprovider2:
PyDataProvider2
===============
We highly recommand users to use PyDataProvider2 to provide training or testing
data to PaddlePaddle. The user only needs to focus on how to read a single
sample from the original data file by using PyDataProvider2, leaving all of the
trivial work, including, transfering data into cpu/gpu memory, shuffle, binary
serialization to PyDataProvider2. PyDataProvider2 uses multithreading and a
fanscinating but simple cache strategy to optimize the efficiency of the data
providing process.
DataProvider for the non-sequential model
-----------------------------------------
Here we use the MNIST handwriting recognition data as an example to illustrate
how to write a simple PyDataProvider.
MNIST is a handwriting classification data set. It contains 70,000 digital
grayscale images. Labels of the training sample range from 0 to 9. All the
images have been size-normalized and centered into images with the same size
of 28 x 28 pixels.
A small part of the original data as an example is shown as below:
.. literalinclude:: src/mnist_train.txt
Each line of the data contains two parts, separated by :code:`;`. The first part is
label of an image. The second part contains 28x28 pixel float values.
Just write path of the above data into train.list. It looks like this:
.. literalinclude:: src/train.list
The corresponding dataprovider is shown as below:
.. literalinclude:: src/mnist_provider.dict.py
The first line imports PyDataProvider2 package.
The main function is the process function, that has two parameters.
The first parameter is the settings, which is not used in this example.
The second parameter is the filename, that is exactly each line of train.list.
This parameter is passed to the process function by PaddlePaddle.
:code:`@provider` is a Python
`Decorator <http://www.learnpython.org/en/Decorators>`_ .
It sets some properties to DataProvider, and constructs a real PaddlePaddle
DataProvider from a very simple user implemented python function. It does not
matter if you are not familiar with `Decorator`_. You can keep it simple by
just taking :code:`@provider` as a fixed mark above the provider function you
implemented.
`input_types`_ defines the data format that a DataProvider returns.
In this example, it is set to a 28x28-dimensional dense vector and an integer
scalar, whose value ranges from 0 to 9.
`input_types`_ can be set to several kinds of input formats, please refer to the
document of `input_types`_ for more details.
The process method is the core part to construct a real DataProvider in
PaddlePaddle. It implements how to open the text file, how to read one sample
from the original text file, convert them into `input_types`_, and give them
back to PaddlePaddle process at line 23.
Note that data yielded by the process function must follow the same order that
`input_types`_ are defined.
With the help of PyDataProvider2, user can focus on how to generate ONE traning
sample by using keywords :code:`yield`.
:code:`yield` is a python keyword, and a concept related to it includes
:code:`generator`.
Only a few lines of codes need to be added into the training configuration file,
you can take this as an example.
.. literalinclude:: src/mnist_config.py
Here we specify training data by :code:`train.list`, and no testing data is specified.
The method which actually provide data is :code:`process`.
User also can use another style to provide data, which defines the
:code:`data_layer`'s name explicitly when `yield`. For example,
the :code:`dataprovider` is shown as below.
.. literalinclude:: src/mnist_provider.dict.py
:linenos:
If user did't give the :code:`data_layer`'s name, PaddlePaddle will use
the order of :code:`data_layer` definition roughly to determine which feature to
which :code:`data_layer`. This order may be not correct, so TO DEFINE THE
:code:`data_layer`'s NAMES EXPLICITLY IS THE RECOMMANDED WAY TO PROVIDER DATA.
Now, this simple example of using PyDataProvider is finished.
The only thing that the user should know is how to generte **one sample** from
**one data file**.
And PaddlePadle will do all of the rest things\:
* Form a training batch
* Shuffle the training data
* Read data with multithreading
* Cache the training data (Optional)
* CPU-> GPU double buffering.
Is this cool?
.. _api_pydataprovider2_sequential_model:
DataProvider for the sequential model
-------------------------------------
A sequence model takes sequences as its input. A sequence is made up of several
timesteps. The so-called timestep, is not necessary to have something to do
with time. It can also be explained to that the order of data are taken into
consideration into model design and training.
For example, the sentence can be interpreted as a kind of sequence data in NLP
tasks.
Here is an example on data proivider for English sentiment classification data.
The original input data are simple English text, labeled into positive or
negative sentiment (marked by 0 and 1 respectively).
A small part of the original data as an example can be found in the path below:
.. literalinclude:: src/sentimental_train.txt
The corresponding data provider can be found in the path below:
.. literalinclude:: src/sentimental_provider.py
This data provider for sequential model is a little more complex than that
for MINST dataset.
A new initialization method is introduced here.
The method :code:`on_init` is configured to DataProvider by :code:`@provider`'s
:code:`init_hook` parameter, and it will be invoked once DataProvider is
initialized. The :code:`on_init` function has the following parameters:
* The first parameter is the settings object.
* The rest parameters are passed by key word arguments. Some of them are passed
by PaddlePaddle, see reference for `init_hook`_.
The :code:`dictionary` object is a python dict object passed from the trainer
configuration file, and it maps word string to word id.
To pass these parameters into DataProvider, the following lines should be added
into trainer configuration file.
.. literalinclude:: src/sentimental_config.py
The definition is basically same as MNIST example, except:
* Load dictionary in this configuration
* Pass it as a parameter to the DataProvider
The `input_types` is configured in method :code:`on_init`. It has the same
effect to configure them by :code:`@provider`'s :code:`input_types` parameter.
However, the :code:`input_types` is set at runtime, so we can set it to
different types according to the input data. Input of the neural network is a
sequence of word id, so set :code:`seq_type` to :code:`integer_value_sequence`.
Durning :code:`on_init`, we save :code:`dictionary` variable to
:code:`settings`, and it will be used in :code:`process`. Note the settings
parameter for the process function and for the on_init's function are a same
object.
The basic processing logic is the same as MNIST's :code:`process` method. Each
sample in the data file is given back to PaddlePaddle process.
Thus, the basic usage of PyDataProvider is here.
Please refer to the following section reference for details.
Reference
---------
@provider
+++++++++
.. autofunction:: paddle.trainer.PyDataProvider2.provider
input_types
+++++++++++
PaddlePaddle has four data types, and three sequence types.
The four data types are:
* :code:`dense_vector`: dense float vector.
* :code:`sparse_binary_vector`: sparse binary vector, most of the value is 0, and
the non zero elements are fixed to 1.
* :code:`sparse_float_vector`: sparse float vector, most of the value is 0, and some
non zero elements can be any float value. They are given by the user.
* :code:`integer`: an integer scalar, that is especially used for label or word index.
The three sequence types are:
* :code:`SequenceType.NO_SEQUENCE` means the sample is not a sequence.
* :code:`SequenceType.SEQUENCE` means the sample is a sequence.
* :code:`SequenceType.SUB_SEQUENCE` means it is a nested sequence, that each timestep of
the input sequence is also a sequence.
Different input type has a defferenct input format. Their formats are shown
in the above table.
+----------------------+---------------------+-----------------------------------+------------------------------------------------+
| | NO_SEQUENCE | SEQUENCE | SUB_SEQUENCE |
+======================+=====================+===================================+================================================+
| dense_vector | [f, f, ...] | [[f, ...], [f, ...], ...] | [[[f, ...], ...], [[f, ...], ...],...] |
+----------------------+---------------------+-----------------------------------+------------------------------------------------+
| sparse_binary_vector | [i, i, ...] | [[i, ...], [i, ...], ...] | [[[i, ...], ...], [[i, ...], ...],...] |
+----------------------+---------------------+-----------------------------------+------------------------------------------------+
| sparse_float_vector | [(i,f), (i,f), ...] | [[(i,f), ...], [(i,f), ...], ...] | [[[(i,f), ...], ...], [[(i,f), ...], ...],...] |
+----------------------+---------------------+-----------------------------------+------------------------------------------------+
| integer_value | i | [i, i, ...] | [[i, ...], [i, ...], ...] |
+----------------------+---------------------+-----------------------------------+------------------------------------------------+
where f represents a float value, i represents an integer value.
init_hook
+++++++++
init_hook is a function that is invoked once the data provoder is initialized.
Its parameters lists as follows:
* The first parameter is a settings object, which is the same to :code:`settings`
in :code:`process` method. The object contains several attributes, including:
* :code:`settings.input_types`: the input types. Reference `input_types`_.
* :code:`settings.logger`: a logging object.
* The rest parameters are the key word arguments. It is made up of PaddpePaddle
pre-defined parameters and user defined parameters.
* PaddlePaddle-defined parameters including:
* :code:`is_train` is a bool parameter that indicates the DataProvider is used in
training or testing.
* :code:`file_list` is the list of all files.
* User-defined parameters args can be set in training configuration.
Note, PaddlePaddle reserves the right to add pre-defined parameter, so please
use :code:`**kwargs` in init_hook to ensure compatibility by accepting the
parameters which your init_hook does not use.
cache
+++++
DataProvider provides two simple cache strategy. They are:
* :code:`CacheType.NO_CACHE` means do not cache any data, then data is read at runtime by
the user implemented python module every pass.
* :code:`CacheType.CACHE_PASS_IN_MEM` means the first pass reads data by the user
implemented python module, and the rest passes will directly read data from
memory.
API
===
DataProvider API
----------------
.. toctree::
:maxdepth: 1
data_provider/dataprovider_en.rst
data_provider/pydataprovider2_en.rst
.. _api_trainer_config:
Model Config API
----------------
.. toctree::
:maxdepth: 1
trainer_config_helpers/optimizers.rst
trainer_config_helpers/data_sources.rst
trainer_config_helpers/layers.rst
trainer_config_helpers/activations.rst
trainer_config_helpers/poolings.rst
trainer_config_helpers/networks.rst
trainer_config_helpers/evaluators.rst
trainer_config_helpers/attrs.rst
Applications API
----------------
.. toctree::
:maxdepth: 1
predict/swig_py_paddle_en.rst
Python Prediction
==================
PaddlePaddle offers a set of clean prediction interfaces for python with the help of
SWIG. The main steps of predict values in python are:
* Parse training configurations
* Construct GradientMachine
* Prepare data
* Predict
Here is a sample python script that shows the typical prediction process for the
MNIST classification problem. A complete sample code could be found at
:code:`src_root/doc/ui/predict/predict_sample.py`.
.. literalinclude:: src/predict_sample.py
:language: python
:lines: 15-18,90-100,101-104
The module that does the most of the job is py_paddle.swig_paddle, it's
generated by SWIG and has complete documents, for more details you can use
python's :code:`help()` function. Let's walk through the above python script:
* At the beginning, use :code:`swig_paddle.initPaddle()` to initialize
PaddlePaddle with command line arguments, for more about command line arguments
see :ref:`cmd_detail_introduction` .
* Parse the configuration file that is used in training with :code:`parse_config()`.
Because data to predict with always have no label, and output of prediction work
normally is the output layer rather than the cost layer, so you should modify
the configuration file accordingly before using it in the prediction work.
* Create a neural network with
:code:`swig_paddle.GradientMachine.createFromConfigproto()`, which takes the
parsed configuration :code:`conf.model_config` as argument. Then load the
trained parameters from the model with :code:`network.loadParameters()`.
* Create a data converter object of utility class :code:`DataProviderConverter`.
- Note: As swig_paddle can only accept C++ matrices, we offer a utility
class DataProviderConverter that can accept the same input data with
PyDataProvider2, for more information please refer to document
of :ref:`api_pydataprovider2` .
* Do the prediction with :code:`forwardTest()`, which takes the converted
input data and outputs the activations of the output layer.
Here is a typical output:
.. code-block:: text
[{'id': None, 'value': array([[ 5.53018653e-09, 1.12194102e-05, 1.96644767e-09,
1.43630644e-02, 1.51111044e-13, 9.85625684e-01,
2.08823112e-10, 2.32777140e-08, 2.00186201e-09,
1.15501715e-08],
[ 9.99982715e-01, 1.27787406e-10, 1.72296313e-05,
1.49316648e-09, 1.36540484e-11, 6.93137714e-10,
2.70634608e-08, 3.48565123e-08, 5.25639710e-09,
4.48684503e-08]], dtype=float32)}]
:code:`value` is the output of the output layer, each row represents result of
the corresponding row in the input data, each element represents activation of
the corresponding neuron in the output layer.
===========
Activations
===========
BaseActivation
==============
.. automodule:: paddle.trainer_config_helpers.activations
:members: BaseActivation
:noindex:
AbsActivation
===============
.. automodule:: paddle.trainer_config_helpers.activations
:members: AbsActivation
:noindex:
ExpActivation
===============
.. automodule:: paddle.trainer_config_helpers.activations
:members: ExpActivation
:noindex:
IdentityActivation
==================
.. automodule:: paddle.trainer_config_helpers.activations
:members: IdentityActivation
:noindex:
LinearActivation
==================
.. automodule:: paddle.trainer_config_helpers.activations
:members: LinearActivation
:noindex:
LogActivation
==================
.. automodule:: paddle.trainer_config_helpers.activations
:members: LogActivation
:noindex:
SquareActivation
================
.. automodule:: paddle.trainer_config_helpers.activations
:members: SquareActivation
:noindex:
SigmoidActivation
=================
.. automodule:: paddle.trainer_config_helpers.activations
:members: SigmoidActivation
:noindex:
SoftmaxActivation
=================
.. automodule:: paddle.trainer_config_helpers.activations
:members: SoftmaxActivation
:noindex:
SequenceSoftmaxActivation
=========================
.. automodule:: paddle.trainer_config_helpers.activations
:members: SequenceSoftmaxActivation
:noindex:
ReluActivation
==============
.. automodule:: paddle.trainer_config_helpers.activations
:members: ReluActivation
:noindex:
BReluActivation
===============
.. automodule:: paddle.trainer_config_helpers.activations
:members: BReluActivation
:noindex:
SoftReluActivation
==================
.. automodule:: paddle.trainer_config_helpers.activations
:members: SoftReluActivation
:noindex:
TanhActivation
==============
.. automodule:: paddle.trainer_config_helpers.activations
:members: TanhActivation
:noindex:
STanhActivation
===============
.. automodule:: paddle.trainer_config_helpers.activations
:members: STanhActivation
:noindex:
Parameter Attributes
=======================
.. automodule:: paddle.trainer_config_helpers.attrs
:members:
.. _api_trainer_config_helpers_data_sources:
DataSources
===========
.. automodule:: paddle.trainer_config_helpers.data_sources
:members:
.. _api_trainer_config_helpers_evaluators:
==========
Evaluators
==========
Base
====
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: evaluator_base
:noindex:
Classification
==============
classification_error_evaluator
------------------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: classification_error_evaluator
:noindex:
auc_evaluator
-------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: auc_evaluator
:noindex:
ctc_error_evaluator
-------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: ctc_error_evaluator
:noindex:
chunk_evaluator
---------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: chunk_evaluator
:noindex:
precision_recall_evaluator
--------------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: precision_recall_evaluator
:noindex:
Rank
====
pnpair_evaluator
----------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: pnpair_evaluator
:noindex:
Utils
=====
sum_evaluator
-------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: sum_evaluator
:noindex:
column_sum_evaluator
--------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: column_sum_evaluator
:noindex:
Print
=====
classification_error_printer_evaluator
--------------------------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: classification_error_printer_evaluator
:noindex:
gradient_printer_evaluator
--------------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: gradient_printer_evaluator
:noindex:
maxid_printer_evaluator
-----------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: maxid_printer_evaluator
:noindex:
maxframe_printer_evaluator
---------------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: maxframe_printer_evaluator
:noindex:
seqtext_printer_evaluator
-------------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: seqtext_printer_evaluator
:noindex:
value_printer_evaluator
-----------------------
.. automodule:: paddle.trainer_config_helpers.evaluators
:members: value_printer_evaluator
:noindex:
.. _api_trainer_config_helpers_layers:
======
Layers
======
Base
======
LayerType
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: LayerType
:noindex:
LayerOutput
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: LayerOutput
:noindex:
Data layer
===========
.. _api_trainer_config_helpers_layers_data_layer:
data_layer
----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: data_layer
:noindex:
Fully Connected Layers
======================
.. _api_trainer_config_helpers_layers_fc_layer:
fc_layer
--------
.. automodule:: paddle.trainer_config_helpers.layers
:members: fc_layer
:noindex:
selective_fc_layer
------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: selective_fc_layer
:noindex:
Conv Layers
===========
conv_operator
-------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: conv_operator
:noindex:
conv_projection
---------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: conv_projection
:noindex:
conv_shift_layer
------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: conv_shift_layer
:noindex:
img_conv_layer
--------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: img_conv_layer
:noindex:
.. _api_trainer_config_helpers_layers_context_projection:
context_projection
------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: context_projection
:noindex:
Image Pooling Layer
===================
img_pool_layer
--------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: img_pool_layer
:noindex:
spp_layer
--------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: spp_layer
:noindex:
maxout_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: maxout_layer
:noindex:
Norm Layer
==========
img_cmrnorm_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: img_cmrnorm_layer
:noindex:
batch_norm_layer
---------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: batch_norm_layer
:noindex:
sum_to_one_norm_layer
---------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: sum_to_one_norm_layer
:noindex:
Recurrent Layers
================
recurrent_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: recurrent_layer
:noindex:
lstmemory
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: lstmemory
:noindex:
grumemory
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: grumemory
:noindex:
Recurrent Layer Group
=====================
memory
------
.. automodule:: paddle.trainer_config_helpers.layers
:members: memory
:noindex:
recurrent_group
---------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: recurrent_group
:noindex:
lstm_step_layer
---------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: lstm_step_layer
:noindex:
gru_step_layer
---------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: gru_step_layer
:noindex:
beam_search
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: beam_search
:noindex:
get_output_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: get_output_layer
:noindex:
Mixed Layer
===========
.. _api_trainer_config_helpers_layers_mixed_layer:
mixed_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: mixed_layer
:noindex:
.. _api_trainer_config_helpers_layers_embedding_layer:
embedding_layer
---------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: embedding_layer
:noindex:
scaling_projection
------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: scaling_projection
:noindex:
dotmul_projection
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: dotmul_projection
:noindex:
dotmul_operator
---------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: dotmul_operator
:noindex:
full_matrix_projection
----------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: full_matrix_projection
:noindex:
identity_projection
-------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: identity_projection
:noindex:
table_projection
----------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: table_projection
:noindex:
trans_full_matrix_projection
----------------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: trans_full_matrix_projection
:noindex:
Aggregate Layers
================
.. _api_trainer_config_helpers_layers_pooling_layer:
pooling_layer
-------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: pooling_layer
:noindex:
.. _api_trainer_config_helpers_layers_last_seq:
last_seq
--------
.. automodule:: paddle.trainer_config_helpers.layers
:members: last_seq
:noindex:
.. _api_trainer_config_helpers_layers_first_seq:
first_seq
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: first_seq
:noindex:
concat_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: concat_layer
:noindex:
seq_concat_layer
----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: seq_concat_layer
:noindex:
Reshaping Layers
================
block_expand_layer
------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: block_expand_layer
:noindex:
.. _api_trainer_config_helpers_layers_expand_layer:
expand_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: expand_layer
:noindex:
repeat_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: repeat_layer
:noindex:
rotate_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: rotate_layer
:noindex:
seq_reshape_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: seq_reshape_layer
:noindex:
Math Layers
===========
addto_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: addto_layer
:noindex:
linear_comb_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: linear_comb_layer
:noindex:
interpolation_layer
-------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: interpolation_layer
:noindex:
bilinear_interp_layer
----------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: bilinear_interp_layer
:noindex:
power_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: power_layer
:noindex:
scaling_layer
-------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: scaling_layer
:noindex:
slope_intercept_layer
----------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: slope_intercept_layer
:noindex:
tensor_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: tensor_layer
:noindex:
.. _api_trainer_config_helpers_layers_cos_sim:
cos_sim
-------
.. automodule:: paddle.trainer_config_helpers.layers
:members: cos_sim
:noindex:
trans_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: trans_layer
:noindex:
Sampling Layers
===============
maxid_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: maxid_layer
:noindex:
sampling_id_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: sampling_id_layer
:noindex:
Slicing and Joining Layers
==========================
pad_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: pad_layer
:noindex:
.. _api_trainer_config_helpers_layers_cost_layers:
Cost Layers
===========
cross_entropy
-------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: cross_entropy
:noindex:
cross_entropy_with_selfnorm
---------------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: cross_entropy_with_selfnorm
:noindex:
multi_binary_label_cross_entropy
--------------------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: multi_binary_label_cross_entropy
:noindex:
mse_cost
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: mse_cost
:noindex:
huber_cost
----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: huber_cost
:noindex:
lambda_cost
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: lambda_cost
:noindex:
rank_cost
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: rank_cost
:noindex:
sum_cost
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: sum_cost
:noindex:
crf_layer
-----------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: crf_layer
:noindex:
crf_decoding_layer
-------------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: crf_decoding_layer
:noindex:
ctc_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: ctc_layer
:noindex:
warp_ctc_layer
--------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: warp_ctc_layer
:noindex:
nce_layer
-----------
.. automodule:: paddle.trainer_config_helpers.layers
:members: nce_layer
:noindex:
hsigmoid
---------
.. automodule:: paddle.trainer_config_helpers.layers
:members: hsigmoid
:noindex:
Check Layer
============
eos_layer
------------
.. automodule:: paddle.trainer_config_helpers.layers
:members: eos_layer
:noindex:
========
Networks
========
The networks module contains pieces of neural network that combine multiple layers.
NLP
===
sequence_conv_pool
------------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: sequence_conv_pool
:noindex:
.. _api_trainer_config_helpers_network_text_conv_pool:
text_conv_pool
--------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: text_conv_pool
:noindex:
Images
======
img_conv_bn_pool
----------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: img_conv_bn_pool
:noindex:
img_conv_group
--------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: img_conv_group
:noindex:
.. _api_trainer_config_helpers_network_simple_img_conv_pool:
simple_img_conv_pool
--------------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: simple_img_conv_pool
:noindex:
vgg_16_network
---------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: vgg_16_network
:noindex:
Recurrent
=========
LSTM
----
lstmemory_unit
``````````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: lstmemory_unit
:noindex:
lstmemory_group
```````````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: lstmemory_group
:noindex:
simple_lstm
```````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: simple_lstm
:noindex:
bidirectional_lstm
``````````````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: bidirectional_lstm
:noindex:
GRU
---
gru_unit
````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: gru_unit
:noindex:
gru_group
`````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: gru_group
:noindex:
simple_gru
``````````
.. automodule:: paddle.trainer_config_helpers.networks
:members: simple_gru
:noindex:
simple_attention
----------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: simple_attention
:noindex:
Miscs
=====
dropout_layer
--------------
.. automodule:: paddle.trainer_config_helpers.networks
:members: dropout_layer
:noindex:
outputs
-------
.. automodule:: paddle.trainer_config_helpers.networks
:members: outputs
:noindex:
.. _api_trainer_config_helpers_optimizers:
==========
Optimizers
==========
BaseSGDOptimizer
================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: BaseSGDOptimizer
:noindex:
MomentumOptimizer
=================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: MomentumOptimizer
:noindex:
AdamOptimizer
=============
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: AdamOptimizer
:noindex:
AdamaxOptimizer
================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: AdamaxOptimizer
:noindex:
AdaGradOptimizer
================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: AdaGradOptimizer
:noindex:
DecayedAdaGradOptimizer
=======================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: DecayedAdaGradOptimizer
:noindex:
AdaDeltaOptimizer
=================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: AdaDeltaOptimizer
:noindex:
RMSPropOptimizer
================
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: RMSPropOptimizer
:noindex:
.. _api_trainer_config_helpers_optimizers_settings:
settings
========
.. automodule:: paddle.trainer_config_helpers.optimizers
:members: settings
:noindex:
========
Poolings
========
BasePoolingType
===============
.. automodule:: paddle.trainer_config_helpers.poolings
:members: BasePoolingType
:noindex:
AvgPooling
==========
.. automodule:: paddle.trainer_config_helpers.poolings
:members: AvgPooling
:noindex:
MaxPooling
==========
.. automodule:: paddle.trainer_config_helpers.poolings
:members: MaxPooling
:noindex:
SumPooling
==========
.. automodule:: paddle.trainer_config_helpers.poolings
:members: SumPooling
:noindex:
SquareRootNPooling
==================
.. automodule:: paddle.trainer_config_helpers.poolings
:members: SquareRootNPooling
:noindex:
===========
Activation
===========
Abs
===
.. automodule:: paddle.v2.activation
:members: Abs
:noindex:
Exp
===
.. automodule:: paddle.v2.activation
:members: Exp
:noindex:
Identity
========
.. automodule:: paddle.v2.activation
:members: Identity
:noindex:
Linear
======
.. automodule:: paddle.v2.activation
:members: Linear
:noindex:
Log
===
.. automodule:: paddle.v2.activation
:members: Log
:noindex:
Square
======
.. automodule:: paddle.v2.activation
:members: Square
:noindex:
Sigmoid
=======
.. automodule:: paddle.v2.activation
:members: Sigmoid
:noindex:
Softmax
=======
.. automodule:: paddle.v2.activation
:members: Softmax
:noindex:
SequenceSoftmax
===============
.. automodule:: paddle.v2.activation
:members: SequenceSoftmax
:noindex:
Relu
====
.. automodule:: paddle.v2.activation
:members: Relu
:noindex:
BRelu
=====
.. automodule:: paddle.v2.activation
:members: BRelu
:noindex:
SoftRelu
========
.. automodule:: paddle.v2.activation
:members: SoftRelu
:noindex:
Tanh
====
.. automodule:: paddle.v2.activation
:members: Tanh
:noindex:
STanh
=====
.. automodule:: paddle.v2.activation
:members: STanh
:noindex:
Parameter Attribute
===================
.. automodule:: paddle.v2.attr
:members:
:noindex:
.. _api_v2.layer:
======
Layers
======
Data layer
===========
.. _api_v2.layer_data:
data
----
.. autoclass:: paddle.v2.layer.data
:noindex:
Fully Connected Layers
======================
.. _api_v2.layer_fc:
fc
--
.. autoclass:: paddle.v2.layer.fc
:noindex:
selective_fc
------------
.. autoclass:: paddle.v2.layer.selective_fc
:noindex:
Conv Layers
===========
conv_operator
-------------
.. autoclass:: paddle.v2.layer.conv_operator
:noindex:
conv_projection
---------------
.. autoclass:: paddle.v2.layer.conv_projection
:noindex:
conv_shift
----------
.. autoclass:: paddle.v2.layer.conv_shift
:noindex:
img_conv
--------
.. autoclass:: paddle.v2.layer.img_conv
:noindex:
.. _api_v2.layer_context_projection:
context_projection
------------------
.. autoclass:: paddle.v2.layer.context_projection
:noindex:
Image Pooling Layer
===================
img_pool
--------
.. autoclass:: paddle.v2.layer.img_pool
:noindex:
spp
---
.. autoclass:: paddle.v2.layer.spp
:noindex:
maxout
------
.. autoclass:: paddle.v2.layer.maxout
:noindex:
Norm Layer
==========
img_cmrnorm
-----------
.. autoclass:: paddle.v2.layer.img_cmrnorm
:noindex:
batch_norm
----------
.. autoclass:: paddle.v2.layer.batch_norm
:noindex:
sum_to_one_norm
---------------
.. autoclass:: paddle.v2.layer.sum_to_one_norm
:noindex:
cross_channel_norm
------------------
.. autoclass:: paddle.v2.layer.cross_channel_norm
:noindex:
Recurrent Layers
================
recurrent
---------
.. autoclass:: paddle.v2.layer.recurrent
:noindex:
lstmemory
---------
.. autoclass:: paddle.v2.layer.lstmemory
:noindex:
grumemory
---------
.. autoclass:: paddle.v2.layer.grumemory
:noindex:
Recurrent Layer Group
=====================
memory
------
.. autoclass:: paddle.v2.layer.memory
:noindex:
recurrent_group
---------------
.. autoclass:: paddle.v2.layer.recurrent_group
:noindex:
lstm_step
---------
.. autoclass:: paddle.v2.layer.lstm_step
:noindex:
gru_step
--------
.. autoclass:: paddle.v2.layer.gru_step
:noindex:
beam_search
------------
.. autoclass:: paddle.v2.layer.beam_search
:noindex:
get_output
----------
.. autoclass:: paddle.v2.layer.get_output
:noindex:
Mixed Layer
===========
.. _api_v2.layer_mixed:
mixed
-----
.. autoclass:: paddle.v2.layer.mixed
:noindex:
.. _api_v2.layer_embedding:
embedding
---------
.. autoclass:: paddle.v2.layer.embedding
:noindex:
scaling_projection
------------------
.. autoclass:: paddle.v2.layer.scaling_projection
:noindex:
dotmul_projection
-----------------
.. autoclass:: paddle.v2.layer.dotmul_projection
:noindex:
dotmul_operator
---------------
.. autoclass:: paddle.v2.layer.dotmul_operator
:noindex:
full_matrix_projection
----------------------
.. autoclass:: paddle.v2.layer.full_matrix_projection
:noindex:
identity_projection
-------------------
.. autoclass:: paddle.v2.layer.identity_projection
:noindex:
table_projection
----------------
.. autoclass:: paddle.v2.layer.table_projection
:noindex:
trans_full_matrix_projection
----------------------------
.. autoclass:: paddle.v2.layer.trans_full_matrix_projection
:noindex:
Aggregate Layers
================
.. _api_v2.layer_pooling:
pooling
-------
.. autoclass:: paddle.v2.layer.pooling
:noindex:
.. _api_v2.layer_last_seq:
last_seq
--------
.. autoclass:: paddle.v2.layer.last_seq
:noindex:
.. _api_v2.layer_first_seq:
first_seq
---------
.. autoclass:: paddle.v2.layer.first_seq
:noindex:
concat
------
.. autoclass:: paddle.v2.layer.concat
:noindex:
seq_concat
----------
.. autoclass:: paddle.v2.layer.seq_concat
:noindex:
Reshaping Layers
================
block_expand
------------
.. autoclass:: paddle.v2.layer.block_expand
:noindex:
.. _api_v2.layer_expand:
expand
------
.. autoclass:: paddle.v2.layer.expand
:noindex:
repeat
------
.. autoclass:: paddle.v2.layer.repeat
:noindex:
rotate
------
.. autoclass:: paddle.v2.layer.rotate
:noindex:
seq_reshape
-----------
.. autoclass:: paddle.v2.layer.seq_reshape
:noindex:
Math Layers
===========
addto
-----
.. autoclass:: paddle.v2.layer.addto
:noindex:
linear_comb
-----------
.. autoclass:: paddle.v2.layer.linear_comb
:noindex:
interpolation
-------------
.. autoclass:: paddle.v2.layer.interpolation
:noindex:
bilinear_interp
---------------
.. autoclass:: paddle.v2.layer.bilinear_interp
:noindex:
power
-----
.. autoclass:: paddle.v2.layer.power
:noindex:
scaling
-------
.. autoclass:: paddle.v2.layer.scaling
:noindex:
slope_intercept
---------------
.. autoclass:: paddle.v2.layer.slope_intercept
:noindex:
tensor
------
.. autoclass:: paddle.v2.layer.tensor
:noindex:
.. _api_v2.layer_cos_sim:
cos_sim
-------
.. autoclass:: paddle.v2.layer.cos_sim
:noindex:
trans
-----
.. autoclass:: paddle.v2.layer.trans
:noindex:
Sampling Layers
===============
maxid
-----
.. autoclass:: paddle.v2.layer.max_id
:noindex:
sampling_id
-----------
.. autoclass:: paddle.v2.layer.sampling_id
:noindex:
Slicing and Joining Layers
==========================
pad
----
.. autoclass:: paddle.v2.layer.pad
:noindex:
.. _api_v2.layer_costs:
Cost Layers
===========
cross_entropy_cost
------------------
.. autoclass:: paddle.v2.layer.cross_entropy_cost
:noindex:
cross_entropy_with_selfnorm_cost
--------------------------------
.. autoclass:: paddle.v2.layer.cross_entropy_with_selfnorm_cost
:noindex:
multi_binary_label_cross_entropy_cost
-------------------------------------
.. autoclass:: paddle.v2.layer.multi_binary_label_cross_entropy_cost
:noindex:
huber_cost
----------
.. autoclass:: paddle.v2.layer.huber_cost
:noindex:
lambda_cost
-----------
.. autoclass:: paddle.v2.layer.lambda_cost
:noindex:
mse_cost
--------
.. autoclass:: paddle.v2.layer.mse_cost
:noindex:
rank_cost
---------
.. autoclass:: paddle.v2.layer.rank_cost
:noindex:
sum_cost
---------
.. autoclass:: paddle.v2.layer.sum_cost
:noindex:
crf
---
.. autoclass:: paddle.v2.layer.crf
:noindex:
crf_decoding
------------
.. autoclass:: paddle.v2.layer.crf_decoding
:noindex:
ctc
---
.. autoclass:: paddle.v2.layer.ctc
:noindex:
warp_ctc
--------
.. autoclass:: paddle.v2.layer.warp_ctc
:noindex:
nce
---
.. autoclass:: paddle.v2.layer.nce
:noindex:
hsigmoid
---------
.. autoclass:: paddle.v2.layer.hsigmoid
:noindex:
Check Layer
============
eos
---
.. autoclass:: paddle.v2.layer.eos
:noindex:
========
Networks
========
The v2.networks module contains pieces of neural network that combine multiple layers.
NLP
===
sequence_conv_pool
------------------
.. automodule:: paddle.v2.networks
:members: sequence_conv_pool
:noindex:
.. _api_trainer_config_helpers_network_text_conv_pool:
text_conv_pool
--------------
.. automodule:: paddle.v2.networks
:members: text_conv_pool
:noindex:
Images
======
img_conv_bn_pool
----------------
.. automodule:: paddle.v2.networks
:members: img_conv_bn_pool
:noindex:
img_conv_group
--------------
.. automodule:: paddle.v2.networks
:members: img_conv_group
:noindex:
.. _api_trainer_config_helpers_network_simple_img_conv_pool:
simple_img_conv_pool
--------------------
.. automodule:: paddle.v2.networks
:members: simple_img_conv_pool
:noindex:
vgg_16_network
---------------
.. automodule:: paddle.v2.networks
:members: vgg_16_network
:noindex:
Recurrent
=========
LSTM
----
lstmemory_unit
``````````````
.. automodule:: paddle.v2.networks
:members: lstmemory_unit
:noindex:
lstmemory_group
```````````````
.. automodule:: paddle.v2.networks
:members: lstmemory_group
:noindex:
simple_lstm
```````````
.. automodule:: paddle.v2.networks
:members: simple_lstm
:noindex:
bidirectional_lstm
``````````````````
.. automodule:: paddle.v2.networks
:members: bidirectional_lstm
:noindex:
GRU
---
gru_unit
````````
.. automodule:: paddle.v2.networks
:members: gru_unit
:noindex:
gru_group
`````````
.. automodule:: paddle.v2.networks
:members: gru_group
:noindex:
simple_gru
``````````
.. automodule:: paddle.v2.networks
:members: simple_gru
:noindex:
simple_attention
----------------
.. automodule:: paddle.v2.networks
:members: simple_attention
:noindex:
Miscs
=====
dropout_layer
--------------
.. automodule:: paddle.v2.networks
:members: dropout_layer
:noindex:
==========
Optimizer
==========
Momentum
========
.. automodule:: paddle.v2.optimizer
:members: Momentum
:noindex:
Adam
====
.. automodule:: paddle.v2.optimizer
:members: Adam
:noindex:
Adamax
======
.. automodule:: paddle.v2.optimizer
:members: Adamax
:noindex:
AdaGrad
=======
.. automodule:: paddle.v2.optimizer
:members: AdaGrad
:noindex:
DecayedAdaGrad
==============
.. automodule:: paddle.v2.optimizer
:members: DecayedAdaGrad
:noindex:
AdaDelta
========
.. automodule:: paddle.v2.optimizer
:members: AdaDelta
:noindex:
RMSProp
=======
.. automodule:: paddle.v2.optimizer
:members: RMSProp
:noindex:
=======
Pooling
=======
BasePool
========
.. automodule:: paddle.v2.pooling
:members: BasePool
:noindex:
Avg
===
.. automodule:: paddle.v2.pooling
:members: Avg
:noindex:
Max
===
.. automodule:: paddle.v2.pooling
:members: Max
:noindex:
Sum
===
.. automodule:: paddle.v2.pooling
:members: Sum
:noindex:
SquareRootN
===========
.. automodule:: paddle.v2.pooling
:members: SquareRootN
:noindex:
CudnnAvg
========
.. automodule:: paddle.v2.pooling
:members: CudnnAvg
:noindex:
CudnnMax
========
.. automodule:: paddle.v2.pooling
:members: CudnnMax
:noindex:
==================================
Data Reader Interface and DataSets
==================================
DataTypes
=========
.. automodule:: paddle.v2.data_type
:members:
:noindex:
DataFeeder
==========
.. automodule:: paddle.v2.data_feeder
:members:
:noindex:
Reader
======
.. automodule:: paddle.v2.reader
:members:
:noindex:
.. automodule:: paddle.v2.reader.creator
:members:
:noindex:
minibatch
=========
.. automodule:: paddle.v2.minibatch
:members:
:noindex:
Dataset
=======
.. automodule:: paddle.v2.dataset
:members:
:noindex:
mnist
+++++
.. automodule:: paddle.v2.dataset.mnist
:members:
:noindex:
cifar
+++++
.. automodule:: paddle.v2.dataset.cifar
:members:
:noindex:
conll05
+++++++
.. automodule:: paddle.v2.dataset.conll05
:members: get_dict,get_embedding,test
:noindex:
imdb
++++
.. automodule:: paddle.v2.dataset.imdb
:members:
:noindex:
imikolov
++++++++
.. automodule:: paddle.v2.dataset.imikolov
:members:
:noindex:
movielens
+++++++++
.. automodule:: paddle.v2.dataset.movielens
:members:
:noindex:
.. autoclass:: paddle.v2.dataset.movielens.MovieInfo
:noindex:
.. autoclass:: paddle.v2.dataset.movielens.UserInfo
:noindex:
sentiment
+++++++++
.. automodule:: paddle.v2.dataset.sentiment
:members:
:noindex:
uci_housing
+++++++++++
.. automodule:: paddle.v2.dataset.uci_housing
:members:
:noindex:
wmt14
+++++
.. automodule:: paddle.v2.dataset.wmt14
:members:
:noindex:
Model Configuration
===================
.. toctree::
:maxdepth: 1
config/activation.rst
config/layer.rst
config/optimizer.rst
config/pooling.rst
config/networks.rst
config/attr.rst
======================
Training and Inference
======================
Parameters
==========
.. automodule:: paddle.v2.parameters
:members: Parameters
:noindex:
Trainer
=======
.. automodule:: paddle.v2.trainer
:members: SGD
:noindex:
Event
=====
.. automodule:: paddle.v2.event
:members:
:noindex:
Inference
=========
.. autofunction:: paddle.v2.infer
:noindex:
\ No newline at end of file
# PaddlePaddle Design Doc
## Ingredients
As our design principle is starting from the essence: how could we
allow users to express and solve their problems at neural networks.
Some essential concepts that our API have to provide include:
1. A *topology* is an expression of *layers*.
1. A layer could be any kind of computation, including *cost*.
1. Some layers have parameters, some don't. Most costs don't have
parameters.
1. In some topologies, layers share parameters. For
example,
[the network for training a ranking model](https://github.com/PaddlePaddle/Paddle/issues/1311#issuecomment-279121850).
1. At programming time, users specify topologies and possible sharing
of parameters. PaddlePaddle can figure out and create parameters
required (and possibly shared) by one or more topologies.
## Starting from Examples
As a summarization
of
[our disucssion](https://github.com/PaddlePaddle/Paddle/issues/1315),
let us present two examples here:
### Example 1. Sharing Parameters between Layers
We use
the
[3-branch ranking](https://github.com/PaddlePaddle/Paddle/issues/1311#issuecomment-279121850) model
in this example. For your convenience, I copy-a-paste the model's
topology as follows:
```
A -> f -\
Q -> f --> cost
B -> f -/
```
The following program trains the topology including the cost, and then
use the sub-network in the trained topology in inference:
```python
def f(in):
e = paddle.layer.embedding(in, parameter_name="embedding")
o = paddle.layer.softmax(e, parameter_name="semantic")
return o
# Create 3 topologies (subnets), they share parameters because all
# correspoinding layers have the same parameter names.
fA = f(paddle.layer.data(input_name="A"))
fB = f(paddle.layer.data(input_name="B"))
fQ = f(paddle.layer.data(input_name="Q"))
topology = paddle.layer.less_than(
paddle.layer.cross_entropy(fA, fQ),
paddle.layer.corss_entropy(fB, fQ))
# Derive parameters required in topology and create them in model.
parameters = paddle.parameters.create(topology)
# Estimate parameters used in topology from data.
paddle.train(topology, parameters, reader=read_ranking_model_data)
# Inference using fA (or fB or fC, as they share their parameters).
[testA, testB, testQ] = read_ranking_model_data()
print "The sematic-vector of testA: ", paddle.infer(fA, parameters, testA)
```
### Example 2. Sharing Parameters between "Models"
We use [GAN](https://github.com/PaddlePaddle/book/tree/develop/gan) in
this example. In the following example program, `d0` and `d1`
correspond to the two networks in the following figure:
<img src="https://github.com/wangyang59/book/raw/00036f4b0da5225041a6824587c1a01cf20159b1/gan/image/gan_ig.png" width=400 />
```python
def G(in):
# over-simplified example as G has only one layers:
return paddle.layer.fc(in, parameter_name="G")
def D(in);
# again, over-simplified:
return paddle.layer.fc(in, parameter_name="D")
# Construct the first topology, which contains both D and G.
# By learning this topology, we update parameters of G.
d0 = paddle.layer.should_be_false(D(G(paddle.layer.data())))
# Construct a second topology d1, which contains only D. By
# training this topology, we update parameters of D. Note
# that d1 share parameters with d0.
d1 = paddle.layer.should_be_true(D(paddle.layer.data()))
# Create parameters from a list of multiple topologies (models) for
# the chance to share parameters between these topologies.
parameters = paddle.parameters.create([d0, d1])
# Iterative training of GAN.
for ...:
train(d0, parameters, reader=read_from_rng, immutable_parameters={"D"})
train(d1, parameters, reader=read_from_realistic_images)
# Use d1 for inference:
print "D thinks a batch of images are realistic ", infer(d1, parameters, read_mnist_images)
```
### Summarization
Above two programs reveal some important design concerns:
1. Users describe a topology as an expression of layers. Every layer
has a *parameter name*. If the users don't specify it explicitly, it's automatically generated as a unique name. By
specifying the parameter name, users can specify the sharing of
parameters between layers and even between topologies.
1. `paddle.parameters.create` figures out parameters required by one
or more topologies from parameter names of layers. It creates these
parameters and returns a `ParameterSet` object, which is in essence
a map from *parameter names* to *parameters*.
1. At training and inference time, `paddle.train` and `paddle.infer`
requires both a topology and the parameter set that holds the parameters of that topology. There are some reasons:
1. This prevents users from forgetting to call
`paddle.parameters.create`.
1. `paddle.train` needs to know which parameter set to update.
1. Users could load another (pre-trained) parameter set and use it
with a topology in `train.infer`.
1. By specifying the `immutable_parameters` parameter of
`paddle.train`, we can forbid the update of these parameters.
## Reader
Not all programming frameworks allow users to define I/O functions.
An example is Google MapReduce, which can only read from text,
SSTable, and RecordIO files. Hadoop MapReduce allows users to define
readers and writers by deriving from base classes `Reader` and
`Writer`. The former is less flexible but also less error-prone. We
decide to provide the flexibility to users to define their readers.
There are some open questions here:
1. **Should a reader return a Python dictionary?**
1. **How to map multiple outputs from a reader to multiple data layers?**
1. **How to easily compose some existing readers to read more data and
feed a topology with more data layers?**
## Training
The recommended way to training a model is to call `paddle.train`,
which simply calls `paddle.trainer.Default`, a global variable of
type `paddle.trainer.SGD`. Equivalently, we can do
```python
opt = paddle.trainer.SGD(..., paddle.updater.Adam(...))
opt.train(topology, parameters, reader=read, ...)
```
### Updater
Please be aware that a trainer can accept an updater as its data
member, where an updater is a class derived from
`paddle.trainer.Updater`. This is to make it easier to customize
trainers, as discussed
[here](https://github.com/PaddlePaddle/Paddle/issues/1319).
### Event Handler
`paddle.train` and `paddle.trainer.XXX.train` take an optional
parameter `event_handler`, which should be either `None` or a function
that handle some events:
1. BeginTraining
1. EndTraining
1. BeginIteration
1. EndIteration
1. BeginPass
1. EndPass
where EndPass is sent if and only if the reader yields
`end_pass=True`.
An example as follows:
```python
def event_handler(event):
if ininstance(event, paddle.event.EndIteration):
print paddle.test(...)
paddle.train(topology, parameters, reader, event_handler)
```
If we are writing a PaddlePaddle program in and for iPython/Jypyter,
we can use metaplotlib in the event handler to plot a curve of
cost/error versus iterations, as shown
[here](https://blog.dominodatalab.com/interactive-dashboards-in-jupyter/).
### Distributed Training
If users want to do distributed training on a cluster, s/he should
call `paddle.dist_train` and provides access tokens to the cluster as
a parameter.
For example, if the user has a TLS certificate that allows him to
access a Kubernetes cluster, s/he should be able to call
```python
paddle.dist_train(model,
trainer=paddle.trainer.SGD(...,
paddle.updater.Adam(...)),
reader=read,
k8s_user="yi",
k8s_token="kube_cluster_tls.pem",
k8s_job="hello",
num_parameter_servers=15)
```
The pseudo code if `paddle.dist_train` is as follows:
```python
def dist_train(topology, parameters, trainer, reader, ...):
if os.getenv("KUBERNETES_SERVICE_HOST") == None:
image_name = k8s_user + '/' + k8s_job
docker_build(image_name)
docker_push()
kube_ctrl_start_job(image_name, k8s_user, k8s_token)
else:
rank = kube_list_containers_in_job_and_return_current_containers_rank()
if rank == 0:
master()
elif rank < 15:
parameter_server()
else:
trainer.train(model, reader=read)
```
Please be aware that if a process is running on the Kubernetes
cluster, it will have some environment variables pre-defined.
If `dist_train` doesn't see these environment variables, it knows
that it's running on users' personal computer, and it should work as a
*launcher*. Otherwise, it knows that it's running on the cluster and
need to figure out its role as either the master, or a trainer, or a
parameter server.
# Design Doc: Distributed Training
## Objective
In [this slides](https://www.slideshare.net/cxwangyi/paddlepaddle-a-complete-solution-for-businesses), we explained that we'd like PaddlePaddle running on general-purpose clusters like those managed by Kubernetes, so to address demands for AI from both Internet and non-Internet industries.
This poses technical challenges to PaddlePaddle:
1. Support fault-recovery.
1. Support both offline and online training.
1. [Serverless computing](https://en.wikipedia.org/wiki/Serverless_computing) of distributed training.
## Training Job
A training job will be created once user asks Paddle cloud to train a model. The training job is made up of different processes that collaboratively consume data and produce a trained model. There are three kinds of processes:
1. the *master process*, which dispatches tasks to
1. one or more *trainer processes*, which run distributed training and synchronize gradients/models via
1. one or more *parameter server processes*, where each holds a shard of the global model.
Their relation is illustrated in the following graph:
<img src="src/paddle-model-sharding.png"/>
### Master Process
The master process will:
- Partition a dataset into [tasks](#task) and dispatch tasks to trainers.
- Keep track of training progress on the dataset with [task queue](#task-queue). A training job will iterate on the dataset for a full pass until it goes into next pass.
#### Task
A task is a data shard to be trained. The total number of tasks will be much bigger than the total number of trainers. The number of data instances inside a task will be much bigger than the mini-batch size.
#### Task Queue
The master process has three task queues to track training progress. As illustrated in the graph below, Job A and Job B both have one master process. Each master process has three task queues.
<img src="src/paddle-task-queues.png"/>
- The todo queue holds tasks to be dispatched. When a job starts, the master process fills in the todo queue with all tasks.
- The pending queue holds tasks that are currently training by trainers.
- the done queue holds tasks that are already trained.
The life cycle of a single task is illustrated below:
<img src="src/paddle-task-states.png"/>
1. When a new pass of training starts, all tasks will be placed in the todo queue.
1. The master process will dispatch few tasks to each trainer at a time, puts them in the pending queue and waits for completion.
1. The trainer will work on its tasks and tell the master process once a task is completed. The master process will dispatch a new task to that trainer.
1. If a task timeout. the master process will move it back to the todo queue. The timeout count will increase by one. If the timeout count is above a threshold, the task is likely to cause a trainer to crash, so it will be discarded.
1. The master process will move completed task to the done queue. When the todo queue is empty, the master process will start a new pass by moving all tasks in the done queue to todo queue and reset the timeout counter of all tasks to zero.
### Trainer Process
The trainer process will:
- Receive tasks from the master.
- Work on the tasks: calculate and upload gradient to parameter servers, and update local model by downloading new parameters from parameter servers.
### Parameter Server Process
Parameter server processes hold the parameters collaboratively. The parameters are partitioned on different parameter servers.
The parameter server will:
- Receive gradient from the trainers, update its parameters, and give the trainers the latest parameters.
- Periodically save its parameters to distributed file system by overriding the previous save.
### Optimization Algorithms
The communication pattern between the trainers and the parameter servers depends on the category of optimization algorithm:
- Synchronous Stochastic Gradient Descent (sync-SGD)
Parameter server will wait for all trainer finish n-th mini-batch calculation and send their gradients before broadcasting new parameters to every trainer. Every trainer will wait for the new parameters before starting n+1-th mini-batch.
- Asynchronous Stochastic Gradient Descent (async-SGD)
There will no synchronization between different trainers, and parameter server updates its parameter as soon as it receives new gradient:
- Each trainer uploads its accumulated gradient every n mini-batches.
- Every m mini-batches, the trainer downloads new parameters from parameter server.
- n and m do not have to be equal.
## Fault Tolerant
The training job will pause if the master processes is dead, or any of the parameter server process is dead. They will be started by [Kubernetes](https://kubernetes.io/) and recover in few minutes. Please refer to [fault recovery](#fault-recovery).
The training job will continue to make progress if there is at least one training process running. The strategy depends on the type of optimization algorithm:
- sync-SGD
TODO
- async-SGD
Since async-SGD does not require synchronization between mini-batches, the system will by definition make process if at least one trainer is running.
## Fault Recovery
PaddlePaddle uses [etcd](https://github.com/coreos/etcd) to keep track of the states of processes. Because etcd is a distributed reliable key-value store, the restarted process can recover its states from etcd. The model parameters are periodically saved into distributed file system, so a restarted parameter server can recover its parameters from the saved file.
Now we will introduce how each process recovers from a failure, the graph below shows how etcd is used:
<img src="src/paddle-etcd.png"/>
### Master Process
When the master is started by the Kubernetes, it executes the following steps at startup:
1. Grabs a unique *master* lock in etcd, which prevents concurrent master instantiations.
1. Recovers the task queues from etcd if they already exist, otherwise, the master will create them.
1. Watches the trainer prefix keys `/trainer/` on etcd to find the live trainers.
1. Starts dispatching the tasks to the trainers, and updates task queue using an etcd transaction to ensure lock is held during the update.
The master process will kill itself if its etcd lease expires.
When the master process is dead for any reason, Kubernetes will restart it. It will be online again with all states recovered from etcd in few minutes.
### Trainer Process
When the trainer is started by the Kubernetes, it executes the following steps at startup:
1. Watches the available parameter server prefix keys `/ps/` on etcd and waits until the count of parameter servers reaches the desired count.
1. Generates a unique ID, and sets key `/trainer/<unique ID>` with its contact address as value. The key will be deleted when the lease expires, so the master will be aware of the trainer being online and offline.
1. Waits for tasks from the master to start training.
If trainer's etcd lease expires, it will try set key `/trainer/<unique ID>` again so that the master process can discover the trainer again.
### Parameter Server Process
When the parameter server is started by Kubernetes, it executes the following steps at startup:
1. Read desired total number of parameter servers from etcd `/ps_desired`
1. Search through etcd keys `/ps/<index>` (`/ps/0`, `/ps/1`, ...) to find the first non-existant key whose index is smaller than the total number of parameter servers. Set the key using a transaction to avoid concurrent writes. The parameter server's index is inferred from the key name.
The desired number of parameter servers is 3:
<img src="src/paddle-ps-0.png"/>
The third parameter server joined:
<img src="src/paddle-ps-1.png"/>
1. The parameter server can load parameters if there are already saved parameters in the save path (inferred from its index).
1. Now the parameter server is ready for the trainers' requests.
If the parameter server's etcd lease expires, the parameter server will kill itself.
## Dynamic Scaling
### Trainer Scaling
TODO
### Parameter Server Scaling
Not planned for v1.
## Training Dataset Format
TODO
## User Interface
TODO
# Paddle多语言接口实现
## 背景
Paddle需要一个多语言接口,这个接口需要做到:
* 有标准的,良好的文档
* 例如Python可以使用[Sphinx](http://www.sphinx-doc.org/en/stable/)生成API文档,golang可以使用[GoDoc](https://godoc.org/golang.org/x/tools/cmd/godoc)生成文档。这都需要这个接口按照约定俗成的规则来注释完备。
* 不同语言的接口适应不同语言的特性
* 例如Java与Python的错误处理是直接扔出来Exception,而对于golang错误处理应该使用返回值。
## 基本要求
Paddle的多语言接口实现包括一下几个方面:
* 我们使用动态库来分发Paddle。在这个动态库中不嵌入任何其他语言的解释器,也不使用其他动态库。
* 这个动态库使用C99标准的头文件导出一些函数,不使用/导出C++符号。
* 不导出Paddle内部的结构体、类,仅仅使用`void*`指针作为类型的句柄(handler)。
* 不使用SWIG这种代码生成器,而是手写多语言绑定。
## 原因
### 使用动态库来分发Paddle
* Paddle的链接方式比较复杂
* 如果用户要把Paddle的静态库(libpaddle.a)链接到自己的程序里,得使用 `--whole-archive` (for GCC) 或者 `--force_load` (for Clang) 参数,来确保把 libpaddle.a 里所有的符号都写入自己的程序的二进制文件里。这是因为 Paddle 的源码里使用了[object factory design pattern](http://stackoverflow.com/a/1310326/724872)。
* 编译型语言,例如C/C++使用静态库和动态库难度差不多。但是解释性语言,例如[Python](http://stackoverflow.com/questions/19560594/how-to-import-static-library-in-python)或者[Java](http://stackoverflow.com/questions/24493337/linking-static-library-with-jni),只能调用Paddle的动态库,否则得把Paddle静态库链接到解释器里。
* 解释性语言实际运行的二进制是解释器本身,如果调用静态库只能将静态库与解释器链接。例如对于Java来说,便是将静态库加入JVM中。这对于通常的Java的开发者来说,是不常见的做法。
### 动态库中不嵌入任何其他语言的解释器
* 目前Paddle的进程模型是C++内部驱动Python解释器进行模型配置解析和数据读取
* 我们最终的动态库中不嵌入Python或者其他任何语言的解释器。模型配置解析,数据读取均交由其他语言完成
现阶段Paddle有一个问题是,Paddle内嵌的Python解释器和外部使用的Python如果版本不同,会直接报错退出。
### Paddle动态库中,不引用其他动态库
* 即这个动态库是不依赖于其他任何文件的,可以在任何机器上执行的。
### 这个动态库使用C99标准的头文件导出一些函数,不使用/导出C++符号
* 由于C++编译器没有[名字修饰](https://en.wikipedia.org/wiki/Name_mangling#C.2B.2B)的规范,不同版本的编译器之间,对于同一段C++代码生成的符号可能不一致。而多语言接口需要直接读取生成的二进制(动态库),需要有稳定的导出符号。
* C语言是有导出符号的标准的,并且在常见的平台上,都是ABI调用标准的。
* 大多数语言都支持使用C语言API
* 使用C99而不使用C89,是因为C99支持[Fixed-width integer types](https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types)和[Boolean type](https://en.wikipedia.org/wiki/C_data_types#Boolean_type)。
* 使用C99而不使用C11的原因是,[C11](https://en.wikipedia.org/wiki/C11_(C_standard_revision))并没有Paddle特别需要的特性,且C99相对于C11使用更加广泛。
### 不导出Paddle内部的结构体、类,仅仅使用`void*`指针作为类型的句柄(handler)
* Paddle内部的类为C++书写,直接导出到C的接口比较困难。
* 在C-API中使用`void*`来表示Paddle内部类。再在每一个API中自己检查类型。
在C的头文件 `paddle_matrix.h` 中:
```C
typedef void* paddle_matrix;
typedef int paddle_error;
extern "C"
paddle_error paddle_matrix_shape(paddle_matrix matrix,
uint64_t* width,
uint64_t* height);
```
而在CPP里面实现这个C的接口,文件 `paddle_matrix.cpp`
```cpp
#include "paddle/math/matrix.hpp"
extern "C"
paddle_error paddle_matrix_shape(paddle_matrix matrix,
uint64_t *width,
uint64_t *height) {
auto m = (paddle::math::matrix*)(matrix);
*width = m->width();
*height = m->height();
}
```
其中`paddle/math/matrix.hpp`文件内容为:
```cpp
namespace paddle {
namespace math {
class Matrix {
//...
};
} // namespace math
} // namespace paddle
```
### 不使用SWIG这种代码生成器,而是手写多语言绑定
* [SWIG](http://www.swig.org/)是一个多语言接口的代码生成器。他的目标是使用C/C++写代码,SWIG直接读取C/C++的头文件,生成各种语言的绑定代码。
* 对于多语言接口,SWIG需要写一个interface文件。这个文件具有独特的语法,学习成本高。且增加一个第三方语言,就需要对这个第三方语言增加一些定义。有的时候,interface文件的写法非常[tricky](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/api/Paddle.swig#L36)。社区贡献代码学习成本高。
* SWIG暴露的接口保留了C++的接口样式,很难保证多语言代码风格的一致性。(函数命名,错误处理)
* 因为SWIG在第三方语言中暴露的函数名,类名和C++中完全一致。C++的命名风格并不能适应其他第三方语言。如果使用SWIG我们需要将在interface文件里,将大量的`SomeCppClass`重命名成`some_python_class`,或者`SomeGoTypes`。
* 对于不同语言,错误处理的方式也不尽相同。例如对于Java或者Python,最常见的错误处理方式是Exception,而对于Golang,错误处理方式是返回值。而SWIG只能简单的暴露C++接口,无法做到对于各种语言错误处理方式的适配。
* 对于大多数语言,直接使用C语言的.h并不困难。例如Python的[cffi](https://cffi.readthedocs.io/en/latest/overview.html#simple-example-abi-level-in-line)或者[Cython](http://cython.org/), golang的[cgo](https://golang.org/cmd/cgo/)。
* SWIG支持的语言或者解释器有局限。例如对于Python,使用SWIG只支持CPython解释器,而不支持PyPy解释器。
## 原因列表
| 结论 | 对比 | 原因 |
|---| --- | --- |
| 使用动态库 | 不使用静态库 | 解释型语言只能调用动态库,Paddle静态库链接复杂 |
| 不嵌入其他语言解释器 | 不嵌入Python解释器 | Paddle C++目前嵌入Python解释器,会导致不同版本Python在一个进程里的bug |
| 不引用其他动态库 | | Paddle一个动态库可以在任何Linux系统上运行 |
| 使用C99做接口 | 不使用C++做接口 | C有标准的ABI,C99是目前C最广泛的使用标准,且C99支持bool类型和定长整数(uint64_t等)类型 |
| 使用void*作为类句柄 | 不显示的写每个类具体包含什么| 实现简单,并且让接口脱离实现细节 |
| 手写多语言绑定 | 不使用SWIG | 使用SWIG需要多语言绑定的开发人员熟练掌握SWIG配置,社区参与困难。SWIG生成的代码不能保证多语言代码风格的一致性 |
## 简单实现
TBD
# Python Data Reader Design Doc
At training and testing time, PaddlePaddle programs need to read data. To ease the users' work to write data reading code, we define that
- A *reader* is a function that reads data (from file, network, random number generator, etc) and yields data items.
- A *reader creator* is a function that returns a reader function.
- A *reader decorator* is a function, which accepts one or more readers, and returns a reader.
- A *batch reader* is a function that reads data (from *reader*, file, network, random number generator, etc) and yields a batch of data items.
and provide function which converts reader to batch reader, frequently used reader creators and reader decorators.
## Data Reader Interface
Indeed, *data reader* doesn't have to be a function that reads and yields data items. It can be any function with no parameter that creates a iterable (anything can be used in `for x in iterable`):
```
iterable = data_reader()
```
Element produced from the iterable should be a **single** entry of data, **not** a mini batch. That entry of data could be a single item, or a tuple of items. Item should be of [supported type](http://www.paddlepaddle.org/doc/ui/data_provider/pydataprovider2.html?highlight=dense_vector#input-types) (e.g., numpy 1d array of float32, int, list of int)
An example implementation for single item data reader creator:
```python
def reader_creator_random_image(width, height):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height)
return reader
```
An example implementation for multiple item data reader creator:
```python
def reader_creator_random_image_and_label(width, height, label):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height), label
return reader
```
## Batch Reader Interface
*batch reader* can be any function with no parameter that creates a iterable (anything can be used in `for x in iterable`). The output of the iterable should be a batch (list) of data items. Each item inside the list must be a tuple.
Here are valid outputs:
```python
# a mini batch of three data items. Each data item consist three columns of data, each of which is 1.
[(1, 1, 1),
(2, 2, 2),
(3, 3, 3)]
# a mini batch of three data items, each data item is a list (single column).
[([1,1,1],),
([2,2,2],),
([3,3,3],),
```
Please note that each item inside the list must be a tuple, below is an invalid output:
```python
# wrong, [1,1,1] needs to be inside a tuple: ([1,1,1],).
# Otherwise it's ambiguous whether [1,1,1] means a single column of data [1, 1, 1],
# or three column of datas, each of which is 1.
[[1,1,1],
[2,2,2],
[3,3,3]]
```
It's easy to convert from reader to batch reader:
```python
mnist_train = paddle.dataset.mnist.train()
mnist_train_batch_reader = paddle.batch(mnist_train, 128)
```
Also easy to create custom batch reader:
```python
def custom_batch_reader():
while True:
batch = []
for i in xrange(128):
batch.append((numpy.random.uniform(-1, 1, 28*28),)) # note that it's a tuple being appended.
yield batch
mnist_random_image_batch_reader = custom_batch_reader
```
## Usage
batch reader, mapping from item(s) read to data layer, batch size and number of total pass will be passed into `paddle.train`:
```python
# two data layer is created:
image_layer = paddle.layer.data("image", ...)
label_layer = paddle.layer.data("label", ...)
# ...
batch_reader = paddle.batch(paddle.dataset.mnist.train(), 128)
paddle.train(batch_reader, {"image":0, "label":1}, 128, 10, ...)
```
## Data Reader Decorator
*Data reader decorator* takes a single or multiple data reader, returns a new data reader. It is similar to a [python decorator](https://wiki.python.org/moin/PythonDecorators), but it does not use `@` syntax.
Since we have a strict interface for data readers (no parameter, return a single data item). Data reader can be used flexiable via data reader decorators. Following are a few examples:
### Prefetch Data
Since reading data may take time and training can not proceed without data. It is generally a good idea to prefetch data.
Use `paddle.reader.buffered` to prefetch data:
```python
buffered_reader = paddle.reader.buffered(paddle.dataset.mnist.train(), 100)
```
`buffered_reader` will try to buffer (prefetch) `100` data entries.
### Compose Multiple Data Readers
For example, we want to use a source of real images (reusing mnist dataset), and a source of random images as input for [Generative Adversarial Networks](https://arxiv.org/abs/1406.2661).
We can do:
```python
def reader_creator_random_image(width, height):
def reader():
while True:
yield numpy.random.uniform(-1, 1, size=width*height)
return reader
def reader_creator_bool(t):
def reader:
while True:
yield t
return reader
true_reader = reader_creator_bool(True)
false_reader = reader_creator_bool(False)
reader = paddle.reader.compose(paddle.dataset.mnist.train(), data_reader_creator_random_image(20, 20), true_reader, false_reader)
# Skipped 1 because paddle.dataset.mnist.train() produces two items per data entry.
# And we don't care second item at this time.
paddle.train(paddle.batch(reader, 128), {"true_image":0, "fake_image": 2, "true_label": 3, "false_label": 4}, ...)
```
### Shuffle
Given shuffle buffer size `n`, `paddle.reader.shuffle` will return a data reader that buffers `n` data entries and shuffle them before a data entry is read.
Example:
```python
reader = paddle.reader.shuffle(paddle.dataset.mnist.train(), 512)
```
## Q & A
### Why reader return only a single entry, but not a mini batch?
Always returning a single entry make reusing existing data readers much easier (e.g., if existing reader return not a single entry but 3 entries, training code will be more complex because it need to handle cases like batch size 2).
We provide function `paddle.batch` to turn (single entry) reader into batch reader.
### Why do we need batch reader, isn't train take reader and batch_size as arguments sufficient?
In most of the case, train taking reader and batch_size as arguments would be sufficent. However sometimes user want to customize order of data entries inside a mini batch. Or even change batch size dynamically.
### Why use a dictionary but not a list to provide mapping?
We decided to use dictionary (`{"image":0, "label":1}`) instead of list (`["image", "label"]`) is because that user can easily resue item (e.g., using `{"image_a":0, "image_b":0, "label":1}`) or skip item (e.g., using `{"image_a":0, "label":2}`).
### How to create custom data reader creator
```python
def image_reader_creator(image_path, label_path, n):
def reader():
f = open(image_path)
l = open(label_path)
images = numpy.fromfile(
f, 'ubyte', count=n * 28 * 28).reshape((n, 28 * 28)).astype('float32')
images = images / 255.0 * 2.0 - 1.0
labels = numpy.fromfile(l, 'ubyte', count=n).astype("int")
for i in xrange(n):
yield images[i, :], labels[i] # a single entry of data is created each time
f.close()
l.close()
return reader
# images_reader_creator creates a reader
reader = image_reader_creator("/path/to/image_file", "/path/to/label_file", 1024)
paddle.train(paddle.batch(reader, 128), {"image":0, "label":1}, ...)
```
### How is `paddle.train` implemented
An example implementation of paddle.train could be:
```python
def train(batch_reader, mapping, batch_size, total_pass):
for pass_idx in range(total_pass):
for mini_batch in batch_reader(): # this loop will never end in online learning.
do_forward_backward(mini_batch, mapping)
```
Simple Linear Regression
========================
PaddlePaddle is a deep learning platform open-sourced by Baidu. With PaddlePaddle, you can easily train a classic neural network within a couple lines of configuration, or you can build sophisticated models that provide state-of-the-art performance on difficult learning tasks like sentiment analysis, machine translation, image caption and so on.
Problem Background
------------------
Now, to give you a hint of what using PaddlePaddle looks like, let's start with a fundamental learning problem - `simple linear regression <https://en.wikipedia.org/wiki/Simple_linear_regression>`_: you have observed a set of two-dimensional data points of ``X`` and ``Y``, where ``X`` is an explanatory variable and ``Y`` is corresponding dependent variable, and you want to recover the underlying correlation between ``X`` and ``Y``. Linear regression can be used in many practical scenarios. For example, ``X`` can be a variable about house size, and ``Y`` a variable about house price. You can build a model that captures relationship between them by observing real estate markets.
Prepare the Data
-----------------
Suppose the true relationship can be characterized as ``Y = 2X + 0.3``, let's see how to recover this pattern only from observed data. Here is a piece of python code that feeds synthetic data to PaddlePaddle. The code is pretty self-explanatory, the only extra thing you need to add for PaddlePaddle is a definition of input data types.
.. code-block:: python
# dataprovider.py
from paddle.trainer.PyDataProvider2 import *
import random
# define data types of input: 2 real numbers
@provider(input_types=[dense_vector(1), dense_vector(1)],use_seq=False)
def process(settings, input_file):
for i in xrange(2000):
x = random.random()
yield [x], [2*x+0.3]
Train a NeuralNetwork
----------------------
To recover this relationship between ``X`` and ``Y``, we use a neural network with one layer of linear activation units and a square error cost layer. Don't worry if you are not familiar with these terminologies, it's just saying that we are starting from a random line ``Y' = wX + b`` , then we gradually adapt ``w`` and ``b`` to minimize the difference between ``Y'`` and ``Y``. Here is what it looks like in PaddlePaddle:
.. code-block:: python
# trainer_config.py
from paddle.trainer_config_helpers import *
# 1. read data. Suppose you saved above python code as dataprovider.py
data_file = 'empty.list'
with open(data_file, 'w') as f: f.writelines(' ')
define_py_data_sources2(train_list=data_file, test_list=None,
module='dataprovider', obj='process',args={})
# 2. learning algorithm
settings(batch_size=12, learning_rate=1e-3, learning_method=MomentumOptimizer())
# 3. Network configuration
x = data_layer(name='x', size=1)
y = data_layer(name='y', size=1)
y_predict = fc_layer(input=x, param_attr=ParamAttr(name='w'), size=1, act=LinearActivation(), bias_attr=ParamAttr(name='b'))
cost = mse_cost(input=y_predict, label=y)
outputs(cost)
Some of the most fundamental usages of PaddlePaddle are demonstrated:
- The first part shows how to feed data into PaddlePaddle. In general cases, PaddlePaddle reads raw data from a list of files, and then do some user-defined process to get real input. In this case, we only need to create a placeholder file since we are generating synthetic data on the fly.
- The second part describes learning algorithm. It defines in what ways adjustments are made to model parameters. PaddlePaddle provides a rich set of optimizers, but a simple momentum based optimizer will suffice here, and it processes 12 data points each time.
- Finally, the network configuration. It usually is as simple as "stacking" layers. Three kinds of layers are used in this configuration:
- **Data Layer**: a network always starts with one or more data layers. They provide input data to the rest of the network. In this problem, two data layers are used respectively for ``X`` and ``Y``.
- **FC Layer**: FC layer is short for Fully Connected Layer, which connects all the input units to current layer and does the actual computation specified as activation function. Computation layers like this are the fundamental building blocks of a deeper model.
- **Cost Layer**: in training phase, cost layers are usually the last layers of the network. They measure the performance of current model, and provide guidence to adjust parameters.
Now that everything is ready, you can train the network with a simple command line call:
.. code-block:: bash
paddle train --config=trainer_config.py --save_dir=./output --num_passes=30
This means that PaddlePaddle will train this network on the synthectic dataset for 30 passes, and save all the models under path ``./output``. You will see from the messages printed out during training phase that the model cost is decreasing as time goes by, which indicates we are getting a closer guess.
Evaluate the Model
-------------------
Usually, a different dataset that left out during training phase should be used to evalute the models. However, we are lucky enough to know the real answer: ``w=2, b=0.3``, thus a better option is to check out model parameters directly.
In PaddlePaddle, training is just to get a collection of model parameters, which are ``w`` and ``b`` in this case. Each parameter is saved in an individual file in the popular ``numpy`` array format. Here is the code that reads parameters from last pass.
.. code-block:: python
import numpy as np
import os
def load(file_name):
with open(file_name, 'rb') as f:
f.read(16) # skip header for float type.
return np.fromfile(f, dtype=np.float32)
print 'w=%.6f, b=%.6f' % (load('output/pass-00029/w'), load('output/pass-00029/b'))
# w=1.999743, b=0.300137
.. image:: parameters.png
:align: center
Although starts from a random guess, you can see that value of ``w`` changes quickly towards 2 and ``b`` changes quickly towards 0.3. In the end, the predicted line is almost identical with real answer.
There, you have recovered the underlying pattern between ``X`` and ``Y`` only from observed data.
Installing from Sources
==========================
* [1. Download and Setup](#download)
* [2. Requirements](#requirements)
* [3. Build on Ubuntu](#ubuntu)
* [4. Build on Centos](#centos)
## <span id="download">Download and Setup</span>
You can download PaddlePaddle from the [github source](https://github.com/PaddlePaddle/Paddle).
```bash
git clone https://github.com/PaddlePaddle/Paddle paddle
cd paddle
```
## <span id="requirements">Requirements</span>
To compile the source code, your computer must be equipped with the following dependencies.
- **Compiler**: GCC >= 4.8 or Clang >= 3.3 (AppleClang >= 5.1) and gfortran compiler
- **CMake**: CMake >= 3.0 (at least CMake 3.4 on Mac OS X)
- **BLAS**: MKL, OpenBlas or ATLAS
- **Python**: only support Python 2.7
**Note:** For CUDA 7.0 and CUDA 7.5, GCC 5.0 and up are not supported!
For CUDA 8.0, GCC versions later than 5.3 are not supported!
### Options
PaddlePaddle supports some build options.
<html>
<table>
<thead>
<tr>
<th scope="col" class="left">Optional</th>
<th scope="col" class="left">Description</th>
</tr>
</thead>
<tbody>
<tr><td class="left">WITH_GPU</td><td class="left">Compile PaddlePaddle with NVIDIA GPU</td></tr>
<tr><td class="left">WITH_AVX</td><td class="left">Compile PaddlePaddle with AVX intrinsics</td></tr>
<tr><td class="left">WITH_DSO</td><td class="left">Compile PaddlePaddle with dynamic linked CUDA</td></tr>
<tr><td class="left">WITH_TESTING</td><td class="left">Compile PaddlePaddle with unit testing</td></tr>
<tr><td class="left">WITH_SWIG_PY</td><td class="left">Compile PaddlePaddle with inference api</td></tr>
<tr><td class="left">WITH_STYLE_CHECK</td><td class="left">Compile PaddlePaddle with style check</td></tr>
<tr><td class="left">WITH_PYTHON</td><td class="left">Compile PaddlePaddle with python interpreter</td></tr>
<tr><td class="left">WITH_DOUBLE</td><td class="left">Compile PaddlePaddle with double precision</td></tr>
<tr><td class="left">WITH_RDMA</td><td class="left">Compile PaddlePaddle with RDMA support</td></tr>
<tr><td class="left">WITH_TIMER</td><td class="left">Compile PaddlePaddle with stats timer</td></tr>
<tr><td class="left">WITH_PROFILER</td><td class="left">Compile PaddlePaddle with GPU profiler</td></tr>
<tr><td class="left">WITH_DOC</td><td class="left">Compile PaddlePaddle with documentation</td></tr>
<tr><td class="left">WITH_COVERAGE</td><td class="left">Compile PaddlePaddle with code coverage</td></tr>
<tr><td class="left">COVERALLS_UPLOAD</td><td class="left">Package code coverage data to coveralls</td></tr>
<tr><td class="left">ON_TRAVIS</td><td class="left">Exclude special unit test on Travis CI</td></tr>
</tbody>
</table>
</html>
**Note:**
- The GPU version works best with Cuda Toolkit 8.0 and cuDNN v5.
- Other versions like Cuda Toolkit 7.0, 7.5 and cuDNN v3, v4 are also supported.
- **To utilize cuDNN v5, Cuda Toolkit 7.5 is prerequisite and vice versa.**
As a simple example, consider the following:
1. **BLAS Dependencies(optional)**
CMake will search BLAS libraries from system. If not found, OpenBLAS will be downloaded, built and installed automatically.
To utilize preinstalled BLAS, you can simply specify MKL, OpenBLAS or ATLAS via `MKL_ROOT`, `OPENBLAS_ROOT` or `ATLAS_ROOT`.
```bash
# specify MKL
cmake .. -DMKL_ROOT=<mkl_path>
# or specify OpenBLAS
cmake .. -DOPENBLAS_ROOT=<openblas_path>
```
2. **Doc Dependencies(optional)**
To generate PaddlePaddle's documentation, install dependencies and set `-DWITH_DOC=ON` as follows:
```bash
pip install 'sphinx>=1.4.0'
pip install sphinx_rtd_theme recommonmark
# install doxygen on Ubuntu
sudo apt-get install doxygen
# install doxygen on Mac OS X
brew install doxygen
# active docs in cmake
cmake .. -DWITH_DOC=ON`
```
## <span id="ubuntu">Build on Ubuntu 14.04</span>
### Install Dependencies
- **Paddle Dependencies**
```bash
# necessary
sudo apt-get update
sudo apt-get install -y git curl gcc g++ gfortran make build-essential automake
sudo apt-get install -y python python-pip python-numpy libpython-dev bison
sudo pip install 'protobuf==3.1.0.post1'
# install cmake 3.4
curl -sSL https://cmake.org/files/v3.4/cmake-3.4.1.tar.gz | tar -xz && \
cd cmake-3.4.1 && ./bootstrap && make -j4 && sudo make install && \
cd .. && rm -rf cmake-3.4.1
```
- **GPU Dependencies (optional)**
To build GPU version, you will need the following installed:
1. a CUDA-capable GPU
2. A supported version of Linux with a gcc compiler and toolchain
3. NVIDIA CUDA Toolkit (available at http://developer.nvidia.com/cuda-downloads)
4. NVIDIA cuDNN Library (availabel at https://developer.nvidia.com/cudnn)
The CUDA development environment relies on tight integration with the host development environment,
including the host compiler and C runtime libraries, and is therefore only supported on
distribution versions that have been qualified for this CUDA Toolkit release.
After downloading cuDNN library, issue the following commands:
```bash
sudo tar -xzf cudnn-7.5-linux-x64-v5.1.tgz -C /usr/local
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
```
Then you need to set LD\_LIBRARY\_PATH, PATH environment variables in ~/.bashrc.
```bash
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda/bin:$PATH
```
### Build and Install
As usual, the best option is to create build folder under paddle project directory.
```bash
mkdir build && cd build
```
Finally, you can build and install PaddlePaddle:
```bash
# you can add build option here, such as:
cmake .. -DCMAKE_INSTALL_PREFIX=<path to install>
# please use sudo make install, if you want to install PaddlePaddle into the system
make -j `nproc` && make install
# set PaddlePaddle installation path in ~/.bashrc
export PATH=<path to install>/bin:$PATH
# install PaddlePaddle Python modules.
sudo pip install <path to install>/opt/paddle/share/wheels/*.whl
```
## <span id="centos">Build on Centos 7</span>
### Install Dependencies
- **CPU Dependencies**
```bash
# necessary
sudo yum update
sudo yum install -y epel-release
sudo yum install -y make cmake3 python-devel python-pip gcc-gfortran swig git
sudo pip install wheel numpy
sudo pip install 'protobuf>=3.0.0'
```
- **GPU Dependencies (optional)**
To build GPU version, you will need the following installed:
1. a CUDA-capable GPU
2. A supported version of Linux with a gcc compiler and toolchain
3. NVIDIA CUDA Toolkit (available at http://developer.nvidia.com/cuda-downloads)
4. NVIDIA cuDNN Library (availabel at https://developer.nvidia.com/cudnn)
The CUDA development environment relies on tight integration with the host development environment,
including the host compiler and C runtime libraries, and is therefore only supported on
distribution versions that have been qualified for this CUDA Toolkit release.
After downloading cuDNN library, issue the following commands:
```bash
sudo tar -xzf cudnn-7.5-linux-x64-v5.1.tgz -C /usr/local
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
```
Then you need to set LD\_LIBRARY\_PATH, PATH environment variables in ~/.bashrc.
```bash
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda/bin:$PATH
```
### Build and Install
As usual, the best option is to create build folder under paddle project directory.
```bash
mkdir build && cd build
```
Finally, you can build and install PaddlePaddle:
```bash
# you can add build option here, such as:
cmake3 .. -DCMAKE_INSTALL_PREFIX=<path to install>
# please use sudo make install, if you want to install PaddlePaddle into the system
make -j `nproc` && make install
# set PaddlePaddle installation path in ~/.bashrc
export PATH=<path to install>/bin:$PATH
# install PaddlePaddle Python modules.
sudo pip install <path to install>/opt/paddle/share/wheels/*.whl
```
PaddlePaddle in Docker Containers
=================================
Docker container is currently the only officially-supported way to
running PaddlePaddle. This is reasonable as Docker now runs on all
major operating systems including Linux, Mac OS X, and Windows.
Please be aware that you will need to change `Dockers settings
<https://github.com/PaddlePaddle/Paddle/issues/627>`_ to make full use
of your hardware resource on Mac OS X and Windows.
Working With Docker
-------------------
Docker is simple as long as we understand a few basic concepts:
- *image*: A Docker image is a pack of software. It could contain one or more programs and all their dependencies. For example, the PaddlePaddle's Docker image includes pre-built PaddlePaddle and Python and many Python packages. We can run a Docker image directly, other than installing all these software. We can type
.. code-block:: bash
docker images
to list all images in the system. We can also run
.. code-block:: bash
docker pull paddlepaddle/paddle:0.10.0rc2
to download a Docker image, paddlepaddle/paddle in this example,
from Dockerhub.com.
- *container*: considering a Docker image a program, a container is a
"process" that runs the image. Indeed, a container is exactly an
operating system process, but with a virtualized filesystem, network
port space, and other virtualized environment. We can type
.. code-block:: bash
docker run paddlepaddle/paddle:0.10.0rc2
to start a container to run a Docker image, paddlepaddle/paddle in this example.
- By default docker container have an isolated file system namespace,
we can not see the files in the host file system. By using *volume*,
mounted files in host will be visible inside docker container.
Following command will mount current dirctory into /data inside
docker container, run docker container from debian image with
command :code:`ls /data`.
.. code-block:: bash
docker run --rm -v $(pwd):/data debian ls /data
Usage of CPU-only and GPU Images
----------------------------------
We package PaddlePaddle's compile environment into a Docker image,
called the develop image, it contains all compiling tools that
PaddlePaddle needs. We package compiled PaddlePaddle program into a
Docker image as well, called the production image, it contains all
runtime environment that running PaddlePaddle needs. For each version
of PaddlePaddle, we release both of them. Production image includes
CPU-only version and a CUDA GPU version and their no-AVX versions.
We put the docker images on `dockerhub.com
<https://hub.docker.com/r/paddledev/paddle/>`_. You can find the
latest versions under "tags" tab at dockerhub.com. If you are in
China, you can use our Docker image registry mirror to speed up the
download process. To use it, please replace all paddlepaddle/paddle in
the commands to docker.paddlepaddle.org/paddle.
1. Production images, this image might have multiple variants:
- GPU/AVX::code:`paddlepaddle/paddle:<version>-gpu`
- GPU/no-AVX::code:`paddlepaddle/paddle:<version>-gpu-noavx`
- CPU/AVX::code:`paddlepaddle/paddle:<version>`
- CPU/no-AVX::code:`paddlepaddle/paddle:<version>-noavx`
Please be aware that the CPU-only and the GPU images both use the
AVX instruction set, but old computers produced before 2008 do not
support AVX. The following command checks if your Linux computer
supports AVX:
.. code-block:: bash
if cat /proc/cpuinfo | grep -i avx; then echo Yes; else echo No; fi
To run the CPU-only image as an interactive container:
.. code-block:: bash
docker run -it --rm paddlepaddle/paddle:0.10.0rc2 /bin/bash
Above method work with the GPU image too -- the recommended way is
using `nvidia-docker <https://github.com/NVIDIA/nvidia-docker>`_.
Please install nvidia-docker first following this `tutorial
<https://github.com/NVIDIA/nvidia-docker#quick-start>`_.
Now you can run a GPU image:
.. code-block:: bash
nvidia-docker run -it --rm paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash
2. development image :code:`paddlepaddle/paddle:<version>-dev`
This image has packed related develop tools and runtime
environment. Users and developers can use this image instead of
their own local computer to accomplish development, build,
releasing, document writing etc. While different version of paddle
may depends on different version of libraries and tools, if you
want to setup a local environment, you must pay attention to the
versions. The development image contains:
- gcc/clang
- nvcc
- Python
- sphinx
- woboq
- sshd
Many developers use servers with GPUs, they can use ssh to login to
the server and run :code:`docker exec` to enter the docker
container and start their work. Also they can start a development
docker image with SSHD service, so they can login to the container
and start work.
Train Model Using Python API
----------------------------
Our official docker image provides a runtime for PaddlePaddle
programs. The typical workflow will be as follows:
Create a directory as workspace:
.. code-block:: bash
mkdir ~/workspace
Edit a PaddlePaddle python program using your favourite editor
.. code-block:: bash
emacs ~/workspace/example.py
Run the program using docker:
.. code-block:: bash
docker run --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 python /workspace/example.py
Or if you are using GPU for training:
.. code-block:: bash
nvidia-docker run --rm -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu python /workspace/example.py
Above commands will start a docker container by running :code:`python
/workspace/example.py`. It will stop once :code:`python
/workspace/example.py` finishes.
Another way is to tell docker to start a :code:`/bin/bash` session and
run PaddlePaddle program interactively:
.. code-block:: bash
docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2 /bin/bash
# now we are inside docker container
cd /workspace
python example.py
Running with GPU is identical:
.. code-block:: bash
nvidia-docker run -it -v ~/workspace:/workspace paddlepaddle/paddle:0.10.0rc2-gpu /bin/bash
# now we are inside docker container
cd /workspace
python example.py
Develop PaddlePaddle or Train Model Using C++ API
---------------------------------------------------
We will be using PaddlePaddle development image since it contains all
compiling tools and dependencies.
1. Build PaddlePaddle develop image
Use following command to build PaddlePaddle develop image:
.. code-block:: bash
git clone https://github.com/PaddlePaddle/Paddle.git && cd Paddle
docker build -t paddle:dev .
2. Build PaddlePaddle production image
There are two steps for building production image, the first step is to run:
.. code-block:: bash
docker run -v $(pwd):/paddle -e "WITH_GPU=OFF" -e "WITH_AVX=OFF" -e "WITH_TEST=ON" paddle:dev
The above command will compile PaddlePaddle and create a Dockerfile for building production image. All the generated files are in the build directory. "WITH_GPU" controls if the generated production image supports GPU. "WITH_AVX" controls if the generated production image supports AVX. "WITH_TEST" controls if the unit test will be generated.
The second step is to run:
.. code-block:: bash
docker build -t paddle:prod -f build/Dockerfile ./build
The above command will generate the production image by copying the compiled PaddlePaddle program into the image.
3. Run unit test
Following command will run unit test:
.. code-block:: bash
docker run -it -v $(pwd):/paddle paddle:dev bash -c "cd /paddle/build && ctest"
PaddlePaddle Book
------------------
The Jupyter Notebook is an open-source web application that allows
you to create and share documents that contain live code, equations,
visualizations and explanatory text in a single browser.
PaddlePaddle Book is an interactive Jupyter Notebook for users and developers.
We already exposed port 8888 for this book. If you want to
dig deeper into deep learning, PaddlePaddle Book definitely is your best choice.
We provide a packaged book image, simply issue the command:
.. code-block:: bash
docker run -p 8888:8888 paddlepaddle/book
Then, you would back and paste the address into the local browser:
.. code-block:: text
http://localhost:8888/
That's all. Enjoy your journey!
Documentation
-------------
Paddle Docker images include an HTML version of C++ source code
generated using `woboq code browser
<https://github.com/woboq/woboq_codebrowser>`_. This makes it easy
for users to browse and understand the C++ source code.
As long as we give the Paddle Docker container a name, we can run an
additional Nginx Docker container to serve the volume from the Paddle
container:
.. code-block:: bash
docker run -d --name paddle-cpu-doc paddle:<version>
docker run -d --volumes-from paddle-cpu-doc -p 8088:80 nginx
Then we can direct our Web browser to the HTML version of source code
at http://localhost:8088/paddle/
Install and Build
=================
Install PaddlePaddle
----------------------
.. toctree::
:maxdepth: 1
docker_install_en.rst
ubuntu_install_en.rst
Build from Source
-----------------
.. warning::
Please use :code:`deb` package or :code:`docker` image to install paddle. The building guide is used for hacking or contributing PaddlePaddle source code.
.. toctree::
:maxdepth: 1
build_from_source_en.md
Debian Package installation guide
=================================
PaddlePaddle supports :code:`deb` pacakge. The installation of this :code:`deb` package is tested in ubuntu 14.04, but it should be support other debian based linux, too.
There are four versions of debian package, :code:`cpu`, :code:`gpu`, :code:`cpu-noavx`, :code:`gpu-noavx`. And :code:`noavx` version is used to support CPU which does not contain :code:`AVX` instructions. The download url of :code:`deb` package is \: https://github.com/baidu/Paddle/releases/
After downloading PaddlePaddle deb packages, you can use :code:`gdebi` install.
.. code-block:: bash
gdebi paddle-*.deb
If :code:`gdebi` is not installed, you can use :code:`sudo apt-get install gdebi` to install it.
Or you can use following commands to install PaddlePaddle.
.. code-block:: bash
dpkg -i paddle-*.deb
apt-get install -f
And if you use GPU version deb package, you need to install CUDA toolkit and cuDNN, and set related environment variables(such as LD_LIBRARY_PATH) first. It is normal when `dpkg -i` get errors. `apt-get install -f` will continue install paddle, and install dependences.
GET STARTED
============
.. toctree::
:maxdepth: 1
build_and_install/index_en.rst
- `Deep Learning 101 <http://book.paddlepaddle.org/index.en.html>`_
HOW TO
=======
Usage
-------
.. toctree::
:maxdepth: 1
usage/cmd_parameter/index_en.rst
usage/cluster/cluster_train_en.md
usage/k8s/k8s_en.md
usage/k8s/k8s_aws_en.md
Development
------------
.. toctree::
:maxdepth: 1
dev/new_layer_en.rst
dev/contribute_to_paddle_en.md
Configuration
-------------
.. toctree::
:maxdepth: 1
deep_model/rnn/index_en.rst
Optimization
-------------
.. toctree::
:maxdepth: 1
optimization/gpu_profiling_en.rst
To build PaddlePaddle data preparation image in tutorial [Distributed PaddlePaddle Training on AWS with Kubernetes](../../k8s_aws_en.md), run following commands:
```
cp -r ../../../../../../demo/quick_start .
docker build . -t prepare-data-image-name
```
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册