Commit 4f522a6a authored by tangwei

Merge branch 'develop' of https://github.com/PaddlePaddle/FluidDoc into develop

......@@ -2,3 +2,4 @@
/doc/fluid/menu.zh.json
/doc/fluid/menu.en.json
.idea
build
......@@ -177,7 +177,7 @@ pip install -r requirements.txt
## Submit changes

If you wish to modify the code, please refer to [How to contribute code](../contribute_code/index_cn.html) under the `Paddle` repository.

If you only modify the documentation:
......@@ -198,7 +198,7 @@ pip install -r requirements.txt
3. Submit a PR for your changes in the `FluidDoc` repository

The steps for submitting changes and PRs can be found in [How to contribute code](../contribute_code/index_cn.html)

## Help improve the preview tool
......
......@@ -58,7 +58,7 @@ git clone https://github.com/PaddlePaddle/PaddlePaddle.org.git
cd PaddlePaddle.org/portal
```
Then install the requirements. Please make sure that you install with Python 2.7.15 or 2.7.16. We recommend you use Anaconda or virtualenv to create an appropriate virtual environment first.
Install requirements:
......@@ -85,9 +85,9 @@ If you need to work with documents that depend on the contents of the `book`, `m
```
./runserver --paddle <path_to_fluiddoc_dir> \
--book <path_to_fluiddoc_dir>/external/book \
--models <path_to_fluiddoc_dir>/external/models \
--mobile <path_to_fluiddoc_dir>/external/mobile
```
Then open your browser and navigate to http://localhost:8000.
......@@ -178,7 +178,7 @@ After completing the installation, you will also need to do:
## Submit changes
If you wish to modify the code, please refer to [How to contribute code](../contribute_code/index_en.html) under the `Paddle` repository.
If you just modify the document:
......@@ -186,20 +186,20 @@ If you just modify the document:
- The modified content is in the `external` folder:
1. Submit the PR in the repository you modified. This is because the `FluidDoc` repository is just a wrapper that brings together the links of other repositories (namely, the "submodules" in the git context).
2. When your changes are approved, update the corresponding `submodule` in FluidDoc to the latest commit-id of the source repository.

> For example, you updated the document on the develop branch in the book repository:
> - Go to the `FluidDoc/external/book` directory
> - Update commit-id to the latest commit: `git pull origin develop`
> - Commit your changes in `FluidDoc`
3. Submit a Pull Request for your changes in the `FluidDoc` repository

The steps for submitting changes and PRs can be found in [How to contribute code](../contribute_code/index_en.html)
## Help improve preview tool
......
......@@ -78,8 +78,8 @@ no changes added to commit (use "git add" and/or "git commit -a")
## Build and unit tests

For building PaddlePaddle from source, please see [Compile From Source](../../../install/compile/fromsource.html) and choose the corresponding operating system.

For running unit tests, refer to [Op Unit Tests](../new_op/new_op.html#id7).
## Commit
......
......@@ -69,17 +69,17 @@ On branch test
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   README.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	test

no changes added to commit (use "git add" and/or "git commit -a")
```
## Build and test
Please refer to [Compile From Source Code](../../../install/compile/fromsource_en.html) for more information about building the PaddlePaddle source code.

Please refer to [Op Unit Tests](../new_op/new_op_en.html#unit-tests) for more information about running unit tests.
## Commit
......@@ -91,7 +91,7 @@ Next we cancel the modification of README.md, and submit the newly added test file.
On branch test
Untracked files:
(use "git add <file>..." to include in what will be committed)
	test
nothing added to commit but untracked files present (use "git add" to track)
➜ git add test
```
......@@ -132,8 +132,8 @@ Check the name of current remote repository with `git remote`.
➜ git remote
origin
➜ git remote -v
origin https://github.com/USERNAME/Paddle (fetch)
origin https://github.com/USERNAME/Paddle (push)
```
`origin` is the name of the remote repository we cloned, which is also the Paddle under your own account. Next we add the original Paddle repository as a remote host and name it upstream.
......
......@@ -25,7 +25,7 @@ Paddle uses a compiler-style execution flow, divided into a compile-time and a run-time phase
2. The original Program is converted inside the framework into an intermediate description language, the `ProgramDesc`.
3. A `Transpiler` accepts one `ProgramDesc` and outputs a transformed `ProgramDesc`, which is the Program finally executed by the backend `Executor`. The `Transpiler` step is optional.
4. Execute the Operators defined in the `ProgramDesc` (comparable to instructions in a programming language); the inputs and outputs each Operator needs are created and managed during execution (see the sketch below).
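
The flow above can be made concrete with a minimal sketch (assuming the 1.x `paddle.fluid` Python API; the one-layer network is illustrative):

```python
import numpy
import paddle.fluid as fluid

# Compile time: layer calls only append Operators to a Program,
# which the framework records internally as a ProgramDesc.
x = fluid.layers.data(name='X', shape=[1], dtype='float32')
y = fluid.layers.fc(input=x, size=1)
loss = fluid.layers.mean(y)

# Run time: the Executor creates the variables defined in the
# ProgramDesc and runs its Operators one by one.
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
outs = exe.run(feed={'X': numpy.random.random(size=(4, 1)).astype('float32')},
               fetch_list=[loss.name])
```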
......@@ -213,7 +213,7 @@ outs = exe.run(
## Code example

This section uses the simple linear regression example from the [Programming Guide](../../../beginners_guide/basic_concept/programming_guide/programming_guide.html) to show how the above concepts are implemented in code.
**Define the Program**
......@@ -360,5 +360,4 @@ Paddle uses Executor.run to run a Program.
[6.099215 ]], dtype=float32), array([1.6935859], dtype=float32)]
```
By now you have learned the core concepts of Paddle's internal execution flow; for more details on using the framework, see [Typical Cases](../../../user_guides/index_cn.html).
......@@ -37,7 +37,7 @@ After completing the network definition, there are usually 2 Programs in a Fluid
1. ``fluid.default_startup_program`` : defines various operations such as creating model parameters, input and output, and initialization of learnable parameters in the model.
default_startup_program can be generated automatically by the framework and used without being created manually.

If a call changes the default initialization method of a parameter, the framework automatically adds the relevant operations to default_startup_program.
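
A hedged sketch of the relationship between the two default Programs (assuming the 1.x `paddle.fluid` API; the layer choice is illustrative):

```python
import paddle.fluid as fluid

# Layer calls append compute Operators to fluid.default_main_program();
# because fc creates a learnable parameter, the parameter's creation and
# initialization ops land in fluid.default_startup_program() automatically.
x = fluid.layers.data(name='x', shape=[1], dtype='float32')
y = fluid.layers.fc(input=x, size=1)

print(fluid.default_startup_program())  # parameter creation / initialization
print(fluid.default_main_program())     # the forward computation
```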
......@@ -75,22 +75,22 @@ Similarly, the following Program contains 3 blocks:
import paddle.fluid as fluid # block 0
limit = fluid.layers.fill_constant_batch_size_like(
    input=label, dtype='int64', shape=[1], value=5.0)
cond = fluid.layers.less_than(x=label, y=limit)

ie = fluid.layers.IfElse(cond)
with ie.true_block():  # block 1
    true_image = ie.input(image)
    hidden = fluid.layers.fc(input=true_image, size=100, act='tanh')
    prob = fluid.layers.fc(input=hidden, size=10, act='softmax')
    ie.output(prob)

with ie.false_block():  # block 2
    false_image = ie.input(image)
    hidden = fluid.layers.fc(
        input=false_image, size=200, act='tanh')
    prob = fluid.layers.fc(input=hidden, size=10, act='softmax')
    ie.output(prob)

prob = ie()
```
......@@ -136,10 +136,10 @@ message AttrDesc {
required string name = 1;
enum AttrType {
    INT = 1,
    STRING = 2,
    ...
    BLOCK = ...
}
required AttrType type = 2;
......@@ -169,24 +169,24 @@ C++ implementation code of Executor is as follows:
```cpp
class Executor{
 public:
  void Run(const ProgramDesc& pdesc,
           Scope* scope,
           int block_id) {
    auto& block = pdesc.Block(block_id);

    // Create all variables
    for (auto& var : block.AllVars())
      scope->Var(var->Name());

    // Create the OPs and execute them in order
    for (auto& op_desc : block.AllOps()) {
      auto op = CreateOp(*op_desc);
      op->Run(*local_scope, place_);
    }
  }
};
```
**Create Executor**
......@@ -208,13 +208,13 @@ Fluid uses Executor.run to run the program. In the definition, the data is obtai
...
x = numpy.random.random(size=(10, 1)).astype('float32')
outs = exe.run(
    feed={'X': x},
    fetch_list=[loss.name])
```
## Code Example
This section shows how the above is implemented in your code, through a simple linear regression example in the [Fluid Programming Guide](../../../beginners_guide/basic_concept/programming_guide/programming_guide_en.html).
**Define Program**
......@@ -252,48 +252,48 @@ blocks {
  idx: 0
  parent_idx: -1
  vars {
    name: "mean_1.tmp_0"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: 1
        }
      }
    }
    persistable: false
  }
  vars {
    name: "square_error_cost_1.tmp_1"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: -1
          dims: 1
        }
        lod_level: 0
      }
    }
    persistable: false
  }
  vars {
    name: "square_error_cost_1.tmp_0"
    type {
      type: LOD_TENSOR
      lod_tensor {
        tensor {
          data_type: FP32
          dims: -1
          dims: 1
        }
        lod_level: 0
      }
    }
    persistable: false
  ...
```
As you can see from the output, the entire definition process is transformed into a ProgramDesc inside the framework, indexed by block idx. There is only one block in this linear regression model, so the ProgramDesc contains only one BlockDesc, namely Block 0.
......@@ -304,19 +304,19 @@ x = fluid.layers.data(name="x",shape=[1],dtype='float32')
In BlockDesc, the variable x is described as:
```
vars {
  name: "x"
  type {
    type: LOD_TENSOR
    lod_tensor {
      tensor {
        data_type: FP32
        dims: -1
        dims: 1
      }
      lod_level: 0
    }
  }
  persistable: false
```
All data in Fluid are LoD-Tensor, and for data without sequence information (such as variable X here), lod_level=0.
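
A hedged sketch of how sequence information is attached (assuming the 1.x `fluid.create_lod_tensor` API, which also appears later in this guide):

```python
import numpy
import paddle.fluid as fluid

# Three sequences of lengths 4, 1 and 2 packed into one LoD-Tensor
# (lod_level=1); plain, non-sequence data simply keeps lod_level=0.
t = fluid.create_lod_tensor(
    data=numpy.array([1, 3, 4, 5, 3, 6, 8], dtype='int64').reshape(-1, 1),
    recursive_seq_lens=[[4, 1, 2]],
    place=fluid.CPUPlace())
print(t.recursive_sequence_lengths())  # [[4, 1, 2]]
```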
......@@ -348,18 +348,17 @@ Since there are multiple columns of incoming and outgoing data, fluid defines tr
```python
# Start training
outs = exe.run(
    feed={'x':train_data,'y':y_true},
    fetch_list=[y_predict.name,avg_cost.name])
```
The above code specifies that train_data is passed into the variable x and y_true into the variable y, and fetches the predicted value of y and the value of cost from the last round.
The output is:
```
[array([[1.5248038],
       [3.0496075],
       [4.5744114],
       [6.099215 ]], dtype=float32), array([1.6935859], dtype=float32)]
```
By now you have learned the core concepts of the internal execution process of Fluid. For more details on the usage of the framework, please refer to the [User Guide](../../../user_guides/index_en.html).
.. _addon_development:
#################
Addon Development
#################
.. toctree::
   :maxdepth: 1

   design_idea/fluid_design_idea.md
   new_op/index_cn.rst
   contribute_code/index_cn.rst
.. _addon_development_en:
#################
Addon Development
#################
.. toctree::
   :maxdepth: 1

   design_idea/fluid_design_idea_en.md
   new_op/index_en.rst
   contribute_code/index_en.rst
......@@ -72,18 +72,19 @@ class Relu2Kernel : public framework::OpKernel<T> {
};
// Define the backward OP's inputs Y and dY, its output dX, and its attributes:
template <typename T>
class Relu2GradMaker : public framework::SingleGradOpMaker<T> {
 public:
  using framework::SingleGradOpMaker<T>::SingleGradOpMaker;

  std::unique_ptr<T> Apply() const override {
    auto* op = new T();
    op->SetType("relu2_grad");
    op->SetInput("Y", this->Output("Y"));
    op->SetInput(framework::GradVarName("Y"), this->OutputGrad("Y"));
    op->SetAttrMap(this->Attrs());
    op->SetOutput(framework::GradVarName("X"), this->InputGrad("X"));
    return std::unique_ptr<T>(op);
  }
};
......@@ -124,7 +125,11 @@ namespace ops = paddle::operators;
using CPU = paddle::platform::CPUDeviceContext;
// Register the forward and backward OPs
// To distinguish it from the framework's built-in relu, the OP type registered here is relu2
REGISTER_OPERATOR(relu2,
                  ops::Relu2Op,
                  ops::Relu2OpMaker,
                  ops::Relu2GradMaker<paddle::framework::OpDesc>,
                  ops::Relu2GradMaker<paddle::imperative::OpBase>);
REGISTER_OPERATOR(relu2_grad, ops::Relu2GradOp);
// Register the CPU kernels
REGISTER_OP_CPU_KERNEL(relu2,
......@@ -252,32 +257,25 @@ echo $include_dir
echo $lib_dir
# For PaddlePaddle >= 1.6.1, only ${include_dir} and ${include_dir}/third_party need to be included
nvcc relu_op.cu -c -o relu_op.cu.o -ccbin cc -DPADDLE_WITH_CUDA -DEIGEN_USE_GPU -DPADDLE_USE_DSO -DPADDLE_WITH_MKLDNN -Xcompiler -fPIC -std=c++11 -Xcompiler -fPIC -w --expt-relaxed-constexpr -O3 -DNVCC \
    -I ${include_dir} \
    -I ${include_dir}/third_party \

g++ relu_op.cc relu_op.cu.o -o relu2_op.so -shared -fPIC -std=c++11 -O3 -DPADDLE_WITH_MKLDNN \
    -I ${include_dir} \
    -I ${include_dir}/third_party \
    -L /usr/local/cuda/lib64 \
    -L ${lib_dir} -lpaddle_framework -lcudart
# For PaddlePaddle 1.6.0, the third_party directories that need to be included are as follows:
# -I ${include_dir}/third_party/install/protobuf/include \
# -I ${include_dir}/third_party/install/glog/include \
# -I ${include_dir}/third_party/install/gflags/include \
# -I ${include_dir}/third_party/install/xxhash/include \
# -I ${include_dir}/third_party/boost \
# -I ${include_dir}/third_party/eigen3 \
# -I ${include_dir}/third_party/dlpack/include \
```
Notes:

1. When compiling the .cu file of a GPU OP with NVCC, you need to add `-DPADDLE_WITH_CUDA -DEIGEN_USE_GPU -DPADDLE_USE_DSO`.
2. If the installed PaddlePaddle does not include MKLDNN, remove the compile option `-DPADDLE_WITH_MKLDNN`. The default installation package already includes MKLDNN.
3. Multiple OPs can be compiled into the same shared library.
4. PaddlePaddle installed via pip is built with GCC 4.8. Since the **C++11 ABI is incompatible** between GCC 4.8 and GCC 5+, custom OPs you write must also be compiled with GCC 4.8. If you want to use custom OPs in a GCC 5+ environment, we recommend [installing PaddlePaddle with Docker](https://www.paddlepaddle.org.cn/install/doc/docker) so that Paddle and the custom OP are compiled with the same GCC version.
......
......@@ -2,6 +2,8 @@
Write New Operators
###################
This section will guide you through how to add an operator, and it also includes some necessary notes.

- `How to write new operator <new_op_en.html>`_ : a guide to writing new operators
- `op notes <op_notes_en.html>`_ : notes on developing new operators
......
......@@ -408,7 +408,7 @@ class MulKernel : public framework::OpKernel<T> {
REGISTER_OPERATOR(mul, ops::MulOp, ops::MulOpMaker,
ops::MulOpGradMaker)
REGISTER_OPERATOR(mul_grad, ops::MulGradOp)
REGISTER_OP_CPU_KERNEL(mul,
ops::MulKernel<paddle::platform::CPUDeviceContext, float>,
ops::MulKernel<paddle::platform::CPUDeviceContext, double>);
REGISTER_OP_CPU_KERNEL(mul_grad,
......@@ -418,9 +418,9 @@ class MulKernel : public framework::OpKernel<T> {
In the code above:

- `REGISTER_OPERATOR` : registers the `ops::MulOp` class with type name `mul`, with `ops::MulOpMaker` as its `ProtoMaker`, and registers `ops::MulOpGrad` with type name `mul_grad`.
- `REGISTER_OP_CPU_KERNEL` : registers the `ops::MulKernel` class, specialized with the template parameters `paddle::platform::CPUPlace` and the `float` type, and likewise registers the `ops::MulGradKernel` class.
- Register the CUDA kernels in the `.cu` file.
......@@ -432,7 +432,7 @@ class MulKernel : public framework::OpKernel<T> {
#define EIGEN_USE_GPU
namespace ops = paddle::operators;
REGISTER_OP_CUDA_KERNEL(mul,
ops::MulKernel<paddle::platform::CUDADeviceContext, float>,
ops::MulKernel<paddle::platform::CUDADeviceContext, double>);
REGISTER_OP_CUDA_KERNEL(mul_grad,
......@@ -499,7 +499,7 @@ Op unit tests inherit from `OpTest`. The specific unit tests live in `TestMulOp`
1. The `setUp` function defines the inputs, outputs, and the related attribute parameters.

> Note: Please configure inputs/outputs as the `ndarray` type. If you need to configure an input/output with `LOD`, pass it in as a `tuple` of two `ndarray` elements, where the first is the actual data and the second is the `LOD`.

2. Generate random input data.
......@@ -507,41 +507,41 @@ Op单元测试继承自`OpTest`。各项具体的单元测试在`TestMulOp`里
4. Backward computation is already integrated into the test framework automatically; simply call the corresponding interface.
```python
import unittest
import numpy as np
from op_test import OpTest


class TestMulOp(OpTest):
    def setUp(self):
        self.op_type = "mul"
        self.inputs = {
            'X': np.random.random((32, 84)).astype("float32"),
            'Y': np.random.random((84, 100)).astype("float32")
        }
        self.outputs = {'Out': np.dot(self.inputs['X'], self.inputs['Y'])}

    def test_check_output(self):
        self.check_output()

    def test_check_grad_normal(self):
        self.check_grad(['X', 'Y'], 'Out', max_relative_error=0.5)

    def test_check_grad_ingore_x(self):
        self.check_grad(
            ['Y'], 'Out', max_relative_error=0.5, no_grad_set=set("X"))

    def test_check_grad_ingore_y(self):
        self.check_grad(
            ['X'], 'Out', max_relative_error=0.5, no_grad_set=set('Y'))
```
The code above first imports the required packages. The following is a detailed explanation of the important variables used in the `setUp` function:
- `self.op_type = "mul"` : defines the type, which must match the type registered for the operator.
- `self.inputs` : defines the inputs, of type `numpy.array`, and initializes them.
- `self.outputs` : defines the outputs, implements the same computation logic as the operator in the Python script, and returns the result computed on the Python side.
### Backward operator unit tests
......@@ -613,34 +613,34 @@ PADDLE_ENFORCE_EQ(comparison object A, comparison object B, error message)
1. No error message, or a message too brief to give the user a useful hint!

Bad example 1: no message written
```
PADDLE_ENFORCE(ctx->HasInput("X"), "");
```

Bad example 2: the message is too brief
```
PADDLE_ENFORCE(i != nullptr, "i must be set"); // What is i?
```
2. Using developer-defined variable abbreviations in the error message is hard to understand!

Bad example:
```
PADDLE_ENFORCE(forward_pd != nullptr,
               "Fail to find eltwise_fwd_pd in device context"); // Users may not understand eltwise_fwd_pd
```
3. Calling an illegal interface inside the OP: Output = ShareDataWith(Input) appears inside the Op

Bad example:
```cpp
auto *out = ctx.Output<framework::LoDTensor>("Out");
auto *in = ctx.Input<framework::LoDTensor>("X");
out->ShareDataWith(*in);
```

If Output = ShareDataWith(Input) appears inside an Op, it is equivalent to a hidden edge in the operator graph connecting Input and Output. This edge cannot be expressed in graph analysis and will break graph-based optimizations.
4. Performance practices for OP implementations

Calling Eigen's broadcast, chop, and similar operations can be several times slower than a hand-written CUDA kernel. In that case, the CPU implementation can reuse Eigen, while the GPU implementation can use a hand-written CUDA kernel.
#### Special notes on OP InferShape check messages
......@@ -648,16 +648,16 @@ PADDLE_ENFORCE_EQ(比较对象A, 比较对象B, 错误提示信息)
- When checking input and output variables, please uniformly follow this format:
`Input(variable name) of OP-name operator should not be null.`

Correct example:
```
PADDLE_ENFORCE(ctx->HasInput("Input"),
               "Input(Input) of LSTMP operator should not be null.");
```

- For the input/output checks of a backward Op, state the backward Op's name explicitly

Correct example:
```
PADDLE_ENFORCE(ctx->HasInput("X"),
               "Input(X) of LoDResetGrad operator should not be null.");
```
# How to write a new Python OP

PaddlePaddle Fluid supports custom OPs written on the Python side through the `py_func` interface. The design rationale of py_func is that LoDTensor in Paddle can be converted to and from numpy arrays conveniently, so the numpy API in Python can be used to implement a custom Python OP.
## Overview of the py_func interface
......@@ -14,12 +14,12 @@ def py_func(func, x, out, backward_func=None, skip_vars_in_backward_input=None):
Where:

- `x` is the input variable of the Python OP; it can be a single `Variable`, a `tuple[Variable]`, or a `list[Variable]`. Multiple Variables are passed in as a tuple[Variable] or list[Variable], where each Variable is a LoDTensor or Tensor.
- `out` is the output variable of the Python OP; it can be a single `Variable`, a `tuple[Variable]`, or a `list[Variable]`. Each Variable can be a LoDTensor or Tensor, or a numpy array.
- `func` is the forward function of the Python OP. When running the forward pass of the network, the framework calls `out = func(*x)` to compute the forward output `out` from the forward input `x`. In `func`, it is recommended to proactively convert the LoDTensors to numpy arrays first, which makes numpy operations easy to use flexibly; without the conversion, some operations may not be compatible.
- `backward_func` is the backward function of the Python OP. If `backward_func` is `None`, the Python OP has no backward computation logic; if `backward_func` is not `None`, the framework calls `backward_func` during the backward pass of the network to compute the gradients of the forward input `x`.
- `skip_vars_in_backward_input` specifies inputs that the backward function `backward_func` does not need; it can be a single `Variable`, a `tuple[Variable]`, or a `list[Variable]`. (A compact end-to-end sketch follows this list.)
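
Putting the pieces together, a hedged end-to-end sketch (the names `forward`, `backward`, and `out` are illustrative; the same pattern is developed step by step below):

```python
import numpy as np
import paddle.fluid as fluid

def forward(x):
    # x arrives as a LoDTensor; convert before using numpy freely
    return np.tanh(np.array(x))

def backward(x, y, dy):
    return np.array(dy) * (1 - np.square(np.array(y)))

x = fluid.layers.data(name='x', dtype='float32', shape=[32])
# py_func requires the output variable to be created in advance
out = fluid.default_main_program().current_block().create_var(
    name='out', dtype='float32', shape=[-1, 32])
fluid.layers.py_func(func=forward, x=x, out=out, backward_func=backward)
```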
## How to write a Python OP with py_func
......@@ -28,7 +28,7 @@ def py_func(func, x, out, backward_func=None, skip_vars_in_backward_input=None):
- Step 1: define the forward and backward functions

Both the forward and backward functions are written in Python, so the relevant Python and numpy APIs can conveniently be used to implement a custom OP.

If the forward function's inputs are `x_1`, `x_2`, ..., `x_n` and its outputs are `y_1`, `y_2`, ..., `y_m`, the forward function is defined in the following format:
```Python
......@@ -46,25 +46,52 @@ def backward_func(x_1, x_2, ..., x_n, y_1, y_2, ..., y_m, dy_1, dy_2, ..., dy_m)
If the backward function does not need some forward input or output variables, you can set `skip_vars_in_backward_input` to exclude them (the exact method is described in Step 3).

Note: x_1, ..., x_n are the multiple input LoDTensors; please pass them to py_func as a tuple(Variable) or list[Variable]. It is recommended to proactively convert the LoDTensors to arrays via numpy.array first; otherwise some Python and numpy operations may not be compatible with LoDTensor.

Here we use the numpy API to write the forward and backward functions of tanh. Below are several example forward and backward function definitions:
```Python
import numpy as np

# Forward function 1: emulate the tanh activation function
def tanh(x):
    # A LoDTensor can be passed directly as the input of np.tanh
    return np.tanh(x)

# Forward function 2: add two 2-D LoDTensors; multiple LoDTensors are
# passed in as a list[Variable] or tuple(Variable)
def element_wise_add(x, y):
    # The LoDTensors must be converted to numpy arrays manually first;
    # otherwise numpy's shape operations are not supported
    x = np.array(x)
    y = np.array(y)

    if x.shape != y.shape:
        raise AssertionError("the shape of inputs must be the same!")

    result = np.zeros(x.shape, dtype='int32')
    for i in range(len(x)):
        for j in range(len(x[0])):
            result[i][j] = x[i][j] + y[i][j]

    return result

# Forward function 3: can be used to debug a running network (print values)
def debug_func(x):
    # A LoDTensor can be passed directly as the input of print
    print(x)

# The backward function for forward function 1; the default input order is:
# x, out, and the gradient of out
def tanh_grad(x, y, dy):
    # The LoDTensors must be converted to numpy arrays manually first;
    # otherwise operations such as "+/-" cannot be used
    return np.array(dy) * (1 - np.square(np.array(y)))
```
Note that the inputs of the forward and backward functions are all of type `LoDTensor`, and the outputs can be numpy arrays or `LoDTensor`.

Since `LoDTensor` implements Python's buffer protocol, you can either convert a `LoDTensor` to a numpy array directly via `numpy.array` and operate on that, or pass the `LoDTensor` directly as the input of a numpy function. However, it is recommended to proactively convert to a numpy array first, so that all Python and numpy operations (e.g. a numpy array's "+/-/shape") can be used freely.
The backward function of tanh does not need the forward input x, so we can define a backward function without it and exclude x later via `skip_vars_in_backward_input` :
```Python
def tanh_grad_without_x(y, dy):
    return np.array(dy) * (1 - np.square(np.array(y)))
```
......@@ -77,7 +104,7 @@ import paddle.fluid as fluid
def create_tmp_var(program, name, dtype, shape):
    return program.current_block().create_var(name=name, dtype=dtype, shape=shape)
in_var = fluid.layers.data(name='input', dtype='float32', shape=[-1, 28, 28])
# Manually create the forward output variable
......@@ -89,13 +116,13 @@ out_var = create_tmp_var(fluid.default_main_program(), name='output', dtype='flo
`py_func` is called as follows:
```Python
fluid.layers.py_func(func=tanh, x=in_var, out=out_var, backward_func=tanh_grad)
```
If we do not want the forward inputs to appear among the backward function's parameters, we can use `skip_vars_in_backward_input` to exclude them and simplify the backward function's parameter list.
```Python
fluid.layers.py_func(func=tanh, x=in_var, out=out_var, backward_func=tanh_grad_without_x,
                     skip_vars_in_backward_input=in_var)
```
......@@ -106,9 +133,8 @@ fluid.layers.py_func(func=my_tanh, x=in_var, out=out_var, backward_func=my_tanh_
- The forward and backward functions of `py_func` must not call `fluid.layers.xxx` internally, because the forward and backward functions are called while the network is running and their parameters are C++-side `LoDTensor`, whereas `fluid.layers.xxx` is called while the network is being built and its parameters are Python-side `Variable`.
- `skip_vars_in_backward_input` can only skip forward input and forward output variables; it cannot skip the gradients of forward outputs.
- If a forward output variable has no gradient, `backward_func` will receive `None` for it. If a forward input variable has no gradient, we should proactively return `None` for it in `backward_func` (see the sketch below).
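
A hedged sketch of the last rule (the function and variable names are hypothetical), for a forward `out = x1 * x2` where `x2` needs no gradient:

```python
import numpy as np

# Hypothetical backward function: the py_func convention orders the
# arguments as forward inputs, forward outputs, then output gradients.
def mul_backward(x1, x2, out, dout):
    if dout is None:      # the forward output had no gradient
        return None, None
    dx1 = np.array(dout) * np.array(x2)
    return dx1, None      # x2 needs no gradient, so return None in its slot
```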
......@@ -9,7 +9,7 @@ The core method of an Op is Run; the Run method requires two kinds of resources: data resources and compute resources
The Fluid framework is designed to run on a variety of devices and third-party libraries, and some Op implementations may differ depending on the device or third-party library. To this end, Fluid introduced the OpKernel mechanism: one Op can have multiple OpKernels. Such Ops inherit from `OperatorWithKernel`; a representative is conv_op, whose OpKernels are `GemmConvKernel`, `CUDNNConvOpKernel`, and `ConvMKLDNNOpKernel`, each available in the double and float data types. Representative Ops that do not need an OpKernel include `WhileOp`.

Operator inheritance diagram:

![op_inheritance_relation_diagram](./op_inheritance_relation_diagram.png)
For further information, refer to: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices), [scope](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/scope.md), [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/release/1.2/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md)
......@@ -148,7 +148,7 @@ GetExpectedKernelType is the method in the OperatorWithKernel class for obtaining the kernel of a specified device
If you do not use a method with an initialization check and call `Tensor->type()` directly, you may get the error `holder_ should not be null. Tensor not initialized yet when Tensor::type()`, as in [Paddle issue #19522](https://github.com/PaddlePaddle/Paddle/issues/19522). From that error message alone, the user cannot tell which Op went wrong, which hampers debugging.

### 5. Op compatibility issues

Modifications to an Op must take compatibility into account: after the modification, previously trained models must still load and run normally, i.e. the new version of the Paddle inference library must successfully load and run models trained with an older version. <font color="#FF0000">**Therefore, an Op's existing Input, Output, and Attribute must not be modified (documentation excepted) or deleted. New Input, Output, and Attribute entries may be added, but a newly added Input or Output must be marked AsDispensable, and a newly added Attribute must have a default value. For more details, see [OP modification specification: Input/Output/Attribute allow only compatible changes](https://github.com/PaddlePaddle/Paddle/wiki/OP-Input-Output-Attribute-Compatibility-Modification)**</font>

### 6. Calling ShareDataWith

ShareDataWith makes two Tensors share the same underlying buffer. Be especially careful when calling it: inside an Op, ShareDataWith must not be applied to the Op's output; in other words, an Op's output Tensor must be Malloc'ed.
......@@ -173,13 +173,13 @@ class SliceOpGrad : public framework::OperatorWithKernel {
  using framework::OperatorWithKernel::OperatorWithKernel;

  void InferShape(framework::InferShapeContext* ctx) const override {
    // ...
  }

  framework::OpKernelType GetExpectedKernelType(
      const framework::ExecutionContext& ctx) const override {
    // Note: don't get the data type from ctx.Input<framework::Tensor>("Input");
    auto dtype = ctx.Input<framework::Tensor>(framework::GradVarName("Out"))->type();
    return framework::OpKernelType(dtype, ctx.GetPlace());
  }
};
......@@ -261,15 +261,15 @@ The following device operations are asynchronous with respect to the host:
</table>
LoD propagation for these two kinds of OPs must consider both the forward and the backward pass.

#### Forward propagation

During forward propagation, compared with the input LoD, the LoD of an Op's output may stay unchanged, change, or disappear:

- Unchanged: applies to all LoD-Transparent OPs and some LoD-Based OPs. You can call `ShareLoD()` in `InferShape` to share the input Var's LoD directly with the output Var; see [lstm_op](https://github.com/PaddlePaddle/Paddle/blob/a88a1faa48a42a8c3737deb0f05da968d200a7d3/paddle/fluid/operators/lstm_op.cc#L92). If there are multiple inputs that may all carry LoD, the first input is usually shared by default, e.g. [elementwise_ops forward](https://github.com/PaddlePaddle/Paddle/blob/5d6a1fcf16bcb48d2e66306b27d9994d9b07433c/paddle/fluid/operators/elementwise/elementwise_op.h#L69)
- Changed: applies to some LoD-Based OPs. The OpKernel implementation must compute the output LoD correctly; the real LoD can only be determined after the forward computation finishes, but `ShareLoD()` still needs to be called in `InferShape` to make sure the LoD level is propagated correctly at CompileTime; see [sequence_expand_op](https://github.com/PaddlePaddle/Paddle/blob/565d30950138b9f831caa33904d9016cf53c6c2e/paddle/fluid/operators/sequence_ops/sequence_expand_op.cc)
- Disappeared: applies to LoD-Based OPs whose output is no longer sequence data. In this case, forward LoD propagation no longer needs to be considered; see [sequence_pool_op](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/operators/sequence_ops/sequence_pool_op.cc)
Other important notes:
......@@ -278,7 +278,7 @@ The following device operations are asynchronous with respect to the host:
- For OPs with explicit requirements on the LoD level, the recommended practice is to check the LoD level already in `InferShape`, e.g. [sequence_pad_op](https://github.com/PaddlePaddle/Paddle/blob/4292bd8687ababc7737cffbddc0d38ead2138c00/paddle/fluid/operators/sequence_ops/sequence_pad_op.cc#L79)
#### Backward propagation

Generally speaking, the LoD of the gradient GradVar corresponding to one of an OP's input Vars should be the same as that of the Var itself, so the Var's LoD should be shared directly with its GradVar; see the [backward of elementwise ops](https://github.com/PaddlePaddle/Paddle/blob/a88a1faa48a42a8c3737deb0f05da968d200a7d3/paddle/fluid/operators/elementwise/elementwise_op.h#L189-L196)
......
......@@ -9,20 +9,20 @@ The core method of Op is Run. The Run method requires two resources: data resour
The Fluid framework is designed to run on a variety of devices and third-party libraries, and some Op implementations may vary on different devices or third-party libraries. Therefore, Fluid introduced the OpKernel approach, which means an Op can have multiple OpKernels. Such Ops are derived from `OperatorWithKernel`, and a representative of such Ops is conv_op; the OpKernels of conv_op are: `GemmConvKernel`, `CUDNNConvOpKernel`, `ConvMKLDNNOpKernel`, and each OpKernel has two data types, double and float. Ops that do not need an OpKernel include `WhileOp` and so on.
Operator inheritance diagram:
![op_inheritance_relation_diagram](./op_inheritance_relation_diagram.png)
For further information, please refer to: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices) , [scope](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/scope.md) , [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/release/1.2/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md)
### 2.Op's registration logic
The registration entries for each Operator include:
```C++
OpCreator creator_;
GradOpMakerFN grad_op_maker_;
proto::OpProto* proto_{nullptr};
OpAttrChecker* checker_{nullptr};
InferVarTypeFN infer_var_type_;
InferShapeFN infer_shape_;
```
<table>
<thead>
......@@ -74,14 +74,14 @@ The registration entries for each Operator include:
</table>
Usually you need to call REGISTER_OPERATOR to register an Op, which is:
```
REGISTER_OPERATOR(op_type,
                  OperatorBase,
                  Op_maker_and_checker_maker,
                  Op_grad_opmaker,
                  Op_infer_var_shape,
                  Op_infer_var_type)
```
**Note:**
......@@ -110,7 +110,7 @@ Never make any modification of the input data inside Op, as there may be other O
Currently all OpKernels are required to register the double and float data types.
### 4.Op compatibility issue
The modification of an Op needs to consider compatibility. Please ensure that previous models can still be loaded and run normally after the Op is modified, which means that models trained with an old version can be loaded and run with the Paddle inference library of a new version. <font color="#FF0000">**So developers should ensure that the Input, Output and Attribute of OPs cannot be modified (except for documents) or deleted. Developers can add Input, Output and Attribute, but a newly added Input or Output must be set as dispensable, and a newly added Attribute must be given a default value. For more details, please refer to [OP Input/Output/Attribute Compatibility Modification](https://github.com/PaddlePaddle/Paddle/wiki/OP-Input-Output-Attribute-Compatibility-Modification(English-Version))**</font>.
### 5.Call ShareDataWith
The function of ShareDataWith is to make two Tensors share the same underlying buffer. Special attention should be paid when calling this operation: in an Op, ShareDataWith cannot be applied to the Op's output; in other words, the Op's output Tensor must be allocated with Malloc.
......@@ -127,11 +127,11 @@ Since the GPU is executed asynchronously, the GPU side may not be actually execu
Some of the synchronous and asynchronous operations in the GPU:
```
The following device operations are asynchronous with respect to the host:
  Kernel launches;
  Memory copies within a single device's memory;
  Memory copies from host to device of a memory block of 64 KB or less;
  Memory copies performed by functions that are suffixed with Async;
  Memory set function calls.
```
Note on cudaMemCpy and cudaMemCpyAsync:
......@@ -176,12 +176,11 @@ If Op has a mathematical formula, be sure to write the mathematical formula in t
The order of the parameters in the Python API is generally ranked by importance, taking fc as an example:
```
def fc(input,
       size,
       num_flatten_dims=1,
       param_attr=None,
       bias_attr=None,
       act=None,
       is_test=False,
       name=None)
```
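
A hedged usage sketch of that ordering (the layer sizes and names are illustrative): everyday calls pass only the leading, important parameters, and the long tail is reached by keyword:

```python
import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[16], dtype='float32')

# Common case: only the important leading parameters are needed.
h = fluid.layers.fc(input=x, size=64)

# Rarely used parameters sit at the end of the signature and are
# supplied by keyword when needed.
h2 = fluid.layers.fc(input=x, size=64, act='relu', name='fc_named')
```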
.. _user_guide_use_numpy_array_as_train_data_en:
#################################
Take Numpy Array as Training Data
#################################
PaddlePaddle Fluid supports configuring data layer with :code:`fluid.layers.data()` .
Then you can use Numpy Array or directly use Python to create C++
:code:`fluid.LoDTensor` , and then feed it to :code:`fluid.Executor` or :code:`fluid.ParallelExecutor`
through :code:`Executor.run(feed=...)` .
Configure Data Layer
############################
With :code:`fluid.layers.data()` , you can configure the data layer of a neural network. Details are as follows:
.. code-block:: python

   import paddle.fluid as fluid

   image = fluid.layers.data(name="image", shape=[3, 224, 224])
   label = fluid.layers.data(name="label", shape=[1], dtype="int64")

   # use image/label as layer input
   prediction = fluid.layers.fc(input=image, size=1000, act="softmax")
   loss = fluid.layers.cross_entropy(input=prediction, label=label)
   ...
In the code above, :code:`image` and :code:`label` are two input data layers created by :code:`fluid.layers.data` . :code:`image` is float data of shape :code:`[3, 224, 224]` ; :code:`label` is the int data of shape :code:`[1]` . Note that:
1. :code:`-1` represents the batch size dimension by default in Fluid, and :code:`-1` is added to the first dimension of :code:`shape` by default. Therefore, in the code above, a numpy array of shape :code:`[32, 3, 224, 224]` can be fed to :code:`image` . If you want to customize the position of the batch size dimension, please set :code:`fluid.layers.data(append_batch_size=False)` . Please refer to the tutorial in the advanced user guide: :ref:`user_guide_customize_batch_size_rank_en` .

2. The data type of category labels in Fluid is :code:`int64` and labels start from 0. For the supported data types, please refer to :ref:`user_guide_paddle_support_data_types_en` .
.. _user_guide_feed_data_to_executor_en:
Transfer Train Data to Executor
################################
Both :code:`Executor.run` and :code:`ParallelExecutor.run` receive a parameter :code:`feed` .
The parameter is a dict in Python. Its key is the name of the data layer, such as :code:`image` in the code above, and its value is the corresponding numpy array.
For example:
.. code-block:: python

   exe = fluid.Executor(fluid.CPUPlace())
   # init Program
   exe.run(fluid.default_startup_program())
   exe.run(feed={
      "image": numpy.random.random(size=(32, 3, 224, 224)).astype('float32'),
      "label": numpy.random.random(size=(32, 1)).astype('int64')
   })
Advanced Usage
###############
How to feed Sequence Data
--------------------------
Sequence data is a unique data type supported by PaddlePaddle Fluid. You can take :code:`LoDTensor` as input data type.
You need to:
1. Feed all data to be trained in a mini-batch.
2. Get the length of each sequence.
You can use :code:`fluid.create_lod_tensor` to create :code:`LoDTensor` .
To feed sequence information, it is necessary to set the sequence nested depth :code:`lod_level` .
For instance, if the training data are sentences consisting of words, :code:`lod_level=1` ; if the training data are paragraphs consisting of sentences that consist of words, :code:`lod_level=2` .
For example:
.. code-block:: python

   sentence = fluid.layers.data(name="sentence", dtype="int64", shape=[1], lod_level=1)

   ...

   exe.run(feed={
      "sentence": create_lod_tensor(
         data=numpy.array([1, 3, 4, 5, 3, 6, 8], dtype='int64').reshape(-1, 1),
         recursive_seq_lens=[[4, 1, 2]],
         place=fluid.CPUPlace()
      )
   })
The training data :code:`sentence` contains three samples, whose lengths are :code:`4, 1, 2` respectively.
They are :code:`data[0:4]`, :code:`data[4:5]` and :code:`data[5:7]` respectively.
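
For nested sequences the same pattern extends to :code:`lod_level=2` ; a hedged sketch (the variable name :code:`paragraph` is illustrative):

.. code-block:: python

   # Two paragraphs: the first has 2 sentences of lengths 3 and 1,
   # the second has 1 sentence of length 2 (6 words in total).
   paragraph = fluid.layers.data(name="paragraph", dtype="int64",
                                 shape=[1], lod_level=2)

   ...

   exe.run(feed={
      "paragraph": create_lod_tensor(
         data=numpy.array([1, 2, 3, 4, 5, 6], dtype='int64').reshape(-1, 1),
         recursive_seq_lens=[[2, 1], [3, 1, 2]],
         place=fluid.CPUPlace()
      )
   })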
How to prepare training data for every device in ParallelExecutor
-------------------------------------------------------------------
When you feed data to :code:`ParallelExecutor.run(feed=...)` ,
you can explicitly assign data for every training device (such as GPU).
You need to feed a list to :code:`feed` . Each element of the list is a dict.
The key of the dict is name of data layer and the value of dict is value of data layer.
For example:
.. code-block:: python

   parallel_executor = fluid.ParallelExecutor()
   parallel_executor.run(
      feed=[
         {
            "image": numpy.random.random(size=(32, 3, 224, 224)).astype('float32'),
            "label": numpy.random.random(size=(32, 1)).astype('int64')
         },
         {
            "image": numpy.random.random(size=(16, 3, 224, 224)).astype('float32'),
            "label": numpy.random.random(size=(16, 1)).astype('int64')
         },
      ]
   )
In the code above, GPU0 will train 32 samples and GPU1 will train 16 samples.
.. _user_guide_customize_batch_size_rank_en:
Customize the BatchSize dimension
------------------------------------
Batch size is the first dimension of data by default in PaddlePaddle Fluid, indicated by :code:`-1` . But in advanced usage, batch_size could be fixed or represented by another dimension or multiple dimensions, which can be implemented by setting :code:`fluid.layers.data(append_batch_size=False)` .
1. fixed BatchSize dimension
.. code-block:: python

   image = fluid.layers.data(name="image", shape=[32, 784], append_batch_size=False)
Here :code:`image` is always a matrix with size of :code:`[32, 784]` .
2. batch size expressed by other dimension
.. code-block:: python

   sentence = fluid.layers.data(name="sentence",
                                shape=[80, -1, 1],
                                append_batch_size=False,
                                dtype="int64")
Here the middle dimension of :code:`sentence` is batch size. This type of data layout is applied in fixed-length recurrent neural networks.
.. _user_guide_paddle_support_data_types_en:
Data types supported by Fluid
-------------------------------
The data types supported by PaddlePaddle Fluid are:
* float16: supported by part of operations
* float32: major data type of real number
* float64: minor data type of real number, supported by most operations
* int32: minor data type of labels
* int64: major data type of labels
* uint64: minor data type of labels
* bool: type of control flow data
* int16: minor type of labels
* uint8: input data type, used for pixel of picture
####################
Distributed Training
####################
.. toctree::
   :maxdepth: 1

   cluster_quick_start.rst
   cluster_howto.rst
   fleet_api_howto_cn.rst
.. _user_guide_distribute_en:
######################
Distributed Training
######################
.. toctree::
   :maxdepth: 1

   cluster_quick_start_en.rst
   cluster_howto_en.rst
###################
Multi-node Training
###################
......@@ -8,5 +8,3 @@
   cluster_quick_start.rst
   cluster_howto.rst
   fleet_api_howto_cn.rst
......@@ -27,6 +27,7 @@ import paddle.fluid as fluid
BATCH_SIZE = 64
PASS_NUM = 1
def loss_net(hidden, label):
    prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
    loss = fluid.layers.cross_entropy(input=prediction, label=label)
......@@ -34,6 +35,7 @@ def loss_net(hidden, label):
    acc = fluid.layers.accuracy(input=prediction, label=label)
    return prediction, avg_loss, acc
def conv_net(img, label):
    conv_pool_1 = fluid.nets.simple_img_conv_pool(
        input=img,
......@@ -79,8 +81,7 @@ def train(use_cuda, role, endpoints, current_endpoint, trainer_id, trainers):
    exe = fluid.Executor(place)

    train_reader = paddle.batch(
        paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=500),
        batch_size=BATCH_SIZE)
    test_reader = paddle.batch(
        paddle.dataset.mnist.test(), batch_size=BATCH_SIZE)
......@@ -88,20 +89,23 @@ def train(use_cuda, role, endpoints, current_endpoint, trainer_id, trainers):
    exe.run(fluid.default_startup_program())

    for pass_id in range(PASS_NUM):
        for batch_id, data in enumerate(train_reader()):
            acc_np, avg_loss_np = exe.run(
                prog, feed=feeder.feed(data), fetch_list=[acc, avg_loss])
            if (batch_id + 1) % 10 == 0:
                print(
                    'PassID {0:1}, BatchID {1:04}, Loss {2:2.2}, Acc {3:2.2}'.
                    format(pass_id, batch_id + 1,
                           float(avg_loss_np.mean()), float(
                               acc_np.mean())))
if __name__ == '__main__':
    if len(sys.argv) != 6:
        print(
            "Usage: python %s role endpoints current_endpoint trainer_id trainers"
            % sys.argv[0])
        exit(0)
    role, endpoints, current_endpoint, trainer_id, trainers = \
        sys.argv[1:]
    train(True, role, endpoints, current_endpoint,
          int(trainer_id), int(trainers))
......@@ -246,4 +246,3 @@ VisualDL is developed by [PaddlePaddle](http://www.paddlepaddle.org/) and
## More details

For more details about how to use VisualDL, please see the [documentation](https://github.com/PaddlePaddle/VisualDL/tree/develop/demo)
......@@ -262,4 +262,3 @@ We welcome everyone to use, comment and contribute to VisualDL :)
## More details
For more details about how to use VisualDL, please take a look at [documents](https://github.com/PaddlePaddle/VisualDL/tree/develop/demo)