Commit 2af9aac2, author: Yu Yang

Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/add_global_step

...@@ -27,7 +27,6 @@ third_party/ ...@@ -27,7 +27,6 @@ third_party/
cmake-build-* cmake-build-*
# generated while compiling # generated while compiling
python/paddle/v2/fluid/core.so
paddle/pybind/pybind.h paddle/pybind/pybind.h
CMakeFiles CMakeFiles
cmake_install.cmake cmake_install.cmake
......
======================
Fluid
======================
.. toctree::
:maxdepth: 1
layers.rst
data_feeder.rst
executor.rst
initializer.rst
evaluator.rst
nets.rst
optimizer.rst
param_attr.rst
profiler.rst
regularizer.rst
io.rst
...@@ -8,4 +8,4 @@ API ...@@ -8,4 +8,4 @@ API
v2/model_configs.rst v2/model_configs.rst
v2/data.rst v2/data.rst
v2/run_logic.rst v2/run_logic.rst
v2/fluid.rst fluid/index.rst
======================
Fluid
======================
.. toctree::
:maxdepth: 1
fluid/layers.rst
fluid/data_feeder.rst
fluid/executor.rst
fluid/initializer.rst
fluid/evaluator.rst
fluid/nets.rst
fluid/optimizer.rst
fluid/param_attr.rst
fluid/profiler.rst
fluid/regularizer.rst
fluid/io.rst
...@@ -12,7 +12,7 @@ The following table compares concepts in Fluid and Go ...@@ -12,7 +12,7 @@ The following table compares concepts in Fluid and Go
| Go | Fluid | | Go | Fluid |
|----|-------| |----|-------|
|user-defined functions | [layers](https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/v2/fluid) | |user-defined functions | [layers](https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/fluid) |
| control-flow and built-in functions | [intrinsics/operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators) | | control-flow and built-in functions | [intrinsics/operators](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators) |
| goroutines, channels | [class ThreadPool](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework/thread_pool.h) | | goroutines, channels | [class ThreadPool](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework/thread_pool.h) |
| runtime | [class Executor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/executor.h) | | runtime | [class Executor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/executor.h) |
......
...@@ -89,7 +89,7 @@ with train_loop.block(): ...@@ -89,7 +89,7 @@ with train_loop.block():
h[t] = the_step(input[t]) h[t] = the_step(input[t])
``` ```
An actual Fluid example is described [here](https://github.com/PaddlePaddle/Paddle/blob/a91efdde6910ce92a78e3aa7157412c4c88d9ee8/python/paddle/v2/fluid/tests/test_while_op.py#L36-L44). An actual Fluid example is described [here](https://github.com/PaddlePaddle/Paddle/blob/bde090a97564b9c61a6aaa38b72ccc4889d102d9/python/paddle/fluid/tests/unittests/test_while_op.py#L50-L58).
From the example, the Fluid programs look very similar to their PyTorch equivalent programs, except that Fluid's loop structure, wrapped with Python's `with` statement, could run much faster than just a Python loop. From the example, the Fluid programs look very similar to their PyTorch equivalent programs, except that Fluid's loop structure, wrapped with Python's `with` statement, could run much faster than just a Python loop.
......
...@@ -101,7 +101,7 @@ In-place is a built-in attribute of an operator. Since we treat in-place and oth ...@@ -101,7 +101,7 @@ In-place is a built-in attribute of an operator. Since we treat in-place and oth
#### construct control flow graph #### construct control flow graph
Following is the ProgramDesc protobuf of [machine translation](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/tests/book/test_machine_translation.py) example. Following is the ProgramDesc protobuf of [machine translation](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book/test_machine_translation.py) example.
- Block0: - Block0:
......
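C-API Inference Library C-API Inference Library
======================= =======================
After training a neural network model, the next step is to use it for inference. Inference is the process of preparing input data, running it through the model, and obtaining the prediction results.
Compared with model training, inference has the following characteristics:
#. Inference does not need the backpropagation and parameter-update parts of training.
#. Inference does not need labels.
#. Inference often has to be integrated into the user's system.
Because of these characteristics, the inference SDK needs to be designed separately and should have the following properties:
#. The inference SDK does not include backpropagation or parameter updates, which reduces its size.
#. The inference SDK needs to provide a concise user interface that is easy to use.
#. Because input data may come in many structures, the input data format is wrapped clearly and concisely.
#. To be compatible with user systems, the SDK interface needs to conform to the C standard.
PaddlePaddle provides a C-API to address these needs. We provide the following guides on using the C-API: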
.. toctree:: .. toctree::
:maxdepth: 1 :maxdepth: 1
......
...@@ -32,7 +32,7 @@ The non-cluster version of this demo with fluid API is as follows: ...@@ -32,7 +32,7 @@ The non-cluster version of this demo with fluid API is as follows:
``` python ``` python
import paddle.v2 as paddle import paddle.v2 as paddle
import paddle.v2.fluid as fluid import paddle.fluid as fluid
x = fluid.layers.data(name='x', shape=[13], dtype='float32') x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None) y_predict = fluid.layers.fc(input=x, size=1, act=None)
...@@ -125,11 +125,11 @@ for pass_id in range(100): ...@@ -125,11 +125,11 @@ for pass_id in range(100):
### E2E demo ### E2E demo
Please find the complete demo from [here](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/tests/book_distribute/notest_dist_fit_a_line.py). Please find the complete demo from [here](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/tests/book_distribute/notest_dist_fit_a_line.py).
First `cd` into the folder that contains the `python` files. In this case: First `cd` into the folder that contains the `python` files. In this case:
```bash ```bash
cd /paddle/python/paddle/v2/fluid/tests/book_distribute cd /paddle/python/paddle/fluid/tests/book_distribute
``` ```
In parameter server node run the following in the command line: In parameter server node run the following in the command line:
......
...@@ -35,7 +35,7 @@ cprofilev -a 0.0.0.0 -p 3214 -f profile.out main.py ...@@ -35,7 +35,7 @@ cprofilev -a 0.0.0.0 -p 3214 -f profile.out main.py
``` ```
ncalls tottime percall cumtime percall filename:lineno(function) ncalls tottime percall cumtime percall filename:lineno(function)
1 0.284 0.284 29.514 29.514 main.py:1(<module>) 1 0.284 0.284 29.514 29.514 main.py:1(<module>)
4696 0.128 0.000 15.748 0.003 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/executor.py:20(run) 4696 0.128 0.000 15.748 0.003 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/executor.py:20(run)
4696 12.040 0.003 12.040 0.003 {built-in method run} 4696 12.040 0.003 12.040 0.003 {built-in method run}
1 0.144 0.144 6.534 6.534 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/__init__.py:14(<module>) 1 0.144 0.144 6.534 6.534 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/__init__.py:14(<module>)
``` ```
...@@ -61,9 +61,9 @@ cprofilev -a 0.0.0.0 -p 3214 -f profile.out main.py ...@@ -61,9 +61,9 @@ cprofilev -a 0.0.0.0 -p 3214 -f profile.out main.py
```text ```text
4696 12.040 0.003 12.040 0.003 {built-in method run} 4696 12.040 0.003 12.040 0.003 {built-in method run}
300005 0.874 0.000 1.681 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/dataset/mnist.py:38(reader) 300005 0.874 0.000 1.681 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/dataset/mnist.py:38(reader)
107991 0.676 0.000 1.519 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:219(__init__) 107991 0.676 0.000 1.519 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:219(__init__)
4697 0.626 0.000 2.291 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:428(sync_with_cpp) 4697 0.626 0.000 2.291 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:428(sync_with_cpp)
1 0.618 0.618 0.618 0.618 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/__init__.py:1(<module>) 1 0.618 0.618 0.618 0.618 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/__init__.py:1(<module>)
``` ```
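We can see that the most time-consuming function is the C++ `run` function. Tuning it requires the profiling of mixed `Python` and `C++` code described in the second section. The `sync_with_cpp` function also has a long total time and a long per-call time, so we can click on `sync_with_cpp` for details and inspect its call relationships. We can see that the most time-consuming function is the C++ `run` function. Tuning it requires the profiling of mixed `Python` and `C++` code described in the second section. The `sync_with_cpp` function also has a long total time and a long per-call time, so we can click on `sync_with_cpp` for details and inspect its call relationships.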
...@@ -76,9 +76,9 @@ Called By: ...@@ -76,9 +76,9 @@ Called By:
Function was called by... Function was called by...
ncalls tottime cumtime ncalls tottime cumtime
/home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:428(sync_with_cpp) <- 4697 0.626 2.291 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:562(sync_with_cpp) /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:428(sync_with_cpp) <- 4697 0.626 2.291 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:562(sync_with_cpp)
/home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:562(sync_with_cpp) <- 4696 0.019 2.316 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:487(clone) /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:562(sync_with_cpp) <- 4696 0.019 2.316 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:487(clone)
1 0.000 0.001 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:534(append_backward) 1 0.000 0.001 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:534(append_backward)
Called: Called:
......
...@@ -49,7 +49,7 @@ port, we will see the output like the following: ...@@ -49,7 +49,7 @@ port, we will see the output like the following:
``` ```
ncalls tottime percall cumtime percall filename:lineno(function) ncalls tottime percall cumtime percall filename:lineno(function)
1 0.284 0.284 29.514 29.514 main.py:1(<module>) 1 0.284 0.284 29.514 29.514 main.py:1(<module>)
4696 0.128 0.000 15.748 0.003 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/executor.py:20(run) 4696 0.128 0.000 15.748 0.003 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/executor.py:20(run)
4696 12.040 0.003 12.040 0.003 {built-in method run} 4696 12.040 0.003 12.040 0.003 {built-in method run}
1 0.144 0.144 6.534 6.534 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/__init__.py:14(<module>) 1 0.144 0.144 6.534 6.534 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/__init__.py:14(<module>)
``` ```
...@@ -74,9 +74,9 @@ focus on. We can sort above profiling file by tottime: ...@@ -74,9 +74,9 @@ focus on. We can sort above profiling file by tottime:
```text ```text
4696 12.040 0.003 12.040 0.003 {built-in method run} 4696 12.040 0.003 12.040 0.003 {built-in method run}
300005 0.874 0.000 1.681 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/dataset/mnist.py:38(reader) 300005 0.874 0.000 1.681 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/dataset/mnist.py:38(reader)
107991 0.676 0.000 1.519 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:219(__init__) 107991 0.676 0.000 1.519 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:219(__init__)
4697 0.626 0.000 2.291 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:428(sync_with_cpp) 4697 0.626 0.000 2.291 0.000 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:428(sync_with_cpp)
1 0.618 0.618 0.618 0.618 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/__init__.py:1(<module>) 1 0.618 0.618 0.618 0.618 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/__init__.py:1(<module>)
``` ```
We can see that the most time-consuming function is the `built-in We can see that the most time-consuming function is the `built-in
...@@ -93,9 +93,9 @@ Called By: ...@@ -93,9 +93,9 @@ Called By:
Function was called by... Function was called by...
ncalls tottime cumtime ncalls tottime cumtime
/home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:428(sync_with_cpp) <- 4697 0.626 2.291 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:562(sync_with_cpp) /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:428(sync_with_cpp) <- 4697 0.626 2.291 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:562(sync_with_cpp)
/home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:562(sync_with_cpp) <- 4696 0.019 2.316 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:487(clone) /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:562(sync_with_cpp) <- 4696 0.019 2.316 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:487(clone)
1 0.000 0.001 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/v2/fluid/framework.py:534(append_backward) 1 0.000 0.001 /home/yuyang/perf_test/.env/lib/python2.7/site-packages/paddle/fluid/framework.py:534(append_backward)
Called: Called:
......
# PaddlePaddle Fluid Source Code Overview # PaddlePaddle Fluid Source Code Overview
Examples: https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/v2/fluid/tests/book Examples: https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/fluid/tests/book
Core: https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework Core: https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/framework
...@@ -26,16 +26,16 @@ sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001) ...@@ -26,16 +26,16 @@ sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
sgd_optimizer.minimize(avg_cost) sgd_optimizer.minimize(avg_cost)
``` ```
- Variables: `x`, `y`, `y_predict`, `cost` and `avg_cost`. [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/framework.py#) - Variables: `x`, `y`, `y_predict`, `cost` and `avg_cost`. [Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/framework.py#)
- Layers: `fluid.layers.data`, `fluid.layers.fc` and `fluid.layers.mean` are layers. [Python](https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/v2/fluid/layers) - Layers: `fluid.layers.data`, `fluid.layers.fc` and `fluid.layers.mean` are layers. [Python](https://github.com/PaddlePaddle/Paddle/tree/develop/python/paddle/fluid/layers)
- Every Layer has one or more operators and variables/parameters - Every Layer has one or more operators and variables/parameters
- All the operators are defined at [`paddle/operators/`](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators). Other worth-looking files: - All the operators are defined at [`paddle/operators/`](https://github.com/PaddlePaddle/Paddle/tree/develop/paddle/operators). Other worth-looking files:
- Base class: [`paddle/framework/operator.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/operator.h) - Base class: [`paddle/framework/operator.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/operator.h)
- Operator Registration: [`paddle/framework/op_registry.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/op_registry.h) - Operator Registration: [`paddle/framework/op_registry.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/op_registry.h)
- Operator Lookup: [`paddle/framework/op_info.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/op_info.h) - Operator Lookup: [`paddle/framework/op_info.h`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/op_info.h)
- Optimizer: `fluid.optimizer.SGD`. It does the following - Optimizer: `fluid.optimizer.SGD`. It does the following
- Add backward operators. [[Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/backward.py)] - Add backward operators. [[Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/backward.py)]
- Add optimizer operators. [[Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/optimizer.py)] - Add optimizer operators. [[Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/optimizer.py)]
# Run Time # Run Time
...@@ -57,7 +57,7 @@ exe.run(fluid.default_main_program(), ...@@ -57,7 +57,7 @@ exe.run(fluid.default_main_program(),
- Place: `place`. one of CPU, GPU or FPGA. [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/place.h) - Place: `place`. one of CPU, GPU or FPGA. [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/place.h)
- The device handles are at [paddle/platform/device_context.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/device_context.h) - The device handles are at [paddle/platform/device_context.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/platform/device_context.h)
- Executor: `fluid.Executor(place)`. [[Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/executor.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/executor.cc)] - Executor: `fluid.Executor(place)`. [[Python](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/executor.py), [C++](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/executor.cc)]
- Feeds the data: `feed=feeder.feed(data)` - Feeds the data: `feed=feeder.feed(data)`
- Evaluates all the operators - Evaluates all the operators
- Fetches the result: `fetch_list=[avg_cost]` - Fetches the result: `fetch_list=[avg_cost]`
......
...@@ -130,7 +130,7 @@ def generate_layer_fn(op_type): ...@@ -130,7 +130,7 @@ def generate_layer_fn(op_type):
o_name = not_intermediate_outputs[0].name o_name = not_intermediate_outputs[0].name
intermediate_output_names = [output.name for output in intermediate_outputs] intermediate_output_names = [output.name for output in intermediate_outputs]
def infer_and_check_dtype(op_proto, **kwargs): def infer_and_check_dtype(op_proto, *args, **kwargs):
""" """
This function performs the sanity check for dtype and This function performs the sanity check for dtype and
instance type. instance type.
...@@ -141,6 +141,10 @@ def generate_layer_fn(op_type): ...@@ -141,6 +141,10 @@ def generate_layer_fn(op_type):
val = kwargs.pop(name, []) val = kwargs.pop(name, [])
if not isinstance(val, list) and not isinstance(val, tuple): if not isinstance(val, list) and not isinstance(val, tuple):
val = [val] val = [val]
if len(val) == 0:
val = [args[0]]
args = args[1:]
for each in val: for each in val:
if not isinstance(each, Variable): if not isinstance(each, Variable):
raise ValueError("input of {0} must be variable".format( raise ValueError("input of {0} must be variable".format(
...@@ -155,10 +159,10 @@ def generate_layer_fn(op_type): ...@@ -155,10 +159,10 @@ def generate_layer_fn(op_type):
return dtype return dtype
def func(**kwargs): def func(*args, **kwargs):
helper = LayerHelper(op_type, **kwargs) helper = LayerHelper(op_type, **kwargs)
dtype = infer_and_check_dtype(op_proto, **kwargs) dtype = infer_and_check_dtype(op_proto, *args, **kwargs)
inputs = dict() inputs = dict()
for ipt in op_proto.inputs: for ipt in op_proto.inputs:
...@@ -166,6 +170,9 @@ def generate_layer_fn(op_type): ...@@ -166,6 +170,9 @@ def generate_layer_fn(op_type):
val = kwargs.pop(name, []) val = kwargs.pop(name, [])
if not isinstance(val, list) and not isinstance(val, tuple): if not isinstance(val, list) and not isinstance(val, tuple):
val = [val] val = [val]
if len(val) == 0 and len(args) != 0:
val = args[0]
args = args[1:]
inputs[ipt.name] = val inputs[ipt.name] = val
outputs = dict() outputs = dict()
......
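The hunks above let a generated layer function take its inputs positionally, which is why later call sites change from `layers.mean(x=cost)` to `layers.mean(cost)`. The following is a minimal standalone sketch of that fallback, not the code from the commit: `resolve_inputs` and its `input_names` parameter are hypothetical names, and the operator proto is replaced by a plain list of input names. Unlike the hunk, the sketch always wraps the positional value in a list and always checks that a positional argument is left before consuming one.

```python
def resolve_inputs(input_names, *args, **kwargs):
    """Map each declared input name to a list of values.

    A keyword argument takes precedence. If an input has no keyword
    value, the next unused positional argument is consumed for it.
    (Hypothetical helper, for illustration only.)
    """
    inputs = {}
    for name in input_names:
        val = kwargs.pop(name, [])
        # Normalize a single value to a one-element list.
        if not isinstance(val, (list, tuple)):
            val = [val]
        # No keyword value was given: fall back to positional arguments.
        if len(val) == 0 and len(args) != 0:
            val = [args[0]]
            args = args[1:]
        inputs[name] = list(val)
    return inputs


# Keyword and positional spellings resolve to the same mapping.
by_keyword = resolve_inputs(["x"], x="cost")
by_position = resolve_inputs(["x"], "cost")
```

Under these assumptions both calls yield the same dictionary, so existing keyword-style callers keep working while new callers can drop the `x=` prefix.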
...@@ -160,8 +160,8 @@ def sums(input, out=None): ...@@ -160,8 +160,8 @@ def sums(input, out=None):
a0 = layers.array_read(array=tmp, i=i) a0 = layers.array_read(array=tmp, i=i)
i = layers.increment(x=i) i = layers.increment(x=i)
a1 = layers.array_read(array=tmp, i=i) a1 = layers.array_read(array=tmp, i=i)
mean_a0 = layers.mean(x=a0) mean_a0 = layers.mean(a0)
mean_a1 = layers.mean(x=a1) mean_a1 = layers.mean(a1)
a_sum = layers.sums(input=[mean_a0, mean_a1]) a_sum = layers.sums(input=[mean_a0, mean_a1])
""" """
helper = LayerHelper('sum', **locals()) helper = LayerHelper('sum', **locals())
......
...@@ -36,9 +36,17 @@ class Optimizer(object): ...@@ -36,9 +36,17 @@ class Optimizer(object):
""" """
def __init__(self, learning_rate, regularization=None): def __init__(self, learning_rate, regularization=None):
assert learning_rate is not None if not isinstance(learning_rate, float) and \
not isinstance(learning_rate, framework.Variable):
raise TypeError("learning rate should be float or Variable")
self.regularization = regularization self.regularization = regularization
self._global_learning_rate = learning_rate self._learning_rate = learning_rate
# each program should have a independent learning rate
# program -> Variable(learning_rate)
self._learning_rate_map = dict()
if isinstance(self._learning_rate, framework.Variable):
self._learning_rate_map[framework.default_main_program(
)] = self._learning_rate
# Dictionary of accumulators. Some optimizer subclasses need to # Dictionary of accumulators. Some optimizer subclasses need to
# allocate and manage extra variables associated with the parameters # allocate and manage extra variables associated with the parameters
# to train. These variables are called accumulators. # to train. These variables are called accumulators.
...@@ -47,26 +55,33 @@ class Optimizer(object): ...@@ -47,26 +55,33 @@ class Optimizer(object):
self.helper = None self.helper = None
def _create_global_learning_rate(self): def _create_global_learning_rate(self):
if isinstance(self._global_learning_rate, float): lr = self.global_learning_rate()
self._global_learning_rate = layers.create_global_var(
if isinstance(lr, framework.Variable):
return
else:
if not isinstance(self._learning_rate, float):
raise TypeError(
"learning rate variable is create outside optimizer,"
"can not create new learning rate variable for new program")
# create learning rate in the current main program
self._learning_rate_map[framework.default_main_program(
)] = layers.create_global_var(
name=unique_name.generate("learning_rate"), name=unique_name.generate("learning_rate"),
shape=[1], shape=[1],
value=float(self._global_learning_rate), value=float(self._learning_rate),
dtype='float32', dtype='float32',
persistable=True) persistable=True)
if not isinstance(self._global_learning_rate, framework.Variable): def global_learning_rate(self, program=None):
raise ValueError("learning rate should be a Variable, "
"actual type is %s",
type(self._global_learning_rate))
@property
def global_learning_rate(self):
""" """
get global decayed learning rate get global decayed learning rate
:return: :return:
""" """
return self._global_learning_rate if program is None:
program = framework.default_main_program()
return self._learning_rate_map.get(program, None)
def _append_optimize_op(self, block, param_and_grad): def _append_optimize_op(self, block, param_and_grad):
""" append optimize operator to block and return all the added optimize_op """ append optimize operator to block and return all the added optimize_op
...@@ -77,7 +92,7 @@ class Optimizer(object): ...@@ -77,7 +92,7 @@ class Optimizer(object):
# create learning rate variable for every parameter # create learning rate variable for every parameter
param = param_and_grad[0] param = param_and_grad[0]
param_lr = param.optimize_attr['learning_rate'] param_lr = param.optimize_attr['learning_rate']
return self._global_learning_rate * param_lr return self.global_learning_rate() * param_lr
def _create_accumulators(self, block, parameters): def _create_accumulators(self, block, parameters):
"""Create all accumulators needed by the parameters """Create all accumulators needed by the parameters
......
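The optimizer hunks above replace a single `_global_learning_rate` with a map from program to learning-rate variable, so each program gets its own variable. The sketch below mirrors that bookkeeping with stand-in classes and no Paddle dependency. `Program`, `LearningRateVar`, and `LearningRateHolder` are hypothetical names, and the program is passed explicitly instead of being read from `framework.default_main_program()`.

```python
class Program(object):
    """Stand-in for a Fluid program; used only as a dictionary key."""


class LearningRateVar(object):
    """Stand-in for a framework Variable holding a learning rate."""

    def __init__(self, value):
        self.value = value


class LearningRateHolder(object):
    """Keeps one learning-rate variable per program (illustrative only)."""

    def __init__(self, learning_rate, program):
        if not isinstance(learning_rate, (float, LearningRateVar)):
            raise TypeError("learning rate should be float or Variable")
        self._learning_rate = learning_rate
        # program -> LearningRateVar
        self._learning_rate_map = {}
        if isinstance(learning_rate, LearningRateVar):
            # A user-supplied variable belongs to the program it was built in.
            self._learning_rate_map[program] = learning_rate

    def global_learning_rate(self, program):
        # Returns None when no variable exists for this program yet.
        return self._learning_rate_map.get(program, None)

    def create_global_learning_rate(self, program):
        if self.global_learning_rate(program) is not None:
            return
        if not isinstance(self._learning_rate, float):
            # A variable created outside cannot be re-created for a new program.
            raise TypeError("cannot create a learning rate variable "
                            "for a new program")
        self._learning_rate_map[program] = LearningRateVar(self._learning_rate)


main_program, test_program = Program(), Program()
holder = LearningRateHolder(0.01, main_program)
holder.create_global_learning_rate(main_program)
holder.create_global_learning_rate(test_program)
```

With a float learning rate, each program receives an independent variable on first use. With a pre-built variable, only the program it was registered for can resolve it, which matches the `TypeError` branch in the hunk. Note that a check of the form `isinstance(learning_rate, float)` rejects a Python `int` such as `1`, so callers would need to write `1.0`.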
...@@ -147,7 +147,7 @@ def seq_to_seq_net(): ...@@ -147,7 +147,7 @@ def seq_to_seq_net():
label = fluid.layers.data( label = fluid.layers.data(
name='label_sequence', shape=[1], dtype='int64', lod_level=1) name='label_sequence', shape=[1], dtype='int64', lod_level=1)
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(x=cost) avg_cost = fluid.layers.mean(cost)
return avg_cost, prediction return avg_cost, prediction
......
...@@ -29,7 +29,7 @@ def train(use_cuda, save_dirname): ...@@ -29,7 +29,7 @@ def train(use_cuda, save_dirname):
y = fluid.layers.data(name='y', shape=[1], dtype='float32') y = fluid.layers.data(name='y', shape=[1], dtype='float32')
cost = fluid.layers.square_error_cost(input=y_predict, label=y) cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_cost = fluid.layers.mean(x=cost) avg_cost = fluid.layers.mean(cost)
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001) sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
sgd_optimizer.minimize(avg_cost) sgd_optimizer.minimize(avg_cost)
......
...@@ -110,7 +110,7 @@ def train(net_type, use_cuda, save_dirname): ...@@ -110,7 +110,7 @@ def train(net_type, use_cuda, save_dirname):
predict = fluid.layers.fc(input=net, size=classdim, act='softmax') predict = fluid.layers.fc(input=net, size=classdim, act='softmax')
cost = fluid.layers.cross_entropy(input=predict, label=label) cost = fluid.layers.cross_entropy(input=predict, label=label)
avg_cost = fluid.layers.mean(x=cost) avg_cost = fluid.layers.mean(cost)
acc = fluid.layers.accuracy(input=predict, label=label) acc = fluid.layers.accuracy(input=predict, label=label)
# Test program # Test program
......
...@@ -164,7 +164,7 @@ def train(use_cuda, save_dirname=None): ...@@ -164,7 +164,7 @@ def train(use_cuda, save_dirname=None):
label=target, label=target,
param_attr=fluid.ParamAttr( param_attr=fluid.ParamAttr(
name='crfw', learning_rate=mix_hidden_lr)) name='crfw', learning_rate=mix_hidden_lr))
avg_cost = fluid.layers.mean(x=crf_cost) avg_cost = fluid.layers.mean(crf_cost)
# TODO(qiao) # TODO(qiao)
# check other optimizers and check why out will be NAN # check other optimizers and check why out will be NAN
......
...@@ -178,7 +178,7 @@ def train_main(use_cuda, is_sparse): ...@@ -178,7 +178,7 @@ def train_main(use_cuda, is_sparse):
label = pd.data( label = pd.data(
name="target_language_next_word", shape=[1], dtype='int64', lod_level=1) name="target_language_next_word", shape=[1], dtype='int64', lod_level=1)
cost = pd.cross_entropy(input=rnn_out, label=label) cost = pd.cross_entropy(input=rnn_out, label=label)
avg_cost = pd.mean(x=cost) avg_cost = pd.mean(cost)
optimizer = fluid.optimizer.Adagrad(learning_rate=1e-4) optimizer = fluid.optimizer.Adagrad(learning_rate=1e-4)
optimizer.minimize(avg_cost) optimizer.minimize(avg_cost)
......
...@@ -48,7 +48,7 @@ BATCH_SIZE = 64 ...@@ -48,7 +48,7 @@ BATCH_SIZE = 64
def loss_net(hidden, label): def loss_net(hidden, label):
prediction = fluid.layers.fc(input=hidden, size=10, act='softmax') prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
loss = fluid.layers.cross_entropy(input=prediction, label=label) loss = fluid.layers.cross_entropy(input=prediction, label=label)
avg_loss = fluid.layers.mean(x=loss) avg_loss = fluid.layers.mean(loss)
acc = fluid.layers.accuracy(input=prediction, label=label) acc = fluid.layers.accuracy(input=prediction, label=label)
return prediction, avg_loss, acc return prediction, avg_loss, acc
...@@ -101,8 +101,8 @@ def train(nn_type, use_cuda, parallel, save_dirname, save_param_filename): ...@@ -101,8 +101,8 @@ def train(nn_type, use_cuda, parallel, save_dirname, save_param_filename):
avg_loss, acc = pd() avg_loss, acc = pd()
# get mean loss and acc through every devices. # get mean loss and acc through every devices.
avg_loss = fluid.layers.mean(x=avg_loss) avg_loss = fluid.layers.mean(avg_loss)
acc = fluid.layers.mean(x=acc) acc = fluid.layers.mean(acc)
else: else:
prediction, avg_loss, acc = net_conf(img, label) prediction, avg_loss, acc = net_conf(img, label)
......
...@@ -147,7 +147,7 @@ def model(): ...@@ -147,7 +147,7 @@ def model():
label = layers.data(name='score', shape=[1], dtype='float32') label = layers.data(name='score', shape=[1], dtype='float32')
square_cost = layers.square_error_cost(input=scale_infer, label=label) square_cost = layers.square_error_cost(input=scale_infer, label=label)
avg_cost = layers.mean(x=square_cost) avg_cost = layers.mean(square_cost)
return scale_infer, avg_cost return scale_infer, avg_cost
......
...@@ -42,7 +42,7 @@ def convolution_net(data, label, input_dim, class_dim=2, emb_dim=32, ...@@ -42,7 +42,7 @@ def convolution_net(data, label, input_dim, class_dim=2, emb_dim=32,
size=class_dim, size=class_dim,
act="softmax") act="softmax")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(x=cost) avg_cost = fluid.layers.mean(cost)
accuracy = fluid.layers.accuracy(input=prediction, label=label) accuracy = fluid.layers.accuracy(input=prediction, label=label)
return avg_cost, accuracy, prediction return avg_cost, accuracy, prediction
...@@ -82,7 +82,7 @@ def dyn_rnn_lstm(data, label, input_dim, class_dim=2, emb_dim=32, ...@@ -82,7 +82,7 @@ def dyn_rnn_lstm(data, label, input_dim, class_dim=2, emb_dim=32,
last = fluid.layers.sequence_last_step(rnn()) last = fluid.layers.sequence_last_step(rnn())
prediction = fluid.layers.fc(input=last, size=class_dim, act="softmax") prediction = fluid.layers.fc(input=last, size=class_dim, act="softmax")
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(x=cost) avg_cost = fluid.layers.mean(cost)
accuracy = fluid.layers.accuracy(input=prediction, label=label) accuracy = fluid.layers.accuracy(input=prediction, label=label)
return avg_cost, accuracy, prediction return avg_cost, accuracy, prediction
...@@ -119,7 +119,7 @@ def stacked_lstm_net(data, ...@@ -119,7 +119,7 @@ def stacked_lstm_net(data,
size=class_dim, size=class_dim,
act='softmax') act='softmax')
cost = fluid.layers.cross_entropy(input=prediction, label=label) cost = fluid.layers.cross_entropy(input=prediction, label=label)
avg_cost = fluid.layers.mean(x=cost) avg_cost = fluid.layers.mean(cost)
accuracy = fluid.layers.accuracy(input=prediction, label=label) accuracy = fluid.layers.accuracy(input=prediction, label=label)
return avg_cost, accuracy, prediction return avg_cost, accuracy, prediction
...@@ -158,8 +158,8 @@ def train(word_dict, net_method, use_cuda, parallel=False, save_dirname=None): ...@@ -158,8 +158,8 @@ def train(word_dict, net_method, use_cuda, parallel=False, save_dirname=None):
pd.write_output(acc) pd.write_output(acc)
cost, acc = pd() cost, acc = pd()
cost = fluid.layers.mean(x=cost) cost = fluid.layers.mean(cost)
acc_out = fluid.layers.mean(x=acc) acc_out = fluid.layers.mean(acc)
prediction = None prediction = None
assert save_dirname is None assert save_dirname is None
......
@@ -118,7 +118,7 @@ def train(use_cuda, is_sparse, parallel, save_dirname):
 size=dict_size,
 act='softmax')
 cost = fluid.layers.cross_entropy(input=predict_word, label=words[4])
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 return avg_cost, predict_word
 word_dict = paddle.dataset.imikolov.build_dict()
@@ -143,7 +143,7 @@ def train(use_cuda, is_sparse, parallel, save_dirname):
 ]))
 pd.write_output(avg_cost)
-avg_cost = fluid.layers.mean(x=pd())
+avg_cost = fluid.layers.mean(pd())
 sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
 sgd_optimizer.minimize(avg_cost)
......
@@ -24,7 +24,7 @@ y_predict = fluid.layers.fc(input=x, size=1, act=None)
 y = fluid.layers.data(name='y', shape=[1], dtype='float32')
 cost = fluid.layers.square_error_cost(input=y_predict, label=y)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
 optimize_ops, params_grads = sgd_optimizer.minimize(avg_cost)
......
@@ -114,7 +114,7 @@ else:
 predict = fluid.layers.fc(input=net, size=classdim, act='softmax')
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 optimizer = fluid.optimizer.Adam(learning_rate=0.001)
 optimize_ops, params_grads = optimizer.minimize(avg_cost)
......
@@ -154,7 +154,7 @@ def main():
 label=target,
 param_attr=fluid.ParamAttr(
 name='crfw', learning_rate=mix_hidden_lr))
-avg_cost = fluid.layers.mean(x=crf_cost)
+avg_cost = fluid.layers.mean(crf_cost)
 # TODO(qiao)
 # check other optimizers and check why out will be NAN
......
@@ -65,7 +65,7 @@ concat_embed = fluid.layers.concat(
 hidden1 = fluid.layers.fc(input=concat_embed, size=HIDDEN_SIZE, act='sigmoid')
 predict_word = fluid.layers.fc(input=hidden1, size=dict_size, act='softmax')
 cost = fluid.layers.cross_entropy(input=predict_word, label=next_word)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
 optimize_ops, params_grads = sgd_optimizer.minimize(avg_cost)
 train_reader = paddle.batch(
......
@@ -94,7 +94,7 @@ def main():
 label = layers.data(
 name="target_language_next_word", shape=[1], dtype='int64', lod_level=1)
 cost = layers.cross_entropy(input=rnn_out, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 optimizer = fluid.optimizer.Adagrad(learning_rate=1e-4)
 optimize_ops, params_grads = optimizer.minimize(avg_cost)
......
@@ -37,7 +37,7 @@ conv_pool_2 = fluid.nets.simple_img_conv_pool(
 predict = fluid.layers.fc(input=conv_pool_2, size=10, act="softmax")
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 optimizer = fluid.optimizer.Adam(learning_rate=0.01)
 optimize_ops, params_grads = optimizer.minimize(avg_cost)
......
@@ -32,7 +32,7 @@ predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
 label = fluid.layers.data(name='y', shape=[1], dtype='int64')
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 optimizer = fluid.optimizer.Momentum(learning_rate=0.001, momentum=0.9)
 optimize_ops, params_grads = optimizer.minimize(avg_cost)
......
@@ -117,7 +117,7 @@ def model():
 label = layers.data(name='score', shape=[1], dtype='float32')
 square_cost = layers.square_error_cost(input=scale_infer, label=label)
-avg_cost = layers.mean(x=square_cost)
+avg_cost = layers.mean(square_cost)
 return avg_cost
......
@@ -38,7 +38,7 @@ def convolution_net(data, label, input_dim, class_dim=2, emb_dim=32,
 size=class_dim,
 act="softmax")
 cost = fluid.layers.cross_entropy(input=prediction, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 adam_optimizer = fluid.optimizer.Adam(learning_rate=0.002)
 optimize_ops, params_grads = adam_optimizer.minimize(avg_cost)
 accuracy = fluid.evaluator.Accuracy(input=prediction, label=label)
......
@@ -49,7 +49,7 @@ def stacked_lstm_net(data,
 size=class_dim,
 act='softmax')
 cost = fluid.layers.cross_entropy(input=prediction, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 adam_optimizer = fluid.optimizer.Adam(learning_rate=0.002)
 optimize_ops, params_grads = adam_optimizer.minimize(avg_cost)
 accuracy = fluid.evaluator.Accuracy(input=prediction, label=label)
......
@@ -30,7 +30,7 @@ y_predict = fluid.layers.fc(input=x, size=1, act=None)
 y = fluid.layers.data(name='y', shape=[1], dtype='float32')
 cost = fluid.layers.square_error_cost(input=y_predict, label=y)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.1)
 sgd_optimizer.minimize(avg_cost)
......
@@ -117,7 +117,7 @@ else:
 predict = fluid.layers.fc(input=net, size=classdim, act='softmax')
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 optimizer = fluid.optimizer.Adam(learning_rate=0.001)
 opts = optimizer.minimize(avg_cost)
......
@@ -100,7 +100,7 @@ def main():
 label = layers.data(
 name="target_language_next_word", shape=[1], dtype='int64', lod_level=1)
 cost = layers.cross_entropy(input=rnn_out, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 optimizer = fluid.optimizer.Adagrad(learning_rate=1e-4)
 optimizer.minimize(avg_cost)
......
@@ -96,7 +96,7 @@ def main():
 x=D(img),
 label=fluid.layers.data(
 name='label', shape=[1], dtype='float32'))
-d_loss = fluid.layers.mean(x=d_loss)
+d_loss = fluid.layers.mean(d_loss)
 with fluid.program_guard(dg_program, startup_program):
 noise = fluid.layers.data(
@@ -107,7 +107,7 @@ def main():
 x=D(g_img),
 label=fluid.layers.fill_constant_batch_size_like(
 input=noise, dtype='float32', shape=[-1, 1], value=1.0))
-dg_loss = fluid.layers.mean(x=dg_loss)
+dg_loss = fluid.layers.mean(dg_loss)
 opt = fluid.optimizer.Adam(learning_rate=LEARNING_RATE)
......
@@ -33,7 +33,7 @@ with fluid.program_guard(main_program=prog):
 label = fluid.layers.data(name='y', shape=[1], dtype='int64')
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 prog_clip = prog.clone()
 prog_clip.block(0).var(hidden1.name).set_error_clip(
......
@@ -30,7 +30,7 @@ with fluid.program_guard(main_program=prog):
 label = fluid.layers.data(name='y', shape=[1], dtype='int64')
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 prog_clip = prog.clone()
......
@@ -56,7 +56,7 @@ class TestMNISTIfElseOp(unittest.TestCase):
 prob = layers.merge_lod_tensor(
 in_true=true_out, in_false=false_out, mask=cond, x=image)
 loss = layers.cross_entropy(input=prob, label=label)
-avg_loss = layers.mean(x=loss)
+avg_loss = layers.mean(loss)
 optimizer = MomentumOptimizer(learning_rate=0.001, momentum=0.9)
 optimizer.minimize(avg_loss, startup_prog)
@@ -113,7 +113,7 @@ class TestMNISTIfElseOp(unittest.TestCase):
 prob = ie()
 loss = layers.cross_entropy(input=prob[0], label=label)
-avg_loss = layers.mean(x=loss)
+avg_loss = layers.mean(loss)
 optimizer = MomentumOptimizer(learning_rate=0.001, momentum=0.9)
 optimizer.minimize(avg_loss, startup_prog)
......
@@ -49,15 +49,15 @@ class TestArrayReadWrite(unittest.TestCase):
 i = layers.increment(x=i)
 a2 = layers.array_read(array=arr, i=i)
-mean_a0 = layers.mean(x=a0)
-mean_a1 = layers.mean(x=a1)
-mean_a2 = layers.mean(x=a2)
+mean_a0 = layers.mean(a0)
+mean_a1 = layers.mean(a1)
+mean_a2 = layers.mean(a2)
 a_sum = layers.sums(input=[mean_a0, mean_a1, mean_a2])
-mean_x0 = layers.mean(x=x[0])
-mean_x1 = layers.mean(x=x[1])
-mean_x2 = layers.mean(x=x[2])
+mean_x0 = layers.mean(x[0])
+mean_x1 = layers.mean(x[1])
+mean_x2 = layers.mean(x[2])
 x_sum = layers.sums(input=[mean_x0, mean_x1, mean_x2])
......
@@ -26,7 +26,7 @@ class TestCalcGradient(unittest.TestCase):
 x = layers.create_parameter(dtype="float32", shape=[5, 10])
 y = layers.create_parameter(dtype="float32", shape=[10, 8])
 mul_out = layers.mul(x=x, y=y)
-mean_out = layers.mean(x=mul_out)
+mean_out = layers.mean(mul_out)
 a = calc_gradient(mean_out, mul_out)
 b = calc_gradient(mean_out, x)
 place = fluid.CPUPlace()
......
@@ -39,7 +39,7 @@ class ConditionalBlock(unittest.TestCase):
 outs = exe.run(feed={'X': x}, fetch_list=[out])[0]
 print outs
-loss = layers.mean(x=out)
+loss = layers.mean(out)
 append_backward(loss=loss)
 outs = exe.run(
 feed={'X': x},
......
@@ -81,7 +81,7 @@ class TestDynRNN(unittest.TestCase):
 logits = fluid.layers.fc(input=last, size=1, act=None)
 loss = fluid.layers.sigmoid_cross_entropy_with_logits(
 x=logits, label=label)
-loss = fluid.layers.mean(x=loss)
+loss = fluid.layers.mean(loss)
 sgd = fluid.optimizer.SGD(1e-4)
 sgd.minimize(loss=loss)
 cpu = fluid.CPUPlace()
@@ -119,7 +119,7 @@ class TestDynRNN(unittest.TestCase):
 label = fluid.layers.data(name='label', shape=[1], dtype='float32')
 loss = fluid.layers.sigmoid_cross_entropy_with_logits(
 x=logits, label=label)
-loss = fluid.layers.mean(x=loss)
+loss = fluid.layers.mean(loss)
 sgd = fluid.optimizer.Adam(1e-3)
 sgd.minimize(loss=loss)
......
@@ -272,7 +272,7 @@ class TestSimpleMul(SeedFixedTestCase):
 out = rnn()
 out = fluid.layers.sequence_pool(out, pool_type='last')
-loss = fluid.layers.mean(x=out)
+loss = fluid.layers.mean(out)
 fluid.backward.append_backward(loss)
 cpu = fluid.CPUPlace()
@@ -348,7 +348,7 @@ class TestSimpleMulWithMemory(SeedFixedTestCase):
 out = rnn()
 last = fluid.layers.sequence_pool(input=out, pool_type='last')
-loss = fluid.layers.mean(x=last)
+loss = fluid.layers.mean(last)
 fluid.backward.append_backward(loss)
 cpu = fluid.CPUPlace()
......
@@ -125,7 +125,7 @@ class TestDyRnnStaticInput(unittest.TestCase):
 return static_input_step_outs
 last = fluid.layers.sequence_pool(input=rnn(), pool_type='last')
-loss = fluid.layers.mean(x=last)
+loss = fluid.layers.mean(last)
 append_backward(loss)
 static_input_grad = self._program.global_block().var(
 framework.grad_var_name('static_input_tensor'))
......
@@ -38,7 +38,7 @@ class TestBook(unittest.TestCase):
 y_predict = layers.fc(input=x, size=1, act=None)
 cost = layers.square_error_cost(input=y_predict, label=y)
-avg_cost = layers.mean(x=cost)
+avg_cost = layers.mean(cost)
 sgd_optimizer = optimizer.SGDOptimizer(learning_rate=0.001)
 sgd_optimizer.minimize(avg_cost, init_program)
......
@@ -30,7 +30,7 @@ class TestBook(unittest.TestCase):
 y_predict = layers.fc(input=x, size=1, act=None)
 y = layers.data(name='y', shape=[1], dtype='float32')
 cost = layers.square_error_cost(input=y_predict, label=y)
-avg_cost = layers.mean(x=cost)
+avg_cost = layers.mean(cost)
 self.assertIsNotNone(avg_cost)
 program.append_backward(avg_cost)
@@ -49,7 +49,7 @@ class TestBook(unittest.TestCase):
 act='softmax',
 param_attr=["sftmax.w1", "sftmax.w2"])
 cost = layers.cross_entropy(input=predict, label=label)
-avg_cost = layers.mean(x=cost)
+avg_cost = layers.mean(cost)
 self.assertIsNotNone(avg_cost)
 print(str(program))
@@ -92,7 +92,7 @@ class TestBook(unittest.TestCase):
 predict = layers.fc(input=conv_pool_2, size=10, act="softmax")
 cost = layers.cross_entropy(input=predict, label=label)
-avg_cost = layers.mean(x=cost)
+avg_cost = layers.mean(cost)
 program.append_backward(avg_cost)
@@ -140,7 +140,7 @@ class TestBook(unittest.TestCase):
 size=dict_size,
 act='softmax')
 cost = layers.cross_entropy(input=predict_word, label=next_word)
-avg_cost = layers.mean(x=cost)
+avg_cost = layers.mean(cost)
 self.assertIsNotNone(avg_cost)
 print(str(program))
@@ -287,7 +287,7 @@ class TestBook(unittest.TestCase):
 num_total_classes=dict_size,
 param_attr='nce.w',
 bias_attr='nce.b')
-avg_loss = layers.mean(x=loss)
+avg_loss = layers.mean(loss)
 self.assertIsNotNone(avg_loss)
 print(str(default_main_program()))
......
@@ -182,7 +182,7 @@ class TestCPULoDTensorArrayOpGrad(unittest.TestCase):
 array = layers.lod_tensor_to_array(x, table)
 result = layers.array_to_lod_tensor(array, table)
-mean = layers.mean(x=result)
+mean = layers.mean(result)
 append_backward(mean)
......
@@ -29,7 +29,7 @@ class TestControlFlowGraph(unittest.TestCase):
 y_predict = layers.fc(input=x, size=1, act=None)
 y = layers.data(name='y', shape=[1], dtype='float32')
 cost = layers.square_error_cost(input=y_predict, label=y)
-avg_cost = layers.mean(x=cost)
+avg_cost = layers.mean(cost)
 opt = optimizer.SGD(learning_rate=0.001)
 opt = opt.minimize(avg_cost)
......
@@ -127,7 +127,7 @@ class BaseParallelForTest(unittest.TestCase):
 data = next(generator)
 loss = generator.send(data)
 self.assertIsNotNone(loss)
-avg_loss = fluid.layers.mean(x=loss)
+avg_loss = fluid.layers.mean(loss)
 fluid.backward.append_backward(loss=avg_loss)
 exe = fluid.Executor(place)
@@ -170,7 +170,7 @@ class ParallelOpTest(BaseParallelForTest):
 x = fluid.layers.data(shape=[784], dtype='float32', name='img')
 x = yield x
 hidden = fluid.layers.fc(input=x, size=200, param_attr='fc1.w')
-loss = fluid.layers.mean(x=hidden)
+loss = fluid.layers.mean(hidden)
 yield loss
 def test_simple_fc(self):
@@ -200,7 +200,7 @@ class ParallelOpTestMultipleInput(BaseParallelForTest):
 hidden1 = fluid.layers.fc(input=x, size=200, param_attr='fc1.w')
 hidden2 = fluid.layers.fc(input=hidden1, size=200, param_attr='fc2.w')
 hidden3 = fluid.layers.fc(input=hidden2, size=200, param_attr='fc3.w')
-loss = fluid.layers.mean(x=hidden3)
+loss = fluid.layers.mean(hidden3)
 yield loss
 def test_simple_fc(self):
......
@@ -35,7 +35,7 @@ class TestPrintOpCPU(unittest.TestCase):
 x.stop_gradient = False
 printed = layers.Print(input=x, **kargs)
 if only_forward: return printed
-loss = layers.mean(x=printed)
+loss = layers.mean(printed)
 append_backward(loss=loss)
 return loss
......
@@ -54,7 +54,7 @@ class TestProfiler(unittest.TestCase):
 predict = fluid.layers.fc(input=hidden2, size=10, act='softmax')
 label = fluid.layers.data(name='y', shape=[1], dtype='int64')
 cost = fluid.layers.cross_entropy(input=predict, label=label)
-avg_cost = fluid.layers.mean(x=cost)
+avg_cost = fluid.layers.mean(cost)
 accuracy = fluid.evaluator.Accuracy(input=predict, label=label)
 optimizer = fluid.optimizer.Momentum(learning_rate=0.001, momentum=0.9)
......
@@ -127,7 +127,7 @@ class RecurrentOpTest1(unittest.TestCase):
 self.output_shape = (self.sent_len, self.batch_size, self.input_dim)
 self.py_rnn = PySimpleRNN1(self.input_shape, self.output_shape)
-self.output = layers.mean(x=self.create_rnn_op(), **self.p_info)
+self.output = layers.mean(self.create_rnn_op(), **self.p_info)
 def create_rnn_op(self):
 x = layers.data(
@@ -261,7 +261,7 @@ class RecurrentOpTest2(RecurrentOpTest1):
 self.output_shape = (self.sent_len, self.batch_size, self.input_dim)
 self.py_rnn = PySimpleRNN2(self.input_shape, self.output_shape)
-self.output = layers.mean(x=self.create_rnn_op(), **self.p_info)
+self.output = layers.mean(self.create_rnn_op(), **self.p_info)
 def create_rnn_op(self):
 x = layers.data(
@@ -360,7 +360,7 @@ class RecurrentOpMultipleMemoryTest(RecurrentOpTest1):
 self.py_rnn = RecurrentOpMultipleMemoryTest.PySimpleRNN3(
 self.input_shape, self.output_shape)
-self.output = layers.mean(x=self.create_rnn_op(), **self.p_info)
+self.output = layers.mean(self.create_rnn_op(), **self.p_info)
 def create_rnn_op(self):
 x = layers.data(
@@ -444,7 +444,7 @@ class RecurrentOpNoMemBootTest(RecurrentOpTest1):
 self.output_shape = (self.sent_len, self.batch_size, self.input_dim)
 self.py_rnn = RecurrentOpNoMemBootTest.PySimpleRNN4(self.input_shape,
 self.output_shape)
-self.output = layers.mean(x=self.create_rnn_op(), **self.p_info)
+self.output = layers.mean(self.create_rnn_op(), **self.p_info)
 print self.main_program
 def create_rnn_op(self):
......
@@ -22,7 +22,7 @@ class TestRegistry(unittest.TestCase):
 @decorators.prog_scope()
 def test_registry_layer(self):
 x = fluid.layers.data(name='X', shape=[10, 10], dtype='float32')
-output = fluid.layers.mean(x=x)
+output = fluid.layers.mean(x)
 place = fluid.CPUPlace()
 exe = fluid.Executor(place)
......
@@ -39,7 +39,7 @@ class TestShrinkRNNMemoryBase(unittest.TestCase):
 i = layers.increment(x=i)
 i.stop_gradient = True
 self.mem3 = layers.shrink_memory(x=self.mem2, i=i, table=table)
-mem3_mean = layers.mean(x=self.mem3)
+mem3_mean = layers.mean(self.mem3)
 append_backward(loss=mem3_mean)
 self.x_grad = self.main_program.global_block().var('x@GRAD')
......
@@ -145,7 +145,7 @@ class TestCPUSplitMergeLoDTensorGrad(unittest.TestCase):
 input=x, mask=y, level=level)
 out = layers.merge_lod_tensor(
 in_true=out_true, in_false=out_false, mask=y, x=x, level=level)
-mean = layers.mean(x=out)
+mean = layers.mean(out)
 append_backward(mean)
......
@@ -58,7 +58,7 @@ class TestWhileOp(unittest.TestCase):
 layers.less_than(x=i, y=array_len, cond=cond)
 sum_result = layers.array_read(array=mem_array, i=i)
-loss = layers.mean(x=sum_result)
+loss = layers.mean(sum_result)
 append_backward(loss)
......