提交 331bfd98 编写于 作者: G gx_wind

Merge remote-tracking branch 'upstream/develop' into develop

...@@ -49,12 +49,12 @@ def generate_copyright(template, lang='C'): ...@@ -49,12 +49,12 @@ def generate_copyright(template, lang='C'):
LANG_COMMENT_MARK = "//" LANG_COMMENT_MARK = "//"
lines = template.split(NEW_LINE_MARK) lines = template.split(NEW_LINE_MARK)
ans = LANG_COMMENT_MARK + COPYRIGHT_HEADER + NEW_LINE_MARK ans = LANG_COMMENT_MARK + " " + COPYRIGHT_HEADER + NEW_LINE_MARK
for lino, line in enumerate(lines): for lino, line in enumerate(lines):
if lino == 0 or lino == 1 or lino == len(lines) - 1: continue if lino == 0 or lino == 1 or lino == len(lines) - 1: continue
ans += LANG_COMMENT_MARK + line + NEW_LINE_MARK ans += LANG_COMMENT_MARK + " " + line + NEW_LINE_MARK
return ans return ans + "\n"
def lang_type(filename): def lang_type(filename):
...@@ -90,7 +90,7 @@ def main(argv=None): ...@@ -90,7 +90,7 @@ def main(argv=None):
retv = 0 retv = 0
for filename in args.filenames: for filename in args.filenames:
first_line = io.open(filename).readline() first_line = io.open(filename).readline()
if "Copyright" in first_line: continue if "COPYRIGHT" in first_line.upper() : continue
original_contents = io.open(filename).read() original_contents = io.open(filename).read()
new_contents = generate_copyright( new_contents = generate_copyright(
COPYRIGHT, lang_type(filename)) + original_contents COPYRIGHT, lang_type(filename)) + original_contents
......
# Contributor Covenant Code of Conduct
## Our Pledge
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at paddle-dev@baidu.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]
[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/
# 貢獻者公約
## 我們的承諾
為了促進一個開放透明且受歡迎的環境,我們作為貢獻者和維護者保證,無論年齡、種族、民族、性別認同和表達、體型、殘疾、經驗水平、國籍、個人表現、宗教或性別取向,在我們的專案以及社群的參與者都有不被騷擾的體驗。
## 我們的準則
舉例來說有助於創造正面環境的行為包括:
* 使用歡迎和包容性語言
* 尊重不同的觀點和經驗
* 優雅地接受建設性批評
* 關注在對於社群最好的事情上
* 對其他社群成員的表現友善
舉例來說身為參與者不能接受的行為包括:
* 使用與性有關的言語或是圖像,以及不受歡迎的性騷擾
* 酸民/反串/釣魚行為或進行侮辱/貶損的評論,人身攻擊及政治攻擊
* 公開或私下的騷擾
* 未經許可地發布他人的個人資料,例如住址或是電子地址
* 其他可以被合理地認定為不恰當或者違反職業操守的行為
## 我們的責任
專案維護者有責任為"可接受的行為"準則做出詮釋,以及對已發生的不被接受的行為採取恰當且公平的糾正措施。
專案維護者有權力及責任去刪除、編輯、拒絕與本行為準則有所違背的評論(comments)、提交(commits)、程式碼、wiki 編輯、問題(issues)和其他貢獻,以及專案維護者可暫時或永久性的禁止任何他們認為有不適當、威脅、冒犯、有害行為的貢獻者。
## 使用範圍
當一個人代表該專案或是其社群時,本行為準則適用於其專案平台和公共平台。
代表專案或是社群的情況,舉例來說包括使用官方專案的電子郵件地址、通過官方的社群媒體帳號發布或線上或線下事件中擔任指定代表。
該專案的呈現方式可由其專案維護者進行進一步的定義及解釋。
## 強制執行
可以透過paddle-dev@baidu.com,來聯繫專案團隊來報告濫用、騷擾或其他不被接受的行為。
任何維護團隊認為有必要且適合的所有投訴都將進行審查及調查,並做出相對應的回應。專案小組有對事件回報者有保密的義務。具體執行的方針近一步細節可能會單獨公佈。
沒有真誠的遵守或是執行本行為準則的專案維護人員,可能會因專案領導人或是其他成員的決定,暫時或是永久的取消其身份。
## 來源
本行為準則改編自[貢獻者公約][首頁],版本 1.4
可在此觀看https://www.contributor-covenant.org/zh-tw/version/1/4/code-of-conduct.html
[首頁]: https://www.contributor-covenant.org
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from six.moves import xrange # pylint: disable=redefined-builtin from six.moves import xrange # pylint: disable=redefined-builtin
from datetime import datetime from datetime import datetime
import math import math
......
...@@ -364,6 +364,12 @@ split ...@@ -364,6 +364,12 @@ split
.. autofunction:: paddle.v2.fluid.layers.split .. autofunction:: paddle.v2.fluid.layers.split
:noindex: :noindex:
matmul
------
.. autofunction:: paddle.v2.fluid.layers.matmul
:noindex:
logsigmoid logsigmoid
---------- ----------
.. autofunction:: paddle.v2.fluid.layers.logsigmoid .. autofunction:: paddle.v2.fluid.layers.logsigmoid
...@@ -493,3 +499,8 @@ swish ...@@ -493,3 +499,8 @@ swish
------ ------
.. autofunction:: paddle.v2.fluid.layers.swish .. autofunction:: paddle.v2.fluid.layers.swish
:noindex: :noindex:
l2_normalize
------------
.. autofunction:: paddle.v2.fluid.layers.l2_normalize
:noindex:
...@@ -25,3 +25,9 @@ glu ...@@ -25,3 +25,9 @@ glu
.. autofunction:: paddle.v2.fluid.nets.glu .. autofunction:: paddle.v2.fluid.nets.glu
:noindex: :noindex:
dot_product_attention
---------------------
.. autofunction:: paddle.v2.fluid.nets.dot_product_attention
:noindex:
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import paddle.v2 as paddle import paddle.v2 as paddle
import numpy as np import numpy as np
......
...@@ -19,7 +19,7 @@ ...@@ -19,7 +19,7 @@
### 基本使用概念 ### 基本使用概念
- 在PaddlePaddle内部,神经网络中一个计算层的输入/输出被组织为一个 `Argument` 结构体,如果神经网络有多个输入或者多个输入,每一个输入/输入都会对应有自己的`Argument` - 在PaddlePaddle内部,神经网络中一个计算层的输入/输出被组织为一个 `Argument` 结构体,如果神经网络有多个输入或者多个输出,每一个输入/输出都会对应有自己的`Argument`
- `Argument` 并不真正“存储”数据,而是将输入/输出信息有机地组织在一起。 - `Argument` 并不真正“存储”数据,而是将输入/输出信息有机地组织在一起。
-`Argument`内部由`IVector`(对应着上文提到的一维整型数组)和`Matrix`(对应着上文提到的二维浮点型矩阵)来实际存储数据;由 `Sequence Start Positions` (下文详细解释) 来描述输入/输出的序列信息。 -`Argument`内部由`IVector`(对应着上文提到的一维整型数组)和`Matrix`(对应着上文提到的二维浮点型矩阵)来实际存储数据;由 `Sequence Start Positions` (下文详细解释) 来描述输入/输出的序列信息。
......
# Fluid Distributed Training
## Introduction
In this article, we'll explain how to config and run distributed training jobs with PaddlePaddle Fluid in a bare metal cluster.
## Preparations
### Get your cluster ready
Prepare your computer nodes in the cluster. Nodes in this cluster can be of any specification that runs PaddlePaddle, and with a unique IP address assigned to it. Make sure they can communicate with each other.
### Have PaddlePaddle installed
PaddlePaddle must be installed on all nodes. If you have GPU cards on your nodes, be sure to properly install drivers and CUDA libraries.
PaddlePaddle build and installation guide can be found from [here](http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/build_and_install/index_en.html).
### Update training script
#### Non-cluster training script
Let's take [Deep Learning 101](http://www.paddlepaddle.org/docs/develop/book/01.fit_a_line/index.html)'s first chapter: "fit a line" as an example.
This demo's non-cluster version with fluid API is as follows:
``` python
import paddle.v2 as paddle
import paddle.v2.fluid as fluid
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_cost = fluid.layers.mean(x=cost)
sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
sgd_optimizer.minimize(avg_cost)
BATCH_SIZE = 20
train_reader = paddle.batch(
paddle.reader.shuffle(
paddle.dataset.uci_housing.train(), buf_size=500),
batch_size=BATCH_SIZE)
place = fluid.CPUPlace()
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
PASS_NUM = 100
for pass_id in range(PASS_NUM):
fluid.io.save_persistables(exe, "./fit_a_line.model/")
fluid.io.load_persistables(exe, "./fit_a_line.model/")
for data in train_reader():
avg_loss_value, = exe.run(fluid.default_main_program(),
feed=feeder.feed(data),
fetch_list=[avg_cost])
if avg_loss_value[0] < 10.0:
exit(0) # if avg cost less than 10.0, we think our code is good.
exit(1)
```
We created a simple fully connected neural networks training program and handed it to the fluid executor to run for 100 passes.
Now let's try to convert it to a distributed version to run in a cluster.
#### Introducing parameter server
As you see from the non-cluster version of training script, there is only one role in it: the trainer, who does the computing as well as holding parameters. In cluster training, since multi-trainers are working on the same task, they need one centralized place to hold and distribute parameters. This centralized place is called the Parameter Server in PaddlePaddle.
![parameter server architect](src/trainer.png)
Parameter Server in fluid does not only hold parameters but is also assigned with a part of the program. Trainers communicate with parameter servers via send/receive OPs. For more tech detail, please refer to this [document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/dist_refactor/distributed_architecture.md).
Now we need to create program for both trainers and parameter servers, the question is how?
#### Slice the program
Fluid provides a tool called "Distribute Transpiler" to automatically convert the non-cluster program into cluster program.
The idea behind this tool is to find optimize OPs and gradient parameters, slice the program into 2 pieces and connect them with send/receive OP.
Optimize OPs and gradient parameters can be found from the return values of optimizer's minimize function.
To put them together:
``` python
... #define the program, cost, and create sgd optimizer
optimize_ops, params_grads = sgd_optimizer.minimize(avg_cost) #get optimize OPs and gradient parameters
t = fluid.DistributeTranspiler() # create transpiler instance
# slice the program into 2 pieces with optimizer_ops and gradient parameters list, as well as pserver_endpoints, which is a comma separated list of [IP:PORT] and number of trainers
t.transpile(optimize_ops, params_grads, pservers=pserver_endpoints, trainers=2)
... #create executor
# in pserver, run this
exe.run(fluid.default_startup_program())
#current_endpoint here means current pserver IP:PORT you wish to run on
exe.run(t.get_pserver_program(current_endpoint, optimize_ops))
# in trainer, run this
... # define data reader
exe.run(fluid.default_startup_program())
for pass_id in range(100):
for data in train_reader():
exe.run(t.get_trainer_program())
```
### E2E demo
Please find the complete demo from [here](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/tests/book_distribute/notest_dist_fit_a_line.py). In parameter server node run this in the command line:
``` bash
PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=PSERVER python notest_dist_fit_a_line.py
```
*please note we assume that your parameter server runs at 192.168.1.2:6174*
Wait until the prompt `Server listening on 192.168.1.2:6174`
Then in 2 of your trainer node run this:
``` bash
PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=TRAINER python notest_dist_fit_a_line.py
```
*the reason you need to run this command twice in 2 nodes is: in the script we set the trainer count to be 2. You can change this setting on line 50*
Now you have 2 trainers and 1 parameter server up and running.
...@@ -135,6 +135,65 @@ bool operator==(const LoD &a, const LoD &b) { ...@@ -135,6 +135,65 @@ bool operator==(const LoD &a, const LoD &b) {
return true; return true;
} }
bool CheckLoD(const LoD &in, int tensor_height) {
if (in.empty()) return true;
for (const auto &level : in) {
// check: there should be more than 2 offsets existing in each level.
if (level.size() < 2) return false;
// check: the first offset(the begin offset) of each level should be 0.
if (level.front() != 0) return false;
// check: all the offsets in a level should be ascending(no same items
// allows).
if (!std::is_sorted(level.begin(), level.begin(), [](size_t a, size_t b) {
if (a < b) return true;
return false;
})) {
LOG(INFO) << "ascending error";
return false;
}
}
// check: the lowest level's last offset should equals `tensor_height` if
// tensor_height>0.
if (tensor_height > 0 && (size_t)tensor_height != in.back().back())
return false;
// check: the higher level's last offset should equals the lower level's
// size-1.
// NOTE LoD store the levels from top to bottom, so the higher level goes
// first.
for (size_t level = 0; level < in.size() - 1; level++) {
if (in[level].back() != in[level + 1].size() - 1) return false;
}
return true;
}
bool CheckAbsLoD(const LoD &in, int tensor_height) {
if (in.empty()) return true;
for (const auto &level : in) {
// check: all the offsets in a level should be ascending(no same items
// allows).
if (!std::is_sorted(level.begin(), level.begin(), [](size_t a, size_t b) {
if (a < b) return true;
return false;
})) {
return false;
}
// check: there should be more than 2 offsets existing in each level.
if (level.size() < 2) return false;
// check: the first offset of each level should be 0, and the last should be
// the same(the height of underlying tensor).
if (level.front() != 0) return false;
if (tensor_height < 0) {
tensor_height = level.back();
} else if ((size_t)tensor_height != level.back()) {
return false;
}
}
return true;
}
using LoDAndOffset = std::pair<LoD, std::pair<size_t, size_t>>; using LoDAndOffset = std::pair<LoD, std::pair<size_t, size_t>>;
LoDAndOffset GetSubLoDAndAbsoluteOffset(const LoD &lod, size_t start_idx, LoDAndOffset GetSubLoDAndAbsoluteOffset(const LoD &lod, size_t start_idx,
size_t end_idx, size_t start_level) { size_t end_idx, size_t start_level) {
...@@ -232,23 +291,32 @@ std::vector<LoDTensor> LoDTensor::SplitLoDTensor( ...@@ -232,23 +291,32 @@ std::vector<LoDTensor> LoDTensor::SplitLoDTensor(
const std::vector<platform::Place> places) const { const std::vector<platform::Place> places) const {
check_memory_size(); check_memory_size();
PADDLE_ENFORCE(lod().empty(), "Disable parallel lod for now"); PADDLE_ENFORCE(lod().empty(), "Disable parallel lod for now");
PADDLE_ENFORCE(dims()[0] % places.size() == 0, size_t result_size = std::min(static_cast<size_t>(dims()[0]), places.size());
"Batch size should be divided by places size"); size_t remainder = dims()[0] % places.size();
std::vector<LoDTensor> lods; std::vector<LoDTensor> results;
for (size_t place_idx = 0; place_idx < places.size(); ++place_idx) { results.reserve(result_size);
int begin = place_idx * dims()[0] / places.size();
int end = (place_idx + 1) * dims()[0] / places.size(); int step_width = static_cast<int>(dims()[0] / result_size);
for (size_t i = 0; i < result_size; ++i) {
int begin = static_cast<int>(i * step_width);
int end = static_cast<int>((i + 1) * step_width);
if (i + 1 == places.size()) { // last
end += remainder;
}
auto src = Slice(begin, end); auto src = Slice(begin, end);
auto &dst_place = places[place_idx]; auto &dst_place = places[i];
LoDTensor dst; LoDTensor dst;
if (!(dst_place == place())) {
framework::Copy(src, dst_place, &dst); framework::Copy(src, dst_place, &dst);
} else { // It is no need to copy if src_place and dst_place are same.
lods.emplace_back(dst); dst.ShareDataWith(src);
}
results.emplace_back(dst);
} }
return lods; return results;
} }
// TODO(tonyyang-svail): make this function support LoD // TODO(tonyyang-svail): make this function support LoD
...@@ -259,12 +327,17 @@ void LoDTensor::MergeLoDTensor( ...@@ -259,12 +327,17 @@ void LoDTensor::MergeLoDTensor(
framework::DDim new_dim = lod_tensors[0]->dims(); framework::DDim new_dim = lod_tensors[0]->dims();
std::type_index new_type = lod_tensors[0]->type(); std::type_index new_type = lod_tensors[0]->type();
auto new_layout = lod_tensors[0]->layout(); auto new_layout = lod_tensors[0]->layout();
int64_t new_height = 0;
for (auto *lod : lod_tensors) { for (auto *lod : lod_tensors) {
PADDLE_ENFORCE(new_dim == lod->dims()); new_height += lod->dims()[0];
PADDLE_ENFORCE(new_type == lod->type()); for (int i = 1; i < new_dim.size(); ++i) {
PADDLE_ENFORCE(new_layout == lod->layout()); PADDLE_ENFORCE_EQ(new_dim[i], lod->dims()[i]);
}
PADDLE_ENFORCE_EQ(new_type, lod->type());
PADDLE_ENFORCE_EQ(new_layout, lod->layout());
} }
new_dim[0] *= lod_tensors.size(); new_dim[0] = new_height;
Resize(new_dim); Resize(new_dim);
set_layout(new_layout); set_layout(new_layout);
......
...@@ -71,6 +71,38 @@ LoD ToAbsOffset(const LoD& in); ...@@ -71,6 +71,38 @@ LoD ToAbsOffset(const LoD& in);
bool operator==(const LoD& a, const LoD& b); bool operator==(const LoD& a, const LoD& b);
/*
* Check whether this lod's format is valid.
*
* ATTENTION:
* - Empty lod is treated as valid.
*
* It will check two things:
*
* 1. all the offsets in a level should be ascending(no same items allows).
* 2. there should be more than 2 offsets existing in each level.
* 3. the higher level's last offset should equals the lower level's size-1.
* 4. the first offset(the begin offset) of each level should be 0.
* 5. the lowest level's last offset should equals `tensor_height` if
* tensor_height>0.
*/
bool CheckLoD(const LoD& in, int tensor_height = -1);
/*
* Check whether this absolute lod's format is valid.
*
* ATTENTION:
* - Empty lod is treated as valid.
*
* It will check two things:
* 1. all the offsets in a level should be ascending(no same items allows)
* 2. there should be more than 2 offsets existing in each level.
* 3. the first offset of each level should be 0, and the last should be the
* same(the height of underlying tensor) or `tensor_height` if
* tensor_height>0.
*/
bool CheckAbsLoD(const LoD& in, int tensor_height = -1);
/* /*
* LoDTensor (Level of details Tensor) * LoDTensor (Level of details Tensor)
* see https://en.wikipedia.org/wiki/Level_of_details for reference. * see https://en.wikipedia.org/wiki/Level_of_details for reference.
......
// Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. /* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
//
// Licensed under the Apache License, Version 2.0 (the "License"); Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License. you may not use this file except in compliance with the License.
// You may obtain a copy of the License at You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
/*
Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0 http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, Unless required by applicable law or agreed to in writing, software
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. distributed under the License is distributed on an "AS IS" BASIS,
See the License for the specific language governing permissions and WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
limitations under the License. See the License for the specific language governing permissions and
*/ limitations under the License. */
#include "paddle/framework/lod_tensor.h" #include "paddle/framework/lod_tensor.h"
...@@ -35,38 +23,6 @@ ...@@ -35,38 +23,6 @@
namespace paddle { namespace paddle {
namespace framework { namespace framework {
const int kLodTensorSize = 20 * 128;
class LoDTensorTester : public ::testing::Test {
public:
virtual void SetUp() override {
// tensor's batch_size: 30
// 3 levels
// 0 10 20
// 0 5 10 15 20
// 0 2 5 7 10 12 15 20
LoD lod;
lod.push_back(std::vector<size_t>{0, 2, 3});
lod.push_back(std::vector<size_t>{0, 2, 5, 8});
lod.push_back(std::vector<size_t>{0, 2, 5, 7, 10, 12, 15, 17, 20});
ASSERT_EQ(lod.size(), 3UL);
lod_tensor_.Resize({20 /*batch size*/, 128 /*dim*/});
// malloc memory
float* dst_ptr = lod_tensor_.mutable_data<float>(place);
for (int i = 0; i < kLodTensorSize; ++i) {
dst_ptr[i] = i;
}
lod_tensor_.set_lod(lod);
}
protected:
platform::CPUPlace place;
LoDTensor lod_tensor_;
};
TEST(LodExpand, test) { TEST(LodExpand, test) {
LoD lod{{0, 2}}; LoD lod{{0, 2}};
LoDTensor tensor; LoDTensor tensor;
...@@ -144,5 +100,53 @@ TEST(LoD, ToAbsOffset) { ...@@ -144,5 +100,53 @@ TEST(LoD, ToAbsOffset) {
EXPECT_EQ(abs_lod, expected); EXPECT_EQ(abs_lod, expected);
} }
TEST(LoD, CheckLoD) {
LoD relative_lod;
relative_lod.push_back(std::vector<size_t>({0, 2}));
relative_lod.push_back(std::vector<size_t>({0, 1, 3}));
relative_lod.push_back(std::vector<size_t>({0, 2, 4, 5}));
// check compatible
ASSERT_TRUE(CheckLoD(relative_lod));
relative_lod[1].back()++;
ASSERT_FALSE(CheckLoD(relative_lod));
relative_lod[1].back()--; // recover it
// check empty
LoD empty_lod;
ASSERT_TRUE(CheckLoD(empty_lod));
// check less than 2 offsets in a level
LoD some_lod0;
some_lod0.push_back(std::vector<size_t>({0}));
ASSERT_FALSE(CheckLoD(some_lod0));
// check with underlying tensor storage.
ASSERT_TRUE(CheckLoD(relative_lod, 5));
ASSERT_FALSE(CheckLoD(relative_lod, 9));
}
TEST(LoD, CheckAbsLoD) {
LoD relative_lod;
relative_lod.push_back(std::vector<size_t>({0, 2}));
relative_lod.push_back(std::vector<size_t>({0, 1, 3}));
relative_lod.push_back(std::vector<size_t>({0, 2, 4, 5}));
auto abs_lod = ToAbsOffset(relative_lod);
ASSERT_TRUE(CheckAbsLoD(abs_lod));
// check less than 2 offsets in a level.
// check the last item should be compatible with tensor height.
abs_lod.back().back()++;
ASSERT_FALSE(CheckAbsLoD(abs_lod));
abs_lod.back().back()--; // restore
// check less than 2 offsets in a lod.
LoD abs_lod0;
abs_lod0.push_back(std::vector<size_t>({0}));
ASSERT_FALSE(CheckAbsLoD(abs_lod0));
}
} // namespace framework } // namespace framework
} // namespace paddle } // namespace paddle
...@@ -177,15 +177,15 @@ class OpKernelRegistrar : public Registrar { ...@@ -177,15 +177,15 @@ class OpKernelRegistrar : public Registrar {
/** /**
* Macro to register OperatorKernel. * Macro to register OperatorKernel.
*/ */
#define REGISTER_OP_KERNEL(op_type, DEVICE_TYPE, place_class, ...) \ #define REGISTER_OP_KERNEL(op_type, LIBRARY_TYPE, place_class, ...) \
STATIC_ASSERT_GLOBAL_NAMESPACE( \ STATIC_ASSERT_GLOBAL_NAMESPACE( \
__reg_op_kernel_##op_type##_##DEVICE_TYPE##__, \ __reg_op_kernel_##op_type##_##LIBRARY_TYPE##__, \
"REGISTER_OP_KERNEL must be called in global namespace"); \ "REGISTER_OP_KERNEL must be called in global namespace"); \
static ::paddle::framework::OpKernelRegistrar<place_class, __VA_ARGS__> \ static ::paddle::framework::OpKernelRegistrar<place_class, __VA_ARGS__> \
__op_kernel_registrar_##op_type##_##DEVICE_TYPE##__(#op_type, \ __op_kernel_registrar_##op_type##_##LIBRARY_TYPE##__(#op_type, \
#DEVICE_TYPE); \ #LIBRARY_TYPE); \
int TouchOpKernelRegistrar_##op_type##_##DEVICE_TYPE() { \ int TouchOpKernelRegistrar_##op_type##_##LIBRARY_TYPE() { \
__op_kernel_registrar_##op_type##_##DEVICE_TYPE##__.Touch(); \ __op_kernel_registrar_##op_type##_##LIBRARY_TYPE##__.Touch(); \
return 0; \ return 0; \
} }
...@@ -208,14 +208,14 @@ class OpKernelRegistrar : public Registrar { ...@@ -208,14 +208,14 @@ class OpKernelRegistrar : public Registrar {
static int use_op_itself_##op_type##_ __attribute__((unused)) = \ static int use_op_itself_##op_type##_ __attribute__((unused)) = \
TouchOpRegistrar_##op_type() TouchOpRegistrar_##op_type()
#define USE_OP_DEVICE_KERNEL(op_type, DEVICE_TYPE) \ #define USE_OP_DEVICE_KERNEL(op_type, LIBRARY_TYPE) \
STATIC_ASSERT_GLOBAL_NAMESPACE( \ STATIC_ASSERT_GLOBAL_NAMESPACE( \
__use_op_kernel_##op_type##_##DEVICE_TYPE##__, \ __use_op_kernel_##op_type##_##LIBRARY_TYPE##__, \
"USE_OP_DEVICE_KERNEL must be in global namespace"); \ "USE_OP_DEVICE_KERNEL must be in global namespace"); \
extern int TouchOpKernelRegistrar_##op_type##_##DEVICE_TYPE(); \ extern int TouchOpKernelRegistrar_##op_type##_##LIBRARY_TYPE(); \
static int use_op_kernel_##op_type##_##DEVICE_TYPE##_ \ static int use_op_kernel_##op_type##_##LIBRARY_TYPE##_ \
__attribute__((unused)) = \ __attribute__((unused)) = \
TouchOpKernelRegistrar_##op_type##_##DEVICE_TYPE() TouchOpKernelRegistrar_##op_type##_##LIBRARY_TYPE()
// TODO(fengjiayi): The following macros // TODO(fengjiayi): The following macros
// seems ugly, do we have better method? // seems ugly, do we have better method?
......
...@@ -43,7 +43,7 @@ void MKLDNNConcatLayer::reshape( ...@@ -43,7 +43,7 @@ void MKLDNNConcatLayer::reshape(
channels_[0] = ic; channels_[0] = ic;
oc = ic; oc = ic;
for (size_t i = 1; i < inputLayers_.size(); i++) { for (size_t i = 1; i < inputLayers_.size(); i++) {
int batchsize, height, witdh; int batchsize = 0, height = 0, witdh = 0;
reshapeInput(batchsize, height, witdh, i); reshapeInput(batchsize, height, witdh, i);
CHECK_EQ(bs, batchsize); CHECK_EQ(bs, batchsize);
CHECK_EQ(ih, height); CHECK_EQ(ih, height);
...@@ -84,6 +84,7 @@ void MKLDNNConcatLayer::resetFwdBuffers(std::vector<MKLDNNMatrixPtr>& inputs, ...@@ -84,6 +84,7 @@ void MKLDNNConcatLayer::resetFwdBuffers(std::vector<MKLDNNMatrixPtr>& inputs,
bool has8c = false, has16c = false, hasnc = false; bool has8c = false, has16c = false, hasnc = false;
for (size_t i = 0; i < inputs.size(); i++) { for (size_t i = 0; i < inputs.size(); i++) {
resetInValue(inputs[i], nullptr, i, channels_[i]); resetInValue(inputs[i], nullptr, i, channels_[i]);
inputs[i]->downSpatial();
CHECK(inputs[i]); CHECK(inputs[i]);
auto dm = inputs[i]->getDims(); auto dm = inputs[i]->getDims();
// inputs format can be different, but ndims must equal // inputs format can be different, but ndims must equal
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
...@@ -11,20 +11,6 @@ ...@@ -11,20 +11,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
#limitations under the License. #limitations under the License.
#edit-mode: -*- python -*-
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
...@@ -11,20 +11,6 @@ ...@@ -11,20 +11,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
#limitations under the License. #limitations under the License.
#edit-mode: -*- python -*-
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
......
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
# You may obtain a copy of the License at #You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software #Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, #distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
# limitations under the License. #limitations under the License.
import numpy import numpy
import struct import struct
import traceback import traceback
......
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
# You may obtain a copy of the License at #You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software #Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, #distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
# limitations under the License. #limitations under the License.
from paddle.trainer.PyDataProvider2 import * from paddle.trainer.PyDataProvider2 import *
# Note that each config should has an independent provider # Note that each config should has an independent provider
......
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
# You may obtain a copy of the License at #You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software #Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, #distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
# limitations under the License. #limitations under the License.
import os import os
import sys import sys
......
# edit-mode: -*- python -*- # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
# You may obtain a copy of the License at #You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software #Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, #distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
# limitations under the License. #limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
######################## data source ################################ ######################## data source ################################
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
...@@ -11,20 +11,6 @@ ...@@ -11,20 +11,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
#limitations under the License. #limitations under the License.
#!/usr/bin/env python
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
......
#!/usr/bin/env python # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
# #
# Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
...@@ -12,7 +11,6 @@ ...@@ -12,7 +11,6 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
# limitations under the License. # limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
######################## data source ################################ ######################## data source ################################
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
...@@ -11,20 +11,6 @@ ...@@ -11,20 +11,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
#limitations under the License. #limitations under the License.
# edit-mode: -*- python -*-
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
...@@ -11,20 +11,6 @@ ...@@ -11,20 +11,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
#limitations under the License. #limitations under the License.
# edit-mode: -*- python -*-
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
...@@ -11,20 +11,6 @@ ...@@ -11,20 +11,6 @@
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
#limitations under the License. #limitations under the License.
#edit-mode: -*- python -*-
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import * from paddle.trainer_config_helpers import *
......
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved # Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License. #you may not use this file except in compliance with the License.
# You may obtain a copy of the License at #You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
# Unless required by applicable law or agreed to in writing, software #Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, #distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and #See the License for the specific language governing permissions and
# limitations under the License. #limitations under the License.
import random import random
from paddle.trainer.PyDataProvider2 import * from paddle.trainer.PyDataProvider2 import *
......
...@@ -178,14 +178,13 @@ foreach(src ${GENERAL_OPS}) ...@@ -178,14 +178,13 @@ foreach(src ${GENERAL_OPS})
endforeach() endforeach()
file(APPEND ${pybind_file} "USE_OP(less_than);\nUSE_OP(logical_and);\nUSE_NO_KERNEL_OP(read_from_array);\n") file(APPEND ${pybind_file} "USE_OP(less_than);\nUSE_OP(logical_and);\nUSE_NO_KERNEL_OP(read_from_array);\n")
set(GLOB_OP_LIB ${OP_LIBRARY} CACHE INTERNAL "Global OP library") set(GLOB_OP_LIB ${OP_LIBRARY} CACHE INTERNAL "Global OP library")
cc_test(gather_test SRCS gather_test.cc DEPS tensor) cc_test(gather_test SRCS gather_test.cc DEPS tensor)
cc_test(net_op_test SRCS net_op_test.cc DEPS net_op) cc_test(net_op_test SRCS net_op_test.cc DEPS net_op)
cc_test(scatter_test SRCS scatter_test.cc DEPS tensor) cc_test(scatter_test SRCS scatter_test.cc DEPS tensor)
cc_test(beam_search_decode_op_test SRCS beam_search_decode_op_test.cc DEPS lod_tensor) cc_test(beam_search_decode_op_test SRCS beam_search_decode_op_test.cc DEPS lod_tensor)
cc_test(beam_search_op_test SRCS beam_search_op_test.cc DEPS lod_tensor beam_search_op)
cc_test(strided_memcpy_test SRCS strided_memcpy_test.cc DEPS tensor paddle_memory) cc_test(strided_memcpy_test SRCS strided_memcpy_test.cc DEPS tensor paddle_memory)
if(WITH_GPU) if(WITH_GPU)
cc_test(nccl_op_test SRCS nccl_op_test.cu.cc DEPS nccl_op gpu_info device_context) cc_test(nccl_op_test SRCS nccl_op_test.cu.cc DEPS nccl_op gpu_info device_context)
......
...@@ -29,7 +29,7 @@ void BeamSearch::operator()(const framework::LoDTensor &pre_ids, ...@@ -29,7 +29,7 @@ void BeamSearch::operator()(const framework::LoDTensor &pre_ids,
PruneEndidCandidates(pre_ids, &selected_items); PruneEndidCandidates(pre_ids, &selected_items);
// calculate the output tensor's height // calculate the output tensor's height
size_t num_instances = std::accumulate( size_t num_instances = std::accumulate(
std::begin(items), std::end(items), 0, std::begin(selected_items), std::end(selected_items), 0,
[](size_t a, std::vector<Item> &b) { return a + b.size(); }); [](size_t a, std::vector<Item> &b) { return a + b.size(); });
// the output tensor shape should be [num_instances, 1] // the output tensor shape should be [num_instances, 1]
auto dims = framework::make_ddim( auto dims = framework::make_ddim(
...@@ -48,12 +48,20 @@ void BeamSearch::operator()(const framework::LoDTensor &pre_ids, ...@@ -48,12 +48,20 @@ void BeamSearch::operator()(const framework::LoDTensor &pre_ids,
size_t low_offset = 0; size_t low_offset = 0;
for (auto &items : selected_items) { for (auto &items : selected_items) {
low_level.push_back(low_offset); low_level.push_back(low_offset);
sort(items.begin(), items.end(), [](const Item &a, const Item &b) {
if (a.offset < b.offset) {
return true;
}
return a.id < b.id;
});
for (auto &item : items) { for (auto &item : items) {
ids_data[low_offset] = item.id; ids_data[low_offset] = item.id;
scores_data[low_offset] = item.score; scores_data[low_offset] = item.score;
low_offset++; low_offset++;
} }
} }
low_level.push_back(low_offset);
// fill lod // fill lod
auto abs_lod = framework::ToAbsOffset(ids_->lod()); auto abs_lod = framework::ToAbsOffset(ids_->lod());
auto &high_level = abs_lod[lod_level_]; auto &high_level = abs_lod[lod_level_];
...@@ -64,16 +72,21 @@ void BeamSearch::operator()(const framework::LoDTensor &pre_ids, ...@@ -64,16 +72,21 @@ void BeamSearch::operator()(const framework::LoDTensor &pre_ids,
selected_scores->set_lod(lod); selected_scores->set_lod(lod);
} }
void BeamSearch::PruneEndidCandidates(const framework::LoDTensor &pre_ids, int BeamSearch::PruneEndidCandidates(const framework::LoDTensor &pre_ids,
std::vector<std::vector<Item>> *items) { std::vector<std::vector<Item>> *items) {
auto *pre_ids_data = pre_ids.data<int64_t>(); auto *pre_ids_data = pre_ids.data<int64_t>();
int res = 0;
for (size_t offset = 0; offset < items->size(); offset++) { for (size_t offset = 0; offset < items->size(); offset++) {
auto prefix_id = pre_ids_data[offset]; auto prefix_id = pre_ids_data[offset];
if (prefix_id == end_id_) { if (prefix_id == end_id_) {
items->at(offset).clear(); items->at(offset).clear();
} else {
res++;
} }
} }
return res;
} }
std::vector<std::vector<BeamSearch::Item>> BeamSearch::ToMap( std::vector<std::vector<BeamSearch::Item>> BeamSearch::ToMap(
...@@ -121,11 +134,7 @@ bool BeamSearch::NextItemSet(std::vector<BeamSearch::Item> *items) { ...@@ -121,11 +134,7 @@ bool BeamSearch::NextItemSet(std::vector<BeamSearch::Item> *items) {
auto ids = *ids_; auto ids = *ids_;
auto scores = *scores_; auto scores = *scores_;
auto source_abs_two_level_lod = framework::SliceInLevel(
ids.lod(), lod_level_, sent_offset_, sent_offset_ + 1);
source_abs_two_level_lod = framework::ToAbsOffset(source_abs_two_level_lod);
auto abs_lod = framework::ToAbsOffset(ids.lod()); auto abs_lod = framework::ToAbsOffset(ids.lod());
PADDLE_ENFORCE_GE(source_abs_two_level_lod.size(), 2UL);
auto *ids_data = ids.data<int64_t>(); auto *ids_data = ids.data<int64_t>();
auto *scores_data = scores.data<float>(); auto *scores_data = scores.data<float>();
......
...@@ -73,7 +73,15 @@ namespace operators { ...@@ -73,7 +73,15 @@ namespace operators {
* second level: * second level:
* [0, 2, 4] * [0, 2, 4]
* *
* tensor's data * id tensor's data
* [[
* 4,
* 1,
* 3,
* 8,
* ]]
*
* score tensor's data
* [[ * [[
* 0.5, * 0.5,
* 0.3, * 0.3,
...@@ -137,15 +145,20 @@ class BeamSearch { ...@@ -137,15 +145,20 @@ class BeamSearch {
Item() {} Item() {}
Item(size_t offset, size_t id, float score) Item(size_t offset, size_t id, float score)
: offset(offset), id(id), score(score) {} : offset(offset), id(id), score(score) {}
// offset in the lod_level_+1 // offset in the higher lod level.
size_t offset; size_t offset;
// // prefix id in the lower lod level.
// size_t prefix;
// the candidate id // the candidate id
id_t id; id_t id;
// the corresponding score // the corresponding score
score_t score; score_t score;
}; };
void PruneEndidCandidates(const framework::LoDTensor& pre_ids, /*
* Delete all the records that follows the end token.
*/
int PruneEndidCandidates(const framework::LoDTensor& pre_ids,
std::vector<std::vector<Item>>* items); std::vector<std::vector<Item>>* items);
/* /*
......
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/beam_search_op.h"
#include <gtest/gtest.h>
#include <vector>
namespace paddle {
namespace test {
using std::vector;
using framework::LoDTensor;
using framework::LoD;
using operators::BeamSearch;
using paddle::platform::CPUPlace;
using std::cout;
using std::endl;
void CreateInput(LoDTensor* ids, LoDTensor* scores) {
LoD lod;
vector<size_t> level0({0, 1, 4});
vector<size_t> level1({0, 1, 2, 3, 4});
lod.push_back(level0);
lod.push_back(level1);
ids->set_lod(lod);
scores->set_lod(lod);
auto dims = framework::make_ddim(vector<int64_t>({4, 3}));
ids->Resize(dims);
scores->Resize(dims);
CPUPlace place;
auto* ids_data = ids->mutable_data<int64_t>(place);
auto* scores_data = scores->mutable_data<float>(place);
vector<int64_t> _ids({4, 2, 5, 2, 1, 3, 3, 5, 2, 8, 2, 1});
vector<float> _scores(
{0.5, 0.3, 0.2, 0.6, 0.3, 0.1, 0.9, 0.5, 0.1, 0.7, 0.5, 0.1});
for (int i = 0; i < 12; i++) {
ids_data[i] = _ids[i];
scores_data[i] = _scores[i];
}
}
TEST(beam_search_op, run) {
CPUPlace place;
LoDTensor ids, scores;
CreateInput(&ids, &scores);
LoDTensor pre_ids;
pre_ids.Resize(framework::make_ddim(vector<int64_t>(4, 1)));
for (int i = 0; i < 4; i++) {
pre_ids.mutable_data<int64_t>(place)[i] = i + 1;
}
BeamSearch beamsearch(ids, scores, (int64_t)0, (int64_t)2, 0);
LoDTensor sids, sscores;
beamsearch(pre_ids, &sids, &sscores);
LOG(INFO) << "score: " << sscores << endl;
ASSERT_EQ(sids.lod(), sscores.lod());
vector<int> tids({2, 4, 3, 8});
vector<float> tscores({0.3, 0.5, 0.9, 0.7});
for (int i = 0; i < 4; i++) {
ASSERT_EQ(tids[i], sids.data<int64_t>()[i]);
ASSERT_EQ(tscores[i], sscores.data<float>()[i]);
}
}
} // namespace test
} // namespace paddle
...@@ -51,8 +51,8 @@ class ClipOpMaker : public framework::OpProtoAndCheckerMaker { ...@@ -51,8 +51,8 @@ class ClipOpMaker : public framework::OpProtoAndCheckerMaker {
AddComment(R"DOC( AddComment(R"DOC(
Clip Operator. Clip Operator.
The clip operator limits the value of given input within an interval. The interval is The clip operator limits the value of given input within an interval. The
specified with arguments 'min' and 'max': interval is specified with arguments 'min' and 'max':
$$ $$
Out = \min(\max(X, min), max) Out = \min(\max(X, min), max)
......
...@@ -70,6 +70,13 @@ void ConvOp::InferShape(framework::InferShapeContext* ctx) const { ...@@ -70,6 +70,13 @@ void ConvOp::InferShape(framework::InferShapeContext* ctx) const {
framework::OpKernelType ConvOp::GetExpectedKernelType( framework::OpKernelType ConvOp::GetExpectedKernelType(
const framework::ExecutionContext& ctx) const { const framework::ExecutionContext& ctx) const {
bool use_cudnn = ctx.Attr<bool>("use_cudnn"); bool use_cudnn = ctx.Attr<bool>("use_cudnn");
use_cudnn &= platform::is_gpu_place(ctx.GetPlace());
#ifdef PADDLE_WITH_CUDA
if (platform::is_gpu_place(ctx.GetPlace())) {
auto& dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
use_cudnn &= dev_ctx.cudnn_handle() != nullptr;
}
#endif
framework::LibraryType library_; framework::LibraryType library_;
if (use_cudnn) { if (use_cudnn) {
library_ = framework::LibraryType::kCUDNN; library_ = framework::LibraryType::kCUDNN;
...@@ -283,6 +290,14 @@ void ConvOpGrad::InferShape(framework::InferShapeContext* ctx) const { ...@@ -283,6 +290,14 @@ void ConvOpGrad::InferShape(framework::InferShapeContext* ctx) const {
framework::OpKernelType ConvOpGrad::GetExpectedKernelType( framework::OpKernelType ConvOpGrad::GetExpectedKernelType(
const framework::ExecutionContext& ctx) const { const framework::ExecutionContext& ctx) const {
bool use_cudnn = ctx.Attr<bool>("use_cudnn"); bool use_cudnn = ctx.Attr<bool>("use_cudnn");
use_cudnn &= platform::is_gpu_place(ctx.GetPlace());
#ifdef PADDLE_WITH_CUDA
if (platform::is_gpu_place(ctx.GetPlace())) {
auto& dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
use_cudnn &= dev_ctx.cudnn_handle() != nullptr;
}
#endif
framework::LibraryType library_; framework::LibraryType library_;
if (use_cudnn) { if (use_cudnn) {
library_ = framework::LibraryType::kCUDNN; library_ = framework::LibraryType::kCUDNN;
......
...@@ -61,6 +61,13 @@ void ConvTransposeOp::InferShape(framework::InferShapeContext* ctx) const { ...@@ -61,6 +61,13 @@ void ConvTransposeOp::InferShape(framework::InferShapeContext* ctx) const {
framework::OpKernelType ConvTransposeOp::GetExpectedKernelType( framework::OpKernelType ConvTransposeOp::GetExpectedKernelType(
const framework::ExecutionContext& ctx) const { const framework::ExecutionContext& ctx) const {
bool use_cudnn = ctx.Attr<bool>("use_cudnn"); bool use_cudnn = ctx.Attr<bool>("use_cudnn");
use_cudnn &= platform::is_gpu_place(ctx.GetPlace());
#ifdef PADDLE_WITH_CUDA
if (platform::is_gpu_place(ctx.GetPlace())) {
auto& dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
use_cudnn &= dev_ctx.cudnn_handle() != nullptr;
}
#endif
framework::LibraryType library_; framework::LibraryType library_;
if (use_cudnn) { if (use_cudnn) {
library_ = framework::LibraryType::kCUDNN; library_ = framework::LibraryType::kCUDNN;
...@@ -263,6 +270,13 @@ void ConvTransposeOpGrad::InferShape(framework::InferShapeContext* ctx) const { ...@@ -263,6 +270,13 @@ void ConvTransposeOpGrad::InferShape(framework::InferShapeContext* ctx) const {
framework::OpKernelType ConvTransposeOpGrad::GetExpectedKernelType( framework::OpKernelType ConvTransposeOpGrad::GetExpectedKernelType(
const framework::ExecutionContext& ctx) const { const framework::ExecutionContext& ctx) const {
bool use_cudnn = ctx.Attr<bool>("use_cudnn"); bool use_cudnn = ctx.Attr<bool>("use_cudnn");
use_cudnn &= platform::is_gpu_place(ctx.GetPlace());
#ifdef PADDLE_WITH_CUDA
if (platform::is_gpu_place(ctx.GetPlace())) {
auto& dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
use_cudnn &= dev_ctx.cudnn_handle() != nullptr;
}
#endif
framework::LibraryType library_; framework::LibraryType library_;
if (use_cudnn) { if (use_cudnn) {
library_ = framework::LibraryType::kCUDNN; library_ = framework::LibraryType::kCUDNN;
......
...@@ -28,39 +28,7 @@ template <typename DeviceContext, typename T> ...@@ -28,39 +28,7 @@ template <typename DeviceContext, typename T>
class ElementwiseAddKernel : public framework::OpKernel<T> { class ElementwiseAddKernel : public framework::OpKernel<T> {
public: public:
void Compute(const framework::ExecutionContext& ctx) const override { void Compute(const framework::ExecutionContext& ctx) const override {
using Tensor = framework::Tensor; ElementwiseComputeEx<AddFunctor<T>, DeviceContext, T>(ctx);
auto* x = ctx.Input<Tensor>("X");
auto* y = ctx.Input<Tensor>("Y");
auto* z = ctx.Output<Tensor>("Out");
z->mutable_data<T>(ctx.GetPlace());
TransformFunctor<AddFunctor<T>, T, DeviceContext> functor(
x, y, z, ctx.template device_context<DeviceContext>(), AddFunctor<T>());
auto x_dims = x->dims();
auto y_dims = y->dims();
PADDLE_ENFORCE_GE(x_dims.size(), y_dims.size(),
"Rank of first input must >= rank of second input.");
if (x_dims == y_dims) {
functor.Run();
return;
}
int axis = ctx.Attr<int>("axis");
axis = (axis == -1 ? x_dims.size() - y_dims.size() : axis);
PADDLE_ENFORCE(axis >= 0 && axis < x_dims.size(),
"Axis should be in range [0, x_dims)");
int pre, n, post;
get_mid_dims(x_dims, y_dims, axis, pre, n, post);
if (post == 1) {
functor.RunRowWise(n, pre);
return;
} else {
functor.RunMidWise(n, pre, post);
return;
}
} }
}; };
......
...@@ -19,11 +19,16 @@ limitations under the License. */ ...@@ -19,11 +19,16 @@ limitations under the License. */
namespace paddle { namespace paddle {
namespace operators { namespace operators {
template <typename T>
struct DivFunctor {
inline HOSTDEVICE T operator()(T a, T b) const { return a / b; }
};
template <typename DeviceContext, typename T> template <typename DeviceContext, typename T>
class ElementwiseDivKernel : public framework::OpKernel<T> { class ElementwiseDivKernel : public framework::OpKernel<T> {
public: public:
void Compute(const framework::ExecutionContext& ctx) const override { void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseCompute<EigenDivFunctor, DeviceContext, T>(ctx); ElementwiseComputeEx<DivFunctor<T>, DeviceContext, T>(ctx);
} }
}; };
......
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/elementwise_max_op.h"
#include "paddle/operators/elementwise_op.h"
namespace paddle {
namespace operators {
class ElementwiseMaxOpMaker : public ElementwiseOpMaker {
public:
ElementwiseMaxOpMaker(OpProto* proto, OpAttrChecker* op_checker)
: ElementwiseOpMaker(proto, op_checker) {
SetComment("Max", "Out = max(X, Y)");
AddComment(comment_);
}
};
} // namespace operators
} // namespace paddle
namespace ops = paddle::operators;
REGISTER_OP(elementwise_max, ops::ElementwiseOp, ops::ElementwiseMaxOpMaker,
elementwise_max_grad, ops::ElementwiseOpGrad);
REGISTER_OP_CPU_KERNEL(
elementwise_max,
ops::ElementwiseMaxKernel<paddle::platform::CPUDeviceContext, float>,
ops::ElementwiseMaxKernel<paddle::platform::CPUDeviceContext, double>,
ops::ElementwiseMaxKernel<paddle::platform::CPUDeviceContext, int>,
ops::ElementwiseMaxKernel<paddle::platform::CPUDeviceContext, int64_t>);
REGISTER_OP_CPU_KERNEL(
elementwise_max_grad,
ops::ElementwiseMaxGradKernel<paddle::platform::CPUDeviceContext, float>,
ops::ElementwiseMaxGradKernel<paddle::platform::CPUDeviceContext, double>,
ops::ElementwiseMaxGradKernel<paddle::platform::CPUDeviceContext, int>,
ops::ElementwiseMaxGradKernel<paddle::platform::CPUDeviceContext, int64_t>);
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#define EIGEN_USE_GPU
#include "paddle/operators/elementwise_max_op.h"
namespace ops = paddle::operators;
REGISTER_OP_CUDA_KERNEL(
elementwise_max,
ops::ElementwiseMaxKernel<paddle::platform::CUDADeviceContext, float>,
ops::ElementwiseMaxKernel<paddle::platform::CUDADeviceContext, double>,
ops::ElementwiseMaxKernel<paddle::platform::CUDADeviceContext, int>,
ops::ElementwiseMaxKernel<paddle::platform::CUDADeviceContext, int64_t>);
REGISTER_OP_CUDA_KERNEL(
elementwise_max_grad,
ops::ElementwiseMaxGradKernel<paddle::platform::CUDADeviceContext, float>,
ops::ElementwiseMaxGradKernel<paddle::platform::CUDADeviceContext, double>,
ops::ElementwiseMaxGradKernel<paddle::platform::CUDADeviceContext, int>,
ops::ElementwiseMaxGradKernel<paddle::platform::CUDADeviceContext,
int64_t>);
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/operators/elementwise_op_function.h"
namespace paddle {
namespace operators {
template <typename T>
struct MaxFunctor {
inline HOSTDEVICE T operator()(T a, T b) const { return a > b ? a : b; }
};
template <typename DeviceContext, typename T>
class ElementwiseMaxKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseComputeEx<MaxFunctor<T>, DeviceContext, T>(ctx);
}
};
template <typename T>
struct ElementwiseMaxGradFunctor {
template <typename Device, typename X, typename Y, typename Z, typename dX,
typename dY, typename dZ>
void operator()(Device d, X x, Y y, Z z, dX dx, dY dy, dZ dz) {
auto x_e = framework::EigenVector<T>::Flatten(*x);
auto y_e = framework::EigenVector<T>::Flatten(*y);
auto dz_e = framework::EigenVector<T>::Flatten(*dz);
if (dx) {
auto dx_e = framework::EigenVector<T>::Flatten(*dx);
dx_e.device(d) = (x_e > y_e).template cast<T>() * dz_e;
}
if (dy) {
auto dy_e = framework::EigenVector<T>::Flatten(*dy);
dy_e.device(d) = (x_e <= y_e).template cast<T>() * dz_e;
}
}
};
template <typename T>
struct ElementwiseMaxBroadCastGradFunctor {
template <typename Device, typename X, typename Y, typename Z, typename dX,
typename dY, typename dZ, typename Pre, typename N>
void operator()(Device d, X x, Y y, Z z, dX dx, dY dy, dZ dz, Pre pre, N n) {
auto x_e = framework::EigenVector<T>::Flatten(*x);
auto y_e = framework::EigenVector<T>::Flatten(*y);
auto dz_e = framework::EigenVector<T>::Flatten(*dz);
auto y_e_bcast = y_e.reshape(Eigen::DSizes<int, 2>(1, n))
.broadcast(Eigen::DSizes<int, 2>(pre, 1))
.reshape(Eigen::DSizes<int, 1>(x_e.size()));
if (dx) {
auto dx_e = framework::EigenVector<T>::Flatten(*dx);
dx_e.device(d) = (x_e > y_e_bcast).template cast<T>() * dz_e;
}
if (dy) {
auto dy_e = framework::EigenVector<T>::Flatten(*dy);
dy_e.device(d) = ((x_e <= y_e_bcast).template cast<T>() * dz_e)
.reshape(Eigen::DSizes<int, 2>(pre, n))
.sum(Eigen::array<int, 1>{{0}});
}
}
};
template <typename T>
struct ElementwiseMaxBroadCast2GradFunctor {
template <typename Device, typename X, typename Y, typename Z, typename dX,
typename dY, typename dZ, typename Pre, typename N, typename Post>
void operator()(Device d, X x, Y y, Z z, dX dx, dY dy, dZ dz, Pre pre, N n,
Post post) {
auto x_e = framework::EigenVector<T>::Flatten(*x);
auto y_e = framework::EigenVector<T>::Flatten(*y);
auto dz_e = framework::EigenVector<T>::Flatten(*dz);
auto y_e_bcast = y_e.reshape(Eigen::DSizes<int, 3>(1, n, 1))
.broadcast(Eigen::DSizes<int, 3>(pre, 1, post))
.reshape(Eigen::DSizes<int, 1>(x_e.size()));
if (dx) {
auto dx_e = framework::EigenVector<T>::Flatten(*dx);
dx_e.device(d) = (x_e > y_e_bcast).template cast<T>() * dz_e;
}
if (dy) {
auto dy_e = framework::EigenVector<T>::Flatten(*dy);
dy_e.device(d) = ((x_e <= y_e_bcast).template cast<T>() * dz_e)
.reshape(Eigen::DSizes<int, 3>(pre, n, post))
.sum(Eigen::array<int, 2>{{0, 2}});
}
}
};
template <typename DeviceContext, typename T>
class ElementwiseMaxGradKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseGradCompute<DeviceContext, T, ElementwiseMaxGradFunctor<T>,
ElementwiseMaxBroadCastGradFunctor<T>,
ElementwiseMaxBroadCast2GradFunctor<T>>(ctx);
}
};
} // namespace operators
} // namespace paddle
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/elementwise_min_op.h"
#include "paddle/operators/elementwise_op.h"
namespace paddle {
namespace operators {
class ElementwiseMinOpMaker : public ElementwiseOpMaker {
public:
ElementwiseMinOpMaker(OpProto* proto, OpAttrChecker* op_checker)
: ElementwiseOpMaker(proto, op_checker) {
SetComment("Max", "Out = min(X, Y)");
AddComment(comment_);
}
};
} // namespace operators
} // namespace paddle
namespace ops = paddle::operators;
REGISTER_OP(elementwise_min, ops::ElementwiseOp, ops::ElementwiseMinOpMaker,
elementwise_min_grad, ops::ElementwiseOpGrad);
REGISTER_OP_CPU_KERNEL(
elementwise_min,
ops::ElementwiseMinKernel<paddle::platform::CPUDeviceContext, float>,
ops::ElementwiseMinKernel<paddle::platform::CPUDeviceContext, double>,
ops::ElementwiseMinKernel<paddle::platform::CPUDeviceContext, int>,
ops::ElementwiseMinKernel<paddle::platform::CPUDeviceContext, int64_t>);
REGISTER_OP_CPU_KERNEL(
elementwise_min_grad,
ops::ElementwiseMinGradKernel<paddle::platform::CPUDeviceContext, float>,
ops::ElementwiseMinGradKernel<paddle::platform::CPUDeviceContext, double>,
ops::ElementwiseMinGradKernel<paddle::platform::CPUDeviceContext, int>,
ops::ElementwiseMinGradKernel<paddle::platform::CPUDeviceContext, int64_t>);
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#define EIGEN_USE_GPU
#include "paddle/operators/elementwise_min_op.h"
namespace ops = paddle::operators;
REGISTER_OP_CUDA_KERNEL(
elementwise_min,
ops::ElementwiseMinKernel<paddle::platform::CUDADeviceContext, float>,
ops::ElementwiseMinKernel<paddle::platform::CUDADeviceContext, double>,
ops::ElementwiseMinKernel<paddle::platform::CUDADeviceContext, int>,
ops::ElementwiseMinKernel<paddle::platform::CUDADeviceContext, int64_t>);
REGISTER_OP_CUDA_KERNEL(
elementwise_min_grad,
ops::ElementwiseMinGradKernel<paddle::platform::CUDADeviceContext, float>,
ops::ElementwiseMinGradKernel<paddle::platform::CUDADeviceContext, double>,
ops::ElementwiseMinGradKernel<paddle::platform::CUDADeviceContext, int>,
ops::ElementwiseMinGradKernel<paddle::platform::CUDADeviceContext,
int64_t>);
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/operators/elementwise_op_function.h"
namespace paddle {
namespace operators {
template <typename T>
struct MinFunctor {
inline HOSTDEVICE T operator()(T a, T b) const { return a < b ? a : b; }
};
template <typename DeviceContext, typename T>
class ElementwiseMinKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseComputeEx<MinFunctor<T>, DeviceContext, T>(ctx);
}
};
template <typename T>
struct ElementwiseMinGradFunctor {
template <typename Device, typename X, typename Y, typename Z, typename dX,
typename dY, typename dZ>
void operator()(Device d, X x, Y y, Z z, dX dx, dY dy, dZ dz) {
auto x_e = framework::EigenVector<T>::Flatten(*x);
auto y_e = framework::EigenVector<T>::Flatten(*y);
auto dz_e = framework::EigenVector<T>::Flatten(*dz);
if (dx) {
auto dx_e = framework::EigenVector<T>::Flatten(*dx);
dx_e.device(d) = (x_e < y_e).template cast<T>() * dz_e;
}
if (dy) {
auto dy_e = framework::EigenVector<T>::Flatten(*dy);
dy_e.device(d) = (x_e >= y_e).template cast<T>() * dz_e;
}
}
};
template <typename T>
struct ElementwiseMinBroadCastGradFunctor {
template <typename Device, typename X, typename Y, typename Z, typename dX,
typename dY, typename dZ, typename Pre, typename N>
void operator()(Device d, X x, Y y, Z z, dX dx, dY dy, dZ dz, Pre pre, N n) {
auto x_e = framework::EigenVector<T>::Flatten(*x);
auto y_e = framework::EigenVector<T>::Flatten(*y);
auto dz_e = framework::EigenVector<T>::Flatten(*dz);
auto y_e_bcast = y_e.reshape(Eigen::DSizes<int, 2>(1, n))
.broadcast(Eigen::DSizes<int, 2>(pre, 1))
.reshape(Eigen::DSizes<int, 1>(x_e.size()));
if (dx) {
auto dx_e = framework::EigenVector<T>::Flatten(*dx);
dx_e.device(d) = (x_e < y_e_bcast).template cast<T>() * dz_e;
}
if (dy) {
auto dy_e = framework::EigenVector<T>::Flatten(*dy);
dy_e.device(d) = ((x_e >= y_e_bcast).template cast<T>() * dz_e)
.reshape(Eigen::DSizes<int, 2>(pre, n))
.sum(Eigen::array<int, 1>{{0}});
}
}
};
template <typename T>
struct ElementwiseMinBroadCast2GradFunctor {
template <typename Device, typename X, typename Y, typename Z, typename dX,
typename dY, typename dZ, typename Pre, typename N, typename Post>
void operator()(Device d, X x, Y y, Z z, dX dx, dY dy, dZ dz, Pre pre, N n,
Post post) {
auto x_e = framework::EigenVector<T>::Flatten(*x);
auto y_e = framework::EigenVector<T>::Flatten(*y);
auto dz_e = framework::EigenVector<T>::Flatten(*dz);
auto y_e_bcast = y_e.reshape(Eigen::DSizes<int, 3>(1, n, 1))
.broadcast(Eigen::DSizes<int, 3>(pre, 1, post))
.reshape(Eigen::DSizes<int, 1>(x_e.size()));
if (dx) {
auto dx_e = framework::EigenVector<T>::Flatten(*dx);
dx_e.device(d) = (x_e < y_e_bcast).template cast<T>() * dz_e;
}
if (dy) {
auto dy_e = framework::EigenVector<T>::Flatten(*dy);
dy_e.device(d) = ((x_e >= y_e_bcast).template cast<T>() * dz_e)
.reshape(Eigen::DSizes<int, 3>(pre, n, post))
.sum(Eigen::array<int, 2>{{0, 2}});
}
}
};
template <typename DeviceContext, typename T>
class ElementwiseMinGradKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseGradCompute<DeviceContext, T, ElementwiseMinGradFunctor<T>,
ElementwiseMinBroadCastGradFunctor<T>,
ElementwiseMinBroadCast2GradFunctor<T>>(ctx);
}
};
} // namespace operators
} // namespace paddle
...@@ -18,11 +18,16 @@ limitations under the License. */ ...@@ -18,11 +18,16 @@ limitations under the License. */
namespace paddle { namespace paddle {
namespace operators { namespace operators {
template <typename T>
struct MulFunctor {
inline HOSTDEVICE T operator()(T a, T b) const { return a * b; }
};
template <typename DeviceContext, typename T> template <typename DeviceContext, typename T>
class ElementwiseMulKernel : public framework::OpKernel<T> { class ElementwiseMulKernel : public framework::OpKernel<T> {
public: public:
void Compute(const framework::ExecutionContext& ctx) const override { void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseCompute<EigenMulFunctor, DeviceContext, T>(ctx); ElementwiseComputeEx<MulFunctor<T>, DeviceContext, T>(ctx);
} }
}; };
......
...@@ -26,9 +26,9 @@ class ElementwiseOp : public framework::OperatorWithKernel { ...@@ -26,9 +26,9 @@ class ElementwiseOp : public framework::OperatorWithKernel {
using Tensor = framework::Tensor; using Tensor = framework::Tensor;
void InferShape(framework::InferShapeContext* ctx) const override { void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of elementwise op should not be null"); "Input(X) of elementwise op should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Y"), PADDLE_ENFORCE(ctx->HasInput("Y"),
"Input(Y) of elementwise op should not be null"); "Input(Y) of elementwise op should not be null.");
PADDLE_ENFORCE(ctx->HasOutput("Out"), PADDLE_ENFORCE(ctx->HasOutput("Out"),
"Output(Out) of elementwise op should not be null."); "Output(Out) of elementwise op should not be null.");
...@@ -45,12 +45,12 @@ class ElementwiseOpMaker : public framework::OpProtoAndCheckerMaker { ...@@ -45,12 +45,12 @@ class ElementwiseOpMaker : public framework::OpProtoAndCheckerMaker {
public: public:
ElementwiseOpMaker(OpProto* proto, OpAttrChecker* op_checker) ElementwiseOpMaker(OpProto* proto, OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) { : OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X", "(Tensor) The first input tensor of elementwise op"); AddInput("X", "(Tensor), The first input tensor of elementwise op.");
AddInput("Y", "(Tensor) The second input tensor of elementwise op"); AddInput("Y", "(Tensor), The second input tensor of elementwise op.");
AddOutput("Out", "The output of elementwise op"); AddOutput("Out", "The output of elementwise op.");
AddAttr<int>("axis", AddAttr<int>("axis",
"(int, default -1) The starting dimension index " "(int, default -1). The start dimension index "
"for broadcasting Y onto X") "for broadcasting Y onto X.")
.SetDefault(-1) .SetDefault(-1)
.EqualGreaterThan(-1); .EqualGreaterThan(-1);
comment_ = R"DOC( comment_ = R"DOC(
...@@ -58,19 +58,18 @@ Limited Elementwise {name} Operator. ...@@ -58,19 +58,18 @@ Limited Elementwise {name} Operator.
The equation is: The equation is:
.. math:: $${equation}$$
{equation}
X is a tensor of any dimension and the dimensions of tensor Y must be smaller than $X$ is a tensor of any dimension and the dimensions of tensor $Y$ must be
or equal to the dimensions of X. smaller than or equal to the dimensions of $X$.
There are two cases for this operator: There are two cases for this operator:
1. The shape of Y is same with X; 1. The shape of $Y$ is same with $X$;
2. The shape of Y is a subset of X. 2. The shape of $Y$ is a subset of $X$.
For case 2: For case 2:
Y will be broadcasted to match the shape of X and axis should be $Y$ will be broadcasted to match the shape of $X$ and axis should be
the starting dimension index for broadcasting Y onto X. set to index of the start dimension to broadcast $Y$ onto $X$.
For example For example
.. code-block:: python .. code-block:: python
...@@ -81,7 +80,8 @@ For example ...@@ -81,7 +80,8 @@ For example
shape(X) = (2, 3, 4, 5), shape(Y) = (3, 4), with axis=1 shape(X) = (2, 3, 4, 5), shape(Y) = (3, 4), with axis=1
shape(X) = (2, 3, 4, 5), shape(Y) = (2), with axis=0 shape(X) = (2, 3, 4, 5), shape(Y) = (2), with axis=0
Either of the inputs X and Y or none can carry the LoD (Level of Details) information. However, the output only shares the LoD information with input X. Either of the inputs $X$ and $Y$ or none can carry the LoD (Level of Details)
information. However, the output only shares the LoD information with input $X$.
)DOC"; )DOC";
AddComment(comment_); AddComment(comment_);
......
...@@ -340,6 +340,13 @@ void ElementwiseGradCompute(const framework::ExecutionContext& ctx) { ...@@ -340,6 +340,13 @@ void ElementwiseGradCompute(const framework::ExecutionContext& ctx) {
return; return;
} }
if (y_dims.size() == 1 && y_dims[0] == 1) {
// y is a scalar
auto extended_dims = framework::vectorize(x_dims);
extended_dims.push_back(1);
x_dims = framework::make_ddim(extended_dims);
}
int axis = ctx.Attr<int>("axis"); int axis = ctx.Attr<int>("axis");
axis = (axis == -1 ? x_dims.size() - y_dims.size() : axis); axis = (axis == -1 ? x_dims.size() - y_dims.size() : axis);
...@@ -356,5 +363,50 @@ void ElementwiseGradCompute(const framework::ExecutionContext& ctx) { ...@@ -356,5 +363,50 @@ void ElementwiseGradCompute(const framework::ExecutionContext& ctx) {
return; return;
} }
} }
template <typename Functor, typename DeviceContext, typename T>
void ElementwiseComputeEx(const framework::ExecutionContext& ctx) {
using Tensor = framework::Tensor;
auto* x = ctx.Input<Tensor>("X");
auto* y = ctx.Input<Tensor>("Y");
auto* z = ctx.Output<Tensor>("Out");
z->mutable_data<T>(ctx.GetPlace());
TransformFunctor<Functor, T, DeviceContext> functor(
x, y, z, ctx.template device_context<DeviceContext>(), Functor());
auto x_dims = x->dims();
auto y_dims = y->dims();
PADDLE_ENFORCE_GE(x_dims.size(), y_dims.size(),
"Rank of first input must >= rank of second input.");
if (x_dims == y_dims) {
functor.Run();
return;
}
if (y_dims.size() == 1 && y_dims[0] == 1) {
// y is a scalar
auto extended_dims = framework::vectorize(x_dims);
extended_dims.push_back(1);
x_dims = framework::make_ddim(extended_dims);
}
int axis = ctx.Attr<int>("axis");
axis = (axis == -1 ? x_dims.size() - y_dims.size() : axis);
PADDLE_ENFORCE(axis >= 0 && axis < x_dims.size(),
"Axis should be in range [0, x_dims)");
int pre, n, post;
get_mid_dims(x_dims, y_dims, axis, pre, n, post);
if (post == 1) {
functor.RunRowWise(n, pre);
return;
} else {
functor.RunMidWise(n, pre, post);
return;
}
}
} // namespace operators } // namespace operators
} // namespace paddle } // namespace paddle
...@@ -18,11 +18,16 @@ limitations under the License. */ ...@@ -18,11 +18,16 @@ limitations under the License. */
namespace paddle { namespace paddle {
namespace operators { namespace operators {
template <typename T>
struct SubFunctor {
inline HOSTDEVICE T operator()(T a, T b) const { return a - b; }
};
template <typename DeviceContext, typename T> template <typename DeviceContext, typename T>
class ElementwiseSubKernel : public framework::OpKernel<T> { class ElementwiseSubKernel : public framework::OpKernel<T> {
public: public:
void Compute(const framework::ExecutionContext& ctx) const override { void Compute(const framework::ExecutionContext& ctx) const override {
ElementwiseCompute<EigenSubFunctor, DeviceContext, T>(ctx); ElementwiseComputeEx<SubFunctor<T>, DeviceContext, T>(ctx);
} }
}; };
......
...@@ -58,21 +58,21 @@ class ExpandOpMaker : public framework::OpProtoAndCheckerMaker { ...@@ -58,21 +58,21 @@ class ExpandOpMaker : public framework::OpProtoAndCheckerMaker {
ExpandOpMaker(OpProto* proto, OpAttrChecker* op_checker) ExpandOpMaker(OpProto* proto, OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) { : OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X", AddInput("X",
"(Tensor, default Tensor<float>) A tensor with rank in [1, 6]." "(Tensor, default Tensor<float>). A tensor with rank in [1, 6]."
"X is the input tensor to be expanded."); "X is the input to be expanded.");
AddOutput("Out", AddOutput("Out",
"(Tensor, default Tensor<float>) A tensor with rank in [1, 6]." "(Tensor, default Tensor<float>). A tensor with rank in [1, 6]."
"The rank of Output(Out) is same as Input(X) except that each " "The rank of Output(Out) have the same with Input(X). "
"dimension size of Output(Out) is equal to corresponding " "After expanding, size of each dimension of Output(Out) is equal "
"dimension size of Input(X) multiplying corresponding value of " "to size of the corresponding dimension of Input(X) multiplying "
"Attr(expand_times)."); "the corresponding value given by Attr(expand_times).");
AddAttr<std::vector<int>>("expand_times", AddAttr<std::vector<int>>("expand_times",
"Expand times number for each dimension."); "Expand times number for each dimension.");
AddComment(R"DOC( AddComment(R"DOC(
Expand operator tiles the input by given times number. You should set times Expand operator tiles the input by given times number. You should set times
number for each dimension by providing attribute 'expand_times'. The rank of X number for each dimension by providing attribute 'expand_times'. The rank of X
should be in [1, 6]. Please notice that size of 'expand_times' must be same with should be in [1, 6]. Please note that size of 'expand_times' must be the same
X's rank. Following is a using case: with X's rank. Following is a using case:
Input(X) is a 3-D tensor with shape [2, 3, 1]: Input(X) is a 3-D tensor with shape [2, 3, 1]:
......
...@@ -66,7 +66,7 @@ As most C++ operators do, `batch_norm_op` is defined by inputs, outputs, attribu ...@@ -66,7 +66,7 @@ As most C++ operators do, `batch_norm_op` is defined by inputs, outputs, attribu
The following graph showes the training computational process of `batch_norm_op`: The following graph showes the training computational process of `batch_norm_op`:
<img src="./images/batch_norm_op_kernel.png" width="800"/> <img src="../images/batch_norm_op_kernel.png" width="800"/>
cudnn provides APIs to finish the whole series of computation, we can use them in our GPU kernel. cudnn provides APIs to finish the whole series of computation, we can use them in our GPU kernel.
...@@ -124,7 +124,7 @@ for pass_id in range(PASS_NUM): ...@@ -124,7 +124,7 @@ for pass_id in range(PASS_NUM):
`is_infer` is an attribute. Once an operator is created, its attributes can not be changed. It suggests us that we shall maintain two `batch_norm_op` in the model, one's `is_infer` is `True`(we call it `infer_batch_norm_op`) and the other one's is `False`(we call it `train_batch_norm_op`). They share all parameters and variables, but be placed in two different branches. That is to say, if a network contains a `batch_norm_op`, it will fork into two branches, one go through `train_batch_norm_op` and the other one go through `infer_batch_norm_op`: `is_infer` is an attribute. Once an operator is created, its attributes can not be changed. It suggests us that we shall maintain two `batch_norm_op` in the model, one's `is_infer` is `True`(we call it `infer_batch_norm_op`) and the other one's is `False`(we call it `train_batch_norm_op`). They share all parameters and variables, but be placed in two different branches. That is to say, if a network contains a `batch_norm_op`, it will fork into two branches, one go through `train_batch_norm_op` and the other one go through `infer_batch_norm_op`:
<div align=center> <div align=center>
<img src="./images/batch_norm_fork.png" width="500"/> <img src="../images/batch_norm_fork.png" width="500"/>
</div> </div>
Just like what is shown in the above graph, the net forks before `batch_norm_op` and will never merge again. All the operators after `batch_norm_op` will duplicate. Just like what is shown in the above graph, the net forks before `batch_norm_op` and will never merge again. All the operators after `batch_norm_op` will duplicate.
......
...@@ -30,16 +30,13 @@ static constexpr char kParallelScopes[] = "parallel_scopes"; ...@@ -30,16 +30,13 @@ static constexpr char kParallelScopes[] = "parallel_scopes";
static constexpr char kParallelBlock[] = "sub_block"; static constexpr char kParallelBlock[] = "sub_block";
// using ParallelScopeVar = std::vector<framework::Scope *>;
using LoDTensor = framework::LoDTensor; using LoDTensor = framework::LoDTensor;
using OperatorBase = framework::OperatorBase;
void SplitTensorAndMoveTensorToScopes( static void SplitTensorAndMoveTensorToScopes(
const framework::Scope &scope, const framework::Scope &scope, std::vector<framework::Scope *> *sub_scopes,
const std::vector<framework::Scope *> &sub_scopes,
const std::vector<platform::Place> &places, const std::vector<platform::Place> &places,
const std::vector<std::string> &names) { const std::vector<std::string> &names) {
PADDLE_ENFORCE_EQ(sub_scopes.size(), places.size()); size_t num_sub_scopes = 0;
for (auto &argu : names) { for (auto &argu : names) {
auto *var = scope.FindVar(argu); auto *var = scope.FindVar(argu);
const auto &tensor = var->Get<LoDTensor>(); const auto &tensor = var->Get<LoDTensor>();
...@@ -48,9 +45,21 @@ void SplitTensorAndMoveTensorToScopes( ...@@ -48,9 +45,21 @@ void SplitTensorAndMoveTensorToScopes(
for (auto &lod : lod_tensors) { for (auto &lod : lod_tensors) {
VLOG(3) << lod.dims(); VLOG(3) << lod.dims();
} }
if (num_sub_scopes == 0) {
num_sub_scopes = lod_tensors.size();
} else {
PADDLE_ENFORCE_EQ(num_sub_scopes, lod_tensors.size());
}
PADDLE_ENFORCE_NE(num_sub_scopes, 0);
if (sub_scopes->size() == 0) {
sub_scopes->reserve(num_sub_scopes);
for (size_t i = 0; i < num_sub_scopes; ++i) {
sub_scopes->emplace_back(&scope.NewScope());
}
}
for (size_t i = 0; i < sub_scopes.size(); ++i) { for (size_t i = 0; i < lod_tensors.size(); ++i) {
*sub_scopes[i]->Var(argu)->GetMutable<LoDTensor>() = lod_tensors[i]; *(*sub_scopes)[i]->Var(argu)->GetMutable<LoDTensor>() = lod_tensors[i];
} }
} }
} }
...@@ -70,7 +79,7 @@ class ParallelDoOp : public framework::OperatorBase { ...@@ -70,7 +79,7 @@ class ParallelDoOp : public framework::OperatorBase {
const framework::VariableNameMap &inputs, const framework::VariableNameMap &inputs,
const framework::VariableNameMap &outputs, const framework::VariableNameMap &outputs,
const framework::AttributeMap &attrs) const framework::AttributeMap &attrs)
: OperatorBase(type, inputs, outputs, attrs) {} : framework::OperatorBase(type, inputs, outputs, attrs) {}
void Run(const framework::Scope &scope, void Run(const framework::Scope &scope,
const platform::Place &place) const override { const platform::Place &place) const override {
...@@ -85,19 +94,17 @@ class ParallelDoOp : public framework::OperatorBase { ...@@ -85,19 +94,17 @@ class ParallelDoOp : public framework::OperatorBase {
auto &sub_scopes = *scope.FindVar(Output(kParallelScopes)) auto &sub_scopes = *scope.FindVar(Output(kParallelScopes))
->GetMutable<std::vector<framework::Scope *>>(); ->GetMutable<std::vector<framework::Scope *>>();
for (size_t place_idx = 0; place_idx < places.size(); ++place_idx) {
sub_scopes.push_back(&scope.NewScope());
}
// split input // split input
SplitTensorAndMoveTensorToScopes(scope, sub_scopes, places, SplitTensorAndMoveTensorToScopes(scope, &sub_scopes, places,
Inputs(kInputs)); Inputs(kInputs));
// copy parameter // copy parameter
for (auto &param : Inputs(kParameters)) { for (auto &param : Inputs(kParameters)) {
PADDLE_ENFORCE(scope.FindVar(param)->IsType<LoDTensor>(), PADDLE_ENFORCE(scope.FindVar(param)->IsType<LoDTensor>(),
"Only support parameter type as LoDTensor"); "Only support parameter type as LoDTensor");
auto &src = scope.FindVar(param)->Get<LoDTensor>(); auto &src = scope.FindVar(param)->Get<LoDTensor>();
for (size_t i = 0; i < places.size(); ++i) { for (size_t i = 0; i < sub_scopes.size(); ++i) {
auto &place = places[i]; auto &place = places[i];
auto *sub_scope = sub_scopes[i]; auto *sub_scope = sub_scopes[i];
auto *dst = sub_scope->Var(param)->GetMutable<LoDTensor>(); auto *dst = sub_scope->Var(param)->GetMutable<LoDTensor>();
...@@ -108,9 +115,7 @@ class ParallelDoOp : public framework::OperatorBase { ...@@ -108,9 +115,7 @@ class ParallelDoOp : public framework::OperatorBase {
std::vector<std::future<void>> workers; std::vector<std::future<void>> workers;
workers.reserve(places.size()); workers.reserve(places.size());
for (size_t place_idx = 0; place_idx < places.size(); ++place_idx) { for (size_t place_idx = 0; place_idx < sub_scopes.size(); ++place_idx) {
VLOG(3) << "Run " << place_idx;
auto &place = places[place_idx]; auto &place = places[place_idx];
auto *cur_scope = sub_scopes[place_idx]; auto *cur_scope = sub_scopes[place_idx];
...@@ -157,21 +162,16 @@ ParallelDo Operator. ...@@ -157,21 +162,16 @@ ParallelDo Operator.
} }
}; };
class ParallelDoGradOp : public OperatorBase { class ParallelDoGradOp : public framework::OperatorBase {
public: public:
ParallelDoGradOp(const std::string &type, ParallelDoGradOp(const std::string &type,
const framework::VariableNameMap &inputs, const framework::VariableNameMap &inputs,
const framework::VariableNameMap &outputs, const framework::VariableNameMap &outputs,
const framework::AttributeMap &attrs) const framework::AttributeMap &attrs)
: OperatorBase(type, inputs, outputs, attrs) {} : framework::OperatorBase(type, inputs, outputs, attrs) {}
void Run(const framework::Scope &scope, void Run(const framework::Scope &scope,
const platform::Place &place) const override { const platform::Place &place) const override {
// // get device context from pool
// platform::DeviceContextPool &pool =
// platform::DeviceContextPool::Instance();
// auto &dev_ctx = *pool.Get(place);
auto *block = Attr<framework::BlockDesc *>(kParallelBlock); auto *block = Attr<framework::BlockDesc *>(kParallelBlock);
auto *program = block->Program(); auto *program = block->Program();
...@@ -181,26 +181,16 @@ class ParallelDoGradOp : public OperatorBase { ...@@ -181,26 +181,16 @@ class ParallelDoGradOp : public OperatorBase {
auto &places = scope.FindVar(Input(kPlaces))->Get<platform::PlaceList>(); auto &places = scope.FindVar(Input(kPlaces))->Get<platform::PlaceList>();
// feed output@grad // feed output@grad
SplitTensorAndMoveTensorToScopes(scope, sub_scopes, places, SplitTensorAndMoveTensorToScopes(
Inputs(framework::GradVarName(kOutputs))); scope, const_cast<std::vector<framework::Scope *> *>(&sub_scopes),
places, Inputs(framework::GradVarName(kOutputs)));
WaitOnPlaces(places); WaitOnPlaces(places);
// for debugging
for (auto &s : Inputs(framework::GradVarName(kOutputs))) {
VLOG(3) << s;
VLOG(3) << scope.FindVar(s)->Get<LoDTensor>();
for (auto *sub_scope : sub_scopes) {
VLOG(3) << sub_scope->FindVar(s)->Get<LoDTensor>();
}
}
// exe run // exe run
std::vector<std::future<void>> workers; std::vector<std::future<void>> workers;
for (size_t place_idx = 0; place_idx < places.size(); ++place_idx) { for (size_t i = 0; i < sub_scopes.size(); ++i) {
VLOG(3) << "Run " << place_idx; auto &place = places[i];
auto *cur_scope = sub_scopes[i];
auto &place = places[place_idx];
auto *cur_scope = sub_scopes[place_idx];
// execute // execute
workers.emplace_back(framework::Async([program, cur_scope, place, block] { workers.emplace_back(framework::Async([program, cur_scope, place, block] {
...@@ -216,33 +206,38 @@ class ParallelDoGradOp : public OperatorBase { ...@@ -216,33 +206,38 @@ class ParallelDoGradOp : public OperatorBase {
// merge grad // merge grad
for (auto &s : Outputs(framework::GradVarName(kParameters))) { for (auto &s : Outputs(framework::GradVarName(kParameters))) {
VLOG(3) << "merge grad " << s; auto &result = sub_scopes[0]->FindVar(s)->Get<LoDTensor>();
std::string tmp_name;
auto &t = sub_scopes[0]->FindVar(s)->Get<LoDTensor>(); auto *tmp = sub_scopes[0]->Var(&tmp_name)->GetMutable<LoDTensor>();
VLOG(3) << t;
for (size_t i = 1; i < sub_scopes.size(); ++i) {
std::string s_buf = s + "@BUF"; auto &tensor_to_merge = sub_scopes[i]->FindVar(s)->Get<LoDTensor>();
auto *t_buf = sub_scopes[0]->Var(s_buf)->GetMutable<LoDTensor>(); if (!(places[i] == places[0])) {
framework::Copy(tensor_to_merge, places[0], tmp);
for (size_t place_idx = 1; place_idx < places.size(); ++place_idx) { } else {
auto &tt = sub_scopes[place_idx]->FindVar(s)->Get<LoDTensor>(); tmp->ShareDataWith(tensor_to_merge);
VLOG(3) << place_idx; }
VLOG(3) << tt;
framework::Copy(tt, places[0], t_buf);
auto sum_op = framework::OpRegistry::CreateOp( auto sum_op = framework::OpRegistry::CreateOp(
"sum", {{"X", {s, s_buf}}}, {{"Out", {s}}}, "sum", {{"X", {s, tmp_name}}}, {{"Out", {s}}},
framework::AttributeMap{}); framework::AttributeMap{});
sum_op->Run(*sub_scopes[0], places[0]); sum_op->Run(*sub_scopes[0], places[0]);
WaitOnPlaces(places); WaitOnPlaces(places);
} }
VLOG(3) << t; VLOG(3) << result;
framework::Copy(t, place, scope.FindVar(s)->GetMutable<LoDTensor>()); framework::Copy(result, place, scope.FindVar(s)->GetMutable<LoDTensor>());
} }
} }
}; };
std::ostream &operator<<(std::ostream &sout,
const std::vector<std::string> &strs) {
std::copy(strs.begin(), strs.end(),
std::ostream_iterator<std::string>(sout, ","));
return sout;
}
class ParallelDoGradOpDescMaker : public framework::SingleGradOpDescMaker { class ParallelDoGradOpDescMaker : public framework::SingleGradOpDescMaker {
public: public:
using framework::SingleGradOpDescMaker::SingleGradOpDescMaker; using framework::SingleGradOpDescMaker::SingleGradOpDescMaker;
...@@ -283,18 +278,30 @@ class ParallelDoGradOpShapeInference : public framework::InferShapeBase { ...@@ -283,18 +278,30 @@ class ParallelDoGradOpShapeInference : public framework::InferShapeBase {
void operator()(framework::InferShapeContext *ctx) const override { void operator()(framework::InferShapeContext *ctx) const override {
std::vector<std::string> input{kParameters, kInputs}; std::vector<std::string> input{kParameters, kInputs};
std::vector<std::string> output{kOutputs}; std::vector<std::string> output{kOutputs};
for (auto &s : input) {
PADDLE_ENFORCE(ctx->HasInputs(s)); PADDLE_ENFORCE(ctx->HasInputs(kParameters));
PADDLE_ENFORCE(ctx->HasOutputs(framework::GradVarName(s)), PADDLE_ENFORCE(ctx->HasOutputs(framework::GradVarName(kParameters)));
"Cannot find the gradient variable %s", PADDLE_ENFORCE(ctx->HasInput(kInputs));
framework::GradVarName(s));
}
for (auto &s : output) { for (auto &s : output) {
PADDLE_ENFORCE(ctx->HasInputs(s)); PADDLE_ENFORCE(ctx->HasInputs(s));
} }
for (auto &s : input) {
ctx->SetOutputsDim(framework::GradVarName(s), ctx->GetInputsDim(s)); ctx->SetOutputsDim(framework::GradVarName(kParameters),
ctx->GetInputsDim(kParameters));
auto i_dims = ctx->GetInputsDim(kInputs);
auto ig_names = ctx->Outputs(framework::GradVarName(kInputs));
for (size_t i = 0; i < ig_names.size(); ++i) {
auto &ig_name = ig_names[i];
if (ig_name == framework::kEmptyVarName) {
continue;
}
ctx->SetDims({ig_name}, {i_dims[i]});
} }
if (ctx->HasInputs(kParameters)) { if (ctx->HasInputs(kParameters)) {
PADDLE_ENFORCE(ctx->HasOutputs(framework::GradVarName(kParameters))); PADDLE_ENFORCE(ctx->HasOutputs(framework::GradVarName(kParameters)));
ctx->SetOutputsDim(framework::GradVarName(kParameters), ctx->SetOutputsDim(framework::GradVarName(kParameters),
......
...@@ -64,6 +64,13 @@ void PoolOp::InferShape(framework::InferShapeContext *ctx) const { ...@@ -64,6 +64,13 @@ void PoolOp::InferShape(framework::InferShapeContext *ctx) const {
framework::OpKernelType PoolOp::GetExpectedKernelType( framework::OpKernelType PoolOp::GetExpectedKernelType(
const framework::ExecutionContext &ctx) const { const framework::ExecutionContext &ctx) const {
bool use_cudnn = ctx.Attr<bool>("use_cudnn"); bool use_cudnn = ctx.Attr<bool>("use_cudnn");
use_cudnn &= platform::is_gpu_place(ctx.GetPlace());
#ifdef PADDLE_WITH_CUDA
if (platform::is_gpu_place(ctx.GetPlace())) {
auto &dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
use_cudnn &= dev_ctx.cudnn_handle() != nullptr;
}
#endif
framework::LibraryType library_; framework::LibraryType library_;
if (use_cudnn) { if (use_cudnn) {
library_ = framework::LibraryType::kCUDNN; library_ = framework::LibraryType::kCUDNN;
...@@ -88,6 +95,13 @@ void PoolOpGrad::InferShape(framework::InferShapeContext *ctx) const { ...@@ -88,6 +95,13 @@ void PoolOpGrad::InferShape(framework::InferShapeContext *ctx) const {
framework::OpKernelType PoolOpGrad::GetExpectedKernelType( framework::OpKernelType PoolOpGrad::GetExpectedKernelType(
const framework::ExecutionContext &ctx) const { const framework::ExecutionContext &ctx) const {
bool use_cudnn = ctx.Attr<bool>("use_cudnn"); bool use_cudnn = ctx.Attr<bool>("use_cudnn");
use_cudnn &= platform::is_gpu_place(ctx.GetPlace());
#ifdef PADDLE_WITH_CUDA
if (platform::is_gpu_place(ctx.GetPlace())) {
auto &dev_ctx = ctx.template device_context<platform::CUDADeviceContext>();
use_cudnn &= dev_ctx.cudnn_handle() != nullptr;
}
#endif
framework::LibraryType library_; framework::LibraryType library_;
if (use_cudnn) { if (use_cudnn) {
library_ = framework::LibraryType::kCUDNN; library_ = framework::LibraryType::kCUDNN;
......
...@@ -129,7 +129,7 @@ If reduce_all is true, just reduce along all dimensions and output a scalar. ...@@ -129,7 +129,7 @@ If reduce_all is true, just reduce along all dimensions and output a scalar.
} }
void SetComment(std::string name, std::string op) { void SetComment(std::string name, std::string op) {
Replace(comment_, "{ReduceOP}", name); Replace(comment_, "{ReduceOp}", name);
Replace(comment_, "{reduce}", op); Replace(comment_, "{reduce}", op);
} }
}; };
......
...@@ -15,9 +15,15 @@ limitations under the License. */ ...@@ -15,9 +15,15 @@ limitations under the License. */
#pragma once #pragma once
#include <sstream> #include <sstream>
#include <string> #include <string>
#include <typeindex>
namespace paddle { namespace paddle {
namespace string { namespace string {
inline std::ostream& operator<<(std::ostream& s, const std::type_index& t) {
s << t.name();
return s;
}
template <typename T> template <typename T>
inline std::string to_string(T v) { inline std::string to_string(T v) {
std::ostringstream sout; std::ostringstream sout;
...@@ -25,6 +31,11 @@ inline std::string to_string(T v) { ...@@ -25,6 +31,11 @@ inline std::string to_string(T v) {
return sout.str(); return sout.str();
} }
template <>
inline std::string to_string(std::type_index t) {
return t.name();
}
// Faster std::string/const char* type // Faster std::string/const char* type
template <> template <>
inline std::string to_string(std::string v) { inline std::string to_string(std::string v) {
......
...@@ -16,13 +16,22 @@ from paddle.trainer.config_parser import * ...@@ -16,13 +16,22 @@ from paddle.trainer.config_parser import *
from default_decorators import * from default_decorators import *
__all__ = [ __all__ = [
"evaluator_base", "classification_error_evaluator", "auc_evaluator", "evaluator_base",
"pnpair_evaluator", "precision_recall_evaluator", "ctc_error_evaluator", "classification_error_evaluator",
"chunk_evaluator", "sum_evaluator", "column_sum_evaluator", "auc_evaluator",
"value_printer_evaluator", "gradient_printer_evaluator", "pnpair_evaluator",
"maxid_printer_evaluator", "maxframe_printer_evaluator", "precision_recall_evaluator",
"seqtext_printer_evaluator", "classification_error_printer_evaluator", "ctc_error_evaluator",
"detection_map_evaluator" "chunk_evaluator",
"sum_evaluator",
"column_sum_evaluator",
"value_printer_evaluator",
"gradient_printer_evaluator",
"maxid_printer_evaluator",
"maxframe_printer_evaluator",
"seqtext_printer_evaluator",
"classification_error_printer_evaluator",
"detection_map_evaluator",
] ]
......
...@@ -116,8 +116,8 @@ def _debug_string_(proto, throw_on_error=True): ...@@ -116,8 +116,8 @@ def _debug_string_(proto, throw_on_error=True):
""" """
error_fields = list() error_fields = list()
if not proto.IsInitialized(error_fields) and throw_on_error: if not proto.IsInitialized(error_fields) and throw_on_error:
raise ValueError("{0} are not initialized\nThe message is {1}".format( raise ValueError("{0} are not initialized.\nThe message is {1}:\n".
error_fields, proto)) format(error_fields, proto))
return proto.__str__() return proto.__str__()
...@@ -374,12 +374,13 @@ class Operator(object): ...@@ -374,12 +374,13 @@ class Operator(object):
>>> outputs={"Out": [var1]}) >>> outputs={"Out": [var1]})
Args: Args:
block(Block): The block has the current operator block(Block): The block has the current operator.
desc(core.OpDesc): The protobuf description desc(core.OpDesc): The protobuf description.
type(str): The type of operator. type(str): The type of operator.
inputs(dict): The input dictionary. Key is the input parameter name. inputs(dict): The input dictionary. Key is the input parameter name.
Value is a list of variables. Value is a list of variables.
outputs(dict): The output dictionary. Has same format with inputs outputs(dict): The output dictionary which has the same format with
inputs.
attrs(dict): The attributes dictionary. Key is attribute name. Value attrs(dict): The attributes dictionary. Key is attribute name. Value
is the attribute value. The attribute type should be as same as is the attribute value. The attribute type should be as same as
the type registered in C++ the type registered in C++
...@@ -436,9 +437,10 @@ class Operator(object): ...@@ -436,9 +437,10 @@ class Operator(object):
for m in proto.outputs: for m in proto.outputs:
need.add(m.name) need.add(m.name)
if not given == need: if not given == need:
raise ValueError( raise ValueError(("Incorrect setting for output(s) of "
"Incorrect setting for output(s) of operator \"%s\". Need: [%s] Given: [%s]" "operator \"%s\". Need: [%s] Given: [%s]") %
% (type, ", ".join(str(e) for e in need), ", ".join( (type, ", ".join(str(e)
for e in need), ", ".join(
str(e) for e in given))) str(e) for e in given)))
for out_proto in proto.outputs: for out_proto in proto.outputs:
...@@ -818,9 +820,8 @@ class Program(object): ...@@ -818,9 +820,8 @@ class Program(object):
if isinstance(t, Variable): if isinstance(t, Variable):
t = t.op t = t.op
else: else:
raise ValueError( raise ValueError(("All targets of prune() can only be "
"All targets of prune() can only be Variable or Operator." "Variable or Operator."))
)
targets_idx.append([t.block.idx, t.idx]) targets_idx.append([t.block.idx, t.idx])
res = Program() res = Program()
......
...@@ -28,9 +28,9 @@ def data(name, ...@@ -28,9 +28,9 @@ def data(name,
**Data Layer** **Data Layer**
This function takes in the input and based on whether data has This function takes in the input and based on whether data has
to be returned back as a minibatch, it creates the global variable using to be returned back as a minibatch, it creates the global variable by using
the helper functions. The global variables can be accessed by all the the helper functions. The global variables can be accessed by all the
following operations and layers in the graph. following operators in the graph.
All the input variables of this function are passed in as local variables All the input variables of this function are passed in as local variables
to the LayerHelper constructor. to the LayerHelper constructor.
......
...@@ -50,6 +50,8 @@ __all__ = [ ...@@ -50,6 +50,8 @@ __all__ = [
'sequence_last_step', 'sequence_last_step',
'dropout', 'dropout',
'split', 'split',
'l2_normalize',
'matmul',
] ]
...@@ -674,6 +676,7 @@ def conv2d(input, ...@@ -674,6 +676,7 @@ def conv2d(input,
groups=None, groups=None,
param_attr=None, param_attr=None,
bias_attr=None, bias_attr=None,
use_cudnn=True,
act=None): act=None):
""" """
**Convlution2D Layer** **Convlution2D Layer**
...@@ -737,6 +740,8 @@ def conv2d(input, ...@@ -737,6 +740,8 @@ def conv2d(input,
connected to the second half of the input channels. Default: groups=1 connected to the second half of the input channels. Default: groups=1
param_attr(ParamAttr): The parameters to the Conv2d Layer. Default: None param_attr(ParamAttr): The parameters to the Conv2d Layer. Default: None
bias_attr(ParamAttr): Bias parameter for the Conv2d layer. Default: None bias_attr(ParamAttr): Bias parameter for the Conv2d layer. Default: None
use_cudnn(bool): Use cudnn kernel or not, it is valid only when the cudnn
library is installed. Default: True
act(str): Activation type. Default: None act(str): Activation type. Default: None
Returns: Returns:
...@@ -772,6 +777,8 @@ def conv2d(input, ...@@ -772,6 +777,8 @@ def conv2d(input,
stride = [stride, stride] stride = [stride, stride]
if isinstance(padding, int): if isinstance(padding, int):
padding = [padding, padding] padding = [padding, padding]
if not isinstance(use_cudnn, bool):
raise ValueError("use_cudnn should be True or False")
input_shape = input.shape input_shape = input.shape
filter_shape = [num_filters, num_filter_channels] + filter_size filter_shape = [num_filters, num_filter_channels] + filter_size
...@@ -795,9 +802,12 @@ def conv2d(input, ...@@ -795,9 +802,12 @@ def conv2d(input,
'Filter': filter_param, 'Filter': filter_param,
}, },
outputs={"Output": pre_bias}, outputs={"Output": pre_bias},
attrs={'strides': stride, attrs={
'strides': stride,
'paddings': padding, 'paddings': padding,
'groups': groups}) 'groups': groups,
'use_cudnn': use_cudnn
})
pre_act = helper.append_bias_op(pre_bias, dim_start=1, dim_end=2) pre_act = helper.append_bias_op(pre_bias, dim_start=1, dim_end=2)
...@@ -945,7 +955,9 @@ def pool2d(input, ...@@ -945,7 +955,9 @@ def pool2d(input,
pool_type, pool_type,
pool_stride=None, pool_stride=None,
pool_padding=None, pool_padding=None,
global_pooling=False): global_pooling=False,
use_cudnn=True,
name=None):
""" """
This function adds the operator for pooling in 2 dimensions, using the This function adds the operator for pooling in 2 dimensions, using the
pooling configurations mentioned in input parameters. pooling configurations mentioned in input parameters.
...@@ -964,6 +976,8 @@ def pool2d(input, ...@@ -964,6 +976,8 @@ def pool2d(input,
pool_stride = [pool_stride, pool_stride] pool_stride = [pool_stride, pool_stride]
if isinstance(pool_padding, int): if isinstance(pool_padding, int):
pool_padding = [pool_padding, pool_padding] pool_padding = [pool_padding, pool_padding]
if not isinstance(use_cudnn, bool):
raise ValueError("use_cudnn should be True or False")
helper = LayerHelper('pool2d', **locals()) helper = LayerHelper('pool2d', **locals())
dtype = helper.input_dtype() dtype = helper.input_dtype()
...@@ -978,7 +992,8 @@ def pool2d(input, ...@@ -978,7 +992,8 @@ def pool2d(input,
"ksize": pool_size, "ksize": pool_size,
"global_pooling": global_pooling, "global_pooling": global_pooling,
"strides": pool_stride, "strides": pool_stride,
"paddings": pool_padding "paddings": pool_padding,
"use_cudnn": use_cudnn
}) })
return pool_out return pool_out
...@@ -991,7 +1006,8 @@ def batch_norm(input, ...@@ -991,7 +1006,8 @@ def batch_norm(input,
epsilon=1e-05, epsilon=1e-05,
param_attr=None, param_attr=None,
bias_attr=None, bias_attr=None,
data_layout='NCHW'): data_layout='NCHW',
name=None):
""" """
This function helps create an operator to implement This function helps create an operator to implement
the BatchNorm layer using the configurations from the input parameters. the BatchNorm layer using the configurations from the input parameters.
...@@ -1067,7 +1083,7 @@ def batch_norm(input, ...@@ -1067,7 +1083,7 @@ def batch_norm(input,
return helper.append_activation(batch_norm_out) return helper.append_activation(batch_norm_out)
def beam_search_decode(ids, scores): def beam_search_decode(ids, scores, name=None):
helper = LayerHelper('beam_search_decode', **locals()) helper = LayerHelper('beam_search_decode', **locals())
sentence_ids = helper.create_tmp_variable(dtype=ids.dtype) sentence_ids = helper.create_tmp_variable(dtype=ids.dtype)
sentence_scores = helper.create_tmp_variable(dtype=ids.dtype) sentence_scores = helper.create_tmp_variable(dtype=ids.dtype)
...@@ -1091,7 +1107,9 @@ def conv2d_transpose(input, ...@@ -1091,7 +1107,9 @@ def conv2d_transpose(input,
padding=None, padding=None,
stride=None, stride=None,
dilation=None, dilation=None,
param_attr=None): param_attr=None,
use_cudnn=True,
name=None):
""" """
The transpose of conv2d layer. The transpose of conv2d layer.
...@@ -1118,8 +1136,10 @@ def conv2d_transpose(input, ...@@ -1118,8 +1136,10 @@ def conv2d_transpose(input,
contain two integers, (dilation_H, dilation_W). Otherwise, the contain two integers, (dilation_H, dilation_W). Otherwise, the
dilation_H = dilation_W = dilation. dilation_H = dilation_W = dilation.
param_attr: Parameter Attribute. param_attr: Parameter Attribute.
main_program(Program): the main program use_cudnn(bool): Use cudnn kernel or not, it is valid only when the cudnn
startup_program(Program): the startup program library is installed. Default: True
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
Variable: Output image. Variable: Output image.
...@@ -1146,6 +1166,10 @@ def conv2d_transpose(input, ...@@ -1146,6 +1166,10 @@ def conv2d_transpose(input,
elif dilation is not None: elif dilation is not None:
op_attr['dilations'] = dilation op_attr['dilations'] = dilation
if not isinstance(use_cudnn, bool):
raise ValueError("use_cudnn should be True or False")
op_attr['use_cudnn'] = use_cudnn
if filter_size is None: if filter_size is None:
if output_size is None: if output_size is None:
raise ValueError("output_size must be set when filter_size is None") raise ValueError("output_size must be set when filter_size is None")
...@@ -1183,7 +1207,7 @@ def conv2d_transpose(input, ...@@ -1183,7 +1207,7 @@ def conv2d_transpose(input,
return out return out
def sequence_expand(x, y): def sequence_expand(x, y, name=None):
"""Sequence Expand Layer. This layer will expand the input variable **x** """Sequence Expand Layer. This layer will expand the input variable **x**
according to LoD information of **y**. And the following examples will according to LoD information of **y**. And the following examples will
explain how sequence_expand works: explain how sequence_expand works:
...@@ -1227,6 +1251,8 @@ def sequence_expand(x, y): ...@@ -1227,6 +1251,8 @@ def sequence_expand(x, y):
Args: Args:
x (Variable): The input variable which is a Tensor or LoDTensor. x (Variable): The input variable which is a Tensor or LoDTensor.
y (Variable): The input variable which is a LoDTensor. y (Variable): The input variable which is a LoDTensor.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
Variable: The expanded variable which is a LoDTensor. Variable: The expanded variable which is a LoDTensor.
...@@ -1253,7 +1279,8 @@ def lstm_unit(x_t, ...@@ -1253,7 +1279,8 @@ def lstm_unit(x_t,
cell_t_prev, cell_t_prev,
forget_bias=0.0, forget_bias=0.0,
param_attr=None, param_attr=None,
bias_attr=None): bias_attr=None,
name=None):
"""Lstm unit layer. The equation of a lstm step is: """Lstm unit layer. The equation of a lstm step is:
.. math:: .. math::
...@@ -1300,6 +1327,8 @@ def lstm_unit(x_t, ...@@ -1300,6 +1327,8 @@ def lstm_unit(x_t,
initializer, name etc. initializer, name etc.
bias_attr (ParamAttr): The attributes of bias weights, if not False, bias_attr (ParamAttr): The attributes of bias weights, if not False,
bias weights will be created and be set to default value. bias weights will be created and be set to default value.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
tuple: The hidden value and cell value of lstm unit. tuple: The hidden value and cell value of lstm unit.
...@@ -1365,7 +1394,7 @@ def lstm_unit(x_t, ...@@ -1365,7 +1394,7 @@ def lstm_unit(x_t,
return h, c return h, c
def reduce_sum(input, dim=None, keep_dim=False): def reduce_sum(input, dim=None, keep_dim=False, name=None):
""" """
Computes the sum of tensor elements over the given dimension. Computes the sum of tensor elements over the given dimension.
...@@ -1379,6 +1408,8 @@ def reduce_sum(input, dim=None, keep_dim=False): ...@@ -1379,6 +1408,8 @@ def reduce_sum(input, dim=None, keep_dim=False):
keep_dim (bool): Whether to reserve the reduced dimension in the keep_dim (bool): Whether to reserve the reduced dimension in the
output Tensor. The result tensor will have one fewer dimension output Tensor. The result tensor will have one fewer dimension
than the :attr:`input` unless :attr:`keep_dim` is true. than the :attr:`input` unless :attr:`keep_dim` is true.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
Variable: The reduced Tensor variable. Variable: The reduced Tensor variable.
...@@ -1409,7 +1440,7 @@ def reduce_sum(input, dim=None, keep_dim=False): ...@@ -1409,7 +1440,7 @@ def reduce_sum(input, dim=None, keep_dim=False):
return out return out
def reduce_mean(input, dim=None, keep_dim=False): def reduce_mean(input, dim=None, keep_dim=False, name=None):
""" """
Computes the mean of tensor elements over the given dimension. Computes the mean of tensor elements over the given dimension.
...@@ -1423,6 +1454,8 @@ def reduce_mean(input, dim=None, keep_dim=False): ...@@ -1423,6 +1454,8 @@ def reduce_mean(input, dim=None, keep_dim=False):
keep_dim (bool): Whether to reserve the reduced dimension in the keep_dim (bool): Whether to reserve the reduced dimension in the
output Tensor. The result tensor will have one fewer dimension output Tensor. The result tensor will have one fewer dimension
than the :attr:`input` unless :attr:`keep_dim` is true. than the :attr:`input` unless :attr:`keep_dim` is true.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
Variable: The reduced Tensor variable. Variable: The reduced Tensor variable.
...@@ -1453,7 +1486,7 @@ def reduce_mean(input, dim=None, keep_dim=False): ...@@ -1453,7 +1486,7 @@ def reduce_mean(input, dim=None, keep_dim=False):
return out return out
def reduce_max(input, dim=None, keep_dim=False): def reduce_max(input, dim=None, keep_dim=False, name=None):
""" """
Computes the maximum of tensor elements over the given dimension. Computes the maximum of tensor elements over the given dimension.
...@@ -1467,6 +1500,8 @@ def reduce_max(input, dim=None, keep_dim=False): ...@@ -1467,6 +1500,8 @@ def reduce_max(input, dim=None, keep_dim=False):
keep_dim (bool): Whether to reserve the reduced dimension in the keep_dim (bool): Whether to reserve the reduced dimension in the
output Tensor. The result tensor will have one fewer dimension output Tensor. The result tensor will have one fewer dimension
than the :attr:`input` unless :attr:`keep_dim` is true. than the :attr:`input` unless :attr:`keep_dim` is true.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
Variable: The reduced Tensor variable. Variable: The reduced Tensor variable.
...@@ -1497,7 +1532,7 @@ def reduce_max(input, dim=None, keep_dim=False): ...@@ -1497,7 +1532,7 @@ def reduce_max(input, dim=None, keep_dim=False):
return out return out
def reduce_min(input, dim=None, keep_dim=False): def reduce_min(input, dim=None, keep_dim=False, name=None):
""" """
Computes the minimum of tensor elements over the given dimension. Computes the minimum of tensor elements over the given dimension.
...@@ -1511,6 +1546,8 @@ def reduce_min(input, dim=None, keep_dim=False): ...@@ -1511,6 +1546,8 @@ def reduce_min(input, dim=None, keep_dim=False):
keep_dim (bool): Whether to reserve the reduced dimension in the keep_dim (bool): Whether to reserve the reduced dimension in the
output Tensor. The result tensor will have one fewer dimension output Tensor. The result tensor will have one fewer dimension
than the :attr:`input` unless :attr:`keep_dim` is true. than the :attr:`input` unless :attr:`keep_dim` is true.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
Variable: The reduced Tensor variable. Variable: The reduced Tensor variable.
...@@ -1541,9 +1578,9 @@ def reduce_min(input, dim=None, keep_dim=False): ...@@ -1541,9 +1578,9 @@ def reduce_min(input, dim=None, keep_dim=False):
return out return out
def split(input, num_or_sections, dim=-1): def split(input, num_or_sections, dim=-1, name=None):
""" """
Splits the tensor into multiple sub-tensors. Split the input tensor into multiple sub-tensors.
Args: Args:
input (Variable): The input variable which is a Tensor or LoDTensor. input (Variable): The input variable which is a Tensor or LoDTensor.
...@@ -1555,6 +1592,8 @@ def split(input, num_or_sections, dim=-1): ...@@ -1555,6 +1592,8 @@ def split(input, num_or_sections, dim=-1):
:attr:`dim` dimension orderly. :attr:`dim` dimension orderly.
dim (int): The dimension along which to split. If :math:`dim < 0`, the dim (int): The dimension along which to split. If :math:`dim < 0`, the
dimension to split along is :math:`rank(input) + dim`. dimension to split along is :math:`rank(input) + dim`.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns: Returns:
List: The list of segmented tensor variables. List: The list of segmented tensor variables.
...@@ -1597,3 +1636,155 @@ def split(input, num_or_sections, dim=-1): ...@@ -1597,3 +1636,155 @@ def split(input, num_or_sections, dim=-1):
'axis': dim 'axis': dim
}) })
return outs return outs
def l2_normalize(x, axis, epsilon=1e-12, name=None):
"""
**L2 normalize Layer**
The l2 normalize layer normalizes `x` along dimension `axis` using an L2
norm. For a 1-D tensor (`dim` is fixed to 0), this layer computes
output = x / sqrt(max(sum(x**2), epsilon))
For `x` with more dimensions, this layer independently normalizes each 1-D
slice along dimension `axis`.
Args:
x(Variable|list): The input tensor to l2_normalize layer.
axis(int): Dimension along which to normalize the input.
epsilon(float): A lower bound value for `x`'s l2 norm. sqrt(epsilon) will
be used as the divisor if the l2 norm of `x` is less than
sqrt(epsilon).
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns:
Variable: The output tensor variable.
Examples:
.. code-block:: python
data = fluid.layers.data(name="data",
shape=(3, 17, 13),
dtype="float32")
fc = fluid.layers.l2_normalize(x=data, axis=1)
"""
if len(x.shape) == 1: axis = 0
helper = LayerHelper("l2_normalize", **locals())
square = helper.create_tmp_variable(dtype=x.dtype)
helper.append_op(type="square", inputs={"X": x}, outputs={"Out": square})
reduced_sum = helper.create_tmp_variable(dtype=x.dtype)
helper.append_op(
type="reduce_sum",
inputs={"X": square},
outputs={"Out": reduced_sum},
attrs={
"dim": 1 if axis is None else axis,
"keep_dim": True,
"reduce_all": False
})
# TODO(caoying) A lower bound value epsilon for the norm is needed to
# imporve the numeric stability of reciprocal. This requires a maximum_op.
rsquare = helper.create_tmp_variable(dtype=x.dtype)
helper.append_op(
type="reciprocal", inputs={"X": reduced_sum}, outputs={"Out": rsquare})
# TODO(caoying) the current elementwise_mul operator does not support a
# general broadcast rule which broadcasts input(Y) to have the same
# dimension with Input(X) starting from a specified dimension. So this
# exanpsion is requred. Once a general broadcast rule is spported, this
# expanding canbe removed.
rsquare_expanded = helper.create_tmp_variable(dtype=x.dtype)
expand_times = [1] * len(x.shape)
expand_times[axis] = int(x.shape[axis])
helper.append_op(
type="expand",
inputs={"X": rsquare},
outputs={"Out": rsquare_expanded},
attrs={"expand_times": expand_times})
out = helper.create_tmp_variable(dtype=x.dtype)
helper.append_op(
type="elementwise_mul",
inputs={"X": x,
"Y": rsquare_expanded},
outputs={"Out": out})
return out
def matmul(x, y, transpose_x=False, transpose_y=False, name=None):
"""
Applies matrix multipication to two tensors. Currently only rank 1 to rank
3 input tensors are supported.
The actual behavior depends on the shapes of :math:`x`, :math:`y` and the
flag values of :attr:`transpose_x`, :attr:`transpose_y`. Specifically:
- If a transpose flag is specified, the last two dimensions of the tensor
are transposed. If the tensor is rank-1 of shape :math:`[D]`, then for
:math:`x` it is treated as :math:`[1, D]` in nontransposed form and as
:math:`[D, 1]` in transposed form, whereas for :math:`y` it is the
opposite: It is treated as :math:`[D, 1]` in nontransposed form and as
:math:`[1, D]` in transposed form.
- After transpose, the two tensors are 2-D or 3-D and matrix multipication
performs in the following way.
- If both are 2-D, they are multiplied like conventional matrices.
- If either is 3-D, it is treated as a stack of matrices residing in the
last two dimensions and a batched matrix multiply supporting broadcast
applies on the two tensors.
Also note that if the raw tensor :math:`x` or :math:`y` is rank-1 and
nontransposed, the prepended or appended dimension :math:`1` will be
removed after matrix multipication.
Args:
x (Variable): The input variable which is a Tensor or LoDTensor.
y (Variable): The input variable which is a Tensor or LoDTensor.
transpose_x (bool): Whether to transpose :math:`x` before multiplication.
transpose_y (bool): Whether to transpose :math:`y` before multiplication.
name(str|None): A name for this layer(optional). If set None, the layer
will be named automatically.
Returns:
Variable: The product Tensor variable.
Examples:
.. code-block:: python
# Examples to clarify shapes of the inputs and output
# x: [B, M, K], y: [B, K, N]
fluid.layers.matmul(x, y) # out: [B, M, N]
# x: [B, M, K], y: [K, N]
fluid.layers.matmul(x, y) # out: [B, M, N]
# x: [B, M, K], y: [K]
fluid.layers.matmul(x, y) # out: [B, M]
# x: [M, K], y: [K, N]
fluid.layers.matmul(x, y) # out: [M, N]
# x: [K], y: [K]
fluid.layers.matmul(x, y) # out: [1]
# x: [M], y: [N]
fluid.layers.matmul(x, y, True, True) # out: [M, N]
"""
helper = LayerHelper('matmul', **locals())
assert max(
len(x.shape), len(y.shape)
) <= 3, 'Currently only rank 1 to rank 3 input tensors are supported.'
out = helper.create_tmp_variable(dtype=helper.input_dtype())
helper.append_op(
type='matmul',
inputs={'X': x,
'Y': y},
outputs={'Out': out},
attrs={'transpose_X': transpose_x,
'transpose_Y': transpose_y})
return out
...@@ -55,6 +55,8 @@ __all__ = [ ...@@ -55,6 +55,8 @@ __all__ = [
'elementwise_div', 'elementwise_div',
'elementwise_sub', 'elementwise_sub',
'elementwise_mul', 'elementwise_mul',
'elementwise_max',
'elementwise_min',
'clip', 'clip',
'sequence_softmax', 'sequence_softmax',
] + __activations__ ] + __activations__
......
...@@ -17,6 +17,7 @@ __all__ = [ ...@@ -17,6 +17,7 @@ __all__ = [
"simple_img_conv_pool", "simple_img_conv_pool",
"sequence_conv_pool", "sequence_conv_pool",
"glu", "glu",
"dot_product_attention",
] ]
...@@ -27,19 +28,22 @@ def simple_img_conv_pool(input, ...@@ -27,19 +28,22 @@ def simple_img_conv_pool(input,
pool_stride, pool_stride,
act, act,
param_attr=None, param_attr=None,
pool_type='max'): pool_type='max',
use_cudnn=True):
conv_out = layers.conv2d( conv_out = layers.conv2d(
input=input, input=input,
num_filters=num_filters, num_filters=num_filters,
filter_size=filter_size, filter_size=filter_size,
param_attr=param_attr, param_attr=param_attr,
act=act) act=act,
use_cudnn=use_cudnn)
pool_out = layers.pool2d( pool_out = layers.pool2d(
input=conv_out, input=conv_out,
pool_size=pool_size, pool_size=pool_size,
pool_type=pool_type, pool_type=pool_type,
pool_stride=pool_stride) pool_stride=pool_stride,
use_cudnn=use_cudnn)
return pool_out return pool_out
...@@ -53,7 +57,8 @@ def img_conv_group(input, ...@@ -53,7 +57,8 @@ def img_conv_group(input,
conv_with_batchnorm=False, conv_with_batchnorm=False,
conv_batchnorm_drop_rate=None, conv_batchnorm_drop_rate=None,
pool_stride=1, pool_stride=1,
pool_type=None): pool_type=None,
use_cudnn=True):
""" """
Image Convolution Group, Used for vgg net. Image Convolution Group, Used for vgg net.
""" """
...@@ -84,7 +89,8 @@ def img_conv_group(input, ...@@ -84,7 +89,8 @@ def img_conv_group(input,
filter_size=conv_filter_size[i], filter_size=conv_filter_size[i],
padding=conv_padding[i], padding=conv_padding[i],
param_attr=param_attr[i], param_attr=param_attr[i],
act=local_conv_act) act=local_conv_act,
use_cudnn=use_cudnn)
if conv_with_batchnorm[i]: if conv_with_batchnorm[i]:
tmp = layers.batch_norm(input=tmp, act=conv_act) tmp = layers.batch_norm(input=tmp, act=conv_act)
...@@ -96,7 +102,8 @@ def img_conv_group(input, ...@@ -96,7 +102,8 @@ def img_conv_group(input,
input=tmp, input=tmp,
pool_size=pool_size, pool_size=pool_size,
pool_type=pool_type, pool_type=pool_type,
pool_stride=pool_stride) pool_stride=pool_stride,
use_cudnn=use_cudnn)
return pool_out return pool_out
...@@ -150,3 +157,55 @@ def glu(input, dim=-1): ...@@ -150,3 +157,55 @@ def glu(input, dim=-1):
act_b = layers.sigmoid(x=b) act_b = layers.sigmoid(x=b)
out = layers.elementwise_mul(x=a, y=act_b) out = layers.elementwise_mul(x=a, y=act_b)
return out return out
def dot_product_attention(querys, keys, values):
"""
The dot-product attention.
Attention mechanism can be seen as mapping a query and a set of key-value
pairs to an output. The output is computed as a weighted sum of the values,
where the weight assigned to each value is computed by a compatibility
function (dot-product here) of the query with the corresponding key.
The dot-product attention can be implemented through (batch) matrix
multipication as follows:
.. math::
Attention(Q, K, V)= softmax(QK^\mathrm{T})V
Refer to `Attention Is All You Need
<https://arxiv.org/pdf/1706.03762.pdf>`_.
Note that batch data containing sequences with different lengths is not
supported by this because of the (batch) matrix multipication.
Args:
query (Variable): The input variable which is a Tensor or LoDTensor.
key (Variable): The input variable which is a Tensor or LoDTensor.
value (Variable): The input variable which is a Tensor or LoDTensor.
Returns:
tuple: The Tensor variables representing the output and attention scores.
Examples:
.. code-block:: python
# Suppose q, k, v are tensor variables with the following shape:
# q: [3, 5, 9], k: [3, 6, 9], v: [3, 6, 10]
out, attn_scores = fluid.nets.dot_product_attention(q, k, v)
out.shape # [3, 5, 10]
attn_scores.shape # [3, 5, 6]
"""
assert keys.shape[-2] == values.shape[
-2], 'The shapes of keys and values mismatch.'
assert querys.shape[-1] == keys.shape[
-1], 'The shapes of querys and keys mismatch.'
product = layers.matmul(x=querys, y=keys, transpose_y=True)
attn_scores = layers.reshape(
x=layers.reshape(
x=product, shape=[-1, product.shape[-1]], act='softmax'),
shape=product.shape)
out = layers.matmul(attn_scores, values)
return out, attn_scores
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import unittest import unittest
import paddle.v2 as paddle import paddle.v2 as paddle
import paddle.v2.fluid.core as core import paddle.v2.fluid.core as core
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import unittest import unittest
import numpy as np import numpy as np
from op_test import OpTest from op_test import OpTest
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
#You may obtain a copy of the License at # You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
#Unless required by applicable law or agreed to in writing, software # Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS, # distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
#limitations under the License. # limitations under the License.
import unittest import unittest
import numpy as np import numpy as np
from op_test import OpTest from op_test import OpTest
...@@ -40,6 +40,16 @@ class TestElementwiseOp(OpTest): ...@@ -40,6 +40,16 @@ class TestElementwiseOp(OpTest):
['X'], 'Out', max_relative_error=0.005, no_grad_set=set('Y')) ['X'], 'Out', max_relative_error=0.005, no_grad_set=set('Y'))
class TestElementwiseAddOp_scalar(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_add"
self.inputs = {
'X': np.random.rand(2, 3, 4).astype(np.float32),
'Y': np.random.rand(1).astype(np.float32)
}
self.outputs = {'Out': self.inputs['X'] + self.inputs['Y']}
class TestElementwiseAddOp_Vector(TestElementwiseOp): class TestElementwiseAddOp_Vector(TestElementwiseOp):
def setUp(self): def setUp(self):
self.op_type = "elementwise_add" self.op_type = "elementwise_add"
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
#You may obtain a copy of the License at # You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
#Unless required by applicable law or agreed to in writing, software # Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS, # distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
#limitations under the License. # limitations under the License.
import unittest import unittest
import numpy as np import numpy as np
from op_test import OpTest from op_test import OpTest
...@@ -45,6 +45,16 @@ class ElementwiseDivOp(OpTest): ...@@ -45,6 +45,16 @@ class ElementwiseDivOp(OpTest):
['X'], 'Out', max_relative_error=0.05, no_grad_set=set('Y')) ['X'], 'Out', max_relative_error=0.05, no_grad_set=set('Y'))
class TestElementwiseDivOp_scalar(ElementwiseDivOp):
def setUp(self):
self.op_type = "elementwise_div"
self.inputs = {
'X': np.random.uniform(0.1, 1, [2, 3, 4]).astype(np.float32),
'Y': np.random.uniform(0.1, 1, [1]).astype(np.float32)
}
self.outputs = {'Out': self.inputs['X'] / self.inputs['Y']}
class TestElementwiseDivOp_Vector(ElementwiseDivOp): class TestElementwiseDivOp_Vector(ElementwiseDivOp):
def setUp(self): def setUp(self):
self.op_type = "elementwise_div" self.op_type = "elementwise_div"
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import unittest
import numpy as np
from op_test import OpTest
class TestElementwiseOp(OpTest):
def setUp(self):
self.op_type = "elementwise_max"
# If x and y have the same value, the max() is not differentiable.
# So we generate test data by the following method
# to avoid them being too close to each other.
x = np.random.uniform(0.1, 1, [13, 17]).astype("float32")
sgn = np.random.choice([-1, 1], [13, 17]).astype("float32")
y = x + sgn * np.random.uniform(0.1, 1, [13, 17]).astype("float32")
self.inputs = {'X': x, 'Y': y}
self.outputs = {'Out': np.maximum(self.inputs['X'], self.inputs['Y'])}
def test_check_output(self):
self.check_output()
def test_check_grad_normal(self):
self.check_grad(['X', 'Y'], 'Out', max_relative_error=0.005)
def test_check_grad_ingore_x(self):
self.check_grad(
['Y'], 'Out', max_relative_error=0.005, no_grad_set=set("X"))
def test_check_grad_ingore_y(self):
self.check_grad(
['X'], 'Out', max_relative_error=0.005, no_grad_set=set('Y'))
class TestElementwiseMaxOp_scalar(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_max"
x = np.random.random_integers(-5, 5, [2, 3, 4]).astype("float32")
y = np.array([0.5]).astype("float32")
self.inputs = {'X': x, 'Y': y}
self.outputs = {'Out': np.maximum(self.inputs['X'], self.inputs['Y'])}
class TestElementwiseMaxOp_Vector(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_max"
x = np.random.random((32, )).astype("float32")
sgn = np.random.choice([-1, 1], (32, )).astype("float32")
y = x + sgn * np.random.uniform(0.1, 1, (32, )).astype("float32")
self.inputs = {'X': x, 'Y': y}
self.outputs = {'Out': np.maximum(self.inputs['X'], self.inputs['Y'])}
class TestElementwiseMaxOp_broadcast_0(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_max"
x = np.random.uniform(0.5, 1, (2, 3, 4)).astype(np.float32)
sgn = np.random.choice([-1, 1], (2, )).astype(np.float32)
y = x[:, 0, 0] + sgn * \
np.random.uniform(1, 2, (2, )).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.attrs = {'axis': 0}
self.outputs = {
'Out':
np.maximum(self.inputs['X'], self.inputs['Y'].reshape(2, 1, 1))
}
class TestElementwiseMaxOp_broadcast_1(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_max"
x = np.random.uniform(0.5, 1, (2, 3, 4)).astype(np.float32)
sgn = np.random.choice([-1, 1], (3, )).astype(np.float32)
y = x[0, :, 0] + sgn * \
np.random.uniform(1, 2, (3, )).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.attrs = {'axis': 1}
self.outputs = {
'Out':
np.maximum(self.inputs['X'], self.inputs['Y'].reshape(1, 3, 1))
}
class TestElementwiseMaxOp_broadcast_2(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_max"
x = np.random.uniform(0.5, 1, (2, 3, 4)).astype(np.float32)
sgn = np.random.choice([-1, 1], (4, )).astype(np.float32)
y = x[0, 0, :] + sgn * \
np.random.uniform(1, 2, (4, )).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.outputs = {
'Out':
np.maximum(self.inputs['X'], self.inputs['Y'].reshape(1, 1, 4))
}
class TestElementwiseMaxOp_broadcast_3(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_max"
x = np.random.uniform(0.5, 1, (2, 3, 4, 5)).astype(np.float32)
sgn = np.random.choice([-1, 1], (3, 4)).astype(np.float32)
y = x[0, :, :, 0] + sgn * \
np.random.uniform(1, 2, (3, 4)).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.attrs = {'axis': 1}
self.outputs = {
'Out':
np.maximum(self.inputs['X'], self.inputs['Y'].reshape(1, 3, 4, 1))
}
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import unittest
import numpy as np
from op_test import OpTest
class TestElementwiseOp(OpTest):
def setUp(self):
self.op_type = "elementwise_min"
# If x and y have the same value, the min() is not differentiable.
# So we generate test data by the following method
# to avoid them being too close to each other.
x = np.random.uniform(0.1, 1, [13, 17]).astype("float32")
sgn = np.random.choice([-1, 1], [13, 17]).astype("float32")
y = x + sgn * np.random.uniform(0.1, 1, [13, 17]).astype("float32")
self.inputs = {'X': x, 'Y': y}
self.outputs = {'Out': np.minimum(self.inputs['X'], self.inputs['Y'])}
def test_check_output(self):
self.check_output()
def test_check_grad_normal(self):
self.check_grad(['X', 'Y'], 'Out', max_relative_error=0.005)
def test_check_grad_ingore_x(self):
self.check_grad(
['Y'], 'Out', max_relative_error=0.005, no_grad_set=set("X"))
def test_check_grad_ingore_y(self):
self.check_grad(
['X'], 'Out', max_relative_error=0.005, no_grad_set=set('Y'))
class TestElementwiseMinOp_scalar(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_min"
x = np.random.random_integers(-5, 5, [2, 3, 4]).astype("float32")
y = np.array([0.5]).astype("float32")
self.inputs = {'X': x, 'Y': y}
self.outputs = {'Out': np.minimum(self.inputs['X'], self.inputs['Y'])}
class TestElementwiseMaxOp_Vector(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_min"
x = np.random.random((32, )).astype("float32")
sgn = np.random.choice([-1, 1], (32, )).astype("float32")
y = x + sgn * np.random.uniform(0.1, 1, (32, )).astype("float32")
self.inputs = {'X': x, 'Y': y}
self.outputs = {'Out': np.minimum(self.inputs['X'], self.inputs['Y'])}
class TestElementwiseMaxOp_broadcast_0(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_min"
x = np.random.uniform(0.5, 1, (2, 3, 4)).astype(np.float32)
sgn = np.random.choice([-1, 1], (2, )).astype(np.float32)
y = x[:, 0, 0] + sgn * \
np.random.uniform(1, 2, (2, )).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.attrs = {'axis': 0}
self.outputs = {
'Out':
np.minimum(self.inputs['X'], self.inputs['Y'].reshape(2, 1, 1))
}
class TestElementwiseMaxOp_broadcast_1(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_min"
x = np.random.uniform(0.5, 1, (2, 3, 4)).astype(np.float32)
sgn = np.random.choice([-1, 1], (3, )).astype(np.float32)
y = x[0, :, 0] + sgn * \
np.random.uniform(1, 2, (3, )).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.attrs = {'axis': 1}
self.outputs = {
'Out':
np.minimum(self.inputs['X'], self.inputs['Y'].reshape(1, 3, 1))
}
class TestElementwiseMaxOp_broadcast_2(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_min"
x = np.random.uniform(0.5, 1, (2, 3, 4)).astype(np.float32)
sgn = np.random.choice([-1, 1], (4, )).astype(np.float32)
y = x[0, 0, :] + sgn * \
np.random.uniform(1, 2, (4, )).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.outputs = {
'Out':
np.minimum(self.inputs['X'], self.inputs['Y'].reshape(1, 1, 4))
}
class TestElementwiseMaxOp_broadcast_3(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_min"
x = np.random.uniform(0.5, 1, (2, 3, 4, 5)).astype(np.float32)
sgn = np.random.choice([-1, 1], (3, 4)).astype(np.float32)
y = x[0, :, :, 0] + sgn * \
np.random.uniform(1, 2, (3, 4)).astype(np.float32)
self.inputs = {'X': x, 'Y': y}
self.attrs = {'axis': 1}
self.outputs = {
'Out':
np.minimum(self.inputs['X'], self.inputs['Y'].reshape(1, 3, 4, 1))
}
if __name__ == '__main__':
unittest.main()
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
#You may obtain a copy of the License at # You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
#Unless required by applicable law or agreed to in writing, software # Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS, # distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
#limitations under the License. # limitations under the License.
import unittest import unittest
import numpy as np import numpy as np
from op_test import OpTest from op_test import OpTest
...@@ -38,6 +38,16 @@ class ElementwiseMulOp(OpTest): ...@@ -38,6 +38,16 @@ class ElementwiseMulOp(OpTest):
self.check_grad(['X'], 'Out', no_grad_set=set('Y')) self.check_grad(['X'], 'Out', no_grad_set=set('Y'))
class TestElementwiseMulOp_scalar(ElementwiseMulOp):
def setUp(self):
self.op_type = "elementwise_mul"
self.inputs = {
'X': np.random.rand(2, 3, 4).astype(np.float32),
'Y': np.random.rand(1).astype(np.float32)
}
self.outputs = {'Out': self.inputs['X'] * self.inputs['Y']}
class TestElementwiseMulOp_Vector(ElementwiseMulOp): class TestElementwiseMulOp_Vector(ElementwiseMulOp):
def setUp(self): def setUp(self):
self.op_type = "elementwise_mul" self.op_type = "elementwise_mul"
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve. # Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
# #
#Licensed under the Apache License, Version 2.0 (the "License"); # Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License. # you may not use this file except in compliance with the License.
#You may obtain a copy of the License at # You may obtain a copy of the License at
# #
# http://www.apache.org/licenses/LICENSE-2.0 # http://www.apache.org/licenses/LICENSE-2.0
# #
#Unless required by applicable law or agreed to in writing, software # Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS, # distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and # See the License for the specific language governing permissions and
#limitations under the License. # limitations under the License.
import unittest import unittest
import numpy as np import numpy as np
from op_test import OpTest from op_test import OpTest
...@@ -40,6 +40,16 @@ class TestElementwiseOp(OpTest): ...@@ -40,6 +40,16 @@ class TestElementwiseOp(OpTest):
['X'], 'Out', max_relative_error=0.005, no_grad_set=set('Y')) ['X'], 'Out', max_relative_error=0.005, no_grad_set=set('Y'))
class TestElementwiseSubOp_scalar(TestElementwiseOp):
def setUp(self):
self.op_type = "elementwise_sub"
self.inputs = {
'X': np.random.rand(2, 3, 4).astype(np.float32),
'Y': np.random.rand(1).astype(np.float32)
}
self.outputs = {'Out': self.inputs['X'] - self.inputs['Y']}
class TestElementwiseSubOp_Vector(TestElementwiseOp): class TestElementwiseSubOp_Vector(TestElementwiseOp):
def setUp(self): def setUp(self):
self.op_type = "elementwise_sub" self.op_type = "elementwise_sub"
......
...@@ -96,18 +96,18 @@ class Generator(object): ...@@ -96,18 +96,18 @@ class Generator(object):
self.outputs = {'Out': Out} self.outputs = {'Out': Out}
def test_check_output(self): def test_check_output(self):
self.check_output(atol=1e-2) self.check_output(atol=1e-3)
def test_check_grad_normal(self): def test_check_grad_normal(self):
self.check_grad(['X', 'Y'], 'Out', max_relative_error=0.5) self.check_grad(['X', 'Y'], 'Out', max_relative_error=1e-3)
def test_check_grad_ignore_x(self): def test_check_grad_ignore_x(self):
self.check_grad( self.check_grad(
['Y'], 'Out', max_relative_error=0.5, no_grad_set=set("X")) ['Y'], 'Out', max_relative_error=1e-3, no_grad_set=set("X"))
def test_check_grad_ignore_y(self): def test_check_grad_ignore_y(self):
self.check_grad( self.check_grad(
['X'], 'Out', max_relative_error=0.5, no_grad_set=set('Y')) ['X'], 'Out', max_relative_error=1e-3, no_grad_set=set('Y'))
# Generate test cases for all possibilities # Generate test cases for all possibilities
......
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
import unittest
import paddle.v2.fluid as fluid
import paddle.v2.fluid.core as core
import numpy as np
class TestNormalization(unittest.TestCase):
data_desc = {"name": "input", "shape": (2, 3, 7)}
def gen_random_input(self):
"""Generate random input data.
"""
self.data = np.random.random(
size=self.data_desc["shape"]).astype("float32")
def set_program(self, axis, epsilon):
"""Build the test program.
"""
data = fluid.layers.data(
name=self.data_desc["name"],
shape=self.data_desc["shape"],
dtype="float32",
append_batch_size=False)
data.stop_gradient = False
l2_norm = fluid.layers.l2_normalize(x=data, axis=axis, epsilon=epsilon)
out = fluid.layers.reduce_sum(l2_norm, dim=None)
fluid.backward.append_backward(loss=out)
self.fetch_list = [l2_norm]
def run_program(self):
"""Run the test program.
"""
places = [core.CPUPlace()]
if core.is_compile_gpu():
places.append(core.CUDAPlace(0))
for place in places:
self.set_inputs(place)
exe = fluid.Executor(place)
output = exe.run(fluid.default_main_program(),
feed=self.inputs,
fetch_list=self.fetch_list,
return_numpy=True)
self.op_output = output
def set_inputs(self, place):
"""Set the randomly generated data to the test program.
"""
self.inputs = {}
tensor = fluid.Tensor()
tensor.set(self.data, place)
self.inputs[self.data_desc["name"]] = tensor
def l2_normalize(self, data, axis, epsilon):
""" Compute the groundtruth.
"""
output = data * np.reciprocal(
np.sum(np.square(data), axis=axis, keepdims=True))
return output
def test_l2_normalize(self):
""" Test the python wrapper for l2_normalize.
"""
axis = 1
#TODO(caoying) epsilon is not supported due to lack of a maximum_op.
epsilon = 1e-6
self.gen_random_input()
self.set_program(axis, epsilon)
self.run_program()
expect_output = self.l2_normalize(self.data, axis, epsilon)
# check output
self.assertTrue(np.allclose(self.op_output, expect_output, atol=0.001))
if __name__ == '__main__':
unittest.main()
...@@ -151,24 +151,28 @@ class BaseParallelForTest(unittest.TestCase): ...@@ -151,24 +151,28 @@ class BaseParallelForTest(unittest.TestCase):
class ParallelOpTest(BaseParallelForTest): class ParallelOpTest(BaseParallelForTest):
def test_simple_fc(self): @staticmethod
def __network__(): def __network__():
x = fluid.layers.data(shape=[784], dtype='float32', name='img') x = fluid.layers.data(shape=[784], dtype='float32', name='img')
# FIXME: This is a bug of parallel.do
x.stop_gradient = False
x = yield x x = yield x
hidden = fluid.layers.fc(input=x, size=200, param_attr='fc1.w') hidden = fluid.layers.fc(input=x, size=200, param_attr='fc1.w')
loss = fluid.layers.mean(x=hidden) loss = fluid.layers.mean(x=hidden)
yield loss yield loss
def test_simple_fc(self):
self.run_test( self.run_test(
callback=__network__, callback=ParallelOpTest.__network__,
feed={ feed={
'img': 'img': numpy.random.random(size=(51, 784)).astype('float32')
numpy.random.random(size=(128 * 3, 784)).astype('float32')
}, },
fetch='fc1.w@GRAD') fetch='fc1.w@GRAD')
def test_fc_with_tiny_data(self):
self.run_test(
callback=ParallelOpTest.__network__,
feed={'img': numpy.random.random(size=(1, 784)).astype('float32')},
fetch='fc1.w@GRAD')
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()
The examples in v1_api_demo are using v1_api currently, and will be upgraded to v2_api later.
Thus, v1_api_demo is a temporary directory. We decide not to maintain it and will delete it in future.
Please go to [PaddlePaddle/book](https://github.com/PaddlePaddle/book) and
[PaddlePaddle/models](https://github.com/PaddlePaddle/models) to learn PaddlePaddle.
output/
uniform_params/
cifar_params/
mnist_params/
*.png
.pydevproject
.project
*.log
*.pyc
data/mnist_data/
data/cifar-10-batches-py/
# Generative Adversarial Networks (GAN)
This demo implements GAN training described in the original GAN paper (https://arxiv.org/abs/1406.2661) and DCGAN (https://arxiv.org/abs/1511.06434).
The general training procedures are implemented in gan_trainer.py. The neural network configurations are specified in gan_conf.py (for synthetic data) and gan_conf_image.py (for image data).
In order to run the model, first download the corresponding data by running the shell script in ./data.
Then you can run the command below. The flag -d specifies the training data (cifar, mnist or uniform) and flag --useGpu specifies whether to use gpu for training (0 is cpu, 1 is gpu).
$python gan_trainer.py -d cifar --use_gpu 1
The generated images will be stored in ./cifar_samples/
The corresponding models will be stored in ./cifar_params/
#!/bin/bash
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
wget https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
tar zxf cifar-10-python.tar.gz
rm cifar-10-python.tar.gz
#!/usr/bin/env sh
# This script downloads the mnist data and unzips it.
set -e
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
rm -rf "$DIR/mnist_data"
mkdir "$DIR/mnist_data"
cd "$DIR/mnist_data"
echo "Downloading..."
for fname in train-images-idx3-ubyte train-labels-idx1-ubyte t10k-images-idx3-ubyte t10k-labels-idx1-ubyte
do
if [ ! -e $fname ]; then
wget --no-check-certificate http://yann.lecun.com/exdb/mnist/${fname}.gz
gunzip ${fname}.gz
fi
done
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import *
mode = get_config_arg("mode", str, "generator")
assert mode in set([
"generator", "discriminator", "generator_training", "discriminator_training"
])
is_generator_training = mode == "generator_training"
is_discriminator_training = mode == "discriminator_training"
is_generator = mode == "generator"
is_discriminator = mode == "discriminator"
# The network structure below follows the ref https://arxiv.org/abs/1406.2661
# Here we used two hidden layers and batch_norm
print('mode=%s' % mode)
# the dim of the noise (z) as the input of the generator network
noise_dim = 10
# the dim of the hidden layer
hidden_dim = 10
# the dim of the generated sample
sample_dim = 2
settings(
batch_size=128,
learning_rate=1e-4,
learning_method=AdamOptimizer(beta1=0.5))
def discriminator(sample):
"""
discriminator ouputs the probablity of a sample is from generator
or real data.
The output has two dimenstional: dimension 0 is the probablity
of the sample is from generator and dimension 1 is the probabblity
of the sample is from real data.
"""
param_attr = ParamAttr(is_static=is_generator_training)
bias_attr = ParamAttr(
is_static=is_generator_training, initial_mean=1.0, initial_std=0)
hidden = fc_layer(
input=sample,
name="dis_hidden",
size=hidden_dim,
bias_attr=bias_attr,
param_attr=param_attr,
act=ReluActivation())
hidden2 = fc_layer(
input=hidden,
name="dis_hidden2",
size=hidden_dim,
bias_attr=bias_attr,
param_attr=param_attr,
act=LinearActivation())
hidden_bn = batch_norm_layer(
hidden2,
act=ReluActivation(),
name="dis_hidden_bn",
bias_attr=bias_attr,
param_attr=ParamAttr(
is_static=is_generator_training, initial_mean=1.0,
initial_std=0.02),
use_global_stats=False)
return fc_layer(
input=hidden_bn,
name="dis_prob",
size=2,
bias_attr=bias_attr,
param_attr=param_attr,
act=SoftmaxActivation())
def generator(noise):
"""
generator generates a sample given noise
"""
param_attr = ParamAttr(is_static=is_discriminator_training)
bias_attr = ParamAttr(
is_static=is_discriminator_training, initial_mean=1.0, initial_std=0)
hidden = fc_layer(
input=noise,
name="gen_layer_hidden",
size=hidden_dim,
bias_attr=bias_attr,
param_attr=param_attr,
act=ReluActivation())
hidden2 = fc_layer(
input=hidden,
name="gen_hidden2",
size=hidden_dim,
bias_attr=bias_attr,
param_attr=param_attr,
act=LinearActivation())
hidden_bn = batch_norm_layer(
hidden2,
act=ReluActivation(),
name="gen_layer_hidden_bn",
bias_attr=bias_attr,
param_attr=ParamAttr(
is_static=is_discriminator_training,
initial_mean=1.0,
initial_std=0.02),
use_global_stats=False)
return fc_layer(
input=hidden_bn,
name="gen_layer1",
size=sample_dim,
bias_attr=bias_attr,
param_attr=param_attr,
act=LinearActivation())
if is_generator_training:
noise = data_layer(name="noise", size=noise_dim)
sample = generator(noise)
if is_discriminator_training:
sample = data_layer(name="sample", size=sample_dim)
if is_generator_training or is_discriminator_training:
label = data_layer(name="label", size=1)
prob = discriminator(sample)
cost = cross_entropy(input=prob, label=label)
classification_error_evaluator(
input=prob, label=label, name=mode + '_error')
outputs(cost)
if is_generator:
noise = data_layer(name="noise", size=noise_dim)
outputs(generator(noise))
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import *
mode = get_config_arg("mode", str, "generator")
dataSource = get_config_arg("data", str, "mnist")
assert mode in set([
"generator", "discriminator", "generator_training", "discriminator_training"
])
is_generator_training = mode == "generator_training"
is_discriminator_training = mode == "discriminator_training"
is_generator = mode == "generator"
is_discriminator = mode == "discriminator"
# The network structure below follows the dcgan paper
# (https://arxiv.org/abs/1511.06434)
print('mode=%s' % mode)
# the dim of the noise (z) as the input of the generator network
noise_dim = 100
# the number of filters in the layer in generator/discriminator that is
# closet to the image
gf_dim = 64
df_dim = 64
if dataSource == "mnist":
sample_dim = 28 # image dim
c_dim = 1 # image color
else:
sample_dim = 32
c_dim = 3
s2, s4 = int(sample_dim / 2), int(sample_dim / 4),
s8, s16 = int(sample_dim / 8), int(sample_dim / 16)
settings(
batch_size=128,
learning_rate=2e-4,
learning_method=AdamOptimizer(beta1=0.5))
def conv_bn(input,
channels,
imgSize,
num_filters,
output_x,
stride,
name,
param_attr,
bias_attr,
param_attr_bn,
bn,
trans=False,
act=ReluActivation()):
"""
conv_bn is a utility function that constructs a convolution/deconv layer
with an optional batch_norm layer
:param bn: whether to use batch_norm_layer
:type bn: bool
:param trans: whether to use conv (False) or deconv (True)
:type trans: bool
"""
# calculate the filter_size and padding size based on the given
# imgSize and ouput size
tmp = imgSize - (output_x - 1) * stride
if tmp <= 1 or tmp > 5:
raise ValueError("conv input-output dimension does not fit")
elif tmp <= 3:
filter_size = tmp + 2
padding = 1
else:
filter_size = tmp
padding = 0
print(imgSize, output_x, stride, filter_size, padding)
if trans:
nameApx = "_convt"
else:
nameApx = "_conv"
if bn:
conv = img_conv_layer(
input,
filter_size=filter_size,
num_filters=num_filters,
name=name + nameApx,
num_channels=channels,
act=LinearActivation(),
groups=1,
stride=stride,
padding=padding,
bias_attr=bias_attr,
param_attr=param_attr,
shared_biases=True,
layer_attr=None,
filter_size_y=None,
stride_y=None,
padding_y=None,
trans=trans)
conv_bn = batch_norm_layer(
conv,
act=act,
name=name + nameApx + "_bn",
bias_attr=bias_attr,
param_attr=param_attr_bn,
use_global_stats=False)
return conv_bn
else:
conv = img_conv_layer(
input,
filter_size=filter_size,
num_filters=num_filters,
name=name + nameApx,
num_channels=channels,
act=act,
groups=1,
stride=stride,
padding=padding,
bias_attr=bias_attr,
param_attr=param_attr,
shared_biases=True,
layer_attr=None,
filter_size_y=None,
stride_y=None,
padding_y=None,
trans=trans)
return conv
def generator(noise):
"""
generator generates a sample given noise
"""
param_attr = ParamAttr(
is_static=is_discriminator_training, initial_mean=0.0, initial_std=0.02)
bias_attr = ParamAttr(
is_static=is_discriminator_training, initial_mean=0.0, initial_std=0.0)
param_attr_bn = ParamAttr(
is_static=is_discriminator_training, initial_mean=1.0, initial_std=0.02)
h1 = fc_layer(
input=noise,
name="gen_layer_h1",
size=s8 * s8 * gf_dim * 4,
bias_attr=bias_attr,
param_attr=param_attr,
act=LinearActivation())
h1_bn = batch_norm_layer(
h1,
act=ReluActivation(),
name="gen_layer_h1_bn",
bias_attr=bias_attr,
param_attr=param_attr_bn,
use_global_stats=False)
h2_bn = conv_bn(
h1_bn,
channels=gf_dim * 4,
output_x=s8,
num_filters=gf_dim * 2,
imgSize=s4,
stride=2,
name="gen_layer_h2",
param_attr=param_attr,
bias_attr=bias_attr,
param_attr_bn=param_attr_bn,
bn=True,
trans=True)
h3_bn = conv_bn(
h2_bn,
channels=gf_dim * 2,
output_x=s4,
num_filters=gf_dim,
imgSize=s2,
stride=2,
name="gen_layer_h3",
param_attr=param_attr,
bias_attr=bias_attr,
param_attr_bn=param_attr_bn,
bn=True,
trans=True)
return conv_bn(
h3_bn,
channels=gf_dim,
output_x=s2,
num_filters=c_dim,
imgSize=sample_dim,
stride=2,
name="gen_layer_h4",
param_attr=param_attr,
bias_attr=bias_attr,
param_attr_bn=param_attr_bn,
bn=False,
trans=True,
act=TanhActivation())
def discriminator(sample):
"""
discriminator ouputs the probablity of a sample is from generator
or real data.
The output has two dimenstional: dimension 0 is the probablity
of the sample is from generator and dimension 1 is the probabblity
of the sample is from real data.
"""
param_attr = ParamAttr(
is_static=is_generator_training, initial_mean=0.0, initial_std=0.02)
bias_attr = ParamAttr(
is_static=is_generator_training, initial_mean=0.0, initial_std=0.0)
param_attr_bn = ParamAttr(
is_static=is_generator_training, initial_mean=1.0, initial_std=0.02)
h0 = conv_bn(
sample,
channels=c_dim,
imgSize=sample_dim,
num_filters=df_dim,
output_x=s2,
stride=2,
name="dis_h0",
param_attr=param_attr,
bias_attr=bias_attr,
param_attr_bn=param_attr_bn,
bn=False)
h1_bn = conv_bn(
h0,
channels=df_dim,
imgSize=s2,
num_filters=df_dim * 2,
output_x=s4,
stride=2,
name="dis_h1",
param_attr=param_attr,
bias_attr=bias_attr,
param_attr_bn=param_attr_bn,
bn=True)
h2_bn = conv_bn(
h1_bn,
channels=df_dim * 2,
imgSize=s4,
num_filters=df_dim * 4,
output_x=s8,
stride=2,
name="dis_h2",
param_attr=param_attr,
bias_attr=bias_attr,
param_attr_bn=param_attr_bn,
bn=True)
return fc_layer(
input=h2_bn,
name="dis_prob",
size=2,
bias_attr=bias_attr,
param_attr=param_attr,
act=SoftmaxActivation())
if is_generator_training:
noise = data_layer(name="noise", size=noise_dim)
sample = generator(noise)
if is_discriminator_training:
sample = data_layer(name="sample", size=sample_dim * sample_dim * c_dim)
if is_generator_training or is_discriminator_training:
label = data_layer(name="label", size=1)
prob = discriminator(sample)
cost = cross_entropy(input=prob, label=label)
classification_error_evaluator(
input=prob, label=label, name=mode + '_error')
outputs(cost)
if is_generator:
noise = data_layer(name="noise", size=noise_dim)
outputs(generator(noise))
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import argparse
import random
import numpy
import cPickle
import sys, os
from PIL import Image
from paddle.trainer.config_parser import parse_config
from paddle.trainer.config_parser import logger
import py_paddle.swig_paddle as api
import matplotlib.pyplot as plt
def plot2DScatter(data, outputfile):
'''
Plot the data as a 2D scatter plot and save to outputfile
data needs to be two dimensinoal
'''
x = data[:, 0]
y = data[:, 1]
logger.info("The mean vector is %s" % numpy.mean(data, 0))
logger.info("The std vector is %s" % numpy.std(data, 0))
heatmap, xedges, yedges = numpy.histogram2d(x, y, bins=50)
extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
plt.clf()
plt.scatter(x, y)
plt.savefig(outputfile, bbox_inches='tight')
def CHECK_EQ(a, b):
assert a == b, "a=%s, b=%s" % (a, b)
def copy_shared_parameters(src, dst):
'''
copy the parameters from src to dst
:param src: the source of the parameters
:type src: GradientMachine
:param dst: the destination of the parameters
:type dst: GradientMachine
'''
src_params = [src.getParameter(i) for i in xrange(src.getParameterSize())]
src_params = dict([(p.getName(), p) for p in src_params])
for i in xrange(dst.getParameterSize()):
dst_param = dst.getParameter(i)
src_param = src_params.get(dst_param.getName(), None)
if src_param is None:
continue
src_value = src_param.getBuf(api.PARAMETER_VALUE)
dst_value = dst_param.getBuf(api.PARAMETER_VALUE)
CHECK_EQ(len(src_value), len(dst_value))
dst_value.copyFrom(src_value)
dst_param.setValueUpdated()
def print_parameters(src):
src_params = [src.getParameter(i) for i in xrange(src.getParameterSize())]
print "***************"
for p in src_params:
print "Name is %s" % p.getName()
print "value is %s \n" % p.getBuf(api.PARAMETER_VALUE).copyToNumpyArray(
)
def load_mnist_data(imageFile):
f = open(imageFile, "rb")
f.read(16)
# Define number of samples for train/test
if "train" in imageFile:
n = 60000
else:
n = 10000
data = numpy.fromfile(f, 'ubyte', count=n * 28 * 28).reshape((n, 28 * 28))
data = data / 255.0 * 2.0 - 1.0
f.close()
return data.astype('float32')
def load_cifar_data(cifar_path):
batch_size = 10000
data = numpy.zeros((5 * batch_size, 32 * 32 * 3), dtype="float32")
for i in range(1, 6):
file = cifar_path + "/data_batch_" + str(i)
fo = open(file, 'rb')
dict = cPickle.load(fo)
fo.close()
data[(i - 1) * batch_size:(i * batch_size), :] = dict["data"]
data = data / 255.0 * 2.0 - 1.0
return data
# synthesize 2-D uniform data
def load_uniform_data():
data = numpy.random.rand(1000000, 2).astype('float32')
return data
def merge(images, size):
if images.shape[1] == 28 * 28:
h, w, c = 28, 28, 1
else:
h, w, c = 32, 32, 3
img = numpy.zeros((h * size[0], w * size[1], c))
for idx in xrange(size[0] * size[1]):
i = idx % size[1]
j = idx // size[1]
img[j*h:j*h+h, i*w:i*w+w, :] = \
((images[idx, :].reshape((h, w, c), order="F").transpose(1, 0, 2) + 1.0) / 2.0 * 255.0)
return img.astype('uint8')
def save_images(images, path):
merged_img = merge(images, [8, 8])
if merged_img.shape[2] == 1:
im = Image.fromarray(numpy.squeeze(merged_img)).convert('RGB')
else:
im = Image.fromarray(merged_img, mode="RGB")
im.save(path)
def get_real_samples(batch_size, data_np):
return data_np[numpy.random.choice(
data_np.shape[0], batch_size, replace=False), :]
def get_noise(batch_size, noise_dim):
return numpy.random.normal(size=(batch_size, noise_dim)).astype('float32')
def get_fake_samples(generator_machine, batch_size, noise):
gen_inputs = api.Arguments.createArguments(1)
gen_inputs.setSlotValue(0, api.Matrix.createDenseFromNumpy(noise))
gen_outputs = api.Arguments.createArguments(0)
generator_machine.forward(gen_inputs, gen_outputs, api.PASS_TEST)
fake_samples = gen_outputs.getSlotValue(0).copyToNumpyMat()
return fake_samples
def get_training_loss(training_machine, inputs):
outputs = api.Arguments.createArguments(0)
training_machine.forward(inputs, outputs, api.PASS_TEST)
loss = outputs.getSlotValue(0).copyToNumpyMat()
return numpy.mean(loss)
def prepare_discriminator_data_batch_pos(batch_size, data_np):
real_samples = get_real_samples(batch_size, data_np)
labels = numpy.ones(batch_size, dtype='int32')
inputs = api.Arguments.createArguments(2)
inputs.setSlotValue(0, api.Matrix.createDenseFromNumpy(real_samples))
inputs.setSlotIds(1, api.IVector.createVectorFromNumpy(labels))
return inputs
def prepare_discriminator_data_batch_neg(generator_machine, batch_size, noise):
fake_samples = get_fake_samples(generator_machine, batch_size, noise)
labels = numpy.zeros(batch_size, dtype='int32')
inputs = api.Arguments.createArguments(2)
inputs.setSlotValue(0, api.Matrix.createDenseFromNumpy(fake_samples))
inputs.setSlotIds(1, api.IVector.createVectorFromNumpy(labels))
return inputs
def prepare_generator_data_batch(batch_size, noise):
label = numpy.ones(batch_size, dtype='int32')
inputs = api.Arguments.createArguments(2)
inputs.setSlotValue(0, api.Matrix.createDenseFromNumpy(noise))
inputs.setSlotIds(1, api.IVector.createVectorFromNumpy(label))
return inputs
def find(iterable, cond):
for item in iterable:
if cond(item):
return item
return None
def get_layer_size(model_conf, layer_name):
layer_conf = find(model_conf.layers, lambda x: x.name == layer_name)
assert layer_conf is not None, "Cannot find '%s' layer" % layer_name
return layer_conf.size
def main():
parser = argparse.ArgumentParser()
parser.add_argument("-d", "--data_source", help="mnist or cifar or uniform")
parser.add_argument(
"--use_gpu", default="1", help="1 means use gpu for training")
parser.add_argument("--gpu_id", default="0", help="the gpu_id parameter")
args = parser.parse_args()
data_source = args.data_source
use_gpu = args.use_gpu
assert data_source in ["mnist", "cifar", "uniform"]
assert use_gpu in ["0", "1"]
if not os.path.exists("./%s_samples/" % data_source):
os.makedirs("./%s_samples/" % data_source)
if not os.path.exists("./%s_params/" % data_source):
os.makedirs("./%s_params/" % data_source)
api.initPaddle('--use_gpu=' + use_gpu, '--dot_period=10',
'--log_period=100', '--gpu_id=' + args.gpu_id,
'--save_dir=' + "./%s_params/" % data_source)
if data_source == "uniform":
conf = "gan_conf.py"
num_iter = 10000
else:
conf = "gan_conf_image.py"
num_iter = 1000
gen_conf = parse_config(conf, "mode=generator_training,data=" + data_source)
dis_conf = parse_config(conf,
"mode=discriminator_training,data=" + data_source)
generator_conf = parse_config(conf, "mode=generator,data=" + data_source)
batch_size = dis_conf.opt_config.batch_size
noise_dim = get_layer_size(gen_conf.model_config, "noise")
if data_source == "mnist":
data_np = load_mnist_data("./data/mnist_data/train-images-idx3-ubyte")
elif data_source == "cifar":
data_np = load_cifar_data("./data/cifar-10-batches-py/")
else:
data_np = load_uniform_data()
# this creates a gradient machine for discriminator
dis_training_machine = api.GradientMachine.createFromConfigProto(
dis_conf.model_config)
# this create a gradient machine for generator
gen_training_machine = api.GradientMachine.createFromConfigProto(
gen_conf.model_config)
# generator_machine is used to generate data only, which is used for
# training discriminator
logger.info(str(generator_conf.model_config))
generator_machine = api.GradientMachine.createFromConfigProto(
generator_conf.model_config)
dis_trainer = api.Trainer.create(dis_conf, dis_training_machine)
gen_trainer = api.Trainer.create(gen_conf, gen_training_machine)
dis_trainer.startTrain()
gen_trainer.startTrain()
# Sync parameters between networks (GradientMachine) at the beginning
copy_shared_parameters(gen_training_machine, dis_training_machine)
copy_shared_parameters(gen_training_machine, generator_machine)
# constrain that either discriminator or generator can not be trained
# consecutively more than MAX_strike times
curr_train = "dis"
curr_strike = 0
MAX_strike = 5
for train_pass in xrange(100):
dis_trainer.startTrainPass()
gen_trainer.startTrainPass()
for i in xrange(num_iter):
# Do forward pass in discriminator to get the dis_loss
noise = get_noise(batch_size, noise_dim)
data_batch_dis_pos = prepare_discriminator_data_batch_pos(
batch_size, data_np)
dis_loss_pos = get_training_loss(dis_training_machine,
data_batch_dis_pos)
data_batch_dis_neg = prepare_discriminator_data_batch_neg(
generator_machine, batch_size, noise)
dis_loss_neg = get_training_loss(dis_training_machine,
data_batch_dis_neg)
dis_loss = (dis_loss_pos + dis_loss_neg) / 2.0
# Do forward pass in generator to get the gen_loss
data_batch_gen = prepare_generator_data_batch(batch_size, noise)
gen_loss = get_training_loss(gen_training_machine, data_batch_gen)
if i % 100 == 0:
print "d_pos_loss is %s d_neg_loss is %s" % (dis_loss_pos,
dis_loss_neg)
print "d_loss is %s g_loss is %s" % (dis_loss, gen_loss)
# Decide which network to train based on the training history
# And the relative size of the loss
if (not (curr_train == "dis" and curr_strike == MAX_strike)) and \
((curr_train == "gen" and curr_strike == MAX_strike) or dis_loss > gen_loss):
if curr_train == "dis":
curr_strike += 1
else:
curr_train = "dis"
curr_strike = 1
dis_trainer.trainOneDataBatch(batch_size, data_batch_dis_neg)
dis_trainer.trainOneDataBatch(batch_size, data_batch_dis_pos)
copy_shared_parameters(dis_training_machine,
gen_training_machine)
else:
if curr_train == "gen":
curr_strike += 1
else:
curr_train = "gen"
curr_strike = 1
gen_trainer.trainOneDataBatch(batch_size, data_batch_gen)
# TODO: add API for paddle to allow true parameter sharing between different GradientMachines
# so that we do not need to copy shared parameters.
copy_shared_parameters(gen_training_machine,
dis_training_machine)
copy_shared_parameters(gen_training_machine, generator_machine)
dis_trainer.finishTrainPass()
gen_trainer.finishTrainPass()
# At the end of each pass, save the generated samples/images
fake_samples = get_fake_samples(generator_machine, batch_size, noise)
if data_source == "uniform":
plot2DScatter(fake_samples, "./%s_samples/train_pass%s.png" %
(data_source, train_pass))
else:
save_images(fake_samples, "./%s_samples/train_pass%s.png" %
(data_source, train_pass))
dis_trainer.finishTrain()
gen_trainer.finishTrain()
if __name__ == '__main__':
main()
data/raw_data
data/*.list
mnist_vgg_model
plot.png
train.log
*pyc
.ipynb_checkpoints
params.pkl
params.tar
params.tar.gz
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
"""
A very basic example for how to use current Raw SWIG API to train mnist network.
Current implementation uses Raw SWIG, which means the API call is directly \
passed to C++ side of Paddle.
The user api could be simpler and carefully designed.
"""
import random
import numpy as np
import paddle.v2 as paddle_v2
import py_paddle.swig_paddle as api
from paddle.trainer_config_helpers import *
from py_paddle import DataProviderConverter
from mnist_util import read_from_mnist
def init_parameter(network):
assert isinstance(network, api.GradientMachine)
for each_param in network.getParameters():
assert isinstance(each_param, api.Parameter)
array_size = len(each_param)
array = np.random.uniform(-1.0, 1.0, array_size).astype('float32')
each_param.getBuf(api.PARAMETER_VALUE).copyFromNumpyArray(array)
def generator_to_batch(generator, batch_size):
ret_val = list()
for each_item in generator:
ret_val.append(each_item)
if len(ret_val) == batch_size:
yield ret_val
ret_val = list()
if len(ret_val) != 0:
yield ret_val
class BatchPool(object):
def __init__(self, generator, batch_size):
self.data = list(generator)
self.batch_size = batch_size
def __call__(self):
random.shuffle(self.data)
for offset in xrange(0, len(self.data), self.batch_size):
limit = min(offset + self.batch_size, len(self.data))
yield self.data[offset:limit]
def input_order_converter(generator):
for each_item in generator:
yield each_item['pixel'], each_item['label']
def main():
api.initPaddle("-use_gpu=false", "-trainer_count=4") # use 4 cpu cores
optimizer = paddle_v2.optimizer.Adam(
learning_rate=1e-4,
batch_size=1000,
model_average=ModelAverage(average_window=0.5),
regularization=L2Regularization(rate=0.5))
# Create Local Updater. Local means not run in cluster.
# For a cluster training, here we can change to createRemoteUpdater
# in future.
updater = optimizer.create_local_updater()
assert isinstance(updater, api.ParameterUpdater)
# define network
images = paddle_v2.layer.data(
name='pixel', type=paddle_v2.data_type.dense_vector(784))
label = paddle_v2.layer.data(
name='label', type=paddle_v2.data_type.integer_value(10))
hidden1 = paddle_v2.layer.fc(input=images, size=200)
hidden2 = paddle_v2.layer.fc(input=hidden1, size=200)
inference = paddle_v2.layer.fc(input=hidden2,
size=10,
act=paddle_v2.activation.Softmax())
cost = paddle_v2.layer.classification_cost(input=inference, label=label)
# Create Simple Gradient Machine.
model_config = paddle_v2.layer.parse_network(cost)
m = api.GradientMachine.createFromConfigProto(model_config,
api.CREATE_MODE_NORMAL,
optimizer.enable_types())
# This type check is not useful. Only enable type hint in IDE.
# Such as PyCharm
assert isinstance(m, api.GradientMachine)
# Initialize Parameter by numpy.
init_parameter(network=m)
# Initialize ParameterUpdater.
updater.init(m)
# DataProvider Converter is a utility convert Python Object to Paddle C++
# Input. The input format is as same as Paddle's DataProvider.
converter = DataProviderConverter(input_types=[images.type, label.type])
train_file = './data/raw_data/train'
test_file = './data/raw_data/t10k'
# start gradient machine.
# the gradient machine must be started before invoke forward/backward.
# not just for training, but also for inference.
m.start()
# evaluator can print error rate, etc. It is a C++ class.
batch_evaluator = m.makeEvaluator()
test_evaluator = m.makeEvaluator()
# Get Train Data.
# TrainData will stored in a data pool. Currently implementation is not care
# about memory, speed. Just a very naive implementation.
train_data_generator = input_order_converter(read_from_mnist(train_file))
train_data = BatchPool(train_data_generator, 512)
# outArgs is Neural Network forward result. Here is not useful, just passed
# to gradient_machine.forward
outArgs = api.Arguments.createArguments(0)
for pass_id in xrange(2): # we train 2 passes.
updater.startPass()
for batch_id, data_batch in enumerate(train_data()):
# data_batch is input images.
# here, for online learning, we could get data_batch from network.
# Start update one batch.
pass_type = updater.startBatch(len(data_batch))
# Start BatchEvaluator.
# batch_evaluator can be used between start/finish.
batch_evaluator.start()
# forwardBackward is a shortcut for forward and backward.
# It is sometimes faster than invoke forward/backward separately,
# because in GradientMachine, it may be async.
m.forwardBackward(converter(data_batch), outArgs, pass_type)
for each_param in m.getParameters():
updater.update(each_param)
# Get cost. We use numpy to calculate total cost for this batch.
cost_vec = outArgs.getSlotValue(0)
cost_vec = cost_vec.copyToNumpyMat()
cost = cost_vec.sum() / len(data_batch)
# Make evaluator works.
m.eval(batch_evaluator)
# Print logs.
print 'Pass id', pass_id, 'Batch id', batch_id, 'with cost=', \
cost, batch_evaluator
batch_evaluator.finish()
# Finish batch.
# * will clear gradient.
# * ensure all values should be updated.
updater.finishBatch(cost)
# testing stage. use test data set to test current network.
updater.apply()
test_evaluator.start()
test_data_generator = input_order_converter(read_from_mnist(test_file))
for data_batch in generator_to_batch(test_data_generator, 512):
# in testing stage, only forward is needed.
m.forward(converter(data_batch), outArgs, api.PASS_TEST)
m.eval(test_evaluator)
# print error rate for test data set
print 'Pass', pass_id, ' test evaluator: ', test_evaluator
test_evaluator.finish()
updater.restore()
updater.catchUpWith()
params = m.getParameters()
for each_param in params:
assert isinstance(each_param, api.Parameter)
value = each_param.getBuf(api.PARAMETER_VALUE)
value = value.copyToNumpyArray()
# Here, we could save parameter to every where you want
print each_param.getName(), value
updater.finishPass()
m.finish()
if __name__ == '__main__':
main()
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
o = open("./" + "train.list", "w")
o.write("./data/raw_data/train" + "\n")
o.close()
o = open("./" + "test.list", "w")
o.write("./data/raw_data/t10k" + "\n")
o.close()
#!/usr/bin/env sh
# This scripts downloads the mnist data and unzips it.
set -e
DIR="$( cd "$(dirname "$0")" ; pwd -P )"
rm -rf "$DIR/raw_data"
mkdir "$DIR/raw_data"
cd "$DIR/raw_data"
echo "Downloading..."
for fname in train-images-idx3-ubyte train-labels-idx1-ubyte t10k-images-idx3-ubyte t10k-labels-idx1-ubyte
do
if [ ! -e $fname ]; then
wget --no-check-certificate http://yann.lecun.com/exdb/mnist/${fname}.gz
gunzip ${fname}.gz
fi
done
cd $DIR
rm -f *.list
python generate_list.py
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import *
is_predict = get_config_arg("is_predict", bool, False)
####################Data Configuration ##################
if not is_predict:
data_dir = './data/'
define_py_data_sources2(
train_list=data_dir + 'train.list',
test_list=data_dir + 'test.list',
module='mnist_provider',
obj='process')
######################Algorithm Configuration #############
settings(batch_size=50, learning_rate=0.001, learning_method=AdamOptimizer())
#######################Network Configuration #############
data_size = 1 * 28 * 28
label_size = 10
img = data_layer(name='pixel', size=data_size)
# light cnn
# A shallower cnn model: [CNN, BN, ReLU, Max-Pooling] x4 + FC x1
# Easier to train for mnist dataset and quite efficient
# Final performance is close to deeper ones on tasks such as digital and character classification
def light_cnn(input_image, num_channels, num_classes):
def __light__(ipt,
num_filter=128,
times=1,
conv_filter_size=3,
dropouts=0,
num_channels_=None):
return img_conv_group(
input=ipt,
num_channels=num_channels_,
pool_size=2,
pool_stride=2,
conv_padding=0,
conv_num_filter=[num_filter] * times,
conv_filter_size=conv_filter_size,
conv_act=ReluActivation(),
conv_with_batchnorm=True,
conv_batchnorm_drop_rate=dropouts,
pool_type=MaxPooling())
tmp = __light__(input_image, num_filter=128, num_channels_=num_channels)
tmp = __light__(tmp, num_filter=128)
tmp = __light__(tmp, num_filter=128)
tmp = __light__(tmp, num_filter=128, conv_filter_size=1)
tmp = fc_layer(input=tmp, size=num_classes, act=SoftmaxActivation())
return tmp
predict = light_cnn(input_image=img, num_channels=1, num_classes=label_size)
if not is_predict:
lbl = data_layer(name="label", size=label_size)
inputs(img, lbl)
outputs(classification_cost(input=predict, label=lbl))
else:
outputs(predict)
# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.
from paddle.trainer.PyDataProvider2 import *
from mnist_util import read_from_mnist
# Define a py data provider
@provider(
input_types={'pixel': dense_vector(28 * 28),
'label': integer_value(10)},
cache=CacheType.CACHE_PASS_IN_MEM)
def process(settings, filename): # settings is not used currently.
for each in read_from_mnist(filename):
yield each
import numpy
__all__ = ['read_from_mnist']
def read_from_mnist(filename):
imgf = filename + "-images-idx3-ubyte"
labelf = filename + "-labels-idx1-ubyte"
f = open(imgf, "rb")
l = open(labelf, "rb")
f.read(16)
l.read(8)
# Define number of samples for train/test
if "train" in filename:
n = 60000
else:
n = 10000
images = numpy.fromfile(
f, 'ubyte', count=n * 28 * 28).reshape((n, 28 * 28)).astype('float32')
images = images / 255.0 * 2.0 - 1.0
labels = numpy.fromfile(l, 'ubyte', count=n).astype("int")
for i in xrange(n):
yield {"pixel": images[i, :], 'label': labels[i]}
f.close()
l.close()
#!/bin/bash
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -e
config=vgg_16_mnist.py
output=./mnist_vgg_model
log=train.log
paddle train \
--config=$config \
--dot_period=10 \
--log_period=100 \
--test_all_data_in_one_period=1 \
--use_gpu=0 \
--trainer_count=1 \
--num_passes=100 \
--save_dir=$output \
2>&1 | tee $log
paddle usage -l $log -e $? -n "mnist_train" >/dev/null 2>&1
python -m paddle.utils.plotcurve -i $log > plot.png
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from paddle.trainer_config_helpers import *
is_predict = get_config_arg("is_predict", bool, False)
####################Data Configuration ##################
if not is_predict:
data_dir = './data/'
define_py_data_sources2(
train_list=data_dir + 'train.list',
test_list=data_dir + 'test.list',
module='mnist_provider',
obj='process')
######################Algorithm Configuration #############
settings(
batch_size=128,
learning_rate=0.1 / 128.0,
learning_method=MomentumOptimizer(0.9),
regularization=L2Regularization(0.0005 * 128))
#######################Network Configuration #############
data_size = 1 * 28 * 28
label_size = 10
img = data_layer(name='pixel', size=data_size)
# small_vgg is predined in trainer_config_helpers.network
predict = small_vgg(input_image=img, num_channels=1, num_classes=label_size)
if not is_predict:
lbl = data_layer(name="label", size=label_size)
inputs(img, lbl)
outputs(classification_cost(input=predict, label=lbl))
else:
outputs(predict)
#!/bin/env python
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Example:
python extract_para.py --preModel PREMODEL --preDict PREDICT \
--usrModel USRMODEL --usrDict USRDICT -d DIM
Options:
-h, --help show this help message and exit
--preModel PREMODEL the name of pretrained embedding model
--preDict PREDICT the name of pretrained dictionary
--usrModel usrModel the name of output usr embedding model
--usrDict usrDict the name of user specified dictionary
-d DIM dimension of parameter
"""
from optparse import OptionParser
import struct
def get_row_index(preDict, usrDict):
"""
Get the row positions for all words in user dictionary from pre-trained dictionary.
return: a list of row positions
Example: preDict='a\nb\nc\n', usrDict='a\nc\n', then return [0,2]
"""
pos = []
index = dict()
with open(preDict, "r") as f:
for line_index, line in enumerate(f):
word = line.strip().split()[0]
index[word] = line_index
with open(usrDict, "r") as f:
for line in f:
word = line.strip().split()[0]
pos.append(index[word])
return pos
def extract_parameters_by_usrDict(preModel, preDict, usrModel, usrDict,
paraDim):
"""
Extract desired parameters from a pretrained embedding model based on user dictionary
"""
if paraDim not in [32, 64, 128, 256]:
raise RuntimeError("We only support 32, 64, 128, 256 dimensions now")
fi = open(preModel, "rb")
fo = open(usrModel, "wb")
# write filehead
rowIndex = get_row_index(preDict, usrDict)
newHead = struct.pack("iil", 0, 4, len(rowIndex) * paraDim)
fo.write(newHead)
bytes = 4 * paraDim
for i in range(0, len(rowIndex)):
# find the absolute position of input file
fi.seek(rowIndex[i] * bytes + 16, 0)
fo.write(fi.read(bytes))
print "extract parameters finish, total", len(rowIndex), "lines"
fi.close()
def main():
"""
Main entry for running paraconvert.py
"""
usage = "usage: \n" \
"python %prog --preModel PREMODEL --preDict PREDICT" \
" --usrModel USRMODEL --usrDict USRDICT -d DIM"
parser = OptionParser(usage)
parser.add_option(
"--preModel",
action="store",
dest="preModel",
help="the name of pretrained embedding model")
parser.add_option(
"--preDict",
action="store",
dest="preDict",
help="the name of pretrained dictionary")
parser.add_option(
"--usrModel",
action="store",
dest="usrModel",
help="the name of output usr embedding model")
parser.add_option(
"--usrDict",
action="store",
dest="usrDict",
help="the name of user specified dictionary")
parser.add_option(
"-d", action="store", dest="dim", help="dimension of parameter")
(options, args) = parser.parse_args()
extract_parameters_by_usrDict(options.preModel, options.preDict,
options.usrModel, options.usrDict,
int(options.dim))
if __name__ == '__main__':
main()
#!/bin/env python
# Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserved
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Example:
python paraconvert.py --b2t -i INPUT -o OUTPUT -d DIM
python paraconvert.py --t2b -i INPUT -o OUTPUT
Options:
-h, --help show this help message and exit
--b2t convert parameter file of embedding model from binary to text
--t2b convert parameter file of embedding model from text to binary
-i INPUT input parameter file name
-o OUTPUT output parameter file name
-d DIM dimension of parameter
"""
from optparse import OptionParser
import struct
def binary2text(input, output, paraDim):
"""
Convert a binary parameter file of embedding model to be a text file.
input: the name of input binary parameter file, the format is:
1) the first 16 bytes is filehead:
version(4 bytes): version of paddle, default = 0
floatSize(4 bytes): sizeof(float) = 4
paraCount(8 bytes): total number of parameter
2) the next (paraCount * 4) bytes is parameters, each has 4 bytes
output: the name of output text parameter file, for example:
0,4,32156096
-0.7845433,1.1937413,-0.1704215,...
0.0000909,0.0009465,-0.0008813,...
...
the format is:
1) the first line is filehead:
version=0, floatSize=4, paraCount=32156096
2) other lines print the paramters
a) each line prints paraDim paramters splitted by ','
b) there is paraCount/paraDim lines (embedding words)
paraDim: dimension of parameters
"""
fi = open(input, "rb")
fo = open(output, "w")
"""
"""
version, floatSize, paraCount = struct.unpack("iil", fi.read(16))
newHead = ','.join([str(version), str(floatSize), str(paraCount)])
print >> fo, newHead
bytes = 4 * int(paraDim)
format = "%df" % int(paraDim)
context = fi.read(bytes)
line = 0
while context:
numbers = struct.unpack(format, context)
lst = []
for i in numbers:
lst.append('%8.7f' % i)
print >> fo, ','.join(lst)
context = fi.read(bytes)
line += 1
fi.close()
fo.close()
print "binary2text finish, total", line, "lines"
def get_para_count(input):
"""
Compute the total number of embedding parameters in input text file.
input: the name of input text file
"""
numRows = 1
paraDim = 0
with open(input) as f:
line = f.readline()
paraDim = len(line.split(","))
for line in f:
numRows += 1
return numRows * paraDim
def text2binary(input, output, paddle_head=True):
"""
Convert a text parameter file of embedding model to be a binary file.
input: the name of input text parameter file, for example:
-0.7845433,1.1937413,-0.1704215,...
0.0000909,0.0009465,-0.0008813,...
...
the format is:
1) it doesn't have filehead
2) each line stores the same dimension of parameters,
the separator is commas ','
output: the name of output binary parameter file, the format is:
1) the first 16 bytes is filehead:
version(4 bytes), floatSize(4 bytes), paraCount(8 bytes)
2) the next (paraCount * 4) bytes is parameters, each has 4 bytes
"""
fi = open(input, "r")
fo = open(output, "wb")
newHead = struct.pack("iil", 0, 4, get_para_count(input))
fo.write(newHead)
count = 0
for line in fi:
line = line.strip().split(",")
for i in range(0, len(line)):
binary_data = struct.pack("f", float(line[i]))
fo.write(binary_data)
count += 1
fi.close()
fo.close()
print "text2binary finish, total", count, "lines"
def main():
"""
Main entry for running paraconvert.py
"""
usage = "usage: \n" \
"python %prog --b2t -i INPUT -o OUTPUT -d DIM \n" \
"python %prog --t2b -i INPUT -o OUTPUT"
parser = OptionParser(usage)
parser.add_option(
"--b2t",
action="store_true",
help="convert parameter file of embedding model from binary to text")
parser.add_option(
"--t2b",
action="store_true",
help="convert parameter file of embedding model from text to binary")
parser.add_option(
"-i", action="store", dest="input", help="input parameter file name")
parser.add_option(
"-o", action="store", dest="output", help="output parameter file name")
parser.add_option(
"-d", action="store", dest="dim", help="dimension of parameter")
(options, args) = parser.parse_args()
if options.b2t:
binary2text(options.input, options.output, options.dim)
if options.t2b:
text2binary(options.input, options.output)
if __name__ == '__main__':
main()
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册