Commit b3cd6796, authored by wangmeng28

Merge remote-tracking branch 'upstream/develop' into factorization_machine_layer

# Executor Design Doc
## Motivation
We use executor to do the runtime evaluation of a `ProgramDesc`.
## Overview
An executor takes a `ProgramDesc`, a `block_id` and a `Scope`. The `ProgramDesc` is a list of blocks, and each block contains the protobuf definition of all its parameters and operators. The `block_id` specifies the entrance block. The `Scope` is the container of all variable instances and persists across runs.
### What does executor do?
It evaluates all the operators in the `block_id`th block of a `ProgramDesc`.
### What does executor NOT do?
It does not do runtime optimization, that is, it does not analyze the dependencies between ops to decide which ones to run and in what order.
It does not do graph partitioning, that is, it does not divide the `ProgramDesc` into several small pieces and execute them on different devices.
## Implementation
`Executor` evaluates a `ProgramDesc`. Essentially, it instantiates Variables and Operators, then runs all the operators in sequence. [[code]](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/executor.cc)
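
The run loop described above can be sketched in a few lines. This is a toy Python model, not the C++ implementation in `executor.cc`: `run_block`, the `OPS` table, and the dictionary-based program layout are all hypothetical names chosen for illustration.

```python
# Toy model of the executor loop: create variables, then run ops in order.
# All names here (run_block, OPS, the dict layout) are illustrative assumptions.

OPS = {
    "fill_constant": lambda ins, attrs: attrs["value"],
    "add": lambda ins, attrs: ins[0] + ins[1],
    "mul": lambda ins, attrs: ins[0] * ins[1],
}


def run_block(program, block_id, scope):
    """Evaluate every operator of program[block_id] in declaration order."""
    block = program[block_id]
    # 1. Instantiate the variables declared by the block (if not already there).
    for name in block["vars"]:
        scope.setdefault(name, None)
    # 2. Run operators one by one: no reordering, no dependency analysis.
    for op in block["ops"]:
        inputs = [scope[name] for name in op["inputs"]]
        scope[op["output"]] = OPS[op["type"]](inputs, op.get("attrs", {}))
    return scope


program = [
    {
        "vars": ["a", "b", "c"],
        "ops": [
            {"type": "fill_constant", "inputs": [], "output": "a",
             "attrs": {"value": 2.0}},
            {"type": "fill_constant", "inputs": [], "output": "b",
             "attrs": {"value": 3.0}},
            {"type": "mul", "inputs": ["a", "b"], "output": "c"},
        ],
    }
]

scope = {}  # persists across runs, like `Scope` in the text
run_block(program, 0, scope)
```

Because the scope is passed in rather than created inside `run_block`, a second call on another program can see the variables left behind by the first one.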
# Design Doc: InferVarType
## The Problem Posed
A variable in our design can hold different types, such as `LoDTensor` and `SelectedRows`. An operator should be able to infer the variable types of its outputs.
For example, a `lookup table` operator takes two `LoDTensor`s: one is a float tensor used as the embedding table, the other is an int tensor of word IDs. The gradient operator of `lookup table` generates a `SelectedRows` as its output. A `sum` operator can take both `LoDTensor` and `SelectedRows` as its inputs; it generates a `LoDTensor` if any of its inputs is a `LoDTensor`, and a `SelectedRows` otherwise.
The variable type is constant at runtime. Every variable's type is either set by the user (input data and parameters) or inferred by the operator at compile time.
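
The `sum` rule described above is small enough to write out. The following is a Python sketch of the rule only, not Paddle's C++ code; the function name `infer_sum_output_type` and the string constants are assumptions made for illustration.

```python
# Sketch of the `sum` type rule from the text; not Paddle's C++ code.
# The names below are hypothetical.
LOD_TENSOR = "LoDTensor"
SELECTED_ROWS = "SelectedRows"


def infer_sum_output_type(input_types):
    """Return LoDTensor if any input is a LoDTensor, else SelectedRows."""
    if not input_types:
        raise ValueError("sum needs at least one input")
    if any(t == LOD_TENSOR for t in input_types):
        return LOD_TENSOR
    return SELECTED_ROWS
```

Since the result depends only on the declared input types, it can be computed once at compile time and never changes at runtime.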
## Proposed Solution
`InferVarType` is a compile-time function that is registered for each operator. The interface of that function is:
```c++
using InferVarTypeFN = std::function<
void (const OpDescBind& /*op_desc*/, BlockDescBind* /*block*/)>;
```
It takes an operator description as its input, infers the output variable types, and stores them in the block description.
The `InferVarTypeFN` will be registered in `OpInfo` and stored in its `infer_var_type_` field. The `OpInfo` should be:
```cpp
struct OpInfo {
  InferVarTypeFN infer_var_type_;
  ...
};
```
By default, `InferVarType` sets every output type to `LoDTensor`. `GetInferVarType()` returns this default when no function has been registered.
```cpp
void DefaultInferVarType(const OpDescBind& op_desc, BlockDescBind* block) {
  // set the output type of variable as `LoDTensor`.
  // ...
}

struct OpInfo {
  InferVarTypeFN infer_var_type_;
  InferVarTypeFN GetInferVarType() const {
    if (infer_var_type_) {
      return infer_var_type_;
    } else {
      return DefaultInferVarType;
    }
  }
};
```
## Register InferVarType
We provide a thin base class for registering an `InferVarTypeFN`. Using a base class eases the implementation of the registry, since we can detect whether a registry entry is an `InferVarTypeFN` or not.
```cpp
class VarTypeInferer {
 public:
  virtual ~VarTypeInferer() {}
  virtual void operator()(const OpDescBind& op_desc, BlockDescBind* block) const = 0;
};
```
Operator developers can write a specialized `VarTypeInferer` as follows.
```cpp
class SpecialVarTypeInferer : public VarTypeInferer {
 public:
  virtual void operator()(const OpDescBind& op_desc, BlockDescBind* block) const {
    // .. own logic
  }
};
```
Users can then register the `InferVarType` just like `GradOpDescMaker` and `OpInfoMaker`.
```cpp
REGISTER_OPERATOR(some_op, OpType, SpecialVarTypeInferer, ...);
```
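
The point of the base class is that the registry can inspect each entry's type and put it in the right slot. Below is a minimal Python sketch of that dispatch, under the assumption that entries are classes and are told apart by `issubclass`. `register_operator`, `REGISTRY`, and `SumVarTypeInferer` are hypothetical names, and this is not how `REGISTER_OPERATOR` is implemented in C++.

```python
# Sketch of registry dispatch by base class. Every name below is a
# hypothetical stand-in; this is not Paddle's REGISTER_OPERATOR.

class GradOpDescMaker(object):
    """Stand-in base class for gradient-op makers."""


class VarTypeInferer(object):
    """Stand-in base class: subclasses set output variable types."""

    def __call__(self, op_desc, block):
        raise NotImplementedError


REGISTRY = {}


def register_operator(op_type, *entries):
    """Store each entry in the slot that matches its base class."""
    info = {"grad_op_maker": None, "infer_var_type": None}
    for entry in entries:
        if issubclass(entry, VarTypeInferer):
            info["infer_var_type"] = entry()
        elif issubclass(entry, GradOpDescMaker):
            info["grad_op_maker"] = entry()
        else:
            raise TypeError("unsupported registry entry: %r" % (entry,))
    REGISTRY[op_type] = info


class SumVarTypeInferer(VarTypeInferer):
    def __call__(self, op_desc, block):
        # Record the output type in the block (here: a plain dict).
        block[op_desc["output"]] = "LoDTensor"


register_operator("sum", SumVarTypeInferer)
```

Entries can be passed in any order because the slot is chosen from the entry's type, not from its position in the argument list.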
......@@ -179,40 +179,104 @@ init_attr={
`optimize_op_attrs` is not in the `VarDesc` message, but kept in the Python instance, as it will be used in the Python space when creating the optimize operator's `OpDesc`, and will be in the `OpDesc` message.
## Layer Functions
## Layer Function
A layer is a Python function that creates some operators and variables. Layers simplify the work of application programmers.
### Data Layer
Layer functions take `Variable`s and configuration parameters as their input and return the output variable(s).
For example, `FullyConnected` takes one or more variables as its input. The input could be input data or another layer's output. There are many configuration options for a `FullyConnected` layer, such as layer size, activation, parameter names, initialization strategies of parameters, and so on. The `FullyConnected` layer returns an output variable.
### Necessity for reusing code between layer functions
There is a lot of code that can be reused, such as:
* Provide default values for configuration options. For example, the default initialization strategy for parameters is uniform random with `min = -1.0` and `max = 1.0`, and the default initialization strategy for biases is to fill with zeros.
* Append the activation operator.
* Create a temporary variable.
* Create parameter.
* Generate a unique name.
* Add a bias.
* ...
A mechanism to reuse code between layer functions is necessary. It will be around [150 lines of code](https://github.com/PaddlePaddle/Paddle/pull/4724/files#diff-823b27e07e93914ada859232ae23f846R12) if we write a `FullyConnected` layer without any helper functions.
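
Two of the reusable pieces listed above, unique name generation and configuration defaults, could look like the sketch below. The function names `unique_name` and `default_param_attr` appear in the document's own examples, but their bodies here, the `prefix_N` naming scheme, and the dictionary keys are assumptions made for illustration.

```python
import itertools
from collections import defaultdict

# One independent counter per prefix (hypothetical scheme: "fc_0", "fc_1", ...).
_counters = defaultdict(itertools.count)


def unique_name(prefix):
    """Return a name that has not been handed out before for this prefix."""
    return "%s_%d" % (prefix, next(_counters[prefix]))


def default_param_attr(param_attr=None):
    """Fill in the defaults from the text: uniform random in [-1.0, 1.0]."""
    attr = {"initializer": "uniform_random", "min": -1.0, "max": 1.0}
    attr.update(param_attr or {})  # user-supplied values win over defaults
    return attr
```

Every layer function needs both helpers, which is why writing them once and sharing them matters.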
### Comparison between global functions and helper class
The `FullyConnected` layer will be as follows when we provide global functions:
```python
def data_layer(name, type, column_name):
    block = the_current_program.global_block()
    var = block.create_global_var(
        name=name,
        shape=[None] + type.dims(),
        dtype=type.dtype)
    block.prepend_operator(block,
                           type="Feed",
                           inputs = None,
                           outputs = [var],
                           {column_name: column_name})
    return var

def fc_layer(input, size, param_attr=None, bias_attr=None, act=None, name=None):
    if name is None:
        name = unique_name("fc")
    input = multiple_input(input)
    param_attr = default_param_attr(param_attr)
    param_attr = multiple_param_attr(param_attr, len(input))

    # mul
    mul_results = []
    for ipt, attr in zip(input, param_attr):
        shape = ipt.shape[1:] + [size]
        w = g_program.global_block().create_parameter(shape, ipt.dtype, name, attr)
        tmp = create_tmp_var(name)
        g_program.current_block().append_op("mul", {ipt, w}, {tmp})
        mul_results.append(tmp)

    # add sum
    ...
    # add bias
    ...
    # add activation
    ...
    return out
```
The input to the feed operator is a special variable in the global scope, which is the output of [Python readers](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/reader/README.md).
We can provide many helper functions for layer developers. However, global helper functions have several disadvantages:
1. We need a namespace for these methods so that layer developers can quickly figure out which methods they can use.
2. Global functions force layer developers to pass the same parameters again and again.
So we provide a helper class, `LayerHelper`, to share code between layer functions. The `FullyConnected` layer will be as follows.
```python
def fc_layer(input, size, param_attr=None, bias_attr=None, act=None, name=None):
    helper = LayerHelper(**locals())  # pass all parameters to LayerHelper

    mul_results = []
    for ipt, param in helper.iter_multiple_input_and_param():
        w = helper.create_parameter(shape=ipt.shape[1:] + [size], dtype = ipt.dtype)
        tmp = helper.create_tmp_variable()
        helper.append_op('mul', {ipt, w}, {tmp})
        mul_results.append(tmp)

    pre_bias = helper.add_sum(mul_results)
    pre_activation = helper.add_bias(pre_bias)
    return helper.add_activation(pre_activation)
```
We not only use fewer lines of code to write `fc_layer`, but also make the code easier to understand. At the same time, layer developers can find out which functions they can invoke by typing `helper.` in a Python editor.
### Implementation of layer helper
### FC Layer
We simply keep all parameters of a layer function in a dictionary that is a private data member of the layer helper. Every method of the layer helper looks up the dictionary when it is invoked. In that way, we can implement one layer helper for all layer functions, even if some layers do not contain some operators. For example, `activation` is used by the FullyConnected layer and by convolution layers, but a cross-entropy layer does not use it. The example code of `add_activation` is:
```python
def fc_layer(input, size, ...):
    block = program.current_block()
    w = block.create_parameter(...)
    b = block.create_parameter(...)
    out = block.create_var()
    op = block.append_operator("FC", X=input, W=w, b=b, out=out)
    out.writer = op
    return out

class LayerHelper(object):
    def __init__(self, **kwargs):  # kwargs is short for `keyword arguments`
        self.kwargs = kwargs

    def add_activation(self, input_var):
        act = self.kwargs.get("act", None)  # default value is None
        if act is None:  # do nothing if no act
            return input_var

        tmp = self.create_tmp_var()
        self.append_op(type=act, input=input_var, output=tmp)
        return tmp
```
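
To see the dictionary lookup at work, here is a self-contained sketch of the same idea that can be run without Paddle. It assumes a helper that records operators in a plain list instead of appending them to a block; the `ops` list, the temporary-name format, and the method bodies are all hypothetical.

```python
import itertools


class LayerHelper(object):
    """Runnable stand-in: keeps layer kwargs and records appended ops."""

    _ids = itertools.count()  # shared counter for temporary variable names

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.ops = []  # stands in for the current block's operator list

    def create_tmp_variable(self):
        prefix = self.kwargs.get("name") or "layer"
        return "%s.tmp_%d" % (prefix, next(self._ids))

    def append_op(self, type, input, output):
        self.ops.append({"type": type, "input": input, "output": output})

    def add_activation(self, input_var):
        act = self.kwargs.get("act")
        if act is None:  # layers without an activation pass through
            return input_var
        tmp = self.create_tmp_variable()
        self.append_op(type=act, input=input_var, output=tmp)
        return tmp
```

A layer that never sets `act`, such as a cross-entropy layer, can still call `add_activation` safely: the lookup returns `None` and the input variable is returned unchanged.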
## Optimizer
......
......@@ -43,16 +43,11 @@ cc_library(backward SRCS backward.cc DEPS net_op)
cc_test(backward_test SRCS backward_test.cc DEPS backward recurrent_op device_context)
cc_library(executor SRCS executor.cc DEPS op_registry device_context scope framework_proto backward)
set(EXECUTOR_TEST_OP elementwise_add_op gaussian_random_op feed_op fetch_op
mul_op sum_op squared_l2_distance_op fill_constant_op sgd_op mean_op)
if(WITH_GPU)
nv_test(executor_test SRCS executor_test.cc DEPS executor ${EXECUTOR_TEST_OP})
else()
cc_test(executor_test SRCS executor_test.cc DEPS executor ${EXECUTOR_TEST_OP})
endif()
cc_library(tensor_array SRCS tensor_array.cc DEPS lod_tensor)
cc_test(tensor_array_test SRCS tensor_array_test.cc DEPS tensor_array place)
cc_test(var_type_inference_test SRCS var_type_inference_test.cc DEPS op_registry
proto_desc)
cc_library(selected_rows SRCS selected_rows.cc DEPS tensor)
cc_test(selected_rows_test SRCS selected_rows_test.cc DEPS selected_rows)
......@@ -120,6 +120,57 @@ class EnumInContainer {
std::unordered_set<T> container_;
};
template <typename T>
struct ExtractAttribute {
explicit ExtractAttribute(const std::string& attr_name)
: attr_name_(attr_name) {}
T* operator()(Attribute& attr) const {
T* attr_value = nullptr;
try {
attr_value = &boost::get<T>(attr);
} catch (boost::bad_get& bad_get) {
PADDLE_THROW("Cannot get attribute %s by type %s, its type is %s",
attr_name_, typeid(T).name(), attr.type().name());
}
return attr_value;
}
const std::string& attr_name_;
};
// special handle bool
// FIXME(yuyang18): Currently we cast bool into int in python binding. It is
// hard to change the logic there. In another way, we should correct handle
// if the user set `some_flag=1`.
//
// FIX ME anytime if there is a better solution.
template <>
struct ExtractAttribute<bool> {
explicit ExtractAttribute(const std::string& attr_name)
: attr_name_(attr_name) {}
bool* operator()(Attribute& attr) const {
if (attr.type() == typeid(int)) { // NOLINT
int val = boost::get<int>(attr);
attr = static_cast<bool>(val);
} else if (attr.type() == typeid(float)) { // NOLINT
float val = boost::get<float>(attr);
attr = static_cast<bool>(val);
}
bool* attr_value = nullptr;
try {
attr_value = &boost::get<bool>(attr);
} catch (boost::bad_get& bad_get) {
PADDLE_THROW("Cannot get attribute %s by type bool, its type is %s",
attr_name_, attr.type().name());
}
return attr_value;
}
const std::string& attr_name_;
};
// check whether a certain attribute fit its limits
// an attribute can have more than one limits
template <typename T>
......@@ -171,9 +222,10 @@ class TypedAttrChecker {
attr_map[attr_name_] = val;
}
Attribute& attr = attr_map.at(attr_name_);
T& attr_value = boost::get<T>(attr);
ExtractAttribute<T> extract_attr(attr_name_);
T* attr_value = extract_attr(attr);
for (const auto& checker : value_checkers_) {
checker(attr_value);
checker(*attr_value);
}
}
......
......@@ -433,7 +433,7 @@ ParamGradInfoMap AppendBackward(
new OpDescBind("fill_constant", {}, {{"Out", {fill_one_op_out}}},
{{"shape", std::vector<int>{1}},
{"value", static_cast<float>(1.0)},
{"dataType", framework::DataType::FP32}}));
{"data_type", framework::DataType::FP32}}));
all_ops.push_back(std::move(fill_one_op));
size_t forward_op_num = all_ops.size();
size_t forward_block_num = program_desc.Size();
......
......@@ -18,6 +18,7 @@
#include "paddle/framework/op_info.h"
#include "paddle/framework/op_proto_maker.h"
#include "paddle/framework/operator.h"
#include "paddle/framework/var_type_inference.h"
namespace paddle {
namespace framework {
......@@ -26,7 +27,8 @@ namespace details {
enum OpInfoFillType {
kOperator = 0,
kOpProtoAndCheckerMaker = 1,
kGradOpDescMaker = 2
kGradOpDescMaker = 2,
kVarTypeInference = 3
};
template <typename T>
......@@ -38,7 +40,9 @@ struct OpInfoFillTypeID {
? kOpProtoAndCheckerMaker
: (std::is_base_of<GradOpDescMakerBase, T>::value
? kGradOpDescMaker
: static_cast<OpInfoFillType>(-1)));
: (std::is_base_of<VarTypeInference, T>::value
? kVarTypeInference
: static_cast<OpInfoFillType>(-1))));
}
};
......@@ -106,6 +110,17 @@ struct OpInfoFiller<T, kGradOpDescMaker> {
};
}
};
template <typename T>
struct OpInfoFiller<T, kVarTypeInference> {
void operator()(const char* op_type, OpInfo* info) const {
info->infer_var_type_ = [](const OpDescBind& fwd_op, BlockDescBind* block) {
T inference;
inference(fwd_op, block);
};
}
};
} // namespace details
} // namespace framework
......
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/framework/executor.h"
#include <memory>
#include <vector>
#include "gflags/gflags.h"
#include "gtest/gtest.h"
#include "paddle/framework/attribute.h"
#include "paddle/framework/backward.h"
#include "paddle/framework/block_desc.h"
#include "paddle/framework/op_desc.h"
#include "paddle/framework/op_registry.h"
#include "paddle/framework/operator.h"
USE_OP(elementwise_add);
USE_OP(gaussian_random);
USE_OP(feed);
USE_OP(fetch);
USE_OP(mul);
USE_OP(sum);
USE_OP(squared_l2_distance);
USE_OP(fill_constant);
USE_OP(mean);
USE_OP(sgd);
using namespace paddle::platform;
using namespace paddle::framework;
void AddOp(const std::string& type, const VariableNameMap& inputs,
const VariableNameMap& outputs, AttributeMap attrs,
paddle::framework::BlockDescBind* block) {
// insert output
for (auto kv : outputs) {
for (auto v : kv.second) {
// <<<<<<< HEAD
// auto var = block->Var(v);
// var->SetType(VarDesc::LOD_TENSOR);
// var->SetDataType(paddle::framework::DataType::FP32);
// =======
if (!block->HasVar(v)) {
auto var = block->Var(v);
var->SetDataType(paddle::framework::DataType::FP32);
}
// >>>>>>> origin/develop
}
}
// insert op
auto op = block->AppendOp();
op->SetType(type);
for (auto& kv : inputs) {
op->SetInput(kv.first, kv.second);
}
for (auto& kv : outputs) {
op->SetOutput(kv.first, kv.second);
}
op->SetAttrMap(attrs);
op->CheckAttrs();
}
// Tensors in feed value variable will only be in CPUPlace
// So we can memcpy the data from vector<T> to feed_value
template <typename T>
void SetFeedVariable(const std::vector<std::vector<T>>& inputs,
const std::vector<std::vector<int64_t>>& dims) {
Variable* g_feed_value = GetGlobalScope().FindVar("feed_value");
auto& feed_inputs =
*(g_feed_value->GetMutable<std::vector<paddle::framework::Tensor>>());
size_t size = inputs.size();
feed_inputs.resize(size);
for (size_t i = 0; i < size; i++) {
T* dst = feed_inputs[i].mutable_data<T>(make_ddim(dims[i]), CPUPlace());
memcpy(dst, inputs[i].data(), inputs[i].size() * sizeof(T));
}
}
// Tensors in fetch value variable will only be in CPUPlace
// So we can memcpy the data from fetch_value to vector<T>
template <typename T>
std::vector<std::vector<T>> GetFetchVariable() {
Variable* g_fetch_value = GetGlobalScope().FindVar("fetch_value");
auto& fetch_outputs =
*(g_fetch_value->GetMutable<std::vector<paddle::framework::Tensor>>());
size_t size = fetch_outputs.size();
std::vector<std::vector<T>> result;
result.reserve(size);
for (size_t i = 0; i < size; i++) {
std::vector<T> tmp;
tmp.resize(fetch_outputs[i].numel());
memcpy(tmp.data(), fetch_outputs[i].data<T>(),
fetch_outputs[i].numel() * sizeof(T));
result.push_back(tmp);
}
return result;
}
class ExecutorTesterRandom : public ::testing::Test {
public:
virtual void SetUp() override {
int input_dim = 3, batch_size = 2, embed_dim = 5;
auto temp_init_root_block = init_pdesc_.add_blocks();
temp_init_root_block->set_idx(0);
temp_init_root_block->set_parent_idx(-1);
paddle::framework::ProgramDescBind& init_program =
paddle::framework::ProgramDescBind::Instance(&init_pdesc_);
paddle::framework::BlockDescBind* init_root_block = init_program.Block(0);
AddOp("gaussian_random", {}, {{"Out", {"w1"}}},
{{"dims", std::vector<int>{input_dim, embed_dim}}}, init_root_block);
AddOp("gaussian_random", {}, {{"Out", {"w2"}}},
{{"dims", std::vector<int>{embed_dim, input_dim}}}, init_root_block);
AddOp("fetch", {{"Input", {"w1"}}}, {}, {{"col", 0}}, init_root_block);
AddOp("fetch", {{"Input", {"w2"}}}, {}, {{"col", 1}}, init_root_block);
// flush
init_program.Proto();
// run block
auto temp_root_block = pdesc_.add_blocks();
temp_root_block->set_idx(0);
temp_root_block->set_parent_idx(-1);
paddle::framework::ProgramDescBind& program =
paddle::framework::ProgramDescBind::Instance(&pdesc_);
paddle::framework::BlockDescBind* root_block = program.Block(0);
// feed data
inputs_.push_back({1.0, 1.0, 1.0, 1.0, 1.0, 1.0});
dims_.push_back({batch_size, input_dim});
AddOp("feed", {}, {{"Out", {"a"}}},
{{"dims", std::vector<int>{batch_size, input_dim}}, {"col", 0}},
root_block);
// forward
AddOp("mul", {{"X", {"a"}}, {"Y", {"w1"}}}, {{"Out", {"b"}}}, {},
root_block);
AddOp("mul", {{"X", {"b"}}, {"Y", {"w2"}}}, {{"Out", {"a_out"}}}, {},
root_block);
AddOp("squared_l2_distance", {{"X", {"a"}}, {"Y", {"a_out"}}},
{{"Out", {"l2_distance"}}, {"sub_result", {"l2_distance_sub"}}}, {},
root_block);
AddOp("mean", {{"X", {"l2_distance"}}}, {{"Out", {"mean_out"}}}, {},
root_block);
// backward
auto target = VarDescBind("mean_out");
AppendBackward(program, target, {});
// update
AddOp("fill_constant", {}, {{"Out", {"learning_rate"}}},
{{"shape", std::vector<int>{1}}, {"value", float(0.001)}},
root_block);
AddOp("sgd", {{"Param", {"w1"}},
{"LearningRate", {"learning_rate"}},
{"Grad", {"w1@GRAD"}}},
{{"ParamOut", {"w1"}}}, {}, root_block);
AddOp("sgd", {{"Param", {"w2"}},
{"LearningRate", {"learning_rate"}},
{"Grad", {"w2@GRAD"}}},
{{"ParamOut", {"w2"}}}, {}, root_block);
AddOp("fetch", {{"Input", {"w1"}}}, {}, {{"col", 0}}, root_block);
AddOp("fetch", {{"Input", {"w2"}}}, {}, {{"col", 1}}, root_block);
AddOp("fetch", {{"Input", {"l2_distance"}}}, {}, {{"col", 0}}, root_block);
// flush
program.Proto();
}
protected:
ProgramDesc init_pdesc_;
ProgramDesc pdesc_;
std::vector<std::vector<float>> inputs_;
std::vector<std::vector<int64_t>> dims_;
};
class ExecutorTesterFeedAndFetch : public ::testing::Test {
public:
virtual void SetUp() override {
auto temp_root_block = pdesc_.add_blocks();
temp_root_block->set_idx(0);
temp_root_block->set_parent_idx(-1);
// wrap to BlockDescBind
paddle::framework::ProgramDescBind& program =
paddle::framework::ProgramDescBind::Instance(&pdesc_);
paddle::framework::BlockDescBind* root_block = program.Block(0);
std::vector<int> dim{6};
AddOp("feed", {}, {{"Out", {"a"}}}, {{"dims", dim}, {"col", 0}},
root_block);
AddOp("feed", {}, {{"Out", {"b"}}}, {{"dims", dim}, {"col", 1}},
root_block);
AddOp("fetch", {{"Input", {"a"}}}, {}, {{"col", 0}}, root_block);
AddOp("fetch", {{"Input", {"b"}}}, {}, {{"col", 1}}, root_block);
// flush
program.Proto();
std::vector<float> vec1 = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
std::vector<float> vec2 = {4.0, 5.0, 6.0, 7.0, 8.0, 9.0};
inputs_.push_back(vec1);
inputs_.push_back(vec2);
dims_.push_back({static_cast<int64_t>(vec1.size())});
dims_.push_back({static_cast<int64_t>(vec2.size())});
}
protected:
ProgramDesc pdesc_;
std::vector<std::vector<float>> inputs_;
std::vector<std::vector<int64_t>> dims_;
};
#ifndef PADDLE_WITH_CUDA
TEST_F(ExecutorTesterRandom, CPU) {
std::vector<Place> places;
CPUPlace cpu_place;
places.push_back(cpu_place);
// We have a global Scope and BuddyAllocator, and we must ensure
// global BuddyAllocator is initialized before global Scope. Thus,
// global Scope will deconstruct before BuddyAllocator. Otherwise,
// "pointer being freed was not allocated" error will appear.
paddle::memory::Used(cpu_place);
std::unique_ptr<Executor> executor(new Executor(places));
executor->Run(init_pdesc_, &GetGlobalScope(), 0);
SetFeedVariable<float>(inputs_, dims_);
executor->Run(pdesc_, &GetGlobalScope(), 0);
std::vector<std::vector<float>> result = GetFetchVariable<float>();
}
TEST_F(ExecutorTesterFeedAndFetch, CPU) {
std::vector<Place> places;
CPUPlace cpu_place;
places.push_back(cpu_place);
// We have a global Scope and BuddyAllocator, and we must ensure
// global BuddyAllocator is initialized before global Scope. Thus,
// global Scope will deconstruct before BuddyAllocator. Otherwise,
// "pointer being freed was not allocated" error will appear.
paddle::memory::Used(cpu_place);
std::unique_ptr<Executor> executor(new Executor(places));
for (int batch_id = 0; batch_id < 3; batch_id++) {
SetFeedVariable<float>(inputs_, dims_);
executor->Run(pdesc_, &GetGlobalScope(), 0);
std::vector<std::vector<float>> result = GetFetchVariable<float>();
PADDLE_ENFORCE_EQ(result.size(), inputs_.size());
for (size_t i = 0; i < result.size(); ++i) {
PADDLE_ENFORCE_EQ(result[i].size(), inputs_[i].size());
for (size_t j = 0; j < result[i].size(); ++j) {
PADDLE_ENFORCE_EQ(result[i][j], inputs_[i][j]);
}
}
}
}
#else
TEST_F(ExecutorTesterRandom, GPU) {
std::vector<Place> places;
GPUPlace gpu_place(0);
places.push_back(gpu_place);
// We have a global Scope and BuddyAllocator, and we must ensure
// global BuddyAllocator is initialized before global Scope. Thus,
// global Scope will deconstruct before BuddyAllocator. Otherwise,
// "pointer being freed was not allocated" error will appear.
// If paddle is compiled with GPU, both CPU and GPU BuddyAllocator
// need to be used at first.
paddle::memory::Used(CPUPlace());
paddle::memory::Used(gpu_place);
std::unique_ptr<Executor> executor(new Executor(places));
executor->Run(init_pdesc_, &GetGlobalScope(), 0);
for (int batch_id = 0; batch_id < 3; batch_id++) {
SetFeedVariable<float>(inputs_, dims_);
executor->Run(pdesc_, &GetGlobalScope(), 0);
}
}
TEST_F(ExecutorTesterFeedAndFetch, GPU) {
std::vector<Place> places;
GPUPlace gpu_place(0);
places.push_back(gpu_place);
// We have a global Scope and BuddyAllocator, and we must ensure
// global BuddyAllocator is initialized before global Scope. Thus,
// global Scope will deconstruct before BuddyAllocator. Otherwise,
// "pointer being freed was not allocated" error will appear.
// If paddle is compiled with GPU, both CPU and GPU BuddyAllocator
// need to be used at first.
paddle::memory::Used(CPUPlace());
paddle::memory::Used(gpu_place);
std::unique_ptr<Executor> executor(new Executor(places));
for (int batch_id = 0; batch_id < 3; batch_id++) {
SetFeedVariable<float>(inputs_, dims_);
executor->Run(pdesc_, &GetGlobalScope(), 0);
std::vector<std::vector<float>> result = GetFetchVariable<float>();
PADDLE_ENFORCE_EQ(result.size(), inputs_.size());
for (size_t i = 0; i < result.size(); ++i) {
PADDLE_ENFORCE_EQ(result[i].size(), inputs_[i].size());
for (size_t j = 0; j < result[i].size(); ++j) {
PADDLE_ENFORCE_EQ(result[i][j], inputs_[i][j]);
}
}
}
}
DECLARE_double(fraction_of_gpu_memory_to_use);
int main(int argc, char** argv) {
testing::InitGoogleTest(&argc, argv);
// Use less GPU memory for unittest.
FLAGS_fraction_of_gpu_memory_to_use = 0.25;
return RUN_ALL_TESTS();
}
#endif
......@@ -13,33 +13,38 @@ See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "paddle/framework/scope.h"
#include "paddle/framework/variable.h"
namespace paddle {
namespace operators {
namespace framework {
template <typename T>
class FetchKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
const framework::Tensor* input = ctx.Input<framework::Tensor>("Input");
framework::Variable* g_fetch_variable =
framework::GetGlobalScope().FindVar("fetch_value");
auto* tensors =
g_fetch_variable->GetMutable<std::vector<framework::Tensor>>();
int col = ctx.template Attr<int>("col");
if (tensors->size() < static_cast<size_t>(col + 1)) {
tensors->resize(col + 1);
}
PADDLE_ENFORCE_GT(tensors->size(), static_cast<size_t>(col));
(*tensors)[col].Resize(input->dims());
(*tensors)[col].mutable_data<T>(platform::CPUPlace());
(*tensors)[col].CopyFrom<T>(*input, platform::CPUPlace(),
ctx.device_context());
// TODO(qijun): need to handle LodTensor later
void SetFeedVariable(const LoDTensor& input, const std::string& var_name,
size_t index) {
// If var_name Variable is not found in GlobalScope, a new variable will
// be created.
Variable* g_feed_value = GetGlobalScope().Var(var_name);
auto& feed_inputs =
*(g_feed_value->GetMutable<std::vector<paddle::framework::LoDTensor>>());
if (index >= feed_inputs.size()) {
feed_inputs.resize(index + 1);
}
};
// shared data with input tensor
feed_inputs[index].ShareDataWith<T>(input);
// set lod
feed_inputs[index].set_lod(input.lod());
}
} // namespace operators
LoDTensor& GetFetchVariable(const std::string& var_name, size_t index) {
// Since we want to fetch a LoDTensor from a variable, the variable must
// already have been created.
Variable* g_fetch_value = GetGlobalScope().FindVar(var_name);
auto& fetch_outputs =
*(g_fetch_value->GetMutable<std::vector<paddle::framework::LoDTensor>>());
PADDLE_ENFORCE_LT(index, fetch_outputs.size());
return fetch_outputs[index];
}
} // namespace framework
} // namespace paddle
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include <vector>
#include "paddle/framework/lod_tensor.h"
namespace paddle {
namespace framework {
using FeedFetchType = LoDTensor;
using FeedFetchList = std::vector<FeedFetchType>;
} // namespace framework
} // namespace paddle
......@@ -149,5 +149,13 @@ class DefaultGradOpDescMaker : public SingleGradOpDescMaker {
}
};
class EmptyGradOpMaker : public GradOpDescMakerBase {
public:
using GradOpDescMakerBase::GradOpDescMakerBase;
std::vector<std::unique_ptr<OpDescBind>> operator()() const override {
return {};
}
};
} // namespace framework
} // namespace paddle
......@@ -220,9 +220,12 @@ static InferShapeFuncMap &InferShapeFuncs() {
void OpDescBind::CheckAttrs() {
PADDLE_ENFORCE(!Type().empty(),
"CheckAttr() can not be called before type is setted.");
const auto *checker = OpInfoMap::Instance().Get(Type()).Checker();
PADDLE_ENFORCE_NOT_NULL(checker, "Operator \"%s\" has no registered checker.",
Type());
auto *checker = OpInfoMap::Instance().Get(Type()).Checker();
if (checker == nullptr) {
// checker is not configured. That operator could be generated by Paddle,
// not by users.
return;
}
checker->Check(attrs_);
}
......@@ -236,5 +239,19 @@ void OpDescBind::InferShape(const BlockDescBind &block) const {
it->second(&ctx);
}
void OpDescBind::InferVarType(BlockDescBind *block) const {
auto &info = OpInfoMap::Instance().Get(this->Type());
if (info.infer_var_type_) {
info.infer_var_type_(*this, block);
} else {
// all output type is LoDTensor by default
for (auto &out_pair : this->outputs_) {
for (auto &out_var_name : out_pair.second) {
block->Var(out_var_name)->SetType(VarDesc::LOD_TENSOR);
}
}
}
}
} // namespace framework
} // namespace paddle
......@@ -102,6 +102,8 @@ class OpDescBind {
void InferShape(const BlockDescBind &block) const;
void InferVarType(BlockDescBind *block) const;
void Flush();
private:
......
......@@ -19,7 +19,6 @@
#include <unordered_map>
#include "paddle/framework/attribute.h"
#include "paddle/framework/op_desc.h"
#include "paddle/framework/type_defs.h"
#include "paddle/platform/macros.h"
......@@ -31,6 +30,7 @@ struct OpInfo {
GradOpMakerFN grad_op_maker_;
OpProto* proto_{nullptr};
OpAttrChecker* checker_{nullptr};
InferVarTypeFN infer_var_type_;
bool HasOpProtoAndChecker() const {
return proto_ != nullptr && checker_ != nullptr;
......
......@@ -82,16 +82,6 @@ class OpRegistry {
static std::unique_ptr<OperatorBase> CreateOp(const OpDescBind& op_desc);
};
template <typename OpType, typename ProtoMakerType, typename GradOpType>
class OpRegistrar : public Registrar {
public:
explicit OpRegistrar(const char* op_type) { OpRegistrar(op_type, ""); }
OpRegistrar(const char* op_type, const char* grad_op_type) {
OpRegistry::RegisterOp<OpType, ProtoMakerType, GradOpType>(op_type,
grad_op_type);
}
};
template <typename PlaceType, bool at_end, size_t I, typename... KernelType>
struct OpKernelRegistrarFunctor;
......
......@@ -10,6 +10,7 @@ See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/framework/lod_tensor.h"
#include "paddle/framework/tensor.h"
namespace paddle {
......@@ -34,9 +35,9 @@ class SelectedRows {
void set_height(int64_t height) { height_ = height; }
const std::vector<int64_t>& rows() const { return rows_; }
const Vector<int64_t>& rows() const { return rows_; }
void set_rows(const std::vector<int64_t>& rows) { rows_ = rows; }
void set_rows(const Vector<int64_t>& rows) { rows_ = rows; }
DDim GetCompleteDims() const {
std::vector<int64_t> dims = vectorize(value_->dims());
......@@ -45,7 +46,10 @@ class SelectedRows {
}
private:
std::vector<int64_t> rows_;
// Notice: rows can contain duplicates, e.g. {0, 4, 7, 0, 5, 7, 9}.
// SelectedRows are simply concatenated when added together. Duplicate rows
// are only handled when a SelectedRows is added to a Tensor.
Vector<int64_t> rows_;
std::unique_ptr<Tensor> value_{nullptr};
int64_t height_;
};
......
......@@ -100,6 +100,22 @@ class Tensor {
inline void CopyFrom(const Tensor& src, const platform::Place& dst_place,
const platform::DeviceContext& ctx);
// FIXME(yuyang18): CopyFrom should not be templated on T. Replace
// `CopyFrom` with `CopyFromTensor`.
inline void CopyFromTensor(const Tensor& src,
const platform::Place& dst_place,
const platform::DeviceContext& ctx) {
// NOLINTNEXTLINES_8 cpplint.py will recognize below lines as functions.
// That is a bug of cpplint.py. Just ignore lint these lines.
if (src.type() == std::type_index(typeid(double))) {
CopyFrom<double>(src, dst_place, ctx);
} else if (src.type() == std::type_index(typeid(float))) {
CopyFrom<float>(src, dst_place, ctx);
} else if (src.type() == std::type_index(typeid(int))) {
CopyFrom<int>(src, dst_place, ctx);
}
}
/**
* @brief Copy the content of an external vector to a tensor.
*
......
......@@ -16,12 +16,18 @@
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
#include "paddle/platform/variant.h"
namespace paddle {
namespace framework {
class OperatorBase;
class OpDescBind;
class BlockDescBind;
class BlockDesc;
using VariableNameMap = std::map<std::string, std::vector<std::string>>;
// The order should be the same as in framework.proto
......@@ -40,5 +46,8 @@ using GradOpMakerFN = std::function<std::vector<std::unique_ptr<OpDescBind>>(
const OpDescBind&, const std::unordered_set<std::string>& /*no_grad_set*/,
std::unordered_map<std::string, std::string>* /*grad_to_var*/)>;
using InferVarTypeFN = std::function<void(const OpDescBind& /*op_desc*/,
BlockDescBind* /*block*/)>;
} // namespace framework
} // namespace paddle
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/framework/type_defs.h"
namespace paddle {
namespace framework {
class VarTypeInference {
public:
virtual ~VarTypeInference() {}
virtual void operator()(const OpDescBind& op_desc,
BlockDescBind* block) const = 0;
};
} // namespace framework
} // namespace paddle
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/framework/var_type_inference.h"
#include "gtest/gtest.h"
#include "paddle/framework/op_registry.h"
#include "paddle/framework/operator.h"
#include "paddle/framework/program_desc.h"
namespace paddle {
namespace framework {
class SumOpMaker : public OpProtoAndCheckerMaker {
public:
SumOpMaker(OpProto *proto, OpAttrChecker *op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X", "").AsDuplicable();
AddOutput("Out", "");
AddComment("");
}
};
class SumOpVarTypeInference : public VarTypeInference {
public:
void operator()(const OpDescBind &op_desc,
BlockDescBind *block) const override {
auto &inputs = op_desc.Input("X");
auto default_var_type = VarDesc::SELECTED_ROWS;
bool any_input_is_lod_tensor = std::any_of(
inputs.begin(), inputs.end(), [block](const std::string &name) {
return block->Var(name)->GetType() == VarDesc::LOD_TENSOR;
});
if (any_input_is_lod_tensor) {
default_var_type = VarDesc::LOD_TENSOR;
}
auto out_var_name = op_desc.Output("Out").front();
block->Var(out_var_name)->SetType(default_var_type);
}
};
} // namespace framework
} // namespace paddle
REGISTER_OPERATOR(sum, paddle::framework::NOP, paddle::framework::SumOpMaker,
paddle::framework::SumOpVarTypeInference);
REGISTER_OPERATOR(sum_without_infer_var_type, paddle::framework::NOP,
paddle::framework::SumOpMaker);
namespace paddle {
namespace framework {
TEST(InferVarType, sum_op) {
auto &prog = ProgramDescBind::Instance(&GetProgramDesc());
auto *op = prog.Block(0)->AppendOp();
op->SetType("sum");
op->SetInput("X", {"test_a", "test_b", "test_c"});
op->SetOutput("Out", {"test_out"});
prog.Block(0)->Var("test_a")->SetType(VarDesc::SELECTED_ROWS);
prog.Block(0)->Var("test_b")->SetType(VarDesc::SELECTED_ROWS);
prog.Block(0)->Var("test_c")->SetType(VarDesc::SELECTED_ROWS);
prog.Block(0)->Var("test_out");
op->InferVarType(prog.Block(0));
ASSERT_EQ(VarDesc::SELECTED_ROWS, prog.Block(0)->Var("test_out")->GetType());
prog.Block(0)->Var("test_b")->SetType(VarDesc::LOD_TENSOR);
op->InferVarType(prog.Block(0));
ASSERT_EQ(VarDesc::LOD_TENSOR, prog.Block(0)->Var("test_out")->GetType());
}
TEST(InferVarType, sum_op_without_infer_var_type) {
auto &prog = ProgramDescBind::Instance(&GetProgramDesc());
auto *op = prog.Block(0)->AppendOp();
op->SetType("sum_without_infer_var_type");
op->SetInput("X", {"test2_a", "test2_b", "test2_c"});
op->SetOutput("Out", {"test2_out"});
prog.Block(0)->Var("test2_a")->SetType(VarDesc::SELECTED_ROWS);
prog.Block(0)->Var("test2_b")->SetType(VarDesc::SELECTED_ROWS);
prog.Block(0)->Var("test2_c")->SetType(VarDesc::SELECTED_ROWS);
prog.Block(0)->Var("test2_out");
op->InferVarType(prog.Block(0));
ASSERT_EQ(VarDesc_VarType_LOD_TENSOR,
prog.Block(0)->Var("test2_out")->GetType());
}
} // namespace framework
} // namespace paddle
\ No newline at end of file
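The rule exercised by the `sum_op` test is: the output is a `LOD_TENSOR` if any input is one, otherwise `SELECTED_ROWS`. Below is a framework-free sketch of that rule, assuming a plain map from variable name to type in place of `BlockDescBind`. `VarType`, `VarTypes` and `InferSumOutputType` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class VarType { LOD_TENSOR, SELECTED_ROWS };

// Hypothetical stand-in for a block: variable name -> variable type.
using VarTypes = std::map<std::string, VarType>;

// The output is dense as soon as one input is dense. It stays sparse only
// when every input is sparse.
VarType InferSumOutputType(const std::vector<std::string>& inputs,
                           const VarTypes& vars) {
  assert(!inputs.empty());
  const bool any_dense = std::any_of(
      inputs.begin(), inputs.end(), [&vars](const std::string& name) {
        return vars.at(name) == VarType::LOD_TENSOR;
      });
  return any_dense ? VarType::LOD_TENSOR : VarType::SELECTED_ROWS;
}
```

Note that the sketch includes `<algorithm>` explicitly because it calls `std::any_of`.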
......@@ -21,7 +21,6 @@ class AccuracyOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Inference"),
"Input(Inference) of AccuracyOp should not be null.");
......
......@@ -21,7 +21,6 @@ class ActivationOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
ctx->SetOutputDim("Y", ctx->GetInputDim("X"));
ctx->ShareLoD("X", /*->*/ "Y");
......@@ -32,7 +31,6 @@ class ActivationOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
ctx->SetOutputDim(framework::GradVarName("X"), ctx->GetInputDim("Y"));
}
......
......@@ -21,7 +21,6 @@ class AdadeltaOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of AdadeltaOp should not be null.");
......
......@@ -21,7 +21,6 @@ class AdagradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of AdagradOp should not be null.");
......
......@@ -21,7 +21,6 @@ class AdamOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of AdamOp should not be null.");
......
......@@ -21,7 +21,6 @@ class AdamaxOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of AdamaxOp should not be null.");
......
......@@ -21,7 +21,6 @@ class ClipOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of ClipOp should not be null.");
......@@ -60,7 +59,6 @@ class ClipOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -23,7 +23,6 @@ class ConcatOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE_GE(ctx->Inputs("X").size(), 1UL,
"Inputs(X) of ConcatOp should not be empty.")
......@@ -82,7 +81,6 @@ class ConcatOpGrad : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
ctx->SetOutputsDim(framework::GradVarName("X"), ctx->GetInputsDim("X"));
}
......
......@@ -44,7 +44,6 @@ class Conv2DOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override;
};
......@@ -52,7 +51,6 @@ class Conv2DOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override;
};
......
......@@ -27,7 +27,6 @@ class ConvShiftOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Input(Y) should be not null.");
......@@ -54,7 +53,6 @@ class ConvShiftGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Input(Y) should be not null.");
......
......@@ -23,7 +23,6 @@ class CosSimOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
// notnull check
PADDLE_ENFORCE(ctx->HasInput("X"),
......@@ -97,7 +96,6 @@ class CosSimOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
// notnull check
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) must not be null.");
......
......@@ -24,7 +24,6 @@ class CropOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of CropOp should not be null.");
......@@ -114,7 +113,6 @@ class CropOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -21,7 +21,6 @@ class CrossEntropyOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) should be not null.");
......@@ -34,13 +33,13 @@ class CrossEntropyOp : public framework::OperatorWithKernel {
PADDLE_ENFORCE_EQ(x_dims[0], label_dims[0],
"The 1st dimension of Input(X) and Input(Label) should "
"be equal.");
if (ctx->Attrs().Get<bool>("softLabel")) {
if (ctx->Attrs().Get<bool>("soft_label")) {
PADDLE_ENFORCE_EQ(x_dims[1], label_dims[1],
"If Attr(softLabel) == true, the 2nd dimension of "
"If Attr(soft_label) == true, the 2nd dimension of "
"Input(X) and Input(Label) should be equal.");
} else {
PADDLE_ENFORCE_EQ(label_dims[1], 1,
"If Attr(softLabel) == false, the 2nd dimension of "
"If Attr(soft_label) == false, the 2nd dimension of "
"Input(Label) should be 1.");
}
......@@ -48,6 +47,7 @@ class CrossEntropyOp : public framework::OperatorWithKernel {
ctx->ShareLoD("X", /*->*/ "Y");
}
protected:
// CrossEntropy's data type is determined solely by "X"
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
......@@ -59,7 +59,6 @@ class CrossEntropyGradientOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) should be not null.");
......@@ -82,18 +81,19 @@ class CrossEntropyGradientOp : public framework::OperatorWithKernel {
"be equal.");
PADDLE_ENFORCE_EQ(dy_dims[1], 1,
"The 2nd dimension of Input(Y@Grad) should be 1.");
if (ctx->Attrs().Get<bool>("softLabel")) {
if (ctx->Attrs().Get<bool>("soft_label")) {
PADDLE_ENFORCE_EQ(x_dims[1], label_dims[1],
"When Attr(softLabel) == true, the 2nd dimension of "
"When Attr(soft_label) == true, the 2nd dimension of "
"Input(X) and Input(Label) should be equal.");
} else {
PADDLE_ENFORCE_EQ(label_dims[1], 1,
"When Attr(softLabel) == false, the 2nd dimension of "
"When Attr(soft_label) == false, the 2nd dimension of "
"Input(Label) should be 1.");
}
ctx->SetOutputDim(framework::GradVarName("X"), x_dims);
}
protected:
// CrossEntropy's data type is determined solely by "X"
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
......@@ -115,15 +115,15 @@ class CrossEntropyOpMaker : public framework::OpProtoAndCheckerMaker {
"Label",
"(Tensor, default Tensor<int>), the ground truth which is "
"a 2-D tensor. "
"When softLabel is set to false, `Label` is a Tensor<int> with shape "
"When soft_label is set to false, `Label` is a Tensor<int> with shape "
"[N x 1]. "
"When softLabel is set to true, `Label` is a Tensor<float/double> "
"When soft_label is set to true, `Label` is a Tensor<float/double> "
"with shape [N x K].");
AddOutput("Y",
"(Tensor, default Tensor<float>), a 2-D tensor "
"with shape [N x 1]. The cross entropy loss.");
AddAttr<bool>(
"softLabel",
"soft_label",
"(bool, default false), a flag to indicate whether to interpret "
"the given labels as soft labels.")
.SetDefault(false);
......@@ -133,12 +133,12 @@ CrossEntropy Operator.
It supports both standard cross-entropy and soft-label cross-entropy loss
computation.
1) One-hot cross-entropy:
softLabel = false, Label[i, 0] indicates the class index for sample i:
soft_label = false, Label[i, 0] indicates the class index for sample i:
Y[i] = -log(X[i, Label[i]])
2) Soft-label cross-entropy:
softLabel = true, Label[i, j] indicates the soft label of class j
soft_label = true, Label[i, j] indicates the soft label of class j
for sample i:
Y[i] = \sum_j{-Label[i, j] * log(X[i, j])}
......
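The two formulas in the operator comment can be checked on a single row. This is a minimal per-sample sketch, assuming `X[i, :]` holds strictly positive probabilities. The function names are hypothetical, and the real kernels operate on whole tensors rather than one row at a time.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// soft_label = false: Y[i] = -log(X[i, Label[i]])
double HardLabelCrossEntropy(const std::vector<double>& x_row, int label) {
  assert(label >= 0 && static_cast<std::size_t>(label) < x_row.size());
  return -std::log(x_row[static_cast<std::size_t>(label)]);
}

// soft_label = true: Y[i] = \sum_j{-Label[i, j] * log(X[i, j])}
double SoftLabelCrossEntropy(const std::vector<double>& x_row,
                             const std::vector<double>& label_row) {
  assert(x_row.size() == label_row.size());
  double loss = 0.0;
  for (std::size_t j = 0; j < x_row.size(); ++j) {
    loss -= label_row[j] * std::log(x_row[j]);
  }
  return loss;
}
```

With a one-hot `label_row`, the soft-label form reduces to the hard-label form, which is a quick consistency check between the two modes.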
......@@ -56,7 +56,7 @@ class CrossEntropyOpCUDAKernel : public framework::OpKernel<T> {
y->mutable_data<T>(ctx.GetPlace());
math::CrossEntropyFunctor<platform::GPUPlace, T>()(
ctx.device_context(), y, x, label, ctx.Attr<bool>("softLabel"));
ctx.device_context(), y, x, label, ctx.Attr<bool>("soft_label"));
}
};
......@@ -83,7 +83,7 @@ class CrossEntropyGradientOpCUDAKernel : public framework::OpKernel<T> {
int block = 512;
int grid = (batch_size * class_num + block - 1) / block;
if (ctx.Attr<bool>("softLabel")) {
if (ctx.Attr<bool>("soft_label")) {
auto* label_data = label->data<T>();
SoftCrossEntropyGradientKernel<T><<<
grid, block, 0, reinterpret_cast<const platform::CUDADeviceContext&>(
......@@ -91,7 +91,8 @@ class CrossEntropyGradientOpCUDAKernel : public framework::OpKernel<T> {
.stream()>>>(dx_data, dy_data, x_data, label_data,
batch_size, class_num);
} else {
math::SetConstant<platform::GPUPlace, T>(ctx.device_context(), dx, 0);
math::SetConstant<platform::GPUPlace, T> functor;
functor(ctx.device_context(), dx, 0);
auto* label_data = label->data<int>();
grid = (batch_size + block - 1) / block;
CrossEntropyGradientKernel<T><<<
......
......@@ -38,7 +38,7 @@ class CrossEntropyOpKernel : public framework::OpKernel<T> {
y->mutable_data<T>(ctx.GetPlace());
math::CrossEntropyFunctor<platform::CPUPlace, T>()(
ctx.device_context(), y, x, labels, ctx.Attr<bool>("softLabel"));
ctx.device_context(), y, x, labels, ctx.Attr<bool>("soft_label"));
}
};
......@@ -55,7 +55,7 @@ class CrossEntropyGradientOpKernel : public framework::OpKernel<T> {
T* dx_data = dx->mutable_data<T>(ctx.GetPlace());
int class_num = x->dims()[1];
if (ctx.Attr<bool>("softLabel")) {
if (ctx.Attr<bool>("soft_label")) {
auto x_mat = EigenMatrix<T>::From(*x);
auto dy_mat = EigenMatrix<T>::From(*dy);
auto lbl_mat = EigenMatrix<T>::From(*label);
......@@ -70,7 +70,8 @@ class CrossEntropyGradientOpKernel : public framework::OpKernel<T> {
const T* x_data = x->data<T>();
const int* label_data = label->data<int>();
math::SetConstant<platform::CPUPlace, T>(ctx.device_context(), dx, 0);
math::SetConstant<platform::CPUPlace, T> functor;
functor(ctx.device_context(), dx, 0);
for (int i = 0; i < batch_size; ++i) {
PADDLE_ASSERT(label_data[i] >= 0 && label_data[i] < class_num);
......
......@@ -21,7 +21,6 @@ class DecayedAdagradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of DecayedAdagradOp should not be null.");
......
......@@ -23,7 +23,6 @@ class DropoutOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) must not be null.");
PADDLE_ENFORCE_GE(ctx->Attrs().Get<float>("dropout_prob"), 0);
......@@ -69,7 +68,6 @@ class DropoutOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE_EQ(ctx->Attrs().Get<bool>("is_training"), 1,
"GradOp is only callable when is_training is true");
......
......@@ -23,7 +23,6 @@ class ElementwiseOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
using Tensor = framework::Tensor;
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
......@@ -105,7 +104,6 @@ class ElementwiseOpGrad : public framework::OperatorWithKernel {
using framework::OperatorWithKernel::OperatorWithKernel;
using Tensor = framework::Tensor;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Input(Y) should not be null");
......
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/feed_op.h"
#include "paddle/framework/feed_fetch_type.h"
#include "paddle/framework/op_registry.h"
#include "paddle/framework/operator.h"
namespace paddle {
namespace operators {
class FeedOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasOutput("Out"), "Output should be not null.");
auto& shape = ctx->Attrs().Get<std::vector<int>>("dims");
std::vector<int64_t> shape_int64(shape.size(), 0);
std::transform(shape.begin(), shape.end(), shape_int64.begin(),
[](int a) { return static_cast<int64_t>(a); });
ctx->SetOutputDim("Out", framework::make_ddim(shape_int64));
// TODO(qijun): need to handle LodTensor later
}
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return static_cast<framework::DataType>(Attr<int>("dataType"));
}
};
class FeedOpMaker : public framework::OpProtoAndCheckerMaker {
class FeedOp : public framework::OperatorBase {
public:
FeedOpMaker(framework::OpProto* proto, framework::OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddAttr<int>("dataType", "output data type")
.SetDefault(framework::DataType::FP32);
AddAttr<int>("col", "The col in global feed variable").SetDefault(0);
AddAttr<std::vector<int>>("dims", "The dimension of feed tensor.");
AddOutput("Out", "The output of feed op.");
AddComment(R"DOC(Feed data from global feed variable)DOC");
FeedOp(const std::string &type, const framework::VariableNameMap &inputs,
const framework::VariableNameMap &outputs,
const framework::AttributeMap &attrs)
: OperatorBase(type, inputs, outputs, attrs) {}
void Run(const framework::Scope &scope,
const platform::DeviceContext &dev_ctx) const override {
auto feed_var_name = Input("Input");
auto *feed_var = scope.FindVar(feed_var_name);
PADDLE_ENFORCE(feed_var != nullptr,
"Cannot find feed_var in scope, feed_var_name is %s",
feed_var_name);
auto out_name = this->Output("Out");
auto *out_var = scope.FindVar(out_name);
PADDLE_ENFORCE(out_var != nullptr,
"Cannot find out_var in scope, out_var_name is %s",
out_name);
auto col = Attr<int>("col");
auto &feed_list = feed_var->Get<framework::FeedFetchList>();
auto &feed_item = feed_list.at(static_cast<size_t>(col));
auto *out_item = out_var->GetMutable<framework::FeedFetchType>();
out_item->CopyFromTensor(feed_item, dev_ctx.GetPlace(), dev_ctx);
out_item->set_lod(feed_item.lod());
}
};
} // namespace operators
} // namespace paddle
namespace ops = paddle::operators;
REGISTER_OP_WITHOUT_GRADIENT(feed, ops::FeedOp, ops::FeedOpMaker);
REGISTER_OP_CPU_KERNEL(feed, ops::FeedKernel<float>);
// We do not need to register OpInfoMaker,
// since feed operator will not be used by end users directly
REGISTER_OPERATOR(feed, paddle::operators::FeedOp,
paddle::framework::EmptyGradOpMaker);
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/feed_op.h"
namespace ops = paddle::operators;
REGISTER_OP_GPU_KERNEL(feed, ops::FeedKernel<float>);
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/fetch_op.h"
#include "paddle/framework/feed_fetch_type.h"
#include "paddle/framework/op_registry.h"
namespace paddle {
namespace operators {
class FetchOp : public framework::OperatorWithKernel {
class FetchOp : public framework::OperatorBase {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
FetchOp(const std::string &type, const framework::VariableNameMap &inputs,
const framework::VariableNameMap &outputs,
const framework::AttributeMap &attrs)
: OperatorBase(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Input"), "Input should be not null.");
}
void Run(const framework::Scope &scope,
const platform::DeviceContext &dev_ctx) const override {
auto fetch_var_name = Input("Input");
auto *fetch_var = scope.FindVar(fetch_var_name);
PADDLE_ENFORCE(fetch_var != nullptr,
"Cannot find fetch variable in scope, fetch_var_name is %s",
fetch_var_name);
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return static_cast<framework::DataType>(Attr<int>("dataType"));
}
};
auto out_name = this->Output("Out");
auto *out_var = scope.FindVar(out_name);
PADDLE_ENFORCE(out_var != nullptr,
"Cannot find out_var in scope, out_var_name is %s",
out_name);
class FetchOpMaker : public framework::OpProtoAndCheckerMaker {
public:
FetchOpMaker(framework::OpProto* proto, framework::OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddAttr<int>("dataType", "output data type")
.SetDefault(framework::DataType::FP32);
AddAttr<int>("col", "The col in global fetch variable").SetDefault(0);
AddInput("Input", "The output of fetch op.");
AddComment(R"DOC(Fetch data to global fetch variable)DOC");
auto col = static_cast<size_t>(Attr<int>("col"));
auto *fetch_list = out_var->GetMutable<framework::FeedFetchList>();
auto &src_item = fetch_var->Get<framework::FeedFetchType>();
if (col >= fetch_list->size()) {
fetch_list->resize(col + 1);
}
auto &dst_item = fetch_list->at(col);
// FIXME(yuyang18): Should we assume the fetch operator always generates
// CPU outputs?
dst_item.CopyFromTensor(src_item, platform::CPUPlace(), dev_ctx);
}
};
} // namespace operators
} // namespace paddle
namespace ops = paddle::operators;
REGISTER_OP_WITHOUT_GRADIENT(fetch, ops::FetchOp, ops::FetchOpMaker);
REGISTER_OP_CPU_KERNEL(fetch, ops::FetchKernel<float>);
// We do not need to register OpInfoMaker,
// since fetch operator will not be used by end users directly
REGISTER_OPERATOR(fetch, paddle::operators::FetchOp,
paddle::framework::EmptyGradOpMaker);
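The new `FeedOp::Run` and `FetchOp::Run` both address a shared list by the `col` attribute, but they treat an out-of-range column differently: feed reads with `at()`, while fetch grows the list first. The sketch below shows only that indexing behaviour. `Item`, `ItemList`, `Feed` and `Fetch` are hypothetical stand-ins for `FeedFetchType`, `FeedFetchList` and the two `Run` methods, and device copies are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Item = std::vector<float>;     // hypothetical stand-in for FeedFetchType
using ItemList = std::vector<Item>;  // hypothetical stand-in for FeedFetchList

// Feed: read column `col` from the global feed list. The list is never
// resized, so an out-of-range column makes at() throw std::out_of_range.
Item Feed(const ItemList& feed_list, int col) {
  assert(col >= 0);
  return feed_list.at(static_cast<std::size_t>(col));
}

// Fetch: write into column `col` of the global fetch list, growing the list
// when the column does not exist yet.
void Fetch(const Item& src, int col, ItemList* fetch_list) {
  assert(col >= 0);
  const std::size_t idx = static_cast<std::size_t>(col);
  if (idx >= fetch_list->size()) {
    fetch_list->resize(idx + 1);
  }
  (*fetch_list)[idx] = src;
}
```

Growing on write lets several fetch operators fill the list in any order, with the skipped columns left as empty items until they are written.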
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/fetch_op.h"
namespace ops = paddle::operators;
REGISTER_OP_GPU_KERNEL(fetch, ops::FetchKernel<float>);
......@@ -21,7 +21,6 @@ class FillConstantOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasOutput("Out"),
"Output(Out) of FillConstantOp should not be null.");
......@@ -33,9 +32,10 @@ class FillConstantOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", dims);
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext &ctx) const override {
return static_cast<framework::DataType>(ctx.Attr<int>("dataType"));
return static_cast<framework::DataType>(ctx.Attr<int>("data_type"));
}
};
......@@ -44,7 +44,7 @@ class FillConstantOpMaker : public framework::OpProtoAndCheckerMaker {
FillConstantOpMaker(framework::OpProto *proto,
framework::OpAttrChecker *op_checker)
: framework::OpProtoAndCheckerMaker(proto, op_checker) {
AddAttr<int>("dataType",
AddAttr<int>("data_type",
"(int, default 5 (FP32)) "
"Output data type")
.SetDefault(framework::DataType::FP32);
......
......@@ -21,7 +21,6 @@ class FillZerosLikeOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of FillZerosLikeOp should not be null.");
......
......@@ -22,7 +22,6 @@ class GatherOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of GatherOp should not be null.");
......@@ -40,6 +39,7 @@ class GatherOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", output_dims);
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("X")->type());
......@@ -50,11 +50,11 @@ class GatherGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
ctx->SetOutputDim(framework::GradVarName("X"), ctx->GetInputDim("X"));
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("X")->type());
......
......@@ -42,7 +42,6 @@ class GaussianRandomOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasOutput("Out"),
"Output(Out) of GaussianRandomOp should not be null.");
......@@ -57,6 +56,7 @@ class GaussianRandomOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", framework::make_ddim(temp));
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return static_cast<framework::DataType>(Attr<int>("data_type"));
......
......@@ -23,7 +23,6 @@ class GRUUnitOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(%s) of GRUUnitOp should not be null.", "Input");
......@@ -131,7 +130,6 @@ class GRUUnitGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Input"),
"Input(%s) of GRUUnitGradOp should not be null.", "Input");
......
......@@ -21,7 +21,6 @@ class LookupTableOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("W"),
"Input(W) of LookupTableOp should not be null.");
......@@ -37,6 +36,7 @@ class LookupTableOp : public framework::OperatorWithKernel {
ctx->ShareLoD("Ids", /*->*/ "Out");
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("W")->type());
......@@ -69,12 +69,12 @@ class LookupTableOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
auto table_dims = ctx->GetInputDim("W");
ctx->SetOutputDim(framework::GradVarName("W"), table_dims);
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("W")->type());
......
......@@ -21,7 +21,6 @@ class LstmUnitOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) of LSTM should not be null.");
PADDLE_ENFORCE(ctx->HasInput("C_prev"),
......@@ -76,7 +75,6 @@ class LstmUnitGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("C")),
"Input(C@GRAD) should not be null");
......
......@@ -21,7 +21,6 @@ class MarginRankLossOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
// input check
PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) shouldn't be null.");
......@@ -94,7 +93,6 @@ class MarginRankLossGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) shouldn't be null.");
PADDLE_ENFORCE(ctx->HasInput("X1"), "Input(X1) shouldn't be null.");
......
if(WITH_GPU)
nv_library(math_function SRCS math_function.cc math_function.cu im2col.cc im2col.cu DEPS cblas device_context operator)
nv_test(math_function_test SRCS math_function_test.cc DEPS math_function tensor)
nv_test(math_function_gpu_test SRCS math_function_test.cu DEPS math_function tensor)
nv_library(selected_rows_functor SRCS selected_rows_functor.cc selected_rows_functor.cu DEPS selected_rows math_function)
nv_test(selected_rows_functor_gpu_test SRCS selected_rows_functor_test.cu DEPS selected_rows_functor)
nv_library(softmax SRCS softmax.cc softmax.cu DEPS operator)
nv_library(cross_entropy SRCS cross_entropy.cc cross_entropy.cu DEPS operator)
nv_library(pooling SRCS pooling.cc pooling.cu DEPS device_context)
nv_library(vol2col SRCS vol2col.cc vol2col.cu DEPS device_context)
else()
cc_library(math_function SRCS math_function.cc im2col.cc DEPS cblas device_context operator)
cc_test(math_function_test SRCS math_function_test.cc DEPS math_function tensor)
cc_library(selected_rows_functor SRCS selected_rows_functor.cc DEPS selected_rows math_function)
cc_library(softmax SRCS softmax.cc DEPS operator)
cc_library(cross_entropy SRCS cross_entropy.cc DEPS operator)
cc_library(pooling SRCS pooling.cc DEPS device_context)
cc_library(vol2col SRCS vol2col.cc DEPS device_context)
endif()
cc_test(math_function_test SRCS math_function_test.cc DEPS math_function tensor)
cc_test(selected_rows_functor_test SRCS selected_rows_functor_test.cc DEPS selected_rows_functor)
cc_test(im2col_test SRCS im2col_test.cc DEPS math_function tensor)
cc_test(vol2col_test SRCS vol2col_test.cc DEPS vol2col tensor)
......@@ -130,6 +130,8 @@ void matmul<platform::CPUPlace, double>(
matrix_b.data<double>(), beta, matrix_out->data<double>());
}
template struct SetConstant<platform::CPUPlace, float>;
} // namespace math
} // namespace operators
} // namespace paddle
......@@ -155,6 +155,8 @@ void matmul<platform::GPUPlace, double>(
matrix_b.data<double>(), beta, matrix_out->data<double>());
}
template struct SetConstant<platform::GPUPlace, float>;
} // namespace math
} // namespace operators
} // namespace paddle
......@@ -86,11 +86,14 @@ void matmul(const platform::DeviceContext& context,
framework::Tensor* matrix_out, T beta);
template <typename Place, typename T>
void SetConstant(const platform::DeviceContext& context,
framework::Tensor* tensor, T num) {
auto t = framework::EigenVector<T>::Flatten(*tensor);
t.device(*context.GetEigenDevice<Place>()) = t.constant(static_cast<T>(num));
}
struct SetConstant {
void operator()(const platform::DeviceContext& context,
framework::Tensor* tensor, T num) {
auto t = framework::EigenVector<T>::Flatten(*tensor);
t.device(*context.GetEigenDevice<Place>()) =
t.constant(static_cast<T>(num));
}
};
} // namespace math
} // namespace operators
......
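The hunk above turns `SetConstant` from a free function template into a struct with `operator()`. The following is a minimal standalone sketch of that pattern, not Paddle code: `CpuTag`, `FillConstant` and `MakeFilled` are hypothetical names, and a `std::vector` stands in for `framework::Tensor`. It illustrates two things a class template allows that a function template does not offer as directly: per-device partial specialization, and a one-line explicit instantiation like the `template struct SetConstant<platform::CPUPlace, float>;` lines added in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device tag standing in for platform::CPUPlace.
struct CpuTag {};

// Primary template is only declared, so each device supplies its own
// partial specialization (function templates cannot be partially specialized).
template <typename Place, typename T>
struct FillConstant;

// CPU specialization: write `value` into every element of `data`.
template <typename T>
struct FillConstant<CpuTag, T> {
  void operator()(std::vector<T>* data, T value) const {
    assert(data != nullptr);
    for (std::size_t i = 0; i < data->size(); ++i) {
      (*data)[i] = value;
    }
  }
};

// Explicit instantiation for one (place, type) pair.
template struct FillConstant<CpuTag, float>;

// Convenience helper: build a buffer of `n` elements filled with `value`.
inline std::vector<float> MakeFilled(std::size_t n, float value) {
  std::vector<float> buf(n);
  FillConstant<CpuTag, float> functor;
  functor(&buf, value);
  return buf;
}
```

Callers construct the functor once and invoke it repeatedly, which is the same shape as the updated `zero` test in this diff.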
#include "paddle/operators/math/math_function.h"
#include "gtest/gtest.h"
#ifdef PADDLE_WITH_CUDA
TEST(math_function, notrans_mul_trans) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor out_gpu;
paddle::framework::Tensor out;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr, 6 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input1, *gpu_place, context);
out_gpu.mutable_data<float>({2, 2}, *gpu_place);
paddle::operators::math::matmul<paddle::platform::GPUPlace, float>(
context, input1_gpu, false, input2_gpu, true, 1, &out_gpu, 0);
out.CopyFrom<float>(out_gpu, *cpu_place, context);
float* out_ptr = out.data<float>();
context.Wait();
EXPECT_EQ(out_ptr[0], 5);
EXPECT_EQ(out_ptr[1], 14);
EXPECT_EQ(out_ptr[2], 14);
EXPECT_EQ(out_ptr[3], 50);
delete cpu_place;
delete gpu_place;
}
TEST(math_function, trans_mul_notrans) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor out_gpu;
paddle::framework::Tensor out;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr, 6 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input1, *gpu_place, context);
out_gpu.mutable_data<float>({3, 3}, *gpu_place);
paddle::operators::math::matmul<paddle::platform::GPUPlace, float>(
context, input1_gpu, true, input2_gpu, false, 1, &out_gpu, 0);
out.CopyFrom<float>(out_gpu, *cpu_place, context);
float* out_ptr = out.data<float>();
context.Wait();
EXPECT_EQ(out_ptr[0], 9);
EXPECT_EQ(out_ptr[1], 12);
EXPECT_EQ(out_ptr[2], 15);
EXPECT_EQ(out_ptr[3], 12);
EXPECT_EQ(out_ptr[4], 17);
EXPECT_EQ(out_ptr[5], 22);
EXPECT_EQ(out_ptr[6], 15);
EXPECT_EQ(out_ptr[7], 22);
EXPECT_EQ(out_ptr[8], 29);
delete cpu_place;
delete gpu_place;
}
TEST(math_function, gemm_notrans_cublas) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input2;
paddle::framework::Tensor input3;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor input3_gpu;
int m = 2;
int n = 3;
int k = 3;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr1[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr1, 6 * sizeof(float));
float* input2_ptr = input2.mutable_data<float>({3, 4}, *cpu_place);
float arr2[12] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
memcpy(input2_ptr, arr2, 12 * sizeof(float));
float* input3_ptr = input3.mutable_data<float>({2, 4}, *cpu_place);
float arr3[8] = {0, 1, 2, 3, 4, 5, 6, 7};
memcpy(input3_ptr, arr3, 8 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input2, *gpu_place, context);
input3_gpu.CopyFrom<float>(input3, *gpu_place, context);
float* a = input1_gpu.data<float>();
float* b = input2_gpu.data<float>();
float* c = input3_gpu.mutable_data<float>(*gpu_place);
paddle::operators::math::gemm<paddle::platform::GPUPlace, float>(
context, false, false, m, n, k, 1, a, 3, b + 1, 4, 1, c + 1, 4);
input3.CopyFrom<float>(input3_gpu, *cpu_place, context);
// numpy code:
// a = np.arange(6).reshape(2, 3)
// b = np.arange(12).reshape(3, 4)[:, 1:]
// c = np.arange(8).reshape(2, 4)[:, 1:]
// out = np.arange(8).reshape(2, 4)
// out[:, 1:] = np.dot(a, b) + c
context.Wait();
EXPECT_EQ(input3_ptr[0], 0);
EXPECT_EQ(input3_ptr[1], 24);
EXPECT_EQ(input3_ptr[2], 28);
EXPECT_EQ(input3_ptr[3], 32);
EXPECT_EQ(input3_ptr[4], 4);
EXPECT_EQ(input3_ptr[5], 73);
EXPECT_EQ(input3_ptr[6], 86);
EXPECT_EQ(input3_ptr[7], 99);
delete cpu_place;
delete gpu_place;
}
TEST(math_function, gemm_trans_cublas) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input2;
paddle::framework::Tensor input3;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor input3_gpu;
int m = 2;
int n = 3;
int k = 3;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr1[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr1, 6 * sizeof(float));
float* input2_ptr = input2.mutable_data<float>({4, 3}, *cpu_place);
float arr2[12] = {0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11};
memcpy(input2_ptr, arr2, 12 * sizeof(float));
float* input3_ptr = input3.mutable_data<float>({2, 4}, *cpu_place);
float arr3[8] = {0, 1, 2, 3, 4, 5, 6, 7};
memcpy(input3_ptr, arr3, 8 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input2, *gpu_place, context);
input3_gpu.CopyFrom<float>(input3, *gpu_place, context);
float* a = input1_gpu.data<float>();
float* b = input2_gpu.data<float>();
float* c = input3_gpu.mutable_data<float>(*gpu_place);
paddle::operators::math::gemm<paddle::platform::GPUPlace, float>(
context, false, true, m, n, k, 1, a, 3, b + 3, 3, 1, c + 1, 4);
input3.CopyFrom<float>(input3_gpu, *cpu_place, context);
context.Wait();
EXPECT_EQ(input3_ptr[0], 0);
EXPECT_EQ(input3_ptr[1], 24);
EXPECT_EQ(input3_ptr[2], 28);
EXPECT_EQ(input3_ptr[3], 32);
EXPECT_EQ(input3_ptr[4], 4);
EXPECT_EQ(input3_ptr[5], 73);
EXPECT_EQ(input3_ptr[6], 86);
EXPECT_EQ(input3_ptr[7], 99);
delete cpu_place;
delete gpu_place;
}
#endif
TEST(math_function, gemm_notrans_cblas) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input2;
......@@ -253,15 +74,15 @@ TEST(math_function, zero) {
auto* cpu_place = new paddle::platform::CPUPlace();
float* t = tensor.mutable_data<float>({2, 2}, *cpu_place);
paddle::platform::CPUDeviceContext context(*cpu_place);
paddle::operators::math::SetConstant<paddle::platform::CPUPlace, float>(
context, &tensor, 0);
paddle::operators::math::SetConstant<paddle::platform::CPUPlace, float>
functor;
functor(context, &tensor, 0);
EXPECT_EQ(t[0], 0);
EXPECT_EQ(t[1], 0);
EXPECT_EQ(t[2], 0);
EXPECT_EQ(t[3], 0);
paddle::operators::math::SetConstant<paddle::platform::CPUPlace, float>(
context, &tensor, 1);
functor(context, &tensor, 1);
EXPECT_EQ(t[0], 1);
EXPECT_EQ(t[1], 1);
......
#include "gtest/gtest.h"
#include "paddle/operators/math/math_function.h"
TEST(math_function, notrans_mul_trans) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor out_gpu;
paddle::framework::Tensor out;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr, 6 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input1, *gpu_place, context);
out_gpu.mutable_data<float>({2, 2}, *gpu_place);
paddle::operators::math::matmul<paddle::platform::GPUPlace, float>(
context, input1_gpu, false, input2_gpu, true, 1, &out_gpu, 0);
out.CopyFrom<float>(out_gpu, *cpu_place, context);
float* out_ptr = out.data<float>();
context.Wait();
EXPECT_EQ(out_ptr[0], 5);
EXPECT_EQ(out_ptr[1], 14);
EXPECT_EQ(out_ptr[2], 14);
EXPECT_EQ(out_ptr[3], 50);
delete cpu_place;
delete gpu_place;
}
TEST(math_function, trans_mul_notrans) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor out_gpu;
paddle::framework::Tensor out;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr, 6 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input1, *gpu_place, context);
out_gpu.mutable_data<float>({3, 3}, *gpu_place);
paddle::operators::math::matmul<paddle::platform::GPUPlace, float>(
context, input1_gpu, true, input2_gpu, false, 1, &out_gpu, 0);
out.CopyFrom<float>(out_gpu, *cpu_place, context);
float* out_ptr = out.data<float>();
context.Wait();
EXPECT_EQ(out_ptr[0], 9);
EXPECT_EQ(out_ptr[1], 12);
EXPECT_EQ(out_ptr[2], 15);
EXPECT_EQ(out_ptr[3], 12);
EXPECT_EQ(out_ptr[4], 17);
EXPECT_EQ(out_ptr[5], 22);
EXPECT_EQ(out_ptr[6], 15);
EXPECT_EQ(out_ptr[7], 22);
EXPECT_EQ(out_ptr[8], 29);
delete cpu_place;
delete gpu_place;
}
TEST(math_function, gemm_notrans_cublas) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input2;
paddle::framework::Tensor input3;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor input3_gpu;
int m = 2;
int n = 3;
int k = 3;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr1[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr1, 6 * sizeof(float));
float* input2_ptr = input2.mutable_data<float>({3, 4}, *cpu_place);
float arr2[12] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
memcpy(input2_ptr, arr2, 12 * sizeof(float));
float* input3_ptr = input3.mutable_data<float>({2, 4}, *cpu_place);
float arr3[8] = {0, 1, 2, 3, 4, 5, 6, 7};
memcpy(input3_ptr, arr3, 8 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input2, *gpu_place, context);
input3_gpu.CopyFrom<float>(input3, *gpu_place, context);
float* a = input1_gpu.data<float>();
float* b = input2_gpu.data<float>();
float* c = input3_gpu.mutable_data<float>(*gpu_place);
paddle::operators::math::gemm<paddle::platform::GPUPlace, float>(
context, false, false, m, n, k, 1, a, 3, b + 1, 4, 1, c + 1, 4);
input3.CopyFrom<float>(input3_gpu, *cpu_place, context);
// numpy code:
// a = np.arange(6).reshape(2, 3)
// b = np.arange(12).reshape(3, 4)[:, 1:]
// c = np.arange(8).reshape(2, 4)[:, 1:]
// out = np.arange(8).reshape(2, 4)
// out[:, 1:] = np.dot(a, b) + c
context.Wait();
EXPECT_EQ(input3_ptr[0], 0);
EXPECT_EQ(input3_ptr[1], 24);
EXPECT_EQ(input3_ptr[2], 28);
EXPECT_EQ(input3_ptr[3], 32);
EXPECT_EQ(input3_ptr[4], 4);
EXPECT_EQ(input3_ptr[5], 73);
EXPECT_EQ(input3_ptr[6], 86);
EXPECT_EQ(input3_ptr[7], 99);
delete cpu_place;
delete gpu_place;
}
TEST(math_function, gemm_trans_cublas) {
paddle::framework::Tensor input1;
paddle::framework::Tensor input2;
paddle::framework::Tensor input3;
paddle::framework::Tensor input1_gpu;
paddle::framework::Tensor input2_gpu;
paddle::framework::Tensor input3_gpu;
int m = 2;
int n = 3;
int k = 3;
auto* cpu_place = new paddle::platform::CPUPlace();
float* input1_ptr = input1.mutable_data<float>({2, 3}, *cpu_place);
float arr1[6] = {0, 1, 2, 3, 4, 5};
memcpy(input1_ptr, arr1, 6 * sizeof(float));
float* input2_ptr = input2.mutable_data<float>({4, 3}, *cpu_place);
float arr2[12] = {0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11};
memcpy(input2_ptr, arr2, 12 * sizeof(float));
float* input3_ptr = input3.mutable_data<float>({2, 4}, *cpu_place);
float arr3[8] = {0, 1, 2, 3, 4, 5, 6, 7};
memcpy(input3_ptr, arr3, 8 * sizeof(float));
auto* gpu_place = new paddle::platform::GPUPlace(0);
paddle::platform::CUDADeviceContext context(*gpu_place);
input1_gpu.CopyFrom<float>(input1, *gpu_place, context);
input2_gpu.CopyFrom<float>(input2, *gpu_place, context);
input3_gpu.CopyFrom<float>(input3, *gpu_place, context);
float* a = input1_gpu.data<float>();
float* b = input2_gpu.data<float>();
float* c = input3_gpu.mutable_data<float>(*gpu_place);
paddle::operators::math::gemm<paddle::platform::GPUPlace, float>(
context, false, true, m, n, k, 1, a, 3, b + 3, 3, 1, c + 1, 4);
input3.CopyFrom<float>(input3_gpu, *cpu_place, context);
context.Wait();
EXPECT_EQ(input3_ptr[0], 0);
EXPECT_EQ(input3_ptr[1], 24);
EXPECT_EQ(input3_ptr[2], 28);
EXPECT_EQ(input3_ptr[3], 32);
EXPECT_EQ(input3_ptr[4], 4);
EXPECT_EQ(input3_ptr[5], 73);
EXPECT_EQ(input3_ptr[6], 86);
EXPECT_EQ(input3_ptr[7], 99);
delete cpu_place;
delete gpu_place;
}
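The `gemm_notrans_cublas` test passes offset pointers (`b + 1`, `c + 1`) together with leading dimensions of 4, so it multiplies and accumulates into sub-matrices of larger buffers. The sketch below is a plain CPU reference for that arithmetic, assuming row-major storage as the numpy comment in the test suggests. `ReferenceGemm` and `RunSubMatrixExample` are hypothetical names and are not part of `paddle::operators::math`; only the non-transposed case is covered.

```cpp
#include <cassert>
#include <vector>

// Row-major reference GEMM without transposes:
//   C[i][j] = alpha * sum_p A[i][p] * B[p][j] + beta * C[i][j]
// lda, ldb and ldc are the row strides (leading dimensions) of the buffers,
// which may be larger than the logical widths k, n and n.
inline void ReferenceGemm(int m, int n, int k, float alpha, const float* a,
                          int lda, const float* b, int ldb, float beta,
                          float* c, int ldc) {
  assert(lda >= k && ldb >= n && ldc >= n);
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (int p = 0; p < k; ++p) {
        acc += a[i * lda + p] * b[p * ldb + j];
      }
      c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
    }
  }
}

// Reproduces the inputs of the test: a is 2x3, b is columns 1..3 of a 3x4
// buffer, and c is columns 1..3 of a 2x4 buffer that is updated in place.
inline std::vector<float> RunSubMatrixExample() {
  std::vector<float> a = {0, 1, 2, 3, 4, 5};
  std::vector<float> b = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
  std::vector<float> c = {0, 1, 2, 3, 4, 5, 6, 7};
  ReferenceGemm(2, 3, 3, 1.0f, a.data(), 3, b.data() + 1, 4, 1.0f,
                c.data() + 1, 4);
  return c;
}
```

Column 0 of `c` is never touched because the output pointer starts at `c + 1` and each row only writes `n = 3` elements, which is why the test expects indices 0 and 4 to keep their original values.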
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/math/selected_rows_functor.h"
#include "paddle/operators/math/math_function.h"
namespace paddle {
namespace operators {
namespace math {
template <typename T>
struct SelectedRowsAdd<platform::CPUPlace, T> {
void operator()(const platform::DeviceContext& context,
const framework::SelectedRows& input1,
const framework::SelectedRows& input2,
framework::SelectedRows* output) {
auto in1_height = input1.height();
PADDLE_ENFORCE_EQ(in1_height, input2.height());
output->set_height(in1_height);
auto& in1_rows = input1.rows();
auto& in2_rows = input2.rows();
std::vector<int64_t> out_rows;
out_rows.reserve(in1_rows.size() + in2_rows.size());
// concat rows
out_rows.insert(out_rows.end(), in1_rows.begin(), in1_rows.end());
out_rows.insert(out_rows.end(), in2_rows.begin(), in2_rows.end());
output->set_rows(out_rows);
auto* out_value = output->mutable_value();
auto& in1_value = input1.value();
auto& in2_value = input2.value();
auto in1_row_numel = in1_value.numel() / in1_rows.size();
PADDLE_ENFORCE_EQ(in1_row_numel, in2_value.numel() / in2_rows.size());
PADDLE_ENFORCE_EQ(in1_row_numel, out_value->numel() / out_rows.size());
auto in1_place = input1.place();
PADDLE_ENFORCE(platform::is_cpu_place(in1_place));
auto in2_place = input2.place();
PADDLE_ENFORCE(platform::is_cpu_place(in2_place));
auto out_place = context.GetPlace();
PADDLE_ENFORCE(platform::is_cpu_place(out_place));
auto* out_data = out_value->data<T>();
auto* in1_data = in1_value.data<T>();
memory::Copy(boost::get<platform::CPUPlace>(out_place), out_data,
boost::get<platform::CPUPlace>(in1_place), in1_data,
in1_value.numel() * sizeof(T));
auto* in2_data = in2_value.data<T>();
memory::Copy(boost::get<platform::CPUPlace>(out_place),
out_data + in1_value.numel(),
boost::get<platform::CPUPlace>(in2_place), in2_data,
in2_value.numel() * sizeof(T));
}
};
template struct SelectedRowsAdd<platform::CPUPlace, float>;
template <typename T>
struct SelectedRowsAddTensor<platform::CPUPlace, T> {
void operator()(const platform::DeviceContext& context,
const framework::SelectedRows& input1,
const framework::Tensor& input2, framework::Tensor* output) {
auto in1_height = input1.height();
auto in2_dims = input2.dims();
auto out_dims = output->dims();
PADDLE_ENFORCE_EQ(in1_height, in2_dims[0]);
PADDLE_ENFORCE_EQ(in1_height, out_dims[0]);
auto& in1_value = input1.value();
auto& in1_rows = input1.rows();
int64_t in1_row_numel = in1_value.numel() / in1_rows.size();
PADDLE_ENFORCE_EQ(in1_row_numel, input2.numel() / in1_height);
PADDLE_ENFORCE_EQ(in1_row_numel, output->numel() / in1_height);
SetConstant<platform::CPUPlace, T> functor;
functor(context, output, 0.0);
auto* in1_data = in1_value.data<T>();
auto* out_data = output->data<T>();
for (size_t i = 0; i < in1_rows.size(); i++) {
for (int64_t j = 0; j < in1_row_numel; j++) {
out_data[in1_rows[i] * in1_row_numel + j] +=
in1_data[i * in1_row_numel + j];
}
}
auto out_eigen = framework::EigenVector<T>::Flatten(*output);
auto in2_eigen = framework::EigenVector<T>::Flatten(input2);
out_eigen.device(*context.GetEigenDevice<platform::CPUPlace>()) =
out_eigen + in2_eigen;
}
};
template struct SelectedRowsAddTensor<platform::CPUPlace, float>;
} // namespace math
} // namespace operators
} // namespace paddle
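The two CPU functors above can be read as "concatenate, then scatter-add". The sketch below restates that logic with standard containers only, as an illustration under stated assumptions: `SparseRows`, `ConcatAdd` and `AddToDense` are hypothetical names, the dense tensor is a flat row-major `std::vector<float>`, and none of the Paddle place or device-context checks are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for framework::SelectedRows.
struct SparseRows {
  int64_t height;             // number of rows of the dense tensor it indexes
  std::vector<int64_t> rows;  // row indices, duplicates allowed
  std::vector<float> value;   // rows.size() * row_numel values, row-major
};

// SparseRows + SparseRows: append row indices and values, no arithmetic.
inline SparseRows ConcatAdd(const SparseRows& x, const SparseRows& y) {
  assert(x.height == y.height);
  SparseRows out;
  out.height = x.height;
  out.rows = x.rows;
  out.rows.insert(out.rows.end(), y.rows.begin(), y.rows.end());
  out.value = x.value;
  out.value.insert(out.value.end(), y.value.begin(), y.value.end());
  return out;
}

// SparseRows + dense: start from the dense values and accumulate each
// sparse row into the dense row it names. Duplicate indices add up.
inline std::vector<float> AddToDense(const SparseRows& x,
                                     const std::vector<float>& dense,
                                     int64_t row_numel) {
  assert(static_cast<int64_t>(dense.size()) == x.height * row_numel);
  assert(static_cast<int64_t>(x.value.size()) ==
         static_cast<int64_t>(x.rows.size()) * row_numel);
  std::vector<float> out = dense;
  for (std::size_t i = 0; i < x.rows.size(); ++i) {
    for (int64_t j = 0; j < row_numel; ++j) {
      const std::size_t dst =
          static_cast<std::size_t>(x.rows[i] * row_numel + j);
      const std::size_t src =
          i * static_cast<std::size_t>(row_numel) + static_cast<std::size_t>(j);
      out[dst] += x.value[src];
    }
  }
  return out;
}
```

Deferring the arithmetic to the dense step keeps the sparse-plus-sparse case a pure copy, at the cost of possibly repeated row indices in the result.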
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/math/math_function.h"
#include "paddle/operators/math/selected_rows_functor.h"
#include "paddle/platform/cuda_helper.h"
namespace paddle {
namespace operators {
namespace math {
template <typename T>
struct SelectedRowsAdd<platform::GPUPlace, T> {
void operator()(const platform::DeviceContext& context,
const framework::SelectedRows& input1,
const framework::SelectedRows& input2,
framework::SelectedRows* output) {
auto in1_height = input1.height();
PADDLE_ENFORCE_EQ(in1_height, input2.height());
output->set_height(in1_height);
auto& in1_rows = input1.rows();
auto& in2_rows = input2.rows();
std::vector<int64_t> out_rows;
out_rows.reserve(in1_rows.size() + in2_rows.size());
// concat rows
out_rows.insert(out_rows.end(), in1_rows.begin(), in1_rows.end());
out_rows.insert(out_rows.end(), in2_rows.begin(), in2_rows.end());
output->set_rows(out_rows);
auto* out_value = output->mutable_value();
auto& in1_value = input1.value();
auto& in2_value = input2.value();
auto in1_row_numel = in1_value.numel() / in1_rows.size();
PADDLE_ENFORCE_EQ(in1_row_numel, in2_value.numel() / in2_rows.size());
PADDLE_ENFORCE_EQ(in1_row_numel, out_value->numel() / out_rows.size());
auto* out_data = out_value->data<T>();
auto* in1_data = in1_value.data<T>();
auto in1_place = input1.place();
PADDLE_ENFORCE(platform::is_gpu_place(in1_place));
auto in2_place = input2.place();
PADDLE_ENFORCE(platform::is_gpu_place(in2_place));
auto out_place = context.GetPlace();
PADDLE_ENFORCE(platform::is_gpu_place(out_place));
memory::Copy(
boost::get<platform::GPUPlace>(out_place), out_data,
boost::get<platform::GPUPlace>(in1_place), in1_data,
in1_value.numel() * sizeof(T),
reinterpret_cast<const platform::CUDADeviceContext&>(context).stream());
auto* in2_data = in2_value.data<T>();
memory::Copy(
boost::get<platform::GPUPlace>(out_place), out_data + in1_value.numel(),
boost::get<platform::GPUPlace>(in2_place), in2_data,
in2_value.numel() * sizeof(T),
reinterpret_cast<const platform::CUDADeviceContext&>(context).stream());
}
};
template struct SelectedRowsAdd<platform::GPUPlace, float>;
namespace {
template <typename T>
__global__ void SelectedRowsAddTensorKernel(const T* selected_rows,
const int64_t* rows, T* tensor_out,
int64_t row_numel, int block_size) {
const int ty = blockIdx.y;
int tid = threadIdx.x;
selected_rows += ty * row_numel;
tensor_out += rows[ty] * row_numel;
for (int index = tid; index < row_numel; index += block_size) {
// Row indices in a SelectedRows may contain duplicates, so a plain
// tensor_out[index] += selected_rows[index] could race between blocks.
// Use CudaAtomicAdd so that concurrent writes accumulate correctly.
paddle::platform::CudaAtomicAdd(tensor_out + index, selected_rows[index]);
}
}
} // namespace
template <typename T>
struct SelectedRowsAddTensor<platform::GPUPlace, T> {
void operator()(const platform::DeviceContext& context,
const framework::SelectedRows& input1,
const framework::Tensor& input2, framework::Tensor* output) {
auto in1_height = input1.height();
auto in2_dims = input2.dims();
auto out_dims = output->dims();
PADDLE_ENFORCE_EQ(in1_height, in2_dims[0]);
PADDLE_ENFORCE_EQ(in1_height, out_dims[0]);
auto& in1_value = input1.value();
auto& in1_rows = input1.rows();
int64_t in1_row_numel = in1_value.numel() / in1_rows.size();
PADDLE_ENFORCE_EQ(in1_row_numel, input2.numel() / in1_height);
PADDLE_ENFORCE_EQ(in1_row_numel, output->numel() / in1_height);
auto* in1_data = in1_value.data<T>();
auto* in2_data = input2.data<T>();
auto* out_data = output->data<T>();
SetConstant<platform::GPUPlace, T> functor;
functor(context, output, 0.0);
int block_size = 256;
dim3 threads(block_size, 1);
dim3 grid(1, in1_rows.size());
SelectedRowsAddTensorKernel<
T><<<grid, threads, 0,
reinterpret_cast<const platform::CUDADeviceContext&>(context)
.stream()>>>(in1_data, in1_rows.data(), out_data,
in1_row_numel, block_size);
auto out_eigen = framework::EigenVector<T>::Flatten(*output);
auto in2_eigen = framework::EigenVector<T>::Flatten(input2);
out_eigen.device(*context.GetEigenDevice<platform::GPUPlace>()) =
out_eigen + in2_eigen;
}
};
template struct SelectedRowsAddTensor<platform::GPUPlace, float>;
} // namespace math
} // namespace operators
} // namespace paddle
......@@ -11,32 +11,31 @@ distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"
#include "paddle/framework/selected_rows.h"
#include "paddle/platform/device_context.h"
namespace paddle {
namespace operators {
namespace math {
// SelectedRows + SelectedRows simply concatenates the values and rows.
// The real computation happens when a SelectedRows is added to a LoDTensor.
template <typename Place, typename T>
struct SelectedRowsAdd {
void operator()(const platform::DeviceContext& context,
const framework::SelectedRows& input1,
const framework::SelectedRows& input2,
framework::SelectedRows* output);
};
template <typename T>
class FeedKernel : public framework::OpKernel<T> {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
framework::Tensor* out = ctx.Output<framework::Tensor>("Out");
out->mutable_data<T>(ctx.GetPlace());
framework::Variable* g_feed_variable =
framework::GetGlobalScope().FindVar("feed_value");
const auto& tensors =
g_feed_variable->Get<std::vector<framework::Tensor>>();
int col = ctx.template Attr<int>("col");
PADDLE_ENFORCE_GT(tensors.size(), static_cast<size_t>(col));
// TODO(qijun):
// check tensors[col].dims() against the attribute,
// except for the first dimension.
out->CopyFrom<T>(tensors[col], ctx.GetPlace(), ctx.device_context());
}
template <typename Place, typename T>
struct SelectedRowsAddTensor {
void operator()(const platform::DeviceContext& context,
const framework::SelectedRows& input1,
const framework::Tensor& input2, framework::Tensor* output);
};
} // namespace math
} // namespace operators
} // namespace paddle
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "paddle/operators/math/selected_rows_functor.h"
#include "gtest/gtest.h"
#include "paddle/operators/math/math_function.h"
TEST(selected_rows_functor, cpu_add) {
using namespace paddle::framework;
using namespace paddle::platform;
using namespace paddle::operators::math;
CPUPlace cpu_place;
CPUDeviceContext ctx(cpu_place);
SetConstant<CPUPlace, float> functor;
int64_t height = 10;
int64_t row_numel = 10;
std::vector<int64_t> rows1{0, 4, 7};
std::unique_ptr<SelectedRows> selected_rows1{new SelectedRows(rows1, height)};
auto* in1_value = selected_rows1->mutable_value();
in1_value->mutable_data<float>(
make_ddim({static_cast<int64_t>(rows1.size()), row_numel}), cpu_place);
functor(ctx, in1_value, 1.0);
std::vector<int64_t> rows2{0, 5, 7, 9};
std::unique_ptr<SelectedRows> selected_rows2{new SelectedRows(rows2, height)};
auto* in2_value = selected_rows2->mutable_value();
in2_value->mutable_data<float>(
make_ddim({static_cast<int64_t>(rows2.size()), row_numel}), cpu_place);
functor(ctx, in2_value, 2.0);
std::unique_ptr<SelectedRows> output{new SelectedRows()};
auto* out_value = output->mutable_value();
// simply concatenate the two SelectedRows
out_value->mutable_data<float>(make_ddim({7, 10}), cpu_place);
SelectedRowsAdd<CPUPlace, float> add_functor;
add_functor(ctx, *selected_rows1, *selected_rows2, output.get());
auto out_height = output->height();
EXPECT_EQ(out_height, height);
auto& out_rows = output->rows();
// input1 rows
EXPECT_EQ(out_rows[0], 0);
EXPECT_EQ(out_rows[1], 4);
EXPECT_EQ(out_rows[2], 7);
// input2 rows
EXPECT_EQ(out_rows[3], 0);
EXPECT_EQ(out_rows[4], 5);
EXPECT_EQ(out_rows[5], 7);
EXPECT_EQ(out_rows[6], 9);
auto* out_data = output->value().data<float>();
// input1 value
EXPECT_EQ(out_data[0 * row_numel + 0], 1.0);
EXPECT_EQ(out_data[0 * row_numel + 8], 1.0);
EXPECT_EQ(out_data[1 * row_numel + 1], 1.0);
EXPECT_EQ(out_data[2 * row_numel + 6], 1.0);
// input2 value
EXPECT_EQ(out_data[3 * row_numel + 3], 2.0);
EXPECT_EQ(out_data[3 * row_numel + 8], 2.0);
EXPECT_EQ(out_data[4 * row_numel + 4], 2.0);
EXPECT_EQ(out_data[5 * row_numel + 7], 2.0);
EXPECT_EQ(out_data[6 * row_numel + 9], 2.0);
std::unique_ptr<Tensor> tensor1{new Tensor()};
tensor1->mutable_data<float>(make_ddim({height, row_numel}), cpu_place);
functor(ctx, tensor1.get(), 3.0);
std::unique_ptr<Tensor> tensor2{new Tensor()};
tensor2->mutable_data<float>(make_ddim({height, row_numel}), cpu_place);
SelectedRowsAddTensor<CPUPlace, float> add_tensor_functor;
add_tensor_functor(ctx, *output, *tensor1, tensor2.get());
auto* tensor2_data = tensor2->data<float>();
// row0: 1.0 + 2.0 + 3.0
EXPECT_EQ(tensor2_data[0 * row_numel + 0], 6.0);
// row1: 3.0
EXPECT_EQ(tensor2_data[1 * row_numel + 1], 3.0);
// row4: 1.0 + 3.0
EXPECT_EQ(tensor2_data[4 * row_numel + 6], 4.0);
// row5: 2.0 + 3.0
EXPECT_EQ(tensor2_data[5 * row_numel + 7], 5.0);
// row6: 3.0
EXPECT_EQ(tensor2_data[6 * row_numel + 1], 3.0);
// row7: 1.0 + 2.0 + 3.0
EXPECT_EQ(tensor2_data[7 * row_numel + 3], 6.0);
// row9: 2.0 + 3.0
EXPECT_EQ(tensor2_data[9 * row_numel + 6], 5.0);
}
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
#include "gtest/gtest.h"
#include "paddle/operators/math/math_function.h"
#include "paddle/operators/math/selected_rows_functor.h"
TEST(selected_rows_functor, gpu_add) {
using namespace paddle::framework;
using namespace paddle::platform;
using namespace paddle::operators::math;
GPUPlace gpu_place(0);
CPUPlace cpu_place;
CUDADeviceContext ctx(gpu_place);
SetConstant<GPUPlace, float> functor;
int64_t height = 10;
int64_t row_numel = 10;
std::vector<int64_t> rows1{0, 4, 7};
std::unique_ptr<SelectedRows> selected_rows1{new SelectedRows(rows1, height)};
auto* in1_value = selected_rows1->mutable_value();
in1_value->mutable_data<float>(
make_ddim({static_cast<int64_t>(rows1.size()), row_numel}), gpu_place);
functor(ctx, in1_value, 1.0);
std::vector<int64_t> rows2{0, 5, 7, 9};
std::unique_ptr<SelectedRows> selected_rows2{new SelectedRows(rows2, height)};
auto* in2_value = selected_rows2->mutable_value();
in2_value->mutable_data<float>(
make_ddim({static_cast<int64_t>(rows2.size()), row_numel}), gpu_place);
functor(ctx, in2_value, 2.0);
std::unique_ptr<SelectedRows> output{new SelectedRows()};
auto* out_value = output->mutable_value();
// simply concatenate the two SelectedRows
out_value->mutable_data<float>(make_ddim({7, 10}), gpu_place);
SelectedRowsAdd<GPUPlace, float> add_functor;
add_functor(ctx, *selected_rows1, *selected_rows2, output.get());
auto out_height = output->height();
EXPECT_EQ(out_height, height);
auto& out_rows = output->rows();
// input1 rows
EXPECT_EQ(out_rows[0], 0);
EXPECT_EQ(out_rows[1], 4);
EXPECT_EQ(out_rows[2], 7);
// input2 rows
EXPECT_EQ(out_rows[3], 0);
EXPECT_EQ(out_rows[4], 5);
EXPECT_EQ(out_rows[5], 7);
EXPECT_EQ(out_rows[6], 9);
Tensor out_cpu;
out_cpu.CopyFrom<float>(*out_value, cpu_place, ctx);
ctx.Wait();
auto* out_cpu_data = out_cpu.data<float>();
// input1 value
EXPECT_EQ(out_cpu_data[0 * row_numel + 0], 1.0);
EXPECT_EQ(out_cpu_data[0 * row_numel + 8], 1.0);
EXPECT_EQ(out_cpu_data[1 * row_numel + 1], 1.0);
EXPECT_EQ(out_cpu_data[2 * row_numel + 6], 1.0);
// input2 value
EXPECT_EQ(out_cpu_data[3 * row_numel + 3], 2.0);
EXPECT_EQ(out_cpu_data[3 * row_numel + 8], 2.0);
EXPECT_EQ(out_cpu_data[4 * row_numel + 4], 2.0);
EXPECT_EQ(out_cpu_data[5 * row_numel + 7], 2.0);
EXPECT_EQ(out_cpu_data[6 * row_numel + 9], 2.0);
std::unique_ptr<Tensor> tensor1{new Tensor()};
tensor1->mutable_data<float>(make_ddim({height, row_numel}), gpu_place);
functor(ctx, tensor1.get(), 3.0);
std::unique_ptr<Tensor> tensor2{new Tensor()};
tensor2->mutable_data<float>(make_ddim({height, row_numel}), gpu_place);
SelectedRowsAddTensor<GPUPlace, float> add_tensor_functor;
add_tensor_functor(ctx, *output, *tensor1, tensor2.get());
Tensor tensor2_cpu;
tensor2_cpu.CopyFrom<float>(*tensor2, cpu_place, ctx);
ctx.Wait();
auto* tensor2_cpu_data = tensor2_cpu.data<float>();
// row0: 1.0 + 2.0 + 3.0
EXPECT_EQ(tensor2_cpu_data[0 * row_numel + 0], 6.0);
// row1: 3.0
EXPECT_EQ(tensor2_cpu_data[1 * row_numel + 1], 3.0);
// row4 : 1.0 + 3.0
EXPECT_EQ(tensor2_cpu_data[4 * row_numel + 6], 4.0);
// row5: 2.0 + 3.0
EXPECT_EQ(tensor2_cpu_data[5 * row_numel + 7], 5.0);
// row6: 3.0
EXPECT_EQ(tensor2_cpu_data[6 * row_numel + 1], 3.0);
// row7: 1.0 + 2.0 + 3.0
EXPECT_EQ(tensor2_cpu_data[7 * row_numel + 3], 6.0);
// row9: 2.0 + 3.0
EXPECT_EQ(tensor2_cpu_data[9 * row_numel + 6], 5.0);
}
......@@ -21,7 +21,6 @@ class MeanOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of MeanOp should not be null.");
......@@ -46,7 +45,6 @@ class MeanGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
ctx->SetOutputDim(framework::GradVarName("X"), ctx->GetInputDim("X"));
}
......
......@@ -25,7 +25,6 @@ class MinusOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of MinusOp should not be null.");
......
......@@ -21,7 +21,6 @@ class ModifiedHuberLossOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "X must be initialized.");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Y must be initialized.");
......@@ -73,7 +72,6 @@ class ModifiedHuberLossGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "X must be initialized.");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Y must be initialized.");
......
......@@ -23,7 +23,6 @@ class MulOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) of MulOp should not be null.");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Input(Y) of MulOp should not be null.");
......@@ -96,7 +95,6 @@ class MulOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Input(Y) should not be null");
......
......@@ -23,7 +23,6 @@ class MultiplexOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Ids"), "Input(Ids) shouldn't be null.");
PADDLE_ENFORCE(!ctx->Inputs("X").empty(),
......@@ -51,6 +50,7 @@ class MultiplexOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", in_dim);
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.MultiInput<Tensor>("X")[0]->type());
......@@ -89,7 +89,6 @@ class MultiplexGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(!ctx->Inputs("X").empty(), "Input(X) should not be null.");
PADDLE_ENFORCE(!ctx->Outputs(framework::GradVarName("X")).empty(),
......@@ -105,6 +104,7 @@ class MultiplexGradOp : public framework::OperatorWithKernel {
ctx->SetOutputsDim(framework::GradVarName("X"), d_ins);
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.MultiInput<Tensor>("X")[0]->type());
......
......@@ -11,7 +11,7 @@ When defining an operator in Paddle, a corresponding [OpProtoMaker](https://gith
- If an operator's Input/Output are mathematical tensors that do not correspond to any meaningful word, input names should start from `X` (e.g. `X`, `Y`) and output names should start from `Out` (e.g. `Out`). This rule is intended to unify the naming of operators that have few inputs/outputs.
- Attribute.
- Attribute name follows the **camelCase**. e.g. `x`, `y`, `axis`, `rowwiseMatrix`. Also, attribute name prefers to meaningful English words.
- Attribute name follows the **snake_case**. e.g. `x`, `y`, `axis`, `rowwise_matrix`. Also, attribute name prefers to meaningful English words.
- Comments.
- Input/Output/Attr comments follow the format **(type, default value) usage**, stating which type the item can be and how it is used in the operator. e.g. the attribute `"gamma"` in Accumulator: `(float, default 1.0) Accumulation multiplier`.
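
As an illustration of the snake_case attribute rule, the sketch below checks a name against the convention and converts a camelCase name to snake_case. `IsSnakeCase` and `ToSnakeCase` are hypothetical helpers written for this example; they are not part of Paddle, and the exact rules they enforce (lowercase letters, digits, single underscores, no leading or trailing underscore) are an assumption about what the convention means.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: returns true if `name` is non-empty lower snake_case,
// i.e. lowercase letters, digits and single underscores, starting with a
// letter and not ending with an underscore.
bool IsSnakeCase(const std::string& name) {
  if (name.empty() || !std::islower(static_cast<unsigned char>(name[0]))) {
    return false;
  }
  char prev = '\0';
  for (char c : name) {
    unsigned char uc = static_cast<unsigned char>(c);
    if (!(std::islower(uc) || std::isdigit(uc) || c == '_')) return false;
    if (c == '_' && prev == '_') return false;  // no double underscores
    prev = c;
  }
  return name.back() != '_';
}

// Hypothetical helper: converts a camelCase name to snake_case by inserting
// an underscore before each uppercase letter (except at position 0) and
// lowercasing it.
std::string ToSnakeCase(const std::string& name) {
  std::string out;
  for (std::size_t i = 0; i < name.size(); ++i) {
    unsigned char uc = static_cast<unsigned char>(name[i]);
    if (std::isupper(uc)) {
      if (i != 0) out.push_back('_');
      out.push_back(static_cast<char>(std::tolower(uc)));
    } else {
      out.push_back(name[i]);
    }
  }
  return out;
}
```

Under these assumptions, names such as `poolingType`, `globalPooling`, and `softLabel` fail the check, while their converted forms `pooling_type`, `global_pooling`, and `soft_label` pass it.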
......
......@@ -23,7 +23,6 @@ class PadOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) of PadOp should not be null.");
PADDLE_ENFORCE(ctx->HasOutput("Out"),
......@@ -97,7 +96,6 @@ class PadOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -29,7 +29,7 @@ void PoolOp::InferShape(framework::InferShapeContext *ctx) const {
auto in_x_dims = ctx->GetInputDim("X");
std::string pooling_type = ctx->Attrs().Get<std::string>("poolingType");
std::string pooling_type = ctx->Attrs().Get<std::string>("pooling_type");
std::vector<int> ksize = ctx->Attrs().Get<std::vector<int>>("ksize");
std::vector<int> strides = ctx->Attrs().Get<std::vector<int>>("strides");
std::vector<int> paddings = ctx->Attrs().Get<std::vector<int>>("paddings");
......@@ -37,7 +37,7 @@ void PoolOp::InferShape(framework::InferShapeContext *ctx) const {
PADDLE_ENFORCE(in_x_dims.size() == 4 || in_x_dims.size() == 5,
"Pooling intput should be 4-D or 5-D tensor.");
if (ctx->Attrs().Get<bool>("globalPooling")) {
if (ctx->Attrs().Get<bool>("global_pooling")) {
ksize.resize(static_cast<size_t>(in_x_dims.size()) - 2);
for (size_t i = 0; i < ksize.size(); ++i)
ksize[i] = static_cast<int>(in_x_dims[i + 2]);
......@@ -80,23 +80,23 @@ Pool2dOpMaker::Pool2dOpMaker(framework::OpProto *proto,
"the number of channels, H and W is the height and "
"width of feature.");
AddAttr<std::string>("poolingType",
"PoolingType of pooling operator."
AddAttr<std::string>("pooling_type",
"Pooling_type of pooling operator."
"Str constant equal to 'max' or 'avg'.")
.InEnum({"max", "avg"});
AddAttr<std::vector<int>>(
"ksize",
"The pooling window size(height, width) of pooling operator."
"If globalPooling = true, ksize is ignored and need not be "
"If global_pooling = true, ksize is ignored and need not be "
"specified."); // TODO(Chengduo): Add checker. (Currently,
// TypedAttrChecker doesn't support vector type.)
AddAttr<bool>(
"globalPooling",
"Whether to use the globalPooling."
"global_pooling",
"Whether to use the global_pooling."
"Bool constant equal to false or true."
"Default false."
"If globalPooling = true, ksize is ignored and need not be specified.")
"If global_pooling = true, ksize is ignored and need not be specified.")
.SetDefault(false);
AddAttr<std::vector<int>>("strides",
"The strides(height, width) of pooling window."
......@@ -146,7 +146,7 @@ Pool3dOpMaker::Pool3dOpMaker(framework::OpProto *proto,
"the number of channels, D, H and W is the depth, height and "
"width of feature.");
AddAttr<std::string>("poolingType",
AddAttr<std::string>("pooling_type",
"PoolingType of pooling operator."
"Str constant equal to 'max' or 'avg'.")
.InEnum({"max", "avg"});
......@@ -154,15 +154,15 @@ Pool3dOpMaker::Pool3dOpMaker(framework::OpProto *proto,
AddAttr<std::vector<int>>(
"ksize",
"The pooling window size(depth, height, width) of pooling operator."
"If globalPooling = true, ksize is ignored and need not be "
"If global_pooling = true, ksize is ignored and need not be "
"specified."); // TODO(Chengduo): Add checker. (Currently,
// TypedAttrChecker doesn't support vector type.)
AddAttr<bool>(
"globalPooling",
"Whether to use the globalPooling."
"global_pooling",
"Whether to use the global_pooling."
"Bool constant equal to false or true."
"Default false."
"If globalPooling = true, ksize is ignored and need not be specified.")
"If global_pooling = true, ksize is ignored and need not be specified.")
.SetDefault(false);
AddAttr<std::vector<int>>("strides",
"Strides(depth, height, width) of pooling operator."
......
......@@ -28,7 +28,6 @@ class PoolOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override;
};
......@@ -36,7 +35,6 @@ class PoolOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override;
};
......@@ -59,11 +57,11 @@ class PoolKernel : public framework::OpKernel<T> {
const Tensor* in_x = context.Input<Tensor>("X");
Tensor* out = context.Output<Tensor>("Out");
std::string pooling_type = context.Attr<std::string>("poolingType");
std::string pooling_type = context.Attr<std::string>("pooling_type");
std::vector<int> ksize = context.Attr<std::vector<int>>("ksize");
std::vector<int> strides = context.Attr<std::vector<int>>("strides");
std::vector<int> paddings = context.Attr<std::vector<int>>("paddings");
if (context.Attr<bool>("globalPooling")) {
if (context.Attr<bool>("global_pooling")) {
for (size_t i = 0; i < ksize.size(); ++i) {
ksize[i] = static_cast<int>(in_x->dims()[i + 2]);
}
......@@ -119,12 +117,12 @@ class PoolGradKernel : public framework::OpKernel<T> {
context.Input<Tensor>(framework::GradVarName("Out"));
Tensor* in_x_grad = context.Output<Tensor>(framework::GradVarName("X"));
std::string pooling_type = context.Attr<std::string>("poolingType");
std::string pooling_type = context.Attr<std::string>("pooling_type");
std::vector<int> ksize = context.Attr<std::vector<int>>("ksize");
std::vector<int> strides = context.Attr<std::vector<int>>("strides");
std::vector<int> paddings = context.Attr<std::vector<int>>("paddings");
if (context.Attr<bool>("globalPooling")) {
if (context.Attr<bool>("global_pooling")) {
for (size_t i = 0; i < ksize.size(); ++i)
ksize[i] = static_cast<int>(in_x->dims()[i + 2]);
}
......
......@@ -27,7 +27,6 @@ class MaxPoolWithIndexOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"X(Input) of Pooling should not be null.");
......@@ -45,7 +44,7 @@ class MaxPoolWithIndexOp : public framework::OperatorWithKernel {
PADDLE_ENFORCE(in_x_dims.size() == 4 || in_x_dims.size() == 5,
"Pooling intput should be 4-D or 5-D tensor.");
if (ctx->Attrs().Get<bool>("globalPooling")) {
if (ctx->Attrs().Get<bool>("global_pooling")) {
ksize.resize(static_cast<size_t>(in_x_dims.size()) - 2);
for (size_t i = 0; i < ksize.size(); ++i)
ksize[i] = static_cast<int>(in_x_dims[i + 2]);
......@@ -72,7 +71,6 @@ class MaxPoolWithIndexOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Mask"), "Input(Mask) must not be null.");
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) must not be null.");
......@@ -108,15 +106,15 @@ class MaxPool2dWithIndexOpMaker : public framework::OpProtoAndCheckerMaker {
AddAttr<std::vector<int>>(
"ksize",
"The pooling window size(height, width) of pooling operator."
"If globalPooling = true, ksize is ignored and need not be "
"If global_pooling = true, ksize is ignored and need not be "
"specified."); // TODO(Chengduo): Add checker. (Currently,
// TypedAttrChecker doesn't support vector type.)
AddAttr<bool>(
"globalPooling",
"Whether to use the globalPooling."
"global_pooling",
"Whether to use the global_pooling."
"Bool constant equal to false or true."
"Default false."
"If globalPooling = true, ksize is ignored and need not be specified.")
"If global_pooling = true, ksize is ignored and need not be specified.")
.SetDefault(false);
AddAttr<std::vector<int>>("strides",
"The strides(height, width) of pooling window."
......@@ -179,15 +177,15 @@ class MaxPool3dWithIndexOpMaker : public framework::OpProtoAndCheckerMaker {
AddAttr<std::vector<int>>(
"ksize",
"The pooling window size(depth, height, width) of pooling operator."
"If globalPooling = true, ksize is ignored and need not be "
"If global_pooling = true, ksize is ignored and need not be "
"specified."); // TODO(Chengduo): Add checker. (Currently,
// TypedAttrChecker doesn't support vector type.)
AddAttr<bool>(
"globalPooling",
"Whether to use the globalPooling."
"global_pooling",
"Whether to use the global_pooling."
"Bool constant equal to false or true."
"Default false."
"If globalPooling = true, ksize is ignored and need not be specified.")
"If global_pooling = true, ksize is ignored and need not be specified.")
.SetDefault(false);
AddAttr<std::vector<int>>(
"strides",
......
......@@ -35,7 +35,7 @@ class MaxPoolWithIndexKernel : public framework::OpKernel<T> {
std::vector<int> ksize = context.Attr<std::vector<int>>("ksize");
std::vector<int> strides = context.Attr<std::vector<int>>("strides");
std::vector<int> paddings = context.Attr<std::vector<int>>("paddings");
if (context.Attr<bool>("globalPooling")) {
if (context.Attr<bool>("global_pooling")) {
for (size_t i = 0; i < ksize.size(); ++i) {
ksize[i] = static_cast<int>(in_x->dims()[i + 2]);
}
......@@ -70,7 +70,7 @@ class MaxPoolWithIndexGradKernel : public framework::OpKernel<T> {
std::vector<int> ksize = context.Attr<std::vector<int>>("ksize");
std::vector<int> strides = context.Attr<std::vector<int>>("strides");
std::vector<int> paddings = context.Attr<std::vector<int>>("paddings");
if (context.Attr<bool>("globalPooling")) {
if (context.Attr<bool>("global_pooling")) {
for (size_t i = 0; i < ksize.size(); ++i) {
ksize[i] = static_cast<int>(in_x_grad->dims()[i + 2]);
}
......
......@@ -25,7 +25,6 @@ class PReluOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput("Alpha"), "Input(Alpha) should not be null");
......@@ -62,7 +61,6 @@ class PReluGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) must not be null.");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -24,7 +24,6 @@ class RankLossOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
// input check
PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) shouldn't be null");
......@@ -89,7 +88,6 @@ class RankLossGradOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Label"), "Input(Label) shouldn't be null.");
PADDLE_ENFORCE(ctx->HasInput("Left"), "Input(Left) shouldn't be null.");
......
......@@ -23,7 +23,6 @@ class ReduceOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of ReduceOp should not be null.");
......@@ -57,7 +56,6 @@ class ReduceGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null.");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -25,7 +25,6 @@ class ReshapeOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
// input check
PADDLE_ENFORCE(ctx->HasInput("X"),
......@@ -93,7 +92,6 @@ class ReshapeGradOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) shouldn't be null.");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -21,7 +21,6 @@ class RmspropOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of RmspropOp should not be null.");
......
......@@ -25,7 +25,6 @@ class ScaleOp : public framework::OperatorWithKernel {
const framework::AttributeMap &attrs)
: OperatorWithKernel(type, inputs, outputs, attrs) {}
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of ScaleOp should not be null.");
......@@ -56,7 +55,6 @@ class ScaleGradMaker : public framework::SingleGradOpDescMaker {
public:
using framework::SingleGradOpDescMaker::SingleGradOpDescMaker;
protected:
std::unique_ptr<framework::OpDescBind> Apply() const override {
auto *grad_op = new framework::OpDescBind();
grad_op->SetType("scale");
......
......@@ -22,7 +22,6 @@ class ScatterOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Ref"),
"Input(Ref) of ScatterOp should not be null.");
......@@ -49,6 +48,7 @@ class ScatterOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", ref_dims);
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("Ref")->type());
......@@ -59,13 +59,13 @@ class ScatterGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
ctx->SetOutputDim(framework::GradVarName("Updates"),
ctx->GetInputDim("Updates"));
ctx->SetOutputDim(framework::GradVarName("Ref"), ctx->GetInputDim("Ref"));
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("Ref")->type());
......
......@@ -21,7 +21,6 @@ class SequenceConcatOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInputs("X"),
"Inputs(X) of SequenceConcatOp should not be null.");
......@@ -105,7 +104,6 @@ class SequenceConcatGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
"The gradient of Out should not be null.");
......
......@@ -21,7 +21,6 @@ class SequencePoolOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of SequencePoolOp should not be null.");
......@@ -72,7 +71,6 @@ class SequencePoolGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
"Gradient of Out should not be null.");
......
......@@ -111,7 +111,8 @@ class SequencePoolGradKernel : public framework::OpKernel<T> {
in_g->mutable_data<T>(context.GetPlace());
if (strategy == LAST || strategy == FIRST) {
// set X@Grad be zero at first when strategy is LAST/FIRST
math::SetConstant<Place, T>(context.device_context(), in_g, 0);
math::SetConstant<Place, T> functor;
functor(context.device_context(), in_g, 0);
}
auto place = context.GetEigenDevice<Place>();
for (int i = 0; i < static_cast<int>(lod.size()) - 1; ++i) {
......
......@@ -21,7 +21,6 @@ class SequenceSoftmaxOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of SequenceSoftmaxOp should not be null.");
......@@ -66,7 +65,6 @@ class SequenceSoftmaxGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Out"),
"Input(Out) of SequenceSoftmaxGradOp should not be null.");
......
......@@ -21,7 +21,6 @@ class SGDOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Param"),
"Input(Param) of SGDOp should not be null.");
......
......@@ -23,7 +23,6 @@ class SigmoidCrossEntropyWithLogitsOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
PADDLE_ENFORCE(ctx->HasInput("Labels"),
......@@ -52,7 +51,6 @@ class SigmoidCrossEntropyWithLogitsGradOp
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should be not null.");
PADDLE_ENFORCE(ctx->HasInput("Labels"),
......
......@@ -21,7 +21,6 @@ class SmoothL1LossOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "X must be initialized.");
PADDLE_ENFORCE(ctx->HasInput("Y"), "Y must be initialized.");
......@@ -93,7 +92,6 @@ class SmoothL1LossGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
auto in_dims = ctx->GetInputDim("X");
auto out_dims = ctx->GetInputDim(framework::GradVarName("Out"));
......
......@@ -21,7 +21,6 @@ class SoftmaxOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of SoftmaxOp should not be null.");
......@@ -68,7 +67,6 @@ class SoftmaxOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Y"), "Input(Y) should be not null.");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Y")),
......
......@@ -46,7 +46,7 @@ class SoftmaxWithCrossEntropyOpMaker
"(Tensor, default: Tensor<float>), A 2-D tensor. The cross "
"entropy loss with shape [N x 1].");
AddAttr<bool>(
"softLabel",
"soft_label",
"(bool, default: false), A flag to indicate whether to interpretate "
"the given labels as soft labels.")
.SetDefault(false);
......@@ -82,7 +82,6 @@ class SoftmaxWithCrossEntropyOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("Logits"),
"Input(Logits) should be not null.");
......@@ -100,13 +99,13 @@ class SoftmaxWithCrossEntropyOp : public framework::OperatorWithKernel {
PADDLE_ENFORCE_EQ(labels_dims.size(), 2UL,
"The labels should be a 2-D tensor.");
if (ctx->Attrs().Get<bool>("softLabel")) {
if (ctx->Attrs().Get<bool>("soft_label")) {
PADDLE_ENFORCE_EQ(logits_dims[1], labels_dims[1],
"If Attr(softLabel) == true, the 2nd dimension of "
"If Attr(soft_label) == true, the 2nd dimension of "
"Input(X) and Input(Label) should be equal.");
} else {
PADDLE_ENFORCE_EQ(labels_dims[1], 1UL,
"If Attr(softLabel) == false, the 2nd dimension of "
"If Attr(soft_label) == false, the 2nd dimension of "
"Input(Label) should be 1.");
}
......@@ -117,6 +116,7 @@ class SoftmaxWithCrossEntropyOp : public framework::OperatorWithKernel {
ctx->ShareLoD("Logits", /*->*/ "Loss");
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(ctx.Input<Tensor>("Logits")->type());
......@@ -127,7 +127,6 @@ class SoftmaxWithCrossEntropyOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Loss")),
"Input(Loss@Grad) should not be null.");
......@@ -142,13 +141,13 @@ class SoftmaxWithCrossEntropyOpGrad : public framework::OperatorWithKernel {
PADDLE_ENFORCE_EQ(labels_dims.size(), 2UL,
"The labels should be a 2-D tensor.");
if (ctx->Attrs().Get<bool>("softLabel")) {
if (ctx->Attrs().Get<bool>("soft_label")) {
PADDLE_ENFORCE_EQ(softmax_dims[1], labels_dims[1],
"When Attr(softLabel) == true, the 2nd dimension of "
"When Attr(soft_label) == true, the 2nd dimension of "
"Input(X) and Input(Label) should be equal.");
} else {
PADDLE_ENFORCE_EQ(labels_dims[1], 1UL,
"When Attr(softLabel) == false, the 2nd dimension of "
"When Attr(soft_label) == false, the 2nd dimension of "
"Input(Label) should be 1.");
}
......@@ -156,6 +155,7 @@ class SoftmaxWithCrossEntropyOpGrad : public framework::OperatorWithKernel {
ctx->GetInputDim("Softmax"));
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return framework::ToDataType(
......
......@@ -70,7 +70,7 @@ class SoftmaxWithCrossEntropyCUDAKernel : public framework::OpKernel<T> {
logits, softmax);
math::CrossEntropyFunctor<platform::GPUPlace, T>()(
context.device_context(), loss, softmax, labels,
context.Attr<bool>("softLabel"));
context.Attr<bool>("soft_label"));
}
};
......@@ -93,7 +93,7 @@ class SoftmaxWithCrossEntropyGradCUDAKernel : public framework::OpKernel<T> {
int block = 512;
int grid = (batch_size * class_num + block - 1) / block;
if (context.Attr<bool>("softLabel")) {
if (context.Attr<bool>("soft_label")) {
const T* label_data = labels->data<T>();
SoftCrossEntropyGradientKernel<T><<<
grid, block, 0, reinterpret_cast<const platform::CUDADeviceContext&>(
......
......@@ -44,7 +44,7 @@ class SoftmaxWithCrossEntropyKernel : public framework::OpKernel<T> {
logits, softmax);
math::CrossEntropyFunctor<platform::CPUPlace, T>()(
context.device_context(), loss, softmax, labels,
context.Attr<bool>("softLabel"));
context.Attr<bool>("soft_label"));
}
};
......@@ -60,7 +60,7 @@ class SoftmaxWithCrossEntropyGradKernel : public framework::OpKernel<T> {
logit_grad->ShareDataWith<T>(*context.Input<Tensor>("Softmax"));
const int class_num = logit_grad->dims()[1];
if (context.Attr<bool>("softLabel")) {
if (context.Attr<bool>("soft_label")) {
auto out_grad_mat = EigenMatrix<T>::From(*out_grad);
auto logit_grad_mat = EigenMatrix<T>::From(*logit_grad);
auto lbl_mat = EigenMatrix<T>::From(*labels);
......
......@@ -23,7 +23,6 @@ class SplitOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of SplitOp should not be null.");
......
......@@ -21,7 +21,6 @@ class SquaredL2DistanceOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of SquaredL2DistanceOp should not be null.");
......@@ -85,7 +84,6 @@ class SquaredL2DistanceGradOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
"Gradient of Out should not be null");
......
......@@ -21,7 +21,6 @@ class SumOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInputs("X"), "Inputs(X) should not be null");
auto x_dims = ctx->GetInputsDim("X");
......
......@@ -21,7 +21,6 @@ class TopkOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext *ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"),
"Input(X) of TopkOp should not be null.");
......
......@@ -23,7 +23,6 @@ class TransposeOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasOutput("Out"), "Output(Out) should not be null");
......@@ -92,7 +91,6 @@ class TransposeOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) should not be null");
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
......
......@@ -46,7 +46,6 @@ class UniformRandomOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;
protected:
void InferShape(framework::InferShapeContext* ctx) const override {
PADDLE_ENFORCE(ctx->HasOutput("Out"),
"Output(Out) of UniformRandomOp should not be null.");
......@@ -63,6 +62,7 @@ class UniformRandomOp : public framework::OperatorWithKernel {
ctx->SetOutputDim("Out", framework::make_ddim(temp));
}
protected:
framework::DataType IndicateDataType(
const framework::ExecutionContext& ctx) const override {
return static_cast<framework::DataType>(Attr<int>("data_type"));
......
......@@ -16,7 +16,9 @@ limitations under the License. */
#include "paddle/framework/backward.h"
#include "paddle/framework/executor.h"
#include "paddle/framework/feed_fetch_method.h"
#include "paddle/framework/lod_tensor.h"
#include "paddle/framework/selected_rows.h"
#include "paddle/framework/tensor_array.h"
#include "paddle/operators/cond_op.h"
#include "paddle/operators/dynamic_recurrent_op.h"
......@@ -138,6 +140,32 @@ PYBIND11_PLUGIN(core) {
#endif
});
py::class_<SelectedRows>(m, "SelectedRows")
.def("__init__",
[](SelectedRows &instance) { new (&instance) SelectedRows(); })
.def("__init__",
[](SelectedRows &instance, const std::vector<int64_t> rows,
const int64_t &height) {
new (&instance) SelectedRows(rows, height);
})
.def("get_tensor",
[](SelectedRows &self) { return self.mutable_value(); },
py::return_value_policy::reference)
.def("set_height", &SelectedRows::set_height)
.def("height", &SelectedRows::height)
.def("set_rows", &SelectedRows::set_rows)
.def("rows", [](SelectedRows &self) {
#ifndef PADDLE_WITH_CUDA
return self.rows();
#else
auto rows = self.rows();
std::vector<int64_t> new_rows;
new_rows.reserve(rows.size());
std::copy(rows.begin(), rows.end(), std::back_inserter(new_rows));
return new_rows;
#endif
});
py::class_<Variable>(m, "Variable", R"DOC(Variable Class.
All parameter, weight, gradient are variables in Paddle.
......@@ -403,6 +431,10 @@ All parameter, weight, gradient are variables in Paddle.
m.def("unique_integer", UniqueIntegerGenerator);
m.def("is_compile_gpu", IsCompileGPU);
m.def("set_feed_variable_float", framework::SetFeedVariable<float>);
m.def("set_feed_variable_double", framework::SetFeedVariable<double>);
m.def("set_feed_variable_int", framework::SetFeedVariable<int>);
m.def("get_fetch_variable", framework::GetFetchVariable);
BindProgramDesc(m);
BindBlockDesc(m);
......
......@@ -153,7 +153,8 @@ class OpProtoHolder(object):
self.op_proto_map[proto.type] = proto
def get_op_proto(self, type):
assert type in self.op_proto_map, "Operator \"%s\" has not been registered." % type
if type not in self.op_proto_map:
raise ValueError("Operator \"%s\" has not been registered." % type)
return self.op_proto_map[type]
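The hunk above replaces an `assert` with an explicit `ValueError`. The sketch below illustrates one likely motivation: an `assert` is skipped when Python runs with `-O`, while a raised exception always fires and can be caught by type. `ProtoRegistry` and `lookup_error` are hypothetical stand-ins written for this illustration (Python 3) and are not part of Paddle.

```python
class ProtoRegistry(object):
    """Hypothetical stand-in for OpProtoHolder: maps op type -> proto."""

    def __init__(self):
        self.op_proto_map = {}

    def register(self, op_type, proto):
        self.op_proto_map[op_type] = proto

    def get_op_proto(self, op_type):
        # An explicit raise survives `python -O`; an assert would not.
        if op_type not in self.op_proto_map:
            raise ValueError(
                'Operator "%s" has not been registered.' % op_type)
        return self.op_proto_map[op_type]


def lookup_error(registry, op_type):
    """Return the error message for an unknown op, or None if registered."""
    try:
        registry.get_op_proto(op_type)
    except ValueError as err:
        return str(err)
    return None
```

Callers can now catch `ValueError` specifically, which is what the updated `TestOperator` case in this commit relies on.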
......@@ -374,10 +375,10 @@ class Program(object):
cls._instance = cls()
return cls._instance
def __init__(self):
assert not hasattr(self.__class__,
'_instance'), 'Do not call constructor directly!'
self.desc = core.ProgramDesc.instance()
def __init__(self, desc=None):
if desc is None:
desc = core.ProgramDesc.instance()
self.desc = desc
self.blocks = [Block(self, 0)]
self.current_block_idx = 0
......@@ -428,7 +429,6 @@ class Parameter(Variable):
if each < 0:
raise ValueError("Parameter shape should not be related with "
"batch-size")
Variable.__init__(self, block, shape=shape, dtype=dtype, **kwargs)
self.trainable = kwargs.get('trainable', True)
self.init_attr = kwargs.get('initialize_attr', {
......@@ -441,7 +441,7 @@ class Parameter(Variable):
self._append_initialize_ops_()
def _append_initialize_ops_(self):
attr = copy.deepcopy(self.init_attr)
attr = self.init_attr
op_type = attr.pop('type', None)
block = self.block
assert isinstance(block, Block)
......
from paddle.v2.framework.framework import Variable, OpProtoHolder, g_program
import paddle.v2.framework.core as core
import copy
import itertools


def unique_name(prefix):
    uid = core.unique_integer()  # unique during whole process.
    return "_".join([prefix, str(uid)])


class LayerHelper(object):
    def __init__(self, layer_type, **kwargs):
        self.kwargs = kwargs
        self.layer_type = layer_type
        name = self.kwargs.get('name', None)
        if name is None:
            self.kwargs['name'] = unique_name(self.layer_type)

    @property
    def name(self):
        return self.kwargs['name']

    @property
    def program(self):
        prog = self.kwargs.get('program', None)
        if prog is None:
            return g_program
        else:
            return prog

    def append_op(self, *args, **kwargs):
        return self.program.current_block().append_op(*args, **kwargs)

    def multiple_input(self, input_param_name='input'):
        inputs = self.kwargs.get(input_param_name, [])
        type_error = TypeError(
            "Input of {0} layer should be Variable or sequence of Variable".
            format(self.layer_type))
        if isinstance(inputs, Variable):
            inputs = [inputs]
        elif not isinstance(inputs, list) and not isinstance(inputs, tuple):
            raise type_error
        else:
            for each in inputs:
                if not isinstance(each, Variable):
                    raise type_error
        return inputs

    def input(self, input_param_name='input'):
        inputs = self.multiple_input(input_param_name)
        if len(inputs) != 1:
            # Raise a real exception; raising a bare string is itself an error.
            raise ValueError("{0} layer only takes one input".format(
                self.layer_type))
        return inputs[0]

    @property
    def param_attr(self):
        default = {
            'name': None,
            'init_attr': {
                'type': 'uniform_random',
                'min': -1.0,
                'max': 1.0
            }
        }
        actual = self.kwargs.get('param_attr', None)
        return actual if actual is not None else default

    def bias_attr(self, size, dtype):
        bias_attr = self.kwargs.get('bias_attr', False)
        if bias_attr is None or bias_attr:
            bias_attr = {
                'name': None,
                'init_attr': {
                    'type': 'fill_constant',
                    'value': 0.0,
                    'shape': [size],
                    'dataType': dtype
                }
            }
        return bias_attr

    def multiple_param_attr(self, length):
        param_attr = self.param_attr
        if isinstance(param_attr, dict):
            param_attr = [param_attr]

        if len(param_attr) != 1 and len(param_attr) != length:
            raise ValueError("parameter number mismatch")
        elif len(param_attr) == 1 and length != 1:
            tmp = [None] * length
            for i in xrange(length):
                tmp[i] = copy.deepcopy(param_attr[0])
            param_attr = tmp
        return param_attr

    def iter_inputs_and_params(self, input_param_name='input'):
        inputs = self.multiple_input(input_param_name)
        param_attrs = self.multiple_param_attr(len(inputs))
        for ipt, param_attr in itertools.izip(inputs, param_attrs):
            yield ipt, param_attr

    def input_dtype(self, input_param_name='input'):
        inputs = self.multiple_input(input_param_name)
        dtype = None
        for each in inputs:
            if dtype is None:
                dtype = each.data_type
            elif dtype != each.data_type:
                raise ValueError("Data Type mismatch")
        return dtype

    def create_parameter(self, attr, shape, dtype, suffix='w'):
        if attr['name'] is None:
            attr['name'] = unique_name(".".join([self.name, suffix]))
        return self.program.global_block().create_parameter(
            name=attr['name'],
            dtype=dtype,
            shape=shape,
            initialize_attr=attr['init_attr'])

    def create_tmp_variable(self, dtype):
        return self.program.current_block().create_var(
            name=unique_name(".".join([self.name, 'tmp'])), dtype=dtype)

    def create_global_variable(self, *args, **kwargs):
        return self.program.global_block().create_var(*args, **kwargs)

    def append_bias_op(self, input_var):
        bias_attr = self.bias_attr(
            self.kwargs['size'], dtype=input_var.data_type)
        if not bias_attr:
            return input_var
        b = self.create_parameter(
            attr=bias_attr,
            shape=[self.kwargs['size']],
            dtype=input_var.data_type,
            suffix='b')
        tmp = self.create_tmp_variable(dtype=input_var.data_type)
        self.append_op(
            type='elementwise_add',
            inputs={'X': [input_var],
                    'Y': [b]},
            outputs={'Out': [tmp]})
        return tmp

    def append_activation(self, input_var):
        act = self.kwargs.get('act', None)
        if act is None:
            return input_var
        if isinstance(act, basestring):
            act = {'type': act}
        tmp = self.create_tmp_variable(dtype=input_var.data_type)
        act_type = act.pop('type')
        self.append_op(
            type=act_type,
            inputs={"X": [input_var]},
            outputs={"Y": [tmp]},
            attrs=act)
        return tmp
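`multiple_param_attr` expands a single attribute dict into one copy per input. The sketch below restates that logic as a standalone Python 3 function so the behaviour can be tried without Paddle. `broadcast_param_attr` is a hypothetical name chosen for this illustration. The deep copy matters because `create_parameter` writes a generated name into `attr['name']`, so a shared dict would give every input's weight the same name.

```python
import copy


def broadcast_param_attr(param_attr, length):
    """Return a list of `length` attribute dicts, one per input."""
    if isinstance(param_attr, dict):
        param_attr = [param_attr]
    if len(param_attr) != 1 and len(param_attr) != length:
        raise ValueError("parameter number mismatch")
    if len(param_attr) == 1 and length != 1:
        # Deep-copy so each input owns an independent (nested) dict.
        param_attr = [copy.deepcopy(param_attr[0]) for _ in range(length)]
    return list(param_attr)
```

For example, broadcasting one dict to three inputs and then setting `attrs[0]['name']` leaves `attrs[1]['name']` untouched.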
from paddle.v2.framework.layer_helper import LayerHelper
import paddle.v2.framework.core as core
from paddle.v2.framework.framework import OpProtoHolder, Variable
import re

__all__ = ['fc_layer', 'data_layer', 'cross_entropy']


def fc_layer(input,
             size,
             param_attr=None,
             bias_attr=True,
             name=None,
             act=None,
             num_flatten_dims=1,
             program=None):
    # create helper
    helper = LayerHelper('fc', **locals())

    dtype = helper.input_dtype()

    # mul
    mul_results = []
    for input_var, param_attr in helper.iter_inputs_and_params():
        input_shape = input_var.shape
        param_shape = list(input_shape[num_flatten_dims:]) + [size]
        w = helper.create_parameter(
            attr=param_attr, shape=param_shape, dtype=dtype)
        tmp = helper.create_tmp_variable(dtype)
        helper.append_op(
            type="mul",
            inputs={
                "X": input_var,
                "Y": w,
            },
            outputs={"Out": tmp},
            attrs={'x_num_col_dims': num_flatten_dims})
        mul_results.append(tmp)

    # sum
    if len(mul_results) == 1:
        pre_bias = mul_results[0]
    else:
        pre_bias = helper.create_tmp_variable(dtype)
        helper.append_op(
            type="sum", inputs={"X": mul_results}, outputs={"Out": pre_bias})
    # add bias
    pre_activation = helper.append_bias_op(pre_bias)
    # add activation
    return helper.append_activation(pre_activation)


def data_layer(name,
               shape,
               data_type='float32',
               type=core.VarDesc.VarType.LOD_TENSOR,
               program=None):
    helper = LayerHelper('data', **locals())
    shape = [-1] + shape  # prepend batch size as -1
    return helper.create_global_variable(
        name=name, shape=shape, dtype=data_type, type=type)


def _convert_(name):
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()


def _create_op_func_(op_type):
    op_proto = OpProtoHolder.instance().get_op_proto(op_type)
    if len(op_proto.outputs) != 1:
        raise ValueError(
            "Only one output operator can be automatically generated")

    if op_proto.outputs[0].duplicable:
        raise ValueError(
            "Only not duplicable op can be automatically generated")

    o_name = op_proto.outputs[0].name

    def func(**kwargs):
        helper = LayerHelper(op_type, **kwargs)
        inputs = dict()
        dtype = None
        for ipt in op_proto.inputs:
            name = _convert_(ipt.name)
            val = kwargs.pop(name, [])
            if not isinstance(val, list) and not isinstance(val, tuple):
                val = [val]
            for each in val:
                if not isinstance(each, Variable):
                    raise ValueError("input of {0} must be variable".format(
                        op_type))

                if dtype is None:
                    dtype = each.data_type
                elif dtype != each.data_type:
                    raise ValueError(
                        "operator {0} must input same dtype".format(op_type))
            inputs[ipt.name] = val

        out = helper.create_tmp_variable(dtype=dtype)
        helper.append_op(
            type=op_type, inputs=inputs, outputs={o_name: [out]}, attrs=kwargs)
        return out

    func.__name__ = op_type
    globals()[op_type] = func
    global __all__
    __all__.append(op_type)


_create_op_func_('mean')


def cross_entropy(input, label, **kwargs):
    helper = LayerHelper('cross_entropy', **kwargs)
    out = helper.create_tmp_variable(dtype=input.data_type)
    helper.append_op(
        type='cross_entropy',
        inputs={'X': [input],
                'Label': [label]},
        outputs={'Y': [out]},
        attrs=kwargs)
    return out


def square_error_cost(input, label, **kwargs):
    helper = LayerHelper('square_error_cost', **kwargs)
    minus_out = helper.create_tmp_variable(dtype=input.data_type)
    helper.append_op(
        type='elementwise_sub',
        inputs={'X': [input],
                'Y': [label]},
        outputs={'Out': [minus_out]})

    square_out = helper.create_tmp_variable(dtype=input.data_type)
    helper.append_op(
        type='pow',
        inputs={'X': [minus_out]},
        outputs={'Y': [square_out]},
        attrs={'factor': 2.0})
    return square_out
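The `_convert_` helper in this file maps CamelCase operator input names to snake_case keyword names. It can be tried on its own, as in the Python 3 sketch below. The function body is copied from the file, and `camel_to_snake` is only a readable name chosen for this illustration. The sample names are taken from attribute and input names that appear elsewhere in this commit.

```python
import re


def camel_to_snake(name):
    # Step 1: put "_" before a capital that starts a lowercase run ("softLabel").
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    # Step 2: split any remaining lowercase/digit-to-capital boundary, then lowercase.
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()


if __name__ == '__main__':
    for sample in ['X', 'Logits', 'softLabel', 'globalPooling']:
        print(sample, '->', camel_to_snake(sample))
```

A name with a single leading capital, such as `Logits`, is only lowercased, because the first pattern needs a character before the capital letter.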
......@@ -49,7 +49,7 @@ class TestCrossEntropyOp2(OpTest):
self.inputs = {"X": X, "Label": label}
self.outputs = {"Y": cross_entropy}
self.attrs = {"softLabel": True}
self.attrs = {"soft_label": True}
def test_check_output(self):
self.check_output()
......@@ -82,7 +82,7 @@ class TestCrossEntropyOp3(OpTest):
self.inputs = {"X": X, "Label": label.astype(np.float32)}
self.outputs = {"Y": cross_entropy}
self.attrs = {"softLabel": True}
self.attrs = {"soft_label": True}
def test_check_output(self):
self.check_output()
......
import paddle.v2.framework.core as core
import unittest
import numpy as np


class TestFeedFetch(unittest.TestCase):
    def test_feed_fetch(self):
        place = core.CPUPlace()

        input_array = np.ones((4, 4, 6)).astype("float32")
        input_array[0, 0, 0] = 3
        input_array[3, 3, 5] = 10
        input_tensor = core.LoDTensor([[0, 2, 4]])
        input_tensor.set(input_array, place)

        core.set_feed_variable_float(input_tensor, "feed", 0)

        output_tensor = core.get_fetch_variable("feed", 0)

        output_lod = output_tensor.lod()
        self.assertEqual(0, output_lod[0][0])
        self.assertEqual(2, output_lod[0][1])
        self.assertEqual(4, output_lod[0][2])

        output_array = np.array(output_tensor)
        self.assertEqual(3, output_array[0, 0, 0])
        self.assertEqual(10, output_array[3, 3, 5])


if __name__ == "__main__":
    unittest.main()
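The test above attaches the LoD `[[0, 2, 4]]` to a tensor whose first dimension is 4. The Python 3 sketch below assumes the offset-based LoD convention, in which consecutive offsets bound one sequence each, so `[0, 2, 4]` describes two sequences covering rows 0-1 and rows 2-3. `split_by_lod` is a hypothetical helper written for this illustration and is not a Paddle API.

```python
def split_by_lod(rows, offsets):
    """Split `rows` into sequences; offsets[i]:offsets[i + 1] bounds sequence i."""
    if not offsets or offsets[0] != 0 or offsets[-1] != len(rows):
        raise ValueError("offsets must start at 0 and end at len(rows)")
    return [rows[begin:end] for begin, end in zip(offsets[:-1], offsets[1:])]


if __name__ == '__main__':
    # Four rows grouped into two sequences of two rows each.
    print(split_by_lod(['r0', 'r1', 'r2', 'r3'], [0, 2, 4]))
```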
from paddle.v2.framework.layers import fc_layer, data_layer, cross_entropy, mean, square_error_cost
from paddle.v2.framework.framework import Program, g_program
import paddle.v2.framework.core as core
import unittest


class TestBook(unittest.TestCase):
    def test_fit_a_line(self):
        pd = core.ProgramDesc.__create_program_desc__()
        program = Program(desc=pd)
        x = data_layer(
            name='x', shape=[13], data_type='float32', program=program)
        y_predict = fc_layer(input=x, size=1, act=None, program=program)

        y = data_layer(
            name='y', shape=[1], data_type='float32', program=program)
        cost = square_error_cost(input=y_predict, label=y, program=program)

        avg_cost = mean(x=cost, program=program)
        self.assertIsNotNone(avg_cost)
        print str(program)

    def test_recognize_digits_mlp(self):
        pd = core.ProgramDesc.__create_program_desc__()
        program = Program(desc=pd)

        # Pass `program` explicitly so the layers below do not fall back to
        # `g_program`.
        images = data_layer(
            name='pixel', shape=[784], data_type='float32', program=program)
        label = data_layer(
            name='label', shape=[1], data_type='int32', program=program)
        hidden1 = fc_layer(input=images, size=128, act='relu', program=program)
        hidden2 = fc_layer(input=hidden1, size=64, act='relu', program=program)
        predict = fc_layer(
            input=hidden2, size=10, act='softmax', program=program)
        cost = cross_entropy(input=predict, label=label, program=program)
        avg_cost = mean(x=cost, program=program)
        self.assertIsNotNone(avg_cost)
        print str(program)


if __name__ == '__main__':
    unittest.main()
......@@ -16,7 +16,7 @@ class TestOperator(unittest.TestCase):
try:
block.append_op(type="no_such_op")
self.assertFail()
except AssertionError as a_err:
except ValueError as a_err:
self.assertEqual(a_err.message,
"Operator \"no_such_op\" has not been registered.")
......
......@@ -56,8 +56,8 @@ class TestPool2d_Op(OpTest):
'strides': self.strides,
'paddings': self.paddings,
'ksize': self.ksize,
'poolingType': self.pool_type,
'globalPooling': self.global_pool,
'pooling_type': self.pool_type,
'global_pooling': self.global_pool,
}
self.outputs = {'Out': output}
......
......@@ -64,8 +64,8 @@ class TestPool3d_Op(OpTest):
'strides': self.strides,
'paddings': self.paddings,
'ksize': self.ksize,
'poolingType': self.pool_type,
'globalPooling': self.global_pool,
'pooling_type': self.pool_type,
'global_pooling': self.global_pool,
}
self.outputs = {'Out': output}
......
......@@ -86,7 +86,7 @@ class TestMaxPoolWithIndex_Op(OpTest):
'strides': self.strides,
'paddings': self.paddings,
'ksize': self.ksize,
'globalPooling': self.global_pool,
'global_pooling': self.global_pool,
}
self.inputs = {'X': input}
......
import paddle.v2.framework.core as core
import unittest
import numpy as np


class TestSelectedRows(unittest.TestCase):
    def test_selected_rows(self):
        place = core.CPUPlace()
        height = 10
        rows = [0, 4, 7]
        row_numel = 10
        # The binding's constructor takes (rows, height); row_numel is the
        # width of each stored row.
        selected_rows = core.SelectedRows(rows, height)
        np_array = np.ones((len(rows), row_numel)).astype("float32")
        np_array[0, 0] = 2.0
        np_array[2, 8] = 4.0
        tensor = selected_rows.get_tensor()
        tensor.set(np_array, place)

        # compare rows
        self.assertEqual(0, selected_rows.rows()[0])
        self.assertEqual(4, selected_rows.rows()[1])
        self.assertEqual(7, selected_rows.rows()[2])

        # compare height
        self.assertEqual(10, selected_rows.height())

        # compare tensor
        self.assertAlmostEqual(2.0,
                               selected_rows.get_tensor().get_float_element(0))
        self.assertAlmostEqual(1.0,
                               selected_rows.get_tensor().get_float_element(1))
        self.assertAlmostEqual(
            4.0,
            selected_rows.get_tensor().get_float_element(2 * row_numel + 8))


if __name__ == "__main__":
    unittest.main()
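The test above exercises three pieces of a `SelectedRows`: the row indices, the height, and a value tensor holding one stored row per index. The pure-Python sketch below (Python 3) models that layout under the assumption that `value[i]` is the content of dense row `rows[i]` and that every other row of the `height`-row matrix is zero. `SelectedRowsSketch` and `to_dense` are hypothetical names for this illustration and do not mirror the C++ class.

```python
class SelectedRowsSketch(object):
    """Hypothetical model: value[i] holds the content of dense row rows[i]."""

    def __init__(self, rows, height, value):
        if len(rows) != len(value):
            raise ValueError("need one value row per selected row index")
        self.rows = list(rows)
        self.height = height
        self.value = [list(row) for row in value]

    def to_dense(self):
        """Expand to a height x width list of lists; unselected rows are zero."""
        width = len(self.value[0]) if self.value else 0
        dense = [[0.0] * width for _ in range(self.height)]
        for index, row in zip(self.rows, self.value):
            dense[index] = list(row)
        return dense
```

With `rows = [0, 4, 7]` and `height = 10`, only three of the ten dense rows are stored, which is why the test indexes the value tensor with `2 * row_numel + 8` to reach the third stored row.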
......@@ -57,7 +57,7 @@ class TestSoftmaxWithCrossEntropyOp2(OpTest):
self.inputs = {"Logits": logits, "Label": labels}
self.outputs = {"Softmax": softmax, "Loss": cross_entropy}
self.attrs = {"softLabel": True}
self.attrs = {"soft_label": True}
def test_check_output(self):
self.check_output()
......