& no_grad_vars);
-```
-
-The implementation behind it can be divided into two parts, **Backward Operator Creating** and **Backward Network Building**.
-
-### Backward Operator Registry
-
-A backward network is built up from several backward operators. A backward operator takes its forward operator's inputs, outputs, and output gradients, and then calculates the corresponding input gradients.
-
-| | forward operator | backward operator |
-| ---------------------- | ---------------- | ------------------------- |
-| **Operator::inputs_** | Inputs | Inputs, Outputs, OutputGradients |
-| **Operator::outputs_** | Outputs | InputGradients |
-
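For instance, for a `mul` operator with forward computation `Out = X * Y`, the backward operator receives `X`, `Y`, `Out`, and `Out@GRAD`, and produces `X@GRAD` and `Y@GRAD`. A minimal scalar sketch of this relationship (a simplified illustration, not the real kernel):

```cpp
#include <cassert>

// Hypothetical scalar sketch of a mul backward operator.
// Forward: out = x * y.
// Backward: takes x, y, out, and d_out (the OutputGradient) and
// produces d_x, d_y (the InputGradients) via the chain rule.
struct MulGradResult {
  double d_x;
  double d_y;
};

MulGradResult MulGrad(double x, double y, double /*out*/, double d_out) {
  // d(out)/d(x) = y and d(out)/d(y) = x, scaled by the incoming gradient.
  return {d_out * y, d_out * x};
}
```

Note that `out` itself is unused here; it is still part of the backward operator's inputs because many operators (e.g. activations) do need their forward outputs to compute gradients.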
-In most cases, there is a one-to-one relation between a forward operator and its backward operator. These relations are recorded in a global hash map (`OpInfoMap`). To follow the philosophy of a minimal core and to keep operators pluggable, a registry mechanism is introduced.
-
-For example, we have `mul_op`, and we can register its information and corresponding backward operator by the following macro:
-
-```cpp
-REGISTER_OP(mul, MulOp, MulOpMaker, mul_grad, MulOpGrad);
-```
-
-`mul` is the operator's type. `MulOp` and `MulOpMaker` are the operator class and the operator maker class respectively.
-
-`mul_grad` is the type of backward operator, and `MulOpGrad` is its class name.
-
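A minimal sketch of what such a registration might record (a simplified stand-in: the real `REGISTER_OP` also registers the operator class, the maker class, and the gradient class, not just the type mapping):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical simplified OpInfoMap: maps an op type to the type of its
// gradient op, mimicking the pairing REGISTER_OP records.
struct OpInfo {
  std::string grad_op_type;
};

std::map<std::string, OpInfo>& OpInfoMap() {
  static std::map<std::string, OpInfo> instance;
  return instance;
}

// A stripped-down registration macro: runs at static-initialization time
// via an unused global int, the same trick real registries use.
#define REGISTER_OP_SIMPLE(op_type, grad_op_type)  \
  static int reg_op_##op_type = [] {               \
    OpInfoMap()[#op_type] = OpInfo{#grad_op_type}; \
    return 0;                                      \
  }()

REGISTER_OP_SIMPLE(mul, mul_grad);
```

Because registration happens during static initialization, any later lookup of `"mul"` in the map finds `"mul_grad"` without the core framework knowing about either operator at compile time.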
-### Backward Operator Creating
-
-Given a certain forward operator, we can get its corresponding backward operator by calling:
-
-```cpp
-OperatorBase* BuildGradOp(const OperatorBase* fwd_op);
-```
-
-The function `BuildGradOp` performs the following steps in order:
-
-1. Get the `type_` of the given forward operator, and look up the corresponding backward operator's type in the `OpInfoMap`.
-
-2. Build two maps named `inputs` and `outputs` to temporarily store the backward operator's inputs and outputs. Copy the forward operator's `inputs_` and `outputs_` into `inputs`, except those that are not necessary for gradient computation.
-
-3. Add the gradient variables of the forward inputs into `outputs`, and the gradient variables of the forward outputs into `inputs`.
-
-4. Build the backward operator with `inputs`, `outputs`, and the forward operator's attributes.
-
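The four steps above can be sketched with plain `std::map`s. Everything below is a simplified stand-in for the real framework types: `SimpleOp`, the fixed `op_info_map` table, and the `GradVarName` helper (which appends the `@GRAD` suffix) are hypothetical, and the "unnecessary variable" filtering and attribute copying are omitted:

```cpp
#include <map>
#include <string>
#include <vector>

using VarList = std::vector<std::string>;
using VarMap = std::map<std::string, VarList>;

// Gradient naming convention used throughout: append "@GRAD".
std::string GradVarName(const std::string& name) { return name + "@GRAD"; }

struct SimpleOp {
  std::string type;
  VarMap inputs;
  VarMap outputs;
};

SimpleOp BuildGradOp(const SimpleOp& fwd) {
  // Step 1: look up the backward op type (toy stand-in for OpInfoMap).
  static const std::map<std::string, std::string> op_info_map = {
      {"mul", "mul_grad"}};
  SimpleOp bwd;
  bwd.type = op_info_map.at(fwd.type);
  // Step 2: copy forward inputs and outputs into the backward inputs.
  bwd.inputs = fwd.inputs;
  for (const auto& kv : fwd.outputs) bwd.inputs[kv.first] = kv.second;
  // Step 3: forward outputs' gradients become backward inputs...
  for (const auto& kv : fwd.outputs)
    for (const auto& name : kv.second)
      bwd.inputs[GradVarName(kv.first)].push_back(GradVarName(name));
  // ...and forward inputs' gradients become backward outputs.
  for (const auto& kv : fwd.inputs)
    for (const auto& name : kv.second)
      bwd.outputs[GradVarName(kv.first)].push_back(GradVarName(name));
  // Step 4: in the real framework the forward attributes are copied here.
  return bwd;
}
```

For a `mul` op with inputs `X`, `Y` and output `Out`, this produces a `mul_grad` op whose inputs are `X`, `Y`, `Out`, `Out@GRAD` and whose outputs are `X@GRAD`, `Y@GRAD`, matching the table in the registry section.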
-### Backward Network Building
-
-A backward network is a series of backward operators. The main idea of building a backward network is to create backward operators in reverse order and append them one by one. A few corner cases need special handling.
-
-1. Op
-
- When the input forward network is a single Op, return its gradient operator immediately. If all of its outputs are in the no-gradient set, return a special `NOP` instead.
-
-2. NetOp
-
- In our design, the network itself is also a kind of operator (**NetOp**), so the operators contained in a big network may themselves be small networks. When the input forward network is a NetOp, we call the backward function of its sub-NetOps/operators recursively. During this process, we collect the `OutputGradients` names according to the forward NetOp.
-
-3. RnnOp
-
- RnnOp is a nested stepnet operator. The backward module needs to recursively call `Backward` for every stepnet.
-
-4. Sharing Variables
-
- As illustrated in Figure 1 and Figure 2, two operators share the same variable name **W@GRAD**, so the later write overwrites the shared gradient variable.
-
-
-
-
- Figure 1. Sharing variables in operators.
-
-
-
- Sharing a variable between operators, or using the same input variable in multiple operators, can lead to duplicate gradient variables. As illustrated in Figure 2, we need to rename the gradient names recursively and add a generic add operator to prevent overwriting.
-
-
-
-
- Figure 2. Replacing a shared variable's gradient with an `Add` operator.
-
-
-
- Because the framework finds variables by name, we need to rename the output links. We add an integer suffix to represent each duplicate's position in the clockwise direction.
-
-5. Part of the Gradient is Zero.
-
- In the whole graph, there are cases where one operator's gradient is not needed, but its input's gradient is a dependency link of another operator; then we need to fill a gradient matrix of the same shape in that position. In our implementation, we insert a special `fillZeroLike` operator.
-
-
-Following the rules above, we collect the sub-graph's `OutputGradients`/`InputGradients` as the NetOp's own, and return it.
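The shared-variable rule (corner case 4) can be sketched in isolation. The function below is a hypothetical simplification: it renames duplicated gradient output names with an integer suffix and records which renamed copies a generic add operator would later sum back into the original name:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch of the shared-variable rule: when several backward
// operators write the same gradient name (e.g. "W@GRAD"), rename each
// occurrence with an integer suffix and record the renamed copies so a
// generic add operator can sum them back into the original name.
std::vector<std::string> DeduplicateGradOutputs(
    const std::vector<std::string>& grad_outputs,
    std::map<std::string, std::vector<std::string>>* to_sum) {
  std::map<std::string, int> seen;
  for (const auto& name : grad_outputs) ++seen[name];

  std::map<std::string, int> next_idx;
  std::vector<std::string> renamed;
  for (const auto& name : grad_outputs) {
    if (seen[name] > 1) {
      // Suffix encodes the duplicate's position among its occurrences.
      std::string unique = name + "@" + std::to_string(next_idx[name]++);
      renamed.push_back(unique);
      (*to_sum)[name].push_back(unique);  // later: name = add(copies...)
    } else {
      renamed.push_back(name);
    }
  }
  return renamed;
}
```

With outputs `{W@GRAD, b@GRAD, W@GRAD}`, the duplicated `W@GRAD` becomes `W@GRAD@0` and `W@GRAD@1`, while the unique `b@GRAD` is left untouched; an add operator summing the two copies back into `W@GRAD` would then be appended to the backward network.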
diff --git a/paddle/framework/backward_test.cc b/paddle/framework/backward_test.cc
index 0957646b5642cd9afce5d88b2c638679cb01f198..692406b1c37d0c02714eafb5cf9a28329ed873bc 100644
--- a/paddle/framework/backward_test.cc
+++ b/paddle/framework/backward_test.cc
@@ -1,16 +1,16 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
+ http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. */
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
#include "paddle/framework/backward.h"
diff --git a/paddle/framework/data_layout.h b/paddle/framework/data_layout.h
index 7429de7ee39297c26360984809e2451100f7b3ff..4a8669c3a41fceaad26878a79eabfd0affce86fd 100644
--- a/paddle/framework/data_layout.h
+++ b/paddle/framework/data_layout.h
@@ -13,11 +13,15 @@ See the License for the specific language governing permissions and
limitations under the License. */
#pragma once
+#include "paddle/platform/enforce.h"
+
+#include <iostream>
+#include "paddle/platform/enforce.h"
namespace paddle {
namespace framework {
-enum DataLayout {
+enum class DataLayout {
kNHWC = 0,
kNCHW = 1,
kAnyLayout = 2,
@@ -33,5 +37,23 @@ inline DataLayout StringToDataLayout(const std::string& str) {
}
}
+inline std::string DataLayoutToString(const DataLayout& data_layout) {
+ switch (data_layout) {
+ case DataLayout::kNHWC:
+ return "NHWC";
+ case DataLayout::kNCHW:
+ return "NCHW";
+ case DataLayout::kAnyLayout:
+ return "ANY_LAYOUT";
+ default:
+ PADDLE_THROW("unknown DataLayout %d", data_layout);
+ }
+}
+
+inline std::ostream& operator<<(std::ostream& out, DataLayout l) {
+ out << DataLayoutToString(l);
+ return out;
+}
+
} // namespace framework
} // namespace paddle
diff --git a/paddle/framework/data_transform.cc b/paddle/framework/data_transform.cc
new file mode 100644
index 0000000000000000000000000000000000000000..6b1780968867067fe6d1e7fc576811f0a07340b3
--- /dev/null
+++ b/paddle/framework/data_transform.cc
@@ -0,0 +1,159 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+#include <functional>
+
+#include "paddle/framework/data_transform.h"
+#include "paddle/framework/lod_tensor.h"
+#include "paddle/platform/device_context.h"
+
+namespace paddle {
+namespace framework {
+
+DataTransformFnMap& DataTransformFnMap::Instance() {
+ static DataTransformFnMap data_transform_map;
+ return data_transform_map;
+}
+
+auto KernelFP32 = OpKernelType(proto::DataType::FP32, platform::CPUPlace(),
+ DataLayout::kNHWC, LibraryType::kPlain);
+
+auto KernelFP64 = OpKernelType(proto::DataType::FP64, platform::CPUPlace(),
+ DataLayout::kNHWC, LibraryType::kPlain);
+
+auto KernelNHWC = OpKernelType(proto::DataType::FP64, platform::CPUPlace(),
+ DataLayout::kNHWC, LibraryType::kPlain);
+
+auto KernelNCHW = OpKernelType(proto::DataType::FP64, platform::CPUPlace(),
+ DataLayout::kNCHW, LibraryType::kPlain);
+
+// TODO(dzhwinter): Only for testing multiple op kernel.
+// Dummy transform function for library_type
+// should be removed.
+auto KernelPlain = OpKernelType(proto::DataType::FP32, platform::CUDAPlace(0),
+ DataLayout::kAnyLayout, LibraryType::kPlain);
+
+auto KernelCUDNN = OpKernelType(proto::DataType::FP32, platform::CUDAPlace(0),
+ DataLayout::kAnyLayout, LibraryType::kCUDNN);
+
+void DummyTrans(const platform::DeviceContext* ctx,
+ const KernelTypePair& kernel_pair, const Variable& in,
+ Variable* out) {
+ PADDLE_ENFORCE(in.IsType<Tensor>(), "Only support Tensor transform!");
+ PADDLE_ENFORCE(
+ platform::places_are_same_class(kernel_pair.first.place_,
+ kernel_pair.second.place_),
+ "TransDataType Only Support DataType transform on same place!");
+ auto src = in.Get<Tensor>();
+ auto* dst = out->GetMutable<Tensor>();
+ *dst = src;
+}
+
+void TransDataType(const platform::DeviceContext* ctx,
+ const KernelTypePair& kernel_pair, const Variable& in,
+ Variable* out) {
+ PADDLE_ENFORCE(in.IsType<Tensor>(), "Only support Tensor transform!");
+ PADDLE_ENFORCE(
+ platform::places_are_same_class(kernel_pair.first.place_,
+ kernel_pair.second.place_),
+ "TransDataType Only Support DataType transform on same place!");
+
+ auto src = in.Get<Tensor>();
+ auto* dst = out->GetMutable<Tensor>();
+
+ auto dims = src.dims();
+ dst->Resize(dims);
+ auto dst_type = kernel_pair.second.data_type_;
+ auto src_type = kernel_pair.first.data_type_;
+
+ switch (src_type) {
+ case proto::DataType::FP32:
+ framework::VisitDataType(dst_type, CastDataType<float>(src, dst, ctx));
+ break;
+ case proto::DataType::FP64:
+ framework::VisitDataType(dst_type, CastDataType<double>(src, dst, ctx));
+ break;
+ case proto::DataType::INT32:
+ framework::VisitDataType(dst_type, CastDataType<int>(src, dst, ctx));
+ break;
+ case proto::DataType::INT64:
+ framework::VisitDataType(dst_type, CastDataType<int64_t>(src, dst, ctx));
+ break;
+ case proto::DataType::BOOL:
+ framework::VisitDataType(dst_type, CastDataType<bool>(src, dst, ctx));
+ break;
+ default:
+ PADDLE_THROW("Not support type %d", src_type);
+ }
+}
+
+void TransDataLayout(const std::vector<int>& axis,
+ const platform::DeviceContext* ctx,
+ const KernelTypePair& kernel_pair, const Variable& in,
+ Variable* out) {
+ PADDLE_ENFORCE(in.IsType<Tensor>(), "Only support Tensor transform!");
+ PADDLE_ENFORCE(
+ platform::places_are_same_class(kernel_pair.first.place_,
+ kernel_pair.second.place_),
+ "TransDataLayout only support DataLayout transform on same place!");
+ PADDLE_ENFORCE(kernel_pair.first.data_type_ == kernel_pair.second.data_type_,
+ "TransDataLayout only support Datatype are same!");
+
+ auto src = in.Get<Tensor>();
+ auto* dst = out->GetMutable<Tensor>();
+ PADDLE_ENFORCE(arity(src.dims()) == 4, "Input Arity Only Support 4!");
+
+ auto place = kernel_pair.second.place_;
+ CopyFrom(src, place, *ctx, dst);
+
+ auto src_dim = src.dims();
+ std::vector<int64_t> dst_dim;
+
+ dst_dim.resize(axis.size());
+ for (size_t i = 0; i < axis.size(); i++) {
+ dst_dim[i] = src_dim[axis[i]];
+ }
+
+ dst->Resize(make_ddim(dst_dim));
+
+ auto src_type = kernel_pair.first.data_type_;
+ framework::VisitDataType(src_type, CastDataLayout(ctx, axis, src, dst));
+
+ dst->set_layout(kernel_pair.second.data_layout_);
+}
+
+} // namespace framework
+} // namespace paddle
+
+namespace f = paddle::framework;
+
+namespace {
+std::vector<int> NHWC2NCHW = {0, 3, 1, 2};
+std::vector<int> NCHW2NHWC = {0, 2, 3, 1};
+}
+
+REGISTER_DATA_TRANSFORM_FN(f::KernelFP32, f::KernelFP64, f::TransDataType);
+REGISTER_DATA_TRANSFORM_FN(f::KernelPlain, f::KernelCUDNN, f::DummyTrans);
+REGISTER_DATA_TRANSFORM_FN(f::KernelCUDNN, f::KernelPlain, f::DummyTrans);
+REGISTER_DATA_TRANSFORM_FN(f::KernelNHWC, f::KernelNCHW,
+ std::bind(f::TransDataLayout, NHWC2NCHW,
+ std::placeholders::_1,
+ std::placeholders::_2,
+ std::placeholders::_3,
+ std::placeholders::_4));
+REGISTER_DATA_TRANSFORM_FN(f::KernelNCHW, f::KernelNHWC,
+ std::bind(f::TransDataLayout, NCHW2NHWC,
+ std::placeholders::_1,
+ std::placeholders::_2,
+ std::placeholders::_3,
+ std::placeholders::_4));
diff --git a/paddle/framework/data_transform.h b/paddle/framework/data_transform.h
new file mode 100644
index 0000000000000000000000000000000000000000..56ebc80f4386958608213f30e745f2d9528e9e5e
--- /dev/null
+++ b/paddle/framework/data_transform.h
@@ -0,0 +1,173 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#pragma once
+
+#include <functional>
+#include <utility>
+#include <vector>
+
+#include "paddle/framework/op_kernel_type.h"
+#include "paddle/framework/tensor.h"
+#include "paddle/framework/variable.h"
+#include "paddle/operators/math/math_function.h"
+#include "paddle/platform/device_context.h"
+#include "paddle/platform/macros.h"
+#include "paddle/platform/transform.h"
+
+namespace paddle {
+namespace framework {
+
+using KernelTypePair = std::pair<OpKernelType, OpKernelType>;
+
+using DataTransformFn =
+ std::function<void(const platform::DeviceContext*, const KernelTypePair&, const Variable&, Variable*)>;
+
+struct KernelTypePairHash {
+ static void HashCombine(const OpKernelType& t, std::size_t* seed) {
+ OpKernelType::Hash kernel_type_hasher;
+ (*seed) ^= kernel_type_hasher(t) + 0x9e3779b9 + (*seed << 6) + (*seed >> 2);
+ }
+
+ size_t operator()(const KernelTypePair& kernel_pair) const {
+ std::size_t seed = 0;
+ HashCombine(kernel_pair.first, &seed);
+ HashCombine(kernel_pair.second, &seed);
+ return seed;
+ }
+};
+
+template <typename InType, typename OutType>
+struct CastDataTypeFunctor {
+ HOSTDEVICE inline OutType operator()(InType in) const {
+ return static_cast<OutType>(in);
+ }
+};
+
+template <typename InType>
+struct CastDataType {
+ CastDataType(const framework::Tensor& in, framework::Tensor* out,
+ const platform::DeviceContext* ctx)
+ : in_(in), out_(out), ctx_(ctx) {}
+ const framework::Tensor in_;
+ framework::Tensor* out_;
+ const platform::DeviceContext* ctx_;
+
+ template <typename OutType>
+ void operator()() {
+ auto place = ctx_->GetPlace();
+
+ auto* in_begin = in_.data<InType>();
+ auto numel = in_.numel();
+ auto* in_end = in_begin + numel;
+ auto* out_begin = out_->mutable_data<OutType>(place);
+
+ if (platform::is_cpu_place(place)) {
+ platform::Transform<platform::CPUDeviceContext> trans;
+ auto* context = static_cast<const platform::CPUDeviceContext*>(ctx_);
+ trans(*context, in_begin, in_end, out_begin,
+ CastDataTypeFunctor<InType, OutType>());
+ } else {
+ // TODO(dzhwinter): enhance CopyFrom CPU<->GPU with different data type?
+ PADDLE_THROW("Unsupport CPU <-> GPU!");
+ }
+ }
+};
+
+struct CastDataLayout {
+ CastDataLayout(const platform::DeviceContext* ctx,
+ const std::vector& axis, const framework::Tensor& in,
+ framework::Tensor* out)
+ : in_(in), out_(out), ctx_(ctx), axis_(axis) {}
+ const framework::Tensor in_;
+ framework::Tensor* out_;
+ const platform::DeviceContext* ctx_;
+ const std::vector axis_;
+
+ template <typename T>
+ void operator()() {
+ auto place = ctx_->GetPlace();
+
+ if (platform::is_cpu_place(place)) {
+ operators::math::Transpose<platform::CPUDeviceContext, T, 4> trans4;
+ auto* context = static_cast<const platform::CPUDeviceContext*>(ctx_);
+ trans4(*context, in_, out_, axis_);
+ } else {
+ PADDLE_THROW("Unsupport CPU <-> GPU!");
+ }
+ }
+};
+
+using DataTransformMap =
+ std::unordered_map<KernelTypePair, DataTransformFn, KernelTypePairHash>;
+
+class DataTransformFnMap {
+ public:
+ static DataTransformFnMap& Instance();
+
+ bool Has(const KernelTypePair& key_pair) const {
+ return map_.find(key_pair) != map_.end();
+ }
+
+ void Insert(const OpKernelType& left, const OpKernelType& right,
+ const DataTransformFn& data_tranform_fn) {
+ Insert(std::make_pair(left, right), data_tranform_fn);
+ }
+
+ void Insert(const KernelTypePair& kernel_type_pair,
+ const DataTransformFn& data_tranform_fn) {
+ PADDLE_ENFORCE(!Has(kernel_type_pair),
+ "KernelTypePair %s has been registered", "");
+ map_.insert({kernel_type_pair, data_tranform_fn});
+ }
+
+ const DataTransformFn& Get(const KernelTypePair& key_pair) const {
+ auto data_transformer = GetNullable(key_pair);
+ PADDLE_ENFORCE_NOT_NULL(data_transformer,
+ "DataTransformFn should not be NULL");
+ return *data_transformer;
+ }
+
+ const DataTransformFn* GetNullable(const KernelTypePair& key_pair) const {
+ auto it = map_.find(key_pair);
+ if (it == map_.end()) {
+ return nullptr;
+ } else {
+ return &(it->second);
+ }
+ }
+
+ const DataTransformMap& Map() const { return map_; }
+
+ private:
+ DataTransformFnMap() = default;
+ DataTransformMap map_;
+ DISABLE_COPY_AND_ASSIGN(DataTransformFnMap);
+};
+
+// generate unique name with __LINE__
+// refs https://stackoverflow.com/questions/1597007
+#define TOKENPASTE(x, y) x##y
+#define TOKENPASTE2(x, y) TOKENPASTE(x, y)
+#define REGISTER_DATA_TRANSFORM_FN(from, to, fn) \
+ static int TOKENPASTE2(fn_, __LINE__)() { \
+ ::paddle::framework::DataTransformFnMap::Instance().Insert(from, to, fn); \
+ return 0; \
+ } \
+ static int TOKENPASTE2(var_, __LINE__) __attribute__((unused)) = \
+ TOKENPASTE2(fn_, __LINE__)()
+
+} // namespace framework
+} // namespace paddle
diff --git a/paddle/framework/data_transform_test.cc b/paddle/framework/data_transform_test.cc
new file mode 100644
index 0000000000000000000000000000000000000000..edd305fd17ae202926b83fbec10089719baa2e16
--- /dev/null
+++ b/paddle/framework/data_transform_test.cc
@@ -0,0 +1,168 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+#include <array>
+#include <vector>
+
+#include <gtest/gtest.h>
+
+#include "paddle/framework/data_transform.h"
+#include "paddle/platform/device_context.h"
+
+namespace paddle {
+namespace framework {
+using namespace platform;
+
+/**
+ * @brief cross validation of different kernel type transform
+ * We use four bit map represent different combination.
+ * If the field has multiple possible value, only choose two of them.
+ * For DataType, only test the FP32(float), FP64(double).
+ * e.g. 0000 -> FP32, CPUPlace, kNHWC, kPlain
+ * 1111 -> FP64, GPUPlace, kNCHW, kMKLDNN
+ */
+
+std::array<proto::DataType, 2> kDataType = {
+ {proto::DataType::FP32, proto::DataType::FP64}};
+
+std::array<Place, 2> kPlace = {{CPUPlace(), CUDAPlace(0)}};
+
+std::array<DataLayout, 2> kDataLayout = {{
+ DataLayout::kNHWC, DataLayout::kNCHW,
+}};
+
+std::array<LibraryType, 2> kLibraryType = {{
+ LibraryType::kPlain, LibraryType::kMKLDNN,
+}};
+
+OpKernelType GenFromBit(const std::vector<bool> bits) {
+ return OpKernelType(kDataType[bits[0]], kPlace[bits[1]], kDataLayout[bits[2]],
+ kLibraryType[bits[3]]);
+}
+
+int test_value = 0;
+
+auto kernel0 = GenFromBit({0, 0, 0, 0});
+auto kernel1 = GenFromBit({0, 0, 0, 1});
+auto kernel2 = GenFromBit({0, 0, 1, 0});
+auto kernel3 = GenFromBit({0, 0, 1, 1});
+
+void TransDataType_t(const platform::DeviceContext* ctx,
+ const KernelTypePair& p, const Variable& in,
+ Variable* out) {
+ test_value++;
+}
+
+void TransDataLayout_t(const platform::DeviceContext* ctx,
+ const KernelTypePair& p, const Variable& in,
+ Variable* out) {
+ test_value--;
+}
+
+void TransLibraryType_t(const platform::DeviceContext* ctx,
+ const KernelTypePair& p, const Variable& in,
+ Variable* out) {
+ test_value += 2;
+}
+
+} // namespace framework
+} // namespace paddle
+
+namespace frw = paddle::framework;
+
+REGISTER_DATA_TRANSFORM_FN(frw::kernel0, frw::kernel1, frw::TransDataType_t);
+REGISTER_DATA_TRANSFORM_FN(frw::kernel1, frw::kernel2, frw::TransDataLayout_t);
+REGISTER_DATA_TRANSFORM_FN(frw::kernel0, frw::kernel2, frw::TransLibraryType_t);
+
+TEST(DataTransform, Register) {
+ using namespace paddle::framework;
+ using namespace paddle::platform;
+
+ auto& instance = DataTransformFnMap::Instance();
+ paddle::framework::Variable in;
+ paddle::framework::Variable out;
+
+ DeviceContext* ctx = new CPUDeviceContext();
+ auto pair0 = std::make_pair(frw::kernel0, frw::kernel1);
+ instance.Get(pair0)(ctx, pair0, in, &out);
+ ASSERT_EQ(test_value, 1);
+
+ auto pair1 = std::make_pair(frw::kernel1, frw::kernel2);
+ instance.Get(pair1)(ctx, pair1, in, &out);
+ ASSERT_EQ(test_value, 0);
+
+ auto pair3 = std::make_pair(frw::kernel0, frw::kernel2);
+ instance.Get(pair3)(ctx, pair3, in, &out);
+ ASSERT_EQ(test_value, 2);
+}
+
+TEST(DataTransform, DataLayout) {
+ using namespace paddle::framework;
+ using namespace paddle::platform;
+
+ auto& instance = DataTransformFnMap::Instance();
+ Variable in;
+ Variable out;
+ Tensor* src = in.GetMutable<Tensor>();
+ src->mutable_data<double>(make_ddim({2, 3, 1, 2}), CPUPlace());
+ src->set_layout(DataLayout::kNHWC);
+
+ DeviceContext* ctx = new CPUDeviceContext();
+
+ {
+ auto kernel1 = GenFromBit({1, 0, 0, 0});
+ auto kernel2 = GenFromBit({1, 0, 1, 0});
+ auto pair0 = std::make_pair(kernel1, kernel2);
+ instance.Get(pair0)(ctx, pair0, in, &out);
+ }
+
+ Tensor dst = out.Get<Tensor>();
+
+ EXPECT_TRUE(dst.layout() == DataLayout::kNCHW);
+ EXPECT_TRUE(dst.dims() == make_ddim({2, 2, 3, 1}));
+
+ {
+ auto kernel1 = GenFromBit({1, 0, 1, 0});
+ auto kernel2 = GenFromBit({1, 0, 0, 0});
+ auto pair0 = std::make_pair(kernel1, kernel2);
+ instance.Get(pair0)(ctx, pair0, out, &in);
+ }
+
+ EXPECT_TRUE(src->layout() == DataLayout::kNHWC);
+ EXPECT_TRUE(src->dims() == make_ddim({2, 3, 1, 2}));
+}
+
+TEST(DataTransform, DataType) {
+ using namespace paddle::framework;
+ using namespace paddle::platform;
+
+ auto& instance = DataTransformFnMap::Instance();
+ DeviceContext* ctx = new CPUDeviceContext();
+
+ Variable in;
+ Variable out;
+ Tensor* src = in.GetMutable<Tensor>();
+ float* ptr = src->mutable_data<float>(make_ddim({2, 3}), CPUPlace());
+ for (int i = 0; i < 6; ++i) {
+ ptr[i] = i / 3;
+ }
+
+ {
+ auto kernel1 = GenFromBit({0, 0, 0, 0});
+ auto kernel2 = GenFromBit({1, 0, 0, 0});
+ auto pair0 = std::make_pair(kernel1, kernel2);
+ instance.Get(pair0)(ctx, pair0, in, &out);
+ }
+ Tensor dst = out.Get<Tensor>();
+ EXPECT_TRUE(dst.data<double>() != nullptr);
+}
diff --git a/paddle/framework/data_type.h b/paddle/framework/data_type.h
index e94ee2ed52bc40f52caf783f971dd0b560534e08..6a372ac32e48131eed28e2d42125feb5b92a11c7 100644
--- a/paddle/framework/data_type.h
+++ b/paddle/framework/data_type.h
@@ -1,16 +1,16 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
+ http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. */
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
#pragma once
#include
diff --git a/paddle/framework/ddim_test.cc b/paddle/framework/ddim_test.cc
index bd5ea09d7da700479aa387283d7bde77c64c1293..bc259d1f603fb34ac8546c388669d8c5c1250bd1 100644
--- a/paddle/framework/ddim_test.cc
+++ b/paddle/framework/ddim_test.cc
@@ -1,16 +1,16 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
+ http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. */
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
#include
#include
diff --git a/paddle/framework/details/op_registry.h b/paddle/framework/details/op_registry.h
index 7f5151c41d6046f21f7a9707e45de85ec50219ad..6d50e820b2b625f932768d2ca671d999071f1ca6 100644
--- a/paddle/framework/details/op_registry.h
+++ b/paddle/framework/details/op_registry.h
@@ -1,16 +1,16 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
- Licensed under the Apache License, Version 2.0 (the "License");
- you may not use this file except in compliance with the License.
- You may obtain a copy of the License at
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
+ http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License. */
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
#pragma once
diff --git a/paddle/framework/executor.cc b/paddle/framework/executor.cc
index 14ae37ec49c12203381e74b3f9174a460e41c18e..bf1f0471ccbfccf13cb6f74c8088da7acd68ec0b 100644
--- a/paddle/framework/executor.cc
+++ b/paddle/framework/executor.cc
@@ -14,18 +14,17 @@ limitations under the License. */
#include "paddle/framework/executor.h"
-#include
-#include
-#include
#include
-#include
+#include "gflags/gflags.h"
#include "paddle/framework/feed_fetch_type.h"
#include "paddle/framework/lod_rank_table.h"
-#include "paddle/framework/lod_tensor.h"
#include "paddle/framework/lod_tensor_array.h"
#include "paddle/framework/op_registry.h"
-#include "paddle/framework/scope.h"
+
+DEFINE_bool(check_nan_inf, false,
+ "Checking whether operator produce NAN/INF or not. It will be "
+ "extremely slow so please use this flag wisely.");
namespace paddle {
namespace framework {
@@ -33,13 +32,7 @@ namespace framework {
const std::string kFeedOpType = "feed";
const std::string kFetchOpType = "fetch";
-DeviceContextPool* DeviceContextPool::pool = nullptr;
-
-Executor::Executor(const std::vector& places) {
- DeviceContextPool& pool = DeviceContextPool::Get();
- auto borrowed_contexts = pool.Borrow(places);
- device_contexts_.swap(borrowed_contexts);
-}
+Executor::Executor(const platform::Place& place) : place_(place) {}
static void CreateTensor(Variable* var, proto::VarDesc::VarType var_type) {
if (var_type == proto::VarDesc::LOD_TENSOR) {
@@ -64,6 +57,19 @@ static void CreateTensor(Variable* var, proto::VarDesc::VarType var_type) {
}
}
+static void CheckTensorNANOrInf(const std::string& name,
+ const framework::Tensor& tensor) {
+ if (tensor.memory_size() == 0) {
+ return;
+ }
+ if (tensor.type().hash_code() != typeid(float).hash_code() &&
+ tensor.type().hash_code() != typeid(double).hash_code()) {
+ return;
+ }
+ PADDLE_ENFORCE(!framework::HasInf(tensor), "Tensor %s has Inf", name);
+ PADDLE_ENFORCE(!framework::HasNAN(tensor), "Tensor %s has NAN", name);
+}
+
void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id,
bool create_local_scope, bool create_vars) {
// TODO(tonyyang-svail):
@@ -71,7 +77,6 @@ void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id,
// - will change to use multiple blocks for RNN op and Cond Op
PADDLE_ENFORCE_LT(static_cast(block_id), pdesc.Size());
auto& block = pdesc.Block(block_id);
- auto& device = device_contexts_[0];
Scope* local_scope = scope;
if (create_vars) {
@@ -107,9 +112,18 @@ void Executor::Run(const ProgramDesc& pdesc, Scope* scope, int block_id,
for (auto& op_desc : block.AllOps()) {
auto op = paddle::framework::OpRegistry::CreateOp(*op_desc);
VLOG(3) << op->DebugString();
- op->Run(*local_scope, *device);
+ op->Run(*local_scope, place_);
+ if (FLAGS_check_nan_inf) {
+ for (auto& vname : op->OutputVars(true)) {
+ auto* var = local_scope->FindVar(vname);
+ if (var == nullptr) continue;
+ if (var->IsType<framework::LoDTensor>()) {
+ CheckTensorNANOrInf(vname, var->Get<framework::LoDTensor>());
+ }
+ }
+ }
}
- if (create_local_scope) {
+ if (create_vars && create_local_scope) {
scope->DeleteScope(local_scope);
}
}
diff --git a/paddle/framework/executor.h b/paddle/framework/executor.h
index a3d1609293a0d687c33447ca7a0df95c6aac3bc5..d869e18901b82959a40cc296aa0844c20ea63ac1 100644
--- a/paddle/framework/executor.h
+++ b/paddle/framework/executor.h
@@ -14,9 +14,6 @@ limitations under the License. */
#pragma once
-#include