diff --git a/doc/design/ops/dist_train.md b/doc/design/ops/dist_train.md
new file mode 100644
index 0000000000000000000000000000000000000000..fa3c5d7990213cf2b0d236e66e592dd2699da876
--- /dev/null
+++ b/doc/design/ops/dist_train.md
@@ -0,0 +1,106 @@
+# Design Doc: Operation Graph Based Parameter Server
+
+## Abstract
+
+We propose an approach to implement the parameter server. In this
+approach, there is no fundamental difference between the trainer and
+the parameter server: they both run subgraphs, but subgraphs of
+different purposes.
+
+## Background
+
+The previous implementations of the parameter server does not run a
+subgraph. parameter initialization, optimizer computation, network
+communication and checkpointing are implemented twice on both the
+trainer and the parameter server.
+
+It would be great if we can write code once and use them on both the
+trainer and the parameter server: reduces code duplication and
+improves extensibility. Given that after the current refactor, we are
+representing everything as a computing graph on the
+trainer. Representing everything as a computing graph on the parameter
+server becomes a natural extension.
+
+## Design
+
+### Graph Converter
+
+The *graph converter* converts the user-defined operation (OP) graph
+into subgraphs to be scheduled on different nodes with the following
+steps:
+
+1. OP placement: the OPs will be placed on different nodes according
+ to heuristic that minimizes estimated total computation
+ time. Currently we will use a simple heuristic that puts parameter
+ varable on parameter server workers and everything else on trainer
+ workers.
+
+1. Add communication OPs to enable the communication between nodes.
+
+We will need these OPs: *Send*, *Recv*, *Enqueue*, *Dequeue*.
+
+Below is an example of converting the user defined graph to the
+subgraphs for the trainer and the parameter server:
+
+
+
+After converting:
+
+
+
+1. The parameter variable W and it's optimizer subgraph are placed on the parameter server.
+1. Operators are added to the subgraphs.
+ - *Send* sends data to the connected *Recv* operator. The
+ scheduler on the receive node will only schedule *Recv* operator
+ to run when the *Send* operator has ran (the *Send* OP will mark
+ the *Recv* OP runnable automatically).
+ - *Enueue* enqueues the input variable, it can block until space
+ become available in the queue.
+ - *Dequeue* outputs configurable numbers of tensors from the
+ queue. It will block until the queue have the required number of
+ tensors.
+
+
+### Benefits
+
+- Model parallelism become easier to implement: it's an extension to
+ the trainer - parameter server approach. we already have the
+ communication OPs, but need to extend the graph converter's
+ placement functionality.
+
+- User-defined optimizer is easier to add - user can now express it as
+ a subgraph.
+
+- No more duplication logic inside the trainer and the parameter
+ server mentioned in the background section.
+
+### Challenges
+
+- It might be hard for the graph converter to cut a general graph
+ (without any hint for which subgraph is the optimizer). We may need
+ to label which subgraph inside the OP graph is the optimizer.
+
+- It's important to balance the parameter shards of on multiple
+ parameter server. If a single parameter is very big (some
+ word-embedding, fully connected, softmax layer), we need to
+ automatically partition the single parameter onto different
+ parameter servers when possible (only element-wise optimizer depends
+ on the parameter variable).
+
+### Discussion
+
+- In the "Aync SGD" figure, the "W" variable on the parameter server
+ could be read and wrote concurrently, what is our locking strategy?
+ E.g., each variable have a lock cpp method to be invoked by every
+ OP, or, have a lock OP.
+
+- Can the Enqueue OP be implemented under our current tensor design
+ (puts the input tensor into the queue tensor)?
+
+- *Dequeue* OP will have variable numbers of output (depends on the
+ `min_count` attribute), does our current design support it? (similar
+ question for the *Add* OP)
+
+
+### References:
+[1] [TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45166.pdf)
diff --git a/doc/design/ops/src/dist-graph.graffle b/doc/design/ops/src/dist-graph.graffle
new file mode 100644
index 0000000000000000000000000000000000000000..941399c6ced8d5f65b6c595522b770c88259df4b
Binary files /dev/null and b/doc/design/ops/src/dist-graph.graffle differ
diff --git a/doc/design/ops/src/dist-graph.png b/doc/design/ops/src/dist-graph.png
new file mode 100644
index 0000000000000000000000000000000000000000..3546b09f1c2ee3e4f60f519d5e47f823f08051a7
Binary files /dev/null and b/doc/design/ops/src/dist-graph.png differ
diff --git a/doc/design/ops/src/local-graph.graffle b/doc/design/ops/src/local-graph.graffle
new file mode 100644
index 0000000000000000000000000000000000000000..19e509bd9af3c1e9a3f5e0f16ddd281457a339c5
Binary files /dev/null and b/doc/design/ops/src/local-graph.graffle differ
diff --git a/doc/design/ops/src/local-graph.png b/doc/design/ops/src/local-graph.png
new file mode 100644
index 0000000000000000000000000000000000000000..ada51200f793a9bb18911e7d63cfdb3244b967d7
Binary files /dev/null and b/doc/design/ops/src/local-graph.png differ
diff --git a/paddle/framework/operator.cc b/paddle/framework/operator.cc
index 790cfc4746b1d34da413fa3c29a266f962c6dde6..e1e122091f7759b1a68f1f982bc2a35e8241f9f0 100644
--- a/paddle/framework/operator.cc
+++ b/paddle/framework/operator.cc
@@ -123,6 +123,15 @@ OperatorBase::OperatorBase(const std::string& type,
CheckAllInputOutputSet();
}
+std::vector OperatorBase::InputVars() const {
+ std::vector ret_val;
+ for (auto& o : outputs_) {
+ ret_val.reserve(ret_val.size() + o.second.size());
+ ret_val.insert(ret_val.end(), o.second.begin(), o.second.end());
+ }
+ return ret_val;
+}
+
std::vector OperatorBase::OutputVars(bool has_intermediate) const {
std::vector ret_val;
if (has_intermediate) {
diff --git a/paddle/framework/operator.h b/paddle/framework/operator.h
index 9a98d4d3be0d1cb875d614b263f1e4365ede4113..4600b06009bcef7d0774d25b816aac4733f30795 100644
--- a/paddle/framework/operator.h
+++ b/paddle/framework/operator.h
@@ -94,11 +94,14 @@ class OperatorBase {
const VariableNameMap& Inputs() const { return inputs_; }
const VariableNameMap& Outputs() const { return outputs_; }
+
//! Get a input with argument's name described in `op_proto`
std::string Input(const std::string& name) const;
//! Get a input which has multiple variables.
const std::vector& Inputs(const std::string& name) const;
+ std::vector InputVars() const;
+
//! Get a output with argument's name described in `op_proto`
std::string Output(const std::string& name) const;
//! Get an output which has multiple variables.
@@ -311,9 +314,9 @@ class InferShapeContext {
}
template
- std::vector MultiOutput(const std::string& name) const {
+ std::vector MultiOutput(const std::string& name) const {
auto names = op_.Outputs(name);
- std::vector res;
+ std::vector res;
res.reserve(names.size());
std::transform(names.begin(), names.end(), std::back_inserter(res),
[&](const std::string& sub_name) {
diff --git a/paddle/gserver/layers/BatchNormBaseLayer.cpp b/paddle/gserver/layers/BatchNormBaseLayer.cpp
index 1ceaaaa206ee3cbc5421238574c7f310011ccaa5..f7a80e23e1bd49549bec57b360587adc6b423794 100644
--- a/paddle/gserver/layers/BatchNormBaseLayer.cpp
+++ b/paddle/gserver/layers/BatchNormBaseLayer.cpp
@@ -62,14 +62,18 @@ void BatchNormBaseLayer::calFeatureMapSize() {
const ImageConfig& conf = config_.inputs(0).image_conf();
imageH_ = inputLayers_[0]->getOutput().getFrameHeight();
imageW_ = inputLayers_[0]->getOutput().getFrameWidth();
+ imageD_ = inputLayers_[0]->getOutput().getFrameDepth();
+
+ if (0 == imageD_) imageD_ = conf.img_size_z();
if (imageH_ == 0 && imageW_ == 0) {
imageH_ = conf.has_img_size_y() ? conf.img_size_y() : conf.img_size();
imageW_ = conf.img_size();
} else {
getOutput().setFrameHeight(imageH_);
getOutput().setFrameWidth(imageW_);
+ getOutput().setFrameDepth(imageD_);
}
- imgPixels_ = imageH_ * imageW_;
+ imgPixels_ = imageH_ * imageW_ * imageD_;
}
} // namespace paddle
diff --git a/paddle/gserver/layers/BatchNormBaseLayer.h b/paddle/gserver/layers/BatchNormBaseLayer.h
index 230bafc31d96bbd49481a7ed135be6888688627e..e721d2d267a31cae46407673b8b1281e87055608 100644
--- a/paddle/gserver/layers/BatchNormBaseLayer.h
+++ b/paddle/gserver/layers/BatchNormBaseLayer.h
@@ -80,6 +80,7 @@ protected:
/// Height or width of input image feature.
/// Both of them are 1 if the input is fully-connected layer.
+ int imageD_;
int imageH_;
int imageW_;
/// Height * Width.
diff --git a/paddle/gserver/layers/CudnnBatchNormLayer.cpp b/paddle/gserver/layers/CudnnBatchNormLayer.cpp
index 44ba2c4b7d1562d2ce839b5f4b4de1af35e6925f..49a9540c0b6e36b59ed786287ff5c4569b69a6a5 100644
--- a/paddle/gserver/layers/CudnnBatchNormLayer.cpp
+++ b/paddle/gserver/layers/CudnnBatchNormLayer.cpp
@@ -37,7 +37,7 @@ bool CudnnBatchNormLayer::init(const LayerMap& layerMap,
}
void CudnnBatchNormLayer::reshape(int batchSize) {
- hl_tensor_reshape(ioDesc_, batchSize, channels_, imageH_, imageW_);
+ hl_tensor_reshape(ioDesc_, batchSize, channels_, imageH_ * imageD_, imageW_);
}
void CudnnBatchNormLayer::forward(PassType passType) {
@@ -104,7 +104,7 @@ void CudnnBatchNormLayer::forward(PassType passType) {
EPS,
batchSize,
channels_,
- imageH_,
+ imageH_ * imageD_,
imageW_);
}
}
diff --git a/paddle/gserver/layers/SwitchOrderLayer.cpp b/paddle/gserver/layers/SwitchOrderLayer.cpp
index 92cd61cdd515d5c693df086c9575a5f197c00cee..d7eee6eaf078dab8d48adc4c7ee758a433672ac6 100644
--- a/paddle/gserver/layers/SwitchOrderLayer.cpp
+++ b/paddle/gserver/layers/SwitchOrderLayer.cpp
@@ -24,10 +24,12 @@ bool SwitchOrderLayer::init(const LayerMap& layerMap,
/* Initialize the basic parent class */
Layer::init(layerMap, parameterMap);
auto& img_conf = config_.inputs(0).image_conf();
+ size_t inD = img_conf.img_size_z();
size_t inH =
img_conf.has_img_size_y() ? img_conf.img_size_y() : img_conf.img_size();
size_t inW = img_conf.img_size();
size_t inC = img_conf.channels();
+ inH = inH * inD;
inDims_ = TensorShape({0, inC, inH, inW});
outDims_ = TensorShape(4);
@@ -64,9 +66,10 @@ void SwitchOrderLayer::setInDims() {
MatrixPtr input = inputLayers_[0]->getOutputValue();
size_t batchSize = input->getHeight();
inDims_.setDim(0, batchSize);
-
+ int d = inputLayers_[0]->getOutput().getFrameDepth();
+ d = (d == 0 ? 1 : d);
int h = inputLayers_[0]->getOutput().getFrameHeight();
- if (h != 0) inDims_.setDim(2, h);
+ if (h != 0) inDims_.setDim(2, h * d);
int w = inputLayers_[0]->getOutput().getFrameWidth();
if (w != 0) inDims_.setDim(3, w);
int totalCount = input->getElementCnt();
diff --git a/paddle/gserver/tests/test_LayerGrad.cpp b/paddle/gserver/tests/test_LayerGrad.cpp
index d1f3bc241fa621cb0070125980996e8627e40fd6..0e6be2df9ef5f0fae8ed2b0c65ac6c032fe45ab1 100644
--- a/paddle/gserver/tests/test_LayerGrad.cpp
+++ b/paddle/gserver/tests/test_LayerGrad.cpp
@@ -1703,6 +1703,55 @@ TEST(Layer, BatchNormalizationLayer) {
#endif
}
+void testBatchNorm3DLayer(const string& type, bool trans, bool useGpu) {
+ TestConfig config;
+ const int CHANNELS = 10;
+ const int IMG_SIZE = 16;
+ const int IMG_SIZE_Y = 8;
+ const int IMG_SIZE_Z = 8;
+ size_t size = CHANNELS * IMG_SIZE * IMG_SIZE_Y * IMG_SIZE_Z;
+ config.layerConfig.set_type(type);
+ config.layerConfig.set_size(size);
+ config.layerConfig.set_active_type("sigmoid");
+ config.biasSize = CHANNELS;
+ config.inputDefs.push_back({INPUT_DATA,
+ "layer_0",
+ /* dim= */ size,
+ /* paraSize= */ CHANNELS});
+
+ config.inputDefs.push_back({INPUT_DATA, "layer_1_running_mean", 1, CHANNELS});
+ config.inputDefs.back().isStatic = true;
+ config.inputDefs.push_back({INPUT_DATA, "layer_2_running_var", 1, CHANNELS});
+ config.inputDefs.back().isStatic = true;
+
+ LayerInputConfig* input = config.layerConfig.add_inputs();
+ config.layerConfig.add_inputs();
+ config.layerConfig.add_inputs();
+
+ ImageConfig* img_conf = input->mutable_image_conf();
+ img_conf->set_channels(CHANNELS);
+ img_conf->set_img_size(IMG_SIZE);
+ img_conf->set_img_size_y(IMG_SIZE_Y);
+ img_conf->set_img_size_z(IMG_SIZE_Z);
+
+ testLayerGrad(config,
+ "batch_norm",
+ 64,
+ /* trans= */ trans,
+ useGpu,
+ /* useWeight */ true);
+}
+
+TEST(Layer, testBatchNorm3DLayer) {
+ testBatchNorm3DLayer("batch_norm", false, false);
+#ifndef PADDLE_ONLY_CPU
+ testBatchNorm3DLayer("batch_norm", false, true);
+ if (hl_get_cudnn_lib_version() >= int(4000)) {
+ testBatchNorm3DLayer("cudnn_batch_norm", false, true);
+ }
+#endif
+}
+
void testConvOperator(bool isDeconv) {
TestConfig config;
const int NUM_FILTERS = 16;
diff --git a/paddle/operators/sum_op.cc b/paddle/operators/sum_op.cc
new file mode 100644
index 0000000000000000000000000000000000000000..5805826ee8a555ca6dfc1ca81feaadffea9e1012
--- /dev/null
+++ b/paddle/operators/sum_op.cc
@@ -0,0 +1,73 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#include "paddle/operators/sum_op.h"
+#include
+
+namespace paddle {
+namespace operators {
+using framework::Tensor;
+
+class SumOp : public framework::OperatorWithKernel {
+ public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ protected:
+ void InferShape(const framework::InferShapeContext &ctx) const override {
+ auto ins = ctx.MultiInput("X");
+ auto *out = ctx.Output("Out");
+ int N = ins.size();
+
+ auto in_dim = ins[0]->dims();
+
+ PADDLE_ENFORCE_GT(N, 1, "Input tensors count should > 1.");
+ for (int i = 1; i < N; i++) {
+ auto dim = ins[i]->dims();
+ PADDLE_ENFORCE(in_dim == dim, "Input tensors must have same shape");
+ }
+ out->Resize(in_dim);
+ }
+};
+
+class SumOpMaker : public framework::OpProtoAndCheckerMaker {
+ public:
+ SumOpMaker(framework::OpProto *proto, framework::OpAttrChecker *op_checker)
+ : OpProtoAndCheckerMaker(proto, op_checker) {
+ AddInput("X", "the input tensors of sum operator.").AsDuplicable();
+ AddOutput("Out", "the output tensor of sum operator.");
+ AddComment(R"DOC(
+ Sum the input tensors.
+ )DOC");
+ }
+};
+
+class SumGradOp : public framework::OperatorWithKernel {
+ public:
+ using framework::OperatorWithKernel::OperatorWithKernel;
+
+ protected:
+ void InferShape(const framework::InferShapeContext &ctx) const override {
+ auto outputs = ctx.MultiOutput(framework::GradVarName("X"));
+ auto dims = ctx.Input(framework::GradVarName("Out"))->dims();
+ for (auto output : outputs) {
+ output->Resize(dims);
+ }
+ }
+};
+
+} // namespace operators
+} // namespace paddle
+
+namespace ops = paddle::operators;
+REGISTER_OP(sum, ops::SumOp, ops::SumOpMaker, sum_grad, ops::SumGradOp);
+REGISTER_OP_CPU_KERNEL(sum, ops::SumKernel);
+REGISTER_OP_CPU_KERNEL(sum_grad,
+ ops::SumGradKernel);
diff --git a/paddle/operators/sum_op.cu b/paddle/operators/sum_op.cu
new file mode 100644
index 0000000000000000000000000000000000000000..a465cf3659ba7c51338abadfc62962fb6755a39d
--- /dev/null
+++ b/paddle/operators/sum_op.cu
@@ -0,0 +1,18 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#define EIGEN_USE_GPU
+#include "paddle/operators/sum_op.h"
+
+namespace ops = paddle::operators;
+REGISTER_OP_GPU_KERNEL(sum, ops::SumKernel);
+REGISTER_OP_GPU_KERNEL(sum_grad,
+ ops::SumGradKernel);
diff --git a/paddle/operators/sum_op.h b/paddle/operators/sum_op.h
new file mode 100644
index 0000000000000000000000000000000000000000..0b1e9ebaa38d455fb5e3ce8c1a39cbbcdad9a940
--- /dev/null
+++ b/paddle/operators/sum_op.h
@@ -0,0 +1,65 @@
+/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License. */
+
+#pragma once
+#include "paddle/framework/eigen.h"
+#include "paddle/framework/op_registry.h"
+
+namespace paddle {
+namespace operators {
+
+using Tensor = framework::Tensor;
+template
+using EigenVector = framework::EigenVector;
+
+template
+class SumKernel : public framework::OpKernel {
+ public:
+ void Compute(const framework::ExecutionContext& context) const override {
+ auto ins = context.MultiInput("X");
+ auto* out = context.Output("Out");
+ out->mutable_data(context.GetPlace());
+
+ auto place = context.GetEigenDevice();
+ auto result = EigenVector::Flatten(*out);
+
+ int N = ins.size();
+ auto in = EigenVector::Flatten(*(ins[0]));
+ result.device(place) = in;
+ for (int i = 1; i < N; i++) {
+ auto in = EigenVector::Flatten(*(ins[i]));
+ result.device(place) = result + in;
+ }
+ }
+};
+
+template
+class SumGradKernel : public framework::OpKernel {
+ public:
+ void Compute(const framework::ExecutionContext& context) const override {
+ auto* input = context.Input(framework::GradVarName("Out"));
+ auto outs = context.MultiOutput(framework::GradVarName("X"));
+ for (auto out : outs) {
+ out->mutable_data(context.GetPlace());
+ }
+
+ auto place = context.GetEigenDevice();
+ auto in = EigenVector::Flatten(*input);
+ for (auto out : outs) {
+ auto result = EigenVector::Flatten(*out);
+ result.device(place) = in;
+ }
+ }
+};
+
+} // namespace operators
+} // namespace paddle
diff --git a/paddle/pybind/pybind.cc b/paddle/pybind/pybind.cc
index 109e62e8739f4b0cb0e2ba7d3f7cf2a2f5cbb9b7..be68c0930c849f969e58d6c786842acb99806eeb 100644
--- a/paddle/pybind/pybind.cc
+++ b/paddle/pybind/pybind.cc
@@ -52,6 +52,7 @@ USE_OP(pad);
USE_CPU_ONLY_OP(scatter);
USE_OP(top_k);
USE_OP(squared_l2_distance);
+USE_OP(sum);
namespace paddle {
namespace framework {
@@ -217,7 +218,10 @@ All parameter, weight, gradient are variables in Paddle.
-> std::map> {
return op.Outputs();
})
+ .def("output_vars",
+ [](const OperatorBase &op) { return op.OutputVars(true); })
.def("inputs", [](const OperatorBase &op) { return op.Inputs(); })
+ .def("input_vars", [](const OperatorBase &op) { return op.InputVars(); })
.def("__str__", &OperatorBase::DebugString)
.def("no_intermediate_outputs",
[](const OperatorBase &op) { return op.OutputVars(false); })
diff --git a/proto/ModelConfig.proto b/proto/ModelConfig.proto
index 7d7fc23a4691646dfce4c162a445864c748501d9..ebf0911d6ea0b39d51447859ae2aef485b50b0e6 100644
--- a/proto/ModelConfig.proto
+++ b/proto/ModelConfig.proto
@@ -271,6 +271,7 @@ message ImageConfig {
// The size of input feature map.
required uint32 img_size = 8;
optional uint32 img_size_y = 9;
+ optional uint32 img_size_z = 10 [ default = 1 ];
}
message PriorBoxConfig {
@@ -519,6 +520,7 @@ message LayerConfig {
// for HuberRegressionLoss
optional double delta = 57 [ default = 1.0 ];
+ // for 3D data
optional uint64 depth = 58 [ default = 1 ];
// for switch order layer
diff --git a/python/paddle/trainer/config_parser.py b/python/paddle/trainer/config_parser.py
index 11dc84ae20679bb73735f9119739fca5ea7fa673..7e9112b43bf851575a3a798886d8b1b17e7c2017 100644
--- a/python/paddle/trainer/config_parser.py
+++ b/python/paddle/trainer/config_parser.py
@@ -1332,6 +1332,12 @@ def parse_image(image, input_layer_name, image_conf):
get_img_size(input_layer_name, image_conf.channels)
+def parse_image3d(image, input_layer_name, image_conf):
+ image_conf.channels = image.channels
+ image_conf.img_size, image_conf.img_size_y, image_conf.img_size_z = \
+ get_img3d_size(input_layer_name, image_conf.channels)
+
+
def parse_norm(norm, input_layer_name, norm_conf):
norm_conf.norm_type = norm.norm_type
config_assert(
@@ -2365,9 +2371,11 @@ class BatchNormLayer(LayerBase):
name,
inputs,
bias=True,
+ img3D=False,
use_global_stats=True,
moving_average_fraction=0.9,
batch_norm_type=None,
+ mean_var_names=None,
**xargs):
if inputs is None:
inputs = []
@@ -2409,24 +2417,69 @@ class BatchNormLayer(LayerBase):
input_layer = self.get_input_layer(0)
image_conf = self.config.inputs[0].image_conf
- parse_image(self.inputs[0].image, input_layer.name, image_conf)
-
- # Only pass the width and height of input to batch_norm layer
- # when either of it is non-zero.
- if input_layer.width != 0 or input_layer.height != 0:
- self.set_cnn_layer(name, image_conf.img_size_y, image_conf.img_size,
- image_conf.channels, False)
+ if img3D:
+ parse_image3d(self.inputs[0].image, input_layer.name, image_conf)
+ # Only pass the width and height of input to batch_norm layer
+ # when either of it is non-zero.
+ if input_layer.width != 0 or input_layer.height != 0:
+ self.set_cnn_layer(
+ input_layer_name=name,
+ depth=image_conf.img_size_z,
+ height=image_conf.img_size_y,
+ width=image_conf.img_size,
+ channels=image_conf.channels,
+ is_print=True)
+ else:
+ self.set_layer_size(input_layer.size)
else:
- self.set_layer_size(input_layer.size)
+ parse_image(self.inputs[0].image, input_layer.name, image_conf)
+ # Only pass the width and height of input to batch_norm layer
+ # when either of it is non-zero.
+ if input_layer.width != 0 or input_layer.height != 0:
+ self.set_cnn_layer(
+ input_layer_name=name,
+ height=image_conf.img_size_y,
+ width=image_conf.img_size,
+ channels=image_conf.channels,
+ is_print=True)
+ else:
+ self.set_layer_size(input_layer.size)
psize = self.calc_parameter_size(image_conf)
dims = [1, psize]
+ if mean_var_names is not None:
+ assert len(mean_var_names) == 2
+ self.inputs[1].parameter_name = mean_var_names[0]
+ self.inputs[2].parameter_name = mean_var_names[1]
+
self.create_input_parameter(0, psize)
self.create_input_parameter(1, psize, dims)
self.create_input_parameter(2, psize, dims)
self.create_bias_parameter(bias, psize)
+ def set_cnn_layer(self,
+ input_layer_name,
+ depth=None,
+ height=None,
+ width=None,
+ channels=None,
+ is_print=True):
+ depthIsNone = False
+ if depth is None:
+ depth = 1
+ depthIsNone = True
+ size = depth * height * width * channels
+ self.set_layer_size(size)
+ self.set_layer_height_width(height, width)
+ self.set_layer_depth(depth)
+ if is_print and depthIsNone:
+ print("output for %s: c = %d, h = %d, w = %d, size = %d" %
+ (input_layer_name, channels, height, width, size))
+ elif is_print:
+ print("output for %s: c = %d, d = %d, h = %d, w = %d, size = %d" %
+ (input_layer_name, channels, depth, height, width, size))
+
def calc_parameter_size(self, image_conf):
return image_conf.channels
@@ -2688,9 +2741,20 @@ class AddToLayer(LayerBase):
super(AddToLayer, self).__init__(
name, 'addto', 0, inputs=inputs, **xargs)
config_assert(len(inputs) > 0, 'inputs cannot be empty for AddToLayer')
- for input_index in xrange(len(self.inputs)):
- input_layer = self.get_input_layer(input_index)
- self.set_layer_size(input_layer.size)
+
+ if len(self.inputs) > 1:
+ for input_index in xrange(len(self.inputs)):
+ assert self.get_input_layer(0).height == self.get_input_layer(
+ input_index).height
+ assert self.get_input_layer(0).width == self.get_input_layer(
+ input_index).width
+ assert self.get_input_layer(0).depth == self.get_input_layer(
+ input_index).depth
+
+ self.set_layer_size(self.get_input_layer(0).size)
+ self.set_layer_height_width(self.get_input_layer(0).height, \
+ self.get_input_layer(0).width)
+ self.set_layer_depth(self.get_input_layer(0).depth)
self.create_bias_parameter(bias, self.config.size)
@@ -3370,11 +3434,20 @@ class ConcatenateLayer(LayerBase):
name, 'concat', 0, inputs=inputs, **xargs)
size = 0
for input_index in xrange(len(self.inputs)):
+ assert self.get_input_layer(0).height == self.get_input_layer(
+ input_index).height
+ assert self.get_input_layer(0).width == self.get_input_layer(
+ input_index).width
+ assert self.get_input_layer(0).depth == self.get_input_layer(
+ input_index).depth
input_layer = self.get_input_layer(input_index)
input = self.inputs[input_index]
if self.config.size == 0:
size += input_layer.size
+ self.set_layer_height_width(self.get_input_layer(0).height, \
+ self.get_input_layer(0).width)
+ self.set_layer_depth(self.get_input_layer(0).depth)
self.set_layer_size(size)
diff --git a/python/paddle/trainer_config_helpers/layers.py b/python/paddle/trainer_config_helpers/layers.py
index cba45bd3afa178ab4dd3a50f0947b144e7466e53..dc68c213da66ac680e6b14266cb5038a5ba73ec2 100644
--- a/python/paddle/trainer_config_helpers/layers.py
+++ b/python/paddle/trainer_config_helpers/layers.py
@@ -354,6 +354,10 @@ class LayerOutput(object):
def height(self):
return cp.g_layer_map[self.full_name].height
+ @property
+ def depth(self):
+ return cp.g_layer_map[self.full_name].depth
+
def set_input(self, input):
"""
Set the input for a memory layer. Can only be used for memory layer
@@ -943,7 +947,7 @@ def data_layer(name, size, depth=None, height=None, width=None,
if height is not None and width is not None:
num_filters = size / (width * height * depth)
assert num_filters * width * height * depth == size, \
- "size=%s width=%s height=%s depth=%s" % (size, width, height, depth)
+ "size=%s width=%s height=%s depth=%s" % (size, width, height, depth)
return LayerOutput(name, LayerType.DATA, size=size, num_filters=num_filters)
@@ -2953,13 +2957,15 @@ def img_cmrnorm_layer(input,
def batch_norm_layer(input,
act=None,
name=None,
+ img3D=False,
num_channels=None,
bias_attr=None,
param_attr=None,
layer_attr=None,
batch_norm_type=None,
moving_average_fraction=0.9,
- use_global_stats=None):
+ use_global_stats=None,
+ mean_var_names=None):
"""
Batch Normalization Layer. The notation of this layer as follow.
@@ -3026,6 +3032,8 @@ def batch_norm_layer(input,
:math:`runningMean = newMean*(1-factor)
+ runningMean*factor`
:type moving_average_fraction: float.
+ :param mean_var_names: [mean name, variance name]
+ :type mean_var_names: string list
:return: LayerOutput object.
:rtype: LayerOutput
"""
@@ -3039,6 +3047,7 @@ def batch_norm_layer(input,
(batch_norm_type == "cudnn_batch_norm")
l = Layer(
name=name,
+ img3D=img3D,
inputs=Input(
input.name, image=Image(channels=num_channels), **param_attr.attr),
active_type=act.name,
@@ -3047,6 +3056,7 @@ def batch_norm_layer(input,
bias=ParamAttr.to_bias(bias_attr),
moving_average_fraction=moving_average_fraction,
use_global_stats=use_global_stats,
+ mean_var_names=mean_var_names,
**ExtraLayerAttribute.to_kwargs(layer_attr))
return LayerOutput(
@@ -6410,7 +6420,7 @@ def gated_unit_layer(input,
@wrap_name_default('switch_order')
def switch_order_layer(input,
name=None,
- reshape=None,
+ reshape_axis=None,
act=None,
layer_attr=None):
"""
@@ -6421,8 +6431,9 @@ def switch_order_layer(input,
The example usage is:
.. code-block:: python
+ reshape_axis = 3
+ switch = switch_order(input=layer, name='switch', reshape_axis=reshape_axis)
reshape = {'height':[ 0, 1, 2], 'width':[3]}
- switch = switch_order(input=layer, name='switch', reshape=reshape)
:param input: The input layer.
:type input: LayerOutput
@@ -6434,6 +6445,11 @@ def switch_order_layer(input,
:rtype: LayerOutput
"""
assert isinstance(input, LayerOutput)
+ assert reshape_axis != None and (reshape_axis > 0 and reshape_axis < 4)
+ height = [ele for ele in xrange(reshape_axis)]
+ width = [ele for ele in range(reshape_axis, 4)]
+ reshape = {'height': height, 'width': width}
+
l = Layer(
name=name,
inputs=input.name,
diff --git a/python/paddle/trainer_config_helpers/tests/configs/file_list.sh b/python/paddle/trainer_config_helpers/tests/configs/file_list.sh
index df872a90ff388f0d96cef44763dbd076bc768ab9..8a204a96f3ef57673cef65306d0bf8e8c3409751 100755
--- a/python/paddle/trainer_config_helpers/tests/configs/file_list.sh
+++ b/python/paddle/trainer_config_helpers/tests/configs/file_list.sh
@@ -10,6 +10,6 @@ test_prelu_layer test_row_conv test_detection_output_layer test_multibox_loss_la
test_recursive_topology test_gated_unit_layer test_clip_layer test_row_l2_norm_layer
test_kmax_seq_socre_layer test_sub_nested_seq_select_layer test_scale_shift_layer
test_seq_slice_layer test_cross_entropy_over_beam test_pooling3D_layer
-test_conv3d_layer test_deconv3d_layer)
+test_conv3d_layer test_deconv3d_layer test_BatchNorm3D)
export whole_configs=(test_split_datasource)
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/img_layers.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/img_layers.protostr
index 1a577b8d9b1e1915236ba6afcfa97040d70c707a..5ddf6052df021b055390a42c25ce6c0d650e4aee 100644
--- a/python/paddle/trainer_config_helpers/tests/configs/protostr/img_layers.protostr
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/img_layers.protostr
@@ -62,6 +62,7 @@ layers {
moving_average_fraction: 0.9
height: 227
width: 227
+ depth: 1
}
layers {
name: "__crmnorm_0__"
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/img_trans_layers.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/img_trans_layers.protostr
index 2818389b16cca75f5030b75fc4de8c89c06c5e02..c0252b945b4c7fd6b4dad8770e3e1dccb88df28a 100644
--- a/python/paddle/trainer_config_helpers/tests/configs/protostr/img_trans_layers.protostr
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/img_trans_layers.protostr
@@ -62,6 +62,7 @@ layers {
moving_average_fraction: 0.9
height: 256
width: 256
+ depth: 1
}
layers {
name: "__crmnorm_0__"
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/test_BatchNorm3D.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_BatchNorm3D.protostr
new file mode 100644
index 0000000000000000000000000000000000000000..832ed24a31dd2bedba9a4fce77d7a088d1796fdb
--- /dev/null
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_BatchNorm3D.protostr
@@ -0,0 +1,92 @@
+type: "nn"
+layers {
+ name: "data3D"
+ type: "data"
+ size: 360
+ active_type: ""
+ height: 6
+ width: 20
+ depth: 3
+}
+layers {
+ name: "__batch_norm_0__"
+ type: "batch_norm"
+ size: 360
+ active_type: "relu"
+ inputs {
+ input_layer_name: "data3D"
+ input_parameter_name: "___batch_norm_0__.w0"
+ image_conf {
+ channels: 1
+ img_size: 20
+ img_size_y: 6
+ img_size_z: 3
+ }
+ }
+ inputs {
+ input_layer_name: "data3D"
+ input_parameter_name: "___batch_norm_0__.w1"
+ }
+ inputs {
+ input_layer_name: "data3D"
+ input_parameter_name: "___batch_norm_0__.w2"
+ }
+ bias_parameter_name: "___batch_norm_0__.wbias"
+ moving_average_fraction: 0.9
+ height: 6
+ width: 20
+ depth: 3
+}
+parameters {
+ name: "___batch_norm_0__.w0"
+ size: 1
+ initial_mean: 1.0
+ initial_std: 0.0
+ initial_strategy: 0
+ initial_smart: false
+}
+parameters {
+ name: "___batch_norm_0__.w1"
+ size: 1
+ initial_mean: 0.0
+ initial_std: 0.0
+ dims: 1
+ dims: 1
+ initial_strategy: 0
+ initial_smart: false
+ is_static: true
+ is_shared: true
+}
+parameters {
+ name: "___batch_norm_0__.w2"
+ size: 1
+ initial_mean: 0.0
+ initial_std: 0.0
+ dims: 1
+ dims: 1
+ initial_strategy: 0
+ initial_smart: false
+ is_static: true
+ is_shared: true
+}
+parameters {
+ name: "___batch_norm_0__.wbias"
+ size: 1
+ initial_mean: 0.0
+ initial_std: 0.0
+ dims: 1
+ dims: 1
+ initial_strategy: 0
+ initial_smart: false
+}
+input_layer_names: "data3D"
+output_layer_names: "__batch_norm_0__"
+sub_models {
+ name: "root"
+ layer_names: "data3D"
+ layer_names: "__batch_norm_0__"
+ input_layer_names: "data3D"
+ output_layer_names: "__batch_norm_0__"
+ is_recurrent_layer_group: false
+}
+
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/test_bi_grumemory.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_bi_grumemory.protostr
index b110e91498ce7d112987714bd769868179141c54..8a1399efad0ff339e35f69400ac654a4787a6018 100644
--- a/python/paddle/trainer_config_helpers/tests/configs/protostr/test_bi_grumemory.protostr
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_bi_grumemory.protostr
@@ -74,6 +74,9 @@ layers {
inputs {
input_layer_name: "__bidirectional_gru_0___bw"
}
+ height: 0
+ width: 0
+ depth: 1
}
parameters {
name: "___bidirectional_gru_0___fw_transform.w0"
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/test_recursive_topology.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_recursive_topology.protostr
index 8133aa9c8d3e7c6843d1b27b70e87d394a1e0e47..046037936a6d85f54095c65f206e468aa69065d7 100644
--- a/python/paddle/trainer_config_helpers/tests/configs/protostr/test_recursive_topology.protostr
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/test_recursive_topology.protostr
@@ -16,6 +16,9 @@ layers {
inputs {
input_layer_name: "data"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_1__"
@@ -28,6 +31,9 @@ layers {
inputs {
input_layer_name: "__addto_0__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_2__"
@@ -40,6 +46,9 @@ layers {
inputs {
input_layer_name: "__addto_1__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_3__"
@@ -52,6 +61,9 @@ layers {
inputs {
input_layer_name: "__addto_2__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_4__"
@@ -64,6 +76,9 @@ layers {
inputs {
input_layer_name: "__addto_3__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_5__"
@@ -76,6 +91,9 @@ layers {
inputs {
input_layer_name: "__addto_4__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_6__"
@@ -88,6 +106,9 @@ layers {
inputs {
input_layer_name: "__addto_5__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_7__"
@@ -100,6 +121,9 @@ layers {
inputs {
input_layer_name: "__addto_6__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_8__"
@@ -112,6 +136,9 @@ layers {
inputs {
input_layer_name: "__addto_7__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_9__"
@@ -124,6 +151,9 @@ layers {
inputs {
input_layer_name: "__addto_8__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_10__"
@@ -136,6 +166,9 @@ layers {
inputs {
input_layer_name: "__addto_9__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_11__"
@@ -148,6 +181,9 @@ layers {
inputs {
input_layer_name: "__addto_10__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_12__"
@@ -160,6 +196,9 @@ layers {
inputs {
input_layer_name: "__addto_11__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_13__"
@@ -172,6 +211,9 @@ layers {
inputs {
input_layer_name: "__addto_12__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_14__"
@@ -184,6 +226,9 @@ layers {
inputs {
input_layer_name: "__addto_13__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_15__"
@@ -196,6 +241,9 @@ layers {
inputs {
input_layer_name: "__addto_14__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_16__"
@@ -208,6 +256,9 @@ layers {
inputs {
input_layer_name: "__addto_15__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_17__"
@@ -220,6 +271,9 @@ layers {
inputs {
input_layer_name: "__addto_16__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_18__"
@@ -232,6 +286,9 @@ layers {
inputs {
input_layer_name: "__addto_17__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_19__"
@@ -244,6 +301,9 @@ layers {
inputs {
input_layer_name: "__addto_18__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_20__"
@@ -256,6 +316,9 @@ layers {
inputs {
input_layer_name: "__addto_19__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_21__"
@@ -268,6 +331,9 @@ layers {
inputs {
input_layer_name: "__addto_20__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_22__"
@@ -280,6 +346,9 @@ layers {
inputs {
input_layer_name: "__addto_21__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_23__"
@@ -292,6 +361,9 @@ layers {
inputs {
input_layer_name: "__addto_22__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_24__"
@@ -304,6 +376,9 @@ layers {
inputs {
input_layer_name: "__addto_23__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_25__"
@@ -316,6 +391,9 @@ layers {
inputs {
input_layer_name: "__addto_24__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_26__"
@@ -328,6 +406,9 @@ layers {
inputs {
input_layer_name: "__addto_25__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_27__"
@@ -340,6 +421,9 @@ layers {
inputs {
input_layer_name: "__addto_26__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_28__"
@@ -352,6 +436,9 @@ layers {
inputs {
input_layer_name: "__addto_27__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_29__"
@@ -364,6 +451,9 @@ layers {
inputs {
input_layer_name: "__addto_28__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_30__"
@@ -376,6 +466,9 @@ layers {
inputs {
input_layer_name: "__addto_29__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__addto_31__"
@@ -388,6 +481,9 @@ layers {
inputs {
input_layer_name: "__addto_30__"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__fc_layer_0__"
diff --git a/python/paddle/trainer_config_helpers/tests/configs/protostr/util_layers.protostr b/python/paddle/trainer_config_helpers/tests/configs/protostr/util_layers.protostr
index d0ad388165007b8f96f059e5b003c52f756383e5..7a2f3eab38808a031c27cf7ab9d6273952e389eb 100644
--- a/python/paddle/trainer_config_helpers/tests/configs/protostr/util_layers.protostr
+++ b/python/paddle/trainer_config_helpers/tests/configs/protostr/util_layers.protostr
@@ -22,6 +22,9 @@ layers {
inputs {
input_layer_name: "b"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__concat_0__"
@@ -34,6 +37,9 @@ layers {
inputs {
input_layer_name: "b"
}
+ height: 0
+ width: 0
+ depth: 1
}
layers {
name: "__concat_1__"
diff --git a/python/paddle/trainer_config_helpers/tests/configs/test_BatchNorm3D.py b/python/paddle/trainer_config_helpers/tests/configs/test_BatchNorm3D.py
new file mode 100644
index 0000000000000000000000000000000000000000..a991b22252ba10eed895efd931108c2d8b0e52f1
--- /dev/null
+++ b/python/paddle/trainer_config_helpers/tests/configs/test_BatchNorm3D.py
@@ -0,0 +1,11 @@
+from paddle.trainer_config_helpers import *
+
+settings(batch_size=1000, learning_rate=1e-4)
+
+#data = data_layer(name='data', size=180, width=30, height=6)
+#batchNorm = batch_norm_layer(data, num_channels=1)
+#outputs(batchNorm)
+
+data3D = data_layer(name='data3D', size=120 * 3, width=20, height=6, depth=3)
+batchNorm3D = batch_norm_layer(data3D, num_channels=1, img3D=True)
+outputs(batchNorm3D)
diff --git a/python/paddle/v2/framework/op.py b/python/paddle/v2/framework/op.py
index 78c64e261b7eea07c902743ff86a552c4b7dc355..8109227828bafda85d3a556dda928acd7c1fc94c 100644
--- a/python/paddle/v2/framework/op.py
+++ b/python/paddle/v2/framework/op.py
@@ -142,8 +142,8 @@ def create_op_creation_method(op_proto):
return OpInfo(
method=__impl__,
name=op_proto.type,
- inputs=[var.name for var in op_proto.inputs],
- outputs=[var.name for var in op_proto.outputs],
+ inputs=[(var.name, var.duplicable) for var in op_proto.inputs],
+ outputs=[(var.name, var.duplicable) for var in op_proto.outputs],
attrs=[attr.name for attr in op_proto.attrs])
@@ -180,9 +180,15 @@ class OperatorFactory(object):
return self.op_methods.get(t)
def get_op_input_names(self, type):
+ return map(lambda x: x[0], self.get_op_info(type).inputs)
+
+ def get_op_inputs(self, type):
return self.get_op_info(type).inputs
def get_op_output_names(self, type):
+ return map(lambda x: x[0], self.get_op_info(type).outputs)
+
+ def get_op_outputs(self, type):
return self.get_op_info(type).outputs
def get_op_attr_names(self, type):
diff --git a/python/paddle/v2/framework/tests/CMakeLists.txt b/python/paddle/v2/framework/tests/CMakeLists.txt
index ef910f939be0b9d3cb5e6d49e69a00daa191b1c6..2117fdf0d58520a008d2bd01d56d96dd248be025 100644
--- a/python/paddle/v2/framework/tests/CMakeLists.txt
+++ b/python/paddle/v2/framework/tests/CMakeLists.txt
@@ -33,5 +33,6 @@ py_test(test_sgd_op SRCS test_sgd_op.py)
py_test(test_gradient_checker SRCS test_gradient_checker.py)
py_test(test_lookup_table SRCS test_lookup_table.py)
py_test(test_scale_and_identity_op SRCS test_scale_and_identity_op.py)
+py_test(test_sum_op SRCS test_sum_op.py)
py_test(mnist SRCS mnist.py)
py_test(test_squared_l2_distance_op SRCS test_squared_l2_distance_op.py)
diff --git a/python/paddle/v2/framework/tests/op_test.py b/python/paddle/v2/framework/tests/op_test.py
new file mode 100644
index 0000000000000000000000000000000000000000..3a6a5dca4c4ddc1399d80e491e4072f24707c01e
--- /dev/null
+++ b/python/paddle/v2/framework/tests/op_test.py
@@ -0,0 +1,275 @@
+import unittest
+import numpy as np
+import itertools
+import paddle.v2.framework.core as core
+from paddle.v2.framework.op import Operator
+
+
+def grad_var_name(var_name):
+ return var_name + "@GRAD"
+
+
+def create_op(scope, op_type, inputs, outputs, attrs=None):
+ kwargs = dict()
+
+ for in_name, in_dup in Operator.get_op_inputs(op_type):
+ if in_name in inputs:
+ kwargs[in_name] = []
+ if in_dup:
+ sub_in = inputs[in_name]
+ for sub_in_name in sub_in:
+ var = scope.new_var(sub_in_name)
+ kwargs[in_name].append(sub_in_name)
+ else:
+ var = scope.new_var(in_name)
+ kwargs[in_name].append(in_name)
+
+ for out_name, out_dup in Operator.get_op_outputs(op_type):
+ if out_name in outputs:
+ kwargs[out_name] = []
+ if out_dup:
+ sub_in = outputs[out_name]
+ for sun_in_name in sub_in:
+ var = scope.new_var(sun_in_name)
+ kwargs[out_name].append(sun_in_name)
+ else:
+ var = scope.new_var(out_name)
+ kwargs[out_name].append(out_name)
+
+ for attr_name in Operator.get_op_attr_names(op_type):
+ kwargs[attr_name] = attrs[attr_name]
+ return Operator(op_type, **kwargs)
+
+
+def set_input(scope, op, inputs, place):
+ for in_name, in_dup in Operator.get_op_inputs(op.type()):
+ if in_name in inputs:
+ if in_dup:
+ sub_in = inputs[in_name]
+ for sub_in_name in sub_in:
+ var = scope.find_var(sub_in_name)
+ tensor = var.get_tensor()
+ arr = sub_in[sub_in_name]
+ tensor.set_dims(arr.shape)
+ tensor.set(arr, place)
+ else:
+ var = scope.find_var(in_name)
+ tensor = var.get_tensor()
+ arr = inputs[in_name]
+ tensor.set_dims(arr.shape)
+ tensor.set(arr, place)
+
+
+def set_output_grad(scope, op, outputs, place):
+ for out_name, out_dup in Operator.get_op_outputs(op.type()):
+ if out_name in outputs:
+ if out_dup:
+ sub_out = outputs[out_name]
+ for sub_out_name in sub_out:
+ out_tensor = scope.find_var(sub_out_name).get_tensor()
+ grad_tensor = scope.new_var(grad_var_name(
+ sub_out_name)).get_tensor()
+ grad_tensor.set_dims(out_tensor.shape())
+ data = np.ones(out_tensor.shape(), dtype=np.float32)
+ grad_tensor.set(data, place)
+ else:
+ out_tensor = scope.find_var(out_name).get_tensor()
+ grad_tensor = scope.new_var(grad_var_name(out_name)).get_tensor(
+ )
+ grad_tensor.set_dims(out_tensor.shape())
+ data = np.ones(out_tensor.shape(), dtype=np.float32)
+ grad_tensor.set(data, place)
+
+
+def get_numeric_gradient(scope,
+ op,
+ inputs,
+ input_to_check,
+ output_name,
+ delta=0.005,
+ in_place=False):
+
+ set_input(scope, op, inputs, core.CPUPlace())
+ op.infer_shape(scope)
+
+ tensor_to_check = scope.find_var(input_to_check).get_tensor()
+
+ def product(dim):
+ return reduce(lambda a, b: a * b, dim, 1)
+
+ ctx = core.DeviceContext.create(core.CPUPlace())
+
+ def get_output():
+ op.run(scope, ctx)
+ return np.array(scope.find_var(output_name).get_tensor()).sum()
+
+ tensor_to_check = scope.find_var(input_to_check).get_tensor()
+ tensor_size = product(tensor_to_check.get_dims())
+ gradient_flat = np.zeros(shape=(tensor_size, ), dtype='float32')
+ # we only compute gradient of one element each time.
+ # we use a for loop to compute the gradient of every element.
+ for i in xrange(tensor_size):
+ if in_place:
+ set_input(op, inputs, core.CPUPlace())
+
+ # get one input element throw it's index i.
+ origin = tensor_to_check.get_float_element(i)
+ # add delta to it, run op and then get the sum of the result tensor.
+ x_pos = origin + delta
+ tensor_to_check.set_float_element(i, x_pos)
+ y_pos = get_output()
+
+ if in_place:
+ set_input(op, inputs, core.CPUPlace())
+
+ x_neg = origin - delta
+ tensor_to_check.set_float_element(i, x_neg)
+ y_neg = get_output()
+
+ tensor_to_check.set_float_element(i, origin)
+ gradient_flat[i] = (y_pos - y_neg) / delta / 2
+
+ return gradient_flat.reshape(tensor_to_check.get_dims())
+
+
+def get_backward_op(scope, op, no_grad_set):
+ backward_op = core.Operator.backward(op, no_grad_set)
+ for input in backward_op.input_vars():
+ var = scope.new_var(input)
+ var.get_tensor()
+ for output in backward_op.output_vars():
+ var = scope.new_var(output)
+ var.get_tensor()
+ return backward_op
+
+
+def get_gradient(scope, op, inputs, outputs, grad_name, place,
+ no_grad_set=None):
+ ctx = core.DeviceContext.create(place)
+
+ set_input(scope, op, inputs, place)
+
+ op.infer_shape(scope)
+ op.run(scope, ctx)
+
+ if no_grad_set is None:
+ no_grad_set = set()
+
+ backward_op = get_backward_op(scope, op, no_grad_set)
+ set_output_grad(scope, op, outputs, place)
+
+ backward_op.infer_shape(scope)
+ backward_op.run(scope, ctx)
+
+ out = np.array(scope.find_var(grad_name).get_tensor())
+ return out
+
+
+class OpTest(unittest.TestCase):
+ def check_output_with_place(self, place):
+ self.scope = core.Scope()
+ self.op = create_op(self.scope, self.op_type, self.inputs, self.outputs)
+ if isinstance(place, core.GPUPlace) and not self.op.support_gpu():
+ return
+ set_input(self.scope, self.op, self.inputs, place)
+ self.op.infer_shape(self.scope)
+ ctx = core.DeviceContext.create(place)
+ self.op.run(self.scope, ctx)
+
+ for out_name, out_dup in Operator.get_op_outputs(self.op.type()):
+ if out_dup:
+ sub_out = self.outputs[out_name]
+ for sub_out_name in sub_out:
+ actual = np.array(
+ self.scope.find_var(sub_out_name).get_tensor())
+ expect = sub_out[sub_out_name]
+ self.assertTrue(
+ np.allclose(
+ actual, expect, atol=1e-05),
+ "output name: " + out_name + "has diff")
+ else:
+ actual = np.array(self.scope.find_var(out_name).get_tensor())
+ expect = self.outputs[out_name]
+ self.assertTrue(
+ np.allclose(
+ actual, expect, atol=1e-05),
+ "output name: " + out_name + "has diff")
+
+ def check_output(self):
+ places = [core.CPUPlace()]
+ if core.is_compile_gpu():
+ places.append(core.GPUPlace(0))
+ for place in places:
+ self.check_output_with_place(place)
+
+ def __assert_is_close(self, numeric_grads, analytic_grads, names,
+ max_relative_error, msg_prefix):
+
+ for a, b, name in itertools.izip(numeric_grads, analytic_grads, names):
+ abs_a = np.abs(a)
+ abs_a[abs_a < 1e-3] = 1
+
+ diff_mat = np.abs(a - b) / abs_a
+ max_diff = np.max(diff_mat)
+
+ def err_msg():
+ offset = np.argmax(diff_mat > max_relative_error)
+ return "%s Variable %s max gradient diff %f over limit %f, the first " \
+ "error element is %d" % (
+ msg_prefix, name, max_diff, max_relative_error, offset)
+
+ self.assertLessEqual(max_diff, max_relative_error, err_msg())
+
+ def check_grad(self,
+ inputs_to_check,
+ output_name,
+ no_grad_set=None,
+ in_place=False,
+ max_relative_error=0.005):
+ self.scope = core.Scope()
+ self.op = create_op(self.scope, self.op_type, self.inputs, self.outputs)
+ if no_grad_set is None:
+ no_grad_set = set()
+
+ numeric_grads = [
+ get_numeric_gradient(
+ self.scope,
+ self.op,
+ self.inputs,
+ input_to_check,
+ output_name,
+ in_place=in_place) for input_to_check in inputs_to_check
+ ]
+ grad_names = [
+ grad_var_name(input_to_check) for input_to_check in inputs_to_check
+ ]
+
+ cpu_place = core.CPUPlace()
+ cpu_analytic_grads = [
+ get_gradient(self.scope, self.op, self.inputs, self.outputs,
+ grad_name, cpu_place, no_grad_set)
+ for grad_name in grad_names
+ ]
+
+ self.__assert_is_close(numeric_grads, cpu_analytic_grads, grad_names,
+ max_relative_error,
+ "Gradient Check On %s" % str(cpu_place))
+
+ if core.is_compile_gpu() and self.op.support_gpu():
+ gpu_place = core.GPUPlace(0)
+ gpu_analytic_grads = [
+ get_gradient(self.scope, self.op, self.inputs, self.outputs,
+ grad_name, gpu_place, no_grad_set)
+ for grad_name in grad_names
+ ]
+
+ self.__assert_is_close(numeric_grads, gpu_analytic_grads,
+ grad_names, max_relative_error,
+ "Gradient Check On %s" % str(gpu_place))
+
+ for c_grad, g_grad, name in itertools.izip(
+ cpu_analytic_grads, gpu_analytic_grads, grad_names):
+ self.assertTrue(
+ np.allclose(
+ c_grad, g_grad, atol=1e-4),
+ "output name: " + name + " has diff")
diff --git a/python/paddle/v2/framework/tests/test_cross_entropy_op.py b/python/paddle/v2/framework/tests/test_cross_entropy_op.py
index d4277f2a42ce2e66e37405ccd3b2ee444d403d1a..fb6a440e23c26d1766bdf1fc5f24217afe1150f8 100644
--- a/python/paddle/v2/framework/tests/test_cross_entropy_op.py
+++ b/python/paddle/v2/framework/tests/test_cross_entropy_op.py
@@ -1,36 +1,27 @@
import unittest
import numpy
-from op_test_util import OpTestMeta
-from gradient_checker import GradientChecker, create_op
+from op_test import OpTest
-class TestCrossEntropy(unittest.TestCase):
- __metaclass__ = OpTestMeta
-
+class TestCrossEntropy(OpTest):
def setUp(self):
- self.type = "onehot_cross_entropy"
+ self.op_type = "onehot_cross_entropy"
batch_size = 30
class_num = 10
- X = numpy.random.random((batch_size, class_num)).astype("float32")
- label = 5 * numpy.ones(batch_size).astype("int32")
+ X = numpy.random.uniform(0.1, 1.0,
+ [batch_size, class_num]).astype("float32")
+ label = (class_num / 2) * numpy.ones(batch_size).astype("int32")
self.inputs = {'X': X, 'label': label}
Y = []
for i in range(0, batch_size):
Y.append(-numpy.log(X[i][label[i]]))
self.outputs = {'Y': numpy.array(Y).astype("float32")}
+ def test_check_output(self):
+ self.check_output()
-class CrossEntropyGradOpTest(GradientChecker):
def test_check_grad(self):
- op = create_op("onehot_cross_entropy")
- batch_size = 30
- class_num = 10
- inputs = {
- "X": numpy.random.uniform(
- 0.1, 1.0, [batch_size, class_num]).astype("float32"),
- "label": (class_num / 2) * numpy.ones(batch_size).astype("int32")
- }
- self.check_grad(op, inputs, set("X"), "Y")
+ self.check_grad(["X"], "Y")
if __name__ == "__main__":
diff --git a/python/paddle/v2/framework/tests/test_lookup_table.py b/python/paddle/v2/framework/tests/test_lookup_table.py
index 19eb464baa555fb67a994f3cfb4d3ed628367c73..4b7ce92c0f0492a73c158378299933a0b329948b 100644
--- a/python/paddle/v2/framework/tests/test_lookup_table.py
+++ b/python/paddle/v2/framework/tests/test_lookup_table.py
@@ -4,7 +4,7 @@ from op_test_util import OpTestMeta
from gradient_checker import GradientChecker, create_op
-class TestSigmoidOp(unittest.TestCase):
+class TestLookupTableOp(unittest.TestCase):
__metaclass__ = OpTestMeta
def setUp(self):
@@ -15,7 +15,7 @@ class TestSigmoidOp(unittest.TestCase):
self.outputs = {'Out': table[ids]}
-class TestSigmoidGradOp(GradientChecker):
+class TestLookupTableGradOp(GradientChecker):
def test_grad(self):
op = create_op('lookup_table')
table = np.random.random((17, 31)).astype('float32')
diff --git a/python/paddle/v2/framework/tests/test_sigmoid_op.py b/python/paddle/v2/framework/tests/test_sigmoid_op.py
index 273c2e5ab1a84d12621fe9568c4cf22073b6aed4..2316e49eff7bb1cdb53acb3889a6ef05060b59f3 100644
--- a/python/paddle/v2/framework/tests/test_sigmoid_op.py
+++ b/python/paddle/v2/framework/tests/test_sigmoid_op.py
@@ -1,27 +1,21 @@
import unittest
import numpy as np
-from op_test_util import OpTestMeta
-from gradient_checker import GradientChecker, create_op
+from op_test import OpTest
-class TestSigmoidOp(unittest.TestCase):
- __metaclass__ = OpTestMeta
-
+class TestSigmoid(OpTest):
def setUp(self):
- self.type = "sigmoid"
- self.inputs = {'X': np.random.random((15, 31)).astype("float32")}
+ self.op_type = "sigmoid"
+ self.inputs = {
+ 'X': np.random.uniform(0.1, 1, [11, 17]).astype("float32")
+ }
self.outputs = {'Y': 1 / (1 + np.exp(-self.inputs['X']))}
+ def test_check_output(self):
+ self.check_output()
-class TestSigmoidGradOp(GradientChecker):
- def test_grad(self):
- op = create_op("sigmoid")
- inputs = {"X": np.random.uniform(0.1, 1, [11, 17]).astype("float32")}
- # compare gpu and cpu results for backward op.
- # this test will be skiped if only compiling CPU version.
- self.compare_grad(op, inputs)
- # check gradients
- self.check_grad(op, inputs, set("X"), "Y", max_relative_error=0.007)
+ def test_check_grad(self):
+ self.check_grad(["X"], "Y", max_relative_error=0.007)
if __name__ == '__main__':
diff --git a/python/paddle/v2/framework/tests/test_sum_op.py b/python/paddle/v2/framework/tests/test_sum_op.py
new file mode 100644
index 0000000000000000000000000000000000000000..66417d70e81186465e6f59a17fb62255afeddea5
--- /dev/null
+++ b/python/paddle/v2/framework/tests/test_sum_op.py
@@ -0,0 +1,24 @@
+import unittest
+import numpy as np
+from op_test import OpTest
+
+
+class TestSumOp(OpTest):
+ def setUp(self):
+ self.op_type = "sum"
+ x0 = np.random.random((3, 4)).astype('float32')
+ x1 = np.random.random((3, 4)).astype('float32')
+ x2 = np.random.random((3, 4)).astype('float32')
+ self.inputs = {"X": {"x0": x0, "x1": x1, "x2": x2}}
+ y = x0 + x1 + x2
+ self.outputs = {'Out': y}
+
+ def test_check_output(self):
+ self.check_output()
+
+ def test_check_grad(self):
+ self.check_grad(["x0"], "Out")
+
+
+if __name__ == '__main__':
+ unittest.main()