diff --git a/doc/design/register_grad_op.md b/doc/design/register_grad_op.md index 9f1ce4bae7b393cb9f04909e5e4917b8d660771c..8d973eb53178c3e889c845144553a453e11f067c 100644 --- a/doc/design/register_grad_op.md +++ b/doc/design/register_grad_op.md @@ -3,17 +3,17 @@ ## The Problem Posed -Currently, for each C++ operator class definition, there registers a *gradient operator creator* function, which takes a C++ operator instance and returns the corresponding gradient operator instance. +Currently, for each C++ operator class definition, a *gradient operator creator* function is registered, which takes as input a C++ operator instance and returns the corresponding gradient operator instance. -However, we noticed two problems with the current deisgn: +However, we noticed two problems with the current design: -1. As we decided to separate the *compilation* and *execution* phases, we need to change the creator to take an `OpDesc` protobuf message in a `ProgramDesc` and inserts corresponding `OpDesc` messages into the `ProgramDesc` message. +1. As we decided to separate the *compilation* and the *execution* phases, we need to change the creator to take an `OpDesc` protobuf message in a `ProgramDesc` and inserts corresponding `OpDesc` messages into the `ProgramDesc` message. -1. Some operator's gradient computation requires more than one gradient operators. For example, the gradient of *minus* consists of two operators -- an identity operaotr and a scale operator. So we need to make the registration mechanism to support the mapping from an operator to a set of operators for gradient computation. +1. For some operators, the gradient computation can be written in terms of existing operators. For example, the gradient of *minus* operator consists of two operators -- an *identity* operator followed by a *scale* operator. Hence the registration mechanism needs to support mapping from an operator to a set of operators for the gradient computation. ## The Current Implementation -The C++ class `OpInfos` store in a association map which key is the operator type. The `grad_op_type` indicate associated gradient operator type. Operator can create gradient operator by `OpInfo::creator_` of gradient. The pseudo code is +Instances of the C++ class `OpInfo` are stored an associative map whose key is the operator type. The `grad_op_type` indicates the associated gradient operator type. An operator can create the gradient operator by invoking `OpInfo::creator_` of the gradient operator. The pseudo code is as follows ```cpp struct OpInfo { @@ -31,16 +31,16 @@ OperatorBase* CreateGradientOperator(const OperatorBase& op) { ## Proposed Solution -The mapping relationship between an operator and its gradient operators is a function. The interface of that function is: +The mapping relationship between an operator and its gradient operators is a function. The interface of this function is: ```cpp // (OpDesc) --> vector std::function(const OpDescBind&)>; ``` -The function takes an `OpDescBind` of the forward operator and returns one or many gradient operator descriptions. `OpDescBind` is a C++ wrapper for protobuf message `OpDesc` to manipulate `OpDesc` fast. +The function takes an `OpDescBind` of the forward operator and returns one or many gradient operator descriptions. `OpDescBind` is a C++ wrapper for the protobuf message `OpDesc` for rapid manipulation of `OpDesc`. -The `GradOpDescMaker` will be registered in `OpInfo`, to replace `grad_op_type_` field. The `OpInfo` should be +The `GradOpDescMaker` will be registered in `OpInfo` and will replace the `grad_op_type_` field. The `OpInfo` should look like ```cpp struct OpInfo { @@ -49,7 +49,7 @@ struct OpInfo { }; ``` -The `grad_op_maker_ ` is `nullptr` if the operator does not have associated gradient operators. +The `grad_op_maker_ ` is a `nullptr` if the operator does not have any associated gradient operators. We propose a base class called `GradOpDescMakerBase` to let operator developers generate `Gradient Operators` easily. The public interface of that class is @@ -74,7 +74,7 @@ func = [] (const OpDescBind& fwd_op) { We can write many helper functions since the `GradOpDescMakerBase` is a class now. The basic helper functions get the variables of `Input`, `Output`, `InputGradient` and `OutputGradient` in the forwarding operator. -We should chagne register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So `REGISTER_OP` just register one operator. If the `REGISTER_OPERATOR ` contains `OpProtoAndCheckerMaker` and `GradOpDescMaker`, we just list them in the same macro. It can be done by a macro contains `__VA_ARGS__`. +We should change register macros at the same time. In the current solution, there is no difference between forwarding operators and backward operators. So `REGISTER_OP` just register one operator. If the `REGISTER_OPERATOR ` contains `OpProtoAndCheckerMaker` and `GradOpDescMaker`, we just list them in the same macro. It can be done by a macro contains `__VA_ARGS__`. The user interface should be diff --git a/paddle/operators/elementwise_op_function.h b/paddle/operators/elementwise_op_function.h index 3eb97f60b59848d23bcd15ea1e3d2f21b721f6a4..488a35aafc8600bb8bb252fc3a5161c72a2f6df1 100644 --- a/paddle/operators/elementwise_op_function.h +++ b/paddle/operators/elementwise_op_function.h @@ -108,7 +108,7 @@ void ElementwiseCompute(const framework::ExecutionContext& ctx) { PADDLE_ENFORCE_GE(x_dims.size(), y_dims.size(), "Rank of first input must >= rank of second input.") - if (x_dims == y_dims || product(y_dims) == 1) { + if (x_dims == y_dims) { functor f; f.template Run(x, y, z, ctx); return; @@ -174,12 +174,6 @@ void ElementwiseGradCompute(const framework::ExecutionContext& ctx) { return; } - if (product(y_dims) == 1) { - functor1 f; - f(place, x, y, out, dx, dy, dout); - return; - } - int axis = ctx.Attr("axis"); axis = (axis == -1 ? x_dims.size() - y_dims.size() : axis); diff --git a/python/paddle/v2/framework/tests/test_elementwise_add_op.py b/python/paddle/v2/framework/tests/test_elementwise_add_op.py index f3101a709b8bcf58e8682ab3d0ca5217a7f3572d..57daddd5698f77527bc5b78c436065a851867ae0 100644 --- a/python/paddle/v2/framework/tests/test_elementwise_add_op.py +++ b/python/paddle/v2/framework/tests/test_elementwise_add_op.py @@ -92,5 +92,33 @@ class TestElementwiseAddOp_broadcast_3(TestElementwiseOp): } +class TestElementwiseAddOp_rowwise_add_0(TestElementwiseOp): + def setUp(self): + self.op_type = "elementwise_add" + self.inputs = { + 'X': np.random.rand(2, 3, 4).astype(np.float32), + 'Y': np.random.rand(3, 4).astype(np.float32) + } + + self.attrs = {'axis': 1} + self.outputs = { + 'Out': self.inputs['X'] + self.inputs['Y'].reshape(1, 3, 4) + } + + +class TestElementwiseAddOp_rowwise_add_1(TestElementwiseOp): + def setUp(self): + self.op_type = "elementwise_add" + self.inputs = { + 'X': np.random.rand(2, 1).astype(np.float32), + 'Y': np.random.rand(1).astype(np.float32) + } + + self.attrs = {'axis': 1} + self.outputs = { + 'Out': self.inputs['X'] + self.inputs['Y'].reshape(1, 1) + } + + if __name__ == '__main__': unittest.main()