diff --git a/paddle/framework/backward.md b/paddle/framework/backward.md index b4205fed2e6e98282d60a25a83e5b2e2c5d0cfce..133b17c7be811f26a63b245816f382432b2af0f5 100644 --- a/paddle/framework/backward.md +++ b/paddle/framework/backward.md @@ -2,32 +2,24 @@ ## Motivation -In Neural Network, the backpropagation algorithm follows the chain rule, so we need to compound the fundmental gradient operators/expressions together with chain rule . Every forward network need a backward network to construct the full computation lineage, the operator/ expression's Backward feature will generate the backward pass respect to forward pass. - +In Neural Network, the backpropagation algorithm follows the chain rule, so we need to compound the fundmental gradient operators/expressions together with chain rule . Every forward network need a backward network to construct the full computation lineage, the operator/expression's backward pass will be generated respect to forward pass. + ## Backward Operator Registry -A backward network is built up with several backward operators. Backward operators take forward operators' inputs, outputs and output gradients and then calculate its input gradients. In most cases, there is a one-to-one correspondence between forward and backward operators. We use registry mechanism to save these correspondences, which is quite similar with operator registry itself. +A backward network is built up with several backward operators. Backward operators take forward operators' inputs, outputs and output gradients and then calculate its input gradients. In most cases, there is a one-to-one correspondence between forward and backward operators. We use registry mechanism to save these correspondences. For example, we have got a `add_two_op`, and is registered by the following code: ```cpp -REGISTER_OP(add_two, AddTwoOp, AddTwoOpMaker); +REGISTER_OP(add_two, AddTwoOp, AddTwoOpMaker, add_two_grad, AddTwoGradOp); ``` `add_two` is the operator's type. `AddTwoOp` and `AddTwoOpMaker` are the operator class and the operator maker class respectively. -Assume that we have also got the backward operator of `add_two_op`, which calculating the gradients of `add_two_op`'s inputs. Then we register it by the following way: - -```cpp -REGISTER_GRADIENT_OP(add_two, add_two_grad, AddTwoGradOp); -``` - `add_two_grad` is the type of backward operator, and `AddTwoGradOp` is its class name. ## Backward Opeartor Creating -### Usage - Given a certain forward operator, we can get its corresponding backward opeartor by calling: ```cpp @@ -36,13 +28,13 @@ OperatorBase* bwd_op = BuildGradOp(const OperatorBase* fwd_op); The function `BuildGradOp` will sequentially execute following processes: -1. Getting the `type_` of given forward operator, and then creating the corresponding backward operator. +1. Get the `type_` of given forward operator, and then get the corresponding backward operator's type by looking up the `OpInfoMap`. -2. Copying all the attributes of forward operator expect `input_format` and `output_format`(if it has), for their elements differ between forward and backward operators. +2. Build two maps named `inputs` and `outputs` to temporary storage backward operator's inputs and outputs. Copy forward operator's `inputs_` and `outputs_` to map `inputs`, except these are not necessary for gradient computing. -3. Copying forward operator's `inputs_` and `outputs_` to backward operator's `inputs_`. And adding forward inputs' gradient variables into backward `output_`, adding forward outputs' gradient variables into backward `input_`. +3. Add forward inputs' gradient variables into map `output`, adding forward outputs' gradient variables into map `input`. -4. Building backward operator's `input_format`, `output_format` (if necessary) and `in_out_idxs_` according to its `inputs_` and `outputs_` just created. +4. Building backward operator with `inputs`, `outputs` and forward operator's attributes. ## Backward Network Building