diff --git a/doc/howto/dev/new_op_cn.md b/doc/howto/dev/new_op_cn.md
index 58665e9f2b6299ec3959ed6858ab01d459f64dd8..e3892849abe21fc207d2fcbe4adc65184ba771f4 100644
--- a/doc/howto/dev/new_op_cn.md
+++ b/doc/howto/dev/new_op_cn.md
@@ -262,7 +262,7 @@ MulOp(const std::string &type, const framework::VariableNameMap &inputs,
 
  - 生成库
 
-   无需修改 [`paddle/pybind/CMakeLists.txt`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/pybind/CMakeLists.txt)文件,`paddle/operators` 目录下新增的 `*_op.cc` 文件会被自动添加链接到生成的lib库中。
+   `paddle/operators` 目录下新增的 `*_op.cc` 文件会被自动添加链接到生成的lib库中。
 
 ## 实现单元测试
 
@@ -354,11 +354,7 @@ class TestMulGradOp(GradientChecker):
 
 ### 编译和执行单元测试
 
-单元测试编写完成之后,在[`python/paddle/v2/framework/tests/CMakeLists.txt`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/framework/tests/CMakeLists.txt)中添加以下内容,将单元测试加入工程:
-
-```
-py_test(test_mul_op SRCS test_mul_op.py)
-```
+`python/paddle/v2/framework/tests` 目录下新增的 `test_*.py` 单元测试会被自动加入工程进行编译。
 
 请注意,**不同于Op的编译测试,运行单元测试测时需要编译整个工程**,并且编译时需要打开`WITH_TESTING`, 即`cmake paddle_dir -DWITH_TESTING=ON`。编译成功后,执行下面的命令来运行单元测试:
 
diff --git a/paddle/framework/backward.md b/paddle/framework/backward.md
index c762811dfc190b255e0a3389885a081ce8315caf..0a6d762bc8be5201ac196b4bc6107c06d07a31d7 100644
--- a/paddle/framework/backward.md
+++ b/paddle/framework/backward.md
@@ -2,11 +2,22 @@
 
 ## Motivation
 
-In Neural Network, the backpropagation algorithm follows the chain rule, so we need to compound the gradient operators/expressions together with the chain rule. Every forward network needs a backward network to construct the full computation graph, the operator/expression's backward pass will be generated respect to forward pass.
+In neural networks, many models are currently trained with the backpropagation algorithm (known as BP). Technically, it calculates the gradient of the loss function and then propagates it back through the network. Because backpropagation follows the chain rule, we need a module that chains the gradient operators/expressions together to construct the backward pass. Every forward network needs a backward network to construct the full computation graph; the backward pass of each operator/expression is generated with respect to its forward pass.
 
-## Backward Operator Registry
+## Implementation
 
-A backward network is built up with several backward operators. Backward operators take forward operators' inputs outputs, and output gradients and then calculate its input gradients.
+In this design doc, we export only one API for generating the backward pass.
+
+```c++
+std::unique_ptr<OperatorBase> Backward(const OperatorBase& forwardOp,
+    const std::unordered_set<std::string>& no_grad_vars);
+```
+
+The implementation behind it can be divided into two parts, **Backward Operator Creating** and **Backward Network Building**.
+
+### Backward Operator Registry
+
+A backward network is built up from several backward operators. A backward operator takes its forward operator's inputs, outputs, and output gradients, and then calculates the input gradients.
 
 |                        | forward operator | backward operator 
 | ---------------------- | ---------------- |------------------------- |		
@@ -25,7 +36,7 @@ REGISTER_OP(mul, MulOp, MulOpMaker, mul_grad, MulOpGrad);
 
 `mul_grad` is the type of backward operator, and `MulOpGrad` is its class name.
 
-## Backward Opeartor Creating
+### Backward Operator Creating
 
 Given a certain forward operator, we can get its corresponding backward operator by calling:
 
@@ -43,40 +54,47 @@ The function `BuildGradOp` will sequentially execute following processes:
 
 4. Building backward operator with `inputs`, `outputs` and forward operator's attributes.
 
-## Backward Network Building
-
-A backward network is a series of backward operators. The main idea of building a backward network is creating backward operators in the inverted sequence and put them together.
+### Backward Network Building
 
-In our design, the network itself is also a kind of operator. So the operators contained by a big network may be some small network. 
-
-given a forward network, it generates the backward network. We only care about the Gradients—`OutputGradients`, `InputGradients`.
+A backward network is a series of backward operators. The main idea of building a backward network is to create backward operators in the inverted sequence and append them together one by one. Some corner cases need special handling.
 
 1. Op 
 
-   when the input forward network is an Op, return its gradient Operator Immediately.
+   When the input forward network is an Op, return its gradient operator immediately. If all of its outputs are in the no-gradient set, then return a special `NOP`.
 
 2. NetOp 
 
-   when the input forward network is a NetOp, it needs to call the sub NetOp/Operators backward function recursively. During the process, we need to collect the `OutputGradients` name according to the forward NetOp.
+   In our design, the network itself is also a kind of operator (**NetOp**), so the operators contained in a big network may themselves be smaller networks. When the input forward network is a NetOp, it needs to call the backward function of its sub NetOps/Operators recursively. During the process, we need to collect the `OutputGradients` names according to the forward NetOp.
+
+3. RnnOp
+
+   RnnOp is a nested stepnet operator. The backward module needs to recursively call `Backward` for every stepnet.
+
+4. Sharing Variables
+
+   **Sharing variables**. As illustrated in the pictures, two operators share the same variable name `W@GRAD`, so their output gradients overwrite the shared input variable.
+
+
+
 
-   **shared variable**. As illustrated in the pictures, two operator's `Output` `Gradient` will overwrite their shared input variable.  
+	pic 1. Sharing variables in operators. 
 
-   
-   
+
 
-   1. Shared variable in operators. 
+	Sharing a variable between operators, or using the same input variable in multiple operators, leads to a duplicate gradient variable. As the demo above shows, we need to rename the gradient names recursively and add a generic add operator to replace the overwriting links.
 
-   
+
+
 
-   Share variable between operators or same input variable used in multiple operators leads to a duplicate gradient variable. As demo show above, we need to rename gradient name recursively and add a generic add operator replace the overwrite links. 
+	pic 2. Replace the shared variable's gradient with an `Add` operator.
 
-   
-   
+
 
-   2. Replace shared variable's gradient with `Add` operator.
+	Because our framework finds variables according to their names, we need to rename the output links. We add a numeric suffix to represent each link's position in clockwise order.
 
-   
+5. Part of Gradient is Zero.
 
+   In the whole graph, there are cases where one operator's gradient is not needed, but its input's gradient is a dependency of another operator. In that case, we need to fill in a gradient matrix of the same shape at that position. In our implementation, we insert a special `fillZeroLike` operator.
 
 
-	Then collect the sub graph `OutputGradients`/`InputGradients` as the NetOp's and return it.
+Following the rules above, we then collect the sub graph's `OutputGradients`/`InputGradients` as the NetOp's and return it.
diff --git a/paddle/framework/images/duplicate_op2.graffle b/paddle/framework/images/duplicate_op2.graffle
index ede3bca30ae17d5af52505fd94dc2f79b23b57e0..5cec3bc64dbd44dc99e348485969f29bd128ceb1 100644
Binary files a/paddle/framework/images/duplicate_op2.graffle and b/paddle/framework/images/duplicate_op2.graffle differ
diff --git a/paddle/framework/images/duplicate_op2.png b/paddle/framework/images/duplicate_op2.png
index 4e872dc2caf3b0cbd0d5176f11a14801b538dc86..21cdd5cabf1b5203e1435a75b57770d2f702fa92 100644
Binary files a/paddle/framework/images/duplicate_op2.png and b/paddle/framework/images/duplicate_op2.png differ
diff --git a/paddle/gserver/layers/SwitchOrderLayer.cpp b/paddle/gserver/layers/SwitchOrderLayer.cpp
index d7eee6eaf078dab8d48adc4c7ee758a433672ac6..e97809141a93106f9e6ebaf40c7e8aa9c6010557 100644
--- a/paddle/gserver/layers/SwitchOrderLayer.cpp
+++ b/paddle/gserver/layers/SwitchOrderLayer.cpp
@@ -83,8 +83,7 @@ void SwitchOrderLayer::forward(PassType passType) {
   setOutDims();
   resetOutput(outDims_[0], outDims_[1] * outDims_[2] * outDims_[3]);
   if (heightAxis_.size() > 0) {
-    getOutputValue()->reshape(reshapeHeight_, reshapeWidth_);
-    getOutputGrad()->reshape(reshapeHeight_, reshapeWidth_);
+    resetOutput(reshapeHeight_, reshapeWidth_);
   }
 
   // switch NCHW to NHWC
diff --git a/paddle/operators/name_convention.md b/paddle/operators/name_convention.md
new file mode 100644
index 0000000000000000000000000000000000000000..a090e0b5450509affdd739f63df618595f204f97
--- /dev/null
+++ b/paddle/operators/name_convention.md
@@ -0,0 +1,59 @@
+## Operator's Parameter Name Convention
+
+To make the operator documentation clearer, we recommend that operator parameter names obey the conventions listed below.
+
+### OpProtoMaker names
+
+When defining an operator in Paddle, a corresponding [OpProtoMaker](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/operator.h#L170) (TODO: OpProtoMaker Doc) needs to be defined. All Inputs/Outputs and Attributes will be written into the [OpProto](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L61), and will be used in the client language to create the operator.
+
+- Input/Output.
+  - Input/Output names follow **CamelCase**, e.g. `X`, `Y`, `Matrix`, `LastAxisInMatrix`. Inputs/Outputs are much like Variables, so we prefer meaningful English words.
+  - If an operator's Inputs/Outputs are tensors in the mathematical sense and do not match any meaningful word, input names should start from `X`, e.g. `X`, `Y`, and output names should start from `Out`, e.g. `Out`. This rule is intended to unify operators that have few inputs/outputs.
+
+- Attribute.
+  - Attribute names follow **camelCase**, e.g. `x`, `y`, `axis`, `rowwiseMatrix`. Attribute names should also be meaningful English words.
+
+- Comments.
+  - Input/Output/Attr comments follow the format **(type, default value) usage**, stating which type the parameter can be and how it will be used in the operator, e.g. the attribute `"gamma"` in Accumulator: `(float, default 1.0) Accumulation multiplier`.
+  - Operator comments use the format `R"DOC(your comment here)DOC"`. You should explain the inputs/outputs of the operator first. If the operator performs a math calculation, you should write the equation in the comment, e.g. `Out = X + Y`.
+
+- Order.
+  - Follow the order of Input/Output, then Attribute, then Comments. See the example in best practice.
+
+### Best Practice
+
+Here we give some examples to show how these rules will be used.
+
+- The operator has one input and one output, e.g. `relu`, inputs: `X`, outputs: `Out`.
+
+- The operator has two inputs and one output, e.g. `rowwise_add`, inputs: `X`, `Y`, outputs: `Out`.
+
+- The operator contains an attribute, e.g. `cosine`, inputs: `X`, attributes: `axis`, outputs: `Out`.
+
+  We give a full example of the Accumulator operator.
+
+```c++
+class AccumulateOpMaker : public framework::OpProtoAndCheckerMaker {
+public:
+  AccumulateOpMaker(framework::OpProto *proto,
+                            framework::OpAttrChecker *op_checker)
+    : OpProtoAndCheckerMaker(proto, op_checker) {
+    AddInput("X", "(Tensor) The input tensor that has to be accumulated to the output tensor. If the output size is not the same as input size, the output tensor is first reshaped and initialized to zero, and only then, accumulation is done.");
+    AddOutput("Out", "(Tensor) Accumulated output tensor");
+    AddAttr<float>("gamma", "(float, default 1.0) Accumulation multiplier");
+    AddComment(R"DOC(
+Accumulate operator accumulates the input tensor to the output tensor. If the
+output tensor already has the right size, we add to it; otherwise, we first
+initialize the output tensor to all zeros, and then do accumulation. Any
+further calls to the operator, given that no one else fiddles with the output
+in the interim, will do simple accumulations.
+Accumulation is done as shown:
+
+Out = 1*X + gamma*Out
+
+where X is the input tensor, Out is the output tensor and gamma is the multiplier
+argument.
+)DOC");
+  }
+};
+```
diff --git a/paddle/operators/reshape_op.h b/paddle/operators/reshape_op.h
index 26708e72dc8f80d2cff1c1ee5e8763b959320205..873acf30782d390cdca5e7e864c76e1f743f9a7c 100644
--- a/paddle/operators/reshape_op.h
+++ b/paddle/operators/reshape_op.h
@@ -1,4 +1,3 @@
-
 /* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
 
    Licensed under the Apache License, Version 2.0 (the "License");
@@ -52,5 +51,5 @@ class ReshapeGradKernel : public framework::OpKernel {
     d_x->Resize(in_dims);
   }
 };
-}
-}
+}  // namespace operators
+}  // namespace paddle
diff --git a/python/paddle/v2/framework/tests/CMakeLists.txt b/python/paddle/v2/framework/tests/CMakeLists.txt
index 6b22c0008210b492d00dee42e967ca14d0948b20..4d7664469e481344cf9eea84688f068b4fb99dee 100644
--- a/python/paddle/v2/framework/tests/CMakeLists.txt
+++ b/python/paddle/v2/framework/tests/CMakeLists.txt
@@ -1,38 +1,5 @@
-py_test(test_net SRCS test_net.py)
-
-py_test(test_scope SRCS test_scope.py)
-
-py_test(test_tensor SRCS test_tensor.py)
-py_test(test_mul_op SRCS test_mul_op.py)
-py_test(test_cos_sim_op SRCS test_cos_sim_op.py)
-
-py_test(test_mean_op SRCS test_mean_op.py)
-
-py_test(test_protobuf SRCS test_protobuf.py)
-
-py_test(test_add_two_op SRCS test_add_two_op.py)
-py_test(test_sigmoid_op SRCS test_sigmoid_op.py)
-py_test(test_softmax_op SRCS test_softmax_op.py)
-py_test(test_cross_entropy_op SRCS test_cross_entropy_op.py)
-py_test(test_gather_op SRCS test_gather_op.py)
-py_test(test_scatter_op SRCS test_scatter_op.py)
-py_test(test_fill_zeros_like_op SRCS test_fill_zeros_like_op.py)
-py_test(test_top_k_op SRCS test_top_k_op.py)
-
-py_test(test_rowwise_add_op SRCS test_rowwise_add_op.py)
-
-py_test(test_default_scope_funcs SRCS test_default_scope_funcs.py)
-
-py_test(test_operator SRCS test_operator.py)
-py_test(test_gaussian_random_op SRCS test_gaussian_random_op.py)
-py_test(test_uniform_random_op SRCS test_uniform_random_op.py)
-py_test(test_recurrent_op SRCS test_recurrent_op.py)
-py_test(test_sgd_op SRCS test_sgd_op.py)
-py_test(test_gradient_checker SRCS test_gradient_checker.py)
-py_test(test_lookup_table SRCS test_lookup_table.py)
-py_test(test_scale_and_identity_op SRCS test_scale_and_identity_op.py)
-py_test(test_sum_op SRCS test_sum_op.py)
-py_test(mnist SRCS mnist.py)
-py_test(test_concat_op SRCS test_concat_op.py)
-py_test(test_squared_l2_distance_op SRCS test_squared_l2_distance_op.py)
-py_test(test_reshape_op SRCS test_reshape_op.py)
+file(GLOB TEST_OPS RELATIVE "${CMAKE_CURRENT_SOURCE_DIR}" "test_*.py")
+string(REPLACE ".py" "" TEST_OPS "${TEST_OPS}")
+foreach(src ${TEST_OPS})
+    py_test(${src} SRCS ${src}.py)
+endforeach()
diff --git a/python/paddle/v2/framework/tests/mnist.py b/python/paddle/v2/framework/tests/test_mnist.py
similarity index 100%
rename from python/paddle/v2/framework/tests/mnist.py
rename to python/paddle/v2/framework/tests/test_mnist.py