From 5d6d2bc1b991a1a46bac407618f9c490af9d27e9 Mon Sep 17 00:00:00 2001
From: Abhinav Arora
Date: Thu, 28 Sep 2017 14:52:03 -0700
Subject: [PATCH] Fixing typos and grammatical mistakes in Refactorization documents (#4479)

---
 doc/design/refactorization.md | 183 +++++++++++++++++-----------------
 1 file changed, 92 insertions(+), 91 deletions(-)

diff --git a/doc/design/refactorization.md b/doc/design/refactorization.md
index ad801ca421c..a07675b3e04 100644
--- a/doc/design/refactorization.md
+++ b/doc/design/refactorization.md
@@ -1,40 +1,40 @@
 # Design Doc: Refactorization Overview

-The goal of refactorizaiton include:
+The goals of refactoring include:

-1. Make it easy for external contributors to write new elementory computaiton operations.
-1. Make the codebase clean and readable.
-1. Introduce a new design of computation representation -- a computation graph of operators and variables.
-1. The graph representation helps implementing auto-scalable and auto fault recoverable distributed computing.
+1. Making it easy for external contributors to write new elementary computation operations.
+1. Making the codebase clean and readable.
+1. Designing a new computation representation -- a computation graph of operators and variables.
+1. Implementing auto-scalability and auto fault recoverable distributed computing with the help of computation graphs.

 ## Computation Graphs

-1. PaddlePaddle represent the computation, training and inference of DL models, by computation graphs.
+1. PaddlePaddle represents the computation, training and inference of Deep Learning models by computation graphs.

-  1. Please dig into [computation graphs](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/graph.md) for a solid example.
+  1. Please refer to [computation graphs](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/graph.md) for a concrete example.

-1. Users write Python programs to describe the graphs and run it (locally or remotely).
+1. Users write Python programs to describe the graphs and run them (locally or remotely).

 1. A graph is composed of *variables* and *operators*.

-1. The description of graphs must be able to be serialized/deserialized, so it
+1. The description of graphs must be capable of being serialized/deserialized, so that

-   1. could to be sent to the cloud for distributed execution, and
-   1. be sent to clients for mobile or enterprise deployment.
+   1. It can be sent to the cloud for distributed execution, and
+   1. It can be sent to clients for mobile or enterprise deployment.

-1. The Python program do
+1. The Python program does the following steps:

-   1. *compilation*: runs a Python program to generate a protobuf message representation of the graph and send it to
+   1. *compilation*: run a Python program to generate a protobuf message representation of the graph and send it to
    1. the C++ library `libpaddle.so` for local execution,
    1. the master process of a distributed training job for training, or
    1. the server process of a Kubernetes serving job for distributed serving.
-   1. *execution*: according to the protobuf message, constructs instances of class `Variable` and `OperatorBase`, and run them.
+   1. *execution*: execute the graph by constructing instances of class [`Variable`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/variable.h#L24) and [`OperatorBase`](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/operator.h#L70), according to the protobuf message, as sketched below.
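+Here is a minimal, self-contained sketch of that *execution* step. All types below are simplified stand-ins for the protobuf messages and C++ classes named above, and `CreateOp` is a hypothetical helper, not the actual framework API:
+
+```cpp
+#include <iostream>
+#include <map>
+#include <memory>
+#include <string>
+#include <vector>
+
+struct VarDesc { std::string name; };  // stand-in for the VarDesc proto
+struct OpDesc { std::string type; };   // stand-in for the OpDesc proto
+struct BlockDesc {                     // the serialized graph description
+  std::vector<VarDesc> vars;
+  std::vector<OpDesc> ops;
+};
+
+struct Variable {};                    // would hold a Tensor or other payload
+
+struct OperatorBase {
+  virtual ~OperatorBase() = default;
+  virtual void Run(std::map<std::string, Variable>* scope) const = 0;
+};
+
+struct NoOp : OperatorBase {           // toy operator for the demo
+  void Run(std::map<std::string, Variable>*) const override {
+    std::cout << "running an op\n";
+  }
+};
+
+// Hypothetical factory; the real framework dispatches on the op type.
+std::unique_ptr<OperatorBase> CreateOp(const OpDesc& /*op_desc*/) {
+  return std::make_unique<NoOp>();
+}
+
+// *execution*: realize the variables, construct the operators, run them.
+void Execute(const BlockDesc& block) {
+  std::map<std::string, Variable> scope;
+  for (const auto& v : block.vars) scope[v.name];  // create the variables
+  for (const auto& o : block.ops) CreateOp(o)->Run(&scope);
+}
+
+int main() { Execute(BlockDesc{{{"x"}, {"y"}}, {{"noop"}}}); }
+```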
-## Description and Realization
+## Description and Realization of Computation Graph

-At compile time, the Python program generates protobuf message representation of the graph, or the description of the graph.
+At compile time, the Python program generates a protobuf message representation of the graph, or the description of the graph.

-At runtime, the C++ program realizes the graph and run it.
+At runtime, the C++ program realizes the graph and runs it.

 | | Representation (protobuf messages) | Realization (C++ class objects) |
 |---|---|---|
@@ -42,30 +42,31 @@ At runtime, the C++ program realizes the graph and run it.
 |Operation|[OpDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L35)|[Operator](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/operator.h#L64)|
 |Block|BlockDesc|Block|

-The word *graph* is exchangable with *block* in this document. A graph represent computation steps and local variables as a C++/Java program block, or a pair of { and }.
+The word *graph* is interchangeable with *block* in this document. A graph represents computation steps and local variables similar to a C++/Java program block, or a pair of curly braces (`{` and `}`).

 ## Compilation and Execution

-1. Run an applicaton Python program to describe the graph. In particular,
+1. Run an application Python program to describe the graph. In particular, the Python application program does the following:

-   1. create VarDesc to represent local/intermediate variables,
-   1. create operators and set attributes,
-   1. validate attribute values,
-   1. inference the type and the shape of variables,
-   1. plan for memory-reuse for variables,
-   1. generate backward and optimization part of the Graph.
-   1. possiblly split the graph for distributed training.
+   1. Create `VarDesc` to represent local/intermediate variables,
+   1. Create operators and set attributes,
+   1. Validate attribute values,
+   1. Infer the type and the shape of variables,
+   1. Plan memory-reuse for variables,
+   1. Generate the backward graph,
+   1. Optimize the computation graph.
+   1. Potentially, split the graph for distributed training.

-1. The invocation of `train` or `infer` in the application Python program:
+1. The invocation of `train` or [`infer`](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/inference.py#L108) methods in the application Python program does the following:

-   1. create a new Scope instance in the [scope hierarchy](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md) for each run of a block,
+   1. Create a new Scope instance in the [scope hierarchy](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md) for each run of a block,
    1. realize local variables defined in the BlockDesc message in the new scope,
       1. a scope is similar to the stack frame in programming languages,
-   1. create an instance of class `Block`, in which,
+   1. Create an instance of class `Block`, in which,
       1. realize operators in the BlockDesc message,
-   1. run the Block by calling
+   1. Run the Block by calling
       1. `Block::Eval(vector<Variable>* targets)` for forward and backward computations, or
       1. `Block::Eval(vector<Operator>* targets)` for optimization.
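+The `Eval(targets)` interface above only needs to run the operators that the requested targets transitively depend on. A minimal sketch of this target-pruning idea, with simplified stand-in types (not the actual `Block` implementation):
+
+```cpp
+#include <iostream>
+#include <set>
+#include <string>
+#include <vector>
+
+struct OpDesc {
+  std::string type;
+  std::vector<std::string> inputs;
+  std::vector<std::string> outputs;
+};
+
+// Walk backward from the targets and keep only the operators whose outputs
+// are needed; assumes `ops` is already in topological order.
+std::vector<int> MinimalOps(const std::vector<OpDesc>& ops,
+                            const std::vector<std::string>& targets) {
+  std::set<std::string> needed(targets.begin(), targets.end());
+  std::vector<int> keep;
+  for (int i = static_cast<int>(ops.size()) - 1; i >= 0; --i) {
+    bool produces_needed = false;
+    for (const auto& out : ops[i].outputs)
+      if (needed.count(out)) produces_needed = true;
+    if (!produces_needed) continue;
+    keep.push_back(i);
+    needed.insert(ops[i].inputs.begin(), ops[i].inputs.end());
+  }
+  return {keep.rbegin(), keep.rend()};  // restore topological order
+}
+
+int main() {
+  std::vector<OpDesc> ops = {{"mul", {"x", "w"}, {"h"}},
+                             {"add", {"h", "b"}, {"y"}},
+                             {"softmax", {"y"}, {"p"}}};
+  // Evaluating target "y" keeps "mul" and "add" but prunes "softmax".
+  for (int i : MinimalOps(ops, {"y"})) std::cout << ops[i].type << "\n";
+}
+```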
@@ -76,14 +77,14 @@ The word *graph* is exchangable with *block* in this document. A graph represen
 Compile Time -> IR -> Runtime
 ```

-### Benefit
+### Benefits of IR

 - Optimization
   ```text
   Compile Time -> IR -> Optimized IR -> Runtime
   ```
-- Send automatically partitioned IR to different nodes.
-  - Automatic data parallel
+- Automatically send partitioned IR to different nodes.
+  - Automatic Data Parallelism
   ```text
   Compile Time
   |-> Single GPU IR
   |-> [trainer-IR-0, trainer-IR-1, pserver-IR]
   |-> Node-0 (runs trainer-IR-0)
@@ -92,7 +93,7 @@ Compile Time -> IR -> Runtime
   |-> Node-1 (runs trainer-IR-1)
   |-> Node-2 (runs pserver-IR)
   ```
-  - Automatic model parallel (planned for future)
+  - Automatic Model Parallelism (planned for future)

 ---

@@ -105,10 +106,10 @@ Compile Time -> IR -> Runtime
 # Operator
 ![class_diagram](http://api.paddlepaddle.org/graphviz?dot=https://gist.githubusercontent.com/reyoung/53df507f6749762675dff3e7ce53372f/raw/dd598e8f1976f5759f58af5e5ef94738a6b2e661/op.dot)

-* `Operator` is the fundamental building block as the user interface.
-  * Operator stores input/output variable name, and attributes.
-  * The `InferShape` interface is used to infer output variable shapes by its input shapes.
-  * Use `Run` to compute `input variables` to `output variables`.
+* `Operator` is the fundamental building block of the user interface.
+  * Operator stores input/output variable names and attributes.
+  * The `InferShape` interface is used to infer the shapes of the output variables based on the shapes of the input variables.
+  * Use `Run` to compute the `output` variables from the `input` variables.

 ---

@@ -126,30 +127,30 @@ Compile Time -> IR -> Runtime
 # Why separate Kernel and Operator

 * Separate GPU and CPU code.
-  * Make Paddle can run without GPU.
+  * Make Paddle capable of running without a GPU.
-* Make one operator (which is user interface) can contain many implementations.
-  * Same mul op, different FP16, FP32 Kernel. different MKL, eigen kernel.
+* Allow one operator (which is a user interface) to have many implementations.
+  * For example, the same `mul` op can have different kernel implementations, such as FP16 and FP32 kernels, or MKL and Eigen kernels.

 ---

 # Libraries for Kernel development

 * `Eigen::Tensor` contains basic math and element-wise functions.
   * Note that `Eigen::Tensor` has broadcast implementation.
-  * Limit number of `tensor.device(dev) = ` in your code.
+  * Limit the number of `tensor.device(dev) = ` assignments in your code.
 * `thrust::transform` and `std::transform`.
-  * `thrust` has the same API as C++ standard library. Using `transform` can quickly implement a customized elementwise kernel.
-  * `thrust` has more complex API, like `scan`, `reduce`, `reduce_by_key`.
+  * `thrust` has the same API as the C++ standard library. Using `transform`, one can quickly implement customized elementwise kernels.
+  * `thrust` also has more complex APIs, like `scan`, `reduce`, `reduce_by_key`.
 * Hand-writing `GPUKernel` and `CPU` code
-  * Do not write `.h`. CPU Kernel should be in `.cc`. GPU kernel should be in `.cu`. (`GCC` cannot compile GPU code.)
+  * Do not write kernels in header (`.h`) files. CPU kernels should be in C++ source (`.cc`) files and GPU kernels should be in CUDA (`.cu`) files. (GCC cannot compile GPU code.)

 ---

-# Operator Register
+# Operator Registration

-## Why register is necessary?
+## Why is registration necessary?

 We need a method to build mappings between Op type names and Op classes.

-## How to do the register?
+## How is registration implemented?

-Maintain a map, whose key is the type name and value is corresponding Op constructor.
+Maintain a map whose key is the type name and whose value is the corresponding Op constructor, as sketched below.
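+A minimal sketch of such a registry (simplified; the real `OpInfoMap` entry holds more than the constructor, e.g. the `proto` and the `checker`):
+
+```cpp
+#include <functional>
+#include <iostream>
+#include <map>
+#include <memory>
+#include <string>
+
+struct OperatorBase {
+  virtual ~OperatorBase() = default;
+  virtual void Run() const = 0;
+};
+
+using OpCreator = std::function<std::unique_ptr<OperatorBase>()>;
+
+// The registry: op type name -> Op constructor.
+std::map<std::string, OpCreator>& OpRegistry() {
+  static std::map<std::string, OpCreator> registry;
+  return registry;
+}
+
+// What a REGISTER_OP-style macro expands to, in spirit: a static object
+// whose constructor adds the key-value pair at program start-up.
+struct Registrar {
+  Registrar(const std::string& type, OpCreator creator) {
+    OpRegistry()[type] = std::move(creator);
+  }
+};
+
+struct MulOp : OperatorBase {  // toy op for the demo
+  void Run() const override { std::cout << "mul\n"; }
+};
+static Registrar mul_registrar("mul",
+                               [] { return std::make_unique<MulOp>(); });
+
+int main() {
+  OpRegistry()["mul"]()->Run();  // look up the constructor by type name
+}
+```
+
+Registration through static objects like `mul_registrar` is also why the `USE` macros below exist: if nothing references the object file that defines the registrar, the linker may drop it, and the entry never gets added.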
 ---
 # The Registry Map

@@ -177,34 +178,34 @@ REGISTER_OP(op_type, op_class, op_maker_class, grad_op_type, grad_op_class)
 REGISTER_OP_WITHOUT_GRADIENT(op_type, op_class, op_maker_class)
 ```

-### `USE` Macros
-make sure the registration process is executed and linked.
+### USE Macros
+Make sure the registration process is executed and linked.

 ---
-# Register Process
-1. Write Op class, as well as its gradient Op class if there is.
-2. Write Op maker class. In the constructor, describe its inputs, outputs, and attributes.
-3. Invoke macro `REGISTER_OP`. The macro will
-    1. call maker class to complete `proto` and `checker`
-    2. with the completed `proto` and `checker`, build a new key-value pair in the `OpInfoMap`
+# Registration Process
+1. Write an Op class and its gradient Op class, if required.
+2. Write an Op maker class. In the constructor of this class, describe the inputs, outputs and attributes of the operator.
+3. Invoke the macro `REGISTER_OP`. This macro will
+    1. Call the maker class to complete the `proto` and the `checker`,
+    2. Using the completed `proto` and `checker`, add a new key-value pair to the `OpInfoMap`.

-4. Invoke `USE` macro in where the Op is used to make sure it is linked.
+4. Invoke the `USE` macro in the code where the Op is used, to make sure that it is linked.

 ---
 # Backward Module (1/2)
 ### Create Backward Operator
-- Mapping from forwarding Op to backward Op
+- Mapping from forward Op to backward Op
 ![backward](https://gist.githubusercontent.com/dzhwinter/a6fbd4623ee76c459f7f94591fd1abf0/raw/61026ab6e518e66bde66a889bc42557a1fccff33/backward.png)

 ---
 # Backward Module (2/2)
 ### Build Backward Network
-- **Input** graph of forwarding operators
-- **Output** graph of backward operators
-- **corner case in construction**
-  - shared variable => insert `Add` operator
-  - no gradient => insert `fill_zero_grad` operator
-  - recursive netOp => call `Backward` recursively
+- **Input**: graph of forward operators
+- **Output**: graph of backward operators
+- **Corner cases in construction**
+  - Shared Variables => insert an `Add` operator to combine gradients
+  - No Gradient => insert a `fill_zero_grad` operator
+  - Recursive NetOp => call `Backward` recursively
   - RNN Op => recursively call `Backward` on stepnet

@@ -213,41 +214,41 @@ make sure the registration process is executed and linked.

 * `Tensor` is an n-dimension array with type.
   * Only dims and data pointers are stored in `Tensor`.
-  * All operators on `Tensor` is written in `Operator` or global functions.
-  * variable length Tensor design [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md)
-* `Variable` is the inputs and outputs of an operator. Not just `Tensor`.
-  * step_scopes in RNN is a variable and not a tensor.
-* `Scope` is where variables store at.
-  * map<string, Variable>
-  * `Scope` has a hierarchical structure. The local scope can get variable from its parent scope.
+  * All operations on `Tensor` are written in `Operator` or global functions.
+  * Variable length Tensor design [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md)
+* `Variable` instances are the inputs and the outputs of an operator. Not just `Tensor`.
+  * `step_scopes` in RNN is a variable and not a tensor.
+* `Scope` is where variables are stored.
+  * map<string, Variable>
+  * `Scope` has a hierarchical structure. The local scope can get variables from its parent scope, as sketched below.
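+A minimal sketch of this hierarchical lookup (simplified stand-ins; the real class is described in the [scope design doc](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/scope.md)):
+
+```cpp
+#include <iostream>
+#include <map>
+#include <string>
+
+struct Variable {};  // would hold a Tensor, step_scopes, etc.
+
+class Scope {
+ public:
+  explicit Scope(const Scope* parent = nullptr) : parent_(parent) {}
+
+  Variable* NewVar(const std::string& name) { return &vars_[name]; }
+
+  // Search the local map first, then fall back to the parent scope.
+  const Variable* FindVar(const std::string& name) const {
+    auto it = vars_.find(name);
+    if (it != vars_.end()) return &it->second;
+    return parent_ ? parent_->FindVar(name) : nullptr;
+  }
+
+ private:
+  std::map<std::string, Variable> vars_;  // the map<string, Variable> above
+  const Scope* parent_;
+};
+
+int main() {
+  Scope global;
+  global.NewVar("w");
+  Scope local(&global);  // e.g. one new scope per run of a block
+  std::cout << (local.FindVar("w") != nullptr) << "\n";  // prints 1
+}
+```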
 ---
 # Block (in design)
 ## the difference with original RNNOp
-- as an operator is more intuitive than `RNNOp`,
-- offers new interface `Eval(targets)` to deduce the minimal block to `Run`,
-- fits the compile-time/ runtime separation design.
-  - during the compilation, `SymbolTable` stores `VarDesc`s and `OpDesc`s and serialize to a `BlockDesc`
-  - when graph executes, a Block with `BlockDesc` passed in creates `Op` and `Var` then `Run`
+- A `Block` used as an operator is more intuitive than `RNNOp`,
+- Offers a new interface `Eval(targets)` to deduce the minimal block to `Run`,
+- Fits the compile-time/runtime separation design paradigm.
+  - During the compilation, `SymbolTable` stores `VarDesc`s and `OpDesc`s and serializes them to a `BlockDesc`.
+  - When the graph executes, a `Block` with the `BlockDesc` is passed in. It then creates `Op` and `Var` instances and invokes `Run`.

 ---
 # Milestone
-- take Paddle/books as the main line, the requirement of the models motivates framework refactoring,
-- model migration
-  - framework development gives **priority support** to model migration, for example,
+- Take Paddle/books as the main line; the requirements of the models motivate the framework refactoring,
+- Model migration
+  - Framework development gives **priority support** to model migration, for example,
    - the MNIST demo needs a Python interface,
    - the RNN models require the framework to support `LoDTensor`.
-  - determine some timelines,
-  - heavily-relied Ops need to be migrated first,
-  - different models can be migrated parallelly.
-- improve the framework at the same time
-- accept imperfection, concentrated on solving the specific problem at the right price.
+  - Determine some timelines,
+  - Frequently used Ops need to be migrated first,
+  - Different models can be migrated in parallel.
+- Improve the framework at the same time.
+- Accept imperfection; concentrate on solving the specific problem at the right price.

 ---
 # Control the migration quality
-- compare the performance of migrated models with old ones.
-- follow google C style
-- build the automatic workflow of generating Python/C++ documentations
-  - the documentation of layers and ops should be written inside the code
-  - take the documentation quality into account when doing PR
-  - preview the documentations, read and improve them from users' perspective
+- Compare the performance of migrated models with that of the old ones.
+- Follow the Google C++ style guide.
+- Build the automatic workflow of generating Python/C++ documentation.
+  - The documentation of layers and ops should be written inside the code.
+  - Take the documentation quality into account when submitting pull requests.
+  - Preview the documentation, and read and improve it from a user's perspective.
--
GitLab