From 6ae46a29c21375503af1ff1331b83b4d37d505c4 Mon Sep 17 00:00:00 2001
From: kavyasrinet
Date: Mon, 22 Jan 2018 12:12:39 -0800
Subject: [PATCH] Updating the cluster training doc (#7746)

---
 .../usage/cluster/fluid_cluster_train_en.md | 38 +++++++++----------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/doc/howto/usage/cluster/fluid_cluster_train_en.md b/doc/howto/usage/cluster/fluid_cluster_train_en.md
index a64004a7c4e..11904a6f71b 100644
--- a/doc/howto/usage/cluster/fluid_cluster_train_en.md
+++ b/doc/howto/usage/cluster/fluid_cluster_train_en.md
@@ -2,27 +2,27 @@
 
 ## Introduction
 
-In this article, we'll explain how to config and run distributed training jobs with PaddlePaddle Fluid in a bare metal cluster.
+In this article, we'll explain how to configure and run distributed training jobs with PaddlePaddle Fluid in a bare metal cluster.
 
 ## Preparations
 
-### Get your cluster ready
+### Getting the cluster ready
 
-Prepare your computer nodes in the cluster. Nodes in this cluster can be of any specification that runs PaddlePaddle, and with a unique IP address assigned to it. Make sure they can communicate with each other.
+Prepare the compute nodes in the cluster. Nodes in this cluster can be of any specification that runs PaddlePaddle, each with a unique IP address assigned to it. Make sure they can communicate with each other.
 
 ### Have PaddlePaddle installed
 
 PaddlePaddle must be installed on all nodes. If you have GPU cards on your nodes, be sure to properly install drivers and CUDA libraries.
 
-PaddlePaddle build and installation guide can be found from [here](http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/build_and_install/index_en.html).
+The PaddlePaddle build and installation guide can be found [here](http://www.paddlepaddle.org/docs/develop/documentation/en/getstarted/build_and_install/index_en.html).
 
-### Update training script
+### Update the training script
 
 #### Non-cluster training script
 
 Let's take [Deep Learning 101](http://www.paddlepaddle.org/docs/develop/book/01.fit_a_line/index.html)'s first chapter: "fit a line" as an example.
 
-This demo's non-cluster version with fluid API is as follows:
+The non-cluster version of this demo with the fluid API is as follows:
 
 ``` python
 import paddle.v2 as paddle
@@ -65,25 +65,25 @@ for pass_id in range(PASS_NUM):
 exit(1)
 ```
 
-We created a simple fully connected neural networks training program and handed it to the fluid executor to run for 100 passes.
+We created a simple fully-connected neural network training program and handed it to the fluid executor to run for 100 passes.
 
-Now let's try to convert it to a distributed version to run in a cluster.
+Now let's try to convert it to a distributed version to run on a cluster.
 
 #### Introducing parameter server
 
-As you see from the non-cluster version of training script, there is only one role in it: the trainer, who does the computing as well as holding parameters. In cluster training, since multi-trainers are working on the same task, they need one centralized place to hold and distribute parameters. This centralized place is called the Parameter Server in PaddlePaddle.
+As we can see from the non-cluster version of the training script, there is only one role in the script: the trainer, which performs the computing as well as holding the parameters. In cluster training, since multiple trainers are working on the same task, they need one centralized place to hold and distribute parameters. This centralized place is called the Parameter Server in PaddlePaddle.
 
-![parameter server architect](src/trainer.png)
+![parameter server architecture](src/trainer.png)
 
-Parameter Server in fluid does not only hold parameters but is also assigned with a part of the program. Trainers communicate with parameter servers via send/receive OPs. For more tech detail, please refer to this [document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/dist_refactor/distributed_architecture.md).
+The Parameter Server in fluid not only holds the parameters but is also assigned a part of the program. Trainers communicate with parameter servers via send/receive OPs. For more technical details, please refer to [this document](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/dist_refactor/distributed_architecture.md).
 
-Now we need to create program for both trainers and parameter servers, the question is how?
+Now we need to create programs for both trainers and parameter servers; the question is how?
 
 #### Slice the program
 
-Fluid provides a tool called "Distribute Transpiler" to automatically convert the non-cluster program into cluster program.
+Fluid provides a tool called "Distributed Transpiler" that automatically converts the non-cluster program into a cluster program.
 
-The idea behind this tool is to find optimize OPs and gradient parameters, slice the program into 2 pieces and connect them with send/receive OP.
+The idea behind this tool is to find the optimize OPs and gradient parameters, slice the program into 2 pieces, and connect them with send/receive OPs.
 
 Optimize OPs and gradient parameters can be found from the return values of optimizer's minimize function.
 
@@ -94,9 +94,9 @@ To put them together:
 
 optimize_ops, params_grads = sgd_optimizer.minimize(avg_cost) #get optimize OPs and gradient parameters
-t = fluid.DistributeTranspiler() # create transpiler instance
+t = fluid.DistributeTranspiler() # create the transpiler instance
 # slice the program into 2 pieces with optimizer_ops and gradient parameters list, as well as pserver_endpoints, which is a comma separated list of [IP:PORT] and number of trainers
-t.transpile(optimize_ops, params_grads, pservers=pserver_endpoints, trainers=2) 
+t.transpile(optimize_ops, params_grads, pservers=pserver_endpoints, trainers=2)
 
 ... #create executor
 
@@ -119,7 +119,7 @@ for pass_id in range(100):
 
 ### E2E demo
 
-Please find the complete demo from [here](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/tests/book_distribute/notest_dist_fit_a_line.py). In parameter server node run this in the command line:
+Please find the complete demo [here](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/v2/fluid/tests/book_distribute/notest_dist_fit_a_line.py). On the parameter server node, run the following in the command line:
 
 ``` bash
 PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=PSERVER python notest_dist_fit_a_line.py
@@ -129,12 +129,12 @@ PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=PSERVER
 
 Wait until the prompt `Server listening on 192.168.1.2:6174`
 
-Then in 2 of your trainer node run this:
+Then, on 2 of your trainer nodes, run this:
 
 ``` bash
 PSERVERS=192.168.1.2:6174 SERVER_ENDPOINT=192.168.1.2:6174 TRAINING_ROLE=TRAINER python notest_dist_fit_a_line.py
 ```
 
-*the reason you need to run this command twice in 2 nodes is: in the script we set the trainer count to be 2. You can change this setting on line 50*
+*The reason you need to run this command on 2 nodes is that we set the trainer count to 2 in the script. You can change this setting on line 50.*
 
 Now you have 2 trainers and 1 parameter server up and running.
--
GitLab
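
For completeness, below is a minimal single-script sketch of how the two roles described in the patched document can be selected at runtime, modeled on the linked `notest_dist_fit_a_line.py` demo. The environment variables (`PSERVERS`, `SERVER_ENDPOINT`, `TRAINING_ROLE`) are the ones used by the E2E commands above; the `get_pserver_program`, `get_startup_program` and `get_trainer_program` calls follow the demo script and are assumptions here, not an API confirmed by this patch.

``` python
import os

import paddle.v2 as paddle
import paddle.v2.fluid as fluid

# Define the same "fit a line" network as the non-cluster script.
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
cost = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_cost = fluid.layers.mean(x=cost)

sgd_optimizer = fluid.optimizer.SGD(learning_rate=0.001)
optimize_ops, params_grads = sgd_optimizer.minimize(avg_cost)

# Role and endpoints come from the environment variables used by the
# E2E commands above (PSERVERS, SERVER_ENDPOINT, TRAINING_ROLE).
pserver_endpoints = os.getenv("PSERVERS")        # e.g. "192.168.1.2:6174"
current_endpoint = os.getenv("SERVER_ENDPOINT")  # this pserver's IP:PORT
training_role = os.getenv("TRAINING_ROLE", "TRAINER")

t = fluid.DistributeTranspiler()
t.transpile(optimize_ops, params_grads, pservers=pserver_endpoints, trainers=2)

place = fluid.CPUPlace()
exe = fluid.Executor(place)

if training_role == "PSERVER":
    # Run only the program slice assigned to this parameter server.
    # (get_pserver_program/get_startup_program follow the demo script
    # and are assumed, not shown in this patch.)
    pserver_prog = t.get_pserver_program(current_endpoint)
    pserver_startup = t.get_startup_program(current_endpoint, pserver_prog)
    exe.run(pserver_startup)
    exe.run(pserver_prog)  # blocks here, serving parameters to the trainers
else:
    # Run the trainer slice, which exchanges parameters with the pserver
    # through the send/receive OPs inserted by the transpiler.
    train_reader = paddle.batch(
        paddle.reader.shuffle(paddle.dataset.uci_housing.train(), buf_size=500),
        batch_size=20)
    feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
    exe.run(fluid.default_startup_program())
    trainer_prog = t.get_trainer_program()
    for pass_id in range(100):
        for data in train_reader():
            avg_loss, = exe.run(trainer_prog,
                                feed=feeder.feed(data),
                                fetch_list=[avg_cost])
```

Launching this script with `TRAINING_ROLE=PSERVER` on the parameter server node and `TRAINING_ROLE=TRAINER` on each of the 2 trainer nodes reproduces the setup shown in the E2E demo.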