diff --git a/paddle/framework/images/multigpu_allreduce.graffle b/doc/design/images/multigpu_allreduce.graffle
similarity index 100%
rename from paddle/framework/images/multigpu_allreduce.graffle
rename to doc/design/images/multigpu_allreduce.graffle
diff --git a/paddle/framework/images/multigpu_allreduce.png b/doc/design/images/multigpu_allreduce.png
similarity index 100%
rename from paddle/framework/images/multigpu_allreduce.png
rename to doc/design/images/multigpu_allreduce.png
diff --git a/paddle/framework/images/multigpu_before_convert.graffle b/doc/design/images/multigpu_before_convert.graffle
similarity index 100%
rename from paddle/framework/images/multigpu_before_convert.graffle
rename to doc/design/images/multigpu_before_convert.graffle
diff --git a/paddle/framework/images/multigpu_before_convert.png b/doc/design/images/multigpu_before_convert.png
similarity index 100%
rename from paddle/framework/images/multigpu_before_convert.png
rename to doc/design/images/multigpu_before_convert.png
diff --git a/paddle/framework/multigpu.md b/doc/design/paddle_nccl.md
similarity index 83%
rename from paddle/framework/multigpu.md
rename to doc/design/paddle_nccl.md
index 1c843326ee1ba94ac278806f6c47d324c38bd3a4..7c889fdd7f3bf1e0f682e1075fbc39241c8d6d39 100644
--- a/paddle/framework/multigpu.md
+++ b/doc/design/paddle_nccl.md
@@ -1,15 +1,17 @@
-# Design Doc: Multi-GPU support in Operation Graph
+# Design Doc: NCCL support in Paddle Fluid
 
 ## Abstract
 
-This Design Doc refers to the multi-GPU feature in  paddle.  We propose an approach to support multi-GPU both on a single machine and multiple machines. Every device only run sub-graphs which our framework issued. We use `Broadcast`, `Allreduce` operators to join different device sub-graph to the whole graph.
-
+This design doc describes the NCCL feature in Paddle. We propose an approach to support the NCCL library on both a single machine and multiple machines. We wrap the NCCL primitives `Broadcast`, `Allreduce`, and `Reduce` as operators so that a single script can use multiple GPUs.
 
 
 ## Motivation
 
-Paddle supports training with multiple CPUs and GPUs, refer to different physical devices. We need to support multi-GPU training in parallel for acceleration, in detail, there are two aspects. 
+[NCCL](https://developer.nvidia.com/nccl) is an NVIDIA library that supports multi-GPU communication. With the NCCL library, we can easily accelerate training in parallel.
 
+- The optimize sub-graph can easily be moved to a parameter server, so the multi-GPU feature is compatible with the distributed support design.
+- It plugs in easily with the [NCCL2](https://developer.nvidia.com/nccl) library.
+- GPU model parallelism becomes easier to implement: we only need to replace each GPU's sub-graph with a different part of the whole graph.
 - GPU Data Parallelism 
 
   Suppose to we have `n`GPUs, every GPU has `1/n`part of training data, and store a complete model in GPU memory.  
@@ -58,7 +60,3 @@ As it shown in the picture, when each GPU compute the gradient of `W`, followed
 In fact, in the way of every GPU optimized full batch of data, wasted (n-1) GPU compute resources. We will enhance it in the next stage.
 
 ### Benefits
-
-- can easily move the optimize sub-graph to parameter server,  multi-GPU feature can be  compatible with distributed support design.
-- easily plug-in with [NCCL2](https://developer.nvidia.com/nccl) library.
-- GPU Model parallelism becomes easier to implement. we only need to replace different GPU's sub-graph with different part of the whole graph.