diff --git a/doc/design/ops/dist_train.md b/doc/design/ops/dist_train.md
new file mode 100644
index 0000000000000000000000000000000000000000..0380826b0d4ccfda58f4e4dbaf6010c8db993763
--- /dev/null
+++ b/doc/design/ops/dist_train.md
@@ -0,0 +1,82 @@
+# Design Doc: Operation Graph Based Parameter Server
+
+## Abstract
+
+We propose an approach to implment the parameter server. In this
+approach, there is no fundimental difference between the trainer and
+the parameter server: they both run sub-graphs, but sub-graphs of
+different purposes.
+
+## Background
+
+The previous implementations of the parameter server does not run a
+sub-graph. parameter initialization, optimizer computation, network
+communication and checkpointing are implemented twice on both the
+trainer and the parameter server.
+
+It would be great if we can write code once and use them on both the
+trainer and the parameter server: reduces code duplication and
+improves extensibility. Given during the current refactor, we are
+representing everything as a computing graph on the
+trainer. Representing everything as a computing graph on the parameter
+server becomes a natural extension.
+
+## Design
+
+### Graph Converter
+
+The *graph converter* converts user-defined operation (OP) graph into
+sub-graphs to be scheduled on different nodes.
+
+1. The user-defined OP graph will be cut into sub-graphs of
+different purposes (e.g., trainer, parameter server) to run on
+different workers.
+
+1. OPs will be added to the subgraphs, so the subgraphs can
+communicate with each other. We will need these OPs: *send*, *recv*,
+*gradient accumulator*, *string accumulator*, *loop forever*.
+
+Below is an example of converting the user defined graph to the
+sub-graphs for the trainer and the parameter server:
+
+
+
+After converting:
+
+
+
+1. The parameter variable W and it's optimizer subgraph are placed on the parameter server.
+1. Operators are added to the sub-graphs.
+ - *send* operator sends data and sender's address to the destination.
+ - *recv* operator receives data and sender's address from the
+ destination. It will block until data has been received.
+ - *gradient accumulator* operator accumulates *N* pieces of
+ gradients. N=1 in Async-SGD, N>1 in Sync-SGD.
+ - *string accumulator* accumulates *N* pieces of strings into a
+ list of strings. N=1 in Async-SGD, N>1 in Sync-SGD.
+ - *loop forever* runs itself as a target forever.
+
+### Benefits
+
+- Model parallelism become easier to implement: it's an extension to
+ the trainer - parameter server approach. we already have the
+ communication OPs, but need to extend the graph converter.
+
+- User-defined optimizer is easier to add - user can now express it as
+ a subgraph.
+
+- No more duplication logic inside the trainer and the parameter
+ server in the background section.
+
+### Challenges
+
+- It might be hard for the graph converter to cut a general graph
+ (without any hint for which sub-graph is the optimizer). We may need
+ to label which sub-graph inside the OP graph is the optimizer.
+
+- It's important to balance the parameter shards of on multiple
+ parameter server. If a single parameter is very big (some
+ word-embedding, fully connected, softmax layer), we need to
+ automatically partition the single parameter onto different
+ parameter servers when possible (only element-wise optimizer depends
+ on the parameter variable).
diff --git a/doc/design/ops/src/dist-graph.graffle b/doc/design/ops/src/dist-graph.graffle
new file mode 100644
index 0000000000000000000000000000000000000000..1e1cb18dfecd9ee956ce4fe721a9bec4a24282c2
Binary files /dev/null and b/doc/design/ops/src/dist-graph.graffle differ
diff --git a/doc/design/ops/src/dist-graph.png b/doc/design/ops/src/dist-graph.png
new file mode 100644
index 0000000000000000000000000000000000000000..6f49dce07415025ade04bf0227f652c98540a056
Binary files /dev/null and b/doc/design/ops/src/dist-graph.png differ
diff --git a/doc/design/ops/src/local-graph.graffle b/doc/design/ops/src/local-graph.graffle
new file mode 100644
index 0000000000000000000000000000000000000000..d167d0412bbcddb3458664f507ea6d215afbeae3
Binary files /dev/null and b/doc/design/ops/src/local-graph.graffle differ
diff --git a/doc/design/ops/src/local-graph.png b/doc/design/ops/src/local-graph.png
new file mode 100644
index 0000000000000000000000000000000000000000..8ea9b36076ca8f4042755675d719e173a5b0a8fe
Binary files /dev/null and b/doc/design/ops/src/local-graph.png differ