Commit 191a3268, authored by Helin Wang

fix according to comments

Parent 9572e11d
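@@ -23,8 +23,8 @@
 Before a dataset can be used for training, its files must first be converted into the storage format used inside the PaddlePaddle cluster (RecordIO). We provide two ways to convert:
-- We provide a library for local conversion; users can write a program to do the conversion.
-- Users can upload their own dataset and run a MapReduce job on the cluster to do the conversion.
+1. Users convert the data locally and then upload it
+1. After uploading the data, users run a conversion program on the cluster
 The converted files will be named in the following format: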
......
 # Design Doc: Master Process
-For an overview of master process' role, please refer to [distributed training design doc](./README.md). In this design doc we will discuss the master process in more details. The master will be implemented in [golang](https://golang.org/).
+For an overview of master process' role, please refer to [distributed training design doc](./README.md). In this design doc we will discuss the master process in more details. The master will be implemented in [Go](https://golang.org/).
 ## Dataset
 <img src="src/dataset.png"/>
-A dataset is represented by a list of files in *RecordIO* format on the distributed filesystem, each RecordIO file consists of multiple *blocks*, and each block has multiple data instances.
+A dataset is a list of files in *RecordIO* format. A RecordIO file consists of chunks, whereas each chunk consists some records.
 ## Task Queue
@@ -14,7 +14,7 @@ As mentioned in [distributed training design doc](./README.md), a *task* is a da
 ### Task Queue Creation
-1. Each trainer will make an RPC call (using [golang rpc](https://golang.org/pkg/net/rpc/)) to the master process, telling it the RecordIO files representing the dataset specified by the user. Since every trainer will tell the master process the same dataset, only the first RPC call will be honored.
+1. Each trainer will make an RPC call (using Go's [rpc](https://golang.org/pkg/net/rpc/) package) to the master process, telling it the RecordIO files representing the dataset specified by the user. Since every trainer will tell the master process the same dataset, only the first RPC call will be honored.
 The RPC interface is:
```go ```go
......
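
Editor's sketch, not part of the commit: the RPC interface itself is elided in this view, so the snippet below does not reproduce it. It only illustrates the "only the first RPC call will be honored" rule from the Task Queue Creation step. The names `Master`, `SetDataset`, and the `honored` reply are hypothetical; the method merely follows the shape that Go's `net/rpc` package expects (exported method, two arguments with the second a pointer, `error` return).

```go
package main

import (
	"fmt"
	"sync"
)

// Master is a hypothetical stand-in for the master process state.
// It is an assumption for illustration, not the type from the design doc.
type Master struct {
	mu    sync.Mutex
	paths []string // RecordIO file paths taken from the first honored call
	set   bool     // true once a dataset has been accepted
}

// SetDataset records the dataset on the first call and ignores later calls.
// honored reports whether this particular call was the one that took effect.
func (m *Master) SetDataset(paths []string, honored *bool) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.set {
		// Every trainer sends the same dataset, so later calls are no-ops.
		*honored = false
		return nil
	}
	m.paths = append([]string(nil), paths...) // copy to avoid aliasing the caller's slice
	m.set = true
	*honored = true
	return nil
}

func main() {
	m := &Master{}
	var first, second bool
	_ = m.SetDataset([]string{"a.recordio", "b.recordio"}, &first)
	_ = m.SetDataset([]string{"a.recordio", "b.recordio"}, &second)
	fmt.Println(first, second) // prints "true false"
}
```

A mutex guards the flag so that concurrent calls from several trainers cannot both be treated as the first one.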