提交 e021721b 编写于 作者: T Travis CI

Deploy to GitHub Pages: 7af02682

上级 89abf983
...@@ -10,7 +10,7 @@ A dataset is a list of files in *RecordIO* format. A RecordIO file consists of c ...@@ -10,7 +10,7 @@ A dataset is a list of files in *RecordIO* format. A RecordIO file consists of c
## Task Queue ## Task Queue
As mentioned in [distributed training design doc](./README.md), a *task* is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple *blocks* from one or multiple files. The master server maintains *task queues* to track the training progress. As mentioned in [distributed training design doc](./README.md), a *task* is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple *chunks* from one or multiple files. The master server maintains *task queues* to track the training progress.
### Task Queue Creation ### Task Queue Creation
...@@ -21,23 +21,23 @@ As mentioned in [distributed training design doc](./README.md), a *task* is a da ...@@ -21,23 +21,23 @@ As mentioned in [distributed training design doc](./README.md), a *task* is a da
func (m *RPCServer) ReportDataset(Paths []string, dummy *int) error { func (m *RPCServer) ReportDataset(Paths []string, dummy *int) error {
} }
``` ```
1. The master server will scan through each RecordIO file to generate the *block index* and know how many blocks does each file have. A block can be referenced by the file path and the index of the block within the file. The block index is in memory data structure that enables fast access to each block, and the index of the block with the file is an integer start from 0, representing the n-th block within the file. 1. The master server will scan through each RecordIO file to generate the *chunk index* and know how many chunks does each file have. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is in memory data structure that enables fast access to each chunk, and the index of the chunk with the file is an integer start from 0, representing the n-th chunk within the file.
The definition of the block is: The definition of the chunk is:
```go ```go
type Block struct { type Chunk struct {
Idx int // index of the block within the file Idx int // index of the chunk within the file
Path string Path string
Index recordio.Index // block index Index recordio.Index // chunk index
} }
``` ```
1. Blocks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element. 1. Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.
The definition of the task is: The definition of the task is:
```go ```go
type Task struct { type Task struct {
Index int Index int
Blocks []Block Chunks []Chunk
} }
``` ```
......
...@@ -186,7 +186,7 @@ ...@@ -186,7 +186,7 @@
</div> </div>
<div class="section" id="task-queue"> <div class="section" id="task-queue">
<span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="Permalink to this headline"></a></h2> <span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="Permalink to this headline"></a></h2>
<p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>blocks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p> <p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>chunks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p>
<div class="section" id="task-queue-creation"> <div class="section" id="task-queue-creation">
<span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="Permalink to this headline"></a></h3> <span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="Permalink to this headline"></a></h3>
<ol> <ol>
...@@ -197,21 +197,21 @@ ...@@ -197,21 +197,21 @@
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">The master server will scan through each RecordIO file to generate the <em>block index</em> and know how many blocks does each file have. A block can be referenced by the file path and the index of the block within the file. The block index is in memory data structure that enables fast access to each block, and the index of the block with the file is an integer start from 0, representing the n-th block within the file.</p> <li><p class="first">The master server will scan through each RecordIO file to generate the <em>chunk index</em> and know how many chunks does each file have. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is in memory data structure that enables fast access to each chunk, and the index of the chunk with the file is an integer start from 0, representing the n-th chunk within the file.</p>
<p>The definition of the block is:</p> <p>The definition of the chunk is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Block</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Chunk</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the block within the file</span> <span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the chunk within the file</span>
<span class="nx">Path</span> <span class="kt">string</span> <span class="nx">Path</span> <span class="kt">string</span>
<span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// block index</span> <span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// chunk index</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">Blocks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p> <li><p class="first">Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p>
<p>The definition of the task is:</p> <p>The definition of the task is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Index</span> <span class="kt">int</span> <span class="nx">Index</span> <span class="kt">int</span>
<span class="nx">Blocks</span> <span class="p">[]</span><span class="nx">Block</span> <span class="nx">Chunks</span> <span class="p">[]</span><span class="nx">Chunk</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
......
因为 它太大了无法显示 source diff 。你可以改为 查看blob
...@@ -10,7 +10,7 @@ A dataset is a list of files in *RecordIO* format. A RecordIO file consists of c ...@@ -10,7 +10,7 @@ A dataset is a list of files in *RecordIO* format. A RecordIO file consists of c
## Task Queue ## Task Queue
As mentioned in [distributed training design doc](./README.md), a *task* is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple *blocks* from one or multiple files. The master server maintains *task queues* to track the training progress. As mentioned in [distributed training design doc](./README.md), a *task* is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple *chunks* from one or multiple files. The master server maintains *task queues* to track the training progress.
### Task Queue Creation ### Task Queue Creation
...@@ -21,23 +21,23 @@ As mentioned in [distributed training design doc](./README.md), a *task* is a da ...@@ -21,23 +21,23 @@ As mentioned in [distributed training design doc](./README.md), a *task* is a da
func (m *RPCServer) ReportDataset(Paths []string, dummy *int) error { func (m *RPCServer) ReportDataset(Paths []string, dummy *int) error {
} }
``` ```
1. The master server will scan through each RecordIO file to generate the *block index* and know how many blocks does each file have. A block can be referenced by the file path and the index of the block within the file. The block index is in memory data structure that enables fast access to each block, and the index of the block with the file is an integer start from 0, representing the n-th block within the file. 1. The master server will scan through each RecordIO file to generate the *chunk index* and know how many chunks does each file have. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is in memory data structure that enables fast access to each chunk, and the index of the chunk with the file is an integer start from 0, representing the n-th chunk within the file.
The definition of the block is: The definition of the chunk is:
```go ```go
type Block struct { type Chunk struct {
Idx int // index of the block within the file Idx int // index of the chunk within the file
Path string Path string
Index recordio.Index // block index Index recordio.Index // chunk index
} }
``` ```
1. Blocks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element. 1. Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.
The definition of the task is: The definition of the task is:
```go ```go
type Task struct { type Task struct {
Index int Index int
Blocks []Block Chunks []Chunk
} }
``` ```
......
...@@ -193,7 +193,7 @@ ...@@ -193,7 +193,7 @@
</div> </div>
<div class="section" id="task-queue"> <div class="section" id="task-queue">
<span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="永久链接至标题"></a></h2> <span id="task-queue"></span><h2>Task Queue<a class="headerlink" href="#task-queue" title="永久链接至标题"></a></h2>
<p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>blocks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p> <p>As mentioned in <a class="reference internal" href="README.html"><span class="doc">distributed training design doc</span></a>, a <em>task</em> is a data shard that the master server assigns to the trainer process to train on. A task consists of one or multiple <em>chunks</em> from one or multiple files. The master server maintains <em>task queues</em> to track the training progress.</p>
<div class="section" id="task-queue-creation"> <div class="section" id="task-queue-creation">
<span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="永久链接至标题"></a></h3> <span id="task-queue-creation"></span><h3>Task Queue Creation<a class="headerlink" href="#task-queue-creation" title="永久链接至标题"></a></h3>
<ol> <ol>
...@@ -204,21 +204,21 @@ ...@@ -204,21 +204,21 @@
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">The master server will scan through each RecordIO file to generate the <em>block index</em> and know how many blocks does each file have. A block can be referenced by the file path and the index of the block within the file. The block index is in memory data structure that enables fast access to each block, and the index of the block with the file is an integer start from 0, representing the n-th block within the file.</p> <li><p class="first">The master server will scan through each RecordIO file to generate the <em>chunk index</em> and know how many chunks does each file have. A chunk can be referenced by the file path and the index of the chunk within the file. The chunk index is in memory data structure that enables fast access to each chunk, and the index of the chunk with the file is an integer start from 0, representing the n-th chunk within the file.</p>
<p>The definition of the block is:</p> <p>The definition of the chunk is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Block</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Chunk</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the block within the file</span> <span class="nx">Idx</span> <span class="kt">int</span> <span class="c1">// index of the chunk within the file</span>
<span class="nx">Path</span> <span class="kt">string</span> <span class="nx">Path</span> <span class="kt">string</span>
<span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// block index</span> <span class="nx">Index</span> <span class="nx">recordio</span><span class="p">.</span><span class="nx">Index</span> <span class="c1">// chunk index</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
</li> </li>
<li><p class="first">Blocks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p> <li><p class="first">Chunks are grouped into tasks, and tasks are filled into the todo queue. The pending queue and the done queue are initialized with no element.</p>
<p>The definition of the task is:</p> <p>The definition of the task is:</p>
<div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span> <div class="highlight-go"><div class="highlight"><pre><span></span><span class="kd">type</span> <span class="nx">Task</span> <span class="kd">struct</span> <span class="p">{</span>
<span class="nx">Index</span> <span class="kt">int</span> <span class="nx">Index</span> <span class="kt">int</span>
<span class="nx">Blocks</span> <span class="p">[]</span><span class="nx">Block</span> <span class="nx">Chunks</span> <span class="p">[]</span><span class="nx">Chunk</span>
<span class="p">}</span> <span class="p">}</span>
</pre></div> </pre></div>
</div> </div>
......
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册