Deploy to GitHub Pages: efc2464f

dc64308f · Travis CI · ecf559aa
6 changed files
--- a/develop/doc/_sources/design/model_format.md.txt
+++ b/develop/doc/_sources/design/model_format.md.txt
@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file.
 The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for the task.
-As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, the `name` of the tensor, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is, 
+As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is, 
-|HeaderLength|ContentLength|**LoDTensorDesc**|**TensorValue**|
 The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
-```text
+|field name  | type | description |
-[offset] [type]              [description] 
+| --- | --- | --- |
-0004     4 bytes integer      HeaderLength, the length of LoDTensorDesc
+| version | uint32_t | Version of saved file. Always 0 now. |
-0008     4 bytes integer      ContentLength, the length of LodTensor Buffer
+| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
-0009     1 bytes char         TensorDesc
+| tensor desc | void* | TensorDesc protobuf binary message |
-00010    1 bytes char         TensorDesc
+| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is decided by `TensorDesc.dims()` and `TensorDesc.data_type()` |
-...
+| lod_level | uint64_t | Level of LoD |
-00100    1 bytes char         TensorValue
+| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
-00101    1 bytes char         TensorValue
+| data of lod[0] | uint64_t*  | [Optional] lod[0].data() |
-00102    1 bytes char         TensorValue              ..
+| ... | ... | ... |
-...
-```
 ## Summary

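Reviewer note: the field table added in this hunk can be read as a byte layout. The following is a minimal Python sketch of that layout, not Paddle's actual implementation. The names `pack_tensor` and `unpack_tensor` are hypothetical. The sketch treats the TensorDesc message as opaque bytes instead of parsing the protobuf, and it assumes the caller already knows the tensor data length (the table says it follows from `TensorDesc.dims()` and `TensorDesc.data_type()`). All integers are assumed little-endian, per the note in the document.

```python
import struct


def pack_tensor(desc, data, lod):
    """Serialize one tensor following the field table (hypothetical helper).

    desc: opaque TensorDesc protobuf bytes (not parsed here).
    data: raw tensor buffer bytes.
    lod:  list of LoD levels, each a list of non-negative ints.
    """
    parts = [
        struct.pack("<II", 0, len(desc)),  # version (always 0), tensor desc length
        desc,                              # tensor desc
        data,                              # tensor data
        struct.pack("<Q", len(lod)),       # lod_level
    ]
    for level in lod:
        parts.append(struct.pack("<Q", 8 * len(level)))         # length of lod[i] in bytes
        parts.append(struct.pack("<%dQ" % len(level), *level))  # data of lod[i]
    return b"".join(parts)


def unpack_tensor(buf, data_nbytes):
    """Parse bytes produced by pack_tensor.

    data_nbytes is supplied by the caller; a real reader would derive it
    from the parsed TensorDesc (an assumption of this sketch).
    """
    version, desc_len = struct.unpack_from("<II", buf, 0)
    pos = 8
    desc = buf[pos:pos + desc_len]
    pos += desc_len
    data = buf[pos:pos + data_nbytes]
    pos += data_nbytes
    (lod_level,) = struct.unpack_from("<Q", buf, pos)
    pos += 8
    lod = []
    for _ in range(lod_level):
        (nbytes,) = struct.unpack_from("<Q", buf, pos)
        pos += 8
        lod.append(list(struct.unpack_from("<%dQ" % (nbytes // 8), buf, pos)))
        pos += nbytes
    return version, desc, data, lod
```

A round trip such as `unpack_tensor(pack_tensor(b"desc", b"\x00" * 8, [[0, 2, 5]]), 8)` should return the original pieces. Placing the LoD section after the tensor data, as the table does, means a reader cannot reach the LoD without first knowing the data length from the TensorDesc.
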
--- a/develop/doc/design/model_format.html
+++ b/develop/doc/design/model_format.html
@@ -192,21 +192,18 @@
 <span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="Permalink to this headline">¶</a></h2>
 <p>The topology is saved as a plain text in a detailed self-contain protobuf file.</p>
 <p>The parameters are saved as a binary file. As we all know, the protobuf message has a limit of <a class="reference external" href="https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details">64M size</a>. We have done a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/pull/4610">benchmark experiment</a>, which shows that protobuf is not fit for the task.</p>
-<p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, the <code class="docutils literal"><span class="pre">name</span></code> of the tensor, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p>
+<p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p>
-<p>|HeaderLength|ContentLength|<strong>LoDTensorDesc</strong>|<strong>TensorValue</strong>|</p>
 <p>The table below shows a tensor&#8217;s byte view in detail. Note that all the signed values are written in the little-endian format.</p>
-<div class="highlight-text"><div class="highlight"><pre><span></span>[offset] [type]              [description] 
+<p>|field name  | type | description |
-0004     4 bytes integer      HeaderLength, the length of LoDTensorDesc
+| &#8212; | &#8212; | &#8212; |
-0008     4 bytes integer      ContentLength, the length of LodTensor Buffer
+| version | uint32_t | Version of saved file. Always 0 now. |
-0009     1 bytes char         TensorDesc
+| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
-00010    1 bytes char         TensorDesc
+| tensor desc | void* | TensorDesc protobuf binary message |
-...
+| tensor data | void* | Tensor&#8217;s data in binary format. The length of <code class="docutils literal"><span class="pre">tensor_data</span></code> is decided by <code class="docutils literal"><span class="pre">TensorDesc.dims()</span></code> and <code class="docutils literal"><span class="pre">TensorDesc.data_type()</span></code> |
-00100    1 bytes char         TensorValue
+| lod_level | uint64_t | Level of LoD |
-00101    1 bytes char         TensorValue
+| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
-00102    1 bytes char         TensorValue              ..
+| data of lod[0] | uint64_t*  | [Optional] lod[0].data() |
-...
+| ... | ... | ... |</p>
-</pre></div>
-</div>
 </div>
 <div class="section" id="summary">
 <span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="Permalink to this headline">¶</a></h2>

--- a/develop/doc/searchindex.js
+++ b/develop/doc/searchindex.js
--- a/develop/doc_cn/_sources/design/model_format.md.txt
+++ b/develop/doc_cn/_sources/design/model_format.md.txt
@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file.
 The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for the task.
-As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, the `name` of the tensor, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is, 
+As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is, 
-|HeaderLength|ContentLength|**LoDTensorDesc**|**TensorValue**|
 The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
-```text
+|field name  | type | description |
-[offset] [type]              [description] 
+| --- | --- | --- |
-0004     4 bytes integer      HeaderLength, the length of LoDTensorDesc
+| version | uint32_t | Version of saved file. Always 0 now. |
-0008     4 bytes integer      ContentLength, the length of LodTensor Buffer
+| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
-0009     1 bytes char         TensorDesc
+| tensor desc | void* | TensorDesc protobuf binary message |
-00010    1 bytes char         TensorDesc
+| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is decided by `TensorDesc.dims()` and `TensorDesc.data_type()` |
-...
+| lod_level | uint64_t | Level of LoD |
-00100    1 bytes char         TensorValue
+| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
-00101    1 bytes char         TensorValue
+| data of lod[0] | uint64_t*  | [Optional] lod[0].data() |
-00102    1 bytes char         TensorValue              ..
+| ... | ... | ... |
-...
-```
 ## Summary

--- a/develop/doc_cn/design/model_format.html
+++ b/develop/doc_cn/design/model_format.html
@@ -206,21 +206,18 @@
 <span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="永久链接至标题">¶</a></h2>
 <p>The topology is saved as a plain text in a detailed self-contain protobuf file.</p>
 <p>The parameters are saved as a binary file. As we all know, the protobuf message has a limit of <a class="reference external" href="https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details">64M size</a>. We have done a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/pull/4610">benchmark experiment</a>, which shows that protobuf is not fit for the task.</p>
-<p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, the <code class="docutils literal"><span class="pre">name</span></code> of the tensor, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p>
+<p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p>
-<p>|HeaderLength|ContentLength|<strong>LoDTensorDesc</strong>|<strong>TensorValue</strong>|</p>
 <p>The table below shows a tensor&#8217;s byte view in detail. Note that all the signed values are written in the little-endian format.</p>
-<div class="highlight-text"><div class="highlight"><pre><span></span>[offset] [type]              [description] 
+<p>|field name  | type | description |
-0004     4 bytes integer      HeaderLength, the length of LoDTensorDesc
+| &#8212; | &#8212; | &#8212; |
-0008     4 bytes integer      ContentLength, the length of LodTensor Buffer
+| version | uint32_t | Version of saved file. Always 0 now. |
-0009     1 bytes char         TensorDesc
+| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
-00010    1 bytes char         TensorDesc
+| tensor desc | void* | TensorDesc protobuf binary message |
-...
+| tensor data | void* | Tensor&#8217;s data in binary format. The length of <code class="docutils literal"><span class="pre">tensor_data</span></code> is decided by <code class="docutils literal"><span class="pre">TensorDesc.dims()</span></code> and <code class="docutils literal"><span class="pre">TensorDesc.data_type()</span></code> |
-00100    1 bytes char         TensorValue
+| lod_level | uint64_t | Level of LoD |
-00101    1 bytes char         TensorValue
+| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
-00102    1 bytes char         TensorValue              ..
+| data of lod[0] | uint64_t*  | [Optional] lod[0].data() |
-...
+| ... | ... | ... |</p>
-</pre></div>
-</div>
 </div>
 <div class="section" id="summary">
 <span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="永久链接至标题">¶</a></h2>

--- a/develop/doc_cn/searchindex.js
+++ b/develop/doc_cn/searchindex.js