Commit dc64308f, authored by: Travis CI

Deploy to GitHub Pages: efc2464f

Parent ecf559aa
...@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file. ...@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file.
The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for the task. The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for the task.
As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, the `name` of the tensor, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is, As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,
|HeaderLength|ContentLength|**LoDTensorDesc**|**TensorValue**|
The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format. The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
Removed byte view (old side):

```text
[offset] [type]          [description]
0004     4 bytes integer HeaderLength, the length of LoDTensorDesc
0008     4 bytes integer ContentLength, the length of LodTensor Buffer
0009     1 bytes char    TensorDesc
00010    1 bytes char    TensorDesc
...
00100    1 bytes char    TensorValue
00101    1 bytes char    TensorValue
00102    1 bytes char    TensorValue ..
...
```

Added field table (new side):

|field name | type | description |
| --- | --- | --- |
| version | uint32_t | Version of saved file. Always 0 now. |
| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
| tensor desc | void* | TensorDesc protobuf binary message |
| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is decided by `TensorDesc.dims()` and `TensorDesc.data_type()` |
| lod_level | uint64_t | Level of LoD |
| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
| data of lod[0] | uint64_t* | [Optional] lod[0].data() |
| ... | ... | ... |
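The field order in the new table can be illustrated with a minimal Python sketch. This is not Paddle's implementation: the helper names `write_record`, `read_record`, and `read_exact` and the `data_nbytes` parameter are hypothetical, the `TensorDesc` message is treated as opaque bytes rather than parsed, and the caller is assumed to supply the tensor data length, which the format derives from `TensorDesc.dims()` and `TensorDesc.data_type()`. All integers are written little-endian, as the text states.

```python
import io
import struct


def write_record(stream, desc_bytes, data_bytes, lod):
    """Write one tensor record in the field order listed in the table."""
    stream.write(struct.pack("<I", 0))                # version, always 0
    stream.write(struct.pack("<I", len(desc_bytes)))  # tensor desc length in bytes
    stream.write(desc_bytes)                          # TensorDesc message (opaque here)
    stream.write(data_bytes)                          # raw tensor buffer
    stream.write(struct.pack("<Q", len(lod)))         # lod_level
    for level in lod:
        stream.write(struct.pack("<Q", 8 * len(level)))       # length of lod[i] in bytes
        stream.write(struct.pack(f"<{len(level)}Q", *level))  # lod[i] offsets as uint64


def read_exact(stream, n):
    """Read exactly n bytes or fail, so truncated files are detected."""
    buf = stream.read(n)
    if len(buf) != n:
        raise EOFError(f"expected {n} bytes, got {len(buf)}")
    return buf


def read_record(stream, data_nbytes):
    """Read one record; data_nbytes must be derived from the desc by the caller."""
    (version,) = struct.unpack("<I", read_exact(stream, 4))
    if version != 0:
        raise ValueError(f"unsupported version {version}")
    (desc_len,) = struct.unpack("<I", read_exact(stream, 4))
    desc_bytes = read_exact(stream, desc_len)
    data_bytes = read_exact(stream, data_nbytes)
    (lod_level,) = struct.unpack("<Q", read_exact(stream, 8))
    lod = []
    for _ in range(lod_level):
        (nbytes,) = struct.unpack("<Q", read_exact(stream, 8))
        lod.append(list(struct.unpack(f"<{nbytes // 8}Q", read_exact(stream, nbytes))))
    return desc_bytes, data_bytes, lod


# Round trip with a placeholder desc and four float32 values.
buf = io.BytesIO()
write_record(buf, b"desc", struct.pack("<4f", 1.0, 2.0, 3.0, 4.0), [[0, 2, 4]])
buf.seek(0)
desc, data, lod = read_record(buf, data_nbytes=16)
```

Because the tensor data carries no length prefix of its own, a real reader would have to parse the `TensorDesc` before it can locate the LoD fields; the sketch sidesteps this by taking the length as an argument.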
## Summary ## Summary
......
...@@ -192,21 +192,18 @@ ...@@ -192,21 +192,18 @@
<span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="Permalink to this headline"></a></h2> <span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="Permalink to this headline"></a></h2>
<p>The topology is saved as a plain text in a detailed self-contain protobuf file.</p> <p>The topology is saved as a plain text in a detailed self-contain protobuf file.</p>
<p>The parameters are saved as a binary file. As we all know, the protobuf message has a limit of <a class="reference external" href="https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details">64M size</a>. We have done a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/pull/4610">benchmark experiment</a>, which shows that protobuf is not fit for the task.</p> <p>The parameters are saved as a binary file. As we all know, the protobuf message has a limit of <a class="reference external" href="https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details">64M size</a>. We have done a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/pull/4610">benchmark experiment</a>, which shows that protobuf is not fit for the task.</p>
<p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, the <code class="docutils literal"><span class="pre">name</span></code> of the tensor, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p> <p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. 
It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p>
<p>|HeaderLength|ContentLength|<strong>LoDTensorDesc</strong>|<strong>TensorValue</strong>|</p>
<p>The table below shows a tensor&#8217;s byte view in detail. Note that all the signed values are written in the little-endian format.</p> <p>The table below shows a tensor&#8217;s byte view in detail. Note that all the signed values are written in the little-endian format.</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>[offset] [type] [description] <p>|field name | type | description |
0004 4 bytes integer HeaderLength, the length of LoDTensorDesc | &#8212; | &#8212; | &#8212; |
0008 4 bytes integer ContentLength, the length of LodTensor Buffer | version | uint32_t | Version of saved file. Always 0 now. |
0009 1 bytes char TensorDesc | tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
00010 1 bytes char TensorDesc | tensor desc | void* | TensorDesc protobuf binary message |
... | tensor data | void* | Tensor&#8217;s data in binary format. The length of <code class="docutils literal"><span class="pre">tensor_data</span></code> is decided by <code class="docutils literal"><span class="pre">TensorDesc.dims()</span></code> and <code class="docutils literal"><span class="pre">TensorDesc.data_type()</span></code> |
00100 1 bytes char TensorValue | lod_level | uint64_t | Level of LoD |
00101 1 bytes char TensorValue | length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
00102 1 bytes char TensorValue .. | data of lod[0] | uint64_t* | [Optional] lod[0].data() |
... | ... | ... | ... |</p>
</pre></div>
</div>
</div> </div>
<div class="section" id="summary"> <div class="section" id="summary">
<span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="Permalink to this headline"></a></h2> <span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="Permalink to this headline"></a></h2>
......
The source diff is too large to display. You can view the blob instead.
...@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file. ...@@ -12,24 +12,22 @@ The topology is saved as a plain text in a detailed self-contain protobuf file.
The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for the task. The parameters are saved as a binary file. As we all know, the protobuf message has a limit of [64M size](https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details). We have done a [benchmark experiment](https://github.com/PaddlePaddle/Paddle/pull/4610), which shows that protobuf is not fit for the task.
As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, the `name` of the tensor, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is, As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md), and has a description information proto of [LoDTensorDesc](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99). We save the DescProto as the byte string header. It contains all the necessary information, such as the `dims`, and the `LoD` information in [LoDTensor](https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md). A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,
|HeaderLength|ContentLength|**LoDTensorDesc**|**TensorValue**|
The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format. The table below shows a tensor's byte view in detail. Note that all the signed values are written in the little-endian format.
Removed byte view (old side):

```text
[offset] [type]          [description]
0004     4 bytes integer HeaderLength, the length of LoDTensorDesc
0008     4 bytes integer ContentLength, the length of LodTensor Buffer
0009     1 bytes char    TensorDesc
00010    1 bytes char    TensorDesc
...
00100    1 bytes char    TensorValue
00101    1 bytes char    TensorValue
00102    1 bytes char    TensorValue ..
...
```

Added field table (new side):

|field name | type | description |
| --- | --- | --- |
| version | uint32_t | Version of saved file. Always 0 now. |
| tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
| tensor desc | void* | TensorDesc protobuf binary message |
| tensor data | void* | Tensor's data in binary format. The length of `tensor_data` is decided by `TensorDesc.dims()` and `TensorDesc.data_type()` |
| lod_level | uint64_t | Level of LoD |
| length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
| data of lod[0] | uint64_t* | [Optional] lod[0].data() |
| ... | ... | ... |
## Summary ## Summary
......
...@@ -206,21 +206,18 @@ ...@@ -206,21 +206,18 @@
<span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="永久链接至标题"></a></h2> <span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="永久链接至标题"></a></h2>
<p>The topology is saved as a plain text in a detailed self-contain protobuf file.</p> <p>The topology is saved as a plain text in a detailed self-contain protobuf file.</p>
<p>The parameters are saved as a binary file. As we all know, the protobuf message has a limit of <a class="reference external" href="https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details">64M size</a>. We have done a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/pull/4610">benchmark experiment</a>, which shows that protobuf is not fit for the task.</p> <p>The parameters are saved as a binary file. As we all know, the protobuf message has a limit of <a class="reference external" href="https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.SetTotalBytesLimit.details">64M size</a>. We have done a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/pull/4610">benchmark experiment</a>, which shows that protobuf is not fit for the task.</p>
<p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, the <code class="docutils literal"><span class="pre">name</span></code> of the tensor, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p> <p>As a result, we design a particular format for tensor serialization. By default, an arbitrary tensor in Paddle is a <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/lod_tensor.md">LoDTensor</a>, and has a description information proto of <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/framework.proto#L99">LoDTensorDesc</a>. We save the DescProto as the byte string header. 
It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code>, and the <code class="docutils literal"><span class="pre">LoD</span></code> information in <a class="reference external" href="https://github.com/PaddlePaddle/Paddle/blob/1c0a4c901c9fc881d120249c703b15d1c50dae7d/paddle/framework/lod_tensor.md">LoDTensor</a>. A tensor stores values in a continuous memory buffer. For speed we dump the raw memory to disk and save it as the byte string content. So, the binary format of one tensor is,</p>
<p>|HeaderLength|ContentLength|<strong>LoDTensorDesc</strong>|<strong>TensorValue</strong>|</p>
<p>The table below shows a tensor&#8217;s byte view in detail. Note that all the signed values are written in the little-endian format.</p> <p>The table below shows a tensor&#8217;s byte view in detail. Note that all the signed values are written in the little-endian format.</p>
<div class="highlight-text"><div class="highlight"><pre><span></span>[offset] [type] [description] <p>|field name | type | description |
0004 4 bytes integer HeaderLength, the length of LoDTensorDesc | &#8212; | &#8212; | &#8212; |
0008 4 bytes integer ContentLength, the length of LodTensor Buffer | version | uint32_t | Version of saved file. Always 0 now. |
0009 1 bytes char TensorDesc | tensor desc length | uint32_t | TensorDesc(Protobuf message) length in bytes. |
00010 1 bytes char TensorDesc | tensor desc | void* | TensorDesc protobuf binary message |
... | tensor data | void* | Tensor&#8217;s data in binary format. The length of <code class="docutils literal"><span class="pre">tensor_data</span></code> is decided by <code class="docutils literal"><span class="pre">TensorDesc.dims()</span></code> and <code class="docutils literal"><span class="pre">TensorDesc.data_type()</span></code> |
00100 1 bytes char TensorValue | lod_level | uint64_t | Level of LoD |
00101 1 bytes char TensorValue | length of lod[0] | uint64_t | [Optional] length of lod[0] in bytes. |
00102 1 bytes char TensorValue .. | data of lod[0] | uint64_t* | [Optional] lod[0].data() |
... | ... | ... | ... |</p>
</pre></div>
</div>
</div> </div>
<div class="section" id="summary"> <div class="section" id="summary">
<span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="永久链接至标题"></a></h2> <span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="永久链接至标题"></a></h2>
......
This diff is collapsed.