  Design Doc: Model Format













<span id="design-doc-model-format"></span><h1>Design Doc: Model Format<a class="headerlink" href="#design-doc-model-format" title="Permalink to this headline"></a></h1>
<span id="motivation"></span><h2>Motivation<a class="headerlink" href="#motivation" title="Permalink to this headline"></a></h2>
<p>A model is the output of the training process. A complete model consists of two parts: the <strong>topology</strong> and the <strong>parameters</strong>. To support industrial deployment, the model format must be self-contained and must not expose any training source code.</p>
<p>In PaddlePaddle, the <strong>topology</strong> is therefore represented as a <a class="reference external" href="">ProgramDesc</a>, which describes the model structure. The <strong>parameters</strong> contain all the trainable weights in the model. The format must support large parameters and efficient serialization and deserialization of them.</p>
<span id="implementation"></span><h2>Implementation<a class="headerlink" href="#implementation" title="Permalink to this headline"></a></h2>
<p>The topology is saved as plain text in a detailed, self-contained protobuf file.</p>
<p>The parameters are saved as a binary file. A protobuf message has a size limit of <a class="reference external" href="">64M</a>, and our <a class="reference external" href="">benchmark experiment</a> shows that protobuf is not well suited to storing the parameters.</p>
<p>As a result, we designed a dedicated format for tensor serialization. By default, any tensor in Paddle is a <a class="reference external" href="">LoDTensor</a> and is described by a <a class="reference external" href="">LoDTensorDesc</a> proto. We save this DescProto as the byte-string header. It contains all the necessary information, such as the <code class="docutils literal"><span class="pre">dims</span></code> and the <code class="docutils literal"><span class="pre">LoD</span></code> information of the <a class="reference external" href="">LoDTensor</a>. A tensor stores its values in a contiguous memory buffer. For speed, we dump the raw memory to disk and save it as the byte-string content.</p>
<p>The table below shows the resulting binary format of one tensor as a detailed byte view. Note that all integer values are written in little-endian format.</p>
<table border="1" class="docutils">
<thead>
<tr><th>Field name</th><th>Type</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td>version</td><td>uint32_t</td><td>Version of saved file. Always 0 now.</td></tr>
<tr><td>tensor desc length</td><td>uint32_t</td><td>TensorDesc (protobuf message) length in bytes.</td></tr>
<tr><td>tensor desc</td><td>void*</td><td>TensorDesc protobuf binary message.</td></tr>
<tr><td>tensor data</td><td>void*</td><td>Tensor&#8217;s data in binary format. The length of <code class="docutils literal"><span class="pre">tensor_data</span></code> is decided by <code class="docutils literal"><span class="pre">TensorDesc.dims()</span></code> and <code class="docutils literal"><span class="pre">TensorDesc.data_type()</span></code>.</td></tr>
<tr><td>lod_level</td><td>uint64_t</td><td>Level of LoD.</td></tr>
<tr><td>length of lod[0]</td><td>uint64_t</td><td>[Optional] Length of lod[0] in bytes.</td></tr>
<tr><td>data of lod[0]</td><td>uint64_t*</td><td>[Optional] lod[0].data()</td></tr>
<tr><td>...</td><td>...</td><td>...</td></tr>
</tbody>
</table>
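<p>The byte layout in the table can be sketched in a few lines of Python. This is a minimal illustration, not the actual Paddle implementation: the helper names (<code class="docutils literal"><span class="pre">write_tensor</span></code>, <code class="docutils literal"><span class="pre">read_tensor</span></code>, <code class="docutils literal"><span class="pre">tensor_data_nbytes</span></code>) are hypothetical, the TensorDesc is treated as an opaque byte string instead of a parsed protobuf message, and the caller supplies a function that returns the data length that a real reader would derive from <code class="docutils literal"><span class="pre">TensorDesc.dims()</span></code> and <code class="docutils literal"><span class="pre">TensorDesc.data_type()</span></code>.</p>

```python
import io
import math


def u32(n):
    """Encode n as a 4-byte little-endian unsigned integer."""
    return n.to_bytes(4, "little")


def u64(n):
    """Encode n as an 8-byte little-endian unsigned integer."""
    return n.to_bytes(8, "little")


def read_uint(stream, size):
    """Read a little-endian unsigned integer of `size` bytes."""
    raw = stream.read(size)
    if len(raw) != size:
        raise EOFError("truncated tensor record")
    return int.from_bytes(raw, "little")


def tensor_data_nbytes(dims, itemsize):
    """Data length: number of elements times bytes per element."""
    return math.prod(dims) * itemsize


def write_tensor(stream, desc_bytes, data_bytes, lod):
    """Write one tensor record in the field order of the table."""
    stream.write(u32(0))                    # version, always 0 for now
    stream.write(u32(len(desc_bytes)))      # tensor desc length in bytes
    stream.write(desc_bytes)                # tensor desc (opaque here)
    stream.write(data_bytes)                # raw tensor memory
    stream.write(u64(len(lod)))             # lod_level
    for level in lod:
        stream.write(u64(len(level) * 8))   # length of this level in bytes
        for offset in level:
            stream.write(u64(offset))       # level data as uint64 values


def read_tensor(stream, data_nbytes_from_desc):
    """Read one record; `data_nbytes_from_desc` stands in for desc parsing."""
    version = read_uint(stream, 4)
    if version != 0:
        raise ValueError("unsupported version: %d" % version)
    desc_bytes = stream.read(read_uint(stream, 4))
    data_bytes = stream.read(data_nbytes_from_desc(desc_bytes))
    lod = []
    for _ in range(read_uint(stream, 8)):
        nbytes = read_uint(stream, 8)
        raw = stream.read(nbytes)
        lod.append([int.from_bytes(raw[i:i + 8], "little")
                    for i in range(0, nbytes, 8)])
    return desc_bytes, data_bytes, lod


# Round trip with a made-up 2x2 tensor of 4-byte elements and one LoD level.
data = bytes(range(tensor_data_nbytes([2, 2], 4)))
buf = io.BytesIO()
write_tensor(buf, b"opaque-desc", data, [[0, 2, 5]])
buf.seek(0)
desc, payload, lod = read_tensor(
    buf, lambda _desc: tensor_data_nbytes([2, 2], 4))
```

<p>Because the tensor data is written as one raw buffer, the writer does no per-element encoding, which is the speed argument made above. The reader, in turn, must decode the desc before it can know how many data bytes follow.</p>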
<span id="summary"></span><h2>Summary<a class="headerlink" href="#summary" title="Permalink to this headline"></a></h2>
<ul class="simple">
<li>We introduce a model format.</li>
<li>The model, represented by its forward-pass computation procedure, is saved in a <strong>ProgramDesc</strong> protobuf message.</li>
<li>A set of binary tensors in the format specified above describes the <strong>parameters</strong>.</li>
