PaddlePaddle / Paddle · Issue #7722
Opened Jan 22, 2018 by saxon_zh (Guest)

Saving all trained params in a single file

Created by: sidgoyal78

Merging all params in a single file

For inference, we want to have 2 files: one for the ProgramDesc and one that holds all the params together. We look at one approach to do this.

Understanding save/load ops (C++ side)

  • From the model_format design doc, we see some details in the table, but it is not entirely clear. So we will look at the implementation.

To understand the current serialization, we look at save_op:

  • In save_op the main work is performed by SerializeToStream(<ofstream>, <framework::LoDTensor>, ...) (code). This function writes a version number, the size of the LoD, and the actual LoD data.

  • Then it calls SerializeToStream(<ofstream>, <Tensor>, ...) (code). This function writes a version number, the tensor description as a serialized protobuf, and the actual data.

The corresponding load_op performs the matching deserialization (respecting the ordering used in save_op).
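To make the layout concrete, here is a rough Python sketch of the byte stream that the two SerializeToStream calls produce. The exact field widths and the protobuf encoding are simplifying assumptions for illustration, not the real C++ implementation:

```python
import struct

def serialize_lod_tensor(lod, desc_bytes, data):
    """Sketch of the save_op stream: LoD header first, then the tensor part."""
    out = struct.pack("<I", 0)                 # LoDTensor version number (width assumed)
    out += struct.pack("<Q", len(lod))         # size of the LoD (number of levels)
    for level in lod:
        out += struct.pack("<Q", len(level) * 8)          # byte size of this level
        out += struct.pack("<%dQ" % len(level), *level)   # actual LoD offsets
    out += struct.pack("<I", 0)                # Tensor version number
    out += struct.pack("<i", len(desc_bytes))  # length of serialized TensorDesc proto
    out += desc_bytes                          # tensor description (protobuf bytes)
    out += data                                # raw tensor data
    return out
```

Loading simply reads the fields back in the same order, which is why the ordering in save_op matters.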

Understanding how a model is saved (python api)

Now we look at how save/load works for saving actual model params, via the implementation of save_vars in fluid (code). We see that a new program is created, and a save op is appended for each var that is persistable. Then the executor runs this program.
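A toy sketch of this pattern, with stand-in classes (the real fluid Program, Var, and op interfaces are more involved than this):

```python
from collections import namedtuple

# Toy stand-ins for illustration; not the real fluid classes.
Var = namedtuple("Var", ["name", "persistable"])

class Program:
    def __init__(self):
        self.ops = []

    def append_op(self, op_type, **attrs):
        self.ops.append((op_type, attrs))

def save_vars(all_vars, dirname):
    """Build a fresh program with one 'save' op per persistable var;
    the executor would then run this program."""
    save_program = Program()
    for var in all_vars:
        if var.persistable:
            save_program.append_op("save", var=var.name,
                                   file_path=dirname + "/" + var.name)
    return save_program

# Only the persistable var gets a save op appended.
prog = save_vars([Var("fc_w", True), Var("tmp", False)], "model")
```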

Approach

We basically make two assumptions:

  • For both load/save, the order of iterating over the variables is the same. (This should hopefully be true)
  • We don't worry about the overwrite option that exists in save_op.

While saving:

  • We store a uint64_t number in addition to the actual serialized bytes from the original save. This number tells us the size of the serialized LoDTensor in bytes.

  • When save is called for the first time, we create a file and build a string holding the serialized LoDTensor data. We first write the size of this string as a fixed-width (uint64_t) number, and then write the string itself.

  • When save is called later, we seek to the end of the file and append 2 things: the size of the string and the string itself.
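The two saving cases above can be sketched as a single append helper (a Python sketch of the idea, not the actual op code):

```python
import os
import struct

def append_chunk(path, serialized):
    """Append one param: a uint64 size prefix, then the serialized LoDTensor bytes."""
    mode = "ab" if os.path.exists(path) else "wb"    # first call creates the file
    with open(path, mode) as f:                      # later calls append at the end
        f.write(struct.pack("<Q", len(serialized)))  # fixed-width (uint64) size
        f.write(serialized)                          # the serialized bytes themselves
```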

While loading:

  • We pass an additional attribute in order to load the correct chunk of the parameter file: a counter value (counting from 0) that gives the relative order of the different params.

  • With this counter and the extra size information that we stored, we can hop to the appropriate part of the file, read the chunk, and deserialize it.
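A sketch of the hopping logic while loading (Python, for illustration only; the real op would deserialize the returned bytes into a LoDTensor):

```python
import struct

def load_chunk(path, counter):
    """Skip `counter` chunks using their uint64 size prefixes, then read one."""
    with open(path, "rb") as f:
        for _ in range(counter):
            (size,) = struct.unpack("<Q", f.read(8))
            f.seek(size, 1)                  # hop over this param's bytes
        (size,) = struct.unpack("<Q", f.read(8))
        return f.read(size)                  # the serialized LoDTensor to deserialize
```

Note this relies on the first assumption above: the counter is only meaningful if save and load iterate over the variables in the same order.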

For implementation, I think it will be better to have another op for this (rather than replacing the original save_op/load_op), so that it is easier to debug; also, I don't know the details of how load_op and save_op are used in the distributed version as of now.

Reference: paddlepaddle/Paddle#7722