add sharded checkpoint loading for AutoTP path to reduce the peak mem… (#3102)
* add sharded checkpoint loading for AutoTP path to reduce the peak memory in initialization stage

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* fix gptj sharded checkpoint loading problem

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

---------

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>