Feat: nccl_use_compute_stream support batch accumulation (#4618)
* NCCL logical refine timeshape
* Insert nccl ops after acc interface
* Insert NCCL ops after acc implement; need refine or add new acc_tick_op
* deadlock
* speed up and run
* add acc tick, fix deadlock; and add nccl comm debug log
* refine log: rm cc_debug_log and cclog
* use reference for speed up
* refine code for review
* fix for review

Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>