Fix NCCLBcast hang up bug in Parallel Executor (#11377)
* 1. Create the buddy allocator in each place before running NcclBcast on the variables.
  2. Check the memory usage of all GPUs rather than only the first one.
* 1. Make NCCLGroupGuard guard only the ncclBcast part, which avoids ncclGroupEnd blocking exception throwing.
  2. Note the usage of NCCLGroupGuard.
* Remove the memory usage check of GPUs.
* Fix code style.
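
The guard-scoping change can be illustrated with a minimal, self-contained sketch. It does not use the real NCCL library or Paddle's actual NCCLGroupGuard: `MockGroupStart`, `MockGroupEnd`, `MockBcast`, `GroupGuard`, and `BroadcastAll` are hypothetical stand-ins that only record the order of calls. The idea shown is that work which may throw runs before the guard is constructed, so an exception never has to unwind through the group-end call.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for ncclGroupStart / ncclGroupEnd / ncclBcast.
// They only record the call order; no real NCCL call is made.
static std::vector<std::string> g_log;
void MockGroupStart() { g_log.push_back("start"); }
void MockGroupEnd() { g_log.push_back("end"); }
void MockBcast(int device) {
  assert(device >= 0);  // sanity check on the illustrative device id
  g_log.push_back("bcast:" + std::to_string(device));
}

// RAII guard: opens the group on construction, closes it on destruction.
class GroupGuard {
 public:
  GroupGuard() { MockGroupStart(); }
  ~GroupGuard() { MockGroupEnd(); }
  GroupGuard(const GroupGuard&) = delete;
  GroupGuard& operator=(const GroupGuard&) = delete;
};

// Preparation that may throw runs OUTSIDE the guard; the guard wraps
// only the broadcast calls.
void BroadcastAll(const std::vector<int>& devices, bool fail_in_prepare) {
  for (int d : devices) {
    if (fail_in_prepare) throw std::runtime_error("prepare failed");
    g_log.push_back("prepare:" + std::to_string(d));
  }
  {
    GroupGuard guard;  // narrow scope: broadcast part only
    for (int d : devices) MockBcast(d);
  }  // group end runs here, after all broadcasts were issued
}
```

With this layout, a failure during preparation propagates before any group is opened, so the destructor that would call the group-end function never runs on that path.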