[XPU] fix the dataloader problem in RDMA env (#54150)
* [kunlun] fix the dataloader problem in RDMA env

  When running multi-machine training with Paddle DataLoader, an unexpected segmentation fault is raised in the DataLoader process, and the traceback leads back to a runtime error reporting that the DataLoader workers exited unexpectedly. Similar problems have been discussed before and attributed to OpenCV misbehaving in a multiprocessing environment. See https://stackoverflow.com/questions/54013846/pytorch-dataloader-stucked-if-using-opencv-resize-method

* code style
* fix 'RuntimeError: context has already been set'
* Update dataloader_iter.py

  The spawn method raises the error 'Can't pickle local object' in some situations.

* code format check
* code style
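The two errors named above ('context has already been set' and 'Can't pickle local object') are standard Python multiprocessing failures. The following is a minimal sketch of how they arise and how they can be guarded against, using only the standard library. It is not the code from dataloader_iter.py; the helper names `ensure_start_method`, `is_picklable`, and `make_local_worker` are hypothetical and exist only for illustration.

```python
import multiprocessing as mp
import pickle


def ensure_start_method(method):
    """Set the global start method only if none has been fixed yet.

    Calling mp.set_start_method() a second time raises
    'RuntimeError: context has already been set', so check first and
    tolerate the error if another caller set it in the meantime.
    Returns the start method that is actually in effect.
    """
    if mp.get_start_method(allow_none=True) is None:
        try:
            mp.set_start_method(method)
        except RuntimeError:
            pass  # already set elsewhere; keep the existing context
    return mp.get_start_method()


def is_picklable(obj):
    """Return True if obj survives pickle.dumps().

    With the 'spawn' start method, worker targets and their arguments
    are pickled, so anything that fails here cannot be sent to a worker.
    """
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError, TypeError):
        return False


def make_local_worker():
    """Build a function defined inside another function (hypothetical).

    Such local objects trigger 'Can't pickle local object' under spawn.
    """
    def local_worker(x):
        return x + 1
    return local_worker


if __name__ == "__main__":
    print("active start method:", ensure_start_method("spawn"))
    print("built-in len picklable:", is_picklable(len))
    print("local function picklable:", is_picklable(make_local_worker()))
```

Under these assumptions, moving worker functions to module level avoids the pickling error, and `mp.get_context("spawn")` is an alternative that leaves the global start method untouched.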