From 3d6934e35b81ff892a6ff1c917723d81a6c651ee Mon Sep 17 00:00:00 2001
From: Wu Yi
Date: Tue, 29 May 2018 20:31:13 +0800
Subject: [PATCH] update benchmark doc (#10995)

* update benchmark doc

* update by comment
---
 benchmark/fluid/README.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/benchmark/fluid/README.md b/benchmark/fluid/README.md
index 065df2edb..7071e9fdc 100644
--- a/benchmark/fluid/README.md
+++ b/benchmark/fluid/README.md
@@ -58,3 +58,14 @@ kubectl create -f myjob/
 ```
 
 The job shall start.
+
+
+## Notes for Running Fluid Distributed with NCCL2 and RDMA
+
+Before running NCCL2 distributed jobs, check whether your node has multiple network interfaces.
+If it does, set the environment variable `NCCL_SOCKET_IFNAME` to your actual network device,
+for example `export NCCL_SOCKET_IFNAME=eth0`.
+
+To run high-performance distributed training, you must prepare your hardware environment to
+support RDMA-enabled network communication. See [this note](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/fluid/howto/cluster/nccl2_rdma_training.md)
+for details.
-- 
GitLab