@@ -6,10 +6,14 @@ For an overview of trainer's role, please refer to [distributed training design
...
The parameters on the parameter servers need to be initialized. To provide maximum flexibility, we allow a trainer to initialize the parameters. Only one trainer does the initialization; the other trainers wait for the initialization to complete and then get the parameters from the parameter servers.
### Trainer Selection
To select the trainer for initialization, every trainer tries to acquire a distributed lock; whichever trainer acquires the lock does the initialization, as illustrated below:
<img src="./src/init_lock.png">
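
The lock-based selection can be approximated with a small in-process simulation. This is only a sketch under stated assumptions, not the actual implementation: `ParameterServer`, `try_acquire_init_lock`, `finish_init`, `get_params`, and `simulate` are hypothetical names, a `threading.Lock` stands in for the distributed lock, and threads stand in for trainer processes.

```python
import threading


class ParameterServer:
    """Hypothetical in-process stand-in for the parameter servers."""

    def __init__(self):
        # Assumption: a local lock plays the role of the distributed lock.
        self._init_lock = threading.Lock()
        self._initialized = threading.Event()
        self._params = {}

    def try_acquire_init_lock(self):
        # Non-blocking attempt. The lock is never released, so exactly one
        # caller ever gets True and becomes the initializing trainer.
        return self._init_lock.acquire(blocking=False)

    def finish_init(self, params):
        # Called only by the selected trainer once the parameters are ready.
        self._params = dict(params)
        self._initialized.set()

    def get_params(self):
        # Every trainer blocks here until initialization has completed.
        self._initialized.wait()
        return dict(self._params)


def run_trainer(trainer_id, server, results, initializers):
    if server.try_acquire_init_lock():
        # This trainer owns the lock, so it initializes the parameters.
        initializers.append(trainer_id)
        server.finish_init({"w": [0.0, 0.0], "b": [0.0]})
    # All trainers, including the initializer, read from the server.
    results[trainer_id] = server.get_params()


def simulate(num_trainers=4):
    server = ParameterServer()
    results = {}
    initializers = []
    threads = [
        threading.Thread(target=run_trainer,
                         args=(i, server, results, initializers))
        for i in range(num_trainers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return initializers, results
```

Calling `simulate(4)` returns the list of trainers that performed the initialization and the parameters each trainer received. Which trainer wins the lock depends on thread scheduling, but only one ever does, and every trainer ends up with the same parameters.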
### Selection Process
The selection process is encapsulated in the C API function: