Fluid distributed machine_translation raises ValueError: Variable fc_2.b_0@GRAD.block1.trainer_0 has been created before.The previous persistable is True; the new persistable is False. They are not matched
Created by: xymyeah
Built from the latest code at this commit: https://github.com/PaddlePaddle/Paddle/commit/78cc64a55cbba36cef56d8436a583323e001b7e5
With 1 pserver and 1 trainer, the job runs through normally.
With 5 pservers and 5 trainers, the following exception is raised:
Traceback (most recent call last):
  File "trainer.py", line 187, in <module>
    main()
  File "trainer.py", line 155, in main
    pserver_prog = t.get_pserver_program(current_endpoint)
  File "/usr/local/lib/python2.7/site-packages/paddle/v2/fluid/distribute_transpiler.py", line 566, in get_pserver_program
    self._append_pserver_ops(optimize_block, op, endpoint)
  File "/usr/local/lib/python2.7/site-packages/paddle/v2/fluid/distribute_transpiler.py", line 377, in _append_pserver_ops
    pserver_block, grad_block, self.trainers)
  File "/usr/local/lib/python2.7/site-packages/paddle/v2/fluid/distribute_transpiler.py", line 325, in _create_var_for_trainers
    shape=var.shape)
  File "/usr/local/lib/python2.7/site-packages/paddle/v2/fluid/framework.py", line 712, in create_var
    var = Variable(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/paddle/v2/fluid/framework.py", line 238, in __init__
    self.name, self.persistable, persistable))
ValueError: Variable fc_2.b_0@GRAD.block1.trainer_0 has been created before.The previous persistable is True; the new persistable is False. They are not matched
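
To make the error message easier to reason about, here is a toy sketch of the kind of consistency check that appears to be failing. It assumes that creating a variable twice under the same name in one block requires the `persistable` flags to agree. `ToyBlock` and `demo` are hypothetical names made up for illustration; this is not Paddle's code, and it does not show why the transpiler ends up creating the variable twice.

```python
# Toy model only: ToyBlock is hypothetical and is NOT Paddle's implementation.
class ToyBlock(object):
    def __init__(self):
        # Maps variable name -> persistable flag recorded at first creation.
        self.vars = {}

    def create_var(self, name, persistable=False):
        if name in self.vars:
            previous = self.vars[name]
            # Assumed rule: re-creating a name must keep the same flag.
            if previous != persistable:
                raise ValueError(
                    "Variable {0} has been created before.The previous "
                    "persistable is {1}; the new persistable is {2}. "
                    "They are not matched".format(name, previous, persistable))
            return name
        self.vars[name] = persistable
        return name


def demo():
    """Create the same name twice with different flags; return the error text."""
    block = ToyBlock()
    name = "fc_2.b_0@GRAD.block1.trainer_0"
    block.create_var(name, persistable=True)
    try:
        block.create_var(name, persistable=False)
    except ValueError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(demo())
```

If this reading is right, the question is which earlier step created `fc_2.b_0@GRAD.block1.trainer_0` in the pserver block as persistable before `_create_var_for_trainers` tried to create it again as non-persistable.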