Distributed training: setting is_distributed=True on fluid.layers.embedding causes a RuntimeError
Created by: lzha106
When running distributed training with paddle 1.4.1, setting is_distributed=True on the fluid embedding interface causes a RuntimeError:
`fluid.layers.embedding(is_sparse=True, is_distributed=True)`
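
For reference, below is a minimal sketch of the kind of program being described. It is not the reporter's actual train_dist.py; the variable names, vocabulary size, and the small network around the embedding are illustrative assumptions. The error surfaces from optimizer.minimize(loss), which in the reporter's run is reached through the fleet distributed_transpiler wrapper shown in the traceback that follows.

```python
import paddle.fluid as fluid

# Illustrative names and shapes only; the real network is not shown in the issue.
word_ids = fluid.layers.data(name='word_ids', shape=[1], dtype='int64')
label = fluid.layers.data(name='label', shape=[1], dtype='int64')

# Embedding configured as in the report: sparse update + distributed lookup table.
emb = fluid.layers.embedding(
    input=word_ids,
    size=[100000, 64],  # [vocab_size, emb_dim], assumed values
    is_sparse=True,
    is_distributed=True)

prediction = fluid.layers.fc(input=emb, size=2, act='softmax')
cost = fluid.layers.cross_entropy(input=prediction, label=label)
loss = fluid.layers.mean(cost)

optimizer = fluid.optimizer.SGD(learning_rate=0.01)
# In the distributed run this call goes through the fleet distributed_transpiler
# minimize wrapper (see traceback) and ends up in _process_distribute_lookuptable,
# where the RuntimeError below is raised.
optimizer.minimize(loss)
```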
The error message is as follows:

File "train_dist.py", line 202, in <module>
    train()
File "train_dist.py", line 151, in train
    optimizer.minimize(loss)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/incubate/fleet/parameter_server/distributed_transpiler/__init__.py", line 283, in minimize
    loss, startup_program, parameter_list, no_grad_set)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/optimizer.py", line 510, in minimize
    loss, startup_program=startup_program, params_grads=params_grads)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/optimizer.py", line 472, in apply_optimize
    optimize_ops = self.apply_gradients(params_grads)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/optimizer.py", line 434, in apply_gradients
    self._process_distribute_lookuptable(params_grads)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/optimizer.py", line 319, in _process_distribute_lookuptable
    table_name = find_distributed_lookup_table(program)
File "/usr/local/lib/python2.7/dist-packages/paddle/fluid/distribute_lookup_table.py", line 73, in find_distributed_lookup_table
    raise RuntimeError("all distributed lookup_table_ops"
RuntimeError: all distributed lookup_table_ops should have only one table