Unverified commit e5d416a8 · Author: Qiyang Min · Committer: GitHub

Merge pull request #1574 from velconia/change_transformer_delete_scope_strategy

Change transformer for adapting to new delete scope strategy
@@ -469,7 +469,7 @@ def train_loop(exe,
     # For faster executor
     exec_strategy = fluid.ExecutionStrategy()
     exec_strategy.use_experimental_executor = True
-    # exec_strategy.num_iteration_per_drop_scope = 5
+    exec_strategy.num_iteration_per_drop_scope = int(args.fetch_steps)
     build_strategy = fluid.BuildStrategy()
     # Since the token number differs among devices, customize gradient scale to
     # use token average cost among multi-devices. and the gradient scale is