Unverified commit 73321264, authored by WangXi, committed by GitHub

[hybrid][npu] fix npu clear float status in pipeline (#35165)

Parent: 669853f5
@@ -4654,15 +4654,22 @@ class PipelineOptimizer(object):
                   op.type == 'elementwise_div'):
                 device = f"{self._device}:all"
                 op._set_attr(self._op_device_key, device)
             elif self._is_weight_decay_op(op) and op.type == 'scale':
                 # set AdamW decay_coeff to device:all
                 op._set_attr(self._op_device_key, f"{self._device}:all")
+            elif op.type == "alloc_float_status" or op.type == "clear_float_status":
+                op._set_attr(self._op_device_key, f"{self._device}:all")
+                # NOTE(wangxi): NPU should only clear the float status
+                # once at each batch step
+                op._set_attr(self._op_role_key, self._op_role.LRSched)
+                float_status_name = op.output_arg_names[0]
+                float_status_var = block.var(float_status_name)
+                # FIXME(wangxi): pipeline lr schedule will exec on sub_scope(0)
+                # while update will exec on sub_scope(last_micro_step), should
+                # set persistable to use global scope
+                float_status_var.persistable = True
             else:
                 other_known_ops = [
                     'update_loss_scaling', 'reduce_any', 'concat', 'sum',
-                    'check_finite_and_unscale', 'alloc_float_status', 'memcpy'
+                    'check_finite_and_unscale', 'memcpy'
                 ]
                 assert op.type in other_known_ops, "For other ops without " \
                     "op_device set, they must be one of {}, but it " \
......
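For context: the added branch pins alloc_float_status/clear_float_status to every pipeline device, gives them the LRSched op role so the pipeline scheduler runs them once per batch step instead of once per micro-batch, and marks the float-status variable persistable so the lr-schedule sub-scope and the update sub-scope share one global-scope buffer. Below is a minimal sketch of the same attribute-setting pattern on a hand-built static program; it assumes the Paddle 2.x static-graph API, and the standalone program and the variable name float_status are illustrative only, not the optimizer's actual input (building, not running, the program does not require an NPU):

    import paddle
    from paddle.fluid import core

    paddle.enable_static()

    # Attribute keys and the op-role enum, the same ones PipelineOptimizer
    # reads through self._op_device_key / self._op_role_key / self._op_role.
    op_maker = core.op_proto_and_checker_maker
    OP_DEVICE_KEY = op_maker.kOpDeviceAttrName()  # "op_device"
    OP_ROLE_KEY = op_maker.kOpRoleAttrName()      # "op_role"

    prog = paddle.static.Program()
    block = prog.global_block()

    # Output buffer that alloc_float_status fills with the NPU overflow flags.
    # (Name and shape are hypothetical, for illustration.)
    float_status = block.create_var(
        name="float_status", dtype="float32", shape=[8])

    op = block.append_op(
        type="alloc_float_status", outputs={"FloatStatus": float_status})

    # Mirror the patch: run on all pipeline devices, once per batch step
    # (LRSched role), with the status variable kept in the global scope.
    op._set_attr(OP_DEVICE_KEY, "npu:all")
    op._set_attr(OP_ROLE_KEY, op_maker.OpRole.LRSched)
    float_status.persistable = True

The LRSched role is what limits the clear to one execution per batch step; per the FIXME in the diff, the lr schedule executes on sub_scope(0) while the optimizer update executes on sub_scope(last_micro_step), so persistable=True is needed to lift the status variable out of the per-micro-step sub-scopes into the shared global scope.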