* enable npu alignment * support flatten_params/grads * support clip by global norm * remove memset in coalesce_tensor_op * fix npu kernel of sum op when input is one tensor * add ut for flatten_param_grads+regularizer * fix ut * fix typo
拖放文件到此处或点击上传