Unverified commit 5cbd946f, authored by XiaociZhang, committed by GitHub

[XPU] Check return value of bkcl call (#55039)

Stop early if the return value of a bkcl call is not BKCL_SUCCESS.
Parent d6ff59bb
@@ -265,7 +265,8 @@ std::shared_ptr<ProcessGroup::Task> ProcessGroupBKCL::Collective(
   const auto* calc_ctx = place_to_calc_ctx_[key];
   const auto& comm_ctx = place_to_comm_ctx_[key];
   auto bkcl_stream = use_calc_stream ? calc_ctx->stream() : comm_ctx->stream();
-  fn(out_tensor, in_tensor, comm_ctx->bkcl_context(), bkcl_stream);
+  PADDLE_ENFORCE_XPU_SUCCESS(
+      fn(out_tensor, in_tensor, comm_ctx->bkcl_context(), bkcl_stream));
   if (!use_calc_stream) {
     PADDLE_ENFORCE_NOT_NULL(
......
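
For readers unfamiliar with the pattern, the change wraps a call whose status code was previously discarded in a check that aborts on failure. The following is a minimal standalone sketch of that idea, not Paddle's implementation: `FakeBkclResult`, `CHECK_FAKE_BKCL_SUCCESS`, `CollectiveFn`, and `RunCollective` are hypothetical names invented for illustration, and the real `PADDLE_ENFORCE_XPU_SUCCESS` macro may report errors differently.

```cpp
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the BKCL status type (assumption, not the real enum).
enum FakeBkclResult { FAKE_BKCL_SUCCESS = 0, FAKE_BKCL_ERROR = 1 };

// Hypothetical check macro: evaluates the call exactly once and throws
// if the returned status is not FAKE_BKCL_SUCCESS.
#define CHECK_FAKE_BKCL_SUCCESS(call)                          \
  do {                                                         \
    const FakeBkclResult status_ = (call);                     \
    if (status_ != FAKE_BKCL_SUCCESS) {                        \
      std::ostringstream msg_;                                 \
      msg_ << #call << " failed with status "                  \
           << static_cast<int>(status_);                       \
      throw std::runtime_error(msg_.str());                    \
    }                                                          \
  } while (0)

// Hypothetical signature for a collective operation that returns a status.
using CollectiveFn = std::function<FakeBkclResult(int* out, const int* in)>;

// Mirrors the shape of the patched call site: run the collective, stop
// immediately on failure, and only then continue with follow-up work.
// Returns true when the code after the check was reached.
bool RunCollective(const CollectiveFn& fn, int* out, const int* in) {
  CHECK_FAKE_BKCL_SUCCESS(fn(out, in));
  return true;
}
```

Without the check, a failing `fn` would let `RunCollective` continue as if the collective had succeeded, which is the behavior the commit removes.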