Commit 43adf281, author: mindspore-ci-bot, committer: Gitee

!77 fix bug of bert pre training

Merge pull request !77 from amongo/FixBugOfBertPreTraining
@@ -403,9 +403,6 @@ class BertTrainOneStepWithLossScaleCell(nn.Cell):
                   sens=None):
         """Defines the computation performed."""
         weights = self.weights
-        # alloc status
-        init = self.alloc_status()
-        self.clear_before_grad(init)
         loss = self.network(input_ids,
                             input_mask,
                             token_type_id,
@@ -417,6 +414,9 @@ class BertTrainOneStepWithLossScaleCell(nn.Cell):
             scaling_sens = self.loss_scale
         else:
             scaling_sens = sens
+        # alloc status and clear should be right before gradoperation
+        init = self.alloc_status()
+        self.clear_before_grad(init)
         grads = self.grad(self.network, weights)(input_ids,
                                                  input_mask,
                                                  token_type_id,
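
The commit does not say why the status clear has to sit directly before the gradient operation. One plausible reading is that the overflow flag should reflect only the gradient computation, not anything that ran earlier in the step. The toy sketch below illustrates that ordering effect in plain Python. `OverflowStatus`, `run_op`, and both step functions are hypothetical stand-ins written for this illustration; they are not MindSpore APIs and do not model graph-mode execution.

```python
FP16_MAX = 65504.0  # largest finite float16 value


class OverflowStatus:
    """Hypothetical stand-in for a device overflow flag (not a MindSpore API)."""

    def __init__(self):
        self.overflow = False

    def clear(self):
        self.overflow = False


def run_op(status, value):
    """Pretend operator: raises the flag if the value exceeds the fp16 range."""
    if abs(value) > FP16_MAX:
        status.overflow = True
    return value


def step_clear_before_forward(forward_value, grad_value):
    """Old ordering: clear the flag, then run forward, then gradients."""
    status = OverflowStatus()
    status.clear()
    run_op(status, forward_value)  # standalone forward pass for the loss
    run_op(status, grad_value)     # gradient computation
    return status.overflow


def step_clear_before_grad(forward_value, grad_value):
    """New ordering: run forward, clear the flag, then run gradients."""
    status = OverflowStatus()
    run_op(status, forward_value)  # standalone forward pass for the loss
    status.clear()                 # clear right before the gradient computation
    run_op(status, grad_value)
    return status.overflow


# A forward value outside the fp16 range with a well-behaved gradient.
old_flag = step_clear_before_forward(1.0e5, 1.0)
new_flag = step_clear_before_grad(1.0e5, 1.0)
```

In this toy model the old ordering reports an overflow caused by the standalone forward pass, while the new ordering reports only what happened during the gradient computation. Both orderings still flag an overflow that occurs in the gradient step itself.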