Does the argsort OP have a problem with backpropagation?
Created by: huangguanMayday
logits is a tensor of shape batch * batch:
Replacing
hardest_neg_score = fluid.layers.reduce_max(logits, dim=1, keep_dim=True)
with:
hardest_neg_score3 = fluid.layers.argsort(input=logits, axis=1, name='hardest_neg_score3')
hardest_neg_score3 = hardest_neg_score3[0]
hardest_neg_score = fluid.layers.slice(input=hardest_neg_score3, axes=[1], starts=[int(batch_size - 1)], ends=[int(batch_size)])
changes the model's convergence from fine to problematic.
I have verified that the forward-pass results of the two versions are identical.
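For reference, here is a framework-independent sketch of what the gradients should look like if both versions are mathematically equivalent. It does not call Paddle at all; `row_max`, `sorted_last`, and `numeric_grad` are hypothetical helper names made up for this illustration. It uses central differences to show that "row-wise max" and "sort ascending, then take the last column" have the same forward values and the same gradient: 1 at each row's argmax and 0 elsewhere. Whether the sorted output of `fluid.layers.argsort` actually propagates gradients this way is exactly the open question here.

```python
# Pure-Python sketch (no Paddle): compare the row-wise max with
# "sort ascending, take the last column", forward and backward.

def row_max(logits):
    """Mimics reduce_max(dim=1, keep_dim=True) on a list of rows."""
    return [[max(row)] for row in logits]

def sorted_last(logits):
    """Mimics an ascending sort along axis 1 followed by slicing the last column."""
    return [[sorted(row)[-1]] for row in logits]

def numeric_grad(fn, logits, eps=1e-6):
    """Central-difference gradient of sum(fn(logits)) w.r.t. every entry."""
    def total(x):
        return sum(v for row in fn(x) for v in row)

    grad = []
    for i, row in enumerate(logits):
        grad_row = []
        for j in range(len(row)):
            plus = [list(r) for r in logits]
            minus = [list(r) for r in logits]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad_row.append((total(plus) - total(minus)) / (2 * eps))
        grad.append(grad_row)
    return grad

# Small batch * batch example with distinct values in every row.
logits = [[0.1, 0.7, 0.3],
          [0.9, 0.2, 0.4],
          [0.5, 0.6, 0.8]]

forward_equal = row_max(logits) == sorted_last(logits)
grad_max = numeric_grad(row_max, logits)
grad_sort = numeric_grad(sorted_last, logits)
```

Running the same kind of comparison on the gradients of `logits` inside the real program (for both the `reduce_max` version and the `argsort` + `slice` version) would show whether the backward pass of the second version matches this expected one-hot pattern.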