* Refine beam_search_op to output an extra parent_idx tensor. test=develop
* Fix the unittest test_beam_search_op. test=develop
* Fix the merging mistake. test=develop