Created by: emailweixu
When reduce_op is applied to a vector, the result is a scalar, which should be a rank-0 tensor. In the current implementation, the result is a rank-1 tensor. This causes a dimension mismatch and an eigen_assert failure in debug mode.
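
To make the rank issue concrete, here is a minimal sketch in plain C++ of how the output shape of a reduction could be computed. `ReducedDims` and `Rank` are hypothetical helpers written for illustration only; they are not taken from this patch or from Eigen, and the sketch assumes that reduced axes are dropped rather than kept as size-1 dimensions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the patch): returns the shape left after
// reducing `in_dims` over the axes listed in `reduce_axes`.
// Assumption: reduced axes are dropped, not kept as size-1 dimensions.
std::vector<int> ReducedDims(const std::vector<int>& in_dims,
                             const std::vector<int>& reduce_axes) {
  std::vector<int> out_dims;
  for (std::size_t axis = 0; axis < in_dims.size(); ++axis) {
    const bool reduced =
        std::find(reduce_axes.begin(), reduce_axes.end(),
                  static_cast<int>(axis)) != reduce_axes.end();
    if (!reduced) out_dims.push_back(in_dims[axis]);
  }
  return out_dims;  // an empty shape means rank 0, i.e. a scalar
}

// The rank of a tensor is the number of dimensions in its shape.
std::size_t Rank(const std::vector<int>& dims) { return dims.size(); }
```

Under these assumptions, reducing a vector of shape `{5}` over axis 0 leaves an empty shape, so `Rank` returns 0. A result shaped `{1}` has rank 1 instead, which is the kind of rank disagreement described above.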