    Allclose op (#27891) (#28069) · 6bb6cb27
    Committed by huangxu96
    * Fixed an allclose_op bug where some cases of fp64 inputs could not be handled.
    
    * Improved CUDA kernel performance.
    
    * Fixed a bug in the CUDA kernel where large-dimension inputs could not be handled, and added a unit test for it.
    
    * Added a test case for float32 input.
allclose_op.h 2.2 KB