    Allclose op (#27891) · d4668938
    Committed by huangxu96
    * Still has bugs.
    
    * Fixed an allclose_op bug that prevented it from handling some cases of fp64 inputs.
    
    * Improved CUDA kernel performance.
    
    * Changed CUDA code.
    
    * Fixed a bug in the CUDA kernel that prevented it from handling large-dimension inputs, and added a unit test for it.
    
    * Added a test case for float32 input.
allclose_op.cc 5.9 KB