    Add paddle.incubate.graph_send_recv API (#37205) · 39012536
    Siming Dai committed
    * add cpu version, using set: sum, min, max
    
    * add cpu version: mean
    
    * improve cpu code and fix dynamic memory allocation problem
    
    * fix arg error, add index check, delete fp16
    
    * fix bug in CudaAtomicMax and CudaAtomicMin
    
    * add CUDA version
    
    * fix grad_op bug for index
    
    * add op test, add correct cpu grad op
    
    * Add correct CUDA Mean grad
    
    * [Add] Successful MEAN and SUM
    
    * [Add] Successful MIN and MAX in CPU
    
    * [Add] Successful MIN and MAX in CUDA
    
    * fix windows dtype ci
    
    * fix ROCM ci by adding HIP flag
    
    * rename fused_gather_scatter to send_recv
    
    * unify name as send and recv
    
    * change zero index return time
    
    * add send_recv incubate api
    
    * fix index data type, add unittest case for API
    
    * delete redundant input tensor
    
    * fix en example and docs, add default value in pool_type
    
    * add shape check and max grid check
    
    * fix comment
    
    * fix index type bug
    
    * add const &
    
    * fix en docs
    
    * delete numpy in examples
    
    * add unittest for int input
    
    * fix send_recv comment
    
    * change send_recv to graph_send_recv
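For orientation, the gather-then-reduce behaviour that the sum, mean, min and max entries above refer to can be sketched in plain Python. This is only an illustration of the semantics, not the operator's CPU or CUDA kernel code. The function name `graph_send_recv_reference` is hypothetical, and two details are assumptions: the output has as many rows as `x`, and rows that receive no message stay zero.

```python
def graph_send_recv_reference(x, src_index, dst_index, pool_type="sum"):
    """Hypothetical pure-Python reference: gather rows x[src] and reduce them into out[dst]."""
    if len(src_index) != len(dst_index):
        raise ValueError("src_index and dst_index must have the same length")
    if pool_type not in ("sum", "mean", "min", "max"):
        raise ValueError("unsupported pool_type: %r" % (pool_type,))

    num_rows = len(x)
    width = len(x[0]) if x else 0
    # Assumption: the output has one row per input row, initialised to zero.
    out = [[0] * width for _ in range(num_rows)]
    count = [0] * num_rows  # messages received per destination row

    for src, dst in zip(src_index, dst_index):
        row = x[src]
        if count[dst] == 0 and pool_type in ("min", "max"):
            # The first message replaces the zero placeholder.
            out[dst] = list(row)
        elif pool_type in ("sum", "mean"):
            out[dst] = [a + b for a, b in zip(out[dst], row)]
        elif pool_type == "min":
            out[dst] = [min(a, b) for a, b in zip(out[dst], row)]
        else:
            out[dst] = [max(a, b) for a, b in zip(out[dst], row)]
        count[dst] += 1

    if pool_type == "mean":
        # Divide each accumulated sum by the number of messages received.
        for dst, n in enumerate(count):
            if n > 0:
                out[dst] = [v / n for v in out[dst]]
    return out


x = [[0, 2, 3], [1, 4, 5], [2, 6, 7]]
src_index = [0, 1, 2, 0]
dst_index = [1, 2, 1, 0]
print(graph_send_recv_reference(x, src_index, dst_index, "sum"))
```

In this example row 1 receives messages from rows 0 and 2, so the four pool types give different results for it, while rows 0 and 2 each receive a single message.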
cuda_primitives.h 9.6 KB