    Dropout optimize & clean broadcast inT and ElementwiseType (#52969) · d611e48c
    Bo Zhang committed
    * change judgement for DropoutGradGPUKernelDriver
    
    * add UnrollerWithoutVecSize; Loaddata to be refined after this
    
    * pass unittest
    
    * use same unroller with XPU
    
    * BroadcastWithInt64Index
    
    * BroadcastDataLoader template partial specialization
    
    * fix compile errors in ROCm
    
    * clean ElementwiseT and InT for BroadcastKernel
    
    * default axis and clean inT
    
    * remove redundant fast divmod computation
    
    * optimize drop_nd & drop_nd_grad
    
    * optimize BroadcastDataLoader bf16 fp16
    
    * rm InT etc. after merge develop
    
    * delete constexpr for windows ci
    
    * fix conflict
    
    * fix conflict with develop
    
    * fix conflict
    
    * new clean
    
    * clean
compare_kernel.cu 7.0 KB