- 28 Apr, 2023, 1 commit
Committed by Bo Zhang
* change judgement for DropoutGradGPUKernelDriver
* add UnrollerWithoutVecSize and after this Loaddata to be refined
* pass unittest
* use same unroller with XPU
* BroadcastWithInt64Index
* BroadcastDataLoader template partial specialization
* fix compile errs in ROCms
* clean ElementwiseT and InT for BroadcastKernel
* default axis and clean inT
* remove redundant fast divmod computation
* optimize drop_nd & drop_nd_grad
* optimize BroadcastDataLoader bf16 fp16
* rm InT etc. after merge develop
* delete constexpr for windows ci
* fix conflict
* fix conflic with develop
* fix conflic
* new clean
* clean
- 01 Feb, 2023, 1 commit
Committed by gouzil
* [Divide by 0 Error] add lu check
* [Divide by 0 Error] lu check migrate to c++
- 30 Jan, 2023, 1 commit
Committed by RedContritio
* add pivots type check and fix batchsize error
* add unittest for batchsize = 0
* fix nullptr in lu_unpack; fix batchsize error in LU_Unpack; add nullptr check in OneFunctor
* remove exception in device code
- 29 Jul, 2022, 1 commit
Committed by Lin Manhui
* Add kernel declarations
* Copy kernel implementation code
* Transfer implementation code
* Register new kernels
* Remove old kernels
* Fix code style
* Fix bugs
* mutable_data->HostAlloc
* Transfer infermeta
* Add yaml and update python api
* Add PADDLE_WITH_HIP check
* Update unittests
* Fix bugs
* Fix bugs
* Optimize directory structure
* Add output checks
* lu_impl.h->lu_kernel_impl.h

Co-authored-by: NBobholamovic <linmanhui@baidu.com>