- 04 Jan, 2022 3 commits
-
-
Committed by Qi Li
-
Committed by zhangkaihuo
-
Committed by houj04
-
- 31 Dec, 2021 3 commits
-
-
Committed by Zhangjingyu06
* [XPU]add split op for kunlun2,*test=kunlun
* [XPU]add split op for kunlun2,*test=kunlun
* [XPU]add split op for kunlun,*test=kunlun
Co-authored-by: QingshuChen <chenqingshu@baidu.com>
-
Committed by jakpiase
* 6 dims fix
* removed limitations of max dims
-
Committed by Chen Weihang
* unify data layout
* fix test_transfer_layout error
-
- 30 Dec, 2021 7 commits
-
-
Committed by zhiboniu
LGTM
-
Committed by houj04
* add sigmoid cross entropy with logits to kl1. test=kunlun
* add sigmoid cross entropy with logits to kl1. test=kunlun
-
Committed by zhangyk0314
Add exp, abs_grad, reciprocal, reciprocal_grad operator for XPU and update xpu2_op_list.h,test=kunlun (#38570)
-
Committed by Feng Xing
This PR adds a runtime flag, run_kp_kernel, which chooses which op to run for xpu2. There are two options: dynamically linked and built from kp.
-
Committed by Haohongxiang
* add cpu kernel of lstsq
* update
* modify code style
* modify unittest
* remove support for complex
-
Committed by zhangkaihuo
* Bind the cuSparse handle to DeviceContext to avoid creating and destroying it inside ops
* Add wrappers for the cuSparse dense/sparse conversion APIs
* Add unit tests for the wrapped APIs
-
Committed by Leo Guo
* Fix the bug of batch_norm and batch_norm_grad op. Add the "roi_align" and "roi_align_grad" op in xpu2 op list.
* Fix the bug of batch_norm and batch_norm_grad op. Add the "roi_align" and "roi_align_grad" op in xpu2 op list. test=kunlun
Co-authored-by: Zibin <guozibin@baidu.com>
-
- 29 Dec, 2021 4 commits
-
-
Committed by liutiexing
* add align for WorkQueue
* add spinlock
* merge develop
* merge
* Add EventsWaiter
* Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
* update OS info
* split host_event_recorder
* split host_event_recorder
* update
* update
* update
* update
* update
* update
* update
Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by ykkk2333
-
Committed by TTerror
* add argsort/scatter for kunlun
* update test_scatter
* update xpu.cmake
* update xpu.cmake
* fix scatter
-
Committed by sneaxiy
-
- 28 Dec, 2021 1 commit
-
-
Committed by houj04
* add reduce_prod_xpu. fix reduce_mean_xpu bug.
* add reduce_prod_xpu. fix reduce_mean_xpu bug. test=kunlun
-
- 27 Dec, 2021 2 commits
- 24 Dec, 2021 1 commit
-
-
Committed by zhiboniu
-
- 23 Dec, 2021 3 commits
-
-
Committed by Jacek Czaja
* First set of fixes
* Make GetBlob more likely to find a blob
* Lint
-
Committed by Wilber
* support external stream.
* update
* update
* update
-
Committed by houj04
-
- 20 Dec, 2021 1 commit
-
-
Committed by fwenguang
-
- 17 Dec, 2021 2 commits
-
-
Committed by From00
* Get GPU BasePtr from CUDA allocation
* Fix compile error for ROCm
* Add BasePtr function for IPUPlace in naive_best_fit_allocator.cc
* Add alignment for BuddyAllocator
* Set address alignment of BuddyAllocator to 32 bytes
* Fix CI error
* Remove code for naive_best_fit strategy
-
Committed by houj04
-
- 16 Dec, 2021 2 commits
-
-
Committed by danleifeng
* trainer_device fix and checknan tool for psgpu;test=develop
* disable show_one_table;test=develop
-
Committed by liutiexing
* add align for WorkQueue
* add spinlock
* merge develop
* merge
* Add EventsWaiter
* Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
* add os_info
* update
* update
* update
* update
* update
* update for bugfix
* update
* update
* update
Co-authored-by: liutiexing <liutiexing@google.com>
-
- 13 Dec, 2021 1 commit
-
-
Committed by jianghaicheng
-
- 10 Dec, 2021 3 commits
-
-
Committed by sneaxiy
-
Committed by jianghaicheng
-
Committed by jianghaicheng
-
- 09 Dec, 2021 2 commits
-
-
Committed by sneaxiy
* fix cuda atomicAdd for FP16
* try to fix ci
-
Committed by jianghaicheng
-
- 08 Dec, 2021 2 commits
-
-
Committed by liutiexing
* add align for WorkQueue
* add spinlock
* merge develop
* merge
* Add EventsWaiter
* Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2.
* Fix RecordEvent
Co-authored-by: liutiexing <liutiexing@google.com>
-
Committed by sneaxiy
* fix CUDA Graph H2D bug again
* fix no return bug
-
- 07 Dec, 2021 2 commits
-
-
Committed by TTerror
* format xpu op list
* format xpu op list
* update xpu1 op list
-
Committed by jianghaicheng
-
- 03 Dec, 2021 1 commit
-
-
Committed by jianghaicheng
-