- 29 Dec, 2021 3 commits
- 28 Dec, 2021 1 commit
-
-
Committed by houj04
* add reduce_prod_xpu. fix reduce_mean_xpu bug. * iadd reduce_prod_xpu. fix reduce_mean_xpu bug. test=kunlun
-
- 27 Dec, 2021 2 commits
- 24 Dec, 2021 1 commit
-
-
Committed by zhiboniu
-
- 23 Dec, 2021 3 commits
-
-
Committed by Jacek Czaja
* First set of fixes * - Make more likely to GetBlob find a blobs * - Lint
-
Committed by Wilber
* support external stream. * update * update * update
-
Committed by houj04
-
- 20 Dec, 2021 1 commit
-
-
Committed by fwenguang
-
- 17 Dec, 2021 2 commits
-
-
Committed by From00
* Get GPU BasePtr from CUDA allocation * Fix compile error for ROCm * Add BasePtr function for IPUPlace in naive_best_fit_allocator.cc * Add alignment for BuddyAllocator * Set address alignment of BuddyAllocator to 32 bytes * Fix CI error * Remove code for naive_best_fit strategy
-
Committed by houj04
-
- 16 Dec, 2021 2 commits
-
-
Committed by danleifeng
* trainer_device fix and checknan tool for psgpu;test=develop * disable show_one_table;test=develop
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * add os_info * update * update * update * update * update * update for bugfix * update * update * update Co-authored-by: Nliutiexing <liutiexing@google.com>
-
- 13 Dec, 2021 1 commit
-
-
Committed by jianghaicheng
-
- 10 Dec, 2021 3 commits
-
-
Committed by sneaxiy
-
Committed by jianghaicheng
-
Committed by jianghaicheng
-
- 09 Dec, 2021 2 commits
-
-
Committed by sneaxiy
* fix cuda atomicAdd for FP16 * try to fix ci
-
Committed by jianghaicheng
-
- 08 Dec, 2021 2 commits
-
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * Fix RecordEvent Co-authored-by: Nliutiexing <liutiexing@google.com>
-
Committed by sneaxiy
* fix CUDA Graph H2D bug again * fix no return bug
-
- 07 Dec, 2021 2 commits
-
-
Committed by TTerror
* format xpu op list * format xpu op list * update xpu1 op list
-
Committed by jianghaicheng
-
- 03 Dec, 2021 2 commits
-
-
Committed by jianghaicheng
-
Committed by ronnywang
* refine structure for cuda and rocm * update * update * update * update
-
- 01 Dec, 2021 3 commits
-
-
Committed by liutiexing
* add align for WorkQueue * add spinlock * merge develop * merge * Add EventsWaiter * Revert "Add EventsWaiter" This reverts commit e206173aa9be7401b83a53581627bfaf557c8fb2. * update HostEventTracer * update HostEventTracer * fix c++17 * update * update * update * update * fix bug Co-authored-by: Nliutiexing <liutiexing@google.com>
-
Committed by TTerror
* add prior_box for kunlun * update * update CMakeLists
-
Committed by Feiyu Chan
* add angle_op
-
- 29 Nov, 2021 3 commits
-
-
Committed by taixiurong
-
Committed by TTerror
* add expand_v2/expand_as_v2 for kunlun * update expand_as_v2 * update expand_as_v2 * support float16/bool * update xpu.cmake
-
Committed by piotrekobiIntel
-
- 27 Nov, 2021 1 commit
-
-
Committed by Aganlengzi
* [NPU] reorganization for device API abstraction * [NPU] delete old files * [NPU] fix npu_collective_helper * [NPU] fix collective_helper * [NPU] fix ut * [NPU] mod memory allocation and hccl_helper * [NPU] fix place_type * [NPU] split enfoce.h * move acl* call into npu_info * merge conflict * fix merge * merge conflict * merge conflict
-
- 24 Nov, 2021 2 commits
-
-
Committed by piotrekobiIntel
* Add second batch of deprecated mkldnn namespace and macro changes * Unlock CI * Fix temporary namespace alias placing
-
Committed by Wangzheee
* matmul_convert_int8 * matmul_convert_int8 * matmulconvert_int8 * Matmul_int8_convert: tensor*tensor * Matmul_int8_convert: tensor*tensor * Matmul_int8_convert: tensor*tensor
-
- 23 Nov, 2021 2 commits
- 19 Nov, 2021 1 commit
-
-
Committed by Siming Dai
* add cpu version, using set: sum, min, max * add cpu version: mean * improve cpu code and fix dynamic memory allcation problem * fix arg error, add index judge, delete fp16 * fix bug in CudaAtomicMax and CudaAtomicMin * add CUDA version * fix grad_op bug for index * add op test, add correct cpu grad op * Add correct CUDA Mean grad * [Add] Successful MEAN and SUM * [Add] Successful MIN and MAX in CPU * [Add] Successful MIN and MAX in CUDA * fix windows dtype ci * fix ROCM ci by adding HIP flag * rename fused_gather_scatter to send_recv * unify name as send and recv * change zero index return time * add send_recv incubate api * fix index data type, add unittest case for API * delete redundant input tensor * fix en example and docs, add default value in pool_type * add shape judge and max grid judge * fix comment * fix index type bug * add const & * fix en docs * delete numpy in examples * add unittest for int input * fix send_recv comment * change send_recv to graph_send_recv
-
- 18 Nov, 2021 1 commit
-
-
Committed by jakpiase
* fix * ci rerun * ci rerun * ci Rerun
-