- 18 Jan, 2023 6 commits
-
-
Committed by zhouweiwei2014
* [Zero-Dim] support input 0D for paddle.moveaxis/quantile * fix CI
-
Committed by RuohengMa
* add reduce_sum_int64 and reduce_sum_int8 xpu kernels * [PHI] add clip grad kernel with support type float32 and int32 * [PHI unittest] add clip_grad unit test * adapt code to clang-format * update xpu api output with clip_grad api * remove int8 support of reduce_sum xpu kernel since it can not pass unit tests * adapt license date, add code for XPUDataType convertion * add int8 support of reduce_sum * add reduce_sum unit tests for dtype int64, int8, and add more test cases * update license date * remove buggy bitwise and, or and xor xpu kernels, refine bitwise not xpu kernel * change license date
-
Committed by houj04
-
Committed by wawltor
* Add the cumsum 0d tensor * xpu and cpu judge the 0d tensor * change to 2022 to 2023 in new commit * fix the reverse logic
-
Committed by Zhang Zheng
-
Committed by jameszhang
* revert to use default XPU stream for computing. XPUContext now has a null stream by default. If you want to use a separate stream (e.g. in async collective communication), you should create a dedicated XPUContext and invoke its XPUContext::CreateStream() * minor
-
- 17 Jan, 2023 6 commits
-
-
Committed by zhangbo9674
* refine munmap freq for ref_cnt_mmap_allocator * add shm reuse logic * fix compile bug * fix compile bug * fix bug of file refcount * fix compile bug * fix compile bug * refine code for delete shm case * polish code * refine shm cache pool size setting logic * set buffer is 2 * refine shm cache size logic * refine max shm cache * refine shm cache size
-
Committed by yeliang2258
* add zero dims test * update code * fix zero dims * update code
-
Committed by pangyoki
* new exe supports CUDA Graph * fix * fix * fix * fix FLAGS_use_stream_safe_cuda_allocator in unittest * insert output of coalesce_tensor op to skip_gc_var * fix
-
Committed by YuanRisheng
* change feed_op to phi kernel * fix ci bugs * fix build bugs * fix ci bugs * fix compile bugs * fix ci bugs * perfect code * perfect comment code * fix install bugs * modify code according comment * remove visitor in feed_op * modify according comment * perfect code according comment * add infershape * fix py3 bugs * fix getexpected kernel type * fix getexpected kernel type * fix ci bugs * add registry for custom device * fix py3 bugs * fix floating point error * fix py3 test bugs
-
Committed by HongyuJia
-
Committed by Xiaoxu Chen
* support elementwise base func * fix compiling error and add test * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * another magic * add skip rename strategy * support add vjp * support add with new axis cal * support sub vjp * [prim] add multiply vjp rules * [prim] add multiply vjp rules * [prim] fix no infershape with composite in _append_backward_ops * [prim] add expand vjp rule * [prim] add exp vjp rule * uncomment infer shape for reshape/sum static prim api * [prim] fix tanh nullptr error * remove some print message * fix magic number in run_program relative tests @JiaBinYang * [prim] add expand,multiply,exp vjp rules * fix only support single direction reduce error * infer reduce dims using out dims Co-authored-by: JiabinYang <360788950@qq.com>
-
- 16 Jan, 2023 7 commits
-
-
Committed by HappyHeavyRain
* support the 'data_transform' for generating static graph ops * reset 'pow' code * change the 'GetKernelTypeForVar'
-
Committed by zlsh80826
* Update warpctc for cuda-12 * Deprecate cudaProfilerInitialize for CUDA > 11 * Deprecate CUSPARSE_MV_ALG_DEFAULT for CUDA_VERSION >= 11040 * Add the missing thrust header
-
Committed by Weilong Wu
-
Committed by wawltor
-
Committed by QingshuChen
-
Committed by zqw_1997
-
Committed by xiaoguoguo626807
-
- 13 Jan, 2023 16 commits
-
-
Committed by Weilong Wu
-
Committed by limingshu
* first commit * add some changes in stack kernel. * move the location of GeneralDivMod * fix code format error according to ci
-
Committed by cyber-pioneer
-
Committed by ronnywang
* add where, atan2, median 0d ut * add where, atan2, median 0d ut * update * update * update
-
Committed by duanyanhui
* clear ProcessGroupCustom manually * fix bug * fix bug * move destroy ProcessGroup to ProcessGroupIdMap * enable destroy to all device * remove unused comments * change to internal api * Update process_group.cc * Update process_group.cc
-
Committed by Jiabin Yang
* support elementwise base func * fix compiling error and add test * remove additional param * support vjp for div using comp * remove additional change * fix dy2st error with magic num * fix dy magic num * another magic * another magic * add more test * fix windows problem * another magic * fix windows compile * invoke ci * add skip rename strategy * support add vjp * fix test_tanh * support add with new axis cal * fix resnet and some test * add composite log * support sub vjp
-
Committed by jameszhang
* kunlun add support for c_concat and c_split * replace mutable_data() and ShareDataWith()
-
Committed by ykkk2333
-
Committed by jameszhang
* fix xpu unittest issue: zero_dim_tensor * deal with leftout issue introduced by #49470
-
Committed by Leo Guo
-
Committed by wangshengxiang
-
Committed by zyfncg
* generate static graph code of stack, unbind, unique_consecutive op * fix bug
-
Committed by wangzhen38
* [cpplint fix] under ps
-
Committed by Weilong Wu
* [PHI] rrelu add yaml * polish * polish
-
Committed by zhangkaihuo
-
Committed by Yuanle Liu
-
- 12 Jan, 2023 5 commits
-
-
Committed by sunli
* lerp support 0 Tensor * fix lerp grad * fix lerp zero test * fix 0D + ND/ND + 0D * fix check * update code * fix lerp infer shape * static backward test * updata static graph test
-
Committed by Wen Sun
* refactor: migrate comm checks * refactor: add check in comm context * feat: add gloo static check * refactor: add place param in static check
-
Committed by YuanRisheng
-
Committed by xiaoxiaohehe001
-
Committed by Leo Guo
xpu2_op_list.cc. test=kunlun
-