- 20 Mar, 2023 19 commits
-
-
Committed by Zhang Na
-
Committed by limingshu
* optimization for fused linear op * fix code format * optimization for linear fused forward * merge with develop * fix bugs for gemm_ephilog * package of cublaslt epilogue type with enum * final fix before code reviewing * fix missed fusedType typo * fix code according to review suggestions * fix windows ci error * change location of MatmulPlanner * add some changes for compiler error fix
-
Committed by Ainavo
* add_up004_for_ruff * modify the config file and remove object * fix md
-
Committed by iSerendipity
* fix Werror in roi_align_grad_kernel * adopt a better way
-
Committed by zhouweiwei2014
-
Committed by zyfncg
* register some custom kernel * fix bug
-
Committed by HongyuJia
-
Committed by Weilong Wu
-
Committed by mayang002
-
Committed by Sonder
* update codes about pad3d * add codes about Tensor type Padding * update * update unit test file * format code style * update and to &&' * rewrite codes about pad3d * add codes about converting paddle pad format to tensorrt pad format * fix some errors * specify the trt version range * fix how dims is initialized * fix code style * update test pad values * specify the trt version for pad3d * update unit test file scope * update unit test file * update pad3d paddings convert codes * update pad3d * add static mode support * update test file * fix bugs about dynamic mode test codes * fix bug and add limit in op_teller * use a new padding convert method [ITensor* padding, using Slice to split the pre_pad and the post pad] * fix PADDLE_THROW grammar error * update test codes * add a size check for Tensor padding
-
Committed by tianshuo78520a
-
Committed by FormlessUnit
shape support bf16
-
Committed by zhouweiwei2014
-
Committed by ykkk2333
* add xpu tile and concat kernel int64, test=kunlun * fix previous xpu dataloader bug, and add maxpool3dgrad special dim support, test=kunlun
-
Committed by xiongkun
* merge * fix bugs while backward multi-times. * code format by ci
-
Committed by Huang Jiyi
-
Committed by Jiabin Yang
-
Committed by HongyuJia
* [Tensor Operants & Prim-Relevant] Tensor supports compare operants * fix dependence of test_comp_static * fix unit test
-
Committed by wanghuancoder
-
- 19 Mar, 2023 3 commits
-
-
Committed by Charles-hit
-
Committed by Difer
* register for ftt_r2c, ftt_c2_r * fix clang-format
-
Committed by Sanbu
* Add output defs for argsort kernel * Update argsort_kernel.cc * Update argsort_kernel.cu * Update argsort_kernel.cc
-
- 18 Mar, 2023 1 commit
-
-
Committed by Leo Chen
-
- 17 Mar, 2023 9 commits
-
-
Committed by denglianbin
* finish task * fix some question. * fix error * change unittest:zeroDim.
-
Committed by Infinity_lee
-
Committed by PuQing
* add multinomial output defs * fix register on gpu
-
Committed by Zhang Zheng
* [AMP OP&Test] Support float & bfloat16 when using cub * fix compile error * fix * fix rocm compile error
-
Committed by chenxujun
-
Committed by gouzil
* [phi][jit] rm Softmax StrideScal * [phi][jit] rm kStrideScal * [phi][jit] fix Softmax clean omission * [phi][jit] fix Softmax clean omission * [phi][jit] fix StrideScal clean omission * [phi][jit] fix mkl SoftmaxKernel clean omission * [phi][jit] fix test error * [phi][jit] fix test error * [phi][jit] rm NCHW16CMulNC * [phi][jit] fix test error * [phi][jit] rm HSum HMax * [phi][jit] fix test error * [phi][jit] rm StrideASum * add AUTHORS.md * [phi][jit] fix test error
-
Committed by Leo Chen
* support fetch empty tensor on CPUPlace * fix the shape in unittest of empty output
-
Committed by HongyuJia
-
Committed by cyber-pioneer
* add bn vjp * fix example * fix code * fix code * fix cinn case * fix code * fix example * fix code * fix example * fix example
-
- 16 Mar, 2023 8 commits
-
-
Committed by HongyuJia
* init unit test commit, contains register thinking * support inplace * get inplaced x.grad * Try support inplace and hook at the same time * Support inplace, need debug * Support inplace successfully * Inplace use Tensor&, consistent with Tensor* * fix MapPlainOutputs bug * fix double grad inplace error
-
Committed by Chitsing KUI
* rename flash_attn_raw to flash_attn_unpadded * fix static api * fix static return
-
Committed by xjmxyt
* add dynamic support * add more test * fix bug * change test * change test
-
Committed by shaojie_wang
* add fp32 grad plus fp16 param in adamw * add python UT * fix test case * in test_adamw_op py file, force the moment2 value LE 0 * add a compare option * remove bf16 fused adam kernel case
-
Committed by Huang Jiyi
* remove contexts in tensor_utils * update from_blob * update from_blob * update from_blob * fix bug * fix bug
-
Committed by JZ-LIANG
* update env setting * update pass logic * dist op support bf16 * backward cast update * update setting * update backward * revert amp pass * update fp16 backward logic * register c_embedding bf16 * revert engine * add unittest * add unittest * update unittest * update cmake * update math * update math.py * update unittest * update unittest * revise unittest * revise unittest * update unittest * update unittest * update unittest
-
Committed by PuQing
* add rnn and searchsorted output defs * add gpu kernel
-
Committed by Huang Jiyi
* remove fluid thread_data_registry * update * fix bug
-