- 08 Jun, 2022 1 commit
-
-
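Committed by niuliling123
The original Reduce amax/amin and frobenius_norm_kernel implementations were Eigen-based, and their files took a long time to compile, so this PR replaces them with KP implementations. It also removes duplicated functionality in DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad OP.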
-
- 07 Jun, 2022 1 commit
-
-
Committed by niuliling123
Delete ElementwiseKernel in BroadcastKernel. This removes duplicated function calls across all Broadcast code and reduces both compile time and file size.
-
- 06 Jun, 2022 1 commit
-
-
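Committed by niuliling123
Remove the rank instantiation and the Elementwise call from the Broadcast function to reduce compile time. Adapted from PR #42645 on the develop branch: because the develop and release branches have diverged too far for a cherry-pick, the PR was resubmitted against release2.3. Instantiating Broadcast per rank expands many low-level templates and makes reduce_sum_grad_kernel.cu.o excessively large; this change reduces both the .o file size and the compile time.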
-
- 28 Apr, 2022 2 commits
-
-
Committed by FlyingQianMM
set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast (PaddlePaddle#42320) (#42332)
-
Committed by zyfncg
* Optimize performance of dygraph (v4) (#42196) * optimize performance of dygraph * optimize performance of dygraph and elementwise_add * optimize the trace op * fix bug * fix bug * fix unittest bug * fix code format * fix cherry-pick problem
-
- 26 Apr, 2022 1 commit
-
-
Committed by Chen Weihang
* Add paddle::variant and replace paddle::any (#42139) * add variant and replace any * split attribute * Optimize dygraph GetExpectedKernelType perf (#42154) * opt dygraph scheduling * revert part impl * fix variant compile error (#42203) * replace any by variant in infermeta (#42181)
-
- 19 Apr, 2022 1 commit
-
-
Committed by zhangkaihuo
cherry-pick PR #41586 to release/2.3
-
- 13 Apr, 2022 2 commits
-
-
Committed by FlyingQianMM
add an inner loop for index_select_grad_init() in index_select op when dealing with large-shape data (PaddlePaddle#41563) (#41669)
-
Committed by Aurelius84
* Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)" This reverts commit 56cd3407. * add infermeta
-
- 07 Apr, 2022 1 commit
-
-
Committed by zhouweiwei2014
cherry-pick fix compile bug of windows cuda11.5 #41433
-
- 03 Apr, 2022 1 commit
-
-
Committed by FlyingQianMM
* limit grid dim for index select * mv LimitGridDim into gpu_launch_config.h * fix conflicts * fix conflicts * fix code style * set block to 256 * fix grid setting * set dtype of block_dim to unsigned int
-
- 02 Apr, 2022 2 commits
-
-
Committed by zhangkaihuo
-
Committed by niuliling123
-
- 01 Apr, 2022 1 commit
-
-
Committed by chentianyu03
* add interpolate cpu kernel * fix nullptr bug * add interpolate gpu kernel * fix unit test error * remove raw kernels * add cuda kernel impl * add infermeta * recover accidentally deleted kernels in interpolate op * fix grad x_grad name error * remove interpolate_v2_op.h * rm unused codes * fix xpu build error * fix build error * fix namespace error * add register header for nup * fix infermeta error * modify by review * add the missing args in test_trt_convert_nearest_interp_v2
-
- 31 Mar, 2022 2 commits
- 30 Mar, 2022 3 commits
-
-
Committed by zyfncg
* move rnn kernel to phi * move infershape of rnn to phi * fix HIP bug * rename function * fix HIP bug * fix hip bug
-
Committed by Chen Weihang
Revert "Revert "[Phi] Move elementwise_floordiv and elementwise_pow to phi (#40993)" (#41065)" (#41110) This reverts commit 3a6f1135.
-
- 29 Mar, 2022 3 commits
-
-
Committed by tianshuo78520a
This reverts commit b532315d.
-
Committed by tianshuo78520a
This reverts commit e77a947e.
-
Committed by wuyefeilin
* mv floordiv to phi * mv elementwise_pow to phi * fix as review
-
- 28 Mar, 2022 1 commit
-
-
Committed by hong
* update * add forward case * update * update; test=develop * add some grad kernel; test=develop * move gpu kernel; test=develop * update * update; * update test; * fix selected rows bug; * add mix vector include; * add mixed vector depen; test=develop * add logit grad signature; * polish code * fix bug; * add namespace for abs * revert code * not move softsign * remove duplicate register; * fix softsign bug * polish code * format * format * fix bug * remove cmake dep * add square sqrt selected rows support * update * remove clip norm * add standalone executor sqrt dep * standalone exec dep sqrt * remove sqrt op in cmakelist * open some case
-
- 27 Mar, 2022 1 commit
-
-
Committed by hong
* move slice to pten * merge develop; test=develop * fix slice bug; * update * update * fix error * update * fix bug * polish code * polish code * polish code * try to fix windows bug * add gpu compile flag; * try to fix * remove template; * polish code; * fix npu bug; * fix npu bug * fix npu bug; test=develop * fix slice bug; * remove no need dep
-
- 26 Mar, 2022 1 commit
-
-
Committed by Yiqun Liu
-
- 25 Mar, 2022 5 commits
-
-
Committed by YuanRisheng
-
Committed by Aurelius84
* [Phi] Migrate strided_slice into Phi * [Phi] Migrate strided_slice into Phi * fix compilation problem
-
Committed by Aurelius84
* [Phi] Migrate Adam and Adamw into Phi * fix compile error and unittest ok * fix compile error and unittest ok * fix undefined reference to fLI::FLAGS * test depend on operator * fix cmake * fix xpu compile * fix infrt * fix amp_type_traits * fix amp_type_traits * modify according reviewer * modify according reviewer * fix dtype float16 * fix typo * fix Cmake * fix code style
-
Committed by FlyingQianMM
* add maximum limit for grid of reduce, elementwise and gather * add {} after if
-
Committed by FlyingQianMM
-
- 24 Mar, 2022 2 commits
-
-
Committed by caozhou
* migrate infershape * fix tril_triu infershape error * fix qr_op infershape * add parse qr mode func * move order
-
Committed by niuliling123
-
- 23 Mar, 2022 5 commits
-
-
Committed by zyfncg
* move deformable_conv_grad to phi * move infershape of deformable_conv to phi * adjust some code format * move deformable_conv_v1 to phi
-
Committed by zhouweiwei2014
-
Committed by niuliling123
-
Committed by xiongkun
* transfer unsqueeze to phi * fix conflict * add squeeze * add infershape * fix xpu and npu error
-
Committed by YuanRisheng
* move activation * fix bugs when run ce
-
- 22 Mar, 2022 2 commits
-
-
Committed by hong
* move mutable_data to context alloc * move mutable_data to context alloc * remove duplicate code
-
Committed by hong
* move embedding to phi; * update sig; test=develop * move reset impl to phi; test=develop * remove old register; test=develop * fix cpu bf16 bug; test=develop * fix lookup speed error * polish code * fix paddle throw type
-
- 21 Mar, 2022 1 commit
-
-
Committed by niuliling123
* Support MaskedSelectGrad op with Kernel Primitive API
-