- 27 Jun, 2022 1 commit
-
-
Committed by Chen Weihang
* Create Tensor by paddle::empty in custom operator (#41840) * create tensor by empty in custom op * fix some bug * update relu custom op demo (#43173) * Fix incompatible error for custom op Placetype (#43749) * fix incompatible error * rmeove default constructor * add macro * fix cpu make error * add DefaultGPUPlace api Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
-
- 24 Jun, 2022 1 commit
-
-
Committed by wawltor
-
- 23 Jun, 2022 1 commit
-
-
Committed by zyfncg
-
- 22 Jun, 2022 4 commits
-
-
Committed by xiaoxiaohehe001
-
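Committed by Yiqun Liu
Cherry-pick #42750. QA reports that the optimization in #42750 can improve solov2 model performance by 6%, so it is cherry-picked to 2.3. Because #41096 moved the Python implementation of linspace from fluid.layers.tensor to paddle.tensor.creation, and that PR is not in the release/2.3 branch, the Python changes from #42750 are applied to fluid.layers.tensor.linspace instead.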
-
Committed by Zhang Ting
[cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax (#43719) [cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax cherry-pick #43635 #43681 #43474
-
Committed by zyfncg
-
- 20 Jun, 2022 1 commit
-
-
Committed by xiongkun
* cherry pick from #43397 * fix code
-
- 15 Jun, 2022 1 commit
-
-
Committed by zyfncg
* fix bug of strided_slice (#43388) * fix stride_slice bug * fix bug * fix bug of infer shape for slice (#43443)
-
- 14 Jun, 2022 1 commit
-
-
Committed by xiongkun
* [EinsumOp] Polish forward logic and backward logic for optimize (#42603) * change logic for optimize * modifty * merge * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010) * [EinsumOp] Make EinsumOp support bfloat16. (#43085) * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 * make EInsumOP support bf16 * add unittest for BF16 * add condition for test_BF16 * fix bugs * fix * change the backward api to fit einsum op
-
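- 08 Jun, 2022 1 commit
-
-
Committed by niuliling123
The original implementations of the reduce amax/amin and frobenius_norm kernels are Eigen-based and take a long time to compile, so this PR replaces them with KP implementations. It also removes the duplicated functionality in DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad OP.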
-
- 07 Jun, 2022 1 commit
-
-
Committed by niuliling123
Delete ElementwiseKernel in BroadcastKernel. This removes the duplicated function calls across all Broadcast code, reducing both compile time and file size.
-
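- 06 Jun, 2022 1 commit
-
-
Committed by niuliling123
Remove the per-rank instantiation and the Elementwise call from the Broadcast function to reduce compile time. This is adapted from PR #42645 on the develop branch; because develop and release have diverged too far for a cherry-pick, the PR is resubmitted against release/2.3. Instantiating Broadcast per rank expands many low-level templates and makes reduce_sum_grad_kernel.cu.o too large; this change reduces both the .o size and the compile time.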
-
- 10 May, 2022 1 commit
-
-
Committed by fwenguang
* [MLU] add mlu new profiler (#41138) * [MLU] add mlu new profiler * fix format * [MLU] support add callback to stream (#41831) * [MLU] add gather mlu kernel (#41969) * [MLU] add mlu activation kernels (#41751)
-
- 06 May, 2022 1 commit
-
-
Committed by wawltor
* Fix the race condition in cumsum operator * Optimize cumsum operator Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
-
- 05 May, 2022 1 commit
-
-
Committed by xiongkun
-
- 04 May, 2022 2 commits
-
-
Committed by XiaoguangHu
* fix bug of batch_norm_grad kernel with fp16 * format code
-
Committed by XiaoguangHu
-
- 01 May, 2022 1 commit
-
-
Committed by Chen Weihang
-
- 30 Apr, 2022 2 commits
-
-
Committed by xiongkun
* Extend python einsum interface to make einsum_v2 support multi-operands and switch it to default. * add opt_einsum dependence * add yaml and support eager model * fix by code review
-
Committed by littletomatodonkey
* fix pad3d infer shape * fix pad3d * fix pad default value * fix order * add unit test * fix unittest for ci coverage * add ndhwc check
-
- 28 Apr, 2022 5 commits
-
-
Committed by Chen Weihang
* opt attr eaque perf * opt attr select code * fix one hot infermeta * polish get attr impl * fix tests failed * add testcases
-
Committed by xiongkun
* full api fix * when out is None, go old dygraph mode * by static check * first version: support 2-inputs forwards. TODO: 1. backward 2. BroadCast 3. MultiVariable * time out -> 120
-
Committed by FlyingQianMM
set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast (PaddlePaddle#42320) (#42332)
-
Committed by zyfncg
* Optimize performance of dygraph (v4) (#42196) * optimize performance of dygraph * optimize performance of dygraph and elementwise_add * optimize the trace op * fix bug * fix bug * fix unittest bug * fix code format * fix cherry-pick problem
-
Committed by zyfncg
* Optimize the performanece of sum api (#42231) * optimize the performanece of sum api * optimize IsDenseTensorInput * remove debug log * Add move construct for KernelSignature (#42253) * add move construct for KernelSignature * add noexcept * fix cherry-pick problem
-
- 27 Apr, 2022 3 commits
-
-
Committed by Chen Weihang
* Remove std::type_index in AttributeArdDef (#42122) * polish some impl * add lost attr type * polish details * fix error type * polish in name lists * add double attr * adapt infrt attr parse * add attr type test (#42263) * opt attr eaque perf (#42272)
-
Committed by Jiabin Yang
* fix memory issue for eager * fix bug
-
Committed by Chen Weihang
* Change small vector size (#42202) * change samll vector size * Update type_defs.h * Optimize dygraph InferShape perf (#42155) * init commit * remove two hash impl * fix bug * polish details * fix compile failed * fix compile failed * fix compile failed * add default kernel sig cache * fix get kernel arg defs error * remove kernel arg defs cache * fix origin op execute
-
- 26 Apr, 2022 1 commit
-
-
Committed by Chen Weihang
* Add paddle::variant and replace paddle::any (#42139) * add variant and replace any * split attribute * Optimize dygraph GetExpectedKernelType perf (#42154) * opt dygraph scheduling * revert part impl * fix variant compile error (#42203) * replace any by variant in infermeta (#42181)
-
- 25 Apr, 2022 2 commits
-
-
Committed by zyfncg
* optimiaze performance of PreparePhiData (#42093) * Dygraph performance optimization (v2) (#42103) * optimiaze performance of PreparePhiData * dygraph performance optimization * optimize performance of dygraph (#42137)
-
Committed by Aurelius84
[Cherry-Pick][Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm and fix shape op (#42170) * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT (#42138) * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT * [Performance]Remove CudaStreamSychornize in ClipGradByGlobalNorm (#42132)
-
- 22 Apr, 2022 1 commit
-
-
Committed by Jacek Czaja
-
- 21 Apr, 2022 4 commits
-
-
Committed by zyfncg
* [PHI] Support some c++ api in paddle namespace (#41778) * support some c++ api in paddle namespace * change c++ api namespace in custom op * [Phi] Support setting size of vector<Tensor> for out in yaml (#41576) * support setting vector out size in yaml * support setting size of vector<tensor> for out in yaml * add data transform config for shape and size (#41909) * fix api_gen bug
-
Committed by Chen Weihang
* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576) * support setting vector out size in yaml * support setting size of vector<tensor> for out in yaml * resolve conflict Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
-
Committed by Jiabin Yang
* cherry-pick python/paddle/utils/code_gen/backward.yaml * remove unsupported yaml Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
-
Committed by Chen Weihang
* polish tensor api details (#41971) * [CustomOp] Fix custom op pinned input error (#41972) * fix custom op pinned input error * fix compile error * fix inference custom op (#41999) * resolve conflict
-
- 20 Apr, 2022 2 commits
-
-
Committed by YuanRisheng
* support construct scalar using non-cpu tensor * fix bugs when run unittest * fix compile bugs * fix bugs when run ci * fix compile bugs * fix bugs when move copy * perfect unit test * perfect unittest * update according to comment * add target dependency * deal with conflict * fix bugs when run unit test * fix unit test bugs
-
Committed by Aurelius84
[Cherry-Pick]Fix expand_sig infershape BUG under static graph mode and NeedTransformPlace behavior if set skip_transform in yaml (#41973) * [Phi]Fix expand_sig infershape BUG under static graph mode (#41936) * [Phi]Fix expand_sig infershape BUG under static graph mode * [Phi]Fix expand_sig infershape BUG under static graph mode * [Phi]Fix unittest * [Phi]Fix unittest * [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml (#41920) * [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml * add unittest for full_like * fix unittest
-
- 19 Apr, 2022 1 commit
-
-
Committed by zyfncg
* add rsqrt yaml and unittest (#41443) * Add expand equal all yaml (#41540) * add expand, poisson * add poison grad * add expand equal_all poisson triangular solve yaml Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>
-