- 23 Jun, 2022 1 commit
-
-
Committed by zyfncg
-
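- 22 Jun, 2022 2 commits
-
-
Committed by Yiqun Liu
Cherry-pick #42750. QA reported that the #42750 optimization improves solov2 model performance by 6%, so it is cherry-picked to 2.3. #41096, which moved the Python implementation of linspace from fluid.layers.tensor to paddle.tensor.creation, is not in the release/2.3 branch, so the Python changes from #42750 are applied to fluid.layers.tensor.linspace instead.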
-
Committed by Zhang Ting
[cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax (#43719) cherry-pick #43635 #43681 #43474
-
- 20 Jun, 2022 1 commit
-
-
Committed by xiongkun
* cherry pick from #43397 * fix code
-
- 15 Jun, 2022 1 commit
-
-
Committed by zyfncg
* fix bug of strided_slice (#43388) * fix stride_slice bug * fix bug * fix bug of infer shape for slice (#43443)
-
- 14 Jun, 2022 1 commit
-
-
Committed by xiongkun
* [EinsumOp] Polish forward logic and backward logic for optimize (#42603) * change logic for optimize * modify * merge * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010) * [EinsumOp] Make EinsumOp support bfloat16. (#43085) * change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 * make EinsumOp support bf16 * add unittest for BF16 * add condition for test_BF16 * fix bugs * fix * change the backward api to fit einsum op
-
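- 08 Jun, 2022 1 commit
-
-
Committed by niuliling123
The reduce amax/amin and frobenius_norm kernels were originally implemented with Eigen, and those files took a long time to compile, so this PR replaces them with KP implementations. It also removes duplicated functionality from DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad op.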
-
- 07 Jun, 2022 1 commit
-
-
Committed by niuliling123
Delete ElementwiseKernel in BroadcastKernel. This removes duplicated function calls across all Broadcast code and reduces both compile time and file size.
-
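- 06 Jun, 2022 1 commit
-
-
Committed by niuliling123
Remove the rank instantiation and the Elementwise call from the Broadcast function to reduce compile time. Adapted from PR #42645 on the develop branch: develop and release differ too much for a cherry-pick, so the PR is resubmitted for release 2.3. Instantiating Broadcast per rank causes heavy expansion of the underlying templates and makes reduce_sum_grad_kernel.cu.o too large; this change reduces both the .o size and the compile time.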
-
- 06 May, 2022 1 commit
-
-
Committed by wawltor
* Fix the race condition in cumsum operator * Optimize cumsum operator Co-authored-by: Leo Chen <39020268+leo0519@users.noreply.github.com>
-
- 04 May, 2022 1 commit
-
-
Committed by XiaoguangHu
* fix bug of batch_norm_grad kernel with fp16 * format code
-
- 30 Apr, 2022 1 commit
-
-
Committed by xiongkun
* Extend python einsum interface to make einsum_v2 support multi-operands and switch it to default. * add opt_einsum dependence * add yaml and support eager model * fix by code review
-
- 28 Apr, 2022 5 commits
-
-
Committed by Chen Weihang
* opt attr eaque perf * opt attr select code * fix one hot infermeta * polish get attr impl * fix tests failed * add testcases
-
Committed by xiongkun
* full api fix * when out is None, go old dygraph mode * by static check * first version: support 2-inputs forwards. TODO: 1. backward 2. BroadCast 3. MultiVariable * time out -> 120
-
Committed by FlyingQianMM
set device id of Place() to get GPUContext needed by LimitGridDim in ElemwiseGradBroadcast (PaddlePaddle#42320) (#42332)
-
Committed by zyfncg
* Optimize performance of dygraph (v4) (#42196) * optimize performance of dygraph * optimize performance of dygraph and elementwise_add * optimize the trace op * fix bug * fix bug * fix unittest bug * fix code format * fix cherry-pick problem
-
Committed by zyfncg
* Optimize the performance of sum api (#42231) * optimize the performance of sum api * optimize IsDenseTensorInput * remove debug log * Add move construct for KernelSignature (#42253) * add move construct for KernelSignature * add noexcept * fix cherry-pick problem
-
- 26 Apr, 2022 1 commit
-
-
Committed by Chen Weihang
* Add paddle::variant and replace paddle::any (#42139) * add variant and replace any * split attribute * Optimize dygraph GetExpectedKernelType perf (#42154) * opt dygraph scheduling * revert part impl * fix variant compile error (#42203) * replace any by variant in infermeta (#42181)
-
- 25 Apr, 2022 1 commit
-
-
Committed by Aurelius84
[Cherry-Pick][Performance]Remove CudaStreamSynchronize in ClipGradByGlobalNorm and fix shape op (#42170) * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT (#42138) * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT * [Performance]Set ShapeKernel with ALL_BACKEND and ALL_LAYOUT * [Performance]Remove CudaStreamSynchronize in ClipGradByGlobalNorm (#42132)
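For readers unfamiliar with what ClipGradByGlobalNorm computes, the rule can be sketched on plain Python lists. This is an illustrative sketch only, not Paddle's implementation: the function name `clip_by_global_norm` and the list-of-lists gradient layout are assumptions made for the example, and it says nothing about how the stream synchronization was removed.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients so their joint L2 norm does not exceed clip_norm.

    grads is a list of gradient vectors (hypothetical layout: lists of floats).
    Returns the scaled gradients and the global norm measured before clipping.
    """
    # Global norm: L2 norm over every element of every gradient.
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1.0 when the norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, global_norm
```

Because the scale factor is shared by every gradient, the direction of the overall update is preserved and only its magnitude changes.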
-
- 21 Apr, 2022 2 commits
-
-
Committed by Chen Weihang
* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576) * support setting vector out size in yaml * support setting size of vector<tensor> for out in yaml * resolve conflict Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
-
Committed by Jiabin Yang
* cherry-pick python/paddle/utils/code_gen/backward.yaml * remove unsupported yaml Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
-
- 19 Apr, 2022 4 commits
-
-
Committed by zyfncg
* add rsqrt yaml and unittest (#41443) * Add expand equal all yaml (#41540) * add expand, poisson * add poisson grad * add expand equal_all poisson triangular solve yaml Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>
-
Committed by Yiqun Liu
Cherry-pick #40338 #41741 #41313
-
Committed by zhangkaihuo
cherry-pick the PR #41586 to release/2.3
-
Committed by Siming Dai
* add eids result for graph_sample_neighbors * fix bug * move fisher_yates sample to warp * add cpu eid output * delete comment * delete comment * change nullptr placeholder * optimize sample kernel * fix mutable_data
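The "fisher_yates sample" mentioned above refers to Fisher-Yates shuffling, which can draw k distinct items by running only the first k steps of the shuffle. The sketch below shows that general technique in plain Python under stated assumptions: `sample_neighbors` and its parameters are hypothetical names for illustration, and this is not the warp-level GPU kernel from the commit.

```python
import random


def sample_neighbors(neighbors, k, rng=None):
    """Pick k distinct items from neighbors with a partial Fisher-Yates shuffle."""
    rng = rng or random.Random()
    pool = list(neighbors)  # work on a copy so the caller's list is untouched
    n = len(pool)
    if k >= n:
        return pool  # fewer candidates than requested: return them all
    for i in range(k):
        # Swap position i with a uniformly chosen position in [i, n).
        j = rng.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]
```

Stopping after k swaps keeps the work proportional to k rather than to the full neighbor count, which is the usual reason to prefer this form for sampling.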
-
- 18 Apr, 2022 2 commits
-
-
Committed by chentianyu03
* split reduce_kernel * rm reduce_kernel in cmake * split reduce_grad kernels * fix cmake build error * format code * fix standalone_executor_test error
-
Committed by Zhanlue Yang
[DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad (#41451) (#41893) * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures
-
- 15 Apr, 2022 3 commits
-
-
Committed by zhangkaihuo
Add paddle.sparse and three Sparse API (#41276) Add Sparse API to_dense, to_sparse_coo and values (#41394)
-
Committed by zhiboniu
-
Committed by YuanRisheng
* add multi_dot,maxout,multiplex yaml * add code coverage
-
- 14 Apr, 2022 2 commits
-
-
Committed by chentianyu03
* [Yaml]add exp yaml (#41217) * add exp yaml * add exp api in test case * add determinant yaml * fix exp op unittest * change test class name * modify api name * compacted with raw api * fix det api * add python_api * add test eager for determinant op * [Yaml] Add assign yaml (#41428) * add assign yaml * add assign api * add assign backward api * add assign * add assign yaml * add assign * assign yaml * add assign raw kernel and use assign_raw in yaml * merge develop branch * add missing python_api * exchange assign and assign_raw kernel name (#41625) * exchange assign and assign_raw kernel name * fix register error * [Yaml]add gaussian_random yaml and test case (#41312) * add gaussian random yaml * add gaussian_random yaml and test case * fix error modify of full yaml * import in_dygraph_mode * import _in_legacy_dygraph * add place arg in api * import __current_expected_place * fix test_egr_python_api failed case * add test case * add cast for NormalInitializer * fix test error * fix test error * rm unused check code * fix test error in test_initializer_nn * modify by review * [Phi]fix split error when sections has 0 size and add test case (#41708) * fix split error when sections has 0 size and add test case * fix test case
-
Committed by wuyefeilin
-
- 13 Apr, 2022 3 commits
-
-
Committed by Chen Weihang
* [Eager] Remove elementwise add in conv (#41515) * remove elementwise add in conv * use reshape * fix warpctc grad kernel dep error (#41598)
-
Committed by FlyingQianMM
add an inner loop for index_select_grad_init() in index_select op when dealing with large-shape data (PaddlePaddle#41563) (#41669)
-
Committed by Aurelius84
* Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)" This reverts commit 56cd3407. * add infermeta
-
- 12 Apr, 2022 3 commits
-
-
Committed by hong
* fix search sort bug (#41664) * fix depthwise dnn bug (#41666)
-
Committed by YuanRisheng
[Cherry-Pick]Add hard_swish/kron/linspace/logit/graph_send_recv/multi_dot/maxout/multiplex op yaml file (#41566) * [Phi]Add graph_send_recv yaml file (#41206) * add graph_send_recv yaml * deal with conflict * fix compile bugs * cherry-pick pr 41298 * cherry-pick pr41550 * fix compile bugs
-
Committed by Jack Zhou
-
- 11 Apr, 2022 2 commits
-
-
Committed by hong
-
Committed by Chen Weihang
[Cherry-pick] Add truncated_normal/unique/swish/unbind yaml and polish Getting tensor place impl (#41539) * [Phi] Polish truncated normal kernel and add yaml (#41280) * polish truncated normal kernel * add yaml * add truncated normal kernel and add yaml * polish unittests and yaml * import dygraph method * add unique yaml and final state api (#41460) * fix get tensor backend set bug (#41478) * [Phi] Add unbind yaml and final state api (#41277) * add unbind yaml * fix unittest * [Phi] Add swish yaml and final state api (#41479) * add swish yaml and final state api * skip mkldnn test * fix grad mkldnn test * add cherry-pick lost code
-