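- 06 Jun, 2022 1 commit
-
-
Committed by niuliling123
Remove the rank instantiation and the Elementwise call from the Broadcast function to reduce compile time. Adapted from PR #42645 on the develop branch; because the develop and release branches have diverged too far for a cherry-pick, the PR is resubmitted against release 2.3. Instantiating Broadcast per rank expands a large number of underlying templates and makes reduce_sum_grad_kernel.cu.o too large; this change reduces both the .o size and the compile time.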
-
- 30 May, 2022 1 commit
-
-
Committed by WangZhen
* Fix cond_block_grad error when handling vars that need no grad * Add comment and UT
-
- 10 May, 2022 1 commit
-
-
Committed by fwenguang
* [MLU] add mlu new profiler (#41138) * [MLU] add mlu new profiler * fix format * [MLU] support add callback to stream (#41831) * [MLU] add gather mlu kernel (#41969) * [MLU] add mlu activation kernels (#41751)
-
- 07 May, 2022 1 commit
-
-
Committed by FlyingQianMM
Reduce the number of threads per block of deformable_psroi_pooling to fix the "too many resources requested for launch" error (PaddlePaddle#42531) (#42533)
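As a rough illustration of why a smaller block size can avoid this launch error, the sketch below checks whether a block's register demand fits a per-block register budget. The budget of 65536 registers, the warp size of 32, the cap of 1024 threads, and the figure of 96 registers per thread are assumptions chosen for the example, not values taken from the commit; real allocation also has granularity rules that this ignores.

```python
# Illustrative model only: the limits below are assumed typical values,
# not numbers reported in the commit.

def launch_fits(threads_per_block, regs_per_thread, regs_per_block=65536):
    """Return True if the block's total register demand fits the budget."""
    return threads_per_block * regs_per_thread <= regs_per_block


def max_threads_per_block(regs_per_thread, regs_per_block=65536,
                          warp_size=32, hard_limit=1024):
    """Largest warp-aligned block size whose register demand fits the budget."""
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    fit = regs_per_block // regs_per_thread   # threads that fit by registers
    fit = min(fit, hard_limit)                # cap on threads per block
    return (fit // warp_size) * warp_size     # round down to whole warps


if __name__ == "__main__":
    # A hypothetical kernel using 96 registers per thread.
    print(launch_fits(1024, 96))       # too many resources under this model
    print(launch_fits(512, 96))        # a smaller block fits
    print(max_threads_per_block(96))
```

Under this simplified model, halving the block size is enough to bring the hypothetical kernel back within the budget.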
-
- 02 May, 2022 1 commit
-
-
Committed by Zhang Zheng
* Fix test_cudnn_norm_conv and test_cudnn_bn_add_relu in CUDA11.2 * no throw in V100 for some cases
-
- 30 Apr, 2022 1 commit
-
-
Committed by xiongkun
* Extend python einsum interface to make einsum_v2 support multi-operands and switch it to default. * add opt_einsum dependency * add yaml and support eager mode * fix by code review
-
- 29 Apr, 2022 2 commits
-
-
Committed by WangXi
* fix FusedResidualDropoutBias nan in v100 (#42344) * fix lod_tensor_array gc (#42377)
-
Committed by WangXi
[cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer generation performance (#42311) * Add fused_multi_transformer op to optimize transformer generation performance (#41814) * fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315) * fix ci timeout
-
- 28 Apr, 2022 2 commits
-
-
Committed by xiongkun
* full api fix * when out is None, go old dygraph mode * by static check * first version: support 2-inputs forwards. TODO: 1. backward 2. BroadCast 3. MultiVariable * time out -> 120
-
Committed by zyfncg
* Optimize the performance of sum api (#42231) * optimize the performance of sum api * optimize IsDenseTensorInput * remove debug log * Add move construct for KernelSignature (#42253) * add move construct for KernelSignature * add noexcept * fix cherry-pick problem
-
- 26 Apr, 2022 2 commits
-
-
Committed by Fan Zhang
* XPUPS Adaptation (#40991) * Adapt XPUPS - 1st version - 3.24 * Adapt XPUPS - update XPU PushSparse - 2nd version - 3.24 * Adapt XPUPS - add XPU PullSparseOp - 3nd version - 3.25 * refactor heter comm kernel * update. test=develop * Adapt XPUPS - modify by compilation - 4th version - 3.27 * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * heter_comm update * heter_comm update * update calc_shard_offset. test=develop * heter_comm update * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30 * update. test=develop * update pslib.cmake * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * Adapt XPUPS - modify by kp compilation - 6th version - 3.30 * update. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * used by minxu * update heter_comm_inl * fix. test=develop * Adapt XPUPS - modify by kp compilation - 7th version - 3.30 * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 3.31 update * Adapt XPUPS - update kp compilation path - 8th version - 3.31 * add optimizer kernel. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update heter_comm.h 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * Adapt XPUPS - update by kp compilation - 9th version - 4.1 * update hashtable. test=develop * fix. test=develop * update hashtable 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 10th version - 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * modify by compilation 4.1 * update. test=develop * update. test=develop * fix. test=develop * modify by compilation 4.1 * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 19:30 * fix. test=develop * update ps_gpu_wrapper.kps 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 11th version - 4.1 * fix. test=develop * Adapt XPUPS - update by kp compilation - 12nd version - 4.2 * fix. test=develop * fix. test=develop * modify by compilation 4.2 * 4.2 update * fix. test=develop * template init. test=develop * update 4.6 * fix. test=develop * template init. test=develop * 4.6 modify by compilation * hashtable template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 13nd version - 4.7 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.11 update * fix. test=develop * fix. test=develop * 4.11 update * update by pre-commit * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.12 update * fix. test=develop * Adapt XPUPS - update by kp compilation - 14th version - 4.13 * 4.13 update * 4.14 update * 4.14 update * 4.14 update * 4.14 modify by merged latest compilation * retry CI 4.14 * 4.15 pass static check * 4.15 modify by gpups CI * 3.16 update by gpups CI - modify ps_gpu_wrapper.h * 4.16 update * 4.16 pass xpu compile * 4.16 retry CI * 4.16 update Co-authored-by: Nzmxdream <zhangminxu01@baidu.com> * modify ps_gpu_wrapper.cc * update * Adapt BKCL comm for XPUPS (#42168) * Adapt XPUPS - 1st version - 3.24 * Adapt XPUPS - update XPU PushSparse - 2nd version - 3.24 * Adapt XPUPS - add XPU PullSparseOp - 3nd version - 3.25 * refactor heter comm kernel * update. test=develop * Adapt XPUPS - modify by compilation - 4th version - 3.27 * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * heter_comm update * heter_comm update * update calc_shard_offset. test=develop * heter_comm update * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30 * update. test=develop * update pslib.cmake * update. test=develop * update. 
test=develop * update. test=develop * update. test=develop * update. test=develop * Adapt XPUPS - modify by kp compilation - 6th version - 3.30 * update. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * used by minxu * update heter_comm_inl * fix. test=develop * Adapt XPUPS - modify by kp compilation - 7th version - 3.30 * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 3.31 update * Adapt XPUPS - update kp compilation path - 8th version - 3.31 * add optimizer kernel. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update heter_comm.h 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * Adapt XPUPS - update by kp compilation - 9th version - 4.1 * update hashtable. test=develop * fix. test=develop * update hashtable 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 10th version - 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * modify by compilation 4.1 * update. test=develop * update. test=develop * fix. 
test=develop * modify by compilation 4.1 * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 19:30 * fix. test=develop * update ps_gpu_wrapper.kps 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 11th version - 4.1 * fix. test=develop * Adapt XPUPS - update by kp compilation - 12nd version - 4.2 * fix. test=develop * fix. test=develop * modify by compilation 4.2 * 4.2 update * fix. test=develop * template init. test=develop * update 4.6 * fix. test=develop * template init. test=develop * 4.6 modify by compilation * hashtable template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 13nd version - 4.7 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.11 update * fix. test=develop * fix. test=develop * 4.11 update * update by pre-commit * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.12 update * fix. 
test=develop * Adapt XPUPS - update by kp compilation - 14th version - 4.13 * 4.13 update * 4.14 update * 4.14 update * 4.14 update * 4.14 modify by merged latest compilation * retry CI 4.14 * 4.15 pass static check * 4.15 modify by gpups CI * 3.16 update by gpups CI - modify ps_gpu_wrapper.h * 4.16 update * 4.16 pass xpu compile * 4.16 retry CI * 4.16 update * Adapt XPUPS - adapt BKCL comm for XPUPS - 4.24 * update by compilation * Adapt XPUPS - register PSGPUTrainer for XPUPS - 4.25 * update device_worker_factory Co-authored-by: Nzmxdream <zhangminxu01@baidu.com> * update * update CMakeLists Co-authored-by: Nzmxdream <zhangminxu01@baidu.com>
-
Committed by Chen Weihang
* Add paddle::variant and replace paddle::any (#42139) * add variant and replace any * split attribute * Optimize dygraph GetExpectedKernelType perf (#42154) * opt dygraph scheduling * revert part impl * fix variant compile error (#42203) * replace any by variant in infermeta (#42181)
-
- 21 Apr, 2022 5 commits
-
-
Committed by zhangyikun02
-
Committed by z8hanghuan
* modify xpu.cmake,*test=kunlun (#41832) * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * support bilstm,*test=kunlun * [cherry-pick]support multi_layer of bilstm,*test=kunlun
-
Committed by WangXi
-
Committed by Chen Weihang
* [Phi] Support setting size of vector<Tensor> for out in yaml (#41576) * support setting vector out size in yaml * support setting size of vector<tensor> for out in yaml * resolve conflict Co-authored-by: zyfncg <zhangyunfei07@baidu.com>
-
Committed by TeFeng Chen
cherry-pick #41795
-
- 20 Apr, 2022 1 commit
-
-
Committed by Haohongxiang
* refactor mp in eager mode * update * update * add uts
-
- 19 Apr, 2022 4 commits
-
-
Committed by zyfncg
* add rsqrt yaml and unittest (#41443) * Add expand equal all yaml (#41540) * add expand, poisson * add poisson grad * add expand equal_all poisson triangular solve yaml Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>
-
Committed by Yiqun Liu
Cherry-pick #40338 #41741 #41313
-
Committed by Fan Zhang
* XPUPS Adaptation (#40991) * Adapt XPUPS - 1st version - 3.24 * Adapt XPUPS - update XPU PushSparse - 2nd version - 3.24 * Adapt XPUPS - add XPU PullSparseOp - 3nd version - 3.25 * refactor heter comm kernel * update. test=develop * Adapt XPUPS - modify by compilation - 4th version - 3.27 * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * heter_comm update * heter_comm update * update calc_shard_offset. test=develop * heter_comm update * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30 * update. test=develop * update pslib.cmake * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * Adapt XPUPS - modify by kp compilation - 6th version - 3.30 * update. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * used by minxu * update heter_comm_inl * fix. test=develop * Adapt XPUPS - modify by kp compilation - 7th version - 3.30 * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 3.31 update * Adapt XPUPS - update kp compilation path - 8th version - 3.31 * add optimizer kernel. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update heter_comm.h 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * Adapt XPUPS - update by kp compilation - 9th version - 4.1 * update hashtable. test=develop * fix. test=develop * update hashtable 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 10th version - 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * modify by compilation 4.1 * update. test=develop * update. test=develop * fix. test=develop * modify by compilation 4.1 * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 19:30 * fix. test=develop * update ps_gpu_wrapper.kps 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 11th version - 4.1 * fix. test=develop * Adapt XPUPS - update by kp compilation - 12nd version - 4.2 * fix. test=develop * fix. test=develop * modify by compilation 4.2 * 4.2 update * fix. test=develop * template init. test=develop * update 4.6 * fix. test=develop * template init. test=develop * 4.6 modify by compilation * hashtable template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 13nd version - 4.7 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.11 update * fix. test=develop * fix. test=develop * 4.11 update * update by pre-commit * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.12 update * fix. test=develop * Adapt XPUPS - update by kp compilation - 14th version - 4.13 * 4.13 update * 4.14 update * 4.14 update * 4.14 update * 4.14 modify by merged latest compilation * retry CI 4.14 * 4.15 pass static check * 4.15 modify by gpups CI * 3.16 update by gpups CI - modify ps_gpu_wrapper.h * 4.16 update * 4.16 pass xpu compile * 4.16 retry CI * 4.16 update Co-authored-by: Nzmxdream <zhangminxu01@baidu.com> * modify ps_gpu_wrapper.cc * update Co-authored-by: Nzmxdream <zhangminxu01@baidu.com>
-
Committed by TeFeng Chen
cinn_launch_op: optimize the overhead of preparing variables before executing cinn compiled program (#41777) (#41910) cherry-pick #41777 * optimize preparation overhead before executing cinn compiled program
-
- 18 Apr, 2022 3 commits
-
-
Committed by lilong12
-
Committed by Roc
* fix moe apis (#41650) * Moe ref (#41836) * moe ref * ref commit * update; document_fix * update;document_fix * Moe ref (#41864) * moe ref * ref commit; document_fix * update; document_fix * update document_fix * update; document_fix
-
Committed by huangxu96
This PR cherry-picks #41824. It fixes a bug that causes a CUDA address error; the bug arose because the grid size of the CUDA kernel was set incorrectly.
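The commit message does not say how the grid size was miscomputed. As a general illustration only, the sketch below shows the usual ceiling-division rule for a one-dimensional launch and counts the surplus threads that a kernel must guard against; the function names and the sizes used are hypothetical and not taken from the PR.

```python
# General illustration of 1-D launch arithmetic; not the code from the PR.

def grid_size(num_elements, threads_per_block):
    """Blocks needed so that every element gets a thread (ceiling division)."""
    if num_elements < 0 or threads_per_block <= 0:
        raise ValueError("invalid launch parameters")
    return (num_elements + threads_per_block - 1) // threads_per_block


def surplus_threads(num_elements, threads_per_block, grid):
    """Threads whose global index falls at or beyond num_elements.

    Without an `if (idx < n)` guard in the kernel, each of these threads
    would read or write out of bounds.
    """
    return max(0, grid * threads_per_block - num_elements)


def uncovered_elements(num_elements, threads_per_block, grid):
    """Elements that no thread reaches when the grid is too small."""
    return max(0, num_elements - grid * threads_per_block)


if __name__ == "__main__":
    n, tpb = 1000, 256
    g = grid_size(n, tpb)
    print(g, surplus_threads(n, tpb, g), uncovered_elements(n, tpb, g))
```

A grid that is too large without a bounds guard produces out-of-range accesses, while a grid that is too small leaves elements unprocessed; either is consistent with a wrongly set grid size.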
-
- 15 Apr, 2022 1 commit
-
-
Committed by zyfncg
* fix data transform problem for cudnn backend (#41622) * Fix problem of infermeta with vector output (#41646) * remove stack_grad infershape * fix bug of output with null * fix bug
-
- 14 Apr, 2022 1 commit
-
-
Committed by chentianyu03
* [Yaml]add exp yaml (#41217) * add exp yaml * add exp api in test case * add determinant yaml * fix exp op unittest * change test class name * modify api name * compacted with raw api * fix det api * add python_api * add test eager for determinant op * [Yaml] Add assign yaml (#41428) * add assign yaml * add assign api * add assign backward api * add assign * add assign yaml * add assign * assign yaml * add assign raw kernel and use assign_raw in yaml * merge develop branch * add missing python_api * exchange assign and assign_raw kernel name (#41625) * exchange assign and assign_raw kernel name * fix register error * [Yaml]add gaussian_random yaml and test case (#41312) * add gaussian random yaml * add gaussian_random yaml and test case * fix error modify of full yaml * import in_dygraph_mode * import _in_legacy_dygraph * add place arg in api * import __current_expected_place * fix test_egr_python_api failed case * add test case * add cast for NormalInitializer * fix test error * fix test error * rm unused check code * fix test error in test_initializer_nn * modify by review * [Phi]fix split error when sections has 0 size and add test case (#41708) * fix split error when sections has 0 size and add test case * fix test case
-
- 13 Apr, 2022 1 commit
-
-
Committed by Aurelius84
* Revert "[Phi] Migrate Adam and AdamW into Phi (#40351)" This reverts commit 56cd3407. * add infermeta
-
- 12 Apr, 2022 5 commits
-
-
Committed by hong
* add conj flip yaml * add flip conj pixel shuffle
-
Committed by xiongkun
* gather op * add mod * [Yaml] final state for uniform and uniform_random
-
Committed by YuanRisheng
[Cherry-Pick]Add hard_swish/kron/linspace/logit/graph_send_recv/multi_dot/maxout/multiplex op yaml file (#41566) * [Phi]Add graph_send_recv yaml file (#41206) * add graph_send_recv yaml * deal with conflict * fix compile bugs * cherry-pick pr 41298 * cherry-pick pr41550 * fix compile bugs
-
Committed by ykkk2333
-
Committed by crystal
[cherry-pick] #41531 and #41570
-
- 11 Apr, 2022 3 commits
-
-
Committed by whs
-
Committed by Aurelius84
* [Eager]Fix segment_pool/allclose/isclose/scale API bug (#41506) * [Eager]Fix segment_pool/allclose/isclose/scale API bug * fix kernel register problem * add norm, segment_pool (#41465) Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>
-
Committed by lilong12
-
- 08 Apr, 2022 1 commit
-
-
Committed by YuanRisheng
-
- 06 Apr, 2022 2 commits
-
-
Committed by Weilong Wu
* [Eager] Support test_layers's test cases switch to eager mode * Update batch_norm _C_ops action to fix CI * Use None instead of new EmptyTensor * Updated var name * Make sure to switch eager mode, Fix Coverage_CI * Remove _non_static_mode statement * Remove batch_norm dispensable input statement * Polish batch_norm code * Fix CI issue
-
Committed by hong
* update * add conv yaml * add backward * remove useless code * fix bug * fix bug * revert fluid dygraph conv2d * remove useless infermeta function * fix meta fn duplicate error * conv using custom impl * remove amp include * fix bug * use cudnn = true * fix test mkldnn caching bug
-
- 05 Apr, 2022 1 commit
-
-
Committed by zhaocaibei123
* update name * update name * fix test * fix fleet bind * update name * update name * fix test * fix gpups wrapper * remove Push/Pull/Load/Save with context in client and wrapper base class * fix * fix * remove some interface * fix * remove * code style * recover * fix * remove code unused * remove some unused table & accessor & CommonDenseTable => MemoryDenseTable * fix * fix * fix * recover * remove unused code Co-authored-by: esythan <esythan@126.com>
-