- 20 Apr, 2022 14 commits
-
-
Committed by seemingwang
* gpu_graph engine optimization+ (#41455) * extract sub-graph * graph-engine merging * fix * fix * fix heter-ps config * test performance * test performance * test performance * test * test * update bfs * change cmake * test * test gpu speed * gpu_graph_engine optimization * add ssd layer to graph_engine * fix allocation * fix syntax error * fix syntax error * fix pscore class * fix * recover test * recover test * fix spelling * recover * fix * Cpu gpu graph engine (#41942) * extract sub-graph * graph-engine merging * fix * fix * fix heter-ps config * test performance * test performance * test performance * test * test * update bfs * change cmake * test * test gpu speed * gpu_graph_engine optimization * add ssd layer to graph_engine * fix allocation * fix syntax error * fix syntax error * fix pscore class * fix * recover test * recover test * fix spelling * recover * fix * fix linking problem * remove comment
-
Committed by Leo Chen
* [new-exec] shrink downstream map (#41471) * shrink downstream map * shrink last live ops of var * add comment * fix bug * add dependency for send/recv to support pp parallel (#41652) * [new-exec] clear the scope listener after run (#41947) * clear the listener after run * only sync variables in program * refine code * fit for lod_tensor_blocking_queue
-
Committed by Xiaoxu Chen
-
Committed by heliqi
Add onnxruntime build option to the Windows build script
-
Committed by niuliling123
Add AutoTune to reader.py for DataLoader
-
Committed by feng_shuai
-
Committed by pangyoki
* support no_need_buffer in eager_fluid state * change no_need_buffer info from fwd_info to bwd_info * fix CI fail, gru_unit does not use no_need_buffer * fix conflict between no_need_buffer and dispensable * use tensor.define in dispensable * solve conflict * solve conflict
-
Committed by Jiabin Yang
Co-authored-by: Zhanlue Yang <jim19930609@gmail.com>
-
Committed by YuanRisheng
* support construct scalar using non-cpu tensor * fix bugs when run unittest * fix compile bugs * fix bugs when run ci * fix compile bugs * fix bugs when move copy * perfect unit test * perfect unittest * update according to comment * add target dependency * deal with conflict * fix bugs when run unit test * fix unit test bugs
-
Committed by Aurelius84
* update (#41636) * fix bug for eager mode distributed training (#41841) Co-authored-by: lilong12 <lilong12@baidu.com>
-
Committed by Aurelius84
[Cherry-Pick]Fix expand_sig infershape BUG under static graph mode and NeedTransformPlace behavior if set skip_transform in yaml (#41973) * [Phi]Fix expand_sig infershape BUG under static graph mode (#41936) * [Phi]Fix expand_sig infershape BUG under static graph mode * [Phi]Fix expand_sig infershape BUG under static graph mode * [Phi]Fix unittest * [Phi]Fix unittest * [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml (#41920) * [Eager]Fix NeedTransformPlace behavior if set skip_transform in yaml * add unittest for full_like * fix unittest
-
Committed by chenjian
* fix divide zero error when cpu only (#41794) * reduce performance influence by RecordEvent in Python (#41822) * reduce performance influence * add unit test * fix * Rebase for profiler statistic ratio (#41939) * fix according to suggestion * add kernel summary * improve coverage
-
Committed by Zhang Ting
cherry-pick #41884
-
Committed by feng_shuai
-
- 19 Apr, 2022 17 commits
-
-
Committed by zyfncg
* add rsqrt yaml and unittest (#41443) * Add expand equal all yaml (#41540) * add expand, poisson * add poisson grad * add expand equal_all poisson triangular solve yaml Co-authored-by: hong <43953930+phlrain@users.noreply.github.com>
-
Committed by zmxdream
* add rename for heter_ps.cu * update. test=develop * update. test=develop * fix. test=develop
-
Committed by Weilong Wu
* [Eager] Fix numpy interface for constructing empty tensor * Fix CI, construct empty tensor * Modify empty tensor's shape from [] to [0] * Add more test for constructing empty tensor
-
Committed by Weilong Wu
* [Eager] paddle.sort interface use final_state * Add eager test case for paddle.sort()
-
Committed by zhangbo9674
-
Committed by Yiqun Liu
Cherry-pick #40338 #41741 #41313
-
Committed by Fan Zhang
* XPUPS Adaptation (#40991) * Adapt XPUPS - 1st version - 3.24 * Adapt XPUPS - update XPU PushSparse - 2nd version - 3.24 * Adapt XPUPS - add XPU PullSparseOp - 3nd version - 3.25 * refactor heter comm kernel * update. test=develop * Adapt XPUPS - modify by compilation - 4th version - 3.27 * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * heter_comm update * heter_comm update * update calc_shard_offset. test=develop * heter_comm update * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * Adapt XPUPS - use WITH_XPU_KP and modify wrapper kernel function - 5th version - 3.30 * update. test=develop * update pslib.cmake * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * Adapt XPUPS - modify by kp compilation - 6th version - 3.30 * update. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * used by minxu * update heter_comm_inl * fix. test=develop * Adapt XPUPS - modify by kp compilation - 7th version - 3.30 * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 3.31 update * Adapt XPUPS - update kp compilation path - 8th version - 3.31 * add optimizer kernel. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * update heter_comm_kernel.kps 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update heter_comm.h 3.31 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * Adapt XPUPS - update by kp compilation - 9th version - 4.1 * update hashtable. test=develop * fix. test=develop * update hashtable 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 10th version - 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * modify by compilation 4.1 * update. test=develop * update. test=develop * fix. test=develop * modify by compilation 4.1 * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * modify by compilation 4.1 19:30 * fix. test=develop * update ps_gpu_wrapper.kps 4.1 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 11th version - 4.1 * fix. test=develop * Adapt XPUPS - update by kp compilation - 12nd version - 4.2 * fix. test=develop * fix. test=develop * modify by compilation 4.2 * 4.2 update * fix. test=develop * template init. test=develop * update 4.6 * fix. test=develop * template init. test=develop * 4.6 modify by compilation * hashtable template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * Adapt XPUPS - update by kp compilation - 13nd version - 4.7 * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.11 update * fix. test=develop * fix. test=develop * 4.11 update * update by pre-commit * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * 4.12 update * fix. test=develop * Adapt XPUPS - update by kp compilation - 14th version - 4.13 * 4.13 update * 4.14 update * 4.14 update * 4.14 update * 4.14 modify by merged latest compilation * retry CI 4.14 * 4.15 pass static check * 4.15 modify by gpups CI * 3.16 update by gpups CI - modify ps_gpu_wrapper.h * 4.16 update * 4.16 pass xpu compile * 4.16 retry CI * 4.16 update Co-authored-by: zmxdream <zhangminxu01@baidu.com> * modify ps_gpu_wrapper.cc * update Co-authored-by: zmxdream <zhangminxu01@baidu.com>
-
Committed by feng_shuai
-
Committed by feng_shuai
-
Committed by z8hanghuan
This reverts commit 8ccdb91b.
-
Committed by Zhanlue Yang
* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode * [DoubleGrad] Enabled test_autograd_functional_dynamic.py under eager mode * Enabled more test cases * Fixed performance issues * Fixed minor issue
-
Committed by JingZhuangzhuang
-
Committed by JingZhuangzhuang
-
Committed by zhangkaihuo
Cherry-pick PR #41586 to release/2.3
-
Committed by TeFeng Chen
cinn_launch_op: optimize the overhead of preparing variables before executing cinn compiled program (#41777) (#41910) cherry-pick #41777 * optimize preparation overhead before executing cinn compiled program
-
Committed by Zhanlue Yang
* [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures * [DoubleGrad] Enabled test_imperative_triple_grad test cases under eager_mode * Fixed minor issues
-
Committed by Siming Dai
* add eids result for graph_sample_neighbors * fix bug * move fisher_yates sample to warp * add cpu eid output * delete comment * delete comment * change nullptr placeholder * optimize sample kernel * fix mutable_data
-
- 18 Apr, 2022 9 commits
-
-
Committed by lilong12
-
Committed by lilong12
-
Committed by Aurelius84
* [Eager] add _fallback_legacy_dygraph for npu/xpu/rocm * fix import
-
Committed by Roc
* fix moe apis (#41650) * Moe ref (#41836) * moe ref * ref commit * update; document_fix * update;document_fix * Moe ref (#41864) * moe ref * ref commit; document_fix * update; document_fix * update document_fix * update; document_fix
-
Committed by zmxdream
* [XPUPS]add support for kunlun2 (#40985) [XPUPS]add support for kunlun2 Co-authored-by: NWorgenZhang <frank08081993@gmail.com> * [XPUPS]fix hashtable_kernel.kps (#41790) * refactor heter comm kernel * update. test=develop * update calc_shard_offset. test=develop * update xpu kernel. test=develop * update args of calc_shard_offset * update. test=develop * remove customGradMerger * update. test=develop * update. test=develop * fix. test=develop * update. test=develop * update. test=develop * update optimizer kernel * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * add optimizer kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix kunlun not support size_t. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update hashtable. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * update. test=develop * update. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * template init. test=develop * hashtable template init. test=develop * fix. test=develop * fix. test=devlop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. 
test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix hashtable_kernel. test=develop * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop Co-authored-by: WorgenZhang <frank08081993@gmail.com> * [XPUPS]modify xpu_kp.cmake with HETERPS&PSLIB (#41760) * modify xpu_kp.cmake with HETERPS&PSLIB * fix. test=develop * fix. test=develop * fix. test=develop * fix. test=develop Co-authored-by: WorgenZhang <frank08081993@gmail.com>
-
Committed by z8hanghuan
* modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun * modify xpu.cmake,*test=kunlun
-
Committed by chentianyu03
* split reduce_kernel * rm reduce_kernel in cmake * split reduce_grad kernels * fix cmake build error * format code * fix standalone_executor_test error
-
Committed by Zhanlue Yang
[DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad (#41451) (#41893) * [DoubleGrad] Enabled double grad test cases in eager_mode for test_imperative_double_grad * Fixed elementwise issue * Addressed CI failures
-
Committed by huangxu96
This PR is a cherry-pick of #41824. It fixes a bug that causes a CUDA address error. The bug occurred because the grid number of the CUDA kernel was set incorrectly.
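For context on how a grid number is usually derived, here is a minimal Python sketch of the common ceiling-division rule. It is not the code from #41824, and the commit message does not say what the incorrect value was; the function name `grid_size` and the default block size of 256 are assumptions made for illustration only.

```python
def grid_size(num_elements: int, block_size: int = 256) -> int:
    """Return the number of blocks needed to cover num_elements threads.

    Hypothetical helper for illustration; not taken from the Paddle source.
    """
    if num_elements < 0:
        raise ValueError("num_elements must be non-negative")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Ceiling division: round up so the last, partially filled block is
    # still launched. Clamp to 1 because a grid dimension of 0 is not a
    # valid launch configuration.
    return max(1, (num_elements + block_size - 1) // block_size)


if __name__ == "__main__":
    for n in (0, 1000, 1024, 1025):
        print(n, grid_size(n))
```

When the grid is rounded up like this, the launched thread count can exceed the element count, so a kernel normally also guards each access with a check such as `if (idx < n)` before touching memory.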
-