- 24 Nov, 2022 3 commits
-
-
Committed by HongyuJia
* support default use_gpudnn=True
* fully support cudnn in phi
* add header file
* add white_list, verify accuracy
* phi support all cudnn
* opt affine_grad
* try different arches of pretrained_model
* try different arches of pretrained_model
* add debug string
* debug eager_method
* add debug string, pass all local ctest
* polish all debug code
* delete use_cudnn relevant code autogen
* fix depthwise_conv2d
* Share all other members of Tensor except use_cudnn
* polish codes according to review opinion
* polish codes according to review opinion, fix bug
* polish codes according to review opinion, opt performance
* polish codes according to review opinion, fix pooling.py
-
Committed by wanghuancoder
* do not calc reduce_all in eager mode
* refine python c cast list
* refine
* refine
* refine
* refine
* refine
* refine
* refine
* refine
* refine
-
Committed by wanghuancoder
* dense tensor in eager mode support data_ptr
-
- 23 Nov, 2022 1 commit
-
-
Committed by Charles-hit
* add nparray case for basic operator
* fix unit test
* fix unit test
* add unit test
* fix unit test
-
- 21 Nov, 2022 4 commits
- 18 Nov, 2022 3 commits
-
-
Committed by zyfncg
* fix bug of zero_allocator in host
* fix test compile bug
* add unittest
* update test
-
Committed by Wen Sun
-
Committed by james
* fix device id issue for xpu eager
  The XPU device id is not correctly set in eager mode, so vars stay on dev0 unless XPUDeviceGuard is called, which leads to this error message on every node with rank != 0: "NotImplementedError: (Unimplemented) Place Place(xpu:0) is not supported."
* fix typo
* fix pybind error
-
- 17 Nov, 2022 2 commits
- 16 Nov, 2022 2 commits
- 14 Nov, 2022 4 commits
-
-
Committed by Wen Sun
* refactor: simplify send, recv interfaces
* refactor: rm send_partial, recv_partial, all_gather_partial
-
Committed by LiYuRio
-
Committed by LiYuRio
-
Committed by engineer1109
-
- 10 Nov, 2022 4 commits
-
-
Committed by WangZhen
Get grads types from cpp for adam to speed up
-
Committed by YuanRisheng
* standard api
* fix sparse bugs
* fix xpu bugs, test=kunlun
* remove hard code for custom unittest
* open ci, test=kunlun
* deal with conflict
-
Committed by james
* XPU support eager mode
* add unittest for XPU eager mode
* minor bugfix
* minor bugfix, test=kunlun
* correct copyright info
* 1. remove unused vars/funcs 2. ProcessGroupBKCL inherit from ProcessGroupStream
* bugfix for fp16 in eager mode multi-card, test=kunlun
* rebase & fix a few issues
* use new processgroup interface, test=kunlun
* fix compile issue, test=kunlun
-
Committed by Wen Sun
* refactor: send, recv, send_partial, recv_partial
* refactor: rm useless const ref
-
- 09 Nov, 2022 4 commits
-
-
Committed by WangZhen
* Get params and grads in cpp to avoid gpu idle time
* Using python param instead of cpp return param to fix test_asp_optimize_dynamic.py
* Get grads from cpp and construct params_grads on python
* Check meta and remove comments
-
Committed by Paulina Gacek
* Analysis API interface for disabling fc passes
* Unit tests corrected
* Python API added
* test runs only when PADDLE_WITH_MKLDNN
* Fc op changed to relu in matmul_op_test
* Disable fc passes in tests where acc drops
* code formatting
* Unit test for analysisConf added
* Unit test gpu added
* fc passes disabled when iterations=0 in gru test
* style
* passes disabled when fp32 in gru test
* fc passes disabled in lstm test
* Import from inference, not fluid in doc
-
Committed by Wen Sun
-
Committed by wanghuancoder
* refine python call error report
-
- 08 Nov, 2022 2 commits
- 07 Nov, 2022 4 commits
- 04 Nov, 2022 1 commit
-
-
Committed by wanghuancoder
* fix cc_library link python lib
-
- 03 Nov, 2022 1 commit
-
-
Committed by Leo Chen
-
- 01 Nov, 2022 3 commits
-
-
Committed by Yuanle Liu
-
Committed by Ruibiao Chen
* [Auto Parallel] Improve the c++ dist attr
* [Auto Parallel] Modify test_program.py
* Support custom stream for standalone executor

Co-authored-by: Yulong Ao <aoyulong@baidu.com>
-
Committed by shentanyue
-
- 31 Oct, 2022 1 commit
-
-
Committed by Yulong Ao
* [Auto Parallel] Improve the c++ dist attr
* [Auto Parallel] Modify test_program.py
* [Auto Parallel] Add the missing import
-
- 28 Oct, 2022 1 commit
-
-
Committed by Haohongxiang
-