- 02 Jun, 2022 10 commits
-
-
Committed by Wangzheee
* new general transformer inference support
-
Committed by Zhang Zheng
* Delete inplace strategy in group_norm_fwd
* fix
-
Committed by wanghuancoder
* first run accumulation node
-
Committed by Siming Dai
* support heter reindex
* add unittest, fix bug
* add comment
* delete empty line
* refine example
* fix codestyle
* add disable static
-
Committed by Jackwaterveg
* fix usage of prefetch_factor
* add assert
* add docstring and change prefetch_factor when num_workers=0
* fix doc
-
Committed by Guoxia Wang
-
Committed by Li Min
* extend forward fast_ln_kernel to support more column values.
-
Committed by zhaoyingli
* prepare only once
-
Committed by zhaoyingli
-
Committed by sneaxiy
* support CUDAGraph for partial graph
* add ut
* fix ci
* fix ut again because of eager mode
* fix kunlun ci
* fix win ci
-
- 01 Jun, 2022 22 commits
-
-
Committed by xiongkun
-
Committed by YuanRisheng
* add yaml
* fix infrt compile bugs
-
Committed by Aganlengzi
-
Committed by Qi Li
-
Committed by BrilliantYuKaimin
* Update random.py
* test=document_fix
* test=document_fix
* Update random.py
-
Committed by Guoxia Wang
-
Committed by sneaxiy
* support weight transpose
* add ut
* add template
* fix transpose error
* fix transpose_comment
* add api tests
* add skipif
* add doc
-
Committed by YUNSHEN XIE
-
Committed by zhouweiwei2014
-
Committed by Sing_chan
-
Committed by JZ-LIANG
* adapt for 10 loss
* partitioner support optimizer
-
Committed by BrilliantYuKaimin
-
Committed by houj04
* update xpu cmake: xdnn 0527. test=kunlun
* update to xdnn 0531.
* update to xdnn 0531. test=kunlun
* update to xdnn 0601. test=kunlun
-
Committed by zhangchunle
unittest parallel
Co-authored-by: zhangbo9674 <zhangbo54@baidu.com>
-
Committed by Ruibiao Chen
* Add pinned memory to HostMemoryStats
* Add macro for WrapStatAllocator
* Fix CI errors
-
Committed by zhiboniu
-
Committed by Guoxia Wang
* fix the bug of adamw which set the attribute in param group not working
* fix undefined variable
* fix api example typo
* add unittest
* fix unittest typo
-
Committed by huzhiqiang
-
Committed by caozhou
-
Committed by Yulong Ao
* [Auto Parallel] Add the parallel tuner
* [Auto Parallel] Improve the parallel tuner and fix some bugs
* update cost model
* update import Resharder by dist op
* update cost model
* fix comp cost bug
* update cost model
* [Auto Parallel] Amend the dist attr for #processes=1
* update cost model and tuner
* update cost model and tuner
* update cost model and tuner
* update cluster
* update reshard
* [Auto Parallel] Add the estimation from the cost model
* [Auto Parallel] Reimplement the backup and restore functions
* [Auto Parallel] Fix the bugs of the parallel tuner
* [Auto Parallel] Update the engine api and dist context
* [Auto Parallel] Work around the high order grad problem
* [Auto Parallel] Add some miscellaneous improvements
* [Auto Parallel] Add a unittest for DistributedContext
Co-authored-by: caozhou <caozhou@radi.ac.cn>
-
Committed by chentianyu03
* add conv3d yaml
* add conv3d_grad, conv3d_double_grad
* add final_state_conv3d test case
* add conv3d double test case
* add depthwise_conv2d grad yaml
* add depthwise_conv2d double grad test case
* modify the order of args
* add depthwise_conv2d_grad_grad config
-
- 31 May, 2022 8 commits
-
-
Committed by Sławomir Siwek
* remove attrs from base op
* fix typos
* remove brelu
* undo removing code related to matmul
* remove whitespaces
* undo changes in matmul
* remove empty line
-
Committed by pangyoki
* add double_grad and triple_grad inplace info in backward.yaml
* only generate inplace api in forward
-
Committed by wanghuancoder
* fix full zero
* fix full zero
* fix full zero
* fix full zero
* refine
* refine
* refine
-
Committed by Sing_chan
-
Committed by Chen Weihang
* fix assign kernel copy impl
* fix test failed
-
Committed by BrilliantYuKaimin
-
Committed by cambriconhsq
-
Committed by yaozhixin
* [IPU] support paddle.distributed.launch with IPUs
* add device_num to env_args_mapping
-