- 31 Aug, 2021 10 commits
- 
- 
Committed by Roc: Co-authored-by: ronnywang <524019753@qq.com>
- 
Committed by Yuang Liu
- 
Committed by Yuang Liu: [cherry-pick][hybrid performance] optim the grad fuse for pipeline mode by sorting the grad by dtype (#35070) (#35300)
- 
Committed by Yuang Liu: [cherry-pick][hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) (#35299)
- 
Committed by Roc: Co-authored-by: gongweibao <weibao.gong@gmail.com>
- 
Committed by Roc: Co-authored-by: WangXi <wangxi16@baidu.com>
- 
Committed by Roc: Co-authored-by: WangXi <wangxi16@baidu.com>
- 
Committed by Roc: Co-authored-by: WangXi <wangxi16@baidu.com>
- 
Committed by Yuang Liu: [cherry-pick][Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965) (#35296) Co-authored-by: WangXi <wangxi16@baidu.com>
- 
Committed by Yuang Liu: Co-authored-by: WangXi <wangxi16@baidu.com>
 
- 
- 18 Aug, 2021 3 commits
- 
- 
Committed by Leo Chen: * add retry for HcclGetRootInfo * refine code * reduce retry interval
- 
Committed by Guoxia Wang: * support class center sample of PartialFC
- 
Committed by Wangzheee: * unitest_quant_dequant * fix * fix * deleted: test_trt_quant_conv2d_dequant_fuse_pass.py * fix
 
- 
- 17 Aug, 2021 14 commits
- 
- 
Committed by Roc
- 
Committed by Aganlengzi
- 
Committed by tianshuo78520a: * fix op-benchmark * test=document_fix
- 
Committed by chentianyu03: * copy boost optional.hpp to paddle * copy boost optional.hpp to paddle * move directions * del fluid/utils * modify .hpp to .h * move directions * modify to paddle::optional * add modification description * format code stype for the files in paddle/utils * format code stype
- 
Committed by Jacek Czaja: * - disabled caching of layer norm - fix in compilation - compilation fix - transpose caching disabled - compilation fix - more compilation fixes - sum caching disabled - compilation fix * - LRN with disabled cache * lint fixes
- 
Committed by chentianyu03: * add exclude rules of pre-commit to paddle/utils and third_party * remove exclude direction distributed/third_party * remove exclude of paddle/utils for format cpplint check
- 
Committed by WeiXin: * polish unittest. * polish code * polish code
- 
Committed by shangliang Xu: * [bug fix] fix unfold negative_size_param
- 
Committed by Peihan: * add mkl multi-thread test cases * fix codestyle * fix codestyle & enable ernie mkl test
- 
Committed by Hui Zhang: * dygraph support more ctc grad scale * scale for 1.x * fix unitest * fix unitest * format code * fix unittest * fix log info * unittest cov * fix format;notest,test=cpu,coverage * skip ctc_loss egs;test=cpu * warpctc grad cov;test=coverage * add dygraph test;test=coverage * format;test=cpu,coverage * format;test=cpu * add api compat;test=cpu * add cpu test * rename * rename * fix * fix test * format * eigen cpu * eigen gpu grad pass * cuda gpu pass * format * fix ci
- 
Committed by Zeng Jinle: * add inplace passes and tests * update * fix use_cuda undefined fix compile error of op compat * add more ut * fix CPU CI error * check adam unique * fix mac/windows ci, improve coverage * fix ci error * follow weihang's comment * fix BlockDesc::MoveFrom * follow qiuliang's comment * update * follow huihuang's comments
- 
Committed by zhiboniu
- 
Committed by Kaipeng Deng: * fix drop_last not work in IterableDataset. test=develop
- 
Committed by niuliling123: fix a bug in nlp: text_matching/sentence_transformers when last dim is 1 and reduce mid dim (#34941)
 
- 
- 16 Aug, 2021 13 commits
- 
- 
Committed by zhangchunle
- 
Committed by Li Min: * Fix typos in english docs for diag and diagflat.
- 
Committed by veyron95: * [NPU] Support npu op:(1)arg_min (2)arg_max * Modify and add unit test cases * Modify unit test cases
- 
Committed by feng_shuai: * change bilinear thread for nano and tx2 * change bilinear thread for nano and tx2
- 
Committed by Baibaifan
- 
Committed by 0x45f: * add size npu op * modify support data type * no longer use NPU size OP * remove useless comments, add test case * fix copyright, remove useless include
- 
Committed by zyfncg: Change the invoking method of settiem by Ellipsis and None index from numpy to set_value op (#34911) * Change invoking mathod of the settiem by Ellipsis and None index from numpy to set_value op * add none_axes into attr of set_value_op in dygraph mode
- 
Committed by Fan Zhang
- 
Committed by joanna.wozna.intel: * Remove force_fp32_output from elementwise_add quantization * Fix cpu_quantize_placement test * Review related changes
- 
Committed by Jacek Czaja: * - Added softmax without caching * - Binary is no longer manually cached * - Activation onednn caching removed * - Removed manual caching of activation * - modified UT * - fix * - fix * - fixes to building * - fix * - fix * - fix to UT * - Faulty UT workaround * - approval workaround * - Fixes after review * - compilation fixes * - more lint fixes * - more fixes after review * - fixes after another round of review * - hopefully compilation fix - compilation fix
- 
Committed by zhangchunle
- 
Committed by Qi Li
- 
Committed by From00: * Add NPU kernel for nearest_interp op * Add grad op * Modify codes according to the review comments * Modify codes according to the review comments
 
- 
