- 18 Oct, 2022 1 commit
-
-
Committed by Haohongxiang
* [Dygraph] Fix performance of pp+mp by using send/recv_calc_stream instead of send/recv (#46116)
* [Dygraph] Fix Perf of FusedFeedForward and FusedAttention with AllReduce (#46780)
* update
-
- 17 Oct, 2022 6 commits
-
-
Committed by Wen Sun
* Support both use_calc_stream and sync_op in send recv APIs (#46023)
* Support both use_calc_stream and sync_op in allgather API (#46295)
* Support both use_calc_stream and sync_op in collective communication API (#46761)
* Move group and all reduce from collective to communication (#45848)
* Completes bfloat16 dtype for collective api in eager mode (#45844)
* Fix collective APIs cannot be recognized when building docs (#46962)
Co-authored-by: LiYuRio <63526175+LiYuRio@users.noreply.github.com>
-
Committed by zhangkaihuo
cherry-pick: #46322, #46245 Sparse API supports static graph
-
Committed by Guanghua Yu
* fix dygraph new format quant
* fix unittest
* fix conflict
-
Committed by Allen Guo
-
Committed by Allen Guo
-
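Committed by Zhang Zheng
To improve performance, the label bounds check is moved from the Python side into the kernel, which avoids calls to extra ops such as min, max, and synchronous copies. The current template parameter IgnoreIndex only takes effect when ignore_index lies in the range [0, dim). However, when a label value is out of bounds and ignore_index equals that label, the computation should still proceed normally. Although the current computation logic does not produce wrong results, it is still logically flawed, and the template parameter IgnoreIndex is unnecessary.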
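The behavior described in the commit message above can be sketched in plain Python. This is only an illustration under stated assumptions, not the actual GPU kernel: the function name `cross_entropy_hard_label`, the default `ignore_index=-100`, and raising `ValueError` on an out-of-range label are all hypothetical choices made for this sketch.

```python
import math


def cross_entropy_hard_label(logits, labels, ignore_index=-100):
    """Per-sample cross-entropy loss with hard labels.

    Hypothetical sketch: the bounds check happens inside the per-sample
    loop (standing in for the kernel), not in a separate min/max pass.
    """
    losses = []
    for row, label in zip(logits, labels):
        # ignore_index is tested first, so it may lie outside [0, dim)
        # and no separate IgnoreIndex switch is needed.
        if label == ignore_index:
            losses.append(0.0)
            continue
        dim = len(row)
        # Bounds check for every label that is not ignored.
        if not 0 <= label < dim:
            raise ValueError(f"label {label} is out of range [0, {dim})")
        # Numerically stable log-sum-exp.
        row_max = max(row)
        log_sum = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        losses.append(log_sum - row[label])
    return losses
```

Because the ignore test precedes the range test, an out-of-range label that equals `ignore_index` is skipped rather than rejected, which is the case the commit message calls out.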
-
- 14 Oct, 2022 7 commits
-
-
Committed by Wilber
-
Committed by Guanghua Yu
-
Committed by xiaoxiaohehe001
-
Committed by Aurelius84
-
Committed by Aurelius84
* [BUG] Fix expand_as_v2 bug when X and Y have different dtypes
* fix commit
-
Committed by Zhang Jun
* fix reshape2 opteller; add elementwise min/max register for tensorrt
-
Committed by zhoutianzi666
-
- 13 Oct, 2022 2 commits
-
-
Committed by 傅剑寒
Fix set_value failure when the source tensor has fp16 dtype and the destination value is a number (dev PR link: #46801)
-
Committed by Sławomir Siwek
* Revert pool+grad oneDNN kernel conversion (#45989)
* [PHI] transpose2_grad op migration (#46139)
* op migrated, Copy(OneDNNContext, ...) added
* mutable_data & op registration in fluid removed
* refactoring
* OneDNNGetDataType to uppercase
* missing cpu check added, handler moved to .h file
* name changed to transpose_grad
* Copy changed back to TensorCopy
* Resizing corrected, Copy(OneDNNContext) removed
Co-authored-by: Piotr Paturej <48731682+piotrekobi@users.noreply.github.com>
Co-authored-by: Paulina Gacek <paulina.gacek@intel.com>
-
- 12 Oct, 2022 2 commits
-
-
Committed by niuliling123
Cherry-pick 46541: ensure automatic layout tuning works for the ResNet50, TSM, and deeplabv3 models with zero modifications
-
Committed by ronnywang
cherry pick pr46536
-
- 11 Oct, 2022 2 commits
-
-
Committed by Sławomir Siwek
-
Committed by Yuang Liu
* bug fix for virtual pipeline parallel (#45922)
* don't wait for send op under dygraph pp (#46209)
* [interleave pp] sync recv for 1f1b (#46399)
* [dygraph pp] all sync for allgather partial (#46483)
-
- 10 Oct, 2022 2 commits
-
-
Committed by feng_shuai
* fix gather op convert to only support int32 index as input.
* add ut
-
Committed by Aurelius84
-
- 09 Oct, 2022 1 commit
-
-
Committed by xiongkun
* 1. refactor the return transformer. 2. fix some bugs in return transformer.
* support raising an error when the return stmt's parent is a for or while loop
* fix ci error.
* fix ci error and add some unittest
* code format
* fix ci error
-
- 29 Sep, 2022 3 commits
-
-
Committed by 傅剑寒
Add FP16 support for uniform in dygraph mode on Nvidia GPU (dev PR link: PR46212)
-
Committed by zyfncg
* set flag of clip_extra in save_inference_model to true (#46151)
* open the clip_extra flag in paddle.static.save_inference_model, test=allcase (#46456)
* Open the clip_extra flag in TracedLayer.save_inference_model (#46473)
* open the clip_extra flag in paddle.static.save_inference_model, test=allcase
* set the default value of clip_extra in TracedLayer from False to True, test=allcase
* update English doc of paddle.static.save_inference_model, test=document_fix (#46484)
* Fix clip_extra logic in remove_training_info (#46534)
* fix clip_extra code in remove_training_info
* revert rnn opmaker clear
-
Committed by weishengying
-
- 28 Sep, 2022 2 commits
-
-
Committed by Chen Weihang
* fix libpaddle soname mismatch error
* fix windows failed
* polish linux and windows make impl
* unify windows lib name
* fix windows error
* revert copy dst change
* revert naming change
* revert windows change
* fix gpups compile failed
-
Committed by zhoutianzi666
-
- 27 Sep, 2022 4 commits
-
-
Committed by zhaoyingli
-
Committed by zyfncg
* Clear extra attrs of elementwise op in OpMaker (#45845)
* clear extra attrs of elementwise op in opmaker
* fix op_debug_string_test
* fix bug of grad_add
* fix sort of runtime attrs
* Clear extra attrs of scale in OpMaker (#45984)
* clear extra attr of scale in opmaker
* fix sum bug
* fix merge conflict
* fix minus
* Clear extra attributes of some Op in OpMaker (Part4) (#46060)
* clear extra attr of some ops in opmaker
* revert clear use_cudnn for pool
* fix test_operator_desc
* fix Attr interface of OperatorBase
* fix code style
-
Committed by Hui Zhang
-
Committed by LiYuRio
-
- 26 Sep, 2022 3 commits
-
-
Committed by ziyoujiyi
* back fl
* delete ssl cert
* .
* make warning
* .
* unittest paral degree
* solve unittest
* heter & multi cloud comm ready
* .
* .
* fix gloo compile warning
* adapt for nn fl-ps
* flps del fake-init op
* add learning_rate_0 initializer op
* bug fix
* .
* .
-
Committed by feifei-111
-
Committed by Hui Zhang
* fix sub sign reverse for mkldnn
* refactor code as comment
* remove useless
-
- 24 Sep, 2022 1 commit
-
-
Committed by YangZhou
* unexpose audio ParameterError
* clean audio utils api
-
- 23 Sep, 2022 4 commits
-
-
Committed by Aurelius84
-
Committed by feifei-111
* use re replace judge by case
* simplify re
-
Committed by xiongkun
-
Committed by Aurelius84
* [BugFix] Fix reduce_mean/min/sum/prod, cumsum grad_op infershape bug
* fix typo
* fix typo
-