- 26 Aug, 2022 8 commits
-
-
Committed by Ruibiao Chen
-
Committed by Ruibiao Chen
-
Committed by Ruibiao Chen
-
Committed by zyfncg
* delete fill xpu op in fluid
* delete fill_constant header, test=kunlun
* fix npu header, test=kunlun
-
Committed by houj04
-
Committed by kangguangli
* remove fluid kernel and activate phi kernel
* fix parameter error
* transfer mkldnn part
* modify header file path
* fix compile error
* transfer special case
* fix lod setting and special case for layout setting
* add testcase and refine code
-
Committed by haosicheng
* add temporal shift and grad *test=kunlun
* fix reduce mean grad bug *test=kunlun
-
Committed by xiongkun
* while support for Python containers, which makes it convenient to convert more dynamic-graph code into static graphs
* cond support for Python containers
* 1. make select_input output shape = input[1] 2. add warning for risky assignment in while_loop
* fix 2 problems in GPT export: 1. a bug in while_op no_need_copy_var, which causes a GPU memory leak 2. a bug in undefined_var, where stop_gradient should be False
* change name per code review
* format
-
- 25 Aug, 2022 7 commits
-
-
Committed by Aurelius84
* [OpAttr] axis of Reverse supports Tensor type
* fix coverage
* fix unittest
-
Committed by Aurelius84
* [OpAttr] min/max of Uniform_rand support Tensor type
* fix typo
-
Committed by kangguangli
* transfer memcpy_d2h from fluid to phi
* refine arg check and add comment
* fix cannot fallback to phi kernel
* fix gpu_context host alloc when tensor size = 0
* add kernel for std::vector<DenseTensor> args
* fix bugs in MemcpyD2HMultiIOKernel
* remove useless header file
* polish format
* fix typo
* add testcase for cudapinned place
* refine check condition in test
* polish error message
* remove header in fluid directory
* merge memcpy_h2d and memcpy_d2h into one file, change register method to simplify implementation
* fix code style check
-
Committed by ronnywang
* [NPU] add run_program_op_npu
* add run_program_op_npu ut
-
Committed by hong
* optimize conv algo speed
* code polish
* remove useless code
* fix compile error
* fix cpu compile error
* not use cudnn algo t
* add search cache max number
* polish code
* fix cache test bug
* add groups data format to conv args
* fix cache test bug
* fix cudnn_deterministic bug
* fix test switch auto tune bug
* fix test switch autotune bug
* fix conv cache bug
* fix cache test error
* fix cache test bug
* fix windows mac compile error
* fix workspace search error
* update cudnn cache
* fix cache test bug; test=develop
* fix autotune switch test error
* polish code
* polish code
-
Committed by Rayman
-
Committed by USTCKAY
-
- 24 Aug, 2022 6 commits
-
-
Committed by Leo Chen
* make tensor_util contain no cuda code
* refine isfinite
* revert ut
* move isfinite function to its op
* fix test
* fix compile
* std::isnan is not defined for int type on windows
* fix windows compile
* fix fp16
* fix rocm compile
* revert gradient node
-
Committed by WangZhen
-
Committed by mengqingchun02
* support beam_search operator on xpu. test=kunlun
* support fp16 of adam operator in xpu environment. test=kunlun
-
Committed by WangZhen
* Adapt minlength attr for bincount
-
Committed by wenbin
* fix
* optimize
-
Committed by zhaoying9105
-
- 23 Aug, 2022 2 commits
-
-
Committed by niuliling123
-
Committed by YuanRisheng
* move distribute_fpn_proposals
* fix some code
* fix yaml bugs
* add set dtype
* move proposal_impl to funcs
* fix compile bugs
-
- 20 Aug, 2022 1 commit
-
-
Committed by Sing_chan
* add max_p without test
* add test of max_p
* make max_p consistent with paddle.maximum
-
- 19 Aug, 2022 5 commits
-
-
Committed by HongyuJia
-
Committed by houj04
-
Committed by mengqingchun02
* support beam_search operator on xpu. test=kunlun
* make up beam_search_decode operator test cases on xpu and cpu environment. test=kunlun
-
Committed by dongfangshenzhu
* add merged_momentum *test=kunlun
* add fp16 to merged_momentum, *test=kunlun
* change dist_model.cc
* add merged_momentum unittest and change momentum, test=kunlun
-
Committed by mengqingchun02
* support beam_search operator on xpu. test=kunlun
* fix beam_search operator bugs on xpu. test=kunlun
* support beam_search_decode operator on xpu. test=kunlun
-
- 18 Aug, 2022 2 commits
-
-
Committed by pangyoki
apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085)
* apply inplace addto in python apply_pass
* fix
* apply inplace pass for program
* skip feed and fetch var
* fix block_desc.move_from
* fix block desc
* alltoall remove inplace
* fix
-
Committed by Aurelius84
* [OpAttr] Squeeze axes support Tensor
* add support_tensor
* fix unittest
* fix coverage
-
- 17 Aug, 2022 4 commits
-
-
Committed by Aurelius84
* [OpAttr] Add SupportTensor for OpMaker
* fix typo
* fix code style
* add SupportTensor for concat op
* add unittest for register Tensor
* add shape checker and split attribute
-
Committed by Wilber
* fix multi stream error.
-
Committed by fwenguang
-
Committed by ykkk2333
* xpu unittest grad compute supports more types, *test=kunlun
* add instance norm xpu, *test=kunlun
-
- 16 Aug, 2022 5 commits
-
-
Committed by Chen Weihang
* move check finite and unscale kernel into phi
* move infershape into phi
* move update_loss_scaling kernel into phi
* remove original kernels
* move update loss scaling infershape into phi
* add header for xpu and npu
* solve coverage failed
* fix npu test failed
* remove mutable data in cu file
* fix new executor failed
* add valid check for meta tensor output
-
Committed by feng_shuai
* convert multihead to oss
* fix: bug
* fix: delete const cast
* fix: don't support bias_qk
* add vit pass
* fix: convert bug and add preln_residual_bias
* support length=-1
* add UT for convert
* add no_bias_qk support for gpu_multihead_op
* delete infer_shape depends on bias_qk
* oss can only be used on T4 and A*
* fix: change api for ROCM CI
-
Committed by Aganlengzi
-
Committed by feifei-111
* fix_shape
* code style
* fix assert
* fix to_tensor bad return
-
Committed by Sing_chan
* add select_p
* fix bugs
* add custom test for select_p; modify select_p primrules
* modify according to xiaoxu's comment
* add eq_p, select_p, pow_p, use autograd to test grad of high order
* add requirement of autograd, modify expected type of eq
* modify according to xiaoxu's comment
* import primops to use primops.pow
-