- 12 Feb, 2023 1 commit
-
-
Committed by Xiaoxu Chen
-
- 11 Feb, 2023 2 commits
-
-
Committed by HongyuJia
* init commit * fix tensor operator* * fix compile bug * bug reproduce * update commit * polish codes * fix compile bug * test begin * test begin * compile finish * restore origin composite_backward_api * pass local CI * fix merge error * fix merge error * change py_test from GPU->CPU, test custom op * polish codes, modify prim unittest * modify prim unittest * determine phi_tensor_operants location * polish codes * add header file * solve windows unresolved symbol * fix some CI error * add overload definition * fix CI inference and Windows * polish codes according to reviewers' opinion * polish codes according to reviewers' opinion
-
Committed by Wang Bojun
* eleadd_trans first version log fix * refine code for linear format, add pass check * linear format refine and ut fix * fix ut * windows ut * windows ut 2 * move tensorMeta and alloc to configure
-
- 10 Feb, 2023 14 commits
-
-
Committed by umiswing
-
Committed by Leo Guo
d_bias are nullptr. Modify the code style of full_kernel.cc. Add new data type for concat, elementwise_add, gather, scale, scatter ops. test=kunlun
-
Committed by Aurelius84
* Fix InferMeta in transpose2_grad * fix infershape * fix unittest
-
Committed by ykkk2333
-
Committed by Infinity_lee
-
Committed by RedContritio
* add dim check in scatter * add check in scatter.cu * add unittest * remove unnecessary log and comment --------- Co-authored-by: RedContritio <>
-
Committed by HongyuJia
* fix NLP-Bert model performance loss * fix windows compile error
-
Committed by risemeup1
* fix test_fleet_exe_dist_model_run * test
-
Committed by Weilong Wu
-
Committed by zhupengyang
-
Committed by HongyuJia
-
Committed by Huang Jiyi
* remove AllocatorFacade in phi * fix include * fix bugs
-
Committed by Huang Jiyi
* rm gradient_accumulator in phi * update
-
Committed by wangshengxiang
-
- 09 Feb, 2023 15 commits
-
-
Committed by Zhang Jun
* update * support int64 shape tensor as engine input * add inference_predictor ut
-
Committed by Leo Guo
-
Committed by Roc
Co-authored-by: zhangxiaoci <zhangxiaoci@baidu.com>
-
Committed by joanna.wozna.intel
* Adjust mkldnn_placement_pass to check library type and data type * Check if var has inputs * Remove unrelated test * Refactor
-
Committed by Huang Jiyi
* decouple strided_memcpy * move strided_memcpy * move strided_memcpy to phi * fix namespace * update * fix gpu compile bugs
-
Committed by Huang Jiyi
-
Committed by yuehuayingxueluo
* add multi_tensor_adam * update multi_tensor_base.py, test_multi_tensor_adam.py, adamw.py * fix adam.py optimizer.py * fix adamw.py * fix test_multi_tensor_adam.py * fix CI bug * fix CI coverage * fix ci bug * fix betapow * fix some bugs * fix test_adamw_op.py * fix CI coverage * fix multi_tensor_adam_kernel.cc * fix CI bug * fix multi_tensor_adam_op.cc and test_multi_tensor_adam.py * fix code style * update C++ parts * remove python parts modification temporarily * add C++ ut * update betapow copy code logic * fix ci ut * fix windows ci * fix coverage ci * improve coverage rate --------- Co-authored-by: sneaxiy <sneaxiy@126.com>
-
Committed by LiYuRio
-
Committed by zhoutianzi666
* add fmha_flashattention oss plugin * add fmhca * add oss fmhca * code reconstruct and add ut * code style refine * fix ut and enforce check * refine trt version check refine compile fix compile * fix cross ut * code refine * use runtime trt version check * bug fix and code refine * compile fix * merge develop * add GN QDQ kernel * support GN int8 fake kernel * add with_int8 * add GN int8 fake kernel * add GN int8 fake kernel * add GN int8 fake kernel * add GN int8 fake kernel * add GN int8 fake kernel * add GN int8 fake kernel * add GN int8 fake kernel * add GN int8 UT * add version > 8000 in GN int8 UT * add some check in .cu * add stdlib.h in UT * little change in .cu * remove rand_r use rand * remove use rand * setAxis(1) * when int8 is on allow fall back to fp16 --------- Co-authored-by: wwbitejotunn <wang_bojun@outlook.com>
-
Committed by kangguangli
* fix judgement about scope validation * fix ci bug: same address is not enough for data consistency * remove useless check
-
Committed by pangengzheng
-
Committed by zhangbo9674
* add TypeID * Specification comment code * refine code * add AbstractType * add TypeStorage * fix unittest bug * change dir * change dir * refine code * fix bug * Refine code by comment * delete unused code * normative naming rules * refine code by comment * refine doc * refine codestyle
-
Committed by zhangyikun02
-
Committed by Wang Bojun
* trans_layernorm
-
Committed by 傅剑寒
-
- 08 Feb, 2023 8 commits
-
-
Committed by Paulina Gacek
* QuantTranspose pattern is being found by pass * quant + transpose fuse * code style changes * UT written, reorder fixed * Dequantize + transpose2 fuse added * pass name changed * UT added & shift corrected * got rid of redundancy * review changes * AsIntermediate corrected * compat added
-
Committed by Sławomir Siwek
* add support for bf16 fused_ops * fused_matmul only
-
Committed by wangxiaoning
* fix codestyle * fix std
-
Committed by Zhang Jun
* update * update * format code * update * Update test_trt_convert_nearest_interp_v2.py
-
Committed by Yuang Liu
-
Committed by zmxdream
* hidden unzip * fix * fix
-
Committed by weishengying
-
Committed by HongyuJia
-