- 10 Aug, 2022 2 commits
-
-
Committed by Leo Chen
* fix proto bug * add ut * reset need_update for var_desc * refine code * fix var desc order issue
-
Committed by Aurelius84
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute * add unittest for inference predictor
-
- 09 Aug, 2022 2 commits
-
-
Committed by yeliang2258
-
Committed by yeliang2258
* fix a bug in transpose2 about mkldnn * fix bug
-
- 08 Aug, 2022 1 commit
-
-
Committed by Leo Chen
* clean tensor.h * fix gather_nd
-
- 05 Aug, 2022 4 commits
-
-
Committed by Feiyu Chan
fix 5 operator makers with typos which pass string literal to argument 'generated', remove generated as parameter of AddAttr (#44935)
-
Committed by YuanRisheng
* move mkldnn activation kernel * fix compile bugs * fix compile bugs * deal with conflict * fix compile bugs * fix windows compile bugs * mkldnn unittest fix * change mutable to alloc * fix unittest bugs * modify code according to comments
-
Committed by Zhen Wang
-
Committed by Sławomir Siwek
* remove v2_transpose_reshape * matmul_transpose_reshape * reshape_transpose_matmul * restore ut * adjust old ut * restore parallel UT rules * feedback from review
-
- 04 Aug, 2022 2 commits
-
-
Committed by Sławomir Siwek
* Add unit tests * matmul_v2 + activation * matmuls + elementwise_add * matmul_v2 postops * transform matmul to v2 * opcompat * fix fusing matmul with multiple outs * add shape constraints * remove unused vars * change pass order * - Unit tests to be debugged - fix - refactor - diagnostic - more diagnostic - fix - Fix number two - fix - fix - fix - alpha added - more fixes - compilation fix - removed diagnostic code - cosmetic fixes * lint * add alpha constraint * merge matmul refactor * trigger CI * - fix * - another fix * code style * add support for matmul+elementwise_add+activation * code style * fix bfloat16 bugs * change append_binary to append_sum Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
-
Committed by 王明冬
-
- 03 Aug, 2022 2 commits
- 02 Aug, 2022 6 commits
-
-
Committed by Leo Chen
-
Committed by Wilber
* multihead matmul add fp16 * fix windows error * fix rocm error * fix rocm error
-
Committed by danleifeng
-
Committed by Weilong Wu
* polish and rename, pt* -> phi* * fix code format
-
Committed by Ruibiao Chen
* Skip inplace for coalesce_tensor_op outputs * Fix typos * Add UTs * Fix typos
-
Committed by Ruibiao Chen
* Refactor build_op_downstream_map for standalone executor * Add some comments
-
- 01 Aug, 2022 3 commits
-
-
Committed by Leo Chen
* remove cudaDeviceContext * remove more template * fix rocm compile * remove alias name CUDADeviceContext * fix compile * fix tests * revert changes
-
Committed by danleifeng
Co-authored-by: seemingwang <zsasuke@qq.com> Co-authored-by: DesmonDay <908660116@qq.com> Co-authored-by: seemingwang <seemingwang@users.noreply.github.com> Co-authored-by: Thunderbrook <a754913769@163.com> Co-authored-by: xuewujiao <105861147+xuewujiao@users.noreply.github.com> Co-authored-by: root <root@yq01-sys-hic-k8s-v100-box-a225-0693.yq01.baidu.com> Co-authored-by: Thunderbrook <52529258+Thunderbrook@users.noreply.github.com> Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0009.yq01.baidu.com> Co-authored-by: huwei02 <53012141+huwei02@users.noreply.github.com> Co-authored-by: yaoxuefeng <yaoxuefeng@baidu.com> Co-authored-by: lxsbupt <luoxsbupt@163.com> Co-authored-by: miaoli06 <106585574+miaoli06@users.noreply.github.com> Co-authored-by: root <root@yq01-inf-hic-k8s-a100-ab2-0008.yq01.baidu.com> Co-authored-by: chao9527 <33347532+chao9527@users.noreply.github.com> Co-authored-by: qingshui <qshuihu@gmail.com> Co-authored-by: yangjunchao <yangjunchao@baidu.com>
-
Committed by Wangzheee
* add varlen_token_prune plugin, pass, convert
-
- 29 Jul, 2022 4 commits
-
-
Committed by Leo Chen
* remove cudaDeviceContext * remove more template * fix rocm compile
-
Committed by JZ-LIANG
* fixed bug for pass & engine * fixed bug for benchmark GPT-3 * add tuner & profiler * add algorithms & config
-
Committed by Leo Chen
* init * move CUDAStream to phi * fix compilation * merge develop * add stream_owned_ member * split cuda_stream.h * fix cpu compile * fix constructor * fix bug * fix windows compile * fix inference test_levit * fix windows tests
-
Committed by houj04
-
- 27 Jul, 2022 1 commit
-
-
Committed by pangyoki
* fix RemoveNode in fuse_elewise_add_act_pass * fix * change pointer to shared_ptr * fix * fix * fix format * fix * fix graph_safe_remove_nodes
-
- 26 Jul, 2022 5 commits
-
-
Committed by Zhen Wang
* Add a feed op before each input parameter var. * Fix some issues about the unit test build_cinn_pass_test.
-
Committed by Ruibiao Chen
-
Committed by ziyoujiyi
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud comm ready * . * . * fl-ps v1.0 * . * support N + N mode * . * . * . * . * delete print * . * . * . * . * fix bug * . * . * fl-ps with coordinator ready * merge dev * update message parse only * update fl client scheduler * fix bug * update multithreads sync * fix ci errors * update role_maker.py * update role_maker.py * fix ci error: windows py import error * fix ci error: windows py import error * fix windows ci pylib import error * add dump fields & params * try to fix windows import fleet error * fix ps FLAGS error
-
Committed by Ruibiao Chen
* Set more attrs in ReplaceScaleLossGradOp * Fix typos * Fix CI errors * Add UT
-
Committed by Ruibiao Chen
-
- 25 Jul, 2022 1 commit
-
-
Committed by lyq
-
- 21 Jul, 2022 2 commits
-
-
Committed by zhaocaibei123
* add slot attr for push sparse op * add pybind * remove fleet * add unittest * fix
-
Committed by xiaoxiaohehe001
* convfusionfp16 * convfusionfp16 * convfusionfp16
-
- 20 Jul, 2022 5 commits
-
-
Committed by zmxdream
* Update ps_gpu_wrapper.h * Update ps_gpu_wrapper.h * Update ps_gpu_wrapper.cc
-
Committed by danleifeng
* add adam/sharedadam optimizer for gpups; edit optimizer struct; test=develop
-
Committed by houj04
* device_guard support xpu. test=kunlun * sum op of xpu support LoDTensorArray. add test for while op of xpu. test=kunlun.
-
Committed by zmxdream
* fix FleetWrapper initialize
-