- 20 Oct, 2022 10 commits
-
-
Committed by Kaipeng Deng
* add fused_multi_transformer_encoder/decoder pass; GPT-3 runs successfully
-
Committed by HongyuJia
* remove fc mkldnn hardcode
* remove useless enum of kFCMKLDNN
* fix macro error
* update operators.cmake
-
Committed by Sylwester Fraczek
-
Committed by Xinger
-
Committed by JingZhuangzhuang
* Add infer prune function
* Update phi.cmake
* Update operators.cmake
* add fusion op
-
Committed by JingZhuangzhuang
* add _get_phi_kernel_name interface
* remove inference interface
* Revert "remove inference interface" (this reverts commit 784a8a6c51fa2dc49a01c8699525298ac21b178f)
-
Committed by Chen Weihang
-
Committed by Tony Cao
* Fix W605 in tools folder by adding escape symbols
* Fix W605 in incubate and some other folders
* Fix W605 in /fluid/test folders
* Update tools/analysisPyXml.py (Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>)
* Add some changes to manual and auto escape symbols
* revert changes in transformer.py
* Fix new code with W605 error: add escape symbols
* revert changes in transformer.py
* revert changes in transformer.py
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
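For context on the W605 fixes listed above: W605 is the pycodestyle/flake8 warning for an invalid escape sequence in a Python string literal. The sketch below is illustrative only and is not taken from the Paddle sources; the pattern and the function name `find_versions` are hypothetical. It shows the two equivalent fixes, manually escaping the backslashes or using a raw string.

```python
import re

# A literal such as "\d+" in a normal string triggers W605, because \d is
# not a valid Python string escape. Two equivalent ways to write the same
# regular expression without the warning:
escaped = "\\d+\\.\\d+"  # manual fix: escape each backslash
raw = r"\d+\.\d+"        # alternative fix: raw string literal


def find_versions(text, pattern=raw):
    """Return all 'major.minor' substrings found in text (hypothetical helper)."""
    return re.findall(pattern, text)
```

Both literals produce the identical pattern string, so either fix leaves the behavior of the code unchanged.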
-
Committed by Weilong Wu
-
Committed by Li-fAngyU
* close Wno-error=sign_compare
* close Wno-error=sign_compare
* Update CMakeLists.txt
-
- 19 Oct, 2022 17 commits
-
-
Committed by Weilong Wu
-
Committed by Wang Xin
-
Committed by RichardWooSJTU
-
Committed by Xinger
-
Committed by Nyakku Shigure
-
Committed by Yuanle Liu
-
Committed by zyfncg
-
Committed by zyfncg
* rename op in yaml
* fix test_layout_autotune
* fix layout autotune of transpose
-
Committed by Ruibiao Chen
* Support stream overlap for c_allreduce_sum
* Test CI
* Add notes
* Add SingleStreamGuard for BuildOpFuncList
-
Committed by Yiqun Liu
Record whether the conv algorithm was obtained by exhaustive search, to fix an autotune cache bug. (#47065)
-
Committed by Charles-hit
* support uniform api in new ad
* add unit test for uniform_random_p
* resolve conflict
* fix uniform_random orig2prim
* fix primrules
* remove ShapeTensor and ShapeTensorList input in uniform_random_p op and add sigmoid orig2prim rules
-
Committed by WangZhen
* Fix recurrent op eager deletion pass error in dy2st
* Polish code
* Refine error message
-
Committed by will-jl944
-
Committed by Hui Zhang
* cond infer apply exec separate
* fix bugs
* fix per review comments
-
Committed by Leo Chen
* clean unused code: piece.cc/h
* clean usage
-
Committed by wanghuancoder
-
Committed by Li-fAngyU
-
- 18 Oct, 2022 10 commits
-
-
Committed by weishengying
-
Committed by zhoutianzi666
* Rewrite strided_slice converter using shape tensor
* clean code
-
Committed by Wang Bojun
* first version, accuracy corrected
* disable debug print
* use blockReduceSum in phi
* add UT
* add opCompat
* code style
* code refine
* bug fix
* code refine
* test fix
* bugfix
* code style fix
* code style
* code-style
* code-style
* code-style
-
Committed by Sławomir Siwek
* git
* style
* leave default relu in kernel
* style
* cleanup FCMKLDNN pattern
* merge conflicts
* update develop
* update develop
* add const
* rename to oneDNN and adjust attributes
* whitespace
-
Committed by Hui Zhang
* cond infer apply exec separate
* fix bugs
-
Committed by Wilber
-
Committed by xiaoxiaohehe001
-
Committed by Weilong Wu
-
Committed by zyfncg
* support generating opmaker code for backward ops that invoke forward ops
* support code-gen of opmaker for sparse op
* refine logic of choosing phi kernel
* fix compile bug
* fix code_gen bug
* fix bug
* fix kernel signature code-gen
* fix compile bug of VarType
* fix compile bug of VarType
* fix test_sparse_conv_op
* fix test_sparse_norm_op
-
Committed by HongyuJia
-
- 17 Oct, 2022 3 commits
-
-
Committed by Ghost Screaming
* Fix bug in reduce_sum op: the result is wrong when input.numel() > INT32_MAX.
* Support allow_partial switch, which can be configured in pipeline_configs. If the tensors sent from different hosts are not the same, they shouldn't be sent partially and then concatenated into a whole tensor.
* Rename allow_partial to enable_partial_send_recv.
* Add global variable _enable_partial_send_recv
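The reduce_sum fix above concerns inputs with more than INT32_MAX elements. The commit message does not say what went wrong internally. One common cause of this class of bug, assumed here purely for illustration, is an element count or index held in a signed 32-bit integer. A minimal Python sketch of that wraparound follows; the helper names are hypothetical and are not Paddle APIs.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_int32(n):
    """Return n as it would read back after being stored in a signed 32-bit int."""
    # ctypes integer types do not check for overflow; the value wraps.
    return ctypes.c_int32(n).value


def needs_64bit_index(numel):
    """True when an element count cannot be represented as a signed 32-bit int."""
    return numel > INT32_MAX
```

Under this assumption, a kernel that receives a count above INT32_MAX would need a 64-bit index type (or a chunked reduction) to see the true count rather than a wrapped one.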
-
Committed by Ghost Screaming
* Fix bug in reduce_sum op: the result is wrong when input.numel() > INT32_MAX.
* support pure bfloat16
* support bf16 linear
* update PR to pass CI
* tiny fix where_grad_kernel.cu
* Support bfloat16 type for reducer and sharding.
* Fix some bugs.
* Polish code.
* Polish code.
* Add bfloat16 datatype in fill_grad kernels.
Co-authored-by: sneaxiy <sneaxiy@126.com>
-