- 21 11月, 2022 4 次提交
-
-
由 Sławomir Siwek 提交于
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid depedency * init * ExecuteMatMulV2 * rm fluid kernel * matmul_grad * remove mutable_data * mul_grad
-
由 lzy 提交于
* use mma for QK dot computing in fused_multi_transformer. * Update fused_multi_transformer_op.cu.h
-
由 huangjiyi 提交于
* move cross_entropy from fluid to phi * replace mutable_data with Alloc * use .template
-
由 Wen Sun 提交于
* refactor: replace Collective & PointToPoint with NCCLEnv * refactor: rename to RunFnInNCCLEnv * refactor: pass std::function by value
-
- 18 11月, 2022 7 次提交
-
-
由 MarDino 提交于
* fused qkvBiasAdd and transpose with split qkv * fix typo * fix format * fix name * add annotation * fix comment
-
由 Sławomir Siwek 提交于
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid depedency * init * ExecuteMatMulV2 * rm fluid kernel * matmul_grad * remove mutable_data
-
由 Zuza Gawrysiak 提交于
* Migrate conv_transpose to phi * Move handler to kernel * kernel m * Fix formatting * handler * remove fluid * revert tcp_store * tcp_store * remove unused * Fix declaration * add dnn input * Fix typo Co-authored-by: NSławomir Siwek <slawomir.siwek@intel.com>
-
由 MarDino 提交于
* Add quick gelu and fused bias add kernel * fix annotation * remove useless code * add fast gelu option and set it in multi transformer op * add flag to restrict if use fast gelu approximate * fix flags conflict * fix use tanh function instead * add cudart version limit * use phi fast tanh func * fix comment
-
由 Wang Xin 提交于
* remove "gpu_primitives.h" in fluid namespace * fix PR-CI-GpuPS fail * fix PR-CI-GpuPS fail
-
由 feng_shuai 提交于
-
由 huangjiyi 提交于
-
- 17 11月, 2022 6 次提交
-
-
由 zyfncg 提交于
* clip extra and intermediate output of op * fix bug * fix bug * polich code * polich log
-
由 huangjiyi 提交于
-
由 YuanRisheng 提交于
* standard api * fix xpu bugs
-
由 taixiurong 提交于
-
由 huangjiyi 提交于
* rm "paddle/fluid/operators/math.h" in phi * rm "paddle/fluid/operators/math.h" in fluit
-
由 zyfncg 提交于
-
- 15 11月, 2022 4 次提交
-
-
由 YuanRisheng 提交于
-
由 jakpiase 提交于
* optimization for ln * fix * added output to gpd * added formatting * fix
-
由 zhouweiwei2014 提交于
-
由 Sławomir Siwek 提交于
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid depedency
-
- 14 11月, 2022 3 次提交
-
-
由 Wen Sun 提交于
* refactor: simplify send, recv interfaces * refactor: rm send_partial, recv_partial, all_gather_partial
-
由 xiaoxiaohehe001 提交于
-
由 Ruibiao Chen 提交于
-
- 11 11月, 2022 3 次提交
-
-
由 zhouweiwei2014 提交于
-
由 zhangbo9674 提交于
* refine shape op in new_exe * Revert "refine shape op in new_exe" This reverts commit 0e0336ddc5eede3da019b348a0bcc0ef0f3be64e. * refine shape op in new_exe * refine shape expected_kernel_type * add SelectedRows check for shape op * refine code
-
由 zyfncg 提交于
* generate static graph code for some ops by yaml * remove deleted files * update cmake * update cmake * udpate cmake
-
- 10 11月, 2022 5 次提交
-
-
由 Sylwester Fraczek 提交于
* migrate prelu * remove cache * review fixes
-
由 YuanRisheng 提交于
* standard api * fix sparse bugs * fix xpu bugs, test=kunlun * remove hard code for custom unittest * open ci, test=kunlun * deal with conflict
-
由 james 提交于
* XPU support eager mode * add unittest for XPU eager mode * minor bugfix * minor bugfix, test=kunlun * correct copyright info * 1. remove unsed vars/funcs 2. ProcessGroupBKCL inherit from ProcessGroupStream * bugfix for fp16 in eager mode multi-card, test=kunlun * rebase & fix a few issues * use new processgroup interface, test=kunlun * fix compile issue, test=kunlun
-
由 zyfncg 提交于
* add ci check for code-gen script * update
-
由 Charles-hit 提交于
-
- 09 11月, 2022 7 次提交
-
-
由 HongyuJia 提交于
-
由 jakpiase 提交于
-
由 Sławomir Siwek 提交于
-
由 Jacek Czaja 提交于
* first commit - more fixes - compilation fix - compilation fix - fix - another fix - yet another fix - Fix - fix to fused ops - compilation fix - compilation fix - another compilation fix - another fix - fix - fix - fix - fix - yet another fix - fix - fix - cosmetic fix :- lint - Revert some changes (to be brought back later) - fix to build - Added prototype of slice - fix compilation fix - compilation fix - fix - fix - Fix - fix fix modified: cmake/flags.cmake * lint * rerun of CI * - Fix * - lint * - lint2
-
由 zyfncg 提交于
* generate static graph code of some op * polish code * fix bug * update default value
-
由 Chen Weihang 提交于
* move fluid op generator into fluid * remove parsed op * resolve sig undef error * append python interp find logic * remove dup code
-
由 LiYuRio 提交于
-
- 08 11月, 2022 1 次提交
-
-
由 Paulina Gacek 提交于
* Split kernel registered, tests for uint/int added * Split quantized * Split output scales calculated only once * NearestInterp test fix reversed * DequantizeOutputs corrected
-