- 17 Nov, 2022 3 commits
-
-
Committed by Yuang Liu
Support bfloat16 for adamw and adam optimizer. Fit the lr for pure bf16 training with tensor fusion. (#48041) * add bfloat16 for adamw * set lr not to bfloat16 for pure bf16 training * update the logic * update the adamw optimizer * support bfloat for adam
-
Committed by sneaxiy
* add vectorized bfloat16 atomicAdd * fix compile error * fix compile error again * fix V100 compile error * fix V100 compile again
-
Committed by zyfncg
-
- 16 Nov, 2022 9 commits
-
-
Committed by xiaoxiaohehe001
* add_fill_any_like * add_fill_any_like
-
Committed by wenbin
* elementwise_op * add teller * modify ut * comments * modify ut * return * modify
-
Committed by Zhang Jun
-
Committed by Zhang Jun
-
Committed by Piotr Paturej
* Enable bf16 in oneDNN bilinear_interp kernel * Fix bilinear_interp_v2 not enabled in models * Remove unnecessary checks
-
Committed by hong
* remove avx check * fix bug;
-
Committed by Leo Chen
-
Committed by Wen Sun
* refactor: update pg custom * fix: use new api in ut * fix: typo * revert: recover legacy apis * fix: add GetDeviceContext
-
Committed by czr-gc
-
- 15 Nov, 2022 5 commits
-
-
Committed by YuanRisheng
-
Committed by jakpiase
* optimization for ln * fix * added output to gpd * added formatting * fix
-
Committed by zhouweiwei2014
-
Committed by Wilber
-
Committed by Sławomir Siwek
* cleanup unused code * unify is_int8 is_bfloat16 * Simplify matmul_v2 FWD kernel * remove RunKernel methods * remove import namespace * remove headers * clean fluid/phi cross imports * remove fluid axpy_handler * delete fluid methods * activations * OneDNNMemDesc * MKLDNNFormatForSize * MatchShapeToLayout * MKLDNNMemoryFormat * MKLDNNFormat * ReorderMKLDNNHandler * to_void_cast * review suggestions * interpolate * remove fluid dependency
-
- 14 Nov, 2022 9 commits
-
-
Committed by Wen Sun
* refactor: simplify send, recv interfaces * refactor: rm send_partial, recv_partial, all_gather_partial
-
Committed by xiaoxiaohehe001
-
Committed by LiYuRio
-
Committed by cyber-pioneer
-
Committed by LiYuRio
-
Committed by Ruibiao Chen
-
Committed by engineer1109
-
Committed by yeliang2258
-
Committed by Ruibiao Chen
-
- 11 Nov, 2022 6 commits
-
-
Committed by zhouweiwei2014
-
Committed by czr-gc
* feat(ipu): add model_runtime backend support in IPU. * fix(ipu_executor): fix error message format. * fix(ipu_executor): fix format. * fix(ipu_executor): fix format again. * fix(ipu_executor): fix format again. * fix(ipu_executor): fix format again.
-
Committed by zhangbo9674
* refine shape op in new_exe * Revert "refine shape op in new_exe" This reverts commit 0e0336ddc5eede3da019b348a0bcc0ef0f3be64e. * refine shape op in new_exe * refine shape expected_kernel_type * add SelectedRows check for shape op * refine code
-
Committed by james
phi::Alloc() complains about missing device_allocator_
-
Committed by zyfncg
* generate static graph code for some ops by yaml * remove deleted files * update cmake * update cmake * update cmake
-
Committed by Yuanle Liu
-
- 10 Nov, 2022 8 commits
-
-
Committed by Sylwester Fraczek
* migrate prelu * remove cache * review fixes
-
Committed by WangZhen
Get grads types from cpp for adam to speed up
-
Committed by LiYuRio
-
Committed by YuanRisheng
* standard api * fix sparse bugs * fix xpu bugs, test=kunlun * remove hard code for custom unittest * open ci, test=kunlun * deal with conflict
-
Committed by zhangxin81
* add roformer pass&&plugin(novarlen)
-
Committed by james
* XPU support eager mode * add unittest for XPU eager mode * minor bugfix * minor bugfix, test=kunlun * correct copyright info * 1. remove unused vars/funcs 2. ProcessGroupBKCL inherit from ProcessGroupStream * bugfix for fp16 in eager mode multi-card, test=kunlun * rebase & fix a few issues * use new processgroup interface, test=kunlun * fix compile issue, test=kunlun
-
Committed by wenbin
* skip_merge_layernorm * add UT * modify comments
-
Committed by zyfncg
* add ci check for code-gen script * update
-