- 21 Nov, 2022 22 commits
-
-
Committed by Vvsmile
-
Committed by lzy
* use mma for QK dot computing in fused_multi_transformer.
* Update fused_multi_transformer_op.cu.h
-
Committed by wanghuancoder
* refine reduce_all
-
Committed by JYChen
* remove apis in fluid.ops
* fix test_activation_nn_grad
* fix circular import error
* fix ops
* fix cos
* fix divide not inplace
* remove lazy-import part
-
Committed by zyfncg
* Fix wrong eigen header include
* fix compile bug
-
Committed by 傅剑寒
-
Committed by Vvsmile
remove crop which is not used in Paddle 2.0
-
Committed by PuQing
* move threadpool fix cmake
* fix make
-
Committed by 傅剑寒
-
Committed by taixiurong
-
Committed by houj04
-
Committed by 傅剑寒
* remove relu6 test case under fluid
* fix relu6 test case in mkldnn_elt_act_fuse_pass
-
Committed by Vvsmile
replace paddle.fluid.layers.selu with paddle.nn.functional.selu
-
Committed by Vvsmile
* Remove API: gather; replace paddle.fluid.layers.gather with paddle.gather
* modify the call of gather from old style to new style
-
Committed by engineer1109
-
Committed by wenbin
-
Committed by huangjiyi
* move cross_entropy from fluid to phi
* replace mutable_data with Alloc
* use .template
-
Committed by Wen Sun
* refactor: replace Collective & PointToPoint with NCCLEnv
* refactor: rename to RunFnInNCCLEnv
* refactor: pass std::function by value
-
Committed by LiYuRio
-
Committed by LiYuRio
-
Committed by PuQing
-
Committed by sneaxiy
-
- 20 Nov, 2022 1 commit
-
-
Committed by ccrrong
* remove range
-
- 19 Nov, 2022 2 commits
-
-
Committed by Wen Sun
-
Committed by Aganlengzi
* [CustomPlace] fix amp
* [CustomPlace] fix amp
* fix ut because of too long time matmul fp16
-
- 18 Nov, 2022 15 commits
-
-
Committed by wanghuancoder
-
Committed by MarDino
* fused qkvBiasAdd and transpose with split qkv
* fix typo
* fix format
* fix name
* add annotation
* fix comment
-
Committed by yuehuayingxueluo
* clear fluid apis in fleet and passes
* fix model.py
* fix model.py
* fix cpp_pass.py
-
Committed by Sławomir Siwek
* cleanup unused code
* unify is_int8 is_bfloat16
* Simplify matmul_v2 FWD kernel
* remove RunKernel methods
* remove import namespace
* remove headers
* clean fluid/phi cross imports
* remove fluid axpy_handler
* delete fluid methods
* activations
* OneDNNMemDesc
* MKLDNNFormatForSize
* MatchShapeToLayout
* MKLDNNMemoryFormat
* MKLDNNFormat
* ReorderMKLDNNHandler
* to_void_cast
* review suggestions
* interpolate
* remove fluid dependency
* init
* ExecuteMatMulV2
* rm fluid kernel
* matmul_grad
* remove mutable_data
-
Committed by Vvsmile
remove pad_constant_like which is not used in paddle 2.0
-
Committed by Zuza Gawrysiak
* Migrate conv_transpose to phi
* Move handler to kernel
* kernel m
* Fix formatting
* handler
* remove fluid
* revert tcp_store
* tcp_store
* remove unused
* Fix declaration
* add dnn input
* Fix typo
Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
-
Committed by 201716010711
-
Committed by zyfncg
* fix bug of zero_allocator in host
* fix test compile bug
* add unittest
* update test
-
Committed by 傅剑寒
-
Committed by MarDino
* Add quick gelu and fused bias add kernel
* fix annotation
* remove useless code
* add fast gelu option and set it in multi transformer op
* add flag to restrict whether to use the fast gelu approximation
* fix flags conflict
* fix use tanh function instead
* add cudart version limit
* use phi fast tanh func
* fix comment
-
Committed by huangjiyi
* move "paddle/phi/backends/gpu/gpu_device_function.h" to phi
* update copyright years
* rm "fluid/platform/device/gpu/gpu_device_function.h" in phi
* fix rocm-compile bugs
-
Committed by Wen Sun
-
Committed by zhaoyingli
* [AutoParallel] selective recompute
* add cmakelist
-
Committed by james
* correct sync behavior for XPU distributed training
  XPU supports an event mechanism similar to CUDA events, so it is advisable to use an event to sync the compute and comm streams for performance. However, this mechanism has never been fully tested, and inconsistent loss/ending_epochs have been reported. Therefore, this PR replaces event sync with stream waiting as a temporary solution.
* remove compile warning
-
Committed by Dandelight
-