- 14 Oct, 2021 1 commit
-
-
Committed by Zhang Zheng
-
- 21 Sep, 2021 1 commit
-
-
Committed by Adam Osewski
* Create stateful OneDNNAXPYHandler object. This makes it possible to call it multiple times without recreating the oneDNN primitives every time.
* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.
* OneDNN SGD kernel.
* Update call to use new OneDNNAXPYHandler object API.
* Set up seed in proper place.
* Enable OneDNN kernel only for single case.
* For dense param and sparse grad.
* Small refactor.
* Enable oneDNN by op attr or by cmd line flag.
* Use int64_t type for number of elements.
* Support dense param and grad from OneDNN kernel.
* Enable SGD OneDNN kernel when using the MP BF16 optimizer.
* Force non-copyable/movable OneDNNAXPYHandler.
* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.
* Fix SFINAE rules.
* Remove recording event inside AXPY.
* Get rid of internal primitive caching.
* Stop using the PP cache mechanism to store mem and primitive obj.
* Handler obj stores and reuses needed desc & prim.
* Do not derive from MKLDNNHandlerT.
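The commit above builds the SGD kernel on an AXPY handler. As a rough illustration of why AXPY (y := a*x + y) is enough for a plain SGD step, here is a minimal pure-Python sketch. It is not the oneDNN kernel; the function names `axpy` and `sgd_step` are hypothetical and chosen only for this example.

```python
from typing import List


def axpy(alpha: float, x: List[float], y: List[float]) -> List[float]:
    """Return alpha * x + y element-wise (the BLAS-style AXPY operation)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def sgd_step(param: List[float], grad: List[float], lr: float) -> List[float]:
    """One SGD update, param - lr * grad, written as AXPY with alpha = -lr."""
    return axpy(-lr, grad, param)


if __name__ == "__main__":
    print(sgd_step([1.0, 2.0], [0.5, -1.0], lr=0.5))
```

Expressing the update this way means a single reusable primitive, parameterized by the scalar, can serve every step, which is presumably the motivation for a stateful handler.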
-
- 10 Sep, 2021 1 commit
-
-
Committed by ShenLiang
-
- 24 Aug, 2021 1 commit
-
-
Committed by Adam Osewski
* Small corrections.
* Fix lr for bf16.
* Revert some changes.
-
- 17 Aug, 2021 1 commit
-
-
Committed by Roc
-
- 05 Aug, 2021 1 commit
-
-
Committed by WangXi
-
- 22 Jul, 2021 2 commits
- 19 Jul, 2021 1 commit
-
-
Committed by Leo Chen
* pass found_inf to adam
* add unittest
* fix bug
* refine unittest
* change unit test's directory
* disable unittest on cpu
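To illustrate what passing a `found_inf` flag to an optimizer typically achieves in mixed-precision training, the sketch below unscales gradients, checks them for non-finite values, and skips the parameter update when any are found. This is an assumption-laden illustration in plain Python, not the actual op code: the helper names are hypothetical, and a plain gradient step stands in for Adam.

```python
import math
from typing import List, Tuple


def unscale_and_check(grads: List[float], loss_scale: float) -> Tuple[List[float], bool]:
    """Divide gradients by the loss scale and report whether any is inf or NaN."""
    unscaled = [g / loss_scale for g in grads]
    found_inf = any(not math.isfinite(g) for g in unscaled)
    return unscaled, found_inf


def guarded_update(params: List[float], grads: List[float],
                   lr: float, found_inf: bool) -> List[float]:
    """Apply a gradient step unless found_inf is set, in which case skip it."""
    if found_inf:
        return list(params)  # leave parameters untouched for this step
    return [p - lr * g for p, g in zip(params, grads)]


if __name__ == "__main__":
    grads, bad = unscale_and_check([float("inf"), 4.0], loss_scale=2.0)
    print(bad, guarded_update([1.0, 1.0], grads, lr=0.5, found_inf=bad))
```

Skipping the step keeps one overflowed batch from corrupting the parameters and optimizer state.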
-
- 16 Jul, 2021 1 commit
-
-
Committed by Leo Chen
* add clear_float_status op
* refine infershape
* fix typo
* refine check_finite_and_scale
* refine code
-
- 05 Jul, 2021 1 commit
-
-
Committed by jiangcheng
* Make the reduce_sum op default to fp32 and add it to the AMP black list
* Defaulting reduce_sum to fp32 avoids returning inf when the sum is larger than 65504
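The value 65504 is the largest finite number representable in IEEE 754 half precision. The sketch below shows the overflow with only the Python standard library, using the `struct` module's half-precision format code `e` to round each partial sum to fp16. The helper names `round_fp16` and `sum_fp16` are hypothetical, and the loop only approximates how an fp16 accumulator behaves; it is not the reduce_sum kernel.

```python
import math
import struct
from typing import Iterable

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def round_fp16(value: float) -> float:
    """Round a Python float to the nearest fp16 value, mapping overflow to inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        # struct refuses to pack finite values that are too large for fp16
        return math.copysign(math.inf, value)


def sum_fp16(values: Iterable[float]) -> float:
    """Accumulate values while rounding every partial sum to fp16."""
    total = 0.0
    for v in values:
        total = round_fp16(total + round_fp16(v))
    return total


if __name__ == "__main__":
    data = [300.0] * 300          # the true sum is 90000, above FP16_MAX
    print(sum(data), sum_fp16(data))
```

Accumulating in a wider type keeps the result finite as long as the true sum fits in that type.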
-
- 01 Jul, 2021 1 commit
-
-
Committed by taixiurong
-
- 29 Jun, 2021 1 commit
-
-
Committed by taixiurong
-
- 21 Jun, 2021 1 commit
-
-
Committed by WangXi
-
- 16 Jun, 2021 1 commit
-
-
Committed by zhiboniu
-
- 10 Jun, 2021 1 commit
-
-
Committed by Baibaifan
-
- 26 May, 2021 1 commit
-
-
Committed by JZ-LIANG
-
- 07 May, 2021 1 commit
-
-
Committed by joanna.wozna.intel
* Add casting initializers for bf16 training
* Changes after review
* Correct test and add comment
-
- 28 Apr, 2021 1 commit
-
-
Committed by arlesniak
-
- 23 Apr, 2021 1 commit
-
-
Committed by Leo Chen
* refactor_check_finite_and_scale_npu_kernel
* fix compile
* add alloc_float_status op
* add alloc_float_status op
* add FloatStatus for check_finite_and_unscale
* refine code
* remove unnecessary logic
* refine for fleet
-
- 22 Apr, 2021 1 commit
-
-
Committed by Yuang Liu
-
- 21 Apr, 2021 2 commits
- 15 Apr, 2021 1 commit
-
-
Committed by fangshuixun007
fix test sync_with_cpp (#32212)
-
- 08 Apr, 2021 1 commit
-
-
Committed by Zhen Wang
* Use the runtime to create the unsupported_fp16_list used in AMP.
* Add more info about supported ops.
* Add some comments for the function of OpSupportedInfos.
* Fix the unit test of test_multi_precision_fp16_train.
-
- 26 Mar, 2021 1 commit
-
-
Committed by lilong12
* update, test=develop
-
- 22 Mar, 2021 1 commit
-
-
Committed by arlesniak
-
- 20 Jan, 2021 1 commit
-
-
Committed by huangxu96
* add fleet amp.init()
* add unittest for fleet_amp_init
-
- 13 Jan, 2021 1 commit
-
-
Committed by huangxu96
-
- 08 Jan, 2021 1 commit
-
-
Committed by Zhen Wang
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change for fp16 testing if users have the infer program defined in a separate way.
-
- 05 Jan, 2021 1 commit
-
-
Committed by WangXi
-
- 15 Dec, 2020 1 commit
-
-
Committed by huangxu96
* add alias for fluid.contrib.mixed_precision
-
- 09 Dec, 2020 1 commit
-
-
Committed by Aurelius84
-
- 02 Dec, 2020 2 commits
-
-
Committed by Zhen Wang
* add the weight decay func for the momentum op
* Add the multi_precision function in Momentum Optimizer.
* Make sure that the initial values of the master weights are the same as the fp16 weights.
* add static loss scaling.
* add the rescale_grad function in the pure fp16 training.
* use the original momentum updating method.
* Polish some code, such as variable names.
* add docstring for apis.
* update the var creation details of _create_master_weight.
* do not modify code about imperative momentum updating.
* Fix the error of test_dist_sparse_tensor_load_momentum UT.
* add unit test for multi precision fp16 training.
* add more unit tests for CI.
* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.
* For CI Coverage Checking.
-
Committed by furnace
* add fp16 for layer_norm op
* revert layernorm api
* fix forward
* fix forward
* fix backward for layernorm with fp16
* fix unit test for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
* 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
* fix with_mkldnn compile error for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
-
- 30 Nov, 2020 1 commit
-
-
Committed by WangXi
-
- 18 Nov, 2020 1 commit
-
-
Committed by Leo Chen
* add matmul_v2 to amp list
* support dygraph
-
- 04 Nov, 2020 1 commit
-
-
Committed by Leo Chen
* skip reader op in mixed_precision decorator
* add ut
-
- 12 Oct, 2020 1 commit
-
-
Committed by WangXi
-
- 23 Sep, 2020 1 commit
-
-
Committed by Zhang Ting
* add fused_bn_add_relu op
-