- 25 Feb, 2021 1 commit
-
-
Committed by tangwei12
* fix entry
* fix distributed lookup table fuse case
* fix entry bug at first time
* move entry from paddle.fluid -> paddle.distributed
* fix ut with paddle.enable_static()
Co-authored-by: malin10 <malin10@baidu.com>
-
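- 20 Jan, 2021 3 commits
-
-
Committed by guofei
When a Conv2D layer in dynamic graph mode is saved as an inference model, the corresponding op may be either conv2d or depthwise_conv2d. The current save_quantized_model interface does not handle the depthwise_conv2d case, which can cause the out_scale value to be saved incorrectly. This PR fixes that issue.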
-
Committed by huangxu96
* add fleet amp.init()
* add unittest for fleet_amp_init
-
Committed by huangxu96
* Implemented AddQuantDequantPass in imperative quantization.
* support 2.0 API such as Pool2D and ReLU
-
- 19 Jan, 2021 2 commits
- 14 Jan, 2021 2 commits
- 13 Jan, 2021 1 commit
-
-
Committed by huangxu96
-
- 12 Jan, 2021 1 commit
-
-
Committed by Chengmo
* Fix server.h include device_context (#30243)
* fix cmake
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
* 【Paddle.Fleet】Support local save sparse param (#30175)
* add save tensor support
Co-authored-by: seiriosPlus <tangwei12@baidu.com>
* add sparse embedding & load vars for 2.0 & gloo bug fix (#30306)
* add sparse embedding & load vars for 2.0
Change-Id: I36b59ed5f015189dc9d9d2e34a9357722d369f1b
* fix hdfs gloo
Change-Id: Ia84d579053720ad804183e54c9a04b4f031c79c6
* fix gloo hdfs
Change-Id: I5ab982fd483cddc10adcdef0b8aa83aca976cb9e
* move loadvar/sparse embedding from incubate to static
Change-Id: I57081d3545ad2efab78c72420d2162c0eacaf3a0
Co-authored-by: tangwei12 <tangwei12@baidu.com>
-
- 11 Jan, 2021 4 commits
-
-
Committed by wangchaochaohu
cherry-pick #28769, add support for place string representation
-
Committed by Zhen Wang
* Support pure fp16 training for AMP API. (#29544)
* add cast ops before and after unsupported fp16 ops.
* Keep partial net in FP32 pattern.
* Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
* Add fp16 support for adam op.
* add multi precision attr for adam.
* Fix the bug of test_multi_precision_fp16_train UT.
* Code format for CI.
* Fix the redefine error about MPTypeTrait on windows.
* fix bugs of the _create_accumulators func in Momentum.
* fix bug when inserting post cast op.
* Add the update_loss_scaling op in allow_set of UnusedVarCheck.
* Update for ci coverage.
* Add some doc for OptimizerWithMixedPrecision.
* Fix the code style.
* Improve the doc of `amp_init`.
* Change for fp16 testing if users have the infer program defined in separate way.
* Remove tensor copy in the update_loss_scaling op. (#29426)
* remove tensor copy in the update_loss_scaling op
* not use thrust.
* fix some cuda memory access error.
-
Committed by guofei
* Quantization supports 2.0 APIs
* Fix the error of save_quantized_model
-
Committed by WangXi
* Optimization grad merge performance (#29784)
* [fleet] combine amp and gradient merge, test=develop (#30086)
* fix assign_op_xpu concat_op_xpu warning (#30120)
Co-authored-by: liuyuhui <liuyuhui@baidu.com>
-
- 08 Jan, 2021 1 commit
-
-
Committed by huangxu96
* Optimizer trans momentum (#29597)
* merge amp related function in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.
* Add unittest for 2.0 Momentum API.
* fix some bugs in weight_decay.
* add alias for fluid.contrib.mixed_precision (#29562)
* add alias for fluid.contrib.mixed_precision
* add static.amp into setup.py.in (#29621)
* add static.amp into setup.py.in
* add unittest for api
* fix a bug in multi_precision_fp16 unittest. (#29756)
-
- 07 Jan, 2021 1 commit
-
-
Committed by furnace
* Layer norm fp16 (#29169)
* add fp16 for layer_norm op
* revert layernorm api
* fix forward
* fix forward
* fix backward for layernorm with fp16
* fix unit test for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
* 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
* fix with_mkldnn compile error for layernorm with fp16
* fix with_mkldnn compile error for layernorm with fp16
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
* fix layer_norm accuracy (#29434)
* Layernorm opt (#29522)
* layernorm fw opt
* layernorm bw opt
* fix typo, test=develop
* remove const dim3 for windows CI compatibility
* merge develop
Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
* Fix compile problem when cuda_arch < 6000 (#29576)
* fix compile problem when cuda_arch < 6000
* refine code
* refine code
Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
-
- 05 Jan, 2021 1 commit
-
-
Committed by cc
* fix infinite scale values (#29386)
* Support dygraph quant model (#29927)
* Avoid the scale to be infinity in quant2_int8_mkldnn_pass, test=develop
* support quantized model for paddle2.0 dygraph, test=develop
Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>
-
- 29 Dec, 2020 1 commit
-
-
Committed by XiaoguangHu
* [cherry-pick] cherry-pick of PR#29928
* delete paddle.metric.chunk_eval and paddle.metric.mean_iou
* delete paddle.nn.clip and paddle.nn.clip_by_norm
* delete paddle.nn.functional.activation.hard_sigmoid and paddle.nn.functional.activation.hard_swish
* [cherry-pick] cherry-pick of PR#29928
* fix extension import error
-
- 09 Dec, 2020 1 commit
-
-
Committed by Aurelius84
-
- 03 Dec, 2020 1 commit
-
-
Committed by Zhen Wang
* Add pure fp16 training with master weights. (#27712)
* add the weight decay func for the momentum op
* Add the multi_precision function in Momentum Optimizer.
* Make sure that the initial values of the master weights are the same as the fp16 weights.
* add static loss scaling.
* add the rescale_grad function in the pure fp16 training.
* use the original momentum updating method.
* Polish some code, such as variable names.
* add docstring for apis.
* update the var creation details of _create_master_weight.
* do not modify code about imperative momentum updating.
* Fix the error of test_dist_sparse_tensor_load_momentum UT.
* add unit test for multi precision fp16 training.
* add more unit tests for CI.
* Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.
-
- 01 Dec, 2020 1 commit
-
-
Committed by Leo Chen
-
- 30 Nov, 2020 2 commits
-
-
Committed by WangXi
-
Committed by Wojciech Uss
-
- 27 Nov, 2020 1 commit
-
-
Committed by guofei
* Optimize the unittest test_imperative_out_scale test=develop
-
- 26 Nov, 2020 1 commit
-
-
Committed by Aurelius84
-
- 25 Nov, 2020 1 commit
-
-
Committed by huangxu96
* Implement 2.0 API version Conv2d and Linear layer quantization in imperative mode.
* use cudnn softmax in static Lenet
* Modified ChannelwiseQAT Unittest for 2.0 API.
* For CI python coverage.
-
- 24 Nov, 2020 1 commit
-
-
Committed by Leo Chen
* upgrade comment string to raw string
* fix string in
* fix string with ' '
* revert update on comments
* upgrade only necessary
* fix sample code checker
* fix comments with '''
-
- 23 Nov, 2020 1 commit
-
-
Committed by furnace
* refactor momentum op to combine weight_decay (scale op and sum op)
-
- 18 Nov, 2020 3 commits
-
-
Committed by Chen Weihang
* add debugging code
* change seed & add debug message
-
Committed by Bai Yifan
* support user-defined quant and preprocess
-
Committed by Leo Chen
* add matmul_v2 to amp list
* support dygraph
-
- 16 Nov, 2020 1 commit
-
-
Committed by cc
-
- 08 Nov, 2020 1 commit
-
-
Committed by YUNSHEN XIE
* disable ut test_parallel_executor_fetch_isolated_var,test=document_fix
* test for limiting ut exec time as 15S
* fix an error caused by cannot find ut
* fix some error
* can not find test_transformer
* fix error caused by ut not run in windows
* fix error caused by Compiler Options
* fix error caused by setting timeout value as 15 in python/paddle/tests/CMakeLists.txt
* setting timeout value to 120s for old ut
* add the timeout value setting
* fix error caused by ut only run in coverage_ci
* add analyzer_transformer_profile_tester
* fix some error
* fix some error
* fix error with inference option
* fix error with inference option setting as ON_INFER
* add some ut to set timeout
* modified some option
* fix error
* fix some timeout error
* fix error
* fix error
* fix timeout for test_analyzer_bfloat16_resnet50
* fix error
* setting timeout property for some ut
* first pr for new ut timeout as 15S
-
- 04 Nov, 2020 1 commit
-
-
Committed by Leo Chen
* skip reader op in mixed_precision decorator
* add ut
-
- 21 Oct, 2020 2 commits
-
-
Committed by Chen Weihang
-
Committed by cnn
* rename manual_seed to seed
* rename xxx1d-->xxx1D, xxx2d-->xxx2D, xxx3d-->xxx3D
* rename manual_seed --> seed
* do not rename .cc, .cu and .h file
* rename manual_seed --> seed
* rename manual_seed --> seed
* rename manual_seed --> seed
* rename manual_seed --> seed
* disable_static on doc example code
* do not change manual_seed on generator
* add enable_static on sample code
* convert python/paddle/fluid/layers/nn.py to bak
* fix typo
* fix code style
* fix seed to manual_seed when call functions of Generator()
* fix bug
-
- 14 Oct, 2020 1 commit
-
-
Committed by guofei
* Implement the function of OutScaleForTraining/OutScaleForInference in dygraph test=develop
-
- 12 Oct, 2020 2 commits
- 11 Oct, 2020 1 commit
-
-
Committed by Chen Weihang
* replace config by kwargs
* change save path from dir to prefix
* fix failed unittests
* revert unittest name change
* polish en docs
* add more tests for coverage
-