1. 11 Jan 2021, 3 commits
    • [Cherry-Pick] Support pure fp16 training for AMP API. (#29544) (#30241) · d8dfef54
      Committed by Zhen Wang
      * Support pure fp16 training for AMP API. (#29544)
      
      * add cast ops before and after unsupported fp16 ops.
      
      * Keep partial net in FP32 pattern.
      
      * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
      
      * Add fp16 support for adam op.
      
      * add multi precision attr for adam.
      
      * Fix the bug of test_multi_precision_fp16_train UT.
      
      * Code format for CI.
      
      * Fix the redefine error about MPTypeTrait on windows.
      
      * fix bugs of the _create_accumulators func in Momentum.
      
      * fix bug when inserting post cast op.
      
      * Add the update_loss_scaling op in allow_set of UnusedVarCheck.
      
      * Update for ci coverage.
      
      * Add some doc for OptimizerWithMixedPrecision.
      
      * Fix the code style.
      
      * Improve the doc of `amp_init`.
      
      * Change fp16 testing for users who have the infer program defined in a separate way.
      
      * Remove tensor copy in the update_loss_scaling op. (#29426)
      
      * remove tensor copy in the update_loss_scaling op
      
      * do not use thrust.
      
      * fix some cuda memory access error.
      d8dfef54
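The entry above introduces pure fp16 training for the static-graph AMP API. Below is a minimal sketch of that workflow using the decorate/amp_init calls named in the commit; exact argument names such as use_pure_fp16 and the amp_init signature are assumptions and may differ between releases.

```python
import paddle

paddle.enable_static()

main_prog = paddle.static.Program()
startup_prog = paddle.static.Program()
with paddle.static.program_guard(main_prog, startup_prog):
    x = paddle.static.data(name="x", shape=[None, 784], dtype="float32")
    y = paddle.static.data(name="y", shape=[None, 1], dtype="int64")
    logits = paddle.static.nn.fc(x, size=10)
    loss = paddle.nn.functional.cross_entropy(logits, y)

    optimizer = paddle.optimizer.Adam(learning_rate=1e-3)
    # Wrap the optimizer for pure fp16 training; the use_pure_fp16 argument
    # name is an assumption based on this changelog entry.
    optimizer = paddle.static.amp.decorate(
        optimizer,
        init_loss_scaling=128.0,
        use_dynamic_loss_scaling=True,
        use_pure_fp16=True)
    optimizer.minimize(loss)

place = paddle.CUDAPlace(0)
exe = paddle.static.Executor(place)
exe.run(startup_prog)
# amp_init casts fp32 parameters to fp16 and sets up master weights
# (exact signature assumed).
optimizer.amp_init(place, scope=paddle.static.global_scope())
```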
    • Quantization supports 2.0 APIs (#30036) (#30257) · 393a91f1
      Committed by guofei
      * Quantization supports 2.0 APIs
      
      * Fix the error of save_quantized_model
      393a91f1
    • [cherry-pick 2.0] optimize gradient merge (#30185) · e283dc6f
      Committed by WangXi
      * Optimize grad merge performance (#29784)
      
      * [fleet] combine amp and gradient merge, test=develop (#30086)
      
      * fix assign_op_xpu concat_op_xpu warning (#30120)
      Co-authored-by: liuyuhui <liuyuhui@baidu.com>
      e283dc6f
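The second commit above combines AMP with gradient merge under fleet. A hedged sketch of how the two are enabled together through DistributedStrategy follows; the gradient_merge_configs keys (k_steps, avg) are assumptions about the strategy interface.

```python
import paddle
import paddle.distributed.fleet as fleet

paddle.enable_static()
fleet.init(is_collective=True)

strategy = fleet.DistributedStrategy()
strategy.amp = True              # automatic mixed precision
strategy.gradient_merge = True   # accumulate gradients over several micro-batches
# Config keys below (k_steps, avg) are assumptions about the strategy interface.
strategy.gradient_merge_configs = {"k_steps": 4, "avg": True}

optimizer = paddle.optimizer.Momentum(learning_rate=0.01, momentum=0.9)
optimizer = fleet.distributed_optimizer(optimizer, strategy=strategy)
# optimizer.minimize(loss) would then apply both the AMP and gradient-merge passes.
```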
  2. 08 Jan 2021, 1 commit
    • [Cherry-pick] amp related PR cherry pick into Release/2.0 (#30212) · 9f7c66b4
      Committed by huangxu96
      * Optimizer trans momentum (#29597)
      
      * merge amp-related functions in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.
      
      * Add unittest for the 2.0 Momentum API.
      
      * fix some bugs in weight_decay.
      
      * add alias for fluid.contrib.mixed_precision (#29562)
      
      * add alias for fluid.contrib.mixed_precision
      
      * add static.amp into setup.py.in (#29621)
      
      * add static.amp into setup.py.in
      
      * add unittest for api
      
      * fix a bug in multi_precision_fp16 unittest. (#29756)
      9f7c66b4
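The Momentum changes above merge AMP support into paddle.optimizer.Momentum. A short illustrative construction follows; the multi_precision argument name is taken from related commits in this log and should be treated as an assumption.

```python
import paddle

# Momentum from paddle.optimizer with the AMP functionality merged from
# paddle.fluid.contrib.optimizer; multi_precision keeps an fp32 master copy
# of fp16 parameters (argument name assumed from this changelog).
opt = paddle.optimizer.Momentum(
    learning_rate=0.01,
    momentum=0.9,
    weight_decay=1e-4,
    multi_precision=True)
```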
  3. 07 Jan 2021, 1 commit
    • [Cherry-pick] Layer norm fp16 and Nvidia optimize (#29169 #29434 #29522 #29576) (#30110) · 44b81e63
      Committed by furnace
      * Layer norm fp16 (#29169)
      
      * add fp16 for layer_norm op
      
      * revert layernorm api
      
      * fix forward
      
      * fix forward
      
      * fix backward for layernorm with fp16
      
      * fix unit test for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      
      * fix layer_norm accuracy (#29434)
      
      * Layernorm opt (#29522)
      
      * layernorm fw opt
      
      * layernorm bw opt
      
      * fix typo, test=develop
      
      * remove const dim3 for windows CI compatibility
      
      * merge develop
      Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
      
      * Fix compile problem when cuda_arch < 6000 (#29576)
      
      * fix compile problem when cuda_arch < 6000
      
      * refine code
      
      * refine code
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
      44b81e63
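With fp16 kernels now available for layer_norm (commits above), the op can be routed to fp16 under static-graph AMP. The sketch below assumes the AutoMixedPrecisionLists/custom_white_list interface; both names are assumptions for this exact release.

```python
import paddle

paddle.enable_static()

# Assumed list/class names: with fp16 forward/backward kernels in place,
# layer_norm can be force-listed to run in fp16 under static-graph AMP.
amp_lists = paddle.static.amp.AutoMixedPrecisionLists(
    custom_white_list={"layer_norm"})

optimizer = paddle.optimizer.Momentum(learning_rate=0.01, momentum=0.9)
optimizer = paddle.static.amp.decorate(optimizer, amp_lists=amp_lists)
```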
  4. 05 Jan 2021, 1 commit
  5. 29 Dec 2020, 1 commit
    • [cherry-pick] clean redundant API alias in 2.0 - part 1 #29928 (#29960) · c9c835b5
      Committed by XiaoguangHu
      * [cherry-pick] cherry-pick of PR#29928
      
      * delete paddle.metric.chunk_eval and paddle.metric.mean_iou
      
      * delete paddle.nn.clip and paddle.nn.clip_by_norm
      
      * delete paddle.nn.functional.activation.hard_sigmoid and paddle.nn.functional.activation.hard_swish
      
      * [cherry-pick] cherry-pick of PR#29928
      
      * fix extension import error
      c9c835b5
  6. 09 Dec 2020, 1 commit
  7. 03 Dec 2020, 1 commit
    • [Cherry-pick] Add pure fp16 training with master weights. (#29301) · d8ea8a06
      Committed by Zhen Wang
      * Add pure fp16 training with master weights. (#27712)
      
      * add the weight decay func for the momentum op
      
      * Add the multi_precision function in Momentum Optimizer.
      
      * Make sure that the initial values of the master weights are the same as the fp16 weights.
      
      * add static loss scaling.
      
      * add the rescale_grad function in the pure fp16 training.
      
      * use the original momentum updating method.
      
      * Polish some codes, such as variable names.
      
      * add docstring for apis.
      
      * update the var creation details of _create_master_weight.
      
      * do not modify code about imperative momentum updating.
      
      * Fix the error of test_dist_sparse_tensor_load_momentum UT.
      
      * add unit test for multi precision fp16 training.
      
      * add more unit tests for CI.
      
      * Use lower threshold values for allclose comparison in the test_multi_precision_fp16_train UT.
      d8ea8a06
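The commit above adds pure fp16 training with master weights. A framework-agnostic sketch of the master-weight pattern it implements follows: parameters stay in fp32, fp16 gradients are unscaled and applied to the fp32 copy, which is then cast back to fp16 for the next step. The function name and plain-SGD update are illustrative only.

```python
import numpy as np

def fp16_step_with_master_weights(master_w, grad_fp16, lr=0.01, loss_scale=128.0):
    """One illustrative update step: fp32 master weights, fp16 gradients."""
    # Unscale the gradient in fp32 to recover its true magnitude.
    grad_fp32 = grad_fp16.astype(np.float32) / loss_scale
    # Update the fp32 master copy (plain SGD here for brevity).
    master_w = master_w - lr * grad_fp32
    # Cast back to fp16 for the next forward/backward pass.
    return master_w, master_w.astype(np.float16)

master_w = np.random.randn(4).astype(np.float32)   # master weights stay fp32
grad_fp16 = np.random.randn(4).astype(np.float16)  # gradients computed in fp16
master_w, w_fp16 = fp16_step_with_master_weights(master_w, grad_fp16)
```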
  8. 01 Dec 2020, 1 commit
  9. 30 Nov 2020, 2 commits
  10. 27 Nov 2020, 1 commit
  11. 26 Nov 2020, 1 commit
  12. 25 Nov 2020, 1 commit
    • Quant nn2.0 (#28764) · 40f54537
      Committed by huangxu96
      * Implement 2.0 API version Conv2d and Linear layer quantization in imperative mode.
      
      * use cudnn softmax in static Lenet
      
      * Modified ChannelwiseQAT Unittest for 2.0 API.
      
      * For CI python coverage.
      40f54537
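The entry above implements imperative (dygraph) quantization for 2.0 Conv2D/Linear layers. A hedged sketch of that flow follows; ImperativeQuantAware, its constructor arguments, and save_quantized_model are taken from the contrib slim module and should be treated as assumptions for this exact release.

```python
import paddle
from paddle.fluid.contrib.slim.quantization import ImperativeQuantAware

class LeNetLike(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        self.conv = paddle.nn.Conv2D(1, 6, 3, padding=1)
        self.fc = paddle.nn.Linear(6 * 28 * 28, 10)

    def forward(self, x):
        x = paddle.nn.functional.relu(self.conv(x))
        return self.fc(paddle.flatten(x, 1))

model = LeNetLike()
# Insert fake quant/dequant ops into Conv2D and Linear layers (argument names assumed).
qat = ImperativeQuantAware(
    weight_quantize_type="abs_max",
    activation_quantize_type="moving_average_abs_max")
qat.quantize(model)
# ... train the quantization-aware model, then export it:
qat.save_quantized_model(
    layer=model,
    path="./lenet_quant",
    input_spec=[paddle.static.InputSpec(shape=[None, 1, 28, 28], dtype="float32")])
```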
  13. 24 Nov 2020, 1 commit
    • Upgrade string literals to raw string (#28989) · 3815d7aa
      Committed by Leo Chen
      * upgrade comment string to raw string
      
      * fix string in
      
      * fix string with ' '
      
      * revert update on comments
      
      * upgrade only necessary
      
      * fix sample code checker
      
      * fix comments with '''
      3815d7aa
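The raw-string upgrade above avoids invalid escape sequences in docstrings and comments that contain backslashes. A small before/after illustration:

```python
# Before: backslashes in a normal docstring form escape sequences, so "\d"
# triggers an invalid-escape warning and has to be doubled by hand.
def before(x):
    """Matches \\d+ digits in the input."""
    return x

# After: a raw string keeps backslashes literally, no escaping needed.
def after(x):
    r"""Matches \d+ digits in the input."""
    return x
```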
  14. 23 Nov 2020, 1 commit
  15. 18 Nov 2020, 3 commits
  16. 16 Nov 2020, 1 commit
  17. 08 Nov 2020, 1 commit
    • exec ut no more than 15s 1 (#28439) · ba075632
      Committed by YUNSHEN XIE
      * disable ut test_parallel_executor_fetch_isolated_var,test=document_fix
      
      * test for limiting ut exec time to 15s
      
      * fix an error caused by a ut that cannot be found
      
      * fix some error
      
      * cannot find test_transformer
      
      * fix error caused by ut not running on windows
      
      * fix error caused by Compiler Options
      
      * fix error caused by setting timeout value as 15 in python/paddle/tests/CMakeLists.txt
      
      * setting timeout value to 120s for old ut
      
      * add the timeout value setting
      
      * fix error caused by ut only running in coverage_ci
      
      * add analyzer_transformer_profile_tester
      
      * fix some error
      
      * fix some error
      
      * fix error with inference option
      
      * fix error with inference option setting as ON_INFER
      
      * add some ut to set timeout
      
      * modified some option
      
      * fix error
      
      * fix some timeout error
      
      * fix error
      
      * fix error
      
      * fix timeout for test_analyzer_bfloat16_resnet50
      
      * fix error
      
      * setting timeout property for some ut
      
      * first pr for new ut timeout of 15s
      ba075632
  18. 04 Nov 2020, 1 commit
  19. 21 Oct 2020, 2 commits
    • fix test_weight_decay_extend error (#28178) · 5d73bfdb
      Committed by Chen Weihang
      5d73bfdb
    • 2.0rc api rename (#28088) · 7c1aa0d6
      Committed by cnn
      * rename manual_seed to seed
      
      * rename xxx1d-->xxx1D, xxx2d-->xxx2D, xxx3d-->xxx3D
      
      * rename manual_seed --> seed
      
      * do not rename .cc, .cu and .h file
      
      * rename manual_seed --> seed
      
      * rename manual_seed --> seed
      
      * rename manual_seed --> seed
      
      * rename manual_seed --> seed
      
      * disable_static on doc example code
      
      * do not change manual_seed on generator
      
      * add enable_static on sample code
      
      * convert python/paddle/fluid/layers/nn.py to bak
      
      * fix typo
      
      * fix code style
      
      * fix seed to manual_seed when calling functions of Generator()
      
      * fix bug
      7c1aa0d6
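The rename above in practice: manual_seed becomes paddle.seed, and layer names use an upper-case dimensionality suffix. An illustrative snippet (layer names per the 2.0 API):

```python
import paddle

paddle.seed(2021)                                # previously paddle.manual_seed(2021)
conv = paddle.nn.Conv2D(3, 16, kernel_size=3)    # previously paddle.nn.Conv2d
pool = paddle.nn.MaxPool2D(kernel_size=2)        # previously paddle.nn.MaxPool2d
```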
  20. 14 Oct 2020, 1 commit
  21. 12 Oct 2020, 2 commits
  22. 11 Oct 2020, 1 commit
  23. 09 Oct 2020, 1 commit
  24. 01 Oct 2020, 1 commit
  25. 24 Sep 2020, 1 commit
  26. 23 Sep 2020, 2 commits
  27. 22 Sep 2020, 1 commit
    • Use dygraph mode by default (#27443) · 827ac36f
      Committed by pangyoki
      * enable dygraph mode by default
      
      * fix CI-Mac
      
      * fix other unittest files for Mac-CI
      
      * fix CI-Py3
      
      * fix test_communicator_geo and test_buffer_shared_memory_reuse_pass
      
      * add enable_static to fix CI-Py3
      
      * add enable_static to fix CI-coverage
      
      * delete try except
      827ac36f
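Since the commit above makes dygraph the default, static-graph code must opt in explicitly. A minimal illustration:

```python
import paddle

# Dygraph (imperative) mode is now the default: ops execute eagerly.
x = paddle.to_tensor([1.0, 2.0, 3.0])
print(x.mean())  # runs immediately

# Static-graph code has to switch modes explicitly.
paddle.enable_static()
data = paddle.static.data(name="x", shape=[None, 3], dtype="float32")
# ... build and run the program with paddle.static.Executor ...
paddle.disable_static()  # back to dygraph
```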
  28. 21 Sep 2020, 1 commit
    • Quant op dev (#25932) · 02606d45
      Committed by huangxu96
      * Finished ChannelWiseQuantDequantAbsMaxOp and passed unittests.
      
      * Finished channel-wise quantization strategy in imperative quantization.
      
      * Added CUDA code for ChannelWiseQuantDequantMaxAbsOp
      
      * Add quant_axis for channel_wise quant.
      
      * fixed a bug in unittests which would not trigger the axis = 1 case and could not meet the coverage rate requirement.
      
      * Added some assert information and fixed some coding style mistakes.
      02606d45
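The channel-wise abs-max scheme implemented by the ops above, sketched in NumPy for illustration: each slice along quant_axis gets its own scale from the per-channel absolute maximum. The helper name and defaults are illustrative.

```python
import numpy as np

def channel_wise_quant_dequant_abs_max(w, quant_axis=0, bits=8):
    """Quantize then dequantize w per channel along quant_axis (illustrative)."""
    qmax = 2 ** (bits - 1) - 1          # 127 for 8-bit
    # abs-max over every axis except the quantization axis
    reduce_axes = tuple(i for i in range(w.ndim) if i != quant_axis)
    scale = np.abs(w).max(axis=reduce_axes, keepdims=True)
    scale = np.maximum(scale, 1e-8)     # avoid division by zero
    q = np.round(w / scale * qmax)      # integer codes in [-127, 127]
    return q * scale / qmax             # dequantized fp32 values

w = np.random.randn(16, 3, 3, 3).astype(np.float32)  # e.g. conv weight, OIHW
w_dq = channel_wise_quant_dequant_abs_max(w, quant_axis=0)
```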
  29. 18 Sep 2020, 1 commit
  30. 15 Sep 2020, 1 commit
  31. 14 Sep 2020, 1 commit
    • Update amp_check_finite_and_scale_op and add an updating_loss_scaling op for static graph amp training. (#26240) · d708b210
      Committed by Zhen Wang
      
      * update amp_check_finite_and_scale_op for static_amp.
      
      * use amp_check_finite_and_scale in static graph amp.
      
      * update grads to zero when grads contain infinite values (as for the amp_check_finite_and_scale op).
      
      * add update_loss_scaling op in cpp.
      
      * add update_loss_scaling_op unit test.
      
      * update the doc of the check_finite_and_unscale op
      
      * Update the process of skipping gradient updates when the gradients have infinite values.
      
      * update the way to zero grads.
      
      * update test_update_loss_scaling_op.py
      
      * add log info when finding infinite grads.
      
      * add the unit test for UpdateLossScaling Layer.
      d708b210
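The check_finite_and_unscale / update_loss_scaling pair above implements dynamic loss scaling: shrink the scale after repeated overflows, grow it after a long run of finite gradients. A plain-Python sketch follows; the default increment/decrement constants are assumptions.

```python
def update_loss_scaling(state, found_inf,
                        incr_every_n_steps=1000, decr_every_n_nan_or_inf=2,
                        incr_ratio=2.0, decr_ratio=0.5):
    """One step of dynamic loss scaling (illustrative, default values assumed)."""
    if found_inf:
        # Skip the parameter update; shrink the scale after repeated overflows.
        state["bad_steps"] += 1
        state["good_steps"] = 0
        if state["bad_steps"] >= decr_every_n_nan_or_inf:
            state["loss_scaling"] = max(state["loss_scaling"] * decr_ratio, 1.0)
            state["bad_steps"] = 0
    else:
        # Grow the scale after a long enough run of finite gradients.
        state["good_steps"] += 1
        state["bad_steps"] = 0
        if state["good_steps"] >= incr_every_n_steps:
            state["loss_scaling"] *= incr_ratio
            state["good_steps"] = 0
    return state

state = {"loss_scaling": 32768.0, "good_steps": 0, "bad_steps": 0}
state = update_loss_scaling(state, found_inf=False)
```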
  32. 10 Sep 2020, 1 commit