Commits · cbab018413688067025f820727e5a87bfacd9750 · PaddlePaddle / Paddle

10 Aug, 2022 · 1 commit

[Cherry pick] fix quant scale name (#44903) · cbab0184

Committed by ceci3 on Aug 10, 2022

* fix quant scale name (#44116)

* fix acc diff problem caused by pr #44116 (#44311)
Co-authored-by: handiz <35895648+ZhangHandi@users.noreply.github.com>

cbab0184

04 Aug, 2022 · 1 commit
- [cherry-pick] fix QuantizeLinear pass and support reduce_max in quantization (#44872) · 24b3bbde
  Committed by Guanghua Yu on Aug 04, 2022
```
* fix QuantizeLinear kernel and pass in QAT (#44784)

* Add Reduce Max in Quant (#44825)
Co-authored-by: Chang Xu <molixu7@gmail.com>
```
  24b3bbde
27 Jun, 2022 · 1 commit
- [cherry-pick]Update quantization round and clip calculation methods (#43829) · ff70a269
  Committed by Guanghua Yu on Jun 27, 2022
```
* update quantization clip and round

* fix quantization clip and round Attribute

* fix typo
```
  ff70a269
23 Jun, 2022 · 1 commit
- remove slowing down pass (#43750) · 096eb801
  Committed by lidanqing on Jun 23, 2022
  096eb801
22 Jun, 2022 · 1 commit

Cherry-pick PR#43237 from deveop (#43685) · e90dfaf7

Committed by shiyutang on Jun 22, 2022

* merge_release_and_dev

* merge_release_dev

* update

* Use tempfile to place the temporary files (#43237)

* tempfile_fix

* update

* fix_CI

* update_word2vec.inference.model

* remove_change_in_word2vec_book

* fix_word2vec_book

* rm_affine

* update

e90dfaf7

16 Jun, 2022 · 1 commit
- [cherry-pick]Add progress bar and speed up Quantization Pass (#43454) · abb0b2d6
  Committed by Guanghua Yu on Jun 16, 2022
```
* Add progress bar and speed up Quantization Pass

* fix typo
```
  abb0b2d6
09 Jun, 2022 · 2 commits
- cherry pick #42255 (fuse conv + bn in QAT) and #42378 (support skip_op_list in PTQ) (#43301) · 0a00fc4e
  Committed by Guanghua Yu on Jun 09, 2022
```
* support fuse conv and bn in QAT (#42255)

* support skip_op_list in PostTrainingQuantization (#42378)

* fix unittest
```
  0a00fc4e
- Modify quantization use tempfile to place the temporary files (#43281) · f4e09397
  Committed by Guanghua Yu on Jun 09, 2022
  f4e09397
04 May, 2022 · 2 commits
- [cherry-pick] fix PTQ unittest timeout (#42452) · 25318f6f
  Committed by Guanghua Yu on May 04, 2022
```
* fix PTQ unittest timeout

* fix ut
```
  25318f6f
- Fix problem with py3.6 and test for quant2_int8_lstm (#41420) (#42447) · 706b7b7f
  Committed by cc on May 04, 2022
```
Co-authored-by: joanna.wozna.intel <joanna.wozna@intel.com>
```
  706b7b7f
29 Apr, 2022 · 1 commit

[cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer... · 50bfe420

Committed by WangXi on Apr 29, 2022

[cherry-pick 2.3] Add fused_multi_transformer op to optimize transformer generation performance (#42311)

* Add fused_multi_transformer op to optimize transformer generation performance (#41814)

* fix fused_multi_transformer compile failed in cuda arch < sm53 (#42315)

* fix ci timeout

50bfe420

22 Apr, 2022 · 1 commit
- [IPU] add mixed-precission support for ipu (#41733) (#41906) · c09b1d68
  Committed by Allen Guo on Apr 22, 2022
```
add mixed-precission support for ipu

cherry-pick from #41733
```
  c09b1d68
05 Apr, 2022 · 1 commit
- add new format of quantization (#41041) · b72a7ebb
  Committed by Guanghua Yu on Apr 05, 2022
  b72a7ebb
01 Apr, 2022 · 1 commit
- edit fused_seqpool_cvm doc; test=develop (#41192) · 3b7b8528
  Committed by danleifeng on Apr 01, 2022
  3b7b8528
28 Mar, 2022 · 3 commits
- add fused_seqpool_cvm op (#37928) · ea5b2f26
  Committed by danleifeng on Mar 28, 2022
```
* add fused_seqpool_cvm op;test=develop
```
  ea5b2f26
- update docs dtype(core.VarDesc.VarType)test=document_fix (#40947) · 34f07045
  Committed by Ligoml on Mar 28, 2022
```
* update docs dtype(core.VarDesc.VarType)

* fix code style, test=document_fix

fix code style, test=document_fix
Co-authored-by: Chen Long <1300851984@qq.com>
```
  34f07045
- add adaround post-quant method (#38460) · 3d5a27f0
  Committed by Guanghua Yu on Mar 28, 2022
```
* add adaround post-quant method
```
  3d5a27f0
25 Mar, 2022 · 1 commit

Refactor Dygraph Flags (#40786) · 3085d5e4

Committed by Jiabin Yang on Mar 25, 2022

* refactor eager flags

* fix flags error when we switch from eager to dygraph

* fix ci problem

* fix ci

* fix ci

* merge develop and fix code style

* merge develop and fix code style

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* fix op test error

* merge develop

3085d5e4

24 Mar, 2022 · 1 commit

[AMP] Support amp for Intermediate_dygraph (#40623) · c12f7d48

Committed by zhangbo9674 on Mar 24, 2022

* approve amp for intermediate_dygraph

* add amp_utils for intermediate_dygraph

* add amp needcast check for mlu & npu

* test unittest

* add SetGradNode for set_stop_gradient && add checktensor for GradientHooks

* refine code

* refien unittest of imperative_amp for new dygraph

* inplace api skip amp

* add test_imperative_qat_amp for intermediate amp

* refine code

* refine test_amp ci strategy

* refine unittest code

* refine amp_utils code

* refine amp getpromotetype for some special op

* refine unittest code

c12f7d48

16 Mar, 2022 · 3 commits
- Modify save_quant_model to support different input and output filenames (#40542) · dec2b1ca
  Committed by joanna.wozna.intel on Mar 16, 2022
```
* Modify save_quant_model.py to support differnet input and output filenames

* Correct wrong order of arguments
```
  dec2b1ca
- Add Support Layer List to ASP (#40253) · c040bbd7
  Committed by Ming-Xu Huang on Mar 16, 2022
  c040bbd7
- [MLU] support amp O1 of mlu (#40461) · ad81f22c
  Committed by qipengh on Mar 16, 2022
  ad81f22c
15 Mar, 2022 · 1 commit
- Support some ops for full quantization (#40083) · 7ced3017
  Committed by Guanghua Yu on Mar 15, 2022
```
* add some op for full_quantization
```
  7ced3017
11 Mar, 2022 · 1 commit
- add EMD method of post_quant (#40421) · 82c30f71
  Committed by Guanghua Yu on Mar 11, 2022
  82c30f71
04 Mar, 2022 · 1 commit
- extend test_imperative_qat_user_defined test time (#40114) · 73a4fe6c
  Committed by Jiabin Yang on Mar 04, 2022
  73a4fe6c
03 Mar, 2022 · 2 commits

change_ASP_sharding_option (#40028) · 815f7a67

Committed by Baibaifan on Mar 03, 2022

815f7a67

Support slim eager (#39874) · da47544c

Committed by Jiabin Yang on Mar 03, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook
Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>

da47544c

01 Mar, 2022 · 2 commits
- Add mobilenetv3_large performance test for bf16 and int8 (#39738) · eb7c211a
  Committed by joanna.wozna.intel on Mar 01, 2022
```
* Add mobilenetv3_large performance test

* Disable the BF16 test if the device does not support BF16 computations

* Change test timeout
```
  eb7c211a
- remove conv_affine_channel_fuse_pass (#39817) · fc06be9d
  Committed by wenbin on Mar 01, 2022
```
* remove

* pass

* more pass
```
  fc06be9d
19 Feb, 2022 · 1 commit

Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61

Committed by sneaxiy on Feb 19, 2022

* add DistributedFusedLamb op

* polish code

* fix compile error

* compatible with pten changement

* fix rocm compile error

* improve converage

* update upstream/develop

* fix cast_with_ptr.h

* add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1

* fix clip before allreduce

* add use_master_param_norm

* code polish

* fix bug

* fix ROCM ci

5df3cd61

14 Feb, 2022 · 1 commit

[UT] mish op, conv+mish, fc+mish fuse passes (#39340) · 02938b3d

Committed by Sławomir Siwek on Feb 14, 2022

* mish unit tests

* code format

* remove unused imports

* code format

* remove hard-coded shape values

* remove timeouts

* remove timeouts v2

* restore timeouts

02938b3d

09 Feb, 2022 · 1 commit

[Paddle-Inference] rebuild matmul pass: trt and gpu_cpu (#39369) · db7d129e

Committed by Wangzheee on Feb 09, 2022

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

db7d129e

07 Feb, 2022 · 1 commit

Update BF16 amp list (#39304) · 0c43ce22

Committed by arlesniak on Feb 07, 2022

* amp list updated

* tests updated

* gray list updated

* amp list updated

* test updated

0c43ce22

27 Jan, 2022 · 1 commit

Update passes in quant2_int8_mkldnn_pass (#38912) · 0e235e58

Committed by joanna.wozna.intel on Jan 27, 2022

* Upadate pass in quant2_int8_mkldnn_pass

* Back to the previous scale_matmul order

* Change place of cpu_quantize_placement_pass

0e235e58

21 Jan, 2022 · 1 commit
- fix save channel wise quant model (#39054) · ab1abd40
  Committed by ceci3 on Jan 21, 2022
  ab1abd40
13 Jan, 2022 · 1 commit

Added mul BF16/FP32 FWD/BWD oneDNN kernel (#38552) · fc6eed5b

Committed by jakpiase on Jan 13, 2022

* base changes for mul reimplementation

* empty commit

* tmp save

* full implementation of mul bf16/fp32 fwd bwd

* CI fix

* CI rerun

* changed unity build cmake to avoid gpu issues

* removed mul mkldnn from unity build

* added skipping tests if not cpu_bf16

* CI fix

* CI fix

* CI fix

fc6eed5b

12 Jan, 2022 · 1 commit
- Fix conv act int8 scale (#38331) · 4825addd
  Committed by Sylwester Fraczek on Jan 12, 2022
```
* fix conv act int8 scale

* add unit test for conv+hard_swish
```
  4825addd
06 Jan, 2022 · 1 commit
- [Paddle-ASP]Asp sharding (#37725) · aec6e8a9
  Committed by minghaoBD on Jan 06, 2022
  aec6e8a9
05 Jan, 2022 · 2 commits
- Make post training quant API support dataloader (#38686) · 0af1a87b
  Committed by Jiaqi Liu on Jan 05, 2022
```
* make post training quant API support dataloader
```
  0af1a87b
- Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d
  Committed by joanna.wozna.intel on Jan 05, 2022
```
* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list
```
  1456b02d

PaddlePaddle / Paddle · synced successfully more than 1 year ago