Commits · d1a98f0bb72822f44c88bec8a82e4e7fc031ba9f · BaiXuePrincess / Paddle

16 Mar, 2022 1 commit
- [MLU] support amp O1 of mlu (#40461) · ad81f22c
  Committed by qipengh on Mar 16, 2022
  
  ad81f22c
15 Mar, 2022 1 commit
- Support some ops for full quantization (#40083) · 7ced3017
  Committed by Guanghua Yu on Mar 15, 2022
```
* add some op for full_quantization
```
  7ced3017
11 Mar, 2022 1 commit
- add EMD method of post_quant (#40421) · 82c30f71
  Committed by Guanghua Yu on Mar 11, 2022
  
  82c30f71
04 Mar, 2022 1 commit
- extend test_imperative_qat_user_defined test time (#40114) · 73a4fe6c
  Committed by Jiabin Yang on Mar 04, 2022
  
  73a4fe6c
03 Mar, 2022 2 commits

change_ASP_sharding_option (#40028) · 815f7a67
Committed by Baibaifan on Mar 03, 2022

815f7a67

Committed by Jiabin Yang on Mar 03, 2022

* eager, test=develop

* fix bug, test=develop

* eager, test=develop

* merge legacy to fluid

* eager, test=develop

* eager, test=develop

* Refactor TensorAdd func by template and remove gradient_accumulation in eager

* Remove needless target name

* eager, test=develop

* eager, test=develop

* Use overload instead of template

* Remove legacy code

* Remove legacy code

* selectedrows, test=develop

* Remove DataType test

* eager, test=develop

* eager, test=develop

* support gan, test=develop

* Using Tensor directly instead of using EagerTensor

* support gradient_accumulation

* make test_imperative_lod_tensor_to_selected_rows longer

* make test_imperative_lod_tensor_to_selected_rows longer

* refine code

* ptb, test=develop

* Rename all EagerTensor to Tensor

* Rename some EagerTensor to Tensor

* rename EagerTensor to EagerVariable

* eager, test=develop

* eager, test=develop

* eager, test=develop

* eager, test=develop

* add more test

* eager, test=develop

* Support copiable selected rows and merge develop

* save load, eager, test=develop

* save load, eager, test=develop

* refine, test=develop

* remove useless _set_value method

* refine, test=develop

* refine, test=develop

* revert static_runner, test=develop

* EagerTensor to Tensor, test=develop

* refine, test=develop

* refine, test=develop

* clear grad, test=develop

* merge, develop

* merge, develop

* merge, test=develop

* merge, test=develop

* Support quant and part of slice

* support legacy static save

* extend slim tests time

* remove imperative on inference

* remove imperative on inference

* merge develop

* fix typo

* fix typo

* split slice related code into 2 part for imperative and eager

* split slice from inference

* split slice from inference

* fix test_tensor_register_hook

Co-authored-by: Wang Huan <wanghuan29@baidu.com>
Co-authored-by: Weilong Wu <veyron_wu@163.com>
Co-authored-by: wanghuancoder <wanghuancoder@163.com>

da47544c

01 Mar, 2022 2 commits
- Add mobilenetv3_large performance test for bf16 and int8 (#39738) · eb7c211a
  Committed by joanna.wozna.intel on Mar 01, 2022
```
* Add mobilenetv3_large performance test

* Disable the BF16 test if the device does not support BF16 computations

* Change test timeout
```
  eb7c211a
- remove conv_affine_channel_fuse_pass (#39817) · fc06be9d
  Committed by wenbin on Mar 01, 2022
```
* remove

* pass

* more pass
```
  fc06be9d
19 Feb, 2022 1 commit

Add the DistributedFusedLamb optimizer (#39148) · 5df3cd61

Committed by sneaxiy on Feb 19, 2022

* add DistributedFusedLamb op

* polish code

* fix compile error

* compatible with pten changement

* fix rocm compile error

* improve coverage

* update upstream/develop

* fix cast_with_ptr.h

* add FLAGS_distributed_lamb_divide_nranks_when_allreduce=1

* fix clip before allreduce

* add use_master_param_norm

* code polish

* fix bug

* fix ROCM ci

5df3cd61

14 Feb, 2022 1 commit

[UT] mish op, conv+mish, fc+mish fuse passes (#39340) · 02938b3d

Committed by Sławomir Siwek on Feb 14, 2022

* mish unit tests

* code format

* remove unused imports

* code format

* remove hard-coded shape values

* remove timeouts

* remove timeouts v2

* restore timeouts

02938b3d

09 Feb, 2022 1 commit

[Paddle-Inference] rebuild matmul pass: trt and gpu_cpu (#39369) · db7d129e

Committed by Wangzheee on Feb 09, 2022

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

* rebuild matmul pass: trt and gpu_cpu

db7d129e

07 Feb, 2022 1 commit

Update BF16 amp list (#39304) · 0c43ce22

Committed by arlesniak on Feb 07, 2022

* amp list updated

* tests updated

* gray list updated

* amp list updated

* test updated

0c43ce22

27 Jan, 2022 1 commit

Update passes in quant2_int8_mkldnn_pass (#38912) · 0e235e58

Committed by joanna.wozna.intel on Jan 27, 2022

* Update pass in quant2_int8_mkldnn_pass

* Back to the previous scale_matmul order

* Change place of cpu_quantize_placement_pass

0e235e58

21 Jan, 2022 1 commit
- fix save channel wise quant model (#39054) · ab1abd40
  Committed by ceci3 on Jan 21, 2022
  
  ab1abd40
13 Jan, 2022 1 commit

Added mul BF16/FP32 FWD/BWD oneDNN kernel (#38552) · fc6eed5b

Committed by jakpiase on Jan 13, 2022

* base changes for mul reimplementation

* empty commit

* tmp save

* full implementation of mul bf16/fp32 fwd bwd

* CI fix

* CI rerun

* changed unity build cmake to avoid gpu issues

* removed mul mkldnn from unity build

* added skipping tests if not cpu_bf16

* CI fix

* CI fix

* CI fix

fc6eed5b

12 Jan, 2022 1 commit
- Fix conv act int8 scale (#38331) · 4825addd
  Committed by Sylwester Fraczek on Jan 12, 2022
```
* fix conv act int8 scale

* add unit test for conv+hard_swish
```
  4825addd
06 Jan, 2022 1 commit
- [Paddle-ASP]Asp sharding (#37725) · aec6e8a9
  Committed by minghaoBD on Jan 06, 2022
  
  aec6e8a9
05 Jan, 2022 2 commits
- Make post training quant API support dataloader (#38686) · 0af1a87b
  Committed by Jiaqi Liu on Jan 05, 2022
```
* make post training quant API support dataloader
```
  0af1a87b
- Quantize nearest_interp and nearest_interp_v2 (#38622) · 1456b02d
  Committed by joanna.wozna.intel on Jan 05, 2022
```
* Quantize nearest_interp and nearest_interp_v2

* Check if avx_core supported

* Add depthwise_conv2d to supported quantization list
```
  1456b02d
28 Dec, 2021 1 commit

Fix scatter_op fp16 perf problem. (#38499) · 33ce249f

Committed by Li Min on Dec 28, 2021

* Fix scatter_op fp16 perf problem.

* Add scatter into black list.

* Add scatter into black list for dygraph.

33ce249f

22 Dec, 2021 1 commit
- fix clip extra when QAT export model (#38323) · 142ea171
  Committed by Guanghua Yu on Dec 22, 2021
  
  142ea171
20 Dec, 2021 1 commit

Support FP16 for more ops (#38123) · 1f445bf3

Committed by sneaxiy on Dec 20, 2021

* support FP16 for more ops

* add amp list tests

* refine reduce_mean_grad

* fix OP benchmark ci

* fix fp16 reduce_mean

* update ut, but still have some problems

* remove mean/reduce_mean fp16 kernel

1f445bf3

17 Dec, 2021 1 commit

Refine some AMP operators for BERT (#37923) · d80fe268

Committed by sneaxiy on Dec 17, 2021

* support multi precision update for LAMB

* hide some api

* fix ci uts

* fix lamb output of dygraph

* remove some changes to some PR

* try to fix Py3 CI compile error

* fix test_imperative_optimizer, add lars ut, add layer_norm ut

* fix ut, fix format

* fix ut

* fix windows ci

d80fe268

14 Dec, 2021 3 commits
- add map_matmul and fc_act_fuse passes to quant2_int8_mkldnn_pass (#38023) · 8f800dc0
  Committed by Sylwester Fraczek on Dec 14, 2021
```
* add map_matmul passes to quant2_int8_mkldnn_pass

* fix fc+act fuse (activation scale)

* ci fix, c++17 structured bindings not available

* fix ci static check
```
  8f800dc0
- fix QAT export bug in while OP (#38102) · fff6e77c
  Committed by Guanghua Yu on Dec 14, 2021
  
  fff6e77c
- add reshape+transpose+matmul_v2 only (#37847) · a922168a
  Committed by Sylwester Fraczek on Dec 14, 2021
```
* reshape+transpose+matmul_v2

* in_name->input_name

* fix pr-ci-static-check
```
  a922168a
13 Dec, 2021 1 commit
- fix single card 8 unittests in new executor (#37957) · 9a4eec98
  Committed by xiongkun on Dec 13, 2021
```
* fix single card 8 unittests in new executor

* fix

* fix
```
  9a4eec98
10 Dec, 2021 2 commits
- Support quantization of condition block (#37498) · 89069af5
  Committed by Guanghua Yu on Dec 10, 2021
```
* Support sub graph quant-post
```
  89069af5
- fix fetch op rename_input bug in QAT export model (#38012) · 76c73226
  Committed by Guanghua Yu on Dec 10, 2021
  
  76c73226
07 Dec, 2021 1 commit
- Quantize slice op (#37630) · 2bd0f3c7
  Committed by Zuza on Dec 07, 2021
```
* quantize slice op

* correct test

* fix code formatting
```
  2bd0f3c7
01 Dec, 2021 2 commits

dequantize matmul and matmul_v2 Y weights in quant2_int8 (#37618) · 7094251b

Committed by Sylwester Fraczek on Dec 01, 2021

* dequantize matmul and matmul_v2 Y weights in qat2_int8

* review fix

* split conv and mul tests, add matmul test

* fixup

* fix ci build

* remove unused variables

* formatting fix

* remove extra newline at end of file

7094251b

fix flatten in quant (#37722) · 9f61bc36
Committed by Guanghua Yu on Dec 01, 2021

9f61bc36

30 Nov, 2021 1 commit
- add matmul_v2_transpose_reshape_fuse_pass to quant2_int8_mkldnn_pass.py (#37619) · 82b55961
  Committed by Sylwester Fraczek on Nov 30, 2021
  
  82b55961
26 Nov, 2021 1 commit
- upgrade async distributed training in pscore (#37515) · 74605fc2
  Committed by zhaocaibei123 on Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
  74605fc2
04 Nov, 2021 1 commit
- Fix a bug of quantization (#36982) · cb6c0e21
  Committed by XGZhang on Nov 04, 2021
```
* fix a quantization bug
```
  cb6c0e21
29 Oct, 2021 1 commit
- Move the ASP training API to paddle.static.sparsity. (#36525) · 113816d8
  Committed by Ming-Xu Huang on Oct 29, 2021
  
  113816d8
28 Oct, 2021 1 commit
- support quantization of bert (#36593) · 6390b175
  Committed by XGZhang on Oct 28, 2021
  
  6390b175
27 Oct, 2021 1 commit

Fused transformer encoder layer and fused feedforward layer (#36604) · 9f3613f3

Committed by zhangkaihuo on Oct 27, 2021

This PR adds the layer-level code for fused_transformer, comprising the layer code for FusedFeedForward and for FusedTransformerEncoderLayer.

9f3613f3

20 Oct, 2021 1 commit
- fix pow2 decay (#36559) · 605e7f08
  Committed by Zeng Jinle on Oct 20, 2021
  
  605e7f08
19 Oct, 2021 1 commit

Add pow2_decay_with_linear_warmup op (#36421) · 305b99a0

Committed by Zeng Jinle on Oct 19, 2021

* add pow2_warmup op

* remove contrib __all__

* add AttrT

* rename

* follow comments

* fix duplicate PADDLE_RESTRICT

305b99a0

BaiXuePrincess / Paddle is in sync with the fork source project
