Commits · 76c73226234802e4d8fed62932af0beb7c4c4a5b · Crayon鑫 / Paddle

10 Dec, 2021 1 commit
- fix fetch op rename_input bug in QAT export model (#38012) · 76c73226
  Committed by Guanghua Yu on Dec 10, 2021
07 Dec, 2021 1 commit
- Quantize slice op (#37630) · 2bd0f3c7
  Committed by Zuza on Dec 07, 2021
```
* quantize slice op

* correct test

* fix code formatting
```
01 Dec, 2021 2 commits

dequantize matmul and matmul_v2 Y weights in quant2_int8 (#37618) · 7094251b

Committed by Sylwester Fraczek on Dec 01, 2021

* dequantize matmul and matmul_v2 Y weights in qat2_int8

* review fix

* split conv and mul tests, add matmul test

* fixup

* fix ci build

* remove unused variables

* formatting fix

* remove extra newline at end of file

fix flatten in quant (#37722) · 9f61bc36

Committed by Guanghua Yu on Dec 01, 2021

30 Nov, 2021 1 commit
- add matmul_v2_transpose_reshape_fuse_pass to quant2_int8_mkldnn_pass.py (#37619) · 82b55961
  Committed by Sylwester Fraczek on Nov 30, 2021
26 Nov, 2021 1 commit
- upgrade async distributed training in pscore (#37515) · 74605fc2
  Committed by zhaocaibei123 on Nov 26, 2021
```
* test

* test

* rm test

* update

* update

* update

* add unittest

* update

* update save
```
04 Nov, 2021 1 commit
- Fix a bug of quantization (#36982) · cb6c0e21
  Committed by XGZhang on Nov 04, 2021
```
* fix a quantization bug
```
29 Oct, 2021 1 commit
- Move the ASP training API to paddle.static.sparsity. (#36525) · 113816d8
  Committed by Ming-Xu Huang on Oct 29, 2021
28 Oct, 2021 1 commit
- support quantization of bert (#36593) · 6390b175
  Committed by XGZhang on Oct 28, 2021
27 Oct, 2021 1 commit

Fused transformer encoder layer and fused feedforward layer (#36604) · 9f3613f3

Committed by zhangkaihuo on Oct 27, 2021

This PR adds the layer-level code for fused_transformer, including the layer code for FusedFeedForward and FusedTransformerEncoderLayer.

20 Oct, 2021 1 commit
- fix pow2 decay (#36559) · 605e7f08
  Committed by Zeng Jinle on Oct 20, 2021
19 Oct, 2021 1 commit

Add pow2_decay_with_linear_warmup op (#36421) · 305b99a0

Committed by Zeng Jinle on Oct 19, 2021

* add pow2_warmup op

* remove contrib __all__

* add AttrT

* rename

* follow comments

* fix duplicate PADDLE_RESTRICT

18 Oct, 2021 1 commit
- quant support matmul_v2 (#36469) · 051544b6
  Committed by ceci3 on Oct 18, 2021
```
* quant support matmul_v2

* fix format
```
14 Oct, 2021 2 commits
- add sparse_embedding doc (#36283) · 6ccc2a40
  Committed by Yanxing Shi on Oct 14, 2021
```
* add sparse_embedding doc

* delete wrong space

* fix error for sample code

* fix error for doc compile

* delete __all__

* modify sample code
```
- Add the complete code and related files of resnet_unit_op (#36366) · 12e6dbbc
  Committed by Zhang Zheng on Oct 14, 2021
11 Oct, 2021 1 commit

[Paddle-ASP] Revise 4d tensor sparsity mask pattern for conv2d sparsity (#36054) · 00245cfd

Committed by zlsh80826 on Oct 11, 2021

Sparse tensor cores for convolution require the input channel dimension to be 2:4 structured sparse,
so the mask has to be applied along the input channel dimension to use sparse tensor cores.
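
As a rough illustration of the 2:4 pattern described above, the NumPy sketch below builds a mask that keeps two of every four consecutive input channels. It is not the Paddle-ASP implementation: the function name `mask_2_4_along_input_channels` is hypothetical, the weight layout `[out_channels, in_channels, kh, kw]` is assumed, and keeping the two largest-magnitude values per group is an assumed selection rule that the commit message does not state.

```python
import numpy as np


def mask_2_4_along_input_channels(weight):
    """Return a 0/1 mask with 2:4 sparsity along the input channel axis.

    Hypothetical helper, assuming weight layout
    [out_channels, in_channels, kh, kw]. In every group of 4 consecutive
    input channels, the 2 largest-magnitude values are kept (assumed rule).
    """
    out_c, in_c, kh, kw = weight.shape
    if in_c % 4 != 0:
        raise ValueError("in_channels must be a multiple of 4 for a 2:4 mask")

    # Move input channels to the last axis so each group of 4 is contiguous.
    w = np.transpose(weight, (0, 2, 3, 1))
    groups = np.abs(w).reshape(-1, 4)

    # Column indices of the 2 largest magnitudes in each group of 4.
    keep = np.argsort(groups, axis=1)[:, 2:]

    mask = np.zeros_like(groups)
    np.put_along_axis(mask, keep, 1.0, axis=1)

    # Restore the original [out_channels, in_channels, kh, kw] layout.
    mask = mask.reshape(out_c, kh, kw, in_c)
    return np.transpose(mask, (0, 3, 1, 2))


# Usage: one output channel, four input channels, 1x1 kernel.
tiny = np.array([1.0, -3.0, 2.0, 0.5]).reshape(1, 4, 1, 1)
tiny_mask = mask_2_4_along_input_channels(tiny)  # keeps -3.0 and 2.0
```

Multiplying a weight tensor by this mask element-wise leaves exactly two non-zero candidates in each group of four input channels at every output channel and kernel position.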

22 Sep, 2021 1 commit
- Add quant2 int8 lstm model test (#35887) · be4d0026
  Committed by joanna.wozna.intel on Sep 22, 2021
21 Sep, 2021 1 commit

Reuse OneDNN handler for SGD and SUM for SelectedRows input tensors. (#35510) · 799f3861

Committed by Adam Osewski on Sep 20, 2021

* Create stateful OneDNNAXPYHandler object.

This makes it possible to call it multiple times without recreating the
oneDNN primitives every time.

* Prepare SGDOpKernel to reuse its implementation from OneDNN kernel.

* OneDNN SGD kernel.

* Update call to use new OneDNNAXPYHandler object api.

* Setup seed in proper place.

* Enable OneDNN kernel only for single case.

* For dense param and sparse grad.

* Small refactor.

* Enable oneDNN by op attr or by cmd line flag.

* Use int64_t type for number of elements.

* Support dense param and grad from OneDNN kernel.

* Enable SGD OneDNN kernel when use MP BF16 optimizer.

* Force non-copyable/movable OneDNNAXPYHandler.

* Reuse OneDNNAXPYHandler for sparse tensors in SUM op.

* Fix SFINAE rules.

* Remove recording event inside AXPY.

* Get rid of internal primitive caching.

* Stop using the PP cache mechanism to store mem and primitive obj.
* Handler obj store and reuse needed desc & prim

* Do not derive from MKLDNNHandlerT

17 Sep, 2021 1 commit

[AMP] Support pure fp16 training mode for dygraph (#35521) · adaeee4d

Committed by zhangbo9674 on Sep 17, 2021

* add pure fp16 major function in auto_cast & tracer

* support master weight in dygraph for pure fp16

* check mix dtype of fp16&fp32 for check_finite_and_unscale op

* change pure fp16 function name

* refine some bug in auto_cast

* refine auto_cast interface logic

* add param _casted_by_pure_fp16 for class Layer

* support state_dict hook for save model by user appointed dtype in pure_fp16_decorator

* refine pure_fp16_decorator as decorator

* add unittest

* add comment

* add comment

* support recompute

* add comment for auto_cast and decorator

* support to_static_state_dict for paddle.jit.save

* remove the limit on the number of models and optimizers

* add lookup_table in black_list

* fix momentum and layer state_dict

* fix bug in layer state_dict

* fix bug in layer state_dict_helper

* refine unittest

* refine test_momentun_op

* refine interface and some code

* refine amp_decorator interface

* refine pure fp16 interface

* refine master weight interface

15 Sep, 2021 1 commit

clip op extra information when export model. (#35447) · 4d236354

Committed by 王明冬 on Sep 15, 2021

* clip op extra information when export model,test=ocr

* rename clip_extra parameter to kwargs in save_inference_model, test=ocr

13 Sep, 2021 3 commits
- [RC22] Fix linear with matmul_op replace (#35445) · 53e294ca
  Committed by zhulei on Sep 13, 2021
```
* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace

* [RC22] Fix linear with matmul_op replace
```
- add lstm qat models scales (#35382) · 1ee237c1
  Committed by lidanqing on Sep 13, 2021
- Update scales when var is unsigned (#35599) · b4806644
  Committed by joanna.wozna.intel on Sep 12, 2021
10 Sep, 2021 2 commits
- Set attribute "with_quant_attr" into quantized operators (#35583) · d856f876
  Committed by whs on Sep 10, 2021
- fix bug of recompute in hybridparallel (#35588) · d53e567a
  Committed by ShenLiang on Sep 10, 2021
09 Sep, 2021 1 commit
- quant: fix a export bug (#35410) · 81e702ac
  Committed by XGZhang on Sep 09, 2021
06 Sep, 2021 1 commit

Add fusion_lstm INT8 PTQ (#35334) · 7ef04da6

Committed by joanna.wozna.intel on Sep 06, 2021

* Add fusion_lstm INT8 PTQ

* Correct mkldnn_cache_capacity and enable fc_lstm_fuse_pass only for this test

* Change mkldnn_cache_capacity

03 Sep, 2021 1 commit
- fix a quantization bug (#35407) · 07126112
  Committed by XGZhang on Sep 03, 2021
01 Sep, 2021 1 commit
- add support ops for quantization (#35312) · 5baccfdd
  Committed by cc on Sep 01, 2021
31 Aug, 2021 1 commit
- support fuse layers for ptq (#35015) · ef536250
  Committed by XGZhang on Aug 31, 2021
26 Aug, 2021 1 commit
- fix the bug of channel-wise quantization for ernie (#34948) · c71025eb
  Committed by XGZhang on Aug 26, 2021
24 Aug, 2021 1 commit
- Update LearningRate for test fit a line BF16 (#34653) · 36f7e751
  Committed by Adam Osewski on Aug 24, 2021
```
* Small corrections.

* Fix lr for bf16.

* Revert some changes.
```
18 Aug, 2021 1 commit
- support quantization of conv2d_transpose (#34547) · 8967a66a
  Committed by XGZhang on Aug 18, 2021
17 Aug, 2021 1 commit
- [NPU]Adamw skip update for npu (#34897) · b4474fb4
  Committed by Roc on Aug 17, 2021
16 Aug, 2021 1 commit
- fix iscan bug in test file (#34912) · f6d8ab54
  Committed by zhangchunle on Aug 16, 2021
10 Aug, 2021 1 commit
- fix a quantization bug (#34647) · cfd49acc
  Committed by XGZhang on Aug 10, 2021
05 Aug, 2021 1 commit
- optimize pipeline performance with recompute and amp, test=allcase (#34519) · 911c8593
  Committed by WangXi on Aug 05, 2021
30 Jul, 2021 1 commit
- fix function-redefined 1 (#34507) · 06b55eaa
  Committed by zhangchunle on Jul 30, 2021
28 Jul, 2021 1 commit
- quantize_transpiler_v2 supports quantize fp16 tensor (#34398) · 9f604928
  Committed by cc on Jul 28, 2021
22 Jul, 2021 1 commit

copy found_inf to cpu in advance to improve performance (#34274) · 781f4028

Committed by Leo Chen on Jul 22, 2021

* copy found_inf to cpu in advance to improve performance

* add npu test

* add npu test

* refine code

* refine memcpy op

* fix adam


Crayon鑫 / Paddle is in sync with its fork source project