Commits · ce08fdcf2b55896a3709f7fb496469f4cb20b425 · BaiXuePrincess / Paddle

08 Apr, 2020 2 commits
- J
  Add support for INT8 matmul in C-API quantization (#23463) · ce08fdcf
  Committed by joanna.wozna.intel on Apr 08, 2020
```
* Integrate matmul with cpu_quantize_pass

test=develop

* Add matmul checking scales

test=develop

* Change condition of matmul quantization

test=develop

* Remove redundant var

test=develop
```
  ce08fdcf
- W
  
  fix runtime fail for output var stop_gradient=True for fusion group (#23317) · d085f792
  Committed by wangchaochaohu on Apr 08, 2020
  
  d085f792
05 Apr, 2020 1 commit
- K
  Fix inplace_abn compile error on Windows (#23464) · d223a249
  Committed by Kaipeng Deng on Apr 05, 2020
```
* fix inplace_abn windows compile error. test=develop
```
  d223a249
03 Apr, 2020 2 commits
- W
  
  polish the code of fusion group test=develop (#23370) · 5c607787
  Committed by wangchaochaohu on Apr 03, 2020
  
  5c607787
- Y
  
  Disable test_code_generator and test_post_training_quantization_mobilenetv1 (#23440) · bc2981e9
  Committed by Yiqun Liu on Apr 03, 2020
  
  bc2981e9
02 Apr, 2020 2 commits
- J
  
  Add default pass attributes (#23042) · 8c463700
  Committed by joanna.wozna.intel on Apr 02, 2020
  
  8c463700
- K
  Add inplace abn op (#22806) · 21d95be0
  Committed by Kaipeng Deng on Apr 02, 2020
```
* add inplace_abn_op. test=develop
```
  21d95be0
01 Apr, 2020 3 commits
- Z
  
  add reader dependency pass, test=develop (#23301) · 3a21980b
  Committed by Zeng Jinle on Apr 01, 2020
  
  3a21980b
- W
  Add support for attr type Op and add fill_constant Op and scale Op (#23163) · d2801060
  Committed by wangchaochaohu on Apr 01, 2020
```
* add attr support for fusion group and add support for fill_constant and scale Op
```
  d2801060
- J
  
  [DNNL] Added MKL-DNN inplace pass for C-API inference (#23315) · 2bb1b0e8
  Committed by Jacek Czaja on Apr 01, 2020
  
  2bb1b0e8
28 Mar, 2020 1 commit
- W
  
  add check for scales and a message (#23119) · f836c8aa
  Committed by Wojciech Uss on Mar 28, 2020
  
  f836c8aa
27 Mar, 2020 1 commit
- T
  simplify the cmake log of ir/CMakeLists.txt (#23262) · c00d427d
  Committed by Tao Luo on Mar 27, 2020
```
test=develop
```
  c00d427d
25 Mar, 2020 1 commit
- Z
  
  fix graph attr copy issues, test=develop (#23191) · bae5930b
  Committed by Zeng Jinle on Mar 24, 2020
  
  bae5930b
20 Mar, 2020 3 commits

Reader sequential and inference partial feed (#22699) · acfc9b8a

Committed by Zeng Jinle on Mar 20, 2020

* sequential reader stage 1, test=develop

* fix ut, test=develop

* fix iterable=False reset bug, add some logs and polish code, test=develop

* inference feed partial data, test=develop

* Turn on keep_order=True for test, test=develop

* enhance ut to test more cases, test=develop

* test commit for reverting

* Revert "test commit for reverting", test=develop

This reverts commit 80aef42e.

* add ut of merged and unmerged results, test=develop

* add more uts for coverages and add en doc of api, test=develop

* follow comments, test=develop

* change note style, test=develop

acfc9b8a

W
update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114) · 95b356a0
Committed by Wilber on Mar 20, 2020
```
update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input
```
95b356a0
Y

Add the detection and code-generation of sqrt and square in fusion_group (#23095) · 3af47711
Committed by Yiqun Liu on Mar 20, 2020

3af47711

19 Mar, 2020 1 commit
- S
  
  added mkldnn swish activation (#23041) · abee05a8
  Committed by Sylwester Fraczek on Mar 19, 2020
  
  abee05a8
13 Mar, 2020 1 commit
- W
  Add Unittest for backward of fusion group (#22932) · 3757e068
  Committed by wangchaochaohu on Mar 13, 2020
```
* add fusion group test for backward and refine code
```
  3757e068
12 Mar, 2020 1 commit
- W
  Cast fusion for fusion group (#22876) · f0d193a2
  Committed by wangchaochaohu on Mar 12, 2020
```
* add support for expression type convert and add cast Op support in fusion group
```
  f0d193a2
11 Mar, 2020 2 commits

W
add skip_layernorm pass. test=develop (#22895) · ff3ddbb5
Committed by Wilber on Mar 11, 2020
```
* add skip_layernorm pass. test=develop
```
ff3ddbb5

[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494) · 8d6dc102

Committed by Zhaolong Xing on Mar 11, 2020

* 1. add embedding eltwise layernorm fuse
2. add embedding eltwise layernorm op
3. refine inplace_add_relu
4. refine fc_eltwise_layernorm
test=develop

* 1. refine fc
test=develop

* fix comments
test=develop

* fix comments

test=develop

8d6dc102

09 Mar, 2020 1 commit

Fix fc padding bug during inference fusion (#22860) · 61fef975

Committed by liu zhengxi on Mar 09, 2020

* fix fc padding during fusion, test=develop

* fix optim model inference after SaveOptimModel, test=develop

61fef975

01 Mar, 2020 1 commit
- W
  add sum op support for fusion group (#22771) · ca9e77a8
  Committed by wangchaochaohu on Mar 01, 2020
```
* Add the codegen and auto fusion for sum Op in fusion group
```
  ca9e77a8
28 Feb, 2020 1 commit
- T
  
  fix typo word (#22784) · 433cef03
  Committed by tianshuo78520a on Feb 28, 2020
  
  433cef03
24 Feb, 2020 1 commit

Add an inference interface to disable FC padding (#22097) · cdf5f6fb

Committed by GaoWei8 on Feb 24, 2020

* Add an interface of disabling FC padding
* fix bert regression
* polish fc padding interface
* recover pass function
* fix argument error
* fix mkldnn error

cdf5f6fb

23 Feb, 2020 1 commit
- T
  
  fix typo words (#22653) · d2ba91aa
  Committed by tianshuo78520a on Feb 23, 2020
  
  d2ba91aa
21 Feb, 2020 1 commit
- Y
  
  Add the support of fp16 in fusion_group (#22239) · 22bbd547
  Committed by Yiqun Liu on Feb 21, 2020
  
  22bbd547
14 Feb, 2020 1 commit

fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551) · 9a8203aa

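Committed by Wilber on Feb 14, 2020

When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes should be removed.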

9a8203aa
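
The rule stated in the fc_lstm_fuse fix (9a8203aa) can be illustrated with a minimal Python sketch. The `Node` class and the `nodes_to_remove` helper are hypothetical stand-ins invented for this example; they are not Paddle's IR classes or the actual pass code. The sketch only shows that filtering on a persistable flag keeps a shared bias alive when two matched subgraphs reference it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a graph variable node (not Paddle's ir::Node)."""
    name: str
    persistable: bool


def nodes_to_remove(matched_nodes):
    """Return only the non-persistable nodes of one matched subgraph.

    Persistable nodes (for example a bias shared by several fc ops) are kept,
    because another matched subgraph may still read them.
    """
    return [node for node in matched_nodes if not node.persistable]


# Two matched fc_lstm subgraphs that share one persistable bias (assumed setup).
shared_bias = Node("fc_bias", persistable=True)
subgraph_a = [Node("fc_out_a", persistable=False), shared_bias]
subgraph_b = [Node("fc_out_b", persistable=False), shared_bias]

removed = nodes_to_remove(subgraph_a) + nodes_to_remove(subgraph_b)
removed_names = [node.name for node in removed]
print(removed_names)
```

Only the intermediate outputs are scheduled for removal here; the shared bias survives both fusions, which is the behavior the commit message asks for.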

13 Feb, 2020 1 commit

[Ernie GPU Optim]: Fuse three fc to multihead matmul (#22486) · 8acd745c

Committed by Zhaolong Xing on Feb 13, 2020

* 1. optim multihead matmul: fuse three fc to multihead matmul

test=develop

* fix conflict
test=develop

* fix comments
test=develop

8acd745c

07 Feb, 2020 1 commit

Enable the detection of subgraph composed of grad ops (#21223) · dcfb6038

Committed by Yiqun Liu on Feb 07, 2020

* Add the first implementation of fusion_group op #19621 (#3)

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Enable generating code for a given subgraph. #21126 (#4)

* Enable generating code for a given subgraph.

* Support sorting the subgraph.

* Remove the rearrangement of expressions because we use the sorted subgraph directly.

* Enable generating code for a subgraph which is composed of grad ops.

* Use expression information to check the accuracy in unittest.

* Separate load and store from computation expressions.
test=develop

* Improve the loading statements in generated codes.
test=develop

* Remove unused arguments from formal list.
test=develop

* Enable the detection of subgraph of grad ops.

* Generate code for detected subgraph in fusion_group_pass.

* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop

* Fix a bug when checking whether the shapes of all inputs are the same.

* Add debug information.

* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5)

test=develop

* Call subgraph_detector in fusion_group pass.
test=develop

* Disable fusion_group when WITH_GPU is OFF.
test=develop

* Refine all PADDLE_ENFORCE message.
test=develop

* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop

* Follow review comments.
test=develop

dcfb6038

06 Feb, 2020 1 commit
- J
  Add dequant-scale squash (#22409) · 17f2c089
  Committed by joanna.wozna.intel on Feb 06, 2020
```
* Add dequant scale squash

test=develop

* Correct dequant-scale squash test

test=develop
```
  17f2c089
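
The idea behind a dequant-scale squash (17f2c089) can be sketched numerically. This is a simplified illustration under stated assumptions, not the pass itself: it assumes that dequantize divides its input by a scale, that the following scale op multiplies by a factor with zero bias, and all function names below are hypothetical. Under those assumptions the two ops collapse into one dequantize whose scale is the original scale divided by the multiplier.

```python
import math


def dequantize(q_values, scale):
    """Assumed convention: dequantized value = quantized value / scale."""
    return [q / scale for q in q_values]


def scale_op(values, factor):
    """Assumed convention: multiply by a factor, with zero bias."""
    return [v * factor for v in values]


def squashed_dequant_scale(dequant_scale, factor):
    """Scale of the single dequantize equivalent to dequantize followed by scale.

    (q / dequant_scale) * factor == q / (dequant_scale / factor)
    """
    return dequant_scale / factor


q = [-128, -3, 0, 64, 127]
two_step = scale_op(dequantize(q, 4.0), 0.5)
one_step = dequantize(q, squashed_dequant_scale(4.0, 0.5))

# Both paths should agree up to floating-point rounding.
paths_agree = all(math.isclose(a, b) for a, b in zip(two_step, one_step))
print(paths_agree)
```

Squashing removes one op from the inference graph without changing the numerical result, provided the scale op really has no bias term.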
05 Feb, 2020 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

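Committed by Wilber on Feb 05, 2020

Added a WITH_NCCL cmake option that explicitly specifies whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF if WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

Single-machine, single-GPU setups can disable NCCL compilation; multi-GPU setups need NCCL enabled, which is the default. If NCCL is disabled, only a single GPU can be used.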
Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>

7bc4b095
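
The option rule described in the WITH_NCCL commit (7bc4b095) can be modelled in a few lines of Python. This is only a sketch of the decision logic as the commit message states it; the real logic lives in the project's CMake files, and the function names and the `visible_gpus` parameter below are hypothetical.

```python
def resolve_with_nccl(with_gpu: bool, with_nccl: bool = True) -> bool:
    """Effective WITH_NCCL value: ON by default, forced OFF when WITH_GPU is OFF."""
    return with_gpu and with_nccl


def max_usable_gpus(with_gpu: bool, with_nccl: bool, visible_gpus: int) -> int:
    """Number of GPUs a build could use under the rule in the commit message."""
    if not with_gpu:
        return 0  # CPU-only build
    if not resolve_with_nccl(with_gpu, with_nccl):
        return min(1, visible_gpus)  # without NCCL, only a single GPU
    return visible_gpus


print(resolve_with_nccl(with_gpu=False))            # GPU off disables NCCL
print(max_usable_gpus(True, False, visible_gpus=4)) # NCCL off limits to one GPU
```

The sketch makes the dependency explicit: WITH_GPU gates WITH_NCCL, and WITH_NCCL in turn gates multi-GPU use.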

04 Feb, 2020 1 commit
- 石
  
  remove anakin from code, test=develop (#22420) · e1b0d7cb
  Committed by 石晓伟 on Feb 04, 2020
  
  e1b0d7cb
31 Jan, 2020 1 commit

[DNNL] Fix accuracy in INT8 FC (#22404) · 269db0d1

Committed by Michał Gallus on Jan 31, 2020

* Enable quantize to reorder to nchw as well

* Correct FC MKL-DNN input dim requirements to accept 3D

* Improve DNNL FC format, error and 3D input handling

test=develop

* Improve error checking in FC

test=develop

* Improve PADDLE_ENFORCE messages in fc-related files

* Remove data layout attribute from obligatory pass args

test=develop

* Fix message in fc_mkldnn_pass to be logically correct

test=develop

269db0d1

25 Jan, 2020 1 commit
- J
  
  Restore requantize squash (#22399) · 3099d9d4
  Committed by joanna.wozna.intel on Jan 25, 2020
  
  3099d9d4
17 Jan, 2020 1 commit

Implement a common python unittest to test the ir passes. (#22209) · b7cac50b

Committed by Yiqun Liu on Jan 17, 2020

* Implement a common python unittest to test the ir passes.
test=develop

* Save the results in np.array and support to startup on CPU.
test=develop

* Fix the unittest.
test=develop

* Add check_program to check whether the optimized program is different from the origin one.
test=develop

* Remove the interface all_ops.
test=develop

* Add exception test in pass_test.
test=develop

b7cac50b

16 Jan, 2020 1 commit
- L
  
  change std::cout to log(INFO), vlog (#22316) · 895f8da7
  Committed by lidanqing on Jan 16, 2020
  
  895f8da7
15 Jan, 2020 1 commit
- Z
  
  fix the bug of assert_is_op_output. test=develop (#22262) · e40cfb10
  Committed by Zhen Wang on Jan 15, 2020
  
  e40cfb10
14 Jan, 2020 1 commit
- W
  
  improve placement pass tests code coverage (#22197) · d3a66473
  Committed by Wojciech Uss on Jan 14, 2020
  
  d3a66473
10 Jan, 2020 1 commit

Add bn and relu fuse pass (#22048) · 46189b16

Committed by Zhen Wang on Jan 10, 2020

* add bn and relu fuse pass

* add op attr assert and dtype assert

* fix some inputs&&outputs bugs for the fused op and pattern.

* add the unittest for fuse_bn_act_pass. test=develop

* use normative enforce statements. test=develop

* add the cpu test. test=develop

* add the support of batch_size=1 for the bn with relu op. test=develop

* add the error type for paddle throws. test=develop

* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop

46189b16

BaiXuePrincess / Paddle is in sync with the fork's source project