Commits · 12ba05ce0c5c798565e761a938674f55aa856bf3 · 机器未来 / Paddle

13 Apr, 2020 · 2 commits
- J
  
  Add scale-matmul fuse pass (#23734) · 12ba05ce
  Committed by joanna.wozna.intel on Apr 13, 2020
  
  12ba05ce
- C
  API (CompiledProgram) error message enhancement (#23559) · 532079a2
  Committed by Chen Weihang on Apr 13, 2020
```
* api compild program error polish, test=develop

* fix coverage problem, test=develop

* fix details & add unittests, test=develop

* add test for coverage, test=develop
```
  532079a2
11 Apr, 2020 · 2 commits
- C
  Add three passes and api reference of paddle_pass_builder. test=develop (#23741) · 9b06dd86
  Committed by chenhaoze on Apr 11, 2020
```
* Add three passes and api reference of paddle_pass_builder.h
```
  9b06dd86
- J
  Op-requant squash (#23665) · 5ee099ca
  Committed by joanna.wozna.intel on Apr 11, 2020
```
* Op-requant squash

test=develop

* Add matmul to op-requant test

test=develop
```
  5ee099ca
09 Apr, 2020 · 1 commit

Remove: NGraph engine from PDPD repository (#23545) · 3baaee9a

Committed by mozga-intel on Apr 09, 2020

* Remove the NGraph engine from PDPD repository
1. Each operator was removed from the operator's directory
2. Each test was removed from the unittest directory
3. The parallel executor support was removed from the PDPD
4. The CMake file was removed from the PDPD
5. The NG flags were removed from the repository
test=develop

* Remove ngraph from:
1. Cmake file
2. Python file
test=develop

3baaee9a

08 Apr, 2020 · 4 commits
- J
  Add matmul dequant squash (#23505) · 3cb5623d
  Committed by joanna.wozna.intel on Apr 08, 2020
```
test=develop
```
  3cb5623d
- W
  
  Fp16 refine for fusion group (#23472) · c1187cd6
  Committed by wangchaochaohu on Apr 08, 2020
  
  c1187cd6
- J
  Add support for INT8 matmul in C-API quantization (#23463) · ce08fdcf
  Committed by joanna.wozna.intel on Apr 08, 2020
```
* Integrate matmul with cpu_quantize_pass

test=develop

* Add matmul checking scales

test=develop

* Change condition of matmul quantization

test=develop

* Remove redundant var

test=develop
```
  ce08fdcf
- W
  
  fix untime fail for output var stop_gradient=True for fusion group (#23317) · d085f792
  Committed by wangchaochaohu on Apr 08, 2020
  
  d085f792
05 Apr, 2020 · 1 commit
- K
  Fix inplace_abn compile error on Windows (#23464) · d223a249
  Committed by Kaipeng Deng on Apr 05, 2020
```
* fix inplace_abn windows compile error. test=develop
```
  d223a249
03 Apr, 2020 · 2 commits
- W
  
  polish the code of fusion group test=develop (#23370) · 5c607787
  Committed by wangchaochaohu on Apr 03, 2020
  
  5c607787
- Y
  
  Disable test_code_generator and test_post_training_quantization_mobilenetv1 (#23440) · bc2981e9
  Committed by Yiqun Liu on Apr 03, 2020
  
  bc2981e9
02 Apr, 2020 · 2 commits
- J
  
  Add default pass attributes (#23042) · 8c463700
  Committed by joanna.wozna.intel on Apr 02, 2020
  
  8c463700
- K
  Add inplace abn op (#22806) · 21d95be0
  Committed by Kaipeng Deng on Apr 02, 2020
```
* add inplace_abn_op. test=develop
```
  21d95be0
01 Apr, 2020 · 3 commits
- Z
  
  add reader dependency pass, test=develop (#23301) · 3a21980b
  Committed by Zeng Jinle on Apr 01, 2020
  
  3a21980b
- W
  Add support for attr type Op and add fill_constant Op and scale Op (#23163) · d2801060
  Committed by wangchaochaohu on Apr 01, 2020
```
* add attr support for fusion group and add support for fill_constant and scale Op
```
  d2801060
- J
  
  [DNNL] Added MKL-DNN inplace pass for C-API inference (#23315) · 2bb1b0e8
  Committed by Jacek Czaja on Apr 01, 2020
  
  2bb1b0e8
28 Mar, 2020 · 1 commit
- W
  
  add check for scales and a message (#23119) · f836c8aa
  Committed by Wojciech Uss on Mar 28, 2020
  
  f836c8aa
27 Mar, 2020 · 1 commit
- T
  simplify the cmake log of ir/CMakeLists.txt (#23262) · c00d427d
  Committed by Tao Luo on Mar 27, 2020
```
test=develop
```
  c00d427d
25 Mar, 2020 · 1 commit
- Z
  
  fix graph attr copy issues, test=develop (#23191) · bae5930b
  Committed by Zeng Jinle on Mar 24, 2020
  
  bae5930b
20 Mar, 2020 · 3 commits

Reader sequential and inference partial feed (#22699) · acfc9b8a

Committed by Zeng Jinle on Mar 20, 2020

* sequential reader stage 1, test=develop

* fix ut, test=develop

* fix iterable=False reset bug, add some logs and polish code, test=develop

* inference feed partial data, test=develop

* Turn on keep_order=True for test, test=develop

* enhance ut to test more cases, test=develop

* test commit for reverting

* Revert "test commit for reverting", test=develop

This reverts commit 80aef42e.

* add ut of merged and unmerged results, test=develop

* add more uts for coverages and add en doc of api, test=develop

* follow comments, test=develop

* change note style, test=develop

acfc9b8a

update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114) · 95b356a0
Committed by Wilber on Mar 20, 2020
```
update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input
```
95b356a0

Add the detection and code-generation of sqrt and square in fusion_group (#23095) · 3af47711
Committed by Yiqun Liu on Mar 20, 2020

3af47711

19 Mar, 2020 · 1 commit
- S
  
  added mkldnn swish activation (#23041) · abee05a8
  Committed by Sylwester Fraczek on Mar 19, 2020
  
  abee05a8
13 Mar, 2020 · 1 commit
- W
  Add Unittest for backward of fusion group (#22932) · 3757e068
  Committed by wangchaochaohu on Mar 13, 2020
```
* add fusion group test for backward and refine code
```
  3757e068
12 Mar, 2020 · 1 commit
- W
  Cast fusion for fusion group (#22876) · f0d193a2
  Committed by wangchaochaohu on Mar 12, 2020
```
* add support for expression type convert and add cast Op support in fusion group
```
  f0d193a2
11 Mar, 2020 · 2 commits

add skip_layernorm pass. test=develop (#22895) · ff3ddbb5
Committed by Wilber on Mar 11, 2020
```
* add skip_layernorm pass. test=develop
```
ff3ddbb5

[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494) · 8d6dc102

Committed by Zhaolong Xing on Mar 11, 2020

* 1. add embedding eltwise layernorm fuse
2. add embedding eltwise layernorm op
3. refine inplace_add_relu
4. refine fc_eltwise_layernorm
test=develop

* 1. refine fc
test=develop

* fix comments
test=develop

* fix comments

test=develop

8d6dc102

09 Mar, 2020 · 1 commit

Fix fc padding bug during inference fusion (#22860) · 61fef975

Committed by liu zhengxi on Mar 09, 2020

* fix fc padding during fusion, test=develop

* fix optim model inference after SaveOptimModel, test=develop

61fef975

01 Mar, 2020 · 1 commit
- W
  add sum op support for fusion group (#22771) · ca9e77a8
  Committed by wangchaochaohu on Mar 01, 2020
```
* Add the codegen and auto fusion for sum Op  in fusion group
```
  ca9e77a8
28 Feb, 2020 · 1 commit
- T
  
  fix typo word (#22784) · 433cef03
  Committed by tianshuo78520a on Feb 28, 2020
  
  433cef03
24 Feb, 2020 · 1 commit

Add an inference interface to disable FC padding (#22097) · cdf5f6fb

Committed by GaoWei8 on Feb 24, 2020

* Add an interface of disabling FC padding
* fix bert regression
* polish fc padding interface
* recover pass function
* fix argument error
* fix mkldnn error

cdf5f6fb

23 Feb, 2020 · 1 commit
- T
  
  fix typo words (#22653) · d2ba91aa
  Committed by tianshuo78520a on Feb 23, 2020
  
  d2ba91aa
21 Feb, 2020 · 1 commit
- Y
  
  Add the support of fp16 in fusion_group (#22239) · 22bbd547
  Committed by Yiqun Liu on Feb 21, 2020
  
  22bbd547
14 Feb, 2020 · 1 commit

fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551) · 9a8203aa

Committed by Wilber on Feb 14, 2020

When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node must not be deleted; only the non-persistable nodes should be removed.
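
The rule in this commit message (keep a persistable bias that several fc_lstm subgraphs share, remove only non-persistable nodes) can be illustrated with a minimal Python sketch. The `Node` class, the `nodes_to_remove` helper, and all node names are hypothetical stand-ins for illustration; they are not Paddle's actual graph API, and the real pass is implemented in C++.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a graph node (not Paddle's ir::Node)."""
    name: str
    persistable: bool = False


def nodes_to_remove(matched_nodes):
    """Return only the non-persistable nodes of one matched subgraph.

    Persistable nodes (such as a shared bias) are kept, because another
    matched subgraph may still read them.
    """
    return {n for n in matched_nodes if not n.persistable}


# Two matched subgraphs whose fc ops share one persistable bias.
shared_bias = Node("fc_bias", persistable=True)
subgraph_a = [Node("fc_out_a"), Node("lstm_tmp_a"), shared_bias]
subgraph_b = [Node("fc_out_b"), Node("lstm_tmp_b"), shared_bias]

removed = nodes_to_remove(subgraph_a) | nodes_to_remove(subgraph_b)
removed_names = sorted(n.name for n in removed)
print(removed_names)
```

Filtering on the persistable flag, rather than deleting every matched node, means the second subgraph still finds the bias after the first one has been fused.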

9a8203aa

13 Feb, 2020 · 1 commit

[Ernie GPU Optim]: Fuse three fc to multihtead matmul (#22486) · 8acd745c

Committed by Zhaolong Xing on Feb 13, 2020

* 1. optim multihead matmul: fuse three fc to multihtead matmul

test=develop

* fix conflict
test=develop

* fix comments
test=develop

8acd745c

07 Feb, 2020 · 1 commit

Enable the detection of subgraph composed of grad ops (#21223) · dcfb6038

Committed by Yiqun Liu on Feb 07, 2020

* Add the first implememtation of fusion_group op #19621 (#3)

* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc.
test=develop

* Call CUDA driver api to launch the kernel compiled by nvrtc.
test=develop

* Disable for mac and windows.
test=develop

* Refine the codes to support manually specified num_threads and workload_per_thread.
test=develop

* Refine the CUDA kernel to support large dims.
test=develop

* Add DeviceCodePool to manage all device codes.

* Add the first implementation fusion_group op.

* Add unit-test for fusion_group op.

* Add the check of result.

* Add the check of nvrtc in unit-test.
test=develop

* Add comment to explain the inputs, outputs and features of fusion_group op.
test=develop

* Disable fusion_group op for mac and windows.
test=develop

* Make the compiling of device code return status instead of hanging up.
test=develop

* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.

* Unify fusion_group_op's input and output names.
test=develop

* Add the check of CUDA driver library in unittest.
test=develop

* Enable generating code for a given subgraph. #21126 (#4)

* Enable generating code for a given subgraph.

* Support sorting the subgraph.

* Remove the rearange of expressions because we use the sorted subgraph directly.

* Enable generating code for a subgraph which is composed of grad ops.

* Use expression information to check the accuracy in unittest.

* Separate load and store from computation expressions.
test=develop

* Improve the loading statements in generated codes.
test=develop

* Remove unused arguments from formal list.
test=develop

* Enable the detection of subgraph of grad ops.

* Generate code for detected subgraph in fusion_group_pass.

* Add an option in BuildStrategy to enable fusion_group_pass and add unittest.
test=develop

* Fix a bug when checking whether the shape of all inputs are the same.

* Add debug information.

* Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#5)

test=develop

* Call subgraph_detector in fusion_group pass.
test=develop

* Disable fusion_group when WITH_GPU is OFF.
test=develop

* Refine all PADDLE_ENFORCE message.
test=develop

* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op.
test=develop

* Follow review comments.
test=develop

dcfb6038

06 Feb, 2020 · 1 commit
- J
  Add dequant-scale squash (#22409) · 17f2c089
  Committed by joanna.wozna.intel on Feb 06, 2020
```
* Add dequant scale squash

test=develop

* Correct dequant-scale squash test

test=develop
```
  17f2c089
05 Feb, 2020 · 1 commit

add WITH_NCCL option for cmake. (#22384) · 7bc4b095

Committed by Wilber on Feb 05, 2020

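Added a WITH_NCCL option to cmake that explicitly controls whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned OFF when WITH_GPU is OFF.

Added the PADDLE_WITH_NCCL definition.

NCCL compilation can be turned off for single-machine, single-GPU use. Multi-GPU use requires NCCL to stay enabled (the default); with NCCL disabled, only a single GPU can be used.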
Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
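
The option behaviour described in this commit message (WITH_NCCL defaults to ON and is forced OFF whenever WITH_GPU is OFF) can be modelled as a small truth function. This is a Python sketch of the described logic only, not the project's CMake code; the function name `resolve_with_nccl` is an assumption made for illustration.

```python
def resolve_with_nccl(with_gpu: bool, with_nccl: bool = True) -> bool:
    """Model the effective WITH_NCCL value described in the commit message.

    WITH_NCCL defaults to ON (True), but it is switched OFF whenever
    WITH_GPU is OFF, regardless of what the user requested.
    """
    return with_nccl and with_gpu


# Print the effective setting for each combination of the two options.
for gpu in (True, False):
    for nccl in (True, False):
        effective = resolve_with_nccl(gpu, nccl)
        print(f"WITH_GPU={gpu!s:5} WITH_NCCL={nccl!s:5} -> effective WITH_NCCL={effective}")
```

Per the commit message, a build whose effective value is OFF can use only a single GPU.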

7bc4b095

04 Feb, 2020 · 1 commit
- 石
  
  remove anakin from code, test=develop (#22420) · e1b0d7cb
  Committed by 石晓伟 on Feb 04, 2020
  
  e1b0d7cb

机器未来 / Paddle is in sync with its fork source