Commits · e931cd12384ecc07629a9f3d8babf63003f54d63 · Crayon鑫 / Paddle

31 Aug, 2021 · 2 commits

[cherry-pick][hybrid performance] Grad fuse for gradient merge under pipeline... · e931cd12
Committed by Yuang Liu on Aug 31, 2021
```
[cherry-pick][hybrid performance] Grad fuse for gradient merge under pipeline mode (#35004) (#35299)
```
e931cd12

[cherry-pick][Hybrid Performance] Move the cast op of AMP which cast fp32... · 6fb58aef

Committed by Yuang Liu on Aug 31, 2021

[cherry-pick][Hybrid Performance] Move the cast op of AMP which cast fp32 param to fp16 param to the optimizer (#34965) (#35296)
Co-authored-by: NWangXi <wangxi16@baidu.com>

6fb58aef

17 Aug, 2021 · 2 commits

Copy boost optional to Paddle (#34780) · 9be41447

Committed by chentianyu03 on Aug 17, 2021

* copy boost optional.hpp to paddle

* copy boost optional.hpp to paddle

* move directions

* del fluid/utils

* modify .hpp to .h

* move directions

* modify to paddle::optional

* add modification description

* format code style for the files in paddle/utils

* format code style

9be41447

Add some passes which can be applied to Program (#34730) · 8046e33d

Committed by Zeng Jinle on Aug 17, 2021

* add inplace passes and tests

* update

* fix use_cuda undefined
fix compile error of op compat

* add more ut

* fix CPU CI error

* check adam unique

* fix mac/windows ci, improve coverage

* fix ci error

* follow weihang's comment

* fix BlockDesc::MoveFrom

* follow qiuliang's comment

* update

* follow huihuang's comments

8046e33d

16 Aug, 2021 · 2 commits
- [CPU-PSLIB] Add config for scale_sparse_grad in config_fleet.py,test=develop (#34893) · d028214d
  Committed by Fan Zhang on Aug 16, 2021
  
  d028214d
- Fix elementwise_add quantization (#34820) · ae80df91
  Committed by joanna.wozna.intel on Aug 16, 2021
```
* Remove force_fp32_output from elementwise_add quantization

* Fix cpu_quantize_placement test

* Review related changes
```
  ae80df91
13 Aug, 2021 · 2 commits
- Bug fix : Can't load multiple modules of custom c++ op (#34505) · fc6b4a50
  Committed by zyfncg on Aug 13, 2021
```
* Fix a bug : can't load more than one custom op module

* Fix a bug : can't load more than one custom op module

* add test for load multiple modules of custom c++ op

* add config for Coverage CI
```
  fc6b4a50
- fix generator thread safety bug (#34888) · f421741c
  Committed by Zeng Jinle on Aug 13, 2021
  
  f421741c
11 Aug, 2021 · 4 commits

[Paddle TRT]fix_fc_int8_convert; fix_reshape_convert (#34787) · 3429c04b
Committed by Wangzheee on Aug 11, 2021
```
* fix_fc_reshape_convert

* fix
```
3429c04b

Add ext_tensor.slice() API (#34227) · 3f011d82

Committed by Hao Lin on Aug 11, 2021

* Add ext_tensor.slice() API, test=develop

* Call Tensor::mutable_data first to fix bugs and add test for writing to sliced tensor

* Fix unit test bug

* Fix code format problem, test=develop

* Fix code format problem

* Fix code format problem

* strengthen unit test

* Use CustomTensorUtils::ShareDataFrom to simplify codes

3f011d82

add the basic apis for auto_parallel (#33804) · 3f962e77
Committed by lilong12 on Aug 11, 2021
```
* add auto_parallel apis
```
3f962e77

Add no need output to gc check list (#34754) · 17c1dae9

Committed by hong on Aug 11, 2021

* add not used output var to gc_check_list; test=develop

* add useless output to gc check list; test=develop

17c1dae9

10 Aug, 2021 · 1 commit

copy boost/any.hpp to utils and replace boost::any with self defined any (#34613) · 12892929

Committed by chentianyu03 on Aug 10, 2021

* add any.hpp to utils and replace boost::any with self defined paddle::any

* add copy any.hpp to custom op depends

* modify any.hpp include path

* remove boost from setup.py.in

* add copy any.hpp to custom op depends

* move any.hpp to paddle/utils/ dirs

* move any.h to extension/include direction

* copy utils to right directions

12892929

06 Aug, 2021 · 3 commits
- zero_copy_tensor unittest: support XPU. (#34670) · 52e38a00
  Committed by houj04 on Aug 06, 2021
  
  52e38a00
- support kunlun black list and add kl1 op (#34605) · 21beef91
  Committed by QingshuChen on Aug 06, 2021
```
* support kunlun black list and add kl1 op

* xpu_op_list add device_context dependence
```
  21beef91
- fix npu compile error, test=develop (#34656) · c16421c2
  Committed by Qi Li on Aug 06, 2021
  
  c16421c2
05 Aug, 2021 · 3 commits

New executor dev (#34407) · 012d12b5

Committed by hong on Aug 05, 2021

* first test version

* add test exec;

* add data transfer; test=develop

* add new exec head;

* add memcpy; test=develop

* add python fetch

* add new test

* add graph node; test=develop

* remove useless new executor test; test=develop

* remove gperf dependency; test=develop

* fix compile bugs; test=develop

* remove useless code; test=develop

* remove useless code; test=develop

* add unit test; test=develop

* polish code; test=develop

* polish code; test=develop

* add interpreter cmakefile; test=develop

* remove useless code; test=develop

012d12b5

remove boost::algorithm::ends_with, boost macro and boost::lexical_cast apis (#34310) · bb7b4c0c

Committed by chentianyu03 on Aug 05, 2021

* replace boost::algorithm::ends_with with self define ends_with function

* remove BOOST macro in certain operators

* remove boost::lexical_cast

* add test for string_helper

* add more test case for string_helper

* modify join_string func and test case

* fix build_strategy_test failed bug

* remove string_helper_test from parallel_UT_rule.py

bb7b4c0c

[pass_enhance]fix the mkldnn model performance drop problem. test=develop (#34625) · e47d8a57
Committed by 王明冬 on Aug 05, 2021

e47d8a57

04 Aug, 2021 · 2 commits

Revert pull request 34212 (#34558) · 09892118
Committed by 李季 on Aug 04, 2021
```
* revert commit id 34212
```
09892118

[NPU] Support npu kernel for assign_value op (#34568) · f39c3a5a

Committed by Sing_chan on Aug 04, 2021

* [NPU] Support npu kernel for assign_value op

* move test_assign_value_op_npu.py into unittests/npu folder

* correct copyright year; add TestAssignApi class using NPUplace in test files

f39c3a5a

03 Aug, 2021 · 2 commits
- support Kunlun2 (#34459) · 2d0f3d9b
  Committed by QingshuChen on Aug 03, 2021
```
* support Kunlun2

* support KL2

* support KL2
```
  2d0f3d9b
- polish sccache (#34350) · 61e51c18
  Committed by zhouweiwei2014 on Aug 03, 2021
  
  61e51c18
02 Aug, 2021 · 2 commits

Add basic functions of Program Pass (#34524) · 145cdb5a

Committed by Zeng Jinle on Aug 02, 2021

* add basic APIs

* add attr_types

* follow comments

* change pass attr types

* add set pass attribute codes

* refine PADDLE_THROW

145cdb5a

Fix Inference CE Error by Topo Order (#34521) · 508b40ec

Committed by Huihuang Zheng on Aug 02, 2021

The comment background message is too long, see details at https://github.com/PaddlePaddle/Paddle/pull/34521

508b40ec

30 Jul, 2021 · 3 commits

Revert of PR34452 (#34516) · 72a9c8ff
Committed by Huihuang Zheng on Jul 30, 2021

72a9c8ff

Added reshape, reshape2, squeeze and squeeze2 BF16/FP32 FWD/BWD kernels (#34219) · 22c4c189

Committed by jakpiase on Jul 30, 2021

* test version of matmul_v2

* added matmul_v2 grad kernel

* minor changes

* minor changes

* minor change for CI approval

* CI fix

* CI fix

* added squeeze and squeeze2 kernels

* CI fix

* CI fix

* CI fix

* disabled tests when compiled with cuda

* added setting format_tag by strides

* added sigmoid BF16 FWD/BWD and gelu BF16 BWD

* changes after review

* Revert "added sigmoid BF16 FWD/BWD and gelu BF16 BWD"

This reverts commit 6e3f76720b545abfcff9f6052b46b73a1e745cae.

* Revert "Merge branch 'matmul_v2_grad' into squeeze2_op"

This reverts commit 06fcf67843a4a7884eccdf67a02a03575e1d4cb8, reversing
changes made to 6e3f76720b545abfcff9f6052b46b73a1e745cae.

* minor change

* added reshape1/2 kernels

* moved some functions into private block

* CI fix

* CI fix

* CI fix

22c4c189

add trainer desc config to distributed strategy (#34457) · e6aacd1e
Committed by wangguanqun on Jul 30, 2021
```
* add trainer desc config to distributed strategy

* code style modified
```
e6aacd1e

29 Jul, 2021 · 5 commits
- add fix op run order pass (#34427) · 79e758c6
  Committed by Zeng Jinle on Jul 29, 2021
```
* add fix op run order pass

* add ut for fix_op_run_order

* fix ci error

* improve coverage

* improve coverage again and fix cpu test case

* follow some comments
```
  79e758c6
- Fix allreduce_sum potential bugs on NPU. (#34462) · 02cc3c5e
  Committed by gongweibao on Jul 29, 2021
  
  02cc3c5e
- fix the allreduce fused bug, test=develop (#34446) · b56dbe08
  Committed by Yuang Liu on Jul 29, 2021
  
  b56dbe08
- Enable FLAGS_convert_all_blocks (#34452) · 76f94f88
  Committed by Huihuang Zheng on Jul 29, 2021
```
As the title
```
  76f94f88
- [NPU] Avoid cpu tensor freed before copying to npu completed (#34475) · d71b9ba7
  Committed by Leo Chen on Jul 29, 2021
  
  d71b9ba7
28 Jul, 2021 · 4 commits
- graph_to_program topology sort (#33949) · 167523e7
  Committed by jiangcheng on Jul 28, 2021
```
See https://github.com/PaddlePaddle/Paddle/pull/33949 for details
```
  167523e7
- graph_to_program save parameter and stop_gradient information (#33771) · 8a7dee31
  Committed by jiangcheng on Jul 28, 2021
```
This PR adds optional boolean fields is_parameter and stop_gradient to the VarDesc proto and removes them during save_inference_model
```
  8a7dee31
- add quant_dequant_matmul (#34359) · a59f215d
  Committed by Wangzheee on Jul 28, 2021
  
  a59f215d
- apply pass strategy to sub graph (#34158) · 5e27d16d
  Committed by jiangcheng on Jul 28, 2021
```
When a Graph has sub-graphs, apply the pass to it and to all of its sub-graphs. Also add a unit test script.
```
  5e27d16d
27 Jul, 2021 · 1 commit

Revert "Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support... · 0dd6a44a

Committed by Aurelius84 on Jul 27, 2021

Revert "Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy for pass (#34181)" (#34348)" (#34384)

This reverts commit 577fdde5.

0dd6a44a

26 Jul, 2021 · 1 commit
- 【HETERPS】edit cuda remote_streams (#34276) · 539d7185
  Committed by danleifeng on Jul 26, 2021
```
* psgpu:edit cuda remote_streams; test=develop
```
  539d7185
23 Jul, 2021 · 1 commit

Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy... · 577fdde5

Committed by Aurelius84 on Jul 23, 2021

Revert "[Dy2Stat] Refactor ExecutorCache logic and pre-support BuildStrategy for pass (#34181)" (#34348)

This reverts commit 609f8225.

577fdde5

Crayon鑫 / Paddle is up to date with the fork's source project