Commits · 096eb8016f98e0871bf814332e9f2f68b18be1ac · Crayon鑫 / Paddle

Jun 23, 2022 · 6 commits
- remove slowing down pass (#43750) · 096eb801
  Committed by lidanqing on Jun 23, 2022
  
  096eb801
- fix set_value (#43694) (#43783) · 9d12e70c
  Committed by zyfncg on Jun 23, 2022
  
  9d12e70c
- Upgrade paddle2onnx to 0.9.9 (#43774) (#43775) · 4aa0515d
  Committed by heliqi on Jun 23, 2022
  
  4aa0515d
- [cherry pick][Inference]Enhance gpu multihead matmul v3 fuse pass (#43765) · 94bacb47
  Committed by WJJ1995 on Jun 23, 2022
  
  94bacb47
- [cherry pick 2.3][Inference]Fix the ort Backend multiple input bug(#43621 #43742) (#43739) · babba557
  Committed by heliqi on Jun 22, 2022
```
* cherry pick from develop 43621

* code format

* paddle2onnx update to 0.9.8
```
  babba557
- [cherry-pick] release/2.3 elementwise_mul and matmul mkldnn fix (#43725) · a7e0cdea
  Committed by lidanqing on Jun 23, 2022
```
* Correct elementwise quantization (#43693)

* [Bug fix] Do not quantize weights Y when matmul X and Y both other ops outputs (#43297)

* fix some matmul that X and Y both other ops outputs, do not dequantize the Y.

* fix CI format

* fix according to review
Co-authored-by: joanna.wozna.intel <joanna.wozna@intel.com>
```
  a7e0cdea
Jun 22, 2022 · 12 commits

Committed by ccrrong on Jun 22, 2022

* add bilinear_interp_v2 converter

* update op_teller.cc

* add unittest for bilinear_interp_v2 converter

* code format

* bug fix

* code format and add unittest

* remove merged modify in op_teller.cc

* code format

* code format

* fix scale init error

d0bbf46c

gpu_context (#43661) · 90ae3533
Committed by xiaoxiaohehe001 on Jun 22, 2022

90ae3533
[Cherry-pick]to Release/2.3, Improve MSRAInitializer (#43721) · 1aafc31b
Committed by Jackwaterveg on Jun 22, 2022
```
* fix conflict

* improve the doc
```
1aafc31b

Optimize linspace to avoid GPU -> CPU copy. (#42750) (#43746) · 4dcfc6df

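Committed by Yiqun Liu on Jun 22, 2022

Cherry-pick of #42750.

QA reports that the optimization in #42750 can improve solov2 model performance by 6%, so it is cherry-picked to 2.3. #41096 moved the Python implementation of linspace from fluid.layers.tensor to paddle.tensor.creation, but that PR is not in the release/2.3 branch, so the Python changes from #42750 are applied to fluid.layers.tensor.linspace instead.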

4dcfc6df

Cherry-pick PR#43237 from deveop (#43685) · e90dfaf7

Committed by shiyutang on Jun 22, 2022

* merge_release_and_dev

* merge_release_dev

* update

* Use tempfile to place the temporary files (#43237)

* tempfile_fix

* update

* fix_CI

* update_word2vec.inference.model

* remove_change_in_word2vec_book

* fix_word2vec_book

* rm_affine

* update

e90dfaf7

fix the bug that _DataLoaderIterMultiProcess use time to generate the seed (#43318) (#43702) · f4c42389
Committed by Zhang Ting on Jun 22, 2022
```
 fix the bug that _DataLoaderIterMultiProcess use time to generate the seed

cherry-pick #43318
```
f4c42389
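
The commit message above does not say what replaced the time-based seed. As a rough illustration only, and not Paddle's actual implementation, the sketch below shows one common alternative: deriving every worker's seed from a single base seed, so that two runs with the same base seed get the same worker seeds. `derive_worker_seeds` and its parameters are hypothetical names.

```python
import random


def derive_worker_seeds(base_seed, num_workers):
    """Derive one 32-bit seed per worker from a single base seed.

    Hypothetical helper for illustration; it is not part of Paddle.
    """
    rng = random.Random(base_seed)  # private generator, global state untouched
    return [rng.getrandbits(32) for _ in range(num_workers)]


# The same base seed always yields the same per-worker seeds,
# which a wall-clock-based seed cannot guarantee.
seeds_first_run = derive_worker_seeds(base_seed=1234, num_workers=4)
seeds_second_run = derive_worker_seeds(base_seed=1234, num_workers=4)
print(seeds_first_run == seeds_second_run)
```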

[cherry pick] Support optional residual add in fused ops and slice large... · 0660d5f2

Committed by Zhang Ting on Jun 22, 2022

[cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax (#43719)

 [cherry pick] Support optional residual add in fused ops and slice large tensor for cudnn_softmax

cherry-pick #43635 #43681 #43474

0660d5f2

test=document_fix;cherry pick code format check upgrade to release/2.3 (#43732) · 8e6a1945

Committed by Sing_chan on Jun 22, 2022

Only cherry-pick the format tool (clang-format, yapf, cmake-format) upgrades to release/2.3. Lint tools such as cpplint are not moved, because cpplint errors will not be fixed in release/2.3.
pre_commit.sh is also moved to release/2.3 so that both PR-CI-pre-commit and PR-CI-pre-commit-23 can work.
Pre-install clang-format to avoid repeated installation caused by pre-commit's multi-threaded runs.

8e6a1945

fix tensor copy bug (#43299) (#43728) · 8760817a
Committed by zyfncg on Jun 22, 2022

8760817a
[Cherrypick 2.3] fix decode jpeg example code (#42752) · a4c898cf
Committed by LielinJiang on Jun 22, 2022
```
* fix decode_jpeg example code

* fix decode_jpeg example code
```
a4c898cf

set_state_dict not use state_dict hook (#43407) (#43711) · 0fb66355

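Committed by zhangbo9674 on Jun 22, 2022

During development of the amp-o2 feature, a state_dict hook was added to support specifying the data type in which a network is saved. However, Layer's set_state_dict obtains the network parameters through state_dict and loads into them, so the presence of the hook prevents set_state_dict from loading into the original network parameters.
This PR adds a switch that controls the hook and disables the hook in set_state_dict to solve the problem.

See PR43407 for details.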

0fb66355
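
The interaction described in this commit can be sketched with a toy class. This is a simplified model under stated assumptions, not Paddle's `Layer`: `ToyLayer`, `cast_hook`, and the `use_hook` switch are hypothetical names, and a one-element list stands in for a tensor.

```python
class ToyLayer:
    """Toy stand-in for a layer; not Paddle's Layer class."""

    def __init__(self):
        self._params = {"weight": [1.5]}  # one-element list stands in for a tensor
        self._hooks = []

    def register_state_dict_hook(self, hook):
        self._hooks.append(hook)

    def state_dict(self, use_hook=True):
        state = dict(self._params)  # shallow copy: values are the real holders
        if use_hook:
            for hook in self._hooks:
                hook(state)
        return state

    def set_state_dict(self, new_state, use_hook=False):
        # Loading writes into whatever state_dict() hands back, so the hook
        # must be off here or the writes land in hook-made copies.
        target = self.state_dict(use_hook=use_hook)
        for name, holder in target.items():
            if name in new_state:
                holder[0] = new_state[name]


def cast_hook(state):
    # Replaces each entry with a converted copy, as a dtype-casting hook might.
    for name in list(state):
        state[name] = [int(state[name][0])]


layer = ToyLayer()
layer.register_state_dict_hook(cast_hook)

layer.set_state_dict({"weight": 4.0}, use_hook=True)  # reproduces the problem
value_with_hook = layer._params["weight"][0]          # parameter unchanged

layer.set_state_dict({"weight": 4.0})                 # hook disabled (the fix)
value_without_hook = layer._params["weight"][0]       # parameter updated
print(value_with_hook, value_without_hook)
```

With the hook active, the loaded value is written into the converted copy and the real parameter never changes; with the switch off, the write reaches the real parameter.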

[FIx bug]layer to 'NoneType' object has no attribute 'place' (#43597) (#43717) · 0b879318

Committed by zhangbo9674 on Jun 22, 2022

Bug:
When a Layer's _buffers contains an entry that is None, calling to() raises the error: layer to 'NoneType' object has no attribute 'place'.
Fix:
to() now checks for None entries in _buffers and skips any entry that is None.

0b879318
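
The None check described in this commit can be illustrated with a small stand-alone sketch. It assumes a simplified buffer object with a `place` attribute; `ToyBuffer` and `move_buffers` are hypothetical names and do not reproduce Paddle's `to()` implementation.

```python
class ToyBuffer:
    """Toy stand-in for a tensor that knows which device it lives on."""

    def __init__(self, place):
        self.place = place

    def to(self, place):
        return ToyBuffer(place)


def move_buffers(buffers, place):
    """Move every buffer to `place`, passing None entries through untouched."""
    moved = {}
    for name, buf in buffers.items():
        if buf is None:
            # Without this check, `buf.place` below would raise
            # AttributeError: 'NoneType' object has no attribute 'place'.
            moved[name] = None
            continue
        moved[name] = buf if buf.place == place else buf.to(place)
    return moved


buffers = {"running_mean": ToyBuffer("cpu"), "unused": None}
moved = move_buffers(buffers, "gpu:0")
print(moved["running_mean"].place, moved["unused"])
```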

Jun 21, 2022 · 5 commits
- [Cherry-pick ] to Release/2.3, Add prefetch_factor in dataloader (#43674) · af415bc2
  Committed by Jackwaterveg on Jun 21, 2022
```
* fix usage of prefetch_factor

* add assert

* add docstring and change prefetch_factor when num_workers=0

* fix doc
```
  af415bc2
- [cherry pick #43088 #40664] Add float16 to fake quantize/dequantize OP (#43689) · 9783e887
  Committed by Guanghua Yu on Jun 21, 2022
```
* cherry pick #43088 #40664

* fix clang format
```
  9783e887
- [Cherry-pick] Update CUDA and TensorRT version for CI (#43642) · a363e5ab
  Committed by chalsliu on Jun 21, 2022
```
* Update CUDA and TensorRT version for CI

* disable ut

* Update TensorRT for CUDA 10.2
```
  a363e5ab
- delete the log printing in layout autotune (#43677) · 090a9132
  Committed by niuliling123 on Jun 21, 2022
```
Remove redundant log printing in layout autotune
Background: the layout autotune log increases the amount of information a model prints
```
  090a9132
- fix compile fail in cuda11.6 (#43588) · e1604f9e
  Committed by zhoutianzi666 on Jun 21, 2022
  
  e1604f9e
Jun 20, 2022 · 5 commits
- [cherry-pick]to Release/2.3,modify scale op xpu unittest (#43657) · 6262efb5
  Committed by z8hanghuan on Jun 20, 2022
```
* modify xpu.cmake,*test=kunlun (#41832)

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* support bilstm,*test=kunlun

* [cherry-pick]support multi_layer of bilstm,*test=kunlun

* [cherry-pick]refactor sum unit test,*test=kunlun (#43561)
```
  6262efb5
- [Cherry pick] Einsum memory optimization PR #43397 (#43554) · 638b69dc
  Committed by xiongkun on Jun 20, 2022
```
* cherry pick from #43397

* fix code
```
  638b69dc
- fix unittest (#43609) (#43617) · 68d5c12b
  Committed by Shang Zhizhou on Jun 20, 2022
  
  68d5c12b
- place all save/load path into temporary directory (#43652) · a5ccc713
  Committed by zhaoyingli on Jun 20, 2022
  
  a5ccc713
- [Cherry-Pick] place all save/load path into temporary directory (#43316) (#43651) · 0f16ccf5
  Committed by zhaoyingli on Jun 20, 2022
```
* place all save/load path into temporary directory

* rm no need unittest
```
  0f16ccf5
Jun 18, 2022 · 1 commit
- Cherry pick 42508 (#43601) · bfe21ff3
  Committed by gongweibao on Jun 18, 2022
```
* fix test

* fix test.
```
  bfe21ff3
Jun 17, 2022 · 4 commits
- Export symbols of phi operator library (#43478) · 68ed3b86
  Committed by weishengying on Jun 17, 2022
  
  68ed3b86
- cherry pick 43581 (#43596) · 2eb60ddb
  Committed by YuanRisheng on Jun 17, 2022
  
  2eb60ddb
- [Dygraph] Fix barrier bugs of ProcessGroup in Eager Mode (#43589) · 3689a126
  Committed by Haohongxiang on Jun 17, 2022
```
* fix pg bugs

* update
```
  3689a126
- [cherry-pick 2.3] Cherry parallel fused transformer api (#43505) · 19b87aec
  Committed by WangXi on Jun 17, 2022
```
* Rename dropout is test (#43098)

* replace dropout_is_test with is_test.
* improve atol on a100.

* fused_attention fused_feedforward api support Model Tensor Parallel (#42985)

* fix is_test bug in fused_feedforward. (#43508)
Co-authored-by: Li Min <11663212+limin2021@users.noreply.github.com>
```
  19b87aec
Jun 16, 2022 · 5 commits

[cherry pick] Unit test with tempfile to place the temporary files (#43522) · 1a660c8a

Committed by zhangbopd on Jun 16, 2022

Use tempfile in unit tests and custom op tests to hold temporary files, so that all temporary files are deleted after each unit test finishes and no files are left on disk.
The PR only modifies unit tests and op tests and does not affect existing functionality.
Release/2.3 branch modified in PR43521;

1a660c8a
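
The pattern this commit describes can be sketched with the Python standard library alone. The test case below is an illustration, not code from the Paddle test suite; `SaveLoadTest`, `run_tests`, and the file name are hypothetical.

```python
import os
import tempfile
import unittest


class SaveLoadTest(unittest.TestCase):
    def setUp(self):
        # Each test gets its own directory instead of writing to the cwd.
        self.temp_dir = tempfile.TemporaryDirectory()

    def tearDown(self):
        # Removes the directory and everything the test wrote into it.
        self.temp_dir.cleanup()

    def test_round_trip(self):
        path = os.path.join(self.temp_dir.name, "model.txt")
        with open(path, "w") as f:
            f.write("weights")
        with open(path) as f:
            self.assertEqual(f.read(), "weights")


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SaveLoadTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

Creating the directory in `setUp` and cleaning it up in `tearDown` means the files are removed whether the test passes or fails.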

[Cherry-pick] Fix ut tempfile v23 (#43387) · 24843fcb
Committed by Qi Li on Jun 16, 2022
```
* fix unit test temp file, test=develop (#43155)

* add cleanup code, test=develop (#43305)
```
24843fcb

[Cherry-pick] Fix numpy 1.20+ deprecation warnings (#43513) · 689e0999

Committed by Qi Li on Jun 16, 2022

* Fix numpy 1.20+ deprecation warnings (#42929)

* Replace np.bool/np.bool8 with np.bool_

* Replace np.object with np.object_

* Replace np.complex with np.complex128

* Replace np.float with np.float64

* Replace np.int with np.int_

* Rerun pre-commit for newer pre-commit configuration

* Use builtin bool instead of np.bool_ based on the context

* fix mode dtype
Co-authored-by: zlsh80826 <rewang@nvidia.com>

689e0999

cherry-pick adamw unittest (#43498) · 0cdde0b4
Committed by zhaoyingli on Jun 16, 2022

0cdde0b4
[cherry-pick]Add progress bar and speed up Quantization Pass (#43454) · abb0b2d6
Committed by Guanghua Yu on Jun 16, 2022
```
* Add progress bar and speed up Quantization Pass

* fix typo
```
abb0b2d6

Jun 15, 2022 · 1 commit
- [cherry-pick] Fix bug of strided_slice and slice (#43388, #43443) (#43432) · 7e940b84
  Committed by zyfncg on Jun 15, 2022
```
* fix bug of strided_slice (#43388)

* fix stride_slice bug

* fix bug

* fix bug of infer shape for slice (#43443)
```
  7e940b84
Jun 14, 2022 · 1 commit
- Add jetson tool (#43486) · 53a7d38b
  Committed by Shang Zhizhou on Jun 14, 2022
  
  53a7d38b

Crayon鑫 / Paddle is up to date with the fork's source project