Commits · af415bc218cf6f85f9277f9596334ec818e0b7b5 · BaiXuePrincess / Paddle

21 Jun, 2022 5 commits
- [Cherry-pick ] to Release/2.3, Add prefetch_factor in dataloader (#43674) · af415bc2
  Committed by Jackwaterveg on Jun 21, 2022
```
* fix usage of prefetch_factor

* add assert

* add docstring and change prefetch_factor when num_workers=0

* fix doc
```
  af415bc2
- [cherry pick #43088 #40664] Add float16 to fake quantize/dequantize OP (#43689) · 9783e887
  Committed by Guanghua Yu on Jun 21, 2022
```
* cherry pick #43088 #40664

* fix clang format
```
  9783e887
- [Cherry-pick] Update CUDA and TensorRT version for CI (#43642) · a363e5ab
  Committed by chalsliu on Jun 21, 2022
```
* Update CUDA and TensorRT version for CI

* disable ut

* Update TensorRT for CUDA 10.2
```
  a363e5ab
- delete the log printing in layout autotune (#43677) · 090a9132
  Committed by niuliling123 on Jun 21, 2022
```
Remove redundant log printing in layout autotune.
Background: the layout autotune log increases the amount of information that models print.
```
  090a9132
- fix compile fail in cuda11.6 (#43588) · e1604f9e
  Committed by zhoutianzi666 on Jun 21, 2022
  
  e1604f9e
20 Jun, 2022 5 commits
- [cherry-pick]to Release/2.3,modify scale op xpu unittest (#43657) · 6262efb5
  Committed by z8hanghuan on Jun 20, 2022
```
* modify xpu.cmake,*test=kunlun (#41832)

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* modify xpu.cmake,*test=kunlun

* support bilstm,*test=kunlun

* [cherry-pick]support multi_layer of bilstm,*test=kunlun

* [cherry-pick]refactor sum unit test,*test=kunlun (#43561)
```
  6262efb5
- [Cherry pick] Einsum memory optimization PR #43397 (#43554) · 638b69dc
  Committed by xiongkun on Jun 20, 2022
```
* cherry pick from #43397

* fix code
```
  638b69dc
- fix unittest (#43609) (#43617) · 68d5c12b
  Committed by Shang Zhizhou on Jun 20, 2022
  
  68d5c12b
- place all save/load path into temporary directory (#43652) · a5ccc713
  Committed by zhaoyingli on Jun 20, 2022
  
  a5ccc713
- [Cherry-Pick] place all save/load path into temporary directory (#43316) (#43651) · 0f16ccf5
  Committed by zhaoyingli on Jun 20, 2022
```
* place all save/load path into temporary directory

* rm no need unittest
```
  0f16ccf5
18 Jun, 2022 1 commit
- Cherry pick 42508 (#43601) · bfe21ff3
  Committed by gongweibao on Jun 18, 2022
```
* fix test

* fix test.
```
  bfe21ff3
17 Jun, 2022 4 commits
- Export symbols of phi operator library (#43478) · 68ed3b86
  Committed by weishengying on Jun 17, 2022
  
  68ed3b86
- cherry pick 43581 (#43596) · 2eb60ddb
  Committed by YuanRisheng on Jun 17, 2022
  
  2eb60ddb
- [Dygraph] Fix barrier bugs of ProcessGroup in Eager Mode (#43589) · 3689a126
  Committed by Haohongxiang on Jun 17, 2022
```
* fix pg bugs

* update
```
  3689a126
- [cherry-pick 2.3] Cherry parallel fused transformer api (#43505) · 19b87aec
  Committed by WangXi on Jun 17, 2022
```
* Rename dropout is test (#43098)

* replace dropout_is_test with is_test.
* improve atol on a100.

* fused_attention fused_feedforward api support Model Tensor Parallel (#42985)

* fix is_test bug in fused_feedforward. (#43508)
Co-authored-by: Li Min <11663212+limin2021@users.noreply.github.com>
```
  19b87aec
16 Jun, 2022 5 commits

[cherry pick] Unit test with tempfile to place the temporary files (#43522) · 1a660c8a

Committed by zhangbopd on Jun 16, 2022

Use tempfile in unit tests and custom op tests in place of ad hoc temporary files, so that all temporary files are deleted once a test finishes and do not take up disk space.
This PR only modifies unit tests and op tests and does not affect existing functionality.
Release/2.3 branch modified in PR 43521.

1a660c8a
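
The tempfile pattern described in this commit message could look roughly like the sketch below. It is an illustration using only the Python standard library, not code from the PR: the test class, the file name `model.params`, and the `run_suite` helper are all hypothetical.

```python
import os
import tempfile
import unittest


class SaveLoadTest(unittest.TestCase):
    """Hypothetical test case; names and file contents are illustrative."""

    last_dir = None  # remembered so callers can check that cleanup happened

    def setUp(self):
        # Create a fresh directory that belongs to this test only.
        self.temp_dir = tempfile.TemporaryDirectory()
        SaveLoadTest.last_dir = self.temp_dir.name

    def tearDown(self):
        # Delete the directory and every file written into it.
        self.temp_dir.cleanup()

    def test_round_trip(self):
        # All save/load paths live under the temporary directory.
        path = os.path.join(self.temp_dir.name, "model.params")
        with open(path, "w") as f:
            f.write("weights")
        with open(path) as f:
            self.assertEqual(f.read(), "weights")


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SaveLoadTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because cleanup happens in `tearDown`, the directory is removed whether the test passes or fails, which is the property the commit message asks for.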

[Cherry-pick] Fix ut tempfile v23 (#43387) · 24843fcb
Committed by Qi Li on Jun 16, 2022
```
* fix unit test temp file, test=develop (#43155)

* add cleanup code, test=develop (#43305)
```
24843fcb

[Cherry-pick] Fix numpy 1.20+ deprecation warnings (#43513) · 689e0999

Committed by Qi Li on Jun 16, 2022

* Fix numpy 1.20+ deprecation warnings (#42929)

* Replace np.bool/np.bool8 with np.bool_

* Replace np.object with np.object_

* Replace np.complex with np.complex128

* Replace np.float with np.float64

* Replace np.int with np.int_

* Rerun pre-commit for newer pre-commit configuration

* Use builtin bool instead of np.bool_ based on the context

* fix mode dtype
Co-authored-by: zlsh80826 <rewang@nvidia.com>

689e0999
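
The alias replacements listed in this commit message can be expressed as a small text rewrite. The sketch below is an assumption-laden illustration, not the tooling used in the PR: `rewrite_aliases` is a hypothetical helper, it assumes NumPy is imported as `np`, and it uses only the standard library `re` module. The aliases it targets (`np.bool`, `np.object`, `np.complex`, `np.float`, `np.int`) are the ones deprecated as of NumPy 1.20.

```python
import re

# Replacements taken from the commit message above.
REPLACEMENTS = {
    "bool8": "bool_",
    "bool": "bool_",
    "object": "object_",
    "complex": "complex128",
    "float": "float64",
    "int": "int_",
}

# "bool8" is listed before "bool" so the longer alias is tried first.
# The trailing \b stops names such as np.float64 or np.int_ from matching.
_ALIAS = re.compile(r"\bnp\.(bool8|bool|object|complex|float|int)\b")


def rewrite_aliases(source: str) -> str:
    """Hypothetical helper: rewrite deprecated NumPy aliases in source text."""
    return _ALIAS.sub(lambda m: "np." + REPLACEMENTS[m.group(1)], source)
```

For example, `rewrite_aliases("x.astype(np.float)")` returns `"x.astype(np.float64)"`. A purely textual rewrite cannot decide when the builtin `bool` is the better choice, which the commit message says was judged from context.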

cherry-pick adamw unittest (#43498) · 0cdde0b4
Committed by zhaoyingli on Jun 16, 2022

0cdde0b4
[cherry-pick]Add progress bar and speed up Quantization Pass (#43454) · abb0b2d6
Committed by Guanghua Yu on Jun 16, 2022
```
* Add progress bar and speed up Quantization Pass

* fix typo
```
abb0b2d6

15 Jun, 2022 1 commit
- [cherry-pick] Fix bug of strided_slice and slice (#43388, #43443) (#43432) · 7e940b84
  Committed by zyfncg on Jun 15, 2022
```
* fix bug of strided_slice (#43388)

* fix stride_slice bug

* fix bug

* fix bug of infer shape for slice (#43443)
```
  7e940b84
14 Jun, 2022 3 commits

Add jetson tool (#43486) · 53a7d38b
Committed by Shang Zhizhou on Jun 14, 2022

53a7d38b

[ CherryPick ] Cherry pick for einsum optimization. (#43468) · 22e75d92

Committed by xiongkun on Jun 14, 2022

* [EinsumOp] Polish forward logic and backward logic for optimize (#42603)

* change logic for optimize

* modifty

* merge

* change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0 (#43010)

* [EinsumOp] Make EinsumOp support bfloat16. (#43085)

* change einsum_v2 as default and add new flags: FLAG_einsum_opt=1|0

* make EInsumOP support bf16

* add unittest for BF16

* add condition for test_BF16

* fix bugs

* fix

* change the backward api to fit einsum op

22e75d92

Use tempfile to place all the temporary files. (#43392) · afd0c1db

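Committed by freeliuzc on Jun 14, 2022

Use tempfile in place of ad hoc temporary files, so that all temporary files are deleted once a unit test finishes and do not take up disk space.
This PR only modifies unit tests and does not affect existing functionality.
The corresponding develop-branch change is in PR 43376.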

afd0c1db

13 Jun, 2022 1 commit
- 2.3 del test message (#43404) · ed859054
  Committed by tianshuo78520a on Jun 13, 2022
```
Remove useless messages.
```
  ed859054
09 Jun, 2022 3 commits
- cherry pick #42255 (fuse conv + bn in QAT) and #42378 (support skip_op_list in PTQ) (#43301) · 0a00fc4e
  Committed by Guanghua Yu on Jun 09, 2022
```
* support fuse conv and bn in QAT (#42255)

* support skip_op_list in PostTrainingQuantization (#42378)

* fix unittest
```
  0a00fc4e
- Modify quantization use tempfile to place the temporary files (#43281) · f4e09397
  Committed by Guanghua Yu on Jun 09, 2022
  
  f4e09397
- disable lite gpu (#43178) · 36980306
  Committed by zhupengyang on Jun 09, 2022
  
  36980306
08 Jun, 2022 4 commits
- Replace ReduceAmax/Amax.part.cu with KP (#43202) (#43263) · e161979e
  Committed by niuliling123 on Jun 08, 2022
```
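The original implementations of reduce amax/amin and frobenius_norm_kernel use Eigen and take a long time to compile, so this PR replaces them with KP implementations.
Remove duplicated functionality from DefaultElementwiseOperator to reduce the compile time of the elementwise_double_grad OP.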
```
  e161979e
- Del whl check for release/2.3 (#43288) · 8f127681
  Committed by tianshuo78520a on Jun 08, 2022
```
Remove the whl package size comparison on 2.3.
```
  8f127681
- updated paddle_bfloat to v0.1.7 · df1d4645
  Committed by jakpiase on May 19, 2022
  
  df1d4645
- Resolve protobuf of ORT Backend conflict (#43275) · c2804390
  Committed by heliqi on Jun 07, 2022
```
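Resolve the version conflict between the protobuf that the onnxruntime backend depends on and the protobuf used by the framework or by external code.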
```
  c2804390
07 Jun, 2022 3 commits
- fix the problem of slice infer shape (#42568) (#43246) · f1b4e4d5
  Committed by zyfncg on Jun 07, 2022
  
  f1b4e4d5
- fix memory leakage (#43141) (#43220) · e09803c5
  Committed by xiongkun on Jun 07, 2022
  
  e09803c5
- [cherry-pick]Delete ElementwiseKernel in BroadcastKernel (#42779) (#43210) · 52ef8656
  Committed by niuliling123 on Jun 07, 2022
```
Delete ElementwiseKernel in BroadcastKernel
Reduce duplicated function calls across all Broadcast code, which also reduces compile time and file size.
```
  52ef8656
06 Jun, 2022 1 commit

cherry-pick 42645 (#43205) · 835a1888

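Committed by niuliling123 on Jun 06, 2022

Remove the rank instantiation and the Elementwise calls in the Broadcast function to reduce compile time.
This change is adapted from PR #42645 on the develop branch. Because the develop and release branches differ substantially, a cherry-pick is not possible, so the change is resubmitted as a new PR against release/2.3.
Instantiating Broadcast per rank causes heavy expansion of the underlying templates, which makes reduce_sum_grad_kernel.cu.o too large. This change reduces both the .o size and the compile time.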

835a1888

31 May, 2022 1 commit

Del check size (#43113) · 40a7e0ad

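Committed by tianshuo78520a on May 31, 2022

Remove the checks on build directory size and inference library size. These checks compare against develop, so differences are expected; the checks are disabled for release jobs.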

40a7e0ad

30 May, 2022 2 commits
- [Dy2St]Fix cond_block_grad error when handle no need grad vras (#43034) (#43084) · e6e85b35
  Committed by WangZhen on May 30, 2022
```
* Fix cond_block_grad error when handle no need grad vras

* Add comment and UT
```
  e6e85b35
- [Paddle-Inference] fix_multiheadpass_int8 (#43020) · 72880279
  Committed by Wangzheee on May 30, 2022
```
* fix_multi_int8 (#42977)

* cherry-pick fix_multihead_int8
```
  72880279
27 May, 2022 1 commit
- test=document_fix · aedd4592
  Committed by tianshuo78520a on May 27, 2022
  
  aedd4592

BaiXuePrincess / Paddle is in sync with the fork source project
