Commits · 079c585c0a43fd85cd5451e9e3680dc72b7d4ea1 · PaddlePaddle / Paddle

31 Aug, 2021 16 commits
- [Dy2Stat]Add model ResNet50 for Dy2stat AMP training (#35276) · 079c585c
  Committed by Aurelius84 on Aug 31, 2021
```
* Add model for ResNet50 for Dy2stat AMP training

* fix timeout

* fix dataloader
```
- [NPU] fix cmake for ascend ci, test=develop (#35255) · f6004ab9
  Committed by Qi Li on Aug 31, 2021
```
* [NPU] fix cmake for ascend ci, test=develop

* update paddle_build.sh scripts, test=allcase
```
- Revert "Revert "Add copy from tensor (#34406)" (#35173)" (#35256) · 6116f9af
  Committed by Shang Zhizhou on Aug 31, 2021
```
* Revert "Revert "Add copy from tensor (#34406)" (#35173)"

This reverts commit 32c1ec42.

* add template instantiation
```
- fix bug that cmake find python (#35304) · 00c9aeb0
  Committed by zhouweiwei2014 on Aug 31, 2021
- New whl release strategy with pruned nv_fatbin (#35239) · 2f3b393d
  Committed by Zhanlue Yang on Aug 31, 2021
```
[Background]
Expansion in code size can be irreversible in the long run, leading to huge release packages, which
not only hamper user experience but also exceed a hard limit of PyPI.

In particular, the NV_FATBIN section takes up 86% of the compiled dylib size, owing to the vast number of GPU
arches supported.

This PR aims to prune this NV_FATBIN.

[Solution]
In the new release strategy, two types of whl packages will be involved:

Cubin PIP package:
PIP package maintains a smaller window for GPU arches support, containing
sm_60, sm_70, sm_75, sm_80 cubins, covering Pascal - Ampere arches

JIT release package:
This is a backup for Cubin PIP package, containing compute_35, compute_50, compute_60,
compute_70, compute_75, compute_80, with best performance and GPU arches coverage.

However, it takes around 10 min to install due to the JIT compilation.

[How to use]
The new release strategy is disabled by default.
To compile for Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
To compile for JIT release package, add this to cmake: -DJIT_RELEASE_WHL
```
- Put code style check on gpu_ci (#35309) · d9f59fd1
  Committed by tianshuo78520a on Aug 31, 2021
```
* notest;test=cpu

* test

* test=document_fix
```
- update infer trt ut. (#35261) · 96e7d903
  Committed by Wilber on Aug 31, 2021
- add trt error information. (#35277) · a2afcace
  Committed by wenbin on Aug 31, 2021
```
* add trt error information.

* rerun ci
```
- fix CI skip cc test error (#35264) · 3d76d003
  Committed by wuhuanzhou on Aug 31, 2021
```
* fix CI skip cc test error, test=develop

* remove test code, test=develop
```
- Add AsExtra() for conditional_block_op.h (#35268) · 2100816c
  Committed by Huihuang Zheng on Aug 31, 2021
```
As the title, see details at the PR description.
```
- fix windows batch file error:The system cannot find the batch label specified (#35288) · 2c0d667b
  Committed by zhouweiwei2014 on Aug 31, 2021
- fix the pass compat check position error, test=develop (#35272) · 54f07019
  Committed by 王明冬 on Aug 31, 2021
- support fuse layers for ptq (#35015) · ef536250
  Committed by XGZhang on Aug 31, 2021
- NPU add elementwise_mod (#35245) · 561841d2
  Committed by Aganlengzi on Aug 31, 2021
- NPU add fill_zeros_like kernel (#35246) · aaaa9965
  Committed by Aganlengzi on Aug 31, 2021
- change exit code and polish infer_ut summary style (#35254) · 531a8909
  Committed by Peihan on Aug 31, 2021
```
* change exit code and summary style

* disable test_ernie_text_cls on windows
```
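
The two package types described in #35239 above differ mainly in what each nvcc `-gencode` entry embeds: precompiled cubins for a fixed set of GPUs, or PTX that is JIT-compiled for whatever GPU is present. The following is a minimal sketch, assuming the arch lists quoted in that commit message. The helper name `gencode_flags` and the package labels `"cubin"` and `"jit"` are hypothetical, and this is not Paddle's actual CMake logic.

```python
# Arch lists copied from the commit message of #35239.
CUBIN_ARCHS = ["60", "70", "75", "80"]              # Cubin PIP package
JIT_ARCHS = ["35", "50", "60", "70", "75", "80"]    # JIT release package


def gencode_flags(package):
    """Return illustrative nvcc -gencode flags for one package type.

    Hypothetical helper for illustration only; the real build may
    assemble its flags differently.
    """
    if package == "cubin":
        # code=sm_XX embeds machine code only, so nothing is compiled later,
        # but only the listed GPU generations are covered.
        return ["-gencode arch=compute_%s,code=sm_%s" % (a, a) for a in CUBIN_ARCHS]
    if package == "jit":
        # code=compute_XX embeds PTX only, which is JIT-compiled for the
        # GPU that is actually present.
        return ["-gencode arch=compute_%s,code=compute_%s" % (a, a) for a in JIT_ARCHS]
    raise ValueError("unknown package type: %r" % (package,))


if __name__ == "__main__":
    for name in ("cubin", "jit"):
        print(name, " ".join(gencode_flags(name)))
```

The cubin variant trades GPU coverage for package size and startup time, while the PTX-only variant covers more arches at the cost of the JIT compilation step mentioned in the commit message.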
30 Aug, 2021 10 commits
- [Paddle Inference-TRT]Adding six op unittest codes of TRT INT8 (#35130) · 39565147
  Committed by xiaoxiaohehe001 on Aug 30, 2021
```
* add_op_unittest
```
- [NPU] Add log_loss op (#35010) · b94d7ff3
  Committed by zhulei on Aug 30, 2021
```
* [NPU] Add log_loss op

* [NPU] Add log_loss op

* [NPU] Add log_loss op
```
- fix using boost::none as the init value when using paddle::optional (#35215) · e864667b
  Committed by chentianyu03 on Aug 30, 2021
- - candidate fix (#35231) · ca4d2fca
  Committed by Jacek Czaja on Aug 30, 2021
- [Op Def] Add extra def of linear_interp & linear_interp_v2 & addmm (#35233) · 6ff179ab
  Committed by zhulei on Aug 30, 2021
```
* [Op Def] Add extra def of linear_interp & linear_interp_v2

* [Op Def] Add extra def of linear_interp & linear_interp_v2 & addmm
```
- [paddle-TRT]support matmul set to int8 in multihead (#34917) · 0043fa8c
  Committed by ceci3 on Aug 30, 2021
```
* update ernie int8
```
- del message;test=document_fix (#35248) · c0bdef5d
  Committed by tianshuo78520a on Aug 30, 2021
- Set value (#34886) · 37d281c9
  Committed by xiongkun on Aug 30, 2021
```
* tmp

* Tile - Assign - Crop

* Finish the set value npu kernel and test case in npu

* improve the error message

* Modify according to zhangliujie

* code review
```
- Add cpu/gpu for PR-CI-CPU-Py2 (#35174) · 8f94d349
  Committed by tianshuo78520a on Aug 30, 2021
```
* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* fix

* fix
```
- Abstract GenerateDeviceEventFlag to shield platforms (#35219) · 20cfa8ba
  Committed by Aurelius84 on Aug 30, 2021
```
* Abstract GenerateDeviceEventFlag to shield platforms

* Remove get_cuda_flags
```
29 Aug, 2021 1 commit
- test=document_fix (#35221) · 31cd1065
  Committed by Guoxia Wang on Aug 29, 2021
27 Aug, 2021 13 commits
- test=document_fix (#35222) · 5dcff7c8
  Committed by Guoxia Wang on Aug 27, 2021
- add uniform_ op and UT (#33934) · be29b8ee
  Committed by JYChen on Aug 27, 2021
- add more models for model_benchmark_ci,test=document_fix (#35178) · 5a72cf43
  Committed by xiegegege on Aug 27, 2021
- Add unpool2d op & Expose max_unpool2d API (#35056) · ceee71a0
  Committed by xiaoting on Aug 27, 2021
```
* add maxunpool2d op, test=develop

* fix typo, test=develop

* fix unpool unittest, test=develop

* fix unpool code-example, test=develop

* fix for unpool_op_unittest,test=develop

* fix example code, test=develop

* add noqa:F401, test=develop

* fix coverage, test=develop

* fix unittest for unpool, test=develop

* rename unpool2d to unpool, test=develop

* rename unpool2d to unpool, test=develop
```
- sparse_momentum_op is used to save w@GRAD memory for gather_op (#34942) · 234ce932
  Committed by Guoxia Wang on Aug 27, 2021
```
* sparse_momentum_op is used to save w@GRAD memory for gather_op when gather from a large parameter
```
- [hybrid] Fix row parallel linear bias (#35186) · 1533d7e2
  Committed by WangXi on Aug 27, 2021
- Add fusion_gru and multi_gru to PTQ (Post-Training Quantization) (#33749) · 7debae3a
  Committed by joanna.wozna.intel on Aug 27, 2021
```
* Add calculation for gru op

* Correct the types

* Remove mkldnn only

* Correct mkldnn ifdef

* Remove mkldnn ifdef

* Separate mkldnn quantizer test

* Correct Windows test

* Check different cmake fix

* Revert cmake change

* Cmake change 2

* Cmake change 3
```
- Polish DeviceEvent interface and Remove #ifdef in InterpreterCore (#35196) · 48bf7cbf
  Committed by Aurelius84 on Aug 27, 2021
```
* add CPUDeviceEvent

* Polish DeviceEvent code

* Add DEVICE_EVENT_LIBS
```
- fix count_api_without_core_ops (#35170) · 7272526b
  Committed by wanghuancoder on Aug 27, 2021
```
* fix count_api_without_core_ops, test=develop

* fix count_api_without_core_ops, test=develop

* refine, test=develop

* remove test code, test=develop

* remove test, test=develop

* modify check_api_approvals.sh, test=develop
```
- gelu/logsigmoid add AsExtra (#35198) · 2006fbc4
  Committed by zhupengyang on Aug 27, 2021
- fix the crash when input variable is bool type, test=develop (#35176) · ad522483
  Committed by 王明冬 on Aug 27, 2021
- Update test_cross_entropy_loss.py · e838cacf
  Committed by HydrogenSulfate on Aug 26, 2021
- Update loss.py · cf6e543b
  Committed by HydrogenSulfate on Aug 18, 2021
