Commits · 2f3b393d151afd6e82cb88095cb2079a9be65f61 · 机器未来 / Paddle

31 Aug, 2021 · 12 commits

New whl release strategy with pruned nv_fatbin (#35239) · 2f3b393d

Committed by Zhanlue Yang on Aug 31, 2021

[Background]
Growth in code size can be irreversible in the long run, leading to huge release packages that
not only hamper the user experience but also exceed PyPI's hard size limit.

The NV_FATBIN section alone takes up 86% of the compiled dylib size, owing to the large number of GPU
architectures supported.

This PR aims to prune the NV_FATBIN section.

[Solution]
The new release strategy involves two types of whl packages:

Cubin PIP package:
This package supports a narrower window of GPU architectures. It contains
sm_60, sm_70, sm_75, and sm_80 cubins, covering the Pascal through Ampere architectures.

JIT release package:
This is a fallback for the Cubin PIP package. It contains compute_35, compute_50, compute_60,
compute_70, compute_75, and compute_80, and offers the best performance and GPU architecture coverage.

However, it takes around 10 minutes to install because of JIT compilation.
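
As a rough illustration of the difference between the two packages, the sketch below maps the two architecture lists onto nvcc `-gencode` arguments. It assumes the `sm_*` entries are built as precompiled cubins (`code=sm_XX`) and the `compute_*` entries are shipped as PTX (`code=compute_XX`), which the mention of JIT compilation suggests but the commit message does not state. The function name `gencode_flags` and the package labels `"cubin"` and `"jit"` are hypothetical; Paddle's actual CMake logic is not shown in this commit message and may differ.

```python
from typing import List

# Architecture lists copied from the commit message.
CUBIN_ARCHS = ["60", "70", "75", "80"]              # sm_60 ... sm_80
JIT_ARCHS = ["35", "50", "60", "70", "75", "80"]    # compute_35 ... compute_80


def gencode_flags(package: str) -> List[str]:
    """Return nvcc -gencode arguments for a hypothetical package kind."""
    if package == "cubin":
        # code=sm_XX embeds precompiled machine code for that architecture.
        pairs = [(a, f"sm_{a}") for a in CUBIN_ARCHS]
    elif package == "jit":
        # code=compute_XX embeds PTX, which the driver JIT-compiles for the GPU it finds.
        pairs = [(a, f"compute_{a}") for a in JIT_ARCHS]
    else:
        raise ValueError(f"unknown package kind: {package!r}")

    flags: List[str] = []
    for arch, code in pairs:
        flags += ["-gencode", f"arch=compute_{arch},code={code}"]
    return flags


if __name__ == "__main__":
    print(" ".join(gencode_flags("cubin")))
    print(" ".join(gencode_flags("jit")))
```

Under these assumptions the cubin variant carries machine code for four architectures only, while the JIT variant defers code generation to the driver, which is consistent with the longer install time noted above.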

[How to use]
The new release strategy is disabled by default.
To build the Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
To build the JIT release package, add this to cmake: -DJIT_RELEASE_WHL

2f3b393d

Put code style check on gpu_ci (#35309) · d9f59fd1
Committed by tianshuo78520a on Aug 31, 2021
```
* notest;test=cpu

* test

* test=document_fix
```
d9f59fd1
update infer trt ut. (#35261) · 96e7d903
Committed by Wilber on Aug 31, 2021

96e7d903
add trt error information. (#35277) · a2afcace
Committed by wenbin on Aug 31, 2021
```
* add trt error information.

* rerun ci
```
a2afcace
fix CI skip cc test error (#35264) · 3d76d003
Committed by wuhuanzhou on Aug 31, 2021
```
* fix CI skip cc test error, test=develop

* remove test code, test=develop
```
3d76d003
Add AsExtra() for conditional_block_op.h (#35268) · 2100816c
Committed by Huihuang Zheng on Aug 31, 2021
```
As the title says; see the PR description for details.
```
2100816c
fix windows batch file error:The system cannot find the batch label specified (#35288) · 2c0d667b
Committed by zhouweiwei2014 on Aug 31, 2021

2c0d667b
fix the pass compat check position error, test=develop (#35272) · 54f07019
Committed by 王明冬 on Aug 31, 2021

54f07019
support fuse layers for ptq (#35015) · ef536250
Committed by XGZhang on Aug 31, 2021

ef536250
NPU add elementwise_mod (#35245) · 561841d2
Committed by Aganlengzi on Aug 31, 2021

561841d2
NPU add fill_zeros_like kernel (#35246) · aaaa9965
Committed by Aganlengzi on Aug 31, 2021

aaaa9965
change exit code and polish infer_ut summary style (#35254) · 531a8909
Committed by Peihan on Aug 31, 2021
```
* change exit code and summary style

* disable test_ernie_text_cls on windows
```
531a8909

30 Aug, 2021 · 10 commits
- [Paddle Inference-TRT]Adding six op unittest codes of TRT INT8 (#35130) · 39565147
  Committed by xiaoxiaohehe001 on Aug 30, 2021
```
* add_op_unittest
```
  39565147
- [NPU] Add log_loss op (#35010) · b94d7ff3
  Committed by zhulei on Aug 30, 2021
```
* [NPU] Add log_loss op

* [NPU] Add log_loss op

* [NPU] Add log_loss op
```
  b94d7ff3
- fix using boost::none as the init value when using paddle::optional (#35215) · e864667b
  Committed by chentianyu03 on Aug 30, 2021
  
  e864667b
- \- candidate fix (#35231) · ca4d2fca
  Committed by Jacek Czaja on Aug 30, 2021
  
  ca4d2fca
- [Op Def] Add extra def of linear_interp & linear_interp_v2 & addmm (#35233) · 6ff179ab
  Committed by zhulei on Aug 30, 2021
```
* [Op Def] Add extra def of linear_interp & linear_interp_v2

* [Op Def] Add extra def of linear_interp & linear_interp_v2 & addmm
```
  6ff179ab
- [paddle-TRT]support matmul set to int8 in multihead (#34917) · 0043fa8c
  Committed by ceci3 on Aug 30, 2021
```
* update ernie int8
```
  0043fa8c
- del message;test=document_fix (#35248) · c0bdef5d
  Committed by tianshuo78520a on Aug 30, 2021
  
  c0bdef5d
- Set value (#34886) · 37d281c9
  Committed by xiongkun on Aug 30, 2021
```
* tmp

* Tile - Assign - Crop

* Finish the set value npu kernel and test case in npu

* improve the error message

* Modify according to zhangliujie

* code review
```
  37d281c9
- Add cpu/gpu for PR-CI-CPU-Py2 (#35174) · 8f94d349
  Committed by tianshuo78520a on Aug 30, 2021
```
* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* notest;test=cpu_gpu

* fix

* fix
```
  8f94d349
- Abstract GenerateDeviceEventFlag to shield platforms (#35219) · 20cfa8ba
  Committed by Aurelius84 on Aug 30, 2021
```
* Abstract GenerateDeviceEventFlag to shield platforms

* Remove get_cuda_flags
```
  20cfa8ba
29 Aug, 2021 · 1 commit
- test=document_fix (#35221) · 31cd1065
  Committed by Guoxia Wang on Aug 29, 2021
  
  31cd1065
27 Aug, 2021 · 17 commits
- test=document_fix (#35222) · 5dcff7c8
  Committed by Guoxia Wang on Aug 27, 2021
  
  5dcff7c8
- add uniform_ op and UT (#33934) · be29b8ee
  Committed by JYChen on Aug 27, 2021
  
  be29b8ee
- add more models for model_benchmark_ci,test=document_fix (#35178) · 5a72cf43
  Committed by xiegegege on Aug 27, 2021
  
  5a72cf43
- Add unpool2d op & Expose max_unpool2d API (#35056) · ceee71a0
  Committed by xiaoting on Aug 27, 2021
```
* add maxunpool2d op, test=develop

* fix typo, test=develop

* fix unpool unittest, test=develop

* fix unpool code-example, test=develop

* fix for unpool_op_unittest,test=develop

* fix example code, test=develop

* add noqa:F401, test=develop

* fix coverage, test=develop

* fix unittest for unpool, test=develop

* rename unpool2d to unpool, test=develop

* rename unpool2d to unpool, test=develop
```
  ceee71a0
- sparse_momentum_op is used to save w@GRAD memory for gather_op (#34942) · 234ce932
  Committed by Guoxia Wang on Aug 27, 2021
```
* sparse_momentum_op is used to save w@GRAD memory for gather_op when gather from a large parameter
```
  234ce932
- [hybrid] Fix row parallel linear bias (#35186) · 1533d7e2
  Committed by WangXi on Aug 27, 2021
  
  1533d7e2
- Add fusion_gru and multi_gru to PTQ (Post-Training Quantization) (#33749) · 7debae3a
  Committed by joanna.wozna.intel on Aug 27, 2021
```
* Add calculation for gru op

* Correct the types

* Remove mkldnn only

* Correct mkldnn ifdef

* Remove mkldnn ifdef

* Separate mkldnn quantizer test

* Correct Windows test

* Check different cmake fix

* Revert cmake change

* Cmake change 2

* Cmake change 3
```
  7debae3a
- Polish DeviceEvent interface and Remove #ifdef in InterpreterCore (#35196) · 48bf7cbf
  Committed by Aurelius84 on Aug 27, 2021
```
* add CPUDeviceEvent

* Polish DeviceEvent code

* Add DEVICE_EVENT_LIBS
```
  48bf7cbf
- fix count_api_without_core_ops (#35170) · 7272526b
  Committed by wanghuancoder on Aug 27, 2021
```
* fix count_api_without_core_ops, test=develop

* fix count_api_without_core_ops, test=develop

* refine, test=develop

* remove test code, test=develop

* remove test, test=develop

* modify check_api_approvals.sh, test=develop
```
  7272526b
- gelu/logsigmoid add AsExtra (#35198) · 2006fbc4
  Committed by zhupengyang on Aug 27, 2021
  
  2006fbc4
- fix the crash when input variable is bool type, test=develop (#35176) · ad522483
  Committed by 王明冬 on Aug 27, 2021
  
  ad522483
- Update test_cross_entropy_loss.py · e838cacf
  Committed by HydrogenSulfate on Aug 26, 2021
  
  e838cacf
- Update loss.py · cf6e543b
  Committed by HydrogenSulfate on Aug 18, 2021
  
  cf6e543b
- Update loss.py · 11e9d4e3
  Committed by HydrogenSulfate on Aug 16, 2021
  
  11e9d4e3
- Update loss.py · 0c2d6bcb
  Committed by HydrogenSulfate on Aug 16, 2021
  
  0c2d6bcb
- Update loss.py · 52804cd8
  Committed by HydrogenSulfate on Aug 16, 2021
  
  52804cd8
- Update loss.py · 00467688
  Committed by HydrogenSulfate on Aug 16, 2021
  
  00467688

机器未来 / Paddle is in sync with its fork source project