Commits · 01063218f273931f6856777b7aa797109fedbbae · Crayon鑫 / Paddle

16 Sep, 2021 1 commit
- Add CPU and GPU eigh op implementation (#34990) · 07d0b834
  Committed by crystal on Sep 16, 2021

14 Sep, 2021 1 commit

windows third party cache optimization: share third party cache among servers (#35368) · e919620a

Committed by Sing_chan on Sep 14, 2021

* new function: share the third party cache among servers to speed up builds

* modified code according to zhouwei25's comment

* add wget install step, move cd build to the end of the if condition

* block notes and errors from third_party sharing; change bce upload method

* change third_party sub_dir in bos, since third party builds for different cuda versions can't be shared

* set sub_dir from the nvcc version

* change third_party local path to match the bos path
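
The sub_dir idea in the commit above can be sketched roughly as follows. This is only an illustration, not the script from the PR: the function names, the `cuda112`-style naming, and the `cpu` fallback are assumptions made for the example. It parses the release number that `nvcc --version` prints and uses the same relative path for the local and the remote cache.

```python
import re


def cuda_sub_dir(nvcc_version_output: str) -> str:
    """Derive a cache sub-directory name from `nvcc --version` output.

    The 'cuda<major><minor>' naming and the 'cpu' fallback are
    hypothetical; the PR does not show its actual naming scheme.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_version_output)
    if match is None:
        return "cpu"  # assumption: no nvcc found means a CPU-only cache
    major, minor = match.groups()
    return f"cuda{major}{minor}"


def third_party_cache_path(root: str, nvcc_version_output: str) -> str:
    """Build one relative path, usable both locally and in remote storage."""
    return f"{root.rstrip('/')}/{cuda_sub_dir(nvcc_version_output)}"


# Typical last line of `nvcc --version` output.
sample = "Cuda compilation tools, release 11.2, V11.2.152"
print(third_party_cache_path("third_party", sample))
```

Keying the cache on the CUDA version keeps builds for different toolkits from overwriting each other's artifacts, which is the reason the commit gives for splitting the sub_dir.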

13 Sep, 2021 1 commit
- add xpu_wait & new implementation replace memcpy in adam, adamw (#35437) · 86a6be1a
  Committed by taixiurong on Sep 13, 2021

09 Sep, 2021 1 commit

Add matrix_rank Op and its GPU and CPU kernel (#34823) · eb1fbf12

Committed by 0x45f on Sep 09, 2021

* init matrix_rank op, add matrix_rank CPU code and test

* add GPU kernel, remove svd_eigen.h

* add CPU kernel when tol is tensor

* add cpu and gpu code when tol is tensor

* fix CI-ROCM error

* add matrix_rank API description, fix PR-CI-Py3 error

* fix PR-CI-Windows error, add matrix_rank API test

* delete useless comments

* fix review

* add my code in svd_helper.h

* update doc comments

* remove spaces


03 Sep, 2021 2 commits
- [NPU] add int64_t kernels for YoloV3, test=develop (#35045) · f014e301
  Committed by Qi Li on Sep 03, 2021
```
* [NPU] add int64 kernels, test=develop

* update ci scripts to be able to turn WITH_ASCEND_INT64 on, test=develop
```
- fix bn_infer and optimize momentum for kunlun (#35250) · 8305ba37
  Committed by TTerror on Sep 03, 2021

02 Sep, 2021 1 commit

Add SVD Op and its GPU and CPU kernel (#34953) · 7e5fb462

Committed by xiongkun on Sep 02, 2021

* Add SVD Op and it's GPU and CPU kernel

* Remove CUDAPlace in test_svd_op, make the test available in CPU package

* modify the file

* fix windows bug / fix ROCM / fix test timeout

* to pass the CIs

* improve error report

* for code review

* some modification to test_svd_op

* change python code style

* expose the svd interface for document


01 Sep, 2021 1 commit
- support KL label smooth (#35177) · 7ca28bb6
  Committed by QingshuChen on Sep 01, 2021
```
* support KL label smooth

* update UT for KL label_smooth
```

31 Aug, 2021 4 commits

Revert "Revert "Add copy from tensor (#34406)" (#35173)" (#35256) · 6116f9af
Committed by Shang Zhizhou on Aug 31, 2021
```
* Revert "Revert "Add copy from tensor (#34406)" (#35173)"

This reverts commit 32c1ec42.

* add template instantiation
```

fix bug that cmake find python (#35304) · 00c9aeb0
Committed by zhouweiwei2014 on Aug 31, 2021

New whl release strategy with pruned nv_fatbin (#35239) · 2f3b393d

Committed by Zhanlue Yang on Aug 31, 2021

[Background]
Code size growth can be irreversible in the long run, leading to huge release packages that
not only hamper the user experience but also exceed a hard limit of pypi.

In particular, the NV_FATBIN section takes up 86% of the compiled dylib size, owing to the vast
number of GPU arches supported.

This PR aims to prune the NV_FATBIN section.

[Solution]
The new release strategy involves two types of whl packages:

Cubin PIP package:
This package maintains a smaller window of GPU arch support, containing
sm_60, sm_70, sm_75, sm_80 cubins, covering the Pascal through Ampere arches.

JIT release package:
This is a backup for the Cubin PIP package, containing compute_35, compute_50, compute_60,
compute_70, compute_75, compute_80, with the best performance and GPU arch coverage.

However, it takes around 10 min to install due to JIT compilation.

[How to use]
The new release strategy is disabled by default.
To compile the Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
To compile the JIT release package, add this to cmake: -DJIT_RELEASE_WHL
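
To make the difference between the two package types concrete, here is a minimal sketch of how the two arch lists from the PR description could translate into nvcc `-gencode` options. The helper name `gencode_flags` and the strategy labels are hypothetical, and the PR's actual CMake logic is not shown here. The sketch relies only on standard nvcc behaviour: `code=sm_XX` embeds a precompiled cubin, while `code=compute_XX` embeds PTX that the driver JIT-compiles on the target machine.

```python
from typing import List

# Arch lists taken from the PR description.
CUBIN_ARCHS = [60, 70, 75, 80]
JIT_ARCHS = [35, 50, 60, 70, 75, 80]


def gencode_flags(strategy: str) -> List[str]:
    """Return nvcc -gencode options for a release strategy.

    'cubin' and 'jit' are labels invented for this example; they are
    not the CMake option names used by the PR.
    """
    if strategy == "cubin":
        # Precompiled machine code (SASS) only: no JIT step at install or
        # first run, but only the listed arches are supported.
        return [f"-gencode arch=compute_{a},code=sm_{a}" for a in CUBIN_ARCHS]
    if strategy == "jit":
        # PTX only: compiled by the driver for the GPU actually present,
        # which costs time up front but covers more arches.
        return [f"-gencode arch=compute_{a},code=compute_{a}" for a in JIT_ARCHS]
    raise ValueError(f"unknown strategy: {strategy!r}")


for name in ("cubin", "jit"):
    print(name, " ".join(gencode_flags(name)))
```

The trade-off matches the one described in the PR: the cubin variant needs no JIT compilation but covers fewer arches, while the PTX variant covers more arches at the price of the compilation delay mentioned above.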

fix CI skip cc test error (#35264) · 3d76d003
Committed by wuhuanzhou on Aug 31, 2021
```
* fix CI skip cc test error, test=develop

* remove test code, test=develop
```

27 Aug, 2021 1 commit
- Revert "Add copy from tensor (#34406)" (#35173) · 32c1ec42
  Committed by zhangchunle on Aug 27, 2021
```
This reverts commit ac33c0ca.
```

26 Aug, 2021 1 commit

Add copy from tensor (#34406) · ac33c0ca

Committed by Shang Zhizhou on Aug 26, 2021

* add api

* temp save

* revert

* copytocpu async ok

* fix style

* copy sync ok

* fix compile error

* fix compile error

* api done

* update python async api

* fix compile

* remove async python api; add c++ async unittest

* remove python async api

* update unittest

* update unittest

* add C++ unittest for copytensor

* add unittest

* update namespace utils to class TensorUtils

* add unittest

* update unittest

* update unittest

* update code style

* update code style

* update unittest


25 Aug, 2021 2 commits
- strip inference.so and make link to mkldnn.so (#34895) · 086540cc
  Committed by Wilber on Aug 25, 2021
- update elementwise api in kunlun (#35021) · ff96a7d5
  Committed by taixiurong on Aug 25, 2021

23 Aug, 2021 1 commit
- upgrade oneDNN to v2.3.2 (#35040) · a047c139
  Committed by lidanqing on Aug 23, 2021

16 Aug, 2021 1 commit

Jetson nano bilinear (#34751) · 2a4ed087

Committed by feng_shuai on Aug 16, 2021

* change bilinear thread for nano and tx2

* change bilinear thread for nano and tx2


10 Aug, 2021 1 commit

copy boost/any.hpp to utils and replace boost::any with self defined any (#34613) · 12892929

Committed by chentianyu03 on Aug 10, 2021

* add any.hpp to utils and replace boost::any with self defined paddle::any

* add copy any.hpp to custom op depends

* modify any.hpp include path

* remove boost from setup.py.in

* add copy any.hpp to custom op depends

* move any.hpp to paddle/utils/ dirs

* move any.h to the extension/include directory

* copy utils to the right directories


09 Aug, 2021 1 commit
- Increase the speed of incremental compilation (#34616) · aab4d6e4
  Committed by zhouweiwei2014 on Aug 09, 2021

06 Aug, 2021 1 commit
- add get xpu version api (#34594) · 8a9dc5dc
  Committed by TTerror on Aug 06, 2021

03 Aug, 2021 1 commit
- support Kunlun2 (#34459) · 2d0f3d9b
  Committed by QingshuChen on Aug 03, 2021
```
* support Kunlun2

* support KL2

* support KL2
```

29 Jul, 2021 1 commit
- Improve sccache hit rate and avoid absolute path (#34435) · 92d8fed8
  Committed by zhouweiwei2014 on Jul 29, 2021

21 Jul, 2021 1 commit
- Polish windows compile for Ninja, fix UT random compile (#34237) · 05805d91
  Committed by zhouweiwei2014 on Jul 21, 2021
```
* polish windows compile for Ninja, fix random compile fail

* polish windows compile for Ninja, fix random compile fail
```

14 Jul, 2021 2 commits
- Support Mac M1 build (#34071) · ec0ea4c5
  Committed by tianshuo78520a on Jul 14, 2021
```
* Support Mac M1 make

* cmake version check
```
- Support sccache to speed up compilation on Windows (#34019) · 4ce66826
  Committed by zhouweiwei2014 on Jul 14, 2021
```
* Support sccache to speed up compilation on Windows

* Support sccache to speed up compilation on Windows
```

07 Jul, 2021 1 commit
- [xpu] add dropout & amp ops in xpu place (#33891) · 84e813e3
  Committed by taixiurong on Jul 07, 2021

06 Jul, 2021 1 commit

Add gpu implementation of shuffle_batch_op (#33938) · c6b6ba1f

Committed by Zeng Jinle on Jul 06, 2021

* add gpu implementation of shuffle batch
test=develop

* add thrust cuda patches
test=develop

* fix macro guard

* fix shuffle batch compile on windows/hip

* fix hip compilation error

* refine CMakeLists.txt

* fix windows compile error

* try to fix windows CI compilation error

* fix windows compilation again

* fix shuffle_batch op test on Windows


02 Jul, 2021 2 commits
- update of oneDNN to 2.3 final (#33923) · 15451c61
  Committed by Jacek Czaja on Jul 02, 2021
- update xpu cmake (#33906) · 4c352033
  Committed by TTerror on Jul 02, 2021

29 Jun, 2021 2 commits
- xpu support amp (#33809) · 4d4fb660
  Committed by taixiurong on Jun 29, 2021
- support Ninja, establish dependencies relationship between paddle with third_party (#33140) · 43c38c67
  Committed by Zhou Wei on Jun 29, 2021
```
* support Ninja and establish dependencies relationship between paddle with third_party

* fix CI

* support Ninja
```

24 Jun, 2021 1 commit
- fix unittest can't get cuda error message correctly (#33743) · 3946afc4
  Committed by Zhou Wei on Jun 24, 2021

22 Jun, 2021 1 commit
- Updated of oneDNN to 2.3 + bugfixes (#33702) · 246da751
  Committed by Jacek Czaja on Jun 22, 2021

21 Jun, 2021 1 commit
- update trt version from major to full (#33690) · 2d7ef7ad
  Committed by Wilber on Jun 21, 2021

18 Jun, 2021 2 commits
- polish windows ci (#32964) · 39556a44
  Committed by Zhou Wei on Jun 18, 2021
- [XPU] Add xpu include and so into inference third_party (#33641) · cca44c1d
  Committed by Wilber on Jun 18, 2021

17 Jun, 2021 1 commit
- Relax the constraint of installed openblas from version==0.3.7 to >=0.3.7 (#33626) · ab0272eb
  Committed by Leo Chen on Jun 17, 2021

16 Jun, 2021 1 commit
- Add bitwise_and/or/xor/not OP/API and unittest (#33524) · ecc05377
  Committed by Zhou Wei on Jun 16, 2021

15 Jun, 2021 1 commit
- [XPU] Update cmake options for xpu. (#33450) · e47c3f04
  Committed by Wilber on Jun 15, 2021

Crayon鑫 / Paddle is in sync with its fork source project
