- 16 Sep, 2021 1 commit
-
-
Committed by crystal
-
- 14 Sep, 2021 1 commit
-
-
Committed by Sing_chan
* new function: share third party cache among servers to speed up builds
* modified code according to zhouwei25's comment
* add wget install step, move cd build to the end of the if condition
* block notes and errors of third_party sharing; change bce upload method
* change third_party sub_dir in bos, since third_party builds for different CUDA versions can't be shared
* set sub_dir from the nvcc version
* change third_party local path to match the bos path
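The idea of keying the shared cache directory on the nvcc version can be sketched as follows. This is only an illustration, not the script from the commit: the function names and the `third_party/cuda<version>` layout are hypothetical, and the sketch assumes the usual `release X.Y` text printed by `nvcc --version`.

```python
import re


def cuda_version_from_nvcc(nvcc_output):
    """Extract 'X.Y' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return f"{match.group(1)}.{match.group(2)}"


def third_party_sub_dir(nvcc_output, root="third_party"):
    """Build a cache sub-directory name that differs per CUDA version.

    The 'root/cudaX.Y' layout is a hypothetical choice for this sketch.
    """
    return f"{root}/cuda{cuda_version_from_nvcc(nvcc_output)}"


if __name__ == "__main__":
    sample = "Cuda compilation tools, release 11.2, V11.2.152"
    print(third_party_sub_dir(sample))
```

Using the same string for the local path and the remote path keeps caches built against different CUDA versions from overwriting each other.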
-
- 13 Sep, 2021 1 commit
-
-
Committed by taixiurong
-
- 09 Sep, 2021 1 commit
-
-
Committed by 0x45f
* init matrix_rank op, add matrix_rank CPU code and test
* add GPU kernel, remove svd_eigen.h
* add CPU kernel when tol is tensor
* add cpu and gpu code when tol is tensor
* fix CI-ROCM error
* add matrix_rank API description, fix PR-CI-Py3 error
* fix PR-CI-Windows error, add matrix_rank API test
* delete useless comments
* fix review
* add my code in svd_helper.h
* update doc comments
* remove spaces
-
- 03 Sep, 2021 2 commits
- 02 Sep, 2021 1 commit
-
-
Committed by xiongkun
* Add SVD Op and its GPU and CPU kernels
* Remove CUDAPlace in test_svd_op, make the test available in CPU package
* modify the file
* fix windows bug / fix ROCM / fix test timeout
* to pass the CIs
* improve error report
* for code review
* some modification to test_svd_op
* change python code style
* expose the svd interface for document
-
- 01 Sep, 2021 1 commit
-
-
Committed by QingshuChen
* support KL label smooth
* update UT for KL label_smooth
-
- 31 Aug, 2021 4 commits
-
-
Committed by Shang Zhizhou
* Revert "Revert "Add copy from tensor (#34406)" (#35173)" This reverts commit 32c1ec42.
* add template instantiation
-
Committed by zhouweiwei2014
-
Committed by Zhanlue Yang
[Background] Growth in code size can be irreversible in the long run, leading to huge release packages that not only hamper user experience but also exceed a hard limit of PyPI. In particular, the NV_FATBIN section takes up 86% of the compiled dylib size, owing to the large number of GPU architectures supported. This PR aims to prune NV_FATBIN.
[Solution] In the new release strategy, two types of whl packages are involved:
Cubin PIP package: maintains a smaller window of supported GPU architectures, containing sm_60, sm_70, sm_75, sm_80 cubins, covering the Pascal through Ampere architectures.
JIT release package: a backup for the Cubin PIP package, containing compute_35, compute_50, compute_60, compute_70, compute_75, compute_80, with the best performance and GPU architecture coverage. However, it takes around 10 min to install due to the JIT compilation.
[How to use] The new release strategy is disabled by default.
To compile for the Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
To compile for the JIT release package, add this to cmake: -DJIT_RELEASE_WHL
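The difference between the two package types can be illustrated with the nvcc `-gencode` flags each one would imply. This is a minimal sketch, not the project's actual cmake logic: the architecture lists come from the commit message above, while the helper name `gencode_flags` and the `"cubin"` / `"jit"` labels are hypothetical.

```python
# Architecture lists as stated in the commit message.
CUBIN_ARCHS = [60, 70, 75, 80]          # sm_XX: precompiled machine code
JIT_ARCHS = [35, 50, 60, 70, 75, 80]    # compute_XX: PTX, compiled at load time


def gencode_flags(package):
    """Return nvcc -gencode flags for a package type ('cubin' or 'jit').

    The package labels are hypothetical names used only in this sketch.
    """
    if package == "cubin":
        # code=sm_XX embeds a cubin for that specific architecture.
        return [f"-gencode arch=compute_{a},code=sm_{a}" for a in CUBIN_ARCHS]
    if package == "jit":
        # code=compute_XX embeds PTX, which the driver JIT-compiles.
        return [f"-gencode arch=compute_{a},code=compute_{a}" for a in JIT_ARCHS]
    raise ValueError(f"unknown package type: {package}")


if __name__ == "__main__":
    for kind in ("cubin", "jit"):
        print(kind, " ".join(gencode_flags(kind)))
```

Embedding only PTX gives the wider architecture coverage the message mentions, at the cost of the JIT compilation time it also notes.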
-
Committed by wuhuanzhou
* fix CI skip cc test error, test=develop
* remove test code, test=develop
-
- 27 Aug, 2021 1 commit
-
-
Committed by zhangchunle
This reverts commit ac33c0ca.
-
- 26 Aug, 2021 1 commit
-
-
Committed by Shang Zhizhou
* add api
* temp save
* revert
* copytocpu async ok
* fix style
* copy sync ok
* fix compile error
* api done
* update python async api
* fix compile
* remove async python api; add c++ async unittest
* remove python async api
* update unittest
* add C++ unittest for copytensor
* add unittest
* update namespace utils to class TensorUtils
* add unittest
* update unittest
* update code style
* update unittest
-
- 25 Aug, 2021 2 commits
-
-
Committed by Wilber
-
Committed by taixiurong
-
- 23 Aug, 2021 1 commit
-
-
Committed by lidanqing
-
- 16 Aug, 2021 1 commit
-
-
Committed by feng_shuai
* change bilinear thread for nano and tx2
-
- 10 Aug, 2021 1 commit
-
-
Committed by chentianyu03
* add any.hpp to utils and replace boost::any with self-defined paddle::any
* add copy any.hpp to custom op depends
* modify any.hpp include path
* remove boost from setup.py.in
* add copy any.hpp to custom op depends
* move any.hpp to paddle/utils/ dirs
* move any.h to extension/include directory
* copy utils to the right directories
-
- 09 Aug, 2021 1 commit
-
-
Committed by zhouweiwei2014
-
- 06 Aug, 2021 1 commit
-
-
Committed by TTerror
-
- 03 Aug, 2021 1 commit
-
-
Committed by QingshuChen
* support Kunlun2
* support KL2
-
- 29 Jul, 2021 1 commit
-
-
Committed by zhouweiwei2014
-
- 21 Jul, 2021 1 commit
-
-
Committed by zhouweiwei2014
* polish windows compile for Ninja, fix random compile fail
-
- 14 Jul, 2021 2 commits
-
-
Committed by tianshuo78520a
* Support Mac M1 make
* cmake version check
-
Committed by zhouweiwei2014
* Support sccache to speed up compilation on Windows
-
- 07 Jul, 2021 1 commit
-
-
Committed by taixiurong
-
- 06 Jul, 2021 1 commit
-
-
Committed by Zeng Jinle
* add gpu implementation of shuffle batch test=develop
* add thrust cuda patches test=develop
* fix macro guard
* fix shuffle batch compile on windows/hip
* fix hip compilation error
* refine CMakeLists.txt
* fix windows compile error
* try to fix windows CI compilation error
* fix windows compilation again
* fix shuffle_batch op test on Windows
-
- 02 Jul, 2021 2 commits
-
-
Committed by Jacek Czaja
-
Committed by TTerror
-
- 29 Jun, 2021 2 commits
-
-
Committed by taixiurong
-
Committed by Zhou Wei
* support Ninja and establish dependency relationships between paddle and third_party
* fix CI
* support Ninja
-
- 24 Jun, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 22 Jun, 2021 1 commit
-
-
Committed by Jacek Czaja
-
- 21 Jun, 2021 1 commit
-
-
Committed by Wilber
-
- 18 Jun, 2021 2 commits
- 17 Jun, 2021 1 commit
-
-
Committed by Leo Chen
-
- 16 Jun, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 15 Jun, 2021 1 commit
-
-
Committed by Wilber
-