- 31 Aug, 2021 2 commits
-
-
Committed by Zhanlue Yang
[Background] Expansion in code size can be irreversible in the long run, leading to huge release packages, which not only hamper the user experience but also exceed a hard limit of PyPI. The NV_FATBIN section takes up 86% of the compiled dylib size, owing to the large number of GPU architectures supported. This PR aims to prune NV_FATBIN.
[Solution] The new release strategy involves two types of whl packages:
* Cubin PIP package: maintains a smaller window of GPU architecture support, containing sm_60, sm_70, sm_75, and sm_80 cubins, which covers the Pascal through Ampere architectures.
* JIT release package: a backup for the Cubin PIP package, containing compute_35, compute_50, compute_60, compute_70, compute_75, and compute_80, with the best performance and GPU architecture coverage. However, it takes around 10 min to install due to JIT compilation.
[How to use] The new release strategy is disabled by default.
* To compile the Cubin PIP package, add this to cmake: -DCUBIN_RELEASE_PIP
* To compile the JIT release package, add this to cmake: -DJIT_RELEASE_WHL
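As a rough illustration of the difference between the two package types, the sketch below maps each one onto nvcc `-gencode` flags: `code=sm_XX` embeds a precompiled cubin, while `code=compute_XX` embeds PTX only, which the driver JIT-compiles later. The architecture lists come from the commit message; the function names and the package keys `cubin_pip` and `jit_whl` are hypothetical and are not taken from the PR's actual CMake changes.

```python
# Architecture lists are copied from the commit message; everything else
# (names, structure) is an illustrative assumption, not Paddle's build code.
CUBIN_PIP_ARCHS = [60, 70, 75, 80]        # sm_XX cubins, Pascal to Ampere
JIT_WHL_ARCHS = [35, 50, 60, 70, 75, 80]  # compute_XX PTX, JIT-compiled later


def gencode_flags(archs, embed_cubin):
    """Build nvcc -gencode flags for the given compute capabilities.

    embed_cubin=True  gives code=sm_XX      (precompiled machine code)
    embed_cubin=False gives code=compute_XX (PTX only, compiled by the driver)
    """
    target = "sm" if embed_cubin else "compute"
    return [f"-gencode arch=compute_{a},code={target}_{a}" for a in archs]


def flags_for(package):
    """Return the flag list for a hypothetical package key."""
    if package == "cubin_pip":
        return gencode_flags(CUBIN_PIP_ARCHS, embed_cubin=True)
    if package == "jit_whl":
        return gencode_flags(JIT_WHL_ARCHS, embed_cubin=False)
    raise ValueError(f"unknown package type: {package}")


if __name__ == "__main__":
    for name in ("cubin_pip", "jit_whl"):
        print(name, " ".join(flags_for(name)))
```

Under these assumptions, the cubin variant runs immediately on the four listed architectures, while the PTX-only variant covers more GPUs at the cost of a JIT compilation step.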
-
Committed by wuhuanzhou
* fix CI skip cc test error, test=develop
* remove test code, test=develop
-
- 27 Aug, 2021 1 commit
-
-
Committed by zhangchunle
This reverts commit ac33c0ca.
-
- 26 Aug, 2021 1 commit
-
-
Committed by Shang Zhizhou
* add api
* temp save
* revert
* copytocpu async ok
* fix style
* copy sync ok
* fix compile error
* fix compile error
* api done
* update python async api
* fix compile
* remove async python api; add c++ async unittest
* remove python async api
* update unittest
* update unittest
* add C++ unittest for copytensor
* add unittest
* update namespace utils to class TensorUtils
* add unittest
* update unittest
* update unittest
* update code style
* update code style
* update unittest
-
- 25 Aug, 2021 2 commits
-
-
Committed by Wilber
-
Committed by taixiurong
-
- 23 Aug, 2021 1 commit
-
-
Committed by lidanqing
-
- 16 Aug, 2021 1 commit
-
-
Committed by feng_shuai
* change bilinear thread for nano and tx2
* change bilinear thread for nano and tx2
-
- 10 Aug, 2021 1 commit
-
-
Committed by chentianyu03
* add any.hpp to utils and replace boost::any with self-defined paddle::any
* add copy any.hpp to custom op depends
* modify any.hpp include path
* remove boost from setup.py.in
* add copy any.hpp to custom op depends
* move any.hpp to paddle/utils/ dirs
* move any.h to extension/include directory
* copy utils to right directories
-
- 09 Aug, 2021 1 commit
-
-
Committed by zhouweiwei2014
-
- 06 Aug, 2021 1 commit
-
-
Committed by TTerror
-
- 03 Aug, 2021 1 commit
-
-
Committed by QingshuChen
* support Kunlun2
* support KL2
* support KL2
-
- 29 Jul, 2021 1 commit
-
-
Committed by zhouweiwei2014
-
- 21 Jul, 2021 1 commit
-
-
Committed by zhouweiwei2014
* polish windows compile for Ninja, fix random compile fail
* polish windows compile for Ninja, fix random compile fail
-
- 14 Jul, 2021 2 commits
-
-
Committed by tianshuo78520a
* Support Mac M1 make
* cmake version check
-
Committed by zhouweiwei2014
* Support sccache to speed up compilation on Windows
* Support sccache to speed up compilation on Windows
-
- 07 Jul, 2021 1 commit
-
-
Committed by taixiurong
-
- 06 Jul, 2021 1 commit
-
-
Committed by Zeng Jinle
* add gpu implementation of shuffle batch test=develop
* add thrust cuda patches test=develop
* fix macro guard
* fix shuffle batch compile on windows/hip
* fix hip compilation error
* refine CMakeLists.txt
* fix windows compile error
* try to fix windows CI compilation error
* fix windows compilation again
* fix shuffle_batch op test on Windows
-
- 02 Jul, 2021 2 commits
-
-
Committed by Jacek Czaja
-
Committed by TTerror
-
- 29 Jun, 2021 2 commits
-
-
Committed by taixiurong
-
Committed by Zhou Wei
* support Ninja and establish dependency relationships between paddle and third_party
* fix CI
* support Ninja
-
- 24 Jun, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 22 Jun, 2021 1 commit
-
-
Committed by Jacek Czaja
-
- 21 Jun, 2021 1 commit
-
-
Committed by Wilber
-
- 18 Jun, 2021 2 commits
- 17 Jun, 2021 1 commit
-
-
Committed by Leo Chen
-
- 16 Jun, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 15 Jun, 2021 1 commit
-
-
Committed by Wilber
-
- 09 Jun, 2021 1 commit
-
-
Committed by Leo Chen
-
- 08 Jun, 2021 1 commit
-
-
Committed by TTerror
-
- 07 Jun, 2021 1 commit
-
-
Committed by lidanqing
-
- 02 Jun, 2021 2 commits
- 01 Jun, 2021 2 commits
-
-
Committed by Zhou Wei
-
Committed by chentianyu03
* replace and remove complex64/128 types in custom OP and other files
* fix custom_tensor_test fail bug
* fix custom_conj_test fail bug
* fix dispatch_test_op build fail bug
-
- 28 May, 2021 1 commit
-
-
Committed by Zhou Wei
-
- 27 May, 2021 2 commits
-
-
Committed by Thunderbrook
* support ssd in PsCore
* remove log
* remove bz2
* default value
* code style
* parse table class
* code style
* add define
-
Committed by Zhou Wei
* Unify all external API error message mechanism and enhance third-party API error msg
* fix some comment
* fix some comment
-