- 19 Dec, 2020 1 commit
-
-
Committed by Jacek Czaja
* Reimplemented elementwise_add grad * lint * fix after review * Fix to fix after review
-
- 18 Dec, 2020 1 commit
-
-
Committed by Aurelius84
-
- 17 Dec, 2020 3 commits
-
-
Committed by wanghuancoder
* Windows: generate pdb and dump for debugging * fix code style, test=develop * modify cmakelist
-
Committed by Huihuang Zheng
Modify CublasHandleHolder to use PADDLE_RETRY_CUDA_SUCCESS instead of PADDLE_ENFORCE_CUDA_SUCCESS to fix a random unittest failure. The unittest log showed a CUDA allocation error in this file, which may be due to insufficient GPU resources. We fixed a similar failure in the past, so we applied PADDLE_RETRY_CUDA_SUCCESS here.
-
Committed by Jacek Czaja
-
- 16 Dec, 2020 2 commits
- 15 Dec, 2020 1 commit
-
-
Committed by AshburnLee
-
- 14 Dec, 2020 2 commits
-
-
Committed by arlesniak
-
Committed by Jacek Czaja
-
- 11 Dec, 2020 1 commit
-
-
Committed by taixiurong
* 1. fix matmul bug 2. add one hot * add xpu error msg
-
- 09 Dec, 2020 1 commit
-
-
Committed by Huihuang Zheng
Add sleep time for CUDA retry, similar to our GPU retry logic. This is an attempt to avoid random GPU allocation failures during initialization in unit tests.
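
The retry-with-sleep idea in this commit message can be illustrated with a minimal Python sketch. It is not the implementation of `PADDLE_RETRY_CUDA_SUCCESS` (a C++ macro in Paddle); the function name `retry_with_sleep`, its parameters, and the default values are hypothetical and chosen only for illustration.

```python
import time


def retry_with_sleep(fn, is_retryable, max_retries=3, sleep_seconds=0.1,
                     sleep=time.sleep):
    """Call fn(); on a retryable failure, sleep and then try again.

    All names and defaults here are assumptions for illustration only.
    `sleep` is injectable so the waiting can be observed without delay.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except Exception as exc:
            # Give up when retries are exhausted or the error is not transient.
            if attempt >= max_retries or not is_retryable(exc):
                raise
            attempt += 1
            # Pause so that a transient condition has a chance to clear.
            sleep(sleep_seconds)
```

Sleeping between attempts matters because an immediate retry is likely to hit the same transient condition; the pause gives other processes time to release the resource.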
-
- 08 Dec, 2020 1 commit
-
-
Committed by jakpiase
* added external reorder to profiler * added external and internal reorders to profiler * added internal and external reorder to profiler * added formatting to int/ext reorder commit * removed unnecessary comment
-
- 07 Dec, 2020 1 commit
-
-
Committed by Jack Zhou
-
- 04 Dec, 2020 4 commits
-
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types * add test cases for complex elementwise, matmul and getitem unittest * add test cases for complex types * add test cases for complex matmul unittest * kron, reshape, transpose support complex types * sum and trace op support complex types * add test case of sum and trace op * fix the bug of imag part of complex not initialized * format file * format code style * kron support type promotion; modify test cases
-
Committed by 卖鱼的哲学
* fix expand && concat/transpose to new api * update uniform_random_op * update xpu_header
-
Committed by lilong12
-
Committed by Chen Weihang
* basic impl of type promote * add comment & another testcase * fix complex bugs & support python op promote type * fix failed unittests & polish code * add unittest for coverage * change to only promote complex type * polish code details * polish several comments
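
As a rough illustration of "only promote complex type", the sketch below promotes a pair of dtypes only when at least one operand is complex. The function name, the string dtype names, and the width rule (borrowed from NumPy-style promotion for float32/float64/complex64/complex128 only) are assumptions; the commit message does not specify Paddle's actual promotion table.

```python
# Hypothetical dtype names and rule; not Paddle's actual promotion table.
COMPLEX = {"complex64", "complex128"}
DOUBLE_PRECISION = {"float64", "complex128"}


def promote_complex_only(lhs, rhs):
    """Return the promoted dtype name, or None when no promotion applies.

    Assumption: only float32, float64, complex64 and complex128 are handled.
    """
    if lhs not in COMPLEX and rhs not in COMPLEX:
        # No complex operand: this sketch leaves the dtypes untouched.
        return None
    if lhs in DOUBLE_PRECISION or rhs in DOUBLE_PRECISION:
        # A double-precision operand forces the wider complex type.
        return "complex128"
    return "complex64"
```

Returning `None` for non-complex pairs keeps the sketch faithful to the "only complex" restriction instead of guessing at rules for other dtype combinations.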
-
- 01 Dec, 2020 2 commits
-
-
Committed by QingshuChen
* update conv2d & softmax to new xpu api * test=kunlun * remove useless comments * test=kunlun * remote softmax xpu op * test=kunlun * update kunlun softmax * test=kunlun * update xpu unittest * test=kunlun * fix elementwise_grad bug for kunlun *test=kunlun
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types * add test cases for complex elementwise, matmul and getitem unittest * add test cases for complex types * add test cases for complex matmul unittest
-
- 27 Nov, 2020 5 commits
-
-
Committed by ShenLiang
* add reducer * refine event for memorycopy * add concat&split for allreduce * apply concat & split for fuse tensor * fix nccl dep * fix the unittest, compile problem and ddp initialization problem * fix unittest for mac & add some comments & solve the repeated param in sublayers * fix unittest for windows & fix document
-
Committed by Zhou Wei
-
Committed by arlesniak
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake * compile with cuda9 * add some unittest * notest;test=coverage * add unittest for trt plugin swish && split * update ernie unittest * fix some error message * remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter * fix compile error when CUDA_ARCH_NAME < Pascal * fix compile error * update unittest timeout * compile with cuda9 * update error msg * fix code style * add some comments * add define IF_CUDA_ARCH_SUPPORT_FP16 * rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
Committed by Leo Chen
-
- 26 Nov, 2020 1 commit
-
-
Committed by Aurelius84
-
- 25 Nov, 2020 2 commits
-
-
Committed by Chen Weihang
* do not show cpp stack by default & add hint * fix failed unittest * fix failed unittests
-
Committed by wawltor
remove eigen threadpool to speed up
-
- 23 Nov, 2020 2 commits
-
-
Committed by Jacek Czaja
-
Committed by Pei Yang
* change avg pooling and global pooling to trt layer * add support for static shape global pooling * modify trt errmsg
-
- 20 Nov, 2020 2 commits
-
-
Committed by gongweibao
-
Committed by QingshuChen
* adjust kunlun header file *test=kunlun * update kunlun unittest *test=kunlun * update xpu unittest * test = kunlun * update xpu unittest * test=kunlun * update xpu unittest * test=kunlun
-
- 17 Nov, 2020 2 commits
-
-
Committed by Jacek Czaja
-
Committed by lilong12
-
- 13 Nov, 2020 1 commit
-
-
Committed by Zhou Wei
-
- 04 Nov, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 03 Nov, 2020 4 commits
-
-
Committed by Shang Zhizhou
* fp16 result ok * change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS * auto detect special slice op converter for ernie with trt oss * ernie oss only support fp16 * fix special_slice_plugin serialize bug * matmul in tensorrt ok * ernie unittest ok * add matmul tensorrt unittest * remove demo code
-
Committed by Jacek Czaja
-
Committed by Wilber
-
Committed by Guo Sheng
* Add rnn_op. test=develop * Fix rnn_op grad maker's drop_empty_grad. test=develop
-