- 09 Dec, 2020 1 commit
-
-
Committed by Huihuang Zheng
Add sleep time to the CUDA retry, similar to our GPU retry logic. This is an attempt to avoid random failures of the initial GPU allocation in unit tests.
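The retry-with-sleep idea described in this commit can be sketched roughly as follows. This is a minimal illustration in Python, not the actual change: the function name `retry_with_sleep`, its parameters, and the choice of `RuntimeError` as the failure signal are all assumptions made for the example.

```python
import time


def retry_with_sleep(init_fn, max_retries=3, sleep_seconds=0.5, sleep_fn=time.sleep):
    """Call init_fn until it succeeds, sleeping between failed attempts.

    All names here are hypothetical; they only illustrate the pattern.
    """
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return init_fn()
        except RuntimeError as err:
            last_error = err
            # Sleep only if another attempt will follow, to give other
            # processes time to release the resource.
            if attempt < max_retries:
                sleep_fn(sleep_seconds)
    raise RuntimeError(
        f"initialization failed after {max_retries} attempts"
    ) from last_error
```

Passing `sleep_fn` as a parameter is a design choice for the sketch only: it lets a test substitute a fake sleep and run without real delays.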
-
- 08 Dec, 2020 1 commit
-
-
Committed by jakpiase
* added external reorder to profiler
* added external and internal reorders to profiler
* added internal and external reorder to profiler
* added formatting to int/ext reorder commit
* removed unnecessary comment
-
- 07 Dec, 2020 1 commit
-
-
Committed by Jack Zhou
-
- 04 Dec, 2020 4 commits
-
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
* kron, reshape, transpose support complex types
* sum and trace op support complex types
* add test case of sum and trace op
* fix the bug of imag part of complex not initialized
* format file
* format code style
* kron support type promotion; modify test cases
-
Committed by 卖鱼的哲学
* fix expand && concat/transpose to new api
* update uniform_random_op
* update xpu_header
-
Committed by lilong12
-
Committed by Chen Weihang
* basic impl of type promote
* add comment & another testcase
* fix complex bugs & support python op promote type
* fix failed unittests & polish code
* add unittest for coverage
* change to only promote complex type
* polish code details
* polish several comments
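The "only promote complex type" rule mentioned above might look something like the sketch below. It is an illustration under stated assumptions, not the framework's implementation: the function name, the dtype strings, and the width-pairing table (float32 with complex64, float64 with complex128, following the usual NumPy convention) are all hypothetical choices for the example.

```python
# Assumed pairing of each supported dtype with the complex type of equal width.
TO_COMPLEX = {
    "float32": "complex64",
    "float64": "complex128",
    "complex64": "complex64",
    "complex128": "complex128",
}
COMPLEX_RANK = {"complex64": 1, "complex128": 2}


def promote_complex_only(lhs, rhs):
    """Return the result dtype name, promoting only when a complex type is involved."""
    if lhs == rhs:
        return lhs  # identical dtypes never need promotion
    if lhs in COMPLEX_RANK or rhs in COMPLEX_RANK:
        if lhs not in TO_COMPLEX or rhs not in TO_COMPLEX:
            raise TypeError(f"unsupported dtype pair: {lhs}, {rhs}")
        # Lift both sides to complex, then keep the wider of the two.
        candidates = (TO_COMPLEX[lhs], TO_COMPLEX[rhs])
        return max(candidates, key=COMPLEX_RANK.__getitem__)
    # Mixed non-complex dtypes are deliberately not promoted in this sketch.
    raise TypeError(f"dtype mismatch without complex operand: {lhs}, {rhs}")
```

For instance, under these assumptions `promote_complex_only("float64", "complex64")` yields `"complex128"`, while `promote_complex_only("float32", "float64")` raises `TypeError`.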
-
- 01 Dec, 2020 2 commits
-
-
Committed by QingshuChen
* update conv2d & softmax to new xpu api
* test=kunlun
* remove useless comments
* test=kunlun
* remote softmax xpu op
* test=kunlun
* update kunlun softmax
* test=kunlun
* update xpu unittest
* test=kunlun
* fix elementwise_grad bug for kunlun
* test=kunlun
-
Committed by chentianyu03
* add complex64 and complex128 type; add +-*/@ and slice operator for complex types
* add test cases for complex elementwise, matmul and getitem unittest
* add test cases for complex types
* add test cases for complex matmul unittest
-
- 27 Nov, 2020 5 commits
-
-
Committed by ShenLiang
* add reducer
* refine event for memorycopy
* add concat&split for allreduce
* apply concat & split for fuse tensor
* fix nccl dep
* fix the unittest, compile problem and ddp initialize problem
* fix unittest for mac & add some comments & solve the repeated param in sublayers
* fix unittest for windows & fix document
-
Committed by Zhou Wei
-
Committed by arlesniak
-
Committed by Shang Zhizhou
* remove -DSUPPORTS_CUDA_FP16 in cuda.cmake
* compile with cuda9
* add some unittest
* notest;test=coverage
* add unittest for trt plugin swish && split
* update ernie unittest
* fix some error message
* remove repeated judgement of CUDA version in mbEltwiseLayerNormOpConverter
* fix compile error when CUDA_ARCH_NAME < Pascal
* fix compile error
* update unittest timeout
* compile with cuda9
* update error msg
* fix code style
* add some comments
* add define IF_CUDA_ARCH_SUPPORT_FP16
* rename IF_CUDA_ARCH_SUPPORT_FP16 to CUDA_ARCH_FP16_SUPPORTED
-
Committed by Leo Chen
-
- 26 Nov, 2020 1 commit
-
-
Committed by Aurelius84
-
- 25 Nov, 2020 2 commits
-
-
Committed by Chen Weihang
* default not show cpp stack & add hint
* fix failed unittest
* fix failed unittests
-
Committed by wawltor
remove eigen threadpool to speed up
-
- 23 Nov, 2020 2 commits
-
-
Committed by Jacek Czaja
-
Committed by Pei Yang
* change avg pooling and global pooling to trt layer
* add support for static shape global pooling
* modify trt errmsg
-
- 20 Nov, 2020 2 commits
-
-
Committed by gongweibao
-
Committed by QingshuChen
* adjust kunlun header file
* test=kunlun
* update kunlun unittest
* test=kunlun
* update xpu unittest
* test = kunlun
* update xpu unittest
* test=kunlun
* update xpu unittest
* test=kunlun
-
- 17 Nov, 2020 2 commits
-
-
Committed by Jacek Czaja
-
Committed by lilong12
-
- 13 Nov, 2020 1 commit
-
-
Committed by Zhou Wei
-
- 04 Nov, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 03 Nov, 2020 4 commits
-
-
Committed by Shang Zhizhou
* fp16 result ok
* change -DWITH_NVINFER_PLUGIN to config.EnableTensorRtOSS
* auto detect special slice op converter for ernie with trt oss
* ernie oss only support fp16
* fix special_slice_plugin serialize bug
* matmul in tensorrt ok
* ernie unittest ok
* add matmul tensorrt unittest
* remove demo code
-
Committed by Jacek Czaja
-
Committed by Wilber
-
Committed by Guo Sheng
* Add rnn_op. test=develop
* Fix rnn_op grad maker's drop_empty_grad. test=develop
-
- 02 Nov, 2020 2 commits
-
-
Committed by wangchaochaohu
-
Committed by Huihuang Zheng
This PR is a follow-up to #28213. In that PR we tried to decrease GPU usage, but the CI still failed randomly. So I added retry logic for the initialization of nccl and cusolver: if the initialization fails, we retry to avoid the random failure.
-
- 30 Oct, 2020 1 commit
-
-
Committed by Leo Chen
-
- 28 Oct, 2020 1 commit
-
-
Committed by Jacek Czaja
-
- 27 Oct, 2020 1 commit
-
-
Committed by Chen Weihang
* add multiple exception type
* define all exception & polish compile pystack
* mapping paddle error to python exception
* polish static mode error format
* fix failed unittests
* fix dytostatic test_error
* fix check_nan_inf failed
* add unittest for coverage
* revert some code try to solve compile error
* refactor enforce & error change
* polish code & add unittest
-
- 23 Oct, 2020 1 commit
-
-
Committed by Chen Weihang
* add compile limit for paddle enforce
* polish elementwise_op_function.cu.h
* fix failed unittest
* fix windows compile failed
* detail polish
* revert no type constructor
-
- 21 Oct, 2020 1 commit
-
-
Committed by Zhou Wei
-
- 20 Oct, 2020 1 commit
-
-
Committed by wangchaochaohu
-
- 19 Oct, 2020 1 commit
-
-
Committed by Pei Yang
-
- 16 Oct, 2020 1 commit
-
-
Committed by lidanqing
* conv dilated mkldnn support: forward and backward pass
* add mkldnn conv_transpose dilation UT test=develop
* remove unnecessary PADDLE_ENFORCE
* add int8 and bf16 dilated conv UT
* update according to reviews
-
- 14 Oct, 2020 1 commit
-
-
Committed by Zhang Ting
* use exhaustive_search for float16
* tune algo only when dtype is float16
-