- 07 Jan, 2020 8 commits
-
-
Committed by liu zhengxi
-
Committed by Yiqun Liu
test=develop
-
Committed by Pei Yang
* add TRT support for instance_norm op
-
Committed by zhaoyuchen2018
conv_fusion failed on Windows as there was no kernel; explicitly declare the lambda
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by Chengmo
* add special way to add distribute vars, update Pyramid hash op
-
Committed by bingyanghuang
-
Committed by Feiyu Chan
* add erf op and python interface.
* add fp16 support for erf op.
* add unit tests for erf op and its python interface.
-
Committed by Chen Weihang
-
- 06 Jan, 2020 8 commits
-
-
Committed by silingtong123
-
Committed by Double_V
* support elu activation double grad, test=develop
* delete the code commit in .cc, test=develop
* fix failing relu test, test=develop
* add elu double grad kernel and unit test
* add calculation of dX in elu double grad functor, test=develop
* update the commit code, test=develop
-
Committed by Pei Yang
* add gelu plugin
* align trt bert with gpu
* add support for fused fc with relu
* add unittest for bert trt
-
Committed by Jacek Czaja
-
Committed by Huihuang Zheng
-
Committed by Zeng Jinle
-
Committed by Zeng Jinle
-
Committed by 123malin
* add distributed_strategy
-
- 05 Jan, 2020 1 commit
-
-
Committed by Jacek Czaja
-
- 04 Jan, 2020 1 commit
-
-
Committed by Kaipeng Deng
-
- 03 Jan, 2020 5 commits
-
-
Committed by SunAhong1993
* register int/int64_t/float16 in pow/square kernel, test=develop
* add abs/square/exp type, test=develop
-
Committed by Leo Chen
* fix test_conv2d_ngraph for grad diff, test=develop
* register NoNeedBufferVarsInference for max_pool_grad_op, test=develop
* refine error message, test=develop
* fix numpy, test=develop
* disable test conv2d_ngraph_op, test=develop
Co-authored-by: Zhang Ting <709968123@qq.com>
-
Committed by Yiqun Liu
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Refine the calling of PADDLE_ENFORCE. test=develop
-
Committed by Michał Gallus
-
Committed by FDInSky
* fix generate_proposal_labels op, test=develop
-
- 02 Jan, 2020 2 commits
-
-
Committed by ceci3
* update error information about batch_norm_grad
* update bn, test=develop
-
Committed by Aurelius84
* fix integer overflow in match_matrix test=develop
* fix integer overflow in match_matrix test=develop
* fix typo test=develop
-
- 01 Jan, 2020 1 commit
-
-
Committed by Chen Weihang
-
- 31 Dec, 2019 1 commit
-
-
Committed by wangchaochaohu
-
- 30 Dec, 2019 4 commits
-
-
Committed by Chen Weihang
-
Committed by Chen Weihang
-
Committed by zhouwei25
-
Committed by danleifeng
-
- 29 Dec, 2019 1 commit
-
-
Committed by liu zhengxi
* fix seqconv_eltadd_relu pass during multi-threads predictor, test=develop
* fix attention_lstm_fuse_pass during multi-threads inference, test=develop
* fix embedding_fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix fc_lstm_fuse_pass during multi-threads inference, test=develop
* fix seq_concat_fc_fuse_pass during multi-threads inference, test=develop
-
- 27 Dec, 2019 5 commits
-
-
Committed by zhaoyuchen2018
* Refine multihead kernel, align block to 32 test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine log comments test=develop
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by silingtong123
-
Committed by zhoushiyu
* add shuffle batch op, test=develop, test=document_preview
* fix size_t conflict and check_output test=develop, test=document_preview
* fix bug test=develop, test=document_preview
* add unittest of shuffle_batch layer test=develop, test=document_preview
* fix py coverage and op input type, test=develop, test=document_preview
* fix py coverage, test=develop
* fix en doc, test=develop
* move to contrib test=develop
* add unique_name test=develop
* invoke shuffle_batch in contrib.layers test=develop
-
Committed by mapingshuo
* make reverse op support negative axis
-
Committed by 石晓伟
* fix multi-thread error of fc_gru_fuse_pass.cc, test=develop
* export FLAGS and GLOG symbols, test=develop
-
- 26 Dec, 2019 3 commits
-
-
Committed by Aurelius84
* fix compile error in CUDA10 test=develop
* remove double in pad2d test=develop
-
Committed by zhouwei25
* Fix openblas to support compiling on Windows when WITH_MKL=OFF
-
Committed by hutuxian
* fix stat shape back in global auc scenario
* add UT to cover global auc
-