- 21 Jan, 2020 1 commit
-
-
Committed by lidanqing
-
- 20 Jan, 2020 1 commit
-
-
Committed by tangwei12
* add half_async in the communicator
* fix DistributedStrategy
-
- 17 Jan, 2020 1 commit
-
-
Committed by qingqing01
-
- 16 Jan, 2020 1 commit
-
-
Committed by Adam
-
- 14 Jan, 2020 3 commits
-
-
Committed by 123malin
* test=develop, bug fix for sparse recorder
-
Committed by FlyingQianMM
-
Committed by Zhen Wang
-
- 10 Jan, 2020 1 commit
-
-
Committed by baojun
-
- 09 Jan, 2020 2 commits
-
-
Committed by zhongpu
* test Optimizer in dygraph, test=develop
* add optest for Optimizer in dygraph, test=develop
* fix adagrad optimizer, test=develop
* fix dpsgd optimizer, test=develop
* fix test_optimizer.py, test=develop
* fix dpsgd optimizer, this op only supports cpu, test=develop
* add optest for optimizer, test=develop
* add description for dpsgd, test=develop
* add rmsprop to white_list in unused_var_check.cc, test=develop
* polish code style, test=develop
* polish code style, test=develop
* delete seed attribute for DpsgdOptimizer, test=develop
* change testing to debugging, test=develop
-
Committed by 石晓伟
-
- 08 Jan, 2020 3 commits
-
-
Committed by zhaoyuchen2018
The stack op's wait costs a lot of CPU time; using a CUDA kernel to do the memory copy reduces CPU time.
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by liu zhengxi
-
Committed by Double_V
1. Add a new input named batch_roi_nums for prroi_pool_op. batch_roi_nums holds the number of RoIs for each image in the batch when rois is a Tensor. When rois is a LoDTensor, this information is stored in the LoD of rois.
2. Add a grad check to prroi_pool_op and fix the abnormal X grad diff on CPU.
-
- 07 Jan, 2020 4 commits
-
-
Committed by zhaoyuchen2018
Windows conv_fusion failed because there was no kernel; declare the lambda explicitly.
Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by Chengmo
* add a special way to add distributed vars, update Pyramid hash op
-
Committed by Feiyu Chan
* add erf op and python interface.
* add fp16 support for erf op.
* add unit tests for erf op and its python interface.
-
Committed by Chen Weihang
-
- 06 Jan, 2020 4 commits
-
-
Committed by Double_V
* support elu activation double grad, test=develop
* delete the code commit in .cc, test=develop
* fix failing relu test, test=develop
* add elu double grad kernel and unit test
* add calculation of dX in elu double grad functor, test=develop
* update the commit code, test=develop
-
Committed by Pei Yang
* add gelu plugin
* align trt bert with gpu
* add support for fused fc with relu
* add unittest for bert trt
-
Committed by Jacek Czaja
-
Committed by 123malin
* add distributed_strategy
-
- 05 Jan, 2020 1 commit
-
-
Committed by Jacek Czaja
-
- 04 Jan, 2020 1 commit
-
-
Committed by Kaipeng Deng
-
- 03 Jan, 2020 5 commits
-
-
Committed by SunAhong1993
* register int/int64_t/float16 in pow/square kernel, test=develop
* add abs/square/exp type, test=develop
-
Committed by Leo Chen
* fix test_conv2d_ngraph for grad diff, test=develop
* register NoNeedBufferVarsInference for max_pool_grad_op, test=develop
* refine error message, test=develop
* fix numpy, test=develop
* disable test conv2d_ngraph_op, test=develop
Co-authored-by: Zhang Ting <709968123@qq.com>
-
Committed by Yiqun Liu
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the code to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation of fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Refine the calling of PADDLE_ENFORCE. test=develop
-
Committed by Michał Gallus
-
Committed by FDInSky
* test=develop fix generate_proposal_labels op
-
- 02 Jan, 2020 2 commits
-
-
Committed by ceci3
* update error information about batch_norm_grad
* update bn, test=develop
-
Committed by Aurelius84
* fix integer overflow in match_matrix test=develop
* fix integer overflow in match_matrix test=develop
* fix typo test=develop
-
- 31 Dec, 2019 1 commit
-
-
Committed by wangchaochaohu
-
- 30 Dec, 2019 1 commit
-
-
Committed by danleifeng
-
- 27 Dec, 2019 3 commits
-
-
Committed by zhaoyuchen2018
* Refine multihead kernel, align block to 32 test=develop
  Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
* Refine log comments test=develop
  Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by zhoushiyu
* add shuffle batch op, test=develop, test=document_preview
* fix size_t conflict and check_output test=develop, test=document_preview
* fix bug test=develop, test=document_preview
* add unittest of shuffle_batch layer test=develop, test=document_preview
* fix py coverage and op input type, test=develop, test=document_preview
* fix py coverage, test=develop
* fix en doc, test=develop
* move to contrib test=develop
* add unique_name test=develop
* invoke shuffle_batch in contrib.layers test=develop
-
Committed by mapingshuo
* make reverse op support negative axis
-
- 26 Dec, 2019 2 commits
-
-
Committed by Aurelius84
* fix compile error in CUDA10 test=develop
* remove double in pad2d test=develop
-
Committed by hutuxian
* fix stat shape back in global auc scenario
* add UT to cover global auc
-
- 25 Dec, 2019 3 commits
-
-
Committed by Aurelius84
* add register op_data_type test=develop
* fix register bug in isfinite op test=develop
* rm int int64_t in pad2d gradKernel test=develop
-
Committed by hong
-
Committed by zhouwei25
-