- 11 Feb, 2020 3 commits
-
-
Committed by huzhiqiang
-
Committed by zhaoyuchen2018
* Refine code, fix select tile error, test=develop * Refine element type and some comments, test=develop * Refine comments and gpu utils, test=develop * Remove some useless condition * Refine floor and ceil, test=develop * Refine for loop, test=develop Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by Wilber
Support compiling without depending on NCCL. [1/2] In a multi-card setup, if the build was compiled without the WITH_NCCL switch turned on, the cards cannot communicate with each other, so only a single card can be selected for use. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 10 Feb, 2020 3 commits
- 07 Feb, 2020 4 commits
-
-
Committed by Zhong Hui
Fix the integer overflow problem in the sequence2batch op by changing int32_t to size_t at /paddle/fluid/operators/math/sequence2batch.h#L122.
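The kind of overflow this commit addresses can be illustrated with a minimal sketch. The sizes below are hypothetical and are not taken from sequence2batch.h; `ctypes` is used only to emulate the wraparound that typically happens when a large offset is stored in a 32-bit signed integer, compared with a `size_t`.

```python
import ctypes


def as_int32(value: int) -> int:
    """Emulate storing a value in a C int32_t (ctypes truncates to 32 bits)."""
    return ctypes.c_int32(value).value


def as_size_t(value: int) -> int:
    """Emulate storing a value in a C size_t on the current platform."""
    return ctypes.c_size_t(value).value


# Hypothetical dimensions, chosen only so the product exceeds 2**31 - 1.
rows = 70_000
width = 40_000
offset = rows * width  # 2_800_000_000

wrapped = as_int32(offset)   # no longer equals the intended offset
widened = as_size_t(offset)  # keeps the intended offset

print(wrapped, widened)
```

With these assumed sizes the 32-bit value turns negative, which is why an index computed this way would point outside the buffer, while the wider unsigned type holds the full offset.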
-
Committed by cc
* support weight quantization in post_training_quantization, test=develop * add test for weight quantization, test=develop
-
Committed by Tao Luo
test=develop
-
Committed by LielinJiang
* optimize interpolate op, test=develop
-
- 06 Feb, 2020 1 commit
-
-
Committed by Yiqun Liu
Correct the use of DeviceContext in the unit tests sequence_pooling_test and sequence_padding_test (#22456) * Add a log in memory::Copy for debugging purposes. * Use the context from DeviceContextPool directly in sequence_pooling_test instead of creating a new one. * Use the context from DeviceContextPool directly in sequence_padding_test instead of creating a new one. test=develop * Change the type of second_dim from size_t to int64_t. test=develop
-
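- 05 Feb, 2020 2 commits
-
-
Committed by Wilber
Add a WITH_NCCL option to the cmake options to explicitly specify whether the NCCL-related code is compiled. WITH_NCCL is ON by default, but it is turned off if WITH_GPU is OFF. Add the PADDLE_WITH_NCCL definition. A single-machine, single-card setup can disable NCCL compilation; multi-card use needs NCCL left on (the default), and if NCCL is disabled only a single card can be used. Co-authored-by: 石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>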
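The option rules described in this commit can be modelled in a few lines. This is an illustrative sketch only, not the actual CMake or Paddle code: the function names are hypothetical, and keeping the first requested card when NCCL is off is an assumption, since the commit message only says that a single card can be used.

```python
from typing import List


def resolve_with_nccl(with_gpu: bool, with_nccl: bool = True) -> bool:
    """WITH_NCCL defaults to ON but is forced OFF when WITH_GPU is OFF."""
    return with_nccl and with_gpu


def usable_cards(requested_cards: List[int], nccl_enabled: bool) -> List[int]:
    """Without NCCL the cards cannot communicate, so keep only one card.

    Assumption: the first requested card is the one that is kept.
    """
    if nccl_enabled or len(requested_cards) <= 1:
        return list(requested_cards)
    return requested_cards[:1]


# A GPU build with NCCL explicitly disabled falls back to one card.
nccl = resolve_with_nccl(with_gpu=True, with_nccl=False)
print(usable_cards([0, 1, 2], nccl))
```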
-
Committed by Tao Luo
* Sigmoid bug fix, test=develop * fix code format, test=develop Co-authored-by: Manjunath Bhat <manjunathbhat9920@gmail.com>
-
- 04 Feb, 2020 1 commit
-
-
Committed by 石晓伟
-
- 02 Feb, 2020 1 commit
-
-
Committed by liu zhengxi
* update the unit test precision of pad, pad2d, and pad_constant_like from fp32 to fp64, test=develop
-
- 31 Jan, 2020 2 commits
-
-
Committed by Michał Gallus
* Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop
-
Committed by joanna.wozna.intel
-
- 25 Jan, 2020 1 commit
-
-
Committed by lidanqing
-
- 23 Jan, 2020 1 commit
-
-
Committed by Wojciech Uss
-
- 22 Jan, 2020 1 commit
-
-
Committed by ceci3
-
- 21 Jan, 2020 1 commit
-
-
Committed by Chengmo
* test=develop, fix geo Send & Init
-
- 19 Jan, 2020 2 commits
-
-
Committed by zhupengyang
-
Committed by wangchaochaohu
-
- 17 Jan, 2020 2 commits
-
-
Committed by qingqing01
-
Committed by tangwei12
* add half_async in the communicator * fix DistributedStrategy
-
- 16 Jan, 2020 4 commits
-
-
Committed by wangchaochaohu
-
Committed by Leo Chen
* remove unused inputs, test=develop * remove unused inputs, test=develop * update dtype, test=develop * remove unused inputs, test=develop * update op_use_default_grad_op_maker, test=develop * resolve conflicts, test=develop * follow comments, test=develop * update center_loss_grad, test=develop
-
Committed by zhangchunle
-
Committed by lidanqing
-
- 15 Jan, 2020 1 commit
-
-
Committed by Bai Yifan
* fix fsp_op, test=develop * fix fsp grad op maker, test=develop * update op_use_default_grad_op_maker.spec, test=develop
-
- 13 Jan, 2020 2 commits
- 10 Jan, 2020 3 commits
-
-
Committed by FlyingQianMM
* add backward gradient computation for op argsort, test=develop * use pre-commit, test=develop
-
Committed by Zhen Wang
* add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
-
Committed by baojun
-
- 09 Jan, 2020 2 commits
-
-
Committed by zhongpu
* test Optimizer in dygraph, test=develop * add optest for Optimizer in dygraph, test=develop * fix adagrad optimizer, test=develop * fix dpsgd optimizer, test=develop * fix test_optimizer.py, test=develop * fix dpsgd optimizer, this op only support cpu, test=develop * add optest for optimizer, test=develop * add description for dpsgd, test=develop * add rmsprop to white_list in unused_var_check.cc, test=develop * polish code style, test=develop * polish code style, test=develop * delete seed attribute for DpsgdOptimizer, test=develop * change testing to debugging, test=develop
-
Committed by 石晓伟
-
- 08 Jan, 2020 3 commits
-
-
Committed by zhaoyuchen2018
The wait in stack costs a lot of CPU time; using a CUDA kernel to do the memory copy reduces CPU time. Signed-off-by: zhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by liu zhengxi
-
Committed by Double_V
1. Add a new input named batch_roi_nums for prroi_pool_op. batch_roi_nums holds the number of RoIs for each image in the batch when rois is a Tensor. This information is stored in the LoD of rois when rois is a LoDTensor. 2. Add a grad check to prroi_pool_op and fix the abnormal X grad diff on CPU.
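The relationship between the two ways of carrying per-image RoI counts can be sketched as follows. This is a minimal illustration assuming an offset-style LoD (cumulative boundaries starting at 0); the helper names are hypothetical and are not part of prroi_pool_op.

```python
from itertools import accumulate
from typing import List


def roi_nums_to_lod(batch_roi_nums: List[int]) -> List[int]:
    """Turn per-image RoI counts into cumulative offsets (assumed LoD layout)."""
    return [0] + list(accumulate(batch_roi_nums))


def roi_batch_ids(batch_roi_nums: List[int]) -> List[int]:
    """Return, for every RoI in order, the index of the image it belongs to."""
    ids: List[int] = []
    for image_index, count in enumerate(batch_roi_nums):
        ids.extend([image_index] * count)
    return ids


# Hypothetical batch of three images with 2, 3 and 1 RoIs.
batch_roi_nums = [2, 3, 1]
lod = roi_nums_to_lod(batch_roi_nums)
ids = roi_batch_ids(batch_roi_nums)
print(lod, ids)
```

Either representation lets a kernel recover which image each RoI row belongs to; the counts form is simply what remains available when rois is a plain Tensor without LoD.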
-