- 12 2月, 2020 1 次提交
-
-
由 Double_V 提交于
* support slice double grad, test=develop * merge two doublegradopmaker to one doublegradopmaker,test=develop * change the shape of slice_OP's unittest, test=develop
-
- 11 2月, 2020 4 次提交
-
-
由 hutuxian 提交于
Refine PaddleBox Framework, Main functions: * Add MetricMsg util class, which can calculate metrics like AUC, bucket_error, COPC. * Replace FeedPass with new interface: BeginFeedPass & EndFeedPass * Refactor Pull/Push Sparse Function in box_wrapper. * Use CUDA Kernel to copy keys and copy feasign between tensor and boxps struct. * Cache copied keys in pull sparse in order to reuse it in push period.
-
由 huzhiqiang 提交于
-
由 zhaoyuchen2018 提交于
* Refine code, fix select tile error,test=develop * Refine element type and some comments, test=develop * Refine comments and gpu utils, test=develop * Remove some useless condition * Refine floor and ceil, test=develop * refine for loop. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
由 Wilber 提交于
支持不依赖nccl进行编译。[1/2] 多卡下,如果没有打开WITH_NCCL开关编译,多卡不能通信,则只能选择一张卡使用。 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
- 10 2月, 2020 3 次提交
- 07 2月, 2020 4 次提交
-
-
由 Zhong Hui 提交于
Fix the integer overflow problem in the op of sequence2batch, change the int32_t to size_t, In the /paddle/fluid/operators/math/sequence2batch.h#L122.
-
由 cc 提交于
* support weight quantization in post_training_quanzitaion, test=develop * add test for weight quantization, test=develop
-
由 Tao Luo 提交于
test=develop
-
由 LielinJiang 提交于
* optimize interpolate op, test=develop
-
- 06 2月, 2020 1 次提交
-
-
由 Yiqun Liu 提交于
Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456) * Add log in memory::Copy for debug purpose. * Change to use context in DeviceContextPool directly in sequence_pooling_test, instead to new one. * Change to use context in DeviceContextPool directly in sequence_padding_test, instead to new one. test=develop * Change the type of second_dim from size_t to int64_t. test=develop
-
- 05 2月, 2020 2 次提交
-
-
由 Wilber 提交于
cmake选项中添加了WITH_NCCL,显示指定是否编译NCCL的部分代码,WITH_NCCL默认打开,但如果WITH_GPU为OFF,则关闭WITH_NCCL 添加了PADDLE_WITH_NCCL定义 单机单卡能够关闭NCCL编译,多卡的话需要默认打开NCCL,如果关闭NCCL,则只能使用单卡 Co-authored-by: N石晓伟 <39303645+Shixiaowei02@users.noreply.github.com>
-
由 Tao Luo 提交于
* Sigmoid bug fix, test=develop * fix code format test=develop Co-authored-by: NManjunath Bhat <manjunathbhat9920@gmail.com>
-
- 04 2月, 2020 1 次提交
-
-
由 石晓伟 提交于
-
- 02 2月, 2020 1 次提交
-
-
由 liu zhengxi 提交于
* update the ut precision of pad pad2d pad_constant_like from fp32 to fp64, test=develop
-
- 31 1月, 2020 2 次提交
-
-
由 Michał Gallus 提交于
* Enable quantize to reorder to nchw as well * Correct FC MKL-DNN input dim requirements to accept 3D * Improve DNNL FC format, error and 3D input handling test=develop * Improve error checking in FC test=develop * Improve PADDLE_ENFORCE messages in fc-related files * Remove data layout attribute from obligatory pass args test=develop * Fix message in fc_mkldnn_pass to be logically correct test=develop
-
由 joanna.wozna.intel 提交于
-
- 25 1月, 2020 1 次提交
-
-
由 lidanqing 提交于
-
- 23 1月, 2020 1 次提交
-
-
由 Wojciech Uss 提交于
-
- 22 1月, 2020 1 次提交
-
-
由 ceci3 提交于
-
- 21 1月, 2020 1 次提交
-
-
由 Chengmo 提交于
* test=develop, fix geo Send & Init
-
- 19 1月, 2020 2 次提交
-
-
由 zhupengyang 提交于
-
由 wangchaochaohu 提交于
-
- 17 1月, 2020 2 次提交
-
-
由 qingqing01 提交于
-
由 tangwei12 提交于
* add half_async in the communicator * fix DistributedStrategy
-
- 16 1月, 2020 4 次提交
-
-
由 wangchaochaohu 提交于
-
由 Leo Chen 提交于
* remove unused inputs, test=develop * remove unused inputs, test=develop * update dtype, test=develop * remove unused inputs, test=develop * update op_use_default_grad_op_maker, tese=develop * resolve conflicts, test=develop * follow comments, test=develop * update center_loss_grad, test=develop
-
由 zhangchunle 提交于
-
由 lidanqing 提交于
-
- 15 1月, 2020 1 次提交
-
-
由 Bai Yifan 提交于
* fix fsp_op, test=develop * fix fsp grad op maker, test=develop * update op_use_default_grad_op_maker.spec, test=develop
-
- 13 1月, 2020 2 次提交
- 10 1月, 2020 3 次提交
-
-
由 FlyingQianMM 提交于
* add backward gradient computation for op argsort test=developo * use pre-commit test=develop
-
由 Zhen Wang 提交于
* add bn and relu fuse pass * add op attr assert and dtype assert * fix some inputs&&outputs bugs for the fused op and pattern. * add the unittest for fuse_bn_act_pass. test=develop * use normative enforce statements. test=develop * add the cpu test. test=develop * add the support of batch_size=1 for the bn with relu op. test=develop * add the error type for paddle throws. test=develop * add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
-
由 baojun 提交于
-
- 09 1月, 2020 2 次提交
-
-
由 zhongpu 提交于
* test Optimizer in dygraph, test=develop * add optest for Optimizer in dygraph, test=develop * fix adagrad optimizer, test=develop * fix dpsgd optimizer, test=develop * fix test_optimizer.py, test=develop * fix dpsgd optimizer, this op only support cpu, test=develop * add optest for optimizer, test=develop * add description for dpsgd, test=develop * add rmsprop to white_list in unused_var_check.cc, test=develop * polish code style, test=develop * polish code style, test=develop * delete seed attribute for DpsgdOptimizer, test=develop * change testing to debugging, test=develop
-
由 石晓伟 提交于
-
- 08 1月, 2020 1 次提交
-
-
由 zhaoyuchen2018 提交于
stack's wait cost a lot of cpu time, use cuda kernel to do memory copy will reduce cpu time. Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-