- 13 5月, 2019 1 次提交
-
-
由 Kaipeng Deng 提交于
* add double grad for square. test=develop * formax code. test=develop * fix for grad sum. test=develop * refine shape. test=develop * refine extract. test=develop
-
- 10 5月, 2019 4 次提交
-
-
由 zhoukunsheng 提交于
-
由 zhoukunsheng 提交于
-
由 zhaoyuchen2018 提交于
refine code fuse cublas calling and kernels into one cuda kernel. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
由 qingqing01 提交于
* Add conv2d_grad_grad_op * Extracte the cuDNN conv algo searching code in conv_cudnn_helper.h. - Now use it in conv2d_grad_grad. - Will simply the searching code in conv2d and conv2d_grad in next PR. * Enhance and fix bug in unit testing of gradient_checker. * Support to fetch empty variables,return None in Python.
-
- 09 5月, 2019 2 次提交
-
-
由 Zeng Jinle 提交于
-
由 zhoukunsheng 提交于
* test=develop add elementwise_mod and elementwise_floordiv, fix equation problem in elementwise_mod
-
- 08 5月, 2019 8 次提交
-
-
由 xiaoting 提交于
* modified formula for lrn test=develop * modified api.spec test=develop
-
由 zhaoyuchen2018 提交于
* Refine elementwise kernel. Add a simple cuda kernel if grad x and y both exist Use 2D block cuda kernel to do broadcast. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
由 Yiqun Liu 提交于
* Optimize the cuda implementation of sum_op, which add two lod_tensors inplace. test=develop * Use eigen to add to tensors. test=develop
-
由 chengduo 提交于
test=develop
-
由 Hongyu Liu 提交于
* fix shape_check; test=develop * fix format; test=develop * fix format; test=develop * fix ddim bug; test=develop * fix c++ format; test=develop * change function name; test=develop
-
由 whs 提交于
-
由 baojun 提交于
* added lrn op test=develop * Added CreateConstant method test=develop * avoid duplicates test=develop
-
由 gongweibao 提交于
-
- 07 5月, 2019 7 次提交
-
-
由 Zeng Jinle 提交于
* add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop * fix potential inplace bug test=develop * add more skip vars in mem opt pass,test=develop * follow comment,test=develop * follow comments,move duplicate out arg check to program->graph,test=develop
-
由 baojun 提交于
-
由 Kaipeng Deng 提交于
* add attr axis infershape. test=develop * add CUDA kernel. test=develop * fix unittest. test=develop * fix unittest for soft_label. test=develop * fix fp16 unittest. test=develop * remove comment code. test=develop * refine test for axis. test=develop * add python api. test=develop * fix doc. test=develop * fix fp16 unittest. test=develop * fix ngraph test. test=develop * fix ENFORCE for test_imperative_transformer. test=develop * fit for ngraph test. test=develop * fix after rebase develop. test=develop * fix doc. test=develop * fix API.spec. test=develop * fix test_layers. test=develop * fix format. test=develop
-
由 Zhen Wang 提交于
* Add MovingAverageAbsMaxScale operator which is only used for calculating the quantization scale. * test=develop * change the output into inplace. test=develop * Revert "test=develop" This reverts commit 696cf62699ba1e1c98f61f7345ac7060010eb29a. * Revert "change the output into inplace. test=develop" This reverts commit a19acd20f07eee82622701a3015e6e9c073a5e0b. * test=develop. * update the MovingAverageAbsMaxScaleOp test. test=develop
-
由 zhaoyuchen2018 提交于
* optimize sum op fuse multi eigen kernel calls into one cuda kernel. refine code test=develop. Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * Refine code. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * Refine code according to comments. test=develop * refine code delete sum_op_gpu.h test=develop * Fix test error. test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code in format. test=develop. * refine code test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com> * refine code test=develop Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
由 石晓伟 提交于
* cherry-pick commit from 88770542 * cherry-pick commit from 3f0b97df * cherry-pick from 16691:Anakin subgraph support yolo_v3 and faster-rcnn (cherry picked from commit 8643dbc2) * Cherry-Pick from 16662 : Anakin subgraph cpu support (cherry picked from commit 7ad182e1) * Cherry-pick from 1662, 16797.. : add anakin int8 support (cherry picked from commit e14ab180) * Cherry-pick from 16813 : change singleton to graph RegistBlock test=release/1.4 (cherry picked from commit 4b9fa423) * Cherry Pick : 16837 Support ShuffleNet and MobileNet-v2 Support ShuffleNet and MobileNet-v2, test=release/1.4 (cherry picked from commit a6fb066f) * Cherry-pick : anakin subgraph add opt config layout argument #16846 test=release/1.4 (cherry picked from commit 8121b3ec) * 1. add shuffle_channel_detect (cherry picked from commit 6efdea89) * update shuffle_channel op convert, test=release/1.4 (cherry picked from commit e4726a06) * Modify symbol export rules test=develop
-
由 jerrywgz 提交于
* refine api comment, test=develop
-
- 06 5月, 2019 2 次提交
-
-
由 jerrywgz 提交于
* fix distribute fpn proposals, test=develop
-
由 Zeng Jinle 提交于
* add use_cuda to inplace pass,test=develop * add test softmax_with_xe_inplace test,test=develop
-
- 05 5月, 2019 1 次提交
-
-
由 jerrywgz 提交于
* enhance_concat, test=develop
-
- 30 4月, 2019 4 次提交
-
-
由 Zeng Jinle 提交于
* fix op graph view test=develop * rewrite inplace pass and fix reference count pass bug test=develop * fix unittest failed test=develop * follow comments, test=develop
-
由 Zeng Jinle 提交于
-
由 xiaoting 提交于
* polish the label_smooth test=develop * polish code test=develop
-
由 Leo Zhao 提交于
resolve #17147 test=develop
-
- 29 4月, 2019 1 次提交
-
-
由 tangwei12 提交于
cvm without LoD.
-
- 28 4月, 2019 2 次提交
-
-
由 Zeng Jinle 提交于
* refine_dropout_mem,test=develop * # This is a combination of 14 commits. # The first commit's message is: remove ut test_dist_word2vec in mac ci, will fix it in private, test=develop (#17066) # This is the 2nd commit message: Fleet unify distributed training (#16791) * implement distributed transpiler with fleet # This is the 3rd commit message: ParallelDyGraph with GPU collective mode (#16827) implement dygraph.parallel.DataParallel to hook reduce op. # This is the 4th commit message: Init mixed precision training interface (#16856) * Init mixed precision training interface * Add fp16 test script test=develop * All initializers support float16 test=develop * Code cleanup & add more code annotations test=develop * Update API spec test=develop * Add usage example in doc test=develop # This is the 5th commit message: fix reference_count_pass,test=develop (#17060) test=develop # This is the 6th commit message: Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090) * Cache the information of linear interpolation in forward and use it in backward. test=develop * Fix cuda kernel. test=develop # This is the 7th commit message: remove unnecessary prepare_data (#17080) test=develop # This is the 8th commit message: fix interpolate cu. test=develop (#17101) # This is the 9th commit message: test=develop, double backward leaky_relu (#17067) backward of backward: leaky_relu # This is the 10th commit message: fix fuse optimizer ops (#17102) test=develop # This is the 11th commit message: truncated_gaussian_random supported in distributed training, test=develop (#17091) # This is the 12th commit message: Detailed coordinate description for yolov3 loss (#17007) * Detailed coordinate description for yolov3 loss test=develop * modified api.spec test=develop * modified loss name * fix api.spec test=develop * polish description test=develop * modified api.spec test=develop # This is the 13th commit message: fix test_weight_decay (#17109) test=develop # This is the 14th commit message: Path flag (#17105) * fix python/paddle/fluid/__init__.py detecting problems
-
由 Huihuang Zheng 提交于
1. Use CudnnWorkspaceHandle in exhaustive search of conv_cudnn. 2. For Ops using CudnnWorkspaceHandle in exhaustive search, release their GPU memory after exhaustive search. test=develop
-
- 26 4月, 2019 3 次提交
-
-
由 xiaoting 提交于
* Detailed coordinate description for yolov3 loss test=develop * modified api.spec test=develop * modified loss name * fix api.spec test=develop * polish description test=develop * modified api.spec test=develop
-
由 ceci3 提交于
backward of backward: leaky_relu
-
由 Kaipeng Deng 提交于
-
- 25 4月, 2019 2 次提交
-
-
由 whs 提交于
Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090) * Cache the information of linear interpolation in forward and use it in backward. test=develop * Fix cuda kernel. test=develop
-
由 Yan Xu 提交于
implement dygraph.parallel.DataParallel to hook reduce op.
-
- 23 4月, 2019 2 次提交
-
-
由 Zeng Jinle 提交于
* make_conv_cudnn_ws_size_configurable, test=develop * change std::max to std::min test=develop
-
由 qingqing01 提交于
Support backward of backward for Relu and add a new gradient checker by comparing theoretical and numerical Jacobian. (#16862) * Support backward of backward and a new gradient checker * Rename decorators.py to decorator_helper.py, since Python on Windows CI has decorators package. 1. Add ReluDoubleGradMaker when register relu_grad. 2. Add a new gradient checker by comparing theoretical and numerical Jacobian. Check double gradients by double_grad_check.
-
- 22 4月, 2019 1 次提交
-
-
由 tangwei12 提交于
-