- 30 Aug, 2019 1 commit
-
-
Committed by Huihuang Zheng
* Support memory eager deletion on recurrent OP (#17710)
  Tested PaddingRNN on a V100 GPU. Test configuration: large model, padding mode (the mode that uses recurrentOp), one GPU.
  GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR)
  Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
* Fix random test_recurrent_op failure (#18718)
  The change includes three things:
  1. Set CPU_NUM to 1 in the tests, because otherwise the ParallelExecutor prints a warning that CPU_NUM is not set and falls back to the default of 1.
  2. The old tests compared two RNNs, a hand-written simple RNN and the same RNN built by Paddle, but initialized the RNN weights separately with NumPy random and Paddle random. Fixed by setting the weight and bias values explicitly.
  3. Also set the NumPy random seed in the tests.
  The difference between the two RNNs can now be smaller in the tests (rtol tightened from 0.1 and 0.2 to 0.01).
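The test fix described above (shared weights, a fixed seed, then a tight relative tolerance) can be sketched in plain Python. This is only an illustration of the idea, not the code of test_recurrent_op: the scalar RNN cell, the function names `rnn_loop`, `rnn_recursive`, and `compare`, and the seed value are all assumptions made for the example.

```python
import math
import random


def rnn_loop(xs, w, u, b):
    """Reference RNN: h_t = tanh(w * x_t + u * h_{t-1} + b), computed iteratively."""
    h, out = 0.0, []
    for x in xs:
        h = math.tanh(w * x + u * h + b)
        out.append(h)
    return out


def rnn_recursive(xs, w, u, b, h=0.0):
    """Second implementation of the same recurrence, written recursively."""
    if not xs:
        return []
    h = math.tanh(w * xs[0] + u * h + b)
    return [h] + rnn_recursive(xs[1:], w, u, b, h)


def compare(seed=123, steps=8, rtol=0.01):
    """Draw weights and inputs once from a seeded generator, feed the SAME
    values to both implementations, and compare with a relative tolerance."""
    rng = random.Random(seed)  # fixed seed makes the test reproducible
    w, u, b = (rng.uniform(-1, 1) for _ in range(3))
    xs = [rng.uniform(-1, 1) for _ in range(steps)]
    ref = rnn_loop(xs, w, u, b)
    got = rnn_recursive(xs, w, u, b)
    return all(math.isclose(r, g, rel_tol=rtol) for r, g in zip(ref, got))
```

Because both implementations receive identical weights, any remaining difference comes from the computation itself, which is what lets the tolerance be tightened.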
-
- 29 Aug, 2019 1 commit
-
-
Committed by tangwei12
* fix bug in class MultiSlotDataGenerator's function _gen_str, test=develop (#18222)
* fix some bugs when merging sparse embedding parameters, test=develop (#18223)
* fix communicator with pyreader (#18350)
* delete AllocatorFacade destructor (#18606)
* fix distribute transpiler GRPC error code 4, RPC Deadline (#18984)
* merge pr #18441
-
- 27 Aug, 2019 1 commit
-
-
Committed by LielinJiang
* fix depthwise conv gpu kernel bug, test=develop
* add more depthwise conv tests, test=develop
-
- 26 Aug, 2019 4 commits
-
-
Committed by LielinJiang
* make_roi_perspective_transform_op_return_mask_and_matrix
* make_roi_perspective_transform_op_return_mask_and_matrix
-
Committed by Zhaolong Xing
* CHERRY_PICK 18941, 18860: TRT fp16 support. test=release/1.5
* CHERRY_PICK 19213: Fix bug: Mask RCNN inference diff when using AnalysisPredictor.
  1. fix affine channel fuse pass.
  2. fix condition block op.
  3. fix merge lod tensor op bug.
  4. fix memory optim bug caused by reset lod op.
  test=release/1.5
-
Committed by 石晓伟
-
Committed by 石晓伟
* add fusion_seqpool_cvm_concat test=develop
* simplify pass, test=develop
* fix code style, test=develop
-
- 21 Aug, 2019 1 commit
-
-
Committed by chengduo
* add warning info for CPU_NUM test=develop
* update dygraph parallel.py test=develop
* prune the feed op in compiler test=release/1.5
* remove compile from PE test=develop
* test CUDAPinnedPlace in reader test=release/1.5
-
- 20 Aug, 2019 1 commit
-
-
Committed by silingtong123
* add PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#19211)
* test=develop, change PADDLE_ENFORCE to PADDLE_ENFORCE_CUDA_SUCCESS
-
- 15 Aug, 2019 1 commit
-
-
Committed by chengduo
* fix gather op bug test=release/1.5
-
- 29 Jul, 2019 2 commits
-
-
Committed by chengduo
* fix backward bug
-
Committed by Zeng Jinle
-
- 26 Jul, 2019 1 commit
-
-
Committed by FDInSky
[cherry pick] fix bug in roi_align_op CPU backward
-
- 25 Jul, 2019 3 commits
-
-
Committed by wangchaochaohu
* rewrite the conv_op using cudnn_conv_helper
* add workspace limit for v7 test=develop
* fix test=develop
* add half float test=develop
* fix test=develop
* fix test=develop
* revise code style test=develop
* fix test=develop
-
Committed by qingqing01
-
Committed by qingqing01
-
- 08 Jul, 2019 2 commits
-
-
Committed by Zhaolong Xing
fix mask rcnn; add an interface for setting optim_cache_dir (e.g. in TRT int8 mode, when the model is loaded from memory, there should be an interface for setting the TRT calibration table data dir) test=release/1.5
-
Committed by zhaoyuchen2018
Add path to handle 1D vector
-
- 05 Jul, 2019 1 commit
-
-
Committed by gongweibao
-
- 29 Jun, 2019 1 commit
-
-
Committed by Yibing Liu
* Update lamb optimizer (#18333)
* Update lamb optimizer
* Regenerate api spec test=release/1.5
* Give an experimental warning test=release/1.5
-
- 28 Jun, 2019 2 commits
-
-
Committed by qingqing01
* Simplify multi_box_head API in detection.py and remove assign op.
-
Committed by 石晓伟
* Update the Anakin interfaces for content-dnn and MLU (#17890)
* update anakin-engine interfaces for content-dnn test=develop
* support only-gpu mode of Anakin; modify eltwise parse test=develop
* modification for thread-safe test=develop
* Integrated template instance test=develop
* increase template parameters test=develop
* support MLU predictor test=develop
* update anakin cmake files test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph test=develop
* use the default constructor of base class test=develop
* modify the access level of anakin engine (#18015) test=develop
* fix ci test cmake test=develop
-
- 26 Jun, 2019 1 commit
-
-
Committed by tensor-tang
* fix softrelu doc
* update API doc test=release/1.5
-
- 25 Jun, 2019 4 commits
-
-
Committed by Hongyu Liu
* sequence mask support max length tensor input; test=develop
* add rnn_impl.py; test=develop
* add basic gru lstm unittest; test=develop
* fix api spec; test=develop
* fix sequence_mask op bug; test=develop test=document_preview
* change +-*x to elementwise_op; test=develop
* add mkl flag; test=develop
* fix rnn impl bug; test=develop
* update api spec; test=develop
* fix doc bug; test=develop
* fix lstm bugs; test=develop
-
Committed by Guo Sheng
test=release/1.5
* Fix the GetExpectedKernelType of add_position_encoding_op.
* Fix the doc of lstm_unit outputs in nn.py.
-
Committed by Yiqun Liu
test=release/1.5
-
Committed by Yibing Liu
* Use TensorCopySync for sequence_unpad op
* Fix the tensor memory alloc bug test=release/1.5
-
- 24 Jun, 2019 2 commits
-
-
Committed by Hongyu Liu
* fix slice op bug; test=develop
* fix variable test bug; test=develop
* remove slice while true; test=develop
-
Committed by lujun
Improve the error prompt: when loading parameters fails, users are prompted to check whether the model or parameter files are damaged.
* cherry pick 18000, test=release/1.5
-
- 20 Jun, 2019 2 commits
-
-
Committed by qingqing01
* Update backward appending strategy to support double backward and fix some bugs. (#18104)
* Update backward.py:
  - If there is no input grad var in all outputs of previous ops, do not append this op into the graph.
  - Only apply this strategy for double backward.
* Update some double backward ops.
* Update sum_op to judge whether a tensor is empty by numel or IsInitialized().
-
Committed by 翟飞跃
-
- 19 Jun, 2019 2 commits
-
-
Committed by tangwei12
* fix save/load in fleet (#17675)
* fix save/load in Fleet
* add UT framework of Fleet (#18058)
* add paddle cloud role maker for customized usage; note this is only for industrial users that have a pre-configured cloud environment (#18121)
  add paddle cloud role maker for specific cloud usage. This PR simplifies the user's configuration in distributed training.
* assign role_maker before use (#18137)
-
Committed by FlyingQianMM
Cherry pick retinanet_target_assign_op (#17893), sigmoid_focal_loss_op (#17895) and retinanet_detection_output_op (#17896) for supporting retinanet (#18141)
* test=release/1.5 Fix conflicts in test_layers.py when adding target assign operator for supporting retinanet. Cherry pick #17893
* test=release/1.5 Add sigmoid focal loss operator for supporting retinanet. Cherry pick #17895
* test=release/1.5 Add detection output operator for supporting retinanet. Cherry pick #17896
* test=release/1.5 Fix wrong code style in test_layers.py when cherry picking retinanet_target_assign #17893
* test=release/1.5 Fix type error of std::pow in sigmoid_focal_loss. Cherry pick #17895
-
- 18 Jun, 2019 2 commits
-
-
Committed by AIFollowers
Add cascade rcnn support.
-
Committed by cjt222
cherry pick for deform roi pooling
-
- 15 Jun, 2019 2 commits
-
-
Committed by Zeng Jinle
* fix py_reader iterable bug, test=release/1.5
* move data from buffered_reader, test=release/1.5
-
Committed by chengduo
* update CPU_NUM config test=develop
-
- 13 Jun, 2019 3 commits
-
-
Committed by wawltor
test=release/1.5 cherry-pick from #17952
The scatter op has a calculation bug when the indices contain duplicate entries: it uses overwrite mode for the repeated index. Fix this by using accumulate mode for repeated indices. The gather op has the same bug when it computes its gradient. The OpenBLAS and Eigen libraries are used to reduce the time cost of accumulate mode.
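The difference between the two modes for duplicate indices can be illustrated with a small pure-Python sketch. It is not Paddle's kernel: the function name `scatter_1d` is hypothetical, the data is one-dimensional, and the accumulate branch assumes that each targeted slot is first reset and then receives the sum of all updates addressed to it.

```python
def scatter_1d(ref, indices, updates, overwrite=True):
    """Scatter `updates` into a copy of `ref` at `indices` (1-D lists only).

    overwrite=True : for a repeated index, the last update wins.
    overwrite=False: for a repeated index, the updates are summed
                     (assumed semantics: the slot is zeroed first).
    """
    out = list(ref)
    if overwrite:
        for i, u in zip(indices, updates):
            out[i] = u
        return out
    # Accumulate mode: reset every targeted slot once, then add all updates.
    for i in set(indices):
        out[i] = 0
    for i, u in zip(indices, updates):
        out[i] += u
    return out


ref = [1, 1, 1]
indices = [1, 1, 2]      # index 1 appears twice
updates = [10, 20, 30]

print(scatter_1d(ref, indices, updates, overwrite=True))   # [1, 20, 30]
print(scatter_1d(ref, indices, updates, overwrite=False))  # [1, 30, 30]
```

The same distinction matters for the gradient of gather: when one source position is gathered several times, its gradient contributions have to be summed rather than overwritten.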
-
Committed by Wojciech Uss
Added unit test for QAT FP32 & INT8 comparison (#17814)
Disable MKLDNN FC in Resnet50 test (#18030)
test=release/1.5
-
Committed by tensor-tang
test=release/1.5
-