- 19 Sep, 2019 1 commit
Committed by Yiqun Liu
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op. test=develop
* Add comment.
* Implement the unittest. test=develop
* Change the unittest name of layer_norm. test=develop
-
- 18 Sep, 2019 1 commit
Committed by 石晓伟
-
- 17 Sep, 2019 1 commit
Committed by Pei Yang
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
-
- 11 Sep, 2019 1 commit
Committed by Yiqun Liu
* Refine the codes related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference. test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op. test=develop
* Remove the declaration of FCOpGrad back to the header file. test=develop
* Set default value for newly added arguments in test_fc_op. test=develop
-
- 09 Sep, 2019 1 commit
Committed by Tao Luo
* paddle::framework::vectorize() templatization test=develop
* update pybind/imperative.cc test=develop
* revert update on unsqueeze_op.cc and warpctc_cudnn_op.cu.cc test=develop
-
- 03 Sep, 2019 1 commit
Committed by Yiqun Liu
* Add an interface to enable cudnn for inference.
* Add cudnn_placement_pass. test=develop
* Set the default value of cudnn_enabled_op_types to null. test=develop
* Write the common basic class, placement_pass_base, to refine the codes. test=develop
* Call EnableCUDNN in unittest. test=develop
* Refine cudnn_placement_pass tester.
* Enable the testing of cudnn_placement_pass in inference's unittest. test=develop
* Add the check of op kernels. test=develop
-
- 30 Aug, 2019 2 commits
Committed by liuwei1031
-
Committed by Yiqun Liu
* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true. test=develop
* Delete dropout_op directly when upscale_in_train is true. test=develop
* Improve the debug string, adding the print of op_desc information.
* Fix the case when dropout's input x is reused as the next op's output.
* Add the pass to inference. test=develop
* Change the log level. test=develop
* Add unittest for inplace case.
* Add comment to explain the pass.
* Apply the pass for CPU inference. test=develop
* Fix the typo. test=develop
* Add the check of AttrType. test=develop
-
- 22 Aug, 2019 1 commit
Committed by lidanqing
* add local user data conversion into full_pascalvoc_test_preprocess.py test=develop
* change PADDLE_ENFORCE to PADDLE_ENFORCE_GE test=develop
* change according to reviews test=develop
-
- 21 Aug, 2019 1 commit
Committed by Adam
* Add generalized Conv+Activation MKLDNN fuse pass creation Part2 test=develop
* Undefined behaviour of GetAttrIfExists<> FIX test=develop
-
- 19 Aug, 2019 2 commits
Committed by Zhaolong Xing
* fix mask rcnn bug:
  1. affine channel fuse (diff)
  2. condition block op (memory leak)
  3. merge lod tensor op (diff)
  4. memory optim (diff)
  test=develop
* fix ci about PADDLE_ENFORCE fix merge lod infer op ut test=develop
-
Committed by Zeng Jinle
-
- 15 Aug, 2019 1 commit
Committed by Adam
test=develop
-
- 12 Aug, 2019 1 commit
Committed by wopeizl
* add tensorrt support for windows
-
- 09 Aug, 2019 1 commit
Committed by Tao Luo
test=develop
-
- 05 Aug, 2019 1 commit
Committed by silingtong123
Fix the third-party openblas dependency for paddle on windows
-
- 02 Aug, 2019 1 commit
Committed by 石晓伟
* add fusion_seqpool_cvm_concat test=develop
* simplify pass, test=develop
* fix code style, test=develop
-
- 31 Jul, 2019 2 commits
Committed by liuwei1031
* fix security issue, test=develop
* bug fix, test=develop
* throw an exception when null pointer data with non-zero length PaddleBuf is passed, test=develop
-
Committed by Zhaolong Xing
* Fix Mask rcnn predictor
  1. refine memory optim algorithm to support the model with the block op.
  2. output diff: modify the affine channel fuse
  3. add condition_block_infer op
  add interface for setting trt calib table dir
  test=develop
* add the missing files. test=develop
* 1. add trt fp16 support test=develop
-
- 24 Jul, 2019 1 commit
Committed by Zhaolong Xing
* update paddle-trt for:
  1. fix bug: when batch > 2, core in split plugin.
  2. add leaky_relu trt5.0 support (yolov3 from 65ms to 42ms.)
  3. add new attr to dropout.
  4. shuffle channel, swish, relu6 support
  test=develop
* 1. fix ci test=develop
-
- 17 Jul, 2019 1 commit
Committed by 石晓伟
* update anakin-engine interfaces for content-dnn test=develop
* support only-gpu mode of Anakin modify eltwise parse test=develop
* modification for thread-safe test=develop
* Integrated template instance test=develop
* increase template parameters test=develop
* support MLU predictor test=develop
* update anakin cmake files test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph test=develop
* use the default constructor of base class test=develop
* load model from buffer with length test=develop
* modify the access level of class test=develop
* support anakin for bitmain arch test=develop
* remove files
* checkout cmakelists test=develop
* modify interfaces test=develop
* add cmake dependencies test=develop
* enforce the outputs of net test=develop
-
- 11 Jul, 2019 1 commit
Committed by Tao Luo
* add config.SetMkldnnCacheCapacity api for mkldnn cache clear strategy test=develop
* enhance MkldnnPostReset test=develop
* add comments for mkldnn_cache_capacity field test=develop
-
- 09 Jul, 2019 1 commit
Committed by Jiabin Yang
* test=develop, fix docker with paddle nccl problem
* test=develop, fix/gcc_4.8_ubt_link_error
* test=develop, fix code format
-
- 08 Jul, 2019 2 commits
Committed by Zhaolong Xing
* Fix Mask rcnn predictor
  1. refine memory optim algorithm to support the model with the block op.
  2. output diff: modify the affine channel fuse
  3. add condition_block_infer op
  add interface for setting trt calib table dir
  test=develop
* add the missing files. test=develop
-
Committed by 石晓伟
* update anakin-engine interfaces for content-dnn test=develop
* support only-gpu mode of Anakin modify eltwise parse test=develop
* modification for thread-safe test=develop
* Integrated template instance test=develop
* increase template parameters test=develop
* support MLU predictor test=develop
* update anakin cmake files test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph test=develop
* use the default constructor of base class test=develop
* load model from buffer with length test=develop
* modify the access level of class test=develop
* support anakin for bitmain arch test=develop
* remove files
* checkout cmakelists test=develop
-
- 03 Jul, 2019 1 commit
Committed by 石晓伟
* remove the obsolete cmake options, test=develop
* remove unittests, test=develop
-
- 02 Jul, 2019 1 commit
Committed by Tao Luo
test=develop
-
- 01 Jul, 2019 1 commit
Committed by Michał Gallus
* Int8: Fix Pooling output scale test=develop
* Update scales quantization for certain operators. These include: concat, transpose, pool and reshape. test=develop
* Move concat minimum scale finding to quantizer test=develop
-
- 27 Jun, 2019 2 commits
Committed by Michał Gallus
test=develop
-
Committed by Sylwester Fraczek
add prior_box quantization code; add scale algo rules for prior box test=develop
-
- 21 Jun, 2019 1 commit
Committed by wopeizl
-
- 19 Jun, 2019 1 commit
Committed by 翟飞跃
* fix spelling errors; test=develop
* Update API.spec update md5
* Update API.spec
* change the order of api;test=develop
-
- 12 Jun, 2019 1 commit
Committed by 石晓伟
test=develop
-
- 11 Jun, 2019 1 commit
Committed by 石晓伟
* update anakin-engine interfaces for content-dnn test=develop
* support only-gpu mode of Anakin modify eltwise parse test=develop
* modification for thread-safe test=develop
* Integrated template instance test=develop
* increase template parameters test=develop
* support MLU predictor test=develop
* update anakin cmake files test=develop
* update TargetWrapper::set_device
* update the initialization of anakin subgraph test=develop
* use the default constructor of base class test=develop
-
- 06 Jun, 2019 3 commits
Committed by 石晓伟
test=develop
-
Committed by Zhaolong Xing
test=develop
-
Committed by 翟飞跃
* refactor PR 16865
* delete mergetool files
* test=develop
* test=develop
* test=develop
* test=develop
* create dir for int8 model before call SaveOptimModel
* test=develop
* mkldnn int8 only support linux; test=develop
* refine code; test=develop
* remove comment; test=develop
* refine code; test=develop
* fix bug; test=develop
* add exception for mkldnn_post_training_strategy
* reuse int8v2 CAPI dataset; test=develop
* fix accuracy check bug; test=develop
* remove tab
* convert files to unix format
* test=develop
* reduce CI time;test=develop
* reduce CI time and refine code;test=develop
* refine comment; test=develop
* add cmake FLAGS;test=develop
* remove predict_num;test=develop
-
- 03 Jun, 2019 1 commit
Committed by Tao Luo
test=develop
-
- 29 May, 2019 1 commit
Committed by mozga-intel
-
- 28 May, 2019 1 commit
Committed by lidanqing
* add INT8 conv+relu6 fuse and enable mobilenetv2 INT8 test test=develop
* change false and 0.0 to fuse_brelu and brelu_threshold test=develop change the "fuse_relu||fuse_brelu" to "unsigned_output" test=develop
* Use relu instead of brelu as INT8 post-op because INT8 brelu is not enabled in mkldnn v0.18 test=develop
* continuous-integration fix test=develop
-