- 27 Nov, 2019 1 commit
-
-
Committed by Michał Gallus
* Implement Int8 FC
* Integrate FC into INT8v2 test=develop
* int8 FC: transpose weights before computing scales test=develop
* Add support for activation_type string in FC test=develop
* Disable MKL-DNN's FC in VGG16 and 19 test=develop
* Disable FC quantization when mkldnn FC is disabled test=develop
* Solve PADDLE_ENFORCES in FC int8
* Fix Paddle enforces and remove const cast test=develop
* Fix style changes test=develop
* Fix quantizer_tester test and add fc quantization test=develop
* Fix FC test fail on CUDA
* Remove unnecessary log from quantize placement pass test=develop
* Add Thread ID to FC hash key test=develop
* Add comments to MKL-DNN FC Kernel test=develop
* Refactor quantizer test=develop
* Fix linter issues test=develop
* Fix crash in slim googlenet test=develop
* Fix PADDLE_ENFORCE messages test=develop
-
- 26 Nov, 2019 2 commits
-
-
Committed by GaoWei8
* Add fc padding to solve mkl performance test=develop
* fix gpu pass and error information test=develop
* fix fc_fuse_pass_test test=develop
* fix error information test=develop
* fix error information test=develop
* fix name and add fc op padding test test=develop
* fix attributes test=develop
* optimize fc padding test=develop
* fix test test=develop
-
Committed by silingtong123
-
- 25 Nov, 2019 1 commit
-
-
Committed by zhouwei25
-
- 20 Nov, 2019 2 commits
-
-
Committed by liu zhengxi
* fix the CAPI ZeroCopy shape error and rework how the output is obtained
* use an anonymous namespace to cover the functor
* fix unit tests because the output of typeid(T).name() differs between Linux and Windows, test=develop
-
Committed by Pei Yang
Add a "__" separator between the weight name and its numeric suffix to avoid name conflicts.
-
- 19 Nov, 2019 1 commit
-
-
Committed by zhouwei25
-
- 18 Nov, 2019 1 commit
-
-
Committed by Zhaolong Xing
* refine trt int8 for dynamic range set test=develop
* refine trt int8 test=develop
-
- 15 Nov, 2019 1 commit
-
-
Committed by GaoWei8
* solve cmake fails on inference_download_and_uncompress test=develop
* solve cmake fails on inference_download_and_uncompress test=develop
-
- 14 Nov, 2019 1 commit
-
-
Committed by Adam
* Add relative error measure when value > 1 test=develop
* Move code to CheckError function test=develop
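The idea in this commit, switching from an absolute to a relative error measure once the reference value exceeds 1, can be illustrated with a minimal sketch. The helper name `WithinTolerance`, its signature, and the default tolerance are assumptions made for illustration; this is not the `CheckError` function from the commit.

```cpp
#include <cmath>

// Hypothetical helper (name, signature and tolerance are assumptions).
// Compares an actual value against a reference value:
//   |expected| >  1  -> relative error |expected - actual| / |expected|
//   |expected| <= 1  -> absolute error |expected - actual|
bool WithinTolerance(float expected, float actual, float tol = 1e-3f) {
  const float abs_err = std::fabs(expected - actual);
  if (std::fabs(expected) > 1.0f) {
    // Scale the error by the magnitude of the reference value.
    return abs_err / std::fabs(expected) <= tol;
  }
  return abs_err <= tol;
}
```

With a relative measure, a difference of 0.5 on a reference value of 1000 passes at this tolerance, while the same absolute difference on a value near 0.5 would not.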
-
- 13 Nov, 2019 1 commit
-
-
Committed by Chen Weihang
Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137)
* add examples for error spec, test=develop
* change ENFORCE to ENFORCE_**, test=develop
-
- 08 Nov, 2019 2 commits
-
-
Committed by joanna.wozna.intel
* Add transpose2 INT8 for mkl-dnn test=develop
* Fix test_transpose_int8_mkldnn test=develop
* Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdb, reversing changes made to 2ce6473f.
* Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd7.
* Add template to TransposeMKLDNNHandler test=develop
* Resolve conflict test=develop
* Restore get_size and refactor test=develop
-
Committed by GaoWei8
* Add ernie unit test test=develop
* Add ernie unit test test=develop
* Add ernie unit test test=develop
* remove ngraph
* optimize gpu test test=develop
* optimize codes test=develop
-
- 23 Oct, 2019 2 commits
- 20 Oct, 2019 1 commit
-
-
Committed by bingyanghuang
-
- 18 Oct, 2019 2 commits
-
-
Committed by 石晓伟
* support MLU nums, test=develop
* change anakin apis, test=develop
-
Committed by liu zhengxi
Change how the out_size parameter is passed to the function.
-
- 17 Oct, 2019 1 commit
-
-
Committed by liu zhengxi
-
- 16 Oct, 2019 1 commit
-
-
Committed by lidanqing
-
- 15 Oct, 2019 1 commit
-
-
Committed by liu zhengxi
* fix the PD_ZeroCopyPredictorRun output problem and add some checks and logs for users
* modify the CMakeLists dependencies and fix the CMakeLists problem
-
- 14 Oct, 2019 2 commits
-
-
Committed by bingyanghuang
-
Committed by Pei Yang
-
- 13 Oct, 2019 1 commit
-
-
Committed by zhaoyuchen2018
* Add multihead fuse pass for ernie opt
* Refine softmax test=develop
* Refine cuda kernel
* Refine cuda version
* Refine cmake test=develop
* refine header file
* refine test case and pass
* refine comments
-
- 12 Oct, 2019 1 commit
-
-
Committed by Adam
* Add ConvTranspose + BatchNorm fuse pass test=develop
* Add tests for conv+bn and conv_transpose+bn passes test=develop
-
- 11 Oct, 2019 1 commit
-
-
Committed by liu zhengxi
Remove incorrect "new" in the C-style code.
-
- 10 Oct, 2019 1 commit
-
-
Committed by 石晓伟
-
- 08 Oct, 2019 1 commit
-
-
Committed by liu zhengxi
* add dll to inference capi, test=develop
* add if win32 in cmakelists, test=develop
-
- 05 Oct, 2019 1 commit
-
-
Committed by liu zhengxi
* add capi for fluid inference api, including AnalysisConfig, AnalysisPredictor, PaddleBuf, PaddleTensor, ZeroCopyTensor
-
- 30 Sep, 2019 1 commit
-
-
Committed by Wilber
* fix compile with anakin bug
* remove useless deps test=develop
Fixed bugs encountered when building together with anakin:
  - test_anakin_activate failed to compile
  - test_anakin_engine failed to compile
-
- 27 Sep, 2019 1 commit
-
-
Committed by 石晓伟
* update operator compatible info, test=develop
* revert cmake/version.cmake, test=develop
* add unit_tests and fix bugs, test=develop
* update ../paddle/fluid/framework/framework.proto, test=develop
* fix bug of paddle/fluid/inference/api/analysis_predictor.cc, test=develop
* update paddle/fluid/framework/version_test.cc, test=develop
* add comments and rename interfaces, test=develop
-
- 25 Sep, 2019 2 commits
-
-
Committed by Zhaolong Xing
Fix C++ inference bug: enabling memory optim and the trt subgraph at the same time triggers a bug (#19969)
* fix memory optimization type test=develop
* 1. fix BUG: open trt and memory optim will trigger bug. 2. Clean memory optim bug. test=develop
-
Committed by Aurelius84
* Removing last dims constraints of seq_pad and seq_unpad test=develop
* fix test_layer api code test=develop
* fix sequence_pad_op.cc conflict test=develop
* remove test_analyzer_mm_dnn test=develop
* fix vectorize bug test=develop
* fix vectorize<int> test=develop
-
- 21 Sep, 2019 3 commits
-
-
Committed by pawelpiotrowicz
test=develop
-
Committed by Pei Yang
* add TRT shape check, test=develop
* model_input_shape == runtime_input_shape, refine message, test=develop
-
Committed by Pei Yang
* fix trt bugs when sharing params, test=develop
* add unittest for cascade_rcnn
-
- 20 Sep, 2019 1 commit
-
-
Committed by 石晓伟
-
- 19 Sep, 2019 1 commit
-
-
Committed by Yiqun Liu
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op. test=develop
* Add comment.
* Implement the unittest. test=develop
* Change the unittest name of layer_norm. test=develop
-
- 18 Sep, 2019 1 commit
-
-
Committed by 石晓伟
-
- 17 Sep, 2019 1 commit
-
-
Committed by Pei Yang
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
-