- 28 Nov, 2019 2 commits
-
-
Committed by Zeng Jinle
-
Committed by Zeng Jinle
* fix ref_cnt pass, test=develop
* add cpp unittests to reference_count_pass, test=develop
* follow comments, test=develop
-
- 27 Nov, 2019 4 commits
-
-
Committed by hutuxian
* support data_norm_op run in CUDA
* add two parameters sync_stats & summary_decay_rate
* add UT
-
Committed by GaoWei8
test=develop
-
Committed by Michał Gallus
* Implement Int8 FC
* Integrate FC into INT8v2 test=develop
* int8 FC: transpose weights before computing scales test=develop
* Add support for activation_type string in FC test=develop
* Disable MKL-DNN's FC in VGG16 and 19 test=develop
* Disable FC quantization when mkldnn FC is disabled test=develop
* Solve PADDLE_ENFORCES in FC int8
* Fix Paddle enforces and remove const cast test=develop
* Fix style changes test=develop
* Fix quantizer_tester test and add fc quantization test=develop
* Fix FC test fail on CUDA
* Remove unnecessary log from quantize placement pass test=develop
* Add Thread ID to FC hash key test=develop
* Add comments to MKL-DNN FC Kernel test=develop
* Refactor quantizer test=develop
* Fix linter issues test=develop
* Fix crash in slim googlenet test=develop
* Fix PADDLE_ENFORCE messages test=develop
-
Committed by Zeng Jinle
-
- 26 Nov, 2019 7 commits
-
-
Committed by Youwei Song
* add axis check for concat op test=develop
* fix PADDLE_ENFORCE format test=develop
* move to ComputeAxis for InferShape check test=develop
-
Committed by zhaoyuchen2018
* Fix ernie python infer diff
* Refine mask test=develop
-
Committed by Lv Mengsi
* fix_bn
* revert unittest, test=develop
-
Committed by lilong12
* add the framework support for distfc and ut, test=develop
* fix the implementation of shard_index_op, test=develop
-
Committed by GaoWei8
* Add fc padding to solve mkl performance test=develop
* fix gpu pass and error information test=develop
* fix fc_fuse_pass_test test=develop
* fix error information test=develop
* fix error information test=develop
* fix name and add fc op padding test test=develop
* fix attributes test=develop
* optimize fc padding test=develop
* fix test test=develop
-
Committed by Jacek Czaja
-
Committed by Michał Gallus
* Refactor MKL-DNN ElementwiseMul remove manual fallback, remove format attrs test=develop
* Refine PADDLE_ENFORCEs in eltwise_mul_op.h test=develop
* Make ElementwiseMulOp inherit from ElementwiseOp
* Change type of simd_width to int test=develop
* Remove Constructor extensions in ElementwiseOp and ElementwiseMulOp test=develop
* Restore attributes test=develop
* Fix test coverage for mkldnn eltwise mul test=develop
* Conform to new is_run_common_broadcast API test=develop
* Add UT for AreDimsAndFormatCorrect test=develop
-
- 25 Nov, 2019 4 commits
-
-
Committed by zhouwei25
-
Committed by wangchaochaohu
* fix the fill_constant op precision problem test=develop
-
Committed by zhaoyuchen2018
* Improve argsort performance.
  - Running argsort on 200000 elements on a V100 is ~190x faster: 0.53s before optimization, 0.0027s after
  - Add fp16 support
* Refine error message
* Refine code test=develop

Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-
Committed by WangXi
-
- 24 Nov, 2019 2 commits
-
-
Committed by Leo Zhao
* use prefetch to load next mem into cache test=develop
* remove hard-coded memcpy in pyramid_hash_ff test=develop
-
Committed by gongweibao
-
- 22 Nov, 2019 5 commits
-
-
Committed by Yihua Xu
* Fix the crash when scale or bias is a null pointer. test=develop
* Add the error message for passing CI. test=develop
-
Committed by Zhang Ting
-
Committed by Liufang Sang
* add int8 kernel to lookup_table op and add dequantize op test=develop
* change paddle_enforce to paddle_enforce_eq test=develop
* change copyright and change some unsuitable code test=develop
* remove debug log test=develop
* replace GetInputType with IndicateVarDataType test=develop
* fix EmptyGradMaker test=develop
* fix diff between cpu and gpu test=develop
* use memcopy when int8_t test=develop
-
Committed by hutuxian
Previously, the CVM op could run only on CPU. This PR implements its GPU kernel and improves the unit tests for the CVM op.
-
Committed by Yihua Xu
* Avoid using a string as the map key to improve JIT performance. test=develop
* Use map to replace unordered_map. test=develop
-
- 21 Nov, 2019 1 commit
-
-
Committed by zhongpu
* open dygraph op test, test=develop
* modify to_variable, test=develop
* modify input and output for dygraph, test=develop
* modify input and output for dygraph(fix bug), test=develop
* fix input processing of dygraph op test, test=develop
* fix bug, test=develop
* fix op test, test=develop
* fix forward bug for dygraph, test=develop
* fix mkldnn op test for forward, test=develop
* update nn.py for dygraph, test=develop
* fix crop_tensor_op, test=develop
* fix elementwise_mul_op, test=develop
* fix fill_op, test=develop
* fix some mkldnn op, test=develop
* open backward op test for dygraph, test=develop
* delete log, test=develop
* close backward op test for dygraph, test=develop
* fix bug for edit_distance_op and test_lstm_cudnn_op, test=develop
* fix optest backward bug for dygraph, test=develop
* fix optest backward bug for dygraph, test=develop
* close backward op test for dygraph, test=develop
* close backward op test for dygraph, test=develop
* open dygraph op test, test=develop
* fix op test for dygraph, fix GradOpDescMaker, test=develop
* fix bug for linear_chain_crf_op.h, test=develop
* remove log, test=develop
* remove log, test=develop
* remove log for op_test.py, test=develop
* remove log for op_test.py, test=develop
* fix bug for var_conv_2d_op, change PADDLE_ENFORCE, test=develop
* fix PADDLE_ENFORCE_EQ for hierarchical_sigmoid_op.cc, test=develop
* fix bug for test_increment_ngraph_op.py, test=develop
* fix lod for op test in dygraph, test=develop
* refactor op_test.py to reduce redundant code, test=develop
* fix lod optest, modify InputVar/OutputVar to HasInput/HasOutput, test=develop
* remove debug log, test=develop
* remove redundant code in base.py, test=develop
* fix some error in optest, test=develop
* fix ClearNoNeedBufferInputs function's bug for LoDTensor, test=develop
* refactor op_test.py, test=develop
* remove redundant writing, test=develop
* fix error(get tensor of the grad variable), test=develop
* fix test_concat_mkldnn test_conv2d_mkldnn, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix optest.py for get tensor of LoDTensor, test=develop
* fix some redundant code, test=develop
* resolve conflict and rewrite paddle error message, test=develop
-
- 20 Nov, 2019 3 commits
-
-
Committed by danleifeng
-
Committed by zhaoyuchen2018
* Fix topk compile failure on Windows
* Use explicit cast for assign data
-
Committed by Zhang Ting
* optimize assign op to avoid copying data from GPU to GPU, test=develop
* modified GetkernelTypeForVar and just avoid device transform, test=develop
-
- 19 Nov, 2019 3 commits
-
-
Committed by danleifeng
-
Committed by Adam
test=develop
-
Committed by yaoxuefeng
* fix auc drop first commit test=develop
* update datanorm op
* update datanorm with enforce test=develop
* update test=develop
* update format test=develop
* update format
* update format test=develop
* add unit test test=develop
* update unit test test=develop
* update format test=develop
* update format test=develop
* update API description test=develop
* update API description test=develop
* update format test=develop
* fix codes as comments test=develop
* fix description as comments test=develop
* fix description as comments test=develop
* update codes.. test=develop
-
- 18 Nov, 2019 3 commits
-
-
Committed by Zhang Ting
* modified error message for conv and conv_transpose, test=develop
* modified doc of conv and conv_transpose op, test=develop
* modified the expression for error message, test=develop
* modified error message for group_norm op, test=develop
* modified detail of Attr(data_format) or Attr(data_layout)
* add ValueError in API doc for maxout op, test=develop
-
Committed by guofei
-
Committed by WangXi
-
- 15 Nov, 2019 2 commits
- 14 Nov, 2019 4 commits
-
-
Committed by Kaipeng Deng
-
Committed by whs
-
Committed by Chen Weihang
Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134)
* add examples for error msg spec, test=develop
* change ENFORCE to ENFORCE_**, test=develop
* add more already exists examples, test=develop
-
Committed by zhaoyuchen2018
* Improve topk performance. Computing topk on 200000 elements: 1s before optimization, 0.0028s after.
* Refine return value.
* Add cuda util functions.
* Fix ComputeBlockSize bug & refine comments.

Signed-off-by: Nzhaoyuchen <zhaoyuchen01@baidu.com>
-