- 27 11月, 2019 1 次提交
-
-
由 Michał Gallus 提交于
* Implement Int8 FC * Integrate FC into INT8v2 test=develop * int8 FC: transpose weights before computing scales test=develop * Add support for activation_type string in FC test=develop * Disable MKL-DNN's FC in VGG16 and 19 test=develop * Disable FC quantization when mkldnn FC is disabled test=develop * Solve PADDLE_ENFORCES in FC int8 * Fix Paddle enforces and remove const cast test=develop * Fix style changes test=develop * Fix quantizer_tester test and add fc quantization test=develop * Fix FC test fail on CUDA * Remove unnecessary log from quantize placement pass test=develop * Add Thread ID to FC hash key test=develop * Add comments to MKL-DNN FC Kernel test=develop * Refactor quantizer test=develop * Fix linter issues test=develop * Fix crash in slim googlenet test=develop * Fix PADDLE_ENFORCE messages test=develop
-
- 26 11月, 2019 1 次提交
-
-
由 GaoWei8 提交于
* Add fc padding to solve mkl performance test=develop * fix gpu pass and error information test=develop * fix fc_fuse_pass_test test=develop * fix error information test=develop * fix error information test=develop * fix name and add fc op padding test test=develop * fix attributes test=develop * optimize fc padding test=develop * fix test test=develop
-
- 25 11月, 2019 1 次提交
-
-
由 zhouwei25 提交于
-
- 24 11月, 2019 1 次提交
-
-
由 Yiqun Liu 提交于
* Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. test=develop * Print the subgraph when check failed. test=develop
-
- 22 11月, 2019 1 次提交
-
-
由 Chen Weihang 提交于
* polish code details, test=develop * futher polish hint msg, test=develop
-
- 20 11月, 2019 1 次提交
-
-
由 Yiqun Liu 提交于
* Enable generating code for a given subgraph. * Support sorting the subgraph. * Remove the rearange of expressions because we use the sorted subgraph directly. * Enable generating code for a subgraph which is composed of grad ops. * Use expression information to check the accuracy in unittest. * Separate load and store from computation expressions. test=develop * Improve the loading statements in generated codes. test=develop * Remove unused arguments from formal list. test=develop
-
- 18 11月, 2019 2 次提交
-
-
由 Zeng Jinle 提交于
* fix warnings oof gcc 8 compilation, test=develop * fix boost::bad_get, test=develop * refine PADDLE_ENFORCE, test=develop
-
由 Zhaolong Xing 提交于
* refine trt int8 for dynamic range set test=develop * refine trt int8 test=develop
-
- 14 11月, 2019 1 次提交
-
-
由 Chen Weihang 提交于
Add examples for error message writing specification - NotFound, OutOfRange, AlreadyExists, PermissionDenied (#21134) * add examples for error msg spec, test=develop * change ENFORCE to ENFORCE_**, test=develop * add more already exists examples, test=develop
-
- 13 11月, 2019 1 次提交
-
-
由 Chen Weihang 提交于
Add examples for error message writing specification - PreconditionNotMet, Unimplemented, Unavailable (#21137) * add examples for error spec, test=develop * change ENFORCE to ENFORCE_**, test=develop
-
- 12 11月, 2019 1 次提交
-
-
由 WangXi 提交于
-
- 11 11月, 2019 2 次提交
-
-
由 Chen Weihang 提交于
* add pre condition check for fuse optimizer op pass, test=develop * add log & set init to zero, test=develop * fix test_fuse_all_reduce_pass failed, test=develop * polish details, test=develop * refine PADDLE_ENFORCE & remove needless VLOG, test=develop * refactor op check method, test=develop
-
由 Yiqun Liu 提交于
* Add the definition of operation in fusion_group. * Use operations in OperationMap to detect fusion_group of elementwise pattern. * Add namespace fusion_group in code_generator. * Use operations recorded in OperationMap to generate code. * Remove implementation codes to .cc file. * Refine Operation and CodeGenerator to make it easier to generate code for grad_op. Refine the unittest for better reuse. * Avoid recording the template's keyword in a array. * Support the generating of code for grad_op and add unittest. test=develop * Remove replaced_element_in_order and use use number instead. test=develop
-
- 08 11月, 2019 1 次提交
-
-
由 joanna.wozna.intel 提交于
* Add transpose2 INT8 for mkl-dnn test=develop * Fix test_transpose_int8_mkldnn test=develop * Revert "Merge branch 'develop' into transpose_int8_mkldnn_2" This reverts commit 34011bdb, reversing changes made to 2ce6473f. * Revert "Revert "Merge branch 'develop' into transpose_int8_mkldnn_2"" This reverts commit 23754dd7. * Add template to TransposeMKLDNNHandler test=develop * Resolve conflict test=develop * Restore get_size and refactor test=develop
-
- 05 11月, 2019 1 次提交
-
-
由 Zeng Jinle 提交于
* support no need buffer vars in dygraph, test=develop * fix inference compilation error, test=develop * update no_need_buffer_vars_inference, test=develop * add unittests for no_need_buffer_vars_context, test=develop * refine no_need_buffer_vars by return ref, test=develop * polish some codes, test=develop
-
- 02 11月, 2019 1 次提交
-
-
由 Wilber 提交于
fix squared_mat_sub_fuse_pass when elementwise_op input is from persistable param test=develop (#20960) fix squared_mat_sub_fuse_pass when elementwise_op input is from persistable param
-
- 01 11月, 2019 1 次提交
-
-
由 WangXi 提交于
-
- 31 10月, 2019 1 次提交
-
-
由 hong 提交于
* refactor dygraph,test=develop * fix failed unittest,test=develop * polish code,test=develop * check windows ci error,test=develop try to fix windows ci error by np.allclose,test=develop * polish vlog and profiler, test=develop * try to fix preceding ops order,test=develop * test transformer in windows ci, test=develop * use python c-api to speed up tracer.trace,test=develop * test=develop, fix docker with paddle nccl problem * test=develop, add ut for debug string and gradient_accumulator * test=develop, add tests for layer/gradient_accumulator/prepared_op * test=develop, fix complie error for test_prepared_op * test=develop, add more ut for dygraph * test=develop, create API.spec for dygraph api change * optimize grad maker; test=develop * optimize grad maker * test * grad make optim; test=develop * fix unittest bugs; test=develop * add dygraph grad op maker and split_op * grad op maker refactor; test=develop * add dygraph grad maker; test=develop * fix op deformable_conv_v1_op bug; test=develop * fix deformable_conv prroi pool bugs; * fix new op grad op maker bug; test=develop * fix split by ref bug; test=develop * fix dygraph auto prune bug; test=develop * fix test_trace bug; test=develop * fix fused emb seq pool bug; test=develop * remove useless code in op_desc file; test=develop * remove useless code, StrVarBaseNode; test=develop * fix review issues; test=develop * fix rank_loss grad maker; test=develop * remove flag in VarBase; test=develop * fix distributed_notify_op compile bug ; test=develop * fix reshape op double grad; test=develop * fix expand as op; test=develop * add impertive type_defs.h for demo_train; test=develop * fix inference lib cmake; test=develop * fix inference lib; test=develop * fix infernce_lib; test=develop * fix inference cmake; test=develop * fix inference lib; test=develop * fix inference lib; test=develop * remove condition dygraph grad maker, modify local name; test=develop * fix split grad maker bug; test=develop * fix pyramid_op bug; test=develop * change travis time out limit; test=develop * restore travis; test=develop * change timeout limit; test=develop
-
- 29 10月, 2019 1 次提交
-
-
由 Yiqun Liu 提交于
* Add fusion_group_pass and elementwise pattern. * Rewrite the detector of elementwise group. test=develop * Add a comment in codegen. * Add more unittest cases. test=develop * Move code_generator related code to fusion_group directory. * Correct the including path. * Add the definition of SubGraph and finish the insert of fusion_group op in pass. * Insert graph_vis_pass in tester to visualize the graph for debug.
-
- 24 10月, 2019 1 次提交
-
-
由 wangchaochaohu 提交于
-
- 19 10月, 2019 1 次提交
-
-
由 石晓伟 提交于
-
- 18 10月, 2019 1 次提交
-
-
由 wopeizl 提交于
* add support to gcc8, add docker env test=develop
-
- 15 10月, 2019 1 次提交
-
-
由 WangXi 提交于
-
- 14 10月, 2019 1 次提交
-
-
由 Pei Yang 提交于
-
- 13 10月, 2019 1 次提交
-
-
由 zhaoyuchen2018 提交于
* Add multihead fuse pass for ernie opt * Refine softmax test=develop * Refine cuda kernel * Refine cuda version * Refine cmake test=develop * refine header file * refine test case and pass * refine comments
-
- 12 10月, 2019 1 次提交
-
-
由 Adam 提交于
* Add ConvTranspose + BatchNorm fuse pass test=develop * Add tests for conv+bn and conv_transpose+bn passes test=develop
-
- 28 9月, 2019 1 次提交
-
-
由 bingyanghuang 提交于
* Follow Wangzhen's comment in PR 18970, test=develop * Review comments, test=develop * Leave fake quantization around mul test=develop * Replace Fake with Real Quantized Mul test=develop * Fix bug in quantize placement pass Nodes in the graph now have checked type instead of node name when they are to be marked for quantization test=develop
-
- 27 9月, 2019 2 次提交
-
-
由 joanna.wozna.intel 提交于
* Fix conv2d+dequantize squash for residual fusion test=develop * Disable conv-requant squash test=develop
-
由 wangchaochaohu 提交于
* codegen code for reconstruction test=develop * fix the cmake test=develop * fix review advice test=develop
-
- 26 9月, 2019 1 次提交
-
-
由 chengduo 提交于
Add dtype for coalesce_tensor_op
-
- 19 9月, 2019 2 次提交
-
-
由 joanna.wozna.intel 提交于
* Fix conv2d+dequantize squash for residual fusion test=develop * Change condition test=develop
-
由 Yiqun Liu 提交于
* Add fc_elementwise_layernorm_fuse pass and unittest. * Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop * Apply fc_elementwise_layernorm_fuse_pass to GPU inference. * Add the setting of attrs in the definition of binary_op. test=develop * Add comment. * Implement the unittest. test=develop * Change the unittest name of layer_norm. test=develop
-
- 18 9月, 2019 2 次提交
-
-
由 Zeng Jinle 提交于
-
由 Zeng Jinle 提交于
* fix memory reuse bug on feeding variables, test=develop * add comments to reference count members, test=develop
-
- 16 9月, 2019 2 次提交
-
-
由 chengduo 提交于
* fix warning info test=develop * fix bug of all_reduce_deps_pass test=develop
-
由 Yiqun Liu 提交于
* Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop * Enhance fc_fuse_pass to enable fusing relu. * Allow print the shapes of var_desc in graph. test=develop * Enhance fc_fuse_pass_tester. * Remove the use of PADDLE_ENFORCE. test=develop * Correct the number of ops after fusing. test=develop * Fix a typo. test=develop * Set activation_type to null when there is no relu in fc. test=develop * Refine fc_fuse_pass's codes. * Enable the set of shape for tensor. * Refine repeated_fc_relu_pass and add unittest. test=develop
-
- 13 9月, 2019 1 次提交
-
-
由 chengduo 提交于
* Open fuse all reduce op test=develop * Add Fuse optimization op log * Add log in fuse_optimizer op pass and fuse all_reduce op pass * replace with boost::optional<bool> test=develop * Polish code test=develop * fix code coverage test=develop
-
- 11 9月, 2019 3 次提交
-
-
由 chengduo 提交于
* fix vlog level and fuse option type test=develop
-
由 Yiqun Liu 提交于
* Refine the codes related to fc op. * Add GPU implementation for fc functor. * Apply fc_fuse_pass in GPU inference. test=develop * Change the cmake for fc op. * Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ. * Add an attribute to set the activation type in fc_op. * Enhance the unittest of fc_op. test=develop * Remove the declaration of FCOpGrad back to the header file. test=develop * Set default value for newly added arguments in test_fc_op. test=develop
-
由 chengduo 提交于
* Enable fused_all_reduce_op_handle support GPU and CPU Gradients
-