- 31 Dec, 2020 1 commit
-
-
Committed by cc
* Add mkldnn nearest_interp and bilinear_interp op
* don't run mkldnn interpolate by default
* add interpolate_mkldnn_pass
-
- 29 Dec, 2020 1 commit
-
-
Committed by cc
* map matmul/squeeze2+matmul/reshape2+matmul to mul
-
- 28 Dec, 2020 1 commit
-
-
Committed by Wilber
-
- 24 Dec, 2020 1 commit
-
-
Committed by jakpiase
-
- 23 Dec, 2020 1 commit
-
-
Committed by YUNSHEN XIE
* remove duplicate ut reload
* remove duplicate ut define in cmakelist
-
- 25 Nov, 2020 1 commit
-
-
Committed by Wojciech Uss
* Add multi_gru_fuse_pass and tests
* fix date
* cleaned up headers
-
- 24 Nov, 2020 1 commit
-
-
Committed by Wojciech Uss
* Add multi_gru_seq_fuse_pass and tests
* fix date
* removed unused functions
-
- 04 Nov, 2020 1 commit
-
-
Committed by 石晓伟
* enhance the op_version_registry, test=develop
* add unittests, test=develop
* enhance the op_version_registry, test=develop
* fix bugs, test=develop
* revert pybind_boost_headers.h, test=develop
* fix an attribute bug, test=develop
-
- 27 Oct, 2020 1 commit
-
-
Committed by Zhang Ting
* add fuse_bn_add_act pass
-
- 26 Oct, 2020 1 commit
-
-
Committed by Adam Osewski
-
- 14 Sep, 2020 1 commit
-
-
Committed by joanna.wozna.intel
-
- 29 Jul, 2020 1 commit
-
-
Committed by Zhou Wei
-
- 21 Jun, 2020 1 commit
-
-
Committed by Shibo Tao
* don't re-generate header file if content doesn't change. test=develop
* add copy_if_different function. test=develop
-
- 09 Jun, 2020 1 commit
-
-
Committed by Sylwester Fraczek
* remove gmock from ut test=develop
* coverage enabled for r+t+m fuse pass test=develop
-
- 03 Jun, 2020 1 commit
-
-
Committed by Jacek Czaja
-
- 08 May, 2020 1 commit
-
-
Committed by Huihuang Zheng
1. To make ProgramTranslator support the `assert` grammar, this PR adds an `assert` Python API and C++ code.
2. Fix a bug: graph_pattern_detector.h includes <gtest/gtest_prod.h> but did not declare the dependency in CMakeLists, which can cause a single build failure.
3. Refactor `Formatter` in print_op to make it reusable, and reuse the formatter to print in the assert op.
-
- 28 Apr, 2020 1 commit
-
-
Committed by Sylwester Fraczek
-
- 24 Apr, 2020 2 commits
-
-
Committed by Jacek Czaja
-
Committed by arlesniak
-
- 22 Apr, 2020 1 commit
-
-
Committed by Jacek Czaja
-
- 13 Apr, 2020 1 commit
-
-
Committed by joanna.wozna.intel
-
- 09 Apr, 2020 1 commit
-
-
Committed by mozga-intel
* Remove the NGraph engine from PDPD repository
  1. Each operator was removed from the operator's directory
  2. Each test was removed from the unittest directory
  3. The parallel executor support was removed from the PDPD
  4. The CMake file was removed from the PDPD
  5. The NG flags were removed from the repository
  test=develop
* Remove ngraph from:
  1. Cmake file
  2. Python file
  test=develop
-
- 01 Apr, 2020 1 commit
-
-
Committed by Jacek Czaja
-
- 27 Mar, 2020 1 commit
-
-
Committed by Tao Luo
test=develop
-
- 20 Mar, 2020 1 commit
-
-
Committed by Wilber
update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input
-
- 11 Mar, 2020 2 commits
-
-
Committed by Wilber
* add skip_layernorm pass. test=develop
-
Committed by Zhaolong Xing
* 1. add embedding eltwise layernorm fuse
  2. add embedding eltwise layernorm op
  3. refine inplace_add_relu
  4. refine fc_eltwise_layernorm
  test=develop
* 1. refine fc
  test=develop
* fix comments test=develop
* fix comments test=develop
-
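- 14 Feb, 2020 1 commit
-
-
Committed by Wilber
When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes should be removed.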
-
- 07 Feb, 2020 1 commit
-
-
Committed by Yiqun Liu
* Add the first implementation of fusion_group op #19621 (#3)
* Add the dynamic load of nvrtc, and support runtime compiling of CUDA kernel using nvrtc. test=develop
* Call CUDA driver api to launch the kernel compiled by nvrtc. test=develop
* Disable for mac and windows. test=develop
* Refine the codes to support manually specified num_threads and workload_per_thread. test=develop
* Refine the CUDA kernel to support large dims. test=develop
* Add DeviceCodePool to manage all device codes.
* Add the first implementation fusion_group op.
* Add unit-test for fusion_group op.
* Add the check of result.
* Add the check of nvrtc in unit-test. test=develop
* Add comment to explain the inputs, outputs and features of fusion_group op. test=develop
* Disable fusion_group op for mac and windows. test=develop
* Make the compiling of device code return status instead of hanging up. test=develop
* Add the check of whether there is CUDA driver library, and do not core dump when failing to call the CUDA driver API.
* Unify fusion_group_op's input and output names. test=develop
* Add the check of CUDA driver library in unittest. test=develop
* Enable generating code for a given subgraph. #21126 (#4)
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions. test=develop
* Improve the loading statements in generated codes. test=develop
* Remove unused arguments from formal list. test=develop
* Enable the detection of subgraph of grad ops.
* Generate code for detected subgraph in fusion_group_pass.
* Add an option in BuildStrategy to enable fusion_group_pass and add unittest. test=develop
* Fix a bug when checking whether the shape of all inputs are the same.
* Add debug information.
* Move subgraph_detector from inference/analysis to the common framework/ir directory. (#5) test=develop
* Call subgraph_detector in fusion_group pass. test=develop
* Disable fusion_group when WITH_GPU is OFF. test=develop
* Refine all PADDLE_ENFORCE message. test=develop
* Fix the case that some inputs are not defined in grad ops, and set op_role for fused op. test=develop
* Follow review comments. test=develop
-
- 04 Feb, 2020 1 commit
-
-
Committed by 石晓伟
-
- 10 Jan, 2020 1 commit
-
-
Committed by Zhen Wang
* add bn and relu fuse pass
* add op attr assert and dtype assert
* fix some inputs&&outputs bugs for the fused op and pattern.
* add the unittest for fuse_bn_act_pass. test=develop
* use normative enforce statements. test=develop
* add the cpu test. test=develop
* add the support of batch_size=1 for the bn with relu op. test=develop
* add the error type for paddle throws. test=develop
* add fused_batch_norm_act and fused_batch_norm_act_grad to op_has_unsed_vars_white_list. test=develop
-
- 07 Jan, 2020 1 commit
-
-
Committed by Yiqun Liu
test=develop
-
- 24 Nov, 2019 1 commit
-
-
Committed by Yiqun Liu
* Disable fusion_group pass for windows and mac. We will do some experiments on Linux first. test=develop
* Print the subgraph when the check fails. test=develop
-
- 29 Oct, 2019 1 commit
-
-
Committed by Yiqun Liu
* Add fusion_group_pass and elementwise pattern.
* Rewrite the detector of elementwise group. test=develop
* Add a comment in codegen.
* Add more unittest cases. test=develop
* Move code_generator related code to fusion_group directory.
* Correct the including path.
* Add the definition of SubGraph and finish the insert of fusion_group op in pass.
* Insert graph_vis_pass in tester to visualize the graph for debug.
-
- 24 Oct, 2019 1 commit
-
-
Committed by wangchaochaohu
-
- 13 Oct, 2019 1 commit
-
-
Committed by zhaoyuchen2018
* Add multihead fuse pass for ernie opt
* Refine softmax test=develop
* Refine cuda kernel
* Refine cuda version
* Refine cmake test=develop
* refine header file
* refine test case and pass
* refine comments
-
- 12 Oct, 2019 1 commit
-
-
Committed by Adam
* Add ConvTranspose + BatchNorm fuse pass test=develop
* Add tests for conv+bn and conv_transpose+bn passes test=develop
-
- 27 Sep, 2019 1 commit
-
-
Committed by wangchaochaohu
* codegen code for reconstruction test=develop
* fix the cmake test=develop
* fix review advice test=develop
-
- 19 Sep, 2019 1 commit
-
-
Committed by Yiqun Liu
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op. test=develop
* Add comment.
* Implement the unittest. test=develop
* Change the unittest name of layer_norm. test=develop
-
- 16 Sep, 2019 1 commit
-
-
Committed by Yiqun Liu
* Refine the codes related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference. test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op. test=develop
* Move the declaration of FCOpGrad back to the header file. test=develop
* Set default value for newly added arguments in test_fc_op. test=develop
* Enhance fc_fuse_pass to enable fusing relu.
* Allow printing the shapes of var_desc in graph. test=develop
* Enhance fc_fuse_pass_tester.
* Remove the use of PADDLE_ENFORCE. test=develop
* Correct the number of ops after fusing. test=develop
* Fix a typo. test=develop
* Set activation_type to null when there is no relu in fc. test=develop
* Refine fc_fuse_pass's codes.
* Enable the set of shape for tensor.
* Refine repeated_fc_relu_pass and add unittest. test=develop
-