- 19 Jun, 2023, 1 commit
Committed by wz1qqx
- 02 Jun, 2023, 1 commit
Committed by wz1qqx
- 22 May, 2023, 1 commit
Committed by zhupengyang
- 21 Apr, 2023, 1 commit
Committed by zhupengyang
- 29 Mar, 2023, 1 commit
Committed by zhupengyang
- 22 Mar, 2023, 1 commit
Committed by zhupengyang
- 20 Mar, 2023, 1 commit
Committed by mayang002
- 09 Feb, 2023, 1 commit
Committed by joanna.wozna.intel
* Adjust mkldnn_placement_pass to check library type and data type
* Check if var has inputs
* Remove unrelated test
* Refactor
- 08 Dec, 2022, 1 commit
Committed by RichardWooSJTU
* rewrite delete_weight_deqquant_linear_op_encoder/decoder pass
- 10 Nov, 2022, 1 commit
Committed by RichardWooSJTU
* add fuse_multi_transformer_layer_pass
- 20 Oct, 2022, 1 commit
Committed by Kaipeng Deng
* add fused_multi_transformer_encoder/decoder pass; GPT-3 runs successfully
- 26 Jun, 2022, 1 commit
Committed by Sing_chan
- 05 Jun, 2022, 1 commit
Committed by Sing_chan
- 14 Dec, 2021, 1 commit
Committed by Sylwester Fraczek
* reshape+transpose+matmul_v2
* in_name->input_name
* fix pr-ci-static-check
- 28 Jun, 2021, 1 commit
Committed by 王明冬
- 18 Jun, 2021, 1 commit
Committed by Wangzheee
- 12 Jun, 2021, 1 commit
Committed by joanna.wozna.intel
* Small changes related to BF16 fusion_gru and fusion_lstm
* Correct to pass arg by value
* Add conditions to rnn op
* Correct the spelling mistake
* Improve the test by checking activation
* Trigger CI
- 10 Jun, 2021, 1 commit
Committed by 王明冬
- 28 Dec, 2020, 1 commit
Committed by Wilber
- 28 Apr, 2020, 1 commit
Committed by Sylwester Fraczek
- 19 Apr, 2020, 1 commit
Committed by Yiqun Liu
- 11 Mar, 2020, 1 commit
Committed by Zhaolong Xing
* 1. add embedding eltwise layernorm fuse
  2. add embedding eltwise layernorm op
  3. refine inplace_add_relu
  4. refine fc_eltwise_layernorm test=develop
* 1. refine fc test=develop
* fix comments test=develop
* fix comments test=develop
- 21 Feb, 2020, 1 commit
Committed by Yiqun Liu
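- 14 Feb, 2020, 1 commit
Committed by Wilber
When a model contains multiple fc_lstm subgraphs whose fc ops share the same persistable bias, the bias node should not be deleted; only the non-persistable nodes should be removed.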
- 09 Jan, 2020, 1 commit
Committed by Yiqun Liu
* Polish the PADDLE_ENFORCE in fusion_group pass related code. test=develop
* Correct the unittest because of the change to relu_grad's formula. test=develop
- 20 Nov, 2019, 1 commit
Committed by Yiqun Liu
* Enable generating code for a given subgraph.
* Support sorting the subgraph.
* Remove the rearrangement of expressions because we use the sorted subgraph directly.
* Enable generating code for a subgraph which is composed of grad ops.
* Use expression information to check the accuracy in unittest.
* Separate load and store from computation expressions. test=develop
* Improve the loading statements in generated code. test=develop
* Remove unused arguments from formal list. test=develop
- 29 Oct, 2019, 1 commit
Committed by Yiqun Liu
* Add fusion_group_pass and elementwise pattern.
* Rewrite the detector of elementwise group. test=develop
* Add a comment in codegen.
* Add more unittest cases. test=develop
* Move code_generator related code to fusion_group directory.
* Correct the include path.
* Add the definition of SubGraph and finish the insertion of fusion_group op in pass.
* Insert graph_vis_pass in tester to visualize the graph for debugging.
- 13 Oct, 2019, 1 commit
Committed by zhaoyuchen2018
* Add multihead fuse pass for ernie opt
* Refine softmax test=develop
* Refine cuda kernel
* Refine cuda version
* Refine cmake test=develop
* refine header file
* refine test case and pass
* refine comments
- 12 Oct, 2019, 1 commit
Committed by Adam
* Add ConvTranspose + BatchNorm fuse pass test=develop
* Add tests for conv+bn and conv_transpose+bn passes test=develop
- 19 Sep, 2019, 1 commit
Committed by Yiqun Liu
* Add fc_elementwise_layernorm_fuse pass and unittest.
* Add fused_fc_elementwise_layernorm op and its GPU kernel. test=develop
* Apply fc_elementwise_layernorm_fuse_pass to GPU inference.
* Add the setting of attrs in the definition of binary_op. test=develop
* Add comment.
* Implement the unittest. test=develop
* Change the unittest name of layer_norm. test=develop
- 16 Sep, 2019, 1 commit
Committed by Yiqun Liu
* Refine the code related to fc op.
* Add GPU implementation for fc functor.
* Apply fc_fuse_pass in GPU inference. test=develop
* Change the cmake for fc op.
* Change PADDLE_ENFORCE to PADDLE_ENFORCE_EQ.
* Add an attribute to set the activation type in fc_op.
* Enhance the unittest of fc_op. test=develop
* Remove the declaration of FCOpGrad back to the header file. test=develop
* Set default value for newly added arguments in test_fc_op. test=develop
* Enhance fc_fuse_pass to enable fusing relu.
* Allow printing the shapes of var_desc in graph. test=develop
* Enhance fc_fuse_pass_tester.
* Remove the use of PADDLE_ENFORCE. test=develop
* Correct the number of ops after fusing. test=develop
* Fix a typo. test=develop
* Set activation_type to null when there is no relu in fc. test=develop
* Refine fc_fuse_pass's code.
* Enable setting the shape of a tensor.
* Refine repeated_fc_relu_pass and add unittest. test=develop
- 03 Sep, 2019, 1 commit
Committed by Yiqun Liu
* Add an interface to enable cudnn for inference.
* Add cudnn_placement_pass. test=develop
* Set the default value of cudnn_enabled_op_types to null. test=develop
* Write a common base class, placement_pass_base, to refine the code. test=develop
* Call EnableCUDNN in unittest. test=develop
* Refine cudnn_placement_pass tester.
* Enable the testing of cudnn_placement_pass in inference's unittest. test=develop
* Add the check of op kernels. test=develop
- 30 Aug, 2019, 1 commit
Committed by Yiqun Liu
* Add simplify_with_basic_ops_pass to replace dropout_op with scale_op when is_test is true. test=develop
* Delete dropout_op directly when upscale_in_train is true. test=develop
* Improve the debug string, adding the print of op_desc information.
* Fix the case when dropout's input x is reused as the next op's output.
* Add the pass to inference. test=develop
* Change the log level. test=develop
* Add unittest for inplace case.
* Add comment to explain the pass.
* Apply the pass for CPU inference. test=develop
* Fix the typo. test=develop
* Add the check of AttrType. test=develop
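
The dropout rewrite described in the simplify_with_basic_ops_pass commit above can be illustrated with a minimal Python sketch. It is not the pass itself: the dict-based op layout and the function name `simplify_dropout` are hypothetical, and the sketch ignores the input/output variable rewiring that a real graph pass has to do when it deletes an op. It assumes the usual dropout semantics at inference time, where `downgrade_in_infer` multiplies the input by `1 - dropout_prob` and `upscale_in_train` passes the input through unchanged.

```python
def simplify_dropout(ops):
    """Rewrite inference-time dropout ops in a toy op list.

    Each op is a plain dict, for example
    {"type": "dropout", "is_test": True, "dropout_prob": 0.1,
     "dropout_implementation": "downgrade_in_infer"}.
    This layout is hypothetical and only stands in for a real op description.
    """
    simplified = []
    for op in ops:
        # Ops that are not dropout, or dropout in training mode, are kept as-is.
        if op.get("type") != "dropout" or not op.get("is_test", False):
            simplified.append(op)
            continue
        if op.get("dropout_implementation") == "upscale_in_train":
            # At inference the output equals the input, so the op is dropped.
            continue
        # downgrade_in_infer: the inference output is x * (1 - dropout_prob),
        # which is exactly a scale op.
        simplified.append({"type": "scale", "scale": 1.0 - op["dropout_prob"]})
    return simplified
```

In this toy form, an inference-mode dropout with `dropout_prob` of 0.25 under `downgrade_in_infer` becomes a scale op with factor 0.75, while the same op under `upscale_in_train` disappears from the list.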