- 30 May, 2023, 1 commit
-
-
Committed by risemeup1
* update_c++17 * update_c++17 * fix windows bug * solve cirle depend * solve cirle depend * solve cirle depend * solve cirle depend * solve cirle depend * fix windows bug * fix compiler error * fix compiler error * update eigen3 * update eigen3 * update eigen3 * fix mac-py3 compiler error * update C++17 * fix mac compiler error * fix compile error * fix coverage_compiler error * fix coverage_ci_problem * fix coverage_error * fix_kunlun200 compile error * fix kunlun200 compiler error * fix compile error * fix compiler error * fix py3 failed test * fix kunlun200 compiler error * test * fix test error * fix test error * fix test error * test * test * fix mac py3 error * fix mac py3 error * fix mac py3 error * fix test error * fix test error * fix compile error * fix compile error * fix compile error * test * test * fix compiler error * test * test * debug on ci * fix compiler error * fix compiler error * test * fix cinn compiler error * test * fix rocm cmpile error * fix cinn and kunlun compile error * update c++14 * Update flags.cmake
-
- 11 May, 2023, 1 commit
-
-
Committed by gouzil
* [test]mv fluid controlflow detection dlnne tensorrt tests to tests * [test]clean dlnne * [test] fix test_tensorrt_engine_op * [test] try fix path error * [test] RollBACK test_tensorrt_engine_op * [test] RollBACK test_tensorrt_engine_op * [test]add todo * Empty-Commit; test=document_fix
-
- 04 April, 2023, 1 commit
-
-
Committed by lzydev
* fix bug of redefine use_equal_all * fix bug of redefine use_equal_all
-
- 29 November, 2022, 1 commit
-
-
Committed by kangguangli
* fix:add no support for cuda_arch<700 * replace Executor in while op with InterpreterCore * cache InterpreterCore as the member of WhileOp * fix bug: tensor place changed because of assign op in while loop * refine code * refine code * refine code * hot fix * fix compile * merge develop * follow comments * add log for test * remove LoDTensor * set flag control_flow_use_new_executor false Co-authored-by: fengshuai <fengshuai03@baidu.com> Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
-
- 31 October, 2022, 1 commit
-
-
Committed by kangguangli
* replace executor in conditional_block_op.run with standalone_executor * add block_id as the argument of standalone executor's method run; add print for program * fix scope bug about conditional block op * fix bug: unnecessary return of fetch value * fix typo * fix: quantization will set variable persistable, and these variables must exist in global scope * add interpretercore cache for conditional block op but not activate in default * fix bug: local scope reuse for conditional block op * reset scope when conditional block op runs * fix typo * fix typo and code style * add build scope for conditional block op * add skip for transfer_layout kernel * refind code * fix reset_scope * fix reset_scope * refine code * refine code * refine code 1. remove flag use in conditional_block_op 2. pass execution_config to BuildOpFuncList instead of individual parameter * refine code * remove the use of FLAGS_control_flow_use_new_executor_cache * change FLAGS_control_flow_use_new_executor to false
-
- 04 June, 2022, 1 commit
-
-
Committed by Sing_chan
-
- 04 March, 2022, 1 commit
-
-
Committed by zhouweiwei2014
* Migrate bitwise_and/or/xor/not op into phi * fix CI
-
- 03 March, 2022, 1 commit
-
-
Committed by From00
* Move compare OPs to phi * Fix bug * Use BroadcastKernel and ElementwiseKernel in phi
-
- 01 March, 2022, 1 commit
-
-
Committed by Aurelius84
* [Phi] Migrate logical_and/or/not/xor into Phi * fix unittest * fix function name
-
- 26 January, 2022, 1 commit
-
-
Committed by Leo Chen
* update cmake file to remove fluid kernel * add pten declaration.h to where pybind.h used * fix sync_bn and tensorrt_engine * refine detection_library * fix interpreter_core * support eager legacy * fit eager legacy for pten * fall back to cpu if not found kernel * fix compile problem * fix compile problem * refine fallback logic * fit operator.run() * fix xpu compile * fit for new_exec * add REGISTER_OP_WITHOUT_GRADIENT * un-cache pt_kernel_context * fix compile * fix cudnn * fix compiling with on_infer * fix mkldnn * fix isfinite_v2 * fix xpu problem * fix op_device * refine fallback for xpu * fix xpu compile * merge develop * refine code format * fix compile * fix compile * add data_transfer * fix PreparePtenData * fix cpu context * merge develop * fix compile * fix error device context * fix xpu * fix dev_ctx
-
- 25 October, 2021, 1 commit
-
-
Committed by TTerror
* add some ops to train ssd on kunlun * add some ops to train ssd on kunlun * add some ops to train ssd on kunlun * update cast op unittest * update cast op unittest * update cast op unittest * update xpu cmake * update cast unittest
-
- 16 June, 2021, 1 commit
-
-
Committed by Zhou Wei
-
- 07 December, 2020, 1 commit
-
-
Committed by LoveAn
* Compiling operator libraries with Unity Build on Windows CPU. * Compiling operator libraries with Unity Build on Windows GPU, no_test, test=windows_ci * Add option in windows ci script, no_test, test=windows_ci * Optimize parallel compiling, test=develop * remove limit of parallel compile and skip some ops in UB, test=develop * remove changes of header file, test=develop * remove changes of header file, test=develop * fix test_eye_op unittest failed, test=develop * Compiling operator libraries with Unity Build on Linux, test=develop * set default WITH_UNITY_BUILD=OFF, test=develop * Move unity build rules into a single file and add comment, test=develop * optimize parallel compilation, test=develop * fix undefined reference error on coverage ci, test=develop
-
- 14 October, 2020, 1 commit
-
-
Committed by zhang wenhui
* add multitask * add multitask, test=develop * fix code style, test=develop * add partail push dense, test=develop * fix has_kay in py3, test=develop * fix, test=develop * fix, test=develop * fix, test=develop
-
- 30 July, 2020, 1 commit
-
-
Committed by wawltor
Update the code for the compare_ops, update the api and doc
-
- 03 April, 2020, 1 commit
-
-
Committed by channings
* update linspace, equal operators to API 2.0, test=develop * equal support higher performance CUDA kernel, test=develop * update comment of equal&linspace operator, test=develop * update comment of equal&linspace operator, test=develop
-
- 06 December, 2019, 1 commit
-
-
Committed by Huihuang Zheng
Add tests to use dy/dx to make sure the gradient values calculated by the control flow backward is correct. Also fixed bugs detected by those tests.

Fix bugs:
1. Unlike sum_op, optimizer ops don't allow uninitialized input tensor. But in conditional_block_grad_op, since the conditional_block may not run, the output gradient tensor may be uninitialized, which will cause the optimizer op error. To fix it, we should let optimizer ops support uninitialized input like sum_op or assign the uninitialized gradient to 0 when the conditional_block_grad_op doesn't run. I found there are about 10+ optimizer ops. **To be simpler, I just assign output gradient of the conditional_block_grad_op to 0 in this PR**. But it can be further explored whether we can make optimizer ops like sum_op to support uninitialized input tensor because theoretically we can speed up without the assigning in conditional_block_grad_op.
2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in global block. When op_desc is inferring shapes at the sub-block, it may not know the shape of gradients of parameters whose shape information is at global block. I fixed it by inferring shapes of gradients from forward var.

This PR also did some code clean up:
1. Print the var name when sgd_op catches shape error so that it is easier to debug
2. Fix a typo: dicta -> dict
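The first fix described in this commit message can be illustrated with a small sketch. This is plain Python, not PaddlePaddle code: `conditional_block_grad` and `sgd_step` are hypothetical stand-ins for the behavior described above, and the list-based "tensors" are an assumption made only to keep the example self-contained.

```python
# Hypothetical stand-ins; none of these names come from PaddlePaddle.

def conditional_block_grad(cond, upstream_grad, size):
    """Return the gradient leaving a conditional block.

    If the branch did not run there is nothing to propagate, so a
    zero-filled gradient of the right size is returned instead of None.
    """
    if cond:
        return list(upstream_grad)
    return [0.0] * size


def sgd_step(param, grad, lr=0.1):
    """Plain SGD update that, like the optimizer ops described above,
    rejects an uninitialized gradient."""
    if grad is None:
        raise ValueError("uninitialized gradient")
    return [p - lr * g for p, g in zip(param, grad)]


param = [1.0, 2.0]
# Branch skipped: the zero gradient leaves the parameter unchanged.
skipped = sgd_step(param, conditional_block_grad(False, [0.5, 0.5], len(param)))
# Branch taken: the usual update is applied.
taken = sgd_step(param, conditional_block_grad(True, [0.5, 0.5], len(param)))
```

The trade-off mentioned in the message shows up here too: zero-filling costs an extra allocation and a no-op update, which an optimizer that tolerated uninitialized gradients could skip.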
-
- 02 August, 2019, 1 commit
-
-
Committed by Zeng Jinle
* open gc by default, test=develop * fix test_train_recognize_digits and disable gc when ngraph is enabled, test=develop * fix conditional_block op eager deletion bug, test=develop * add some comments to reviewers, test=develop
-
- 19 July, 2019, 1 commit
-
-
Committed by Huihuang Zheng
Test PaddingRNN on V100 GPU device. Test configuration: large model, padding mode (which is the mode using recurrentOp), one GPU. GPU memory (MiB): 6414 (this PR) vs 6837 (without this PR) Speed (steps/s): 10.28 (this PR) vs 9.89 (without this PR)
-
- 28 March, 2019, 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 27 March, 2019, 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 05 March, 2019, 1 commit
-
-
Committed by sneaxiy
test=develop
-
- 14 December, 2018, 1 commit
-
-
Committed by Yan Chunwei
-
- 16 November, 2018, 1 commit
-
-
Committed by Wu Yi
* wip simplify operator framework * wip * wip * done test=develop * clean test=develop * fix test=develop * fix deps test=develop * fix cpu build test=develop * fix tensorrt build test=develop * fix tests test=develop * fix test=develop * fix cpu build test=develop
-