- 07 Nov, 2022 2 commits
-
-
Committed by HongyuJia
* move cudnn hardcode outside GetExpectedKernelType * add header file * debug * update interpreter_util with hardcode * update interpreter_util headerfile * solve activation hardcode * debug with CI * add mkldnn_op_list header file * temporarily uncomment mkldnn * temporarily uncomment mkldnn * delete sequence_softmax cudnn hardcode * add hardcode to data_transfer.cc * update data_transfer headerfile * try fix segment fault * update cudnn&miopen_helper * reset HasAttr of DygraphExctnCtx * debug, this commit should pass all CI * debug should pass CI, temporarily disable activation * debug should pass CI * fix default_attr=nullptr bug * clean debug code * Call SetDnnFallback function in the base class * activation fallback to plain kernel * fix default GetExpectedKernelType find wrong kernel * search cudnn kernel instead of fallback * fix cudnn_handle bug * remove tanh use_cudnn * restore tanh use_cudnn * debug tanh * fix tanh bug * delete activation cudnn kernel * polish code
-
Committed by Sławomir Siwek
* init changes * bnorm * method signature * change order * bnorm * removed unused args
-
- 05 Nov, 2022 1 commit
-
-
Committed by Yiqun Liu
-
- 04 Nov, 2022 2 commits
- 03 Nov, 2022 6 commits
-
-
Committed by Ruibiao Chen
* Dispatch computation OPs before communication in standalone executor * Update code * Fix CI errors * Improve performance of coalesce_tensor and depend OP in standalone executor * pre-commit check
-
Committed by yeliang2258
* add constant_folding_pass pass for mkldnn int8 * update UpdateScaleOpInOutScales
-
Committed by Leo Chen
-
Committed by HongyuJia
* opt CanMKLDNNBeUsed performance * fix nullptr bug * fix OpBase default_attrs=nullptr bug * fix OpBase default_attrs=nullptr bug * fix OpBase default_attrs=nullptr bug
-
Committed by Sławomir Siwek
* add extra attr property set * add type_info for all context * add onednn context to all context * fix context compile error * simplify conv kernel args * pass runtime attr into dev_ctx * fix macro error * clear conv_grad_kernel extra args * merge conv_grad_grad into conv_grad * clear conv2d_grad_grad extra attrs * remove redundant imports * migrate softmax * clear yaml and eager extra attr * fix conv1d error * change to thread local * fix npu compile failed * try to fix windows compile failed * add conv2d onednn phi kernel * fix ci bugs (#36) * fix compile bugs (#38) * fix extra input transform bug (#39) * support dynamic created attr (#40) * reset extra info gen code * rm conv_grad_grad kernel * reimpl pass attr adapting * add int attr support * remove vector inputnames creating * merge dev * fix map at error * adjust attribute * adapt funcs to PHI Co-authored-by: Chen Weihang <chenweihang@baidu.com> Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
-
Committed by wenbin
-
- 02 Nov, 2022 4 commits
-
-
Committed by 丁一
-
Committed by Ruibiao Chen
* Dispatch computation OPs before communication in standalone executor * Update code * Fix CI errors
-
Committed by Yiqun Liu
Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. (#47095) * Improve the tool for checking nan and inf, and support to compute the max, min and mean of output tensor. * Add a FLAGS to control whether abort when meets inf/nan and polish codes. * Fix unittest. * Change the computing of mean.
- 01 Nov, 2022 6 commits
-
-
Committed by HongyuJia
* move cudnn hardcode outside GetExpectedKernelType * add header file * debug * update interpreter_util with hardcode * update interpreter_util headerfile * solve activation hardcode * debug with CI * add mkldnn_op_list header file * temporarily uncomment mkldnn * temporarily uncomment mkldnn * delete sequence_softmax cudnn hardcode * add hardcode to data_transfer.cc * update data_transfer headerfile * try fix segment fault * update cudnn&miopen_helper * reset HasAttr of DygraphExctnCtx * debug, this commit should pass all CI * debug should pass CI, temporarily disable activation * debug should pass CI * fix default_attr=nullptr bug * clean debug code
-
Committed by Yuanle Liu
-
Committed by YuanRisheng
* standard_api * add hardtanh
-
Committed by Ruibiao Chen
* [Auto Parallel] Improve the c++ dist attr * [Auto Parallel] Modify test_program.py * Support custom stream for standalone executor Co-authored-by: Yulong Ao <aoyulong@baidu.com>
-
Committed by Kaipeng Deng
* fix memory copy in prepare_data. test=develop
-
Committed by Chen Weihang
* add extra attr property set * add type_info for all context * add onednn context to all context * fix context compile error * simplify conv kernel args * pass runtime attr into dev_ctx * fix macro error * clear conv_grad_kernel extra args * merge conv_grad_grad into conv_grad * clear conv2d_grad_grad extra attrs * clear yaml and eager extra attr * fix conv1d error * change to thread local * fix npu compile failed * try to fix windows compile failed * add conv2d onednn phi kernel * fix ci bugs (#36) * fix compile bugs (#38) * fix extra input transform bug (#39) * support dynamic created attr (#40) * reset extra info gen code * rm conv_grad_grad kernel * reimpl pass attr adapting * add int attr support * remove vector inputnames creating * fix map at error * Update paddle/phi/kernels/onednn/conv_grad_kernel.cc Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com> * remove useless extra attrs * replace mkldnn_engine by onednn_engine Co-authored-by: YuanRisheng <yuanrisheng@baidu.com> Co-authored-by: Sławomir Siwek <slawomir.siwek@intel.com>
-
- 31 Oct, 2022 4 commits
-
-
Committed by feng_shuai
* feat: add int8 support for vit * test:add test
-
Committed by Yulong Ao
* [Auto Parallel] Improve the c++ dist attr * [Auto Parallel] Modify test_program.py * [Auto Parallel] Add the missing import
-
Committed by kangguangli
* replace executor in conditional_block_op.run with standalone_executor * add block_id as the argument of standalone executor's method run; add print for program * fix scope bug about conditional block op * fix bug: unnecessary return of fetch value * fix typo * fix: quantization will set variable persistable, and these variables must exist in global scope * add interpretercore cache for conditional block op but not activate in default * fix bug: local scope reuse for conditional block op * reset scope when conditional block op runs * fix typo * fix typo and code style * add build scope for conditional block op * add skip for transfer_layout kernel * refine code * fix reset_scope * fix reset_scope * refine code * refine code * refine code 1. remove flag use in conditional_block_op 2. pass execution_config to BuildOpFuncList instead of individual parameter * refine code * remove the use of FLAGS_control_flow_use_new_executor_cache * change FLAGS_control_flow_use_new_executor to false
-
Committed by Nyakku Shigure
* fix typo `Fasle`/`Flase` -> `False` * fix typo `Ture` -> `True`
-
- 27 Oct, 2022 3 commits
-
-
Committed by Leo Chen
* make all cpp tests dynamic linked to libpaddle.so * add comments * keep old cc_test for some tests * fix some ut * make some ut use cc_test_old * fix typos and fit for win32 * fix lib path * fix some tests * skip lite test * fit for rocm * fit for cinn * fit for mac * fit for win32 * skip inference ut * skip windows * fix coverage
-
Committed by Zhen Wang
-
Committed by Chen Weihang
* fix compile error of mkldnn * fix tensorrt error
-
- 26 Oct, 2022 5 commits
-
-
Committed by wenbin
* prelnlayernorm_shift * add ut * remove paddle_enforce * remove useless * add UT * remove UT * add UT * set timeout
-
Committed by Sławomir Siwek
* fc/matmuls + scale fuse pass * remove double-extension * add unit tests * comments from review * codestyle * add pass to int8 list * new codestyle * attr name typo
-
Committed by Siming Dai
* fix dlpack deletion * add unittest * fix unittest
-
Committed by Wang Xin
fix uninitialized, tautological-constant-out-of-range-compare and literal-conversion warning on macos (#47341)
-
Committed by Chen Weihang
* remove using lodtensor part2 * resolve code format error * resolve conflict * resolve conflict * replace added framework tensor
-
- 25 Oct, 2022 1 commit
-
-
Committed by HongyuJia
* use dnn_fallback flag to delete mkldnn hardcode * polish code style * fix protected error * fix const error * fix reduce_op fallback * fix pool_op fallback * add Set function of dnn_fallback_
-
- 24 Oct, 2022 3 commits
-
-
Committed by yeliang2258
* fix log bugs * more fix * fix bugs
-
Committed by Wang Xin
* fix macos inconsistent-missing-override warnings * fix inconsistent-missing-override error in test
-
Committed by Yiqun Liu
-
- 21 Oct, 2022 1 commit
-
-
Committed by Allen Guo
-
- 20 Oct, 2022 2 commits
-
-
Committed by HongyuJia
-
Committed by Kaipeng Deng
* add fused_multi_transformer_encoder/decoder pass, run GPT-3 success
-