- 20 Oct, 2022: 2 commits
-
-
Committed by Kaipeng Deng
* add fused_multi_transformer_encoder/decoder pass; GPT-3 runs successfully
-
Committed by Sylwester Fraczek
-
- 19 Oct, 2022: 5 commits
-
-
Committed by Wang Xin
-
Committed by Yuanle Liu
-
Committed by zyfncg
-
Committed by Ruibiao Chen
* Support stream overlap for c_allreduce_sum
* Test CI
* Add notes
* Add SingleStreamGuard for BuildOpFuncList
-
Committed by WangZhen
* Fix recurrent op eager deletion pass error in dy2st
* Polish code
* Refine error message
-
- 18 Oct, 2022: 3 commits
-
-
Committed by Wang Bojun
* first version, accuracy corrected
* disable debug print
* use blockReduceSum in phi
* add UT
* add opCompat
* code style
* code refine
* bug fix
* code refine
* test fix
* bugfix
* code style fix
* code style
* code-style
* code-style
* code-style
-
Committed by Sławomir Siwek
* git
* style
* leave default relu in kernel
* style
* cleanup FCMKLDNN pattern
* merge conflicts
* update develop
* update develop
* add const
* rename to oneDNN and adjust attributes
* whitespace
-
Committed by zyfncg
* support generating code of opmaker for backward op invoke forward op
* support code-gen of opmaker for sparse op
* refine logic of choosing phi kernel
* fix compile bug
* fix code_gen bug
* fix bug
* fix kernel signature code-gen
* fix compile bug of VarType
* fix compile bug of VarType
* fix test_sparse_conv_op
* fix test_sparse_norm_op
-
- 17 Oct, 2022: 7 commits
-
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op. When input.numel() > INT32_MAX, its result is wrong.
* Support allow_partial switch, which can be configured in pipeline_configs. If the tensors sent from different hosts are not the same, they shouldn't be sent partially and then concatenated into a whole tensor.
* Change name allow_partial to enable_partial_send_recv.
* Add global variable _enable_partial_send_recv
-
Committed by YuanRisheng
* namespace modify
* update by comment
-
Committed by Wang Bojun
* first version of ln_s_p with s>0
* refine and UT
* pass opt draft
* pass opt
* code refine
* code-style
* bug fix
* fix ci test
* code style
-
Committed by jakpiase
-
Committed by pangyoki
* skip ReplaceAllReduceOp in GraphtoBlock when nccl_ctxs_ is nullptr
* update ut
* test_dist_allreduce_op failed
* fix test_dist_allreduce_op
* add ut
* fix nccl cpu compile
* fix
-
Committed by HongyuJia
-
- 16 Oct, 2022: 1 commit
-
-
Committed by ZeKai Zhou
-
- 14 Oct, 2022: 1 commit
-
-
Committed by Shijie
-
- 13 Oct, 2022: 5 commits
-
-
Committed by yeliang2258
* fix immutable op quantize bugs
* fix
* fix build bug
* fix test
* notest,test=inference
* fix ppyoloe acc drop bugs
* fix test
* fix test
* add test
* fix
* fix
* fix test
* fix refined name bug
* fix test
* bias fix
* fix matmul weight dequant bug
* re-ci
* fix tester
* fix test
* fix tester
* update weight dequantize func
* update code
* update test for coverage
* update test
* update cmake
* update cmakelist
* update code
* rerun ci
* remove useless code
-
Committed by YuanRisheng
-
Committed by Leo Chen
* remove class ScopeBase
* reopen test
-
Committed by HongyuJia
* remove PADDLE_WITH_MKLDNN, test white_list=abs
* fix unique_ptr
* fix op.Type()
* remove TODO in kernel_dispatch.h
* remove IndicateVarDataType function, update white_list
* remove mkldnn hard code
* add comments
* fix ==
* update mkldnn_op_list
* delete hard code of OPs
* update mkldnn_op_list
* update mkldnn_op_list, remove interp
* add error check for ExecutionContext
* update mkldnn_op_list, remove transpose2_grad
* remove interpolate mkldnn
* remove fill_constant mkldnn
* opt HasAttr in DygraphExecutionContext
* deprecated commit, test mkldnn_white_list
* deprecated commit, test mkldnn_white_list
* deprecated commit, test mkldnn_black_list
* update mkldnn_op_list, add assert error op
* solve cudnn related op
* fix error
* add mkldnn fallback in phi_utils.cc
* remove mkldnn fallback in phi_utils.cc
* opt code implementation
* polish Copyright License
-
Committed by joanna.wozna.intel
* Add unsigned int8 propagation
* Add or modify unit tests
* Correct concat scale checking
* Apply review suggestions
* Corrections
-
- 12 Oct, 2022: 5 commits
-
-
Committed by sunli
* fix wz review
* update code
-
Committed by Leo Chen
* refactor
* refine code
-
Committed by weishengying
-
Committed by zhouweiwei2014
* [Zero-Dim] support input 0D Tensor for unary api
* fix CI
-
Committed by sunli
* optimize cinn subgraph detector
* fix update subgraph
* add annotation
-
- 11 Oct, 2022: 4 commits
-
-
Committed by Sylwester Fraczek
* add logging to fc residual fuse pass
* expand logging message to fc residual fuse pass
* Add test for fc residual not fusing with activation
-
Committed by Aganlengzi
-
Committed by Chen Weihang
* remove using lodtensor part1
* polish history code format
-
Committed by Zhen Wang
* Fix some bugs hidden in build_cinn_pass.
* Update codes about OpTransInfo.
* Only support for the static reshape/reshape2 op.
-
- 10 Oct, 2022: 5 commits
-
-
Committed by YuanRisheng
* add yaml entry for rnn and rnn_grad, move infershape function for rnn_grad to phi infer meta
* WIP: move rnn kernel to phi
* Change the code generation to avoid converting from initializer list to tuple of heterogeneous types. This is only triggered when an api has intermediate outputs, and the outputs are of heterogeneous types.
* fix the bug that when none in a vector of tensors requires gradient, the conversion from InferShapeContext to InferMetaContext (a.k.a. BuildInferMetaContext) produces erroneous results.
* fix ci bugs
* fix ci bugs
* fix ci bugs
* modify code according to comments

Co-authored-by: Nchenfeiyu <chenfeiyu@baidu.com>
-
Committed by Leo Chen
* reduce time cost on atomic in interpretercore
* clear code of PrepareAtomic in interpretercore
* refine threadpool cache
-
Committed by Sylwester Fraczek
* fix fc pattern, remove use_bias, add residual input switch, fix references to pattern
* review fixes
-
Committed by Sylwester Fraczek
* Add methods that find input or output name by var name
* kind of bugfix - initialize variables
* ci fix
* review fixes
-
Committed by zhoutianzi666
-
- 09 Oct, 2022: 2 commits
-
-
Committed by zmxdream
-
Committed by zhangbo9674
-