- 24 8月, 2022 2 次提交
- 23 8月, 2022 4 次提交
-
-
由 pangyoki 提交于
-
由 zhaoyingli 提交于
* add quant pass
-
由 OccupyMars2025 提交于
-
由 OccupyMars2025 提交于
* Update scope.h * typo * Update dense_tensor.inl
-
- 22 8月, 2022 3 次提交
-
-
由 joanna.wozna.intel 提交于
* Add int8 support for matmul+elementwiae_add fuse * Corrections after review and ernie test fix
-
由 Sławomir Siwek 提交于
* merge conv_concat_relu to conv_act * fix typo * extend unit test * reuse existing gpd * codestyle * enforce mkldnn conv
-
由 Yuanle Liu 提交于
-
- 19 8月, 2022 1 次提交
-
-
由 Ruibiao Chen 提交于
* Fix random op depenency and lr_shedule bugs for standalone executor * Fix CI errors * Fix CI errors * Fix CI errors
-
- 18 8月, 2022 3 次提交
-
-
由 pangyoki 提交于
apply buffer_shared_inplace_pass and inplace_addto_op_pass pass to program in Standalone Executor (#45085) * apply inplace addto in python apply_pass * fix * apply inplace pass for program * skip feed and fetch var * fix block_desc.move_from * fix block desc * alltoall remove inplace * fix
-
由 zhangxiaoci 提交于
* change to async mode for xpu multi-card training in static graph mode * minor bugfix * irrelevant. move to another pr * move change to other pr * fix stream issue * fix 'stream not meet with current context' error * fix branch diverge, test=kunlun
-
由 JingZhuangzhuang 提交于
* fix infer tans scop * fix infer trans scope * fic infer trans scope * fic infer trans scope Co-authored-by: Ndingjiawei <327396238@qq.com>
-
- 17 8月, 2022 2 次提交
-
-
由 Aurelius84 提交于
* [OpAttr]Add SupportTensor for OpMaker * fix typo * fix code style * add SupportTensor for concat op * add unittest for register Tensor * add shape checker and split attribute
-
由 feng_shuai 提交于
-
- 16 8月, 2022 4 次提交
-
-
由 Chen Weihang 提交于
* move check finite and unscale kernel into phi * move infershape into phi * move update_loss_scaling kernel into phi * remove original kernels * move update loss scaling infershape into phi * add header for xpu and npu * solve coverage failed * fix npu test failed * remove mutable data in cu file * fix new executor failed * add valid check for meta tensor output
-
由 feng_shuai 提交于
* convert multihead to oss * fix:bug * fix:delete const cast * fix:don't support bias_qk * add vit pass * fix:convert bug and add preln_residual_bias * support length=-1 * add UT for convert * add no_bias_qk support for gpu_multihead_op * delete infer_shape depends on bias_qk * oss just can be used in T4 and A* * fix:change api for ROCM CI
-
由 Wangzheee 提交于
-
由 Feiyu Chan 提交于
-
- 15 8月, 2022 2 次提交
-
-
由 Yuanle Liu 提交于
-
由 Yulong Ao 提交于
* [Auto Parallel] Move the distributed info from python to c++ * [Auto Parallel] Add dist_attrs for VarDesc and OpDesc * [Auto Parallel] Add the lost file * [Auto Parallel] Make the dist attr be unique_ptr * [Auto Parallel] Add the proto conversion * [Auto Parallel] Improve the proto support * [Auto Parallel] Fix the bugs for adding a device or a link * [Auto Parallel] Add the C++ ProcessMesh and DistributedMapper * [Auto Parallel] Improve the impl of these dist attrs * [Auto Parallel] Pybind11 ProcessMesh and DeviceMesh * [Auto Parallel] Fix the unittest problem * [Auto Parallel] Explicitly add the src file for auto_parallel target * [Auto Parallel] Add the proto depedency explicitly * [Auto Parallel] Fix the cmake bug on windows and mac * [Auto Parallel] Remove the pybind11 header file in process_mesh.h * [Auto Parallel] Remove unused codes * [Auto Parallel] Check whether the dist attr is null * [Auto Parallel] Implement the assign operator for OpDesc explicitly
-
- 14 8月, 2022 1 次提交
-
-
由 xiaoxiaohehe001 提交于
This reverts commit 84bf5c31.
-
- 13 8月, 2022 2 次提交
-
-
由 Leo Chen 提交于
* add cached_serialize_str_ * support program hash * add sha * add ut * use hash_str only for new_exe * fix attr order
-
由 ziyoujiyi 提交于
* back fl * delete ssl cert * . * make warning * . * unittest paral degree * solve unittest * heter & multi cloud commm ready * . * . * fl-ps v1.0 * . * support N + N mode * . * . * . * . * delete print * . * . * . * . * fix bug * . * . * fl-ps with coordinator ready * merge dev * update message parse only * update fl client scheduler * fix bug * update multithreads sync * fix ci errors * update role_maker.py * update role_maker.py * fix ci error: windows py import error * fix ci error: windows py import error * fix windows ci pylib import error * add dump fields & params * try to fix windows import fleet error * fix ps FLAGS error * fix logging risk * fix logging possible risk * write trainer_desc file * support split sparse params in local & remote * fix import paddle.fluid.core.PSGPU * fix import paddle.fluid.core.PSGPU * add remote_sparse & local_sparse config * fix unittest * fix test_dist_fleet_geo table error * fix PADDLE_ENFORCE error * fix other's pr conflict
-
- 12 8月, 2022 2 次提交
-
-
由 Sławomir Siwek 提交于
* remove v2_transpose_reshape * matmul_transpose_reshape * reshape_transpose_matmul * Add int8 support for matmulV2 * restore ut * adjust old ut * restore parallel UT ruels * remove mkldnn code from base ops * move enforces to pass * remove duplicated functions * delete duplicated enforces * feedback from review * add comments to variables * enable eltwise support * dynamic attribute * remove fusepass tests from op test * remove fuse pass cases from op test * revert introduction of dynamic attributes * style Co-authored-by: Nwozna <joanna.wozna@intel.com>
-
由 kangguangli 提交于
* transfer memcpy_h2d from fluid to phi * use UnchangedInferMeta instead * restore test_standalone_executor * add newline to fix codestyle check * rename pt -> phi * simplify logic and add check * make the comment more clear * remove useless comment * refine code
-
- 11 8月, 2022 1 次提交
-
-
由 whs 提交于
-
- 10 8月, 2022 4 次提交
-
-
由 xiaoxiaohehe001 提交于
* cuda_graph * cuda_graph_ * cuda_graph_ * cuda_graph_
-
由 Leo Chen 提交于
* set cuda device before run * add header file * fix compile
-
由 Leo Chen 提交于
* fix proto bug * add ut * reset need_update for var_desc * refine code * fix var desc order issue
-
由 Aurelius84 提交于
* [OpAttr]Support VarDesc* and vector<VarDesc*> in Attribute * add unittest for inference predictor
-
- 09 8月, 2022 2 次提交
-
-
由 yeliang2258 提交于
-
由 yeliang2258 提交于
* fix a bug in transpose2 about mkldnn * fix bug
-
- 08 8月, 2022 1 次提交
-
-
由 Leo Chen 提交于
* clean tensor.h * fix gather_nd
-
- 05 8月, 2022 4 次提交
-
-
由 Feiyu Chan 提交于
fix 5 operator makers with typos which pass string literal to argument 'generated', remove generated as parameter of AddAttr (#44935)
-
由 YuanRisheng 提交于
* move mkldnn activation kernel * fix compile bugs * fix compile bugs * deal with conflict * fix compile bugs * fix windows compile bugs * mkldnn unittest fix * change mutable to alloc * fix unittest bugs * modify code according comment
-
由 Zhen Wang 提交于
-
由 Sławomir Siwek 提交于
* remove v2_transpose_reshape * matmul_transpose_reshape * reshape_transpose_matmul * restore ut * adjust old ut * restore parallel UT ruels * feedback from review
-
- 04 8月, 2022 2 次提交
-
-
由 Sławomir Siwek 提交于
* Add unit tests * matmul_v2 + activation * matmuls + elementwise_add * matmul_v2 postops * transform matmul to v2 * opcompat * fix fusing matmul with multipe outs * add shape constraints * remove unused vars * change pass order * - Unit tests to be debugged - fix - refactor - diagnostic - more diagnostic - fix - Fix number two - fix - fix - fix - alpha added - more fixes - compilation fix - removed diagnostic code - cosmetic fixes * lint * add alpha constraint * merge matmul refactor * trigger CI * - fix * - another fix * code style * add support for matmul+elementwise_add+activation * code style * fix bfloat16 bugs * change append_binary to append_sum Co-authored-by: NJacek Czaja <jacek.czaja@intel.com>
-
由 王明冬 提交于
-