1. 02 Mar, 2022 19 commits
    • H
      Move transpose to pten (#39327) · 7a857924
      hong committed
      * immigrate_transpose_to_pten cpu kernel only; test=develop
      
      * fix bug; test=develop
      
      * add transpose cuda api
      
      * bug fix;
      
      * fix bugs
      
      * fix bugs; test=develop
      
      * bug fix;
      
      * move transpose to pten; test=develop
      
      * fix bug; test=develop
      
      * fix bugs; test=develop
      
      * add transpose grad fp16 support; test=develop
      
      * fix bug; test=develop
      
      * fix npu bug; test=develop
      
      * fix numel = 0 bug; test=develop
      
      * add fp16 support; test=develop
      
      * fix data type register bug; test=develop
      
      * fix transpose bug; test=develop
      
      * update transpose
      
      * fix transpose bug; test=develop
      
      * remove useless code; test=develop
      
      * remove useless code; test=develop
      
      * fix transpose alias bug; test=develop
      
      * polish code; test=develop
      
      * resolve conflict; test=develop
      
      * resolve conflict; test=develop
      
      * recover prepared operator; test=develop
      
      * fix bug; test=develop
      
      * polish code; test=develop
      
      * fix bug; test=develop
      
      * fix bug; test=develop
    • F
      Move BroadcastTensors OP to phi (#40047) · 2a5590a1
      From00 committed
      * Move BroadcastTensors OP to phi
      
      * Remove mutable_data in impl
      
      * Move BilinearTensorProductInferMeta to multiary.h/cc
    • Z
      new fleet_desc builder (#39948) · 1c4e3e5d
      ziyoujiyi committed
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * refactor theoneps
      
      * the_one_ps
      
      * add ps pass unittest
      
      * add ps pass unittest
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * ps unittest ready
      
      * ps unittest ready
      
      * solve dist_pass init conflict
      
      * solve import CommContext error
      
      * unittest ok
      
      * implement AllocateFrom
      
      * solve setup.py.in conflict
      
      * solve conflict
      
      * solve conflict
      
      * solve conflict
      
      * .
      
      * .
      
      * cpu-async-ps minimize test ok & gpu minimize test ok
      
      * add heter 2stage unittest
      
      * add heter 2stage unittest
      
      * add heter 2stage unittest
      
      * sync/geo test ok & fix heter_worker program ok
      
      * .
      
      * new fleet desc generator
      
      * new fleet_desc builder
      
      * new fleet_desc builder
      
      * .
      
      * .
      
      * correct ps.proto compile
      
      * .
      Co-authored-by: Nzkh2016 <zhangkaihuo@baidu.com>
    • H
      [Infrt]add phi kernel dialect (#39726) · 07dad6d6
      huzhiqiang committed
    • Z
      [bf16] add bf16 kernel: softmax & log_softmax (#39999) · 4a4215ff
      zhangbo9674 committed
      * add softmax log_softmax
      
      * refine rocm
      
      * refine unittest
    • C
      [phi] migrate gather_tree, reduce_prod to phi (#39844) · 6af2729e
      crystal committed
      * move to phi
      
      * migrate gather_tree_op into phi
      
      * move reduce_prod to phi
      
      * optimize code
    • C
      Upgrade new profiler (#39984) · 0c3f7fbc
      chenjian committed
      * add new profiler components
      
      * fix bug
      
      * upgrade new profiler
      
      * fix operator.cc
      
      * fix operator.cc
      
      * fix cmakelists.txt
      
      * fix bug
      
      * fix according to pr
      
      * fix bug
      
      * fix cmake
      
      * fix bug
      
      * fix a bug
      
      * fix bug
      
      * fix bug
    • J
      add logic kernel for mlu (#39940) · bc113e10
      joeqiao12 committed
    • Y
      [fleet_executor] Add entrance of FleetExecutor in AnalysisPredictor for distributed inference (#39992) · 244ae318
      Yuang Liu committed
    • L
      90ab7403
    • C
      [Phi] Unify complex type trait and fix real imag bug (#40036) · 0764fda2
      Chen Weihang committed
      * unify complex type trait and fix real imag bug
      
      * add unittest for type traits
    • Q
      [MLU] adapt matmul op (#39727) · b4d931e8
      qipengh committed
      * [MLU] adapt matmul op
      
      * [MLU] fix phi namespace
    • F
      [MLU] add transpose2 mlu kernel (#39994) · 4cab812e
      fwenguang committed
    • B
      add_new_comm_primitive (#40040) · 4e00d2bb
      Baibaifan committed
    • W
      [Eager] open eager when WITH_PYTHON (#39979) · 9af72957
      wanghuancoder committed
      * open eager when WITH_PYTHON, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * add DWITH_PYTHON for gen_fluid_lib, test=develop
    • W
      ernie: revert skip_layernorm_fp16 (#39991) · 26e2b918
      Wangzheee committed
    • J
      add share external data interface (#39809) · 1ff1c1e0
      JingZhuangzhuang committed
    • F
      [Pten] Gru lstm migration (#39729) · e4dba69a
      Feiyu Chan committed
      * move sequence2batch
      
      * move lstm and gru
      
      * Add phi/kernels directory into exclusion to stop using hipcc to compile non .cu files in it.
    • F
      Fix bug for prepare phi OP (#40033) · fb0cadfd
      From00 committed
  2. 01 Mar, 2022 21 commits