1. 24 Mar, 2022 6 commits
  2. 23 Mar, 2022 4 commits
  3. 21 Mar, 2022 2 commits
  4. 18 Mar, 2022 1 commit
  5. 16 Mar, 2022 1 commit
    • [Auto Parallel] Add the support for the auto completion of while_op (#39939) · ec6b8fbd
      Committed by Yulong Ao
      * [Auto Parallel] Support the auto completion of while_op
      
      * [Auto Parallel] Improve the completion algorithms
      
      * [Auto Parallel] Fix bugs for ernie inference
      
      * [Auto Parallel] Remove attrs which cannot be pickled
      
      * [Auto Parallel] make the dims_mappings of LodTensorArray vars empty
      
      * [Auto Parallel] Fix bugs for the ernie inference in the pipeline parallel
      
      * [Auto Parallel] Remove unnecessary comments
      
      * [Auto Parallel] Fix a bug of the CMakeLists
      
      * [Auto Parallel] Use the newest APIs to write the unit test
      
      * [Auto Parallel] Remove unnecessary statements
  6. 15 Mar, 2022 4 commits
  7. 14 Mar, 2022 3 commits
  8. 11 Mar, 2022 1 commit
  9. 10 Mar, 2022 1 commit
  10. 09 Mar, 2022 1 commit
  11. 08 Mar, 2022 2 commits
  12. 07 Mar, 2022 1 commit
  13. 05 Mar, 2022 1 commit
    • Ps optimizer multi programs (#39883) · bcaf88d2
      Committed by wangguanqun
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
      
      * ps optimizer multi programs
      
      * the one ps merge
      
      * fix bug in test
  14. 03 Mar, 2022 1 commit
  15. 02 Mar, 2022 3 commits
    • run recompute's real backward with amp disabled (#40042) · 28795771
      Committed by Leo Chen
    • new fleet_desc builder (#39948) · 1c4e3e5d
      Committed by ziyoujiyi
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * refactor theoneps
      
      * the_one_ps
      
      * add ps pass unittest
      
      * add ps pass unittest
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * ps unittest ready
      
      * ps unittest ready
      
      * solve dist_pass init conflict
      
      * solve import CommContext error
      
      * unittest ok
      
      * implement AllocateFrom
      
      * solve setup.py.in conflict
      
      * solve conflict
      
      * solve conflict
      
      * solve conflict
      
      * .
      
      * .
      
      * cpu-async-ps minimize test ok & gpu minimize test ok
      
      * add heter 2stage unittest
      
      * add heter 2stage unittest
      
      * add heter 2stage unittest
      
      * sync/geo test ok & fix heter_worker program ok
      
      * .
      
      * new fleet desc generator
      
      * new fleet_desc builder
      
      * new fleet_desc builder
      
      * .
      
      * .
      
      * correct ps.proto compile
      
      * .
      Co-authored-by: Nzkh2016 <zhangkaihuo@baidu.com>
    • [Auto Parallel] Adapt Partitioner & DistOp for ERNIE3.0 Inference and cache (#39895) · c9cd47d9
      Committed by JZ-LIANG
      * adapt dist op
      
      * add dist_fill_constant_batch_size_like
      
      * remove print
      
      * update compatible
      
      * add unittest
  16. 25 Feb, 2022 1 commit
  17. 24 Feb, 2022 2 commits
  18. 22 Feb, 2022 3 commits
    • Auto Parallel support conditional block (#39612) · a08ee62a
      Committed by JZ-LIANG
      * add subblock logic for context and partitioner
      
      * partitioner support sub blocks
      
      * revise typos
      
      * fixed param init bug for while
      
      * chmod 644
      
      * add unittest
      
      * mv forward parser
      
      * update unittest
      
      * update dist op ctx
      
      * update dist op ctx
      
      * fixed bug in dist op ctx
      
      * fixed bug for recompute subblock
    • fix bug in new the_one_ps (#39505) · d56a0a1b
      Committed by wangguanqun
      * fix benchmark and communicator config
      
      * fix bugs of the_one_ps
      
      * multi program and fix bug in optimizer
      
      * multi program in the_one_ps
      
      * public commcontext
    • [Auto Parallel] Add the high-level Engine API (#39709) · 5595fdbb
      Committed by Yulong Ao
      * [Auto Parallel] Add the high-level Engine API
      
      * Update the test cmakefile
  19. 18 Feb, 2022 2 commits
    • bug fix (#39630) · bbf31a4e
      Committed by zhaoyingli
    • [AMP] support GPU BF16 amp for dygraph (#39029) · 7d6d3848
      Committed by zhangbo9674
      * support dtype param for auto_cast
      
      * add amp_dtype for tracer
      
      * add unsupported bf16 list
      
      * support bf16 amp for O2
      
      * refine python interface for bfloat16
      
      * refine code
      
      * refine code
      
      * refine unittest
      
      * refine code
      
      * refine code
      
      * add bf16 o1
      
      * refine code by comment
      
      * add gradient accumulator
      
      * add recompute