1. 05 Jan, 2022 · 1 commit
  2. 16 Dec, 2021 · 1 commit
  3. 15 Dec, 2021 · 1 commit
  4. 13 Dec, 2021 · 2 commits
  5. 09 Dec, 2021 · 1 commit
  6. 25 Nov, 2021 · 1 commit
  7. 23 Nov, 2021 · 1 commit
    • [Dy2stat]Allow users to switch eval/train mode when using @to_static to... · eed736dc
      0x45f authored
      [Dy2stat]Allow users to switch eval/train mode when using @to_static to decorate a function (#37383) (#37432)
      
      Before this PR, when a standalone function was decorated with @to_static, the generated Program could not switch between train and eval modes and always ran in train mode. As a result, GPU memory kept growing as the user repeatedly called the function after dynamic-to-static conversion.
      After this PR, a function decorated with @to_static can switch between train and eval modes by calling function.train() or function.eval(), as sketched below.
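      A minimal usage sketch of the behavior described above, assuming the public paddle.jit.to_static decorator; the SimpleNet layer, forward_fn name, and tensor shapes are illustrative only, not taken from the PR:

      ```python
      # Hypothetical example: switch the converted Program between train/eval modes.
      import paddle
      from paddle.jit import to_static


      class SimpleNet(paddle.nn.Layer):
          def __init__(self):
              super().__init__()
              self.linear = paddle.nn.Linear(8, 8)
              self.dropout = paddle.nn.Dropout(p=0.5)

          def forward(self, x):
              return self.dropout(self.linear(x))


      net = SimpleNet()


      @to_static
      def forward_fn(x):
          return net(x)


      x = paddle.randn([4, 8])

      forward_fn.train()   # run the converted Program in train mode
      y_train = forward_fn(x)

      forward_fn.eval()    # switch the same Program to eval mode (e.g. dropout disabled)
      y_eval = forward_fn(x)
      ```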
  8. 19 Nov, 2021 · 1 commit
  9. 28 Oct, 2021 · 2 commits
  10. 26 Oct, 2021 · 3 commits
    • [Cherry-pick] Add FasterTokenizer Operator (#36716) · edff5b79
      Steffy-zxf authored
      * Add FasterTokenizer Operator (#34491)
      
      Add Tokenizer-related functionalities for Transformer models so that the training and prediction pipelines stay consistent.
      
      * support the text string as an input Tensor
      * support the "VOCAB" unordered_map<wstring, int> as an input Tensor to look up tokens
      * Tokenizer used for BERT. This tokenizer applies end-to-end tokenization, turning a raw text string into wordpiece tokens.
      * It first applies basic tokenization, followed by wordpiece tokenization.
      
      * optimize fast tokenizer
      
      * remove const_cast
      Co-authored-by: Nzhoushunjie <zhoushunjie@baidu.com>
      Co-authored-by: Nwawltor <fangzeyang0904@hotmail.com>
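      The entry above describes a two-stage scheme: basic tokenization followed by wordpiece tokenization. The framework-free sketch below only illustrates that idea; the toy vocabulary, helper names, and greedy longest-match loop are assumptions for demonstration, not the operator's actual implementation (which runs on Tensors with a full BERT vocab inside the graph).

      ```python
      # Illustrative sketch of basic + wordpiece tokenization (BERT-style).
      def basic_tokenize(text):
          # Lowercase and split on whitespace; real basic tokenization also
          # splits punctuation and handles CJK characters.
          return text.lower().split()


      def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
          # Greedy longest-match-first segmentation into subword pieces.
          pieces, start = [], 0
          while start < len(word):
              end = len(word)
              cur = None
              while start < end:
                  sub = word[start:end]
                  if start > 0:
                      sub = "##" + sub
                  if sub in vocab:
                      cur = sub
                      break
                  end -= 1
              if cur is None:
                  return [unk_token]
              pieces.append(cur)
              start = end
          return pieces


      # Toy vocabulary, for demonstration only.
      vocab = {"un", "##aff", "##able", "tokenizer", "fast"}
      text = "Unaffable fast tokenizer"
      tokens = [p for w in basic_tokenize(text) for p in wordpiece_tokenize(w, vocab)]
      print(tokens)  # ['un', '##aff', '##able', 'fast', 'tokenizer']
      ```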
    • [cherry-pick]Support FP16 in HybridParallel and Fix bugs in HybridOptimizer (#36707) · 5b357e02
      Haohongxiang authored
      * fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer (#36237)
      
      * fix bugs in HybridParallelClipGrad of hybrid_parallel_optimizer
      
      * update
      
      * update
      
      * fix bugs in mp_layers, pp_layers and HybridParallelClipGrad (#36144)
      
      * fix calling bug of HybridParallelClipGrad
      
      * fix bugs of HybridParallelClipGrad
      
      * add unittest of pp with HybridParallelClipGrad
      
      * fix bugs in mp_layers.py
      
      * update
      
      * fix bugs in pp_layers.py
      
      * update
      
      * [HybridParallel]Rebuild code for pipeline (#36396)
      
      * add no_sync for parameters sync
      
      * add pipeline for moe
      
      * [HybridParallel]Support fp16 in dygraph hybrid parallel (#36420)
      
      * [HybridParallel]Support fp16 in dygraph hybrid parallel
      
      * update
      
      * update
      
      * update for recompute
      
      * add unittest of pp+fp16
      
      * add unittest of recompute+fp16
      
      * update
      
      * modify ut
      
      * modify ut of cond (#36475)
      
      * fix bugs of ClipGradByGlobalNorm in HybridParallel (#36555)
      
      * fix bugs of ClipGradByGlobalNorm
      
      * add unittests
      
      * add unittests
      
      * [HybridParallel]fix bug of check_inf in fleet_base.py (#36651)
      
      * fix bug of check_inf
      
      * fix allreduce
      
      * support ClipGradByGlobalNorm in sharding (#36012)
      
      * support ClipGradByGlobalNorm in sharding
      
      * support ClipGradByGlobalNorm in sharding
      
      * test=allcase
      
      * Update test_linalg_cond.py
      
      * Update hybrid_parallel_util.py
      
      * Update hybrid_parallel_util.py
      Co-authored-by: NShenLiang <1422485404@qq.com>
      Co-authored-by: Nzhaoyingli <86812880+zhaoyinglia@users.noreply.github.com>
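      Several of the fixes above concern ClipGradByGlobalNorm / HybridParallelClipGrad. As background, global-norm clipping rescales every gradient by clip_norm / max(global_norm, clip_norm), where global_norm is the l2 norm over all gradients. The sketch below shows that computation on local tensors only; it is an illustration, not Paddle's distributed implementation, which must additionally all-reduce the squared norms across the model-parallel and pipeline groups (the source of several of the bugs fixed here).

      ```python
      # Minimal illustration of ClipGradByGlobalNorm semantics on local tensors.
      import paddle


      def clip_by_global_norm(grads, clip_norm=1.0):
          # global_norm = sqrt(sum_i ||g_i||^2)
          global_norm = paddle.sqrt(
              paddle.add_n([paddle.sum(paddle.square(g)) for g in grads])
          )
          # scale = 1.0 when global_norm <= clip_norm, else clip_norm / global_norm
          scale = clip_norm / paddle.maximum(global_norm, paddle.to_tensor(clip_norm))
          return [g * scale for g in grads]


      grads = [paddle.randn([4, 4]), paddle.randn([8])]
      clipped = clip_by_global_norm(grads, clip_norm=1.0)
      ```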
    • [Amp] refine code of amp level (#36362) (#36726) · 1ee4fc32
      Leo Chen authored
      * refine amp level
      
      * fix typo
      
      * update tracer._amp_level
  11. 21 Oct, 2021 · 1 commit
  12. 20 Oct, 2021 · 1 commit
  13. 18 Oct, 2021 · 1 commit
  14. 13 Oct, 2021 · 1 commit
  15. 26 Sep, 2021 · 1 commit
  16. 22 Sep, 2021 · 1 commit
  17. 17 Sep, 2021 · 2 commits
    • [AMP] Support pure fp16 training mode for dygraph (#35521) · adaeee4d
      zhangbo9674 authored
      * add pure fp16 major function in auto_cast & tracer
      
      * support master weight in dygraph for pure fp16
      
      * check mix dtype of fp16&fp32 for check_finite_and_unscale op
      
      * change pure fp16 function name
      
      * refine some bug in auto_cast
      
      * refine auto_cast interface logic
      
      * add param _casted_by_pure_fp16 for class Layer
      
      * support state_dict hook for save model by user appointed dtype in pure_fp16_decorator
      
      * refine pure_fp16_decorator as decorator
      
      * add unittest
      
      * add comment
      
      * add comment
      
      * support recompute
      
      * add comment for auto_cast and decorator
      
      * support to_static_state_dict for paddle.jit.save
      
      * remove the limit on the number of models and optimizers
      
      * add lookup_table in black_list
      
      * fix momentum and layer state_dict
      
      * fix bug in layer state_dict
      
      * fix bug in layer state_dict_helper
      
      * refine unittest
      
      * refine test_momentum_op
      
      * refine interface and some code
      
      * refine amp_decorator interface
      
      * refine pure fp16 interface
      
      * refine master weight interface
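      A minimal sketch of what pure-fp16 ("O2") dygraph training looks like with the auto_cast / decorate / GradScaler APIs these commits touch; the toy model, optimizer choice, and data are illustrative assumptions rather than code from the PR:

      ```python
      # Sketch of pure fp16 ("O2") dygraph training; model and data are toy examples.
      import paddle

      model = paddle.nn.Linear(16, 16)
      opt = paddle.optimizer.Momentum(parameters=model.parameters(), multi_precision=True)

      # Decorate model/optimizer for pure fp16: parameters are cast to fp16 while
      # fp32 master weights are kept for the optimizer update.
      model, opt = paddle.amp.decorate(models=model, optimizers=opt, level='O2')

      scaler = paddle.amp.GradScaler(init_loss_scaling=1024)

      for _ in range(3):
          x = paddle.randn([8, 16])
          with paddle.amp.auto_cast(level='O2'):
              loss = model(x).mean()
          scaled = scaler.scale(loss)        # scale the loss to avoid fp16 underflow
          scaled.backward()
          scaler.minimize(opt, scaled)       # unscale grads, skip step on inf/nan, update scaling
          opt.clear_grad()
      ```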
    • polish code. (#35783) · 61010bb8
      WeiXin authored
  18. 16 Sep, 2021 · 1 commit
  19. 15 Sep, 2021 · 3 commits
  20. 14 Sep, 2021 · 2 commits
  21. 13 Sep, 2021 · 1 commit
  22. 10 Sep, 2021 · 1 commit
  23. 08 Sep, 2021 · 1 commit
  24. 07 Sep, 2021 · 1 commit
  25. 06 Sep, 2021 · 1 commit
  26. 03 Sep, 2021 · 1 commit
  27. 01 Sep, 2021 · 3 commits
  28. 26 Aug, 2021 · 1 commit
  29. 24 Aug, 2021 · 1 commit
    • Add no_sync in data parallel for dynamic graph (#34740) · b09f4d7f
      Haohongxiang authored
      * Add no_sync in data parallel for dynamic graph
      
      * modify UT of no_sync
      
      * delete test_parallel_dygraph_dataparallel_no_sync.py
      
      * add test_parallel_dygraph_no_sync.py
      
      * modify run_trainer_with_spawn in UTs
      
      * Add UT of complex control flow in no_sync
      
      * add specific descriptions and notes for no_sync
      
      * check code style
      
      * modify UT's TIMEOUT in CMakeLists.txt
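      The no_sync context manager added here is typically used to accumulate gradients locally over several micro-batches and only synchronize on the last backward pass. A minimal sketch under paddle.DataParallel follows; the toy model, batch size, and accumulation loop are illustrative assumptions, and running it requires a launched multi-process parallel environment:

      ```python
      # Sketch of gradient accumulation with no_sync under data parallelism.
      # Assumes the process group is launched, e.g. via paddle.distributed.launch.
      import paddle
      import paddle.distributed as dist

      dist.init_parallel_env()
      model = paddle.DataParallel(paddle.nn.Linear(16, 16))
      opt = paddle.optimizer.SGD(parameters=model.parameters())

      accumulate_steps = 4
      for step in range(accumulate_steps):
          x = paddle.randn([8, 16])
          loss = model(x).mean()
          if step < accumulate_steps - 1:
              # Skip the cross-rank gradient all-reduce for intermediate micro-batches.
              with model.no_sync():
                  loss.backward()
          else:
              loss.backward()  # gradients are synchronized on the last micro-batch

      opt.step()
      opt.clear_grad()
      ```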
  30. 20 Aug, 2021 · 1 commit