1. 12 Jan 2021, 2 commits
  2. 11 Jan 2021, 12 commits
  3. 10 Jan 2021, 1 commit
  4. 09 Jan 2021, 1 commit
  5. 08 Jan 2021, 11 commits
  6. 07 Jan 2021, 5 commits
    • [cherry pick] Fix bug in paddle.save/load and paddle.static.save/load when saving large files (#30170) · bfb6f613
      Committed by WeiXin
      * Support storage of large parameters (#29988)
      
      * Support storage of large parameters
      
      * Reduce the complexity of the unittest
      
      * Reduce the complexity of the unittest, commented out unittest for
      
      * add unittest for static.save/load
      
      * Increase the timeout threshold of 'test_static_save_load'
      
      * Increase the timeout threshold of 'test_static_save_load'
      
      * Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
      
      * Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
      
      * Extend the timeout for the (#30151)
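      A minimal usage sketch of the APIs touched by this fix; the layer and its sizes are illustrative and not taken from the PR:
      ```python
      import paddle

      # A layer whose state_dict holds a reasonably large parameter tensor
      # (sizes here are made up for illustration).
      layer = paddle.nn.Linear(4096, 4096)

      # paddle.save / paddle.load are the dygraph entry points covered by this fix.
      paddle.save(layer.state_dict(), "linear.pdparams")
      state_dict = paddle.load("linear.pdparams")
      layer.set_state_dict(state_dict)
      ```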
    • [cherry pick] Some optimizations of elementwise_add, gelu and dropout for AMP (#30152) · 07f68fad
      Committed by Leo Chen
      * Improve performance of elementwise_add grad op (#29187)
      
      * pass stop_gradient for cast op
      
      * improve performance of elementwise_add grad
      
      * use tensor copy async
      
      * dygraph branch
      
      * fix dygraph branch
      
      * add ut
      
      * make gelu fp16 computing more robust (#29484)
      
      * Add fast path for dropout when p == 0  (#29553)
      
      * add fast path for p == 0 in dropout
      
      * add ut
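      A rough sketch of the dropout fast-path idea from #29553; this is Python around the public API, not the actual kernel change, and the helper name is made up:
      ```python
      import paddle
      import paddle.nn.functional as F

      def dropout_with_fast_path(x, p, training=True):
          # Fast path: with p == 0 (or in eval mode) dropout is the identity,
          # so mask generation and random number draws can be skipped entirely.
          if p == 0.0 or not training:
              return x
          return F.dropout(x, p=p, training=training)

      x = paddle.randn([8, 16])
      y = dropout_with_fast_path(x, p=0.0)  # returns x unchanged
      ```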
    • [Cherry-pick] Layer norm fp16 and Nvidia optimizations (#29169 #29434 #29522 #29576) (#30110) · 44b81e63
      Committed by furnace
      * Layer norm fp16 (#29169)
      
      * add fp16 for layer_norm op
      
      * revert layernorm api
      
      * fix forward
      
      * fix forward
      
      * fix backward for layernorm with fp16
      
      * fix unit test for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * 1. revert to PADDLE_ENFORCE_NOT_NULL, 2. change static_cast<float> to static_cast<U>
      
      * fix with_mkldnn compile error for layernorm with fp16
      
      * fix with_mkldnn compile error for layernorm with fp16
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      
      * fix layer_norm accuracy (#29434)
      
      * Layernorm opt (#29522)
      
      * layernorm fw opt
      
      * layernorm bw opt
      
      * fix typo, test=develop
      
      * remove const dim3 for windows CI compatibility
      
      * merge develop
      Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
      
      * Fix compile problem when cuda_arch < 6000 (#29576)
      
      * fix compile problem when cuda_arch < 6000
      
      * refine code
      
      * refine code
      Co-authored-by: zhiqiu <chenqiuliang@baidu.com>
      Co-authored-by: zlsh80826 <zlsh80826@gmail.com>
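      A small sketch of exercising LayerNorm under AMP auto-casting; shapes are arbitrary, and whether layer_norm actually runs in float16 depends on the AMP op lists of the installed Paddle build:
      ```python
      import paddle

      paddle.set_device("gpu")  # the fp16 layer_norm kernels target CUDA

      x = paddle.randn([4, 64, 256])
      layer_norm = paddle.nn.LayerNorm(256)

      # custom_white_list nudges AMP to cast layer_norm inputs to float16.
      with paddle.amp.auto_cast(custom_white_list={"layer_norm"}):
          out = layer_norm(x)
      ```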
    • pre padding in dygraph (#30179) · a2b0357d
      Committed by tangwei12
      Change-Id: Ia5279b0cbb6a5b3970aff66e9510e0d85efa70ce
    • Cherry pick bn (#30136) · 157ff094
      Committed by ceci3
      * fix bn docs (#30096)
      
      * add attribute for batch_norm (#29950)
      
      * add attribute for batch_norm
  7. 06 Jan 2021, 3 commits
  8. 05 Jan 2021, 5 commits
    • add topo-aware in heter-ps (#30087) (#30117) · 7fc2ce50
      Committed by Thunderbrook
      * add topo aware
      
      * resource.h
      
      * topo aware
      
      * format
    • [Cherry-pick 2.0] cherry pick 3 PRs about Dynamic-to-Static (#30100) · faeee3c3
      Committed by liym27
      * [cherry-pick 2.0] Fix unitest test_slice (#29740)
      
      Before this commit, test_slice used the old API `dygraph_to_static_func` for Dynamic-to-Static and used the Executor explicitly, which is not recommended to users.
      After the fix, the recommended API `paddle.jit.to_static` replaces `dygraph_to_static_func`, which won't trigger the random exception on the coverage CI.
      
      * [cherry-pick 2.0][Dy2Stat] Support grammar: for ele in var[idx] (#29541)
      
      Support transforming `for ele in var` statements in which var is a slice of a Tensor.
      
      * [cherry-pick 2.0][Dy2Stat] Fix bug for loop: a variable is used and created in loop, but used before created (#29769)
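      An illustration (values and shapes made up) of the grammar added in #29541, written with the recommended `paddle.jit.to_static` API:
      ```python
      import paddle

      @paddle.jit.to_static
      def sum_first_row(var):
          # 'for ele in var[idx]' iterates over a slice of a Tensor,
          # the pattern Dynamic-to-Static now transforms.
          total = paddle.zeros([1])
          for ele in var[0]:
              total += ele
          return total

      x = paddle.to_tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
      print(sum_first_row(x))  # expected: 6.0
      ```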
    • [cherry-pick 2.0] Support dygraph quant model and prevent the scale from becoming infinity (#30098) · 3fe71d0a
      Committed by cc
      * Fix infinite scale values (#29386)
      
      * Support dygraph quant model (#29927)
      
      * Avoid infinite scale values in quant2_int8_mkldnn_pass, test=develop
      * Support quantized models for Paddle 2.0 dygraph, test=develop
      Co-authored-by: Wojciech Uss <wojciech.uss@intel.com>
    • fix test=release/2.0 (#30045) · 6e2066b0
      Committed by gongweibao
    • [cherry pick] Set FLAGS_selected_gpus for spawn (#29962) (#30097) · cda7397f
      Committed by Chen Weihang
      Set FLAGS_selected_gpus for spawn.
      
      When the child process starts, it inherits the configuration of the main process and sets the FLAGS once, but the environment variable has not been set at that point. As a result, FLAGS_selected_gpus stays the same as in the main process (usually empty), so the flags are updated manually here.
      
      Note: a unit test was added and then removed; its output showed that nvidia-smi on the CI machine reports only two GPUs, while more than two GPUs are required to test this issue.
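      For context, a minimal `paddle.distributed.spawn` launch; the training function body is only a placeholder:
      ```python
      import paddle
      import paddle.distributed as dist

      def train():
          # Each spawned child re-initializes the parallel environment; the fix
          # makes FLAGS_selected_gpus reflect the device assigned to this child
          # rather than the (usually empty) value inherited from the parent.
          dist.init_parallel_env()
          print("rank", dist.get_rank(), "device", paddle.get_device())

      if __name__ == "__main__":
          dist.spawn(train, nprocs=2)  # nprocs=2 assumes two visible GPUs
      ```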