1. 14 Jan, 2021 5 commits
    • C
      [Cherry-pick] Fix prune input bug of jit.save #30425 · 2cdc36f4
      Chen Weihang committed
      [Cherry-pick] Fix prune input bug of jit.save

      cherry-pick of #30384
      2cdc36f4
    • Q
      optimize memcpy perf for kunlun (#30291) (#30382) · 9de42be2
      QingshuChen committed
      * optimize memcpy perf for kunlun (#30291)
      
      * optimize memcpy perf for kunlun
      
      * remove useless unittest for kunlun mean
      
      * minor
      
      * fix bug that can't find mkldnn(kunlun) (#30394)
      9de42be2
    • L
      [cherrypick 2.0] add double grad for conv_transpose and depthwise_conv (#30429) · 1552343a
      LielinJiang committed
      * Add double grad for conv_transpose (#29706)
      
      * add double grad for conv_transpose
      
      * register cudnn conv double grad for depthwise conv (#29807)
      1552343a
    • B
      cherry-pick 30354 (#30407) · 5d30d072
      Bai Yifan committed
      5d30d072
    • C
      fix bug of celoss when using ignore_index and reduction (#30395) · c22ee575
      chajchaj committed
      * fix bug of celoss when using ignore_index and reduction (#30180)
      
      * fix bug of using ignore_index and reduction,test=develop
      
      * fix bug of celoss when using ignore_index and reduction, test=develop
      
      * improve performance when ignore_index=-100, test=develop
      
      * add test in test_cross_entropy_loss.py for coverage rate, test=develop
      
      * rm comment in test_cross_entropy_loss.py, test=develop
      
      * del hard code of "float64" in python/paddle/nn/functional/loss.py, test=develop
      
      * change mask to a more simplified implementation, test=develop
      
      * del comment in python/paddle/nn/functional/loss.py, test=develop
      
      * del hard code and change mask to a more simplified implementation, test=develop
      
      * change mask to a more simplified implementation, test=develop
      
      * change mask to a more simplified implementation, test=develop
      c22ee575
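The commit message above does not say what the bug was. As a rough illustration of how `ignore_index` and `reduction` can interact, here is a minimal pure-Python sketch. It assumes the common convention that ignored samples contribute zero loss and that `"mean"` divides by the number of non-ignored samples. The function `cross_entropy` below is hypothetical and is not Paddle's implementation.

```python
import math


def cross_entropy(logits, labels, ignore_index=-100, reduction="mean"):
    """Softmax cross entropy over rows of logits, skipping ignored labels.

    Hypothetical helper for illustration only; not Paddle's API.
    """
    losses = []
    for row, label in zip(logits, labels):
        if label == ignore_index:
            # Assumption: an ignored sample contributes no loss.
            losses.append(0.0)
            continue
        # Numerically stable log-sum-exp of the row.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        losses.append(log_sum_exp - row[label])

    if reduction == "none":
        return losses
    total = sum(losses)
    if reduction == "sum":
        return total
    if reduction == "mean":
        # Assumption: average over the non-ignored samples only.
        valid = sum(1 for label in labels if label != ignore_index)
        return total / valid if valid else 0.0
    raise ValueError(f"unknown reduction: {reduction!r}")
```

With this convention, a batch in which half of the labels are ignored yields the same mean as the non-ignored half alone, rather than a value diluted by the ignored entries.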
  2. 13 Jan, 2021 4 commits
  3. 12 Jan, 2021 7 commits
  4. 11 Jan, 2021 12 commits
  5. 10 Jan, 2021 1 commit
  6. 08 Jan, 2021 9 commits
    • L
      [cherry-pick] [Dy2Stat] Don't convert to paddle.shape if var_x.shape is not... · 2ba9bdd7
      liym27 committed
      [cherry-pick] [Dy2Stat] Don't convert to paddle.shape if var_x.shape is not negative #29965 (#30235)

      * [Cherry-Pick 2.0] [Dy2Stat] Don't convert to paddle.shape if var_x.shape is not negative (#29965)

      1. When x is Variable, call nn.shape(x) only in the following cases:
       1) The shape of x is used in a control flow condition.
       2) The dim to be used is negative.
      2. When x is Variable, but x.shape or x.shape[idx] doesn't contain a negative value, don't convert to paddle.shape()
      
      * [Cherry-Pick 2.0] [Dy2Stat] Use Paddle2.0 api paddle.tensor.array_* (#30156)
      2ba9bdd7
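The conversion rule in the first cherry-picked change above can be sketched as a small predicate. This is an illustrative reading of the commit message, not the Dy2Stat transformer code: `needs_runtime_shape` and its parameters are hypothetical names, and a negative dimension is assumed to mean "unknown until run time".

```python
def needs_runtime_shape(static_shape, idx=None, in_control_flow=False):
    """Decide whether a shape access should become a runtime shape op.

    Hypothetical helper; the names and signature are assumptions.
    static_shape: list of ints, where a negative entry is an unknown dim.
    idx: the index being read (x.shape[idx]), or None for the whole shape.
    in_control_flow: True if the shape is used in a control flow condition.
    """
    if in_control_flow:
        # Case 1): the shape is used in a control flow condition.
        return True
    dims = static_shape if idx is None else [static_shape[idx]]
    # Case 2): the dim(s) being used are negative, i.e. not known statically.
    return any(d < 0 for d in dims)
```

For example, with a static shape of `[-1, 3]`, reading `x.shape[1]` can stay a plain integer, while reading `x.shape[0]` or the whole shape would need the runtime op.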
    • H
      [Cherry-pick] amp related PR cherry pick into Release/2.0 (#30212) · 9f7c66b4
      huangxu96 committed
      * Optimizer trans momentum (#29597)
      
      * merge amp related function in Momentum from paddle.fluid.contrib.optimizer into paddle.optimizer.
      
      * Add unittest for 2.0  Momentum API.
      
      * fix some bugs in weight_decay.
      
      * add alias for fluid.contrib.mixed_precision (#29562)
      
      * add alias for fluid.contrib.mixed_precision
      
      * add static.amp into setup.py.in (#29621)

      * add static.amp into setup.py.in
      
      * add unittest for api
      
      * fix a bug in multi_precision_fp16 unittest. (#29756)
      9f7c66b4
    • L
      [cherry-pick 2.0] Fix bug: In dynamic mode, if start or end is negative,... · 5fe3da39
      liym27 committed
      [cherry-pick 2.0] Fix bug: In dynamic mode, if start or end is negative, __getitem__ returns wrong result (#30003) (#30146)
      
      1. when slice_item is a slice:
       1) the start of __getitem__ should be std::max(start, 0) if slice
       2) the end of __getitem__ should be std::min(end, dim)
      2. when slice_item is an integer, it should be in [-dim_len, dim_len)
      3. Fix error message to use accurate data
      5fe3da39
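The bounds handling listed above can be sketched in Python as follows. This is a hedged illustration, not the C++ code from the commit: `normalize_slice` and `normalize_index` are hypothetical names, and wrapping negative values by adding the dimension length before clamping is an assumption borrowed from Python's own slicing semantics.

```python
def normalize_slice(start, end, dim_len):
    """Map possibly negative slice bounds (step 1) into [0, dim_len].

    Hypothetical helper; wrapping negatives by dim_len is an assumption.
    """
    if start < 0:
        start += dim_len
    if end < 0:
        end += dim_len
    start = max(start, 0)    # mirrors std::max(start, 0)
    end = min(end, dim_len)  # mirrors std::min(end, dim)
    return start, end


def normalize_index(index, dim_len):
    """Validate an integer index against [-dim_len, dim_len) and wrap it."""
    if not -dim_len <= index < dim_len:
        raise IndexError(
            f"index {index} is out of range [{-dim_len}, {dim_len})"
        )
    return index + dim_len if index < 0 else index
```

Under these assumptions, `normalize_slice(-3, -1, 5)` gives `(2, 4)`, which matches what `list(range(5))[-3:-1]` selects.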
    • L
      [Cherry-Pick 2.0][setitem] Support Tensor setitem in static mode (#29708) (#30104) · f46ddc0e
      liym27 committed
      1. Type of index: int, slice (step must be 1).
      
      2. Type of value:
       (1) int32, int64, float32, bool;
       (2) numpy.array(int32, int64, float32, bool) (note: float64 is not supported);
       (3) paddle.Tensor(int32, int64, float32, float64, bool);
      f46ddc0e
    • J
      Fix beam search bug (#29824) (#30140) · b2ca2cad
      Jiaqi Liu committed
      * fix beam search bug
      
      * add dygraph unittest
      
      * update dynamic_decode argument doc
      
      * add warning info for state which has no lengths attribute
      b2ca2cad
    • C
      [Cherry-pick] [Complex] Simplify prepared op impl to improve performance (#30153) (#30215) · 0e3a1d35
      Chen Weihang committed
      * simplify prepared op impl to improve performance
      
      * fix kunlun compile error
      
      * continue fix kunlun compile error
      
      * only transform diff place when dtype diff
      
      * fix failed unittests
      
      * remove useless file
      
      * polish impl by review comment
      0e3a1d35
    • 1
      【2.0API CherryPick】LookAhead, ModelAverage, IndexSelect (#30205) · 3ce4d34d
      123malin committed
      * Add Lookahead and ModelAverage Optimizer (#30004)
      
      * test=develop, add model_average and lookahead
      
      * Improve Index select cuda kernel (#30139)
      
      * test=develop, add index_select_cuda kernel
      3ce4d34d
    • C
      fix syncbn convert (#30158) (#30176) · 030d678c
      ceci3 committed
      * fix syncbn convert
      
      * add unittest
      030d678c
    • C
      [Cherry-pick] Simplify the options of spawn based on fleetrun (#30144) (#30197) · 39204d56
      Chen Weihang committed
      * Simplify the options of spawn based on fleetrun (#30144)
      
      * Simplify the options of spawn based on fleetrun
      
      * polish details
      
      * polish doc details
      
      * cleanup enum test=develop (#29294)
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      39204d56
  7. 07 1月, 2021 2 次提交
    • W
      [cherry pick] Fix bug in paddle.save/load and paddle.static.save/load when saving large files (#30170) · bfb6f613
      WeiXin committed
      * Support storage of large parameters (#29988)
      
      * Support storage of large parameters
      
      * Reduce the complexity of the unittest
      
      * Reduce the complexity of the unittest,commented out unittest for
      
      * add unittest for static.save/load
      
      * Increase the timeout threshold of 'test_static_save_load'
      
      * Increase the timeout threshold of 'test_static_save_load'
      
      * Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
      
      * Increase the timeout threshold of 'test_static_save_load' and 'test_paddle_save_load'
      
      * Extend the timeout for the (#30151)
      bfb6f613
    • L
      [cherry pick] Some optimizations of elementwise_add, gelu and dropout for AMP (#30152) · 07f68fad
      Leo Chen committed
      * Improve performance of elementwise_add grad op (#29187)
      
      * pass stop_gradient for cast op
      
      * improve performance of elementwise_add grad
      
      * use tensor copy async
      
      * dygraph branch
      
      * fix dygraph branch
      
      * add ut
      
      * make gelu fp16 computing more robust (#29484)
      
      * Add fast path for dropout when p == 0  (#29553)
      
      * add fast path for p == 0 in dropout
      
      * add ut
      07f68fad
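The "fast path for dropout when p == 0" idea from the commit above can be illustrated with a small pure-Python sketch. This is not the Paddle kernel: the `dropout` function below is hypothetical, and it assumes inverted dropout (survivors are scaled by `1 / (1 - p)` during training).

```python
import random


def dropout(values, p, training=True, rng=random):
    """Inverted dropout over a list of floats, with fast paths.

    Hypothetical helper for illustration; scaling survivors by
    1 / (1 - p) is an assumption about the dropout mode.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        # Fast path: nothing is dropped, so skip mask generation entirely.
        return list(values)
    if p == 1.0:
        # Everything is dropped; also avoids dividing by zero below.
        return [0.0 for _ in values]
    scale = 1.0 / (1.0 - p)
    # Keep each element with probability 1 - p and rescale the survivors.
    return [v * scale if rng.random() >= p else 0.0 for v in values]
```

The point of the fast path is that when `p == 0` the output equals the input, so no random numbers or mask need to be produced.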