1. 14 1月, 2021 11 次提交
  2. 13 1月, 2021 9 次提交
  3. 12 1月, 2021 9 次提交
  4. 11 1月, 2021 11 次提交
    • L
      [Cherry-Pick] Support vector<double> as type of op attribute and op set_value... · d839761e
      liym27 提交于
      [Cherry-Pick] Support vector<double> as type of op attribute and op set_value suppport vector<double> as value (#30126) (#30305)
      
      Cherry-Pick #30126
      1. Support vector<float64> as type of op attribute.
      2. op set_value suppports float64 numpy.array
      d839761e
    • L
      [cherry-pick] Async drop scope in executor (#29714) #30285 · 93ce7f69
      Leo Chen 提交于
      [cherry-pick] Async drop scope in executor (#29714)
      93ce7f69
    • L
      [Cherry-Pick 2.0] Check the rank of input in kernel of set_value op (#30147) (#30301) · a2bbd06a
      liym27 提交于
      cherry-pick #30147,For op set_value, check input's rank < 7
      a2bbd06a
    • W
      [cherry pick] Add detailed error message for curandStatus_t, cublasStatus_t,... · 04cc659c
      WeiXin 提交于
      [cherry pick] Add detailed error message for curandStatus_t, cublasStatus_t, cusolverStatus_t (#30161) (#30280)
      
      为curandStatus_t、cublasStatus_t、cusolverStatus_t添加详细的报错信息。
      原始PR:#30161
      04cc659c
    • W
      [cherry pick] Fix bug for 'save mutiple method' (#30218) (#30278) · d9c70217
      WeiXin 提交于
      * Fix bug for 'save mutiple method'
      
      * To pass coverage.
      
      * edit code to pass coverage.
      
      * edit code to pass coverage.
      
      * add unittest for coverage.
      
      * change for coverage.
      
      * edit for coverage.
      d9c70217
    • P
      [Cherry-pick PR 29913], add View(reuse allocation) strategy on squeeze,... · 7c943a65
      pangyoki 提交于
      [Cherry-pick PR 29913], add View(reuse allocation) strategy on squeeze, unsqueeze, reshape, flatten op (#29913) (#30258)
      
      * add view strategy on squeeze,unsqueeze,reshape,flatten
      
      * add squeeze unittest
      
      * add unittests
      
      * use View strategy as name rather than Reuse Allacation
      
      * fix view api doc
      
      * fix format
      
      * use core.ops when input of reshape2 is Tensor
      
      * fix test_cross_entropy_loss error because of reshape2
      
      * delete selected_rows
      
      * change op_function
      
      * little change
      
      * solve HandleViewBetweenInputAndOutput
      7c943a65
    • Z
      [cherry-pick]add cast cuda kernel (#29352) #30263 · afbc6367
      Zhang Ting 提交于
       add cast cuda kernel
      
      cherry-pick #29352
      afbc6367
    • H
      [Cherry-pick] Add Static Variable Clone (#30208) #30270 · 6dd70b9b
      Huihuang Zheng 提交于
      Cherry-pick of PR #30208 , this PR added clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat where users called clone of dygraph Tensor.
      6dd70b9b
    • W
      [cherry-pick]add support for place string representation #30264 · fb66355e
      wangchaochaohu 提交于
      cherry-pick #28769, add support for place string representation 
      fb66355e
    • W
      [cherry-pick]Elementwise add grad GPU kernel optimization (#30276) · e59524f8
      wangchaochaohu 提交于
      * elementwise_add_grad Op optimization  (#29575)
      
      * optimize for long width for elementwise (#29602)
      
      * refine (#29622)
      
      * delete the code for fp16 optimization because it is not faster than common template code (#29715)
      
      * fix the shape choose of vectorize for cuda
      
      * optimization for fp16 elementwise add (#29744)
      
      * Fix the compiler error for half type (#29799)
      
      * refine the compiler error for half2 operation (#29816)
      
      * fix the compiler error when gcc4 cuda9.0 (#29997)
      e59524f8
    • Z
      [Cherry-Pick] Support pure fp16 training for AMP API. (#29544) (#30241) · d8dfef54
      Zhen Wang 提交于
      * Support pure fp16 training for AMP API. (#29544)
      
      * add cast ops before and after unsupported fp16 ops.
      
      * Keep partial net in FP32 pattern.
      
      * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
      
      * Add fp16 support for adam op.
      
      * add multi precision attr for adam.
      
      * Fix the bug of test_multi_precision_fp16_train UT.
      
      * Code format for CI.
      
      * Fix the redefine error about MPTypeTrait on windows.
      
      * fix bugs of the _create_accumulators func in Momentum.
      
      * fix bug when inserting post cast op.
      
      * Add the update_loss_scaling op in allow_set of UnusedVarCheck.
      
      * Update for ci coverage.
      
      * Add some doc for OptimizerWithMixedPrecision.
      
      * Fix the code style.
      
      * Imporve the doc of `amp_init`.
      
      * Change for fp16 testing if users have the infer program defined in separate way.
      
      * Remove tensor copy in the update_loss_scaling op. (#29426)
      
      * remove tensor copy in the update_loss_scaling op
      
      * not use thrust.
      
      * fix some cuda memory access error.
      d8dfef54