1. 14 Apr, 2021 1 commit
    • A
      adds new CPU kernel for SGD op supporting BF16 data type (#32162) · 3ac6c189
      Adam Osewski committed
      * Initial draft for SGD BF16 kernel.
      
      * Unit tests for SGD with BF16 data type.
      
      * Add VLOG message to SGD BF16 op CPU kernel.
      
      * Enhance error messages and error types.
      
      * Refactor SGD op kernels to leverage some common code.
      
      * Make it easier to add new kernel invoke code.
      
      * Fix SGD op kernel for sparse grad.
      
      * Unify quotes style.
      
      * Fix error for ROCM compilation.
      
      * Use specialized PADDLE_ENFORCE_xx functions.
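The BF16 SGD kernel described in this commit can be illustrated with a minimal sketch. It is not the Paddle kernel itself: it assumes BF16 values are stored as the upper 16 bits of an IEEE float32, that the update is computed in float32 and rounded back with round-to-nearest-even, and it ignores NaN handling. The function names are hypothetical.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Round a float to BF16 and return its 16-bit pattern (NaN not handled)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest even: add 0x7FFF plus the lowest bit that will be kept.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Widen a 16-bit BF16 pattern to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def sgd_bf16(param_bits, grad_bits, lr):
    """Dense SGD step: param = param - lr * grad, with BF16 storage."""
    out = []
    for p, g in zip(param_bits, grad_bits):
        p32 = bf16_bits_to_float(p)
        g32 = bf16_bits_to_float(g)
        out.append(float_to_bf16_bits(p32 - lr * g32))
    return out
```

For example, `sgd_bf16([0x3F80], [0x3F80], 0.5)` updates a parameter of 1.0 with a gradient of 1.0 and a learning rate of 0.5.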
  2. 31 Mar, 2021 1 commit
  3. 02 Mar, 2021 1 commit
    • G
      lamb_op_xpu;test=kunlun (#31012) · d79fdc3d
      Gradie committed
      * lamb_op_xpu;test=kunlun
      
      * modify lamb_op_xpu.cc;test=kunlun
      
      * delete atol lamb_op_xpu; test=kunlun
      
      * update xpu.cmake;test=kunlun
      
      * test_error 1e-5,lamb_op_xpu;test=kunlun
      
      * error1e-5,lamb_op_xpu,test=kunlun
      
      * delete atol lamb_xpu;test=kunlun
      
      * modify atol,lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu, XPUOptest;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu,modify xpu_cmake; test=kunlun
      
      * lamb_op_xpu;test=kunlun
      
      * lamb_op_xpu,modify xpucmake;test=kunlun
  4. 19 Jan, 2021 1 commit
  5. 17 Jan, 2021 1 commit
  6. 12 Jan, 2021 1 commit
  7. 09 Jan, 2021 1 commit
  8. 08 Jan, 2021 1 commit
    • Z
      Support pure fp16 training for AMP API. (#29544) · 7f7dfccf
      Zhen Wang committed
      * add cast ops before and after unsupported fp16 ops.
      
      * Keep partial net in FP32 pattern.
      
      * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
      
      * Add fp16 support for adam op.
      
      * add multi precision attr for adam.
      
      * Fix the bug of test_multi_precision_fp16_train UT.
      
      * Code format for CI.
      
      * Fix the redefine error about MPTypeTrait on windows.
      
      * fix bugs of the _create_accumulators func in Momentum.
      
      * fix bug when inserting post cast op.
      
      * Add the update_loss_scaling op in allow_set of UnusedVarCheck.
      
      * Update for ci coverage.
      
      * Add some doc for OptimizerWithMixedPrecision.
      
      * Fix the code style.
      
      * Improve the doc of `amp_init`.
      
      * Change for fp16 testing if users have the infer program defined in a separate way.
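The roles of `check_finite_and_unscale` and `update_loss_scaling` mentioned in this commit can be sketched in plain Python. This is a simplified illustration of dynamic loss scaling in general, not the Paddle ops: the parameter names and defaults (`incr_every_n`, `incr_ratio`, `decr_ratio`) are assumptions, and the scale is reduced on the first non-finite step rather than after several.

```python
import math


def check_finite_and_unscale(grads, loss_scale):
    """Divide gradients by the loss scale and report any inf/NaN."""
    unscaled = [g / loss_scale for g in grads]
    found_inf = any(not math.isfinite(g) for g in unscaled)
    return unscaled, found_inf


def update_loss_scaling(loss_scale, good_steps, found_inf,
                        incr_every_n=2, incr_ratio=2.0, decr_ratio=0.5):
    """Return the new (loss_scale, good_steps) pair after one step."""
    if found_inf:
        # Overflow: shrink the scale (not below 1.0) and reset the counter.
        return max(loss_scale * decr_ratio, 1.0), 0
    good_steps += 1
    if good_steps >= incr_every_n:
        # Enough consecutive finite steps: grow the scale.
        return loss_scale * incr_ratio, 0
    return loss_scale, good_steps
```

In a training loop built on this sketch, the optimizer step would be skipped whenever `found_inf` is true, so that a bad gradient never reaches the parameters.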
  9. 30 Dec, 2020 1 commit
  10. 21 Dec, 2020 1 commit
    • L
      Optimize compilation time with Unity Build (#29733) · 2e5b4a21
      LoveAn committed
      * Test compilation time with less parallel count, notest, test=windows_ci
      
      * optimize rules of Unity Build, notest, test=windows_ci, test=windows_op
      
      * limit parallel counts used only on GPU, test=develop
      
      * remove limit of argument /m:8 on Windows, test=develop
  11. 07 Dec, 2020 1 commit
    • L
      Compiling operator libraries with Unity build (#29130) · 671555ed
      LoveAn committed
      * Compiling operator libraries with Unity Build on Windows CPU.
      
      * Compiling operator libraries with Unity Build on Windows GPU, no_test, test=windows_ci
      
      * Add option in windows ci script, no_test, test=windows_ci
      
      * Optimize parallel compiling, test=develop
      
      * remove limit of parallel compile and skip some ops in UB, test=develop
      
      * remove changes of header file, test=develop
      
      * remove changes of header file, test=develop
      
      * fix test_eye_op unittest failed, test=develop
      
      * Compiling operator libraries with Unity Build on Linux, test=develop
      
      * set default WITH_UNITY_BUILD=OFF, test=develop
      
      * Move unity build rules into a single file and add comment, test=develop
      
      * optimize parallel compilation, test=develop
      
      * fix undefined reference error on coverage ci, test=develop
  12. 02 Dec, 2020 2 commits
    • Z
      Remove some useless log. (#29300) · 9b59a589
      Zhen Wang committed
    • Z
      Add pure fp16 training with master weights. (#27712) · be3777a5
      Zhen Wang committed
      * add the weight decay func for the momentum op
      
      * Add the multi_precision function in Momentum Optimizer.
      
      * Make sure that the initial values of the master weights are the same as the fp16 weights.
      
      * add static loss scaling.
      
      * add the rescale_grad function in the pure fp16 training.
      
      * use the original momentum updating method.
      
      * Polish some codes, such as variable names.
      
      * add docstring for apis.
      
      * update the var creation details of _create_master_weight.
      
      * not modify codes about imperative momentum updating.
      
      * Fix the error of test_dist_sparse_tensor_load_momentum UT.
      
      * add unit test for multi precision fp16 training.
      
      * add more unit tests for CI.
      
      * Use lower threshold values for allclose comparing in test_multi_precision_fp16_train UT.
      
      * For CI Coverage Checking.
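The motivation for master weights in pure fp16 training can be shown with a small sketch. It is an illustration under assumptions, not the code from this commit: it uses plain SGD instead of Momentum, emulates fp16 and fp32 rounding with the standard `struct` module, and all function names are hypothetical.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sgd_pure_fp16(weight, grad, lr, steps):
    """Update the weight directly in fp16; small updates can round away."""
    w = to_fp16(weight)
    for _ in range(steps):
        w = to_fp16(w - to_fp16(lr * grad))
    return w


def sgd_with_master_weight(weight, grad, lr, steps):
    """Keep an fp32 master copy, update it, then cast back to fp16."""
    w16 = to_fp16(weight)
    master = to_fp32(w16)  # master starts equal to the fp16 weight
    for _ in range(steps):
        master = to_fp32(master - lr * grad)
        w16 = to_fp16(master)
    return w16, master
```

With a weight of 1.0, a gradient of 1.0, and a learning rate of 1e-4, the pure fp16 update is smaller than half the fp16 spacing near 1.0, so the weight never moves, whereas the fp32 master copy accumulates the updates and the fp16 copy eventually follows.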
  13. 23 Nov, 2020 1 commit
  14. 06 Nov, 2020 1 commit
  15. 19 Oct, 2020 2 commits
    • Y
      xpu adam op (#28031) · 6f0c3d1f
      yinhaofeng committed
      * lookup_table_xpu op report errors;test=kunlun
      
      * add adam xpu op;test=kunlun
      
      * reset lookup
      
      * change adam wrong;test=kunlun
    • C
      Fix xpu error message (#28061) · 5f04875c
      Chengmo committed
      * fix error message,test=kunlun
      
      * fix, test=kunlun
  16. 14 Oct, 2020 2 commits
  17. 13 Oct, 2020 1 commit
  18. 27 Sep, 2020 1 commit
  19. 22 Sep, 2020 1 commit
  20. 21 Sep, 2020 1 commit
  21. 09 Sep, 2020 1 commit
  22. 29 Aug, 2020 1 commit
    • J
      Adadelta Optimizer (#26590) · a1b99fae
      Jiawei Wang committed
      * add doc; notest
      
      * fix doc; notest
      
      * update doc; notest
      
      * refine optimizer && adam
      
      * refine optimizer; notest
      
      * add adam
      
      * fix doc
      
      * fix doc && add adamw; notest
      
      * add error message
      
      * bug fix
      
      * refine rmsprop && adamax
      
      * fix ci
      
      * bug fix
      
      * update comment
      
      * unify arguments place; notest
      
      * fix ut, test=develop
      
      * bug fix
      
      * fix conflicts, test=develop
      
      * add examples code
      
      * bug fix
      
      * fix comments
      
      * fix sample code
      
      * add sample code for Optimizer
      
      * add adamax ut, test=develop
      
      * fix rmsprop ut, test=develop
      
      * add ut for optimizer.py and adamw.py
      
      * first commit of adadelta optimizer
      
      * fix learning rate
      
      * fix adadelta doc and add sgd momentum
      
      * remove unused fluid
      
      * fix codestyle
      
      * Update test_adam_op.py
      
      * Update test_adam_op.py
      
      * fix SGD in 2 unittests
      
      * fix SGD in 2 unittests
      
      * fix ci
      
      * fix ut
      Co-authored-by: MRXLT <xlt2024@gmail.com>
      Co-authored-by: mapingshuo <mps2012@yeah.net>
  23. 28 Aug, 2020 1 commit
  24. 11 Jul, 2020 1 commit
  25. 03 Jun, 2020 1 commit
  26. 13 May, 2020 3 commits
  27. 26 Apr, 2020 1 commit
    • L
      improve efficiency of runtime InferVarType (#22778) · 9a93f6aa
      liuwei1031 committed
      * save InferVarType changes, test=develop
      
      * remove code comments, test=develop
      
      * tweak code, test=develop
      
      * fix compilation warning, update merge_ids_op split_ids_op to new interface, test=develop
      
      * modify fused_bn_activation_op, test=develop
      
      * fix error of fused_bn_activation_op, test=develop
      
      * fix PADDLE_ENFORCE and unittest coverage issue, test=develop
      
      * tweak PADDLE_ENFORCE messages, test=develop
      
      * improve unittest coverage, test=develop
      
      * add StaticGraphInferVarType class, test=develop
      
      * rebase develop branch, test=develop
      
      * fix unittest error, test=develop
      
      * remove comments, test=develop
      
      * improve unittest coverage, test=develop
      
      * improve error message and improve unittest coverage, test=develop
      
      * upgrade InferVarType API, test=develop
      
      * tweak pyfunc error message, test=develop
      
      * fix compilation conflict - save_combine_op, test=develop
  28. 07 Apr, 2020 1 commit
  29. 04 Apr, 2020 1 commit
    • C
      Delete Ref & VectorRef and add GetDataSafely (#22997) · 16315d3d
      Chen Weihang committed
      * delete invalid check interface Ref & VectorRef, test=develop
      
      * fix vector ref delete error, test=develop
      
      * try the new check interface, test=develop
      
      * change all related code with new check macro, test=develop
      
      * remove static assert, test=develop
      
      * polish detail, test=develop
      
      * skip coverage problem, test=develop
      
      * add new check macro, test=develop
  30. 27 Feb, 2020 1 commit
    • Z
      Refine adam op to improve performance, test=develop (#22346) · 72dde4ab
      zhaoyuchen2018 committed
      * Refine adam op, test=develop
      
      * Fuse kernels together to reduce cpu time.
      
      * Refine paddle enforce, test=develop
      
      * Remove some comments, test=develop
      
      * Refine code,test=develop
      
      * Refine cuda kernel, test=develop
      
      * Refine code according to comments, test=develop
  31. 09 Jan, 2020 1 commit
    • Z
      test Optimizer in dygraph (#21949) · d0f0a252
      zhongpu committed
      * test Optimizer in dygraph, test=develop
      
      * add optest for Optimizer in dygraph, test=develop
      
      * fix adagrad optimizer, test=develop
      
      * fix dpsgd optimizer, test=develop
      
      * fix test_optimizer.py, test=develop
      
      * fix dpsgd optimizer, this op only supports CPU, test=develop
      
      * add optest for optimizer, test=develop
      
      * add description for dpsgd, test=develop
      
      * add rmsprop to white_list in unused_var_check.cc, test=develop
      
      * polish code style, test=develop
      
      * polish code style, test=develop
      
      * delete seed attribute for DpsgdOptimizer, test=develop
      
      * change testing to debugging, test=develop
  32. 24 Dec, 2019 1 commit
    • A
      Optimize adam speed (#21777) · 51a86d2b
      Aurelius84 committed
      * optimize adam speed by removing _finish_update test=develop
      
      * fix SparseAdamFunctor param list test=develop
      
      * Remove scale_op in expect_list of adam_op test=develop
      
      * fix test optimizer loss assert error test=develop
      
      * fix test optimizer loss assert error test=develop
      
      * modify PADDLE_ENFORCE usage test=develop
      
      * fix op_type in lamb_op.cc test=develop
      
      * fix errors ostream format bug test=develop
      
      * add betaPowOut in ngraph op test=develop
      
      * fix ngraph::op api for gcc8 test=develop
      
      * clean code test=develop
      
      * modify struct into class test=develop
      
      * remove code of beta1Tensor in lamb_op test=develop
  33. 06 Dec, 2019 1 commit
    • H
      Add Much Complex Test and Fix Bugs for Control Flow cond API (#21532) · 1dcf6a72
      Huihuang Zheng committed
      Add tests that use dy/dx to make sure the gradient values calculated by the control flow backward pass are correct. Also fix bugs detected by those tests.
      
      Fix bugs:
      
      1. Unlike sum_op, optimizer ops don't allow uninitialized input tensors. But in conditional_block_grad_op, the conditional_block may not run, so the output gradient tensor may be uninitialized, which causes an error in the optimizer op. To fix it, we should either let optimizer ops support uninitialized input like sum_op, or assign 0 to the uninitialized gradient when the conditional_block_grad_op doesn't run. I found there are more than 10 optimizer ops. **To keep it simple, I just assign 0 to the output gradient of the conditional_block_grad_op in this PR**. It can be further explored whether optimizer ops can support uninitialized input tensors like sum_op does, because in theory skipping the assignment in conditional_block_grad_op would be faster.
      
      2. Infer parameter shapes during append_backward. I didn't know that all our parameters are in the global block. When op_desc infers shapes in a sub-block, it may not know the shapes of the gradients of parameters whose shape information is in the global block. I fixed it by inferring the shapes of gradients from the forward var.
      
      This PR also includes some code cleanup:
      1. Print the var name when sgd_op catches a shape error so that it is easier to debug
      2. Fix a typo: dicta -> dict
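The first bug fix in this commit can be sketched as follows. This is a toy model of the idea only, not Paddle code: gradients are plain Python lists, `None` stands in for an uninitialized tensor, and both function names are hypothetical.

```python
def conditional_block_grad(cond, upstream_grad, param_size):
    """Gradient of a parameter that is used inside a conditional block."""
    if cond:
        # The block ran, so backward produced a real gradient.
        return list(upstream_grad)
    # The block did not run: emit zeros instead of leaving the
    # gradient uninitialized (the approach taken in this PR).
    return [0.0] * param_size


def sgd_step(param, grad, lr):
    """Toy optimizer op that rejects uninitialized or mis-shaped gradients."""
    if grad is None:
        raise ValueError("optimizer op received an uninitialized gradient")
    if len(param) != len(grad):
        raise ValueError("param and grad shapes differ")
    return [p - lr * g for p, g in zip(param, grad)]
```

With the zero fill, `sgd_step([1.0, 2.0], conditional_block_grad(False, None, 2), 0.5)` leaves the parameters unchanged, whereas passing `None` straight to `sgd_step` would raise. The alternative mentioned in the message, teaching every optimizer op to accept uninitialized input, would avoid the zero fill at the cost of touching each op.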
  34. 29 Nov, 2019 2 commits
    • C
      Fix optimizer op infershape failed in dygraph multi-cards mode (#21374) · 664f958a
      Chen Weihang committed
      * add param & grad shape check for sgd op
      
      * add _reshape_inplece interface for dygraph parallel
      
      * refine unittest based on paddle/models scripts, test=develop
      
      * add unittest for parallel grad fuse, test=develop
    • H
      Add dygraph execution context (#20157) · ac854670
      hong committed
      * add_dygraph_execution_context
      
      * add dygraph infershape context and execution context; test=develop
      
      * fix imperative bug; test=develop
      
      * remove inputs outputs interface from execution context, because it has the same function as inputNames; test=develop
      
      * remove tracer_test ctest; test=develop
      
      * fix split op bug; test=develop
      
      * fix unit tests bug; test=develop
      
      * fix distribute test bug; test=develop
      
      * fix ngraph compile bug; test=develop
      
      * fix grad maker bug; test=develop
      
      * fix load op bugs; test=develop
      
      * fix operator.cc construct bug; test=develop
      
      * remove useless name find in operator; test=develop
      
      * add tracer_test; test=develop
      
      * fix concat, split bug; test=develop
      
      * remove tracer_test unit test; test=develop
      
      * fix attribute check bug; test=develop
      
      * add test code to fix coverage; test=develop
      
      * remove useless code, change check backward input in engine; test=develop
      
      * unlock var type infer shape;test=develop
      
      * add ShareAllLoD api; test=develop
      
      * add dygraph infershape context unit test; test=develop
      
      * remove increase and decrease lod in dygraph; test=develop
      
      * add override; test=develop
      
      * fix increase decrease lod; test=develop
      
      * fix paddle_enforce; test=develop
      
      * disable lod op dygraph check; test=develop
      
      * fix paddle enforce error; test=develop
      
      * add comment for op_registry and OperatorBase; test=develop
      
      * optimize the comment of op_registry; test=develop
      
      * fix format of comment; test=develop
      
      * fix format of comment; test=develop
      
      * optimize the format of comment; test=develop
      
      * optimize the format of the comment; test=develop
      
      * optimize comment of op_registry; test=develop