1. 01 Apr, 2021 1 commit
  2. 30 Mar, 2021 1 commit
  3. 29 Mar, 2021 1 commit
  4. 26 Mar, 2021 1 commit
  5. 23 Mar, 2021 1 commit
  6. 10 Mar, 2021 1 commit
  7. 01 Mar, 2021 1 commit
  8. 23 Feb, 2021 2 commits
  9. 22 Feb, 2021 1 commit
  10. 09 Feb, 2021 3 commits
    • [feature] support npu allocator, part 2 (#30972) · 1201cd2e
      Leo Chen committed
      * support npu allocator
      
      * add npu device context
      
      * fix some compile problem
      
      * fix some compile problem
      
      * add npu info
      
      * compile ok
      
      * fix include dir
      
      * support naive_best_fit_allocator
      
      * run ut ok, but failed to exit
      
      * call aclrtResetDevice before exit
      
      * fix aclFinalize
      
      * add system allocator test
      
      * add selected_gpus in gtest
      
      * add tensor_test for npu
      
      * support npu op, initial commit
      
      * add npu stream
      
      * add elementwise_add_op
      
      * compile ok
      
      * fix typo
      
      * fix elementwise_add_op_npu_test
      
      * support op run
      
      * test can run but failed
      
      * change aclopExecuteV2 to aclopCompileAndExecute
    • [feature] support npu operator (#30951) · 7e049108
      Leo Chen committed
    • [feature] support npu allocator (#30840) · 81138239
      Leo Chen committed
  11. 08 Feb, 2021 1 commit
  12. 28 Jan, 2021 1 commit
  13. 27 Jan, 2021 1 commit
  14. 15 Jan, 2021 2 commits
  15. 14 Jan, 2021 1 commit
  16. 13 Jan, 2021 3 commits
    • skip quantizing ops in cpu inference (#30342) · 8e3a2940
      cc committed
      * skip quantizing ops in cpu inference, test=develop
    • Added support for inference using quantization aware trained dygraph (#30288) · 7bbf3ac5
      alncat committed
      * added support for inference using quantization aware trained dygraph
      
      * added support for inference using quantization aware trained dygraph
      correct boost get usage
      
      * Delete incorrect warning message (#30196)
      
      * fix warning and no grad
      
      * clean redundant API alias in 2.0 - part 2 (#30013)
      
      * delete paddle.nn.functional.assign
      
      * fix dynamic to static error
      
      * just add the op error message for the matmul xpu (#30246)
      
       add the op error message for the matmul xpu
      
      * Add Static Variable Clone (#30208)
      
      Add clone method for static Variable so that this interface will be same as dygraph. It fixed some bugs in dy2stat
      
      * use wget to replace curl to download the lcov file (#30229)
      
      * use wget to replace curl to download the lcov file
      
      * add cache for lcov
      
      * fix test_pool3d_op timeout issue (#30248)
      
      * Fix unittests bugs. (#30250)
      
      * modify error message based on comments (#30189)
      
      * modify error message based on comments
      
      * edit code according to review.
      
      * Correct spelling according to review.
      
      * Fix bug for 'save multiple method' (#30218)
      
      * Fix bug for 'save multiple method'
      
      * To pass coverage.
      
      * edit code to pass coverage.
      
      * edit code to pass coverage.
      
      * add unittest for coverage.
      
      * change for coverage.
      
      * edit for coverage.
      
      * added support for inference using quantization aware trained dygraph
      
      * Alias from  paddle.fluid.layers.auc to paddle.static.auc (#30206)
      
      * add alias from  fluid.layers.auc to static.auc
      
      * Update __init__.py
      
      * added support for inference using quantization aware trained dygraph
      correct boost get usage
      
      * corrected boost get usage
      
      * corrected naming issues and enforcing zero check
      
      * correct paddle enforce message
      
      * added more error checkings
      
      * corrected error report message and optimized code
      
      * corrected findvar usage
      
      * corrected paddle_enforce in scope
      
      * correct error messages
      
      * correct error reporting format
      Co-authored-by: LielinJiang <50691816+LielinJiang@users.noreply.github.com>
      Co-authored-by: XiaoguangHu <46782768+XiaoguangHu01@users.noreply.github.com>
      Co-authored-by: wawltor <fangzeyang0904@hotmail.com>
      Co-authored-by: Huihuang Zheng <zhhsplendid@gmail.com>
      Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
      Co-authored-by: Bai Yifan <me@ethanbai.com>
      Co-authored-by: gongweibao <weibao.gong@gmail.com>
      Co-authored-by: WeiXin <weixin10@baidu.com>
      Co-authored-by: Jiaqi Liu <liujiaqi06@baidu.com>
    • fix bug on compiling inference shared lib with crypto;test=develop (#30269) · 10a8f3e5
      Zhang Jun committed
      * fix bug on compiling inference shared lib with crypto;test=develop
      
      * fix cmake bug when build inference lib using -DWITH_CRYPTO=OFF
      
      * update cmake
      
      * remove unnecessary enforce message
  17. 12 Jan, 2021 3 commits
  18. 11 Jan, 2021 2 commits
  19. 10 Jan, 2021 1 commit
  20. 08 Jan, 2021 4 commits
    • Support pure fp16 training for AMP API. (#29544) · 7f7dfccf
      Zhen Wang committed
      * add cast ops before and after unsupported fp16 ops.
      
      * Keep partial net in FP32 pattern.
      
      * Support check_finite_and_unscale and update_loss_scaling for FP16 calculation mode.
      
      * Add fp16 support for adam op.
      
      * add multi precision attr for adam.
      
      * Fix the bug of test_multi_precision_fp16_train UT.
      
      * Code format for CI.
      
      * Fix the redefine error about MPTypeTrait on windows.
      
      * fix bugs of the _create_accumulators func in Momentum.
      
      * fix bug when inserting post cast op.
      
      * Add the update_loss_scaling op in allow_set of UnusedVarCheck.
      
      * Update for ci coverage.
      
      * Add some doc for OptimizerWithMixedPrecision.
      
      * Fix the code style.
      
      * Improve the doc of `amp_init`.
      
      * Change for fp16 testing if users have the infer program defined in separate way.
    • use cuda generator in bernoulli cuda kernel (#30199) · 789743e1
      Leo Chen committed
    • Add callback after TensorCopy (#30123) · 1f97d61c
      Leo Chen committed
      * change to tensor copy sync
      
      * change to tensor copy sync
      
      * make copy_to safe when use TensorCopy
      
      * refine code
      
      * add ut
      
      * add cudapinned garbagecollector
      
      * add testcase: cpu place -> cuda pinned place
    • 【Paddle.Fleet】Fix tensor table (#30075) · 528e03fc
      Chengmo committed
      * add tensor table
  21. 07 Jan, 2021 3 commits
  22. 06 Jan, 2021 1 commit
  23. 05 Jan, 2021 2 commits
  24. 04 Jan, 2021 2 commits