1. 01 6月, 2022 1 次提交
  2. 25 5月, 2022 1 次提交
    • M
      Dynamic graph support to Automatic SParsity. (#41177) · e5fc68b2
      Ming-Xu Huang 提交于
      * Dynamic graph support to Automatic SParsity.
      
      1. Added dynamic support to ASP module (paddle.fluid.contrib.sparsity).
      2. Added ASP related unit-tests regards to above changes.
      3. Put ASP module under paddle.static for now, waiting for APIs confirmation from Paddle.
      
      * Modified documents of functions to have correct examples.
      
      * Update in_dygraph_mode to paddle.in_dynamic_mode()
      
      * Modified documents of functions and added comments
      
      * Minor changes.
      
      * Fix example errors in asp API.
      
      * Code Change for Review
      
      1. Added more examples in documents.
      2. Chaged test_asp_pruning_static.
      
      * Minor changes
      
      * Update ASP function documents.
      
      * Update ASP function documents.
      
      * Reduce test case size of asp pruning due CI time limit.
      
      * Update time limitation to some asp UTs.
      
      * Fix sample code errors.
      
      * Fix sample code errors.
      
      * Fix sample code errors.
      
      * Update time limitation to parts of ASP UTs.
      
      * Update UTs to fit with CI.
      
      * Reduce problem size in python/paddle/fluid/tests/unittests/asp/test_fleet_with_asp_dynamic.py
      
      * Added paddle.asp
      
      * Fixed type casting error of OpRole.Optimize in new dygraph mode.
      
      * Made set_excluded_layers be compatible with 2.2
      
      * Fix example code of calculate_density.
      
      * Update code examples.
      
      * Move paddle.asp to paddle.incubate.asp
      
      * Fixed an example error of calculate_density
      e5fc68b2
  3. 18 5月, 2022 1 次提交
    • W
      Add support for forward and reverse high-order automatic differentiation mechanism (#41919) · f6ee202f
      WangZhen 提交于
      * Updated triple_grad_check func
      
      * add todo for gradient checker and refine some comments
      
      * remove additional code
      
      * add test for warnging in backward.py
      
      * format python code
      
      * support multi input in triple gradient checker
      
      * Add matmul triple grad kernel
      
      * Updated comments of TODO
      
      * Supported some special tests
      
      * Change code-format to follow CI std
      
      * Updated gradient_checker.py
      
      * Fix conflicts
      
      * Removed unnecessary printing log
      
      * Change code style to follow CI std
      
      * merge upstream
      
      * add priops.py
      
      * add_p
      
      * rm useless files
      
      * add sub_p mul_p div_p
      
      * add sqrt_p and tanh_p
      
      * add reshape_p
      
      * add broadcast_p
      
      * Add python primitive wrappers.
      
      * Jvp rules updated.
      
      * JVP rules done for all the 17 primops.
      
      * quick check and fixes.
      
      * add jvp(op, *args)
      
      * add broadcast_p fill_constant_p matmul_p reduce_p reshape_p transpose_p
      
      * add split_p and concat_p
      
      * add gather_p and scatter_add_p
      
      * add slice_select_p and slice_assign_p
      
      * Add transpose rules.
      
      * add multi input check for add_p, sub_p, mul_p, div_p
      
      * update concat_p
      
      * Linearize and transpose in progress..
      
      * refine gather_p and scatter_add_p
      
      * updated.
      
      * update transpose.
      
      * refine slice_assign_p and slice_select_p
      
      * init commit for lower
      
      * Merged with primitive ops.
      
      * small update
      
      * add rules for orig2prim and prim2orig
      
      * add 9 test for prim ops
      
      * add more test and fix some bug
      
      * add more test
      
      * register proto
      
      * Adding primops test.
      
      * add shape valid check for broadcast_p op, and add keepdim attr into reduce_p op proto
      
      * support multi input and multi output for split_p and concat_p
      
      * Test updated.
      
      * update
      
      * fix slice bug for slice_select_p and slice_assign_p
      
      * updated.
      
      * Ops updated.
      
      * Refactor and bug fixes.
      
      * updated.
      
      * finish orig2prim and prim2orig rules
      
      * dtype for axis attr should be long int
      
      * update dtype for axis attr int64_t
      
      * update for iscan CI
      
      * Update primx.
      
      * Refactor vars in primx.
      
      * update for lower transform
      
      * add more shape and dtype check
      
      * update primx.py
      
      * change IndexTensor into int32 dtype
      
      * update
      
      * Fix linearize and transpose.
      
      * Update is_dot
      
      * Update is_dot
      
      * Update is_dot
      
      * add gradient aggregation, fix add_transpose.
      
      * pass first linearize+transpose test.
      
      * update test
      
      * refactor op registration and primx.
      
      * update rule for slice_assign
      
      * try test lower
      
      * update orig2prim and prim2orig
      
      * pass simple lower pass
      
      * update
      
      * Update input types in the unit test.
      
      * orig2prim segfault.
      
      * 50% for adam.minimize
      
      * test updated.
      
      * temp fix erros in removing vars.
      
      * primx updated.
      
      * update for matmul_v2 and reshape2 orig2prim
      
      * update for minimize
      
      * Refine primrules
      
      * Remove some code
      
      * supporting unused and unreachable vars.
      
      * update for use prim2orig in minimize
      
      * fix gather and scatter_add transpose.
      
      * Add rules UT
      
      * update scatter_add
      
      * Refine UT code
      
      * fix nonetype check in topo
      
      * Update gather_p pywrapper.
      
      * remove useless print
      
      * Merge tongxin PR and refine code
      
      * readd some test
      
      * rm useless print
      
      * polish code.
      
      * fix bug in minimize
      
      * add get_input_var_list and get_output_var_list and use it in lower
      
      * Fix scatter_add_p prim2orig
      
      * Update code and fix orig2prim/prim2orig UT
      
      * delete vars after block.desc._remove
      
      * Improve ops and vars clean up logics.
      
      * fix some bug in linearize and lower
      
      * update tanh transpose.
      
      * use set instead of list for var2remove
      
      * test updated.
      
      * polish code.
      
      * fix dot2bar delete.
      
      * merge tx/ad
      
      * add indextensor_dot for gather and scatter_add
      
      * add sorted for set
      
      * Fix scale_orig2prim params
      
      * fix some syntax bug
      
      * add golbal_lower_update list
      
      * Better handling of unused vars.
      
      * update tests.
      
      * Fix elementwise_sub orig2prim
      
      * support none for transpose rule
      
      * Merge and add transform UT
      
      * fix a bug in transpose
      
      * Fix transpose and UT
      
      * a hacky fix for cancat op
      
      * Fix exector place
      
      * Refine variable name
      
      * Add elementwise_mul orig2prim and support p_norm when p=1
      
      * Add sqrt orig2prim rule and UT
      
      * merge wz test
      
      * rename files, add enable_prim, disable_prim, prim_enabled, delete global_lower_update
      
      * fix a bug in test_ad_transform_trans
      
      * revert modify in framework.py
      
      * add paddle.fluid.incubate.ad_transform to  python/setup.py.in
      
      * Fix remove vars error
      
      * Fix p_norm_orig2prim
      
      * merge wz
      
      * Modify the code directory
      
      * Add utils.py and remove get_input/output_vars functions
      
      * Update maolin code
      
      * Rename UT and refine test_ad_transform_primops
      
      * Fix div_p jvp rule
      
      * Add higher derivatives UT
      
      * Remove UT to autograd dir
      
      * Fix comments
      
      * import paddle in primops.py
      
      * Add some error message for assert
      
      * Refine UT class name and refine some comments in primreg.py
      
      * update minimize of paddle/optimizer for supporting new autograd
      
      * resolve cicular importing between backward.py and optimizer.py
      
      * fill gradients and minimize unittest
      
      * Replace `assert isinstance` with `raise TypeError`
      
      * Add some assert message for primx.py
      
      * Polish variable name
      
      * Add some assert message
      
      * add some docstring
      
      * refine some name
      
      * update the format of english documents
      
      * Split test_transform.py to two files to avoid ci error
      
      * fix the document format of enable_prim/disable_prim/prim2orig/prim_enabled
      
      * polish test_gradients_and_minimize
      
      * add default value for prim_enabled api doc
      
      * Remove some UT to avoid windows ci error
      
      * Enlarge test_gradients_and_minimize limit time
      
      * Fix ut limit time
      Co-authored-by: Nveyron95 <veyron_wu@163.com>
      Co-authored-by: NJiabin Yang <360788950@qq.com>
      Co-authored-by: Nlevi131 <limaolin01@baidu.com>
      Co-authored-by: NTongxin Bai <waffle.bai@gmail.com>
      Co-authored-by: NXiaoxu Chen <chenxx_id@163.com>
      Co-authored-by: Nlevi131 <83750468+levi131@users.noreply.github.com>
      f6ee202f
  4. 27 4月, 2022 1 次提交
    • R
      Fix paddle setup (#42254) · 8395d660
      Roc 提交于
      * expose api
      
      * ref clipgradbynorm
      
      * update
      
      * Update __init__.py
      8395d660
  5. 25 4月, 2022 1 次提交
  6. 06 4月, 2022 2 次提交
  7. 04 4月, 2022 1 次提交
  8. 02 4月, 2022 1 次提交
    • X
      Enhance vjp/jvp/Jacobian/Hessian API for supporting dynamic, static graph and... · 9e764d82
      Xiaoxu Chen 提交于
      Enhance vjp/jvp/Jacobian/Hessian API for supporting dynamic, static graph and batched, unbatched mode (#40692)
      
      * modify vjp/jvp for both dynamic and static graph
      
      * enforce jacobian class for supporting first/last batch
      
      * add unittest for jvp, jacobian withlast batch, jacobian with first batch
      
      * fix the incorrect shape when multi-index Jacobian
      
      * enforce Hessian class for supporting dynamic graph
      
      * add Hessian class unittest
      
      * bugfix, jvp double_backward_trick zeros_like return stop_gradient=True in static graph
      
      * add API beta warnnings
      
      * add white_list for cuda11.x ci windows.
      
      * optimize some code snippets and documments
      
      * set unittest timeout to 100 seconds
      
      * move vjp,jvp,Jacobian,Hessian to incubate
      
      * fix vjp,vjp import path of sample code
      
      * fix code style error of augtograd/__init__ file
      9e764d82
  9. 31 3月, 2022 1 次提交
    • S
      [New API]: miminize_bfgs and miminize_lbfgs (#40710) · e7928a06
      Sing_chan 提交于
      * [New API]: miminize_bfgs and miminize_lbfgs
      
      * modify for python module call correctly
      
      * add functional package, add error raise in static_graph, change assign to set_value
      
      * unify static_graph and dygraph, fix bug when x or H0 is float64
      
      * now only accept input is tensor, put check args in utils.py, put exception test together
      
      * temp
      
      * add more detailed algorithm illustration and comment, reduce test case to limit test time in 15s
      
      * change in_dygraph_mode to in_dynamic_mode
      
      * fix bug of sample code; reduce test case to reduce test time
      
      * change dir to incubate
      e7928a06
  10. 24 3月, 2022 1 次提交
  11. 23 3月, 2022 1 次提交
  12. 21 3月, 2022 1 次提交
  13. 15 3月, 2022 1 次提交
  14. 10 3月, 2022 1 次提交
    • H
      Inference add ONNXRuntime back-end (#39988) · 431afc39
      heliqi 提交于
      * add onnxruntime predictor
      
      * Add code comments
      
      * support link paddle2onnx onnxruntime
      
      * support onnxruntime with python
      
      * support onnxruntime with python
      
      * support onnxruntime with windows
      
      * paddle2onnx compile with windows
      
      * supoort windows compile
      
      * supoort windows compile with onnxruntime
      
      * supoort windows compile with paddle2onnx
      
      * supoort mac compile
      
      * compile with mac
      
      * compile with mac
      
      * add code comments
      
      * fix remind word
      
      * code optimization
      
      * add test case
      
      * add test case
      
      * add inference demo_ci test case
      
      * fix compile paddle2onnx with no python
      
      * add inference demo_ci test case
      
      * add inference demo_ci test case
      
      * add inference infer_ut test case
      
      * support c go api and test cases
      
      * add converage test case
      
      * add converage test case
      
      * add capi test case
      
      * add capi test case
      431afc39
  15. 09 3月, 2022 1 次提交
  16. 08 3月, 2022 1 次提交
    • C
      add python profiler package (#40065) · 10325a82
      chenjian 提交于
      * add python profiler package
      
      * update according to review
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * add unit test
      
      * Revert "add unit test"
      
      This reverts commit 4e69ff71b0645e069afe5dd8fea0d07717852c48.
      
      * reduce for pr
      
      * add unit test
      
      * modify for pr
      
      * fix unittest
      
      * update for ci coverage
      
      * modify according to review
      
      * fix bug
      
      * improve coverage
      10325a82
  17. 04 3月, 2022 1 次提交
  18. 03 3月, 2022 1 次提交
  19. 28 2月, 2022 1 次提交
  20. 21 2月, 2022 1 次提交
  21. 20 2月, 2022 1 次提交
  22. 15 2月, 2022 1 次提交
    • R
      [PluggableDevice] Add custom runtime support (#38740) · 3e7825f3
      ronnywang 提交于
      * [CustomRuntime] Add DeviceManager
      
      * [CustomRuntime] Add DeviceInterface
      
      * [CustomRuntime] Add Stream, Event, DeviceGuard, CallbackManager
      
      * [CustomRuntime] Add plug-in device
      
      * [CustomRuntime] Memory module support PluggableDevice
      
      * [CustomRuntime] Add WITH_PLUGGABLE_DEVICE cmake option
      
      * update
      
      * [API] update API doc based on comments, test=develop
      Co-authored-by: Nqili93 <qili93@qq.com>
      3e7825f3
  23. 08 2月, 2022 1 次提交
    • Z
      ps optimize refactor (#38982) · 196dbfc2
      ziyoujiyi 提交于
      * delete gloo connect retry
      
      * the_one_ps dirs reconstruct
      
      * .
      
      * .
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * create the_one_ps dirs
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * the one ps dirs modify
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * refactor ps optimize
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * .
      
      * refactor theoneps
      
      * the_one_ps
      
      * add ps pass unittest
      
      * add ps pass unittest
      
      * ps unitest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * ps unittest frame
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * add cpu_async_ps_mode test
      
      * ps unittest ready
      
      * ps unittest ready
      
      * solve dist_pass init conflict
      
      * solve import CommContext error
      
      * unittest ok
      
      * implement AllocateFrom
      
      * solve setup.py.in conflict
      
      * solve conflict
      
      * solve conflict
      
      * solve conflict
      
      * .
      
      * .
      Co-authored-by: Nzkh2016 <zhangkaihuo@baidu.com>
      196dbfc2
  24. 29 1月, 2022 1 次提交
    • C
      [PTen] Tidy pten core headers (#39188) · dd990981
      Chen Weihang 提交于
      * open header for custom kernel
      
      * add core utils
      
      * tidy core code
      
      * tify header
      
      * tidy include
      
      * tidy namespace
      
      * resolve conflit
      
      * fix unittest and coverage
      
      * remove platform using
      
      * resolve conflict
      
      * resolve conflict
      
      * fix digamma namespace error
      
      * fix xpu full kernel error
      
      * fix xpu full kernel error
      
      * polish details
      
      * add place for lib storage
      dd990981
  25. 28 1月, 2022 1 次提交
    • F
      [PSLIB] Add Metrics Module, Support User-defined Add Metric (#38789) · 2e6be886
      Fan Zhang 提交于
      * [PSLIB] Add Metrics Module, Support User-defined Add Metric
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI Coverage
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI
      
      * [PSLIB] Modify According to CI Coverage
      
      * [PSLIB] Modify According to CI Coverage
      
      * [PSLIB] Modify According to CI Coverage
      
      * modify role_maker
      
      * update CMakeLists.txt
      2e6be886
  26. 27 1月, 2022 1 次提交
    • A
      [PluggableDevice] Add custom kernel support based on pten kernel management (#38848) · a8879215
      Aganlengzi 提交于
      * [Demo] custom kernel based on pten kernel
      
      * merge and npu custom work well
      
      * del comments
      
      * delete other code
      
      * fix CUDAContext
      
      * fix not found small_vector.h
      
      * support NPU
      
      * fix NPUContext
      
      * fix DeviceContext support
      
      * add UT
      
      * fix call
      
      * add UT
      
      * fix
      
      * fix for comments and ut
      
      * add MACRO control
      
      * fix multi input output
      
      * support env CUSTOM_DEVICE_ROOT
      
      * deal with special cases
      
      * fix for Windows
      
      * try coverage with test_custom_kernel_dot.py
      
      * fix test_custom_kernel_dot
      
      * fix test_custom_kernel_dot
      
      * fix merge
      
      * fix merge
      
      * fix CI
      
      * update
      
      * merge and fix
      
      * remove WITH_CUSTOM_KERNEL
      
      * fix merge
      
      * merge and fix
      
      * fix ut
      
      * fix ut for mac
      
      * add more UT
      
      * add more UT
      
      * fix
      a8879215
  27. 26 1月, 2022 1 次提交
  28. 20 1月, 2022 1 次提交
  29. 05 1月, 2022 1 次提交
    • W
      [Eager] Support test imperative basic in eager test_empty_grad (#38376) · 9108e777
      wanghuancoder 提交于
      * Rearranged Eager AutoCodeGen directory structure
      
      * Removed USE_OP in Eager AutoCodeGen
      
      * Enabled generation for Operators without Grad/Inputs/Outputs
      
      * Resolved operators without input
      
      * Fixed merge conflicts
      
      * Enabled Eager AutoCodeGen for 10+ more operators
      
      * Refactored Eager AutoCodeGen with more organized helper objects
      
      * Enabled Eager AutoCodeGen for operators with multiple OpBases
      
      * Adjusted Eager AutoCodeGen to Enable Passing Output Tensor as Input Argument
      
      * Handled Dispensable Inputs/Outputs in Eager AutoCodeGen
      
      * Adjusted function generation/call between Python-C API & Dygraph API
      
      * Synchronized auto-generated Python-C API with Dygraph Forward Functions
      
      * support more eager tensor api
      
      * fix merge compile error
      
      * fix compile error and fit develop code
      
      * support pure CPU
      
      * fix some logic error in eager_mode
      
      * support _varbase_creator in eager mode
      
      * Added safe_initialized interface to EagerTensor for use in processing dispensable inputs
      
      * for eager mode
      
      * refine
      
      * support multiple constructor for eager tensor
      
      * add place related code
      
      * polish code
      
      * specific randint with dtype of int64
      
      * Support pure cpu test
      
      * eager logic
      
      * refine test in pure cpu
      
      * eager logic
      
      * eager logic
      
      * eager logic, test=develop
      
      * skip core.eager when in inference, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * call RetainGrad after run forward kernel, test=develop
      
      * refine, test=develop
      
      * support dygraph util, meta, guard test
      
      * eager test case
      
      * support inference test
      
      * refine test and fix initializer failed
      
      * modify eagertensor patch method
      
      * add eagertensor.clear_grandint, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * call monkey_patch_varbase in _test_eager_guard, test=develop
      
      * split clear_gradient to clear_gradient and zero_grads, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      Co-authored-by: Njim19930609 <jim19930609@gmail.com>
      Co-authored-by: NJiabinYang <360788950@qq.com>
      9108e777
  30. 23 12月, 2021 1 次提交
  31. 07 12月, 2021 1 次提交
  32. 03 12月, 2021 1 次提交
    • W
      [Eager] publish python c api for eager (#37550) · 07b4fe93
      wanghuancoder 提交于
      * refine a test case, test=develop
      
      * publish python c api for eager, test=develop
      
      * revert modify about test_allclose_layer.py, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * delete numpy includes, use pybind11 numpy.h, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      
      * suport eager error msg, and add grad test case, test=develop
      
      * refine, test=develop
      
      * refine, test=develop
      07b4fe93
  33. 29 11月, 2021 1 次提交
  34. 19 11月, 2021 1 次提交
    • W
      Add fuse_resnet_unit pass (#36818) · 3cd3bf29
      wuhuanzhou 提交于
      * GeneratePass support attr condition and mapping, test=develop
      
      * fix coverage, test=develop
      
      * Add fuse_resnet_unit pass, test=develop
      
      * fix CI errors, test=develop
      
      * fix CI errors, test=develop
      
      * fix unittest error when compiling without CUDA, test=develop
      
      * fix static ci error, test=develop
      
      * limit kernel size must equal 1, test=develop
      3cd3bf29
  35. 15 11月, 2021 2 次提交
    • C
      [Pten] Refactor the implementation of custom operator (#37122) · 1e598f1a
      Chen Weihang 提交于
      * move extension into pten [no-verify]
      
      * append tensor methods by ext_tensor [no-verify]
      
      * append other tensor methods [no-verify]
      
      * ext related files tidy [no-verify]
      
      * include relation tidy [no-verify]
      
      * add pten tensor test [no-verify]
      
      * replace tensor in custom op & compile success
      
      * refine tensor constructor for unittest
      
      * custom relu jit run success
      
      * fix all custom op unittests
      
      * add inference cmake adapt [no-verify]
      
      * fix failed unittests
      
      * fix windows failed unittests
      
      * try to fix kunlun and inference failed
      
      * fix test_elementwise_api error
      
      * try to fix win compile failed
      
      * fix kunlun fp16 type error
      
      * remove useless haddle error macro
      
      * add custom linear op test
      
      * fix compile failed & add win symbols
      
      * fix non pten kernel cast failed
      
      * add dll decl for api
      
      * polish several deetails
      
      * polish details by review comment
      
      * add dll_decl for register
      1e598f1a
    • Z
      Add distributed pass framework: including PassBase/PassTest/PassUtils (#36643) · 12339fa0
      Zeng Jinle 提交于
      * add split_program
      
      * make ut faster
      
      * increase ut timeout
      
      * make result deterministic
      
      * add fuse_all_reduce pass
      
      * add ut framework, update
      
      * fix ut framework
      
      * remove useless code
      
      * add coverage support
      
      * update
      
      * fix CI
      
      * fix some bugs and fix ci coverage
      
      * fix conflict
      12339fa0
  36. 10 11月, 2021 1 次提交
  37. 04 11月, 2021 1 次提交
  38. 29 10月, 2021 1 次提交