1. 18 Sep, 2021 1 commit
  2. 15 Sep, 2021 1 commit
  3. 24 Aug, 2021 1 commit
    • Add auto completion module for auto parallel (#34813) · 93d862b0
      Yulong Ao committed
      * add auto_parallel dir
      
      * mv to paddle.distributed
      
      * add shard_xx api
      
      * add distributed attrs for var
      
      * add ut, test=develop
      
      * add dist
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update, test=develop
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update
      
      * update, test=develop
      
      * update, test=develop
      
      * update
      
      * update
      
      * delete unused proto
      
      * restore op_desc
      
      * restore type_defs
      
      * update var_desc
      
      * remove dims_mapping for proto_pybind
      
      * update interface.py
      
      * update framework.py
      
      * update
      
      * update
      
      * add auto_parallel dir
      
      * mv to paddle.distributed
      
      * add shard_xx api
      
      * add distributed attrs for var
      
      * add ut, test=develop
      
      * [WIP] Add the auto completion feature and related codes
      
      * [WIP] Improve the auto completion and related codes
      
      * [WIP] Make the auto completion to support data-parallel
      
      * [WIP] Make the completion support mp and dp+mp
      
      * [WIP] Refactor auto completion unit test for MLP
      
      * [WIP] Refactor the implementation of DistributedOperatorImpl
      
      * [WIP] Improve dims_mapping update rule and fix a bug
      
      * [WIP] Support auto completion for one transformer decoder layer
      
      * [WIP] Add a minor change
      
      * [WIP] Fix a bug within the unit test
      
      * Shard XShape tensor, add embedding completion and refactor code
      
      * Add the distributed_operators dir to setup.py.in
      
      * Improve the completion process and add the unittest for gpt
      
      * fix process_mesh ut
      
      * fix process_mesh ut
      
      * update
      
      * update, test=develop
      
      * Add support for automatically completing distributed attrs of special ops
      
      * update
      
      * update
      
      * update
      
      * fix doc sample codes, test=develop
      
      * improve coverage, test=develop
      
      * add static_mode check, test=develop
      
      * Model the cluster for cost model and physical mapping
      
      * update, test=develop
      
      * add set_placement, test=develop
      
      * Add a check to make sure the candidate tensors' size is greater than zero
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update doc, test=develop
      
      * update, test=develop
      
      * Auto mark dist attrs annotated by user
      
      * update ndarray to nested list, test=develop
      
      * update, test=develop
      
      * Add auto-completion module for auto-parallel (based on PR#33804)
      
      * Remove unnecessary files
      
      * Remove unrelated files for the auto completion pr
      
      * Update the unit test to improve the coverage
      
      * Modify codes based on reviews
      
      * Minor changes for CI
      
      * Improve some codes based on new comments
      
      * Fix bugs caused by shallow copy in attributes.py
      * Improve amend_distributed_attr_for_program in context.py
      * Other changes for weihang's comments
      Co-authored-by: Nsandyhouse <lilong12@baidu.com>
  4. 15 Jul, 2021 1 commit
    • Class for processing program (#33439) · 85642a0d
      huangxu96 committed
      This PR creates a class to process the program at the C++ level. Currently, this class has one class method:
      GetInputsOutputsInBlock()
  5. 26 Apr, 2021 1 commit
  6. 24 Sep, 2020 1 commit
    • use iwyu clean include (#27267) · df43905f
      wanghuancoder committed
      * use iwyu clean include, test=develop, test=win
      
      * compilation error, test=develop
      
      * fix compilation error2, test=develop
      
      * fix compilation error3, test=develop
      
      * fix compilation error4, test=develop
      
      * fix compilation error5, test=develop
      
      * fix compilation error6, test=develop
      
      * fix compilation error7, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error8, test=develop
      
      * fix compilation error10, test=develop
      
      * fix compilation error11, test=develop
  7. 23 Jun, 2020 1 commit
  8. 11 May, 2020 1 commit
    • Add macro BOOST_GET to enrich the error information of boost::get (#24175) · aa0f254f
      Chen Weihang committed
      * add new macro BOOST_GET_SAFELY & unittests, test=develop
      
      * add different macro type, test=develop
      
      * fix get macro type in executor, test=develop
      
      * four macro part change backup
      
      * using one macro for all case, test=develop
      
      * revert attribute change, test=develop
      
      * change to three func to solve gcc4.8 bug, test=develop
      
      * polish some details, test=develop
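The idea behind the BOOST_GET commit above, a checked accessor that reports where a failed variant access happened, can be sketched as follows. This is only an illustration under stated assumptions: it uses C++17 `std::variant` in place of `boost::variant` so that it is self-contained, and the names `SKETCH_GET` and `GetWithContext` are hypothetical, not Paddle's macro or its implementation.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <variant>

namespace sketch {

// Hypothetical helper: return the alternative of type T held by `v`, or throw
// an error that names the accessed expression and the call site.
template <typename T, typename Variant>
T& GetWithContext(Variant& v, const char* expr, const char* file, int line) {
  if (T* held = std::get_if<T>(&v)) {
    return *held;
  }
  std::ostringstream msg;
  msg << "bad variant access in `" << expr << "` at " << file << ":" << line
      << " (held alternative index " << v.index() << ")";
  throw std::runtime_error(msg.str());
}

}  // namespace sketch

// Hypothetical macro: captures the expression text, file, and line at the
// call site so the error message can point at it.
#define SKETCH_GET(TYPE, VALUE) \
  ::sketch::GetWithContext<TYPE>((VALUE), #VALUE, __FILE__, __LINE__)
```

With `std::variant<int, std::string> attr = 7;`, the call `SKETCH_GET(int, attr)` returns a reference to the stored `int`, while `SKETCH_GET(std::string, attr)` throws a `std::runtime_error` whose message includes the expression `attr` and the source location.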
  9. 31 Oct, 2019 1 commit
    • GradMaker for dygraph (#19706) · 8c4573a3
      hong committed
      * refactor dygraph,test=develop
      
      * fix failed unittest,test=develop
      
      * polish code,test=develop
      
      * check windows ci error,test=develop
      try to fix windows ci error by np.allclose,test=develop
      
      * polish vlog and profiler, test=develop
      
      * try to fix preceding ops order,test=develop
      
      * test transformer in windows ci, test=develop
      
      * use python c-api to speed up tracer.trace,test=develop
      
      * test=develop, fix docker with paddle nccl problem
      
      * test=develop, add ut for debug string and gradient_accumulator
      
      * test=develop, add tests for layer/gradient_accumulator/prepared_op
      
      * test=develop, fix compile error for test_prepared_op
      
      * test=develop, add more ut for dygraph
      
      * test=develop, create API.spec for dygraph api change
      
      * optimize grad maker; test=develop
      
      * optimize grad maker
      
      * test
      
      * grad make optim; test=develop
      
      * fix unittest bugs; test=develop
      
      * add dygraph grad op maker and split_op
      
      * grad op maker refactor; test=develop
      
      * add dygraph grad maker; test=develop
      
      * fix op deformable_conv_v1_op bug; test=develop
      
      * fix deformable_conv prroi pool bugs;
      
      * fix new op grad op maker bug; test=develop
      
      * fix split by ref bug; test=develop
      
      * fix dygraph auto prune bug; test=develop
      
      * fix test_trace bug; test=develop
      
      * fix fused emb seq pool bug; test=develop
      
      * remove useless code in op_desc file; test=develop
      
      * remove useless code, StrVarBaseNode; test=develop
      
      * fix review issues; test=develop
      
      * fix rank_loss grad maker; test=develop
      
      * remove flag in VarBase; test=develop
      
      * fix distributed_notify_op compile bug ; test=develop
      
      * fix reshape op double grad; test=develop
      
      * fix expand as op; test=develop
      
      * add imperative type_defs.h for demo_train; test=develop
      
      * fix inference lib cmake; test=develop
      
      * fix inference lib; test=develop
      
      * fix inference_lib; test=develop
      
      * fix inference cmake; test=develop
      
      * fix inference lib; test=develop
      
      * fix inference lib; test=develop
      
      * remove condition dygraph grad maker, modify local name; test=develop
      
      * fix split grad maker bug; test=develop
      
      * fix pyramid_op bug; test=develop
      
      * change travis time out limit; test=develop
      
      * restore travis; test=develop
      
      * change timeout limit; test=develop
  10. 21 Aug, 2019 1 commit
  11. 28 Mar, 2019 1 commit
  12. 12 Dec, 2018 1 commit
  13. 10 Dec, 2018 1 commit
  14. 28 Nov, 2018 1 commit
  15. 26 Oct, 2018 2 commits
  16. 17 Oct, 2018 1 commit
  17. 24 Aug, 2018 1 commit
  18. 15 Aug, 2018 1 commit
  19. 14 Aug, 2018 2 commits
  20. 22 Jun, 2018 1 commit
  21. 31 May, 2018 1 commit
  22. 22 May, 2018 2 commits
  23. 25 Apr, 2018 1 commit
  24. 19 Apr, 2018 1 commit
  25. 26 Feb, 2018 1 commit
  26. 12 Feb, 2018 1 commit
  27. 10 Feb, 2018 2 commits
  28. 23 Jan, 2018 1 commit
    • Memory optimization on Dynamic RNN (#7599) · d76fcb6f
      QI JUN committed
      * limit variable type to lod tensor in memory optimization transpiler
      
      * refine policy
      
      * support while operator
      
      * fix random seed and training data order
      
      * refine get_cfgs method to support multi while operators
      
      * refine codes
  29. 10 Jan, 2018 1 commit
  30. 22 Dec, 2017 2 commits
  31. 21 Dec, 2017 1 commit
  32. 20 Dec, 2017 1 commit
  33. 16 Nov, 2017 1 commit
    • feature/while_grad_op (#5554) · 18f0c40a
      Yang Yang(Tony) committed
      * first commit
      
      * Python API for while op
      
      * Python Unittest for simple while_op forward
      
      * fix out to be list
      
      * Fix UT
      
      * VarType
      
      * Fix several bugs
      
      * Fix bug
      
      * Fix bug
      
      * Fix Bug
      
      * Fix bug
      
      * Fix unittest
      
      * Remove debug log
      
      * Add comments
      
      * add PADDLE_ENFORCE
      
      * while_grad_op first commit
      
      * Add `BlockDescBind::FindRecursiveOrCreateVar()` and fix bugs
      
      * not sure how to setdim of while outputs
      
      * push for test
      
      * add executor vlog
      
      * fix bug of while_op cond
      
      * Several enhancements to the code
      
      1. Backward always infers shape and var type, since RENAME variables
      are created when creating a backward operator, but their shapes and
      var types were not inferred.
      2. Never use SomePtr-> directly, since any pointer returned from a
      function could be nullptr. Add `detail::Ref` to cast a pointer to a
      reference safely.
      3. Enhance the error messages for backward.
      4. Infer the data type of variables in `sum` and `tensor_write`
      
      * Fix bugs of while_op gradient
      
      * Fix several bugs of while_op grad
      
      * fix fill zeros like
      
      * fix 3 >= 3
      
      * fix place holder shouldn't be null
      
      * fail on sum op
      
      * Fix SumOp of TensorList
      
      * clean up
      
      * pass while test
      
      * fix test_array_write_read
      
      * pass sum op
      
      * Support int/int64 for fill_constant_batch_size_like
      
      * Fix compile
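Point 2 of the while_grad_op commit above (never dereference a returned pointer directly; convert it to a reference through a checked helper) can be illustrated with a minimal sketch. The `Ref` function below is a hypothetical stand-in for the `detail::Ref` helper the commit names, not its actual implementation, and `VarDesc` and `FindVar` are invented here purely for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical stand-in for `detail::Ref`: convert a possibly-null pointer
// into a reference, throwing instead of dereferencing nullptr.
template <typename T>
T& Ref(T* ptr, const std::string& what = "pointer") {
  if (ptr == nullptr) {
    throw std::invalid_argument(what + " must not be null");
  }
  return *ptr;
}

// Illustrative variable descriptor (assumption, not Paddle's VarDesc).
struct VarDesc {
  std::string name;
};

// Illustrative lookup that returns nullptr when no variable matches.
inline VarDesc* FindVar(VarDesc* vars, std::size_t count,
                        const std::string& name) {
  for (std::size_t i = 0; i < count; ++i) {
    if (vars[i].name == name) {
      return &vars[i];
    }
  }
  return nullptr;
}

}  // namespace sketch
```

Instead of writing `FindVar(vars, n, "z")->name`, which is undefined behavior when the lookup fails, a caller writes `Ref(FindVar(vars, n, "z"), "var z").name` and gets an exception that names the missing object.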
  34. 28 Oct, 2017 1 commit
    • Python API for inference model saving/load (#5020) · 6783dcee
      fengjiayi committed
      * Add `dump_to_file()` for ProgrameDescBind in pybind
      
      * Update
      
      * Add utility.py
      
      * typo
      
      * Fix bugs
      
      * Move add_feed/fetch_components to utility.py
      
      * Complete dump
      
      * Follow comments
      
      * Change output of Prune() from inference to pointer
      
      * Expose Prune() to Python
      
      * Complete save/load API of inference model
      
      * Fix errors
      
      * Debugging
      
      * Complete unit tests
      
      * follow comments
  35. 27 Oct, 2017 1 commit