- 14 Jul, 2023 1 commit
-
-
Committed by caozhou
* distribute best cfg * adapt to multi args transmission * update metric extracting * fix bugs of prune and reading log * fix time default value * remove time record * adjust the order of searching dim * fix prune bugs * fix adding cfg bug * fix multi nodes bug * reset status * remove alarm and set logdir * deepcopy ctx * change alarm * fix restart bug * add exit * best no need alarm * add warmup time
-
- 30 Jun, 2023 1 commit
-
-
Committed by sneaxiy
-
- 25 Jun, 2023 1 commit
-
-
Committed by Chitsing KUI
-
- 20 Jun, 2023 1 commit
-
-
Committed by Azure
* add auto tuner * compare and record module * revert launch main * add prune rule * add unit test * add auto tuner * revert launch main * add prune rule * modify unit test script * fix bug for dump nodes; fix bug for checking log file * fix bug --------- Co-authored-by: caozhou <caozhou@radi.ac.cn>
-
- 19 Jun, 2023 1 commit
-
-
Committed by Chitsing KUI
* no endpoints in dy mode * fix fleet api inconsistent
-
- 14 Jun, 2023 1 commit
-
-
Committed by caozhou
* add auto tuner * fix prune * fix sharding prune and mbs candidates * fix cfg * fix launch * fix launch * add unittest * fix code style
-
- 12 Jun, 2023 1 commit
-
-
Committed by Nyakku Shigure
-
- 08 Jun, 2023 1 commit
-
-
Committed by Chitsing KUI
-
- 07 Jun, 2023 1 commit
-
-
Committed by risemeup1
* replace requests with httpx * set timeout=3 * replace requests with httpx * replace request with httpx * test * replace requests with httpx * test * replace requests with httpx * replace requests with httpx * modify paddle_build.sh * fix bug
-
- 11 May, 2023 1 commit
-
-
Committed by jjyaoao
-
- 10 May, 2023 1 commit
-
-
Committed by Chitsing KUI
* add log overwrite flag * use strtobool
-
- 24 Apr, 2023 1 commit
-
-
Committed by 张春乔
-
- 23 Apr, 2023 1 commit
-
-
Committed by Chitsing KUI
* save env log for each worker * fix ut
-
- 13 Apr, 2023 1 commit
-
-
Committed by TaoTao Li
* add auto parallel tuner options in launch * add ut for launch in auto_parallel tuner; fix code format * fix ci-coverage
-
- 06 Apr, 2023 1 commit
-
-
Committed by Kim Yann
* rem is_compiled_with_npu * rem npu related code * make lint happy * rem test * remove some tests * Update grad_scaler.py * fix an error
-
- 03 Apr, 2023 1 commit
-
-
Committed by Kim Yann
* rem is_compiled_with_mlu * fix some mlu_place and mlu_device_coount * make lint happy
-
- 31 Mar, 2023 1 commit
-
-
Committed by 张春乔
* autofix Co-authored-by: Liyulingyue <83450930+Liyulingyue@users.noreply.github.com> * revert changes in python/paddle/distributed/fleet/utils/hybrid_parallel_util.py * empty commit, trigger ci * fix test_slice --------- Co-authored-by: SigureMo <sigure.qaq@gmail.com>
-
- 30 Mar, 2023 1 commit
-
-
Committed by cyberslack_lee
[CodeStyle][C416][C417] rewrite unnecessary comprehension with function call and use generator instead of map (#52140) * codestyle c416 c417 * fix error * fix inc * unify all C4 rules into one * fix inc --------- Co-authored-by: SigureMo <sigure.qaq@gmail.com>
-
- 25 Mar, 2023 1 commit
-
-
Committed by gouzil
-
- 23 Mar, 2023 1 commit
-
-
Committed by Infinity_lee
-
- 20 Mar, 2023 1 commit
-
-
Committed by Chitsing KUI
-
- 13 Dec, 2022 1 commit
-
-
Committed by jjyaoao
* first pr * Revise nn.py * Revise nn.py 2.0 * Revise rnn.py;test=document_fix * test=document_fix Co-authored-by: Ligoml <39876205+Ligoml@users.noreply.github.com>
-
- 08 Dec, 2022 1 commit
-
-
Committed by Ghost Screaming
* Fix bug of reduce_sum op: when input.numel() > INT32_MAX, its result is wrong. * Remove climits. * Clean fluid API in paddle/distributed and paddle/fleetx folders. This includes the following files: python/paddle/distributed/__init__.py, python/paddle/distributed/collective.py, python/paddle/distributed/fleet/utils/fs.py, python/paddle/distributed/fleet/utils/hybrid_parallel_inference.py, python/paddle/distributed/fleet/utils/hybrid_parallel_util.py, python/paddle/distributed/fleet/utils/internal_storage.py, python/paddle/distributed/launch/context/device.py, python/paddle/distributed/parallel.py, python/paddle/distributed/parallel_with_gloo.py, python/paddle/distributed/spawn.py, python/paddle/framework/__init__.py. Note that 'paddle.fluid.dygraph.parallel.ParallelEnv' and 'fluid.framework.core' remain unchanged in those files. ParallelEnv is used by paddle.fluid.dygraph.parallel.DataParallel. However, APIs in paddle.fluid.dygraph.parallel can't be migrated to paddle.distributed, as there are cyclic import dependencies in modules like paddle.static and paddle.tensor. 'fluid.framework.core' will be changed to import framework.core after fluid.core is transmitted. * Change TODO authors.
-
- 29 Nov, 2022 1 commit
-
-
Committed by Nyakku Shigure
* isort all files * revert conflicting files * revert conflicting files * revert conflicting files
-
- 14 Nov, 2022 1 commit
-
-
Committed by Nyakku Shigure
[CodeStyle][F821] fix undefined variables due to missing imports, misspelled variable names (#47899) * `hann` -> `_hann` * `false` -> `False` * a missing passed argument `reduce_all` * some missing imports * `device_type` -> `heter_device_type` * `PKVClient` -> `KVClient` * fix some typos and missing imports
-
- 09 Nov, 2022 1 commit
-
-
Committed by Tony Cao
* fix flake8 CodeStyle E266 * fix comments
-
- 08 Nov, 2022 1 commit
-
-
Committed by Nyakku Shigure
* [CodeStyle][py2][U004] unnecessary explicit `object` inheritance in class definition * fix an increment
-
- 03 Nov, 2022 1 commit
-
-
Committed by Nyakku Shigure
* [CodeStyle][py2][U008] remove unnecessary args in `super()` * remove remained args * revert changes in test_pylayer_op * Revert "revert changes in test_pylayer_op" This reverts commit ff185a9ae738afac3b0264f61bde6c6b7f72e7c4. * revert some changes in example code
-
- 01 Nov, 2022 1 commit
-
-
Committed by Nyakku Shigure
* [CodeStyle][py2] remove `six` package (part2) * six.ensure_str * remove unused `import six` * remove six from BUILTIN_LIKELY_MODULES * remove six in example code * remove some decode * try to fix example code * fix MockEtcdClient get/get_prefix returns data type * fix MockEtcdClient get_prefix returns data * fix MockEtcdClient get returns data * remove `six` in pypi and conda requirements * fix MockEtcdClient add_watch_callback/add_watch_prefix_callback returns data type * refine MockEtcdClient
-
- 23 Oct, 2022 1 commit
-
-
Committed by Nyakku Shigure
* update config * re-blacken python code * temporarily disable date and diff_py_file * skip a format
-
- 19 Oct, 2022 1 commit
-
-
Committed by Nyakku Shigure
-
- 13 Oct, 2022 1 commit
-
-
Committed by Xinger
* add rpc module in cpp side * add rpc module in python side * support win32 and mac for rpc * code optimization * optimize code * update rpc * update rpc launch * rpc remove rank and world_size api * fix logger import bug * remove support for win and mac * remove support for xpu, npu, cinn and rocm * remove support for xpu, npu, cinn and rocm * fix shutdown barrier timeout bug * update:python_rpc_handler to shared ptr * fix master shutdown first bug * tests support for cpu * update log to vlog * update get service info api * add single process test case * remove process group * remove some useless dependencies * update rpc api comments * update rpc comments: Example to Examples * update rpc api comments * update rpc api comments * update launch api comments * update init_rpc comments * update rpc sync and async comments * fix bug: init_rpc can't be called repeatedly in a process * update rpc api comment: make master endpoint unique * update rpc api:service to worker, timeout_ms to timeout * rename ServiceInfo to WorkerInfo * refactor: rename server to worker, log to vlog * add launch test * remove unused codes * refine
-
- 12 Oct, 2022 1 commit
-
-
Committed by Nyakku Shigure
* [CodeStyle][F401] remove unused import in python/paddle/distributed * remove pass * empty commit * Fix ValueError: list.remove(x): x not in list for meta_optimizer_names. Fix ValueError: list.remove(x): x not in list for meta_optimizer_names. * Fix split import. Fix split import. * add noqa after meta_optimizers in factory * restore collective ops * expand `import *` * add noqa after required imports * try to fix APIs without core.ops * Revert "try to fix APIs without core.ops" This reverts commit 6172beaf601e84bf61f2490c12c4739f0edaa5eb. * fix an increment * empty commit * add noqa after required imports * expand `import *`, fix ci error Co-authored-by: Shuangchi He <34329208+Yulv-git@users.noreply.github.com>
-
- 14 Sep, 2022 1 commit
-
-
Committed by Nyakku Shigure
* trim trailing whitespace * fix `.cmake-format.py` * revert npu ut changes, avoid npu ci error
-
- 22 Aug, 2022 1 commit
-
-
Committed by kuizhiqing
-
- 19 Aug, 2022 1 commit
-
-
Committed by kuizhiqing
* rewrite get free port strategy * hide the old one
-
- 18 Aug, 2022 1 commit
-
-
Committed by kuizhiqing
-
- 17 Aug, 2022 1 commit
-
-
Committed by kuizhiqing
-
- 11 Aug, 2022 1 commit
-
-
Committed by kuizhiqing
-
- 08 Aug, 2022 1 commit
-
-
Committed by kuizhiqing
* make launch compatible * fix ut * fix log offset
-