- 08 Jun, 2022 1 commit
-
-
Committed by zhaoyingli
* add fetch_list * fix evaluate log * tiny fix
-
- 05 Jun, 2022 1 commit
-
-
Committed by Sing_chan
* use yapf to format all Python files * exclude two unittest files from yapf because they rely on writing and reading files, which formatting would break * disable diff_py_file because too many diff files cause the following command to fail
-
- 04 Jun, 2022 1 commit
-
-
Committed by Sing_chan
-
- 02 Jun, 2022 2 commits
-
-
Committed by zhaoyingli
* prepare only once
-
Committed by zhaoyingli
-
- 01 Jun, 2022 2 commits
-
-
Committed by caozhou
-
Committed by Yulong Ao
* [Auto Parallel] Add the parallel tuner * [Auto Parallel] Improve the parallel tuner and fix some bugs * update cost model * update import Resharder by dist op * update cost model * fix comp cost bug * update cost model * [Auto Parallel] Amend the dist attr for #processes=1 * update cost model and tuner * update cost model and tuner * update cost model and tuner * update cluster * update reshard * [Auto Parallel] Add the estimation from the cost model * [Auto Parallel] Reimplement the backup and restore functions * [Auto Parallel] Fix the bugs of the parallel tuner * [Auto Parallel] Update the engine api and dist context * [Auto Parallel] Work around the high order grad problem * [Auto Parallel] Add some miscellaneous improvements * [Auto Parallel] Add a unittest for DistributedContext Co-authored-by: caozhou <caozhou@radi.ac.cn>
-
- 19 May, 2022 2 commits
-
-
Committed by JZ-LIANG
* auto parallel support primitive op with data parallel * add primitive change * 5 loss 3D cylinder acc aligned * add unittest
-
Committed by zhaoyingli
* slice data in dist_loader & flag to scale grad * bug fix * update unittest * enable static
-
- 18 May, 2022 1 commit
-
-
Committed by caozhou
-
- 13 May, 2022 1 commit
-
-
Committed by Tao CHANG
-
- 10 May, 2022 2 commits
- 07 May, 2022 1 commit
-
-
Committed by Yulong Ao
* [Auto Parallel] Replace the old planner by the new partition tuner * [Auto Parallel] Improve the completion and distributed context * [Auto Parallel] Fix some bugs of the compatible check of some dist ops * [Auto Parallel] Fix some bugs
-
- 06 May, 2022 1 commit
-
-
Committed by zhaoyingli
* add default_ctx in backward.py * record grad_var_to_var with grad_times * fix backward * update annotation * add complete_high_order_grad in complete_forward * add dist slice op * update grad_var_to_var type * update partition_block init mapping before loss op * update compatible for 'XShape' & update 'allreduce_vars' * add dist reshape op when input dim equal to output dim * update 'set_grad_var_shape' with grad_var_to_var * fix dist slice * fix set_grad_var_shape * add dist pnorm op * fix dist pnorm dist_attr * fix engine startprogram & adapt highorder grad * fix set_grad_var_shape when mp * update unittest * update cmakelist * default strategy in engine: dp * bug fix * tiny fix * flatten outputs * fix default strategy * init default ctx * tiny fix * test=allcase
-
- 19 Apr, 2022 2 commits
-
-
Committed by zhaoyingli
* add dist reshape impl_idx=2 * fix cmakelist
-
Committed by zhaoyingli
* add dist_pnorm op * update cmakelist * fix cmakelist * fix cmakelist
-
- 18 Apr, 2022 1 commit
-
-
Committed by zhaoyingli
* add dist slice * fix debug * fix cmakelist
-
- 15 Apr, 2022 1 commit
-
-
Committed by caozhou
* update cluster
-
- 02 Apr, 2022 1 commit
-
-
Committed by pangyoki
-
- 30 Mar, 2022 1 commit
-
-
Committed by zhaoyingli
* fix converter when sliced_shape is 1 * update unittest
-
- 24 Mar, 2022 1 commit
-
-
Committed by caozhou
* refactor cost model
-
- 23 Mar, 2022 1 commit
-
-
Committed by zhaoyingli
* add dist_saver and update engine * add dist_saver and update engine
-
- 16 Mar, 2022 1 commit
-
-
Committed by Yulong Ao
* [Auto Parallel] Support the auto completion of while_op * [Auto Parallel] Improve the completion algorithms * [Auto Parallel] Fix bugs for ernie inference * [Auto Parallel] Remove attrs which cannot be pickled * [Auto Parallel] make the dims_mappings of LodTensorArray vars empty * [Auto Parallel] Fix bugs for the ernie inference in the pipeline parallel * [Auto Parallel] Remove unnecessary comments * [Auto Parallel] Fix a bug of the CMakeLists * [Auto Parallel] Use the newest APIs to write the unit test * [Auto Parallel] Remove unnecessary statements
-
- 15 Mar, 2022 2 commits
- 14 Mar, 2022 1 commit
-
-
Committed by zhaoyingli
* [AutoParallel] Converter Converter API
-
- 07 Mar, 2022 1 commit
-
-
Committed by zhaoyingli
* engine support pp * fix format * avoid multi print * fix convert * bug fix * add pp unittest
-
- 02 Mar, 2022 1 commit
-
-
Committed by JZ-LIANG
* adapt dist op * add dist_fill_constant_batch_size_like * remove print * update compatible * add unittest
-
- 22 Feb, 2022 2 commits
-
-
Committed by JZ-LIANG
* add subblock logic for context and partitioner * partitioner support sub blocks * revise typos * fixed param init bug for while * chmod 644 * add unittest * mv forward parser * update unittest * update dist op ctx * update dist op ctx * fixed bug in dist op ctx * fixed bug for recompute subblock
-
Committed by Yulong Ao
* [Auto Parallel] Add the high-level Engine API * Update the test cmakefile
-
- 27 Jan, 2022 1 commit
-
-
Committed by caozhou
* update planner * update unittest * update dist matmul * update auto converter
-
- 17 Dec, 2021 1 commit
-
-
Committed by caozhou
* add planner * add planner * add cost model update * add relaunch update * update process_group * fix error * add unittest * update unittest * update cost model * avoid api problem
-
- 07 Dec, 2021 1 commit
-
-
Committed by Yulong Ao
* [Auto Parallel] Add the unified cluster representation * [Auto Parallel] Add the graph class for physical mapping * [Auto Parallel] Add the simple physical mapper * Set the timeout of the mapper * Merge the upstream develop unittests cmake files * Fix a bug of the process group * Remove mapper unittest from platforms which is not GPU * Move the instantiation of process group after resharding * Add the local id for devices * Update the rank mapping format * [Auto Parallel] Relaunch with the rank mapping file * Remove the unnecessary json file * Avoid entering get_device_proc_info for auto mapping * Correct the mapper unit test * Add some comments * Remove the related files about mapping * Update the unittest for auto mapping * Remove unused rank_mapping unittest * Improve the unittest coverage * Improve the unittest coverage * Improve the unittest of relaunch * Fix the unittest problem in CI * Improve the unittest of relaunch * Remove unnecessary statements * Update the unittest cmakefile * Correct the cmakefile of auto parallel unittests * Modify codes based on the new elastic change * Use the GPUs exclusively in the unittest * Correct the cmakefile * Set the timeout of the unittest
-