- 17 Dec, 2021 1 commit
-
-
Committed by Swain
* fix(nyz): fix doc generation python version(ci skip) * fix(nyz): modify dev doc branch trigger(ci skip)
-
- 16 Dec, 2021 1 commit
-
-
Committed by Will-Nie
* add comments for r2d2 * sort style * revise according to the comments * fix style * add r2d2 residual link + commnets * revise according to comments, add spaceinvader * add test for the model and fix test bugs
-
- 15 Dec, 2021 4 commits
-
-
Committed by Weiyuhong-1998
* ma mujoco env and masac code * env(wyh):ma mujoco agent id * feature(wyh):maqac continuous * fix(wyh):multi-mujoco add readme * fix(wyh): td error * fix(wyh)style * fix(wyh):multi agent mujoco test
-
Committed by niuyazhe
-
Committed by Xu Jingxin
* Init base buffer and storage * Use ratelimit as middleware * Pass style check * Keep the return original return value * Add buffer.view * Add replace flag on sample, rewrite middleware processing * Test slicing * Add buffer copy middleware * Add update/delete api in buffer, rename middleware * Implement update and delete api of buffer * add naive use time count middleware in buffer * Rename next to chain * feature(nyz): add staleness check middleware and polish buffer * feature(nyz): add naive priority experience replay * Sample by indices * Combine buffer and storage layers * Support indices when deleting items from the queue * Use dataclass to save buffered data, remove return_index and return_meta * Add ignore_insufficient * polish(nyz): add return index in push and copy same data in sample * Drop useless import * Fix sample with indices, ensure return size is equal to input size or indices size * Make sure sampled data in buffer is different from each other * Support sample by grouped meta key * Support sample by rolling window * Add import/export data in buffer * Padding after sampling from buffer * Polish use_time_check * Use buffer as dataset * Set collate_fn in buffer test * Init framework * Remove set_default, add keep * Move backward_stack to task * Fix total_step * Pydash pick is too slow * Add step records * Add async mode * Reuse forward and backward functions in sequence * Fix sample profile * demo(nyz): add atari pong runnable demo * Fix forward bug * Add task test * Test pong * feature(nyz): add deque buffer compatibility wrapper and demo * polish(nyz): polish code style and add pong dqn new deque buffer demo * Use sync mode * Config worker number * Init parallel mode * Add prev property on context * Mesh workers * First version of parallel mode * Make send rpc async * Dont pickle prev * Support tcp * More cleanup on system exit * Test parallel and task * Enable task copy * Test attach mode * Add with statment * Polish code * Raise exception when timeout in attach mode * Add event listeners * feature(nyz): add pendulum sac new pipeline demo * Fix main * Add profiler and step profiler * Rewrite parallel, cleanup res after task finished * Add comments * Remove ctx.prev * Enable standalone parallel mode * Remove hooks on ctx * Add max mean * demo(nyz): add pong dqn new pipeline demo * Ensure parallel sock closed before program exit * Fix parallel test * Fix pong * feature(zjow): add feature of profile in ding (#135) * add profiling feature in ding cli. * fix ding --profile cli. * reformat files. * reformat files again. * reformat files again. * Remove flameprof * Change kept_keys to set * Use finish as a properity * Use wrapper * Reformat step timer output * Test random seed * Revert learning rate * Add topology on parallel * Use labels on task * Star in parallel mode * Don't use daemon process * Auto sync finish state * Return logvars * Fix test wrapper * Fix test profiler helper * Pass flake_check * Lazy launch * Reporter * Replace main with main_sac * Fix parallel ctx * Fix test * Fix merge issues Co-authored-by: niuyazhe <niuyazhe@sensetime.com> Co-authored-by: zjowowen <93968541+zjowowen@users.noreply.github.com>
-
Committed by Ke Li
* feature(lk): fix port conflict * polish(lk): polish code style and format * fix(lk): change to subprocess
-
- 14 Dec, 2021 4 commits
-
-
Committed by niuyazhe
-
Committed by Will-Nie
* add trex algorithm for pong * sort style * add atari, ll,cp; fix device, collision; add_ppo * add accuracy evaluation * correct style * add seed to make sure results are replicable * remove useless part in cum return of model part * add mujoco onppo training pipeline; ppo config * improve style * add sac training config for mujoco * add log, add save data; polish config * logger; hyperparameter;walker * correct style * modify else condition * change rnd to trex * revise according to comments, add eposode collect * new collect mode for trex, fix all bugs, commnets * final change * polish after the final comment * add readme/test * add test for serial entry of trex/gcl * sort style * change mujoco to cartpole for test for trex_onppo * remove files generated by testing * revise tests for entry * sort style * revise tests * modify pytest * fix(nyz): speed up ppg/ppo and marl algo unittest * polish(nyz): speed up trex unittest and fix trex entry default config bug * fix(nyz): fix same name bug * fix(nyz): fix remove conflict bug(ci skip) Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
-
Committed by niuyazhe
-
Committed by Will-Nie
* add comments for r2d2 * sort style * revise according to the comments * fix style
-
- 13 Dec, 2021 1 commit
-
-
Committed by Swain
* feature(nyz): add delay reward mujoco env * test(nyz): add delay reward mujoco env test and fix bug
-
- 12 Dec, 2021 1 commit
-
-
Committed by Ming Zhang
-
- 09 Dec, 2021 2 commits
-
-
Committed by niuyazhe
-
Committed by Xu Jingxin
* Init base buffer and storage * Use ratelimit as middleware * Pass style check * Keep the return original return value * Add buffer.view * Add replace flag on sample, rewrite middleware processing * Test slicing * Add buffer copy middleware * Add update/delete api in buffer, rename middleware * Implement update and delete api of buffer * add naive use time count middleware in buffer * Rename next to chain * feature(nyz): add staleness check middleware and polish buffer * feature(nyz): add naive priority experience replay * Sample by indices * Combine buffer and storage layers * Support indices when deleting items from the queue * Use dataclass to save buffered data, remove return_index and return_meta * Add ignore_insufficient * polish(nyz): add return index in push and copy same data in sample * Drop useless import * Fix sample with indices, ensure return size is equal to input size or indices size * Make sure sampled data in buffer is different from each other * Support sample by grouped meta key * Support sample by rolling window * Add import/export data in buffer * Padding after sampling from buffer * Polish use_time_check * Use buffer as dataset * Set collate_fn in buffer test * feature(nyz): add deque buffer compatibility wrapper and demo * polish(nyz): polish code style and add pong dqn new deque buffer demo * feature(nyz): add use_time_count compatibility in wrapper * feature(nyz): add priority replay buffer compatibility in wrapper * Improve performance of buffer.update * polish(nyz): add priority max limit and correct flake8 * Use __call__ to rewrite middleware * Rewrite buffer index * Fix buffer delete * Skip first item * Rewrite buffer delete * Use caller * Use caller in priority * Add group sample Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
-
- 08 Dec, 2021 4 commits
-
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by Will-Nie
* add trex algorithm for pong * sort style * add atari, ll,cp; fix device, collision; add_ppo * add accuracy evaluation * correct style * add seed to make sure results are replicable * remove useless part in cum return of model part * add mujoco onppo training pipeline; ppo config * improve style * add sac training config for mujoco * add log, add save data; polish config * logger; hyperparameter;walker * correct style * modify else condition * change rnd to trex * revise according to comments, add eposode collect * new collect mode for trex, fix all bugs, commnets * final change * polish after the final comment * add readme/test * add test for serial entry of trex/gcl * sort style
-
Committed by Weiyuhong-1998
* fix(wyh):masac * feature(wyh):single agent discrete sac * feature(wyh):single agent discrete sac td * fix(wyh):fix pong bug * fix(wyh):fix smac bug * fix(wyh):masac_5m6m best config * env(wyh):allow SMAC env return ippo/isac obs * fix(wyh):masac polish * fix(wyh):masac style * fix(wyh):masac test
-
- 06 Dec, 2021 1 commit
-
-
Committed by niuyazhe
-
- 03 Dec, 2021 5 commits
-
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by Ke Li
* feature(lk): add initial version of MP-PDQN * fix(lk): fix expand function bug * refactor(nyz): refactor mpdqn continuous args inputs module * fix(nyz): fix pdqn scatter index generation * fix(lk): fix pdqn scatter assignment bug * feature(lk): polish mpdqn code and style format * feature(lk): add mpdqn config and test file * feature(lk): polish mpdqn code and style format * fix(lk): fix import bug * polish(lk): add test for mpdqn * polish(lk): polish code style and format * polish(lk): rm print debug info * polish(lk): rm print debug info * polish(lk): polish code style and format * polish(lk): add MPDQN in readme.md Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
-
Committed by Davide Liu
* added r2d2 + a2c configs * changed convergence reward for some env * removed configs that don't converge * removed 'on_policy' param in 2rd2 configs
-
Committed by Robin Chen
* update base env manager and test * add test reset once * update subprecess env manager and test * format code * update picking error * add unpickle catch for sync * fix reset waitingenv bug
-
- 02 Dec, 2021 1 commit
-
-
Committed by Robin Chen
* update base env manager and test * add test reset once * update subprecess env manager and test * format code * update picking error * add unpickle catch for sync
-
- 01 Dec, 2021 1 commit
-
-
Committed by niuyazhe
-
- 30 Nov, 2021 1 commit
-
-
Committed by niuyazhe
-
- 26 Nov, 2021 2 commits
-
-
Committed by 蒲源
* fix(pu): fix adam weight decay bug * feature(pu): add pitfall offppo config * feature(pu): add qbert spaceinvaders pitfall r2d3 config * fix(pu): fix expert offfppo config in r2d3 * fix(pu): fix pong connfig * polish(pu): add loss statistics * fix(pu): fix loss statistics bug * polish(pu): polish pong r2d3 config * polish(pu): polish r2d3 pong and lunarlander config * polish(pu): delete unused files
-
Committed by Robin Chen
fix(crb): add renew for env manager; update retry and timeout logit for subprecess env manager (#127) * update base env manager and test * add test reset once * update subprecess env manager and test * format code * update picking error
-
- 25 Nov, 2021 4 commits
-
-
Committed by niuyazhe
-
Committed by Will-Nie
* add apple key to door treasure and polish * add test, revise reward, build four envs * add 7x7-1 ADTKT
-
Committed by niuyazhe
-
Committed by timothijoe
* curisity_icm_v1 * modified version1 * modified v2 * one_hot function change * add paper information * format minigrid ppo curiosity * flake8 ding checked * 6th-Oct-gpu-modified * reset configs in minigrid files * minigird-env-doorkey88-100-300 * use modulelist instead of list in icm module * change icm reward model * delete origin curiosit_reward model and add icm_reward model * modified icm reward model * polish icm model by zt, (1) polish ding/reward_model/icm_reward_model.py and related __init__.py (2) add config files for pong:dizoo/atari/config/serial/pong/pong_ppo_offpolicy_icm.py and minigrid env: dizoo/minigrid/config/doorkey8_icm_config.py,fourroom_icm_config.py,minigrid_icm_config.py (3) add element icm in README * remove some useless config files in minigrid * remove redundant part in ppo.py, add cartpole_ppo_icm_config.py, changed test_icm.py and Readme
-
- 24 Nov, 2021 3 commits
- 22 Nov, 2021 4 commits
-
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by Weiyuhong-1998
* guided_cost * max_e * guided_cost * fix(wyh):fix guided cost recompute bug * fix(wyh):add model save * feature(wyh):polish guided cost * feature(wyh):on guided cost * fix(wyh):gcl-modify * fix(wyh):gcl sac config * fix(wyh):gcl style * fix(wyh):modify comments * fix(wyh):masac_5m6m best config * fix(wyh):sac bug * fix(wyh):GCL readme * fix(wyh):GCL readme conflicts
-