- 23 Dec, 2021 (1 commit)
  - Committed by puyuan1996
    polish(pu): polish VAE structure; add (rather than concatenate) the obs and action embeddings; apply tanh after sampling z and after the reconstruction_action head
- 22 Dec, 2021 (1 commit)
  - Committed by puyuan1996
- 20 Dec, 2021 (4 commits)
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
- 19 Dec, 2021 (2 commits)
  - Committed by puyuan1996
  - Committed by puyuan1996
- 17 Dec, 2021 (1 commit)
  - Committed by puyuan1996
- 16 Dec, 2021 (1 commit)
  - Committed by puyuan1996
- 15 Dec, 2021 (11 commits)
  - Committed by niuyazhe
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by puyuan1996
  - Committed by Ke Li
    * feature(lk): fix port conflict
    * polish(lk): polish code style and format
    * fix(lk): change to subprocess
- 14 Dec, 2021 (4 commits)
  - Committed by niuyazhe
  - Committed by Will-Nie
    * add trex algorithm for pong
    * sort style
    * add atari, ll,cp; fix device, collision; add_ppo
    * add accuracy evaluation
    * correct style
    * add seed to make sure results are replicable
    * remove useless part in cum return of model part
    * add mujoco onppo training pipeline; ppo config
    * improve style
    * add sac training config for mujoco
    * add log, add save data; polish config
    * logger; hyperparameter;walker
    * correct style
    * modify else condition
    * change rnd to trex
    * revise according to comments, add episode collect
    * new collect mode for trex, fix all bugs, comments
    * final change
    * polish after the final comment
    * add readme/test
    * add test for serial entry of trex/gcl
    * sort style
    * change mujoco to cartpole for test for trex_onppo
    * remove files generated by testing
    * revise tests for entry
    * sort style
    * revise tests
    * modify pytest
    * fix(nyz): speed up ppg/ppo and marl algo unittest
    * polish(nyz): speed up trex unittest and fix trex entry default config bug
    * fix(nyz): fix same name bug
    * fix(nyz): fix remove conflict bug(ci skip)
    Co-authored-by: Nniuyazhe <niuyazhe@sensetime.com>
  - Committed by niuyazhe
  - Committed by Will-Nie
    * add comments for r2d2
    * sort style
    * revise according to the comments
    * fix style
- 13 Dec, 2021 (1 commit)
  - Committed by Swain
    * feature(nyz): add delay reward mujoco env
    * test(nyz): add delay reward mujoco env test and fix bug
- 12 Dec, 2021 (1 commit)
  - Committed by Ming Zhang
- 09 Dec, 2021 (2 commits)
  - Committed by niuyazhe
  - Committed by Xu Jingxin
    * Init base buffer and storage
    * Use ratelimit as middleware
    * Pass style check
    * Keep the original return value
    * Add buffer.view
    * Add replace flag on sample, rewrite middleware processing
    * Test slicing
    * Add buffer copy middleware
    * Add update/delete api in buffer, rename middleware
    * Implement update and delete api of buffer
    * add naive use time count middleware in buffer
    * Rename next to chain
    * feature(nyz): add staleness check middleware and polish buffer
    * feature(nyz): add naive priority experience replay
    * Sample by indices
    * Combine buffer and storage layers
    * Support indices when deleting items from the queue
    * Use dataclass to save buffered data, remove return_index and return_meta
    * Add ignore_insufficient
    * polish(nyz): add return index in push and copy same data in sample
    * Drop useless import
    * Fix sample with indices, ensure return size is equal to input size or indices size
    * Make sure sampled data in buffer is different from each other
    * Support sample by grouped meta key
    * Support sample by rolling window
    * Add import/export data in buffer
    * Padding after sampling from buffer
    * Polish use_time_check
    * Use buffer as dataset
    * Set collate_fn in buffer test
    * feature(nyz): add deque buffer compatibility wrapper and demo
    * polish(nyz): polish code style and add pong dqn new deque buffer demo
    * feature(nyz): add use_time_count compatibility in wrapper
    * feature(nyz): add priority replay buffer compatibility in wrapper
    * Improve performance of buffer.update
    * polish(nyz): add priority max limit and correct flake8
    * Use __call__ to rewrite middleware
    * Rewrite buffer index
    * Fix buffer delete
    * Skip first item
    * Rewrite buffer delete
    * Use caller
    * Use caller in priority
    * Add group sample
    Co-authored-by: Nniuyazhe <niuyazhe@sensetime.com>
- 08 Dec, 2021 (4 commits)
  - Committed by niuyazhe
  - Committed by niuyazhe
  - Committed by Will-Nie
    * add trex algorithm for pong
    * sort style
    * add atari, ll,cp; fix device, collision; add_ppo
    * add accuracy evaluation
    * correct style
    * add seed to make sure results are replicable
    * remove useless part in cum return of model part
    * add mujoco onppo training pipeline; ppo config
    * improve style
    * add sac training config for mujoco
    * add log, add save data; polish config
    * logger; hyperparameter;walker
    * correct style
    * modify else condition
    * change rnd to trex
    * revise according to comments, add episode collect
    * new collect mode for trex, fix all bugs, comments
    * final change
    * polish after the final comment
    * add readme/test
    * add test for serial entry of trex/gcl
    * sort style
  - Committed by Weiyuhong-1998
    * fix(wyh):masac
    * feature(wyh):single agent discrete sac
    * feature(wyh):single agent discrete sac td
    * fix(wyh):fix pong bug
    * fix(wyh):fix smac bug
    * fix(wyh):masac_5m6m best config
    * env(wyh):allow SMAC env return ippo/isac obs
    * fix(wyh):masac polish
    * fix(wyh):masac style
    * fix(wyh):masac test
- 06 Dec, 2021 (1 commit)
  - Committed by niuyazhe
- 03 Dec, 2021 (5 commits)
  - Committed by niuyazhe
  - Committed by niuyazhe
  - Committed by Ke Li
    * feature(lk): add initial version of MP-PDQN
    * fix(lk): fix expand function bug
    * refactor(nyz): refactor mpdqn continuous args inputs module
    * fix(nyz): fix pdqn scatter index generation
    * fix(lk): fix pdqn scatter assignment bug
    * feature(lk): polish mpdqn code and style format
    * feature(lk): add mpdqn config and test file
    * feature(lk): polish mpdqn code and style format
    * fix(lk): fix import bug
    * polish(lk): add test for mpdqn
    * polish(lk): polish code style and format
    * polish(lk): rm print debug info
    * polish(lk): rm print debug info
    * polish(lk): polish code style and format
    * polish(lk): add MPDQN in readme.md
    Co-authored-by: Nniuyazhe <niuyazhe@sensetime.com>
  - Committed by Davide Liu
    * added r2d2 + a2c configs
    * changed convergence reward for some env
    * removed configs that don't converge
    * removed 'on_policy' param in r2d2 configs
  - Committed by Robin Chen
    * update base env manager and test
    * add test reset once
    * update subprocess env manager and test
    * format code
    * update pickling error
    * add unpickle catch for sync
    * fix reset waitingenv bug
- 02 Dec, 2021 (1 commit)
  - Committed by Robin Chen
    * update base env manager and test
    * add test reset once
    * update subprocess env manager and test
    * format code
    * update pickling error
    * add unpickle catch for sync