1. 16 Dec, 2021 1 commit
    • feature(nyp): add residual in R2D2 (#150) · ab94376c
      Will-Nie committed
      * add comments for r2d2
      
      * sort style
      
      * revise according to the comments
      
      * fix style
      
      * add r2d2 residual link + comments
      
      * revise according to comments, add spaceinvader
      
      * add test for the model and fix test bugs
  2. 15 Dec, 2021 4 commits
    • feature(wyh): multi agent mujoco environment (#146) · b040b1c3
      Weiyuhong-1998 committed
      * ma mujoco env and masac code
      
      * env(wyh):ma mujoco agent id
      
      * feature(wyh):maqac continuous
      
      * fix(wyh):multi-mujoco add readme
      
      * fix(wyh): td error
      
      * fix(wyh): style
      
      * fix(wyh):multi agent mujoco test
    • fix(nyz): fix test_ppo same dir bug · 02bd3300
      niuyazhe committed
    • feature(xjx): new main framework and profile helper (#142) · d8bde45c
      Xu Jingxin committed
      * Init base buffer and storage
      
      * Use ratelimit as middleware
      
      * Pass style check
      
      * Keep the return original return value
      
      * Add buffer.view
      
      * Add replace flag on sample, rewrite middleware processing
      
      * Test slicing
      
      * Add buffer copy middleware
      
      * Add update/delete api in buffer, rename middleware
      
      * Implement update and delete api of buffer
      
      * add naive use time count middleware in buffer
      
      * Rename next to chain
      
      * feature(nyz): add staleness check middleware and polish buffer
      
      * feature(nyz): add naive priority experience replay
      
      * Sample by indices
      
      * Combine buffer and storage layers
      
      * Support indices when deleting items from the queue
      
      * Use dataclass to save buffered data, remove return_index and return_meta
      
      * Add ignore_insufficient
      
      * polish(nyz): add return index in push and copy same data in sample
      
      * Drop useless import
      
      * Fix sample with indices, ensure return size is equal to input size or indices size
      
      * Make sure sampled data in buffer is different from each other
      
      * Support sample by grouped meta key
      
      * Support sample by rolling window
      
      * Add import/export data in buffer
      
      * Padding after sampling from buffer
      
      * Polish use_time_check
      
      * Use buffer as dataset
      
      * Set collate_fn in buffer test
      
      * Init framework
      
      * Remove set_default, add keep
      
      * Move backward_stack to task
      
      * Fix total_step
      
      * Pydash pick is too slow
      
      * Add step records
      
      * Add async mode
      
      * Reuse forward and backward functions in sequence
      
      * Fix sample profile
      
      * demo(nyz): add atari pong runnable demo
      
      * Fix forward bug
      
      * Add task test
      
      * Test pong
      
      * feature(nyz): add deque buffer compatibility wrapper and demo
      
      * polish(nyz): polish code style and add pong dqn new deque buffer demo
      
      * Use sync mode
      
      * Config worker number
      
      * Init parallel mode
      
      * Add prev property on context
      
      * Mesh workers
      
      * First version of parallel mode
      
      * Make send rpc async
      
      * Dont pickle prev
      
      * Support tcp
      
      * More cleanup on system exit
      
      * Test parallel and task
      
      * Enable task copy
      
      * Test attach mode
      
      * Add with statement
      
      * Polish code
      
      * Raise exception when timeout in attach mode
      
      * Add event listeners
      
      * feature(nyz): add pendulum sac new pipeline demo
      
      * Fix main
      
      * Add profiler and step profiler
      
      * Rewrite parallel, cleanup res after task finished
      
      * Add comments
      
      * Remove ctx.prev
      
      * Enable standalone parallel mode
      
      * Remove hooks on ctx
      
      * Add max mean
      
      * demo(nyz): add pong dqn new pipeline demo
      
      * Ensure parallel sock closed before program exit
      
      * Fix parallel test
      
      * Fix pong
      
      * feature(zjow): add feature of profile in ding (#135)
      
      * add profiling feature in ding cli.
      
      * fix ding --profile cli.
      
      * reformat files.
      
      * reformat files again.
      
      * reformat files again.
      
      * Remove flameprof
      
      * Change kept_keys to set
      
      * Use finish as a property
      
      * Use wrapper
      
      * Reformat step timer output
      
      * Test random seed
      
      * Revert learning rate
      
      * Add topology on parallel
      
      * Use labels on task
      
      * Star in parallel mode
      
      * Don't use daemon process
      
      * Auto sync finish state
      
      * Return logvars
      
      * Fix test wrapper
      
      * Fix test profiler helper
      
      * Pass flake_check
      
      * Lazy launch
      
      * Reporter
      
      * Replace main with main_sac
      
      * Fix parallel ctx
      
      * Fix test
      
      * Fix merge issues
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
      Co-authored-by: zjowowen <93968541+zjowowen@users.noreply.github.com>
    • fix(lk): fix port conflict in gym_soccer (#139) · aa612443
      Ke Li committed
      * feature(lk): fix port conflict
      
      * polish(lk): polish code style and format
      
      * fix(lk): change to subprocess
  3. 14 Dec, 2021 4 commits
    • fix(nyz): fix PER indice repeat unittest bug · ff31a86b
      niuyazhe committed
    • polish(nyp): fix unittest for trex training and collecting (#144) · f089d02a
      Will-Nie committed
      * add trex algorithm for pong
      
      * sort style
      
      * add atari, ll,cp; fix device, collision; add_ppo
      
      * add accuracy evaluation
      
      * correct style
      
      * add seed to make sure results are replicable
      
      * remove useless part in cum return  of model part
      
      * add mujoco onppo training pipeline; ppo config
      
      * improve style
      
      * add sac training config for mujoco
      
      * add log, add save data; polish config
      
      * logger; hyperparameter;walker
      
      * correct style
      
      * modify else condition
      
      * change rnd to trex
      
      * revise according to comments, add episode collect
      
      * new collect mode for trex, fix all bugs, comments
      
      * final change
      
      * polish after the final comment
      
      * add readme/test
      
      * add test for serial entry of trex/gcl
      
      * sort style
      
      * change mujoco to cartpole for test for trex_onppo
      
      * remove files generated by testing
      
      * revise tests for entry
      
      * sort style
      
      * revise tests
      
      * modify pytest
      
      * fix(nyz): speed up ppg/ppo and marl algo unittest
      
      * polish(nyz): speed up trex unittest and fix trex entry default config bug
      
      * fix(nyz): fix same name bug
      
      * fix(nyz): fix remove conflict bug(ci skip)
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
    • N
      973e33e2
    • polish(nyp): add R2d2 comments (#149) · a2edf6a2
      Will-Nie committed
      * add comments for r2d2
      
      * sort style
      
      * revise according to the comments
      
      * fix style
  4. 13 Dec, 2021 1 commit
  5. 12 Dec, 2021 1 commit
  6. 09 Dec, 2021 2 commits
    • style(nyz): update intro and env doc link(ci skip) · 147d56f3
      niuyazhe committed
    • feature(xjx): refactor buffer (#129) · a490729f
      Xu Jingxin committed
      * Init base buffer and storage
      
      * Use ratelimit as middleware
      
      * Pass style check
      
      * Keep the return original return value
      
      * Add buffer.view
      
      * Add replace flag on sample, rewrite middleware processing
      
      * Test slicing
      
      * Add buffer copy middleware
      
      * Add update/delete api in buffer, rename middleware
      
      * Implement update and delete api of buffer
      
      * add naive use time count middleware in buffer
      
      * Rename next to chain
      
      * feature(nyz): add staleness check middleware and polish buffer
      
      * feature(nyz): add naive priority experience replay
      
      * Sample by indices
      
      * Combine buffer and storage layers
      
      * Support indices when deleting items from the queue
      
      * Use dataclass to save buffered data, remove return_index and return_meta
      
      * Add ignore_insufficient
      
      * polish(nyz): add return index in push and copy same data in sample
      
      * Drop useless import
      
      * Fix sample with indices, ensure return size is equal to input size or indices size
      
      * Make sure sampled data in buffer is different from each other
      
      * Support sample by grouped meta key
      
      * Support sample by rolling window
      
      * Add import/export data in buffer
      
      * Padding after sampling from buffer
      
      * Polish use_time_check
      
      * Use buffer as dataset
      
      * Set collate_fn in buffer test
      
      * feature(nyz): add deque buffer compatibility wrapper and demo
      
      * polish(nyz): polish code style and add pong dqn new deque buffer demo
      
      * feature(nyz): add use_time_count compatibility in wrapper
      
      * feature(nyz): add priority replay buffer compatibility in wrapper
      
      * Improve performance of buffer.update
      
      * polish(nyz): add priority max limit and correct flake8
      
      * Use __call__ to rewrite middleware
      
      * Rewrite buffer index
      
      * Fix buffer delete
      
      * Skip first item
      
      * Rewrite buffer delete
      
      * Use caller
      
      * Use caller in priority
      
      * Add group sample
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
  7. 08 Dec, 2021 4 commits
    • fix(nyz): disable trex unittest · a7de696a
      niuyazhe committed
    • fix(nyz): fix trex unittest bugs · 234de26b
      niuyazhe committed
    • feature(nyp): add Trex algorithm (#119) · 63105fef
      Will-Nie committed
      * add trex algorithm for pong
      
      * sort style
      
      * add atari, ll,cp; fix device, collision; add_ppo
      
      * add accuracy evaluation
      
      * correct style
      
      * add seed to make sure results are replicable
      
      * remove useless part in cum return  of model part
      
      * add mujoco onppo training pipeline; ppo config
      
      * improve style
      
      * add sac training config for mujoco
      
      * add log, add save data; polish config
      
      * logger; hyperparameter;walker
      
      * correct style
      
      * modify else condition
      
      * change rnd to trex
      
      * revise according to comments, add episode collect
      
      * new collect mode for trex, fix all bugs, comments
      
      * final change
      
      * polish after the final comment
      
      * add readme/test
      
      * add test for serial entry of trex/gcl
      
      * sort style
    • feature(wyh): add masac algorithms (#112) · 18b3720a
      Weiyuhong-1998 committed
      * fix(wyh):masac
      
      * feature(wyh):single agent discrete sac
      
      * feature(wyh):single agent discrete sac td
      
      * fix(wyh):fix pong bug
      
      * fix(wyh):fix smac bug
      
      * fix(wyh):masac_5m6m best config
      
      * env(wyh):allow SMAC env return ippo/isac obs
      
      * fix(wyh):masac polish
      
      * fix(wyh):masac style
      
      * fix(wyh):masac test
  8. 06 Dec, 2021 1 commit
  9. 03 Dec, 2021 5 commits
    • v0.2.2 · 312f274d
      niuyazhe committed
    • N
    • feature(lk): implement multi pass DQN (#131) · f087d2c7
      Ke Li committed
      * feature(lk): add initial version of MP-PDQN
      
      * fix(lk): fix expand function bug
      
      * refactor(nyz): refactor mpdqn continuous args inputs module
      
      * fix(nyz): fix pdqn scatter index generation
      
      * fix(lk): fix pdqn scatter assignment bug
      
      * feature(lk): polish mpdqn code and style format
      
      * feature(lk): add mpdqn config and test file
      
      * feature(lk): polish mpdqn code and style format
      
      * fix(lk): fix import bug
      
      * polish(lk): add test for mpdqn
      
      * polish(lk): polish code style and format
      
      * polish(lk): rm print debug info
      
      * polish(lk): rm print debug info
      
      * polish(lk): polish code style and format
      
      * polish(lk): add MPDQN in readme.md
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
    • benchmark(davide): Bsuite memory benchmark (#138) · 5ee17ad1
      Davide Liu committed
      * added r2d2 + a2c configs
      
      * changed convergence reward for some env
      
      * removed configs that don't converge
      
      * removed 'on_policy' param in r2d2 configs
    • fix(crb): fix subenvmanager reset bug (#137) · 0cfa4235
      Robin Chen committed
      * update base env manager and test
      
      * add test reset once
      
      * update subprocess env manager and test
      
      * format code
      
      * update picking error
      
      * add unpickle catch for sync
      
      * fix reset waitingenv bug
  10. 02 Dec, 2021 1 commit
  11. 01 Dec, 2021 1 commit
  12. 30 Nov, 2021 1 commit
  13. 26 Nov, 2021 2 commits
    • polish(pu): add loss statistics and polish r2d3 pong config (#126) · 81602ce9
      蒲源 committed
      * fix(pu): fix adam weight decay bug
      
      * feature(pu): add pitfall offppo config
      
      * feature(pu): add qbert spaceinvaders pitfall r2d3 config
      
      * fix(pu): fix expert offppo config in r2d3
      
      * fix(pu): fix pong config
      
      * polish(pu): add loss statistics
      
      * fix(pu): fix loss statistics bug
      
      * polish(pu): polish pong r2d3 config
      
      * polish(pu): polish r2d3 pong and lunarlander config
      
      * polish(pu): delete unused files
    • fix(crb): add renew for env manager; update retry and timeout logic for... · f88bc0e0
      Robin Chen committed
      fix(crb): add renew for env manager; update retry and timeout logic for subprocess env manager (#127)
      
      * update base env manager and test
      
      * add test reset once
      
      * update subprocess env manager and test
      
      * format code
      
      * update picking error
  14. 25 Nov, 2021 4 commits
    • polish(nyz): polish impala atari config · 41dce176
      niuyazhe committed
    • feature(nyp): add apple key to door treasure env (#128) · 4157cdae
      Will-Nie committed
      * add apple key to door treasure and polish
      
      * add test, revise reward, build four envs
      
      * add 7x7-1 ADTKT
    • N
      045937e3
    • feature(zt): add curiosity icm algorithm (#41) · b50e8aea
      timothijoe committed
      * curisity_icm_v1
      
      * modified version1
      
      * modified v2
      
      * one_hot function change
      
      * add paper information
      
      * format minigrid ppo curiosity
      
      * flake8 ding checked
      
      * 6th-Oct-gpu-modified
      
      * reset configs in minigrid files
      
      * minigird-env-doorkey88-100-300
      
      * use modulelist instead of list in icm module
      
      * change icm reward model
      
      * delete origin curiosit_reward model and add icm_reward model
      
      * modified icm reward model
      
      * polish icm model by zt: (1) polish ding/reward_model/icm_reward_model.py and related __init__.py; (2) add config files for pong (dizoo/atari/config/serial/pong/pong_ppo_offpolicy_icm.py) and minigrid env (dizoo/minigrid/config/doorkey8_icm_config.py, fourroom_icm_config.py, minigrid_icm_config.py); (3) add element icm in README
      
      * remove some useless config files in minigrid
      
      * remove redundant part in ppo.py, add cartpole_ppo_icm_config.py, changed test_icm.py and Readme
  15. 24 Nov, 2021 3 commits
  16. 22 Nov, 2021 5 commits