Commits · main · OpenDILab开源决策智能平台 / DI-engine

30 Dec, 2021 1 commit
- polish(nyz): move actor_head_type to action_space field in qac and update readme new repo link · 118cc673
  niuyazhe authored Dec 30, 2021
21 Dec, 2021 2 commits
- style(nyz): fix rl intro link problem(ci skip) · b7cd6751
  niuyazhe authored Dec 21, 2021
- style(nyz): update en doc link and add how to migrate a new env · 2f4d53be
  niuyazhe authored Dec 21, 2021
15 Dec, 2021 1 commit

feature(wyh): multi agent mujoco environment (#146) · b040b1c3

Weiyuhong-1998 authored Dec 15, 2021

* ma mujoco env and masac code

* env(wyh):ma mujoco agent id

* feature(wyh):maqac continuous

* fix(wyh):multi-mujoco add readme

* fix(wyh): td error

* fix(wyh): style

* fix(wyh):multi agent mujoco test

14 Dec, 2021 1 commit
- style(nyz): update zh doc link and add more env tutorial zh(ci skip) · 973e33e2
  niuyazhe authored Dec 14, 2021
09 Dec, 2021 1 commit
- style(nyz): update intro and env doc link(ci skip) · 147d56f3
  niuyazhe authored Dec 09, 2021
08 Dec, 2021 1 commit

feature(nyp): add Trex algorithm (#119) · 63105fef

Will-Nie authored Dec 08, 2021

* add trex algorithm for pong

* sort style

* add atari, ll,cp; fix device, collision; add_ppo

* add accuracy evaluation

* correct style

* add seed to make sure results are replicable

* remove useless part in cum return of model part

* add mujoco onppo training pipeline; ppo config

* improve style

* add sac training config for mujoco

* add log, add save data; polish config

* logger; hyperparameter;walker

* correct style

* modify else condition

* change rnd to trex

* revise according to comments, add episode collect

* new collect mode for trex, fix all bugs, comments

* final change

* polish after the final comment

* add readme/test

* add test for serial entry of trex/gcl

* sort style

06 Dec, 2021 1 commit
- style(nyz): update kaggle link and algo table · 100ea314
  niuyazhe authored Dec 06, 2021
03 Dec, 2021 2 commits

v0.2.2 · 312f274d
niuyazhe authored Dec 03, 2021

feature(lk): implement multi pass DQN (#131) · f087d2c7

Ke Li authored Dec 03, 2021

* feature(lk): add initial version of MP-PDQN

* fix(lk): fix expand function bug

* refactor(nyz): refactor mpdqn continuous args inputs module

* fix(nyz): fix pdqn scatter index generation

* fix(lk): fix pdqn scatter assignment bug

* feature(lk): polish mpdqn code and style format

* feature(lk): add mpdqn config and test file

* feature(lk): polish mpdqn code and style format

* fix(lk): fix import bug

* polish(lk): add test for mpdqn

* polish(lk): polish code style and format

* polish(lk): rm print debug info

* polish(lk): rm print debug info

* polish(lk): polish code style and format

* polish(lk): add MPDQN in readme.md
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

01 Dec, 2021 1 commit
- style(nyz): add supporters in README(ci skip) · e8e1d09d
  niuyazhe authored Dec 01, 2021
26 Nov, 2021 1 commit

polish(pu): add loss statistics and polish r2d3 pong config (#126) · 81602ce9

蒲源 authored Nov 26, 2021

* fix(pu): fix adam weight decay bug

* feature(pu): add pitfall offppo config

* feature(pu): add qbert spaceinvaders pitfall r2d3 config

* fix(pu): fix expert offppo config in r2d3

* fix(pu): fix pong config

* polish(pu): add loss statistics

* fix(pu): fix loss statistics bug

* polish(pu): polish pong r2d3 config

* polish(pu): polish r2d3 pong and lunarlander config

* polish(pu): delete unused files

25 Nov, 2021 1 commit

feature(zt): add curiosity icm algorithm (#41) · b50e8aea

timothijoe authored Nov 25, 2021

* curisity_icm_v1

* modified version1

* modified v2

* one_hot function change

* add paper information

* format minigrid ppo curiosity

* flake8 ding checked

* 6th-Oct-gpu-modified

* reset configs in minigrid files

* minigird-env-doorkey88-100-300

* use modulelist instead of list in icm module

* change icm reward model

* delete origin curiosit_reward model and add icm_reward model

* modified icm reward model

* polish icm model by zt, (1) polish ding/reward_model/icm_reward_model.py and related __init__.py (2) add config files for pong:dizoo/atari/config/serial/pong/pong_ppo_offpolicy_icm.py and minigrid env: dizoo/minigrid/config/doorkey8_icm_config.py,fourroom_icm_config.py,minigrid_icm_config.py  (3) add element icm in README

* remove some useless config files in minigrid

* remove redundant part in ppo.py, add cartpole_ppo_icm_config.py, changed test_icm.py and Readme

22 Nov, 2021 2 commits

feature(wyh): add guided cost algorithm (#57) · ffe8d7c0

Weiyuhong-1998 authored Nov 22, 2021

* guided_cost

* max_e

* guided_cost

* fix(wyh):fix guided cost recompute bug

* fix(wyh):add model save

* feature(wyh):polish guided cost

* feature(wyh):on guided cost

* fix(wyh):gcl-modify

* fix(wyh):gcl sac config

* fix(wyh):gcl style

* fix(wyh):modify comments

* fix(wyh):masac_5m6m best config

* fix(wyh):sac bug

* fix(wyh):GCL readme

* fix(wyh):GCL readme conflicts

v0.2.1 · cf8ad134
niuyazhe authored Nov 22, 2021

20 Nov, 2021 1 commit
- style(nyz): add PDQN/MAPPO link, DQN doc zh link and correct format · 96103e9b
  niuyazhe authored Nov 20, 2021
19 Nov, 2021 2 commits

feature(lk): add PDQN algorithm for hybrid action spaces (#118) · 39a7cfe3

Ke Li authored Nov 19, 2021

* add_pdqn_model

* modify_model_structure

* initial_version_PDQN

* bug_free_PDQN_no_test_convergence

* update_pdqn_config

* add_noise_to_continuous_args

* polish(nyz): polish code style and add noise in pdqn

* seperate_dis_and_cont_model

* fix_bug_for_separation

* fix(pu): current q value use the data action, fix cont loss detach bug, 1 encoder, dist and cont learning rate

* polish(pu): actor delay update

* fix(pu): fix disc cont update frequency

* polish(pu): polish pdqn config

* polish(lk): add comments and typelint for pdqn and dqn

* feature(lk): add test file for pdqn model and policy

* polish(lk): code style

* polish(lk): rm the modify of unrelated files

* polish(lk): rm useless commented code in pdqn
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
Co-authored-by: puyuan1996 <2402552459@qq.com>

style(nyz): modify supported PyTorch version and correct format · d8115c50
niuyazhe authored Nov 19, 2021

15 Nov, 2021 1 commit
- style(nyz): add mbrl badge and env doc link · 12bc041d
  niuyazhe authored Nov 15, 2021
29 Oct, 2021 2 commits

feature(lcm): add MBPO algorithm (#113) · b1e9b4ea

Swain authored Oct 29, 2021

* feature(lcm): add MBPO algorithm (#87)

* add model-based rl

* fix yazhe's comments

* format

* pass flake8 test

* polish(nyz): polish mbpo import, name and test
Co-authored-by: lichuming <lichuming@lichumingdeMacBook-Pro.local>

feature(nyz): add PADDPG for hybrid action space as baseline (#109) · d2f79536

Swain authored Oct 29, 2021

* fix(nyz): fix gym_hybrid env not scale action bug

* feature(nyz): add PADDPG basic implementation for hybrid action space

* fix(nyz): fix td3/d4pg compatibility bug with new modifications

* fix(nyz): fix hybrid ddpg action type grad bug and update config

* feature(nyz): add eps greedy + multinomial wrapper and gym_hybrid ddpg convergence config

* style(nyz): update PADDPG in README

* test_model_hybrid_qac

* fix_typo_in_README

* test_policy_hybrid_qac

* polish(nyz): polish hybrid action space to dict structure and polish unittest

* fix(nyz): fix td3bc compatibility bug
Co-authored-by: 李可 <like2@CN0014008466M.local>

28 Oct, 2021 1 commit

feature(nyz): add gobigger baseline (#95) · a8fec8bb

Swain authored Oct 28, 2021

* feature(nyz): add gobigger baseline

* style(nyz): add gobigger env info

* feature(nyz): add ignore prefix in default collate

* feature(nyz): add vsbot training baseline

* fix(nyz): fix to_tensor empty list bug and polish gobigger baseline

* style(nyz): split gobigger baseline code

22 Oct, 2021 1 commit

feature(zym): add offlineRL algo td3_bc and polish policy comments (#88) · 7c1b5e95

Yinmin.Zhang authored Oct 22, 2021

* feature(zym): add offlineRL algo td3_bc.

* feature(zym): add offlineRL algo td3_bc.

* feature(zym): add offlineRL algo td3_bc.

* polish(zym): polish some annotations in td3/ddpg/sac/ppo; polish `_forward_collect` and `_forward_eval`.

* fix(lj): fix dimension bug in cql for continuous env.

* fix(zym): fix dimension bug in cql for continuous env.

* fix(zym): fix dimension bug in cql for continuous env.

* polish(zym): update README.md.

21 Oct, 2021 1 commit

feature(lk): add gym-soccer (HFO) env (#94) · 8f47f4cb

Ke Li authored Oct 21, 2021

* add_soccer_env

* add_info

* close

* format

* test_gym_soccer

* rm_torch

* replay_log

* format_style

* add_gym_soccer_to_readme

* separate render_func

* add_gif_file

* scale_action

* flake_style_format

* resolve_review_comments

* add branch info for gym hybrid

19 Oct, 2021 1 commit
- polish(nyp): polish dqfd policy, entry and config (#98) · aa8508bb
  Will-Nie authored Oct 19, 2021
16 Oct, 2021 1 commit

feature(nyp): add DQfD algorithm (#48) · e2ca8738

Will-Nie authored Oct 16, 2021

* add_dqfd

* Is_expert to is_expert

* modify according to the last comments

* value_gamma; done; marginloss; sqil compatibility

* finally shorten the code, revise config

* revise config, style

* add_readme/two_more_config

* correct format
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

12 Oct, 2021 1 commit

feature(nyz): add gym-hybrid hybrid action space env (#86) · 292f0246

Swain authored Oct 12, 2021

* feature(nyz): add gym-hybrid hybrid action space env

* style(nyz): update readme for gym_hybrid env

08 Oct, 2021 1 commit

feature(zlx): add vs bot training and self-play training with slime volley env (#23) · dbf432cd

LuciusMos authored Oct 08, 2021

* slime volley env in dizoo, first commit

* fix bug in slime volley env

* modify volley env to satisfy ding 1v1 requirements; add naive self-play and league training pipeline(evaluator is not finished, now use a very naive one)

* adopt volley builtin ai as default eval opponent

* polish(nyz): polish slime_volley_env and its test

* feature(nyz): add slime_volley vs bot ppo demo

* feature(nyz): add battle_sample_serial_collector and adapt abnormal check in subprocess env manager

* feature(nyz): add slime volley self-play demo

* style(nyz): add slime_volleyball env gif and split MARL and selfplay label

* feature(nyz): add save replay function in slime volleyball env
Co-authored-by: zlx-sensetime <zhaoliangxuan@sensetime.com>
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

01 Oct, 2021 1 commit
- style(nyz): fix typo and release multi python version bug(enable docker, smac docker) · 13450e65
  niuyazhe authored Oct 01, 2021
30 Sep, 2021 4 commits

v0.2.0 · 769401cc
niuyazhe authored Sep 30, 2021

feature(davide): Implementation of D4PG (#76) · 16a89c35

Davide Liu authored Sep 30, 2021

* added experience replay and n-step

* implementing distributional q value

* added distributional q-value

* added overview in qac_dist and d4pg

* derived D4PG from DDPG

* fixed a bug when action shape >1

* benchmark D4PG mujoco + minor fixes

-entry for DDPG mujoco
-entry for D4PG mujoco
-config for D4PG mujoco
-fixed style D4PG code
-unittests for QAC distributional

* formatted code

* minor updates (read description)

-added d4pg serial_entry test
-updated comments in QACDIST
-added d4pg in commander register
-added q_value in d4pg return dict
-added priority update in d4pg entry
-added assertion in QACDIST

style(nyz): add pypi release workflow(enable docker, smac docker) · 3f33b9d7
niuyazhe authored Sep 30, 2021

fix(nyz): fix il test unstable bug, update torch to 1.9.0, and polish readme · 231120f2
niuyazhe authored Sep 30, 2021

24 Sep, 2021 1 commit
- style(nyz): fix DRL typo in README · 89e4a5de
  Swain authored Sep 24, 2021
23 Sep, 2021 1 commit
- style(nyz): update algorithm paper link · 9ea60112
  Swain authored Sep 23, 2021
17 Sep, 2021 2 commits

feature(davide): add BSuite environment wrapper (#58) · 8050a9bb

Davide Liu authored Sep 17, 2021

* start implementing bsuite env

* add bsuite env

* Implemented

* removed unused file

* added cartpole_swing environment

* Update test_bsuite_env.py

* added env in readme and in setup.py

* Create bsuite.png

style(nyz): add d4rl env link and fix cql demo cmd · 3e5d6a6c
niuyazhe authored Sep 17, 2021

14 Sep, 2021 1 commit
- doc(xjx): add forum and project urls to optimize SEO (#56) · 22a13ab5
  Xu Jingxin authored Sep 14, 2021
08 Sep, 2021 2 commits
- style(nyz): update sparse reward badge in env table · 5e52c1a0
  Swain authored Sep 08, 2021
- style(nyz): polish env table in README · 1439e22f
  niuyazhe authored Sep 08, 2021

OpenDILab开源决策智能平台 / DI-engine · last synced more than 2 years ago