Commits · f087d2c716754c4b1c914ff3a0d7fc7de2df5135 · OpenDILab Open-Source Decision Intelligence Platform / DI-engine

03 Dec, 2021 · 1 commit

feature(lk): implement multi pass DQN (#131) · f087d2c7

Committed by Ke Li on Dec 03, 2021

* feature(lk): add initial version of MP-PDQN

* fix(lk): fix expand function bug

* refactor(nyz): refactor mpdqn continuous args inputs module

* fix(nyz): fix pdqn scatter index generation

* fix(lk): fix pdqn scatter assignment bug

* feature(lk): polish mpdqn code and style format

* feature(lk): add mpdqn config and test file

* feature(lk): polish mpdqn code and style format

* fix(lk): fix import bug

* polish(lk): add test for mpdqn

* polish(lk): polish code style and format

* polish(lk): rm print debug info

* polish(lk): rm print debug info

* polish(lk): polish code style and format

* polish(lk): add MPDQN in readme.md
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

19 Nov, 2021 · 1 commit

feature(lk): add PDQN algorithm for hybrid action spaces (#118) · 39a7cfe3

Committed by Ke Li on Nov 19, 2021

* add_pdqn_model

* modify_model_structure

* initial_version_PDQN

* bug_free_PDQN_no_test_convergence

* update_pdqn_config

* add_noise_to_continuous_args

* polish(nyz): polish code style and add noise in pdqn

* seperate_dis_and_cont_model

* fix_bug_for_separation

* fix(pu): current q value use the data action, fix cont loss detach bug, 1 encoder, dist and cont learning rate

* polish(pu): actor delay update

* fix(pu): fix disc cont update frequency

* polish(pu): polish pdqn config

* polish(lk): add comments and typelint for pdqn and dqn

* feature(lk): add test file for pdqn model and policy

* polish(lk): code style

* polish(lk): rm the modify of unrelated files

* polish(lk): rm useless commentes code in pdqn
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
Co-authored-by: puyuan1996 <2402552459@qq.com>

29 Oct, 2021 · 1 commit

feature(nyz): add PADDPG for hybrid action space as baseline (#109) · d2f79536

Committed by Swain on Oct 29, 2021

* fix(nyz): fix gym_hybrid env not scale action bug

* feature(nyz): add PADDPG basic implementation for hybrid action space

* fix(nyz): fix td3/d4pg comatibility bug with new modifications

* fix(nyz): fix hybrid ddpg action type grad bug and update config

* feature(nyz): add eps greedy + multinomial wrapper and gym_hybrid ddpg convergence config

* style(nyz): update PADDPG in README

* test_model_hybrid_qac

* fix_typo_in_README

* test_policy_hybrid_qac

* polish(nyz): polish hybrid action space to dict structure and polish unittest

* fix(nyz): fix td3bc compatibility bug
Co-authored-by: 李可 <like2@CN0014008466M.local>

OpenDILab Open-Source Decision Intelligence Platform / DI-engine · last synced more than 2 years ago