1. 03 Dec, 2021 1 commit
    • feature(lk): implement multi pass DQN (#131) · f087d2c7
      Committed by Ke Li
      * feature(lk): add initial version of MP-PDQN
      
      * fix(lk): fix expand function bug
      
      * refactor(nyz): refactor mpdqn continuous args inputs module
      
      * fix(nyz): fix pdqn scatter index generation
      
      * fix(lk): fix pdqn scatter assignment bug
      
      * feature(lk): polish mpdqn code and style format
      
      * feature(lk): add mpdqn config and test file
      
      * feature(lk): polish mpdqn code and style format
      
      * fix(lk): fix import bug
      
      * polish(lk): add test for mpdqn
      
      * polish(lk): polish code style and format
      
      * polish(lk): rm print debug info
      
      * polish(lk): rm print debug info
      
      * polish(lk): polish code style and format
      
      * polish(lk): add MPDQN in readme.md
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
  2. 19 Nov, 2021 1 commit
    • feature(lk): add PDQN algorithm for hybrid action spaces (#118) · 39a7cfe3
      Committed by Ke Li
      * add_pdqn_model
      
      * modify_model_structure
      
      * initial_version_PDQN
      
      * bug_free_PDQN_no_test_convergence
      
      * update_pdqn_config
      
      * add_noise_to_continuous_args
      
      * polish(nyz): polish code style and add noise in pdqn
      
      * separate_dis_and_cont_model
      
      * fix_bug_for_separation
      
      * fix(pu): current q value use the data action, fix cont loss detach bug, 1 encoder, dist and cont learning rate
      
      * polish(pu): actor delay update
      
      * fix(pu): fix disc cont update frequency
      
      * polish(pu): polish pdqn config
      
      * polish(lk): add comments and typelint for pdqn and dqn
      
      * feature(lk): add test file for pdqn model and policy
      
      * polish(lk): code style
      
      * polish(lk): rm the modify of unrelated files
      
      * polish(lk): rm useless commented code in pdqn
      Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
      Co-authored-by: puyuan1996 <2402552459@qq.com>
  3. 29 Oct, 2021 1 commit
    • feature(nyz): add PADDPG for hybrid action space as baseline (#109) · d2f79536
      Committed by Swain
      * fix(nyz): fix gym_hybrid env not scale action bug
      
      * feature(nyz): add PADDPG basic implementation for hybrid action space
      
      * fix(nyz): fix td3/d4pg compatibility bug with new modifications
      
      * fix(nyz): fix hybrid ddpg action type grad bug and update config
      
      * feature(nyz): add eps greedy + multinomial wrapper and gym_hybrid ddpg convergence config
      
      * style(nyz): update PADDPG in README
      
      * test_model_hybrid_qac
      
      * fix_typo_in_README
      
      * test_policy_hybrid_qac
      
      * polish(nyz): polish hybrid action space to dict structure and polish unittest
      
      * fix(nyz): fix td3bc compatibility bug
      Co-authored-by: 李可 <like2@CN0014008466M.local>