1. 22 Nov, 2021 2 commits
    • fix(nyz): simplify onppo with traj_flag · 7e51de4f
      niuyazhe committed
    • fix(pu): fix recompute advantage in on policy ppo and polish rnd_onppo algorithm (#124) · 0b46dd24
      蒲源 committed
      * test rnd
      
      * fix mz config
      
      * fix config
      
      * fix config
      
      * fix(pu): fix r2d2
      
      * fix(pu): fix ppo-onpolicy-rnd adv bug
      
      * fix(puyuan): fix r2d2
      
      * feature(puyuan): add minigrid r2d2 config
      
      * polish minigrid config
      
      * dev-ppo-onpolicy-rnd
      
      * fix(pu): fix rnd reward normalize bug
      
      * feature(pu): add minigrid fourrooms and doorkey env info
      
      * feature(pu): add serial_entry_onpolicy
      
      * fix(pu): fix config params of onpolicy ppo
      
      * feature(pu): add obs normalization
      
      * polish(pu): polish rnd intrinsic reward normalization
      
      * fix(pu): fix clear data bug
      
      * test(pu): add off-policy ppo config
      
      * polish(pu): polish minigrid onppo-rnd config
      
      * polish(pu): polish rnd reward model and minigrid config for rnd_onppo
      
      * polish(pu): polish minigrid rnd_onppo config
      
      * feature(pu): add gym-minigrid
      
      * fix(pu): fix ISerialEvaluator bug
      
      * fix(pu): fix cuda device compatibility
      
      * fix(pu): fix MiniGrid-ObstructedMaze-2Dlh-v0 env_id bug
      
      * polish(pu): squash rnd intrinsic reward to [0,1] according to the batch min and max
      
      * style(pu): yapf format
      
      * polish(pu): polish pitfall offppo config
      
      * polish(pu): polish rnd-onppo and onppo config
      
      * polish(pu): polish config and weight last reward
      
      * polish(pu):polish rnd-onppo config
      
      * fix(pu): fix mujoco onppo config
      
      * fix(pu): fix continuous version of dict_data_split_traj_and_compute_adv
      
      * polish(pu):polish config
      
      * fix(pu): add key traj_flag in data to split traj correctly when ignore_done is True in halfcheetah
      
      * polish(pu): polish annotation
      
      * polish(pu): withdraw files submitted wrongly
      
      * polish(pu): withdraw files deleted wrongly
      
      * polish(pu): polish onppo config
      
      * fix(pu): fix remaining_traj_data recompute adv bug and polish rnd onppo code
      
      * style(pu): yapf format
      
      * polish(pu): polish gae_traj_flag function
      
      * polish(pu): delete redundant function in onppo
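The min-max squashing mentioned in the `polish(pu): squash rnd intrinsic reward to [0,1] according to the batch min and max` entry above can be sketched as follows. This is only a minimal illustration in plain Python, not the code from commit 0b46dd24: the function name `squash_to_unit_interval` and the `eps` guard are assumptions made here, and the repository presumably works on tensors rather than lists.

```python
from typing import List


def squash_to_unit_interval(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Rescale a batch of intrinsic rewards to [0, 1] using the batch min and max.

    Hypothetical helper for illustration; `eps` is an assumed guard, not taken
    from the commit.
    """
    if not rewards:
        return []
    lo, hi = min(rewards), max(rewards)
    span = hi - lo
    # eps avoids division by zero when every reward in the batch is identical;
    # as a side effect the largest reward maps to just under 1.0.
    return [(r - lo) / (span + eps) for r in rewards]


if __name__ == "__main__":
    batch = [0.2, 1.0, 0.6]
    print(squash_to_unit_interval(batch))
```

Because the scaling uses per-batch statistics, the same raw intrinsic reward can map to different values in different batches; that is a property of this sketch and may or may not match the commit's behavior.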
  2. 25 Oct, 2021 1 commit
  3. 22 Oct, 2021 2 commits
    • feature(zym): add offlineRL algo td3_bc and polish policy comments (#88) · 7c1b5e95
      Yinmin.Zhang committed
      * feature(zym): add offlineRL algo td3_bc.
      
      * feature(zym): add offlineRL algo td3_bc.
      
      * feature(zym): add offlineRL algo td3_bc.
      
      * polish(zym): polish some annotations in td3/ddpg/sac/ppo; polish `_forward_collect` and `_forward_eval`.
      
      * fix(lj): fix dimension bug in cql for continuous env.
      
      * fix(zym): fix dimension bug in cql for continuous env.
      
      * fix(zym): fix dimension bug in cql for continuous env.
      
      * polish(zym): update README.md.
    • polish(nyz): fix ppo bugs and update atari ppo offpolicy config (#108) · 2d5ec7c3
      Swain committed
      * fix(nyz): fix ppo cuda bug and random collect bug
      
      * config(nyz): add pong ppo off policy better config
      
      * fix(nyz): fix ppo device bug in get_train_sample and update ppo offpolicy config
      
      * style(nyz): correct yapf format
  4. 13 Sep, 2021 1 commit
  5. 07 Sep, 2021 1 commit
  6. 24 Aug, 2021 1 commit
  7. 29 Jul, 2021 1 commit
  8. 23 Jul, 2021 1 commit
  9. 21 Jul, 2021 2 commits
  10. 16 Jul, 2021 1 commit
    • polish(nyz): codestyle optimization by lgtm (#7) · f361bd3b
      Swain committed
      * refactor(nyz): refactor read_config to 3 different function interface
      
      * feature(nyz): enable env_setting param in entry
      
      * polish(nyz): remove redundant code and global declaration
      
      * polish(nyz): remove flag in import_helper
      
      * polish(nyz): remove unused import
      
      * style(nyz): correct format
  11. 13 Jul, 2021 1 commit
  12. 08 Jul, 2021 1 commit