- 22 Nov, 2021 2 commits
  - Committed by niuyazhe
  - Committed by 蒲源
    * test rnd
    * fix mz config
    * fix config
    * fix config
    * fix(pu): fix r2d2
    * fix(pu): fix ppo-onpolicy-rnd adv bug
    * fix(puyuan): fix r2d2
    * feature(puyuan): add minigrid r2d2 config
    * polish minigrid config
    * dev-ppo-onpolicy-rnd
    * fix(pu): fix rnd reward normalize bug
    * feature(pu): add minigrid fourrooms and doorkey env info
    * feature(pu): add serial_entry_onpolicy
    * fix(pu): fix config params of onpolicy ppo
    * feature(pu): add obs normalization
    * polish(pu): polish rnd intrinsic reward normalization
    * fix(pu): fix clear data bug
    * test(pu): add off-policy ppo config
    * polish(pu): polish minigrid onppo-rnd config
    * polish(pu): polish rnd reward model and minigrid config for rnd_onppo
    * polish(pu): polish minigrid rnd_onppo config
    * feature(pu): add gym-minigrid
    * fix(pu): fix ISerialEvaluator bug
    * fix(pu): fix cuda device compatibility
    * fix(pu): fix MiniGrid-ObstructedMaze-2Dlh-v0 env_id bug
    * polish(pu): squash rnd intrinsic reward to [0,1] according to the batch min and max
    * style(pu): yapf format
    * polish(pu): polish pitfall offppo config
    * polish(pu): polish rnd-onppo and onppo config
    * polish(pu): polish config and weight last reward
    * polish(pu): polish rnd-onppo config
    * fix(pu): fix mujoco onppo config
    * fix(pu): fix continuous version of dict_data_split_traj_and_compute_adv
    * polish(pu): polish config
    * fix(pu): add key traj_flag in data to split traj correctly when ignore_done is True in halfcheetah
    * polish(pu): polish annotation
    * polish(pu): withdraw files submitted wrongly
    * polish(pu): withdraw files deleted wrongly
    * polish(pu): polish onppo config
    * fix(pu): fix remaining_traj_data recompute adv bug and polish rnd onppo code
    * style(pu): yapf format
    * polish(pu): polish gae_traj_flag function
    * polish(pu): delete redundant function in onppo
- 25 Oct, 2021 1 commit
  - Committed by Weiyuhong-1998
    * fix(wyh): reward model test
    * fix(wyh): sac ppo test
    * fix(wyh): ppo_continuous test
    * fix(wyh): style
    * fix(wyh): ppo test

    Co-authored-by: NSwain <niuyazhe314@outlook.com>
- 22 Oct, 2021 2 commits
  - Committed by Yinmin.Zhang
    * feature(zym): add offlineRL algo td3_bc.
    * feature(zym): add offlineRL algo td3_bc.
    * feature(zym): add offlineRL algo td3_bc.
    * polish(zym): polish some annotations in td3/ddpg/sac/ppo; polish `_forward_collect` and `_foward_eval`.
    * fix(lj): fix dimension bug in cql for continuous env.
    * fix(zym): fix dimension bug in cql for continuous env.
    * fix(zym): fix dimension bug in cql for continuous env.
    * polish(zym): update README.md.
  - Committed by Swain
    * fix(nyz): fix ppo cuda bug and random collect bug
    * config(nyz): add pong ppo off policy better config
    * fix(nyz): fix ppo device bug in get_train_sample and update ppo offpolicy config
    * style(nyz): correct yapf format
- 13 Sep, 2021 1 commit
  - Committed by Weiyuhong-1998
    * fix_mappo_bug_masknan_and_dict_cannot_unsqueeze
    * squeeze_bug
- 07 Sep, 2021 1 commit
  - Committed by niuyazhe
- 24 Aug, 2021 1 commit
  - Committed by niuyazhe
- 29 Jul, 2021 1 commit
  - Committed by niuyazhe
- 23 Jul, 2021 1 commit
  - Committed by zhangyinmin
- 21 Jul, 2021 2 commits
  - Committed by zhangyinmin
  - Committed by niuyazhe
- 16 Jul, 2021 1 commit
  - Committed by Swain
    * refactor(nyz): refactor read_config to 3 different function interface
    * feature(nyz): enable env_setting param in entry
    * polish(nyz): remove redundant code and global declaration
    * polish(nyz): remove flag in import_helper
    * polish(nyz): remove unused import
    * style(nyz): correct format
- 13 Jul, 2021 1 commit
  - Committed by zhangyinmin
- 08 Jul, 2021 1 commit
  - Committed by niuyazhe