- 03 Dec, 2021 4 commits
-
-
Committed by niuyazhe
-
Committed by Ke Li
* feature(lk): add initial version of MP-PDQN
* fix(lk): fix expand function bug
* refactor(nyz): refactor mpdqn continuous args inputs module
* fix(nyz): fix pdqn scatter index generation
* fix(lk): fix pdqn scatter assignment bug
* feature(lk): polish mpdqn code and style format
* feature(lk): add mpdqn config and test file
* feature(lk): polish mpdqn code and style format
* fix(lk): fix import bug
* polish(lk): add test for mpdqn
* polish(lk): polish code style and format
* polish(lk): rm print debug info
* polish(lk): rm print debug info
* polish(lk): polish code style and format
* polish(lk): add MPDQN in readme.md
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
-
Committed by Davide Liu
* added r2d2 + a2c configs
* changed convergence reward for some env
* removed configs that don't converge
* removed 'on_policy' param in r2d2 configs
-
Committed by Robin Chen
* update base env manager and test
* add test reset once
* update subprocess env manager and test
* format code
* update pickling error
* add unpickle catch for sync
* fix reset waitingenv bug
-
- 02 Dec, 2021 1 commit
-
-
Committed by Robin Chen
* update base env manager and test
* add test reset once
* update subprocess env manager and test
* format code
* update pickling error
* add unpickle catch for sync
-
- 01 Dec, 2021 1 commit
-
-
Committed by niuyazhe
-
- 30 Nov, 2021 1 commit
-
-
Committed by niuyazhe
-
- 26 Nov, 2021 2 commits
-
-
Committed by 蒲源
* fix(pu): fix adam weight decay bug
* feature(pu): add pitfall offppo config
* feature(pu): add qbert spaceinvaders pitfall r2d3 config
* fix(pu): fix expert offppo config in r2d3
* fix(pu): fix pong config
* polish(pu): add loss statistics
* fix(pu): fix loss statistics bug
* polish(pu): polish pong r2d3 config
* polish(pu): polish r2d3 pong and lunarlander config
* polish(pu): delete unused files
-
Committed by Robin Chen
fix(crb): add renew for env manager; update retry and timeout logic for subprocess env manager (#127)
* update base env manager and test
* add test reset once
* update subprocess env manager and test
* format code
* update pickling error
-
- 25 Nov, 2021 4 commits
-
-
Committed by niuyazhe
-
Committed by Will-Nie
* add apple key to door treasure and polish
* add test, revise reward, build four envs
* add 7x7-1 ADTKT
-
Committed by niuyazhe
-
Committed by timothijoe
* curisity_icm_v1
* modified version1
* modified v2
* one_hot function change
* add paper information
* format minigrid ppo curiosity
* flake8 ding checked
* 6th-Oct-gpu-modified
* reset configs in minigrid files
* minigird-env-doorkey88-100-300
* use modulelist instead of list in icm module
* change icm reward model
* delete original curiosit_reward model and add icm_reward model
* modified icm reward model
* polish icm model by zt: (1) polish ding/reward_model/icm_reward_model.py and related __init__.py; (2) add config files for pong: dizoo/atari/config/serial/pong/pong_ppo_offpolicy_icm.py and minigrid env: dizoo/minigrid/config/doorkey8_icm_config.py, fourroom_icm_config.py, minigrid_icm_config.py; (3) add element icm in README
* remove some useless config files in minigrid
* remove redundant part in ppo.py, add cartpole_ppo_icm_config.py, changed test_icm.py and Readme
-
- 24 Nov, 2021 3 commits
- 22 Nov, 2021 8 commits
-
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by Weiyuhong-1998
* guided_cost
* max_e
* guided_cost
* fix(wyh): fix guided cost recompute bug
* fix(wyh): add model save
* feature(wyh): polish guided cost
* feature(wyh): on guided cost
* fix(wyh): gcl-modify
* fix(wyh): gcl sac config
* fix(wyh): gcl style
* fix(wyh): modify comments
* fix(wyh): masac_5m6m best config
* fix(wyh): sac bug
* fix(wyh): GCL readme
* fix(wyh): GCL readme conflicts
-
Committed by puyuan1996
-
Committed by niuyazhe
-
Committed by niuyazhe
-
Committed by 蒲源
* test rnd
* fix mz config
* fix config
* fix config
* fix(pu): fix r2d2
* fix(pu): fix ppo-onpolicy-rnd adv bug
* fix(puyuan): fix r2d2
* feature(puyuan): add minigrid r2d2 config
* polish minigrid config
* dev-ppo-onpolicy-rnd
* fix(pu): fix rnd reward normalize bug
* feature(pu): add minigrid fourrooms and doorkey env info
* feature(pu): add serial_entry_onpolicy
* fix(pu): fix config params of onpolicy ppo
* feature(pu): add obs normalization
* polish(pu): polish rnd intrinsic reward normalization
* fix(pu): fix clear data bug
* test(pu): add off-policy ppo config
* polish(pu): polish minigrid onppo-rnd config
* polish(pu): polish rnd reward model and minigrid config for rnd_onppo
* polish(pu): polish minigrid rnd_onppo config
* feature(pu): add gym-minigrid
* fix(pu): fix ISerialEvaluator bug
* fix(pu): fix cuda device compatibility
* fix(pu): fix MiniGrid-ObstructedMaze-2Dlh-v0 env_id bug
* polish(pu): squash rnd intrinsic reward to [0,1] according to the batch min and max
* style(pu): yapf format
* polish(pu): polish pitfall offppo config
* polish(pu): polish rnd-onppo and onppo config
* polish(pu): polish config and weight last reward
* polish(pu): polish rnd-onppo config
* fix(pu): fix mujoco onppo config
* fix(pu): fix continuous version of dict_data_split_traj_and_compute_adv
* polish(pu): polish config
* fix(pu): add key traj_flag in data to split traj correctly when ignore_done is True in halfcheetah
* polish(pu): polish annotation
* polish(pu): withdraw files submitted wrongly
* polish(pu): withdraw files deleted wrongly
* polish(pu): polish onppo config
* fix(pu): fix remaining_traj_data recompute adv bug and polish rnd onppo code
* style(pu): yapf format
* polish(pu): polish gae_traj_flag function
* polish(pu): delete redundant function in onppo
-
- 20 Nov, 2021 1 commit
-
-
Committed by niuyazhe
-
- 19 Nov, 2021 4 commits
-
-
Committed by Davide Liu
* added gail entry
* added lunarlander and cartpole config
* added gail mujoco config
* added mujoco exp
* update22-10
* added third exp
* added metric to evaluate policies
* added GAIL entry and config for Cartpole and Walker2d
* checked style and unittest
* restored lunarlander env
* style problems
* bug correction
* Delete expert_data_train.pkl
* changed loss of GAIL
* Update walker2d_ddpg_gail_config.py
* changed gail reward from -D(s, a) to -log(D(s, a))
* added small constant to reward function
* added comment to clarify config
* Update walker2d_ddpg_gail_config.py
* added lunarlander entry + config
* Added Atari discriminator + Pong entry config
* Update gail_irl_model.py
* Update gail_irl_model.py
* added gail serial pipeline and onehot actions for gail atari
* related to previous commit
* removed main files
* removed old comment
-
Committed by Ke Li
* add_pdqn_model
* modify_model_structure
* initial_version_PDQN
* bug_free_PDQN_no_test_convergence
* update_pdqn_config
* add_noise_to_continuous_args
* polish(nyz): polish code style and add noise in pdqn
* seperate_dis_and_cont_model
* fix_bug_for_separation
* fix(pu): current q value use the data action, fix cont loss detach bug, 1 encoder, dist and cont learning rate
* polish(pu): actor delay update
* fix(pu): fix disc cont update frequency
* polish(pu): polish pdqn config
* polish(lk): add comments and typelint for pdqn and dqn
* feature(lk): add test file for pdqn model and policy
* polish(lk): code style
* polish(lk): rm the modify of unrelated files
* polish(lk): rm useless commented code in pdqn
Co-authored-by: niuyazhe <niuyazhe@sensetime.com>
Co-authored-by: puyuan1996 <2402552459@qq.com>
-
Committed by niuyazhe
-
Committed by zjowowen
-
- 18 Nov, 2021 3 commits
-
-
Committed by niuyazhe
-
Committed by jayyoung0802
* add spaceinvaders multi gpu
* add dp and ddp
* Update __init__.py
* recover init
-
Committed by niuyazhe
qrdqn config
-
- 17 Nov, 2021 1 commit
-
-
Committed by Xu Jingxin
feature(nyz): extend torch1.1.0 support
-
- 16 Nov, 2021 4 commits
- 15 Nov, 2021 2 commits
-
-
Committed by Jia Ruonan
* commit bipedalwalkere_ppo_config
* commit bipedalwalker_sac_config
-
Committed by niuyazhe
-
- 07 Nov, 2021 1 commit
-
-
Committed by niuyazhe
feature(nyz): enable arbitrary policy num in serial sample collector and evaluator, add git in docker(smac docker)
-