Commits · 18ca5a851ca84e567eeff3d54f036cb28a097b10 · OpenDILab Open-Source Decision Intelligence Platform / DI-engine

27 Dec, 2021 3 commits
- polish(pu): polish config · 18ca5a85
  Committed by puyuan1996 on Dec 27, 2021
- polish(pu): polish config · 938fc921
  Committed by puyuan1996 on Dec 27, 2021
- polish(pu): update the current best config · 293e5c36
  Committed by puyuan1996 on Dec 27, 2021
26 Dec, 2021 4 commits
- fix(pu): fix bug when collector_env_num>1, the self._traj_buffer is not empty... · 70328aab
  Committed by puyuan1996 on Dec 26, 2021
```
fix(pu): fix bug when collector_env_num>1, the self._traj_buffer is not empty and will leave over the data in random collect phase
```
- polish(pu):polish config · 6ca77642
  Committed by puyuan1996 on Dec 26, 2021
- style(pu): yapf format · 96ea3624
  Committed by puyuan1996 on Dec 26, 2021
- polish(pu):polish td3_vae using the best setting · c7d85c97
  Committed by puyuan1996 on Dec 26, 2021
24 Dec, 2021 1 commit
- polish(pu):polish kl weight and prediction weight · 3f7e2130
  Committed by puyuan1996 on Dec 24, 2021
23 Dec, 2021 1 commit
- polish(pu): polish vae structure, use add not concat between the embeddings of... · 9dd84dd3
  Committed by puyuan1996 on Dec 23, 2021

  polish(pu): polish vae structure, use add not concat between the embeddings of obs and action, use tanh after sample z and after the reconstruction_action head

22 Dec, 2021 1 commit
- polish(pu):polish td3_vae config · 9e6de548
  Committed by puyuan1996 on Dec 22, 2021
20 Dec, 2021 4 commits
- polish(pu): polish as review · b65eb2d4
  Committed by puyuan1996 on Dec 20, 2021
- polish(pu): polish config · b93a380b
  Committed by puyuan1996 on Dec 20, 2021
- feature(pu): add latent space constraint and tanh operation to sample z in vae · ec3a3618
  Committed by puyuan1996 on Dec 20, 2021
- feature(pu): representation shift correction for each transition · 92129676
  Committed by puyuan1996 on Dec 20, 2021
19 Dec, 2021 2 commits
- fix(pu): using decode_with_obs to use the obs_encoding generating from current obs · 2cfc411e
  Committed by puyuan1996 on Dec 19, 2021
- feature(pu): add noise in original action in collect phase and add representation shift correction · fdf21026
  Committed by puyuan1996 on Dec 19, 2021
17 Dec, 2021 1 commit
- feature(pu): add ddpg lunarlander_cont config · 17c9f04d
  Committed by puyuan1996 on Dec 17, 2021
16 Dec, 2021 1 commit
- polish(pu):polish config · 3dbce395
  Committed by puyuan1996 on Dec 16, 2021
15 Dec, 2021 11 commits
- feature(pu): add lunarlander continuous env(ci skip) · 5ad42c0c
  Committed by niuyazhe on Dec 15, 2021
- polish(pu):polish td3_vae config · 0e5fa1be
  Committed by puyuan1996 on Dec 15, 2021
- fix(pu): use latent action relabel · ae0ae93c
  Committed by puyuan1996 on Dec 15, 2021
- test(pu): delete noise and change the data for updating vae · 18f86f26
  Committed by puyuan1996 on Dec 14, 2021
- polish(pu):polish td3_vae config · 5112584b
  Committed by puyuan1996 on Dec 13, 2021
- polish(pu): vae and rl update alternately · 58810c4c
  Committed by puyuan1996 on Dec 13, 2021
- polish(pu): vae and rl update alternately · 4fca50f4
  Committed by puyuan1996 on Dec 13, 2021
- fix(pu): fix log typo · 4d240686
  Committed by puyuan1996 on Dec 10, 2021
- feature(pu): add td3_vae · f53806f0
  Committed by puyuan1996 on Dec 10, 2021
- feature(pu): add td3_vae · 39c7e2a6
  Committed by puyuan1996 on Dec 10, 2021
- fix(lk): fix port conflict in gym_soccer (#139) · aa612443
  Committed by Ke Li on Dec 15, 2021
```
* feature(lk): fix port conflict

* polish(lk): polish code style and format

* fix(lk): change to subprocess
```
14 Dec, 2021 4 commits
- fix(nyz): fix PER indice repeat unittest bug · ff31a86b
  Committed by niuyazhe on Dec 14, 2021
- polish(nyp): fix unittest for trex training and collecting (#144) · f089d02a
  Committed by Will-Nie on Dec 14, 2021

* add trex algorithm for pong

* sort style

* add atari, ll,cp; fix device, collision; add_ppo

* add accuracy evaluation

* correct style

* add seed to make sure results are replicable

* remove useless part in cum return  of model part

* add mujoco onppo training pipeline; ppo config

* improve style

* add sac training config for mujoco

* add log, add save data; polish config

* logger; hyperparameter;walker

* correct style

* modify else condition

* change rnd to trex

* revise according to comments, add episode collect

* new collect mode for trex, fix all bugs, comments

* final change

* polish after the final comment

* add readme/test

* add test for serial entry of trex/gcl

* sort style

* change mujoco to cartpole for test for trex_onppo

* remove files generated by testing

* revise tests for entry

* sort style

* revise tests

* modify pytest

* fix(nyz): speed up ppg/ppo and marl algo unittest

* polish(nyz): speed up trex unittest and fix trex entry default config bug

* fix(nyz): fix same name bug

* fix(nyz): fix remove conflict bug(ci skip)

Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

- style(nyz): update zh doc link and add more env tutorial zh(ci skip) · 973e33e2
  Committed by niuyazhe on Dec 14, 2021
- polish(nyp):add R2d2 comments (#149) · a2edf6a2
  Committed by Will-Nie on Dec 14, 2021
```
* add comments for r2d2

* sort style

* revise according to the comments

* fix style
```

13 Dec, 2021 1 commit
- feature(nyz): add delay reward mujoco env (#145) · 490691fb
  Committed by Swain on Dec 13, 2021

  * feature(nyz): add delay reward mujoco env
  * test(nyz): add delay reward mujoco env test and fix bug

12 Dec, 2021 1 commit
- style(zm): add conda auto release (#148) · bc0102ba
  Committed by Ming Zhang on Dec 12, 2021
09 Dec, 2021 2 commits
- style(nyz): update intro and env doc link(ci skip) · 147d56f3
  Committed by niuyazhe on Dec 09, 2021
- feature(xjx): refactor buffer (#129) · a490729f
  Committed by Xu Jingxin on Dec 09, 2021

* Init base buffer and storage

* Use ratelimit as middleware

* Pass style check

* Keep the return original return value

* Add buffer.view

* Add replace flag on sample, rewrite middleware processing

* Test slicing

* Add buffer copy middleware

* Add update/delete api in buffer, rename middleware

* Implement update and delete api of buffer

* add naive use time count middleware in buffer

* Rename next to chain

* feature(nyz): add staleness check middleware and polish buffer

* feature(nyz): add naive priority experience replay

* Sample by indices

* Combine buffer and storage layers

* Support indices when deleting items from the queue

* Use dataclass to save buffered data, remove return_index and return_meta

* Add ignore_insufficient

* polish(nyz): add return index in push and copy same data in sample

* Drop useless import

* Fix sample with indices, ensure return size is equal to input size or indices size

* Make sure sampled data in buffer is different from each other

* Support sample by grouped meta key

* Support sample by rolling window

* Add import/export data in buffer

* Padding after sampling from buffer

* Polish use_time_check

* Use buffer as dataset

* Set collate_fn in buffer test

* feature(nyz): add deque buffer compatibility wrapper and demo

* polish(nyz): polish code style and add pong dqn new deque buffer demo

* feature(nyz): add use_time_count compatibility in wrapper

* feature(nyz): add priority replay buffer compatibility in wrapper

* Improve performance of buffer.update

* polish(nyz): add priority max limit and correct flake8

* Use __call__ to rewrite middleware

* Rewrite buffer index

* Fix buffer delete

* Skip first item

* Rewrite buffer delete

* Use caller

* Use caller in priority

* Add group sample

Co-authored-by: niuyazhe <niuyazhe@sensetime.com>

08 Dec, 2021 3 commits
- fix(nyz): disable trex unittest · a7de696a
  Committed by niuyazhe on Dec 08, 2021
- fix(nyz): fix trex unittest bugs · 234de26b
  Committed by niuyazhe on Dec 08, 2021
- feature(nyp): add Trex algorithm (#119) · 63105fef
  Committed by Will-Nie on Dec 08, 2021

* add trex algorithm for pong

* sort style

* add atari, ll,cp; fix device, collision; add_ppo

* add accuracy evaluation

* correct style

* add seed to make sure results are replicable

* remove useless part in cum return  of model part

* add mujoco onppo training pipeline; ppo config

* improve style

* add sac training config for mujoco

* add log, add save data; polish config

* logger; hyperparameter;walker

* correct style

* modify else condition

* change rnd to trex

* revise according to comments, add episode collect

* new collect mode for trex, fix all bugs, comments

* final change

* polish after the final comment

* add readme/test

* add test for serial entry of trex/gcl

* sort style


OpenDILab Open-Source Decision Intelligence Platform / DI-engine · last synced more than 2 years ago