DI-engine tags
https://gitcode.net/opendilab/DI-engine/-/tags
2021-10-08T17:11:05+08:00
https://gitcode.net/opendilab/DI-engine/-/tags/v0.2.3
v0.2.3
DI-engine v0.2.3
niuyazhe
niuyazhe@sensetime.com
https://gitcode.net/opendilab/DI-engine/-/tags/v0.2.2
v0.2.2
DI-engine v0.2.2
<h1 data-sourcepos="1:1-1:13" dir="auto">
<a id="user-content-env-dizoo" class="anchor" href="#env-dizoo" aria-hidden="true"></a>Env (dizoo)</h1>
<ol data-sourcepos="2:1-5:0" dir="auto">
<li data-sourcepos="2:1-2:40">apple key to door treasure env (#128)</li>
<li data-sourcepos="3:1-3:33">bsuite memory benchmark (#138)</li>
<li data-sourcepos="4:1-5:0">polish atari impala config</li>
</ol>
<h1 data-sourcepos="6:1-6:11" dir="auto">
<a id="user-content-algorithm" class="anchor" href="#algorithm" aria-hidden="true"></a>Algorithm</h1>
<ol data-sourcepos="7:1-11:0" dir="auto">
<li data-sourcepos="7:1-7:34">Guided Cost IRL algorithm (#57)</li>
<li data-sourcepos="8:1-8:34">ICM exploration algorithm (#41)</li>
<li data-sourcepos="9:1-9:46">MP-DQN hybrid action space algorithm (#131)</li>
<li data-sourcepos="10:1-11:0">add loss statistics and polish r2d3 pong config (#126)</li>
</ol>
<h1 data-sourcepos="12:1-12:13" dir="auto">
<a id="user-content-enhancement" class="anchor" href="#enhancement" aria-hidden="true"></a>Enhancement</h1>
<ol data-sourcepos="13:1-14:0" dir="auto">
<li data-sourcepos="13:1-14:0">add renew env mechanism in env manager and update timeout mechanism (#127) (#134)</li>
</ol>
<h1 data-sourcepos="15:1-15:5" dir="auto">
<a id="user-content-fix" class="anchor" href="#fix" aria-hidden="true"></a>Fix</h1>
<ol data-sourcepos="16:1-22:0" dir="auto">
<li data-sourcepos="16:1-16:48">async subprocess env manager reset bug (#137)</li>
<li data-sourcepos="17:1-17:37">keepdims name bug in model wrapper</li>
<li data-sourcepos="18:1-18:31">on-policy ppo value norm bug</li>
<li data-sourcepos="19:1-19:27">GAE and RND unittest bug</li>
<li data-sourcepos="20:1-20:46">hidden state wrapper h tensor compatibility</li>
<li data-sourcepos="21:1-22:0">naive buffer auto config create bug</li>
</ol>
<h1 data-sourcepos="23:1-23:7" dir="auto">
<a id="user-content-style" class="anchor" href="#style" aria-hidden="true"></a>Style</h1>
<ol data-sourcepos="24:1-25:0" dir="auto">
<li data-sourcepos="24:1-25:0">add supporters list</li>
</ol>
<h1 data-sourcepos="26:1-26:18" dir="auto">
<a id="user-content-new-repo-feature" class="anchor" href="#new-repo-feature" aria-hidden="true"></a>New Repo Feature</h1>
<ol data-sourcepos="27:1-28:0" dir="auto">
<li data-sourcepos="27:1-28:0"><a href="https://github.com/opendilab/treevalue#speed-performance" rel="nofollow noreferrer noopener" target="_blank">treevalue speed benchmark</a></li>
</ol>
<p data-sourcepos="29:1-29:137" dir="auto"><strong>Contributors: @PaParaZz1 @puyuan1996 @RobinC94 @LikeJulia @Will-Nie @Weiyuhong-1998 @timothijoe @davide97l @lichuminglcm @YinminZhang</strong></p>
2021-12-04T08:09:21+08:00
niuyazhe
niuyazhe@sensetime.com
https://gitcode.net/opendilab/DI-engine/-/tags/v0.2.1
v0.2.1
DI-engine v0.2.1
<h1 data-sourcepos="1:1-1:12" dir="auto">
<a id="user-content-api-change" class="anchor" href="#api-change" aria-hidden="true"></a>API Change</h1>
<ol data-sourcepos="2:1-5:0" dir="auto">
<li data-sourcepos="2:1-2:73">remove torch from all envs (numpy arrays are the basic data format in envs)</li>
<li data-sourcepos="3:1-3:45">remove the <code>on_policy</code> field from all configs</li>
<li data-sourcepos="4:1-5:0">change <code>eval_freq</code> from 50 to 1000</li>
</ol>
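<p dir="auto">A minimal sketch of how an existing config might be adapted to the <code>on_policy</code> removal listed above. The helper <code>strip_on_policy</code> and the nested config layout are hypothetical and only illustrate the idea; they are not part of DI-engine.</p>

```python
from typing import Any


def strip_on_policy(cfg: Any) -> Any:
    """Return a copy of a nested config with every `on_policy` key removed.

    Hypothetical helper for illustration; not a DI-engine API.
    """
    if isinstance(cfg, dict):
        return {k: strip_on_policy(v) for k, v in cfg.items() if k != "on_policy"}
    if isinstance(cfg, list):
        return [strip_on_policy(v) for v in cfg]
    return cfg


# Hypothetical pre-v0.2.1 style config; the key layout is an assumption.
old_cfg = {
    "policy": {
        "on_policy": False,
        "learn": {"batch_size": 64},
        # Set explicitly here so the value does not depend on any default.
        "eval": {"evaluator": {"eval_freq": 50}},
    }
}

new_cfg = strip_on_policy(old_cfg)
print(new_cfg)
```

<p dir="auto">The helper builds a new structure instead of mutating its input, so the original config stays available for comparison.</p>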
<h1 data-sourcepos="6:1-6:18" dir="auto">
<a id="user-content-tutorial-and-doc" class="anchor" href="#tutorial-and-doc" aria-hidden="true"></a>Tutorial and Doc</h1>
<ol data-sourcepos="7:1-9:0" dir="auto">
<li data-sourcepos="7:1-9:0">
<a href="https://di-engine-docs.readthedocs.io/en/latest/env_tutorial/index.html" rel="nofollow noreferrer noopener" target="_blank">env tutorial</a>/<a href="https://di-engine-docs.readthedocs.io/en/main-zh/env_tutorial/index_zh.html" rel="nofollow noreferrer noopener" target="_blank">environment guide (Chinese)</a>
</li>
</ol>
<h1 data-sourcepos="10:1-10:13" dir="auto">
<a id="user-content-env-dizoo" class="anchor" href="#env-dizoo" aria-hidden="true"></a>Env (dizoo)</h1>
<ol data-sourcepos="11:1-15:0" dir="auto">
<li data-sourcepos="11:1-11:23">gym-hybrid env (#86)</li>
<li data-sourcepos="12:1-12:29">gym-soccer (HFO) env (#94)</li>
<li data-sourcepos="13:1-13:31">Go-Bigger env baseline (#95)</li>
<li data-sourcepos="14:1-15:0">sac and ppo config for bipedalwalker env (#121)</li>
</ol>
<h1 data-sourcepos="16:1-16:11" dir="auto">
<a id="user-content-algorithm" class="anchor" href="#algorithm" aria-hidden="true"></a>Algorithm</h1>
<ol data-sourcepos="17:1-25:0" dir="auto">
<li data-sourcepos="17:1-17:48">DQfD Imitation Learning algorithm (#48) (#98)</li>
<li data-sourcepos="18:1-18:35">TD3BC offline RL algorithm (#88)</li>
<li data-sourcepos="19:1-19:39">MBPO model-based RL algorithm (#113)</li>
<li data-sourcepos="20:1-20:46">PADDPG hybrid action space algorithm (#109)</li>
<li data-sourcepos="21:1-21:44">PDQN hybrid action space algorithm (#118)</li>
<li data-sourcepos="22:1-22:59">fix R2D2 bugs and produce benchmark, add naive NGU (#40)</li>
<li data-sourcepos="23:1-23:52">self-play training demo in slime_volley env (#23)</li>
<li data-sourcepos="24:1-25:0">add example of GAIL entry + config for mujoco (#114)</li>
</ol>
<h1 data-sourcepos="26:1-26:13" dir="auto">
<a id="user-content-enhancement" class="anchor" href="#enhancement" aria-hidden="true"></a>Enhancement</h1>
<ol data-sourcepos="27:1-32:0" dir="auto">
<li data-sourcepos="27:1-27:57">enable arbitrary policy num in serial sample collector</li>
<li data-sourcepos="28:1-28:54">add torch DataParallel for single machine multi-GPU</li>
<li data-sourcepos="29:1-29:40">add registry force_overwrite argument</li>
<li data-sourcepos="30:1-32:0">add naive buffer periodic thruput seconds argument</li>
</ol>
<h1 data-sourcepos="33:1-33:5" dir="auto">
<a id="user-content-fix" class="anchor" href="#fix" aria-hidden="true"></a>Fix</h1>
<ol data-sourcepos="34:1-45:0" dir="auto">
<li data-sourcepos="34:1-34:38">target model wrapper hard reset bug</li>
<li data-sourcepos="35:1-35:40">fix learn state_dict target model bug</li>
<li data-sourcepos="36:1-36:56">ppo bugs and update atari ppo offpolicy config (#108)</li>
<li data-sourcepos="37:1-37:27">pyyaml version bug (#99)</li>
<li data-sourcepos="38:1-38:41">small fix on bsuite environment (#117)</li>
<li data-sourcepos="39:1-39:28">discrete cql unittest bug</li>
<li data-sourcepos="40:1-40:23">release workflow bug</li>
<li data-sourcepos="41:1-41:43">base policy model state_dict overlap bug</li>
<li data-sourcepos="42:1-42:52">remove on_policy option in dizoo config and entry</li>
<li data-sourcepos="43:1-45:0">remove torch in env</li>
</ol>
<h1 data-sourcepos="46:1-46:6" dir="auto">
<a id="user-content-test" class="anchor" href="#test" aria-hidden="true"></a>Test</h1>
<ol data-sourcepos="47:1-52:0" dir="auto">
<li data-sourcepos="47:1-47:38">add pure docker setting test (#103)</li>
<li data-sourcepos="48:1-48:48">add unittest for dataset and evaluator (#107)</li>
<li data-sourcepos="49:1-49:45">add unittest for on-policy algorithm (#92)</li>
<li data-sourcepos="50:1-52:0">add unittest for ppo and td (MARL case) (#89)</li>
</ol>
<h1 data-sourcepos="53:1-53:7" dir="auto">
<a id="user-content-style" class="anchor" href="#style" aria-hidden="true"></a>Style</h1>
<ol data-sourcepos="54:1-58:0" dir="auto">
<li data-sourcepos="54:1-54:24">gym version == 0.20.0</li>
<li data-sourcepos="55:1-55:36">torch version &gt;= 1.1.0, &lt;= 1.10.0</li>
<li data-sourcepos="56:1-58:0">ale-py == 0.7.0</li>
</ol>
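<p dir="auto">The pins above can be checked with a small script. This is a sketch under stated assumptions: <code>parse_version</code> and <code>check_versions</code> are hypothetical helpers, and the parser only handles plain dotted numeric versions (no local suffixes such as <code>+cu111</code>).</p>

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain dotted version such as '1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in text.split("."))


def check_versions(gym: str, torch: str, ale_py: str) -> bool:
    """Check installed versions against the pins in the list above."""
    gym_ok = parse_version(gym) == (0, 20, 0)
    torch_ok = (1, 1, 0) <= parse_version(torch) <= (1, 10, 0)
    ale_ok = parse_version(ale_py) == (0, 7, 0)
    return gym_ok and torch_ok and ale_ok


print(check_versions("0.20.0", "1.9.0", "0.7.0"))   # True
print(check_versions("0.20.0", "1.11.0", "0.7.0"))  # False
```

<p dir="auto">Comparing integer tuples rather than strings matters here: as strings, <code>"1.9.0"</code> sorts after <code>"1.10.0"</code>, which would wrongly reject torch 1.9.0.</p>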
<h1 data-sourcepos="59:1-59:10" dir="auto">
<a id="user-content-new-repo" class="anchor" href="#new-repo" aria-hidden="true"></a>New Repo</h1>
<ul data-sourcepos="60:1-62:0" dir="auto">
<li data-sourcepos="60:1-60:108">
<a href="https://github.com/opendilab/GoBigger" rel="nofollow noreferrer noopener" target="_blank">Go-Bigger</a> OpenDILab Multi-Agent Decision Intelligence Environment</li>
<li data-sourcepos="61:1-62:0">
<a href="https://github.com/opendilab/GoBigger-Challenge-2021" rel="nofollow noreferrer noopener" target="_blank">GoBigger-Challenge-2021</a> Basic code and description for GoBigger challenge 2021</li>
</ul>
<p data-sourcepos="63:1-63:169" dir="auto"><strong>Contributors: @PaParaZz1 @puyuan1996 @Will-Nie @YinminZhang @Weiyuhong-1998 @LikeJulia @sailxjx @davide97l @jayyoung0802 @lichuminglcm @yifan123 @RobinC94 @zjowowen</strong></p>
2021-11-23T02:00:17+08:00
niuyazhe
niuyazhe@sensetime.com
https://gitcode.net/opendilab/DI-engine/-/tags/v0.2.0
v0.2.0
v0.2.0
<h1 data-sourcepos="1:1-1:12" dir="auto">
<a id="user-content-api-change" class="anchor" href="#api-change" aria-hidden="true"></a>API Change</h1>
<ol data-sourcepos="2:1-9:0" dir="auto">
<li data-sourcepos="2:1-2:54">
<code>SampleCollector</code> renamed to <code>SampleSerialCollector</code>
</li>
<li data-sourcepos="3:1-3:56">
<code>EpisodeCollector</code> renamed to <code>EpisodeSerialCollector</code>
</li>
<li data-sourcepos="4:1-4:63">
<code>BaseSerialEvaluator</code> renamed to <code>InteractionSerialEvaluator</code>
</li>
<li data-sourcepos="5:1-5:60">
<code>ZerglingCollector</code> renamed to <code>ZerglingParallelCollector</code>
</li>
<li data-sourcepos="6:1-6:58">
<code>OneVsOneCollector</code> renamed to <code>MarineParallelCollector</code>
</li>
<li data-sourcepos="7:1-9:0">
<code>AdvancedBuffer</code> registry name changed from <code>priority</code> to <code>advanced</code>
</li>
</ol>
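<p dir="auto">For code written against the old names, the class renames above could be applied mechanically. The following is a sketch only: <code>migrate_names</code> is a hypothetical helper, and the sample line with its constructor arguments is invented for illustration. The <code>AdvancedBuffer</code> registry name change is not handled, since <code>priority</code> is too generic a string to rewrite safely.</p>

```python
import re

# Old -> new class names, taken from the API Change list above.
RENAMES = {
    "SampleCollector": "SampleSerialCollector",
    "EpisodeCollector": "EpisodeSerialCollector",
    "BaseSerialEvaluator": "InteractionSerialEvaluator",
    "ZerglingCollector": "ZerglingParallelCollector",
    "OneVsOneCollector": "MarineParallelCollector",
}

# \b restricts matches to whole identifiers, so already-migrated names are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_names(source: str) -> str:
    """Replace each old class name in `source` with its new name."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)


# Hypothetical line of user code; the argument list is an assumption.
old_line = "collector = SampleCollector(cfg, env, policy)"
print(migrate_names(old_line))
```

<p dir="auto">Running the helper twice gives the same result as running it once, because none of the new names contains an old name as a whole identifier.</p>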
<h1 data-sourcepos="10:1-10:13" dir="auto">
<a id="user-content-env-dizoo" class="anchor" href="#env-dizoo" aria-hidden="true"></a>Env (dizoo)</h1>
<ol data-sourcepos="11:1-18:0" dir="auto">
<li data-sourcepos="11:1-11:23">overcooked env (#20)</li>
<li data-sourcepos="12:1-12:20">procgen env (#26)</li>
<li data-sourcepos="13:1-13:30">modified predator env (#30)</li>
<li data-sourcepos="14:1-14:17">d4rl env (#37)</li>
<li data-sourcepos="15:1-15:25">imagenet dataset (#27)</li>
<li data-sourcepos="16:1-16:19">bsuite env (#58)</li>
<li data-sourcepos="17:1-18:0">move atari_py to ale-py</li>
</ol>
<h1 data-sourcepos="19:1-19:11" dir="auto">
<a id="user-content-algorithm" class="anchor" href="#algorithm" aria-hidden="true"></a>Algorithm</h1>
<ol data-sourcepos="20:1-26:0" dir="auto">
<li data-sourcepos="20:1-20:29">SQIL algorithm (#25) (#44)</li>
<li data-sourcepos="21:1-21:50">CQL algorithm (discrete/continuous) (#37) (#68)</li>
<li data-sourcepos="22:1-22:24">MAPPO algorithm (#62)</li>
<li data-sourcepos="23:1-23:24">WQMIX algorithm (#24)</li>
<li data-sourcepos="24:1-24:23">D4PG algorithm (#76)</li>
<li data-sourcepos="25:1-26:0">update multi-discrete policy(dqn, ppo, rainbow) (#51) (#72)</li>
</ol>
<h1 data-sourcepos="27:1-27:13" dir="auto">
<a id="user-content-enhancement" class="anchor" href="#enhancement" aria-hidden="true"></a>Enhancement</h1>
<ol data-sourcepos="28:1-36:0" dir="auto">
<li data-sourcepos="28:1-28:58">image classification supervised training pipeline (#27)</li>
<li data-sourcepos="29:1-29:61">add force_reproducibility option in subprocess env manager</li>
<li data-sourcepos="30:1-30:46">add/delete/restart replicas via cli for k8s</li>
<li data-sourcepos="31:1-31:46">add league metric (trueskill and elo) (#22)</li>
<li data-sourcepos="32:1-32:64">add tb in naive buffer and modify tb in advanced buffer (#39)</li>
<li data-sourcepos="33:1-33:82">add k8s launcher and di-orchestrator launcher, add related unittest (#45) (#49)</li>
<li data-sourcepos="34:1-34:45">add hyper-parameter scheduler module (#38)</li>
<li data-sourcepos="35:1-36:0">add plot function (#59)</li>
</ol>
<h1 data-sourcepos="37:1-37:5" dir="auto">
<a id="user-content-fix" class="anchor" href="#fix" aria-hidden="true"></a>Fix</h1>
<ol data-sourcepos="38:1-51:0" dir="auto">
<li data-sourcepos="38:1-38:48">acer weight bug and update atari result (#21)</li>
<li data-sourcepos="39:1-39:56">mappo nan bug and dict obs cannot unsqueeze bug (#54)</li>
<li data-sourcepos="40:1-40:59">r2d2 hidden state and obs pre-processing bug (#36) (#52)</li>
<li data-sourcepos="41:1-41:41">ppo bug when using dual_clip and adv &gt; 0</li>
<li data-sourcepos="42:1-42:33">qmix double_q hidden state bug</li>
<li data-sourcepos="43:1-43:54">spawn context problem in interaction unittest (#69)</li>
<li data-sourcepos="44:1-44:37">formatted config no eval bug (#53)</li>
<li data-sourcepos="45:1-45:80">catch statements that can never succeed and system proxy bug (#71) (#79)</li>
<li data-sourcepos="46:1-46:28">lunarlander config polish</li>
<li data-sourcepos="47:1-47:35">c51 head dimension mismatch bug</li>
<li data-sourcepos="48:1-48:26">mujoco config typo bug</li>
<li data-sourcepos="49:1-49:37">ppg atari config multi buffer bug</li>
<li data-sourcepos="50:1-51:0">max use and priority update special branch bug in advanced_buffer</li>
</ol>
<h1 data-sourcepos="52:1-52:7" dir="auto">
<a id="user-content-style" class="anchor" href="#style" aria-hidden="true"></a>Style</h1>
<ol data-sourcepos="53:1-57:0" dir="auto">
<li data-sourcepos="53:1-53:57">add docker deploy in github workflow (#70) (#78) (#80)</li>
<li data-sourcepos="54:1-54:24">support PyTorch 1.9.0</li>
<li data-sourcepos="55:1-55:30">add algo/env list in README</li>
<li data-sourcepos="56:1-57:0">rename advanced_buffer registry name to advanced</li>
</ol>
<h1 data-sourcepos="58:1-58:10" dir="auto">
<a id="user-content-new-repo" class="anchor" href="#new-repo" aria-hidden="true"></a>New Repo</h1>
<ul data-sourcepos="59:1-60:0" dir="auto">
<li data-sourcepos="59:1-60:0">
<a href="https://github.com/opendilab/DI-treetensor" rel="nofollow noreferrer noopener" target="_blank">DI-treetensor</a>: Tree Nested PyTorch Tensor Lib</li>
</ul>
<p data-sourcepos="61:1-61:199" dir="auto"><strong>Contributors: @PaParaZz1 @YinminZhang @Will-Nie @puyuan1996 @Weiyuhong-1998 <a href="/HansBug" data-user="196424" data-reference-type="user" data-container="body" data-placement="top" class="gfm gfm-project_member js-user-link" title="HansBug">@HansBug</a> @sailxjx @simonat2011 @konnase @RobinC94 @LikeJulia @LuciusMos @jayyoung0802 @yifan123 @davide97l @garyzhang99</strong></p>
2021-10-08T17:11:05+08:00
niuyazhe
niuyazhe@sensetime.com
https://gitcode.net/opendilab/DI-engine/-/tags/v0.1.1
v0.1.1
<h1 data-sourcepos="1:1-1:12" dir="auto">
<a id="user-content-api-change" class="anchor" href="#api-change" aria-hidden="true"></a>API Change</h1>
<ol data-sourcepos="2:1-3:0" dir="auto">
<li data-sourcepos="2:1-3:0">Specify the <code>exp_name</code> field in the config to control where logs and files are output</li>
</ol>
<h1 data-sourcepos="4:1-4:12" dir="auto">
<a id="user-content-envdizoo" class="anchor" href="#envdizoo" aria-hidden="true"></a>Env (dizoo)</h1>
<ol data-sourcepos="5:1-9:0" dir="auto">
<li data-sourcepos="5:1-5:29">selfplay/league demo (#12)</li>
<li data-sourcepos="6:1-6:21">pybullet env (#16)</li>
<li data-sourcepos="7:1-7:21">minigrid env (#13)</li>
<li data-sourcepos="8:1-9:0">atari enduro config (#11)</li>
</ol>
<h1 data-sourcepos="10:1-10:11" dir="auto">
<a id="user-content-algorithm" class="anchor" href="#algorithm" aria-hidden="true"></a>Algorithm</h1>
<ol data-sourcepos="11:1-13:0" dir="auto">
<li data-sourcepos="11:1-11:21">on-policy PPO (#9)</li>
<li data-sourcepos="12:1-13:0">ACER algorithm (#14)</li>
</ol>
<h1 data-sourcepos="14:1-14:13" dir="auto">
<a id="user-content-enhancement" class="anchor" href="#enhancement" aria-hidden="true"></a>Enhancement</h1>
<ol data-sourcepos="15:1-17:0" dir="auto">
<li data-sourcepos="15:1-15:46">polish experiment directory structure (#10)</li>
<li data-sourcepos="16:1-17:0">split doc to new repo (#4)</li>
</ol>
<h1 data-sourcepos="18:1-18:5" dir="auto">
<a id="user-content-fix" class="anchor" href="#fix" aria-hidden="true"></a>Fix</h1>
<ol data-sourcepos="19:1-22:0" dir="auto">
<li data-sourcepos="19:1-19:34">atari env info action space bug</li>
<li data-sourcepos="20:1-20:53">env manager retry wrapper raise exception info bug</li>
<li data-sourcepos="21:1-22:0">dist entry disable-flask-log typo</li>
</ol>
<h1 data-sourcepos="23:1-23:7" dir="auto">
<a id="user-content-style" class="anchor" href="#style" aria-hidden="true"></a>Style</h1>
<ol data-sourcepos="24:1-27:0" dir="auto">
<li data-sourcepos="24:1-24:38">codestyle optimization by lgtm (#7)</li>
<li data-sourcepos="25:1-25:32">code/comment statistics badge</li>
<li data-sourcepos="26:1-27:0">polish github CI workflow</li>
</ol>
<p data-sourcepos="28:1-28:64" dir="auto"><strong>Contributors</strong>: @PaParaZz1 <a href="/HansBug" data-user="196424" data-reference-type="user" data-container="body" data-placement="top" class="gfm gfm-project_member js-user-link" title="HansBug">@HansBug</a> @YinminZhang @simonat2011</p>
2021-10-08T17:11:05+08:00
niuyazhe
niuyazhe@sensetime.com
https://gitcode.net/opendilab/DI-engine/-/tags/v0.1.0
v0.1.0
DI-engine (beta) v0.1.0
<p data-sourcepos="1:1-1:22" dir="auto">DI-engine (beta) v0.1.0</p>
2021-10-08T17:11:05+08:00
niuyazhe
niuyazhe@sensetime.com