Commit 8ad508dd, authored by PaParaZz1

Deploying to gh-pages from @ 13222d5a47f6ee74c4ab8e98382d0c5528bcea9a 🚀

Parent fc57bd02
......@@ -355,11 +355,15 @@
<span class="k">return</span> <span class="p">{</span>
<span class="s1">&#39;cur_lr&#39;</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">_optimizer</span><span class="o">.</span><span class="n">defaults</span><span class="p">[</span><span class="s1">&#39;lr&#39;</span><span class="p">],</span>
<span class="s1">&#39;total_loss&#39;</span><span class="p">:</span> <span class="n">loss</span><span class="o">.</span><span class="n">item</span><span class="p">(),</span>
<span class="s1">&#39;q_value&#39;</span><span class="p">:</span> <span class="n">q_value</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">(),</span>
<span class="s1">&#39;priority&#39;</span><span class="p">:</span> <span class="n">td_error_per_sample</span><span class="o">.</span><span class="n">abs</span><span class="p">()</span><span class="o">.</span><span class="n">tolist</span><span class="p">(),</span>
<span class="c1"># Only discrete action satisfying len(data[&#39;action&#39;])==1 can return this and draw histogram on tensorboard.</span>
<span class="c1"># &#39;[histogram]action_distribution&#39;: data[&#39;action&#39;],</span>
<span class="p">}</span></div>
<span class="k">def</span> <span class="nf">_monitor_vars_learn</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]:</span>
<span class="k">return</span> <span class="p">[</span><span class="s1">&#39;cur_lr&#39;</span><span class="p">,</span> <span class="s1">&#39;total_loss&#39;</span><span class="p">,</span> <span class="s1">&#39;q_value&#39;</span><span class="p">]</span>
<div class="viewcode-block" id="DQNPolicy._state_dict_learn"><a class="viewcode-back" href="../../../api_doc/policy/dqn.html#ding.policy.dqn.DQNPolicy._state_dict_learn">[docs]</a> <span class="k">def</span> <span class="nf">_state_dict_learn</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">]:</span>
<span class="sd">&quot;&quot;&quot;</span>
<span class="sd"> Overview:</span>
......