Commit dd8fea95 · Author: xiaowei_xing

test

Parent c3b39325
......@@ -756,7 +756,16 @@ $$
We have:
$$
V^{\pi'}-V^{\pi} = \frac{1}{1-\gamma}\left(\mathbb{E}_ {s\sim d^{\pi'},a\sim\pi'(\cdot|s),s'\sim M(\cdot|s,a)}[\delta_{f}(s,a,s')] - \mathbb{E}_ {s\sim d^{\pi},a\sim\pi(\cdot|s),s'\sim M(\cdot|s,a)}[\delta_{f}(s,a,s')]\right).
\tag{17}
$$
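We first consider the first term.
Let $\overline{\delta}_ {f}^{\pi'} \in \mathbb{R}^{|S|}$ denote the vector whose entry for state $s$ is $\overline{\delta}_ {f}^{\pi'}(s)=\mathbb{E}_ {a\sim\pi'(\cdot|s),s'\sim M(\cdot|s,a)}[\delta_{f}(s,a,s')]$. Then: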
$$
\mathbb{E}_ {s\sim d^{\pi'},a\sim\pi',s'\sim M}[\delta_{f}(s,a,s')] = \langle d^{\pi'}, \overline{\delta}_ {f}^{\pi'}\rangle
$$
This completes the proof. $\diamondsuit$
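
As a sanity check, the identities above can be verified numerically on a small random tabular MDP. The sketch below rests on assumptions not stated in this excerpt: $\delta_{f}(s,a,s') = R(s,a,s') + \gamma f(s') - f(s)$, $d^{\pi}$ is the normalized discounted state distribution started from an initial distribution `mu`, and $V^{\pi}$ denotes the expected return $\mathbb{E}_ {s_0\sim\mu}[V^{\pi}(s_0)]$. All function and variable names are hypothetical.

```python
import numpy as np


def random_stochastic(rng, *shape):
    """Random nonnegative array whose last axis sums to 1."""
    x = rng.random(shape)
    return x / x.sum(axis=-1, keepdims=True)


def state_distribution(pi, P, mu, gamma):
    """Assumed definition: d^pi = (1 - gamma) * (I - gamma * P_pi^T)^{-1} mu."""
    P_pi = np.einsum("sa,sat->st", pi, P)  # state-to-state kernel under pi
    eye = np.eye(len(mu))
    return (1.0 - gamma) * np.linalg.solve(eye - gamma * P_pi.T, mu)


def delta_f(R, f, gamma):
    """Assumed form: delta_f(s, a, s') = R(s, a, s') + gamma * f(s') - f(s)."""
    return R + gamma * f[None, None, :] - f[:, None, None]


def mean_delta(pi, P, R, f, gamma):
    """Vector of E_{a~pi, s'~M}[delta_f(s, a, s')], one entry per state s."""
    return np.einsum("sa,sat,sat->s", pi, P, delta_f(R, f, gamma))


def expected_return(pi, P, R, mu, gamma):
    """E_{s0~mu}[V^pi(s0)], with V^pi solved from the Bellman equation."""
    P_pi = np.einsum("sa,sat->st", pi, P)
    r_pi = np.einsum("sa,sat,sat->s", pi, P, R)
    v = np.linalg.solve(np.eye(len(mu)) - gamma * P_pi, r_pi)
    return float(mu @ v)


def check(seed=0, n_states=4, n_actions=3, gamma=0.9):
    rng = np.random.default_rng(seed)
    P = random_stochastic(rng, n_states, n_actions, n_states)  # M(s'|s,a)
    R = rng.normal(size=(n_states, n_actions, n_states))
    mu = random_stochastic(rng, n_states)
    f = rng.normal(size=n_states)  # arbitrary function of the state
    pi_old = random_stochastic(rng, n_states, n_actions)
    pi_new = random_stochastic(rng, n_states, n_actions)

    d_old = state_distribution(pi_old, P, mu, gamma)
    d_new = state_distribution(pi_new, P, mu, gamma)

    # Expectation over (s, a, s') written out as an explicit weighted sum.
    joint = d_new[:, None, None] * pi_new[:, :, None] * P
    triple = float((joint * delta_f(R, f, gamma)).sum())
    # The same quantity as the inner product <d^{pi'}, mean delta>.
    inner = float(d_new @ mean_delta(pi_new, P, R, f, gamma))

    # Both sides of equation (17).
    lhs = expected_return(pi_new, P, R, mu, gamma) - expected_return(
        pi_old, P, R, mu, gamma
    )
    rhs = (inner - float(d_old @ mean_delta(pi_old, P, R, f, gamma))) / (1.0 - gamma)
    return triple, inner, lhs, rhs
```

Calling `check()` returns the triple expectation, the inner product, and the two sides of (17); under the stated assumptions the first pair and the second pair should each agree up to floating-point error, for any choice of `f`.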
......