Commit 0c4e8fa9, author: xiaowei_xing

test

Parent 31a456b1
......@@ -467,4 +467,17 @@ $$
$$
\frac{1}{1-\gamma} L_{\pi}(\pi') - \frac{4\epsilon\gamma}{(1-\gamma)^2}\alpha^2 \leq V^{\pi'}-V^{\pi} \leq \frac{1}{1-\gamma} L_{\pi}(\pi') + \frac{4\epsilon\gamma}{(1-\gamma)^2}\alpha^2,
\tag{10}
$$
where
$$
L_{\pi}(\pi') = \mathbb{E}_{s\sim d^{\pi},a\sim\pi(\cdot|s)} \left[\frac{\pi'(a|s)}{\pi(a|s)} A^{\pi}(s,a)\right],
$$
$$
\epsilon = \mathop{\max}_{s,a} |A^{\pi}(s,a)|,
$$
$$
\alpha = \mathop{\max}_{s} D_{TV}(\pi(\cdot|s)\lVert \pi'(\cdot|s)).
$$
\ No newline at end of file
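
As a sanity check on bound (10), the sketch below evaluates every term on a tiny tabular MDP. It is only an illustration under stated assumptions: the transition table `P`, rewards `R`, start distribution `MU`, and both policies are made-up numbers, all function names are hypothetical, $V^{\pi}$ is read as the expected discounted return from `MU`, $d^{\pi}$ is taken to be the normalized discounted state visitation, and $D_{TV}$ is half the L1 distance between action distributions.

```python
# Toy check of bound (10). All numbers below are invented for illustration.
GAMMA = 0.9
N_S, N_A = 2, 2
P = [[[0.9, 0.1], [0.2, 0.8]],   # P[s][a][s_next]
     [[0.7, 0.3], [0.1, 0.9]]]
R = [[1.0, 0.0], [0.0, 2.0]]     # R[s][a]
MU = [0.5, 0.5]                  # assumed start-state distribution


def q_value(v, s, a):
    """One-step lookahead: r(s,a) + gamma * sum_s' P(s'|s,a) V(s')."""
    return R[s][a] + GAMMA * sum(P[s][a][t] * v[t] for t in range(N_S))


def state_values(pi, iters=2000):
    """Iterative policy evaluation of V^pi(s)."""
    v = [0.0] * N_S
    for _ in range(iters):
        v = [sum(pi[s][a] * q_value(v, s, a) for a in range(N_A))
             for s in range(N_S)]
    return v


def visitation(pi, iters=2000):
    """d^pi(s) = (1 - gamma) * sum_t gamma^t Pr(s_t = s), truncated at iters."""
    p_t, d, w = MU[:], [0.0] * N_S, 1.0 - GAMMA
    for _ in range(iters):
        d = [d[s] + w * p_t[s] for s in range(N_S)]
        p_t = [sum(p_t[s] * pi[s][a] * P[s][a][t]
                   for s in range(N_S) for a in range(N_A))
               for t in range(N_S)]
        w *= GAMMA
    return d


def expected_return(pi):
    """Scalar V^pi: start-state average of V^pi(s)."""
    v = state_values(pi)
    return sum(MU[s] * v[s] for s in range(N_S))


def bounds(pi, pi_new):
    """Lower and upper sides of (10) for the pair (pi, pi_new)."""
    v, d = state_values(pi), visitation(pi)
    adv = [[q_value(v, s, a) - v[s] for a in range(N_A)] for s in range(N_S)]
    # E_{a~pi}[pi'/pi * A] equals sum_a pi'(a|s) A(s,a) when pi(a|s) > 0.
    surrogate = sum(d[s] * pi_new[s][a] * adv[s][a]
                    for s in range(N_S) for a in range(N_A))
    eps = max(abs(x) for row in adv for x in row)
    alpha = max(0.5 * sum(abs(pi[s][a] - pi_new[s][a]) for a in range(N_A))
                for s in range(N_S))
    penalty = 4 * eps * GAMMA / (1 - GAMMA) ** 2 * alpha ** 2
    center = surrogate / (1 - GAMMA)
    return center - penalty, center + penalty


PI = [[0.5, 0.5], [0.5, 0.5]]
PI_NEW = [[0.6, 0.4], [0.4, 0.6]]
lower, upper = bounds(PI, PI_NEW)
gap = expected_return(PI_NEW) - expected_return(PI)
print(lower, gap, upper)
```

The penalty term scales with $\alpha^2/(1-\gamma)^2$, so for $\gamma$ close to 1 the interval is typically wide unless $\pi'$ stays very close to $\pi$.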