提交 7ee13176 编写于 作者: X xiaowei_xing

test

上级 a3b14ee8
......@@ -325,4 +325,8 @@ $$
$$
\nabla_{\theta} \log \pi_{\theta}(a|s) = \nabla_{\theta} [\phi(s,a)^{\text{T}}\theta - \log \sum_{a'}e^{\phi(s,a')^{\text{T}}\theta}]
$$
$$
\phi(s,a) - \frac{1}{\sum_{a'}e^{\phi(s,a')^{\text{T}}\theta}} \nabla_{\theta} \sum_{a'}e^{\phi(s,a')^{\text{T}}\theta}
$$
\ No newline at end of file
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册