提交 bcb23bce 编写于 作者: X xiaowei_xing

test

上级 37e345b0
......@@ -235,6 +235,6 @@ $$
式(6)可被表示为:
$$
\nabla_{\theta}V(\theta)=\nabla_{\theta}\mathbb{E}_ {\tau \sim \pi_{\theta}}[R(\tau)\sum_{t=0}^{T-1}\nabla_{\theta}\log \pi_{\theta}(a_t|s_t)]。
\nabla_{\theta}V(\theta) = \nabla_{\theta} \mathbb{E}_ {\tau \sim \pi_{\theta}}[R(\tau)] = \mathbb{E}_ {\tau \sim \pi_{\theta}}[R(\tau)\sum_{t=0}^{T-1}\nabla_{\theta}\log \pi_{\theta}(a_t|s_t)]。
\tag{11}
$$
\ No newline at end of file
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册