diff --git a/docs/10.md b/docs/10.md
index 95fb0c54647a250903a50a1f84dea9163fd94290..e1eccf46b6303d294eaa5132c595dc877433204f 100644
--- a/docs/10.md
+++ b/docs/10.md
@@ -84,4 +84,8 @@ $$
 
 $$
 = \mathbb{E}_ {\tau\sim\pi_{\theta}} [\sum_{t=1}^{T}(\nabla_{\theta}(\log\pi_{\theta}(a_t|s_t))(\sum_{t=1}^{T}\gamma^t r(s_t,a_t)))]
+$$
+
+$$
+\approx \frac{1}{N} \sum_{i=1}^{N} \sum_{t=1}^{T} (\nabla_{\theta} (\log\pi_{\theta}(a_{i,t}|s_{i,t}))(\sum_{t=1}^{T}\gamma^t r(s_{i,t},a_{i,t})))
 $$
\ No newline at end of file
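
The line added in this diff replaces the expectation over trajectories with an average over N sampled trajectories. A minimal Python sketch of that estimator follows, kept term-by-term in the form the equation uses: for each trajectory, the summed score function is multiplied by the full discounted return, with t running from 1 to T. The tabular softmax policy and the names `softmax`, `grad_log_softmax_policy`, and `policy_gradient_estimate` are assumptions made for illustration; they do not come from docs/10.md.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array of action preferences."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def grad_log_softmax_policy(theta, s, a):
    """Gradient of log pi_theta(a|s) for a tabular softmax policy (an assumed policy class).

    theta has shape (num_states, num_actions); only row s of the gradient is non-zero,
    and that row equals one_hot(a) - pi_theta(.|s).
    """
    grad = np.zeros_like(theta)
    grad[s] = -softmax(theta[s])
    grad[s, a] += 1.0
    return grad


def policy_gradient_estimate(theta, trajectories, gamma):
    """Sample-average estimate of the policy gradient.

    trajectories: list of N trajectories, each a list of (s, a, r) tuples for t = 1..T.
    Returns (1/N) * sum_i [ (sum_t grad log pi(a_it|s_it)) * (sum_t gamma^t r(s_it, a_it)) ].
    """
    theta = np.asarray(theta, dtype=float)
    total = np.zeros_like(theta)
    for traj in trajectories:
        # Sum of score-function terms over the time steps of this trajectory.
        score = np.zeros_like(theta)
        for s, a, _ in traj:
            score += grad_log_softmax_policy(theta, s, a)
        # Discounted return with t starting at 1, matching the equation's indexing.
        ret = sum(gamma ** t * r for t, (_, _, r) in enumerate(traj, start=1))
        total += score * ret
    return total / len(trajectories)
```

For example, calling `policy_gradient_estimate([[0.0, 0.0]], [[(0, 0, 1.0)]], 0.5)` uses one state, two actions, and a single one-step trajectory. The estimate grows with N only in accuracy, not in expectation, since each trajectory contributes an independent sample of the same quantity.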