updated

a3c9d007 · xiaowei_xing · a87aaead
8 changed files
--- a/docs/1.md
+++ b/docs/1.md
 # Lecture 1 Introduction to Reinforcement Learning

-# 课时1 强化学习介绍 2018.03.20
+# 课时1 强化学习介绍 2019.01.07

 ## 1. 引言


--- a/docs/10.md
+++ b/docs/10.md
+# Lecture 10 Advanced Policy Gradient
+
+# 课时10 高级策略梯度 2019.02.11
+
+## 1. 策略梯度的目标（Policy Gradient Objective）
\ No newline at end of file
--- a/docs/3.md
+++ b/docs/3.md
 # Lecture 3 Model Free Policy Evaluation: Policy Evaluation Without Knowing How the World Works

-# 课时3 无模型策略评估 2018.03.20
+# 课时3 无模型策略评估 2019.01.14

 ## 4. 无模型策略评估


--- a/docs/4.md
+++ b/docs/4.md
 # Lecture 4 Model Free Control

-# 课时4 无模型控制 2018.03.20
+# 课时4 无模型控制 2019.01.16

 ## 5. 无模型控制（Model Free Control）


--- a/docs/5.md
+++ b/docs/5.md
 # Lecture 5 Value Function Approximation

-# 课时4 值函数近似 2018.03.20
+# 课时4 值函数近似 2019.01.23

 ## 7. 介绍（Introduction）


--- a/docs/6.md
+++ b/docs/6.md
 # Lecture 6 CNNs and Deep Q-learning

-# 课时6 卷积神经网络与深度 Q-学习 2018.03.20
+# 课时6 卷积神经网络与深度 Q-学习 2019.01.28

 ## 7. 基于值的深度强化学习（Value-based Deep Reinforcement Learning）


--- a/docs/7.md
+++ b/docs/7.md
 # Lecture 7 Imitation Learning

-# 课时7 模仿学习 2018.03.20
+# 课时7 模仿学习 2019.01.30

 ## 8. 介绍（Introduction）


--- a/docs/8&9.md
+++ b/docs/8&9.md
 # Lecture 8&9 Policy Gradient

-# 课时8&9 策略梯度 2018.03.20
+# 课时8&9 策略梯度 2019.02.04 & 2019.02.06

 ## 1. 策略搜索介绍（Introduction to Policy Search）