Commit 81d7c73e, authored by zenghsh3

add rough training time

Parent c1f95cd2
...@@ -13,7 +13,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor ...@@ -13,7 +13,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
### [Atari games introduction](https://gym.openai.com/envs/#atari) ### [Atari games introduction](https://gym.openai.com/envs/#atari)
### Pong game result ### Pong game result
The average game rewards that can be obtained for the three models as the number of training steps changes during the training are as follows: The average game rewards that can be obtained for the three models as the number of training steps changes during the training are as follows(about 3 hours/1 Million steps):
![DQN result](assets/dqn.png) ![DQN result](assets/dqn.png)
## How to use ## How to use
......
...@@ -11,7 +11,7 @@ ...@@ -11,7 +11,7 @@
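### [Atari games introduction](https://gym.openai.com/envs/#atari) ### [Atari games introduction](https://gym.openai.com/envs/#atari)
### Pong game training results ### Pong game training results
The average game reward obtained by the three models as the number of training steps changes during training is shown in the figure below: The average game reward obtained by the three models as the number of training steps changes during training is shown in the figure below (roughly 3 hours per 1 million steps)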
![DQN result](assets/dqn.png) ![DQN result](assets/dqn.png)
## Usage tutorial ## Usage tutorial
......