diff --git a/fluid/DeepQNetwork/README.md b/fluid/DeepQNetwork/README.md
index 54534a368df76d6c478a78b897491da432c35009..47d97ae487c3437f6b5c87d8ba7d1753f3929d77 100644
--- a/fluid/DeepQNetwork/README.md
+++ b/fluid/DeepQNetwork/README.md
@@ -13,7 +13,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
 ### [Atari games introduction](https://gym.openai.com/envs/#atari)
 
 ### Pong game result
-The average game rewards that can be obtained for the three models as the number of training steps changes during the training are as follows:
+The average game rewards obtained by the three models as the number of training steps increases are shown below (about 3 hours per 1 million steps):
 
 ![DQN result](assets/dqn.png)
 ## How to use
diff --git a/fluid/DeepQNetwork/README_cn.md b/fluid/DeepQNetwork/README_cn.md
index f6a15b85c8c235f7d426793c4a420b50e015639e..dcbd8e9e6f8fdc370bcf82254effbf10e4450a69 100644
--- a/fluid/DeepQNetwork/README_cn.md
+++ b/fluid/DeepQNetwork/README_cn.md
@@ -11,7 +11,7 @@
 ### [Atari游戏介绍](https://gym.openai.com/envs/#atari)
 
 ### Pong游戏训练结果
-三个模型在训练过程中随着训练步数的变化,能得到的平均游戏奖励如下图所示:
+三个模型在训练过程中随着训练步数的变化,能得到的平均游戏奖励如下图所示(大约每一百万步需要3小时):
 
 ![DQN result](assets/dqn.png)
 ## 使用教程