From 81d7c73e68942fb58dcbafe9c527adb4005cdf56 Mon Sep 17 00:00:00 2001
From: zenghsh3
Date: Thu, 28 Jun 2018 11:26:00 +0800
Subject: [PATCH] add rough training time

---
 fluid/DeepQNetwork/README.md    | 2 +-
 fluid/DeepQNetwork/README_cn.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fluid/DeepQNetwork/README.md b/fluid/DeepQNetwork/README.md
index 54534a36..47d97ae4 100644
--- a/fluid/DeepQNetwork/README.md
+++ b/fluid/DeepQNetwork/README.md
@@ -13,7 +13,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
 ### [Atari games introduction](https://gym.openai.com/envs/#atari)
 
 ### Pong game result
-The average game rewards that can be obtained for the three models as the number of training steps changes during the training are as follows:
+The average game rewards that can be obtained for the three models as the number of training steps changes during the training are as follows (about 3 hours per 1 million steps):
 
 ![DQN result](assets/dqn.png)
 ## How to use
diff --git a/fluid/DeepQNetwork/README_cn.md b/fluid/DeepQNetwork/README_cn.md
index f6a15b85..dcbd8e9e 100644
--- a/fluid/DeepQNetwork/README_cn.md
+++ b/fluid/DeepQNetwork/README_cn.md
@@ -11,7 +11,7 @@
 ### [Atari游戏介绍](https://gym.openai.com/envs/#atari)
 
 ### Pong游戏训练结果
-三个模型在训练过程中随着训练步数的变化,能得到的平均游戏奖励如下图所示:
+三个模型在训练过程中随着训练步数的变化,能得到的平均游戏奖励如下图所示(大概3小时每1百万步):
 
 ![DQN result](assets/dqn.png)
 ## 使用教程
-- 
GitLab