Commit 0cae3d05, authored by zenghsh3

Update README

Parent de84714e
[Chinese version](README_cn.md)
## Reproduce the DQN, DoubleDQN, and DuelingDQN models with the Fluid version of PaddlePaddle
This repository reproduces the DQN family of deep reinforcement learning models with Fluid, PaddlePaddle's next-generation API, and reaches the same level of performance as the papers on classic Atari games. The model takes game images as input and, end to end, directly predicts the next action to take. The repository contains the following three types of models.
+ DQN in
[Human-level Control Through Deep Reinforcement Learning](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
@@ -9,13 +9,14 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
+ DuelingDQN in:
[Dueling Network Architectures for Deep Reinforcement Learning](http://proceedings.mlr.press/v48/wangf16.html)
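
The three model types differ mainly in how Q-values are formed or bootstrapped. The following is a minimal pure-Python sketch of those differences as described in the respective papers; it is not code from this repository, and the function names, the `gamma` default, and the list-based Q-value representation are all assumptions made for illustration.

```python
# Illustrative sketch only; names and defaults are hypothetical,
# not taken from this repository.


def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN: bootstrap from the largest target-network Q-value."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, done=False):
    """DoubleDQN: the online network picks the next action,
    the target network supplies that action's value."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """DuelingDQN: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / float(len(advantages))
    return [state_value + adv - mean_adv for adv in advantages]
```

In a real network these quantities are computed on batched tensors; plain lists are used here only to keep the arithmetic visible.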
## Atari benchmark & performance
### [Atari games introduction](https://gym.openai.com/envs/#atari)
### Pong game result
The figure below shows the average game reward each of the three models obtains as the number of training steps increases:
![DQN result](assets/dqn.png)
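
The README does not say how the average reward in the curve is computed. One common approach is a running mean over the most recent episodes, sketched below; the class name `RunningMeanReward` and the window size are hypothetical and not taken from this repository.

```python
from collections import deque


class RunningMeanReward(object):
    """Tracks the mean reward over the last `window` episodes.

    Hypothetical helper for illustration; the window size is an assumption.
    """

    def __init__(self, window=100):
        # A bounded deque drops the oldest episode automatically.
        self.rewards = deque(maxlen=window)

    def add(self, episode_reward):
        """Record one finished episode and return the current mean."""
        self.rewards.append(episode_reward)
        return sum(self.rewards) / float(len(self.rewards))
```

Calling `add` once per finished episode and logging the returned value against the training step count gives a curve of the kind shown above.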
## How to use
### Dependencies:
+ python2.7
+ gym
......
## Reproduce the DQN, DoubleDQN, and DuelingDQN models with the Fluid version of PaddlePaddle
Based on Fluid, PaddlePaddle's next-generation API, this repository reproduces the DQN models from deep reinforcement learning and matches the metrics reported in the papers on classic Atari games. The model takes game images as input and uses an end-to-end model to directly predict the control signal to execute next. The repository contains the following three types of models.
+ DQN model:
[Human-level Control Through Deep Reinforcement Learning](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
@@ -7,13 +7,14 @@
+ DuelingDQN model:
[Dueling Network Architectures for Deep Reinforcement Learning](http://proceedings.mlr.press/v48/wangf16.html)
## Model performance: Atari games
### [Atari games introduction](https://gym.openai.com/envs/#atari)
### Pong training result
The figure below shows the average game reward the three models obtain as the number of training steps changes during training:
![DQN result](assets/dqn.png)
## Tutorial
### Dependencies:
+ python2.7
+ gym
@@ -55,3 +56,4 @@ python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/
```
# Play the game with visualization
python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/DQN-pong --viz 0.01
```