Commit d8ec4424, authored by zenghsh3

Update README of English version

Parent e11353f9
[Chinese version](README_cn.md)

# Reproduce DQN, DoubleDQN, DuelingDQN model with Fluid version of PaddlePaddle
This repository uses Fluid, PaddlePaddle's next-generation API, to reproduce the DQN family of deep reinforcement learning models, and reaches results on classic Atari games at the same level as those reported in the papers. The model takes the game image as input and predicts the next control signal to execute, end to end. The repository contains the following three models.
+ DQN in:
[Human-level Control Through Deep Reinforcement Learning](http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
+ DoubleDQN in:
[Deep Reinforcement Learning with Double Q-Learning](https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12389)
+ DuelingDQN in:
[Dueling Network Architectures for Deep Reinforcement Learning](http://proceedings.mlr.press/v48/wangf16.html)
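
The three models differ mainly in how the learning target and the Q-value head are computed. The sketch below illustrates those differences as described in the linked papers; it is not taken from this repository's code. The function names, the plain-list inputs, and the default `gamma=0.99` are assumptions made for illustration.

```python
def dqn_target(reward, next_q_target, gamma=0.99, terminal=False):
    """DQN target: bootstrap from the target network's largest Q-value."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, terminal=False):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if terminal:
        return reward
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


if __name__ == "__main__":
    # Hypothetical Q-values for a three-action game.
    q_online = [1.0, 3.0, 2.0]
    q_target = [4.0, 0.5, 1.0]
    print(dqn_target(1.0, q_target, gamma=0.5))
    print(double_dqn_target(1.0, q_online, q_target, gamma=0.5))
    print(dueling_q_values(2.0, [1.0, -1.0, 0.0]))
```

In this toy example the DQN target follows the target network's own maximum, while the Double DQN target uses the action preferred by the online network, which is the mechanism the Double Q-learning paper proposes to reduce overestimation.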
# Atari benchmark & performance
## [Atari games introduction](https://gym.openai.com/envs/#atari)
+ Pong game result
![DQN result](assets/dqn.png)
# How to use
+ Dependencies:
+ python2.7
+ gym
+ tqdm
@@ -22,36 +24,36 @@
+ paddlepaddle-gpu>=0.12.0
+ ale_python_interface
+ Install Dependencies:
+ Install PaddlePaddle
Compiling and installing PaddlePaddle from source is recommended.
+ Install other dependencies:
```
pip install -r requirement.txt
pip install gym[atari]
```
To install ale_python_interface, see https://github.com/mgbellemare/Arcade-Learning-Environment
+ Start Training:
```
# Train a model for Pong on GPU (DQN is the default model)
python train.py --rom ./rom_files/pong.bin --use_cuda
# Train a model for Pong with DoubleDQN
python train.py --rom ./rom_files/pong.bin --use_cuda --alg DoubleDQN
# Train a model for Pong with DuelingDQN
python train.py --rom ./rom_files/pong.bin --use_cuda --alg DuelingDQN
```
To train on more games, download additional ROM files from [here](https://github.com/openai/atari-py/tree/master/atari_py/atari_roms)
+ Start Testing:
```
# Play the game with the best saved model and calculate the average reward
python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/DQN-pong
# Play the game with visualization
python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/DQN-pong --viz 0.01
```
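
When several models or games are trained in turn, the commands above can be assembled programmatically. The following is a minimal sketch that only builds the argument lists, using the scripts and flags shown in this README (`train.py`, `play.py`, `--rom`, `--use_cuda`, `--alg`, `--model_path`, `--viz`). The helper names are hypothetical, and it assumes that `--use_cuda` is a switch that can simply be left out for a CPU run.

```python
def build_train_command(rom, alg=None, use_cuda=True):
    """Build the argument list for train.py.

    alg=None leaves --alg out, so the script falls back to its default (DQN).
    """
    cmd = ["python", "train.py", "--rom", rom]
    if use_cuda:
        cmd.append("--use_cuda")
    if alg is not None:
        cmd += ["--alg", alg]
    return cmd


def build_play_command(rom, model_path, use_cuda=True, viz=None):
    """Build the argument list for play.py; viz is the value passed to --viz."""
    cmd = ["python", "play.py", "--rom", rom]
    if use_cuda:
        cmd.append("--use_cuda")
    cmd += ["--model_path", model_path]
    if viz is not None:
        cmd += ["--viz", str(viz)]
    return cmd


if __name__ == "__main__":
    rom = "./rom_files/pong.bin"
    # Print one training command per model listed in this README.
    for alg in (None, "DoubleDQN", "DuelingDQN"):
        print(" ".join(build_train_command(rom, alg=alg)))
    print(" ".join(build_play_command(rom, "./saved_model/DQN-pong", viz=0.01)))
```

Each returned list can be passed to `subprocess.check_call` from the repository root to actually start a run.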