Commit 4a9aed3d, authored by zenghsh3

update README

Parent 2448dd5e
......@@ -16,7 +16,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
![DQN result](assets/dqn.png)
# How to use
+ Dependencies:
### Dependencies:
+ python2.7
+ gym
+ tqdm
......@@ -24,7 +24,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
+ paddlepaddle-gpu>=0.12.0
+ ale_python_interface
+ Install Dependencies:
### Install Dependencies:
+ Install PaddlePaddle:
It is recommended to compile and install PaddlePaddle from source
+ Install other dependencies:
......@@ -35,7 +35,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
To install ale_python_interface, refer to: https://github.com/mgbellemare/Arcade-Learning-Environment
+ Start Training:
### Start Training:
```
# To train a model for Pong game with gpu (use DQN model as default)
python train.py --rom ./rom_files/pong.bin --use_cuda
......@@ -49,7 +49,7 @@ Based on PaddlePaddle's next-generation API Fluid, the DQN model of deep reinfor
To train on more games, download additional ROM files from [here](https://github.com/openai/atari-py/tree/master/atari_py/atari_roms)
+ Start Testing:
### Start Testing:
```
# Play the game with saved best model and calculate the average rewards
python play.py --rom ./rom_files/pong.bin --use_cuda --model_path ./saved_model/DQN-pong
......
......@@ -14,7 +14,7 @@
![DQN result](assets/dqn.png)
# How to use
+ Dependencies:
### Dependencies:
+ python2.7
+ gym
+ tqdm
......@@ -22,7 +22,7 @@
+ paddlepaddle-gpu>=0.12.0
+ ale_python_interface
+ Install Dependencies:
### Install Dependencies:
+ Install PaddlePaddle:
It is recommended to compile and install PaddlePaddle from source
+ Install other dependencies:
......@@ -32,7 +32,7 @@
```
To install ale_python_interface, refer to: https://github.com/mgbellemare/Arcade-Learning-Environment
+ Train the model:
### Train the model:
```
# Train a model for the Pong game with GPU (uses the DQN model by default)
python train.py --rom ./rom_files/pong.bin --use_cuda
......@@ -46,7 +46,7 @@
To train on more games, download additional game ROMs from [here](https://github.com/openai/atari-py/tree/master/atari_py/atari_roms)
+ Test the model:
### Test the model:
```
# Play the game with saved model and calculate the average rewards
# Play the game with the best model saved during training and calculate the average rewards
......