diff --git a/README.cn.md b/README.cn.md
index 66b4d0596de46d1bb1342adb796cecd74a848861..2b5d1846405cb77f22d56b8b7f49a3a341192515 100644
--- a/README.cn.md
+++ b/README.cn.md
@@ -103,7 +103,7 @@ ans = agent.sum(1,5) # run remotely and not comsume any local computation resour
 
 # 安装:
 ### 依赖
 - Python 2.7 or 3.5+.
-- PaddlePaddle >=1.2.1 (**非必须的**,如果你只用并行部分的接口不需要安装paddle)
+- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) >=1.2.1 (**非必须的**,如果你只用并行部分的接口不需要安装paddle)
 
 ```
@@ -118,7 +118,7 @@ pip install parl
 - [IMPALA](examples/IMPALA/)
 - [A2C](examples/A2C/)
 - [GA3C](examples/GA3C/)
-- [NIPS2018强化学习假肢挑战赛冠军解决方案](examples/NeurIPS2018-AI-for-Prosthetics-Challenge/)
+- [冠军解决方案:NIPS2018强化学习假肢挑战赛](examples/NeurIPS2018-AI-for-Prosthetics-Challenge/)
 
 NeurlIPS2018 Half-Cheetah Breakout
diff --git a/README.md b/README.md
index 39a943d1cb7dd5a286e11aa5e98054421a8c4a59..f00aa4a122dbe3ab580aa23d79ab805d7b9f6b8c 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
 English | [简体中文](./README.cn.md)
 
-> PARL is a flexible and high-efficient reinforcement learning framework based on [PaddlePaddle](https://github.com/PaddlePaddle/Paddle).
+> PARL is a flexible and high-efficient reinforcement learning framework.
 
 # Features
 **Reproducible**. We provide algorithms that stably reproduce the result of many influential reinforcement learning algorithms.
@@ -28,7 +28,7 @@ The main abstractions introduced by PARL that are used to build an agent recursi
 `Algorithm` describes the mechanism to update parameters in `Model` and often contains at least one model.
 
 ### Agent
-`Agent`, a data bridge between environment and algorithm, is responsible for data I/O with the outside environment and describes data preprocessing before feeding data into the training process.
+`Agent`, a data bridge between the environment and the algorithm, is responsible for data I/O with the outside environment and describes data preprocessing before feeding data into the training process.
 
 Here is an example of building an agent with DQN algorithm for Atari games.
 ```python
@@ -106,7 +106,7 @@ For users, they can write code in a simple way, just like writing multi-thread c
 # Install:
 ### Dependencies
 - Python 2.7 or 3.5+.
-- PaddlePaddle >=1.2.1 (**Optional**, if you only want to use APIs related to parallelization alone)
+- [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) >=1.2.1 (**Optional**, if you only want to use APIs related to parallelization alone)
 
 ```
diff --git a/examples/A2C/README.md b/examples/A2C/README.md
index 27488b96c958829280616f5792c4d18bcc4eecbd..93fdb7ec2816c42aa54893db7d9832772b02a19d 100644
--- a/examples/A2C/README.md
+++ b/examples/A2C/README.md
@@ -20,7 +20,7 @@ Mean episode reward in training process after 10 million sample steps.
 + [paddlepaddle>=1.3.0](https://github.com/PaddlePaddle/Paddle)
 + [parl](https://github.com/PaddlePaddle/PARL)
 + gym
-+ atari_py
++ atari-py
 
 ### Distributed Training
 
diff --git a/examples/DQN/README.md b/examples/DQN/README.md
index b3fda95a71f0757c566d35bc5e2289098600b504..b6d785609fd3911e44130449cdf7c07e1b97c901 100644
--- a/examples/DQN/README.md
+++ b/examples/DQN/README.md
@@ -20,7 +20,7 @@ Please see [here](https://gym.openai.com/envs/#atari) to know more about Atari g
 + [parl](https://github.com/PaddlePaddle/PARL)
 + gym
 + tqdm
-+ atari_py
++ atari-py
 + [ale_python_interface](https://github.com/mgbellemare/Arcade-Learning-Environment)
 
diff --git a/examples/GA3C/README.md b/examples/GA3C/README.md
index 919aae3a51b6f2938a73c69fccd8fea699f0583f..d7b8dbbf7065993230f541dcf55e2a535da21ea9 100644
--- a/examples/GA3C/README.md
+++ b/examples/GA3C/README.md
@@ -20,7 +20,7 @@ Results with one learner (in a P40 GPU) and 24 simulators (in 12 CPU) in 10 mill
 + [paddlepaddle>=1.3.0](https://github.com/PaddlePaddle/Paddle)
 + [parl](https://github.com/PaddlePaddle/PARL)
 + gym
-+ atari_py
++ atari-py
 
 ### Distributed Training
 
diff --git a/examples/IMPALA/README.md b/examples/IMPALA/README.md
index 35510d6d6f323f77bf4714de0a99d1b3c69fb385..b7a5c4dc6ac00e909963ed86f19d32458401154d 100644
--- a/examples/IMPALA/README.md
+++ b/examples/IMPALA/README.md
@@ -24,7 +24,7 @@ Result with one learner (in a P40 GPU) and 32 actors (in 32 CPUs).
 + [paddlepaddle>=1.3.0](https://github.com/PaddlePaddle/Paddle)
 + [parl](https://github.com/PaddlePaddle/PARL)
 + gym
-+ atari_py
++ atari-py
 
 ### Distributed Training: