diff --git a/.github/PARL-logo.png b/.github/PARL-logo.png
new file mode 100644
index 0000000000000000000000000000000000000000..ed9d5d50ae2261deb9b4f811522ca79c5c3ae3cd
Binary files /dev/null and b/.github/PARL-logo.png differ
diff --git a/.github/abstractions.png b/.github/abstractions.png
new file mode 100644
index 0000000000000000000000000000000000000000..5f5dffdcd9f1749eb2a03dc07ef249e8a02d18bd
Binary files /dev/null and b/.github/abstractions.png differ
diff --git a/README.md b/README.md
index 27bd133cc8921af7c1af99ad0c2ff870bef2c72f..5476837d07889f88af3b746b8d86cf85e11b4fd6 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,64 @@
-# PARL
-**Pa**ddlePaddle **R**einforcement **L**earning Framework
+
+
+
+
+# Features
+**Reproducible**. We provide algorithms that stably reproduce the results of many influential reinforcement learning papers.
+
+**Large Scale**. Supports high-performance parallel training with thousands of CPUs and multiple GPUs.
+
+**Reusable**. Algorithms provided in the repository can be directly adapted to a new task by defining a forward network; the training mechanism is built automatically.
+
+**Extensible**. Build new algorithms quickly by inheriting the abstract classes in the framework.
+
+
+# Abstractions
+
+PARL aims to build an agent for training algorithms to perform complex tasks.
+The main abstractions introduced by PARL, used to build an agent recursively, are the following:
+
+### Model
+`Model` is abstracted to construct the forward network, which defines a policy network or critic network given a state as input.
+
+### Algorithm
+`Algorithm` describes the mechanism to update the parameters in `Model` and often contains at least one model.
+
+### Agent
+`Agent` is a data bridge between the environment and the algorithm. It is responsible for data I/O with the outside world and describes data preprocessing before feeding data into the training process.
+
+Here is an example of building an agent with the DQN algorithm for Atari games.
+```python
+import parl
+from parl.algorithms import DQN, DDQN
+
+class CriticModel(parl.Model):
+    """ define the specific forward model for the environment ..."""
+
+"""
+three steps to build an agent
+  1. define a forward model, which is critic_model in this example
+  2. a. to build a DQN algorithm, just pass the critic_model to `DQN`
+     b. to build a DDQN algorithm, just replace DQN in the following line with DDQN
+  3. define the I/O part in AtariAgent so that it can update the algorithm based on the interaction data
+"""
+
+critic_model = CriticModel(act_dim=2)
+algorithm = DQN(critic_model)
+agent = AtariAgent(algorithm)
+```
+
+# Install:
+### Dependencies
+- Python 2.7 or 3.5+.
+- PaddlePaddle >=1.0 (we try to keep the repository compatible with the newest version of PaddlePaddle)
+
+
+```
+pip install --upgrade git+https://github.com/PaddlePaddle/PARL.git
+```
+
+# Examples
+
+- DQN
+- DDPG
+- PPO
+- Winning Solution for NIPS2018: AI for Prosthetics Challenge
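
Review note: the Model → Algorithm → Agent layering that the added README describes can be illustrated with a plain-Python mock-up. This is only a sketch of the layering idea, not the real `parl` API — `TableModel`, `SimpleAlgorithm`, and `SimpleAgent` are hypothetical names invented for the example:

```python
# Illustrative sketch of the three-layer abstraction (hypothetical classes,
# not the real parl API): a Model holds the forward computation, an Algorithm
# owns a Model and defines how it updates, and an Agent handles data I/O
# around the Algorithm.

class TableModel:
    """Model: the 'forward network' -- here just a per-action value table."""
    def __init__(self, act_dim):
        self.values = [0.0] * act_dim

    def forward(self, obs):
        return self.values


class SimpleAlgorithm:
    """Algorithm: contains a model and describes how its parameters update."""
    def __init__(self, model, lr=0.5):
        self.model = model
        self.lr = lr

    def predict(self, obs):
        vals = self.model.forward(obs)
        return vals.index(max(vals))

    def learn(self, obs, action, reward):
        # Move the chosen action's value toward the observed reward.
        v = self.model.values[action]
        self.model.values[action] = v + self.lr * (reward - v)


class SimpleAgent:
    """Agent: bridges environment data and the algorithm."""
    def __init__(self, algorithm):
        self.algorithm = algorithm

    def step(self, obs):
        return self.algorithm.predict(obs)

    def train(self, obs, action, reward):
        self.algorithm.learn(obs, action, reward)


model = TableModel(act_dim=2)
agent = SimpleAgent(SimpleAlgorithm(model))
for _ in range(10):
    agent.train(obs=None, action=1, reward=1.0)  # action 1 always pays off
print(agent.step(obs=None))  # prints 1: action 1 is now preferred
```

Swapping `SimpleAlgorithm` for another update rule without touching `TableModel` or `SimpleAgent` mirrors the README's claim that a DQN agent becomes a DDQN agent by changing only the algorithm line.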