Unverified commit 452050a0, authored by Bo Zhou, committed by GitHub

add some introduction for our parallelization feature (#61)

* Update remote_decorator.py

* Update README.md

* add a figure for the demonstration about parallelization

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* add a link to IMPALA
Parent b28289ac
@@ -5,9 +5,9 @@
> PARL is a flexible and high-efficient reinforcement learning framework based on [PaddlePaddle](https://github.com/PaddlePaddle/Paddle).
# Features
-**Reproducible**. We provide algorithms that stably reproduce the result of many influential reinforcement learning algorithms
+**Reproducible**. We provide algorithms that stably reproduce the result of many influential reinforcement learning algorithms.
-**Large Scale**. Ability to support high performance parallelization of training with thousands of CPUs and multi-GPUs
+**Large Scale**. Ability to support high performance parallelization of training with thousands of CPUs and multi-GPUs.
**Reusable**. Algorithms provided in repository could be directly adapted to a new task by defining a forward network and training mechanism will be built automatically.
@@ -44,6 +44,7 @@ class AtariModel(parl.Model):
                stride=1, padding=2, act='relu')
        ...
        self.fc1 = layers.fc(action_dim)
    def value(self, img):
        # define how to estimate the Q value based on the image of atari games.
        img = img / 255.0
@@ -64,6 +65,42 @@ algorithm = DQN(model)
agent = AtariAgent(algorithm)
``` ```
# Parallelization
PARL provides a compact API for distributed training, allowing users to convert their code into a parallelized version by simply adding a decorator.
Here is a `Hello World!` example that demonstrates how easy it is to leverage external computation resources.
```python
#============Agent.py=================
import parl

@parl.remote_class
class Agent(object):
    def say_hello(self):
        print("Hello World!")

    def sum(self, a, b):
        return a + b

# Launch `Agent.py` on any computation platform, such as a CPU cluster.
if __name__ == '__main__':
    agent = Agent()
    # `server_address` is a placeholder for the address of the machine running `Server.py`.
    agent.as_remote(server_address)

#============Server.py=================
import parl

remote_manager = parl.RemoteManager()
agent = remote_manager.get_remote()
agent.say_hello()
ans = agent.sum(1, 5)  # runs remotely and does not consume any local computation resources
```
Using external computation resources takes two steps:
1. Decorate a class with `parl.remote_class`. The decorated class becomes a new class whose instances can run on other CPUs or machines.
2. Get remote objects from the `RemoteManager`. These objects expose the same functions as the real ones, but calling any of them **does not** consume local computation resources, since the calls are executed elsewhere.
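The two steps above rely on a proxy idea: the caller holds an object that looks like the real one, while the work happens somewhere else. The following is a minimal sketch of that idea only, not PARL's implementation. `RemoteProxy` and the `where` method are hypothetical names introduced for illustration, and a single worker thread stands in for the remote machine (Python 3 standard library only).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class RemoteProxy(object):
    """Hypothetical stand-in for a remote object: forwards calls to a worker."""

    def __init__(self, target):
        self._target = target
        # A single worker thread plays the role of the remote machine.
        self._worker = ThreadPoolExecutor(max_workers=1)

    def __getattr__(self, name):
        # Only invoked for attributes that are not found on the proxy itself.
        method = getattr(self._target, name)

        def call(*args, **kwargs):
            # Hand the call to the worker and block until the result is back.
            return self._worker.submit(method, *args, **kwargs).result()

        return call

    def close(self):
        self._worker.shutdown()


class Agent(object):
    def sum(self, a, b):
        return a + b

    def where(self):
        # Report which thread actually executed the call.
        return threading.current_thread().name


proxy = RemoteProxy(Agent())
total = proxy.sum(1, 5)
ran_elsewhere = proxy.where() != threading.current_thread().name
proxy.close()
```

The caller uses `proxy.sum(1, 5)` exactly as it would use the real `Agent`, which is the property the decorator-based API aims for. A real distributed setup would replace the worker thread with network communication.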
<img src=".github/decorator.png" alt="PARL" width="450"/>
As shown in the figure above, the real actors (orange circles) run on the CPU cluster, while the learner (blue circle) runs on the local GPU together with several remote actors (yellow circles with dotted edges).
Users can write code in a simple way, much like writing multi-threaded code, except that the actors consume remote resources. We also provide examples of parallelized algorithms such as IMPALA, A2C, and GA3C. For more usage details, please refer to these examples.
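To make the multi-threading analogy concrete, here is a rough sketch of a learner gathering samples from several actors in parallel. It is an illustration under stated assumptions rather than code from the PARL examples: `Actor`, `sample`, and `collect` are hypothetical names, and the actors are plain local objects where PARL would supply remote ones.

```python
from concurrent.futures import ThreadPoolExecutor


class Actor(object):
    """Hypothetical actor; with PARL this would be a remote object."""

    def __init__(self, actor_id):
        self.actor_id = actor_id

    def sample(self, steps):
        # Pretend to interact with an environment; return (actor_id, step) pairs.
        return [(self.actor_id, t) for t in range(steps)]


def collect(actors, steps):
    # One thread per actor; each thread simply waits for its actor's result.
    with ThreadPoolExecutor(max_workers=len(actors)) as pool:
        futures = [pool.submit(actor.sample, steps) for actor in actors]
        batches = [f.result() for f in futures]
    # Flatten the per-actor batches into a single batch for the learner.
    return [item for batch in batches for item in batch]


actors = [Actor(i) for i in range(3)]
batch = collect(actors, steps=4)
```

Because each thread only waits on its actor, the local machine stays mostly idle during sampling when the actors are remote, leaving its resources to the learner.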
# Install:
### Dependencies
- Python 2.7 or 3.5+.
......
@@ -32,7 +32,7 @@ Class Simulator(object):
...
sim = Simulator()
-sim.as_remote(server_ip='172.18.202.45', port=8001)
+sim.as_remote(server_ip='172.18.202.45', server_port=8001)
"""
......