## Reproduce MADDPG with PARL
This example reproduces MADDPG, a multi-agent deep reinforcement learning algorithm, using PARL.

> Paper: MADDPG in [Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments](https://arxiv.org/abs/1706.02275)

### Multi-agent particle environment introduction
A simple multi-agent particle world based on gym. See [here](https://github.com/openai/multiagent-particle-envs) for installation instructions and more details about the environment.

### Benchmark result
Mean episode reward (averaged over every 1000 episodes) during training (25000 episodes in total).

<table>
<tr>
<td>
simple<br>
<img src=".benchmark/MADDPG_simple.png"                  width = "170" height = "170" alt="MADDPG_simple"/>
</td>
<td>
simple_adversary<br>
<img src=".benchmark/MADDPG_simple_adversary.png"        width = "170" height = "170" alt="MADDPG_simple_adversary"/>
</td>
<td>
simple_push<br>
<img src=".benchmark/MADDPG_simple_push.png"             width = "170" height = "170" alt="MADDPG_simple_push"/>
</td>
<td>
simple_reference<br>
<img src=".benchmark/MADDPG_simple_reference.png"        width = "170" height = "170" alt="MADDPG_simple_reference"/>
</td>
</tr>
<tr>
<td>
simple_speaker_listener<br>
<img src=".benchmark/MADDPG_simple_speaker_listener.png" width = "170" height = "170" alt="MADDPG_simple_speaker_listener"/>
</td>
<td>
simple_spread<br>
<img src=".benchmark/MADDPG_simple_spread.png"           width = "170" height = "170" alt="MADDPG_simple_spread"/>
</td>
<td>
simple_tag<br>
<img src=".benchmark/MADDPG_simple_tag.png"              width = "170" height = "170" alt="MADDPG_simple_tag"/>
</td>
<td>
simple_world_comm<br>
<img src=".benchmark/MADDPG_simple_world_comm.png"       width = "170" height = "170" alt="MADDPG_simple_world_comm"/>
</td>
</tr>
</table>
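
The curves above report the mean episode reward over each block of 1000 episodes. The sketch below shows one way such an aggregation could be computed. It is a minimal illustration, not the code used by `train.py`: the function name `mean_reward_per_window` and the toy reward list are assumptions made for this example.

```python
def mean_reward_per_window(episode_rewards, window=1000):
    """Average episode rewards over consecutive, non-overlapping windows.

    Hypothetical helper for illustration; the last window may be shorter
    than `window` if the episode count is not a multiple of it.
    """
    means = []
    for start in range(0, len(episode_rewards), window):
        chunk = episode_rewards[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


# Toy data standing in for 25000 per-episode rewards (not real results).
rewards = [float(i % 5) for i in range(25000)]
curve = mean_reward_per_window(rewards, window=1000)
print(len(curve))  # 25 points, one per 1000 episodes
```

With 25000 episodes and a window of 1000, this yields 25 points per curve.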

### Experiments result
Agent behavior after 25000 training episodes.

<table>
<tr>
<td>
simple<br>
<img src=".benchmark/MADDPG_simple.gif"                  width = "170" height = "170" alt="MADDPG_simple"/>
</td>
<td>
simple_adversary<br>
<img src=".benchmark/MADDPG_simple_adversary.gif"        width = "170" height = "170" alt="MADDPG_simple_adversary"/>
</td>
<td>
simple_push<br>
<img src=".benchmark/MADDPG_simple_push.gif"             width = "170" height = "170" alt="MADDPG_simple_push"/>
</td>
<td>
simple_reference<br>
<img src=".benchmark/MADDPG_simple_reference.gif"        width = "170" height = "170" alt="MADDPG_simple_reference"/>
</td>
</tr>
<tr>
<td>
simple_speaker_listener<br>
<img src=".benchmark/MADDPG_simple_speaker_listener.gif" width = "170" height = "170" alt="MADDPG_simple_speaker_listener"/>
</td>
<td>
simple_spread<br>
<img src=".benchmark/MADDPG_simple_spread.gif"           width = "170" height = "170" alt="MADDPG_simple_spread"/>
</td>
<td>
simple_tag<br>
<img src=".benchmark/MADDPG_simple_tag.gif"              width = "170" height = "170" alt="MADDPG_simple_tag"/>
</td>
<td>
simple_world_comm<br>
<img src=".benchmark/MADDPG_simple_world_comm.gif"       width = "170" height = "170" alt="MADDPG_simple_world_comm"/>
</td>
</tr>
</table>


## How to use
### Dependencies:
+ python3.5+
+ [paddlepaddle>=1.6.1](https://github.com/PaddlePaddle/Paddle)
+ [parl](https://github.com/PaddlePaddle/PARL)
+ [multiagent-particle-envs](https://github.com/openai/multiagent-particle-envs)
+ gym==0.10.5
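
Version constraints such as `paddlepaddle>=1.6.1` and `gym==0.10.5` have to be compared numerically, since a plain string comparison would rank `0.9.0` above `0.10.5`. The sketch below illustrates that comparison. The helpers `parse_version` and `meets_minimum` are hypothetical names introduced here, and the version strings passed to them are made-up inputs rather than values read from an installed environment.

```python
import sys


def parse_version(text):
    """Turn '1.6.1' into (1, 6, 1); anything after a non-digit is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (numeric comparison)."""
    return parse_version(installed) >= parse_version(minimum)


# Illustrative inputs only; these are not queried from installed packages.
print(meets_minimum("1.8.0", "1.6.1"))   # True
print(meets_minimum("0.9.0", "0.10.5"))  # False
print(sys.version_info >= (3, 5))        # interpreter check for python3.5+
```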

### Start Training:
```
# To train an agent for simple_speaker_listener scenario
python train.py

# To train on another scenario (the model is saved automatically every 1000 episodes)
# python train.py --env [ENV_NAME]

# To show the animation after training
# python train.py --env [ENV_NAME] --show --restore