PaddlePaddle / PARL
Commit 5be4ca00
Authored Dec 04, 2018 by Bo Zhou; committed Dec 04, 2018 by Hongsheng Zeng
Parent: b249dee3

Update README.md (#34)

a more detailed example for DQN model.

Showing 1 changed file with 20 additions and 4 deletions

README.md (+20 -4)
@@ -31,8 +31,24 @@ Here is an example of building an agent with DQN algorithm for atari games.
 import parl
 from parl.algorithms import DQN, DDQN
 
-class CriticModel(parl.Model):
-    """ define specific forward model for environment ..."""
+class AtariModel(parl.Model):
+    """AtariModel
+    This class defines the forward part for an algorithm,
+    its input is state observed on environment.
+    """
+    def __init__(self, img_shape, action_dim):
+        # define your layers
+        self.cnn1 = layers.conv_2d(num_filters=32, filter_size=5, stride=[1, 1], padding=[2, 2], act='relu')
+        ...
+        self.fc1 = layers.fc(action_dim)
+
+    def value(self, img):
+        # define how to estimate the Q value based on the image of atari games.
+        img = img / 255.0
+        l = self.cnn1(img)
+        ...
+        Q = self.fc1(l)
+        return Q
 """
 three steps to build an agent
 1. define a forward model which is critic_model is this example
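Note on the hunk above: the new `AtariModel` scales raw pixels with `img / 255.0` and then applies a convolution with `filter_size=5`, `stride=[1, 1]`, and `padding=[2, 2]`. The sketch below works through what those numbers imply, assuming `layers.conv_2d` follows the standard convolution output-size formula (the diff does not confirm this). The helper names `conv_output_size` and `normalize` are hypothetical and are not part of PARL.

```python
def conv_output_size(in_size: int, filter_size: int, stride: int, padding: int) -> int:
    """Output length along one spatial axis for a standard 2-D convolution."""
    return (in_size + 2 * padding - filter_size) // stride + 1


def normalize(pixels):
    """Scale raw 8-bit pixel values into [0, 1], mirroring `img / 255.0`."""
    return [p / 255.0 for p in pixels]


# Hyperparameters taken from the diff; the 32x32 input matches img_shape=(32, 32).
height = conv_output_size(32, filter_size=5, stride=1, padding=2)
width = conv_output_size(32, filter_size=5, stride=1, padding=2)
print(height, width)            # 32 32
print(normalize([0, 51, 255]))  # [0.0, 0.2, 1.0]
```

Under that assumption, a 5x5 filter with padding 2 and stride 1 keeps the spatial size unchanged, so only the channel count (32 filters) changes after `cnn1`.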
@@ -41,8 +57,8 @@ three steps to build an agent
 3. define the I/O part in AtariAgent so that it could update the algorithm based on the interactive data
 """
-critic_model = CriticModel(act_dim=2)
+model = AtariModel(img_shape=(32, 32), action_dim=4)
-algorithm = DQN(critic_model)
+algorithm = DQN(model)
 agent = AtariAgent(aglrotihm)
 ```
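Note on the hunk above: the unchanged context line `agent = AtariAgent(aglrotihm)` passes a name that is never defined in the snippet; it presumably should read `algorithm`, although this commit does not touch it. The sketch below illustrates the model, algorithm, agent wiring that the README describes, using plain stand-in classes. `StubModel`, `StubAlgorithm`, and `StubAgent` are hypothetical and are not PARL classes; the greedy `predict` method is an assumption about how a DQN-style algorithm would use the Q values returned by the model.

```python
class StubModel:
    """Hypothetical stand-in for AtariModel: returns one Q value per action."""

    def __init__(self, action_dim):
        self.action_dim = action_dim

    def value(self, obs):
        # Placeholder scores; a real model would run its conv and fc layers here.
        return [float(i) for i in range(self.action_dim)]


class StubAlgorithm:
    """Hypothetical stand-in for DQN: wraps a model and picks the greedy action."""

    def __init__(self, model):
        self.model = model

    def predict(self, obs):
        q_values = self.model.value(obs)
        # Index of the largest Q value.
        return max(range(len(q_values)), key=q_values.__getitem__)


class StubAgent:
    """Hypothetical stand-in for AtariAgent: the I/O layer around the algorithm."""

    def __init__(self, algorithm):
        self.algorithm = algorithm

    def act(self, obs):
        return self.algorithm.predict(obs)


model = StubModel(action_dim=4)       # step 1: forward model
algorithm = StubAlgorithm(model)      # step 2: algorithm built on the model
agent = StubAgent(algorithm)          # step 3: agent built on `algorithm`
print(agent.act(obs=None))            # 3
```

Each layer only holds a reference to the one below it, which is why a misspelled variable at the last step would fail with a `NameError` before any training starts.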