机器未来 / Paddle (forked from PaddlePaddle / Paddle)
Commit 4f5b5e28
Authored Feb 28, 2018 by Travis CI

Deploy to GitHub Pages: 2edeb639

Parent: b47e3d9f

Showing 6 changed files with 46 additions and 42 deletions (+46 -42)
develop/doc/_sources/design/parallel_do.md.txt (+11 -10)
develop/doc/design/parallel_do.html (+11 -10)
develop/doc/searchindex.js (+1 -1)
develop/doc_cn/_sources/design/parallel_do.md.txt (+11 -10)
develop/doc_cn/design/parallel_do.html (+11 -10)
develop/doc_cn/searchindex.js (+1 -1)
develop/doc/_sources/design/parallel_do.md.txt
@@ -39,15 +39,16 @@ In the backward pass
 This implementation allows to write mixed device program like this

 ```python
-# get embedding feature on CPU
-feature = some_cpu_only_op(data)
+W1 = fluid.tensor(size=[100,20], parameter=true)
+W2 = fluid.tensor(size=[20,15], parameter=true)

-gpu_places = get_place(use_gpu=True)
+data = layers.data()
+
+gpu_places = layers.get_place(use_gpu=True)
 # parallel processing on multiple GPUs
 pd = ParallelDo(gpu_places)
-with pd.do():
-    read_input(feature)
-    prediction = my_net(feature)
+with pd.do(input=data):
+    prediction = softmax(fc(fc(data, W1), W2))
     write_output(prediction)
 prediction = pd()
 loss = cross_entropy(prediction, label)
@@ -66,20 +67,20 @@ start_program
 main_program
 {
 block0 {
-  vars: data, places, w1, w2
+  vars: data, places, w1, w2, w1_grad, w2_grad,
   ops: data, get_place, parallel_do(block1),
        parallel_do_grad(block2),
        sgd(w2, w2_grad),
        sgd(w1, w1_grad)
 }
-block1 {
+block1 { # the forward pass
   parent_block: 0
   vars: data, h1, h2, loss
   ops: fc, fc, softmax
 }
-block2 {
+block2 { # the backward pass
   parent_block: 1
-  vars: data_grad, h1_grad, h2_grad, loss_gard, w1_grad, w2_grad
+  vars: data_grad, h1_grad, h2_grad, loss_gard, local_w1_grad, local_w2_grad
   ops: softmax_grad,
        fc_grad
        fc_grad
develop/doc/design/parallel_do.html
@@ -223,15 +223,16 @@
 </pre></div>
 </div>
 <p>This implementation allows to write mixed device program like this</p>
-<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># get embedding feature on CPU</span>
-<span class="n">feature</span> <span class="o">=</span> <span class="n">some_cpu_only_op</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
+<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">W1</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">[</span><span class="mi">100</span><span class="p">,</span><span class="mi">20</span><span class="p">],</span> <span class="n">parameter</span><span class="o">=</span><span class="n">true</span><span class="p">)</span>
+<span class="n">W2</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">[</span><span class="mi">20</span><span class="p">,</span><span class="mi">15</span><span class="p">],</span> <span class="n">parameter</span><span class="o">=</span><span class="n">true</span><span class="p">)</span>

-<span class="n">gpu_places</span> <span class="o">=</span> <span class="n">get_place</span><span class="p">(</span><span class="n">use_gpu</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
+<span class="n">data</span> <span class="o">=</span> <span class="n">layers</span><span class="o">.</span><span class="n">data</span><span class="p">()</span>
+
+<span class="n">gpu_places</span> <span class="o">=</span> <span class="n">layers</span><span class="o">.</span><span class="n">get_place</span><span class="p">(</span><span class="n">use_gpu</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
 <span class="c1"># parallel processing on multiple GPUs</span>
 <span class="n">pd</span> <span class="o">=</span> <span class="n">ParallelDo</span><span class="p">(</span><span class="n">gpu_places</span><span class="p">)</span>
-<span class="k">with</span> <span class="n">pd</span><span class="o">.</span><span class="n">do</span><span class="p">():</span>
-    <span class="n">read_input</span><span class="p">(</span><span class="n">feature</span><span class="p">)</span>
-    <span class="n">prediction</span> <span class="o">=</span> <span class="n">my_net</span><span class="p">(</span><span class="n">feature</span><span class="p">)</span>
+<span class="k">with</span> <span class="n">pd</span><span class="o">.</span><span class="n">do</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">data</span><span class="p">):</span>
+    <span class="n">prediction</span> <span class="o">=</span> <span class="n">softmax</span><span class="p">(</span><span class="n">fc</span><span class="p">(</span><span class="n">fc</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">W1</span><span class="p">),</span> <span class="n">W2</span><span class="p">))</span>
     <span class="n">write_output</span><span class="p">(</span><span class="n">prediction</span><span class="p">)</span>
 <span class="n">prediction</span> <span class="o">=</span> <span class="n">pd</span><span class="p">()</span>
 <span class="n">loss</span> <span class="o">=</span> <span class="n">cross_entropy</span><span class="p">(</span><span class="n">prediction</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
@@ -248,20 +249,20 @@
 <span class="n">main_program</span>
 <span class="p">{</span>
 <span class="n">block0</span> <span class="p">{</span>
-  <span class="nb">vars</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">places</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">w2</span>
+  <span class="nb">vars</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">places</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">w2</span><span class="p">,</span> <span class="n">w1_grad</span><span class="p">,</span> <span class="n">w2_grad</span><span class="p">,</span>
   <span class="n">ops</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">get_place</span><span class="p">,</span> <span class="n">parallel_do</span><span class="p">(</span><span class="n">block1</span><span class="p">),</span>
        <span class="n">parallel_do_grad</span><span class="p">(</span><span class="n">block2</span><span class="p">),</span>
        <span class="n">sgd</span><span class="p">(</span><span class="n">w2</span><span class="p">,</span> <span class="n">w2_grad</span><span class="p">),</span>
        <span class="n">sgd</span><span class="p">(</span><span class="n">w1</span><span class="p">,</span> <span class="n">w1_grad</span><span class="p">)</span>
 <span class="p">}</span>
-<span class="n">block1</span> <span class="p">{</span>
+<span class="n">block1</span> <span class="p">{</span> <span class="c1"># the forward pass</span>
   <span class="n">parent_block</span><span class="p">:</span> <span class="mi">0</span>
   <span class="nb">vars</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">h1</span><span class="p">,</span> <span class="n">h2</span><span class="p">,</span> <span class="n">loss</span>
   <span class="n">ops</span><span class="p">:</span> <span class="n">fc</span><span class="p">,</span> <span class="n">fc</span><span class="p">,</span> <span class="n">softmax</span>
 <span class="p">}</span>
-<span class="n">block2</span> <span class="p">{</span>
+<span class="n">block2</span> <span class="p">{</span> <span class="c1"># the backward pass</span>
   <span class="n">parent_block</span><span class="p">:</span> <span class="mi">1</span>
-  <span class="nb">vars</span><span class="p">:</span> <span class="n">data_grad</span><span class="p">,</span> <span class="n">h1_grad</span><span class="p">,</span> <span class="n">h2_grad</span><span class="p">,</span> <span class="n">loss_gard</span><span class="p">,</span> <span class="n">w1_grad</span><span class="p">,</span> <span class="n">w2_grad</span>
+  <span class="nb">vars</span><span class="p">:</span> <span class="n">data_grad</span><span class="p">,</span> <span class="n">h1_grad</span><span class="p">,</span> <span class="n">h2_grad</span><span class="p">,</span> <span class="n">loss_gard</span><span class="p">,</span> <span class="n">local_w1_grad</span><span class="p">,</span> <span class="n">local_w2_grad</span>
   <span class="n">ops</span><span class="p">:</span> <span class="n">softmax_grad</span><span class="p">,</span>
        <span class="n">fc_grad</span>
        <span class="n">fc_grad</span>
develop/doc/searchindex.js
The source diff is too large to display; view the blob instead.
develop/doc_cn/_sources/design/parallel_do.md.txt
@@ -39,15 +39,16 @@ In the backward pass
 This implementation allows to write mixed device program like this

 ```python
-# get embedding feature on CPU
-feature = some_cpu_only_op(data)
+W1 = fluid.tensor(size=[100,20], parameter=true)
+W2 = fluid.tensor(size=[20,15], parameter=true)

-gpu_places = get_place(use_gpu=True)
+data = layers.data()
+
+gpu_places = layers.get_place(use_gpu=True)
 # parallel processing on multiple GPUs
 pd = ParallelDo(gpu_places)
-with pd.do():
-    read_input(feature)
-    prediction = my_net(feature)
+with pd.do(input=data):
+    prediction = softmax(fc(fc(data, W1), W2))
     write_output(prediction)
 prediction = pd()
 loss = cross_entropy(prediction, label)
@@ -66,20 +67,20 @@ start_program
 main_program
 {
 block0 {
-  vars: data, places, w1, w2
+  vars: data, places, w1, w2, w1_grad, w2_grad,
   ops: data, get_place, parallel_do(block1),
        parallel_do_grad(block2),
        sgd(w2, w2_grad),
        sgd(w1, w1_grad)
 }
-block1 {
+block1 { # the forward pass
   parent_block: 0
   vars: data, h1, h2, loss
   ops: fc, fc, softmax
 }
-block2 {
+block2 { # the backward pass
   parent_block: 1
-  vars: data_grad, h1_grad, h2_grad, loss_gard, w1_grad, w2_grad
+  vars: data_grad, h1_grad, h2_grad, loss_gard, local_w1_grad, local_w2_grad
   ops: softmax_grad,
        fc_grad
        fc_grad
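Reviewer note: the `main_program` hunk in the `parallel_do.md.txt` diff above renames the gradients inside `block2` to `local_w1_grad` / `local_w2_grad` and declares `w1_grad` / `w2_grad` in `block0`, where the `sgd` ops consume them. One plausible reading is that each place computes a local gradient on its shard of the batch and the results are summed before the update. The following is a minimal plain-Python sketch of that reading only. It does not use the PaddlePaddle Fluid API, and the helper names (`split_batch`, `local_grad`, `parallel_do_step`), the scalar linear model, and the sum-based aggregation are all assumptions made for illustration.

```python
# Sketch only: plain-Python stand-ins, not the PaddlePaddle Fluid API.
# All function names and the toy model are hypothetical.


def split_batch(batch, num_places):
    """Split a batch into num_places nearly equal contiguous shards."""
    size, rem = divmod(len(batch), num_places)
    shards, start = [], 0
    for i in range(num_places):
        end = start + size + (1 if i < rem else 0)
        shards.append(batch[start:end])
        start = end
    return shards


def local_grad(w, shard):
    """Gradient w.r.t. w of sum(0.5 * (w * x - y) ** 2) over one shard."""
    return sum((w * x - y) * x for x, y in shard)


def parallel_do_step(w, batch, num_places, lr):
    """One update: per-place local gradients, summed, then an SGD step."""
    shards = split_batch(batch, num_places)
    # Assumed analogue of block2: each place produces its own local gradient.
    local_w_grads = [local_grad(w, shard) for shard in shards]
    # Assumed analogue of block0: aggregate the local gradients into w_grad.
    w_grad = sum(local_w_grads)
    # Analogue of sgd(w, w_grad).
    return w - lr * w_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
single = parallel_do_step(0.0, batch, num_places=1, lr=0.01)
sharded = parallel_do_step(0.0, batch, num_places=2, lr=0.01)
```

Under these assumptions `single` and `sharded` agree up to floating-point rounding, because the summed per-shard gradients equal the full-batch gradient. That is the property the `local_*_grad` naming appears intended to make explicit.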
develop/doc_cn/design/parallel_do.html
@@ -230,15 +230,16 @@
 </pre></div>
 </div>
 <p>This implementation allows to write mixed device program like this</p>
-<div class="highlight-python"><div class="highlight"><pre><span></span><span class="c1"># get embedding feature on CPU</span>
-<span class="n">feature</span> <span class="o">=</span> <span class="n">some_cpu_only_op</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
+<div class="highlight-python"><div class="highlight"><pre><span></span><span class="n">W1</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">[</span><span class="mi">100</span><span class="p">,</span><span class="mi">20</span><span class="p">],</span> <span class="n">parameter</span><span class="o">=</span><span class="n">true</span><span class="p">)</span>
+<span class="n">W2</span> <span class="o">=</span> <span class="n">fluid</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="p">[</span><span class="mi">20</span><span class="p">,</span><span class="mi">15</span><span class="p">],</span> <span class="n">parameter</span><span class="o">=</span><span class="n">true</span><span class="p">)</span>

-<span class="n">gpu_places</span> <span class="o">=</span> <span class="n">get_place</span><span class="p">(</span><span class="n">use_gpu</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
+<span class="n">data</span> <span class="o">=</span> <span class="n">layers</span><span class="o">.</span><span class="n">data</span><span class="p">()</span>
+
+<span class="n">gpu_places</span> <span class="o">=</span> <span class="n">layers</span><span class="o">.</span><span class="n">get_place</span><span class="p">(</span><span class="n">use_gpu</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
 <span class="c1"># parallel processing on multiple GPUs</span>
 <span class="n">pd</span> <span class="o">=</span> <span class="n">ParallelDo</span><span class="p">(</span><span class="n">gpu_places</span><span class="p">)</span>
-<span class="k">with</span> <span class="n">pd</span><span class="o">.</span><span class="n">do</span><span class="p">():</span>
-    <span class="n">read_input</span><span class="p">(</span><span class="n">feature</span><span class="p">)</span>
-    <span class="n">prediction</span> <span class="o">=</span> <span class="n">my_net</span><span class="p">(</span><span class="n">feature</span><span class="p">)</span>
+<span class="k">with</span> <span class="n">pd</span><span class="o">.</span><span class="n">do</span><span class="p">(</span><span class="nb">input</span><span class="o">=</span><span class="n">data</span><span class="p">):</span>
+    <span class="n">prediction</span> <span class="o">=</span> <span class="n">softmax</span><span class="p">(</span><span class="n">fc</span><span class="p">(</span><span class="n">fc</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">W1</span><span class="p">),</span> <span class="n">W2</span><span class="p">))</span>
     <span class="n">write_output</span><span class="p">(</span><span class="n">prediction</span><span class="p">)</span>
 <span class="n">prediction</span> <span class="o">=</span> <span class="n">pd</span><span class="p">()</span>
 <span class="n">loss</span> <span class="o">=</span> <span class="n">cross_entropy</span><span class="p">(</span><span class="n">prediction</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
@@ -255,20 +256,20 @@
 <span class="n">main_program</span>
 <span class="p">{</span>
 <span class="n">block0</span> <span class="p">{</span>
-  <span class="nb">vars</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">places</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">w2</span>
+  <span class="nb">vars</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">places</span><span class="p">,</span> <span class="n">w1</span><span class="p">,</span> <span class="n">w2</span><span class="p">,</span> <span class="n">w1_grad</span><span class="p">,</span> <span class="n">w2_grad</span><span class="p">,</span>
   <span class="n">ops</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">get_place</span><span class="p">,</span> <span class="n">parallel_do</span><span class="p">(</span><span class="n">block1</span><span class="p">),</span>
        <span class="n">parallel_do_grad</span><span class="p">(</span><span class="n">block2</span><span class="p">),</span>
        <span class="n">sgd</span><span class="p">(</span><span class="n">w2</span><span class="p">,</span> <span class="n">w2_grad</span><span class="p">),</span>
        <span class="n">sgd</span><span class="p">(</span><span class="n">w1</span><span class="p">,</span> <span class="n">w1_grad</span><span class="p">)</span>
 <span class="p">}</span>
-<span class="n">block1</span> <span class="p">{</span>
+<span class="n">block1</span> <span class="p">{</span> <span class="c1"># the forward pass</span>
   <span class="n">parent_block</span><span class="p">:</span> <span class="mi">0</span>
   <span class="nb">vars</span><span class="p">:</span> <span class="n">data</span><span class="p">,</span> <span class="n">h1</span><span class="p">,</span> <span class="n">h2</span><span class="p">,</span> <span class="n">loss</span>
   <span class="n">ops</span><span class="p">:</span> <span class="n">fc</span><span class="p">,</span> <span class="n">fc</span><span class="p">,</span> <span class="n">softmax</span>
 <span class="p">}</span>
-<span class="n">block2</span> <span class="p">{</span>
+<span class="n">block2</span> <span class="p">{</span> <span class="c1"># the backward pass</span>
   <span class="n">parent_block</span><span class="p">:</span> <span class="mi">1</span>
-  <span class="nb">vars</span><span class="p">:</span> <span class="n">data_grad</span><span class="p">,</span> <span class="n">h1_grad</span><span class="p">,</span> <span class="n">h2_grad</span><span class="p">,</span> <span class="n">loss_gard</span><span class="p">,</span> <span class="n">w1_grad</span><span class="p">,</span> <span class="n">w2_grad</span>
+  <span class="nb">vars</span><span class="p">:</span> <span class="n">data_grad</span><span class="p">,</span> <span class="n">h1_grad</span><span class="p">,</span> <span class="n">h2_grad</span><span class="p">,</span> <span class="n">loss_gard</span><span class="p">,</span> <span class="n">local_w1_grad</span><span class="p">,</span> <span class="n">local_w2_grad</span>
   <span class="n">ops</span><span class="p">:</span> <span class="n">softmax_grad</span><span class="p">,</span>
        <span class="n">fc_grad</span>
        <span class="n">fc_grad</span>
develop/doc_cn/searchindex.js
This diff is collapsed.