Forked from PaddlePaddle / Paddle
Commit 2ac9a3d8
Authored Oct 31, 2017 by caoying03
follow comments.
Parent: dd2be3da
Showing 4 changed files with 26 additions and 18 deletions (+26 -18)
paddle/framework/tensor_impl.h  +1 -1
paddle/operators/linear_chain_crf_op.cc  +13 -12
paddle/operators/linear_chain_crf_op.h  +9 -5
python/paddle/v2/framework/tests/test_linear_chain_crf_op.py  +3 -0
paddle/framework/tensor_impl.h
@@ -235,7 +235,7 @@ inline Tensor Tensor::Slice(const int& begin_idx, const int& end_idx) const {
   PADDLE_ENFORCE_LE(end_idx, dims_[0], "The end row index is out of bound.");
   PADDLE_ENFORCE_LT(
       begin_idx, end_idx,
-      "The start row index must be smaller than the end row index.");
+      "The start row index must be lesser than the end row index.");
   if (dims_[0] == 1) {
     return *this;
paddle/operators/linear_chain_crf_op.cc
@@ -26,9 +26,8 @@ class LinearChainCRFOpMaker : public framework::OpProtoAndCheckerMaker {
         "Emission",
         "(LoDTensor, default: LoDTensor<float>). "
         "The unscaled emission weight matrix for the linear chain CRF. "
-        "This input is a LoDTensor with shape [N x D] where N is the total "
-        "element number of all input squences in a mini-batch, "
-        "and D is the total tag number.");
+        "This input is a LoDTensor with shape [N x D] where N is the size of "
+        "the mini-batch and D is the total tag number.");
     AddInput(
         "Transition",
         "(Tensor, default: Tensor<float>). A Tensor with shape [(D + 2) x D]. "
@@ -36,7 +35,7 @@ class LinearChainCRFOpMaker : public framework::OpProtoAndCheckerMaker {
         "See more details in the operator's comments.");
     AddInput(
         "Label",
-        "(LoDTensor, default: LoDTensor<int>). The groundtruth which is a 2-D "
+        "(LoDTensor, default: LoDTensor<int>). The ground truth which is a 2-D "
         "LoDTensor with shape [N x 1], where N is the total element number in "
         "a mini-batch.");
     AddOutput(
@@ -77,12 +76,13 @@ variables. CRF learns the conditional probability \f$P(Y|X)\f$, where
 Linear chain CRF is a special case of CRF that is useful for sequence labeling
 task. Sequence labeling tasks do not assume a lot of conditional
-independences among inputs. They only concern about the input and the output
-being linear sequences. Thus, the graph model of such a CRF is a simple chain
-or a line, which results in the linear chain CRF.
+independences among inputs. The only constraint they impose is that the input
+and output must be linear sequences. Thus, the graph of such a CRF is a simple
+chain or a line, which results in the linear chain CRF.

 This operator implements the Forward-Backward algorithm for the linear chain
-CRF. Please see http://www.cs.columbia.edu/~mcollins/fb.pdf for reference.
+CRF. Please see http://www.cs.columbia.edu/~mcollins/fb.pdf and
+http://cseweb.ucsd.edu/~elkan/250Bwinter2012/loglinearCRFs.pdf for reference.

 Equation:
@@ -111,7 +111,7 @@ NOTE:
    transition features. The emission feature weights are NOT computed in
    this operator. They MUST be computed first before this operator is called.

-2. Because this operator performs globally normaliztion over all possible
+2. Because this operator performs global normalization over all possible
    sequences internally, it expects UNSCALED emission feature weights.
    Please do not call this op with the emission feature being output of any
    nonlinear activation.
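
Note 2 above says the operator normalizes over all possible tag sequences itself. A minimal sketch of what that means, using brute-force enumeration rather than the operator's Forward-Backward code. The function names are hypothetical, and the row layout of `transition` (row 0 for start weights, row 1 for end weights, rows 2 and up for tag-to-tag weights) is an assumption; the diff only states the shape [(D + 2) x D].

```python
import itertools
import math


def sequence_score(emission, transition, tags):
    """Unnormalized log-score of one tag sequence.

    Assumed layout (not stated in the diff): transition[0] holds start
    weights, transition[1] holds end weights, and transition[2 + i][j]
    is the weight for moving from tag i to tag j.
    """
    score = transition[0][tags[0]] + transition[1][tags[-1]]
    for k, tag in enumerate(tags):
        score += emission[k][tag]
    for prev, cur in zip(tags, tags[1:]):
        score += transition[2 + prev][cur]
    return score


def log_partition(emission, transition):
    """Brute-force log Z: log-sum-exp over every possible tag sequence."""
    seq_len, tag_num = len(emission), len(emission[0])
    scores = [
        sequence_score(emission, transition, tags)
        for tags in itertools.product(range(tag_num), repeat=seq_len)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    return top + math.log(sum(math.exp(s - top) for s in scores))


def log_likelihood(emission, transition, tags):
    """Log-probability of `tags` under the globally normalized model."""
    return (sequence_score(emission, transition, tags)
            - log_partition(emission, transition))
```

Because the division by Z happens once over whole sequences, the per-position emission scores can stay unscaled; the enumeration here is exponential in sequence length and is only meant for checking small cases.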
@@ -171,9 +171,10 @@ class LinearChainCRFOp : public framework::OperatorWithKernel {
     ctx->SetOutputDim("Alpha", emission_dims);
     ctx->SetOutputDim("EmissionExps", emission_dims);
     ctx->SetOutputDim("TransitionExps", transition_dims);
-    // (TODO caoying) This is tricky. The 1st dimension of Output(LogLikelihood)
+    // TODO(caoying) This is tricky. The 1st dimension of Output(LogLikelihood)
     // is the sequence number in a mini-batch. The dimension set here should be
-    // resized to its correct size in the function Compute.
+    // resized to its correct size in the function Compute. Fix this once we can
+    // get LoD information in the InferShape interface.
     ctx->SetOutputDim("LogLikelihood", {emission_dims[0], 1});
   }
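
The TODO in this hunk turns on the difference between the number of rows in Emission and the number of sequences in the mini-batch. A small sketch of that difference, assuming offset-style LoD (a list of sequence start positions); the helper names and the sample offsets are hypothetical.

```python
def sequence_count(lod_offsets):
    """Number of sequences described by a list of start offsets.

    Assumes offset-style LoD, e.g. [0, 3, 4, 9] describes three
    sequences covering rows 0-2, 3, and 4-8.
    """
    return len(lod_offsets) - 1


def sequence_lengths(lod_offsets):
    """Length of each sequence, from consecutive offset differences."""
    return [end - begin for begin, end in zip(lod_offsets, lod_offsets[1:])]


lod = [0, 3, 4, 9]      # hypothetical mini-batch
total_rows = lod[-1]    # N, the first dimension of Emission
print(total_rows, sequence_count(lod), sequence_lengths(lod))
# prints: 9 3 [3, 1, 5]
```

With these offsets, a shape of [9, 1] set at shape-inference time would have to shrink to [3, 1] once the LoD is known, which is the resize the comment describes.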
@@ -236,7 +237,7 @@ class LinearChainCRFGradOp : public framework::OperatorWithKernel {
  protected:
   // Explicitly set that the data type of output of the linear_chain_crf_grad
-  // operator is determined by its input: graidents of LogLikelihood.
+  // operator is determined by its input: gradients of LogLikelihood.
   framework::DataType IndicateDataType(
       const framework::ExecutionContext& ctx) const override {
     return framework::ToDataType(
paddle/operators/linear_chain_crf_op.h
@@ -188,7 +188,6 @@ class LinearChainCRFOpKernel : public framework::OpKernel<T> {
                             const LoDTensor& src, LoDTensor* dst) {
       dst->mutable_data<T>(src.dims(), platform::CPUPlace());
       dst->CopyFrom(src, platform::CPUPlace(), ctx);
     };
     copyLoDTensor(ctx, emission_weights_src, emission_weights_dst);
@@ -248,7 +247,7 @@ class LinearChainCRFOpKernel : public framework::OpKernel<T> {
       for (size_t i = 0; i < tag_num; ++i) {
         T sum = 0.;
         for (size_t j = 0; j < tag_num; ++j) {
-          sum += alpha_value[(k - 1) * tag_num + j] *
+          sum += alpha_value[(k - 1) * tag_num + j] *  // (*)
                  w_exps[(j + state_trans_base_idx) * tag_num + i];
         }
         alpha_value[k * tag_num + i] = x_exps[k * tag_num + i] * sum;
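
The loop marked `(*)` is the forward recursion over flat, row-major buffers. A rough Python sketch of the same indexing follows. The function name is hypothetical, the default `state_trans_base_idx=2` and the use of row 0 for start weights are assumptions not shown in this hunk, and any per-step rescaling the real kernel may apply is left out.

```python
def forward_alpha(x_exps, w_exps, seq_len, tag_num, state_trans_base_idx=2):
    """Unnormalized forward pass over flat, row-major lists.

    x_exps: exp(emission), length seq_len * tag_num.
    w_exps: exp(transition), length (tag_num + 2) * tag_num. Row 0 is
    assumed to hold start weights; rows from state_trans_base_idx on
    are assumed to hold tag-to-tag weights.
    """
    alpha = [0.0] * (seq_len * tag_num)
    # First position: start weight times emission.
    for i in range(tag_num):
        alpha[i] = w_exps[i] * x_exps[i]
    # Remaining positions: same index arithmetic as the line marked (*).
    for k in range(1, seq_len):
        for i in range(tag_num):
            total = 0.0
            for j in range(tag_num):
                total += (alpha[(k - 1) * tag_num + j] *
                          w_exps[(j + state_trans_base_idx) * tag_num + i])
            alpha[k * tag_num + i] = x_exps[k * tag_num + i] * total
    return alpha
```

In the inner sum, `j` is the previous tag and `i` the current one, so the transition row is selected by `j` and the column by `i`.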
@@ -291,7 +290,8 @@ class LinearChainCRFGradOpKernel : public framework::OpKernel<T> {
     // These local variables hold the inputs and outputs, garanteeing them on
     // CPU memory, to provide a consistent reference.
     // TODO(caoying) Fix this by moving all these local variables into the
-    // class's data members once we can profile the training process.
+    // class's data members once we can profile the training process, or
+    // implementing a real GPU kernel for CRF.
     Tensor* label = nullptr;
     Tensor label_tensor;
     Tensor* emission_exps = nullptr;
@@ -344,6 +344,9 @@ class LinearChainCRFGradOpKernel : public framework::OpKernel<T> {
...
@@ -344,6 +344,9 @@ class LinearChainCRFGradOpKernel : public framework::OpKernel<T> {
transition_grad
=
transition_grad
=
ctx
.
Output
<
Tensor
>
(
framework
::
GradVarName
(
"Transition"
));
ctx
.
Output
<
Tensor
>
(
framework
::
GradVarName
(
"Transition"
));
}
}
// TODO(caoying) Fix this constraint. When the Input(Emission) is from the
// data reader operator, it can have no gradients.
PADDLE_ENFORCE
(
emission_grad
,
"Output(Emission@Grad) should not be null."
);
PADDLE_ENFORCE
(
emission_grad
,
"Output(Emission@Grad) should not be null."
);
emission_grad
->
mutable_data
<
T
>
(
platform
::
CPUPlace
());
emission_grad
->
mutable_data
<
T
>
(
platform
::
CPUPlace
());
math
::
SetConstant
<
platform
::
CPUPlace
,
T
>
()(
ctx
.
device_context
(),
math
::
SetConstant
<
platform
::
CPUPlace
,
T
>
()(
ctx
.
device_context
(),
@@ -458,7 +461,7 @@ class LinearChainCRFGradOpKernel : public framework::OpKernel<T> {
       for (size_t i = 0; i < tag_num; ++i) {
         T sum = 0.;
         for (size_t j = 0; j < tag_num; ++j) {
-          sum += w_exps[(i + state_trans_base_idx) * tag_num + j] *
+          sum += w_exps[(i + state_trans_base_idx) * tag_num + j] *  // (**)
                  x_exps[(k + 1) * tag_num + j] *
                  beta_value[(k + 1) * tag_num + j];
         }
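
The loop marked `(**)` is the backward recursion. A rough Python sketch with the same index arithmetic is given below. The function name is hypothetical, and the initialization from row 1 of `w_exps` (end weights) and the default `state_trans_base_idx=2` are assumptions, since this hunk shows only the inner loop; any rescaling in the real kernel is omitted.

```python
def backward_beta(x_exps, w_exps, seq_len, tag_num, state_trans_base_idx=2):
    """Unnormalized backward pass over flat, row-major lists.

    x_exps: exp(emission), length seq_len * tag_num.
    w_exps: exp(transition), length (tag_num + 2) * tag_num. Row 1 is
    assumed to hold end weights; rows from state_trans_base_idx on are
    assumed to hold tag-to-tag weights.
    """
    beta = [0.0] * (seq_len * tag_num)
    last = (seq_len - 1) * tag_num
    # Last position: only the end weight remains.
    for i in range(tag_num):
        beta[last + i] = w_exps[tag_num + i]
    # Earlier positions: same index arithmetic as the line marked (**).
    for k in range(seq_len - 2, -1, -1):
        for i in range(tag_num):
            total = 0.0
            for j in range(tag_num):
                total += (w_exps[(i + state_trans_base_idx) * tag_num + j] *
                          x_exps[(k + 1) * tag_num + j] *
                          beta[(k + 1) * tag_num + j])
            beta[k * tag_num + i] = total
    return beta
```

Here `i` is the current tag and `j` the next one, so the transition row is selected by `i`, which is the mirror image of the forward loop's indexing.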
@@ -493,7 +496,8 @@ class LinearChainCRFGradOpKernel : public framework::OpKernel<T> {
     auto x_exps_mat = EigenMatrix<T>::From(emission_exps);

-    // TODO(caoying): Fix this to avoid using this local variable.
+    // TODO(caoying): Fix this to avoid using this local variable if when can
+    // profiling the training process.
     Tensor tmp;
     tmp.mutable_data<T>(beta->dims(), platform::CPUPlace());
     auto tmp_mat = EigenMatrix<T>::From(tmp);
python/paddle/v2/framework/tests/test_linear_chain_crf_op.py
@@ -83,6 +83,9 @@ class LinearChainCrfForward(object):
 class TestLinearChainCrfOp(OpTest):
     def set_test_data(self):
+        # TODO(caoying) Fix the unittest by: add the boundary cases when
+        # sequence lengths are 1, 2, and 3.
+
         SEQ_NUM = 3
         TAG_NUM = 17
         MAX_SEQ_LEN = 5
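
The new TODO asks for boundary cases with sequence lengths 1, 2, and 3. One way such data could be built is sketched here with fixed lengths instead of random ones. `make_crf_test_data` is a hypothetical helper that is not part of the test file, the value ranges are arbitrary, and the offset-style LoD is an assumption.

```python
import random


def make_crf_test_data(seq_lens, tag_num, seed=0):
    """Build emission, transition and label data for fixed sequence lengths.

    Returns (lod, emission, transition, labels), where lod is a list of
    start offsets, emission is [N x tag_num], transition is
    [(tag_num + 2) x tag_num], and labels is [N x 1].
    """
    rng = random.Random(seed)
    lod = [0]
    for length in seq_lens:
        lod.append(lod[-1] + length)
    total = lod[-1]
    emission = [[rng.uniform(-1.0, 1.0) for _ in range(tag_num)]
                for _ in range(total)]
    transition = [[rng.uniform(-0.5, 0.5) for _ in range(tag_num)]
                  for _ in range(tag_num + 2)]
    labels = [[rng.randrange(tag_num)] for _ in range(total)]
    return lod, emission, transition, labels


# Boundary lengths named in the TODO; 17 matches TAG_NUM in the test.
lod, emission, transition, labels = make_crf_test_data([1, 2, 3], 17)
```

A length-1 sequence exercises only the start and end weights, which makes it a useful check that the recursion loops are skipped cleanly.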