    mix_hidden = paddle.layer.mixed(
        size=hidden_dim,
        bias_attr=std_default,
        input=[
            paddle.layer.full_matrix_projection(
                input=input_tmp[0], param_attr=hidden_para_attr),
            paddle.layer.full_matrix_projection(
                input=input_tmp[1], param_attr=lstm_para_attr)
        ])

    lstm = paddle.layer.lstmemory(
        input=mix_hidden,
        act=paddle.activation.Relu(),
        gate_act=paddle.activation.Sigmoid(),
        state_act=paddle.activation.Sigmoid(),
        reverse=((i % 2) == 1),
        bias_attr=std_0,
        param_attr=lstm_para_attr)

    input_tmp = [mix_hidden, lstm]
```
4. We concatenate the output of the top LSTM unit with its input and project the result into a hidden layer. A fully connected layer on top of it then produces the final vector representation.
```python
feature_out = paddle.layer.mixed(
    size=label_dict_len,
    bias_attr=std_default,
    input=[
        paddle.layer.full_matrix_projection(
            input=input_tmp[0], param_attr=hidden_para_attr),
        paddle.layer.full_matrix_projection(
            input=input_tmp[1], param_attr=lstm_para_attr)
    ], )
```
5. We use a CRF as the cost function. The parameter of the CRF cost layer is named `crfw`.
```python
crf_cost = paddle.layer.crf(
    size=label_dict_len,
    input=feature_out,
    label=target,
    param_attr=paddle.attr.Param(
        name='crfw',
        initial_std=default_std,
        learning_rate=mix_hidden_lr))
```
6. A CRF decoding layer is used for evaluation and inference. It shares its parameter with the CRF cost layer: giving the same parameter name to several layers makes them share that parameter.
```python
crf_dec = paddle.layer.crf_decoding(
    name='crf_dec_l',
    size=label_dict_len,
    input=feature_out,
    label=target,
    param_attr=paddle.attr.Param(name='crfw'))
```
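To make the relationship between the CRF cost and CRF decoding steps concrete, here is a minimal pure-Python sketch of a linear-chain CRF. It is not PaddlePaddle code: the `PARAMS` store, the helper functions, the toy numbers, and the start/end/transition layout of the weights are all assumptions made for illustration. It only shows the idea that a cost function and a decoder which look up the same parameter name (`crfw`) operate on the same weights.

```python
import math

# Hypothetical parameter store: layers look up weights by name, so two
# layers that use the same name read the same object.
PARAMS = {
    "crfw": {
        "start": [0.5, 0.0],                  # score of starting in label j
        "end": [0.0, 0.2],                    # score of ending in label j
        "trans": [[1.0, -1.0], [-0.5, 0.8]],  # trans[i][j]: score of i -> j
    }
}


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def path_score(emissions, labels, w):
    """Unnormalized score of one label sequence."""
    score = w["start"][labels[0]] + emissions[0][labels[0]]
    for t in range(1, len(labels)):
        score += w["trans"][labels[t - 1]][labels[t]] + emissions[t][labels[t]]
    return score + w["end"][labels[-1]]


def crf_cost(emissions, labels, param_name="crfw"):
    """Negative log-likelihood: log Z - score(labels), via the forward algorithm."""
    w = PARAMS[param_name]
    n = len(emissions[0])
    alpha = [w["start"][j] + emissions[0][j] for j in range(n)]
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + w["trans"][i][j] for i in range(n)])
            + emissions[t][j]
            for j in range(n)
        ]
    log_z = logsumexp([alpha[j] + w["end"][j] for j in range(n)])
    return log_z - path_score(emissions, labels, w)


def crf_decode(emissions, param_name="crfw"):
    """Viterbi decoding: the highest-scoring label sequence."""
    w = PARAMS[param_name]
    n = len(emissions[0])
    best = [w["start"][j] + emissions[0][j] for j in range(n)]
    back = []  # back[t-1][j]: best previous label when label j is used at step t
    for t in range(1, len(emissions)):
        ptr, nxt = [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: best[i] + w["trans"][i][j])
            ptr.append(i_best)
            nxt.append(best[i_best] + w["trans"][i_best][j] + emissions[t][j])
        best = nxt
        back.append(ptr)
    last = max(range(n), key=lambda j: best[j] + w["end"][j])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Toy per-position label scores standing in for `feature_out`
# (3 positions, 2 labels).
emissions = [[2.0, 0.1], [0.3, 1.5], [1.0, 1.2]]
decoded = crf_decode(emissions)      # uses the weights named "crfw"
cost = crf_cost(emissions, decoded)  # uses the very same weights
```

Because both functions read `PARAMS["crfw"]`, any update to the weights made while minimizing the cost is immediately visible to the decoder. In this sketch, that is what naming both parameters `crfw` stands for.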
## Train model
...
@@ -454,8 +456,8 @@ Now we load pre-trained word lookup table.