4. We concatenate the output of the top LSTM unit with its input and project the result into a hidden layer. We then put a fully connected layer on top of it to get the final vector representation.
```python
feature_out=paddle.layer.mixed(
...
],)
```
5. We use CRF as the cost function. The parameter of the CRF cost is named `crfw`.
```python
crf_cost = paddle.layer.crf(
    size=label_dict_len,
    input=feature_out,
    label=target,
...
        name='crfw',
        initial_std=default_std,
        learning_rate=mix_hidden_lr))
```
6. The CRF decoding layer is used for evaluation and inference. It shares its parameters with the CRF layer. Parameters are shared among multiple layers by giving them the same parameter name in each of these layers.
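To make the relationship between the CRF cost and the CRF decoding layer more concrete, here is a minimal pure-Python sketch. It is not the Paddle implementation: `get_param`, `crf_cost`, `crf_decode`, and the other helpers are hypothetical names invented for illustration. It also assumes a simplified linear-chain CRF with only a transition matrix, whereas a real CRF layer may learn start and end scores as well. The point it shows is that the cost (forward algorithm) and the decoder (Viterbi) look up the same parameter by name, so they always use the same weights.

```python
import math

# Hypothetical name-keyed parameter store: asking for the same name twice
# returns the same object, so two "layers" share one parameter.
_PARAMS = {}

def get_param(name, size):
    """Return the size x size matrix registered under `name` (zeros on first use)."""
    if name not in _PARAMS:
        _PARAMS[name] = [[0.0] * size for _ in range(size)]
    return _PARAMS[name]

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def path_score(emissions, labels, trans):
    """Unnormalized score of one label sequence."""
    score = emissions[0][labels[0]]
    for t in range(1, len(labels)):
        score += trans[labels[t - 1]][labels[t]] + emissions[t][labels[t]]
    return score

def crf_cost(emissions, labels, param_name):
    """Negative log-likelihood of `labels`, computed with the forward algorithm."""
    trans = get_param(param_name, len(emissions[0]))
    n = len(trans)
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            log_sum_exp([alpha[i] + trans[i][j] for i in range(n)]) + emissions[t][j]
            for j in range(n)
        ]
    return log_sum_exp(alpha) - path_score(emissions, labels, trans)

def crf_decode(emissions, param_name):
    """Viterbi decoding with the parameter shared under `param_name`."""
    trans = get_param(param_name, len(emissions[0]))
    n = len(trans)
    best = list(emissions[0])
    back = []
    for t in range(1, len(emissions)):
        # prev[j] is the best previous label when the current label is j.
        prev = [max(range(n), key=lambda i: best[i] + trans[i][j]) for j in range(n)]
        best = [best[prev[j]] + trans[prev[j]][j] + emissions[t][j] for j in range(n)]
        back.append(prev)
    path = [max(range(n), key=lambda j: best[j])]
    for prev in reversed(back):
        path.append(prev[path[-1]])
    path.reverse()
    return path

# Toy per-position label scores (3 positions, 3 labels) and toy transition weights.
emissions = [[1.0, 0.2, 0.1], [0.3, 0.9, 0.4], [0.2, 0.1, 1.5]]
trans = get_param('crfw', 3)
trans[0][1] = 0.5
trans[1][2] = 0.7
trans[2][0] = -1.0

cost = crf_cost(emissions, [0, 1, 2], 'crfw')  # training-time cost
best_path = crf_decode(emissions, 'crfw')      # inference with the same weights
```

Because both functions resolve `'crfw'` through the same store, any update to the transition weights during training is immediately visible to the decoder, which is the effect the shared parameter name is meant to achieve.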