-3. 8 LSTM units will be trained in "forward / backward" order.
- 8 LSTM units will be trained in "forward / backward" order.
```python
hidden_0=paddle.layer.mixed(
...
...
@@ -326,7 +326,7 @@ for i in range(1, depth):
input_tmp=[mix_hidden,lstm]
```
-4. We will concatenate the output of top LSTM unit with it's input, and project into a hidden layer. Then put a fully connected layer on top of it to get the final vector representation.
- We concatenate the output of the top LSTM unit with its input and project the result into a hidden layer. A fully connected layer on top of it then produces the final vector representation.
```python
feature_out=paddle.layer.mixed(
...
...
@@ -340,7 +340,7 @@ for i in range(1, depth):
],)
```
-5. We use CRF as cost function, the parameter of CRF cost will be named `crfw`.
- We use CRF as the cost function; the parameter of the CRF cost is named `crfw`.
```python
crf_cost=paddle.layer.crf(
...
...
@@ -353,7 +353,7 @@ crf_cost = paddle.layer.crf(
learning_rate=mix_hidden_lr))
```
-6. CRF decoding layer is used for evaluation and inference. It shares parameter with CRF layer. The sharing of parameters among multiple layers is specified by the same parameter name in these layers.
- The CRF decoding layer is used for evaluation and inference. It shares its parameters with the CRF layer. Layers share parameters when they specify the same parameter name.
- 3. 8 LSTM units will be trained in "forward / backward" order.
- 8 LSTM units will be trained in "forward / backward" order.
```python
hidden_0 = paddle.layer.mixed(
...
...
@@ -368,7 +368,7 @@ for i in range(1, depth):
input_tmp = [mix_hidden, lstm]
```
- 4. We will concatenate the output of top LSTM unit with it's input, and project into a hidden layer. Then put a fully connected layer on top of it to get the final vector representation.
- We concatenate the output of the top LSTM unit with its input and project the result into a hidden layer. A fully connected layer on top of it then produces the final vector representation.
```python
feature_out = paddle.layer.mixed(
...
...
@@ -382,7 +382,7 @@ for i in range(1, depth):
], )
```
- 5. We use CRF as cost function, the parameter of CRF cost will be named `crfw`.
- We use CRF as the cost function; the parameter of the CRF cost is named `crfw`.
```python
crf_cost = paddle.layer.crf(
...
...
@@ -395,7 +395,7 @@ crf_cost = paddle.layer.crf(
learning_rate=mix_hidden_lr))
```
- 6. CRF decoding layer is used for evaluation and inference. It shares parameter with CRF layer. The sharing of parameters among multiple layers is specified by the same parameter name in these layers.
- The CRF decoding layer is used for evaluation and inference. It shares its parameters with the CRF layer. Layers share parameters when they specify the same parameter name.
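
The name-based parameter sharing described in the last item can be illustrated with a minimal sketch in plain Python. It does not use the PaddlePaddle API: `ParamStore`, `get_or_create`, `CostLayer`, and `DecodingLayer` are hypothetical names introduced only for illustration, and the sketch assumes parameters are looked up in a registry keyed by name.

```python
# Hypothetical illustration of name-based parameter sharing; not PaddlePaddle code.


class ParamStore:
    """Registry that hands out one parameter list per name."""

    def __init__(self):
        self._params = {}

    def get_or_create(self, name, size):
        # Create the parameter on first request; reuse it afterwards.
        if name not in self._params:
            self._params[name] = [0.0] * size
        return self._params[name]


class CostLayer:
    """Stands in for a training-time layer that updates the parameter."""

    def __init__(self, store, param_name, size):
        self.weights = store.get_or_create(param_name, size)

    def update(self, index, delta):
        self.weights[index] += delta


class DecodingLayer:
    """Stands in for an inference-time layer that only reads the parameter."""

    def __init__(self, store, param_name, size):
        self.weights = store.get_or_create(param_name, size)


store = ParamStore()
cost = CostLayer(store, "crfw", size=4)
decoder = DecodingLayer(store, "crfw", size=4)

# An update made through the cost layer is visible to the decoding layer,
# because both hold a reference to the same underlying list.
cost.update(2, 0.5)
print(decoder.weights)
```

Because both layers request the name `crfw`, the decoding layer reads exactly the values the cost layer trained, with no copying step in between.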