提交 0bb46dc8 编写于 作者: H Helin Wang

change images of label_semantic_roles to english images

上级 db79c856
...@@ -22,34 +22,20 @@ Standard SRL system mostly builds on top of Syntactic Analysis and contains five ...@@ -22,34 +22,20 @@ Standard SRL system mostly builds on top of Syntactic Analysis and contains five
<div align="center"> <div align="center">
<img src="image/dependency_parsing.png" width = "80%" align=center /><br> <img src="image/dependency_parsing_en.png" width = "80%" align=center /><br>
Fig 1. Syntactic parse tree Fig 1. Syntactic parse tree
</div> </div>
核心关系-> HED
定中关系-> ATT
主谓关系-> SBV
状中结构-> ADV
介宾关系-> POB
右附加关系-> RAD
动宾关系-> VOB
标点-> WP
However, complete syntactic analysis requires identifying the relation among all constitutes and the performance of SRL is sensitive to the precision of syntactic analysis, which makes SRL a very challenging task. To reduce the complexity and obtain some syntactic structure information, we often use shallow syntactic analysis. Shallow Syntactic Analysis is also called partial parsing or chunking. Unlike complete syntactic analysis which requires the construction of the complete parsing tree, Shallow Syntactic Analysis only need to identify some independent components with relatively simple structure, such as verb phrases (chunk). To avoid difficulties in constructing a syntactic tree with high accuracy, some work\[[1](#Reference)\] proposed semantic chunking based SRL methods, which convert SRL as a sequence tagging problem. Sequence tagging tasks classify syntactic chunks using BIO representation. For syntactic chunks forming a chunk of type A, the first chunk receives the B-A tag (Begin), the remaining ones receive the tag I-A (Inside), and all chunks outside receive the tag O-A. However, complete syntactic analysis requires identifying the relation among all constitutes and the performance of SRL is sensitive to the precision of syntactic analysis, which makes SRL a very challenging task. To reduce the complexity and obtain some syntactic structure information, we often use shallow syntactic analysis. Shallow Syntactic Analysis is also called partial parsing or chunking. Unlike complete syntactic analysis which requires the construction of the complete parsing tree, Shallow Syntactic Analysis only need to identify some independent components with relatively simple structure, such as verb phrases (chunk). To avoid difficulties in constructing a syntactic tree with high accuracy, some work\[[1](#Reference)\] proposed semantic chunking based SRL methods, which convert SRL as a sequence tagging problem. Sequence tagging tasks classify syntactic chunks using BIO representation. For syntactic chunks forming a chunk of type A, the first chunk receives the B-A tag (Begin), the remaining ones receive the tag I-A (Inside), and all chunks outside receive the tag O-A.
The BIO representation of above example is shown in Fig.1. The BIO representation of above example is shown in Fig.1.
<div align="center"> <div align="center">
<img src="image/bio_example.png" width = "90%" align=center /><br> <img src="image/bio_example_en.png" width = "90%" align=center /><br>
Fig 2. BIO represention Fig 2. BIO represention
</div> </div>
输入序列-> input sequence
语块-> chunk
标注序列-> label sequence
角色-> role
This example illustrates the simplicity of sequence tagging because (1) shallow syntactic analysis reduces the precision requirement of syntactic analysis; (2) pruning candidate arguments is removed; 3) argument identification and tagging are finished at the same time. Such unified methods simplify the procedure, reduce the risk of accumulating errors and boost the performance further. This example illustrates the simplicity of sequence tagging because (1) shallow syntactic analysis reduces the precision requirement of syntactic analysis; (2) pruning candidate arguments is removed; 3) argument identification and tagging are finished at the same time. Such unified methods simplify the procedure, reduce the risk of accumulating errors and boost the performance further.
In this tutorial, our SRL system is built as an end-to-end system via a neural network. We take only text sequences, without using any syntactic parsing results or complex hand-designed features. We give public dataset [CoNLL-2004 and CoNLL-2005 Shared Tasks](http://www.cs.upc.edu/~srlconll/) as an example to illustrate: given a sentence with predicates marked, identify the corresponding arguments and their semantic roles by sequence tagging method. In this tutorial, our SRL system is built as an end-to-end system via a neural network. We take only text sequences, without using any syntactic parsing results or complex hand-designed features. We give public dataset [CoNLL-2004 and CoNLL-2005 Shared Tasks](http://www.cs.upc.edu/~srlconll/) as an example to illustrate: given a sentence with predicates marked, identify the corresponding arguments and their semantic roles by sequence tagging method.
...@@ -71,13 +57,10 @@ The operation of a single LSTM cell contain 3 parts: (1) input-to-hidden: map in ...@@ -71,13 +57,10 @@ The operation of a single LSTM cell contain 3 parts: (1) input-to-hidden: map in
Fig.3 illustrate the final stacked recurrent neural networks. Fig.3 illustrate the final stacked recurrent neural networks.
<p align="center"> <p align="center">
<img src="./image/stacked_lstm.png" width = "40%" align=center><br> <img src="./image/stacked_lstm_en.png" width = "40%" align=center><br>
Fig 3. Stacked Recurrent Neural Networks Fig 3. Stacked Recurrent Neural Networks
</p> </p>
线性变换-> linear transformation
输入层到隐层-> input-to-hidden
### Bidirectional Recurrent Neural Network ### Bidirectional Recurrent Neural Network
LSTMs can summarize the history of previous inputs seen up to now, but can not see the future. In most of NLP (natural language processing) tasks, the entire sentences are ready to use. Therefore, sequential learning might be much efficient if the future can be encoded as well like histories. LSTMs can summarize the history of previous inputs seen up to now, but can not see the future. In most of NLP (natural language processing) tasks, the entire sentences are ready to use. Therefore, sequential learning might be much efficient if the future can be encoded as well like histories.
...@@ -86,15 +69,10 @@ To address the above drawbacks, we can design bidirectional recurrent neural net ...@@ -86,15 +69,10 @@ To address the above drawbacks, we can design bidirectional recurrent neural net
<p align="center"> <p align="center">
<img src="./image/bidirectional_stacked_lstm.png" width = "60%" align=center><br> <img src="./image/bidirectional_stacked_lstm_en.png" width = "60%" align=center><br>
Fig 4. Bidirectional LSTMs Fig 4. Bidirectional LSTMs
</p> </p>
线性变换-> linear transformation
输入层到隐层-> input-to-hidden
正向处理输出序列->process sequence in the forward direction
反向处理上一层序列-> process sequence from the previous layer in backward direction
Note that, this bidirectional RNNs is different with the one proposed by Bengio et al. in machine translation tasks \[[3](#Reference), [4](#Reference)\]. We will introduce another bidirectional RNNs in the following tasks[machine translation](https://github.com/PaddlePaddle/book/blob/develop/machine_translation/README.md) Note that, this bidirectional RNNs is different with the one proposed by Bengio et al. in machine translation tasks \[[3](#Reference), [4](#Reference)\]. We will introduce another bidirectional RNNs in the following tasks[machine translation](https://github.com/PaddlePaddle/book/blob/develop/machine_translation/README.md)
### Conditional Random Field ### Conditional Random Field
...@@ -156,18 +134,10 @@ After modification, the model is as follows: ...@@ -156,18 +134,10 @@ After modification, the model is as follows:
<div align="center"> <div align="center">
<img src="image/db_lstm_network.png" width = "60%" align=center /><br> <img src="image/db_lstm_en.png" width = "60%" align=center /><br>
Fig 6. DB-LSTM for SRL tasks Fig 6. DB-LSTM for SRL tasks
</div> </div>
论元-> argu
谓词-> pred
谓词上下文-> ctx-p
谓词上下文区域标记-> $m_r$
输入-> input
原句-> sentence
反向LSTM-> LSTM Reverse
## Data Preparation ## Data Preparation
In the tutorial, we use [CoNLL 2005](http://www.cs.upc.edu/~srlconll/) SRL task open dataset as an example. It is important to note that the training set and development set of the CoNLL 2005 SRL task are not free to download after the competition. Currently, only the test set can be obtained, including 23 sections of the Wall Street Journal and three sections of the Brown corpus. In this tutorial, we use the WSJ corpus as the training dataset to explain the model. However, since the training set is small, if you want to train a usable neural network SRL system, consider paying for the full corpus. In the tutorial, we use [CoNLL 2005](http://www.cs.upc.edu/~srlconll/) SRL task open dataset as an example. It is important to note that the training set and development set of the CoNLL 2005 SRL task are not free to download after the competition. Currently, only the test set can be obtained, including 23 sections of the Wall Street Journal and three sections of the Brown corpus. In this tutorial, we use the WSJ corpus as the training dataset to explain the model. However, since the training set is small, if you want to train a usable neural network SRL system, consider paying for the full corpus.
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册