Merge pull request #39 from 0YuanZhang0/sequence_tagging

add_predict_and_eval

434c5c20 · pkpk · GitHub · 8a312a95 · c9eec53e · 434c5c20
16 changed files
--- a/hapi/text/__init__.py
+++ b/hapi/text/__init__.py
@@ -25,7 +25,8 @@ from hapi.text.text import TransformerDecoderLayer as TransformerDecoderLayer
 from hapi.text.text import TransformerEncoder as TransformerEncoder
 from hapi.text.text import TransformerDecoder as TransformerDecoder
 from hapi.text.text import TransformerBeamSearchDecoder as TransformerBeamSearchDecoder
-from hapi.text.text import DynamicGRU as DynamicGRU
+from hapi.text.text import GRUCell as GRUCell
+from hapi.text.text import GRUEncoderCell as GRUEncoderCell
 from hapi.text.text import BiGRU as BiGRU
 from hapi.text.text import Linear_chain_crf as Linear_chain_crf
 from hapi.text.text import Crf_decoding as Crf_decoding

--- a/hapi/text/text.py
+++ b/hapi/text/text.py
--- a/sequence_tagging/README.md
+++ b/sequence_tagging/README.md
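+# Sequence Tagging
+
+## 1. Introduction
+
+Sequence Tagging is a sequence labeling model that can be used for tasks such as word segmentation, part-of-speech tagging, and named entity recognition. We evaluated the three tasks jointly (that is, with a single joint-label model) on our in-house dataset; the results are shown in the table below.
+
+|Model|Precision|Recall|F1-score|
+|:-:|:-:|:-:|:-:|
+|Lexical Analysis|88.26%|89.20%|88.73%|
+
+## 2. Quick Start
+
+### Installation
+
+#### 1. Install PaddlePaddle
+
+This project requires PaddlePaddle 1.7 or later and PaddleHub 1.0.0 or later. To install PaddlePaddle, see the official [quick start guide](http://www.paddlepaddle.org/paddle#quick-start); to install PaddleHub, see [PaddleHub](https://github.com/PaddlePaddle/PaddleHub).
+
+> Warning: The GPU and CPU builds of PaddlePaddle are distributed as paddlepaddle-gpu and paddlepaddle respectively; make sure you install the right one.
+
+#### 2. Clone the code
+Clone the toolkit repository to your local machine: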
+```bash
+ git clone https://github.com/PaddlePaddle/hapi.git
+ cd hapi/sequence_tagging
+```
+
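+#### 3. Environment requirements
+PaddlePaddle requires Python 2.7.15+ for Python 2, or 3.5.1+/3.6/3.7 for Python 3. The sequence tagging code itself runs on both Python 2 and Python 3 with no specific version constraint.
+
+### Data Preparation
+
+#### 1. Quick download
+
+The **datasets** and **trained models** used by this project can be downloaded in one step with the script below. If you only need part of the data or models, follow steps 2 and 3 instead.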
+
+```bash
+python downloads.py all
+```
+Or, in an environment that can run shell scripts:
+```bash
+sh downloads.sh
+```
+
+#### 2. Training dataset
+
+Download the dataset archive; extracting it creates the `./data/` directory.
+```bash
+python downloads.py dataset
+```
+
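+#### 3. Pretrained model
+
+We have open-sourced a lexical analysis model trained on our in-house dataset for direct use. It can be downloaded with the following command: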
+```bash
+# download baseline model
+python downloads.py lac
+```
+
+### Training
+Using the sample dataset, you can train on the training set `./data/train.tsv` with the commands below.
+
+Single-GPU training
+```
+# setting visible devices for training
+export CUDA_VISIBLE_DEVICES=0
+
+python -u train.py \
+          --train_file ./data/train.tsv \
+          --test_file ./data/test.tsv \
+          --word_dict_path ./conf/word.dic \
+          --label_dict_path ./conf/tag.dic \
+          --word_rep_dict_path ./conf/q2b.dic \
+          --device gpu \
+          --grnn_hidden_dim 128 \
+          --word_emb_dim 128 \
+          --bigru_num 2 \
+          --base_learning_rate 1e-3 \
+          --batch_size 300 \
+          --epoch 10 \
+          --save_dir   ./model \
+          --num_devices 1 \
+          -d
+
+# -d: train in dynamic graph (dygraph) mode; remove -d from the command line to train in static graph mode
+```
+Multi-GPU training
+```
+# setting visible devices for training
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+
+python -m paddle.distributed.launch --selected_gpus=0,1,2,3  train.py \
+          --train_file ./data/train.tsv \
+          --test_file  ./data/test.tsv \
+          --word_dict_path ./conf/word.dic \
+          --label_dict_path ./conf/tag.dic \
+          --word_rep_dict_path ./conf/q2b.dic \
+          --device gpu \
+          --grnn_hidden_dim 128 \
+          --word_emb_dim 128 \
+          --bigru_num 2 \
+          --base_learning_rate 1e-3 \
+          --batch_size 300 \
+          --epoch 10 \
+          --save_dir   ./model \
+          -d
+
+# -d: train in dynamic graph (dygraph) mode; remove -d from the command line to train in static graph mode
+```
+CPU training
+```
+python -u train.py \
+          --train_file ./data/train.tsv \
+          --test_file ./data/test.tsv \
+          --word_dict_path ./conf/word.dic \
+          --label_dict_path ./conf/tag.dic \
+          --word_rep_dict_path ./conf/q2b.dic \
+          --device cpu \
+          --grnn_hidden_dim 128 \
+          --word_emb_dim 128 \
+          --bigru_num 2 \
+          --base_learning_rate 1e-3 \
+          --batch_size 300 \
+          --epoch 10 \
+          --save_dir   ./model \
+          -d
+
+```
+
+### Prediction
+
+Load a trained model and run prediction on unseen data:
+```bash
+python predict.py \
+      --predict_file ./data/infer.tsv \
+      --word_dict_path ./conf/word.dic \
+      --label_dict_path ./conf/tag.dic \
+      --word_rep_dict_path ./conf/q2b.dic \
+      --init_from_checkpoint  model_baseline/params \
+      --output_file predict.result  \
+      --mode predict \
+      --device cpu  \
+      -d 
+  
+# -d: run in dynamic graph (dygraph) mode; remove -d from the command line to run in static graph mode
+
+```
+
+### Evaluation
+
+We trained a lexical analysis model on our in-house dataset; you can use it directly to evaluate on the test set `./data/test.tsv`:
+```bash
+# baseline model
+python eval.py \
+        --test_file  ./data/test.tsv \
+        --word_dict_path ./conf/word.dic  \
+        --label_dict_path ./conf/tag.dic  \
+        --word_rep_dict_path ./conf/q2b.dic \
+        --init_from_checkpoint  ./model_baseline/params \
+        --device cpu  \
+        -d
+
+# -d: run in dynamic graph (dygraph) mode; remove -d from the command line to run in static graph mode
+```
+
+
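+## 3. Advanced Usage
+
+### Task definition and modeling
+The input to a sequence tagging task is a string (referred to below as a "sentence"), and the output is the word boundaries and categories within that sentence. Sequence tagging is the classic way to model lexical analysis. We use a GRU-based network to learn features and feed the learned features into a CRF decoding layer to perform the tagging. The CRF decoding layer essentially replaces the linear model of a traditional CRF with a nonlinear neural network; because it is based on sentence-level likelihood, it handles the label bias problem better. The key points of the model are as follows.
+
+1. The input is represented in one-hot form, with each character represented by an id;
+2. The one-hot sequence is converted, via a character table, into a sequence of real-valued character vectors;
+3. The character vector sequence is fed into a bidirectional GRU, which learns a feature representation of the input sequence and produces a new feature sequence; we stack two bidirectional GRU layers to increase learning capacity;
+4. The CRF takes the features learned by the GRU as input and the label sequence as the supervision signal to perform sequence tagging.
+
+The in-house data available for download is a joint dataset annotated simultaneously for word segmentation, part-of-speech tagging, and named entity recognition. The part-of-speech and named-entity tag set is listed in the table below: 24 part-of-speech tags (lowercase) and 4 named-entity category tags (uppercase). Note that the four categories person, location, organization, and time each have two sets of tags in the table (PER / LOC / ORG / TIME and nr / ns / nt / t). Words tagged with the second set are person, location, organization, and time words that the model judged with low confidence. Developers can use the two sets to make their own trade-off between precision and recall for these four categories.
+
+| Tag  | Meaning             | Tag  | Meaning              | Tag  | Meaning             | Tag  | Meaning      |
+| ---- | ------------------- | ---- | -------------------- | ---- | ------------------- | ---- | ------------ |
+| n    | common noun         | f    | locative noun        | s    | place noun          | t    | time         |
+| nr   | person name         | ns   | place name           | nt   | organization name   | nw   | work title   |
+| nz   | other proper noun   | v    | common verb          | vd   | adverbial verb      | vn   | nominal verb |
+| a    | adjective           | ad   | adverbial adjective  | an   | nominal adjective   | d    | adverb       |
+| m    | numeral             | q    | measure word         | r    | pronoun             | p    | preposition  |
+| c    | conjunction         | u    | particle             | xc   | other function word | w    | punctuation  |
+| PER  | person name         | LOC  | place name           | ORG  | organization name   | TIME | time         |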
+
+### Model architecture
+The model described above is illustrated in the figure below:<br />
+
+<p align="center">
+<img src="./images/gru-crf-model.png" width = "340" height = "300" /> <br />
+Overall Architecture of GRU-CRF-MODEL
+</p>
+
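+### Data format
+Users can organize their own training data to suit their application. Apart from the first line, which is the fixed header `text_a\tlabel`, every line consists of two tab-separated columns: the first column is UTF-8 encoded Chinese text with characters separated by `\002`, and the second column holds the tag for each character, also separated by `\002`. We use the IOB2 tagging scheme: X-B marks the beginning of a word of type X, X-I marks its continuation, and O marks characters of no interest (in practice, O does not occur in joint part-of-speech and named-entity annotation). For example: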
+
+```text
+除\002了\002他\002续\002任\002十\002二\002届\002政\002协\002委\002员\002,\002马\002化\002腾\002,\002雷\002军\002,\002李\002彦\002宏\002也\002被\002推\002选\002为\002新\002一\002届\002全\002国\002人\002大\002代\002表\002或\002全\002国\002政\002协\002委\002员	p-B\002p-I\002r-B\002v-B\002v-I\002m-B\002m-I\002m-I\002ORG-B\002ORG-I\002n-B\002n-I\002w-B\002PER-B\002PER-I\002PER-I\002w-B\002PER-B\002PER-I\002w-B\002PER-B\002PER-I\002PER-I\002d-B\002p-B\002v-B\002v-I\002v-B\002a-B\002m-B\002m-I\002ORG-B\002ORG-I\002ORG-I\002ORG-I\002n-B\002n-I\002c-B\002n-B\002n-I\002ORG-B\002ORG-I\002n-B\002n-I
+```
+
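+ Along with the code, we release the full model and its dependency data. However, because the training data is very large, we do not release it; the `data` directory contains only a few samples that illustrate the input format.
+
+ The model's dependency data includes:
+    1. The vocabulary for input text, `word.dic` in the `conf` directory
+    2. The dictionary used to convert special characters in the input text, `q2b.dic` in the `conf` directory
+    3. The tag dictionary, `tag.dic` in the `conf` directory
+
+ Raw data must be preprocessed in both the training and prediction stages. The preprocessing steps are:
+
+    1. Extract sentences and tags from the raw data file to build the sentence sequence and the tag sequence
+    2. Convert the special characters in the sentence sequence
+    3. Look up the integer index of each token in the dictionary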
+
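+### Code structure
+```text
+├── README.md                          # this document
+├── data/                                   # dataset directory
+├── conf/                                   # dictionaries and default configuration
+├── images/                               # images used in the documentation
+├── utils/                                   # common utility functions
+├── train.py                               # training script
+├── predict.py                           # prediction script
+├── eval.py                               # lexical analysis evaluation script
+├── downloads.py                      # script for downloading data and models
+├── downloads.sh                      # script for downloading data and models
+└── reader.py                           # file reading functions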
+```
+
+
+## 4. Miscellaneous
+### Citing sequence tagging
+
+If you use sequence tagging in your academic work, please add the citation below. We are very glad that the sequence tagging model can help with your research.
+
+```text
+@article{jiao2018LAC,
+	title={Chinese Lexical Analysis with Deep Bi-GRU-CRF Network},
+	author={Jiao, Zhenyu and Sun, Shuqi and Sun, Ke},
+	journal={arXiv preprint arXiv:1807.01882},
+	year={2018},
+	url={https://arxiv.org/abs/1807.01882}
+}
+```
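+### Contributing
+If you can fix an issue or add a new feature, you are welcome to submit a PR. If the PR is accepted, we will score it on the quality and difficulty of the contribution (0-5, higher is better). Once you have accumulated 10 points, you can contact us for an interview opportunity or a recommendation letter.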
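The data format described in the README (two tab-separated columns, `\002`-separated characters and tags, X-B to start a word and X-I to continue it) can be turned back into words with a few lines of Python. This is a minimal sketch under those stated assumptions; `parse_line` and `tags_to_words` are illustrative names and are not part of `reader.py`.

```python
SEP = "\002"  # separator between characters and between tags


def parse_line(line):
    """Split one data row into (characters, tags)."""
    text, labels = line.rstrip("\n").split("\t")
    chars = text.split(SEP)
    tags = labels.split(SEP)
    if len(chars) != len(tags):
        raise ValueError("character and tag counts differ")
    return chars, tags


def tags_to_words(chars, tags):
    """Group characters into (word, type) pairs using the X-B / X-I tags."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "O":
            words.append((ch, "O"))
            continue
        label, _, position = tag.rpartition("-")
        # Start a new word on X-B, or when an X-I does not continue the previous type.
        if position == "B" or not words or words[-1][1] != label:
            words.append((ch, label))
        else:
            words[-1] = (words[-1][0] + ch, label)
    return words


example = SEP.join("马化腾也") + "\t" + SEP.join(["PER-B", "PER-I", "PER-I", "d-B"])
chars, tags = parse_line(example)
print(tags_to_words(chars, tags))
```

The same grouping applies to a line of `predict.result` once it is paired with the characters of the corresponding input sentence.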
--- a/sequence_tagging/conf/q2b.dic
+++ b/sequence_tagging/conf/q2b.dic
+　	 
+、	,
+。	.
+—	-
+～	~
+‖	|
+…	.
+‘	'
+’	'
+“	"
+”	"
+〔	(
+〕	)
+〈	<
+〉	>
+「	'
+」	'
+『	"
+』	"
+〖	[
+〗	]
+【	[
+】	]
+∶	:
+＄	$
+！	!
+＂	"
+＃	#
+％	%
+＆	&
+＇	'
+（	(
+）	)
+＊	*
+＋	+
+，	,
+－	-
+．	.
+／	/
+０	0
+１	1
+２	2
+３	3
+４	4
+５	5
+６	6
+７	7
+８	8
+９	9
+：	:
+；	;
+＜	<
+＝	=
+＞	>
+？	?
+＠	@
+Ａ	a
+Ｂ	b
+Ｃ	c
+Ｄ	d
+Ｅ	e
+Ｆ	f
+Ｇ	g
+Ｈ	h
+Ｉ	i
+Ｊ	j
+Ｋ	k
+Ｌ	l
+Ｍ	m
+Ｎ	n
+Ｏ	o
+Ｐ	p
+Ｑ	q
+Ｒ	r
+Ｓ	s
+Ｔ	t
+Ｕ	u
+Ｖ	v
+Ｗ	w
+Ｘ	x
+Ｙ	y
+Ｚ	z
+［	[
+＼	\
+］	]
+＾	^
+＿	_
+｀	`
+ａ	a
+ｂ	b
+ｃ	c
+ｄ	d
+ｅ	e
+ｆ	f
+ｇ	g
+ｈ	h
+ｉ	i
+ｊ	j
+ｋ	k
+ｌ	l
+ｍ	m
+ｎ	n
+ｏ	o
+ｐ	p
+ｑ	q
+ｒ	r
+ｓ	s
+ｔ	t
+ｕ	u
+ｖ	v
+ｗ	w
+ｘ	x
+ｙ	y
+ｚ	z
+｛	{
+｜	|
+｝	}
+￣	~
+〝	"
+〞	"
+﹐	,
+﹑	,
+﹒	.
+﹔	;
+﹕	:
+﹖	?
+﹗	!
+﹙	(
+﹚	)
+﹛	{
+﹜	}
+﹝	[
+﹞	]
+﹟	#
+﹠	&
+﹡	*
+﹢	+
+﹣	-
+﹤	<
+﹥	>
+﹦	=
+﹨	\
+﹩	$
+﹪	%
+﹫	@
+ 	,
+A	a
+B	b
+C	c
+D	d
+E	e
+F	f
+G	g
+H	h
+I	i
+J	j
+K	k
+L	l
+M	m
+N	n
+O	o
+P	p
+Q	q
+R	r
+S	s
+T	t
+U	u
+V	v
+W	w
+X	x
+Y	y
+Z	z
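Each row of `q2b.dic` maps one source character to its replacement, separated by a tab. The sketch below shows how such a table could be loaded and applied character by character; it assumes that layout, works on in-memory lines rather than the real file, and uses illustrative names (`load_replacements`, `normalize`) that do not appear in the repository.

```python
def load_replacements(lines):
    """Build a {source_char: replacement} table from tab-separated rows."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")  # keep other whitespace: a space can be a valid entry
        if "\t" not in line:
            continue
        src, dst = line.split("\t", 1)
        table[src] = dst
    return table


def normalize(text, table):
    """Replace every character that has an entry; leave the rest unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# A few rows in the same layout as q2b.dic.
sample_rows = ["，\t,", "Ａ\ta", "A\ta", "（\t(", "）\t)"]
table = load_replacements(sample_rows)
print(normalize("Ａ（A），", table))
```

Note that the table lowercases ASCII letters as well as converting full-width forms, so text normalized this way is case-insensitive from the model's point of view.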
--- a/sequence_tagging/conf/tag.dic
+++ b/sequence_tagging/conf/tag.dic
+0	a-B
+1	a-I
+2	ad-B
+3	ad-I
+4	an-B
+5	an-I
+6	c-B
+7	c-I
+8	d-B
+9	d-I
+10	f-B
+11	f-I
+12	m-B
+13	m-I
+14	n-B
+15	n-I
+16	nr-B
+17	nr-I
+18	ns-B
+19	ns-I
+20	nt-B
+21	nt-I
+22	nw-B
+23	nw-I
+24	nz-B
+25	nz-I
+26	p-B
+27	p-I
+28	q-B
+29	q-I
+30	r-B
+31	r-I
+32	s-B
+33	s-I
+34	t-B
+35	t-I
+36	u-B
+37	u-I
+38	v-B
+39	v-I
+40	vd-B
+41	vd-I
+42	vn-B
+43	vn-I
+44	w-B
+45	w-I
+46	xc-B
+47	xc-I
+48	PER-B
+49	PER-I
+50	LOC-B
+51	LOC-I
+52	ORG-B
+53	ORG-I
+54	TIME-B
+55	TIME-I
+56	O
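`tag.dic` lists one `id<TAB>label` pair per row. The sketch below loads such rows into two lookup tables and derives the label count as the largest id plus one, which matches the `num_labels` property in this commit's `reader.py`. It assumes string keys for the id-to-label table (as `predict.py` looks tags up with `str(id)`); `load_tag_dict` is an illustrative name, not a function from the repository.

```python
def load_tag_dict(lines):
    """Return (id2label, label2id) from 'id<TAB>label' rows."""
    id2label = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue
        idx, label = parts
        id2label[idx] = label  # keys stay strings
    label2id = {label: int(idx) for idx, label in id2label.items()}
    return id2label, label2id


# Three rows in the same layout as tag.dic; ids are deliberately sparse here.
sample_rows = ["48\tPER-B", "49\tPER-I", "56\tO"]
id2label, label2id = load_tag_dict(sample_rows)

# With sparse ids, len() would undercount; max id + 1 still covers every id.
num_labels = max(label2id.values()) + 1
print(len(label2id), num_labels)
```

Sizing by the largest id rather than by the number of entries keeps every id a valid index even when the dictionary has gaps or duplicate labels.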
--- a/sequence_tagging/conf/word.dic
+++ b/sequence_tagging/conf/word.dic
--- a/sequence_tagging/downloads.py
+++ b/sequence_tagging/downloads.py
+# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+Download script, download dataset and pretrain models.
+"""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import io
+import os
+import sys
+import time
+import hashlib
+import tarfile
+import requests
+
+FILE_INFO = {
+    'BASE_URL': 'https://baidu-nlp.bj.bcebos.com/',
+    'DATA': {
+        'name': 'lexical_analysis-dataset-2.0.0.tar.gz',
+        'md5': '71e4a9a36d0f0177929a1bccedca7dba'
+    },
+    'LAC_MODEL': {
+        'name': 'lexical_analysis-2.0.0.tar.gz',
+        'md5': "fc1daef00de9564083c7dc7b600504ca"
+    },
+}
+
+
+def usage():
+    desc = ("\nDownload datasets and pretrained models for LAC.\n"
+            "Usage:\n"
+            "   1. python downloads.py all\n"
+            "   2. python downloads.py dataset\n"
+            "   3. python downloads.py lac\n")
+    print(desc)
+
+
+def md5file(fname):
+    hash_md5 = hashlib.md5()
+    with io.open(fname, "rb") as fin:
+        for chunk in iter(lambda: fin.read(4096), b""):
+            hash_md5.update(chunk)
+    return hash_md5.hexdigest()
+
+
+def extract(fname, dir_path):
+    """
+    Extract tar.gz file
+    """
+    try:
+        tar = tarfile.open(fname, "r:gz")
+        file_names = tar.getnames()
+        for file_name in file_names:
+            tar.extract(file_name, dir_path)
+            print(file_name)
+        tar.close()
+    except Exception as e:
+        raise e
+
+
+def _download(url, filename, md5sum):
+    """
+    Download file and check md5
+    """
+    retry = 0
+    retry_limit = 3
+    chunk_size = 4096
+    while not (os.path.exists(filename) and md5file(filename) == md5sum):
+        if retry < retry_limit:
+            retry += 1
+        else:
+            raise RuntimeError(
+                "Cannot download dataset ({0}) with retry {1} times.".format(
+                    url, retry_limit))
+        try:
+            start = time.time()
+            size = 0
+            res = requests.get(url, stream=True)
+            filesize = int(res.headers['content-length'])
+            if res.status_code == 200:
+                print("[Filesize]: %0.2f MB" % (filesize / 1024 / 1024))
+                # save by chunk
+                with io.open(filename, "wb") as fout:
+                    for chunk in res.iter_content(chunk_size=chunk_size):
+                        if chunk:
+                            fout.write(chunk)
+                            size += len(chunk)
+                            pr = '>' * int(size * 50 / filesize)
+                            print(
+                                '\r[Process ]: %s%.2f%%' %
+                                (pr, float(size / filesize * 100)),
+                                end='')
+            end = time.time()
+            print("\n[CostTime]: %.2f s" % (end - start))
+        except Exception as e:
+            print(e)
+
+
+def download(name, dir_path):
+    url = FILE_INFO['BASE_URL'] + FILE_INFO[name]['name']
+    file_path = os.path.join(dir_path, FILE_INFO[name]['name'])
+
+    if not os.path.exists(dir_path):
+        os.makedirs(dir_path)
+
+    # download data
+    print("Downloading : %s" % name)
+    _download(url, file_path, FILE_INFO[name]['md5'])
+
+    # extract data
+    print("Extracting : %s" % file_path)
+    extract(file_path, dir_path)
+    os.remove(file_path)
+
+
+if __name__ == '__main__':
+    if len(sys.argv) != 2:
+        usage()
+        sys.exit(1)
+    pwd = os.path.join(os.path.dirname(__file__), './')
+    ernie_dir = os.path.join(os.path.dirname(__file__), './pretrained')
+
+    if sys.argv[1] == 'all':
+        download('DATA', pwd)
+        download('LAC_MODEL', pwd)
+
+    elif sys.argv[1] == "dataset":
+        download('DATA', pwd)
+
+    elif sys.argv[1] == "lac":
+        download('LAC_MODEL', pwd)
+
+    else:
+        usage()
--- a/sequence_tagging/downloads.sh
+++ b/sequence_tagging/downloads.sh
+#!/bin/bash
+
+# download baseline model file to ./model_baseline/
+if [ -d ./model_baseline/ ]
+then
+    echo "./model_baseline/ directory already existed, ignore download"
+else
+    wget --no-check-certificate https://baidu-nlp.bj.bcebos.com/lexical_analysis-2.0.0.tar.gz
+    tar xvf lexical_analysis-2.0.0.tar.gz
+    /bin/rm lexical_analysis-2.0.0.tar.gz
+fi
+
+# download dataset file to ./data/
+if [ -d ./data/ ]
+then
+    echo "./data/ directory already existed, ignore download"
+else
+    wget --no-check-certificate https://baidu-nlp.bj.bcebos.com/lexical_analysis-dataset-2.0.0.tar.gz
+    tar xvf lexical_analysis-dataset-2.0.0.tar.gz
+    /bin/rm lexical_analysis-dataset-2.0.0.tar.gz
+fi
+
--- a/sequence_tagging/eval.py
+++ b/sequence_tagging/eval.py
+#   Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+SequenceTagging network structure
+"""
+
+from __future__ import division
+from __future__ import print_function
+
+import io
+import os
+import sys
+import math
+import argparse
+import numpy as np
+
+from train import SeqTagging
+from utils.check import check_gpu, check_version
+from utils.metrics import chunk_count
+from reader import LacDataset, create_lexnet_data_generator, create_dataloader
+
+work_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+sys.path.append(work_dir)
+from hapi.model import set_device, Input
+
+import paddle.fluid as fluid
+from paddle.fluid.optimizer import AdamOptimizer
+from paddle.fluid.layers.utils import flatten
+
+
+def main(args):
+    place = set_device(args.device)
+    fluid.enable_dygraph(place) if args.dynamic else None
+
+    inputs = [Input([None, None], 'int64', name='words'), 
+              Input([None], 'int64', name='length')] 
+
+    feed_list = None if args.dynamic else [x.forward() for x in inputs]
+    dataset = LacDataset(args)
+    eval_path = args.test_file
+
+    chunk_evaluator = fluid.metrics.ChunkEvaluator()
+    chunk_evaluator.reset()
+
+    eval_generator = create_lexnet_data_generator(
+        args, reader=dataset, file_name=eval_path, place=place, mode="test")
+
+    eval_dataset = create_dataloader(
+        eval_generator, place, feed_list=feed_list)
+
+    vocab_size = dataset.vocab_size
+    num_labels = dataset.num_labels
+    model = SeqTagging(args, vocab_size, num_labels)
+
+    optim = AdamOptimizer(
+        learning_rate=args.base_learning_rate,
+        parameter_list=model.parameters())
+
+    model.mode = "test"
+    model.prepare(inputs=inputs)
+    model.load(args.init_from_checkpoint, skip_mismatch=True)
+
+    for data in eval_dataset():
+        if len(data) == 1: 
+            batch_data = data[0]
+            targets = np.array(batch_data[2])
+        else: 
+            batch_data = data
+            targets = batch_data[2].numpy()
+        inputs_data = [batch_data[0], batch_data[1]]
+        crf_decode, length = model.test(inputs=inputs_data)
+        num_infer_chunks, num_label_chunks, num_correct_chunks = chunk_count(crf_decode, targets, length, dataset.id2label_dict)
+        chunk_evaluator.update(num_infer_chunks, num_label_chunks, num_correct_chunks)
+    
+    precision, recall, f1 = chunk_evaluator.eval()
+    print("[test] P: %.5f, R: %.5f, F1: %.5f" % (precision, recall, f1))
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser("sequence tagging training")
+    parser.add_argument(
+        "-wd",
+        "--word_dict_path",
+        default=None,
+        type=str,
+        help='word dict path')
+    parser.add_argument(
+        "-ld",
+        "--label_dict_path",
+        default=None,
+        type=str,
+        help='label dict path')
+    parser.add_argument(
+        "-wrd",
+        "--word_rep_dict_path",
+        default=None,
+        type=str,
+        help='The path of the word replacement Dictionary.')
+    parser.add_argument(
+        "-dev",
+        "--device",
+        type=str,
+        default='gpu',
+        help="device to use, gpu or cpu")
+    parser.add_argument(
+        "-d", "--dynamic", action='store_true', help="enable dygraph mode")
+    parser.add_argument(
+        "-e", "--epoch", default=10, type=int, help="number of epoch")
+    parser.add_argument(
+        '-lr',
+        '--base_learning_rate',
+        default=1e-3,
+        type=float,
+        metavar='LR',
+        help='initial learning rate')
+    parser.add_argument(
+        "--word_emb_dim",
+        default=128,
+        type=int,
+        help='word embedding dimension')
+    parser.add_argument(
+        "--grnn_hidden_dim", default=128, type=int, help="hidden dimension")
+    parser.add_argument(
+        "--bigru_num", default=2, type=int, help='the number of bi-rnn')
+    parser.add_argument("-elr", "--emb_learning_rate", default=1.0, type=float)
+    parser.add_argument("-clr", "--crf_learning_rate", default=1.0, type=float)
+    parser.add_argument(
+        "-b", "--batch_size", default=300, type=int, help="batch size")
+    parser.add_argument(
+        "--max_seq_len", default=126, type=int, help="max sequence length")
+    parser.add_argument(
+        "-n", "--num_devices", default=1, type=int, help="number of devices")
+    parser.add_argument(
+        "-o",
+        "--save_dir",
+        default="./model",
+        type=str,
+        help="save model path")
+    parser.add_argument(
+        "--init_from_checkpoint",
+        default=None,
+        type=str,
+        help="load init model parameters")
+    parser.add_argument(
+        "--init_from_pretrain_model",
+        default=None,
+        type=str,
+        help="load pretrain model parameters")
+    parser.add_argument(
+        "-sf", "--save_freq", default=1, type=int, help="save frequency")
+    parser.add_argument(
+        "-ef", "--eval_freq", default=1, type=int, help="eval frequency")
+    parser.add_argument(
+        "--output_file", default="predict.result", type=str, help="predict output file")
+    parser.add_argument(
+        "--predict_file", default="./data/infer.tsv", type=str, help="predict input file")
+    parser.add_argument(
+        "--test_file", default="./data/test.tsv", type=str, help="test file used for evaluation")
+    parser.add_argument(
+        "--train_file", default="./data/train.tsv", type=str, help="train file")
+    parser.add_argument(
+        "--mode", default="predict", type=str, help="train|test|predict")
+
+    args = parser.parse_args()
+    print(args)
+    use_gpu = True if args.device == "gpu" else False 
+    check_gpu(use_gpu)
+    check_version()
+
+    main(args)
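The precision, recall, and F1 printed at the end of `eval.py` are derived from three accumulated chunk counts. The sketch below shows the standard formulas in plain Python; `prf` and its argument names are illustrative and are not part of `fluid.metrics.ChunkEvaluator`, whose handling of edge cases may differ.

```python
from __future__ import division


def prf(num_infer_chunks, num_label_chunks, num_correct_chunks):
    """Compute (precision, recall, F1) from chunk counts."""
    precision = num_correct_chunks / num_infer_chunks if num_infer_chunks else 0.0
    recall = num_correct_chunks / num_label_chunks if num_label_chunks else 0.0
    if num_correct_chunks:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Example: the model predicted 4 chunks, the reference has 5, and 3 of them match.
precision, recall, f1 = prf(4, 5, 3)
print("P: %.5f, R: %.5f, F1: %.5f" % (precision, recall, f1))
```

F1 is the harmonic mean of precision and recall, which is consistent with the README table: a precision of 88.26% and a recall of 89.20% give an F1 of about 88.73%.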
--- a/sequence_tagging/images/gru-crf-model.png
+++ b/sequence_tagging/images/gru-crf-model.png
--- a/sequence_tagging/predict.py
+++ b/sequence_tagging/predict.py
+#   Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""
+SequenceTagging network structure
+"""
+
+from __future__ import division
+from __future__ import print_function
+
+import io
+import os
+import sys
+import math
+import argparse
+import numpy as np
+
+from train import SeqTagging
+from utils.check import check_gpu, check_version
+from reader import LacDataset, create_lexnet_data_generator, create_dataloader
+
+work_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+sys.path.append(work_dir)
+from hapi.model import set_device, Input
+
+import paddle.fluid as fluid
+from paddle.fluid.optimizer import AdamOptimizer
+from paddle.fluid.layers.utils import flatten
+
+
+def main(args):
+    place = set_device(args.device)
+    fluid.enable_dygraph(place) if args.dynamic else None
+
+    inputs = [Input([None, None], 'int64', name='words'), 
+              Input([None], 'int64', name='length')]
+
+    feed_list = None if args.dynamic else [x.forward() for x in inputs]
+    dataset = LacDataset(args)
+    predict_path = args.predict_file
+
+    predict_generator = create_lexnet_data_generator(
+        args, reader=dataset, file_name=predict_path, place=place, mode="predict")
+
+    predict_dataset = create_dataloader(
+        predict_generator, place, feed_list=feed_list)
+
+    vocab_size = dataset.vocab_size
+    num_labels = dataset.num_labels
+    model = SeqTagging(args, vocab_size, num_labels)
+
+    optim = AdamOptimizer(
+        learning_rate=args.base_learning_rate,
+        parameter_list=model.parameters())
+
+    model.mode = "test"
+    model.prepare(inputs=inputs)
+
+    model.load(args.init_from_checkpoint)
+
+    # Open in binary mode and encode explicitly so the write works on both Python 2 and 3.
+    with open(args.output_file, "wb") as f:
+        for data in predict_dataset():
+            if len(data) == 1:
+                input_data = data[0]
+            else:
+                input_data = data
+            results, length = model.test(inputs=flatten(input_data))
+            for i in range(len(results)):
+                word_len = length[i]
+                word_ids = results[i][:word_len]
+                tags = [dataset.id2label_dict[str(tag_id)] for tag_id in word_ids]
+                f.write(("\002".join(tags) + "\n").encode("utf-8"))
+        
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser("sequence tagging training")
+    parser.add_argument(
+        "-wd",
+        "--word_dict_path",
+        default=None,
+        type=str,
+        help='word dict path')
+    parser.add_argument(
+        "-ld",
+        "--label_dict_path",
+        default=None,
+        type=str,
+        help='label dict path')
+    parser.add_argument(
+        "-wrd",
+        "--word_rep_dict_path",
+        default=None,
+        type=str,
+        help='The path of the word replacement Dictionary.')
+    parser.add_argument(
+        "-dev",
+        "--device",
+        type=str,
+        default='gpu',
+        help="device to use, gpu or cpu")
+    parser.add_argument(
+        "-d", "--dynamic", action='store_true', help="enable dygraph mode")
+    parser.add_argument(
+        "-e", "--epoch", default=10, type=int, help="number of epoch")
+    parser.add_argument(
+        '-lr',
+        '--base_learning_rate',
+        default=1e-3,
+        type=float,
+        metavar='LR',
+        help='initial learning rate')
+    parser.add_argument(
+        "--word_emb_dim",
+        default=128,
+        type=int,
+        help='word embedding dimension')
+    parser.add_argument(
+        "--grnn_hidden_dim", default=128, type=int, help="hidden dimension")
+    parser.add_argument(
+        "--bigru_num", default=2, type=int, help='the number of bi-rnn')
+    parser.add_argument("-elr", "--emb_learning_rate", default=1.0, type=float)
+    parser.add_argument("-clr", "--crf_learning_rate", default=1.0, type=float)
+    parser.add_argument(
+        "-b", "--batch_size", default=300, type=int, help="batch size")
+    parser.add_argument(
+        "--max_seq_len", default=126, type=int, help="max sequence length")
+    parser.add_argument(
+        "-n", "--num_devices", default=1, type=int, help="number of devices")
+    parser.add_argument(
+        "-o",
+        "--save_dir",
+        default="./model",
+        type=str,
+        help="save model path")
+    parser.add_argument(
+        "--init_from_checkpoint",
+        default=None,
+        type=str,
+        help="load init model parameters")
+    parser.add_argument(
+        "--init_from_pretrain_model",
+        default=None,
+        type=str,
+        help="load pretrain model parameters")
+    parser.add_argument(
+        "-sf", "--save_freq", default=1, type=int, help="save frequency")
+    parser.add_argument(
+        "-ef", "--eval_freq", default=1, type=int, help="eval frequency")
+    parser.add_argument(
+        "--output_file", default="predict.result", type=str, help="predict output file")
+    parser.add_argument(
+        "--predict_file", default="./data/infer.tsv", type=str, help="predict input file")
+    parser.add_argument(
+        "--mode", default="train", type=str, help="train|test|predict")
+
+    args = parser.parse_args()
+    print(args)
+    use_gpu = True if args.device == "gpu" else False
+    check_gpu(use_gpu)
+    check_version()
+
+    main(args)
--- a/sequence_tagging/reader.py
+++ b/sequence_tagging/reader.py
@@ -21,7 +21,7 @@ from __future__ import print_function
 import io
 import numpy as np

-import paddle.fluid as fluid
+import paddle


 class LacDataset(object):
@@ -76,11 +76,11 @@ class LacDataset(object):

    @property
    def vocab_size(self):
-        return len(self.word2id_dict.values())
+        return max(self.word2id_dict.values()) + 1

    @property
    def num_labels(self):
-        return len(self.label2id_dict.values())
+        return max(self.label2id_dict.values()) + 1

    def get_num_examples(self, filename):
        """num of line of file"""
@@ -120,48 +120,121 @@ class LacDataset(object):

        def wrapper():
            fread = io.open(filename, "r", encoding="utf-8")
-            headline = next(fread)
-            headline = headline.strip().split('\t')
-            assert len(headline) == 2 and headline[0] == "text_a" and headline[
-                1] == "label"
-            buf = []
-            for line in fread:
-                words, labels = line.strip("\n").split("\t")
-                if len(words) < 1:
-                    continue
-                word_ids = self.word_to_ids(words.split("\002"))
-                label_ids = self.label_to_ids(labels.split("\002"))
-                assert len(word_ids) == len(label_ids)
-                word_ids = word_ids[0:max_seq_len]
-                words_len = np.int64(len(word_ids))
-                word_ids += [0 for _ in range(max_seq_len - words_len)]
-                label_ids = label_ids[0:max_seq_len]
-                label_ids += [0 for _ in range(max_seq_len - words_len)]
-                assert len(word_ids) == len(label_ids)
-                yield word_ids, label_ids, words_len
+            if mode == "train": 
+                headline = next(fread)
+                headline = headline.strip().split('\t')
+                assert len(headline) == 2 and headline[0] == "text_a" and headline[
+                    1] == "label"
+                buf = []
+                for line in fread:
+                    words, labels = line.strip("\n").split("\t")
+                    if len(words) < 1:
+                        continue
+                    word_ids = self.word_to_ids(words.split("\002"))
+                    label_ids = self.label_to_ids(labels.split("\002"))
+                    assert len(word_ids) == len(label_ids)
+                    words_len = np.int64(len(word_ids))
+                        
+                    word_ids = word_ids[0:max_seq_len]
+                    words_len = np.int64(len(word_ids))
+                    word_ids += [0 for _ in range(max_seq_len - words_len)]
+                    label_ids = label_ids[0:max_seq_len]
+                    label_ids += [0 for _ in range(max_seq_len - words_len)]
+                    assert len(word_ids) == len(label_ids)
+                    yield word_ids, label_ids, words_len
+            elif mode == "test": 
+                headline = next(fread)
+                headline = headline.strip().split('\t')
+                assert len(headline) == 2 and headline[0] == "text_a" and headline[
+                           1] == "label"
+                buf = []
+                for line in fread:
+                    words, labels = line.strip("\n").split("\t")
+                    if len(words) < 1:
+                        continue
+                    word_ids = self.word_to_ids(words.split("\002"))
+                    label_ids = self.label_to_ids(labels.split("\002"))
+                    assert len(word_ids) == len(label_ids)
+                    words_len = np.int64(len(word_ids))
+                    yield word_ids, label_ids, words_len
+            else: 
+                for line in fread: 
+                    words = line.strip("\n").split('\t')[0]
+                    if words == u"text_a": 
+                        continue
+                    if "\002" not in words: 
+                        word_ids = self.word_to_ids(words)
+                    else: 
+                        word_ids = self.word_to_ids(words.split("\002"))
+                    words_len = np.int64(len(word_ids))
+                    yield word_ids, words_len
+
            fread.close()

        return wrapper


-def create_lexnet_data_generator(args, reader, file_name, place, mode="train"):
-    def wrapper():
-        batch_words, batch_labels, seq_lens = [], [], []
-        for epoch in xrange(args.epoch):
+def create_lexnet_data_generator(args, reader, file_name, place, mode="train"): 
+    def padding_data(max_len, batch_data): 
+        padding_batch_data = []
+        for data in batch_data: 
+            data += [0 for _ in range(max_len - len(data))]
+            padding_batch_data.append(data)
+        return padding_batch_data
+
+    def wrapper(): 
+        if mode == "train": 
+            batch_words, batch_labels, seq_lens = [], [], []
+            for epoch in xrange(args.epoch):
+                for instance in reader.file_reader(
+                        file_name, mode, max_seq_len=args.max_seq_len)():
+                    words, labels, words_len = instance
+                    if len(seq_lens) < args.batch_size:
+                        batch_words.append(words)
+                        batch_labels.append(labels)
+                        seq_lens.append(words_len)
+                    if len(seq_lens) == args.batch_size: 
+                        yield batch_words, seq_lens, batch_labels, batch_labels
+                        batch_words, batch_labels, seq_lens = [], [], []
+
+            if len(seq_lens) > 0:
+                yield batch_words, seq_lens, batch_labels, batch_labels
+        elif mode == "test": 
+            batch_words, batch_labels, seq_lens, max_len = [], [], [], 0
            for instance in reader.file_reader(
-                    file_name, mode, max_seq_len=args.max_seq_len)():
+                file_name, mode, max_seq_len=args.max_seq_len)():
                words, labels, words_len = instance
+                max_len = words_len if words_len > max_len else max_len
                if len(seq_lens) < args.batch_size:
                    batch_words.append(words)
-                    batch_labels.append(labels)
                    seq_lens.append(words_len)
-                if len(seq_lens) == args.batch_size:
-                    yield batch_words, batch_labels, seq_lens, batch_labels
-                    batch_words, batch_labels, seq_lens = [], [], []
+                    batch_labels.append(labels)
+                if len(seq_lens) == args.batch_size: 
+                    padding_batch_words = padding_data(max_len, batch_words)
+                    padding_batch_labels = padding_data(max_len, batch_labels)
+                    yield padding_batch_words, seq_lens, padding_batch_labels, padding_batch_labels
+                    batch_words, batch_labels, seq_lens, max_len = [], [], [], 0
+            if len(seq_lens) > 0: 
+                padding_batch_words = padding_data(max_len, batch_words)
+                padding_batch_labels = padding_data(max_len, batch_labels)
+                yield padding_batch_words, seq_lens, padding_batch_labels, padding_batch_labels

-        if len(seq_lens) > 0:
-            yield batch_words, batch_labels, seq_lens, batch_labels
-            batch_words, batch_labels, seq_lens = [], [], []
+        else: 
+            batch_words, seq_lens, max_len = [], [], 0
+            for instance in reader.file_reader(
+                   file_name, mode, max_seq_len=args.max_seq_len)():
+                words, words_len = instance
+                if len(seq_lens) < args.batch_size:
+                    batch_words.append(words)
+                    seq_lens.append(words_len)
+                    max_len = words_len if words_len > max_len else max_len
+                if len(seq_lens) == args.batch_size: 
+                    padding_batch_words = padding_data(max_len, batch_words)
+                    yield padding_batch_words, seq_lens
+                    batch_words, seq_lens, max_len = [], [], 0
+            if len(seq_lens) > 0: 
+                padding_batch_words = padding_data(max_len, batch_words)
+                yield padding_batch_words, seq_lens

    return wrapper
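
The `padding_data` helper above pads each batch to its longest sequence and extends the input lists in place. A minimal sketch of the same idea that returns new lists instead; `pad_batch` and `pad_id` are hypothetical names used only for illustration and are not part of this change:

```python
def pad_batch(batch, pad_id=0):
    """Right-pad every sequence to the longest length in the batch.

    Returns new lists; the input sequences are left unmodified.
    """
    max_len = max(len(seq) for seq in batch) if batch else 0
    return [list(seq) + [pad_id] * (max_len - len(seq)) for seq in batch]


batch = [[5, 3], [7, 1, 4, 2], [9]]
padded = pad_batch(batch)
print(padded)
```

Copying each sequence costs a little memory but keeps the caller's data intact, which may matter if the same instances are reused across batches.
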


--- a/sequence_tagging/sequence_tagging.py
+++ b/sequence_tagging/sequence_tagging.py
@@ -24,11 +24,15 @@ import sys
 import math
 import argparse
 import numpy as np
-sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

-from metrics import Metric
-from model import Model, Input, Loss, set_device
-from text import SequenceTagging
+work_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+sys.path.append(work_dir)
+
+from hapi.metrics import Metric
+from hapi.model import Model, Input, Loss, set_device
+from hapi.text.text import SequenceTagging
+
+from utils.check import check_gpu, check_version
 from reader import LacDataset, create_lexnet_data_generator, create_dataloader 

 import paddle.fluid as fluid
@@ -47,6 +51,7 @@ class SeqTagging(Model):
            for infer: return the prediction
            otherwise: return the prediction
        """
+        self.mode_type = args.mode
        self.word_emb_dim = args.word_emb_dim
        self.vocab_size = vocab_size
        self.num_labels = num_labels
@@ -72,12 +77,18 @@ class SeqTagging(Model):
                        init_bound=self.init_bound,
                        length=self.length)

-    def forward(self, word, target, lengths):
+    def forward(self, *inputs):
        """
        Configure the network
        """
-        crf_decode, avg_cost, lengths = self.sequence_tagging(word, target, lengths)
-        return crf_decode, avg_cost, lengths
+        word = inputs[0]
+        lengths = inputs[1]
+        if self.mode_type == "train" or self.mode_type == "test": 
+            target = inputs[2]
+            outputs = self.sequence_tagging(word, lengths, target)
+        else: 
+            outputs = self.sequence_tagging(word, lengths)
+        return outputs


 class Chunk_eval(fluid.dygraph.Layer):
@@ -103,10 +114,9 @@ class Chunk_eval(fluid.dygraph.Layer):
            dtype="int64")
        num_correct_chunks = self._helper.create_variable_for_type_inference(
            dtype="int64")
-
-        this_input = {"Inference": input, "Label": label[0]}
-        if seq_length:
-            this_input["SeqLength"] = seq_length[0]
+        this_input = {"Inference": input, "Label": label}
+        if seq_length is not None:
+            this_input["SeqLength"] = seq_length
        self._helper.append_op(
            type='chunk_eval',
            inputs=this_input,
@@ -144,9 +154,10 @@ class ChunkEval(Metric):
            int(math.ceil((num_labels - 1) / 2.0)), "IOB")
        self.reset()

-    def add_metric_op(self, pred, label, *args, **kwargs):
-        crf_decode = pred[0]
-        lengths = pred[2]
+    def add_metric_op(self, *args): 
+        crf_decode = args[0]
+        lengths = args[2]
+        label = args[3]
        (num_infer_chunks, num_label_chunks,
         num_correct_chunks) = self.chunk_eval(
             input=crf_decode, label=label, seq_length=lengths)
@@ -194,18 +205,16 @@ def main(args):
    place = set_device(args.device)
    fluid.enable_dygraph(place) if args.dynamic else None

-    inputs = [
-        Input(
-            [None, args.max_seq_len], 'int64', name='words'), Input(
-                [None, args.max_seq_len], 'int64', name='target'), Input(
-                    [None], 'int64', name='length')
-    ]
-    labels = [Input([None, args.max_seq_len], 'int64', name='labels')]
+    inputs = [Input([None, None], 'int64', name='words'),
+              Input([None], 'int64', name='length'), 
+              Input([None, None], 'int64', name='target')]
+
+    labels = [Input([None, None], 'int64', name='labels')]

    feed_list = None if args.dynamic else [x.forward() for x in inputs + labels]
    dataset = LacDataset(args)
-    train_path = os.path.join(args.data, "train.tsv")
-    test_path = os.path.join(args.data, "test.tsv")
+    train_path = args.train_file
+    test_path = args.test_file

    train_generator = create_lexnet_data_generator(
        args, reader=dataset, file_name=train_path, place=place, mode="train")
@@ -233,8 +242,11 @@ def main(args):
        labels=labels,
        device=args.device)

-    if args.resume is not None:
-        model.load(args.resume)
+    if args.init_from_checkpoint:
+        model.load(args.init_from_checkpoint)
+
+    if args.init_from_pretrain_model:
+        model.load(args.init_from_pretrain_model, reset_optimizer=True)

    model.fit(train_dataset,
              test_dataset,
@@ -246,9 +258,7 @@ def main(args):


 if __name__ == '__main__':
-    parser = argparse.ArgumentParser("LAC training")
-    parser.add_argument(
-        "-dir", "--data", default=None, type=str, help='path to LAC dataset')
+    parser = argparse.ArgumentParser("sequence tagging training")
    parser.add_argument(
        "-wd",
        "--word_dict_path",
@@ -301,23 +311,41 @@ if __name__ == '__main__':
        "--max_seq_len", default=126, type=int, help="max sequence length")
    parser.add_argument(
        "-n", "--num_devices", default=1, type=int, help="number of devices")
-    parser.add_argument(
-        "-r",
-        "--resume",
-        default=None,
-        type=str,
-        help="checkpoint path to resume")
    parser.add_argument(
        "-o",
        "--save_dir",
        default="./model",
        type=str,
        help="save model path")
+    parser.add_argument(
+        "--init_from_checkpoint",
+        default=None,
+        type=str,
+        help="load init model parameters")
+    parser.add_argument(
+        "--init_from_pretrain_model",
+        default=None,
+        type=str,
+        help="load pretrain model parameters")
    parser.add_argument(
        "-sf", "--save_freq", default=1, type=int, help="save frequency")
    parser.add_argument(
        "-ef", "--eval_freq", default=1, type=int, help="eval frequency")
+    parser.add_argument(
+        "--output_file", default="predict.result", type=str, help="predict output file")
+    parser.add_argument(
+        "--predict_file", default="./data/infer.tsv", type=str, help="predict input file")
+    parser.add_argument(
+        "--test_file", default="./data/test.tsv", type=str, help="test (eval) input file")
+    parser.add_argument(
+        "--train_file", default="./data/train.tsv", type=str, help="train file")
+    parser.add_argument(
+        "--mode", default="train", type=str, help="train|test|predict")

    args = parser.parse_args()
    print(args)
+    use_gpu = args.device == "gpu"
+    check_gpu(use_gpu)
+    check_version()
+
    main(args)
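
The `--mode` flag above accepts any string, and `forward` treats every value other than `train` or `test` as prediction. A minimal sketch, assuming one wanted argparse to reject unknown modes up front, using the standard `choices` option; `build_parser` is a hypothetical helper and not part of this change:

```python
import argparse


def build_parser():
    """Build a parser whose --mode flag only accepts the three known modes."""
    parser = argparse.ArgumentParser("sequence tagging")
    parser.add_argument(
        "--mode",
        default="train",
        choices=["train", "test", "predict"],
        help="train|test|predict")
    return parser


# Parse an explicit argument list instead of sys.argv for the example.
args = build_parser().parse_args(["--mode", "predict"])
print(args.mode)
```

With `choices`, a mistyped mode makes argparse print a usage error and exit rather than silently falling through to the prediction branch.
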
--- a/sequence_tagging/utils/__init__.py
+++ b/sequence_tagging/utils/__init__.py
--- a/sequence_tagging/utils/check.py
+++ b/sequence_tagging/utils/check.py
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import sys
+
+import paddle.fluid as fluid
+
+__all__ = ['check_gpu', 'check_version']
+
+
+def check_gpu(use_gpu):
+    """
+    Log an error and exit if use_gpu is True while the installed
+    PaddlePaddle is the CPU-only version.
+    """
+    err = "Config use_gpu cannot be set as true while you are " \
+          "using paddlepaddle cpu version ! \nPlease try: \n" \
+          "\t1. Install paddlepaddle-gpu to run model on GPU \n" \
+          "\t2. Set use_gpu as false in config file to run " \
+          "model on CPU"
+
+    try:
+        if use_gpu and not fluid.is_compiled_with_cuda():
+            print(err)
+            sys.exit(1)
+    except Exception as e:
+        pass
+
+
+def check_version():
+    """
+    Log an error and exit when the installed PaddlePaddle version does
+    not meet the minimum requirement.
+    """
+    err = "PaddlePaddle version 1.7 or higher is required, " \
+          "or a suitable develop version also works. \n" \
+          "Please make sure the installed version matches your code."
+
+    try:
+        fluid.require_version('1.7.0')
+    except Exception as e:
+        print(err)
+        sys.exit(1)
--- a/sequence_tagging/utils/metrics.py
+++ b/sequence_tagging/utils/metrics.py
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import sys
+
+import paddle.fluid as fluid
+
+__all__ = ['chunk_count', "build_chunk"]
+
+
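+def build_chunk(data_list, id2label_dict):
+    """
+    Assemble chunks from a sequence of tag ids.
+
+    Returns a dict that maps "start_end" token spans (inclusive indices)
+    to the chunk type. A span is recorded only when the next one begins,
+    so the span still open at the end of the sequence is not added.
+    """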
+    tag_list = [id2label_dict.get(str(id)) for id in data_list]
+    ner_dict = {}
+    ner_str = ""
+    ner_start = 0
+    for i in range(len(tag_list)): 
+        tag = tag_list[i]
+        if tag == u"O": 
+            if i != 0: 
+                key = "%d_%d" % (ner_start, i - 1)
+                ner_dict[key] = ner_str
+            ner_start = i
+            ner_str = tag 
+        elif tag.endswith(u"B"): 
+            if i != 0: 
+                key = "%d_%d" % (ner_start, i - 1)
+                ner_dict[key] = ner_str
+            ner_start = i
+            ner_str = tag.split('-')[0]
+        elif tag.endswith(u"I"): 
+            if tag.split('-')[0] != ner_str: 
+                if i != 0: 
+                    key = "%d_%d" % (ner_start, i - 1)
+                    ner_dict[key] = ner_str
+                ner_start = i
+                ner_str = tag.split('-')[0]
+    return ner_dict
+                    
+
+def chunk_count(infer_numpy, label_numpy, seq_len, id2label_dict):
+    """
+    Count the predicted, labeled, and correctly predicted chunks in a
+    batch; the three counts are used to compute precision, recall and F1.
+    """
+    num_infer_chunks, num_label_chunks, num_correct_chunks = 0, 0, 0
+    assert infer_numpy.shape[0] == label_numpy.shape[0]
+
+    for i in range(infer_numpy.shape[0]): 
+        infer_list = infer_numpy[i][: seq_len[i]]
+        label_list = label_numpy[i][: seq_len[i]]
+        infer_dict = build_chunk(infer_list, id2label_dict)
+        num_infer_chunks += len(infer_dict)
+        label_dict = build_chunk(label_list, id2label_dict)
+        num_label_chunks += len(label_dict)
+        for key in infer_dict: 
+            if key in label_dict and label_dict[key] == infer_dict[key]: 
+                num_correct_chunks += 1
+    return num_infer_chunks, num_label_chunks, num_correct_chunks
+
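
`chunk_count` returns three totals: predicted chunks, labeled chunks, and correctly predicted chunks. A minimal sketch of how such totals are usually turned into precision, recall and F1; `prf_from_counts` is a hypothetical helper and is not defined in this change:

```python
from __future__ import division


def prf_from_counts(num_infer_chunks, num_label_chunks, num_correct_chunks):
    """Compute precision, recall and F1 from chunk counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = (num_correct_chunks / num_infer_chunks
                 if num_infer_chunks else 0.0)
    recall = (num_correct_chunks / num_label_chunks
              if num_label_chunks else 0.0)
    f1 = (2 * precision * recall / (precision + recall)
          if num_correct_chunks else 0.0)
    return precision, recall, f1


# Example: 8 predicted chunks, 10 labeled chunks, 6 of them correct.
precision, recall, f1 = prf_from_counts(8, 10, 6)
print(precision, recall, f1)
```

The counts can be summed over all batches first and converted once at the end, which gives a corpus-level score rather than an average of per-batch scores.
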