Unverified commit 6e509429, authored by 王肖, committed by GitHub

add similarity_net dygraph (#4289)

* Update README.md (#4267)

* test=develop (#4269)

* 3d use new api (#4275)

* PointNet++ and PointRCNN use new API

* Update Readme of Dygraph BERT (#4277)

Fix some typos.

* Update run_classifier_multi_gpu.sh (#4279)

remove the CUDA_VISIBLE_DEVICES

* Update README.md (#4280)

* add similarity_net dygraph
Co-authored-by: pkpk <xiyzhouang@gmail.com>
Co-authored-by: Kaipeng Deng <dengkaipeng@baidu.com>
Parent c4f3ebc3
#!/usr/bin/env bash
export FLAGS_enable_parallel_graph=1
export FLAGS_sync_nccl_allreduce=1
export FLAGS_fraction_of_gpu_memory_to_use=0.95
TASK_NAME='simnet'
TRAIN_DATA_PATH=./data/train_pointwise_data
VALID_DATA_PATH=./data/test_pointwise_data
TEST_DATA_PATH=./data/test_pointwise_data
INFER_DATA_PATH=./data/infer_data
VOCAB_PATH=./data/term2id.dict
CKPT_PATH=./model_files
TEST_RESULT_PATH=./test_result
INFER_RESULT_PATH=./infer_result
TASK_MODE='pointwise'
CONFIG_PATH=./config/bow_pointwise.json
INIT_CHECKPOINT=./model_files/simnet_bow_pointwise_pretrained_model/
# run_train
train() {
python run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda True \
--do_train True \
--do_valid True \
--do_test True \
--do_infer False \
--batch_size 128 \
--train_data_dir ${TRAIN_DATA_PATH} \
--valid_data_dir ${VALID_DATA_PATH} \
--test_data_dir ${TEST_DATA_PATH} \
--infer_data_dir ${INFER_DATA_PATH} \
--output_dir ${CKPT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--epoch 3 \
--save_steps 1000 \
--validation_steps 100 \
--compute_accuracy False \
--lamda 0.958 \
--task_mode ${TASK_MODE} \
--enable_ce
}
export CUDA_VISIBLE_DEVICES=0
train | python _ce.py
sleep 20
export CUDA_VISIBLE_DEVICES=0,1,2,3
train | python _ce.py
# Short Text Semantic Matching
## Introduction
### Task description
SimilarityNet (SimNet) is a framework for computing the semantic similarity of short texts: given two input texts, it produces a similarity score. SimNet is widely used across Baidu products. It provides core network structures including BOW, CNN, RNN, and MMDNN, together with training and prediction pipelines for semantic similarity, and applies to scenarios such as information retrieval, news recommendation, and intelligent customer service, helping enterprises solve semantic matching problems. An online demo is available at [Baidu AI Open Platform - Short Text Similarity](https://ai.baidu.com/tech/nlp_basic/simnet).
We also recommend the [IPython Notebook demo](https://aistudio.baidu.com/aistudio/projectDetail/124373).
### Results
We trained a SimNet-BOW-Pairwise semantic matching model on large-scale Baidu search data. In several real FAQ scenarios it improves AUC by more than 5% over lexical similarity methods. We evaluated it on Baidu-built test sets (covering chat, customer service, and other data) and on the semantic matching dataset LCQMC; the results are shown in the table below. LCQMC uses accuracy as its metric, while the pairwise model outputs a similarity score, so we use 0.958 as the classification threshold: our model reaches an accuracy of 0.7532, compared with 0.737 for the CBOW baseline of comparable network complexity.
| Model | Baidu Zhidao | ECOM | QQSIM | UNICOM |
|:-----------:|:-------------:|:-------------:|:-------------:|:-------------:|
| | AUC | AUC | AUC | pos/neg pair ratio |
| BOW_Pairwise | 0.6767 | 0.7329 | 0.7650 | 1.5630 |
#### Test sets
| Dataset | Source | Domain |
|:-----------:|:-------------:|:-------------:|
| Baidu Zhidao | Baidu Zhidao questions | daily life |
| ECOM | commercial queries | finance |
| QQSIM | casual chat | daily life |
| UNICOM | China Unicom customer service | customer service |
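Since the pairwise model outputs a similarity score rather than a class, the LCQMC accuracy above is obtained by thresholding the score at 0.958. A minimal sketch of that step (the helper name `score_to_label` is ours, not part of the repo):

```python
def score_to_label(score, threshold=0.958):
    """Map a pairwise similarity score to a binary label (1 = similar)."""
    return 1 if score >= threshold else 0

# Scores at or above the threshold count as "similar" when computing accuracy.
print(score_to_label(0.97))  # 1
print(score_to_label(0.40))  # 0
```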
## Quick start
#### Requirements
This project requires PaddlePaddle Fluid 1.6; see the [installation guide](http://www.paddlepaddle.org/#quick-start).
Python 2.7 is required.
#### Getting the code
Clone the models repository:
```shell
git clone https://github.com/PaddlePaddle/models.git
cd models/PaddleNLP/similarity_net
```
#### Data preparation
Download the preprocessed data. After the command finishes, the data directory contains sample training, dev, and test sets, along with the corresponding term-to-id dictionary (term2id.dict).
```shell
sh download_data.sh
```
or
```
python download.py dataset
```
#### Pretrained model
We have open-sourced a ```pairwise``` model (based on the BOW network) trained on large-scale data. Download it with the command below; it will be saved under ```./model_files/simnet_bow_pairwise_pretrained_model/```.
```shell
sh download_pretrained_model.sh
```
or
```
python download.py model
```
#### Evaluation
We have released our self-built test sets, covering Baidu Zhidao, ECOM, QQSIM, and UNICOM. With the pretrained model above, enter the evaluate directory and run the following commands to get the evaluation results on each test set.
```shell
sh evaluate_ecom.sh
sh evaluate_qqsim.sh
sh evaluate_zhidao.sh
sh evaluate_unicom.sh
```
You can also set TEST_DATA_PATH in ./run.sh to your own test set and evaluate it with:
```shell
sh run.sh eval
```
#### Inference
With the pretrained model above, run the command below to run inference and save the results locally.
```shell
sh run.sh infer
```
#### Training and validation
You can build training and dev sets from the sample data, then run the command below to train the model and validate it on the dev set.
```shell
sh run.sh train
```
You can also set INIT_CHECKPOINT inside the train() function of ./run.sh to warm-start training from a trained model.
## Advanced usage
### Task definition and modeling
Traditional text matching techniques, such as the vector space model (VSM) and BM25 in information retrieval, mainly address lexical-level similarity, so in practice they suffer from polysemy and differences in sentence structure. SimNet keeps the implicit continuous-vector semantic representation, but models semantic matching end to end in a deep learning framework, unifying both ```pointwise``` and ```pairwise``` supervised learning in a single framework. In real applications, massive user click logs can be converted into large-scale weakly labeled data; the first deployment on web search already brought a clear improvement in relevance.
### Model overview
The SimNet architecture is shown below:
<p align="center">
<img src="./struct.jpg"/> <br />
</p>
### Data format
There are two training modes: ```pairwise``` and ```pointwise```.
#### pairwise mode
Training set format: query \t pos_query \t neg_query.
query, pos_query, and neg_query are space-tokenized Chinese texts separated by tab characters ('\t'); pos_query is a positive example similar to query, and neg_query is a random negative example dissimilar to query. Files are UTF-8 encoded.</br>
```
现在 安卓模拟器 哪个 好 用 电脑 安卓模拟器 哪个 更好 电信 手机 可以 用 腾讯 大王 卡 吗 ?
土豆 一亩地 能 收 多少 斤 一亩 地土豆 产 多少 斤 一亩 地 用 多少 斤 土豆 种子
```
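A minimal reader for this training format could look like the sketch below (`parse_pairwise_line` is an illustrative helper, not a function from this repo):

```python
def parse_pairwise_line(line):
    """Split one tab-separated pairwise training line into
    (query, pos_query, neg_query); tokens inside each field stay space-separated."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError("expected 3 tab-separated fields, got %d" % len(fields))
    return tuple(fields)

query, pos_query, neg_query = parse_pairwise_line(
    u"现在 安卓模拟器 哪个 好 用\t电脑 安卓模拟器 哪个 更好\t电信 手机 可以 用 腾讯 大王 卡 吗 ?")
```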
Dev and test set format: query1 \t query2 \t label.</br>
query1 and query2 are space-tokenized Chinese texts; label is 0 or 1, where 1 means query1 and query2 are similar and 0 means they are not. query1, query2, and label are separated by tab characters ('\t'); files are UTF-8 encoded.</br>
```
现在 安卓模拟器 哪个 好 用 电脑 安卓模拟器 哪个 更好 1
为什么 头发 掉 得 很厉害 我 头发 为什么 掉 得 厉害 1
常喝 薏米 水 有 副 作用 吗 女生 可以 长期 喝 薏米 水养生 么 0
长 的 清新 是 什么 意思 小 清新 的 意思 是 什么 0
```
#### pointwise mode
Training, dev, and test sets share the same format: query1 and query2 are space-tokenized Chinese texts; label is 0 or 1, where 1 means query1 and query2 are similar and 0 means they are not. query1, query2, and label are separated by tab characters ('\t'); files are UTF-8 encoded.
```
现在 安卓模拟器 哪个 好 用 电脑 安卓模拟器 哪个 更好 1
为什么 头发 掉 得 很厉害 我 头发 为什么 掉 得 厉害 1
常喝 薏米 水 有 副 作用 吗 女生 可以 长期 喝 薏米 水养生 么 0
长 的 清新 是 什么 意思 小 清新 的 意思 是 什么 0
```
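Correspondingly, one such labeled line can be parsed as follows (again an illustrative sketch, not repo code):

```python
def parse_pointwise_line(line):
    """Split one tab-separated line into (query1, query2, label), label as int."""
    query1, query2, label = line.rstrip("\n").split("\t")
    if label not in ("0", "1"):
        raise ValueError("label must be 0 or 1, got %r" % label)
    return query1, query2, int(label)

q1, q2, label = parse_pointwise_line(u"为什么 头发 掉 得 很厉害\t我 头发 为什么 掉 得 厉害\t1")
```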
#### infer data
```pairwise``` and ```pointwise``` share the same infer format: query1 \t query2.</br>
query1 and query2 are space-tokenized Chinese texts.
```
怎么 调理 湿热 体质 ? 湿热 体质 怎样 调理 啊
搞笑 电影 美国 搞笑 的 美国 电影
```
__Note__: this project also provides a tokenization preprocessing script (under the preprocess directory), used as follows:
```shell
python tokenizer.py --test_data_dir ./test.txt.utf8 --batch_size 1 > test.txt.utf8.seg
```
where test.txt.utf8 is the UTF-8 file to tokenize, one text per line; the tokenized output is written to test.txt.utf8.seg.
### Code structure
```text
.
├── run_classifier.py: project entry point, wrapping training, prediction, and evaluation
├── config.py: configuration class, reading the model type and its hyperparameters
├── reader.py: data loading functions
├── utils.py: common utility functions
├── config: configuration files for the various models
├── download.py: script to download the data and pretrained models
```
### How to train
```shell
python run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda false \ # whether to use GPU
--do_train True \ # whether to train
--do_valid True \ # whether to evaluate on the dev set during training
--do_test True \ # whether to evaluate on the test set
--do_infer False \ # whether to run inference
--batch_size 128 \ # batch size
--train_data_dir ${TRAIN_DATA_PATH} \ # training set path
--valid_data_dir ${VALID_DATA_PATH} \ # dev set path
--test_data_dir ${TEST_DATA_PATH} \ # test set path
--infer_data_dir ${INFER_DATA_PATH} \ # path of the data to run inference on
--output_dir ${CKPT_PATH} \ # directory where models are saved
--config_path ${CONFIG_PATH} \ # config file path
--vocab_path ${VOCAB_PATH} \ # vocabulary path
--epoch 10 \ # number of epochs
--save_steps 1000 \ # save the model every save_steps steps
--validation_steps 100 \ # evaluate the dev set every validation_steps steps
--task_mode ${TASK_MODE} \ # training mode, pairwise or pointwise, matching the config file
--compute_accuracy False \ # whether to compute accuracy
--lamda 0.91 \ # classification threshold for accuracy in pairwise mode
--init_checkpoint "" # path of a pretrained model to load
```
### How to build your own model
You can build a custom model as follows:
i. Define your network structure
Define your model under ```./models/```;
ii. Modify the model configuration
Create a configuration file for your model, following the examples under ```config```.
Keep the ```net```, ```loss```, ```optimizer```, ```task_mode```, and ```model_path``` fields. ```net``` holds your model's parameters; ```task_mode``` is the training mode, ```pairwise``` or ```pointwise```, and must match the ```--task_mode``` option of the training command; ```model_path``` is the model save path; fill in ```loss``` and ```optimizer``` as your model requires, following the other files under ```config```.
iii. Train the model by running the training, evaluation, and prediction scripts as described above.
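As a sanity check for step ii, the required fields of a custom config can be validated with a small helper like this (`check_simnet_config` is a hypothetical helper, not part of the repo):

```python
import io
import json

REQUIRED_FIELDS = ("net", "loss", "optimizer", "task_mode", "model_path")

def check_simnet_config(path):
    """Verify that a custom SimNet config keeps the required fields
    (illustrative helper, not part of this repo)."""
    with io.open(path, encoding="utf-8") as f:
        cfg = json.load(f)
    missing = [k for k in REQUIRED_FIELDS if k not in cfg]
    if missing:
        raise ValueError("config %s is missing fields: %s" % (path, ", ".join(missing)))
    if cfg["task_mode"] not in ("pairwise", "pointwise"):
        raise ValueError("task_mode must be 'pairwise' or 'pointwise'")
    return cfg
```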
## Others
### How to contribute
If you can fix an issue or add a new feature, feel free to submit a PR. If your PR is accepted, we will score it from 0 to 5 (higher is better) based on quality and difficulty. Once you accumulate 10 points, you can contact us for an interview opportunity or a recommendation letter.
# this file is only used for continuous evaluation test!
import os
import sys
sys.path.append(os.environ['ceroot'])
from kpi import CostKpi
from kpi import DurationKpi
from kpi import AccKpi
each_step_duration_simnet_card1 = DurationKpi('each_step_duration_simnet_card1', 0.03, 0, actived=True)
train_loss_simnet_card1 = CostKpi('train_loss_simnet_card1', 0.01, 0, actived=True)
each_step_duration_simnet_card4 = DurationKpi('each_step_duration_simnet_card4', 0.02, 0, actived=True)
train_loss_simnet_card4 = CostKpi('train_loss_simnet_card4', 0.01, 0, actived=True)
tracking_kpis = [
    each_step_duration_simnet_card1,
    train_loss_simnet_card1,
    each_step_duration_simnet_card4,
    train_loss_simnet_card4,
]


def parse_log(log):
    '''
    This method should be implemented by model developers.
    The suggestion:
    each line in the log should be key, value, for example:
    "
    train_cost\t1.0
    test_cost\t1.0
    train_cost\t1.0
    train_cost\t1.0
    train_acc\t1.2
    "
    '''
    for line in log.split('\n'):
        fs = line.strip().split('\t')
        print(fs)
        if len(fs) == 3 and fs[0] == 'kpis':
            kpi_name = fs[1]
            kpi_value = float(fs[2])
            yield kpi_name, kpi_value


def log_to_ce(log):
    kpi_tracker = {}
    for kpi in tracking_kpis:
        kpi_tracker[kpi.name] = kpi
    for (kpi_name, kpi_value) in parse_log(log):
        print(kpi_name, kpi_value)
        kpi_tracker[kpi_name].add_record(kpi_value)
        kpi_tracker[kpi_name].persist()


if __name__ == '__main__':
    log = sys.stdin.read()
    log_to_ce(log)
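For reference, `parse_log` above only consumes lines of the form `kpis\t<name>\t<value>` and ignores everything else; a self-contained sketch of that contract (`parse_kpi_lines` is an illustrative name, not part of the CE script):

```python
def parse_kpi_lines(log):
    """Yield (name, value) for each 'kpis\t<name>\t<value>' line in a log string."""
    for line in log.split('\n'):
        fs = line.strip().split('\t')
        if len(fs) == 3 and fs[0] == 'kpis':
            yield fs[1], float(fs[2])

sample_log = ("kpis\ttrain_loss_simnet_card1\t0.25\n"
              "some unrelated log line\n"
              "kpis\teach_step_duration_simnet_card1\t0.012")
print(list(parse_kpi_lines(sample_log)))
```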
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
SimNet config
"""
import six
import json
import io
class SimNetConfig(object):
    """
    SimNet config
    """

    def __init__(self, args):
        self.task_mode = args.task_mode
        self.config_path = args.config_path
        self._config_dict = self._parse(args.config_path)

    def _parse(self, config_path):
        try:
            with io.open(config_path) as json_file:
                config_dict = json.load(json_file)
        except Exception:
            raise IOError("Error in parsing simnet model config file '%s'" % config_path)
        else:
            if config_dict["task_mode"] != self.task_mode:
                raise ValueError(
                    "the config '{}' does not match the task_mode '{}'".format(self.config_path, self.task_mode))
            return config_dict

    def __getitem__(self, key):
        return self._config_dict[key]

    def __setitem__(self, key, value):
        self._config_dict[key] = value

    def print_config(self):
        """
        Print config
        """
        for arg, value in sorted(six.iteritems(self._config_dict)):
            print('%s: %s' % (arg, value))
        print('------------------------------------------------')
{
    "net": {
        "module_name": "bow",
        "class_name": "BOW",
        "emb_dim": 128,
        "bow_dim": 128,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "hinge_loss",
        "class_name": "HingeLoss",
        "margin": 0.1
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pairwise",
    "model_path": "bow_pairwise"
}
{
    "net": {
        "module_name": "bow",
        "class_name": "BOW",
        "emb_dim": 128,
        "bow_dim": 128
    },
    "loss": {
        "module_name": "softmax_cross_entropy_loss",
        "class_name": "SoftmaxCrossEntropyLoss"
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pointwise",
    "model_path": "bow_pointwise"
}
{
    "net": {
        "module_name": "cnn",
        "class_name": "CNN",
        "emb_dim": 128,
        "filter_size": 3,
        "num_filters": 256,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "hinge_loss",
        "class_name": "HingeLoss",
        "margin": 0.1
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.2,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pairwise",
    "model_path": "cnn_pairwise"
}
{
    "net": {
        "module_name": "cnn",
        "class_name": "CNN",
        "emb_dim": 128,
        "filter_size": 3,
        "num_filters": 256,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "softmax_cross_entropy_loss",
        "class_name": "SoftmaxCrossEntropyLoss"
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pointwise",
    "model_path": "cnn_pointwise"
}
{
    "net": {
        "module_name": "gru",
        "class_name": "GRU",
        "emb_dim": 128,
        "gru_dim": 128,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "hinge_loss",
        "class_name": "HingeLoss",
        "margin": 0.1
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pairwise",
    "model_path": "gru_pairwise"
}
{
    "net": {
        "module_name": "gru",
        "class_name": "GRU",
        "emb_dim": 128,
        "gru_dim": 128,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "softmax_cross_entropy_loss",
        "class_name": "SoftmaxCrossEntropyLoss"
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pointwise",
    "model_path": "gru_pointwise"
}
{
    "net": {
        "module_name": "lstm",
        "class_name": "LSTM",
        "emb_dim": 128,
        "lstm_dim": 128,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "hinge_loss",
        "class_name": "HingeLoss",
        "margin": 0.1
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pairwise",
    "model_path": "lstm_pairwise"
}
{
    "net": {
        "module_name": "lstm",
        "class_name": "LSTM",
        "emb_dim": 128,
        "lstm_dim": 128,
        "hidden_dim": 128
    },
    "loss": {
        "module_name": "softmax_cross_entropy_loss",
        "class_name": "SoftmaxCrossEntropyLoss"
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "task_mode": "pointwise",
    "model_path": "lstm_pointwise"
}
{
    "net": {
        "module_name": "mm_dnn",
        "class_name": "MMDNN",
        "embedding_dim": 128,
        "num_filters": 256,
        "lstm_dim": 128,
        "hidden_size": 128,
        "window_size_left": 3,
        "window_size_right": 3,
        "dpool_size_left": 2,
        "dpool_size_right": 2
    },
    "loss": {
        "module_name": "softmax_cross_entropy_loss",
        "class_name": "SoftmaxCrossEntropyLoss"
    },
    "optimizer": {
        "class_name": "AdamOptimizer",
        "learning_rate": 0.001,
        "beta1": 0.9,
        "beta2": 0.999,
        "epsilon": 1e-08
    },
    "max_len_left": 32,
    "max_len_right": 32,
    "n_class": 2,
    "task_mode": "pointwise",
    "match_mask": 1,
    "model_path": "mm_dnn_pointwise"
}
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
Download script, download dataset and pretrain models.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import io
import os
import sys
import time
import hashlib
import tarfile
import requests
def usage():
    desc = ("\nDownload datasets and pretrained models for SimilarityNet task.\n"
            "Usage:\n"
            "  1. python download.py dataset\n"
            "  2. python download.py model\n")
    print(desc)


def md5file(fname):
    hash_md5 = hashlib.md5()
    with io.open(fname, "rb") as fin:
        for chunk in iter(lambda: fin.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


def extract(fname, dir_path):
    """
    Extract tar.gz file
    """
    try:
        tar = tarfile.open(fname, "r:gz")
        file_names = tar.getnames()
        for file_name in file_names:
            tar.extract(file_name, dir_path)
            print(file_name)
        tar.close()
    except Exception as e:
        raise e


def download(url, filename, md5sum):
    """
    Download file and check md5
    """
    retry = 0
    retry_limit = 3
    chunk_size = 4096
    while not (os.path.exists(filename) and md5file(filename) == md5sum):
        if retry < retry_limit:
            retry += 1
        else:
            raise RuntimeError("Cannot download dataset ({0}) with retry {1} times.".
                               format(url, retry_limit))
        try:
            start = time.time()
            size = 0
            res = requests.get(url, stream=True)
            filesize = int(res.headers['content-length'])
            if res.status_code == 200:
                print("[Filesize]: %0.2f MB" % (filesize / 1024 / 1024))
                # save by chunk
                with io.open(filename, "wb") as fout:
                    for chunk in res.iter_content(chunk_size=chunk_size):
                        if chunk:
                            fout.write(chunk)
                            size += len(chunk)
                            pr = '>' * int(size * 50 / filesize)
                            print('\r[Process ]: %s%.2f%%' %
                                  (pr, float(size / filesize * 100)), end='')
            end = time.time()
            print("\n[CostTime]: %.2f s" % (end - start))
        except Exception as e:
            print(e)


def download_dataset(dir_path):
    BASE_URL = "https://baidu-nlp.bj.bcebos.com/"
    DATASET_NAME = "simnet_dataset-1.0.0.tar.gz"
    DATASET_MD5 = "ec65b313bc237150ef536a8d26f3c73b"
    file_path = os.path.join(dir_path, DATASET_NAME)
    url = BASE_URL + DATASET_NAME
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    # download dataset
    print("Downloading dataset: %s" % url)
    download(url, file_path, DATASET_MD5)
    # extract dataset
    print("Extracting dataset: %s" % file_path)
    extract(file_path, dir_path)
    os.remove(file_path)


def download_model(dir_path):
    MODELS = {}
    BASE_URL = "https://baidu-nlp.bj.bcebos.com/"
    CNN_NAME = "simnet_bow-pairwise-1.0.0.tar.gz"
    CNN_MD5 = "199a3f3af31558edcc71c3b54ea5e129"
    MODELS[CNN_NAME] = CNN_MD5
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    for model in MODELS:
        url = BASE_URL + model
        model_path = os.path.join(dir_path, model)
        print("Downloading model: %s" % url)
        # download model
        download(url, model_path, MODELS[model])
        # extract model.tar.gz
        print("Extracting model: %s" % model_path)
        extract(model_path, dir_path)
        os.remove(model_path)


if __name__ == '__main__':
    if len(sys.argv) != 2:
        usage()
        sys.exit(1)
    if sys.argv[1] == "dataset":
        pwd = os.path.join(os.path.dirname(__file__), './')
        download_dataset(pwd)
    elif sys.argv[1] == "model":
        pwd = os.path.join(os.path.dirname(__file__), './model_files')
        download_model(pwd)
    else:
        usage()
#!/usr/bin/env bash
# get data
wget --no-check-certificate https://baidu-nlp.bj.bcebos.com/simnet_dataset-1.0.0.tar.gz
tar xzf simnet_dataset-1.0.0.tar.gz
rm simnet_dataset-1.0.0.tar.gz
#!/usr/bin/env bash
model_files_path="./model_files"
#get pretrained_bow_pairwise_model
wget --no-check-certificate https://baidu-nlp.bj.bcebos.com/simnet_bow-pairwise-1.0.0.tar.gz
if [ ! -d $model_files_path ]; then
mkdir $model_files_path
fi
tar xzf simnet_bow-pairwise-1.0.0.tar.gz -C $model_files_path
rm simnet_bow-pairwise-1.0.0.tar.gz
#!/usr/bin/env bash
export FLAGS_enable_parallel_graph=1
export FLAGS_sync_nccl_allreduce=1
export CUDA_VISIBLE_DEVICES=3
export FLAGS_fraction_of_gpu_memory_to_use=0.95
TASK_NAME='simnet'
TEST_DATA_PATH=./data/ecom
VOCAB_PATH=./data/term2id.dict
CKPT_PATH=./model_files
TEST_RESULT_PATH=./evaluate/ecom_test_result
TASK_MODE='pairwise'
CONFIG_PATH=./config/bow_pairwise.json
INIT_CHECKPOINT=./model_files/simnet_bow_pairwise_pretrained_model/
cd ..
python ./run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda false \
--do_test True \
--verbose_result True \
--batch_size 128 \
--test_data_dir ${TEST_DATA_PATH} \
--test_result_path ${TEST_RESULT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--task_mode ${TASK_MODE} \
--init_checkpoint ${INIT_CHECKPOINT}
#!/usr/bin/env bash
export FLAGS_enable_parallel_graph=1
export FLAGS_sync_nccl_allreduce=1
export CUDA_VISIBLE_DEVICES=3
export FLAGS_fraction_of_gpu_memory_to_use=0.95
TASK_NAME='simnet'
TEST_DATA_PATH=./data/qqsim
VOCAB_PATH=./data/term2id.dict
CKPT_PATH=./model_files
TEST_RESULT_PATH=./evaluate/qqsim_test_result
TASK_MODE='pairwise'
CONFIG_PATH=./config/bow_pairwise.json
INIT_CHECKPOINT=./model_files/simnet_bow_pairwise_pretrained_model/
cd ..
python ./run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda false \
--do_test True \
--verbose_result True \
--batch_size 128 \
--test_data_dir ${TEST_DATA_PATH} \
--test_result_path ${TEST_RESULT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--task_mode ${TASK_MODE} \
--init_checkpoint ${INIT_CHECKPOINT}
#!/usr/bin/env bash
export FLAGS_enable_parallel_graph=1
export FLAGS_sync_nccl_allreduce=1
export CUDA_VISIBLE_DEVICES=3
export FLAGS_fraction_of_gpu_memory_to_use=0.95
TASK_NAME='simnet'
INFER_DATA_PATH=./evaluate/unicom_infer
VOCAB_PATH=./data/term2id.dict
CKPT_PATH=./model_files
INFER_RESULT_PATH=./evaluate/unicom_infer_result
TASK_MODE='pairwise'
CONFIG_PATH=./config/bow_pairwise.json
INIT_CHECKPOINT=./model_files/simnet_bow_pairwise_pretrained_model/
python unicom_split.py
cd ..
python ./run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda false \
--do_infer True \
--batch_size 128 \
--infer_data_dir ${INFER_DATA_PATH} \
--infer_result_path ${INFER_RESULT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--task_mode ${TASK_MODE} \
--init_checkpoint ${INIT_CHECKPOINT}
cd evaluate
python unicom_compute_pos_neg.py
#!/usr/bin/env bash
export FLAGS_enable_parallel_graph=1
export FLAGS_sync_nccl_allreduce=1
export CUDA_VISIBLE_DEVICES=3
export FLAGS_fraction_of_gpu_memory_to_use=0.95
TASK_NAME='simnet'
TEST_DATA_PATH=./data/zhidao
VOCAB_PATH=./data/term2id.dict
CKPT_PATH=./model_files
TEST_RESULT_PATH=./evaluate/zhidao_test_result
TASK_MODE='pairwise'
CONFIG_PATH=./config/bow_pairwise.json
INIT_CHECKPOINT=./model_files/simnet_bow_pairwise_pretrained_model/
cd ..
python ./run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda false \
--do_test True \
--verbose_result True \
--batch_size 128 \
--test_data_dir ${TEST_DATA_PATH} \
--test_result_path ${TEST_RESULT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--task_mode ${TASK_MODE} \
--init_checkpoint ${INIT_CHECKPOINT}
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
comput unicom
"""
import io
infer_results = []
labels = []
result = []
temp_result = []
temp_query = ""
pos_num = 0.0
neg_num = 0.0
with io.open("./unicom_infer_result", "r", encoding="utf8") as infer_result_file:
    for line in infer_result_file:
        infer_results.append(line.strip().split("\t"))
with io.open("./unicom_label", "r", encoding="utf8") as label_file:
    for line in label_file:
        labels.append(line.strip().split("\t"))
# group (infer_result + label) records by query
for infer_result, label in zip(infer_results, labels):
    if infer_result[0] != temp_query and temp_query != "":
        result.append(temp_result)
        temp_query = infer_result[0]
        temp_result = []
        temp_result.append(infer_result + label)
    else:
        if temp_query == '':
            temp_query = infer_result[0]
        temp_result.append(infer_result + label)
# flush the last group
result.append(temp_result)
# count concordant (pos) and discordant (neg) ordered pairs within each group
for _result in result:
    for n, i in enumerate(_result, start=1):
        for j in _result[n:]:
            if (int(j[-1]) > int(i[-1]) and float(j[-2]) < float(i[-2])) or (
                    int(j[-1]) < int(i[-1]) and float(j[-2]) > float(i[-2])):
                neg_num += 1
            elif (int(j[-1]) > int(i[-1]) and float(j[-2]) > float(i[-2])) or (
                    int(j[-1]) < int(i[-1]) and float(j[-2]) < float(i[-2])):
                pos_num += 1
print("pos/neg of unicom data is %f" % (pos_num / neg_num))
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
split unicom file
"""
import io
with io.open("../data/unicom", "r", encoding="utf8") as unicom_file:
    with io.open("./unicom_infer", "w", encoding="utf8") as infer_file:
        with io.open("./unicom_label", "w", encoding="utf8") as label_file:
            for line in unicom_file:
                line = line.strip().split('\t')
                infer_file.write("\t".join(line[:2]) + '\n')
                label_file.write(line[2] + '\n')
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
MMDNN class
"""
import numpy as np
import paddle.fluid as fluid
import logging
from paddle.fluid.dygraph import Embedding, LayerNorm, Linear, to_variable, Layer, guard
from paddle.fluid.dygraph.nn import Conv2D
import paddle_layers as pd_layers
from paddle.fluid import layers
class BasicLSTMUnit(Layer):
    """
    BasicLSTMUnit class, using basic operators to build an LSTM.
    The algorithm can be described by the equations below.
    .. math::
        i_t &= \sigma(W_{ix}x_{t} + W_{ih}h_{t-1} + b_i)
        f_t &= \sigma(W_{fx}x_{t} + W_{fh}h_{t-1} + b_f + forget\_bias)
        o_t &= \sigma(W_{ox}x_{t} + W_{oh}h_{t-1} + b_o)
        \\tilde{c_t} &= tanh(W_{cx}x_t + W_{ch}h_{t-1} + b_c)
        c_t &= f_t \odot c_{t-1} + i_t \odot \\tilde{c_t}
        h_t &= o_t \odot tanh(c_t)
    - $W$ terms denote weight matrices (e.g. $W_{ix}$ is the matrix
      of weights from the input gate to the input).
    - The $b$ terms denote bias vectors (e.g. $b_i$ is the input gate bias vector).
    - sigmoid is the logistic sigmoid function.
    - $i, f, o$ and $c$ are the input gate, forget gate, output gate,
      and cell activation vectors, respectively, all of which have the same size as
      the cell output activation vector $h$.
    - :math:`\odot` is the element-wise product of the vectors.
    - :math:`tanh` is the activation function.
    - :math:`\\tilde{c_t}` is also called the candidate hidden state,
      computed from the current input and the previous hidden state.
    Args:
        hidden_size (integer): The hidden size used in the unit.
        input_size (integer): The input size used in the unit.
        param_attr (ParamAttr|None): The parameter attribute for the learnable
            weight matrix. If set to None or one attribute of ParamAttr, the unit
            will create a ParamAttr as param_attr. If the initializer of the
            param_attr is not set, the parameter is initialized with Xavier.
            Default: None.
        bias_attr (ParamAttr|None): The parameter attribute for the bias of the
            LSTM unit. If set to None or one attribute of ParamAttr, the unit
            will create a ParamAttr as bias_attr. If the initializer of the
            bias_attr is not set, the bias is initialized as zero. Default: None.
        gate_activation (function|None): The activation function for gates.
            Default: fluid.layers.sigmoid
        activation (function|None): The activation function for cells.
            Default: fluid.layers.tanh
        forget_bias (float): forget bias used when computing the forget gate.
            Default: 1.0
        dtype (string): data type used in this unit. Default: 'float32'
    """

    def __init__(self,
                 hidden_size,
                 input_size,
                 param_attr=None,
                 bias_attr=None,
                 gate_activation=None,
                 activation=None,
                 forget_bias=1.0,
                 dtype='float32'):
        super(BasicLSTMUnit, self).__init__(dtype)
        self._hidden_size = hidden_size
        self._param_attr = param_attr
        self._bias_attr = bias_attr
        self._gate_activation = gate_activation or layers.sigmoid
        self._activation = activation or layers.tanh
        self._forget_bias = layers.fill_constant(
            [1], dtype=dtype, value=forget_bias)
        self._forget_bias.stop_gradient = False
        self._dtype = dtype
        self._input_size = input_size
        self._weight = self.create_parameter(
            attr=self._param_attr,
            shape=[self._input_size + self._hidden_size, 4 * self._hidden_size],
            dtype=self._dtype)
        self._bias = self.create_parameter(
            attr=self._bias_attr,
            shape=[4 * self._hidden_size],
            dtype=self._dtype,
            is_bias=True)
    def forward(self, input, pre_hidden, pre_cell):
        concat_input_hidden = layers.concat([input, pre_hidden], 1)
        gate_input = layers.matmul(x=concat_input_hidden, y=self._weight)
        gate_input = layers.elementwise_add(gate_input, self._bias)
        i, j, f, o = layers.split(gate_input, num_or_sections=4, dim=-1)
        new_cell = layers.elementwise_add(
            layers.elementwise_mul(
                pre_cell,
                layers.sigmoid(layers.elementwise_add(f, self._forget_bias))),
            layers.elementwise_mul(layers.sigmoid(i), layers.tanh(j)))
        new_hidden = layers.tanh(new_cell) * layers.sigmoid(o)
        return new_hidden, new_cell
class MMDNN(object):
    """
    MMDNN
    """

    def __init__(self, config):
        """
        initialize
        """
        self.vocab_size = int(config['dict_size'])
        self.emb_size = int(config['net']['embedding_dim'])
        self.lstm_dim = int(config['net']['lstm_dim'])
        self.kernel_size = int(config['net']['num_filters'])
        self.win_size1 = int(config['net']['window_size_left'])
        self.win_size2 = int(config['net']['window_size_right'])
        self.dpool_size1 = int(config['net']['dpool_size_left'])
        self.dpool_size2 = int(config['net']['dpool_size_right'])
        self.hidden_size = int(config['net']['hidden_size'])
        self.seq_len1 = int(config['max_len_left'])
        self.seq_len2 = int(config['max_len_right'])
        self.task_mode = config['task_mode']
        if int(config['match_mask']) != 0:
            self.match_mask = True
        else:
            self.match_mask = False
        if self.task_mode == "pointwise":
            self.n_class = int(config['n_class'])
            self.out_size = self.n_class
        elif self.task_mode == "pairwise":
            self.out_size = 1
        else:
            logging.error("training mode not supported")
    def embedding_layer(self, input, zero_pad=True, scale=True):
        """
        embedding layer
        """
        emb = Embedding(
            size=[self.vocab_size, self.emb_size],
            padding_idx=(0 if zero_pad else None),
            param_attr=fluid.ParamAttr(
                name="word_embedding", initializer=fluid.initializer.Xavier()))
        emb = emb(input)
        if scale:
            emb = emb * (self.emb_size**0.5)
        return emb
    def bi_dynamic_lstm(self, input, hidden_size):
        """
        bi_lstm layer
        """
        fw_in_proj = Linear(
            input_dim=self.emb_size,
            output_dim=4 * hidden_size,
            param_attr=fluid.ParamAttr(name="fw_fc.w"),
            bias_attr=False)
        fw_in_proj = fw_in_proj(input)
        forward = pd_layers.DynamicLSTMLayer(
            size=4 * hidden_size,
            is_reverse=False,
            param_attr=fluid.ParamAttr(name="forward_lstm.w"),
            bias_attr=fluid.ParamAttr(name="forward_lstm.b")).ops()
        forward = forward(fw_in_proj)
        rv_in_proj = Linear(
            input_dim=self.emb_size,
            output_dim=4 * hidden_size,
            param_attr=fluid.ParamAttr(name="rv_fc.w"),
            bias_attr=False)
        rv_in_proj = rv_in_proj(input)
        reverse = pd_layers.DynamicLSTMLayer(
            4 * hidden_size,
            'lstm',
            is_reverse=True,
            param_attr=fluid.ParamAttr(name="reverse_lstm.w"),
            bias_attr=fluid.ParamAttr(name="reverse_lstm.b")).ops()
        reverse = reverse(rv_in_proj)
        return [forward, reverse]
    def conv_pool_relu_layer(self, input, mask=None):
        """
        convolution and pool layer
        """
        # data format NCHW; add a channel dimension of size 1
        emb_expanded = fluid.layers.unsqueeze(input=input, axes=[1])
        # same padding
        conv = Conv2D(
            num_channels=1,
            num_filters=self.kernel_size,
            stride=1,
            padding=(int(self.seq_len1 / 2), int(self.seq_len2 // 2)),
            filter_size=(self.seq_len1, self.seq_len2),
            bias_attr=fluid.ParamAttr(
                initializer=fluid.initializer.Constant(0.1)))
        conv = conv(emb_expanded)
        if mask is not None:
            cross_mask = fluid.layers.stack(x=[mask] * self.kernel_size, axis=1)
            conv = cross_mask * conv + (1 - cross_mask) * (-2**32 + 1)
        # valid padding
        pool = fluid.layers.pool2d(
            input=conv,
            pool_size=[
                int(self.seq_len1 / self.dpool_size1),
                int(self.seq_len2 / self.dpool_size2)
            ],
            pool_stride=[
                int(self.seq_len1 / self.dpool_size1),
                int(self.seq_len2 / self.dpool_size2)
            ],
            pool_type="max", )
        relu = fluid.layers.relu(pool)
        return relu
    def get_cross_mask(self, left_lens, right_lens):
        """
        cross mask
        """
        mask1 = fluid.layers.sequence_mask(
            x=left_lens, dtype='float32', maxlen=self.seq_len1 + 1)
        mask2 = fluid.layers.sequence_mask(
            x=right_lens, dtype='float32', maxlen=self.seq_len2 + 1)
        mask1 = fluid.layers.transpose(x=mask1, perm=[0, 2, 1])
        cross_mask = fluid.layers.matmul(x=mask1, y=mask2)
        return cross_mask
def predict(self, left, right):
"""
Forward network
"""
left_emb = self.embedding_layer(left, zero_pad=True, scale=False)
right_emb = self.embedding_layer(right, zero_pad=True, scale=False)
bi_left_outputs = self.bi_dynamic_lstm(
input=left_emb, hidden_size=self.lstm_dim)
left_seq_encoder = fluid.layers.concat(input=bi_left_outputs, axis=1)
bi_right_outputs = self.bi_dynamic_lstm(
input=right_emb, hidden_size=self.lstm_dim)
right_seq_encoder = fluid.layers.concat(input=bi_right_outputs, axis=1)
pad_value = fluid.layers.assign(input=np.array([0]).astype("float32"))
left_seq_encoder, left_lens = fluid.layers.sequence_pad(
x=left_seq_encoder, pad_value=pad_value, maxlen=self.seq_len1)
right_seq_encoder, right_lens = fluid.layers.sequence_pad(
x=right_seq_encoder, pad_value=pad_value, maxlen=self.seq_len2)
cross = fluid.layers.matmul(
left_seq_encoder, right_seq_encoder, transpose_y=True)
if self.match_mask:
cross_mask = self.get_cross_mask(left_lens, right_lens)
else:
cross_mask = None
conv_pool_relu = self.conv_pool_relu_layer(input=cross, mask=cross_mask)
relu_hid1 = Linear(
input_dim=conv_pool_relu.shape[-1],
output_dim=self.hidden_size)
relu_hid1 = relu_hid1(conv_pool_relu)
relu_hid1 = fluid.layers.tanh(relu_hid1)
fc2 = Linear(
input_dim=relu_hid1.shape[-1],
output_dim=self.out_size)
pred = fc2(relu_hid1)
pred = fluid.layers.softmax(pred)
return left_seq_encoder, pred
#encoding=utf8
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import sys
import paddle
import paddle.fluid as fluid
def check_cuda(use_cuda, err = \
"\nYou can not set use_cuda = True in the model because you are using paddlepaddle-cpu.\n \
Please: 1. Install paddlepaddle-gpu to run your models on GPU or 2. Set use_cuda = False to run models on CPU.\n"
):
"""
Log error and exit when set use_gpu=true in paddlepaddle
cpu version.
"""
try:
if use_cuda and not fluid.is_compiled_with_cuda():
print(err)
sys.exit(1)
except Exception as e:
pass
def check_version():
"""
Log error and exit when the installed version of paddlepaddle is
not satisfied.
"""
err = "PaddlePaddle version 1.6 or higher is required, " \
"or a suitable develop version is satisfied as well. \n" \
"Please make sure the version is good with your code."
try:
fluid.require_version('1.6.0')
except Exception as e:
print(err)
sys.exit(1)
if __name__ == "__main__":
check_cuda(True)
check_cuda(False)
check_cuda(True, "This is only for testing.")
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
base layers
"""
from paddle.fluid import layers
from paddle.fluid.dygraph import Layer
from paddle.fluid.dygraph import GRUUnit
from paddle.fluid.dygraph.base import to_variable
# import numpy as np
# import logging
class DynamicGRU(Layer):
def __init__(self,
size,
param_attr=None,
bias_attr=None,
is_reverse=False,
gate_activation='sigmoid',
candidate_activation='tanh',
h_0=None,
origin_mode=False,
init_size=None):
super(DynamicGRU, self).__init__()
self.gru_unit = GRUUnit(
size * 3,
param_attr=param_attr,
bias_attr=bias_attr,
activation=candidate_activation,
gate_activation=gate_activation,
origin_mode=origin_mode)
self.size = size
self.h_0 = h_0
self.is_reverse = is_reverse
def forward(self, inputs):
hidden = self.h_0
res = []
for i in range(inputs.shape[1]):
if self.is_reverse:
i = inputs.shape[1] - 1 - i
input_ = inputs[ :, i:i+1, :]
input_ = layers.reshape(input_, [-1, input_.shape[2]], inplace=False)
hidden, reset, gate = self.gru_unit(input_, hidden)
hidden_ = layers.reshape(hidden, [-1, 1, hidden.shape[1]], inplace=False)
res.append(hidden_)
if self.is_reverse:
res = res[::-1]
res = fluid.layers.concat(res, axis=1)
return res
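DynamicGRU unrolls the recurrence one timestep at a time; for the reverse direction it walks the timesteps backwards and then flips the collected outputs so they are emitted in chronological order. A minimal pure-Python sketch of that unrolling logic, with a toy step function standing in for the GRU unit (names here are illustrative):

```python
def unroll(inputs, step, h0, is_reverse=False):
    # inputs: list of per-timestep values; step: (x, hidden) -> new hidden
    hidden, outputs = h0, []
    indices = range(len(inputs) - 1, -1, -1) if is_reverse else range(len(inputs))
    for i in indices:
        hidden = step(inputs[i], hidden)
        outputs.append(hidden)
    # restore chronological order for the reverse pass
    return outputs[::-1] if is_reverse else outputs

fwd = unroll([1, 2, 3], lambda x, h: h + x, 0)         # [1, 3, 6]
bwd = unroll([1, 2, 3], lambda x, h: h + x, 0, True)   # [6, 5, 3]
```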
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
bow class
"""
import paddle_layers as layers
from paddle import fluid
from paddle.fluid.dygraph.base import to_variable
from paddle.fluid.dygraph import Layer, Embedding
import paddle.fluid.param_attr as attr
uniform_initializer = lambda x: fluid.initializer.UniformInitializer(low=-x, high=x)
class BOW(Layer):
"""
BOW
"""
def __init__(self, conf_dict):
"""
initialize
"""
super(BOW, self).__init__()
self.dict_size = conf_dict["dict_size"]
self.task_mode = conf_dict["task_mode"]
self.emb_dim = conf_dict["net"]["emb_dim"]
self.bow_dim = conf_dict["net"]["bow_dim"]
self.seq_len = 5
self.emb_layer = layers.EmbeddingLayer(self.dict_size, self.emb_dim, "emb").ops()
self.bow_layer = layers.FCLayer(self.bow_dim, None, "fc").ops()
self.softmax_layer = layers.FCLayer(2, "softmax", "cos_sim").ops()
def forward(self, left, right):
"""
Forward network
"""
# embedding layer
left_emb = self.emb_layer(left)
right_emb = self.emb_layer(right)
left_emb = fluid.layers.reshape(
left_emb, shape=[-1, self.seq_len, self.bow_dim])
right_emb = fluid.layers.reshape(
right_emb, shape=[-1, self.seq_len, self.bow_dim])
bow_left = fluid.layers.reduce_sum(left_emb, dim=1)
bow_right = fluid.layers.reduce_sum(right_emb, dim=1)
softsign_layer = layers.SoftsignLayer()
left_soft = softsign_layer.ops(bow_left)
right_soft = softsign_layer.ops(bow_right)
# matching layer
if self.task_mode == "pairwise":
left_bow = self.bow_layer(left_soft)
right_bow = self.bow_layer(right_soft)
cos_sim_layer = layers.CosSimLayer()
pred = cos_sim_layer.ops(left_bow, right_bow)
return left_bow, pred
else:
concat_layer = layers.ConcatLayer(1)
concat = concat_layer.ops([left_soft, right_soft])
concat_fc = self.bow_layer(concat)
pred = self.softmax_layer(concat_fc)
return left_soft, pred
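The BOW encoder above sums token embeddings over time, squashes the result with softsign (x / (1 + |x|)), and in pairwise mode scores the pair by cosine similarity. A minimal pure-Python sketch of that pipeline (function names are illustrative, not part of SimNet):

```python
import math

def bow_encode(emb_seq):
    # sum token embeddings over the time axis, then squash with softsign
    summed = [sum(dims) for dims in zip(*emb_seq)]
    return [x / (1.0 + abs(x)) for x in summed]

def cos_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

left = bow_encode([[1.0, 0.0], [1.0, 0.0]])  # -> [2/3, 0.0]
right = bow_encode([[2.0, 0.0]])             # same pooled vector
sim = cos_sim(left, right)                   # -> 1.0
```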
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
cnn class
"""
import paddle_layers as layers
from paddle import fluid
from paddle.fluid.dygraph import Layer
class CNN(Layer):
"""
CNN
"""
def __init__(self, conf_dict):
"""
initialize
"""
super(CNN, self).__init__()
self.dict_size = conf_dict["dict_size"]
self.task_mode = conf_dict["task_mode"]
self.emb_dim = conf_dict["net"]["emb_dim"]
self.filter_size = conf_dict["net"]["filter_size"]
self.num_filters = conf_dict["net"]["num_filters"]
self.hidden_dim = conf_dict["net"]["hidden_dim"]
self.seq_len = 5
self.channels = 1
# layers
self.emb_layer = layers.EmbeddingLayer(self.dict_size, self.emb_dim, "emb").ops()
self.fc_layer = layers.FCLayer(self.hidden_dim, None, "fc").ops()
self.softmax_layer = layers.FCLayer(2, "softmax", "cos_sim").ops()
self.cnn_layer = layers.SimpleConvPool(
self.channels,
self.num_filters,
self.filter_size)
def forward(self, left, right):
"""
Forward network
"""
# embedding layer
left_emb = self.emb_layer(left)
right_emb = self.emb_layer(right)
# Presentation context
left_emb = fluid.layers.reshape(
left_emb, shape=[-1, self.channels, self.seq_len, self.hidden_dim])
right_emb = fluid.layers.reshape(
right_emb, shape=[-1, self.channels, self.seq_len, self.hidden_dim])
left_cnn = self.cnn_layer(left_emb)
right_cnn = self.cnn_layer(right_emb)
# matching layer
if self.task_mode == "pairwise":
left_fc = self.fc_layer(left_cnn)
right_fc = self.fc_layer(right_cnn)
cos_sim_layer = layers.CosSimLayer()
pred = cos_sim_layer.ops(left_fc, right_fc)
return left_fc, pred
else:
concat_layer = layers.ConcatLayer(1)
concat = concat_layer.ops([left_cnn, right_cnn])
concat_fc = self.fc_layer(concat)
pred = self.softmax_layer(concat_fc)
return left_cnn, pred
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
gru class
"""
import paddle_layers as layers
from paddle.fluid.dygraph.base import to_variable
from paddle.fluid.dygraph.nn import Linear
from paddle.fluid.dygraph import Layer
from paddle import fluid
import numpy as np
class GRU(Layer):
"""
GRU
"""
def __init__(self, conf_dict):
"""
initialize
"""
super(GRU, self).__init__()
self.dict_size = conf_dict["dict_size"]
self.task_mode = conf_dict["task_mode"]
self.emb_dim = conf_dict["net"]["emb_dim"]
self.gru_dim = conf_dict["net"]["gru_dim"]
self.hidden_dim = conf_dict["net"]["hidden_dim"]
self.emb_layer = layers.EmbeddingLayer(self.dict_size, self.emb_dim, "emb").ops()
self.gru_layer = layers.DynamicGRULayer(self.gru_dim, "gru").ops()
self.fc_layer = layers.FCLayer(self.hidden_dim, None, "fc").ops()
self.proj_layer = Linear(input_dim = self.hidden_dim, output_dim=self.gru_dim*3)
self.softmax_layer = layers.FCLayer(2, "softmax", "cos_sim").ops()
self.seq_len=5
def forward(self, left, right):
"""
Forward network
"""
# embedding layer
left_emb = self.emb_layer(left)
right_emb = self.emb_layer(right)
# Presentation context
left_emb = self.proj_layer(left_emb)
right_emb = self.proj_layer(right_emb)
h_0 = np.zeros((left_emb.shape[0], self.hidden_dim), dtype="float32")
h_0 = to_variable(h_0)
left_gru = self.gru_layer(left_emb, h_0=h_0)
right_gru = self.gru_layer(right_emb, h_0=h_0)
left_emb = fluid.layers.reduce_max(left_gru, dim=1)
right_emb = fluid.layers.reduce_max(right_gru, dim=1)
left_emb = fluid.layers.reshape(
left_emb, shape=[-1, self.seq_len, self.hidden_dim])
right_emb = fluid.layers.reshape(
right_emb, shape=[-1, self.seq_len, self.hidden_dim])
left_emb = fluid.layers.reduce_sum(left_emb, dim=1)
right_emb = fluid.layers.reduce_sum(right_emb, dim=1)
left_last = fluid.layers.tanh(left_emb)
right_last = fluid.layers.tanh(right_emb)
if self.task_mode == "pairwise":
left_fc = self.fc_layer(left_last)
right_fc = self.fc_layer(right_last)
cos_sim_layer = layers.CosSimLayer()
pred = cos_sim_layer.ops(left_fc, right_fc)
return left_fc, pred
else:
concat_layer = layers.ConcatLayer(1)
concat = concat_layer.ops([left_last, right_last])
concat_fc = self.fc_layer(concat)
pred = self.softmax_layer(concat_fc)
return left_last, pred
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
hinge loss
"""
import sys
sys.path.append("../")
import nets.paddle_layers as layers
class HingeLoss(object):
"""
Hinge Loss Calculate class
"""
def __init__(self, conf_dict):
"""
initialize
"""
self.margin = conf_dict["loss"]["margin"]
def compute(self, pos, neg):
"""
compute loss
"""
elementwise_max = layers.ElementwiseMaxLayer()
elementwise_add = layers.ElementwiseAddLayer()
elementwise_sub = layers.ElementwiseSubLayer()
constant = layers.ConstantLayer()
reduce_mean = layers.ReduceMeanLayer()
loss = reduce_mean.ops(
elementwise_max.ops(
constant.ops(neg, neg.shape, "float32", 0.0),
elementwise_add.ops(
elementwise_sub.ops(neg, pos),
constant.ops(neg, neg.shape, "float32", self.margin))))
return loss
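The layer composition above computes mean(max(0, neg - pos + margin)): the loss is zero once the positive score beats the negative score by at least the margin. A minimal pure-Python equivalent (function name is illustrative):

```python
def hinge_loss(pos_scores, neg_scores, margin=0.1):
    # mean(max(0, neg - pos + margin)) over a batch of score pairs
    losses = [max(0.0, n - p + margin) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)

# first pair satisfies the margin (loss 0), second violates it (loss 0.3)
loss = hinge_loss([0.9, 0.2], [0.1, 0.4], margin=0.1)  # -> 0.15
```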
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
log loss
"""
import sys
sys.path.append("../")
import nets.paddle_layers as layers
class LogLoss(object):
"""
Log Loss Calculate
"""
def __init__(self, conf_dict):
"""
initialize
"""
pass
def compute(self, pos, neg):
"""
compute loss
"""
sigmoid = layers.SigmoidLayer()
reduce_mean = layers.ReduceMeanLayer()
loss = reduce_mean.ops(sigmoid.ops(neg - pos))
return loss
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
softmax loss
"""
import sys
import paddle.fluid as fluid
sys.path.append("../")
import nets.paddle_layers as layers
class SoftmaxCrossEntropyLoss(object):
"""
Softmax with Cross Entropy Loss Calculate
"""
def __init__(self, conf_dict):
"""
initialize
"""
pass
def compute(self, input, label):
"""
compute loss
"""
reduce_mean = layers.ReduceMeanLayer()
cost = fluid.layers.cross_entropy(input=input, label=label)
avg_cost = reduce_mean.ops(cost)
return avg_cost
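`fluid.layers.cross_entropy` followed by a mean reduction computes the average of -log(p[label]) over the batch, where p is the softmax output. A minimal pure-Python equivalent (function name is illustrative):

```python
import math

def cross_entropy(probs, labels):
    # mean of -log(p[label]) over the batch; probs are softmax outputs
    losses = [-math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

loss = cross_entropy([[0.25, 0.75]], [1])  # -> -log(0.75)
```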
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
lstm class
"""
import paddle_layers as layers
from paddle.fluid.dygraph import Layer, Linear
from paddle import fluid
class LSTM(Layer):
"""
LSTM
"""
def __init__(self, conf_dict):
"""
initialize
"""
super(LSTM,self).__init__()
self.dict_size = conf_dict["dict_size"]
self.task_mode = conf_dict["task_mode"]
self.emb_dim = conf_dict["net"]["emb_dim"]
self.lstm_dim = conf_dict["net"]["lstm_dim"]
self.hidden_dim = conf_dict["net"]["hidden_dim"]
self.emb_layer = layers.EmbeddingLayer(self.dict_size, self.emb_dim, "emb").ops()
self.lstm_layer = layers.DynamicLSTMLayer(self.lstm_dim, "lstm").ops()
self.fc_layer = layers.FCLayer(self.hidden_dim, None, "fc").ops()
self.softmax_layer = layers.FCLayer(2, "softmax", "cos_sim").ops()
self.proj_layer = Linear(input_dim = self.hidden_dim, output_dim=self.lstm_dim*4)
self.seq_len = 5
def forward(self, left, right):
"""
Forward network
"""
# embedding layer
left_emb = self.emb_layer(left)
right_emb = self.emb_layer(right)
# Presentation context
left_proj = self.proj_layer(left_emb)
right_proj = self.proj_layer(right_emb)
left_lstm, _ = self.lstm_layer(left_proj)
right_lstm, _ = self.lstm_layer(right_proj)
left_emb = fluid.layers.reduce_max(left_lstm, dim=1)
right_emb = fluid.layers.reduce_max(right_lstm, dim=1)
left_emb = fluid.layers.reshape(
left_emb, shape=[-1, self.seq_len, self.hidden_dim])
right_emb = fluid.layers.reshape(
right_emb, shape=[-1, self.seq_len, self.hidden_dim])
left_emb = fluid.layers.reduce_sum(left_emb, dim=1)
right_emb = fluid.layers.reduce_sum(right_emb, dim=1)
left_last = fluid.layers.tanh(left_emb)
right_last = fluid.layers.tanh(right_emb)
# matching layer
if self.task_mode == "pairwise":
left_fc = self.fc_layer(left_last)
right_fc = self.fc_layer(right_last)
cos_sim_layer = layers.CosSimLayer()
pred = cos_sim_layer.ops(left_fc, right_fc)
return left_fc, pred
else:
concat_layer = layers.ConcatLayer(1)
concat = concat_layer.ops([left_last, right_last])
concat_fc = self.fc_layer(concat)
pred = self.softmax_layer(concat_fc)
return left_last, pred
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
MMDNN class
"""
import numpy as np
import paddle.fluid as fluid
import logging
from paddle.fluid.dygraph import Linear, to_variable, Layer, Pool2D, Conv2D
import paddle_layers as pd_layers
from paddle.fluid import layers
class MMDNN(Layer):
"""
MMDNN
"""
def __init__(self, config):
"""
initialize
"""
super(MMDNN, self).__init__()
self.vocab_size = int(config['dict_size'])
self.emb_size = int(config['net']['embedding_dim'])
self.lstm_dim = int(config['net']['lstm_dim'])
self.kernel_size = int(config['net']['num_filters'])
self.win_size1 = int(config['net']['window_size_left'])
self.win_size2 = int(config['net']['window_size_right'])
self.dpool_size1 = int(config['net']['dpool_size_left'])
self.dpool_size2 = int(config['net']['dpool_size_right'])
self.hidden_size = int(config['net']['hidden_size'])
self.seq_len1 = 5  # int(config['max_len_left'])
self.seq_len2 = 5  # int(config['max_len_right'])
self.task_mode = config['task_mode']
self.zero_pad = True
self.scale = False
if int(config['match_mask']) != 0:
self.match_mask = True
else:
self.match_mask = False
if self.task_mode == "pointwise":
self.n_class = int(config['n_class'])
self.out_size = self.n_class
elif self.task_mode == "pairwise":
self.out_size = 1
else:
logging.error("training mode not supported")
# layers
self.emb_layer = pd_layers.EmbeddingLayer(self.vocab_size, self.emb_size,
name="word_embedding",padding_idx=(0 if self.zero_pad else None)).ops()
self.fw_in_proj = Linear(
input_dim=self.emb_size,
output_dim=4 * self.lstm_dim,
param_attr=fluid.ParamAttr(name="fw_fc.w"),
bias_attr=False)
self.lstm_layer = pd_layers.DynamicLSTMLayer(self.lstm_dim, "lstm").ops()
self.rv_in_proj = Linear(
input_dim=self.emb_size,
output_dim=4 * self.lstm_dim,
param_attr=fluid.ParamAttr(name="rv_fc.w"),
bias_attr=False)
self.reverse_layer = pd_layers.DynamicLSTMLayer(
self.lstm_dim,
is_reverse=True).ops()
self.conv = Conv2D(
num_channels=1,
num_filters=self.kernel_size,
stride=1,
padding=(self.seq_len1 // 2, self.seq_len2 // 2),
filter_size=(self.seq_len1, self.seq_len2),
bias_attr=fluid.ParamAttr(
initializer=fluid.initializer.Constant(0.1)))
self.pool_layer = Pool2D(
pool_size=[
int(self.seq_len1 / self.dpool_size1),
int(self.seq_len2 / self.dpool_size2)
],
pool_stride=[
int(self.seq_len1 / self.dpool_size1),
int(self.seq_len2 / self.dpool_size2)
],
pool_type="max")
self.fc_layer = pd_layers.FCLayer(self.hidden_size, "tanh", "fc").ops()
self.fc1_layer = pd_layers.FCLayer(self.out_size, "softmax", "fc1").ops()
def forward(self, left, right):
"""
Forward network
"""
left_emb = self.emb_layer(left)
right_emb = self.emb_layer(right)
if self.scale:
left_emb = left_emb * (self.emb_size**0.5)
right_emb = right_emb * (self.emb_size**0.5)
# bi_lstm
left_proj = self.fw_in_proj(left_emb)
right_proj = self.fw_in_proj(right_emb)
left_lstm, _ = self.lstm_layer(left_proj)
right_lstm, _ = self.lstm_layer(right_proj)
left_rv_proj = self.rv_in_proj(left_lstm)
right_rv_proj = self.rv_in_proj(right_lstm)
left_reverse,_ = self.reverse_layer(left_rv_proj)
right_reverse,_ = self.reverse_layer(right_rv_proj)
left_seq_encoder = fluid.layers.concat([left_lstm, left_reverse], axis=1)
right_seq_encoder = fluid.layers.concat([right_lstm, right_reverse], axis=1)
pad_value = fluid.layers.assign(input=np.array([0]).astype("float32"))
left_seq_encoder = fluid.layers.reshape(left_seq_encoder, shape=[left_seq_encoder.shape[0] // 5, 5, -1])
right_seq_encoder = fluid.layers.reshape(right_seq_encoder, shape=[right_seq_encoder.shape[0] // 5, 5, -1])
cross = fluid.layers.matmul(
left_seq_encoder, right_seq_encoder, transpose_y=True)
left_lens=to_variable(np.array([5]))
right_lens=to_variable(np.array([5]))
if self.match_mask:
mask1 = fluid.layers.sequence_mask(
x=left_lens, dtype='float32', maxlen=self.seq_len1 + 1)
mask2 = fluid.layers.sequence_mask(
x=right_lens, dtype='float32', maxlen=self.seq_len2 + 1)
mask1 = fluid.layers.transpose(x=mask1, perm=[1, 0])
mask = fluid.layers.matmul(x=mask1, y=mask2)
else:
mask = None
# conv_pool_relu
emb_expand = fluid.layers.unsqueeze(input=cross, axes=[1])
conv = self.conv(emb_expand)
if mask is not None:
cross_mask = fluid.layers.stack(x=[mask] * self.kernel_size, axis=1)
conv = cross_mask * conv + (1 - cross_mask) * (-2**32 + 1)
pool = self.pool_layer(conv)
conv_pool_relu = fluid.layers.relu(pool)
relu_hid1 = self.fc_layer(conv_pool_relu)
relu_hid1 = fluid.layers.tanh(relu_hid1)
pred = self.fc1_layer(relu_hid1)
pred = fluid.layers.softmax(pred)
return left_seq_encoder, pred
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
network layers
"""
import paddle.fluid as fluid
import paddle.fluid.param_attr as attr
class EmbeddingLayer(object):
"""
Embedding Layer class
"""
def __init__(self, dict_size, emb_dim, name="emb", padding_idx=None):
"""
initialize
"""
self.dict_size = dict_size
self.emb_dim = emb_dim
self.name = name
self.padding_idx = padding_idx
def ops(self):
"""
operation
"""
emb = fluid.dygraph.Embedding(
size=[self.dict_size, self.emb_dim],
is_sparse=True,
padding_idx=self.padding_idx,
param_attr=attr.ParamAttr(name=self.name))
return emb
class SequencePoolLayer(object):
"""
Sequence Pool Layer class
"""
def __init__(self, pool_type):
"""
initialize
"""
self.pool_type = pool_type
def ops(self, input):
"""
operation
"""
pool = fluid.layers.sequence_pool(input=input, pool_type=self.pool_type)
return pool
class FCLayer(object):
"""
Fully Connect Layer class
"""
def __init__(self, fc_dim, act, name="fc"):
"""
initialize
"""
self.fc_dim = fc_dim
self.act = act
self.name = name
def ops(self):
"""
operation
"""
fc = fluid.dygraph.FC(self.name,
size=self.fc_dim,
param_attr=attr.ParamAttr(name="%s.w" % self.name),
bias_attr=attr.ParamAttr(name="%s.b" % self.name),
act=self.act)
return fc
class DynamicGRULayer(object):
"""
Dynamic GRU Layer class
"""
def __init__(self, gru_dim, name="dyn_gru"):
"""
initialize
"""
self.gru_dim = gru_dim
self.name = name
def ops(self, input):
"""
operation
"""
proj = fluid.dygraph.FC(
input=input,
size=self.gru_dim * 3,
param_attr=attr.ParamAttr(name="%s_fc.w" % self.name),
bias_attr=attr.ParamAttr(name="%s_fc.b" % self.name))
gru = fluid.layers.dynamic_gru(
input=proj,
size=self.gru_dim,
param_attr=attr.ParamAttr(name="%s.w" % self.name),
bias_attr=attr.ParamAttr(name="%s.b" % self.name))
return gru
class DynamicLSTMLayer(object):
"""
Dynamic LSTM Layer class
"""
def __init__(self, lstm_dim, name="dyn_lstm"):
"""
initialize
"""
self.lstm_dim = lstm_dim
self.name = name
def ops(self, input):
"""
operation
"""
proj = fluid.dygraph.FC(
input=input,
size=self.lstm_dim * 4,
param_attr=attr.ParamAttr(name="%s_fc.w" % self.name),
bias_attr=attr.ParamAttr(name="%s_fc.b" % self.name))
lstm, _ = fluid.layers.dynamic_lstm(
input=proj,
size=self.lstm_dim * 4,
param_attr=attr.ParamAttr(name="%s.w" % self.name),
bias_attr=attr.ParamAttr(name="%s.b" % self.name))
return lstm
class SequenceLastStepLayer(object):
"""
Get Last Step Sequence Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, input):
"""
operation
"""
last = fluid.layers.sequence_last_step(input)
return last
class SequenceConvPoolLayer(object):
"""
Sequence convolution and pooling Layer class
"""
def __init__(self, filter_size, num_filters, name):
"""
initialize
Args:
filter_size:Convolution kernel size
num_filters:Convolution kernel number
"""
self.filter_size = filter_size
self.num_filters = num_filters
self.name = name
def ops(self, input):
"""
operation
"""
conv = fluid.nets.sequence_conv_pool(
input=input,
filter_size=self.filter_size,
num_filters=self.num_filters,
param_attr=attr.ParamAttr(name=self.name),
act="relu")
return conv
class DataLayer(object):
"""
Data Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, name, shape, dtype, lod_level=0):
"""
operation
"""
data = fluid.layers.data(  # no change needed
name=name, shape=shape, dtype=dtype, lod_level=lod_level)
return data
class ConcatLayer(object):
"""
Connection Layer class
"""
def __init__(self, axis):
"""
initialize
"""
self.axis = axis
def ops(self, inputs):
"""
operation
"""
concat = fluid.layers.concat(inputs, axis=self.axis)
return concat
class ReduceMeanLayer(object):
"""
Reduce Mean Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, input):
"""
operation
"""
mean = fluid.layers.reduce_mean(input)
return mean
class CrossEntropyLayer(object):
"""
Cross Entropy Calculate Layer
"""
def __init__(self, name="cross_entropy"):
"""
initialize
"""
pass
def ops(self, input, label):
"""
operation
"""
loss = fluid.layers.cross_entropy(input=input, label=label)  # no change needed
return loss
class SoftmaxWithCrossEntropyLayer(object):
"""
Softmax with Cross Entropy Calculate Layer
"""
def __init__(self, name="softmax_with_cross_entropy"):
"""
initialize
"""
pass
def ops(self, input, label):
"""
operation
"""
loss = fluid.layers.softmax_with_cross_entropy(  # no change needed
logits=input, label=label)
return loss
class CosSimLayer(object):
"""
Cosine Similarity Calculate Layer
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, x, y):
"""
operation
"""
sim = fluid.layers.cos_sim(x, y)
return sim
class ElementwiseMaxLayer(object):
"""
Elementwise Max Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, x, y):
"""
operation
"""
max = fluid.layers.elementwise_max(x, y)
return max
class ElementwiseAddLayer(object):
"""
Elementwise Add Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, x, y):
"""
operation
"""
add = fluid.layers.elementwise_add(x, y)
return add
class ElementwiseSubLayer(object):
"""
Elementwise Sub Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, x, y):
"""
operation
"""
sub = fluid.layers.elementwise_sub(x, y)
return sub
class ConstantLayer(object):
"""
Generate A Constant Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, input, shape, dtype, value):
"""
operation
"""
constant = fluid.layers.fill_constant_batch_size_like(input, shape,
dtype, value)
return constant
class SigmoidLayer(object):
"""
Sigmoid Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, input):
"""
operation
"""
sigmoid = fluid.layers.sigmoid(input)
return sigmoid
class SoftsignLayer(object):
"""
Softsign Layer class
"""
def __init__(self):
"""
initialize
"""
pass
def ops(self, input):
"""
operation
"""
softsign = fluid.layers.softsign(input)
return softsign
# class MatmulLayer(object):
# def __init__(self, transpose_x, transpose_y):
# self.transpose_x = transpose_x
# self.transpose_y = transpose_y
# def ops(self, x, y):
# matmul = fluid.layers.matmul(x, y, self.transpose_x, self.transpose_y)
# return matmul
# class Conv2dLayer(object):
# def __init__(self, num_filters, filter_size, act, name):
# self.num_filters = num_filters
# self.filter_size = filter_size
# self.act = act
# self.name = name
# def ops(self, input):
# conv = fluid.layers.conv2d(input, self.num_filters, self.filter_size, param_attr=attr.ParamAttr(name="%s.w" % self.name), bias_attr=attr.ParamAttr(name="%s.b" % self.name), act=self.act)
# return conv
# class Pool2dLayer(object):
# def __init__(self, pool_size, pool_type):
# self.pool_size = pool_size
# self.pool_type = pool_type
# def ops(self, input):
# pool = fluid.layers.pool2d(input, self.pool_size, self.pool_type)
# return pool
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
SimNet reader
"""
import logging
import numpy as np
import io
class SimNetProcessor(object):
def __init__(self, args, vocab):
self.args = args
# load vocab
self.vocab = vocab
self.valid_label = np.array([])
self.test_label = np.array([])
def get_reader(self, mode, epoch=0):
"""
Get Reader
"""
def reader_with_pairwise():
"""
Reader with Pairwise
"""
if mode == "valid":
with io.open(self.args.valid_data_dir, "r",
encoding="utf8") as file:
for line in file:
query, title, label = line.strip().split("\t")
if len(query) == 0 or len(title) == 0 or len(
label) == 0 or not label.isdigit() or int(
label) not in [0, 1]:
                            logging.warning(
                                "line does not match format in valid file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
title = [
self.vocab[word] for word in title.split(" ")
if word in self.vocab
]
if len(query) == 0:
query = [0]
if len(title) == 0:
title = [0]
yield [query, title]
elif mode == "test":
with io.open(self.args.test_data_dir, "r", encoding="utf8") as file:
for line in file:
query, title, label = line.strip().split("\t")
if len(query) == 0 or len(title) == 0 or len(
label) == 0 or not label.isdigit() or int(
label) not in [0, 1]:
                        logging.warning(
                            "line does not match format in test file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
title = [
self.vocab[word] for word in title.split(" ")
if word in self.vocab
]
if len(query) == 0:
query = [0]
if len(title) == 0:
title = [0]
# query = np.array([x.reshape(-1,1) for x in query]).astype('int64')
# title = np.array([x.reshape(-1,1) for x in title]).astype('int64')
yield [query, title]
else:
for idx in range(epoch):
with io.open(self.args.train_data_dir, "r",
encoding="utf8") as file:
for line in file:
query, pos_title, neg_title = line.strip().split("\t")
if len(query) == 0 or len(pos_title) == 0 or len(
neg_title) == 0:
                                logging.warning(
                                    "line does not match format in train file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
pos_title = [
self.vocab[word] for word in pos_title.split(" ")
if word in self.vocab
]
neg_title = [
self.vocab[word] for word in neg_title.split(" ")
if word in self.vocab
]
if len(query) == 0:
query = [0]
if len(pos_title) == 0:
pos_title = [0]
if len(neg_title) == 0:
neg_title = [0]
yield [query, pos_title, neg_title]
def reader_with_pointwise():
"""
Reader with Pointwise
"""
if mode == "valid":
with io.open(self.args.valid_data_dir, "r",
encoding="utf8") as file:
for line in file:
query, title, label = line.strip().split("\t")
if len(query) == 0 or len(title) == 0 or len(
label) == 0 or not label.isdigit() or int(
label) not in [0, 1]:
                            logging.warning(
                                "line does not match format in valid file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
title = [
self.vocab[word] for word in title.split(" ")
if word in self.vocab
]
if len(query) == 0:
query = [0]
if len(title) == 0:
title = [0]
yield [query, title]
elif mode == "test":
with io.open(self.args.test_data_dir, "r", encoding="utf8") as file:
for line in file:
query, title, label = line.strip().split("\t")
if len(query) == 0 or len(title) == 0 or len(
label) == 0 or not label.isdigit() or int(
label) not in [0, 1]:
                        logging.warning(
                            "line does not match format in test file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
title = [
self.vocab[word] for word in title.split(" ")
if word in self.vocab
]
if len(query) == 0:
query = [0]
if len(title) == 0:
title = [0]
yield [query, title]
else:
for idx in range(epoch):
with io.open(self.args.train_data_dir, "r",
encoding="utf8") as file:
for line in file:
query, title, label = line.strip().split("\t")
if len(query) == 0 or len(title) == 0 or len(
label) == 0 or not label.isdigit() or int(
label) not in [0, 1]:
                                logging.warning(
                                    "line does not match format in train file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
title = [
self.vocab[word] for word in title.split(" ")
if word in self.vocab
]
label = int(label)
if len(query) == 0:
query = [0]
if len(title) == 0:
title = [0]
yield [query, title, label]
if self.args.task_mode == "pairwise":
return reader_with_pairwise
else:
return reader_with_pointwise
def get_infer_reader(self):
"""
get infer reader
"""
with io.open(self.args.infer_data_dir, "r", encoding="utf8") as file:
for line in file:
query, title = line.strip().split("\t")
if len(query) == 0 or len(title) == 0:
                logging.warning("line does not match format in infer file")
continue
query = [
self.vocab[word] for word in query.split(" ")
if word in self.vocab
]
title = [
self.vocab[word] for word in title.split(" ")
if word in self.vocab
]
if len(query) == 0:
query = [0]
if len(title) == 0:
title = [0]
yield [query, title]
def get_infer_data(self):
"""
get infer data
"""
with io.open(self.args.infer_data_dir, "r", encoding="utf8") as file:
for line in file:
query, title = line.strip().split("\t")
if len(query) == 0 or len(title) == 0:
                logging.warning("line does not match format in infer file")
continue
yield line.strip()
def get_valid_label(self):
"""
get valid data label
"""
if self.valid_label.size == 0:
labels = []
with io.open(self.args.valid_data_dir, "r", encoding="utf8") as f:
for line in f:
labels.append([int(line.strip().split("\t")[-1])])
self.valid_label = np.array(labels)
return self.valid_label
def get_test_label(self):
"""
get test data label
"""
if self.test_label.size == 0:
labels = []
with io.open(self.args.test_data_dir, "r", encoding="utf8") as f:
for line in f:
labels.append([int(line.strip().split("\t")[-1])])
self.test_label = np.array(labels)
return self.test_label
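The pointwise readers above all repeat the same per-line recipe: split on tabs, validate a 0/1 label, map each space-separated word through the vocab, and fall back to `[0]` when nothing is in-vocabulary. A self-contained sketch of that recipe (the function name `parse_pointwise_line` is illustrative; the real logic lives inline in `SimNetProcessor.get_reader`):

```python
import logging

# Sketch of the pointwise line handling in SimNetProcessor.get_reader:
# tab-split, validate the 0/1 label, map words through the vocab, and
# substitute [0] (a padding id) when no word is in-vocabulary.
def parse_pointwise_line(line, vocab):
    parts = line.strip().split("\t")
    if len(parts) != 3:
        return None
    query, title, label = parts
    if not query or not title or not label.isdigit() or int(label) not in (0, 1):
        logging.warning("line does not match expected format")
        return None
    query_ids = [vocab[w] for w in query.split(" ") if w in vocab]
    title_ids = [vocab[w] for w in title.split(" ") if w in vocab]
    return [query_ids or [0], title_ids or [0], int(label)]

vocab = {"hello": 3, "world": 7}
print(parse_pointwise_line("hello world\thello\t1", vocab))
```

The pairwise train reader differs only in reading a (query, positive title, negative title) triple with no label column.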
#!/usr/bin/env bash
export FLAGS_enable_parallel_graph=1
export FLAGS_sync_nccl_allreduce=1
export CUDA_VISIBLE_DEVICES=3
export FLAGS_fraction_of_gpu_memory_to_use=0.95
TASK_NAME='simnet'
TRAIN_DATA_PATH=./data/train_pairwise_data
VALID_DATA_PATH=./data/test_pairwise_data
TEST_DATA_PATH=./data/test_pairwise_data
INFER_DATA_PATH=./data/infer_data
VOCAB_PATH=./data/term2id.dict
CKPT_PATH=./model_files
TEST_RESULT_PATH=./test_result
INFER_RESULT_PATH=./infer_result
TASK_MODE='pairwise'
CONFIG_PATH=./config/bow_pairwise.json
INIT_CHECKPOINT=./model_files/simnet_bow_pairwise_pretrained_model/
# run_train
train() {
python run_classifier.py \
--task_name ${TASK_NAME} \
--use_cuda False \
--do_train True \
--do_valid True \
--do_test True \
--do_infer False \
--batch_size 128 \
--train_data_dir ${TRAIN_DATA_PATH} \
--valid_data_dir ${VALID_DATA_PATH} \
--test_data_dir ${TEST_DATA_PATH} \
--infer_data_dir ${INFER_DATA_PATH} \
--output_dir ${CKPT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--epoch 40 \
--save_steps 2000 \
--validation_steps 200 \
--compute_accuracy False \
--lamda 0.958 \
--task_mode ${TASK_MODE}\
--init_checkpoint ""
}
#run_evaluate
evaluate() {
python run_classifier.py \
--task_name ${TASK_NAME} \
        --use_cuda False \
--do_test True \
--verbose_result True \
--batch_size 128 \
--test_data_dir ${TEST_DATA_PATH} \
--test_result_path ${TEST_RESULT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--task_mode ${TASK_MODE} \
--compute_accuracy False \
--lamda 0.958 \
--init_checkpoint ${INIT_CHECKPOINT}
}
# run_infer
infer() {
python run_classifier.py \
--task_name ${TASK_NAME} \
        --use_cuda False \
--do_infer True \
--batch_size 128 \
--infer_data_dir ${INFER_DATA_PATH} \
--infer_result_path ${INFER_RESULT_PATH} \
--config_path ${CONFIG_PATH} \
--vocab_path ${VOCAB_PATH} \
--task_mode ${TASK_MODE} \
--init_checkpoint ${INIT_CHECKPOINT}
}
main() {
local cmd=${1:-help}
case "${cmd}" in
train)
train "$@";
;;
eval)
evaluate "$@";
;;
infer)
infer "$@";
;;
help)
echo "Usage: ${BASH_SOURCE} {train|eval|infer}";
return 0;
;;
*)
            echo "Unsupported command [${cmd}]";
echo "Usage: ${BASH_SOURCE} {train|eval|infer}";
return 1;
;;
esac
}
main "$@"
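The `main` function above dispatches on its first positional argument, defaulting to `help`. A minimal standalone sketch of the same pattern (the function name `dispatch` and its messages are illustrative):

```shell
# Minimal sketch of the sub-command dispatch used by main() above:
# the first argument selects the action, defaulting to help.
dispatch() {
    local cmd=${1:-help}
    case "${cmd}" in
        train|eval|infer)
            echo "running ${cmd}"
            ;;
        help)
            echo "Usage: dispatch {train|eval|infer}"
            ;;
        *)
            echo "Unsupported command [${cmd}]"
            return 1
            ;;
    esac
}

dispatch train
```

Invoked as `sh run.sh train`, `sh run.sh eval`, or `sh run.sh infer`, the real script routes to the corresponding function defined earlier in the file.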