[style]
based_on_style = pep8
column_limit = 120
# PaddleHub Demo 简介
目前PaddleHub有以下任务示例:
* [口罩检测](./mask_detection)
提供了基于口罩人脸检测及分类模型搭建的完整视频级别Demo,同时提供基于飞桨高性能预测库的C++和Python部署方案。
* [图像分类](./image_classification)
该样例展示了PaddleHub如何将ResNet50、ResNet101、ResNet152、MobileNet、NasNet以及PNasNet作为预训练模型在Flowers、DogCat、Indoor67、Food101、StanfordDogs等数据集上进行图像分类的FineTune和预测。
* [中文词法分析](./lac)
该样例展示了PaddleHub如何利用中文词法分析LAC进行预测。
* [情感分析](./senta)
该样例展示了PaddleHub如何利用中文情感分析模型Senta进行FineTune和预测。
* [序列标注](./sequence_labeling)
该样例展示了PaddleHub如何将ERNIE/BERT等Transformer类模型作为预训练模型在MSRA_NER数据集上完成序列标注的FineTune和预测。
* [目标检测](./ssd)
该样例展示了PaddleHub如何将SSD作为预训练模型在PascalVOC数据集上完成目标检测的预测。
* [文本分类](./text_classification)
该样例展示了PaddleHub如何将ERNIE/BERT等Transformer类模型作为预训练模型在GLUE、ChnSentiCorp等数据集上完成文本分类的FineTune和预测。
**同时,该样例还展示了如何将一个Fine-tune保存的模型转化成PaddleHub Module。** 请确认转化时,使用的PaddleHub为1.6.0以上版本。
* [多标签分类](./multi_label_classification)
该样例展示了PaddleHub如何将BERT作为预训练模型在Toxic数据集上完成多标签分类的FineTune和预测。
* [回归任务](./regression)
该样例展示了PaddleHub如何将BERT作为预训练模型在GLUE-STSB数据集上完成回归任务的FineTune和预测。
* [阅读理解](./reading_comprehension)
该样例展示了PaddleHub如何将BERT作为预训练模型在SQuAD数据集上完成阅读理解的FineTune和预测。
* [检索式问答任务](./qa_classification)
该样例展示了PaddleHub如何将ERNIE和BERT作为预训练模型在NLPCC-DBQA等数据集上完成检索式问答任务的FineTune和预测。
* [句子语义相似度计算](./sentence_similarity)
该样例展示了PaddleHub如何将word2vec_skipgram用于计算两个文本语义相似度。
* 超参优化AutoDL Finetuner使用
该样例展示了如何使用PaddleHub超参优化AutoDL Finetuner,给出了自动搜索[图像分类](./autofinetune_image_classification)/[文本分类](./autofinetune_text_classification)任务较佳超参数的示例。
* [服务化部署Hub Serving使用](./serving)
该样例展示了如何使用服务化部署Hub Serving,将PaddleHub支持的可预测Module进行服务化部署。
* [预训练模型转化成PaddleHub Module](./senta_module_sample)
该样例展示了如何将一个预训练模型转化成PaddleHub Module形式,使得可以通过`hub.Module(name="module_name")`实现一键加载。
请确认转化时,使用的PaddleHub为1.6.0以上版本。
**NOTE:**
以上任务示例均是利用PaddleHub提供的数据集,若您想在自定义数据集上完成相应任务,请查看[PaddleHub适配自定义数据完成Fine-tune](../docs/tutorial/how_to_load_data.md)
## 在线体验
我们在AI Studio上提供了IPython NoteBook形式的demo,您可以直接在平台上在线体验,链接如下:
|预训练模型|任务类型|数据集|AIStudio链接|备注|
|-|-|-|-|-|
|ResNet|图像分类|猫狗数据集DogCat|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/147010)||
|ERNIE|文本分类|中文情感分类数据集ChnSentiCorp|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/147006)||
|ERNIE|文本分类|中文新闻分类数据集THUNEWS|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/221999)|本教程讲述了如何将自定义数据集加载,并利用Fine-tune API完成文本分类迁移学习。|
|ERNIE|序列标注|中文序列标注数据集MSRA_NER|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/147009)||
|ERNIE|序列标注|中文快递单数据集Express|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/184200)|本教程讲述了如何将自定义数据集加载,并利用Fine-tune API完成序列标注迁移学习。|
|ERNIE Tiny|文本分类|中文情感分类数据集ChnSentiCorp|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/186443)||
|Senta|文本分类|中文情感分类数据集ChnSentiCorp|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/216846)|本教程讲述了如何利用Senta和Fine-tune API完成情感分类迁移学习。|
|Senta|情感分析预测|N/A|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/215814)||
|LAC|词法分析|N/A|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/215711)||
|Ultra-Light-Fast-Generic-Face-Detector-1MB|人脸检测|N/A|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/215962)||
## 超参优化AutoDL Finetuner
PaddleHub还提供了超参优化(Hyperparameter Tuning)功能, 自动搜索最优模型超参得到更好的模型效果。详细信息参见[AutoDL Finetuner超参优化功能教程](../docs/tutorial/autofinetune.md)
# PaddleHub超参优化——图像分类
**确认安装PaddleHub版本在1.3.0以上, 同时PaddleHub AutoDL Finetuner功能要求至少有一张GPU显卡可用。**
本示例展示如何利用PaddleHub超参优化AutoDL Finetuner,得到一个效果较佳的超参数组合。
每次执行AutoDL Finetuner,用户只需要定义搜索空间,改动几行代码,就能利用PaddleHub搜索最好的超参组合。 只需要两步即可完成:
* 定义搜索空间:AutoDL Finetuner会根据搜索空间来取样生成参数和网络架构。搜索空间通过YAML文件来定义。
* 改动模型代码:需要首先定义参数组,并更新模型代码。
## Step1:定义搜索空间
AutoDL Finetuner会根据搜索空间来取样生成参数和网络架构。搜索空间通过YAML文件来定义。
要定义搜索空间,需要定义变量名称、类型及其搜索范围。通过这些信息构建了一个超参空间,
PaddleHub将在这个空间内进行超参数的搜索,将搜索到的超参传入训练脚本(本例中为img_cls.py)获得评估效果,根据评估效果自动调整超参搜索方向,直到满足搜索次数。
以Fine-tune图像分类任务为例, 以下是待优化超参数的yaml文件hparam.yaml,包含需要搜索的超参名字、类型、范围等信息。目前参数搜索类型只支持float和int类型。
```
param_list:
- name : learning_rate
init_value : 0.001
type : float
lower_than : 0.05
greater_than : 0.00005
- name : batch_size
init_value : 12
type : int
lower_than : 20
greater_than : 10
```
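上述YAML中的超参名会作为命令行选项传入训练脚本,因此脚本中需要定义同名的argparse选项。下面是一个与该YAML对应的示意片段,仅作说明,完整实现见本目录下的img_cls.py:
```python
import argparse

parser = argparse.ArgumentParser(__doc__)
# 选项名须与hparam.yaml中的超参名保持一致
parser.add_argument("--learning_rate", type=float, default=1e-4, help="learning_rate.")
parser.add_argument("--batch_size", type=int, default=16, help="batch_size.")
args = parser.parse_args()
```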
## Step2:改动模型代码
img_cls.py以mobilenet为预训练模型,在flowers数据集上进行Fine-tune。PaddleHub如何完成Finetune可以参考[图像分类迁移学习示例](../image_classification)
* import paddlehub
在img_cls.py加上`import paddlehub as hub`
* 从AutoDL Finetuner获得参数值
1. img_cls.py的选项参数须包含待优化超参数,需要将超参以argparse选项的方式写在其中,待搜索超参数的选项名字须和yaml文件中的超参数名字保持一致。
2. img_cls.py须包含选项参数saved_params_dir,优化后的参数将会保存到该路径下。
3. 超参评估策略选择PopulationBased时,img_cls.py须包含选项参数model_path,自动从model_path指定的路径恢复模型。
* 返回配置的最终效果
img_cls.py须反馈模型的评价效果(建议使用验证集或者测试集上的评价效果),通过调用`report_final_result`接口反馈,如
```python
hub.report_final_result(eval_avg_score["acc"])
```
**NOTE:** 输出的评价效果取值范围应为`(-∞, 1]`,取值越高,表示效果越好。
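对于以loss等“越小越好”指标衡量的任务,可以反馈其相反数来满足上述取值约定,以下写法仅为示意:
```python
# 反馈 -loss,保证反馈值越大表示效果越好
hub.report_final_result(-eval_avg_loss)
```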
## 启动AutoDL Finetuner
在完成安装PaddlePaddle与PaddleHub后,通过执行脚本`sh run_autofinetune.sh`即可开始使用超参优化功能。
**NOTE:** 关于PaddleHub超参优化详情参考[教程](../../docs/tutorial/autofinetune.md)
param_list:
- name : learning_rate
init_value : 0.001
type : float
lower_than : 0.05
greater_than : 0.00005
- name : batch_size
init_value : 12
type : int
lower_than : 20
greater_than : 10
# coding:utf-8
import argparse
import os
import ast
import shutil
import paddlehub as hub
from paddlehub.common.logger import logger
parser = argparse.ArgumentParser(__doc__)
parser.add_argument(
"--epochs", type=int, default=5, help="Number of epoches for fine-tuning.")
parser.add_argument(
"--checkpoint_dir", type=str, default=None, help="Path to save log data.")
parser.add_argument(
"--module",
type=str,
default="mobilenet",
help="Module used as feature extractor.")
# the name of hyper-parameters to be searched should keep with hparam.py
parser.add_argument(
"--batch_size",
type=int,
default=16,
help="Total examples' number in batch for training.")
parser.add_argument(
"--learning_rate", type=float, default=1e-4, help="learning_rate.")
# saved_params_dir and model_path are needed by auto fine-tune
parser.add_argument(
"--saved_params_dir",
type=str,
default="",
help="Directory for saving model")
parser.add_argument(
"--model_path", type=str, default="", help="load model path")
module_map = {
"resnet50": "resnet_v2_50_imagenet",
"resnet101": "resnet_v2_101_imagenet",
"resnet152": "resnet_v2_152_imagenet",
"mobilenet": "mobilenet_v2_imagenet",
"nasnet": "nasnet_imagenet",
"pnasnet": "pnasnet_imagenet"
}
def is_path_valid(path):
if path == "":
return False
path = os.path.abspath(path)
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.mkdir(dirname)
return True
def finetune(args):
# Load Paddlehub pretrained model, default as mobilenet
module = hub.Module(name=args.module)
input_dict, output_dict, program = module.context(trainable=True)
# Download dataset and use ImageClassificationReader to read dataset
dataset = hub.dataset.Flowers()
data_reader = hub.reader.ImageClassificationReader(
image_width=module.get_expected_image_width(),
image_height=module.get_expected_image_height(),
images_mean=module.get_pretrained_images_mean(),
images_std=module.get_pretrained_images_std(),
dataset=dataset)
# The last 2 layer of resnet_v2_101_imagenet network
feature_map = output_dict["feature_map"]
img = input_dict["image"]
feed_list = [img.name]
# Select fine-tune strategy, setup config and fine-tune
strategy = hub.DefaultFinetuneStrategy(learning_rate=args.learning_rate)
config = hub.RunConfig(
use_cuda=True,
num_epoch=args.epochs,
batch_size=args.batch_size,
checkpoint_dir=args.checkpoint_dir,
strategy=strategy)
# Construct transfer learning network
task = hub.ImageClassifierTask(
data_reader=data_reader,
feed_list=feed_list,
feature=feature_map,
num_classes=dataset.num_labels,
config=config)
# Load model from the defined model path or not
if args.model_path != "":
with task.phase_guard(phase="train"):
task.init_if_necessary()
task.load_parameters(args.model_path)
logger.info("PaddleHub has loaded model from %s" % args.model_path)
# Fine-tune by PaddleHub's API
task.finetune()
# Evaluate by PaddleHub's API
run_states = task.eval()
# Get acc score on dev
eval_avg_score, eval_avg_loss, eval_run_speed = task._calculate_metrics(
run_states)
# Move ckpt/best_model to the defined saved parameters directory
best_model_dir = os.path.join(config.checkpoint_dir, "best_model")
if is_path_valid(args.saved_params_dir) and os.path.exists(best_model_dir):
shutil.copytree(best_model_dir, args.saved_params_dir)
shutil.rmtree(config.checkpoint_dir)
# acc on dev will be used by auto fine-tune
hub.report_final_result(eval_avg_score["acc"])
if __name__ == "__main__":
args = parser.parse_args()
if not args.module in module_map:
hub.logger.error("module should in %s" % module_map.keys())
exit(1)
args.module = module_map[args.module]
finetune(args)
OUTPUT=result
hub autofinetune img_cls.py \
--param_file=hparam.yaml \
--gpu=0 \
--popsize=15 \
--round=10 \
--output_dir=${OUTPUT} \
--evaluator=fulltrail \
--tuning_strategy=pshe2
# PaddleHub超参优化——文本分类
**确认安装PaddleHub版本在1.3.0以上, 同时PaddleHub AutoDL Finetuner功能要求至少有一张GPU显卡可用。**
本示例展示如何利用PaddleHub超参优化AutoDL Finetuner,得到一个效果较佳的超参数组合。
每次执行AutoDL Finetuner,用户只需要定义搜索空间,改动几行代码,就能利用PaddleHub搜索最好的超参组合。 只需要两步即可完成:
* 定义搜索空间:AutoDL Finetuner会根据搜索空间来取样生成参数和网络架构。搜索空间通过YAML文件来定义。
* 改动模型代码:需要首先定义参数组,并更新模型代码。
## Step1:定义搜索空间
AutoDL Finetuner会根据搜索空间来取样生成参数和网络架构。搜索空间通过YAML文件来定义。
要定义搜索空间,需要定义变量名称、类型及其搜索范围。通过这些信息构建了一个超参空间,
PaddleHub将在这个空间内进行超参数的搜索,将搜索到的超参传入训练脚本(本例中为text_cls.py)获得评估效果,根据评估效果自动调整超参搜索方向,直到满足搜索次数。
以Fine-tune文本分类任务为例, 以下是待优化超参数的yaml文件hparam.yaml,包含需要搜索的超参名字、类型、范围等信息。目前参数搜索类型只支持float和int类型。
```
param_list:
- name : learning_rate
init_value : 0.001
type : float
lower_than : 0.05
greater_than : 0.000005
- name : weight_decay
init_value : 0.1
type : float
lower_than : 1
greater_than : 0.0
- name : batch_size
init_value : 32
type : int
lower_than : 40
greater_than : 30
- name : warmup_prop
init_value : 0.1
type : float
lower_than : 0.2
greater_than : 0.0
```
## Step2:改动模型代码
text_cls.py以ernie为预训练模型,在ChnSentiCorp数据集上进行Fine-tune。PaddleHub如何完成Finetune可以参考[文本分类迁移学习示例](../text_classification)
* import paddlehub
在text_cls.py加上`import paddlehub as hub`
* 从AutoDL Finetuner获得参数值
1. text_cls.py的选项参数须包含待优化超参数,需要将超参以argparse选项的方式写在其中,待搜索超参数的选项名字须和yaml文件中的超参数名字保持一致。
2. text_cls.py须包含选项参数saved_params_dir,优化后的参数将会保存到该路径下。
3. 超参评估策略选择PopulationBased时,text_cls.py须包含选项参数model_path,自动从model_path指定的路径恢复模型。
* 返回配置的最终效果
text_cls.py须反馈模型的评价效果(建议使用验证集或者测试集上的评价效果),通过调用`report_final_result`接口反馈,如
```python
hub.report_final_result(eval_avg_score["acc"])
```
**NOTE:** 输出的评价效果取值范围应为`(-∞, 1]`,取值越高,表示效果越好。
## 启动AutoDL Finetuner
在完成安装PaddlePaddle与PaddleHub后,通过执行脚本`sh run_autofinetune.sh`即可开始使用超参优化功能。
**NOTE:** 关于PaddleHub超参优化详情参考[教程](../../docs/tutorial/autofinetune.md)
param_list:
- name : learning_rate
init_value : 0.001
type : float
lower_than : 0.05
greater_than : 0.000005
- name : weight_decay
init_value : 0.1
type : float
lower_than : 1
greater_than : 0.0
- name : batch_size
init_value : 32
type : int
lower_than : 40
greater_than : 30
- name : warmup_prop
init_value : 0.1
type : float
lower_than : 0.2
greater_than : 0.0
OUTPUT=result
hub autofinetune text_cls.py \
--param_file=hparam.yaml \
--gpu=0 \
--popsize=15 \
--round=10 \
--output_dir=${OUTPUT} \
--evaluator=fulltrail \
--tuning_strategy=pshe2
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import ast
import shutil
import paddlehub as hub
import os
from paddlehub.common.logger import logger
parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--epochs", type=int, default=3, help="epochs.")
# the name of hyper-parameters to be searched should keep with hparam.py
parser.add_argument("--batch_size", type=int, default=32, help="batch_size.")
parser.add_argument(
"--learning_rate", type=float, default=5e-5, help="learning_rate.")
parser.add_argument(
"--warmup_prop", type=float, default=0.1, help="warmup_prop.")
parser.add_argument(
"--weight_decay", type=float, default=0.01, help="weight_decay.")
parser.add_argument(
"--max_seq_len",
type=int,
default=128,
help="Number of words of the longest seqence.")
parser.add_argument(
"--checkpoint_dir",
type=str,
default=None,
help="Directory to model checkpoint")
# saved_params_dir and model_path are needed by auto fine-tune
parser.add_argument(
"--saved_params_dir",
type=str,
default="",
help="Directory for saving model during ")
parser.add_argument(
"--model_path", type=str, default="", help="load model path")
args = parser.parse_args()
def is_path_valid(path):
if path == "":
return False
path = os.path.abspath(path)
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
os.mkdir(dirname)
return True
if __name__ == '__main__':
# Load Paddlehub ERNIE pretrained model
module = hub.Module(name="ernie")
inputs, outputs, program = module.context(
trainable=True, max_seq_len=args.max_seq_len)
# Download dataset and use ClassifyReader to read dataset
dataset = hub.dataset.ChnSentiCorp()
metrics_choices = ["acc"]
reader = hub.reader.ClassifyReader(
dataset=dataset,
vocab_path=module.get_vocab_path(),
max_seq_len=args.max_seq_len)
# Construct transfer learning network
# Use "pooled_output" for classification tasks on an entire sentence.
pooled_output = outputs["pooled_output"]
# Setup feed list for data feeder
# Must feed all the tensor of ERNIE's module need
feed_list = [
inputs["input_ids"].name,
inputs["position_ids"].name,
inputs["segment_ids"].name,
inputs["input_mask"].name,
]
# Select fine-tune strategy, setup config and fine-tune
strategy = hub.AdamWeightDecayStrategy(
warmup_proportion=args.warmup_prop,
learning_rate=args.learning_rate,
weight_decay=args.weight_decay,
lr_scheduler="linear_decay")
# Setup RunConfig for PaddleHub Fine-tune API
config = hub.RunConfig(
checkpoint_dir=args.checkpoint_dir,
use_cuda=True,
num_epoch=args.epochs,
batch_size=args.batch_size,
enable_memory_optim=True,
strategy=strategy)
# Define a classfication fine-tune task by PaddleHub's API
cls_task = hub.TextClassifierTask(
data_reader=reader,
feature=pooled_output,
feed_list=feed_list,
num_classes=dataset.num_labels,
config=config,
metrics_choices=metrics_choices)
# Load model from the defined model path or not
if args.model_path != "":
with cls_task.phase_guard(phase="train"):
cls_task.init_if_necessary()
cls_task.load_parameters(args.model_path)
logger.info("PaddleHub has loaded model from %s" % args.model_path)
cls_task.finetune()
run_states = cls_task.eval()
eval_avg_score, eval_avg_loss, eval_run_speed = cls_task._calculate_metrics(
run_states)
# Move ckpt/best_model to the defined saved parameters directory
best_model_dir = os.path.join(config.checkpoint_dir, "best_model")
if is_path_valid(args.saved_params_dir) and os.path.exists(best_model_dir):
shutil.copytree(best_model_dir, args.saved_params_dir)
shutil.rmtree(config.checkpoint_dir)
# acc on dev will be used by auto fine-tune
hub.report_final_result(eval_avg_score["acc"])
# PaddleHub 图像分类
本示例将展示如何使用PaddleHub Fine-tune API以及[ResNet](https://www.paddlepaddle.org.cn/hubdetail?name=resnet_v2_50_imagenet&en_category=ImageClassification)等预训练模型完成分类任务。
## 如何开始Fine-tune
在完成安装PaddlePaddle与PaddleHub后,通过执行脚本`sh run_classifier.sh`即可开始使用ResNet对[Flowers](../../docs/reference/dataset.md#class-hubdatasetflowers)等数据集进行Fine-tune。
其中脚本参数说明如下:
```shell
--batch_size: 批处理大小,请结合显存情况进行调整,若出现显存不足,请适当调低这一参数。默认为16;
--num_epoch: Fine-tune迭代的轮数。默认为1;
--module: 使用哪个Module作为Fine-tune的特征提取器,脚本支持{resnet50/resnet101/resnet152/mobilenet/nasnet/pnasnet}等模型。默认为resnet50;
--checkpoint_dir: 模型保存路径,PaddleHub会自动保存验证集上表现最好的模型。默认为paddlehub_finetune_ckpt;
--dataset: 使用什么数据集进行Fine-tune,脚本支持{flowers/dogcat/stanforddogs/indoor67/food101}。默认为flowers;
--use_gpu: 是否使用GPU进行训练,如果机器支持GPU且安装了GPU版本的PaddlePaddle,我们建议您打开这个开关。默认关闭;
--use_data_parallel: 是否使用数据并行,打开该开关时,会将数据分散到不同的卡上进行训练(CPU下会分布到不同线程)。默认打开;
```
## 代码步骤
使用PaddleHub Fine-tune API进行Fine-tune可以分为4个步骤。
### Step1: 加载预训练模型
```python
module = hub.Module(name="resnet_v2_50_imagenet")
inputs, outputs, program = module.context(trainable=True)
```
PaddleHub提供许多图像分类预训练模型,如xception、mobilenet、efficientnet等,详细信息参见[图像分类模型](https://www.paddlepaddle.org.cn/hub?filter=en_category&value=ImageClassification)
如果想尝试efficientnet模型,只需要更换Module中的`name`参数即可.
```python
# 更换name参数即可无缝切换efficientnet模型, 代码示例如下
module = hub.Module(name="efficientnetb7_imagenet")
```
### Step2: 下载数据集并使用ImageClassificationReader读取数据
```python
dataset = hub.dataset.Flowers()
data_reader = hub.reader.ImageClassificationReader(
image_width=module.get_expected_image_width(),
image_height=module.get_expected_image_height(),
images_mean=module.get_pretrained_images_mean(),
images_std=module.get_pretrained_images_std(),
dataset=dataset)
```
其中数据集的准备代码可以参考 [flowers.py](../../paddlehub/dataset/flowers.py)
同时,PaddleHub提供了更多的图像分类数据集:
| 数据集 | API |
| -------- | ------------------------------------------ |
| Flowers | hub.dataset.Flowers() |
| DogCat | hub.dataset.DogCat() |
| Indoor67 | hub.dataset.Indoor67() |
| Food101 | hub.dataset.Food101() |
`hub.dataset.Flowers()` 会自动从网络下载数据集并解压到用户目录下`$HOME/.paddlehub/dataset`目录。
`module.get_expected_image_width()`和`module.get_expected_image_height()`会返回预训练模型对应的图片尺寸。
`module.get_pretrained_images_mean()`和`module.get_pretrained_images_std()`会返回预训练模型对应的图片均值和方差。
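例如,可以用下面的示意代码查看某个预训练模型期望的输入尺寸与归一化参数(以resnet_v2_50_imagenet为例,具体数值以实际模型为准):
```python
import paddlehub as hub

module = hub.Module(name="resnet_v2_50_imagenet")
# 预训练模型期望的输入图片宽高
print(module.get_expected_image_width(), module.get_expected_image_height())
# 预训练时使用的图片均值与方差
print(module.get_pretrained_images_mean(), module.get_pretrained_images_std())
```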
#### 自定义数据集
如果想加载自定义数据集完成迁移学习,详细参见[自定义数据集](../../docs/tutorial/how_to_load_data.md)
### Step3:选择优化策略和运行配置
```python
strategy = hub.DefaultFinetuneStrategy(
learning_rate=1e-4,
optimizer_name="adam",
regularization_coeff=1e-3)
config = hub.RunConfig(use_cuda=True, use_data_parallel=True, num_epoch=3, batch_size=32, strategy=strategy)
```
#### 优化策略
PaddleHub提供了许多优化策略,如`AdamWeightDecayStrategy`、`ULMFiTStrategy`、`DefaultFinetuneStrategy`等,详细信息参见[策略](../../docs/reference/strategy.md)
其中`DefaultFinetuneStrategy`:
* `learning_rate`: 全局学习率。默认为1e-4;
* `optimizer_name`: 优化器名称。默认adam;
* `regularization_coeff`: 正则化的λ参数。默认为1e-3;
#### 运行配置
`RunConfig` 主要控制Fine-tune的训练,包含以下可控制的参数:
* `log_interval`: 进度日志打印间隔,默认每10个step打印一次;
* `eval_interval`: 模型评估的间隔,默认每100个step评估一次验证集;
* `save_ckpt_interval`: 模型保存间隔,请根据任务大小配置,默认只保存验证集效果最好的模型和训练结束的模型;
* `use_cuda`: 是否使用GPU训练,默认为False;
* `use_pyreader`: 是否使用pyreader,默认False;
* `use_data_parallel`: 是否使用并行计算,默认True。打开该功能依赖nccl库;
* `checkpoint_dir`: 模型checkpoint保存路径, 若用户没有指定,程序会自动生成;
* `num_epoch`: Fine-tune的轮数;
* `batch_size`: 训练的批大小,如果使用GPU,请根据实际情况调整batch_size;
* `strategy`: Fine-tune优化策略;
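结合上述参数,下面给出一个仅作示意的`RunConfig`配置,取值为示例值,请根据任务实际情况调整:
```python
config = hub.RunConfig(
    log_interval=10,
    eval_interval=100,
    use_cuda=True,
    use_data_parallel=True,
    checkpoint_dir="paddlehub_finetune_ckpt",  # 示例保存路径
    num_epoch=3,
    batch_size=32,
    strategy=strategy)
```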
### Step4: 构建网络并创建分类迁移任务进行Fine-tune
```python
feature_map = output_dict["feature_map"]
feed_list = [input_dict["image"].name]
task = hub.ImageClassifierTask(
data_reader=data_reader,
feed_list=feed_list,
feature=feature_map,
num_classes=dataset.num_labels,
config=config)
task.finetune_and_eval()
```
**NOTE:**
1. `output_dict["feature_map"]`返回了resnet/mobilenet等模型对应的feature_map,可以用于图片的特征表达。
2. `feed_list`中的inputs参数指明了resnet/mobilenet等模型的输入tensor的顺序,与ImageClassifierTask返回的结果一致。
3. `hub.ImageClassifierTask`通过输入特征,label与迁移的类别数,可以生成适用于图像分类的迁移任务`ImageClassifierTask`
#### 自定义迁移任务
如果想改变迁移任务组网,详细参见[自定义迁移任务](../../docs/tutorial/how_to_define_task.md)
## 可视化
Fine-tune API训练过程中会自动对关键训练指标进行打点,启动程序后执行下面命令
```bash
$ visualdl --logdir $CKPT_DIR/visualization --host ${HOST_IP} --port ${PORT_NUM}
```
其中${HOST_IP}为本机IP地址,${PORT_NUM}为可用端口号,如本机IP地址为192.168.0.1,端口号8040,用浏览器打开192.168.0.1:8040,即可看到训练过程中指标的变化情况。
## 模型预测
当完成Fine-tune后,Fine-tune过程在验证集上表现最优的模型会被保存在`${CHECKPOINT_DIR}/best_model`目录下,其中`${CHECKPOINT_DIR}`目录为Fine-tune时所选择的保存checkpoint的目录。
我们使用该模型来进行预测。predict.py脚本支持的参数如下:
```shell
--module: 使用哪个Module作为Fine-tune的特征提取器,脚本支持{resnet50/resnet101/resnet152/mobilenet/nasnet/pnasnet}等模型。默认为resnet50;
--checkpoint_dir: 模型保存路径,PaddleHub会自动保存验证集上表现最好的模型。默认为paddlehub_finetune_ckpt;
--dataset: 使用什么数据集进行Fine-tune,脚本支持{flowers/dogcat}。默认为flowers;
--use_gpu: 是否使用GPU进行训练,如果本机支持GPU且安装了GPU版本的PaddlePaddle,我们建议您打开这个开关。默认关闭;
--use_pyreader: 是否使用pyreader进行数据喂入。默认关闭;
```
**NOTE:** 进行预测时,所选择的module,checkpoint_dir,dataset必须和Fine-tune所用的一样。
参数配置正确后,请执行脚本`sh run_predict.sh`,即可看到图片分类预测结果。
如需了解更多预测步骤,请参考`predict.py`
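`predict.py`中预测调用的核心写法大致如下(示意片段,假设`task`已按前文方式构建完成):
```python
data = ["./test/test_img_daisy.jpg", "./test/test_img_roses.jpg"]
print(task.predict(data=data, return_result=True))
```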
我们在AI Studio上提供了IPython NoteBook形式的demo,您可以直接在平台上在线体验,链接如下:
|预训练模型|任务类型|数据集|AIStudio链接|备注|
|-|-|-|-|-|
|ResNet|图像分类|猫狗数据集DogCat|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/147010)||
|ERNIE|文本分类|中文情感分类数据集ChnSentiCorp|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/147006)||
|ERNIE|文本分类|中文新闻分类数据集THUNEWS|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/221999)|本教程讲述了如何将自定义数据集加载,并利用Fine-tune API完成文本分类迁移学习。|
|ERNIE|序列标注|中文序列标注数据集MSRA_NER|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/147009)||
|ERNIE|序列标注|中文快递单数据集Express|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/184200)|本教程讲述了如何将自定义数据集加载,并利用Fine-tune API完成序列标注迁移学习。|
|ERNIE Tiny|文本分类|中文情感分类数据集ChnSentiCorp|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/186443)||
|Senta|文本分类|中文情感分类数据集ChnSentiCorp|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/216846)|本教程讲述了如何利用Senta和Fine-tune API完成情感分类迁移学习。|
|Senta|情感分析预测|N/A|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/215814)||
|LAC|词法分析|N/A|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/215711)||
|Ultra-Light-Fast-Generic-Face-Detector-1MB|人脸检测|N/A|[点击体验](https://aistudio.baidu.com/aistudio/projectdetail/215962)||
## 超参优化AutoDL Finetuner
PaddleHub还提供了超参优化(Hyperparameter Tuning)功能, 自动搜索最优模型超参得到更好的模型效果。详细信息参见[AutoDL Finetuner超参优化功能教程](../../docs/tutorial/autofinetune.md)
#coding:utf-8
import argparse
import os
import ast
import paddle.fluid as fluid
import paddlehub as hub
import numpy as np
# yapf: disable
parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--num_epoch", type=int, default=1, help="Number of epoches for fine-tuning.")
parser.add_argument("--use_gpu", type=ast.literal_eval, default=True, help="Whether use GPU for fine-tuning.")
parser.add_argument("--checkpoint_dir", type=str, default="paddlehub_finetune_ckpt", help="Path to save log data.")
parser.add_argument("--batch_size", type=int, default=16, help="Total examples' number in batch for training.")
parser.add_argument("--module", type=str, default="resnet50", help="Module used as feature extractor.")
parser.add_argument("--dataset", type=str, default="flowers", help="Dataset to fine-tune.")
parser.add_argument("--use_data_parallel", type=ast.literal_eval, default=True, help="Whether use data parallel.")
# yapf: enable.
module_map = {
"resnet50": "resnet_v2_50_imagenet",
"resnet101": "resnet_v2_101_imagenet",
"resnet152": "resnet_v2_152_imagenet",
"mobilenet": "mobilenet_v2_imagenet",
"nasnet": "nasnet_imagenet",
"pnasnet": "pnasnet_imagenet"
}
def finetune(args):
# Load Paddlehub pretrained model
module = hub.Module(name=args.module)
input_dict, output_dict, program = module.context(trainable=True)
# Download dataset
if args.dataset.lower() == "flowers":
dataset = hub.dataset.Flowers()
elif args.dataset.lower() == "dogcat":
dataset = hub.dataset.DogCat()
elif args.dataset.lower() == "indoor67":
dataset = hub.dataset.Indoor67()
elif args.dataset.lower() == "food101":
dataset = hub.dataset.Food101()
elif args.dataset.lower() == "stanforddogs":
dataset = hub.dataset.StanfordDogs()
else:
raise ValueError("%s dataset is not defined" % args.dataset)
# Use ImageClassificationReader to read dataset
data_reader = hub.reader.ImageClassificationReader(
image_width=module.get_expected_image_width(),
image_height=module.get_expected_image_height(),
images_mean=module.get_pretrained_images_mean(),
images_std=module.get_pretrained_images_std(),
dataset=dataset)
feature_map = output_dict["feature_map"]
# Setup feed list for data feeder
feed_list = [input_dict["image"].name]
# Setup RunConfig for PaddleHub Fine-tune API
config = hub.RunConfig(
use_data_parallel=args.use_data_parallel,
use_cuda=args.use_gpu,
num_epoch=args.num_epoch,
batch_size=args.batch_size,
checkpoint_dir=args.checkpoint_dir,
strategy=hub.finetune.strategy.DefaultFinetuneStrategy())
# Define a image classification task by PaddleHub Fine-tune API
task = hub.ImageClassifierTask(
data_reader=data_reader,
feed_list=feed_list,
feature=feature_map,
num_classes=dataset.num_labels,
config=config)
# Fine-tune by PaddleHub's API
task.finetune_and_eval()
if __name__ == "__main__":
args = parser.parse_args()
if not args.module in module_map:
hub.logger.error("module should in %s" % module_map.keys())
exit(1)
args.module = module_map[args.module]
finetune(args)
#coding:utf-8
import argparse
import os
import numpy as np
import paddlehub as hub
import paddle.fluid as fluid
from paddle.fluid.dygraph import Linear
from paddle.fluid.dygraph.base import to_variable
from paddle.fluid.optimizer import AdamOptimizer
# yapf: disable
parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--num_epoch", type=int, default=1, help="Number of epoches for fine-tuning.")
parser.add_argument("--checkpoint_dir", type=str, default="paddlehub_finetune_ckpt_dygraph", help="Path to save log data.")
parser.add_argument("--batch_size", type=int, default=16, help="Total examples' number in batch for training.")
parser.add_argument("--log_interval", type=int, default=10, help="log interval.")
parser.add_argument("--save_interval", type=int, default=10, help="save interval.")
# yapf: enable.
class ResNet50(fluid.dygraph.Layer):
def __init__(self, num_classes, backbone):
super(ResNet50, self).__init__()
self.fc = Linear(input_dim=2048, output_dim=num_classes)
self.backbone = backbone
def forward(self, imgs):
feature_map = self.backbone(imgs)
feature_map = fluid.layers.reshape(feature_map, shape=[-1, 2048])
pred = self.fc(feature_map)
return fluid.layers.softmax(pred)
def finetune(args):
with fluid.dygraph.guard():
resnet50_vd_10w = hub.Module(name="resnet50_vd_10w")
dataset = hub.dataset.Flowers()
resnet = ResNet50(
num_classes=dataset.num_labels, backbone=resnet50_vd_10w)
adam = AdamOptimizer(
learning_rate=0.001, parameter_list=resnet.parameters())
state_dict_path = os.path.join(args.checkpoint_dir,
'dygraph_state_dict')
if os.path.exists(state_dict_path + '.pdparams'):
state_dict, _ = fluid.load_dygraph(state_dict_path)
resnet.load_dict(state_dict)
reader = hub.reader.ImageClassificationReader(
image_width=resnet50_vd_10w.get_expected_image_width(),
image_height=resnet50_vd_10w.get_expected_image_height(),
images_mean=resnet50_vd_10w.get_pretrained_images_mean(),
images_std=resnet50_vd_10w.get_pretrained_images_std(),
dataset=dataset)
train_reader = reader.data_generator(
batch_size=args.batch_size, phase='train')
loss_sum = acc_sum = cnt = 0
# 执行epoch_num次训练
for epoch in range(args.num_epoch):
# 读取训练数据进行训练
for batch_id, data in enumerate(train_reader()):
imgs = np.array(data[0][0])
labels = np.array(data[0][1])
pred = resnet(imgs)
acc = fluid.layers.accuracy(pred, to_variable(labels))
loss = fluid.layers.cross_entropy(pred, to_variable(labels))
avg_loss = fluid.layers.mean(loss)
avg_loss.backward()
# 参数更新
adam.minimize(avg_loss)
loss_sum += avg_loss.numpy() * imgs.shape[0]
acc_sum += acc.numpy() * imgs.shape[0]
cnt += imgs.shape[0]
if batch_id % args.log_interval == 0:
print('epoch {}: loss {}, acc {}'.format(
epoch, loss_sum / cnt, acc_sum / cnt))
loss_sum = acc_sum = cnt = 0
if batch_id % args.save_interval == 0:
state_dict = resnet.state_dict()
fluid.save_dygraph(state_dict, state_dict_path)
if __name__ == "__main__":
args = parser.parse_args()
finetune(args)
#coding:utf-8
import argparse
import os
import ast
import paddle.fluid as fluid
import paddlehub as hub
import numpy as np
# yapf: disable
parser = argparse.ArgumentParser(__doc__)
parser.add_argument("--use_gpu", type=ast.literal_eval, default=True, help="Whether use GPU for predict.")
parser.add_argument("--checkpoint_dir", type=str, default="paddlehub_finetune_ckpt", help="Path to save log data.")
parser.add_argument("--batch_size", type=int, default=16, help="Total examples' number in batch for training.")
parser.add_argument("--module", type=str, default="resnet50", help="Module used as a feature extractor.")
parser.add_argument("--dataset", type=str, default="flowers", help="Dataset to fine-tune.")
# yapf: enable.
module_map = {
"resnet50": "resnet_v2_50_imagenet",
"resnet101": "resnet_v2_101_imagenet",
"resnet152": "resnet_v2_152_imagenet",
"mobilenet": "mobilenet_v2_imagenet",
"nasnet": "nasnet_imagenet",
"pnasnet": "pnasnet_imagenet"
}
def predict(args):
# Load Paddlehub pretrained model
module = hub.Module(name=args.module)
input_dict, output_dict, program = module.context(trainable=True)
# Download dataset
if args.dataset.lower() == "flowers":
dataset = hub.dataset.Flowers()
elif args.dataset.lower() == "dogcat":
dataset = hub.dataset.DogCat()
elif args.dataset.lower() == "indoor67":
dataset = hub.dataset.Indoor67()
elif args.dataset.lower() == "food101":
dataset = hub.dataset.Food101()
elif args.dataset.lower() == "stanforddogs":
dataset = hub.dataset.StanfordDogs()
else:
raise ValueError("%s dataset is not defined" % args.dataset)
# Use ImageClassificationReader to read dataset
data_reader = hub.reader.ImageClassificationReader(
image_width=module.get_expected_image_width(),
image_height=module.get_expected_image_height(),
images_mean=module.get_pretrained_images_mean(),
images_std=module.get_pretrained_images_std(),
dataset=dataset)
feature_map = output_dict["feature_map"]
# Setup feed list for data feeder
feed_list = [input_dict["image"].name]
# Setup RunConfig for PaddleHub Fine-tune API
config = hub.RunConfig(
use_data_parallel=False,
use_cuda=args.use_gpu,
batch_size=args.batch_size,
checkpoint_dir=args.checkpoint_dir,
strategy=hub.finetune.strategy.DefaultFinetuneStrategy())
# Define a image classification task by PaddleHub Fine-tune API
task = hub.ImageClassifierTask(
data_reader=data_reader,
feed_list=feed_list,
feature=feature_map,
num_classes=dataset.num_labels,
config=config)
data = ["./test/test_img_daisy.jpg", "./test/test_img_roses.jpg"] with fluid.dygraph.guard():
print(task.predict(data=data, return_result=True)) transforms = Compose([Resize((224, 224)), Normalize()])
flowers = Flowers(transforms)
flowers_validate = Flowers(transforms, mode='val')
model = hub.Module(directory='mobilenet_v2_animals', class_dim=flowers.num_classes)
# model = hub.Module(name='mobilenet_v2_animals', class_dim=flowers.num_classes)
if __name__ == "__main__": optimizer = fluid.optimizer.AdamOptimizer(learning_rate=0.001, parameter_list=model.parameters())
args = parser.parse_args() trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_cls')
if not args.module in module_map:
hub.logger.error("module should in %s" % module_map.keys())
exit(1)
args.module = module_map[args.module]
predict(args)
export FLAGS_eager_delete_tensor_gb=0.0
export CUDA_VISIBLE_DEVICES=0
python -u img_classifier.py $@
export FLAGS_eager_delete_tensor_gb=0.0
export CUDA_VISIBLE_DEVICES=0
python -u predict.py $@
IMAGE_PATH
./resources/test/test_img_bird.jpg
input_data:
image:
type : IMAGE
key : IMAGE_PATH
config:
top_only : True
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.dygraph.parallel import ParallelEnv
from paddlehub.finetune.trainer import Trainer
from paddlehub.datasets.flowers import Flowers
from paddlehub.process.transforms import Compose, Resize, Normalize
from paddlehub.module.cv_module import ImageClassifierModule
if __name__ == '__main__':
with fluid.dygraph.guard(fluid.CUDAPlace(ParallelEnv().dev_id)):
transforms = Compose([Resize((224, 224)), Normalize()])
flowers = Flowers(transforms)
flowers_validate = Flowers(transforms, mode='val')
model = hub.Module(directory='mobilenet_v2_animals', class_dim=flowers.num_classes)
# model = hub.Module(name='mobilenet_v2_animals', class_dim=flowers.num_classes)
optimizer = fluid.optimizer.AdamOptimizer(learning_rate=0.001, parameter_list=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_ckpt_img_cls')
trainer.train(flowers, epochs=100, batch_size=32, eval_dataset=flowers_validate, save_interval=1)
# LAC 词法分析
本示例展示如何使用LAC Module进行预测。
LAC是中文词法分析模型,可以用于进行中文句子的分词/词性标注/命名实体识别等功能,关于模型的细节参见[模型介绍](https://www.paddlepaddle.org.cn/hubdetail?name=lac&en_category=LexicalAnalysis)
## 命令行方式预测
`cli_demo.sh`给出了使用命令行接口(Command Line Interface)调用Module预测的示例脚本,
通过以下命令试验下效果。
```shell
$ hub run lac --input_text "今天是个好日子"
$ hub run lac --input_file test.txt --user_dict user.dict
```
test.txt 存放待分词文本, 如:
```text
今天是个好日子
今天天气晴朗
```
user.dict为用户自定义词典,可以不指定,当指定自定义词典时,可以干预默认分词结果。
词典包含三列,第一列为单词,第二列为单词词性,第三列为单词词频,以水平制表符\t分隔。词频越高的单词,对分词结果影响越大,词典样例如下:
```text
天气预报 n 400000
经 v 1000
常 d 1000
```
**NOTE:**
* 该PaddleHub Module使用词典干预功能时,依赖于第三方库pyahocorasick,请自行安装;
* 请不要直接复制示例文本使用,复制后的格式可能存在问题;
## 通过Python API预测
`lac_demo.py`给出了使用python API调用PaddleHub LAC Module预测的示例代码,
通过以下命令试验下效果。
```shell
python lac_demo.py
```
#coding:utf-8
from __future__ import print_function
import json
import os
import six
import paddlehub as hub
if __name__ == "__main__":
# Load LAC Module
lac = hub.Module(name="lac")
test_text = ["今天是个好日子", "天气预报说今天要下雨", "下一班地铁马上就要到了"]
# Set input dict
inputs = {"text": test_text}
# execute predict and print the result
results = lac.lexical_analysis(data=inputs, use_gpu=True, batch_size=10)
for result in results:
if six.PY2:
print(
json.dumps(result['word'], encoding="utf8", ensure_ascii=False))
print(
json.dumps(result['tag'], encoding="utf8", ensure_ascii=False))
else:
print(result['word'])
print(result['tag'])
# 基于PaddleHub实现口罩佩戴检测应用
本文档基于飞桨本次开源的口罩佩戴识别模型, 提供了一个完整的支持视频流的Web Demo,以及高性能的Python和C++集成部署方案, 适用于不同场景下的软件集成。
## 目录
- [1. 搭建视频流场景的WebDemo](#1-搭建视频流场景webdemo)
- [2. 高性能Python部署方案](#2-高性能Python部署方案)
- [3. 高性能C++部署方案](#3-高性能c部署方案)
## 1. 搭建视频流场景WebDemo
![image](./images/web1.jpg)
### [>点击查看视频链接<](https://www.bilibili.com/video/av88962128)
### 背景
本项目可以部署在大型场馆出入口,学校,医院,交通通道出入口,人脸识别闸机,机器人上,支持的方案有:安卓方案(如RK3399的人脸识别机,机器人),Ubuntu 边缘计算,WindowsPC+摄像头。识别率80%~90%,在特定使用场景(如:人脸识别机场景)下可以达到99%。但是限于清晰度和遮挡关系,对应用场景有一些要求。
### 效果分析
可以看到识别率在80%~90%之间,稍小的人脸有误识别的情况,有些挡住嘴的场景也被误识别成了戴口罩;一个人戴着口罩但鼻子露出来,被识别成没有戴口罩,这是合理的,因为鼻子露出来属于佩戴不规范。这个模型应用在门口、狭长通道、人脸识别机所在位置都是可以的。
![image](./images/mask1.jpg)![image](./images/mask2.jpg)![image](./images/mask3.jpg)
### 1.1 部署环境
参考: https://www.paddlepaddle.org.cn/install/quick
#### 安装paddlehub
`pip install paddlehub`
### 1.2 开发识别服务
#### 加载预训练模型
```python
import paddlehub as hub
module = hub.Module(name="pyramidbox_lite_server_mask") #口罩检测模型
```
>以上语句paddlehub会自动下载口罩检测模型 "pyramidbox_lite_mobile_mask" 不需要提前下载模型
#### OpenCV打开摄像头或视频文件
下载测试视频
```
wget https://paddlehub.bj.bcebos.com/mask_detection/test_video.mp4
```
```python
import cv2
capture = cv2.VideoCapture(0) # 打开摄像头
# capture = cv2.VideoCapture('./test_video.mp4') # 打开视频文件
while(1):
ret, frame = capture.read() # frame即视频的一帧数据
if ret == False:
break
cv2.imshow('Mask Detection', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cv2.destroyAllWindows()
```
#### 口罩佩戴检测
```python
# frame为一帧数据
input_dict = {"data": [frame]}
results = module.face_detection(data=input_dict)
print(results)
```
输出结果:
```json
[
{
"data": {
"label": "MASK",
"left": 258.37087631225586,
"right": 374.7980499267578,
"top": 122.76758193969727,
"bottom": 254.20085906982422,
"confidence": 0.5630852
},
"id": 1
}
]
```
>"label":是否戴口罩,"confidence":置信度,其余字段为脸框的位置大小
#### 将结果显示到原视频帧中
```python
# results为口罩检测结果
for result in results:
# print(result)
label = result['data']['label']
confidence = result['data']['confidence']
top, right, bottom, left = int(result['data']['top']), int(result['data']['right']), int(result['data']['bottom']), int(result['data']['left'])
color = (0, 255, 0)
if label == 'NO MASK':
color = (0, 0, 255)
cv2.rectangle(frame, (left, top), (right, bottom), color, 3)
cv2.putText(frame, label + ":" + str(confidence), (left, top-10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)
```
![image](./images/maskdetection_1.jpg)
>原DEMO中是英文+置信度显示在框的上面,尝试改为中文,遇到字体问题,以下是解决办法
#### 图片写入中文
需要事先准备ttf/otf等格式的字体文件
```python
def paint_chinese_opencv(im,chinese,position,fontsize,color_bgr):
img_PIL = Image.fromarray(cv2.cvtColor(im,cv2.COLOR_BGR2RGB)) # 图像从OpenCV格式转换成PIL格式
font = ImageFont.truetype('思源黑体SC-Heavy.otf',fontsize,encoding="utf-8") # 加载字体文件
#color = (255,0,0) # 字体颜色
#position = (100,100) # 文字输出位置
color = color_bgr[::-1]
draw = ImageDraw.Draw(img_PIL)
# PIL图片上打印汉字 # 参数1:打印坐标,参数2:文本,参数3:字体颜色,参数4:字体
draw.text(position,chinese,font=font,fill=color)
img = cv2.cvtColor(np.asarray(img_PIL),cv2.COLOR_RGB2BGR)# PIL图片转cv2图片
return img
```
```python
for result in results:
label = result['data']['label']
confidence = result['data']['confidence']
top, right, bottom, left = int(result['data']['top']), int(result['data']['right']), int(result['data']['bottom']), int(result['data']['left'])
color = (0, 255, 0)
label_cn = "有口罩"
if label == 'NO MASK':
color = (0, 0, 255)
label_cn = "无口罩"
cv2.rectangle(frame, (left, top), (right, bottom), color, 3)
# cv2.putText(frame, label + ":" + str(confidence), (left, top-10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)
frame = paint_chinese_opencv(frame, label_cn + ":" + str(confidence), (left, top-36), 24, color)
```
![image](./images/maskdetection_2.jpg)
#### 提取头像文件
```python
img_name = "avatar_%d.png" % (maskIndex)
path = "./result/" + img_name
image = frame[top - 10: bottom + 10, left - 10: right + 10]
cv2.imwrite(path, image,[int(cv2.IMWRITE_PNG_COMPRESSION), 9])
```
#### 结果写入JSON
```python
with open("./result/2-mask_detection.json","w") as f:
json.dump(data, f)
```
>此处可以按照自己的应用需要改为输出到mysql,Redis,kafka ,MQ 供应用消化数据
完整代码可以参考`mask_detection.py`
### 1.3 制作网页呈现效果
此DEMO播放一个固定视频,并将分析导出的 json 渲染到网页里面,如需实时显示需要再次开发
#### python 导出的数据
使用上面的 python 文件完整执行后会有3种数据输出,放到`web/video/result`目录下
![image](./images/result.jpg)
#### json数据结构
![image](./images/json.jpg)
#### 使用数据渲染网页
- 网页中左侧 "视频播放视频区",播放同时实时回调当前播放的时间点
- 根据时间点换算为帧(1秒30帧),遍历 json 中的数据
- 把对应帧的数据输出到网页右侧 "信息区"(换算逻辑见下方示意代码)
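帧换算与数据查找的逻辑可参考如下示意代码(json字段名以实际导出数据为准,这里的`frame`字段仅为假设):
```python
import json

FPS = 30  # 视频按1秒30帧换算


def records_at(play_time, json_path="./result/2-mask_detection.json"):
    """根据当前播放时间(秒)取出对应帧的检测记录。"""
    frame_index = int(play_time * FPS)
    with open(json_path) as f:
        data = json.load(f)
    # 假设每条记录带有frame字段标记其所属帧
    return [item for item in data if item.get("frame") == frame_index]
```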
![image](./images/web2.jpg)
## 2. 高性能Python部署方案
更多信息可以参考[文档](./python/README.md)
## 3. 高性能C++部署方案
更多信息可以参考[文档](./cpp/README.md)
## 欢迎交流
**百度飞桨合作伙伴:**
![image](./images/logo.jpg)
北京奇想天外科技有限公司
cmake_minimum_required(VERSION 3.0)
project(PaddleMaskDetector CXX C)
option(WITH_MKL "Compile demo with MKL/OpenBlas support, default use MKL." ON)
option(WITH_GPU "Compile demo with GPU/CPU, default use CPU." ON)
option(WITH_STATIC_LIB "Compile demo with static/shared library, default use static." ON)
option(USE_TENSORRT "Compile demo with TensorRT." OFF)
SET(PADDLE_DIR "" CACHE PATH "Location of libraries")
SET(OPENCV_DIR "" CACHE PATH "Location of libraries")
SET(CUDA_LIB "" CACHE PATH "Location of libraries")
macro(safe_set_static_flag)
foreach(flag_var
CMAKE_CXX_FLAGS CMAKE_CXX_FLAGS_DEBUG CMAKE_CXX_FLAGS_RELEASE
CMAKE_CXX_FLAGS_MINSIZEREL CMAKE_CXX_FLAGS_RELWITHDEBINFO)
if(${flag_var} MATCHES "/MD")
string(REGEX REPLACE "/MD" "/MT" ${flag_var} "${${flag_var}}")
endif(${flag_var} MATCHES "/MD")
endforeach(flag_var)
endmacro()
if (WITH_MKL)
ADD_DEFINITIONS(-DUSE_MKL)
endif()
if (NOT DEFINED PADDLE_DIR OR ${PADDLE_DIR} STREQUAL "")
message(FATAL_ERROR "please set PADDLE_DIR with -DPADDLE_DIR=/path/paddle_influence_dir")
endif()
if (NOT DEFINED OPENCV_DIR OR ${OPENCV_DIR} STREQUAL "")
message(FATAL_ERROR "please set OPENCV_DIR with -DOPENCV_DIR=/path/opencv")
endif()
include_directories("${CMAKE_SOURCE_DIR}/")
include_directories("${PADDLE_DIR}/")
include_directories("${PADDLE_DIR}/third_party/install/protobuf/include")
include_directories("${PADDLE_DIR}/third_party/install/glog/include")
include_directories("${PADDLE_DIR}/third_party/install/gflags/include")
include_directories("${PADDLE_DIR}/third_party/install/xxhash/include")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/include")
include_directories("${PADDLE_DIR}/third_party/install/snappy/include")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/include")
include_directories("${PADDLE_DIR}/third_party/install/snappystream/include")
endif()
include_directories("${PADDLE_DIR}/third_party/install/zlib/include")
include_directories("${PADDLE_DIR}/third_party/boost")
include_directories("${PADDLE_DIR}/third_party/eigen3")
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappy/lib")
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
link_directories("${PADDLE_DIR}/third_party/install/snappystream/lib")
endif()
link_directories("${PADDLE_DIR}/third_party/install/zlib/lib")
link_directories("${PADDLE_DIR}/third_party/install/protobuf/lib")
link_directories("${PADDLE_DIR}/third_party/install/glog/lib")
link_directories("${PADDLE_DIR}/third_party/install/gflags/lib")
link_directories("${PADDLE_DIR}/third_party/install/xxhash/lib")
link_directories("${PADDLE_DIR}/paddle/lib/")
link_directories("${CMAKE_CURRENT_BINARY_DIR}")
if (WIN32)
include_directories("${PADDLE_DIR}/paddle/fluid/inference")
include_directories("${PADDLE_DIR}/paddle/include")
link_directories("${PADDLE_DIR}/paddle/fluid/inference")
include_directories("${OPENCV_DIR}/build/include")
include_directories("${OPENCV_DIR}/opencv/build/include")
link_directories("${OPENCV_DIR}/build/x64/vc14/lib")
else ()
include_directories("${PADDLE_DIR}/paddle/include")
link_directories("${PADDLE_DIR}/paddle/lib")
include_directories("${OPENCV_DIR}/include")
link_directories("${OPENCV_DIR}/lib64")
endif ()
if (WIN32)
add_definitions("/DGOOGLE_GLOG_DLL_DECL=")
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} /bigobj /MTd")
set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} /bigobj /MT")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /bigobj /MTd")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /bigobj /MT")
if (WITH_STATIC_LIB)
safe_set_static_flag()
add_definitions(-DSTATIC_LIB)
endif()
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -fopenmp -std=c++11")
set(CMAKE_STATIC_LIBRARY_PREFIX "")
endif()
# TODO let users define cuda lib path
if (WITH_GPU)
if (NOT DEFINED CUDA_LIB OR ${CUDA_LIB} STREQUAL "")
message(FATAL_ERROR "please set CUDA_LIB with -DCUDA_LIB=/path/cuda-8.0/lib64")
endif()
if (NOT WIN32)
if (NOT DEFINED CUDNN_LIB)
message(FATAL_ERROR "please set CUDNN_LIB with -DCUDNN_LIB=/path/cudnn_v7.4/cuda/lib64")
endif()
endif(NOT WIN32)
endif()
if (NOT WIN32)
if (USE_TENSORRT AND WITH_GPU)
include_directories("${PADDLE_DIR}/third_party/install/tensorrt/include")
link_directories("${PADDLE_DIR}/third_party/install/tensorrt/lib")
endif()
endif(NOT WIN32)
if (NOT WIN32)
set(NGRAPH_PATH "${PADDLE_DIR}/third_party/install/ngraph")
if(EXISTS ${NGRAPH_PATH})
include(GNUInstallDirs)
include_directories("${NGRAPH_PATH}/include")
link_directories("${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}")
set(NGRAPH_LIB ${NGRAPH_PATH}/${CMAKE_INSTALL_LIBDIR}/libngraph${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
endif()
if(WITH_MKL)
include_directories("${PADDLE_DIR}/third_party/install/mklml/include")
if (WIN32)
set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.lib
${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.lib)
else ()
set(MATH_LIB ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX}
${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5${CMAKE_SHARED_LIBRARY_SUFFIX})
execute_process(COMMAND cp -r ${PADDLE_DIR}/third_party/install/mklml/lib/libmklml_intel${CMAKE_SHARED_LIBRARY_SUFFIX} /usr/lib)
endif ()
set(MKLDNN_PATH "${PADDLE_DIR}/third_party/install/mkldnn")
if(EXISTS ${MKLDNN_PATH})
include_directories("${MKLDNN_PATH}/include")
if (WIN32)
set(MKLDNN_LIB ${MKLDNN_PATH}/lib/mkldnn.lib)
else ()
set(MKLDNN_LIB ${MKLDNN_PATH}/lib/libmkldnn.so.0)
endif ()
endif()
else()
set(MATH_LIB ${PADDLE_DIR}/third_party/install/openblas/lib/libopenblas${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
if (WIN32)
if(EXISTS "${PADDLE_DIR}/paddle/fluid/inference/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX}")
set(DEPS
${PADDLE_DIR}/paddle/fluid/inference/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
else()
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
endif()
if(WITH_STATIC_LIB)
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_STATIC_LIBRARY_SUFFIX})
else()
set(DEPS
${PADDLE_DIR}/paddle/lib/libpaddle_fluid${CMAKE_SHARED_LIBRARY_SUFFIX})
endif()
if (NOT WIN32)
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
glog gflags protobuf z xxhash
)
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
else()
set(DEPS ${DEPS}
${MATH_LIB} ${MKLDNN_LIB}
opencv_world346 glog gflags_static libprotobuf zlibstatic xxhash)
set(DEPS ${DEPS} libcmt shlwapi)
if (EXISTS "${PADDLE_DIR}/third_party/install/snappy/lib")
set(DEPS ${DEPS} snappy)
endif()
if(EXISTS "${PADDLE_DIR}/third_party/install/snappystream/lib")
set(DEPS ${DEPS} snappystream)
endif()
endif(NOT WIN32)
if(WITH_GPU)
if(NOT WIN32)
if (USE_TENSORRT)
set(DEPS ${DEPS} ${PADDLE_DIR}/third_party/install/tensorrt/lib/libnvinfer${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${PADDLE_DIR}/third_party/install/tensorrt/lib/libnvinfer_plugin${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
set(DEPS ${DEPS} ${CUDA_LIB}/libcudart${CMAKE_SHARED_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${CUDNN_LIB}/libcudnn${CMAKE_SHARED_LIBRARY_SUFFIX})
else()
set(DEPS ${DEPS} ${CUDA_LIB}/cudart${CMAKE_STATIC_LIBRARY_SUFFIX} )
set(DEPS ${DEPS} ${CUDA_LIB}/cublas${CMAKE_STATIC_LIBRARY_SUFFIX} )
set(DEPS ${DEPS} ${CUDA_LIB}/cudnn${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
endif()
if (NOT WIN32)
set(DEPS ${DEPS} ${OPENCV_DIR}/lib64/libopencv_imgcodecs${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/lib64/libopencv_imgproc${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/lib64/libopencv_core${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/lib64/libopencv_highgui${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/libIlmImf${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/liblibjasper${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/liblibpng${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/liblibtiff${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/libittnotify${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/liblibjpeg-turbo${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/liblibwebp${CMAKE_STATIC_LIBRARY_SUFFIX})
set(DEPS ${DEPS} ${OPENCV_DIR}/share/OpenCV/3rdparty/lib64/libzlib${CMAKE_STATIC_LIBRARY_SUFFIX})
endif()
if (NOT WIN32)
set(EXTERNAL_LIB "-ldl -lrt -lpthread")
set(DEPS ${DEPS} ${EXTERNAL_LIB})
endif()
add_executable(mask_detector main.cc mask_detector.cc)
target_link_libraries(mask_detector ${DEPS})
if (WIN32)
add_custom_command(TARGET mask_detector POST_BUILD
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./mklml.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./libiomp5md.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./mkldnn.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/mklml.dll ./release/mklml.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mklml/lib/libiomp5md.dll ./release/libiomp5md.dll
COMMAND ${CMAKE_COMMAND} -E copy_if_different ${PADDLE_DIR}/third_party/install/mkldnn/lib/mkldnn.dll ./release/mkldnn.dll
)
endif()
{
"configurations": [
{
"name": "x64-Release",
"generator": "Ninja",
"configurationType": "RelWithDebInfo",
"inheritEnvironments": [ "msvc_x64_x64" ],
"buildRoot": "${projectDir}\\out\\build\\${name}",
"installRoot": "${projectDir}\\out\\install\\${name}",
"cmakeCommandArgs": "",
"buildCommandArgs": "-v",
"ctestCommandArgs": "",
"variables": [
{
"name": "CUDA_LIB",
"value": "D:/projects/packages/cuda10_0/lib64",
"type": "PATH"
},
{
"name": "CUDNN_LIB",
"value": "D:/projects/packages/cuda10_0/lib64",
"type": "PATH"
},
{
"name": "OPENCV_DIR",
"value": "D:/projects/packages/opencv3_4_6",
"type": "PATH"
},
{
"name": "PADDLE_DIR",
"value": "D:/projects/packages/fluid_inference1_6_1",
"type": "PATH"
},
{
"name": "CMAKE_BUILD_TYPE",
"value": "Release",
"type": "STRING"
}
]
}
]
}
# PaddleHub口罩人脸识别及分类模型C++预测部署
百度通过 `PaddleHub` 开源了业界首个口罩人脸检测及分类模型,该模型可以有效检测密集人流区域中佩戴和未佩戴口罩的所有人脸,并判断其是否佩戴口罩。开发者可以通过 `PaddleHub` 快速体验模型效果、搭建在线服务,还可以导出模型集成到`Windows`和`Linux`等不同平台的`C++`开发项目中。
本文档主要介绍如何把模型在`Windows`和`Linux`上完成基于`C++`的预测部署。
主要包含两个步骤:
- [1. PaddleHub导出预测模型](#1-paddlehub导出预测模型)
- [2. C++预测部署编译](#2-c预测部署编译)
## 1. PaddleHub导出预测模型
#### 1.1 安装 `PaddlePaddle` 和 `PaddleHub`
- `PaddlePaddle`的安装:
请点击[官方安装文档](https://paddlepaddle.org.cn/install/quick) 选择适合的方式
- `PaddleHub`的安装: `pip install paddlehub`
#### 1.2 从`PaddleHub`导出预测模型
在有网络访问条件下,执行`python export_model.py`导出两个可用于推理部署的口罩模型
其中`pyramidbox_lite_mobile_mask`为移动版模型, 模型更小,计算量低;
`pyramidbox_lite_server_mask`为服务器版模型,在此推荐该版本模型,精度相对移动版本更高。
成功执行代码后导出的模型路径结构:
```
pyramidbox_lite_server_mask
|
├── mask_detector # 口罩人脸分类模型
| ├── __model__ # 模型文件
│ └── __params__ # 参数文件
|
└── pyramidbox_lite # 口罩人脸检测模型
├── __model__ # 模型文件
└── __params__ # 参数文件
```
## 2. C++预测部署编译
本项目支持在Windows和Linux上编译并部署C++项目,不同平台的编译请参考:
- [Linux 编译](./docs/linux_build.md)
- [Windows 使用 Visual Studio 2019编译](./docs/windows_build.md)
# Linux平台口罩人脸检测及分类模型C++预测部署
## 1. 系统和软件依赖
### 1.1 操作系统及硬件要求
- Ubuntu 14.04 或者 16.04 (其它平台未测试)
- GCC版本4.8.5 ~ 4.9.2
- 支持Intel MKL-DNN的CPU
- NOTE: 如需在Nvidia GPU运行,请自行安装CUDA 9.0 / 10.0 + CUDNN 7.3+ (不支持9.1/10.1版本的CUDA)
### 1.2 下载PaddlePaddle C++预测库
PaddlePaddle C++ 预测库主要分为CPU版本和GPU版本。
其中,GPU 版本支持`CUDA 10.0`和`CUDA 9.0`:
以下为各版本C++预测库的下载链接:
| 版本 | 链接 |
| ---- | ---- |
| CPU+MKL版 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.6.3-cpu-avx-mkl/fluid_inference.tgz) |
| CUDA9.0+MKL 版 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.6.3-gpu-cuda9-cudnn7-avx-mkl/fluid_inference.tgz) |
| CUDA10.0+MKL 版 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.6.3-gpu-cuda10-cudnn7-avx-mkl/fluid_inference.tgz) |
更多可用预测库版本,请点击以下链接下载:[C++预测库下载列表](https://paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/build_and_install_lib_cn.html)
下载并解压, 解压后的 `fluid_inference`目录包含的内容:
```
fluid_inference
├── paddle # paddle核心库和头文件
|
├── third_party # 第三方依赖库和头文件
|
└── version.txt # 版本和编译信息
```
**注意:** 请把解压后的目录放到合适的路径,**该目录路径后续会作为编译依赖**使用。
### 1.3 编译安装 OpenCV
```shell
# 1. 下载OpenCV3.4.6版本源代码
wget -c https://paddleseg.bj.bcebos.com/inference/opencv-3.4.6.zip
# 2. 解压
unzip opencv-3.4.6.zip && cd opencv-3.4.6
# 3. 创建build目录并编译, 这里安装到/root/projects/opencv3目录
mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/opencv3 -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DWITH_IPP=OFF -DBUILD_IPP_IW=OFF -DWITH_LAPACK=OFF -DWITH_EIGEN=OFF -DCMAKE_INSTALL_LIBDIR=lib64 -DWITH_ZLIB=ON -DBUILD_ZLIB=ON -DWITH_JPEG=ON -DBUILD_JPEG=ON -DWITH_PNG=ON -DBUILD_PNG=ON -DWITH_TIFF=ON -DBUILD_TIFF=ON
make -j4
make install
```
其中 `CMAKE_INSTALL_PREFIX` 参数指定了安装路径, 上述操作完成后,`opencv` 被安装在 `$HOME/opencv3` 目录(用户也可选择其他路径),**该目录后续作为编译依赖**。
## 2. 编译与运行
### 2.1 配置编译脚本
进入 `PaddleHub/deploy/demo/mask_detector/` 目录
打开文件`linux_build.sh`, 看到以下内容:
```shell
# Paddle 预测库路径
PADDLE_DIR=/PATH/TO/fluid_inference/
# OpenCV 库路径
OPENCV_DIR=/PATH/TO/opencv3gcc4.8/
# 是否使用GPU
WITH_GPU=ON
# CUDA库路径, 仅 WITH_GPU=ON 时设置
CUDA_LIB=/PATH/TO/CUDA_LIB64/
# CUDNN库路径,仅 WITH_GPU=ON 且 CUDA_LIB有效时设置
CUDNN_LIB=/PATH/TO/CUDNN_LIB64/
cd build
cmake .. \
-DWITH_GPU=${WITH_GPU} \
-DPADDLE_DIR=${PADDLE_DIR} \
-DCUDA_LIB=${CUDA_LIB} \
-DCUDNN_LIB=${CUDNN_LIB} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DWITH_STATIC_LIB=OFF
make -j4
```
把上述参数根据实际情况做修改后,运行脚本编译程序:
```shell
sh linux_build.sh
```
### 2.2. 运行和可视化
可执行文件有 **2** 个参数,第一个是前面导出的`inference_model`路径,第二个是需要预测的图片路径。
示例:
```shell
./build/main /PATH/TO/pyramidbox_lite_server_mask/ /PATH/TO/TEST_IMAGE
```
执行程序时会打印检测框的位置与口罩是否佩戴的结果,另外result.jpg文件为检测的可视化结果。
**预测结果示例:**
![output_image](https://paddlehub.bj.bcebos.com/deploy/result.jpg)
# Windows平台口罩人脸检测及分类模型C++预测部署
## 1. 系统和软件依赖
### 1.1 基础依赖
- Windows 10 / Windows Server 2016+ (其它平台未测试)
- Visual Studio 2019 (社区版或专业版均可)
- CUDA 9.0 / 10.0 + CUDNN 7.3+ (不支持9.1/10.1版本的CUDA)
### 1.2 下载OpenCV并设置环境变量
- 在OpenCV官网下载适用于Windows平台的3.4.6版本: [点击下载](https://sourceforge.net/projects/opencvlibrary/files/3.4.6/opencv-3.4.6-vc14_vc15.exe/download)
- 运行下载的可执行文件,将OpenCV解压至合适目录,这里以解压到`D:\projects\opencv`为例
- 把OpenCV动态库加入到系统环境变量
- 此电脑(我的电脑)->属性->高级系统设置->环境变量
- 在系统变量中找到Path(如没有,自行创建),并双击编辑
- 新建,将opencv路径填入并保存,如D:\projects\opencv\build\x64\vc14\bin
**注意:** `OpenCV`的解压目录后续将作为编译配置项使用,所以请放置在合适的目录中。
### 1.3 下载PaddlePaddle C++ 预测库
`PaddlePaddle` **C++ 预测库** 主要分为`CPU`和`GPU`版本, 其中`GPU`版本提供`CUDA 9.0`和`CUDA 10.0`支持。
常用的版本如下:
| 版本 | 链接 |
| ---- | ---- |
| CPU+MKL版 | [fluid_inference_install_dir.zip](https://paddle-wheel.bj.bcebos.com/1.6.3/win-infer/mkl/cpu/fluid_inference_install_dir.zip) |
| CUDA9.0+MKL 版 | [fluid_inference_install_dir.zip](https://paddle-wheel.bj.bcebos.com/1.6.3/win-infer/mkl/post97/fluid_inference_install_dir.zip) |
| CUDA10.0+MKL 版 | [fluid_inference_install_dir.zip](https://paddle-wheel.bj.bcebos.com/1.6.3/win-infer/mkl/post107/fluid_inference_install_dir.zip) |
更多不同平台的可用预测库版本,请[点击查看](https://paddlepaddle.org.cn/documentation/docs/zh/advanced_usage/deploy/inference/windows_cpp_inference.html) 选择适合你的版本。
下载并解压, 解压后的 `fluid_inference_install_dir`目录包含的内容:
```
fluid_inference_install_dir
├── paddle # paddle核心库和头文件
|
├── third_party # 第三方依赖库和头文件
|
└── version.txt # 版本和编译信息
```
**注意:** 这里的`fluid_inference_install_dir` 目录所在路径,将用于后面的编译参数设置,请放置在合适的位置。
## 2. Visual Studio 2019 编译
- 2.1 打开Visual Studio 2019 Community,点击`继续但无需代码`, 如下图:
![step2.1](https://paddleseg.bj.bcebos.com/inference/vs2019_step1.png)
- 2.2 点击 `文件`->`打开`->`CMake`, 如下图:
![step2.2](https://paddleseg.bj.bcebos.com/inference/vs2019_step2.png)
- 2.3 选择本项目根目录`CMakeList.txt`文件打开, 如下图:
![step2.3](https://paddleseg.bj.bcebos.com/deploy/docs/vs2019_step2.3.png)
- 2.4 点击:`项目`->`PaddleMaskDetector的CMake设置`
![step2.4](https://paddleseg.bj.bcebos.com/deploy/docs/vs2019_step2.4.png)
- 2.5 点击浏览设置`OPENCV_DIR`, `CUDA_LIB``PADDLE_DIR` 3个编译依赖库的位置, 设置完成后点击`保存并生成CMake缓存并加载变量`
![step2.5](https://paddleseg.bj.bcebos.com/inference/vs2019_step5.png)
- 2.6 点击`生成`->`全部生成` 编译项目
![step2.6](https://paddleseg.bj.bcebos.com/inference/vs2019_step6.png)
## 3. 运行程序
成功编译后, 产出的可执行文件在项目子目录`out\build\x64-Release`目录, 按以下步骤运行代码:
- 打开`cmd`切换至该目录
- 运行以下命令传入口罩识别模型路径与测试图片
```shell
main.exe ./pyramidbox_lite_server_mask/ ./images/mask_input.png
```
第一个参数即`PaddleHub`导出的预测模型,第二个参数即要预测的图片。
运行后,预测结果保存在文件`result.jpg`中。
**预测结果示例:**
![output_image](https://paddlehub.bj.bcebos.com/deploy/result.jpg)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddlehub as hub
# Load mask detector module from PaddleHub
module = hub.Module(name="pyramidbox_lite_server_mask", version='1.1.0')
# Export inference model for deployment
module.processor.save_inference_model("./pyramidbox_lite_server_mask")
print("pyramidbox_lite_server_mask module export done!")
# Load mask detector (mobile version) module from PaddleHub
module = hub.Module(name="pyramidbox_lite_mobile_mask", version="1.1.0")
# Export inference model for deployment
module.processor.save_inference_model("./pyramidbox_lite_mobile_mask")
print("pyramidbox_lite_mobile_mask module export done!")
WITH_GPU=ON
PADDLE_DIR=/ssd3/chenzeyu01/PaddleMaskDetector/fluid_inference
CUDA_LIB=/home/work/cuda-10.1/lib64/
CUDNN_LIB=/home/work/cudnn/cudnn_v7.4/cuda/lib64/
OPENCV_DIR=/ssd3/chenzeyu01/PaddleMaskDetector/opencv3gcc4.8/
rm -rf build
mkdir -p build
cd build
cmake .. \
-DWITH_GPU=${WITH_GPU} \
-DPADDLE_DIR=${PADDLE_DIR} \
-DCUDA_LIB=${CUDA_LIB} \
-DCUDNN_LIB=${CUDNN_LIB} \
-DOPENCV_DIR=${OPENCV_DIR} \
-DWITH_STATIC_LIB=OFF
make clean
make -j12
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include <iostream>
#include <string>
#include "mask_detector.h" // NOLINT
int main(int argc, char* argv[]) {
if (argc < 3 || argc > 4) {
std::cout << "Usage:"
<< "./mask_detector ./models/ ./images/test.png"
<< std::endl;
return -1;
}
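  // 可选的第三个命令行参数用于控制是否使用 GPU 预测(非 0 表示开启),缺省使用 CPU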
bool use_gpu = (argc == 4 ? std::stoi(argv[3]) : false);
auto det_model_dir = std::string(argv[1]) + "/pyramidbox_lite";
auto cls_model_dir = std::string(argv[1]) + "/mask_detector";
auto image_path = argv[2];
// Init Detection Model
float det_shrink = 0.6;
float det_threshold = 0.7;
std::vector<float> det_means = {104, 177, 123};
std::vector<float> det_scale = {0.007843, 0.007843, 0.007843};
FaceDetector detector(
det_model_dir,
det_means,
det_scale,
use_gpu,
det_threshold);
// Init Classification Model
std::vector<float> cls_means = {0.5, 0.5, 0.5};
std::vector<float> cls_scale = {1.0, 1.0, 1.0};
MaskClassifier classifier(
cls_model_dir,
cls_means,
cls_scale,
use_gpu);
// Load image
cv::Mat img = imread(image_path, cv::IMREAD_COLOR);
// Prediction result
std::vector<FaceResult> results;
// Stage1: Face detection
detector.Predict(img, &results, det_shrink);
// Stage2: Mask wearing classification
classifier.Predict(&results);
for (const FaceResult& item : results) {
printf("{left=%d, right=%d, top=%d, bottom=%d},"
" class_id=%d, confidence=%.5f\n",
item.rect[0],
item.rect[1],
item.rect[2],
item.rect[3],
item.class_id,
item.confidence);
}
// Visualization result
cv::Mat vis_img;
VisualizeResult(img, results, &vis_img);
cv::imwrite("result.jpg", vis_img);
return 0;
}
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#include "mask_detector.h"
// Normalize the image by (pix - mean) * scale
void NormalizeImage(
const std::vector<float> &mean,
const std::vector<float> &scale,
cv::Mat& im, // NOLINT
float* input_buffer) {
int height = im.rows;
int width = im.cols;
int stride = width * height;
for (int h = 0; h < height; h++) {
for (int w = 0; w < width; w++) {
int base = h * width + w;
input_buffer[base + 0 * stride] =
(im.at<cv::Vec3f>(h, w)[0] - mean[0]) * scale[0];
input_buffer[base + 1 * stride] =
(im.at<cv::Vec3f>(h, w)[1] - mean[1]) * scale[1];
input_buffer[base + 2 * stride] =
(im.at<cv::Vec3f>(h, w)[2] - mean[2]) * scale[2];
}
}
}
// Load Model and return model predictor
void LoadModel(
const std::string& model_dir,
bool use_gpu,
std::unique_ptr<paddle::PaddlePredictor>* predictor) {
// Config the model info
paddle::AnalysisConfig config;
config.SetModel(model_dir + "/__model__",
model_dir + "/__params__");
if (use_gpu) {
config.EnableUseGpu(100, 0);
} else {
config.DisableGpu();
}
config.SwitchUseFeedFetchOps(false);
config.SwitchSpecifyInputNames(true);
// Memory optimization
config.EnableMemoryOptim();
*predictor = std::move(CreatePaddlePredictor(config));
}
// Visualiztion MaskDetector results
void VisualizeResult(const cv::Mat& img,
const std::vector<FaceResult>& results,
cv::Mat* vis_img) {
for (int i = 0; i < results.size(); ++i) {
int w = results[i].rect[1] - results[i].rect[0];
int h = results[i].rect[3] - results[i].rect[2];
cv::Rect roi = cv::Rect(results[i].rect[0], results[i].rect[2], w, h);
// Configure color and text size
cv::Scalar roi_color;
std::string text;
if (results[i].class_id == 1) {
text = "MASK: ";
roi_color = cv::Scalar(0, 255, 0);
} else {
text = "NO MASK: ";
roi_color = cv::Scalar(0, 0, 255);
}
text += std::to_string(static_cast<int>(results[i].confidence * 100)) + "%";
int font_face = cv::FONT_HERSHEY_TRIPLEX;
double font_scale = 1.f;
float thickness = 1;
cv::Size text_size = cv::getTextSize(text,
font_face,
font_scale,
thickness,
nullptr);
float new_font_scale = roi.width * font_scale / text_size.width;
text_size = cv::getTextSize(text,
font_face,
new_font_scale,
thickness,
nullptr);
cv::Point origin;
origin.x = roi.x;
origin.y = roi.y;
// Configure text background
cv::Rect text_back = cv::Rect(results[i].rect[0],
results[i].rect[2] - text_size.height,
text_size.width,
text_size.height);
// Draw roi object, text, and background
*vis_img = img;
cv::rectangle(*vis_img, roi, roi_color, 2);
cv::rectangle(*vis_img, text_back, cv::Scalar(225, 225, 225), -1);
cv::putText(*vis_img,
text,
origin,
font_face,
new_font_scale,
cv::Scalar(0, 0, 0),
thickness);
}
}
void FaceDetector::Preprocess(const cv::Mat& image_mat, float shrink) {
// Clone the image : keep the original mat for postprocess
cv::Mat im = image_mat.clone();
cv::resize(im, im, cv::Size(), shrink, shrink, cv::INTER_CUBIC);
im.convertTo(im, CV_32FC3, 1.0);
int rc = im.channels();
int rh = im.rows;
int rw = im.cols;
input_shape_ = {1, rc, rh, rw};
input_data_.resize(1 * rc * rh * rw);
  NormalizeImage(mean_, scale_, im, input_data_.data());
}
void FaceDetector::Postprocess(
const cv::Mat& raw_mat,
float shrink,
std::vector<FaceResult>* result) {
result->clear();
int rect_num = 0;
int rh = input_shape_[2];
int rw = input_shape_[3];
int total_size = output_data_.size() / 6;
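  // 检测输出中每 6 个 float 为一条结果:[class_id, score, xmin, ymin, xmax, ymax],坐标为归一化值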
for (int j = 0; j < total_size; ++j) {
// Class id
int class_id = static_cast<int>(round(output_data_[0 + j * 6]));
// Confidence score
float score = output_data_[1 + j * 6];
int xmin = (output_data_[2 + j * 6] * rw) / shrink;
int ymin = (output_data_[3 + j * 6] * rh) / shrink;
int xmax = (output_data_[4 + j * 6] * rw) / shrink;
int ymax = (output_data_[5 + j * 6] * rh) / shrink;
int wd = xmax - xmin;
int hd = ymax - ymin;
if (score > threshold_) {
auto roi = cv::Rect(xmin, ymin, wd, hd) &
cv::Rect(0, 0, rw / shrink, rh / shrink);
// A view ref to original mat
cv::Mat roi_ref(raw_mat, roi);
FaceResult result_item;
result_item.rect = {xmin, xmax, ymin, ymax};
result_item.roi_rect = roi_ref;
result->push_back(result_item);
}
}
}
void FaceDetector::Predict(const cv::Mat& im,
std::vector<FaceResult>* result,
float shrink) {
// Preprocess image
Preprocess(im, shrink);
// Prepare input tensor
auto input_names = predictor_->GetInputNames();
auto in_tensor = predictor_->GetInputTensor(input_names[0]);
in_tensor->Reshape(input_shape_);
in_tensor->copy_from_cpu(input_data_.data());
// Run predictor
predictor_->ZeroCopyRun();
// Get output tensor
auto output_names = predictor_->GetOutputNames();
auto out_tensor = predictor_->GetOutputTensor(output_names[0]);
std::vector<int> output_shape = out_tensor->shape();
// Calculate output length
int output_size = 1;
for (int j = 0; j < output_shape.size(); ++j) {
output_size *= output_shape[j];
}
output_data_.resize(output_size);
out_tensor->copy_to_cpu(output_data_.data());
// Postprocessing result
Postprocess(im, shrink, result);
}
inline void MaskClassifier::Preprocess(std::vector<FaceResult>* faces) {
int batch_size = faces->size();
input_shape_ = {
batch_size,
EVAL_CROP_SIZE_[0],
EVAL_CROP_SIZE_[1],
EVAL_CROP_SIZE_[2]
};
// Reallocate input buffer
int input_size = 1;
for (int x : input_shape_) {
input_size *= x;
}
input_data_.resize(input_size);
auto buffer_base = input_data_.data();
for (int i = 0; i < batch_size; ++i) {
cv::Mat im = (*faces)[i].roi_rect;
// Resize
int rc = im.channels();
int rw = im.cols;
int rh = im.rows;
cv::Size resize_size(input_shape_[3], input_shape_[2]);
if (rw != input_shape_[3] || rh != input_shape_[2]) {
cv::resize(im, im, resize_size, 0.f, 0.f, cv::INTER_CUBIC);
}
im.convertTo(im, CV_32FC3, 1.0 / 256.0);
rc = im.channels();
rw = im.cols;
rh = im.rows;
float* buffer_i = buffer_base + i * rc * rw * rh;
NormalizeImage(mean_, scale_, im, buffer_i);
}
}
void MaskClassifier::Postprocess(std::vector<FaceResult>* faces) {
float* data = output_data_.data();
int batch_size = faces->size();
int out_num = output_data_.size();
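  // 对每张人脸,在其 (out_num / batch_size) 个类别分数上取 argmax,得到 class_id 与置信度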
for (int i = 0; i < batch_size; ++i) {
auto out_addr = data + (out_num / batch_size) * i;
int best_class_id = 0;
float best_class_score = *(best_class_id + out_addr);
for (int j = 0; j < (out_num / batch_size); ++j) {
auto infer_class = j;
auto score = *(j + out_addr);
if (score > best_class_score) {
best_class_id = infer_class;
best_class_score = score;
}
}
(*faces)[i].class_id = best_class_id;
(*faces)[i].confidence = best_class_score;
}
}
void MaskClassifier::Predict(std::vector<FaceResult>* faces) {
Preprocess(faces);
// Prepare input tensor
auto input_names = predictor_->GetInputNames();
auto in_tensor = predictor_->GetInputTensor(input_names[0]);
in_tensor->Reshape(input_shape_);
in_tensor->copy_from_cpu(input_data_.data());
// Run predictor
predictor_->ZeroCopyRun();
// Get output tensor
auto output_names = predictor_->GetOutputNames();
auto out_tensor = predictor_->GetOutputTensor(output_names[0]);
std::vector<int> output_shape = out_tensor->shape();
// Calculate output length
int output_size = 1;
for (int j = 0; j < output_shape.size(); ++j) {
output_size *= output_shape[j];
}
output_data_.resize(output_size);
out_tensor->copy_to_cpu(output_data_.data());
Postprocess(faces);
}
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <string>
#include <vector>
#include <memory>
#include <utility>
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include "paddle_inference_api.h" // NOLINT
// MaskDetector Result
struct FaceResult {
// Detection result: face rectangle
std::vector<int> rect;
// Detection result: cv::Mat of face rectange
cv::Mat roi_rect;
// Classification result: confidence
float confidence;
// Classification result : class id
int class_id;
};
// Load Paddle Inference Model
void LoadModel(
const std::string& model_dir,
bool use_gpu,
std::unique_ptr<paddle::PaddlePredictor>* predictor);
// Visualiztion MaskDetector results
void VisualizeResult(const cv::Mat& img,
const std::vector<FaceResult>& results,
cv::Mat* vis_img);
class FaceDetector {
public:
explicit FaceDetector(const std::string& model_dir,
const std::vector<float>& mean,
const std::vector<float>& scale,
bool use_gpu = false,
float threshold = 0.7) :
mean_(mean),
scale_(scale),
threshold_(threshold) {
LoadModel(model_dir, use_gpu, &predictor_);
}
// Run predictor
void Predict(
const cv::Mat& img,
std::vector<FaceResult>* result,
float shrink);
private:
// Preprocess image and copy data to input buffer
void Preprocess(const cv::Mat& image_mat, float shrink);
// Postprocess result
void Postprocess(
const cv::Mat& raw_mat,
float shrink,
std::vector<FaceResult>* result);
std::unique_ptr<paddle::PaddlePredictor> predictor_;
std::vector<float> input_data_;
std::vector<float> output_data_;
std::vector<int> input_shape_;
std::vector<float> mean_;
std::vector<float> scale_;
float threshold_;
};
class MaskClassifier {
public:
explicit MaskClassifier(const std::string& model_dir,
const std::vector<float>& mean,
const std::vector<float>& scale,
bool use_gpu = false) :
mean_(mean),
scale_(scale) {
LoadModel(model_dir, use_gpu, &predictor_);
}
void Predict(std::vector<FaceResult>* faces);
private:
void Preprocess(std::vector<FaceResult>* faces);
void Postprocess(std::vector<FaceResult>* faces);
std::unique_ptr<paddle::PaddlePredictor> predictor_;
std::vector<float> input_data_;
std::vector<int> input_shape_;
std::vector<float> output_data_;
const std::vector<int> EVAL_CROP_SIZE_ = {3, 128, 128};
std::vector<float> mean_;
std::vector<float> scale_;
};
# -*- coding:utf-8 -*-
import paddlehub as hub
import cv2
from PIL import Image, ImageDraw, ImageFont
import numpy as np
import json
import os
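# 以下脚本逐帧读取摄像头(或视频文件)画面,调用口罩检测模型进行预测,
# 将带标注的画面写入 ./result/1-mask_detection.mp4,
# 并把每帧的检测结果(人脸框坐标、置信度、标签、人脸截图文件名)记录到 ./result/2-mask_detection.json,
# 供后续的网页可视化页面读取展示。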
module = hub.Module(name="pyramidbox_lite_server_mask")
# opencv输出中文
def paint_chinese(im, chinese, position, fontsize, color_bgr):
# 图像从OpenCV格式转换成PIL格式
img_PIL = Image.fromarray(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))
font = ImageFont.truetype(
'SourceHanSansSC-Medium.otf', fontsize, encoding="utf-8")
#color = (255,0,0) # 字体颜色
#position = (100,100)# 文字输出位置
color = color_bgr[::-1]
draw = ImageDraw.Draw(img_PIL)
# PIL图片上打印汉字 # 参数1:打印坐标,参数2:文本,参数3:字体颜色,参数4:字体
draw.text(position, chinese, font=font, fill=color)
img = cv2.cvtColor(np.asarray(img_PIL), cv2.COLOR_RGB2BGR) # PIL图片转cv2 图片
return img
result_path = './result'
if not os.path.exists(result_path):
os.mkdir(result_path)
name = "./result/1-mask_detection.mp4"
width = 1280
height = 720
fps = 30
fourcc = cv2.VideoWriter_fourcc(*'vp90')
writer = cv2.VideoWriter(name, fourcc, fps, (width, height))
maskIndex = 0
index = 0
data = []
capture = cv2.VideoCapture(0) # 打开摄像头
#capture = cv2.VideoCapture('./test_video.mp4') # 打开视频文件
while True:
frameData = {}
ret, frame = capture.read() # frame即视频的一帧数据
if ret == False:
break
frame_copy = frame.copy()
input_dict = {"data": [frame]}
results = module.face_detection(data=input_dict)
maskFrameDatas = []
for result in results:
label = result['data']['label']
confidence_origin = result['data']['confidence']
confidence = round(confidence_origin, 2)
confidence_desc = str(confidence)
top, right, bottom, left = int(result['data']['top']), int(
result['data']['right']), int(result['data']['bottom']), int(
result['data']['left'])
#将当前帧保存为图片
img_name = "avatar_%d.png" % (maskIndex)
path = "./result/" + img_name
image = frame[top - 10:bottom + 10, left - 10:right + 10]
cv2.imwrite(path, image, [int(cv2.IMWRITE_PNG_COMPRESSION), 9])
maskFrameData = {}
maskFrameData['top'] = top
maskFrameData['right'] = right
maskFrameData['bottom'] = bottom
maskFrameData['left'] = left
maskFrameData['confidence'] = float(confidence_origin)
maskFrameData['label'] = label
maskFrameData['img'] = img_name
maskFrameDatas.append(maskFrameData)
maskIndex += 1
color = (0, 255, 0)
label_cn = "有口罩"
if label == 'NO MASK':
color = (0, 0, 255)
label_cn = "无口罩"
cv2.rectangle(frame_copy, (left, top), (right, bottom), color, 3)
cv2.putText(frame_copy, label, (left, top - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.8, color, 2)
#origin_point = (left, top - 36)
#frame_copy = paint_chinese(frame_copy, label_cn, origin_point, 24,
# color)
writer.write(frame_copy)
cv2.imshow('Mask Detection', frame_copy)
frameData['frame'] = index
# frameData['seconds'] = int(index/fps)
frameData['data'] = maskFrameDatas
data.append(frameData)
print(json.dumps(frameData))
index += 1
if cv2.waitKey(1) & 0xFF == ord('q'):
break
with open("./result/2-mask_detection.json", "w") as f:
json.dump(data, f)
writer.release()
cv2.destroyAllWindows()
# 口罩佩戴检测模型Python高性能部署方案
百度通过 PaddleHub 开源了业界首个口罩人脸检测及分类模型,该模型可以有效检测密集人流区域中的所有人脸,并判断其是否佩戴口罩。开发者可以通过 PaddleHub 快速体验模型效果、搭建在线服务。
本文档主要介绍如何完成基于`python`的口罩佩戴检测预测。
主要包含两个步骤:
- [1. PaddleHub导出预测模型](#1-paddlehub导出预测模型)
- [2. 基于python的预测](#2-基于python的预测)
## 1. PaddleHub导出预测模型
#### 1.1 安装 `PaddlePaddle` 和 `PaddleHub`
- `PaddlePaddle`的安装:
请点击[官方安装文档](https://paddlepaddle.org.cn/install/quick) 选择适合的方式
- `PaddleHub`的安装: `pip install paddlehub`
- `opencv`的安装: `pip install opencv-python`
#### 1.2 从`PaddleHub`导出预测模型
```
git clone https://github.com/PaddlePaddle/PaddleHub.git
cd PaddleHub/demo/mask_detection/python/
python export_model.py
```
在有网络访问的条件下,执行`python export_model.py`即可导出两个可用于推理部署的口罩模型:
其中`pyramidbox_lite_mobile_mask`为移动版模型,模型更小、计算量更低;
`pyramidbox_lite_server_mask`为服务器版模型,精度相对移动版更高,推荐使用该版本。
成功执行代码后导出的模型路径结构:
```
pyramidbox_lite_server_mask
|
├── mask_detector # 口罩人脸分类模型
| ├── __model__ # 模型文件
│ └── __params__ # 参数文件
|
└── pyramidbox_lite # 口罩人脸检测模型
├── __model__ # 模型文件
└── __params__ # 参数文件
```
## 2. 基于python的预测
### 2.1 执行预测程序
在终端输入以下命令进行预测:
```bash
python infer.py --models_dir=/path/to/models \
                --img_paths=/path/to/images \
                --video_path=/path/to/video \
                --use_camera=(False/True) \
                --open_imshow=(False/True) \
                --use_gpu=(False/True)
```
参数说明如下:
| 参数 | 是否必须|含义 |
|-------|-------|----------|
| models_dir | Yes|上述导出的模型路径 |
| img_paths |img_paths/video_path 二选一|需要预测的图片目录 |
| video_path |img_paths/video_path 二选一|需要预测的视频文件路径|
| use_camera |No|是否打开摄像头进行预测,默认为False |
| open_imshow |No|是否对视频的检测结果实时绘图,默认为False |
| use_gpu |No|是否使用GPU进行预测,默认为False|
说明:
如果use_gpu=True,请先在命令行指定GPU,如:
```
export CUDA_VISIBLE_DEVICES=0
```
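下面给出两条完整的示例命令(其中模型与数据路径均为假设值,请替换为实际路径):
```bash
# 对图片目录进行预测(使用 GPU)
export CUDA_VISIBLE_DEVICES=0
python infer.py --models_dir=./pyramidbox_lite_server_mask --img_paths=./images --use_gpu=True

# 对本地视频文件进行预测,并实时显示检测结果(使用 CPU)
python infer.py --models_dir=./pyramidbox_lite_server_mask --video_path=./test_video.mp4 --open_imshow=True
```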
## 3. 可视化结果
执行完预测后,可视化结果会保存在当前路径下的`./result/`目录中。
输入样例:
![avatar](./images/mask.jpg)
输出结果:
![avatar](./images/mask.jpg.result.jpg)
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import paddlehub as hub
# Load mask detector module from PaddleHub
module = hub.Module(name="pyramidbox_lite_server_mask", version='1.1.0')
# Export inference model for deployment
module.processor.save_inference_model("./pyramidbox_lite_server_mask")
print("pyramidbox_lite_server_mask module export done!")
# Load mask detector (mobile version) module from PaddleHub
module = hub.Module(name="pyramidbox_lite_mobile_mask", version="1.1.0")
# Export inference model for deployment
module.processor.save_inference_model("./pyramidbox_lite_mobile_mask")
print("pyramidbox_lite_mobile_mask module export done!")
# coding: utf8
# Copyright (c) 2019 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import ast
import time
import json
import argparse
import numpy as np
import cv2
import paddle.fluid as fluid
from PIL import Image
from PIL import ImageDraw
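# infer.py:加载 export_model.py 导出的人脸检测模型与口罩分类模型,
# 对图片目录(--img_paths)或视频/摄像头(--video_path / --use_camera)完成口罩佩戴检测,
# 可视化结果保存在 ./result/ 目录下。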
def parse_args():
parser = argparse.ArgumentParser('mask detection.')
parser.add_argument(
'--models_dir', type=str, default='', help='path of models.')
parser.add_argument(
'--img_paths', type=str, default='', help='path of images')
parser.add_argument(
'--video_path', type=str, default='', help='path of video.')
    # 注意:argparse 的 type=bool 会把字符串 "False" 也解析为 True,
    # 这里使用 ast.literal_eval 以正确解析命令行传入的 True/False。
    parser.add_argument(
        '--use_camera',
        type=ast.literal_eval,
        default=False,
        help='switch detect video or camera, default:video.')
    parser.add_argument(
        '--open_imshow',
        type=ast.literal_eval,
        default=False,
        help='visualize video detection results in real time.')
    parser.add_argument(
        '--use_gpu',
        type=ast.literal_eval,
        default=False,
        help='switch cpu/gpu, default:cpu.')
args = parser.parse_args()
return args
class FaceResult:
def __init__(self, rect_data, rect_info):
self.rect_info = rect_info
self.rect_data = rect_data
self.class_id = -1
self.score = 0.0
def VisualizeResult(im, faces):
LABELS = ['NO_MASK', 'MASK']
COLORS = [(0, 0, 255), (0, 255, 0)]
for face in faces:
label = LABELS[face.class_id]
color = COLORS[face.class_id]
left, right, top, bottom = [int(item) for item in face.rect_info]
label_position = (left, top)
cv2.putText(im, label, label_position, cv2.FONT_HERSHEY_SIMPLEX, 1,
color, 2, cv2.LINE_AA)
cv2.rectangle(im, (left, top), (right, bottom), color, 3)
return im
def LoadModel(model_dir, use_gpu=False):
config = fluid.core.AnalysisConfig(model_dir + '/__model__',
model_dir + '/__params__')
if use_gpu:
config.enable_use_gpu(100, 0)
config.switch_ir_optim(True)
else:
config.disable_gpu()
config.disable_glog_info()
config.switch_specify_input_names(True)
config.enable_memory_optim()
return fluid.core.create_paddle_predictor(config)
class MaskClassifier:
def __init__(self, model_dir, mean, scale, use_gpu=False):
self.mean = np.array(mean).reshape((3, 1, 1))
self.scale = np.array(scale).reshape((3, 1, 1))
self.predictor = LoadModel(model_dir, use_gpu)
self.EVAL_SIZE = (128, 128)
def Preprocess(self, faces):
h, w = self.EVAL_SIZE[1], self.EVAL_SIZE[0]
inputs = []
for face in faces:
im = cv2.resize(
face.rect_data, (128, 128),
fx=0,
fy=0,
interpolation=cv2.INTER_CUBIC)
# HWC -> CHW
im = im.swapaxes(1, 2)
im = im.swapaxes(0, 1)
# Convert to float
im = im[:, :, :].astype('float32') / 256.0
# im = (im - mean) * scale
im = im - self.mean
im = im * self.scale
im = im[np.newaxis, :, :, :]
inputs.append(im)
return inputs
def Postprocess(self, output_data, faces):
argmx = np.argmax(output_data, axis=1)
for idx in range(len(faces)):
faces[idx].class_id = argmx[idx]
faces[idx].score = output_data[idx][argmx[idx]]
return faces
def Predict(self, faces):
inputs = self.Preprocess(faces)
if len(inputs) != 0:
input_data = np.concatenate(inputs)
im_tensor = fluid.core.PaddleTensor(
input_data.copy().astype('float32'))
output_data = self.predictor.run([im_tensor])[0]
output_data = output_data.as_ndarray()
self.Postprocess(output_data, faces)
class FaceDetector:
def __init__(self, model_dir, mean, scale, use_gpu=False, threshold=0.7):
self.mean = np.array(mean).reshape((3, 1, 1))
self.scale = np.array(scale).reshape((3, 1, 1))
self.threshold = threshold
self.predictor = LoadModel(model_dir, use_gpu)
def Preprocess(self, image, shrink):
h, w = int(image.shape[1] * shrink), int(image.shape[0] * shrink)
im = cv2.resize(
image, (h, w), fx=0, fy=0, interpolation=cv2.INTER_CUBIC)
# HWC -> CHW
im = im.swapaxes(1, 2)
im = im.swapaxes(0, 1)
# Convert to float
im = im[:, :, :].astype('float32')
# im = (im - mean) * scale
im = im - self.mean
im = im * self.scale
im = im[np.newaxis, :, :, :]
return im
def Postprocess(self, output_data, ori_im, shrink):
det_out = []
h, w = ori_im.shape[0], ori_im.shape[1]
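        # 检测输出的每一行为 6 个数:[class_id, score, xmin, ymin, xmax, ymax],坐标为归一化值(需乘以原图宽高)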
for out in output_data:
class_id = int(out[0])
score = out[1]
xmin = (out[2] * w)
ymin = (out[3] * h)
xmax = (out[4] * w)
ymax = (out[5] * h)
wd = xmax - xmin
hd = ymax - ymin
valid = (xmax >= xmin and xmin > 0 and ymax >= ymin and ymin > 0)
if score > self.threshold and valid:
roi_rect = ori_im[int(ymin):int(ymax), int(xmin):int(xmax)]
det_out.append(FaceResult(roi_rect, [xmin, xmax, ymin, ymax]))
return det_out
def Predict(self, image, shrink):
ori_im = image.copy()
im = self.Preprocess(image, shrink)
im_tensor = fluid.core.PaddleTensor(im.copy().astype('float32'))
output_data = self.predictor.run([im_tensor])[0]
output_data = output_data.as_ndarray()
return self.Postprocess(output_data, ori_im, shrink)
def predict_images(args):
detector = FaceDetector(
model_dir=args.models_dir + '/pyramidbox_lite/',
mean=[104.0, 177.0, 123.0],
scale=[0.007843, 0.007843, 0.007843],
use_gpu=args.use_gpu,
threshold=0.7)
classifier = MaskClassifier(
model_dir=args.models_dir + '/mask_detector/',
mean=[0.5, 0.5, 0.5],
scale=[1.0, 1.0, 1.0],
use_gpu=args.use_gpu)
names = []
image_paths = []
for name in os.listdir(args.img_paths):
if name.split('.')[-1] in ['jpg', 'png', 'jpeg']:
names.append(name)
image_paths.append(os.path.join(args.img_paths, name))
images = [cv2.imread(path, cv2.IMREAD_COLOR) for path in image_paths]
path = './result'
isExists = os.path.exists(path)
if not isExists:
os.makedirs(path)
for idx in range(len(images)):
im = images[idx]
det_out = detector.Predict(im, shrink=0.7)
classifier.Predict(det_out)
img = VisualizeResult(im, det_out)
cv2.imwrite(os.path.join(path, names[idx] + '.result.jpg'), img)
def predict_video(args, im_shape=(1920, 1080), use_camera=False):
if args.use_camera:
capture = cv2.VideoCapture(0)
else:
capture = cv2.VideoCapture(args.video_path)
detector = FaceDetector(
model_dir=args.models_dir + '/pyramidbox_lite/',
mean=[104.0, 177.0, 123.0],
scale=[0.007843, 0.007843, 0.007843],
use_gpu=args.use_gpu,
threshold=0.7)
classifier = MaskClassifier(
model_dir=args.models_dir + '/mask_detector/',
mean=[0.5, 0.5, 0.5],
scale=[1.0, 1.0, 1.0],
use_gpu=args.use_gpu)
path = './result'
isExists = os.path.exists(path)
if not isExists:
os.makedirs(path)
fps = 30
width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
writer = cv2.VideoWriter(
os.path.join(path, 'result.mp4'), fourcc, fps, (width, height))
import time
start_time = time.time()
index = 0
    while True:
ret, frame = capture.read()
if not ret:
break
print('detect frame:%d' % (index))
index += 1
det_out = detector.Predict(frame, shrink=0.5)
classifier.Predict(det_out)
end_pre = time.time()
im = VisualizeResult(frame, det_out)
writer.write(im)
if args.open_imshow:
cv2.imshow('Mask Detection', im)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
end_time = time.time()
print("Average prediction time per frame:", (end_time - start_time) / index)
writer.release()
if __name__ == "__main__":
args = parse_args()
print(args.models_dir)
if args.img_paths != '':
predict_images(args)
elif args.video_path != '' or args.use_camera:
predict_video(args)
/*reset*/
*{ margin: 0; padding: 0; box-sizing: border-box; }
body{ font-family: Helvetica Neue,Helvetica,PingFang SC,Hiragino Sans GB,Microsoft YaHei,Noto Sans CJK SC,WenQuanYi Micro Hei,Arial,sans-serif; font-size: 14px; line-height: 1.4; color: #fff; -webkit-font-smoothing: antialiased; background: #2f3242;}
ul,ol{ list-style-type: none; }
a{ text-decoration: none; transition: all .2s ease; -webkit-transition: all .2s ease;}
img{border: none; }
table{ border-collapse:collapse; border-spacing:0; }
p{ line-height: 1.4 }
a {color: #333;}
a:hover {color: #666;}
input{ outline: none; }
button {background: none;border:none}
.clear:after{ content: " "; clear: both; display: block; }
h1,h2,h3,h4,h5,h6{font-style: normal;margin:0;padding:0;font-weight: normal;}
.header {height:95px;padding-left:95px;padding-right:25px;}
.header .logo{padding-left:0px;padding-top:25px;}
.header .right-bar {float: right;padding-top:20px;}
.header .search {width: 313px;
height: 48px;
border: 2px solid #fff; /* stroke */
-moz-border-radius: 26px;
-webkit-border-radius: 26px;
border-radius: 26px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background: url(../image/search.png) 10px center no-repeat;
padding-left:50px;
color: #fff;
font-size: 16px;
}
.header .icon {display: inline-block;margin-left:40px;width:31px;height: 31px;vertical-align: middle;position: relative;}
.icon1 {background: url(../image/icon1.png) no-repeat center center;}
.icon2 {background: url(../image/icon2.png) no-repeat center center;}
.header .avatar {width:56px;height: 56px;border-radius: 100%;overflow: hidden;display: inline-block;margin-left:40px;vertical-align: middle;}
.header .right-bar .news-dot {
width: 15px;
height: 15px;
border: 3px solid #2f3141; /* stroke */
-moz-border-radius: 9px;
-webkit-border-radius: 9px;
border-radius: 9px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #2195f3; /* layer fill content */
position: absolute;
right:0px;
top:0px;
}
.header .arrow {display: inline-block;background: url(../image/arrow.png) no-repeat 0 0;width:12px;height: 8px;margin-left:10px;}
.bd {padding-left:95px;position: relative;background: #2f3242;}
.bd .side {width:95px;position: absolute;left:0;top:0;}
.bd .con {background: #2b2d3c;}
.bd .con .hd {background: #6d7499;color: #fff;font-size: 18px;padding:0 25px 0 60px;line-height: 60px;height: 60px;}
.bd.chart {padding:0;}
.bd .con .hd .right-bar {float:right;}
.select-wrap {min-width: 150px;margin-top:10px;line-height: 38px;
height: 38px;
border-radius: 19px;
position: relative;
border:1px solid #9da3bd;
display: inline-block;
color: #fff;
margin-left:20px;
}
.select-wrap select {border: none;
outline: none;
width: 100%;
height: 40px;
line-height: 40px;
background: transparent;
appearance: none;
-webkit-appearance: none;
-moz-appearance: none;
font-size: 16px;
color: #fff;
padding-left: 20px;}
.select-wrap:after{
content: "";
width: 14px;
height: 8px;
background: url(../image/arrow.png) no-repeat center;
position: absolute;
right: 20px;
top: 45%;
pointer-events: none;
}
.con-bd {position: relative;min-height: 960px;}
.maper {padding-left:380px;}
.con-left {left:0;top:0;position: absolute;width:380px;min-height: 960px;}
.con-left .map {padding:20px;}
.con-left .ctr-pan {position:absolute;bottom:20px;left:0;width:380px;color: #fff;}
.con-left .ctr-pan ul{padding-left:20px;}
.con-left .ctr-pan li {padding:5px 0;}
.con-left .ctr-pan .icon {width:23px;height: 23px;display: inline-block;vertical-align: middle;margin-right:10px;}
.con-left .ctr-pan .icon3 {background: url(../image/icon3.png) no-repeat center center;}
.icon4 {background: url(../image/icon4.png) no-repeat center center;}
.con-left .ctr-pan .icon5 {background: url(../image/icon5.png) no-repeat center center;}
.con-left .ctr-pan .switch {border:2px solid #aaaaaa;border-radius:8px;height:15px;width:30px;display: inline-block;line-height: 15px;position: relative;vertical-align: middle;margin-left:10px;}
.con-left .ctr-pan .switch em {width: 7px;
height: 7px;
-moz-border-radius: 4px / 3px;
-webkit-border-radius: 4px / 3px;
border-radius: 4px / 3px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #aaa; /* layer fill content */
display: inline-block;
vertical-align: middle;
position: absolute;
top:2px;
}
.con-left .ctr-pan .switch.on em{left:16px;}
.con-left .ctr-pan .switch.off em{left:2px;}
.map-wrap {padding:20px 0 0 0;}
.map-wrap img {}
.zoom-bar {
width: 49px;
height: 231px;
border: 3px solid #757da3; /* stroke */
-moz-border-radius: 27px;
-webkit-border-radius: 27px;
border-radius: 27px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #2f3141; /* layer fill content */
position: absolute;
right:50px;
top:30%;
text-align: center;
}
.zoom-bar .icon {width:30px;height: 30px;display: inline-block;}
.zoom-bar .zoom {background: url(../image/zoom.png) no-repeat center center;margin-top:5px;}
.zoom-bar .narrow {background: url(../image/narrow.png) no-repeat center center;margin-top:55px;}
.zoom-bar .mouse {background: url(../image/mu.png) no-repeat center center;margin-top:58px;}
.side .nav {padding-top:118px;}
.side .nav a{display: block;width:95px;height: 95px;margin:100px 0 0 0;position: relative;}
.side .nav a:hover {background-color: #2c2e3d;}
.side .nav .nav1 {background-image: url(../image/nav1.png);background-repeat: no-repeat;background-position: center center;}
.side .nav .nav2 {background-image: url(../image/nav2.png);background-repeat: no-repeat;background-position: center center;}
.side .nav .nav3 {background-image: url(../image/nav3.png);background-repeat: no-repeat;background-position: center center;}
.side .news-dot {
width: 15px;
height: 15px;
border: 3px solid #2f3141; /* stroke */
-moz-border-radius: 9px;
-webkit-border-radius: 9px;
border-radius: 9px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #2195f3; /* layer fill content */
position: absolute;
left:55px;
top:30px;
}
.side .nav a.current {background-color: #2c2e3d;}
.person-list {font-size: 16px;padding-left:60px;}
.person-list table {width:100%;}
.person-list thead td{color: #9ea3b4;height:70px;line-height: 70px;text-align: center;background: none;}
.person-list td {color: #fff;text-align: center;padding:10px 0;}
.person-list .pro .item {display: inline-block;}
.person-list .pro .item span {display:block;width:33px;height: 33px;margin:0 10px;margin-bottom:5px;}
.t1 {background: url(../image/t1.png) no-repeat center center;}
.t2 {background: url(../image/t2.png) no-repeat center center;}
.t3 {background: url(../image/t3.png) no-repeat center center;}
.t4 {background: url(../image/t4.png) no-repeat center center;}
.t5 {background: url(../image/t5.png) no-repeat center center;}
.t6 {background: url(../image/t6.png) no-repeat center center;}
.t7 {background: url(../image/t7.png) no-repeat center center;}
.person-list .dot {background: url(../image/dot.png) no-repeat 0 0;width:4px;height: 16px;display: inline-block;}
.person-list .time {font-size: 14px;color: #9ea3b4;}
.person-list tr:nth-child(odd) td {background: #3a3e52;}
.person-list thead tr:nth-child(odd) td {background: none;}
.person-list tr:nth-child(even) td {height:20px;}
.person-list tr td:last-child {padding:10px 10px;}
.line-female {border-left:4px solid #ed70bc;}
.line-male {border-left:4px solid #2196f3;}
.line-child {border-left:4px solid #e2e3e8;}
.mask {background: rgba(0,0,0,.90);position: absolute;width: 100%;height: 100%;left: 0;top:0;z-index: 99;}
.mask .cp {background: url(../image/cp.png) no-repeat 0 0 ;width:154px;height: 59px;margin:10% auto 0 auto;}
.drop-area {
width: 247px;
height: 247px;
border:4px dotted #fff;
border-radius: 30px;
background: url(../image/icon6.png) no-repeat center 50px;
margin:10% auto 50px auto;
}
.drop-area .text {font-size: 20px;text-align: center;color: #fff;margin-top:170px;}
.match-box {text-align: center;width:470px;margin: auto;color: #fff;font-size: 20px;}
.match-box .match-bar {width: 316px;
height: 8px;
-moz-border-radius: 4px;
-webkit-border-radius: 4px;
border-radius: 4px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #252734; /* layer fill content */
-moz-box-shadow: 0 0 48px rgba(0,0,0,.14); /* outer glow */
-webkit-box-shadow: 0 0 48px rgba(0,0,0,.14); /* outer glow */
box-shadow: 0 0 48px rgba(0,0,0,.14); /* outer glow */
position: relative;
display: inline-block;
margin:0 10px;
vertical-align: middle;
}
.match-box .match-bar span {
height: 8px;
-moz-border-radius: 4px;
-webkit-border-radius: 4px;
border-radius: 4px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #71799f; /* layer fill content */
position: absolute;
left: 0;
top:0;
}
.match-box .match-bar span em {
width: 25px;
height: 25px;
-moz-border-radius: 12px / 13px;
-webkit-border-radius: 12px / 13px;
border-radius: 12px / 13px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #fff; /* layer fill content */
-moz-box-shadow: 0 5px 5px rgba(0,0,0,.13); /* drop shadow */
-webkit-box-shadow: 0 5px 5px rgba(0,0,0,.13); /* drop shadow */
box-shadow: 0 5px 5px rgba(0,0,0,.13); /* drop shadow */
position: absolute;
right:0;
top:-8px;
}
.match-res-box {text-align: center;margin:60px auto 80px auto;width:320px;counter-reset: #fff;font-size: 17px;}
.match-avatar {
width: 98px;
height: 98px;
-moz-border-radius: 49px;
-webkit-border-radius: 49px;
border-radius: 49px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #6b7397; /* layer fill content */
overflow: hidden;
background: #6c7398;
margin: auto;
}
.match-res-box .text {text-align: center;margin-top:15px;}
.match-res-box .text span {font-weight: bold;}
.avatar-s {width: 56px;
height: 56px;
-moz-border-radius: 28px;
-webkit-border-radius: 28px;
border-radius: 28px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #ed6fbb; /* layer fill content */
display: inline-block;
overflow: hidden;
vertical-align: middle;
margin-right:20px;
}
.camera-list {padding:75px 0 0 90px;}
.camera-list li {float: left;width:400px;padding:0 40px 40px 40px;text-align: center;}
.camera-list .video-box {
width: 315px;
height: 236px;
-moz-border-radius: 24px;
-webkit-border-radius: 24px;
border-radius: 24px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #37394b; /* layer fill content */
overflow: hidden;
}
.camera-list li p {height: 44px;line-height: 44px;font-size: 18px;}
.camera-list li .icon {width:23px;height: 23px;display: inline-block;vertical-align: middle;margin-right:10px;}
.camera-con {padding-right:350px;position: relative;margin-top:40px;padding-left:40px;}
.camera-con .list {width:240px;position: absolute;right:50px;top:0;text-align: center;}
.camera-con .list li {margin-bottom:30px;}
.camera-con .list li .ca {width:240px;height: 178px;overflow: hidden;border-radius: 20px;}
.video-box {width:100%;}
.camera-con .list li p {padding:10px 0 0 0;}
.camera-con .list li .icon {width: 23px;
height: 23px;
display: inline-block;
vertical-align: middle;
margin-right: 10px;}
.tab li {float: left;}
.tab a {font-size: 18px;display: block;width: 180px;height: 58px;list-style: 58px;text-align: center;color: #fff;}
.tab a:hover,
.tab a.current {background: #8088ad;}
.chart .con .hd {padding-left:130px;}
.chart-bd {padding:20px;}
.chart-panel {padding-bottom:20px;}
.left-panel {width:790px;height:380px;float: left;background: #2f3242;padding:10px 20px;}
.right-panel {width:1040px;height: 380px;float: left;margin-left:20px;}
.panel-hd .right-bar {float:right;color: #9ea3b4;font-size: 16px;}
.panel-hd {font-size: 25px;color:#fff;height:70px;}
.panel-hd .select-wrap select{color: #9ea3b4;font-size: 16px;}
.right-panel .panel-hd {background: #3a3e52;height:96px;line-height: 96px;font-size: 25px;color: #fff;text-align: center;}
.right-panel .list {height:65px;padding:30px 0;}
.right-panel .list li {padding:30px 100px 30px 85px;position: relative;border-bottom:1px solid #505464;}
.right-panel .list li .avatar {
width: 42px;
height: 42px;
-moz-border-radius: 21px;
-webkit-border-radius: 21px;
border-radius: 21px; /* border radius */
-moz-background-clip: padding;
-webkit-background-clip: padding-box;
background-clip: padding-box; /* prevents bg color from leaking outside the border */
background-color: #9ba1b1; /* layer fill content */
display: block;
position: absolute;
left:25px;
}
.right-panel .list li .time {font-size: 16px;color: #fff;}
.right-panel .list li .text {color: #9ea3b4;}
.right-panel .list li .dot {position: absolute;right:25px;background: url(../image/dot.png) no-repeat center center;width:4px;height: 16px;top:40px;}
<?php
$json_string = file_get_contents(".\\video\\result\\2-mask_detection.json");
// 用参数true把JSON字符串强制转成PHP数组
$data_o = json_decode($json_string, true);
?>
<script language="javascript">
var pic_data = new Array();
<?php
for($i=0;$i<count($data_o);$i++){
	$frame=$data_o[$i]["frame"];
	$data_1=$data_o[$i]["data"];
	// 每一帧的全部检测结果只需输出一次(无检测结果时输出空数组,避免前端读取到 undefined)
?>
	pic_data[<?php echo $frame;?>] = <?php echo json_encode($data_1);?>;
<?php
}
?>
</script>
<?php
//摄像机位置,及视野宽高
$cams = array();
//雷达区图片
$area_w=250;
$area_h=500;
//cam图片大小
$cam_w=21;
$cam_h=23;
//vide宽高
$vide_w=1280;
$vide_h=720;
$cams[0]["cam_x"]=645;
$cams[0]["cam_y"]=243;
$cams[0]["cam_area_rotate"]=270;
$cams[0]["cam_area_w"]=0.4; //比例
$cams[0]["cam_area_h"]=0.8;
$cams[1]["cam_x"]=888;
$cams[1]["cam_y"]=167;
$cams[1]["cam_area_rotate"]=180;
$cams[1]["cam_area_w"]=0.6; //比例
$cams[1]["cam_area_h"]=1;
$cams[2]["cam_x"]=1051;
$cams[2]["cam_y"]=163;
$cams[2]["cam_area_rotate"]=270;
$cams[2]["cam_area_w"]=0.3; //比例
$cams[2]["cam_area_h"]=1;
$cams[3]["cam_x"]=893;
$cams[3]["cam_y"]=257;
$cams[3]["cam_area_rotate"]=275;
$cams[3]["cam_area_w"]=1; //比例
$cams[3]["cam_area_h"]=1;
$cams[4]["cam_x"]=714;
$cams[4]["cam_y"]=304;
$cams[4]["cam_area_rotate"]=90;
$cams[4]["cam_area_w"]=0.2; //比例
$cams[4]["cam_area_h"]=0.5;
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
<meta name="viewport" content="width=device-width" />
<meta name="format-detection" content="telephone=no, address=no, email=no" />
<link rel="stylesheet" type="text/css" href="css/css.css">
<script type="text/javascript" src="js/jquery.js"></script>
<script type="text/javascript" src="js/echarts.js"></script>
<style type="text/css">
body {
margin: 0;
padding: 0px;
font-family: "Microsoft YaHei", YaHei, "微软雅黑", SimHei, "黑体";
font-size: 14px;
}
.box {
}
.cam {
position: absolute;
border: none;
}
.area {
cursor: pointer;
z-index: 999;
}
</style>
</head>
<body>
<div class="header">
<div class="right-bar">
<input type="text" class="search">
<span class="icon icon1"></span> <span class="icon icon2"></span>
<div class="avatar"><img src="image/ava.png" alt=""></div>
<span class="arrow"></span> </div>
<div class="logo"><img src="image/logo.png" alt="" style="font-size:120px;"></div>
</div>
<div class="bd">
<div class="side">
<div class="nav">
<ul>
<li><a href="#" class="nav1 current"></a></li>
<li><a href="#" class="nav2"></a></li>
<li><a href="#" class="nav3"></a></li>
</ul>
</div>
</div>
<div class="con">
<div class="hd clear">
<div class="right-bar">摄像头分组显示 <span class="select-wrap">
<select name="">
<option value="A区摄像头">A区摄像头</option>
<option value="A区摄像头">A区摄像头</option>
</select>
</span> </div>
<span class="tit">摄像头_1</span> </div>
<div class="con-bd">
<div class="map-wrap">
<div class="zoom-bar"> <span class="icon zoom "></span> <span class="icon mouse"></span> <span class="icon narrow"></span> </div>
<div class="box">
<table width="1870" border="0" cellspacing="10">
<tr>
<td width="44%" valign="top"><p></p>
<!-- (拖拽在回来可能会有问题 TODO) -->
<div id="video" style="width: 1000px; height:560px;"></div>
<script type="text/javascript" src="js/ckplayer/ckplayer.js" charset="UTF-8"></script>
<script type="text/javascript">
var videoObject = {
container: '#video', //容器的ID或className
variable: 'player', //播放函数名称
loop: true, //播放结束是否循环播放
loaded: "loadedHandler",
autoplay: true,//是否自动播放
drag: 'start', //拖动的属性
video: [
['video/result/1-mask_detection.mp4', 'video/mp4', '中文标清', 0],
]
};
var player = new ckplayer(videoObject);
function timeHandler(time) {
$(".no_pmask").html("");
$(".pmask").html("");
var rtime = parseInt(time*1000);
frame=parseInt(time*30); //每秒30帧 第多少帧
if (!pic_data[frame]) return; // 该帧没有检测数据时直接返回
for (i = 0; i < pic_data[frame].length; i++) {
if(pic_data[frame][i].label=="NO MASK"){
$(".no_pmask").append("<img src='video/result/"+pic_data[frame][i].img+"' height=100>");
} else {
$(".pmask").append("<img src='video/result/"+pic_data[frame][i].img+"' height=100>");
}
}
}
function seekTimeHandler(time) {
var rtime = parseInt(time*1000);
$(".seekstate").html(rtime);
}
function loadedHandler() {
player.addListener('time', timeHandler); //监听播放时间
player.addListener('seekTime', seekTimeHandler); //监听跳转播放完
/*
player.addListener('error', errorHandler); //监听视频加载出错
player.addListener('loadedmetadata', loadedMetaDataHandler); //监听元数据
player.addListener('duration', durationHandler); //监听播放时间
player.addListener('play', playHandler); //监听暂停播放
player.addListener('pause', pauseHandler); //监听暂停播放
player.addListener('buffer', bufferHandler); //监听缓冲状态
player.addListener('seek', seekHandler); //监听跳转播放完成
player.addListener('volume', volumeChangeHandler); //监听音量改变
player.addListener('full', fullHandler); //监听全屏/非全屏切换
player.addListener('ended', endedHandler); //监听播放结束
player.addListener('screenshot', screenshotHandler); //监听截图功能
player.addListener('mouse', mouseHandler); //监听鼠标坐标
player.addListener('frontAd', frontAdHandler); //监听前置广告的动作
player.addListener('wheel', wheelHandler); //监听视频放大缩小
player.addListener('controlBar', controlBarHandler); //监听控制栏显示隐藏事件
player.addListener('clickEvent', clickEventHandler); //监听点击事件
player.addListener('definitionChange', definitionChangeHandler); //监听清晰度切换事件
player.addListener('speed', speedHandler); //监听加载速度*/
}
</script>
<br>
</td>
<td width="56%" valign="top">
<p style="padding:0 0 0 40px; font-size:30px; color:red; background-color:#b3b6c7">无口罩</p>
<p class="no_pmask" style="padding:40px;"></p><br><br><br><br>
<p style="padding:0 0 0 40px; font-size:30px; color:#08d814; background-color:#b3b6c7">有口罩</p>
<p class="pmask" style="padding:40px; ;"></p>
</td>
</tr>
<tr>
<td colspan="2" valign="top"><div style="position:relative;"><img src="image/b2f2.png" width="1250"/>
<?php for($i=0;$i<count($cams);$i++){
$r = get_mast_postion($cams[$i]["cam_x"],$cams[$i]["cam_y"],$cams[$i]["cam_area_w"],$cams[$i]["cam_area_h"],$area_w,$area_h,$cam_w,$cam_h);
?>
<div>
<div style="border: 2px #fff solid; border: none; position: absolute; left: <?php echo $r['left'];?>px; top: <?php echo $r['top'];?>px; transform: rotate(<?php echo $cams[$i]["cam_area_rotate"];?>deg);"> <img src="image/mask.png" width="<?php echo $area_w*$cams[$i]["cam_area_w"];?>" height="<?php echo $area_h*$cams[$i]["cam_area_h"];?>" />
</div>
<div id="cam<?php echo $i?>" class="cam area" style="left: <?php echo $cams[$i]["cam_x"];?>px;top: <?php echo $cams[$i]["cam_y"];?>px;"> <img src="image/cam.png" width="<?php echo $cam_w;?>" height="<?php echo $cam_h;?>"/> </div>
</div>
<?php }?>
</div></td>
</tr>
</table>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
<?php
function get_mast_postion($cam_x,$cam_y,$cam_area_w,$cam_area_h,$area_w,$area_h,$cam_w,$cam_h){
$left=$cam_x-$cam_area_w*$area_w/2;
$top=$cam_y-$cam_area_h*$area_h/2;
$r['left'] = $left+$cam_w/2;
$r['top'] = $top+$cam_h/2;
return $r;
}
?>
<?xml version="1.0" encoding="utf-8"?>
<ckplayer>
<!--基本配置-->
<config>
<fullInteractive>true</fullInteractive><!--是否开启交互功能-->
<delay>30</delay><!--延迟加载视频,单位:毫秒-->
<timeFrequency>100</timeFrequency><!--计算当前播放时间和加载量的时间频率,单位:毫秒-->
<autoLoad>true</autoLoad><!--视频是否自动加载-->
<loadNext>0</loadNext><!--多段视频预加载的段数,设置成0则全部加载-->
<definition>true</definition><!--是否使用清晰度组件-->
<smartRemove>true</smartRemove><!--是否使用智能清理,使用该功能则在多段时当前播放段之前的段都会被清除出内存,减少对内存的使用-->
<bufferTime>200</bufferTime><!--缓存区的长度,单位:毫秒,不要小于100-->
<click>true</click><!--是否支持屏幕单击暂停-->
<doubleClick>true</doubleClick><!--是否支持屏幕双击全屏-->
<doubleClickInterval>200</doubleClickInterval><!--判断双击的标准,即二次单击间隔的时间差之内判断为是双击,单位:毫秒-->
<keyDown>
<space>true</space><!--是否启用空格键切换播放/暂停-->
<left>true</left>
<right>true</right>
<up>true</up>
<down>true</down>
</keyDown>
<timeJump>10</timeJump><!--快进快退时的秒数-->
<volumeJump>0.1</volumeJump><!--音量调整的数量,大于0小于1的小数-->
<timeScheduleAdjust>1</timeScheduleAdjust><!--是否可调节调节栏,0不启用,1是启用,2是只能前进(向右拖动),3是只能后退,4是只能前进但能回到第一次拖动时的位置,5是看过的地方可以随意拖动-->
<previewDefaultLoad>true</previewDefaultLoad><!--预览图片是否默认加载,优点是鼠标第一次经过进度条即可显示预览图片-->
<promptSpotTime>false</promptSpotTime><!--提示点文字是否在前面加上对应时间-->
<buttonMode>
<player>false</player><!--鼠标在播放器上是否显示可点击形态-->
<controlBar>false</controlBar><!--鼠标在控制栏上是否显示可点击形态-->
<timeSchedule>true</timeSchedule><!--鼠标在时间进度条上是否显示可点击形态-->
<volumeSchedule>true</volumeSchedule><!--鼠标在音量调节栏上是否显示可点击形态-->
</buttonMode>
<liveAndVod><!--直播+点播=时移功能-->
<open>false</open><!--是否开启,开启该功能需要设置flashvars里live=true-->
<vodTime>2</vodTime><!--可以回看的整点数-->
<start>start</start><!--回看请求参数-->
</liveAndVod>
<errorNum>1</errorNum><!--错误重连次数-->
<playCorrect>false</playCorrect><!--错误修正-->
<timeCorrect>true</timeCorrect><!--http视频播放时间错误纠正,有些因为视频格式的问题导致视频没有实际播放结束视频文件就返回了stop命令-->
<m3u8Definition><!--m3u8自动清晰度时按关键字来进行判断-->
<!--<tags>110k</tags>
<tags>200k</tags>
<tags>400k</tags>
<tags>600k</tags>
<tags>800k</tags>
<tags>1000k</tags>-->
</m3u8Definition>
<m3u8MaxBufferLength>30</m3u8MaxBufferLength><!--m3u8每次缓冲时间,单位:秒数-->
<split>,</split><!--当视频地址采用字符形式并且需要使用逗号或其它符号来切割数组里定义-->
<timeStamp></timeStamp><!--一个地址,用来请求当前时间戳,用于播放器内部时间效准-->
<addCallback>adPlay,adPause,playOrPause,videoPlay,videoPause,videoMute,videoEscMute,videoClear,changeVolume,fastBack,fastNext,videoSeek,newVideo,getMetaDate,videoRotation,videoBrightness,videoContrast,videoSaturation,videoHue,videoZoom,videoProportion,videoError,addListener,removeListener,addElement,getElement,deleteElement,animate,animateResume,animatePause,deleteAnimate,changeConfig,getConfig,openUrl,fullScreen,quitFullScreen,switchFull,screenshot,custom,changeControlBarShow,getCurrentSrc</addCallback>
</config>
<menu>
<ckkey></ckkey>
<name></name>
<link></link>
<domain></domain>
<version></version>
<more></more>
</menu>
<languagePath></languagePath>
<stylePath></stylePath>
<style>
<loading><!--显示的loading元素-->
<file>data:image/swf;base64,Q1dTCrUjAAB4nKWZBzRc/bf3BwmJJEgQoo0SEd3odRA1iU70PjqjzYg6SESITiJatKiJMLroRIsuehC999FHveT/PBn3/p/33vu+71rG+pzf2fu7z3ef39nrzBorAFACgF0FAJBjAaRvkwMAAB/iK1cBAFFnMwthNWlZWjeonT1M+PxIjN4KDncU5uR0dXXlcOXhcHC25AQJCQlxcnFzcnOzn0eww9zt4SZu7PYwBnow7W8FaXMYxNnaEW7tYE97cWxi6uACF6On/0vWDer4R9YexmFi5mBqzgFxgHK6mThygji4OC90zoOEpZzNTeAOzhoODnZgyYsoWlk7E5gVrYqzg4U5DHYub2JHK6XOJ8r5X6Mv5ZtLn3/A3FwgbnYuAXZuIQ0QvzAvjzBIgJVLUJiL61LuvyL/lapoDjcxM4GbYJIv/ng0QHzCfOdpQpeT/1PsX+kOZtYW7v+rZEwkrSjnf+ne/7qfior/fUehUM6/o2FwNXOL/z4apuHuaM6pZg5zcHGGmJ+HM/x1RxQVhR/bw+Am9hDzx9Lg8wUOa2szYUkuLhnQIyFBWUleXhluEOiRAA+vlJS0AJ8UnwA/D8+j30b/c+rfatIOEBeouT38LzWz/wu1S6l/qyk7W1tan++Jf1AV4OLi5+PjluGTlBaSAYFAQtx80o94QYICQtz8/FJ/3Yx/lvhzrebO1s/NzWSdHaC/74KjiTPM/KJTYvR/t+qiTb/7K2z9720Skv2fjP1b6t9qZv9gSOh/MvRvqX+rOfz/tOn/KEH758Zg+vT/vKXNIH92qKOLs93vsWMG4TS3M7+oBjvfpaDfU8IMImzh4Aw1gYNNHB3trCEmF4KcbuwwKweIravJc3N2i4uJIcqJCfzHS+L8a/CBaQFS2OfjsJb05vl/LMDch4r5OCQtCAAQv75JfrFy9TuAC+A2spaI56XxbK+FKa4/7vFGNXGv/CO8q69or8TIBNHeEX7R4+l7hYsu4hH2FUnhhzJXtDQ0tVVJTu+SPPJ4CScpoP2YnULPtnk0weOQJDqh5K4uMZy4jR7r7qhyXuhM2K6uhXVVv/1uNh/3NjGU7X6dzqAYuwQ/zOal3cmynJmnq2sM8LCWaS2HYN8lW7C/ZWrylGlWpMzlHTuHQvQtahK6p4rOMeXMc9wicQNf9iYsnQdFDdbMf1Vc+f4kWHdEw7vXyGUmJm7qq0Ev1jAXkZs796YsQT4T0K0IMCxGOAy+q412HUsa3Zn2pwvScoUX3uPh1r1LnKrGY8GXcyIQGx6A0+dr9bNGzAd2ONfyKVVCh8l4hmgSyPOi2r/Os/0F4RB2gNEVGyOcsFqcD8grefqAPEPsGiGC/lXTjlnsp5UfhJKmNHX79Qer4gI6VE+HfGpwNqoa8mKKCuLi4gi/ucyaWVrqiW3U81PMVLpcL5jYlnVFJ1Kqn/heba2L6upoqcSWgfqinSSeX+MaoIydxVKA4z3wxOrD8XUvkHAX3lzmnVzkn6SHzMZsvHBQxpErxV45TBbn6Rdz1XzJNjzR2hYQHR7Q5iVGbbY9X+x4uL6i2rGQBTpLbhKMXSdUcMTihvqmchtfs/WTYfcrkfwilma94/GOfSZwqauQ13iBJOWQdHLnqK5yTsI64MQhY9TFo3Hot+Q3epzoUpHS56beieZ4IbB53QfbqNt2o46KaQtxzvOqSnVuV/sItDLp7FcjS64/sT+y8jROZ2/5QYvFSUwOe6odUhozxxzAj5UglyTO0ws5mHwYN3ahizd8x65aan9hP2TmOPmAvLPYlMQTrNbQaNXceIW0747RHvvrIvIc5k077ODqKN9jiy+M2O4othob7A6TI8tCbdsgJohCPYpXoeFaCVvzL+67OdUgMjfr4xLWvm8WCo/f/RzVRudvoSZ1qkJdF/EKO8h+fJncTZCTlbHhfdSAu4Rn3WB3l2kO5I2v5ivDrm37yXoV/U0j70fNS7OwGNHmrqKQVP3BiQ4JJYjsqBOFNIgCxd2nbLDw7fBF7Ykou91I+qujzwya4C8W3x+ZcYw7MOPG1Fdn3yns3R4a6TLc1pu55tAVzqxJT53i5SXJw6HMhLsY6aZS9ON4vdQrk9FdQS7JXHjLjqoFb+k63/R9kQhIaOyMi3WImKvdEkTLJ5aEy9mqOJowR4xonr4Ipj83DKMQDWadfOZF8nn6s0jhN3Xx/fh41+1pcsL3CUc2XmIv+nBMU/IZtRpY7NuTyiy8uGTx1cpPSGCdoSYT3BAT++LXt/aEBYJmGJhe9qUugdnWnaenBUo0fzpSzTN6o386lnVRH4ecsv2Ympqa1RM3qPG6h53zKTu7Zw6mdmAVHR/Psgob6koUqR4bG4Oj+XYZxpRkkbsoJn9CD+URpnU9t/G+k8Nd4mS5UTRkESZfHvOtO2etAu15gl7qy/IfkkBai/j4wH++HsjRCj7Kss3aWJakYOKe6/Qc32Q7AJyaP3PkzHv1VPXqhxs5G4Ix/WfctpKnubYaDYmUTWiwtbSqiD2BWSukbmtnApm7uL+LJpVS5ucAPjN6lnT+aUvi9Nrq1XKndkqIjSdunmNpfMJRUG/YbeLVYXHSZSwaMtTfKGszLdSmZYFbVYkMEOXt0dlaeMjOwvISK22jSi+GwyBkSqoDF4Gs6Mqy7ZmrfKRZgg6jA78C23/P5pzy7IFNWfY0R/RY1W4mvUFqLZcsbjuAgLzz5LBTRGsLtKph2eCBvVRPL4XbuQnbH6oPb4R/s0WpSotPOLJuoZttM7qVqk+O7OZb0ApSQXEWbN6zXaHmSy02MSHaXOIrS7X7i710BdZvyYGEhIRy5fpKob+Uhx5EmtD96hvJZGsDbn4ns7QEIVrNQzreTG7t6dTrf5ARar3pwzFOAW1HSUJf4gu0lHckvKW3JlAdrR6wnN1WO0gBqQmCewsOj3U/32Zo3pQcMl/1DEQ+Www3bd5F5x1oM89EkADJagK0/WiugD+YlW7dabGfIdI0DA1csJnS7SiQDJFmQ0p+++whjRRGbWS253gjmB2u2XdvQzyItLTiEsBuxuFKJpIvXqhNsLJ9n3Dt/brTj+xWtm1r0kxntgGTU/J4ffUYvh8aZmZjmuvUZfSh09EqcvVTmQNKW7aGaQ/m1MG4Q23gJW8iRQaE3/oGij+8w8N2ur6LoqsiMCigB3F6E04JSlQw8zXfGWdNHcd38a9lMmXvvQt7w6fXEaDic6vJfL9dDXr8sRaVWEbWOcRf52PNRxLMkqIPJ6lh6woy0An4urBCzuASqI+4w558dLCXhK7GJ9tRunl9k8edm5OZg95toTTkCkcXPXWrRQVvQ0yC34f5JlN2xphwFQaxeyv56enRHTrKP7RiJmyDY/xbtmrSQgkPw1455dd4N3n1uLIx4pIcZXOAFyv8T/psEfvaRsccO3oGoPcRWMKGQ+jXvN97Toi3U0NTa9uozPlL/ZUM5NKaKnCLRzY/6fIEEgrskdd8wS3oHjqtv+Z57Kqi0xk4y1H98Dn3frWYvkbQVmHNG33wI2HuhvAlPAiExfa
tzjpoMHLGD25W3OkPwzG7XmAfNW0mVcNAZeQ+cXvb5Cbl9PYtdEiNaJ3aLmvqxBsdUpvcPFvSJM4qgtTTwPIffDRLvdyIjW1a7diOgYXYiQ7qxZKZ+9PaVHPF+Y/eumrQ2bizm2r2f7w+k9YaJHyFIzeeen9f9b1+wnqeGouhp7kbt9bCR9HEadZpP2dYPMM9iKsAWes7xQQSB0fRBxNl0oghcn4FlC00RDeKEHHzGkJTK4RxVktw/NpGCTj0nYdD8QCShsRTN2l4TjOhhI98r1nHRVTGKxu/fDjoINArDHLfTGe3/mBpJZogLGzksHXxF/Iscu3Mw2H/M5lR2PXwQK55inhpXLTBIUvlvSw6jZXE2a4tOGkZmLE9KcfSSXhrfPRX6BafOppoyL9kwL5jTirm7MlaADLHJgcSrJggew1N1qL9SOxu+K3mQYniubuEOMB3MUiVYyjF6w1wQD0y2f9zNnbQN3wX7/XadPwpNgHCBfLy9drFtP7JWMHEid0+HfX8mkORVWFUw254l2FHMlhEayk7kRHK72Q3dF9JXF6kWWTCoC/qU5yej2zn942h17+KysWkdpmkKuC1BykHjcO6cfzgnaVgcDJfhlLWhi6NBviROQcYeKQ8Eni8XysdHqP8TTKN2p4C/Kjdtn+XtbZwWSuq+f3P0eFFhbCsLEIaPydPcMtOWTfEfE6mCrHeAnOgzltDaetqMQ9NLy1/1FWLGkB+ESwa+nhMt4Texs0QSxDJ3z7+YNk733lvmI/pqCC/vGjGfBsW/0sAvMiH/uiYUIN+eVQfwLxrYOAVQbMcVPDK4/5gZ20Q4aGB4YRhauFLS3WzPiam7/RS0WcFyaV93jbTJWQ78RSJLUCpk0iXpcWzFv6aCSa7j6Neox93AEkih7fau8PrvSWv6r/yyj8WtdPphIbLAF1DFOwzNiq71oXsqGRPttkOz2Z4+9KKYnfXr34R9Ihcat0/0y5ebqIFGvm9nTwdHx9PysvLIygtLW0+LM2YYTVRoAspYwYOkFreaw/IEzS0OQkSVx7yfNxa778famlErmfdp/B0zFb9wCp7PTesAHnKdRDu9c6NNm1iGDXqfyQu2jMTvr0kNOD26ugT9Ff/lDLy020B2Bz167HF6gyCnJxaGZJ2Hqq7VKpDyOznc+q6/pPmA++ygOHff87lqXzeQTRfxWEyg2zVdHMZD93bGXcKmjvK4vItiNqdX4h13WxnDd4Htiv4sEgl45azkAQZ+/j4mDY2/fypMzU5OWnn4LBh1tJxzahIw5uDyjKjnyd0TmoIVVF64yQxr0UlVAkf+OjuAeTV6WdFTnYnZOgcpKlw0KY2A5UIf0mxsbLzwcvoazLvZOdQ32kjkc9NlIRsy6aoyhmcxqkZWNRarsyRuSoUTLbMUXRc+bpJiwrYb0Ug5mjNN0cvoqfwcGBhyAytHJrw5s3ZIZUfrxd81AM70fCAhJofEdRS9kGiNjFr4+aEy/Sv1Aw9CLZUiTcynRILDAaPr6+v03BwcDQGt7W2vtkil4p35lyurDgcepYm8JoQwadscEKuhxf9LYDr/TSEDWVb8GaBLKNjZinnqvyUH/DIXxZ0u+np7NBsXzrntNvTbDpRYHGiwPLb0usQPtOwsF8rQl7Md4GHFAmUNVJ1vSyChcd+Cw3BqKna9PikmGJq4b4eM7muCVklaVxI2sLeW/mJ53cM+IFyR4otPc4IBrUhXloXKMOIg+jr4s4WpFj5Jw+1I3BDI+zhzsvBzAjvr8zRT3qJTP3g0k86XDjn8VxAGexdd5KUT1PCVU2MpvkeVBwQJUGsvP3Bb/uFRFNpFhN4fvCJW2XvsdPoNe4r4gOntR93XEdSj5hyb7nSAU3XG/V/8NsA0Ypjvnx8Mfknfe2bet9QkgXRT28n+2k0FH210a30ZriFnu9o4WlOwYOk/ZKdTokUo5HKuRUS6nJL3lA9vHFgN9DoWbjFaiS0dnvxcESZxujA2dRYOiQf2EuQRGgo8rCGHnyzSXt6bEBt5SnciXEPCHVnrU6OIIaWMKHcWBhrektwpYYOwQivQFuiinnUDlq0VlNrDOlmrz/qEtuGB7REOd1pkjdlQEyeGJEBp73R0GSVNRu9GfOq/Z/KywdtNZPXWvDmFr9c5QaL4ou32pzN0kp5dazzm1FV2AuFkTTjgCIr3tsy2waG497PtZGgn/Gbp7XmSAp7o6Pq2Q49PVLOAA9KcQTulpy/MYe+c38MtXVnKwCrxoQvC0/0lwPOr2thrDPXaCnIB56bwAvnvBmXN/KaBHygL2WVTja7kQ4sVqbyQY2p0JjiA/Gqmr98mz9szz7a02rM+xLwCzuMMNoCatrD3Rse1Lp2SCAAR86Lg5yD4glN6eKa1Bpl2nSVROR5orTTXMLSGRrlrwQviM7TZN9I2HxUbKSSrSL+cJ9oS1JIeCmg/yr7vg2ikt/oTh6SaKj2Jrow70DGiIbBEIJ0MMgHoWKHhzIGM9IHE5pWVm+hJHUlwE/QhRFXR2RLGNtEDkQlY2v7bEfkhISS9Tb59CHo4U9U2r5bQcGCJw0OVl3NsFGvW8Av5TEa9MVVd5MYo/3XAhRiKOchhb6EYbdsOrNzp2jeRGay2t7Wh+/Y1K36RGfuJX/Od3UkmgeaguVFDg7aVEIlRscKM3yAGew0Uwjl1aFjh9hGW+03NxC2m+mz6JHDjG6WY4N3/cf6omQf2eUiPjKO7PT1WA404W20p9+qerOSTyKAOlb50DSUcyB7fG1BImQO8gZRhiqDTQj5cXpXhHF0lEG3CXNc/A6M9Wmrql57nxVUdzZyS9W4a4lfscvehUCrPSiX5O+3O3ltgW3kP5yqHW0enAVC14obtbxCtWy8drVGGo4rtU3EOu/wjXqak/Dc5pkul0R5J1HC7L4qOofnNJQNTbXn1M4saSk3vF7Cm6PhiD8kza01BWlC9dfb7fEzemmB4Q4ipwtjUonBvIVjaWHWzZHJfvOriJq1HioBS//BGnmmQMJU781tYPNyQDThjAL5W7D8SvvZk6eRaNlEr+IDxQ8VXtTq+4kBXvUmD8meWe2k4at2D6S0to1MnASL25FFx35GIqfHiedTRkuJ2ZRspWHvfYr1lUt+4fKaCIU4jKcezsZx7spb9akafZjPU/uUXdPeoj19PBZL1dgWACduDlYbzDlDxiZ0jLtrbQbeqdo+f0REG+MKpqoi9CCPasWzhHYPBlyuMSN/9gvbptagKr8SzqQdDc7YPNDNtQzJd2P5kj49FDGG4IRTGRBNTs1BFz1WeYW+N9gQMgDf5ZAfHt04y5E9jc1djH1R2DBh3B3Hek1XNzfP5BMvn6FuZUK+rkH4iVDmbMNByrVEH7us9I1Gqt1juZVbLT12MtQv8QmonyZWz8RpWotldKy586ACrzm5EhI+d8YnfP78btl3FX96FgUOVy0dNNfQq+2J6cOc6CBxPZHRyQOgJ1Q05ERBqpcfGMC2kR9PaG7i74G43qmj8ovZQ2bNNjfznTo4JTe5qZpp/uNMcDJ9YfGWhfRpfa5BC5ncVdyAJz/8Aokytv
xwHj8ekIt6m0MgYP2tJDb2c8OPH9HJ0rJqOWJPkNVVwsWejrCIgDhCZlvnU7JO9fnsMWoum+v8AwbfVk+M6lJ9gwusCBqLZh9O69gnFr4HFWZOae70BhUwK4um5YYVP0CYYnnhEyTbJ1XaHqQpizLulgE5up4/P/NS5MBbXQVYWkpkYJPg17Ved3ScLB0dURP5Ke/yPkBa/IFy0nH+puQuCyvlSGjzftZaVIXxDmVWfunyHVPNQUnDulz19NofEtvLGsE2XZ2Q46j81ZmaN9QVb2CRj61PAhzWwNZO2w+c3qit5Ofhex5Fzz1hnJzCOjgAGBqOwVxd65qupiAab/J5Hun0iYnlnrm22eyPHpvj75elPfiisUDXvSp8K0tGrh8V/SQxC0bR64FI88g1aRt5bTTznjq+d7rV5LVNzeGXx+kJo3mJ6aSN0sa0kys33DwnzF/N5wTetqGhcYne2ytQ7Z0IP7hbe1dAPU9v2X1kXHmZT15Ma6JEe2t7u4XC81bFegxoKGXFC5JncOydORbgvYdK9N4oyiUQ2jBI0ZeitkRRTOtp5uaNlv4czp5Pb6poFnFiZdkHR1VXQBXUWHIZ3nlUG9o/mYuNrK8n44w4Iyaur5eQqEtGp8geMXeZjtp7uCvGpai1du5ApvghXuqrDO+nv5MA+aLNoVXkDf5CM68+ga27SlYq8dSOJezWufGaPCawyskPm4XOX3x0T/Y/PZbS5w1RyVNacbLuFM5Vz2qZ5QutlUF+/hkZGf026X3UHZrVjY3z7wvWYOW8N7++ComTxbxrjIhYtUahFrqJbsyemZhIWD3Moj6fiOIPMz8iFBNqdi0cHfNrExM/zZSCvBYrwzPDEipqjJgRscIWlNOj2uAd6+/VX3ERyaNjG/Uk4O93ZsEmJZFFpZC6HMv+lewPN8Kz0SmZxgWadVE2u8eyAWThqTOZuOmrX33lBKgaTlvwjo/KeibVW0aK3V2NspRESjOzHH4OV+6zcmZkUI9N1JxYZkVF2RswnTFNEF7HMjUF6HZLjNwmaURNbXSDppda52aLxdjX2rtNIUDP/vb2MUH5BOSE7QnXLFlm44PMsOT5LZNlbfuH/DwPZHmbythJjVCD7aD1yPaCJryI0rkrL3Xsu5Bzcy6Nnk58aoP9H9feq8t771J+iVPDvd+M7fRFRgYWL+GtfKrkrRxqKCfrrK8SHDgZliZbRXTD29s3h1XC9sT6LN+k9mTs88DXH+vrzc6p7WWdKCbYNL56rZF6sfLysHXe87hvX/nFuBcqi+EdzAPNwsUeW7oKSwHLfGXFBO0ZixwMO2WkA7WqRvtCuOxjoRujxVhRCdhrI1i5ejjsetjFhnRCcbaPH0t4iRJRthfc7iQPCpomn+R7Bffn9RtOObXornGpdt3wdz89zTw10lPfVofOhAihBoyaAvKGe8dE2zTO316EMr28YeQCclLfGbOES4GO7OkHiFccSNVuv01EOekzidId9a8cX+rWG7vYlPMthet4aekiZz66w1JkbqyDGK9aDYYs3RglI2sBPfIiIksn02gpKVv4SisrG4o9mNvQGRLCyKiqytHVfEZNDD21CyGIfXtjvP2gjSSPLdIndvsem44fjz/h08H6YlBUE3xWb8P9aDIL7UObt7M/88SGVs+JgthjYfJ1x/ttKCWlPTjcRukdRL1K2ok42W8bddco5iNkSdRmp0pPwHjbYnJ3CGlEzWVcf1wnEf0WZUdOTBYX5ROrPlRFNpNVDqkVcg9OrdmhtlxfEODb59qt7LcUW4svzoFT116niZ8gbXufKNHWn3jQ1nmUF25ry7ugCCLmiweSbTBp9va9uEXp2yw7PTPWGkbNE+DpKTLcpsCP8JIPRSpMrRkOuire42S/TakTz/N0ABCKsn46iITrqFMiJ10METDvZ+zC6WElz9PTJwZuIUICKeeW00lvkGEHEOLEcGABEfAFWc1BVk70Yam1yemZZ37jcfI7qjNvVGowDOasqn48emeXFTJu9OSD3eruBCv4uz6U6sMhumzZAy69cHNMg+UrKhmpjCogHrCwXn9Jg9blmKNDHALm5iQMqDdnwuuiWetSpSSe76WeBYfRvD//crPGURR4L76IS068I1HgfVv+pxLlTyfya0Ztsey7PNoVpbY3k99/2aAolDeSW/hEbPAMbxy1d4oS+SwZeaa+FkROzoMbC8L53o3X1ubLrufrh/ei7UlKDY3bEZSWc2Pan4Da0LPpF2KRljJBYiLgSACoWRQ/vGgk6F7+cxhup+zjQ0tr3DI4PKwMllcCuxZCRzc495Zd1toi6IbyQihcdgNuM3AgBmrPWpu7Eq1PDm+s6Bu0hL5vVXtrBf0YfSNHbHjvbKIQJVgLYAIAANgAq4sfjLEmAVhSWICfEoCr54sAIlJkCg1e4U2qApyCHSokAPAaFxfrPBZQC6A9P43zO0k+SskXC3B2dgYguoVMua9ceJOC5TnYT7YOEIgri30edK5VCyg/j78CcAQQpgAACAA2UfkOXg+BQSPi/GQcoEPQogMQ0gEADJ51AM7e49f/Pn0Tgdfzj2cBtKRdnhRf+qC9Go0Akg4uLGhuCf/4qO/1AHlArPWDUbWo3EoxAElRHSBKTZGZmRGA6ykPsCz9eT9o8HOJbuZo+qdAAHOsMfaong20ugxATBCBxdWrWJrBqPfcA8Bs6Yt1Pz1eTe0J4OZeBKAbWvgYcHadCHBhA+tfDbgCAEicH59dHzo3hwt4Bgi6WL0KoMQFXJwJxL2KBaC8tonhG5MYJujB8O1LTFKHYbJ8DFOkYJg6AsO0vhhmcMTwg0vMbIxhNhUMc0pgmJsLw3y0GBYkwrDIJQZf8ihxyaPUJY+yl3w9vuRL4ZIv5Uusdsnjs0setS951Lvky/CSL5NLvswuseUljzaXPEIveXS85At2yZfLJV9ul9jzkkfv3x4DLvgeju8fvG78Bwkl/iAp7R+kxCAdRuHB5B9kq/uD3Cl/UBBTAowpIY3BJ5hqKpgSmpgSepgSJpgSlpgSUAzCMNXcMSV8/pQgw8fg3T8lyOj+6JKxY1DojxiZDCZNFZOmj4m1wsTC/8SSAv6skpL/SSPlxKzKYVYNMavP/6wSU2LwKQY9LvD1Bd5W+5sIjX8/3G9wcXHOn+27AIZJ8Wvn2wqA92fYYRPlRmD74dDdr7v4Hfl88v0elXmO2H4G+Y+5Lq0BaC8G4lqa5ZWpj1k4tCK4kvRyMkrXG2S7cC+K4eKei/4u9h8UUd6/</file>
<align>center</align><!--Horizontal alignment: left, center, right-->
<vAlign>middle</vAlign><!--Vertical alignment: top, middle, bottom-->
<offsetX>-100</offsetX><!--Horizontal offset, in pixels-->
<offsetY>-40</offsetY><!--Vertical offset, in pixels-->
</loading>
<logo>
<file>data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFcAAAAUCAYAAAD4BKGuAAAABGdBTUEAAK/INwWK6QAAABl0RVh0U29mdHdhcmUAQWRvYmUgSW1hZ2VSZWFkeXHJZTwAAAVTSURBVHja1Fl/aJVVGP4cw/mH5tUWi3I53cA0WbeIzMgaq6lQoVawm0S6glHhIKM/TELICotqCxMSGWQRdFfRLEhc9mMxqA0rbdS2wsV0GkZXuaKUWlHvgeeNh7f3u1du3Ft74eGcnXu+c873vD/Pt0nZbDZyZL2gWZAQTBH8IjggeEjgPZASPIB+g/P7EsGT6J/BfHfjHFIv2Jpjj/+dlDskvCm42Jk7X3C7YJOgw/z2oOBGwdEYRW0RVAjOCZ4ugNggqwU3CTLRBJFyQ+xuwVRY14igB78tE9QKpgmeEHwkGKRnq9AeMesHRWwEsYGUxwSdBZ51ecweE4Lct0DsOVjaZvptA8jfC4JfghWpXIJ2jMY+EDShf1pwh6DvX5x1GtpDE43cNFlfhyFWpQ9xd5FgFo3fRi++C+2nCBNBRgU3O6RsEyyEQv4QLMb4CcE9RlFBLqS1WZ6BZ83A35ofVhuvfE5QJrjWebdgCJOx51qM1QgewbMzsO57MDQrB9DeC8ObHd5Xyb2eiNiQQxmtgkZBr+OuZwV7BIcF1RgbF1wTE2MbEccX48VY9sMzBilhTke/m+b1Q9lefpgruI7OsYhIGzPJuAnn3+6EM7vu1YKl5vkrYRSfCGZivCxoMklk7M5j6eFlXyRNRSAvAoFf0Fo/CC7LkbwuQjsZsf0qwbuI9wlYgIqGl6NEzDYibJ/gbmAfxpKoMDRcaSK8xZzjZbK+NMhSYoNSngV5e+ksKXp+JdqZwDDmdpSbsqa9gNAyi5JalSGv3iQ+dtNK9HvI+sNBh2AhvNYVaH+msRtAdsa4ehoKCuQsoP2PY8+kCSkJ5IRWjD2FZ4/BOFSWYv9KhI40eUKQ31CqdnLM1c1OOnHufKSS+iEZvgHNhji8g1yT5VZKdCnHO+ajvrYK/I7GkuSWjwvq8Mzl5M79NF89aCHaBNXmWv0sQVUUZEDwsDnbr2i9s31lKyGuFs7kITEBC+snJaToRYIFrRN0CeYgoSUxJ23WUm85eB41b4IUuIvGtyIMVMY8lzHGMoYworF7J/onKYk10/wVQL6zXYr+Zq9aOEhuXZPDel8Q3If+HMzTeHPKlForEK+mws3SMfH2UMxNLIJbasKsoIQZwVLbKFkN4T1C3HxUMM+EkCAfgjwNV40Y304KriOP+jIHqa+jvYsqnH/kqzKaGCFZxWnoTqooxky8+dHUsOGwr6Jfi9jmlVVTHGLnot9L3qGWqCToWYYRF5fDa96mtb93yI2g8B0IWxlTHWmi/hOG00D4mEJfJ5WhQX7ySCsDUaNkJeud6+sgXCgs3OLEG2/xdZSh10JBeiB1zQU0HuQdSib60rOdm9kFToIL8hmFiRHzm1YM0ylet5k579OFpcUk4I0oD0/QuJ7tcBy5ERY6ixdrh5uNwN3bUV7pd4E+J958HmPxbRRydqK/in6vxj5aH9din/udEPKtuU1GqBL24xJwBInQWiHLcbQVKBXTzkVpHP0tWHsIpZUqPeWcrScXuX2ImRkqlufBhdQl15ig3WLKH0/SeAmtD+uprBqGx1ThhlWN5LKG4lcNKbDblFDjVDU0wWK7yLLqnPNk6cKzKsfXt1GQmQQXSuwymhd3tr9lkvPJMUV152kc2PsmUEMxtzdPRk2SNX2DQ3Vhr02Is18LXjHVAz/bG0NEK8johus3UBiwyfkYlDkQUyLaWrwZBrbHMaB8Z3PJLaYk8IIVKKPSJdxbv3ecynNzLNr33GKLV1YVW4IyfycXfq0UxP4X5HJdXCqX4Wv0AKqYkkipw4LG6WxMNi+GhAtH+M/K8wVe7wuWvwQYAFT+UsGCXmX3AAAAAElFTkSuQmCC</file>
<align>right</align><!--Horizontal alignment: left, center, right-->
<vAlign>top</vAlign><!--Vertical alignment: top, middle, bottom-->
<offsetX>-100</offsetX><!--Horizontal offset, in pixels-->
<offsetY>10</offsetY><!--Vertical offset, in pixels-->
</logo>
<advertisement><!--Advertisement-related settings-->
<time>5</time><!--Default ad playback duration, and the default duration of each ad when several are configured, in seconds-->
<method>get</method><!--Default request method for ad tracking URLs: get/post-->
<videoForce>false</videoForce><!--Whether video ads must play through to the end-->
<videoVolume>0.8</videoVolume><!--Ad video volume-->
<skipButtonShow>true</skipButtonShow><!--Whether to show the skip-ad button-->
<linkButtonShow>true</linkButtonShow><!--Whether to show the ad link button; even when enabled it only appears if an ad link URL is provided-->
<muteButtonShow>true</muteButtonShow><!--Whether to show the mute button-->
<closeButtonShow>true</closeButtonShow><!--Whether to show the close-ad button for pause ads-->
<closeOtherButtonShow>true</closeOtherButtonShow><!--Whether other ad types need a close-ad button-->
<frontSkipButtonDelay>0</frontSkipButtonDelay><!--Delay, in seconds, before the skip button appears on pre-roll ads-->
<insertSkipButtonDelay>0</insertSkipButtonDelay><!--Delay, in seconds, before the skip button appears on mid-roll ads-->
<endSkipButtonDelay>0</endSkipButtonDelay><!--Delay, in seconds, before the skip button appears on post-roll ads-->
<!--Ad scaling mode: 0=original size, 1=auto scale, 2=scale only when the ad is wider or taller than the player, 3=match player width and height, 4=match player width with automatic height, 5=match player height with automatic width-->
<frontStretched>1</frontStretched>
<insertStretched>2</insertStretched>
<pauseStretched>2</pauseStretched>
<endStretched>2</endStretched>
</advertisement>
<video><!--Default video aspect ratio-->
<defaultWidth>4</defaultWidth>
<defaultHeight>3</defaultHeight>
</video>
</style>
<!--Supplements the front-end flashvars-->
<flashvars></flashvars>
</ckplayer>
\ No newline at end of file
Copyright (c) 2017 Dailymotion (http://www.dailymotion.com)
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
src/remux/mp4-generator.js and src/demux/exp-golomb.js implementation in this project
are derived from the HLS library for video.js (https://github.com/videojs/videojs-contrib-hls)
That work is also covered by the Apache 2 License, following copyright:
Copyright (c) 2013-2015 Brightcove
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
<?xml version="1.0" encoding="utf-8"?>
<language>
<adCountdown>[$second]</adCountdown><!--Countdown shown while an ad finishes playing-->
<skipDelay>[$second]</skipDelay>
<buttonOver>
<play>点击播放</play>
<pause>暂停播放</pause>
<mute>静音</mute>
<escMute>恢复音量</escMute>
<full>全屏</full>
<escFull>退出全屏</escFull>
<previousPage>上一集</previousPage>
<nextPage>下一集</nextPage>
<definition>点击选择清晰度</definition>
</buttonOver>
<volumeSliderOver>
音量:[$volume]%
</volumeSliderOver>
<buffer>[$percentage]%</buffer>
<timeSliderOver><!--Time format shown when hovering over the progress bar-->
[$timeh]:[$timei]:[$times]
</timeSliderOver>
<liveAndVod>
[$timeh]:[$timei]:[$times]
</liveAndVod>
<live>
直播中 [$liveTimeY]-[$liveTimem]-[$liveTimed] [$liveTimeh]:[$liveTimei]:[$liveTimes]
</live>
<m3u8Definition>
<name>流畅</name>
<name>低清</name>
<name>标清</name>
<name>高清</name>
<name>超清</name>
<name>蓝光</name>
<name>未知</name>
</m3u8Definition>
<error>
<cannotFindUrl>视频地址不存在</cannotFindUrl>
<streamNotFound>加载失败</streamNotFound>
<formatError>视频格式错误</formatError>
</error>
<definition>自动</definition>
</language>
\ No newline at end of file
# PaddleHub Multi-Label Classification
This example shows how to use the PaddleHub Fine-tune API with a BERT pre-trained model to perform multi-label classification on the Toxic dataset.
Multi-label classification generalizes multi-class classification. Multi-class classification is a single-label problem in which each sample is assigned to exactly one of more than two classes; in a multi-label problem, a sample may be assigned to any number of classes.
As illustrated below:
<p align="center">
<img src="../../docs/imgs/multi-label-cls.png" hspace='10'/> <br />
</p>
*Image source: https://mc.ai/building-a-multi-label-text-classifier-using-bert-and-tensorflow/*
## How to Start Fine-tuning
After installing PaddlePaddle and PaddleHub, run the script `sh run_classifier.sh` to start fine-tuning BERT on the Toxic dataset.
The script parameters are described below; a sample invocation follows the list.
```bash
# Model-related
--batch_size: batch size; adjust it according to available GPU memory, and lower it if you run out of memory;
--use_gpu: whether to fine-tune on GPU; defaults to False;
--learning_rate: the maximum learning rate used during fine-tuning;
--weight_decay: regularization strength used to prevent overfitting; defaults to 0.01;
--warmup_proportion: proportion of training steps used for learning-rate warmup; with 0.1, the learning rate grows from 0 to learning_rate over the first 10% of training steps and then decays slowly; defaults to 0;
--num_epoch: number of fine-tuning epochs;
--max_seq_len: maximum sequence length for the ERNIE/BERT model, at most 512; lower it if you run out of GPU memory;
# Task-related
--checkpoint_dir: directory where models are saved; PaddleHub automatically keeps the model that performs best on the validation set;
```
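For reference, a fully spelled-out invocation might look like the sketch below. The entry-script name `run_classifier.py` and the exact flag values are assumptions based on the defaults described in this README; check `run_classifier.sh` for the command this demo actually runs.
```shell
CKPT_DIR="./ckpt_toxic"
python run_classifier.py \
    --batch_size=32 \
    --use_gpu=True \
    --learning_rate=5e-5 \
    --weight_decay=0.01 \
    --warmup_proportion=0.0 \
    --num_epoch=3 \
    --max_seq_len=128 \
    --checkpoint_dir=${CKPT_DIR}
```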
## Code Walkthrough
Fine-tuning with the PaddleHub Fine-tune API takes four steps:
### Step 1: Load a pre-trained model
```python
import paddlehub as hub
module = hub.Module(name="ernie_v2_eng_base")
inputs, outputs, program = module.context(trainable=True, max_seq_len=128)
```
The maximum sequence length `max_seq_len` is a tunable parameter; 128 is the recommended value. You may adjust it to match the typical text length of your task, but it must not exceed 512.
PaddleHub also provides BERT and other models to choose from; the corresponding loading calls are listed below, and a usage sketch follows the table:
Model name                         | PaddleHub Module
---------------------------------- | :------:
ERNIE, Chinese | `hub.Module(name='ernie')`
ERNIE tiny, Chinese | `hub.Module(name='ernie_tiny')`
ERNIE 2.0 Base, English | `hub.Module(name='ernie_v2_eng_base')`
ERNIE 2.0 Large, English | `hub.Module(name='ernie_v2_eng_large')`
BERT-Base, Uncased | `hub.Module(name='bert_uncased_L-12_H-768_A-12')`
BERT-Large, Uncased | `hub.Module(name='bert_uncased_L-24_H-1024_A-16')`
BERT-Base, Cased | `hub.Module(name='bert_cased_L-12_H-768_A-12')`
BERT-Large, Cased | `hub.Module(name='bert_cased_L-24_H-1024_A-16')`
BERT-Base, Multilingual Cased      | `hub.Module(name='bert_multi_cased_L-12_H-768_A-12')`
BERT-Base, Chinese | `hub.Module(name='bert_chinese_L-12_H-768_A-12')`
BERT-wwm, Chinese | `hub.Module(name='bert_wwm_chinese_L-12_H-768_A-12')`
BERT-wwm-ext, Chinese | `hub.Module(name='bert_wwm_ext_chinese_L-12_H-768_A-12')`
RoBERTa-wwm-ext, Chinese | `hub.Module(name='roberta_wwm_ext_chinese_L-12_H-768_A-12')`
RoBERTa-wwm-ext-large, Chinese | `hub.Module(name='roberta_wwm_ext_chinese_L-24_H-1024_A-16')`
See the [PaddleHub website](https://www.paddlepaddle.org.cn/hub) for more models.
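Switching to one of these models only means changing the module name; everything downstream stays the same. A minimal sketch, using the English BERT-Base module from the table above:
```python
import paddlehub as hub

# Load English BERT-Base (uncased) instead of ERNIE 2.0; the rest of the pipeline is unchanged.
module = hub.Module(name="bert_uncased_L-12_H-768_A-12")
inputs, outputs, program = module.context(trainable=True, max_seq_len=128)
```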
### Step 2: Prepare the dataset and preprocess it with a tokenizer
```python
tokenizer = hub.BertTokenizer(vocab_file=module.get_vocab_path())
dataset = hub.dataset.Toxic(
tokenizer=tokenizer, max_seq_len=128)
```
If you are using the ernie_tiny pre-trained model, use ErnieTinyTokenizer instead:
```python
tokenizer = hub.ErnieTinyTokenizer(
vocab_file=module.get_vocab_path(),
spm_path=module.get_spm_path(),
word_dict_path=module.get_word_dict_path())
```
ErnieTinyTokenizer differs from BertTokenizer in that it tokenizes at word granularity; see this [article](https://www.jiqizhixin.com/articles/2019-11-06-9) for details.
The dataset preparation code can be found in [toxic.py](../../paddlehub/dataset/toxic.py).
`hub.dataset.Toxic()` automatically downloads the dataset and extracts it into the `$HOME/.paddlehub/dataset` directory under the user's home directory;
`module.get_vocab_path()` returns the vocabulary file of the pre-trained model;
`module.sp_model_path` returns the path of the subword segmentation model when the module is ernie_tiny, and None otherwise;
`module.word_dict_path` returns the path of the word segmentation dictionary when the module is ernie_tiny, and None otherwise;
`max_seq_len` must be consistent with the sequence length passed to the context interface in Step 1;
The dataset calls the encode interface of the provided tokenizer to preprocess the full dataset; you can inspect the processing as follows:
```python
single_result = tokenizer.encode(text="hello", text_pair="world", max_seq_len=10) # BertTokenizer
# {'input_ids': [3, 1, 5, 39825, 5, 0, 0, 0, 0, 0], 'segment_ids': [0, 0, 0, 1, 1, 0, 0, 0, 0, 0], 'seq_len': 5, 'input_mask': [1, 1, 1, 1, 1, 0, 0, 0, 0, 0], 'position_ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
dataset_result = dataset.get_dev_records() # set dataset max_seq_len = 10
# {'input_ids': [101, 2233, 2289, 1006, 11396, 1007, 2003, 3746, 1024, 102], 'segment_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'seq_len': 10, 'input_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'position_ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 'label': [0, 0, 0, 0, 0, 0]}
```
#### Custom datasets
To run transfer learning on your own dataset, see [custom datasets](../../docs/tutorial/how_to_load_data.md).
### Step 3: Choose an optimization strategy and run configuration
```python
strategy = hub.AdamWeightDecayStrategy(
learning_rate=5e-5,
weight_decay=0.01,
warmup_proportion=0.0,
lr_scheduler="linear_decay",
)
config = hub.RunConfig(use_cuda=True, use_data_parallel=True, use_pyreader=True, num_epoch=3, batch_size=32, strategy=strategy)
```
#### Optimization strategy
For ERNIE/BERT-style tasks, PaddleHub provides `AdamWeightDecayStrategy`, a transfer-learning optimization strategy suited to this kind of task.
* `learning_rate`: the maximum learning rate during fine-tuning;
* `weight_decay`: the regularization parameter, 0.01 by default; increase it if the model tends to overfit;
* `warmup_proportion`: if warmup_proportion > 0, e.g. 0.1, the learning rate grows linearly up to its peak value learning_rate over the first 10% of steps;
* `lr_scheduler`: two schedules are available: with `linear_decay` the learning rate decays linearly after reaching its peak, while with `noam_decay` it decays polynomially after the peak;
PaddleHub offers many optimization strategies, such as `AdamWeightDecayStrategy`, `ULMFiTStrategy` and `DefaultFinetuneStrategy`; see [strategies](../../docs/reference/strategy.md) for details.
#### Run configuration
`RunConfig` controls the fine-tuning run and exposes the following parameters; a fuller example is sketched after the list:
* `log_interval`: interval at which progress logs are printed, every 10 steps by default;
* `eval_interval`: interval at which the model is evaluated on the validation set, every 100 steps by default;
* `save_ckpt_interval`: interval at which checkpoints are saved; configure it according to the size of the task. By default only the best model on the validation set and the model at the end of training are saved;
* `use_cuda`: whether to train on GPU, False by default;
* `use_pyreader`: whether to use pyreader, False by default;
* `use_data_parallel`: whether to use data-parallel (multi-card) training, False by default; enabling it requires the nccl library;
* `checkpoint_dir`: directory where model checkpoints are saved; generated automatically if not specified;
* `num_epoch`: number of fine-tuning epochs;
* `batch_size`: training batch size; when training on GPU, adjust it according to the available memory;
* `enable_memory_optim`: whether to enable memory optimization, True by default;
* `strategy`: the fine-tuning optimization strategy;
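A more explicit configuration spelling out these parameters might look like the sketch below; the values are illustrative only, and every keyword corresponds to a parameter listed above.
```python
config = hub.RunConfig(
    log_interval=10,          # print progress logs every 10 steps
    eval_interval=100,        # evaluate on the validation set every 100 steps
    use_cuda=True,            # train on GPU
    use_pyreader=True,
    use_data_parallel=True,   # multi-card training; requires the nccl library
    checkpoint_dir="./ckpt_toxic",
    num_epoch=3,
    batch_size=32,
    enable_memory_optim=True,
    strategy=strategy)
```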
### Step 4: Build the network, create the classification transfer task, and fine-tune
```python
pooled_output = outputs["pooled_output"]
multi_label_cls_task = hub.MultiLabelClassifierTask(
dataset=dataset,
feature=pooled_output,
num_classes=dataset.num_labels,
config=config)
multi_label_cls_task.finetune_and_eval()
```
**NOTE:**
1. `outputs["pooled_output"]` returns the [CLS] vector produced by the ERNIE/BERT model, which can be used as the feature representation of a sentence or sentence pair.
2. Given the input feature, the labels, and the number of target classes, `hub.MultiLabelClassifierTask` builds a transfer task `MultiLabelClassifierTask` suited to multi-label classification.
#### Custom transfer tasks
To change how the transfer task builds its network, see [custom transfer tasks](../../docs/tutorial/how_to_define_task.md).
## Visualization
The Fine-tune API automatically records key training metrics during training; after starting the program, run the command below.
```bash
$ visualdl --logdir $CKPT_DIR/visualization --host ${HOST_IP} --port ${PORT_NUM}
```
Here ${HOST_IP} is the local IP address and ${PORT_NUM} is an available port. For example, if the local IP is 192.168.0.1 and the port is 8040, open 192.168.0.1:8040 in a browser to watch how the metrics change during training.
## Prediction
After fine-tuning finishes, the model that performs best on the validation set is automatically saved in the corresponding ckpt directory.
Configure the script parameters:
```shell
CKPT_DIR="./ckpt_toxic"
python predict.py --checkpoint_dir $CKPT_DIR --max_seq_len 128
```
Here CKPT_DIR is the directory where the Fine-tune API saved the best model, and max_seq_len is the maximum sequence length of the ERNIE model; *keep these consistent with the values used during training*.
Once the parameters are set correctly, run `sh run_predict.sh` to see the text classification predictions and the final accuracy.
For more details of the prediction workflow, refer to `predict.py`; a minimal sketch is given below.
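The sketch below outlines what such a prediction script might look like. It is illustrative only: it assumes the module, tokenizer, dataset, and task are rebuilt exactly as in Steps 1-4, and the `predict(data=..., return_result=True)` call is an assumption about the Task interface rather than the exact signature used by `predict.py`.
```python
# Rebuild module, tokenizer, dataset, and task as in Steps 1-4, but point the
# run configuration at the checkpoint directory saved during fine-tuning.
config = hub.RunConfig(use_cuda=True, checkpoint_dir="./ckpt_toxic", batch_size=32, strategy=strategy)
multi_label_cls_task = hub.MultiLabelClassifierTask(
    dataset=dataset,
    feature=pooled_output,
    num_classes=dataset.num_labels,
    config=config)

# Hypothetical prediction call: feed raw text and read back per-label predictions.
data = [["This comment is perfectly polite."]]
results = multi_label_cls_task.predict(data=data, return_result=True)
print(results)
```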
## Online Demos
We provide IPython Notebook demos on AI Studio that you can try online:
|Pre-trained model|Task type|Dataset|AI Studio link|Notes|
|-|-|-|-|-|
|ResNet|Image classification|DogCat dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/147010)||
|ERNIE|Text classification|ChnSentiCorp Chinese sentiment dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/147006)||
|ERNIE|Text classification|THUNEWS Chinese news dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/221999)|Shows how to load a custom dataset and complete text classification transfer learning with the Fine-tune API.|
|ERNIE|Sequence labeling|MSRA_NER Chinese sequence labeling dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/147009)||
|ERNIE|Sequence labeling|Express Chinese waybill dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/184200)|Shows how to load a custom dataset and complete sequence labeling transfer learning with the Fine-tune API.|
|ERNIE Tiny|Text classification|ChnSentiCorp Chinese sentiment dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/186443)||
|Senta|Text classification|ChnSentiCorp Chinese sentiment dataset|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/216846)|Shows how to complete sentiment classification transfer learning with Senta and the Fine-tune API.|
|Senta|Sentiment analysis prediction|N/A|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/215814)||
|LAC|Lexical analysis|N/A|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/215711)||
|Ultra-Light-Fast-Generic-Face-Detector-1MB|Face detection|N/A|[Try it](https://aistudio.baidu.com/aistudio/projectdetail/215962)||
## Hyperparameter Optimization with AutoDL Finetuner
PaddleHub also provides hyperparameter tuning, which automatically searches for the best model hyperparameters to improve results. See the [AutoDL Finetuner tutorial](../../docs/tutorial/autofinetune.md) for details.