Unverified commit 4c512e51, authored by cc, committed by GitHub

Add PP-HumanSegV2 Chinese Docs (#5568)

Parent dde08b36
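## 1. Inference Benchmark
### 1.1 Software and Hardware Environment
* Portrait-model accuracy (mIoU): measured on the PP-HumanSeg14K dataset at each model's best input size, without multi-scale or flip augmentation.
* Portrait-model inference speed on the Snapdragon 855 ARM CPU: measured with the PaddleLite inference library on a Xiaomi 9 phone (Snapdragon 855 CPU), single thread, big core, at each model's best input size.
* Portrait-model inference speed on the Intel CPU: measured with the Paddle Inference library on an Intel(R) Xeon(R) Gold 6271C CPU.
### 1.2 Dataset
* All tests use the PP-HumanSeg14K dataset open-sourced by the PaddleSeg team.
### 1.3 Metrics
Single SOTA models: speed measured on the Snapdragon 855 ARM CPU.
Model | Input resolution | Accuracy mIoU (%) | Speed (FPS)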
---|---|---|---
PortraitNet | 224x224 | 95.20 |31.93
SINet | 224x224 | 93.76 | 78.12
PP-HumanSegV2 | 224x224 | 95.21 | 52.27
PP-HumanSegV2 | 192x192 | 94.87 | 81.96
Segmentation solutions: speed measured on the Snapdragon 855 ARM CPU.
Model | Input resolution | Accuracy mIoU (%) | ARM CPU speed (FPS)
---|---|---|---
PP-HumanSegV1 | 398x224 | 93.60 | 33.69
PP-HumanSegV2 | 398x224 | 97.50 | 35.23
PP-HumanSegV2 | 256x144 | 96.63 | 63.05
Segmentation solutions: speed measured on the Intel CPU.
Model | Input resolution | Accuracy mIoU (%) | Intel CPU speed (FPS)
---|---|---|---
PP-HumanSegV1 | 398x224 | 93.60 | 36.02
PP-HumanSegV2 | 398x224 | 97.50 | 60.75
PP-HumanSegV2 | 256x144 | 96.63 | 70.67
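
As a quick sanity check on how the speed figures relate to each other, the sketch below converts FPS to per-frame latency and computes the relative gain of PP-HumanSegV2 (256x144) over PP-HumanSegV1 (398x224) on the ARM CPU. It uses only numbers from the tables above; the helper names `fps_to_ms` and `relative_speedup` are illustrative and are not part of PaddleSeg.

```python
def fps_to_ms(fps: float) -> float:
    """Convert throughput in frames per second to latency in ms per frame."""
    return 1000.0 / fps


def relative_speedup(new_fps: float, old_fps: float) -> float:
    """Relative FPS gain of the new model over the old one, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# ARM CPU figures taken from the segmentation-solution table above.
V1_FPS = 33.69  # PP-HumanSegV1, 398x224
V2_FPS = 63.05  # PP-HumanSegV2, 256x144

v1_ms = fps_to_ms(V1_FPS)
v2_ms = fps_to_ms(V2_FPS)
speedup = relative_speedup(V2_FPS, V1_FPS)
miou_gain = 96.63 - 93.60  # difference in percentage points

print(f"V1 latency: {v1_ms:.2f} ms, V2 latency: {v2_ms:.2f} ms")
print(f"Speedup: {speedup:.2f}%, mIoU gain: {miou_gain:.2f} points")
```

With these inputs the latencies come out to roughly 29.68 ms and 15.86 ms, which agree with the mobile inference times listed for the same two portrait models in the model list, and the speedup comes out to roughly 87.15%.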
## 2. Usage Instructions
1. https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.6/contrib/PP-HumanSeg
\ No newline at end of file
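# Model List
## 1 Portrait Segmentation Models
For real-time half-body portrait segmentation in scenarios such as mobile video calls and web video conferencing, PP-HumanSeg provides self-developed portrait segmentation models. These models work out of the box and can be integrated into products at no cost.
| Model | Best input size | Accuracy mIoU (%) | Mobile inference time (ms) | Model size (MB) | Config | Download links |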
| --- | --- | --- | ---| --- | --- | --- |
| PP-HumanSegV1-Lite | 398x224 | 93.60 | 29.68 | 2.3 | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/portrait_pp_humansegv1_lite.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/portrait_pp_humansegv1_lite_398x224_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/portrait_pp_humansegv1_lite_398x224_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/portrait_pp_humansegv1_lite_398x224_inference_model_with_softmax.zip) |
| PP-HumanSegV2-Lite | 256x144 | 96.63 | 15.86 | 5.4 | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/portrait_pp_humansegv2_lite.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/portrait_pp_humansegv2_lite_256x144_smaller/portrait_pp_humansegv2_lite_256x144_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/portrait_pp_humansegv2_lite_256x144_smaller/portrait_pp_humansegv2_lite_256x144_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/portrait_pp_humansegv2_lite_256x144_smaller/portrait_pp_humansegv2_lite_256x144_inference_model_with_softmax.zip) |
<details><summary>Table notes:</summary>
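* Portrait-model accuracy (mIoU): measured on the PP-HumanSeg14K dataset at each model's best input size, without multi-scale or flip augmentation.
* Portrait-model inference time: measured with the [PaddleLite](https://www.paddlepaddle.org.cn/lite) inference library on a Xiaomi 9 phone (Snapdragon 855 CPU), single thread, big core, at each model's best input size.
* The best input sizes have a 16:9 aspect ratio, matching the capture aspect ratio of phone and computer cameras.
* Checkpoint is the model weights; combined with the model config file, it can be used for fine-tuning.
* Inference Model is the exported prediction model and can be deployed directly.
* Inference Model (Argmax) ends with an Argmax operator and outputs a single-channel prediction (int64), where portrait pixels are 1 and background pixels are 0.
* Inference Model (Softmax) ends with a Softmax operator and outputs a single-channel prediction (float32), where each pixel value is the probability that the pixel belongs to the portrait.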
</details>
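
To illustrate the practical difference between the two exported outputs, the sketch below blends one foreground pixel with a replacement background using a hard 0/1 label (what an Argmax model produces) and a soft probability (what a Softmax model produces). It is a minimal pure-Python illustration on a single pixel; `blend_pixel` and all sample values are hypothetical and do not come from PaddleSeg, and real code would vectorize this over the whole image.

```python
def blend_pixel(fg, bg, alpha):
    """Alpha-blend one RGB pixel: alpha=1 keeps the foreground, alpha=0 keeps the background."""
    return tuple(round(alpha * f + (1.0 - alpha) * b) for f, b in zip(fg, bg))


person = (200, 150, 100)  # hypothetical pixel on the edge of the portrait
new_bg = (0, 0, 255)      # hypothetical replacement background pixel

# Softmax-style output: probability that the pixel belongs to the portrait.
soft_prob = 0.6
# Argmax-style output: for two classes this equals thresholding at 0.5.
hard_label = 1 if soft_prob >= 0.5 else 0

hard_result = blend_pixel(person, new_bg, float(hard_label))
soft_result = blend_pixel(person, new_bg, soft_prob)

print("hard mask:", hard_result)
print("soft mask:", soft_result)
```

The hard label keeps the foreground pixel unchanged, while the soft probability mixes it with the background, which is why the Softmax variant tends to give smoother edges when compositing.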
<details><summary>Usage notes:</summary>
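* The portrait segmentation models are highly specialized and work out of the box; using the best input size is recommended.
* When deploying on a phone, both landscape and portrait screen orientations occur. Rotate the image as needed so that the person is always upright, then feed the image (for example at 256x144 or 144x256) to the model for the best segmentation result.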
</details>
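
The orientation handling described in the usage notes can be sketched as follows: rotate a sideways frame so the person is upright, pick the input size that matches the resulting orientation, and rotate the predicted mask back afterwards. This is a pure-Python illustration on nested lists; all helper names are hypothetical, the required rotation direction depends on the device, and this is not the implementation used by the PP-HumanSeg demo script.

```python
def rotate_90_clockwise(image):
    """Rotate a row-major 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_90_counterclockwise(image):
    """Rotate a row-major 2-D image (list of rows) by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def pick_input_size(width, height, long_side=256, short_side=144):
    """Return the (width, height) model input size matching the frame orientation."""
    return (long_side, short_side) if width >= height else (short_side, long_side)


# A tiny stand-in for a frame that arrives sideways: 2 rows x 3 columns.
frame = [[1, 2, 3],
         [4, 5, 6]]

upright = rotate_90_clockwise(frame)      # assume the person is now upright
height, width = len(upright), len(upright[0])
input_size = pick_input_size(width, height)

# After inference, the mask would be rotated back to the original layout.
restored = rotate_90_counterclockwise(upright)

print(upright, input_size, restored == frame)
```

Choosing between 256x144 and 144x256 by orientation keeps the 16:9 aspect ratio of the best input size in both cases.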
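## 2 General Human Segmentation Models
For general human segmentation, we first built a large-scale human dataset, then trained PaddleSeg's SOTA models on it, and finally released several PP-HumanSeg general human segmentation models.
| Model | Best input size | Accuracy mIoU (%) | Mobile inference time (ms) | Server inference time (ms) | Config | Download links |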
| ----- | ---------- | ---------- | -----------------| ----------------- | ------- | ------- |
| PP-HumanSegV1-Lite | 192x192 | 86.02 | 12.3 | - | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/human_pp_humansegv1_lite.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_lite_192x192_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_lite_192x192_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_lite_192x192_inference_model_with_softmax.zip) |
| PP-HumanSegV2-Lite | 192x192 | 92.52 | 15.3 | - | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/human_pp_humansegv2_lite.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv2_lite_192x192_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv2_lite_192x192_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv2_lite_192x192_inference_model_with_softmax.zip) |
| PP-HumanSegV1-Mobile | 192x192 | 91.64 | - | 2.83 | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/human_pp_humansegv1_mobile.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_mobile_192x192_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_mobile_192x192_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_mobile_192x192_inference_model_with_softmax.zip) |
| PP-HumanSegV2-Mobile | 192x192 | 93.13 | - | 2.67 | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/human_pp_humansegv2_mobile.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv2_mobile_192x192_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv2_mobile_192x192_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv2_mobile_192x192_inference_model_with_softmax.zip) |
| PP-HumanSegV1-Server | 512x512 | 96.47 | - | 24.9 | [cfg](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/contrib/PP-HumanSeg/configs/human_pp_humansegv1_server.yml) | [Checkpoint](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_server_512x512_pretrained.zip) \| [Inference Model (Argmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_server_512x512_inference_model.zip) \| [Inference Model (Softmax)](https://paddleseg.bj.bcebos.com/dygraph/pp_humanseg_v2/human_pp_humansegv1_server_512x512_inference_model_with_softmax.zip) |
<details><summary>Table notes:</summary>
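* General human model accuracy (mIoU): after training on the large-scale human dataset, the models are evaluated on the small Supervisely Person dataset ([download link](https://paddleseg.bj.bcebos.com/humanseg/data/mini_supervisely.zip)).
* Mobile inference time: measured with the [PaddleLite](https://www.paddlepaddle.org.cn/lite) inference library on a Xiaomi 9 phone (Snapdragon 855 CPU), single thread, big core, at each model's best input size.
* Server inference time: measured with the [PaddleInference](https://www.paddlepaddle.org.cn/inference/product_introduction/inference_intro.html) inference library on a V100 GPU with TensorRT (TRT) enabled, at each model's best input size.
* Checkpoint is the model weights; combined with the model config file, it can be used for fine-tuning.
* Inference Model is the exported prediction model and can be deployed directly.
* Inference Model (Argmax) ends with an Argmax operator and outputs a single-channel prediction (int64), where human pixels are 1 and background pixels are 0.
* Inference Model (Softmax) ends with a Softmax operator and outputs a single-channel prediction (float32), where each pixel value is the probability that the pixel belongs to a person.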
</details>
<details><summary>Usage notes:</summary>
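* Because scenes vary widely in general human segmentation, evaluate the accuracy of the PP-HumanSeg general human segmentation models on your own scenario.
* If the accuracy meets your business requirements, the models can be used in products directly.
* If it does not, collect and annotate data, then fine-tune the open-source general human segmentation models.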
</details>
\ No newline at end of file
---
Model_Info:
  name: "PP-HumanSegV2"
  description: "PP-HumanSegV2实时人像分割SOTA方案"
  description_en: "PP-HumanSegV2 real-time portrait segmentation solution"
  icon: "@后续UE统一设计之后,会存到bos上某个位置"
  from_repo: "PaddleSeg"
Task:
- tag_en: "Computer Vision"
  tag: "计算机视觉"
  sub_tag_en: "Image Segmentation"
  sub_tag: "图像分割"
Example:
- tag_en: "internet"
  tag: "互联网"
  sub_tag_en: "Portrait Segmentation"
  title: "PP-HumanSegV2 SOTA人像分割方案"
  sub_tag: "人像分割"
  url: "https://aistudio.baidu.com/aistudio/projectdetail/4504982"
Datasets: "PP-HumanSeg14K,EG1800"
Pulisher: "Baidu"
License: "apache.2.0"
Paper: ""
IfTraining: 1
IfOnlineDemo: 1
{
"cells": [
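{
"cell_type": "markdown",
"id": "55903e0e-3e6d-430f-91b7-d270a953ffd7",
"metadata": {},
"source": [
"## 1. Introduction to PP-HumanSegV2\n",
"\n",
"Separating people from the background at the pixel level is a classic image segmentation task with a wide range of applications. The task generally falls into two categories: segmentation of half-body portraits, called portrait segmentation, and segmentation of both full-body and half-body people, called general human segmentation.\n",
"\n",
"For both tasks, PaddleSeg has released the PP-HumanSeg model series, which offers high segmentation accuracy, fast inference, and strong generality. The models work out of the box and can be deployed in products at no cost; they can also be fine-tuned on scenario-specific data for better results.\n",
"\n",
"In July 2022, PaddleSeg released the upgraded PP-HumanSegV2 human segmentation solution, which reaches 96.63% mIoU at 63 FPS on a mobile phone, setting a new SOTA among open-source human segmentation algorithms. Compared with PP-HumanSegV1, inference is 87.15% faster, mIoU is 3.03 percentage points higher, and the visual results are better. The V2 solution is comparable to paid commercial solutions while remaining free to use out of the box.\n",
"\n",
"PP-HumanSeg is officially produced by PaddlePaddle and developed by the PaddleSeg team. More about PaddleSeg is available at https://github.com/PaddlePaddle/PaddleSeg"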
]
},
{
"cell_type": "markdown",
"id": "ba317a85-c8a1-49bd-afa3-59bfad7e86c3",
"metadata": {},
"source": [
"## 2. Results and Application Scenarios\n",
"### 2.1 Portrait Segmentation and General Human Segmentation\n",
"\n",
"#### 2.1.1 Dataset\n",
"\n",
"The main dataset is PP-HumanSeg14K, split into a training set and a test set.\n",
"\n",
"#### 2.1.2 Results at a Glance\n",
"\n",
"PP-HumanSegV2 segmentation results on an image are shown below.\n",
"\n",
"Original image:\n",
"<div align=\"center\">\n",
"<img src=\"https://user-images.githubusercontent.com/48357642/200734740-72e98c73-5b41-47c4-b208-7fd9c10d8b8a.jpeg\" width = \"60%\" />\n",
"</div>\n",
"\n",
"\n",
"Segmented image:\n",
"<div align=\"center\">\n",
"<img src=\"https://user-images.githubusercontent.com/48357642/200735017-eb8a2b22-7ef9-4e4f-acc2-1ea538672f75.jpeg\" width = \"60%\" />\n",
"</div>\n"
]
},
{
"cell_type": "markdown",
"id": "0d9668da-d14a-4491-ab01-7b039399bff6",
"metadata": {},
"source": [
"## 3. How to Use the Model\n",
"\n",
"### 3.1 Model Inference\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "69e98dbf-a3ba-4d6c-8bde-b1abb469c7b2",
"metadata": {},
"source": [
"* Install PaddlePaddle\n",
"\n",
"PaddlePaddle >= 2.2.0 is required. Because image segmentation models are computationally expensive, the GPU version of PaddlePaddle is recommended.\n",
"\n",
"On AI Studio, you can directly select an environment with PaddlePaddle preinstalled.\n",
"To install PaddlePaddle yourself, see the [PaddlePaddle website](https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/pip/linux-pip.html).\n",
"\n",
"This tutorial was verified with PaddlePaddle 2.3.2.\n"
]
},
{
"cell_type": "markdown",
"id": "5e2e25c6-6943-4d6a-b809-5f49158a5619",
"metadata": {},
"source": [
"* Download PaddleSeg \n",
"\n",
"(When not running in a Jupyter Notebook, remove the leading \"!\" or \"%\".)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0c587ea0-52d6-48c2-b2cc-beae58b3262d",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"%cd ~\n",
"\n",
"!git clone https://gitee.com/PaddlePaddle/PaddleSeg.git"
]
},
{
"cell_type": "markdown",
"id": "1fc9e04c-070f-48c3-bd6f-1fcacafb4e1f",
"metadata": {},
"source": [
"* Install PaddleSeg"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5f7a0af6-4968-4966-b27e-130d70c94886",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"# Install PaddleSeg\n",
"%cd ~/PaddleSeg\n",
"!pip install -v -e ."
]
},
{
"cell_type": "markdown",
"id": "96376cd0-99d1-4f28-8521-94e541e065e2",
"metadata": {},
"source": [
"* Download data and models"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6d7f3c8f-bf9b-4a4a-b7dd-9374c4fd2c9c",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"%cd ~/PaddleSeg/contrib/PP-HumanSeg\n",
"!python src/download_inference_models.py\n",
"!python src/download_data.py"
]
},
{
"cell_type": "markdown",
"id": "9c67ff4a-bf96-4bf2-87c2-992b8d3bacd7",
"metadata": {},
"source": [
"* Quick start\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a96627b-2b2c-40eb-b13a-e1eb82a707df",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"!python src/seg_demo.py \\\n",
" --config inference_models/portrait_pp_humansegv2_lite_256x144_inference_model_with_softmax/deploy.yaml \\\n",
" --img_path data/images/portrait_heng.jpg \\\n",
" --bg_img_path data/images/bg_2.jpg \\\n",
" --save_dir data/images_result/portrait_heng_v2_withbg.jpg\n",
"\n",
"!python src/seg_demo.py \\\n",
" --config inference_models/portrait_pp_humansegv2_lite_256x144_inference_model_with_softmax/deploy.yaml \\\n",
" --img_path data/images/portrait_shu.jpg \\\n",
" --bg_img_path data/images/bg_1.jpg \\\n",
" --save_dir data/images_result/portrait_shu_v2_withbg.jpg \\\n",
" --vertical_screen"
]
},
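{
"cell_type": "markdown",
"id": "bd5a6d38-a18d-4bb2-ae0d-9397ba7c172e",
"metadata": {},
"source": [
"The results are saved to `data/images_result/portrait_heng_v2_withbg.jpg` and `data/images_result/portrait_shu_v2_withbg.jpg` (an example is shown below).\n",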
"\n",
"<img src=\"https://user-images.githubusercontent.com/52520497/188776878-130f4f6a-6379-4fb0-87e4-9a7ee4707c1d.jpg\" width=\"200\"> \n"
]
},
{
"cell_type": "markdown",
"id": "f6aeb78d-63ff-4e01-aa50-da37522e0b08",
"metadata": {},
"source": [
"### 3.2 Model Training\n"
]
},
{
"cell_type": "markdown",
"id": "4da9e355-e99a-4821-9fa2-1571a7b84f97",
"metadata": {},
"source": [
"* Preparation\n",
"\n",
"Following the steps above, install PaddleSeg and download the dataset, then download the pretrained weights."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5945d193-bae0-4c5f-bd43-59e205b0484a",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"!python src/download_pretrained_models.py"
]
},
{
"cell_type": "markdown",
"id": "d8ba1500-8475-40e0-a257-9ddad436c14d",
"metadata": {},
"source": [
"* Training\n",
"\n",
"The config files are stored in the `./configs` directory, as listed below. Each config file already sets the path to the pretrained weights via `pretrained`.\n",
"```\n",
"configs\n",
"├── human_pp_humansegv1_lite.yml\n",
"├── human_pp_humansegv2_lite.yml\n",
"├── human_pp_humansegv1_mobile.yml\n",
"├── human_pp_humansegv2_mobile.yml\n",
"├── human_pp_humansegv1_server.yml\n",
"├── portrait_pp_humansegv1_lite.yml\n",
"└── portrait_pp_humansegv2_lite.yml\n",
"```\n",
"\n",
"Run the following command to fine-tune the model (adjust the hyperparameters in the config file to suit your situation). For detailed training documentation, see [this link](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/train/train_cn.md)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14461b15-5c88-453a-b877-739ddf0062ce",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"# Make one GPU visible. Outside a notebook, run `export CUDA_VISIBLE_DEVICES=0` (Linux) or `set CUDA_VISIBLE_DEVICES=0` (Windows) instead.\n",
"%env CUDA_VISIBLE_DEVICES=0\n",
"!python ../../train.py \\\n",
" --config configs/human_pp_humansegv2_lite.yml \\\n",
" --save_dir output/human_pp_humansegv2_lite \\\n",
" --save_interval 100 --do_eval --use_vdl"
]
},
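{
"cell_type": "markdown",
"id": "6c44f772-2104-4379-bd66-d6f9b8d72414",
"metadata": {},
"source": [
"* Evaluation\n",
"\n",
"Run the following command to load the model and the trained weights, evaluate the model, and print the accuracy on the validation set. For detailed evaluation documentation, see [this link](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/evaluation/evaluate/evaluate_cn.md)."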
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ff7c87f-62b4-4377-a7fd-260b8278e6c4",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!python ../../val.py \\\n",
" --config configs/human_pp_humansegv2_lite.yml \\\n",
" --model_path pretrained_models/human_pp_humansegv2_lite_192x192_pretrained/model.pdparams"
]
},
{
"cell_type": "markdown",
"id": "e346d381-15df-421f-8473-9e96e39bbd0c",
"metadata": {},
"source": [
"* Prediction\n",
"\n",
"Run the following command to load the model and the trained weights and predict on a single image. The results are saved in the `added_prediction` and `pseudo_color_prediction` folders under `./data/images_result`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b5b53557-9dbe-472d-8406-c60c2c58927d",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [],
"source": [
"!python ../../predict.py \\\n",
" --config configs/human_pp_humansegv2_lite.yml \\\n",
" --model_path pretrained_models/human_pp_humansegv2_lite_192x192_pretrained/model.pdparams \\\n",
" --image_path data/images/human.jpg \\\n",
" --save_dir ./data/images_result"
]
},
{
"cell_type": "markdown",
"id": "d98d8455-6e84-493a-b823-428aab76e932",
"metadata": {},
"source": [
"* Export\n",
"\n",
"Run the following command to load the model and the trained weights and export the inference model. For detailed export documentation, see [this link](https://github.com/PaddlePaddle/PaddleSeg/blob/release/2.6/docs/model_export_cn.md)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8fe76c54-aafe-43e0-a0cc-b8feb49bb770",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!python ../../export.py \\\n",
" --config configs/human_pp_humansegv2_lite.yml \\\n",
" --model_path pretrained_models/human_pp_humansegv2_lite_192x192_pretrained/model.pdparams \\\n",
" --save_dir output/human_pp_humansegv2_lite \\\n",
" --without_argmax \\\n",
" --with_softmax"
]
},
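{
"cell_type": "markdown",
"id": "629a50ee-3e33-42a5-b505-aacc4ccd220f",
"metadata": {},
"source": [
"Note: with `--without_argmax --with_softmax`, the exported model ends with a Softmax operator instead of an Argmax operator. The output is therefore floating point and represents the foreground probability, which gives smoother edges when blending images."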
]
},
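{
"cell_type": "markdown",
"id": "6a6fd1d6-a7a5-4388-b225-86b62ab72d67",
"metadata": {},
"source": [
"## 4. Model Principles\n",
"\n",
"The model structure is shown below.\n",
"<div align=\"center\">\n",
"<img src=\"https://user-images.githubusercontent.com/48357642/200757494-1e63215e-4cd1-4c39-8dd9-a0e37f8719f2.png\" width = \"60%\" />\n",
"</div>\n",
"\n",
"* Greatly reduced model computation\n",
"\n",
"For the encoder, we use MobileNetV3 as the backbone to extract multi-level features. Analysis shows that most of MobileNetV3's parameters are concentrated in the last stage, so we keep only the first four stages, which cuts the parameter count by 68.6% without hurting segmentation accuracy. For the context module, we use the lightweight SPPM module proposed in PP-LiteSeg and replace its regular convolutions with separable convolutions to further reduce computation. For the decoder, we design three Fusion modules that repeatedly fuse deep semantic features with shallow detail features; the last Fusion module aggregates the feature maps from different levels once more and outputs the segmentation result.\n",
"\n",
"Multi-level feature fusion module:\n",
"<div align=\"center\">\n",
"<img src=\"https://user-images.githubusercontent.com/48357642/200758284-a8d5e6f9-1a66-414b-804c-57c6b6fcd698.png\" width = \"30%\" />\n",
"</div>\n",
"\n",
"* Two-stage training to improve segmentation accuracy\n",
"\n",
"Two-stage training is based on transfer learning: the model is first trained on a large-scale mixed human dataset (100k+ images), and the resulting pretrained weights are then used for training on the PP-HumanSeg14K dataset (14k images) to obtain the final model. This makes full use of other datasets and improves both segmentation accuracy and generalization.\n",
"\n",
"* Adjusted image resolution to improve inference speed\n",
"\n",
"Image resolution also directly affects inference speed. We trained and tested with several image resolutions and chose the best one for the PP-HumanSegV2 solution, which further improves inference speed.\n",
"\n",
"* Morphological post-processing to improve visual results\n",
"\n",
"First obtain the raw prediction image I, then apply thresholding, erosion, and dilation to obtain a mask image M, and finally multiply I by M to produce the final prediction image O.\n"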
]
},
{
"cell_type": "markdown",
"id": "dc360fb0-fb54-4266-a4e3-4ae385d43a76",
"metadata": {},
"source": [
"## 5. Related Papers and Citation\n",
"If our project helps your research, please consider citing:\n",
"\n",
"```\n",
"@InProceedings{Chu_2022_WACV,\n",
" author = {Chu, Lutao and Liu, Yi and Wu, Zewu and Tang, Shiyu and Chen, Guowei and Hao, Yuying and Peng, Juncai and Yu, Zhiliang and Chen, Zeyu and Lai, Baohua and Xiong, Haoyi},\n",
" title = {PP-HumanSeg: Connectivity-Aware Portrait Segmentation With a Large-Scale Teleconferencing Video Dataset},\n",
" booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},\n",
" month = {January},\n",
" year = {2022},\n",
" pages = {202-209}\n",
"}\n",
"```"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "py35-paddle1.2.0"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}