diff --git a/README.md b/README.md
index 4105c32ad8ec0462b16ea365937bf348bc45b903..644769e6dce6a7133e1532379ce36a8a78f569f4 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,7 @@
[](https://github.com/PaddlePaddle/PaddleX/releases)


+
PaddleX是基于飞桨核心框架、开发套件和工具组件的深度学习全流程开发工具。具备**全流程打通**、**融合产业实践**、**易用易集成**三大特点。
diff --git a/docs/FAQ.md b/docs/FAQ.md
index b120ebd10ed791c65c3f65e611c5b45da2a9211f..e25faab5ad9e230f34f1790db0dcf24fba3328e6 100755
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@@ -13,7 +13,7 @@
> 可以使用模型裁剪,参考文档[模型裁剪使用教程](slim/prune.md),通过调整裁剪参数,可以控制模型裁剪后的大小,在实际实验中,如VOC检测数据,使用yolov3-mobilenet,原模型大小为XXM,裁剪后为XX M,精度基本保持不变
## 4. 如何配置训练时GPU的卡数
-> 通过在终端export环境变量,或在Python代码中设置,可参考文档[CPU/多卡GPU训练](gpu_configure.md)
+> 通过在终端export环境变量,或在Python代码中设置,可参考文档[CPU/多卡GPU训练](appendix/gpu_configure.md)
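+>
+> 例如在Python代码中设置时,需在`import paddlex`之前设置环境变量(以下为简化示意):
+> ```python
+> import os
+> os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # 使用第0、1号GPU卡;设为空则不使用GPU
+> import paddlex as pdx
+> ```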
## 5. 想在之前训练的模型参数基础上继续训练
> 在训练调用`train`接口时,将`pretrain_weights`设为之前的模型保存路径即可
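+>
+> 一个简化示例如下(假设之前的模型保存在`output/mobilenetv2/best_model`,数据读取器`train_dataset`已按文档构建好,仅作示意):
+> ```python
+> import paddlex as pdx
+>
+> model = pdx.cls.MobileNetV2(num_classes=len(train_dataset.labels))
+> model.train(
+>     num_epochs=10,
+>     train_dataset=train_dataset,
+>     train_batch_size=32,
+>     pretrain_weights='output/mobilenetv2/best_model',  # 指向之前的模型保存路径即可继续训练
+>     save_dir='output/mobilenetv2_continue')
+> ```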
@@ -52,7 +52,7 @@
> 1. 用户自行训练时,如不确定迭代的轮数,可以将轮数设高一些,同时注意设置`save_interval_epochs`,这样模型迭代每间隔相应轮数就会在验证集上进行评估和保存,可以根据不同轮数模型在验证集上的评估指标,判断模型是否已经收敛,若模型已收敛,可以自行结束训练进程
>
## 9. 只有CPU,没有GPU,如何提升训练速度
-> 当没有GPU时,可以根据自己的CPU配置,选择是否使用多CPU进行训练,具体配置方式可以参考文档[多卡CPU/GPU训练](gpu_configure.md)
+> 当没有GPU时,可以根据自己的CPU配置,选择是否使用多CPU进行训练,具体配置方式可以参考文档[多卡CPU/GPU训练](appendix/gpu_configure.md)
>
## 10. 电脑不能联网,训练时因为下载预训练模型失败,如何解决
> 可以预先通过其它方式准备好预训练模型,然后训练时自定义`pretrain_weights`即可,可参考文档[无联网模型训练](how_to_offline_run.md)
@@ -61,8 +61,8 @@
> 1.可以按照问题9的方式来解决这个问题
> 2.每次训练前都设定`paddlex.pretrain_dir`路径,如设定`paddlex.pretrain_dir='/usrname/paddlex'`,如此下载完的预训练模型会存放至`/usrname/paddlex`目录下,而已经下载在该目录的模型也不会再次重复下载
-## 12. 程序启动时提示"Failed to execute script PaddleX",如何解决?
+## 12. PaddleX GUI启动时提示"Failed to execute script PaddleX",如何解决?
> 1. 请检查目标机器上PaddleX程序所在路径是否包含中文。目前暂不支持中文路径,请尝试将程序移动到英文目录。
> 2. 如果您的系统是Windows 7或者Windows Server 2012时,原因是缺少MFPlat.DLL/MF.dll/MFReadWrite.dll等OpenCV依赖的DLL,请按如下方式安装桌面体验:通过“我的电脑”-->“属性”-->"管理"打开服务器管理器,点击右上角“管理”选择“添加角色和功能”。点击“服务器选择”-->“功能”,拖动滚动条到最下端,点开“用户界面和基础结构”,勾选“桌面体验”后点击“安装”,等安装完成尝试再次运行PaddleX。
> 3. 请检查目标机器上是否有其他的PaddleX程序或者进程在运行中,如有请退出或者重启机器看是否解决
-> 4. 请确认运行程序的用户是否有管理员权限,如非管理员权限用户请尝试使用管理员运行看是否成功
\ No newline at end of file
+> 4. 请确认运行程序的用户是否有管理员权限,如非管理员权限用户请尝试使用管理员运行看是否成功
diff --git a/docs/apis/models/classification.md b/docs/apis/models/classification.md
index 82b459d8281b1e9bc9d1f7abdd48fddb16473c21..b70b555a7007b77851af22ddd4a775a4b3a8f93b 100755
--- a/docs/apis/models/classification.md
+++ b/docs/apis/models/classification.md
@@ -80,7 +80,7 @@ predict(self, img_file, transforms=None, topk=5)
## 其它分类器类
-PaddleX提供了共计22种分类器,所有分类器均提供同`ResNet50`相同的训练`train`,评估`evaluate`和预测`predict`接口,各模型效果可参考[模型库](../appendix/model_zoo.md)。
+PaddleX提供了共计22种分类器,所有分类器均提供同`ResNet50`相同的训练`train`,评估`evaluate`和预测`predict`接口,各模型效果可参考[模型库](https://paddlex.readthedocs.io/zh_CN/latest/appendix/model_zoo.html)。
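+
+例如,下述代码将分类器由`ResNet50`替换为`MobileNetV2`,训练接口的调用方式保持不变(仅作示意,假设`train_dataset`、`eval_dataset`已按文档构建好):
+
+```python
+import paddlex as pdx
+
+# 22种分类器类均位于paddlex.cls下,接口一致,仅需替换模型类
+model = pdx.cls.MobileNetV2(num_classes=len(train_dataset.labels))
+model.train(
+    num_epochs=10,
+    train_dataset=train_dataset,
+    eval_dataset=eval_dataset,
+    train_batch_size=32,
+    save_dir='output/mobilenetv2')
+```
+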
### ResNet18
```python
diff --git a/docs/apis/models/semantic_segmentation.md b/docs/apis/models/semantic_segmentation.md
index 26a695a9564f6929ff586eaa179242b99b5466de..3ff66337fe64b35f29a2a7985cea040fcb233d82 100755
--- a/docs/apis/models/semantic_segmentation.md
+++ b/docs/apis/models/semantic_segmentation.md
@@ -186,10 +186,10 @@ paddlex.seg.HRNet(num_classes=2, width=18, use_bce_loss=False, use_dice_loss=Fal
> **参数**
> > - **num_classes** (int): 类别数。
-> > - **width** (int): 高分辨率分支中特征层的通道数量。默认值为18。可选择取值为[18, 30, 32, 40, 44, 48, 60, 64]。
+> > - **width** (int|str): 高分辨率分支中特征层的通道数量。默认值为18。可选择取值为[18, 30, 32, 40, 44, 48, 60, 64, '18_small_v1']。'18_small_v1'是18的轻量级版本。
> > - **use_bce_loss** (bool): 是否使用bce loss作为网络的损失函数,只能用于两类分割。可与dice loss同时使用。默认False。
> > - **use_dice_loss** (bool): 是否使用dice loss作为网络的损失函数,只能用于两类分割,可与bce loss同时使用。当use_bce_loss和use_dice_loss都为False时,使用交叉熵损失函数。默认False。
-> > - **class_weight** (list/str): 交叉熵损失函数各类损失的权重。当`class_weight`为list的时候,长度应为`num_classes`。当`class_weight`为str时, weight.lower()应为'dynamic',这时会根据每一轮各类像素的比重自行计算相应的权重,每一类的权重为:每类的比例 * num_classes。class_weight取默认值None是,各类的权重1,即平时使用的交叉熵损失函数。
+> > - **class_weight** (list|str): 交叉熵损失函数各类损失的权重。当`class_weight`为list的时候,长度应为`num_classes`。当`class_weight`为str时, weight.lower()应为'dynamic',这时会根据每一轮各类像素的比重自行计算相应的权重,每一类的权重为:每类的比例 * num_classes。class_weight取默认值None时,各类的权重为1,即平时使用的交叉熵损失函数。
> > - **ignore_index** (int): label上忽略的值,label为`ignore_index`的像素不参与损失函数的计算。默认255。
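+
+例如,下述代码构建一个使用轻量级配置的HRNet分割网络(仅作示意):
+
+```python
+import paddlex as pdx
+
+# width取'18_small_v1'即18的轻量级版本,也可取18、48等整数值
+model = pdx.seg.HRNet(num_classes=2, width='18_small_v1')
+```
+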
### train 训练接口
diff --git a/docs/apis/transforms/seg_transforms.md b/docs/apis/transforms/seg_transforms.md
index 1fb2b561e4818edad72fd97f43029de079b355b3..264af5c472cb824865188a5386a513e5a00fe0ba 100755
--- a/docs/apis/transforms/seg_transforms.md
+++ b/docs/apis/transforms/seg_transforms.md
@@ -200,7 +200,7 @@ ComposedSegTransforms.add_augmenters(augmenters)
import paddlex as pdx
from paddlex.seg import transforms
train_transforms = transforms.ComposedSegTransforms(mode='train', train_crop_size=[512, 512])
-eval_transforms = transforms.ComposedYOLOTransforms(mode='eval')
+eval_transforms = transforms.ComposedSegTransforms(mode='eval')
# 添加数据增强
import imgaug.augmenters as iaa
diff --git a/docs/apis/visualize.md b/docs/apis/visualize.md
index 069913274580f1e8bd5fdb5ee6e6e642c977b3ce..8fe45d4abb82c01c859f0be60bf6f52706eb4e52 100755
--- a/docs/apis/visualize.md
+++ b/docs/apis/visualize.md
@@ -146,10 +146,11 @@ paddlex.interpret.normlime(img_file,
dataset=None,
num_samples=3000,
batch_size=50,
- save_dir='./')
+ save_dir='./',
+ normlime_weights_file=None)
```
使用NormLIME算法将模型预测结果的可解释性可视化。
-NormLIME是利用一定数量的样本来出一个全局的解释。NormLIME会提前计算一定数量的测试样本的LIME结果,然后对相同的特征进行权重的归一化,这样来得到一个全局的输入和输出的关系。
+NormLIME是利用一定数量的样本来得出一个全局的解释。由于NormLIME计算量较大,此处采用一种简化的方式:使用一定数量的测试样本(目前默认使用所有测试样本),对每个样本进行特征提取,映射到同一个特征空间;然后以此特征作为输入,以模型输出作为输出,使用线性回归对其进行拟合,得到一个全局的输入和输出的关系。之后,对某一测试样本进行解释时,使用NormLIME全局的解释,来对LIME的结果进行滤波,使最终的可视化结果更加稳定。
**注意:** 可解释性结果可视化目前只支持分类模型。
@@ -159,9 +160,10 @@ NormLIME是利用一定数量的样本来出一个全局的解释。NormLIME会
>* **dataset** (paddlex.datasets): 数据集读取器,默认为None。
>* **num_samples** (int): LIME用于学习线性模型的采样数,默认为3000。
>* **batch_size** (int): 预测数据batch大小,默认为50。
->* **save_dir** (str): 可解释性可视化结果(保存为png格式文件)和中间文件存储路径。
+>* **save_dir** (str): 可解释性可视化结果(保存为png格式文件)和中间文件存储路径。
+>* **normlime_weights_file** (str): NormLIME初始化文件名,若不存在,则计算一次,保存于该路径;若存在,则直接载入。
-**注意:** dataset`读取的是一个数据集,该数据集不宜过大,否则计算时间会较长,但应包含所有类别的数据。
+**注意:** `dataset`读取的是一个数据集,该数据集不宜过大,否则计算时间会较长,但应包含所有类别的数据。NormLIME可解释性结果可视化目前只支持分类模型。
### 使用示例
> 对预测可解释性结果可视化的过程可参见[代码](https://github.com/PaddlePaddle/PaddleX/blob/develop/tutorials/interpret/normlime.py)。
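+>
+> 一个简化的调用示意如下(假设分类模型保存在`output/best_model`,`eval_dataset`为包含所有类别的数据集读取器,`test.jpg`为待解释的图像):
+> ```python
+> import paddlex as pdx
+>
+> model = pdx.load_model('output/best_model')
+> pdx.interpret.normlime(
+>     'test.jpg',
+>     model,
+>     dataset=eval_dataset,
+>     save_dir='./',
+>     normlime_weights_file='normlime_weights.npy')  # 首次运行时计算并保存,之后直接载入
+> ```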
diff --git a/docs/appendix/index.rst b/docs/appendix/index.rst
index c402384ebc307713ed87055dc86cab58dcf33bbe..814a611948a451a76d73fd0aa9276f40db2c28b9 100755
--- a/docs/appendix/index.rst
+++ b/docs/appendix/index.rst
@@ -7,6 +7,7 @@
:caption: 目录:
model_zoo.md
+ slim_model_zoo.md
metrics.md
interpret.md
parameters.md
diff --git a/docs/appendix/interpret.md b/docs/appendix/interpret.md
index 886620df2fa98c03abda4717dea627277715b2d9..43ecd48e23810c2e3ed3cd1652bf06b6e1fc04f7 100644
--- a/docs/appendix/interpret.md
+++ b/docs/appendix/interpret.md
@@ -20,9 +20,20 @@ LIME的使用方式可参见[代码示例](https://github.com/PaddlePaddle/Paddl
## NormLIME
NormLIME是在LIME上的改进,LIME的解释是局部性的,是针对当前样本给的特定解释,而NormLIME是利用一定数量的样本对当前样本的一个全局性的解释,有一定的降噪效果。其实现步骤如下所示:
1. 下载Kmeans模型参数和ResNet50_vc网络前三层参数。(ResNet50_vc的参数是在ImageNet上训练所得网络的参数;使用ImageNet图像作为数据集,每张图像从ResNet50_vc的第三层输出提取对应超象素位置上的平均特征和质心上的特征,训练将得到此处的Kmeans模型)
-2. 计算测试集中每张图像的LIME结果。(如无测试集,可用验证集代替)
-3. 使用Kmeans模型对所有图像中的所有像素进行聚类。
-4. 对在同一个簇的超像素(相同的特征)进行权重的归一化,得到每个超像素的权重,以此来解释模型。
+2. 使用测试集中的数据计算normlime的权重信息(如无测试集,可用验证集代替):
+ 对每张图像的处理:
+ (1) 获取图像的超像素。
+ (2) 使用ResNet50_vc获取第三层特征,针对每个超像素位置,组合质心特征和均值特征`F`。
+ (3) 把`F`作为Kmeans模型的输入,计算每个超像素位置的聚类中心。
+ (4) 使用训练好的分类模型,预测该张图像的`label`。
+ 对所有图像的处理:
+ (1) 以每张图像的聚类中心信息组成的向量(若某聚类中心出现在该张图中则置为1,反之为0)为输入,
+ 预测的`label`为输出,构建逻辑回归函数`regression_func`。
+ (2) 由`regression_func`可获得每个聚类中心在不同类别下的权重,并对权重进行归一化(示意代码见列表后)。
+3. 使用Kmeans模型获取需要可视化图像的每个超像素的聚类中心。
+4. 对需要可视化的图像的超像素进行随机遮掩构成新的图像。
+5. 对每张构造的图像使用预测模型预测label。
+6. 根据normlime的权重信息,每个超像素可获得不同的权重,选取最高的权重为最终的权重,以此来解释模型。
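+
+下面以一段示意性代码概括步骤2中“对所有图像的处理”,即全局权重的计算方式(非PaddleX内部实现,变量与数据均为假设):
+
+```python
+import numpy as np
+from sklearn.linear_model import LogisticRegression
+
+# 示意数据:10张图像,每张图像内各超像素对应的聚类中心编号,以及模型预测的label
+rng = np.random.default_rng(0)
+num_clusters = 64  # 假设的聚类中心个数
+cluster_ids_per_image = [rng.integers(0, num_clusters, size=40) for _ in range(10)]
+labels = rng.integers(0, 2, size=10)
+
+# 以聚类中心是否出现在该张图中构成0/1向量,以预测的label为输出,拟合逻辑回归
+X = np.zeros((len(cluster_ids_per_image), num_clusters))
+for i, ids in enumerate(cluster_ids_per_image):
+    X[i, list(set(ids))] = 1
+regression_func = LogisticRegression().fit(X, labels)
+
+# coef_给出每个聚类中心在各类别下的权重,按行归一化
+weights = regression_func.coef_ / np.abs(regression_func.coef_).sum(axis=1, keepdims=True)
+```
+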
NormLIME的使用方式可参见[代码示例](https://github.com/PaddlePaddle/PaddleX/blob/develop/tutorials/interpret/normlime.py)和[api介绍](../apis/visualize.html#normlime)。在使用时,参数中的`num_samples`设置尤为重要,其表示上述步骤4中随机遮掩采样的个数,若设置过小会影响可解释性结果的稳定性,若设置过大则将在上述步骤5耗费较长时间;参数`batch_size`则表示上述步骤5中预测时的batch size,若设置过小将在步骤5耗费较长时间,而上限则根据机器配置决定;而`dataset`则是由测试集或验证集构造的数据。
diff --git a/docs/appendix/model_zoo.md b/docs/appendix/model_zoo.md
index 200847bc95aec5872879c3fbbe49b6f2ed0c741e..f866b39173ead1c162e9e3ee722ae2ea2cb2afb3 100644
--- a/docs/appendix/model_zoo.md
+++ b/docs/appendix/model_zoo.md
@@ -40,8 +40,8 @@
|[FasterRCNN-ResNet101](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_1x.tar)| 212.5MB | 582.911 | 38.3 |
|[FasterRCNN-ResNet50-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_fpn_1x.tar)| 167.7MB | 83.189 | 37.2 |
|[FasterRCNN-ResNet50_vd-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r50_vd_fpn_2x.tar)|167.8MB | 128.277 | 38.9 |
-|[FasterRCNN-ResNet101-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar)| 244.2MB | 156.097 | 38.7 |
-|[FasterRCNN-ResNet101_vd-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_vd_fpn_2x.tar) |244.3MB | 119.788 | 40.5 |
+|[FasterRCNN-ResNet101-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_fpn_1x.tar)| 244.2MB | 119.788 | 38.7 |
+|[FasterRCNN-ResNet101_vd-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_r101_vd_fpn_2x.tar) |244.3MB | 156.097 | 40.5 |
|[FasterRCNN-HRNet_W18-FPN](https://paddlemodels.bj.bcebos.com/object_detection/faster_rcnn_hrnetv2p_w18_1x.tar) |115.5MB | 81.592 | 36 |
|[YOLOv3-DarkNet53](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_darknet.tar)|249.2MB | 42.672 | 38.9 |
|[YOLOv3-MobileNetV1](https://paddlemodels.bj.bcebos.com/object_detection/yolov3_mobilenet_v1.tar) |99.2MB | 15.442 | 29.3 |
diff --git a/docs/appendix/slim_model_zoo.md b/docs/appendix/slim_model_zoo.md
new file mode 100644
index 0000000000000000000000000000000000000000..a594d53dd7a777288571ccae6fad5ec21415de36
--- /dev/null
+++ b/docs/appendix/slim_model_zoo.md
@@ -0,0 +1,121 @@
+# PaddleX压缩模型库
+
+## 图像分类
+
+数据集:ImageNet-1000
+
+### 量化
+
+| 模型 | 压缩策略 | Top-1准确率 | 存储体积 | TensorRT时延(V100, ms) |
+|:--:|:---:|:--:|:--:|:--:|
+|MobileNetV1| 无 |70.99%| 17MB | -|
+|MobileNetV1| 量化 |70.18% (-0.81%)| 4.4MB | - |
+| MobileNetV2 | 无 |72.15%| 15MB | - |
+| MobileNetV2 | 量化 | 71.15% (-1%)| 4.0MB | - |
+|ResNet50| 无 |76.50%| 99MB | 2.71 |
+|ResNet50| 量化 |76.33% (-0.17%)| 25.1MB | 1.19 |
+
+分类模型Lite时延(ms)
+
+| 设备 | 模型类型 | 压缩策略 | armv7 Thread 1 | armv7 Thread 2 | armv7 Thread 4 | armv8 Thread 1 | armv8 Thread 2 | armv8 Thread 4 |
+| ------- | ----------- | ------------- | -------------- | -------------- | -------------- | -------------- | -------------- | -------------- |
+| 高通835 | MobileNetV1 | 无 | 96.1942 | 53.2058 | 32.4468 | 88.4955 | 47.95 | 27.5189 |
+| 高通835 | MobileNetV1 | 量化 | 60.5615 | 32.4016 | 16.6596 | 56.5266 | 29.7178 | 15.1459 |
+| 高通835 | MobileNetV2 | 无 | 65.715 | 38.1346 | 25.155 | 61.3593 | 36.2038 | 22.849 |
+| 高通835 | MobileNetV2 | 量化 | 48.3495 | 30.3069 | 22.1506 | 45.8715 | 27.4105 | 18.2223 |
+| 高通835 | ResNet50 | 无 | 526.811 | 319.6486 | 205.8345 | 506.1138 | 335.1584 | 214.8936 |
+| 高通835 | ResNet50 | 量化 | 476.0507 | 256.5963 | 139.7266 | 461.9176 | 248.3795 | 149.353 |
+| 高通855 | MobileNetV1 | 无 | 33.5086 | 19.5773 | 11.7534 | 31.3474 | 18.5382 | 10.0811 |
+| 高通855 | MobileNetV1 | 量化 | 37.0498 | 21.7081 | 11.0779 | 14.0947 | 8.1926 | 4.2934 |
+| 高通855 | MobileNetV2 | 无 | 25.0396 | 15.2862 | 9.6609 | 22.909 | 14.1797 | 8.8325 |
+| 高通855 | MobileNetV2 | 量化 | 28.1631 | 18.3917 | 11.8333 | 16.9399 | 11.1772 | 7.4176 |
+| 高通855 | ResNet50 | 无 | 185.3705 | 113.0825 | 87.0741 | 177.7367 | 110.0433 | 74.4114 |
+| 高通855 | ResNet50 | 量化 | 328.2683 | 201.9937 | 106.744 | 242.6397 | 150.0338 | 79.8659 |
+| 麒麟970 | MobileNetV1 | 无 | 101.2455 | 56.4053 | 35.6484 | 94.8985 | 51.7251 | 31.9511 |
+| 麒麟970 | MobileNetV1 | 量化 | 62.4412 | 32.2585 | 16.6215 | 57.825 | 29.2573 | 15.1206 |
+| 麒麟970 | MobileNetV2 | 无 | 70.4176 | 42.0795 | 25.1939 | 68.9597 | 39.2145 | 22.6617 |
+| 麒麟970 | MobileNetV2 | 量化 | 53.0961 | 31.7987 | 21.8334 | 49.383 | 28.2358 | 18.3642 |
+| 麒麟970 | ResNet50 | 无 | 586.8943 | 344.0858 | 228.2293 | 573.3344 | 351.4332 | 225.8006 |
+| 麒麟970 | ResNet50 | 量化 | 489.6188 | 258.3279 | 142.6063 | 480.0064 | 249.5339 | 138.5284 |
+
+### 剪裁
+
+PaddleLite推理耗时说明:
+
+环境:Qualcomm SnapDragon 845 + armv8
+
+速度指标:Thread1/Thread2/Thread4耗时
+
+
+| 模型 | 压缩策略 | Top-1 | 存储体积 |PaddleLite推理耗时|TensorRT推理速度(FPS)|
+|:--:|:---:|:--:|:--:|:--:|:--:|
+| MobileNetV1 | 无 | 70.99% | 17MB | 66.052\35.8014\19.5762|-|
+| MobileNetV1 | 剪裁 -30% | 70.4% (-0.59%) | 12MB | 46.5958\25.3098\13.6982|-|
+| MobileNetV1 | 剪裁 -50% | 69.8% (-1.19%) | 9MB | 37.9892\20.7882\11.3144|-|
+
+## 目标检测
+
+### 量化
+
+数据集: COCO2017
+
+| 模型 | 压缩策略 | 数据集 | Image/GPU | 输入608 Box AP | 存储体积 | TensorRT时延(V100, ms) |
+| :----------------------------: | :---------: | :----: | :-------: | :------------: | :------------: | :----------: |
+| MobileNet-V1-YOLOv3 | 无 | COCO | 8 | 29.3 | 95MB | - |
+| MobileNet-V1-YOLOv3 | 量化 | COCO | 8 | 27.9 (-1.4)| 25MB | - |
+| R34-YOLOv3 | 无 | COCO | 8 | 36.2 | 162MB | - |
+| R34-YOLOv3 | 量化 | COCO | 8 | 35.7 (-0.5) | 42.7MB | - |
+
+### 剪裁
+
+数据集:Pascal VOC & COCO2017
+
+PaddleLite推理耗时说明:
+
+环境:Qualcomm SnapDragon 845 + armv8
+
+速度指标:Thread1/Thread2/Thread4耗时
+
+| 模型 | 压缩策略 | 数据集 | Image/GPU | 输入608 Box mmAP | 存储体积 | PaddleLite推理耗时(ms)(608*608) | TensorRT推理速度(FPS)(608*608) |
+| :----------------------------: | :---------------: | :--------: | :-------: | :------------: | :----------: | :--------------: | :--------------: |
+| MobileNet-V1-YOLOv3 | 无 | Pascal VOC | 8 | 76.2 | 94MB | 1238\796.943\520.101|60.04|
+| MobileNet-V1-YOLOv3 | 剪裁 -52.88% | Pascal VOC | 8 | 77.6 (+1.4) | 31MB | 602.497\353.759\222.427 |99.36|
+| MobileNet-V1-YOLOv3 | 无 | COCO | 8 | 29.3 | 95MB |-|-|
+| MobileNet-V1-YOLOv3 | 剪裁 -51.77% | COCO | 8 | 26.0 (-3.3) | 32MB |-|73.93|
+
+## 语义分割
+
+数据集:Cityscapes
+
+
+### 量化
+
+| 模型 | 压缩策略 | mIoU | 存储体积 |
+| :--------------------: | :---------: | :-----------: | :------------: |
+| DeepLabv3-MobileNetv2 | 无 | 69.81 | 7.4MB |
+| DeepLabv3-MobileNetv2 | 量化 | 67.59 (-2.22) | 2.1MB |
+
+图像分割模型Lite时延(ms), 输入尺寸769 x 769
+
+| 设备 | 模型类型 | 压缩策略 | armv7 Thread 1 | armv7 Thread 2 | armv7 Thread 4 | armv8 Thread 1 | armv8 Thread 2 | armv8 Thread 4 |
+| ------- | ---------------------- | ------------- | -------------- | -------------- | -------------- | -------------- | -------------- | -------------- |
+| 高通835 | Deeplabv3-MobileNetV2 | 无 | 1282.8126 | 793.2064 | 653.6538 | 1193.9908 | 737.1827 | 593.4522 |
+| 高通835 | Deeplabv3-MobileNetV2 | 量化 | 981.44 | 658.4969 | 538.6166 | 885.3273 | 586.1284 | 484.0018 |
+| 高通855 | Deeplabv3-MobileNetV2 | 无 | 639.4425 | 390.1851 | 322.7014 | 477.7667 | 339.7411 | 262.2847 |
+| 高通855 | Deeplabv3-MobileNetV2 | 量化 | 705.7589 | 474.4076 | 427.2951 | 394.8352 | 297.4035 | 264.6724 |
+| 麒麟970 | Deeplabv3-MobileNetV2 | 无 | 1771.1301 | 1746.0569 | 1222.4805 | 1448.9739 | 1192.4491 | 760.606 |
+| 麒麟970 | Deeplabv3-MobileNetV2 | 量化 | 1320.386 | 918.5328 | 672.2481 | 1020.753 | 820.094 | 591.4114 |
+
+### 剪裁
+
+PaddleLite推理耗时说明:
+
+环境:Qualcomm SnapDragon 845 + armv8
+
+速度指标:Thread1/Thread2/Thread4耗时
+
+
+| 模型 | 压缩方法 | mIoU | 存储体积 | PaddleLite推理耗时 | TensorRT推理速度(FPS) |
+| :-------: | :---------------: | :-----------: | :------: | :------------: | :----: |
+| FastSCNN | 无 | 69.64 | 11MB | 1226.36\682.96\415.664 |39.53|
+| FastSCNN | 剪裁 -47.60% | 66.68 (-2.96) | 5.7MB | 866.693\494.467\291.748 |51.48|
diff --git a/docs/cv_solutions.md b/docs/cv_solutions.md
index cb96c2d9e71ac6e98ee036364b8700ec9656411a..4d8482da94423ba5cc4f0695bf3f9669ef5f732a 100755
--- a/docs/cv_solutions.md
+++ b/docs/cv_solutions.md
@@ -1,63 +1,132 @@
# PaddleX视觉方案介绍
-PaddleX目前提供了4种视觉任务解决方案,分别为图像分类、目标检测、实例分割和语义分割。用户可以根据自己的任务类型按需选取。
+PaddleX针对图像分类、目标检测、实例分割和语义分割4种视觉任务,提供了涵盖模型选择、压缩策略选择、部署方案选择的整套解决方案。用户可根据自己的需求选择合适的模型,再选择合适的压缩策略来减小模型的计算量和存储体积、加速模型预测推理,最后选择合适的部署方案将模型部署在移动端或者服务器端。
-## 图像分类
+## 模型选择
+
+### 图像分类
图像分类任务指的是输入一张图片,模型预测图片的类别,如识别为风景、动物、车等。

-对于图像分类任务,针对不同的应用场景,PaddleX提供了百度改进的模型,见下表所示
+对于图像分类任务,针对不同的应用场景,PaddleX提供了百度改进的模型,见下表所示:
+> 表中GPU预测速度是使用PaddlePaddle Python预测接口测试得到(测试GPU型号为Nvidia Tesla P40)。
+> 表中CPU预测速度 (测试CPU型号为)。
+> 表中骁龙855预测速度是使用处理器为骁龙855的手机测试得到。
+> 测速时模型输入大小为224 x 224,Top1准确率为ImageNet-1000数据集上评估所得。
-| 模型 | 模型大小 | GPU预测速度 | CPU预测速度 | ARM芯片预测速度 | 准确率 | 备注 |
-| :--------- | :------ | :---------- | :-----------| :------------- | :----- | :--- |
-| MobileNetV3_small_ssld | 12M | - | - | - | 71.3% |适用于移动端场景 |
-| MobileNetV3_large_ssld | 21M | - | - | - | 79.0% | 适用于移动端/服务端场景 |
-| ResNet50_vd_ssld | 102.8MB | - | - | - | 82.4% | 适用于服务端场景 |
-| ResNet101_vd_ssld | 179.2MB | - | - | - |83.7% | 适用于服务端场景 |
+| 模型 | 模型特点 | 存储体积 | GPU预测速度(毫秒) | CPU(x86)预测速度(毫秒) | 骁龙855(ARM)预测速度 (毫秒)| Top1准确率 |
+| :--------- | :------ | :---------- | :-----------| :------------- | :------------- |:--- |
+| MobileNetV3_small_ssld | 轻量高速,适用于追求高速的实时移动端场景 | 12.5MB | 7.08837 | - | 6.546 | 71.3% |
+| ShuffleNetV2 | 轻量级模型,精度相对偏低,适用于要求更小存储体积的实时移动端场景 | 10.2MB | 15.40 | - | 10.941 | 68.8% |
+| MobileNetV3_large_ssld | 轻量级模型,在存储方面优势不大,在速度和精度上表现适中,适合于移动端场景 | 22.8MB | 8.06651 | - | 19.803 | 79.0% |
+| MobileNetV2 | 轻量级模型,适用于使用GPU预测的移动端场景 | 15.0MB | 5.92667 | - | 23.318 | 72.2% |
+| ResNet50_vd_ssld | 高精度模型,预测时间较短,适用于大多数的服务器端场景 | 103.5MB | 7.79264 | - | - | 82.4% |
+| ResNet101_vd_ssld | 超高精度模型,预测时间相对较长,适用于有大数据量时的服务器端场景 | 180.5MB | 13.34580 | - | -| 83.7% |
+| Xception65 | 超高精度模型,预测时间更长,在处理较大数据量时有较高的精度,适用于服务器端场景 | 161.6MB | 13.87017 | - | - | 80.3% |
-除上述模型外,PaddleX还支持近20种图像分类模型,模型列表可参考[PaddleX模型库](../appendix/model_zoo.md)
+包括上述模型在内,PaddleX共支持近20种图像分类模型,其余模型可参考[PaddleX模型库](appendix/model_zoo.md)
-## 目标检测
+### 目标检测
目标检测任务指的是输入图像,模型识别出图像中物体的位置(用矩形框框出来,并给出框的位置),和物体的类别,如在手机等零件质检中,用于检测外观上的瑕疵等。

对于目标检测,针对不同的应用场景,PaddleX提供了主流的YOLOv3模型和Faster-RCNN模型,见下表所示
-
-| 模型 | 模型大小 | GPU预测速度 | CPU预测速度 |ARM芯片预测速度 | BoxMAP | 备注 |
-| :------- | :------- | :--------- | :---------- | :------------- | :----- | :--- |
-| YOLOv3-MobileNetV1 | 101.2M | - | - | - | 29.3 | |
-| YOLOv3-MobileNetV3 | 94.6M | - | - | - | 31.6 | |
-| YOLOv3-ResNet34 | 169.7M | - | - | - | 36.2 | |
-| YOLOv3-DarkNet53 | 252.4 | - | - | - | 38.9 | |
-
-除YOLOv3模型外,PaddleX同时也支持FasterRCNN模型,支持FPN结构和5种backbone网络,详情可参考[PaddleX模型库](../appendix/model_zoo.md)
-
-## 实例分割
+> 表中GPU预测速度是使用PaddlePaddle Python预测接口测试得到(测试GPU型号为Nvidia Tesla P40)。
+> 表中CPU预测速度 (测试CPU型号为)。
+> 表中骁龙855预测速度是使用处理器为骁龙855的手机测试得到。
+> 测速时YOLOv3的输入大小为608 x 608,FasterRCNN的输入大小为800 x 1333,Box mmAP为COCO2017数据集上评估所得。
+
+| 模型 | 模型特点 | 存储体积 | GPU预测速度 | CPU(x86)预测速度(毫秒) | 骁龙855(ARM)预测速度 (毫秒)| Box mmAP |
+| :------- | :------- | :--------- | :---------- | :------------- | :------------- |:--- |
+| YOLOv3-MobileNetV3_large | 适用于追求高速预测的移动端场景 | 100.7MB | 143.322 | - | - | 31.6 |
+| YOLOv3-MobileNetV1 | 精度相对偏低,适用于追求高速预测的服务器端场景 | 99.2MB| 15.422 | - | - | 29.3 |
+| YOLOv3-DarkNet53 | 在预测速度和模型精度上都有较好的表现,适用于大多数的服务器端场景| 249.2MB | 42.672 | - | - | 38.9 |
+| FasterRCNN-ResNet50-FPN | 经典的二阶段检测器,预测速度相对较慢,适用于重视模型精度的服务器端场景 | 167.7MB | 83.189 | - | - | 37.2 |
+| FasterRCNN-HRNet_W18-FPN | 适用于对图像分辨率较为敏感、对目标细节预测要求更高的服务器端场景 | 115.5MB | 81.592 | - | - | 36 |
+| FasterRCNN-ResNet101_vd-FPN | 超高精度模型,预测时间更长,在处理较大数据量时有较高的精度,适用于服务器端场景 | 244.3MB | 156.097 | - | - | 40.5 |
+
+除上述模型外,YOLOv3和Faster RCNN还支持其他backbone,详情可参考[PaddleX模型库](appendix/model_zoo.md)
+
+### 实例分割
在目标检测中,模型识别出图像中物体的位置和物体的类别。而实例分割则是在目标检测的基础上,做了像素级的分类,将框内的属于目标物体的像素识别出来。

PaddleX目前提供了实例分割MaskRCNN模型,支持5种不同的backbone网络,详情可参考[PaddleX模型库](appendix/model_zoo.md)
-
-| 模型 | 模型大小 | GPU预测速度 | CPU预测速度 | ARM芯片预测速度 | BoxMAP | SegMAP | 备注 |
-| :---- | :------- | :---------- | :---------- | :------------- | :----- | :----- | :--- |
-| MaskRCNN-ResNet50_vd-FPN | 185.5M | - | - | - | 39.8 | 35.4 | |
-| MaskRCNN-ResNet101_vd-FPN | 268.6M | - | - | - | 41.4 | 36.8 | |
-
-
-## 语义分割
+> 表中GPU预测速度是使用PaddlePaddle Python预测接口测试得到(测试GPU型号为Nvidia Tesla P40)。
+> 表中CPU预测速度 (测试CPU型号为)。
+> 表中骁龙855预测速度是使用处理器为骁龙855的手机测试得到。
+> 测速时MaskRCNN的输入大小为800 x 1333,Box mmAP和Seg mmAP为COCO2017数据集上评估所得。
+
+| 模型 | 模型特点 | 存储体积 | GPU预测速度 | CPU(x86)预测速度(毫秒) | 骁龙855(ARM)预测速度 (毫秒)| Box mmAP | Seg mmAP |
+| :---- | :------- | :---------- | :---------- | :----- | :----- | :--- |:--- |
+| MaskRCNN-HRNet_W18-FPN | 适用于对图像分辨率较为敏感、对目标细节预测要求更高的服务器端场景 | - | - | - | - | 37.0 | 33.4 |
+| MaskRCNN-ResNet50-FPN | 精度较高,适合大多数的服务器端场景| 185.5M | - | - | - | 37.9 | 34.2 |
+| MaskRCNN-ResNet101_vd-FPN | 高精度但预测时间更长,在处理较大数据量时有较高的精度,适用于服务器端场景 | 268.6M | - | - | - | 41.4 | 36.8 |
+
+### 语义分割
语义分割用于对图像做像素级的分类,应用在人像分类、遥感图像识别等场景。

对于语义分割,PaddleX也针对不同的应用场景,提供了不同的模型选择,如下表所示
+> 表中GPU预测速度是使用PaddlePaddle Python预测接口测试得到(测试GPU型号为Nvidia Tesla P40)。
+> 表中CPU预测速度 (测试CPU型号为)。
+> 表中骁龙855预测速度是使用处理器为骁龙855的手机测试得到。
+> 测速时模型的输入大小为1024 x 2048,mIOU为Cityscapes数据集上评估所得。
+
+| 模型 | 模型特点 | 存储体积 | GPU预测速度 | CPU(x86)预测速度(毫秒) | 骁龙855(ARM)预测速度 (毫秒)| mIOU |
+| :---- | :------- | :---------- | :---------- | :----- | :----- |:--- |
+| DeepLabv3p-MobileNetV2_x1.0 | 轻量级模型,适用于移动端场景 | - | - | - | - | 69.8% |
+| HRNet_W18_Small_v1 | 轻量高速,适用于移动端场景 | - | - | - | - | - |
+| FastSCNN | 轻量高速,适用于追求高速预测的移动端或服务器端场景 | - | - | - | - | 69.64% |
+| HRNet_W18 | 高精度模型,适用于对图像分辨率较为敏感、对目标细节预测要求更高的服务器端场景 | - | - | - | - | 79.36% |
+| DeepLabv3p-Xception65 | 高精度但预测时间更长,在处理较大数据量时有较高的精度,适用于服务器且背景复杂的场景 | - | - | - | - | 79.3% |
+
+## 压缩策略选择
+
+PaddleX提供包含模型剪裁、定点量化的模型压缩策略,来减小模型的计算量和存储体积、加快模型部署后的预测速度。使用不同压缩策略在图像分类、目标检测和语义分割模型上的模型精度和预测速度详见以下内容,用户可以根据自己的需求选择合适的压缩策略,进一步优化模型的性能。
+
+| 压缩策略 | 策略特点 |
+| :---- | :------- |
+| 量化 | 较为显著地减少模型的存储体积,适用于移动端或服务器端TensorRT部署,在移动端对于MobileNet系列模型有明显的加速效果 |
+| 剪裁 | 能够去除冗余的参数,达到显著减少参数计算量和模型体积的效果,提升模型的预测性能,适用于CPU部署或移动端部署(GPU上无明显加速效果) |
+| 先剪裁后量化 | 可以进一步提升模型的预测性能,适用于移动端或服务器端TensorRT部署 |
+
+### 性能对比
+
+* 表中各指标的格式为XXX/YYY,XXX表示未采取压缩策略时的指标,YYY表示压缩后的指标
+* 分类模型的准确率指的是ImageNet-1000数据集上的Top1准确率(模型输入大小为224x224),检测模型的准确率指的是COCO2017数据集上的mmAP(模型输入大小为608x608),分割模型的准确率指的是Cityscapes数据集上mIOU(模型输入大小为769x769)
+* 量化策略中,PaddleLite推理环境为Qualcomm SnapDragon 855 + armv8,速度指标为Thread4耗时
+* 剪裁策略中,PaddleLite推理环境为Qualcomm SnapDragon 845 + armv8,速度指标为Thread4耗时
+
+
+| 模型 | 压缩策略 | 存储体积(MB) | 准确率(%) | PaddleLite推理耗时(ms) |
+| :--: | :------: | :------: | :----: | :----------------: |
+| MobileNetV1 | 量化 | 17/4.4 | 70.99/70.18 | 10.0811/4.2934 |
+| MobileNetV1 | 剪裁 -30% | 17/12 | 70.99/70.4 | 19.5762/13.6982 |
+| YOLOv3-MobileNetV1 | 量化 | 95/25 | 29.3/27.9 | - |
+| YOLOv3-MobileNetV1 | 剪裁 -51.77% | 95/32 | 29.3/26.0 | - |
+| Deeplabv3-MobileNetV2 | 量化 | 7.4/1.8 | 63.26/62.03 | 593.4522/484.0018 |
+| FastSCNN | 剪裁 -47.60% | 11/5.7 | 69.64/66.68 | 415.664/291.748 |
+
+更多模型在不同设备上压缩前后的指标对比详见[PaddleX压缩模型库](appendix/slim_model_zoo.md)
+
+压缩策略的具体使用流程详见[模型压缩](tutorials/compress)
+
+**注意:PaddleX中全部图像分类模型和语义分割模型都支持量化和剪裁操作,目标检测仅有YOLOv3支持量化和剪裁操作。**
+
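+以离线量化为例,其调用可简化示意如下(假设模型保存在`output/best_model`,`eval_dataset`为已构建好的数据集读取器):
+
+```python
+import paddlex as pdx
+
+model = pdx.load_model('output/best_model')
+# 使用少量样本统计量化参数,并导出量化后的模型(参数依次为batch_size、batch_nums、保存路径)
+pdx.slim.export_quant_model(model, eval_dataset, 1, 10, 'output/quant_model')
+```
+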
+## 模型部署
+
+PaddleX提供服务器端python部署、服务器端c++部署、服务器端加密部署、OpenVINO部署、移动端部署共5种部署方案,用户可以根据自己的需求选择合适的部署方案,点击以下链接了解部署的具体流程。
-| 模型 | 模型大小 | GPU预测速度 | CPU预测速度 | ARM芯片预测速度 | mIOU | 备注 |
-| :---- | :------- | :---------- | :---------- | :------------- | :----- | :----- |
-| DeepLabv3p-MobileNetV2_x0.25 | | - | - | - | - | - |
-| DeepLabv3p-MobileNetV2_x1.0 | | - | - | - | - | - |
-| DeepLabv3p-Xception65 | | - | - | - | - | - |
-| UNet | | - | - | - | - | - |
+| 部署方案 | 部署流程 |
+| :------: | :------: |
+| 服务器端python部署 | [部署流程](tutorials/deploy/deploy_server/deploy_python.html)|
+| 服务器端c++部署 | [部署流程](tutorials/deploy/deploy_server/deploy_cpp/) |
+| 服务器端加密部署 | [部署流程](tutorials/deploy/deploy_server/encryption.html) |
+| OpenVINO部署 | [部署流程](tutorials/deploy/deploy_openvino.html) |
+| 移动端部署 | [部署流程](tutorials/deploy/deploy_lite.html) |
diff --git a/docs/images/lime.png b/docs/images/lime.png
index de435a2e2375a788319f0d80a4cce7a21d395e41..801be69b57c80ad92dcc0ca69bf1a0a4de074b0f 100644
Binary files a/docs/images/lime.png and b/docs/images/lime.png differ
diff --git a/docs/images/normlime.png b/docs/images/normlime.png
index 4e5099347f261d3f5ce47b93d28cfa484c1d3776..dd9a2f8f96a3ade26179010f340c7c5185bf0656 100644
Binary files a/docs/images/normlime.png and b/docs/images/normlime.png differ
diff --git a/docs/tutorials/deploy/deploy_lite.md b/docs/tutorials/deploy/deploy_lite.md
index 5419aed636545b95e9f98fdd45109592b7a6d9d6..fd757933dcd201cf5c45b9a58013ee8078248ba0 100644
--- a/docs/tutorials/deploy/deploy_lite.md
+++ b/docs/tutorials/deploy/deploy_lite.md
@@ -21,7 +21,7 @@ step 2: 将PaddleX模型导出为inference模型
step 3: 将inference模型转换成PaddleLite模型
```
-python /path/to/PaddleX/deploy/lite/export_lite.py --model_dir /path/to/inference_model --save_file /path/to/onnx_model --place place/to/run
+python /path/to/PaddleX/deploy/lite/export_lite.py --model_dir /path/to/inference_model --save_file /path/to/lite_model --place place/to/run
```
diff --git a/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_linux.md b/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_linux.md
index 9deceffd7cc499048f0cea89ef4918f48c4e9fc1..dada892cc0ea706941d0a9966bd52e657fff0d56 100755
--- a/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_linux.md
+++ b/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_linux.md
@@ -30,7 +30,7 @@ PaddlePaddle C++ 预测库针对不同的`CPU`,`CUDA`,以及是否支持Tens
| ubuntu14.04_cuda10.0_cudnn7_avx_mkl | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.8.2-gpu-cuda10-cudnn7-avx-mkl/fluid_inference.tgz ) |
| ubuntu14.04_cuda10.1_cudnn7.6_avx_mkl_trt6 | [fluid_inference.tgz](https://paddle-inference-lib.bj.bcebos.com/1.8.2-gpu-cuda10.1-cudnn7.6-avx-mkl-trt6%2Ffluid_inference.tgz) |
-更多和更新的版本,请根据实际情况下载: [C++预测库下载列表](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/windows_cpp_inference.html#id1)
+更多和更新的版本,请根据实际情况下载: [C++预测库下载列表](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html)
下载并解压后`/root/projects/fluid_inference`目录包含内容为:
```
@@ -42,7 +42,7 @@ fluid_inference
└── version.txt # 版本和编译信息
```
-**注意:** 预编译版本除`nv-jetson-cuda10-cudnn7.5-trt5` 以外其它包都是基于`GCC 4.8.5`编译,使用高版本`GCC`可能存在 `ABI`兼容性问题,建议降级或[自行编译预测库](https://www.paddlepaddle.org.cn/documentation/docs/zh/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html#id12)。
+**注意:** 预编译版本除`nv-jetson-cuda10-cudnn7.5-trt5` 以外其它包都是基于`GCC 4.8.5`编译,使用高版本`GCC`可能存在 `ABI`兼容性问题,建议降级或[自行编译预测库](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/advanced_guide/inference_deployment/inference/build_and_install_lib_cn.html#id12)。
### Step4: 编译
diff --git a/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_win_vs2019.md b/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_win_vs2019.md
index a1b659cb65db1d6774e1797732054dceef590711..7f6afb08ce43c25fc62d0aac60a42d2e2f2df9db 100755
--- a/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_win_vs2019.md
+++ b/docs/tutorials/deploy/deploy_server/deploy_cpp/deploy_cpp_win_vs2019.md
@@ -31,6 +31,7 @@ PaddlePaddle C++ 预测库针对不同的`CPU`,`CUDA`,以及是否支持Tens
| 版本说明 | 预测库(1.8.2版本) | 编译器 | 构建工具| cuDNN | CUDA
| ---- | ---- | ---- | ---- | ---- | ---- |
| cpu_avx_mkl | [fluid_inference.zip](https://paddle-wheel.bj.bcebos.com/1.8.2/win-infer/mkl/cpu/fluid_inference_install_dir.zip) | MSVC 2015 update 3 | CMake v3.16.0 |
| cpu_avx_openblas | [fluid_inference.zip](https://paddle-wheel.bj.bcebos.com/1.8.2/win-infer/open/cpu/fluid_inference_install_dir.zip) | MSVC 2015 update 3 | CMake v3.16.0 |
| cuda9.0_cudnn7_avx_mkl | [fluid_inference.zip](https://paddle-wheel.bj.bcebos.com/1.8.2/win-infer/mkl/post97/fluid_inference_install_dir.zip) | MSVC 2015 update 3 | CMake v3.16.0 | 7.4.1 | 9.0 |
diff --git a/examples/human_segmentation/README.md b/examples/human_segmentation/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..18d1f22f3b48979602028e13d1045b63991794b8
--- /dev/null
+++ b/examples/human_segmentation/README.md
@@ -0,0 +1,181 @@
+# HumanSeg人像分割模型
+
+本教程基于PaddleX核心分割网络,提供针对人像分割场景从预训练模型、Fine-tune、视频分割预测部署的全流程应用指南。
+
+## 安装
+
+**前置依赖**
+* paddlepaddle >= 1.8.0
+* python >= 3.5
+
+```
+pip install paddlex -i https://mirror.baidu.com/pypi/simple
+```
+安装的相关问题参考[PaddleX安装](https://paddlex.readthedocs.io/zh_CN/latest/install.html)
+
+## 预训练模型
+HumanSeg开放了在大规模人像数据上训练的两个预训练模型,满足多种使用场景的需求
+
+| 模型类型 | Checkpoint Parameter | Inference Model | Quant Inference Model | 备注 |
+| --- | --- | --- | ---| --- |
+| HumanSeg-server | [humanseg_server_params](https://paddlex.bj.bcebos.com/humanseg/models/humanseg_server.pdparams) | [humanseg_server_inference](https://paddlex.bj.bcebos.com/humanseg/models/humanseg_server_inference.zip) | -- | 高精度模型,适用于服务端GPU且背景复杂的人像场景, 模型结构为Deeplabv3+/Xception65, 输入大小(512, 512) |
+| HumanSeg-mobile | [humanseg_mobile_params](https://paddlex.bj.bcebos.com/humanseg/models/humanseg_mobile.pdparams) | [humanseg_mobile_inference](https://paddlex.bj.bcebos.com/humanseg/models/humanseg_mobile_inference.zip) | [humanseg_mobile_quant](https://paddlex.bj.bcebos.com/humanseg/models/humanseg_mobile_quant.zip) | 轻量级模型, 适用于移动端或服务端CPU的前置摄像头场景,模型结构为HRNet_w18_small_v1,输入大小(192, 192) |
+
+
+模型性能
+
+| 模型 | 模型大小 | 计算耗时 |
+| --- | --- | --- |
+|humanseg_server_inference| 158M | - |
+|humanseg_mobile_inference | 5.8M | 42.35ms |
+|humanseg_mobile_quant | 1.6M | 24.93ms |
+
+计算耗时运行环境: 小米,cpu:骁龙855, 内存:6GB, 图片大小:192*192
+
+
+**NOTE:**
+
+* Checkpoint Parameter为模型权重,用于Fine-tuning场景。
+
+* Inference Model和Quant Inference Model为预测部署模型,包含`__model__`计算图结构、`__params__`模型参数和`model.yaml`基础的模型配置信息。
+
+* 其中Inference Model适用于服务端的CPU和GPU预测部署,Quant Inference Model为量化版本,适用于通过Paddle Lite进行移动端等端侧设备部署。
+
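+其中Inference Model可直接通过PaddleX加载用于预测,简化示意如下(假设模型已下载解压至`pretrain_weights`目录):
+
+```python
+import paddlex as pdx
+
+# 加载人像分割预测部署模型,并对单张图像进行预测
+model = pdx.load_model('pretrain_weights/humanseg_mobile_inference')
+result = model.predict('data/human_image.jpg')
+```
+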
+执行以下脚本进行HumanSeg预训练模型的下载
+```bash
+python pretrain_weights/download_pretrain_weights.py
+```
+
+## 下载测试数据
+我们提供了[supervise.ly](https://supervise.ly/)发布的人像分割数据集**Supervisely Persons**,从中随机抽取一小部分并转化成PaddleX可直接加载的数据格式。通过运行以下代码进行快速下载,其中包含手机前置摄像头拍摄的人像测试视频`video_test.mp4`。
+
+```bash
+python data/download_data.py
+```
+
+## 快速体验视频流人像分割
+结合DIS(Dense Inverse Search-based method)光流算法预测结果与分割结果,改善视频流人像分割效果。
+```bash
+# 通过电脑摄像头进行实时分割处理
+python video_infer.py --model_dir pretrain_weights/humanseg_mobile_inference
+
+# 对人像视频进行分割处理
+python video_infer.py --model_dir pretrain_weights/humanseg_mobile_inference --video_path data/video_test.mp4
+```
+
+视频分割结果如下:
+
+
+
+根据所选背景进行背景替换,背景可以是一张图片,也可以是一段视频。
+```bash
+# 通过电脑摄像头进行实时背景替换处理, 也可通过'--background_video_path'传入背景视频
+python bg_replace.py --model_dir pretrain_weights/humanseg_mobile_inference --background_image_path data/background.jpg
+
+# 对人像视频进行背景替换处理, 也可通过'--background_video_path'传入背景视频
+python bg_replace.py --model_dir pretrain_weights/humanseg_mobile_inference --video_path data/video_test.mp4 --background_image_path data/background.jpg
+
+# 对单张图像进行背景替换
+python bg_replace.py --model_dir pretrain_weights/humanseg_mobile_inference --image_path data/human_image.jpg --background_image_path data/background.jpg
+
+```
+
+背景替换结果如下:
+
+
+
+
+**NOTE**:
+
+视频分割处理时间需要几分钟,请耐心等待。
+
+提供的模型适用于手机摄像头竖屏拍摄场景,宽屏效果会略差一些。
+
+## 训练
+使用下述命令基于预训练模型进行Fine-tuning,请确保选用的模型结构`model_type`与模型参数`pretrain_weights`匹配。
+```bash
+# 指定GPU卡号(以0号卡为例)
+export CUDA_VISIBLE_DEVICES=0
+# 若不使用GPU,则将CUDA_VISIBLE_DEVICES指定为空
+# export CUDA_VISIBLE_DEVICES=
+python train.py --model_type HumanSegMobile \
+--save_dir output/ \
+--data_dir data/mini_supervisely \
+--train_list data/mini_supervisely/train.txt \
+--val_list data/mini_supervisely/val.txt \
+--pretrain_weights pretrain_weights/humanseg_mobile_params \
+--batch_size 8 \
+--learning_rate 0.001 \
+--num_epochs 10 \
+--image_shape 192 192
+```
+其中参数含义如下:
+* `--model_type`: 模型类型,可选项为:HumanSegServer和HumanSegMobile
+* `--save_dir`: 模型保存路径
+* `--data_dir`: 数据集路径
+* `--train_list`: 训练集列表路径
+* `--val_list`: 验证集列表路径
+* `--pretrain_weights`: 预训练模型路径
+* `--batch_size`: 批大小
+* `--learning_rate`: 初始学习率
+* `--num_epochs`: 训练轮数
+* `--image_shape`: 网络输入图像大小(w, h)
+
+更多命令行帮助可运行下述命令进行查看:
+```bash
+python train.py --help
+```
+**NOTE**
+可通过更换`--model_type`变量与对应的`--pretrain_weights`使用不同的模型快速尝试。
+
+## 评估
+使用下述命令进行评估
+```bash
+python eval.py --model_dir output/best_model \
+--data_dir data/mini_supervisely \
+--val_list data/mini_supervisely/val.txt \
+--image_shape 192 192
+```
+其中参数含义如下:
+* `--model_dir`: 模型路径
+* `--data_dir`: 数据集路径
+* `--val_list`: 验证集列表路径
+* `--image_shape`: 网络输入图像大小(w, h)
+
+## 预测
+使用下述命令进行预测, 预测结果默认保存在`./output/result/`文件夹中。
+```bash
+python infer.py --model_dir output/best_model \
+--data_dir data/mini_supervisely \
+--test_list data/mini_supervisely/test.txt \
+--save_dir output/result \
+--image_shape 192 192
+```
+其中参数含义如下:
+* `--model_dir`: 模型路径
+* `--data_dir`: 数据集路径
+* `--test_list`: 测试集列表路径
+* `--image_shape`: 网络输入图像大小(w, h)
+
+## 模型导出
+```bash
+paddlex --export_inference --model_dir output/best_model \
+--save_dir output/export
+```
+其中参数含义如下:
+* `--model_dir`: 模型路径
+* `--save_dir`: 导出模型保存路径
+
+## 离线量化
+```bash
+python quant_offline.py --model_dir output/best_model \
+--data_dir data/mini_supervisely \
+--quant_list data/mini_supervisely/val.txt \
+--save_dir output/quant_offline \
+--image_shape 192 192
+```
+其中参数含义如下:
+* `--model_dir`: 待量化模型路径
+* `--data_dir`: 数据集路径
+* `--quant_list`: 量化数据集列表路径,一般直接选择训练集或验证集
+* `--save_dir`: 量化模型保存路径
+* `--image_shape`: 网络输入图像大小(w, h)
diff --git a/examples/human_segmentation/bg_replace.py b/examples/human_segmentation/bg_replace.py
new file mode 100644
index 0000000000000000000000000000000000000000..e0c1cc4261f0c946aaf07c11b5c4f6d1c21f6dca
--- /dev/null
+++ b/examples/human_segmentation/bg_replace.py
@@ -0,0 +1,314 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import os
+import os.path as osp
+import cv2
+import numpy as np
+
+from postprocess import postprocess, threshold_mask
+import paddlex as pdx
+import paddlex.utils.logging as logging
+from paddlex.seg import transforms
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+ description='HumanSeg inference for video')
+ parser.add_argument(
+ '--model_dir',
+ dest='model_dir',
+ help='Model path for inference',
+ type=str)
+ parser.add_argument(
+ '--image_path',
+ dest='image_path',
+ help='Image including human',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--background_image_path',
+ dest='background_image_path',
+ help='Background image for replacing',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--video_path',
+ dest='video_path',
+ help='Video path for inference',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--background_video_path',
+ dest='background_video_path',
+ help='Background video path for replacing',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--save_dir',
+ dest='save_dir',
+ help='The directory for saving the inference results',
+ type=str,
+ default='./output')
+ parser.add_argument(
+ "--image_shape",
+ dest="image_shape",
+ help="The image shape for net inputs.",
+ nargs=2,
+ default=[192, 192],
+ type=int)
+
+ return parser.parse_args()
+
+
+def bg_replace(label_map, img, bg):
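+ # label_map为人像掩膜(人像处为1、背景处为0),将背景图缩放至原图大小后按掩膜加权融合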
+ h, w, _ = img.shape
+ bg = cv2.resize(bg, (w, h))
+ label_map = np.repeat(label_map[:, :, np.newaxis], 3, axis=2)
+ comb = (label_map * img + (1 - label_map) * bg).astype(np.uint8)
+ return comb
+
+
+def recover(img, im_info):
+ if im_info[0] == 'resize':
+ w, h = im_info[1][1], im_info[1][0]
+ img = cv2.resize(img, (w, h), cv2.INTER_LINEAR)
+ elif im_info[0] == 'padding':
+ # im_info[1]为原图的(h, w),此处取出宽高并裁剪掉padding部分
+ w, h = im_info[1][1], im_info[1][0]
+ img = img[0:h, 0:w, :]
+ return img
+
+
+def infer(args):
+ resize_h = args.image_shape[1]
+ resize_w = args.image_shape[0]
+
+ test_transforms = transforms.Compose([transforms.Normalize()])
+ model = pdx.load_model(args.model_dir)
+
+ if not osp.exists(args.save_dir):
+ os.makedirs(args.save_dir)
+
+ # 图像背景替换
+ if args.image_path is not None:
+ if not osp.exists(args.image_path):
+ raise Exception('The --image_path is not existed: {}'.format(
+ args.image_path))
+ if args.background_image_path is None:
+ raise Exception(
+ 'The --background_image_path is not set. Please set it')
+ else:
+ if not osp.exists(args.background_image_path):
+ raise Exception(
+ 'The --background_image_path is not existed: {}'.format(
+ args.background_image_path))
+
+ img = cv2.imread(args.image_path)
+ im_shape = img.shape
+ im_scale_x = float(resize_w) / float(im_shape[1])
+ im_scale_y = float(resize_h) / float(im_shape[0])
+ im = cv2.resize(
+ img,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=cv2.INTER_LINEAR)
+ image = im.astype('float32')
+ im_info = ('resize', im_shape[0:2])
+ pred = model.predict(image, test_transforms)
+ label_map = pred['label_map']
+ label_map = recover(label_map, im_info)
+ bg = cv2.imread(args.background_image_path)
+ save_name = osp.basename(args.image_path)
+ save_path = osp.join(args.save_dir, save_name)
+ result = bg_replace(label_map, img, bg)
+ cv2.imwrite(save_path, result)
+
+ # 视频背景替换,如果提供背景视频则以背景视频作为背景,否则采用提供的背景图片
+ else:
+ is_video_bg = False
+ if args.background_video_path is not None:
+ if not osp.exists(args.background_video_path):
+ raise Exception(
+ 'The --background_video_path is not existed: {}'.format(
+ args.background_video_path))
+ is_video_bg = True
+ elif args.background_image_path is not None:
+ if not osp.exists(args.background_image_path):
+ raise Exception(
+ 'The --background_image_path is not existed: {}'.format(
+ args.background_image_path))
+ else:
+ raise Exception(
+ 'Please offer background image or video. You should set --background_image_path or --background_video_path'
+ )
+
+ disflow = cv2.DISOpticalFlow_create(
+ cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
+ prev_gray = np.zeros((resize_h, resize_w), np.uint8)
+ prev_cfd = np.zeros((resize_h, resize_w), np.float32)
+ is_init = True
+ if args.video_path is not None:
+ logging.info('Please wait. It is computing......')
+ if not osp.exists(args.video_path):
+ raise Exception('The --video_path is not existed: {}'.format(
+ args.video_path))
+
+ cap_video = cv2.VideoCapture(args.video_path)
+ fps = cap_video.get(cv2.CAP_PROP_FPS)
+ width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
+ height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
+ save_name = osp.basename(args.video_path)
+ save_name = save_name.split('.')[0]
+ save_path = osp.join(args.save_dir, save_name + '.avi')
+
+ cap_out = cv2.VideoWriter(
+ save_path,
+ cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
+ (width, height))
+
+ if is_video_bg:
+ cap_bg = cv2.VideoCapture(args.background_video_path)
+ frames_bg = cap_bg.get(cv2.CAP_PROP_FRAME_COUNT)
+ current_frame_bg = 1
+ else:
+ img_bg = cv2.imread(args.background_image_path)
+ while cap_video.isOpened():
+ ret, frame = cap_video.read()
+ if ret:
+ im_shape = frame.shape
+ im_scale_x = float(resize_w) / float(im_shape[1])
+ im_scale_y = float(resize_h) / float(im_shape[0])
+ im = cv2.resize(
+ frame,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=cv2.INTER_LINEAR)
+ image = im.astype('float32')
+ im_info = ('resize', im_shape[0:2])
+ pred = model.predict(image, test_transforms)
+ score_map = pred['score_map']
+ cur_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
+ cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
+ score_map = 255 * score_map[:, :, 1]
+ optflow_map = postprocess(cur_gray, score_map, prev_gray, prev_cfd, \
+ disflow, is_init)
+ prev_gray = cur_gray.copy()
+ prev_cfd = optflow_map.copy()
+ is_init = False
+ optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
+ optflow_map = threshold_mask(
+ optflow_map, thresh_bg=0.2, thresh_fg=0.8)
+ score_map = recover(optflow_map, im_info)
+
+ #循环读取背景帧
+ if is_video_bg:
+ ret_bg, frame_bg = cap_bg.read()
+ if ret_bg:
+ if current_frame_bg == frames_bg:
+ current_frame_bg = 1
+ cap_bg.set(cv2.CAP_PROP_POS_FRAMES, 0)
+ else:
+ break
+ current_frame_bg += 1
+ comb = bg_replace(score_map, frame, frame_bg)
+ else:
+ comb = bg_replace(score_map, frame, img_bg)
+
+ cap_out.write(comb)
+ else:
+ break
+
+ if is_video_bg:
+ cap_bg.release()
+ cap_video.release()
+ cap_out.release()
+
+ # 当没有输入预测图像和视频的时候,则打开摄像头
+ else:
+ cap_video = cv2.VideoCapture(0)
+ if not cap_video.isOpened():
+ raise IOError("Error opening video stream or file. "
+ "Please check whether --video_path exists: {} "
+ "or whether the camera is working".format(args.video_path))
+
+ if is_video_bg:
+ cap_bg = cv2.VideoCapture(args.background_video_path)
+ frames_bg = cap_bg.get(cv2.CAP_PROP_FRAME_COUNT)
+ current_frame_bg = 1
+ else:
+ img_bg = cv2.imread(args.background_image_path)
+ while cap_video.isOpened():
+ ret, frame = cap_video.read()
+ if ret:
+ im_shape = frame.shape
+ im_scale_x = float(resize_w) / float(im_shape[1])
+ im_scale_y = float(resize_h) / float(im_shape[0])
+ im = cv2.resize(
+ frame,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=cv2.INTER_LINEAR)
+ image = im.astype('float32')
+ im_info = ('resize', im_shape[0:2])
+ pred = model.predict(image, test_transforms)
+ score_map = pred['score_map']
+ cur_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
+ cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
+ score_map = 255 * score_map[:, :, 1]
+ optflow_map = postprocess(cur_gray, score_map, prev_gray, prev_cfd, \
+ disflow, is_init)
+ prev_gray = cur_gray.copy()
+ prev_cfd = optflow_map.copy()
+ is_init = False
+ optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
+ optflow_map = threshold_mask(
+ optflow_map, thresh_bg=0.2, thresh_fg=0.8)
+ score_map = recover(optflow_map, im_info)
+
+ #循环读取背景帧
+ if is_video_bg:
+ ret_bg, frame_bg = cap_bg.read()
+ if ret_bg:
+ if current_frame_bg == frames_bg:
+ current_frame_bg = 1
+ cap_bg.set(cv2.CAP_PROP_POS_FRAMES, 0)
+ else:
+ break
+ current_frame_bg += 1
+ comb = bg_replace(score_map, frame, frame_bg)
+ else:
+ comb = bg_replace(score_map, frame, img_bg)
+ cv2.imshow('HumanSegmentation', comb)
+ if cv2.waitKey(1) & 0xFF == ord('q'):
+ break
+ else:
+ break
+ if is_video_bg:
+ cap_bg.release()
+ cap_video.release()
+
+
+if __name__ == "__main__":
+ args = parse_args()
+ infer(args)
diff --git a/examples/human_segmentation/data/download_data.py b/examples/human_segmentation/data/download_data.py
new file mode 100644
index 0000000000000000000000000000000000000000..941b4cc81ef05335c867c6c1eea20c07c44c7360
--- /dev/null
+++ b/examples/human_segmentation/data/download_data.py
@@ -0,0 +1,33 @@
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os
+
+LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
+
+import paddlex as pdx
+
+
+def download_data(savepath):
+ url = "https://paddleseg.bj.bcebos.com/humanseg/data/mini_supervisely.zip"
+ pdx.utils.download_and_decompress(url=url, path=savepath)
+
+ url = "https://paddleseg.bj.bcebos.com/humanseg/data/video_test.zip"
+ pdx.utils.download_and_decompress(url=url, path=savepath)
+
+
+if __name__ == "__main__":
+ download_data(LOCAL_PATH)
+ print("Data download finish!")
diff --git a/examples/human_segmentation/eval.py b/examples/human_segmentation/eval.py
new file mode 100644
index 0000000000000000000000000000000000000000..a6e05ea0b2c463b948a1a021fa74f01512985675
--- /dev/null
+++ b/examples/human_segmentation/eval.py
@@ -0,0 +1,85 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import paddlex as pdx
+import paddlex.utils.logging as logging
+from paddlex.seg import transforms
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='HumanSeg training')
+ parser.add_argument(
+ '--model_dir',
+ dest='model_dir',
+ help='Model path for evaluating',
+ type=str,
+ default='output/best_model')
+ parser.add_argument(
+ '--data_dir',
+ dest='data_dir',
+ help='The root directory of dataset',
+ type=str)
+ parser.add_argument(
+ '--val_list',
+ dest='val_list',
+ help='Val list file of dataset',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--batch_size',
+ dest='batch_size',
+ help='Mini batch size',
+ type=int,
+ default=128)
+ parser.add_argument(
+ "--image_shape",
+ dest="image_shape",
+ help="The image shape for net inputs.",
+ nargs=2,
+ default=[192, 192],
+ type=int)
+ return parser.parse_args()
+
+
+def dict2str(dict_input):
+ out = ''
+ for k, v in dict_input.items():
+ try:
+ v = round(float(v), 6)
+ except:
+ pass
+ out = out + '{}={}, '.format(k, v)
+ return out.strip(', ')
+
+
+def evaluate(args):
+ eval_transforms = transforms.Compose(
+ [transforms.Resize(args.image_shape), transforms.Normalize()])
+
+ eval_dataset = pdx.datasets.SegDataset(
+ data_dir=args.data_dir,
+ file_list=args.val_list,
+ transforms=eval_transforms)
+
+ model = pdx.load_model(args.model_dir)
+ metrics = model.evaluate(eval_dataset, args.batch_size)
+ logging.info('[EVAL] Finished, {} .'.format(dict2str(metrics)))
+
+
+if __name__ == '__main__':
+ args = parse_args()
+
+ evaluate(args)
diff --git a/examples/human_segmentation/infer.py b/examples/human_segmentation/infer.py
new file mode 100644
index 0000000000000000000000000000000000000000..c78df7ae51609299a44d1c706197c56e2a20618e
--- /dev/null
+++ b/examples/human_segmentation/infer.py
@@ -0,0 +1,109 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import os
+import os.path as osp
+import cv2
+import numpy as np
+import tqdm
+
+import paddlex as pdx
+from paddlex.seg import transforms
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+ description='HumanSeg prediction and visualization')
+ parser.add_argument(
+ '--model_dir',
+ dest='model_dir',
+ help='Model path for prediction',
+ type=str)
+ parser.add_argument(
+ '--data_dir',
+ dest='data_dir',
+ help='The root directory of dataset',
+ type=str)
+ parser.add_argument(
+ '--test_list',
+ dest='test_list',
+ help='Test list file of dataset',
+ type=str)
+ parser.add_argument(
+ '--save_dir',
+ dest='save_dir',
+ help='The directory for saving the inference results',
+ type=str,
+ default='./output/result')
+ parser.add_argument(
+ "--image_shape",
+ dest="image_shape",
+ help="The image shape for net inputs.",
+ nargs=2,
+ default=[192, 192],
+ type=int)
+ return parser.parse_args()
+
+
+def infer(args):
+ def makedir(path):
+ sub_dir = osp.dirname(path)
+ if not osp.exists(sub_dir):
+ os.makedirs(sub_dir)
+
+ test_transforms = transforms.Compose(
+ [transforms.Resize(args.image_shape), transforms.Normalize()])
+ model = pdx.load_model(args.model_dir)
+ added_saved_path = osp.join(args.save_dir, 'added')
+ mat_saved_path = osp.join(args.save_dir, 'mat')
+ scoremap_saved_path = osp.join(args.save_dir, 'scoremap')
+
+ with open(args.test_list, 'r') as f:
+ files = f.readlines()
+
+ for file in tqdm.tqdm(files):
+ file = file.strip()
+ im_file = osp.join(args.data_dir, file)
+ im = cv2.imread(im_file)
+ result = model.predict(im_file, transforms=test_transforms)
+
+ # save added image
+ added_image = pdx.seg.visualize(
+ im_file, result, weight=0.6, save_dir=None)
+ added_image_file = osp.join(added_saved_path, file)
+ makedir(added_image_file)
+ cv2.imwrite(added_image_file, added_image)
+
+ # save score map
+ score_map = result['score_map'][:, :, 1]
+ score_map = (score_map * 255).astype(np.uint8)
+ score_map_file = osp.join(scoremap_saved_path, file)
+ makedir(score_map_file)
+ cv2.imwrite(score_map_file, score_map)
+
+ # save mat image
+ score_map = np.expand_dims(score_map, axis=-1)
+ mat_image = np.concatenate([im, score_map], axis=2)
+ mat_file = osp.join(mat_saved_path, file)
+ ext = osp.splitext(mat_file)[-1]
+ mat_file = mat_file.replace(ext, '.png')
+ makedir(mat_file)
+ cv2.imwrite(mat_file, mat_image)
+
+
+if __name__ == '__main__':
+ args = parse_args()
+ infer(args)
diff --git a/examples/human_segmentation/postprocess.py b/examples/human_segmentation/postprocess.py
new file mode 100644
index 0000000000000000000000000000000000000000..88e5dcc80f3d49d7d5625e74fe4de313b59fa844
--- /dev/null
+++ b/examples/human_segmentation/postprocess.py
@@ -0,0 +1,125 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+
+
+def cal_optical_flow_tracking(pre_gray, cur_gray, prev_cfd, dl_weights,
+ disflow):
+ """计算光流跟踪匹配点和光流图
+ 输入参数:
+ pre_gray: 上一帧灰度图
+ cur_gray: 当前帧灰度图
+ prev_cfd: 上一帧光流图
+ dl_weights: 融合权重图
+ disflow: 光流数据结构
+ 返回值:
+ is_track: 光流点跟踪二值图,即是否具有光流点匹配
+ track_cfd: 光流跟踪图
+ """
+ check_thres = 8
+ h, w = pre_gray.shape[:2]
+ track_cfd = np.zeros_like(prev_cfd)
+ is_track = np.zeros_like(pre_gray)
+ flow_fw = disflow.calc(pre_gray, cur_gray, None)
+ flow_bw = disflow.calc(cur_gray, pre_gray, None)
+ flow_fw = np.round(flow_fw).astype(np.int)
+ flow_bw = np.round(flow_bw).astype(np.int)
+ y_list = np.array(range(h))
+ x_list = np.array(range(w))
+ yv, xv = np.meshgrid(y_list, x_list)
+ yv, xv = yv.T, xv.T
+ cur_x = xv + flow_fw[:, :, 0]
+ cur_y = yv + flow_fw[:, :, 1]
+
+ # 超出边界不跟踪
+ not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
+ flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
+ not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
+ np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])
+ ) >= check_thres
+ track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
+
+ is_track[cur_y[~not_track], cur_x[~not_track]] = 1
+
+ not_flow = np.all(np.abs(flow_fw) == 0,
+ axis=-1) * np.all(np.abs(flow_bw) == 0, axis=-1)
+ dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
+ return track_cfd, is_track, dl_weights
+
+
+def fuse_optical_flow_tracking(track_cfd, dl_cfd, dl_weights, is_track):
+ """光流追踪图和人像分割结构融合
+ 输入参数:
+ track_cfd: 光流追踪图
+ dl_cfd: 当前帧分割结果
+ dl_weights: 融合权重图
+ is_track: 光流点匹配二值图
+ 返回
+ cur_cfd: 光流跟踪图和人像分割结果融合图
+ """
+ fusion_cfd = dl_cfd.copy()
+ is_track = is_track.astype(np.bool)
+ fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
+ 1 - dl_weights[is_track]) * track_cfd[is_track]
+ # 确定区域
+ index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
+ index_less01 = (dl_weights < 0.1) * index_certain
+ fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
+ index_less01]
+ index_larger09 = (dl_weights >= 0.1) * index_certain
+ fusion_cfd[index_larger09] = 0.4 * dl_cfd[
+ index_larger09] + 0.6 * track_cfd[index_larger09]
+ return fusion_cfd
+
+
+def threshold_mask(img, thresh_bg, thresh_fg):
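+ # 将score map按背景/前景双阈值线性归一化到[0, 1]区间,区间外截断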
+ dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
+ dst[np.where(dst > 1)] = 1
+ dst[np.where(dst < 0)] = 0
+ return dst.astype(np.float32)
+
+
+def postprocess(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
+ """光流优化
+ Args:
+ cur_gray : 当前帧灰度图
+ pre_gray : 前一帧灰度图
+ pre_cfd :前一帧融合结果
+ scoremap : 当前帧分割结果
+ difflow : 光流
+ is_init : 是否第一帧
+ Returns:
+ fusion_cfd : 光流追踪图和预测结果融合图
+ """
+ h, w = scoremap.shape
+ cur_cfd = scoremap.copy()
+
+ if is_init:
+ if h <= 64 or w <= 64:
+ disflow.setFinestScale(1)
+ elif h <= 160 or w <= 160:
+ disflow.setFinestScale(2)
+ else:
+ disflow.setFinestScale(3)
+ fusion_cfd = cur_cfd
+ else:
+ weights = np.ones((h, w), np.float32) * 0.3
+ track_cfd, is_track, weights = cal_optical_flow_tracking(
+ prev_gray, cur_gray, pre_cfd, weights, disflow)
+ fusion_cfd = fuse_optical_flow_tracking(track_cfd, cur_cfd, weights,
+ is_track)
+
+ return fusion_cfd
diff --git a/examples/human_segmentation/pretrain_weights/download_pretrain_weights.py b/examples/human_segmentation/pretrain_weights/download_pretrain_weights.py
new file mode 100644
index 0000000000000000000000000000000000000000..be961ab6ebca2f8fef2e5573a817ccfd29fee41a
--- /dev/null
+++ b/examples/human_segmentation/pretrain_weights/download_pretrain_weights.py
@@ -0,0 +1,40 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+import os
+
+LOCAL_PATH = os.path.dirname(os.path.abspath(__file__))
+
+import paddlex as pdx
+import paddlehub as hub
+
+model_urls = {
+ "PaddleX_HumanSeg_Server_Params":
+ "https://bj.bcebos.com/paddlex/models/humanseg/humanseg_server_params.tar",
+ "PaddleX_HumanSeg_Server_Inference":
+ "https://bj.bcebos.com/paddlex/models/humanseg/humanseg_server_inference.tar",
+ "PaddleX_HumanSeg_Mobile_Params":
+ "https://bj.bcebos.com/paddlex/models/humanseg/humanseg_mobile_params.tar",
+ "PaddleX_HumanSeg_Mobile_Inference":
+ "https://bj.bcebos.com/paddlex/models/humanseg/humanseg_mobile_inference.tar",
+ "PaddleX_HumanSeg_Mobile_Quant":
+ "https://bj.bcebos.com/paddlex/models/humanseg/humanseg_mobile_quant.tar"
+}
+
+if __name__ == "__main__":
+ for model_name, url in model_urls.items():
+ pdx.utils.download_and_decompress(url=url, path=LOCAL_PATH)
+ print("Pretrained Model download success!")
diff --git a/examples/human_segmentation/quant_offline.py b/examples/human_segmentation/quant_offline.py
new file mode 100644
index 0000000000000000000000000000000000000000..a801f8d02263f8dab98f3250478a289337492ae4
--- /dev/null
+++ b/examples/human_segmentation/quant_offline.py
@@ -0,0 +1,85 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import paddlex as pdx
+from paddlex.seg import transforms
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='HumanSeg training')
+ parser.add_argument(
+ '--model_dir',
+ dest='model_dir',
+ help='Model path for quant',
+ type=str,
+ default='output/best_model')
+ parser.add_argument(
+ '--batch_size',
+ dest='batch_size',
+ help='Mini batch size',
+ type=int,
+ default=1)
+ parser.add_argument(
+ '--batch_nums',
+ dest='batch_nums',
+ help='Batch number for quant',
+ type=int,
+ default=10)
+ parser.add_argument(
+ '--data_dir',
+ dest='data_dir',
+ help='The root directory of dataset',
+ type=str)
+ parser.add_argument(
+ '--quant_list',
+ dest='quant_list',
+ help='Image file list for model quantization, it can be val.txt or train.txt',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--save_dir',
+ dest='save_dir',
+ help='The directory for saving the quant model',
+ type=str,
+ default='./output/quant_offline')
+ parser.add_argument(
+ "--image_shape",
+ dest="image_shape",
+ help="The image shape for net inputs.",
+ nargs=2,
+ default=[192, 192],
+ type=int)
+ return parser.parse_args()
+
+
+def quantize(args):
+ eval_transforms = transforms.Compose(
+ [transforms.Resize(args.image_shape), transforms.Normalize()])
+
+ eval_dataset = pdx.datasets.SegDataset(
+ data_dir=args.data_dir,
+ file_list=args.quant_list,
+ transforms=eval_transforms)
+
+ model = pdx.load_model(args.model_dir)
+ pdx.slim.export_quant_model(model, eval_dataset, args.batch_size,
+ args.batch_nums, args.save_dir)
+
+
+if __name__ == '__main__':
+ args = parse_args()
+
+ quantize(args)
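+
+# A hedged invocation sketch (the dataset paths are assumptions; the flags
+# are defined in parse_args above):
+#
+#   python quant_offline.py --model_dir output/best_model \
+#       --data_dir data/my_dataset \
+#       --quant_list data/my_dataset/val_list.txt \
+#       --batch_size 1 --batch_nums 10 \
+#       --save_dir output/quant_offline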
diff --git a/examples/human_segmentation/train.py b/examples/human_segmentation/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..a7df98f360a78c2624814fc75bb0c382e19b7e95
--- /dev/null
+++ b/examples/human_segmentation/train.py
@@ -0,0 +1,156 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+
+import paddlex as pdx
+from paddlex.seg import transforms
+
+MODEL_TYPE = ['HumanSegMobile', 'HumanSegServer']
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(description='HumanSeg training')
+ parser.add_argument(
+ '--model_type',
+ dest='model_type',
+ help="Model type for traing, which is one of ('HumanSegMobile', 'HumanSegServer')",
+ type=str,
+ default='HumanSegMobile')
+ parser.add_argument(
+ '--data_dir',
+ dest='data_dir',
+ help='The root directory of dataset',
+ type=str)
+ parser.add_argument(
+ '--train_list',
+ dest='train_list',
+ help='Train list file of dataset',
+ type=str)
+ parser.add_argument(
+ '--val_list',
+ dest='val_list',
+ help='Val list file of dataset',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--save_dir',
+ dest='save_dir',
+ help='The directory for saving the model snapshot',
+ type=str,
+ default='./output')
+ parser.add_argument(
+ '--num_classes',
+ dest='num_classes',
+ help='Number of classes',
+ type=int,
+ default=2)
+ parser.add_argument(
+ "--image_shape",
+ dest="image_shape",
+ help="The image shape for net inputs.",
+ nargs=2,
+ default=[192, 192],
+ type=int)
+ parser.add_argument(
+ '--num_epochs',
+ dest='num_epochs',
+ help='Number of epochs for training',
+ type=int,
+ default=100)
+ parser.add_argument(
+ '--batch_size',
+ dest='batch_size',
+ help='Mini batch size',
+ type=int,
+ default=128)
+ parser.add_argument(
+ '--learning_rate',
+ dest='learning_rate',
+ help='Learning rate',
+ type=float,
+ default=0.01)
+ parser.add_argument(
+ '--pretrain_weights',
+ dest='pretrain_weights',
+ help='The path of pretrained weights',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--resume_checkpoint',
+ dest='resume_checkpoint',
+ help='The path of resume checkpoint',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--use_vdl',
+ dest='use_vdl',
+ help='Whether to use visualdl',
+ action='store_true')
+ parser.add_argument(
+ '--save_interval_epochs',
+ dest='save_interval_epochs',
+ help='The interval epochs for save a model snapshot',
+ type=int,
+ default=5)
+
+ return parser.parse_args()
+
+
+def train(args):
+ train_transforms = transforms.Compose([
+ transforms.Resize(args.image_shape), transforms.RandomHorizontalFlip(),
+ transforms.Normalize()
+ ])
+
+ eval_transforms = transforms.Compose(
+ [transforms.Resize(args.image_shape), transforms.Normalize()])
+
+ train_dataset = pdx.datasets.SegDataset(
+ data_dir=args.data_dir,
+ file_list=args.train_list,
+ transforms=train_transforms,
+ shuffle=True)
+ eval_dataset = pdx.datasets.SegDataset(
+ data_dir=args.data_dir,
+ file_list=args.val_list,
+ transforms=eval_transforms)
+
+ if args.model_type == 'HumanSegMobile':
+ model = pdx.seg.HRNet(
+ num_classes=args.num_classes, width='18_small_v1')
+ elif args.model_type == 'HumanSegServer':
+ model = pdx.seg.DeepLabv3p(
+ num_classes=args.num_classes, backbone='Xception65')
+ else:
+ raise ValueError(
+ "--model_type: {} is invalid, it should be one of ('HumanSegMobile', "
+ "'HumanSegServer')".format(args.model_type))
+ model.train(
+ num_epochs=args.num_epochs,
+ train_dataset=train_dataset,
+ train_batch_size=args.batch_size,
+ eval_dataset=eval_dataset,
+ save_interval_epochs=args.save_interval_epochs,
+ learning_rate=args.learning_rate,
+ pretrain_weights=args.pretrain_weights,
+ resume_checkpoint=args.resume_checkpoint,
+ save_dir=args.save_dir,
+ use_vdl=args.use_vdl)
+
+
+if __name__ == '__main__':
+ args = parse_args()
+ train(args)
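+
+# A hedged invocation sketch (the dataset paths are assumptions; the flags
+# are defined in parse_args above):
+#
+#   python train.py --model_type HumanSegMobile \
+#       --data_dir data/my_dataset \
+#       --train_list data/my_dataset/train_list.txt \
+#       --val_list data/my_dataset/val_list.txt \
+#       --image_shape 192 192 --num_epochs 100 --save_dir output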
diff --git a/examples/human_segmentation/video_infer.py b/examples/human_segmentation/video_infer.py
new file mode 100644
index 0000000000000000000000000000000000000000..c2a67fe0032eae19e937580ff35e53ba09d1118f
--- /dev/null
+++ b/examples/human_segmentation/video_infer.py
@@ -0,0 +1,187 @@
+# coding: utf8
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import argparse
+import os
+import os.path as osp
+import cv2
+import numpy as np
+
+from postprocess import postprocess, threshold_mask
+import paddlex as pdx
+import paddlex.utils.logging as logging
+from paddlex.seg import transforms
+
+
+def parse_args():
+ parser = argparse.ArgumentParser(
+ description='HumanSeg inference for video')
+ parser.add_argument(
+ '--model_dir',
+ dest='model_dir',
+ help='Model path for inference',
+ type=str)
+ parser.add_argument(
+ '--video_path',
+ dest='video_path',
+ help='Video path for inference; the camera will be used if the path does not exist',
+ type=str,
+ default=None)
+ parser.add_argument(
+ '--save_dir',
+ dest='save_dir',
+ help='The directory for saving the inference results',
+ type=str,
+ default='./output')
+ parser.add_argument(
+ "--image_shape",
+ dest="image_shape",
+ help="The image shape for net inputs.",
+ nargs=2,
+ default=[192, 192],
+ type=int)
+
+ return parser.parse_args()
+
+
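+# recover() maps a network-input-sized result back to the geometry recorded
+# in im_info: 'resize' scales it back to the original (w, h), while
+# 'padding' crops away the padded border.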
+def recover(img, im_info):
+ if im_info[0] == 'resize':
+ w, h = im_info[1][1], im_info[1][0]
+ img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)
+ elif im_info[0] == 'padding':
+ w, h = im_info[1][1], im_info[1][0]
+ img = img[0:h, 0:w, :]
+ return img
+
+
+def video_infer(args):
+ resize_h = args.image_shape[1]
+ resize_w = args.image_shape[0]
+
+ model = pdx.load_model(args.model_dir)
+ test_transforms = transforms.Compose([transforms.Normalize()])
+ if not args.video_path:
+ cap = cv2.VideoCapture(0)
+ else:
+ cap = cv2.VideoCapture(args.video_path)
+ if not cap.isOpened():
+ raise IOError("Error opening video stream or file: check whether "
+ "--video_path ({}) exists or the camera is "
+ "working".format(args.video_path))
+
+ width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+ height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
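+ # DIS optical flow state: the previous gray frame and fused confidence
+ # map are carried across frames and consumed by postprocess().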
+ disflow = cv2.DISOpticalFlow_create(cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
+ prev_gray = np.zeros((resize_h, resize_w), np.uint8)
+ prev_cfd = np.zeros((resize_h, resize_w), np.float32)
+ is_init = True
+
+ fps = cap.get(cv2.CAP_PROP_FPS)
+ if args.video_path:
+ logging.info("Please wait. It is computing......")
+ # Writer for saving the prediction result video
+ if not osp.exists(args.save_dir):
+ os.makedirs(args.save_dir)
+ out = cv2.VideoWriter(
+ osp.join(args.save_dir, 'result.avi'),
+ cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
+ # Start reading video frames
+ while cap.isOpened():
+ ret, frame = cap.read()
+ if ret:
+ im_shape = frame.shape
+ im_scale_x = float(resize_w) / float(im_shape[1])
+ im_scale_y = float(resize_h) / float(im_shape[0])
+ im = cv2.resize(
+ frame,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=cv2.INTER_LINEAR)
+ image = im.astype('float32')
+ im_info = ('resize', im_shape[0:2])
+ pred = model.predict(image, test_transforms)
+ score_map = pred['score_map']
+ cur_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
+ score_map = 255 * score_map[:, :, 1]
+ optflow_map = postprocess(cur_gray, score_map, prev_gray, prev_cfd, \
+ disflow, is_init)
+ prev_gray = cur_gray.copy()
+ prev_cfd = optflow_map.copy()
+ is_init = False
+ optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
+ optflow_map = threshold_mask(
+ optflow_map, thresh_bg=0.2, thresh_fg=0.8)
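+ # Expand the single-channel mask to three channels and alpha-blend
+ # the frame over a white background.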
+ img_matting = np.repeat(
+ optflow_map[:, :, np.newaxis], 3, axis=2)
+ img_matting = recover(img_matting, im_info)
+ bg_im = np.ones_like(img_matting) * 255
+ comb = (img_matting * frame +
+ (1 - img_matting) * bg_im).astype(np.uint8)
+ out.write(comb)
+ else:
+ break
+ cap.release()
+ out.release()
+
+ else:
+ while cap.isOpened():
+ ret, frame = cap.read()
+ if ret:
+ im_shape = frame.shape
+ im_scale_x = float(resize_w) / float(im_shape[1])
+ im_scale_y = float(resize_h) / float(im_shape[0])
+ im = cv2.resize(
+ frame,
+ None,
+ None,
+ fx=im_scale_x,
+ fy=im_scale_y,
+ interpolation=cv2.INTER_LINEAR)
+ image = im.astype('float32')
+ im_info = ('resize', im_shape[0:2])
+ pred = model.predict(image, test_transforms)
+ score_map = pred['score_map']
+ cur_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
+ cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
+ score_map = 255 * score_map[:, :, 1]
+ optflow_map = postprocess(cur_gray, score_map, prev_gray, prev_cfd, \
+ disflow, is_init)
+ prev_gray = cur_gray.copy()
+ prev_cfd = optflow_map.copy()
+ is_init = False
+ optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
+ optflow_map = threshold_mask(
+ optflow_map, thresh_bg=0.2, thresh_fg=0.8)
+ img_matting = np.repeat(
+ optflow_map[:, :, np.newaxis], 3, axis=2)
+ img_matting = recover(img_matting, im_info)
+ bg_im = np.ones_like(img_matting) * 255
+ comb = (img_matting * frame +
+ (1 - img_matting) * bg_im).astype(np.uint8)
+ cv2.imshow('HumanSegmentation', comb)
+ if cv2.waitKey(1) & 0xFF == ord('q'):
+ break
+ else:
+ break
+ cap.release()
+
+
+if __name__ == "__main__":
+ args = parse_args()
+ video_infer(args)
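+
+# A hedged invocation sketch (the video path is an assumption; omitting
+# --video_path falls back to the camera, as handled in video_infer above):
+#
+#   python video_infer.py --model_dir output/best_model \
+#       --video_path data/my_video.mp4 --save_dir output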
diff --git a/new_tutorials/train/README.md b/new_tutorials/train/README.md
deleted file mode 100644
index fc319d16d0c795f856600355d43c18ef413eae0e..0000000000000000000000000000000000000000
--- a/new_tutorials/train/README.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# 使用教程——训练模型
-
-本目录下整理了使用PaddleX训练模型的示例代码,代码中均提供了示例数据的自动下载,并均使用单张GPU卡进行训练。
-
-|代码 | 模型任务 | 数据 |
-|------|--------|---------|
-|classification/mobilenetv2.py | 图像分类MobileNetV2 | 蔬菜分类 |
-|classification/resnet50.py | 图像分类ResNet50 | 蔬菜分类 |
-|detection/faster_rcnn_r50_fpn.py | 目标检测FasterRCNN | 昆虫检测 |
-|detection/mask_rcnn_f50_fpn.py | 实例分割MaskRCNN | 垃圾分拣 |
-|segmentation/deeplabv3p.py | 语义分割DeepLabV3| 视盘分割 |
-|segmentation/unet.py | 语义分割UNet | 视盘分割 |
-|segmentation/hrnet.py | 语义分割HRNet | 视盘分割 |
-|segmentation/fast_scnn.py | 语义分割FastSCNN | 视盘分割 |
-
-
-## 开始训练
-在安装PaddleX后,使用如下命令开始训练
-```
-python classification/mobilenetv2.py
-```
diff --git a/new_tutorials/train/classification/mobilenetv2.py b/new_tutorials/train/classification/mobilenetv2.py
deleted file mode 100644
index 9a075526a3cbb7e560c133f08faef68ea5a07121..0000000000000000000000000000000000000000
--- a/new_tutorials/train/classification/mobilenetv2.py
+++ /dev/null
@@ -1,47 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-from paddlex.cls import transforms
-import paddlex as pdx
-
-# 下载和解压蔬菜分类数据集
-veg_dataset = 'https://bj.bcebos.com/paddlex/datasets/vegetables_cls.tar.gz'
-pdx.utils.download_and_decompress(veg_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/cls_transforms.html#composedclstransforms
-train_transforms = transforms.ComposedClsTransforms(mode='train', crop_size=[224, 224])
-eval_transforms = transforms.ComposedClsTransforms(mode='eval', crop_size=[224, 224])
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/classification.html#imagenet
-train_dataset = pdx.datasets.ImageNet(
- data_dir='vegetables_cls',
- file_list='vegetables_cls/train_list.txt',
- label_list='vegetables_cls/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.ImageNet(
- data_dir='vegetables_cls',
- file_list='vegetables_cls/val_list.txt',
- label_list='vegetables_cls/labels.txt',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/mobilenetv2/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/classification.html#resnet50
-model = pdx.cls.MobileNetV2(num_classes=len(train_dataset.labels))
-model.train(
- num_epochs=10,
- train_dataset=train_dataset,
- train_batch_size=32,
- eval_dataset=eval_dataset,
- lr_decay_epochs=[4, 6, 8],
- learning_rate=0.025,
- save_dir='output/mobilenetv2',
- use_vdl=True)
diff --git a/new_tutorials/train/classification/resnet50.py b/new_tutorials/train/classification/resnet50.py
deleted file mode 100644
index bf56a605f1c3376057c1ab9283fa1251491b2750..0000000000000000000000000000000000000000
--- a/new_tutorials/train/classification/resnet50.py
+++ /dev/null
@@ -1,56 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-import paddle.fluid as fluid
-from paddlex.cls import transforms
-import paddlex as pdx
-
-# 下载和解压蔬菜分类数据集
-veg_dataset = 'https://bj.bcebos.com/paddlex/datasets/vegetables_cls.tar.gz'
-pdx.utils.download_and_decompress(veg_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/cls_transforms.html#composedclstransforms
-train_transforms = transforms.ComposedClsTransforms(mode='train', crop_size=[224, 224])
-eval_transforms = transforms.ComposedClsTransforms(mode='eval', crop_size=[224, 224])
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/classification.html#imagenet
-train_dataset = pdx.datasets.ImageNet(
- data_dir='vegetables_cls',
- file_list='vegetables_cls/train_list.txt',
- label_list='vegetables_cls/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.ImageNet(
- data_dir='vegetables_cls',
- file_list='vegetables_cls/val_list.txt',
- label_list='vegetables_cls/labels.txt',
- transforms=eval_transforms)
-
-# PaddleX支持自定义构建优化器
-step_each_epoch = train_dataset.num_samples // 32
-learning_rate = fluid.layers.cosine_decay(
- learning_rate=0.025, step_each_epoch=step_each_epoch, epochs=10)
-optimizer = fluid.optimizer.Momentum(
- learning_rate=learning_rate,
- momentum=0.9,
- regularization=fluid.regularizer.L2Decay(4e-5))
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/resnet50/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/classification.html#resnet50
-model = pdx.cls.ResNet50(num_classes=len(train_dataset.labels))
-model.train(
- num_epochs=10,
- train_dataset=train_dataset,
- train_batch_size=32,
- eval_dataset=eval_dataset,
- optimizer=optimizer,
- save_dir='output/resnet50',
- use_vdl=True)
diff --git a/new_tutorials/train/detection/faster_rcnn_r50_fpn.py b/new_tutorials/train/detection/faster_rcnn_r50_fpn.py
deleted file mode 100644
index a64b711c3af48cb85cfd8a82938785ca386a99ec..0000000000000000000000000000000000000000
--- a/new_tutorials/train/detection/faster_rcnn_r50_fpn.py
+++ /dev/null
@@ -1,49 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-from paddlex.det import transforms
-import paddlex as pdx
-
-# 下载和解压昆虫检测数据集
-insect_dataset = 'https://bj.bcebos.com/paddlex/datasets/insect_det.tar.gz'
-pdx.utils.download_and_decompress(insect_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/det_transforms.html#composedrcnntransforms
-train_transforms = transforms.ComposedRCNNTransforms(mode='train', min_max_size=[800, 1333])
-eval_transforms = transforms.ComposedRCNNTransforms(mode='eval', min_max_size=[800, 1333])
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/detection.html#vocdetection
-train_dataset = pdx.datasets.VOCDetection(
- data_dir='insect_det',
- file_list='insect_det/train_list.txt',
- label_list='insect_det/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.VOCDetection(
- data_dir='insect_det',
- file_list='insect_det/val_list.txt',
- label_list='insect_det/labels.txt',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/faster_rcnn_r50_fpn/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-# num_classes 需要设置为包含背景类的类别数,即: 目标类别数量 + 1
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/detection.html#fasterrcnn
-num_classes = len(train_dataset.labels) + 1
-model = pdx.det.FasterRCNN(num_classes=num_classes)
-model.train(
- num_epochs=12,
- train_dataset=train_dataset,
- train_batch_size=2,
- eval_dataset=eval_dataset,
- learning_rate=0.0025,
- lr_decay_epochs=[8, 11],
- save_dir='output/faster_rcnn_r50_fpn',
- use_vdl=True)
diff --git a/new_tutorials/train/detection/mask_rcnn_r50_fpn.py b/new_tutorials/train/detection/mask_rcnn_r50_fpn.py
deleted file mode 100644
index f2ebf6e20f18054bf16452eb6e60b9ea24f20748..0000000000000000000000000000000000000000
--- a/new_tutorials/train/detection/mask_rcnn_r50_fpn.py
+++ /dev/null
@@ -1,48 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-from paddlex.det import transforms
-import paddlex as pdx
-
-# 下载和解压小度熊分拣数据集
-xiaoduxiong_dataset = 'https://bj.bcebos.com/paddlex/datasets/xiaoduxiong_ins_det.tar.gz'
-pdx.utils.download_and_decompress(xiaoduxiong_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/det_transforms.html#composedrcnntransforms
-train_transforms = transforms.ComposedRCNNTransforms(mode='train', min_max_size=[800, 1333])
-eval_transforms = transforms.ComposedRCNNTransforms(mode='eval', min_max_size=[800, 1333])
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/detection.html#cocodetection
-train_dataset = pdx.datasets.CocoDetection(
- data_dir='xiaoduxiong_ins_det/JPEGImages',
- ann_file='xiaoduxiong_ins_det/train.json',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.CocoDetection(
- data_dir='xiaoduxiong_ins_det/JPEGImages',
- ann_file='xiaoduxiong_ins_det/val.json',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/mask_rcnn_r50_fpn/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-# num_classes 需要设置为包含背景类的类别数,即: 目标类别数量 + 1
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/instance_segmentation.html#maskrcnn
-num_classes = len(train_dataset.labels) + 1
-model = pdx.det.MaskRCNN(num_classes=num_classes)
-model.train(
- num_epochs=12,
- train_dataset=train_dataset,
- train_batch_size=1,
- eval_dataset=eval_dataset,
- learning_rate=0.00125,
- warmup_steps=10,
- lr_decay_epochs=[8, 11],
- save_dir='output/mask_rcnn_r50_fpn',
- use_vdl=True)
diff --git a/new_tutorials/train/detection/yolov3_darknet53.py b/new_tutorials/train/detection/yolov3_darknet53.py
deleted file mode 100644
index 8027a506458aac94de82a915aa8b058d71ba97f7..0000000000000000000000000000000000000000
--- a/new_tutorials/train/detection/yolov3_darknet53.py
+++ /dev/null
@@ -1,48 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-from paddlex.det import transforms
-import paddlex as pdx
-
-# 下载和解压昆虫检测数据集
-insect_dataset = 'https://bj.bcebos.com/paddlex/datasets/insect_det.tar.gz'
-pdx.utils.download_and_decompress(insect_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/det_transforms.html#composedyolotransforms
-train_transforms = transforms.ComposedYOLOv3Transforms(mode='train', shape=[608, 608])
-eval_transforms = transforms.ComposedYOLOv3Transforms(mode='eva', shape=[608, 608])
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/detection.html#vocdetection
-train_dataset = pdx.datasets.VOCDetection(
- data_dir='insect_det',
- file_list='insect_det/train_list.txt',
- label_list='insect_det/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.VOCDetection(
- data_dir='insect_det',
- file_list='insect_det/val_list.txt',
- label_list='insect_det/labels.txt',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/yolov3_darknet/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/detection.html#yolov3
-num_classes = len(train_dataset.labels)
-model = pdx.det.YOLOv3(num_classes=num_classes, backbone='DarkNet53')
-model.train(
- num_epochs=270,
- train_dataset=train_dataset,
- train_batch_size=8,
- eval_dataset=eval_dataset,
- learning_rate=0.000125,
- lr_decay_epochs=[210, 240],
- save_dir='output/yolov3_darknet53',
- use_vdl=True)
diff --git a/new_tutorials/train/segmentation/deeplabv3p.py b/new_tutorials/train/segmentation/deeplabv3p.py
deleted file mode 100644
index cb18fcfad65331d02b04abe3c3a76fa0356fb5b8..0000000000000000000000000000000000000000
--- a/new_tutorials/train/segmentation/deeplabv3p.py
+++ /dev/null
@@ -1,51 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-import paddlex as pdx
-from paddlex.seg import transforms
-
-# 下载和解压视盘分割数据集
-optic_dataset = 'https://bj.bcebos.com/paddlex/datasets/optic_disc_seg.tar.gz'
-pdx.utils.download_and_decompress(optic_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/seg_transforms.html#composedsegtransforms
-train_transforms = transforms.ComposedSegTransforms(mode='train', train_crop_size=[769, 769])
-eval_transforms = transforms.ComposedSegTransforms(mode='eval')
-
-train_transforms.add_augmenters([
- transforms.RandomRotate()
-])
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/semantic_segmentation.html#segdataset
-train_dataset = pdx.datasets.SegDataset(
- data_dir='optic_disc_seg',
- file_list='optic_disc_seg/train_list.txt',
- label_list='optic_disc_seg/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.SegDataset(
- data_dir='optic_disc_seg',
- file_list='optic_disc_seg/val_list.txt',
- label_list='optic_disc_seg/labels.txt',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/deeplab/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/semantic_segmentation.html#deeplabv3p
-num_classes = len(train_dataset.labels)
-model = pdx.seg.DeepLabv3p(num_classes=num_classes)
-model.train(
- num_epochs=40,
- train_dataset=train_dataset,
- train_batch_size=4,
- eval_dataset=eval_dataset,
- learning_rate=0.01,
- save_dir='output/deeplab',
- use_vdl=True)
diff --git a/new_tutorials/train/segmentation/hrnet.py b/new_tutorials/train/segmentation/hrnet.py
deleted file mode 100644
index 98fdd1b925bd4707001fdad56b3ffdc6bb2b58ae..0000000000000000000000000000000000000000
--- a/new_tutorials/train/segmentation/hrnet.py
+++ /dev/null
@@ -1,47 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-import paddlex as pdx
-from paddlex.seg import transforms
-
-# 下载和解压视盘分割数据集
-optic_dataset = 'https://bj.bcebos.com/paddlex/datasets/optic_disc_seg.tar.gz'
-pdx.utils.download_and_decompress(optic_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/seg_transforms.html#composedsegtransforms
-train_transforms = transforms.ComposedSegTransforms(mode='train', train_crop_size=[769, 769])
-eval_transforms = transforms.ComposedSegTransforms(mode='eval')
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/semantic_segmentation.html#segdataset
-train_dataset = pdx.datasets.SegDataset(
- data_dir='optic_disc_seg',
- file_list='optic_disc_seg/train_list.txt',
- label_list='optic_disc_seg/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.SegDataset(
- data_dir='optic_disc_seg',
- file_list='optic_disc_seg/val_list.txt',
- label_list='optic_disc_seg/labels.txt',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/unet/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-
-# https://paddlex.readthedocs.io/zh_CN/latest/apis/models/semantic_segmentation.html#hrnet
-num_classes = len(train_dataset.labels)
-model = pdx.seg.HRNet(num_classes=num_classes)
-model.train(
- num_epochs=20,
- train_dataset=train_dataset,
- train_batch_size=4,
- eval_dataset=eval_dataset,
- learning_rate=0.01,
- save_dir='output/hrnet',
- use_vdl=True)
diff --git a/new_tutorials/train/segmentation/unet.py b/new_tutorials/train/segmentation/unet.py
deleted file mode 100644
index ddf4f7991a690b0d0d506967df0c140f60945e85..0000000000000000000000000000000000000000
--- a/new_tutorials/train/segmentation/unet.py
+++ /dev/null
@@ -1,47 +0,0 @@
-import os
-# 选择使用0号卡
-os.environ['CUDA_VISIBLE_DEVICES'] = '0'
-
-import paddlex as pdx
-from paddlex.seg import transforms
-
-# 下载和解压视盘分割数据集
-optic_dataset = 'https://bj.bcebos.com/paddlex/datasets/optic_disc_seg.tar.gz'
-pdx.utils.download_and_decompress(optic_dataset, path='./')
-
-# 定义训练和验证时的transforms
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/transforms/seg_transforms.html#composedsegtransforms
-train_transforms = transforms.ComposedSegTransforms(mode='train', train_crop_size=[769, 769])
-eval_transforms = transforms.ComposedSegTransforms(mode='eval')
-
-# 定义训练和验证所用的数据集
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/datasets/semantic_segmentation.html#segdataset
-train_dataset = pdx.datasets.SegDataset(
- data_dir='optic_disc_seg',
- file_list='optic_disc_seg/train_list.txt',
- label_list='optic_disc_seg/labels.txt',
- transforms=train_transforms,
- shuffle=True)
-eval_dataset = pdx.datasets.SegDataset(
- data_dir='optic_disc_seg',
- file_list='optic_disc_seg/val_list.txt',
- label_list='optic_disc_seg/labels.txt',
- transforms=eval_transforms)
-
-# 初始化模型,并进行训练
-# 可使用VisualDL查看训练指标
-# VisualDL启动方式: visualdl --logdir output/unet/vdl_log --port 8001
-# 浏览器打开 https://0.0.0.0:8001即可
-# 其中0.0.0.0为本机访问,如为远程服务, 改成相应机器IP
-
-# API说明: https://paddlex.readthedocs.io/zh_CN/latest/apis/models/semantic_segmentation.html#unet
-num_classes = len(train_dataset.labels)
-model = pdx.seg.UNet(num_classes=num_classes)
-model.train(
- num_epochs=20,
- train_dataset=train_dataset,
- train_batch_size=4,
- eval_dataset=eval_dataset,
- learning_rate=0.01,
- save_dir='output/unet',
- use_vdl=True)
diff --git a/paddlex/__init__.py b/paddlex/__init__.py
index b80363f2e6adfdbd6ce712cfec486540753abbb7..6fc8aff1d3fdbc08a7474627bf38f2af17599fb3 100644
--- a/paddlex/__init__.py
+++ b/paddlex/__init__.py
@@ -53,4 +53,4 @@ log_level = 2
from . import interpret
-__version__ = '1.0.6'
+__version__ = '1.0.7'
diff --git a/paddlex/command.py b/paddlex/command.py
index 8198291180b92a061dd633eae863f8ddb17727cb..612bc5f3f2b2c3bbec23f56c2983a722d76e21fc 100644
--- a/paddlex/command.py
+++ b/paddlex/command.py
@@ -1,11 +1,11 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
-#
+#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
-#
+#
# http://www.apache.org/licenses/LICENSE-2.0
-#
+#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -15,6 +15,7 @@
from six import text_type as _text_type
import argparse
import sys
+import paddlex.utils.logging as logging
def arg_parser():
@@ -94,15 +95,15 @@ def main():
if args.export_onnx:
assert args.model_dir is not None, "--model_dir should be defined while exporting onnx model"
assert args.save_dir is not None, "--save_dir should be defined to create onnx model"
- assert args.fixed_input_shape is not None, "--fixed_input_shape should be defined [w,h] to create onnx model, such as [224,224]"
- fixed_input_shape = []
- if args.fixed_input_shape is not None:
- fixed_input_shape = eval(args.fixed_input_shape)
- assert len(
- fixed_input_shape
- ) == 2, "len of fixed input shape must == 2, such as [224,224]"
- model = pdx.load_model(args.model_dir, fixed_input_shape)
+ model = pdx.load_model(args.model_dir)
+ if model.status == "Normal" or model.status == "Prune":
+ logging.error(
+ "Only support inference model, try to export model first as below,",
+ exit=False)
+ logging.error(
+ "paddlex --export_inference --model_dir model_path --save_dir infer_model"
+ )
pdx.convertor.export_onnx_model(model, args.save_dir)
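+ # Hedged CLI sketch of the intended two-step flow (both flags are defined
+ # in arg_parser above):
+ #   paddlex --export_inference --model_dir output/best_model --save_dir infer_model
+ #   paddlex --export_onnx --model_dir infer_model --save_dir onnx_model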
diff --git a/paddlex/convertor.py b/paddlex/convertor.py
index a6888ae1ef9bd764d213125142d355e7e2ca2428..47fc8a82be5ac337206eb0c9dc395aecb862299e 100644
--- a/paddlex/convertor.py
+++ b/paddlex/convertor.py
@@ -1,11 +1,11 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
-#
+#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
-#
+#
# http://www.apache.org/licenses/LICENSE-2.0
-#
+#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -30,119 +30,17 @@ def export_onnx(model_dir, save_dir, fixed_input_shape):
def export_onnx_model(model, save_dir):
- support_list = [
- 'ResNet18', 'ResNet34', 'ResNet50', 'ResNet101', 'ResNet50_vd',
- 'ResNet101_vd', 'ResNet50_vd_ssld', 'ResNet101_vd_ssld', 'DarkNet53',
- 'MobileNetV1', 'MobileNetV2', 'DenseNet121', 'DenseNet161',
- 'DenseNet201'
- ]
- if model.__class__.__name__ not in support_list:
- raise Exception("Model: {} unsupport export to ONNX".format(
- model.__class__.__name__))
- try:
- from fluid.utils import op_io_info, init_name_prefix
- from onnx import helper, checker
- import fluid_onnx.ops as ops
- from fluid_onnx.variables import paddle_variable_to_onnx_tensor, paddle_onnx_weight
- from debug.model_check import debug_model, Tracker
- except Exception as e:
+ if model.model_type == "detector" or model.__class__.__name__ == "FastSCNN":
logging.error(
- "Import Module Failed! Please install paddle2onnx. Related requirements see https://github.com/PaddlePaddle/paddle2onnx."
+ "Only image classifier models and semantic segmentation models(except FastSCNN) are supported to export to ONNX"
)
- raise e
- place = fluid.CPUPlace()
- exe = fluid.Executor(place)
- inference_scope = fluid.global_scope()
- with fluid.scope_guard(inference_scope):
- test_input_names = [
- var.name for var in list(model.test_inputs.values())
- ]
- inputs_outputs_list = ["fetch", "feed"]
- weights, weights_value_info = [], []
- global_block = model.test_prog.global_block()
- for var_name in global_block.vars:
- var = global_block.var(var_name)
- if var_name not in test_input_names\
- and var.persistable:
- weight, val_info = paddle_onnx_weight(
- var=var, scope=inference_scope)
- weights.append(weight)
- weights_value_info.append(val_info)
-
- # Create inputs
- inputs = [
- paddle_variable_to_onnx_tensor(v, global_block)
- for v in test_input_names
- ]
- logging.INFO("load the model parameter done.")
- onnx_nodes = []
- op_check_list = []
- op_trackers = []
- nms_first_index = -1
- nms_outputs = []
- for block in model.test_prog.blocks:
- for op in block.ops:
- if op.type in ops.node_maker:
- # TODO: deal with the corner case that vars in
- # different blocks have the same name
- node_proto = ops.node_maker[str(op.type)](
- operator=op, block=block)
- op_outputs = []
- last_node = None
- if isinstance(node_proto, tuple):
- onnx_nodes.extend(list(node_proto))
- last_node = list(node_proto)
- else:
- onnx_nodes.append(node_proto)
- last_node = [node_proto]
- tracker = Tracker(str(op.type), last_node)
- op_trackers.append(tracker)
- op_check_list.append(str(op.type))
- if op.type == "multiclass_nms" and nms_first_index < 0:
- nms_first_index = 0
- if nms_first_index >= 0:
- _, _, output_op = op_io_info(op)
- for output in output_op:
- nms_outputs.extend(output_op[output])
- else:
- if op.type not in ['feed', 'fetch']:
- op_check_list.append(op.type)
- logging.info('The operator sets to run test case.')
- logging.info(set(op_check_list))
-
- # Create outputs
- # Get the new names for outputs if they've been renamed in nodes' making
- renamed_outputs = op_io_info.get_all_renamed_outputs()
- test_outputs = list(model.test_outputs.values())
- test_outputs_names = [var.name for var in model.test_outputs.values()]
- test_outputs_names = [
- name if name not in renamed_outputs else renamed_outputs[name]
- for name in test_outputs_names
- ]
- outputs = [
- paddle_variable_to_onnx_tensor(v, global_block)
- for v in test_outputs_names
- ]
-
- # Make graph
- onnx_name = 'paddlex.onnx'
- onnx_graph = helper.make_graph(
- nodes=onnx_nodes,
- name=onnx_name,
- initializer=weights,
- inputs=inputs + weights_value_info,
- outputs=outputs)
-
- # Make model
- onnx_model = helper.make_model(
- onnx_graph, producer_name='PaddlePaddle')
-
- # Model check
- checker.check_model(onnx_model)
- if onnx_model is not None:
- onnx_model_file = os.path.join(save_dir, onnx_name)
- if not os.path.exists(save_dir):
- os.mkdir(save_dir)
- with open(onnx_model_file, 'wb') as f:
- f.write(onnx_model.SerializeToString())
- logging.info("Saved converted model to path: %s" % onnx_model_file)
+ try:
+ import x2paddle
+ if x2paddle.__version__ < '0.7.4':
+ logging.error("You need to upgrade x2paddle >= 0.7.4")
+ except ImportError:
+ logging.error(
+ "You need to install x2paddle first: pip install x2paddle>=0.7.4")
+ from x2paddle.op_mapper.paddle_op_mapper import PaddleOpMapper
+ mapper = PaddleOpMapper()
+ mapper.convert(model.test_prog, save_dir)
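+
+# A hedged usage sketch (the paths are assumptions; the model must already be
+# an inference model, as checked in paddlex/command.py):
+#
+#   import paddlex as pdx
+#   model = pdx.load_model('infer_model')
+#   pdx.convertor.export_onnx_model(model, 'onnx_model')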
diff --git a/paddlex/cv/datasets/coco.py b/paddlex/cv/datasets/coco.py
index 97e791be5ed3cac1656fba4429d90f1653bfe1be..264b2da1e6a6aa9e15bf8a2ae9b3fbdc3ee75f1b 100644
--- a/paddlex/cv/datasets/coco.py
+++ b/paddlex/cv/datasets/coco.py
@@ -100,7 +100,7 @@ class CocoDetection(VOCDetection):
gt_score = np.ones((num_bbox, 1), dtype=np.float32)
is_crowd = np.zeros((num_bbox, 1), dtype=np.int32)
difficult = np.zeros((num_bbox, 1), dtype=np.int32)
- gt_poly = None
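+ # Always materialize gt_poly so label_info carries a 'gt_poly'
+ # key even when no 'segmentation' field is present.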
+ gt_poly = [None] * num_bbox
for i, box in enumerate(bboxes):
catid = box['category_id']
@@ -108,8 +108,6 @@ class CocoDetection(VOCDetection):
gt_bbox[i, :] = box['clean_bbox']
is_crowd[i][0] = box['iscrowd']
if 'segmentation' in box:
- if gt_poly is None:
- gt_poly = [None] * num_bbox
gt_poly[i] = box['segmentation']
im_info = {
@@ -121,10 +119,9 @@ class CocoDetection(VOCDetection):
'gt_class': gt_class,
'gt_bbox': gt_bbox,
'gt_score': gt_score,
+ 'gt_poly': gt_poly,
'difficult': difficult
}
- if gt_poly is not None:
- label_info['gt_poly'] = gt_poly
coco_rec = (im_info, label_info)
self.file_list.append([im_fname, coco_rec])
diff --git a/paddlex/cv/datasets/easydata_cls.py b/paddlex/cv/datasets/easydata_cls.py
index 121ae563308c695a0a76fcf383eb6e6bb7f43011..9b6dddc4843616ff0a09712e6766e3ea9552b466 100644
--- a/paddlex/cv/datasets/easydata_cls.py
+++ b/paddlex/cv/datasets/easydata_cls.py
@@ -39,14 +39,14 @@ class EasyDataCls(ImageNet):
线程和'process'进程两种方式。默认为'process'(Windows和Mac下会强制使用thread,该参数无效)。
shuffle (bool): 是否需要对数据集中样本打乱顺序。默认为False。
"""
-
+
def __init__(self,
data_dir,
file_list,
label_list,
transforms=None,
num_workers='auto',
- buffer_size=100,
+ buffer_size=8,
parallel_method='process',
shuffle=False):
super(ImageNet, self).__init__(
@@ -58,7 +58,7 @@ class EasyDataCls(ImageNet):
self.file_list = list()
self.labels = list()
self._epoch = 0
-
+
with open(label_list, encoding=get_encoding(label_list)) as f:
for line in f:
item = line.strip()
@@ -73,8 +73,8 @@ class EasyDataCls(ImageNet):
if not osp.isfile(json_file):
continue
if not osp.exists(img_file):
- raise IOError(
- 'The image file {} is not exist!'.format(img_file))
+ raise IOError('The image file {} does not exist!'.format(
+ img_file))
with open(json_file, mode='r', \
encoding=get_encoding(json_file)) as j:
json_info = json.load(j)
@@ -83,4 +83,3 @@ class EasyDataCls(ImageNet):
self.num_samples = len(self.file_list)
logging.info("{} samples in file {}".format(
len(self.file_list), file_list))
-
\ No newline at end of file
diff --git a/paddlex/cv/datasets/imagenet.py b/paddlex/cv/datasets/imagenet.py
index 99723d3b8f4ec6f8c0b9297f9fe66c1fbc60693f..0986f823add893c6fb746168f3c2bcfa438f5e10 100644
--- a/paddlex/cv/datasets/imagenet.py
+++ b/paddlex/cv/datasets/imagenet.py
@@ -45,7 +45,7 @@ class ImageNet(Dataset):
label_list,
transforms=None,
num_workers='auto',
- buffer_size=100,
+ buffer_size=8,
parallel_method='process',
shuffle=False):
super(ImageNet, self).__init__(
@@ -70,8 +70,8 @@ class ImageNet(Dataset):
continue
full_path = osp.join(data_dir, items[0])
if not osp.exists(full_path):
- raise IOError(
- 'The image file {} is not exist!'.format(full_path))
+ raise IOError('The image file {} does not exist!'.format(
+ full_path))
self.file_list.append([full_path, int(items[1])])
self.num_samples = len(self.file_list)
logging.info("{} samples in file {}".format(
diff --git a/paddlex/cv/datasets/seg_dataset.py b/paddlex/cv/datasets/seg_dataset.py
index 61697e3d799ccb0ca765410a81e7257741acfb44..6e8bfae1ca623ed90a6d583042627cf4aecb2ea6 100644
--- a/paddlex/cv/datasets/seg_dataset.py
+++ b/paddlex/cv/datasets/seg_dataset.py
@@ -1,4 +1,4 @@
-# copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
+# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -28,7 +28,7 @@ class SegDataset(Dataset):
Args:
data_dir (str): 数据集所在的目录路径。
file_list (str): 描述数据集图片文件和对应标注文件的文件路径(文本内每行路径为相对data_dir的相对路)。
- label_list (str): 描述数据集包含的类别信息文件路径。
+ label_list (str): 描述数据集包含的类别信息文件路径。默认值为None。
transforms (list): 数据集中每个样本的预处理/增强算子。
num_workers (int): 数据集中样本在预处理过程中的线程或进程数。默认为4。
buffer_size (int): 数据集中样本在预处理过程中队列的缓存长度,以样本数为单位。默认为100。
@@ -40,7 +40,7 @@ class SegDataset(Dataset):
def __init__(self,
data_dir,
file_list,
- label_list,
+ label_list=None,
transforms=None,
num_workers='auto',
buffer_size=100,
@@ -56,10 +56,11 @@ class SegDataset(Dataset):
self.labels = list()
self._epoch = 0
- with open(label_list, encoding=get_encoding(label_list)) as f:
- for line in f:
- item = line.strip()
- self.labels.append(item)
+ if label_list is not None:
+ with open(label_list, encoding=get_encoding(label_list)) as f:
+ for line in f:
+ item = line.strip()
+ self.labels.append(item)
with open(file_list, encoding=get_encoding(file_list)) as f:
for line in f:
@@ -69,8 +70,8 @@ class SegDataset(Dataset):
full_path_im = osp.join(data_dir, items[0])
full_path_label = osp.join(data_dir, items[1])
if not osp.exists(full_path_im):
- raise IOError(
- 'The image file {} is not exist!'.format(full_path_im))
+ raise IOError('The image file {} does not exist!'.format(
+ full_path_im))
if not osp.exists(full_path_label):
raise IOError('The image file {} is not exist!'.format(
full_path_label))
diff --git a/paddlex/cv/datasets/voc.py b/paddlex/cv/datasets/voc.py
index 9b2e8528c52d5f2ecd6a041bbf7e86f095ea35ac..b701c56847b6e0da9aace3784c4cb8e76dbbed77 100644
--- a/paddlex/cv/datasets/voc.py
+++ b/paddlex/cv/datasets/voc.py
@@ -17,6 +17,7 @@ import copy
import os
import os.path as osp
import random
+import re
import numpy as np
from collections import OrderedDict
import xml.etree.ElementTree as ET
@@ -104,23 +105,60 @@ class VOCDetection(Dataset):
else:
ct = int(tree.find('id').text)
im_id = np.array([int(tree.find('id').text)])
-
- objs = tree.findall('object')
- im_w = float(tree.find('size').find('width').text)
- im_h = float(tree.find('size').find('height').text)
+ pattern = re.compile('