Commit 1b014f63 authored by: chenjian

finish object detection docs

Parent 2fa0e187
...@@ -23,7 +23,9 @@
- ### Module Introduction
- Faster_RCNN is a two-stage object detector: it generates region proposals for the input image, extracts features, classifies each proposal and refines the box positions. The network consists of four parts: a ResNet-50 backbone, a region proposal network, RoI Align, and a detection head. This module is pretrained on the COCO2017 dataset, and currently only prediction is supported.
## II.Installation
...@@ -73,7 +75,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image; a usage sketch follows the return description below.
- **Parameters**
...@@ -82,22 +84,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
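- A minimal prediction sketch (the PaddleHub module name `faster_rcnn_resnet50_coco2017` and the image path below are assumptions for illustration):
- ```python
import paddlehub as hub

# Load the pretrained detector (module name assumed)
detector = hub.Module(name="faster_rcnn_resnet50_coco2017")

# Detect objects in a local image; the rendered result is also written to output_dir
results = detector.object_detection(
    paths=["/PATH/TO/IMAGE.jpg"],   # hypothetical image path
    score_thresh=0.5,
    visualization=True,
    output_dir="detection_result")

for res in results:
    for obj in res["data"]:
        print(obj["label"], obj["confidence"], obj["left"], obj["top"], obj["right"], obj["bottom"])
```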
- ```python
...
...@@ -22,7 +22,7 @@
- ### Module Introduction
- Faster_RCNN is a two-stage object detector: it generates region proposals for the input image, extracts features, classifies each proposal and refines the box positions. The network consists of four parts: a ResNet-50 backbone, a region proposal network, RoI Align, and a detection head. This module is pretrained on the COCO2017 dataset, and currently only prediction is supported.
## II.Installation
...@@ -73,7 +73,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image; a GPU usage sketch follows the return description below.
- **Parameters**
...@@ -82,22 +82,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
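- A sketch of running prediction on GPU (the module name `faster_rcnn_resnet50_fpn_coco2017` and the device id are assumptions for illustration):
- ```python
import os
import paddlehub as hub

# Select the GPU before loading the module, as required when use_gpu=True
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # assumed device id

detector = hub.Module(name="faster_rcnn_resnet50_fpn_coco2017")  # module name assumed
results = detector.object_detection(
    paths=["/PATH/TO/IMAGE.jpg"],  # hypothetical image path
    use_gpu=True,
    batch_size=1)
print(results)
```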
- ```python
...
...@@ -4,7 +4,7 @@
| :--- | :---: |
|Category|object detection|
|Network|faster_rcnn|
|Dataset|Baidu Detection Dataset|
|Fine-tuning supported or not|Yes|
|Module Size|317MB|
|Latest update date|2021-02-26|
...@@ -15,7 +15,7 @@
- ### Module Introduction
- Faster_RCNN is a two-stage object detector: it generates region proposals for the input image, extracts features, classifies each proposal and refines the box positions. The network consists of four parts: a ResNet-50 backbone, a region proposal network, RoI Align, and a detection head. This module is a large-scale general detection model trained on 800+ tags, 1.7 million images and more than 10 million annotated boxes; it improves mAP on 8 datasets by 2.06% on average and accuracy at IoU=0.5 by 1.78% on average. Compared with other general detection models, fine-tuning from this module converges faster and reaches better results.
## II.Installation
...@@ -44,38 +44,38 @@
phase='train')
```
- Extracts features for transfer learning; a fine-tuning sketch follows the return description below.
- **Parameters**
- num\_classes (int): number of classes;<br/>
- trainable (bool): whether the parameters are trainable;<br/>
- pretrained (bool): whether to load the pretrained model;<br/>
- phase (str): optional values are 'train' and 'predict'; 'train' is used for training, 'predict' for prediction.
- **Return**
- inputs (dict): model inputs
when phase is 'train', the keys are
- image (Variable): image variable
- im\_size (Variable): image size
- im\_info (Variable): image scaling information
- gt\_class (Variable): ground-truth box classes
- gt\_box (Variable): ground-truth box coordinates
- is\_crowd (Variable): whether a box contains multiple objects
when phase is 'predict', the keys are
- image (Variable): image variable
- im\_size (Variable): image size
- im\_info (Variable): image scaling information
- outputs (dict): model outputs
when phase is 'train', the keys are
- head_features (Variable): extracted features
- rpn\_cls\_loss (Variable): classification loss of the detection boxes
- rpn\_reg\_loss (Variable): regression loss of the detection boxes
- generate\_proposal\_labels (Variable): proposal labels
when phase is 'predict', the keys are
- head_features (Variable): extracted features
- rois (Variable): extracted RoIs
- bbox\_out (Variable): prediction results
- context\_prog (Program): program for transfer learning
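- A minimal transfer-learning sketch built on the `context` interface (the module name `faster_rcnn_resnet50_fpn_venus`, the class count and the training loop are assumptions; it only shows how the returned dicts are typically consumed):
- ```python
import paddlehub as hub

# Load the fine-tunable detector (module name assumed)
module = hub.Module(name="faster_rcnn_resnet50_fpn_venus")

# Build the training graph; inputs/outputs are dicts of Variables, program holds the graph
inputs, outputs, program = module.context(
    num_classes=81,     # assumed: 80 foreground classes + background
    trainable=True,
    pretrained=True,
    phase='train')

# Feed these Variables from your own data reader
image, gt_box, gt_class = inputs['image'], inputs['gt_box'], inputs['gt_class']

# Combine the returned losses, attach an optimizer to `program`, then run it with an Executor
total_loss = outputs['rpn_cls_loss'] + outputs['rpn_reg_loss']
```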
- ```python
def save_inference_model(dirname,
...
...@@ -22,7 +22,7 @@
- ### Module Introduction
- Single Shot MultiBox Detector (SSD) is a one-stage detector. Unlike two-stage detectors, SSD does not generate region proposals; it frames object detection as a regression problem, predicting spatially separated bounding boxes and the associated class probabilities directly from feature maps at multiple scales. This module uses MobileNet-v1 as the backbone, is pretrained on the Pascal VOC dataset, and currently only prediction is supported.
## II.Installation
...@@ -73,7 +73,7 @@
)
```
- Detection API. Detects the positions of all objects in the input image; a sketch using in-memory images follows the return description below.
- **Parameters**
...@@ -90,15 +90,15 @@
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
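- A sketch that feeds decoded images instead of file paths (the module name `ssd_mobilenet_v1_pascal` and the image path are assumptions):
- ```python
import cv2
import paddlehub as hub

detector = hub.Module(name="ssd_mobilenet_v1_pascal")  # module name assumed

# `images` takes a list of ndarrays in [H, W, C] BGR order, which is what cv2.imread returns
img = cv2.imread("/PATH/TO/IMAGE.jpg")  # hypothetical image path
results = detector.object_detection(images=[img], visualization=False)
print(results[0]["data"])
```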
- ```python
def save_inference_model(dirname,
...
...@@ -22,7 +22,8 @@
- ### Module Introduction
- Single Shot MultiBox Detector (SSD) is a one-stage detector. Unlike two-stage detectors, SSD does not generate region proposals; it frames object detection as a regression problem, predicting spatially separated bounding boxes and the associated class probabilities directly from feature maps at multiple scales. This module uses VGG16 as the backbone, is pretrained on the COCO2017 dataset, and can be used for object detection.
## II.Installation
...@@ -72,7 +73,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image; a batch-prediction sketch follows the return description below.
- **Parameters**
...@@ -81,22 +82,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
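- A batch-prediction sketch over several images (the module name `ssd_vgg16_512_coco2017` and the file paths are assumptions):
- ```python
import paddlehub as hub

detector = hub.Module(name="ssd_vgg16_512_coco2017")  # module name assumed

# batch_size controls how many images go through the network per forward pass
image_paths = ["/PATH/TO/IMG_%d.jpg" % i for i in range(8)]  # hypothetical paths
results = detector.object_detection(paths=image_paths, batch_size=4, score_thresh=0.5)

# One result dict per input image, in the same order as `paths`
for path, res in zip(image_paths, results):
    print(path, len(res["data"]), "objects")
```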
- ```python
def save_inference_model(dirname,
...
...@@ -22,7 +22,7 @@
- ### Module Introduction
- YOLOv3 is a one-stage detector proposed by Joseph Redmon and Ali Farhadi; compared with traditional detectors of comparable accuracy, its inference is nearly twice as fast. YOLOv3 divides the input image into grid cells and predicts bounding boxes for each cell; its loss combines localization, confidence and classification terms. This module is pretrained on COCO2017, and currently only prediction is supported.
## II.Installation
...@@ -72,7 +72,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image; a result-parsing sketch follows the return description below.
- **Parameters**
...@@ -81,22 +81,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
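- A sketch of parsing the returned boxes (the module name `yolov3_darknet53_coco2017` and the image path are assumptions):
- ```python
import paddlehub as hub

detector = hub.Module(name="yolov3_darknet53_coco2017")  # module name assumed
results = detector.object_detection(paths=["/PATH/TO/IMAGE.jpg"], visualization=True)

for res in results:
    # Each detected object carries its label, confidence and pixel coordinates
    for obj in res["data"]:
        print("{label}: {confidence:.2f} box=({left}, {top}, {right}, {bottom})".format(**obj))
    # Path of the rendered image, present because visualization=True
    print("saved to:", res.get("save_path"))
```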
- ```python
def save_inference_model(dirname,
...
...@@ -4,7 +4,7 @@
| :--- | :---: |
|Category|object detection|
|Network|YOLOv3|
|Dataset|Baidu Pedestrian Dataset|
|Fine-tuning supported or not|No|
|Module Size|238MB|
|Latest update date|2021-03-15|
...@@ -22,7 +22,7 @@
- ### Module Introduction
- Pedestrian detection is an object detection task that determines whether pedestrians are present in an image and localizes them with bounding boxes. Combined with pedestrian tracking and re-identification, it is widely used in autonomous driving, intelligent video surveillance, human behavior analysis, passenger-flow statistics and intelligent transportation. The yolov3_darknet53_pedestrian module uses YOLOv3 with a DarkNet53 backbone, trained on a large-scale Baidu-built pedestrian dataset; currently only prediction is supported.
## II.Installation
...@@ -72,7 +72,7 @@
visualization=True)
```
- Detection API. Detects the positions of all pedestrians in the input image; a usage sketch follows the return description below.
- **Parameters**
...@@ -81,7 +81,7 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
...@@ -89,15 +89,15 @@
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
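- A sketch of pedestrian detection with a stricter confidence threshold (the image path, threshold value and output directory are illustrative):
- ```python
import paddlehub as hub

pedestrian_detector = hub.Module(name="yolov3_darknet53_pedestrian")
results = pedestrian_detector.object_detection(
    paths=["/PATH/TO/STREET_SCENE.jpg"],  # hypothetical image path
    score_thresh=0.6,                     # keep only fairly confident pedestrians
    visualization=True,
    output_dir="pedestrian_detection_result")

print("pedestrians found:", len(results[0]["data"]))
```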
- ```python
def save_inference_model(dirname,
...
...@@ -4,7 +4,7 @@
| :--- | :---: |
|Category|object detection|
|Network|YOLOv3|
|Dataset|Baidu Vehicle Dataset|
|Fine-tuning supported or not|No|
|Module Size|238MB|
|Latest update date|2021-03-15|
...@@ -22,7 +22,8 @@
- ### Module Introduction
- Vehicle detection is an important and challenging task in urban traffic surveillance; the difficulty lies in accurately locating and classifying relatively small vehicles in complex scenes. This module uses YOLOv3 with a DarkNet53 backbone, trained on a large-scale Baidu-built vehicle dataset, and recognizes vehicle types such as car, truck, bus, motorbike and tricycle; currently only prediction is supported.
## II.Installation
...@@ -72,7 +73,7 @@
visualization=True)
```
- Detection API. Detects the positions of all vehicles in the input image; a vehicle-counting sketch follows the return description below.
- **Parameters**
...@@ -81,22 +82,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
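- A sketch that counts detected vehicles by type (the module name `yolov3_darknet53_vehicles` and the image path are assumptions):
- ```python
from collections import Counter

import paddlehub as hub

vehicle_detector = hub.Module(name="yolov3_darknet53_vehicles")  # module name assumed
results = vehicle_detector.object_detection(paths=["/PATH/TO/TRAFFIC.jpg"])  # hypothetical path

# Tally the predicted labels, e.g. car / truck / bus / motorbike / tricycle
counts = Counter(obj["label"] for obj in results[0]["data"])
print(counts)
```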
- ```python
def save_inference_model(dirname,
...
...@@ -4,7 +4,7 @@
| :--- | :---: |
|Category|object detection|
|Network|YOLOv3|
|Dataset|Baidu Detection Dataset|
|Fine-tuning supported or not|Yes|
|Module Size|501MB|
|Latest update date|2021-02-26|
...@@ -15,7 +15,7 @@
- ### Module Introduction
- YOLOv3 is a one-stage detector proposed by Joseph Redmon and Ali Farhadi; compared with traditional detectors of comparable accuracy, its inference is nearly twice as fast. YOLOv3 divides the input image into grid cells and predicts bounding boxes for each cell; its loss combines localization, confidence and classification terms. This module is a large-scale general detection model trained on 800+ tags, 1.7 million images and more than 10 million annotated boxes; it improves mAP on 8 datasets by 5.36% on average and accuracy at IoU=0.5 by 4.53%. Compared with other general detection models, fine-tuning from this module converges faster and reaches better results.
## II.Installation
...@@ -43,20 +43,20 @@
get_prediction=False)
```
- Extracts features for transfer learning; a feature-extraction sketch follows the return description below.
- **Parameters**
- trainable (bool): whether the parameters are trainable;<br/>
- pretrained (bool): whether to load the pretrained model;<br/>
- get\_prediction (bool): whether to perform prediction.
- **Return**
- inputs (dict): model inputs, a dict with two keys: 'image' and 'im\_size'
- image (Variable): image variable
- im\_size (Variable): image size
- outputs (dict): model outputs; if get\_prediction is False, the keys are 'head\_features' and 'body\_features', otherwise 'bbox\_out'
- context\_prog (Program): program for transfer learning
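- A minimal feature-extraction sketch built on the `context` interface (the module name `yolov3_darknet53_venus` is an assumption; the downstream head, loss and training loop are omitted):
- ```python
import paddlehub as hub

module = hub.Module(name="yolov3_darknet53_venus")  # module name assumed

# get_prediction=False exposes intermediate features for transfer learning
inputs, outputs, program = module.context(
    trainable=True,
    pretrained=True,
    get_prediction=False)

image = inputs["image"]                   # feed decoded images here
im_size = inputs["im_size"]               # feed the original image sizes here
head_features = outputs["head_features"]  # features to build a custom head on
# ... add your own head and loss to `program`, then train it with an Executor
```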
- ```python
def object_detection(paths=None,
...@@ -68,7 +68,7 @@
output_dir='detection_result')
```
- Detection API. Detects the positions of all objects in the input image.
- **Parameters**
...@@ -76,7 +76,7 @@
- images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format [H, W, C], BGR;
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
- output_dir (str): save path of images;
...@@ -85,12 +85,12 @@
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
- ```python
def save_inference_model(dirname,
...
...@@ -23,7 +23,7 @@
- ### Module Introduction
- YOLOv3 is a one-stage detector proposed by Joseph Redmon and Ali Farhadi; compared with traditional detectors of comparable accuracy, its inference is nearly twice as fast. YOLOv3 divides the input image into grid cells and predicts bounding boxes for each cell; its loss combines localization, confidence and classification terms. This module is pretrained on COCO2017, and currently only prediction is supported.
## II.Installation
...@@ -73,7 +73,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image; a cropping sketch follows the return description below.
- **Parameters**
...@@ -82,22 +82,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
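- A sketch that crops each detected object out of the original image using the returned coordinates (the module name `yolov3_resnet34_coco2017` and the image path are assumptions):
- ```python
import cv2
import paddlehub as hub

detector = hub.Module(name="yolov3_resnet34_coco2017")  # module name assumed
img_path = "/PATH/TO/IMAGE.jpg"                         # hypothetical image path
res = detector.object_detection(paths=[img_path])[0]

# Use the returned pixel coordinates to cut out each detection
img = cv2.imread(img_path)
for i, obj in enumerate(res["data"]):
    left, top = int(obj["left"]), int(obj["top"])
    right, bottom = int(obj["right"]), int(obj["bottom"])
    cv2.imwrite("crop_%d_%s.jpg" % (i, obj["label"]), img[top:bottom, left:right])
```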
- ```python
def save_inference_model(dirname,
...
...@@ -22,7 +22,7 @@
- ### Module Introduction
- YOLOv3 is a one-stage detector proposed by Joseph Redmon and Ali Farhadi; compared with traditional detectors of comparable accuracy, its inference is nearly twice as fast. YOLOv3 divides the input image into grid cells and predicts bounding boxes for each cell; its loss combines localization, confidence and classification terms. This module is pretrained on COCO2017, and currently only prediction is supported.
## II.Installation
...@@ -72,7 +72,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image.
- **Parameters**
...@@ -81,22 +81,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
- ```python
def save_inference_model(dirname,
...
...@@ -22,7 +22,7 @@
- ### Module Introduction
- YOLOv3 is a one-stage detector proposed by Joseph Redmon and Ali Farhadi; compared with traditional detectors of comparable accuracy, its inference is nearly twice as fast. YOLOv3 divides the input image into grid cells and predicts bounding boxes for each cell; its loss combines localization, confidence and classification terms. This module is pretrained on COCO2017, and currently only prediction is supported.
## II.Installation
...@@ -72,7 +72,7 @@
visualization=True)
```
- Detection API. Detects the positions of all objects in the input image.
- **Parameters**
...@@ -81,22 +81,22 @@
- batch_size (int): batch size;
- use_gpu (bool): use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- output_dir (str): save path of images;
- score\_thresh (float): confidence threshold;<br/>
- visualization (bool): whether to save the results as image files;
**NOTE:** provide the input data through either `paths` or `images`
- **Return**
- res (list\[dict\]): detection results, one dict per input image
- data (list): detected objects, each element in the list is a dict
- confidence (float): confidence score of the detection
- label (str): label of the detected object
- left (int): x coordinate of the upper-left corner of the detection box
- top (int): y coordinate of the upper-left corner of the detection box
- right (int): x coordinate of the lower-right corner of the detection box
- bottom (int): y coordinate of the lower-right corner of the detection box
- save\_path (str, optional): save path of the visualized result (present only when visualization=True)
- ```python
def save_inference_model(dirname,
...