- AlexNet is a classic image classification model. It was proposed by Alex Krizhevsky in 2012 and won the ILSVRC 2012 competition. This PaddleHub Module is based on AlexNet, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DarkNet is an image classification model proposed by Joseph Redmon; it serves as the backbone for feature extraction in YOLOv3. The network stacks consecutive 3x3 and 1x1 convolutions and, like ResNet, uses shortcut connections. This PaddleHub Module is based on DarkNet53, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DenseNet is the model from the CVPR 2017 best paper. It connects every layer to all other layers in a feed-forward fashion, so an L-layer network has L(L+1)/2 direct connections: each layer takes the feature maps of all preceding layers as input, and its own feature maps are passed to all subsequent layers. The dense connections alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. This PaddleHub Module is based on DenseNet121, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DenseNet is the model from the CVPR 2017 best paper. It connects every layer to all other layers in a feed-forward fashion, so an L-layer network has L(L+1)/2 direct connections: each layer takes the feature maps of all preceding layers as input, and its own feature maps are passed to all subsequent layers. The dense connections alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. This PaddleHub Module is based on DenseNet161, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DenseNet is the model from the CVPR 2017 best paper. It connects every layer to all other layers in a feed-forward fashion, so an L-layer network has L(L+1)/2 direct connections: each layer takes the feature maps of all preceding layers as input, and its own feature maps are passed to all subsequent layers. The dense connections alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. This PaddleHub Module is based on DenseNet169, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DenseNet is the model from the CVPR 2017 best paper. It connects every layer to all other layers in a feed-forward fashion, so an L-layer network has L(L+1)/2 direct connections: each layer takes the feature maps of all preceding layers as input, and its own feature maps are passed to all subsequent layers. The dense connections alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. This PaddleHub Module is based on DenseNet201, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DenseNet is the model from the CVPR 2017 best paper. It connects every layer to all other layers in a feed-forward fashion, so an L-layer network has L(L+1)/2 direct connections: each layer takes the feature maps of all preceding layers as input, and its own feature maps are passed to all subsequent layers. The dense connections alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. This PaddleHub Module is based on DenseNet264, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DPN (Dual Path Networks) is the image classification model that won the object localization task of ILSVRC 2017; it combines the core ideas of ResNet and DenseNet. This PaddleHub Module is based on DPN107, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DPN (Dual Path Networks) is the image classification model that won the object localization task of ILSVRC 2017; it combines the core ideas of ResNet and DenseNet. This PaddleHub Module is based on DPN131, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DPN (Dual Path Networks) is the image classification model that won the object localization task of ILSVRC 2017; it combines the core ideas of ResNet and DenseNet. This PaddleHub Module is based on DPN68, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DPN (Dual Path Networks) is the image classification model that won the object localization task of ILSVRC 2017; it combines the core ideas of ResNet and DenseNet. This PaddleHub Module is based on DPN92, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- DPN (Dual Path Networks) is the image classification model that won the object localization task of ILSVRC 2017; it combines the core ideas of ResNet and DenseNet. This PaddleHub Module is based on DPN98, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- ResNeXt is an image classification model proposed by UC San Diego and Facebook AI Research in 2017. It follows the stacking idea of VGG/ResNet and uses a split-transform-merge strategy to increase the number of branches in the network. This PaddleHub Module was trained with weak supervision on a dataset of billions of social-media images and then fine-tuned on ImageNet-2012; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API (a usage sketch follows this list).
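For context, a minimal Python sketch of how one of the classification modules above is typically loaded and used; the module name and image path are placeholders (assumptions for illustration), and the `classification` API shown here is the one documented later in this section.

```python
import cv2
import paddlehub as hub

# Load an image classification module by name (the name below is illustrative;
# any of the modules described above can be substituted).
classifier = hub.Module(name="resnext101_32x8d_wsl")

# The classification API expects a list of BGR ndarrays of shape [H, W, C],
# which is exactly what cv2.imread returns.
image = cv2.imread("/PATH/TO/IMAGE")  # placeholder path
results = classifier.classification(images=[image])

# One dict per input image, mapping label names to probabilities.
print(results[0])
```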
## II.Installation
...
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
```python
import paddlehub as hub
...
```
- classification API.
- **Parameters**
- images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format \[H, W, C\], BGR;
- **Return**
- result (list\[dict\]): classification results; each element in the list is a dict whose keys are label names and whose values are the corresponding probabilities (see the sketch below).
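A hedged sketch of consuming this return value; the module name and image path below are placeholders rather than values taken from this document.

```python
import cv2
import paddlehub as hub

classifier = hub.Module(name="mobilenet_v2_imagenet")  # illustrative module name
image = cv2.imread("/PATH/TO/IMAGE")                   # BGR ndarray, shape [H, W, C]

results = classifier.classification(images=[image])

# results is a list with one dict per input image; report the top-1 label.
for res in results:
    label, prob = max(res.items(), key=lambda kv: kv[1])
    print(f"{label}: {prob:.4f}")
```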
- GoogLeNet is a classic image classification model proposed by Christian Szegedy et al. in 2014, and it won the ILSVRC 2014 competition. This PaddleHub Module is based on GoogLeNet, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- The Inception structure was first introduced in GoogLeNet, so GoogLeNet is also known as Inception-v1. Inception-v4 is an improvement on it that takes advantage of several useful strategies such as batch normalization and residual learning. This module is based on Inception-v4, trained on ImageNet-2012, and can predict an image of size 224 x 224 x 3.
- This module can be used to classify marine organisms.
## II.Installation
...
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
```python
import paddlehub as hub
...
```
- classification API.
- **Parameters**
- images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format \[H, W, C\], BGR;
- **Return**
- result (list\[dict\]): classification results; each element in the list is a dict whose keys are label names and whose values are the corresponding probabilities (a batch-usage sketch follows).
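Because `images` takes a list, several images can be classified in one call; a minimal sketch, assuming placeholder file paths and an illustrative module name:

```python
import cv2
import paddlehub as hub

classifier = hub.Module(name="googlenet_imagenet")  # illustrative module name

# The images parameter accepts a list of BGR ndarrays, so a small batch of
# images can be classified in a single call.
paths = ["/PATH/TO/IMAGE_1", "/PATH/TO/IMAGE_2"]  # placeholder paths
images = [cv2.imread(p) for p in paths]

results = classifier.classification(images=images)
for path, res in zip(paths, results):
    print(path, res)
```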
- MobileNet V2 is an image classification model proposed by Mark Sandler, Andrew Howard et al. in 2018. The MobileNet series is designed as efficient models for mobile and embedded devices, maintaining high classification accuracy with relatively few parameters. This PaddleHub Module is trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- MobileNet V2 is an image classification model proposed by Mark Sandler, Andrew Howard et al. in 2018. The MobileNet series is designed as efficient models for mobile and embedded devices, maintaining high classification accuracy with relatively few parameters. This PaddleHub Module is trained on the ImageNet-2012 dataset with the SSLD distillation strategy provided by PaddleClas; it accepts input images of size 224 x 224 x 3 and supports fine-tuning as well as prediction directly from the command line or through the Python API.
- MobileNetV3 is an image classification model released by Google in 2019. The authors searched the network architecture by combining NAS with NetAdapt and provide Large and Small versions for different resource budgets; compared with MobileNetV2, the new model improves both speed and accuracy. This PaddleHub Module is based on MobileNetV3 Large, trained on the ImageNet-2012 dataset with the SSLD distillation strategy provided by PaddleClas; it accepts input images of size 224 x 224 x 3 and supports fine-tuning (a fine-tuning sketch follows this list) as well as prediction directly from the command line or through the Python API.
- MobileNetV3 is an image classification model released by Google in 2019. The authors searched the network architecture by combining NAS with NetAdapt and provide Large and Small versions for different resource budgets; compared with MobileNetV2, the new model improves both speed and accuracy. This PaddleHub Module is based on MobileNetV3 Small, trained on the ImageNet-2012 dataset with the SSLD distillation strategy provided by PaddleClas; it accepts input images of size 224 x 224 x 3 and supports fine-tuning as well as prediction directly from the command line or through the Python API.
- NASNet is an image classification model obtained by Google through AutoML. This PaddleHub Module is trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- PNASNet is an image classification model obtained by Google through AutoML. This PaddleHub Module is trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- Res2Net, proposed in 2019, is a new improvement on ResNet that can be easily combined with other existing modules. Without increasing the computational load, it outperforms ResNet on datasets such as ImageNet and CIFAR-100. Res2Net has a simple structure and strong performance, and further explores the multi-scale representation capability of CNNs at a finer granularity. This PaddleHub Module is trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
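For the modules above that support fine-tuning, the sketch below follows the general shape of the PaddleHub 2.x image-classification transfer-learning demo; the module name, the Flowers dataset, the label list, and the hyperparameters are illustrative assumptions rather than settings documented in this section.

```python
import paddle
import paddlehub as hub
import paddlehub.vision.transforms as T
from paddlehub.finetune.trainer import Trainer
from paddlehub.datasets import Flowers

# Standard ImageNet-style preprocessing for 224 x 224 x 3 inputs.
transforms = T.Compose(
    [T.Resize((256, 256)),
     T.CenterCrop(224),
     T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])],
    to_rgb=True)

train_set = Flowers(transforms)
val_set = Flowers(transforms, mode='val')

# Load a classification module with a task-specific label list (illustrative
# module name; an SSLD-distilled MobileNet module could be substituted if it
# exposes the same fine-tuning interface).
model = hub.Module(name='resnet50_vd_imagenet_ssld',
                   label_list=['roses', 'tulips', 'daisy', 'sunflowers', 'dandelion'])

optimizer = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='img_classification_ckpt')
trainer.train(train_set, epochs=10, batch_size=32, eval_dataset=val_set, save_interval=1)
```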
## II.Installation
...
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- The ResNet series is one of the most important model families in image classification. The residual unit it introduced effectively addresses the difficulty of training very deep networks, and accuracy improves as model depth increases. ResNet-vd (also known as ResNet-D) is a variant of the original ResNet architecture. This PaddleHub Module is based on ResNet_vd, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports fine-tuning as well as prediction directly from the command line or through the Python API.
- The ResNet series is one of the most important model families in image classification. The residual unit it introduced effectively addresses the difficulty of training very deep networks, and accuracy improves as model depth increases. ResNet-vd (also known as ResNet-D) is a variant of the original ResNet architecture. This PaddleHub Module is based on ResNet_vd and was trained on a Baidu-built dataset covering about 100 thousand categories with more than 40 million labeled images; it accepts input images of size 224 x 224 x 3 and supports fine-tuning.
- ResNet introduced the residual unit to solve the problem of training extremely deep networks and improved prediction accuracy. ResNet-vd is a variant of ResNet. This module is based on ResNet-vd and can classify 8416 kinds of food.
- For more information, please refer to: [Bag of Tricks for Image Classification with Convolutional Neural Networks](https://arxiv.org/pdf/1812.01187.pdf)
- ResNet introduced the residual unit to solve the problem of training extremely deep networks and improved prediction accuracy. ResNet-vd is a variant of ResNet. This module is based on ResNet_vd, trained on the IFAW wild-animal dataset, and can recognize ten kinds of wild animal products.
- Squeeze-and-Excitation Networks (SENet) is an image classification architecture proposed by Momenta in 2017. It models the interdependencies between feature channels and strengthens important features to improve accuracy. SE_ResNeXt adds SE blocks to ResNeXt and won the ILSVRC 2017 competition. This PaddleHub Module is based on SE_ResNeXt101_32x4d, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- Squeeze-and-Excitation Networks (SENet) is an image classification architecture proposed by Momenta in 2017. It models the interdependencies between feature channels and strengthens important features to improve accuracy. SE_ResNeXt adds SE blocks to ResNeXt and won the ILSVRC 2017 competition. This PaddleHub Module is based on SE_ResNeXt50_32x4d, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- ShuffleNet V2 is a lightweight image classification model proposed by Megvii in 2018. It uses pointwise group convolution and channel shuffle to greatly reduce computation while maintaining accuracy. This PaddleHub Module is based on ShuffleNet V2, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- Xception, short for Extreme Inception, is an improvement on Inception V3 proposed by Google in 2016. It replaces the convolutions in Inception V3 with depthwise separable convolutions, so the overall network is a linear stack of depthwise separable convolution layers with residual connections. This PaddleHub Module is based on Xception41, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- Xception, short for Extreme Inception, is an improvement on Inception V3 proposed by Google in 2016. It replaces the convolutions in Inception V3 with depthwise separable convolutions, so the overall network is a linear stack of depthwise separable convolution layers with residual connections. This PaddleHub Module is based on Xception65, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API.
- Xception, short for Extreme Inception, is an improvement on Inception V3 proposed by Google in 2016. It replaces the convolutions in Inception V3 with depthwise separable convolutions, so the overall network is a linear stack of depthwise separable convolution layers with residual connections. This PaddleHub Module is based on Xception71, trained on the ImageNet-2012 dataset; it accepts input images of size 224 x 224 x 3 and supports prediction directly from the command line or through the Python API (an input-preparation sketch follows this list).
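Finally, since the modules above expect 224 x 224 x 3 inputs in BGR channel order, a minimal sketch of preparing such an input by hand and passing it through the classification API; the module name and image path are placeholders, and most modules also resize inputs internally, so the explicit resize is illustrative rather than required.

```python
import cv2
import paddlehub as hub

classifier = hub.Module(name="xception71_imagenet")  # illustrative module name

# cv2.imread returns a BGR ndarray of shape [H, W, C]; resize it to the
# 224 x 224 x 3 size these modules are trained on.
image = cv2.imread("/PATH/TO/IMAGE")  # placeholder path
image = cv2.resize(image, (224, 224))

print(classifier.classification(images=[image]))
```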