diff --git a/README.md b/README.md index 1fa319e18b770c2cab14fd3d86b6ebe8fa8f87c9..b7e62e237f448824a23138f64384deaa496c4b5d 100644 --- a/README.md +++ b/README.md @@ -178,8 +178,8 @@ The release of this project is certified by the Apache 2.0 l - + @@ -187,6 +187,7 @@ The release of this project is certified by the Apache 2.0 l + @@ -201,6 +202,7 @@ The release of this project is certified by the Apache 2.0 l + @@ -227,3 +229,5 @@ We welcome you to contribute code to PaddleHub, and thank you for your feedback. * Many thanks to [paopjian](https://github.com/paopjian) for correcting the wrong website address [#1424](https://github.com/PaddlePaddle/PaddleHub/issues/1424) * Many thanks to [Wgm-Inspur](https://github.com/Wgm-Inspur) for correcting the demo errors in readme, and updating the RNN illustration in the text classification and sequence labeling demo * Many thanks to [zl1271](https://github.com/zl1271) for fixing serving docs typo +* Many thanks to [AK391](https://github.com/AK391) for adding the webdemo of UGATIT and deoldify models in Hugging Face spaces +* Many thanks to [itegel](https://github.com/itegel) for fixing quick start docs typo diff --git a/README_ch.md b/README_ch.md index ecbc273fb5b5bbbce499a4f61b39142c08754769..4d4efd58b4304477fd8d3737a476810610e50e80 100644 --- a/README_ch.md +++ b/README_ch.md @@ -195,8 +195,8 @@ print(results) - + @@ -204,6 +204,7 @@ print(results) + @@ -218,6 +219,7 @@ print(results) + @@ -243,3 +245,5 @@ print(results) * 非常感谢[paopjian](https://github.com/paopjian)修改了中文readme模型搜索指向的的网站地址错误[#1424](https://github.com/PaddlePaddle/PaddleHub/issues/1424) * 非常感谢[Wgm-Inspur](https://github.com/Wgm-Inspur)修复了readme中的代码示例问题,并优化了文本分类、序列标注demo中的RNN示例图 * 非常感谢[zl1271](https://github.com/zl1271)修复了serving文档中的错别字 +* 非常感谢[AK391](https://github.com/AK391)在Hugging Face spaces中添加了UGATIT和deoldify模型的web demo +* 非常感谢[itegel](https://github.com/itegel)修复了快速开始文档中的错别字 diff --git a/docs/docs_en/visualization.md b/docs/docs_en/visualization.md index be170a7ccdb0bab697e52b9f6c9b25ae5f125bf2..43dd60ea6a7ff52c3f912acad1bb4ce6149d8469 100644 --- a/docs/docs_en/visualization.md +++ b/docs/docs_en/visualization.md @@ -39,6 +39,8 @@ +**Deoldify Huggingface Web Demo**: Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/akhaliq/deoldify) + ### Image Generation - Including portrait cartoonization, street scene cartoonization, and style transfer. - Many thanks to CopyRight@[PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN)、CopyRight@[AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2)for the pre-trained models. @@ -46,6 +48,8 @@ +**UGATIT Selfie2anime Huggingface Web Demo**: Integrated to [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/akhaliq/U-GAT-IT-selfie2anime) + ### Object Detection - Pedestrian detection, vehicle detection, and more industrial-grade ultra-large-scale pretrained models are provided. 
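+**Note**: the Hugging Face Spaces demos above can also be queried over HTTP. The sketch below is illustrative only — the endpoint path and payload layout are assumptions based on Gradio's generic `/api/predict/` interface, and may not match what a particular Space actually exposes.
+
+```python
+import base64
+import requests
+
+# Assumed URL layout for a Gradio-backed Space; verify against the Space itself.
+url = "https://hf.space/embed/akhaliq/U-GAT-IT-selfie2anime/+/api/predict/"
+
+with open("/PATH/TO/IMAGE", "rb") as f:
+    img_b64 = base64.b64encode(f.read()).decode("utf8")
+
+# Gradio wraps inputs in a "data" list; the data-URL prefix is also an assumption.
+r = requests.post(url, json={"data": ["data:image/png;base64," + img_b64]})
+print(r.json())
+```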
diff --git a/modules/image/Image_editing/colorization/user_guided_colorization/README.md b/modules/image/Image_editing/colorization/user_guided_colorization/README.md new file mode 100644 index 0000000000000000000000000000000000000000..390f04e1500e1d3d0ae1215f798bb9f7902f1fdc --- /dev/null +++ b/modules/image/Image_editing/colorization/user_guided_colorization/README.md @@ -0,0 +1,204 @@ +# user_guided_colorization + +|模型名称|user_guided_colorization| +| :--- | :---: | +|类别|图像-图像编辑| +|网络| Local and Global Hints Network | +|数据集|ILSVRC 2012| +|是否支持Fine-tuning|是| +|模型大小|131MB| +|指标|-| +|最新更新日期|2021-02-26| + + +## 一、模型基本信息 + +- ### 模型介绍 + +- ### 应用效果展示 + + - 样例结果示例(左为原图,右为效果图): +


+ + - user_guided_colorization 是基于''Real-Time User-Guided Image Colorization with Learned Deep Priors"的着色模型,该模型利用预先提供的着色块对图像进行着色。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 + +- ### 2、安装 + - ```shell + $ hub install user_guided_colorization + ``` + + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1.命令行预测 + + ```shell + $ hub run user_guided_colorization --input_path "/PATH/TO/IMAGE" + ``` +- ### 2.预测代码示例 + + ```python + import paddle + import paddlehub as hub + + if __name__ == '__main__': + + model = hub.Module(name='user_guided_colorization') + model.set_config(prob=0.1) + result = model.predict(images=['/PATH/TO/IMAGE']) + ``` +- ### 3.如何开始Fine-tune + + - 在完成安装PaddlePaddle与PaddleHub后,通过执行`python train.py`即可开始使用user_guided_colorization模型对[Canvas](../../docs/reference/datasets.md#class-hubdatasetsCanvas)等数据集进行Fine-tune。 + + - 代码步骤 + + - Step1: 定义数据预处理方式 + - ```python + import paddlehub.vision.transforms as T + + transform = T.Compose([T.Resize((256, 256), interpolation='NEAREST'), + T.RandomPaddingCrop(crop_size=176), + T.RGB2LAB()], to_rgb=True) + ``` + + - `transforms` 数据增强模块定义了丰富的数据预处理方式,用户可按照需求替换自己需要的数据预处理方式。 + + - Step2: 下载数据集并使用 + - ```python + from paddlehub.datasets import Canvas + + color_set = Canvas(transform=transform, mode='train') + ``` + + * `transforms`: 数据预处理方式。 + * `mode`: 选择数据模式,可选项有 `train`, `test`, `val`, 默认为`train`。 + + * `hub.datasets.Canvas()` 会自动从网络下载数据集并解压到用户目录下`$HOME/.paddlehub/dataset`目录。 + + + - Step3: 加载预训练模型 + + - ```python + model = hub.Module(name='user_guided_colorization', load_checkpoint=None) + model.set_config(classification=True, prob=1) + ``` + * `name`:加载模型的名字。 + * `load_checkpoint`: 是否加载自己训练的模型,若为None,则加载提供的模型默认参数。 + * `classification`: 着色模型分两部分训练,开始阶段`classification`设置为True, 用于浅层网络训练。训练后期将`classification`设置为False, 用于训练网络的输出层。 + * `prob`: 每张输入图不加一个先验彩色块的概率,默认为1,即不加入先验彩色块。例如,当`prob`设定为0.9时,一张图上有两个先验彩色块的概率为(1-0.9)*(1-0.9)*0.9=0.009. 
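+  - **Worked example for `prob`**: the sampling described above is geometric — each additional prior color hint is added with probability (1 - prob) — so the chance of exactly k hints is (1 - prob)^k * prob. The sketch below is illustrative only and is not part of the module; `hint_count_probs` is a hypothetical helper name.
+
+  - ```python
+    # P(exactly k prior color hints) = (1 - prob)**k * prob, following the
+    # geometric sampling described above (an assumption of this sketch).
+    def hint_count_probs(prob, max_k=5):
+        return {k: (1 - prob) ** k * prob for k in range(max_k + 1)}
+
+    print(hint_count_probs(0.9))  # P(0)=0.9, P(1)=0.09, P(2)=0.009, ...
+    print(hint_count_probs(0.1))  # the README's prediction setting adds hints far more often
+    ```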
+ + - Step4: 选择优化策略和运行配置 + + ```python + optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters()) + trainer = Trainer(model, optimizer, checkpoint_dir='img_colorization_ckpt_cls_1') + trainer.train(color_set, epochs=201, batch_size=25, eval_dataset=color_set, log_interval=10, save_interval=10) + ``` + + + - 运行配置 + + - `Trainer` 主要控制Fine-tune的训练,包含以下可控制的参数: + + * `model`: 被优化模型; + * `optimizer`: 优化器选择; + * `use_vdl`: 是否使用vdl可视化训练过程; + * `checkpoint_dir`: 保存模型参数的地址; + * `compare_metrics`: 保存最优模型的衡量指标; + + - `trainer.train` 主要控制具体的训练过程,包含以下可控制的参数: + + * `train_dataset`: 训练时所用的数据集; + * `epochs`: 训练轮数; + * `batch_size`: 训练的批大小,如果使用GPU,请根据实际情况调整batch_size; + * `num_workers`: works的数量,默认为0; + * `eval_dataset`: 验证集; + * `log_interval`: 打印日志的间隔, 单位为执行批训练的次数。 + * `save_interval`: 保存模型的间隔频次,单位为执行训练的轮数。 + + - 模型预测 + + - 当完成Fine-tune后,Fine-tune过程在验证集上表现最优的模型会被保存在`${CHECKPOINT_DIR}/best_model`目录下,其中`${CHECKPOINT_DIR}`目录为Fine-tune时所选择的保存checkpoint的目录。 我们使用该模型来进行预测。predict.py脚本如下: + + - ```python + import paddle + import paddlehub as hub + + if __name__ == '__main__': + model = hub.Module(name='user_guided_colorization', load_checkpoint='/PATH/TO/CHECKPOINT') + model.set_config(prob=0.1) + result = model.predict(images=['house.png']) + ``` + + + - **NOTE:** 进行预测时,所选择的module,checkpoint_dir,dataset必须和Fine-tune所用的一样。若想获取油画风着色效果,请下载参数文件[油画着色](https://paddlehub.bj.bcebos.com/dygraph/models/canvas_rc.pdparams) + +## 四、服务部署 + +- PaddleHub Serving可以部署一个在线着色任务服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + + - ```shell + $ hub serving start -m user_guided_colorization + ``` + + - 这样就完成了一个着色任务服务化API的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + ```python + import requests + import json + import cv2 + import base64 + import numpy as np + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + def base64_to_cv2(b64str): + data = base64.b64decode(b64str.encode('utf8')) + data = np.fromstring(data, np.uint8) + data = cv2.imdecode(data, cv2.IMREAD_COLOR) + return data + + # 发送HTTP请求 + org_im = cv2.imread('/PATH/TO/IMAGE') + data = {'images':[cv2_to_base64(org_im)]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/user_guided_colorization" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + data = base64_to_cv2(r.json()["results"]['data'][0]['fake_reg']) + cv2.imwrite('color.png', data) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + + diff --git a/modules/image/Image_editing/super_resolution/dcscn/README.md b/modules/image/Image_editing/super_resolution/dcscn/README.md index da9bfa44b9fdc496e52ac60f2c810c959fbf52eb..15722b2f2e03999f33597fc8f224d22b9a3d6334 100644 --- a/modules/image/Image_editing/super_resolution/dcscn/README.md +++ b/modules/image/Image_editing/super_resolution/dcscn/README.md @@ -1,134 +1,173 @@ -## 模型概述 +# dcscn -DCSCN是基于Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network设计的轻量化超分辨模型。该模型使用残差结构和跳连的方式构建网络来提取局部和全局特征,同时使用并行1*1的卷积网络学习细节特征提升模型性能。该模型提供的超分倍数为2倍。 -## 命令行预测 +|模型名称|dcscn| +| :--- | :---: | +|类别|图像-图像编辑| +|网络|dcscn| +|数据集|DIV2k| +|是否支持Fine-tuning|否| +|模型大小|260KB| +|指标|PSNR37.63| +|最新更新日期|2021-02-26| -``` -$ hub run dcscn --input_path "/PATH/TO/IMAGE" -``` +## 一、模型基本信息 -## API +- ### 应用效果展示 + + - 样例结果示例(左为原图,右为效果图): +


-```python -def reconstruct(self, - images=None, - paths=None, - use_gpu=False, - visualization=False, - output_dir="dcscn_output") -``` -预测API,用于图像超分辨率。 +- ### 模型介绍 -**参数** + - DCSCN是基于Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network设计的轻量化超分辨模型。该模型使用残差结构和跳连的方式构建网络来提取局部和全局特征,同时使用并行1*1的卷积网络学习细节特征提升模型性能。该模型提供的超分倍数为2倍。 -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; -* paths (list\[str\]): 图片的路径; -* use\_gpu (bool): 是否使用 GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置; -* visualization (bool): 是否将识别结果保存为图片文件; -* output\_dir (str): 图片的保存路径。 + - 更多详情请参考:[dcscn](https://github.com/jiny2001/dcscn-super-resolution) -**返回** +## 二、安装 -* res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,关键字有 'save\_path', 'data',对应的取值为: - * save\_path (str, optional): 可视化图片的保存路径(仅当visualization=True时存在); - * data (numpy.ndarray): 超分辨后图像。 +- ### 1、环境依赖 -```python -def save_inference_model(self, - dirname='dcscn_save_model', - model_filename=None, - params_filename=None, - combined=False) -``` + - paddlepaddle >= 2.0.0 -将模型保存到指定路径。 + - paddlehub >= 2.0.0 -**参数** -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 +- ### 2、安装 + - ```shell + $ hub install dcscn + ``` -## 代码示例 + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -```python -import cv2 -import paddlehub as hub +## 三、模型API预测 +- ### 1、命令行预测 -sr_model = hub.Module(name='dcscn') -im = cv2.imread('/PATH/TO/IMAGE').astype('float32') -#visualization=True可以用于查看超分图片效果,可设置为False提升运行速度。 -res = sr_model.reconstruct(images=[im], visualization=True) -print(res[0]['data']) -sr_model.save_inference_model() -``` + - ``` + $ hub run dcscn --input_path "/PATH/TO/IMAGE" + ``` +- ### 2、预测代码示例 -## 服务部署 + ```python + import cv2 + import paddlehub as hub -PaddleHub Serving可以部署一个图像超分的在线服务。 + sr_model = hub.Module(name='dcscn') + im = cv2.imread('/PATH/TO/IMAGE').astype('float32') + #visualization=True可以用于查看超分图片效果,可设置为False提升运行速度。 + res = sr_model.reconstruct(images=[im], visualization=True) + print(res[0]['data']) + sr_model.save_inference_model() + ``` -## 第一步:启动PaddleHub Serving +- ### 3、API -运行启动命令: + - ```python + def reconstruct(self, + images=None, + paths=None, + use_gpu=False, + visualization=False, + output_dir="dcscn_output") + ``` -```shell -$ hub serving start -m dcscn -``` + - 预测API,用于图像超分辨率。 -这样就完成了一个超分任务的服务化API的部署,默认端口号为8866。 + - **参数** -**NOTE:** 如使用GPU预测,则需要在启动服务之前,设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + * images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; + * paths (list\[str\]): 图片的路径; + * use\_gpu (bool): 是否使用 GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置; + * visualization (bool): 是否将识别结果保存为图片文件; + * output\_dir (str): 图片的保存路径。 -## 第二步:发送预测请求 + - **返回** -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + * res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,关键字有 'save\_path', 'data',对应的取值为: + * save\_path (str, optional): 可视化图片的保存路径(仅当visualization=True时存在); + * data (numpy.ndarray): 超分辨后图像。 -```python -import requests -import json -import base64 + - ```python + def save_inference_model(self, + dirname='dcscn_save_model', + model_filename=None, + params_filename=None, + combined=False) + ``` -import cv2 -import numpy as np + - 将模型保存到指定路径。 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', 
image)[1] - return base64.b64encode(data.tostring()).decode('utf8') -def base64_to_cv2(b64str): - data = base64.b64decode(b64str.encode('utf8')) - data = np.fromstring(data, np.uint8) - data = cv2.imdecode(data, cv2.IMREAD_COLOR) - return data + - **参数** -# 发送HTTP请求 + * dirname: 存在模型的目录名称 + * model\_filename: 模型文件名称,默认为\_\_model\_\_ + * params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) + * combined: 是否将参数保存到统一的一个文件中 -org_im = cv2.imread('/PATH/TO/IMAGE') -data = {'images':[cv2_to_base64(org_im)]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/dcscn" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) -sr = np.expand_dims(cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY), axis=2) -shape =sr.shape -org_im = cv2.cvtColor(org_im, cv2.COLOR_BGR2YUV) -uv = cv2.resize(org_im[...,1:], (shape[1], shape[0]), interpolation=cv2.INTER_CUBIC) -combine_im = cv2.cvtColor(np.concatenate((sr, uv), axis=2), cv2.COLOR_YUV2BGR) -cv2.imwrite('dcscn_X2.png', combine_im) -print("save image as dcscn_X2.png") -``` -### 查看代码 +## 四、服务部署 -https://github.com/jiny2001/dcscn-super-resolution +- PaddleHub Serving可以部署一个图像超分的在线服务。 +- ### 第一步:启动PaddleHub Serving + - 运行启动命令: -### 依赖 + - ```shell + $ hub serving start -m dcscn + ``` -paddlepaddle >= 1.8.0 + - 这样就完成了一个超分任务的服务化API的部署,默认端口号为8866。 -paddlehub >= 1.7.1 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + + - ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + ```python + import requests + import json + import base64 + + import cv2 + import numpy as np + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + def base64_to_cv2(b64str): + data = base64.b64decode(b64str.encode('utf8')) + data = np.fromstring(data, np.uint8) + data = cv2.imdecode(data, cv2.IMREAD_COLOR) + return data + + # 发送HTTP请求 + + org_im = cv2.imread('/PATH/TO/IMAGE') + data = {'images':[cv2_to_base64(org_im)]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/dcscn" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + sr = np.expand_dims(cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY), axis=2) + shape =sr.shape + org_im = cv2.cvtColor(org_im, cv2.COLOR_BGR2YUV) + uv = cv2.resize(org_im[...,1:], (shape[1], shape[0]), interpolation=cv2.INTER_CUBIC) + combine_im = cv2.cvtColor(np.concatenate((sr, uv), axis=2), cv2.COLOR_YUV2BGR) + cv2.imwrite('dcscn_X2.png', combine_im) + print("save image as dcscn_X2.png") + ``` + + +## 五、更新历史 + + +* 1.0.0 + + 初始发布 diff --git a/modules/image/Image_editing/super_resolution/falsr_a/README.md b/modules/image/Image_editing/super_resolution/falsr_a/README.md index 2981753ca3512962fc3a05c60df8ef2203e78323..f1b98a651387342bffb3397a3f4ada31cc61411d 100644 --- a/modules/image/Image_editing/super_resolution/falsr_a/README.md +++ b/modules/image/Image_editing/super_resolution/falsr_a/README.md @@ -1,126 +1,169 @@ -## 模型概述 +# falsr_a -falsr_a是基于Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search设计的轻量化超分辨模型。该模型使用多目标方法处理超分问题,同时使用基于混合控制器的弹性搜索策略来提升模型性能。该模型提供的超分倍数为2倍。 -## 命令行预测 +|模型名称|falsr_a| +| :--- | :---: | +|类别|图像-图像编辑| +|网络|falsr_a| +|数据集|DIV2k| +|是否支持Fine-tuning|否| +|模型大小|8.9MB| +|指标|PSNR37.82| +|最新更新日期|2021-02-26| -``` -$ hub run falsr_a --input_path "/PATH/TO/IMAGE" -``` +## 一、模型基本信息 -## API +- ### 应用效果展示 + + - 样例结果示例(左为原图,右为效果图): +


-```python -def reconstruct(self, - images=None, - paths=None, - use_gpu=False, - visualization=False, - output_dir="falsr_a_output") -``` -预测API,用于图像超分辨率。 +- ### 模型介绍 -**参数** + - falsr_a是基于Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search设计的轻量化超分辨模型。该模型使用多目标方法处理超分问题,同时使用基于混合控制器的弹性搜索策略来提升模型性能。该模型提供的超分倍数为2倍。 -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; -* paths (list\[str\]): 图片的路径; -* use\_gpu (bool): 是否使用 GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置; -* visualization (bool): 是否将识别结果保存为图片文件; -* output\_dir (str): 图片的保存路径。 + - 更多详情请参考:[falsr_a](https://github.com/xiaomi-automl/FALSR) -**返回** +## 二、安装 -* res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,关键字有 'save\_path', 'data',对应的取值为: - * save\_path (str, optional): 可视化图片的保存路径(仅当visualization=True时存在); - * data (numpy.ndarray): 超分辨后图像。 +- ### 1、环境依赖 -```python -def save_inference_model(self, - dirname='falsr_a_save_model', - model_filename=None, - params_filename=None, - combined=False) -``` + - paddlepaddle >= 2.0.0 -将模型保存到指定路径。 + - paddlehub >= 2.0.0 -**参数** -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 +- ### 2、安装 + - ```shell + $ hub install falsr_a + ``` -## 代码示例 + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -```python -import cv2 -import paddlehub as hub +## 三、模型API预测 +- ### 1、命令行预测 -sr_model = hub.Module(name='falsr_a') -im = cv2.imread('/PATH/TO/IMAGE').astype('float32') -#visualization=True可以用于查看超分图片效果,可设置为False提升运行速度。 -res = sr_model.reconstruct(images=[im], visualization=True) -print(res[0]['data']) -sr_model.save_inference_model() -``` + - ``` + $ hub run falsr_a --input_path "/PATH/TO/IMAGE" + ``` +- ### 2、预测代码示例 -## 服务部署 + ```python + import cv2 + import paddlehub as hub -PaddleHub Serving可以部署一个图像超分的在线服务。 + sr_model = hub.Module(name='falsr_a') + im = cv2.imread('/PATH/TO/IMAGE').astype('float32') + #visualization=True可以用于查看超分图片效果,可设置为False提升运行速度。 + res = sr_model.reconstruct(images=[im], visualization=True) + print(res[0]['data']) + sr_model.save_inference_model() + ``` -## 第一步:启动PaddleHub Serving +- ### 3、API -运行启动命令: + - ```python + def reconstruct(self, + images=None, + paths=None, + use_gpu=False, + visualization=False, + output_dir="falsr_a_output") + ``` -```shell -$ hub serving start -m falsr_a -``` + - 预测API,用于图像超分辨率。 -这样就完成了一个超分任务的服务化API的部署,默认端口号为8866。 + - **参数** -**NOTE:** 如使用GPU预测,则需要在启动服务之前,设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + * images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; + * paths (list\[str\]): 图片的路径; + * use\_gpu (bool): 是否使用 GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置; + * visualization (bool): 是否将识别结果保存为图片文件; + * output\_dir (str): 图片的保存路径。 -## 第二步:发送预测请求 + - **返回** -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + * res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,关键字有 'save\_path', 'data',对应的取值为: + * save\_path (str, optional): 可视化图片的保存路径(仅当visualization=True时存在); + * data (numpy.ndarray): 超分辨后图像。 -```python -import requests -import json -import base64 + - ```python + def save_inference_model(self, + dirname='falsr_a_save_model', + model_filename=None, + params_filename=None, + combined=False) + ``` -import cv2 -import numpy as np + - 将模型保存到指定路径。 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return 
base64.b64encode(data.tostring()).decode('utf8') -def base64_to_cv2(b64str): - data = base64.b64decode(b64str.encode('utf8')) - data = np.fromstring(data, np.uint8) - data = cv2.imdecode(data, cv2.IMREAD_COLOR) - return data + - **参数** -# 发送HTTP请求 -org_im = cv2.imread('/PATH/TO/IMAGE') -data = {'images':[cv2_to_base64(org_im)]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/falsr_a" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) -sr = base64_to_cv2(r.json()["results"][0]['data']) -cv2.imwrite('falsr_a_X2.png', sr) -print("save image as falsr_a_X2.png") -``` -### 查看代码 + * dirname: 存在模型的目录名称 + * model\_filename: 模型文件名称,默认为\_\_model\_\_ + * params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) + * combined: 是否将参数保存到统一的一个文件中 -https://github.com/xiaomi-automl/FALSR -### 依赖 +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像超分的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + + - ```shell + $ hub serving start -m falsr_a + ``` + + - 这样就完成了一个超分任务的服务化API的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + + - ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + ```python + import requests + import json + import base64 + + import cv2 + import numpy as np + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + def base64_to_cv2(b64str): + data = base64.b64decode(b64str.encode('utf8')) + data = np.fromstring(data, np.uint8) + data = cv2.imdecode(data, cv2.IMREAD_COLOR) + return data + + # 发送HTTP请求 + org_im = cv2.imread('/PATH/TO/IMAGE') + data = {'images':[cv2_to_base64(org_im)]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/falsr_a" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + sr = base64_to_cv2(r.json()["results"][0]['data']) + cv2.imwrite('falsr_a_X2.png', sr) + print("save image as falsr_a_X2.png") + ``` + + +## 五、更新历史 + + +* 1.0.0 + + 初始发布 -paddlepaddle >= 1.8.0 -paddlehub >= 1.7.1 diff --git a/modules/image/Image_editing/super_resolution/falsr_b/README.md b/modules/image/Image_editing/super_resolution/falsr_b/README.md index f54f159d57e81c98d3d503da9bc68afd877ee796..b74a5f894791719d8d0b61ca666b395f318076a4 100644 --- a/modules/image/Image_editing/super_resolution/falsr_b/README.md +++ b/modules/image/Image_editing/super_resolution/falsr_b/README.md @@ -1,126 +1,170 @@ -## 模型概述 +# falsr_b -falsr_b是基于Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search设计的轻量化超分辨模型。falsr_b较falsr_a更轻量化。该模型使用多目标方法处理超分问题,同时使用基于混合控制器的弹性搜索策略来提升模型性能。该模型提供的超分倍数为2倍。 -## 命令行预测 +|模型名称|falsr_b| +| :--- | :---: | +|类别|图像-图像编辑| +|网络|falsr_b| +|数据集|DIV2k| +|是否支持Fine-tuning|否| +|模型大小|4MB| +|指标|PSNR37.61| +|最新更新日期|2021-02-26| -``` -$ hub run falsr_b --input_path "/PATH/TO/IMAGE" -``` +## 一、模型基本信息 -## API +- ### 应用效果展示 + + - 样例结果示例(左为原图,右为效果图): +


-```python -def reconstruct(self, - images=None, - paths=None, - use_gpu=False, - visualization=True, - output_dir="falsr_b_output") -``` -预测API,用于图像超分辨率。 +- ### 模型介绍 -**参数** + - falsr_b是基于Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search设计的轻量化超分辨模型。该模型使用多目标方法处理超分问题,同时使用基于混合控制器的弹性搜索策略来提升模型性能。该模型提供的超分倍数为2倍。 -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; -* paths (list\[str\]): 图片的路径; -* use\_gpu (bool): 是否使用 GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置; -* visualization (bool): 是否将识别结果保存为图片文件; -* output\_dir (str): 图片的保存路径。 + - 更多详情请参考:[falsr_b](https://github.com/xiaomi-automl/FALSR) -**返回** +## 二、安装 -* res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,关键字有 'save\_path', 'data',对应的取值为: - * save\_path (str, optional): 可视化图片的保存路径(仅当visualization=True时存在); - * data (numpy.ndarray): 超分辨后图像。 +- ### 1、环境依赖 -```python -def save_inference_model(self, - dirname='falsr_b_save_model', - model_filename=None, - params_filename=None, - combined=False) -``` + - paddlepaddle >= 2.0.0 -将模型保存到指定路径。 + - paddlehub >= 2.0.0 -**参数** -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 +- ### 2、安装 + - ```shell + $ hub install falsr_b + ``` -## 代码示例 + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -```python -import cv2 -import paddlehub as hub +## 三、模型API预测 +- ### 1、命令行预测 -sr_model = hub.Module(name='falsr_b') -im = cv2.imread('/PATH/TO/IMAGE').astype('float32') -#visualization=True可以用于查看超分图片效果,可设置为False提升运行速度。 -res = sr_model.reconstruct(images=[im], visualization=True) -print(res[0]['data']) -sr_model.save_inference_model() -``` + - ``` + $ hub run falsr_b --input_path "/PATH/TO/IMAGE" + ``` +- ### 2、预测代码示例 -## 服务部署 + ```python + import cv2 + import paddlehub as hub -PaddleHub Serving可以部署一个图像超分的在线服务。 + sr_model = hub.Module(name='falsr_b') + im = cv2.imread('/PATH/TO/IMAGE').astype('float32') + #visualization=True可以用于查看超分图片效果,可设置为False提升运行速度。 + res = sr_model.reconstruct(images=[im], visualization=True) + print(res[0]['data']) + sr_model.save_inference_model() + ``` -## 第一步:启动PaddleHub Serving +- ### 3、API -运行启动命令: + - ```python + def reconstruct(self, + images=None, + paths=None, + use_gpu=False, + visualization=False, + output_dir="falsr_b_output") + ``` -```shell -$ hub serving start -m falsr_b -``` + - 预测API,用于图像超分辨率。 -这样就完成了一个超分任务的服务化API的部署,默认端口号为8866。 + - **参数** -**NOTE:** 如使用GPU预测,则需要在启动服务之前,设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + * images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; + * paths (list\[str\]): 图片的路径; + * use\_gpu (bool): 是否使用 GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置; + * visualization (bool): 是否将识别结果保存为图片文件; + * output\_dir (str): 图片的保存路径。 -## 第二步:发送预测请求 + - **返回** -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + * res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,关键字有 'save\_path', 'data',对应的取值为: + * save\_path (str, optional): 可视化图片的保存路径(仅当visualization=True时存在); + * data (numpy.ndarray): 超分辨后图像。 -```python -import requests -import json -import base64 + - ```python + def save_inference_model(self, + dirname='falsr_b_save_model', + model_filename=None, + params_filename=None, + combined=False) + ``` -import cv2 -import numpy as np + - 将模型保存到指定路径。 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return 
base64.b64encode(data.tostring()).decode('utf8') -def base64_to_cv2(b64str): - data = base64.b64decode(b64str.encode('utf8')) - data = np.fromstring(data, np.uint8) - data = cv2.imdecode(data, cv2.IMREAD_COLOR) - return data + - **参数** -# 发送HTTP请求 -org_im = cv2.imread('/PATH/TO/IMAGE') -data = {'images':[cv2_to_base64(org_im)]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/falsr_b" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) -sr = base64_to_cv2(r.json()["results"][0]['data']) -cv2.imwrite('falsr_b_X2.png', sr) -print("save image as falsr_b_X2.png") -``` + * dirname: 存在模型的目录名称 + * model\_filename: 模型文件名称,默认为\_\_model\_\_ + * params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) + * combined: 是否将参数保存到统一的一个文件中 -### 查看代码 -https://github.com/xiaomi-automl/FALSR -### 依赖 +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像超分的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + + - ```shell + $ hub serving start -m falsr_b + ``` + + - 这样就完成了一个超分任务的服务化API的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + + - ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + ```python + import requests + import json + import base64 + + import cv2 + import numpy as np + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + def base64_to_cv2(b64str): + data = base64.b64decode(b64str.encode('utf8')) + data = np.fromstring(data, np.uint8) + data = cv2.imdecode(data, cv2.IMREAD_COLOR) + return data + + # 发送HTTP请求 + org_im = cv2.imread('/PATH/TO/IMAGE') + data = {'images':[cv2_to_base64(org_im)]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/falsr_b" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + sr = base64_to_cv2(r.json()["results"][0]['data']) + cv2.imwrite('falsr_b_X2.png', sr) + print("save image as falsr_b_X2.png") + ``` + + +## 五、更新历史 + + +* 1.0.0 + + 初始发布 + -paddlepaddle >= 1.8.0 -paddlehub >= 1.7.1 diff --git a/modules/image/Image_gan/attgan_celeba/README.md b/modules/image/Image_gan/attgan_celeba/README.md new file mode 100644 index 0000000000000000000000000000000000000000..f9a7a211949a026093547542c845d4e182392f98 --- /dev/null +++ b/modules/image/Image_gan/attgan_celeba/README.md @@ -0,0 +1,105 @@ +# attgan_celeba + +|模型名称|attgan_celeba| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|AttGAN| +|数据集|Celeba| +|是否支持Fine-tuning|否| +|模型大小|167MB| +|最新更新日期|2021-02-26| +|数据指标|-| + + +## 一、模型基本信息 + +- ### 应用效果展示 + - 样例结果示例: + +

+
+ 图1. AttGAN的效果图(图片属性分别为:original image, Bald, Bangs, Black_Hair, Blond_Hair, Brown_Hair, Bushy_Eyebrows, Eyeglasses, Gender, Mouth_Slightly_Open, Mustache, No_Beard, Pale_Skin, Aged)
+

+ + +- ### 模型介绍 + + - AttGAN 是一种生成对抗网络(Generative Adversarial Networks),它利用分类损失和重构损失来保证改变特定的属性。该 PaddleHub Module 使用 Celeba 数据集训练完成,目前支持 "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged" 这十三种人脸属性转换。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.5.2 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + +- ### 2、安装 + + - ```shell + $ hub install attgan_celeba==1.0.0 + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run attgan_celeba --image "/PATH/TO/IMAGE" --style "target_attribute" + ``` + - **参数** + + - image :指定图片路径。 + + - style 指定拟转换的属性,可选择 "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged" 中的一种。 + + + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + + attgan = hub.Module(name="attgan_celeba") + + test_img_path = ["/PATH/TO/IMAGE"] + trans_attr = ["Bangs"] + + # set input dict + input_dict = {"image": test_img_path, "style": trans_attr} + + # execute predict and print the result + results = attgan.generate(data=input_dict) + print(results) + ``` + +- ### 3、API + + - ```python + def generate(data) + ``` + + - 风格转换API,用于图像生成。 + + - **参数** + + - data: dict 类型,有以下字段 + - image (list\[str\]): list中每个元素为待转换的图片路径。 + - style (list\[str\]): list中每个元素为字符串,填写待转换的人脸属性。 + + - **返回** + - res (list\[str\]): 提示生成图片的保存路径。 + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + diff --git a/modules/image/Image_gan/cyclegan_cityscapes/README.md b/modules/image/Image_gan/cyclegan_cityscapes/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a63efd9e92c93d8b65545dedac8c33a349549aec --- /dev/null +++ b/modules/image/Image_gan/cyclegan_cityscapes/README.md @@ -0,0 +1,108 @@ +# cyclegan_cityscapes + +|模型名称|cyclegan_cityscapes| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|CycleGAN| +|数据集|Cityscapes| +|是否支持Fine-tuning|否| +|模型大小|33MB| +|最新更新日期|2021-02-26| +|数据指标|-| + + +## 一、模型基本信息 + +- ### 应用效果展示 + - 样例结果示例: + +

+  Input image
+  Output image

+ + +- ### 模型介绍 + + - CycleGAN是生成对抗网络(Generative Adversarial Networks )的一种,与传统的GAN只能单向生成图片不同,CycleGAN可以同时完成两个domain的图片进行相互转换。该PaddleHub Module使用Cityscapes数据集训练完成,支持图片从实景图转换为语义分割结果,也支持从语义分割结果转换为实景图。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.1.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + +- ### 2、安装 + + - ```shell + $ hub install cyclegan_cityscapes==1.0.0 + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run cyclegan_cityscapes --input_path "/PATH/TO/IMAGE" + ``` + - **参数** + + - input_path :指定图片路径。 + + + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + + cyclegan = hub.Module(name="cyclegan_cityscapes") + + test_img_path = "/PATH/TO/IMAGE" + + # set input dict + input_dict = {"image": [test_img_path]} + + # execute predict and print the result + results = cyclegan.generate(data=input_dict) + print(results) + ``` + +- ### 3、API + + - ```python + def generate(data) + ``` + + - 风格转换API,用于图像生成。 + + - **参数** + + - data: dict 类型,有以下字段: + - image (list\[str\]): list中每个元素为待转换的图片路径。 + + - **返回** + - res (list\[str\]): 每个元素为对应输入图片的预测结果。预测结果为dict类型,有以下字段: + - origin: 原输入图片路径. + - generated: 生成图片的路径。 + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + diff --git a/modules/image/Image_gan/stargan_celeba/README.md b/modules/image/Image_gan/stargan_celeba/README.md new file mode 100644 index 0000000000000000000000000000000000000000..b5a160274dae031ae2824b93416eb1395b814770 --- /dev/null +++ b/modules/image/Image_gan/stargan_celeba/README.md @@ -0,0 +1,102 @@ +# stargan_celeba + +|模型名称|stargan_celeba| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|STGAN| +|数据集|Celeba| +|是否支持Fine-tuning|否| +|模型大小|33MB| +|最新更新日期|2021-02-26| +|数据指标|-| + + +## 一、模型基本信息 + +- ### 应用效果展示 + - 样例结果示例: + +

+
+ 图1. StarGAN的效果图 (属性分别为:origial image, Black_Hair, Blond_Hair, Brown_Hair, Male, Aged)
+

+ + +- ### 模型介绍 + + - StarGAN 是为了解决跨多个域、多个数据集的训练而提出的生成对抗网络模型。单个 StarGAN 模型就可以实现多个风格域的转换。 该 PaddleHub Module 使用 Celeba 数据集训练完成,目前支持 "Black_Hair", "Blond_Hair", "Brown_Hair", "Female", "Male", "Aged" 这六种人脸属性转换。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.5.2 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + +- ### 2、安装 + + - ```shell + $ hub install stargan_celeba==1.0.0 + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run stargan_celeba --image "/PATH/TO/IMAGE" --style "target_attribute" + ``` + - **参数** + + - image :指定图片路径。 + + - style 指定拟转换的属性,可选择 "Black_Hair", "Blond_Hair", "Brown_Hair", "Female", "Male", "Aged" 中的一个。 + + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + + stargan = hub.Module(name="stargan_celeba") + test_img_path = ["/PATH/TO/IMAGE"] + trans_attr = ["Blond_Hair"] + + # set input dict + input_dict = {"image": test_img_path, "style": trans_attr} + + # execute predict and print the result + results = stargan.generate(data=input_dict) + print(results) + ``` + +- ### 3、API + + - ```python + def generate(data) + ``` + + - 风格转换API,用于图像生成。 + + - **参数** + + - data: dict 类型,有以下字段 + - image (list\[str\]): list中每个元素为待转换的图片路径。 + - style (list\[str\]): list中每个元素为字符串,填写待转换的人脸属性。 + + - **返回** + - res (list\[str\]): 提示生成图片的保存路径。 + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + diff --git a/modules/image/Image_gan/stgan_celeba/README.md b/modules/image/Image_gan/stgan_celeba/README.md new file mode 100644 index 0000000000000000000000000000000000000000..52e22e019e5d576d41d58ddc53c4a51a7870e130 --- /dev/null +++ b/modules/image/Image_gan/stgan_celeba/README.md @@ -0,0 +1,106 @@ +# stgan_celeba + +|模型名称|stgan_celeba| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|STGAN| +|数据集|Celeba| +|是否支持Fine-tuning|否| +|模型大小|287MB| +|最新更新日期|2021-02-26| +|数据指标|-| + + +## 一、模型基本信息 + +- ### 应用效果展示 + - 样例结果示例: + +

+
+ STGAN的效果图(图片属性分别为:original image, Bald, Bangs, Black_Hair, Blond_Hair, Brown_Hair, Bushy_Eyebrows, Eyeglasses, Gender, Mouth_Slightly_Open, Mustache, No_Beard, Pale_Skin, Aged)
+

+ + +- ### 模型介绍 + + - STGAN 以原属性和目标属性的差值作为输入,并创造性地提出了 STUs (Selective transfer units) 来选择和修改 encoder 的特征,从而改善了转换效果和处理能力。 该 PaddleHub Module 使用 Celeba 数据集训练完成,目前支持 "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged" 这十三种人脸属性转换。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.5.2 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + +- ### 2、安装 + + - ```shell + $ hub install stgan_celeba==1.0.0 + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run stgan_celeba --image "/PATH/TO/IMAGE" --info "original_attributes" --style "target_attribute" + ``` + - **参数** + + - image :指定图片路径。 + + - info :原图的属性,必须填写性别( "Male" 或者 "Female")。可选值有:"Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged" 。比如输入图片是一个女孩,有着黑头发,那么就填写为 "Female,Black_Hair"。建议尽可能完整地填写原图具备的属性,比如一个黑发女孩还戴了眼镜,那么应填写为 "Female,Black_Hair,Eyeglasses",否则有可能转换失败。 + + - style 指定拟转换的属性,可选择 "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged" 中的一种。 + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + + stgan = hub.Module(name="stgan_celeba") + + test_img_path = ["/PATH/TO/IMAGE"] + org_info = ["Female,Black_Hair"] + trans_attr = ["Bangs"] + + # set input dict + input_dict = {"image": test_img_path, "style": trans_attr, "info": org_info} + + # execute predict and print the result + results = stgan.generate(data=input_dict) + print(results) + ``` + +- ### 3、API + + - ```python + def generate(data) + ``` + + - 风格转换API,用于图像生成。 + + - **参数** + + - data: dict 类型,有以下字段 + - image (list\[str\]): list中每个元素为待转换的图片路径。 + - style (list\[str\]): list中每个元素为字符串,填写待转换的人脸属性。 + - info (list\[str\]): 表示原图具备的人脸属性,填得越详细效果会越好,不同属性用逗号隔开。 + + + - **返回** + - res (list\[str\]): 提示生成图片的保存路径。 + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 diff --git a/modules/image/Image_gan/style_transfer/ID_Photo_GEN/README.md b/modules/image/Image_gan/style_transfer/ID_Photo_GEN/README.md index 6707c477de171e19770d0aa5a0869ac6f4b81fa7..6957e9a03f1c2116263d37ac06e5dded42f1575e 100644 --- a/modules/image/Image_gan/style_transfer/ID_Photo_GEN/README.md +++ b/modules/image/Image_gan/style_transfer/ID_Photo_GEN/README.md @@ -1,48 +1,97 @@ -## 概述 -* 基于 face_landmark_localization 和 FCN_HRNet_W18_Face_Seg 模型实现的证件照生成模型,一键生成白底、红底和蓝底的人像照片 - -## 效果展示 -![](https://img-blog.csdnimg.cn/20201224163307901.jpg) - -## API -```python -def Photo_GEN( - images=None, - paths=None, - batch_size=1, - output_dir='output', - visualization=False, - use_gpu=False): -``` -证件照生成 API - -**参数** -* images (list[np.ndarray]) : 输入图像数据列表(BGR) -* paths (list[str]) : 输入图像路径列表 -* batch_size (int) : 数据批大小 -* output_dir (str) : 可视化图像输出目录 -* visualization (bool) : 是否可视化 -* use_gpu (bool) : 是否使用 GPU 进行推理 - -**返回** -* results (list[dict{"write":np.ndarray,"blue":np.ndarray,"red":np.ndarray}]): 输出图像数据列表 - -**代码示例** -```python -import cv2 -import paddlehub as hub - -model = hub.Module(name='ID_Photo_GEN') - -result = model.Photo_GEN( - images=[cv2.imread('/PATH/TO/IMAGE')], - paths=None, - batch_size=1, - output_dir='output', 
- visualization=True, - use_gpu=False) -``` - -## 依赖 -paddlepaddle >= 2.0.0rc0 -paddlehub >= 2.0.0b1 +# ID_Photo_GEN + +|模型名称|ID_Photo_GEN| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|HRNet_W18| +|数据集|-| +|是否支持Fine-tuning|否| +|模型大小|28KB| +|最新更新日期|2021-02-26| +|数据指标|-| + + +## 一、模型基本信息 + +- ### 应用效果展示 + - 样例结果示例: +


+ + +- ### 模型介绍 + + - 基于face_landmark_localization和FCN_HRNet_W18_Face_Seg模型实现的证件照生成模型,一键生成白底、红底和蓝底的人像照片 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 + +- ### 2、安装 + + - ```shell + $ hub install ID_Photo_GEN + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 + +- ### 1、预测代码示例 + + - ```python + import cv2 + import paddlehub as hub + + model = hub.Module(name='ID_Photo_GEN') + + result = model.Photo_GEN( + images=[cv2.imread('/PATH/TO/IMAGE')], + paths=None, + batch_size=1, + output_dir='output', + visualization=True, + use_gpu=False) + ``` + +- ### 2、API + + - ```python + def Photo_GEN( + images=None, + paths=None, + batch_size=1, + output_dir='output', + visualization=False, + use_gpu=False): + ``` + + - 证件照生成API + + - **参数** + * images (list[np.ndarray]) : 输入图像数据列表(BGR) + * paths (list[str]) : 输入图像路径列表 + * batch_size (int) : 数据批大小 + * output_dir (str) : 可视化图像输出目录 + * visualization (bool) : 是否可视化 + * use_gpu (bool) : 是否使用 GPU 进行推理 + + **NOTE:** paths和images两个参数选择其一进行提供数据 + + - **返回** + + * results (list[dict{"write":np.ndarray,"blue":np.ndarray,"red":np.ndarray}]): 输出图像数据列表 + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 diff --git a/modules/image/Image_gan/style_transfer/UGATIT_83w/README.md b/modules/image/Image_gan/style_transfer/UGATIT_83w/README.md index 493b8eaf78eaced6fd48a99783a19c3f7e0ac2d1..82bbf44afa06f2d03bb89f010d46a36ee5cf3b73 100644 --- a/modules/image/Image_gan/style_transfer/UGATIT_83w/README.md +++ b/modules/image/Image_gan/style_transfer/UGATIT_83w/README.md @@ -1,122 +1,141 @@ -## 模型概述 -UGATIT 图像风格转换模型 +# UGATIT_83w -模型可将输入的人脸图像转换成动漫风格 +|模型名称|UGATIT_83w| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|U-GAT-IT| +|数据集|selfie2anime| +|是否支持Fine-tuning|否| +|模型大小|41MB| +|最新更新日期|2021-02-26| +|数据指标|-| -模型权重来自UGATIT-Paddle开源项目 -模型所使用的权重为genA2B_0835000 +## 一、模型基本信息 -模型详情请参考[UGATIT-Paddle开源项目](https://github.com/miraiwk/UGATIT-paddle) +- ### 应用效果展示 + - 样例结果示例(左为原图,右为效果图): +


-## 模型安装 -```shell -$hub install UGATIT_83w -``` +- ### 模型介绍 -## API 说明 + - UGATIT 图像风格转换模型, 模型可将输入的人脸图像转换成动漫风格. -```python -def style_transfer( - self, - images=None, - paths=None, - batch_size=1, - output_dir='output', - visualization=False -) -``` -风格转换API,将输入的人脸图像转换成动漫风格。 +## 二、安装 -转换效果图如下: +- ### 1、环境依赖 -![输入图像](https://ai-studio-static-online.cdn.bcebos.com/d130fabd8bd34e53b2f942b3766eb6bbd3c19c0676d04abfbd5cc4b83b66f8b6) -![输出图像](https://ai-studio-static-online.cdn.bcebos.com/78653331ee2d472b81ff5bbccd6a904a80d2c5208f9c42c789b4f09a1ef46332) + - paddlepaddle >= 1.8.2 -**参数** + - paddlehub >= 1.8.0 -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],默认为 None; -* paths (list\[str\]): 图片的路径,默认为 None; -* batch\_size (int): batch 的大小,默认设为 1; -* visualization (bool): 是否将识别结果保存为图片文件,默认设为 False; -* output\_dir (str): 图片的保存路径,默认设为 output。 +- ### 2、安装 + - ```shell + $ hub install UGATIT_83w + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 -**返回** +- ### 1、预测代码示例 -* res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\]。 + - ```python + import cv2 + import paddlehub as hub + # 模型加载 + # use_gpu:是否使用GPU进行预测 + model = hub.Module(name='UGATIT_83w', use_gpu=False) -## 预测代码示例 + # 模型预测 + result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) -```python -import cv2 -import paddlehub as hub + # or + # result = model.style_transfer(paths=['/PATH/TO/IMAGE']) + ``` -# 模型加载 -# use_gpu:是否使用GPU进行预测 -model = hub.Module('UGATIT_83w', use_gpu=False) +- ### 2、API -# 模型预测 -result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) + - ```python + def style_transfer( + self, + images=None, + paths=None, + batch_size=1, + output_dir='output', + visualization=False + ) + ``` -# or -# result = model.style_transfer(paths=['/PATH/TO/IMAGE']) -``` + - 风格转换API,将输入的人脸图像转换成动漫风格。 -## 服务部署 + - **参数** + * images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],默认为 None; + * paths (list\[str\]): 图片的路径,默认为 None; + * batch\_size (int): batch 的大小,默认设为 1; + * visualization (bool): 是否将识别结果保存为图片文件,默认设为 False; + * output\_dir (str): 图片的保存路径,默认设为 output -PaddleHub Serving可以部署一个在线图像风格转换服务。 + **NOTE:** paths和images两个参数选择其一进行提供数据 -## 第一步:启动PaddleHub Serving + - **返回** -运行启动命令: -```shell -$ hub serving start -m UGATIT_w83 -``` + - res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\] + -这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 +## 四、服务部署 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 +- PaddleHub Serving可以部署一个在线图像风格转换服务。 -## 第二步:发送预测请求 +- ### 第一步:启动PaddleHub Serving -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - 运行启动命令: + + - ```shell + $ hub serving start -m UGATIT_83w + ``` -```python -import requests -import json -import cv2 -import base64 + - 这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') +- ### 第二步:发送预测请求 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/UGATIT_w83" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + - ```python + import requests + import json + import cv2 + import base64 -# 打印预测结果 
-print(r.json()["results"]) -``` + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -## 模型相关信息 -### 模型代码 + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/UGATIT_83w" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -https://github.com/miraiwk/UGATIT-paddle + # 打印预测结果 + print(r.json()["results"]) + ``` -### 依赖 -paddlepaddle >= 1.8.0 +## 五、更新历史 -paddlehub >= 1.8.0 +* 1.0.0 + + 初始发布 \ No newline at end of file diff --git a/modules/image/Image_gan/style_transfer/UGATIT_92w/README.md b/modules/image/Image_gan/style_transfer/UGATIT_92w/README.md index 084188af3a11d767dd7a8480dc63d1bdd4bead19..8108976faeaa9bccad1af206a9aa6a34115dffc0 100644 --- a/modules/image/Image_gan/style_transfer/UGATIT_92w/README.md +++ b/modules/image/Image_gan/style_transfer/UGATIT_92w/README.md @@ -1,122 +1,141 @@ -## 模型概述 -UGATIT 图像风格转换模型 +# UGATIT_92w -模型可将输入的人脸图像转换成动漫风格 +|模型名称|UGATIT_92w| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|U-GAT-IT| +|数据集|selfie2anime| +|是否支持Fine-tuning|否| +|模型大小|41MB| +|最新更新日期|2021-02-26| +|数据指标|-| -模型权重来自UGATIT-Paddle开源项目 -模型所使用的权重为genA2B_0924000 +## 一、模型基本信息 -模型详情请参考[UGATIT-Paddle开源项目](https://github.com/miraiwk/UGATIT-paddle) +- ### 应用效果展示 + - 样例结果示例(左为原图,右为效果图): +


-## 模型安装 -```shell -$hub install UGATIT_92w -``` +- ### 模型介绍 -## API 说明 + - UGATIT 图像风格转换模型, 模型可将输入的人脸图像转换成动漫风格. -```python -def style_transfer( - self, - images=None, - paths=None, - batch_size=1, - output_dir='output', - visualization=False -) -``` -风格转换API,将输入的人脸图像转换成动漫风格。 +## 二、安装 -转换效果图如下: +- ### 1、环境依赖 -![输入图像](https://ai-studio-static-online.cdn.bcebos.com/d130fabd8bd34e53b2f942b3766eb6bbd3c19c0676d04abfbd5cc4b83b66f8b6) -![输出图像](https://ai-studio-static-online.cdn.bcebos.com/b7305162ff6345e9b04507a196ebe854907b446936934844be8aae4b0297db18) + - paddlepaddle >= 1.8.2 -**参数** + - paddlehub >= 1.8.0 -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],默认为 None; -* paths (list\[str\]): 图片的路径,默认为 None; -* batch\_size (int): batch 的大小,默认设为 1; -* visualization (bool): 是否将识别结果保存为图片文件,默认设为 False; -* output\_dir (str): 图片的保存路径,默认设为 output。 +- ### 2、安装 + - ```shell + $ hub install UGATIT_92w + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## 三、模型API预测 -**返回** +- ### 1、预测代码示例 -* res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\]。 + - ```python + import cv2 + import paddlehub as hub + # 模型加载 + # use_gpu:是否使用GPU进行预测 + model = hub.Module(name='UGATIT_92w', use_gpu=False) -## 预测代码示例 + # 模型预测 + result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) -```python -import cv2 -import paddlehub as hub + # or + # result = model.style_transfer(paths=['/PATH/TO/IMAGE']) + ``` -# 模型加载 -# use_gpu:是否使用GPU进行预测 -model = hub.Module(name='UGATIT_92w', use_gpu=False) +- ### 2、API -# 模型预测 -result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) + - ```python + def style_transfer( + self, + images=None, + paths=None, + batch_size=1, + output_dir='output', + visualization=False + ) + ``` -# or -# result = model.style_transfer(paths=['/PATH/TO/IMAGE']) -``` + - 风格转换API,将输入的人脸图像转换成动漫风格。 -## 服务部署 + - **参数** + * images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],默认为 None; + * paths (list\[str\]): 图片的路径,默认为 None; + * batch\_size (int): batch 的大小,默认设为 1; + * visualization (bool): 是否将识别结果保存为图片文件,默认设为 False; + * output\_dir (str): 图片的保存路径,默认设为 output -PaddleHub Serving可以部署一个在线图像风格转换服务。 + **NOTE:** paths和images两个参数选择其一进行提供数据 -## 第一步:启动PaddleHub Serving + - **返回** -运行启动命令: -```shell -$ hub serving start -m UGATIT_92w -``` + - res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\] + -这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 +## 四、服务部署 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 +- PaddleHub Serving可以部署一个在线图像风格转换服务。 -## 第二步:发送预测请求 +- ### 第一步:启动PaddleHub Serving -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - 运行启动命令: + + - ```shell + $ hub serving start -m UGATIT_92w + ``` -```python -import requests -import json -import cv2 -import base64 + - 这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') +- ### 第二步:发送预测请求 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/UGATIT_92w" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + - ```python + import requests + import json + import cv2 + import base64 -# 打印预测结果 
-print(r.json()["results"]) -``` + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -## 模型相关信息 -### 模型代码 + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/UGATIT_92w" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -https://github.com/miraiwk/UGATIT-paddle + # 打印预测结果 + print(r.json()["results"]) + ``` -### 依赖 -paddlepaddle >= 1.8.0 +## 五、更新历史 -paddlehub >= 1.8.0 +* 1.0.0 + + 初始发布 \ No newline at end of file diff --git a/modules/image/Image_gan/style_transfer/animegan_v2_paprika_54/README.md b/modules/image/Image_gan/style_transfer/animegan_v2_paprika_54/README.md index 50205f868b12c2abaadad3f21d9cea6eaa0542d4..5dcf44fb75e084a563c27ef514848fbdd8d6176b 100644 --- a/modules/image/Image_gan/style_transfer/animegan_v2_paprika_54/README.md +++ b/modules/image/Image_gan/style_transfer/animegan_v2_paprika_54/README.md @@ -1,125 +1,148 @@ -## 模型概述 -AnimeGAN V2 图像风格转换模型 +# animegan_v2_paprika_54 -模型可将输入的图像转换成Paprika风格 +|模型名称|animegan_v2_paprika_54| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|AnimeGAN| +|数据集|Paprika| +|是否支持Fine-tuning|否| +|模型大小|9.4MB| +|最新更新日期|2021-02-26| +|数据指标|-| -模型权重转换自AnimeGAN V2官方开源项目 -模型所使用的权重为Paprika-54.ckpt +## 一、模型基本信息 -模型详情请参考[AnimeGAN V2 开源项目](https://github.com/TachibanaYoshino/AnimeGANv2) +- ### 应用效果展示 + - 样例结果示例: +

+  Input image
+  Output image

-## 模型安装 -```shell -$hub install animegan_v2_paprika_54 -``` +- ### 模型介绍 + - AnimeGAN V2 图像风格转换模型, 模型可将输入的图像转换成今敏红辣椒动漫风格,模型权重转换自[AnimeGAN V2官方开源项目](https://github.com/TachibanaYoshino/AnimeGANv2)。 -## API 说明 -```python -def style_transfer( - self, - images=None, - paths=None, - output_dir='output', - visualization=False, - min_size=32, - max_size=1024 -) -``` +## 二、安装 -风格转换API,将输入的图片转换为漫画风格。 +- ### 1、环境依赖 -转换效果图如下: + - paddlepaddle >= 1.8.0 -![输入图像](https://ai-studio-static-online.cdn.bcebos.com/bd002c4bb6a7427daf26988770bb18648b7d8d2bfd6746bfb9a429db4867727f) -![输出图像](https://ai-studio-static-online.cdn.bcebos.com/08ee95c94e0b4d4e8b2855a6ed40af5853b40c0047b3421aaa2f7c877fac5130) + - paddlehub >= 1.8.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) +- ### 2、安装 -**参数** + - ```shell + $ hub install animegan_v2_paprika_54 + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],默认为 None; -* paths (list\[str\]): 图片的路径,默认为 None; -* visualization (bool): 是否将识别结果保存为图片文件,默认设为 False; -* output\_dir (str): 图片的保存路径,默认设为 output; -* min\_size (int): 输入图片的短边最小尺寸,默认设为 32; -* max\_size (int): 输入图片的短边最大尺寸,默认设为 1024。 +## 三、模型API预测 +- ### 1、预测代码示例 -**返回** + - ```python + import paddlehub as hub + import cv2 -* res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\]。 + model = hub.Module(name="animegan_v2_paprika_54") + result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = model.style_transfer(paths=['/PATH/TO/IMAGE']) + ``` +- ### 2、API -## 预测代码示例 + - ```python + def style_transfer(images=None, + paths=None, + output_dir='output', + visualization=False, + min_size=32, + max_size=1024) + ``` -```python -import cv2 -import paddlehub as hub + - 风格转换API,将输入的图片转换为漫画风格。 -# 模型加载 -# use_gpu:是否使用GPU进行预测 -model = hub.Module(name='animegan_v2_paprika_54', use_gpu=False) + - **参数** -# 模型预测 -result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) + - images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\];
+ - paths (list\[str\]): 图片的路径;
+ - output\_dir (str): 图片的保存路径,默认设为 output;
+ - visualization (bool): 是否将结果保存为图片文件;
+ - min\_size (int): 输入图片的短边最小尺寸,默认设为 32;
+ - max\_size (int): 输入图片的短边最大尺寸,默认设为 1024。 -# or -# result = model.style_transfer(paths=['/PATH/TO/IMAGE']) -``` + - **返回** + - res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\] -## 服务部署 -PaddleHub Serving可以部署一个在线图像风格转换服务。 +## 四、服务部署 -## 第一步:启动PaddleHub Serving +- PaddleHub Serving可以部署一个在线图像风格转换服务。 -运行启动命令: -```shell -$ hub serving start -m animegan_v2_paprika_54 -``` +- ### 第一步:启动PaddleHub Serving -这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 + - 运行启动命令: + - ```shell + $ hub serving start -m animegan_v2_paprika_54 + ``` -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + - 这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 -## 第二步:发送预测请求 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 +- ### 第二步:发送预测请求 -```python -import requests -import json -import cv2 -import base64 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - ```python + import requests + import json + import cv2 + import base64 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/animegan_v2_paprika_54" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/animegan_v2_paprika_54" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -# 打印预测结果 -print(r.json()["results"]) -``` + # 打印预测结果 + print(r.json()["results"]) + ``` -## 模型相关信息 +## 五、更新历史 -### 模型代码 +* 1.0.0 -https://github.com/TachibanaYoshino/AnimeGANv2 + 初始发布 -### 依赖 +* 1.0.1 -paddlepaddle >= 1.8.0 + 适配paddlehub2.0 -paddlehub >= 1.8.0 +* 1.0.2 + + 删除batch_size选项 + + - ```shell + $ hub install animegan_v2_paprika_54==1.0.2 + ``` \ No newline at end of file diff --git a/modules/image/Image_gan/style_transfer/animegan_v2_paprika_97/README.md b/modules/image/Image_gan/style_transfer/animegan_v2_paprika_97/README.md index 10af52a3a71f2dd26168b659dab0cb05f3818323..ff8b5a3e95ff9155ceb016a1e3ec6dc08f7c18c0 100644 --- a/modules/image/Image_gan/style_transfer/animegan_v2_paprika_97/README.md +++ b/modules/image/Image_gan/style_transfer/animegan_v2_paprika_97/README.md @@ -1,125 +1,147 @@ -## 模型概述 -AnimeGAN V2 图像风格转换模型 +# animegan_v2_paprika_97 -模型可将输入的图像转换成Paprika风格 +|模型名称|animegan_v2_paprika_97| +| :--- | :---: | +|类别|图像 - 图像生成| +|网络|AnimeGAN| +|数据集|Paprika| +|是否支持Fine-tuning|否| +|模型大小|9.7MB| +|最新更新日期|2021-07-30| +|数据指标|-| -模型权重转换自AnimeGAN V2官方开源项目 -模型所使用的权重为Paprika-97.ckpt +## 一、模型基本信息 -模型详情请参考[AnimeGAN V2 开源项目](https://github.com/TachibanaYoshino/AnimeGANv2) +- ### 应用效果展示 + - 样例结果示例: +

+ (样例效果图:左为输入图像,右为输出图像)

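+ - 下面给出一个最小使用示例,便于快速复现上方样例效果(示例为参考写法:假设已按下文"二、安装"完成安装,'/PATH/TO/IMAGE' 为需要替换的本地图片路径):
+
+ - ```python
+ import cv2
+ import paddlehub as hub
+
+ # 加载风格转换模型
+ model = hub.Module(name="animegan_v2_paprika_97")
+
+ # 执行风格转换;visualization=True 时,转换结果会同时保存到 output_dir 指定目录
+ results = model.style_transfer(
+     images=[cv2.imread('/PATH/TO/IMAGE')],
+     visualization=True,
+     output_dir='output')
+
+ # 返回值为 list[numpy.ndarray],每个元素为一张转换后的图像,shape 为 [H, W, C]
+ print(results[0].shape)
+ ```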
-## 模型安装 -```shell -$hub install animegan_v2_paprika_97 -``` +- ### 模型介绍 -## API 说明 + - AnimeGAN V2 图像风格转换模型, 模型可将输入的图像转换成红辣椒动漫风格,模型权重转换自[AnimeGAN V2官方开源项目](https://github.com/TachibanaYoshino/AnimeGAN)。 -```python -def style_transfer( - self, - images=None, - paths=None, - output_dir='output', - visualization=False, - min_size=32, - max_size=1024 -) -``` -风格转换API,将输入的图片转换为漫画风格。 +## 二、安装 -转换效果图如下: +- ### 1、环境依赖 -![输入图像](https://ai-studio-static-online.cdn.bcebos.com/bd002c4bb6a7427daf26988770bb18648b7d8d2bfd6746bfb9a429db4867727f) -![输出图像](https://ai-studio-static-online.cdn.bcebos.com/3b962a18a22e43028cc5530db1c5adb1a42e6aae4bb74b8598ee30ed52b59c8b) + - paddlepaddle >= 1.8.0 + - paddlehub >= 1.8.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -**参数** +- ### 2、安装 -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],默认为 None; -* paths (list\[str\]): 图片的路径,默认为 None; -* visualization (bool): 是否将识别结果保存为图片文件,默认设为 False; -* output\_dir (str): 图片的保存路径,默认设为 output; -* min\_size (int): 输入图片的短边最小尺寸,默认设为 32; -* max\_size (int): 输入图片的短边最大尺寸,默认设为 1024。 + - ```shell + $ hub install animegan_v2_paprika_97 + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) +## 三、模型API预测 -**返回** +- ### 1、预测代码示例 -* res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\]。 + - ```python + import paddlehub as hub + import cv2 + model = hub.Module(name="animegan_v2_paprika_97") + result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = model.style_transfer(paths=['/PATH/TO/IMAGE']) + ``` -## 预测代码示例 +- ### 2、API -```python -import cv2 -import paddlehub as hub + - ```python + def style_transfer(images=None, + paths=None, + output_dir='output', + visualization=False, + min_size=32, + max_size=1024) + ``` -# 模型加载 -# use_gpu:是否使用GPU进行预测 -model = hub.Module(name='animegan_v2_paprika_97', use_gpu=False) + - 风格转换API,将输入的图片转换为漫画风格。 -# 模型预测 -result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')]) + - **参数** -# or -# result = model.style_transfer(paths=['/PATH/TO/IMAGE']) -``` + - images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\];
+ - paths (list\[str\]): 图片的路径;
+ - output\_dir (str): 图片的保存路径,默认设为 output;
+ - visualization (bool): 是否将转换结果保存为图片文件,默认设为 False;
+ - min\_size (int): 输入图片的短边最小尺寸,默认设为 32;
+ - max\_size (int): 输入图片的短边最大尺寸,默认设为 1024。 -## 服务部署 + **NOTE:** paths和images两个参数选择其一进行提供数据 -PaddleHub Serving可以部署一个在线图像风格转换服务。 + - **返回** + - res (list\[numpy.ndarray\]): 输出图像数据,ndarray.shape 为 \[H, W, C\] -## 第一步:启动PaddleHub Serving -运行启动命令: -```shell -$ hub serving start -m animegan_v2_paprika_97 -``` +## 四、服务部署 -这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 +- PaddleHub Serving可以部署一个在线图像风格转换服务。 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 +- ### 第一步:启动PaddleHub Serving -## 第二步:发送预测请求 + - 运行启动命令: + - ```shell + $ hub serving start -m animegan_v2_paprika_97 + ``` -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - 这样就完成了一个图像风格转换的在线服务API的部署,默认端口号为8866。 -```python -import requests -import json -import cv2 -import base64 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 +- ### 第二步:发送预测请求 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - ```python + import requests + import json + import cv2 + import base64 -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/animegan_v2_paprika_97" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) -# 打印预测结果 -print(r.json()["results"]) -``` + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/animegan_v2_paprika_97" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -## 模型相关信息 + # 打印预测结果 + print(r.json()["results"]) + ``` -### 模型代码 -https://github.com/TachibanaYoshino/AnimeGANv2 +## 五、更新历史 -### 依赖 +* 1.0.0 -paddlepaddle >= 1.8.0 + 初始发布 -paddlehub >= 1.8.0 +* 1.0.1 + + 适配paddlehub2.0 + +* 1.0.2 + + 删除batch_size选项 diff --git a/modules/image/classification/DriverStatusRecognition/README.md b/modules/image/classification/DriverStatusRecognition/README.md index 4de54de77da732623d1cb9066bd3f5d7b5fdecd4..9183c607af9f5405edcf4ab7a829b012803da17d 100644 --- a/modules/image/classification/DriverStatusRecognition/README.md +++ b/modules/image/classification/DriverStatusRecognition/README.md @@ -1,65 +1,90 @@ -DriverStatusRecognition -类别 图像 - 图像分类 -网络 MobileNetV3_small_ssld -数据集 分心司机检测数据集 - -# 模型概述 -驾驶员状态识别(DriverStatusRecognition),该模型可挖掘出人在疲劳状态下的表情特征,然后将这些定性的表情特征进行量化,提取出面部特征点及特征指标作为判断依据,再结合实验数据总结出基于这些参数的识别方法,最后输入获取到的状态数据进行识别和判断。该PaddleHub Module支持API预测及命令行预测。 - -# 选择模型版本进行安装 -$ hub install DriverStatusRecognition==1.0.0 - -# 在线体验 -[AI Studio快速体验](https://aistudio.baidu.com/aistudio/projectdetail/1649513) - -# 命令行预测示例 -$ hub run DriverStatusRecognition --image 1.png --use_gpu True - -# Module API说明 -## def predict(data) -驾驶员状态识别预测接口,输入一张图像,输出该图像上驾驶员的状态 -### 参数 -- data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 - -### 返回 -- result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 - -# 代码示例 - -## API调用 -~~~ -import cv2 -import paddlehub as hub - -module = hub.Module(directory='DriverStatusRecognition') # 一行代码实现模型调用 - -images = [cv2.imread('work/imgs/test/img_1622.jpg'), cv2.imread('work/imgs/test/img_14165.jpg'), cv2.imread('work/imgs/test/img_47183.jpg')] -results = module.predict(images=images) - -for result in results: - print(result) -~~~ - -## 命令行调用 -~~~ -$ hub run DriverStatusRecognition --image 1.png --use_gpu True -~~~ - -# 效果展示 
- -## 原图 - - -## 输出结果 -~~~ -[{'category_id': 5, 'category': 'ch5', 'score': 0.47390476}] -[{'category_id': 2, 'category': 'ch2', 'score': 0.99997914}] -[{'category_id': 1, 'category': 'ch1', 'score': 0.99996376}] -~~~ - -# 贡献者 -郑博培、彭兆帅 - -# 依赖 -paddlepaddle >= 2.0.0
-paddlehub >= 2.0.0 +# DriverStatusRecognition + +|模型名称|DriverStatusRecognition| +| :--- | :---: | +|类别|图像-图像分类| +|网络|MobileNetV3_small_ssld| +|数据集|分心司机检测数据集| +|是否支持Fine-tuning|否| +|模型大小|6MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 驾驶员状态识别(DriverStatusRecognition),该模型可挖掘出人在疲劳状态下的表情特征,然后将这些定性的表情特征进行量化,提取出面部特征点及特征指标作为判断依据,再结合实验数据总结出基于这些参数的识别方法,最后输入获取到的状态数据进行识别和判断。该PaddleHub Module支持API预测及命令行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + - paddlex >= 1.3.7 + + +- ### 2、安装 + + - ```shell + $ hub install DriverStatusRecognition + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +- ### 3、在线体验 + [AI Studio 快速体验](https://aistudio.baidu.com/aistudio/projectdetail/1649513) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run DriverStatusRecognition --input_path /PATH/TO/IMAGE + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="DriverStatusRecognition") + images = [cv2.imread('/PATH/TO/IMAGE')] + results = classifier.predict(images=images) + for result in results: + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images:list类型,待检测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install DriverStatusRecognition==1.0.0 + ``` diff --git a/modules/image/classification/DriverStatusRecognition/requirements.txt b/modules/image/classification/DriverStatusRecognition/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..736e12bdda43ec3a1d858322d7b2fdabe392531e --- /dev/null +++ b/modules/image/classification/DriverStatusRecognition/requirements.txt @@ -0,0 +1,2 @@ +paddlex==1.3.7 +chardet diff --git a/modules/image/classification/SnakeIdentification/README.md b/modules/image/classification/SnakeIdentification/README.md index e39ea8de42d1d4c39a89bebab77f26143b6ea8df..809aae6db923f222fe125c3e31952f2f46f42204 100644 --- a/modules/image/classification/SnakeIdentification/README.md +++ b/modules/image/classification/SnakeIdentification/README.md @@ -1,64 +1,90 @@ -SnakeIdentification -类别 图像 - 图像分类 -网络 ResNet50_vd_ssld -数据集 蛇种数据集 - -# 模型概述 -蛇种识别(SnakeIdentification),该模型可准确识别蛇的种类,并精准判断蛇的毒性。该PaddleHub Module支持API预测及命令行预测。 - -# 选择模型版本进行安装 -$ hub install SnakeIdentification==1.0.0 - -# 在线体验 -[AI Studio快速体验](https://aistudio.baidu.com/aistudio/projectdetail/1646951) - -# 命令行预测示例 -$ hub run SnakeIdentification --image 1.png --use_gpu True - -# Module API说明 -## def predict(data) -蛇种识别预测接口,输入一张图像,输出该图像上蛇的类别 -### 参数 -- data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 - -### 返回 -- result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 - -# 代码示例 - -## API调用 -~~~ -import cv2 -import paddlehub as hub - -module = hub.Module(name="SnakeIdentification") - -images = [cv2.imread('snake_data/class_1/2421.jpg')] - -# execute predict and print the result -results = module.predict(images=images) -for result in results: - print(result) -~~~ - -## 命令行调用 -~~~ -$ hub run SnakeIdentification --image 1.png --use_gpu True -~~~ - -# 效果展示 - -## 原图 - - -## 输出结果 -~~~ -[{'category_id': 0, 
'category': '水蛇', 'score': 0.9999205}] -~~~ - -# 贡献者 -郑博培、彭兆帅 - -# 依赖 -paddlepaddle >= 2.0.0
-paddlehub >= 2.0.0 +# SnakeIdentification + +|模型名称|SnakeIdentification| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet50_vd_ssld| +|数据集|蛇种数据集| +|是否支持Fine-tuning|否| +|模型大小|84MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 蛇种识别(SnakeIdentification),该模型可准确识别蛇的种类,并精准判断蛇的毒性。该PaddleHub Module支持API预测及命令行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + - paddlex >= 1.3.7 + + +- ### 2、安装 + + - ```shell + $ hub install SnakeIdentification + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +- ### 3、在线体验 + [AI Studio 快速体验](https://aistudio.baidu.com/aistudio/projectdetail/1646951) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run SnakeIdentification --input_path /PATH/TO/IMAGE + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="SnakeIdentification") + images = [cv2.imread('/PATH/TO/IMAGE')] + results = classifier.predict(images=images) + for result in results: + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images:list类型,待检测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install SnakeIdentification==1.0.0 + ``` diff --git a/modules/image/classification/SnakeIdentification/requirements.txt b/modules/image/classification/SnakeIdentification/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..307c5de765a9bb322c4deebf2bcee55109e7ce74 --- /dev/null +++ b/modules/image/classification/SnakeIdentification/requirements.txt @@ -0,0 +1 @@ +paddlex==1.3.7 diff --git a/modules/image/classification/alexnet_imagenet/README.md b/modules/image/classification/alexnet_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..50fe4c0b30cbe51015358f44d8ad1a663b54f914 --- /dev/null +++ b/modules/image/classification/alexnet_imagenet/README.md @@ -0,0 +1,84 @@ +# alexnet_imagenet + +|模型名称|alexnet_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|AlexNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|234MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - AlexNet是图像分类中的经典模型。模型由Alex Krizhevsky于2012年提出,并在2012年ILSVRC比赛中夺得冠军。该PaddleHub Module结构为AlexNet,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install alexnet_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run alexnet_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="alexnet_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": 
[test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install alexnet_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/darknet53_imagenet/README.md b/modules/image/classification/darknet53_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..161f4342794229f00ea7e0baa42f73831ee460a4 --- /dev/null +++ b/modules/image/classification/darknet53_imagenet/README.md @@ -0,0 +1,84 @@ +# darknet53_imagenet + +|模型名称|darknet53_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DarkNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|160MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DarkNet 是由 Joseph Redmon 提出的图像分类模型,并应用于Yolov3 中作为 Backbone 来完成特征提取。该网络采用连续的 3*3 和 1*1 卷积进行连接,并像ResNet 一样有ShortCut连接。该 PaddleHub Module 基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install darknet53_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run darknet53_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现文字识别模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="darknet53_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install darknet53_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/densenet121_imagenet/README.md b/modules/image/classification/densenet121_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..548d5d98392e91638f764692354b3984ce1ef39f --- /dev/null +++ b/modules/image/classification/densenet121_imagenet/README.md @@ -0,0 +1,84 @@ +# densenet121_imagenet + +|模型名称|densenet121_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DenseNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|34MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DenseNet 是 CVPR 2017 最佳论文的模型,DenseNet 以前馈方式将每一层与其他层连接,从而 L 层网络就有 L(L+1)/2 个直接连接。对于每一层,其输入是之前的所有层的特征图,而自己的特征图作为之后所有层的输入。DenseNet 缓解了梯度消失问题,加强特征传播,促进了特征重用,并大幅减少了参数量。该PaddleHub Module结构为 DenseNet121,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install densenet121_imagenet + ``` + - 
如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run densenet121_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="densenet121_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install densenet121_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/densenet161_imagenet/README.md b/modules/image/classification/densenet161_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..19c779407258ca49c1bcd12d0298b1f8cae0122c --- /dev/null +++ b/modules/image/classification/densenet161_imagenet/README.md @@ -0,0 +1,84 @@ +# densenet161_imagenet + +|模型名称|densenet161_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DenseNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|114MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DenseNet 是 CVPR 2017 最佳论文的模型,DenseNet 以前馈方式将每一层与其他层连接,从而 L 层网络就有 L(L+1)/2 个直接连接。对于每一层,其输入是之前的所有层的特征图,而自己的特征图作为之后所有层的输入。DenseNet 缓解了梯度消失问题,加强特征传播,促进了特征重用,并大幅减少了参数量。该PaddleHub Module结构为 DenseNet161,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install densenet161_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run densenet161_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="densenet161_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install densenet161_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/densenet169_imagenet/README.md b/modules/image/classification/densenet169_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..56a7bd4ea597fd569a79bad46386a5092ded34c6 --- /dev/null +++ b/modules/image/classification/densenet169_imagenet/README.md @@ -0,0 +1,84 @@ +# densenet169_imagenet + 
+|模型名称|densenet169_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DenseNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|59MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DenseNet 是 CVPR 2017 最佳论文的模型,DenseNet 以前馈方式将每一层与其他层连接,从而 L 层网络就有 L(L+1)/2 个直接连接。对于每一层,其输入是之前的所有层的特征图,而自己的特征图作为之后所有层的输入。DenseNet 缓解了梯度消失问题,加强特征传播,促进了特征重用,并大幅减少了参数量。该PaddleHub Module结构为 DenseNet169,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install densenet169_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run densenet169_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="densenet169_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install densenet169_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/densenet201_imagenet/README.md b/modules/image/classification/densenet201_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..702886c8579f6d9b786735de2e7babf0c607678a --- /dev/null +++ b/modules/image/classification/densenet201_imagenet/README.md @@ -0,0 +1,84 @@ +# densenet201_imagenet + +|模型名称|densenet201_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DenseNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|82MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DenseNet 是 CVPR 2017 最佳论文的模型,DenseNet 以前馈方式将每一层与其他层连接,从而 L 层网络就有 L(L+1)/2 个直接连接。对于每一层,其输入是之前的所有层的特征图,而自己的特征图作为之后所有层的输入。DenseNet 缓解了梯度消失问题,加强特征传播,促进了特征重用,并大幅减少了参数量。该PaddleHub Module结构为 DenseNet201,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install densenet201_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run densenet201_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="densenet201_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def 
classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install densenet201_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/densenet264_imagenet/README.md b/modules/image/classification/densenet264_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..4a35aea838a761747f054d4e3749c0ddcfad3569 --- /dev/null +++ b/modules/image/classification/densenet264_imagenet/README.md @@ -0,0 +1,84 @@ +# densenet264_imagenet + +|模型名称|densenet264_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DenseNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|135MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DenseNet 是 CVPR 2017 最佳论文的模型,DenseNet 以前馈方式将每一层与其他层连接,从而 L 层网络就有 L(L+1)/2 个直接连接。对于每一层,其输入是之前的所有层的特征图,而自己的特征图作为之后所有层的输入。DenseNet 缓解了梯度消失问题,加强特征传播,促进了特征重用,并大幅减少了参数量。该PaddleHub Module结构为 DenseNet264,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install densenet264_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run densenet264_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="densenet264_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install densenet264_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/dpn107_imagenet/README.md b/modules/image/classification/dpn107_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e97226f52d8a4ca5d8bb22ec6f523a34be189271 --- /dev/null +++ b/modules/image/classification/dpn107_imagenet/README.md @@ -0,0 +1,85 @@ +# dpn107_imagenet + +|模型名称|dpn107_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DPN| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|335MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DPN(Dual Path Networks) 是 ImageNet 2017 目标定位冠军的图像分类模型,融合了 ResNet 和 DenseNet 的核心思想。该PaddleHub Module结构为 DPN107,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install dpn107_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | 
[零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run dpn107_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="dpn107_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install dpn107_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/dpn131_imagenet/README.md b/modules/image/classification/dpn131_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..1afd847c268a4907fe9dd520ac12eaa78dec88df --- /dev/null +++ b/modules/image/classification/dpn131_imagenet/README.md @@ -0,0 +1,85 @@ +# dpn131_imagenet + +|模型名称|dpn131_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DPN| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|306MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DPN(Dual Path Networks) 是 ImageNet 2017 目标定位冠军的图像分类模型,融合了 ResNet 和 DenseNet 的核心思想。该PaddleHub Module结构为 DPN98,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install dpn131_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run dpn131_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="dpn131_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install dpn131_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/dpn68_imagenet/README.md b/modules/image/classification/dpn68_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..72518161921afbd4b02e9722715a6aaf63b2464d --- /dev/null +++ b/modules/image/classification/dpn68_imagenet/README.md @@ -0,0 +1,85 @@ +# dpn68_imagenet + +|模型名称|dpn68_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DPN| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|50MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DPN(Dual Path Networks) 是 ImageNet 2017 目标定位冠军的图像分类模型,融合了 ResNet 和 DenseNet 的核心思想。该PaddleHub Module结构为 DPN68,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 
3,支持直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install dpn68_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run dpn68_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="dpn68_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install dpn68_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/dpn92_imagenet/README.md b/modules/image/classification/dpn92_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..69024027df06f87801bdaf5e7c3c44183bd4a1eb --- /dev/null +++ b/modules/image/classification/dpn92_imagenet/README.md @@ -0,0 +1,85 @@ +# dpn92_imagenet + +|模型名称|dpn92_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DPN| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|146MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DPN(Dual Path Networks) 是 ImageNet 2017 目标定位冠军的图像分类模型,融合了 ResNet 和 DenseNet 的核心思想。该PaddleHub Module结构为 DPN92,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install dpn92_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run dpn92_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="dpn92_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install dpn92_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/dpn98_imagenet/README.md b/modules/image/classification/dpn98_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a418583c623729c72c281ef52c9071c46cc8edc1 --- /dev/null +++ 
b/modules/image/classification/dpn98_imagenet/README.md @@ -0,0 +1,86 @@ +# dpn98_imagenet + +|模型名称|dpn98_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|DPN| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|238MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - DPN(Dual Path Networks) 是 ImageNet 2017 目标定位冠军的图像分类模型,融合了 ResNet 和 DenseNet 的核心思想。该PaddleHub Module结构为 DPN98,基于ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install dpn98_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run dpn98_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="dpn98_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install dpn98_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/efficientnetb0_imagenet/README.md b/modules/image/classification/efficientnetb0_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a1013ab01f121c264517ecdd7ca0bbe055fcea0f --- /dev/null +++ b/modules/image/classification/efficientnetb0_imagenet/README.md @@ -0,0 +1,137 @@ +# efficientnetb0_imagenet + +|模型名称|efficientnetb0_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|22MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB0,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb0_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb0_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb0_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + - ```python + def classification(images=None, + paths=None, + 
batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb0_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb0_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb0_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb0_small_imagenet/README.md b/modules/image/classification/efficientnetb0_small_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..be464cc8ca77aedea64a199449d2d34e9db18eb4 --- /dev/null +++ b/modules/image/classification/efficientnetb0_small_imagenet/README.md @@ -0,0 +1,136 @@ +# efficientnetb0_small_imagenet + +|模型名称|efficientnetb0_small_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|20MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB0,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb0_small_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb0_small_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb0_small_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb0_small_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb0_small_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install efficientnetb0_small_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/efficientnetb1_imagenet/README.md b/modules/image/classification/efficientnetb1_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..fe5981ece4560d811de6953db1b18ea00bee62c5 --- /dev/null +++ b/modules/image/classification/efficientnetb1_imagenet/README.md @@ -0,0 +1,136 @@ +# efficientnetb1_imagenet + +|模型名称|efficientnetb1_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|33MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB1,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb1_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb1_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb1_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb1_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb1_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb1_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb2_imagenet/README.md b/modules/image/classification/efficientnetb2_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3972b35e11309354a24787a0167ea450bc6891b7 --- /dev/null +++ b/modules/image/classification/efficientnetb2_imagenet/README.md @@ -0,0 +1,136 @@ +# efficientnetb2_imagenet + +|模型名称|efficientnetb2_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|38MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB2,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb2_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb2_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb2_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb2_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb2_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb2_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb3_imagenet/README.md b/modules/image/classification/efficientnetb3_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3bb6273921e6cce4c3210736b0eeb4b1c264f53c --- /dev/null +++ b/modules/image/classification/efficientnetb3_imagenet/README.md @@ -0,0 +1,136 @@ +# efficientnetb3_imagenet + +|模型名称|efficientnetb3_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|51MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB3,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb3_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb3_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb3_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb3_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb3_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb3_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb4_imagenet/README.md b/modules/image/classification/efficientnetb4_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..1a7d0e9baf4a7e8503c3e7f7b9fdec32fdac52d9 --- /dev/null +++ b/modules/image/classification/efficientnetb4_imagenet/README.md @@ -0,0 +1,137 @@ +# efficientnetb4_imagenet + +|模型名称|efficientnetb4_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|77MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB4,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb4_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb4_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb4_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb4_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb4_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb4_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb5_imagenet/README.md b/modules/image/classification/efficientnetb5_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3c8a4bc373f02cf773cfe4a742c86c3025774361 --- /dev/null +++ b/modules/image/classification/efficientnetb5_imagenet/README.md @@ -0,0 +1,137 @@ +# efficientnetb5_imagenet + +|模型名称|efficientnetb5_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|121MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB5,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb5_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb5_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb5_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb5_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb5_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb5_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb6_imagenet/README.md b/modules/image/classification/efficientnetb6_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..746ff1a711556429a03625db0f959146fbae3331 --- /dev/null +++ b/modules/image/classification/efficientnetb6_imagenet/README.md @@ -0,0 +1,136 @@ +# efficientnetb6_imagenet + +|模型名称|efficientnetb6_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|170MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB6,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb6_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb6_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb6_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb6_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb6_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb6_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/efficientnetb7_imagenet/README.md b/modules/image/classification/efficientnetb7_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..bef07051b9390f2416748df89041d60ef9290d32 --- /dev/null +++ b/modules/image/classification/efficientnetb7_imagenet/README.md @@ -0,0 +1,137 @@ +# efficientnetb7_imagenet + +|模型名称|efficientnetb7_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|EfficientNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|260MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - EfficientNet 是谷歌的开源新模型,是一个轻量级网络,它的主干网络由 MBConv 构成,同时采取了 squeeze-and-excitation 操作对网络结构进行优化。该 PaddleHub Module结构为 EfficientNetB7,基于 ImageNet-2012 数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install efficientnetb7_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run efficientnetb7_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="efficientnetb7_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m efficientnetb7_imagenet + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/efficientnetb7_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 提升预测性能以及易用性 + - ```shell + $ hub install efficientnetb7_imagenet==1.1.0 + ``` diff --git a/modules/image/classification/fix_resnext101_32x48d_wsl_imagenet/README.md b/modules/image/classification/fix_resnext101_32x48d_wsl_imagenet/README.md index 12c673852199021c2f88a228795642b334d16841..ef750cec4941fb598296ccc9308f8261862d0c70 100644 --- a/modules/image/classification/fix_resnext101_32x48d_wsl_imagenet/README.md +++ b/modules/image/classification/fix_resnext101_32x48d_wsl_imagenet/README.md @@ -1,149 +1,134 @@ -## 命令行预测 +# fix_resnext101_32x48d_wsl_imagenet -``` -hub run fix_resnext101_32x48d_wsl_imagenet --input_path "/PATH/TO/IMAGE" -``` +|模型名称|fix_resnext101_32x48d_wsl_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|3.1GB| +|最新更新日期|-| +|数据指标|-| -## API -```python -def get_expected_image_width() -``` +## 一、模型基本信息 -返回预处理的图片宽度,也就是224。 -```python -def get_expected_image_height() -``` -返回预处理的图片高度,也就是224。 +- ### 模型介绍 -```python -def get_pretrained_images_mean() -``` + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 -返回预处理的图片均值,也就是 \[0.485, 0.456, 0.406\]。 -```python -def get_pretrained_images_std() -``` +## 二、安装 -返回预处理的图片标准差,也就是 \[0.229, 0.224, 0.225\]。 +- ### 1、环境依赖 + - paddlepaddle >= 1.6.2 -```python -def context(trainable=True, pretrained=True) -``` + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -**参数** -* trainable (bool): 计算图的参数是否为可训练的; -* pretrained (bool): 是否加载默认的预训练模型。 +- ### 2、安装 -**返回** + - ```shell + $ hub install fix_resnext101_32x48d_wsl_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -* inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量; -* outputs (dict): 计算图的输出,key 为 'classification' 和 'feature_map',其相应的值为: - * classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出; - * feature\_map (paddle.fluid.framework.Variable): 特征匹配,全连接层前面的那个张量。 -* context\_prog(fluid.Program): 计算图,用于迁移学习。 +## 三、模型API预测 -```python -def classification(images=None, - paths=None, - batch_size=1, - use_gpu=False, - top_k=1): -``` +- ### 1、命令行预测 -**参数** + - ```shell + $ hub run 
fix_resnext101_32x48d_wsl_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -* images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR; -* paths (list\[str\]): 图片的路径; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU 来预测; -* top\_k (int): 返回预测结果的前 k 个。 +- ### 2、预测代码示例 -**返回** + - ```python + import paddlehub as hub + import cv2 -res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别动物的类别,value为置信度。 + classifier = hub.Module(name="fix_resnext101_32x48d_wsl_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` +- ### 3、API -将模型保存到指定路径。 -**参数** + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 -## 代码示例 + - **返回** -```python -import paddlehub as hub -import cv2 + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 -classifier = hub.Module(name="fix_resnext101_32x48d_wsl_imagenet") -result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) -# or -# result = classifier.classification(paths=['/PATH/TO/IMAGE']) -``` -## 服务部署 -PaddleHub Serving可以部署一个在线图像识别服务。 +## 四、服务部署 -## 第一步:启动PaddleHub Serving +- PaddleHub Serving可以部署一个图像识别的在线服务。 -运行启动命令: -```shell -$ hub serving start -m fix_resnext101_32x48d_wsl_imagenet -``` +- ### 第一步:启动PaddleHub Serving -这样就完成了一个在线图像识别服务化API的部署,默认端口号为8866。 + - 运行启动命令: + - ```shell + $ hub serving start -m fix_resnext101_32x48d_wsl_imagenet + ``` -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 -## 第二步:发送预测请求 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 +- ### 第二步:发送预测请求 -```python -import requests -import json -import cv2 -import base64 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - ```python + import requests + import json + import cv2 + import base64 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/fix_resnext101_32x48d_wsl_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/fix_resnext101_32x48d_wsl_imagenet" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + # 打印预测结果 + print(r.json()["results"]) + ``` -# 打印预测结果 -print(r.json()["results"]) -``` -### 查看代码 +## 五、更新历史 -https://github.com/PaddlePaddle/PaddleClas +* 1.0.0 -### 依赖 - -paddlepaddle >= 1.6.2 - -paddlehub >= 1.6.0 + 初始发布 + - ```shell + $ hub install fix_resnext101_32x48d_wsl_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/food_classification/README.md b/modules/image/classification/food_classification/README.md index 138bfcf037cc2e6d0f6ef71fb392b2d13cc2b309..01f910138e18aee8c45d1a2f56f493d547988d50 100644 --- a/modules/image/classification/food_classification/README.md +++ b/modules/image/classification/food_classification/README.md @@ -1,84 +1,90 @@ -# food_classification - -类别 图像 - 图像分类 - -网络 ResNet50_vd_ssld - - -> 模型概述 - -美食分类(food_classification),该模型可识别苹果派,小排骨,烤面包,牛肉馅饼,牛肉鞑靼。该PaddleHub Module支持API预测及命令行预测。 - -> 选择模型版本进行安装 - -```shell -$ hub install food_classification==1.0.0 -``` -> Module API说明 - -```python -def predict(self, - images=None, - paths=None, - batch_size=1, - use_gpu=False, - **kwargs): -``` -美食分类预测接口,输入一张图像,输出该图像上食物的类别 - -参数 - -* images (list[numpy.ndarray]): 图片数据,ndarray.shape 为 [H, W, C],BGR格式; -* paths (list[str]): 图片的路径; -* batch_size (int): batch 的大小; -* use_gpu (bool): 是否使用 GPU; - -返回 - -* res (list[dict]): 识别结果的列表,列表中每一个元素为 dict,各字段为: - * category_id (int): 类别的id; - * category(str): 类别; - * score(float): 准确率; - -## 代码示例 - -### API调用 - -```python -import cv2 -import paddlehub as hub - -module = hub.Module(name="food_classification") - -images = [cv2.imread('PATH/TO/IMAGE')] - -# execute predict and print the result -results 
= module.predict(images=images) -for result in results: - print(result) -``` - -### 命令行调用 -```shell -$ hub run food_classification --input_path /PATH/TO/IMAGE --use_gpu True -``` - -## 效果展示 - -### 原图 - - -### 输出结果 -```python -[{'category_id': 0, 'category': 'apple_pie', 'score': 0.9985085}] -``` - -## 贡献者 -彭兆帅、郑博培 - -## 依赖 -paddlepaddle >= 2.0.0 - -paddlehub >= 2.0.0 - -paddlex >= 1.3.7 +# food_classification + +|模型名称|food_classification| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet50_vd_ssld| +|数据集|美食数据集| +|是否支持Fine-tuning|否| +|模型大小|91MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 美食分类(food_classification),该模型可识别苹果派,小排骨,烤面包,牛肉馅饼,牛肉鞑靼。该PaddleHub Module支持API预测及命令行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + - paddlex >= 1.3.7 + + +- ### 2、安装 + + - ```shell + $ hub install food_classification + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run food_classification --input_path /PATH/TO/IMAGE + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="food_classification") + images = [cv2.imread('/PATH/TO/IMAGE')] + results = classifier.predict(images=images) + for result in results: + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images:list类型,待检测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型: + - category_id (int): 类别的id; + - category(str): 类别; + - score(float): 准确率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install food_classification==1.0.0 + ``` diff --git a/modules/image/classification/food_classification/requirements.txt b/modules/image/classification/food_classification/requirements.txt index ad32066430096ff4050dce8930f74eae5eb9d2f0..f3c5b8fb12473794251e0a4669dac313cb93eff4 100644 --- a/modules/image/classification/food_classification/requirements.txt +++ b/modules/image/classification/food_classification/requirements.txt @@ -1,3 +1,3 @@ paddlepaddle >= 2.0.0 paddlehub >= 2.0.0 -paddlex >= 1.3.7 +paddlex == 1.3.7 diff --git a/modules/image/classification/googlenet_imagenet/README.md b/modules/image/classification/googlenet_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..7dec0850aea31f81e2488829ea20d7e80a8f6c3d --- /dev/null +++ b/modules/image/classification/googlenet_imagenet/README.md @@ -0,0 +1,84 @@ +# googlenet_imagenet + +|模型名称|googlenet_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|GoogleNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|28MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - GoogleNet是图像分类中的经典模型。由Christian Szegedy等人在2014年提出,并获得了2014年ILSVRC竞赛冠军。该PaddleHub Module结构为GoogleNet,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install googlenet_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | 
[零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run googlenet_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="googlenet_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install googlenet_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/inception_v4_imagenet/README.md b/modules/image/classification/inception_v4_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..ca8d613ee382e526ab8b75c1a502639e8c0fec40 --- /dev/null +++ b/modules/image/classification/inception_v4_imagenet/README.md @@ -0,0 +1,84 @@ +# inception_v4_imagenet + +|模型名称|inception_v4_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Inception_V4| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|167MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - Inception 结构最初由 GoogLeNet 引入,因此 GoogLeNet 也被称为 Inception-v1,通过在 Inception-v1 的基础上引入Batch Normalization、分解、残差连接等技术,设计出了Inception-v4。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install inception_v4_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run inception_v4_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="inception_v4_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率。 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install inception_v4_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/marine_biometrics/README.md b/modules/image/classification/marine_biometrics/README.md index 6ba7acd92dc5f94c28a65695a7fcd0f93050190e..797288aee8ce47c102dc0bf2973bd57fa8d473d1 100644 --- a/modules/image/classification/marine_biometrics/README.md +++ b/modules/image/classification/marine_biometrics/README.md @@ -1,69 +1,85 @@ -marine_biometrics - -类别 图像 - 图像分类 - -网络 ResNet50_vd_ssld - -数据集 Fish4Knowledge - -# 模型概述 -海洋生物识别(marine_biometrics),该模型可准确识别鱼的种类。该PaddleHub Module支持API预测及命令行预测。 - -# 选择模型版本进行安装 -$ hub install marine_biometrics==1.0.0 - -# 在线体验 -[AI 
Studio快速体验](https://aistudio.baidu.com/aistudio/projectdetail/1667809) - -# 命令行预测示例 -$ hub run marine_biometrics --image 1.png --use_gpu True - -# Module API说明 -## def predict(data) -海洋生物识别预测接口,输入一张图像,输出该图像上鱼的类别 -### 参数 -- data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 - -### 返回 -- result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 - -# 代码示例 - -## API调用 - -~~~ -import cv2 -import paddlehub as hub - -module = hub.Module(name="MarineBiometrics") - -images = [cv2.imread('PATH/TO/IMAGE')] - -# execute predict and print the result -results = module.predict(images=images) -for result in results: - print(result) -~~~ - -## 命令行调用 -~~~ -$ hub run marine_biometrics --image 1.png --use_gpu True -~~~ - -# 效果展示 - -## 原图 - - -## 输出结果 -~~~ -[{'category_id': 16, 'category': 'Plectroglyphidodon_dickii', 'score': 0.9932127}] -~~~ - -# 贡献者 -郑博培、彭兆帅 - -# 依赖 -paddlepaddle >= 2.0.0 - -paddlehub >= 2.0.0 +# marine_biometrics + +|模型名称|marine_biometrics| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet50_vd_ssld| +|数据集|Fish4Knowledge| +|是否支持Fine-tuning|否| +|模型大小|84MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 海洋生物识别(marine_biometrics),该模型可准确识别鱼的种类。该PaddleHub Module支持API预测及命令行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install marine_biometrics + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run marine_biometrics --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="marine_biometrics") + images = [cv2.imread('/PATH/TO/IMAGE')] + results = classifier.predict(images=images) + for result in results: + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images:list类型,待检测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install marine_biometrics==1.0.0 + ``` diff --git a/modules/image/classification/marine_biometrics/requirements.txt b/modules/image/classification/marine_biometrics/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..307c5de765a9bb322c4deebf2bcee55109e7ce74 --- /dev/null +++ b/modules/image/classification/marine_biometrics/requirements.txt @@ -0,0 +1 @@ +paddlex==1.3.7 diff --git a/modules/image/classification/mobilenet_v2_animals/README.md b/modules/image/classification/mobilenet_v2_animals/README.md index f1824d6536d8e8e5ea5400df5486f0ff8ec2268d..e1ba58dcdce89cfc89cf33108716e903d6458d54 100644 --- a/modules/image/classification/mobilenet_v2_animals/README.md +++ b/modules/image/classification/mobilenet_v2_animals/README.md @@ -1,159 +1,134 @@ -```shell -$ hub install mobilenet_v2_animals==1.0.0 -``` +# mobilenet_v2_animals -

-MobileNet 系列的网络结构
+|模型名称|mobilenet_v2_animals| +| :--- | :---: | +|类别|图像-图像分类| +|网络|MobileNet_v2| +|数据集|百度自建动物数据集| +|是否支持Fine-tuning|否| +|模型大小|50MB| +|最新更新日期|-| +|数据指标|-| -模型的详情可参考[论文](https://arxiv.org/pdf/1801.04381.pdf) -## 命令行预测 +## 一、模型基本信息 -``` -hub run mobilenet_v2_animals --input_path "/PATH/TO/IMAGE" -``` -## API -```python -def get_expected_image_width() -``` +- ### 模型介绍 -返回预处理的图片宽度,也就是224。 + - MobileNet V2 是一个轻量化的卷积神经网络,它在 MobileNet 的基础上,做了 Inverted Residuals 和 Linear bottlenecks 这两大改进。该 PaddleHub Module 是在百度自建动物数据集上训练得到的,可用于图像分类和特征提取,当前已支持7978种动物的分类识别。模型的详情可参考[论文](https://arxiv.org/pdf/1801.04381.pdf)。 -```python -def get_expected_image_height() -``` -返回预处理的图片高度,也就是224。 -```python -def get_pretrained_images_mean() -``` +## 二、安装 -返回预处理的图片均值,也就是 \[0.485, 0.456, 0.406\]。 +- ### 1、环境依赖 -```python -def get_pretrained_images_std() -``` + - paddlepaddle >= 1.6.2 -返回预处理的图片标准差,也就是 \[0.229, 0.224, 0.225\]。 + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -```python -def context(trainable=True, pretrained=True) -``` +- ### 2、安装 -**参数** + - ```shell + $ hub install mobilenet_v2_animals + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -* trainable (bool): 计算图的参数是否为可训练的; -* pretrained (bool): 是否加载默认的预训练模型。 +## 三、模型API预测 -**返回** +- ### 1、命令行预测 -* inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量; -* outputs (dict): 计算图的输出,key 为 'classification' 和 'feature_map',其相应的值为: - * classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出; - * feature\_map (paddle.fluid.framework.Variable): 特征匹配,全连接层前面的那个张量。 -* context\_prog(fluid.Program): 计算图,用于迁移学习。 + - ```shell + $ hub run mobilenet_v2_animals --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -```python -def classification(images=None, - paths=None, - batch_size=1, - use_gpu=False, - top_k=1): -``` +- ### 2、预测代码示例 -**参数** + - ```python + import paddlehub as hub + import cv2 -* images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR; -* paths (list\[str\]): 图片的路径; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU 来预测; -* top\_k (int): 返回预测结果的前 k 个。 + classifier = hub.Module(name="mobilenet_v2_animals") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` -**返回** +- ### 3、API -res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别动物的类别,value为置信度。 -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** -将模型保存到指定路径。 + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。
+
+ - **返回**
+
+ - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的动物类别,value为置信度。
+
+
+
+## 四、服务部署
+
+- PaddleHub Serving可以部署一个动物识别的在线服务。
+
+- ### 第一步:启动PaddleHub Serving
+
+ - 运行启动命令:
+ - ```shell
+ $ hub serving start -m mobilenet_v2_animals
+ ```
+
+ - 这样就完成了一个动物识别的在线服务的部署,默认端口号为8866。
+
+ - **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA\_VISIBLE\_DEVICES环境变量;如不使用GPU,则无需设置。
+
+- ### 第二步:发送预测请求
+
+ - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
+
+ - ```python
+ import requests
+ import json
+ import cv2
+ import base64
+
+ def cv2_to_base64(image):
+ data = cv2.imencode('.jpg', image)[1]
+ return base64.b64encode(data.tobytes()).decode('utf8')
+
+ # 发送HTTP请求
+ data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+ headers = {"Content-type": "application/json"}
+ url = "http://127.0.0.1:8866/predict/mobilenet_v2_animals"
+ r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+ # 打印预测结果
+ print(r.json()["results"])
+ ```
+
+## 五、更新历史
+
+* 1.0.0
+
+ 初始发布
+
+ - ```shell
+ $ hub install mobilenet_v2_animals==1.0.0
+ ```
diff --git a/modules/image/classification/mobilenet_v2_dishes/README.md b/modules/image/classification/mobilenet_v2_dishes/README.md
index cdbd1c048620f88c5b2564c642912de839ce1230..aad927459be227f693f107e47e61655a934ce95b 100644
--- a/modules/image/classification/mobilenet_v2_dishes/README.md
+++ b/modules/image/classification/mobilenet_v2_dishes/README.md
@@ -1,159 +1,139 @@
-```shell
-$ hub install mobilenet_v2_dishes==1.0.0
-```
+# mobilenet_v2_dishes
+

-MobileNet 系列的网络结构
- -模型的详情可参考[论文](https://arxiv.org/pdf/1801.04381.pdf) - -## 命令行预测 +|模型名称|mobilenet_v2_dishes| +| :--- | :---: | +|类别|图像-图像分类| +|网络|MobileNet_v2| +|数据集|百度自建菜品数据集| +|是否支持Fine-tuning|否| +|模型大小|52MB| +|最新更新日期|-| +|数据指标|-| -``` -hub run mobilenet_v2_dishes --input_path "/PATH/TO/IMAGE" -``` -## API +## 一、模型基本信息 -```python -def get_expected_image_width() -``` -返回预处理的图片宽度,也就是224。 -```python -def get_expected_image_height() -``` +- ### 模型介绍 -返回预处理的图片高度,也就是224。 + - MobileNet V2 是一个轻量化的卷积神经网络,它在 MobileNet 的基础上,做了 Inverted Residuals 和 Linear bottlenecks 这两大改进。该 PaddleHub Module 是在百度自建菜品数据集上训练得到的,可用于图像分类和特征提取,当前已支持8416种菜品的分类识别。 -```python -def get_pretrained_images_mean() -``` - -返回预处理的图片均值,也就是 \[0.485, 0.456, 0.406\]。 - -```python -def get_pretrained_images_std() -``` +

+
+

-返回预处理的图片标准差,也就是 \[0.229, 0.224, 0.225\]。 + - 更多详情参考:[MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/pdf/1801.04381.pdf) +## 二、安装 -```python -def context(trainable=True, pretrained=True) -``` +- ### 1、环境依赖 -**参数** + - paddlepaddle >= 1.6.2 -* trainable (bool): 计算图的参数是否为可训练的; -* pretrained (bool): 是否加载默认的预训练模型。 + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -**返回** -* inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量; -* outputs (dict): 计算图的输出,key 为 'classification' 和 'feature_map',其相应的值为: - * classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出; - * feature\_map (paddle.fluid.framework.Variable): 特征匹配,全连接层前面的那个张量。 -* context\_prog(fluid.Program): 计算图,用于迁移学习。 +- ### 2、安装 -```python -def classification(images=None, - paths=None, - batch_size=1, - use_gpu=False, - top_k=1): -``` + - ```shell + $ hub install mobilenet_v2_dishes + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -**参数** +## 三、模型API预测 -* images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR; -* paths (list\[str\]): 图片的路径; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU 来预测; -* top\_k (int): 返回预测结果的前 k 个。 +- ### 1、命令行预测 -**返回** + - ```shell + $ hub run mobilenet_v2_dishes --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现菜品分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 +- ### 2、预测代码示例 -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` + - ```python + import paddlehub as hub + import cv2 -将模型保存到指定路径。 + classifier = hub.Module(name="mobilenet_v2_dishes") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` -**参数** +- ### 3、API -* dirname: 存在模型的目录名称 -* model_filename: 模型文件名称,默认为\_\_model\_\_ -* params_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** -## 代码示例 + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 -```python -import paddlehub as hub -import cv2 + - **返回** -classifier = hub.Module(name="mobilenet_v2_dishes") + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 -result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) -# or -# result = classifier.classification(paths=['/PATH/TO/IMAGE']) -``` -## 服务部署 -PaddleHub Serving可以部署一个菜品分类的在线服务。 -## 第一步:启动PaddleHub Serving +## 四、服务部署 -运行启动命令: -```shell -$ hub serving start -m mobilenet_v2_dishes -``` +- PaddleHub Serving可以部署一个菜品分类的在线服务。 -这样就完成了一个菜品分类的在线服务的部署,默认端口号为8866。 +- ### 第一步:启动PaddleHub Serving -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + - 运行启动命令: + - ```shell + $ hub serving start -m mobilenet_v2_dishes + ``` -## 第二步:发送预测请求 + - 这样就完成了一个菜品分类的在线服务的部署,默认端口号为8866。 -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 -```python -import requests -import json -import cv2 -import base64 +- ### 第二步:发送预测请求 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') + - ```python + import requests + import json + import cv2 + import base64 + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/mobilenet_v2_dishes" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/mobilenet_v2_dishes" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -# 打印预测结果 -print(r.json()["results"]) -``` + # 打印预测结果 + print(r.json()["results"]) + ``` -### 查看代码 -[PaddlePaddle/models 图像分类](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification) +## 五、更新历史 -### 依赖 +* 1.0.0 -paddlepaddle >= 1.6.2 + 初始发布 -paddlehub >= 1.6.0 + - ```shell + $ hub install mobilenet_v2_dishes==1.0.0 + ``` diff --git a/modules/image/classification/mobilenet_v2_imagenet/README.md b/modules/image/classification/mobilenet_v2_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..7b9bb0f7e5a0bcfcab35910ad88189e5bce756b3 --- /dev/null +++ b/modules/image/classification/mobilenet_v2_imagenet/README.md @@ -0,0 +1,88 @@ +# mobilenet_v2_imagenet + +|模型名称|mobilenet_v2_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Mobilenet_v2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|15MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - MobileNet V2是Mark Sandler, Andrew Howard等人在2018年提出的一个图像分类模型,该系列模型(MobileNet)是为移动和嵌入式设备提出的高效模型,在模型参数较少的情况下仍然保持了较高的分类准确率。该PaddleHub Module基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install mobilenet_v2_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run mobilenet_v2_imagenet --input_path 
"/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="mobilenet_v2_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + + 修复python2中编码问题 + + - ```shell + $ hub install mobilenet_v2_imagenet==1.0.1 + ``` diff --git a/modules/image/classification/mobilenet_v2_imagenet_ssld/README.md b/modules/image/classification/mobilenet_v2_imagenet_ssld/README.md new file mode 100644 index 0000000000000000000000000000000000000000..4529275acdfd69eb40ff4f1133adde685aef7014 --- /dev/null +++ b/modules/image/classification/mobilenet_v2_imagenet_ssld/README.md @@ -0,0 +1,133 @@ +# mobilenet_v2_imagenet_ssld + +|模型名称|mobilenet_v2_imagenet_ssld| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Mobilenet_v2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|15MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - MobileNet V2是Mark Sandler, Andrew Howard等人在2018年提出的一个图像分类模型,该系列模型(MobileNet)是为移动和嵌入式设备提出的高效模型,在模型参数较少的情况下仍然保持了较高的分类准确率。该PaddleHub Module基于ImageNet-2012数据集并采用PaddleClas提供的SSLD蒸馏方法训练得到,接受输入图片大小为224 x 224 x 3,支持finetune,也可以直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install mobilenet_v2_imagenet_ssld + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run mobilenet_v2_imagenet_ssld --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="mobilenet_v2_imagenet_ssld") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m mobilenet_v2_imagenet_ssld + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/mobilenet_v2_imagenet_ssld" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install mobilenet_v2_imagenet_ssld==1.0.0 + ``` diff --git a/modules/image/classification/mobilenet_v3_large_imagenet_ssld/README.md b/modules/image/classification/mobilenet_v3_large_imagenet_ssld/README.md new file mode 100644 index 0000000000000000000000000000000000000000..03cf9d75fd085c41addccf1758b04151ffccf76e --- /dev/null +++ b/modules/image/classification/mobilenet_v3_large_imagenet_ssld/README.md @@ -0,0 +1,135 @@ +# mobilenet_v3_large_imagenet_ssld + +|模型名称|mobilenet_v3_large_imagenet_ssld| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Mobilenet_v3_large| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|23MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - MobileNetV3是Google在2019年发布的新模型,作者通过结合NAS与NetAdapt进行搜索得到该网络结构,提供了Large和Small两个版本,分别适用于对资源不同要求的情况。对比于MobileNetV2,新的模型在速度和精度方面均有提升。该PaddleHubModule的模型结构为MobileNetV3 Large,基于ImageNet-2012数据集并采用PaddleClas提供的SSLD蒸馏方法训练得到,接受输入图片大小为224 x 224 x 3,支持finetune,也可以直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install mobilenet_v3_large_imagenet_ssld + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run mobilenet_v3_large_imagenet_ssld --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="mobilenet_v3_large_imagenet_ssld") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m mobilenet_v3_large_imagenet_ssld + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/mobilenet_v3_large_imagenet_ssld" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install mobilenet_v3_large_imagenet_ssld==1.0.0 + ``` diff --git a/modules/image/classification/mobilenet_v3_small_imagenet_ssld/README.md b/modules/image/classification/mobilenet_v3_small_imagenet_ssld/README.md new file mode 100644 index 0000000000000000000000000000000000000000..bdcd475885d3ac0b8181978af50a3d0ead7fcaf7 --- /dev/null +++ b/modules/image/classification/mobilenet_v3_small_imagenet_ssld/README.md @@ -0,0 +1,134 @@ +# mobilenet_v3_small_imagenet_ssld + +|模型名称|mobilenet_v3_small_imagenet_ssld| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Mobilenet_v3_Small| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|13MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - MobileNetV3是Google在2019年发布的新模型,作者通过结合NAS与NetAdapt进行搜索得到该网络结构,提供了Large和Small两个版本,分别适用于对资源不同要求的情况。对比于MobileNetV2,新的模型在速度和精度方面均有提升。该PaddleHubModule的模型结构为MobileNetV3 Small,基于ImageNet-2012数据集并采用PaddleClas提供的SSLD蒸馏方法训练得到,接受输入图片大小为224 x 224 x 3,支持finetune,也可以直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install mobilenet_v3_small_imagenet_ssld + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run mobilenet_v3_small_imagenet_ssld --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="mobilenet_v3_small_imagenet_ssld") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 + + - **返回** + + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + + + +## 四、服务部署 + +- PaddleHub Serving可以部署一个图像识别的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m mobilenet_v3_small_imagenet_ssld + ``` + + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/mobilenet_v3_small_imagenet_ssld" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(r.json()["results"]) + ``` + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install mobilenet_v3_small_imagenet_ssld==1.0.0 + ``` diff --git a/modules/image/classification/nasnet_imagenet/README.md b/modules/image/classification/nasnet_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..b9ca44eb5114e65433ecb398ba2acedc6634d5c7 --- /dev/null +++ b/modules/image/classification/nasnet_imagenet/README.md @@ -0,0 +1,87 @@ +# nasnet_imagenet + +|模型名称|nasnet_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|NASNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|345MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - NASNet是Google通过AutoML自动训练出来的图像分类模型。该PaddleHub Module基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install nasnet_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run nasnet_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="nasnet_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + + 修复python2中编码问题 + - ```shell + $ hub install nasnet_imagenet==1.0.1 + ``` diff --git a/modules/image/classification/pnasnet_imagenet/README.md b/modules/image/classification/pnasnet_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..e87ff0721634bb6613bcac1b4a42035ff4fe1b55 --- /dev/null +++ b/modules/image/classification/pnasnet_imagenet/README.md @@ -0,0 +1,87 @@ +# pnasnet_imagenet + +|模型名称|pnasnet_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|PNASNet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|333MB| +|最新更新日期|-| +|数据指标|-| + + 
+## 一、模型基本信息 + + + +- ### 模型介绍 + + - PNASNet是Google通过AutoML自动训练出来的图像分类模型。该PaddleHub Module基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install pnasnet_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run pnasnet_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="pnasnet_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + + 修复python2中编码问题 + - ```shell + $ hub install pnasnet_imagenet==1.0.1 + ``` diff --git a/modules/image/classification/res2net101_vd_26w_4s_imagenet/README.md b/modules/image/classification/res2net101_vd_26w_4s_imagenet/README.md index 25c1fcb335c5c46987572de1cc0f3934a3c0138f..75f10a97f1e175b3df1dd60fb957f7133073108f 100644 --- a/modules/image/classification/res2net101_vd_26w_4s_imagenet/README.md +++ b/modules/image/classification/res2net101_vd_26w_4s_imagenet/README.md @@ -1,149 +1,134 @@ -## 命令行预测 +# res2net101_vd_26w_4s_imagenet -``` -hub run res2net101_vd_26w_4s_imagenet --input_path "/PATH/TO/IMAGE" -``` +|模型名称|res2net101_vd_26w_4s_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Res2Net| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|179MB| +|最新更新日期|-| +|数据指标|-| -## API -```python -def get_expected_image_width() -``` +## 一、模型基本信息 -返回预处理的图片宽度,也就是224。 -```python -def get_expected_image_height() -``` -返回预处理的图片高度,也就是224。 +- ### 模型介绍 -```python -def get_pretrained_images_mean() -``` + - Res2Net是2019年提出的一种全新的对ResNet的改进方案,该方案可以和现有其他优秀模块轻松整合,在不增加计算负载量的情况下,在ImageNet、CIFAR-100等数据集上的测试性能超过了ResNet。Res2Net结构简单,性能优越,进一步探索了CNN在更细粒度级别的多尺度表示能力。 该 PaddleHub Module 使用 ImageNet-2012数据集训练,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 -返回预处理的图片均值,也就是 \[0.485, 0.456, 0.406\]。 -```python -def get_pretrained_images_std() -``` +## 二、安装 -返回预处理的图片标准差,也就是 \[0.229, 0.224, 0.225\]。 +- ### 1、环境依赖 + - paddlepaddle >= 1.6.2 -```python -def context(trainable=True, pretrained=True) -``` + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -**参数** -* trainable (bool): 计算图的参数是否为可训练的; -* pretrained (bool): 是否加载默认的预训练模型。 +- ### 2、安装 -**返回** + - ```shell + $ hub install res2net101_vd_26w_4s_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -* inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量; -* outputs (dict): 计算图的输出,key 为 'classification' 和 'feature_map',其相应的值为: - * classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出; - 
* feature\_map (paddle.fluid.framework.Variable): 特征匹配,全连接层前面的那个张量。 -* context\_prog(fluid.Program): 计算图,用于迁移学习。 +## 三、模型API预测 -```python -def classification(images=None, - paths=None, - batch_size=1, - use_gpu=False, - top_k=1): -``` +- ### 1、命令行预测 -**参数** + - ```shell + $ hub run res2net101_vd_26w_4s_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -* images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR; -* paths (list\[str\]): 图片的路径; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU 来预测; -* top\_k (int): 返回预测结果的前 k 个。 +- ### 2、预测代码示例 -**返回** + - ```python + import paddlehub as hub + import cv2 -res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别动物的类别,value为置信度。 + classifier = hub.Module(name="res2net101_vd_26w_4s_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` +- ### 3、API -将模型保存到指定路径。 -**参数** + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**;
+ - top\_k (int): 返回预测结果的前 k 个。 -## 代码示例 + - **返回** -```python -import paddlehub as hub -import cv2 + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 -classifier = hub.Module(name="res2net101_vd_26w_4s_imagenet") -result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) -# or -# result = classifier.classification(paths=['/PATH/TO/IMAGE']) -``` -## 服务部署 +## 四、服务部署 -PaddleHub Serving可以部署一个在线图像识别服务。 +- PaddleHub Serving可以部署一个图像识别的在线服务。 -## 第一步:启动PaddleHub Serving +- ### 第一步:启动PaddleHub Serving -运行启动命令: -```shell -$ hub serving start -m res2net101_vd_26w_4s_imagenet -``` + - 运行启动命令: + - ```shell + $ hub serving start -m res2net101_vd_26w_4s_imagenet + ``` -这样就完成了一个在线图像识别服务化API的部署,默认端口号为8866。 + - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 -## 第二步:发送预测请求 +- ### 第二步:发送预测请求 -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 -```python -import requests -import json -import cv2 -import base64 + - ```python + import requests + import json + import cv2 + import base64 + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/res2net101_vd_26w_4s_imagenet" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + # 打印预测结果 + print(r.json()["results"]) + ``` -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/res2net101_vd_26w_4s_imagenet" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) -# 打印预测结果 -print(r.json()["results"]) -``` +## 五、更新历史 -### 查看代码 +* 1.0.0 -https://github.com/PaddlePaddle/PaddleClas + 初始发布 -### 依赖 - -paddlepaddle >= 1.6.2 - -paddlehub >= 1.6.0 + - ```shell + $ hub install res2net101_vd_26w_4s_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnet18_vd_imagenet/README.md b/modules/image/classification/resnet18_vd_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a84af151bb181b7056235dc5c9c1e45450423e47 --- /dev/null +++ b/modules/image/classification/resnet18_vd_imagenet/README.md @@ -0,0 +1,136 @@ +# resnet18_vd_imagenet + +|模型名称|resnet18_vd_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet_vd| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|46MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率,ResNet-vd 其实就是 ResNet-D,是ResNet 原始结构的变种。该PaddleHub Module结构为ResNet_vd,基于ImageNet-2012数据集训练得到,接受输入图片大小为224 x 224 x 3,支持finetune,也可以直接通过命令行或者Python接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnet18_vd_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run 
resnet18_vd_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnet18_vd_imagenet") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** + + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
+      - top\_k (int): 返回预测结果的前 k 个。

+
+    - **返回**
+
+      - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别出的类别,value为置信度。
+
+
+
+
+
+## 四、服务部署
+
+- PaddleHub Serving可以部署一个图像识别的在线服务。
+
+- ### 第一步:启动PaddleHub Serving
+
+  - 运行启动命令:
+  - ```shell
+    $ hub serving start -m resnet18_vd_imagenet
+    ```
+
+  - 这样就完成了一个图像识别的在线服务的部署,默认端口号为8866。
+
+  - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。
+
+- ### 第二步:发送预测请求
+
+  - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果。
+
+  - ```python
+    import requests
+    import json
+    import cv2
+    import base64
+
+    def cv2_to_base64(image):
+        data = cv2.imencode('.jpg', image)[1]
+        return base64.b64encode(data.tobytes()).decode('utf8')
+
+    # 发送HTTP请求
+    data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
+    headers = {"Content-type": "application/json"}
+    url = "http://127.0.0.1:8866/predict/resnet18_vd_imagenet"
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # 打印预测结果
+    print(r.json()["results"])
+    ```
+
+
+## 五、更新历史
+
+* 1.0.0
+
+  初始发布
+
+  - ```shell
+    $ hub install resnet18_vd_imagenet==1.0.0
+    ```
diff --git a/modules/image/classification/resnet50_vd_10w/README.md b/modules/image/classification/resnet50_vd_10w/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..35b736abf8bc9eabdc7cfa74e53c8dfab5a7aeb9
--- /dev/null
+++ b/modules/image/classification/resnet50_vd_10w/README.md
@@ -0,0 +1,95 @@
+# resnet50_vd_10w
+
+|模型名称|resnet50_vd_10w|
+| :--- | :---: |
+|类别|图像-图像分类|
+|网络|ResNet_vd|
+|数据集|百度自建数据集|
+|是否支持Fine-tuning|否|
+|模型大小|92MB|
+|最新更新日期|-|
+|数据指标|-|
+
+
+## 一、模型基本信息
+
+
+
+- ### 模型介绍
+
+  - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率,ResNet-vd 其实就是 ResNet-D,是ResNet 原始结构的变种。该PaddleHub Module结构为ResNet_vd,使用百度自研的基于10万种类别、4千多万的有标签数据进行训练,接受输入图片大小为224 x 224 x 3,支持finetune。
+
+
+## 二、安装
+
+- ### 1、环境依赖
+
+  - paddlepaddle >= 1.6.2
+
+  - paddlehub >= 1.6.0  | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst)
+
+
+- ### 2、安装
+
+  - ```shell
+    $ hub install resnet50_vd_10w
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+## 三、模型API预测
+
+- ### 1、预测代码示例
+
+  - ```python
+    import paddlehub as hub
+    import cv2
+
+    classifier = hub.Module(name="resnet50_vd_10w")
+    input_dict, output_dict, program = classifier.context(trainable=True)
+    ```
+
+- ### 2、API
+
+  - ```python
+    def context(trainable=True, pretrained=True)
+    ```
+    - **参数**
+      - trainable (bool): 计算图的参数是否为可训练的;<br/>
+ - pretrained (bool): 是否加载默认的预训练模型。 + + - **返回** + - inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量;
+      - outputs (dict): 计算图的输出,key 为 'classification' 和 'feature_map',其相应的值为:
+        - classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出;
+        - feature\_map (paddle.fluid.framework.Variable): 特征图,也就是全连接层前面的那个张量。
+      - context\_prog (fluid.Program): 计算图,用于迁移学习。
+
+
+
+  - ```python
+    def save_inference_model(dirname,
+                             model_filename=None,
+                             params_filename=None,
+                             combined=True)
+    ```
+    - **参数**
+      - dirname: 存在模型的目录名称;<br/>
+ - model_filename: 模型文件名称,默认为\_\_model\_\_;
+ - params_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效);
+ - combined: 是否将参数保存到统一的一个文件中。 + + + + + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnet50_vd_10w==1.0.0 + ``` diff --git a/modules/image/classification/resnet50_vd_dishes/README.md b/modules/image/classification/resnet50_vd_dishes/README.md index abd32a6ac2257b079a58abb9c02b44db1091211a..c5108d3d2ec69ba4af4c7c9085f948460417cfb7 100644 --- a/modules/image/classification/resnet50_vd_dishes/README.md +++ b/modules/image/classification/resnet50_vd_dishes/README.md @@ -1,159 +1,140 @@ -```shell -$ hub install resnet50_vd_dishes==1.0.0 -``` +# resnet50_vd_dishes -

-
ResNet 系列的网络结构 -

- -模型的详情可参考[论文](https://arxiv.org/pdf/1812.01187.pdf) - -## 命令行预测 +|模型名称|resnet50_vd_dishes| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet50_vd| +|数据集|百度自建菜品数据集| +|是否支持Fine-tuning|否| +|模型大小|158MB| +|最新更新日期|-| +|数据指标|-| -``` -hub run resnet50_vd_dishes --input_path "/PATH/TO/IMAGE" -``` -## API +## 一、模型基本信息 -```python -def get_expected_image_width() -``` -返回预处理的图片宽度,也就是224。 -```python -def get_expected_image_height() -``` +- ### 模型介绍 -返回预处理的图片高度,也就是224。 + - ResNet-vd是ResNet原始结构的变种,可用于图像分类和特征提取。该 PaddleHub Module 采用百度自建菜品数据集训练得到,支持8416种菜品的分类识别。 -```python -def get_pretrained_images_mean() -``` - -返回预处理的图片均值,也就是 \[0.485, 0.456, 0.406\]。 +

+
+

-```python -def get_pretrained_images_std() -``` + - 更多详情参考:[Bag of Tricks for Image Classification with Convolutional Neural Networks](https://arxiv.org/pdf/1812.01187.pdf) -返回预处理的图片标准差,也就是 \[0.229, 0.224, 0.225\]。 +## 二、安装 +- ### 1、环境依赖 -```python -def context(trainable=True, pretrained=True) -``` + - paddlepaddle >= 1.6.2 -**参数** + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -* trainable (bool): 计算图的参数是否为可训练的; -* pretrained (bool): 是否加载默认的预训练模型。 -**返回** +- ### 2、安装 -* inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量; -* outputs (dict): 计算图的输出,key 为 'classification' 和 'feature_map',其相应的值为: - * classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出; - * feature\_map (paddle.fluid.framework.Variable): 特征匹配,全连接层前面的那个张量。 -* context\_prog(fluid.Program): 计算图,用于迁移学习。 + - ```shell + $ hub install resnet50_vd_dishes + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -```python -def classification(images=None, - paths=None, - batch_size=1, - use_gpu=False, - top_k=1): -``` +## 三、模型API预测 -**参数** +- ### 1、命令行预测 -* images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR; -* paths (list\[str\]): 图片的路径; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU 来预测; -* top\_k (int): 返回预测结果的前 k 个。 + - ```shell + $ hub run resnet50_vd_dishes --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现菜品分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -**返回** +- ### 2、预测代码示例 -res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 + - ```python + import paddlehub as hub + import cv2 -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` + classifier = hub.Module(name="resnet50_vd_dishes") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` -将模型保存到指定路径。 +- ### 3、API -**参数** -* dirname: 存在模型的目录名称 -* model_filename: 模型文件名称,默认为\_\_model\_\_ -* params_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** -## 代码示例 + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
+ - top\_k (int): 返回预测结果的前 k 个。 -```python -import paddlehub as hub -import cv2 + - **返回** -classifier = hub.Module(name="resnet50_vd_dishes") + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 -result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) -# or -# result = classifier.classification(paths=['/PATH/TO/IMAGE']) -``` -## 服务部署 -PaddleHub Serving可以部署一个菜品分类的在线服务。 -## 第一步:启动PaddleHub Serving +## 四、服务部署 -运行启动命令: -```shell -$ hub serving start -m resnet50_vd_dishes -``` +- PaddleHub Serving可以部署一个菜品分类的在线服务。 -这样就完成了一个菜品分类的在线服务的部署,默认端口号为8866。 +- ### 第一步:启动PaddleHub Serving -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + - 运行启动命令: + - ```shell + $ hub serving start -m resnet50_vd_dishes + ``` -## 第二步:发送预测请求 + - 这样就完成了一个菜品分类的在线服务的部署,默认端口号为8866。 -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果。 + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 -```python -import requests -import json -import cv2 -import base64 +- ### 第二步:发送预测请求 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') + - ```python + import requests + import json + import cv2 + import base64 + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/resnet50_vd_dishes" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/resnet50_vd_dishes" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) -# 打印预测结果 -print(r.json()["results"]) -``` + # 打印预测结果 + print(r.json()["results"]) + ``` -### 查看代码 -[PaddlePaddle/models 图像分类](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification) +## 五、更新历史 -### 依赖 +* 1.0.0 -paddlepaddle >= 1.6.2 + 初始发布 -paddlehub >= 1.6.0 + - ```shell + $ hub install resnet50_vd_dishes==1.0.0 + ``` diff --git a/modules/image/classification/resnet50_vd_wildanimals/README.md b/modules/image/classification/resnet50_vd_wildanimals/README.md index f4415cccc069b6686be3d3a978b7db8de3ec72b4..d857c89b70156dda3891da0994336ca7d5f801fc 100644 --- a/modules/image/classification/resnet50_vd_wildanimals/README.md +++ b/modules/image/classification/resnet50_vd_wildanimals/README.md @@ -1,159 +1,134 @@ -```shell -$ hub install resnet50_vd_wildanimals==1.0.0 -``` +# resnet50_vd_wildanimals -

-
ResNet 系列的网络结构 -

+|模型名称|resnet50_vd_wildanimals| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet_vd| +|数据集|IFAW 自建野生动物数据集| +|是否支持Fine-tuning|否| +|模型大小|92MB| +|最新更新日期|-| +|数据指标|-| -模型的详情可参考[论文](https://arxiv.org/pdf/1812.01187.pdf) -## 命令行预测 +## 一、模型基本信息 -``` -hub run resnet50_vd_wildanimals --input_path "/PATH/TO/IMAGE" -``` -## API -```python -def get_expected_image_width() -``` +- ### 模型介绍 -返回预处理的图片宽度,也就是224。 + - ResNet-vd 其实就是 ResNet-D,是ResNet 原始结构的变种,可用于图像分类和特征提取。该 PaddleHub Module 采用百度自建野生动物数据集训练得到,支持'象牙制品','象牙', '大象', '虎皮', '老虎', '虎牙/虎爪/虎骨', '穿山甲甲片', '穿山甲', '穿山甲爪子', '其他' 这十个标签的识别。模型的详情可参考[论文](https://arxiv.org/pdf/1812.01187.pdf)。 -```python -def get_expected_image_height() -``` -返回预处理的图片高度,也就是224。 -```python -def get_pretrained_images_mean() -``` +## 二、安装 -返回预处理的图片均值,也就是 \[0.485, 0.456, 0.406\]。 +- ### 1、环境依赖 -```python -def get_pretrained_images_std() -``` + - paddlepaddle >= 1.6.2 -返回预处理的图片标准差,也就是 \[0.229, 0.224, 0.225\]。 + - paddlehub >= 1.6.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) -```python -def context(trainable=True, pretrained=True) -``` +- ### 2、安装 -**参数** + - ```shell + $ hub install resnet50_vd_wildanimals + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -* trainable (bool): 计算图的参数是否为可训练的; -* pretrained (bool): 是否加载默认的预训练模型。 +## 三、模型API预测 -**返回** +- ### 1、命令行预测 -* inputs (dict): 计算图的输入,key 为 'image', value 为图片的张量; -* outputs (dict): 计算图的输出,key 为 'classification' 和 'feature\_map',其相应的值为: - * classification (paddle.fluid.framework.Variable): 分类结果,也就是全连接层的输出; - * feature\_map (paddle.fluid.framework.Variable): 特征匹配,全连接层前面的那个张量。 -* context\_prog(fluid.Program): 计算图,用于迁移学习。 + - ```shell + $ hub run resnet50_vd_wildanimals --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -```python -def classification(images=None, - paths=None, - batch_size=1, - use_gpu=False, - top_k=1): -``` +- ### 2、预测代码示例 -**参数** + - ```python + import paddlehub as hub + import cv2 -* images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR; -* paths (list\[str\]): 图片的路径; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU 来预测; -* top\_k (int): 返回预测结果的前 k 个。 + classifier = hub.Module(name="resnet50_vd_wildanimals") + result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) + # or + # result = classifier.classification(paths=['/PATH/TO/IMAGE']) + ``` -**返回** +- ### 3、API -res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别动物的类别,value为置信度。 -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` + - ```python + def classification(images=None, + paths=None, + batch_size=1, + use_gpu=False, + top_k=1): + ``` + - 分类接口API。 + - **参数** -将模型保存到指定路径。 + - images (list\[numpy.ndarray\]): 图片数据,每一个图片数据的shape 均为 \[H, W, C\],颜色空间为 BGR;
+ - paths (list\[str\]): 图片的路径;
+ - batch\_size (int): batch 的大小;
+ - use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA_VISIBLE_DEVICES环境变量**
+ - top\_k (int): 返回预测结果的前 k 个。 -**参数** + - **返回** -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 + - res (list\[dict\]): 分类结果,列表的每一个元素均为字典,其中 key 为识别的菜品类别,value为置信度。 -## 代码示例 -```python -import paddlehub as hub -import cv2 -classifier = hub.Module(name="resnet50_vd_wildanimals") +## 四、服务部署 -result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')]) -# or -# result = classifier.classification(paths=['/PATH/TO/IMAGE']) -``` +- PaddleHub Serving可以部署一个野生动物及其制品识别的在线服务。 -## 服务部署 +- ### 第一步:启动PaddleHub Serving -PaddleHub Serving可以部署一个野生动物及其制品的在线识别服务。 + - 运行启动命令: + - ```shell + $ hub serving start -m resnet50_vd_wildanimals + ``` -## 第一步:启动PaddleHub Serving + - 这样就完成了一个野生动物及其制品识别的在线服务的部署,默认端口号为8866。 -运行启动命令: -```shell -$ hub serving start -m resnet50_vd_wildanimals -``` + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 -这样就完成了一个野生动物及其制品的在线服务的部署,默认端口号为8866。 +- ### 第二步:发送预测请求 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 -## 第二步:发送预测请求 + - ```python + import requests + import json + import cv2 + import base64 -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') -```python -import requests -import json -import cv2 -import base64 + # 发送HTTP请求 + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/resnet50_vd_wildanimals" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + # 打印预测结果 + print(r.json()["results"]) + ``` -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') +## 五、更新历史 -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/resnet50_vd_wildanimals" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) +* 1.0.0 -# 打印预测结果 -print(r.json()["results"]) -``` - -### 查看代码 - -[PaddlePaddle/models 图像分类](https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/image_classification) - -### 依赖 - -paddlepaddle >= 1.6.2 - -paddlehub >= 1.6.0 + 初始发布 + - ```shell + $ hub install resnet50_vd_wildanimals==1.0.0 + ``` diff --git a/modules/image/classification/resnet_v2_101_imagenet/README.md b/modules/image/classification/resnet_v2_101_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..8533fb4b21496e17eef80ccd2b486f5ff2076a99 --- /dev/null +++ b/modules/image/classification/resnet_v2_101_imagenet/README.md @@ -0,0 +1,86 @@ +# resnet_v2_101_imagenet + +|模型名称|resnet_v2_101_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet V2 101| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|173MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率。该PaddleHub Module结构为ResNet101,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnet_v2_101_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | 
[零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnet_v2_101_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnet_v2_101_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + 修复python2中编码问题 + - ```shell + $ hub install resnet_v2_101_imagenet==1.0.1 + ``` diff --git a/modules/image/classification/resnet_v2_152_imagenet/README.md b/modules/image/classification/resnet_v2_152_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..f849e95ed6e3cc5910f70321ceaba467702e3447 --- /dev/null +++ b/modules/image/classification/resnet_v2_152_imagenet/README.md @@ -0,0 +1,86 @@ +# resnet_v2_152_imagenet + +|模型名称|resnet_v2_152_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet V2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|234MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率。该PaddleHub Module结构为ResNet152,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnet_v2_152_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnet_v2_152_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnet_v2_152_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + 修复python2中编码问题 + - ```shell + $ hub install resnet_v2_152_imagenet==1.0.1 + ``` diff --git a/modules/image/classification/resnet_v2_18_imagenet/README.md b/modules/image/classification/resnet_v2_18_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..23a83f47686c1b76a5b49fd487b954d5102b8b44 --- /dev/null +++ b/modules/image/classification/resnet_v2_18_imagenet/README.md @@ -0,0 +1,84 @@ +# resnet_v2_18_imagenet + +|模型名称|resnet_v2_18_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet V2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| 
+|模型大小|46MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率。该PaddleHub Module结构为ResNet18,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnet_v2_18_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnet_v2_18_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnet_v2_18_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnet_v2_18_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnet_v2_34_imagenet/README.md b/modules/image/classification/resnet_v2_34_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..d8752ec5d46ad8e0f1931072ebde00beea8b6843 --- /dev/null +++ b/modules/image/classification/resnet_v2_34_imagenet/README.md @@ -0,0 +1,84 @@ +# resnet_v2_34_imagenet + +|模型名称|resnet_v2_34_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet V2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|85MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率。该PaddleHub Module结构为ResNet34,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnet_v2_34_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnet_v2_34_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnet_v2_34_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnet_v2_34_imagenet==1.0.0 + ``` 
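+
+- 附:以下是一个解析 classification 预测结果的简要示例(仅为示意:按照上文 API 说明中"返回 list,每个元素为 dict,key 为类别 label、value 为概率"的格式编写,"/PATH/TO/IMAGE" 为占位路径,实际输出请以模块返回为准):
+
+  - ```python
+    import paddlehub as hub
+
+    classifier = hub.Module(name="resnet_v2_34_imagenet")
+    # classification 返回 list,每个元素为对应输入图片的预测结果(dict:label -> 概率)
+    # "/PATH/TO/IMAGE" 为占位路径,请替换为本地图片路径
+    results = classifier.classification(data={"image": ["/PATH/TO/IMAGE"]})
+    for res in results:
+        # 取概率最高的类别作为 top-1 预测
+        label, prob = max(res.items(), key=lambda kv: kv[1])
+        print(label, prob)
+    ```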
diff --git a/modules/image/classification/resnet_v2_50_imagenet/README.md b/modules/image/classification/resnet_v2_50_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3963bd759ccfe6d651582401276ab580d552cbfc --- /dev/null +++ b/modules/image/classification/resnet_v2_50_imagenet/README.md @@ -0,0 +1,86 @@ +# resnet_v2_50_imagenet + +|模型名称|resnet_v2_50_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNet V2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|99MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNet系列模型是图像分类领域的重要模型之一,模型中提出的残差单元有效地解决了深度网络训练困难的问题,通过增加模型的深度提升了模型的准确率。该PaddleHub Module结构为ResNet50,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnet_v2_50_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnet_v2_50_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnet_v2_50_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + 修复python2中编码问题 + - ```shell + $ hub install resnet_v2_50_imagenet==1.0.1 + ``` diff --git a/modules/image/classification/resnext101_32x16d_wsl/README.md b/modules/image/classification/resnext101_32x16d_wsl/README.md new file mode 100644 index 0000000000000000000000000000000000000000..7b6501be1e191f25ca7c85dc418ab61b05e70095 --- /dev/null +++ b/modules/image/classification/resnext101_32x16d_wsl/README.md @@ -0,0 +1,84 @@ +# resnext101_32x16d_wsl + +|模型名称|resnext101_32x16d_wsl| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_wsl| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|744MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 由于人工标注的数据集在规模上已经接近其函数极限,Facebook 的研发人员采用了一种独特的迁移学习研究,通过使用 hashtag 作为标注,在包含数十亿张社交媒体图片的数据集上进行训练,这为大规模训练转向弱监督学习(Weakly Supervised Learning) 取得了重大突破。在 ImageNet 图像识别基准上,ResNeXt101_32x16d_wsl 的 Top-1 达到了 84.24% 的准确率。该 PaddleHub Module结构为 ResNeXt101_32x16d_wsl,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_32x16d_wsl + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_32x16d_wsl --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 
[PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_32x16d_wsl") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext101_32x16d_wsl==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_32x32d_wsl/README.md b/modules/image/classification/resnext101_32x32d_wsl/README.md new file mode 100644 index 0000000000000000000000000000000000000000..f3f37f3d4be0260831cfc96c7052a00199b698af --- /dev/null +++ b/modules/image/classification/resnext101_32x32d_wsl/README.md @@ -0,0 +1,84 @@ +# resnext101_32x32d_wsl + +|模型名称|resnext101_32x32d_wsl| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_wsl| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|1.8GB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 由于人工标注的数据集在规模上已经接近其函数极限,Facebook 的研发人员采用了一种独特的迁移学习研究,通过使用 hashtag 作为标注,在包含数十亿张社交媒体图片的数据集上进行训练,这为大规模训练转向弱监督学习(Weakly Supervised Learning) 取得了重大突破。在 ImageNet 图像识别基准上,ResNeXt101_32x32d_wsl 的 Top-1 达到了 84.97% 的准确率。该 PaddleHub Module结构为 ResNeXt101_32x32d_wsl,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_32x32d_wsl + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_32x32d_wsl --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_32x32d_wsl") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext101_32x32d_wsl==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_32x48d_wsl/README.md b/modules/image/classification/resnext101_32x48d_wsl/README.md new file mode 100644 index 0000000000000000000000000000000000000000..24603e39ac1ac93c065ea35a25a6ad6c69959750 --- /dev/null +++ b/modules/image/classification/resnext101_32x48d_wsl/README.md @@ -0,0 +1,84 @@ +# resnext101_32x48d_wsl + +|模型名称|resnext101_32x48d_wsl| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_wsl| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|342MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 由于人工标注的数据集在规模上已经接近其函数极限,Facebook 的研发人员采用了一种独特的迁移学习研究,通过使用 hashtag 作为标注,在包含数十亿张社交媒体图片的数据集上进行训练,这为大规模训练转向弱监督学习(Weakly Supervised Learning) 
取得了重大突破。在 ImageNet 图像识别基准上,ResNeXt101_32x48d_wsl 的 Top-1 达到了 85.4% 的准确率。该 PaddleHub Module结构为 ResNeXt101_32x48d_wsl,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_32x48d_wsl + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_32x48d_wsl --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_32x48d_wsl") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext101_32x48d_wsl==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_32x4d_imagenet/README.md b/modules/image/classification/resnext101_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..60a0e27f7431ed79619a65f7fb91075b54eee2b3 --- /dev/null +++ b/modules/image/classification/resnext101_32x4d_imagenet/README.md @@ -0,0 +1,85 @@ +# resnext101_32x4d_imagenet + +|模型名称|resnext101_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|172MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext101_32x4d,表示 layers 为 101, 分支数为 32,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_32x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 
四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext101_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_32x8d_wsl/README.md b/modules/image/classification/resnext101_32x8d_wsl/README.md new file mode 100644 index 0000000000000000000000000000000000000000..94f8491dc53bd9905bcb163b84839ba3fc309527 --- /dev/null +++ b/modules/image/classification/resnext101_32x8d_wsl/README.md @@ -0,0 +1,84 @@ +# resnext101_32x8d_wsl + +|模型名称|resnext101_32x8d_wsl| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_wsl| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|317MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 由于人工标注的数据集在规模上已经接近其函数极限,Facebook 的研发人员采用了一种独特的迁移学习研究,通过使用 hashtag 作为标注,在包含数十亿张社交媒体图片的数据集上进行训练,这为大规模训练转向弱监督学习(Weakly Supervised Learning) 取得了重大突破。在 ImageNet 图像识别基准上,ResNeXt101_32x8d_wsl 的 Top-1 达到了 82.55% 的准确率。该 PaddleHub Module结构为 ResNeXt101_32x8d_wsl,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.6.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_32x8d_wsl + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_32x8d_wsl --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_32x8d_wsl") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext101_32x8d_wsl==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_64x4d_imagenet/README.md b/modules/image/classification/resnext101_64x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..588f2dbabc6bba44e290ec1fc75ed1a75dcf22ec --- /dev/null +++ b/modules/image/classification/resnext101_64x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# resnext101_64x4d_imagenet + +|模型名称|resnext101_64x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|322MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext101_64x4d,表示 layers 为 101, 分支数为 64,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_64x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | 
[零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_64x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_64x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext101_64x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_vd_32x4d_imagenet/README.md b/modules/image/classification/resnext101_vd_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..7c21889b7429f2b1cbd8dd42d0abea640a6519bd --- /dev/null +++ b/modules/image/classification/resnext101_vd_32x4d_imagenet/README.md @@ -0,0 +1,83 @@ +# resnext101_vd_32x4d_imagenet + +|模型名称|resnext101_vd_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|172MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext101_vd_32x4d,表示 layers 为 101, 分支数为 32,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_vd_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_vd_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_vd_32x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + - ```shell + $ hub install resnext101_vd_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext101_vd_64x4d_imagenet/README.md b/modules/image/classification/resnext101_vd_64x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..b6d6c5c025cb98164f3797b9c4e4e7fa4e192b2c --- /dev/null +++ b/modules/image/classification/resnext101_vd_64x4d_imagenet/README.md @@ -0,0 +1,83 @@ +# resnext101_vd_64x4d_imagenet + 
+|模型名称|resnext101_vd_64x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_vd| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|172MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext101_vd_64x4d,表示 layers 为 101, 分支数为 64,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext101_vd_64x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext101_vd_64x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext101_vd_64x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + - ```shell + $ hub install resnext101_vd_64x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext152_32x4d_imagenet/README.md b/modules/image/classification/resnext152_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..d748c6d52cba08259034b6cb68125d4574d8179f --- /dev/null +++ b/modules/image/classification/resnext152_32x4d_imagenet/README.md @@ -0,0 +1,85 @@ +# resnext152_32x4d_imagenet + +|模型名称|resnext152_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|233MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext152_32x4d,表示 layers 为 152, 分支数为32,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext152_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext152_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext152_32x4d_imagenet") + test_img_path = 
"/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext152_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext152_64x4d_imagenet/README.md b/modules/image/classification/resnext152_64x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..43508a2fedf83dc465fa1e30f526a9274237c2a1 --- /dev/null +++ b/modules/image/classification/resnext152_64x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# resnext152_64x4d_imagenet + +|模型名称|resnext152_64x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|444MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext152_64x4d,表示 layers 为 152, 分支数为64,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext152_64x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext152_64x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext152_64x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext152_64x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext152_vd_64x4d_imagenet/README.md b/modules/image/classification/resnext152_vd_64x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..537dae7832d4786b2139b686e8cfc243eb9b5be0 --- /dev/null +++ b/modules/image/classification/resnext152_vd_64x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# resnext152_vd_64x4d_imagenet + +|模型名称|resnext152_vd_64x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_vd| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|444MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext152_vd_64x4d,表示 layers 为 152, 分支数为64,每个分支的输入输出 channels, 并采用了 3 个 3*3 的卷积核替代 ResNeXt152_64x4d 中第一个 7*7 的卷积核。该 PaddleHub Module 
在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext152_vd_64x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext152_vd_64x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext152_vd_64x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext152_vd_64x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext50_32x4d_imagenet/README.md b/modules/image/classification/resnext50_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..474bd4cac3c96f9d13e9eacb9872c8b72e4eaf90 --- /dev/null +++ b/modules/image/classification/resnext50_32x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# resnext50_32x4d_imagenet + +|模型名称|resnext50_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|97MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext50_32x4d,表示 layers 为 50, 分支数为 32,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext50_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext50_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext50_32x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub 
install resnext50_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext50_64x4d_imagenet/README.md b/modules/image/classification/resnext50_64x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..740c56812597a911dfb4b21b9bacf1c6852e16ba --- /dev/null +++ b/modules/image/classification/resnext50_64x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# resnext50_64x4d_imagenet + +|模型名称|resnext50_64x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|174MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext50_64x4d,表示 layers 为 50, 分支数为 64,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext50_64x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext50_64x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext50_64x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext50_64x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext50_vd_32x4d_imagenet/README.md b/modules/image/classification/resnext50_vd_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..02e3585777c743ebaeebf90228455a32ae20c827 --- /dev/null +++ b/modules/image/classification/resnext50_vd_32x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# resnext50_vd_32x4d_imagenet + +|模型名称|resnext50_vd_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_vd| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|98MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext50_vd_32x4d,表示 layers 为 50, 分支数为 32,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext50_vd_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | 
[零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext50_vd_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext50_vd_32x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install resnext50_vd_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/resnext50_vd_64x4d_imagenet/README.md b/modules/image/classification/resnext50_vd_64x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..5b473b5c869ada3001d01b3cb8a8cf0a3e3adc11 --- /dev/null +++ b/modules/image/classification/resnext50_vd_64x4d_imagenet/README.md @@ -0,0 +1,83 @@ +# resnext50_vd_64x4d_imagenet + +|模型名称|resnext50_vd_64x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ResNeXt_vd| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|175MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ResNeXt 是由 UC San Diego 和 Facebook AI 研究所于2017年提出的图像分类模型,模型沿袭了 VGG/ResNets 的堆叠思想,并采用 split-transform-merge 策略来增加网络的分支数。resnext50_vd_64x4d,表示 layers 为 50, 分支数为 64,每个分支的输入输出 channels 为4。该 PaddleHub Module 在包含数十亿张社交媒体图片的数据集上进行弱监督训练,并使用ImageNet-2012数据集finetune,接受输入图片大小为 224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install resnext50_vd_64x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run resnext50_vd_64x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="resnext50_vd_64x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + - ```shell + $ hub install resnext50_vd_64x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/se_resnet18_vd_imagenet/README.md b/modules/image/classification/se_resnet18_vd_imagenet/README.md index 2b1b1c23b4042e09792f91f72718c1abe19746ed..b1c11fed0f84cd73fcda65dc928e2405676b725b 100644 --- a/modules/image/classification/se_resnet18_vd_imagenet/README.md +++ b/modules/image/classification/se_resnet18_vd_imagenet/README.md @@ -84,7 +84,7 @@ def save_inference_model(dirname, * 
params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) * combined: 是否将参数保存到统一的一个文件中 -## 代码示例 +## 预测代码示例 ```python import paddlehub as hub diff --git a/modules/image/classification/se_resnext101_32x4d_imagenet/README.md b/modules/image/classification/se_resnext101_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..1f2ac07a5a923e6634f70446816a6aa22516dab0 --- /dev/null +++ b/modules/image/classification/se_resnext101_32x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# se_resnext101_32x4d_imagenet + +|模型名称|se_resnext101_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|SE_ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|191MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - Squeeze-and-Excitation Networks是由Momenta在2017年提出的一种图像分类结构。该结构通过对特征通道间的相关性进行建模,把重要的特征进行强化来提升准确率。SE_ResNeXt基于ResNeXt模型添加了SE Block,并获得了2017 ILSVRC竞赛的冠军。该PaddleHub Module结构为SE_ResNeXt101_32x4d,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install se_resnext101_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run se_resnext101_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="se_resnext101_32x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率
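+ + - **结果解析示意**(一个最小的 Top-1 解析示意,假设返回结构与上文描述一致,即 result 的每个元素为 {label: 概率} 形式的 dict): + - ```python + import paddlehub as hub + + classifier = hub.Module(name="se_resnext101_32x4d_imagenet") + result = classifier.classification(data={"image": ["/PATH/TO/IMAGE"]}) + # 取概率最高的类别(假设 result[0] 形如 {label: 概率}) + label, prob = max(result[0].items(), key=lambda kv: kv[1]) + print(label, prob) + ``` + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install se_resnext101_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/se_resnext50_32x4d_imagenet/README.md b/modules/image/classification/se_resnext50_32x4d_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..05c4020f3ac5257c4324da1b4c6eeb167c522f2a --- /dev/null +++ b/modules/image/classification/se_resnext50_32x4d_imagenet/README.md @@ -0,0 +1,84 @@ +# se_resnext50_32x4d_imagenet + +|模型名称|se_resnext50_32x4d_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|SE_ResNeXt| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|107MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - Squeeze-and-Excitation Networks是由Momenta在2017年提出的一种图像分类结构。该结构通过对特征通道间的相关性进行建模,把重要的特征进行强化来提升准确率。SE_ResNeXt基于ResNeXt模型添加了SE Block,并获得了2017 ILSVRC竞赛的冠军。该PaddleHub Module结构为SE_ResNeXt50_32x4d,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install se_resnext50_32x4d_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | 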
[零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run se_resnext50_32x4d_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="se_resnext50_32x4d_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install se_resnext50_32x4d_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/shufflenet_v2_imagenet/README.md b/modules/image/classification/shufflenet_v2_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3e372c9600c8c2ebf017a5cfb7d8e6c5baf55df6 --- /dev/null +++ b/modules/image/classification/shufflenet_v2_imagenet/README.md @@ -0,0 +1,84 @@ +# shufflenet_v2_imagenet + +|模型名称|shufflenet_v2_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|ShuffleNet V2| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|11MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - ShuffleNet V2是由旷视科技在2018年提出的轻量级图像分类模型,该模型通过pointwise group convolution和channel shuffle两种方式,在保持精度的同时大大降低了模型的计算量。该PaddleHub Module结构为ShuffleNet V2,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install shufflenet_v2_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run shufflenet_v2_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="shufflenet_v2_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install shufflenet_v2_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/spinalnet_res101_gemstone/README.md b/modules/image/classification/spinalnet_res101_gemstone/README.md new file mode 100644 index 0000000000000000000000000000000000000000..bd785bda5c4deffdecdf0292d3ff7e6965b15ff5 --- /dev/null +++ b/modules/image/classification/spinalnet_res101_gemstone/README.md @@ -0,0 +1,81 @@ +# spinalnet_res101_gemstone + +|模型名称|spinalnet_res101_gemstone| +| :--- | :---: | +|类别|图像-图像分类| +|网络|resnet101| +|数据集|gemstone| 
+|是否支持Fine-tuning|否| +|模型大小|246MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 可使用PaddleHub提供的SpinalNet预训练模型直接进行宝石识别,也可在其基础上finetune后完成宝石预测任务。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install spinalnet_res101_gemstone + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run spinalnet_res101_gemstone --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="spinalnet_res101_gemstone") + result = classifier.predict(['/PATH/TO/IMAGE']) + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images: list类型,待预测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install spinalnet_res101_gemstone==1.0.0 + ``` diff --git a/modules/image/classification/spinalnet_res50_gemstone/README.md b/modules/image/classification/spinalnet_res50_gemstone/README.md new file mode 100644 index 0000000000000000000000000000000000000000..ed97788f71ae95deebe9fd3ec83d2f08bb6bd56f --- /dev/null +++ b/modules/image/classification/spinalnet_res50_gemstone/README.md @@ -0,0 +1,81 @@ +# spinalnet_res50_gemstone + +|模型名称|spinalnet_res50_gemstone| +| :--- | :---: | +|类别|图像-图像分类| +|网络|resnet50| +|数据集|gemstone| +|是否支持Fine-tuning|否| +|模型大小|137MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 可使用PaddleHub提供的SpinalNet预训练模型直接进行宝石识别,也可在其基础上finetune后完成宝石预测任务。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install spinalnet_res50_gemstone + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run spinalnet_res50_gemstone --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="spinalnet_res50_gemstone") + result = classifier.predict(['/PATH/TO/IMAGE']) + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images: list类型,待预测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率
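+ + - **结果解析示意**(一个最小的 Top-3 解析示意,假设返回结构与上文描述一致,即 result 的每个元素为 {label: 概率} 形式的 dict): + - ```python + import paddlehub as hub + + classifier = hub.Module(name="spinalnet_res50_gemstone") + result = classifier.predict(['/PATH/TO/IMAGE']) + # 按概率从高到低取前 3 个类别(假设 result[0] 形如 {label: 概率}) + top3 = sorted(result[0].items(), key=lambda kv: kv[1], reverse=True)[:3] + print(top3) + ``` + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install spinalnet_res50_gemstone==1.0.0 + ``` diff --git a/modules/image/classification/spinalnet_vgg16_gemstone/README.md b/modules/image/classification/spinalnet_vgg16_gemstone/README.md new file mode 100644 index 0000000000000000000000000000000000000000..5ca6eacd550179c5cb0c838d0c2451eb3d61f02f --- /dev/null +++ b/modules/image/classification/spinalnet_vgg16_gemstone/README.md @@ -0,0 +1,81 @@ +# 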
spinalnet_vgg16_gemstone + +|模型名称|spinalnet_vgg16_gemstone| +| :--- | :---: | +|类别|图像-图像分类| +|网络|vgg16| +|数据集|gemstone| +|是否支持Fine-tuning|否| +|模型大小|1.5GB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - 可使用PaddleHub提供的SpinalNet预训练模型直接进行宝石识别,也可在其基础上finetune后完成宝石预测任务。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install spinalnet_vgg16_gemstone + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run spinalnet_vgg16_gemstone --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="spinalnet_vgg16_gemstone") + result = classifier.predict(['/PATH/TO/IMAGE']) + print(result) + ``` + +- ### 3、API + + - ```python + def predict(images) + ``` + - 分类接口API。 + - **参数** + - images: list类型,待预测的图像。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install spinalnet_vgg16_gemstone==1.0.0 + ``` diff --git a/modules/image/classification/vgg11_imagenet/README.md b/modules/image/classification/vgg11_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..2905883511483c62b3a94907c21aa87994500a19 --- /dev/null +++ b/modules/image/classification/vgg11_imagenet/README.md @@ -0,0 +1,84 @@ +# vgg11_imagenet + +|模型名称|vgg11_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|VGG| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|507MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - VGG是牛津大学计算机视觉组和DeepMind在2014年提出的一种图像分类模型。该系列模型探索了卷积神经网络的深度与其性能之间的关系,通过实验证明了增加网络的深度能够在一定程度上影响网络最终的性能,到目前为止,VGG仍然被许多其他图像任务用作特征提取的BackBone网络。该PaddleHub Module结构为VGG11,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install vgg11_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run vgg11_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="vgg11_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率
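+ + - **批量预测示意**(一个最小的批量输入示意,假设返回结果与输入路径顺序一一对应,且每个元素为 {label: 概率} 形式的 dict): + - ```python + import paddlehub as hub + + classifier = hub.Module(name="vgg11_imagenet") + img_paths = ["/PATH/TO/IMAGE_1", "/PATH/TO/IMAGE_2"] + result = classifier.classification(data={"image": img_paths}) + # 逐张打印 Top-1 结果(假设 result 与 img_paths 一一对应) + for path, res in zip(img_paths, result): print(path, max(res.items(), key=lambda kv: kv[1])) + ``` + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install vgg11_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/vgg13_imagenet/README.md 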
b/modules/image/classification/vgg13_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..2f967b7f19bec47d6b5695fbaccae6086a98cee7 --- /dev/null +++ b/modules/image/classification/vgg13_imagenet/README.md @@ -0,0 +1,84 @@ +# vgg13_imagenet + +|模型名称|vgg13_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|VGG| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|508MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - VGG是牛津大学计算机视觉组和DeepMind在2014年提出的一种图像分类模型。该系列模型探索了卷积神经网络的深度与其性能之间的关系,通过实验证明了增加网络的深度能够在一定程度上影响网络最终的性能,到目前为止,VGG仍然被许多其他图像任务用作特征提取的BackBone网络。该PaddleHub Module结构为VGG13,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install vgg13_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run vgg13_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="vgg13_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install vgg13_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/vgg16_imagenet/README.md b/modules/image/classification/vgg16_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..14186cec20232f3d9562620bd1b062082f004b78 --- /dev/null +++ b/modules/image/classification/vgg16_imagenet/README.md @@ -0,0 +1,84 @@ +# vgg16_imagenet + +|模型名称|vgg16_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|VGG| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|528MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - VGG是牛津大学计算机视觉组和DeepMind在2014年提出的一种图像分类模型。该系列模型探索了卷积神经网络的深度与其性能之间的关系,通过实验证明了增加网络的深度能够在一定程度上影响网络最终的性能,到目前为止,VGG仍然被许多其他图像任务用作特征提取的BackBone网络。该PaddleHub Module结构为VGG16,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install vgg16_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run vgg16_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="vgg16_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": 
[test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install vgg16_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/vgg19_imagenet/README.md b/modules/image/classification/vgg19_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..3ecf4e2bff64c4558c68dc89688cab18748313bb --- /dev/null +++ b/modules/image/classification/vgg19_imagenet/README.md @@ -0,0 +1,84 @@ +# vgg19_imagenet + +|模型名称|vgg19_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|vgg19_imagenet| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|549MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - VGG是牛津大学计算机视觉组和DeepMind在2014年提出的一种图像分类模型。该系列模型探索了卷积神经网络的深度与其性能之间的关系,通过实验证明了增加网络的深度能够在一定程度上影响网络最终的性能,到目前为止,VGG仍然被许多其他图像任务用作特征提取的BackBone网络。该PaddleHub Module结构为VGG19,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者Python接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install vgg19_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run vgg19_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="vgg19_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install vgg19_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/xception41_imagenet/README.md b/modules/image/classification/xception41_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a5ad52074a0527fe580f2c4f6870fd49af530cb6 --- /dev/null +++ b/modules/image/classification/xception41_imagenet/README.md @@ -0,0 +1,84 @@ +# xception41_imagenet + +|模型名称|xception41_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Xception| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - Xception 全称为 Extreme Inception,是 Google 于 2016年提出的 Inception V3 的改进模型。Xception 采用了深度可分离卷积(depthwise separable convolution) 来替换原来 Inception V3 中的卷积操作,整体的网络结构是带有残差连接的深度可分离卷积层的线性堆叠。该PaddleHub Module结构为Xception41,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install xception41_imagenet + ``` + - 
如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run xception41_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="xception41_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install xception41_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/xception65_imagenet/README.md b/modules/image/classification/xception65_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..1be8b866e1857f1d3f483baa0d7e54a964452e59 --- /dev/null +++ b/modules/image/classification/xception65_imagenet/README.md @@ -0,0 +1,84 @@ +# xception65_imagenet + +|模型名称|xception65_imagenet| +| :--- | :---: | +|类别|图像-图像分类| +|网络|Xception| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|140MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - Xception 全称为 Extreme Inception,是 Google 于 2016年提出的 Inception V3 的改进模型。Xception 采用了深度可分离卷积(depthwise separable convolution) 来替换原来 Inception V3 中的卷积操作,整体的网络结构是带有残差连接的深度可分离卷积层的线性堆叠。该PaddleHub Module结构为Xception65,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install xception65_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run xception65_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="xception65_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install xception65_imagenet==1.0.0 + ``` diff --git a/modules/image/classification/xception71_imagenet/README.md b/modules/image/classification/xception71_imagenet/README.md new file mode 100644 index 0000000000000000000000000000000000000000..28f44f91615a709c2dea2d58485a3760c0ba1edd --- /dev/null +++ b/modules/image/classification/xception71_imagenet/README.md @@ -0,0 +1,84 @@ +# xception71_imagenet + +|模型名称|xception71_imagenet| 
+| :--- | :---: | +|类别|图像-图像分类| +|网络|Xception| +|数据集|ImageNet-2012| +|是否支持Fine-tuning|否| +|模型大小|147MB| +|最新更新日期|-| +|数据指标|-| + + +## 一、模型基本信息 + + + +- ### 模型介绍 + + - Xception 全称为 Extreme Inception,是 Google 于 2016年提出的 Inception V3 的改进模型。Xception 采用了深度可分离卷积(depthwise separable convolution) 来替换原来 Inception V3 中的卷积操作,整体的网络结构是带有残差连接的深度可分离卷积层的线性堆叠。该PaddleHub Module结构为Xception71,基于ImageNet-2012数据集训练,接受输入图片大小为224 x 224 x 3,支持直接通过命令行或者 Python 接口进行预测。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 1.4.0 + + - paddlehub >= 1.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + +- ### 2、安装 + + - ```shell + $ hub install xception71_imagenet + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、命令行预测 + + - ```shell + $ hub run xception71_imagenet --input_path "/PATH/TO/IMAGE" + ``` + - 通过命令行方式实现图像分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、预测代码示例 + + - ```python + import paddlehub as hub + import cv2 + + classifier = hub.Module(name="xception71_imagenet") + test_img_path = "/PATH/TO/IMAGE" + input_dict = {"image": [test_img_path]} + result = classifier.classification(data=input_dict) + ``` + +- ### 3、API + + - ```python + def classification(data) + ``` + - 分类接口API。 + - **参数** + - data:dict类型,key为image,str类型,value为待检测的图片路径,list类型。 + + - **返回** + - result:list类型,每个元素为对应输入图片的预测结果。预测结果为dict类型,key为该图片分类结果label,value为该label对应的概率 + + + + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 + + - ```shell + $ hub install xception71_imagenet==1.0.0 + ``` diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/README.md b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/README.md deleted file mode 100644 index 95b9a1dd61eb5477bb54bcc8188f52eadb6baa81..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/README.md +++ /dev/null @@ -1,138 +0,0 @@ -## 命令行预测 - -``` -$ hub run retinanet_resnet50_fpn_coco2017 --input_path "/PATH/TO/IMAGE" -``` - -## API - -``` -def context(trainable=True, - pretrained=True, - get_prediction=False) -``` - -提取特征,用于迁移学习。 - -**参数** - -* trainable(bool): 参数是否可训练; -* pretrained (bool): 是否加载预训练模型; -* get\_prediction (bool): 是否执行预测。 - -**返回** - -* inputs (dict): 模型的输入,keys 包括 'image', 'im\_size',相应的取值为: - * image (Variable): 图像变量 - * im\_size (Variable): 图片的尺寸 -* outputs (dict): 模型的输出。如果 get\_prediction 为 False,输出 'head\_fatures',否则输出 'bbox\_out'。 -* context\_prog (Program): 用于迁移学习的 Program. 
- -```python -def object_detection(paths=None, - images=None, - batch_size=1, - use_gpu=False, - output_dir='detection_result', - score_thresh=0.5, - visualization=True) -``` - -预测API,检测输入图片中的所有目标的位置。 - -**参数** - -* paths (list\[str\]): 图片的路径; -* images (list\[numpy.ndarray\]): 图片数据,ndarray.shape 为 \[H, W, C\],BGR格式; -* batch\_size (int): batch 的大小; -* use\_gpu (bool): 是否使用 GPU; -* score\_thresh (float): 识别置信度的阈值; -* visualization (bool): 是否将识别结果保存为图片文件; -* output\_dir (str): 图片的保存路径,默认设为 detection\_result; - -**返回** - -* res (list\[dict\]): 识别结果的列表,列表中每一个元素为 dict,各字段为: - * data (list): 检测结果,list的每一个元素为 dict,各字段为: - * confidence (float): 识别的置信度; - * label (str): 标签; - * left (int): 边界框的左上角x坐标; - * top (int): 边界框的左上角y坐标; - * right (int): 边界框的右下角x坐标; - * bottom (int): 边界框的右下角y坐标; - * save\_path (str, optional): 识别结果的保存路径 (仅当visualization=True时存在)。 - -```python -def save_inference_model(dirname, - model_filename=None, - params_filename=None, - combined=True) -``` - -将模型保存到指定路径。 - -**参数** - -* dirname: 存在模型的目录名称 -* model\_filename: 模型文件名称,默认为\_\_model\_\_ -* params\_filename: 参数文件名称,默认为\_\_params\_\_(仅当`combined`为True时生效) -* combined: 是否将参数保存到统一的一个文件中 - -## 代码示例 - -```python -import paddlehub as hub -import cv2 - -object_detector = hub.Module(name="retinanet_resnet50_fpn_coco2017") -result = object_detector.object_detection(images=[cv2.imread('/PATH/TO/IMAGE')]) -# or -# result = object_detector.object_detection((paths=['/PATH/TO/IMAGE']) -``` - -## 服务部署 - -PaddleHub Serving可以部署一个目标检测的在线服务。 - -## 第一步:启动PaddleHub Serving - -运行启动命令: -```shell -$ hub serving start -m retinanet_resnet50_fpn_coco2017 -``` - -这样就完成了一个目标检测的服务化API的部署,默认端口号为8866。 - -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 - -## 第二步:发送预测请求 - -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 - -```python -import requests -import json -import cv2 -import base64 - - -def cv2_to_base64(image): - data = cv2.imencode('.jpg', image)[1] - return base64.b64encode(data.tostring()).decode('utf8') - - -# 发送HTTP请求 -data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} -headers = {"Content-type": "application/json"} -url = "http://127.0.0.1:8866/predict/retinanet_resnet50_fpn_coco2017" -r = requests.post(url=url, headers=headers, data=json.dumps(data)) - -# 打印预测结果 -print(r.json()["results"]) -``` - -### 依赖 - -paddlepaddle >= 1.6.2 - -paddlehub >= 1.6.0 diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/__init__.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/__init__.py deleted file mode 100644 index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..0000000000000000000000000000000000000000 diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/data_feed.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/data_feed.py deleted file mode 100644 index dbef6a3fc4ae231e6e08dac93af4674066920b43..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/data_feed.py +++ /dev/null @@ -1,99 +0,0 @@ -# coding=utf-8 -from __future__ import absolute_import -from __future__ import print_function -from __future__ import division - -import os -from collections import OrderedDict - -import numpy as np -import cv2 -from PIL import Image, ImageEnhance -from paddle import fluid - -__all__ = ['test_reader', 'padding_minibatch'] - - -def test_reader(paths=None, images=None): - """ - data generator - - Args: - paths (list[str]): paths to images. 
- images (list(numpy.ndarray)): data of images, shape of each is [H, W, C] - - Yield: - res (dict): key contains 'image' and 'im_info', the corresponding values is: - image (numpy.ndarray): the image to be fed into network - im_info (numpy.ndarray): the info about the preprocessed. - """ - img_list = list() - if paths: - for img_path in paths: - assert os.path.isfile(img_path), "The {} isn't a valid file path.".format(img_path) - img = cv2.imread(img_path).astype('float32') - img_list.append(img) - if images is not None: - for img in images: - img_list.append(img) - for im in img_list: - im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB) - im = im.astype(np.float32, copy=False) - mean = [0.485, 0.456, 0.406] - std = [0.229, 0.224, 0.225] - mean = np.array(mean)[np.newaxis, np.newaxis, :] - std = np.array(std)[np.newaxis, np.newaxis, :] - im = im / 255.0 - im -= mean - im /= std - target_size = 800 - max_size = 1333 - shape = im.shape - # im_shape holds the original shape of image. - # im_shape = np.array([shape[0], shape[1], 1.0]).astype('float32') - im_size_min = np.min(shape[0:2]) - im_size_max = np.max(shape[0:2]) - im_scale = float(target_size) / float(im_size_min) - if np.round(im_scale * im_size_max) > max_size: - im_scale = float(max_size) / float(im_size_max) - - resize_w = np.round(im_scale * float(shape[1])) - resize_h = np.round(im_scale * float(shape[0])) - # im_info holds the resize info of image. - im_info = np.array([resize_h, resize_w, im_scale]).astype('float32') - - im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_LINEAR) - - # HWC --> CHW - im = np.swapaxes(im, 1, 2) - im = np.swapaxes(im, 1, 0) - yield {'image': im, 'im_info': im_info} - - -def padding_minibatch(batch_data, coarsest_stride=0, use_padded_im_info=True): - max_shape_org = np.array([data['image'].shape for data in batch_data]).max(axis=0) - if coarsest_stride > 0: - max_shape = np.zeros((3)).astype('int32') - max_shape[1] = int(np.ceil(max_shape_org[1] / coarsest_stride) * coarsest_stride) - max_shape[2] = int(np.ceil(max_shape_org[2] / coarsest_stride) * coarsest_stride) - else: - max_shape = max_shape_org.astype('int32') - - padding_image = list() - padding_info = list() - padding_shape = list() - - for data in batch_data: - im_c, im_h, im_w = data['image'].shape - # image - padding_im = np.zeros((im_c, max_shape[1], max_shape[2]), dtype=np.float32) - padding_im[:, 0:im_h, 0:im_w] = data['image'] - padding_image.append(padding_im) - # im_info - data['im_info'][0] = max_shape[1] if use_padded_im_info else max_shape_org[1] - data['im_info'][1] = max_shape[2] if use_padded_im_info else max_shape_org[2] - padding_info.append(data['im_info']) - - padding_image = np.array(padding_image).astype('float32') - padding_info = np.array(padding_info).astype('float32') - return padding_image, padding_info diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/fpn.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/fpn.py deleted file mode 100644 index 803b8acde9bbd289237d2bd1b9735fd905964edf..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/fpn.py +++ /dev/null @@ -1,237 +0,0 @@ -# coding=utf-8 -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import copy -from collections import OrderedDict - -from paddle import fluid -from paddle.fluid.param_attr import ParamAttr -from paddle.fluid.initializer import Xavier -from 
paddle.fluid.regularizer import L2Decay - -__all__ = ['FPN'] - - -def ConvNorm(input, - num_filters, - filter_size, - stride=1, - groups=1, - norm_decay=0., - norm_type='affine_channel', - norm_groups=32, - dilation=1, - lr_scale=1, - freeze_norm=False, - act=None, - norm_name=None, - initializer=None, - name=None): - fan = num_filters - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=((filter_size - 1) // 2) * dilation, - dilation=dilation, - groups=groups, - act=None, - param_attr=ParamAttr(name=name + "_weights", initializer=initializer, learning_rate=lr_scale), - bias_attr=False, - name=name + '.conv2d.output.1') - norm_lr = 0. if freeze_norm else 1. - pattr = ParamAttr(name=norm_name + '_scale', learning_rate=norm_lr * lr_scale, regularizer=L2Decay(norm_decay)) - battr = ParamAttr(name=norm_name + '_offset', learning_rate=norm_lr * lr_scale, regularizer=L2Decay(norm_decay)) - if norm_type in ['bn', 'sync_bn']: - global_stats = True if freeze_norm else False - out = fluid.layers.batch_norm( - input=conv, - act=act, - name=norm_name + '.output.1', - param_attr=pattr, - bias_attr=battr, - moving_mean_name=norm_name + '_mean', - moving_variance_name=norm_name + '_variance', - use_global_stats=global_stats) - scale = fluid.framework._get_var(pattr.name) - bias = fluid.framework._get_var(battr.name) - elif norm_type == 'gn': - out = fluid.layers.group_norm( - input=conv, act=act, name=norm_name + '.output.1', groups=norm_groups, param_attr=pattr, bias_attr=battr) - scale = fluid.framework._get_var(pattr.name) - bias = fluid.framework._get_var(battr.name) - elif norm_type == 'affine_channel': - scale = fluid.layers.create_parameter( - shape=[conv.shape[1]], dtype=conv.dtype, attr=pattr, default_initializer=fluid.initializer.Constant(1.)) - bias = fluid.layers.create_parameter( - shape=[conv.shape[1]], dtype=conv.dtype, attr=battr, default_initializer=fluid.initializer.Constant(0.)) - out = fluid.layers.affine_channel(x=conv, scale=scale, bias=bias, act=act) - if freeze_norm: - scale.stop_gradient = True - bias.stop_gradient = True - return out - - -class FPN(object): - """ - Feature Pyramid Network, see https://arxiv.org/abs/1612.03144 - - Args: - num_chan (int): number of feature channels - min_level (int): lowest level of the backbone feature map to use - max_level (int): highest level of the backbone feature map to use - spatial_scale (list): feature map scaling factor - has_extra_convs (bool): whether has extral convolutions in higher levels - norm_type (str|None): normalization type, 'bn'/'sync_bn'/'affine_channel' - """ - __shared__ = ['norm_type', 'freeze_norm'] - - def __init__(self, - num_chan=256, - min_level=2, - max_level=6, - spatial_scale=[1. / 32., 1. / 16., 1. / 8., 1. 
/ 4.], - has_extra_convs=False, - norm_type=None, - freeze_norm=False): - self.freeze_norm = freeze_norm - self.num_chan = num_chan - self.min_level = min_level - self.max_level = max_level - self.spatial_scale = spatial_scale - self.has_extra_convs = has_extra_convs - self.norm_type = norm_type - - def _add_topdown_lateral(self, body_name, body_input, upper_output): - lateral_name = 'fpn_inner_' + body_name + '_lateral' - topdown_name = 'fpn_topdown_' + body_name - fan = body_input.shape[1] - if self.norm_type: - initializer = Xavier(fan_out=fan) - lateral = ConvNorm( - body_input, - self.num_chan, - 1, - initializer=initializer, - norm_type=self.norm_type, - freeze_norm=self.freeze_norm, - name=lateral_name, - norm_name=lateral_name) - else: - lateral = fluid.layers.conv2d( - body_input, - self.num_chan, - 1, - param_attr=ParamAttr(name=lateral_name + "_w", initializer=Xavier(fan_out=fan)), - bias_attr=ParamAttr(name=lateral_name + "_b", learning_rate=2., regularizer=L2Decay(0.)), - name=lateral_name) - topdown = fluid.layers.resize_nearest(upper_output, scale=2., name=topdown_name) - - return lateral + topdown - - def get_output(self, body_dict): - """ - Add FPN onto backbone. - - Args: - body_dict(OrderedDict): Dictionary of variables and each element is the - output of backbone. - - Return: - fpn_dict(OrderedDict): A dictionary represents the output of FPN with - their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - """ - spatial_scale = copy.deepcopy(self.spatial_scale) - body_name_list = list(body_dict.keys())[::-1] - num_backbone_stages = len(body_name_list) - self.fpn_inner_output = [[] for _ in range(num_backbone_stages)] - fpn_inner_name = 'fpn_inner_' + body_name_list[0] - body_input = body_dict[body_name_list[0]] - fan = body_input.shape[1] - if self.norm_type: - initializer = Xavier(fan_out=fan) - self.fpn_inner_output[0] = ConvNorm( - body_input, - self.num_chan, - 1, - initializer=initializer, - norm_type=self.norm_type, - freeze_norm=self.freeze_norm, - name=fpn_inner_name, - norm_name=fpn_inner_name) - else: - self.fpn_inner_output[0] = fluid.layers.conv2d( - body_input, - self.num_chan, - 1, - param_attr=ParamAttr(name=fpn_inner_name + "_w", initializer=Xavier(fan_out=fan)), - bias_attr=ParamAttr(name=fpn_inner_name + "_b", learning_rate=2., regularizer=L2Decay(0.)), - name=fpn_inner_name) - for i in range(1, num_backbone_stages): - body_name = body_name_list[i] - body_input = body_dict[body_name] - top_output = self.fpn_inner_output[i - 1] - fpn_inner_single = self._add_topdown_lateral(body_name, body_input, top_output) - self.fpn_inner_output[i] = fpn_inner_single - fpn_dict = {} - fpn_name_list = [] - for i in range(num_backbone_stages): - fpn_name = 'fpn_' + body_name_list[i] - fan = self.fpn_inner_output[i].shape[1] * 3 * 3 - if self.norm_type: - initializer = Xavier(fan_out=fan) - fpn_output = ConvNorm( - self.fpn_inner_output[i], - self.num_chan, - 3, - initializer=initializer, - norm_type=self.norm_type, - freeze_norm=self.freeze_norm, - name=fpn_name, - norm_name=fpn_name) - else: - fpn_output = fluid.layers.conv2d( - self.fpn_inner_output[i], - self.num_chan, - filter_size=3, - padding=1, - param_attr=ParamAttr(name=fpn_name + "_w", initializer=Xavier(fan_out=fan)), - bias_attr=ParamAttr(name=fpn_name + "_b", learning_rate=2., regularizer=L2Decay(0.)), - name=fpn_name) - fpn_dict[fpn_name] = fpn_output - fpn_name_list.append(fpn_name) - if not self.has_extra_convs and self.max_level - self.min_level == len(spatial_scale): - 
body_top_name = fpn_name_list[0] - body_top_extension = fluid.layers.pool2d( - fpn_dict[body_top_name], 1, 'max', pool_stride=2, name=body_top_name + '_subsampled_2x') - fpn_dict[body_top_name + '_subsampled_2x'] = body_top_extension - fpn_name_list.insert(0, body_top_name + '_subsampled_2x') - spatial_scale.insert(0, spatial_scale[0] * 0.5) - # Coarser FPN levels introduced for RetinaNet - highest_backbone_level = self.min_level + len(spatial_scale) - 1 - if self.has_extra_convs and self.max_level > highest_backbone_level: - fpn_blob = body_dict[body_name_list[0]] - for i in range(highest_backbone_level + 1, self.max_level + 1): - fpn_blob_in = fpn_blob - fpn_name = 'fpn_' + str(i) - if i > highest_backbone_level + 1: - fpn_blob_in = fluid.layers.relu(fpn_blob) - fan = fpn_blob_in.shape[1] * 3 * 3 - fpn_blob = fluid.layers.conv2d( - input=fpn_blob_in, - num_filters=self.num_chan, - filter_size=3, - stride=2, - padding=1, - param_attr=ParamAttr(name=fpn_name + "_w", initializer=Xavier(fan_out=fan)), - bias_attr=ParamAttr(name=fpn_name + "_b", learning_rate=2., regularizer=L2Decay(0.)), - name=fpn_name) - fpn_dict[fpn_name] = fpn_blob - fpn_name_list.insert(0, fpn_name) - spatial_scale.insert(0, spatial_scale[0] * 0.5) - res_dict = OrderedDict([(k, fpn_dict[k]) for k in fpn_name_list]) - return res_dict, spatial_scale diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/label_file.txt b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/label_file.txt deleted file mode 100644 index d7d43a94adf73208f997f0efd6581bef11ca734e..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/label_file.txt +++ /dev/null @@ -1,81 +0,0 @@ -background -person -bicycle -car -motorcycle -airplane -bus -train -truck -boat -traffic light -fire hydrant -stop sign -parking meter -bench -bird -cat -dog -horse -sheep -cow -elephant -bear -zebra -giraffe -backpack -umbrella -handbag -tie -suitcase -frisbee -skis -snowboard -sports ball -kite -baseball bat -baseball glove -skateboard -surfboard -tennis racket -bottle -wine glass -cup -fork -knife -spoon -bowl -banana -apple -sandwich -orange -broccoli -carrot -hot dog -pizza -donut -cake -chair -couch -potted plant -bed -dining table -toilet -tv -laptop -mouse -remote -keyboard -cell phone -microwave -oven -toaster -sink -refrigerator -book -clock -vase -scissors -teddy bear -hair drier -toothbrush diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/module.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/module.py deleted file mode 100644 index 5070dacb42d0eb4ca20d6e752c7239b83b2257ee..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/module.py +++ /dev/null @@ -1,302 +0,0 @@ -# coding=utf-8 -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os -import ast -import argparse -from functools import partial - -import numpy as np -import paddle.fluid as fluid -import paddlehub as hub -from paddlehub.module.module import moduleinfo, runnable, serving -from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor -from paddlehub.io.parser import txt_parser -from paddlehub.common.paddle_helper import add_vars_prefix - -from retinanet_resnet50_fpn_coco2017.fpn import FPN -from retinanet_resnet50_fpn_coco2017.retina_head import AnchorGenerator, RetinaTargetAssign, RetinaOutputDecoder, RetinaHead 
-from retinanet_resnet50_fpn_coco2017.processor import load_label_info, postprocess, base64_to_cv2 -from retinanet_resnet50_fpn_coco2017.data_feed import test_reader, padding_minibatch -from retinanet_resnet50_fpn_coco2017.resnet import ResNet - - -@moduleinfo( - name="retinanet_resnet50_fpn_coco2017", - version="1.0.0", - type="cv/object_detection", - summary="Baidu's RetinaNet model for object detection, with backbone ResNet50 and FPN.", - author="paddlepaddle", - author_email="paddle-dev@baidu.com") -class RetinaNetResNet50FPN(hub.Module): - def _initialize(self): - # default pretrained model of Retinanet_ResNet50_FPN, the shape of input image tensor is (3, 608, 608) - self.default_pretrained_model_path = os.path.join(self.directory, "retinanet_resnet50_fpn_model") - self.label_names = load_label_info(os.path.join(self.directory, "label_file.txt")) - self.infer_prog = None - self.image = None - self.im_info = None - self.bbox_out = None - self._set_config() - - def _set_config(self): - """ - predictor config setting - """ - cpu_config = AnalysisConfig(self.default_pretrained_model_path) - cpu_config.disable_glog_info() - cpu_config.disable_gpu() - self.cpu_predictor = create_paddle_predictor(cpu_config) - - try: - _places = os.environ["CUDA_VISIBLE_DEVICES"] - int(_places[0]) - use_gpu = True - except: - use_gpu = False - if use_gpu: - gpu_config = AnalysisConfig(self.default_pretrained_model_path) - gpu_config.disable_glog_info() - gpu_config.enable_use_gpu(memory_pool_init_size_mb=500, device_id=0) - self.gpu_predictor = create_paddle_predictor(gpu_config) - - def context(self, num_classes=81, trainable=True, pretrained=True, phase='train'): - """ - Distill the Head Features, so as to perform transfer learning. - - Args: - num_classes (int): number of classes. - trainable (bool): whether to set parameters trainable. - pretrained (bool): whether to load default pretrained model. - phase (str): optional choices are 'train' and 'predict'. - - Returns: - inputs(dict): the input variables. - outputs(dict): the output variables. - context_prog (Program): the program to execute transfer learning. 
- """ - context_prog = fluid.Program() - startup_program = fluid.Program() - with fluid.program_guard(context_prog, startup_program): - with fluid.unique_name.guard(): - var_prefix = '@HUB_{}@'.format(self.name) - # image - image = fluid.layers.data(name='image', shape=[-1, 3, -1, -1], dtype='float32', lod_level=0) - # im_info - im_info = fluid.layers.data(name='im_info', shape=[3], dtype='float32', lod_level=0) - # backbone - backbone = ResNet( - norm_type='affine_channel', freeze_at=2, norm_decay=0., depth=50, feature_maps=[3, 4, 5]) - body_feats = backbone(image) - # retina_head - retina_head = RetinaHead( - anchor_generator=AnchorGenerator(aspect_ratios=[1.0, 2.0, 0.5], variance=[1.0, 1.0, 1.0, 1.0]), - target_assign=RetinaTargetAssign(positive_overlap=0.5, negative_overlap=0.4), - output_decoder=RetinaOutputDecoder( - score_thresh=0.05, nms_thresh=0.5, pre_nms_top_n=1000, detections_per_im=100, nms_eta=1.0), - num_convs_per_octave=4, - num_chan=256, - max_level=7, - min_level=3, - prior_prob=0.01, - base_scale=4, - num_scales_per_octave=3) - # fpn - fpn = FPN( - max_level=7, - min_level=3, - num_chan=256, - spatial_scale=[0.03125, 0.0625, 0.125], - has_extra_convs=True) - # body_feats - body_feats, spatial_scale = fpn.get_output(body_feats) - # inputs, outputs, context_prog - inputs = {'image': var_prefix + image.name, 'im_info': var_prefix + im_info.name} - if phase == 'predict': - pred = retina_head.get_prediction(body_feats, spatial_scale, im_info) - outputs = {'bbox_out': var_prefix + pred.name} - else: - outputs = {'body_features': [var_prefix + var.name for key, var in body_feats.items()]} - - # add_vars_prefix - add_vars_prefix(context_prog, var_prefix) - add_vars_prefix(fluid.default_startup_program(), var_prefix) - - global_vars = context_prog.global_block().vars - inputs = {key: global_vars[value] for key, value in inputs.items()} - outputs = { - key: global_vars[value] if not isinstance(value, list) else [global_vars[var] for var in value] - for key, value in outputs.items() - } - - place = fluid.CPUPlace() - exe = fluid.Executor(place) - for param in context_prog.global_block().iter_parameters(): - param.trainable = trainable - if pretrained: - - def _if_exist(var): - return os.path.exists(os.path.join(self.default_pretrained_model_path, var.name)) - - fluid.io.load_vars(exe, self.default_pretrained_model_path, predicate=_if_exist) - else: - exe.run(startup_program) - return inputs, outputs, context_prog - - def save_inference_model(self, dirname, model_filename=None, params_filename=None, combined=True): - if combined: - model_filename = "__model__" if not model_filename else model_filename - params_filename = "__params__" if not params_filename else params_filename - place = fluid.CPUPlace() - exe = fluid.Executor(place) - - program, feeded_var_names, target_vars = fluid.io.load_inference_model( - dirname=self.default_pretrained_model_path, executor=exe) - - fluid.io.save_inference_model( - dirname=dirname, - main_program=program, - executor=exe, - feeded_var_names=feeded_var_names, - target_vars=target_vars, - model_filename=model_filename, - params_filename=params_filename) - - def object_detection(self, - paths=None, - images=None, - use_gpu=False, - batch_size=1, - output_dir='detection_result', - score_thresh=0.5, - visualization=True): - """API of Object Detection. - - Args: - paths (list[str]): The paths of images. - images (list(numpy.ndarray)): images data, shape of each is [H, W, C] - batch_size (int): batch size. - use_gpu (bool): Whether to use gpu. 
- output_dir (str): The path to store output images. - visualization (bool): Whether to save image or not. - score_thresh (float): threshold for object detecion. - visualization (bool): whether to save result as images. - - Returns: - res (list[dict]): The result of coco2017 detecion. keys include 'data', 'save_path', the corresponding value is: - data (dict): the result of object detection, keys include 'left', 'top', 'right', 'bottom', 'label', 'confidence', the corresponding value is: - left (float): The X coordinate of the upper left corner of the bounding box; - top (float): The Y coordinate of the upper left corner of the bounding box; - right (float): The X coordinate of the lower right corner of the bounding box; - bottom (float): The Y coordinate of the lower right corner of the bounding box; - label (str): The label of detection result; - confidence (float): The confidence of detection result. - save_path (str, optional): The path to save output images. - """ - if use_gpu: - try: - _places = os.environ["CUDA_VISIBLE_DEVICES"] - int(_places[0]) - except: - raise RuntimeError( - "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id." - ) - - all_images = list() - paths = paths if paths else list() - for yield_data in test_reader(paths, images): - all_images.append(yield_data) - - images_num = len(all_images) - loop_num = int(np.ceil(images_num / batch_size)) - res = list() - for iter_id in range(loop_num): - batch_data = list() - handle_id = iter_id * batch_size - for image_id in range(batch_size): - try: - batch_data.append(all_images[handle_id + image_id]) - except: - pass - padding_image, padding_info = padding_minibatch(batch_data, coarsest_stride=32, use_padded_im_info=True) - padding_image_tensor = PaddleTensor(padding_image.copy()) - padding_info_tensor = PaddleTensor(padding_info.copy()) - feed_list = [padding_image_tensor, padding_info_tensor] - if use_gpu: - data_out = self.gpu_predictor.run(feed_list) - else: - data_out = self.cpu_predictor.run(feed_list) - output = postprocess( - paths=paths, - images=images, - data_out=data_out, - score_thresh=score_thresh, - label_names=self.label_names, - output_dir=output_dir, - handle_id=handle_id, - visualization=visualization) - res += output - return res - - def add_module_config_arg(self): - """ - Add the command config options - """ - self.arg_config_group.add_argument( - '--use_gpu', type=ast.literal_eval, default=False, help="whether use GPU or not") - - self.arg_config_group.add_argument('--batch_size', type=int, default=1, help="batch size for prediction") - - def add_module_input_arg(self): - """ - Add the command input options - """ - self.arg_input_group.add_argument('--input_path', type=str, default=None, help="input data") - - self.arg_input_group.add_argument('--input_file', type=str, default=None, help="file contain input data") - - def check_input_data(self, args): - input_data = list() - if args.input_path: - input_data = [args.input_path] - elif args.input_file: - if not os.path.exists(args.input_file): - raise RuntimeError("File %s is not exist." % args.input_file) - else: - input_data = txt_parser.parse(args.input_file, use_strip=True) - return input_data - - @serving - def serving_method(self, images, **kwargs): - """ - Run as a service. 
- """ - images_decode = [base64_to_cv2(image) for image in images] - results = self.object_detection(images=images_decode, **kwargs) - return results - - @runnable - def run_cmd(self, argvs): - self.parser = argparse.ArgumentParser( - description="Run the {}".format(self.name), - prog="hub run {}".format(self.name), - usage='%(prog)s', - add_help=True) - self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required") - self.arg_config_group = self.parser.add_argument_group( - title="Config options", description="Run configuration for controlling module behavior, not required.") - self.add_module_config_arg() - - self.add_module_input_arg() - args = self.parser.parse_args(argvs) - input_data = self.check_input_data(args) - if len(input_data) == 0: - self.parser.print_help() - exit(1) - else: - for image_path in input_data: - if not os.path.exists(image_path): - raise RuntimeError("File %s or %s is not exist." % image_path) - return self.object_detection(paths=input_data, use_gpu=args.use_gpu, batch_size=args.batch_size) diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/name_adapter.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/name_adapter.py deleted file mode 100644 index bebf8bdeeec3aa76357d95cc52ba5a009e19d46f..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/name_adapter.py +++ /dev/null @@ -1,61 +0,0 @@ -# coding=utf-8 - - -class NameAdapter(object): - """Fix the backbones variable names for pretrained weight""" - - def __init__(self, model): - super(NameAdapter, self).__init__() - self.model = model - - @property - def model_type(self): - return getattr(self.model, '_model_type', '') - - @property - def variant(self): - return getattr(self.model, 'variant', '') - - def fix_conv_norm_name(self, name): - if name == "conv1": - bn_name = "bn_" + name - else: - bn_name = "bn" + name[3:] - # the naming rule is same as pretrained weight - if self.model_type == 'SEResNeXt': - bn_name = name + "_bn" - return bn_name - - def fix_shortcut_name(self, name): - if self.model_type == 'SEResNeXt': - name = 'conv' + name + '_prj' - return name - - def fix_bottleneck_name(self, name): - if self.model_type == 'SEResNeXt': - conv_name1 = 'conv' + name + '_x1' - conv_name2 = 'conv' + name + '_x2' - conv_name3 = 'conv' + name + '_x3' - shortcut_name = name - else: - conv_name1 = name + "_branch2a" - conv_name2 = name + "_branch2b" - conv_name3 = name + "_branch2c" - shortcut_name = name + "_branch1" - return conv_name1, conv_name2, conv_name3, shortcut_name - - def fix_layer_warp_name(self, stage_num, count, i): - name = 'res' + str(stage_num) - if count > 10 and stage_num == 4: - if i == 0: - conv_name = name + "a" - else: - conv_name = name + "b" + str(i) - else: - conv_name = name + chr(ord("a") + i) - if self.model_type == 'SEResNeXt': - conv_name = str(stage_num + 2) + '_' + str(i + 1) - return conv_name - - def fix_c1_stage_name(self): - return "res_conv1" if self.model_type == 'ResNeXt' else "conv1" diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/nonlocal_helper.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/nonlocal_helper.py deleted file mode 100644 index 839df4caf744280001f033d8ef6a3d560277368e..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/nonlocal_helper.py +++ /dev/null @@ -1,151 +0,0 @@ -from __future__ import absolute_import 
-from __future__ import division
-from __future__ import print_function
-from __future__ import unicode_literals
-
-import paddle.fluid as fluid
-from paddle.fluid import ParamAttr
-
-nonlocal_params = {
-    "use_zero_init_conv": False,
-    "conv_init_std": 0.01,
-    "no_bias": True,
-    "use_maxpool": False,
-    "use_softmax": True,
-    "use_bn": False,
-    "use_scale": True,  # vital for the model performance!!!
-    "use_affine": False,
-    "bn_momentum": 0.9,
-    "bn_epsilon": 1.0000001e-5,
-    "bn_init_gamma": 0.9,
-    "weight_decay_bn": 1.e-4,
-}
-
-
-def space_nonlocal(input, dim_in, dim_out, prefix, dim_inner, max_pool_stride=2):
-    cur = input
-    theta = fluid.layers.conv2d(input = cur, num_filters = dim_inner, \
-                filter_size = [1, 1], stride = [1, 1], \
-                padding = [0, 0], \
-                param_attr=ParamAttr(name = prefix + '_theta' + "_w", \
-                    initializer = fluid.initializer.Normal(loc = 0.0,
-                        scale = nonlocal_params["conv_init_std"])), \
-                bias_attr = ParamAttr(name = prefix + '_theta' + "_b", \
-                    initializer = fluid.initializer.Constant(value = 0.)) \
-                        if not nonlocal_params["no_bias"] else False, \
-                name = prefix + '_theta')
-    theta_shape = theta.shape
-    theta_shape_op = fluid.layers.shape(theta)
-    theta_shape_op.stop_gradient = True
-
-    if nonlocal_params["use_maxpool"]:
-        max_pool = fluid.layers.pool2d(input = cur, \
-            pool_size = [max_pool_stride, max_pool_stride], \
-            pool_type = 'max', \
-            pool_stride = [max_pool_stride, max_pool_stride], \
-            pool_padding = [0, 0], \
-            name = prefix + '_pool')
-    else:
-        max_pool = cur
-
-    phi = fluid.layers.conv2d(input = max_pool, num_filters = dim_inner, \
-        filter_size = [1, 1], stride = [1, 1], \
-        padding = [0, 0], \
-        param_attr = ParamAttr(name = prefix + '_phi' + "_w", \
-            initializer = fluid.initializer.Normal(loc = 0.0,
-                scale = nonlocal_params["conv_init_std"])), \
-        bias_attr = ParamAttr(name = prefix + '_phi' + "_b", \
-            initializer = fluid.initializer.Constant(value = 0.)) \
-                if (nonlocal_params["no_bias"] == 0) else False, \
-        name = prefix + '_phi')
-    phi_shape = phi.shape
-
-    g = fluid.layers.conv2d(input = max_pool, num_filters = dim_inner, \
-        filter_size = [1, 1], stride = [1, 1], \
-        padding = [0, 0], \
-        param_attr = ParamAttr(name = prefix + '_g' + "_w", \
-            initializer = fluid.initializer.Normal(loc = 0.0, scale = nonlocal_params["conv_init_std"])), \
-        bias_attr = ParamAttr(name = prefix + '_g' + "_b", \
-            initializer = fluid.initializer.Constant(value = 0.)) if (nonlocal_params["no_bias"] == 0) else False, \
-        name = prefix + '_g')
-    g_shape = g.shape
-    # we have to use explicit batch size (to support arbitrary spacetime size)
-    # e.g. (8, 1024, 4, 14, 14) => (8, 1024, 784)
-    theta = fluid.layers.reshape(theta, shape=(0, 0, -1))
-    theta = fluid.layers.transpose(theta, [0, 2, 1])
-    phi = fluid.layers.reshape(phi, [0, 0, -1])
-    theta_phi = fluid.layers.matmul(theta, phi, name=prefix + '_affinity')
-    g = fluid.layers.reshape(g, [0, 0, -1])
-
-    if nonlocal_params["use_softmax"]:
-        if nonlocal_params["use_scale"]:
-            theta_phi_sc = fluid.layers.scale(theta_phi, scale=dim_inner**-.5)
-        else:
-            theta_phi_sc = theta_phi
-        p = fluid.layers.softmax(theta_phi_sc, name=prefix + '_affinity' + '_prob')
-    else:
-        # the no-softmax branch is not implemented in the reference code
-        p = None  # not implemented
-        raise NotImplementedError("space_nonlocal only supports use_softmax=True")
-
-    # note g's axis[2] corresponds to p's axis[2]
-    # e.g.
g(8, 1024, 784_2) * p(8, 784_1, 784_2) => (8, 1024, 784_1) - p = fluid.layers.transpose(p, [0, 2, 1]) - t = fluid.layers.matmul(g, p, name=prefix + '_y') - - # reshape back - # e.g. (8, 1024, 784) => (8, 1024, 4, 14, 14) - t_shape = t.shape - t_re = fluid.layers.reshape(t, shape=list(theta_shape), actual_shape=theta_shape_op) - blob_out = t_re - blob_out = fluid.layers.conv2d(input = blob_out, num_filters = dim_out, \ - filter_size = [1, 1], stride = [1, 1], padding = [0, 0], \ - param_attr = ParamAttr(name = prefix + '_out' + "_w", \ - initializer = fluid.initializer.Constant(value = 0.) \ - if nonlocal_params["use_zero_init_conv"] \ - else fluid.initializer.Normal(loc = 0.0, - scale = nonlocal_params["conv_init_std"])), \ - bias_attr = ParamAttr(name = prefix + '_out' + "_b", \ - initializer = fluid.initializer.Constant(value = 0.)) \ - if (nonlocal_params["no_bias"] == 0) else False, \ - name = prefix + '_out') - blob_out_shape = blob_out.shape - - if nonlocal_params["use_bn"]: - bn_name = prefix + "_bn" - blob_out = fluid.layers.batch_norm(blob_out, \ - # is_test = test_mode, \ - momentum = nonlocal_params["bn_momentum"], \ - epsilon = nonlocal_params["bn_epsilon"], \ - name = bn_name, \ - param_attr = ParamAttr(name = bn_name + "_s", \ - initializer = fluid.initializer.Constant(value = nonlocal_params["bn_init_gamma"]), \ - regularizer = fluid.regularizer.L2Decay(nonlocal_params["weight_decay_bn"])), \ - bias_attr = ParamAttr(name = bn_name + "_b", \ - regularizer = fluid.regularizer.L2Decay(nonlocal_params["weight_decay_bn"])), \ - moving_mean_name = bn_name + "_rm", \ - moving_variance_name = bn_name + "_riv") # add bn - - if nonlocal_params["use_affine"]: - affine_scale = fluid.layers.create_parameter(\ - shape=[blob_out_shape[1]], dtype = blob_out.dtype, \ - attr=ParamAttr(name=prefix + '_affine' + '_s'), \ - default_initializer = fluid.initializer.Constant(value = 1.)) - affine_bias = fluid.layers.create_parameter(\ - shape=[blob_out_shape[1]], dtype = blob_out.dtype, \ - attr=ParamAttr(name=prefix + '_affine' + '_b'), \ - default_initializer = fluid.initializer.Constant(value = 0.)) - blob_out = fluid.layers.affine_channel(blob_out, scale = affine_scale, \ - bias = affine_bias, name = prefix + '_affine') # add affine - - return blob_out - - -def add_space_nonlocal(input, dim_in, dim_out, prefix, dim_inner): - ''' - add_space_nonlocal: - Non-local Neural Networks: see https://arxiv.org/abs/1711.07971 - ''' - conv = space_nonlocal(input, dim_in, dim_out, prefix, dim_inner) - output = fluid.layers.elementwise_add(input, conv, name=prefix + '_sum') - return output diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/processor.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/processor.py deleted file mode 100644 index 167508096e96cbda4645bb4b20cb6b080ce5f37d..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/processor.py +++ /dev/null @@ -1,162 +0,0 @@ -# coding=utf-8 -import base64 -import os - -import cv2 -import numpy as np -from PIL import Image, ImageDraw - -__all__ = [ - 'base64_to_cv2', - 'load_label_info', - 'postprocess', -] - - -def base64_to_cv2(b64str): - data = base64.b64decode(b64str.encode('utf8')) - data = np.fromstring(data, np.uint8) - data = cv2.imdecode(data, cv2.IMREAD_COLOR) - return data - - -def get_save_image_name(img, output_dir, image_path): - """Get save image name from source image path. 
- """ - image_name = os.path.split(image_path)[-1] - name, ext = os.path.splitext(image_name) - if ext == '': - if img.format == 'PNG': - ext = '.png' - elif img.format == 'JPEG': - ext = '.jpg' - elif img.format == 'BMP': - ext = '.bmp' - else: - if img.mode == "RGB" or img.mode == "L": - ext = ".jpg" - elif img.mode == "RGBA" or img.mode == "P": - ext = '.png' - - return os.path.join(output_dir, "{}".format(name)) + ext - - -def draw_bounding_box_on_image(image_path, data_list, save_dir): - image = Image.open(image_path) - draw = ImageDraw.Draw(image) - for data in data_list: - left, right, top, bottom = data['left'], data['right'], data['top'], data['bottom'] - # draw bbox - draw.line([(left, top), (left, bottom), (right, bottom), (right, top), (left, top)], width=2, fill='red') - - # draw label - if image.mode == 'RGB': - text = data['label'] + ": %.2f%%" % (100 * data['confidence']) - textsize_width, textsize_height = draw.textsize(text=text) - draw.rectangle( - xy=(left, top - (textsize_height + 5), left + textsize_width + 10, top), fill=(255, 255, 255)) - draw.text(xy=(left, top - 15), text=text, fill=(0, 0, 0)) - - save_name = get_save_image_name(image, save_dir, image_path) - if os.path.exists(save_name): - os.remove(save_name) - image.save(save_name) - - return save_name - - -def clip_bbox(bbox, img_width, img_height): - xmin = max(min(bbox[0], img_width), 0.) - ymin = max(min(bbox[1], img_height), 0.) - xmax = max(min(bbox[2], img_width), 0.) - ymax = max(min(bbox[3], img_height), 0.) - return float(xmin), float(ymin), float(xmax), float(ymax) - - -def load_label_info(file_path): - with open(file_path, 'r') as fr: - text = fr.readlines() - label_names = [] - for info in text: - label_names.append(info.strip()) - return label_names - - -def postprocess(paths, images, data_out, score_thresh, label_names, output_dir, handle_id, visualization): - """ - postprocess the lod_tensor produced by fluid.Executor.run - - Args: - paths (list[str]): the path of images. - images (list(numpy.ndarray)): list of images, shape of each is [H, W, C]. - data_out (lod_tensor): data produced by executor.run. - score_thresh (float): the low limit of bounding box. - label_names (list[str]): label names. - output_dir (str): output directory. - handle_id (int): The number of images that have been handled. - visualization (bool): whether to save as images. - - Returns: - res (list[dict]): The result of vehicles detecion. keys include 'data', 'save_path', the corresponding value is: - data (dict): the result of object detection, keys include 'left', 'top', 'right', 'bottom', 'label', 'confidence', the corresponding value is: - left (float): The X coordinate of the upper left corner of the bounding box; - top (float): The Y coordinate of the upper left corner of the bounding box; - right (float): The X coordinate of the lower right corner of the bounding box; - bottom (float): The Y coordinate of the lower right corner of the bounding box; - label (str): The label of detection result; - confidence (float): The confidence of detection result. - save_path (str): The path to save output images. 
- """ - lod_tensor = data_out[0] - lod = lod_tensor.lod[0] - results = lod_tensor.as_ndarray() - - if handle_id < len(paths): - unhandled_paths = paths[handle_id:] - unhandled_paths_num = len(unhandled_paths) - else: - unhandled_paths_num = 0 - - output_dir = output_dir if output_dir else os.path.join(os.getcwd(), 'detection_result') - if visualization: - if not os.path.exists(output_dir): - os.makedirs(output_dir) - - output = [] - for index in range(len(lod) - 1): - output_i = {'data': []} - if index < unhandled_paths_num: - org_img_path = unhandled_paths[index] - org_img = Image.open(org_img_path) - output_i['path'] = org_img_path - else: - org_img = images[index - unhandled_paths_num] - org_img = org_img.astype(np.uint8) - org_img = Image.fromarray(org_img[:, :, ::-1]) - if visualization: - org_img_path = get_save_image_name(org_img, output_dir, 'image_numpy_{}'.format((handle_id + index))) - org_img.save(org_img_path) - org_img_height = org_img.height - org_img_width = org_img.width - result_i = results[lod[index]:lod[index + 1]] - - for row in result_i: - if len(row) != 6: - continue - if row[1] < score_thresh: - continue - category_id = int(row[0]) - confidence = row[1] - bbox = row[2:] - dt = {} - dt['label'] = label_names[category_id] - dt['confidence'] = float(confidence) - dt['left'], dt['top'], dt['right'], dt['bottom'] = clip_bbox(bbox, org_img_width, org_img_height) - output_i['data'].append(dt) - - output.append(output_i) - - if visualization: - output_i['save_path'] = draw_bounding_box_on_image(org_img_path, output_i['data'], output_dir) - - return output diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/resnet.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/resnet.py deleted file mode 100644 index 77a3f7f4c7b16c3f9c65c46fc93eb394befa5110..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/resnet.py +++ /dev/null @@ -1,364 +0,0 @@ -# coding=utf-8 -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import math -from collections import OrderedDict -from numbers import Integral - -from paddle import fluid -from paddle.fluid.param_attr import ParamAttr -from paddle.fluid.framework import Variable -from paddle.fluid.regularizer import L2Decay -from paddle.fluid.initializer import Constant - -from .nonlocal_helper import add_space_nonlocal -from .name_adapter import NameAdapter - -__all__ = ['ResNet', 'ResNetC5'] - - -class ResNet(object): - """ - Residual Network, see https://arxiv.org/abs/1512.03385 - Args: - depth (int): ResNet depth, should be 34, 50. 
- freeze_at (int): freeze the backbone at which stage - norm_type (str): normalization type, 'bn'/'sync_bn'/'affine_channel' - freeze_norm (bool): freeze normalization layers - norm_decay (float): weight decay for normalization layer weights - variant (str): ResNet variant, supports 'a', 'b', 'c', 'd' currently - feature_maps (list): index of stages whose feature maps are returned - dcn_v2_stages (list): index of stages who select deformable conv v2 - nonlocal_stages (list): index of stages who select nonlocal networks - """ - __shared__ = ['norm_type', 'freeze_norm', 'weight_prefix_name'] - - def __init__(self, - depth=50, - freeze_at=0, - norm_type='sync_bn', - freeze_norm=False, - norm_decay=0., - variant='b', - feature_maps=[3, 4, 5], - dcn_v2_stages=[], - weight_prefix_name='', - nonlocal_stages=[], - get_prediction=False, - class_dim=1000): - super(ResNet, self).__init__() - - if isinstance(feature_maps, Integral): - feature_maps = [feature_maps] - - assert depth in [34, 50], \ - "depth {} not in [34, 50]" - assert variant in ['a', 'b', 'c', 'd'], "invalid ResNet variant" - assert 0 <= freeze_at <= 4, "freeze_at should be 0, 1, 2, 3 or 4" - assert len(feature_maps) > 0, "need one or more feature maps" - assert norm_type in ['bn', 'sync_bn', 'affine_channel'] - assert not (len(nonlocal_stages)>0 and depth<50), \ - "non-local is not supported for resnet18 or resnet34" - - self.depth = depth - self.freeze_at = freeze_at - self.norm_type = norm_type - self.norm_decay = norm_decay - self.freeze_norm = freeze_norm - self.variant = variant - self._model_type = 'ResNet' - self.feature_maps = feature_maps - self.dcn_v2_stages = dcn_v2_stages - self.depth_cfg = { - 34: ([3, 4, 6, 3], self.basicblock), - 50: ([3, 4, 6, 3], self.bottleneck), - } - self.stage_filters = [64, 128, 256, 512] - self._c1_out_chan_num = 64 - self.na = NameAdapter(self) - self.prefix_name = weight_prefix_name - - self.nonlocal_stages = nonlocal_stages - self.nonlocal_mod_cfg = { - 50: 2, - 101: 5, - 152: 8, - 200: 12, - } - self.get_prediction = get_prediction - self.class_dim = class_dim - - def _conv_offset(self, input, filter_size, stride, padding, act=None, name=None): - out_channel = filter_size * filter_size * 3 - out = fluid.layers.conv2d( - input, - num_filters=out_channel, - filter_size=filter_size, - stride=stride, - padding=padding, - param_attr=ParamAttr(initializer=Constant(0.0), name=name + ".w_0"), - bias_attr=ParamAttr(initializer=Constant(0.0), name=name + ".b_0"), - act=act, - name=name) - return out - - def _conv_norm(self, input, num_filters, filter_size, stride=1, groups=1, act=None, name=None, dcn_v2=False): - _name = self.prefix_name + name if self.prefix_name != '' else name - if not dcn_v2: - conv = fluid.layers.conv2d( - input=input, - num_filters=num_filters, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - act=None, - param_attr=ParamAttr(name=_name + "_weights"), - bias_attr=False, - name=_name + '.conv2d.output.1') - else: - # select deformable conv" - offset_mask = self._conv_offset( - input=input, - filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - act=None, - name=_name + "_conv_offset") - offset_channel = filter_size**2 * 2 - mask_channel = filter_size**2 - offset, mask = fluid.layers.split(input=offset_mask, num_or_sections=[offset_channel, mask_channel], dim=1) - mask = fluid.layers.sigmoid(mask) - conv = fluid.layers.deformable_conv( - input=input, - offset=offset, - mask=mask, - num_filters=num_filters, - 
filter_size=filter_size, - stride=stride, - padding=(filter_size - 1) // 2, - groups=groups, - deformable_groups=1, - im2col_step=1, - param_attr=ParamAttr(name=_name + "_weights"), - bias_attr=False, - name=_name + ".conv2d.output.1") - - bn_name = self.na.fix_conv_norm_name(name) - bn_name = self.prefix_name + bn_name if self.prefix_name != '' else bn_name - - norm_lr = 0. if self.freeze_norm else 1. - norm_decay = self.norm_decay - pattr = ParamAttr(name=bn_name + '_scale', learning_rate=norm_lr, regularizer=L2Decay(norm_decay)) - battr = ParamAttr(name=bn_name + '_offset', learning_rate=norm_lr, regularizer=L2Decay(norm_decay)) - - if self.norm_type in ['bn', 'sync_bn']: - global_stats = True if self.freeze_norm else False - out = fluid.layers.batch_norm( - input=conv, - act=act, - name=bn_name + '.output.1', - param_attr=pattr, - bias_attr=battr, - moving_mean_name=bn_name + '_mean', - moving_variance_name=bn_name + '_variance', - use_global_stats=global_stats) - scale = fluid.framework._get_var(pattr.name) - bias = fluid.framework._get_var(battr.name) - elif self.norm_type == 'affine_channel': - scale = fluid.layers.create_parameter( - shape=[conv.shape[1]], dtype=conv.dtype, attr=pattr, default_initializer=fluid.initializer.Constant(1.)) - bias = fluid.layers.create_parameter( - shape=[conv.shape[1]], dtype=conv.dtype, attr=battr, default_initializer=fluid.initializer.Constant(0.)) - out = fluid.layers.affine_channel(x=conv, scale=scale, bias=bias, act=act) - if self.freeze_norm: - scale.stop_gradient = True - bias.stop_gradient = True - return out - - def _shortcut(self, input, ch_out, stride, is_first, name): - max_pooling_in_short_cut = self.variant == 'd' - ch_in = input.shape[1] - # the naming rule is same as pretrained weight - name = self.na.fix_shortcut_name(name) - std_senet = getattr(self, 'std_senet', False) - if ch_in != ch_out or stride != 1 or (self.depth < 50 and is_first): - if std_senet: - if is_first: - return self._conv_norm(input, ch_out, 1, stride, name=name) - else: - return self._conv_norm(input, ch_out, 3, stride, name=name) - if max_pooling_in_short_cut and not is_first: - input = fluid.layers.pool2d( - input=input, pool_size=2, pool_stride=2, pool_padding=0, ceil_mode=True, pool_type='avg') - return self._conv_norm(input, ch_out, 1, 1, name=name) - return self._conv_norm(input, ch_out, 1, stride, name=name) - else: - return input - - def bottleneck(self, input, num_filters, stride, is_first, name, dcn_v2=False): - if self.variant == 'a': - stride1, stride2 = stride, 1 - else: - stride1, stride2 = 1, stride - - # ResNeXt - groups = getattr(self, 'groups', 1) - group_width = getattr(self, 'group_width', -1) - if groups == 1: - expand = 4 - elif (groups * group_width) == 256: - expand = 1 - else: # FIXME hard code for now, handles 32x4d, 64x4d and 32x8d - num_filters = num_filters // 2 - expand = 2 - - conv_name1, conv_name2, conv_name3, \ - shortcut_name = self.na.fix_bottleneck_name(name) - std_senet = getattr(self, 'std_senet', False) - if std_senet: - conv_def = [[int(num_filters / 2), 1, stride1, 'relu', 1, conv_name1], - [num_filters, 3, stride2, 'relu', groups, conv_name2], - [num_filters * expand, 1, 1, None, 1, conv_name3]] - else: - conv_def = [[num_filters, 1, stride1, 'relu', 1, conv_name1], - [num_filters, 3, stride2, 'relu', groups, conv_name2], - [num_filters * expand, 1, 1, None, 1, conv_name3]] - - residual = input - for i, (c, k, s, act, g, _name) in enumerate(conv_def): - residual = self._conv_norm( - input=residual, - num_filters=c, - 
filter_size=k, - stride=s, - act=act, - groups=g, - name=_name, - dcn_v2=(i == 1 and dcn_v2)) - short = self._shortcut(input, num_filters * expand, stride, is_first=is_first, name=shortcut_name) - # Squeeze-and-Excitation - if callable(getattr(self, '_squeeze_excitation', None)): - residual = self._squeeze_excitation(input=residual, num_channels=num_filters, name='fc' + name) - return fluid.layers.elementwise_add(x=short, y=residual, act='relu', name=name + ".add.output.5") - - def basicblock(self, input, num_filters, stride, is_first, name, dcn_v2=False): - assert dcn_v2 is False, "Not implemented yet." - conv0 = self._conv_norm( - input=input, num_filters=num_filters, filter_size=3, act='relu', stride=stride, name=name + "_branch2a") - conv1 = self._conv_norm(input=conv0, num_filters=num_filters, filter_size=3, act=None, name=name + "_branch2b") - short = self._shortcut(input, num_filters, stride, is_first, name=name + "_branch1") - return fluid.layers.elementwise_add(x=short, y=conv1, act='relu') - - def layer_warp(self, input, stage_num): - """ - Args: - input (Variable): input variable. - stage_num (int): the stage number, should be 2, 3, 4, 5 - - Returns: - The last variable in endpoint-th stage. - """ - assert stage_num in [2, 3, 4, 5] - - stages, block_func = self.depth_cfg[self.depth] - count = stages[stage_num - 2] - - ch_out = self.stage_filters[stage_num - 2] - is_first = False if stage_num != 2 else True - dcn_v2 = True if stage_num in self.dcn_v2_stages else False - - nonlocal_mod = 1000 - if stage_num in self.nonlocal_stages: - nonlocal_mod = self.nonlocal_mod_cfg[self.depth] if stage_num == 4 else 2 - - # Make the layer name and parameter name consistent - # with ImageNet pre-trained model - conv = input - for i in range(count): - conv_name = self.na.fix_layer_warp_name(stage_num, count, i) - if self.depth < 50: - is_first = True if i == 0 and stage_num == 2 else False - conv = block_func( - input=conv, - num_filters=ch_out, - stride=2 if i == 0 and stage_num != 2 else 1, - is_first=is_first, - name=conv_name, - dcn_v2=dcn_v2) - - # add non local model - dim_in = conv.shape[1] - nonlocal_name = "nonlocal_conv{}".format(stage_num) - if i % nonlocal_mod == nonlocal_mod - 1: - conv = add_space_nonlocal(conv, dim_in, dim_in, nonlocal_name + '_{}'.format(i), int(dim_in / 2)) - return conv - - def c1_stage(self, input): - out_chan = self._c1_out_chan_num - - conv1_name = self.na.fix_c1_stage_name() - - if self.variant in ['c', 'd']: - conv_def = [ - [out_chan // 2, 3, 2, "conv1_1"], - [out_chan // 2, 3, 1, "conv1_2"], - [out_chan, 3, 1, "conv1_3"], - ] - else: - conv_def = [[out_chan, 7, 2, conv1_name]] - - for (c, k, s, _name) in conv_def: - input = self._conv_norm(input=input, num_filters=c, filter_size=k, stride=s, act='relu', name=_name) - - output = fluid.layers.pool2d(input=input, pool_size=3, pool_stride=2, pool_padding=1, pool_type='max') - return output - - def __call__(self, input): - assert isinstance(input, Variable) - assert not (set(self.feature_maps) - set([2, 3, 4, 5])), \ - "feature maps {} not in [2, 3, 4, 5]".format(self.feature_maps) - - res_endpoints = [] - - res = input - feature_maps = self.feature_maps - severed_head = getattr(self, 'severed_head', False) - if not severed_head: - res = self.c1_stage(res) - feature_maps = range(2, max(self.feature_maps) + 1) - - for i in feature_maps: - res = self.layer_warp(res, i) - if i in self.feature_maps: - res_endpoints.append(res) - if self.freeze_at >= i: - res.stop_gradient = True - if self.get_prediction: - pool 
= fluid.layers.pool2d(input=res, pool_type='avg', global_pooling=True) - stdv = 1.0 / math.sqrt(pool.shape[1] * 1.0) - - out = fluid.layers.fc( - input=pool, - size=self.class_dim, - param_attr=fluid.param_attr.ParamAttr(initializer=fluid.initializer.Uniform(-stdv, stdv))) - out = fluid.layers.softmax(out) - return out - return OrderedDict( - [('res{}_sum'.format(self.feature_maps[idx]), feat) for idx, feat in enumerate(res_endpoints)]) - - -class ResNetC5(ResNet): - def __init__(self, - depth=50, - freeze_at=2, - norm_type='affine_channel', - freeze_norm=True, - norm_decay=0., - variant='b', - feature_maps=[5], - weight_prefix_name=''): - super(ResNetC5, self).__init__(depth, freeze_at, norm_type, freeze_norm, norm_decay, variant, feature_maps) - self.severed_head = True diff --git a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/retina_head.py b/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/retina_head.py deleted file mode 100644 index 1cde9e3202136fefc81c21812f805c456a12d548..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/retinanet_resnet50_fpn_coco2017/retina_head.py +++ /dev/null @@ -1,381 +0,0 @@ -# coding=utf-8 -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import numpy as np -import paddle.fluid as fluid -from paddle.fluid.param_attr import ParamAttr -from paddle.fluid.initializer import Normal, Constant -from paddle.fluid.regularizer import L2Decay - -__all__ = ['AnchorGenerator', 'RetinaTargetAssign', 'RetinaOutputDecoder', 'RetinaHead'] - - -class AnchorGenerator(object): - # __op__ = fluid.layers.anchor_generator - def __init__(self, - stride=[16.0, 16.0], - anchor_sizes=[32, 64, 128, 256, 512], - aspect_ratios=[0.5, 1., 2.], - variance=[1., 1., 1., 1.]): - self.anchor_sizes = anchor_sizes - self.aspect_ratios = aspect_ratios - self.variance = variance - self.stride = stride - - -class RetinaTargetAssign(object): - # __op__ = fluid.layers.retinanet_target_assign - def __init__(self, positive_overlap=0.5, negative_overlap=0.4): - self.positive_overlap = positive_overlap - self.negative_overlap = negative_overlap - - -class RetinaOutputDecoder(object): - # __op__ = fluid.layers.retinanet_detection_output - def __init__(self, score_thresh=0.05, nms_thresh=0.3, pre_nms_top_n=1000, detections_per_im=100, nms_eta=1.0): - super(RetinaOutputDecoder, self).__init__() - self.score_threshold = score_thresh - self.nms_threshold = nms_thresh - self.nms_top_k = pre_nms_top_n - self.keep_top_k = detections_per_im - self.nms_eta = nms_eta - - -class RetinaHead(object): - """ - Retina Head - - Args: - anchor_generator (object): `AnchorGenerator` instance - target_assign (object): `RetinaTargetAssign` instance - output_decoder (object): `RetinaOutputDecoder` instance - num_convs_per_octave (int): Number of convolution layers in each octave - num_chan (int): Number of octave output channels - max_level (int): Highest level of FPN output - min_level (int): Lowest level of FPN output - prior_prob (float): Used to set the bias init for the class prediction layer - base_scale (int): Anchors are generated based on this scale - num_scales_per_octave (int): Number of anchor scales per octave - num_classes (int): Number of classes - gamma (float): The parameter in focal loss - alpha (float): The parameter in focal loss - sigma (float): The parameter in smooth l1 loss - """ - __inject__ = ['anchor_generator', 'target_assign', 'output_decoder'] - __shared__ = ['num_classes'] - 
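With this head's default configuration (three aspect ratios, three scales per octave, 81 classes including background; see `__init__` below), the per-level output widths work out as follows, as a quick sanity check:

```python
num_aspect_ratios = 3      # [1.0, 2.0, 0.5]
num_scales_per_octave = 3
num_classes = 81           # includes background

num_anchors = num_scales_per_octave * num_aspect_ratios  # 9 anchors per cell
cls_channels = num_anchors * (num_classes - 1)           # 720: background gets no logit
bbox_channels = num_anchors * 4                          # 36: one (x, y, w, h) delta per anchor
print(num_anchors, cls_channels, bbox_channels)          # 9 720 36
```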
- def __init__(self, - anchor_generator=AnchorGenerator(), - target_assign=RetinaTargetAssign(), - output_decoder=RetinaOutputDecoder(), - num_convs_per_octave=4, - num_chan=256, - max_level=7, - min_level=3, - prior_prob=0.01, - base_scale=4, - num_scales_per_octave=3, - num_classes=81, - gamma=2.0, - alpha=0.25, - sigma=3.0151134457776365): - self.anchor_generator = anchor_generator - self.target_assign = target_assign - self.output_decoder = output_decoder - self.num_convs_per_octave = num_convs_per_octave - self.num_chan = num_chan - self.max_level = max_level - self.min_level = min_level - self.prior_prob = prior_prob - self.base_scale = base_scale - self.num_scales_per_octave = num_scales_per_octave - self.num_classes = num_classes - self.gamma = gamma - self.alpha = alpha - self.sigma = sigma - - def _class_subnet(self, body_feats, spatial_scale): - """ - Get class predictions of all level FPN level. - - Args: - fpn_dict(dict): A dictionary represents the output of FPN with - their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - - Returns: - cls_pred_input(list): Class prediction of all input fpn levels. - """ - assert len(body_feats) == self.max_level - self.min_level + 1 - fpn_name_list = list(body_feats.keys()) - cls_pred_list = [] - for lvl in range(self.min_level, self.max_level + 1): - fpn_name = fpn_name_list[self.max_level - lvl] - subnet_blob = body_feats[fpn_name] - for i in range(self.num_convs_per_octave): - conv_name = 'retnet_cls_conv_n{}_fpn{}'.format(i, lvl) - conv_share_name = 'retnet_cls_conv_n{}_fpn{}'.format(i, self.min_level) - subnet_blob_in = subnet_blob - subnet_blob = fluid.layers.conv2d( - input=subnet_blob_in, - num_filters=self.num_chan, - filter_size=3, - stride=1, - padding=1, - act='relu', - name=conv_name, - param_attr=ParamAttr(name=conv_share_name + '_w', initializer=Normal(loc=0., scale=0.01)), - bias_attr=ParamAttr(name=conv_share_name + '_b', learning_rate=2., regularizer=L2Decay(0.))) - - # class prediction - cls_name = 'retnet_cls_pred_fpn{}'.format(lvl) - cls_share_name = 'retnet_cls_pred_fpn{}'.format(self.min_level) - num_anchors = self.num_scales_per_octave * len(self.anchor_generator.aspect_ratios) - cls_dim = num_anchors * (self.num_classes - 1) - # bias initialization: b = -log((1 - pai) / pai) - bias_init = float(-np.log((1 - self.prior_prob) / self.prior_prob)) - out_cls = fluid.layers.conv2d( - input=subnet_blob, - num_filters=cls_dim, - filter_size=3, - stride=1, - padding=1, - act=None, - name=cls_name, - param_attr=ParamAttr(name=cls_share_name + '_w', initializer=Normal(loc=0., scale=0.01)), - bias_attr=ParamAttr( - name=cls_share_name + '_b', - initializer=Constant(value=bias_init), - learning_rate=2., - regularizer=L2Decay(0.))) - cls_pred_list.append(out_cls) - - return cls_pred_list - - def _bbox_subnet(self, body_feats, spatial_scale): - """ - Get bounding box predictions of all level FPN level. - - Args: - fpn_dict(dict): A dictionary represents the output of FPN with - their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - - Returns: - bbox_pred_input(list): Bounding box prediction of all input fpn - levels. 
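Both subnets above share their convolution parameters across pyramid levels: every level requests the `ParamAttr` names minted for `min_level` (the `conv_share_name` variables), so the framework creates each filter once and reuses it. A toy sketch of that naming mechanism, separate from this module:

```python
import paddle.fluid as fluid

x = fluid.data(name="x", shape=[-1, 8, 16, 16], dtype="float32")
# Two convs that name the same weight parameter end up sharing it.
shared = fluid.ParamAttr(name="shared_w")
y1 = fluid.layers.conv2d(x, num_filters=8, filter_size=3, padding=1,
                         param_attr=shared, bias_attr=False)
y2 = fluid.layers.conv2d(x, num_filters=8, filter_size=3, padding=1,
                         param_attr=shared, bias_attr=False)
# y1 and y2 are computed with the identical "shared_w" filter bank.
```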
- """ - assert len(body_feats) == self.max_level - self.min_level + 1 - fpn_name_list = list(body_feats.keys()) - bbox_pred_list = [] - for lvl in range(self.min_level, self.max_level + 1): - fpn_name = fpn_name_list[self.max_level - lvl] - subnet_blob = body_feats[fpn_name] - for i in range(self.num_convs_per_octave): - conv_name = 'retnet_bbox_conv_n{}_fpn{}'.format(i, lvl) - conv_share_name = 'retnet_bbox_conv_n{}_fpn{}'.format(i, self.min_level) - subnet_blob_in = subnet_blob - subnet_blob = fluid.layers.conv2d( - input=subnet_blob_in, - num_filters=self.num_chan, - filter_size=3, - stride=1, - padding=1, - act='relu', - name=conv_name, - param_attr=ParamAttr(name=conv_share_name + '_w', initializer=Normal(loc=0., scale=0.01)), - bias_attr=ParamAttr(name=conv_share_name + '_b', learning_rate=2., regularizer=L2Decay(0.))) - - # bbox prediction - bbox_name = 'retnet_bbox_pred_fpn{}'.format(lvl) - bbox_share_name = 'retnet_bbox_pred_fpn{}'.format(self.min_level) - num_anchors = self.num_scales_per_octave * len(self.anchor_generator.aspect_ratios) - bbox_dim = num_anchors * 4 - out_bbox = fluid.layers.conv2d( - input=subnet_blob, - num_filters=bbox_dim, - filter_size=3, - stride=1, - padding=1, - act=None, - name=bbox_name, - param_attr=ParamAttr(name=bbox_share_name + '_w', initializer=Normal(loc=0., scale=0.01)), - bias_attr=ParamAttr(name=bbox_share_name + '_b', learning_rate=2., regularizer=L2Decay(0.))) - bbox_pred_list.append(out_bbox) - return bbox_pred_list - - def _anchor_generate(self, body_feats, spatial_scale): - """ - Get anchor boxes of all level FPN level. - - Args: - fpn_dict(dict): A dictionary represents the output of FPN with their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - - Return: - anchor_input(list): Anchors of all input fpn levels with shape of. - anchor_var_input(list): Anchor variance of all input fpn levels with shape. - """ - assert len(body_feats) == self.max_level - self.min_level + 1 - fpn_name_list = list(body_feats.keys()) - anchor_list = [] - anchor_var_list = [] - for lvl in range(self.min_level, self.max_level + 1): - anchor_sizes = [] - stride = int(1 / spatial_scale[self.max_level - lvl]) - for octave in range(self.num_scales_per_octave): - anchor_size = stride * (2**(float(octave) / float(self.num_scales_per_octave))) * self.base_scale - anchor_sizes.append(anchor_size) - fpn_name = fpn_name_list[self.max_level - lvl] - anchor, anchor_var = fluid.layers.anchor_generator( - input=body_feats[fpn_name], - anchor_sizes=anchor_sizes, - aspect_ratios=self.anchor_generator.aspect_ratios, - stride=[stride, stride], - variance=self.anchor_generator.variance) - anchor_list.append(anchor) - anchor_var_list.append(anchor_var) - return anchor_list, anchor_var_list - - def _get_output(self, body_feats, spatial_scale): - """ - Get class, bounding box predictions and anchor boxes of all level FPN level. - - Args: - fpn_dict(dict): A dictionary represents the output of FPN with - their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - - Returns: - cls_pred_input(list): Class prediction of all input fpn levels. - bbox_pred_input(list): Bounding box prediction of all input fpn - levels. - anchor_input(list): Anchors of all input fpn levels with shape of. - anchor_var_input(list): Anchor variance of all input fpn levels with - shape. 
- """ - assert len(body_feats) == self.max_level - self.min_level + 1 - # class subnet - cls_pred_list = self._class_subnet(body_feats, spatial_scale) - # bbox subnet - bbox_pred_list = self._bbox_subnet(body_feats, spatial_scale) - #generate anchors - anchor_list, anchor_var_list = self._anchor_generate(body_feats, spatial_scale) - cls_pred_reshape_list = [] - bbox_pred_reshape_list = [] - anchor_reshape_list = [] - anchor_var_reshape_list = [] - for i in range(self.max_level - self.min_level + 1): - cls_pred_transpose = fluid.layers.transpose(cls_pred_list[i], perm=[0, 2, 3, 1]) - cls_pred_reshape = fluid.layers.reshape(cls_pred_transpose, shape=(0, -1, self.num_classes - 1)) - bbox_pred_transpose = fluid.layers.transpose(bbox_pred_list[i], perm=[0, 2, 3, 1]) - bbox_pred_reshape = fluid.layers.reshape(bbox_pred_transpose, shape=(0, -1, 4)) - anchor_reshape = fluid.layers.reshape(anchor_list[i], shape=(-1, 4)) - anchor_var_reshape = fluid.layers.reshape(anchor_var_list[i], shape=(-1, 4)) - cls_pred_reshape_list.append(cls_pred_reshape) - bbox_pred_reshape_list.append(bbox_pred_reshape) - anchor_reshape_list.append(anchor_reshape) - anchor_var_reshape_list.append(anchor_var_reshape) - output = {} - output['cls_pred'] = cls_pred_reshape_list - output['bbox_pred'] = bbox_pred_reshape_list - output['anchor'] = anchor_reshape_list - output['anchor_var'] = anchor_var_reshape_list - return output - - def get_prediction(self, body_feats, spatial_scale, im_info): - """ - Get prediction bounding box in test stage. - - Args: - fpn_dict(dict): A dictionary represents the output of FPN with - their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - im_info (Variable): A 2-D LoDTensor with shape [B, 3]. B is the - number of input images, each element consists of im_height, - im_width, im_scale. - - Returns: - pred_result(Variable): Prediction result with shape [N, 6]. Each - row has 6 values: [label, confidence, xmin, ymin, xmax, ymax]. - N is the total number of prediction. - """ - output = self._get_output(body_feats, spatial_scale) - cls_pred_reshape_list = output['cls_pred'] - bbox_pred_reshape_list = output['bbox_pred'] - anchor_reshape_list = output['anchor'] - for i in range(self.max_level - self.min_level + 1): - cls_pred_reshape_list[i] = fluid.layers.sigmoid(cls_pred_reshape_list[i]) - pred_result = fluid.layers.retinanet_detection_output( - bboxes=bbox_pred_reshape_list, - scores=cls_pred_reshape_list, - anchors=anchor_reshape_list, - im_info=im_info, - score_threshold=self.output_decoder.score_threshold, - nms_threshold=self.output_decoder.nms_threshold, - nms_top_k=self.output_decoder.nms_top_k, - keep_top_k=self.output_decoder.keep_top_k, - nms_eta=self.output_decoder.nms_eta) - return pred_result - - def get_loss(self, body_feats, spatial_scale, im_info, gt_box, gt_label, is_crowd): - """ - Calculate the loss of retinanet. - Args: - fpn_dict(dict): A dictionary represents the output of FPN with - their name. - spatial_scale(list): A list of multiplicative spatial scale factor. - im_info(Variable): A 2-D LoDTensor with shape [B, 3]. B is the - number of input images, each element consists of im_height, - im_width, im_scale. - gt_box(Variable): The ground-truth bounding boxes with shape [M, 4]. - M is the number of groundtruth. - gt_label(Variable): The ground-truth labels with shape [M, 1]. - M is the number of groundtruth. - is_crowd(Variable): Indicates groud-truth is crowd or not with - shape [M, 1]. M is the number of groundtruth. 
- - Returns: - Type: dict - loss_cls(Variable): focal loss. - loss_bbox(Variable): smooth l1 loss. - """ - output = self._get_output(body_feats, spatial_scale) - cls_pred_reshape_list = output['cls_pred'] - bbox_pred_reshape_list = output['bbox_pred'] - anchor_reshape_list = output['anchor'] - anchor_var_reshape_list = output['anchor_var'] - - cls_pred_input = fluid.layers.concat(cls_pred_reshape_list, axis=1) - bbox_pred_input = fluid.layers.concat(bbox_pred_reshape_list, axis=1) - anchor_input = fluid.layers.concat(anchor_reshape_list, axis=0) - anchor_var_input = fluid.layers.concat(anchor_var_reshape_list, axis=0) - score_pred, loc_pred, score_tgt, loc_tgt, bbox_weight, fg_num = \ - fluid.layers.rpn_target_assign( - bbox_pred=bbox_pred_input, - cls_logits=cls_pred_input, - anchor_box=anchor_input, - anchor_var=anchor_var_input, - gt_boxes=gt_box, - gt_labels=gt_label, - is_crowd=is_crowd, - im_info=im_info, - num_classes=self.num_classes - 1, - rpn_batch_size_per_im=self.target_assign.rpn_batch_size_per_im, - rpn_straddle_thresh=self.target_assign.rpn_straddle_thresh, - rpn_fg_fraction=self.target_assign.rpn_fg_fraction, - rpn_positive_overlap=self.target_assign.rpn_positive_overlap, - rpn_negative_overlap=self.target_assign.rpn_negative_overlap, - use_random=self.target_assign.use_random) - fg_num = fluid.layers.reduce_sum(fg_num, name='fg_num') - score_tgt = fluid.layers.cast(score_tgt, 'int32') - loss_cls = fluid.layers.sigmoid_focal_loss( - x=score_pred, label=score_tgt, fg_num=fg_num, gamma=self.gamma, alpha=self.alpha) - loss_cls = fluid.layers.reduce_sum(loss_cls, name='loss_cls') - loss_bbox = fluid.layers.smooth_l1( - x=loc_pred, y=loc_tgt, sigma=self.sigma, inside_weight=bbox_weight, outside_weight=bbox_weight) - loss_bbox = fluid.layers.reduce_sum(loss_bbox, name='loss_bbox') - loss_bbox = loss_bbox / fg_num - return {'loss_cls': loss_cls, 'loss_bbox': loss_bbox} diff --git a/modules/image/object_detection/yolov3_darknet53_pascalvoc/module.py b/modules/image/object_detection/yolov3_darknet53_pascalvoc/module.py deleted file mode 100644 index 2ec816e51989327ad8006b02e878fb86ab235c31..0000000000000000000000000000000000000000 --- a/modules/image/object_detection/yolov3_darknet53_pascalvoc/module.py +++ /dev/null @@ -1,325 +0,0 @@ -import os - -import paddle -import paddle.nn as nn -import paddle.nn.functional as F -from paddle.nn.initializer import Normal, Constant -from paddle.regularizer import L2Decay -from paddlehub.module.cv_module import Yolov3Module -import paddlehub.process.detect_transforms as T -from paddlehub.module.module import moduleinfo - - -class ConvBNLayer(nn.Layer): - """Basic block for Darknet""" - - def __init__(self, - ch_in: int, - ch_out: int, - filter_size: int = 3, - stride: int = 1, - groups: int = 1, - padding: int = 0, - act: str = 'leakly', - is_test: bool = False): - super(ConvBNLayer, self).__init__() - - self.conv = nn.Conv2d( - ch_in, - ch_out, - filter_size, - padding=padding, - stride=stride, - groups=groups, - weight_attr=paddle.ParamAttr(initializer=Normal(0., 0.02)), - bias_attr=False) - - self.batch_norm = nn.BatchNorm( - num_channels=ch_out, - is_test=is_test, - param_attr=paddle.ParamAttr(initializer=Normal(0., 0.02), regularizer=L2Decay(0.))) - self.act = act - - def forward(self, inputs: paddle.Tensor) -> paddle.Tensor: - out = self.conv(inputs) - out = self.batch_norm(out) - if self.act == "leakly": - out = F.leaky_relu(x=out, negative_slope=0.1) - return out - - -class DownSample(nn.Layer): - """Downsample block for Darknet""" - - 
def __init__(self, - ch_in: int, - ch_out: int, - filter_size: int = 3, - stride: int = 2, - padding: int = 1, - is_test: bool = False): - super(DownSample, self).__init__() - - self.conv_bn_layer = ConvBNLayer( - ch_in=ch_in, ch_out=ch_out, filter_size=filter_size, stride=stride, padding=padding, is_test=is_test) - self.ch_out = ch_out - - def forward(self, inputs: paddle.Tensor) -> paddle.Tensor: - out = self.conv_bn_layer(inputs) - return out - - -class BasicBlock(nn.Layer): - """Basic residual block for Darknet""" - - def __init__(self, ch_in: int, ch_out: int, is_test: bool = False): - super(BasicBlock, self).__init__() - - self.conv1 = ConvBNLayer(ch_in=ch_in, ch_out=ch_out, filter_size=1, stride=1, padding=0, is_test=is_test) - self.conv2 = ConvBNLayer(ch_in=ch_out, ch_out=ch_out * 2, filter_size=3, stride=1, padding=1, is_test=is_test) - - def forward(self, inputs: paddle.Tensor) -> paddle.Tensor: - conv1 = self.conv1(inputs) - conv2 = self.conv2(conv1) - out = paddle.elementwise_add(x=inputs, y=conv2, act=None) - return out - - -class LayerWarp(nn.Layer): - """Warp layer composed by basic residual blocks""" - - def __init__(self, ch_in: int, ch_out: int, count: int, is_test: bool = False): - super(LayerWarp, self).__init__() - self.basicblock0 = BasicBlock(ch_in, ch_out, is_test=is_test) - self.res_out_list = [] - for i in range(1, count): - res_out = self.add_sublayer("basic_block_%d" % (i), BasicBlock(ch_out * 2, ch_out, is_test=is_test)) - self.res_out_list.append(res_out) - self.ch_out = ch_out - - def forward(self, inputs: paddle.Tensor) -> paddle.Tensor: - y = self.basicblock0(inputs) - for basic_block_i in self.res_out_list: - y = basic_block_i(y) - return y - - -class DarkNet53_conv_body(nn.Layer): - """Darknet53 - Args: - ch_in(int): Input channels, default is 3. - is_test (bool): Set the test mode, default is True. 
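The backbone described here halves the resolution once up front (`downsample0`) and again between stages while doubling channels, and `forward` returns the last three stage outputs deepest first. Rough spatial bookkeeping for a 416x416 input, as a standalone sketch under those assumptions:

```python
size, ch = 416 // 2, 64        # after conv0 (stride 1) and downsample0
stage_blocks = [1, 2, 8, 8, 4]

shapes = []
for i in range(len(stage_blocks)):
    shapes.append((ch, size, size))          # output of stage i
    if i < len(stage_blocks) - 1:
        size, ch = size // 2, ch * 2         # inter-stage downsample
print(shapes[-1:-4:-1])  # [(1024, 13, 13), (512, 26, 26), (256, 52, 52)]
```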
- """ - - def __init__(self, ch_in: int = 3, is_test: bool = False): - super(DarkNet53_conv_body, self).__init__() - self.stages = [1, 2, 8, 8, 4] - self.stages = self.stages[0:5] - - self.conv0 = ConvBNLayer(ch_in=ch_in, ch_out=32, filter_size=3, stride=1, padding=1, is_test=is_test) - - self.downsample0 = DownSample(ch_in=32, ch_out=32 * 2, is_test=is_test) - self.darknet53_conv_block_list = [] - self.downsample_list = [] - ch_in = [64, 128, 256, 512, 1024] - - for i, stage in enumerate(self.stages): - conv_block = self.add_sublayer("stage_%d" % (i), - LayerWarp(int(ch_in[i]), 32 * (2**i), stage, is_test=is_test)) - self.darknet53_conv_block_list.append(conv_block) - - for i in range(len(self.stages) - 1): - downsample = self.add_sublayer( - "stage_%d_downsample" % i, DownSample( - ch_in=32 * (2**(i + 1)), ch_out=32 * (2**(i + 2)), is_test=is_test)) - self.downsample_list.append(downsample) - - def forward(self, inputs: paddle.Tensor) -> paddle.Tensor: - out = self.conv0(inputs) - out = self.downsample0(out) - blocks = [] - for i, conv_block_i in enumerate(self.darknet53_conv_block_list): - out = conv_block_i(out) - blocks.append(out) - if i < len(self.stages) - 1: - out = self.downsample_list[i](out) - return blocks[-1:-4:-1] - - -class YoloDetectionBlock(nn.Layer): - """Basic block for Yolov3""" - - def __init__(self, ch_in: int, channel: int, is_test: bool = True): - super(YoloDetectionBlock, self).__init__() - - assert channel % 2 == 0, \ - "channel {} cannot be divided by 2".format(channel) - - self.conv0 = ConvBNLayer(ch_in=ch_in, ch_out=channel, filter_size=1, stride=1, padding=0, is_test=is_test) - self.conv1 = ConvBNLayer(ch_in=channel, ch_out=channel * 2, filter_size=3, stride=1, padding=1, is_test=is_test) - self.conv2 = ConvBNLayer(ch_in=channel * 2, ch_out=channel, filter_size=1, stride=1, padding=0, is_test=is_test) - self.conv3 = ConvBNLayer(ch_in=channel, ch_out=channel * 2, filter_size=3, stride=1, padding=1, is_test=is_test) - self.route = ConvBNLayer(ch_in=channel * 2, ch_out=channel, filter_size=1, stride=1, padding=0, is_test=is_test) - self.tip = ConvBNLayer(ch_in=channel, ch_out=channel * 2, filter_size=3, stride=1, padding=1, is_test=is_test) - - def forward(self, inputs): - out = self.conv0(inputs) - out = self.conv1(out) - out = self.conv2(out) - out = self.conv3(out) - route = self.route(out) - tip = self.tip(route) - return route, tip - - -class Upsample(nn.Layer): - """Upsample block for Yolov3""" - - def __init__(self, scale: int = 2): - super(Upsample, self).__init__() - self.scale = scale - - def forward(self, inputs: paddle.Tensor): - shape_nchw = paddle.to_tensor(inputs.shape) - shape_hw = paddle.slice(shape_nchw, axes=[0], starts=[2], ends=[4]) - shape_hw.stop_gradient = True - in_shape = paddle.cast(shape_hw, dtype='int32') - out_shape = in_shape * self.scale - out_shape.stop_gradient = True - out = F.resize_nearest(input=inputs, scale=self.scale, actual_shape=out_shape) - return out - - -@moduleinfo( - name="yolov3_darknet53_pascalvoc", - type="CV/image_editing", - author="paddlepaddle", - author_email="", - summary="Yolov3 is a detection model, this module is trained with VOC dataset.", - version="1.0.0", - meta=Yolov3Module) -class YOLOv3(nn.Layer): - """YOLOV3 for detection - - Args: - ch_in(int): Input channels, default is 3. - class_num(int): Categories for detection,if dataset is voc, class_num is 20. - ignore_thresh(float): The ignore threshold to ignore confidence loss. 
- valid_thresh(float): Threshold to filter out bounding boxes with low confidence score. - nms_topk(int): Maximum number of detections to be kept according to the confidences after the filtering - detections based on score_threshold. - nms_posk(int): Number of total bboxes to be kept per image after NMS step. -1 means keeping all bboxes after NMS - step. - nms_thresh (float): The threshold to be used in NMS. Default: 0.3. - is_train (bool): Set the train mode, default is True. - load_checkpoint(str): Whether to load checkpoint. - """ - - def __init__(self, - ch_in: int = 3, - class_num: int = 20, - ignore_thresh: float = 0.7, - valid_thresh: float = 0.005, - nms_topk: int = 400, - nms_posk: int = 100, - nms_thresh: float = 0.45, - is_train: bool = True, - load_checkpoint: str = None): - super(YOLOv3, self).__init__() - - self.is_train = is_train - self.block = DarkNet53_conv_body(ch_in=ch_in, is_test=not self.is_train) - self.block_outputs = [] - self.yolo_blocks = [] - self.route_blocks_2 = [] - self.anchor_masks = [[6, 7, 8], [3, 4, 5], [0, 1, 2]] - self.anchors = [10, 13, 16, 30, 33, 23, 30, 61, 62, 45, 59, 119, 116, 90, 156, 198, 373, 326] - self.class_num = class_num - self.ignore_thresh = ignore_thresh - self.valid_thresh = valid_thresh - self.nms_topk = nms_topk - self.nms_posk = nms_posk - self.nms_thresh = nms_thresh - ch_in_list = [1024, 768, 384] - - for i in range(3): - yolo_block = self.add_sublayer( - "yolo_detecton_block_%d" % (i), - YoloDetectionBlock(ch_in_list[i], channel=512 // (2**i), is_test=not self.is_train)) - self.yolo_blocks.append(yolo_block) - - num_filters = len(self.anchor_masks[i]) * (self.class_num + 5) - block_out = self.add_sublayer( - "block_out_%d" % (i), - nn.Conv2d( - 1024 // (2**i), - num_filters, - 1, - stride=1, - padding=0, - weight_attr=paddle.ParamAttr(initializer=Normal(0., 0.02)), - bias_attr=paddle.ParamAttr(initializer=Constant(0.0), regularizer=L2Decay(0.)))) - self.block_outputs.append(block_out) - - if i < 2: - route = self.add_sublayer( - "route2_%d" % i, - ConvBNLayer( - ch_in=512 // (2**i), - ch_out=256 // (2**i), - filter_size=1, - stride=1, - padding=0, - is_test=(not self.is_train))) - self.route_blocks_2.append(route) - self.upsample = Upsample() - - if load_checkpoint is not None: - model_dict = paddle.load(load_checkpoint)[0] - self.set_dict(model_dict) - print("load custom checkpoint success") - - else: - checkpoint = os.path.join(self.directory, 'yolov3_darknet53_voc.pdparams') - if not os.path.exists(checkpoint): - os.system( - 'wget https://paddlehub.bj.bcebos.com/dygraph/detection/yolov3_darknet53_voc.pdparams -O ' \ - + checkpoint) - model_dict = paddle.load(checkpoint)[0] - self.set_dict(model_dict) - print("load pretrained checkpoint success") - - def transform(self, img): - if self.is_train: - transform = T.Compose([ - T.RandomDistort(), - T.RandomExpand(fill=[0.485, 0.456, 0.406]), - T.RandomCrop(), - T.Resize(target_size=416), - T.RandomFlip(), - T.ShuffleBox(), - T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) - ]) - else: - transform = T.Compose([ - T.Resize(target_size=416, interp='CUBIC'), - T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) - ]) - - return transform(img) - - def forward(self, inputs: paddle.Tensor): - outputs = [] - blocks = self.block(inputs) - route = None - for i, block in enumerate(blocks): - if i > 0: - block = paddle.concat([route, block], axis=1) - route, tip = self.yolo_blocks[i](block) - block_out = self.block_outputs[i](tip) - outputs.append(block_out) - 
if i < 2: - route = self.route_blocks_2[i](route) - route = self.upsample(route) - - return outputs diff --git a/modules/image/semantic_segmentation/Pneumonia_CT_LKM_PP/README.md b/modules/image/semantic_segmentation/Pneumonia_CT_LKM_PP/README.md new file mode 100644 index 0000000000000000000000000000000000000000..002bfeebc0201e7a825f24bfb850a3fb564a1c25 --- /dev/null +++ b/modules/image/semantic_segmentation/Pneumonia_CT_LKM_PP/README.md @@ -0,0 +1,92 @@ + +# Pneumonia_CT_LKM_PP + +|模型名称|Pneumonia_CT_LKM_PP| +| :--- | :---: | +|类别|图像-图像分割| +|网络|-| +|数据集|-| +|是否支持Fine-tuning|否| +|模型大小|35M| +|指标|-| +|最新更新日期|2021-02-26| + + +## 一、模型基本信息 + + +- ### 模型介绍 + + - 肺炎CT影像分析模型(Pneumonia-CT-LKM-PP)可以高效地完成对患者CT影像的病灶检测识别、病灶轮廓勾画,通过一定的后处理代码,可以分析输出肺部病灶的数量、体积、病灶占比等全套定量指标。值得强调的是,该系统采用的深度学习算法模型充分训练了所收集到的高分辨率和低分辨率的CT影像数据,能极好地适应不同等级CT影像设备采集的检查数据,有望为医疗资源受限和医疗水平偏低的基层医院提供有效的肺炎辅助诊断工具。 + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 + +- ### 2、安装 + + - ```shell + $ hub install Pneumonia_CT_LKM_PP==1.0.0 + ``` + + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、预测代码示例 + + ```python + import paddlehub as hub + + pneumonia = hub.Module(name="Pneumonia_CT_LKM_PP") + + input_only_lesion_np_path = "/PATH/TO/ONLY_LESION_NP" + input_both_lesion_np_path = "/PATH/TO/LESION_NP" + input_both_lung_np_path = "/PATH/TO/LUNG_NP" + + # set input dict + input_dict = {"image_np_path": [ + [input_only_lesion_np_path], + [input_both_lesion_np_path, input_both_lung_np_path], + ]} + + # execute predict and print the result + results = pneumonia.segmentation(data=input_dict) + for result in results: + print(result) + + ``` + + +- ### 2、API + + ```python + def segmentation(data) + ``` + + - 预测API,用于肺炎CT影像分析。 + + - **参数** + + * data (dict): key,str类型,"image_np_path";value,list类型,每个元素为list类型,[用于病灶分析的影像numpy数组(文件后缀名.npy)路径, 用于肺部分割的影像numpy数组路径],如果仅进行病灶分析不进行肺部分割,可以省略用于肺部分割的影像numpy数组路径 + + + - **返回** + + * result (list\[dict\]): 每个元素为对应输入的预测结果。每个预测结果为dict类型:预测结果有以下字段: + * input_lesion_np_path: 存放用于病灶分析的numpy数组路径; + * output_lesion_np: 存放病灶分析结果,numpy数组; + * input_lesion_np_path:存放用于肺部分割的numpy数组路径(仅当对应输入包含肺部影像numpy时存在该字段) + * output_lung_np:存放肺部分割结果,numpy数组(仅当对应输入包含肺部影像numpy时存在该字段) + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 diff --git a/modules/image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung/README.md b/modules/image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung/README.md new file mode 100644 index 0000000000000000000000000000000000000000..24a6df13d15294b3c0f859aff5a7ff20befd9b9a --- /dev/null +++ b/modules/image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung/README.md @@ -0,0 +1,92 @@ + +# Pneumonia_CT_LKM_PP_lung + +|模型名称|Pneumonia_CT_LKM_PP_lung| +| :--- | :---: | +|类别|图像-图像分割| +|网络|-| +|数据集|-| +|是否支持Fine-tuning|否| +|模型大小|35M| +|指标|-| +|最新更新日期|2021-02-26| + + +## 一、模型基本信息 + + +- ### 模型介绍 + + - 肺炎CT影像分析模型(Pneumonia-CT-LKM-PP)可以高效地完成对患者CT影像的病灶检测识别、病灶轮廓勾画,通过一定的后处理代码,可以分析输出肺部病灶的数量、体积、病灶占比等全套定量指标。值得强调的是,该系统采用的深度学习算法模型充分训练了所收集到的高分辨率和低分辨率的CT影像数据,能极好地适应不同等级CT影像设备采集的检查数据,有望为医疗资源受限和医疗水平偏低的基层医院提供有效的肺炎辅助诊断工具。(此module为Pneumonia_CT_LKM_PP的子module。) + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 + +- ### 2、安装 + + - ```shell + $ hub install Pneumonia_CT_LKM_PP_lung==1.0.0 + ``` + + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | 
[零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ### 1、预测代码示例 + + ```python + import paddlehub as hub + + pneumonia = hub.Module(name="Pneumonia_CT_LKM_PP_lung") + + input_only_lesion_np_path = "/PATH/TO/ONLY_LESION_NP" + input_both_lesion_np_path = "/PATH/TO/LESION_NP" + input_both_lung_np_path = "/PATH/TO/LUNG_NP" + + # set input dict + input_dict = {"image_np_path": [ + [input_only_lesion_np_path], + [input_both_lesion_np_path, input_both_lung_np_path], + ]} + + # execute predict and print the result + results = pneumonia.segmentation(data=input_dict) + for result in results: + print(result) + + ``` + + +- ### 2、API + + ```python + def segmentation(data) + ``` + + - 预测API,用于肺炎CT影像分析。 + + - **参数** + + * data (dict): key,str类型,"image_np_path";value,list类型,每个元素为list类型,[用于病灶分析的影像numpy数组(文件后缀名.npy)路径, 用于肺部分割的影像numpy数组路径],如果仅进行病灶分析不进行肺部分割,可以省略用于肺部分割的影像numpy数组路径 + + + - **返回** + + * result (list\[dict\]): 每个元素为对应输入的预测结果。每个预测结果为dict类型:预测结果有以下字段: + * input_lesion_np_path: 存放用于病灶分析的numpy数组路径; + * output_lesion_np: 存放病灶分析结果,numpy数组; + * input_lesion_np_path:存放用于肺部分割的numpy数组路径(仅当对应输入包含肺部影像numpy时存在该字段) + * output_lung_np:存放肺部分割结果,numpy数组(仅当对应输入包含肺部影像numpy时存在该字段) + + +## 四、更新历史 + +* 1.0.0 + + 初始发布 diff --git a/modules/text/embedding/tencent_ailab_chinese_embedding/README.md b/modules/text/embedding/tencent_ailab_chinese_embedding/README.md deleted file mode 100644 index 75ed2880215bb8f2f7e295093155295b505b7c99..0000000000000000000000000000000000000000 --- a/modules/text/embedding/tencent_ailab_chinese_embedding/README.md +++ /dev/null @@ -1,49 +0,0 @@ -## 概述 - -Tencent_AILab_ChineseEmbedding提供了基于海量中文语料训练学习得到的800多万个中文词语和短语的词向量表示,每一个词向量为200维。可以用于各种下游任务迁移学习。 - -更多详情参考: https://ai.tencent.com/ailab/nlp/en/embedding.html - -注:该Module由第三方开发者DesmonDay贡献。 - -## API - -```python -def context(trainable=False, max_seq_len=128, num_slots=1) -``` - -获取该Module的预训练program以及program相应的输入输出。 - -**参数** - -* trainable(bool): trainable=True表示program中的参数在Fine-tune时需要微调,否则保持不变。 -* max_seq_len(int): 模型使用的最大序列长度。 -* num_slots(int): 输入到模型所需要的文本个数,如完成单句文本分类任务,则num_slots=1;完成pointwise文本匹配任务,则num_slots=2;完成pairtwise文本匹配任务,则num_slots=3; - -**返回** - -* inputs(dict): program的输入变量 -* outputs(dict): program的输出变量 -* main_program(Program): 带有预训练参数的program - -### 代码示例 - -```python -import paddlehub as hub -import cv2 - -tencent_ailab_chinese_embedding = hub.Module(name="tencent_ailab_chinese_embedding") -inputs, outputs, program = tencent_ailab_chinese_embedding.context(trainable=True, max_seq_len=128, num_slots=1) -``` - -## 依赖 - -paddlepaddle >= 1.8.2 - -paddlehub >= 1.8.0 - -## 更新历史 - -* 1.0.0 - - 初始发布 diff --git a/modules/text/embedding/tencent_ailab_chinese_embedding/module.py b/modules/text/embedding/tencent_ailab_chinese_embedding/module.py deleted file mode 100644 index 7c2785bcfdda3e7fb01e7a85ac49942b343bd477..0000000000000000000000000000000000000000 --- a/modules/text/embedding/tencent_ailab_chinese_embedding/module.py +++ /dev/null @@ -1,149 +0,0 @@ -# -*- coding:utf-8 -*- -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import io -import os - -import paddle.fluid as fluid -import paddlehub as hub -from paddlehub.common.paddle_helper import add_vars_prefix -from paddlehub.module.module import moduleinfo - - -def load_vocab(file_path): - """ - load the given vocabulary - """ - vocab = {} - with io.open(file_path, 'r', encoding='utf8') as f: - for line in 
f: - parts = line.split("\t") - vocab[parts[0]] = int(parts[1]) - - return vocab - - -@moduleinfo( - name="tencent_ailab_chinese_embedding", - version="1.0.0", - summary= - "Tencent AI Lab Embedding Corpus for Chinese Words and Phrases and the vocab size is 8,824,331. For more information, please refer to https://ai.tencent.com/ailab/nlp/zh/embedding.html", - author="", - author_email="", - type="nlp/semantic_model") -class TencentAILabChineseEmbedding(hub.Module): - def _initialize(self): - """ - initialize with the necessary elements - """ - self.pretrained_model_path = os.path.join(self.directory, "assets", "model") - self.vocab_path = os.path.join(self.directory, "assets", "vocab.txt") - self.vocab = load_vocab(self.vocab_path) - - def context(self, trainable=False, max_seq_len=128, num_slots=1): - """ - Get the input ,output and program of the pretrained tencent_ailab_chinese_embedding - - Args: - trainable(bool): whether fine-tune the pretrained parameters of simnet_bow or not - num_slots(int): It's number of slots inputted to the model, selectted as following options: - - - 1(default): There's only one data to be feeded in the model, e.g. the module is used for sentence classification task. - - 2: There are two data to be feeded in the model, e.g. the module is used for text matching task (point-wise). - - 3: There are three data to be feeded in the model, e.g. the module is used for text matching task (pair-wise). - - Returns: - inputs(dict): the input variables of tencent_ailab_chinese_embedding (words) - outputs(dict): the output variables of input words (word embeddings) - main_program(Program): the main_program of tencent_ailab_chinese_embedding with pretrained prameters - """ - assert num_slots >= 1 and num_slots <= 3, "num_slots must be 1, 2, or 3, but the input is %d" % num_slots - main_program = fluid.Program() - startup_program = fluid.Program() - with fluid.program_guard(main_program, startup_program): - with fluid.unique_name.guard(): - w_param_attrs = fluid.ParamAttr( - name="embedding_0.w_0", - initializer=fluid.initializer.TruncatedNormal(scale=0.02), - trainable=trainable) - - text_1 = fluid.data(name='text', shape=[-1, max_seq_len], dtype='int64', lod_level=0) - emb_1 = fluid.embedding( - input=text_1, - size=[len(self.vocab), 200], - is_sparse=True, - padding_idx=len(self.vocab) - 1, - dtype='float32', - param_attr=w_param_attrs) - emb_1_name = emb_1.name - data_list = [text_1] - emb_name_list = [emb_1_name] - - if num_slots > 1: - text_2 = fluid.data(name='text_2', shape=[-1, max_seq_len], dtype='int64', lod_level=0) - emb_2 = fluid.embedding( - input=text_2, - size=[len(self.vocab), 200], - is_sparse=True, - padding_idx=len(self.vocab) - 1, - dtype='float32', - param_attr=w_param_attrs) - emb_2_name = emb_2.name - data_list.append(text_2) - emb_name_list.append(emb_2_name) - - if num_slots > 2: - text_3 = fluid.data(name='text_3', shape=[-1, max_seq_len], dtype='int64', lod_level=0) - emb_3 = fluid.embedding( - input=text_3, - size=[len(self.vocab), 200], - is_sparse=True, - padding_idx=len(self.vocab) - 1, - dtype='float32', - param_attr=w_param_attrs) - emb_3_name = emb_3.name - data_list.append(text_3) - emb_name_list.append(emb_3_name) - - variable_names = filter(lambda v: v not in ['text', 'text_2', 'text_3'], - list(main_program.global_block().vars.keys())) - - prefix_name = "@HUB_{}@".format(self.name) - add_vars_prefix(program=main_program, prefix=prefix_name, vars=variable_names) - for param in main_program.global_block().iter_parameters(): - 
param.trainable = trainable - - place = fluid.CPUPlace() - exe = fluid.Executor(place) - - # load the pretrained model - def if_exist(var): - return os.path.exists(os.path.join(self.pretrained_model_path, var.name)) - - fluid.io.load_vars(exe, self.pretrained_model_path, predicate=if_exist) - - inputs = {} - outputs = {} - for index, data in enumerate(data_list): - if index == 0: - inputs['text'] = data - outputs['emb'] = main_program.global_block().vars[prefix_name + emb_name_list[0]] - else: - inputs['text_%s' % (index + 1)] = data - outputs['emb_%s' % (index + 1)] = main_program.global_block().vars[prefix_name + - emb_name_list[index]] - - return inputs, outputs, main_program - - def get_vocab_path(self): - return self.vocab_path - - -if __name__ == "__main__": - w2v = TencentAILabChineseEmbedding() - inputs, outputs, program = w2v.context(num_slots=3) - print(inputs) - print(outputs) - print(w2v.get_vocab_path()) diff --git a/modules/text/embedding/tencent_ailab_chinese_embedding_small/README.md b/modules/text/embedding/tencent_ailab_chinese_embedding_small/README.md deleted file mode 100644 index c5d2b84f24f097c6cc8fae58fe3e26348c36f315..0000000000000000000000000000000000000000 --- a/modules/text/embedding/tencent_ailab_chinese_embedding_small/README.md +++ /dev/null @@ -1,50 +0,0 @@ -## 概述 - -Tencent_AILab_ChineseEmbedding提供了基于海量中文语料训练学习得到的800多万个中文词语和短语的词向量表示,每一个词向量为200维。 -该Module截取了原来词汇表中前200万的词语,同样可以用于各种下游任务迁移学习。 - -更多详情参考: https://ai.tencent.com/ailab/nlp/en/embedding.html - -注:该Module由第三方开发者DesmonDay贡献。 - -## API - -```python -def context(trainable=False, max_seq_len=128, num_slots=1) -``` - -获取该Module的预训练program以及program相应的输入输出。 - -**参数** - -* trainable(bool): trainable=True表示program中的参数在Fine-tune时需要微调,否则保持不变。 -* max_seq_len(int): 模型使用的最大序列长度。 -* num_slots(int): 输入到模型所需要的文本个数,如完成单句文本分类任务,则num_slots=1;完成pointwise文本匹配任务,则num_slots=2;完成pairtwise文本匹配任务,则num_slots=3; - -**返回** - -* inputs(dict): program的输入变量 -* outputs(dict): program的输出变量 -* main_program(Program): 带有预训练参数的program - -### 代码示例 - -```python -import paddlehub as hub -import cv2 - -tencent_ailab_chinese_embedding = hub.Module(name="tencent_ailab_chinese_embedding_small") -inputs, outputs, program = tencent_ailab_chinese_embedding.context(trainable=True, max_seq_len=128, num_slots=1) -``` - -## 依赖 - -paddlepaddle >= 1.8.2 - -paddlehub >= 1.8.0 - -## 更新历史 - -* 1.0.0 - - 初始发布 diff --git a/modules/text/embedding/tencent_ailab_chinese_embedding_small/module.py b/modules/text/embedding/tencent_ailab_chinese_embedding_small/module.py deleted file mode 100644 index b77f6885e2fc0197d70fe1e82203b56203316dfc..0000000000000000000000000000000000000000 --- a/modules/text/embedding/tencent_ailab_chinese_embedding_small/module.py +++ /dev/null @@ -1,149 +0,0 @@ -# -*- coding:utf-8 -*- -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import io -import os - -import paddle.fluid as fluid -import paddlehub as hub -from paddlehub.common.paddle_helper import add_vars_prefix -from paddlehub.module.module import moduleinfo - - -def load_vocab(file_path): - """ - load the given vocabulary - """ - vocab = {} - with io.open(file_path, 'r', encoding='utf8') as f: - for line in f: - parts = line.split("\t") - vocab[parts[0]] = int(parts[1]) - - return vocab - - -@moduleinfo( - name="tencent_ailab_chinese_embedding_small", - version="1.0.0", - summary= - "Tencent AI Lab Embedding Corpus for Chinese Words and Phrases and the vocab size is 2,000,002. 
For more information, please refer to https://ai.tencent.com/ailab/nlp/zh/embedding.html", - author="", - author_email="", - type="nlp/semantic_model") -class TencentAILabChineseEmbeddingSmall(hub.Module): - def _initialize(self): - """ - initialize with the necessary elements - """ - self.pretrained_model_path = os.path.join(self.directory, "assets", "model") - self.vocab_path = os.path.join(self.directory, "assets", "vocab.txt") - self.vocab = load_vocab(self.vocab_path) - - def context(self, trainable=False, max_seq_len=128, num_slots=1): - """ - Get the input ,output and program of the pretrained word2vec_skipgram - - Args: - trainable(bool): Whether fine-tune the pretrained parameters of tencent_ailab_chinese_embedding_small or not. - num_slots(int): It's number of data inputted to the model, selectted as following options: - - - 1(default): There's only one data to be feeded in the model, e.g. the module is used for sentence classification task. - - 2: There are two data to be feeded in the model, e.g. the module is used for text matching task (point-wise). - - 3: There are three data to be feeded in the model, e.g. the module is used for text matching task (pair-wise). - - Returns: - inputs(dict): the input variables of tencent_ailab_chinese_embedding_small (words) - outputs(dict): the output variables of input words (word embeddings) - main_program(Program): the main_program of tencent_ailab_chinese_embedding_small with pretrained prameters - """ - assert num_slots >= 1 and num_slots <= 3, "num_slots must be 1, 2, or 3, but the input is %d" % num_slots - main_program = fluid.Program() - startup_program = fluid.Program() - with fluid.program_guard(main_program, startup_program): - with fluid.unique_name.guard(): - w_param_attrs = fluid.ParamAttr( - name="embedding_0.w_0", - initializer=fluid.initializer.TruncatedNormal(scale=0.02), - trainable=trainable) - - text_1 = fluid.data(name='text', shape=[-1, max_seq_len], dtype='int64', lod_level=0) - emb_1 = fluid.embedding( - input=text_1, - size=[len(self.vocab), 200], - is_sparse=True, - padding_idx=len(self.vocab) - 1, - dtype='float32', - param_attr=w_param_attrs) - emb_1_name = emb_1.name - data_list = [text_1] - emb_name_list = [emb_1_name] - - if num_slots > 1: - text_2 = fluid.data(name='text_2', shape=[-1, max_seq_len], dtype='int64', lod_level=0) - emb_2 = fluid.embedding( - input=text_2, - size=[len(self.vocab), 200], - is_sparse=True, - padding_idx=len(self.vocab) - 1, - dtype='float32', - param_attr=w_param_attrs) - emb_2_name = emb_2.name - data_list.append(text_2) - emb_name_list.append(emb_2_name) - - if num_slots > 2: - text_3 = fluid.data(name='text_3', shape=[-1, max_seq_len], dtype='int64', lod_level=0) - emb_3 = fluid.embedding( - input=text_3, - size=[len(self.vocab), 200], - is_sparse=True, - padding_idx=len(self.vocab) - 1, - dtype='float32', - param_attr=w_param_attrs) - emb_3_name = emb_3.name - data_list.append(text_3) - emb_name_list.append(emb_3_name) - - variable_names = filter(lambda v: v not in ['text', 'text_2', 'text_3'], - list(main_program.global_block().vars.keys())) - - prefix_name = "@HUB_{}@".format(self.name) - add_vars_prefix(program=main_program, prefix=prefix_name, vars=variable_names) - for param in main_program.global_block().iter_parameters(): - param.trainable = trainable - - place = fluid.CPUPlace() - exe = fluid.Executor(place) - - # load the pretrained model - def if_exist(var): - return os.path.exists(os.path.join(self.pretrained_model_path, var.name)) - - fluid.io.load_vars(exe, 
self.pretrained_model_path, predicate=if_exist)
-
-        inputs = {}
-        outputs = {}
-        for index, data in enumerate(data_list):
-            if index == 0:
-                inputs['text'] = data
-                outputs['emb'] = main_program.global_block().vars[prefix_name + emb_name_list[0]]
-            else:
-                inputs['text_%s' % (index + 1)] = data
-                outputs['emb_%s' % (index + 1)] = main_program.global_block().vars[prefix_name +
-                                                                                   emb_name_list[index]]
-
-        return inputs, outputs, main_program
-
-    def get_vocab_path(self):
-        return self.vocab_path
-
-
-if __name__ == "__main__":
-    w2v = TencentAILabChineseEmbeddingSmall()
-    inputs, outputs, program = w2v.context(num_slots=3)
-    print(inputs)
-    print(outputs)
-    print(w2v.get_vocab_path())
diff --git a/modules/text/machine_translation/transformer/en-de/README.md b/modules/text/machine_translation/transformer/en-de/README.md
index 586186ed26bcadcd52bc82575e2cdae395cf0690..5e93e9bdc1548a0fcef57db8089f9156e48c7ade 100644
--- a/modules/text/machine_translation/transformer/en-de/README.md
+++ b/modules/text/machine_translation/transformer/en-de/README.md
@@ -1,120 +1,141 @@
-```shell
-$ hub install transformer_en-de==1.0.0
-```
+# transformer_en-de
+|模型名称|transformer_en-de|
+| :--- | :---: |
+|类别|文本-机器翻译|
+|网络|Transformer|
+|数据集|WMT14 EN-DE|
+|是否支持Fine-tuning|否|
+|模型大小|481MB|
+|最新更新日期|2021-07-21|
+|数据指标|-|
 
-## 概述
+## 一、模型基本信息
 
-2017 年,Google机器翻译团队在其发表的论文[Attention Is All You Need](https://arxiv.org/abs/1706.03762)中,提出了用于完成机器翻译(Machine Translation)等序列到序列(Seq2Seq)学习任务的一种全新网络结构——Transformer。Tranformer网络完全使用注意力(Attention)机制来实现序列到序列的建模,并且取得了很好的效果。
+- ### 模型介绍
 
-transformer_en-de包含6层的transformer结构,头数为8,隐藏层参数为512,参数量为64M。该模型在[WMT'14 EN-DE数据集](http://www.statmt.org/wmt14/translation-task.html)进行了预训练,加载后可直接用于预测,提供了英文翻译为德文的能力。
+  - 2017 年,Google机器翻译团队在其发表的论文[Attention Is All You Need](https://arxiv.org/abs/1706.03762)中,提出了用于完成机器翻译(Machine Translation)等序列到序列(Seq2Seq)学习任务的一种全新网络结构——Transformer。Transformer网络完全使用注意力(Attention)机制来实现序列到序列的建模,并且取得了很好的效果。
 
-关于机器翻译的Transformer模型训练方式和详情,可查看[Machine Translation using Transformer](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/machine_translation/transformer)。
+  - transformer_en-de包含6层的transformer结构,头数为8,隐藏层参数为512,参数量为64M。该模型在[WMT'14 EN-DE数据集](http://www.statmt.org/wmt14/translation-task.html)进行了预训练,加载后可直接用于预测,提供了英文翻译为德文的能力。
 
-## API
+  - 关于机器翻译的Transformer模型训练方式和详情,可查看[Machine Translation using Transformer](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/machine_translation/transformer)。
 
+## 二、安装
 
-```python
-def __init__(max_length: int = 256,
-             max_out_len: int = 256,
-             beam_size: int = 5):
-```
-初始化module,可配置模型的输入输出文本的最大长度和解码时beam search的宽度。
+- ### 1、环境依赖
 
-**参数**
-- `max_length`(int): 输入文本的最大长度,默认值为256。
-- `max_out_len`(int): 输出文本的最大解码长度,默认值为256。
-- `beam_size`(int): beam search方式解码的beam宽度,默认为5。
+  - paddlepaddle >= 2.1.0
+
+  - paddlehub >= 2.1.0    | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
 
+- ### 2、安装
 
-```python
-def predict(data: List[str],
-            batch_size: int = 1,
-            n_best: int = 1,
-            use_gpu: bool = False):
-```
-预测API,输入源语言的文本句子,解码后输出翻译后的目标语言的文本候选句子。
+  - ```shell
+    $ hub install transformer_en-de
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+## 三、模型API预测
+
+- ### 1、预测代码示例
 
-**参数**
-- `data`(List[str]): 源语言的文本列表,数据类型为List[str]
-- `batch_size`(int): 进行预测的batch_size,默认为1
-- `n_best`(int): 每个输入文本经过模型解码后,输出的得分最高的候选句子的数量,必须小于beam_size,默认为1
-- `use_gpu`(bool): 是否使用gpu执行预测,默认为False
+  - ```python
+    import paddlehub as hub
 
-**返回**
-* `results`(List[str]): 翻译后的目标语言的候选句子,长度为`len(data)*n_best`
+    model = hub.Module(name='transformer_en-de', beam_size=5)
+    src_texts = [
+        'What are you doing now?',
+        'The change was for the better; I eat well, I exercise, I take my drugs.',
+        'Such experiments are not conducted for ethical reasons.',
+    ]
+    n_best = 3  # 每个输入样本的输出候选句子数量
+    trg_texts = model.predict(src_texts, n_best=n_best)
+    for idx, st in enumerate(src_texts):
+        print('-'*30)
+        print(f'src: {st}')
+        for i in range(n_best):
+            print(f'trg[{i+1}]: {trg_texts[idx*n_best+i]}')
+    ```
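
上面的预测代码示例直接按下标读取扁平的 `trg_texts` 列表。下面补充一个最小示例(仅基于本README已说明的行为:`predict()` 返回长度为 `len(data)*n_best` 的扁平列表,且 `n_best` 必须小于 `beam_size`),将候选译文按源句分组:

```python
import paddlehub as hub

# 最小示例:仅假设 predict() 按文档约定返回 len(src_texts) * n_best 个候选句子
model = hub.Module(name='transformer_en-de', beam_size=5)
src_texts = ['What are you doing now?']
n_best = 3  # 合法取值,因为 3 < beam_size(5)

flat = model.predict(src_texts, n_best=n_best)
# 每个源句的 n_best 个候选在扁平列表中是连续存放的,按此切片即可分组
grouped = [flat[i * n_best:(i + 1) * n_best] for i in range(len(src_texts))]
for src, candidates in zip(src_texts, grouped):
    print(src, '->', candidates)
```

这种切片方式成立的前提正是文档给出的 `len(data)*n_best` 排列顺序;若后续版本改变返回结构,请以实际返回值为准。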
-**代码示例**
+- ### 2、API
 
-```python
-import paddlehub as hub
+  - ```python
+    def __init__(max_length: int = 256,
+                 max_out_len: int = 256,
+                 beam_size: int = 5):
+    ```
+
+    - 初始化module,可配置模型的输入输出文本的最大长度和解码时beam search的宽度。
+
+    - **参数**
 
-model = hub.Module(name='transformer_en-de', beam_size=5)
-src_texts = [
-    'What are you doing now?',
-    'The change was for the better; I eat well, I exercise, I take my drugs.',
-    'Such experiments are not conducted for ethical reasons.',
-]
+      - `max_length`(int): 输入文本的最大长度,默认值为256。
+      - `max_out_len`(int): 输出文本的最大解码长度,默认值为256。
+      - `beam_size`(int): beam search方式解码的beam宽度,默认为5。
 
-n_best = 3  # 每个输入样本的输出候选句子数量
-trg_texts = model.predict(src_texts, n_best=n_best)
-for idx, st in enumerate(src_texts):
-    print('-'*30)
-    print(f'src: {st}')
-    for i in range(n_best):
-        print(f'trg[{i+1}]: {trg_texts[idx*n_best+i]}')
-```
+  - ```python
+    def predict(data: List[str],
+                batch_size: int = 1,
+                n_best: int = 1,
+                use_gpu: bool = False):
+    ```
 
-## 服务部署
+    - 预测API,输入源语言的文本句子,解码后输出翻译后的目标语言的文本候选句子。
 
-通过启动PaddleHub Serving,可以加载模型部署在线翻译服务。
+    - **参数**
 
-### Step1: 启动PaddleHub Serving
+      - `data`(List[str]): 源语言的文本列表,数据类型为List[str]
+      - `batch_size`(int): 进行预测的batch_size,默认为1
+      - `n_best`(int): 每个输入文本经过模型解码后,输出的得分最高的候选句子的数量,必须小于beam_size,默认为1
+      - `use_gpu`(bool): 是否使用gpu执行预测,默认为False
+
+    - **返回**
 
-运行启动命令:
+      - `results`(List[str]): 翻译后的目标语言的候选句子,长度为`len(data)*n_best`
 
-```shell
-$ hub serving start -m transformer_en-de
-```
+## 四、服务部署
 
-通过以上命令可完成一个英德机器翻译API的部署,默认端口号为8866。
+- 通过启动PaddleHub Serving,可以加载模型部署在线翻译服务。
 
-**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。
+- ### 第一步:启动PaddleHub Serving
 
-### Step2: 发送预测请求
+  - 运行启动命令:
 
-配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
+  - ```shell
+    $ hub serving start -m transformer_en-de
+    ```
 
-```python
-import requests
-import json
+  - 通过以上命令可完成一个英德机器翻译API的部署,默认端口号为8866。
 
-texts = [
-    'What are you doing now?',
-    'The change was for the better; I eat well, I exercise, I take my drugs.',
-    'Such experiments are not conducted for ethical reasons.',
-]
-data = {"data": texts}
-# 发送post请求,content-type类型应指定json方式,url中的ip地址需改为对应机器的ip
-url = "http://127.0.0.1:8866/predict/transformer_en-de"
-# 指定post请求的headers为application/json方式
-headers = {"Content-Type": "application/json"}
+  - **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA_VISIBLE_DEVICES环境变量,否则无需设置。
 
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
-print(r.json())
-```
+- ### 第二步:发送预测请求
 
-## 查看代码
+  - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
 
-https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/machine_translation/transformer
+  - ```python
+    import requests
+    import json
 
-## 依赖
+    texts = [
+        'What are you doing now?',
+        'The change was for the better; I eat well, I exercise, I take my drugs.',
+        'Such experiments are not conducted for ethical reasons.',
+    ]
+    data = {"data": texts}
+    # 发送post请求,content-type类型应指定json方式,url中的ip地址需改为对应机器的ip
+    url = "http://127.0.0.1:8866/predict/transformer_en-de"
+    # 指定post请求的headers为application/json方式
+    headers = {"Content-Type": "application/json"}
 
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    print(r.json())
+    ```
+
+  - 关于PaddleHub Serving更多信息参考:[服务部署](../../../../docs/docs_ch/tutorial/serving.md)
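
上文的服务部署示例止步于 `print(r.json())`。下面的示例进一步解析返回结果;其中返回体的具体结构是一个假设(PaddleHub Serving 通常将结果包在顶层的 `"results"` 字段中),使用前请先在自己部署的服务上打印 `r.json()` 确认:

```python
import json

import requests

texts = ['What are you doing now?']
url = "http://127.0.0.1:8866/predict/transformer_en-de"
headers = {"Content-Type": "application/json"}

r = requests.post(url=url, headers=headers, data=json.dumps({"data": texts}))
r.raise_for_status()  # 让HTTP层面的错误尽早暴露

payload = r.json()
# 假设:候选译文位于顶层 "results" 列表中;若无该字段则回退为原始返回体
results = payload.get("results", payload) if isinstance(payload, dict) else payload
# 服务端使用 predict() 的默认 n_best=1,此时输出与输入一一对应
for src, trg in zip(texts, results):
    print(f"{src} -> {trg}")
```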
-paddlepaddle >= 2.0.0
 
-paddlehub >= 2.1.0
 
-## 更新历史
+## 五、更新历史
 
 * 1.0.0
 
@@ -123,3 +144,6 @@ paddlehub >= 2.1.0
 * 1.0.1
 
   修复模型初始化的兼容性问题
+
+  - ```shell
+    $ hub install transformer_en-de==1.0.1
+    ```
diff --git a/modules/text/machine_translation/transformer/en-de/module.py b/modules/text/machine_translation/transformer/en-de/module.py
index ed60c5a82091b5e72f8e30d7a66741b7d3b71128..75b0389b8a2cc02118b40d34b6ade553b0a7ac16 100644
--- a/modules/text/machine_translation/transformer/en-de/module.py
+++ b/modules/text/machine_translation/transformer/en-de/module.py
@@ -13,7 +13,6 @@
 # limitations under the License.
 
 import os
-from packaging.version import Version
 from typing import List
 
 import paddle
@@ -56,15 +55,12 @@ class MTTransformer(nn.Layer):
         # Vocabularies in source and target should be same for weight sharing.
         "weight_sharing": True,
         # Dropout rate
-        'dropout': 0
+        'dropout': 0,
+        # Number of sub-layers to be stacked in the encoder and decoder.
+        "num_encoder_layers": 6,
+        "num_decoder_layers": 6
     }
 
-    # Number of sub-layers to be stacked in the encoder and decoder.
-    if Version(paddlenlp.__version__) <= Version('2.0.5'):
-        model_config.update({"n_layer": 6})
-    else:
-        model_config.update({"num_encoder_layers": 6, "num_decoder_layers": 6})
-
     # Vocab config
     vocab_config = {
         # Used to pad vocab size to be multiple of pad_factor.
diff --git a/modules/text/machine_translation/transformer/en-de/requirements.txt b/modules/text/machine_translation/transformer/en-de/requirements.txt
index adf3e7fe61baa839a71c8b276b752c3ad2148ca4..9b56b1aac0af7268a79bf6f9a4f10d0b906808e3 100644
--- a/modules/text/machine_translation/transformer/en-de/requirements.txt
+++ b/modules/text/machine_translation/transformer/en-de/requirements.txt
@@ -1,2 +1,3 @@
+paddlenlp>=2.1.0
 sacremoses
 subword-nmt
diff --git a/modules/text/machine_translation/transformer/zh-en/README.md b/modules/text/machine_translation/transformer/zh-en/README.md
index 444b8cdb3e09bd203441e41c64ae59d3f2e06821..db4135f84a457898d9cc4ba93f7cf3971de0e0dc 100644
--- a/modules/text/machine_translation/transformer/zh-en/README.md
+++ b/modules/text/machine_translation/transformer/zh-en/README.md
@@ -1,118 +1,138 @@
-```shell
-$ hub install transformer_zh-en==1.0.0
-```
+# transformer_zh-en
+|模型名称|transformer_zh-en|
+| :--- | :---: |
+|类别|文本-机器翻译|
+|网络|Transformer|
+|数据集|CWMT2021|
+|是否支持Fine-tuning|否|
+|模型大小|614MB|
+|最新更新日期|2021-07-21|
+|数据指标|-|
 
-## 概述
+## 一、模型基本信息
 
-2017 年,Google机器翻译团队在其发表的论文[Attention Is All You Need](https://arxiv.org/abs/1706.03762)中,提出了用于完成机器翻译(Machine Translation)等序列到序列(Seq2Seq)学习任务的一种全新网络结构——Transformer。Tranformer网络完全使用注意力(Attention)机制来实现序列到序列的建模,并且取得了很好的效果。
+- ### 模型介绍
 
-transformer_zh-en包含6层的transformer结构,头数为8,隐藏层参数为512,参数量为64M。该模型在[CWMT2021的数据集](http://nlp.nju.edu.cn/cwmt-wmt)进行了预训练,加载后可直接用于预测, 提供了中文翻译为英文的能力。
+  - 2017 年,Google机器翻译团队在其发表的论文[Attention Is All You Need](https://arxiv.org/abs/1706.03762)中,提出了用于完成机器翻译(Machine Translation)等序列到序列(Seq2Seq)学习任务的一种全新网络结构——Transformer。Transformer网络完全使用注意力(Attention)机制来实现序列到序列的建模,并且取得了很好的效果。
 
-关于机器翻译的Transformer模型训练方式和详情,可查看[Machine Translation using 
Transformer](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/machine_translation/transformer)。 + - transformer_zh-en包含6层的transformer结构,头数为8,隐藏层参数为512,参数量为64M。该模型在[CWMT2021的数据集](http://nlp.nju.edu.cn/cwmt-wmt)进行了预训练,加载后可直接用于预测, 提供了中文翻译为英文的能力。 -## API + - 关于机器翻译的Transformer模型训练方式和详情,可查看[Machine Translation using Transformer](https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/machine_translation/transformer)。 +## 二、安装 -```python -def __init__(max_length: int = 256, - max_out_len: int = 256, - beam_size: int = 5): -``` -初始化module,可配置模型的输入输出文本的最大长度和解码时beam search的宽度。 +- ### 1、环境依赖 -**参数** -- `max_length`(int): 输入文本的最大长度,默认值为256。 -- `max_out_len`(int): 输出文本的最大解码长度,默认值为256。 -- `beam_size`(int): beam search方式解码的beam宽度,默认为5。 + - paddlepaddle >= 2.1.0 + + - paddlehub >= 2.1.0 | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst) +- ### 2、安装 -```python -def predict(data: List[str], - batch_size: int = 1, - n_best: int = 1, - use_gpu: bool = False): -``` -预测API,输入源语言的文本句子,解码后输出翻译后的目标语言的文本候选句子。 + - ```shell + $ hub install transformer_zh-en + ``` -**参数** -- `data`(List[str]): 源语言的文本列表,数据类型为List[str] -- `batch_size`(int): 进行预测的batch_size,默认为1 -- `n_best`(int): 每个输入文本经过模型解码后,输出的得分最高的候选句子的数量,必须小于beam_size,默认为1 -- `use_gpu`(bool): 是否使用gpu执行预测,默认为False + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -**返回** -* `results`(List[str]): 翻译后的目标语言的候选句子,长度为`len(data)*n_best` +## 三、模型API预测 +- ### 1、预测代码示例 -**代码示例** + - ```python + import paddlehub as hub -```python -import paddlehub as hub + model = hub.Module(name='transformer_zh-en', beam_size=5) + src_texts = [ + '今天天气怎么样?', + '我们一起去吃饭吧。', + ] -model = hub.Module(name='transformer_zh-en', beam_size=5) -src_texts = [ - '今天天气怎么样?', - '我们一起去吃饭吧。', -] + n_best = 3 # 每个输入样本的输出候选句子数量 + trg_texts = model.predict(src_texts, n_best=n_best) + for idx, st in enumerate(src_texts): + print('-'*30) + print(f'src: {st}') + for i in range(n_best): + print(f'trg[{i+1}]: {trg_texts[idx*n_best+i]}') + ``` -n_best = 3 # 每个输入样本的输出候选句子数量 -trg_texts = model.predict(src_texts, n_best=n_best) -for idx, st in enumerate(src_texts): - print('-'*30) - print(f'src: {st}') - for i in range(n_best): - print(f'trg[{i+1}]: {trg_texts[idx*n_best+i]}') -``` +- ### 2、API -## 服务部署 + - ```python + def __init__(max_length: int = 256, + max_out_len: int = 256, + beam_size: int = 5): + ``` -通过启动PaddleHub Serving,可以加载模型部署在线翻译服务。 + - 初始化module,可配置模型的输入输出文本的最大长度和解码时beam search的宽度。 -### Step1: 启动PaddleHub Serving + - **参数** -运行启动命令: + - `max_length`(int): 输入文本的最大长度,默认值为256。 + - `max_out_len`(int): 输出文本的最大解码长度,默认值为256。 + - `beam_size`(int): beam search方式解码的beam宽度,默认为5。 -```shell -$ hub serving start -m transformer_zh-en -``` + - ```python + def predict(data: List[str], + batch_size: int = 1, + n_best: int = 1, + use_gpu: bool = False): + ``` -通过以上命令可完成一个中英机器翻译API的部署,默认端口号为8866。 + - 预测API,输入源语言的文本句子,解码后输出翻译后的目标语言的文本候选句子。 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + - **参数** + - `data`(List[str]): 源语言的文本列表,数据类型为List[str] + - `batch_size`(int): 进行预测的batch_size,默认为1 + - `n_best`(int): 每个输入文本经过模型解码后,输出的得分最高的候选句子的数量,必须小于beam_size,默认为1 + - `use_gpu`(bool): 是否使用gpu执行预测,默认为False -### Step2: 发送预测请求 + - **返回** + - `results`(List[str]): 翻译后的目标语言的候选句子,长度为`len(data)*n_best` -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 +## 四、服务部署 -```python -import requests -import json + - 
通过启动PaddleHub Serving,可以加载模型部署在线翻译服务。
+
+  - ### 第一步:启动PaddleHub Serving
+
+    - 运行启动命令:
+
+    - ```shell
+      $ hub serving start -m transformer_zh-en
+      ```
+
+    - 通过以上命令可完成一个中英机器翻译API的部署,默认端口号为8866。
+
+    - **NOTE:** 如使用GPU预测,则需要在启动服务之前设置CUDA_VISIBLE_DEVICES环境变量,否则无需设置。
+
+  - ### 第二步:发送预测请求
+
+    - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
+
+    - ```python
+      import requests
+      import json
+
+      texts = [
+          '今天天气怎么样啊?',
+          '我们一起去吃饭吧。',
+      ]
+      data = {"data": texts}
+      # 发送post请求,content-type类型应指定json方式,url中的ip地址需改为对应机器的ip
+      url = "http://127.0.0.1:8866/predict/transformer_zh-en"
+      # 指定post请求的headers为application/json方式
+      headers = {"Content-Type": "application/json"}
+
+      r = requests.post(url=url, headers=headers, data=json.dumps(data))
+      print(r.json())
+      ```
+
+    - 关于PaddleHub Serving更多信息参考:[服务部署](../../../../docs/docs_ch/tutorial/serving.md)
+
+## 五、更新历史
 
 * 1.0.0
 
@@ -121,3 +141,6 @@ paddlehub >= 2.1.0
 * 1.0.1
 
   修复模型初始化的兼容性问题
+  - ```shell
+    $ hub install transformer_zh-en==1.0.1
+    ```
diff --git a/modules/text/machine_translation/transformer/zh-en/module.py b/modules/text/machine_translation/transformer/zh-en/module.py
index 7d6d6a1a017b4cd589fa487cf26e932333f71467..318d572847bae642a724b9e781aef3606cbf3915 100644
--- a/modules/text/machine_translation/transformer/zh-en/module.py
+++ b/modules/text/machine_translation/transformer/zh-en/module.py
@@ -13,7 +13,6 @@
 # limitations under the License.
 
 import os
-from packaging.version import Version
 from typing import List
 
 import paddle
@@ -56,15 +55,12 @@ class MTTransformer(nn.Layer):
         # Vocabularies in source and target should be same for weight sharing.
         "weight_sharing": False,
         # Dropout rate
-        'dropout': 0
+        'dropout': 0,
+        # Number of sub-layers to be stacked in the encoder and decoder.
+        "num_encoder_layers": 6,
+        "num_decoder_layers": 6
     }
 
-    # Number of sub-layers to be stacked in the encoder and decoder.
-    if Version(paddlenlp.__version__) <= Version('2.0.5'):
-        model_config.update({"n_layer": 6})
-    else:
-        model_config.update({"num_encoder_layers": 6, "num_decoder_layers": 6})
-
     # Vocab config
     vocab_config = {
         # Used to pad vocab size to be multiple of pad_factor.
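
本次改动同时整理了中译英与英译德两个模型的文档,这里补充一个串联两者的最小示例(中→英→德的中转翻译);示例只使用两份README中已描述的构造参数与 `predict()` 接口,并假设默认的 `n_best=1`,此时每个输出列表与输入一一对应:

```python
import paddlehub as hub

# 最小示例:以英语为中转语言,将中文翻译为德文
# 首次调用会自动下载模型,体积较大(两个模型分别约614MB与481MB)
zh_en = hub.Module(name='transformer_zh-en', beam_size=5)
en_de = hub.Module(name='transformer_en-de', beam_size=5)

zh_texts = ['今天天气怎么样?']
en_texts = zh_en.predict(zh_texts)  # n_best 默认为1,输出与输入一一对应
de_texts = en_de.predict(en_texts)

for zh, en, de in zip(zh_texts, en_texts, de_texts):
    print(f'zh: {zh}\nen: {en}\nde: {de}')
```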
diff --git a/modules/text/machine_translation/transformer/zh-en/requirements.txt b/modules/text/machine_translation/transformer/zh-en/requirements.txt index 6029eb21ad870229e0cb41e4462cf741227e52e2..8eca50b1f59a6c94052d55687789708fefc73e43 100644 --- a/modules/text/machine_translation/transformer/zh-en/requirements.txt +++ b/modules/text/machine_translation/transformer/zh-en/requirements.txt @@ -1,3 +1,4 @@ +paddlenlp>=2.1.0 jieba sacremoses subword-nmt diff --git a/modules/text/text_generation/ernie_gen_leave/README.md b/modules/text/text_generation/ernie_gen_leave/README.md deleted file mode 100644 index ddde23ca6d86de747f1608c424e496671d0600cb..0000000000000000000000000000000000000000 --- a/modules/text/text_generation/ernie_gen_leave/README.md +++ /dev/null @@ -1,52 +0,0 @@ -## 概述 - - -ernie_gen_leave是基于ERNIE-GEN进行微调的模型,该模型的主要功能为生成请假条。输出一个关键词,给出你的请假理由。 - -## 命令行预测 - -```shell -$ hub run ernie_gen_leave --input_text="理由" --use_gpu True --beam_width 5 -``` - -## API - -```python -def generate(texts, use_gpu=False, beam_width=5): -``` - -预测API,输入关键字给出请假理由。 - -**参数** - -* texts (list\[str\]): 请假关键字; -* use\_gpu (bool): 是否使用 GPU;**若使用GPU,请先设置CUDA\_VISIBLE\_DEVICES环境变量**; -* beam\_width: beam search宽度,决定输出多少理由的数量。 - -**返回** - -* results (list\[list\]\[str\]): 输出请假理由。 - -**代码示例** - -```python -import paddlehub as hub - -module = hub.Module(name="ernie_gen_leave") - -test_texts = ["理由"] -results = module.generate(texts=test_texts, use_gpu=False, beam_width=2) -for result in results: - print(result) -``` - - -## 查看代码 - -https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.0.0-rc/modules/text/text_generation/ernie_gen_leave - -### 依赖 - -paddlepaddle >= 2.0.0rc1 - -paddlehub >= 2.0.0rc0 diff --git a/modules/text/text_generation/ernie_gen_leave/model/decode.py b/modules/text/text_generation/ernie_gen_leave/model/decode.py deleted file mode 100644 index d07a58b559796b0331946561ed2dcbdc85ffadae..0000000000000000000000000000000000000000 --- a/modules/text/text_generation/ernie_gen_leave/model/decode.py +++ /dev/null @@ -1,259 +0,0 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -import re -import numpy as np -from collections import namedtuple - -import paddle.fluid as F -import paddle.fluid.layers as L -import paddle.fluid.dygraph as D - - -def gen_bias(encoder_inputs, decoder_inputs, step): - decoder_bsz, decoder_seqlen = decoder_inputs.shape[:2] - attn_bias = L.reshape(L.range(0, decoder_seqlen, 1, dtype='float32') + 1, [1, -1, 1]) - decoder_bias = L.cast((L.matmul(attn_bias, 1. 
/ attn_bias, transpose_y=True) >= 1.), - 'float32') #[1, 1, decoderlen, decoderlen] - encoder_bias = L.unsqueeze(L.cast(L.ones_like(encoder_inputs), 'float32'), [1]) #[bsz, 1, encoderlen] - encoder_bias = L.expand(encoder_bias, [1, decoder_seqlen, 1]) #[bsz,decoderlen, encoderlen] - decoder_bias = L.expand(decoder_bias, [decoder_bsz, 1, 1]) #[bsz, decoderlen, decoderlen] - if step > 0: - bias = L.concat([encoder_bias, L.ones([decoder_bsz, decoder_seqlen, step], 'float32'), decoder_bias], -1) - else: - bias = L.concat([encoder_bias, decoder_bias], -1) - return bias - - -@D.no_grad -def greedy_search_infilling(model, - q_ids, - q_sids, - sos_id, - eos_id, - attn_id, - max_encode_len=640, - max_decode_len=100, - tgt_type_id=3): - model.eval() - _, logits, info = model(q_ids, q_sids) - gen_ids = L.argmax(logits, -1) - d_batch, d_seqlen = q_ids.shape - seqlen = L.reduce_sum(L.cast(q_ids != 0, 'int64'), 1, keep_dim=True) - has_stopped = np.zeros([d_batch], dtype=np.bool) - gen_seq_len = np.zeros([d_batch], dtype=np.int64) - output_ids = [] - - past_cache = info['caches'] - - cls_ids = L.ones([d_batch], dtype='int64') * sos_id - attn_ids = L.ones([d_batch], dtype='int64') * attn_id - ids = L.stack([cls_ids, attn_ids], -1) - for step in range(max_decode_len): - bias = gen_bias(q_ids, ids, step) - pos_ids = D.to_variable(np.tile(np.array([[step, step + 1]], dtype=np.int64), [d_batch, 1])) - pos_ids += seqlen - _, logits, info = model( - ids, L.ones_like(ids) * tgt_type_id, pos_ids=pos_ids, attn_bias=bias, past_cache=past_cache) - gen_ids = L.argmax(logits, -1) - - past_cached_k, past_cached_v = past_cache - cached_k, cached_v = info['caches'] - cached_k = [L.concat([pk, k[:, :1, :]], 1) for pk, k in zip(past_cached_k, cached_k)] # concat cached - cached_v = [L.concat([pv, v[:, :1, :]], 1) for pv, v in zip(past_cached_v, cached_v)] - past_cache = (cached_k, cached_v) - - gen_ids = gen_ids[:, 1] - ids = L.stack([gen_ids, attn_ids], 1) - - gen_ids = gen_ids.numpy() - has_stopped |= (gen_ids == eos_id).astype(np.bool) - gen_seq_len += (1 - has_stopped.astype(np.int64)) - output_ids.append(gen_ids.tolist()) - if has_stopped.all(): - break - output_ids = np.array(output_ids).transpose([1, 0]) - return output_ids - - -BeamSearchState = namedtuple('BeamSearchState', ['log_probs', 'lengths', 'finished']) -BeamSearchOutput = namedtuple('BeamSearchOutput', ['scores', 'predicted_ids', 'beam_parent_ids']) - - -def log_softmax(x): - e_x = np.exp(x - np.max(x)) - return np.log(e_x / e_x.sum()) - - -def mask_prob(p, onehot_eos, finished): - is_finished = L.cast(L.reshape(finished, [-1, 1]) != 0, 'float32') - p = is_finished * (1. - L.cast(onehot_eos, 'float32')) * -9999. + (1. - is_finished) * p - return p - - -def hyp_score(log_probs, length, length_penalty): - lp = L.pow((5. + L.cast(length, 'float32')) / 6., length_penalty) - return log_probs / lp - - -def beam_search_step(state, logits, eos_id, beam_width, is_first_step, length_penalty): - """logits.shape == [B*W, V]""" - beam_size, vocab_size = logits.shape # as batch size=1 in this hub module. 
the first dim means bsz * beam_size equals beam_size - logits_np = logits.numpy() - for i in range(beam_size): - logits_np[i][17963] = 0 # make [UNK] prob = 0 - logits = D.to_variable(logits_np) - - bsz, beam_width = state.log_probs.shape - onehot_eos = L.cast(F.one_hot(L.ones([1], 'int64') * eos_id, vocab_size), 'int64') #[1, V] - - probs = L.log(L.softmax(logits)) #[B*W, V] - probs = mask_prob(probs, onehot_eos, state.finished) #[B*W, V] - allprobs = L.reshape(state.log_probs, [-1, 1]) + probs #[B*W, V] - - not_finished = 1 - L.reshape(state.finished, [-1, 1]) #[B*W,1] - not_eos = 1 - onehot_eos - length_to_add = not_finished * not_eos #[B*W,V] - alllen = L.reshape(state.lengths, [-1, 1]) + length_to_add - - allprobs = L.reshape(allprobs, [-1, beam_width * vocab_size]) - alllen = L.reshape(alllen, [-1, beam_width * vocab_size]) - allscore = hyp_score(allprobs, alllen, length_penalty) - if is_first_step: - allscore = L.reshape(allscore, [bsz, beam_width, -1])[:, 0, :] # first step only consiter beam 0 - scores, idx = L.topk(allscore, k=beam_width) #[B, W] - next_beam_id = idx // vocab_size #[B, W] - next_word_id = idx % vocab_size - - gather_idx = L.concat([L.where(idx != -1)[:, :1], L.reshape(idx, [-1, 1])], 1) - next_probs = L.reshape(L.gather_nd(allprobs, gather_idx), idx.shape) - next_len = L.reshape(L.gather_nd(alllen, gather_idx), idx.shape) - - gather_idx = L.concat([L.where(next_beam_id != -1)[:, :1], L.reshape(next_beam_id, [-1, 1])], 1) - next_finished = L.reshape(L.gather_nd(state.finished, gather_idx), - state.finished.shape) #[gather new beam state according to new beam id] - - next_finished += L.cast(next_word_id == eos_id, 'int64') - next_finished = L.cast(next_finished > 0, 'int64') - - next_state = BeamSearchState(log_probs=next_probs, lengths=next_len, finished=next_finished) - output = BeamSearchOutput(scores=scores, predicted_ids=next_word_id, beam_parent_ids=next_beam_id) - - return output, next_state - - -@D.no_grad -def beam_search_infilling(model, - q_ids, - q_sids, - sos_id, - eos_id, - attn_id, - max_encode_len=640, - max_decode_len=100, - beam_width=5, - tgt_type_id=3, - length_penalty=1.0): - model.eval() - _, __, info = model(q_ids, q_sids) - d_batch, d_seqlen = q_ids.shape - - state = BeamSearchState( - log_probs=L.zeros([d_batch, beam_width], 'float32'), - lengths=L.zeros([d_batch, beam_width], 'int64'), - finished=L.zeros([d_batch, beam_width], 'int64')) - outputs = [] - - def reorder_(t, parent_id): - """reorder cache according to parent beam id""" - gather_idx = L.where(parent_id != -1)[:, 0] * beam_width + L.reshape(parent_id, [-1]) - t = L.gather(t, gather_idx) - return t - - def tile_(t, times): - _shapes = list(t.shape[1:]) - ret = L.reshape(L.expand(L.unsqueeze(t, [1]), [ - 1, - times, - ] + [ - 1, - ] * len(_shapes)), [ - -1, - ] + _shapes) - return ret - - cached_k, cached_v = info['caches'] - cached_k = [tile_(k, beam_width) for k in cached_k] - cached_v = [tile_(v, beam_width) for v in cached_v] - past_cache = (cached_k, cached_v) - - q_ids = tile_(q_ids, beam_width) - seqlen = L.reduce_sum(L.cast(q_ids != 0, 'int64'), 1, keep_dim=True) - - cls_ids = L.ones([d_batch * beam_width], dtype='int64') * sos_id - attn_ids = L.ones([d_batch * beam_width], dtype='int64') * attn_id # SOS - ids = L.stack([cls_ids, attn_ids], -1) - for step in range(max_decode_len): - bias = gen_bias(q_ids, ids, step) - pos_ids = D.to_variable(np.tile(np.array([[step, step + 1]], dtype=np.int64), [d_batch * beam_width, 1])) - pos_ids += seqlen - - _, logits, info = model( - 
ids, L.ones_like(ids) * tgt_type_id, pos_ids=pos_ids, attn_bias=bias, past_cache=past_cache) - - output, state = beam_search_step( - state, - logits[:, 1], - eos_id=eos_id, - beam_width=beam_width, - is_first_step=(step == 0), - length_penalty=length_penalty) - outputs.append(output) - - past_cached_k, past_cached_v = past_cache - cached_k, cached_v = info['caches'] - cached_k = [ - reorder_(L.concat([pk, k[:, :1, :]], 1), output.beam_parent_ids) for pk, k in zip(past_cached_k, cached_k) - ] # concat cached - cached_v = [ - reorder_(L.concat([pv, v[:, :1, :]], 1), output.beam_parent_ids) for pv, v in zip(past_cached_v, cached_v) - ] - past_cache = (cached_k, cached_v) - - pred_ids_flatten = L.reshape(output.predicted_ids, [d_batch * beam_width]) - ids = L.stack([pred_ids_flatten, attn_ids], 1) - - if state.finished.numpy().all(): - break - - final_ids = L.stack([o.predicted_ids for o in outputs], 0) - final_parent_ids = L.stack([o.beam_parent_ids for o in outputs], 0) - final_ids = L.gather_tree(final_ids, final_parent_ids) #[:, :, - #0] #pick best beam - final_ids = L.transpose(L.reshape(final_ids, [-1, d_batch * 1, beam_width]), [1, 2, 0]) - return final_ids - - -en_patten = re.compile(r'^[a-zA-Z0-9]*$') - - -def post_process(token): - if token.startswith('##'): - ret = token[2:] - else: - if en_patten.match(token): - ret = ' ' + token - else: - ret = token - return ret diff --git a/modules/text/text_generation/ernie_gen_leave/model/file_utils.py b/modules/text/text_generation/ernie_gen_leave/model/file_utils.py deleted file mode 100644 index 608be4efc6644626f7f408df200fd299f2dd997e..0000000000000000000000000000000000000000 --- a/modules/text/text_generation/ernie_gen_leave/model/file_utils.py +++ /dev/null @@ -1,46 +0,0 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -import os - -from tqdm import tqdm -from paddlehub.common.logger import logger -from paddlehub.common.dir import MODULE_HOME - - -def _fetch_from_remote(url, force_download=False): - import tempfile, requests, tarfile - cached_dir = os.path.join(MODULE_HOME, "ernie_for_gen") - if force_download or not os.path.exists(cached_dir): - with tempfile.NamedTemporaryFile() as f: - #url = 'https://ernie.bj.bcebos.com/ERNIE_stable.tgz' - r = requests.get(url, stream=True) - total_len = int(r.headers.get('content-length')) - for chunk in tqdm( - r.iter_content(chunk_size=1024), total=total_len // 1024, desc='downloading %s' % url, unit='KB'): - if chunk: - f.write(chunk) - f.flush() - logger.debug('extacting... 
to %s' % f.name) - with tarfile.open(f.name) as tf: - tf.extractall(path=cached_dir) - logger.debug('%s cached in %s' % (url, cached_dir)) - return cached_dir - - -def add_docstring(doc): - def func(f): - f.__doc__ += ('\n======other docs from supper class ======\n%s' % doc) - return f - - return func diff --git a/modules/text/text_generation/ernie_gen_leave/model/modeling_ernie.py b/modules/text/text_generation/ernie_gen_leave/model/modeling_ernie.py deleted file mode 100644 index d5de28a5fee73371babd05b644e03a0f75ecdd5e..0000000000000000000000000000000000000000 --- a/modules/text/text_generation/ernie_gen_leave/model/modeling_ernie.py +++ /dev/null @@ -1,327 +0,0 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from __future__ import division -from __future__ import absolute_import -from __future__ import print_function -from __future__ import unicode_literals - -import logging - -import paddle.fluid.dygraph as D -import paddle.fluid as F -import paddle.fluid.layers as L - -log = logging.getLogger(__name__) - - -def _build_linear(n_in, n_out, name, init, act=None): - return D.Linear( - n_in, - n_out, - param_attr=F.ParamAttr(name='%s.w_0' % name if name is not None else None, initializer=init), - bias_attr='%s.b_0' % name if name is not None else None, - act=act) - - -def _build_ln(n_in, name): - return D.LayerNorm( - normalized_shape=n_in, - param_attr=F.ParamAttr( - name='%s_layer_norm_scale' % name if name is not None else None, initializer=F.initializer.Constant(1.)), - bias_attr=F.ParamAttr( - name='%s_layer_norm_bias' % name if name is not None else None, initializer=F.initializer.Constant(1.)), - ) - - -def append_name(name, postfix): - if name is None: - return None - elif name == '': - return postfix - else: - return '%s_%s' % (name, postfix) - - -class AttentionLayer(D.Layer): - def __init__(self, cfg, name=None): - super(AttentionLayer, self).__init__() - initializer = F.initializer.TruncatedNormal(scale=cfg['initializer_range']) - d_model = cfg['hidden_size'] - n_head = cfg['num_attention_heads'] - assert d_model % n_head == 0 - d_model_q = cfg.get('query_hidden_size_per_head', d_model // n_head) * n_head - d_model_v = cfg.get('value_hidden_size_per_head', d_model // n_head) * n_head - self.n_head = n_head - self.d_key = d_model_q // n_head - self.q = _build_linear(d_model, d_model_q, append_name(name, 'query_fc'), initializer) - self.k = _build_linear(d_model, d_model_q, append_name(name, 'key_fc'), initializer) - self.v = _build_linear(d_model, d_model_v, append_name(name, 'value_fc'), initializer) - self.o = _build_linear(d_model_v, d_model, append_name(name, 'output_fc'), initializer) - self.dropout = lambda i: L.dropout( - i, - dropout_prob=cfg['attention_probs_dropout_prob'], - dropout_implementation="upscale_in_train", - ) if self.training else i - - def forward(self, queries, keys, values, attn_bias, past_cache): - assert len(queries.shape) == len(keys.shape) == len(values.shape) == 3 - - q = 
self.q(queries) - k = self.k(keys) - v = self.v(values) - - cache = (k, v) - if past_cache is not None: - cached_k, cached_v = past_cache - k = L.concat([cached_k, k], 1) - v = L.concat([cached_v, v], 1) - - q = L.transpose(L.reshape(q, [0, 0, self.n_head, q.shape[-1] // self.n_head]), - [0, 2, 1, 3]) #[batch, head, seq, dim] - k = L.transpose(L.reshape(k, [0, 0, self.n_head, k.shape[-1] // self.n_head]), - [0, 2, 1, 3]) #[batch, head, seq, dim] - v = L.transpose(L.reshape(v, [0, 0, self.n_head, v.shape[-1] // self.n_head]), - [0, 2, 1, 3]) #[batch, head, seq, dim] - - q = L.scale(q, scale=self.d_key**-0.5) - score = L.matmul(q, k, transpose_y=True) - if attn_bias is not None: - score += attn_bias - score = L.softmax(score, use_cudnn=True) - score = self.dropout(score) - - out = L.matmul(score, v) - out = L.transpose(out, [0, 2, 1, 3]) - out = L.reshape(out, [0, 0, out.shape[2] * out.shape[3]]) - - out = self.o(out) - return out, cache - - -class PositionwiseFeedForwardLayer(D.Layer): - def __init__(self, cfg, name=None): - super(PositionwiseFeedForwardLayer, self).__init__() - initializer = F.initializer.TruncatedNormal(scale=cfg['initializer_range']) - d_model = cfg['hidden_size'] - d_ffn = cfg.get('intermediate_size', 4 * d_model) - assert cfg['hidden_act'] in ['relu', 'gelu'] - self.i = _build_linear(d_model, d_ffn, append_name(name, 'fc_0'), initializer, act=cfg['hidden_act']) - self.o = _build_linear(d_ffn, d_model, append_name(name, 'fc_1'), initializer) - prob = cfg.get('intermediate_dropout_prob', 0.) - self.dropout = lambda i: L.dropout( - i, - dropout_prob=prob, - dropout_implementation="upscale_in_train", - ) if self.training else i - - def forward(self, inputs): - hidden = self.i(inputs) - hidden = self.dropout(hidden) - out = self.o(hidden) - return out - - -class ErnieBlock(D.Layer): - def __init__(self, cfg, name=None): - super(ErnieBlock, self).__init__() - d_model = cfg['hidden_size'] - initializer = F.initializer.TruncatedNormal(scale=cfg['initializer_range']) - - self.attn = AttentionLayer(cfg, name=append_name(name, 'multi_head_att')) - self.ln1 = _build_ln(d_model, name=append_name(name, 'post_att')) - self.ffn = PositionwiseFeedForwardLayer(cfg, name=append_name(name, 'ffn')) - self.ln2 = _build_ln(d_model, name=append_name(name, 'post_ffn')) - prob = cfg.get('intermediate_dropout_prob', cfg['hidden_dropout_prob']) - self.dropout = lambda i: L.dropout( - i, - dropout_prob=prob, - dropout_implementation="upscale_in_train", - ) if self.training else i - - def forward(self, inputs, attn_bias=None, past_cache=None): - attn_out, cache = self.attn(inputs, inputs, inputs, attn_bias, past_cache=past_cache) #self attn - attn_out = self.dropout(attn_out) - hidden = attn_out + inputs - hidden = self.ln1(hidden) # dropout/ add/ norm - - ffn_out = self.ffn(hidden) - ffn_out = self.dropout(ffn_out) - hidden = ffn_out + hidden - hidden = self.ln2(hidden) - return hidden, cache - - -class ErnieEncoderStack(D.Layer): - def __init__(self, cfg, name=None): - super(ErnieEncoderStack, self).__init__() - n_layers = cfg['num_hidden_layers'] - self.block = D.LayerList([ErnieBlock(cfg, append_name(name, 'layer_%d' % i)) for i in range(n_layers)]) - - def forward(self, inputs, attn_bias=None, past_cache=None): - if past_cache is not None: - assert isinstance( - past_cache, - tuple), 'unknown type of `past_cache`, expect tuple or list. 
got %s' % repr(type(past_cache)) - past_cache = list(zip(*past_cache)) - else: - past_cache = [None] * len(self.block) - cache_list_k, cache_list_v, hidden_list = [], [], [inputs] - - for b, p in zip(self.block, past_cache): - inputs, cache = b(inputs, attn_bias=attn_bias, past_cache=p) - cache_k, cache_v = cache - cache_list_k.append(cache_k) - cache_list_v.append(cache_v) - hidden_list.append(inputs) - - return inputs, hidden_list, (cache_list_k, cache_list_v) - - -class ErnieModel(D.Layer): - def __init__(self, cfg, name=None): - """ - Fundamental pretrained Ernie model - """ - log.debug('init ErnieModel with config: %s' % repr(cfg)) - D.Layer.__init__(self) - d_model = cfg['hidden_size'] - d_emb = cfg.get('emb_size', cfg['hidden_size']) - d_vocab = cfg['vocab_size'] - d_pos = cfg['max_position_embeddings'] - d_sent = cfg.get("sent_type_vocab_size") or cfg['type_vocab_size'] - self.n_head = cfg['num_attention_heads'] - self.return_additional_info = cfg.get('return_additional_info', False) - initializer = F.initializer.TruncatedNormal(scale=cfg['initializer_range']) - - self.ln = _build_ln(d_model, name=append_name(name, 'pre_encoder')) - self.word_emb = D.Embedding([d_vocab, d_emb], - param_attr=F.ParamAttr( - name=append_name(name, 'word_embedding'), initializer=initializer)) - self.pos_emb = D.Embedding([d_pos, d_emb], - param_attr=F.ParamAttr( - name=append_name(name, 'pos_embedding'), initializer=initializer)) - self.sent_emb = D.Embedding([d_sent, d_emb], - param_attr=F.ParamAttr( - name=append_name(name, 'sent_embedding'), initializer=initializer)) - prob = cfg['hidden_dropout_prob'] - self.dropout = lambda i: L.dropout( - i, - dropout_prob=prob, - dropout_implementation="upscale_in_train", - ) if self.training else i - - self.encoder_stack = ErnieEncoderStack(cfg, append_name(name, 'encoder')) - if cfg.get('has_pooler', True): - self.pooler = _build_linear( - cfg['hidden_size'], cfg['hidden_size'], append_name(name, 'pooled_fc'), initializer, act='tanh') - else: - self.pooler = None - self.train() - - def eval(self): - if F.in_dygraph_mode(): - super(ErnieModel, self).eval() - self.training = False - for l in self.sublayers(): - l.training = False - - def train(self): - if F.in_dygraph_mode(): - super(ErnieModel, self).train() - self.training = True - for l in self.sublayers(): - l.training = True - - def forward(self, - src_ids, - sent_ids=None, - pos_ids=None, - input_mask=None, - attn_bias=None, - past_cache=None, - use_causal_mask=False): - """ - Args: - src_ids (`Variable` of shape `[batch_size, seq_len]`): - Indices of input sequence tokens in the vocabulary. - sent_ids (optional, `Variable` of shape `[batch_size, seq_len]`): - aka token_type_ids, Segment token indices to indicate first and second portions of the inputs. - if None, assume all tokens come from `segment_a` - pos_ids(optional, `Variable` of shape `[batch_size, seq_len]`): - Indices of positions of each input sequence tokens in the position embeddings. - input_mask(optional `Variable` of shape `[batch_size, seq_len]`): - Mask to avoid performing attention on the padding token indices of the encoder input. 
- attn_bias(optional, `Variable` of shape `[batch_size, seq_len, seq_len] or False`): - 3D version of `input_mask`, if set, overrides `input_mask`; if set not False, will not apply attention mask - past_cache(optional, tuple of two lists: cached key and cached value, - each is a list of `Variable`s of shape `[batch_size, seq_len, hidden_size]`): - cached key/value tensor that will be concated to generated key/value when performing self attention. - if set, `attn_bias` should not be None. - - Returns: - pooled (`Variable` of shape `[batch_size, hidden_size]`): - output logits of pooler classifier - encoded(`Variable` of shape `[batch_size, seq_len, hidden_size]`): - output logits of transformer stack - """ - assert len(src_ids.shape) == 2, 'expect src_ids.shape = [batch, sequecen], got %s' % (repr(src_ids.shape)) - assert attn_bias is not None if past_cache else True, 'if `past_cache` is specified; attn_bias should not be None' - d_batch = L.shape(src_ids)[0] - d_seqlen = L.shape(src_ids)[1] - if pos_ids is None: - pos_ids = L.reshape(L.range(0, d_seqlen, 1, dtype='int32'), [1, -1]) - pos_ids = L.cast(pos_ids, 'int64') - if attn_bias is None: - if input_mask is None: - input_mask = L.cast(src_ids != 0, 'float32') - assert len(input_mask.shape) == 2 - input_mask = L.unsqueeze(input_mask, axes=[-1]) - attn_bias = L.matmul(input_mask, input_mask, transpose_y=True) - if use_causal_mask: - sequence = L.reshape(L.range(0, d_seqlen, 1, dtype='float32') + 1., [1, 1, -1, 1]) - causal_mask = L.cast((L.matmul(sequence, 1. / sequence, transpose_y=True) >= 1.), 'float32') - attn_bias *= causal_mask - else: - assert len(attn_bias.shape) == 3, 'expect attn_bias tobe rank 3, got %r' % attn_bias.shape - attn_bias = (1. - attn_bias) * -10000.0 - attn_bias = L.unsqueeze(attn_bias, [1]) - attn_bias = L.expand(attn_bias, [1, self.n_head, 1, 1]) # avoid broadcast =_= - attn_bias.stop_gradient = True - - if sent_ids is None: - sent_ids = L.zeros_like(src_ids) - - src_embedded = self.word_emb(src_ids) - pos_embedded = self.pos_emb(pos_ids) - sent_embedded = self.sent_emb(sent_ids) - embedded = src_embedded + pos_embedded + sent_embedded - - embedded = self.dropout(self.ln(embedded)) - - encoded, hidden_list, cache_list = self.encoder_stack(embedded, attn_bias, past_cache=past_cache) - if self.pooler is not None: - pooled = self.pooler(encoded[:, 0, :]) - else: - pooled = None - - additional_info = { - 'hiddens': hidden_list, - 'caches': cache_list, - } - - if self.return_additional_info: - return pooled, encoded, additional_info - else: - return pooled, encoded diff --git a/modules/text/text_generation/ernie_gen_leave/model/modeling_ernie_gen.py b/modules/text/text_generation/ernie_gen_leave/model/modeling_ernie_gen.py deleted file mode 100644 index bc3d783d622356fad1e48f2767640a59edc05d70..0000000000000000000000000000000000000000 --- a/modules/text/text_generation/ernie_gen_leave/model/modeling_ernie_gen.py +++ /dev/null @@ -1,65 +0,0 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. - -import paddle.fluid as F -import paddle.fluid.layers as L - -from .modeling_ernie import ErnieModel -from .modeling_ernie import _build_linear, _build_ln, append_name - - -class ErnieModelForGeneration(ErnieModel): - def __init__(self, cfg, name=None): - cfg['return_additional_info'] = True - cfg['has_pooler'] = False - super(ErnieModelForGeneration, self).__init__(cfg, name=name) - initializer = F.initializer.TruncatedNormal(scale=cfg['initializer_range']) - d_model = cfg['hidden_size'] - d_vocab = cfg['vocab_size'] - - self.mlm = _build_linear( - d_model, d_model, append_name(name, 'mask_lm_trans_fc'), initializer, act=cfg['hidden_act']) - self.mlm_ln = _build_ln(d_model, name=append_name(name, 'mask_lm_trans')) - self.mlm_bias = L.create_parameter( - dtype='float32', - shape=[d_vocab], - attr=F.ParamAttr( - name=append_name(name, 'mask_lm_out_fc.b_0'), initializer=F.initializer.Constant(value=0.0)), - is_bias=True, - ) - - def forward(self, src_ids, *args, **kwargs): - tgt_labels = kwargs.pop('tgt_labels', None) - tgt_pos = kwargs.pop('tgt_pos', None) - encode_only = kwargs.pop('encode_only', False) - _, encoded, info = ErnieModel.forward(self, src_ids, *args, **kwargs) - if encode_only: - return None, None, info - elif tgt_labels is None: - encoded = self.mlm(encoded) - encoded = self.mlm_ln(encoded) - logits = L.matmul(encoded, self.word_emb.weight, transpose_y=True) + self.mlm_bias - output_ids = L.argmax(logits, -1) - return output_ids, logits, info - else: - encoded_2d = L.gather_nd(encoded, tgt_pos) - encoded_2d = self.mlm(encoded_2d) - encoded_2d = self.mlm_ln(encoded_2d) - logits_2d = L.matmul(encoded_2d, self.word_emb.weight, transpose_y=True) + self.mlm_bias - if len(tgt_labels.shape) == 1: - tgt_labels = L.reshape(tgt_labels, [-1, 1]) - - loss = L.reduce_mean( - L.softmax_with_cross_entropy(logits_2d, tgt_labels, soft_label=(tgt_labels.shape[-1] != 1))) - return loss, logits_2d, info diff --git a/modules/text/text_generation/ernie_gen_leave/model/tokenizing_ernie.py b/modules/text/text_generation/ernie_gen_leave/model/tokenizing_ernie.py deleted file mode 100644 index c9e5638f9a17207ce2d664c27376f08138876da3..0000000000000000000000000000000000000000 --- a/modules/text/text_generation/ernie_gen_leave/model/tokenizing_ernie.py +++ /dev/null @@ -1,163 +0,0 @@ -# Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import six -import re -import logging -from functools import partial - -import numpy as np - -import io - -open = partial(io.open, encoding='utf8') - -log = logging.getLogger(__name__) - -_max_input_chars_per_word = 100 - - -def _wordpiece(token, vocab, unk_token, prefix='##', sentencepiece_prefix=''): - """ wordpiece: helloworld => [hello, ##world] """ - chars = list(token) - if len(chars) > _max_input_chars_per_word: - return [unk_token], [(0, len(chars))] - - is_bad = False - start = 0 - sub_tokens = [] - sub_pos = [] - while start < len(chars): - end = len(chars) - cur_substr = None - while start < end: - substr = "".join(chars[start:end]) - if start == 0: - substr = sentencepiece_prefix + substr - if start > 0: - substr = prefix + substr - if substr in vocab: - cur_substr = substr - break - end -= 1 - if cur_substr is None: - is_bad = True - break - sub_tokens.append(cur_substr) - sub_pos.append((start, end)) - start = end - if is_bad: - return [unk_token], [(0, len(chars))] - else: - return sub_tokens, sub_pos - - -class ErnieTokenizer(object): - def __init__(self, - vocab, - unk_token='[UNK]', - sep_token='[SEP]', - cls_token='[CLS]', - pad_token='[PAD]', - mask_token='[MASK]', - wordpiece_prefix='##', - sentencepiece_prefix='', - lower=True, - encoding='utf8', - special_token_list=[]): - if not isinstance(vocab, dict): - raise ValueError('expect `vocab` to be instance of dict, got %s' % type(vocab)) - self.vocab = vocab - self.lower = lower - self.prefix = wordpiece_prefix - self.sentencepiece_prefix = sentencepiece_prefix - self.pad_id = self.vocab[pad_token] - self.cls_id = cls_token and self.vocab[cls_token] - self.sep_id = sep_token and self.vocab[sep_token] - self.unk_id = unk_token and self.vocab[unk_token] - self.mask_id = mask_token and self.vocab[mask_token] - self.unk_token = unk_token - special_tokens = {pad_token, cls_token, sep_token, unk_token, mask_token} | set(special_token_list) - pat_str = '' - for t in special_tokens: - if t is None: - continue - pat_str += '(%s)|' % re.escape(t) - pat_str += r'([a-zA-Z0-9]+|\S)' - log.debug('regex: %s' % pat_str) - self.pat = re.compile(pat_str) - self.encoding = encoding - - def tokenize(self, text): - if len(text) == 0: - return [] - if six.PY3 and not isinstance(text, six.string_types): - text = text.decode(self.encoding) - if six.PY2 and isinstance(text, str): - text = text.decode(self.encoding) - - res = [] - for match in self.pat.finditer(text): - match_group = match.group(0) - if match.groups()[-1]: - if self.lower: - match_group = match_group.lower() - words, _ = _wordpiece( - match_group, - vocab=self.vocab, - unk_token=self.unk_token, - prefix=self.prefix, - sentencepiece_prefix=self.sentencepiece_prefix) - else: - words = [match_group] - res += words - return res - - def convert_tokens_to_ids(self, tokens): - return [self.vocab.get(t, self.unk_id) for t in tokens] - - def truncate(self, id1, id2, seqlen): - len1 = len(id1) - len2 = len(id2) - half = seqlen // 2 - if len1 > len2: - len1_truncated, len2_truncated = max(half, seqlen - len2), min(half, len2) - else: - len1_truncated, len2_truncated = min(half, seqlen - len1), max(half, seqlen - len1) - return id1[:len1_truncated], id2[:len2_truncated] - - def build_for_ernie(self, text_id, pair_id=[]): - """build sentence type id, add [CLS] [SEP]""" - text_id_type = np.zeros_like(text_id, dtype=np.int64) - ret_id = np.concatenate([[self.cls_id], text_id, [self.sep_id]], 0) - ret_id_type = np.concatenate([[0], text_id_type, [0]], 0) - - if len(pair_id): - pair_id_type = 
diff --git a/modules/text/text_generation/ernie_gen_leave/module.py b/modules/text/text_generation/ernie_gen_leave/module.py
deleted file mode 100644
index 04d5d733b4f9b7322953c595c0c10ac3b74eb3c7..0000000000000000000000000000000000000000
--- a/modules/text/text_generation/ernie_gen_leave/module.py
+++ /dev/null
@@ -1,162 +0,0 @@
-# coding:utf-8
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-import argparse
-import ast
-import json
-import os
-
-import numpy as np
-import paddle.fluid as fluid
-import paddle.fluid.dygraph as D
-import paddlehub as hub
-from paddlehub.common.logger import logger
-from paddlehub.compat.module.nlp_module import DataFormatError
-from paddlehub.module.module import moduleinfo, runnable, serving
-
-from .model.decode import beam_search_infilling
-from .model.modeling_ernie_gen import ErnieModelForGeneration
-from .model.tokenizing_ernie import ErnieTokenizer
-
-
-@moduleinfo(
-    name="ernie_gen_leave",
-    version="1.0.0",
-    summary="",
-    author="彭兆帅,郑博培",
-    author_email="1084667371@qq.com,2733821739@qq.com",
-    type="nlp/text_generation",
-)
-class ErnieGen(hub.NLPPredictionModule):
-    def _initialize(self):
-        """
-        Initialize with the necessary elements.
-        """
-        assets_path = os.path.join(self.directory, "assets")
-        gen_checkpoint_path = os.path.join(assets_path, "ernie_gen")
-        ernie_cfg_path = os.path.join(assets_path, 'ernie_config.json')
-        with open(ernie_cfg_path, encoding='utf8') as ernie_cfg_file:
-            ernie_cfg = json.load(ernie_cfg_file)
-        ernie_vocab_path = os.path.join(assets_path, 'vocab.txt')
-        with open(ernie_vocab_path, encoding='utf8') as ernie_vocab_file:
-            # vocab.txt holds one token per line; map token -> line index.
-            ernie_vocab = {j.strip().split('\t')[0]: i for i, j in enumerate(ernie_vocab_file.readlines())}
-
-        with fluid.dygraph.guard(fluid.CPUPlace()):
-            with fluid.unique_name.guard():
-                self.model = ErnieModelForGeneration(ernie_cfg)
-                finetuned_states, _ = D.load_dygraph(gen_checkpoint_path)
-                self.model.set_dict(finetuned_states)
-
-        self.tokenizer = ErnieTokenizer(ernie_vocab)
-        self.rev_dict = {v: k for k, v in self.tokenizer.vocab.items()}
-        self.rev_dict[self.tokenizer.pad_id] = ''  # replace [PAD]
-        self.rev_dict[self.tokenizer.unk_id] = ''  # replace [UNK]
-        self.rev_lookup = np.vectorize(lambda i: self.rev_dict[i])
-
-    @serving
-    def generate(self, texts, use_gpu=False, beam_width=5):
-        """
-        Get the predict result from the input texts.
-
-        Args:
-            texts(list): the input texts.
-            use_gpu(bool): whether to use GPU for prediction.
-            beam_width(int): the beam search width.
-
-        Returns:
-            results(list): the predict result.
-        """
-        if texts and isinstance(texts, list) and all(isinstance(text, str) and text for text in texts):
-            predicted_data = texts
-        else:
-            raise ValueError("The input texts should be a list with nonempty string elements.")
-
-        if use_gpu and "CUDA_VISIBLE_DEVICES" not in os.environ:
-            use_gpu = False
-            logger.warning(
-                "use_gpu has been set to False: the environment variable CUDA_VISIBLE_DEVICES must be set before requesting use_gpu=True"
-            )
-        place = fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
-
-        with fluid.dygraph.guard(place):
-            self.model.eval()
-            results = []
-            for text in predicted_data:
-                sample_results = []
-                ids, sids = self.tokenizer.encode(text)
-                src_ids = D.to_variable(np.expand_dims(ids, 0))
-                src_sids = D.to_variable(np.expand_dims(sids, 0))
-                output_ids = beam_search_infilling(
-                    self.model,
-                    src_ids,
-                    src_sids,
-                    eos_id=self.tokenizer.sep_id,
-                    sos_id=self.tokenizer.cls_id,
-                    attn_id=self.tokenizer.vocab['[MASK]'],
-                    max_decode_len=50,
-                    max_encode_len=50,
-                    beam_width=beam_width,
-                    tgt_type_id=1)
-                output_str = self.rev_lookup(output_ids[0].numpy())
-
-                for ostr in output_str.tolist():
-                    # Cut each beam candidate at the first [SEP], then join the pieces.
-                    if '[SEP]' in ostr:
-                        ostr = ostr[:ostr.index('[SEP]')]
-                    sample_results.append("".join(ostr))
-                results.append(sample_results)
-        return results
-
-    def add_module_config_arg(self):
-        """
-        Add the command config options.
-        """
-        self.arg_config_group.add_argument(
-            '--use_gpu', type=ast.literal_eval, default=False, help="whether to use GPU for prediction")
-
-        self.arg_config_group.add_argument('--beam_width', type=int, default=5, help="the beam search width")
-
-    @runnable
-    def run_cmd(self, argvs):
-        """
-        Run as a command.
-        """
-        self.parser = argparse.ArgumentParser(
-            description='Run the %s module.' % self.name,
-            prog='hub run %s' % self.name,
-            usage='%(prog)s',
-            add_help=True)
-
-        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
-        self.arg_config_group = self.parser.add_argument_group(
-            title="Config options", description="Run configuration for controlling module behavior, optional.")
-
-        self.add_module_config_arg()
-        self.add_module_input_arg()
-
-        args = self.parser.parse_args(argvs)
-
-        try:
-            input_data = self.check_input_data(args)
-        except (DataFormatError, RuntimeError):
-            self.parser.print_help()
-            return None
-
-        return self.generate(texts=input_data, use_gpu=args.use_gpu, beam_width=args.beam_width)
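Because `generate` is decorated with `@serving`, the removed module could also be queried over HTTP once deployed with PaddleHub Serving. A client sketch, assuming the module is started with `hub serving start -m ernie_gen_leave` on the default port 8866 (URL and payload follow the usual PaddleHub serving convention; adjust host and port to your deployment):

```python
import json

import requests

# The JSON payload maps directly onto generate()'s keyword arguments.
data = {'texts': ['理由'], 'use_gpu': False, 'beam_width': 5}
headers = {'Content-Type': 'application/json'}
url = 'http://127.0.0.1:8866/predict/ernie_gen_leave'

response = requests.post(url=url, headers=headers, data=json.dumps(data))
print(response.json())
```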
diff --git a/modules/text/text_generation/ernie_gen_leave/test.py b/modules/text/text_generation/ernie_gen_leave/test.py
deleted file mode 100644
index a7abf1b88bc07aaaf7b3f5d0800d55595569dbc0..0000000000000000000000000000000000000000
--- a/modules/text/text_generation/ernie_gen_leave/test.py
+++ /dev/null
@@ -1,8 +0,0 @@
-import paddlehub as hub
-
-module = hub.Module(name="ernie_gen_leave")
-
-test_texts = ["理由"]
-results = module.generate(texts=test_texts, use_gpu=False, beam_width=2)
-for result in results:
-    print(result)
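The removed test script exercises the Python API directly. Since `run_cmd` is decorated with `@runnable`, the module was also callable through the PaddleHub CLI (`hub run ernie_gen_leave ...`). A sketch of driving `run_cmd` from Python with the same inputs; note that `--input_text` is an assumption here, since that option comes from the inherited `add_module_input_arg` rather than this file:

```python
import paddlehub as hub

module = hub.Module(name="ernie_gen_leave")

# run_cmd parses argv-style options: --use_gpu and --beam_width are registered
# in add_module_config_arg above; the input option (assumed to be --input_text)
# is added by the inherited add_module_input_arg.
results = module.run_cmd(['--input_text', '理由', '--beam_width', '2'])
print(results)
```

Each element of the returned list holds the beam candidates for one input text, exactly as `generate` returns them.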