Unverified commit 9f622a30, authored by KP, committed by GitHub

Merge branch 'develop' into add_styleganv2mixing_module

@@ -30,3 +30,9 @@
- --show-source
- --statistics
files: \.py$
- repo: https://github.com/asottile/reorder_python_imports
rev: v2.4.0
hooks:
- id: reorder-python-imports
exclude: (?=third_party).*(\.py)$
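
For reference, the new `reorder-python-imports` hook rewrites each file's import block into a deterministic order. A minimal before/after sketch of the transformation (the module names are illustrative only):

```python
# Before the hook runs (as a developer might write it):
#   import sys, os
#   import paddlehub
#   from collections import OrderedDict
#
# After reorder-python-imports: one import per line,
# stdlib imports first, third-party grouped separately, each group sorted.
import os
import sys
from collections import OrderedDict

import paddlehub
```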
@@ -4,7 +4,7 @@ English | [简体中文](README_ch.md)
<img src="./docs/imgs/paddlehub_logo.jpg" align="middle">
<p align="center">
<div align="center">
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/en/release-v2.1"> Tutorial </a> | <a href="https://www.paddlepaddle.org.cn/hublist"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a> </h3>
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/en/release-v2.1"> Tutorial </a> | <a href="./modules"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a> </h3>
</div>
------------------------------------------------------------------------------------------
@@ -28,7 +28,7 @@ English | [简体中文](README_ch.md)
## Introduction and Features
- **PaddleHub** aims to provide developers with rich, high-quality, and directly usable pre-trained models.
- **Abundant Pre-trained Models**: 300+ pre-trained models cover the 5 major categories, including Image, Text, Audio, Video, and Industrial application. All of them are free for download and offline usage.
- **Abundant Pre-trained Models**: 360+ pre-trained models cover the 5 major categories, including Image, Text, Audio, Video, and Industrial application. All of them are free for download and offline usage.
- **No Need for Deep Learning Background**: you can use AI models quickly and enjoy the dividends of the artificial intelligence era.
- **Quick Model Prediction**: model prediction can be realized through a few lines of script to quickly experience the model effect (see the sketch after this feature list).
- **Model As Service**: a single command deploys a deep learning model as an API service.
@@ -36,6 +36,7 @@ English | [简体中文](README_ch.md)
- **Cross-platform**: support Linux, Windows, MacOS and other operating systems.
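
As a concrete sketch of the prediction and serving bullets above, the snippet below loads the `lac` lexical analysis module (one of the models in this repo) and runs prediction in a few lines; the input sentence is just an example:

```python
import paddlehub as hub

# Load a pre-trained lexical analysis model by name;
# PaddleHub downloads and caches it on first use.
lac = hub.Module(name="lac")

# One call performs Chinese word segmentation, POS tagging and NER on a batch of texts.
results = lac.cut(text=["今天是个好日子"], use_gpu=False, batch_size=1, return_tag=True)
print(results)
```

For "Model As Service", the same module can be deployed as an HTTP endpoint with the one-line command `hub serving start -m lac` (PaddleHub serves it on port 8866 by default).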
### Recent updates
- **2021.12.22:** The v2.2.0 version is released. [1] More than 100 new models released, including dialog, speech, segmentation, OCR, text processing, GANs, and many other categories. The total number of pre-trained models reaches [**【360】**](https://www.paddlepaddle.org.cn/hublist). [2] Add an [indexed file](./modules/README.md) with useful information on the pre-trained models supported by PaddleHub. [3] Refactor the README of the pre-trained models.
- **2021.05.12:** Add an open-domain dialogue system, i.e., [plato-mini](https://www.paddlepaddle.org.cn/hubdetail?name=plato-mini&en_category=TextGeneration), to make it easy to build a chatbot in wechat with the help of the wechaty, [See Demo](https://github.com/KPatr1ck/paddlehub-wechaty-demo)
- **2021.04.27:** The v2.1.0 version is released. [1] Add support for five new models, including two high-precision semantic segmentation models based on the VOC dataset and three voice classification models. [2] Enhance the transfer learning capabilities for image semantic segmentation, text semantic matching and voice classification on related datasets. [3] Add export function APIs for two model formats, i.e., ONNX and PaddleInference. [4] Add support for [BentoML](https://github.com/bentoml/BentoML/), a cloud-native framework for serving deployment. Users can easily serve pre-trained models from PaddleHub by following the [Tutorial notebooks](https://github.com/PaddlePaddle/PaddleHub/blob/release/v2.1/demo/serving/bentoml/cloud-native-model-serving-with-bentoml.ipynb). Also see the announcement and [Release note](https://github.com/bentoml/BentoML/releases/tag/v0.12.1) from BentoML. (Many thanks to @[parano](https://github.com/parano) @[cqvu](https://github.com/cqvu) @[deehrlic](https://github.com/deehrlic) for contributing this feature to PaddleHub.) [5] The total number of pre-trained models reaches **【300】**.
- **2021.02.18:** The v2.0.0 version is released, making model development and debugging easier and the fine-tune task more flexible and easier to use. The transfer learning capabilities for visual tasks are fully upgraded, supporting various tasks such as image classification, image colorization, and style transfer; Transformer models such as BERT, ERNIE, and RoBERTa are upgraded to dynamic graphs, supporting fine-tune capabilities for text classification and sequence labeling; the Serving capability is optimized, supporting multi-card prediction and automatic load balancing, with greatly improved performance; the new automatic data augmentation capability Auto Augment can efficiently search for data augmentation strategy combinations suitable for a dataset. 61 new word vector models were added, including 51 Chinese models and 10 English models; 4 image segmentation models, 2 depth models, 7 image generation models, and 3 text generation models were also added, bringing the total number of pre-trained models to **【274】**.
@@ -44,8 +45,8 @@ English | [简体中文](README_ch.md)
## Visualization Demo [[More]](./docs/docs_en/visualization.md)
### **Computer Vision (161 models)**
## Visualization Demo [[More]](./docs/docs_en/visualization.md) [[ModelList]](./modules)
### **[Computer Vision (212 models)](./modules#Image)**
<div align="center">
<img src="./docs/imgs/Readme_Related/Image_all.gif" width = "530" height = "400" />
</div>
@@ -53,7 +54,7 @@ English | [简体中文](README_ch.md)
- Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), [AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2), [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [Zhengxia Zou](https://github.com/jiupinjia/SkyAR), [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models; you can try to train your models with them.
### **Natural Language Processing (129 models)**
### **[Natural Language Processing (130 models)](./modules#Text)**
<div align="center">
<img src="./docs/imgs/Readme_Related/Text_all.gif" width = "640" height = "240" />
</div>
@@ -62,9 +63,37 @@ English | [简体中文](README_ch.md)
### Speech (3 models)
### [Speech (15 models)](./modules#Audio)
- ASR speech recognition: multiple algorithms are available (see the usage sketch after the table below).
- The speech recognition effect is as follows:
<div align="center">
<table>
<thead>
<tr>
<th width=250> Input Audio </th>
<th width=550> Recognition Result </th>
</tr>
</thead>
<tbody>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250 ></a><br>
</td>
<td >I knocked at the door on the ancient side of the building.</td>
</tr>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250></a><br>
</td>
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
</tr>
</tbody>
</table>
</div>
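
A minimal usage sketch for the ASR modules (this assumes the `u2_conformer_wenetspeech` module listed later in this repo and its documented `speech_recognize` API; the WAV path refers to a local copy of the sample above):

```python
import paddlehub as hub

# Load a Conformer-based Chinese ASR model (downloaded on first use).
model = hub.Module(name="u2_conformer_wenetspeech")

# Transcribe a 16 kHz mono WAV file; the recognized text is returned as a string.
text = model.speech_recognize("zh.wav")
print(text)
```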
- TTS speech synthesis: multiple algorithms are available (see the usage sketch after the table below).
- Many thanks to CopyRight@[Parakeet](https://github.com/PaddlePaddle/Parakeet) for the pre-trained models; you can try to train your models with Parakeet.
- Input: `Life was like a box of chocolates, you never know what you're gonna get.`
- The synthesis effect is as follows:
<div align="center">
@@ -95,7 +124,9 @@ English | [简体中文](README_ch.md)
</table>
</div>
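
A hedged sketch for the TTS modules (assuming the `fastspeech2_baker` module listed later in this repo; the `generate` call follows that module's documented usage, so treat the exact signature as an assumption and check the module README):

```python
import paddlehub as hub

# Load a Chinese FastSpeech2 TTS model (downloaded on first use).
model = hub.Module(name="fastspeech2_baker")

# Synthesize speech for a batch of sentences; per the module docs this
# returns the paths of the generated .wav files (treat as an assumption).
wav_files = model.generate(["你好,欢迎使用语音合成。"])
print(wav_files)
```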
### Video (8 models)
- Many thanks to CopyRight@[PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) for the pre-trained models; you can try to train your models with PaddleSpeech.
### [Video (8 models)](./modules#Video)
- Short-video classification trained on large-scale video datasets, supporting prediction of 3000+ tag types for short-form videos.
- Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained model; you can try to train your models with PaddleVideo.
- `Example: input a short video of swimming, and the algorithm outputs "swimming"`
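
A minimal sketch of that example (assuming the `videotag_tsn_lstm` module listed later in this repo; `swimming.mp4` is a hypothetical local file):

```python
import paddlehub as hub

videotag = hub.Module(name="videotag_tsn_lstm")

# Tag a local short video; returns the predicted labels with confidence scores.
results = videotag.classify(paths=["swimming.mp4"], use_gpu=False)
print(results)  # expected to contain the "swimming" tag for this clip
```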
......
@@ -4,7 +4,7 @@
<img src="./docs/imgs/paddlehub_logo.jpg" align="middle">
<p align="center">
<div align="center">
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/zh_CN/release-v2.1//"> Tutorial </a> | <a href="https://www.paddlepaddle.org.cn/hublist"> Model Search </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a>
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/zh_CN/release-v2.1//"> Tutorial </a> | <a href="./modules/README_ch.md"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a>
</h3>
</div>
@@ -30,7 +30,7 @@
## Introduction and Features
- PaddleHub aims to provide developers with rich, high-quality, directly usable pre-trained models
- **【Abundant Models】**: covers 300+ pre-trained models across the five major categories of CV, NLP, Audio, Video and industrial applications, all open source for download and runnable offline
- **【Abundant Models】**: covers **360+** pre-trained models across the five major categories of CV, NLP, Audio, Video and industrial applications, all open source for download and runnable offline
- **【Very Low Barrier to Entry】**: no deep learning background, data or training process needed; AI models can be used quickly
- **【One-Line Model Prediction】**: models are invoked through a one-line command or a minimal Python API to quickly experience the model effect
- **【One-Line Model Serving】**: one command builds API serving deployment for a deep learning model
@@ -38,6 +38,7 @@
- **【Cross-platform Compatibility】**: runs on Linux, Windows, MacOS and other operating systems
## Recent Updates
- **2021.12.22:** The v2.2.0 version is released. [1] 100+ high-quality models added, covering dialogue, speech processing, semantic segmentation, text recognition, text processing, image generation and many other domains, bringing the total number of pre-trained models to [**【360+】**](https://www.paddlepaddle.org.cn/hublist); [2] a model [index list](./modules/README_ch.md) is added with model name, network, dataset, usage scenario and other information, so users can quickly locate the model they need; [3] model documentation layout is improved, presenting more practical information such as dataset, metrics and model size.
- **2021.05.12:** Add the lightweight Chinese dialogue model [plato-mini](https://www.paddlepaddle.org.cn/hubdetail?name=plato-mini&en_category=TextGeneration), which can be combined with wechaty to build a WeChat chatbot, [reference demo](https://github.com/KPatr1ck/paddlehub-wechaty-demo)
- **2021.04.27:** The v2.1.0 version is released. [1] Add 2 high-precision semantic segmentation models based on the VOC dataset and 3 voice classification models. [2] Add fine-tune capabilities and datasets for image semantic segmentation, text semantic matching, voice classification and related tasks; improve deployment capabilities: [3] add export of ONNX, PaddleInference and other model formats; [4] add [BentoML](https://github.com/bentoml/BentoML) cloud-native serving deployment, supporting a unified workflow for multi-framework model management and model deployment, [detailed tutorial](https://github.com/PaddlePaddle/PaddleHub/blob/release/v2.1/demo/serving/bentoml/cloud-native-model-serving-with-bentoml.ipynb); see also the latest BentoML v0.12.1 [Release note](https://github.com/bentoml/BentoML/releases/tag/v0.12.1) (thanks to @[parano](https://github.com/parano) @[cqvu](https://github.com/cqvu) @[deehrlic](https://github.com/deehrlic) for the contribution and support). [5] The total number of pre-trained models reaches [**【300】**](https://www.paddlepaddle.org.cn/hublist).
- **2021.02.18:** The v2.0.0 version is released. [1] Model development and debugging are easier, and the fine-tune APIs are more flexible and easy to use; transfer learning for visual tasks is fully upgraded, supporting [image classification](./demo/image_classification/README.md), [image colorization](./demo/colorization/README.md), [style transfer](./demo/style_transfer/README.md) and other tasks; Transformer models such as BERT, ERNIE and RoBERTa are upgraded to dynamic graphs, with fine-tune capabilities for [text classification](./demo/text_classification/README.md) and [sequence labeling](./demo/sequence_labeling/README.md); [2] Serving deployment is optimized, supporting multi-card prediction and automatic load balancing with greatly improved performance; [3] the automatic data augmentation capability [Auto Augment](./demo/autoaug/README.md) is added, efficiently searching for data augmentation strategy combinations suited to a dataset; [4] 61 [word vector models](./modules/text/embedding) are added (51 Chinese, 10 English), plus 4 [image segmentation](./modules/thirdparty/image/semantic_segmentation) models, 2 [depth models](./modules/thirdparty/image/depth_estimation), 7 [image generation](./modules/thirdparty/image/Image_gan/style_transfer) models and 3 [text generation](./modules/thirdparty/text/text_generation) models; [5] the total number of pre-trained models reaches [**【274】**](https://www.paddlepaddle.org.cn/hublist).
@@ -47,9 +48,9 @@
## **Demos of Selected Models [【More】](./docs/docs_ch/visualization.md)**
## **Demos of Selected Models [【More】](./docs/docs_ch/visualization.md) [【Models List】](./modules/README_ch.md)**
### **Image (161 models)**
### **[Image (212 models)](./modules/README_ch.md#Image)**
- Includes image classification, face detection, mask detection, vehicle detection, face/body/hand keypoint detection, portrait segmentation, text recognition in 80+ languages, image super-resolution/colorization/animation, etc.
<div align="center">
<img src="./docs/imgs/Readme_Related/Image_all.gif" width = "530" height = "400" />
@@ -58,7 +59,7 @@
- Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), [AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2), [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [Zhengxia Zou](https://github.com/jiupinjia/SkyAR), [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models; training capabilities are open, welcome to try them.
### **Text (129 models)**
### **[Text (130 models)](./modules/README_ch.md#Text)**
- Includes Chinese word segmentation, POS tagging and named entity recognition, syntactic analysis, AI poetry/couplets/love letters/acrostic poems, Chinese sentiment analysis, Chinese pornographic text review, etc.
<div align="center">
<img src="./docs/imgs/Readme_Related/Text_all.gif" width = "640" height = "240" />
@@ -67,9 +68,37 @@
- Many thanks to CopyRight@[ERNIE](https://github.com/PaddlePaddle/ERNIE), [LAC](https://github.com/baidu/LAC), [DDParser](https://github.com/baidu/DDParser) for the pre-trained models; training capabilities are open, welcome to try them.
### **Speech (3 models)**
### **[Speech (15 models)](./modules/README_ch.md#Audio)**
- ASR speech recognition: multiple algorithms are available
- The speech recognition effect is as follows:
<div align="center">
<table>
<thead>
<tr>
<th width=250> Input Audio </th>
<th width=550> Recognition Result </th>
</tr>
</thead>
<tbody>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/en.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250 ></a><br>
</td>
<td >I knocked at the door on the ancient side of the building.</td>
</tr>
<tr>
<td align = "center">
<a href="https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav" rel="nofollow">
<img align="center" src="./docs/imgs/Readme_Related/audio_icon.png" width=250></a><br>
</td>
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
</tr>
</tbody>
</table>
</div>
- TTS speech synthesis: multiple algorithms are available
- Many thanks to CopyRight@[Parakeet](https://github.com/PaddlePaddle/Parakeet) for the pre-trained models; training capabilities are open, welcome to try them.
- Input: `Life was like a box of chocolates, you never know what you're gonna get.`
- The synthesis effect is as follows:
<div align="center">
@@ -100,7 +129,9 @@
</table>
</div>
### **Video (8 models)**
- Many thanks to CopyRight@[PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech) for the pre-trained models; training capabilities are open, welcome to try them.
### **[Video (8 models)](./modules/README_ch.md#Video)**
- Includes short-video classification, supporting 3000+ tag types with TOP-K tag output; multiple algorithms are available.
- Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained models; training capabilities are open, welcome to try them.
- `Example: input a short video of swimming, and the algorithm outputs "swimming"`
......
English | [简体中文](README_ch.md)
# CONTENTS
|[Image](#Image) (212)|[Text](#Text) (130)|[Audio](#Audio) (15)|[Video](#Video) (8)|[Industrial Application](#Industrial-Application) (1)|
|--|--|--|--|--|
|[Image Classification](#Image-Classification) (108)|[Text Generation](#Text-Generation) (17)| [Voice Cloning](#Voice-Cloning) (2)|[Video Classification](#Video-Classification) (5)| [Meter Detection](#Meter-Detection) (1)|
|[Image Generation](#Image-Generation) (26)|[Word Embedding](#Word-Embedding) (62)|[Text to Speech](#Text-to-Speech) (5)|[Video Editing](#Video-Editing) (1)|-|
|[Keypoint Detection](#Keypoint-Detection) (5)|[Machine Translation](#Machine-Translation) (2)|[Automatic Speech Recognition](#Automatic-Speech-Recognition) (5)|[Multiple Object tracking](#Multiple-Object-tracking) (2)|-|
|[Semantic Segmentation](#Semantic-Segmentation) (25)|[Language Model](#Language-Model) (30)|[Audio Classification](#Audio-Classification) (3)| -|-|
|[Face Detection](#Face-Detection) (7)|[Sentiment Analysis](#Sentiment-Analysis) (7)|-|-|-|
|[Text Recognition](#Text-Recognition) (17)|[Syntactic Analysis](#Syntactic-Analysis) (1)|-|-|-|
|[Image Editing](#Image-Editing) (8)|[Simultaneous Translation](#Simultaneous-Translation) (5)|-|-|-|
|[Instance Segmentation](#Instance-Segmentation) (1)|[Lexical Analysis](#Lexical-Analysis) (2)|-|-|-|
|[Object Detection](#Object-Detection) (13)|[Punctuation Restoration](#Punctuation-Restoration) (1)|-|-|-|
|[Depth Estimation](#Depth-Estimation) (2)|[Text Review](#Text-Review) (3)|-|-|-|
## Image
- ### Image Classification
<details><summary>expand</summary><div>
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[DriverStatusRecognition](image/classification/DriverStatusRecognition)|MobileNetV3_small_ssld|Distracted driver detection dataset||
|[mobilenet_v2_animals](image/classification/mobilenet_v2_animals)|MobileNet_v2|Baidu self-built animal dataset||
|[repvgg_a1_imagenet](image/classification/repvgg_a1_imagenet)|RepVGG|ImageNet-2012||
|[repvgg_a0_imagenet](image/classification/repvgg_a0_imagenet)|RepVGG|ImageNet-2012||
|[resnext152_32x4d_imagenet](image/classification/resnext152_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[resnet_v2_152_imagenet](image/classification/resnet_v2_152_imagenet)|ResNet V2|ImageNet-2012||
|[resnet50_vd_animals](image/classification/resnet50_vd_animals)|ResNet50_vd|Baidu self-built animal dataset||
|[food_classification](image/classification/food_classification)|ResNet50_vd_ssld|Food dataset||
|[mobilenet_v3_large_imagenet_ssld](image/classification/mobilenet_v3_large_imagenet_ssld)|Mobilenet_v3_large|ImageNet-2012||
|[resnext152_vd_32x4d_imagenet](image/classification/resnext152_vd_32x4d_imagenet)||||
|[ghostnet_x1_3_imagenet_ssld](image/classification/ghostnet_x1_3_imagenet_ssld)|GhostNet|ImageNet-2012||
|[rexnet_1_5_imagenet](image/classification/rexnet_1_5_imagenet)|ReXNet|ImageNet-2012||
|[resnext50_64x4d_imagenet](image/classification/resnext50_64x4d_imagenet)|ResNeXt|ImageNet-2012||
|[resnext101_64x4d_imagenet](image/classification/resnext101_64x4d_imagenet)|ResNeXt|ImageNet-2012||
|[efficientnetb0_imagenet](image/classification/efficientnetb0_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb1_imagenet](image/classification/efficientnetb1_imagenet)|EfficientNet|ImageNet-2012||
|[mobilenet_v2_imagenet_ssld](image/classification/mobilenet_v2_imagenet_ssld)|Mobilenet_v2|ImageNet-2012||
|[resnet50_vd_dishes](image/classification/resnet50_vd_dishes)|ResNet50_vd|Baidu self-built dish dataset||
|[pnasnet_imagenet](image/classification/pnasnet_imagenet)|PNASNet|ImageNet-2012||
|[rexnet_2_0_imagenet](image/classification/rexnet_2_0_imagenet)|ReXNet|ImageNet-2012||
|[SnakeIdentification](image/classification/SnakeIdentification)|ResNet50_vd_ssld|Snake species dataset||
|[hrnet40_imagenet](image/classification/hrnet40_imagenet)|HRNet|ImageNet-2012||
|[resnet_v2_34_imagenet](image/classification/resnet_v2_34_imagenet)|ResNet V2|ImageNet-2012||
|[mobilenet_v2_dishes](image/classification/mobilenet_v2_dishes)|MobileNet_v2|Baidu self-built dish dataset||
|[resnext101_vd_32x4d_imagenet](image/classification/resnext101_vd_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[repvgg_b2g4_imagenet](image/classification/repvgg_b2g4_imagenet)|RepVGG|ImageNet-2012||
|[fix_resnext101_32x48d_wsl_imagenet](image/classification/fix_resnext101_32x48d_wsl_imagenet)|ResNeXt|ImageNet-2012||
|[vgg13_imagenet](image/classification/vgg13_imagenet)|VGG|ImageNet-2012||
|[se_resnext101_32x4d_imagenet](image/classification/se_resnext101_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012||
|[hrnet30_imagenet](image/classification/hrnet30_imagenet)|HRNet|ImageNet-2012||
|[ghostnet_x1_3_imagenet](image/classification/ghostnet_x1_3_imagenet)|GhostNet|ImageNet-2012||
|[dpn107_imagenet](image/classification/dpn107_imagenet)|DPN|ImageNet-2012||
|[densenet161_imagenet](image/classification/densenet161_imagenet)|DenseNet|ImageNet-2012||
|[vgg19_imagenet](image/classification/vgg19_imagenet)|vgg19_imagenet|ImageNet-2012||
|[mobilenet_v2_imagenet](image/classification/mobilenet_v2_imagenet)|Mobilenet_v2|ImageNet-2012||
|[resnet50_vd_10w](image/classification/resnet50_vd_10w)|ResNet_vd|Baidu self-built dataset||
|[resnet_v2_101_imagenet](image/classification/resnet_v2_101_imagenet)|ResNet V2 101|ImageNet-2012||
|[darknet53_imagenet](image/classification/darknet53_imagenet)|DarkNet|ImageNet-2012||
|[se_resnext50_32x4d_imagenet](image/classification/se_resnext50_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012||
|[se_hrnet64_imagenet_ssld](image/classification/se_hrnet64_imagenet_ssld)|HRNet|ImageNet-2012||
|[resnext101_32x16d_wsl](image/classification/resnext101_32x16d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[hrnet18_imagenet](image/classification/hrnet18_imagenet)|HRNet|ImageNet-2012||
|[spinalnet_res101_gemstone](image/classification/spinalnet_res101_gemstone)|resnet101|gemstone||
|[densenet264_imagenet](image/classification/densenet264_imagenet)|DenseNet|ImageNet-2012||
|[resnext50_vd_32x4d_imagenet](image/classification/resnext50_vd_32x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[SpinalNet_Gemstones](image/classification/SpinalNet_Gemstones)||||
|[spinalnet_vgg16_gemstone](image/classification/spinalnet_vgg16_gemstone)|vgg16|gemstone||
|[xception71_imagenet](image/classification/xception71_imagenet)|Xception|ImageNet-2012||
|[repvgg_b2_imagenet](image/classification/repvgg_b2_imagenet)|RepVGG|ImageNet-2012||
|[dpn68_imagenet](image/classification/dpn68_imagenet)|DPN|ImageNet-2012||
|[alexnet_imagenet](image/classification/alexnet_imagenet)|AlexNet|ImageNet-2012||
|[rexnet_1_3_imagenet](image/classification/rexnet_1_3_imagenet)|ReXNet|ImageNet-2012||
|[hrnet64_imagenet](image/classification/hrnet64_imagenet)|HRNet|ImageNet-2012||
|[efficientnetb7_imagenet](image/classification/efficientnetb7_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb0_small_imagenet](image/classification/efficientnetb0_small_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb6_imagenet](image/classification/efficientnetb6_imagenet)|EfficientNet|ImageNet-2012||
|[hrnet48_imagenet](image/classification/hrnet48_imagenet)|HRNet|ImageNet-2012||
|[rexnet_3_0_imagenet](image/classification/rexnet_3_0_imagenet)|ReXNet|ImageNet-2012||
|[shufflenet_v2_imagenet](image/classification/shufflenet_v2_imagenet)|ShuffleNet V2|ImageNet-2012||
|[ghostnet_x0_5_imagenet](image/classification/ghostnet_x0_5_imagenet)|GhostNet|ImageNet-2012||
|[inception_v4_imagenet](image/classification/inception_v4_imagenet)|Inception_V4|ImageNet-2012||
|[resnext101_vd_64x4d_imagenet](image/classification/resnext101_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[densenet201_imagenet](image/classification/densenet201_imagenet)|DenseNet|ImageNet-2012||
|[vgg16_imagenet](image/classification/vgg16_imagenet)|VGG|ImageNet-2012||
|[mobilenet_v3_small_imagenet_ssld](image/classification/mobilenet_v3_small_imagenet_ssld)|Mobilenet_v3_Small|ImageNet-2012||
|[hrnet18_imagenet_ssld](image/classification/hrnet18_imagenet_ssld)|HRNet|ImageNet-2012||
|[resnext152_64x4d_imagenet](image/classification/resnext152_64x4d_imagenet)|ResNeXt|ImageNet-2012||
|[efficientnetb3_imagenet](image/classification/efficientnetb3_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb2_imagenet](image/classification/efficientnetb2_imagenet)|EfficientNet|ImageNet-2012||
|[repvgg_b1g4_imagenet](image/classification/repvgg_b1g4_imagenet)|RepVGG|ImageNet-2012||
|[resnext101_32x4d_imagenet](image/classification/resnext101_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[resnext50_32x4d_imagenet](image/classification/resnext50_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[repvgg_a2_imagenet](image/classification/repvgg_a2_imagenet)|RepVGG|ImageNet-2012||
|[resnext152_vd_64x4d_imagenet](image/classification/resnext152_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[xception41_imagenet](image/classification/xception41_imagenet)|Xception|ImageNet-2012||
|[googlenet_imagenet](image/classification/googlenet_imagenet)|GoogleNet|ImageNet-2012||
|[resnet50_vd_imagenet_ssld](image/classification/resnet50_vd_imagenet_ssld)|ResNet_vd|ImageNet-2012||
|[repvgg_b1_imagenet](image/classification/repvgg_b1_imagenet)|RepVGG|ImageNet-2012||
|[repvgg_b0_imagenet](image/classification/repvgg_b0_imagenet)|RepVGG|ImageNet-2012||
|[resnet_v2_50_imagenet](image/classification/resnet_v2_50_imagenet)|ResNet V2|ImageNet-2012||
|[rexnet_1_0_imagenet](image/classification/rexnet_1_0_imagenet)|ReXNet|ImageNet-2012||
|[resnet_v2_18_imagenet](image/classification/resnet_v2_18_imagenet)|ResNet V2|ImageNet-2012||
|[resnext101_32x8d_wsl](image/classification/resnext101_32x8d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[efficientnetb4_imagenet](image/classification/efficientnetb4_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb5_imagenet](image/classification/efficientnetb5_imagenet)|EfficientNet|ImageNet-2012||
|[repvgg_b1g2_imagenet](image/classification/repvgg_b1g2_imagenet)|RepVGG|ImageNet-2012||
|[resnext101_32x48d_wsl](image/classification/resnext101_32x48d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[resnet50_vd_wildanimals](image/classification/resnet50_vd_wildanimals)|ResNet_vd|IFAW self-built wild animal dataset||
|[nasnet_imagenet](image/classification/nasnet_imagenet)|NASNet|ImageNet-2012||
|[se_resnet18_vd_imagenet](image/classification/se_resnet18_vd_imagenet)||||
|[spinalnet_res50_gemstone](image/classification/spinalnet_res50_gemstone)|resnet50|gemstone||
|[resnext50_vd_64x4d_imagenet](image/classification/resnext50_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[resnext101_32x32d_wsl](image/classification/resnext101_32x32d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[dpn131_imagenet](image/classification/dpn131_imagenet)|DPN|ImageNet-2012||
|[xception65_imagenet](image/classification/xception65_imagenet)|Xception|ImageNet-2012||
|[repvgg_b3g4_imagenet](image/classification/repvgg_b3g4_imagenet)|RepVGG|ImageNet-2012||
|[marine_biometrics](image/classification/marine_biometrics)|ResNet50_vd_ssld|Fish4Knowledge||
|[res2net101_vd_26w_4s_imagenet](image/classification/res2net101_vd_26w_4s_imagenet)|Res2Net|ImageNet-2012||
|[dpn98_imagenet](image/classification/dpn98_imagenet)|DPN|ImageNet-2012||
|[resnet18_vd_imagenet](image/classification/resnet18_vd_imagenet)|ResNet_vd|ImageNet-2012||
|[densenet121_imagenet](image/classification/densenet121_imagenet)|DenseNet|ImageNet-2012||
|[vgg11_imagenet](image/classification/vgg11_imagenet)|VGG|ImageNet-2012||
|[hrnet44_imagenet](image/classification/hrnet44_imagenet)|HRNet|ImageNet-2012||
|[densenet169_imagenet](image/classification/densenet169_imagenet)|DenseNet|ImageNet-2012||
|[hrnet32_imagenet](image/classification/hrnet32_imagenet)|HRNet|ImageNet-2012||
|[dpn92_imagenet](image/classification/dpn92_imagenet)|DPN|ImageNet-2012||
|[ghostnet_x1_0_imagenet](image/classification/ghostnet_x1_0_imagenet)|GhostNet|ImageNet-2012||
|[hrnet48_imagenet_ssld](image/classification/hrnet48_imagenet_ssld)|HRNet|ImageNet-2012||
</div></details>
- ### Image Generation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[pixel2style2pixel](image/Image_gan/gan/pixel2style2pixel/)|Pixel2Style2Pixel|-|Face frontalization|
|[stgan_bald](image/Image_gan/gan/stgan_bald/)|STGAN|CelebA|Bald-head generator|
|[styleganv2_editing](image/Image_gan/gan/styleganv2_editing)|StyleGAN V2|-|Face editing|
|[wav2lip](image/Image_gan/gan/wav2lip)|wav2lip|LRS2|Lip-sync generation|
|[attgan_celeba](image/Image_gan/attgan_celeba/)|AttGAN|Celeba|Face editing|
|[cyclegan_cityscapes](image/Image_gan/cyclegan_cityscapes)|CycleGAN|Cityscapes|Converts between real-scene images and semantic segmentation maps|
|[stargan_celeba](image/Image_gan/stargan_celeba)|StarGAN|Celeba|Face editing|
|[stgan_celeba](image/Image_gan/stgan_celeba/)|STGAN|Celeba|Face editing|
|[ID_Photo_GEN](image/Image_gan/style_transfer/ID_Photo_GEN)|HRNet_W18|-|ID photo generation|
|[Photo2Cartoon](image/Image_gan/style_transfer/Photo2Cartoon)|U-GAT-IT|cartoon_data|Face cartoonization|
|[U2Net_Portrait](image/Image_gan/style_transfer/U2Net_Portrait)|U^2Net|-|Face sketch generation|
|[UGATIT_100w](image/Image_gan/style_transfer/UGATIT_100w)|U-GAT-IT|selfie2anime|Face anime-style transfer|
|[UGATIT_83w](image/Image_gan/style_transfer/UGATIT_83w)|U-GAT-IT|selfie2anime|Face anime-style transfer|
|[UGATIT_92w](image/Image_gan/style_transfer/UGATIT_92w)|U-GAT-IT|selfie2anime|Face anime-style transfer|
|[animegan_v1_hayao_60](image/Image_gan/style_transfer/animegan_v1_hayao_60)|AnimeGAN|The Wind Rises|Image style transfer - Hayao Miyazaki|
|[animegan_v2_hayao_64](image/Image_gan/style_transfer/animegan_v2_hayao_64)|AnimeGAN|The Wind Rises|Image style transfer - Hayao Miyazaki|
|[animegan_v2_hayao_99](image/Image_gan/style_transfer/animegan_v2_hayao_99)|AnimeGAN|The Wind Rises|Image style transfer - Hayao Miyazaki|
|[animegan_v2_paprika_54](image/Image_gan/style_transfer/animegan_v2_paprika_54)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_paprika_74](image/Image_gan/style_transfer/animegan_v2_paprika_74)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_paprika_97](image/Image_gan/style_transfer/animegan_v2_paprika_97)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_paprika_98](image/Image_gan/style_transfer/animegan_v2_paprika_98)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_shinkai_33](image/Image_gan/style_transfer/animegan_v2_shinkai_33)|AnimeGAN|Your Name, Weathering with you|Image style transfer - Makoto Shinkai|
|[animegan_v2_shinkai_53](image/Image_gan/style_transfer/animegan_v2_shinkai_53)|AnimeGAN|Your Name, Weathering with you|Image style transfer - Makoto Shinkai|
|[msgnet](image/Image_gan/style_transfer/msgnet)|msgnet|COCO2014||
|[stylepro_artistic](image/Image_gan/style_transfer/stylepro_artistic)|StyleProNet|MS-COCO + WikiArt|Artistic style transfer|
|stylegan_ffhq|StyleGAN|FFHQ|Image style transfer|
- ### Keypoint Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[face_landmark_localization](image/keypoint_detection/face_landmark_localization)|Face_Landmark|AFW/AFLW|Facial landmark detection|
|[hand_pose_localization](image/keypoint_detection/hand_pose_localization)|-|MPII, NZSL|Hand keypoint detection|
|[openpose_body_estimation](image/keypoint_detection/openpose_body_estimation)|two-branch multi-stage CNN|MPII, COCO 2016|Body keypoint detection|
|[human_pose_estimation_resnet50_mpii](image/keypoint_detection/human_pose_estimation_resnet50_mpii)|Pose_Resnet50|MPII|Human skeleton keypoint detection|
|[openpose_hands_estimation](image/keypoint_detection/openpose_hands_estimation)|-|MPII, NZSL|Hand keypoint detection|
- ### Semantic Segmentation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[deeplabv3p_xception65_humanseg](image/semantic_segmentation/deeplabv3p_xception65_humanseg)|deeplabv3p|Baidu self-built dataset|Portrait segmentation|
|[humanseg_server](image/semantic_segmentation/humanseg_server)|deeplabv3p|Baidu self-built dataset|Portrait segmentation|
|[humanseg_mobile](image/semantic_segmentation/humanseg_mobile)|hrnet|Baidu self-built dataset|Portrait segmentation for mobile front cameras|
|[humanseg_lite](image/semantic_segmentation/humanseg_lite)|shufflenet|Baidu self-built dataset|Lightweight real-time portrait segmentation on mobile|
|[ExtremeC3_Portrait_Segmentation](image/semantic_segmentation/ExtremeC3_Portrait_Segmentation)|ExtremeC3|EG1800, Baidu fashion dataset|Lightweight portrait segmentation|
|[SINet_Portrait_Segmentation](image/semantic_segmentation/SINet_Portrait_Segmentation)|SINet|EG1800, Baidu fashion dataset|Lightweight portrait segmentation|
|[FCN_HRNet_W18_Face_Seg](image/semantic_segmentation/FCN_HRNet_W18_Face_Seg)|FCN_HRNet_W18|-|Portrait segmentation|
|[ace2p](image/semantic_segmentation/ace2p)|ACE2P|LIP|Human parsing|
|[Pneumonia_CT_LKM_PP](image/semantic_segmentation/Pneumonia_CT_LKM_PP)|U-NET+|De-identified dataset authorized by 连心医疗|Pneumonia CT image analysis|
|[Pneumonia_CT_LKM_PP_lung](image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung)|U-NET+|De-identified dataset authorized by 连心医疗|Pneumonia CT image analysis|
|[ocrnet_hrnetw18_voc](image/semantic_segmentation/ocrnet_hrnetw18_voc)|ocrnet, hrnet|PascalVoc2012||
|[U2Net](image/semantic_segmentation/U2Net)|U^2Net|-|Image foreground/background segmentation|
|[U2Netp](image/semantic_segmentation/U2Netp)|U^2Net|-|Image foreground/background segmentation|
|[Extract_Line_Draft](image/semantic_segmentation/Extract_Line_Draft)|UNet|Pixiv|Line draft extraction|
|[unet_cityscapes](image/semantic_segmentation/unet_cityscapes)|UNet|cityscapes||
|[ocrnet_hrnetw18_cityscapes](image/semantic_segmentation/ocrnet_hrnetw18_cityscapes)|ocrnet_hrnetw18|cityscapes||
|[hardnet_cityscapes](image/semantic_segmentation/hardnet_cityscapes)|hardnet|cityscapes||
|[fcn_hrnetw48_voc](image/semantic_segmentation/fcn_hrnetw48_voc)|fcn_hrnetw48|PascalVoc2012||
|[fcn_hrnetw48_cityscapes](image/semantic_segmentation/fcn_hrnetw48_cityscapes)|fcn_hrnetw48|cityscapes||
|[fcn_hrnetw18_voc](image/semantic_segmentation/fcn_hrnetw18_voc)|fcn_hrnetw18|PascalVoc2012||
|[fcn_hrnetw18_cityscapes](image/semantic_segmentation/fcn_hrnetw18_cityscapes)|fcn_hrnetw18|cityscapes||
|[fastscnn_cityscapes](image/semantic_segmentation/fastscnn_cityscapes)|fastscnn|cityscapes||
|[deeplabv3p_resnet50_voc](image/semantic_segmentation/deeplabv3p_resnet50_voc)|deeplabv3p, resnet50|PascalVoc2012||
|[deeplabv3p_resnet50_cityscapes](image/semantic_segmentation/deeplabv3p_resnet50_cityscapes)|deeplabv3p, resnet50|cityscapes||
|[bisenetv2_cityscapes](image/semantic_segmentation/bisenetv2_cityscapes)|bisenetv2|cityscapes||
- ### Face Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[pyramidbox_lite_mobile](image/face_detection/pyramidbox_lite_mobile)|PyramidBox|WIDER FACE + Baidu self-collected face dataset|Lightweight face detection for mobile|
|[pyramidbox_lite_mobile_mask](image/face_detection/pyramidbox_lite_mobile_mask)|PyramidBox|WIDER FACE + Baidu self-collected face dataset|Lightweight face mask detection for mobile|
|[pyramidbox_lite_server_mask](image/face_detection/pyramidbox_lite_server_mask)|PyramidBox|WIDER FACE + Baidu self-collected face dataset|Lightweight face mask detection|
|[ultra_light_fast_generic_face_detector_1mb_640](image/face_detection/ultra_light_fast_generic_face_detector_1mb_640)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE|Lightweight generic face detection for low-compute devices|
|[ultra_light_fast_generic_face_detector_1mb_320](image/face_detection/ultra_light_fast_generic_face_detector_1mb_320)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE|Lightweight generic face detection for low-compute devices|
|[pyramidbox_lite_server](image/face_detection/pyramidbox_lite_server)|PyramidBox|WIDER FACE + Baidu self-collected face dataset|Lightweight face detection|
|[pyramidbox_face_detection](image/face_detection/pyramidbox_face_detection)|PyramidBox|WIDER FACE|Face detection|
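
A minimal usage sketch for the face detection modules above (assuming `pyramidbox_lite_mobile`'s documented `face_detection` API; `face.jpg` is a hypothetical local image):

```python
import cv2
import paddlehub as hub

detector = hub.Module(name="pyramidbox_lite_mobile")

# Detect faces in a local image; each entry carries a bounding box and confidence.
results = detector.face_detection(images=[cv2.imread("face.jpg")])
print(results)
```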
- ### Text Recognition
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[chinese_ocr_db_crnn_mobile](image/text_recognition/chinese_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Chinese text recognition|
|[chinese_text_detection_db_mobile](image/text_recognition/chinese_text_detection_db_mobile)|Differentiable Binarization|icdar2015|Chinese text detection|
|[chinese_text_detection_db_server](image/text_recognition/chinese_text_detection_db_server)|Differentiable Binarization|icdar2015|Chinese text detection|
|[chinese_ocr_db_crnn_server](image/text_recognition/chinese_ocr_db_crnn_server)|Differentiable Binarization+CRNN|icdar2015|Chinese text recognition|
|[Vehicle_License_Plate_Recognition](image/text_recognition/Vehicle_License_Plate_Recognition)|-|CCPD|License plate recognition|
|[chinese_cht_ocr_db_crnn_mobile](image/text_recognition/chinese_cht_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Traditional Chinese text recognition|
|[japan_ocr_db_crnn_mobile](image/text_recognition/japan_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Japanese text recognition|
|[korean_ocr_db_crnn_mobile](image/text_recognition/korean_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Korean text recognition|
|[german_ocr_db_crnn_mobile](image/text_recognition/german_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|German text recognition|
|[french_ocr_db_crnn_mobile](image/text_recognition/french_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|French text recognition|
|[latin_ocr_db_crnn_mobile](image/text_recognition/latin_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Latin text recognition|
|[cyrillic_ocr_db_crnn_mobile](image/text_recognition/cyrillic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Cyrillic text recognition|
|[multi_languages_ocr_db_crnn](image/text_recognition/multi_languages_ocr_db_crnn)|Differentiable Binarization+CRNN|icdar2015|Multilingual text recognition|
|[kannada_ocr_db_crnn_mobile](image/text_recognition/kannada_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Kannada text recognition|
|[arabic_ocr_db_crnn_mobile](image/text_recognition/arabic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Arabic text recognition|
|[telugu_ocr_db_crnn_mobile](image/text_recognition/telugu_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Telugu text recognition|
|[devanagari_ocr_db_crnn_mobile](image/text_recognition/devanagari_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Devanagari text recognition|
|[tamil_ocr_db_crnn_mobile](image/text_recognition/tamil_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015|Tamil text recognition|
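
A minimal usage sketch for the OCR modules above (assuming `chinese_ocr_db_crnn_mobile`'s documented `recognize_text` API; `doc.jpg` is a hypothetical local image):

```python
import cv2
import paddlehub as hub

ocr = hub.Module(name="chinese_ocr_db_crnn_mobile")

# Detect text regions and recognize their content; returns boxes, text and confidence.
results = ocr.recognize_text(images=[cv2.imread("doc.jpg")])
print(results)
```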
- ### Image Editing
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[realsr](image/Image_editing/super_resolution/realsr)|LP-KPN|RealSR dataset|4x image/video super-resolution|
|[deoldify](image/Image_editing/colorization/deoldify)|GAN|ILSVRC 2012|Black-and-white photo/video colorization|
|[photo_restoration](image/Image_editing/colorization/photo_restoration)|Based on the deoldify and realsr models|-|Old photo restoration|
|[user_guided_colorization](image/Image_editing/colorization/user_guided_colorization)|siggraph|ILSVRC 2012|Image colorization|
|[falsr_c](image/Image_editing/super_resolution/falsr_c)|falsr_c|DIV2K|Lightweight 2x super-resolution|
|[dcscn](image/Image_editing/super_resolution/dcscn)|dcscn|DIV2K|Lightweight 2x super-resolution|
|[falsr_a](image/Image_editing/super_resolution/falsr_a)|falsr_a|DIV2K|Lightweight 2x super-resolution|
|[falsr_b](image/Image_editing/super_resolution/falsr_b)|falsr_b|DIV2K|Lightweight 2x super-resolution|
- ### Instance Segmentation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[solov2](image/instance_segmentation/solov2)|-|COCO2014|Instance segmentation|
- ### Object Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[faster_rcnn_resnet50_coco2017](image/object_detection/faster_rcnn_resnet50_coco2017)|faster_rcnn|COCO2017||
|[ssd_vgg16_512_coco2017](image/object_detection/ssd_vgg16_512_coco2017)|SSD|COCO2017||
|[faster_rcnn_resnet50_fpn_venus](image/object_detection/faster_rcnn_resnet50_fpn_venus)|faster_rcnn|Baidu self-built dataset|Large-scale general object detection|
|[ssd_vgg16_300_coco2017](image/object_detection/ssd_vgg16_300_coco2017)||||
|[yolov3_resnet34_coco2017](image/object_detection/yolov3_resnet34_coco2017)|YOLOv3|COCO2017||
|[yolov3_darknet53_pedestrian](image/object_detection/yolov3_darknet53_pedestrian)|YOLOv3|Baidu self-built large-scale pedestrian dataset|Pedestrian detection|
|[yolov3_mobilenet_v1_coco2017](image/object_detection/yolov3_mobilenet_v1_coco2017)|YOLOv3|COCO2017||
|[ssd_mobilenet_v1_pascal](image/object_detection/ssd_mobilenet_v1_pascal)|SSD|PASCAL VOC||
|[faster_rcnn_resnet50_fpn_coco2017](image/object_detection/faster_rcnn_resnet50_fpn_coco2017)|faster_rcnn|COCO2017||
|[yolov3_darknet53_coco2017](image/object_detection/yolov3_darknet53_coco2017)|YOLOv3|COCO2017||
|[yolov3_darknet53_vehicles](image/object_detection/yolov3_darknet53_vehicles)|YOLOv3|Baidu self-built large-scale vehicle dataset|Vehicle detection|
|[yolov3_darknet53_venus](image/object_detection/yolov3_darknet53_venus)|YOLOv3|Baidu self-built dataset|Large-scale general detection|
|[yolov3_resnet50_vd_coco2017](image/object_detection/yolov3_resnet50_vd_coco2017)|YOLOv3|COCO2017||
- ### Depth Estimation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[MiDaS_Large](image/depth_estimation/MiDaS_Large)|-|3D Movies, WSVD, ReDWeb, MegaDepth||
|[MiDaS_Small](image/depth_estimation/MiDaS_Small)|-|3D Movies, WSVD, ReDWeb, MegaDepth, etc.||
## Text
- ### Text Generation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[ernie_gen](text/text_generation/ernie_gen)|ERNIE-GEN|-|Pre-training/fine-tuning framework for generation tasks|
|[ernie_gen_poetry](text/text_generation/ernie_gen_poetry)|ERNIE-GEN|Open-source poetry dataset|Poetry generation|
|[ernie_gen_couplet](text/text_generation/ernie_gen_couplet)|ERNIE-GEN|Open-source couplet dataset|Couplet generation|
|[ernie_gen_lover_words](text/text_generation/ernie_gen_lover_words)|ERNIE-GEN|Online love poem and love-talk data|Love-talk generation|
|[ernie_tiny_couplet](text/text_generation/ernie_tiny_couplet)|ernie_tiny|Open-source couplet dataset|Couplet generation|
|[ernie_gen_acrostic_poetry](text/text_generation/ernie_gen_acrostic_poetry)|ERNIE-GEN|Open-source poetry dataset|Acrostic poetry generation|
|[Rumor_prediction](text/text_generation/Rumor_prediction)|-|Sina Weibo Chinese rumor data|Rumor prediction|
|[plato-mini](text/text_generation/plato-mini)|Unified Transformer|Billion-scale Chinese conversation data|Chinese dialogue|
|[plato2_en_large](text/text_generation/plato2_en_large)|plato2|Open-domain multi-turn dataset|Ultra-large-scale generative dialogue|
|[plato2_en_base](text/text_generation/plato2_en_base)|plato2|Open-domain multi-turn dataset|Ultra-large-scale generative dialogue|
|[CPM_LM](text/text_generation/CPM_LM)|GPT-2|Self-built dataset|Chinese text generation|
|[unified_transformer-12L-cn](text/text_generation/unified_transformer-12L-cn)|Unified Transformer|Tens of millions of Chinese conversations|Multi-turn human-machine dialogue|
|[unified_transformer-12L-cn-luge](text/text_generation/unified_transformer-12L-cn-luge)|Unified Transformer|LUGE (千言) dialogue dataset|Multi-turn human-machine dialogue|
|[reading_pictures_writing_poems](text/text_generation/reading_pictures_writing_poems)|Multi-network cascade|-|Writing poems from pictures|
|[GPT2_CPM_LM](text/text_generation/GPT2_CPM_LM)|||Q&A-style text generation|
|[GPT2_Base_CN](text/text_generation/GPT2_Base_CN)|||Q&A-style text generation|
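
A minimal usage sketch for the generation modules above (assuming `ernie_gen_couplet`'s documented `generate` API; the input first line is just an example):

```python
import paddlehub as hub

module = hub.Module(name="ernie_gen_couplet")

# Given the first line of a couplet, generate candidate second lines;
# beam_width controls how many candidates are returned per input.
results = module.generate(texts=["人增福寿年增岁"], beam_width=5)
print(results)
```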
- ### Word Embedding
<details><summary>expand</summary><div>
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[w2v_weibo_target_word-bigram_dim300](text/embedding/w2v_weibo_target_word-bigram_dim300)|w2v|weibo||
|[w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_literature_target_word-word_dim300](text/embedding/w2v_literature_target_word-word_dim300)|w2v|literature||
|[word2vec_skipgram](text/embedding/word2vec_skipgram)|skip-gram|Baidu self-built dataset||
|[w2v_sogou_target_word-char_dim300](text/embedding/w2v_sogou_target_word-char_dim300)|w2v|sogou||
|[w2v_weibo_target_bigram-char_dim300](text/embedding/w2v_weibo_target_bigram-char_dim300)|w2v|weibo||
|[w2v_zhihu_target_word-bigram_dim300](text/embedding/w2v_zhihu_target_word-bigram_dim300)|w2v|zhihu||
|[w2v_financial_target_word-word_dim300](text/embedding/w2v_financial_target_word-word_dim300)|w2v|financial||
|[w2v_wiki_target_word-word_dim300](text/embedding/w2v_wiki_target_word-word_dim300)|w2v|wiki||
|[w2v_baidu_encyclopedia_context_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-word_dim300)|w2v|baidu_encyclopedia||
|[w2v_weibo_target_word-word_dim300](text/embedding/w2v_weibo_target_word-word_dim300)|w2v|weibo||
|[w2v_zhihu_target_bigram-char_dim300](text/embedding/w2v_zhihu_target_bigram-char_dim300)|w2v|zhihu||
|[w2v_zhihu_target_word-word_dim300](text/embedding/w2v_zhihu_target_word-word_dim300)|w2v|zhihu||
|[w2v_people_daily_target_word-char_dim300](text/embedding/w2v_people_daily_target_word-char_dim300)|w2v|people_daily||
|[w2v_sikuquanshu_target_word-word_dim300](text/embedding/w2v_sikuquanshu_target_word-word_dim300)|w2v|sikuquanshu||
|[glove_twitter_target_word-word_dim200_en](text/embedding/glove_twitter_target_word-word_dim200_en)|glove|twitter||
|[fasttext_crawl_target_word-word_dim300_en](text/embedding/fasttext_crawl_target_word-word_dim300_en)|fasttext|crawl||
|[w2v_wiki_target_word-bigram_dim300](text/embedding/w2v_wiki_target_word-bigram_dim300)|w2v|wiki||
|[w2v_baidu_encyclopedia_context_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-1_dim300)|w2v|baidu_encyclopedia||
|[glove_wiki2014-gigaword_target_word-word_dim300_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim300_en)|glove|wiki2014-gigaword||
|[glove_wiki2014-gigaword_target_word-word_dim50_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim50_en)|glove|wiki2014-gigaword||
|[w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_wiki_target_bigram-char_dim300](text/embedding/w2v_wiki_target_bigram-char_dim300)|w2v|wiki||
|[w2v_baidu_encyclopedia_target_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-1_dim300)|w2v|baidu_encyclopedia||
|[w2v_financial_target_bigram-char_dim300](text/embedding/w2v_financial_target_bigram-char_dim300)|w2v|financial||
|[glove_wiki2014-gigaword_target_word-word_dim200_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim200_en)|glove|wiki2014-gigaword||
|[w2v_financial_target_word-bigram_dim300](text/embedding/w2v_financial_target_word-bigram_dim300)|w2v|financial||
|[w2v_mixed-large_target_word-char_dim300](text/embedding/w2v_mixed-large_target_word-char_dim300)|w2v|mixed||
|[w2v_baidu_encyclopedia_target_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordPosition_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_target_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordLR_dim300)|w2v|baidu_encyclopedia||
|[w2v_sogou_target_bigram-char_dim300](text/embedding/w2v_sogou_target_bigram-char_dim300)|w2v|sogou||
|[w2v_weibo_target_word-char_dim300](text/embedding/w2v_weibo_target_word-char_dim300)|w2v|weibo||
|[w2v_people_daily_target_word-word_dim300](text/embedding/w2v_people_daily_target_word-word_dim300)|w2v|people_daily||
|[w2v_zhihu_target_word-char_dim300](text/embedding/w2v_zhihu_target_word-char_dim300)|w2v|zhihu||
|[w2v_wiki_target_word-char_dim300](text/embedding/w2v_wiki_target_word-char_dim300)|w2v|wiki||
|[w2v_sogou_target_word-bigram_dim300](text/embedding/w2v_sogou_target_word-bigram_dim300)|w2v|sogou||
|[w2v_financial_target_word-char_dim300](text/embedding/w2v_financial_target_word-char_dim300)|w2v|financial||
|[w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia||
|[glove_wiki2014-gigaword_target_word-word_dim100_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim100_en)|glove|wiki2014-gigaword||
|[w2v_baidu_encyclopedia_target_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-4_dim300)|w2v|baidu_encyclopedia||
|[w2v_sogou_target_word-word_dim300](text/embedding/w2v_sogou_target_word-word_dim300)|w2v|sogou||
|[w2v_literature_target_word-char_dim300](text/embedding/w2v_literature_target_word-char_dim300)|w2v|literature||
|[w2v_baidu_encyclopedia_target_bigram-char_dim300](text/embedding/w2v_baidu_encyclopedia_target_bigram-char_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_target_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-word_dim300)|w2v|baidu_encyclopedia||
|[glove_twitter_target_word-word_dim100_en](text/embedding/glove_twitter_target_word-word_dim100_en)|glove|twitter||
|[w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_context_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-4_dim300)|w2v|baidu_encyclopedia||
|[w2v_literature_target_bigram-char_dim300](text/embedding/w2v_literature_target_bigram-char_dim300)|w2v|literature||
|[fasttext_wiki-news_target_word-word_dim300_en](text/embedding/fasttext_wiki-news_target_word-word_dim300_en)|fasttext|wiki-news||
|[w2v_people_daily_target_word-bigram_dim300](text/embedding/w2v_people_daily_target_word-bigram_dim300)|w2v|people_daily||
|[w2v_mixed-large_target_word-word_dim300](text/embedding/w2v_mixed-large_target_word-word_dim300)|w2v|mixed||
|[w2v_people_daily_target_bigram-char_dim300](text/embedding/w2v_people_daily_target_bigram-char_dim300)|w2v|people_daily||
|[w2v_literature_target_word-bigram_dim300](text/embedding/w2v_literature_target_word-bigram_dim300)|w2v|literature||
|[glove_twitter_target_word-word_dim25_en](text/embedding/glove_twitter_target_word-word_dim25_en)|glove|twitter||
|[w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_sikuquanshu_target_word-bigram_dim300](text/embedding/w2v_sikuquanshu_target_word-bigram_dim300)|w2v|sikuquanshu||
|[w2v_baidu_encyclopedia_context_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-2_dim300)|w2v|baidu_encyclopedia||
|[glove_twitter_target_word-word_dim50_en](text/embedding/glove_twitter_target_word-word_dim50_en)|glove|twitter||
|[w2v_baidu_encyclopedia_context_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordLR_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_target_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_context_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordPosition_dim300)|w2v|baidu_encyclopedia||
</div></details>
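
A minimal usage sketch for the embedding modules above (assuming the `search`/`cosine_sim` APIs documented for PaddleHub word embedding modules; the module name is one row from the table):

```python
import paddlehub as hub

# Load a 300-dimensional Chinese word2vec embedding.
embedding = hub.Module(name="w2v_baidu_encyclopedia_target_word-word_dim300")

# Look up a word vector and compare two words by cosine similarity.
vector = embedding.search("中国")
print(embedding.cosine_sim("中国", "美国"))
```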
- ### Machine Translation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[transformer_zh-en](text/machine_translation/transformer/transformer_zh-en)|Transformer|CWMT2021|Chinese-to-English translation|
|[transformer_en-de](text/machine_translation/transformer/transformer_en-de)|Transformer|WMT14 EN-DE|English-to-German translation|
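
A hedged usage sketch for the translation modules above (assuming `transformer_zh-en`'s documented `predict` API and `beam_size` argument; treat the exact signature as an assumption):

```python
import paddlehub as hub

model = hub.Module(name="transformer_zh-en", beam_size=5)

# Translate a batch of Chinese sentences into English.
results = model.predict(["今天天气怎么样?"])
print(results)
```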
- ### Language Model
<details><summary>expand</summary><div>
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[chinese_electra_small](text/language_model/chinese_electra_small)||||
|[chinese_electra_base](text/language_model/chinese_electra_base)||||
|[roberta-wwm-ext-large](text/language_model/roberta-wwm-ext-large)|roberta-wwm-ext-large|Baidu self-built dataset||
|[chinese-bert-wwm-ext](text/language_model/chinese_bert_wwm_ext)|chinese-bert-wwm-ext|Baidu self-built dataset||
|[lda_webpage](text/language_model/lda_webpage)|LDA|Baidu self-built webpage-domain dataset||
|[lda_novel](text/language_model/lda_novel)||||
|[bert-base-multilingual-uncased](text/language_model/bert-base-multilingual-uncased)||||
|[rbt3](text/language_model/rbt3)||||
|[ernie_v2_eng_base](text/language_model/ernie_v2_eng_base)|ernie_v2_eng_base|Baidu self-built dataset||
|[bert-base-multilingual-cased](text/language_model/bert-base-multilingual-cased)||||
|[rbtl3](text/language_model/rbtl3)||||
|[chinese-bert-wwm](text/language_model/chinese_bert_wwm)|chinese-bert-wwm|Baidu self-built dataset||
|[bert-large-uncased](text/language_model/bert-large-uncased)||||
|[slda_novel](text/language_model/slda_novel)||||
|[slda_news](text/language_model/slda_news)||||
|[electra_small](text/language_model/electra_small)||||
|[slda_webpage](text/language_model/slda_webpage)||||
|[bert-base-cased](text/language_model/bert-base-cased)||||
|[slda_weibo](text/language_model/slda_weibo)||||
|[roberta-wwm-ext](text/language_model/roberta-wwm-ext)|roberta-wwm-ext|Baidu self-built dataset||
|[bert-base-uncased](text/language_model/bert-base-uncased)||||
|[electra_large](text/language_model/electra_large)||||
|[ernie](text/language_model/ernie)|ernie-1.0|Baidu self-built dataset||
|[simnet_bow](text/language_model/simnet_bow)|BOW|Baidu self-built dataset||
|[ernie_tiny](text/language_model/ernie_tiny)|ernie_tiny|Baidu self-built dataset||
|[bert-base-chinese](text/language_model/bert-base-chinese)|bert-base-chinese|Baidu self-built dataset||
|[lda_news](text/language_model/lda_news)|LDA|Baidu self-built news-domain dataset||
|[electra_base](text/language_model/electra_base)||||
|[ernie_v2_eng_large](text/language_model/ernie_v2_eng_large)|ernie_v2_eng_large|Baidu self-built dataset||
|[bert-large-cased](text/language_model/bert-large-cased)||||
</div></details>
- ### Sentiment Analysis
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[ernie_skep_sentiment_analysis](text/sentiment_analysis/ernie_skep_sentiment_analysis)|SKEP|Baidu self-built dataset|Sentence-level sentiment analysis|
|[emotion_detection_textcnn](text/sentiment_analysis/emotion_detection_textcnn)|TextCNN|Baidu self-built dataset|Dialogue emotion detection|
|[senta_bilstm](text/sentiment_analysis/senta_bilstm)|BiLSTM|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_bow](text/sentiment_analysis/senta_bow)|BOW|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_gru](text/sentiment_analysis/senta_gru)|GRU|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_lstm](text/sentiment_analysis/senta_lstm)|LSTM|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_cnn](text/sentiment_analysis/senta_cnn)|CNN|Baidu self-built dataset|Chinese sentiment analysis|
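
A minimal usage sketch for the sentiment modules above (assuming `senta_bilstm`'s documented `sentiment_classify` API; the input text is just an example):

```python
import paddlehub as hub

senta = hub.Module(name="senta_bilstm")

# Classify sentiment for Chinese text; returns positive/negative probabilities.
results = senta.sentiment_classify(texts=["这家餐厅味道很不错"])
print(results)
```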
- ### Syntactic Analysis
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[DDParser](text/syntactic_analysis/DDParser)|Deep Biaffine Attention|Search queries, web text, voice input and other data|Syntactic analysis|
- ### Simultaneous Translation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[transformer_nist_wait_1](text/simultaneous_translation/stacl/transformer_nist_wait_1)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English simultaneous translation, wait-1 policy|
|[transformer_nist_wait_3](text/simultaneous_translation/stacl/transformer_nist_wait_3)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English simultaneous translation, wait-3 policy|
|[transformer_nist_wait_5](text/simultaneous_translation/stacl/transformer_nist_wait_5)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English simultaneous translation, wait-5 policy|
|[transformer_nist_wait_7](text/simultaneous_translation/stacl/transformer_nist_wait_7)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English simultaneous translation, wait-7 policy|
|[transformer_nist_wait_all](text/simultaneous_translation/stacl/transformer_nist_wait_all)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English simultaneous translation, wait-k=-1 (wait-all) policy|
- ### Lexical Analysis
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[jieba_paddle](text/lexical_analysis/jieba_paddle)|BiGRU+CRF|Baidu self-built dataset|jieba's word segmentation network (bidirectional GRU) built with Paddle; also supports jieba's traditional segmentation modes such as accurate mode, full mode and search-engine mode.|
|[lac](text/lexical_analysis/lac)|BiGRU+CRF|Baidu self-built dataset|Baidu's joint lexical analysis model, which performs Chinese word segmentation, POS tagging and proper-noun recognition as a whole. On Baidu's self-built dataset, LAC scores Precision=88.0%, Recall=88.7%, F1-Score=88.4%.|
- ### Punctuation Restoration
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[auto_punc](text/punctuation_restoration/auto_punc)|Ernie-1.0|WuDaoCorpora 2.0|Automatically adds 7 kinds of punctuation marks|
- ### Text Review
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[porn_detection_cnn](text/text_review/porn_detection_cnn)|CNN|Baidu self-built dataset|Porn detection: automatically judges whether a text is pornographic and gives a confidence score, identifying pornographic descriptions, vulgar dating content and obscene text|
|[porn_detection_gru](text/text_review/porn_detection_gru)|GRU|Baidu self-built dataset|Porn detection: automatically judges whether a text is pornographic and gives a confidence score, identifying pornographic descriptions, vulgar dating content and obscene text|
|[porn_detection_lstm](text/text_review/porn_detection_lstm)|LSTM|Baidu self-built dataset|Porn detection: automatically judges whether a text is pornographic and gives a confidence score, identifying pornographic descriptions, vulgar dating content and obscene text|
## Audio
- ### Voice Cloning
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[ge2e_fastspeech2_pwgan](audio/voice_cloning/ge2e_fastspeech2_pwgan)|FastSpeech2|AISHELL-3|Chinese voice cloning|
|[lstm_tacotron2](audio/voice_cloning/lstm_tacotron2)|LSTM, Tacotron2, WaveFlow|AISHELL-3|Chinese voice cloning|
- ### Text to Speech
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[transformer_tts_ljspeech](audio/tts/transformer_tts_ljspeech)|Transformer|LJSpeech-1.1|English text-to-speech|
|[fastspeech_ljspeech](audio/tts/fastspeech_ljspeech)|FastSpeech|LJSpeech-1.1|English text-to-speech|
|[fastspeech2_baker](audio/tts/fastspeech2_baker)|FastSpeech2|Chinese Standard Mandarin Speech Corpus|Chinese text-to-speech|
|[fastspeech2_ljspeech](audio/tts/fastspeech2_ljspeech)|FastSpeech2|LJSpeech-1.1|English text-to-speech|
|[deepvoice3_ljspeech](audio/tts/deepvoice3_ljspeech)|DeepVoice3|LJSpeech-1.1|English text-to-speech|
- ### Automatic Speech Recognition
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[deepspeech2_aishell](audio/asr/deepspeech2_aishell)|DeepSpeech2|AISHELL-1|Chinese speech recognition|
|[deepspeech2_librispeech](audio/asr/deepspeech2_librispeech)|DeepSpeech2|LibriSpeech|English speech recognition|
|[u2_conformer_aishell](audio/asr/u2_conformer_aishell)|Conformer|AISHELL-1|Chinese speech recognition|
|[u2_conformer_wenetspeech](audio/asr/u2_conformer_wenetspeech)|Conformer|WenetSpeech|Chinese speech recognition|
|[u2_conformer_librispeech](audio/asr/u2_conformer_librispeech)|Conformer|LibriSpeech|English speech recognition|
- ### Audio Classification
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[panns_cnn6](audio/audio_classification/PANNs/cnn6)|PANNs|Google Audioset|4 convolutional layers and 2 fully connected layers, 4.5M parameters; after pre-training it can extract 512-dimensional audio embeddings|
|[panns_cnn14](audio/audio_classification/PANNs/cnn14)|PANNs|Google Audioset|12 convolutional layers and 2 fully connected layers, 79.6M parameters; after pre-training it can extract 2048-dimensional audio embeddings|
|[panns_cnn10](audio/audio_classification/PANNs/cnn10)|PANNs|Google Audioset|8 convolutional layers and 2 fully connected layers, 4.9M parameters; after pre-training it can extract 512-dimensional audio embeddings|
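
A hedged sketch of loading a PANNs module as a sound-classification backbone for fine-tuning (the `task`/`num_class` keyword arguments follow the pattern in PaddleHub's audio classification demo, e.g. the 50-class ESC-50 dataset; treat the exact signature as an assumption):

```python
import paddlehub as hub

# Load panns_cnn14 as a sound-classification backbone; num_class should match
# the target dataset (50 for ESC-50). Both kwargs are assumptions drawn from
# PaddleHub's audio classification demo.
model = hub.Module(name="panns_cnn14", task="sound-cls", num_class=50)
```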
## Video
- ### Video Classification
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[videotag_tsn_lstm](video/classification/videotag_tsn_lstm)|TSN + AttentionLSTM|Baidu self-built dataset|Large-scale short-video classification and tagging|
|[tsn_kinetics400](video/classification/tsn_kinetics400)|TSN|Kinetics-400|Video classification|
|[tsm_kinetics400](video/classification/tsm_kinetics400)|TSM|Kinetics-400|Video classification|
|[stnet_kinetics400](video/classification/stnet_kinetics400)|StNet|Kinetics-400|Video classification|
|[nonlocal_kinetics400](video/classification/nonlocal_kinetics400)|Non-local|Kinetics-400|Video classification|
- ### Video Editing
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[SkyAR](video/Video_editing/SkyAR)|UNet|UNet|Video sky replacement|
- ### Multiple Object tracking
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[fairmot_dla34](video/multiple_object_tracking/fairmot_dla34)|CenterNet|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|Real-time multi-object tracking|
|[jde_darknet53](video/multiple_object_tracking/jde_darknet53)|YOLOv3|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|Multi-object tracking balancing accuracy and speed|
## Industrial Application
- ### Meter Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[WatermeterSegmentation](image/semantic_segmentation/WatermeterSegmentation)|DeepLabV3|Water meter digital dial segmentation dataset|Water meter digital dial segmentation|
Simplified Chinese | [English](README.md)
# CONTENTS
|[Image](#Image) (212)|[Text](#Text) (130)|[Audio](#Audio) (15)|[Video](#Video) (8)|[Industrial Application](#Industrial-Application) (1)|
|--|--|--|--|--|
|[Image Classification](#Image-Classification) (108)|[Text Generation](#Text-Generation) (17)| [Voice Cloning](#Voice-Cloning) (2)|[Video Classification](#Video-Classification) (5)| [Meter Detection](#Meter-Detection) (1)|
|[Image Generation](#Image-Generation) (26)|[Word Embedding](#Word-Embedding) (62)|[Text to Speech](#Text-to-Speech) (5)|[Video Editing](#Video-Editing) (1)|-|
|[Keypoint Detection](#Keypoint-Detection) (5)|[Machine Translation](#Machine-Translation) (2)|[Automatic Speech Recognition](#Automatic-Speech-Recognition) (5)|[Multiple Object Tracking](#Multiple-Object-Tracking) (2)|-|
|[Semantic Segmentation](#Semantic-Segmentation) (25)|[Language Model](#Language-Model) (30)|[Audio Classification](#Audio-Classification) (3)| -|-|
|[Face Detection](#Face-Detection) (7)|[Sentiment Analysis](#Sentiment-Analysis) (7)|-|-|-|
|[Text Recognition](#Text-Recognition) (17)|[Syntactic Analysis](#Syntactic-Analysis) (1)|-|-|-|
|[Image Editing](#Image-Editing) (8)|[Simultaneous Translation](#Simultaneous-Translation) (5)|-|-|-|
|[Instance Segmentation](#Instance-Segmentation) (1)|[Lexical Analysis](#Lexical-Analysis) (2)|-|-|-|
|[Object Detection](#Object-Detection) (13)|[Punctuation Restoration](#Punctuation-Restoration) (1)|-|-|-|
|[Depth Estimation](#Depth-Estimation) (2)|[Text Review](#Text-Review) (3)|-|-|-|
## Image
- ### Image Classification
<details><summary>expand</summary><div>
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[DriverStatusRecognition](image/classification/DriverStatusRecognition)|MobileNetV3_small_ssld|Distracted driver detection dataset||
|[mobilenet_v2_animals](image/classification/mobilenet_v2_animals)|MobileNet_v2|Baidu self-built animal dataset||
|[repvgg_a1_imagenet](image/classification/repvgg_a1_imagenet)|RepVGG|ImageNet-2012||
|[repvgg_a0_imagenet](image/classification/repvgg_a0_imagenet)|RepVGG|ImageNet-2012||
|[resnext152_32x4d_imagenet](image/classification/resnext152_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[resnet_v2_152_imagenet](image/classification/resnet_v2_152_imagenet)|ResNet V2|ImageNet-2012||
|[resnet50_vd_animals](image/classification/resnet50_vd_animals)|ResNet50_vd|Baidu self-built animal dataset||
|[food_classification](image/classification/food_classification)|ResNet50_vd_ssld|Food dataset||
|[mobilenet_v3_large_imagenet_ssld](image/classification/mobilenet_v3_large_imagenet_ssld)|Mobilenet_v3_large|ImageNet-2012||
|[resnext152_vd_32x4d_imagenet](image/classification/resnext152_vd_32x4d_imagenet)||||
|[ghostnet_x1_3_imagenet_ssld](image/classification/ghostnet_x1_3_imagenet_ssld)|GhostNet|ImageNet-2012||
|[rexnet_1_5_imagenet](image/classification/rexnet_1_5_imagenet)|ReXNet|ImageNet-2012||
|[resnext50_64x4d_imagenet](image/classification/resnext50_64x4d_imagenet)|ResNeXt|ImageNet-2012||
|[resnext101_64x4d_imagenet](image/classification/resnext101_64x4d_imagenet)|ResNeXt|ImageNet-2012||
|[efficientnetb0_imagenet](image/classification/efficientnetb0_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb1_imagenet](image/classification/efficientnetb1_imagenet)|EfficientNet|ImageNet-2012||
|[mobilenet_v2_imagenet_ssld](image/classification/mobilenet_v2_imagenet_ssld)|Mobilenet_v2|ImageNet-2012||
|[resnet50_vd_dishes](image/classification/resnet50_vd_dishes)|ResNet50_vd|Baidu self-built dish dataset||
|[pnasnet_imagenet](image/classification/pnasnet_imagenet)|PNASNet|ImageNet-2012||
|[rexnet_2_0_imagenet](image/classification/rexnet_2_0_imagenet)|ReXNet|ImageNet-2012||
|[SnakeIdentification](image/classification/SnakeIdentification)|ResNet50_vd_ssld|Snake species dataset||
|[hrnet40_imagenet](image/classification/hrnet40_imagenet)|HRNet|ImageNet-2012||
|[resnet_v2_34_imagenet](image/classification/resnet_v2_34_imagenet)|ResNet V2|ImageNet-2012||
|[mobilenet_v2_dishes](image/classification/mobilenet_v2_dishes)|MobileNet_v2|Baidu self-built dish dataset||
|[resnext101_vd_32x4d_imagenet](image/classification/resnext101_vd_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[repvgg_b2g4_imagenet](image/classification/repvgg_b2g4_imagenet)|RepVGG|ImageNet-2012||
|[fix_resnext101_32x48d_wsl_imagenet](image/classification/fix_resnext101_32x48d_wsl_imagenet)|ResNeXt|ImageNet-2012||
|[vgg13_imagenet](image/classification/vgg13_imagenet)|VGG|ImageNet-2012||
|[se_resnext101_32x4d_imagenet](image/classification/se_resnext101_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012||
|[hrnet30_imagenet](image/classification/hrnet30_imagenet)|HRNet|ImageNet-2012||
|[ghostnet_x1_3_imagenet](image/classification/ghostnet_x1_3_imagenet)|GhostNet|ImageNet-2012||
|[dpn107_imagenet](image/classification/dpn107_imagenet)|DPN|ImageNet-2012||
|[densenet161_imagenet](image/classification/densenet161_imagenet)|DenseNet|ImageNet-2012||
|[vgg19_imagenet](image/classification/vgg19_imagenet)|vgg19_imagenet|ImageNet-2012||
|[mobilenet_v2_imagenet](image/classification/mobilenet_v2_imagenet)|Mobilenet_v2|ImageNet-2012||
|[resnet50_vd_10w](image/classification/resnet50_vd_10w)|ResNet_vd|Baidu self-built dataset||
|[resnet_v2_101_imagenet](image/classification/resnet_v2_101_imagenet)|ResNet V2 101|ImageNet-2012||
|[darknet53_imagenet](image/classification/darknet53_imagenet)|DarkNet|ImageNet-2012||
|[se_resnext50_32x4d_imagenet](image/classification/se_resnext50_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012||
|[se_hrnet64_imagenet_ssld](image/classification/se_hrnet64_imagenet_ssld)|HRNet|ImageNet-2012||
|[resnext101_32x16d_wsl](image/classification/resnext101_32x16d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[hrnet18_imagenet](image/classification/hrnet18_imagenet)|HRNet|ImageNet-2012||
|[spinalnet_res101_gemstone](image/classification/spinalnet_res101_gemstone)|resnet101|gemstone||
|[densenet264_imagenet](image/classification/densenet264_imagenet)|DenseNet|ImageNet-2012||
|[resnext50_vd_32x4d_imagenet](image/classification/resnext50_vd_32x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[SpinalNet_Gemstones](image/classification/SpinalNet_Gemstones)||||
|[spinalnet_vgg16_gemstone](image/classification/spinalnet_vgg16_gemstone)|vgg16|gemstone||
|[xception71_imagenet](image/classification/xception71_imagenet)|Xception|ImageNet-2012||
|[repvgg_b2_imagenet](image/classification/repvgg_b2_imagenet)|RepVGG|ImageNet-2012||
|[dpn68_imagenet](image/classification/dpn68_imagenet)|DPN|ImageNet-2012||
|[alexnet_imagenet](image/classification/alexnet_imagenet)|AlexNet|ImageNet-2012||
|[rexnet_1_3_imagenet](image/classification/rexnet_1_3_imagenet)|ReXNet|ImageNet-2012||
|[hrnet64_imagenet](image/classification/hrnet64_imagenet)|HRNet|ImageNet-2012||
|[efficientnetb7_imagenet](image/classification/efficientnetb7_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb0_small_imagenet](image/classification/efficientnetb0_small_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb6_imagenet](image/classification/efficientnetb6_imagenet)|EfficientNet|ImageNet-2012||
|[hrnet48_imagenet](image/classification/hrnet48_imagenet)|HRNet|ImageNet-2012||
|[rexnet_3_0_imagenet](image/classification/rexnet_3_0_imagenet)|ReXNet|ImageNet-2012||
|[shufflenet_v2_imagenet](image/classification/shufflenet_v2_imagenet)|ShuffleNet V2|ImageNet-2012||
|[ghostnet_x0_5_imagenet](image/classification/ghostnet_x0_5_imagenet)|GhostNet|ImageNet-2012||
|[inception_v4_imagenet](image/classification/inception_v4_imagenet)|Inception_V4|ImageNet-2012||
|[resnext101_vd_64x4d_imagenet](image/classification/resnext101_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[densenet201_imagenet](image/classification/densenet201_imagenet)|DenseNet|ImageNet-2012||
|[vgg16_imagenet](image/classification/vgg16_imagenet)|VGG|ImageNet-2012||
|[mobilenet_v3_small_imagenet_ssld](image/classification/mobilenet_v3_small_imagenet_ssld)|Mobilenet_v3_Small|ImageNet-2012||
|[hrnet18_imagenet_ssld](image/classification/hrnet18_imagenet_ssld)|HRNet|ImageNet-2012||
|[resnext152_64x4d_imagenet](image/classification/resnext152_64x4d_imagenet)|ResNeXt|ImageNet-2012||
|[efficientnetb3_imagenet](image/classification/efficientnetb3_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb2_imagenet](image/classification/efficientnetb2_imagenet)|EfficientNet|ImageNet-2012||
|[repvgg_b1g4_imagenet](image/classification/repvgg_b1g4_imagenet)|RepVGG|ImageNet-2012||
|[resnext101_32x4d_imagenet](image/classification/resnext101_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[resnext50_32x4d_imagenet](image/classification/resnext50_32x4d_imagenet)|ResNeXt|ImageNet-2012||
|[repvgg_a2_imagenet](image/classification/repvgg_a2_imagenet)|RepVGG|ImageNet-2012||
|[resnext152_vd_64x4d_imagenet](image/classification/resnext152_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[xception41_imagenet](image/classification/xception41_imagenet)|Xception|ImageNet-2012||
|[googlenet_imagenet](image/classification/googlenet_imagenet)|GoogleNet|ImageNet-2012||
|[resnet50_vd_imagenet_ssld](image/classification/resnet50_vd_imagenet_ssld)|ResNet_vd|ImageNet-2012||
|[repvgg_b1_imagenet](image/classification/repvgg_b1_imagenet)|RepVGG|ImageNet-2012||
|[repvgg_b0_imagenet](image/classification/repvgg_b0_imagenet)|RepVGG|ImageNet-2012||
|[resnet_v2_50_imagenet](image/classification/resnet_v2_50_imagenet)|ResNet V2|ImageNet-2012||
|[rexnet_1_0_imagenet](image/classification/rexnet_1_0_imagenet)|ReXNet|ImageNet-2012||
|[resnet_v2_18_imagenet](image/classification/resnet_v2_18_imagenet)|ResNet V2|ImageNet-2012||
|[resnext101_32x8d_wsl](image/classification/resnext101_32x8d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[efficientnetb4_imagenet](image/classification/efficientnetb4_imagenet)|EfficientNet|ImageNet-2012||
|[efficientnetb5_imagenet](image/classification/efficientnetb5_imagenet)|EfficientNet|ImageNet-2012||
|[repvgg_b1g2_imagenet](image/classification/repvgg_b1g2_imagenet)|RepVGG|ImageNet-2012||
|[resnext101_32x48d_wsl](image/classification/resnext101_32x48d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[resnet50_vd_wildanimals](image/classification/resnet50_vd_wildanimals)|ResNet_vd|IFAW self-built wild animal dataset||
|[nasnet_imagenet](image/classification/nasnet_imagenet)|NASNet|ImageNet-2012||
|[se_resnet18_vd_imagenet](image/classification/se_resnet18_vd_imagenet)||||
|[spinalnet_res50_gemstone](image/classification/spinalnet_res50_gemstone)|resnet50|gemstone||
|[resnext50_vd_64x4d_imagenet](image/classification/resnext50_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012||
|[resnext101_32x32d_wsl](image/classification/resnext101_32x32d_wsl)|ResNeXt_wsl|ImageNet-2012||
|[dpn131_imagenet](image/classification/dpn131_imagenet)|DPN|ImageNet-2012||
|[xception65_imagenet](image/classification/xception65_imagenet)|Xception|ImageNet-2012||
|[repvgg_b3g4_imagenet](image/classification/repvgg_b3g4_imagenet)|RepVGG|ImageNet-2012||
|[marine_biometrics](image/classification/marine_biometrics)|ResNet50_vd_ssld|Fish4Knowledge||
|[res2net101_vd_26w_4s_imagenet](image/classification/res2net101_vd_26w_4s_imagenet)|Res2Net|ImageNet-2012||
|[dpn98_imagenet](image/classification/dpn98_imagenet)|DPN|ImageNet-2012||
|[resnet18_vd_imagenet](image/classification/resnet18_vd_imagenet)|ResNet_vd|ImageNet-2012||
|[densenet121_imagenet](image/classification/densenet121_imagenet)|DenseNet|ImageNet-2012||
|[vgg11_imagenet](image/classification/vgg11_imagenet)|VGG|ImageNet-2012||
|[hrnet44_imagenet](image/classification/hrnet44_imagenet)|HRNet|ImageNet-2012||
|[densenet169_imagenet](image/classification/densenet169_imagenet)|DenseNet|ImageNet-2012||
|[hrnet32_imagenet](image/classification/hrnet32_imagenet)|HRNet|ImageNet-2012||
|[dpn92_imagenet](image/classification/dpn92_imagenet)|DPN|ImageNet-2012||
|[ghostnet_x1_0_imagenet](image/classification/ghostnet_x1_0_imagenet)|GhostNet|ImageNet-2012||
|[hrnet48_imagenet_ssld](image/classification/hrnet48_imagenet_ssld)|HRNet|ImageNet-2012||
</div></details>
- ### Image Generation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[pixel2style2pixel](image/Image_gan/gan/pixel2style2pixel/)|Pixel2Style2Pixel|-|Face frontalization|
|[stgan_bald](image/Image_gan/gan/stgan_bald/)|STGAN|CelebA|Bald-head generator|
|[styleganv2_editing](image/Image_gan/gan/styleganv2_editing)|StyleGAN V2|-|Face editing|
|[wav2lip](image/Image_gan/gan/wav2lip)|wav2lip|LRS2|Lip-sync generation|
|[attgan_celeba](image/Image_gan/attgan_celeba/)|AttGAN|Celeba|Face editing|
|[cyclegan_cityscapes](image/Image_gan/cyclegan_cityscapes)|CycleGAN|Cityscapes|Translation between real-scene images and semantic segmentation maps|
|[stargan_celeba](image/Image_gan/stargan_celeba)|StarGAN|Celeba|Face editing|
|[stgan_celeba](image/Image_gan/stgan_celeba/)|STGAN|Celeba|Face editing|
|[ID_Photo_GEN](image/Image_gan/style_transfer/ID_Photo_GEN)|HRNet_W18|-|ID photo generation|
|[Photo2Cartoon](image/Image_gan/style_transfer/Photo2Cartoon)|U-GAT-IT|cartoon_data|Face cartoonization|
|[U2Net_Portrait](image/Image_gan/style_transfer/U2Net_Portrait)|U^2Net|-|Face sketch generation|
|[UGATIT_100w](image/Image_gan/style_transfer/UGATIT_100w)|U-GAT-IT|selfie2anime|Face anime-style transfer|
|[UGATIT_83w](image/Image_gan/style_transfer/UGATIT_83w)|U-GAT-IT|selfie2anime|Face anime-style transfer|
|[UGATIT_92w](image/Image_gan/style_transfer/UGATIT_92w)|U-GAT-IT|selfie2anime|Face anime-style transfer|
|[animegan_v1_hayao_60](image/Image_gan/style_transfer/animegan_v1_hayao_60)|AnimeGAN|The Wind Rises|Image style transfer - Hayao Miyazaki|
|[animegan_v2_hayao_64](image/Image_gan/style_transfer/animegan_v2_hayao_64)|AnimeGAN|The Wind Rises|Image style transfer - Hayao Miyazaki|
|[animegan_v2_hayao_99](image/Image_gan/style_transfer/animegan_v2_hayao_99)|AnimeGAN|The Wind Rises|Image style transfer - Hayao Miyazaki|
|[animegan_v2_paprika_54](image/Image_gan/style_transfer/animegan_v2_paprika_54)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_paprika_74](image/Image_gan/style_transfer/animegan_v2_paprika_74)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_paprika_97](image/Image_gan/style_transfer/animegan_v2_paprika_97)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_paprika_98](image/Image_gan/style_transfer/animegan_v2_paprika_98)|AnimeGAN|Paprika|Image style transfer - Satoshi Kon|
|[animegan_v2_shinkai_33](image/Image_gan/style_transfer/animegan_v2_shinkai_33)|AnimeGAN|Your Name, Weathering with you|Image style transfer - Makoto Shinkai|
|[animegan_v2_shinkai_53](image/Image_gan/style_transfer/animegan_v2_shinkai_53)|AnimeGAN|Your Name, Weathering with you|Image style transfer - Makoto Shinkai|
|[msgnet](image/Image_gan/style_transfer/msgnet)|msgnet|COCO2014||
|[stylepro_artistic](image/Image_gan/style_transfer/stylepro_artistic)|StyleProNet|MS-COCO + WikiArt|Artistic style transfer|
|stylegan_ffhq|StyleGAN|FFHQ|Image style transfer|
- ### Keypoint Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[face_landmark_localization](image/keypoint_detection/face_landmark_localization)|Face_Landmark|AFW/AFLW|Facial keypoint detection|
|[hand_pose_localization](image/keypoint_detection/hand_pose_localization)|-|MPII, NZSL|Hand keypoint detection|
|[openpose_body_estimation](image/keypoint_detection/openpose_body_estimation)|two-branch multi-stage CNN|MPII, COCO 2016|Body keypoint detection|
|[human_pose_estimation_resnet50_mpii](image/keypoint_detection/human_pose_estimation_resnet50_mpii)|Pose_Resnet50|MPII|Human skeleton keypoint detection|
|[openpose_hands_estimation](image/keypoint_detection/openpose_hands_estimation)|-|MPII, NZSL|Hand keypoint detection|
- ### Image Segmentation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[deeplabv3p_xception65_humanseg](image/semantic_segmentation/deeplabv3p_xception65_humanseg)|deeplabv3p|Baidu self-built dataset|Portrait segmentation|
|[humanseg_server](image/semantic_segmentation/humanseg_server)|deeplabv3p|Baidu self-built dataset|Portrait segmentation|
|[humanseg_mobile](image/semantic_segmentation/humanseg_mobile)|hrnet|Baidu self-built dataset|Portrait segmentation - mobile front camera|
|[humanseg_lite](image/semantic_segmentation/humanseg_lite)|shufflenet|Baidu self-built dataset|Lightweight portrait segmentation - real-time on mobile|
|[ExtremeC3_Portrait_Segmentation](image/semantic_segmentation/ExtremeC3_Portrait_Segmentation)|ExtremeC3|EG1800, Baidu fashion dataset|Lightweight portrait segmentation|
|[SINet_Portrait_Segmentation](image/semantic_segmentation/SINet_Portrait_Segmentation)|SINet|EG1800, Baidu fashion dataset|Lightweight portrait segmentation|
|[FCN_HRNet_W18_Face_Seg](image/semantic_segmentation/FCN_HRNet_W18_Face_Seg)|FCN_HRNet_W18|-|Portrait segmentation|
|[ace2p](image/semantic_segmentation/ace2p)|ACE2P|LIP|Human parsing|
|[Pneumonia_CT_LKM_PP](image/semantic_segmentation/Pneumonia_CT_LKM_PP)|U-NET+|De-identified dataset authorized by Lianxin Medical|Pneumonia CT image analysis|
|[Pneumonia_CT_LKM_PP_lung](image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung)|U-NET+|De-identified dataset authorized by Lianxin Medical|Pneumonia CT image analysis|
|[ocrnet_hrnetw18_voc](image/semantic_segmentation/ocrnet_hrnetw18_voc)|ocrnet, hrnet|PascalVoc2012||
|[U2Net](image/semantic_segmentation/U2Net)|U^2Net|-|Foreground/background segmentation|
|[U2Netp](image/semantic_segmentation/U2Netp)|U^2Net|-|Foreground/background segmentation|
|[Extract_Line_Draft](image/semantic_segmentation/Extract_Line_Draft)|UNet|Pixiv|Line draft extraction|
|[unet_cityscapes](image/semantic_segmentation/unet_cityscapes)|UNet|cityscapes||
|[ocrnet_hrnetw18_cityscapes](image/semantic_segmentation/ocrnet_hrnetw18_cityscapes)|ocrnet_hrnetw18|cityscapes||
|[hardnet_cityscapes](image/semantic_segmentation/hardnet_cityscapes)|hardnet|cityscapes||
|[fcn_hrnetw48_voc](image/semantic_segmentation/fcn_hrnetw48_voc)|fcn_hrnetw48|PascalVoc2012||
|[fcn_hrnetw48_cityscapes](image/semantic_segmentation/fcn_hrnetw48_cityscapes)|fcn_hrnetw48|cityscapes||
|[fcn_hrnetw18_voc](image/semantic_segmentation/fcn_hrnetw18_voc)|fcn_hrnetw18|PascalVoc2012||
|[fcn_hrnetw18_cityscapes](image/semantic_segmentation/fcn_hrnetw18_cityscapes)|fcn_hrnetw18|cityscapes||
|[fastscnn_cityscapes](image/semantic_segmentation/fastscnn_cityscapes)|fastscnn|cityscapes||
|[deeplabv3p_resnet50_voc](image/semantic_segmentation/deeplabv3p_resnet50_voc)|deeplabv3p, resnet50|PascalVoc2012||
|[deeplabv3p_resnet50_cityscapes](image/semantic_segmentation/deeplabv3p_resnet50_cityscapes)|deeplabv3p, resnet50|cityscapes||
|[bisenetv2_cityscapes](image/semantic_segmentation/bisenetv2_cityscapes)|bisenetv2|cityscapes||
- ### Face Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[pyramidbox_lite_mobile](image/face_detection/pyramidbox_lite_mobile)|PyramidBox|WIDER FACE dataset + Baidu self-collected face dataset|Lightweight face detection - mobile|
|[pyramidbox_lite_mobile_mask](image/face_detection/pyramidbox_lite_mobile_mask)|PyramidBox|WIDER FACE dataset + Baidu self-collected face dataset|Lightweight face mask detection - mobile|
|[pyramidbox_lite_server_mask](image/face_detection/pyramidbox_lite_server_mask)|PyramidBox|WIDER FACE dataset + Baidu self-collected face dataset|Lightweight face mask detection|
|[ultra_light_fast_generic_face_detector_1mb_640](image/face_detection/ultra_light_fast_generic_face_detector_1mb_640)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE dataset|Lightweight generic face detection - low-compute devices|
|[ultra_light_fast_generic_face_detector_1mb_320](image/face_detection/ultra_light_fast_generic_face_detector_1mb_320)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE dataset|Lightweight generic face detection - low-compute devices|
|[pyramidbox_lite_server](image/face_detection/pyramidbox_lite_server)|PyramidBox|WIDER FACE dataset + Baidu self-collected face dataset|Lightweight face detection|
|[pyramidbox_face_detection](image/face_detection/pyramidbox_face_detection)|PyramidBox|WIDER FACE dataset|Face detection|
- ### Text Recognition
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[chinese_ocr_db_crnn_mobile](image/text_recognition/chinese_ocr_db_crnn_mobile)|Differentiable Binarization+RCNN|icdar2015 dataset|Chinese text recognition|
|[chinese_text_detection_db_mobile](image/text_recognition/chinese_text_detection_db_mobile)|Differentiable Binarization|icdar2015 dataset|Chinese text detection|
|[chinese_text_detection_db_server](image/text_recognition/chinese_text_detection_db_server)|Differentiable Binarization|icdar2015 dataset|Chinese text detection|
|[chinese_ocr_db_crnn_server](image/text_recognition/chinese_ocr_db_crnn_server)|Differentiable Binarization+RCNN|icdar2015 dataset|Chinese text recognition|
|[Vehicle_License_Plate_Recognition](image/text_recognition/Vehicle_License_Plate_Recognition)|-|CCPD|License plate recognition|
|[chinese_cht_ocr_db_crnn_mobile](image/text_recognition/chinese_cht_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Traditional Chinese text recognition|
|[japan_ocr_db_crnn_mobile](image/text_recognition/japan_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Japanese text recognition|
|[korean_ocr_db_crnn_mobile](image/text_recognition/korean_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Korean text recognition|
|[german_ocr_db_crnn_mobile](image/text_recognition/german_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|German text recognition|
|[french_ocr_db_crnn_mobile](image/text_recognition/french_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|French text recognition|
|[latin_ocr_db_crnn_mobile](image/text_recognition/latin_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Latin-script text recognition|
|[cyrillic_ocr_db_crnn_mobile](image/text_recognition/cyrillic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Cyrillic text recognition|
|[multi_languages_ocr_db_crnn](image/text_recognition/multi_languages_ocr_db_crnn)|Differentiable Binarization+RCNN|icdar2015 dataset|Multilingual text recognition|
|[kannada_ocr_db_crnn_mobile](image/text_recognition/kannada_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Kannada text recognition|
|[arabic_ocr_db_crnn_mobile](image/text_recognition/arabic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Arabic text recognition|
|[telugu_ocr_db_crnn_mobile](image/text_recognition/telugu_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Telugu text recognition|
|[devanagari_ocr_db_crnn_mobile](image/text_recognition/devanagari_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Devanagari text recognition|
|[tamil_ocr_db_crnn_mobile](image/text_recognition/tamil_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015 dataset|Tamil text recognition|
- ### Image Editing
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[realsr](image/Image_editing/super_resolution/realsr)|LP-KPN|RealSR dataset|Image/video super-resolution (x4)|
|[deoldify](image/Image_editing/colorization/deoldify)|GAN|ILSVRC 2012|Black-and-white photo/video colorization|
|[photo_restoration](image/Image_editing/colorization/photo_restoration)|Based on the deoldify and realsr models|-|Old photo restoration|
|[user_guided_colorization](image/Image_editing/colorization/user_guided_colorization)|siggraph|ILSVRC 2012|Image colorization|
|[falsr_c](image/Image_editing/super_resolution/falsr_c)|falsr_c|DIV2k|Lightweight super-resolution (x2)|
|[dcscn](image/Image_editing/super_resolution/dcscn)|dcscn|DIV2k|Lightweight super-resolution (x2)|
|[falsr_a](image/Image_editing/super_resolution/falsr_a)|falsr_a|DIV2k|Lightweight super-resolution (x2)|
|[falsr_b](image/Image_editing/super_resolution/falsr_b)|falsr_b|DIV2k|Lightweight super-resolution (x2)|
- ### Instance Segmentation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[solov2](image/instance_segmentation/solov2)|-|COCO2014|Instance segmentation|
- ### Object Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[faster_rcnn_resnet50_coco2017](image/object_detection/faster_rcnn_resnet50_coco2017)|faster_rcnn|COCO2017||
|[ssd_vgg16_512_coco2017](image/object_detection/ssd_vgg16_512_coco2017)|SSD|COCO2017||
|[faster_rcnn_resnet50_fpn_venus](image/object_detection/faster_rcnn_resnet50_fpn_venus)|faster_rcnn|Baidu self-built dataset|Large-scale general object detection|
|[ssd_vgg16_300_coco2017](image/object_detection/ssd_vgg16_300_coco2017)||||
|[yolov3_resnet34_coco2017](image/object_detection/yolov3_resnet34_coco2017)|YOLOv3|COCO2017||
|[yolov3_darknet53_pedestrian](image/object_detection/yolov3_darknet53_pedestrian)|YOLOv3|Baidu self-built large-scale pedestrian dataset|Pedestrian detection|
|[yolov3_mobilenet_v1_coco2017](image/object_detection/yolov3_mobilenet_v1_coco2017)|YOLOv3|COCO2017||
|[ssd_mobilenet_v1_pascal](image/object_detection/ssd_mobilenet_v1_pascal)|SSD|PASCAL VOC||
|[faster_rcnn_resnet50_fpn_coco2017](image/object_detection/faster_rcnn_resnet50_fpn_coco2017)|faster_rcnn|COCO2017||
|[yolov3_darknet53_coco2017](image/object_detection/yolov3_darknet53_coco2017)|YOLOv3|COCO2017||
|[yolov3_darknet53_vehicles](image/object_detection/yolov3_darknet53_vehicles)|YOLOv3|Baidu self-built large-scale vehicle dataset|Vehicle detection|
|[yolov3_darknet53_venus](image/object_detection/yolov3_darknet53_venus)|YOLOv3|Baidu self-built dataset|Large-scale general detection|
|[yolov3_resnet50_vd_coco2017](image/object_detection/yolov3_resnet50_vd_coco2017)|YOLOv3|COCO2017||
- ### Depth Estimation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[MiDaS_Large](image/depth_estimation/MiDaS_Large)|-|3D Movies, WSVD, ReDWeb, MegaDepth||
|[MiDaS_Small](image/depth_estimation/MiDaS_Small)|-|3D Movies, WSVD, ReDWeb, MegaDepth, etc.||
## Text
- ### Text Generation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[ernie_gen](text/text_generation/ernie_gen)|ERNIE-GEN|-|Pre-training/fine-tuning framework for generation tasks|
|[ernie_gen_poetry](text/text_generation/ernie_gen_poetry)|ERNIE-GEN|Open-source poetry dataset|Poetry generation|
|[ernie_gen_couplet](text/text_generation/ernie_gen_couplet)|ERNIE-GEN|Open-source couplet dataset|Couplet generation|
|[ernie_gen_lover_words](text/text_generation/ernie_gen_lover_words)|ERNIE-GEN|Web love poems and sweet-talk data|Love-words generation|
|[ernie_tiny_couplet](text/text_generation/ernie_tiny_couplet)|ernie_tiny|Open-source couplet dataset|Couplet generation|
|[ernie_gen_acrostic_poetry](text/text_generation/ernie_gen_acrostic_poetry)|ERNIE-GEN|Open-source poetry dataset|Acrostic poetry generation|
|[Rumor_prediction](text/text_generation/Rumor_prediction)|-|Sina Weibo Chinese rumor data|Rumor prediction|
|[plato-mini](text/text_generation/plato-mini)|Unified Transformer|Billion-scale Chinese dialogue data|Chinese dialogue|
|[plato2_en_large](text/text_generation/plato2_en_large)|plato2|Open-domain multi-turn dialogue dataset|Very large-scale generative dialogue|
|[plato2_en_base](text/text_generation/plato2_en_base)|plato2|Open-domain multi-turn dialogue dataset|Very large-scale generative dialogue|
|[CPM_LM](text/text_generation/CPM_LM)|GPT-2|Self-built dataset|Chinese text generation|
|[unified_transformer-12L-cn](text/text_generation/unified_transformer-12L-cn)|Unified Transformer|Tens of millions of Chinese dialogue samples|Multi-turn human-machine dialogue|
|[unified_transformer-12L-cn-luge](text/text_generation/unified_transformer-12L-cn-luge)|Unified Transformer|LUGE dialogue dataset|Multi-turn human-machine dialogue|
|[reading_pictures_writing_poems](text/text_generation/reading_pictures_writing_poems)|Cascade of multiple networks|-|Writing poems from pictures|
|[GPT2_CPM_LM](text/text_generation/GPT2_CPM_LM)|||QA-style text generation|
|[GPT2_Base_CN](text/text_generation/GPT2_Base_CN)|||QA-style text generation|
- ### Word Embedding
<details><summary>expand</summary><div>
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[w2v_weibo_target_word-bigram_dim300](text/embedding/w2v_weibo_target_word-bigram_dim300)|w2v|weibo||
|[w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_literature_target_word-word_dim300](text/embedding/w2v_literature_target_word-word_dim300)|w2v|literature||
|[word2vec_skipgram](text/embedding/word2vec_skipgram)|skip-gram|Baidu self-built dataset||
|[w2v_sogou_target_word-char_dim300](text/embedding/w2v_sogou_target_word-char_dim300)|w2v|sogou||
|[w2v_weibo_target_bigram-char_dim300](text/embedding/w2v_weibo_target_bigram-char_dim300)|w2v|weibo||
|[w2v_zhihu_target_word-bigram_dim300](text/embedding/w2v_zhihu_target_word-bigram_dim300)|w2v|zhihu||
|[w2v_financial_target_word-word_dim300](text/embedding/w2v_financial_target_word-word_dim300)|w2v|financial||
|[w2v_wiki_target_word-word_dim300](text/embedding/w2v_wiki_target_word-word_dim300)|w2v|wiki||
|[w2v_baidu_encyclopedia_context_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-word_dim300)|w2v|baidu_encyclopedia||
|[w2v_weibo_target_word-word_dim300](text/embedding/w2v_weibo_target_word-word_dim300)|w2v|weibo||
|[w2v_zhihu_target_bigram-char_dim300](text/embedding/w2v_zhihu_target_bigram-char_dim300)|w2v|zhihu||
|[w2v_zhihu_target_word-word_dim300](text/embedding/w2v_zhihu_target_word-word_dim300)|w2v|zhihu||
|[w2v_people_daily_target_word-char_dim300](text/embedding/w2v_people_daily_target_word-char_dim300)|w2v|people_daily||
|[w2v_sikuquanshu_target_word-word_dim300](text/embedding/w2v_sikuquanshu_target_word-word_dim300)|w2v|sikuquanshu||
|[glove_twitter_target_word-word_dim200_en](text/embedding/glove_twitter_target_word-word_dim200_en)|glove|twitter||
|[fasttext_crawl_target_word-word_dim300_en](text/embedding/fasttext_crawl_target_word-word_dim300_en)|fasttext|crawl||
|[w2v_wiki_target_word-bigram_dim300](text/embedding/w2v_wiki_target_word-bigram_dim300)|w2v|wiki||
|[w2v_baidu_encyclopedia_context_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-1_dim300)|w2v|baidu_encyclopedia||
|[glove_wiki2014-gigaword_target_word-word_dim300_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim300_en)|glove|wiki2014-gigaword||
|[glove_wiki2014-gigaword_target_word-word_dim50_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim50_en)|glove|wiki2014-gigaword||
|[w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_wiki_target_bigram-char_dim300](text/embedding/w2v_wiki_target_bigram-char_dim300)|w2v|wiki||
|[w2v_baidu_encyclopedia_target_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-1_dim300)|w2v|baidu_encyclopedia||
|[w2v_financial_target_bigram-char_dim300](text/embedding/w2v_financial_target_bigram-char_dim300)|w2v|financial||
|[glove_wiki2014-gigaword_target_word-word_dim200_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim200_en)|glove|wiki2014-gigaword||
|[w2v_financial_target_word-bigram_dim300](text/embedding/w2v_financial_target_word-bigram_dim300)|w2v|financial||
|[w2v_mixed-large_target_word-char_dim300](text/embedding/w2v_mixed-large_target_word-char_dim300)|w2v|mixed||
|[w2v_baidu_encyclopedia_target_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordPosition_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_target_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordLR_dim300)|w2v|baidu_encyclopedia||
|[w2v_sogou_target_bigram-char_dim300](text/embedding/w2v_sogou_target_bigram-char_dim300)|w2v|sogou||
|[w2v_weibo_target_word-char_dim300](text/embedding/w2v_weibo_target_word-char_dim300)|w2v|weibo||
|[w2v_people_daily_target_word-word_dim300](text/embedding/w2v_people_daily_target_word-word_dim300)|w2v|people_daily||
|[w2v_zhihu_target_word-char_dim300](text/embedding/w2v_zhihu_target_word-char_dim300)|w2v|zhihu||
|[w2v_wiki_target_word-char_dim300](text/embedding/w2v_wiki_target_word-char_dim300)|w2v|wiki||
|[w2v_sogou_target_word-bigram_dim300](text/embedding/w2v_sogou_target_word-bigram_dim300)|w2v|sogou||
|[w2v_financial_target_word-char_dim300](text/embedding/w2v_financial_target_word-char_dim300)|w2v|financial||
|[w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia||
|[glove_wiki2014-gigaword_target_word-word_dim100_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim100_en)|glove|wiki2014-gigaword||
|[w2v_baidu_encyclopedia_target_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-4_dim300)|w2v|baidu_encyclopedia||
|[w2v_sogou_target_word-word_dim300](text/embedding/w2v_sogou_target_word-word_dim300)|w2v|sogou||
|[w2v_literature_target_word-char_dim300](text/embedding/w2v_literature_target_word-char_dim300)|w2v|literature||
|[w2v_baidu_encyclopedia_target_bigram-char_dim300](text/embedding/w2v_baidu_encyclopedia_target_bigram-char_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_target_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-word_dim300)|w2v|baidu_encyclopedia||
|[glove_twitter_target_word-word_dim100_en](text/embedding/glove_twitter_target_word-word_dim100_en)|glove|twitter||
|[w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_context_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-4_dim300)|w2v|baidu_encyclopedia||
|[w2v_literature_target_bigram-char_dim300](text/embedding/w2v_literature_target_bigram-char_dim300)|w2v|literature||
|[fasttext_wiki-news_target_word-word_dim300_en](text/embedding/fasttext_wiki-news_target_word-word_dim300_en)|fasttext|wiki-news||
|[w2v_people_daily_target_word-bigram_dim300](text/embedding/w2v_people_daily_target_word-bigram_dim300)|w2v|people_daily||
|[w2v_mixed-large_target_word-word_dim300](text/embedding/w2v_mixed-large_target_word-word_dim300)|w2v|mixed||
|[w2v_people_daily_target_bigram-char_dim300](text/embedding/w2v_people_daily_target_bigram-char_dim300)|w2v|people_daily||
|[w2v_literature_target_word-bigram_dim300](text/embedding/w2v_literature_target_word-bigram_dim300)|w2v|literature||
|[glove_twitter_target_word-word_dim25_en](text/embedding/glove_twitter_target_word-word_dim25_en)|glove|twitter||
|[w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_sikuquanshu_target_word-bigram_dim300](text/embedding/w2v_sikuquanshu_target_word-bigram_dim300)|w2v|sikuquanshu||
|[w2v_baidu_encyclopedia_context_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-2_dim300)|w2v|baidu_encyclopedia||
|[glove_twitter_target_word-word_dim50_en](text/embedding/glove_twitter_target_word-word_dim50_en)|glove|twitter||
|[w2v_baidu_encyclopedia_context_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordLR_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_target_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-2_dim300)|w2v|baidu_encyclopedia||
|[w2v_baidu_encyclopedia_context_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordPosition_dim300)|w2v|baidu_encyclopedia||
</div></details>
- ### Machine Translation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[transformer_zh-en](text/machine_translation/transformer/transformer_zh-en)|Transformer|CWMT2021|Chinese-to-English translation|
|[transformer_en-de](text/machine_translation/transformer/transformer_en-de)|Transformer|WMT14 EN-DE|English-to-German translation|
- ### Semantic Models
<details><summary>expand</summary><div>
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[chinese_electra_small](text/language_model/chinese_electra_small)||||
|[chinese_electra_base](text/language_model/chinese_electra_base)||||
|[roberta-wwm-ext-large](text/language_model/roberta-wwm-ext-large)|roberta-wwm-ext-large|Baidu self-built dataset||
|[chinese-bert-wwm-ext](text/language_model/chinese_bert_wwm_ext)|chinese-bert-wwm-ext|Baidu self-built dataset||
|[lda_webpage](text/language_model/lda_webpage)|LDA|Baidu self-built web-domain dataset||
|[lda_novel](text/language_model/lda_novel)||||
|[bert-base-multilingual-uncased](text/language_model/bert-base-multilingual-uncased)||||
|[rbt3](text/language_model/rbt3)||||
|[ernie_v2_eng_base](text/language_model/ernie_v2_eng_base)|ernie_v2_eng_base|Baidu self-built dataset||
|[bert-base-multilingual-cased](text/language_model/bert-base-multilingual-cased)||||
|[rbtl3](text/language_model/rbtl3)||||
|[chinese-bert-wwm](text/language_model/chinese_bert_wwm)|chinese-bert-wwm|Baidu self-built dataset||
|[bert-large-uncased](text/language_model/bert-large-uncased)||||
|[slda_novel](text/language_model/slda_novel)||||
|[slda_news](text/language_model/slda_news)||||
|[electra_small](text/language_model/electra_small)||||
|[slda_webpage](text/language_model/slda_webpage)||||
|[bert-base-cased](text/language_model/bert-base-cased)||||
|[slda_weibo](text/language_model/slda_weibo)||||
|[roberta-wwm-ext](text/language_model/roberta-wwm-ext)|roberta-wwm-ext|Baidu self-built dataset||
|[bert-base-uncased](text/language_model/bert-base-uncased)||||
|[electra_large](text/language_model/electra_large)||||
|[ernie](text/language_model/ernie)|ernie-1.0|Baidu self-built dataset||
|[simnet_bow](text/language_model/simnet_bow)|BOW|Baidu self-built dataset||
|[ernie_tiny](text/language_model/ernie_tiny)|ernie_tiny|Baidu self-built dataset||
|[bert-base-chinese](text/language_model/bert-base-chinese)|bert-base-chinese|Baidu self-built dataset||
|[lda_news](text/language_model/lda_news)|LDA|Baidu self-built news-domain dataset||
|[electra_base](text/language_model/electra_base)||||
|[ernie_v2_eng_large](text/language_model/ernie_v2_eng_large)|ernie_v2_eng_large|Baidu self-built dataset||
|[bert-large-cased](text/language_model/bert-large-cased)||||
</div></details>
- ### Sentiment Analysis
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[ernie_skep_sentiment_analysis](text/sentiment_analysis/ernie_skep_sentiment_analysis)|SKEP|Baidu self-built dataset|Sentence-level sentiment analysis|
|[emotion_detection_textcnn](text/sentiment_analysis/emotion_detection_textcnn)|TextCNN|Baidu self-built dataset|Dialogue emotion detection|
|[senta_bilstm](text/sentiment_analysis/senta_bilstm)|BiLSTM|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_bow](text/sentiment_analysis/senta_bow)|BOW|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_gru](text/sentiment_analysis/senta_gru)|GRU|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_lstm](text/sentiment_analysis/senta_lstm)|LSTM|Baidu self-built dataset|Chinese sentiment analysis|
|[senta_cnn](text/sentiment_analysis/senta_cnn)|CNN|Baidu self-built dataset|Chinese sentiment analysis|
- ### Syntactic Analysis
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[DDParser](text/syntactic_analysis/DDParser)|Deep Biaffine Attention|Search queries, web text, voice input and other data|Syntactic analysis|
- ### Simultaneous Translation
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[transformer_nist_wait_1](text/simultaneous_translation/stacl/transformer_nist_wait_1)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English - wait-1 policy|
|[transformer_nist_wait_3](text/simultaneous_translation/stacl/transformer_nist_wait_3)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English - wait-3 policy|
|[transformer_nist_wait_5](text/simultaneous_translation/stacl/transformer_nist_wait_5)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English - wait-5 policy|
|[transformer_nist_wait_7](text/simultaneous_translation/stacl/transformer_nist_wait_7)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English - wait-7 policy|
|[transformer_nist_wait_all](text/simultaneous_translation/stacl/transformer_nist_wait_all)|transformer|NIST 2008 Chinese-English translation dataset|Chinese-to-English - wait-k=-1 policy|
- ### Lexical Analysis
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[jieba_paddle](text/lexical_analysis/jieba_paddle)|BiGRU+CRF|Baidu self-built dataset|jieba's word segmentation network (bidirectional GRU) built with Paddle. It also supports jieba's traditional segmentation modes, such as accurate mode, full mode, and search-engine mode.|
|[lac](text/lexical_analysis/lac)|BiGRU+CRF|Baidu self-built dataset|Baidu's self-developed joint lexical analysis model, which performs Chinese word segmentation, part-of-speech tagging, and named entity recognition in one pass. On Baidu's self-built dataset, LAC achieves Precision=88.0%, Recall=88.7%, F1-Score=88.4%.|
- ### Punctuation Restoration
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[auto_punc](text/punctuation_restoration/auto_punc)|Ernie-1.0|WuDaoCorpora 2.0|Automatically restores 7 kinds of punctuation marks|
- ### Text Review
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[porn_detection_cnn](text/text_review/porn_detection_cnn)|CNN|Baidu self-built dataset|Porn detection; automatically judges whether a text is pornographic and gives a confidence score, identifying pornographic descriptions, vulgar dating content, and obscene text|
|[porn_detection_gru](text/text_review/porn_detection_gru)|GRU|Baidu self-built dataset|Porn detection; automatically judges whether a text is pornographic and gives a confidence score, identifying pornographic descriptions, vulgar dating content, and obscene text|
|[porn_detection_lstm](text/text_review/porn_detection_lstm)|LSTM|Baidu self-built dataset|Porn detection; automatically judges whether a text is pornographic and gives a confidence score, identifying pornographic descriptions, vulgar dating content, and obscene text|
## Audio
- ### Voice Cloning
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[ge2e_fastspeech2_pwgan](audio/voice_cloning/ge2e_fastspeech2_pwgan)|FastSpeech2|AISHELL-3|Chinese voice cloning|
|[lstm_tacotron2](audio/voice_cloning/lstm_tacotron2)|LSTM, Tacotron2, WaveFlow|AISHELL-3|Chinese voice cloning|
- ### Text-to-Speech
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[transformer_tts_ljspeech](audio/tts/transformer_tts_ljspeech)|Transformer|LJSpeech-1.1|English text-to-speech|
|[fastspeech_ljspeech](audio/tts/fastspeech_ljspeech)|FastSpeech|LJSpeech-1.1|English text-to-speech|
|[fastspeech2_baker](audio/tts/fastspeech2_baker)|FastSpeech2|Chinese Standard Mandarin Speech Corpus|Chinese text-to-speech|
|[fastspeech2_ljspeech](audio/tts/fastspeech2_ljspeech)|FastSpeech2|LJSpeech-1.1|English text-to-speech|
|[deepvoice3_ljspeech](audio/tts/deepvoice3_ljspeech)|DeepVoice3|LJSpeech-1.1|English text-to-speech|
- ### Speech Recognition
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[deepspeech2_aishell](audio/asr/deepspeech2_aishell)|DeepSpeech2|AISHELL-1|Chinese speech recognition|
|[deepspeech2_librispeech](audio/asr/deepspeech2_librispeech)|DeepSpeech2|LibriSpeech|English speech recognition|
|[u2_conformer_aishell](audio/asr/u2_conformer_aishell)|Conformer|AISHELL-1|Chinese speech recognition|
|[u2_conformer_wenetspeech](audio/asr/u2_conformer_wenetspeech)|Conformer|WenetSpeech|Chinese speech recognition|
|[u2_conformer_librispeech](audio/asr/u2_conformer_librispeech)|Conformer|LibriSpeech|English speech recognition|
- ### Audio Classification
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[panns_cnn6](audio/audio_classification/PANNs/cnn6)|PANNs|Google Audioset|Mainly consists of 4 convolutional layers and 2 fully connected layers, with 4.5M parameters. After pre-training, it can be used to extract 512-dimensional audio embeddings|
|[panns_cnn14](audio/audio_classification/PANNs/cnn14)|PANNs|Google Audioset|Mainly consists of 12 convolutional layers and 2 fully connected layers, with 79.6M parameters. After pre-training, it can be used to extract 2048-dimensional audio embeddings|
|[panns_cnn10](audio/audio_classification/PANNs/cnn10)|PANNs|Google Audioset|Mainly consists of 8 convolutional layers and 2 fully connected layers, with 4.9M parameters. After pre-training, it can be used to extract 512-dimensional audio embeddings|
## Video
- ### Video Classification
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[videotag_tsn_lstm](video/classification/videotag_tsn_lstm)|TSN + AttentionLSTM|Baidu self-built dataset|Large-scale short video classification and tagging|
|[tsn_kinetics400](video/classification/tsn_kinetics400)|TSN|Kinetics-400|Video classification|
|[tsm_kinetics400](video/classification/tsm_kinetics400)|TSM|Kinetics-400|Video classification|
|[stnet_kinetics400](video/classification/stnet_kinetics400)|StNet|Kinetics-400|Video classification|
|[nonlocal_kinetics400](video/classification/nonlocal_kinetics400)|Non-local|Kinetics-400|Video classification|
- ### Video Editing
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[SkyAR](video/Video_editing/SkyAR)|UNet|-|Sky replacement in videos|
- ### Multiple Object Tracking
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[fairmot_dla34](video/multiple_object_tracking/fairmot_dla34)|CenterNet|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|Real-time multiple object tracking|
|[jde_darknet53](video/multiple_object_tracking/jde_darknet53)|YOLOv3|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|Multiple object tracking balancing accuracy and speed|
## Industrial Application
- ### Meter Detection
|module|Network|Dataset|Introduction|
|--|--|--|--|
|[WatermeterSegmentation](image/semantic_segmentation/WatermeterSegmentation)|DeepLabV3|Water meter digital dial segmentation dataset|Water meter digital dial segmentation|
......@@ -3,7 +3,7 @@
|Model Name|u2_conformer_aishell|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|DeepSpeech2|
|Network|Conformer|
|Dataset|AISHELL-1|
|Fine-tuning supported or not|No|
|Module Size|284MB|
......
......@@ -3,7 +3,7 @@
|Model Name|u2_conformer_librispeech|
| :--- | :---: |
|Category|Speech - Speech Recognition|
|Network|DeepSpeech2|
|Network|Conformer|
|Dataset|LibriSpeech|
|Fine-tuning supported or not|No|
|Module Size|191MB|
......
......@@ -53,9 +53,9 @@
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
- ```python
import paddlehub as hub
model = hub.Module(name='deoldify')
......
# deoldify
| Module Name |deoldify|
| :--- | :---: |
|Category|Image editing|
|Network |NoGAN|
|Dataset|ILSVRC 2012|
|Fine-tuning supported or not |No|
|Module Size |834MB|
|Data indicators|-|
|Latest update date |2021-04-13|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130886749-668dfa38-42ed-4a09-8d4a-b18af0475375.jpg" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130886685-76221736-839a-46a2-8415-e5e0dd3b345e.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Deoldify is a colorization model for images and videos that restores color in black-and-white photos and videos.
- For more information, please refer to: [deoldify](https://github.com/jantic/DeOldify)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- NOTE: This Module relies on ffmpeg. Please install ffmpeg before using this Module.
```shell
$ conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
```
- ### 2、Installation
- ```shell
$ hub install deoldify
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import paddlehub as hub
model = hub.Module(name='deoldify')
model.predict('/PATH/TO/IMAGE/OR/VIDEO')
```
- ### 2、API
- ```python
def predict(self, input):
```
- Prediction API.
- **Parameter**
- input (str): Image or video path.
- **Return**
- If input is image path, the output is:
- pred_img(np.ndarray): image data, ndarray.shape is in the format [H, W, C], BGR.
- out_path(str): save path of images.
- If input is video path, the output is :
- frame_pattern_combined(str): save path of frames from output video.
- vid_out_path(str): save path of output video.
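- For instance, a minimal usage sketch for an image input (the path is a placeholder, and it assumes the two documented values come back as a tuple):
- ```python
import paddlehub as hub

model = hub.Module(name='deoldify')
# For an image path, predict returns the colorized image and its save path.
pred_img, out_path = model.predict('/PATH/TO/IMAGE')
print(out_path)
```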
- ```python
def run_image(self, img):
```
- Prediction API for image.
- **Parameter**
- img (str|np.ndarray): Image data, str or ndarray. ndarray.shape is in the format [H, W, C], BGR.
- **Return**
- pred_img(np.ndarray): Ndarray.shape is in the format [H, W, C], BGR.
- ```python
def run_video(self, video):
```
- Prediction API for video.
- **Parameter**
- video(str): Video path.
- **Return**
- frame_pattern_combined(str): Save path of frames from output video.
- vid_out_path(str): Save path of output video.
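- A minimal sketch of the two helpers above (paths are placeholders; it assumes run_video returns its two documented values as a tuple):
- ```python
import cv2
import paddlehub as hub

model = hub.Module(name='deoldify')
# Colorize one frame already loaded with OpenCV (BGR, [H, W, C]).
pred_img = model.run_image(cv2.imread('/PATH/TO/IMAGE'))
# Colorize a whole video; returns the frame pattern and the output video path.
frame_pattern_combined, vid_out_path = model.run_video('/PATH/TO/VIDEO')
```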
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of coloring old photos or videos.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m deoldify
```
- The serving API is now deployed, with 8866 as the default port.
- **NOTE:** If a GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result.
- ```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    # Encode an OpenCV image as JPEG, then as a base64 string.
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode a base64 string back into an OpenCV BGR image.
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/ORIGIN/IMAGE')
data = {'images':cv2_to_base64(org_im)}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/deoldify"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
img = base64_to_cv2(r.json()["results"])
cv2.imwrite('/PATH/TO/SAVE/IMAGE', img)
```
## V. Release Note
- 1.0.0
First release
- 1.0.1
Adapt to paddlehub2.0
......@@ -51,9 +51,9 @@
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
- ```python
import cv2
import paddlehub as hub
......
# photo_restoration
|Module Name|photo_restoration|
| :--- | :---: |
|Category|Image editing|
|Network|deoldify and realsr|
|Fine-tuning supported or not|No|
|Module Size |64MB+834MB|
|Data indicators|-|
|Latest update date|2021-08-19|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130897828-d0c86b81-63d1-4e9a-8095-bc000b8c7ca8.jpg" width = "260" height = "400" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130897762-5c9fa711-62bc-4067-8d44-f8feff8c574c.png" width = "260" height = "400" hspace='10'/>
</p>
- ### Module Introduction
- Photo_restoration can restore old photos. It consists of two parts: colorization and super-resolution. The colorization model is deoldify and the super-resolution model is realsr, so please install deoldify and realsr in advance before using this module.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- NOTE: This Module relies on ffmpeg. Please install ffmpeg before using this Module.
```shell
$ conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
```
- ### 2、Installation
- ```shell
$ hub install photo_restoration
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import cv2
import paddlehub as hub
model = hub.Module(name='photo_restoration', visualization=True)
im = cv2.imread('/PATH/TO/IMAGE')
res = model.run_image(im)
```
- ### 2、API
- ```python
def run_image(self,
input,
model_select= ['Colorization', 'SuperResolution'],
save_path = 'photo_restoration'):
```
- Prediction API, producing restored photos.
- **Parameter**
- input (numpy.ndarray|str): Image data, numpy.ndarray or str. ndarray.shape is in the format [H, W, C], BGR.
- model_select (list\[str\]): Mode selection. \['Colorization'\] only colorizes the input image; \['SuperResolution'\] only increases the image resolution; the default is \['Colorization', 'SuperResolution'\].
- save_path (str): Save path, default is 'photo_restoration'.
- **Return**
- output (numpy.ndarray): Restoration result,ndarray.shape is in the format [H, W, C], BGR.
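- For example, a minimal sketch that runs only the colorization stage and writes results to a custom directory (the image path is a placeholder):
- ```python
import cv2
import paddlehub as hub

model = hub.Module(name='photo_restoration', visualization=True)
im = cv2.imread('/PATH/TO/IMAGE')
# Skip super-resolution; only colorize, saving under 'restoration_output'.
output = model.run_image(im, model_select=['Colorization'], save_path='restoration_output')
```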
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of photo restoration.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m photo_restoration
```
- The serving API is now deployed, with 8866 as the default port.
- **NOTE:** If a GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    # Encode an OpenCV image as JPEG, then as a base64 string.
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode a base64 string back into an OpenCV BGR image.
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread('PATH/TO/IMAGE')
data = {'images':cv2_to_base64(org_im), 'model_select': ['Colorization', 'SuperResolution']}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/photo_restoration"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
img = base64_to_cv2(r.json()["results"])
cv2.imwrite('PATH/TO/SAVE/IMAGE', img)
```
## V. Release Note
- 1.0.0
First release
- 1.0.1
Adapt to paddlehub2.0
......@@ -22,7 +22,7 @@
<img src="https://user-images.githubusercontent.com/35907364/136653401-6644bd46-d280-4c15-8d48-680b7eb152cb.png" width = "300" height = "450" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/136648959-40493c9c-08ec-46cd-a2a2-5e2038dcbfa7.png" width = "300" height = "450" hspace='10'/>
</p>
- user_guided_colorization is a colorization model based on ''Real-Time User-Guided Image Colorization with Learned Deep Priors"; it colors an image using pre-supplied color hint blocks.
- user_guided_colorization is a colorization model based on "Real-Time User-Guided Image Colorization with Learned Deep Priors"; it colors an image using pre-supplied color hint blocks.
## II. Installation
......
# user_guided_colorization
|Module Name|user_guided_colorization|
| :--- | :---: |
|Category |Image editing|
|Network| Local and Global Hints Network |
|Dataset|ILSVRC 2012|
|Fine-tuning supported or not|Yes|
|Module Size|131MB|
|Data indicators|-|
|Latest update date |2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/136653401-6644bd46-d280-4c15-8d48-680b7eb152cb.png" width = "300" height = "450" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/136648959-40493c9c-08ec-46cd-a2a2-5e2038dcbfa7.png" width = "300" height = "450" hspace='10'/>
</p>
- ### Module Introduction
- User_guided_colorization is a colorization model based on "Real-Time User-Guided Image Colorization with Learned Deep Priors". It colors a grayscale image using pre-supplied color hint blocks.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install user_guided_colorization
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
```shell
$ hub run user_guided_colorization --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='user_guided_colorization')
model.set_config(prob=0.1)
result = model.predict(images=['/PATH/TO/IMAGE'])
```
- ### 3.Fine-tune and Encapsulation
- After installing PaddlePaddle and PaddleHub, you can fine-tune the user_guided_colorization model on datasets such as [Canvas](../../docs/reference/datasets.md#class-hubdatasetsCanvas) by executing `python train.py`.
- Steps:
- Step1: Define the data preprocessing method
- ```python
import paddlehub.vision.transforms as T
transform = T.Compose([T.Resize((256, 256), interpolation='NEAREST'),
T.RandomPaddingCrop(crop_size=176),
T.RGB2LAB()], to_rgb=True)
```
- `transforms`: The data augmentation module defines many data preprocessing methods. Users can replace these methods according to their needs.
- Step2: Download the dataset
- ```python
from paddlehub.datasets import Canvas
color_set = Canvas(transform=transform, mode='train')
```
* `transforms`: Data preprocessing methods.
* `mode`: Select the data mode, the options are `train`, `test`, `val`. Default is `train`.
* `hub.datasets.Canvas()`: The dataset will be automatically downloaded from the network and decompressed to the `$HOME/.paddlehub/dataset` directory under the user directory.
- Step3: Load the pre-trained model
- ```python
model = hub.Module(name='user_guided_colorization', load_checkpoint=None)
model.set_config(classification=True, prob=1)
```
* `name`: Model name.
* `load_checkpoint`: Whether to load the self-trained model, if it is None, load the provided parameters.
* `classification`: The model is trained in two modes. At the beginning, `classification` is set to True for shallow network training. In the later stage of training, `classification` is set to False to train the output layer of the network.
* `prob`: The probability that no prior color hint block is added to each input image. The default is 1, i.e., no prior hint block is added. For example, when `prob` is set to 0.9, the probability that a picture carries exactly two prior hint blocks is (1-0.9)*(1-0.9)*0.9=0.009 (see the sketch below).
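- A quick sketch of this arithmetic (assuming the number of prior hint blocks follows the geometric distribution implied above):
- ```python
def prob_k_blocks(prob: float, k: int) -> float:
    # P(exactly k prior color hint blocks) under a geometric model:
    # each extra block is added independently with probability (1 - prob).
    return (1 - prob) ** k * prob

print(prob_k_blocks(0.9, 2))  # ~0.009, matching the worked example
```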
- Step4: Optimization strategy
```python
from paddlehub.finetune.trainer import Trainer

optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='img_colorization_ckpt_cls_1')
trainer.train(color_set, epochs=201, batch_size=25, eval_dataset=color_set, log_interval=10, save_interval=10)
```
- Run configuration
- `Trainer` mainly controls the Fine-tune training process, including the following controllable parameters:
* `model`: Optimized model.
* `optimizer`: Optimizer selection.
* `use_vdl`: Whether to use vdl to visualize the training process.
* `checkpoint_dir`: The storage address of the model parameters.
* `compare_metrics`: The measurement index of the optimal model.
- `trainer.train` mainly controls the specific training process, including the following controllable parameters:
* `train_dataset`: Training dataset.
* `epochs`: Epochs of training process.
* `batch_size`: Batch size.
* `num_workers`: Number of workers.
* `eval_dataset`: Validation dataset.
* `log_interval`: The interval for printing logs.
* `save_interval`: The interval for saving model parameters.
- Model prediction
- When Fine-tune is completed, the model with the best performance on the validation set will be saved in the `${CHECKPOINT_DIR}/best_model` directory. We use this model to make predictions. The `predict.py` script is as follows:
- ```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='user_guided_colorization', load_checkpoint='/PATH/TO/CHECKPOINT')
model.set_config(prob=0.1)
result = model.predict(images=['/PATH/TO/IMAGE'])
```
- **NOTE:** If you want to get the oil painting style, please download the parameter file [Canvas colorization](https://paddlehub.bj.bcebos.com/dygraph/models/canvas_rc.pdparams)
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of colorization.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m user_guided_colorization
```
- The serving API is now deployed, with 8866 as the default port.
- **NOTE:** If a GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
    # Encode an OpenCV image as JPEG, then as a base64 string.
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode a base64 string back into an OpenCV BGR image.
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/user_guided_colorization"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
data = base64_to_cv2(r.json()["results"]['data'][0]['fake_reg'])
cv2.imwrite('color.png', data)
```
## V. Release Note
* 1.0.0
First release
# dcscn
|Module Name|dcscn|
| :--- | :---: |
|Category |Image editing|
|Network|dcscn|
|Dataset|DIV2k|
|Fine-tuning supported or not|No|
|Module Size|260KB|
|Data indicators|PSNR37.63|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/133558583-0b7049db-ed1f-4a16-8676-f2141fcb3dee.png" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130899031-a6f8c58a-5cb7-4105-b990-8cca5ae15368.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- DCSCN is a super-resolution model based on "Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network". The model uses a residual structure and skip connections to extract local and global features, and a parallel 1x1 convolutional network to learn detailed features, improving model performance. This model provides a super-resolution result with scale factor x2.
- For more information, please refer to: [dcscn](https://github.com/jiny2001/dcscn-super-resolution)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install dcscn
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run dcscn --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import cv2
import paddlehub as hub
sr_model = hub.Module(name='dcscn')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
- ### 3、API
- ```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="dcscn_output")
```
- Prediction API.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format \[H, W, C\], BGR.
* paths (list\[str\]): Image paths.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**.
* visualization (bool): Whether to save the recognition results as picture files.
* output\_dir (str): Save path of images, "dcscn_output" by default.
- **Return**
* res (list\[dict\]): The list of model results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result, save_path is '' if no image is saved.
* data (numpy.ndarray): Result of super resolution.
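- A variant run that also writes the visualized result to disk (a sketch; paths and the output directory name are placeholders, parameters follow the documentation above):
- ```python
import cv2
import paddlehub as hub

sr_model = hub.Module(name='dcscn')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
# Save the visualized output under a custom directory instead of the default.
res = sr_model.reconstruct(images=[im], visualization=True, output_dir='sr_output')
print(res[0]['save_path'])
```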
- ```python
def save_inference_model(self,
dirname='dcscn_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
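- For example (a sketch; the directory name is arbitrary):
- ```python
import paddlehub as hub

sr_model = hub.Module(name='dcscn')
# Export the model and its parameters as a combined pair of files.
sr_model.save_inference_model(dirname='dcscn_save_model', combined=True)
```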
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of super resolution.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m dcscn
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If you use GPU for prediction, set the CUDA_VISIBLE_DEVICES environment variable (e.g. `export CUDA_VISIBLE_DEVICES=0`) before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
    import requests
    import json
    import base64

    import cv2
    import numpy as np

    def cv2_to_base64(image):
        # encode an image array as a base64 jpg string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    def base64_to_cv2(b64str):
        # decode a base64 string back into a BGR image array
        data = base64.b64decode(b64str.encode('utf8'))
        data = np.frombuffer(data, np.uint8)
        data = cv2.imdecode(data, cv2.IMREAD_COLOR)
        return data

    org_im = cv2.imread('/PATH/TO/IMAGE')
    data = {'images': [cv2_to_base64(org_im)]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/dcscn"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # use the returned result as the luminance (Y) channel, upscale the input's
    # chrominance (UV) channels with bicubic interpolation, then merge and
    # convert back to BGR before saving
    sr = np.expand_dims(cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY), axis=2)
    shape = sr.shape
    org_im = cv2.cvtColor(org_im, cv2.COLOR_BGR2YUV)
    uv = cv2.resize(org_im[..., 1:], (shape[1], shape[0]), interpolation=cv2.INTER_CUBIC)
    combine_im = cv2.cvtColor(np.concatenate((sr, uv), axis=2), cv2.COLOR_YUV2BGR)
    cv2.imwrite('dcscn_X2.png', combine_im)
    print("save image as dcscn_X2.png")
```
## V. Release Note
- 1.0.0
First release
# falsr_a
|Module Name|falsr_a|
| :--- | :---: |
|Category |Image editing|
|Network |falsr_a|
|Dataset|DIV2K|
|Fine-tuning supported or not|No|
|Module Size |8.9MB|
|Data indicators|PSNR37.82|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/133558583-0b7049db-ed1f-4a16-8676-f2141fcb3dee.png" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130899031-a6f8c58a-5cb7-4105-b990-8cca5ae15368.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Falsr_a is a lightweight super-resolution model based on "Accurate and Lightweight Super-Resolution with Neural Architecture Search". The model treats super-resolution as a multi-objective problem and uses an elastic search strategy based on a hybrid controller to improve model performance. This model produces super-resolution results with a scale factor of x2.
- For more information, please refer to: [falsr_a](https://github.com/xiaomi-automl/FALSR)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install falsr_a
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```
$ hub run falsr_a --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import cv2
import paddlehub as hub
sr_model = hub.Module(name='falsr_a')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
- ### 3、API
- ```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_a_output")
```
- Prediction API.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format \[H, W, C\], BGR.
* paths (list\[str\]): Image path.
* use\_gpu (bool): Use GPU or not. **Set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU.**
* visualization (bool): Whether to save the recognition results as picture files.
* output\_dir (str): Save path of images, "falsr_a_output" by default.
- **Return**
* res (list\[dict\]): The list of model results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result, save_path is '' if no image is saved.
* data (numpy.ndarray): result of super resolution.
- ```python
def save_inference_model(self,
dirname='falsr_a_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_.
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True).
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of super resolution.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m falsr_a
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If you use GPU for prediction, set the CUDA_VISIBLE_DEVICES environment variable (e.g. `export CUDA_VISIBLE_DEVICES=0`) before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
    import requests
    import json
    import base64

    import cv2
    import numpy as np

    def cv2_to_base64(image):
        # encode an image array as a base64 jpg string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    def base64_to_cv2(b64str):
        # decode a base64 string back into a BGR image array
        data = base64.b64decode(b64str.encode('utf8'))
        data = np.frombuffer(data, np.uint8)
        data = cv2.imdecode(data, cv2.IMREAD_COLOR)
        return data

    org_im = cv2.imread('/PATH/TO/IMAGE')
    data = {'images': [cv2_to_base64(org_im)]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/falsr_a"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # decode and save the super-resolved image
    sr = base64_to_cv2(r.json()["results"][0]['data'])
    cv2.imwrite('falsr_a_X2.png', sr)
    print("save image as falsr_a_X2.png")
```
## V. Release Note
- 1.0.0
First release
# falsr_b
|Module Name|falsr_b|
| :--- | :---: |
|Category |Image editing|
|Network |falsr_b|
|Dataset|DIV2K|
|Fine-tuning supported or not|No|
|Module Size |4MB|
|Data indicators|PSNR37.61|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/133558583-0b7049db-ed1f-4a16-8676-f2141fcb3dee.png" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130899031-a6f8c58a-5cb7-4105-b990-8cca5ae15368.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Falsr_b is a lightweight super-resolution model based on "Accurate and Lightweight Super-Resolution with Neural Architecture Search". The model treats super-resolution as a multi-objective problem and uses an elastic search strategy based on a hybrid controller to improve model performance. This model produces super-resolution results with a scale factor of x2.
- For more information, please refer to:[falsr_b](https://github.com/xiaomi-automl/FALSR)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install falsr_b
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```
$ hub run falsr_b --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
```python
import cv2
import paddlehub as hub
sr_model = hub.Module(name='falsr_b')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
- ### 3、API
- ```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_b_output")
```
- Prediction API.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format \[H, W, C\], BGR.
* paths (list\[str\]): Image path.
* use\_gpu (bool): Use GPU or not. **Set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU.**
* visualization (bool): Whether to save the recognition results as picture files.
* output\_dir (str): Save path of images, "falsr_b_output" by default.
- **Return**
* res (list\[dict\]): The list of model results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result, save_path is '' if no image is saved.
* data (numpy.ndarray): Result of super resolution.
- ```python
def save_inference_model(self,
dirname='falsr_b_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_.
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True).
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of super resolution.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m falsr_b
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If you use GPU for prediction, set the CUDA_VISIBLE_DEVICES environment variable (e.g. `export CUDA_VISIBLE_DEVICES=0`) before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
    import requests
    import json
    import base64

    import cv2
    import numpy as np

    def cv2_to_base64(image):
        # encode an image array as a base64 jpg string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    def base64_to_cv2(b64str):
        # decode a base64 string back into a BGR image array
        data = base64.b64decode(b64str.encode('utf8'))
        data = np.frombuffer(data, np.uint8)
        data = cv2.imdecode(data, cv2.IMREAD_COLOR)
        return data

    org_im = cv2.imread('/PATH/TO/IMAGE')
    data = {'images': [cv2_to_base64(org_im)]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/falsr_b"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # decode and save the super-resolved image
    sr = base64_to_cv2(r.json()["results"][0]['data'])
    cv2.imwrite('falsr_b_X2.png', sr)
    print("save image as falsr_b_X2.png")
```
## V. Release Note
- 1.0.0
First release
......@@ -51,7 +51,7 @@
- ```
$ hub run falsr_c --input_path "/PATH/TO/IMAGE"
```
- ### Code Example
- ### 2、Prediction Code Example
```python
import cv2
......@@ -65,7 +65,7 @@
sr_model.save_inference_model()
```
- ### 2、API
- ### 3、API
- ```python
def reconstruct(self,
......
# falsr_c
|Module Name|falsr_c|
| :--- | :---: |
|Category |Image editing|
|Network |falsr_c|
|Dataset|DIV2K|
|Fine-tuning supported or not|No|
|Module Size |4.4MB|
|Data indicators|PSNR37.66|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/133558583-0b7049db-ed1f-4a16-8676-f2141fcb3dee.png" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130899031-a6f8c58a-5cb7-4105-b990-8cca5ae15368.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Falsr_c is a lightweight super-resolution model based on "Accurate and Lightweight Super-Resolution with Neural Architecture Search". The model treats super-resolution as a multi-objective problem and uses an elastic search strategy based on a hybrid controller to improve model performance. This model produces super-resolution results with a scale factor of x2.
- For more information, please refer to:[falsr_c](https://github.com/xiaomi-automl/FALSR)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install falsr_c
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```
$ hub run falsr_c --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
```python
import cv2
import paddlehub as hub
sr_model = hub.Module(name='falsr_c')
im = cv2.imread('/PATH/TO/IMAGE').astype('float32')
res = sr_model.reconstruct(images=[im], visualization=True)
print(res[0]['data'])
sr_model.save_inference_model()
```
- ### 3、API
- ```python
def reconstruct(self,
images=None,
paths=None,
use_gpu=False,
visualization=False,
output_dir="falsr_c_output")
```
- Prediction API.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format \[H, W, C\], BGR.
* paths (list\[str\]): Image path.
* use\_gpu (bool): Use GPU or not. **Set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU.**
* visualization (bool): Whether to save the recognition results as picture files.
* output\_dir (str): Save path of images, "falsr_c_output" by default.
- **Return**
* res (list\[dict\]): The list of model results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result, save_path is '' if no image is saved.
* data (numpy.ndarray): Result of super resolution.
- ```python
def save_inference_model(self,
dirname='falsr_c_save_model',
model_filename=None,
params_filename=None,
combined=False)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_.
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True).
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of super resolution.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m falsr_c
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If you use GPU for prediction, set the CUDA_VISIBLE_DEVICES environment variable (e.g. `export CUDA_VISIBLE_DEVICES=0`) before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
    import requests
    import json
    import base64

    import cv2
    import numpy as np

    def cv2_to_base64(image):
        # encode an image array as a base64 jpg string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    def base64_to_cv2(b64str):
        # decode a base64 string back into a BGR image array
        data = base64.b64decode(b64str.encode('utf8'))
        data = np.frombuffer(data, np.uint8)
        data = cv2.imdecode(data, cv2.IMREAD_COLOR)
        return data

    org_im = cv2.imread('/PATH/TO/IMAGE')
    data = {'images': [cv2_to_base64(org_im)]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/falsr_c"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # decode and save the super-resolved image
    sr = base64_to_cv2(r.json()["results"][0]['data'])
    cv2.imwrite('falsr_c_X2.png', sr)
    print("save image as falsr_c_X2.png")
```
## V. Release Note
- 1.0.0
First release
......@@ -57,7 +57,7 @@
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
import paddlehub as hub
......
# realsr
|Module Name|realsr|
| :--- | :---: |
|Category |Image editing|
|Network|LP-KPN|
|Dataset |RealSR dataset|
|Fine-tuning supported or not|No|
|Module Size |64MB|
|Latest update date|2021-02-26|
|Data indicators |PSNR29.05|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/133558583-0b7049db-ed1f-4a16-8676-f2141fcb3dee.png" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130789888-a0d4f78e-acd6-44c1-9570-7390e90ae8dc.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Realsr is a super-resolution model for images and videos based on "Toward Real-World Single Image Super-Resolution: A New Benchmark and A New Model". This model produces super-resolution results with a scale factor of x4.
- For more information, please refer to: [realsr](https://github.com/csjcai/RealSR)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- **NOTE**: This Module relies on ffmpeg; please install ffmpeg before using this Module.
```shell
$ conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
```
- ### 2、Installation
- ```shell
$ hub install realsr
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import paddlehub as hub
model = hub.Module(name='realsr')
model.predict('/PATH/TO/IMAGE/OR/VIDEO')
```
- ### 2、API
- ```python
def predict(self, input):
```
- Prediction API.
- **Parameter**
- input (str): Path of the input image or video.
- **Return**
- If input is image path, the output is:
- pred_img(np.ndarray): image data, ndarray.shape is in the format [H, W, C], BGR.
- out_path(str): save path of images.
- If input is video path, the output is:
- frame_pattern_combined(str): save path of frames from output video.
- vid_out_path(str): save path of output video.
- ```python
def run_image(self, img):
```
- Prediction API for images.
- **Parameter**
- img (str|np.ndarray): Image data, str or ndarray. ndarray.shape is in the format [H, W, C], BGR.
- **Return**
- pred_img(np.ndarray): Prediction result, ndarray.shape is in the format [H, W, C], BGR.
- ```python
def run_video(self, video):
```
- Prediction API for video.
- **Parameter**
- video(str): Video path.
- **Return**
- frame_pattern_combined(str): Save path of frames from output video.
- vid_out_path(str): Save path of output video.
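  - A minimal usage sketch of `run_image` and `run_video`, unpacking the return values exactly as documented above:
  - ```python
    import cv2
    import paddlehub as hub

    model = hub.Module(name='realsr')

    # super-resolve a single image: BGR ndarray in, BGR ndarray out
    img = cv2.imread('/PATH/TO/IMAGE')
    pred_img = model.run_image(img)
    cv2.imwrite('realsr_X4.png', pred_img)

    # super-resolve a video: returns the frame pattern and the output video path
    frame_pattern_combined, vid_out_path = model.run_video('/PATH/TO/VIDEO')
    print(vid_out_path)
    ```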
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of image super resolution.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m realsr
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If you use GPU for prediction, set the CUDA_VISIBLE_DEVICES environment variable (e.g. `export CUDA_VISIBLE_DEVICES=0`) before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
    import requests
    import json
    import base64

    import cv2
    import numpy as np

    def cv2_to_base64(image):
        # encode an image array as a base64 jpg string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    def base64_to_cv2(b64str):
        # decode a base64 string back into a BGR image array
        data = base64.b64decode(b64str.encode('utf8'))
        data = np.frombuffer(data, np.uint8)
        data = cv2.imdecode(data, cv2.IMREAD_COLOR)
        return data

    org_im = cv2.imread('/PATH/TO/IMAGE')
    data = {'images': cv2_to_base64(org_im)}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/realsr"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # decode and save the super-resolved image
    img = base64_to_cv2(r.json()["results"])
    cv2.imwrite('/PATH/TO/SAVE/IMAGE', img)
```
## V. Release Note
- 1.0.0
First release
* 1.0.1
Support paddlehub2.0
# attgan_celeba
|Module Name|attgan_celeba|
| :--- | :---: |
|Category|Image generation|
|Network |AttGAN|
|Dataset|Celeba|
|Fine-tuning supported or not |No|
|Module Size |167MB|
|Latest update date|2021-02-26|
|Data indicators |-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/137855667-43c5c40c-28f5-45d8-accc-028e185b988f.JPG" width=1200><br/>
The image attributes are: original image, Bald, Bangs, Black_Hair, Blond_Hair, Brown_Hair, Bushy_Eyebrows, Eyeglasses, Gender, Mouth_Slightly_Open, Mustache, No_Beard, Pale_Skin, Aged<br/>
</p>
- ### Module Introduction
- AttGAN is a Generative Adversarial Network that uses classification loss and reconstruction loss to train the network. The PaddleHub Module is trained on the Celeba dataset and currently supports the attributes "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged".
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.5.2
- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_en/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install attgan_celeba==1.0.0
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md).
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run attgan_celeba --image "/PATH/TO/IMAGE" --style "target_attribute"
```
- **Parameters**
- image: Input image path.
- style: Specify the attributes to be converted. The options are "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged". You can choose one of the options.
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
attgan = hub.Module(name="attgan_celeba")
test_img_path = ["/PATH/TO/IMAGE"]
trans_attr = ["Bangs"]
# set input dict
input_dict = {"image": test_img_path, "style": trans_attr}
# execute predict and print the result
results = attgan.generate(data=input_dict)
print(results)
```
- ### 3、API
- ```python
def generate(data)
```
- Style transfer API.
- **Parameter**
- data (list\[dict\]): Each element in the list is a dict whose fields are:
- image (list\[str\]): Each element in the list is the path of the image to be converted.
- style (list\[str\]): Each element in the list is a string, fill in the face attributes to be converted.
- **Return**
- res (list\[str\]): Save path of the result.
## IV. Release Note
- 1.0.0
First release
# cyclegan_cityscapes
|Module Name|cyclegan_cityscapes|
| :--- | :---: |
|Category |Image generation|
|Network |CycleGAN|
|Dataset|Cityscapes|
|Fine-tuning supported or not |No|
|Module Size |33MB|
|Latest update date |2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/137839740-4be4cf40-816f-401e-a73f-6cda037041dd.png" width = "450" height = "300" hspace='10'/>
<br />
Input image
<br />
<img src="https://user-images.githubusercontent.com/35907364/137839777-89fc705b-f0d7-4a93-94e2-76c0d3c5a0b0.png" width = "450" height = "300" hspace='10'/>
<br />
Output image
<br />
</p>
- ### Module Introduction
- CycleGAN belongs to the family of Generative Adversarial Networks (GANs). Unlike traditional GANs, which translate images in only one direction, CycleGAN learns the style transfer between two domains simultaneously. The PaddleHub Module is trained on the Cityscapes dataset and supports conversion from real images to semantic segmentation results, as well as conversion from semantic segmentation results to real images.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.4.0
- paddlehub >= 1.1.0
- ### 2、Installation
- ```shell
$ hub install cyclegan_cityscapes==1.0.0
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run cyclegan_cityscapes --input_path "/PATH/TO/IMAGE"
```
- **Parameters**
- input_path: image path
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
cyclegan = hub.Module(name="cyclegan_cityscapes")
test_img_path = "/PATH/TO/IMAGE"
# set input dict
input_dict = {"image": [test_img_path]}
# execute predict and print the result
results = cyclegan.generate(data=input_dict)
print(results)
```
- ### 3、API
- ```python
def generate(data)
```
- Style transfer API.
- **Parameters**
- data (list\[dict\]): Each element in the list is a dict whose fields are:
- image (list\[str\]): Image path.
- **Return**
- res (list\[dict\]): The list of style transfer results, where each element is a dict whose fields are:
  - origin: Original input path.
  - generated: Save path of the generated image.
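  - Continuing the prediction code example above, the returned fields can be read back like this (a minimal sketch):
  - ```python
    # 'results' is the list returned by cyclegan.generate(...) in the example above
    for item in results:
        print(item['origin'], '->', item['generated'])
    ```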
## IV. Release Note
* 1.0.0
First release
# first_order_motion
|Module Name|first_order_motion|
| :--- | :---: |
|Category|Image generation|
|Network|S3FD|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size|343MB|
|Latest update date|2021-12-24|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
  - Sample results:
    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/147347145-1a7e84b6-2853-4490-8eaf-caf9cfdca79b.png" width = "40%" hspace='10'/>
    <br />
    Input image
    <br />
    <img src="https://user-images.githubusercontent.com/22424850/147347151-d6c5690b-00cd-433f-b82b-3f8bb90bc7bd.gif" width = "40%" hspace='10'/>
    <br />
    Input video
    <br />
    <img src="https://user-images.githubusercontent.com/22424850/147348127-52eb3f26-9b2c-49d5-a4a2-20a31f159802.gif" width = "40%" hspace='10'/>
    <br />
    Output video
    <br />
    </p>
- ### Module Introduction
  - First Order Motion performs image animation: given a source image and a driving video, the person in the source image is animated to follow the motions of the person in the driving video.
## II. Installation
- ### 1、Environmental Dependence
  - paddlepaddle >= 2.1.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
  - ```shell
    $ hub install first_order_motion
    ```
  - In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux_Quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
  - ```shell
    $ hub run first_order_motion --source_image "/PATH/TO/IMAGE" --driving_video "/PATH/TO/VIDEO" --use_gpu
    ```
  - If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
  - ```python
    import paddlehub as hub

    module = hub.Module(name="first_order_motion")
    module.generate(source_image="/PATH/TO/IMAGE", driving_video="/PATH/TO/VIDEO", ratio=0.4, image_size=256, output_dir='./motion_driving_result/', filename='result.mp4', use_gpu=False)
    ```
- ### 3、API
  - ```python
    generate(self, source_image=None, driving_video=None, ratio=0.4, image_size=256, output_dir='./motion_driving_result/', filename='result.mp4', use_gpu=False)
    ```
    - Motion-driven video generation API.
    - **Parameters**
      - source\_image (str): Source image; both single-person and multi-person images are supported. The expressions and movements of the person in the driving video are transferred to the person in this image.
      - driving\_video (str): Driving video; the expressions and movements of the person in it are the motions to be transferred.
      - ratio (float): Ratio of the generated face region pasted back to the original image. Tune this parameter according to the generated results, especially when several faces are close to each other; 0.4 by default, with an adjustment range of [0.4, 0.5].
      - image\_size (int): Face image size, 256 by default; can be set to 512.
      - output\_dir (str): Directory for saving the results.
      - filename (str): Filename for saving the results.
      - use\_gpu (bool): Use GPU or not.
## IV. Release Note
* 1.0.0
  First release
  - ```shell
    $ hub install first_order_motion==1.0.0
    ```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import math
import pickle
import yaml
import imageio
import numpy as np
from tqdm import tqdm
from scipy.spatial import ConvexHull
import cv2
import paddle
from ppgan.utils.download import get_path_from_url
from ppgan.utils.animate import normalize_kp
from ppgan.modules.keypoint_detector import KPDetector
from ppgan.models.generators.occlusion_aware import OcclusionAwareGenerator
from ppgan.faceutils import face_detection
class FirstOrderPredictor:
def __init__(self,
weight_path=None,
config=None,
image_size=256,
relative=True,
adapt_scale=False,
find_best_frame=False,
best_frame=None,
face_detector='sfd',
multi_person=False,
face_enhancement=True,
batch_size=1,
mobile_net=False):
if config is not None and isinstance(config, str):
with open(config) as f:
self.cfg = yaml.load(f, Loader=yaml.SafeLoader)
elif isinstance(config, dict):
self.cfg = config
elif config is None:
self.cfg = {
'model': {
'common_params': {
'num_kp': 10,
'num_channels': 3,
'estimate_jacobian': True
},
'generator': {
'kp_detector_cfg': {
'temperature': 0.1,
'block_expansion': 32,
'max_features': 1024,
'scale_factor': 0.25,
'num_blocks': 5
},
'generator_cfg': {
'block_expansion': 64,
'max_features': 512,
'num_down_blocks': 2,
'num_bottleneck_blocks': 6,
'estimate_occlusion_map': True,
'dense_motion_params': {
'block_expansion': 64,
'max_features': 1024,
'num_blocks': 5,
'scale_factor': 0.25
}
}
}
}
}
self.image_size = image_size
if weight_path is None:
if mobile_net:
vox_cpk_weight_url = 'https://paddlegan.bj.bcebos.com/applications/first_order_model/vox-mobile.pdparams'
else:
if self.image_size == 512:
vox_cpk_weight_url = 'https://paddlegan.bj.bcebos.com/applications/first_order_model/vox-cpk-512.pdparams'
else:
vox_cpk_weight_url = 'https://paddlegan.bj.bcebos.com/applications/first_order_model/vox-cpk.pdparams'
weight_path = get_path_from_url(vox_cpk_weight_url)
self.weight_path = weight_path
self.relative = relative
self.adapt_scale = adapt_scale
self.find_best_frame = find_best_frame
self.best_frame = best_frame
self.face_detector = face_detector
self.generator, self.kp_detector = self.load_checkpoints(self.cfg, self.weight_path)
self.multi_person = multi_person
self.face_enhancement = face_enhancement
self.batch_size = batch_size
if face_enhancement:
from ppgan.faceutils.face_enhancement import FaceEnhancement
self.faceenhancer = FaceEnhancement(batch_size=batch_size)
def read_img(self, path):
img = imageio.imread(path)
if img.ndim == 2:
img = np.expand_dims(img, axis=2)
        # some images have 4 channels
if img.shape[2] > 3:
img = img[:, :, :3]
return img
def run(self, source_image, driving_video, ratio, image_size, output_dir, filename):
self.ratio = ratio
self.image_size = image_size
self.output = output_dir
self.filename = filename
if not os.path.exists(output_dir):
os.makedirs(output_dir)
def get_prediction(face_image):
if self.find_best_frame or self.best_frame is not None:
i = self.best_frame if self.best_frame is not None else self.find_best_frame_func(
source_image, driving_video)
print("Best frame: " + str(i))
driving_forward = driving_video[i:]
driving_backward = driving_video[:(i + 1)][::-1]
predictions_forward = self.make_animation(
face_image,
driving_forward,
self.generator,
self.kp_detector,
relative=self.relative,
adapt_movement_scale=self.adapt_scale)
predictions_backward = self.make_animation(
face_image,
driving_backward,
self.generator,
self.kp_detector,
relative=self.relative,
adapt_movement_scale=self.adapt_scale)
predictions = predictions_backward[::-1] + predictions_forward[1:]
else:
predictions = self.make_animation(
face_image,
driving_video,
self.generator,
self.kp_detector,
relative=self.relative,
adapt_movement_scale=self.adapt_scale)
return predictions
source_image = self.read_img(source_image)
reader = imageio.get_reader(driving_video)
fps = reader.get_meta_data()['fps']
driving_video = []
try:
for im in reader:
driving_video.append(im)
except RuntimeError:
print("Read driving video error!")
pass
reader.close()
driving_video = [cv2.resize(frame, (self.image_size, self.image_size)) / 255.0 for frame in driving_video]
results = []
bboxes = self.extract_bbox(source_image.copy())
print(str(len(bboxes)) + " persons have been detected")
# for multi person
for rec in bboxes:
face_image = source_image.copy()[rec[1]:rec[3], rec[0]:rec[2]]
face_image = cv2.resize(face_image, (self.image_size, self.image_size)) / 255.0
predictions = get_prediction(face_image)
results.append({'rec': rec, 'predict': [predictions[i] for i in range(predictions.shape[0])]})
if len(bboxes) == 1 or not self.multi_person:
break
out_frame = []
for i in range(len(driving_video)):
frame = source_image.copy()
for result in results:
x1, y1, x2, y2, _ = result['rec']
h = y2 - y1
w = x2 - x1
out = result['predict'][i]
out = cv2.resize(out.astype(np.uint8), (x2 - x1, y2 - y1))
if len(results) == 1:
frame[y1:y2, x1:x2] = out
break
else:
patch = np.zeros(frame.shape).astype('uint8')
patch[y1:y2, x1:x2] = out
mask = np.zeros(frame.shape[:2]).astype('uint8')
cx = int((x1 + x2) / 2)
cy = int((y1 + y2) / 2)
cv2.circle(mask, (cx, cy), math.ceil(h * self.ratio), (255, 255, 255), -1, 8, 0)
frame = cv2.copyTo(patch, mask, frame)
out_frame.append(frame)
imageio.mimsave(os.path.join(self.output, self.filename), [frame for frame in out_frame], fps=fps)
def load_checkpoints(self, config, checkpoint_path):
generator = OcclusionAwareGenerator(
**config['model']['generator']['generator_cfg'], **config['model']['common_params'], inference=True)
kp_detector = KPDetector(**config['model']['generator']['kp_detector_cfg'], **config['model']['common_params'])
checkpoint = paddle.load(self.weight_path)
generator.set_state_dict(checkpoint['generator'])
kp_detector.set_state_dict(checkpoint['kp_detector'])
generator.eval()
kp_detector.eval()
return generator, kp_detector
def make_animation(self,
source_image,
driving_video,
generator,
kp_detector,
relative=True,
adapt_movement_scale=True):
with paddle.no_grad():
predictions = []
source = paddle.to_tensor(source_image[np.newaxis].astype(np.float32)).transpose([0, 3, 1, 2])
driving = paddle.to_tensor(np.array(driving_video).astype(np.float32)).transpose([0, 3, 1, 2])
kp_source = kp_detector(source)
kp_driving_initial = kp_detector(driving[0:1])
kp_source_batch = {}
kp_source_batch["value"] = paddle.tile(kp_source["value"], repeat_times=[self.batch_size, 1, 1])
kp_source_batch["jacobian"] = paddle.tile(kp_source["jacobian"], repeat_times=[self.batch_size, 1, 1, 1])
source = paddle.tile(source, repeat_times=[self.batch_size, 1, 1, 1])
begin_idx = 0
for frame_idx in tqdm(range(int(np.ceil(float(driving.shape[0]) / self.batch_size)))):
frame_num = min(self.batch_size, driving.shape[0] - begin_idx)
driving_frame = driving[begin_idx:begin_idx + frame_num]
kp_driving = kp_detector(driving_frame)
kp_source_img = {}
kp_source_img["value"] = kp_source_batch["value"][0:frame_num]
kp_source_img["jacobian"] = kp_source_batch["jacobian"][0:frame_num]
kp_norm = normalize_kp(
kp_source=kp_source,
kp_driving=kp_driving,
kp_driving_initial=kp_driving_initial,
use_relative_movement=relative,
use_relative_jacobian=relative,
adapt_movement_scale=adapt_movement_scale)
out = generator(source[0:frame_num], kp_source=kp_source_img, kp_driving=kp_norm)
img = np.transpose(out['prediction'].numpy(), [0, 2, 3, 1]) * 255.0
if self.face_enhancement:
img = self.faceenhancer.enhance_from_batch(img)
predictions.append(img)
begin_idx += frame_num
return np.concatenate(predictions)
def find_best_frame_func(self, source, driving):
import face_alignment
def normalize_kp(kp):
kp = kp - kp.mean(axis=0, keepdims=True)
area = ConvexHull(kp[:, :2]).volume
area = np.sqrt(area)
kp[:, :2] = kp[:, :2] / area
return kp
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=True)
kp_source = fa.get_landmarks(255 * source)[0]
kp_source = normalize_kp(kp_source)
norm = float('inf')
frame_num = 0
for i, image in tqdm(enumerate(driving)):
kp_driving = fa.get_landmarks(255 * image)[0]
kp_driving = normalize_kp(kp_driving)
new_norm = (np.abs(kp_source - kp_driving)**2).sum()
if new_norm < norm:
norm = new_norm
frame_num = i
return frame_num
def extract_bbox(self, image):
detector = face_detection.FaceAlignment(
face_detection.LandmarksType._2D, flip_input=False, face_detector=self.face_detector)
frame = [image]
predictions = detector.get_detections_for_image(np.array(frame))
person_num = len(predictions)
if person_num == 0:
return np.array([])
results = []
face_boxs = []
h, w, _ = image.shape
for rect in predictions:
bh = rect[3] - rect[1]
bw = rect[2] - rect[0]
cy = rect[1] + int(bh / 2)
cx = rect[0] + int(bw / 2)
margin = max(bh, bw)
y1 = max(0, cy - margin)
x1 = max(0, cx - int(0.8 * margin))
y2 = min(h, cy + margin)
x2 = min(w, cx + int(0.8 * margin))
area = (y2 - y1) * (x2 - x1)
results.append([x1, y1, x2, y2, area])
        # if a person has more than one bbox, keep the largest one
        # maybe greedy will be better?
        # sorted() returns a new list, so the result must be assigned back
        results = sorted(results, key=lambda box: box[4], reverse=True)
        results_box = [results[0]]
for i in range(1, person_num):
num = len(results_box)
add_person = True
for j in range(num):
pre_person = results_box[j]
iou = self.IOU(pre_person[0], pre_person[1], pre_person[2], pre_person[3], pre_person[4], results[i][0],
results[i][1], results[i][2], results[i][3], results[i][4])
if iou > 0.5:
add_person = False
break
if add_person:
results_box.append(results[i])
boxes = np.array(results_box)
return boxes
def IOU(self, ax1, ay1, ax2, ay2, sa, bx1, by1, bx2, by2, sb):
#sa = abs((ax2 - ax1) * (ay2 - ay1))
#sb = abs((bx2 - bx1) * (by2 - by1))
x1, y1 = max(ax1, bx1), max(ay1, by1)
x2, y2 = min(ax2, bx2), min(ay2, by2)
w = x2 - x1
h = y2 - y1
if w < 0 or h < 0:
return 0.0
else:
return 1.0 * w * h / (sa + sb - w * h)
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
import copy
import paddle
import paddlehub as hub
from paddlehub.module.module import moduleinfo, runnable, serving
import numpy as np
import cv2
from skimage.io import imread
from skimage.transform import rescale, resize
from .model import FirstOrderPredictor
@moduleinfo(
name="first_order_motion", type="CV/gan", author="paddlepaddle", author_email="", summary="", version="1.0.0")
class FirstOrderMotion:
def __init__(self):
self.pretrained_model = os.path.join(self.directory, "vox-cpk.pdparams")
self.network = FirstOrderPredictor(weight_path=self.pretrained_model, face_enhancement=True)
def generate(self,
source_image=None,
driving_video=None,
ratio=0.4,
image_size=256,
output_dir='./motion_driving_result/',
filename='result.mp4',
use_gpu=False):
        '''
        source_image (str): path to source image
        driving_video (str): path to driving video
        ratio (float): margin ratio
        image_size (int): size of image
        output_dir (str): the dir to save the results
        filename (str): filename to save the results
        use_gpu (bool): if True, use gpu to perform the computation, otherwise cpu.
        '''
paddle.disable_static()
place = 'gpu:0' if use_gpu else 'cpu'
place = paddle.set_device(place)
        if source_image is None or driving_video is None:
print('No image or driving video provided. Please input an image and a driving video.')
return
self.network.run(source_image, driving_video, ratio, image_size, output_dir, filename)
@runnable
def run_cmd(self, argvs: list):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options", description="Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
self.args = self.parser.parse_args(argvs)
        self.generate(
            source_image=self.args.source_image,
            driving_video=self.args.driving_video,
            ratio=self.args.ratio,
            image_size=self.args.image_size,
            output_dir=self.args.output_dir,
            filename=self.args.filename,
            use_gpu=self.args.use_gpu)
return
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
self.arg_config_group.add_argument(
'--output_dir', type=str, default='motion_driving_result', help='output directory for saving result.')
self.arg_config_group.add_argument("--filename", default='result.mp4', help="filename to output")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument("--source_image", type=str, help="path to source image")
self.arg_input_group.add_argument("--driving_video", type=str, help="path to driving video")
self.arg_input_group.add_argument("--ratio", dest="ratio", type=float, default=0.4, help="margin ratio")
self.arg_input_group.add_argument(
"--image_size", dest="image_size", type=int, default=256, help="size of image")
# pixel2style2pixel
|Module Name|pixel2style2pixel|
| :--- | :---: |
|Category|Image generation|
|Network|Pixel2Style2Pixel|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size|1.7GB|
|Latest update date|2021-12-14|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
  - Sample results:
    <p align="center">
    <img src="https://user-images.githubusercontent.com/22424850/146486444-63637926-4e46-4299-8905-d93f529d9d54.jpg" width = "40%" hspace='10'/>
    <br />
    Input image
    <br />
    <img src="https://user-images.githubusercontent.com/22424850/146486413-0447dcc8-80ac-4b2c-8a7a-69347d60a2c4.png" width = "40%" hspace='10'/>
    <br />
    Output image
    <br />
    </p>
- ### Module Introduction
  - Pixel2Style2Pixel uses a fairly large encoder to map an image into the style vector space of StyleGAN V2, so that the image before encoding and the image after decoding are strongly correlated. This module is applied to the face frontalization task.
## II. Installation
- ### 1、Environmental Dependence
  - paddlepaddle >= 2.1.0
  - paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
  - ```shell
    $ hub install pixel2style2pixel
    ```
  - In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [Linux_Quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
  - ```shell
    # Read from a file
    $ hub run pixel2style2pixel --input_path "/PATH/TO/IMAGE"
    ```
  - If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
  - ```python
    import paddlehub as hub

    module = hub.Module(name="pixel2style2pixel")
    input_path = ["/PATH/TO/IMAGE"]
    # Read from a file
    module.style_transfer(paths=input_path, output_dir='./transfer_result/', use_gpu=True)
    ```
- ### 3、API
  - ```python
    style_transfer(images=None, paths=None, output_dir='./transfer_result/', use_gpu=False, visualization=True):
    ```
    - Face frontalization API.
    - **Parameters**
      - images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format \[H, W, C\].
      - paths (list\[str\]): Image paths.
      - output\_dir (str): Save path of the results.
      - use\_gpu (bool): Use GPU or not.
      - visualization (bool): Whether to save the results to a local folder.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of face frontalization.
- ### Step 1: Start PaddleHub Serving
  - Run the startup command:
  - ```shell
    $ hub serving start -m pixel2style2pixel
    ```
  - The serving API is now deployed, with the default port number 8866.
  - **NOTE:** If you use GPU for prediction, set the CUDA_VISIBLE_DEVICES environment variable (e.g. `export CUDA_VISIBLE_DEVICES=0`) before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
  - With a configured server, use the following lines of code to send the prediction request and obtain the result
  - ```python
    import requests
    import json
    import base64

    import cv2

    def cv2_to_base64(image):
        # encode an image array as a base64 jpg string
        data = cv2.imencode('.jpg', image)[1]
        return base64.b64encode(data.tobytes()).decode('utf8')

    # Send an HTTP request
    data = {'images': [cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/pixel2style2pixel"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # Print the prediction results
    print(r.json()["results"])
    ```
## V. Release Note
* 1.0.0
  First release
  - ```shell
    $ hub install pixel2style2pixel==1.0.0
    ```
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import cv2
import scipy
import random
import numpy as np
import paddle
import paddle.vision.transforms as T
import ppgan.faceutils as futils
from ppgan.models.generators import Pixel2Style2Pixel
from ppgan.utils.download import get_path_from_url
from PIL import Image
model_cfgs = {
'ffhq-inversion': {
'model_urls':
'https://paddlegan.bj.bcebos.com/models/pSp-ffhq-inversion.pdparams',
'transform':
T.Compose([T.Resize((256, 256)),
T.Transpose(),
T.Normalize([127.5, 127.5, 127.5], [127.5, 127.5, 127.5])]),
'size':
1024,
'style_dim':
512,
'n_mlp':
8,
'channel_multiplier':
2
},
'ffhq-toonify': {
'model_urls':
'https://paddlegan.bj.bcebos.com/models/pSp-ffhq-toonify.pdparams',
'transform':
T.Compose([T.Resize((256, 256)),
T.Transpose(),
T.Normalize([127.5, 127.5, 127.5], [127.5, 127.5, 127.5])]),
'size':
1024,
'style_dim':
512,
'n_mlp':
8,
'channel_multiplier':
2
},
'default': {
'transform':
T.Compose([T.Resize((256, 256)),
T.Transpose(),
T.Normalize([127.5, 127.5, 127.5], [127.5, 127.5, 127.5])])
}
}
def run_alignment(image):
img = Image.fromarray(image).convert("RGB")
face = futils.dlib.detect(img)
if not face:
raise Exception('Could not find a face in the given image.')
face_on_image = face[0]
lm = futils.dlib.landmarks(img, face_on_image)
lm = np.array(lm)[:, ::-1]
lm_eye_left = lm[36:42]
lm_eye_right = lm[42:48]
lm_mouth_outer = lm[48:60]
output_size = 1024
transform_size = 4096
enable_padding = True
# Calculate auxiliary vectors.
eye_left = np.mean(lm_eye_left, axis=0)
eye_right = np.mean(lm_eye_right, axis=0)
eye_avg = (eye_left + eye_right) * 0.5
eye_to_eye = eye_right - eye_left
mouth_left = lm_mouth_outer[0]
mouth_right = lm_mouth_outer[6]
mouth_avg = (mouth_left + mouth_right) * 0.5
eye_to_mouth = mouth_avg - eye_avg
# Choose oriented crop rectangle.
x = eye_to_eye - np.flipud(eye_to_mouth) * [-1, 1]
x /= np.hypot(*x)
x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
y = np.flipud(x) * [-1, 1]
c = eye_avg + eye_to_mouth * 0.1
quad = np.stack([c - x - y, c - x + y, c + x + y, c + x - y])
qsize = np.hypot(*x) * 2
# Shrink.
shrink = int(np.floor(qsize / output_size * 0.5))
if shrink > 1:
rsize = (int(np.rint(float(img.size[0]) / shrink)), int(np.rint(float(img.size[1]) / shrink)))
img = img.resize(rsize, Image.ANTIALIAS)
quad /= shrink
qsize /= shrink
# Crop.
border = max(int(np.rint(qsize * 0.1)), 3)
crop = (int(np.floor(min(quad[:, 0]))), int(np.floor(min(quad[:, 1]))), int(np.ceil(max(quad[:, 0]))),
int(np.ceil(max(quad[:, 1]))))
crop = (max(crop[0] - border, 0), max(crop[1] - border, 0), min(crop[2] + border, img.size[0]),
min(crop[3] + border, img.size[1]))
if crop[2] - crop[0] < img.size[0] or crop[3] - crop[1] < img.size[1]:
img = img.crop(crop)
quad -= crop[0:2]
# Pad.
pad = (int(np.floor(min(quad[:, 0]))), int(np.floor(min(quad[:, 1]))), int(np.ceil(max(quad[:, 0]))),
int(np.ceil(max(quad[:, 1]))))
pad = (max(-pad[0] + border, 0), max(-pad[1] + border, 0), max(pad[2] - img.size[0] + border, 0),
max(pad[3] - img.size[1] + border, 0))
if enable_padding and max(pad) > border - 4:
pad = np.maximum(pad, int(np.rint(qsize * 0.3)))
img = np.pad(np.float32(img), ((pad[1], pad[3]), (pad[0], pad[2]), (0, 0)), 'reflect')
h, w, _ = img.shape
y, x, _ = np.ogrid[:h, :w, :1]
mask = np.maximum(1.0 - np.minimum(np.float32(x) / pad[0],
np.float32(w - 1 - x) / pad[2]),
1.0 - np.minimum(np.float32(y) / pad[1],
np.float32(h - 1 - y) / pad[3]))
blur = qsize * 0.02
img += (scipy.ndimage.gaussian_filter(img, [blur, blur, 0]) - img) * np.clip(mask * 3.0 + 1.0, 0.0, 1.0)
img += (np.median(img, axis=(0, 1)) - img) * np.clip(mask, 0.0, 1.0)
img = Image.fromarray(np.uint8(np.clip(np.rint(img), 0, 255)), 'RGB')
quad += pad[:2]
# Transform.
img = img.transform((transform_size, transform_size), Image.QUAD, (quad + 0.5).flatten(), Image.BILINEAR)
return img
class AttrDict(dict):
def __init__(self, *args, **kwargs):
super(AttrDict, self).__init__(*args, **kwargs)
self.__dict__ = self
class Pixel2Style2PixelPredictor:
def __init__(self,
weight_path=None,
model_type=None,
seed=None,
size=1024,
style_dim=512,
n_mlp=8,
channel_multiplier=2):
if weight_path is None and model_type != 'default':
if model_type in model_cfgs.keys():
weight_path = get_path_from_url(model_cfgs[model_type]['model_urls'])
size = model_cfgs[model_type].get('size', size)
style_dim = model_cfgs[model_type].get('style_dim', style_dim)
n_mlp = model_cfgs[model_type].get('n_mlp', n_mlp)
channel_multiplier = model_cfgs[model_type].get('channel_multiplier', channel_multiplier)
checkpoint = paddle.load(weight_path)
else:
raise ValueError('Predictor need a weight path or a pretrained model type')
else:
checkpoint = paddle.load(weight_path)
opts = checkpoint.pop('opts')
opts = AttrDict(opts)
opts['size'] = size
opts['style_dim'] = style_dim
opts['n_mlp'] = n_mlp
opts['channel_multiplier'] = channel_multiplier
self.generator = Pixel2Style2Pixel(opts)
self.generator.set_state_dict(checkpoint)
self.generator.eval()
if seed is not None:
paddle.seed(seed)
random.seed(seed)
np.random.seed(seed)
self.model_type = 'default' if model_type is None else model_type
def run(self, image):
src_img = run_alignment(image)
src_img = np.asarray(src_img)
transformed_image = model_cfgs[self.model_type]['transform'](src_img)
dst_img, latents = self.generator(
paddle.to_tensor(transformed_image[None, ...]), resize=False, return_latents=True)
dst_img = (dst_img * 0.5 + 0.5)[0].numpy() * 255
dst_img = dst_img.transpose((1, 2, 0))
dst_npy = latents[0].numpy()
return dst_img, dst_npy
# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
import copy
import paddle
import paddlehub as hub
from paddlehub.module.module import moduleinfo, runnable, serving
import numpy as np
import cv2
from skimage.io import imread
from skimage.transform import rescale, resize
from .model import Pixel2Style2PixelPredictor
from .util import base64_to_cv2
@moduleinfo(
name="pixel2style2pixel",
type="CV/style_transfer",
author="paddlepaddle",
author_email="",
summary="",
version="1.0.0")
class pixel2style2pixel:
def __init__(self):
self.pretrained_model = os.path.join(self.directory, "pSp-ffhq-inversion.pdparams")
self.network = Pixel2Style2PixelPredictor(weight_path=self.pretrained_model, model_type='ffhq-inversion')
def style_transfer(self,
images=None,
paths=None,
output_dir='./transfer_result/',
use_gpu=False,
visualization=True):
'''
images (list[numpy.ndarray]): data of images, shape of each is [H, W, C], color space must be BGR(read by cv2).
paths (list[str]): paths to images
output_dir: the dir to save the results
use_gpu: if True, use gpu to perform the computation, otherwise cpu.
visualization: if True, save results in output_dir.
'''
results = []
paddle.disable_static()
place = 'gpu:0' if use_gpu else 'cpu'
place = paddle.set_device(place)
        if images is None and paths is None:
            print('No image provided. Please input an image or an image path.')
            return
        if images is not None:
for image in images:
image = image[:, :, ::-1]
out = self.network.run(image)
results.append(out)
        if paths is not None:
for path in paths:
image = cv2.imread(path)[:, :, ::-1]
out = self.network.run(image)
results.append(out)
        if visualization:
if not os.path.exists(output_dir):
os.makedirs(output_dir, exist_ok=True)
for i, out in enumerate(results):
if out is not None:
cv2.imwrite(os.path.join(output_dir, 'output_{}.png'.format(i)), out[0][:, :, ::-1])
np.save(os.path.join(output_dir, 'output_{}.npy'.format(i)), out[1])
return results
@runnable
def run_cmd(self, argvs: list):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options", description="Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
self.args = self.parser.parse_args(argvs)
results = self.style_transfer(
paths=[self.args.input_path],
output_dir=self.args.output_dir,
use_gpu=self.args.use_gpu,
visualization=self.args.visualization)
return results
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
        images_decode = [base64_to_cv2(image) for image in images]
        results = self.style_transfer(images=images_decode, **kwargs)
        # each result is an (image, latent vector) pair of ndarrays; convert
        # both to plain lists so they can be serialized to JSON
        tolist = [(img.tolist(), latent.tolist()) for img, latent in results]
        return tolist
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument('--use_gpu', action='store_true', help="use GPU or not")
self.arg_config_group.add_argument(
'--output_dir', type=str, default='transfer_result', help='output directory for saving result.')
self.arg_config_group.add_argument('--visualization', type=bool, default=False, help='save results or not.')
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument('--input_path', type=str, help="path to input image.")
import base64

import cv2
import numpy as np


def base64_to_cv2(b64str):
    # decode a base64 string back into a BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# stargan_celeba
|Module Name|stargan_celeba|
| :--- | :---: |
|Category|Image generation|
|Network|StarGAN|
|Dataset|Celeba|
|Fine-tuning supported or not|No|
|Module Size |33MB|
|Latest update date|2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/137855887-f0abca76-2735-4275-b7ad-242decf31bb3.PNG" width=600><br/>
The image attributes are: original image, Black_Hair, Blond_Hair, Brown_Hair, Male, Aged<br/>
</p>
- ### Module Introduction
- StarGAN performs multi-domain image-to-image translation with a single generator that is conditioned on the target domain label. The PaddleHub Module is trained on the Celeba dataset and currently supports the attributes "Black_Hair", "Blond_Hair", "Brown_Hair", "Female", "Male", "Aged".
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.5.2
- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install stargan_celeba==1.0.0
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run stargan_celeba --image "/PATH/TO/IMAGE" --style "target_attribute"
```
- **Parameters**
- image: image path
- style: Specify the attributes to be converted. The options are "Black_Hair", "Blond_Hair", "Brown_Hair", "Female", "Male", "Aged". You can choose one of the options.
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
stargan = hub.Module(name="stargan_celeba")
test_img_path = ["/PATH/TO/IMAGE"]
trans_attr = ["Blond_Hair"]
# set input dict
input_dict = {"image": test_img_path, "style": trans_attr}
# execute predict and print the result
results = stargan.generate(data=input_dict)
print(results)
```
- ### 3、API
- ```python
def generate(data)
```
- Style transfer API.
- **Parameter**
- data (list\[dict\]): Each element in the list is a dict with the following fields:
- image (list\[str\]): Each element in the list is the path of the image to be converted.
- style (list\[str\]): Each element in the list is a string, fill in the face attributes to be converted.
- **Return**
- res (list\[str\]): Save path of the result.
## IV. Release Note
- 1.0.0
First release
# stgan_celeba
|Module Name|stgan_celeba|
| :--- | :---: |
|Category|image generation|
|Network|STGAN|
|Dataset|Celeba|
|Fine-tuning supported or not|No|
|Module Size |287MB|
|Latest update date|2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/137856070-2a43facd-cda0-473f-8935-e61f5dd583d8.JPG" width=1200><br/>
The image attributes are: original image, Bald, Bangs, Black_Hair, Blond_Hair, Brown_Hair, Bushy_Eyebrows, Eyeglasses, Gender, Mouth_Slightly_Open, Mustache, No_Beard, Pale_Skin, Aged<br/>
</p>
- ### Module Introduction
- STGAN takes the original attributes and the target attributes as input and proposes STUs (Selective Transfer Units) to select and modify features of the encoder. This PaddleHub Module is trained on the CelebA dataset and currently supports the attributes "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", and "Aged".
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.5.2
- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_en/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install stgan_celeba==1.0.0
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run stgan_celeba --image "/PATH/TO/IMAGE" --info "original_attributes" --style "target_attribute"
```
- **Parameters**
- image: Image path
- info: Attributes of the original image. The gender ("Male" or "Female") must be filled in; the other options are "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", and "Aged". For example, if the input picture is a girl with black hair, fill in "Female,Black_Hair". A complete invocation is sketched below.
- style: Specify the attributes to be converted. The options are "Bald", "Bangs", "Black_Hair", "Blond_Hair", "Brown_Hair", "Bushy_Eyebrows", "Eyeglasses", "Gender", "Mouth_Slightly_Open", "Mustache", "No_Beard", "Pale_Skin", "Aged". You can choose one of the options.
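- For example, a complete invocation combining `--info` and `--style` (the image path is illustrative) could be:
- ```shell
$ hub run stgan_celeba --image "/PATH/TO/IMAGE" --info "Female,Black_Hair" --style "Bangs"
```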
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
stgan = hub.Module(name="stgan_celeba")
test_img_path = ["/PATH/TO/IMAGE"]
org_info = ["Female,Black_Hair"]
trans_attr = ["Bangs"]
# set input dict
input_dict = {"image": test_img_path, "style": trans_attr, "info": org_info}
# execute predict and print the result
results = stgan.generate(data=input_dict)
print(results)
```
- ### 3、API
- ```python
def generate(data)
```
- Style transfer API.
- **Parameter**
- data (list\[dict\]): Each element in the list is a dict with the following fields:
- image (list\[str\]): Each element in the list is the path of the image to be converted.
- style (list\[str\]): Each element in the list is a string, fill in the face attributes to be converted.
- info (list\[str\]): Represents the face attributes of the original image. Different attributes are separated by commas.
- **Return**
- res (list\[str\]): Save path of the result.
## IV. Release Note
- 1.0.0
First release
# ID_Photo_GEN
|Module Name |ID_Photo_GEN|
| :--- | :---: |
|Category|Image generation|
|Network|HRNet_W18|
|Dataset |-|
|Fine-tuning supported or not |No|
|Module Size|28KB|
|Latest update date|2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://img-blog.csdnimg.cn/20201224163307901.jpg" >
</p>
- ### Module Introduction
- This model is based on face_landmark_localization and FCN_HRNet_W18_Face_Seg. It can generate ID photos with white, red, and blue backgrounds.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install ID_Photo_GEN
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import cv2
import paddlehub as hub
model = hub.Module(name='ID_Photo_GEN')
result = model.Photo_GEN(
images=[cv2.imread('/PATH/TO/IMAGE')],
paths=None,
batch_size=1,
output_dir='output',
visualization=True,
use_gpu=False)
```
- ### 2、API
- ```python
def Photo_GEN(
images=None,
paths=None,
batch_size=1,
output_dir='output',
visualization=False,
use_gpu=False):
```
- Prediction API, generating ID photos.
- **Parameter**
* images (list[np.ndarray]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list[str]): Image path
* batch_size (int): Batch size
* output_dir (str): Save path of images, output by default.
* visualization (bool): Whether to save the recognition results as picture files.
* use_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
**NOTE:** Choose one of `paths` and `images` to provide input data.
- **Return**
* results (list[dict{"write":np.ndarray,"blue":np.ndarray,"red":np.ndarray}]): The list of generation results.
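- A minimal usage sketch that saves each generated background (the dict keys follow the documented return value above; output file names are illustrative):
- ```python
import cv2
import paddlehub as hub

model = hub.Module(name='ID_Photo_GEN')
results = model.Photo_GEN(images=[cv2.imread('/PATH/TO/IMAGE')])

# Save every background variant returned for each input image.
for i, res in enumerate(results):
    for key in ('write', 'blue', 'red'):
        cv2.imwrite('id_photo_{}_{}.png'.format(i, key), res[key])
```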
## IV. Release Note
- 1.0.0
First release
\ No newline at end of file
# UGATIT_83w
|Module Name|UGATIT_83w|
| :--- | :---: |
|Category|Image editing|
|Network |U-GAT-IT|
|Dataset|selfie2anime|
|Fine-tuning supported or not|No|
|Module Size|41MB|
|Latest update date |2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/136651638-33cac040-edad-41ac-a9ce-7c0e678d8c52.jpg" width = "400" height = "400" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/136651644-dd1d3836-99b3-40f0-8543-37de18f9cfd9.jpg" width = "400" height = "400" hspace='10'/>
</p>
- ### Module Introduction
- U-GAT-IT can transfer an input face image into anime style.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.8.2
- paddlehub >= 1.8.0
- ### 2、Installation
- ```shell
$ hub install UGATIT_83w
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import cv2
import paddlehub as hub
model = hub.Module(name='UGATIT_83w', use_gpu=False)
result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = model.style_transfer(paths=['/PATH/TO/IMAGE'])
```
- ### 2、API
- ```python
def style_transfer(
self,
images=None,
paths=None,
batch_size=1,
output_dir='output',
visualization=False
)
```
- Style transfer API, convert the input face image into anime style.
- **Parameters**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): Image paths, default is None;
* batch\_size (int): Batch size, default is 1;
* visualization (bool): Whether to save the recognition results as picture files, default is False.
* output\_dir (str): Save path of images, `output` by default.
**NOTE:** Choose one of `paths` and `images` to provide data.
- **Return**
- res (list\[numpy.ndarray\]): Result, ndarray.shape is in the format [H, W, C].
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of Style transfer task.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m UGATIT_83w
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/UGATIT_83w"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction results
print(r.json()["results"])
```
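- The `serving_method` pattern shown earlier in this document returns each result via `.tolist()`; if this module follows it, the JSON payload can be restored to an image file as sketched here (the RGB channel order is an assumption):
- ```python
import cv2
import numpy as np

# Continues from the request example above: `r` is the HTTP response.
# Assumes each result is a nested list of pixel values in RGB order.
img = np.array(r.json()["results"][0], dtype=np.uint8)
cv2.imwrite('ugatit_result.jpg', img[:, :, ::-1])  # RGB -> BGR for OpenCV
```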
## V. Release Note
- 1.0.0
First release
\ No newline at end of file
# UGATIT_92w
|Module Name|UGATIT_92w|
| :--- | :---: |
|Category|Image editing|
|Network |U-GAT-IT|
|Dataset|selfie2anime|
|Fine-tuning supported or not|No|
|Module Size|41MB|
|Latest update date |2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/136651638-33cac040-edad-41ac-a9ce-7c0e678d8c52.jpg" width = "400" height = "400" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/136653047-f00c30fb-521f-486f-8247-8d8f63649473.jpg" width = "400" height = "400" hspace='10'/>
</p>
- ### Module Introduction
- U-GAT-IT can transfer an input face image into anime style.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.8.2
- paddlehub >= 1.8.0
- ### 2、Installation
- ```shell
$ hub install UGATIT_92w
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import cv2
import paddlehub as hub
model = hub.Module(name='UGATIT_92w', use_gpu=False)
result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = model.style_transfer(paths=['/PATH/TO/IMAGE'])
```
- ### 2、API
- ```python
def style_transfer(
self,
images=None,
paths=None,
batch_size=1,
output_dir='output',
visualization=False
)
```
- Style transfer API, convert the input face image into anime style.
- **Parameters**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): Image paths, default is None;
* batch\_size (int): Batch size, default is 1;
* visualization (bool): Whether to save the recognition results as picture files, default is False.
* output\_dir (str): Save path of images, `output` by default.
**NOTE:** Choose one of `paths` and `images` to provide input data.
- **Return**
- res (list\[numpy.ndarray\]): Style transfer results, ndarray.shape is in the format [H, W, C].
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of Style transfer task.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m UGATIT_92w
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/UGATIT_92w"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction results
print(r.json()["results"])
```
## V. Release Note
- 1.0.0
First release
\ No newline at end of file
# animegan_v2_paprika_54
|Module Name |animegan_v2_paprika_54|
| :--- | :---: |
|Category |Image generation|
|Network|AnimeGAN|
|Dataset|Paprika|
|Fine-tuning supported or not|No|
|Module Size|9.4MB|
|Latest update date|2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/bd002c4bb6a7427daf26988770bb18648b7d8d2bfd6746bfb9a429db4867727f" width = "450" height = "300" hspace='10'/>
<br />
Input image
<br />
<img src="https://ai-studio-static-online.cdn.bcebos.com/6574669d87b24bab9627c6e33896528b4a0bf5af1cd84ca29655d68719f2d551" width = "450" height = "300" hspace='10'/>
<br />
Output image
<br />
</p>
- ### Module Introduction
- AnimeGAN V2 image style transfer model. It can convert an input image into the anime style of the film Paprika. The model weights are converted from the [AnimeGAN V2 official repo](https://github.com/TachibanaYoshino/AnimeGAN).
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.8.0
- paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_en/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install animegan_v2_paprika_54
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
model = hub.Module(name="animegan_v2_paprika_54")
result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = model.style_transfer(paths=['/PATH/TO/IMAGE'])
```
- ### 2、API
- ```python
def style_transfer(images=None,
paths=None,
output_dir='output',
visualization=False,
min_size=32,
max_size=1024)
```
- Style transfer API.
- **Parameters**
- images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
- paths (list\[str\]): Image path.
- output\_dir (str): Save path of images, `output` by default.
- visualization (bool): Whether to save the results as picture files.
- min\_size (int): Minimum size, default is 32.
- max\_size (int): Maximum size, default is 1024.
**NOTE:** Choose one of `paths` and `images` to provide input data.
- **Return**
- res (list\[numpy.ndarray\]): The list of style transfer results, ndarray.shape is in the format [H, W, C].
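- A minimal sketch passing the size bounds explicitly (the values shown are the documented defaults; the image path is illustrative):
- ```python
import cv2
import paddlehub as hub

model = hub.Module(name="animegan_v2_paprika_54")
# min_size/max_size bound the working resolution before stylization;
# visualization=True additionally writes the result under `output`.
result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')],
                              visualization=True,
                              min_size=32,
                              max_size=1024)
```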
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of style transfer.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m animegan_v2_paprika_54
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/animegan_v2_paprika_54"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction results
print(r.json()["results"])
```
## V. Release Note
- 1.0.0
First release.
- 1.0.1
Support PaddleHub 2.0.
- 1.0.2
Remove the `batch_size` parameter.
# animegan_v2_paprika_97
|Module Name |animegan_v2_paprika_97|
| :--- | :---: |
|Category |Image generation|
|Network|AnimeGAN|
|Dataset|Paprika|
|Fine-tuning supported or not|No|
|Module Size|9.7MB|
|Latest update date|2021-07-30|
|Data indicators|-|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/136652269-48b8c902-3a2b-46b7-a9f2-d500097bbb0e.jpg" width = "450" height = "300" hspace='10'/>
<br />
Input image
<br />
<img src="https://user-images.githubusercontent.com/35907364/136652280-7e9ebfd2-8a45-4b5b-b3ac-f107770525c4.jpg" width = "450" height = "300" hspace='10'/>
<br />
Output image
<br />
</p>
- ### Module Introduction
- AnimeGAN V2 image style transfer model. It can convert an input image into the anime style of the film Paprika. The model weights are converted from the [AnimeGAN V2 official repo](https://github.com/TachibanaYoshino/AnimeGAN).
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.8.0
- paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_en/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install animegan_v2_paprika_97
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
model = hub.Module(name="animegan_v2_paprika_97")
result = model.style_transfer(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = model.style_transfer(paths=['/PATH/TO/IMAGE'])
```
- ### 2、API
- ```python
def style_transfer(images=None,
paths=None,
output_dir='output',
visualization=False,
min_size=32,
max_size=1024)
```
- Style transfer API.
- **Parameters**
- images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
- paths (list\[str\]): Image path.
- output\_dir (str): Save path of images, `output` by default.
- visualization (bool): Whether to save the results as picture files.
- min\_size (int): Minimum size, default is 32.
- max\_size (int): Maximum size, default is 1024.
**NOTE:** Choose one of `paths` and `images` to provide input data.
- **Return**
- res (list\[numpy.ndarray\]): The list of style transfer results, ndarray.shape is in the format [H, W, C].
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of style transfer.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m animegan_v2_paprika_97
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/animegan_v2_paprika_97"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction results
print(r.json()["results"])
```
## V. Release Note
- 1.0.0
First release.
- 1.0.1
Support PaddleHub 2.0.
- 1.0.2
Remove the `batch_size` parameter.
......@@ -50,13 +50,14 @@ $ hub run msgnet --input_path "/PATH/TO/ORIGIN/IMAGE" --style_path "/PATH/TO/STY
- ### 2. Prediction Code Example
```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='msgnet')
result = model.predict(origin=["venice-boat.jpg"], style="candy.jpg", visualization=True, save_path ='style_tranfer')
result = model.predict(origin=["/PATH/TO/ORIGIN/IMAGE"], style="/PATH/TO/STYLE/IMAGE", visualization=True, save_path ="/PATH/TO/SAVE/IMAGE")
```
......@@ -86,7 +87,7 @@ if __name__ == '__main__':
- `transforms`: Data preprocessing methods.
- `mode`: Select the data mode; the options are `train` and `test`, default is `train`.
- The dataset preparation code can be found in [minicoco.py](../../paddlehub/datasets/flowers.py). `hub.datasets.MiniCOCO()` automatically downloads the dataset and decompresses it to the `$HOME/.paddlehub/dataset` directory under the user directory.
- The dataset preparation code can be found in [minicoco.py](../../paddlehub/datasets/minicoco.py). `hub.datasets.MiniCOCO()` automatically downloads the dataset and decompresses it to the `$HOME/.paddlehub/dataset` directory under the user directory.
- Step3: Load the pre-trained model
......@@ -117,7 +118,7 @@ if __name__ == '__main__':
if __name__ == '__main__':
model = hub.Module(name='msgnet', load_checkpoint="/PATH/TO/CHECKPOINT")
result = model.predict(origin=["venice-boat.jpg"], style="candy.jpg", visualization=True, save_path ='style_tranfer')
result = model.predict(origin=["/PATH/TO/ORIGIN/IMAGE"], style="/PATH/TO/STYLE/IMAGE", visualization=True, save_path ="/PATH/TO/SAVE/IMAGE")
```
- After the parameters are configured correctly, run the script `python predict.py`. For details on model loading, see [load](https://www.paddlepaddle.org.cn/documentation/docs/zh/2.0-rc/api/paddle/framework/io/load_cn.html#load).
......
# msgnet
|Module Name|msgnet|
| :--- | :---: |
|Category|Image editing|
|Network|msgnet|
|Dataset|COCO2014|
|Fine-tuning supported or not|Yes|
|Module Size|68MB|
|Data indicators|-|
|Latest update date|2021-07-29|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130910325-d72f34b2-d567-4e77-bb60-35148864301e.jpg" width = "450" height = "300" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130910195-9433e4a7-3596-4677-85d2-2ffc16939597.png" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- msgnet is a style transfer model. We will show how to use PaddleHub to fine-tune the pre-trained model and complete prediction.
- For more information, please refer to [msgnet](https://github.com/zhanghang1989/PyTorch-Multi-Style-Transfer)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install msgnet
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```
$ hub run msgnet --input_path "/PATH/TO/ORIGIN/IMAGE" --style_path "/PATH/TO/STYLE/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='msgnet')
result = model.predict(origin=["/PATH/TO/ORIGIN/IMAGE"], style="/PATH/TO/STYLE/IMAGE", visualization=True, save_path ="/PATH/TO/SAVE/IMAGE")
```
- ### 3.Fine-tune and Encapsulation
- After completing the installation of PaddlePaddle and PaddleHub, you can start using the msgnet model to fine-tune datasets such as [MiniCOCO](../../docs/reference/datasets.md#class-hubdatasetsMiniCOCO) by executing `python train.py`; a combined `train.py` sketch is given after the steps below.
- Steps:
- Step1: Define the data preprocessing method
- ```python
import paddlehub.vision.transforms as T
transform = T.Compose([T.Resize((256, 256), interpolation='LINEAR')])
```
- `transforms`: The data augmentation module defines lots of data preprocessing methods. Users can replace the data preprocessing methods according to their needs.
- Step2: Download the dataset
- ```python
from paddlehub.datasets.minicoco import MiniCOCO
styledata = MiniCOCO(transform=transform, mode='train')
```
* `transforms`: data preprocessing methods.
* `mode`: Select the data mode, the options are `train`, `test`, `val`. Default is `train`.
- The dataset preparation code can be found in [minicoco.py](../../paddlehub/datasets/minicoco.py). `hub.datasets.MiniCOCO()` automatically downloads the dataset and decompresses it to the `$HOME/.paddlehub/dataset` directory under the user directory.
- Step3: Load the pre-trained model
- ```python
model = hub.Module(name='msgnet', load_checkpoint=None)
```
* `name`: model name.
* `load_checkpoint`: Path of a self-trained model checkpoint; if it is None, the provided pre-trained parameters are loaded.
- Step4: Optimization strategy
- ```python
optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='test_style_ckpt')
trainer.train(styledata, epochs=101, batch_size=4, eval_dataset=styledata, log_interval=10, save_interval=10)
```
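- Putting Steps 1–4 together, a minimal `train.py` sketch could look as follows (the `Trainer` import path is an assumption; hyperparameters mirror the snippets above):
- ```python
# train.py - a sketch assembled from Steps 1-4 above.
import paddle
import paddlehub as hub
import paddlehub.vision.transforms as T
from paddlehub.finetune.trainer import Trainer  # assumed import path
from paddlehub.datasets.minicoco import MiniCOCO

if __name__ == '__main__':
    transform = T.Compose([T.Resize((256, 256), interpolation='LINEAR')])
    styledata = MiniCOCO(transform=transform, mode='train')
    model = hub.Module(name='msgnet', load_checkpoint=None)
    optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
    trainer = Trainer(model, optimizer, checkpoint_dir='test_style_ckpt')
    trainer.train(styledata, epochs=101, batch_size=4,
                  eval_dataset=styledata, log_interval=10, save_interval=10)
```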
- Model prediction
- When fine-tuning is completed, the model with the best performance on the validation set will be saved in the `${CHECKPOINT_DIR}/best_model` directory. We use this model to make predictions. The `predict.py` script is as follows:
- ```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='msgnet', load_checkpoint="/PATH/TO/CHECKPOINT")
result = model.predict(origin=["/PATH/TO/ORIGIN/IMAGE"], style="/PATH/TO/STYLE/IMAGE", visualization=True, save_path ="/PATH/TO/SAVE/IMAGE")
```
- **Parameters**
* `origin`: Image path or ndarray data with format [H, W, C], BGR.
* `style`: Style image path.
* `visualization`: Whether to save the recognition results as picture files.
* `save_path`: Save path of the result, default is 'style_tranfer'.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of style transfer.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m msgnet
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result:
- ```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/ORIGIN/IMAGE')
style_im = cv2.imread('/PATH/TO/STYLE/IMAGE')
data = {'images':[[cv2_to_base64(org_im)], cv2_to_base64(style_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/msgnet"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
data = base64_to_cv2(r.json()["results"]['data'][0])
cv2.imwrite('style.png', data)
```
## V. Release Note
- 1.0.0
First release
......@@ -44,7 +44,7 @@
hub run resnet50_vd_animals --input_path "/PATH/TO/IMAGE"
```
- ### 2、Code Example
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
......
# resnet50_vd_animals
|Module Name|resnet50_vd_animals|
| :--- | :---: |
|Category |Image classification|
|Network|ResNet50_vd|
|Dataset|Baidu self-built dataset|
|Fine-tuning supported or not|No|
|Module Size|154MB|
|Latest update date|2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Module Introduction
- ResNet-vd is a variant of ResNet, which can be used for image classification and feature extraction. This module is trained on a Baidu self-built animal dataset and supports the classification and recognition of 7,978 animal categories.
- For more information, please refer to [ResNet-vd](https://arxiv.org/pdf/1812.01187.pdf)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install resnet50_vd_animals
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run resnet50_vd_animals --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
classifier = hub.Module(name="resnet50_vd_animals")
result = classifier.classification(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = classifier.classification(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
def get_expected_image_width()
```
- Returns the preprocessed image width, which is 224.
- ```python
def get_expected_image_height()
```
- Returns the preprocessed image height, which is 224.
- ```python
def get_pretrained_images_mean()
```
- Returns the mean value of the preprocessed image, which is \[0.485, 0.456, 0.406\].
- ```python
def get_pretrained_images_std()
```
- Returns the standard deviation of the preprocessed image, which is \[0.229, 0.224, 0.225\].
- ```python
def classification(images=None,
paths=None,
batch_size=1,
use_gpu=False,
top_k=1):
```
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR;
* paths (list\[str\]): Image paths;
* batch\_size (int): Batch size;
* use\_gpu (bool): Use GPU or not; **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* top\_k (int): Return the top k prediction results.
- **Return**
- res (list\[dict\]): The list of classification results; key is the prediction label, value is the corresponding confidence.
- ```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
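- A minimal export sketch (the target directory name is illustrative):
- ```python
import paddlehub as hub

classifier = hub.Module(name="resnet50_vd_animals")
# Export the inference model to ./animals_inference for deployment.
classifier.save_inference_model(dirname="animals_inference", combined=True)
```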
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of animal classification.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m resnet50_vd_animals
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/resnet50_vd_animals"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction results
print(r.json()["results"])
```
## V. Release Note
- 1.0.0
First release
......@@ -50,7 +50,7 @@
if __name__ == '__main__':
model = hub.Module(name='resnet50_vd_imagenet_ssld')
result = model.predict(['flower.jpg'])
result = model.predict(['/PATH/TO/IMAGE'])
```
- ### 3.如何开始Fine-tune
......@@ -134,7 +134,7 @@
if __name__ == '__main__':
model = hub.Module(name='resnet50_vd_imagenet_ssld', label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"], load_checkpoint='/PATH/TO/CHECKPOINT')
result = model.predict(['flower.jpg'])
result = model.predict(['/PATH/TO/IMAGE'])
```
......
# resnet50_vd_imagenet_ssld
|Module Name|resnet50_vd_imagenet_ssld|
| :--- | :---: |
|Category |Image classification|
|Network|ResNet_vd|
|Dataset|ImageNet-2012|
|Fine-tuning supported or not|Yes|
|Module Size|148MB|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Module Introduction
- ResNet-vd is a variant of ResNet, which can be used for image classification and feature extraction.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install resnet50_vd_imagenet_ssld
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
```shell
$ hub run resnet50_vd_imagenet_ssld --input_path "/PATH/TO/IMAGE" --top_k 5
```
- ### 2、Prediction Code Example
```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='resnet50_vd_imagenet_ssld')
result = model.predict(['/PATH/TO/IMAGE'])
```
- ### 3.Fine-tune and Encapsulation
- After completing the installation of PaddlePaddle and PaddleHub, you can start using the resnet50_vd_imagenet_ssld model to fine-tune datasets such as [Flowers](../../docs/reference/datasets.md#class-hubdatasetsflowers) by executing `python train.py`; a combined `train.py` sketch is given after the steps below.
- Steps:
- Step1: Define the data preprocessing method
- ```python
import paddlehub.vision.transforms as T
transforms = T.Compose([T.Resize((256, 256)),
T.CenterCrop(224),
T.Normalize(mean=[0.485, 0.456, 0.406], std = [0.229, 0.224, 0.225])],
to_rgb=True)
```
- `transforms`: The data augmentation module defines lots of data preprocessing methods. Users can replace the data preprocessing methods according to their needs.
- Step2: Download the dataset
- ```python
from paddlehub.datasets import Flowers
flowers = Flowers(transforms)
flowers_validate = Flowers(transforms, mode='val')
```
* `transforms`: data preprocessing methods.
* `mode`: Select the data mode, the options are `train`, `test`, `val`. Default is `train`.
* `hub.datasets.Flowers()` automatically downloads the dataset and decompresses it to the `$HOME/.paddlehub/dataset` directory under the user directory.
- Step3: Load the pre-trained model
- ```python
model = hub.Module(name="resnet50_vd_imagenet_ssld", label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"])
```
* `name`: model name.
* `label_list`: set the output classification category. Default is Imagenet2012 category.
- Step4: Optimization strategy
```python
optimizer = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())
trainer = Trainer(model, optimizer, checkpoint_dir='img_classification_ckpt')
trainer.train(flowers, epochs=100, batch_size=32, eval_dataset=flowers_validate, save_interval=1)
```
- Run configuration
- `Trainer` mainly controls the fine-tuning process, including the following controllable parameters:
* `model`: Optimized model.
* `optimizer`: Optimizer selection.
* `use_vdl`: Whether to use vdl to visualize the training process.
* `checkpoint_dir`: The storage address of the model parameters.
* `compare_metrics`: The measurement index of the optimal model.
- `trainer.train` mainly controls the specific training process, including the following controllable parameters:
* `train_dataset`: Training dataset.
* `epochs`: Epochs of training process.
* `batch_size`: Batch size.
* `num_workers`: Number of workers.
* `eval_dataset`: Validation dataset.
* `log_interval`: The interval for printing logs.
* `save_interval`: The interval for saving model parameters.
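- Putting Steps 1–4 together, a minimal `train.py` sketch could look as follows (the `Trainer` import path is an assumption; hyperparameters mirror the snippets above):
- ```python
# train.py - a sketch assembled from Steps 1-4 above.
import paddle
import paddlehub as hub
import paddlehub.vision.transforms as T
from paddlehub.finetune.trainer import Trainer  # assumed import path
from paddlehub.datasets import Flowers

if __name__ == '__main__':
    transforms = T.Compose([T.Resize((256, 256)),
                            T.CenterCrop(224),
                            T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])],
                           to_rgb=True)
    flowers = Flowers(transforms)
    flowers_validate = Flowers(transforms, mode='val')
    model = hub.Module(name="resnet50_vd_imagenet_ssld",
                       label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"])
    optimizer = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())
    trainer = Trainer(model, optimizer, checkpoint_dir='img_classification_ckpt')
    trainer.train(flowers, epochs=100, batch_size=32,
                  eval_dataset=flowers_validate, save_interval=1)
```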
- Model prediction
- When fine-tuning is completed, the model with the best performance on the validation set will be saved in the `${CHECKPOINT_DIR}/best_model` directory. We use this model to make predictions. The `predict.py` script is as follows:
- ```python
import paddle
import paddlehub as hub
if __name__ == '__main__':
model = hub.Module(name='resnet50_vd_imagenet_ssld', label_list=["roses", "tulips", "daisy", "sunflowers", "dandelion"], load_checkpoint='/PATH/TO/CHECKPOINT')
result = model.predict(['/PATH/TO/IMAGE'])
```
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of classification.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m resnet50_vd_imagenet_ssld
```
- The serving API is now deployed, with the default port number 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)], 'top_k':2}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/resnet50_vd_imagenet_ssld"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
data = r.json()["results"]['data']
```
## V. Release Note
* 1.0.0
First release
* 1.1.0
Upgrade to dynamic version
# resnet_v2_50_imagenet
|Module Name|resnet_v2_50_imagenet|
| :--- | :---: |
|Category |Image classification|
|Network|ResNet V2|
|Dataset|ImageNet-2012|
|Fine-tuning supported or not|No|
|Module Size|99MB|
|Latest update date|2021-02-26|
|Data indicators|-|
## I. Basic Information
- ### Module Introduction
- This module utilizes the ResNet50 v2 structure and is trained on ImageNet-2012.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 1.4.0
- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_en/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install resnet_v2_50_imagenet
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run resnet_v2_50_imagenet --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
classifier = hub.Module(name="resnet_v2_50_imagenet")
test_img_path = "/PATH/TO/IMAGE"
input_dict = {"image": [test_img_path]}
result = classifier.classification(data=input_dict)
```
- ### 3、API
- ```python
def classification(data)
```
- Prediction API for classification.
- **Parameter**
- data (dict): Key is 'image', value is the list of image paths.
- **Return**
- result (list\[dict\]): The list of classification results; key is the prediction label, value is the corresponding confidence.
## IV. Release Note
- 1.0.0
First release
- 1.0.1
Fix the encoding problem in Python 2
- ```shell
$ hub install resnet_v2_50_imagenet==1.0.1
```
......@@ -44,7 +44,7 @@
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
import cv2
......@@ -60,26 +60,26 @@
visualization=False)
```
- ### 2、API
- ### 2、API
```python
def Segmentation(
```python
def Segmentation(
images=None,
paths=None,
batch_size=1,
output_dir='output',
visualization=False):
```
- Portrait segmentation API.
```
- Portrait segmentation API.
- **Parameters**
- **Parameters**
* images (list[np.ndarray]): List of input image data (BGR).
* paths (list[str]): List of input image paths.
* batch_size (int): Batch size.
* output_dir (str): Output directory for visualized images.
* visualization (bool): Whether to visualize the results.
- **Return**
- **Return**
* results (list[dict{"mask":np.ndarray,"result":np.ndarray}]): List of output image data.
## IV. Release Note
......
# ExtremeC3_Portrait_Segmentation
|Module Name|ExtremeC3_Portrait_Segmentation|
| :--- | :---: |
|Category|Image segmentation|
|Network |ExtremeC3|
|Dataset|EG1800, Baidu fashion dataset|
|Fine-tuning supported or not|No|
|Module Size|0.038MB|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/1261398a98e24184852bdaff5a4e1dbd7739430f59fb47e8b84e3a2cfb976107" hspace='10'/> <br />
</p>
- ### Module Introduction
* ExtremeC3_Portrait_Segmentation is a lightweight module based on ExtremeC3 for portrait segmentation.
* For more information, please refer to: [ExtremeC3_Portrait_Segmentation](https://github.com/clovaai/ext_portrait_segmentation).
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install ExtremeC3_Portrait_Segmentation
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
```python
import cv2
import paddlehub as hub
model = hub.Module(name='ExtremeC3_Portrait_Segmentation')
result = model.Segmentation(
images=[cv2.imread('/PATH/TO/IMAGE')],
paths=None,
batch_size=1,
output_dir='output',
visualization=False)
```
- ### 2、API
```python
def Segmentation(
images=None,
paths=None,
batch_size=1,
output_dir='output',
visualization=False):
```
- Prediction API, used for portrait segmentation.
- **Parameter**
* images (list[np.ndarray]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list[str]): Image paths.
* batch_size (int): Batch size.
* output_dir (str): Save path of images, 'output' by default.
* visualization (bool): Whether to save the segmentation results as picture files.
- **Return**
* results (list[dict{"mask":np.ndarray,"result":np.ndarray}]): List of recognition results.
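- A minimal sketch saving the returned mask and composited result (key names follow the documented return dict above; file names are illustrative):
- ```python
import cv2
import paddlehub as hub

model = hub.Module(name='ExtremeC3_Portrait_Segmentation')
results = model.Segmentation(images=[cv2.imread('/PATH/TO/IMAGE')])
cv2.imwrite('portrait_mask.png', results[0]['mask'])
cv2.imwrite('portrait_result.png', results[0]['result'])
```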
## IV. Release Note
- 1.0.0
First release
# Pneumonia_CT_LKM_PP
|Module Name|Pneumonia_CT_LKM_PP|
| :--- | :---: |
|Category|Image segmentation|
|Network |-|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size|35M|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Module Introduction
- The pneumonia CT analysis model (Pneumonia-CT-LKM-PP) can efficiently detect lesions and outline them in the patient's CT images. Through post-processing code, the number, volume, and location of lung lesions can be analyzed. The model has been fully trained on both high-resolution and low-resolution CT image data, so it can adapt to examination data collected by CT imaging equipment of different grades.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install Pneumonia_CT_LKM_PP==1.0.0
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
```python
import paddlehub as hub
pneumonia = hub.Module(name="Pneumonia_CT_LKM_PP")
input_only_lesion_np_path = "/PATH/TO/ONLY_LESION_NP"
input_both_lesion_np_path = "/PATH/TO/LESION_NP"
input_both_lung_np_path = "/PATH/TO/LUNG_NP"
# set input dict
input_dict = {"image_np_path": [
[input_only_lesion_np_path],
[input_both_lesion_np_path, input_both_lung_np_path],
]}
# execute predict and print the result
results = pneumonia.segmentation(data=input_dict)
for result in results:
print(result)
```
- ### 2、API
- ```python
def segmentation(data)
```
- Prediction API, used for CT analysis of pneumonia.
- **Parameter**
* data (dict): Key is "image_np_path", value is a list of inputs; each element is a list of CT numpy paths, containing either the lesion data alone or the lesion and lung data together (see the example above).
- **Return**
* result (list\[dict\]): The list of recognition results, where each element is a dict and each field is:
* input_lesion_np_path: Input path of the lesion.
* output_lesion_np: Segmentation result path of the lesion.
* input_lung_np_path: Input path of the lung.
* output_lung_np: Segmentation result path of the lung.
## IV. Release Note
* 1.0.0
First release
# Pneumonia_CT_LKM_PP_lung
|Module Name|Pneumonia_CT_LKM_PP_lung|
| :--- | :---: |
|Category|Image segmentation|
|Network |-|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size|35M|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Module Introduction
- The pneumonia CT analysis model (Pneumonia-CT-LKM-PP) can efficiently detect lesions and outline them in the patient's CT images. Through post-processing code, the number, volume, and location of lung lesions can be analyzed. The model has been fully trained on both high-resolution and low-resolution CT image data, so it can adapt to examination data collected by CT imaging equipment of different grades. (This module is a submodule of Pneumonia_CT_LKM_PP.)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install Pneumonia_CT_LKM_PP_lung==1.0.0
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
```python
import paddlehub as hub
pneumonia = hub.Module(name="Pneumonia_CT_LKM_PP_lung")
input_only_lesion_np_path = "/PATH/TO/ONLY_LESION_NP"
input_both_lesion_np_path = "/PATH/TO/LESION_NP"
input_both_lung_np_path = "/PATH/TO/LUNG_NP"
# set input dict
input_dict = {"image_np_path": [
[input_only_lesion_np_path],
[input_both_lesion_np_path, input_both_lung_np_path],
]}
# execute predict and print the result
results = pneumonia.segmentation(data=input_dict)
for result in results:
print(result)
```
- ### 2、API
- ```python
def segmentation(data)
```
- Prediction API, used for CT analysis of pneumonia.
- **Parameter**
* data (dict): Key is "image_np_path", value is a list of inputs; each element is a list of CT numpy paths, containing either the lesion data alone or the lesion and lung data together (see the example above).
- **Return**
* result (list\[dict\]): The list of recognition results, where each element is dict and each field is:
* input_lesion_np_path: Input path of lesion.
* output_lesion_np: Segmentation result path of lesion.
* input_lung_np_path: Input path of lung.
* output_lung_np: Segmentation result path of lung.
## IV. Release Note
* 1.0.0
First release
......@@ -43,7 +43,7 @@
| [Linux_Quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
import cv2
......
# U2Net
|Module Name |U2Net|
| :--- | :---: |
|Category |Image segmentation|
|Network |U^2Net|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size |254MB|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/4d77bc3a05cf48bba6f67b797978f4cdf10f38288b9645d59393dd85cef58eff" width = "450" height = "300" hspace='10'/> <img src="https://ai-studio-static-online.cdn.bcebos.com/11c9eba8de6d4316b672f10b285245061821f0a744e441f3b80c223881256ca0" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Network architecture:
<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/999d37b4ffdd49dc9e3315b7cec7b2c6918fdd57c8594ced9dded758a497913d" hspace='10'/> <br />
</p>
- For more information, please refer to: [U2Net](https://github.com/xuebinqin/U-2-Net)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install U2Net
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
```python
import cv2
import paddlehub as hub
model = hub.Module(name='U2Net')
result = model.Segmentation(
images=[cv2.imread('/PATH/TO/IMAGE')],
paths=None,
batch_size=1,
input_size=320,
output_dir='output',
visualization=True)
```
- ### 2、API
```python
def Segmentation(
images=None,
paths=None,
batch_size=1,
input_size=320,
output_dir='output',
visualization=False):
```
- Prediction API, obtaining segmentation result.
- **Parameter**
* images (list[np.ndarray]) : Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list[str]) : Image path.
* batch_size (int) : Batch size.
* input_size (int) : Input image size, default is 320.
* output_dir (str) : Save path of images, 'output' by default.
* visualization (bool) : Whether to save the results as picture files.
- **Return**
* results (list[np.ndarray]): The list of segmentation results.
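- A minimal sketch saving each returned map, treating every entry of the documented result list as an image array (file names are illustrative):
- ```python
import cv2
import paddlehub as hub

model = hub.Module(name='U2Net')
results = model.Segmentation(images=[cv2.imread('/PATH/TO/IMAGE')], input_size=320)
for i, mask in enumerate(results):
    cv2.imwrite('u2net_result_{}.png'.format(i), mask)
```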
## IV. Release Note
- 1.0.0
First release
......@@ -47,7 +47,7 @@
| [Linux_Quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
import cv2
......
# U2Netp
|Module Name |U2Netp|
| :--- | :---: |
|Category |Image segmentation|
|Network |U^2Net|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size |6.7MB|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/4d77bc3a05cf48bba6f67b797978f4cdf10f38288b9645d59393dd85cef58eff" width = "450" height = "300" hspace='10'/> <img src="https://ai-studio-static-online.cdn.bcebos.com/11c9eba8de6d4316b672f10b285245061821f0a744e441f3b80c223881256ca0" width = "450" height = "300" hspace='10'/>
</p>
- ### Module Introduction
- Network architecture:
<p align="center">
<img src="https://ai-studio-static-online.cdn.bcebos.com/999d37b4ffdd49dc9e3315b7cec7b2c6918fdd57c8594ced9dded758a497913d" hspace='10'/> <br />
</p>
- For more information, please refer to: [U2Net](https://github.com/xuebinqin/U-2-Net)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install U2Netp
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
```python
import cv2
import paddlehub as hub
model = hub.Module(name='U2Netp')
result = model.Segmentation(
images=[cv2.imread('/PATH/TO/IMAGE')],
paths=None,
batch_size=1,
input_size=320,
output_dir='output',
visualization=True)
```
- ### 2、API
```python
def Segmentation(
images=None,
paths=None,
batch_size=1,
input_size=320,
output_dir='output',
visualization=False):
```
- Prediction API, obtaining segmentation result.
- **Parameter**
* images (list[np.ndarray]) : Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list[str]) : Image path.
* batch_size (int) : Batch size.
* input_size (int) : Input image size, default is 320.
* output_dir (str) : Save path of images, 'output' by default.
* visualization (bool) : Whether to save the results as picture files.
- **Return**
* results (list[np.ndarray]): The list of segmentation results.
## IV. Release Note
- 1.0.0
First release
......@@ -57,10 +57,10 @@
- ### 1、Command line Prediction
```shell
$ hub install ace2p==1.1.0
$ hub run ace2p --input_path "/PATH/TO/IMAGE"
```
- ### 2、Code Example
- ### 2、Prediction Code Example
```python
import paddlehub as hub
......
# ace2p
|Module Name|ace2p|
| :--- | :---: |
|Category|Image segmentation|
|Network|ACE2P|
|Dataset|LIP|
|Fine-tuning supported or not|No|
|Module Size|259MB|
|Data indicators|-|
|Latest update date |2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Network architecture:
<p align="center">
<img src="https://bj.bcebos.com/paddlehub/paddlehub-img/ace2p_network.jpg" hspace='10'/> <br />
</p>
- Color palette
<p align="left">
<img src="https://bj.bcebos.com/paddlehub/paddlehub-img/ace2p_palette.jpg" hspace='10'/> <br />
</p>
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130913092-312a5f37-842e-4fd0-8db4-5f853fd8419f.jpg" width = "337" height = "505" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130913765-c9572c77-c6bf-46ec-9653-04ff356b4b85.png" width = "337" height = "505" hspace='10'/>
</p>
- ### Module Introduction
- Human Parsing is a fine-grained semantic segmentation task that aims to identify the components (for example, body parts and clothing) of a human image at the pixel level. The PaddleHub Module uses ResNet101 as the backbone network, and accepts input image sizes of 473x473x3.
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install ace2p
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run ace2p --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
human_parser = hub.Module(name="ace2p")
result = human_parser.segmentation(images=[cv2.imread('/PATH/TO/IMAGE')])
```
- ### 3、API
- ```python
def segmentation(images=None,
paths=None,
batch_size=1,
use_gpu=False,
output_dir='ace2p_output',
visualization=False):
```
- Prediction API, used for human parsing.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): Image path.
* batch\_size (int): Batch size.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* output\_dir (str): Save path of output, default is 'ace2p_output'.
* visualization (bool): Whether to save the recognition results as picture files.
- **Return**
* res (list\[dict\]): The list of recognition results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result.
* data (numpy.ndarray): The result of portrait segmentation.
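- A minimal sketch of inspecting the parsing map (an illustrative assumption: `data` is a 2-D array of integer part labels matching the palette above; `result` is reused from the example above):
- ```python
  import numpy as np
  # Print which part labels the parser found in the first image.
  print(np.unique(result[0]['data']))
  ```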
- ```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
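- A minimal usage sketch (the directory name is an arbitrary example):
- ```python
  # Export the inference model and its parameters into ./ace2p_inference as combined files.
  human_parser.save_inference_model(dirname='ace2p_inference', combined=True)
  ```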
## IV. Server Deployment
- PaddleHub Serving can deploy an online service of human parsing
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m ace2p
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
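- For example, to serve on the first GPU (a sketch of the NOTE above):
- ```shell
  $ export CUDA_VISIBLE_DEVICES=0
  $ hub serving start -m ace2p
  ```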
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
    # Encode the image as JPEG bytes, then to a base64 string
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode the base64 string back into a BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/ace2p"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction results
print(base64_to_cv2(r.json()["results"][0]['data']))
```
## V. Release Note
* 1.0.0
  First release
* 1.1.0
  Adapted to PaddleHub 2.0
# deeplabv3p_xception65_humanseg
|Module Name |deeplabv3p_xception65_humanseg|
| :--- | :---: |
|Category|Image segmentation|
|Network|deeplabv3p|
|Dataset|Baidu self-built dataset|
|Fine-tuning supported or not|No|
|Module Size|162MB|
|Data indicators |-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130913092-312a5f37-842e-4fd0-8db4-5f853fd8419f.jpg" width = "337" height = "505" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130913256-41056b21-1c3d-4ee2-b481-969c94754609.png" width = "337" height = "505" hspace='10'/>
</p>
- ### Module Introduction
- The DeepLabv3+ model is trained on a Baidu self-built dataset and can be used for portrait segmentation.
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/deeplabv3plus.png" hspace='10'/> <br />
</p>
- For more information, please refer to: [deeplabv3p](https://github.com/PaddlePaddle/PaddleSeg)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install deeplabv3p_xception65_humanseg
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
hub run deeplabv3p_xception65_humanseg --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
human_seg = hub.Module(name="deeplabv3p_xception65_humanseg")
result = human_seg.segmentation(images=[cv2.imread('/PATH/TO/IMAGE')])
```
- ### 3、API
- ```python
def segmentation(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_output')
```
- Prediction API, generating segmentation result.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): Image path.
* batch\_size (int): Batch size.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* visualization (bool): Whether to save the recognition results as picture files.
* output\_dir (str): Save path of images, 'humanseg_output' by default.
- **Return**
* res (list\[dict\]): The list of recognition results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result.
* data (numpy.ndarray): The result of portrait segmentation.
- ```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service for human segmentation.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m deeplabv3p_xception65_humanseg
```
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import cv2
import base64
import numpy as np
def cv2_to_base64(image):
    # Encode the image as JPEG bytes, then to a base64 string
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode the base64 string back into a BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread("/PATH/TO/IMAGE")
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/deeplabv3p_xception65_humanseg"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# Recover the mask from the response and composite it with the original image as RGBA
mask = cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_server.png", rgba)
```
## V. Release Note
- 1.0.0
First release
* 1.1.0
Improve prediction performance
* 1.1.1
Fix the bug of image value out of range
* 1.1.2
Fix the memory leak on cuDNN 8.0.4
......@@ -48,7 +48,7 @@
```
hub run humanseg_lite --input_path "/PATH/TO/IMAGE"
```
- ### 2、Code Example
- ### 2、Prediction Code Example
- Image segmentation and video segmentation example:
......@@ -72,7 +72,7 @@
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_lite')
human_seg = hub.Module(name='humanseg_lite')
cap_video = cv2.VideoCapture('\PATH\TO\VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_lite_video.avi'
......
# humanseg_lite
|Module Name |humanseg_lite|
| :--- | :---: |
|Category |Image segmentation|
|Network|shufflenet|
|Dataset|Baidu self-built dataset|
|Fine-tuning supported or not|No|
|Module Size|541k|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130913092-312a5f37-842e-4fd0-8db4-5f853fd8419f.jpg" width = "337" height = "505" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130916087-7d537ad9-bbc8-4bce-9382-8eb132b35532.png" width = "337" height = "505" hspace='10'/>
</p>
- ### Module Introduction
- HumanSeg_lite is based on ShuffleNetV2 network. The network size is only 541K. It is suitable for selfie portrait segmentation and can be segmented in real time on the mobile terminal.
- For more information, please refer to:[humanseg_lite](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/HumanSeg)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install humanseg_lite
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
  hub run humanseg_lite --input_path "/PATH/TO/IMAGE"
  ```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- Image segmentation and video segmentation example:
- ```python
import cv2
import paddlehub as hub
human_seg = hub.Module(name='humanseg_lite')
im = cv2.imread('/PATH/TO/IMAGE')
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
- Video prediction example:
- ```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module(name='humanseg_lite')
cap_video = cv2.VideoCapture('\PATH\TO\VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_lite_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=cap_video.get(1), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
- ### 3、API
- ```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_lite_output')
```
- Prediction API, generating segmentation result.
- **Parameter**
* images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): image path.
* batch\_size (int): batch size.
* use\_gpu (bool): use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* visualization (bool): Whether to save the results as picture files.
* output\_dir (str): save path of images, humanseg_lite_output by default.
- **Return**
* res (list\[dict\]): The list of recognition results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result.
* data (numpy.ndarray): The result of portrait segmentation.
- ```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
- Prediction API, used to segment video portraits frame by frame.
- **Parameter**
* frame_org (numpy.ndarray): Single frame for prediction, ndarray.shape is in the format [H, W, C], BGR.
* frame_id (int): The number of the current frame.
* prev_gray (numpy.ndarray): Grayscale image of the previous network input.
* prev_cfd (numpy.ndarray): The fusion image from optical flow and the prediction result from previous frame.
* use\_gpu (bool): use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- **Return**
* img_matting (numpy.ndarray): The result of portrait segmentation.
* cur_gray (numpy.ndarray): Grayscale image of the current network input.
* optflow_map (numpy.ndarray): The fusion image from optical flow and the prediction result from current frame.
- ```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_lite_video_result'):
```
- Prediction API to produce video segmentation result.
- **Parameter**
* video\_path (str): Video path for segmentation. If None, the video will be obtained from the local camera, and a window will display the online segmentation result.
* use\_gpu (bool): use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* save\_dir (str): save path of video.
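- A minimal sketch of the two video modes described above (paths are placeholders):
- ```python
  # Offline mode: segment a saved video file; the result is written into save_dir.
  human_seg.video_segment(video_path='/PATH/TO/VIDEO', save_dir='humanseg_lite_video_result')
  # Online mode: with video_path=None, frames are read from the local camera.
  human_seg.video_segment(video_path=None)
  ```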
- ```python
def save_inference_model(dirname='humanseg_lite_model',
model_filename=None,
params_filename=None,
combined=True)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service for human segmentation.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
hub serving start -m humanseg_lite
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    # Encode the image as JPEG bytes, then to a base64 string
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode the base64 string back into a BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_lite"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask =cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_lite.png", rgba)
```
## V. Release Note
- 1.0.0
First release
- 1.1.0
Added video portrait segmentation interface
Added video stream portrait segmentation interface
* 1.1.1
Fix the memory leak on cuDNN 8.0.4
......@@ -52,7 +52,7 @@
```
hub run humanseg_mobile --input_path "/PATH/TO/IMAGE"
```
- ### 2、Code Example
- ### 2、Prediction Code Example
- Image segmentation and video segmentation example:
......@@ -76,7 +76,7 @@
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_mobile')
human_seg = hub.Module(name='humanseg_mobile')
cap_video = cv2.VideoCapture('\PATH\TO\VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_mobile_video.avi'
......
# humanseg_mobile
|Module Name |humanseg_mobile|
| :--- | :---: |
|Category |Image segmentation|
|Network|hrnet|
|Dataset|Baidu self-built dataset|
|Fine-tuning supported or not|No|
|Module Size|5.8M|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130913092-312a5f37-842e-4fd0-8db4-5f853fd8419f.jpg" width = "337" height = "505" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130914325-3795e241-b611-46a1-aa70-ffc47326c86a.png" width = "337" height = "505" hspace='10'/>
</p>
- ### Module Introduction
- HumanSeg_mobile is based on HRNet_w18_small_v1 network. The network size is only 5.8M. It is suitable for selfie portrait segmentation and can be segmented in real time on the mobile terminal.
- For more information, please refer to:[humanseg_mobile](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/HumanSeg)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install humanseg_mobile
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
  hub run humanseg_mobile --input_path "/PATH/TO/IMAGE"
  ```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- Image segmentation and video segmentation example:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module(name='humanseg_mobile')
im = cv2.imread('/PATH/TO/IMAGE')
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
- Video prediction example:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module(name='humanseg_mobile')
cap_video = cv2.VideoCapture('\PATH\TO\VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_mobile_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=cap_video.get(1), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
- ### 3、API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_mobile_output')
```
- Prediction API, generating segmentation result.
- **Parameter**
* images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): image path.
* batch\_size (int): batch size.
* use\_gpu (bool): use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* visualization (bool): Whether to save the results as picture files.
* output\_dir (str): save path of images, humanseg_mobile_output by default.
- **Return**
* res (list\[dict\]): The list of recognition results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result.
* data (numpy.ndarray): The result of portrait segmentation.
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
- Prediction API, used to segment video portraits frame by frame.
- **Parameter**
* frame_org (numpy.ndarray): Single frame for prediction, ndarray.shape is in the format [H, W, C], BGR.
* frame_id (int): The number of the current frame.
* prev_gray (numpy.ndarray): Grayscale image of the previous network input.
* prev_cfd (numpy.ndarray): The fusion image from optical flow and the prediction result from previous frame.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- **Return**
* img_matting (numpy.ndarray): The result of portrait segmentation.
* cur_gray (numpy.ndarray): Grayscale image of the current network input.
* optflow_map (numpy.ndarray): The fusion image from optical flow and the prediction result from current frame.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_mobile_video_result'):
```
- Prediction API to produce video segmentation result.
- **Parameter**
* video\_path (str): Video path for segmentation. If None, the video will be obtained from the local camera, and a window will display the online segmentation result.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* save\_dir (str): save path of video.
```python
def save_inference_model(dirname='humanseg_mobile_model',
model_filename=None,
params_filename=None,
combined=True)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service for human segmentation.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m humanseg_mobile
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    # Encode the image as JPEG bytes, then to a base64 string
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode the base64 string back into a BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask =cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_mobile.png", rgba)
```
## V. Release Note
- 1.0.0
First release
- 1.1.0
Added video portrait segmentation interface
Added video stream portrait segmentation interface
* 1.1.1
  Fix the video memory leak on cuDNN 8.0.4
......@@ -51,7 +51,7 @@
```
hub run humanseg_server --input_path "/PATH/TO/IMAGE"
```
- ### 2、Code Example
- ### 2、Prediction Code Example
- Image segmentation and video segmentation example:
......@@ -75,7 +75,7 @@
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_server')
human_seg = hub.Module(name='humanseg_server')
cap_video = cv2.VideoCapture('\PATH\TO\VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_server_video.avi'
......
# humanseg_server
|Module Name |humanseg_server|
| :--- | :---: |
|Category |Image segmentation|
|Network|hrnet|
|Dataset|Baidu self-built dataset|
|Fine-tuning supported or not|No|
|Module Size|159MB|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
<p align="center">
<img src="https://user-images.githubusercontent.com/35907364/130913092-312a5f37-842e-4fd0-8db4-5f853fd8419f.jpg" width = "337" height = "505" hspace='10'/> <img src="https://user-images.githubusercontent.com/35907364/130915531-bd4b2294-47e4-47e1-b9d3-3c1fa8b90f8f.png" width = "337" height = "505" hspace='10'/>
</p>
- ### Module Introduction
- The HumanSeg-server model is trained on a Baidu self-built dataset and can be used for portrait segmentation.
- For more information, please refer to:[humanseg_server](https://github.com/PaddlePaddle/PaddleSeg/tree/release/2.2/contrib/HumanSeg)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install humanseg_server
```
- In case of any problems during installation, please refer to:[Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Command line Prediction
- ```shell
  hub run humanseg_server --input_path "/PATH/TO/IMAGE"
  ```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_en/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- Image segmentation and video segmentation example:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module(name='humanseg_server')
im = cv2.imread('/PATH/TO/IMAGE')
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
- Video prediction example:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module(name='humanseg_server')
cap_video = cv2.VideoCapture('\PATH\TO\VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_server_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=cap_video.get(1), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
- ### 3、API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_server_output')
```
- Prediction API, generating segmentation result.
- **Parameter**
* images (list\[numpy.ndarray\]): Image data, ndarray.shape is in the format [H, W, C], BGR.
* paths (list\[str\]): Image path.
* batch\_size (int): Batch size.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* visualization (bool): Whether to save the results as picture files.
* output\_dir (str): Save path of images, humanseg_server_output by default.
- **Return**
* res (list\[dict\]): The list of recognition results, where each element is dict and each field is:
* save\_path (str, optional): Save path of the result.
* data (numpy.ndarray): The result of portrait segmentation.
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
- Prediction API, used to segment video portraits frame by frame.
- **Parameter**
* frame_org (numpy.ndarray): Single frame for prediction, ndarray.shape is in the format [H, W, C], BGR.
* frame_id (int): The number of the current frame.
* prev_gray (numpy.ndarray): Grayscale image of the previous network input.
* prev_cfd (numpy.ndarray): The fusion image from optical flow and the prediction result from previous frame.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
- **Return**
* img_matting (numpy.ndarray): The result of portrait segmentation.
* cur_gray (numpy.ndarray): Grayscale image of the current network input.
* optflow_map (numpy.ndarray): The fusion image from optical flow and the prediction result from current frame.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_server_video_result'):
```
- Prediction API to produce video segmentation result.
- **Parameter**
* video\_path (str): Video path for segmentation. If None, the video will be obtained from the local camera, and a window will display the online segmentation result.
* use\_gpu (bool): Use GPU or not. **set the CUDA_VISIBLE_DEVICES environment variable first if you are using GPU**
* save\_dir (str): Save path of video.
```python
def save_inference_model(dirname='humanseg_server_model',
model_filename=None,
params_filename=None,
combined=True)
```
- Save the model to the specified path.
- **Parameters**
* dirname: Save path.
* model\_filename: Model file name, default is \_\_model\_\_
* params\_filename: Parameter file name, default is \_\_params\_\_ (only takes effect when `combined` is True)
* combined: Whether to save the parameters to a unified file.
## IV. Server Deployment
- PaddleHub Serving can deploy an online service for human segmentation.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m humanseg_server
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- With a configured server, use the following lines of code to send the prediction request and obtain the result
- ```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    # Encode the image as JPEG bytes, then to a base64 string
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    # Decode the base64 string back into a BGR image array
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_server"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
mask =cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_server.png", rgba)
```
## V. Release Note
- 1.0.0
First release
- 1.1.0
Added video portrait segmentation interface
Added video stream portrait segmentation interface
* 1.1.1
Fix the memory leak on cuDNN 8.0.4
# chinese_ocr_db_crnn_mobile
| Module Name | chinese_ocr_db_crnn_mobile |
| :------------------ | :------------: |
| Category | image-text_recognition |
| Network | Differentiable Binarization+CRNN |
| Dataset | icdar2015 |
| Fine-tuning supported or not | No |
| Module Size | 16M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information of Module
- ### Application Effect Display
- [Online experience in OCR text recognition scenarios](https://www.paddlepaddle.org.cn/hub/scene/ocr)
- Example result:
<p align="center">
<img src="https://user-images.githubusercontent.com/76040149/133097562-d8c9abd1-6c70-4d93-809f-fa4735764836.png" width = "600" hspace='10'/> <br />
</p>
- ### Module Introduction
- The chinese_ocr_db_crnn_mobile Module is used to recognize Chinese characters in images. It first obtains text boxes with the [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/), performs angle classification on the detected text boxes, and then recognizes the Chinese characters inside them. CRNN (Convolutional Recurrent Neural Network) is adopted as the final recognition algorithm. This Module is an ultra-lightweight Chinese OCR model that supports direct prediction.
<p align="center">
<img src="https://user-images.githubusercontent.com/76040149/133098254-7c642826-d6d7-4dd0-986e-371622337867.png" width = "300" height = "450" hspace='10'/> <br />
</p>
- For more information, please refer to:[An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## II. Installation
- ### 1、Environmental dependence
- paddlepaddle >= 1.7.2
- paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- shapely
- pyclipper
- ```shell
$ pip install shapely pyclipper
```
- **This Module relies on the third-party libraries shapely and pyclipper. Please install shapely and pyclipper before using this Module.**
- ### 2、Installation
- ```shell
$ hub install chinese_ocr_db_crnn_mobile
```
- If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run chinese_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="chinese_ocr_db_crnn_mobile", enable_mkldnn=True) # MKLDNN acceleration is only available on CPU
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
__init__(text_detector_module=None, enable_mkldnn=False)
```
- Construct the ChineseOCRDBCRNN object
- **Parameter**
- text_detector_module(str): PaddleHub Module Name for text detection, use [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/) by default if set to None. Its function is to detect the text in the picture.
- enable_mkldnn(bool): Whether to enable MKLDNN to accelerate CPU computing. This parameter is valid only when the CPU is running. The default is False.
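- A minimal construction sketch (illustrative; it assumes `hub.Module` forwards these keyword arguments to the module constructor, as in the example above):
- ```python
  import paddlehub as hub
  # Explicitly pick the default text detector and leave MKLDNN acceleration off.
  ocr = hub.Module(name="chinese_ocr_db_crnn_mobile",
                   text_detector_module="chinese_text_detection_db_mobile",
                   enable_mkldnn=False)
  ```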
- ```python
def recognize_text(images=[],
paths=[],
use_gpu=False,
output_dir='ocr_result',
visualization=False,
box_thresh=0.5,
text_thresh=0.5,
angle_classification_thresh=0.9)
```
- Prediction API, detecting the position of all Chinese text in the input image.
- **Parameter**
- paths (list\[str\]): image path
- images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format \[H, W, C\], BGR;
- use\_gpu (bool): Use GPU or not. **If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable first**;
- box\_thresh (float): The confidence threshold of text box detection;
- text\_thresh (float): The confidence threshold of Chinese text recognition;
- angle_classification_thresh(float): The confidence threshold of text angle classification;
- visualization (bool): Whether to save the recognition results as picture files;
- output\_dir (str): Path to save the images, ocr\_result by default.
- **Return**
- res (list\[dict\]): The list of recognition results, where each element is dict and each field is:
- data (list\[dict\]): recognition result, each element in the list is dict and each field is:
- text(str): The result text of recognition
- confidence(float): The confidence of the results
- text_box_position(list): The pixel coordinates of the text box in the original image, a 4x2 matrix representing the coordinates of the lower-left, lower-right, upper-right and upper-left vertices of the text box in turn
data is \[\] if there is no result
- save_path (str, optional): Path to save the result; save_path is '' if no image is saved.
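- A minimal sketch of walking the return structure (reusing `result` from the example above):
- ```python
  for res in result:
      for item in res['data']:
          # Each item carries the recognized text, its confidence, and the box vertices.
          print(item['text'], item['confidence'], item['text_box_position'])
  ```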
## IV. Server Deployment
- PaddleHub Serving can deploy an online OCR service.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m chinese_ocr_db_crnn_mobile
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
    # Encode the image as JPEG bytes, then to a base64 string
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction result
print(r.json()["results"])
```
## V. Release Note
* 1.0.0
First release
* 1.0.1
Fixed a failure when invoking the model through the online service
* 1.0.2
Supports MKLDNN to speed up CPU computing
* 1.1.0
An ultra-lightweight three-stage model (text box detection - angle classification - text recognition) is used to identify text in images.
* 1.1.1
Supports recognition of spaces in text.
* 1.1.2
Fixed an issue where only 30 fields could be detected.
- ```shell
$ hub install chinese_ocr_db_crnn_mobile==1.1.2
```
......@@ -7,7 +7,7 @@
|Dataset|WuDaoCorpora 2.0|
|Fine-tuning supported or not|No|
|Module Size|568MB|
|Latest update date|2021-11-16|
|Latest update date|2021-12-24|
|Data indicators|-|
## I. Basic Information
......@@ -16,7 +16,15 @@
ERNIE is a knowledge-enhanced continual-learning semantic understanding model proposed by Baidu. It combines large-scale pre-training with rich multi-source knowledge and, through continual learning, keeps absorbing lexical, structural and semantic knowledge from massive text data, so that the model keeps improving.
auto_punc uses the Ernie 1.0 pre-trained model, trained on the punctuation restoration task over the large-scale "WuDao" Chinese text corpus [WuDaoCorpora 2.0](https://resource.wudaoai.cn/home). The model can be used directly for prediction, automatically adding 7 kinds of punctuation marks to input Chinese text: comma (,), period (。), exclamation mark (!), question mark (?), enumeration comma (、), colon (:) and semicolon (;).
The ["WuDao" text corpus](https://ks3-cn-beijing.ksyun.com/resources/WuDaoCorpora/WuDaoCorpora__A_Super_Large_scale_Chinese_Corporafor_Pre_training_Language_Models.pdf)
was cleaned from 100TB of raw web data with more than 20 rules, with particular attention to removing private information, avoiding at the source the privacy-leakage risk present in GPT-3; it carries 50+ industry labels such as education and technology, and can support the training of multi-domain pre-trained models.
- Total data volume: 3TB
- Data format: json
- Open-sourced volume: 200GB
- Dataset download: https://resource.wudaoai.cn/
- Date: December 23, 2021
auto_punc uses the Ernie 1.0 pre-trained model, trained on the punctuation restoration task over the 200GB open-source portion of [WuDaoCorpora 2.0](https://resource.wudaoai.cn/home). The model can be used directly for prediction, automatically adding 7 kinds of punctuation marks to input Chinese text: comma (,), period (。), exclamation mark (!), question mark (?), enumeration comma (、), colon (:) and semicolon (;).
<p align="center">
<img src="https://bj.bcebos.com/paddlehub/paddlehub-img/ernie_network_1.png" hspace='10'/> <br />
......@@ -28,6 +36,7 @@ auto_punc采用了Ernie1.0预训练模型,在大规模的"悟道"中文文本
For more details, please refer to
- [WuDaoCorpora: A Super Large-scale Chinese Corpora for Pre-training Language Models](https://ks3-cn-beijing.ksyun.com/resources/WuDaoCorpora/WuDaoCorpora__A_Super_Large_scale_Chinese_Corporafor_Pre_training_Language_Models.pdf)
- [ERNIE: Enhanced Representation through Knowledge Integration](https://arxiv.org/abs/1904.09223)
......
......@@ -29,7 +29,7 @@ from paddlenlp.data import Pad
name="auto_punc",
version="1.0.0",
summary="",
author="PaddlePaddle",
author="KPatrick",
author_email="",
type="text/punctuation_restoration")
class Ernie(paddle.nn.Layer):
......
......@@ -65,7 +65,6 @@
for result in results:
print(result['text'])
print(result['sentiment_label'])
print(result['sentiment_key'])
print(result['positive_probs'])
print(result['negative_probs'])
......
......@@ -139,14 +139,29 @@ class ErnieSkepSentimentAnalysis(TransformerModule):
)
results = []
feature_list = []
for text in texts:
# feature.shape: [1, 512, 1]
# batch on the first dimension
feature = self._convert_text_to_feature(text)
inputs = [self.array2tensor(ndarray) for ndarray in feature]
feature_list.append(feature)
feature_batch = [
np.concatenate([feature[0] for feature in feature_list], axis=0),
np.concatenate([feature[1] for feature in feature_list], axis=0),
np.concatenate([feature[2] for feature in feature_list], axis=0),
np.concatenate([feature[3] for feature in feature_list], axis=0),
np.concatenate([feature[4] for feature in feature_list], axis=0),
]
inputs = [self.array2tensor(ndarray) for ndarray in feature_batch]
output = self.predictor.run(inputs)
probilities = np.array(output[0].data.float_data())
probilities_list = np.array(output[0].data.float_data())
probilities_list = probilities_list.reshape((-1, 2))
for i, probilities in enumerate(probilities_list):
label = self.label_map[np.argmax(probilities)]
result = {
'text': text,
'text': texts[i],
'sentiment_label': label,
'positive_probs': probilities[1],
'negative_probs': probilities[0]
......
# senta_bilstm
| Module Name | senta_bilstm |
| :------------------ | :------------: |
| Category | text-sentiment_analysis |
| Network | BiLSTM |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 690M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information of Module
- ### Module Introduction
- Sentiment Classification (Senta for short) can automatically judge the sentiment polarity of subjective Chinese text and give the corresponding confidence. It can help enterprises understand users' consumption habits, analyze hot topics, monitor public-opinion crises, and provide useful decision support. The model is based on a bidirectional LSTM structure, with positive and negative sentiment categories.
## II. Installation
- ### 1、Environmental dependence
- paddlepaddle >= 1.8.0
- paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install senta_bilstm
```
- If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run senta_bilstm --input_text "这家餐厅很好吃"
```
or
- ```shell
$ hub run senta_bilstm --input_file test.txt
```
- test.txt stores the text to be predicted, for example:
> 这家餐厅很好吃
> 这部电影真的很差劲
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
senta = hub.Module(name="senta_bilstm")
test_text = ["这家餐厅很好吃", "这部电影真的很差劲"]
results = senta.sentiment_classify(texts=test_text,
use_gpu=False,
batch_size=1)
for result in results:
print(result['text'])
print(result['sentiment_label'])
print(result['sentiment_key'])
print(result['positive_probs'])
print(result['negative_probs'])
# 这家餐厅很好吃 1 positive 0.9407 0.0593
# 这部电影真的很差劲 0 negative 0.02 0.98
```
- ### 3、API
- ```python
def sentiment_classify(texts=[], data={}, use_gpu=False, batch_size=1)
```
- senta_bilstm prediction interface, classifying the sentiment of the input sentences (binary classification: positive/negative)
- **Parameter**
- texts(list): data to be predicted; if the texts parameter is used, the data parameter does not need to be passed in. Either of the two parameters can be used.
- data(dict): data to be predicted, the key must be text and the value is the data to be predicted; if the data parameter is used, the texts parameter does not need to be passed in. It is suggested to use the texts parameter; the data parameter will be deprecated later.
- use_gpu(bool): use GPU or not. If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it does not need to be set.
- batch_size(int): batch size
- **Return**
- results(list): result of sentiment classification
- ```python
def get_labels()
```
- Get the categories of senta_bilstm
- **Return**
  - labels(dict): the categories of senta_bilstm (binary classification: positive/negative)
- ```python
def get_vocab_path()
```
- Get the vocabulary used in pre-training
- **Return**
- vocab_path(str): Vocabulary path
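- A minimal sketch combining the two helpers above (reusing `senta` from the example):
- ```python
  print(senta.get_labels())      # label set of the binary classifier
  print(senta.get_vocab_path())  # filesystem path of the pre-training vocabulary
  ```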
## IV. Server Deployment
- PaddleHub Serving can deploy an online sentiment analysis detection service and you can use this interface for online Web applications.
- ## Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m senta_bilstm
```
- The model loading process is displayed on startup. After the startup is successful, the following information is displayed:
- ```shell
Loading senta_bilstm successful.
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it does not need to be set.
- ## Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
import requests
import json
# data to be predicted
text = ["这家餐厅很好吃", "这部电影真的很差劲"]
# Set the running configuration
# Corresponding to local prediction senta_bilstm.sentiment_classify(texts=text, batch_size=1, use_gpu=True)
data = {"texts": text, "batch_size": 1, "use_gpu":True}
# set the prediction method to senta_bilstm and send a POST request, content-type should be set to json
# HOST_IP is the IP address of the server
url = "http://HOST_IP:8866/predict/senta_bilstm"
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction result
print(json.dumps(r.json(), indent=4, ensure_ascii=False))
```
- For more information about PaddleHub Serving, please refer to:[Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
## V. Release Note
* 1.0.0
First release
* 1.0.1
Vocabulary upgrade
* 1.1.0
Significantly improve predictive performance
* 1.2.0
Model upgrade, support transfer learning for text classification, text matching and other tasks
- ```shell
$ hub install senta_bilstm==1.2.0
```
# ernie_gen
| Module Name | ernie_gen |
| :------------------ | :-----------: |
| Category | text-text_generation |
| Network | ERNIE-GEN |
| Dataset | - |
| Fine-tuning supported or not | Yes |
| Module Size | 85K |
| Latest update date | 2021-07-20 |
| Data indicators | - |
## I. Basic Information
- ### Module Introduction
  - ERNIE-GEN is a pre-training / fine-tuning framework for generation tasks. It is the first to introduce a span-by-span generation task in the pre-training stage, so that the model can generate a semantically complete fragment at each step. In pre-training and fine-tuning, an infilling generation mechanism and a noise-aware mechanism are used to alleviate the exposure bias problem. In addition, ERNIE-GEN adopts a multi-fragment, multi-granularity target-text sampling strategy to strengthen the correlation between source and target texts and enhance the interaction between encoder and decoder.
  - The ernie_gen module supports fine-tuning and can be used to quickly build modules for specific scenarios.
<p align="center">
<img src="https://user-images.githubusercontent.com/76040149/133191670-8eb1c542-f8e8-4715-adb2-6346b976fab1.png" width="600" hspace='10'/>
</p>
- For more details, please refer to: [ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation](https://arxiv.org/abs/2001.11314)
## II. Installation
- ### 1、Environmental Dependence
  - paddlepaddle >= 2.0.0
  - paddlehub >= 2.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
  - paddlenlp >= 2.0.0
- ### 2、Installation
  - ```shell
    $ hub install ernie_gen
    ```
  - If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction
- ernie_gen can be used **only after it has been fine-tuned on a dataset for a specific task**
- There are many kinds of text generation tasks; ernie_gen provides only the basic parameters for text generation, and it can be used only after fine-tuning on the dataset of a specific task
- PaddleHub provides a simple fine-tune dataset: [train.txt](./test_data/train.txt), [dev.txt](./test_data/dev.txt)
- PaddleHub also offers several well-performing fine-tuned models: [Couplet generation](../ernie_gen_couplet/), [Love-letter generation](../ernie_gen_lover_words/), [Poetry generation](../ernie_gen_poetry/)
### 1、Fine-tune and encapsulation
- #### Fine-tune Code Example
- ```python
import paddlehub as hub
module = hub.Module(name="ernie_gen")
result = module.finetune(
train_path='train.txt',
dev_path='dev.txt',
max_steps=300,
batch_size=2
)
module.export(params_path=result['last_save_path'], module_name="ernie_gen_test", author="test")
```
- #### API Instruction
- ```python
def finetune(train_path,
dev_path=None,
save_dir="ernie_gen_result",
init_ckpt_path=None,
use_gpu=True,
max_steps=500,
batch_size=8,
max_encode_len=15,
max_decode_len=15,
learning_rate=5e-5,
warmup_proportion=0.1,
weight_decay=0.1,
noise_prob=0,
label_smooth=0,
beam_width=5,
length_penalty=1.0,
log_interval=100,
save_interval=200):
```
- Fine tuning model parameters API
- **Parameter**
- train_path(str): Training set path. The format of the training set should be: "serial number\tinput text\tlabel", such as "1\t床前明月光\t疑是地上霜"; note that \t cannot be replaced by spaces
- dev_path(str): Validation set path. The format of the validation set should be: "serial number\tinput text\tlabel", such as "1\t举头望明月\t低头思故乡"; note that \t cannot be replaced by spaces
- save_dir(str): Output path for saved models and validation-set predictions.
- init_ckpt_path(str): Initial checkpoint path for model loading, enabling incremental training.
- use_gpu(bool): Use GPU or not
- max_steps(int): Maximum training steps.
- batch_size(int): Batch size during training.
- max_encode_len(int): Maximum encoding length.
- max_decode_len(int): Maximum decoding length.
- learning_rate(float): Learning rate size.
- warmup_proportion(float): Warmup rate.
- weight_decay(float): Weight decay size.
- noise_prob(float): Noise probability, refer to the Ernie Gen's paper.
- label_smooth(float): Label smoothing weight.
- beam_width(int): Beam size of validation set at the time of prediction.
- length_penalty(float): Length penalty weight for validation set prediction.
- log_interval(int): Number of steps at a training log printing interval.
- save_interval(int): Number of training steps between model saves. Validation-set prediction runs after each save.
- **Return**
- result(dict): Run result. Contains 2 keys:
- last_save_path(str): Save path of model at the end of training.
- last_ppl(float): Model perplexity at the end of training.
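- A minimal incremental-training sketch (illustrative; it assumes `init_ckpt_path` accepts the `last_save_path` returned by a previous run):
- ```python
  # Resume from the previous checkpoint and train for 300 more steps.
  result = module.finetune(
      train_path='train.txt',
      dev_path='dev.txt',
      init_ckpt_path=result['last_save_path'],
      max_steps=300,
      batch_size=2)
  ```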
- ```python
def export(
params_path,
module_name,
author,
version="1.0.0",
summary="",
author_email="",
export_path="."):
```
- Module exports an API through which training parameters can be packaged into a Hub Module with one click.
- **Parameter**
- params_path(str): Module parameter path.
- module_name(str): Module name, such as "ernie_gen_couplet".
- author(str): Author name
- max_encode_len(int): Maximum encoding length.
- max_decode_len(int): Maximum decoding length.
- version(str): The version number.
- summary(str): English introduction to Module.
- author_email(str): Email address of the author.
- export_path(str): Module export path.
### 2、Model Prediction
- **Let `$module_name` be the module_name specified in export**
- After the model conversion is finished, install the model with `hub install $module_name`; the custom module can then be invoked in the following two ways:
- #### Method 1: Command line Prediction
  - ```shell
    $ hub run $module_name --input_text="input text" --use_gpu True --beam_width 5
    ```
  - For invoking Hub models from the command line, see [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- #### Method 2: API Prediction
  - ```python
    import paddlehub as hub
    module = hub.Module(name="$module_name")
    test_texts = ["input text 1", "input text 2"]
    # generate takes 3 parameters: texts is the list of input texts, use_gpu toggles GPU, beam_width sets the beam search width.
    results = module.generate(texts=test_texts, use_gpu=True, beam_width=5)
    for result in results:
        print(result)
    ```
- You can also package the `$module_name` folder into a tar.gz archive and contact the PaddleHub staff to upload it to the PaddleHub model repository, so that more users can install your model with one click. PaddleHub warmly welcomes your contribution to the growth of the open-source community.
## IV. Server Deployment
- PaddleHub Serving can deploy an online text generation service.
- ### Step 1: Start PaddleHub Serving
  - Run the startup command:
  - ```shell
    $ hub serving start -m $module_name -p 8866
    ```
  - The serving API for text generation is now deployed; the default port number is 8866.
  - **NOTE:** If GPU is used for prediction, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ### Step 2: Send a predictive request
  - With a configured server, the following lines of code can be used to send the prediction request and obtain the result
  - ```python
    import requests
    import json
    # Send an HTTP request
    data = {'texts':["input text 1", "input text 2"],
            'use_gpu':True, 'beam_width':5}
    headers = {"Content-type": "application/json"}
    url = "http://127.0.0.1:8866/predict/$module_name"
    r = requests.post(url=url, headers=headers, data=json.dumps(data))
    # Save the results
    results = r.json()["results"]
    for result in results:
        print(result)
    ```
- **NOTE:** `$module_name` above is the module_name specified in export
## V. Release Note
* 1.0.0
  First release
* 1.0.1
  Fix the model export bug
* 1.0.2
  Fix bugs when running on Windows
* 1.1.0
  Integrate PaddleNLP
- ```shell
$ hub install ernie_gen==1.1.0
```
# porn_detection_cnn
| Module Name | porn_detection_cnn |
| :------------------ | :------------: |
| Category | text-text_review |
| Network | CNN |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 20M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information of Module
- ### Module Introduction
  - The porn detection model can automatically judge whether a text is pornographic and give the corresponding confidence, identifying pornographic descriptions, vulgar dating content and obscene text.
  - porn_detection_cnn adopts a CNN network with character-level tokenization, which gives it a high prediction speed. The maximum sentence length of this model is 256 characters; it only supports prediction.
## II. Installation
- ### 1、Environmental dependence
  - paddlepaddle >= 1.6.2
  - paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
  - ```shell
    $ hub install porn_detection_cnn
    ```
  - If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
  - ```shell
    $ hub run porn_detection_cnn --input_text "黄片下载"
    ```
  - or
  - ```shell
    $ hub run porn_detection_cnn --input_file test.txt
    ```
  - test.txt stores the text to be reviewed, with one piece of text per line
  - If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
  - ```python
    import paddlehub as hub
    porn_detection_cnn = hub.Module(name="porn_detection_cnn")
    test_text = ["黄片下载", "打击黄牛党"]
    results = porn_detection_cnn.detection(texts=test_text, use_gpu=True, batch_size=1)
    for index, text in enumerate(test_text):
        results[index]["text"] = text
    for index, result in enumerate(results):
        print(results[index])
    # The output is as follows:
    # {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
    # {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
    ```
- ### 3、API
  - ```python
    def detection(texts=[], data={}, use_gpu=False, batch_size=1)
    ```
    - porn_detection_cnn prediction interface, identifying whether the input sentences contain pornographic content
    - **Parameter**
      - texts(list): data to be predicted; if the texts parameter is used, the data parameter does not need to be passed in. Either of the two parameters can be used.
      - data(dict): data to be predicted, the key must be text and the value is the data to be predicted; if the data parameter is used, the texts parameter does not need to be passed in. It is suggested to use the texts parameter; the data parameter will be deprecated later.
      - use_gpu(bool): use GPU or not. If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it does not need to be set.
      - batch_size(int): batch size
    - **Return**
      - results(list): detection results
  - ```python
    def get_labels()
    ```
    - Get the categories of porn_detection_cnn
    - **Return**
      - labels(dict): the categories of porn_detection_cnn (binary classification: porn / not porn)
  - ```python
    def get_vocab_path()
    ```
    - Get the vocabulary used in pre-training
    - **Return**
      - vocab_path(str): vocabulary path
## IV. Server Deployment
- PaddleHub Serving can deploy an online porn detection service, and you can use this interface for online web applications.
- ## Step 1: Start PaddleHub Serving
  - Run the startup command:
  - ```shell
    $ hub serving start -m porn_detection_cnn
    ```
  - The model loading process is displayed on startup. After the startup is successful, the following information is displayed:
  - ```shell
    Loading porn_detection_cnn successful.
    ```
  - The serving API is now deployed and the default port number is 8866.
  - **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it does not need to be set.
- ## Step 2: Send a predictive request
  - After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
  - ```python
    import requests
    import json
    # data to be predicted
    text = ["黄片下载", "打击黄牛党"]
    # Set the running configuration
    # Corresponding to local prediction porn_detection_cnn.detection(texts=text, batch_size=1, use_gpu=True)
    data = {"texts": text, "batch_size": 1, "use_gpu":True}
    # Set the prediction method to porn_detection_cnn and send a POST request; content-type should be set to json
    # HOST_IP is the IP address of the server
    url = "http://HOST_IP:8866/predict/porn_detection_cnn"
    headers = {"Content-Type": "application/json"}
    r = requests.post(url=url, headers=headers, data=json.dumps(data))
    # print prediction result
    print(json.dumps(r.json(), indent=4, ensure_ascii=False))
    ```
- For more information about PaddleHub Serving, please refer to:[Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
## V. Release Note
* 1.0.0
  First release
* 1.1.0
  Significantly improve prediction performance and simplify the interface
- ```shell
  $ hub install porn_detection_cnn==1.1.0
  ```
# porn_detection_gru

| Module Name | porn_detection_gru |
| :------------------ | :------------: |
| Category | text-text_review |
| Network | GRU |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 20M |
| Latest update date | 2021-02-26 |
| Data indicators | - |

## I. Basic Information of Module

- ### Module Introduction

- The pornography detection model can automatically distinguish whether a text is pornographic and give the corresponding confidence, identifying pornographic descriptions, vulgar communication and filthy text.

- porn_detection_gru adopts a GRU network structure and tokenizes at character granularity, which gives it high prediction speed. The maximum sentence length of this model is 256 characters, and only prediction is supported.

## II. Installation

- ### 1、Environmental dependence

- paddlepaddle >= 1.6.2

- paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

- ```shell
$ hub install porn_detection_gru
```

- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

- ```shell
$ hub run porn_detection_gru --input_text "黄片下载"
```

- or

- ```shell
$ hub run porn_detection_gru --input_file test.txt
```

- test.txt stores the text to be reviewed, one piece of text per line

- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

- ```python
import paddlehub as hub

porn_detection_gru = hub.Module(name="porn_detection_gru")

test_text = ["黄片下载", "打击黄牛党"]

results = porn_detection_gru.detection(texts=test_text, use_gpu=True, batch_size=1)  # If you do not use GPU, please set use_gpu=False

for index, text in enumerate(test_text):
    results[index]["text"] = text
for index, result in enumerate(results):
    print(results[index])

# The output:
# {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
# {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
```

- ### 3、API

- ```python
def detection(texts=[], data={}, use_gpu=False, batch_size=1)
```

- Prediction API of porn_detection_gru, identifying whether the input sentences contain pornography.

- **Parameter**

- texts(list): data to be predicted. If the texts parameter is used, there is no need to pass in the data parameter; the two are interchangeable.

- data(dict): data to be predicted; the key must be text and the value is the data to be predicted. If the data parameter is used, there is no need to pass in the texts parameter. The texts parameter is recommended; the data parameter will be deprecated later.

- use_gpu(bool): whether to use GPU for prediction. If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it need not be set.

- batch_size(int): batch size

- **Return**

- results(list): detection results

- ```python
def get_labels()
```

- Get the categories of porn_detection_gru

- **Return**

- labels(dict): the categories of porn_detection_gru (binary classification: porn / not_porn)

- ```python
def get_vocab_path()
```

- Get the path of the vocabulary used in pre-training

- **Return**

- vocab_path(str): vocabulary path

## IV. Server Deployment

- PaddleHub Serving can deploy an online pornography detection service and you can use this interface for online Web applications.

- ## Step 1: Start PaddleHub Serving

- Run the startup command:

- ```shell
$ hub serving start -m porn_detection_gru
```

- The model loading process is displayed on startup. After the startup is successful, the following information is displayed:

- ```shell
Loading porn_detection_gru successful.
```

- The serving API is now deployed and the default port number is 8866.

- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.

- ## Step 2: Send a predictive request

- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result

- ```python
import requests
import json

# Data to be predicted
text = ["黄片下载", "打击黄牛党"]

# Set the running configuration
# Corresponding local prediction: porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
data = {"texts": text, "batch_size": 1, "use_gpu": True}

# Set the prediction method to porn_detection_gru and send a POST request; content-type should be set to json
# HOST_IP is the IP address of the server
url = "http://HOST_IP:8866/predict/porn_detection_gru"
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))

# Print the prediction results
print(json.dumps(r.json(), indent=4, ensure_ascii=False))
```

- For more information about PaddleHub Serving, please refer to: [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)

## V. Release Note

* 1.0.0

  First release

* 1.1.0

  Significantly improves prediction performance and simplifies interface usage

- ```shell
$ hub install porn_detection_gru==1.1.0
```
# porn_detection_gru
| Module Name | porn_detection_gru |
| :------------------ | :------------: |
| Category | text-text_review |
| Network | GRU |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 20M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information of Module
- ### Module Introduction
- The pornography detection model can automatically distinguish whether a text is pornographic and give the corresponding confidence, identifying pornographic descriptions, vulgar communication and filthy text.
- porn_detection_gru adopts a GRU network structure and tokenizes at character granularity, which gives it high prediction speed. The maximum sentence length of this model is 256 characters, and only prediction is supported.
## II. Installation
- ### 1、Environmental dependence
- paddlepaddle >= 1.6.2
- paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install porn_detection_gru
```
- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run porn_detection_gru --input_text "黄片下载"
```
- or
- ```shell
$ hub run porn_detection_gru --input_file test.txt
```
- test.txt stores the text to be reviewed. Each line contains only one text
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
porn_detection_gru = hub.Module(name="porn_detection_gru")
test_text = ["黄片下载", "打击黄牛党"]
results = porn_detection_gru.detection(texts=test_text, use_gpu=True, batch_size=1) # If you do not use GPU, please set use_gpu=False
for index, text in enumerate(test_text):
results[index]["text"] = text
for index, result in enumerate(results):
print(results[index])
# The output:
# {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
# {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
```
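- A small sketch of turning the probabilities shown above into a screening decision with a custom threshold (the 0.9 cut-off is an illustrative assumption, stricter than the default label decision):

- ```python
import paddlehub as hub

module = hub.Module(name="porn_detection_gru")
results = module.detection(texts=["黄片下载", "打击黄牛党"], use_gpu=False, batch_size=1)

THRESHOLD = 0.9
for res in results:
    verdict = "flagged" if res["porn_probs"] >= THRESHOLD else "passed"
    print(res["porn_detection_key"], res["porn_probs"], verdict)
```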
- ### 3、API
- ```python
def detection(texts=[], data={}, use_gpu=False, batch_size=1)
```
- Prediction API of porn_detection_gru, identifying whether the input sentences contain pornography.
- **Parameter**
- texts(list): data to be predicted. If the texts parameter is used, there is no need to pass in the data parameter; the two are interchangeable.
- data(dict): data to be predicted; the key must be text and the value is the data to be predicted. If the data parameter is used, there is no need to pass in the texts parameter. The texts parameter is recommended; the data parameter will be deprecated later.
- use_gpu(bool): whether to use GPU for prediction. If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it need not be set.
- batch_size(int): batch size for prediction.
- **Return**
- results(list): prediction result
- ```python
def get_labels()
```
- Get the categories of porn_detection_gru
- **Return**
- labels(dict): the categories of porn_detection_gru (binary classification: porn / not_porn)
- ```python
def get_vocab_path()
```
- Get the path of the vocabulary used in pre-training
- **Return**
- vocab_path(str): Vocabulary path
## IV. Server Deployment
- PaddleHub Serving can deploy an online pornography detection service and you can use this interface for online Web applications.
- ## Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m porn_detection_gru
```
- The model loading process is displayed on startup. After the startup is successful, the following information is displayed:
- ```shell
Loading porn_detection_gru successful.
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.
- ## Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
import requests
import json
# data to be predicted
text = ["黄片下载", "打击黄牛党"]
# Set the running configuration
# Corresponding local prediction: porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
data = {"texts": text, "batch_size": 1, "use_gpu":True}
# set the prediction method to porn_detection_gru and send a POST request, content-type should be set to json
# HOST_IP is the IP address of the server
url = "http://HOST_IP:8866/predict/porn_detection_gru"
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction result
print(json.dumps(r.json(), indent=4, ensure_ascii=False))
```
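- A minimal sketch wrapping the request above in a reusable helper (it assumes the service started in Step 1 is reachable at HOST_IP; inspect the printed JSON if your response layout differs):

- ```python
import json

import requests


def detect_porn(texts, host="HOST_IP", port=8866):
    """POST texts to a running porn_detection_gru service and return the parsed JSON."""
    url = "http://{}:{}/predict/porn_detection_gru".format(host, port)
    headers = {"Content-Type": "application/json"}
    r = requests.post(url=url, headers=headers, data=json.dumps({"texts": texts}))
    r.raise_for_status()  # surface HTTP errors instead of silently parsing them
    return r.json()


print(detect_porn(["黄片下载", "打击黄牛党"]))
```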
- For more information about PaddleHub Serving, please refer to: [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
## V. Release Note
* 1.0.0
First release
* 1.1.0
Improves prediction performance and simplifies interface usage
- ```shell
$ hub install porn_detection_gru==1.1.0
```
# NPTag

|Module Name|NPTag|
| :--- | :---: |
|Category|text-text_to_knowledge|
|Network|ERNIE-CTM|
|Dataset|Dataset built by Baidu|
|Fine-tuning supported or not|No|
|Module Size|378MB|
|Latest update date|2021-12-10|
|Data indicators|-|

## I. Basic Information of Module

- ### Module Introduction

- NPTag (noun phrase tagging tool) is the first fine-grained knowledge annotation tool covering all Chinese noun words and phrases. It aims to solve the OOV (out-of-vocabulary) problem in NLP caused by insufficient coverage of noun phrases, and its output can be used directly to construct knowledge features that assist NLP tasks.

- NPTag features

- 2000+ fine-grained categories covering the category system of all Chinese noun phrases, for richer knowledge annotation results

- The category system used by NPTag covers all Chinese noun phrases and refines every class into finer categories (such as injections, fish, museums); it contains 2000+ fine-grained categories in total and can be linked directly to the encyclopedia knowledge tree.

- A freely customizable classification framework

- The category system used by the open-source NPTag is a version that has worked well for classifying **encyclopedia entries** in our practice. Users can freely customize their own category system and training samples to build their own NPTag for a better fit: for example, build training samples for custom categories and fine-tune the NPTag model with a small learning rate and a short training schedule to obtain a customized NPTag tool.

- Model structure

- NPTag is trained with ERNIE-CTM plus prompt learning, and decodes with a heuristic search that keeps every classification result inside the label system.

## II. Installation

- ### 1、Environmental dependence

- paddlepaddle >= 2.1.0

- paddlenlp >= 2.2.0

- paddlehub >= 2.1.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

- ```shell
$ hub install nptag
```

- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

- ```shell
$ hub run nptag --input_text="糖醋排骨"
```

- If you want to call the NPTag model through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

- ```python
import paddlehub as hub

# Load NPTag
module = hub.Module(name="nptag")

# String input
results = module.predict("糖醋排骨")
print(results)
# [{'text': '糖醋排骨', 'label': '菜品', 'category': '饮食类_菜品'}]

# List input
results = module.predict(["糖醋排骨", "红曲霉菌"])
print(results)
# [{'text': '糖醋排骨', 'label': '菜品', 'category': '饮食类_菜品'}, {'text': '红曲霉菌', 'label': '微生物', 'category': '生物类_微生物'}]
```

- ### 3、API

- ```python
def __init__(
  batch_size=32,
  max_seq_length=128,
  linking=True,
)
```

- **Parameter**

- batch_size(int): number of samples per prediction batch; the default is 32.

- max_seq_length(int): maximum sentence length; the default is 128.

- linking(bool): whether to link results to the WordTag category labels; the default is True (a construction sketch follows this API list).

- ```python
def predict(texts)
```

- Prediction API: takes text input and outputs noun phrase tagging results.

- **Parameter**

- texts(str or list\[str\]): data to be predicted.

- **Return**

- results(list\[dict\]): prediction results. Each element is a dict containing:

  {
      'text': str, the original text.
      'label': str, the predicted label.
      'category': str, the corresponding WordTag category label.
  }
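- A minimal sketch of the constructor parameters above, assuming hub.Module forwards these keyword arguments to NPTag's __init__:

- ```python
import paddlehub as hub

# Smaller batches and shorter sequences trade throughput for memory;
# linking=True additionally returns the WordTag category of each phrase
module = hub.Module(name="nptag", batch_size=16, max_seq_length=64, linking=True)

print(module.predict(["糖醋排骨", "红曲霉菌"]))
```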
## IV. Server Deployment

- PaddleHub Serving can deploy an online Chinese noun phrase tagging service and you can use this interface for online Web applications.

- ## Step 1: Start PaddleHub Serving

- Run the startup command:

```shell
$ hub serving start -m nptag
```

- The serving API is now deployed and the default port number is 8866.

- **NOTE:** If GPU is used for prediction, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise it need not be set.

- ## Step 2: Send a predictive request

- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result

```python
import requests
import json

# Data to be predicted (input string)
text = ["糖醋排骨"]

# Set the running configuration
data = {"texts": text}

# Set the prediction method to nptag and send a POST request; content-type should be set to json
url = "http://127.0.0.1:8866/predict/nptag"
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())

# Data to be predicted (input list)
text = ["糖醋排骨", "红曲霉菌"]

# Set the running configuration
data = {"texts": text}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())
```

- For more information about PaddleHub Serving, please refer to: [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)

## V. Release Note

* 1.0.0

  First release

- ```shell
$ hub install nptag==1.0.0
```
# -*- coding:utf-8 -*-
import os
import argparse

import paddle
import paddlehub as hub
from paddlehub.module.module import serving, moduleinfo, runnable
from paddlenlp import Taskflow


@moduleinfo(
    name="nptag",
    version="1.0.0",
    summary="",
    author="Baidu",
    author_email="",
    type="nlp/text_to_knowledge",
    meta=hub.NLPPredictionModule)
class NPTag(paddle.nn.Layer):
    def __init__(self,
                 batch_size=32,
                 max_seq_length=128,
                 linking=True,
                 ):
        super().__init__()  # initialize paddle.nn.Layer before assigning attributes
        self.nptag = Taskflow(
            "knowledge_mining", model="nptag", batch_size=batch_size, max_seq_length=max_seq_length, linking=linking)

    @serving
    def predict(self, texts):
        """
        The prediction interface for nptag.

        Args:
            texts(str or list[str]): the input texts to be predicted.

        Returns:
            results(list[dict]): inference results. Each element is a dictionary consisting of:
                {
                    'text': str, the input text.
                    'label': str, the predicted label.
                    'category': str, the linked WordTag category label.
                }
        """
        return self.nptag(texts)

    @runnable
    def run_cmd(self, argvs):
        """
        Run as a command
        """
        self.parser = argparse.ArgumentParser(
            description='Run the %s module.' % self.name,
            prog='hub run %s' % self.name,
            usage='%(prog)s',
            add_help=True)
        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
        self.add_module_input_arg()

        args = self.parser.parse_args(argvs)
        input_data = self.check_input_data(args)
        results = self.predict(texts=input_data)

        return results
......@@ -65,14 +65,14 @@
- ```shell
$ hub run wordtag --input_text="《孤女》是2010年九州出版社出版的小说,作者是余兼羽。"
```
- Invoke the text recognition model through the command line; for more information, see [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- Invoke the WordTag model through the command line; for more information, see [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
# Load ddparser
# Load WordTag
module = hub.Module(name="wordtag")
# String input
......
......@@ -2,6 +2,7 @@
import os
import argparse
import paddle
import paddlehub as hub
from paddlehub.module.module import serving, moduleinfo, runnable
from paddlenlp import Taskflow
......@@ -13,8 +14,9 @@ from paddlenlp import Taskflow
summary="",
author="baidu-nlp",
author_email="",
type="nlp/text_to_knowledge")
class wordtag(hub.NLPPredictionModule):
type="nlp/text_to_knowledge",
meta=hub.NLPPredictionModule)
class WordTag(paddle.nn.Layer):
def __init__(self,
batch_size=32,
max_seq_length=128,
......
......@@ -2,9 +2,9 @@
|Module Name|SkyAR|
| :--- | :---: |
|Category|Image-Image Segmentation|
|Category|Video-Video Editing|
|Network|UNet|
|Dataset|UNet|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size|206MB|
|Data indicators|-|
......@@ -71,7 +71,7 @@
## III. Module API Prediction
- ### 1、Code Example
- ### 1、Prediction Code Example
```python
import paddlehub as hub
......@@ -79,8 +79,8 @@
model = hub.Module(name='SkyAR')
model.MagicSky(
video_path=[path to input video path],
save_path=[path to save video path]
video_path="/PATH/TO/VIDEO",
save_path="/PATH/TO/SAVE/RESULT"
)
```
- ### 2、API
......
# SkyAR
|Module Name|SkyAR|
| :--- | :---: |
|Category|Video editing|
|Network|UNet|
|Dataset|-|
|Fine-tuning supported or not|No|
|Module Size|206MB|
|Data indicators|-|
|Latest update date|2021-02-26|
## I. Basic Information
- ### Application Effect Display
- Sample results:
* Input video:
![Input video](https://img-blog.csdnimg.cn/20210126142046572.gif)
* Jupiter:
![Jupiter](https://img-blog.csdnimg.cn/20210125211435619.gif)
* Rainy day:
![Rainy day](https://img-blog.csdnimg.cn/2021012521152492.gif)
* Galaxy:
![Galaxy](https://img-blog.csdnimg.cn/20210125211523491.gif)
* Ninth area spacecraft:
![Ninth area spacecraft](https://img-blog.csdnimg.cn/20210125211520955.gif)
* Input video:
![Input video](https://img-blog.csdnimg.cn/20210126142038716.gif)
* Floating castle:
![Floating castle](https://img-blog.csdnimg.cn/20210125211514997.gif)
* Thunder and lightning:
![Thunder and lightning](https://img-blog.csdnimg.cn/20210125211433591.gif)
* Super moon:
![Super moon](https://img-blog.csdnimg.cn/20210125211417524.gif)
- ### Module Introduction
- SkyAR is based on [Castle in the Sky: Dynamic Sky Replacement and Harmonization in Videos](https://arxiv.org/abs/2010.11800). It mainly consists of three parts: sky matting network, motion estimation and image fusion.
- For more information, please refer to: [SkyAR](https://github.com/jiupinjia/SkyAR)
## II. Installation
- ### 1、Environmental Dependence
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0
- ### 2、Installation
- ```shell
$ hub install SkyAR
```
- In case of any problems during installation, please refer to: [Windows_Quickstart](../../../../docs/docs_en/get_start/windows_quickstart.md)
| [Linux_Quickstart](../../../../docs/docs_en/get_start/linux_quickstart.md) | [Mac_Quickstart](../../../../docs/docs_en/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1、Prediction Code Example
```python
import paddlehub as hub
model = hub.Module(name='SkyAR')
model.MagicSky(
    video_path="/PATH/TO/VIDEO",
    save_path="/PATH/TO/SAVE/RESULT"
)
```
- ### 2、API
```python
def MagicSky(
video_path, save_path, config='jupiter',
is_rainy=False, preview_frames_num=0, is_video_sky=False, is_show=False,
skybox_img=None, skybox_video=None, rain_cap_path=None,
halo_effect=True, auto_light_matching=False,
relighting_factor=0.8, recoloring_factor=0.5, skybox_center_crop=0.5
)
```
- **Parameter**

* video_path(str): path of the input video.
* save_path(str): path to save the output video.
* config(str): SkyBox configuration; all preset configurations are as follows: `['cloudy', 'district9ship', 'floatingcastle', 'galaxy', 'jupiter', 'rainy', 'sunny', 'sunset', 'supermoon', 'thunderstorm']`. If you use a custom SkyBox, set it to None (a sketch follows this list).
* skybox_img(str): path of the custom SkyBox image.
* skybox_video(str): path of the custom SkyBox video.
* is_video_sky(bool): whether the custom SkyBox is a video.
* rain_cap_path(str): path of a custom rain video.
* is_rainy(bool): whether the sky is rainy.
* halo_effect(bool): whether to enable the halo effect.
* auto_light_matching(bool): whether to enable automatic brightness matching.
* relighting_factor(float): relighting factor.
* recoloring_factor(float): recoloring factor.
* skybox_center_crop(float): SkyBox center crop factor.
* preview_frames_num(int): number of preview frames.
* is_show(bool): whether to preview graphically.
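- A minimal sketch of swapping in a custom SkyBox image using the parameters above (the file paths are placeholders):

- ```python
import paddlehub as hub

model = hub.Module(name='SkyAR')
model.MagicSky(
    video_path="/PATH/TO/VIDEO",
    save_path="/PATH/TO/SAVE/RESULT",
    config=None,                         # presets disabled when a custom SkyBox is supplied
    skybox_img="/PATH/TO/SKYBOX/IMAGE",  # custom sky image
    is_video_sky=False,                  # the custom SkyBox is an image, not a video
)
```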
## IV. Release Note
- 1.0.0
First release
# nonlocal_kinetics400

|Module Name|nonlocal_kinetics400|
| :--- | :---: |
|Category|Video-Video Classification|
|Network|Non-local|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|129MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information of Module

- ### Module Introduction

- Non-local Neural Networks is a model proposed by Xiaolong Wang et al. in 2017. Its main feature is the non-local operation, which models the dependencies between pixels that are far apart. It borrows the idea of the non-local mean from classical computer vision and extends it to neural networks: by defining a pairwise function between each output position and all input positions, it captures global dependencies. The training data of the Non-local model is the Kinetics-400 action recognition dataset released by DeepMind. This PaddleHub Module supports prediction. A numpy sketch of the non-local operation follows.
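- A minimal numpy sketch of the dot-product form of the non-local operation described above (the projection matrices stand in for the learned 1x1 convolutions):

- ```python
import numpy as np

def non_local(x, theta_w, phi_w, g_w):
    """x: [P, C] flattened spatio-temporal positions; *_w: [C, C'] learned projections."""
    theta, phi, g = x @ theta_w, x @ phi_w, x @ g_w
    attn = theta @ phi.T                           # pairwise similarity f(x_i, x_j)
    attn = np.exp(attn - attn.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)       # softmax over all positions j
    return attn @ g                                # y_i = sum_j f(x_i, x_j) * g(x_j)

x = np.random.rand(8, 16)                          # 8 positions, 16 channels
w = [np.random.rand(16, 8) for _ in range(3)]
print(non_local(x, *w).shape)                      # (8, 8)
```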
## II. Installation

- ### 1、Environmental dependence

- paddlepaddle >= 1.4.0

- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

- ```shell
$ hub install nonlocal_kinetics400
```

- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

- ```shell
hub run nonlocal_kinetics400 --input_path "/PATH/TO/VIDEO" --use_gpu True
```

- or

- ```shell
hub run nonlocal_kinetics400 --input_file test.txt --use_gpu True
```

- test.txt stores the paths of the videos to be classified

- Note: this PaddleHub Module currently only supports running in a GPU environment. Before use, specify the GPU device with the following command (set the device ID according to your environment):

- ```shell
export CUDA_VISIBLE_DEVICES=0
```

- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

- ```python
import paddlehub as hub

# "nonlocal" is a reserved keyword in Python 3, so a different variable name is used
classifier = hub.Module(name="nonlocal_kinetics400")

test_video_path = "/PATH/TO/VIDEO"

# set input dict
input_dict = {"image": [test_video_path]}

# execute predict and print the result
results = classifier.video_classification(data=input_dict)
for result in results:
    print(result)
```

- ### 3、API

- ```python
def video_classification(data)
```

- Used for video classification prediction.

- **Parameter**

- data(dict): dict type; the key is image (str type) and the value is the list of paths of the videos to be classified.

- **Return**

- result(list\[dict\]): list type; each element is the prediction result of the corresponding input video. Each prediction result is a dict whose keys are labels and whose values are the probabilities of those labels.

## IV. Release Note

* 1.0.0

  First release

- ```shell
$ hub install nonlocal_kinetics400==1.0.0
```
# stnet_kinetics400

|Module Name|stnet_kinetics400|
| :--- | :---: |
|Category|Video-Video Classification|
|Network|StNet|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|129MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information of Module

- ### Module Introduction

- The StNet framework is the winning base network of the ActivityNet Kinetics Challenge 2018 and is implemented on top of ResNet50. The model introduces the concept of the super-image: it applies 2D convolutions on super-images to model local spatio-temporal correlations in the video (a sketch follows), models global spatio-temporal dependencies with a temporal modeling block, and finally performs long-range temporal modeling of the extracted feature sequence with a temporal Xception block. The training data of StNet is the Kinetics-400 action recognition dataset released by DeepMind. This PaddleHub Module supports prediction.
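- A minimal numpy sketch of the super-image idea described above: N consecutive frames are stacked along the channel axis so that an ordinary 2D convolution sees local temporal context (the shapes and N=4 are illustrative assumptions):

- ```python
import numpy as np

frames = np.random.rand(16, 3, 224, 224)                  # [T, C, H, W]: 16 RGB frames
N = 4                                                     # frames per super-image
super_images = frames.reshape(16 // N, N * 3, 224, 224)   # [T/N, N*C, H, W]
print(super_images.shape)                                 # (4, 12, 224, 224)
```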
## II. Installation

- ### 1、Environmental dependence

- paddlepaddle >= 1.4.0

- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

- ```shell
$ hub install stnet_kinetics400
```

- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

- ```shell
hub run stnet_kinetics400 --input_path "/PATH/TO/VIDEO"
```

- or

- ```shell
hub run stnet_kinetics400 --input_file test.txt
```

- test.txt stores the paths of the videos to be classified

- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

- ```python
import paddlehub as hub

stnet = hub.Module(name="stnet_kinetics400")

test_video_path = "/PATH/TO/VIDEO"

# set input dict
input_dict = {"image": [test_video_path]}

# execute predict and print the result
results = stnet.video_classification(data=input_dict)
for result in results:
    print(result)
```

- ### 3、API

- ```python
def video_classification(data)
```

- Used for video classification prediction.

- **Parameter**

- data(dict): dict type; the key is image (str type) and the value is the list of paths of the videos to be classified.

- **Return**

- result(list\[dict\]): list type; each element is the prediction result of the corresponding input video. Each prediction result is a dict whose keys are labels and whose values are the probabilities of those labels.

## IV. Release Note

* 1.0.0

  First release

- ```shell
$ hub install stnet_kinetics400==1.0.0
```
# tsm_kinetics400

|Module Name|tsm_kinetics400|
| :--- | :---: |
|Category|Video-Video Classification|
|Network|TSM|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|95MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information of Module

- ### Module Introduction

- TSM (Temporal Shift Module) was proposed by Ji Lin, Chuang Gan and Song Han of MIT and IBM Watson AI Lab. It improves a network's video understanding ability through temporal shift operations (a sketch follows). The training data of TSM is the Kinetics-400 action recognition dataset released by DeepMind. This PaddleHub Module supports prediction.
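- A minimal numpy sketch of the temporal shift operation described above: a fraction of the channels is shifted one step along the time axis in each direction, with zero padding at the boundaries (fold_div=8 is an assumption of this sketch):

- ```python
import numpy as np

def temporal_shift(x, fold_div=8):
    """x: [N, T, C, H, W]; shift 1/fold_div of the channels one step along T in each direction."""
    fold = x.shape[2] // fold_div
    out = np.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                   # shift towards the past
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]   # shift towards the future
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]              # remaining channels stay in place
    return out

print(temporal_shift(np.random.rand(1, 8, 16, 4, 4)).shape)  # (1, 8, 16, 4, 4)
```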
## II. Installation

- ### 1、Environmental dependence

- paddlepaddle >= 1.4.0

- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

- ```shell
$ hub install tsm_kinetics400
```

- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

- ```shell
hub run tsm_kinetics400 --input_path "/PATH/TO/VIDEO"
```

- or

- ```shell
hub run tsm_kinetics400 --input_file test.txt
```

- Note: test.txt stores the paths of the videos to be classified

- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

- ```python
import paddlehub as hub

tsm = hub.Module(name="tsm_kinetics400")

test_video_path = "/PATH/TO/VIDEO"

# set input dict
input_dict = {"image": [test_video_path]}

# execute predict and print the result
results = tsm.video_classification(data=input_dict)
for result in results:
    print(result)
```

- ### 3、API

- ```python
def video_classification(data)
```

- Used for video classification prediction.

- **Parameter**

- data(dict): dict type; the key is image (str type) and the value is the list of paths of the videos to be classified.

- **Return**

- result(list\[dict\]): list type; each element is the prediction result of the corresponding input video. Each prediction result is a dict whose keys are labels and whose values are the probabilities of those labels.

## IV. Release Note

* 1.0.0

  First release

- ```shell
$ hub install tsm_kinetics400==1.0.0
```
# tsn_kinetics400

|Module Name|tsn_kinetics400|
| :--- | :---: |
|Category|Video-Video Classification|
|Network|TSN|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|95MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information of Module

- ### Module Introduction

- TSN (Temporal Segment Network) is a classic 2D-CNN solution for video classification. It targets long-range action recognition by replacing dense frame sampling with sparse sampling (a sketch follows), which captures global video information while removing redundancy and reducing computation. The per-frame features are finally averaged into a video-level feature used for classification. The training data of TSN is the Kinetics-400 action recognition dataset released by DeepMind. This PaddleHub Module supports prediction.

- For the network architecture, please refer to the paper: [TSN](https://arxiv.org/abs/1608.00859)
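- A minimal numpy sketch of TSN-style sparse sampling described above: the video is split into K equal segments and one frame is drawn at random from each (K=8 is an assumption of this sketch):

- ```python
import numpy as np

def sample_frame_indices(num_frames, num_segments=8, seed=0):
    """Return one randomly chosen frame index per equal-length segment."""
    rng = np.random.default_rng(seed)
    edges = np.linspace(0, num_frames, num_segments + 1, dtype=int)
    return [int(rng.integers(lo, hi)) if hi > lo else int(lo)
            for lo, hi in zip(edges[:-1], edges[1:])]

print(sample_frame_indices(300))  # 8 indices, one from each of the 8 segments
```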
## II. Installation

- ### 1、Environmental dependence

- paddlepaddle >= 1.4.0

- paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

- ```shell
$ hub install tsn_kinetics400
```

- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

- ```shell
hub run tsn_kinetics400 --input_path "/PATH/TO/VIDEO"
```

- or

- ```shell
hub run tsn_kinetics400 --input_file test.txt
```

- Note: test.txt stores the paths of the videos to be classified

- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

- ```python
import paddlehub as hub

tsn = hub.Module(name="tsn_kinetics400")

test_video_path = "/PATH/TO/VIDEO"

# set input dict
input_dict = {"image": [test_video_path]}

# execute predict and print the result
results = tsn.video_classification(data=input_dict)
for result in results:
    print(result)
```

- ### 3、API

- ```python
def video_classification(data)
```

- Used for video classification prediction.

- **Parameter**

- data(dict): dict type; the key is image (str type) and the value is the list of paths of the videos to be classified.

- **Return**

- result(list\[dict\]): list type; each element is the prediction result of the corresponding input video. Each prediction result is a dict whose keys are labels and whose values are the probabilities of those labels.

## IV. Release Note

* 1.0.0

  First release

- ```shell
$ hub install tsn_kinetics400==1.0.0
```
......@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
__version__ = '2.1.0'
__version__ = 'develop'
import paddle
from packaging.version import Version
......
......@@ -159,7 +159,7 @@ class CacheUpdater(threading.Thread):
if version:
payload['version'] = version
api_url = uri_path(hubconf.server, 'search')
cache_path = os.path.join("")
cache_path = os.path.join("~")
hub_name = cache_config.hub_name
if os.path.exists(cache_path):
extra = {"command": command, "mtime": os.stat(cache_path).st_mtime, "hub_name": hub_name}
......