From 64ea4006e6a0526598467c32f3191f68db0e0f96 Mon Sep 17 00:00:00 2001 From: shinichiye <76040149+shinichiye@users.noreply.github.com> Date: Wed, 22 Dec 2021 20:57:54 +0800 Subject: [PATCH] Add modellist&upgrade readme (#1741) * Update README.md update readme * Update README.md correct some mistakes * Update README.md * Create README_en.md * Update the serial number Update the serial number and correct some mistakes * Update README.md * Create README_en.md * Create README_en.md * Update README.md * Update README.md * correct a mistake correct a mistake * Create README_en.md * Create tsn_kinetics400 * Delete tsn_kinetics400 * Create README.md * Create README.md * Create README.md * Update README.md * Create README.md * Update README.md * Update README.md * Update README.md * Update README_ch.md * Update README.md * Update README.md * Update README.md * Update README_ch.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README_ch.md * Update README.md * Update README.md * Update README_ch.md * Update and rename README.md to README_ch.md * Update README_ch.md * Create README.md * Update README.md * Update README.md * Update README_ch.md Co-authored-by: KP <109694228@qq.com> --- README.md | 14 +- README_ch.md | 14 +- modules/README.md | 547 ++++++++++++++++++ modules/README_ch.md | 546 +++++++++++++++++ .../chinese_ocr_db_crnn_mobile/README_en.md | 202 +++++++ .../senta_bilstm/README_en.md | 190 ++++++ .../text_generation/ernie_gen/README_en.md | 230 ++++++++ .../reading_pictures_writing_poems/readme.md | 14 +- .../text_review/porn_detection_cnn/README.md | 207 +++++-- .../text_review/porn_detection_gru/README.md | 208 +++++-- .../porn_detection_gru/README_en.md | 183 ++++++ .../nonlocal_kinetics400/README.md | 109 ++++ .../stnet_kinetics400/README.md | 106 ++++ .../classification/tsm_kinetics400/README.md | 106 ++++ .../classification/tsn_kinetics400/README.md | 108 ++++ 15 files changed, 2647 insertions(+), 137 deletions(-) create mode 100644 modules/README_ch.md create mode 100644 modules/image/text_recognition/chinese_ocr_db_crnn_mobile/README_en.md create mode 100644 modules/text/sentiment_analysis/senta_bilstm/README_en.md create mode 100644 modules/text/text_generation/ernie_gen/README_en.md create mode 100644 modules/text/text_review/porn_detection_gru/README_en.md create mode 100644 modules/video/classification/nonlocal_kinetics400/README.md create mode 100644 modules/video/classification/stnet_kinetics400/README.md create mode 100644 modules/video/classification/tsm_kinetics400/README.md create mode 100644 modules/video/classification/tsn_kinetics400/README.md diff --git a/README.md b/README.md index b62f41bd..54975395 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ English | [简体中文](README_ch.md)

-

QuickStart | Tutorial | Models List | Demos

+

QuickStart | Tutorial | Models List | Demos

------------------------------------------------------------------------------------------

@@ -28,7 +28,7 @@ English | [简体中文](README_ch.md)

## Introduction and Features
- **PaddleHub** aims to provide developers with rich, high-quality, and directly usable pre-trained models.
-- **Abundant Pre-trained Models**: 300+ pre-trained models cover the 5 major categories, including Image, Text, Audio, Video, and Industrial application. All of them are free for download and offline usage.
+- **Abundant Pre-trained Models**: 360+ pre-trained models cover five major categories: Image, Text, Audio, Video, and Industrial Application. All of them are free to download and can run offline.
- **No Need for Deep Learning Background**: you can put AI models to work quickly, without any deep learning expertise.
- **Quick Model Prediction**: run a prediction and see a model's output with just a few lines of script (see the quick-start sketch below).
- **Model As Service**: deploy a model as an API service with a single command.
@@ -44,8 +44,8 @@ English | [简体中文](README_ch.md)
-## Visualization Demo [[More]](./docs/docs_en/visualization.md)
-### **Computer Vision (161 models)**
+## Visualization Demo [[More]](./docs/docs_en/visualization.md) [[ModelList]](./modules)
+### **[Computer Vision (212 models)](./modules#Image)**
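As a quick-start illustration of the "Quick Model Prediction" and "Model As Service" bullets above, here is a minimal sketch. It is added for illustration and is not part of the original README; it assumes `paddlehub` has been installed (`pip install paddlehub`) and uses the `lac` lexical-analysis module listed later in this document:

```python
import paddlehub as hub

# Load a pre-trained module by name; the weights are downloaded
# automatically on first use.
lac = hub.Module(name="lac")

# Quick Model Prediction: a few lines of Python return segmented
# words with part-of-speech tags.
results = lac.cut(text=["今天是个好日子"], use_gpu=False, batch_size=1, return_tag=True)
print(results)
```

The same module works from the command line: `hub run lac --input_text "今天是个好日子"` runs a one-off prediction, and `hub serving start -m lac` should expose the module as an HTTP API (both commands as documented for PaddleHub 2.x).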
@@ -53,7 +53,7 @@ English | [简体中文](README_ch.md) - Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)、[PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection)、[PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN)、[AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2)、[openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose)、[PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg)、[Zhengxia Zou](https://github.com/jiupinjia/SkyAR)、[PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models, you can try to train your models with them. -### **Natural Language Processing (129 models)** +### **[Natural Language Processing (130 models)](./modules#Text)**
@@ -62,7 +62,7 @@ English | [简体中文](README_ch.md) -### Speech (3 models) +### [Speech (15 models)](./modules#Audio) - TTS speech synthesis algorithm, multiple algorithms are available. - Many thanks to CopyRight@[Parakeet](https://github.com/PaddlePaddle/Parakeet) for the pre-trained models, you can try to train your models with Parakeet. - Input: `Life was like a box of chocolates, you never know what you're gonna get.` @@ -95,7 +95,7 @@ English | [简体中文](README_ch.md) -### Video (8 models) +### [Video (8 models)](./modules#Video) - Short video classification trained via large-scale video datasets, supports 3000+ tag types prediction for short Form Videos. - Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained model, you can try to train your models with PaddleVideo. - `Example: Input a short video of swimming, the algorithm can output the result of "swimming"` diff --git a/README_ch.md b/README_ch.md index 0214cc8f..ac3c4094 100644 --- a/README_ch.md +++ b/README_ch.md @@ -4,7 +4,7 @@

-

快速开始 | 教程文档 | 模型搜索 | 演示Demo +

快速开始 | 教程文档 | 模型库 | 演示Demo

@@ -30,7 +30,7 @@ ## 简介与特性 - PaddleHub旨在为开发者提供丰富的、高质量的、直接可用的预训练模型 -- **【模型种类丰富】**: 涵盖CV、NLP、Audio、Video、工业应用主流五大品类的 300+ 预训练模型,全部开源下载,离线可运行 +- **【模型种类丰富】**: 涵盖CV、NLP、Audio、Video、工业应用主流五大品类的 **360+** 预训练模型,全部开源下载,离线可运行 - **【超低使用门槛】**:无需深度学习背景、无需数据与训练过程,可快速使用AI模型 - **【一键模型快速预测】**:通过一行命令行或者极简的Python API实现模型调用,可快速体验模型效果 - **【一键模型转服务化】**:一行命令,搭建深度学习模型API服务化部署能力 @@ -47,9 +47,9 @@ -## **精品模型效果展示[【更多】](./docs/docs_ch/visualization.md)** +## **精品模型效果展示[【更多】](./docs/docs_ch/visualization.md)[【模型库】](./modules/README_ch.md)** -### **图像类(161个)** +### **[图像类(212个)](./modules/README_ch.md#图像)** - 包括图像分类、人脸检测、口罩检测、车辆检测、人脸/人体/手部关键点检测、人像分割、80+语言文本识别、图像超分/上色/动漫化等
@@ -58,7 +58,7 @@ - 感谢CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)、[PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection)、[PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN)、[AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2)、[openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose)、[PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg)、[Zhengxia Zou](https://github.com/jiupinjia/SkyAR)、[PaddleClas](https://github.com/PaddlePaddle/PaddleClas) 提供相关预训练模型,训练能力开放,欢迎体验。 -### **文本类(129个)** +### **[文本类(130个)](./modules/README_ch.md#文本)** - 包括中文分词、词性标注与命名实体识别、句法分析、AI写诗/对联/情话/藏头诗、中文的评论情感分析、中文色情文本审核等
@@ -67,7 +67,7 @@ - 感谢CopyRight@[ERNIE](https://github.com/PaddlePaddle/ERNIE)、[LAC](https://github.com/baidu/LAC)、[DDParser](https://github.com/baidu/DDParser)提供相关预训练模型,训练能力开放,欢迎体验。 -### **语音类(3个)** +### **[语音类(15个)](./modules/README_ch.md#语音)** - TTS语音合成算法,多种算法可选 - 感谢CopyRight@[Parakeet](https://github.com/PaddlePaddle/Parakeet)提供预训练模型,训练能力开放,欢迎体验。 - 输入:`Life was like a box of chocolates, you never know what you're gonna get.` @@ -100,7 +100,7 @@
-### **视频类(8个)** +### **[视频类(8个)](./modules/README_ch.md#视频)** - 包含短视频分类,支持3000+标签种类,可输出TOP-K标签,多种算法可选。 - 感谢CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo)提供预训练模型,训练能力开放,欢迎体验。 - `举例:输入一段游泳的短视频,算法可以输出"游泳"结果` diff --git a/modules/README.md b/modules/README.md index e69de29b..7f1e2938 100644 --- a/modules/README.md +++ b/modules/README.md @@ -0,0 +1,547 @@ +English | [简体中文](README_ch.md) + +# CONTENTS +|[Image](#Image) (212)|[Text](#Text) (130)|[Audio](#Audio) (15)|[Video](#Video) (8)|[Industrial Application](#Industrial-Application) (1)| +|--|--|--|--|--| +|[Image Classification](#Image-Classification) (108)|[Text Generation](#Text-Generation) (17)| [Voice Cloning](#Voice-Cloning) (2)|[Video Classification](#Video-Classification) (5)| [Meter Detection](#Meter-Detection) (1)| +|[Image Generation](#Image-Generation) (26)|[Word Embedding](#Word-Embedding) (62)|[Text to Speech](#Text-to-Speech) (5)|[Video Editing](#Video-Editing) (1)|-| +|[Keypoint Detection](#Keypoint-Detection) (5)|[Machine Translation](#Machine-Translation) (2)|[Automatic Speech Recognition](#Automatic-Speech-Recognition) (5)|[Multiple Object tracking](#Multiple-Object-tracking) (2)|-| +|[Semantic Segmentation](#Semantic-Segmentation) (25)|[Language Model](#Language-Model) (30)|[Audio Classification](#Audio-Classification) (3)| -|-| +|[Face Detection](#Face-Detection) (7)|[Sentiment Analysis](#Sentiment-Analysis) (7)|-|-|-| +|[Text Recognition](#Text-Recognition) (17)|[Syntactic Analysis](#Syntactic-Analysis) (1)|-|-|-| +|[Image Editing](#Image-Editing) (8)|[Simultaneous Translation](#Simultaneous-Translation) (5)|-|-|-| +|[Instance Segmentation](#Instance-Segmentation) (1)|[Lexical Analysis](#Lexical-Analysis) (2)|-|-|-| +|[Object Detection](#Object-Detection) (13)|[Punctuation Restoration](#Punctuation-Restoration) (1)|-|-|-| +|[Depth Estimation](#Depth-Estimation) (2)|[Text Review](#Text-Review) (3)|-|-|-| + +## Image + - ### Image Classification + +
expand
+ +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[DriverStatusRecognition](image/classification/DriverStatusRecognition)|MobileNetV3_small_ssld|分心司机检测数据集|| +|[mobilenet_v2_animals](image/classification/mobilenet_v2_animals)|MobileNet_v2|百度自建动物数据集|| +|[repvgg_a1_imagenet](image/classification/repvgg_a1_imagenet)|RepVGG|ImageNet-2012|| +|[repvgg_a0_imagenet](image/classification/repvgg_a0_imagenet)|RepVGG|ImageNet-2012|| +|[resnext152_32x4d_imagenet](image/classification/resnext152_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[resnet_v2_152_imagenet](image/classification/resnet_v2_152_imagenet)|ResNet V2|ImageNet-2012|| +|[resnet50_vd_animals](image/classification/resnet50_vd_animals)|ResNet50_vd|百度自建动物数据集|| +|[food_classification](image/classification/food_classification)|ResNet50_vd_ssld|美食数据集|| +|[mobilenet_v3_large_imagenet_ssld](image/classification/mobilenet_v3_large_imagenet_ssld)|Mobilenet_v3_large|ImageNet-2012|| +|[resnext152_vd_32x4d_imagenet](image/classification/resnext152_vd_32x4d_imagenet)|||| +|[ghostnet_x1_3_imagenet_ssld](image/classification/ghostnet_x1_3_imagenet_ssld)|GhostNet|ImageNet-2012|| +|[rexnet_1_5_imagenet](image/classification/rexnet_1_5_imagenet)|ReXNet|ImageNet-2012|| +|[resnext50_64x4d_imagenet](image/classification/resnext50_64x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[resnext101_64x4d_imagenet](image/classification/resnext101_64x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[efficientnetb0_imagenet](image/classification/efficientnetb0_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb1_imagenet](image/classification/efficientnetb1_imagenet)|EfficientNet|ImageNet-2012|| +|[mobilenet_v2_imagenet_ssld](image/classification/mobilenet_v2_imagenet_ssld)|Mobilenet_v2|ImageNet-2012|| +|[resnet50_vd_dishes](image/classification/resnet50_vd_dishes)|ResNet50_vd|百度自建菜品数据集|| +|[pnasnet_imagenet](image/classification/pnasnet_imagenet)|PNASNet|ImageNet-2012|| +|[rexnet_2_0_imagenet](image/classification/rexnet_2_0_imagenet)|ReXNet|ImageNet-2012|| +|[SnakeIdentification](image/classification/SnakeIdentification)|ResNet50_vd_ssld|蛇种数据集|| +|[hrnet40_imagenet](image/classification/hrnet40_imagenet)|HRNet|ImageNet-2012|| +|[resnet_v2_34_imagenet](image/classification/resnet_v2_34_imagenet)|ResNet V2|ImageNet-2012|| +|[mobilenet_v2_dishes](image/classification/mobilenet_v2_dishes)|MobileNet_v2|百度自建菜品数据集|| +|[resnext101_vd_32x4d_imagenet](image/classification/resnext101_vd_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[repvgg_b2g4_imagenet](image/classification/repvgg_b2g4_imagenet)|RepVGG|ImageNet-2012|| +|[fix_resnext101_32x48d_wsl_imagenet](image/classification/fix_resnext101_32x48d_wsl_imagenet)|ResNeXt|ImageNet-2012|| +|[vgg13_imagenet](image/classification/vgg13_imagenet)|VGG|ImageNet-2012|| +|[se_resnext101_32x4d_imagenet](image/classification/se_resnext101_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012|| +|[hrnet30_imagenet](image/classification/hrnet30_imagenet)|HRNet|ImageNet-2012|| +|[ghostnet_x1_3_imagenet](image/classification/ghostnet_x1_3_imagenet)|GhostNet|ImageNet-2012|| +|[dpn107_imagenet](image/classification/dpn107_imagenet)|DPN|ImageNet-2012|| +|[densenet161_imagenet](image/classification/densenet161_imagenet)|DenseNet|ImageNet-2012|| +|[vgg19_imagenet](image/classification/vgg19_imagenet)|vgg19_imagenet|ImageNet-2012|| +|[mobilenet_v2_imagenet](image/classification/mobilenet_v2_imagenet)|Mobilenet_v2|ImageNet-2012|| +|[resnet50_vd_10w](image/classification/resnet50_vd_10w)|ResNet_vd|百度自建数据集|| 
+|[resnet_v2_101_imagenet](image/classification/resnet_v2_101_imagenet)|ResNet V2 101|ImageNet-2012|| +|[darknet53_imagenet](image/classification/darknet53_imagenet)|DarkNet|ImageNet-2012|| +|[se_resnext50_32x4d_imagenet](image/classification/se_resnext50_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012|| +|[se_hrnet64_imagenet_ssld](image/classification/se_hrnet64_imagenet_ssld)|HRNet|ImageNet-2012|| +|[resnext101_32x16d_wsl](image/classification/resnext101_32x16d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[hrnet18_imagenet](image/classification/hrnet18_imagenet)|HRNet|ImageNet-2012|| +|[spinalnet_res101_gemstone](image/classification/spinalnet_res101_gemstone)|resnet101|gemstone|| +|[densenet264_imagenet](image/classification/densenet264_imagenet)|DenseNet|ImageNet-2012|| +|[resnext50_vd_32x4d_imagenet](image/classification/resnext50_vd_32x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[SpinalNet_Gemstones](image/classification/SpinalNet_Gemstones)|||| +|[spinalnet_vgg16_gemstone](image/classification/spinalnet_vgg16_gemstone)|vgg16|gemstone|| +|[xception71_imagenet](image/classification/xception71_imagenet)|Xception|ImageNet-2012|| +|[repvgg_b2_imagenet](image/classification/repvgg_b2_imagenet)|RepVGG|ImageNet-2012|| +|[dpn68_imagenet](image/classification/dpn68_imagenet)|DPN|ImageNet-2012|| +|[alexnet_imagenet](image/classification/alexnet_imagenet)|AlexNet|ImageNet-2012|| +|[rexnet_1_3_imagenet](image/classification/rexnet_1_3_imagenet)|ReXNet|ImageNet-2012|| +|[hrnet64_imagenet](image/classification/hrnet64_imagenet)|HRNet|ImageNet-2012|| +|[efficientnetb7_imagenet](image/classification/efficientnetb7_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb0_small_imagenet](image/classification/efficientnetb0_small_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb6_imagenet](image/classification/efficientnetb6_imagenet)|EfficientNet|ImageNet-2012|| +|[hrnet48_imagenet](image/classification/hrnet48_imagenet)|HRNet|ImageNet-2012|| +|[rexnet_3_0_imagenet](image/classification/rexnet_3_0_imagenet)|ReXNet|ImageNet-2012|| +|[shufflenet_v2_imagenet](image/classification/shufflenet_v2_imagenet)|ShuffleNet V2|ImageNet-2012|| +|[ghostnet_x0_5_imagenet](image/classification/ghostnet_x0_5_imagenet)|GhostNet|ImageNet-2012|| +|[inception_v4_imagenet](image/classification/inception_v4_imagenet)|Inception_V4|ImageNet-2012|| +|[resnext101_vd_64x4d_imagenet](image/classification/resnext101_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[densenet201_imagenet](image/classification/densenet201_imagenet)|DenseNet|ImageNet-2012|| +|[vgg16_imagenet](image/classification/vgg16_imagenet)|VGG|ImageNet-2012|| +|[mobilenet_v3_small_imagenet_ssld](image/classification/mobilenet_v3_small_imagenet_ssld)|Mobilenet_v3_Small|ImageNet-2012|| +|[hrnet18_imagenet_ssld](image/classification/hrnet18_imagenet_ssld)|HRNet|ImageNet-2012|| +|[resnext152_64x4d_imagenet](image/classification/resnext152_64x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[efficientnetb3_imagenet](image/classification/efficientnetb3_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb2_imagenet](image/classification/efficientnetb2_imagenet)|EfficientNet|ImageNet-2012|| +|[repvgg_b1g4_imagenet](image/classification/repvgg_b1g4_imagenet)|RepVGG|ImageNet-2012|| +|[resnext101_32x4d_imagenet](image/classification/resnext101_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[resnext50_32x4d_imagenet](image/classification/resnext50_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[repvgg_a2_imagenet](image/classification/repvgg_a2_imagenet)|RepVGG|ImageNet-2012|| 
+|[resnext152_vd_64x4d_imagenet](image/classification/resnext152_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[xception41_imagenet](image/classification/xception41_imagenet)|Xception|ImageNet-2012|| +|[googlenet_imagenet](image/classification/googlenet_imagenet)|GoogleNet|ImageNet-2012|| +|[resnet50_vd_imagenet_ssld](image/classification/resnet50_vd_imagenet_ssld)|ResNet_vd|ImageNet-2012|| +|[repvgg_b1_imagenet](image/classification/repvgg_b1_imagenet)|RepVGG|ImageNet-2012|| +|[repvgg_b0_imagenet](image/classification/repvgg_b0_imagenet)|RepVGG|ImageNet-2012|| +|[resnet_v2_50_imagenet](image/classification/resnet_v2_50_imagenet)|ResNet V2|ImageNet-2012|| +|[rexnet_1_0_imagenet](image/classification/rexnet_1_0_imagenet)|ReXNet|ImageNet-2012|| +|[resnet_v2_18_imagenet](image/classification/resnet_v2_18_imagenet)|ResNet V2|ImageNet-2012|| +|[resnext101_32x8d_wsl](image/classification/resnext101_32x8d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[efficientnetb4_imagenet](image/classification/efficientnetb4_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb5_imagenet](image/classification/efficientnetb5_imagenet)|EfficientNet|ImageNet-2012|| +|[repvgg_b1g2_imagenet](image/classification/repvgg_b1g2_imagenet)|RepVGG|ImageNet-2012|| +|[resnext101_32x48d_wsl](image/classification/resnext101_32x48d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[resnet50_vd_wildanimals](image/classification/resnet50_vd_wildanimals)|ResNet_vd|IFAW 自建野生动物数据集|| +|[nasnet_imagenet](image/classification/nasnet_imagenet)|NASNet|ImageNet-2012|| +|[se_resnet18_vd_imagenet](image/classification/se_resnet18_vd_imagenet)|||| +|[spinalnet_res50_gemstone](image/classification/spinalnet_res50_gemstone)|resnet50|gemstone|| +|[resnext50_vd_64x4d_imagenet](image/classification/resnext50_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[resnext101_32x32d_wsl](image/classification/resnext101_32x32d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[dpn131_imagenet](image/classification/dpn131_imagenet)|DPN|ImageNet-2012|| +|[xception65_imagenet](image/classification/xception65_imagenet)|Xception|ImageNet-2012|| +|[repvgg_b3g4_imagenet](image/classification/repvgg_b3g4_imagenet)|RepVGG|ImageNet-2012|| +|[marine_biometrics](image/classification/marine_biometrics)|ResNet50_vd_ssld|Fish4Knowledge|| +|[res2net101_vd_26w_4s_imagenet](image/classification/res2net101_vd_26w_4s_imagenet)|Res2Net|ImageNet-2012|| +|[dpn98_imagenet](image/classification/dpn98_imagenet)|DPN|ImageNet-2012|| +|[resnet18_vd_imagenet](image/classification/resnet18_vd_imagenet)|ResNet_vd|ImageNet-2012|| +|[densenet121_imagenet](image/classification/densenet121_imagenet)|DenseNet|ImageNet-2012|| +|[vgg11_imagenet](image/classification/vgg11_imagenet)|VGG|ImageNet-2012|| +|[hrnet44_imagenet](image/classification/hrnet44_imagenet)|HRNet|ImageNet-2012|| +|[densenet169_imagenet](image/classification/densenet169_imagenet)|DenseNet|ImageNet-2012|| +|[hrnet32_imagenet](image/classification/hrnet32_imagenet)|HRNet|ImageNet-2012|| +|[dpn92_imagenet](image/classification/dpn92_imagenet)|DPN|ImageNet-2012|| +|[ghostnet_x1_0_imagenet](image/classification/ghostnet_x1_0_imagenet)|GhostNet|ImageNet-2012|| +|[hrnet48_imagenet_ssld](image/classification/hrnet48_imagenet_ssld)|HRNet|ImageNet-2012|| + +
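Every classifier in the table above follows the same module interface. The snippet below is an illustrative sketch, not part of the original list; it assumes `paddlehub` and `opencv-python` are installed and that `test.jpg` exists. The exact prediction method can differ slightly between modules, so check each module's own README:

```python
import cv2
import paddlehub as hub

# Load one of the ImageNet classifiers listed above.
classifier = hub.Module(name="mobilenet_v2_imagenet")

# classification() takes a list of BGR images (numpy arrays) and
# returns the top labels with confidence scores for each image.
results = classifier.classification(images=[cv2.imread("test.jpg")])
print(results)
```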
+ + + - ### Image Generation + +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[pixel2style2pixel](image/Image_gan/gan/pixel2style2pixel/)|Pixel2Style2Pixel|-|人脸转正| +|[stgan_bald](image/Image_gan/gan/stgan_bald/)|STGAN|CelebA|秃头生成器| +|[styleganv2_editing](image/Image_gan/gan/styleganv2_editing)|StyleGAN V2|-|人脸编辑| +|[wav2lip](image/Image_gan/gan/wav2lip)|wav2lip|LRS2|唇形生成| +|[attgan_celeba](image/Image_gan/attgan_celeba/)|AttGAN|Celeba|人脸编辑| +|[cyclegan_cityscapes](image/Image_gan/cyclegan_cityscapes)|CycleGAN|Cityscapes|实景图和语义分割结果互相转换| +|[stargan_celeba](image/Image_gan/stargan_celeba)|StarGAN|Celeba|人脸编辑| +|[stgan_celeba](image/Image_gan/stgan_celeba/)|STGAN|Celeba|人脸编辑| +|[ID_Photo_GEN](image/Image_gan/style_transfer/ID_Photo_GEN)|HRNet_W18|-|证件照生成| +|[Photo2Cartoon](image/Image_gan/style_transfer/Photo2Cartoon)|U-GAT-IT|cartoon_data|人脸卡通化| +|[U2Net_Portrait](image/Image_gan/style_transfer/U2Net_Portrait)|U^2Net|-|人脸素描化| +|[UGATIT_100w](image/Image_gan/style_transfer/UGATIT_100w)|U-GAT-IT|selfie2anime|人脸动漫化| +|[UGATIT_83w](image/Image_gan/style_transfer/UGATIT_83w)|U-GAT-IT|selfie2anime|人脸动漫化| +|[UGATIT_92w](image/Image_gan/style_transfer/UGATIT_92w)| U-GAT-IT|selfie2anime|人脸动漫化| +|[animegan_v1_hayao_60](image/Image_gan/style_transfer/animegan_v1_hayao_60)|AnimeGAN|The Wind Rises|图像风格迁移-宫崎骏| +|[animegan_v2_hayao_64](image/Image_gan/style_transfer/animegan_v2_hayao_64)|AnimeGAN|The Wind Rises|图像风格迁移-宫崎骏| +|[animegan_v2_hayao_99](image/Image_gan/style_transfer/animegan_v2_hayao_99)|AnimeGAN|The Wind Rises|图像风格迁移-宫崎骏| +|[animegan_v2_paprika_54](image/Image_gan/style_transfer/animegan_v2_paprika_54)|AnimeGAN|Paprika|图像风格迁移-今敏| +|[animegan_v2_paprika_74](image/Image_gan/style_transfer/animegan_v2_paprika_74)|AnimeGAN|Paprika|图像风格迁移-今敏| +|[animegan_v2_paprika_97](image/Image_gan/style_transfer/animegan_v2_paprika_97)|AnimeGAN|Paprika|图像风格迁移-今敏| +|[animegan_v2_paprika_98](image/Image_gan/style_transfer/animegan_v2_paprika_98)|AnimeGAN|Paprika|图像风格迁移-今敏| +|[animegan_v2_shinkai_33](image/Image_gan/style_transfer/animegan_v2_shinkai_33)|AnimeGAN|Your Name, Weathering with you|图像风格迁移-新海诚| +|[animegan_v2_shinkai_53](image/Image_gan/style_transfer/animegan_v2_shinkai_53)|AnimeGAN|Your Name, Weathering with you|图像风格迁移-新海诚| +|[msgnet](image/Image_gan/style_transfer/msgnet)|msgnet|COCO2014| +|[stylepro_artistic](image/Image_gan/style_transfer/stylepro_artistic)|StyleProNet|MS-COCO + WikiArt|艺术风格迁移| +|stylegan_ffhq|StyleGAN|FFHQ|图像风格迁移| + + - ### Keypoint Detection + +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[face_landmark_localization](image/keypoint_detection/face_landmark_localization)|Face_Landmark|AFW/AFLW|人脸关键点检测| +|[hand_pose_localization](image/keypoint_detection/hand_pose_localization)|-|MPII, NZSL|手部关键点检测| +|[openpose_body_estimation](image/keypoint_detection/openpose_body_estimation)|two-branch multi-stage CNN|MPII, COCO 2016|肢体关键点检测| +|[human_pose_estimation_resnet50_mpii](image/keypoint_detection/human_pose_estimation_resnet50_mpii)|Pose_Resnet50|MPII|人体骨骼关键点检测 +|[openpose_hands_estimation](image/keypoint_detection/openpose_hands_estimation)|-|MPII, NZSL|手部关键点检测| + + - ### Semantic Segmentation + +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[deeplabv3p_xception65_humanseg](image/semantic_segmentation/deeplabv3p_xception65_humanseg)|deeplabv3p|百度自建数据集|人像分割| +|[humanseg_server](image/semantic_segmentation/humanseg_server)|deeplabv3p|百度自建数据集|人像分割| +|[humanseg_mobile](image/semantic_segmentation/humanseg_mobile)|hrnet|百度自建数据集|人像分割-移动端前置摄像头| 
+|[humanseg_lite](image/semantic_segmentation/humanseg_lite)|shufflenet|百度自建数据集|轻量级人像分割-移动端实时|
+|[ExtremeC3_Portrait_Segmentation](image/semantic_segmentation/ExtremeC3_Portrait_Segmentation)|ExtremeC3|EG1800, Baidu fashion dataset|轻量化人像分割|
+|[SINet_Portrait_Segmentation](image/semantic_segmentation/SINet_Portrait_Segmentation)|SINet|EG1800, Baidu fashion dataset|轻量化人像分割|
+|[FCN_HRNet_W18_Face_Seg](image/semantic_segmentation/FCN_HRNet_W18_Face_Seg)|FCN_HRNet_W18|-|人像分割|
+|[ace2p](image/semantic_segmentation/ace2p)|ACE2P|LIP|人体解析|
+|[Pneumonia_CT_LKM_PP](image/semantic_segmentation/Pneumonia_CT_LKM_PP)|U-NET+|连心医疗授权脱敏数据集|肺炎CT影像分析|
+|[Pneumonia_CT_LKM_PP_lung](image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung)|U-NET+|连心医疗授权脱敏数据集|肺炎CT影像分析|
+|[ocrnet_hrnetw18_voc](image/semantic_segmentation/ocrnet_hrnetw18_voc)|ocrnet, hrnet|PascalVoc2012|
+|[U2Net](image/semantic_segmentation/U2Net)|U^2Net|-|图像前景背景分割|
+|[U2Netp](image/semantic_segmentation/U2Netp)|U^2Net|-|图像前景背景分割|
+|[Extract_Line_Draft](image/semantic_segmentation/Extract_Line_Draft)|UNet|Pixiv|线稿提取|
+|[unet_cityscapes](image/semantic_segmentation/unet_cityscapes)|UNet|cityscapes|
+|[ocrnet_hrnetw18_cityscapes](image/semantic_segmentation/ocrnet_hrnetw18_cityscapes)|ocrnet_hrnetw18|cityscapes|
+|[hardnet_cityscapes](image/semantic_segmentation/hardnet_cityscapes)|hardnet|cityscapes|
+|[fcn_hrnetw48_voc](image/semantic_segmentation/fcn_hrnetw48_voc)|fcn_hrnetw48|PascalVoc2012|
+|[fcn_hrnetw48_cityscapes](image/semantic_segmentation/fcn_hrnetw48_cityscapes)|fcn_hrnetw48|cityscapes|
+|[fcn_hrnetw18_voc](image/semantic_segmentation/fcn_hrnetw18_voc)|fcn_hrnetw18|PascalVoc2012|
+|[fcn_hrnetw18_cityscapes](image/semantic_segmentation/fcn_hrnetw18_cityscapes)|fcn_hrnetw18|cityscapes|
+|[fastscnn_cityscapes](image/semantic_segmentation/fastscnn_cityscapes)|fastscnn|cityscapes|
+|[deeplabv3p_resnet50_voc](image/semantic_segmentation/deeplabv3p_resnet50_voc)|deeplabv3p, resnet50|PascalVoc2012|
+|[deeplabv3p_resnet50_cityscapes](image/semantic_segmentation/deeplabv3p_resnet50_cityscapes)|deeplabv3p, resnet50|cityscapes|
+|[bisenetv2_cityscapes](image/semantic_segmentation/bisenetv2_cityscapes)|bisenetv2|cityscapes|
+
+
+ - ### Face Detection
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[pyramidbox_lite_mobile](image/face_detection/pyramidbox_lite_mobile)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸检测-移动端|
+|[pyramidbox_lite_mobile_mask](image/face_detection/pyramidbox_lite_mobile_mask)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸口罩检测-移动端|
+|[pyramidbox_lite_server_mask](image/face_detection/pyramidbox_lite_server_mask)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸口罩检测|
+|[ultra_light_fast_generic_face_detector_1mb_640](image/face_detection/ultra_light_fast_generic_face_detector_1mb_640)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE数据集|轻量级通用人脸检测-低算力设备|
+|[ultra_light_fast_generic_face_detector_1mb_320](image/face_detection/ultra_light_fast_generic_face_detector_1mb_320)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE数据集|轻量级通用人脸检测-低算力设备|
+|[pyramidbox_lite_server](image/face_detection/pyramidbox_lite_server)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸检测|
+|[pyramidbox_face_detection](image/face_detection/pyramidbox_face_detection)|PyramidBox|WIDER FACE数据集|人脸检测|
+
+ - ### Text Recognition
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[chinese_ocr_db_crnn_mobile](image/text_recognition/chinese_ocr_db_crnn_mobile)|Differentiable Binarization+RCNN|icdar2015数据集|中文文字识别|
+|[chinese_text_detection_db_mobile](image/text_recognition/chinese_text_detection_db_mobile)|Differentiable Binarization|icdar2015数据集|中文文本检测|
+|[chinese_text_detection_db_server](image/text_recognition/chinese_text_detection_db_server)|Differentiable Binarization|icdar2015数据集|中文文本检测|
+|[chinese_ocr_db_crnn_server](image/text_recognition/chinese_ocr_db_crnn_server)|Differentiable Binarization+RCNN|icdar2015数据集|中文文字识别|
+|[Vehicle_License_Plate_Recognition](image/text_recognition/Vehicle_License_Plate_Recognition)|-|CCPD|车牌识别|
+|[chinese_cht_ocr_db_crnn_mobile](image/text_recognition/chinese_cht_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|繁体中文文字识别|
+|[japan_ocr_db_crnn_mobile](image/text_recognition/japan_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|日文文字识别|
+|[korean_ocr_db_crnn_mobile](image/text_recognition/korean_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|韩文文字识别|
+|[german_ocr_db_crnn_mobile](image/text_recognition/german_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|德文文字识别|
+|[french_ocr_db_crnn_mobile](image/text_recognition/french_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|法文文字识别|
+|[latin_ocr_db_crnn_mobile](image/text_recognition/latin_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|拉丁文文字识别|
+|[cyrillic_ocr_db_crnn_mobile](image/text_recognition/cyrillic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|斯拉夫文文字识别|
+|[multi_languages_ocr_db_crnn](image/text_recognition/multi_languages_ocr_db_crnn)|Differentiable Binarization+RCNN|icdar2015数据集|多语言文字识别|
+|[kannada_ocr_db_crnn_mobile](image/text_recognition/kannada_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|卡纳达文文字识别|
+|[arabic_ocr_db_crnn_mobile](image/text_recognition/arabic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|阿拉伯文文字识别|
+|[telugu_ocr_db_crnn_mobile](image/text_recognition/telugu_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|泰卢固文文字识别|
+|[devanagari_ocr_db_crnn_mobile](image/text_recognition/devanagari_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|梵文文字识别|
+|[tamil_ocr_db_crnn_mobile](image/text_recognition/tamil_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|泰米尔文文字识别|
+
+ - ### Image Editing
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[realsr](image/Image_editing/super_resolution/realsr)|LP-KPN|RealSR dataset|图像/视频超分-4倍|
+|[deoldify](image/Image_editing/colorization/deoldify)|GAN|ILSVRC 2012|黑白照片/视频着色|
+|[photo_restoration](image/Image_editing/colorization/photo_restoration)|基于deoldify和realsr模型|-|老照片修复|
+|[user_guided_colorization](image/Image_editing/colorization/user_guided_colorization)|siggraph|ILSVRC 2012|图像着色|
+|[falsr_c](image/Image_editing/super_resolution/falsr_c)|falsr_c|DIV2k|轻量化超分-2倍|
+|[dcscn](image/Image_editing/super_resolution/dcscn)|dcscn|DIV2k|轻量化超分-2倍|
+|[falsr_a](image/Image_editing/super_resolution/falsr_a)|falsr_a|DIV2k|轻量化超分-2倍|
+|[falsr_b](image/Image_editing/super_resolution/falsr_b)|falsr_b|DIV2k|轻量化超分-2倍|
+
+ - ### Instance Segmentation
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[solov2](image/instance_segmentation/solov2)|-|COCO2014|实例分割|
+
+ - ### Object Detection
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[faster_rcnn_resnet50_coco2017](image/object_detection/faster_rcnn_resnet50_coco2017)|faster_rcnn|COCO2017||
+|[ssd_vgg16_512_coco2017](image/object_detection/ssd_vgg16_512_coco2017)|SSD|COCO2017|| +|[faster_rcnn_resnet50_fpn_venus](image/object_detection/faster_rcnn_resnet50_fpn_venus)|faster_rcnn|百度自建数据集|大规模通用目标检测| +|[ssd_vgg16_300_coco2017](image/object_detection/ssd_vgg16_300_coco2017)|||| +|[yolov3_resnet34_coco2017](image/object_detection/yolov3_resnet34_coco2017)|YOLOv3|COCO2017|| +|[yolov3_darknet53_pedestrian](image/object_detection/yolov3_darknet53_pedestrian)|YOLOv3|百度自建大规模行人数据集|行人检测| +|[yolov3_mobilenet_v1_coco2017](image/object_detection/yolov3_mobilenet_v1_coco2017)|YOLOv3|COCO2017|| +|[ssd_mobilenet_v1_pascal](image/object_detection/ssd_mobilenet_v1_pascal)|SSD|PASCAL VOC|| +|[faster_rcnn_resnet50_fpn_coco2017](image/object_detection/faster_rcnn_resnet50_fpn_coco2017)|faster_rcnn|COCO2017|| +|[yolov3_darknet53_coco2017](image/object_detection/yolov3_darknet53_coco2017)|YOLOv3|COCO2017|| +|[yolov3_darknet53_vehicles](image/object_detection/yolov3_darknet53_vehicles)|YOLOv3|百度自建大规模车辆数据集|车辆检测| +|[yolov3_darknet53_venus](image/object_detection/yolov3_darknet53_venus)|YOLOv3|百度自建数据集|大规模通用检测| +|[yolov3_resnet50_vd_coco2017](image/object_detection/yolov3_resnet50_vd_coco2017)|YOLOv3|COCO2017|| + + - ### Depth Estimation + +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[MiDaS_Large](image/depth_estimation/MiDaS_Large)|-|3D Movies, WSVD, ReDWeb, MegaDepth|| +|[MiDaS_Small](image/depth_estimation/MiDaS_Small)|-|3D Movies, WSVD, ReDWeb, MegaDepth, etc.|| + +## Text + - ### Text Generation + +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[ernie_gen](text/text_generation/ernie_gen)|ERNIE-GEN|-|面向生成任务的预训练-微调框架| +|[ernie_gen_poetry](text/text_generation/ernie_gen_poetry)|ERNIE-GEN|开源诗歌数据集|诗歌生成| +|[ernie_gen_couplet](text/text_generation/ernie_gen_couplet)|ERNIE-GEN|开源对联数据集|对联生成| +|[ernie_gen_lover_words](text/text_generation/ernie_gen_lover_words)|ERNIE-GEN|网络情诗、情话数据|情话生成| +|[ernie_tiny_couplet](text/text_generation/ernie_tiny_couplet)|Eernie_tiny|开源对联数据集|对联生成| +|[ernie_gen_acrostic_poetry](text/text_generation/ernie_gen_acrostic_poetry)|ERNIE-GEN|开源诗歌数据集|藏头诗生成| +|[Rumor_prediction](text/text_generation/Rumor_prediction)|-|新浪微博中文谣言数据|谣言预测| +|[plato-mini](text/text_generation/plato-mini)|Unified Transformer|十亿级别的中文对话数据|中文对话| +|[plato2_en_large](text/text_generation/plato2_en_large)|plato2|开放域多轮数据集|超大规模生成式对话| +|[plato2_en_base](text/text_generation/plato2_en_base)|plato2|开放域多轮数据集|超大规模生成式对话| +|[CPM_LM](text/text_generation/CPM_LM)|GPT-2|自建数据集|中文文本生成| +|[unified_transformer-12L-cn](text/text_generation/unified_transformer-12L-cn)|Unified Transformer|千万级别中文会话数据|人机多轮对话| +|[unified_transformer-12L-cn-luge](text/text_generation/unified_transformer-12L-cn-luge)|Unified Transformer|千言对话数据集|人机多轮对话| +|[reading_pictures_writing_poems](text/text_generation/reading_pictures_writing_poems)|多网络级联|-|看图写诗| +|[GPT2_CPM_LM](text/text_generation/GPT2_CPM_LM)|||问答类文本生成| +|[GPT2_Base_CN](text/text_generation/GPT2_Base_CN)|||问答类文本生成| + + - ### Word Embedding + +
expand
+ +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[w2v_weibo_target_word-bigram_dim300](text/embedding/w2v_weibo_target_word-bigram_dim300)|w2v|weibo|| +|[w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia|| +|[w2v_literature_target_word-word_dim300](text/embedding/w2v_literature_target_word-word_dim300)|w2v|literature|| +|[word2vec_skipgram](text/embedding/word2vec_skipgram)|skip-gram|百度自建数据集|| +|[w2v_sogou_target_word-char_dim300](text/embedding/w2v_sogou_target_word-char_dim300)|w2v|sogou|| +|[w2v_weibo_target_bigram-char_dim300](text/embedding/w2v_weibo_target_bigram-char_dim300)|w2v|weibo|| +|[w2v_zhihu_target_word-bigram_dim300](text/embedding/w2v_zhihu_target_word-bigram_dim300)|w2v|zhihu|| +|[w2v_financial_target_word-word_dim300](text/embedding/w2v_financial_target_word-word_dim300)|w2v|financial|| +|[w2v_wiki_target_word-word_dim300](text/embedding/w2v_wiki_target_word-word_dim300)|w2v|wiki|| +|[w2v_baidu_encyclopedia_context_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-word_dim300)|w2v|baidu_encyclopedia|| +|[w2v_weibo_target_word-word_dim300](text/embedding/w2v_weibo_target_word-word_dim300)|w2v|weibo|| +|[w2v_zhihu_target_bigram-char_dim300](text/embedding/w2v_zhihu_target_bigram-char_dim300)|w2v|zhihu|| +|[w2v_zhihu_target_word-word_dim300](text/embedding/w2v_zhihu_target_word-word_dim300)|w2v|zhihu|| +|[w2v_people_daily_target_word-char_dim300](text/embedding/w2v_people_daily_target_word-char_dim300)|w2v|people_daily|| +|[w2v_sikuquanshu_target_word-word_dim300](text/embedding/w2v_sikuquanshu_target_word-word_dim300)|w2v|sikuquanshu|| +|[glove_twitter_target_word-word_dim200_en](text/embedding/glove_twitter_target_word-word_dim200_en)|fasttext|twitter|| +|[fasttext_crawl_target_word-word_dim300_en](text/embedding/fasttext_crawl_target_word-word_dim300_en)|fasttext|crawl|| +|[w2v_wiki_target_word-bigram_dim300](text/embedding/w2v_wiki_target_word-bigram_dim300)|w2v|wiki|| +|[w2v_baidu_encyclopedia_context_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-1_dim300)|w2v|baidu_encyclopedia|| +|[glove_wiki2014-gigaword_target_word-word_dim300_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim300_en)|glove|wiki2014-gigaword|| +|[glove_wiki2014-gigaword_target_word-word_dim50_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim50_en)|glove|wiki2014-gigaword|| +|[w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia|| +|[w2v_wiki_target_bigram-char_dim300](text/embedding/w2v_wiki_target_bigram-char_dim300)|w2v|wiki|| +|[w2v_baidu_encyclopedia_target_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-1_dim300)|w2v|baidu_encyclopedia|| +|[w2v_financial_target_bigram-char_dim300](text/embedding/w2v_financial_target_bigram-char_dim300)|w2v|financial|| +|[glove_wiki2014-gigaword_target_word-word_dim200_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim200_en)|glove|wiki2014-gigaword|| +|[w2v_financial_target_word-bigram_dim300](text/embedding/w2v_financial_target_word-bigram_dim300)|w2v|financial|| +|[w2v_mixed-large_target_word-char_dim300](text/embedding/w2v_mixed-large_target_word-char_dim300)|w2v|mixed|| 
+|[w2v_baidu_encyclopedia_target_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordPosition_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_target_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordLR_dim300)|w2v|baidu_encyclopedia|| +|[w2v_sogou_target_bigram-char_dim300](text/embedding/w2v_sogou_target_bigram-char_dim300)|w2v|sogou|| +|[w2v_weibo_target_word-char_dim300](text/embedding/w2v_weibo_target_word-char_dim300)|w2v|weibo|| +|[w2v_people_daily_target_word-word_dim300](text/embedding/w2v_people_daily_target_word-word_dim300)|w2v|people_daily|| +|[w2v_zhihu_target_word-char_dim300](text/embedding/w2v_zhihu_target_word-char_dim300)|w2v|zhihu|| +|[w2v_wiki_target_word-char_dim300](text/embedding/w2v_wiki_target_word-char_dim300)|w2v|wiki|| +|[w2v_sogou_target_word-bigram_dim300](text/embedding/w2v_sogou_target_word-bigram_dim300)|w2v|sogou|| +|[w2v_financial_target_word-char_dim300](text/embedding/w2v_financial_target_word-char_dim300)|w2v|financial|| +|[w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia|| +|[glove_wiki2014-gigaword_target_word-word_dim100_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim100_en)|glove|wiki2014-gigaword|| +|[w2v_baidu_encyclopedia_target_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-4_dim300)|w2v|baidu_encyclopedia|| +|[w2v_sogou_target_word-word_dim300](text/embedding/w2v_sogou_target_word-word_dim300)|w2v|sogou|| +|[w2v_literature_target_word-char_dim300](text/embedding/w2v_literature_target_word-char_dim300)|w2v|literature|| +|[w2v_baidu_encyclopedia_target_bigram-char_dim300](text/embedding/w2v_baidu_encyclopedia_target_bigram-char_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_target_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-word_dim300)|w2v|baidu_encyclopedia|| +|[glove_twitter_target_word-word_dim100_en](text/embedding/glove_twitter_target_word-word_dim100_en)|glove|crawl|| +|[w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_context_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-4_dim300)|w2v|baidu_encyclopedia|| +|[w2v_literature_target_bigram-char_dim300](text/embedding/w2v_literature_target_bigram-char_dim300)|w2v|literature|| +|[fasttext_wiki-news_target_word-word_dim300_en](text/embedding/fasttext_wiki-news_target_word-word_dim300_en)|fasttext|wiki-news|| +|[w2v_people_daily_target_word-bigram_dim300](text/embedding/w2v_people_daily_target_word-bigram_dim300)|w2v|people_daily|| +|[w2v_mixed-large_target_word-word_dim300](text/embedding/w2v_mixed-large_target_word-word_dim300)|w2v|mixed|| +|[w2v_people_daily_target_bigram-char_dim300](text/embedding/w2v_people_daily_target_bigram-char_dim300)|w2v|people_daily|| +|[w2v_literature_target_word-bigram_dim300](text/embedding/w2v_literature_target_word-bigram_dim300)|w2v|literature|| +|[glove_twitter_target_word-word_dim25_en](text/embedding/glove_twitter_target_word-word_dim25_en)|glove|twitter|| 
+|[w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia|| +|[w2v_sikuquanshu_target_word-bigram_dim300](text/embedding/w2v_sikuquanshu_target_word-bigram_dim300)|w2v|sikuquanshu|| +|[w2v_baidu_encyclopedia_context_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-2_dim300)|w2v|baidu_encyclopedia|| +|[glove_twitter_target_word-word_dim50_en](text/embedding/glove_twitter_target_word-word_dim50_en)|glove|twitter|| +|[w2v_baidu_encyclopedia_context_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordLR_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_target_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-2_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_context_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordPosition_dim300)|w2v|baidu_encyclopedia|| + +
+ + - ### Machine Translation + +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[transformer_zh-en](text/machine_translation/transformer/transformer_zh-en)|Transformer|CWMT2021|中文译英文| +|[transformer_en-de](text/machine_translation/transformer/transformer_en-de)|Transformer|WMT14 EN-DE|英文译德文| + + - ### Language Model + +
expand
+ +|module|Network|Dataset|Introduction| +|--|--|--|--| +|[chinese_electra_small](text/language_model/chinese_electra_small)|||| +|[chinese_electra_base](text/language_model/chinese_electra_base)|||| +|[roberta-wwm-ext-large](text/language_model/roberta-wwm-ext-large)|roberta-wwm-ext-large|百度自建数据集|| +|[chinese-bert-wwm-ext](text/language_model/chinese_bert_wwm_ext)|chinese-bert-wwm-ext|百度自建数据集|| +|[lda_webpage](text/language_model/lda_webpage)|LDA|百度自建网页领域数据集|| +|[lda_novel](text/language_model/lda_novel)|||| +|[bert-base-multilingual-uncased](text/language_model/bert-base-multilingual-uncased)|||| +|[rbt3](text/language_model/rbt3)|||| +|[ernie_v2_eng_base](text/language_model/ernie_v2_eng_base)|ernie_v2_eng_base|百度自建数据集|| +|[bert-base-multilingual-cased](text/language_model/bert-base-multilingual-cased)|||| +|[rbtl3](text/language_model/rbtl3)|||| +|[chinese-bert-wwm](text/language_model/chinese_bert_wwm)|chinese-bert-wwm|百度自建数据集|| +|[bert-large-uncased](text/language_model/bert-large-uncased)|||| +|[slda_novel](text/language_model/slda_novel)|||| +|[slda_news](text/language_model/slda_news)|||| +|[electra_small](text/language_model/electra_small)|||| +|[slda_webpage](text/language_model/slda_webpage)|||| +|[bert-base-cased](text/language_model/bert-base-cased)|||| +|[slda_weibo](text/language_model/slda_weibo)|||| +|[roberta-wwm-ext](text/language_model/roberta-wwm-ext)|roberta-wwm-ext|百度自建数据集|| +|[bert-base-uncased](text/language_model/bert-base-uncased)|||| +|[electra_large](text/language_model/electra_large)|||| +|[ernie](text/language_model/ernie)|ernie-1.0|百度自建数据集|| +|[simnet_bow](text/language_model/simnet_bow)|BOW|百度自建数据集|| +|[ernie_tiny](text/language_model/ernie_tiny)|ernie_tiny|百度自建数据集|| +|[bert-base-chinese](text/language_model/bert-base-chinese)|bert-base-chinese|百度自建数据集|| +|[lda_news](text/language_model/lda_news)|LDA|百度自建新闻领域数据集|| +|[electra_base](text/language_model/electra_base)|||| +|[ernie_v2_eng_large](text/language_model/ernie_v2_eng_large)|ernie_v2_eng_large|百度自建数据集|| +|[bert-large-cased](text/language_model/bert-large-cased)|||| + +

+
+
+ - ### Sentiment Analysis
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[ernie_skep_sentiment_analysis](text/sentiment_analysis/ernie_skep_sentiment_analysis)|SKEP|百度自建数据集|句子级情感分析|
+|[emotion_detection_textcnn](text/sentiment_analysis/emotion_detection_textcnn)|TextCNN|百度自建数据集|对话情绪识别|
+|[senta_bilstm](text/sentiment_analysis/senta_bilstm)|BiLSTM|百度自建数据集|中文情感倾向分析|
+|[senta_bow](text/sentiment_analysis/senta_bow)|BOW|百度自建数据集|中文情感倾向分析|
+|[senta_gru](text/sentiment_analysis/senta_gru)|GRU|百度自建数据集|中文情感倾向分析|
+|[senta_lstm](text/sentiment_analysis/senta_lstm)|LSTM|百度自建数据集|中文情感倾向分析|
+|[senta_cnn](text/sentiment_analysis/senta_cnn)|CNN|百度自建数据集|中文情感倾向分析|
+
+ - ### Syntactic Analysis
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[DDParser](text/syntactic_analysis/DDParser)|Deep Biaffine Attention|搜索query、网页文本、语音输入等数据|句法分析|
+
+ - ### Simultaneous Translation
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[transformer_nist_wait_1](text/simultaneous_translation/stacl/transformer_nist_wait_1)|transformer|NIST 2008-中英翻译数据集|中译英-wait-1策略|
+|[transformer_nist_wait_3](text/simultaneous_translation/stacl/transformer_nist_wait_3)|transformer|NIST 2008-中英翻译数据集|中译英-wait-3策略|
+|[transformer_nist_wait_5](text/simultaneous_translation/stacl/transformer_nist_wait_5)|transformer|NIST 2008-中英翻译数据集|中译英-wait-5策略|
+|[transformer_nist_wait_7](text/simultaneous_translation/stacl/transformer_nist_wait_7)|transformer|NIST 2008-中英翻译数据集|中译英-wait-7策略|
+|[transformer_nist_wait_all](text/simultaneous_translation/stacl/transformer_nist_wait_all)|transformer|NIST 2008-中英翻译数据集|中译英-waitk=-1策略|
+
+ - ### Lexical Analysis
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[jieba_paddle](text/lexical_analysis/jieba_paddle)|BiGRU+CRF|百度自建数据集|jieba使用Paddle搭建的切词网络(双向GRU)。同时支持jieba的传统切词方法,如精确模式、全模式、搜索引擎模式等切词模式。|
+|[lac](text/lexical_analysis/lac)|BiGRU+CRF|百度自建数据集|百度自研联合的词法分析模型,能整体性地完成中文分词、词性标注、专名识别任务。在百度自建数据集上评测,LAC效果:Precision=88.0%,Recall=88.7%,F1-Score=88.4%。|
+
+ - ### Punctuation Restoration
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[auto_punc](text/punctuation_restoration/auto_punc)|Ernie-1.0|WuDaoCorpora 2.0|自动添加7种标点符号|
+
+ - ### Text Review
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[porn_detection_cnn](text/text_review/porn_detection_cnn)|CNN|百度自建数据集|色情检测,自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别|
+|[porn_detection_gru](text/text_review/porn_detection_gru)|GRU|百度自建数据集|色情检测,自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别|
+|[porn_detection_lstm](text/text_review/porn_detection_lstm)|LSTM|百度自建数据集|色情检测,自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别|
+
+## Audio
+
+ - ### Voice Cloning
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[ge2e_fastspeech2_pwgan](audio/voice_cloning/ge2e_fastspeech2_pwgan)|FastSpeech2|AISHELL-3|中文语音克隆|
+|[lstm_tacotron2](audio/voice_cloning/lstm_tacotron2)|LSTM、Tacotron2、WaveFlow|AISHELL-3|中文语音克隆|
+
+ - ### Text to Speech
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[transformer_tts_ljspeech](audio/tts/transformer_tts_ljspeech)|Transformer|LJSpeech-1.1|英文语音合成|
+|[fastspeech_ljspeech](audio/tts/fastspeech_ljspeech)|FastSpeech|LJSpeech-1.1|英文语音合成|
+|[fastspeech2_baker](audio/tts/fastspeech2_baker)|FastSpeech2|Chinese Standard Mandarin Speech Corpus|中文语音合成|
+|[fastspeech2_ljspeech](audio/tts/fastspeech2_ljspeech)|FastSpeech2|LJSpeech-1.1|英文语音合成|
+|[deepvoice3_ljspeech](audio/tts/deepvoice3_ljspeech)|DeepVoice3|LJSpeech-1.1|英文语音合成|
+
+ - ### Automatic Speech Recognition
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[deepspeech2_aishell](audio/asr/deepspeech2_aishell)|DeepSpeech2|AISHELL-1|中文语音识别|
+|[deepspeech2_librispeech](audio/asr/deepspeech2_librispeech)|DeepSpeech2|LibriSpeech|英文语音识别|
+|[u2_conformer_aishell](audio/asr/u2_conformer_aishell)|Conformer|AISHELL-1|中文语音识别|
+|[u2_conformer_wenetspeech](audio/asr/u2_conformer_wenetspeech)|Conformer|WenetSpeech|中文语音识别|
+|[u2_conformer_librispeech](audio/asr/u2_conformer_librispeech)|Conformer|LibriSpeech|英文语音识别|
+
+ - ### Audio Classification
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[panns_cnn6](audio/audio_classification/PANNs/cnn6)|PANNs|Google Audioset|主要包含4个卷积层和2个全连接层,模型参数为4.5M。经过预训练后,可以用于提取音频的embedding,维度是512|
+|[panns_cnn14](audio/audio_classification/PANNs/cnn14)|PANNs|Google Audioset|主要包含12个卷积层和2个全连接层,模型参数为79.6M。经过预训练后,可以用于提取音频的embedding,维度是2048|
+|[panns_cnn10](audio/audio_classification/PANNs/cnn10)|PANNs|Google Audioset|主要包含8个卷积层和2个全连接层,模型参数为4.9M。经过预训练后,可以用于提取音频的embedding,维度是512|
+
+## Video
+ - ### Video Classification
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[videotag_tsn_lstm](video/classification/videotag_tsn_lstm)|TSN + AttentionLSTM|百度自建数据集|大规模短视频分类打标签|
+|[tsn_kinetics400](video/classification/tsn_kinetics400)|TSN|Kinetics-400|视频分类|
+|[tsm_kinetics400](video/classification/tsm_kinetics400)|TSM|Kinetics-400|视频分类|
+|[stnet_kinetics400](video/classification/stnet_kinetics400)|StNet|Kinetics-400|视频分类|
+|[nonlocal_kinetics400](video/classification/nonlocal_kinetics400)|Non-local|Kinetics-400|视频分类|
+
+ - ### Video Editing
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[SkyAR](video/Video_editing/SkyAR)|UNet|UNet|视频换天|
+
+ - ### Multiple Object tracking
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[fairmot_dla34](video/multiple_object_tracking/fairmot_dla34)|CenterNet|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|实时多目标跟踪|
+|[jde_darknet53](video/multiple_object_tracking/jde_darknet53)|YOLOv3|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|多目标跟踪-兼顾精度和速度|
+
+## Industrial Application
+
+ - ### Meter Detection
+
+|module|Network|Dataset|Introduction|
+|--|--|--|--|
+|[WatermeterSegmentation](image/semantic_segmentation/WatermeterSegmentation)|DeepLabV3|水表的数字表盘分割数据集|水表的数字表盘分割|
diff --git a/modules/README_ch.md b/modules/README_ch.md
new file mode 100644
index 00000000..abf40552
--- /dev/null
+++ b/modules/README_ch.md
@@ -0,0 +1,546 @@
+简体中文 | [English](README.md)
+
+# 目录
+|[图像](#图像) (212个)|[文本](#文本) (130个)|[语音](#语音) (15个)|[视频](#视频) (8个)|[工业应用](#工业应用) (1个)|
+|--|--|--|--|--|
+|[图像分类](#图像分类) (108)|[文本生成](#文本生成) (17)|[声音克隆](#声音克隆) (2)|[视频分类](#视频分类) (5)|[表针识别](#表针识别) (1)|
+|[图像生成](#图像生成) (26)|[词向量](#词向量) (62)|[语音合成](#语音合成) (5)|[视频修复](#视频修复) (1)|-|
+|[关键点检测](#关键点检测) (5)|[机器翻译](#机器翻译) (2)|[语音识别](#语音识别) (5)|[多目标追踪](#多目标追踪) (2)|-|
+|[图像分割](#图像分割) (25)|[语义模型](#语义模型) (30)|[声音分类](#声音分类) (3)|-|-|
+|[人脸检测](#人脸检测) (7)|[情感分析](#情感分析) (7)|-|-|-|
+|[文字识别](#文字识别) (17)|[句法分析](#句法分析) (1)|-|-|-|
+|[图像编辑](#图像编辑) (8)|[同声传译](#同声传译) (5)|-|-|-|
+|[实例分割](#实例分割) (1)|[词法分析](#词法分析) (2)|-|-|-|
+|[目标检测](#目标检测) (13)|[标点恢复](#标点恢复) (1)|-|-|-|
+|[深度估计](#深度估计) (2)|[文本审核](#文本审核) (3)|-|-|-|
+
+## 图像
+ - ### 图像分类
+
+
expand
+ +|module|网络|数据集|简介| +|--|--|--|--| +|[DriverStatusRecognition](image/classification/DriverStatusRecognition)|MobileNetV3_small_ssld|分心司机检测数据集|| +|[mobilenet_v2_animals](image/classification/mobilenet_v2_animals)|MobileNet_v2|百度自建动物数据集|| +|[repvgg_a1_imagenet](image/classification/repvgg_a1_imagenet)|RepVGG|ImageNet-2012|| +|[repvgg_a0_imagenet](image/classification/repvgg_a0_imagenet)|RepVGG|ImageNet-2012|| +|[resnext152_32x4d_imagenet](image/classification/resnext152_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[resnet_v2_152_imagenet](image/classification/resnet_v2_152_imagenet)|ResNet V2|ImageNet-2012|| +|[resnet50_vd_animals](image/classification/resnet50_vd_animals)|ResNet50_vd|百度自建动物数据集|| +|[food_classification](image/classification/food_classification)|ResNet50_vd_ssld|美食数据集|| +|[mobilenet_v3_large_imagenet_ssld](image/classification/mobilenet_v3_large_imagenet_ssld)|Mobilenet_v3_large|ImageNet-2012|| +|[resnext152_vd_32x4d_imagenet](image/classification/resnext152_vd_32x4d_imagenet)|||| +|[ghostnet_x1_3_imagenet_ssld](image/classification/ghostnet_x1_3_imagenet_ssld)|GhostNet|ImageNet-2012|| +|[rexnet_1_5_imagenet](image/classification/rexnet_1_5_imagenet)|ReXNet|ImageNet-2012|| +|[resnext50_64x4d_imagenet](image/classification/resnext50_64x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[resnext101_64x4d_imagenet](image/classification/resnext101_64x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[efficientnetb0_imagenet](image/classification/efficientnetb0_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb1_imagenet](image/classification/efficientnetb1_imagenet)|EfficientNet|ImageNet-2012|| +|[mobilenet_v2_imagenet_ssld](image/classification/mobilenet_v2_imagenet_ssld)|Mobilenet_v2|ImageNet-2012|| +|[resnet50_vd_dishes](image/classification/resnet50_vd_dishes)|ResNet50_vd|百度自建菜品数据集|| +|[pnasnet_imagenet](image/classification/pnasnet_imagenet)|PNASNet|ImageNet-2012|| +|[rexnet_2_0_imagenet](image/classification/rexnet_2_0_imagenet)|ReXNet|ImageNet-2012|| +|[SnakeIdentification](image/classification/SnakeIdentification)|ResNet50_vd_ssld|蛇种数据集|| +|[hrnet40_imagenet](image/classification/hrnet40_imagenet)|HRNet|ImageNet-2012|| +|[resnet_v2_34_imagenet](image/classification/resnet_v2_34_imagenet)|ResNet V2|ImageNet-2012|| +|[mobilenet_v2_dishes](image/classification/mobilenet_v2_dishes)|MobileNet_v2|百度自建菜品数据集|| +|[resnext101_vd_32x4d_imagenet](image/classification/resnext101_vd_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[repvgg_b2g4_imagenet](image/classification/repvgg_b2g4_imagenet)|RepVGG|ImageNet-2012|| +|[fix_resnext101_32x48d_wsl_imagenet](image/classification/fix_resnext101_32x48d_wsl_imagenet)|ResNeXt|ImageNet-2012|| +|[vgg13_imagenet](image/classification/vgg13_imagenet)|VGG|ImageNet-2012|| +|[se_resnext101_32x4d_imagenet](image/classification/se_resnext101_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012|| +|[hrnet30_imagenet](image/classification/hrnet30_imagenet)|HRNet|ImageNet-2012|| +|[ghostnet_x1_3_imagenet](image/classification/ghostnet_x1_3_imagenet)|GhostNet|ImageNet-2012|| +|[dpn107_imagenet](image/classification/dpn107_imagenet)|DPN|ImageNet-2012|| +|[densenet161_imagenet](image/classification/densenet161_imagenet)|DenseNet|ImageNet-2012|| +|[vgg19_imagenet](image/classification/vgg19_imagenet)|vgg19_imagenet|ImageNet-2012|| +|[mobilenet_v2_imagenet](image/classification/mobilenet_v2_imagenet)|Mobilenet_v2|ImageNet-2012|| +|[resnet50_vd_10w](image/classification/resnet50_vd_10w)|ResNet_vd|百度自建数据集|| +|[resnet_v2_101_imagenet](image/classification/resnet_v2_101_imagenet)|ResNet V2 
101|ImageNet-2012|| +|[darknet53_imagenet](image/classification/darknet53_imagenet)|DarkNet|ImageNet-2012|| +|[se_resnext50_32x4d_imagenet](image/classification/se_resnext50_32x4d_imagenet)|SE_ResNeXt|ImageNet-2012|| +|[se_hrnet64_imagenet_ssld](image/classification/se_hrnet64_imagenet_ssld)|HRNet|ImageNet-2012|| +|[resnext101_32x16d_wsl](image/classification/resnext101_32x16d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[hrnet18_imagenet](image/classification/hrnet18_imagenet)|HRNet|ImageNet-2012|| +|[spinalnet_res101_gemstone](image/classification/spinalnet_res101_gemstone)|resnet101|gemstone|| +|[densenet264_imagenet](image/classification/densenet264_imagenet)|DenseNet|ImageNet-2012|| +|[resnext50_vd_32x4d_imagenet](image/classification/resnext50_vd_32x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[SpinalNet_Gemstones](image/classification/SpinalNet_Gemstones)|||| +|[spinalnet_vgg16_gemstone](image/classification/spinalnet_vgg16_gemstone)|vgg16|gemstone|| +|[xception71_imagenet](image/classification/xception71_imagenet)|Xception|ImageNet-2012|| +|[repvgg_b2_imagenet](image/classification/repvgg_b2_imagenet)|RepVGG|ImageNet-2012|| +|[dpn68_imagenet](image/classification/dpn68_imagenet)|DPN|ImageNet-2012|| +|[alexnet_imagenet](image/classification/alexnet_imagenet)|AlexNet|ImageNet-2012|| +|[rexnet_1_3_imagenet](image/classification/rexnet_1_3_imagenet)|ReXNet|ImageNet-2012|| +|[hrnet64_imagenet](image/classification/hrnet64_imagenet)|HRNet|ImageNet-2012|| +|[efficientnetb7_imagenet](image/classification/efficientnetb7_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb0_small_imagenet](image/classification/efficientnetb0_small_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb6_imagenet](image/classification/efficientnetb6_imagenet)|EfficientNet|ImageNet-2012|| +|[hrnet48_imagenet](image/classification/hrnet48_imagenet)|HRNet|ImageNet-2012|| +|[rexnet_3_0_imagenet](image/classification/rexnet_3_0_imagenet)|ReXNet|ImageNet-2012|| +|[shufflenet_v2_imagenet](image/classification/shufflenet_v2_imagenet)|ShuffleNet V2|ImageNet-2012|| +|[ghostnet_x0_5_imagenet](image/classification/ghostnet_x0_5_imagenet)|GhostNet|ImageNet-2012|| +|[inception_v4_imagenet](image/classification/inception_v4_imagenet)|Inception_V4|ImageNet-2012|| +|[resnext101_vd_64x4d_imagenet](image/classification/resnext101_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[densenet201_imagenet](image/classification/densenet201_imagenet)|DenseNet|ImageNet-2012|| +|[vgg16_imagenet](image/classification/vgg16_imagenet)|VGG|ImageNet-2012|| +|[mobilenet_v3_small_imagenet_ssld](image/classification/mobilenet_v3_small_imagenet_ssld)|Mobilenet_v3_Small|ImageNet-2012|| +|[hrnet18_imagenet_ssld](image/classification/hrnet18_imagenet_ssld)|HRNet|ImageNet-2012|| +|[resnext152_64x4d_imagenet](image/classification/resnext152_64x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[efficientnetb3_imagenet](image/classification/efficientnetb3_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb2_imagenet](image/classification/efficientnetb2_imagenet)|EfficientNet|ImageNet-2012|| +|[repvgg_b1g4_imagenet](image/classification/repvgg_b1g4_imagenet)|RepVGG|ImageNet-2012|| +|[resnext101_32x4d_imagenet](image/classification/resnext101_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[resnext50_32x4d_imagenet](image/classification/resnext50_32x4d_imagenet)|ResNeXt|ImageNet-2012|| +|[repvgg_a2_imagenet](image/classification/repvgg_a2_imagenet)|RepVGG|ImageNet-2012|| 
+|[resnext152_vd_64x4d_imagenet](image/classification/resnext152_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[xception41_imagenet](image/classification/xception41_imagenet)|Xception|ImageNet-2012|| +|[googlenet_imagenet](image/classification/googlenet_imagenet)|GoogleNet|ImageNet-2012|| +|[resnet50_vd_imagenet_ssld](image/classification/resnet50_vd_imagenet_ssld)|ResNet_vd|ImageNet-2012|| +|[repvgg_b1_imagenet](image/classification/repvgg_b1_imagenet)|RepVGG|ImageNet-2012|| +|[repvgg_b0_imagenet](image/classification/repvgg_b0_imagenet)|RepVGG|ImageNet-2012|| +|[resnet_v2_50_imagenet](image/classification/resnet_v2_50_imagenet)|ResNet V2|ImageNet-2012|| +|[rexnet_1_0_imagenet](image/classification/rexnet_1_0_imagenet)|ReXNet|ImageNet-2012|| +|[resnet_v2_18_imagenet](image/classification/resnet_v2_18_imagenet)|ResNet V2|ImageNet-2012|| +|[resnext101_32x8d_wsl](image/classification/resnext101_32x8d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[efficientnetb4_imagenet](image/classification/efficientnetb4_imagenet)|EfficientNet|ImageNet-2012|| +|[efficientnetb5_imagenet](image/classification/efficientnetb5_imagenet)|EfficientNet|ImageNet-2012|| +|[repvgg_b1g2_imagenet](image/classification/repvgg_b1g2_imagenet)|RepVGG|ImageNet-2012|| +|[resnext101_32x48d_wsl](image/classification/resnext101_32x48d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[resnet50_vd_wildanimals](image/classification/resnet50_vd_wildanimals)|ResNet_vd|IFAW 自建野生动物数据集|| +|[nasnet_imagenet](image/classification/nasnet_imagenet)|NASNet|ImageNet-2012|| +|[se_resnet18_vd_imagenet](image/classification/se_resnet18_vd_imagenet)|||| +|[spinalnet_res50_gemstone](image/classification/spinalnet_res50_gemstone)|resnet50|gemstone|| +|[resnext50_vd_64x4d_imagenet](image/classification/resnext50_vd_64x4d_imagenet)|ResNeXt_vd|ImageNet-2012|| +|[resnext101_32x32d_wsl](image/classification/resnext101_32x32d_wsl)|ResNeXt_wsl|ImageNet-2012|| +|[dpn131_imagenet](image/classification/dpn131_imagenet)|DPN|ImageNet-2012|| +|[xception65_imagenet](image/classification/xception65_imagenet)|Xception|ImageNet-2012|| +|[repvgg_b3g4_imagenet](image/classification/repvgg_b3g4_imagenet)|RepVGG|ImageNet-2012|| +|[marine_biometrics](image/classification/marine_biometrics)|ResNet50_vd_ssld|Fish4Knowledge|| +|[res2net101_vd_26w_4s_imagenet](image/classification/res2net101_vd_26w_4s_imagenet)|Res2Net|ImageNet-2012|| +|[dpn98_imagenet](image/classification/dpn98_imagenet)|DPN|ImageNet-2012|| +|[resnet18_vd_imagenet](image/classification/resnet18_vd_imagenet)|ResNet_vd|ImageNet-2012|| +|[densenet121_imagenet](image/classification/densenet121_imagenet)|DenseNet|ImageNet-2012|| +|[vgg11_imagenet](image/classification/vgg11_imagenet)|VGG|ImageNet-2012|| +|[hrnet44_imagenet](image/classification/hrnet44_imagenet)|HRNet|ImageNet-2012|| +|[densenet169_imagenet](image/classification/densenet169_imagenet)|DenseNet|ImageNet-2012|| +|[hrnet32_imagenet](image/classification/hrnet32_imagenet)|HRNet|ImageNet-2012|| +|[dpn92_imagenet](image/classification/dpn92_imagenet)|DPN|ImageNet-2012|| +|[ghostnet_x1_0_imagenet](image/classification/ghostnet_x1_0_imagenet)|GhostNet|ImageNet-2012|| +|[hrnet48_imagenet_ssld](image/classification/hrnet48_imagenet_ssld)|HRNet|ImageNet-2012|| + +
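+以上图像分类模块均可通过 PaddleHub 一键加载与预测。以下是一个最小调用示例(仅作示意:假设已通过 `hub install mobilenet_v2_animals` 安装模块,接口细节以各模块 README 为准):
+
+```python
+import cv2
+import paddlehub as hub
+
+# 加载动物识别分类模块(可替换为上表中的其他图像分类模块)
+classifier = hub.Module(name="mobilenet_v2_animals")
+
+# 输入 BGR 格式的图片数组,返回每张图的类别及置信度
+result = classifier.classification(images=[cv2.imread("/PATH/TO/IMAGE")])
+print(result)
+```
+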
+
+
+
+  - ### 图像生成
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[pixel2style2pixel](image/Image_gan/gan/pixel2style2pixel/)|Pixel2Style2Pixel|-|人脸转正|
+|[stgan_bald](image/Image_gan/gan/stgan_bald/)|STGAN|CelebA|秃头生成器|
+|[styleganv2_editing](image/Image_gan/gan/styleganv2_editing)|StyleGAN V2|-|人脸编辑|
+|[wav2lip](image/Image_gan/gan/wav2lip)|wav2lip|LRS2|唇形生成|
+|[attgan_celeba](image/Image_gan/attgan_celeba/)|AttGAN|Celeba|人脸编辑|
+|[cyclegan_cityscapes](image/Image_gan/cyclegan_cityscapes)|CycleGAN|Cityscapes|实景图和语义分割结果互相转换|
+|[stargan_celeba](image/Image_gan/stargan_celeba)|StarGAN|Celeba|人脸编辑|
+|[stgan_celeba](image/Image_gan/stgan_celeba/)|STGAN|Celeba|人脸编辑|
+|[ID_Photo_GEN](image/Image_gan/style_transfer/ID_Photo_GEN)|HRNet_W18|-|证件照生成|
+|[Photo2Cartoon](image/Image_gan/style_transfer/Photo2Cartoon)|U-GAT-IT|cartoon_data|人脸卡通化|
+|[U2Net_Portrait](image/Image_gan/style_transfer/U2Net_Portrait)|U^2Net|-|人脸素描化|
+|[UGATIT_100w](image/Image_gan/style_transfer/UGATIT_100w)|U-GAT-IT|selfie2anime|人脸动漫化|
+|[UGATIT_83w](image/Image_gan/style_transfer/UGATIT_83w)|U-GAT-IT|selfie2anime|人脸动漫化|
+|[UGATIT_92w](image/Image_gan/style_transfer/UGATIT_92w)|U-GAT-IT|selfie2anime|人脸动漫化|
+|[animegan_v1_hayao_60](image/Image_gan/style_transfer/animegan_v1_hayao_60)|AnimeGAN|The Wind Rises|图像风格迁移-宫崎骏|
+|[animegan_v2_hayao_64](image/Image_gan/style_transfer/animegan_v2_hayao_64)|AnimeGAN|The Wind Rises|图像风格迁移-宫崎骏|
+|[animegan_v2_hayao_99](image/Image_gan/style_transfer/animegan_v2_hayao_99)|AnimeGAN|The Wind Rises|图像风格迁移-宫崎骏|
+|[animegan_v2_paprika_54](image/Image_gan/style_transfer/animegan_v2_paprika_54)|AnimeGAN|Paprika|图像风格迁移-今敏|
+|[animegan_v2_paprika_74](image/Image_gan/style_transfer/animegan_v2_paprika_74)|AnimeGAN|Paprika|图像风格迁移-今敏|
+|[animegan_v2_paprika_97](image/Image_gan/style_transfer/animegan_v2_paprika_97)|AnimeGAN|Paprika|图像风格迁移-今敏|
+|[animegan_v2_paprika_98](image/Image_gan/style_transfer/animegan_v2_paprika_98)|AnimeGAN|Paprika|图像风格迁移-今敏|
+|[animegan_v2_shinkai_33](image/Image_gan/style_transfer/animegan_v2_shinkai_33)|AnimeGAN|Your Name, Weathering with you|图像风格迁移-新海诚|
+|[animegan_v2_shinkai_53](image/Image_gan/style_transfer/animegan_v2_shinkai_53)|AnimeGAN|Your Name, Weathering with you|图像风格迁移-新海诚|
+|[msgnet](image/Image_gan/style_transfer/msgnet)|msgnet|COCO2014||
+|[stylepro_artistic](image/Image_gan/style_transfer/stylepro_artistic)|StyleProNet|MS-COCO + WikiArt|艺术风格迁移|
+|stylegan_ffhq|StyleGAN|FFHQ|图像风格迁移|
+
+  - ### 关键点检测
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[face_landmark_localization](image/keypoint_detection/face_landmark_localization)|Face_Landmark|AFW/AFLW|人脸关键点检测|
+|[hand_pose_localization](image/keypoint_detection/hand_pose_localization)|-|MPII, NZSL|手部关键点检测|
+|[openpose_body_estimation](image/keypoint_detection/openpose_body_estimation)|two-branch multi-stage CNN|MPII, COCO 2016|肢体关键点检测|
+|[human_pose_estimation_resnet50_mpii](image/keypoint_detection/human_pose_estimation_resnet50_mpii)|Pose_Resnet50|MPII|人体骨骼关键点检测|
+|[openpose_hands_estimation](image/keypoint_detection/openpose_hands_estimation)|-|MPII, NZSL|手部关键点检测|
+
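+以下是一个最小调用示例(仅作示意:假设已通过 `hub install face_landmark_localization` 安装模块,接口以各模块 README 为准),展示如何用上表中的关键点检测模块预测人脸关键点:
+
+```python
+import cv2
+import paddlehub as hub
+
+# 加载人脸关键点检测模块(可替换为上表中的其他关键点检测模块)
+module = hub.Module(name="face_landmark_localization")
+
+# 输入 BGR 格式的图片数组,返回每张人脸的 68 个关键点坐标
+result = module.keypoint_detection(images=[cv2.imread("/PATH/TO/IMAGE")])
+print(result)
+```
+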
+  - ### 图像分割
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[deeplabv3p_xception65_humanseg](image/semantic_segmentation/deeplabv3p_xception65_humanseg)|deeplabv3p|百度自建数据集|人像分割|
+|[humanseg_server](image/semantic_segmentation/humanseg_server)|deeplabv3p|百度自建数据集|人像分割|
+|[humanseg_mobile](image/semantic_segmentation/humanseg_mobile)|hrnet|百度自建数据集|人像分割-移动端前置摄像头|
+|[humanseg_lite](image/semantic_segmentation/humanseg_lite)|shufflenet|百度自建数据集|轻量级人像分割-移动端实时|
+|[ExtremeC3_Portrait_Segmentation](image/semantic_segmentation/ExtremeC3_Portrait_Segmentation)|ExtremeC3|EG1800, Baidu fashion dataset|轻量化人像分割|
+|[SINet_Portrait_Segmentation](image/semantic_segmentation/SINet_Portrait_Segmentation)|SINet|EG1800, Baidu fashion dataset|轻量化人像分割|
+|[FCN_HRNet_W18_Face_Seg](image/semantic_segmentation/FCN_HRNet_W18_Face_Seg)|FCN_HRNet_W18|-|人像分割|
+|[ace2p](image/semantic_segmentation/ace2p)|ACE2P|LIP|人体解析|
+|[Pneumonia_CT_LKM_PP](image/semantic_segmentation/Pneumonia_CT_LKM_PP)|U-NET+|连心医疗授权脱敏数据集|肺炎CT影像分析|
+|[Pneumonia_CT_LKM_PP_lung](image/semantic_segmentation/Pneumonia_CT_LKM_PP_lung)|U-NET+|连心医疗授权脱敏数据集|肺炎CT影像分析|
+|[ocrnet_hrnetw18_voc](image/semantic_segmentation/ocrnet_hrnetw18_voc)|ocrnet, hrnet|PascalVoc2012||
+|[U2Net](image/semantic_segmentation/U2Net)|U^2Net|-|图像前景背景分割|
+|[U2Netp](image/semantic_segmentation/U2Netp)|U^2Net|-|图像前景背景分割|
+|[Extract_Line_Draft](image/semantic_segmentation/Extract_Line_Draft)|UNet|Pixiv|线稿提取|
+|[unet_cityscapes](image/semantic_segmentation/unet_cityscapes)|UNet|cityscapes||
+|[ocrnet_hrnetw18_cityscapes](image/semantic_segmentation/ocrnet_hrnetw18_cityscapes)|ocrnet_hrnetw18|cityscapes||
+|[hardnet_cityscapes](image/semantic_segmentation/hardnet_cityscapes)|hardnet|cityscapes||
+|[fcn_hrnetw48_voc](image/semantic_segmentation/fcn_hrnetw48_voc)|fcn_hrnetw48|PascalVoc2012||
+|[fcn_hrnetw48_cityscapes](image/semantic_segmentation/fcn_hrnetw48_cityscapes)|fcn_hrnetw48|cityscapes||
+|[fcn_hrnetw18_voc](image/semantic_segmentation/fcn_hrnetw18_voc)|fcn_hrnetw18|PascalVoc2012||
+|[fcn_hrnetw18_cityscapes](image/semantic_segmentation/fcn_hrnetw18_cityscapes)|fcn_hrnetw18|cityscapes||
+|[fastscnn_cityscapes](image/semantic_segmentation/fastscnn_cityscapes)|fastscnn|cityscapes||
+|[deeplabv3p_resnet50_voc](image/semantic_segmentation/deeplabv3p_resnet50_voc)|deeplabv3p, resnet50|PascalVoc2012||
+|[deeplabv3p_resnet50_cityscapes](image/semantic_segmentation/deeplabv3p_resnet50_cityscapes)|deeplabv3p, resnet50|cityscapes||
+|[bisenetv2_cityscapes](image/semantic_segmentation/bisenetv2_cityscapes)|bisenetv2|cityscapes||
+
+
+
+  - ### 人脸检测
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[pyramidbox_lite_mobile](image/face_detection/pyramidbox_lite_mobile)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸检测-移动端|
+|[pyramidbox_lite_mobile_mask](image/face_detection/pyramidbox_lite_mobile_mask)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸口罩检测-移动端|
+|[pyramidbox_lite_server_mask](image/face_detection/pyramidbox_lite_server_mask)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸口罩检测|
+|[ultra_light_fast_generic_face_detector_1mb_640](image/face_detection/ultra_light_fast_generic_face_detector_1mb_640)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE数据集|轻量级通用人脸检测-低算力设备|
+|[ultra_light_fast_generic_face_detector_1mb_320](image/face_detection/ultra_light_fast_generic_face_detector_1mb_320)|Ultra-Light-Fast-Generic-Face-Detector-1MB|WIDER FACE数据集|轻量级通用人脸检测-低算力设备|
+|[pyramidbox_lite_server](image/face_detection/pyramidbox_lite_server)|PyramidBox|WIDER FACE数据集 + 百度自采人脸数据集|轻量级人脸检测|
+|[pyramidbox_face_detection](image/face_detection/pyramidbox_face_detection)|PyramidBox|WIDER FACE数据集|人脸检测|
+
+  - ### 文字识别
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[chinese_ocr_db_crnn_mobile](image/text_recognition/chinese_ocr_db_crnn_mobile)|Differentiable Binarization+RCNN|icdar2015数据集|中文文字识别|
+|[chinese_text_detection_db_mobile](image/text_recognition/chinese_text_detection_db_mobile)|Differentiable Binarization|icdar2015数据集|中文文本检测|
+|[chinese_text_detection_db_server](image/text_recognition/chinese_text_detection_db_server)|Differentiable Binarization|icdar2015数据集|中文文本检测| +|[chinese_ocr_db_crnn_server](image/text_recognition/chinese_ocr_db_crnn_server)|Differentiable Binarization+RCNN|icdar2015数据集|中文文字识别| +|[Vehicle_License_Plate_Recognition](image/text_recognition/Vehicle_License_Plate_Recognition)|-|CCPD|车牌识别| +|[chinese_cht_ocr_db_crnn_mobile](image/text_recognition/chinese_cht_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|繁体中文文字识别| +|[japan_ocr_db_crnn_mobile](image/text_recognition/japan_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|日文文字识别| +|[korean_ocr_db_crnn_mobile](image/text_recognition/korean_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|韩文文字识别| +|[german_ocr_db_crnn_mobile](image/text_recognition/german_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|德文文字识别| +|[french_ocr_db_crnn_mobile](image/text_recognition/french_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|法文文字识别| +|[latin_ocr_db_crnn_mobile](image/text_recognition/latin_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|拉丁文文字识别| +|[cyrillic_ocr_db_crnn_mobile](image/text_recognition/cyrillic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|斯拉夫文文字识别| +|[multi_languages_ocr_db_crnn](image/text_recognition/multi_languages_ocr_db_crnn)|Differentiable Binarization+RCNN|icdar2015数据集|多语言文字识别| +|[kannada_ocr_db_crnn_mobile](image/text_recognition/kannada_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|卡纳达文文字识别| +|[arabic_ocr_db_crnn_mobile](image/text_recognition/arabic_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|阿拉伯文文字识别| +|[telugu_ocr_db_crnn_mobile](image/text_recognition/telugu_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|泰卢固文文字识别| +|[devanagari_ocr_db_crnn_mobile](image/text_recognition/devanagari_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|梵文文字识别| +|[tamil_ocr_db_crnn_mobile](image/text_recognition/tamil_ocr_db_crnn_mobile)|Differentiable Binarization+CRNN|icdar2015数据集|泰米尔文文字识别| + + + - ### 图像编辑 + +|module|网络|数据集|简介| +|--|--|--|--| +|[realsr](image/Image_editing/super_resolution/realsr)|LP-KPN|RealSR dataset|图像/视频超分-4倍| +|[deoldify](image/Image_editing/colorization/deoldify)|GAN|ILSVRC 2012|黑白照片/视频着色| +|[photo_restoration](image/Image_editing/colorization/photo_restoration)|基于deoldify和realsr模型|-|老照片修复| +|[user_guided_colorization](image/Image_editing/colorization/user_guided_colorization)|siggraph|ILSVRC 2012|图像着色| +|[falsr_c](image/Image_editing/super_resolution/falsr_c)|falsr_c| DIV2k|轻量化超分-2倍| +|[dcscn](image/Image_editing/super_resolution/dcscn)|dcscn| DIV2k|轻量化超分-2倍| +|[falsr_a](image/Image_editing/super_resolution/falsr_a)|falsr_a| DIV2k|轻量化超分-2倍| +|[falsr_b](image/Image_editing/super_resolution/falsr_b)|falsr_b|DIV2k|轻量化超分-2倍| + + - ### 实例分割 + +|module|网络|数据集|简介| +|--|--|--|--| +|[solov2](image/instance_segmentation/solov2)|-|COCO2014|实例分割| + + - ### 目标检测 + +|module|网络|数据集|简介| +|--|--|--|--| +|[faster_rcnn_resnet50_coco2017](image/object_detection/faster_rcnn_resnet50_coco2017)|faster_rcnn|COCO2017|| +|[ssd_vgg16_512_coco2017](image/object_detection/ssd_vgg16_512_coco2017)|SSD|COCO2017|| +|[faster_rcnn_resnet50_fpn_venus](image/object_detection/faster_rcnn_resnet50_fpn_venus)|faster_rcnn|百度自建数据集|大规模通用目标检测| +|[ssd_vgg16_300_coco2017](image/object_detection/ssd_vgg16_300_coco2017)|||| 
+|[yolov3_resnet34_coco2017](image/object_detection/yolov3_resnet34_coco2017)|YOLOv3|COCO2017||
+|[yolov3_darknet53_pedestrian](image/object_detection/yolov3_darknet53_pedestrian)|YOLOv3|百度自建大规模行人数据集|行人检测|
+|[yolov3_mobilenet_v1_coco2017](image/object_detection/yolov3_mobilenet_v1_coco2017)|YOLOv3|COCO2017||
+|[ssd_mobilenet_v1_pascal](image/object_detection/ssd_mobilenet_v1_pascal)|SSD|PASCAL VOC||
+|[faster_rcnn_resnet50_fpn_coco2017](image/object_detection/faster_rcnn_resnet50_fpn_coco2017)|faster_rcnn|COCO2017||
+|[yolov3_darknet53_coco2017](image/object_detection/yolov3_darknet53_coco2017)|YOLOv3|COCO2017||
+|[yolov3_darknet53_vehicles](image/object_detection/yolov3_darknet53_vehicles)|YOLOv3|百度自建大规模车辆数据集|车辆检测|
+|[yolov3_darknet53_venus](image/object_detection/yolov3_darknet53_venus)|YOLOv3|百度自建数据集|大规模通用检测|
+|[yolov3_resnet50_vd_coco2017](image/object_detection/yolov3_resnet50_vd_coco2017)|YOLOv3|COCO2017||
+
+  - ### 深度估计
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[MiDaS_Large](image/depth_estimation/MiDaS_Large)|-|3D Movies, WSVD, ReDWeb, MegaDepth||
+|[MiDaS_Small](image/depth_estimation/MiDaS_Small)|-|3D Movies, WSVD, ReDWeb, MegaDepth, etc.||
+
+## 文本
+  - ### 文本生成
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[ernie_gen](text/text_generation/ernie_gen)|ERNIE-GEN|-|面向生成任务的预训练-微调框架|
+|[ernie_gen_poetry](text/text_generation/ernie_gen_poetry)|ERNIE-GEN|开源诗歌数据集|诗歌生成|
+|[ernie_gen_couplet](text/text_generation/ernie_gen_couplet)|ERNIE-GEN|开源对联数据集|对联生成|
+|[ernie_gen_lover_words](text/text_generation/ernie_gen_lover_words)|ERNIE-GEN|网络情诗、情话数据|情话生成|
+|[ernie_tiny_couplet](text/text_generation/ernie_tiny_couplet)|ernie_tiny|开源对联数据集|对联生成|
+|[ernie_gen_acrostic_poetry](text/text_generation/ernie_gen_acrostic_poetry)|ERNIE-GEN|开源诗歌数据集|藏头诗生成|
+|[Rumor_prediction](text/text_generation/Rumor_prediction)|-|新浪微博中文谣言数据|谣言预测|
+|[plato-mini](text/text_generation/plato-mini)|Unified Transformer|十亿级别的中文对话数据|中文对话|
+|[plato2_en_large](text/text_generation/plato2_en_large)|plato2|开放域多轮数据集|超大规模生成式对话|
+|[plato2_en_base](text/text_generation/plato2_en_base)|plato2|开放域多轮数据集|超大规模生成式对话|
+|[CPM_LM](text/text_generation/CPM_LM)|GPT-2|自建数据集|中文文本生成|
+|[unified_transformer-12L-cn](text/text_generation/unified_transformer-12L-cn)|Unified Transformer|千万级别中文会话数据|人机多轮对话|
+|[unified_transformer-12L-cn-luge](text/text_generation/unified_transformer-12L-cn-luge)|Unified Transformer|千言对话数据集|人机多轮对话|
+|[reading_pictures_writing_poems](text/text_generation/reading_pictures_writing_poems)|多网络级联|-|看图写诗|
+|[GPT2_CPM_LM](text/text_generation/GPT2_CPM_LM)|||问答类文本生成|
+|[GPT2_Base_CN](text/text_generation/GPT2_Base_CN)|||问答类文本生成|
+
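+以下是一个最小调用示例(仅作示意:假设已通过 `hub install ernie_gen_couplet` 安装模块,接口以各模块 README 为准),展示如何用上表中的文本生成模块生成对联:
+
+```python
+import paddlehub as hub
+
+# 加载对联生成模块(可替换为上表中的其他文本生成模块)
+module = hub.Module(name="ernie_gen_couplet")
+
+# texts 为上联列表,beam_width 为 beam search 宽度,每个上联返回多个候选下联
+results = module.generate(texts=["风吹云乱天垂泪"], use_gpu=False, beam_width=5)
+print(results)
+```
+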
+  - ### 词向量
+
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[w2v_weibo_target_word-bigram_dim300](text/embedding/w2v_weibo_target_word-bigram_dim300)|w2v|weibo||
+|[w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia||
+|[w2v_literature_target_word-word_dim300](text/embedding/w2v_literature_target_word-word_dim300)|w2v|literature||
+|[word2vec_skipgram](text/embedding/word2vec_skipgram)|skip-gram|百度自建数据集||
+|[w2v_sogou_target_word-char_dim300](text/embedding/w2v_sogou_target_word-char_dim300)|w2v|sogou||
+|[w2v_weibo_target_bigram-char_dim300](text/embedding/w2v_weibo_target_bigram-char_dim300)|w2v|weibo||
+|[w2v_zhihu_target_word-bigram_dim300](text/embedding/w2v_zhihu_target_word-bigram_dim300)|w2v|zhihu||
+|[w2v_financial_target_word-word_dim300](text/embedding/w2v_financial_target_word-word_dim300)|w2v|financial||
+|[w2v_wiki_target_word-word_dim300](text/embedding/w2v_wiki_target_word-word_dim300)|w2v|wiki||
+|[w2v_baidu_encyclopedia_context_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-word_dim300)|w2v|baidu_encyclopedia||
+|[w2v_weibo_target_word-word_dim300](text/embedding/w2v_weibo_target_word-word_dim300)|w2v|weibo||
+|[w2v_zhihu_target_bigram-char_dim300](text/embedding/w2v_zhihu_target_bigram-char_dim300)|w2v|zhihu||
+|[w2v_zhihu_target_word-word_dim300](text/embedding/w2v_zhihu_target_word-word_dim300)|w2v|zhihu||
+|[w2v_people_daily_target_word-char_dim300](text/embedding/w2v_people_daily_target_word-char_dim300)|w2v|people_daily||
+|[w2v_sikuquanshu_target_word-word_dim300](text/embedding/w2v_sikuquanshu_target_word-word_dim300)|w2v|sikuquanshu||
+|[glove_twitter_target_word-word_dim200_en](text/embedding/glove_twitter_target_word-word_dim200_en)|glove|twitter||
+|[fasttext_crawl_target_word-word_dim300_en](text/embedding/fasttext_crawl_target_word-word_dim300_en)|fasttext|crawl||
+|[w2v_wiki_target_word-bigram_dim300](text/embedding/w2v_wiki_target_word-bigram_dim300)|w2v|wiki||
+|[w2v_baidu_encyclopedia_context_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-1_dim300)|w2v|baidu_encyclopedia||
+|[glove_wiki2014-gigaword_target_word-word_dim300_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim300_en)|glove|wiki2014-gigaword||
+|[glove_wiki2014-gigaword_target_word-word_dim50_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim50_en)|glove|wiki2014-gigaword||
+|[w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia||
+|[w2v_wiki_target_bigram-char_dim300](text/embedding/w2v_wiki_target_bigram-char_dim300)|w2v|wiki||
+|[w2v_baidu_encyclopedia_target_word-character_char1-1_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-1_dim300)|w2v|baidu_encyclopedia||
+|[w2v_financial_target_bigram-char_dim300](text/embedding/w2v_financial_target_bigram-char_dim300)|w2v|financial||
+|[glove_wiki2014-gigaword_target_word-word_dim200_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim200_en)|glove|wiki2014-gigaword||
+|[w2v_financial_target_word-bigram_dim300](text/embedding/w2v_financial_target_word-bigram_dim300)|w2v|financial||
+|[w2v_mixed-large_target_word-char_dim300](text/embedding/w2v_mixed-large_target_word-char_dim300)|w2v|mixed||
+|[w2v_baidu_encyclopedia_target_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordPosition_dim300)|w2v|baidu_encyclopedia||
+|[w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia||
+|[w2v_baidu_encyclopedia_target_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-wordLR_dim300)|w2v|baidu_encyclopedia||
+|[w2v_sogou_target_bigram-char_dim300](text/embedding/w2v_sogou_target_bigram-char_dim300)|w2v|sogou||
+|[w2v_weibo_target_word-char_dim300](text/embedding/w2v_weibo_target_word-char_dim300)|w2v|weibo||
+|[w2v_people_daily_target_word-word_dim300](text/embedding/w2v_people_daily_target_word-word_dim300)|w2v|people_daily||
+|[w2v_zhihu_target_word-char_dim300](text/embedding/w2v_zhihu_target_word-char_dim300)|w2v|zhihu||
+|[w2v_wiki_target_word-char_dim300](text/embedding/w2v_wiki_target_word-char_dim300)|w2v|wiki||
+|[w2v_sogou_target_word-bigram_dim300](text/embedding/w2v_sogou_target_word-bigram_dim300)|w2v|sogou||
+|[w2v_financial_target_word-char_dim300](text/embedding/w2v_financial_target_word-char_dim300)|w2v|financial||
+|[w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_1-3_dim300)|w2v|baidu_encyclopedia||
+|[glove_wiki2014-gigaword_target_word-word_dim100_en](text/embedding/glove_wiki2014-gigaword_target_word-word_dim100_en)|glove|wiki2014-gigaword||
+|[w2v_baidu_encyclopedia_target_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-4_dim300)|w2v|baidu_encyclopedia||
+|[w2v_sogou_target_word-word_dim300](text/embedding/w2v_sogou_target_word-word_dim300)|w2v|sogou||
+|[w2v_literature_target_word-char_dim300](text/embedding/w2v_literature_target_word-char_dim300)|w2v|literature||
+|[w2v_baidu_encyclopedia_target_bigram-char_dim300](text/embedding/w2v_baidu_encyclopedia_target_bigram-char_dim300)|w2v|baidu_encyclopedia||
+|[w2v_baidu_encyclopedia_target_word-word_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-word_dim300)|w2v|baidu_encyclopedia||
+|[glove_twitter_target_word-word_dim100_en](text/embedding/glove_twitter_target_word-word_dim100_en)|glove|twitter||
+|[w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-ngram_2-2_dim300)|w2v|baidu_encyclopedia||
+|[w2v_baidu_encyclopedia_context_word-character_char1-4_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-4_dim300)|w2v|baidu_encyclopedia||
+|[w2v_literature_target_bigram-char_dim300](text/embedding/w2v_literature_target_bigram-char_dim300)|w2v|literature||
+|[fasttext_wiki-news_target_word-word_dim300_en](text/embedding/fasttext_wiki-news_target_word-word_dim300_en)|fasttext|wiki-news||
+|[w2v_people_daily_target_word-bigram_dim300](text/embedding/w2v_people_daily_target_word-bigram_dim300)|w2v|people_daily||
+|[w2v_mixed-large_target_word-word_dim300](text/embedding/w2v_mixed-large_target_word-word_dim300)|w2v|mixed||
+|[w2v_people_daily_target_bigram-char_dim300](text/embedding/w2v_people_daily_target_bigram-char_dim300)|w2v|people_daily||
+|[w2v_literature_target_word-bigram_dim300](text/embedding/w2v_literature_target_word-bigram_dim300)|w2v|literature||
+|[glove_twitter_target_word-word_dim25_en](text/embedding/glove_twitter_target_word-word_dim25_en)|glove|twitter||
+|[w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-ngram_1-2_dim300)|w2v|baidu_encyclopedia||
+|[w2v_sikuquanshu_target_word-bigram_dim300](text/embedding/w2v_sikuquanshu_target_word-bigram_dim300)|w2v|sikuquanshu||
+|[w2v_baidu_encyclopedia_context_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-character_char1-2_dim300)|w2v|baidu_encyclopedia|| +|[glove_twitter_target_word-word_dim50_en](text/embedding/glove_twitter_target_word-word_dim50_en)|glove|twitter|| +|[w2v_baidu_encyclopedia_context_word-wordLR_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordLR_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_target_word-character_char1-2_dim300](text/embedding/w2v_baidu_encyclopedia_target_word-character_char1-2_dim300)|w2v|baidu_encyclopedia|| +|[w2v_baidu_encyclopedia_context_word-wordPosition_dim300](text/embedding/w2v_baidu_encyclopedia_context_word-wordPosition_dim300)|w2v|baidu_encyclopedia|| + +
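+以下是一个最小调用示例(仅作示意:假设已通过 `hub install w2v_baidu_encyclopedia_target_word-word_dim300` 安装模块,`search`/`cosine_sim` 接口以各模块 README 为准):
+
+```python
+import paddlehub as hub
+
+# 加载词向量模块(可替换为上表中的其他 embedding 模块)
+embedding = hub.Module(name="w2v_baidu_encyclopedia_target_word-word_dim300")
+
+# 查询单词对应的词向量
+vec = embedding.search("中国")
+
+# 计算两个词的余弦相似度
+sim = embedding.cosine_sim("中国", "美国")
+print(sim)
+```
+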
+
+
+  - ### 机器翻译
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[transformer_zh-en](text/machine_translation/transformer/transformer_zh-en)|Transformer|CWMT2021|中文译英文|
+|[transformer_en-de](text/machine_translation/transformer/transformer_en-de)|Transformer|WMT14 EN-DE|英文译德文|
+
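+以下是一个最小调用示例(仅作示意:假设已通过 `hub install transformer_zh-en` 安装模块,`predict` 接口及 `beam_size` 参数均以该模块 README 为准):
+
+```python
+import paddlehub as hub
+
+# 加载中译英模块,beam_size 为解码时的 beam search 宽度
+model = hub.Module(name="transformer_zh-en", beam_size=5)
+
+# 输入待翻译的中文句子列表,返回对应的英文译文列表
+results = model.predict(["今天天气怎么样?"])
+print(results)
+```
+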
+  - ### 语义模型
+
+ +|module|网络|数据集|简介| +|--|--|--|--| +|[chinese_electra_small](text/language_model/chinese_electra_small)|||| +|[chinese_electra_base](text/language_model/chinese_electra_base)|||| +|[roberta-wwm-ext-large](text/language_model/roberta-wwm-ext-large)|roberta-wwm-ext-large|百度自建数据集|| +|[chinese-bert-wwm-ext](text/language_model/chinese_bert_wwm_ext)|chinese-bert-wwm-ext|百度自建数据集|| +|[lda_webpage](text/language_model/lda_webpage)|LDA|百度自建网页领域数据集|| +|[lda_novel](text/language_model/lda_novel)|||| +|[bert-base-multilingual-uncased](text/language_model/bert-base-multilingual-uncased)|||| +|[rbt3](text/language_model/rbt3)|||| +|[ernie_v2_eng_base](text/language_model/ernie_v2_eng_base)|ernie_v2_eng_base|百度自建数据集|| +|[bert-base-multilingual-cased](text/language_model/bert-base-multilingual-cased)|||| +|[rbtl3](text/language_model/rbtl3)|||| +|[chinese-bert-wwm](text/language_model/chinese_bert_wwm)|chinese-bert-wwm|百度自建数据集|| +|[bert-large-uncased](text/language_model/bert-large-uncased)|||| +|[slda_novel](text/language_model/slda_novel)|||| +|[slda_news](text/language_model/slda_news)|||| +|[electra_small](text/language_model/electra_small)|||| +|[slda_webpage](text/language_model/slda_webpage)|||| +|[bert-base-cased](text/language_model/bert-base-cased)|||| +|[slda_weibo](text/language_model/slda_weibo)|||| +|[roberta-wwm-ext](text/language_model/roberta-wwm-ext)|roberta-wwm-ext|百度自建数据集|| +|[bert-base-uncased](text/language_model/bert-base-uncased)|||| +|[electra_large](text/language_model/electra_large)|||| +|[ernie](text/language_model/ernie)|ernie-1.0|百度自建数据集|| +|[simnet_bow](text/language_model/simnet_bow)|BOW|百度自建数据集|| +|[ernie_tiny](text/language_model/ernie_tiny)|ernie_tiny|百度自建数据集|| +|[bert-base-chinese](text/language_model/bert-base-chinese)|bert-base-chinese|百度自建数据集|| +|[lda_news](text/language_model/lda_news)|LDA|百度自建新闻领域数据集|| +|[electra_base](text/language_model/electra_base)|||| +|[ernie_v2_eng_large](text/language_model/ernie_v2_eng_large)|ernie_v2_eng_large|百度自建数据集|| +|[bert-large-cased](text/language_model/bert-large-cased)|||| + +
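+以下是一个最小调用示例(仅作示意:假设已通过 `hub install simnet_bow` 安装模块,`similarity` 接口的入参格式以该模块 README 为准),展示如何用上表中的语义匹配模型计算两段文本的相似度:
+
+```python
+import paddlehub as hub
+
+simnet_bow = hub.Module(name="simnet_bow")
+
+# texts 为句对列表,每个元素是 [文本1, 文本2]
+test_texts = [["这道题太难了", "这道题不简单"]]
+results = simnet_bow.similarity(texts=test_texts, use_gpu=False, batch_size=1)
+print(results)  # 每个句对返回一个相似度得分
+```
+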
+
+
+  - ### 情感分析
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[ernie_skep_sentiment_analysis](text/sentiment_analysis/ernie_skep_sentiment_analysis)|SKEP|百度自建数据集|句子级情感分析|
+|[emotion_detection_textcnn](text/sentiment_analysis/emotion_detection_textcnn)|TextCNN|百度自建数据集|对话情绪识别|
+|[senta_bilstm](text/sentiment_analysis/senta_bilstm)|BiLSTM|百度自建数据集|中文情感倾向分析|
+|[senta_bow](text/sentiment_analysis/senta_bow)|BOW|百度自建数据集|中文情感倾向分析|
+|[senta_gru](text/sentiment_analysis/senta_gru)|GRU|百度自建数据集|中文情感倾向分析|
+|[senta_lstm](text/sentiment_analysis/senta_lstm)|LSTM|百度自建数据集|中文情感倾向分析|
+|[senta_cnn](text/sentiment_analysis/senta_cnn)|CNN|百度自建数据集|中文情感倾向分析|
+
+  - ### 句法分析
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[DDParser](text/syntactic_analysis/DDParser)|Deep Biaffine Attention|搜索query、网页文本、语音输入等数据|句法分析|
+
+  - ### 同声传译
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[transformer_nist_wait_1](text/simultaneous_translation/stacl/transformer_nist_wait_1)|transformer|NIST 2008-中英翻译数据集|中译英-wait-1策略|
+|[transformer_nist_wait_3](text/simultaneous_translation/stacl/transformer_nist_wait_3)|transformer|NIST 2008-中英翻译数据集|中译英-wait-3策略|
+|[transformer_nist_wait_5](text/simultaneous_translation/stacl/transformer_nist_wait_5)|transformer|NIST 2008-中英翻译数据集|中译英-wait-5策略|
+|[transformer_nist_wait_7](text/simultaneous_translation/stacl/transformer_nist_wait_7)|transformer|NIST 2008-中英翻译数据集|中译英-wait-7策略|
+|[transformer_nist_wait_all](text/simultaneous_translation/stacl/transformer_nist_wait_all)|transformer|NIST 2008-中英翻译数据集|中译英-waitk=-1策略|
+
+
+  - ### 词法分析
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[jieba_paddle](text/lexical_analysis/jieba_paddle)|BiGRU+CRF|百度自建数据集|jieba使用Paddle搭建的切词网络(双向GRU)。同时支持jieba的传统切词方法,如精确模式、全模式、搜索引擎模式等切词模式。|
+|[lac](text/lexical_analysis/lac)|BiGRU+CRF|百度自建数据集|百度自研联合的词法分析模型,能整体性地完成中文分词、词性标注、专名识别任务。在百度自建数据集上评测,LAC效果:Precision=88.0%,Recall=88.7%,F1-Score=88.4%。|
+
+  - ### 标点恢复
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[auto_punc](text/punctuation_restoration/auto_punc)|Ernie-1.0|WuDaoCorpora 2.0|自动添加7种标点符号|
+
+  - ### 文本审核
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[porn_detection_cnn](text/text_review/porn_detection_cnn)|CNN|百度自建数据集|色情检测,自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别|
+|[porn_detection_gru](text/text_review/porn_detection_gru)|GRU|百度自建数据集|色情检测,自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别|
+|[porn_detection_lstm](text/text_review/porn_detection_lstm)|LSTM|百度自建数据集|色情检测,自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别|
+
+## 语音
+  - ### 声音克隆
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[ge2e_fastspeech2_pwgan](audio/voice_cloning/ge2e_fastspeech2_pwgan)|FastSpeech2|AISHELL-3|中文语音克隆|
+|[lstm_tacotron2](audio/voice_cloning/lstm_tacotron2)|LSTM、Tacotron2、WaveFlow|AISHELL-3|中文语音克隆|
+
+  - ### 语音合成
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[transformer_tts_ljspeech](audio/tts/transformer_tts_ljspeech)|Transformer|LJSpeech-1.1|英文语音合成|
+|[fastspeech_ljspeech](audio/tts/fastspeech_ljspeech)|FastSpeech|LJSpeech-1.1|英文语音合成|
+|[fastspeech2_baker](audio/tts/fastspeech2_baker)|FastSpeech2|Chinese Standard Mandarin Speech Corpus|中文语音合成|
+|[fastspeech2_ljspeech](audio/tts/fastspeech2_ljspeech)|FastSpeech2|LJSpeech-1.1|英文语音合成|
+|[deepvoice3_ljspeech](audio/tts/deepvoice3_ljspeech)|DeepVoice3|LJSpeech-1.1|英文语音合成|
+
+  - ### 语音识别
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[deepspeech2_aishell](audio/asr/deepspeech2_aishell)|DeepSpeech2|AISHELL-1|中文语音识别|
+|[deepspeech2_librispeech](audio/asr/deepspeech2_librispeech)|DeepSpeech2|LibriSpeech|英文语音识别|
+|[u2_conformer_aishell](audio/asr/u2_conformer_aishell)|Conformer|AISHELL-1|中文语音识别|
+|[u2_conformer_wenetspeech](audio/asr/u2_conformer_wenetspeech)|Conformer|WenetSpeech|中文语音识别|
+|[u2_conformer_librispeech](audio/asr/u2_conformer_librispeech)|Conformer|LibriSpeech|英文语音识别|
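+以下是一个最小调用示例(仅作示意:假设已通过 `hub install deepspeech2_aishell` 安装模块,`speech_recognize` 接口及参数以各模块 README 为准):
+
+```python
+import paddlehub as hub
+
+# 加载中文语音识别模块(可替换为上表中的其他语音识别模块)
+model = hub.Module(name="deepspeech2_aishell")
+
+# 输入 16kHz 采样率的 wav 文件路径,返回识别出的文本
+text = model.speech_recognize("/PATH/TO/AUDIO.wav")
+print(text)
+```
+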
+
+
+  - ### 声音分类
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[panns_cnn6](audio/audio_classification/PANNs/cnn6)|PANNs|Google Audioset|主要包含4个卷积层和2个全连接层,模型参数为4.5M。经过预训练后,可以用于提取音频的embedding,维度是512|
+|[panns_cnn14](audio/audio_classification/PANNs/cnn14)|PANNs|Google Audioset|主要包含12个卷积层和2个全连接层,模型参数为79.6M。经过预训练后,可以用于提取音频的embedding,维度是2048|
+|[panns_cnn10](audio/audio_classification/PANNs/cnn10)|PANNs|Google Audioset|主要包含8个卷积层和2个全连接层,模型参数为4.9M。经过预训练后,可以用于提取音频的embedding,维度是512|
+
+## 视频
+  - ### 视频分类
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[videotag_tsn_lstm](video/classification/videotag_tsn_lstm)|TSN + AttentionLSTM|百度自建数据集|大规模短视频分类打标签|
+|[tsn_kinetics400](video/classification/tsn_kinetics400)|TSN|Kinetics-400|视频分类|
+|[tsm_kinetics400](video/classification/tsm_kinetics400)|TSM|Kinetics-400|视频分类|
+|[stnet_kinetics400](video/classification/stnet_kinetics400)|StNet|Kinetics-400|视频分类|
+|[nonlocal_kinetics400](video/classification/nonlocal_kinetics400)|Non-local|Kinetics-400|视频分类|
+
+
+  - ### 视频修复
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[SkyAR](video/Video_editing/SkyAR)|UNet|-|视频换天|
+
+  - ### 多目标追踪
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[fairmot_dla34](video/multiple_object_tracking/fairmot_dla34)|CenterNet|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|实时多目标跟踪|
+|[jde_darknet53](video/multiple_object_tracking/jde_darknet53)|YOLOv3|Caltech Pedestrian+CityPersons+CUHK-SYSU+PRW+ETHZ+MOT17|多目标跟踪-兼顾精度和速度|
+
+## 工业应用
+
+  - ### 表针识别
+
+|module|网络|数据集|简介|
+|--|--|--|--|
+|[WatermeterSegmentation](image/semantic_segmentation/WatermeterSegmentation)|DeepLabV3|水表的数字表盘分割数据集|水表的数字表盘分割|
diff --git a/modules/image/text_recognition/chinese_ocr_db_crnn_mobile/README_en.md b/modules/image/text_recognition/chinese_ocr_db_crnn_mobile/README_en.md
new file mode 100644
index 00000000..679b2a05
--- /dev/null
+++ b/modules/image/text_recognition/chinese_ocr_db_crnn_mobile/README_en.md
@@ -0,0 +1,202 @@
+# chinese_ocr_db_crnn_mobile
+
+| Module Name | chinese_ocr_db_crnn_mobile |
+| :------------------ | :------------: |
+| Category | image-text_recognition |
+| Network | Differentiable Binarization+RCNN |
+| Dataset | icdar2015 |
+| Fine-tuning supported or not | No |
+| Module Size | 16M |
+| Latest update date | 2021-02-26 |
+| Data indicators | - |
+
+
+## I. Basic Information of Module
+
+- ### Application Effect Display
+  - [Online experience in OCR text recognition scenarios](https://www.paddlepaddle.org.cn/hub/scene/ocr)
+  - Example result:
+
+
+- ### Module Introduction
+
+  - chinese_ocr_db_crnn_mobile Module is used to recognize Chinese characters in images. It first obtains text boxes with the [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/), performs angle classification on the detected boxes, and then recognizes the Chinese characters in them. CRNN (Convolutional Recurrent Neural Network) is adopted as the final recognition algorithm. This Module is an ultra-lightweight Chinese OCR model that supports direct prediction.
+
+
+ + - For more information, please refer to:[An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf) + + + +## II. Installation + +- ### 1、Environmental dependence + + - paddlepaddle >= 1.7.2 + + - paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst) + + - shapely + + - pyclipper + + - ```shell + $ pip install shapely pyclipper + ``` + - **This Module relies on the third-party libraries shapely and pyclipper. Please install shapely and pyclipper before using this Module.** + +- ### 2、Installation + + - ```shell + $ hub install chinese_ocr_db_crnn_mobile + ``` + - If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + +## III. Module API and Prediction + + +- ### 1、Command line Prediction + + - ```shell + $ hub run chinese_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE" + ``` + - If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- ### 2、Prediction Code Example + + - ```python + import paddlehub as hub + import cv2 + + ocr = hub.Module(name="chinese_ocr_db_crnn_mobile", enable_mkldnn=True) # MKLDNN acceleration is only available on CPU + result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')]) + + # or + # result = ocr.recognize_text(paths=['/PATH/TO/IMAGE']) + ``` + +- ### 3、API + + - ```python + __init__(text_detector_module=None, enable_mkldnn=False) + ``` + + - Construct the ChineseOCRDBCRNN object + + - **Parameter** + + - text_detector_module(str): PaddleHub Module Name for text detection, use [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/) by default if set to None. Its function is to detect the text in the picture. + - enable_mkldnn(bool): Whether to enable MKLDNN to accelerate CPU computing. This parameter is valid only when the CPU is running. The default is False. + + + - ```python + def recognize_text(images=[], + paths=[], + use_gpu=False, + output_dir='ocr_result', + visualization=False, + box_thresh=0.5, + text_thresh=0.5, + angle_classification_thresh=0.9) + ``` + + - Prediction API, detecting the position of all Chinese text in the input image. + + - **Parameter** + + - paths (list\[str\]): image path + - images (list\[numpy.ndarray\]): image data, ndarray.shape is in the format \[H, W, C\], BGR; + - use\_gpu (bool): use GPU or not **If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable first** + - box\_thresh (float): The confidence threshold of text box detection; + - text\_thresh (float): The confidence threshold of Chinese text recognition; + - angle_classification_thresh(float): The confidence threshold of text Angle classification + - visualization (bool): Whether to save the recognition results as picture files; + - output\_dir (str): path to save the image, ocr\_result by default. 
+ + - **Return** + + - res (list\[dict\]): The list of recognition results, where each element is dict and each field is: + - data (list\[dict\]): recognition result, each element in the list is dict and each field is: + - text(str): The result text of recognition + - confidence(float): The confidence of the results + - text_box_position(list): The pixel coordinates of the text box in the original picture, a 4*2 matrix, represent the coordinates of the lower left, lower right, upper right and upper left vertices of the text box in turn + data is \[\] if there's no result + - save_path (str, optional): Path to save the result, save_path is '' if no image is saved. + + +## IV. Server Deployment + +- PaddleHub Serving can deploy an online object detection service. + +- ### Step 1: Start PaddleHub Serving + + - Run the startup command: + - ```shell + $ hub serving start -m chinese_ocr_db_crnn_mobile + ``` + + - The servitization API is now deployed and the default port number is 8866. + + - **NOTE:** If GPU is used for prediction, set CUDA_VISIBLE_DEVICES environment variable before prediction. Otherwise, need not set it. + + +- ### Step 2: Send a predictive request + + - After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result + + - ```python + import requests + import json + import cv2 + import base64 + + def cv2_to_base64(image): + data = cv2.imencode('.jpg', image)[1] + return base64.b64encode(data.tostring()).decode('utf8') + + # Send an HTTP request + data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_mobile" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # print prediction result + print(r.json()["results"]) + ``` + +## V. Release Note + +* 1.0.0 + + First release + +* 1.0.1 + + Fixed failure to use the online service invocating model + +* 1.0.2 + + Supports MKLDNN to speed up CPU computing + +* 1.1.0 + + An ultra-lightweight three-stage model (text box detection - angle classification - text recognition) is used to identify text in images. + +* 1.1.1 + + Supports recognition of spaces in text. + +* 1.1.2 + + Fixed an issue where only 30 fields can be detected. + + - ```shell + $ hub install chinese_ocr_db_crnn_mobile==1.1.2 + ``` diff --git a/modules/text/sentiment_analysis/senta_bilstm/README_en.md b/modules/text/sentiment_analysis/senta_bilstm/README_en.md new file mode 100644 index 00000000..ae7ca125 --- /dev/null +++ b/modules/text/sentiment_analysis/senta_bilstm/README_en.md @@ -0,0 +1,190 @@ +# senta_bilstm + +| Module Name | senta_bilstm | +| :------------------ | :------------: | +| Category | text-sentiment_analysis | +| Network | BiLSTM | +| Dataset | Dataset built by Baidu | +| Fine-tuning supported or not | No | +| Module Size | 690M | +| Latest update date | 2021-02-26 | +| Data indicators | - | + + +## I. Basic Information of Module + +- ### Module Introduction + + - Sentiment Classification (Senta for short) can automatically judge the emotional polarity category of Chinese texts with subjective description and give corresponding confidence, which can help enterprises understand users' consumption habits, analyze hot topics and crisis public opinion monitoring, and provide favorable decision support for enterprises. The model is based on a bidirectional LSTM structure, with positive and negative emotion types. + + + +## II. 
Installation + +- ### 1、Environmental dependence + + - paddlepaddle >= 1.8.0 + + - paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst) + +- ### 2、Installation + + - ```shell + $ hub install senta_bilstm + ``` + - If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md) + + + +## III. Module API and Prediction + +- ### 1、Command line Prediction + + - ```shell + $ hub run senta_bilstm --input_text "这家餐厅很好吃" + ``` + or + - ```shell + $ hub run senta_bilstm --input_file test.txt + ``` + - test.txt stores the text to be predicted, for example: + + > 这家餐厅很好吃 + + > 这部电影真的很差劲 + + - If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + + +- ### 2、Prediction Code Example + + - ```python + import paddlehub as hub + + senta = hub.Module(name="senta_bilstm") + test_text = ["这家餐厅很好吃", "这部电影真的很差劲"] + results = senta.sentiment_classify(texts=test_text, + use_gpu=False, + batch_size=1) + + for result in results: + print(result['text']) + print(result['sentiment_label']) + print(result['sentiment_key']) + print(result['positive_probs']) + print(result['negative_probs']) + + # 这家餐厅很好吃 1 positive 0.9407 0.0593 + # 这部电影真的很差劲 0 negative 0.02 0.98 + ``` + +- ### 3、API + + - ```python + def sentiment_classify(texts=[], data={}, use_gpu=False, batch_size=1) + ``` + + - senta_bilstm predicting interfaces, predicting sentiment classification of input sentences (dichotomies, positive/negative) + + - **Parameter** + + - texts(list): data to be predicted, if texts parameter is used, there is no need to pass in data parameter. You can use any of the two parameters. + - data(dict): predicted data , key must be text,value is data to be predicted. if data parameter is used, there is no need to pass in texts parameter. You can use any of the two parameters. It is suggested to use texts parameter, and data parameter will be discarded later. + - use_gpu(bool): use GPU or not. If GPU is used for prediction, set CUDA_VISIBLE_DEVICES environment variable before prediction. Otherwise, need not set it. + - batch_size(int): batch size + + - **Return** + + - results(list): result of sentiment classification + + + - ```python + def get_labels() + ``` + - get the category of senta_bilstm + + - **Return** + + - labels(dict): the category of senta_bilstm(Dichotomies, positive/negative) + + - ```python + def get_vocab_path() + ``` + - Get a vocabulary for pre-training + + - **Return** + + - vocab_path(str): Vocabulary path + + + +## IV. Server Deployment + +- PaddleHub Serving can deploy an online sentiment analysis detection service and you can use this interface for online Web applications. + +- ## Step 1: Start PaddleHub Serving + + - Run the startup command: + - ```shell + $ hub serving start -m senta_bilstm + ``` + + - The model loading process is displayed on startup. After the startup is successful, the following information is displayed: + - ```shell + Loading senta_bilstm successful. + ``` + + - The servitization API is now deployed and the default port number is 8866. + + - **NOTE:** If GPU is used for prediction, set CUDA_VISIBLE_DEVICES environment variable before prediction. Otherwise, need not set it. 
+ +- ## Step 2: Send a predictive request + + - After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result + + - ```python + import requests + import json + + # data to be predicted + text = ["这家餐厅很好吃", "这部电影真的很差劲"] + + # Set the running configuration + # Corresponding to local prediction senta_bilstm.sentiment_classify(texts=text, batch_size=1, use_gpu=True) + data = {"texts": text, "batch_size": 1, "use_gpu":True} + + # set the prediction method to senta_bilstm and send a POST request, content-type should be set to json + # HOST_IP is the IP address of the server + url = "http://HOST_IP:8866/predict/senta_bilstm" + headers = {"Content-Type": "application/json"} + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # print prediction result + print(json.dumps(r.json(), indent=4, ensure_ascii=False)) + ``` + + - For more information about PaddleHub Serving, please refer to:[Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md) + + + +## V. Release Note + +* 1.0.0 + + First release + +* 1.0.1 + + Vocabulary upgrade + +* 1.1.0 + + Significantly improve predictive performance + +* 1.2.0 + + Model upgrade, support transfer learning for text classification, text matching and other tasks + - ```shell + $ hub install senta_bilstm==1.2.0 + ``` diff --git a/modules/text/text_generation/ernie_gen/README_en.md b/modules/text/text_generation/ernie_gen/README_en.md new file mode 100644 index 00000000..6e8264a4 --- /dev/null +++ b/modules/text/text_generation/ernie_gen/README_en.md @@ -0,0 +1,230 @@ +# ernie_gen + +| 模型名称 | ernie_gen | +| :------------------ | :-----------: | +| 类别 | 文本-文本生成 | +| 网络 | ERNIE-GEN | +| 数据集 | - | +| 是否支持Fine-tuning | 是 | +| 模型大小 | 85K | +| 最新更新日期 | 2021-07-20 | +| 数据指标 | - | + + +## 一、模型基本信息 + +- ### 模型介绍 + - ERNIE-GEN 是面向生成任务的预训练-微调框架,首次在预训练阶段加入span-by-span 生成任务,让模型每次能够生成一个语义完整的片段。在预训练和微调中通过填充式生成机制和噪声感知机制来缓解曝光偏差问题。此外, ERNIE-GEN 采样多片段-多粒度目标文本采样策略, 增强源文本和目标文本的关联性,加强了编码器和解码器的交互。 + - ernie_gen module是一个具备微调功能的module,可以快速完成特定场景module的制作。 + +
+ +- 更多详情请查看:[ERNIE-GEN:An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation](https://arxiv.org/abs/2001.11314) + +## 二、安装 + +- ### 1、环境依赖 + + - paddlepaddle >= 2.0.0 + + - paddlehub >= 2.0.0 | [如何安装paddlehub](../../../../docs/docs_ch/get_start/installation.rst) + + - paddlenlp >= 2.0.0 + +- ### 2、安装 + + - ```shell + $ hub install ernie_gen + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) + +## 三、模型API预测 + +- ernie_gen can be used **only if it is first targeted at the specific dataset fine-tune** + - There are many types of text generation tasks, ernie_gen only provides the basic parameters for text generation, which can only be used after fine-tuning the dataset for a specific task + - Paddlehub provides a simple fine-tune dataset:[train.txt](./test_data/train.txt), [dev.txt](./test_data/dev.txt) + - Paddlehub also offers multiple fine-tune pre-training models that work well:[Couplet generated](../ernie_gen_couplet/),[Lover words generated](../ernie_gen_lover_words/),[Poetry generated](../ernie_gen_poetry/)等 + +### 1、Fine-tune and encapsulation + +- #### Fine-tune Code Example + + - ```python + import paddlehub as hub + + module = hub.Module(name="ernie_gen") + + result = module.finetune( + train_path='train.txt', + dev_path='dev.txt', + max_steps=300, + batch_size=2 + ) + + module.export(params_path=result['last_save_path'], module_name="ernie_gen_test", author="test") + ``` + +- #### API Instruction + + - ```python + def finetune(train_path, + dev_path=None, + save_dir="ernie_gen_result", + init_ckpt_path=None, + use_gpu=True, + max_steps=500, + batch_size=8, + max_encode_len=15, + max_decode_len=15, + learning_rate=5e-5, + warmup_proportion=0.1, + weight_decay=0.1, + noise_prob=0, + label_smooth=0, + beam_width=5, + length_penalty=1.0, + log_interval=100, + save_interval=200): + ``` + + - Fine tuning model parameters API + - **Parameter** + - train_path(str): Training set path. The format of the training set should be: "serial number\tinput text\tlabel", such as "1\t床前明月光\t疑是地上霜", note that \t cannot be replaced by Spaces + - dev_path(str): validation set path. The format of the validation set should be: "serial number\tinput text\tlabel, such as "1\t举头望明月\t低头思故乡", note that \t cannot be replaced by Spaces + - save_dir(str): Model saving and validation sets predict output paths. + - init_ckpt_path(str): The model initializes the loading path to realize incremental training. + - use_gpu(bool): use gpu or not + - max_steps(int): Maximum training steps. + - batch_size(int): Batch size during training. + - max_encode_len(int): Maximum encoding length. + - max_decode_len(int): Maximum decoding length. + - learning_rate(float): Learning rate size. + - warmup_proportion(float): Warmup rate. + - weight_decay(float): Weight decay size. + - noise_prob(float): Noise probability, refer to the Ernie Gen's paper. + - label_smooth(float): Label smoothing weight. + - beam_width(int): Beam size of validation set at the time of prediction. + - length_penalty(float): Length penalty weight for validation set prediction. + - log_interval(int): Number of steps at a training log printing interval. + - save_interval(int): training model save interval deployment. The validation set will make predictions after the model is saved. + - **Return** + - result(dict): Run result. 
Contains 2 keys: + - last_save_path(str): Save path of model at the end of training. + - last_ppl(float): Model confusion at the end of training. + + - ```python + def export( + params_path, + module_name, + author, + version="1.0.0", + summary="", + author_email="", + export_path="."): + ``` + + - Module exports an API through which training parameters can be packaged into a Hub Module with one click. + - **Parameter** + - params_path(str): Module parameter path. + - module_name(str): module name, such as "ernie_gen_couplet"。 + - author(str): Author name + - max_encode_len(int): Maximum encoding length. + - max_decode_len(int): Maximum decoding length. + - version(str): The version number. + - summary(str): English introduction to Module. + - author_email(str): Email address of the author. + - export_path(str): Module export path. + +### 2、模型预测 + +- **定义`$module_name`为export指定的module_name** + +- 模型转换完毕之后,通过`hub install $module_name`安装该模型,即可通过以下2种方式调用自制module: + +- #### 法1:命令行预测 + + - ```python + $ hub run $module_name --input_text="输入文本" --use_gpu True --beam_width 5 + ``` + + - 通过命令行方式实现hub模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) + +- #### 法2:API预测 + + - ```python + import paddlehub as hub + + module = hub.Module(name="$module_name") + + test_texts = ["输入文本1", "输入文本2"] + # generate包含3个参数,texts为输入文本列表,use_gpu指定是否使用gpu,beam_width指定beam search宽度。 + results = module.generate(texts=test_texts, use_gpu=True, beam_width=5) + for result in results: + print(result) + ``` + +- 您也可以将`$module_name`文件夹打包为tar.gz压缩包并联系PaddleHub工作人员上传至PaddleHub模型仓库,这样更多的用户可以通过一键安装的方式使用您的模型。PaddleHub非常欢迎您的贡献,共同推动开源社区成长。 + +## 四、服务部署 + +- PaddleHub Serving 可以部署一个文本生成的在线服务。 + +- ### 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m $module_name -p 8866 + ``` + + - 这样就完成了一个目标检测的服务化API的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA\_VISIBLE\_DEVICES环境变量,否则不用设置。 + +- ### 第二步:发送预测请求 + + - 客户端通过以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + + # 发送HTTP请求 + + data = {'texts':["输入文本1", "输入文本2"], + 'use_gpu':True, 'beam_width':5} + headers = {"Content-type": "application/json"} + url = "http://127.0.0.1:8866/predict/$module_name" + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 保存结果 + results = r.json()["results"] + for result in results: + print(result) + ``` + +- **NOTE:** 上述`$module_name`为export指定的module_name + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.0.1 + + 修复模型导出bug + +* 1.0.2 + + 修复windows运行中的bug + +* 1.1.0 + + 接入PaddleNLP + + - ```shell + $ hub install ernie_gen==1.1.0 + ``` diff --git a/modules/text/text_generation/reading_pictures_writing_poems/readme.md b/modules/text/text_generation/reading_pictures_writing_poems/readme.md index 5468e04b..7a6351c3 100644 --- a/modules/text/text_generation/reading_pictures_writing_poems/readme.md +++ b/modules/text/text_generation/reading_pictures_writing_poems/readme.md @@ -63,13 +63,13 @@ - ### 2、预测代码示例 - ```python - import paddlehub as hub - - readingPicturesWritingPoems = hub.Module(name="reading_pictures_writing_poems") - results = readingPicturesWritingPoems.WritingPoem(image = "scenery.jpg", use_gpu=False) - - for result in results: - print(result) + import paddlehub as hub + + readingPicturesWritingPoems = hub.Module(name="reading_pictures_writing_poems") + results = readingPicturesWritingPoems.WritingPoem(image = "scenery.jpg", use_gpu=False) + + for result in results: + print(result) ``` - ### 3、API diff --git 
a/modules/text/text_review/porn_detection_cnn/README.md b/modules/text/text_review/porn_detection_cnn/README.md index e72a71a6..588ce206 100644 --- a/modules/text/text_review/porn_detection_cnn/README.md +++ b/modules/text/text_review/porn_detection_cnn/README.md @@ -1,93 +1,184 @@ -# PornDetectionCNN API说明 +# porn_detection_cnn -## detection(texts=[], data={}, use_gpu=False, batch_size=1) +| 模型名称 | porn_detection_cnn | +| :------------------ | :------------: | +| 类别 | 文本-文本审核 | +| 网络 | CNN | +| 数据集 | 百度自建数据集 | +| 是否支持Fine-tuning | 否 | +| 模型大小 | 20M | +| 最新更新日期 | 2021-02-26 | +| 数据指标 | - | -porn_detection_cnn预测接口,鉴定输入句子是否包含色情文案 +## 一、模型基本信息 -**参数** +- ### 模型介绍 + - 色情检测模型可自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别。 + - porn_detection_cnn采用CNN网络结构并按字粒度进行切词,具有较高的预测速度。该模型最大句子长度为256字,仅支持预测。 -* texts(list): 待预测数据,如果使用texts参数,则不用传入data参数,二选一即可 -* data(dict): 预测数据,key必须为text,value是带预测数据。如果使用data参数,则不用传入texts参数,二选一即可。建议使用texts参数,data参数后续会废弃。 -* use_gpu(bool): 是否使用GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置 -* batch_size(int): 批处理大小 -**返回** +## 二、安装 -* results(list): 鉴定结果 +- ### 1、环境依赖 -## context(trainable=False) + - paddlepaddle >= 1.6.2 + + - paddlehub >= 1.6.0 | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst) -获取porn_detection_cnn的预训练program以及program的输入输出变量 +- ### 2、安装 -**参数** + - ```shell + $ hub install porn_detection_cnn + ``` + - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md) + | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md) -* trainable(bool): trainable=True表示program中的参数在Fine-tune时需要微调,否则保持不变 -**返回** +## 三、模型API预测 -* inputs(dict): program的输入变量 -* outputs(dict): program的输出变量 -* main_program(Program): 带有预训练参数的program +- ### 1、命令行预测 -## get_labels() + - ```shell + $ hub run porn_detection_cnn --input_text "黄片下载" + ``` + + - 或者 -获取porn_detection_cnn的类别 + - ```shell + $ hub run porn_detection_cnn --input_file test.txt + ``` + + - 其中test.txt存放待审查文本,每行仅放置一段待审核文本 + + - 通过命令行方式实现hub模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst) -**返回** +- ### 2、预测代码示例 -* labels(dict): porn_detection_cnn的类别(二分类,是/不是) + - ```python + import paddlehub as hub + + porn_detection_cnn = hub.Module(name="porn_detection_cnn") + + test_text = ["黄片下载", "打击黄牛党"] + + results = porn_detection_cnn.detection(texts=test_text, use_gpu=True, batch_size=1) + + for index, text in enumerate(test_text): + results[index]["text"] = text + for index, result in enumerate(results): + print(results[index]) + + # 输出结果如下: + # {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676} + # {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996} + ``` -## get_vocab_path() + +- ### 3、API -获取预训练时使用的词汇表 + - ```python + def detection(texts=[], data={}, use_gpu=False, batch_size=1) + ``` + + - porn_detection_cnn预测接口,鉴定输入句子是否包含色情文案 -**返回** + - **参数** -* vocab_path(str): 词汇表路径 + - texts(list): 待预测数据,如果使用texts参数,则不用传入data参数,二选一即可 -# PornDetectionCNN 服务部署 + - data(dict): 预测数据,key必须为text,value是带预测数据。如果使用data参数,则不用传入texts参数,二选一即可。建议使用texts参数,data参数后续会废弃。 -PaddleHub Serving可以部署一个在线色情文案检测服务,可以将此接口用于在线web应用。 + - use_gpu(bool): 是否使用GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置 -## 第一步:启动PaddleHub Serving + - batch_size(int): 批处理大小 -运行启动命令: -```shell -$ hub serving start -m porn_detection_cnn -``` + - **返回** 
-启动时会显示加载模型过程,启动成功后显示 -```shell -Loading porn_detection_cnn successful. -``` + - results(list): 鉴定结果 -这样就完成了服务化API的部署,默认端口号为8866。 -**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + - ```python + def get_labels() + ``` + - 获取porn_detection_cnn的类别 -## 第二步:发送预测请求 + - **返回** -配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + - labels(dict): porn_detection_cnn的类别(二分类,是/不是) -```python -import requests -import json + - ```python + def get_vocab_path() + ``` -# 待预测数据 -text = ["黄片下载", "打击黄牛党"] + - 获取预训练时使用的词汇表 -# 设置运行配置 -# 对应本地预测porn_detection_cnn.detection(texts=text, batch_size=1, use_gpu=True) -data = {"texts": text, "batch_size": 1, "use_gpu":True} + - **返回** -# 指定预测方法为porn_detection_cnn并发送post请求,content-type类型应指定json方式 -# HOST_IP为服务器IP -url = "http://HOST_IP:8866/predict/porn_detection_cnn" -headers = {"Content-Type": "application/json"} -r = requests.post(url=url, headers=headers, data=json.dumps(data)) + - vocab_path(str): 词汇表路径 -# 打印预测结果 -print(json.dumps(r.json(), indent=4, ensure_ascii=False)) -``` -关于PaddleHub Serving更多信息参考[服务部署](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.6/docs/tutorial/serving.md) + +## 四、服务部署 + +- PaddleHub Serving可以部署一个在线色情文案检测服务,可以将此接口用于在线web应用。 + +- ## 第一步:启动PaddleHub Serving + + - 运行启动命令: + - ```shell + $ hub serving start -m porn_detection_cnn + ``` + + - 启动时会显示加载模型过程,启动成功后显示 + - ```shell + Loading porn_detection_cnn successful. + ``` + + - 这样就完成了服务化API的部署,默认端口号为8866。 + + - **NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。 + +- ## 第二步:发送预测请求 + + - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果 + + - ```python + import requests + import json + + # 待预测数据 + text = ["黄片下载", "打击黄牛党"] + + # 设置运行配置 + # 对应本地预测porn_detection_cnn.detection(texts=text, batch_size=1, use_gpu=True) + data = {"texts": text, "batch_size": 1, "use_gpu":True} + + # 指定预测方法为porn_detection_cnn并发送post请求,content-type类型应指定json方式 + # HOST_IP为服务器IP + url = "http://HOST_IP:8866/predict/porn_detection_cnn" + headers = {"Content-Type": "application/json"} + r = requests.post(url=url, headers=headers, data=json.dumps(data)) + + # 打印预测结果 + print(json.dumps(r.json(), indent=4, ensure_ascii=False)) + ``` + + - 关于PaddleHub Serving更多信息参考[服务部署](../../../../docs/docs_ch/tutorial/serving.md) + + + +## 五、更新历史 + +* 1.0.0 + + 初始发布 + +* 1.1.0 + + 大幅提升预测性能,同时简化接口使用 + + - ```shell + $ hub install porn_detection_cnn==1.1.0 + ``` + + diff --git a/modules/text/text_review/porn_detection_gru/README.md b/modules/text/text_review/porn_detection_gru/README.md index add8f9f9..46ba9783 100644 --- a/modules/text/text_review/porn_detection_gru/README.md +++ b/modules/text/text_review/porn_detection_gru/README.md @@ -1,93 +1,185 @@ -# PornDetectionGRU API说明 +# porn_detection_gru -## detection(texts=[], data={}, use_gpu=False, batch_size=1) +| 模型名称 | porn_detection_gru | +| :------------------ | :------------: | +| 类别 | 文本-文本审核 | +| 网络 | GRU | +| 数据集 | 百度自建数据集 | +| 是否支持Fine-tuning | 否 | +| 模型大小 | 20M | +| 最新更新日期 | 2021-02-26 | +| 数据指标 | - | -porn_detection_gru预测接口,鉴定输入句子是否包含色情文案 +## 一、模型基本信息 -**参数** +- ### 模型介绍 + - 色情检测模型可自动判别文本是否涉黄并给出相应的置信度,对文本中的色情描述、低俗交友、污秽文案进行识别。 + - porn_detection_gru采用GRU网络结构并按字粒度进行切词,具有较高的预测速度。该模型最大句子长度为256字,仅支持预测。 -* texts(list): 待预测数据,如果使用texts参数,则不用传入data参数,二选一即可 -* data(dict): 预测数据,key必须为text,value是带预测数据。如果使用data参数,则不用传入texts参数,二选一即可。建议使用texts参数,data参数后续会废弃。 -* use_gpu(bool): 是否使用GPU预测,如果使用GPU预测,则在预测之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置 -* batch_size(int): 批处理大小 -**返回** +## 二、安装 -* results(list): 鉴定结果 +- ### 1、环境依赖 -## context(trainable=False) + - paddlepaddle >= 1.6.2 + + - paddlehub >= 
+  - paddlehub >= 1.6.0  | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

-获取porn_detection_gru的预训练program以及program的输入输出变量
+- ### 2、安装

-**参数**
+  - ```shell
+    $ hub install porn_detection_gru
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)

-* trainable(bool): trainable=True表示program中的参数在Fine-tune时需要微调,否则保持不变

-**返回**

-* inputs(dict): program的输入变量
-* outputs(dict): program的输出变量
-* main_program(Program): 带有预训练参数的program
+## 三、模型API预测

-## get_labels()
+- ### 1、命令行预测

-获取porn_detection_gru的类别
+  - ```shell
+    $ hub run porn_detection_gru --input_text "黄片下载"
+    ```
+
+  - 或者

-**返回**
+  - ```shell
+    $ hub run porn_detection_gru --input_file test.txt
+    ```
+
+  - 其中test.txt存放待审查文本,每行仅放置一段待审核文本
+
+  - 通过命令行方式实现hub模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

-* labels(dict): porn_detection_gru的类别
+- ### 2、预测代码示例

-## get_vocab_path()
+  - ```python
+    import paddlehub as hub
+
+    porn_detection_gru = hub.Module(name="porn_detection_gru")
+
+    test_text = ["黄片下载", "打击黄牛党"]
+
+    results = porn_detection_gru.detection(texts=test_text, use_gpu=True, batch_size=1)  # 如不使用GPU,请修改为use_gpu=False
+
+    for index, text in enumerate(test_text):
+        results[index]["text"] = text
+    for index, result in enumerate(results):
+        print(results[index])
+
+    # 输出结果如下:
+    # {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
+    # {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
+    ```

-获取预训练时使用的词汇表
+
+- ### 3、API

-**返回**
+  - ```python
+    def detection(texts=[], data={}, use_gpu=False, batch_size=1)
+    ```
+
+    - porn_detection_gru预测接口,鉴定输入句子是否包含色情文案

-* vocab_path(str): 词汇表路径
+    - **参数**

-# PornDetectionGRU 服务部署
+      - texts(list): 待预测数据,如果使用texts参数,则不用传入data参数,二选一即可

-PaddleHub Serving可以部署一个在线色情文案检测服务,可以将此接口用于在线web应用。
+      - data(dict): 预测数据,key必须为text,value是待预测数据。如果使用data参数,则不用传入texts参数,二选一即可。建议使用texts参数,data参数后续会废弃。

-## 第一步:启动PaddleHub Serving
+      - use_gpu(bool): 是否使用GPU预测。如使用GPU预测,请在预测之前设置CUDA_VISIBLE_DEVICES环境变量,否则无需设置

-运行启动命令:
-```shell
-$ hub serving start -m porn_detection_gru
-```
+      - batch_size(int): 批处理大小

-启动时会显示加载模型过程,启动成功后显示
-```shell
-Loading porn_detection_gru successful.
-```
+    - **返回**

-这样就完成了服务化API的部署,默认端口号为8866。
+      - results(list): 鉴定结果
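+
+    - 如需批量审核本地文件中的文本,可参考以下示意代码(test.txt路径与batch_size取值仅为示例):
+
+    - ```python
+      import paddlehub as hub
+
+      porn_detection_gru = hub.Module(name="porn_detection_gru")
+
+      # 每行一段待审核文本,与命令行--input_file的格式一致
+      with open("test.txt", encoding="utf-8") as f:
+          texts = [line.strip() for line in f if line.strip()]
+
+      results = porn_detection_gru.detection(texts=texts, use_gpu=False, batch_size=16)
+      for text, result in zip(texts, results):
+          print(text, result["porn_detection_key"], result["porn_probs"])
+      ```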
-**NOTE:** 如使用GPU预测,则需要在启动服务之前,请设置CUDA_VISIBLE_DEVICES环境变量,否则不用设置。
-## 第二步:发送预测请求
+  - ```python
+    def get_labels()
+    ```
+    - 获取porn_detection_gru的类别

-配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
+    - **返回**

-```python
-import requests
-import json
+      - labels(dict): porn_detection_gru的类别(二分类,是/不是)

-# 待预测数据
-text = ["黄片下载", "打击黄牛党"]
+  - ```python
+    def get_vocab_path()
+    ```

-# 设置运行配置
-# 对应本地预测porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
-data = {"texts": text, "batch_size": 1, "use_gpu":True}
+    - 获取预训练时使用的词汇表

-# 指定预测方法为porn_detection_gru并发送post请求,content-type类型应指定json方式
-# HOST_IP为服务器IP
-url = "http://HOST_IP:8866/predict/porn_detection_gru"
-headers = {"Content-Type": "application/json"}
-r = requests.post(url=url, headers=headers, data=json.dumps(data))
+    - **返回**

-# 打印预测结果
-print(json.dumps(r.json(), indent=4, ensure_ascii=False))
-```
+      - vocab_path(str): 词汇表路径

-关于PaddleHub Serving更多信息参考[服务部署](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.6/docs/tutorial/serving.md)
+
+
+## 四、服务部署
+
+- PaddleHub Serving可以部署一个在线色情文案检测服务,可以将此接口用于在线web应用。
+
+- ## 第一步:启动PaddleHub Serving
+
+  - 运行启动命令:
+  - ```shell
+    $ hub serving start -m porn_detection_gru
+    ```
+
+  - 启动时会显示加载模型过程,启动成功后显示
+  - ```shell
+    Loading porn_detection_gru successful.
+    ```
+
+  - 这样就完成了服务化API的部署,默认端口号为8866。
+
+  - **NOTE:** 如使用GPU预测,请在启动服务之前设置CUDA_VISIBLE_DEVICES环境变量,否则无需设置。
+
+- ## 第二步:发送预测请求
+
+  - 配置好服务端,以下数行代码即可实现发送预测请求,获取预测结果
+
+  - ```python
+    import requests
+    import json
+
+    # 待预测数据
+    text = ["黄片下载", "打击黄牛党"]
+
+    # 设置运行配置
+    # 对应本地预测porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
+    data = {"texts": text, "batch_size": 1, "use_gpu":True}
+
+    # 指定预测方法为porn_detection_gru并发送post请求,content-type类型应指定json方式
+    # HOST_IP为服务器IP
+    url = "http://HOST_IP:8866/predict/porn_detection_gru"
+    headers = {"Content-Type": "application/json"}
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # 打印预测结果
+    print(json.dumps(r.json(), indent=4, ensure_ascii=False))
+    ```
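+
+  - 以下为解析上述响应的示意代码(这里假设预测结果位于返回JSON的results字段,具体字段结构请以实际返回为准):
+
+  - ```python
+    import requests
+    import json
+
+    data = {"texts": ["黄片下载"], "batch_size": 1, "use_gpu": False}
+    url = "http://HOST_IP:8866/predict/porn_detection_gru"  # HOST_IP为服务器IP
+    r = requests.post(url=url, headers={"Content-Type": "application/json"}, data=json.dumps(data))
+
+    # 假设结果位于results字段;若实际字段不同,请据实调整
+    for item in r.json().get("results", []):
+        print(item)
+    ```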
+
+  - 关于PaddleHub Serving更多信息参考[服务部署](../../../../docs/docs_ch/tutorial/serving.md)
+
+
+
+
+## 五、更新历史
+
+* 1.0.0
+
+  初始发布
+
+* 1.1.0
+
+  大幅提升预测性能,同时简化接口使用
+
+  - ```shell
+    $ hub install porn_detection_gru==1.1.0
+    ```
+
diff --git a/modules/text/text_review/porn_detection_gru/README_en.md b/modules/text/text_review/porn_detection_gru/README_en.md
new file mode 100644
index 00000000..3a8446fa
--- /dev/null
+++ b/modules/text/text_review/porn_detection_gru/README_en.md
@@ -0,0 +1,183 @@
+# porn_detection_gru
+
+| Module Name | porn_detection_gru |
+| :------------------ | :------------: |
+| Category | text-text_review |
+| Network | GRU |
+| Dataset | Dataset built by Baidu |
+| Fine-tuning supported or not | No |
+| Module Size | 20M |
+| Latest update date | 2021-02-26 |
+| Data indicators | - |
+
+## I. Basic Information of Module
+
+- ### Module Introduction
+  - The pornography detection model automatically determines whether a text is pornographic and gives the corresponding confidence. It can identify pornographic descriptions, vulgar interactions and obscene content in text.
+  - porn_detection_gru adopts the GRU network structure and tokenizes the input at character granularity, which gives it a high prediction speed. The maximum sentence length of this model is 256 characters, and only prediction is supported.
+
+
+## II. Installation
+
+- ### 1、Environment Dependencies
+
+  - paddlepaddle >= 1.6.2
+
+  - paddlehub >= 1.6.0  | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+- ### 2、Installation
+
+  - ```shell
+    $ hub install porn_detection_gru
+    ```
+  - If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+
+
+## III. Module API and Prediction
+
+- ### 1、Command Line Prediction
+
+  - ```shell
+    $ hub run porn_detection_gru --input_text "黄片下载"
+    ```
+
+  - or
+
+  - ```shell
+    $ hub run porn_detection_gru --input_file test.txt
+    ```
+
+  - test.txt stores the texts to be reviewed, one piece of text per line
+
+  - If you want to call the Hub module through the command line, please refer to: [PaddleHub Command Line Instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2、Prediction Code Example
+
+  - ```python
+    import paddlehub as hub
+
+    porn_detection_gru = hub.Module(name="porn_detection_gru")
+
+    test_text = ["黄片下载", "打击黄牛党"]
+
+    results = porn_detection_gru.detection(texts=test_text, use_gpu=True, batch_size=1)  # If you do not use GPU, please set use_gpu=False
+
+    for index, text in enumerate(test_text):
+        results[index]["text"] = text
+    for index, result in enumerate(results):
+        print(results[index])
+
+    # The output:
+    # {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
+    # {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
+    ```
+
+
+- ### 3、API
+
+  - ```python
+    def detection(texts=[], data={}, use_gpu=False, batch_size=1)
+    ```
+
+    - Prediction API of porn_detection_gru, used to identify whether the input sentences contain pornographic content
+
+    - **Parameters**
+
+      - texts(list): data to be predicted. If the texts parameter is used, there is no need to pass in the data parameter; use either one of the two.
+
+      - data(dict): data to be predicted. The key must be text and the value is the data to be predicted. If the data parameter is used, there is no need to pass in the texts parameter; use either one of the two. The texts parameter is recommended, and the data parameter will be deprecated later.
+
+      - use_gpu(bool): whether to use GPU for prediction. If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise, there is no need to set it.
+
+      - batch_size(int): batch size
+
+    - **Return**
+
+      - results(list): prediction results
+
+
+  - ```python
+    def get_labels()
+    ```
+    - Gets the categories of porn_detection_gru
+
+    - **Return**
+
+      - labels(dict): the categories of porn_detection_gru (binary classification, yes/no)
+
+  - ```python
+    def get_vocab_path()
+    ```
+
+    - Gets the path of the vocabulary used in pre-training
+
+    - **Return**
+
+      - vocab_path(str): vocabulary path
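+
+    - A minimal usage sketch of the two helper APIs above (the printed formats are indicative only and may differ from the actual run):
+
+    - ```python
+      import paddlehub as hub
+
+      porn_detection_gru = hub.Module(name="porn_detection_gru")
+
+      # Inspect the label set and the vocabulary used in pre-training
+      print(porn_detection_gru.get_labels())      # e.g. a dict describing the binary labels
+      print(porn_detection_gru.get_vocab_path())  # local path of the vocabulary file
+      ```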
+
+
+## IV. Server Deployment
+
+- PaddleHub Serving can deploy an online pornography detection service and you can use this interface for online Web applications.
+
+- ## Step 1: Start PaddleHub Serving
+
+  - Run the startup command:
+  - ```shell
+    $ hub serving start -m porn_detection_gru
+    ```
+
+  - The model loading process is displayed on startup. After the startup is successful, the following information is displayed:
+  - ```shell
+    Loading porn_detection_gru successful.
+    ```
+
+  - The serving API is now deployed, and the default port number is 8866.
+
+  - **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise, there is no need to set it.
+
+
+- ## Step 2: Send a prediction request
+
+  - After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
+  - ```python
+    import requests
+    import json
+
+    # data to be predicted
+    text = ["黄片下载", "打击黄牛党"]
+
+    # Set the running configuration
+    # Corresponds to the local prediction porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
+    data = {"texts": text, "batch_size": 1, "use_gpu":True}
+
+    # Set the prediction method to porn_detection_gru and send a POST request; the content-type should be set to json
+    # HOST_IP is the IP address of the server
+    url = "http://HOST_IP:8866/predict/porn_detection_gru"
+    headers = {"Content-Type": "application/json"}
+    r = requests.post(url=url, headers=headers, data=json.dumps(data))
+
+    # print the prediction results
+    print(json.dumps(r.json(), indent=4, ensure_ascii=False))
+    ```
+
+  - For more information about PaddleHub Serving, please refer to: [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
+
+
+
+
+## V. Release Notes
+
+* 1.0.0
+
+  First release
+
+* 1.1.0
+
+  Improves prediction performance and simplifies interface usage
+
+  - ```shell
+    $ hub install porn_detection_gru==1.1.0
+    ```
+
diff --git a/modules/video/classification/nonlocal_kinetics400/README.md b/modules/video/classification/nonlocal_kinetics400/README.md
new file mode 100644
index 00000000..0e88d19b
--- /dev/null
+++ b/modules/video/classification/nonlocal_kinetics400/README.md
@@ -0,0 +1,109 @@
+# nonlocal_kinetics400
+
+|模型名称|nonlocal_kinetics400|
+| :--- | :---: |
+|类别|视频-视频分类|
+|网络|Non-local|
+|数据集|Kinetics-400|
+|是否支持Fine-tuning|否|
+|模型大小|129MB|
+|最新更新日期|2021-02-26|
+|数据指标|-|
+
+
+
+## 一、模型基本信息
+
+- ### 模型介绍
+
+  - Non-local Neural Networks是由Xiaolong Wang等研究者在2017年提出的模型,主要特点是通过引入Non-local操作来描述距离较远的像素点之间的关联关系。其借助于传统计算机视觉中的non-local mean的思想,并将该思想扩展到神经网络中,通过定义输出位置和所有输入位置之间的关联函数,建立全局关联特性。Non-local模型的训练数据采用由DeepMind公布的Kinetics-400动作识别数据集。该PaddleHub Module可支持预测。
+
+
+## 二、安装
+
+- ### 1、环境依赖
+
+  - paddlepaddle >= 1.4.0
+
+  - paddlehub >= 1.0.0  | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+- ### 2、安装
+
+  - ```shell
+    $ hub install nonlocal_kinetics400
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+
+
+
+## 三、模型API预测
+
+- ### 1、命令行预测
+
+  - ```shell
+    hub run nonlocal_kinetics400 --input_path "/PATH/TO/VIDEO" --use_gpu True
+    ```
+
+    或者
+
+  - ```shell
+    hub run nonlocal_kinetics400 --input_file test.txt --use_gpu True
+    ```
+
+  - test.txt 中存放待分类视频的路径;
+  - Note: 该PaddleHub Module目前只支持在GPU环境下使用,在使用前,请使用下述命令指定GPU设备(设备ID请根据实际情况指定)
+
+  - ```shell
+    export CUDA_VISIBLE_DEVICES=0
+    ```
+
+  - 通过命令行方式实现视频分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2、预测代码示例
+
+  - ```python
+    import paddlehub as hub
+
+    # 注意:nonlocal为Python保留关键字,不能用作变量名,此处使用non_local
+    non_local = hub.Module(name="nonlocal_kinetics400")
+
+    test_video_path = "/PATH/TO/VIDEO"
+
+    # set input dict
+    input_dict = {"image": [test_video_path]}
+
+    # execute predict and print the result
+    results = non_local.video_classification(data=input_dict)
+    for result in results:
+        print(result)
+    ```
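+
+  - 预测结果为“类别→概率”的字典,以下给出提取置信度最高的前5个类别的示意代码(Top-5数量仅为示例):
+
+  - ```python
+    import paddlehub as hub
+
+    non_local = hub.Module(name="nonlocal_kinetics400")  # nonlocal为Python关键字,故改用non_local
+    results = non_local.video_classification(data={"image": ["/PATH/TO/VIDEO"]})
+
+    # 每个result为“类别->概率”的字典,按概率从高到低排序后取前5个
+    for result in results:
+        top5 = sorted(result.items(), key=lambda kv: kv[1], reverse=True)[:5]
+        for label, prob in top5:
+            print(label, prob)
+    ```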
+
+- ### 3、API
+
+  - ```python
+    def video_classification(data)
+    ```
+
+    - 用于视频分类预测
+
+    - **参数**
+
+      - data(dict): dict类型,key为image,str类型;value为待分类的视频路径,list类型。
+
+
+    - **返回**
+
+      - result(list\[dict\]): list类型,每个元素为对应输入视频的预测结果。预测结果为dict类型,key为label,value为该label对应的概率值。
+
+
+## 四、更新历史
+
+* 1.0.0
+
+  初始发布
+
+  - ```shell
+    $ hub install nonlocal_kinetics400==1.0.0
+    ```
diff --git a/modules/video/classification/stnet_kinetics400/README.md b/modules/video/classification/stnet_kinetics400/README.md
new file mode 100644
index 00000000..4cbed174
--- /dev/null
+++ b/modules/video/classification/stnet_kinetics400/README.md
@@ -0,0 +1,106 @@
+# stnet_kinetics400
+
+|模型名称|stnet_kinetics400|
+| :--- | :---: |
+|类别|视频-视频分类|
+|网络|StNet|
+|数据集|Kinetics-400|
+|是否支持Fine-tuning|否|
+|模型大小|129MB|
+|最新更新日期|2021-02-26|
+|数据指标|-|
+
+
+
+## 一、模型基本信息
+
+- ### 模型介绍
+
+  - StNet模型框架为ActivityNet Kinetics Challenge 2018中夺冠的基础网络框架,是基于ResNet50实现的。该模型提出super-image的概念,在super-image上进行2D卷积,建模视频中局部时空相关性。另外通过temporal modeling block建模视频的全局时空依赖,最后用一个temporal Xception block对抽取的特征序列进行长时序建模。StNet的训练数据采用由DeepMind公布的Kinetics-400动作识别数据集。该PaddleHub Module可支持预测。
+
+
+
+## 二、安装
+
+- ### 1、环境依赖
+
+  - paddlepaddle >= 1.4.0
+
+  - paddlehub >= 1.0.0  | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+- ### 2、安装
+
+  - ```shell
+    $ hub install stnet_kinetics400
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+
+
+
+## 三、模型API预测
+
+- ### 1、命令行预测
+
+  - ```shell
+    hub run stnet_kinetics400 --input_path "/PATH/TO/VIDEO"
+    ```
+
+    或者
+
+  - ```shell
+    hub run stnet_kinetics400 --input_file test.txt
+    ```
+
+  - test.txt 中存放待分类视频的路径
+
+
+  - 通过命令行方式实现视频分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2、预测代码示例
+
+  - ```python
+    import paddlehub as hub
+
+    stnet = hub.Module(name="stnet_kinetics400")
+
+    test_video_path = "/PATH/TO/VIDEO"
+
+    # set input dict
+    input_dict = {"image": [test_video_path]}
+
+    # execute predict and print the result
+    results = stnet.video_classification(data=input_dict)
+    for result in results:
+        print(result)
+    ```
+
+- ### 3、API
+
+  - ```python
+    def video_classification(data)
+    ```
+
+    - 用于视频分类预测
+
+    - **参数**
+
+      - data(dict): dict类型,key为image,str类型;value为待分类的视频路径,list类型。
+
+
+    - **返回**
+
+      - result(list\[dict\]): list类型,每个元素为对应输入视频的预测结果。预测结果为dict类型,key为label,value为该label对应的概率值。
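+
+    - data中value为视频路径列表,因此可一次传入多个视频进行批量预测,示意如下(视频路径仅为示例):
+
+    - ```python
+      import paddlehub as hub
+
+      stnet = hub.Module(name="stnet_kinetics400")
+
+      # value为视频路径list,可同时传入多个待分类视频
+      input_dict = {"image": ["/PATH/TO/VIDEO_1", "/PATH/TO/VIDEO_2"]}
+
+      results = stnet.video_classification(data=input_dict)
+      for path, result in zip(input_dict["image"], results):
+          print(path, max(result, key=result.get))  # 输出每个视频概率最高的类别
+      ```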
+
+
+## 四、更新历史
+
+* 1.0.0
+
+  初始发布
+
+  - ```shell
+    $ hub install stnet_kinetics400==1.0.0
+    ```
diff --git a/modules/video/classification/tsm_kinetics400/README.md b/modules/video/classification/tsm_kinetics400/README.md
new file mode 100644
index 00000000..5301071b
--- /dev/null
+++ b/modules/video/classification/tsm_kinetics400/README.md
@@ -0,0 +1,106 @@
+# tsm_kinetics400
+
+|模型名称|tsm_kinetics400|
+| :--- | :---: |
+|类别|视频-视频分类|
+|网络|TSM|
+|数据集|Kinetics-400|
+|是否支持Fine-tuning|否|
+|模型大小|95MB|
+|最新更新日期|2021-02-26|
+|数据指标|-|
+
+
+
+## 一、模型基本信息
+
+- ### 模型介绍
+
+  - TSM(Temporal Shift Module)是由MIT和IBM Watson AI Lab的Ji Lin、Chuang Gan和Song Han等人提出的通过时间位移来提高网络视频理解能力的模块。TSM的训练数据采用由DeepMind公布的Kinetics-400动作识别数据集。该PaddleHub Module可支持预测。
+
+
+
+## 二、安装
+
+- ### 1、环境依赖
+
+  - paddlepaddle >= 1.4.0
+
+  - paddlehub >= 1.0.0  | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+- ### 2、安装
+
+  - ```shell
+    $ hub install tsm_kinetics400
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+
+
+
+## 三、模型API预测
+
+- ### 1、命令行预测
+
+  - ```shell
+    hub run tsm_kinetics400 --input_path "/PATH/TO/VIDEO"
+    ```
+
+    或者
+
+  - ```shell
+    hub run tsm_kinetics400 --input_file test.txt
+    ```
+
+  - Note: test.txt 中存放待分类视频的路径
+
+
+  - 通过命令行方式实现视频分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2、预测代码示例
+
+  - ```python
+    import paddlehub as hub
+
+    tsm = hub.Module(name="tsm_kinetics400")
+
+    test_video_path = "/PATH/TO/VIDEO"
+
+    # set input dict
+    input_dict = {"image": [test_video_path]}
+
+    # execute predict and print the result
+    results = tsm.video_classification(data=input_dict)
+    for result in results:
+        print(result)
+    ```
+
+- ### 3、API
+
+  - ```python
+    def video_classification(data)
+    ```
+
+    - 用于视频分类预测
+
+    - **参数**
+
+      - data(dict): dict类型,key为image,str类型;value为待分类的视频路径,list类型。
+
+
+    - **返回**
+
+      - result(list\[dict\]): list类型,每个元素为对应输入视频的预测结果。预测结果为dict类型,key为label,value为该label对应的概率值。
+
+
+## 四、更新历史
+
+* 1.0.0
+
+  初始发布
+
+  - ```shell
+    $ hub install tsm_kinetics400==1.0.0
+    ```
diff --git a/modules/video/classification/tsn_kinetics400/README.md b/modules/video/classification/tsn_kinetics400/README.md
new file mode 100644
index 00000000..e2d2e876
--- /dev/null
+++ b/modules/video/classification/tsn_kinetics400/README.md
@@ -0,0 +1,108 @@
+# tsn_kinetics400
+
+|模型名称|tsn_kinetics400|
+| :--- | :---: |
+|类别|视频-视频分类|
+|网络|TSN|
+|数据集|Kinetics-400|
+|是否支持Fine-tuning|否|
+|模型大小|95MB|
+|最新更新日期|2021-02-26|
+|数据指标|-|
+
+
+
+## 一、模型基本信息
+
+- ### 模型介绍
+
+  - TSN(Temporal Segment Network)是视频分类领域经典的基于2D-CNN的解决方案。该方法主要解决视频的长时间行为判断问题,通过稀疏采样视频帧的方式代替稠密采样,既能捕获视频全局信息,也能去除冗余,降低计算量。最终将每帧特征平均融合后得到视频的整体特征,并用于分类。TSN的训练数据采用由DeepMind公布的Kinetics-400动作识别数据集。该PaddleHub Module可支持预测。
+
+  - 具体网络结构可参考论文:[TSN](https://arxiv.org/abs/1608.00859)。
+
+
+
+## 二、安装
+
+- ### 1、环境依赖
+
+  - paddlepaddle >= 1.4.0
+
+  - paddlehub >= 1.0.0  | [如何安装PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
+
+- ### 2、安装
+
+  - ```shell
+    $ hub install tsn_kinetics400
+    ```
+  - 如您安装时遇到问题,可参考:[零基础windows安装](../../../../docs/docs_ch/get_start/windows_quickstart.md)
+    | [零基础Linux安装](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [零基础MacOS安装](../../../../docs/docs_ch/get_start/mac_quickstart.md)
+
+
+
+
+## 三、模型API预测
+
+- ### 1、命令行预测
+
+  - ```shell
+    hub run tsn_kinetics400 --input_path "/PATH/TO/VIDEO"
+    ```
+
+    或者
+
+  - ```shell
+    hub run tsn_kinetics400 --input_file test.txt
+    ```
+
+  - Note: test.txt 中存放待分类视频的路径
+
+
+  - 通过命令行方式实现视频分类模型的调用,更多请见 [PaddleHub命令行指令](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
+
+- ### 2、预测代码示例
+
+  - ```python
+    import paddlehub as hub
+
+    tsn = hub.Module(name="tsn_kinetics400")
+
+    test_video_path = "/PATH/TO/VIDEO"
+
+    # set input dict
+    input_dict = {"image": [test_video_path]}
+
+    # execute predict and print the result
+    results = tsn.video_classification(data=input_dict)
+    for result in results:
+        print(result)
+    ```
+
+- ### 3、API
+
+  - ```python
+    def video_classification(data)
+    ```
+
+    - 用于视频分类预测
+
+    - **参数**
+
+      - data(dict): dict类型,key为image,str类型;value为待分类的视频路径,list类型。
+
+
+    - **返回**
+
+      - result(list\[dict\]): list类型,每个元素为对应输入视频的预测结果。预测结果为dict类型,key为label,value为该label对应的概率值。
+
+
+## 四、更新历史
+
+* 1.0.0
+
+  初始发布
+
+  - ```shell
+    $ hub install tsn_kinetics400==1.0.0
+    ```
-- 
GitLab