Unverified · Commit 64ea4006 authored by S shinichiye, committed by GitHub

Add modellist&upgrade readme (#1741)

* Update README.md

update readme

* Update README.md

correct some mistakes

* Update README.md

* Create README_en.md

* Update the serial number

Update the serial number and correct some mistakes

* Update README.md

* Create README_en.md

* Create README_en.md

* Update README.md

* Update README.md

* correct a mistake

correct a mistake

* Create README_en.md

* Create tsn_kinetics400

* Delete tsn_kinetics400

* Create README.md

* Create README.md

* Create README.md

* Update README.md

* Create README.md

* Update README.md

* Update README.md

* Update README.md

* Update README_ch.md

* Update README.md

* Update README.md

* Update README.md

* Update README_ch.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README_ch.md

* Update README.md

* Update README.md

* Update README_ch.md

* Update and rename README.md to README_ch.md

* Update README_ch.md

* Create README.md

* Update README.md

* Update README.md

* Update README_ch.md
Co-authored-by: NKP <109694228@qq.com>
Parent 85724977
@@ -4,7 +4,7 @@ English | [简体中文](README_ch.md)
<img src="./docs/imgs/paddlehub_logo.jpg" align="middle">
<p align="center">
<div align="center">
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/en/release-v2.1"> Tutorial </a> | <a href="https://www.paddlepaddle.org.cn/hublist"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a> </h3>
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/en/release-v2.1"> Tutorial </a> | <a href="./modules"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a> </h3>
</div>
------------------------------------------------------------------------------------------
@@ -28,7 +28,7 @@ English | [简体中文](README_ch.md)
## Introduction and Features
- **PaddleHub** aims to provide developers with rich, high-quality, and directly usable pre-trained models.
- **Abundant Pre-trained Models**: 300+ pre-trained models cover the 5 major categories, including Image, Text, Audio, Video, and Industrial application. All of them are free for download and offline usage.
- **Abundant Pre-trained Models**: 360+ pre-trained models cover the 5 major categories, including Image, Text, Audio, Video, and Industrial application. All of them are free for download and offline usage.
- **No Need for Deep Learning Background**: you can use AI models quickly and enjoy the dividends of the artificial intelligence era.
- **Quick Model Prediction**: model prediction can be realized through a few lines of scripts to quickly experience the model effect.
- **Model As Service**: one-line command to build deep learning model API service deployment capabilities.
@@ -44,8 +44,8 @@ English | [简体中文](README_ch.md)
## Visualization Demo [[More]](./docs/docs_en/visualization.md)
### **Computer Vision (161 models)**
## Visualization Demo [[More]](./docs/docs_en/visualization.md) [[ModelList]](./modules)
### **[Computer Vision (212 models)](./modules#Image)**
<div align="center">
<img src="./docs/imgs/Readme_Related/Image_all.gif" width = "530" height = "400" />
</div>
@@ -53,7 +53,7 @@ English | [简体中文](README_ch.md)
- Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), [AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2), [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [Zhengxia Zou](https://github.com/jiupinjia/SkyAR), [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models; you can try to train your models with them.
### **Natural Language Processing (129 models)**
### **[Natural Language Processing (130 models)](./modules#Text)**
<div align="center">
<img src="./docs/imgs/Readme_Related/Text_all.gif" width = "640" height = "240" />
</div>
@@ -62,7 +62,7 @@ English | [简体中文](README_ch.md)
### Speech (3 models)
### [Speech (15 models)](./modules#Audio)
- TTS speech synthesis algorithm, multiple algorithms are available.
- Many thanks to CopyRight@[Parakeet](https://github.com/PaddlePaddle/Parakeet) for the pre-trained models, you can try to train your models with Parakeet.
- Input: `Life was like a box of chocolates, you never know what you're gonna get.`
@@ -95,7 +95,7 @@ English | [简体中文](README_ch.md)
</table>
</div>
### Video (8 models)
### [Video (8 models)](./modules#Video)
- Short video classification trained on large-scale video datasets, supporting prediction of 3000+ tag types for short-form videos.
- Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained model, you can try to train your models with PaddleVideo.
- `Example: Input a short video of swimming, the algorithm can output the result of "swimming"`
......
@@ -4,7 +4,7 @@
<img src="./docs/imgs/paddlehub_logo.jpg" align="middle">
<p align="center">
<div align="center">
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/zh_CN/release-v2.1//"> Tutorial </a> | <a href="https://www.paddlepaddle.org.cn/hublist"> Model Search </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a>
<h3> <a href=#QuickStart> QuickStart </a> | <a href="https://paddlehub.readthedocs.io/zh_CN/release-v2.1//"> Tutorial </a> | <a href="./modules/README_ch.md"> Models List </a> | <a href="https://www.paddlepaddle.org.cn/hub"> Demos </a>
</h3>
</div>
@@ -30,7 +30,7 @@
## Introduction and Features
- PaddleHub aims to provide developers with rich, high-quality, and directly usable pre-trained models
- **[Abundant Model Types]**: 300+ pre-trained models covering the 5 major categories of CV, NLP, Audio, Video, and industrial applications; all open source for download and usable offline
- **[Abundant Model Types]**: **360+** pre-trained models covering the 5 major categories of CV, NLP, Audio, Video, and industrial applications; all open source for download and usable offline
- **[Very Low Barrier to Entry]**: no deep learning background, data, or training process required; AI models can be used quickly
- **[One-Click Model Prediction]**: models can be invoked via one command line or a minimal Python API to quickly experience the model effect
- **[One-Click Model Serving]**: one command to deploy a deep learning model as an API service
@@ -47,9 +47,9 @@
## **Visualization Demo [[More]](./docs/docs_ch/visualization.md)**
## **Visualization Demo [[More]](./docs/docs_ch/visualization.md)[[Models List]](./modules/README_ch.md)**
### **Computer Vision (161 models)**
### **[Computer Vision (212 models)](./modules/README_ch.md#图像)**
- Includes image classification, face detection, mask detection, vehicle detection, face/body/hand keypoint detection, portrait segmentation, text recognition for 80+ languages, image super-resolution/colorization/cartoonization, etc.
<div align="center">
<img src="./docs/imgs/Readme_Related/Image_all.gif" width = "530" height = "400" />
@@ -58,7 +58,7 @@
- Many thanks to CopyRight@[PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR), [PaddleDetection](https://github.com/PaddlePaddle/PaddleDetection), [PaddleGAN](https://github.com/PaddlePaddle/PaddleGAN), [AnimeGAN](https://github.com/TachibanaYoshino/AnimeGANv2), [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose), [PaddleSeg](https://github.com/PaddlePaddle/PaddleSeg), [Zhengxia Zou](https://github.com/jiupinjia/SkyAR), [PaddleClas](https://github.com/PaddlePaddle/PaddleClas) for the pre-trained models; training capability is open, welcome to try.
### **Natural Language Processing (129 models)**
### **[Natural Language Processing (130 models)](./modules/README_ch.md#文本)**
- Includes Chinese word segmentation, POS tagging and named entity recognition, syntactic parsing, AI poetry/couplet/love-words/acrostic generation, Chinese sentiment analysis of comments, Chinese pornographic text review, etc.
<div align="center">
<img src="./docs/imgs/Readme_Related/Text_all.gif" width = "640" height = "240" />
@@ -67,7 +67,7 @@
- Many thanks to CopyRight@[ERNIE](https://github.com/PaddlePaddle/ERNIE), [LAC](https://github.com/baidu/LAC), [DDParser](https://github.com/baidu/DDParser) for the pre-trained models; training capability is open, welcome to try.
### **Speech (3 models)**
### **[Speech (15 models)](./modules/README_ch.md#语音)**
- TTS speech synthesis algorithms, multiple algorithms available
- Many thanks to CopyRight@[Parakeet](https://github.com/PaddlePaddle/Parakeet) for the pre-trained models; training capability is open, welcome to try.
- Input: `Life was like a box of chocolates, you never know what you're gonna get.`
@@ -100,7 +100,7 @@
</table>
</div>
### **Video (8 models)**
### **[Video (8 models)](./modules/README_ch.md#视频)**
- Includes short video classification, supporting 3000+ tag types with TOP-K tag output; multiple algorithms available.
- Many thanks to CopyRight@[PaddleVideo](https://github.com/PaddlePaddle/PaddleVideo) for the pre-trained model; training capability is open, welcome to try.
- `Example: input a short swimming video, the algorithm outputs the result "swimming"`
......
This diff is collapsed.
This diff is collapsed.
# chinese_ocr_db_crnn_mobile
| Module Name | chinese_ocr_db_crnn_mobile |
| :------------------ | :------------: |
| Category | image-text_recognition |
| Network | Differentiable Binarization+RCNN |
| Dataset | icdar2015 |
| Fine-tuning supported or not | No |
| Module Size | 16M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information of Module
- ### Application Effect Display
- [Online experience in OCR text recognition scenarios](https://www.paddlepaddle.org.cn/hub/scene/ocr)
- Example result:
<p align="center">
<img src="https://user-images.githubusercontent.com/76040149/133097562-d8c9abd1-6c70-4d93-809f-fa4735764836.png" width = "600" hspace='10'/> <br />
</p>
- ### Module Introduction
- The chinese_ocr_db_crnn_mobile module recognizes Chinese characters in images. It first obtains text boxes using the [chinese_text_detection_db_mobile module](../chinese_text_detection_db_mobile/), classifies the angle of each detected box, and then recognizes the Chinese characters inside it with CRNN (Convolutional Recurrent Neural Network). This module is an ultra-lightweight Chinese OCR model that supports direct prediction.
<p align="center">
<img src="https://user-images.githubusercontent.com/76040149/133098254-7c642826-d6d7-4dd0-986e-371622337867.png" width = "300" height = "450" hspace='10'/> <br />
</p>
- For more information, please refer to:[An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition](https://arxiv.org/pdf/1507.05717.pdf)
## II. Installation
- ### 1、Environment dependencies
- paddlepaddle >= 1.7.2
- paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- shapely
- pyclipper
- ```shell
$ pip install shapely pyclipper
```
- **This Module relies on the third-party libraries shapely and pyclipper. Please install shapely and pyclipper before using this Module.**
- ### 2、Installation
- ```shell
$ hub install chinese_ocr_db_crnn_mobile
```
- If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run chinese_ocr_db_crnn_mobile --input_path "/PATH/TO/IMAGE"
```
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
import cv2
ocr = hub.Module(name="chinese_ocr_db_crnn_mobile", enable_mkldnn=True) # MKLDNN acceleration is only available on CPU
result = ocr.recognize_text(images=[cv2.imread('/PATH/TO/IMAGE')])
# or
# result = ocr.recognize_text(paths=['/PATH/TO/IMAGE'])
```
- ### 3、API
- ```python
__init__(text_detector_module=None, enable_mkldnn=False)
```
- Construct the ChineseOCRDBCRNN object
- **Parameter**
- text_detector_module(str): PaddleHub Module Name for text detection, use [chinese_text_detection_db_mobile Module](../chinese_text_detection_db_mobile/) by default if set to None. Its function is to detect the text in the picture.
- enable_mkldnn(bool): Whether to enable MKLDNN to accelerate CPU computing. This parameter is valid only when the CPU is running. The default is False.
- ```python
  def recognize_text(images=[],
                     paths=[],
                     use_gpu=False,
                     output_dir='ocr_result',
                     visualization=False,
                     box_thresh=0.5,
                     text_thresh=0.5,
                     angle_classification_thresh=0.9)
  ```
- Prediction API for detecting and recognizing all Chinese text in the input image.
- **Parameter**
- paths (list\[str\]): image paths;
- images (list\[numpy.ndarray\]): image data, with ndarray.shape in the format \[H, W, C\], BGR;
- use\_gpu (bool): whether to use GPU. **If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable first;**
- box\_thresh (float): confidence threshold for text box detection;
- text\_thresh (float): confidence threshold for Chinese text recognition;
- angle_classification_thresh (float): confidence threshold for text angle classification;
- visualization (bool): whether to save the recognition results as image files;
- output\_dir (str): path to save the images, ocr\_result by default.
- **Return**
- res (list\[dict\]): list of recognition results, where each element is a dict with the fields:
- data (list\[dict\]): recognition result; each element in the list is a dict with the fields:
- text (str): recognized text
- confidence (float): confidence of the result
- text_box_position (list): pixel coordinates of the text box in the original image, a 4*2 matrix giving the coordinates of the lower-left, lower-right, upper-right, and upper-left vertices of the text box in turn
- data is \[\] if there is no result
- save_path (str, optional): path where the result is saved; save_path is '' if no image is saved.
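- To make the nested return format above concrete, here is a small self-contained sketch (plain Python, no PaddleHub needed; the sample values are invented for illustration) that walks the documented fields and derives a box size from `text_box_position`:

  - ```python
    # Hypothetical sample mirroring the documented recognize_text() return format.
    sample_res = [{
        "data": [{
            "text": "你好",
            "confidence": 0.98,
            # 4x2 matrix: lower-left, lower-right, upper-right, upper-left vertices
            "text_box_position": [[10, 50], [90, 50], [90, 10], [10, 10]],
        }],
        "save_path": "",
    }]

    for item in sample_res:
        for rec in item["data"]:
            xs = [p[0] for p in rec["text_box_position"]]
            ys = [p[1] for p in rec["text_box_position"]]
            # Axis-aligned width/height of the detected box in pixels.
            width, height = max(xs) - min(xs), max(ys) - min(ys)
            print(rec["text"], rec["confidence"], width, height)
    ```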
## IV. Server Deployment
- PaddleHub Serving can deploy an online OCR text recognition service.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m chinese_ocr_db_crnn_mobile
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it is not needed.
- ### Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
import requests
import json
import cv2
import base64
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    # tostring() is deprecated in recent NumPy; tobytes() is equivalent
    return base64.b64encode(data.tobytes()).decode('utf8')
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread("/PATH/TO/IMAGE"))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/chinese_ocr_db_crnn_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction result
print(r.json()["results"])
```
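- The `cv2_to_base64` helper above only base64-encodes the JPEG bytes produced by `cv2.imencode`; the server performs the inverse. A minimal stdlib-only sketch of that inverse (the `cv2` decode step is omitted and the payload bytes are stand-ins, not real JPEG data):

  - ```python
    import base64

    def base64_to_bytes(b64_str):
        # Inverse of cv2_to_base64 above, up to the cv2 encode/decode step:
        # recover the raw JPEG bytes from the base64 string.
        return base64.b64decode(b64_str.encode("utf8"))

    # Round-trip check with stand-in bytes (a real client would send JPEG data).
    payload = b"\xff\xd8\xff\xe0 fake jpeg bytes"
    encoded = base64.b64encode(payload).decode("utf8")
    assert base64_to_bytes(encoded) == payload
    ```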
## V. Release Note
* 1.0.0
First release
* 1.0.1
Fixed a failure when invoking the model via the online service
* 1.0.2
Supports MKLDNN to speed up CPU computation
* 1.1.0
An ultra-lightweight three-stage model (text box detection - angle classification - text recognition) is used to recognize text in images.
* 1.1.1
Supports recognition of spaces in text.
* 1.1.2
Fixed an issue where only 30 fields could be detected.
- ```shell
$ hub install chinese_ocr_db_crnn_mobile==1.1.2
```
# senta_bilstm
| Module Name | senta_bilstm |
| :------------------ | :------------: |
| Category | text-sentiment_analysis |
| Network | BiLSTM |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 690M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information of Module
- ### Module Introduction
- Sentiment Classification (Senta for short) automatically judges the sentiment polarity of subjective Chinese text and gives a corresponding confidence score. It can help enterprises understand user consumption habits, analyze hot topics, and monitor public opinion in crises, providing useful decision support. The model is based on a bidirectional LSTM and outputs positive and negative sentiment classes.
## II. Installation
- ### 1、Environment dependencies
- paddlepaddle >= 1.8.0
- paddlehub >= 1.8.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install senta_bilstm
```
- If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run senta_bilstm --input_text "这家餐厅很好吃"
```
- or
- ```shell
$ hub run senta_bilstm --input_file test.txt
```
- test.txt stores the text to be predicted, for example:
> 这家餐厅很好吃
> 这部电影真的很差劲
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
senta = hub.Module(name="senta_bilstm")
test_text = ["这家餐厅很好吃", "这部电影真的很差劲"]
results = senta.sentiment_classify(texts=test_text,
use_gpu=False,
batch_size=1)
for result in results:
print(result['text'])
print(result['sentiment_label'])
print(result['sentiment_key'])
print(result['positive_probs'])
print(result['negative_probs'])
# 这家餐厅很好吃 1 positive 0.9407 0.0593
# 这部电影真的很差劲 0 negative 0.02 0.98
```
- ### 3、API
- ```python
def sentiment_classify(texts=[], data={}, use_gpu=False, batch_size=1)
```
- Prediction API of senta_bilstm, classifying the sentiment of input sentences (binary: positive/negative)
- **Parameter**
- texts(list): data to be predicted; if the texts parameter is used, the data parameter need not be passed (use either one)
- data(dict): data to be predicted; the key must be "text" and the value is the data to be predicted. If the data parameter is used, the texts parameter need not be passed. The texts parameter is recommended; data will be deprecated later.
- use_gpu(bool): whether to use GPU. If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it is not needed.
- batch_size(int): batch size
- **Return**
- results(list): result of sentiment classification
- ```python
def get_labels()
```
- Get the categories of senta_bilstm
- **Return**
- labels(dict): categories of senta_bilstm (binary: positive/negative)
- ```python
def get_vocab_path()
```
- Get the vocabulary used in pre-training
- **Return**
- vocab_path(str): Vocabulary path
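- As a small illustration of working with the documented result fields (plain Python, no PaddleHub required; the sample records are copied from the prediction example above):

  - ```python
    # Sample results mirroring the documented sentiment_classify() output fields.
    sample_results = [
        {"text": "这家餐厅很好吃", "sentiment_label": 1, "sentiment_key": "positive",
         "positive_probs": 0.9407, "negative_probs": 0.0593},
        {"text": "这部电影真的很差劲", "sentiment_label": 0, "sentiment_key": "negative",
         "positive_probs": 0.02, "negative_probs": 0.98},
    ]

    def summarize(results):
        # Keep only texts judged positive with high confidence.
        return [r["text"] for r in results
                if r["sentiment_label"] == 1 and r["positive_probs"] > 0.9]

    print(summarize(sample_results))
    ```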
## IV. Server Deployment
- PaddleHub Serving can deploy an online sentiment analysis service, and you can use this interface for online web applications.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m senta_bilstm
```
- The model loading process is displayed on startup. After the startup is successful, the following information is displayed:
- ```shell
Loading senta_bilstm successful.
```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it is not needed.
- ### Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
import requests
import json
# data to be predicted
text = ["这家餐厅很好吃", "这部电影真的很差劲"]
# Set the running configuration
# Corresponding to local prediction senta_bilstm.sentiment_classify(texts=text, batch_size=1, use_gpu=True)
data = {"texts": text, "batch_size": 1, "use_gpu":True}
# set the prediction method to senta_bilstm and send a POST request, content-type should be set to json
# HOST_IP is the IP address of the server
url = "http://HOST_IP:8866/predict/senta_bilstm"
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction result
print(json.dumps(r.json(), indent=4, ensure_ascii=False))
```
- For more information about PaddleHub Serving, please refer to:[Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
## V. Release Note
* 1.0.0
First release
* 1.0.1
Vocabulary upgrade
* 1.1.0
Significantly improved prediction performance
* 1.2.0
Model upgraded; supports transfer learning for text classification, text matching, and other tasks
- ```shell
$ hub install senta_bilstm==1.2.0
```
# ernie_gen
| Module Name | ernie_gen |
| :------------------ | :-----------: |
| Category | text-text generation |
| Network | ERNIE-GEN |
| Dataset | - |
| Fine-tuning supported or not | Yes |
| Module Size | 85K |
| Latest update date | 2021-07-20 |
| Data indicators | - |
## I. Basic Information of Module
- ### Module Introduction
- ERNIE-GEN is a pre-training/fine-tuning framework for generation tasks. It is the first to add a span-by-span generation task at the pre-training stage, so that the model generates a semantically complete span at each step. In pre-training and fine-tuning, an infilling generation mechanism and a noise-aware mechanism are used to mitigate the exposure bias problem. In addition, ERNIE-GEN adopts a multi-span, multi-granularity target-text sampling strategy to strengthen the correlation between source and target texts and enhance the interaction between the encoder and the decoder.
- The ernie_gen module supports fine-tuning and can be used to quickly build modules for specific scenarios.
<p align="center">
<img src="https://user-images.githubusercontent.com/76040149/133191670-8eb1c542-f8e8-4715-adb2-6346b976fab1.png" width="600" hspace='10'/>
</p>
- For more details, please refer to: [ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation](https://arxiv.org/abs/2001.11314)
## II. Installation
- ### 1、Environment dependencies
- paddlepaddle >= 2.0.0
- paddlehub >= 2.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- paddlenlp >= 2.0.0
- ### 2、Installation
- ```shell
  $ hub install ernie_gen
  ```
- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ernie_gen can be used **only after it has been fine-tuned on a task-specific dataset**
- There are many types of text generation tasks; ernie_gen provides only the basic capability for text generation and can be used only after fine-tuning on a dataset for a specific task
- PaddleHub provides a simple fine-tune dataset: [train.txt](./test_data/train.txt), [dev.txt](./test_data/dev.txt)
- PaddleHub also offers several well-performing fine-tuned pre-trained models: [Couplet generation](../ernie_gen_couplet/), [Lover's words generation](../ernie_gen_lover_words/), [Poetry generation](../ernie_gen_poetry/)
### 1、Fine-tune and encapsulation
- #### Fine-tune Code Example
- ```python
  import paddlehub as hub

  module = hub.Module(name="ernie_gen")
  result = module.finetune(
      train_path='train.txt',
      dev_path='dev.txt',
      max_steps=300,
      batch_size=2)
  module.export(params_path=result['last_save_path'], module_name="ernie_gen_test", author="test")
  ```
- #### API Instruction
- ```python
  def finetune(train_path,
               dev_path=None,
               save_dir="ernie_gen_result",
               init_ckpt_path=None,
               use_gpu=True,
               max_steps=500,
               batch_size=8,
               max_encode_len=15,
               max_decode_len=15,
               learning_rate=5e-5,
               warmup_proportion=0.1,
               weight_decay=0.1,
               noise_prob=0,
               label_smooth=0,
               beam_width=5,
               length_penalty=1.0,
               log_interval=100,
               save_interval=200):
  ```
- API for fine-tuning the model parameters
- **Parameter**
- train_path(str): Training set path. The training set format should be "serial number\tinput text\tlabel", such as "1\t床前明月光\t疑是地上霜"; note that \t cannot be replaced by spaces
- dev_path(str): Validation set path. The validation set format should be "serial number\tinput text\tlabel", such as "1\t举头望明月\t低头思故乡"; note that \t cannot be replaced by spaces
- save_dir(str): Model saving and validation sets predict output paths.
- init_ckpt_path(str): The model initializes the loading path to realize incremental training.
- use_gpu(bool): use gpu or not
- max_steps(int): Maximum training steps.
- batch_size(int): Batch size during training.
- max_encode_len(int): Maximum encoding length.
- max_decode_len(int): Maximum decoding length.
- learning_rate(float): Learning rate size.
- warmup_proportion(float): Warmup rate.
- weight_decay(float): Weight decay size.
- noise_prob(float): Noise probability, refer to the Ernie Gen's paper.
- label_smooth(float): Label smoothing weight.
- beam_width(int): Beam size of validation set at the time of prediction.
- length_penalty(float): Length penalty weight for validation set prediction.
- log_interval(int): Number of steps at a training log printing interval.
- save_interval(int): training model save interval deployment. The validation set will make predictions after the model is saved.
- **Return**
- result(dict): Run result, containing 2 keys:
- last_save_path(str): Save path of the model at the end of training.
- last_ppl(float): Perplexity of the model at the end of training.
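- The tab-separated file format expected by `train_path`/`dev_path` can be generated as follows (a sketch; the file name `train_sample.txt` and the example pairs are illustrative):

  - ```python
    # Write a tiny fine-tune file in the documented format:
    # "serial number<TAB>input text<TAB>label", one example per line.
    pairs = [("床前明月光", "疑是地上霜"), ("举头望明月", "低头思故乡")]

    with open("train_sample.txt", "w", encoding="utf8") as f:
        for i, (src, tgt) in enumerate(pairs, start=1):
            f.write(f"{i}\t{src}\t{tgt}\n")  # \t must not be replaced by spaces

    # Read it back to verify the three tab-separated columns.
    with open("train_sample.txt", encoding="utf8") as f:
        lines = [line.rstrip("\n").split("\t") for line in f]
    print(lines[0])
    ```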
- ```python
  def export(params_path,
             module_name,
             author,
             version="1.0.0",
             summary="",
             author_email="",
             export_path="."):
  ```
- The export API packages the trained parameters into a Hub Module with one click.
- **Parameter**
- params_path(str): Module parameter path.
- module_name(str): Module name, such as "ernie_gen_couplet".
- author(str): Author name.
- max_encode_len(int): Maximum encoding length.
- max_decode_len(int): Maximum decoding length.
- version(str): The version number.
- summary(str): English introduction to Module.
- author_email(str): Email address of the author.
- export_path(str): Module export path.
### 2、Model Prediction
- **Define `$module_name` as the module_name specified in export**
- After the model is converted, install it with `hub install $module_name`; the custom module can then be invoked in the following two ways:
- #### Method 1: Command line prediction
- ```shell
  $ hub run $module_name --input_text="input text" --use_gpu True --beam_width 5
  ```
- To invoke Hub models from the command line, see [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- #### Method 2: API prediction
- ```python
  import paddlehub as hub

  module = hub.Module(name="$module_name")
  test_texts = ["input text 1", "input text 2"]
  # generate takes 3 parameters: texts is the list of input texts, use_gpu specifies whether to use the GPU, and beam_width sets the beam search width.
  results = module.generate(texts=test_texts, use_gpu=True, beam_width=5)
  for result in results:
      print(result)
  ```
- You can also pack the `$module_name` folder into a tar.gz archive and contact the PaddleHub team to upload it to the PaddleHub model repository, so that more users can install your model with one click. PaddleHub warmly welcomes your contributions to growing the open-source community.
## IV. Server Deployment
- PaddleHub Serving can deploy an online text generation service.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
  $ hub serving start -m $module_name -p 8866
  ```
- The serving API for text generation is now deployed; the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it is not needed.
- ### Step 2: Send a predictive request
- With the following few lines of code, a client can send a prediction request and obtain the prediction result
- ```python
  import requests
  import json

  # Send an HTTP request
  data = {'texts': ["input text 1", "input text 2"],
          'use_gpu': True, 'beam_width': 5}
  headers = {"Content-type": "application/json"}
  url = "http://127.0.0.1:8866/predict/$module_name"
  r = requests.post(url=url, headers=headers, data=json.dumps(data))

  # Save the results
  results = r.json()["results"]
  for result in results:
      print(result)
  ```
- **NOTE:** `$module_name` above is the module_name specified in export
## V. Release Note
* 1.0.0
First release
* 1.0.1
Fixed a model export bug
* 1.0.2
Fixed a bug on Windows
* 1.1.0
Integrated PaddleNLP
- ```shell
  $ hub install ernie_gen==1.1.0
  ```
@@ -63,13 +63,13 @@
- ### 2、Prediction Code Example
- ```python
  import paddlehub as hub

  readingPicturesWritingPoems = hub.Module(name="reading_pictures_writing_poems")
  results = readingPicturesWritingPoems.WritingPoem(image="scenery.jpg", use_gpu=False)
  for result in results:
      print(result)
  ```
- ### 3、API
......
# porn_detection_cnn

| Module Name | porn_detection_cnn |
| :------------------ | :------------: |
| Category | text-text review |
| Network | CNN |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 20M |
| Latest update date | 2021-02-26 |
| Data indicators | - |

## I. Basic Information of Module
- ### Module Introduction
- The porn detection model automatically judges whether a text is pornographic and gives the corresponding confidence, identifying pornographic descriptions, vulgar dating content, and obscene text.
- porn_detection_cnn uses a CNN network with character-level tokenization and offers fast prediction. The maximum sentence length of the model is 256 characters; it supports prediction only.

## II. Installation
- ### 1、Environment dependencies
- paddlepaddle >= 1.6.2
- paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
- ```shell
  $ hub install porn_detection_cnn
  ```
- If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
  $ hub run porn_detection_cnn --input_text "黄片下载"
  ```
- or
- ```shell
  $ hub run porn_detection_cnn --input_file test.txt
  ```
- test.txt stores the text to be reviewed, one text per line
- To call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
- ### 2、Prediction Code Example
- ```python
  import paddlehub as hub

  porn_detection_cnn = hub.Module(name="porn_detection_cnn")
  test_text = ["黄片下载", "打击黄牛党"]
  results = porn_detection_cnn.detection(texts=test_text, use_gpu=True, batch_size=1)
  for index, text in enumerate(test_text):
      results[index]["text"] = text
  for index, result in enumerate(results):
      print(results[index])
  # Output:
  # {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
  # {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
  ```
- ### 3、API
- ```python
  def detection(texts=[], data={}, use_gpu=False, batch_size=1)
  ```
- Prediction API of porn_detection_cnn, judging whether the input sentences contain pornographic content
- **Parameter**
- texts(list): data to be predicted; if the texts parameter is used, the data parameter need not be passed (use either one)
- data(dict): data to be predicted; the key must be "text" and the value is the data to be predicted. If the data parameter is used, the texts parameter need not be passed. The texts parameter is recommended; data will be deprecated later.
- use_gpu(bool): whether to use GPU. If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it is not needed
- batch_size(int): batch size
- **Return**
- results(list): detection results
- ```python
  def get_labels()
  ```
- Get the categories of porn_detection_cnn
- **Return**
- labels(dict): categories of porn_detection_cnn (binary: porn / not porn)
- ```python
  def get_vocab_path()
  ```
- Get the vocabulary used in pre-training
- **Return**
- vocab_path(str): vocabulary path
## IV. Server Deployment
- PaddleHub Serving can deploy an online porn detection service, and you can use this interface for online web applications.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
  $ hub serving start -m porn_detection_cnn
  ```
- The model loading process is displayed on startup; after a successful start, the following is displayed:
- ```shell
  Loading porn_detection_cnn successful.
  ```
- The serving API is now deployed and the default port number is 8866.
- **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it is not needed.
- ### Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
  import requests
  import json

  # data to be predicted
  text = ["黄片下载", "打击黄牛党"]

  # Set the running configuration
  # Corresponding to local prediction porn_detection_cnn.detection(texts=text, batch_size=1, use_gpu=True)
  data = {"texts": text, "batch_size": 1, "use_gpu": True}

  # Set the prediction method to porn_detection_cnn and send a POST request; content-type should be set to json
  # HOST_IP is the IP address of the server
  url = "http://HOST_IP:8866/predict/porn_detection_cnn"
  headers = {"Content-Type": "application/json"}
  r = requests.post(url=url, headers=headers, data=json.dumps(data))

  # print prediction result
  print(json.dumps(r.json(), indent=4, ensure_ascii=False))
  ```
- For more information about PaddleHub Serving, please refer to: [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)

## V. Release Note
* 1.0.0
First release
* 1.1.0
Significantly improved prediction performance and simplified the interface
- ```shell
  $ hub install porn_detection_cnn==1.1.0
  ```
# porn_detection_gru

| Module Name | porn_detection_gru |
| :------------------ | :------------: |
| Category | text - text review |
| Network | GRU |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 20M |
| Latest update date | 2021-02-26 |
| Data indicators | - |

## I. Basic Information

- ### Module Introduction

  - The pornography detection model automatically determines whether a text is pornographic, gives the corresponding confidence, and identifies pornographic descriptions, vulgar solicitations and filthy language in the text.
  - porn_detection_gru adopts a GRU network structure and tokenizes at character granularity, which gives it a high prediction speed. The maximum sentence length of this module is 256 characters, and only prediction is supported.

## II. Installation

- ### 1、Environment dependencies

  - paddlepaddle >= 1.6.2
  - paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

  - ```shell
    $ hub install porn_detection_gru
    ```
  - If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

  - ```shell
    $ hub run porn_detection_gru --input_text "黄片下载"
    ```
  - or
  - ```shell
    $ hub run porn_detection_gru --input_file test.txt
    ```
  - test.txt stores the texts to be reviewed, one text per line.
  - For more information about the PaddleHub command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

  - ```python
    import paddlehub as hub

    porn_detection_gru = hub.Module(name="porn_detection_gru")

    test_text = ["黄片下载", "打击黄牛党"]

    results = porn_detection_gru.detection(texts=test_text, use_gpu=True, batch_size=1)  # if you do not use GPU, set use_gpu=False

    for index, text in enumerate(test_text):
        results[index]["text"] = text
    for index, result in enumerate(results):
        print(results[index])

    # The output:
    # {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
    # {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
    ```

- ### 3、API

  - ```python
    def detection(texts=[], data={}, use_gpu=False, batch_size=1)
    ```
    - Prediction API of porn_detection_gru, identifying whether the input sentences contain pornography.

    - **Parameters**

      - texts(list): data to be predicted; if the texts parameter is used, the data parameter need not be passed in. Use either one of the two.
      - data(dict): data to be predicted; the key must be "text" and the value is the data to be predicted. If the data parameter is used, texts need not be passed in. The texts parameter is recommended; data will be deprecated later.
      - use_gpu(bool): whether to use GPU for prediction. If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it need not be set.
      - batch_size(int): batch size

    - **Return**

      - results(list): prediction results

  - ```python
    def context(trainable=False)
    ```
    - Get the pre-trained program of porn_detection_gru as well as the program's input and output variables.

    - **Parameters**

      - trainable(bool): trainable=True means the parameters of the program are fine-tuned; otherwise they are kept unchanged.

    - **Return**

      - inputs(dict): input variables of the program
      - outputs(dict): output variables of the program
      - main_program(Program): the program with pre-trained parameters

  - ```python
    def get_labels()
    ```
    - Get the categories of porn_detection_gru.

    - **Return**

      - labels(dict): categories of porn_detection_gru (binary classification: porn / not porn)

  - ```python
    def get_vocab_path()
    ```
    - Get the path of the vocabulary used in pre-training.

    - **Return**

      - vocab_path(str): path of the vocabulary

## IV. Server Deployment

- PaddleHub Serving can deploy an online pornographic text detection service, and you can use this interface for online web applications.

- ## Step 1: Start PaddleHub Serving

  - Run the startup command:

  - ```shell
    $ hub serving start -m porn_detection_gru
    ```

  - The model loading process is displayed on startup. After startup succeeds, the following is shown:

  - ```shell
    Loading porn_detection_gru successful.
    ```

  - The serving API is now deployed, with the default port number 8866.

  - **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.

- ## Step 2: Send a prediction request

  - With the server configured, the following lines of code send a prediction request and obtain the result:

  - ```python
    import requests
    import json

    # data to be predicted
    text = ["黄片下载", "打击黄牛党"]

    # runtime configuration
    # corresponds to the local call porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
    data = {"texts": text, "batch_size": 1, "use_gpu": True}

    # specify the prediction method porn_detection_gru and send a POST request; content-type should be json
    # HOST_IP is the server IP address
    url = "http://HOST_IP:8866/predict/porn_detection_gru"
    headers = {"Content-Type": "application/json"}
    r = requests.post(url=url, headers=headers, data=json.dumps(data))

    # print the prediction results
    print(json.dumps(r.json(), indent=4, ensure_ascii=False))
    ```

  - For more information about PaddleHub Serving, please refer to [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)

## V. Release Note

* 1.0.0

  First release

* 1.1.0

  Significantly improves prediction performance and simplifies interface usage

  - ```shell
    $ hub install porn_detection_gru==1.1.0
    ```
# porn_detection_gru
| Module Name | porn_detection_gru |
| :------------------ | :------------: |
| Category | text - text review |
| Network | GRU |
| Dataset | Dataset built by Baidu |
| Fine-tuning supported or not | No |
| Module Size | 20M |
| Latest update date | 2021-02-26 |
| Data indicators | - |
## I. Basic Information
- ### Module Introduction
  - The pornography detection model automatically determines whether a text is pornographic, gives the corresponding confidence, and identifies pornographic descriptions, vulgar solicitations and filthy language in the text.
  - porn_detection_gru adopts a GRU network structure and tokenizes at character granularity, which gives it a high prediction speed. The maximum sentence length of this module is 256 characters, and only prediction is supported.
## II. Installation
- ### 1、Environment dependencies
- paddlepaddle >= 1.6.2
- paddlehub >= 1.6.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2、Installation
- ```shell
$ hub install porn_detection_gru
```
- If you have problems during installation, please refer to:[windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
| [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API and Prediction
- ### 1、Command line Prediction
- ```shell
$ hub run porn_detection_gru --input_text "黄片下载"
```
- or
- ```shell
$ hub run porn_detection_gru --input_file test.txt
```
  - test.txt stores the texts to be reviewed, one text per line.
- If you want to call the Hub module through the command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)
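  - The `--input_file` format above is plain text with one candidate text per line; a minimal sketch of generating such a file from Python (the file name `test.txt` matches the example above, the sample texts are taken from this doc):

  - ```python
    from pathlib import Path

    # texts to be reviewed, one per line, as expected by:
    #   hub run porn_detection_gru --input_file test.txt
    texts = ["黄片下载", "打击黄牛党"]
    Path("test.txt").write_text("\n".join(texts) + "\n", encoding="utf-8")

    # the file now holds one candidate text per line
    print(Path("test.txt").read_text(encoding="utf-8"))
    ```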
- ### 2、Prediction Code Example
- ```python
import paddlehub as hub
porn_detection_gru = hub.Module(name="porn_detection_gru")
test_text = ["黄片下载", "打击黄牛党"]
results = porn_detection_gru.detection(texts=test_text, use_gpu=True, batch_size=1) # If you do not use GPU, please set use_gpu=False
for index, text in enumerate(test_text):
results[index]["text"] = text
for index, result in enumerate(results):
print(results[index])
# The output:
# {'text': '黄片下载', 'porn_detection_label': 1, 'porn_detection_key': 'porn', 'porn_probs': 0.9324, 'not_porn_probs': 0.0676}
# {'text': '打击黄牛党', 'porn_detection_label': 0, 'porn_detection_key': 'not_porn', 'porn_probs': 0.0004, 'not_porn_probs': 0.9996}
```
- ### 3、API
- ```python
def detection(texts=[], data={}, use_gpu=False, batch_size=1)
```
  - Prediction API of porn_detection_gru, identifying whether the input sentences contain pornography.
  - **Parameter**
     - texts(list): data to be predicted; if the texts parameter is used, the data parameter need not be passed in. Use either one of the two.
     - data(dict): data to be predicted; the key must be "text" and the value is the data to be predicted. If the data parameter is used, texts need not be passed in. The texts parameter is recommended; data will be deprecated later.
     - use_gpu(bool): whether to use GPU for prediction. If GPU is used, set the CUDA_VISIBLE_DEVICES environment variable before prediction; otherwise it need not be set.
     - batch_size(int): batch size
- **Return**
- results(list): prediction result
- ```python
def get_labels()
```
  - Get the categories of porn_detection_gru.
- **Return**
- labels(dict): the category of porn_detection_gru (Dichotomies, yes/no)
- ```python
def get_vocab_path()
```
  - Get the path of the vocabulary used in pre-training.
- **Return**
- vocab_path(str): Vocabulary path
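  - The dictionaries returned by `detection` (format shown in the prediction example above) are plain Python data and can be post-processed without PaddleHub; a minimal sketch, where the helper name `flag_porn` and the threshold are our own choices, not part of the module API:

  - ```python
    def flag_porn(results, threshold=0.5):
        """Map each result dict to a (label, confidence) pair using the more likely class."""
        flagged = []
        for item in results:
            if item["porn_probs"] >= threshold:
                flagged.append(("porn", item["porn_probs"]))
            else:
                flagged.append(("not_porn", item["not_porn_probs"]))
        return flagged

    # sample results copied from the prediction example above
    sample = [
        {"porn_detection_label": 1, "porn_detection_key": "porn",
         "porn_probs": 0.9324, "not_porn_probs": 0.0676},
        {"porn_detection_label": 0, "porn_detection_key": "not_porn",
         "porn_probs": 0.0004, "not_porn_probs": 0.9996},
    ]
    print(flag_porn(sample))  # → [('porn', 0.9324), ('not_porn', 0.9996)]
    ```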
## IV. Server Deployment
- PaddleHub Serving can deploy an online pornography detection service and you can use this interface for online Web applications.
- ## Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m porn_detection_gru
```
- The model loading process is displayed on startup. After the startup is successful, the following information is displayed:
- ```shell
    Loading porn_detection_gru successful.
```
  - The serving API is now deployed, with the default port number 8866.
  - **NOTE:** If GPU is used for prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.
- ## Step 2: Send a predictive request
- After configuring the server, the following lines of code can be used to send the prediction request and obtain the prediction result
- ```python
import requests
import json
# data to be predicted
text = ["黄片下载", "打击黄牛党"]
# Set the running configuration
    # corresponds to the local call porn_detection_gru.detection(texts=text, batch_size=1, use_gpu=True)
data = {"texts": text, "batch_size": 1, "use_gpu":True}
# set the prediction method to porn_detection_gru and send a POST request, content-type should be set to json
# HOST_IP is the IP address of the server
url = "http://HOST_IP:8866/predict/porn_detection_gru"
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print prediction result
print(json.dumps(r.json(), indent=4, ensure_ascii=False))
```
- For more information about PaddleHub Serving, please refer to:[Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
## V. Release Note
* 1.0.0
First release
* 1.1.0
Improves prediction performance and simplifies interface usage
- ```shell
$ hub install porn_detection_gru==1.1.0
```
# nonlocal_kinetics400

|Module Name|nonlocal_kinetics400|
| :--- | :---: |
|Category|video - video classification|
|Network|Non-local|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|129MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information

- ### Module Introduction

  - Non-local Neural Networks, proposed by Xiaolong Wang et al. in 2017, model the relationship between distant pixels by introducing a non-local operation. Borrowing the non-local mean idea from classical computer vision and extending it to neural networks, the model builds global dependencies by defining a correlation function between an output position and all input positions. Non-local is trained on the Kinetics-400 action recognition dataset published by DeepMind. This PaddleHub Module supports prediction.

## II. Installation

- ### 1、Environment dependencies

  - paddlepaddle >= 1.4.0
  - paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

  - ```shell
    $ hub install nonlocal_kinetics400
    ```
  - If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

  - ```shell
    hub run nonlocal_kinetics400 --input_path "/PATH/TO/VIDEO" --use_gpu True
    ```
  - or
  - ```shell
    hub run nonlocal_kinetics400 --input_file test.txt --use_gpu True
    ```
  - test.txt stores the paths of the videos to be classified.
  - Note: this PaddleHub Module currently only supports running on GPU. Before use, specify the GPU device with the following command (set the device ID according to your environment):
  - ```shell
    export CUDA_VISIBLE_DEVICES=0
    ```
  - For more information about the PaddleHub command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

  - ```python
    import paddlehub as hub

    # note: `nonlocal` is a Python keyword, so use a different variable name
    nonlocal_model = hub.Module(name="nonlocal_kinetics400")

    test_video_path = "/PATH/TO/VIDEO"

    # set input dict
    input_dict = {"image": [test_video_path]}

    # execute predict and print the result
    results = nonlocal_model.video_classification(data=input_dict)
    for result in results:
        print(result)
    ```

- ### 3、API

  - ```python
    def video_classification(data)
    ```
    - API for video classification prediction.

    - **Parameters**

      - data(dict): the key is "image" (str); the value is a list of paths of the videos to be classified.

    - **Return**

      - result(list\[dict\]): each element is the prediction result of the corresponding input video, a dict whose keys are labels and whose values are the probabilities of those labels.
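  - Each dict in the returned list maps labels to probabilities, so the top predictions can be extracted with ordinary sorting; a minimal sketch (the label names below are made-up placeholders, not actual Kinetics-400 output):

  - ```python
    def top_k(label_probs, k=3):
        """Keep the k most probable labels from a {label: probability} dict."""
        return sorted(label_probs.items(), key=lambda kv: kv[1], reverse=True)[:k]

    # placeholder prediction dict in the {label: probability} format described above
    pred = {"label_a": 0.83, "label_b": 0.11, "label_c": 0.04}
    print(top_k(pred, k=2))  # → [('label_a', 0.83), ('label_b', 0.11)]
    ```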
## IV. Release Note

* 1.0.0

  First release

  - ```shell
    $ hub install nonlocal_kinetics400==1.0.0
    ```
# stnet_kinetics400

|Module Name|stnet_kinetics400|
| :--- | :---: |
|Category|video - video classification|
|Network|StNet|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|129MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information

- ### Module Introduction

  - StNet is the winning base framework of the ActivityNet Kinetics Challenge 2018, implemented on top of ResNet50. It introduces the concept of the super-image: 2D convolutions are performed on super-images to model local spatio-temporal correlations in the video, a temporal modeling block captures the video's global spatio-temporal dependencies, and finally a temporal Xception block performs long-range temporal modeling on the extracted feature sequence. StNet is trained on the Kinetics-400 action recognition dataset published by DeepMind. This PaddleHub Module supports prediction.

## II. Installation

- ### 1、Environment dependencies

  - paddlepaddle >= 1.4.0
  - paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

  - ```shell
    $ hub install stnet_kinetics400
    ```
  - If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

  - ```shell
    hub run stnet_kinetics400 --input_path "/PATH/TO/VIDEO"
    ```
  - or
  - ```shell
    hub run stnet_kinetics400 --input_file test.txt
    ```
  - test.txt stores the paths of the videos to be classified.
  - For more information about the PaddleHub command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

  - ```python
    import paddlehub as hub

    stnet = hub.Module(name="stnet_kinetics400")

    test_video_path = "/PATH/TO/VIDEO"

    # set input dict
    input_dict = {"image": [test_video_path]}

    # execute predict and print the result
    results = stnet.video_classification(data=input_dict)
    for result in results:
        print(result)
    ```

- ### 3、API

  - ```python
    def video_classification(data)
    ```
    - API for video classification prediction.

    - **Parameters**

      - data(dict): the key is "image" (str); the value is a list of paths of the videos to be classified.

    - **Return**

      - result(list\[dict\]): each element is the prediction result of the corresponding input video, a dict whose keys are labels and whose values are the probabilities of those labels.

## IV. Release Note

* 1.0.0

  First release

  - ```shell
    $ hub install stnet_kinetics400==1.0.0
    ```
# tsm_kinetics400

|Module Name|tsm_kinetics400|
| :--- | :---: |
|Category|video - video classification|
|Network|TSM|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|95MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information

- ### Module Introduction

  - TSM (Temporal Shift Module), proposed by Ji Lin, Chuang Gan and Song Han of MIT and IBM Watson AI Lab, improves a network's video understanding ability through temporal shift operations. TSM is trained on the Kinetics-400 action recognition dataset published by DeepMind. This PaddleHub Module supports prediction.

## II. Installation

- ### 1、Environment dependencies

  - paddlepaddle >= 1.4.0
  - paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

  - ```shell
    $ hub install tsm_kinetics400
    ```
  - If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

  - ```shell
    hub run tsm_kinetics400 --input_path "/PATH/TO/VIDEO"
    ```
  - or
  - ```shell
    hub run tsm_kinetics400 --input_file test.txt
    ```
  - Note: test.txt stores the paths of the videos to be classified.
  - For more information about the PaddleHub command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

  - ```python
    import paddlehub as hub

    tsm = hub.Module(name="tsm_kinetics400")

    test_video_path = "/PATH/TO/VIDEO"

    # set input dict
    input_dict = {"image": [test_video_path]}

    # execute predict and print the result
    results = tsm.video_classification(data=input_dict)
    for result in results:
        print(result)
    ```

- ### 3、API

  - ```python
    def video_classification(data)
    ```
    - API for video classification prediction.

    - **Parameters**

      - data(dict): the key is "image" (str); the value is a list of paths of the videos to be classified.

    - **Return**

      - result(list\[dict\]): each element is the prediction result of the corresponding input video, a dict whose keys are labels and whose values are the probabilities of those labels.
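  - Because the value of `data["image"]` is a list, several videos can be classified in one call; assuming one result element per input video in input order, as the description above suggests, the results can be paired back with their source paths (placeholder paths and results below, not real module output):

  - ```python
    # placeholder inputs and results; in practice `results` would come from
    # tsm.video_classification(data={"image": video_paths})
    video_paths = ["/PATH/TO/VIDEO_A", "/PATH/TO/VIDEO_B"]
    results = [{"label_a": 0.9}, {"label_b": 0.7}]

    # one result dict per input video, in input order, so zip pairs them up
    paired = dict(zip(video_paths, results))
    for path, pred in paired.items():
        print(path, "->", pred)
    ```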
## IV. Release Note

* 1.0.0

  First release

  - ```shell
    $ hub install tsm_kinetics400==1.0.0
    ```
# tsn_kinetics400

|Module Name|tsn_kinetics400|
| :--- | :---: |
|Category|video - video classification|
|Network|TSN|
|Dataset|Kinetics-400|
|Fine-tuning supported or not|No|
|Module Size|95MB|
|Latest update date|2021-02-26|
|Data indicators|-|

## I. Basic Information

- ### Module Introduction

  - TSN (Temporal Segment Network) is a classic 2D-CNN-based solution for video classification. It addresses long-range action recognition in videos by sparsely sampling frames instead of densely sampling them, which captures global video information while removing redundancy and reducing computation. The per-frame features are averaged into a single video-level feature that is used for classification. TSN is trained on the Kinetics-400 action recognition dataset published by DeepMind. This PaddleHub Module supports prediction.
  - For the network architecture, please refer to the paper: [TSN](https://arxiv.org/abs/1608.00859)

## II. Installation

- ### 1、Environment dependencies

  - paddlepaddle >= 1.4.0
  - paddlehub >= 1.0.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)

- ### 2、Installation

  - ```shell
    $ hub install tsn_kinetics400
    ```
  - If you have problems during installation, please refer to: [windows_quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md)
    | [linux_quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [mac_quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)

## III. Module API and Prediction

- ### 1、Command line Prediction

  - ```shell
    hub run tsn_kinetics400 --input_path "/PATH/TO/VIDEO"
    ```
  - or
  - ```shell
    hub run tsn_kinetics400 --input_file test.txt
    ```
  - Note: test.txt stores the paths of the videos to be classified.
  - For more information about the PaddleHub command line, please refer to: [PaddleHub Command line instruction](../../../../docs/docs_ch/tutorial/cmd_usage.rst)

- ### 2、Prediction Code Example

  - ```python
    import paddlehub as hub

    tsn = hub.Module(name="tsn_kinetics400")

    test_video_path = "/PATH/TO/VIDEO"

    # set input dict
    input_dict = {"image": [test_video_path]}

    # execute predict and print the result
    results = tsn.video_classification(data=input_dict)
    for result in results:
        print(result)
    ```

- ### 3、API

  - ```python
    def video_classification(data)
    ```
    - API for video classification prediction.

    - **Parameters**

      - data(dict): the key is "image" (str); the value is a list of paths of the videos to be classified.

    - **Return**

      - result(list\[dict\]): each element is the prediction result of the corresponding input video, a dict whose keys are labels and whose values are the probabilities of those labels.

## IV. Release Note

* 1.0.0

  First release

  - ```shell
    $ hub install tsn_kinetics400==1.0.0
    ```