Unverified commit 87eca2df · Author: chenjian · Committer: GitHub

add baidu_language_recognition module (#1984)

* add baidu_language_recognition module

* fix

* fix doc

* fix

* fix doc
Co-authored-by: Nwuzewu <wuzewu@baidu.com>
Co-authored-by: jm12138 <2286040843@qq.com>
Parent aa949a69
# baidu_language_recognition
|Module Name|baidu_language_recognition|
| :--- | :---: |
|Category|Text - Language Recognition|
|Network|-|
|Dataset|-|
|Fine-tuning supported|No|
|Module Size|-|
|Latest update date|2022-09-01|
|Data indicators|-|
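## I. Basic Information
- ### Module Introduction
- This module wraps the language recognition service of the Baidu Translate Open Platform. Pass in a piece of text and it returns the code of the language it detects.
## II. Installation
- ### 1. Environment Dependencies
- paddlepaddle >= 2.1.0
- paddlehub >= 2.3.0 | [How to install PaddleHub](../../../../docs/docs_ch/get_start/installation.rst)
- ### 2. Installation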
- ```shell
$ hub install baidu_language_recognition
```
- If you run into problems during installation, see: [Windows quickstart](../../../../docs/docs_ch/get_start/windows_quickstart.md) | [Linux quickstart](../../../../docs/docs_ch/get_start/linux_quickstart.md) | [macOS quickstart](../../../../docs/docs_ch/get_start/mac_quickstart.md)
## III. Module API Prediction
- ### 1. Prediction Code Example
- ```python
import paddlehub as hub
module = hub.Module(name='baidu_language_recognition')
result = module.recognize("I like panda")
print(result)
```
- ### 2. API
- ```python
def recognize(query: str)
```
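- Language recognition API. Takes a text sentence as input and returns the code of the recognized language.
- **Parameters**
- `query`(str): the text whose language is to be recognized.
- **Return**
- `result`(str): the recognition result, given as the ISO 639-1 code of the language.
The languages currently supported are shown below: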
<p align="center">
<img src="https://user-images.githubusercontent.com/22424850/188105543-21610399-23de-471b-ab60-82c3e95660a6.png" width = "80%" hspace='10'/>
</p>
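
According to the module source in this commit, `recognize` raises `RuntimeError` both when the HTTP request fails and when the service reports a non-zero error code. The sketch below shows one possible way to recognize several texts without letting a single failure stop the loop. The helper name `recognize_many` is hypothetical and is not part of the module; it only assumes that the object passed in exposes the `recognize(query)` method described above.

```python
from typing import Dict, Iterable, Optional


def recognize_many(module, texts: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each text to its language code, or to None if recognition failed.

    `module` is any object with a `recognize(query: str) -> str` method,
    for example the object returned by hub.Module(...).
    """
    results: Dict[str, Optional[str]] = {}
    for text in texts:
        try:
            results[text] = module.recognize(text)
        except RuntimeError as err:
            # recognize() wraps request failures and service errors in RuntimeError.
            print(f"recognition failed for {text!r}: {err}")
            results[text] = None
    return results


# Usage, assuming the module is installed and the service is reachable:
#   import paddlehub as hub
#   module = hub.Module(name='baidu_language_recognition')
#   print(recognize_many(module, ["I like panda", "Bonjour"]))
```

Returning `None` for failed items keeps the output aligned with the input; callers that prefer to fail fast can simply call `module.recognize` directly.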
## IV. Server Deployment
- PaddleHub Serving can load the module and deploy it as an online language recognition service.
- ### Step 1: Start PaddleHub Serving
- Run the startup command:
- ```shell
$ hub serving start -m baidu_language_recognition
```
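- This command deploys a language recognition API; the default port is 8866.
- ### Step 2: Send a prediction request
- With the server configured, the following few lines of code send a prediction request and print the result: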
- ```python
import requests
import json
text = "I like panda"
data = {"query": text}
# Send a POST request with a JSON content type; change the IP address in the URL to that of the serving machine
url = "http://127.0.0.1:8866/predict/baidu_language_recognition"
# Set the POST request headers to application/json
headers = {"Content-Type": "application/json"}
r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())
```
- For more information about PaddleHub Serving, see [Serving Deployment](../../../../docs/docs_ch/tutorial/serving.md)
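
The request in Step 2 can be split into two small, separately checkable pieces: building the URL, headers, and JSON body, and pulling the prediction out of the reply. The sketch below is illustrative only. The helper names `build_request` and `extract_results` are hypothetical, and the assumption that the reply wraps the prediction under a `"results"` key is not stated in this document, so verify it against your PaddleHub Serving version.

```python
import json
from typing import Any, Dict, Tuple


def build_request(text: str, host: str = "127.0.0.1",
                  port: int = 8866) -> Tuple[str, Dict[str, str], str]:
    """Return (url, headers, body) for a language recognition request."""
    url = f"http://{host}:{port}/predict/baidu_language_recognition"
    headers = {"Content-Type": "application/json"}
    # The key must match the `query` argument of the serving method.
    body = json.dumps({"query": text})
    return url, headers, body


def extract_results(response_json: Dict[str, Any]) -> Any:
    """Return the prediction from a serving reply.

    Assumption: the reply is a JSON object carrying the prediction
    under the "results" key.
    """
    if "results" not in response_json:
        raise RuntimeError(f"unexpected response: {response_json}")
    return response_json["results"]


# Usage, assuming the service from Step 1 is running:
#   import requests
#   url, headers, body = build_request("I like panda")
#   r = requests.post(url=url, headers=headers, data=body)
#   print(extract_results(r.json()))
```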
## V. Release Note
* 1.0.0
Initial release
- ```shell
$ hub install baidu_language_recognition==1.0.0
```
import argparse
import random
from hashlib import md5
from typing import Optional

import requests

import paddlehub as hub
from paddlehub.module.module import moduleinfo
from paddlehub.module.module import runnable
from paddlehub.module.module import serving


def make_md5(s, encoding='utf-8'):
    # Return the hexadecimal MD5 digest of the encoded string.
    return md5(s.encode(encoding)).hexdigest()


@moduleinfo(name="baidu_language_recognition",
            version="1.0.0",
            type="text/machine_translation",
            summary="",
            author="baidu-nlp",
            author_email="paddle-dev@baidu.com")
class BaiduLanguageRecognition:

    def __init__(self, appid=None, appkey=None):
        """
        :param appid: appid for requesting Baidu translation service.
        :param appkey: appkey for requesting Baidu translation service.
        """
        # Set your own appid/appkey.
        if appid is None:
            self.appid = '20201015000580007'
        else:
            self.appid = appid
        if appkey is None:
            self.appkey = 'IFJB6jBORFuMmVGDRud1'
        else:
            self.appkey = appkey
        self.url = 'https://fanyi-api.baidu.com/api/trans/vip/language'

    def recognize(self, query: str):
        """
        Recognize the language of the given text.

        :param query: Text whose language is to be recognized.

        Return language type code.
        """
        # Generate salt and sign
        salt = random.randint(32768, 65536)
        sign = make_md5(self.appid + query + str(salt) + self.appkey)

        # Build request
        headers = {'Content-Type': 'application/x-www-form-urlencoded'}
        payload = {'appid': self.appid, 'q': query, 'salt': salt, 'sign': sign}

        # Send request
        try:
            r = requests.post(self.url, params=payload, headers=headers)
            result = r.json()
        except Exception as e:
            error_msg = str(e)
            raise RuntimeError(error_msg)
        # A non-zero error code means the service rejected the request.
        if result['error_code'] != 0:
            raise RuntimeError(result['error_msg'])
        return result['data']['src']

    @runnable
    def run_cmd(self, argvs):
        """
        Run as a command.
        """
        self.parser = argparse.ArgumentParser(description="Run the {} module.".format(self.name),
                                              prog='hub run {}'.format(self.name),
                                              usage='%(prog)s',
                                              add_help=True)
        self.arg_input_group = self.parser.add_argument_group(title="Input options", description="Input data. Required")
        self.add_module_input_arg()
        args = self.parser.parse_args(argvs)
        # Override the credentials only when both appid and appkey are given.
        if args.appid is not None and args.appkey is not None:
            self.appid = args.appid
            self.appkey = args.appkey
        result = self.recognize(args.query)
        return result

    @serving
    def serving_method(self, query):
        """
        Run as a service.
        """
        return self.recognize(query)

    def add_module_input_arg(self):
        """
        Add the command input options.
        """
        self.arg_input_group.add_argument('--query', type=str)
        self.arg_input_group.add_argument('--appid', type=str, default=None, help="Personal appid obtained at registration")
        self.arg_input_group.add_argument('--appkey', type=str, default=None, help="Personal appkey obtained at registration")
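
The signing and response handling inside `recognize` can be isolated into pure functions, which makes them easy to check without network access. The following is a minimal sketch under that assumption, not the module's actual layout: the function names `build_signed_payload` and `extract_language` and the `demo-id` / `demo-key` credentials are hypothetical. The signature itself follows the code above: the MD5 hex digest of `appid + query + salt + appkey`.

```python
import random
from hashlib import md5
from typing import Dict, Optional, Union


def build_signed_payload(appid: str, appkey: str, query: str,
                         salt: Optional[int] = None) -> Dict[str, Union[str, int]]:
    """Build the request fields, including the MD5 signature."""
    if salt is None:
        # Same salt range as the module code above.
        salt = random.randint(32768, 65536)
    raw = appid + query + str(salt) + appkey
    sign = md5(raw.encode("utf-8")).hexdigest()
    return {"appid": appid, "q": query, "salt": salt, "sign": sign}


def extract_language(result: dict) -> str:
    """Return the language code from a decoded reply, or raise on error."""
    if result.get("error_code") != 0:
        raise RuntimeError(result.get("error_msg", "unknown error"))
    return result["data"]["src"]


# Hypothetical placeholder credentials; replace with your own appid/appkey.
payload = build_signed_payload("demo-id", "demo-key", "I like panda", salt=40000)
print(payload["q"], payload["salt"], len(payload["sign"]))
```

Passing a fixed `salt` makes the signature deterministic, which is convenient in tests; in normal use the argument can be omitted so that a random salt is drawn for each request.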